svn rev #23432: trunk/src/lib/crypto/builtin/enc_provider/
raeburn@MIT.EDU
Wed Dec 2 18:09:33 EST 2009
http://src.mit.edu/fisheye/changelog/krb5/?cs=23432
Commit By: raeburn
Log Message:
Perform the AES-CBC XOR operations 4 bytes at a time, using the helper
functions for loading and storing potentially-unaligned values.
Improves bulk AES encryption performance by 2% or so on 32-bit x86
with gcc 4.
Changed Files:
U trunk/src/lib/crypto/builtin/enc_provider/aes.c
Modified: trunk/src/lib/crypto/builtin/enc_provider/aes.c
===================================================================
--- trunk/src/lib/crypto/builtin/enc_provider/aes.c 2009-12-02 23:09:29 UTC (rev 23431)
+++ trunk/src/lib/crypto/builtin/enc_provider/aes.c 2009-12-02 23:09:33 UTC (rev 23432)
@@ -51,8 +51,24 @@
 xorblock(unsigned char *out, const unsigned char *in)
 {
     int z;
-    for (z = 0; z < BLOCK_SIZE; z++)
-        out[z] ^= in[z];
+    for (z = 0; z < BLOCK_SIZE/4; z++) {
+        unsigned char *outptr = &out[z*4];
+        const unsigned char *inptr = &in[z*4];
+        /* Use unaligned accesses.  On x86, this will probably still
+           be faster than multiple byte accesses for unaligned data,
+           and for aligned data should be far better.  (One test
+           indicated about 2.4% faster encryption for 1024-byte
+           messages.)
+
+           If some other CPU has really slow unaligned-word or byte
+           accesses, perhaps this function (or the load/store
+           helpers?) should test for alignment first.
+
+           If byte accesses are faster than unaligned words, we may
+           need to conditionalize on CPU type, as that may be hard to
+           determine automatically.  */
+        store_32_n (load_32_n(outptr) ^ load_32_n(inptr), outptr);
+    }
 }
 
 krb5_error_code
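
[Editor's note: the load_32_n and store_32_n helpers referenced in the patch
are krb5's native-endian load/store routines for potentially-unaligned
32-bit values. Below is a minimal, self-contained sketch of the technique;
the helper bodies use the common portable memcpy idiom rather than copying
krb5's headers, and the main driver is purely illustrative.]

    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    #define BLOCK_SIZE 16  /* AES block size, as in the patch */

    /* Native-endian, possibly-unaligned 32-bit load/store, written
       with memcpy so there is no strict-aliasing or alignment UB;
       compilers on targets that permit unaligned word accesses reduce
       each call to a single 32-bit load or store.  (A sketch of the
       helpers the patch relies on, not a copy of k5-platform.h.) */
    static inline uint32_t
    load_32_n(const void *p)
    {
        uint32_t n;
        memcpy(&n, p, 4);
        return n;
    }

    static inline void
    store_32_n(uint32_t n, void *p)
    {
        memcpy(p, &n, 4);
    }

    /* The patched xorblock: XOR one AES block into another, one
       32-bit word at a time instead of one byte at a time. */
    static void
    xorblock(unsigned char *out, const unsigned char *in)
    {
        int z;

        for (z = 0; z < BLOCK_SIZE / 4; z++) {
            unsigned char *outptr = &out[z * 4];
            const unsigned char *inptr = &in[z * 4];

            store_32_n(load_32_n(outptr) ^ load_32_n(inptr), outptr);
        }
    }

    /* Illustrative driver only: XOR a block in and back out again. */
    int
    main(void)
    {
        unsigned char a[BLOCK_SIZE], b[BLOCK_SIZE];
        int i;

        for (i = 0; i < BLOCK_SIZE; i++) {
            a[i] = (unsigned char)i;
            b[i] = 0xA5;
        }
        xorblock(a, b);    /* a ^= b */
        xorblock(a, b);    /* a ^= b again: back to original */
        for (i = 0; i < BLOCK_SIZE; i++)
            printf("%02x%c", a[i], i == BLOCK_SIZE - 1 ? '\n' : ' ');
        return 0;
    }

Routing the accesses through memcpy is what makes the word-at-a-time
version safe on any CPU; where the hardware allows unaligned word
accesses, each helper call becomes a single load or store, which is the
effect the log message's measured speedup depends on.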