This change adds AVX-512 support for RSA 2k, 3k and 4k signing. It is
built around the use of AVX512_IFMA within the [(Almost) Montgomery
Multiplication](https://eprint.iacr.org/2011/239) implementation that
comprises the modular exponentiation part of the RSA algorithm. It is
ported from the [OpenSSL
patch](openssl/openssl#13750).
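
As a rough illustration of the building blocks named above, here is a scalar, single-limb C model of the two AVX512_IFMA multiply-add operations (`VPMADD52LUQ` and `VPMADD52HUQ`) and of an "almost" Montgomery multiplication with radix 2^52 built from them. This is only a sketch: the helper names (`madd52lo`, `madd52hi`, `neg_inv52`, `amm52`) are hypothetical and do not appear in the patch, the real code is vectorized and works on multi-limb operands, and `unsigned __int128` is a GCC/Clang extension assumed here for brevity.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128; /* GCC/Clang extension, assumed available */

#define MASK52 ((UINT64_C(1) << 52) - 1)

/* Scalar model of one lane of VPMADD52LUQ: acc plus the low 52 bits of
 * the product of the low 52 bits of a and b. */
static uint64_t madd52lo(uint64_t acc, uint64_t a, uint64_t b) {
  u128 prod = (u128)(a & MASK52) * (b & MASK52);
  return acc + ((uint64_t)prod & MASK52);
}

/* Scalar model of one lane of VPMADD52HUQ: acc plus bits 52..103 of the
 * same 104-bit product. */
static uint64_t madd52hi(uint64_t acc, uint64_t a, uint64_t b) {
  u128 prod = (u128)(a & MASK52) * (b & MASK52);
  return acc + (uint64_t)(prod >> 52);
}

/* k0 = -m^{-1} mod 2^52 for odd m, via Newton iteration modulo 2^64. */
static uint64_t neg_inv52(uint64_t m) {
  assert(m & 1);
  uint64_t inv = m; /* m * m == 1 (mod 8), so this is correct to 3 bits */
  for (int i = 0; i < 5; i++) {
    inv *= 2 - m * inv; /* each round doubles the number of correct bits */
  }
  return (0 - inv) & MASK52;
}

/* One-limb almost Montgomery multiplication with R = 2^52.
 * Requires a, b < 2^52 and odd m < 2^52. The result is congruent to
 * a * b * R^{-1} (mod m) and is below 2^52, but not necessarily below m:
 * the final subtraction only happens when the value overflows 52 bits. */
static uint64_t amm52(uint64_t a, uint64_t b, uint64_t m, uint64_t k0) {
  uint64_t lo = madd52lo(0, a, b);  /* (a * b) mod 2^52 */
  uint64_t hi = madd52hi(0, a, b);  /* (a * b) / 2^52 */
  uint64_t q = madd52lo(0, lo, k0); /* makes a*b + q*m divisible by 2^52 */
  lo = madd52lo(lo, q, m);          /* now either 0 or exactly 2^52 */
  hi = madd52hi(hi, q, m);
  uint64_t u = hi + (lo >> 52);     /* (a*b + q*m) / 2^52 */
  if (u > MASK52) {
    u -= m;
  }
  return u;
}
```

Skipping the exact "is the result below `m`" comparison is what makes the multiplication "almost" Montgomery; a full reduction is only needed once, when leaving the Montgomery domain.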
On a C6i instance, clang 12, Release build:
Before:
Did 832 RSA 2048 signing operations in 1009511us (824.2 ops/sec)
Did 41000 RSA 2048 verify (same key) operations in 1019103us (40231.5 ops/sec)
Did 30000 RSA 2048 verify (fresh key) operations in 1007956us (29763.2 ops/sec)
Did 3684 RSA 2048 private key parse operations in 1067692us (3450.4 ops/sec)
Did 340 RSA 3072 signing operations in 1051690us (323.3 ops/sec)
Did 13000 RSA 3072 verify (same key) operations in 1087695us (11951.9 ops/sec)
Did 16000 RSA 3072 verify (fresh key) operations in 1005781us (15908.0 ops/sec)
Did 1870 RSA 3072 private key parse operations in 1017467us (1837.9 ops/sec)
Did 128 RSA 4096 signing operations in 1015724us (126.0 ops/sec)
Did 10000 RSA 4096 verify (same key) operations in 1071670us (9331.2 ops/sec)
Did 6952 RSA 4096 verify (fresh key) operations in 1016484us (6839.3 ops/sec)
Did 1110 RSA 4096 private key parse operations in 1092991us (1015.6 ops/sec)
After:
Did 1690 RSA 2048 signing operations in 1025072us (1648.7 ops/sec)
Did 63000 RSA 2048 verify (same key) operations in 1008785us (62451.4 ops/sec)
Did 54000 RSA 2048 verify (fresh key) operations in 1000298us (53983.9 ops/sec)
Did 8000 RSA 2048 private key parse operations in 1000938us (7992.5 ops/sec)
Did 550 RSA 3072 signing operations in 1012078us (543.4 ops/sec)
Did 30000 RSA 3072 verify (same key) operations in 1022061us (29352.5 ops/sec)
Did 27000 RSA 3072 verify (fresh key) operations in 1037663us (26020.0 ops/sec)
Did 4140 RSA 3072 private key parse operations in 1006526us (4113.2 ops/sec)
Did 253 RSA 4096 signing operations in 1050767us (240.8 ops/sec)
Did 18000 RSA 4096 verify (same key) operations in 1057742us (17017.4 ops/sec)
Did 15000 RSA 4096 verify (fresh key) operations in 1000483us (14992.8 ops/sec)
Did 2510 RSA 4096 private key parse operations in 1004408us (2499.0 ops/sec)
RSA 8k is not currently supported, so its performance is unchanged.
Support could be added as a follow-up if there is interest.
Call-outs:
This patch is primarily additive, apart from a small logic change in `mod_exp()` in `rsa_impl.c`.
Previously, the calls to `mod_montgomery` and `BN_mod_exp_mont_consttime` were
interleaved. To make the parallel exponentiations possible, `r1` is now kept around and a new
`BIGNUM`, `r2`, is created on the context.
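
The reordering can be sketched with toy 64-bit integers in place of `BIGNUM`s. Everything below is hypothetical and for illustration only (`toy_crt_key`, `toy_modexp`, and `toy_rsa_crt` are not names from the patch, and the code is neither constant time nor suitable for real key sizes). The point is the shape of the computation: both reductions are done first and both results are kept, so the two exponentiations become adjacent, independent calls that a single two-way routine could process together.

```c
#include <assert.h>
#include <stdint.h>

/* Toy CRT key: primes p and q, d mod (p-1), d mod (q-1), q^{-1} mod p. */
typedef struct {
  uint64_t p, q, dmp1, dmq1, iqmp;
} toy_crt_key;

/* Plain square-and-multiply. Not constant time; moduli must stay below
 * 2^32 so that the 64-bit products cannot overflow. */
static uint64_t toy_modexp(uint64_t base, uint64_t exp, uint64_t mod) {
  assert(mod > 0 && mod < (UINT64_C(1) << 32));
  uint64_t result = 1 % mod;
  base %= mod;
  while (exp > 0) {
    if (exp & 1) {
      result = result * base % mod;
    }
    base = base * base % mod;
    exp >>= 1;
  }
  return result;
}

static uint64_t toy_rsa_crt(uint64_t in, const toy_crt_key *k) {
  /* Step 1: reduce the input modulo both primes up front. Keeping both
   * values alive mirrors the role assumed here for r1 and r2. */
  uint64_t r1 = in % k->q;
  uint64_t r2 = in % k->p;

  /* Step 2: two independent exponentiations, now side by side. */
  uint64_t m1 = toy_modexp(r1, k->dmq1, k->q);
  uint64_t r0 = toy_modexp(r2, k->dmp1, k->p);

  /* Step 3: recombine with Garner's formula. */
  uint64_t diff = (r0 + k->p - m1 % k->p) % k->p;
  uint64_t h = k->iqmp * diff % k->p;
  return m1 + h * k->q;
}
```

With the textbook parameters p = 61, q = 53, e = 17, d = 2753 (so `dmp1` = 53, `dmq1` = 49, `iqmp` = 38), encrypting 65 and passing the result through `toy_rsa_crt` returns 65 again, matching a direct exponentiation by d modulo 3233.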
---------
Co-authored-by: Nevine Ebeid <[email protected]>