Effective Software
Implementation of
Advanced Encryption Standard
December 2014
Roman Oliynykov
Professor at
Information Technologies Security Department
Kharkov National University of Radioelectronics
Head of Scientific Research Department
JSC “Institute of Information Technologies”
Ukraine
Visiting professor at
Samsung Advanced Technology Training Institute
Korea
ROliynykov@gmail.com
Outline
 A few words about myself
 Brief history of AES/Rijndael
 AES properties
 Direct AES implementation and problems with it
 Methods for effective encryption
implementation (proposed by Rijndael authors
in their submission to AES competition)
 Decryption optimization
 Conclusions
About myself (I)
 I’m from Ukraine (Eastern part of
Europe),
host country of Euro2012 football
championship
 I live in Kharkov (the second biggest
city in the country, population is 1.5
million people), Eastern Ukraine
(near Russia),
former capital of the Soviet Ukraine
(1918-1934)
three Nobel prize winners worked at
Kharkov University
About myself (II)
 Professor at Information Technologies Security
Department at Kharkov National University of
Radioelectronics
 courses on computer networks and operating
system security, special mathematics for
cryptographic applications
 Head of Scientific Research Department at JSC
“Institute of Information Technologies”
 Scientific interests: symmetric cryptographic
primitives synthesis and cryptanalysis
 Visiting professor at Samsung Advanced
Technology Training Institute
 courses on computer networks and operating
system security, software security, effective
application and implementation of symmetric
cryptography
Modern and effective solution:
Advanced Encryption Standard (AES)
 result of international public cryptographic competition
(1997-2000)
 was chosen from among 15 candidate ciphers
(developed in the US, Belgium, Denmark, Germany,
Israel, Japan, Switzerland, Armenia, etc.)
 original name is Rijndael (developed by researchers from
Belgium)
 received the most votes at the 3rd AES conference,
but the other candidates, such as Twofish (US), MARS (US, IBM), E2
(Japan, Camellia predecessor) and Serpent (Israel), also
remain strong
 the most researched block cipher all over the world
(2014, open publications)
 basis for development of many other symmetric primitives
AES properties
 block length is 128 bits only (a subset of Rijndael, which
supports blocks of 128, 192 and 256 bits)
 key length is 128, 192 and 256 bits
 uses Substitution-Permutation Network (SPN)
 number of rounds (10,12,14) depends on key length
 quite transparent design, algebraic structure
(theoretically may be vulnerable to algebraic
analysis)
 quite effective in software (32-bit platforms) and
hardware implementation
AES parameters: key length,
block size, number of rounds
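The parameter table from this slide is not reproduced in the text. As a minimal sketch, the values given in the AES specification (FIPS-197) can be derived from the key length alone. The function name aes_parameters is hypothetical and used only for illustration.

```python
def aes_parameters(key_bits):
    """Return (Nk, Nb, Nr) for an AES key length, following FIPS-197."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES supports 128-, 192- and 256-bit keys only")
    nk = key_bits // 32   # key length in 32-bit words: 4, 6 or 8
    nb = 4                # block size in 32-bit words (always 128 bits)
    nr = nk + 6           # number of rounds: 10, 12 or 14
    return nk, nb, nr


for bits in (128, 192, 256):
    nk, nb, nr = aes_parameters(bits)
    # The key schedule produces Nr + 1 round keys of Nb words each.
    print(f"key={bits} Nk={nk} Nb={nb} Nr={nr} expanded key words={nb * (nr + 1)}")
```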
AES: representation of the processed
bytes as a “cipher state”
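The figure for this slide is missing from the text. As a sketch of the layout defined in FIPS-197, the 16 input bytes fill a 4x4 state column by column. The helper names bytes_to_state and state_to_bytes are illustrative assumptions, not part of any library.

```python
def bytes_to_state(block):
    """Arrange 16 bytes into a 4x4 state: state[r][c] = block[r + 4*c]."""
    if len(block) != 16:
        raise ValueError("AES processes 16-byte blocks")
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state):
    """Read the state back column by column."""
    return bytes(state[r][c] for c in range(4) for r in range(4))


block = bytes(range(16))
state = bytes_to_state(block)
for row in state:
    print(" ".join(f"{b:02x}" for b in row))
# first row printed: 00 04 08 0c
```

Each column of the state is one 32-bit word, which is what makes the 32-bit table look-ups described later possible.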
AES: main steps
running key schedule procedure:
generation of all round keys
running encryption or decryption
procedure
 or, for compact hardware implementation,
sequential operations:
 generation of the current round key
 one encryption round
AES: high-level structure
(pseudocode)
AES: high-level structure
(picture for 128 bit key)
AES: SubBytes transformation
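The SubBytes figure is not reproduced here. As a sketch under the definitions in FIPS-197, the S-box can be generated as a multiplicative inverse in GF(2^8) followed by an affine transformation; SubBytes then replaces every state byte by its S-box entry. The function names below are illustrative.

```python
def xtime(b):
    """Multiply a byte by 02 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def gmul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = xtime(a), b >> 1
    return r


def build_sbox():
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse of x (and 0 maps to 0)
            inv = gmul(inv, x)
        s = inv
        for k in range(1, 5):         # affine step: XOR of four left rotations
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def sub_bytes(state):
    """Apply the S-box to each of the 16 state bytes (16 look-ups)."""
    return [SBOX[b] for b in state]


print(hex(SBOX[0x53]))  # FIPS-197 gives ed for input 53
```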
AES: ShiftRows
transformation
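Since the ShiftRows figure is missing, here is a minimal sketch of the transformation, assuming the state is kept as 16 bytes in column-major order (index = row + 4 * column). Row r is rotated left by r positions, so row 0 stays in place and 12 bytes move.

```python
def shift_rows(state):
    """Rotate row r of the 4x4 state left by r positions.

    state is a list of 16 bytes in column-major order.
    """
    return [state[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


state = list(range(16))
print(shift_rows(state))
```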
AES: MixColumns
transformation
AES: AddRoundKey
transformation
AES round key generation (key
expansion)
NB: not every key length (128, 192, 256) has to be supported; for many
applications a single key length is enough
AES round key generation:
RotWord
AES round key generation:
SubWord (the SubBytes S-box applied to each byte of the word)
AES round key generation:
round constant application
NB: without Rcon, equal blocks would appear in the ciphertext whenever the plaintext and
the key consist of equal blocks (the same 1, 2 or 4 bytes repeated in both plaintext and key)
AES round key sequence
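The three key schedule steps above (RotWord, S-box substitution, round constant) can be put together as follows. This is a sketch for 128-bit keys only, following FIPS-197; the function names are illustrative, and the S-box is generated rather than hard-coded to keep the example self-contained.

```python
def xtime(b):
    """Multiply a byte by 02 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def gmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = xtime(a), b >> 1
    return r


def build_sbox():
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse of x (0 maps to 0)
            inv = gmul(inv, x)
        s = inv
        for k in range(1, 5):         # affine transformation
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def rot_word(w):
    """Rotate a 32-bit word left by one byte."""
    return ((w << 8) | (w >> 24)) & 0xFFFFFFFF


def sub_word(w):
    """Apply the S-box to each of the four bytes of a word."""
    return int.from_bytes(bytes(SBOX[b] for b in w.to_bytes(4, "big")), "big")


def expand_key_128(key):
    """Return the 44 round key words for a 16-byte key."""
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    rcon = 0x01
    for i in range(4, 44):
        temp = w[i - 1]
        if i % 4 == 0:
            temp = sub_word(rot_word(temp)) ^ (rcon << 24)
            rcon = xtime(rcon)        # next round constant: 01, 02, 04, ..., 1b, 36
        w.append(w[i - 4] ^ temp)
    return w


# Example key from FIPS-197 Appendix A.1
round_words = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(f"{round_words[4]:08x}")        # first derived word
```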
AES decryption (direct
presentation): inverse operations
in reverse order
AES/Rijndael design goals
 be extremely fast on 32 bit platforms (+++)
 be compact on hardware implementation with
small number of gates (++)
 possibility to implement the cipher on 8-bit smart-card
processors typical of the 1990s (++)
 cryptographic strength (+)
Direct implementation of AES
round function: SubBytes
16 operations (byte substitution)
Direct implementation of AES
round function: ShiftRows
12 operations (byte permutation)
AES: MixColumns
transformation
60 operations (logical and conditional):
 3+ operations for each input byte (48+ total):
• shift and conditional XOR (mult by 02)
• XOR (mult by 03)
 3 XORs for each row (12 total)
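The byte-oriented computation counted above can be sketched as follows. Multiplication by 02 is a shift with a conditional XOR of the reduction polynomial, and multiplication by 03 adds one more XOR. The state is assumed to be 16 bytes in column-major order, and the function names are illustrative.

```python
def xtime(b):
    """Multiply by 02: shift left, then conditionally reduce."""
    b <<= 1
    if b & 0x100:          # the data-dependent condition mentioned on the slide
        b ^= 0x11B
    return b


def mul3(b):
    """Multiply by 03 = 02 + 01: one extra XOR."""
    return xtime(b) ^ b


def mix_column(col):
    """Multiply one 4-byte column by the MixColumns matrix."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
        mul3(a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


def mix_columns(state):
    """Apply mix_column to each of the four columns of the state."""
    out = []
    for c in range(4):
        out += mix_column(state[4 * c:4 * c + 4])
    return out


print([f"{b:02x}" for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
# prints ['8e', '4d', 'a1', 'bc']
```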
Direct implementation of AES
round function
 SubBytes: 16 operations (byte substitution)
 ShiftRows: 12 operations (byte permutation)
 MixColumns: 60 or even more operations
(conditional branches hinder effective pipelining)
 AddRoundKey: 16 operations (logical)
TOTAL: 104 or more operations per round (16 + 12 + 60 + 16)
AES effective software
implementation: 32-bit platform
 three different operations can be combined
into a single (!) look-up table access:
 SubBytes (non-linear)
 ShiftRows (linear)
 MixColumns (linear)
 cipher consists of look-up table accesses and
round key additions
AES effective software
implementation: MixColumns
Matrix multiplication: 7 operations per column (4 memory look-ups + 3
XORs), i.e. 28 for the whole state instead of 60:
 32-bit XOR of 4 precomputed columns
 each precomputed column depends on one input byte only
 all 4 bytes of each column are precomputed and stored in
advance
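A minimal sketch of this idea for MixColumns alone (no S-box yet): each table entry stores the whole 32-bit output column contributed by one input byte, so a column costs 4 look-ups and 3 XORs. The table names M0..M3 and the packing order (row 0 in the most significant byte) are assumptions made for this example.

```python
def xtime(b):
    """Multiply a byte by 02 in GF(2^8)."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def pack(b0, b1, b2, b3):
    """Pack four bytes into one 32-bit word, row 0 first."""
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3


# One table per input row; entry x holds the matrix column multiplied by x.
M0, M1, M2, M3 = [], [], [], []
for x in range(256):
    x2 = xtime(x)          # 02 * x
    x3 = x2 ^ x            # 03 * x
    M0.append(pack(x2, x, x, x3))   # contribution of the row-0 byte
    M1.append(pack(x3, x2, x, x))   # contribution of the row-1 byte
    M2.append(pack(x, x3, x2, x))   # contribution of the row-2 byte
    M3.append(pack(x, x, x3, x2))   # contribution of the row-3 byte


def mix_column_lookup(a0, a1, a2, a3):
    """MixColumns for one column: 4 look-ups and 3 XORs."""
    return M0[a0] ^ M1[a1] ^ M2[a2] ^ M3[a3]


print(f"{mix_column_lookup(0xDB, 0x13, 0x53, 0x45):08x}")
# prints 8e4da1bc
```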
AES round function operations
sequence variants:
Original:
 SubBytes
 ShiftRows
 MixColumns
Equivalent:
 ShiftRows
 SubBytes
 MixColumns
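The two sequences are equivalent because SubBytes acts on each byte independently while ShiftRows only moves bytes. The sketch below checks this. The property holds for any byte-wise table, so a hypothetical stand-in table (SUB_TABLE) is used instead of the real AES S-box to keep the example short.

```python
# Stand-in substitution table (an assumption for illustration, NOT the AES S-box).
SUB_TABLE = [(7 * x + 99) % 256 for x in range(256)]


def sub_bytes(state):
    """Byte-wise substitution: every byte is replaced independently."""
    return [SUB_TABLE[b] for b in state]


def shift_rows(state):
    """Rotate row r left by r positions (state is column-major, 16 bytes)."""
    return [state[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


state = [(17 * i + 5) % 256 for i in range(16)]
original_order = shift_rows(sub_bytes(state))
swapped_order = sub_bytes(shift_rows(state))
print(original_order == swapped_order)
```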
AES effective software implementation:
MixColumns and SubBytes in one
precomputed table
SubBytes and MixColumns: 7 operations per column (4 memory look-ups + 3
XORs) in total:
 32-bit XOR of 4 precomputed columns
 each precomputed column depends on one input byte only (already passed through
the S-box)
 all 4 bytes of each column are precomputed and stored in advance
Fragment of OpenSSL AES source
code (based on the Rijndael authors'
implementation)
4 tables are needed; the size of each table is 256 entries × 4 bytes = 1 kByte
Fragment of OpenSSL AES source
code (based on the Rijndael authors'
implementation)
ShiftRows is implemented as the usual shift and mask of a 32-bit register;
SubBytes and MixColumns are implemented as memory look-ups (8 bit → 32 bit)
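The OpenSSL fragment itself is not reproduced in the text. The following is not that code but a minimal Python sketch of the same table-driven approach for AES-128, with four 1 kByte tables, ShiftRows expressed as shifts and masks, and a plain S-box for the last round. The names TE0..TE3, expand_key_128 and encrypt_block are chosen for this example, and the tables are generated at start-up rather than hard-coded.

```python
def xtime(b):
    """Multiply a byte by 02 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b

def gmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = xtime(a), b >> 1
    return r

def build_sbox():
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):              # x^254 is the inverse (0 maps to 0)
            inv = gmul(inv, x)
        s = inv
        for k in range(1, 5):             # affine transformation
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

def ror(w, n):
    """Rotate a 32-bit word right by n bits."""
    return ((w >> n) | (w << (32 - n))) & 0xFFFFFFFF

SBOX = build_sbox()
# TE0[x] = S[x] * (02, 01, 01, 03); the other tables are byte rotations of it.
TE0 = [(gmul(s, 2) << 24) | (s << 16) | (s << 8) | gmul(s, 3) for s in SBOX]
TE1 = [ror(w, 8) for w in TE0]
TE2 = [ror(w, 16) for w in TE0]
TE3 = [ror(w, 24) for w in TE0]

def expand_key_128(key):
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        t = w[i - 1]
        if i % 4 == 0:
            t = ror(t, 24)                # RotWord (rotate left by 8 bits)
            t = int.from_bytes(bytes(SBOX[b] for b in t.to_bytes(4, "big")), "big")
            t ^= rcon << 24
            rcon = xtime(rcon)
        w.append(w[i - 4] ^ t)
    return w

def encrypt_block(block, w):
    # Initial AddRoundKey; s[j] is column j as a big-endian 32-bit word.
    s = [int.from_bytes(block[4 * j:4 * j + 4], "big") ^ w[j] for j in range(4)]
    for r in range(1, 10):                # 9 rounds: 4 look-ups + 4 XORs per column
        s = [TE0[s[j] >> 24]
             ^ TE1[(s[(j + 1) % 4] >> 16) & 0xFF]
             ^ TE2[(s[(j + 2) % 4] >> 8) & 0xFF]
             ^ TE3[s[(j + 3) % 4] & 0xFF]
             ^ w[4 * r + j] for j in range(4)]
    # Last round has no MixColumns, so the plain S-box is used.
    out = [(SBOX[s[j] >> 24] << 24)
           ^ (SBOX[(s[(j + 1) % 4] >> 16) & 0xFF] << 16)
           ^ (SBOX[(s[(j + 2) % 4] >> 8) & 0xFF] << 8)
           ^ SBOX[s[(j + 3) % 4] & 0xFF]
           ^ w[40 + j] for j in range(4)]
    return b"".join(x.to_bytes(4, "big") for x in out)

# Test vector from FIPS-197 Appendix C.1
key = bytes(range(16))
plaintext = bytes.fromhex("00112233445566778899aabbccddeeff")
print(encrypt_block(plaintext, expand_key_128(key)).hex())
```

Indexing column j, j+1, j+2 and j+3 for rows 0 to 3 is what folds ShiftRows into the look-ups: no bytes are physically moved.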
AES effective software implementation:
extra memory optimization
Reducing the memory footprint: a single table (1 kByte) instead of
4 tables of 1 kByte each
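A sketch of the single-table variant: the entries of the other three tables are byte rotations of the first, so they can be recomputed at look-up time with a shift, a shift and an OR. The rotation trick does not depend on the S-box, so this example uses a MixColumns-only table (byte x instead of S[x]) to stay short; that simplification, and the names T0 and mix_column_one_table, are assumptions of the example.

```python
def xtime(b):
    """Multiply a byte by 02 in GF(2^8)."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def ror(w, n):
    """Rotate a 32-bit word right by n bits: the extra <<, >> and | per look-up."""
    return ((w >> n) | (w << (32 - n))) & 0xFFFFFFFF


# Single 1 kByte table: entry x holds (02*x, 01*x, 01*x, 03*x), row 0 first.
T0 = [(xtime(x) << 24) | (x << 16) | (x << 8) | (xtime(x) ^ x) for x in range(256)]


def mix_column_one_table(a0, a1, a2, a3):
    """One column with a single table; rotations replace the other three tables."""
    return T0[a0] ^ ror(T0[a1], 8) ^ ror(T0[a2], 16) ^ ror(T0[a3], 24)


print(f"{mix_column_one_table(0xDB, 0x13, 0x53, 0x45):08x}")
# prints 8e4da1bc
print(len(T0) * 4, "bytes of table data")
```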
Main table size for the fastest and the
compact optimized 32-bit AES
implementation
 fastest:
 (4 bytes) x (256 different entries to S-box) x
(4 different positions for ShiftRows) == 4 kbytes
 compact optimized:
 (4 bytes) x (256 different entries to S-box) == 1 kbyte
 three additional operations in C ( << , >>, | or ^)
are needed besides each table look-up
NB: to reach the highest performance, the precomputed tables and the data being processed
must fit into the processor's L1 cache (32-64 kBytes for modern processors)
Number of 32-bit operations needed to encrypt a
single block in the main
transformation (with all round keys precomputed)
 ( (4 look-ups) + (3 xors) ) * (4 columns) == 28 operations / round
 4 xors with round keys == 4 operations / round
 (28 + 4) * (9 rounds) == 288 operations for high
strength encryption of 9 rounds (!)
 (16 operations on SubBytes) + (24 operations on
ShiftRows) + (4 xors with round keys) == 44 operations in the last round
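The arithmetic on this slide can be restated as a short calculation. The figures are the slide's own for AES-128; the variable names are illustrative, and the sum over all ten rounds is simply the two slide totals added together.

```python
LOOKUPS_PER_COLUMN = 4
XORS_PER_COLUMN = 3
COLUMNS = 4
KEY_XORS = 4            # one 32-bit XOR per column with the round key
MAIN_ROUNDS = 9         # AES-128: 10 rounds, the last one has no MixColumns

table_ops = (LOOKUPS_PER_COLUMN + XORS_PER_COLUMN) * COLUMNS
main_total = (table_ops + KEY_XORS) * MAIN_ROUNDS
last_round = 16 + 24 + KEY_XORS   # SubBytes + ShiftRows + round key XORs

print("table operations per round:", table_ops)
print("main rounds total:", main_total)
print("last round:", last_round)
print("all ten rounds:", main_total + last_round)
```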
AES decryption: high-level
structure (pseudocode)
AES decryption: optimization
 SubBytes() and ShiftRows() transformations
commute, so their order can be changed
 The column mixing operations,
MixColumns() and InvMixColumns(), are
linear with respect to the column input, which
means InvMixColumns(state xor Round Key)
== InvMixColumns(state) xor
InvMixColumns(Round Key)
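The linearity property can be checked numerically on a single column. This is a sketch using the inverse MixColumns matrix from FIPS-197 (coefficients 0e, 0b, 0d, 09); the function names and the sample key column are illustrative.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def inv_mix_column(col):
    """Multiply one 4-byte column by the inverse MixColumns matrix."""
    a0, a1, a2, a3 = col
    return [
        gmul(a0, 0x0E) ^ gmul(a1, 0x0B) ^ gmul(a2, 0x0D) ^ gmul(a3, 0x09),
        gmul(a0, 0x09) ^ gmul(a1, 0x0E) ^ gmul(a2, 0x0B) ^ gmul(a3, 0x0D),
        gmul(a0, 0x0D) ^ gmul(a1, 0x09) ^ gmul(a2, 0x0E) ^ gmul(a3, 0x0B),
        gmul(a0, 0x0B) ^ gmul(a1, 0x0D) ^ gmul(a2, 0x09) ^ gmul(a3, 0x0E),
    ]


def xor_columns(x, y):
    return [p ^ q for p, q in zip(x, y)]


state_col = [0x8E, 0x4D, 0xA1, 0xBC]
key_col = [0x2B, 0x7E, 0x15, 0x16]

lhs = inv_mix_column(xor_columns(state_col, key_col))
rhs = xor_columns(inv_mix_column(state_col), inv_mix_column(key_col))
print(lhs == rhs)
```

Because of this property, InvMixColumns can be applied to the round keys once, during key setup, which lets decryption rounds use the same look-up-then-XOR structure as encryption.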
AES optimized decryption with
changed round keys
Additional details on AES
implementation
 two sets of tables for encryption
 main optimized set (MixColumns, ShiftRows and
SubBytes)
 separate S-box array for the last round
 two sets of tables for decryption (complexity is
the same as for encryption)
 main optimized set (InvMixColumns, InvShiftRows
and InvSubBytes)
 separate inverse S-box array for the last round
NB: ECB decryption is not needed for most block cipher modes of operation
Conclusions
 direct AES implementation is very slow (requires
many byte operations and conditions)
 three different round function operations can be
combined into a single look-up table access
 with an effective implementation, AES consists of
look-up table accesses and round key additions
 the fastest version of AES requires 4 kB of memory for
tables; the fast but compact version requires 1 kB
 fast AES decryption has the same speed
as encryption and uses a changed order of round
function operations with modified round keys
