VLSI DESIGN CONFERENCE 2016
Domain- Analog/Digital Design
Challenge D3- Efficient Accelerator for Authenticated Encryption
Title: HarSam
Authors: Samnit Dua and Hardik Manocha
Passcode: 26X-C4E3D5E4H7
Confirmation No: 26
Introduction:
Our team selected one of the CAESAR candidate papers, TIAOXIN-346, for implementation in the Design
Contest of VLSI Design Conference 2016. The paper
(http://competitions.cr.yp.to/round1/tiaoxinv1.pdf) describes a software implementation and
reports a speed analysis for it. No hardware implementation is listed in the paper.
Our team therefore decided to design hardware for TIAOXIN-346, with emphasis on an FPGA
implementation in Verilog HDL, and to try to reach on the FPGA the speed stated in the paper.
Our team also worked on the memory feature of the design.
A complete analysis of our design, compared against the analysis in the paper, is given on the
following pages.
Our implementation handles 256 bits of text for encryption and decryption.
Specification:
TIAOXIN-346 is a nonce-based authenticated encryption scheme. In this design it operates on 256 bits of
message and 256 bits of associated data, along with a 128-bit key and a 128-bit nonce (public message number).
For the ENCRYPTION/AUTHENTICATION stage, Tiaoxin-346 (K; IV; M; AD) = (C; Tag)
Inputs-
Key, K (128 bits)
Nonce, IV (128 bits)
Plain Text, M (256 bits)
Associated Data, AD (256 bits)
Outputs-
Cipher Text, C (256 bits)
Authentication Tag, Tag (128 bits)
For DECRYPTION stage
Inputs-
Key, K (128 bits)
Nonce, IV (128 bits)
Cipher Text, C (256 bits)
Associated Data, AD (256 bits)
Authentication Tag, Tag (128 bits)
Outputs-
Plain Text, M (256 bits), if the generated Authentication Tag matches the input
Authentication Tag.
Notations:
Z0 - a constant word defined as Z0 =428a2f98d728ae227137449123ef65cd
Z1 - a constant word defined as Z1 =b5c0fbcfec4d3b2fe9b5dba58189dbbc
Ts - a state composed of s words. For instance, T3 has 3 words, T6 has 6 words. To index state words we
use C-language notation, hence Ts = (Ts[0]; Ts[1]; ...; Ts[s-1]), where Ts[i]; i = 0; ...; s-1 are words,
and Ts[0] is the first word.
Operations:
X ^ Y - bitwise addition (XOR) of the words X and Y
X & Y - bitwise conjunction (AND) of the words X and Y
AES(X; SK) - one keyed round of AES applied to the word X, where SK is the sub key, i.e.:
AES(X; SK) = MixColumns(ShiftRows(SubBytes(X))) ^ SK
SubBytes, ShiftRows and MixColumns are the same operations as in AES.
Thus, AES(X; SK) is the AES-NI instruction aesenc.
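The single AES round defined above can be sketched in Python as follows. This is only an illustrative model, not the Verilog code of our design. The function names (gf_mul, build_sbox, aes_round) are hypothetical, and the sketch assumes that a word is stored as 16 bytes in the column-major order of the AES state (byte i sits in row i mod 4, column i div 4).

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list:
    """Derive the AES S-box: multiplicative inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        s = 0x63
        for shift in range(5):  # inv ^ rotl(inv,1) ^ ... ^ rotl(inv,4) ^ 0x63
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s)
    return sbox


SBOX = build_sbox()


def aes_round(x: bytes, sk: bytes) -> bytes:
    """AES(X; SK) = MixColumns(ShiftRows(SubBytes(X))) ^ SK on 16-byte words."""
    sub = [SBOX[b] for b in x]
    # ShiftRows: row r of the state is rotated left by r columns.
    shifted = [sub[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]
    out = []
    for c in range(4):  # MixColumns, one column at a time
        a0, a1, a2, a3 = shifted[4 * c: 4 * c + 4]
        out += [
            gf_mul(a0, 2) ^ gf_mul(a1, 3) ^ a2 ^ a3,
            a0 ^ gf_mul(a1, 2) ^ gf_mul(a2, 3) ^ a3,
            a0 ^ a1 ^ gf_mul(a2, 2) ^ gf_mul(a3, 3),
            gf_mul(a0, 3) ^ a1 ^ a2 ^ gf_mul(a3, 2),
        ]
    return bytes(o ^ k for o, k in zip(out, sk))
```

The S-box is computed rather than typed in as a table, which keeps the sketch short. One way to check such a model is against the round-1 intermediate values of the worked cipher example in FIPS-197.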
R (Ts; M) - a round transformation of a state with s words. The inputs of R are the state Ts and the word M,
while the output is a new state Tsnew, i.e.
R: Ts x M -> Tsnew:
Tsnew[0] = AES (Ts[s - 1]; Ts[0]) ^ M
Tsnew[1] = AES (Ts[0]; Z0)
Tsnew[2] = Ts[1]
...
Tsnew[s - 1] = Ts[s - 2]
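As a rough illustration of the round transformation R, the sketch below models a state as a Python list of 16-byte words. The names round_r and xor are hypothetical, and the AES round is passed in as a callable (an assumption made here so that the sketch does not depend on any particular AES implementation).

```python
from typing import Callable, List

Word = bytes  # one 128-bit word, stored as 16 bytes
Z0 = bytes.fromhex("428a2f98d728ae227137449123ef65cd")


def xor(a: Word, b: Word) -> Word:
    """Bitwise XOR of two words."""
    return bytes(x ^ y for x, y in zip(a, b))


def round_r(state: List[Word], m: Word,
            aes: Callable[[Word, Word], Word]) -> List[Word]:
    """R(Ts; M): returns the new state, leaving the input list untouched."""
    s = len(state)
    new = [
        xor(aes(state[s - 1], state[0]), m),  # Tsnew[0]
        aes(state[0], Z0),                    # Tsnew[1]
    ]
    new.extend(state[1:s - 1])                # Tsnew[i] = Ts[i-1], i = 2..s-1
    return new
```

Only the first two words need an AES round; every other word is a plain shift by one position, so the cost per round does not grow with the state size.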
States of TIAOXIN-346:
Both the Encryption and Decryption parts of TIAOXIN-346 operate upon three states- T3, T4 and
T6. T3 consists of 3 words; T4 consists of 4 words while T6 consists of 6 words.
Update Operation:
T3, T4 and T6 are updated using the UPDATE function, which applies the R(T; M) operation
defined above to each state.
Update: T3 x T4 x T6 x M0 x M1 x M2 -> T3 x T4 x T6
T3new = R(T3; M0); T3 = T3new
T4new = R(T4; M1); T4 = T4new
T6new = R(T6; M2); T6 = T6new
[Figure: the Update operation applied to T3, T4 and T6.]
In the figure, a circled A stands for one AES round. The AES rounds applied to T3[2]; T4[3]; T6[5] are keyless, while the
AES rounds applied to T3[0]; T4[0]; T6[0] use Z0 as a sub key.
Definition of TIAOXIN-346
Tiaoxin-346 processes the associated data AD and the message M in blocks, where each block is
composed of 2 words (32 bytes, 256 bits).
The associated data AD is 32 bytes long. The length of AD is encoded as a 16-byte big-endian word and
stored in AD Length, i.e. AD Length = |AD|.
The message M is 32 bytes long. The length of M is encoded as a 16-byte big-endian word and stored in
M Length, i.e. M Length = |M|.
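The length encoding can be illustrated with a short Python sketch. The helper name encode_length is hypothetical, and the sketch assumes the length is counted in bytes. The text above does not state the unit, so this choice should be checked against the TIAOXIN-346 paper.

```python
def encode_length(n: int) -> bytes:
    """Encode a non-negative length as a 16-byte big-endian word."""
    if n < 0:
        raise ValueError("length must be non-negative")
    return n.to_bytes(16, byteorder="big")


# Both AD and M are 32 bytes long in this design (unit assumed to be bytes).
AD_LENGTH = encode_length(32)
M_LENGTH = encode_length(32)
```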
Tiaoxin-346 is a stream-cipher-based design and as such it works in four phases: Initialization,
Processing associated data, Encryption, and Finalization. These phases are executed in the order
specified above.
INITIALIZATION:
In the initialization, the key K and the public message number (nonce) IV are loaded into the three states
T3; T4; T6 and the states go through 15 rounds.
T3 [0] = K; T3 [1] = K; T3 [2] = IV;
T4 [0] = K; T4 [1] = K; T4 [2] = IV; T4 [3] = Z0;
T6 [0] = K; T6 [1] = K; T6 [2] = IV; T6 [3] = Z1; T6 [4] = 0; T6 [5] = 0;
for i = 1 to 15
Update (T3; T4; T6; Z0; Z1; Z0);
end for
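The initialization phase can be sketched in Python as below. This is a simplified model under stated assumptions, not our Verilog code: words are 16-byte values, the names round_r, update and initialize are hypothetical, and the AES round is supplied by the caller as a callable.

```python
from typing import Callable, List, Tuple

Word = bytes  # one 128-bit word as 16 bytes
Aes = Callable[[Word, Word], Word]
Z0 = bytes.fromhex("428a2f98d728ae227137449123ef65cd")
Z1 = bytes.fromhex("b5c0fbcfec4d3b2fe9b5dba58189dbbc")
ZERO = bytes(16)


def xor(a: Word, b: Word) -> Word:
    return bytes(x ^ y for x, y in zip(a, b))


def round_r(t: List[Word], m: Word, aes: Aes) -> List[Word]:
    """R(Ts; M): two AES rounds at the front, then a one-word shift."""
    return [xor(aes(t[-1], t[0]), m), aes(t[0], Z0)] + t[1:-1]


def update(t3, t4, t6, m0, m1, m2, aes: Aes):
    """Update: apply R to each of the three states with its own input word."""
    return round_r(t3, m0, aes), round_r(t4, m1, aes), round_r(t6, m2, aes)


def initialize(key: Word, iv: Word, aes: Aes) -> Tuple[list, list, list]:
    """Load K and IV into T3, T4, T6 and run the 15 initialization rounds."""
    t3 = [key, key, iv]
    t4 = [key, key, iv, Z0]
    t6 = [key, key, iv, Z1, ZERO, ZERO]
    for _ in range(15):
        t3, t4, t6 = update(t3, t4, t6, Z0, Z1, Z0, aes)
    return t3, t4, t6
```

Passing the AES round as a parameter keeps the state handling separate from the AES datapath, which is also a natural module boundary in a hardware description.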
PROCESSING ASSOCIATED DATA
Assume the associated data is composed of two words, i.e. AD = AD0 || AD1. Processing of the associated
data is defined as:
Update (T3; T4; T6; AD0; AD1; AD0 ^ AD1);
ENCRYPTION:
Assume the message M is composed of two words, i.e. M = M0 || M1. In the encryption, a block M is
processed in one round, and a block of cipher text C = C0 || C1 (concatenation) is output.
Encryption is defined as:
Update (T3; T4; T6; M0; M1; M0 ^ M1);
C0=T3 [0] ^ T3 [2] ^ T4 [1] ^ (T6 [3] & T4 [3]);
C1= T6 [0] ^ T4 [2] ^ T3 [1] ^ (T6 [5] & T3 [2]);
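The two output equations above translate directly into code. The sketch below shows only this extraction step, applied to states that have already been updated. The function names are hypothetical and words are modelled as 16-byte values.

```python
def xor(*words: bytes) -> bytes:
    """Bitwise XOR of any number of 16-byte words."""
    out = bytearray(16)
    for w in words:
        for i, b in enumerate(w):
            out[i] ^= b
    return bytes(out)


def and_words(a: bytes, b: bytes) -> bytes:
    """Bitwise AND of two 16-byte words."""
    return bytes(x & y for x, y in zip(a, b))


def ciphertext_block(t3: list, t4: list, t6: list) -> bytes:
    """Return C = C0 || C1 from the states after Update has absorbed M."""
    c0 = xor(t3[0], t3[2], t4[1], and_words(t6[3], t4[3]))
    c1 = xor(t6[0], t4[2], t3[1], and_words(t6[5], t3[2]))
    return c0 + c1
```

The AND terms are the only non-linear part of this step, and they mix words from two different states.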
TAG PRODUCTION:
After all message blocks have been processed, the words holding the lengths of the associated data and
message are processed, then the states go through 20 more rounds, and the tag Tag is produced as an
XOR of all words of all states. This final phase is defined as:
Update (T3; T4; T6; AD Length; M Length; AD Length ^ M Length);
for i = 1 to 20
Update (T3; T4; T6; Z1; Z0; Z1);
end for
Tag= T3 [0] ^ T3 [1] ^ T3 [2] ^ T4 [0] ^ T4 [1] ^ T4 [2] ^ T4 [3] ^ T6 [0] ^
T6 [1] ^ T6 [2] ^ T6 [3] ^ T6 [4] ^ T6 [5];
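The last step of tag production, the XOR of all thirteen state words, can be written as a fold. The sketch below covers only this final XOR and assumes the length update and the 20 extra rounds have already been applied. The names xor2 and make_tag are hypothetical.

```python
from functools import reduce


def xor2(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two 16-byte words."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_tag(t3: list, t4: list, t6: list) -> bytes:
    """Tag = XOR of all words of T3, T4 and T6 (3 + 4 + 6 = 13 words)."""
    return reduce(xor2, t3 + t4 + t6)
```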
DECRYPTION and VERIFICATION:
In the decryption-verification process, the order of the phases is the same: Initialization, Processing
associated data, Decryption, and Finalization. Initialization, Processing associated data and Finalization
are the same as during the encryption. Decryption is defined as:
Update (T3; T4; T6; 0; 0; 0);
M0= C0 ^ T3 [0] ^ T3 [2] ^ T4 [1] ^ (T6 [3] & T4 [3]);
M1= C1 ^ T6 [0] ^ T4 [2] ^ T3 [1] ^ (T6 [5] & T3 [2]) ^ M0;
T3 [0] = T3 [0] ^ M0;
T4 [0] = T4 [0] ^ M1;
T6 [0] = T6 [0] ^ M0 ^ M1;
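To see why the decryption equations recover the message, it helps to run one block through both directions. The sketch below is a minimal model with hypothetical function names. The AES round is passed in as a callable, and the round trip holds for any such function because only word 0 of each state depends on the message.

```python
from typing import Callable, List

Word = bytes
Aes = Callable[[Word, Word], Word]
Z0 = bytes.fromhex("428a2f98d728ae227137449123ef65cd")
ZERO = bytes(16)


def xor(*words: Word) -> Word:
    out = bytearray(16)
    for w in words:
        for i, b in enumerate(w):
            out[i] ^= b
    return bytes(out)


def and_words(a: Word, b: Word) -> Word:
    return bytes(x & y for x, y in zip(a, b))


def round_r(t: List[Word], m: Word, aes: Aes) -> List[Word]:
    return [xor(aes(t[-1], t[0]), m), aes(t[0], Z0)] + t[1:-1]


def update(t3, t4, t6, m0, m1, m2, aes: Aes):
    return round_r(t3, m0, aes), round_r(t4, m1, aes), round_r(t6, m2, aes)


def encrypt_block(t3, t4, t6, m0, m1, aes: Aes):
    """Absorb M0, M1 and output C0, C1 plus the new states."""
    t3, t4, t6 = update(t3, t4, t6, m0, m1, xor(m0, m1), aes)
    c0 = xor(t3[0], t3[2], t4[1], and_words(t6[3], t4[3]))
    c1 = xor(t6[0], t4[2], t3[1], and_words(t6[5], t3[2]))
    return c0, c1, (t3, t4, t6)


def decrypt_block(t3, t4, t6, c0, c1, aes: Aes):
    """Update with zero words, recover M0, M1, then patch word 0 of each state."""
    t3, t4, t6 = update(t3, t4, t6, ZERO, ZERO, ZERO, aes)
    m0 = xor(c0, t3[0], t3[2], t4[1], and_words(t6[3], t4[3]))
    m1 = xor(c1, t6[0], t4[2], t3[1], and_words(t6[5], t3[2]), m0)
    t3 = [xor(t3[0], m0)] + t3[1:]
    t4 = [xor(t4[0], m1)] + t4[1:]
    t6 = [xor(t6[0], m0, m1)] + t6[1:]
    return m0, m1, (t3, t4, t6)
```

After the three patch assignments the decryption states equal the encryption states, so the tag computed during decryption can match the one produced during encryption.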
VERILOG IMPLEMENTATION:
Verilog HDL is an IEEE-standard hardware description language that is widely used in the design of
digital integrated circuits. Verilog is used for verification through simulation, for timing analysis, for test
analysis and for logic synthesis. It allows designers to work at various levels of abstraction,
such as register transfer level, gate level and switch level. Verilog also serves as the input to synthesis programs,
which generate a gate-level description of the circuit. Xilinx ISE 13.2 is a software tool developed by
Xilinx for synthesis and analysis of HDL designs.
Our Verilog HDL code was written in Xilinx ISE 13.2.
SIMULATION:
Our Verilog HDL code was simulated using ISim, available with Xilinx ISE 13.2.
Test Vectors/Data for ENCRYPTION:
Inputs
Key, K = 91cc70a38f1cf31c3a3a39c748e8ee3a
Nonce, IV = b7ddefbdfad7df7b7dbee3e5f5f5fbe6
Message, M= b7ddf2398e1471e39e6387474738e91d1dc74fbdfad7df7b7dbee3e5f5f5fbe6
Associated Data, AD= 91cc70a38f1cf31c3a3a39c748edbeef7defd6befbdbedf71f2fafafdf30ee3a
Outputs
C= d4a1b9fb02fa511cdf7f8cfbb90e22438702502bada2b70436ca6fc14c5d6224
Tag= bf979c14211c4930064abc4f50c2d0d0
Simulation Result for ENCRYPTION:
Test Vectors/Data for DECRYPTION:
Inputs
Key, K = 91cc70a38f1cf31c3a3a39c748e8ee3a
Nonce, IV = b7ddefbdfad7df7b7dbee3e5f5f5fbe6
Associated Data, AD= 91cc70a38f1cf31c3a3a39c748edbeef7defd6befbdbedf71f2fafafdf30ee3a
C= d4a1b9fb02fa511cdf7f8cfbb90e22438702502bada2b70436ca6fc14c5d6224
Tag= bf979c14211c4930064abc4f50c2d0d0
Output CASE (1): When the same Tag is supplied to DECRYPTION:
Message, M= b7ddf2398e1471e39e6387474738e91d1dc74fbdfad7df7b7dbee3e5f5f5fbe6
fail= 1
Simulation Result for Decryption:
Output CASE (2): When a different Tag is supplied to DECRYPTION:
Tag= 0f979c14211c4930064abc4f50c2d0d0 (here only the first hex digit is changed to 0)
Message, M= X
fail= 0
Simulation Result for Decryption:
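The two decryption cases can be mirrored in a small Python sketch of the release rule: the plain text is output only when the generated Tag equals the input Tag. The function name release_plaintext is hypothetical, and the constant-time comparison through hmac.compare_digest is a common software practice added here, not something taken from our Verilog code. The sketch does not model the fail signal.

```python
import hmac
from typing import Optional


def release_plaintext(plaintext: bytes, generated_tag: bytes,
                      input_tag: bytes) -> Optional[bytes]:
    """Return the plain text only if the two tags match, otherwise None."""
    if hmac.compare_digest(generated_tag, input_tag):
        return plaintext
    return None


# Values taken from the test vectors listed above.
TAG = bytes.fromhex("bf979c14211c4930064abc4f50c2d0d0")
M = bytes.fromhex(
    "b7ddf2398e1471e39e6387474738e91d1dc74fbdfad7df7b7dbee3e5f5f5fbe6")
# Case (2): the first hex digit of the Tag is changed to 0.
BAD_TAG = bytes.fromhex("0" + TAG.hex()[1:])

case1 = release_plaintext(M, TAG, TAG)      # same Tag: plain text released
case2 = release_plaintext(M, TAG, BAD_TAG)  # different Tag: nothing released
```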
SYNTHESIZE SETTINGS:
Authenticated Encryption Decryption Scheme
SYNTHESIZE SUMMARY
The device and family used for our design implementation is SPARTAN 3E (xc3s500e-5vq100).
SUMMARY- ENCRYPTION
This summary shows the Synthesize report for the Enhanced Pentium M architecture.
Summary for the Haswell architecture
SUMMARY- DECRYPTION
The detailed Synthesize report for Decryption is available in the main folder, at “reports ->
synthesize reports -> detailed_synthesize_report_dec.txt”.
FPGA IMPLEMENTATION:
The code written by our team was altered in order to test our design on the FPGA.
Encryption code changes:
1) Inputs K, IV, AD and M are held constant in the code.
2) The rest of the inputs are unchanged.
3) Inputs K, IV, AD and M are declared as parameters with the values listed in this paper.
4) The outputs defined for the FPGA code are only 8 bits wide. All these bits are used to display
the Cipher Data and Tag on LEDs.
5) Only certain bits of C and Tag are displayed on the LEDs, in order to keep maximum
similarity with the actual code.
SUMMARY- ENCRYPTION for FPGA
The detailed Synthesize report is available in the main folder as “synthesize
report_fpga_implementation -> detailed_synthesize_report_enc_fpga.txt”.
Decryption Code changes:
1) The only inputs are clk and rst.
2) Inputs K, IV, C, Tag and AD are declared as parameters with the values generated
by the Encryption module.
3) The output is a single LED, which indicates whether the input Tag matches the
generated Tag, and thus what the output would be.
SUMMARY- DECRYPTION for FPGA
COMPARISON:
This section compares our design with the one described in the
paper (http://competitions.cr.yp.to/round1/tiaoxinv1.pdf). The comparison covers ENCRYPTION only,
as the performance listed for TIAOXIN-346 is only for ENCRYPTION.
Features                                                         Our Design    TIAOXIN-346
Software                                                         Yes           Yes
Hardware (SPARTAN 3E)                                            Yes           No
SPEED (Enhanced Pentium M micro architecture) (256 bits Data)    7.562 ns      N/A
SPEED (Haswell micro architecture) (256 bits Data)               7.782 ns      N/A
SPEED (Sandy Bridge micro architecture) (256 bits Data)          N/A           1.45 ns
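The figures in the comparison can be put side by side with a short calculation. The sketch below simply divides each of our timings by the 1.45 ns listed for TIAOXIN-346, using only the values from the table. The variable and function names are hypothetical.

```python
# Timings for 256 bits of data, as listed in the comparison.
TIAOXIN_NS = 1.45
OUR_DESIGN_NS = {
    "Enhanced Pentium M": 7.562,
    "Haswell": 7.782,
}


def slowdown(ours_ns: float, reference_ns: float) -> float:
    """How many times slower our timing is than the reference timing."""
    return ours_ns / reference_ns


for label, ns in OUR_DESIGN_NS.items():
    print(f"{label}: {slowdown(ns, TIAOXIN_NS):.2f}x slower")
```

Both ratios come out slightly above 5, which is the basis for describing our design as roughly 5 times slower.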
CONCLUSION:
The problem statement for the Design Contest required participating teams to
implement hardware for a CAESAR entry, and our team achieved a hardware
implementation of TIAOXIN-346. The paper entitled “TIAOXIN-346” lists no hardware
implementation; only a software implementation is described.
In our design, however, we were not able to match the SPEED performance of
TIAOXIN-346. Our design is about 5 times slower than TIAOXIN-346, but we have successfully
verified our design on the FPGA.
Our team is still working to bring the SPEED figures listed for TIAOXIN-346 to our
design, so that a HARDWARE IMPLEMENTATION becomes one more feature of TIAOXIN-346.
For this, our team has built another design which makes use of “Function Calls and LOOP
Structures” instead of “multiple Module Instantiations”. We have successfully
SIMULATED this design, but owing to a lack of system resources the Synthesize process takes a
very long time and does not complete. We were therefore not able to determine the
SPEED figures of that design, and it has not yet been implemented in hardware. We expect
the design with Functions and LOOP Structures to match the SPEED figures of TIAOXIN-346,
as we estimate that Functions and LOOP Structures take only 2-3 clock ticks rather than complete clock
cycles.
