NXFEE INNOVATION
(SEMICONDUCTOR IP & PRODUCT DEVELOPMENT)
(ISO 9001:2015 Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry – 605004, India.
Buy projects online: www.nxfee.com | Contact: +91 9789443203 |
Email: nxfee.innovation@gmail.com
_________________________________________________________________
Analysis and Design of Cost-Effective, High-Throughput LDPC Decoders
Abstract:
This paper introduces a new approach to cost-effective, high-throughput hardware
designs for low-density parity-check (LDPC) decoders. The proposed approach, called
nonsurjective finite alphabet iterative decoders (NS-FAIDs), exploits the robustness of
message-passing LDPC decoders to inaccuracies in the calculation of exchanged
messages, and it is shown to provide a unified framework for several designs previously
proposed in the literature. NS-FAIDs are optimized by density evolution for regular and
irregular LDPC codes, and are shown to provide different tradeoffs between hardware
complexity and decoding performance. Two hardware architectures targeting
high-throughput applications are also proposed, integrating both Min-Sum (MS) and NS-FAID
decoding kernels. ASIC post-synthesis implementation results on 65-nm CMOS
technology show that NS-FAIDs yield significant improvements in the throughput-to-area
ratio, by up to 58.75% with respect to the MS decoder, with even better or only slightly
degraded error correction performance.
Software Implementation:
• ModelSim
• Xilinx ISE 14.2
Existing System:
The increasing demand for massive data rates in wireless communication systems will
require significantly higher processing speed of the baseband signal, compared with
conventional solutions. This is especially challenging for forward error correction (FEC)
mechanisms, since FEC decoding is one of the most computationally intensive baseband
processing tasks, consuming a large amount of hardware resources and energy. The use
of very large bandwidths will also result in stringent, application-specific requirements
in terms of both throughput and latency. The conventional approach to increasing
throughput is to use massively parallel architectures. In this context, low-density
parity-check (LDPC) codes are recognized as the foremost solution, due to the intrinsic capacity
of their decoders to accommodate various degrees of parallelism. They have found
extensive applications in modern communication systems, due to their excellent decoding
performance, high-throughput capabilities, and power efficiency, and have been adopted
in several recent communication standards. This paper targets the design of
cost-effective, high-throughput LDPC decoders. One important characteristic of LDPC
decoders is that the memory and interconnect blocks dominate the overall
area/delay/power performance of the hardware design. To address this issue, we build
upon the previously introduced concept of finite alphabet iterative decoders (FAIDs). While FAIDs
have been previously investigated for variable-node (VN) regular LDPC codes over the
binary symmetric channel, this paper extends their use to any channel model and to both
regular and irregular LDPC codes.
The approach considered in this paper, referred to as nonsurjective FAIDs (NS-FAIDs),
is to allow storing the exchanged messages using a lower precision (a smaller
number of bits) than that used by the processing units. The basic idea is to reduce the size
of the exchanged messages once they have been updated by the processing units. Hence,
to some extent, the proposed approach is akin to the use of imprecise storage, which is
seen as an enabler for cost and throughput optimizations. Moreover, NS-FAIDs are
shown to provide a unified framework for several designs previously proposed in the
literature, including the normalized and offset Min-Sum (OMS) decoders, the partially
OMS (POMS) decoder, other previously proposed MS-based decoders, and the recently
introduced dual-quantization domain MS decoder.
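As a rough illustration of storing messages at a lower precision than the processing units use, the following Python sketch maps 4-bit messages onto a smaller set of values and packs them into 3-bit words for storage. The table FRAMED_MAGNITUDES, the 4-bit and 3-bit widths, and the function names are assumptions chosen for this example only; they are not taken from the paper.

```python
# Hypothetical setting: processing units work on messages in [-7, 7] (4 bits),
# while the memory stores only 3 bits per message (1 sign bit + 2 index bits).
FRAMED_MAGNITUDES = [0, 1, 3, 7]  # assumed table, for illustration only


def frame(message: int) -> int:
    """Map a message to the largest allowed magnitude not exceeding |message|.

    The mapping is not surjective onto [-7, 7]: only 7 distinct values remain.
    """
    magnitude = abs(message)
    framed = max(m for m in FRAMED_MAGNITUDES if m <= magnitude)
    return framed if message >= 0 else -framed


def compress(framed: int) -> int:
    """Encode a framed message as a 3-bit word: sign bit, then 2-bit index."""
    sign_bit = 1 if framed < 0 else 0
    return (sign_bit << 2) | FRAMED_MAGNITUDES.index(abs(framed))


def decompress(word: int) -> int:
    """Recover the framed message from its 3-bit stored word."""
    magnitude = FRAMED_MAGNITUDES[word & 0b11]
    return -magnitude if word >> 2 else magnitude


if __name__ == "__main__":
    for message in (-6, -2, 0, 2, 5, 7):
        word = compress(frame(message))
        print(message, "->", frame(message), "-> stored as", format(word, "03b"))
```

The processing units still see 4-bit values after decompression, but every stored message occupies 3 bits, which is the kind of memory saving the approach aims at.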
This paper refines and extends some of the concepts we previously introduced. In
particular, the definition of NS-FAIDs is extended so as to cover a larger class of
decoders, which is shown to significantly improve the decoding performance when
the exchanged messages are quantized on a small number of bits (e.g., 2 bits per
exchanged message). We show that NS-FAIDs can be optimized by using the density
evolution (DE) technique.
Disadvantages:
• Cost is high
• Error correction performance is poor
Proposed System:
Full-Layer Architecture
A different possibility to increase throughput is to increase the hardware parallelism, by
including several nonoverlapping rows of the base matrix in one decoding layer. For
instance, for the base matrix, we may consider RPL = 4 consecutive rows per decoding
layer, and thus the number of decoding layers is L = 3. In this case, each column of the
base matrix has one (and only one) nonzero entry in each decoding layer; such a decoding
layer is referred to as being full. Full layers correspond to the maximum hardware
parallelism that can be exploited by layered architectures, but they also prevent the
pipelining of the data path. Fig. 1 shows the mapping between VNs and VNUs.
Fig. 1. Mapping between VNs and VNUs. Black: VNs of degree 2. Red: VNs of degree 3. Blue: VNs of
degree 6.
One possibility to implement a full-layer decoder is to use an architecture similar to the
pipelined one, by removing the registers inserted after the VNU (since pipelining is
incompatible with the use of full layers) and updating the control unit. However, in such
an architecture, read and write operations on the β_memory would occur at the same
memory location, corresponding to the current layer being processed. This would require
the use of asynchronous dual-port RAM to implement the β_memory, which is generally
known to be slower than synchronous dual-port RAM. The architecture proposed in this
section aims to avoid the use of asynchronous RAM, while providing an effective way to
benefit from the increased hardware parallelism enabled by the use of full layers. We
discuss below the main changes with respect to the pipelined architecture, which concern
the α_memory and the barrel shifter blocks (the other blocks are the same as in the
pipelined architecture), as well as a complete reorganization of the data path.
Nevertheless, it can be easily verified that both architectures are logically equivalent,
i.e., they both implement the same decoding algorithm.
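The notion of a full layer described above can be checked mechanically. The Python sketch below groups consecutive rows of a base matrix into layers and tests whether every column has exactly one entry per layer. The small 4 x 4 matrix is a made-up example, and the convention that -1 marks an all-zero block while values of 0 or more are circulant shifts is an assumption commonly used for quasi-cyclic codes; neither is taken from the source.

```python
# Hypothetical base matrix: -1 marks an all-zero block (assumed convention),
# any value >= 0 is the cyclic shift of a circulant block.
BASE_MATRIX = [
    [0, -1, 2, -1],
    [-1, 1, -1, 3],
    [2, 0, -1, -1],
    [-1, -1, 1, 0],
]


def split_into_layers(base_matrix, rows_per_layer):
    """Group consecutive rows of the base matrix into decoding layers."""
    return [
        base_matrix[start:start + rows_per_layer]
        for start in range(0, len(base_matrix), rows_per_layer)
    ]


def is_full_layer(layer_rows):
    """Return True if each column has exactly one entry (>= 0) in the layer."""
    num_columns = len(layer_rows[0])
    return all(
        sum(1 for row in layer_rows if row[column] >= 0) == 1
        for column in range(num_columns)
    )


if __name__ == "__main__":
    for rows_per_layer in (1, 2):
        layers = split_into_layers(BASE_MATRIX, rows_per_layer)
        print(rows_per_layer, [is_full_layer(layer) for layer in layers])
```

With two rows per layer both layers of this toy matrix are full, whereas with one row per layer no layer is, since half of the columns have no entry in it.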
1) α_Memory: This memory is used to store the VN-messages for the current decoding
layer (unlike the previous architecture, the AP-LLR values are not stored in memory).
Since only one -bit (unsaturated) VN-message is stored for each VN, this memory has
exactly the same size as the _memory used within the previous pipelined architecture.
VN-messages for the current layer are read from the α_memory, then saturated or framed
depending on the decoding kernel, and supplied to the corresponding CNUs. CN-messages
computed by the CNUs are stored in the β_memory (at the location corresponding to the
current layer), and also forwarded to the AP-LLR unit, through the DCP (decompress) and
DE-FRA (deframing) blocks, according to the CNU implementation (compressed or
uncompressed) and the decoding kernel (MS or NS-FAID). The AP-LLR unit computes the
sum of the incoming VN- and CN-messages, which corresponds to the AP-LLR value to be
used at the next layer (since it has already been updated by the current layer). The
AP-LLR value is forwarded to the VNU, through the corresponding BS and PER blocks.
Finally, the VN-message for the next layer is computed as the difference between the
incoming AP-LLR and the corresponding next-layer CN-message computed at the previous
iteration, the latter being read from the β_memory.
Fig. 2. High-level description of the proposed HW architectures, with both MS and NS-FAID kernels.
(a) Pipelined architecture. (b) Full-layer architecture.
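The data flow described in item 1 can be mimicked in software. The following Python sketch is a simplified model under several assumptions: it uses a plain Min-Sum check-node update, a made-up 4-bit code with two full layers (LAYERS), and it omits saturation, framing, compression, and the PER/BS blocks. The names alpha, beta, and ap_llr merely mirror the α_memory, β_memory, and AP-LLR unit of the text; the code is not the authors' implementation.

```python
# Hypothetical toy code: 4 variable nodes, 2 full layers, 2 checks per layer.
# Every variable node appears in exactly one check of each layer.
LAYERS = [
    [(0, 1), (2, 3)],
    [(0, 2), (1, 3)],
]


def min_sum_check(alphas):
    """Min-Sum CN update: sign product and minimum magnitude of the other inputs."""
    outputs = []
    for i in range(len(alphas)):
        others = alphas[:i] + alphas[i + 1:]
        sign = -1 if sum(a < 0 for a in others) % 2 else 1
        outputs.append(sign * min(abs(a) for a in others))
    return outputs


def decode(channel_llrs, iterations=5):
    """Layered decoding that keeps only VN-messages (alpha) and CN-messages (beta)."""
    num_vns = len(channel_llrs)
    num_layers = len(LAYERS)
    beta = [[0] * num_vns for _ in range(num_layers)]  # one CN-message per VN per layer
    alpha = list(channel_llrs)  # VN-messages for the first layer
    ap_llr = list(channel_llrs)
    for _ in range(iterations):
        for layer in range(num_layers):
            # CNUs: compute and store the CN-messages of the current layer.
            for check in LAYERS[layer]:
                new_betas = min_sum_check([alpha[v] for v in check])
                for v, value in zip(check, new_betas):
                    beta[layer][v] = value
            next_layer = (layer + 1) % num_layers
            for v in range(num_vns):
                # AP-LLR unit: sum of the VN-message and the new CN-message.
                ap_llr[v] = alpha[v] + beta[layer][v]
                # VNU: subtract the next layer's CN-message from the previous iteration.
                alpha[v] = ap_llr[v] - beta[next_layer][v]
    return [1 if value < 0 else 0 for value in ap_llr]


if __name__ == "__main__":
    print(decode([4, 3, -1, 5]))
```

In this model the AP-LLR values exist only transiently inside the loop, which reflects the remark that they are not stored in memory in the full-layer architecture.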
2) PER/BS Blocks: The PER_1 / BS_1 blocks permute / shift the data read from the input
buffer, according to the positions / values of the nonnegative entries in the first decoding
layer. Similar to the BS_R blocks in the pipelined architecture, the PER_WR / BS_WR
blocks permute / shift the AP-LLR values, according to the difference between the
positions / values of the current layer's nonnegative entries and those of the next layer.
This way, VN-messages stored in the α_memory are already permuted and shifted
for the subsequent decoding layer. Finally, the PER_L / BS_L blocks permute / shift the hard
decision bits (signs of the AP-LLR values), according to the positions / values of the
nonnegative entries in the last decoding layer.
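The idea of shifting by the difference between two layers can be illustrated with a short Python sketch. It models only the barrel shifter (BS) part for a single circulant block, under the assumption that a shift value s means a left rotation by s positions; the function names and the block size of 8 are hypothetical, and the permutation (PER) part is not modeled.

```python
def cyclic_shift(block, shift):
    """Rotate a block of values to the left by `shift` positions (assumed convention)."""
    shift %= len(block)
    return block[shift:] + block[:shift]


def realign_for_next_layer(aligned_block, shift_current, shift_next):
    """Re-align data already shifted for the current layer to the next layer.

    Only the difference between the two shift values is applied, so the data
    never has to be shifted back to its original order in between.
    """
    return cyclic_shift(aligned_block, shift_next - shift_current)


if __name__ == "__main__":
    block = list(range(8))              # one block of 8 values in natural order
    aligned = cyclic_shift(block, 3)    # alignment required by the current layer
    print(realign_for_next_layer(aligned, 3, 5))
    print(cyclic_shift(block, 5))       # same result, obtained directly
```

Because only the relative shift is applied, a single shifter stage per layer is enough to keep the stored values aligned with the layer about to be processed.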
Advantages:
• Cost effective
• Error correction performance is good
References:
[1] M. Karkooti and J. R. Cavallaro, ―Semi-parallel reconfigurable architectures for real-time LDPC
decoding,‖ in Proc. Int. Conf. Inf. Technol., Coding Comput. (ITCC), vol. 1. Apr. 2004, pp. 579–585.
[2] X. Chen, J. Kang, S. Lin, and V. Akella, ―Memory system optimization for FPGA-based
implementation of quasi-cyclic LDPC codes decoders,‖ IEEE Trans. Circuits Syst. I, Reg. Papers, vol.
58, no. 1, pp. 98–111, Jan. 2011.
[3] V. A. Chandrasetty and S. M. Aziz, ―Resource efficient LDPC decoders for multimedia
communication,‖ Integr., VLSI J., vol. 48, pp. 213–220, Jan. 2015.
[4] K. Zhang, X. Huang, and Z. Wang, ―High-throughput layered decoder implementation for quasi-
cyclic LDPC codes,‖ IEEE J. Sel. Areas Commun., vol. 27, no. 6, pp. 985–994, Aug. 2009.
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
[5] X. Peng, Z. Chen, X. Zhao, D. Zhou, and S. Goto, ―A 115 mW 1 Gbps QC-LDPC decoder ASIC for
WiMAX in 65 nm CMOS,‖ in Proc. IEEE Asian Solid State Circuits Conf. (A-SSCC), Nov. 2011, pp.
317–320.
[6] B. Xiang, D. Bao, S. Huang, and X. Zeng, ―An 847–955 Mb/s 342–397 mW dual-path fully-
overlapped QC-LDPC decoder for WiMAX system in 0.13 μm CMOS,‖ IEEE J. Solid-State Circuits,
vol. 46, no. 6, pp. 1416–1432, Jun. 2011.
[7] E. Boutillon and G. Masera, ―Hardware design and realization for iteratively decodable codes,‖ in
Channel Coding: Theory, Algorithms, and Applications. Amsterdam, The Netherlands: Elsevier, 2014,
pp. 583–642.
[8] S. K. Planjery, S. K. Chilappagari, B. Vasi´c, D. Declercq, and L. Danjean, ―Iterative decoding
beyond belief propagation,‖ in Proc. IEEE Inf. Theory Appl. Workshop (ITA), Jan. 2010, pp. 1–10.
[9] S. K. Planjery, D. Declercq, L. Danjean, and B. Vasi´c, ―Finite alphabet iterative decoders for LDPC
codes surpassing floating-point iterative decoders,‖ Electron. Lett., vol. 47, no. 16, pp. 919–921, 2011.
[10] S. K. Planjery, D. Declercq, L. Danjean, and B. Vasi´c, ―Finite alphabet iterative decoders—Part I:
Decoding beyond belief propagation on the binary symmetric channel,‖ IEEE Trans. Commun., vol. 61,
no. 10, pp. 4033–4045, Oct. 2013.
[11] J. Chen, A. Dholakia, E. Eleftheriou, M. P. C. Fossorier, and X.-Y. Hu, ―Reduced-complexity
decoding of LDPC codes,‖ IEEE Trans. Commun., vol. 53, no. 8, pp. 1288–1299, Aug. 2005.
[12] V. Savin, ―LDPC decoders,‖ in Channel Coding: Theory, Algorithms, and Applications.
Amsterdam, The Netherlands: Elsevier, 2014, pp. 211–260.

More Related Content

PDF
A reconfigurable ldpc decoder optimized applications
Nxfee Innovation
 
PDF
Design and fpga implementation of a reconfigurable digital down converter for...
Nxfee Innovation
 
PDF
Int hhg case_study_0
intelligrated
 
PDF
Design of an area efficient million-bit integer multiplier using double modul...
Nxfee Innovation
 
PDF
Approximate hybrid high radix encoding for energy efficient inexact multipliers
Nxfee Innovation
 
PDF
Vector processing aware advanced clock-gating techniques for low-power fused ...
Nxfee Innovation
 
PDF
Combating data leakage trojans in commercial and asic applications with time ...
Nxfee Innovation
 
PDF
Feedback based low-power soft-error-tolerant design for dual-modular redundancy
Nxfee Innovation
 
A reconfigurable ldpc decoder optimized applications
Nxfee Innovation
 
Design and fpga implementation of a reconfigurable digital down converter for...
Nxfee Innovation
 
Int hhg case_study_0
intelligrated
 
Design of an area efficient million-bit integer multiplier using double modul...
Nxfee Innovation
 
Approximate hybrid high radix encoding for energy efficient inexact multipliers
Nxfee Innovation
 
Vector processing aware advanced clock-gating techniques for low-power fused ...
Nxfee Innovation
 
Combating data leakage trojans in commercial and asic applications with time ...
Nxfee Innovation
 
Feedback based low-power soft-error-tolerant design for dual-modular redundancy
Nxfee Innovation
 

Similar to Analysis and design of cost effective, high-throughput ldpc decoders (20)

PDF
Efficient fpga mapping of pipeline sdf fft cores
Nxfee Innovation
 
PDF
Fast neural network training on fpga using quasi newton optimization method
Nxfee Innovation
 
PDF
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
Nxfee Innovation
 
DOC
Resume
vEERAPANDI M
 
PDF
Secure Transaction Model for NoSQL Database Systems: Review
rahulmonikasharma
 
PDF
Bhups
guest48878e
 
PDF
An energy efficient programmable many core accelerator for personalized biome...
Nxfee Innovation
 
DOCX
Capstone Project
Andrew Johnson
 
PDF
A fast and low complexity operator for the computation of the arctangent of a...
Nxfee Innovation
 
PDF
Fpga based 128 bit customised vliw processor for executing dual scalarvector ...
eSAT Publishing House
 
PDF
A high accuracy programmable pulse generator with a 10-ps timing resolution
Nxfee Innovation
 
PDF
ACCELERATE SAP® APPLICATIONS WITH CDNETWORKS
CDNetworks
 
DOCX
Capstone Project
Jacob McLain
 
PDF
Why Network Functions Virtualization sdn?
idrajeev
 
DOC
Chandru resume2 15-2016 (1)
Chandrasekaran Vasudevan
 
PDF
Multilevel half rate phase detector for clock and data recovery circuits
Nxfee Innovation
 
DOC
Resume
Ajith Jain
 
PDF
8 Reasons To Choose True Scale
seiland
 
PDF
8 Reasons To Choose True Scale
seiland
 
Efficient fpga mapping of pipeline sdf fft cores
Nxfee Innovation
 
Fast neural network training on fpga using quasi newton optimization method
Nxfee Innovation
 
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
Nxfee Innovation
 
Resume
vEERAPANDI M
 
Secure Transaction Model for NoSQL Database Systems: Review
rahulmonikasharma
 
An energy efficient programmable many core accelerator for personalized biome...
Nxfee Innovation
 
Capstone Project
Andrew Johnson
 
A fast and low complexity operator for the computation of the arctangent of a...
Nxfee Innovation
 
Fpga based 128 bit customised vliw processor for executing dual scalarvector ...
eSAT Publishing House
 
A high accuracy programmable pulse generator with a 10-ps timing resolution
Nxfee Innovation
 
ACCELERATE SAP® APPLICATIONS WITH CDNETWORKS
CDNetworks
 
Capstone Project
Jacob McLain
 
Why Network Functions Virtualization sdn?
idrajeev
 
Chandru resume2 15-2016 (1)
Chandrasekaran Vasudevan
 
Multilevel half rate phase detector for clock and data recovery circuits
Nxfee Innovation
 
Resume
Ajith Jain
 
8 Reasons To Choose True Scale
seiland
 
8 Reasons To Choose True Scale
seiland
 
Ad

More from Nxfee Innovation (12)

PDF
VLSI IEEE Transaction 2018 - IEEE Transaction
Nxfee Innovation
 
DOCX
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
Nxfee Innovation
 
DOCX
An efficient fault tolerance design for integer parallel matrix vector
Nxfee Innovation
 
PDF
The implementation of the improved omp for aic reconstruction based on parall...
Nxfee Innovation
 
PDF
Securing the present block cipher against combined side channel analysis and ...
Nxfee Innovation
 
PDF
Low complexity methodology for complex square-root computation
Nxfee Innovation
 
PDF
Approximate sum of-products designs based on distributed arithmetic
Nxfee Innovation
 
PDF
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
Nxfee Innovation
 
PDF
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
Nxfee Innovation
 
PDF
A closed form expression for minimum operating voltage of cmos d flip-flop
Nxfee Innovation
 
PDF
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
Nxfee Innovation
 
PDF
Nxfee Innovation Brochure
Nxfee Innovation
 
VLSI IEEE Transaction 2018 - IEEE Transaction
Nxfee Innovation
 
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
Nxfee Innovation
 
An efficient fault tolerance design for integer parallel matrix vector
Nxfee Innovation
 
The implementation of the improved omp for aic reconstruction based on parall...
Nxfee Innovation
 
Securing the present block cipher against combined side channel analysis and ...
Nxfee Innovation
 
Low complexity methodology for complex square-root computation
Nxfee Innovation
 
Approximate sum of-products designs based on distributed arithmetic
Nxfee Innovation
 
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
Nxfee Innovation
 
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
Nxfee Innovation
 
A closed form expression for minimum operating voltage of cmos d flip-flop
Nxfee Innovation
 
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
Nxfee Innovation
 
Nxfee Innovation Brochure
Nxfee Innovation
 
Ad

Recently uploaded (20)

PDF
Machine Learning All topics Covers In This Single Slides
AmritTiwari19
 
PPTX
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
PPTX
Inventory management chapter in automation and robotics.
atisht0104
 
PPTX
Online Cab Booking and Management System.pptx
diptipaneri80
 
PDF
Zero Carbon Building Performance standard
BassemOsman1
 
PDF
The Effect of Artifact Removal from EEG Signals on the Detection of Epileptic...
Partho Prosad
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PDF
2010_Book_EnvironmentalBioengineering (1).pdf
EmilianoRodriguezTll
 
PPTX
Victory Precisions_Supplier Profile.pptx
victoryprecisions199
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PPTX
quantum computing transition from classical mechanics.pptx
gvlbcy
 
PPTX
Civil Engineering Practices_BY Sh.JP Mishra 23.09.pptx
bineetmishra1990
 
PDF
All chapters of Strength of materials.ppt
girmabiniyam1234
 
PDF
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
DOCX
SAR - EEEfdfdsdasdsdasdasdasdasdasdasdasda.docx
Kanimozhi676285
 
PPTX
Module2 Data Base Design- ER and NF.pptx
gomathisankariv2
 
PDF
Packaging Tips for Stainless Steel Tubes and Pipes
heavymetalsandtubes
 
PPTX
MT Chapter 1.pptx- Magnetic particle testing
ABCAnyBodyCanRelax
 
PDF
LEAP-1B presedntation xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
hatem173148
 
PDF
Chad Ayach - A Versatile Aerospace Professional
Chad Ayach
 
Machine Learning All topics Covers In This Single Slides
AmritTiwari19
 
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
Inventory management chapter in automation and robotics.
atisht0104
 
Online Cab Booking and Management System.pptx
diptipaneri80
 
Zero Carbon Building Performance standard
BassemOsman1
 
The Effect of Artifact Removal from EEG Signals on the Detection of Epileptic...
Partho Prosad
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
2010_Book_EnvironmentalBioengineering (1).pdf
EmilianoRodriguezTll
 
Victory Precisions_Supplier Profile.pptx
victoryprecisions199
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
quantum computing transition from classical mechanics.pptx
gvlbcy
 
Civil Engineering Practices_BY Sh.JP Mishra 23.09.pptx
bineetmishra1990
 
All chapters of Strength of materials.ppt
girmabiniyam1234
 
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
SAR - EEEfdfdsdasdsdasdasdasdasdasdasdasda.docx
Kanimozhi676285
 
Module2 Data Base Design- ER and NF.pptx
gomathisankariv2
 
Packaging Tips for Stainless Steel Tubes and Pipes
heavymetalsandtubes
 
MT Chapter 1.pptx- Magnetic particle testing
ABCAnyBodyCanRelax
 
LEAP-1B presedntation xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
hatem173148
 
Chad Ayach - A Versatile Aerospace Professional
Chad Ayach
 

Analysis and design of cost effective, high-throughput ldpc decoders

  • 1. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : [email protected] _________________________________________________________________ Analysis and Design of Cost-Effective, High-Throughput LDPC Decoders Abstract: This paper introduces a new approach to cost effective, high-throughput hardware designs for low-density parity-check (LDPC) decoders. The proposed approach, called nonsurjective finite alphabet iterative decoders (NS-FAIDs), exploits the robustness of message-passing LDPC decoders to inaccuracies in the calculation of exchanged messages, and it is shown to provide a unified framework for several designs previously proposed in the literature. NS-FAIDs are optimized by density evolution for regular and irregular LDPC codes, and are shown to provide different tradeoffs between hardware complexity and decoding performance. Two hardware architectures targeting high- throughput applications are also proposed, integrating both Min-Sum (MS) and NS-FAID decoding kernels. ASIC post synthesis implementation results on 65-nm CMOS technology show that NS-FAIDs yield significant improvements in the throughput to area ratio, by up to 58.75% with respect to the MS decoder, with even better or only slightly degraded error correction performance. Software Implementation:  Modelsim  Xilinx 14.2 Existing System: The increasing demand of massive data rates in wireless communication systems will require significantly higher processing speed of the baseband signal, as compared with conventional solutions. This is especially challenging for forward error correction (FEC) mechanisms, since FEC decoding is one of the most computationally intensive baseband processing tasks, consuming a large amount of hardware resources and energy. The use
  • 2. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : [email protected] _________________________________________________________________ of very large bandwidths will also result in stringent, application-specific, requirements in terms of both throughput and latency. The conventional approach to increase throughput is to use massively parallel architectures. In this context, low-density parity- check (LDPC) codes are recognized as the foremost solution, due to the intrinsic capacity of their decoders to accommodate various degrees of parallelism. They have found extensive applications in modern communication systems, due to their excellent decoding performance, high-throughput capabilities, and power efficiency, and have been adopted in several recent communication standards. This paper targets the design of cost- effective, high throughput LDPC decoders. One important characteristic of LDPC decoders is that the memory and interconnect blocks dominate the overall area/delay/power performance of the hardware design. To address this issue, we build upon the concept of finite alphabet iterative decoders (FAIDs) introduced. While FAIDs have been previously investigated for variable-node (VN) regular LDPC codes over the binary symmetric channel, this paper extends their use to any channel model and to both regular and irregular LDPC codes. The approach considered in this paper, referred to as nonsurjective finite FAIDs (NS- FAIDs), is to allow storing the exchanged messages using a lower precision (a smaller number of bits) than that used by the processing units. The basic idea is to reduce the size of the exchanged messages, once they have been updated by the processing units. 
Hence, to some extent, the proposed approach is akin to the use of imprecise storage, which is seen as an enabler for cost and throughput optimizations. Moreover, NS-FAIDs are shown to provide a unified framework for several designs previously proposed in the literature, including the normalized and offset Min-Sum (OMS) decoders, the partially OMS (POMS) decoder, the MS-based decoders proposed, or the recently introduced dual-quantization domain MS decoder. This paper refines and extends some of the concepts we previously introduced. In particular, the definition of NS-FAIDs is extended such as to cover a larger class of
  • 3. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : [email protected] _________________________________________________________________ decoders, which is shown to significantly improve the decoding performance in case that the exchanged messages are quantized on a small number of bits (e.g., 2 bits per exchanged message). We show that NS-FAIDs can be optimized by using the density evolution (DE) technique, so as to obtain Disadvantages:  Cost is high  Error correction performance is poor Proposed System: Full-Layer Architecture A different possibility to increase throughput is to increase the hardware parallelism, by including several non overlapping
  • 4. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : [email protected] _________________________________________________________________ Fig. 1. Mapping between VNs and VNUs. Black: VNs of degree 2. Red: VNs of degree 3. Blue: VNs of degree 6 rows of the base matrix in one decoding layer. For instance, for the base matrix, we may consider RPL = 4 consecutive rows per decoding layer, and thus the number of decoding layers is L = 3. In this case, each column of the base matrix has one (and only one) nonzero entry in each decoding layer; such a decoding layer is referred to as being full. Full layers correspond to the maximum hardware parallelism that can be exploited by layered architectures, but they also prevent the pipelining of the data path. Fig. 1 shows Mapping between VNs and VNUs. Black: VNs of degree 2. Red: VNs of degree 3. Blue: VNs of degree 6 One possibility to implement a full-layer decoder is to use a similar architecture to the pipelined one, by removing the registers inserted after the VNU (since pipelining is incompatible with the use of full layers), and updating the control unit. However, in such an architecture, read/write operations from/to the β_memory would occur at the same memory location, corresponding to the current layer being processed . This would require the use of asynchronous dual-port RAM to implement the β_memory, which in general is known to be slower than synchronous dual port RAM. The architecture proposed in this section is aimed at avoiding the use of asynchronous RAM, while providing an effective way to benefit from the increased hardware parallelism enabled by the use of full layers. 
We discuss below the main changes with respect to the pipelined architecture, consisting of the α_memory and the barrel shifters blocks (the other blocks are the same as for the pipelined architecture), as well as a complete reorganization of the data path. However, it can be easily verified that both architectures are logically equivalent, i.e., they both implement the same decoding algorithm. 1) α_Memory: This memory is used to store the VN-messages for the current decoding layer (unlike the previous architecture, the AP-LLR values are not stored in memory). Since only one -bit (unsaturated) VN-message is stored for each VN, this memory has
  • 5. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : [email protected] _________________________________________________________________ exactly the same size as the _memory used within the previous pipelined architecture. VN-messages for the current layer are read from the α_memory, then saturated or framed depending on the decoding kernel, and supplied to the corresponding CNUs. CN- messages computed by the CNUs are stored in the β_memory (location corresponding to layer ), and also forwarded to the AP-LLR unit, through the DCP (decompress) and DE- FRA (deframing) blocks, according to the CNU implementation (compressed or uncompressed) and the decoding Fig. 2. High-level description of the proposed HW architectures, with both MS and NS-FAID kernels. (a) Pipelined architecture. (b) Full-layer architecture kernel (MS of NS-FAID). The AP-LLR unit computes the sum of the incoming VN- and CN-messages, which corresponds to the AP-LLR value to be used at layer + 1 (since already updated by layer ). The AP-LLR value is forwarded to the VNU, through corresponding BS and PER blocks. Eventually, the VN-message for the layer + 1 is computed as the difference between the incoming AP-LLR and the corresponding layer-(
+ 1) CN-message computed at the previous iteration, the latter being read from the β_memory.

2) PER/BS Blocks: The PER_1 / BS_1 blocks permute / shift the data read from the input buffer, according to the positions / values of the nonnegative entries in the first decoding layer. Similar to the BS_R blocks in the pipelined architecture, the PER_WR / BS_WR blocks permute / shift the AP-LLR values according to the difference between the positions / values of the nonnegative entries of the current layer (ℓ) and those of the next layer (ℓ + 1). This way, VN-messages stored in the α_memory are already permuted and shifted for the subsequent decoding layer. Finally, the PER_L / BS_L blocks permute / shift the hard-decision bits (signs of the AP-LLR values), according to the positions / values of the nonnegative entries in the last decoding layer.

Advantages:
 Cost effective
 Good error correction performance

References:
[1] M. Karkooti and J. R. Cavallaro, “Semi-parallel reconfigurable architectures for real-time LDPC decoding,” in Proc. Int. Conf. Inf. Technol., Coding Comput. (ITCC), vol. 1. Apr. 2004, pp. 579–585.
[2] X. Chen, J. Kang, S. Lin, and V. Akella, “Memory system optimization for FPGA-based implementation of quasi-cyclic LDPC codes decoders,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 58, no. 1, pp. 98–111, Jan. 2011.
[3] V. A. Chandrasetty and S. M. Aziz, “Resource efficient LDPC decoders for multimedia communication,” Integr., VLSI J., vol. 48, pp. 213–220, Jan. 2015.
[4] K. Zhang, X. Huang, and Z. Wang, “High-throughput layered decoder implementation for quasi-cyclic LDPC codes,” IEEE J. Sel. Areas Commun., vol. 27, no. 6, pp. 985–994, Aug. 2009.
[5] X. Peng, Z. Chen, X. Zhao, D. Zhou, and S. Goto, “A 115 mW 1 Gbps QC-LDPC decoder ASIC for WiMAX in 65 nm CMOS,” in Proc. IEEE Asian Solid State Circuits Conf. (A-SSCC), Nov. 2011, pp. 317–320.
[6] B. Xiang, D. Bao, S. Huang, and X. Zeng, “An 847–955 Mb/s 342–397 mW dual-path fully-overlapped QC-LDPC decoder for WiMAX system in 0.13 μm CMOS,” IEEE J. Solid-State Circuits, vol. 46, no. 6, pp. 1416–1432, Jun. 2011.
[7] E. Boutillon and G. Masera, “Hardware design and realization for iteratively decodable codes,” in Channel Coding: Theory, Algorithms, and Applications. Amsterdam, The Netherlands: Elsevier, 2014, pp. 583–642.
[8] S. K. Planjery, S. K. Chilappagari, B. Vasić, D. Declercq, and L. Danjean, “Iterative decoding beyond belief propagation,” in Proc. IEEE Inf. Theory Appl. Workshop (ITA), Jan. 2010, pp. 1–10.
[9] S. K. Planjery, D. Declercq, L. Danjean, and B. Vasić, “Finite alphabet iterative decoders for LDPC codes surpassing floating-point iterative decoders,” Electron. Lett., vol. 47, no. 16, pp. 919–921, 2011.
[10] S. K. Planjery, D. Declercq, L. Danjean, and B. Vasić, “Finite alphabet iterative decoders—Part I: Decoding beyond belief propagation on the binary symmetric channel,” IEEE Trans. Commun., vol. 61, no. 10, pp. 4033–4045, Oct. 2013.
[11] J. Chen, A. Dholakia, E. Eleftheriou, M. P. C. Fossorier, and X.-Y. Hu, “Reduced-complexity decoding of LDPC codes,” IEEE Trans. Commun., vol. 53, no. 8, pp. 1288–1299, Aug. 2005.
[12] V. Savin, “LDPC decoders,” in Channel Coding: Theory, Algorithms, and Applications. Amsterdam, The Netherlands: Elsevier, 2014, pp. 211–260.
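
Illustrative sketch: The full-layer data path described in the α_Memory and PER/BS Blocks paragraphs (AP-LLR as the sum of VN- and CN-messages, a shift by the difference between the current and next layer's shift values, then the next VN-message as a difference) can be modeled in a few lines of Python. This is only a minimal behavioral sketch under stated assumptions, not the paper's hardware design: it handles a single block column, uses plain integers without saturation or framing, omits the PER permutation, and assumes a left-rotation convention for the barrel shifter. The function and parameter names (rotate, full_layer_step, shift_cur, shift_next) are hypothetical.

```python
def rotate(values, shift):
    """Model of a barrel shifter: cyclic left rotation by `shift` positions.

    The left-rotation convention is an assumption made for this sketch.
    """
    z = len(values)
    s = shift % z  # also handles negative shift differences
    return values[s:] + values[:s]


def full_layer_step(alpha, beta_new, beta_next_old, shift_cur, shift_next):
    """One full-layer update for a single block column (hypothetical helper).

    alpha         : VN-messages of the current layer, aligned to that layer
    beta_new      : CN-messages just computed for the current layer
    beta_next_old : CN-messages of the next layer, from the previous iteration
    shift_cur     : shift value of the current layer's nonnegative entry
    shift_next    : shift value of the next layer's nonnegative entry
    """
    # AP-LLR unit: sum of the incoming VN- and CN-messages.
    ap_llr = [a + b for a, b in zip(alpha, beta_new)]
    # BS_WR-like step: shift by the difference of shift values, so the data
    # is already aligned for the next layer.
    aligned = rotate(ap_llr, shift_next - shift_cur)
    # VNU: VN-message for the next layer is AP-LLR minus the next layer's
    # CN-message computed at the previous iteration.
    alpha_next = [g - b for g, b in zip(aligned, beta_next_old)]
    return aligned, alpha_next


# Shifting by the difference composes correctly: data aligned to the current
# layer and then shifted by (next - current) is aligned to the next layer.
natural_order = [10, 20, 30, 40, 50]
via_difference = rotate(rotate(natural_order, 1), 3 - 1)
direct = rotate(natural_order, 3)
```

The last three lines show why storing already-shifted data is sufficient: applying the difference shift to data aligned for the current layer gives the same result as shifting the unshifted data directly by the next layer's value, so no return to natural order is needed between layers.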