NXFEE INNOVATION
(SEMICONDUCTOR IP & PRODUCT DEVELOPMENT)
(ISO 9001:2015 Certified Company)
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry – 605004, India.
Buy projects online: www.nxfee.com | contact: +91 9789443203 |
email: nxfee.innovation@gmail.com
_________________________________________________________________
A Fast and Low-Complexity Operator for the Computation of the
Arctangent of a Complex Number
Abstract:
The computation of the arctangent of a complex number, i.e., the atan2 function, is
frequently needed in hardware systems that could profit from an optimized operator. In
this brief, we present a novel method to compute the atan2 function and a hardware
architecture for its implementation. The method is based on a first stage that performs a
coarse approximation of the atan2 function and a second stage that improves the output
accuracy by means of a lookup table. We present results for fixed-point implementations
in a field-programmable gate array device, all of them guaranteeing last-bit accuracy,
which provide an advantage in latency, speed, and use of resources, when compared with
well-established fixed-point options.
Software Implementation:
• ModelSim
• Xilinx 14.2
Existing System:
The computation of the arctangent function atan2(a, b) (see Fig. 1), i.e., obtaining the
angle of a complex number c = b + ja, has been the subject of extensive study, because
this computation is required in many applications. In hardware approximations for
atan2(a, b), there is often a tradeoff between the use of resources and the computation
speed and/or latency. For example, the fastest option for the computation of any function
may always be a direct implementation with a lookup table (LUT). However, since
atan2(a, b) is a function of two input variables, each additional bit of input precision
increases the amount of memory needed by a factor of four. On
the other hand, iterative algorithms, such as the COordinate Rotation DIgital Computer
(CORDIC), can be implemented with minimal use of resources, but at the cost of a low
processing speed. If more parallelism is introduced in the implementation, much higher
throughput can be achieved, but the use of resources and the computational delay
increase.
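The factor-of-four growth of a direct two-input LUT can be checked with a few lines of arithmetic. The sketch below assumes that both inputs have the same word length and that the table is addressed by their concatenation; the function name and the choice of output width are illustrative only.

```python
def direct_lut_size(input_bits, output_bits):
    """Return (entries, total_bits) for a LUT addressed by two operands.

    Assumption: both operands are input_bits wide, so the address
    word has 2 * input_bits bits.
    """
    entries = 1 << (2 * input_bits)
    return entries, entries * output_bits


if __name__ == "__main__":
    for w in (8, 9, 12, 16):
        entries, bits = direct_lut_size(w, w)
        print(f"w = {w:2d}: {entries} entries, {bits / 8 / 1024:.1f} KiB")
```

Going from 8-bit to 9-bit inputs adds two address bits in total, which is where the fourfold increase comes from.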
Fig. 1. (a) 3-D plot of atan2(a, b)/2π for the first octant. (b) Simplified scheme of the
proposed approximation for atan2(a, b)/2π.
Smaller LUTs can be achieved with the recip-mult-atan method (RMAM). First, z =
a/b is calculated by computing 1/b with an LUT (thus avoiding the large LUT needed for
a two-variable function) and multiplying the result by a. Then, the one-variable
function atan(z) is computed using another LUT. Another option is to use high-order
algebraic polynomials, such as Chebyshev polynomials or Taylor series. These methods offer
good precision, but, since the arctangent is highly nonlinear, they lead to long
polynomials and intensive computations. In other cases, approximations based on rational
functions are used, as they may provide good enough results with a few elementary
operations. As a general rule, the division operation is the main contributor to the
computational cost of this kind of approximation, and in addition to that division, they
usually require one or more multiplication operations.
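The recip-mult-atan method described at the start of this discussion can be sketched in software as two one-variable tables and one multiplication. This is a floating-point illustration only, under the assumptions that 0 <= a <= b and that b has already been normalized to [0.5, 1); the table sizes and function names are hypothetical and are not taken from any of the cited implementations.

```python
import math


def build_rmam_tables(addr_bits):
    """Build a reciprocal table for b in [0.5, 1) and an atan table for z in [0, 1)."""
    n = 1 << addr_bits
    # Each entry is evaluated at the midpoint of its address interval.
    recip_lut = [1.0 / (0.5 + (i + 0.5) / (2 * n)) for i in range(n)]
    atan_lut = [math.atan((i + 0.5) / n) for i in range(n)]
    return recip_lut, atan_lut


def rmam_atan(a, b, recip_lut, atan_lut):
    """Approximate atan(a / b) for 0 <= a <= b with b in [0.5, 1)."""
    n = len(recip_lut)
    i = min(int((b - 0.5) * 2 * n), n - 1)   # address of the 1/b table
    z = a * recip_lut[i]                     # multiplier completes the division
    j = min(int(z * n), n - 1)               # address of the atan table
    return atan_lut[j]
```

Both tables have 2^addr_bits entries, so their combined size grows by a factor of two per extra address bit rather than four.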
The architecture we propose is essentially a two-stage method, as shown in Fig. 1(b).
The First Stage uses a low-complexity coarse approximation for the two-input
atan2(a, b)/2π. The Second Stage improves the accuracy by means of a small LUT that
stores precomputed error values, as a function of the output of the First Stage. This table
does not depend on the two inputs a and b of the atan2 operator and is, therefore,
comparatively much smaller. As will be shown, the resulting operator is small and can
compute the arctangent faster than other popular options, for the same output accuracy.
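The two-stage idea can be illustrated with a small floating-point model. The coarse function used here, (a/b)/8, is only an assumed stand-in chosen because it is exact at both ends of the first octant; it is not claimed to be the coarse approximation used in the proposed operator. The point of the sketch is that the correction table is addressed by the First Stage output alone, not by a and b.

```python
import math

TWO_PI = 2.0 * math.pi


def build_error_lut(addr_bits):
    """Tabulate atan(z)/(2*pi) - z/8 at the midpoint of each coarse-output bin."""
    n = 1 << addr_bits
    lut = []
    for i in range(n):
        z = (i + 0.5) / n            # ratio a/b at the bin midpoint
        coarse = z / 8.0             # assumed coarse approximation
        lut.append(math.atan(z) / TWO_PI - coarse)
    return lut


def atan2_turns_first_octant(a, b, lut):
    """Approximate atan2(a, b)/(2*pi) for 0 <= a <= b and b > 0."""
    coarse = (a / b) / 8.0           # First Stage: coarse value in [0, 1/8]
    idx = min(int(coarse * 8.0 * len(lut)), len(lut) - 1)
    return coarse + lut[idx]         # Second Stage: add the stored error
```

With a 256-entry table, the residual error of this model stays far below that of the coarse stage alone, which illustrates why a one-input correction table can be much smaller than a direct two-input table.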
Disadvantages:
• Higher latency
• Lower speed and efficiency
Proposed System:
Hardware Architecture
Fig. 2(a) shows the scheme of the proposed hardware architecture. It works with the
absolute values of the inputs a and b, and their signs are used later in the scheme. The
“sel” signal selects the operands for the division operation according to the signs of a + b
and a − b. The division required for the computation of |fr| is implemented with a table
that stores 1/x and a multiplier that completes the division. A scaling stage is added in
order to reduce the size of the reciprocal table. The most relevant implementation details
are commented upon in Section IV-A–E. Specifically, we give details for three
accuracies: w = {8, 12, 16} bit.
Fig. 2. (a) Implementation scheme for the proposed approximation for the atan2(a, b)/2π function. (b)
Simplified model for error analysis.
Datapath Dimensioning
Fig. 2(a) shows the binary formats used in the different buses of the system, where s(q, t) and
u(q, t) denote signed and unsigned fixed-point formats, respectively; 2^q is the weight of
the most significant bit (MSB) and 2^t is the weight of the least significant bit (LSB).
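To make the s(q, t) and u(q, t) notation concrete, the following sketch quantizes a real value to such a format. It assumes a word length of q − t + 1 bits, truncation toward minus infinity, saturation at the range limits, and, for the signed case, two's complement with a sign bit of weight −2^q. These conventions and the function name are assumptions made for illustration.

```python
import math


def quantize(value, q, t, signed=False):
    """Quantize value to u(q, t) or s(q, t); return (integer_code, real_value)."""
    step = 2.0 ** t                       # weight of the LSB
    width = q - t + 1                     # assumed word length in bits
    code = math.floor(value / step)       # truncation toward -infinity
    if signed:
        lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    else:
        lo, hi = 0, (1 << width) - 1
    code = max(lo, min(hi, code))         # saturate instead of wrapping
    return code, code * step
```

For example, `quantize(0.3, -1, -8)` maps 0.3 onto an 8-bit unsigned word whose LSB weighs 2^-8.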
Note that truncating the output of the multiplier does not introduce an additional error
unless the truncated bits are used in the following processing steps. In a more realistic
scenario, last-bit accuracy (LBA) can still be achieved with smaller LUTs that either do not
use all the available bits as their address word or are implemented as bipartite tables.
Although in these cases the tables introduce bigger errors (as explained in the following),
LBA could still be achieved, since the estimated errors represent an upper bound that might
not be reached. For this reason, we performed exhaustive tests for different LUT sizes,
looking for optimized implementations. Fig. 3 is an example of the error pattern obtained
in one of the w = 12 operators.
Fig. 3. Absolute error for one of our w = 12 atan2(a, b)/2π operators.
Range Reduction
Obtaining |fr| requires the computation of y/x [see Fig. 2(a)], which involves computing 1/x
with an LUT. Since 1/x can be extremely large for small values of x, a scaling operation is
performed on |a| and |b|. This block detects the leading zeros in both |a| and |b| and scales
both by the same factor (2^s), so that the MSB of x, the larger of the two outputs, is always 1.
Thanks to this block, x is always in the [0.5, 1) range and the biggest possible value of 1/x
is 2.
Computation of the Reciprocal
For the computation of 1/x, two different strategies, both table based, are used: direct
tabulation for the w = 8 bit operator and bipartite tables for w = {12, 16} bits. In both
cases, all the bits from the input word are used. Therefore, for the direct tabulation the
only errors are those created by rounding the words stored in the table, and for the
bipartite tables, the maximum absolute value of the error can be estimated from the second
derivative of the stored function and also from the rounding errors. The value stored in the
first address of this table should be 2, but 2 − 2^(−n+1) is stored for a table with n-bit
words, so that the MSB of all the stored words is the same and therefore does not need to be
stored. The size of this table is 64 × 6 for w = 8, 128 × 14 + 128 × 7 for w = 12, and
1k × 18 + 512 × 8 for w = 16 bits.
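The range reduction and the directly tabulated reciprocal described above can be modelled with integer arithmetic. In this sketch the number of magnitude bits (7), the address width (6), and the word length (n = 7) are hypothetical values, chosen so that the stored table comes out as 64 × 6 bits, the size quoted for w = 8; the formats of the actual operator may differ, and the bipartite variant is not modelled.

```python
def normalize(a_abs, b_abs, width):
    """Scale both magnitudes by the same 2**s so the larger one has its MSB set.

    Returns (y, x, s) with y <= x and x in [2**(width-1), 2**width).
    """
    x, y = max(a_abs, b_abs), min(a_abs, b_abs)
    if x == 0:
        raise ValueError("atan2(0, 0) is undefined")
    s = width - x.bit_length()            # leading zeros of the larger operand
    return y << s, x << s, s


def build_reciprocal_table(addr_bits, n):
    """Tabulate 1/x for x in [0.5, 1) using n-bit words with LSB weight 2**-(n-1)."""
    scale = 1 << (n - 1)
    max_code = (1 << n) - 1               # represents 2 - 2**(-n+1)
    stored = []
    for i in range(1 << addr_bits):
        x = 0.5 + i / (1 << (addr_bits + 1))
        code = min(round(scale / x), max_code)   # first entry saturates below 2
        stored.append(code - scale)       # the MSB is always 1, so it is dropped
    return stored


def approx_ratio(a_abs, b_abs, width, stored, addr_bits, n):
    """Approximate min/max of the two magnitudes with a table lookup and a multiply."""
    y, x, _ = normalize(a_abs, b_abs, width)
    addr = (x - (1 << (width - 1))) >> (width - 1 - addr_bits)
    code = stored[addr] + (1 << (n - 1))  # restore the implicit MSB
    return (y * code) / float(1 << (width + n - 1))
```

Because both operands are shifted by the same amount, the ratio y/x is unchanged by the scaling, and the table only has to cover x in [0.5, 1).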
Advantages:
• Lower latency
• Higher speed and efficiency
