
Table 3 Comparison to FPGA and GPU implementations processing the MNIST and DIBCO 2017 datasets.

From: Efficient Hardware Architectures for 1D- and MD-LSTM Networks

 

| Dataset | Name | Platform | Model [bits] | Params [M] | Ops. (a) [M] | Acc. [%] | Freq. [MHz] | θ [GOp/s] | kFPS (b) | \(P_{chip}\) [W] | \(P_{board}\) [W] | \(\frac{kFPS}{P_{chip}}\) | \(\frac{kFPS}{P_{board}}\) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MNIST | FINN [53] | ZC706 | 1 | 3.34E-1 | 6.69E-1 | 95.83 | 200 | 8265.45 | 12361 | 7.30 | 21.2 | 1.69E+3 | 5.83E+2 |
| MNIST | FINN [53] | ZC706 | 1 | 2.91E+0 | 5.82E+0 | 98.40 | 200 | 9085.67 | 1561 | 8.80 | 22.6 | 1.77E+2 | 6.91E+1 |
| MNIST | BNN [33] | Stratix-V 5SGSD8 | 1 | 1.00E+1 | 2.00E+1 | 98.32 | 150 | 12219.40 | 610.36 | - | 26.2 | - | 2.33E+1 |
| MNIST | TNN [4] | Kintex-7 160T | 2 | 1.99E-1 | 3.97E-1 | 97.76 | 200 | 101.28 | 255.102 | 0.32 | - | 7.97E+2 | - |
| MNIST | TNN [4] | Kintex-7 160T | 2 | 1.72E+0 | 3.44E+0 | 98.33 | 200 | 877.81 | 255.102 | 2.86 | - | 8.92E+1 | - |
| MNIST | [42] | ZC706 | 3 | 2.90E+0 | 5.80E+0 | 98.92 | 172 | 384.16 | 66.255 | 4.98 | 11.4 | 1.41E+1 | 4.98E+0 |
| MNIST | This work | ZCU102 | 1 | 6.39E-1 | 2.77E+1 | 99.37 | 300 | 8710.28 | 314.82 | 13.20 | 39.3 | 2.39E+1 | 8.01E+0 |
| MNIST | This work (D) | Tesla K80 | 32F | 6.39E-1 | 2.77E+1 | 99.46 | - | 239.83 | 8.66 | 273.85 | - | 3.16E-2 | - |
| MNIST | This work (P) | Tesla K80 | 35F | 6.39E-1 | 2.77E+1 | 99.46 | - | 103.33 | 3.73 | 193.21 | - | 1.93E-2 | - |
| DIBCO | This work | ZCU102 | 4/8 | 6.75E-2 | 1.35E-1 | 87.54 | 240 | 3618.23 | 6.53 | 15.47 | 43.6 | 4.22E-1 | 1.50E-1 |
| DIBCO | This work (D) | Tesla K80 | 32F | 6.75E-2 | 1.35E-1 | 88.00 | - | 319.27 | 0.58 | 247.47 | - | 2.34E-3 | - |
| DIBCO | This work (P) | Tesla K80 | 35F | 6.75E-2 | 1.35E-1 | 88.00 | - | 101.91 | 0.18 | 183.11 | - | 9.83E-4 | - |

  a. Number of operations per 28×28 image for MNIST, and number of operations per pixel for DIBCO.
  b. Frames are 28×28 images in the case of MNIST, and 64×64 patches in the case of DIBCO.
  D. Diagonal-wise order of execution.
  P. Pixel-by-pixel order of execution.
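
The derived columns of Table 3 can be recomputed from the raw ones, which is a useful sanity check when comparing rows. The sketch below is illustrative only: the function names `efficiency` and `throughput_gops` are hypothetical, and the relation θ ≈ Ops × FPS is an assumption that appears consistent with the "This work" rows rather than a formula stated in the table. The efficiency columns follow directly from the column headers (kFPS divided by chip or board power).

```python
from typing import Optional


def efficiency(kfps: float, power_w: Optional[float]) -> Optional[float]:
    """Return kFPS per watt, or None when the power figure is not reported ('-')."""
    if power_w is None:
        return None
    return kfps / power_w


def throughput_gops(ops_m: float, kfps: float, units_per_frame: int = 1) -> float:
    """Assumed relation: throughput = operations per frame * frames per second.

    ops_m is in millions of operations per unit (per image for MNIST,
    per pixel for DIBCO), kfps is in thousands of frames per second,
    so the product (1e6 * 1e3 = 1e9) is already in GOp/s.
    """
    return ops_m * units_per_frame * kfps


# Values copied from the "This work" ZCU102 rows of Table 3.
mnist_eff_board = efficiency(314.82, 39.3)           # kFPS / P_board
mnist_theta = throughput_gops(27.7, 314.82)          # ops counted per 28x28 image
dibco_theta = throughput_gops(0.135, 6.53, 64 * 64)  # ops counted per pixel, 64x64 patch

print(f"MNIST kFPS/P_board: {mnist_eff_board:.2f}")
print(f"MNIST throughput:   {mnist_theta:.0f} GOp/s")
print(f"DIBCO throughput:   {dibco_theta:.0f} GOp/s")
```

Under these assumptions the recomputed values agree with the tabulated ones to within about 1%, with the small gap attributable to rounding of the published inputs. Rows with a missing power figure simply yield `None` for the corresponding efficiency.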