RISC-V IN SPACE
14.12.2022
pablo.ghiglino@klepsydra.com
www.klepsydra.com
Klepsydra Technologies
Part 1
Pre-existing work
COMPARE AND SWAP
• Compare-and-swap (CAS) is an instruction used in multithreading to achieve synchronisation. It compares the contents of a memory location with a given value and, only if they are the same, modifies the contents of that memory location to a new given value. This is done as a single atomic operation.
• Compare-and-swap has been an integral part of the IBM 370 architecture since 1970.
• Maurice Herlihy (1991) proved that CAS can implement more lock-free and wait-free algorithms than atomic read, write, and fetch-and-add.
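The compare-and-swap behaviour described above can be sketched with the C++ standard library. This is only an illustration using `std::atomic`, not Klepsydra code; the helper names `casIncrement`, `casStore` and `casDemo` are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Atomically adds 1 to counter with a CAS retry loop and returns the value
// the counter held immediately before the increment.
inline int casIncrement(std::atomic<int>& counter) {
    int expected = counter.load();
    // The write happens only if counter still equals expected. On failure,
    // expected is refreshed with the current value and the loop retries.
    while (!counter.compare_exchange_weak(expected, expected + 1)) {
    }
    return expected;
}

// Stores newValue only if slot still holds expectedValue; reports success.
inline bool casStore(std::atomic<int>& slot, int expectedValue, int newValue) {
    return slot.compare_exchange_strong(expectedValue, newValue);
}

// Usage illustration: a matching CAS succeeds, a stale one fails.
inline void casDemo() {
    std::atomic<int> slot{7};
    assert(casStore(slot, 7, 9));     // slot held 7, so it becomes 9
    assert(!casStore(slot, 7, 11));   // slot holds 9, so nothing changes
    assert(slot.load() == 9);
    assert(casIncrement(slot) == 9);  // returns the previous value
    assert(slot.load() == 10);
}
```

The retry loop is the usual building block of lock-free code: a thread that loses the race does not block, it simply re-reads the value and tries again.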
Two main data processing approaches
[Diagram: Event Loop (Producer 1, Producer 2 and Producer 3 feeding a single Consumer) and Sensor Multiplexer (Producer 1 feeding Consumer 1 and Consumer 2)]
THE PRODUCT
Lightweight, modular and compatible with the most widely used operating systems
Worldwide application
• Klepsydra SDK (Software Development Kit): boosts data processing at the edge for general applications and processor-intensive algorithms
• Klepsydra AI (Artificial Intelligence): high-performance deep neural network (DNN) engine to deploy any AI or machine learning module at the edge
• Klepsydra ROS2 executor plugin: executor for ROS2 able to process up to 10 times more data with up to 50% reduction in CPU consumption
• Klepsydra GPU Streaming (GPU: Graphics Processing Unit): high parallelisation of the GPU to increase the processing data rate and GPU utilisation
LOCK-FREE AS ALTERNATIVE TO PARALLELISATION
[Diagram: Parallelisation compared with Pipeline]
2-DIM THREADING MODEL
First dimension: pipelining
[Diagram: Deep Neural Network Structure. Input Data passes through a chain of Layers to Output Data; consecutive groups of layers are assigned to Thread 1 (Core 1) and Thread 2 (Core 2)]
2-DIM THREADING MODEL
Second dimension: Matrix multiplication parallelisation
[Diagram: between Input Data and Output Data, a single Layer is split across Thread 1 (Core 1), Thread 2 (Core 2) and Thread 3 (Core 3)]
2-DIM THREADING MODEL
Threading model configuration
[Diagram: three example distributions of layers over Core 1, Core 2, Core 3 and Core 4]
First configuration:
• Low CPU
• High throughput CPU
• High latency
Second configuration:
• Mid CPU
• Mid throughput CPU
• Mid latency
Third configuration:
• High CPU
• Mid throughput CPU
• Low latency
ONNX API
class KPSR_API OnnxDNNImporter
{
public:
    /**
     * @brief Imports an ONNX file and uses a default event loop factory for all processor cores.
     * @param onnxFileName
     * @param testDNN
     * @return a shared pointer to a DeepNeuralNetworkFactory object
     *
     * When the log level is debug, dumps the YAML configuration of the default factory.
     * It makes use of all processor cores.
     */
    static std::shared_ptr<kpsr::ai::DeepNeuralNetworkFactory> createDNNFactory(const std::string & onnxFileName,
                                                                                bool testDNN = false);
    /**
     * @brief Imports an ONNX file for testing and uses a default synchronous factory.
     * @param onnxFileName
     * @param envFileName. Klepsydra AI configuration environment file.
     * @return a shared pointer to a DeepNeuralNetworkFactory object
     *
     * This method is intended to be used for testing purposes only.
     */
    static std::shared_ptr<kpsr::ai::DeepNeuralNetworkFactory> createDNNFactory(const std::string & onnxFileName,
                                                                                const std::string & envFileName);
};
Core API
class DeepNeuralNetwork {
public:
/**
* @brief setCallback
* @param callback. Callback function for the prediction result.
*/
virtual void setCallback(std::function<void(const unsigned long &, const kpsr::ai::F32AlignedVector &)> callback) = 0;
/**
* @brief predict. Load input matrix as input to network.
* @param inputVector. An F32AlignedVector of floats containing network input.
*
* @return Unique id corresponding to the input vector
*/
virtual unsigned long predict(const kpsr::ai::F32AlignedVector& inputVector) = 0;
/**
* @brief predict. Copy-less version of predict.
* @param inputVector. An F32AlignedVector of floats containing network input.
*
* @return Unique id corresponding to the input vector
*/
virtual unsigned long predict(const std::shared_ptr<kpsr::ai::F32AlignedVector> & inputVector) = 0;
};
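The callback-plus-identifier pattern of this interface can be illustrated with a small stand-in. The sketch below is not Klepsydra code: `F32AlignedVector` is assumed to behave like a `std::vector<float>`, and `DoublingNetwork`, `Prediction` and `runPredictions` are hypothetical names introduced only to show how each `predict` call returns an id that later arrives in the callback with the result.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

namespace kpsr { namespace ai {
// Assumption: stand-in for the library's aligned float vector type.
using F32AlignedVector = std::vector<float>;

// Interface as shown on the slide (virtual destructor added for safety).
class DeepNeuralNetwork {
public:
    virtual ~DeepNeuralNetwork() = default;
    virtual void setCallback(std::function<void(const unsigned long &, const F32AlignedVector &)> callback) = 0;
    virtual unsigned long predict(const F32AlignedVector & inputVector) = 0;
    virtual unsigned long predict(const std::shared_ptr<F32AlignedVector> & inputVector) = 0;
};
}}  // namespace kpsr::ai

using kpsr::ai::F32AlignedVector;

// Hypothetical toy network: doubles every input value and invokes the
// callback synchronously. A real engine would typically answer later.
class DoublingNetwork : public kpsr::ai::DeepNeuralNetwork {
public:
    void setCallback(std::function<void(const unsigned long &, const F32AlignedVector &)> callback) override {
        _callback = std::move(callback);
    }
    unsigned long predict(const F32AlignedVector & inputVector) override {
        const unsigned long id = _nextId++;
        F32AlignedVector output;
        output.reserve(inputVector.size());
        for (float value : inputVector) output.push_back(2.0f * value);
        if (_callback) _callback(id, output);
        return id;
    }
    unsigned long predict(const std::shared_ptr<F32AlignedVector> & inputVector) override {
        assert(inputVector != nullptr);
        return predict(*inputVector);
    }
private:
    std::function<void(const unsigned long &, const F32AlignedVector &)> _callback;
    unsigned long _nextId = 0;
};

struct Prediction { unsigned long id; F32AlignedVector output; };

// Registers a callback, submits every input and collects results by id.
inline std::vector<Prediction> runPredictions(kpsr::ai::DeepNeuralNetwork & network,
                                              const std::vector<F32AlignedVector> & inputs) {
    std::vector<Prediction> results;
    network.setCallback([&results](const unsigned long & id, const F32AlignedVector & output) {
        results.push_back({id, output});
    });
    for (const auto & input : inputs) network.predict(input);
    network.setCallback({});  // drop the callback before results goes out of scope
    return results;
}
```

With an asynchronous engine the caller would keep the ids returned by `predict` and match them against the ids delivered to the callback, since results may arrive after `predict` has returned.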
KLEPSYDRA SDO PROCESS
• Klepsydra Streaming Distribution Optimiser (SDO):
• Runs on a separate computer
• Executes several dry runs on the OBC
• Collects statistics
• Runs a genetic algorithm to find the optimal solution for latency, power or throughput
• The main variable to optimise is the distribution of layers across the two dimensions of the threading model
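The genetic search in the steps above can be sketched as follows. This is a simplified illustration, not the SDO itself: it optimises only the assignment of layers to cores, and the cost function `bottleneckCost` (load of the busiest core) is a hypothetical placeholder for the statistics that the SDO collects from dry runs on the OBC. All names and parameters are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One candidate solution: genes[i] is the core that layer i is assigned to.
using Assignment = std::vector<int>;

// Hypothetical cost model: the load of the busiest core. The real optimiser
// would use measurements from dry runs instead.
inline double bottleneckCost(const Assignment& genes, const std::vector<double>& layerCost, int numCores) {
    std::vector<double> load(numCores, 0.0);
    for (std::size_t i = 0; i < genes.size(); ++i) load[genes[i]] += layerCost[i];
    return *std::max_element(load.begin(), load.end());
}

inline Assignment optimiseAssignment(const std::vector<double>& layerCost, int numCores,
                                     int populationSize = 20, int generations = 50,
                                     unsigned seed = 42) {
    assert(!layerCost.empty() && numCores > 0 && populationSize >= 2);
    std::mt19937 rng(seed);
    const int survivors = populationSize / 2;
    std::uniform_int_distribution<int> anyCore(0, numCores - 1);
    std::uniform_int_distribution<std::size_t> anyLayer(0, layerCost.size() - 1);
    std::uniform_int_distribution<int> anyParent(0, survivors - 1);
    auto byCost = [&](const Assignment& a, const Assignment& b) {
        return bottleneckCost(a, layerCost, numCores) < bottleneckCost(b, layerCost, numCores);
    };

    // Initial population: the single-core baseline plus random assignments.
    std::vector<Assignment> population(populationSize, Assignment(layerCost.size(), 0));
    for (int p = 1; p < populationSize; ++p)
        for (int& gene : population[p]) gene = anyCore(rng);

    for (int g = 0; g < generations; ++g) {
        // Selection: the best half survives unchanged (elitism).
        std::sort(population.begin(), population.end(), byCost);
        for (int p = survivors; p < populationSize; ++p) {
            const Assignment& mother = population[anyParent(rng)];
            const Assignment& father = population[anyParent(rng)];
            // Single-point crossover followed by a one-gene mutation.
            const auto cut = static_cast<std::ptrdiff_t>(anyLayer(rng));
            Assignment child(mother.begin(), mother.begin() + cut);
            child.insert(child.end(), father.begin() + cut, father.end());
            child[anyLayer(rng)] = anyCore(rng);
            population[p] = child;
        }
    }
    return *std::min_element(population.begin(), population.end(), byCost);
}
```

Because the best half of each generation is kept, the returned assignment is never worse than the single-core baseline placed in the initial population. A fuller model would add a second set of genes for the number of threads used inside each layer.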
KLEPSYDRA STREAMING DISTRIBUTION OPTIMISER (SDO)
Part 2
The KATESU Project
QORIQ® LAYERSCAPE LS1046A MULTICORE PROCESSOR
[Diagram: Klepsydra AI Container running on the QorIQ® Layerscape LS1046A]
STATUS
• Successful installation of the following setup:
• LS1046 running Yocto Jethro
• Docker Installed on LS1046
• Container with the following:
• Ubuntu 20.04
• Klepsydra AI software fully supported (quantised and non-quantised)
XILINX ZEDBOARD
[Diagram: Klepsydra AI Container running on PetaLinux on the ZedBoard]
STATUS
• Successful installation of the following setup:
• ZedBoard running PetaLinux 2019.2
• Docker Installed on ZedBoard
• Container with the following:
• Ubuntu 20.04
• Klepsydra AI software with quantised support only
PERFORMANCE RESULTS: CME ON LS1046
[Bar charts comparing TFLite + NEON with Klepsydra on three metrics: CPU / Hz, Latency (ms) and Throughput (Hz)]
PERFORMANCE RESULTS: CME-Q ON LS1046
[Bar charts comparing TFLite + NEON with Klepsydra on three metrics: CPU / Hz, Latency (ms) and Throughput (Hz)]
PERFORMANCE RESULTS: CME-Q ON ZEDBOARD
[Bar charts comparing TFLite + NEON with Klepsydra on three metrics: CPU / Hz, Latency (ms) and Throughput (Hz)]
PERFORMANCE RESULTS: BSC ON LS1046
[Bar charts comparing TFLite + NEON with Klepsydra on three metrics: CPU / Hz, Latency (ms) and Throughput (Hz)]
Part 3
The PATTERN Project
THE PATTERN PROJECT
PATTERN: Klepsydra AI ported to the GR740 aNd RISC-V
• Target Processor: GR740, GR765 (Leon5 & Noel-V)
• Target OS: RTEMS5
• Development on commercial FPGA board
• Validation on space-qualified hardware
THE MULTI-THREADING API
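[Diagram: two software stacks. First stack: Klepsydra AI on the Klepsydra SDK multi-threading framework, on PTHREAD, on a POSIX operating system. Second stack: Klepsydra AI on the Klepsydra SDK, whose Threading Abstraction Layer offers a POSIX multi-threading framework (PTHREAD, POSIX operating system) and an RTEMS5 multi-threading framework (RTEMS)]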
THE PARALLELISATION FRAMEWORK
[Diagram: two software stacks. In both, the Klepsydra AI back-ends, Full-backend (Float32, Int8) and Quantized-backend (Int8), sit on the Parallelisation Framework. First stack: the framework runs on PTHREAD and a POSIX operating system. Second stack: the framework runs on the Threading Abstraction Layer, whose POSIX and RTEMS5 multi-threading frameworks target PTHREAD on a POSIX operating system and RTEMS respectively]
THE MATHEMATICAL BACKEND
[Diagram: Klepsydra AI back-ends, Full-backend (Float32, Int8) and Quantized-backend (Int8), each implemented for ARM and x86; the second panel adds the open question "RISC-V?"]
RISC-V Extensions
• Current version of Klepsydra AI supports RV32GV and RV64GC
• Preparation for NOEL-V in three modes:
• ‘Vanilla’
• P-Extension and V-Extension
• And more…
THE PLAN
Phase 1: Klepsydra AI for RTEMS5
Phase 2: Klepsydra AI for GR765/Leon5
Phase 3: Klepsydra AI for GR765/Noel-V
Phase 4: Validation of Klepsydra AI on GR740 and GR765
THE SCHEDULE
Work Package   Start Month   End Month   Duration in Months
WP0            0             17          18
WP1.1          0             4           5
WP1.2          5             8           4
WP2.1          0             0           1
WP2.2          1             4           4
WP2.3          9             10          2
WP3.1          5             5           1
WP3.2          5             8           4
WP3.3          1             10          10
WP4.1          2             11          10
WP4.2          11            11          1
[Gantt chart: timeline over months 0 to 11 with milestones KOM, MTR1, MTR2 and FR]
CONCLUSIONS
• Enable real AI for future missions on the GR765/NOEL-V
• Very easy to use, via a simple API and web-based optimisation tool
• Highly optimised for the GR765/NOEL-V processors
• Lightweight software (current version is 4 MB)
• Deterministic and full control of the dedicated resources
NEXT STEPS
• In-orbit demonstration:
• OPS-SAT OBC: using the onboard Altera FPGA and a NOEL-V softcore
• Other?
• Health monitoring (core operation failures, etc.)
CONTACT INFORMATION
Dr Pablo Ghiglino
pablo.ghiglino@klepsydra.com
+41786931544
www.klepsydra.com
linkedin.com/company/klepsydra-technologies
