Nadav Har'El, ScyllaDB
The Generalist Engineer meetup, Tel-Aviv
Ides of March, 2016
Seastar / ScyllaDB
Or how we implemented a
10-times faster Cassandra
2
● Israeli but multi-national startup company
– 15 developers cherry-picked from 10 countries.
● Founded 2013 (“Cloudius Systems”)
– by Avi Kivity and Dor Laor of KVM fame.
● Fans of open-source: OSv, Seastar, ScyllaDB.
3
Your mission, should
you choose to accept it:
Make Cassandra 10 times faster
4
“Make Cassandra 10 times faster”
● Why 10?
● Why Cassandra?
– Popular NoSQL database (2nd to MongoDB).
– Powerful and widely applicable.
– Example of a wider class of middleware.
● Why “mission impossible”?
– Cassandra is not considered particularly slow.
– It is considered faster than MongoDB, HBase, et al.
– “disk is bottleneck” (no longer, with SSD!)
5
Our first attempt: OSv
● New OS design specifically for cloud VMs:
– Run a single application per VM (“unikernel”)
– Run existing Linux applications (Cassandra)
– Run these faster than Linux.
6
OSv
● Some of the many ideas we used in OSv:
– Single address space.
– System call is just a function call.
– Faster context switches.
– No spin locks.
– Smaller code.
– Redesigned network stack (Van Jacobson).
7
OSv
● Writing an entire OS from scratch was a really
fun exercise for our generalist engineers.
● Full description of OSv is beyond the scope of
this talk. Check out:
– “OSv—Optimizing the Operating System for Virtual
Machines”, Usenix ATC 2014.
8
Cassandra on OSv
● Cassandra-stress, READ, 4 vcpu:
On OSv, 34% faster than Linux
● Very nice, but not even close to our goal.
What are the remaining bottlenecks?
9
Bottlenecks: API locks
● In one profile, we saw 20% of run time spent in lock()
and unlock() operations, most of them uncontended.
– POSIX APIs allow threads to share
● file descriptors
● sockets
– As many as 20 lock/unlock operations for each network packet!
● Uncontended locks were cheap on uniprocessors (UP), where
a flag to disable preemption is enough,
but the atomic operations they need are slow on many cores.
10
Bottlenecks: API copies
● Write/send system calls copy user data to the
kernel
– Even on OSv with no user-kernel separation
– Part of the socket API
● Similar for read
11
Bottlenecks: context switching
● One thread per CPU is optimal; more than one costs:
– Context switch time
– Stacks that consume memory and pollute the CPU cache
– Thread imbalance
● One thread per CPU requires fully non-blocking APIs
– Cassandra uses mmap() for disk…
12
Bottlenecks: unscalable applications
● Contended locks ruin scalability to many cores
– Memcache's counter and shared cache
● Solution: per-cpu data.
● Even lock-free atomic algorithms are unscalable
– Cache line bouncing
● Again, better to shard, not share, data.
– Sharing becomes worse as the core count grows
● NUMA
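The "shard, not share" point can be illustrated with a small sketch in standard C++. It is not taken from Seastar or memcached: the names `count_shared`, `count_sharded` and `padded_counter`, and the 64-byte cache-line size, are assumptions made for illustration only.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines, which is typical for x86 servers.
struct alignas(64) padded_counter {
    std::uint64_t value = 0;
};

// Shared design: every increment is an atomic read-modify-write on one
// cache line, which bounces between the cores that touch it.
std::uint64_t count_shared(unsigned nthreads, std::uint64_t per_thread) {
    std::atomic<std::uint64_t> counter{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (std::uint64_t i = 0; i < per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}

// Sharded design: each thread owns one counter, padded to its own cache
// line; the shards are summed only after all workers have finished.
std::uint64_t count_sharded(unsigned nthreads, std::uint64_t per_thread) {
    std::vector<padded_counter> shards(nthreads);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t) {
        workers.emplace_back([&shards, t, per_thread] {
            for (std::uint64_t i = 0; i < per_thread; ++i) {
                ++shards[t].value;  // plain increment, no atomics needed
            }
        });
    }
    for (auto& w : workers) w.join();
    std::uint64_t total = 0;
    for (const auto& s : shards) total += s.value;
    return total;
}
```

Both functions return the same total; the difference is only in how many cores contend for the same cache line while counting.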
13
Therefore
● Need to provide better APIs for server
applications
– Not file descriptors, sockets, threads, etc.
● Need to write better applications.
14
Framework
● One thread per CPU
– Event-driven programming
– Everything (network & disk) is non-blocking
– How to write complex applications?
15
Framework
● Sharded (shared-nothing) applications
– Important!
16
Framework
● Language with no runtime overheads or built-in
data sharing
17
Seastar
● C++14 library
● For writing new high-performance server applications
● Share-nothing model, fully asynchronous
● Futures & Continuations based
– Unified API for all asynchronous operations
– Compose complex asynchronous operations
– The key to complex applications
● (Optionally) full zero-copy user-space TCP/IP (over DPDK)
● Open source: http://www.seastar-project.org/
18
Seastar linear scaling in #cores
19
Seastar linear scaling in #cores
20
Brief introduction to Seastar
21
Sharded application design
● One thread per CPU
● Each thread handles one shard of data
– No shared data (“share nothing”)
– Separate memory per CPU (NUMA aware)
– Message-passing between CPUs
– No locks or cache line bounces
● Reactor (event loop) per thread
● User-space network stack also sharded
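The sharded design above can be modelled in a few lines of standard C++. This is a toy, single-threaded model under stated assumptions, not Seastar's API: `shard`, `sharded_store`, `shard_of`, `run_pending` and `get_or` are hypothetical names, and a real shard would drain its mailbox on its own CPU inside its reactor.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Each shard stands in for one CPU: it owns a private map and a mailbox,
// and only the shard itself ever touches its own map.
struct shard {
    std::unordered_map<std::string, int> data;        // private to this shard
    std::deque<std::function<void(shard&)>> mailbox;  // messages from other shards
};

class sharded_store {
public:
    explicit sharded_store(std::size_t nshards) : shards_(nshards) {}

    // A key always maps to the same owning shard.
    std::size_t shard_of(const std::string& key) const {
        return std::hash<std::string>{}(key) % shards_.size();
    }

    // Instead of locking a shared map, send a message to the owner.
    void put(const std::string& key, int value) {
        shards_[shard_of(key)].mailbox.push_back(
            [key, value](shard& owner) { owner.data[key] = value; });
    }

    // Each shard drains its own mailbox. Here this happens in one loop;
    // in a real system every shard would do it on its own thread.
    void run_pending() {
        for (auto& s : shards_) {
            while (!s.mailbox.empty()) {
                auto msg = std::move(s.mailbox.front());
                s.mailbox.pop_front();
                msg(s);
            }
        }
    }

    // Look up a key on its owning shard; returns fallback when absent.
    int get_or(const std::string& key, int fallback) const {
        const auto& data = shards_[shard_of(key)].data;
        auto it = data.find(key);
        return it == data.end() ? fallback : it->second;
    }

private:
    std::vector<shard> shards_;
};
```

A `put` only becomes visible after the owning shard has processed its mailbox, which is the price of never taking a lock on the data itself.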
22
Futures and continuations
● Futures and continuations are the building
blocks of asynchronous programming in
Seastar.
● Can be composed together to a large, complex,
asynchronous program.
23
Futures and continuations
● A future is a result which may not be available yet:
– Data buffer from the network
– Timer expiration
– Completion of a disk write
– The result of a computation which requires the values
from one or more other futures.
● future<int>
● future<>
24
Futures and continuations
● An asynchronous function (also “promise”) is
a function returning a future:
– future<> sleep(duration)
– future<temporary_buffer<char>> read()
● The function sets up for the future to be fulfilled
– sleep() sets a timer to fulfill the future it returns
25
Futures and continuations
● A continuation is a callback, typically a lambda
executed when a future becomes ready
– sleep(1s).then([] {
std::cerr << "done";
});
● A continuation can hold state (lambda capture)
– future<int> slow_incr(int i) {
return sleep(10ms).then(
[i] { return i+1; });
}
26
Futures and continuations
● Continuations can be nested:
– future<int> get();
future<> put(int);
get().then([] (int value) {
put(value+1).then([] {
std::cout << "done";
});
});
● Or chained:
– get().then([] (int value) {
return put(value+1);
}).then([] {
std::cout << "done";
});
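To make the mechanics of chaining concrete, here is a toy future/promise pair in standard C++. It is a sketch under loud assumptions, not Seastar's implementation: `toy_future`, `toy_promise`, `toy_state` and `chained_demo` are hypothetical names, everything runs on one thread, and the toy omits `future<>`, continuations that return futures, and exception handling.

```cpp
#include <functional>
#include <memory>
#include <utility>

template <typename T> class toy_promise;

// State shared by a promise and its future (T must be default-constructible).
template <typename T>
struct toy_state {
    bool ready = false;
    T value{};
    std::function<void(T)> continuation;  // runs when the value arrives
};

template <typename T>
class toy_future {
public:
    explicit toy_future(std::shared_ptr<toy_state<T>> state)
        : state_(std::move(state)) {}

    // Attach a continuation; returns a future for the continuation's result.
    template <typename Func>
    auto then(Func func)
        -> toy_future<decltype(std::declval<Func&>()(std::declval<T>()))> {
        using R = decltype(std::declval<Func&>()(std::declval<T>()));
        toy_promise<R> next;
        toy_future<R> result = next.get_future();
        auto run = [func = std::move(func), next](T v) mutable {
            next.set_value(func(std::move(v)));
        };
        if (state_->ready) {
            run(std::move(state_->value));          // value already there
        } else {
            state_->continuation = std::move(run);  // run it later
        }
        return result;
    }

private:
    std::shared_ptr<toy_state<T>> state_;
};

template <typename T>
class toy_promise {
public:
    toy_promise() : state_(std::make_shared<toy_state<T>>()) {}
    toy_future<T> get_future() { return toy_future<T>(state_); }

    // Fulfil the future: run the continuation if one is attached,
    // otherwise store the value until then() is called.
    void set_value(T v) {
        if (state_->continuation) {
            state_->continuation(std::move(v));
        } else {
            state_->value = std::move(v);
            state_->ready = true;
        }
    }

private:
    std::shared_ptr<toy_state<T>> state_;
};

// Mirrors the chained style: the value flows through two continuations.
int chained_demo() {
    toy_promise<int> p;
    int seen = 0;
    p.get_future()
        .then([](int value) { return value + 1; })
        .then([&seen](int value) { seen = value; return 0; });
    p.set_value(41);  // fulfilling the promise runs both continuations
    return seen;
}
```

Nothing runs when `then()` is called on an unready future; the work happens later, when the asynchronous function that owns the promise fulfils it.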
27
Futures and continuations
● Parallelism is easy:
– sleep(100ms).then([] {
std::cout << "100ms\n";
});
sleep(200ms).then([] {
std::cout << "200ms\n";
});
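Why the two sleeps run in parallel can be seen with a toy event loop. The sketch below is not Seastar's reactor: `toy_reactor`, `sleep_then` and `parallel_demo` are hypothetical names, and the clock is virtual (it jumps to the next deadline) so that the behaviour is deterministic.

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy single-threaded reactor with a virtual clock.
class toy_reactor {
public:
    // Register a callback to run after delay_ms; returns immediately,
    // like an asynchronous function that only sets up a timer.
    void sleep_then(std::uint64_t delay_ms, std::function<void()> callback) {
        timers_.push(timer{now_ms_ + delay_ms, next_seq_++, std::move(callback)});
    }

    // Event loop: run timers in deadline order, advancing the clock.
    void run() {
        while (!timers_.empty()) {
            timer t = timers_.top();
            timers_.pop();
            now_ms_ = t.deadline_ms;
            t.callback();
        }
    }

    std::uint64_t now_ms() const { return now_ms_; }

private:
    struct timer {
        std::uint64_t deadline_ms;
        std::uint64_t seq;  // tie-breaker: keep registration order
        std::function<void()> callback;
        bool operator>(const timer& other) const {
            return std::make_pair(deadline_ms, seq) >
                   std::make_pair(other.deadline_ms, other.seq);
        }
    };
    std::priority_queue<timer, std::vector<timer>, std::greater<timer>> timers_;
    std::uint64_t now_ms_ = 0;
    std::uint64_t next_seq_ = 0;
};

// Both sleeps are registered up front, so they are in flight together:
// the run takes 200 ms of virtual time, not 100 + 200.
std::string parallel_demo() {
    toy_reactor reactor;
    std::string log;
    reactor.sleep_then(200, [&log] { log += "200ms\n"; });
    reactor.sleep_then(100, [&log] { log += "100ms\n"; });
    reactor.run();
    return log + "elapsed=" + std::to_string(reactor.now_ms());
}
```

The callbacks fire in deadline order regardless of the order in which they were registered, and neither sleep blocks the thread while it waits.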
28
Futures and continuations
● In Seastar, every asynchronous operation is a
future:
– Network read or write
– Disk read or write
– Timers
– …
– A complex combination of other futures
● Useful for everything from writing network stack to
writing a full, complex, application.
29
Network zero-copy
● future<temporary_buffer<char>>
input_stream::read()
– temporary_buffer points at driver-provided pages, if
possible.
– Automatically discarded after use (C++).
● future<> output_stream::write(temporary_buffer)
– Future becomes ready when TCP window allows further
writes (usually immediately).
– Buffer discarded after data is ACKed.
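The ownership hand-off behind this API can be sketched with a move-only buffer in standard C++. This is an illustration only: `toy_buffer`, `toy_connection` and `zero_copy_handoff` are hypothetical names and do not reproduce Seastar's `temporary_buffer`, which can also point at driver-provided pages.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Move-only buffer: ownership of the bytes moves, the bytes do not.
class toy_buffer {
public:
    toy_buffer() = default;
    explicit toy_buffer(std::size_t size)
        : data_(new char[size]), size_(size) {}

    toy_buffer(toy_buffer&& other) noexcept
        : data_(std::move(other.data_)),
          size_(std::exchange(other.size_, 0)) {}

    toy_buffer& operator=(toy_buffer&& other) noexcept {
        data_ = std::move(other.data_);
        size_ = std::exchange(other.size_, 0);
        return *this;
    }

    // Copying is implicitly disabled by the unique_ptr member.

    char* get_write() { return data_.get(); }
    const char* get() const { return data_.get(); }
    std::size_t size() const { return size_; }

private:
    std::unique_ptr<char[]> data_;  // freed automatically when discarded
    std::size_t size_ = 0;
};

// Stand-in for an output stream that keeps the buffer until it is
// no longer needed (for TCP, until the data has been ACKed).
struct toy_connection {
    toy_buffer pending;
    void write(toy_buffer buf) { pending = std::move(buf); }
};

// Returns true when the connection ends up holding the very same memory
// the producer filled, i.e. no byte was copied along the way.
bool zero_copy_handoff() {
    toy_buffer buf(4);
    std::memcpy(buf.get_write(), "ping", 4);
    const char* before = buf.get();

    toy_connection conn;
    conn.write(std::move(buf));

    return conn.pending.get() == before && conn.pending.size() == 4;
}
```

Because the buffer cannot be copied, the compiler rejects any code path that would duplicate the payload by accident; the memory is released when the last owner lets go of it.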
30
Two TCP/IP implementations
● Networking API (common to both stacks)
● Seastar (native) stack:
– User-space TCP/IP
– Interface layer
– DPDK, Virtio, Xen
– igb, ixgb
● POSIX (hosted) stack:
– Linux kernel (sockets)
31
Disk I/O
● Asynchronous and zero copy, using AIO and
O_DIRECT.
● Not implemented well by all filesystems
– XFS recommended
● Focusing on SSD
● Future thought:
– Direct NVMe support,
– Implement filesystem in Seastar.
32
More info on Seastar
● http://seastar-project.com
● https://github.com/scylladb/seastar
● http://docs.seastar-project.org/
● http://docs.seastar-project.org/master/md_doc_tutorial.html
33
ScyllaDB
● NoSQL database, implemented in Seastar.
● Fully compatible with Cassandra:
– Same CQL queries
– Copy over a complete Cassandra database
– Use existing drivers
– Use existing cassandra.yaml
– Use same nodetool or JMX console
– Can be clustered (of course...)
34
Caching: Cassandra vs. ScyllaDB
● Cassandra:
– Key cache
– Row cache (on-heap / off-heap)
– Linux page cache
– SSTables
● ScyllaDB:
– Unified cache
– SSTables
ScyllaDB's unified cache:
● Don't double-cache.
● Don't cache unrelated rows.
● Don't cache unparsed sstables.
● Can fit much more into cache.
● No page faults, threads, etc.
35
Scylla vs. Cassandra
● Single node benchmark:
– 2 x 12-core x 2 hyperthread Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz

cassandra-stress benchmark   ScyllaDB    Cassandra
Write                        1,871,556   251,785
Read                         1,585,416   95,874
Mixed                        1,372,451   108,947
36
Scylla vs. Cassandra
● We really got a 7x to 16x speedup!
● Reads sped up more:
– Cassandra writes are simpler
– Row-cache benefits further improve Scylla's read
● Almost 2 million writes per second on single
machine!
– Google reported in their blogs achieving 1 million writes
per second on 330 (!) machines
– (2 years ago, and RF=3… but still impressive).
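The speedup range follows directly from the single-node figures. The short sketch below just divides the two columns; the throughput numbers are copied from the benchmark slide, while the names `result`, `results` and `speedup` are made up for illustration.

```cpp
// Throughput figures from the single-node cassandra-stress benchmark.
struct result {
    const char* workload;
    double scylla;
    double cassandra;
};

constexpr result results[] = {
    {"Write", 1871556.0, 251785.0},
    {"Read",  1585416.0,  95874.0},
    {"Mixed", 1372451.0, 108947.0},
};

// Speedup is simply the ratio of the two throughputs.
constexpr double speedup(const result& r) {
    return r.scylla / r.cassandra;
}

// Write: roughly 7.4x, Read: roughly 16.5x, Mixed: roughly 12.6x.
```

Writes give the low end of the range and reads the high end, which matches the observation that reads sped up more.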
37
Scylla vs. Cassandra
3 node cluster, 2x12 cores each; RF=3, CL=quorum
38
Better latency, at all load levels
39
What will you do with 10x performance?
● Shrink your cluster by a factor of 10
● Use stronger (but slower) data models
● Run more queries - more value from your data
● Stop using caches in front of databases
40
41
Do we qualify?
In 3 years, our small team wrote:
● A complete kernel and library (OSv).
● An asynchronous programming framework
(Seastar).
● A complete Cassandra-compatible NoSQL
database (ScyllaDB).
42
43
This project has received funding from the European Union’s
Horizon 2020 research and innovation programme under grant
agreement No 645402.

