Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
San Francisco Java Users Group
A Java Library for High-Speed Streaming
Data into Your Database
Pablo Silberkasten, Software Development Manager, Oracle
Kuassi Mensah, Director Product Management, Oracle
OJDBC and OJVM Development
April 15, 2019
Confidential – Oracle Internal/Restricted/Highly Restricted
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, timing, and pricing of any
features or functionality described for Oracle’s products may change and remains at the
sole discretion of Oracle Corporation.
Program Agenda
1. Challenges when Streaming Data into the Database
2. Introducing the High Speed Streaming Library
3. API and Code Samples
4. Cloud Service Demo
Challenges when Streaming Data into the Database
Common requirements for multiple concurrent agents streaming data
• Scalability: handle thousands of concurrent clients streaming data into the same table/database
• Responsiveness: minimal response time and asynchronous processing, with non-blocking back-pressure
• Elasticity: responsiveness is not affected by a varying workload
In these scenarios, regular JDBC inserts will not scale.
Challenges when Streaming Data into the Database
Transparently exploit database features, with no changes in the client
• Automatic routing (payload dependent)
• High availability
• Disaster recovery
• Planned maintenance and changes in the database topology
• Database upgrades (use new features with no changes on the client)
Provide an abstraction layer with these features.
Introducing the High Speed Streaming Library
Java library that allows users to stream data into the Oracle Database
• Fastest insert method for Oracle Database: through Direct Path
• RAC and shard awareness (routing capabilities): through native UCP
• Extremely simple to configure and use
• Streaming capability: receive data from a large group of clients without blocking
[Architecture diagram: Java clients feed the Core Library through the Push Publisher API (accept(byte[])) or the Flow Publisher API (onNext(byte[]), request(n)). In the Core Library, local threads process queues of grouped records (queue1, queue2) with tagged connections from the UCP Connection Pool, which uses the Thin Driver Direct Path.]
Introducing the High Speed Streaming Library
Fastest insert method for Oracle Database: through Direct Path
– Since 19c, the Thin Driver supports Direct Path insert as part of the Oracle Connection internal APIs (not JDBC standard).
– The library uses these APIs to load bulk data into the database faster.
– Direct Path skips the SQL engine on the server side, writing data directly into database buffers.
– Direct Path loads a stream into a table or a partition, with some intrinsic limitations:
• Triggers are not supported
• Referential integrity is not checked
• Partitioning columns must come before any LOB column
Introducing the High Speed Streaming Library
RAC and Shard awareness (routing capabilities): through native UCP
– Through UCP, the library recognizes the sharding keys specified by users and connects them to the specific shard and chunk.
– Sharding-key ranges are cached with the location of their shard, so connection requests can bypass the GSM (shard director).
– An available connection in the pool is selected just by providing sharding keys; this allows connections to various shards to be reused through the cached routing topology.
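One way to picture the cached routing topology is a sorted map from the lower bound of each sharding-key range to a shard location. The following is a minimal sketch under that assumption and is not UCP's actual implementation; the class name ShardRouteCache, the numeric keys, and the shard names are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical illustration of a cached routing topology: each entry maps the
// inclusive lower bound of a sharding-key range to the shard that owns it.
final class ShardRouteCache {
    private final TreeMap<Long, String> lowerBoundToShard = new TreeMap<>();

    // Record that keys >= lowerBound (up to the next registered bound) live on the given shard.
    void put(long lowerBound, String shard) {
        lowerBoundToShard.put(lowerBound, shard);
    }

    // Resolve a key locally; an empty result means the cache cannot answer
    // and the caller would have to ask the shard director instead.
    Optional<String> lookup(long shardingKey) {
        Map.Entry<Long, String> entry = lowerBoundToShard.floorEntry(shardingKey);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }

    public static void main(String[] args) {
        ShardRouteCache cache = new ShardRouteCache();
        cache.put(0L, "shard-a");
        cache.put(1_000L, "shard-b");
        System.out.println(cache.lookup(42L));     // Optional[shard-a]
        System.out.println(cache.lookup(1_500L));  // Optional[shard-b]
        System.out.println(cache.lookup(-1L));     // Optional.empty
    }
}
```

A lookup that hits the cache needs no round trip to the shard director, which is the benefit the slide attributes to caching the key ranges.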
Introducing the High Speed Streaming Library
Extremely simple to configure and use
// Simple and intuitive constructor
FastIngestSuite fis = new FISBuilder()
    .url(url)
    .schema("scott")
    .username("scott")
    .password("tiger")
    .executor(newScheduledThreadPool(2))
    .bufferCapacity(bufferCapacity)
    .bufferInterval(Duration.ofMillis(1000))
    .transformer(bytes -> new Customer(bytes)) // Function<byte[], Record>
    .table("customers")
    .columns(new String[] { "ID", "NAME", "REGION" })
    .build();
// Coming soon: Record API (no need for transformer, table and/or columns)
@Record
@Table(name = "customers")
class Customer {
    @Column(name = "ID") private long id;
    @Column(name = "NAME") private String name;
    @Column(name = "REGION") private String region;
}
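The bufferCapacity and bufferInterval settings suggest that records are grouped and handed off either when the buffer fills or when a time interval elapses. The sketch below illustrates that idea only; it assumes this reading of the two parameters, and BatchBuffer, its injected clock, and its sink are hypothetical names that do not come from the library.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Hypothetical sketch: groups records and hands them to a sink when either
// the capacity is reached or the interval has elapsed since the last flush.
final class BatchBuffer {
    private final int capacity;
    private final long intervalNanos;
    private final LongSupplier clock;            // injected so the behavior is deterministic
    private final Consumer<List<byte[]>> sink;   // stands in for the bulk insert
    private List<byte[]> pending = new ArrayList<>();
    private long lastFlush;

    BatchBuffer(int capacity, Duration interval, LongSupplier clock, Consumer<List<byte[]>> sink) {
        this.capacity = capacity;
        this.intervalNanos = interval.toNanos();
        this.clock = clock;
        this.sink = sink;
        this.lastFlush = clock.getAsLong();
    }

    // Add one record; flush if the buffer is full or the interval has expired.
    synchronized void add(byte[] record) {
        pending.add(record);
        boolean full = pending.size() >= capacity;
        boolean expired = clock.getAsLong() - lastFlush >= intervalNanos;
        if (full || expired) {
            flush();
        }
    }

    // Hand the grouped records to the sink and start a new group.
    synchronized void flush() {
        if (!pending.isEmpty()) {
            sink.accept(pending);
            pending = new ArrayList<>();
        }
        lastFlush = clock.getAsLong();
    }
}
```

In this sketch the interval is only checked when a record arrives; a fuller version would probably also call flush() from a scheduled task so that an idle buffer is still emptied on time.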
API and Code Samples
Push Publisher - For Simple Usage
FastIngestSuite fis = FastIngestSuite.builder(). . . // Easy constructor
    .transformer(bytes -> { // Lambda transformation
        final Transformer.Record record = new Transformer.Record();
        record.setColumnValue("payload", bytes);
        record.setColumnValue("state_code", "CA");
        return record;
    })
    .table("fis_demo_nonsharded_sample")
    .build();
PushPublisher<byte[]> pushPublisher = FastIngestSuite.pushPublisher(); // Factory builder
pushPublisher.subscribe(fis.subscriberByteArray());
// Ingest ad-hoc
pushPublisher.accept(new byte[] { 'a', 'b' });
// Ingest using Consumer Functional Interface
List<byte[]> data = Arrays.asList(
    new byte[] { 'a', 'b' }, new byte[] { 'a', 'c' }, new byte[] { 'b', 'c' });
data.stream()
    .filter(p -> p[0] == 'b')
    .forEach(pushPublisher::accept);
API and Code Samples
Flow Publisher - A recap on Flow API
[Diagram: overview of the Flow API, with one element labeled "Provided by the library"]
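As a recap in code: java.util.concurrent.Flow (JDK 9 and later) defines Publisher, Subscriber, Subscription, and Processor. A Subscriber receives a Subscription in onSubscribe and uses request(n) to tell the Publisher how many items it is ready for. The sketch below uses only JDK classes; CollectingSubscriber and roundTrip are illustrative names and are not part of the streaming library.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Illustrative subscriber: pulls one item at a time so demand is always explicit.
final class CollectingSubscriber implements Flow.Subscriber<String> {
    final List<String> received = new CopyOnWriteArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription;

    @Override public void onSubscribe(Flow.Subscription s) {
        subscription = s;
        s.request(1);                 // signal demand for the first item
    }
    @Override public void onNext(String item) {
        received.add(item);
        subscription.request(1);      // ask for the next item only when ready
    }
    @Override public void onError(Throwable t) { done.countDown(); }
    @Override public void onComplete() { done.countDown(); }

    // Publishes the items through the JDK's SubmissionPublisher and returns
    // what the subscriber received, in order.
    static List<String> roundTrip(String... items) {
        CollectingSubscriber subscriber = new CollectingSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (String item : items) {
                publisher.submit(item);   // blocks only while the subscriber's buffer is full
            }
        }                                 // close() leads to onComplete after pending items
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("subscriber did not complete in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("a", "b", "c"));  // [a, b, c]
    }
}
```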
API and Code Samples
Flow Publisher - Creating Custom Publishers
// The library user creates the Publisher implementation
public class MyPublisher implements Publisher<byte[]> {
    // Reference to the thread pool where execution will take place
    private final ExecutorService threadPool;
    // Reference to only one subscription (it could be more than one)
    private Subscription subscription = null;
    // The implementation must provide a subscribe behavior
    @Override public void subscribe(Subscriber<? super byte[]> subscriber) {
        subscription = new MySubscription(); // Subscription also provided by the user
        (fisSubscriber = subscriber).onSubscribe(subscription); // Execute the subscriber's onSubscribe
        startPublishing(); // Optional, start publishing after subscription
        . . .
    // Sample publishing (like startPublishing)
    private void startPublishing() {
        threadPool.submit(() -> { // Publishing executes on the Publisher's thread pool
            while (recordsCount > 0) {
                fisSubscriber.onNext(byteArray); // <- the subscriber's onNext is where the data is sent
                recordsCount--; // . . .
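The excerpt above elides how publishing is tied to the subscriber's demand. For reference, here is a self-contained sketch of a single-subscriber publisher that calls onNext only for items that were requested. It is an illustration built on the JDK Flow interfaces, not the presenters' MyPublisher; DemandAwarePublisher and RecordSubscription are hypothetical names.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-subscriber publisher that emits only what was requested.
final class DemandAwarePublisher implements Flow.Publisher<byte[]> {
    private final List<byte[]> records;
    private final Executor executor;   // where the emission loop runs

    DemandAwarePublisher(List<byte[]> records, Executor executor) {
        this.records = records;
        this.executor = executor;
    }

    @Override public void subscribe(Flow.Subscriber<? super byte[]> subscriber) {
        subscriber.onSubscribe(new RecordSubscription(subscriber));
    }

    private final class RecordSubscription implements Flow.Subscription {
        private final Flow.Subscriber<? super byte[]> subscriber;
        private final Iterator<byte[]> source = records.iterator();
        private final AtomicLong demand = new AtomicLong();
        private final AtomicInteger pendingDrains = new AtomicInteger();
        private final AtomicBoolean finished = new AtomicBoolean();

        RecordSubscription(Flow.Subscriber<? super byte[]> subscriber) {
            this.subscriber = subscriber;
        }

        @Override public void request(long n) {
            if (n <= 0) {
                if (finished.compareAndSet(false, true)) {
                    subscriber.onError(new IllegalArgumentException("non-positive request"));
                }
                return;
            }
            // Accumulate demand, saturating at Long.MAX_VALUE on overflow.
            demand.accumulateAndGet(n, (a, b) -> a + b < 0 ? Long.MAX_VALUE : a + b);
            // Only one drain loop runs at a time; re-entrant requests just bump the counter.
            if (pendingDrains.getAndIncrement() == 0) {
                executor.execute(this::drain);
            }
        }

        @Override public void cancel() {
            finished.set(true);
        }

        private void drain() {
            do {
                while (!finished.get() && demand.get() > 0 && source.hasNext()) {
                    demand.decrementAndGet();
                    subscriber.onNext(source.next());
                }
                if (!source.hasNext() && finished.compareAndSet(false, true)) {
                    subscriber.onComplete();
                }
            } while (pendingDrains.decrementAndGet() > 0);
        }
    }
}
```

Passing an executor such as a thread pool moves emission off the caller's thread, while Runnable::run keeps it synchronous, which is convenient for tests.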
API and Code Samples
Flow Publisher - Using the Custom Publisher with the Library
// Create the publisher and subscribe the library's subscriber to it
MyPublisher myPublisher = new MyPublisher(threadPool, recordsCount);
myPublisher.subscribe(fis.subscriberByteArray()); // Subscriber provided by the library (onNext -> ingest)
// Subscription provided by the user
private class MySubscription implements Subscription {
    @Override public void request(long request) {
        demand.addAndGet(request); // This is how the Subscriber signals unfulfilled demand
    }
    @Override public void cancel() {
        isCancelled = true; // The subscriber signals that it will stop receiving messages
    }
}
API and Code Samples
Flow Publisher - The Library is the Subscriber
public class FlowSubscriber<T> implements Flow.Subscriber<T> {
    @Override public void onSubscribe(Subscription givenSubscription) { // Called by the Publisher
        if (activeSubscription == null) {
            isSubscribed = true;
            activeSubscription = givenSubscription;
            long request = fis.request(); // Check availability in the internal buffer
            if (request > 0) {
                activeSubscription.request(request); // Signal availability . . .
    @Override public void onNext(T item) {
        // Add the item to the buffer without blocking
        // Ingestion will execute in the library's thread pool (which is the most important feature!)
        fis.putRecord((byte[]) item);
        // Calculate the remaining capacity and signal it
        long request = fis.request();
        if (request > 0) {
            activeSubscription.request(request);
        } else {
            // Buffer is full . . .
    @Override public void onError(Throwable throwable) { // Provide error handling
    @Override public void onComplete() { // Provide completion handling
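The excerpt shows the library requesting more items according to the free space in its internal buffer. A self-contained sketch of that pattern follows, written under the assumption that demand should never exceed free capacity. BoundedBufferSubscriber and its drain method are hypothetical and do not reproduce the library's FlowSubscriber.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Flow;

// Hypothetical subscriber that never requests more items than its buffer can hold.
final class BoundedBufferSubscriber implements Flow.Subscriber<byte[]> {
    private final int capacity;
    private final Queue<byte[]> buffer = new ArrayDeque<>();
    private Flow.Subscription subscription;
    private long outstanding;   // requested from the publisher but not yet delivered

    BoundedBufferSubscriber(int capacity) {
        this.capacity = capacity;
    }

    @Override public synchronized void onSubscribe(Flow.Subscription s) {
        subscription = s;
        requestFreeSpace();
    }

    @Override public synchronized void onNext(byte[] item) {
        outstanding--;
        buffer.add(item);       // cannot overflow: demand was capped by free space
    }

    @Override public void onError(Throwable t) { /* error handling goes here */ }
    @Override public void onComplete() { /* completion handling goes here */ }

    // Called by the ingesting side: take everything buffered, then reopen demand.
    synchronized List<byte[]> drain() {
        List<byte[]> batch = new ArrayList<>(buffer);
        buffer.clear();
        requestFreeSpace();
        return batch;
    }

    // Request only the slots that are neither occupied nor already promised.
    private void requestFreeSpace() {
        long free = capacity - buffer.size() - outstanding;
        if (free > 0 && subscription != null) {
            outstanding += free;
            subscription.request(free);
        }
    }
}
```

Tracking the outstanding count matters because request(n) is cumulative: asking for the full free space after every onNext would promise the publisher more slots than the buffer has.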
API and Code Samples
Flow Publisher – Using SubmissionPublisher (java.base module, java.util.concurrent package)
// Use the SubmissionPublisher provided by the JRE
// Defaults: ForkJoinPool.commonPool() as executor, buffer size 256
SubmissionPublisher<byte[]> publisher = new SubmissionPublisher<>();
publisher.subscribe(fis.subscriberByteArray());
// Use the publisher's built-in methods: submit, offer, status of publishers vs. subscribers
publisher.submit(new byte[] { 'a', 'b' });
// Graceful shutdown
publisher.close();
fis.close();
ForkJoinPool.commonPool().shutdown(); // Note: shutdown() has no effect on the common pool
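Of the built-in methods mentioned above, submit blocks the caller while a subscriber's buffer is full, whereas offer returns immediately and reports drops. The sketch below shows offer against a subscriber that never signals demand, using only JDK classes. NonBlockingOffer is an illustrative name, and how many items fit before drops begin depends on the JDK's buffer handling.

```java
import java.util.concurrent.Flow;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.SubmissionPublisher;

final class NonBlockingOffer {
    // Offers `count` items to a subscriber that never signals demand and
    // returns how many of them were dropped instead of blocking the caller.
    static int offerWithoutDemand(int count, int maxBufferCapacity) {
        int dropped = 0;
        try (SubmissionPublisher<byte[]> publisher =
                 new SubmissionPublisher<>(ForkJoinPool.commonPool(), maxBufferCapacity)) {
            publisher.subscribe(new Flow.Subscriber<byte[]>() {
                @Override public void onSubscribe(Flow.Subscription s) { /* never calls request(n) */ }
                @Override public void onNext(byte[] item) { }
                @Override public void onError(Throwable t) { }
                @Override public void onComplete() { }
            });
            for (int i = 0; i < count; i++) {
                // A null drop handler means: do not retry, just drop.
                int lagOrDrops = publisher.offer(new byte[] { (byte) i }, null);
                if (lagOrDrops < 0) {
                    dropped += -lagOrDrops;   // negative return value = number of drops
                }
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        System.out.println("dropped: " + offerWithoutDemand(100, 2));
    }
}
```

A caller that must not lose data could pass a drop handler that returns true to retry once, or fall back to submit and accept blocking.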
Flow Publisher – Ready for 3rd party implementations!
• RxJava
• Reactive4JavaFlow
• Future implementations!
Demo & Outcome (POC)
Database: Oracle Database 19c running on 1 Exalogic X4-2 compute node*.
Table columns:
• Id
• Name
• Region
Collector: running on 1 Exalogic X4-2 compute node*.
• 4 threads for service I/O
• 8 threads for library
• 20s buffer limit
• 4 GB buffer cap
JMeter (load simulator): running on 1 Exalogic X4-2 compute node*.
• 10,000 concurrent threads
• 2-minute start-up time
• 0.25s between posts
• 120b payload
Outcome:
• Rate: 40,000 rpc/second
• Avg. response time: ~4-5 ms
*Exalogic X4-2 compute node: Memory 94.6 GB; CPU 12 cores, 24 threads, 2 sockets, Intel Xeon Processor X5675 3.07GHz
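The reported rate matches what the load parameters imply: 10,000 client threads each posting once every 0.25 s offer 10,000 / 0.25 = 40,000 posts per second. This assumes the pause dominates each thread's cycle, which is plausible given the few-millisecond response time. The helper below only spells out that arithmetic; LoadEstimate is a hypothetical name.

```java
final class LoadEstimate {
    // Offered request rate when each client posts once per pause interval.
    static double requestsPerSecond(int clients, double secondsBetweenPosts) {
        return clients / secondsBetweenPosts;
    }

    public static void main(String[] args) {
        System.out.println(requestsPerSecond(10_000, 0.25));  // 40000.0
    }
}
```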

Java Library for High Speed Streaming Data

  • 1. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | San Francisco Java Users Group A Java Library for High-Speed Streaming Data into Your Database Pablo Silberkasten, Software Development Manager, Oracle Kuassi Mensah, Director Product Management, Oracle OJDBC and OJVM Development April 15, 2019 Confidential – Oracle Internal/Restricted/Highly Restricted
  • 2. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation. Confidential – Oracle Internal/Restricted/Highly Restricted
  • 3. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Program Agenda Challenges when Streaming Data into the Database Introducing the High Speed Streaming Library API and Code Samples Cloud Service Demo 1 2 3 4 Confidential – Oracle Internal/Restricted/Highly Restricted 3
  • 4. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Challenges when Streaming Data into the Database • Scalability: handle thousands of concurrent clients streaming data into the same table/database • Responsiveness: minimal response time, asynchronous processing, with non-blocking back-pressure • Elasticity: responsiveness not being affected under varying workload Common requirements for multiple concurrent agents streaming data Confidential – Oracle Internal/Restricted/Highly Restricted 4 In these scenarios regular JDBC inserts will not scale
  • 5. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Challenges when Streaming Data into the Database • Automatic Routing (payload dependent) • High Availability • Disaster Recover • Planned Maintenance and changes in the database topology • Database upgrades (use new features with no changes on the client) Transparently exploit -with no changes in the client- features of the Database Confidential – Oracle Internal/Restricted/Highly Restricted 5 Provide an abstraction layer with these features
  • 6. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Program Agenda Challenges when Streaming Data into the Database Introducing the High Speed Streaming Library API and Code Samples Cloud Service Demo 1 2 3 4 Confidential – Oracle Internal/Restricted/Highly Restricted 6
  • 7. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Introducing the High Speed Streaming Library • Fastest insert method for Oracle Database: through Direct Path • RAC and Shard awareness (routing capabilities): through native UCP • Extremely simple to configure and use • Streaming capability: unblockingly receive data from a large group of clients Java library that allows users to stream data into the Oracle Database Confidential – Oracle Internal/Restricted/Highly Restricted 7 UCP Connection Pool Thin Driver Direct Path Core Library Local Threads processing queues (grouped records) with tagged connections queue2 queue1 Java Client (Push Publisher API) Java Client (Flow Publisher API) accept(byte[]) onNext(byte[]) request(n)
  • 8. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Introducing the High Speed Streaming Library – Since 19c Thin Driver supports Direct Path Insert as part of the Oracle Connection internal APIs (not JDBC standard) – The library uses these APIs for loading bulk data faster in database. – Direct Path skips SQL engine in server side writing data directly in database buffers. – Direct Path loads stream in a table or in a partition with some intrinsic limitations: • Triggers are not supported • Referential integrity is not checked • Partitioning columns come before any LOB Fastest insert method for Oracle Database: through Direct Path Confidential – Oracle Internal/Restricted/Highly Restricted 8
  • 9. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Introducing the High Speed Streaming Library – Through UCP usage get the capability to recognize the sharding keys specified by the users and allow them to connect to the specific shard and chunk – Cache sharding-key ranges to the location of the shard to allow connection request bypassing the GSM (shard director) – Select an available connection in the pool just by providing sharding keys; this allows the reuse of connections to various shards by using the cached routing topology RAC and Shard awareness (routing capabilities): through native UCP Confidential – Oracle Internal/Restricted/Highly Restricted 9
  • 10. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Introducing the High Speed Streaming Library Extremely simple to configure and use Confidential – Oracle Internal/Restricted/Highly Restricted 10 // Simple and intuitive constructor FastIngestSuite fis = new FISBuilder() .url(url) .schema("scott") .username("scott") .password("tiger") .executor(newScheduledThreadPool(2)) .bufferCapacity(bufferCapacity) .bufferInterval(Duration.ofMillis(1000)) .transformer(bytes -> new Customer(bytes)) // Function<byte[], Record> .table("customers") .columns(new String[] { "ID", "NAME", "REGION" }) .build(); // Coming soon!: Record API (no need for transformer, table and/or column @Record @Table(name = “customers”) class Costumer { @Column(name = “ID”) private long id; @Column(name = “NAME”) private String name; @Column(name = “REGION”) private String region; }
  • 11. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Program Agenda Challenges when Streaming Data into the Database Introducing the High Speed Streaming Library API and Code Samples Demo 1 2 3 4 Confidential – Oracle Internal/Restricted/Highly Restricted 11
  • 12. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | API and Code Samples Push Publisher - For Simple Usage Confidential – Oracle Internal/Restricted/Highly Restricted 12 FastIngestSuite fis = FastIngestSuite.builder(). . . // Easy constructor .transformer(bytes -> { // Lambda transformation final Transformer.Record record = new Transformer.Record(); record.setColumnValue("payload", bytes); record.setColumnValue("state_code", "CA"); return record;}) .table("fis_demo_nonsharded_sample") .build(); PushPublisher<byte[]> pushPublisher = FastIngestSuite.pushPublisher(); // Factory builder pushPublisher.subscribe(fis.subscriberByteArray()); // Ingest ad-hoc pushPublisher.accept(new byte[] {'a', 'b'}); // Ingest using Consumer Functional Interface List<byte[]> data = Arrays.asList( new byte[] { 'a', 'b' }, new byte[] { 'a', 'c' }, new byte[] { 'b', 'c' }); data.stream() .filter(p -> p[0] == 'b') .forEach(pushPublisher::accept);
  • 13. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | API and Code Samples Flow Publisher - A recap on Flow API Confidential – Oracle Internal/Restricted/Highly Restricted 13 Provided by the library
  • 14. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | API and Code Samples Flow Publisher - Creating Custom Publishers Confidential – Oracle Internal/Restricted/Highly Restricted 14 // Library user creates Publisher’s implementation public class MyPublisher implements Publisher<byte[]> { // This implementation has a reference to the threadpool where execution will take place private final ExecutorService threadPool; // This implementation has a reference to only one subscription (it could be more than one) private Subscription subscription = null; // Implementation must provide a subscribe behavior @Override public void subscribe(Subscriber<? super byte[]> subscriber) { subscription = new MySubscription(); // Subscription also provided by user (fisSubscriber = subscriber).onSubscribe(subscription); // Execute subscriber’s onSubscribe startPublishing(); // Optional, start publishing after subscription . . . // Sample publishing (like startPublishing) private void startPublishing() { threadPool.submit(() -> { // publishing executes on the Publisher’s threadppol while (recordsCount > 0) { fisSubscriber.onNext(byteArray); // <- onNext’s subscriber is where the data is sent recordsCount--; // . . .
  • 15. API and Code Samples: Flow Publisher - Using the Custom Publisher with the Library
      // Create the publisher and subscribe the library's subscriber to it
      MyPublisher myPublisher = new MyPublisher(threadPool, recordsCount);
      myPublisher.subscribe(fis.subscriberByteArray()); // Subscriber provided by the library (onNext -> ingest)
      // Subscription provided by the user
      private class MySubscription implements Subscription {
        @Override
        public void request(long request) {
          demand.addAndGet(request); // This is the mechanism for the Subscriber to signal unfulfilled demand
        }
        @Override
        public void cancel() {
          isCancelled = true; // The subscriber signals that it will stop receiving messages
        }
      }
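The slides show `MyPublisher` and `MySubscription` only in fragments. As a rough, self-contained illustration of the same demand mechanism, the sketch below emits items only while the subscriber has unfulfilled demand. It is deliberately single-threaded so the behavior is deterministic, unlike the slide's thread-pool version, and every name in it (`CountingPublisher`, `ManualSubscriber`) is hypothetical rather than part of the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicLong;

class DemandSketch {
    /** Emits 0..total-1, but never more items than the subscriber has requested. */
    static final class CountingPublisher implements Flow.Publisher<Integer> {
        private final int total;

        CountingPublisher(int total) {
            this.total = total;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private final AtomicLong demand = new AtomicLong();
                private int next = 0;
                private boolean draining = false; // guards against re-entrant request() calls
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done) return;
                    if (n <= 0) {
                        done = true;
                        subscriber.onError(new IllegalArgumentException("non-positive request"));
                        return;
                    }
                    // Add to the outstanding demand, saturating instead of overflowing.
                    demand.accumulateAndGet(n, (a, b) -> (a + b < 0) ? Long.MAX_VALUE : a + b);
                    if (draining) return;
                    draining = true;
                    try {
                        while (!done && demand.get() > 0 && next < total) {
                            demand.decrementAndGet();
                            subscriber.onNext(Integer.valueOf(next++));
                        }
                        if (!done && next == total) {
                            done = true;
                            subscriber.onComplete();
                        }
                    } finally {
                        draining = false;
                    }
                }

                @Override
                public void cancel() {
                    done = true; // stop emitting; later requests are ignored
                }
            });
        }
    }

    /** Subscriber that requests nothing on its own; the caller drives demand by hand. */
    static final class ManualSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        Flow.Subscription subscription;
        boolean completed;

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; }
        @Override public void onNext(Integer item) { received.add(item); }
        @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
        @Override public void onComplete() { completed = true; }
    }

    public static void main(String[] args) {
        ManualSubscriber sub = new ManualSubscriber();
        new CountingPublisher(5).subscribe(sub);
        System.out.println(sub.received);   // [] : nothing requested yet
        sub.subscription.request(2);
        System.out.println(sub.received);   // [0, 1]
        sub.subscription.request(10);
        System.out.println(sub.received + " completed=" + sub.completed); // [0, 1, 2, 3, 4] completed=true
    }
}
```

The point of the sketch is that the publisher never calls `onNext` ahead of demand: back-pressure comes from the subscriber withholding `request` calls, with no blocking on either side.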
  • 16. API and Code Samples: Flow Publisher - The Library is the Subscriber
      public class FlowSubscriber<T> implements Flow.Subscriber<T> {
        @Override
        public void onSubscribe(Subscription givenSubscription) { // Called from the Publisher
          if (activeSubscription == null) {
            isSubscribed = true;
            activeSubscription = givenSubscription;
            long request = fis.request(); // Check availability in the internal buffer
            if (request > 0) {
              activeSubscription.request(request); // Signal availability
        . . .
        @Override
        public void onNext(T item) {
          // Non-blocking: adds the item to the buffer
          // Ingestion will execute in the library's thread pool (which is the most important feature!)
          fis.putRecord((byte[]) item);
          // Calculate remaining capacity and signal it
          long request = fis.request();
          if (request > 0) {
            activeSubscription.request(request);
          } else { // Buffer is full
        . . .
        @Override
        public void onError(Throwable throwable) { // Provide error handling
        @Override
        public void onComplete() { // Provide completion handling
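The subscriber on this slide requests only as many items as its internal buffer can hold. The library's internals are not shown in the deck, so the following is a minimal sketch of that idea under stated assumptions: `BoundedBufferSubscriber`, its `drain` method, and `RecordingSubscription` are hypothetical names, and `drain` stands in for whatever the library's ingestion threads do when they empty the buffer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

class BufferSubscriberSketch {
    /** Requests items only while buffered + in-flight items stay within capacity. */
    static final class BoundedBufferSubscriber<T> implements Flow.Subscriber<T> {
        private final ArrayDeque<T> buffer = new ArrayDeque<>();
        private final int capacity;
        private int outstanding; // requested from the publisher but not yet delivered
        private Flow.Subscription subscription;

        BoundedBufferSubscriber(int capacity) {
            this.capacity = capacity;
        }

        /** Signals demand for every slot that is neither filled nor already requested. */
        private void requestFreeSlots() {
            int free = capacity - buffer.size() - outstanding;
            if (free > 0 && subscription != null) {
                outstanding += free;
                subscription.request(free);
            }
        }

        @Override
        public synchronized void onSubscribe(Flow.Subscription s) {
            if (subscription != null) { // accept only one subscription
                s.cancel();
                return;
            }
            subscription = s;
            requestFreeSlots();
        }

        @Override
        public synchronized void onNext(T item) {
            outstanding--;
            buffer.add(item);   // never blocks: demand was capped at the free capacity
            requestFreeSlots(); // no-op while the buffer is full
        }

        @Override public synchronized void onError(Throwable t) { buffer.clear(); }
        @Override public void onComplete() { }

        /** Stand-in for ingestion: removes up to max items, then signals the freed capacity. */
        synchronized List<T> drain(int max) {
            List<T> drained = new ArrayList<>();
            while (drained.size() < max && !buffer.isEmpty()) {
                drained.add(buffer.poll());
            }
            requestFreeSlots();
            return drained;
        }
    }

    /** Test double that records every request(n) the subscriber makes. */
    static final class RecordingSubscription implements Flow.Subscription {
        final List<Long> requests = new ArrayList<>();
        @Override public void request(long n) { requests.add(n); }
        @Override public void cancel() { }
    }

    public static void main(String[] args) {
        BoundedBufferSubscriber<String> sub = new BoundedBufferSubscriber<>(3);
        RecordingSubscription subscription = new RecordingSubscription();
        sub.onSubscribe(subscription);  // empty buffer: requests 3
        sub.onNext("a");
        sub.onNext("b");
        sub.onNext("c");                // buffer full: no further requests
        sub.drain(2);                   // two slots freed: requests 2
        System.out.println(subscription.requests); // [3, 2]
    }
}
```

Because demand never exceeds free capacity, `onNext` can always add to the buffer without blocking, which matches the non-blocking back-pressure requirement from the challenges slide.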
  • 17. API and Code Samples: Flow Publisher – Using SubmissionPublisher (java.base module, java.util.concurrent package)
      // Use the SubmissionPublisher provided by the JRE
      // Defaults: ForkJoinPool.commonPool() as executor, DEFAULT_BUFFER_SIZE = 256
      SubmissionPublisher<byte[]> publisher = new SubmissionPublisher<>();
      publisher.subscribe(fis.subscriberByteArray());
      // Use the publisher's built-in methods: submit, offer, status of publishers vs. subscribers
      publisher.submit(new byte[] { 'a', 'b' });
      // Graceful shutdown
      publisher.close();
      fis.close();
      ForkJoinPool.commonPool().shutdown(); // Per the ForkJoinPool Javadoc, this has no effect on the common pool
      Flow Publisher – Ready for 3rd party implementations!
        • RxJava
        • Reactive4JavaFlow
        • Future implementations!
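To try the `SubmissionPublisher` pattern from this slide without the library on the classpath, a stand-in subscriber is enough. In the sketch below, `CollectingSubscriber` is a hypothetical replacement for `fis.subscriberByteArray()`; it requests one item at a time and records what arrives. Only standard `java.util.concurrent` classes are used.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class SubmissionPublisherSketch {
    /** Hypothetical stand-in for the library's subscriber: collects items as strings. */
    static final class CollectingSubscriber implements Flow.Subscriber<byte[]> {
        final List<String> received = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // ask for the first item
        }

        @Override
        public void onNext(byte[] item) {
            received.add(new String(item, StandardCharsets.US_ASCII));
            subscription.request(1); // ask for the next item only after handling this one
        }

        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    /** Submits all records, closes the publisher, and waits for the subscriber to finish. */
    static List<String> publishAll(List<byte[]> records) {
        CollectingSubscriber subscriber = new CollectingSubscriber();
        // try-with-resources calls close(), which issues onComplete after buffered items.
        try (SubmissionPublisher<byte[]> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (byte[] record : records) {
                publisher.submit(record); // blocks only if the subscriber's buffer is full
            }
        }
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("subscriber did not complete in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        List<String> received = publishAll(List.of(
            new byte[] { 'a', 'b' }, new byte[] { 'a', 'c' }, new byte[] { 'b', 'c' }));
        System.out.println(received); // [ab, ac, bc]
    }
}
```

Delivery happens asynchronously on the publisher's executor, so the sketch waits on a latch before reading the results instead of assuming the items have arrived when `submit` returns.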
  • 18. Program Agenda: 1. Challenges when Streaming Data into the Database; 2. Introducing the High Speed Streaming Library; 3. API and Code Samples; 4. Demo (POC)
  • 19. Demo & Outcome
      Oracle Database 19c running on 1 Exalogic X4-2 compute node*. Table columns: Id, Name, Region
      Collector running on 1 Exalogic X4-2 compute node*:
        • 4 threads for service I/O
        • 8 threads for library
        • 20s buffer limit
        • 4 GB buffer cap
      JMeter (load simulator) running on 1 Exalogic X4-2 compute node*:
        • 10,000 concurrent threads
        • 2-minute start-up time
        • 0.25s between posts
        • 120b payload
      Outcome: Rate: 40,000 rpc/second; Avg. response time: ~4-5ms
      *Exalogic X4-2 compute node: Memory 94.6GB; CPU 12 cores, 24 threads, 2 sockets, Intel Xeon Processor X5675 3.07GHz
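As a sanity check, the reported rate is consistent with the JMeter settings on the demo slide. The estimate below assumes each simulated thread posts once, waits for the response, pauses 0.25 s, and repeats; that closed-loop behavior is an assumption, since the slide does not describe the test plan in detail.

```latex
\text{rate} \approx \frac{N_{\text{threads}}}{t_{\text{pause}} + t_{\text{response}}}
            = \frac{10{,}000}{0.25\,\text{s} + 0.005\,\text{s}}
            \approx 39{,}200\ \text{posts/s},
\qquad
\frac{10{,}000}{0.25\,\text{s}} = 40{,}000\ \text{posts/s}\ \text{(upper bound, ignoring response time)}
```

Both figures sit close to the 40,000 per second shown on the slide, which suggests the load generator, not the collector, was the limiting factor at this load.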