Webinar Sneak Peek: How Sayari Built & Ran Live Analytics on a Billion-Node Graph
Memgraph Webinar Sneak Peek - Memgraph At Scale: Analyzing Company Ownership & Supply Networks with a 2-Billion-Node Graph

By: Sabika Tasneem

Did you miss our live webinar featuring James Conkling, SVP of Product Engineering at Sayari, where he shared how Sayari uses Memgraph to build a massive knowledge graph of global corporate ownership and trade networks? Worry not! We've got you covered.

The full webinar recording is available for on-demand replay, so you can dive into the insights shared at your own pace 👉 Watch the full webinar recording here!

This blog offers a sneak peek into the webinar led by Maya Natarajan and James Conkling. But before we dive in, let's set the stage.

Memgraph: Powering Your Graph Exploration

Memgraph is a property graph database designed for lightning-fast in-memory data storage and analysis. It boasts three operational modes, each tailored to specific needs:

  • In-Memory Transactional Mode: Ideal for applications demanding immediate response times and ACID guarantees.
  • In-Memory Analytical Mode (the focus of our webinar!): Perfect for executing complex analytical queries and parallel data imports with exceptional speed.
  • On-Disk Transactional Mode: Suitable for datasets exceeding RAM capacity, ensuring data integrity with ACID guarantees.

Sayari's Massive Knowledge Graph

James Conkling presented how Sayari leverages Memgraph to build a comprehensive knowledge graph of global corporate ownership and trade. With data from 568 sources across 200 countries, Sayari's graph helps detect money laundering, financial fraud, and much more.

Sayari’s Billion-Node Global Knowledge Graph

This massive graph comprises nearly 2 billion entities (companies, individuals, etc.) and over 7.5 billion relationships (representing ownership structures, trade partnerships, and more).

The Challenge: Managing a Billion-Node Graph

Building and maintaining this massive knowledge graph is no easy feat, especially while running live analytics on it. Sayari needed a graph database that could handle this colossal scale while offering the power and flexibility to support their specific use case.

Here's why other graph databases like Neo4j & TigerGraph weren't the right fit for Sayari:

  • Limited Scalability: Traditional graph databases struggled with the sheer size of Sayari's graph, impacting performance and query execution times.
  • ACID Trade-offs: Some graph databases prioritize ACID guarantees, which can introduce latency in writes and updates. Sayari's read-only workload prioritizes fast reads over strict data consistency.
  • Visualization Bottlenecks: Visualizing such a large graph can be cumbersome with certain graph database tools.

The Solution: Memgraph for Real-Time Analysis

Sayari uses Memgraph's in-memory analytical mode to manage and query a billion-node dataset. This approach allows for:

  • Efficiently handling bulk data loads and exposing data to users via an API.
  • Exposing the entire billion-node graph directly to end users for live queries and real-time data insights.
  • Combining the speed of OLAP databases with many performance guarantees of OLTP systems, excluding ACID transactions.

Performance Optimization Tips for Large-Scale Graphs

Sayari's innovative approach highlights the potential of Memgraph for large, read-only graph workloads. However, it also emphasizes the importance of strategic query design and data management. James shared some key techniques to optimize performance:

  • Managing Cardinality: It is best to restrict the number of paths explored by queries, especially in breadth-first searches (BFS), to avoid performance bottlenecks.
  • Partial Results: Design queries to return partial results when they can't fully explore the graph within a given time limit, ensuring that users still gain valuable insights.
  • Use Multi-Graphs with Caution: Understand the implications and manage the complexity of having multiple edges between nodes in your graph structure.
  • Treat Memgraph as an Index: For highly intricate queries, storing essential data in Memgraph and maintaining a master data store elsewhere can improve efficiency.
  • Decompression & Serialization: Use efficient data formats and reduce unnecessary processing to enhance query performance and overall system efficiency.
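
To make the first two tips concrete, here is a minimal Python sketch of a breadth-first search that caps how much of the graph it explores and returns partial results when it runs out of budget. It is an illustration only, not Sayari's implementation and not a Memgraph API: the function name bounded_bfs, its parameters (max_depth, max_nodes, time_budget_s) and the in-memory adjacency dictionary are all hypothetical stand-ins for whatever limits your own queries or application layer enforce.

```python
import time
from collections import deque


def bounded_bfs(adjacency, start, max_depth=3, max_nodes=1000,
                time_budget_s=0.5, clock=time.monotonic):
    """Breadth-first search with caps on depth, node count, and wall time.

    Hypothetical helper for illustration. Returns (visited, complete):
    the nodes reached in visit order, and a flag that is False when the
    node cap or the time budget stopped the search early.
    """
    deadline = clock() + time_budget_s
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    complete = True

    while queue:
        if clock() >= deadline:
            complete = False  # time budget exhausted: return what we have
            break
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth cap: do not expand this node further
        for neighbor in adjacency.get(node, ()):
            if neighbor in seen:
                continue
            if len(visited) >= max_nodes:
                complete = False  # cardinality cap reached
                queue.clear()
                break
            seen.add(neighbor)
            visited.append(neighbor)
            queue.append((neighbor, depth + 1))
    return visited, complete


# Toy ownership graph (made-up data): A owns B and C, both own D, D owns E.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
nodes, complete = bounded_bfs(graph, "A", max_depth=2)
print(nodes, complete)
```

The complete flag is the important part of the design: the caller can show the partial neighborhood right away and tell the user that the result was truncated, rather than failing or hanging on a very high-degree node.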

Join the Journey to Scale

Sayari's experience demonstrates that Memgraph's in-memory analytical mode can be a powerful tool for the interactive exploration of massive, read-only knowledge graphs. The webinar delves deeper into these strategies and offers valuable insights for organizations considering similar graph-based solutions for complex data analysis.

Ready to dive deeper? Here’s the full replay of this knowledge-packed webinar. It’s a must-watch to learn more about Sayari's journey and explore the full potential of Memgraph for your own graph-driven applications!

To explore how knowledge graphs can be implemented for your business use case, simply contact us on our Discord Server with your questions.
