Showing 274 open source projects for "data integration"

  • 1
    Spring Data MongoDB

    Provide support to increase developer productivity in Java

    ...The Spring Data MongoDB project provides integration with the MongoDB document database. Its key functional areas are a POJO-centric model for interacting with MongoDB documents and support for easily writing a repository-style data access layer. You do not need to build from source to use Spring Data. Binaries are available in repo.spring.io and accessible from Maven using the Maven configuration noted.
    Downloads: 2 This Week
  • 2
    Apache SeaTunnel

    SeaTunnel is a distributed, high-performance data integration platform

    SeaTunnel is an easy-to-use, high-performance distributed data integration platform that supports real-time synchronization of massive data volumes. It can synchronize tens of billions of records stably and efficiently every day, and has been used in production at nearly 100 companies. There are hundreds of commonly used data sources, many with mutually incompatible versions. With the emergence of new technologies, more data sources are appearing. ...
    Downloads: 0 This Week
  • 3
    Flink CDC

    Flink CDC is a streaming data integration tool

    Apache Flink CDC is a distributed data integration tool that captures data changes in real-time from various databases. It leverages Change Data Capture (CDC) technology to stream data changes into Apache Flink, enabling real-time analytics and data processing. Flink CDC simplifies data pipeline development with its declarative YAML configurations.
    Downloads: 0 This Week
  • 4
    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    Apache InLong is a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities. InLong supports both batch and stream data processing, which makes it well suited to building data analysis, modeling and other real-time applications based on streaming data. InLong (应龙) is a divine beast in Chinese mythology that guides rivers into the sea, and it is used as a metaphor for the way the InLong system channels reported data streams. ...
    Downloads: 0 This Week
  • 5
    LakeSoul

    An end-to-end, realtime and cloud native Lakehouse framework

    LakeSoul is a high-performance, unified table storage framework for big data lakes, supporting both streaming and batch data in a single format. Built on top of Apache Spark and leveraging Apache Arrow and Parquet, LakeSoul provides ACID transactions, schema evolution, and time travel. It is designed for large-scale data lake architectures that require consistency, efficiency, and easy integration with modern data stacks.
    Downloads: 1 This Week
  • 6
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides...
    Downloads: 3 This Week
  • 7
    RStudio

    RStudio is an integrated development environment (IDE) for R

    RStudio is a powerful, full-featured integrated development environment (IDE) tailored primarily for the R programming language but increasingly supportive of other languages like Python and Julia. It brings together console, editor, plotting, workspace, history, and file-management panes into a unified interface, helping data scientists, statisticians, and analysts to work more productively. The IDE is cross-platform: there are desktop versions for Windows, macOS and Linux, as well as a server version for remote or multi-user deployment via a web browser. In addition to code editing and execution, RStudio offers extensive support for reproducible research via R Markdown, notebooks, and integration with version control systems like Git and SVN. ...
    Downloads: 27 This Week
  • 8
    Canal

    MySQL binlog

    Canal is an open-source project developed by Alibaba that simulates MySQL slave functionality to parse MySQL binlog files. It enables real-time data synchronization and change data capture (CDC) between MySQL and other systems such as Elasticsearch, Kafka, or HBase. Canal is widely used for data integration, replication, and monitoring across distributed systems, offering high performance and low-latency log parsing.
    Downloads: 0 This Week
  • 9
    Apache Avro

    Apache Avro is a data serialization system

    Apache Avro™ is a data serialization system that integrates simply with dynamic languages. Code generation is not required to read or write data files, nor to use or implement RPC protocols. Code generation is an optional optimization, only worth implementing for statically typed languages. Avro relies on schemas. When Avro data is read, the schema used when writing it is always present.
    Downloads: 0 This Week
  • 10
    Addax

    Addax is a versatile open-source ETL tool

    Addax is a data integration and ETL (Extract, Transform, Load) tool designed for high-performance data migration tasks. It simplifies the process of moving data between different systems and formats.
    Downloads: 0 This Week
  • 11
    Datacap

    DataCap is integrated software for data transformation

    Datacap is an open-source data catalog and governance tool that helps organizations manage and document their data assets. It provides metadata management, lineage tracking, and collaboration features to ensure data transparency and quality. Datacap is designed for teams that need a lightweight, self-hosted solution to organize and govern their data ecosystems.
    Downloads: 0 This Week
  • 12
    Spring Data R2DBC

    Provide support to increase developer productivity in Java

    Provides support to increase developer productivity in Java when using Reactive Relational Database Connectivity (R2DBC). Uses familiar Spring concepts such as a DatabaseClient for core API usage and lightweight repository-style data access. The primary goal of the Spring Data project is to make it easier to build Spring-powered applications that use new data access technologies such as non-relational databases, map-reduce frameworks, and cloud-based data services. Spring Data JDBC, part of the...
    Downloads: 0 This Week
  • 13
    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    A fully open source, cloud-native, scalable micro-streaming and complex event processing system capable of building event-driven applications for use cases such as real-time analytics, data integration, notification management, and adaptive decision-making. Event processing logic can be written as Streaming SQL queries in graphical and source editors to capture events from diverse data sources, process and analyze them, integrate with multiple services and data stores, and publish output to various endpoints in real time. ...
    Downloads: 0 This Week
  • 14
    MyBatis Mapper4

    MyBatis common mapper, easy to use

    This book starts with a simple MyBatis query to build a basic development environment for learning MyBatis. Using comprehensive sample code and tests, it explains the basic usage of create, read, update, and delete operations in both the MyBatis XML mode and annotation mode, and introduces applications of dynamic SQL along with best practices for using it. Provides a wealth of examples for MyBatis advanced mapping, stored procedures, and type...
    Downloads: 2 This Week
  • 15
    AKHQ

    Kafka GUI for Apache Kafka to manage topics, topics data, etc.

    Kafka GUI for Apache Kafka to manage topics, topic data, consumer groups, schema registry, Connect and more. It lets your teams search and explore data in a unified console, while supporting its administration and integration within your ecosystem. It provides a multi-cluster view in a central console, available in multi-cloud environments, and lets users access, search and get insights from your topics, including Live Tail.
    Downloads: 8 This Week
  • 16
    Stirling-PDF

    Web application that allows you to perform operations on PDF files

    Stirling PDF is a powerful, locally hosted web-based PDF manipulation tool offering a wide range of editing, conversion, and utility features. It allows users to merge, split, compress, convert, OCR, and perform other operations on PDF files directly from a browser without uploading data to third-party servers. The tool is privacy-conscious, self-hostable via Docker, and built with modularity in mind to allow future expansion and integration.
    Downloads: 15 This Week
  • 17
    Wren Engine

    The Semantic Engine for Model Context Protocol (MCP)

    Wren Engine is a semantic engine designed to empower Model Context Protocol (MCP) clients and AI agents by providing accurate, contextual, and governed access to business data. It serves as a bridge between large language models (LLMs) and enterprise systems, facilitating seamless integration and interaction.
    Downloads: 1 This Week
  • 18
    APIJSON

    Real-Time coding-free, powerful and secure ORM

    APIJSON is an open-source framework developed by Tencent that enables zero-code, real-time, and secure API development. It allows developers to perform CRUD operations through JSON-based requests without writing backend code, significantly accelerating development and reducing errors. APIJSON supports fine-grained access control, parameter validation, and seamless integration with various databases, making it a powerful tool for building scalable APIs.
    Downloads: 3 This Week
  • 19
    Testcontainers Java

    Testcontainers is a Java library that supports JUnit tests

    Testcontainers for Java is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container. Use a containerized instance of a MySQL, PostgreSQL or Oracle database to test your data access layer code for complete compatibility, but without requiring complex setup on developers' machines and safe in the knowledge that your tests will always start with a known DB state. Any other...
    Downloads: 4 This Week
  • 20
    IoTDB

    Apache IoTDB

    Apache IoTDB (Database for Internet of Things) is an IoT-native database with high performance for data management and analysis, deployable on the edge and in the cloud. Thanks to its lightweight architecture, high performance and rich feature set, together with its deep integration with Apache Hadoop, Spark and Flink, Apache IoTDB can meet the requirements of massive data storage, high-speed data ingestion and complex data analysis in industrial IoT fields. ...
    Downloads: 3 This Week
  • 21
    Reactor Core

    Non-Blocking Reactive Foundation for the JVM

    Reactor Core is a foundational library for building reactive applications in Java, providing a powerful API for asynchronous, non-blocking programming.
    Downloads: 0 This Week
  • 22
    RESTHeart

    Rapid API Development with MongoDB

    RESTHeart is an open-source middleware that exposes MongoDB databases as a RESTful API, allowing developers to interact with MongoDB using HTTP-based queries instead of traditional drivers.
    Downloads: 3 This Week
  • 23
    Apache RocketMQ

    Distributed messaging and streaming platform with low latency

    ...A variety of cross-language clients, such as Java, C/C++, Python, Go. Pluggable transport protocols, such as TCP, SSL, AIO. Built-in message tracing capability, with support for OpenTracing. Versatile big-data and streaming ecosystem integration. Message retroactivity by time or offset. Reliable FIFO and strictly ordered messaging in the same queue. Efficient pull and push consumption model. Million-level message accumulation capacity in a single queue. Multiple messaging protocols like JMS and OpenMessaging. Flexible distributed scale-out deployment architecture. ...
    Downloads: 1 This Week
  • 24
    Apache Iceberg

    Apache Iceberg

    Iceberg is a high-performance format for huge analytic tables. Iceberg brings the reliability and simplicity of SQL tables to big data while making it possible for engines like Spark, Trino, Flink, Presto, Hive, and Impala to safely work with the same tables, at the same time. The core Java library that tracks table snapshots and metadata is complete, but still evolving. Current work is focused on adding row-level deletes and upserts, and integration work with new engines like Flink and Hive. ...
    Downloads: 1 This Week
  • 25
    Stanford Data Miner

    Tools for integration and analysis of heterogeneous immunological data

    An extensive description of this system is published in the Journal of Translational Medicine (http://www.translational-medicine.com/). In brief, the system consists of two main web applications, a data integration app and a data exploration app. The data integration app is a fully custom Java "Web 2.0" product called Sherpa. Sherpa uses Seam, a platform integrating Asynchronous JavaScript and XML (AJAX), JavaServer Faces (JSF), the Java Persistence API (JPA), and Enterprise Java Beans (EJB) 3.0. The data exploration app is an open source business intelligence product called JasperServer (version 3.7), customized through supported configuration changes. ...
    Downloads: 0 This Week