Showing 104 open source projects for "pentaho data integration"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • BoldTrail Real Estate CRM Icon
    BoldTrail Real Estate CRM

    A first-of-its-kind homeownership solution that puts YOU at the center of the coveted lifetime consumer relationship.

    BoldTrail, the #1 rated real estate platform, is built to power your entire brokerage with next-generation technology your agents will use and love. Showcase your unique brand with customizable websites for your company, offices, and every agent. Maximize lead capture with a modern, portal-like consumer search experience and intelligent behavior tracking. Hyper-local area pages, home valuation pages and options for rich lifestyle data keep customers searching with your brokerage as the local experts. The most robust lead gen tools on the market help your brokerage, teams & agents effectively drive new business - no matter their budget. Empower your agents to generate free leads instantly with our simple to use landing pages & IDX squeeze pages. Drive more leads with higher quality and lower cost through in-house tools built within the platform. Diversify lead sources with our automated social media posting, integrated Google and Facebook advertising, custom text codes and more.
    Learn More
  • 1
    Pentaho

    Pentaho

    Pentaho offers comprehensive data integration and analytics platform.

    Pentaho couples data integration with business analytics in a modern platform to easily access, visualize and explore data that impacts business results. Use it as a full suite or as individual components that are accessible on-premise, in the cloud, or on-the-go (mobile). Pentaho enables IT and developers to access and integrate data from any source and deliver it to your applications all from within an intuitive and easy to use graphical tool. ...
    Leader badge
    Downloads: 1,800 This Week
    Last Update:
    See Project
  • 2
    Airbyte

    Airbyte

    Data integration platform for ELT pipelines from APIs, databases

    ...Our teams process more than 300 billion rows each month for ambitious businesses of all sizes. Enable your data engineering teams to focus on projects that are more valuable to your business. Building and maintaining custom connectors have become 5x easier with Airbyte. With an average response rate of 10 minutes or less and a Customer Satisfaction score of 96/100, our team is ready to support your data integration journey all over the world.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 3
    Gradle Docker Compose Plugin

    Gradle Docker Compose Plugin

    Simplifies usage of Docker Compose for integration testing

    The Gradle Docker Compose Plugin by Avast integrates Docker Compose lifecycle management into Gradle builds. It allows developers to define and manage Docker containers required for integration testing or local development directly from their Gradle build scripts. This plugin automates the startup and shutdown of services, supports container health checks, and enables tight integration between application code and containerized services, enhancing reproducibility and automation in...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    CellTypist

    CellTypist

    A tool for semi-automatic cell type classification, harmonization

    CellTypist is an automated tool for cell type classification, harmonization, and integration. Classification, transfer cell type labels from the reference to query dataset. Harmonization, match and harmonize cell types defined by independent datasets. integration, integrate cell and cell types with supervision from harmonization. CellTypist recapitulates cell type structure and biology of independent datasets. Regularised linear models with Stochastic Gradient Descent provide a fast and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Free and Open Source HR Software Icon
    Free and Open Source HR Software

    OrangeHRM provides a world-class HRIS experience and offers everything you and your team need to be that HR hero you know that you are.

    Give your HR team the tools they need to streamline administrative tasks, support employees, and make informed decisions with the OrangeHRM free and open source HR software.
    Learn More
  • 5
    reticulate

    reticulate

    R Interface to Python

    reticulate is an R package from Posit that creates seamless interoperability between R and Python. It lets you call Python modules, classes, and functions from within R, automatically translating between R and Python data structures. Useful for combining Python tooling with R projects, data analysis, and RMarkdown reports.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    nango

    nango

    A single API for all your integrations.

    Nango is a single API to interact with all other external APIs. It should be the only API you need to integrate to your app. Nango is an open-source solution for integrating third-party APIs with applications, simplifying API authentication, data syncing, and management.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Alova.js

    Alova.js

    Workflow-Streamlined next-generation request tools

    Extremely streamline API integration workflow. Quickly find APIs in the editor, and enjoy full type hints even in js projects with the API code automatically generated by Alova's extension. Request in various complex scenes by one line of code. Automatically manage paging data, and data preloading, reduce unnecessary data refresh, improve fluency by 300%, and reduce coding difficulty by 50%.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Cassandra Spark Connector

    Cassandra Spark Connector

    Apache Spark to Apache Cassandra connector

    The Apache Cassandra Spark Connector allows Spark jobs (RDDs or DataFrames/Datasets) to read from and write to Cassandra tables. Compatible with Apache Cassandra (v2.1+), Spark 1.0–3.5, and Scala 2.11–2.13, it supports mapping Cassandra rows to Scala case classes, saving results back to Cassandra, and executing arbitrary CQL within Spark applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Recap

    Recap

    Recap tracks and transform schemas across your whole application

    Recap is a schema language and multi-language toolkit to track and transform schemas across your whole application. Your data passes through web services, databases, message brokers, and object stores. Recap describes these schemas in a single language, regardless of which system your data passes through. Recap schemas can be defined in YAML, TOML, JSON, XML, or any other compatible language.
    Downloads: 0 This Week
    Last Update:
    See Project
  • FusionAuth: Authentication and User Management Software Icon
    FusionAuth: Authentication and User Management Software

    Offer your users flexible authentication options, including passwords, passwordless, single sign-on (SSO), and multi-factor authentication (MFA).

    FusionAuth adds login, registration, SSO, MFA, and a bazillion other features to your app in days - not months.
    Learn More
  • 10
    harmonypy

    harmonypy

    Integrate multiple high-dimensional datasets with fuzzy k-means

    Harmony is an algorithm for integrating multiple high-dimensional datasets. harmonypy is a port of the harmony R package by Ilya Korsunsky. Harmony is a general-purpose R package with an efficient algorithm for integrating multiple data sets. It is especially useful for large single-cell datasets such as single-cell RNA-seq.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Dagster

    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Jitsu

    Jitsu

    Jitsu is an open-source Segment alternative

    Jitsu is a fully-scriptable data ingestion engine for modern data teams. Set-up a real-time data pipeline in minutes, not days. Installing Jitsu is a matter of selecting your framework and adding few lines of code to your app. Jitsu is built to be framework agnostic, so regardless of your stack, we have a solution that'll work for your team. Connect data warehouse (Snowflake, Clickhouse, BigQuery, S3, Redshift ot Postgres) and query your data instantly. Jitsu can either stream data in...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    nichenetr

    nichenetr

    NicheNet: predict active ligand-target links between interacting cells

    nichenetr: the R implementation of the NicheNet method. The goal of NicheNet is to study intercellular communication from a computational perspective. NicheNet uses human or mouse gene expression data of interacting cells as input and combines this with a prior model that integrates existing knowledge on ligand-to-target signaling paths. This allows to predict ligand-receptor interactions that might drive gene expression changes in cells of interest. This model of prior information on...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    Apache Hudi

    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Apache DevLake

    Apache DevLake

    Apache DevLake is an open-source dev data platform

    Apache DevLake is an open-source dev data platform that ingests, analyzes, and visualizes the fragmented data from DevOps tools to extract insights for engineering excellence, developer experience, and community growth. Apache DevLake is designed for developer teams looking to make better sense of their development process and to bring a more data-driven approach to their own practices. You can ask Apache DevLake many questions regarding your development process. Just connect and query. Your...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    KubeRay

    KubeRay

    A toolkit to run Ray applications on Kubernetes

    KubeRay is a powerful, open-source Kubernetes operator that simplifies the deployment and management of Ray applications on Kubernetes. It offers several key components. KubeRay core: This is the official, fully-maintained component of KubeRay that provides three custom resource definitions, RayCluster, RayJob, and RayService. These resources are designed to help you run a wide range of workloads with ease.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Common Core Ontologies

    Common Core Ontologies

    The Common Core Ontology Repository

    The Common Core Ontologies (CCO) comprise twelve ontologies that are designed to represent and integrate taxonomies of generic classes and relations across all domains of interest. CCO is a mid-level extension of Basic Formal Ontology (BFO), an upper-level ontology framework widely used to structure and integrate ontologies in the biomedical domain (Arp, et al., 2015). BFO aims to represent the most generic categories of entity and the most generic types of relations that hold between them,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Searchkick

    Searchkick

    Intelligent search made easy

    Searchkick brings powerful, production-ready search to Rails by mapping Active Record models into Elasticsearch with sensible defaults and easy customization. It supports language analyzers, stemming, synonyms, misspelling tolerance, and highlighting so search results feel natural to end users. Indexing is model-centric: you declare what fields to index, add computed fields, and trigger reindexing via callbacks or background jobs, with options for zero-downtime rolling reindexes. On the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19

    Stanford Data Miner

    Tools for integration and analysis of heterogeneous immunological data

    An extensive description of this system is published in the Journal of Translational Medicine (http://www.translational-medicine.com/). In brief, the system consists of two main web applications, a data integration app and a data exploration app. The data integration app is a fully custom Java "Web 2.0" product called Sherpa. Sherpa uses Seam, a platform integrating Asynchronous JavaScript and XML (AJAX), JavaServer Faces (JSF), the Java Persistence API (JPA), and Enterprise Java Beans (EJB) 3.0. The data exploration app is an open source business intelligence product called JasperServer (version 3.7), customized through supported configuration changes. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Mara Pipelines

    Mara Pipelines

    A lightweight opinionated ETL framework, halfway between plain scripts

    This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. Data integration pipelines as code: pipelines, tasks and commands are created using declarative Python code. PostgreSQL as a data processing engine. Extensive web ui. The web browser as the main tool for inspecting, running and debugging pipelines. GNU make semantics.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21

    Pytente

    Uma Ferramenta Computacional para Análise e Recuperação de Patentes

    O Pytente é uma solução avançada para automatizar o processo de coleta, armazenamento e tratamento de dados bibliográficos de patentes. A ferramenta foi projetada para simplificar a coleta de grandes volumes de dados em repositórios de acesso aberto. O Pytente garante o armazenamento estruturado das informações, além da validação e eliminação de registros duplicados. Dentre as diversas funcionalidades disponibilizadas pela ferramenta, destacam-se a extração personalizada de subconjuntos de...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    PANDORA

    PANDORA

    Revolutionizing Biomedical Research with Advanced Machine Learning

    PANDORA is a machine learning (ML) tool that can be used to integrate various data types, including clinical, transcriptome and microbiome data and find connections in large datasets. PANDORA can be easily installed using Docker, a pre-built version of the software can be pulled from DockerHub. In order to run a test instance of PANDORA, users will first need to prepare their local environment by downloading, installing, and configuring Docker. genular is a community behind SIMON an...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    scArches

    scArches

    Reference mapping for single-cell genomics

    Single-cell architecture surgery (scArches) is a package for reference-based analysis of single-cell data. scArches allows your single-cell query data to be analyzed by integrating it into a reference atlas. By mapping your data into an integrated reference you can transfer cell-type annotation from reference to query, identify disease states by mapping to healthy atlas, and advanced applications such as imputing missing data modalities or spatial locations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24

    incubator-seatunnel

    SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).

    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Metl ETL Data Integration

    Metl ETL Data Integration

    Simple message-based, web-based ETL integration

    Metl is a simple, web-based ETL tool that allows for data integrations including database, files, messaging, and web services. Supports RDBMS, SOAP, HTTP, FTP, SFTP, XML, FIXLEN, CSV, JSON, ZIP, and more. Metl implements scheduled integration tasks without the need for custom coding or heavy infrastructure. It can be deployed in the cloud or in an internal data center, and it was built to allow developers to extend it with custom components.
    Downloads: 4 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next