Simplismart

Software Development

Fastest inference with Terraform-like orchestration

About us

Fastest inference for generative AI workloads. Simplismart simplifies orchestration through a declarative language similar to Terraform. Deploy any open-source model and take advantage of Simplismart’s optimised serving. As workloads grow and diversify, one size does not fit all; use our building blocks to personalise an inference engine for your needs.

**API vs in-house**

Renting AI via third-party APIs has apparent downsides: data security risks, rate limits, unreliable performance, and inflated cost. Every company has different inferencing needs: *one size does not fit all.* Businesses need control to manage their cost/performance trade-offs. Hence the movement towards open source: businesses prefer small, niche models trained on relevant datasets over large generalist models that do not justify the ROI.

**Need for an MLOps platform**

Deploying large models comes with its own hurdles: access to compute, model optimisation, scaling infrastructure, CI/CD pipelines, and cost efficiency, all requiring highly skilled machine learning engineers. Just as tooling eased the transitions to cloud and mobile, we need tooling to support this shift towards generative AI. MLOps platforms simplify orchestration workflows for in-house deployment cycles. Two off-the-shelf options are readily available:

1. Orchestration platforms with a model-serving layer: *these do not offer optimised performance for all models, limiting the user’s ability to squeeze out performance.*
2. GenAI cloud platforms: *GPU brokers that offer no control over cost.*

Enterprises need control. Simplismart’s MLOps platform gives them the building blocks to assemble the inference stack they need, and its inference engine lets businesses run each model at performant speed. The engine is optimised at three levels: the model-serving layer, the infrastructure layer, and the model-GPU-chip interaction layer, and is further enhanced with a known model compilation technique.
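As a rough illustration of what a Terraform-like declarative deployment could look like, here is a hypothetical spec. The resource names, fields, and values below are assumptions for the sake of the sketch, not Simplismart’s actual configuration schema:

```yaml
# Hypothetical declarative deployment spec (illustrative only; all field
# names and values are assumptions, not Simplismart's real schema).
deployment:
  name: whisper-transcription
  model:
    source: openai/whisper-large-v3    # any open-source model
    quantization: int8                 # optimisation applied at deploy time
  infrastructure:
    gpu: nvidia-a100
    autoscaling:
      min_replicas: 1
      max_replicas: 8
      target_latency_ms: 200           # scale to hold an SLA, not just utilisation
```

The appeal of this style is the same as Terraform’s: you declare the desired end state (model, optimisations, scaling policy) and the platform reconciles infrastructure to match it.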

Industry
Software Development
Company size
11-50 employees
Headquarters
San Francisco
Type
Privately Held
Founded
2022
Specialties
Machine learning, Artificial Intelligence, Deep Learning, Predictive analysis, Supervised Learning, Distributed learning, Workflow management, Auto-scale deployments, Audio Transcription, Summarization, Entity Extraction, Entity Classification, LLM, RAG, Diffusion Pipelines, and Voice

Products

Locations

Employees at Simplismart

Updates

  • We’re excited to share that our CTO, Devansh Ghatak, will be speaking at the Ray Summit on “Orchestrating the GenAI Lifecycle with KubeRay: Training, Inference and Benchmarking.”

    In his talk, Devansh will walk through how KubeRay simplifies distributed model workflows by providing a unified, scalable layer for managing them on Kubernetes. We’ll explore real-world examples, including:
    - Distributed fine-tuning: cut training time from 600 days to 60 hours using Ray DDP + Simplismart’s optimization layer.
    - Batch inference at scale: transcribe a month of video for just $1 with Ray-orchestrated Whisper jobs.
    - Automated benchmarking: discover 2× cost-efficient configs through intelligent SLA-based tuning powered by Ray.

    📍 Ray Summit 2025, San Francisco
    🗓️ 12:30 PM PT | November 4, 2025

  • 🚀 Simplismart is a Gold Sponsor at Ray Summit 2025!

    We’re turning our Booth G2 into a mini-lab: demos, puzzles, problem-solving, and a few surprises. Stop by to:
    • See Simplismart Copilot find the best inference setups in seconds.
    • Compete in the GenAI Performance Prediction Challenge, flex your config instincts, and win cool prizes.
    • Drop your toughest inference issue at our Inference Clinic; we’ll brainstorm practical fixes.
    • Score exclusive swag (while supplies last).

    If you’re shipping GenAI to production (or trying to), come talk trade-offs, real metrics, and practical fixes.

    📍 Marriott Marquis, San Francisco
    🗓️ November 4-5, 2025

    #Simplismart #RaySummit2025 #GenAI #GenAIOptimization

  • We’re excited to share that our CTO, Devansh Ghatak, will be speaking at ODSC AI West 2025!

    🎤 Talk: Tailor-made Inference: Managing Trade-offs Across Performance & Cost

    Optimizing inference goes far beyond choosing the right engine. It’s about navigating real-world challenges like:
    • Latency spikes at the wrong time
    • Throughput issues that limit scaling
    • Cloud costs that keep growing

    Devansh will break down what it takes to design an inference system built for your specific workload, balancing latency, throughput, and cost without compromise. He’ll share how modular architectures can help teams stay within SLAs while scaling efficiently across diverse production environments. See you there!

  • 🚀 Simplismart is heading to ODSC AI West 2025! We’ll be at Booth #23; come say hi to the team and dive into how we make production AI actually scale.

    Here’s what you’ll find at our booth:
    • Inference Clinic: Bring your toughest deployment or inference challenges; our engineers will help you brainstorm ways to optimize and scale your workloads.
    • Simplismart Copilot: See how our Copilot helps you optimize your inference stack and boost model performance.
    • GenAI Performance Prediction Challenge: Put your intuition to the test and win some exciting prizes!

    📍 Hyatt Regency, Burlingame, CA
    🗓️ October 28-30, 2025

    #Simplismart #ODSC2025 #ODSCWest #GenAI #DataScience

  • We’ve introduced the Simplismart Benchmarking Suite, a practical way to evaluate GenAI models in real-world conditions. It helps teams test GenAI models the way they actually run in the real world, not just for accuracy, but for speed, reliability, and consistency under load. If you’ve ever wondered which model really performs better in production, this is for you. Read the full post here 👇

  • What are the usual roadblocks you face when you’re deploying an LLM in production? Our team at Simplismart wrote about how you’d approach an LLM deployment and how to overcome the challenges.

    In this complete guide on deploying GPT-OSS 120B on NVIDIA H100 GPUs using vLLM, we’ve covered:
    - Single-GPU deployment with memory optimization
    - Multi-GPU setup using tensor parallelism for production workloads
    - Configuration strategies for high-throughput vs. low-latency scenarios

    The results? Multi-GPU deployment achieved ~200ms median TTFT (Time To First Token). Check out the full technical walkthrough!
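    For flavour, the multi-GPU setup described above could be launched with something like the following vLLM invocation. This is a sketch under assumptions: the flag values are illustrative defaults, not the exact configuration verified in the guide.

    ```
    # Illustrative vLLM server launch for GPT-OSS 120B across 4 H100s.
    # Flag values are assumptions, not the guide's verified configuration.
    # --tensor-parallel-size shards the model weights across GPUs;
    # --gpu-memory-utilization leaves headroom for the KV cache;
    # --max-model-len caps context length to bound memory use.
    vllm serve openai/gpt-oss-120b \
      --tensor-parallel-size 4 \
      --gpu-memory-utilization 0.90 \
      --max-model-len 8192
    ```

    Tensor parallelism is what makes a 120B-parameter model fit at all: each GPU holds a shard of every layer, so the four H100s act as one large device at the cost of inter-GPU communication per token.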

  • We’re all set up at PyTorch Conference 2025! 🚀

    Swing by Booth G9 to:
    • See Simplismart Co-pilot in action: our tool that helps you find the best inference setup in seconds.
    • Play the GenAI Performance Prediction Challenge and win some awesome gifts!
    • Drop by our Inference Clinic to brainstorm your ongoing GenAI deployment or inference challenges with our team.

    Come say hi; we’d love to chat about making inference faster, smarter, and tailor-made for your product.

    #PyTorch #Simplismart #GenAI #Inference #AIInfrastructure #PyTorchConference
  • We’re delighted to share that our CTO, Devansh Ghatak, will be speaking at the PyTorch Conference 2025 on “Optimizing Model Inference with PyTorch 2.0.”

    He’ll explore how to push PyTorch inference performance to the next level by combining dynamic compilation, CUDA graph capture, quantization, AOT compilation, and custom fused operators to achieve low-latency, production-grade deployments. If you’re working on scaling GenAI workloads or optimizing inference pipelines, this is one session you won’t want to miss.

    🗓️ October 22 | 4:05 PM PDT

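    The first of those techniques fits in a few lines; here is a minimal sketch with a toy model (not Simplismart’s code): `torch.compile` provides the dynamic compilation, and `mode="reduce-overhead"` additionally enables CUDA-graph capture when running on a GPU.

    ```python
    # Minimal PyTorch 2.x compilation sketch with a toy stand-in model.
    # torch.compile = dynamic compilation; mode="reduce-overhead" captures
    # CUDA graphs on GPU (it silently falls back on CPU).
    import torch

    model = torch.nn.Sequential(
        torch.nn.Linear(16, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, 4),
    ).eval()

    # Compile once; later calls with same-shaped inputs reuse the optimized graph.
    compiled = torch.compile(model, mode="reduce-overhead")

    with torch.inference_mode():
        out = compiled(torch.randn(8, 16))
    print(out.shape)  # torch.Size([8, 4])
    ```

    The first call pays the compilation cost; steady-state serving then runs the fused, low-overhead graph, which is where the latency wins come from.
    
    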
  • Simplismart reposted this

    We’re excited to announce that Simplismart is a Gold Sponsor at PyTorch Conference 2025!

    The PyTorch community has been at the heart of our innovation journey, powering how we build, optimize, and scale AI models in production. We’re proud to support the ecosystem that continues to shape the future of GenAI.

    👉 Drop by our booth (#G9) to:
    • Experience the Inference Clinic and see Simplismart Copilot in action
    • Chat with our engineering team about real-world AI scaling challenges
    • Grab some exclusive swag 🎁

    📍 Moscone West, San Francisco
    🗓️ October 22-23, 2025


Similar pages

Browse jobs

Funding