How KERMT Turns Chemical Fine-Tuning into a Shared, Multitask Pipeline

Fine-tuning large chemical models has become standard practice, but it's still slow and fragmented. Most property prediction pipelines train one model per endpoint: accurate, but hard to scale when you're tracking dozens of ADMET properties at once.

KERMT (Kinetic GROVER Multi-Task) changes that. It's a multitask extension of GROVER, a graph transformer pretrained on over 11 million molecules. The framework improves performance and efficiency across molecular property prediction tasks by introducing true multitask fine-tuning and new engineering optimisations for large-scale chemistry workloads.

Applications and Insights

1. Multitask fine-tuning at scale
KERMT learns shared representations across related molecular properties, improving accuracy and data efficiency compared to single-task fine-tuning. The benefits are most pronounced in medium-to-large data regimes, where related tasks reinforce each other's signal.

2. Enhanced pretraining and architecture
Built on the GROVER backbone, KERMT uses atom- and bond-level message passing with motif and k-hop subgraph prediction tasks during pretraining, capturing both local and global structural information.

3. Significant computational acceleration
The team reimplemented KERMT with distributed PyTorch (DDP) and integrated cuik-molmaker, a high-throughput molecular featurisation library. This setup achieved 2.2x faster fine-tuning and 2.9x faster inference, along with better GPU scaling efficiency and reduced CPU memory use.

4. Robust multitask performance
On large ADMET datasets, KERMT outperforms non-pretrained GNN baselines and single-task models, showing clear gains in predictive correlation and stability. On public datasets like MoleculeNet, performance varies by task, but KERMT remains competitive with other pretrained backbones.

I think this is cool because it shows how chemical foundation models are moving from individual tasks to shared infrastructure. KERMT demonstrates that multitask fine-tuning isn't just possible: it's efficient, scalable, and production-ready. With distributed training, shared graph embeddings, and open datasets, it's a practical step toward more unified, real-world molecular modelling pipelines. (A toy sketch of the shared-encoder idea follows at the end of this post.)

Not another "bigger model." Just a smarter way to use the ones we already have. Pretty nice.

Check out the pre-print: https://lnkd.in/e4dX_bcw
Try out the code: https://lnkd.in/evUhJRQd
Public multi-task split dataset: https://lnkd.in/euVaBD2F

Congrats to Alan Cheng, Matthew Adrian, Yunsie Chung, Kevin Boyd, Saee Paliwal and the rest of the amazing team! 🤙🏻

Follow for more updates on AI x Life Science, and subscribe to our weekly newsletter for more: https://lnkd.in/exvkedBd
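To make the shared-representation idea concrete, here's a toy PyTorch sketch: a single shared encoder (a dummy standing in for the pretrained GROVER backbone) feeding one linear head per endpoint, with a masked loss so molecules measured on only some endpoints still contribute. This illustrates the general multitask fine-tuning setup, not KERMT's actual code.

```python
import torch
import torch.nn as nn

class MultitaskModel(nn.Module):
    """Toy multitask setup: one shared encoder, one output per property."""
    def __init__(self, encoder: nn.Module, hidden_dim: int, n_tasks: int):
        super().__init__()
        self.encoder = encoder                        # shared representation
        self.heads = nn.Linear(hidden_dim, n_tasks)   # one scalar per endpoint

    def forward(self, x):
        z = self.encoder(x)          # shared molecular embedding
        return self.heads(z)         # (batch, n_tasks) predictions

def masked_mse(pred, target, mask):
    """Only endpoints actually measured for a molecule contribute to the loss."""
    err = (pred - target) ** 2 * mask
    return err.sum() / mask.sum().clamp(min=1)

# Usage with a dummy encoder; a real pipeline would plug in the GROVER backbone.
enc = nn.Sequential(nn.Linear(128, 256), nn.ReLU())
model = MultitaskModel(enc, hidden_dim=256, n_tasks=12)   # e.g. 12 ADMET endpoints
x = torch.randn(32, 128)                                   # fake featurised batch
y = torch.randn(32, 12)                                    # labels (partly missing)
m = (torch.rand(32, 12) > 0.5).float()                     # 1 = label observed
loss = masked_mse(model(x), y, m)
loss.backward()
```

The masking is what makes multitask practical on real ADMET panels, where few molecules have every endpoint measured.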
About us
Our mission is to bridge AI and Life Sciences. The next breakthrough in drug discovery is a team of Virtual Scientists that designs, runs and interprets complex experiments by intelligently using the right software and data. Kiin Bio is bringing this vision to life with KiinOS. Follow us and subscribe to our newsletter (https://newsletter.kiin.bio/) for actionable intelligence, strategic insights, and unparalleled expertise in AI-driven solutions for the life science sector!
- Website: https://kiin.bio/
- Industry: Biotechnology Research
- Company size: 2-10 employees
- Headquarters: London
- Type: Privately Held
- Founded: 2024
- Specialties: Biotech, Pharma, and AI
Locations
- London, GB (Primary)
Employees at Kiin Bio
- Andreas Goeldi: Partner at b2venture | Ex founder and CTO
- Mark Davies: Chief Data & Scientific Officer at Kiin Bio | Founder | BenevolentAI | EMBL-EBI
- Chiara Bacchelli: Head of Science at Kiin Bio | Associate Professor in Personalised Medicine & Genomics
- Filippo Abbondanza: Founder @Kiin Bio | ex Product Director @LifeBit | PhD in Bioinformatics @University of St Andrews | Winner best therapeutics @iGEM 2017
Updates
-
🧬 Learnings from BioTechX Europe

The breadth and depth of AI adoption across the life sciences sector, from discovery to clinical translation, was incredible to see. It was inspiring to see so many innovative TechBio startups emerging, eager to collaborate and accelerate progress together. This reinforces our belief that scalable, AI-powered solutions can truly transform how scientists explore disease biology and therapeutic potential. The future of biotech is collaborative, and it's exciting to be part of that momentum.

A standout moment was hearing from Isomorphic Labs, which kicked off the event, sharing how structure-based AI approaches are reshaping chemistry design and helping advance internal programs. We also saw commercial data providers starting to rethink how they engage with AI solution developers, a shift that opens up exciting partnership opportunities ahead.

That said, not every session went deep enough for us. There's still a gap in understanding what AI can really do for science versus what's still aspirational, but that only makes the conversations more valuable: it's through these open discussions and experimentation that progress happens.

#BioTechX #LifeSciences #DrugDiscovery #TechBio
-
ChemRefine takes the pain out of running ML and quantum chemistry side by side.

Combining quantum chemistry and machine learning has always been powerful, but historically messy. Most workflows involve patching together tools, juggling formats, and tuning everything by hand. ChemRefine, developed at the University of Texas at Dallas, claims to fix that. It's an open-source platform that combines ML potentials and quantum chemistry into a unified, automated workflow.

ChemRefine links ORCA's quantum chemistry engine with machine-learned interatomic potentials (MLIPs) through ASE. So, you can run conformer searches, transition states, redox or excited-state workflows, and switch seamlessly between DFT and ML methods. It also includes ChemRefineGPT, a custom LLM that writes YAML configs and job scripts from plain-text prompts. You describe the workflow, and it builds it. Pretty cool.

Applications and Insights

1. MLIP training and fine-tuning
Train models like MACE, UMA, or SevenNet using DFT data or normal-mode sampling. Get MD speeds 100x faster than DFT, with similar accuracy. (A minimal sketch of the kind of ASE + MLIP step this automates follows at the end of this post.)

2. Conformer refinement across theory levels
Automatically re-rank conformers using semi-empirical, hybrid, and double-hybrid methods to catch energy shifts missed at lower levels.

3. Spin, redox, and excited-state analysis
Run TDDFT and spin-state workflows with seamless integration between MLIPs and electronic structure methods.

4. TS and host-guest workflows
Find transition states, clean up vibrational modes, dock guests, and model microsolvation with ORCA's built-in tools.

ChemRefine turns what used to be a mess of scripts into a single, intelligent platform for AI-assisted computational chemistry. It brings the reasoning power of large language models together with the precision of quantum mechanics, making it easier to train, refine, and apply ML potentials across a wide range of chemical tasks. It's a look at what chemistry automation could become: not code-first, but idea-first.

Check out the paper: https://lnkd.in/eXSb3gjr
Try it out: https://lnkd.in/eRxu2THQ

Congrats to Ignacio Migliaro, PhD, Alistair J Sterling and Markus Weiss!

Follow for more updates on how the AI for life sciences field is changing, and subscribe to our weekly newsletter for more: https://lnkd.in/exvkedBd
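As promised above, here's roughly what the MLIP half of that handoff looks like when done by hand in plain ASE: a minimal sketch assuming the mace-torch package is installed. It illustrates the kind of step ChemRefine automates; it is not ChemRefine's own API.

```python
# Minimal sketch of the ASE + MLIP step ChemRefine automates (not its API).
# Assumes the `mace-torch` package is installed alongside ASE.
from ase.build import molecule
from ase.optimize import BFGS
from mace.calculators import mace_mp  # pretrained foundation MLIP

atoms = molecule("CH3CH2OH")          # ethanol from ASE's built-in g2 set
atoms.calc = mace_mp(model="small")   # MLIP stands in for a DFT calculator
BFGS(atoms).run(fmax=0.03)            # relax geometry at MD-like cost
print(atoms.get_potential_energy())   # energy in eV from the MLIP
```

Swapping the calculator line is all it takes to move between an MLIP and a DFT engine in ASE, which is exactly the kind of switching ChemRefine scripts from a single YAML config.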
-
Protein Hunter Turns Diffusion Hallucination Into a Protein Design Strategy

Most protein design tools either optimise endlessly or rely on heavy fine-tuning. They can generate new folds, but usually only after a lot of compute, filtering, and (really) luck.

Protein Hunter, a new framework from MIT and the University of Washington, takes a different approach. It harnesses the natural hallucination ability of diffusion-based structure predictors like AlphaFold3 to create new proteins from scratch. So, no retraining, no gradients, no prior structure. Sounds pretty good; let's break it down.

Starting from an entirely blank sequence, just a string of "X" tokens, the model hallucinates a folded backbone, which is then refined through iterative cycles of ProteinMPNN sequence design and structure prediction. Each cycle strengthens foldability, stability, and secondary structure until the protein converges to a realistic design. Essentially, it's structure and sequence learning to fit each other in real time. (An illustrative sketch of this cycle follows at the end of this post.)

Applications and Insights

1. Zero-shot protein generation
From a blank sequence, Protein Hunter produces well-folded proteins of 100-900 residues in under two minutes, fine-tuning-free, yet reaching AF3-style confidence levels.

2. Generalised binder design
Designs binders for proteins, peptides, small molecules, DNA, and RNA, achieving higher in silico success rates across most targets than RFdiffusion or BoltzDesign.

3. Multi-motif scaffolding and partial redesign
Can fix parts of existing structures (motifs, frameworks, or binding pockets) while regenerating the rest, enabling both de novo design and engineering of known proteins.

4. Controlled fold diversity
By adjusting diffusion biases, researchers can steer designs toward β-sheet or α-helical topologies, breaking the helical bias seen in most generative models.

Protein Hunter shows that structure prediction models really can do more than predict; they can actually create. I think this is cool because it flips the narrative on diffusion "hallucination": once seen as noise, it becomes a design feature. Instead of optimising endlessly, the model imagines structure and sequence together, refining through simple, fast cycles until they converge on something stable. That's a big step for generative biology, turning what was once a computational artefact into a creative tool for building real, functional proteins.

Check out the pre-print: https://lnkd.in/emGkf3Yt
The code is due for release later this week!

Huge congrats to Yehlin Cho, Sergey Ovchinnikov, Gaurav Bhardwaj and Griffin Rangel!

Follow for more updates on how the AI for life sciences field is changing, and subscribe to our weekly newsletter for more: https://lnkd.in/exvkedBd
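The cycle described above is easy to sketch. Below is an illustrative toy version: predict_structure and design_sequence are hypothetical stand-ins for an AF3-style predictor and ProteinMPNN, given dummy bodies only so the sketch runs. It is not the authors' code; the real pipeline calls the actual models.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def predict_structure(seq: str) -> str:
    # Hypothetical stand-in for an AF3-style diffusion predictor.
    return f"backbone[{abs(hash(seq)) % 10000}]"

def design_sequence(backbone: str, length: int) -> str:
    # Hypothetical stand-in for ProteinMPNN inverse folding.
    random.seed(backbone)  # deterministic toy output per backbone
    return "".join(random.choice(AMINO_ACIDS) for _ in range(length))

def protein_hunter_loop(length: int = 120, n_cycles: int = 5):
    seq = "X" * length                           # fully masked starting sequence
    for _ in range(n_cycles):
        backbone = predict_structure(seq)        # model hallucinates a fold
        seq = design_sequence(backbone, length)  # inverse folding refits the sequence
    return seq, predict_structure(seq)           # converged design + final structure

print(protein_hunter_loop()[0][:20])
```

The point of the structure is that neither model is retrained: the loop alternates two frozen models until sequence and structure agree.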
-
What's new at the frontier of AI x Life Science?

In last week's newsletter, we wrote about 3 new tools that are expanding the fields of molecular design and biological intelligence:

1. mBER: Controllable de novo antibody design platform with million-scale experimental screening
2. GatorAffinity: Large-scale synthetic structural pretraining for accurate protein-ligand affinity prediction
3. PHA: Multi-agent health reasoning framework integrating data science, physiology, and behavioural coaching

Read the full newsletter here: https://lnkd.in/ex4E_6je

🧠 Follow Kiin Bio for weekly updates on AI x Life Science.
-
What If You Had a Health Agent That Actually Understood You?

Most health apps collect data. Some even tell you things about it. Very few actually understand you. That is what Google's research prototype, the Personal Health Agent (PHA), is attempting to change.

Instead of one model doing everything, PHA uses a multi-agent system of three AI specialists working like a care team:
- A Data Science Agent that reads your numbers and finds patterns.
- A Domain Expert Agent that explains why they matter.
- A Health Coach Agent that turns these insights into action.

If you ask, "How did my workouts affect my sleep last month?", PHA does not just summarise your Fitbit graph. It finds correlations, explains the physiology, and gives you a plan that makes sense. Not a chatbot. Not a wellness app. A digital scientist that reasons. It will definitely be interesting to see how this multi-agent team actually operates. (An illustrative sketch of the three-agent hand-off follows at the end of this post.)

Applications and Insights

1. Built on real questions
The team analysed over 1,300 real user queries from Fitbit and Google Search to ground the system in what people actually ask about their health.

2. Collaborative reasoning
The agents share data and context dynamically, almost like a small research team combining analytics, biology, and behavioural science.

3. Extensive evaluation
Over 7,000 human annotations and 1,100 hours of expert review were used to assess PHA's reasoning, interpretation, and coaching quality.

4. Grounded in real-world data
Tested on the WEAR-ME dataset, combining Fitbit metrics, lab results, and surveys from more than 1,100 participants.

So PHA is not trying to be your doctor or your therapist. It is aiming at something in between: a system that sees your body as both data and biology.

I think this is cool because it reframes what AI in health could be. Instead of telling you what to do, it helps you understand why your body behaves the way it does. It reasons, it explains, and it connects the dots. To me, that feels closer to how scientists think: health as a dynamic system that needs interpretation, not just fragmented tracking. If projects like this keep evolving responsibly, AI might soon help people understand their own biology the way language models help us understand information: through reasoning, conversation, and context.

Check the comments for info on the training data!
Read the paper: https://lnkd.in/eHyYAE9a

Shoutout to Ali Heydari, PhD, Ken Gu, Hong Y., Zhihan Zhang, Yuwei Zhang and the rest of the amazing team!

Follow for more updates on how the AI for life sciences field is changing, and subscribe to our weekly newsletter for more: https://lnkd.in/exvkedBd
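Here is the three-agent hand-off sketched as a toy pipeline. The class names and routing are my own illustration of the roles described in the paper, not Google's implementation.

```python
from dataclasses import dataclass

# Illustrative sketch of PHA's three-agent decomposition; all names and
# hand-offs are assumptions based on the described roles, not Google's code.

@dataclass
class Insight:
    finding: str      # statistical pattern from wearable/lab data
    physiology: str   # domain explanation of why it matters
    plan: str         # behavioural recommendation

class DataScienceAgent:
    def analyse(self, query: str, data: dict) -> str:
        # e.g. correlate workout days against sleep-stage durations
        return f"pattern found across {sorted(data)} for: {query}"

class DomainExpertAgent:
    def explain(self, finding: str) -> str:
        return f"physiological context for '{finding}'"

class HealthCoachAgent:
    def advise(self, finding: str, context: str) -> str:
        return f"action plan given '{context}'"

def answer(query: str, data: dict) -> Insight:
    finding = DataScienceAgent().analyse(query, data)    # numbers -> pattern
    context = DomainExpertAgent().explain(finding)       # pattern -> physiology
    plan = HealthCoachAgent().advise(finding, context)   # physiology -> action
    return Insight(finding, context, plan)

print(answer("How did my workouts affect my sleep last month?",
             {"steps": [8000, 12000], "sleep_minutes": [410, 385]}))
```

The design choice worth noting is the separation of concerns: each agent can be evaluated and improved on its own axis (analytics, domain accuracy, coaching quality), which matches how the paper reports its evaluations.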
-
GatorAffinity uses synthetic structures to beat the best in binding prediction

Most structure-based affinity prediction tools hit a wall: the PDBbind dataset, with fewer than 20,000 high-quality experimental complexes. That's not much, especially when machine learning thrives on scale.

GatorAffinity breaks through that wall. Developed at the University of Florida, this new model is trained on more than 1.5 million synthetic protein-ligand complexes with annotated binding affinities, created using Boltz-1 and SAIR. With that kind of pretraining, and careful fine-tuning on experimental data, it outperforms every other method on the field's toughest benchmarks.

It also introduces a key idea: synthetic structure-affinity data can unlock scaling laws in this domain, just like we've seen in language and vision. The more data you give it, the better it gets.

Applications and Insights

1. Breaks records on affinity benchmarks
On the filtered LP-PDBbind dataset, GatorAffinity achieves an RMSE of 1.29, a Pearson r of 0.67, and a Spearman ρ of 0.65. That's a significant improvement over the previous best, GIGN, across all three metrics.

2. Scales with data, not just parameters
Performance kept improving as the dataset grew. The best results came from training on a large mix of Kd-, Ki-, and IC50-labelled entries, showing the value of both scale and biochemical diversity.

3. Robust to noisy structures
Even when low-confidence structures were included without ipTM filtering, the model performed well. That makes it easier to build large datasets without aggressive pruning.

4. Pretraining and fine-tuning are both essential
Joint training on synthetic and experimental data didn't work as well. The best approach was to pretrain on synthetic structures, then fine-tune on high-quality experimental data. (A minimal sketch of this two-stage recipe follows at the end of this post.)

I thought this was cool because it proves you can get state-of-the-art results without a bigger model. You just need better data. GatorAffinity turns synthetic structure-affinity pairs into a real asset, enabling large-scale learning in a field where data has always been the bottleneck. To me, this feels like a real pivot towards something new. If we can trust and scale synthetic structural data like this, the path is open for faster screening, more accurate scoring, and smarter generative workflows across the board.

Read about the training data in the comments!
Check the paper: https://lnkd.in/egn7A3fJ
Try the code: https://lnkd.in/eFj4xFm5

Congrats to Yanjun Li, Jinhang Wei, 🖥🧪💊💡 Gustavo Seabra and the rest of the amazing team!

Follow for more updates on how the AI for life sciences field is changing, and subscribe to our weekly newsletter for more: https://lnkd.in/exvkedBd
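Here is the two-stage recipe from point 4 as a minimal PyTorch sketch. The model and data loaders are placeholders, not GatorAffinity's architecture; the point is the ordering: large synthetic pretraining first, then a gentler fine-tune on experimental labels.

```python
import torch

# Minimal sketch of the pretrain-then-finetune recipe. `model` and the
# loaders are placeholders, not GatorAffinity's actual pipeline.

def run_stage(model, loader, lr, epochs):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()             # regression on binding affinity
    for _ in range(epochs):
        for complex_feats, affinity in loader:
            opt.zero_grad()
            loss = loss_fn(model(complex_feats), affinity)
            loss.backward()
            opt.step()

model = torch.nn.Sequential(torch.nn.Linear(64, 128), torch.nn.ReLU(),
                            torch.nn.Linear(128, 1))

# Stage 1: pretrain on large synthetic data (stand-in for ~1.5M Boltz-1/SAIR
# complexes; random tensors here just to make the sketch runnable).
synthetic_loader = [(torch.randn(32, 64), torch.randn(32, 1)) for _ in range(10)]
run_stage(model, synthetic_loader, lr=1e-3, epochs=3)

# Stage 2: fine-tune on scarce, high-quality experimental data, lower LR.
experimental_loader = [(torch.randn(32, 64), torch.randn(32, 1)) for _ in range(2)]
run_stage(model, experimental_loader, lr=1e-4, epochs=5)
```

Keeping the stages sequential, rather than mixing the two data sources in one run, mirrors the paper's finding that joint training underperformed.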
-
mBER designs nanobody binders at massive scale without retraining a single model

Antibody design usually means compromise. You can generate unconstrained minibinders by the thousands, but if you want something usable, like a nanobody with therapeutic potential, your design space shrinks fast.

That's what makes mBER so interesting. Built by Manifold Bio, mBER is a smart, format-aware wrapper around existing models like AlphaFold-Multimer and ESM2. It generates VHH binders that respect the structure and sequence constraints of real nanobodies, without any model fine-tuning or custom pretraining. Just smart templating, efficient search, and a lot of screening.

The proof is in the experimental pudding. The team designed over one million VHH sequences across 436 targets and screened them against 145 antigens, generating a dataset of more than 100 million protein-protein interactions. Sixty-five targets showed significant hits, and some epitopes reached hit rates of 38%.

Applications and Insights

1. Format-aware binder generation
mBER guides AlphaFold-Multimer to design within a realistic nanobody scaffold. It uses fixed framework residues, masked CDRs, and ESM2 priors to generate structured, sequence-valid outputs that are usable in downstream pipelines.

2. Big-screen validation
More than one million designs were screened across 145 targets using phage display. Forty-five percent of targets yielded statistically significant hits, with several epitopes hitting success rates above 30%.

3. Filtering that works
AlphaFold's ipTM score was a strong predictor of success. Filtering for ipTM > 0.8 improved hit rates up to 10 times, making it easier to prioritise sequences for synthesis. (A sketch of this triage step follows at the end of this post.)

4. Multi-epitope targeting
Several antigens yielded binders at multiple distinct epitopes, showing that mBER can explore different binding modes and generate functionally diverse hits for the same target.

I thought this was cool because it's a great example of what smart engineering can do with the tools we already have. mBER doesn't introduce a new model. It builds the scaffolding that makes foundation models like AlphaFold and ESM2 useful in real therapeutic design contexts. That might sound simple, but it's not. Designing antibody-like binders that actually fold, express, and hit diverse epitopes usually takes months. mBER shows that you can do it in days, using open models and accessible infrastructure. Now, with more than 100 million datapoints already generated, it sets the stage for a new kind of binder benchmarking, one where format, epitope, and affinity can all be explored systematically.

Check the comments for more info on the training data!
Read the pre-print: https://lnkd.in/gDxYpzTY
Try the code: https://lnkd.in/gu93UsjN

Congrats to Erik Swanson, Mike Nichols and Pierce Ogden!

Subscribe to our weekly newsletter for updates on the AI x Life Sciences space: https://lnkd.in/exvkedBd
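And the triage step from point 3, as a few lines of illustrative Python. The Design fields and sequences are my own stand-ins; only the ipTM > 0.8 cutoff comes from the work described above.

```python
from dataclasses import dataclass

# Sketch of the ipTM triage step; field names and data are illustrative.
# The 0.8 cutoff is the threshold reported for the mBER screen.

@dataclass
class Design:
    sequence: str
    iptm: float      # AlphaFold-Multimer interface confidence, 0..1

def prioritise(designs: list[Design], cutoff: float = 0.8) -> list[Design]:
    """Keep high-confidence binders and rank them for synthesis."""
    kept = [d for d in designs if d.iptm > cutoff]
    return sorted(kept, key=lambda d: d.iptm, reverse=True)

pool = [Design("QVQLVESGGG", 0.86), Design("EVQLVESGGG", 0.42)]
print([d.iptm for d in prioritise(pool)])   # -> [0.86]
```

Cheap in silico filters like this are what make a million-design screen tractable: synthesis and phage display effort goes only to the top-ranked fraction.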
-
Last week's newsletter highlighted three new tools that caught our attention:

1. CELLFIE: High-content CRISPR screening platform to optimise CAR T cell therapies
2. scPortrait: Standardised single-cell imaging framework for multimodal integration
3. ProteinDJ: Modular HPC pipeline for scalable protein binder design

Read the full newsletter here: https://lnkd.in/ev5K7E2Z

🧠 Follow Kiin Bio for weekly updates on AI x Life Science.
🏃♂️ No time to read? Listen to the voiceover when you're on the move instead.
-
AI isn't just predicting drug candidates anymore; it's learning how to reason like scientists.

At Kiin Bio, we're building virtual scientists that go beyond prediction to plan, design, and iterate across the full R&D lifecycle. That's why we're excited for BioTechX Europe this week, an opportunity to demonstrate how we're combining vertical solutions into a horizontally integrated platform that actually helps researchers do their work faster, smarter, and more collaboratively.

📍 If you're in Basel, come chat with us at Booth S31. Let's talk about the future of drug discovery.

#BioTechX #KiinBio #DrugDiscovery #LifeSciences