Big Ideas in Publishing: Tim Vines on AI Subscriptions
How creating AI-ready research content could unlock entirely new revenue streams
This is the first post in Big Ideas in Publishing, an occasional series exploring concepts that could reshape research communication.
Tim Vines has been thinking about the future of research publishing for some time. As founder of DataSeer and former Managing Editor of Molecular Ecology, he's seen how technology can transform scholarly communication. His latest idea: what if publishers created AI-optimized versions of research articles and sold them as premium subscriptions?
I caught up with Tim to discuss this concept, which emerged from his thinking about what happens when artificial intelligence becomes a primary consumer of research literature.
You've been talking about AI subscriptions for research content. Can you walk me through the basic idea?
The arrival of generative AI has prompted widespread speculation about the future of scholarly communication. One scenario has received little attention but could have profound implications for publishers: what if artificial intelligence becomes the primary consumer of research literature?
Large language models (LLMs) have already been trained on huge volumes of the written word – to a first approximation, everything humans have ever published online. As these systems mature, they will need new forms of input. If we want these tools to contribute meaningfully to scientific discovery, we need to rethink how research content is structured for their consumption and – necessarily – how it's paid for.
The core limitation of today’s LLMs is that they operate by averaging. Their predictions are generated by assimilating vast numbers of tokens from existing texts and inferring what likely comes next. This approach may be useful in many domains, but it is ill-suited to scientific inquiry, where progress often stems from outliers – individual studies that take a new approach and reshape the consensus.
Scientific literature, moreover, is not of uniform quality. Some articles are foundational; many are derivative, flawed, or outright misleading. Yet because high-impact research is typically paywalled, and less rigorous work is often openly accessible, LLMs are disproportionately trained on lower-quality material. We’re asking AI systems to model the state of scientific knowledge by digesting the most available – not the most accurate – content.
So how do we fix this problem?
We need to provide AI with the underlying raw materials of science. What are those raw materials? The basic unit is an individual claim or conjecture: a set of conditions, a hypothesis that's tested, and then a result, in the form of both a dataset and a conclusion drawn from that dataset.
If you just read scientific papers, you only get the words. You only hear about what the authors think about their results, and authors can overstate their significance. Unless you're very skilled and able to critically pick these apart, it’s hard to determine the value and validity of the work from the words alone.
To enable AI systems to assist in research, we need to provide them with structured, reliable input — not just the prose of published articles, but the underlying claims, data, and reasoning. A truly AI-ready version of a research paper might look very different from the human-readable PDF. It could be decomposed into modular units: individual hypotheses, linked datasets, methodological summaries, and direct mappings between evidence and conclusions. This kind of formatting would allow AI to assess the strength of specific claims and build a more nuanced understanding of the field.
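To make that idea concrete, here is a minimal sketch of what one such modular unit might look like as structured data. The schema and field names are hypothetical, invented for illustration; they are one way the claim–evidence–conclusion linkage Tim describes could be encoded, not an existing standard.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    """One machine-readable unit of a research article (hypothetical schema)."""
    claim_id: str              # stable identifier for this specific claim
    conditions: list[str]      # experimental conditions and assumptions
    hypothesis: str            # the conjecture being tested
    dataset_uri: str           # link to the underlying data
    method_summary: str        # brief, structured description of the methods
    conclusion: str            # what the authors infer from the data
    supports_hypothesis: bool  # whether the evidence supports the claim

# A toy example: one claim extracted from a fictional ecology paper
example = Claim(
    claim_id="10.9999/example.2024.001#claim-3",
    conditions=["wild populations", "42 sampling sites", "2019-2023 field seasons"],
    hypothesis="Warmer streams reduce genetic diversity in trout populations",
    dataset_uri="https://doi.org/10.9999/example-dataset",
    method_summary="Microsatellite genotyping; mixed model of diversity vs. temperature",
    conclusion="Allelic richness declined with mean summer stream temperature",
    supports_hypothesis=True,
)
```

An AI system working from records like this can weigh each conclusion against its linked dataset and methods, rather than relying on the authors' prose alone.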
That sounds like a lot of extra work. How would this actually function as a business model?
The current publishing infrastructure is optimized for human readers. Research articles are delivered in a format that presumes a shared understanding (between humans!) of how to communicate findings; most commonly, via a PDF. This static medium persists because it enables a predictable exchange between sender and receiver. But AI readers operate under entirely different assumptions. They aren’t concerned with narrative flow or rhetorical nuance; they want structured input that enables reasoning.
This divergence opens the door to a new business model: offering AI-ready versions of research articles via a subscription. Human-readable content could remain open access, while publishers monetize machine-optimized content streams for institutional clients, particularly those running proprietary, in-house AI systems.
Can you give me a concrete example of how this would work?
Imagine a pharmaceutical company subscribing to an AI feed that continuously ingests structured research about Parkinson’s disease, helping it identify promising leads in real time. In this scenario, access to high-quality, up-to-date content becomes essential, not optional.
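Purely as an illustration of that scenario, a subscriber's in-house pipeline might poll such a feed and filter the structured claims before handing them to its own models. The endpoint, parameters, and fields below are invented for the example; no such API currently exists.

```python
import json
import urllib.request

# Hypothetical endpoint for a publisher's machine-readable claims feed.
FEED_URL = "https://api.example-publisher.org/v1/claims?topic=parkinsons&since=2025-01-01"

def fetch_new_claims(url: str) -> list[dict]:
    """Pull the latest structured claims from the (fictional) subscription feed."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)["claims"]

# Keep only claims whose evidence links to a resolvable dataset,
# ready to pass to an in-house model for further screening.
claims = fetch_new_claims(FEED_URL)
usable = [c for c in claims if c.get("dataset_uri")]
```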
This puts a premium on quality control, doesn't it?
Absolutely. When you're selling these subscriptions, your value as a journal depends on your ability to keep the fraudulent work out – nobody wants to train their AI on results that were fabricated.
What are the biggest obstacles to making this happen?
The key is to get the data from the authors. That's the missing piece. It comes back to the business model behind open data: researchers like to have access to data, but they're not willing to pay for it, and they would much rather just read the PDF.
If publishers are going to earn subscription revenue from AI-ready versions of their content, and they need the authors to go the extra mile to get the article into this format, then maybe the authors should see some part of that. Maybe they get a reduced APC for the Open Access (= human-readable) version of the article, or there's some revenue sharing.
Do you think this is where we're headed?
These AI models are only going to get better and better. The new generation of reasoning models is already impressive, but what they need to keep improving is knowledge about how the world works – not more information about the most likely next word in a sentence. The AIs are not going to get that from the 'words only' version of human-readable articles that academic publishing currently produces, and that means there's a huge opportunity for a new way to disseminate research.
Tim Vines is the Founder and CEO at DataSeer. Prior to that he founded Axios Review, an independent peer review company that helped authors find journals that wanted their paper. He was the Managing Editor for the journal Molecular Ecology for eight years, where he led their adoption of data sharing and numerous other initiatives. He writes for the industry-leading Scholarly Kitchen blog, and has published research papers on peer review, data sharing, and reproducibility (including one that was covered by Vanity Fair). He has a PhD in evolutionary ecology from the University of Edinburgh and now lives in Vancouver, Canada.
Have a big idea that could reshape research communication? I'd love to hear from you.
