Esther Gillespie, Co-Founder & CEO of Jumping Rivers, shares her insights on what it really takes to run AI systems reliably in production…
-
What gap in the industry made you feel AI in Production needed to happen now?
AI and LLMs have made prototyping faster than ever. AI-based no-code app builders let you spin up ideas quickly. Developers use AI assistants for their coding work. That speed is valuable, but it creates a new problem: the gap between prototype and production has never been wider.
When AI helps you build something in hours instead of weeks, expectations shift: stakeholders start to assume AI improves every workflow, but some things still can't be automated. Scaling a system for real users introduces bottlenecks that need engineering decisions, not better prompts.
We see organisations build prototypes that work beautifully on test data, then struggle when those same systems need to run reliably at scale, day after day. AI teams have access to good tools and models. What they lack are shared practices for deployment, monitoring, and governance once users and large-scale data enter the picture.
That’s where projects stall. Engineers hit bottlenecks. Technical debt piles up. The work that should be about delivering value turns into firefighting.
AI use is evolving, and so are the practices around monitoring, security, and deployment. These conversations are happening, but they need to happen more deliberately and more often. We launched AI in Production because now more than ever, teams need space to discuss what actually works in production, not benchmarks or theory. The conference brings together engineers, data scientists, and leaders to share operational realities and learn from each other’s experiences.
-
From your experience at Jumping Rivers, what’s one thing about putting AI into production that most teams only learn the hard way?
Teams underestimate what happens after deployment. Getting a model working feels like a milestone, and it is. But that’s when the hard part starts.
In production, you’re managing the full lifecycle. That means prompt engineering for LLM applications, fine-tuning that changes over time, security and access controls, and monitoring that tracks both technical performance and business outcomes. For traditional ML models, it’s data pipelines, feature drift, and model retraining schedules. For RAG systems, it’s keeping vector databases up to date, monitoring retrieval quality, and managing embedding versioning. For agentic AI systems, it’s orchestration between multiple models, managing tool use, and handling failures when agents make unexpected decisions.
In all cases, you’re dealing with systems that change. User behaviour shifts, and data distributions evolve. What worked last month might produce inconsistent results today, and you need systems in place to catch that before users do.
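One common way to catch distribution shift before users do is to compare recent feature values against a training-time baseline. The Population Stability Index below is a minimal sketch of that idea (the threshold of ~0.2 is a widely used rule of thumb, not a universal standard):

```python
import numpy as np

def population_stability_index(baseline, recent, n_bins=10):
    """Compare two samples of a numeric feature; PSI above ~0.2
    is a common rule of thumb for meaningful drift."""
    # Bin edges come from the baseline, so both samples share buckets
    edges = np.quantile(baseline, np.linspace(0, 1, n_bins + 1))
    # Clip recent values into the baseline's range so nothing falls outside
    recent = np.clip(recent, edges[0], edges[-1])
    base_frac = np.histogram(baseline, bins=edges)[0] / len(baseline)
    new_frac = np.histogram(recent, bins=edges)[0] / len(recent)
    # Small epsilon avoids log(0) for empty bins
    eps = 1e-6
    base_frac = np.clip(base_frac, eps, None)
    new_frac = np.clip(new_frac, eps, None)
    return float(np.sum((new_frac - base_frac) * np.log(new_frac / base_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(0, 1, 10_000)
same = rng.normal(0, 1, 10_000)       # no drift: PSI stays near zero
shifted = rng.normal(0.5, 1, 10_000)  # mean shift: PSI noticeably larger

print(population_stability_index(baseline, same))
print(population_stability_index(baseline, shifted))
```

Run on a schedule against production inputs, a check like this turns "what worked last month produces inconsistent results today" into an alert rather than a user complaint.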
Too many teams realise this too late. Maintaining AI in production is a practice that requires investment, tooling, and skills that span functions.
-
With so much AI hype, what does the industry most need to understand about building reliable real-world AI systems?
When systems fail in production, it’s rarely the model. The problems show up in data pipelines, testing coverage, monitoring strategies, or integration with existing infrastructure.
If you can’t version your data alongside your models, reproducing results becomes nearly impossible. If testing only covers code and not model outputs, you’re leaving critical failure modes undetected. If monitoring tracks latency but misses distribution drift, you’ll be diagnosing issues reactively instead of catching them early.
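Testing model outputs, as opposed to just code, usually means asserting invariants on predictions. The sketch below uses a stand-in model with a scikit-learn-style `predict_proba` interface (the model and the monotonicity rule are illustrative assumptions, not a prescribed setup):

```python
import numpy as np

class DummyModel:
    """Stand-in for a trained classifier; assumes a scikit-learn-style
    predict_proba returning one row of class probabilities per input."""
    def predict_proba(self, X):
        # Toy logistic score on the first feature
        p = 1 / (1 + np.exp(-X[:, 0]))
        return np.column_stack([1 - p, p])

def check_prediction_invariants(model, X):
    # Structural checks: valid probabilities, rows summing to one
    probs = model.predict_proba(X)
    assert np.all((probs >= 0) & (probs <= 1)), "probabilities out of range"
    assert np.allclose(probs.sum(axis=1), 1.0), "rows must sum to 1"

def check_monotonicity(model):
    # Behavioural check: increasing the scored feature should not
    # decrease the positive-class probability (a domain expectation)
    low = model.predict_proba(np.array([[-1.0]]))[0, 1]
    high = model.predict_proba(np.array([[1.0]]))[0, 1]
    assert high >= low, "monotonicity violated"

model = DummyModel()
check_prediction_invariants(model, np.random.default_rng(1).normal(size=(100, 1)))
check_monotonicity(model)
```

Checks like these run in CI against every candidate model, so a retrained version that breaks a behavioural expectation fails the pipeline instead of reaching users.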
Build rollback procedures before deployment, not after failures occur. Software engineering established patterns for version control, CI/CD, and environment management years ago. Those same principles apply to ML systems, yet teams often treat production deployment as an entirely new challenge and spend significant time recreating solutions that already exist.
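"Rollback before deployment" can be as simple as keeping every deployed version and making the live model a pointer, so reverting is a swap rather than a redeploy. A minimal sketch (class and method names are illustrative, not a specific registry product):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModelRegistry:
    """Keeps every deployed version so rollback is a pointer swap."""
    versions: dict = field(default_factory=dict)
    live: Optional[str] = None
    history: list = field(default_factory=list)

    def register(self, version: str, artifact) -> None:
        self.versions[version] = artifact

    def promote(self, version: str) -> None:
        if version not in self.versions:
            raise KeyError(f"unknown version {version!r}")
        if self.live is not None:
            self.history.append(self.live)  # remember what we replaced
        self.live = version

    def rollback(self) -> None:
        if not self.history:
            raise RuntimeError("no previous version to roll back to")
        self.live = self.history.pop()

registry = ModelRegistry()
registry.register("v1", "model-v1.bin")
registry.register("v2", "model-v2.bin")
registry.promote("v1")
registry.promote("v2")
registry.rollback()
print(registry.live)  # → v1
```

The same pattern is what managed registries and blue-green deployments implement at scale; the point is that the revert path exists and is tested before anything ships.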
The infrastructure work may not be the most interesting part of the job, but it determines whether systems operate reliably or require constant intervention.
-
What emerging trend in AI engineering or MLOps will most shape how teams run AI in the next few years?
Production-first tooling. We’re seeing frameworks that make it easier to deploy models in consistent, scalable environments. Automation for ML pipelines that actually works. Inference infrastructure with observability built in from the start, not bolted on later.
There’s also a shift toward treating responsible AI as part of the operational workflow. Teams are building systems that capture data lineage, track model decisions, and generate the metadata needed for governance. Not because it’s nice to have, but because production systems have to be trustworthy.
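Capturing lineage as part of the serving path can be lightweight: record which model version ran, on what input, producing what output, and when. A minimal sketch (field names are illustrative, not a standard schema):

```python
import hashlib
import json
import time
import uuid

def log_prediction(model_version, features, prediction, sink):
    """Append one governance record per prediction. Hashing the input
    gives traceability without storing possibly sensitive raw data."""
    record = {
        "request_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "model_version": model_version,
        "input_hash": hashlib.sha256(
            json.dumps(features, sort_keys=True).encode()
        ).hexdigest(),
        "prediction": prediction,
    }
    sink.append(record)
    return record

audit_log = []
rec = log_prediction("v2", {"income": 42_000}, "approved", audit_log)
print(rec["model_version"], len(audit_log))  # → v2 1
```

Because the input hash is deterministic, an auditor can later confirm that a stored request matches the record without the raw data ever leaving the serving environment.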
At AI in Production, the workshops cover these topics through hands-on sessions. Participants work through deployment patterns for different model types, set up dashboards that track what matters, and learn how to build testing frameworks that catch issues before users do. The sessions are led by engineers who’ve built and maintained these systems.
-
What would success for AI in Production look like to you personally?
Success for AI in Production means attendees leave with knowledge they can actually use, not just theory they’ll forget by Monday.
We want engineers who walk away confident about deploying systems in their own environments. Data scientists who can better diagnose what’s happening when models degrade in production.
But it’s also about having a good time. The conversations between sessions over coffee. Meeting people who are dealing with the same production headaches you are. Good food, new connections, and the kind of shop talk that only happens when practitioners get together in person.
If people leave with practical approaches they can implement, a few new contacts they’ll actually stay in touch with, and maybe some laughs along the way, that’s what we’re aiming for.
AI becomes dependable infrastructure when people share what works and what doesn’t, and conferences work best when that happens in an environment where people actually want to be.
Continue the conversation at AI in Production, taking place on 4–5 June 2026 in Newcastle. The conference brings together engineers, data scientists, and leaders to share practical lessons from running AI systems in the real world. For more details, visit: https://ai-in-production.jumpingrivers.com