In this edition of Shift’s “Four Questions with” series we asked Chief Scientist and Chief Product Officer Eric Sibony to reflect on the company’s evolution to fully embrace agentic AI.
We had a similar conversation about agentic AI when the company introduced Shift Claims. How has Shift's thinking about agents, and their role in insurance AI, evolved between then and now?
This is an interesting question in the sense that our thinking really hasn’t changed. We still believe that agents can deliver incredible value to insurers as they seek to automate more and more critical processes. What has evolved is the underlying technology that makes agents possible.
We’ve seen huge leaps in the performance of the LLMs that underpin agentic AI. The technical and product realities have matured rapidly in six months. Key model releases, most notably February’s big jumps from the major labs, have pushed agent capability from micromanaged, short tasks toward sustained, multi‑day reasoning. That means agents can increasingly act like subject‑matter experts rather than just automating repetitive steps. Practically, that shifts how we design products: from adding agentic features to discrete components, to making agentic reasoning the core of products across claims, fraud, subrogation, and beyond.
Operationally, Shift has moved from “AI as core” to “agents as product.” We’ve injected agentic reasoning into the engine and introduced an integration and packaging layer so agents behave as callable services and can orchestrate tools such as entity reconstruction, APIs, or external browsing as needed. We still reuse and improve the proven components (entity resolution, data pipelines), but we now treat them as tools for agents rather than stand‑alone outputs.
So fundamentally, the direction is the same, but the pace, the architectural thinking, and the emphasis on orchestration and continuous R&D have intensified.
When you say "Shift is now selling agents" what exactly do we mean by that, and why is the distinction important?
Saying “we’re selling agents” means two concrete things: first, agentic AI is embedded at the product core rather than being an ancillary capability; second, we provide the packaging and integration layer so our products can be used as autonomous, callable agents.
In this sense, a claims fraud capability becomes a “fraud agent” that assesses a claim, runs investigations with multiple tools, escalates to humans when needed, and can be invoked by other agents in a broader claims orchestration flow. The distinction matters because it changes buyer expectations: you’re purchasing an autonomous, composable decision‑maker, not just a predictive model or analytics service.
That framing also clarifies architecture and ecosystem thinking. Agents need reliable access to curated data, tool integrations (APIs, entity reconstruction, browsing), and orchestration protocols so they can chain work, persist state over multi‑day tasks, and defer to humans as necessary. Additionally, packaging agents reduces the integration burden for insurers: they get interoperable building blocks that play in an agent ecosystem rather than individual ML or GenAI models focused on a single task.
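To make the “autonomous, callable agent” idea concrete, here is a minimal sketch of what an agent-as-service interface could look like. All names (`FraudAgent`, `AgentResult`, the tool registry) are illustrative assumptions for this example, not Shift’s actual API; the point is only that the agent is invocable like a function, orchestrates whatever tools it is given, and surfaces human escalation as an explicit outcome.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentResult:
    decision: str              # e.g. "approve" or "refer"
    rationale: str
    escalate_to_human: bool = False

@dataclass
class FraudAgent:
    """Illustrative callable agent: runs its registered tools over a claim,
    aggregates their evidence, and escalates when anything looks suspicious."""
    tools: dict[str, Callable[[dict], dict]] = field(default_factory=dict)

    def __call__(self, claim: dict) -> AgentResult:
        # Run whichever tools are registered (entity checks, API lookups, ...)
        evidence = {name: tool(claim) for name, tool in self.tools.items()}
        if any(e.get("flag") for e in evidence.values()):
            return AgentResult("refer", "tool evidence flagged the claim",
                               escalate_to_human=True)
        return AgentResult("approve", "no tool raised a flag")

# Usage: another agent (or an orchestrator) simply calls the agent like a service.
agent = FraudAgent(tools={
    "entity_check": lambda c: {"flag": c.get("claimant") == "unknown"},
})
result = agent({"claimant": "unknown", "amount": 12_000})
```

Because the agent exposes a single callable surface, a broader claims orchestration flow can compose it with other agents without knowing which tools it uses internally.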
There are a lot of industry voices telling insurers they can just build agents for themselves. What challenges do insurers who choose this path face?
The opportunity to build is real, but the gap between a “good enough” agent and an industry‑leading one is wide. The immediate challenges span technology, organisation, and domain knowledge. The technical challenges include building a data architecture that feeds agents correctly, tool integrations, and state and long‑horizon task management. Organisational considerations centre on establishing dedicated R&D and development teams that can continuously adapt to rapidly changing models. Domain knowledge fuels the ability to develop deep insurance business modelling that defines when an agent should act, escalate, or say “I don’t know.” Prompt engineering alone is simply not enough; the best agents rely on structured access to context and curated decision logic, not ad‑hoc prompts.
There’s also a time‑to‑value and maintenance problem. Model and tooling advances change the optimal architecture frequently; migrating and re‑optimising systems is now part of ongoing product work, not a one‑time project. Finally, quality and risk management are hard: to get to the high accuracy tolerances insurance requires — think >99% in some flows — you need multi‑agent review, monitoring for drift, and business‑aware guardrails. Insurers can build these capabilities, but it’s expensive and calls for specialisation that many won’t want to replicate when proven vertical solutions exist.
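As one example of the monitoring work this implies, a common, lightweight drift signal is the Population Stability Index (PSI) computed over model score distributions. The sketch below is a generic illustration, not Shift’s actual monitoring stack; the binning and the usual 0.2 alert threshold are conventional rules of thumb.

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a reference window and a recent
    window of model scores in [0, 1) -- a simple statistical drift signal."""
    def hist(xs):
        counts = [0] * bins
        for x in xs:
            counts[min(int(x * bins), bins - 1)] += 1
        total = max(len(xs), 1)
        # Laplace-style smoothing so empty bins don't blow up the log term
        return [(c + 0.5) / (total + 0.5 * bins) for c in counts]
    p, q = hist(expected), hist(actual)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

# Rule of thumb: PSI above roughly 0.2 is often treated as significant drift,
# a cue for the guardrail layer to tighten abstention or alert a human team.
```

In a production setting this kind of check would run continuously over recent agent decisions, which is exactly the ongoing maintenance burden described above.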
Shift has long advocated for AI that knows how to say, "I don't know." Does continued adoption of agentic AI make this stance more or less important?
The ability for AI to say “I don’t know” becomes more important, not less, in agentic architectures. As agents are tasked with higher‑stakes, more autonomous decisions, sometimes orchestrating work over days or months, the cost of an incorrect, overconfident answer rises. The only way to scale safe automation is to combine high‑precision decisions with calibrated abstention: agents that defer when confidence is low or the business risk of a mistake is material. That remains the core mechanism for achieving the reliability insurers demand.
Agentic architectures actually make this capability more achievable and more nuanced. We can decompose tasks into orchestration agents plus specialist review agents, add monitoring modules that detect statistical drifts or inconsistent business signals, and tune the orchestration layer to decide “yes / no / I don’t know” based on both probability of correctness and downstream risk. In practice that means not forcing binary outputs but making abstention a first‑class outcome—risk‑adjusted, explainable, and tied to human escalation paths—so automation scales safely.
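The “yes / no / I don’t know” rule described above can be sketched in a few lines. This is a minimal illustration under assumed inputs, a calibrated probability of correctness and a normalised downstream-risk score in [0, 1]; the threshold values are hypothetical, not Shift’s production logic.

```python
def decide(p_correct: float, downstream_risk: float,
           base_threshold: float = 0.95, risk_weight: float = 0.04) -> str:
    """Risk-adjusted ternary decision: the confidence required to act
    automatically rises with the business cost of a wrong answer.
    Thresholds here are illustrative only."""
    required = min(0.999, base_threshold + risk_weight * downstream_risk)
    if p_correct >= required:
        return "yes"
    if p_correct <= 1 - required:
        return "no"
    # Abstention as a first-class outcome: route to human escalation paths
    return "i_dont_know"
```

Note how the same 97% confidence can be enough to act on a low-risk claim but triggers abstention on a high-risk one, which is what makes the abstention risk-adjusted rather than a fixed cutoff.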