STRATIF-AI · CHU BREST · 17 JUNE 2026

Multimodal machine learning and agentic AI for health

a project with Alive Engine

Bastien Pasdeloup · IMT-Atlantique, BRAIN · Jean-Charles Vialatte · Alive Engine
Alive Engine
IMT-Atlantique · BRAIN
IMT-Atlantique incubator

BRAIN and Alive Engine

BRAIN: around twenty researchers making AI more accessible: less energy, less data, fewer priors on the data. A broad international network, and years of work on large language models and now on agentic AI.

Alive Engine: a platform to build, deploy, and supervise persistent AI teammates that learn continuously, accumulate expertise, and work alongside human teams. On-prem for sensitive data, sandboxed, and fully auditable.

Team and roles

Jean-Charles Vialatte
Founder, Alive Engine
  • PhD 2018, graph deep learning, IMT-Atlantique
  • 2018 to 2025, ML / AI engineer
  • 2025, founder and builder of Alive Engine
Bastien Pasdeloup
Associate Professor, IMT-Atlantique
  • PhD 2017, graph signal processing, IMT-Atlantique
  • Lab-STICC, BRAIN team
  • Graph ML, efficient and trustworthy AI, healthcare

Talk structure

1. Deploying agentic AI
2. Emergency stroke care
3. Supervising trust
ACT ONE

Deploying agentic AI

How an autonomous AI runs inside the hospital, on its own data, and what that solves.

On-prem model · Local gateway · Inbound only · Secure messaging · Sub-agents

What is an AI agent?

Equip a model with the tools of the service: databases, files, a shell, search. It stops chatting and starts doing the work.

OPEN MODEL: a capable chat, on its own
THE HARNESS: databases · files · shell · search
AGENT: does the work
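The harness idea can be sketched in a few lines. This is a minimal illustration, not Alive Engine's API: `Tool`, `make_harness`, `fake_model`, and `run_agent` are hypothetical names, and `fake_model` stands in for a real on-prem model by returning a fixed plan of tool calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Tool:
    """A named capability the model is allowed to call (hypothetical)."""
    name: str
    run: Callable[[str], str]


def make_harness() -> Dict[str, Tool]:
    # Stand-ins for two of the service tools named on the slide.
    return {
        "database": Tool("database", lambda q: f"rows for {q!r}"),
        "search": Tool("search", lambda q: f"guideline hits for {q!r}"),
    }


def fake_model(task: str) -> List[Tuple[str, str]]:
    # Assumption: a real model would choose these calls itself.
    return [("database", task), ("search", task)]


def run_agent(task: str, harness: Dict[str, Tool]) -> List[str]:
    """Execute the model's plan, refusing any tool outside the harness."""
    results = []
    for tool_name, argument in fake_model(task):
        if tool_name not in harness:
            raise KeyError(f"tool not in harness: {tool_name}")
        results.append(harness[tool_name].run(argument))
    return results
```

The point of the sketch is the boundary: the model only proposes calls, and the harness decides which tools exist at all.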

Hospital AI stays inside the wall

The clinician UI, gateway, model, agents, and hospital data all run on the hospital network. Cloud inference is not in the loop.

HOSPITAL PERIMETER
  • Secure messaging app: phone as terminal
  • Alive Gateway: policy · audit log
  • Private model: local hardware or private cloud
  • Sub-agents: bounded clinical tools
  • Local data: records · imaging · labs
Third-party cloud inference: blocked
no cloud inference · no patient context exported · every action logged

Patient data never leaves the perimeter

The agent may pull outside knowledge in. It cannot send patient context out to a third-party model.

Allowed: guidelines and literature come into the hospital network
Blocked: patient context cannot be sent to an external model
Recorded: gateway, model, and tool calls are audit events
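The three rules above can be read as a single policy check. The sketch below is an assumption about how such a check could look, not the Alive Gateway implementation: the `Gateway` class, its `handle` method, and the `"internal"` / `"external"` target labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Gateway:
    """Hypothetical gateway: decides on each call and records it as an audit event."""
    audit_log: List[Dict[str, object]] = field(default_factory=list)

    def handle(self, action: str, target: str, carries_patient_context: bool) -> str:
        # Blocked: patient context leaving the perimeter.
        # Allowed: everything else, including pulling outside knowledge in.
        if target == "external" and carries_patient_context:
            decision = "blocked"
        else:
            decision = "allowed"
        # Recorded: every call becomes an audit event, whatever the decision.
        self.audit_log.append({
            "action": action,
            "target": target,
            "carries_patient_context": carries_patient_context,
            "decision": decision,
        })
        return decision
```

For example, fetching a guideline from outside is allowed, while an external model call carrying patient context is blocked, and both appear in `audit_log`.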

Phone as secure terminal

Clinicians can reach the agent from a secure messaging app. The far end of the thread terminates inside the hospital network.

Sub-agents split the clinical work

A central agent orchestrates focused sub-agents, each with its own context and tools.

Central agent
  • Intake agent: patient chat · emergency medical services handover
  • Records agent: history · antecedents · meds
  • Imaging agent: computed tomography and angiography pre-read

01 Parallelism: work happens at once, not in sequence
02 Compartmentalization: bounded contexts do not bleed
03 Specialization: each sub-agent tuned to one task of the service
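As an illustration of the parallelism point, the sketch below runs three stand-in sub-agents concurrently with Python's `asyncio`. The function names and the sleep-based "work" are assumptions for the example; real sub-agents would call models and tools.

```python
import asyncio
from typing import Dict, List


async def sub_agent(name: str, seconds: float) -> Dict[str, str]:
    # The sleep stands in for model calls and tool use.
    await asyncio.sleep(seconds)
    return {"agent": name, "summary": f"{name} done"}


async def central_agent() -> List[Dict[str, str]]:
    # gather() starts all three at once and returns results in call order.
    return await asyncio.gather(
        sub_agent("intake", 0.02),
        sub_agent("records", 0.02),
        sub_agent("imaging", 0.02),
    )
```

Calling `asyncio.run(central_agent())` from a plain script returns the three summaries after roughly the duration of the slowest sub-agent rather than the sum of all three.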
ACT TWO

Emergency stroke care

Acute stroke in the emergency room, where every minute is brain tissue.

Arrival to thrombolysis · Intake · History · Imaging pre-read · Compartments

Every minute matters in acute stroke

The deck's stroke use case is about one measurable clock: time from hospital arrival to thrombolysis.

UNTREATED ISCHEMIC STROKE
1.9M neurons lost per minute
10 min faster ≈ 19M fewer neurons exposed to untreated ischemia
PATHWAY CLOCK
  • Clock: hospital arrival to thrombolysis
  • 60 min: 90th-percentile target (guideline)
  • tPA: benefit is steeply time-dependent
The architecture matters only if it moves this clock.

Figures from the stroke-pathway literature (Bulmer et al., Front. Neurol. 2021).

Where the minutes leak

Patients who arrive by ambulance get a head start. Self-presenting patients wait for triage. And a system delay sits between imaging and the neuro read.

Arrival
  • Ambulance: ~80% · head start (pre-notified)
  • Private vehicle: ~20% · presenting (wait for triage)
Pathway
  • Triage: first assessment
  • Imaging: CT · CTA
  • Neuro read: decision (imaging to read lag)
  • tPA: needle

Intake can start before triage

On a suspected stroke, the central agent spawns an intake sub-agent. By ambulance? It structures the handover. By private vehicle? It opens a secure thread with the patient or a companion, before triage.

Intake agent · collects onset, symptoms, contraindications before the patient is triaged

Records agent · pulls antecedents from database and cross-checks with intake

Pathway with agent annotations: pre-notified · intake agent works here · intake agent · records agent · imaging pre-diagnosis · early decision alert
Arrival
  • Ambulance: ~80% · head start
  • Private vehicle: ~20% · presenting
Pathway
  • Triage: first assessment
  • Imaging: CT · CTA
  • Neuro read: decision
  • tPA: needle
WHY AGENTIC: Parallel intake collapses the pre-triage and registration gap the literature flags as slow for private-vehicle arrivals.

Records retrieval runs in parallel

A records agent queries the hospital databases and returns a structured brief while the patient is still arriving.

records agent · brief
  • prior TIA (2023)
  • apixaban, anticoagulant flag
  • eGFR normal, no allergies
Contraindication: anticoagulation surfaced before the clinician sits down.
WHY AGENTIC: History retrieval runs concurrently with intake and imaging prep, not serially after the clinician arrives.
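A structured brief of this kind could be assembled roughly as follows. This is a sketch under stated assumptions: the `Brief` type, `build_brief`, and `cross_check` are hypothetical, and the `ANTICOAGULANTS` set is a short illustrative list, not a clinical reference.

```python
from dataclasses import dataclass
from typing import List

# Illustrative only: a real deployment would use a maintained drug reference.
ANTICOAGULANTS = {"apixaban", "rivaroxaban", "dabigatran", "warfarin"}


@dataclass
class Brief:
    history: List[str]
    medications: List[str]
    flags: List[str]


def build_brief(history: List[str], medications: List[str]) -> Brief:
    """Return a structured brief with anticoagulants surfaced as flags."""
    flags = [f"anticoagulant: {m}" for m in medications if m.lower() in ANTICOAGULANTS]
    return Brief(list(history), list(medications), flags)


def cross_check(brief: Brief, intake_medications: List[str]) -> List[str]:
    """Medications reported at intake that the records do not contain."""
    known = {m.lower() for m in brief.medications}
    return [m for m in intake_medications if m.lower() not in known]
```

The cross-check mirrors the slide's idea that the records agent compares its findings with what intake collected, so that discrepancies reach the clinician as explicit items.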

Pre-read starts when images land

An imaging agent drafts a preliminary read immediately, raises urgency, then the neurologist confirms or overrides.

Timeline: images arrive → needle
  • Before: ~12 min dead time, then neuro reads
  • After: agent flags, neuro reads sooner
URGENT · possible LVO · prioritise neuro read
AGENT PRE-READ · DRAFT
  • Early ischemic change, left MCA
  • CTA: proximal occlusion
  • No hemorrhage
Preliminary confidence 78.0%
to be verified by clinician
Decision support only. The agent does not diagnose. A clinician confirms.
WHY AGENTIC: It does not replace the neurologist. It removes the dead time before the neurologist looks, then a clinician confirms.
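The "draft, then confirm or override" flow can be made explicit as a small state model. The sketch below is illustrative: `PreRead`, `URGENCY_THRESHOLD`, and the keyword test for occlusion are assumptions for the example, not the imaging agent's actual logic.

```python
from dataclasses import dataclass
from typing import List, Optional

URGENCY_THRESHOLD = 0.7  # hypothetical cut-off for raising urgency


@dataclass
class PreRead:
    findings: List[str]
    confidence: float               # preliminary confidence, between 0 and 1
    status: str = "draft"           # draft -> confirmed | overridden
    reviewer: Optional[str] = None


def urgency(pre_read: PreRead) -> str:
    """Raise urgency on a suspected occlusion; this only reorders the read queue."""
    suspicious = any("occlusion" in f.lower() for f in pre_read.findings)
    if suspicious and pre_read.confidence >= URGENCY_THRESHOLD:
        return "URGENT"
    return "ROUTINE"


def review(pre_read: PreRead, reviewer: str, agree: bool) -> PreRead:
    """Only a clinician review moves a draft out of the draft state."""
    pre_read.status = "confirmed" if agree else "overridden"
    pre_read.reviewer = reviewer
    return pre_read


def is_actionable(pre_read: PreRead) -> bool:
    return pre_read.status == "confirmed" and pre_read.reviewer is not None
```

In this design the agent can change when the neurologist looks, but nothing it produces is actionable until a named reviewer has confirmed it.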

Architecture is the privacy control

Each contact point is a separate sub-agent with a bounded context. Information passes up as structured summaries, never raw context.

Central agent
  • Intake agent: patient chat · emergency medical services handover (summary only)
  • Records agent: history · antecedents · meds (summary only)
  • Imaging agent: computed tomography and angiography pre-read (summary only)
WHY AGENTIC: The patient-facing agent never sees the database. The records agent never sees the chat. Smaller contexts, smaller blast radius, full audit.
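One way to picture "structured summaries, never raw context" is a sub-agent that keeps its chat private and exposes only a typed summary. Everything here is hypothetical: the class names, the `IntakeSummary` fields, and the keyword matching that stands in for a model-based extraction.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple

# Illustrative keywords; a real intake agent would extract these with a model.
SYMPTOM_KEYWORDS = ("facial droop", "arm weakness", "speech difficulty")


@dataclass(frozen=True)
class IntakeSummary:
    symptoms: Tuple[str, ...]
    message_count: int


class IntakeAgent:
    def __init__(self) -> None:
        self._chat: List[str] = []  # raw context stays inside this sub-agent

    def receive(self, message: str) -> None:
        self._chat.append(message)

    def summarize(self) -> IntakeSummary:
        text = " ".join(self._chat).lower()
        found = tuple(k for k in SYMPTOM_KEYWORDS if k in text)
        return IntakeSummary(symptoms=found, message_count=len(self._chat))


class CentralAgent:
    def __init__(self) -> None:
        self.summaries: Dict[str, dict] = {}

    def collect(self, name: str, summary: IntakeSummary) -> None:
        # Only the structured fields cross the boundary.
        self.summaries[name] = asdict(summary)
```

The central agent ends up holding the symptom list and a message count, but none of the patient's own words.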

Agents reduce arrival-to-thrombolysis delay

Agents attack the pre-triage gap, retrieve history in parallel, and cut the imaging-to-read dead time. Minutes saved from hospital arrival to thrombolysis become neurons saved.

  • Baseline pathway: 48 min
  • Agent-assisted (projected): 35 min
  • Difference: −13 min
≈24.7M neurons preserved per case, at the projected saving (13 min × 1.9M neurons/min)

Anchor: combining process changes alone cut median arrival-to-thrombolysis time by up to 26.7% in simulation (Bulmer et al., 2021). Agentic gains shown are a projected concept, not a measured result.
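The arithmetic behind the slide's numbers is small enough to spell out. The function names below are illustrative; the inputs (48 min, 35 min, 1.9M neurons per minute) are the figures shown above, and the agent-assisted time remains a projection.

```python
NEURONS_PER_MINUTE = 1.9e6  # figure used on the slide


def neurons_preserved(baseline_min: float, assisted_min: float) -> float:
    """Minutes saved multiplied by the per-minute neuron loss."""
    return (baseline_min - assisted_min) * NEURONS_PER_MINUTE


def relative_reduction(baseline_min: float, assisted_min: float) -> float:
    """Fraction of the baseline time removed by the projected pathway."""
    return (baseline_min - assisted_min) / baseline_min


saved = neurons_preserved(48, 35)      # 13 min x 1.9M = 24.7M neurons
share = relative_reduction(48, 35)     # 13 / 48, about 27.1% of the baseline
```

The projected 13-minute saving corresponds to roughly a 27.1% reduction, which is of the same order as the 26.7% simulated reduction cited as the anchor.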

ACT THREE

Supervising trust in medical agents

A funded research project on reliability, traceability, drift monitoring, and validation for medical agents.

Problem · Alive Engine · State of the art · Specific objectives

Agentic AI is arriving before trust is solved

The research project starts from a simple risk: in medicine, autonomous systems must be supervised for what they do, not only for what they say.

  • 61% of healthcare organizations are already building or deploying agentic AI
  • 91.8% of clinicians have already been exposed to medical hallucinations from foundation models
  • 64–72% of residual hallucinations are caused by causal or temporal reasoning failures

The research question is continuous trust supervision: reliability, reasoning coherence, traceability, and controlled evolution over time.

Problem: supervise agents that keep changing

The project frames trust as a longitudinal problem: a medical agent must remain reliable, coherent, and traceable as it learns.

RESEARCH QUESTION

How do we verify that an autonomous medical agent still behaves correctly after tool use, memory updates, and self-improvement cycles?

Autonomy

agents plan task sequences, call tools, and act without direct human intervention

Medical risk

errors are not cosmetic: reasoning failures can affect patient decisions

Learning over time

persistent agents change through memory, profiling, and introspection cycles

Project structure

A six-month maturation project with IMT-Atlantique BRAIN and Alive Engine, focused on supervision modules for learning medical agents.

M1

Literature baseline

what already works, what fails, and which evaluation gaps matter for medical agents

M2

Regression detection

catch capability loss before an updated agent reaches clinical workflow

M3

Reasoning audit trail

make recommendations inspectable, replayable, and attributable to evidence

M4

Drift monitoring

track data, behavior, and confidence shifts over time

M5

Clinical validation

test with clinicians, thresholds, overrides, and sign-off workflow

Alive Engine

The platform used in the project to create, deploy, and supervise persistent agents that learn continuously.

The project uses Alive Engine as the experimental substrate: continuous tasks, persistent state, introspection cycles, sandboxed execution, and real-time supervision.

Persistent memory

agents consolidate what they learn instead of resetting every run

Local orchestration

tools, sandboxes, and model calls remain inside the deployment boundary

Audit by default

every action, evidence source, and handoff can be replayed

Human supervision

thresholds, approvals, overrides, and halt states are first-class controls

Stroke pathway workspace (supervised)
Objective: reduce dead time from arrival to thrombolysis
5 agents · 3 tool scopes · 100% actions logged
  • Clinician workspace: secure thread, console, review queue
  • Alive Gateway: local endpoint, policy, audit log
  • Agent runtime: central agent, sub-agents, memory cycles
  • Hospital tools and data: records, imaging, guidelines, lab systems
  • Supervision layer: drift, regression, trust scores, sign-off

Current tools miss the learning cycle

The state of the art does not yet supervise whether an autonomous medical agent remains stable after introspection and self-improvement.

01

Chain-of-thought is not enough

It can reduce hallucinations in aggregate, while increasing major omissions in some clinical summarisation settings.

02

Self-improvement can drift

Small updates accumulate. An agent can move away from its original objective without a regression protocol.

03

Observability misses introspection

Guardrails and traces mostly monitor outputs. They rarely verify the agent’s self-evaluation cycles.

Medical benchmarks do not test agent stability

Strong medical evaluation datasets exist, but none evaluate whether a learning agent stays reliable over time.

AVAILABLE MEDICAL BENCHMARKS
MedQA · PubMedQA · MedMCQA · MIMIC-IV · BioASQ · MedS-Bench
MISSING FOR AGENTIC AI
  • longitudinal stability after memory updates
  • traceability across reasoning steps
  • behavior drift across interactions
  • trust between collaborating agents

Objectives 1–2: prove stability and provenance

The first half of the project turns agent learning into something testable and replayable.

01

Regression detection

Run periodic regression suites after introspection cycles to verify that mastered medical tasks did not degrade.

MedQA · PubMedQA
02

Reasoning audit trail

Attach every medical conclusion to sources, logical steps, and decision points, using knowledge graphs for provenance.

sources · steps · decisions
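For the regression-detection objective, a per-item comparison before and after an introspection cycle could look like the sketch below. The function names and the pass rule are assumptions for illustration, and the question identifiers and answers are toy data, not drawn from MedQA or PubMedQA.

```python
from typing import Dict, List


def accuracy(answers: Dict[str, str], gold: Dict[str, str]) -> float:
    """Share of gold items answered correctly."""
    correct = sum(1 for qid, expected in gold.items() if answers.get(qid) == expected)
    return correct / len(gold)


def regressions(before: Dict[str, str], after: Dict[str, str], gold: Dict[str, str]) -> List[str]:
    """Items that were mastered before the cycle and are wrong after it."""
    return [
        qid for qid, expected in gold.items()
        if before.get(qid) == expected and after.get(qid) != expected
    ]


def passes_gate(before: Dict[str, str], after: Dict[str, str], gold: Dict[str, str]) -> bool:
    """Hypothetical rule: the updated agent passes only with zero regressions."""
    return not regressions(before, after, gold)


gold = {"q1": "A", "q2": "B", "q3": "C"}
before = {"q1": "A", "q2": "B", "q3": "D"}
after = {"q1": "A", "q2": "C", "q3": "C"}
```

With this toy data the aggregate accuracy is the same before and after the cycle, yet `q2` has regressed, which is why the sketch checks items individually instead of comparing a single score.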

Objectives 3–4: monitor drift and validate on benchmarks

The second half of the project checks whether supervision works on a concrete medical-agent use case.

03

Behavior drift monitoring

Detect unwanted changes in bias, tone, inter-agent trust, and decision orientation over interactions and learning cycles.

personality · bias · trust
04

Experimental validation

Deploy a medical agent on Alive Engine and evaluate differential diagnosis, drug interaction detection, and literature synthesis.

MIMIC-IV · MedQA · PubMedQA
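Behavior drift monitoring needs some numeric signal to watch. As one possible starting point, and not the project's method, the sketch below compares a recent window of any scalar behavior metric against a baseline window using a simple z-score. The metric itself, the window contents, and the threshold of 3.0 are all assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def drift_score(baseline: Sequence[float], recent: Sequence[float]) -> float:
    """Distance between the window means, in baseline standard deviations."""
    sigma = stdev(baseline)  # needs at least two baseline points
    shift = abs(mean(recent) - mean(baseline))
    if sigma == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / sigma


def has_drifted(baseline: Sequence[float], recent: Sequence[float], threshold: float = 3.0) -> bool:
    return drift_score(baseline, recent) > threshold


# Toy values for a hypothetical per-interaction metric, e.g. a tone score.
baseline_window = [0.50, 0.52, 0.48, 0.51, 0.49]
stable_window = [0.50, 0.51, 0.49]
shifted_window = [0.60, 0.62, 0.61]
```

A mean-shift test like this only catches gradual level changes in one metric; bias, tone, and inter-agent trust would each need their own measurable proxy before any such check applies.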

Project outputs

The maturation project should produce reusable supervision components, experimental evidence, and the basis for a longer collaboration and further work.

Regression protocol

periodic tests for learning agents after introspection cycles

Reasoning audit module

traceable conclusions with source and decision provenance

Drift dashboard

monitoring of personality, bias, tone, and inter-agent trust

Validated medical agent

Alive Engine deployment evaluated on open medical datasets

Longer ambition: a formal trust-supervision framework for learning multi-agent systems in critical domains.

Thank you

Questions welcome.

Jean-Charles Vialatte · jc@alive.dev
Bastien Pasdeloup · bastien.pasdeloup@imt-atlantique.fr
medtech.alive.dev/deck/