Agent Efficiency, Memory and Confidence

Presentation Link

The sixty-third MeetUp of the Machine Learning Singapore Group was titled “Agents, Experts and Extracting Structured Data”, and had talks covering a wide spectrum.

My Presentation

My talk was titled “Agent Efficiency, Memory and Confidence”, and had the following outline:

Efficiency
Memory
Confidence
Wrap-up & QR-code (the latter to reduce audience distractions)

The main driver behind the Efficiency section was the Oppo paper, which details the trade-offs in agent performance along a few important dimensions (such as how to use memory, parallel roll-outs, etc.).

Following up on the Memory angle, I talked about ‘Memento’ (formerly known as AgentFly), and then progressed to an overview of two ‘Confidence’ papers: ARPO (Agentic Reinforced Policy Optimization), and Deep Think with Confidence (a paper from Meta). Between them, these two papers finally give the key drivers of entropix a formal publication: the first does so directly, with applications to Agent RL, while the second works through a proxy confidence measure.
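To make the 'direct' versus 'proxy' distinction a little more concrete, here is a minimal sketch of two per-token signals of the kind these approaches lean on. It assumes the 'direct' driver is the entropy of the next-token distribution, and it uses the negative mean log-probability of the top-k tokens as one plausible proxy confidence measure. The function names, the value of `k`, and the toy logits are all illustrative assumptions, not code or definitions taken from either paper.

```python
import math


def softmax(logits):
    """Turn raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token distribution.

    High entropy means the model is unsure which token comes next.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def topk_confidence(probs, k=2):
    """Assumed proxy: negative mean log-probability of the k most likely tokens.

    When the distribution is peaked, the runner-up tokens have tiny
    probabilities, so this value is large (i.e. high confidence).
    """
    top = sorted(probs, reverse=True)[:k]
    return -sum(math.log(p) for p in top) / len(top)


# Toy logits (hypothetical): one flat distribution, one sharply peaked one.
uniform = softmax([0.0, 0.0, 0.0, 0.0])
peaked = softmax([5.0, 0.0, 0.0, 0.0])

for name, probs in [("uniform", uniform), ("peaked", peaked)]:
    print(name, round(token_entropy(probs), 3), round(topk_confidence(probs), 3))
```

On these toy inputs the two signals move in opposite directions: the peaked distribution has lower entropy and a higher proxy confidence than the uniform one, which is why either can be used to decide when a model (or agent) is on uncertain ground.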

The slides for my talk, which contain links to all of the reference materials and sources, are here: