Anonymized outcomes from real Maximo AI turns: tool calls, ledger events, feedback, latency, and completion indicators.
Methodology
Score weights, evidence sources, confidence, privacy, and anti-gaming controls.
Benchmark Methodology
RPB means Real Performance Benchmark.
RPB is MyTabulon's public score for real AI agent performance. It measures whether a model can complete business work with tools, reason through multi-step tasks, stay safe, move quickly, and earn good user feedback inside live MyTabulon workflows.
Controlled MyTabulon scenarios covering CRM, accounting, documents, memory, operations, inventory, and integrations.
Public evidence cards showing domain, tool plan, outcome, and broad evidence without revealing business data.
The RPB score uses a 0-100 scale, but published scores are evidence-adjusted rather than raw. RPB is not a token benchmark, a popularity vote, or a lab exam.
RPB is best read together with confidence, sample size, and domain evidence.
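To make "evidence-adjusted" concrete, here is a minimal sketch of one common way to adjust a score for sample size: shrinking it toward a neutral prior when evidence is thin. This is an illustration only. The function name, the prior of 50, and the prior weight of 30 are assumptions made for this example and are not MyTabulon's published formula.

```python
def evidence_adjusted_score(raw_score: float, sample_size: int,
                            prior_score: float = 50.0,
                            prior_weight: float = 30.0) -> float:
    """Shrink a raw 0-100 score toward a neutral prior when evidence is thin.

    Hypothetical illustration: prior_score and prior_weight are assumed
    values, not parameters taken from the RPB methodology.
    """
    if not 0.0 <= raw_score <= 100.0:
        raise ValueError("raw_score must be between 0 and 100")
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")

    # Weighted average of the observed score and the prior. The prior acts
    # like `prior_weight` imaginary observations scored at `prior_score`.
    total_weight = sample_size + prior_weight
    return (raw_score * sample_size + prior_score * prior_weight) / total_weight


if __name__ == "__main__":
    # The same raw score of 90 backed by different amounts of evidence.
    for n in (0, 30, 970):
        print(n, round(evidence_adjusted_score(90.0, n), 1))
```

Under these assumed parameters, a raw score of 90 with no samples stays at the prior, while the same raw score with many samples moves close to 90. That is why a published score is easier to interpret alongside its sample size and confidence.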
The benchmark is public. The underlying business workspace is not.
The goal is a public benchmark that AI labs cannot tune against by memorizing a fixed worksheet. MyTabulon blends live work, rotating test tasks, human feedback, and tool-ledger evidence.