Bug Category Performance

How Chronos-1 performs across different bug types including logic, concurrency, memory, and syntax issues.

Chronos-1's accuracy ranges from 58.3% to 94.2% depending on the bug category, well above what general-purpose models achieve.

Category-wise Accuracy

  • Syntax Errors: 94.2%
  • Logic Bugs: 72.8%
  • Concurrency Issues: 58.3%
  • Memory Problems: 61.7%
  • API Misuse: 79.1%
  • Performance Bugs: 65.4%

General-purpose models typically reach only 3–15% accuracy across these same categories.
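To make the comparison concrete, the sketch below ranks the categories and computes a conservative lower bound on Chronos-1's lead, assuming the baseline sits at the top of its quoted range (15%) in every category. The names `chronos_accuracy`, `BASELINE_UPPER`, `minimum_margins`, and `unweighted_mean` are illustrative only. The unweighted mean is simple arithmetic over the six listed figures, not a reported benchmark result, and it assumes every category counts equally.

```python
# Category accuracies (percent), copied from the list above.
chronos_accuracy = {
    "Syntax Errors": 94.2,
    "Logic Bugs": 72.8,
    "Concurrency Issues": 58.3,
    "Memory Problems": 61.7,
    "API Misuse": 79.1,
    "Performance Bugs": 65.4,
}

# Assumption: take the upper end of the quoted 3-15% baseline range,
# so every margin below is a lower bound on the real gap.
BASELINE_UPPER = 15.0


def minimum_margins(accuracy, baseline_upper=BASELINE_UPPER):
    """Return the smallest possible lead per category, in percentage points."""
    return {name: round(score - baseline_upper, 1) for name, score in accuracy.items()}


def unweighted_mean(accuracy):
    """Average the category scores, treating every category as equally weighted."""
    return round(sum(accuracy.values()) / len(accuracy), 1)


margins = minimum_margins(chronos_accuracy)

# Categories ordered from strongest to weakest.
ranked = sorted(chronos_accuracy, key=chronos_accuracy.get, reverse=True)

for name in ranked:
    print(f"{name:<20} {chronos_accuracy[name]:5.1f}%  (at least +{margins[name]:.1f} points)")
print(f"Unweighted mean: {unweighted_mean(chronos_accuracy)}%")
```

Even under this conservative assumption, the weakest category (Concurrency Issues) leads the baseline by at least 43.3 percentage points.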

Why Chronos-1 Excels

  • Training on 42.5M real debugging examples
  • Multi-file reasoning
  • Persistent memory of past fixes
  • Cross-module dependency tracing

Chronos-1's advantage is most pronounced on complex bug types (concurrency, memory, performance): it scores 58.3–65.4% there, while traditional models fall below 7%.