Debug-Specific Training Data

How Chronos-1 uses 42.5 million real debugging examples to learn root-cause diagnosis and multi-file repair.

Chronos-1 is trained exclusively on real debugging workflows rather than on code completion tasks.
This specialized dataset is a major reason for its debugging accuracy.

Training Dataset Composition

  • 15M GitHub issues paired with fix commits
  • 8M stack traces with successful resolutions
  • 3M CI/CD logs from failed and fixed builds
  • 2.5M production debugging sessions
  • 14M curated benchmark examples (SWE-bench, Defects4J, BugsInPy, etc.)
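
As a quick consistency check, the five counts above can be summed and converted to shares of the whole. The sketch below only restates the listed figures (in millions); the dictionary keys and the `shares` helper are illustrative names, not part of any Chronos-1 tooling.

```python
# Counts in millions, copied from the list above.
sources = {
    "GitHub issues paired with fix commits": 15.0,
    "Stack traces with successful resolutions": 8.0,
    "CI/CD logs from failed and fixed builds": 3.0,
    "Production debugging sessions": 2.5,
    "Curated benchmark examples": 14.0,
}

# Sum of all categories, in millions.
total = sum(sources.values())


def shares(counts):
    """Return each source's share of the total as a percentage."""
    whole = sum(counts.values())
    return {name: 100.0 * n / whole for name, n in counts.items()}


for name, pct in shares(sources).items():
    print(f"{name}: {sources[name]}M ({pct:.1f}%)")
print(f"Total: {total}M")
```

The categories add up to the 42.5 million examples quoted in the summary. GitHub issue and fix-commit pairs are the largest slice at roughly 35%, followed by curated benchmark examples at roughly 33%.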

Specialized Fine-Tuning Tasks

Chronos-1 is fine-tuned on four tasks:

  • Chain-of-cause reasoning
  • Multi-modal bug understanding (code + logs + traces)
  • Iterative fix refinement
  • Cross-repository pattern recognition
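
To make "code + logs + traces" concrete, here is a minimal sketch of what a single multi-modal training record might look like, and how it could be flattened into one model input. The actual Chronos-1 data format is not described in this document, so the `DebugExample` class, its field names, and the `render_input` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DebugExample:
    """Hypothetical training record; field names are illustrative only."""
    issue_text: str                   # bug report or issue description
    code_snippets: dict[str, str]     # file path -> relevant source text
    logs: list[str] = field(default_factory=list)  # CI or runtime log lines
    stack_trace: str = ""             # raw trace, if one was captured
    fix_diff: str = ""                # target output: the patch that fixed it


def render_input(ex: DebugExample) -> str:
    """Join the available modalities into one text input.

    The fix diff is deliberately left out: in this sketch it is the
    training target, not part of the input.
    """
    parts = [f"## Issue\n{ex.issue_text}"]
    for path, src in ex.code_snippets.items():
        parts.append(f"## Code: {path}\n{src}")
    if ex.logs:
        parts.append("## Logs\n" + "\n".join(ex.logs))
    if ex.stack_trace:
        parts.append(f"## Stack trace\n{ex.stack_trace}")
    return "\n\n".join(parts)


example = DebugExample(
    issue_text="Division by zero when the cart is empty",
    code_snippets={"cart.py": "def average(prices):\n    return sum(prices) / len(prices)"},
    logs=["ERROR request failed: /cart/summary"],
    stack_trace="ZeroDivisionError: division by zero",
    fix_diff="-    return sum(prices) / len(prices)\n+    return sum(prices) / len(prices) if prices else 0.0",
)
print(render_input(example))
```

Keeping each modality in its own field, instead of storing one pre-joined string, would let the same record feed different tasks: for example, iterative fix refinement could append the logs of a failed attempt before re-rendering the input.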