Debug-Specific Training Data

How Chronos-1 uses 42.5 million real debugging examples to learn root-cause diagnosis and multi-file repair.

Chronos-1 is trained exclusively on real debugging workflows, not code completion tasks.
This specialized dataset is a major reason for its superior debugging accuracy.

Training Dataset Composition

15M GitHub issues paired with fix commits
8M stack traces with successful resolutions
3M CI/CD logs from failed and fixed builds
2.5M production debugging sessions
14M curated benchmark examples (SWE-Bench, Defects4J, BugsInPy, and others)
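
As a quick check on the figures above, the five sources can be summed and turned into shares of the whole. This is only an arithmetic sketch: the counts come from the list above, while the dictionary and variable names are illustrative assumptions and not part of any Chronos-1 tooling.

```python
# Counts in millions, copied from the composition list above.
# The dictionary layout and names are illustrative assumptions.
composition = {
    "GitHub issues paired with fix commits": 15.0,
    "Stack traces with successful resolutions": 8.0,
    "CI/CD logs from failed and fixed builds": 3.0,
    "Production debugging sessions": 2.5,
    "Curated benchmark examples": 14.0,
}

# Total number of examples, in millions.
total = sum(composition.values())

# Percentage share of each source relative to the total.
shares = {name: 100 * count / total for name, count in composition.items()}

for name, count in composition.items():
    print(f"{name}: {count}M ({shares[name]:.1f}%)")
print(f"Total: {total}M")
```

The total comes to 42.5M, matching the headline figure, with GitHub issues and curated benchmark examples together making up roughly two thirds of the data.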

Specialized Fine-Tuning Tasks

Chronos-1 is fine-tuned for four debugging-specific tasks:

Chain-of-cause reasoning
Multi-modal bug understanding (code + logs + traces)
Iterative fix refinement
Cross-repository pattern recognition
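
To make "code + logs + traces" concrete, here is a minimal sketch of what a single multi-modal debugging record might look like. The source does not describe the actual data schema, so the `DebugExample` class, its field names, and the sample values are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DebugExample:
    """Hypothetical training record; not the actual Chronos-1 schema."""

    issue_text: str              # natural-language bug report
    stack_trace: str             # runtime trace captured at failure
    ci_log: str                  # relevant excerpt from the failing build
    code_before: Dict[str, str]  # file path -> contents before the fix
    code_after: Dict[str, str]   # file path -> contents after the fix

    def changed_files(self) -> List[str]:
        """Return the sorted paths whose contents differ across the fix."""
        paths = set(self.code_before) | set(self.code_after)
        return sorted(
            p for p in paths
            if self.code_before.get(p) != self.code_after.get(p)
        )


# Illustrative values only; none of this is real training data.
example = DebugExample(
    issue_text="Division by zero when the cart is empty",
    stack_trace="ZeroDivisionError: division by zero",
    ci_log="FAILED tests/test_cart.py::test_average_price",
    code_before={"cart.py": "def avg(p):\n    return sum(p) / len(p)\n"},
    code_after={"cart.py": "def avg(p):\n    return sum(p) / len(p) if p else 0.0\n"},
)

print(example.changed_files())
```

Keeping the before and after states as path-to-contents mappings means a fix that touches several files is represented the same way as a single-file fix, which is the kind of multi-file repair the page describes.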
