SWE-Bench Lite Results

Chronos’ benchmark performance on SWE-Bench Lite and comparison with leading models.

Chronos-1 delivers state-of-the-art performance on SWE-Bench Lite, the industry-standard benchmark for evaluating real-world software debugging.

Benchmark Results (Nov 2025)

Rank 1: Chronos: 80.33% (241 / 300 solved)
Rank 2: ExpeRepair v1.0 (Claude 4.5 Sonnet): 60.33%
Rank 3: Refact.ai Agent: 60.00%
Rank 4: KGCompass (Claude 4.5 Sonnet): 58.33%
Rank 5: SWE Agent (Claude 4.5 Sonnet): 56.67%
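
The percentages above can be checked with a little arithmetic. The following is a minimal sketch, not part of any Kodezi tooling: it assumes all systems were scored on the same 300 tasks (only the "241 / 300" figure is stated on this page), and the function names `resolve_rate` and `implied_solved` are hypothetical helpers introduced for illustration. The solved counts for ranks 2 to 5 are inferred from the reported percentages, not reported directly.

```python
# Assumption: every system is scored on the same 300 tasks,
# as suggested by the "241 / 300 solved" figure for Chronos.
TOTAL_TASKS = 300


def resolve_rate(solved: int, total: int = TOTAL_TASKS) -> float:
    """Percentage of tasks solved, rounded to two decimals."""
    return round(100 * solved / total, 2)


def implied_solved(rate_percent: float, total: int = TOTAL_TASKS) -> int:
    """Nearest whole number of solved tasks implied by a reported percentage."""
    return round(rate_percent * total / 100)


# Reported resolve rates (percent), copied from the ranking above.
reported = {
    "Chronos": 80.33,
    "ExpeRepair v1.0": 60.33,
    "Refact.ai Agent": 60.00,
    "KGCompass": 58.33,
    "SWE Agent": 56.67,
}

# Inferred task counts; only the Chronos count (241) is stated in the source.
solved_counts = {name: implied_solved(rate) for name, rate in reported.items()}

# Absolute lead of rank 1 over rank 2, in percentage points.
lead_points = resolve_rate(
    solved_counts["Chronos"] - solved_counts["ExpeRepair v1.0"]
)

if __name__ == "__main__":
    for name, count in solved_counts.items():
        print(f"{name}: {count} / {TOTAL_TASKS} = {resolve_rate(count)}%")
    print(f"Lead of rank 1 over rank 2: {lead_points} percentage points")
```

Working from task counts rather than subtracting rounded percentages avoids accumulating rounding error when comparing systems.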

General-purpose models (no agent frameworks)

Claude 4.5 Sonnet (Bash-only): ~14%
Claude 4.1 Opus (Bash-only): 14.2%
GPT-4.1: 13.8%

Chronos’ 20-percentage-point absolute lead over the second-ranked system (80.33% vs. 60.33%) is attributed to:

debugging-specific training on 15M sessions
Persistent Debug Memory
Adaptive Graph-Guided Retrieval
autonomous fix-test-refine loops

Together, these components enable Chronos-1 to resolve failures that general-purpose models cannot.
