
Stats Schema

The stats schema is the contract between producer and consumer for the --stats-json output of lunavox-cli. Three producers emit the same shape:

  1. src/main.cpp writes the JSON file when the user passes --stats-json report.json.
  2. src/lunavox_c_api.cpp::to_c_audio fills LunavoxAudio with the same fields so the Python ctypes binding gets them as struct members.
  3. src/lunavox/runtime/binding.py::SynthesisStats is the Python dataclass the GUI and any embedding script consumes.

Consumers like benchmark/run_benchmark.py and the GUI can import StatsJSON from lunavox.core.stats_schema instead of reaching into free-form dicts.

TypedDicts (structural types)

TimingMs

Bases: TypedDict

Per-stage wall timings in milliseconds. Fields that the engine does not measure for a given synth path are 0, not absent.

StreamStats

Bases: TypedDict

Streaming pipeline diagnostics. Populated for every synth call, because the threaded decoder is always on.

MemoryBytes

Bases: TypedDict

Process memory snapshots in bytes, sampled at specific run checkpoints. rss_peak is the high-water mark seen during the synth call; rss_start / rss_end bracket the call.
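As a minimal sketch of how a consumer might use these snapshots, the following derives two figures from the three fields named above. MemorySketch and memory_deltas are hypothetical names used only for illustration, the sample values are made up, and the real MemoryBytes may carry additional fields.

```python
from typing import TypedDict


class MemorySketch(TypedDict):
    """Hypothetical stand-in holding only the fields named in the text."""

    rss_start: int
    rss_peak: int
    rss_end: int


def memory_deltas(mem: MemorySketch) -> dict[str, int]:
    """Derive per-call memory figures (in bytes) from the snapshots."""
    return {
        # Transient headroom the call needed above its starting footprint.
        "peak_over_start": mem["rss_peak"] - mem["rss_start"],
        # Memory still held once the call returned; can be negative.
        "retained": mem["rss_end"] - mem["rss_start"],
    }


# Illustrative values only, not measurements.
sample: MemorySketch = {
    "rss_start": 1_000_000,
    "rss_peak": 1_500_000,
    "rss_end": 1_100_000,
}
deltas = memory_deltas(sample)
```

A retained figure that keeps growing across runs would point at memory not being released between synth calls, whereas a large peak_over_start with a small retained figure suggests a transient working set.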

RunStats

Bases: TypedDict

One element of StatsJSON['runs'].

StatsJSON

Bases: TypedDict

Top-level --stats-json payload.

t_load_ms is the wall time spent inside a single Engine::load_models call (includes warmup). t_warmup_ms is the warmup portion of that load time, broken out so cold-start regressions can be diagnosed without re-parsing DEBUG logs. runs is the sequence of synth calls that followed — order matches the CLI's --repeat loop.
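Because t_warmup_ms is a portion of t_load_ms, the non-warmup part of the load can be recovered by subtraction. The sketch below shows one way a consumer might do that on a raw payload; load_breakdown is a hypothetical helper, and the sample values are illustrative, not taken from a real run.

```python
from typing import Any


def load_breakdown(payload: dict[str, Any]) -> dict[str, int]:
    """Split the load time into warmup and non-warmup parts (milliseconds)."""
    # Missing or null fields are treated as 0, an assumption of this sketch.
    load_ms = int(payload.get("t_load_ms", 0) or 0)
    warmup_ms = int(payload.get("t_warmup_ms", 0) or 0)
    runs = payload.get("runs", [])
    return {
        "load_ms": load_ms,
        "warmup_ms": warmup_ms,
        # Warmup is included in the load time, so the remainder is the rest of the load.
        "load_without_warmup_ms": load_ms - warmup_ms,
        "run_count": len(runs) if isinstance(runs, list) else 0,
    }


# Illustrative payload: two synth calls followed a 1200 ms load.
sample_payload = {"t_load_ms": 1200, "t_warmup_ms": 300, "runs": [{}, {}]}
breakdown = load_breakdown(sample_payload)
```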

Parser for downstream consumers

ParsedStats dataclass

ParsedStats(
    load_ms: int = 0, warmup_ms: int = 0, runs: list[RunDict] = _empty_run_list()
)

parse_stats_json

parse_stats_json(data: Any) -> ParsedStats

Coerce a parsed JSON dict (or a str or Path pointing at a JSON file) into a ParsedStats.

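Raises ValueError on shape mismatch: when the payload is not a JSON object, or when runs is not a list. The message names the offending field. Entries of runs that are not dicts are skipped rather than rejected.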

Source code in src/lunavox/core/stats_schema.py
def parse_stats_json(data: Any) -> ParsedStats:
    """Coerce a parsed JSON dict (or str path) into a :class:`ParsedStats`.

    Raises ``ValueError`` with the offending field name on shape mismatch.
    """
    import json
    from pathlib import Path

    if isinstance(data, (str, Path)):
        with open(data, encoding="utf-8") as f:
            data = json.load(f)

    if not isinstance(data, dict):
        raise ValueError(
            f"stats JSON must be an object with t_load_ms / runs, got {type(data).__name__}"
        )

    payload = cast(dict[str, Any], data)
    runs_raw = payload.get("runs", [])
    if not isinstance(runs_raw, list):
        raise ValueError(f"stats JSON 'runs' must be a list, got {type(runs_raw).__name__}")
    runs_list = cast(list[Any], runs_raw)
    runs: list[RunDict] = [cast(RunDict, r) for r in runs_list if isinstance(r, dict)]

    return ParsedStats(
        load_ms=int(payload.get("t_load_ms", 0) or 0),
        warmup_ms=int(payload.get("t_warmup_ms", 0) or 0),
        runs=runs,
    )
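
The listing above filters runs with isinstance(r, dict), so entries of any other type are dropped without an error. A consumer that wants to notice this could count such entries on the raw payload before parsing. The following is a minimal sketch under that reading of the source; count_dropped_runs is a hypothetical helper, not part of lunavox.

```python
from typing import Any


def count_dropped_runs(payload: dict[str, Any]) -> int:
    """Count 'runs' entries that a dict-only filter would discard."""
    runs_raw = payload.get("runs", [])
    if not isinstance(runs_raw, list):
        # Mirrors the parser's own check: 'runs' must be a list.
        raise ValueError(
            f"stats JSON 'runs' must be a list, got {type(runs_raw).__name__}"
        )
    return sum(1 for r in runs_raw if not isinstance(r, dict))


# Illustrative payload: one valid run and two malformed entries.
sample = {"t_load_ms": 10, "runs": [{}, "oops", None]}
dropped = count_dropped_runs(sample)
```

Comparing this count against zero before relying on the parsed runs makes a truncated or malformed report visible instead of silently shrinking the run list.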