Replay Dataset for Deterministic Extraction Testing

Archived document fixtures used to verify that the extraction pipeline produces consistent, schema-valid results across code changes.

Each fixture is a JSON file containing:

  • document_id: stable identifier for the fixture
  • document_type: article, filing, transcript, or press_release
  • document_text: normalized text as it would arrive from the parser
  • known_tickers: ticker hints passed to the extraction prompt
  • expected_extraction: the expected extraction result (schema-valid)
  • metadata: fixture provenance info (created_at, description, schema_version)
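As a rough illustration of the fixture shape described above, the sketch below builds one fixture as a Python dict and checks that the listed fields are present. The field names come from the list; every value, the `fixture_problems` helper, and the constant names are hypothetical and not taken from the repository. The contents of `expected_extraction` are left empty because its shape is defined by the project's extraction schema, which is not shown here.

```python
# Field names mirror the list above; all values are made up for illustration.
REQUIRED_KEYS = {
    "document_id",
    "document_type",
    "document_text",
    "known_tickers",
    "expected_extraction",
    "metadata",
}
DOCUMENT_TYPES = {"article", "filing", "transcript", "press_release"}
METADATA_KEYS = {"created_at", "description", "schema_version"}


def fixture_problems(fixture: dict) -> list[str]:
    """Return readable problems; an empty list means the shape looks right."""
    problems = []
    missing = REQUIRED_KEYS - fixture.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if fixture.get("document_type") not in DOCUMENT_TYPES:
        problems.append(f"unknown document_type: {fixture.get('document_type')!r}")
    metadata = fixture.get("metadata")
    if isinstance(metadata, dict):
        missing_meta = METADATA_KEYS - metadata.keys()
        if missing_meta:
            problems.append(f"missing metadata keys: {sorted(missing_meta)}")
    return problems


# Hypothetical fixture: identifiers, text, and dates are placeholders.
example_fixture = {
    "document_id": "example-article-001",
    "document_type": "article",
    "document_text": "Example Corp reported quarterly results today.",
    "known_tickers": ["EXMP"],
    "expected_extraction": {},  # real shape comes from the extraction schema
    "metadata": {
        "created_at": "2024-01-01T00:00:00Z",
        "description": "Illustrative fixture, not a real document",
        "schema_version": "1",
    },
}
```

A check like this only covers the outer envelope of a fixture; validating `expected_extraction` itself is the job of the extraction schema.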

The replay runner (tests/test_replay_extraction.py) loads these fixtures, validates the expected outputs against the current extraction schema, and optionally runs them through a live Ollama instance for end-to-end checks.
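The loading step might look something like the sketch below. This is not the code in tests/test_replay_extraction.py: the function names and the `REPLAY_LIVE_OLLAMA` environment variable are assumptions made for illustration, and the actual runner may discover fixtures and gate the live Ollama run differently.

```python
import json
import os
from pathlib import Path


def load_fixtures(fixture_dir: Path) -> list[dict]:
    """Load every *.json file in fixture_dir, sorted by name for a stable order."""
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(fixture_dir.glob("*.json"))
    ]


def live_checks_enabled(env=None) -> bool:
    """Hypothetical opt-in switch: live runs happen only when the flag is '1'."""
    env = os.environ if env is None else env
    return env.get("REPLAY_LIVE_OLLAMA") == "1"
```

Sorting the file paths keeps the fixture order the same on every machine, which matters when the goal is deterministic replay. Keeping the live run behind an explicit opt-in lets the schema checks run anywhere, including environments with no Ollama instance available.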