
Lakehouse Schemas

Analytical fact table definitions for MinIO-backed datasets queried via Trino.

All tables are defined in the lakehouse.stonks schema and stored as Parquet files on MinIO under s3a://stonks-lakehouse/warehouse/, using Hive-compatible partition layouts.
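To make the Hive-compatible layout concrete, here is a minimal Python sketch of how a partition directory could be derived. It assumes each table lives in a directory named after the table directly under the warehouse path, which this README does not confirm. The helper partition_path and the example date are hypothetical.

```python
from datetime import date

# Base location taken from this README.
WAREHOUSE = "s3a://stonks-lakehouse/warehouse"


def partition_path(table: str, partitions: dict[str, object]) -> str:
    """Return the Hive-style directory for one partition of a table.

    Assumption: the table directory sits directly under WAREHOUSE.
    """
    # Hive layout: one key=value directory per partition column,
    # nested in the order the partition columns are declared.
    parts = "/".join(f"{key}={value}" for key, value in partitions.items())
    return f"{WAREHOUSE}/{table}/{parts}/"


print(partition_path("market_bars", {"dt": date(2024, 1, 15)}))
# s3a://stonks-lakehouse/warehouse/market_bars/dt=2024-01-15/
```

The sketch does not escape special characters in partition values, so it only suits simple values such as ISO dates and plain version strings.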

Fact Tables

  • lake.market_bars — OHLCV bar data per symbol per interval
  • lake.market_quotes — bid/ask quote snapshots
  • lake.company_events — corporate actions, earnings, filings, and issuer events
  • lake.documents — ingested document metadata (articles, filings, transcripts)
  • lake.document_extractions — AI extraction outputs per document per company
  • lake.trade_signals — aggregated trend signals and recommendation actions
  • lake.trade_orders — order submission records (paper and live)
  • lake.trade_fills — fill and execution records from broker
  • lake.positions_daily — end-of-day position snapshots
  • lake.pnl_daily — daily PnL records per symbol per account
  • lake.prediction_vs_outcome — prediction accuracy tracking
  • lake.model_performance — extraction model performance metrics

Partitioning

  • Most tables are partitioned by dt (date)
  • document_extractions, prediction_vs_outcome, and model_performance are additionally partitioned by model_version
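As a rough illustration of how these partition keys might appear in a table definition, the following Python sketch renders a Trino CREATE TABLE statement using the Hive connector's format, partitioned_by, and external_location table properties. The column names, column types, the dt-then-model_version ordering, and the table directory are assumptions made for the example, not the actual definitions in this repository.

```python
from typing import Sequence

# Base location taken from this README.
WAREHOUSE = "s3a://stonks-lakehouse/warehouse"


def create_table_ddl(
    table: str,
    columns: Sequence[tuple[str, str]],
    partition_cols: Sequence[tuple[str, str]],
    catalog: str = "lakehouse",
    schema: str = "stonks",
) -> str:
    """Render a Trino Hive-connector CREATE TABLE for an external Parquet table."""
    # The Hive connector expects partition columns to be the last columns
    # of the table, in the same order as listed in partitioned_by.
    all_cols = list(columns) + list(partition_cols)
    col_sql = ",\n".join(f"    {name} {sql_type}" for name, sql_type in all_cols)

    props = ["format = 'PARQUET'"]
    if partition_cols:
        names = ", ".join(f"'{name}'" for name, _ in partition_cols)
        props.append(f"partitioned_by = ARRAY[{names}]")
    # Assumption: the table directory is named after the table.
    props.append(f"external_location = '{WAREHOUSE}/{table}/'")
    prop_sql = ",\n".join(f"    {prop}" for prop in props)

    return (
        f"CREATE TABLE IF NOT EXISTS {catalog}.{schema}.{table} (\n"
        f"{col_sql}\n"
        f") WITH (\n"
        f"{prop_sql}\n"
        f")"
    )


# Hypothetical columns; the real schema is defined elsewhere in the repo.
print(
    create_table_ddl(
        "document_extractions",
        columns=[("document_id", "varchar"), ("company_id", "varchar")],
        partition_cols=[("dt", "date"), ("model_version", "varchar")],
    )
)
```

Generating the statement from one list of partition columns keeps the column list and the partitioned_by property in the same order.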

Trino Catalogs

  • lakehouse catalog (Hive connector) for external Hive-compatible tables
  • iceberg catalog (Iceberg connector) for managed Iceberg tables
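Because Trino addresses tables as catalog.schema.table, the catalog name decides which connector serves a query. The sketch below builds such fully qualified names in Python. The helper qualified_name is hypothetical, and the Iceberg schema and table names in the second call are placeholders, since this README does not list any Iceberg tables.

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified, quoted Trino name: catalog.schema.table."""

    def quote(identifier: str) -> str:
        # Trino delimits identifiers with double quotes; an embedded
        # double quote is escaped by doubling it.
        return '"' + identifier.replace('"', '""') + '"'

    return ".".join(quote(part) for part in (catalog, schema, table))


# External Hive-compatible table, served by the Hive connector.
print(qualified_name("lakehouse", "stonks", "market_bars"))
# "lakehouse"."stonks"."market_bars"

# Managed Iceberg table (schema and table names are placeholders).
print(qualified_name("iceberg", "example_schema", "example_table"))
# "iceberg"."example_schema"."example_table"
```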

Views

Example SQL views for dashboards and ad hoc analysis are in lakehouse/views/. See lakehouse/views/README.md for details.