Lakehouse Schemas
Analytical fact table definitions for MinIO-backed datasets queried via Trino.
All tables use Hive-compatible partition layouts on MinIO (s3a://stonks-lakehouse/warehouse/)
and are defined in the lakehouse.stonks schema. Parquet is the storage format.
Fact Tables
- lake.market_bars: OHLCV bar data per symbol per interval
- lake.market_quotes: bid/ask quote snapshots
- lake.company_events: corporate actions, earnings, filings, and issuer events
- lake.documents: ingested document metadata (articles, filings, transcripts)
- lake.document_extractions: AI extraction outputs per document per company
- lake.trade_signals: aggregated trend signals and recommendation actions
- lake.trade_orders: order submission records (paper and live)
- lake.trade_fills: fill and execution records from broker
- lake.positions_daily: end-of-day position snapshots
- lake.pnl_daily: daily PnL records per symbol per account
- lake.prediction_vs_outcome: prediction accuracy tracking
- lake.model_performance: extraction model performance metrics
Partitioning
- Most tables partition by dt (date)
- document_extractions, prediction_vs_outcome, and model_performance also partition by model_version
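As a rough illustration of what a Hive-compatible partition layout looks like on disk, the sketch below builds the key=value directory path for a given table and date. The warehouse root and the list of model-versioned tables come from this README; the per-table directory name, the order of the partition keys (dt before model_version), and the helper name partition_path are assumptions made for the example, not something the repository confirms.

```python
from datetime import date
from typing import Optional

# Warehouse root as stated in this README.
WAREHOUSE_ROOT = "s3a://stonks-lakehouse/warehouse"

# Tables that this README says also partition by model_version.
MODEL_VERSIONED = {
    "document_extractions",
    "prediction_vs_outcome",
    "model_performance",
}


def partition_path(table: str, dt: date, model_version: Optional[str] = None) -> str:
    """Build a Hive-style partition directory made of key=value segments.

    Hypothetical helper: assumes each table lives in a directory named
    after the table, and that dt precedes model_version in the path.
    """
    parts = [WAREHOUSE_ROOT, table, f"dt={dt.isoformat()}"]
    if table in MODEL_VERSIONED:
        if model_version is None:
            raise ValueError(f"{table} requires a model_version")
        parts.append(f"model_version={model_version}")
    return "/".join(parts) + "/"


if __name__ == "__main__":
    print(partition_path("market_bars", date(2024, 1, 15)))
    print(partition_path("model_performance", date(2024, 1, 15), "v1"))
```

Queries that filter on dt (and on model_version where it applies) let the engine skip whole directories, which is the main reason to keep these columns as partition keys.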
Trino Catalogs
- lakehouse catalog (Hive connector) for external Hive-compatible tables
- iceberg catalog (Iceberg connector) for managed Iceberg tables
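Trino addresses a table as catalog.schema.table, so a table in the Hive-backed catalog would be referenced through the lakehouse catalog and the stonks schema. The sketch below composes such a name and a single-day query string. It assumes that the bare table name (for example market_bars, without the lake. prefix) is what Trino sees and that dt is a DATE column; both helper names are hypothetical.

```python
from datetime import date


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Return a double-quoted catalog.schema.table identifier for Trino."""
    # Embedded double quotes are escaped by doubling them.
    return ".".join('"' + part.replace('"', '""') + '"' for part in (catalog, schema, table))


def daily_query(table: str, dt: date) -> str:
    """Build a query that reads one dt partition of a lakehouse.stonks table.

    Assumption: dt is a DATE partition column, so a DATE literal is used.
    """
    name = qualified_name("lakehouse", "stonks", table)
    return f"SELECT * FROM {name} WHERE dt = DATE '{dt.isoformat()}'"


if __name__ == "__main__":
    print(daily_query("market_bars", date(2024, 1, 15)))
```

The resulting string can be passed to whichever Trino client is in use; an equality filter on dt restricts the scan to a single partition.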
Views
Example SQL views for dashboards and ad hoc analysis are in lakehouse/views/.
See lakehouse/views/README.md for details.