# Lakehouse Schemas
Analytical fact table definitions for MinIO-backed datasets queried via Trino.
All tables use Hive-compatible partition layouts on MinIO (`s3a://stonks-lakehouse/warehouse/`)
and are defined in the `lakehouse.stonks` schema. Parquet is the storage format.
## Fact Tables
- `lake.market_bars` — OHLCV bar data per symbol per interval
- `lake.market_quotes` — bid/ask quote snapshots
- `lake.company_events` — corporate actions, earnings, filings, and issuer events
- `lake.documents` — ingested document metadata (articles, filings, transcripts)
- `lake.document_extractions` — AI extraction outputs per document per company
- `lake.trade_signals` — aggregated trend signals and recommendation actions
- `lake.trade_orders` — order submission records (paper and live)
- `lake.trade_fills` — fill and execution records from broker
- `lake.positions_daily` — end-of-day position snapshots
- `lake.pnl_daily` — daily PnL records per symbol per account
- `lake.prediction_vs_outcome` — prediction accuracy tracking
- `lake.model_performance` — extraction model performance metrics
## Partitioning
- Most tables partition by `dt` (date)
- `document_extractions`, `prediction_vs_outcome`, and `model_performance` also partition by `model_version`
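
As a rough illustration of the Hive-style layout described above, the sketch below builds the directory for one partition. The helper name `partition_path`, the placement of each table directory directly under the warehouse root, the ISO date format, and the ordering of `dt` before `model_version` are all assumptions for illustration; the schema definitions remain the authority.

```python
from datetime import date
from typing import Optional

# Warehouse root as stated in this README.
WAREHOUSE_ROOT = "s3a://stonks-lakehouse/warehouse"

# Tables this README lists as also partitioned by model_version.
MODEL_VERSIONED = {
    "document_extractions",
    "prediction_vs_outcome",
    "model_performance",
}


def partition_path(table: str, dt: date, model_version: Optional[str] = None) -> str:
    """Return a Hive-style partition directory (hypothetical helper).

    Assumes the table partitions by dt, that its directory sits directly
    under the warehouse root, and that dt precedes model_version.
    """
    parts = [f"dt={dt.isoformat()}"]
    if table in MODEL_VERSIONED:
        if model_version is None:
            raise ValueError(f"{table} requires a model_version")
        parts.append(f"model_version={model_version}")
    return "/".join([WAREHOUSE_ROOT, table, *parts]) + "/"


if __name__ == "__main__":
    print(partition_path("market_bars", date(2024, 1, 15)))
    print(partition_path("model_performance", date(2024, 1, 15), "v2"))
```

Keeping `dt` as the leading key means a date filter alone is enough to narrow a scan to one directory per table, if the layout matches this assumption.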
## Trino Catalogs
- `lakehouse` catalog (Hive connector) for external Hive-compatible tables
- `iceberg` catalog (Iceberg connector) for managed Iceberg tables
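
Trino addresses tables with a three-part `catalog.schema.table` name, so a query has to say which catalog it targets. The sketch below composes such a name and a query limited to one `dt` partition. The helpers `qualified_name` and `daily_query` are hypothetical, and treating `dt` as a `DATE`-typed column is an assumption; if it is stored as varchar, compare against a plain string literal instead.

```python
import re
from datetime import date

# Conservative pattern for unquoted identifiers (an assumption, chosen so
# that arbitrary text is never spliced into the SQL string).
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Join catalog, schema, and table into a three-part Trino name."""
    for part in (catalog, schema, table):
        if not _IDENT.fullmatch(part):
            raise ValueError(f"unsafe identifier: {part!r}")
    return f"{catalog}.{schema}.{table}"


def daily_query(catalog: str, schema: str, table: str, dt: date) -> str:
    """Build a SELECT restricted to a single dt partition.

    Assumes dt is a DATE-typed partition column.
    """
    name = qualified_name(catalog, schema, table)
    return f"SELECT * FROM {name} WHERE dt = DATE '{dt.isoformat()}'"


if __name__ == "__main__":
    # Catalog and schema are passed in by the caller, not hard-coded.
    print(daily_query("lakehouse", "stonks", "market_bars", date(2024, 1, 15)))
```

Filtering on the partition column lets the Hive connector skip directories for other dates rather than scanning the whole table.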
## Views
Example SQL views for dashboards and ad hoc analysis are in `lakehouse/views/`.
See `lakehouse/views/README.md` for details.