phase 14-15: docker build validation and helm deployment
@@ -24,99 +24,153 @@
- [x] Add seed data support for an initial tracked watchlist

## Phase 3 - External API Adapters
- [ ] Implement scheduler for symbol and source polling windows
- [ ] Implement market data API adapter interface
- [ ] Implement first concrete market data provider adapter
- [ ] Implement news API adapter interface
- [ ] Implement first concrete news API provider adapter
- [ ] Implement filings or regulatory adapter interface
- [ ] Implement first concrete filings provider adapter
- [ ] Implement broker API adapter interface for paper trading and order events
- [ ] Implement rate-limit coordination, retries, and backoff across adapters
- [x] Implement scheduler for symbol and source polling windows
- [x] Implement market data API adapter interface
- [x] Implement first concrete market data provider adapter
- [x] Implement news API adapter interface
- [x] Implement first concrete news API provider adapter
- [x] Implement filings or regulatory adapter interface
- [x] Implement first concrete filings provider adapter
- [x] Implement broker API adapter interface for paper trading and order events
- [x] Implement rate-limit coordination, retries, and backoff across adapters
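
A minimal sketch of the retries-and-backoff item above, assuming capped exponential backoff with full jitter. The names `backoff_delays` and `call_with_retries`, and the default values, are hypothetical and not taken from the repository; the adapters may well coordinate retries differently.

```python
import random
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0) -> List[float]:
    """Upper bound for each retry delay: base * 2**attempt, capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def call_with_retries(
    fn: Callable[[], T],
    retries: int = 4,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call fn once, then up to `retries` more times with full-jitter backoff."""
    last_error: Optional[Exception] = None
    for attempt in range(retries + 1):
        if attempt > 0:
            bound = min(cap, base * (2 ** (attempt - 1)))
            # Full jitter: sleep a random fraction of the current bound.
            sleep(rng() * bound)
        try:
            return fn()
        except Exception as exc:  # a real adapter would catch narrower errors
            last_error = exc
    assert last_error is not None
    raise last_error
```

Injecting `sleep` and `rng` keeps the helper testable without real waiting; a production version would likely also honor provider rate-limit headers.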

## Phase 4 - Ingestion Pipeline
- [ ] Implement web scraper worker for curated URLs and article pages
- [ ] Implement canonical URL normalization and content hashing
- [ ] Implement raw artifact upload to MinIO
- [ ] Implement metadata persistence in PostgreSQL for market payloads, documents, and broker events
- [ ] Implement retry and failure tracking for source retrieval
- [ ] Implement dedupe logic across article and filing sources
- [x] Implement web scraper worker for curated URLs and article pages
- [x] Implement canonical URL normalization and content hashing
- [x] Implement raw artifact upload to MinIO
- [x] Implement metadata persistence in PostgreSQL for market payloads, documents, and broker events
- [x] Implement retry and failure tracking for source retrieval
- [x] Implement dedupe logic across article and filing sources
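
One way the URL normalization and content hashing items above could look, as a sketch only. The tracking-parameter list, the trailing-slash rule, and the function names are assumptions for illustration, not the project's actual rules.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of query parameters that do not change the page content.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "fbclid", "gclid",
}


def canonicalize_url(url: str) -> str:
    """Lowercase scheme and host, drop default ports, fragments, userinfo,
    and tracking parameters, and sort the remaining query parameters."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    default_port = (scheme == "http" and port == 80) or (scheme == "https" and port == 443)
    if port and not default_port:
        host = f"{host}:{port}"
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((scheme, host, path, urlencode(sorted(kept)), ""))


def content_hash(text: str) -> str:
    """SHA-256 over whitespace-normalized text, usable as a dedupe key."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Two documents with the same canonical URL or the same content hash can then be treated as duplicate candidates by the dedupe step.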

## Phase 5 - Parsing and Normalization
- [ ] Implement HTML-to-text parsing pipeline
- [ ] Implement boilerplate reduction and body extraction heuristics
- [ ] Implement parser quality scoring and confidence flags
- [ ] Implement company mention detection using ticker, alias, and name matching
- [ ] Persist normalized text and parser outputs to MinIO and PostgreSQL
- [x] Implement HTML-to-text parsing pipeline
- [x] Implement boilerplate reduction and body extraction heuristics
- [x] Implement parser quality scoring and confidence flags
- [x] Implement company mention detection using ticker, alias, and name matching
- [x] Persist normalized text and parser outputs to MinIO and PostgreSQL

## Phase 6 - Ollama Structured Extraction
- [ ] Build extraction prompt templates with anti-hallucination instructions
- [ ] Build JSON schema definitions for document intelligence extraction
- [ ] Implement Ollama client wrapper using structured output format
- [ ] Implement schema validation and semantic validation layers
- [ ] Persist prompts, model metadata, raw outputs, validation reports, and final intelligence objects
- [ ] Add retry behavior for invalid or incomplete model responses
- [ ] Add model performance metrics and dashboards
- [x] Build extraction prompt templates with anti-hallucination instructions
- [x] Build JSON schema definitions for document intelligence extraction
- [x] Implement Ollama client wrapper using structured output format
- [x] Implement schema validation and semantic validation layers
- [x] Persist prompts, model metadata, raw outputs, validation reports, and final intelligence objects
- [x] Add retry behavior for invalid or incomplete model responses
- [x] Add model performance metrics and dashboards

## Phase 7 - Aggregation and Trend Engine
- [ ] Implement recency decay and source credibility weighting
- [ ] Integrate market context features into aggregation windows
- [ ] Implement company-level rolling window aggregation
- [ ] Implement contradiction detection and disagreement representation
- [ ] Implement sector and market rollups
- [ ] Implement evidence ranking for supporting and opposing documents
- [ ] Persist trend windows and evidence mappings
- [x] Implement recency decay and source credibility weighting
- [x] Integrate market context features into aggregation windows
- [x] Implement company-level rolling window aggregation
- [x] Implement contradiction detection and disagreement representation
- [x] Implement sector and market rollups
- [x] Implement evidence ranking for supporting and opposing documents
- [x] Persist trend windows and evidence mappings
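
The recency decay and credibility weighting item above could be sketched as a half-life weighted average. The `Signal` fields, the 24-hour half-life, and the function names are assumptions chosen for illustration; the checklist does not specify the actual weighting scheme.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Signal:
    score: float        # hypothetical per-document score, e.g. sentiment in [-1, 1]
    age_hours: float    # time since publication
    credibility: float  # hypothetical source weight in [0, 1]


def decay_weight(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Exponential decay: the weight halves every `half_life_hours`."""
    return 0.5 ** (age_hours / half_life_hours)


def weighted_score(signals: Sequence[Signal], half_life_hours: float = 24.0) -> float:
    """Average of scores weighted by recency decay times source credibility."""
    weights = [decay_weight(s.age_hours, half_life_hours) * s.credibility for s in signals]
    total = sum(weights)
    if total == 0:
        return 0.0  # no usable evidence in the window
    return sum(w * s.score for w, s in zip(weights, signals)) / total
```

With these assumptions, a fresh positive document outweighs a day-old negative one of equal credibility by a factor of two.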

## Phase 8 - Recommendation Engine
- [ ] Design deterministic recommendation eligibility logic
- [ ] Implement recommendation generation from aggregated scores and evidence
- [ ] Add optional LLM wording layer for thesis generation only
- [ ] Persist recommendation objects and evidence citations
- [ ] Add suppression logic for low-quality data or low confidence
- [ ] Publish prediction facts to analytical tables
- [x] Design deterministic recommendation eligibility logic
- [x] Implement recommendation generation from aggregated scores and evidence
- [x] Add optional LLM wording layer for thesis generation only
- [x] Persist recommendation objects and evidence citations
- [x] Add suppression logic for low-quality data or low confidence
- [x] Publish prediction facts to analytical tables

## Phase 9 - Risk Engine and Trade Adapter
- [ ] Implement portfolio and account risk configuration model
- [ ] Implement hard blocks for max position size, sector exposure, daily loss limits, and news-shock lockouts
- [ ] Implement paper trading adapter behavior and state sync
- [ ] Integrate first broker API in sandbox mode
- [ ] Implement idempotent order submission keys and duplicate prevention
- [ ] Implement full execution audit trail
- [ ] Add operator approval workflow for live trading mode
- [ ] Publish order, fill, and position facts to analytical tables
- [x] Implement portfolio and account risk configuration model
- [x] Implement hard blocks for max position size, sector exposure, daily loss limits, and news-shock lockouts
- [x] Implement paper trading adapter behavior and state sync
- [x] Integrate first broker API in sandbox mode
- [x] Implement idempotent order submission keys and duplicate prevention
- [x] Implement full execution audit trail
- [x] Add operator approval workflow for live trading mode
- [x] Publish order, fill, and position facts to analytical tables
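
A rough illustration of the idempotent order submission item above: derive a deterministic key from the order intent and refuse to send the same key twice. The key fields, the `OrderSubmitter` class, and the in-memory store are all hypothetical; a real implementation would presumably persist keys durably (for example behind a unique constraint) rather than in a dict.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def idempotency_key(account_id: str, recommendation_id: str, symbol: str,
                    side: str, quantity: int) -> str:
    """Deterministic key: the same order intent always hashes to the same value."""
    payload = json.dumps(
        {
            "account_id": account_id,
            "recommendation_id": recommendation_id,
            "symbol": symbol.upper(),
            "side": side.lower(),
            "quantity": quantity,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class OrderSubmitter:
    """Wraps a broker send function and suppresses duplicate submissions."""

    def __init__(self, send: Callable[[Dict[str, Any]], Any]) -> None:
        self._send = send
        self._seen: Dict[str, Any] = {}  # in-memory stand-in for a durable store

    def submit(self, key: str, order: Dict[str, Any]) -> Any:
        if key in self._seen:
            return self._seen[key]  # replay the earlier result, do not resend
        result = self._send(order)
        self._seen[key] = result
        return result
```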

## Phase 10 - Lakehouse and SQL Analytics
- [ ] Define analytical fact tables for bars, documents, extractions, signals, orders, fills, positions, and PnL
- [ ] Implement Parquet writers for analytical datasets
- [ ] Implement Hive-compatible partition layout conventions on MinIO
- [ ] Implement Iceberg table creation and metadata management for analytical datasets
- [ ] Implement lake publisher jobs from operational data into analytical fact tables
- [ ] Configure Trino catalogs for Hive and/or Iceberg access to MinIO
- [ ] Add example SQL views for prediction-vs-outcome and paper-trade scorecards
- [x] Define analytical fact tables for bars, documents, extractions, signals, orders, fills, positions, and PnL
- [x] Implement Parquet writers for analytical datasets
- [x] Implement Hive-compatible partition layout conventions on MinIO
- [x] Implement Iceberg table creation and metadata management for analytical datasets
- [x] Implement lake publisher jobs from operational data into analytical fact tables
- [x] Configure Trino catalogs for Hive and/or Iceberg access to MinIO
- [x] Add example SQL views for prediction-vs-outcome and paper-trade scorecards
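
Hive-compatible layouts encode partition values as `key=value` path segments. The sketch below shows one possible convention for the partition layout item above; the `warehouse` prefix, the `dt` and `symbol` partition columns, and their order are assumptions, not the project's actual layout.

```python
from datetime import date


def partition_path(table: str, symbol: str, day: date, prefix: str = "warehouse") -> str:
    """Build a Hive-style object key prefix such as
    warehouse/<table>/dt=YYYY-MM-DD/symbol=<SYMBOL>/ for a MinIO bucket."""
    return f"{prefix}/{table}/dt={day.isoformat()}/symbol={symbol.upper()}/"
```

Partitioning by date first tends to suit time-bounded queries; a layout with `symbol` first would favor per-symbol scans instead.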

## Phase 11 - Query API and Dashboard
- [ ] Build APIs for companies, document timelines, trend summaries, recommendations, and order history
- [ ] Build evidence drill-down view linking recommendations to source documents and raw artifacts
- [ ] Build admin controls for source health, symbol configs, and trading mode
- [ ] Build operational dashboard for ingestion throughput, model failures, and source coverage gaps
- [ ] Build Superset starter dashboards for symbol overview, sentiment heatmap, PnL, and prediction accuracy
- [x] Build APIs for companies, document timelines, trend summaries, recommendations, and order history
- [x] Build evidence drill-down view linking recommendations to source documents and raw artifacts
- [x] Build admin controls for source health, symbol configs, and trading mode
- [x] Build operational dashboard for ingestion throughput, model failures, and source coverage gaps
- [x] Build Superset starter dashboards for symbol overview, sentiment heatmap, PnL, and prediction accuracy

## Phase 12 - Observability and Hardening
- [ ] Add structured logs and distributed tracing across services
- [ ] Add Prometheus metrics for ingestion, parsing, extraction, aggregation, lake publication, and trading
- [ ] Add alerting for source failures, schema failure spikes, analytical lag, and broker issues
- [ ] Add dead-letter queues and replay tooling
- [ ] Add data retention and lifecycle controls for raw and derived artifacts
- [ ] Add security review for secrets, network policies, trading isolation, and dashboard access control
- [x] Add structured logs and distributed tracing across services
- [x] Add Prometheus metrics for ingestion, parsing, extraction, aggregation, lake publication, and trading
- [x] Add alerting for source failures, schema failure spikes, analytical lag, and broker issues
- [x] Add dead-letter queues and replay tooling
- [x] Add data retention and lifecycle controls for raw and derived artifacts
- [x] Add security review for secrets, network policies, trading isolation, and dashboard access control

## Phase 13 - Verification and Rollout
- [ ] Create replay dataset from archived documents for deterministic extraction testing
- [ ] Create integration tests for the full ingest-to-recommendation flow
- [ ] Create paper trading simulation scenarios
- [ ] Validate fail-closed behavior for broker outages and ambiguous order states
- [ ] Validate lake publication and Trino query correctness over partitioned MinIO datasets
- [ ] Run shadow mode before enabling any live execution
- [ ] Prepare operator runbook and incident response procedures
- [x] Create replay dataset from archived documents for deterministic extraction testing
- [x] Create integration tests for the full ingest-to-recommendation flow
- [x] Create paper trading simulation scenarios
- [x] Validate fail-closed behavior for broker outages and ambiguous order states
- [x] Validate lake publication and Trino query correctness over partitioned MinIO datasets
- [x] ~~Run shadow mode~~ moved to Phase 15.6 (post-deployment)
- [x] ~~Prepare operator runbook~~ moved to Phase 15.7 (post-deployment)

## Phase 14 - Local Docker Build Validation
- [x] 14. Build and validate all Docker containers locally
  - [x] 14.1 Build all 11 service containers locally using the Makefile
    - Run `make build` to build scheduler, symbol-registry, ingestion, parser, extractor, aggregation, recommendation, risk, broker-adapter, lake-publisher, and query-api images
    - Fix any build failures (missing dependencies, import errors, syntax issues)
    - _Requirements: N1, 12.1_
  - [x] 14.2 Validate schema and logic consistency across all services
    - Run the full test suite with `pytest tests/ -x --tb=short -q` to catch import errors, schema mismatches, and logic inconsistencies
    - Verify all shared schemas in `services/shared/schemas.py` are consistent with what each service expects
    - Verify config loader fields match the configmap and secrets definitions
    - Fix any mismatches found between services, schemas, migrations, and K8s manifests
    - _Requirements: 5.2, 5.3, 9.2, N2_
  - [x] 14.3 Verify each container starts without immediate crash
    - Run each built image with `docker run --rm` and a quick health check or `--help` flag to confirm the entrypoint resolves
    - Fix any runtime import errors or missing module paths
    - _Requirements: N1_
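
Task 14.3 could be scripted roughly as follows. The service names come from task 14.1; the image naming scheme (`stonks-oracle/<service>:local`) and the assumption that every entrypoint accepts `--help` are hypothetical and would need to match what the Makefile actually produces.

```python
import subprocess
from typing import Dict, List

SERVICES = [
    "scheduler", "symbol-registry", "ingestion", "parser", "extractor",
    "aggregation", "recommendation", "risk", "broker-adapter",
    "lake-publisher", "query-api",
]


def smoke_command(service: str, image_prefix: str = "stonks-oracle",
                  tag: str = "local") -> List[str]:
    """Build the `docker run --rm <image> --help` command for one service."""
    return ["docker", "run", "--rm", f"{image_prefix}/{service}:{tag}", "--help"]


def run_smoke_tests(timeout_s: int = 60) -> Dict[str, bool]:
    """Run each image once; True means the container exited with status 0."""
    results: Dict[str, bool] = {}
    for service in SERVICES:
        try:
            proc = subprocess.run(
                smoke_command(service), capture_output=True, timeout=timeout_s
            )
            results[service] = proc.returncode == 0
        except (subprocess.TimeoutExpired, FileNotFoundError):
            # Hung container, or the docker CLI is not installed.
            results[service] = False
    return results
```

Any service mapped to `False` in the result is a candidate for the runtime import or module path fixes listed under 14.3.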

## Phase 15 - CI Validation, Helm Deployment, and Cluster Rollout
- [-] 15. Commit, push, validate CI, create Helm chart, and deploy to cluster
  - [-] 15.1 Commit and push code to GitHub
    - Configure git with SSH key for the private repo
    - Commit all current changes with message `phase 14-15: docker build validation and helm deployment`
    - Push to main branch
    - _Requirements: N1_
  - [ ] 15.2 Validate GitHub Actions workflow builds containers
    - Monitor the GitHub Actions run to confirm lint-and-test and build-services jobs succeed
    - Fix any CI failures and re-push if needed
    - _Requirements: N1_
  - [ ] 15.3 Create Helm chart for stonks-oracle deployment
    - Create `infra/helm/stonks-oracle/Chart.yaml` with chart metadata
    - Create `infra/helm/stonks-oracle/values.yaml` with configurable image tags, replica counts, resource limits, and environment references
    - Create Helm templates for all deployments, services, configmap, secrets, ingress, and network policies from existing K8s manifests
    - Add imagePullSecrets configuration for GHCR private registry access
    - Add a template for a Kubernetes Secret of type `kubernetes.io/dockerconfigjson` for GHCR authentication
    - _Requirements: N1, 8.2_
  - [ ] 15.4 Configure GHCR image pull authentication on the cluster
    - Create a `docker-registry` secret in the `stonks-oracle` namespace with GHCR credentials (using a GitHub PAT or deploy key)
    - Reference the imagePullSecret in all deployment specs via the Helm values
    - _Requirements: 8.2, N1_
  - [ ] 15.5 Deploy stonks-oracle to the cluster via Helm
    - Run `helm install` or `helm upgrade --install` targeting the `stonks-oracle` namespace
    - Verify all pods reach Running/Ready state
    - Verify services and ingress endpoints are reachable
    - Debug and fix any deployment issues (CrashLoopBackOff, image pull errors, config mismatches)
    - _Requirements: N1, 12.1_
  - [ ] 15.6 Run shadow mode before enabling any live execution
    - Confirm all services are running and processing in paper-only mode
    - Validate end-to-end data flow from ingestion through recommendation without live trades
    - _Requirements: N5, 8.1_
  - [ ] 15.7 Prepare operator runbook and incident response procedures
    - Document service restart procedures, log access, and common failure modes
    - Document how to toggle trading modes and approve live execution
    - _Requirements: 8.2, 12.1_
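
For tasks 15.3 and 15.4, the payload of a `kubernetes.io/dockerconfigjson` Secret can be sketched as below. The function names and the secret name used in the usage note are hypothetical; the token should come from a secret store and never be committed to the chart's `values.yaml`.

```python
import base64
import json
from typing import Any, Dict


def dockerconfigjson(username: str, token: str, registry: str = "ghcr.io") -> str:
    """Return the Docker config JSON document that Kubernetes expects."""
    auth = base64.b64encode(f"{username}:{token}".encode("utf-8")).decode("ascii")
    return json.dumps(
        {"auths": {registry: {"username": username, "password": token, "auth": auth}}}
    )


def pull_secret_manifest(name: str, namespace: str, username: str,
                         token: str) -> Dict[str, Any]:
    """Build a Secret manifest whose data field holds the base64-encoded config."""
    encoded = base64.b64encode(
        dockerconfigjson(username, token).encode("utf-8")
    ).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }
```

For example, `pull_secret_manifest("ghcr-pull", "stonks-oracle", user, pat)` yields a manifest that deployments could reference through `imagePullSecrets`; `ghcr-pull` is only a placeholder name.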

## Recommended First Vertical Slice
- [ ] Track 5 to 10 symbols