The aggregation engine only ran when new documents were ingested,
leaving intraday trend data stale for long periods. Now the scheduler
enqueues all 50 tickers for re-aggregation every ~15 minutes during
US market hours (Mon-Fri, 6:30 AM - 1:30 PM PT). This ensures
continuous intraday trend updates based on existing signals and
market price changes.
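
A minimal sketch of the market-hours gate described above, assuming the
scheduler works with timezone-aware datetimes. The function and constant
names are hypothetical and are not taken from the scheduler code:

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")
WINDOW_OPEN = time(6, 30)    # 6:30 AM PT
WINDOW_CLOSE = time(13, 30)  # 1:30 PM PT


def in_market_window(now: datetime) -> bool:
    """Return True when `now` is inside the Mon-Fri 6:30 AM - 1:30 PM PT window.

    `now` must be timezone-aware; it is converted to Pacific time first.
    Market holidays are not considered here.
    """
    local = now.astimezone(PACIFIC)
    if local.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return WINDOW_OPEN <= local.time() <= WINDOW_CLOSE
```

The scheduler would call something like this before enqueuing the
re-aggregation batch and skip the cycle when it returns False.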
Workers (ingestion, parser, extractor, aggregation, recommendation,
broker, lake-publisher) now check the pipeline:enabled Redis flag on
each loop iteration and sleep when disabled.
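
A rough sketch of the per-iteration flag check, under stated
assumptions: the flag value is stored as "1" when enabled, a missing key
counts as disabled, and `store` is any client exposing `get(key)` (for
example a redis-py client). `run_worker`, `process_one`, and the idle
interval are hypothetical names, not the workers' real API:

```python
import time
from typing import Callable, Optional

PIPELINE_FLAG_KEY = "pipeline:enabled"


def pipeline_enabled(store) -> bool:
    """Read the flag. The "1" encoding is an assumption of this sketch."""
    value = store.get(PIPELINE_FLAG_KEY)
    if isinstance(value, bytes):  # redis-py returns bytes by default
        value = value.decode()
    return value == "1"


def run_worker(
    store,
    process_one: Callable[[], None],
    idle_seconds: float = 5.0,
    max_iterations: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Loop forever (or for max_iterations), sleeping while the flag is off."""
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        if not pipeline_enabled(store):
            sleep(idle_seconds)  # pipeline disabled: idle, then re-check
            continue
        process_one()
```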
The toggle endpoint flushes all pipeline queues on disable so that
stale queued jobs are not processed once workers see the flag enabled
again. Broker/trading queues are excluded from the flush to avoid
dropping in-flight orders.
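
One way the flush-on-disable step could look, as a sketch only. The
queue key names below are hypothetical (the real names are not given
here), and `client` is assumed to expose redis-py's `delete(*names)`:

```python
from typing import FrozenSet, Iterable, List

# Hypothetical key names for the queues that must survive a flush.
PROTECTED_QUEUES: FrozenSet[str] = frozenset({"queue:broker", "queue:trading"})


def queues_to_flush(
    all_queues: Iterable[str],
    protected: FrozenSet[str] = PROTECTED_QUEUES,
) -> List[str]:
    """Return every queue except the protected broker/trading ones."""
    return [q for q in all_queues if q not in protected]


def flush_pipeline_queues(client, all_queues: Iterable[str]) -> List[str]:
    """Delete the non-protected queues and report which ones were flushed."""
    targets = queues_to_flush(all_queues)
    if targets:
        client.delete(*targets)  # redis-py: delete(*names)
    return targets
```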
- pipelineEnabled: true in beta so all pods run (keeps Kargo happy)
- PIPELINE_DEFAULT_OFF=true in beta config — scheduler initializes
the Redis toggle to OFF on first boot
- Shared Ollama (10.1.1.12:2701) between beta and paper
- Flip pipeline ON from the UI when testing, OFF when done
- Optimistic UI update for the toggle button
- Added pipelineEnabled flag to Helm values (default: true)
- Worker services (scheduler, ingestion, parser, extractor, aggregation,
recommendation, broker-adapter, lake-publisher) scale to 0 when disabled
- API services always run regardless of toggle
- Redis-based runtime toggle: POST /api/ops/pipeline/toggle
- Scheduler checks the flag before each cycle
- Frontend: green/red Pipeline ON/OFF button on the pipeline page
- Beta defaults to pipelineEnabled: false
- Base values.yaml: blanked external URLs (Ollama, Polygon, Alpaca)
so stages only connect to what they explicitly configure
The recovery sweep's 30-minute threshold was shorter than the queue
drain time, so the sweep re-enqueued docs that were already queued but
not yet processed. Bumped the threshold to 4 hours, with a matching
marker TTL.
Recovery sweeps and the retry endpoint now check a per-document Redis
key (SET NX, 1h TTL) before pushing to the queue. If the marker exists,
the doc is already enqueued and gets skipped. This prevents the
scheduler from re-enqueuing the same parsed docs every 5 minutes.
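
The marker check above can be sketched roughly as follows. The key
prefix, queue name, and function names are hypothetical. `client` is
assumed to behave like a redis-py client, where
`set(name, value, nx=True, ex=ttl)` returns True when the key was
created and None when it already exists:

```python
from typing import Iterable, List


def try_mark_enqueued(client, doc_id: str, ttl_seconds: int = 3600) -> bool:
    """Atomically claim the per-document marker (SET NX with a TTL)."""
    # "enqueued:" is a hypothetical key prefix for this sketch.
    return bool(client.set(f"enqueued:{doc_id}", "1", nx=True, ex=ttl_seconds))


def enqueue_once(client, queue: str, doc_ids: Iterable[str]) -> List[str]:
    """Push only docs whose marker did not exist; return the pushed ids."""
    pushed = []
    for doc_id in doc_ids:
        if try_mark_enqueued(client, doc_id):
            client.rpush(queue, doc_id)
            pushed.append(doc_id)
    return pushed
```

Because SET NX is a single atomic command, two sweeps racing on the same
document cannot both claim the marker.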
- POST /api/ops/pipeline/retry-failed endpoint resets extraction_failed
docs to parsed, deletes failed intelligence rows, and re-enqueues
them (batch of 200)
- Scheduler now auto-retries extraction_failed docs every ~10 minutes
(100 per cycle, 60-min cooldown per doc)
- Pipeline page shows 'Retry Failed (N)' button when extraction_failed
count > 0, with pending/success/error states
- New 'intraday_bars' endpoint in PolygonMarketAdapter: fetches hourly
bars for today using range_bars URL with timespan=hour, sort=asc
- Scheduler expands intraday_bars global source into per-ticker jobs
for all active companies (every 15 minutes via polling_interval)
- Migration 025 inserts the intraday source with 900s cadence
- Frontend price matching uses closest-timestamp instead of date-string
matching, with 2h tolerance for intraday and 36h for daily windows
- Bumped market price fetch limit to 200 for intraday granularity
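
The closest-timestamp matching mentioned above, sketched in Python for
illustration only (the frontend code itself is not shown here). It
assumes timestamps are epoch milliseconds sorted ascending; all names
are hypothetical:

```python
from bisect import bisect_left
from typing import Optional, Sequence

HOUR_MS = 3_600_000
INTRADAY_TOLERANCE_MS = 2 * HOUR_MS   # 2h tolerance for intraday windows
DAILY_TOLERANCE_MS = 36 * HOUR_MS     # 36h tolerance for daily windows


def closest_price(
    timestamps: Sequence[int],
    prices: Sequence[float],
    target: int,
    tolerance_ms: int,
) -> Optional[float]:
    """Return the price whose timestamp is nearest to `target`.

    Returns None when there is no data or the nearest bar is farther
    away than `tolerance_ms`.
    """
    if not timestamps:
        return None
    i = bisect_left(timestamps, target)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(timestamps)]
    best = min(candidates, key=lambda j: abs(timestamps[j] - target))
    if abs(timestamps[best] - target) > tolerance_ms:
        return None
    return prices[best]
```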
Two tiers of market data:
1. Per-ticker prev bars (existing 50 sources, 15-min cadence) for
watchlist detail — trading decisions, stop-loss, position sizing
2. Grouped daily (new single source, once per day) for broad market
context — correlation analysis, sector rotation, competitive intel
Changes:
- Add grouped_daily endpoint to PolygonMarketAdapter with auto date
calculation (previous trading day, skip weekends)
- Add fetch_global_market_sources() to scheduler for sources without
company_id, scheduled once daily (86400s cadence)
- Update _persist_market_items to use item-level ticker from T field
and look up company_id dynamically for grouped daily bars
- Migration 020: make company_id nullable on sources and
market_snapshots tables, add grouped daily source row
- Fix backtest replay to query market_snapshots data->>'c' for prices
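
A minimal sketch of the auto date calculation for grouped_daily
(previous trading day, skipping weekends). The function name is
hypothetical, and market holidays are deliberately not handled, since
only weekend skipping is described above:

```python
from datetime import date, timedelta


def previous_trading_day(today: date) -> date:
    """Step back one day, then keep stepping back over Saturday/Sunday."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day
```

For example, a Monday maps to the preceding Friday, and a Tuesday maps
to the preceding Monday.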
- Increase market_api polling interval from 60s to 900s (15 min).
The prev-day bar endpoint returns the same data all day, so polling
every minute wastes API quota. 50 tickers at 15-min cadence = ~3.3
req/min, well within the 5/min rate limit.
- Reduce market_api rate limit from 30/min to 5/min to match.
- Fix backtest replay to query market_snapshots with data->>'c' for
close prices instead of nonexistent market_data.close_price column.
- Enrich backtest recommendations with prices from market_snapshots
and sectors from companies table.
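
The quota arithmetic behind the market_api change above (50 tickers
polled every 900s against a 5/min limit), as a small check. The helper
name is hypothetical:

```python
def requests_per_minute(source_count: int, cadence_seconds: int) -> float:
    """Average request rate when each source is polled once per cadence."""
    return source_count * 60 / cadence_seconds


RATE_LIMIT_PER_MIN = 5

before = requests_per_minute(50, 60)   # one poll per ticker per minute
after = requests_per_minute(50, 900)   # one poll per ticker per 15 minutes

print(f"before: {before:.1f} req/min, after: {after:.2f} req/min")
print("within limit:", after <= RATE_LIMIT_PER_MIN)
```

This is an average rate. It assumes the 50 per-ticker requests are
spread across the 15-minute window rather than fired in one burst.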