- LLMClient Protocol for provider-agnostic inference
- VLLMClient for OpenAI-compatible /v1/chat/completions API
- LLM client factory with provider routing (ollama/vllm)
- VLLMConfig with VLLM_* environment variable loading
- Updated extractor worker with health check and provider switching
- Updated event classifier to use LLMClient protocol
- Helm values for vLLM configuration
- 18 unit tests + 6 property-based tests
- Full backward compatibility preserved
Beta was pointing at stonks_beta DB where tables were owned by postgres
superuser, causing permission denied for the stonks app user. Switch to
sharing stonks_paper DB/user (already has proper grants). DEPLOY_STAGE=beta
still isolates Redis keys and MinIO buckets. Added market data API key
so beta can test ingestion when pipeline is toggled ON.
- pipelineEnabled: true in beta so all pods run (Kargo happy)
- PIPELINE_DEFAULT_OFF=true in beta config — scheduler initializes
the Redis toggle to OFF on first boot
- Shared Ollama (10.1.1.12:2701) between beta and paper
- Flip pipeline ON from the UI when testing, OFF when done
- Optimistic UI update for the toggle button
kubectl wait fails immediately with 'no matching resources found' if
pods haven't been created yet. Added a poll loop to wait for all 3
infra pods (postgres, redis, minio) to exist before running wait.
Tests complete in ~7s. The 10-minute timeout was causing unnecessary
wait time on failures. Reduced Job activeDeadlineSeconds and kubectl
wait timeout to 300s.
- Added pipelineEnabled flag to Helm values (default: true)
- Worker services (scheduler, ingestion, parser, extractor, aggregation,
recommendation, broker-adapter, lake-publisher) scale to 0 when disabled
- API services always run regardless of toggle
- Redis-based runtime toggle: POST /api/ops/pipeline/toggle
- Scheduler checks the flag before each cycle
- Frontend: green/red Pipeline ON/OFF button on the pipeline page
- Beta defaults to pipelineEnabled: false
- Base values.yaml: blanked external URLs (Ollama, Polygon, Alpaca)
so stages only connect to what they explicitly configure
Base values.yaml now has empty OLLAMA_BASE_URL, MARKET_DATA_BASE_URL,
and BROKER_PROVIDER. Only paper (and eventually live) set the real
URLs. Beta inherits empty defaults so it can't reach external services.
Beta is for API testing only. Scale scheduler, ingestion, parser,
extractor, aggregation, recommendation, broker-adapter, and
lake-publisher to 0 replicas. Blank out Polygon and Alpaca keys.
Infra secrets (postgres, redis, minio) kept so API services work.
Beta is for API testing only. Blanked out Polygon/Alpaca/Ollama
credentials, set OLLAMA_BASE_URL to localhost:99999, and scaled
scheduler/ingestion/parser/extractor/aggregation/recommendation/
broker-adapter/lake-publisher to 0 replicas.
- All paper stage credentials now in values-paper.yaml so ArgoCD
renders them correctly on every sync (no more empty secrets)
- Added seed-if-empty init container to scheduler: runs the seed
script if the companies table is empty after migrations
The extraction queue had 3000+ SEC filings backed up with a single
extractor pod processing them at 10-115s each. Ollama handles
concurrent requests so multiple extractor pods can share the GPU.
The polling loop checked conditions[0].type which missed the Complete
condition when it wasn't at index 0. Switch to kubectl wait
--for=condition=complete which handles condition matching reliably.
- Poll job status instead of kubectl wait (catches Failed condition
immediately instead of waiting 600s for Complete that never comes)
- Replace grep -oP (Perl regex) with POSIX grep -o (BusyBox compat)
BusyBox mktemp in alpine/k8s doesn't support .json suffix in template.
The mktemp failure triggered set -e, causing pipeline to report failure
despite all 93 tests passing.
- Remove minio-bucket-init Job entirely (seed_minio.py creates bucket)
- Wait for pods to exist before kubectl wait --for=condition=ready
- Fixes 'no matching resources found' race when pods are still ContainerCreating
Migration 028: For each recommendation with no evidence rows, finds
the closest matching trend_window (by ticker + time_horizon + timestamp)
and re-inserts evidence from top_supporting/opposing_evidence arrays.
Filters out non-UUID pattern IDs and verifies documents exist.
This fixes 'No evidence linked' on recommendations created before the
UUID filtering fix in persist_recommendation.
- Migration 026 and OllamaConfig now default to qwen3.5:9b instead of
llama3.1:8b. Existing deployments keep their current model (qwen3.5:9b-fast)
since the migration uses WHERE NOT EXISTS on slug.
- Event classifier system prompt expanded with macro-vs-company filtering:
explicitly instructs the model to NOT classify single-company news
(lawsuits, earnings, management changes, debt crises) as macro events.
Sets severity=low and confidence<0.3 for company-specific articles.
Reserves 'critical' severity for multi-country/global market events.
Prevents over-tagging event_types by requiring direct description.
- Updated test_system_prompt_is_concise threshold to accommodate the
expanded prompt (300 → 1000 chars).
New Agents tab in the sidebar (Ops group) for viewing, editing, and
creating AI agent configurations:
Database (migration 026):
- ai_agents table: editable configs for each LLM agent (model, prompts,
temperature, tokens, retries). source='system' for built-in,
source='user' for custom. Seeds 3 system agents (Document Extractor,
Event Classifier, Thesis Rewriter) using WHERE NOT EXISTS to never
overwrite user edits across reinstalls.
- agent_performance_log table: per-invocation metrics (duration,
confidence, retries, tokens, errors) linked to agent config.
API endpoints:
- GET/POST /api/agents — list and create agents
- GET/PUT/DELETE /api/agents/{id} — view, edit, delete (system agents
can be edited but not deleted)
- GET /api/agents/{id}/performance — aggregated metrics (success rate,
avg/p95 latency, confidence, token usage)
- GET /api/agents/{id}/performance/history — hourly time series
Frontend:
- AgentsPage with sidebar list + detail panel
- Agent detail: config display, system prompt viewer, performance
dashboard with metrics cards and time-series chart
- Edit form: all config fields editable including system prompt,
model, temperature, tokens, retries
- Create form: new user-defined agents with auto-slug generation
- System agents show blue badge, user agents show green badge
Migration 023 was deleting all but the latest trend_windows row per
entity before 024 could save them to trend_history. On reinstall,
this wiped the entire history every time.
Fixed by restructuring:
- 023 now creates trend_history FIRST and copies all trend_windows
rows into it before deduplicating trend_windows down to latest-only.
Uses NOT EXISTS to avoid duplicating rows on re-runs.
- 024 is now idempotent: ensures table/indexes exist and backfills
from recommendations (last 7 days, 1 point per ticker/window/hour)
to reconstruct approximate history even if trend_windows was sparse.
Both migrations are safe to re-run on existing databases.
- New 'intraday_bars' endpoint in PolygonMarketAdapter: fetches hourly
bars for today using range_bars URL with timespan=hour, sort=asc
- Scheduler expands intraday_bars global source into per-ticker jobs
for all active companies (every 15 minutes via polling_interval)
- Migration 025 inserts the intraday source with 900s cadence
- Frontend price matching uses closest-timestamp instead of date-string
matching, with 2h tolerance for intraday and 36h for daily windows
- Bumped market price fetch limit to 200 for intraday granularity