Celes Renata 007189c0a5
fix: handle plain-text thinking blocks and disable think mode
The model outputs 'Thinking Process:' as plain text (not in <think> tags).
Updated _strip_thinking_block to handle both XML tags and plain-text
reasoning patterns. Also:
- Added rule 7 to system prompt: 'Do NOT show your thinking process'
- Set think=False in Ollama payload to disable Qwen3 thinking mode
- Added fallback regex to extract thesis from after thinking blocks
2026-04-29 15:50:49 +00:00

Stonks Oracle

Copyright (c) 2025-2026 Celes Hillyerd. All rights reserved.

Licensed under the Business Source License 1.1. Production use requires written approval from the author. See LICENSE for details.


AI-powered market intelligence and autonomous paper-trading platform. Ingests market data, company news, and regulatory filings; extracts structured intelligence with local LLMs; aggregates signals across three layers (company, macro, competitive); and autonomously executes paper trades — all self-hosted on Kubernetes.

Documentation

| Document | Description |
| --- | --- |
| Service Reference | All 13 services: purpose, configuration, queue topology, database tables |
| API Reference | Complete endpoint reference for Query API, Symbol Registry, Trading, and Risk services |
| Helm Chart Reference | All Helm values: services, config, secrets, ingress, network policies, analytics stack |
| Docker Deployment Guide | Docker Compose setup, environment variables, volumes, operational commands |
| Kubernetes Architecture | Mermaid diagram of the K8s deployment topology, namespaces, ingress, and secrets |
| Docker Compose Architecture | Mermaid diagram of all containers, port mappings, volumes, and dependencies |
| Data Pipeline Architecture | Mermaid diagram of the end-to-end data pipeline, queue topology, and signal layers |
| AI Agents Guide | Built-in agents, variant management, prompt tuning, and performance monitoring |
| Backup & Restore Guide | Backup scripts, restore procedures, retention policies, and disaster recovery |
| Observability Reference | Prometheus metrics, alerting rules, structured logging, and dead-letter queues |

What It Does

Stonks Oracle tracks 50 companies across 10 sectors. It monitors multiple data sources, runs every article and filing through a local Ollama model to extract structured intelligence, aggregates those signals into rolling trend summaries with contradiction detection, and generates explainable trade recommendations. An autonomous trading engine then evaluates those recommendations and executes paper trades through Alpaca without manual intervention.

Everything is auditable — raw artifacts, prompts, model outputs, decision traces, and trade execution logs are preserved. Historical data flows into a MinIO-backed lakehouse queryable via Trino and visualized through Superset dashboards and a built-in React dashboard.

Architecture

flowchart LR
    subgraph sources ["Data Sources"]
        polygon["Polygon.io"]
        sec["SEC EDGAR"]
        macro_src["Macro News"]
    end

    subgraph pipeline ["Signal Processing"]
        scheduler["Scheduler"]
        ingestion["Ingestion"]
        parser["Parser"]
        extractor["Extractor"]
        aggregation["Aggregation"]
        recommendation["Recommendation"]
    end

    subgraph trading ["Trading"]
        risk["Risk Engine"]
        engine["Trading Engine"]
        broker["Broker Adapter"]
        alpaca["Alpaca (paper)"]
    end

    subgraph analytics ["Analytics"]
        lake["Lake Publisher"]
        trino["Trino"]
        superset["Superset"]
        dashboard["Dashboard"]
    end

    sources --> scheduler --> ingestion --> parser --> extractor --> aggregation --> recommendation
    recommendation --> risk --> engine --> broker --> alpaca
    aggregation --> lake --> trino --> superset
    trino --> dashboard

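For detailed architecture diagrams, see the Kubernetes Architecture, Docker Compose Architecture, and Data Pipeline Architecture documents listed under Documentation.

The platform is split into two planes: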

  • Operational — ingestion, parsing, extraction, aggregation, recommendations, risk evaluation, autonomous trading, trade execution (PostgreSQL, Redis, MinIO)
  • Analytical — historical fact tables, SQL queries, dashboards (MinIO/Parquet, Trino, Superset)

Signal Layers

The aggregation engine merges signals from three independent layers via a unified WeightedSignal abstraction. Each layer has a runtime toggle — no restart required.

| Layer | Source | What It Does |
| --- | --- | --- |
| Layer 1: Company | News, filings, market data | Document intelligence extraction → per-company impact records → trend windows |
| Layer 2: Macro | Global news, geopolitical events | Ollama-based event classification → exposure profile matching → per-company macro impact scores |
| Layer 3: Competitive | Historical platform data | Pattern mining on past catalyst outcomes → cross-company signal propagation via competitor relationships |

  • Pattern-only or macro-only trend shifts are forced to informational mode (suppression safety)
  • Macro weight default: 0.3, competitive weight default: 0.2
  • Toggles: macro_enabled and competitive_enabled in risk_configs
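
As a rough illustration of how the three layers might be merged, the sketch below uses a hypothetical `WeightedSignal` dataclass and a hypothetical `merge_signals` helper. Only the default macro and competitive weights (0.3 and 0.2) and the two toggle names come from this README; the company weight of 1.0, the weighted-average formula, and the rule "no company-layer evidence means informational only" are assumptions made for the example.

```python
from dataclasses import dataclass

# Macro and competitive defaults are quoted from this README.
# The company weight of 1.0 is an assumption for illustration.
DEFAULT_WEIGHTS = {"company": 1.0, "macro": 0.3, "competitive": 0.2}


@dataclass(frozen=True)
class WeightedSignal:
    """Hypothetical shape; the real abstraction may carry more fields."""
    layer: str    # "company", "macro", or "competitive"
    score: float  # signed impact, assumed to lie in [-1, 1]


def merge_signals(signals, macro_enabled=True, competitive_enabled=True):
    """Return (weighted average score, informational_only flag)."""
    enabled = {
        "company": True,
        "macro": macro_enabled,
        "competitive": competitive_enabled,
    }
    # Signals from disabled or unknown layers are ignored.
    active = [s for s in signals if enabled.get(s.layer, False)]
    total_weight = sum(DEFAULT_WEIGHTS[s.layer] for s in active)
    if total_weight == 0:
        return 0.0, True
    score = sum(DEFAULT_WEIGHTS[s.layer] * s.score for s in active) / total_weight
    # Suppression safety (assumed rule): without company-layer evidence,
    # the result is flagged informational rather than actionable.
    informational_only = not any(s.layer == "company" for s in active)
    return score, informational_only
```

Passing `macro_enabled=False` drops macro signals from the average without touching the other layers, which mirrors the idea of a runtime toggle.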

Tracked Universe

50 companies across 10 sectors: Technology, Consumer Cyclical, Financial Services, Healthcare, Energy, Communication Services, Industrials, Consumer Defensive, Real Estate, Utilities.

46 competitor relationships (direct_rival, same_sector, overlapping_products, supply_chain_adjacent).

Seed data: python -m services.symbol_registry.seed

Features

Autonomous Trading Engine

Continuous decision loop that polls for actionable recommendations and executes paper trades without manual intervention. Includes:

  • Confidence-based position sizing, with sample-size-dampened agreement scoring to prevent thin-evidence inflation
  • Dynamic stop-loss/take-profit (ATR-based)
  • Circuit breakers (daily loss cap, single-position loss, volatility detection)
  • Reserve pool management (auto-siphon from profits)
  • Risk tier auto-adjustment (conservative/moderate/aggressive based on trailing performance)
  • Portfolio rebalancing (sector and concentration limits)
  • Gradual entry (multi-tranche orders)
  • Correlation-aware diversification
  • Earnings calendar awareness
  • Portfolio heat management
  • Tax-lot tracking with wash sale detection
  • Performance tracking (Sharpe, drawdown, win rate, profit factor)
  • Backtesting against historical data
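
To make the ATR-based stop-loss/take-profit idea concrete, here is a minimal sketch for a long position. The function names, the simple-mean ATR (rather than Wilder smoothing), the 14-bar period, and the 2.0/3.0 multipliers are all assumptions for illustration, not the engine's actual settings.

```python
def true_range(high, low, prev_close):
    """Largest of the bar's range and its gaps from the previous close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))


def average_true_range(bars, period=14):
    """Simple mean of the last `period` true ranges.

    bars: list of (high, low, close) tuples, oldest first.
    """
    if len(bars) < period + 1:
        raise ValueError("need at least period + 1 bars")
    ranges = [
        true_range(high, low, bars[i - 1][2])
        for i, (high, low, _close) in enumerate(bars)
        if i > 0
    ]
    return sum(ranges[-period:]) / period


def stop_levels(entry_price, atr, stop_mult=2.0, target_mult=3.0):
    """Return (stop_loss, take_profit) for a long position."""
    return entry_price - stop_mult * atr, entry_price + target_mult * atr
```

Scaling both levels by ATR widens them for volatile symbols and tightens them for quiet ones, which is the usual motivation for this approach.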

AI Agent Management

Configurable AI agents (document extractor, event classifier, thesis rewriter) with database-driven model/prompt resolution. 60-second TTL cache for hot-swapping models without restarts. Agent performance logging with variant attribution for future A/B testing support.
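
The hot-swap behaviour can be pictured as a small time-to-live cache in front of the database lookup. The sketch below is an assumption about the general shape, not the project's code: `TTLCache` and the `loader` callback are hypothetical names, and only the 60-second TTL comes from this README.

```python
import time


class TTLCache:
    """Cache values for a fixed time, then reload them on the next read."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so tests can control time
        self._entries = {}    # key -> (loaded_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) if missing or stale."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = loader(key)
        self._entries[key] = (now, value)
        return value
```

With this shape, a model or prompt change written to the database is picked up by every service within one TTL window, without a restart.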

Global News Interpolation

Macro/geopolitical event ingestion from dedicated sources. Ollama-based classification by impact type, severity, affected regions, and sectors. Company exposure profiles (geographic revenue mix, supply chain regions, commodity dependencies, market position tier) map events to per-company macro impact scores with resilience modifiers. Forward-looking trend projections combine company momentum with macro trajectories.

Competitive Intelligence

Historical pattern mining on the platform's own data — how similar catalyst types resolved in the past for a company and its competitors. Cross-company signal propagation via competitor relationships. Major corporate decision tracking (M&A, restructuring, leadership changes) with extended lookback windows. Auto-inference of competitor relationships from sector matching and document co-mentions.

Data Ingestion

  • Grouped daily market data from Polygon.io (OHLCV bars, corporate actions)
  • Company news via news APIs with full article scraping
  • SEC filings and regulatory events
  • Macro/geopolitical news from dedicated sources
  • Content hash deduplication, rate limiting, retries, raw artifact preservation in MinIO
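
Content hash deduplication, as listed above, can be sketched in a few lines. The normalization step (collapsing whitespace and lowercasing), the choice of SHA-256, and the in-memory set are assumptions; a real deployment would more likely keep the seen hashes in Redis or PostgreSQL.

```python
import hashlib


def content_hash(text):
    """SHA-256 of the text after collapsing whitespace and lowercasing."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class Deduplicator:
    """Tracks seen content hashes; the in-memory set is a stand-in store."""

    def __init__(self):
        self._seen = set()

    def is_new(self, text):
        """Return True the first time a piece of content is seen."""
        digest = content_hash(text)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True
```

Hashing the normalized body rather than the URL catches the same article syndicated under several addresses.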

AI-Powered Extraction

  • Local Ollama models with schema-constrained JSON output
  • Per-document intelligence: sentiment, catalysts, impact horizon, key facts, risks, macro themes
  • Per-company impact records when a document mentions multiple companies
  • Schema and semantic validation with retry on invalid outputs
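
The validate-and-retry loop mentioned above might look roughly like the following. Everything here is illustrative: `generate` is a hypothetical callable standing in for the Ollama request, `REQUIRED_FIELDS` is a made-up subset of the real schema, and feeding the previous errors back into the next attempt is an assumed design.

```python
import json

# Hypothetical subset of the extraction schema, for illustration only.
REQUIRED_FIELDS = {"sentiment": str, "catalysts": list, "key_facts": list}


def validate(payload):
    """Return a list of schema problems; an empty list means valid."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected_type):
            errors.append(f"wrong type for {name}")
    return errors


def extract_with_retry(generate, document, max_attempts=3):
    """Call generate(document, previous_errors) until the output validates."""
    last_errors = []
    for _ in range(max_attempts):
        raw = generate(document, last_errors)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_errors = [f"invalid JSON: {exc.msg}"]
            continue
        if not isinstance(payload, dict):
            last_errors = ["top-level value must be an object"]
            continue
        last_errors = validate(payload)
        if not last_errors:
            return payload
    raise ValueError(f"extraction failed after {max_attempts} attempts: {last_errors}")
```

Raising after the final attempt gives the caller a clear point at which to route the document to a dead-letter queue.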

Trend Aggregation

  • Rolling company-level trend summaries across 5 windows (intraday, 1d, 7d, 30d, 90d)
  • Recency decay, source credibility weighting, document novelty scoring
  • Contradiction detection with explicit disagreement representation
  • Sector and market-level rollups incorporating macro event impacts
  • Forward-looking trend projections with driving factor explanations
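
Recency decay and source credibility weighting can be combined into a single weight per observation. The sketch below assumes exponential decay with a configurable half-life and a credibility score in [0, 1]; the function names, the 24-hour default, and the formula are illustrative and not taken from the aggregation service.

```python
def decay_weight(age_hours, half_life_hours):
    """Weight halves every `half_life_hours`; a fresh item weighs 1.0."""
    return 0.5 ** (age_hours / half_life_hours)


def weighted_sentiment(observations, half_life_hours=24.0):
    """Credibility- and recency-weighted mean sentiment.

    observations: iterable of (sentiment, age_hours, credibility) tuples.
    Returns 0.0 when there is no weighted evidence.
    """
    numerator = 0.0
    denominator = 0.0
    for sentiment, age_hours, credibility in observations:
        weight = decay_weight(age_hours, half_life_hours) * credibility
        numerator += weight * sentiment
        denominator += weight
    return numerator / denominator if denominator else 0.0
```

A longer window such as 30d or 90d would presumably use a longer half-life, so that older documents still contribute.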

Paper Trading

  • Alpaca paper trading integration (3 accounts max per Alpaca owner)
  • Full reset: liquidates broker positions, cancels orders, clears local DB, syncs capital from broker's actual account balance
  • No manual capital controls — engine capital always derived from broker state on reset
  • Moderate risk tier default, auto-adjustable
  • Full execution audit trail from signal to broker response
  • Operator approval workflow available for live mode

Notification Service

  • AWS SNS for SMS alerts on critical events (circuit breaker triggers, risk tier changes, large trades)
  • Gmail API for email alerts and daily performance summaries
  • Configurable alert channels and thresholds

Lakehouse and SQL Analytics

  • Parquet fact tables on MinIO with Hive-compatible partitioning
  • Iceberg table metadata for schema evolution
  • Trino SQL engine for ad-hoc queries
  • Fact tables: market bars, documents, extractions, trade signals, orders, fills, positions, PnL, global events, macro impacts, competitive signals, trend projections
  • Apache Superset for pre-built dashboards
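
Hive-compatible partitioning means that partition values are encoded in the object path as `key=value` directories, which lets Trino prune partitions from the path alone. The helper below is a sketch under assumptions: the partition columns (`symbol`, `year`, `month`, `day`), the table name, and the file name are hypothetical and may not match the lake publisher's layout.

```python
from datetime import date


def partition_path(table, symbol, trading_day, filename="part-0000.parquet"):
    """Build a Hive-style object key such as table/symbol=X/year=Y/..."""
    return (
        f"{table}/symbol={symbol}/year={trading_day.year}/"
        f"month={trading_day.month:02d}/day={trading_day.day:02d}/{filename}"
    )


print(partition_path("market_bars", "AAPL", date(2026, 4, 29)))
```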

Web Dashboard

  • React/TypeScript SPA with Tailwind CSS
  • Company, watchlist, and source management
  • Document timeline with intelligence drill-down
  • Trend visualization with evidence chain navigation (company, macro, and competitive signals distinguished)
  • Trading engine overview: risk tier, circuit breaker status, active/reserve pool, portfolio heat, P&L
  • Portfolio composition, trade history, backtesting panel
  • Global events browser, macro exposure panels, trend projection visualization
  • Competitor relationship management, historical pattern explorer, corporate decision timeline
  • DevOps dashboards: pipeline health, ingestion throughput, model performance
  • Interactive SQL explorer with Monaco Editor and chart builder

Observability

  • Structured JSON logging across all services
  • Prometheus metrics for every pipeline stage
  • Alerting for source failures, schema failure spikes, analytical lag, broker issues, and trading anomalies
  • Dead-letter queues with replay tooling
  • Data retention and lifecycle controls

Services

| Service | Description |
| --- | --- |
| scheduler | Triggers ingestion cycles based on source polling intervals |
| symbol-registry | Manages companies, aliases, watchlists, sources, exposure profiles, and competitor relationships |
| ingestion | Fetches market data, news, filings, and macro events from external APIs |
| parser | Normalizes raw HTML/text, reduces boilerplate, scores parse quality |
| extractor | Runs Ollama extraction to produce document intelligence and global event classifications |
| aggregation | Computes rolling trend summaries with contradiction detection and trend projections |
| recommendation | Generates trade recommendations from aggregated evidence across all signal layers |
| risk | Evaluates orders against portfolio risk controls |
| trading-engine | Autonomous decision loop: position sizing, stop-loss, circuit breakers, reserve pool, rebalancing |
| broker-adapter | Interfaces with Alpaca for paper/live order execution |
| lake-publisher | Writes analytical Parquet datasets to MinIO |
| query-api | REST API for all operational and analytical queries |
| dashboard | React SPA served via nginx |

Tech Stack

  • Language: Python 3.12, TypeScript (frontend)
  • AI: Ollama (local LLM inference with structured JSON output)
  • Databases: PostgreSQL 16, Redis 7
  • Object Storage: MinIO (S3-compatible)
  • Lakehouse: Parquet + Hive partitioning + Iceberg metadata
  • SQL Engine: Trino
  • BI: Apache Superset
  • Frontend: React 19, Vite, TanStack Router/Query, Recharts, Monaco Editor, Tailwind CSS
  • Infrastructure: Kubernetes (k3s), Helm, Traefik ingress, cert-manager
  • CI/CD: GitHub Actions → GHCR container registry
  • Broker: Alpaca (paper trading)
  • Market Data: Polygon.io
  • Notifications: AWS SNS (SMS), Gmail API (email)

Project Structure

├── services/
│   ├── shared/          # Config, schemas, Redis keys, logging, audit
│   ├── scheduler/       # Job scheduling and source polling
│   ├── symbol_registry/ # Company, source, exposure profile, competitor management API
│   ├── ingestion/       # External API adapters and raw artifact storage
│   ├── parser/          # HTML parsing, boilerplate reduction, quality scoring
│   ├── extractor/       # Ollama extraction, event classification, schema validation
│   ├── aggregation/     # Trend computation, contradiction detection, projections
│   ├── recommendation/  # Recommendation generation and suppression
│   ├── risk/            # Risk evaluation and approval workflow
│   ├── trading/         # Autonomous trading engine, backtester, performance tracker
│   ├── adapters/        # Broker API integration
│   ├── lake_publisher/  # Parquet fact table publication
│   └── api/             # Query API (FastAPI)
├── frontend/            # React dashboard SPA
├── infra/
│   ├── helm/            # Helm chart for Kubernetes deployment
│   ├── k8s/             # Raw Kubernetes manifests
│   ├── migrations/      # PostgreSQL schema migrations
│   ├── trino/           # Trino catalog configuration
│   ├── hive/            # Hive metastore configuration
│   ├── minio/           # MinIO lifecycle policies
│   └── superset/        # Superset configuration
├── scripts/             # Backup/restore scripts (backup-db.sh, restore-db.sh, backup-redis.sh)
├── dashboards/          # Superset dashboard JSON exports
├── tests/               # Python test suite
└── docker/              # Dockerfiles for services and Superset

Local Development

Prerequisites: Python 3.12, Node.js 24, Docker

# Start infrastructure
docker compose up -d

# Install Python dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run tests
python -m pytest tests/ -x --tb=short -q

# Frontend
cd frontend
npm install
npx vitest --run

Deployment

The platform runs on Kubernetes (k3s cluster, 4 NixOS nodes). Full deployment is handled by runmefirst.sh, which sets up the database, runs migrations, and deploys via Helm.

# Full deploy (from gremlin-1 where secrets are available):
bash ~/sources/kube/stonks-oracle/runmefirst.sh

# Quick Helm upgrade after CI builds new images:
helm upgrade --install stonks-oracle infra/helm/stonks-oracle -n stonks-oracle

# Restart a specific service:
kubectl rollout restart deployment/<service-name> -n stonks-oracle

Secrets are stored in ~/sources/kube/stonks-oracle/ on the deploy host — not in the repo. The deploy script reads them from disk and injects them via Helm --set flags. See the runbook for operational details.

Live Endpoints

| Service | URL |
| --- | --- |
| Dashboard | https://stonks.celestium.life |
| Query API | https://stonks-api.celestium.life |
| Symbol Registry | https://stonks-registry.celestium.life |
| Trading Engine | https://stonks-trading.celestium.life |
| Superset | https://stonks-dash.celestium.life |
| Trino | https://stonks-trino.celestium.life |

License

Private repository.
