feat: comprehensive docs, unit tests, docker-compose app services
- Add scheduler and ingestion unit tests (test_scheduler_unit.py, test_ingestion_unit.py)
- Add all 13 app services + dashboard to docker-compose.yml
- Add full documentation suite: API reference, Helm reference, Docker deployment guide, 3 architecture diagrams (K8s, Docker Compose, data pipeline), AI agent guide, backup/restore guide, observability/metrics reference, per-service docs
- Add intelligence pipeline deep-dive docs with Mermaid diagrams
- Update README with documentation index and links
- Add specs for comprehensive-quality-docs, intelligence-pipeline-deep-dive, sanitized-pipeline-docs
@@ -0,0 +1 @@
{"specId": "e6d189b2-5861-4e24-954f-5e254246a910", "workflowType": "requirements-first", "specType": "feature"}
@@ -0,0 +1,341 @@
# Design Document: Sanitized Pipeline Documentation

## Overview

This design specifies the process and structure for producing a sanitized version of the 6-page intelligence pipeline deep dive documentation. The sanitized docs transform the existing `docs/intelligence-pipeline-deep-dive/` content into domain-neutral equivalents stored at `docs/sanitized-pipeline-deep-dive/`, stripping all financial, market, and trading language while preserving every engineering detail: algorithms, formulas, architectural patterns, queue topologies, database schemas, code module references, and Mermaid diagrams.

The deliverable is a documentation-only transformation. No application code, database schemas, or infrastructure changes are involved. The output is Markdown files and Mermaid diagram files that mirror the original structure with domain-neutral framing.

**Key design decision**: The sanitization is a manual content transformation guided by a defined terminology map. Each source file is read, transformed according to the mapping rules, and written to the output directory. The original files remain untouched.
### Source Material

The source documentation at `docs/intelligence-pipeline-deep-dive/` consists of:

| File | Content |
|------|---------|
| `index.md` | Table of contents, introduction, diagram links, related docs |
| `01-data-ingestion-and-preparation.md` | Scheduler, ingestion worker, deduplication, parser |
| `02-ai-agent-processing-and-extraction.md` | Document extractor, event classifier, JSON repair, validation |
| `03-signal-scoring-and-weighted-signals.md` | Composite weight formula, three signal layers, sentiment mapping |
| `04-trend-aggregation-and-accumulating-signals.md` | Time windows, trend direction, contradiction, evidence ranking, confidence |
| `05-recommendation-generation.md` | Suppression, eligibility, position sizing, thesis, risk classification |
| `06-trading-decisions-and-execution.md` | Trading engine, pre-trade checks, circuit breakers, broker adapter |
| `diagrams/ingestion-to-extraction-flow.md` | Mermaid flowchart: scheduler → ingestion → parser → extractor |
| `diagrams/three-layer-signal-merging.md` | Mermaid flowchart: three signal layers → aggregation |
| `diagrams/weighted-signal-computation.md` | Mermaid flowchart: composite weight formula breakdown |
| `diagrams/trend-accumulation-escalation.md` | Mermaid flowchart: time windows → escalation path |
| `diagrams/recommendation-generation-flow.md` | Mermaid flowchart: suppression → eligibility → thesis → risk |
| `diagrams/trading-engine-decision-loop.md` | Mermaid flowchart: pre-trade checks → position sizing → order submission |
## Architecture

### Output File Organization

The sanitized docs mirror the source structure with sanitized filenames:

```
docs/sanitized-pipeline-deep-dive/
├── index.md
├── 01-data-ingestion-and-preparation.md
├── 02-ai-agent-processing-and-extraction.md
├── 03-signal-scoring-and-weighted-signals.md
├── 04-trend-aggregation-and-accumulating-signals.md
├── 05-recommendation-generation.md
├── 06-decision-execution.md
└── diagrams/
    ├── ingestion-to-extraction-flow.md
    ├── three-layer-signal-merging.md
    ├── weighted-signal-computation.md
    ├── trend-accumulation-escalation.md
    ├── recommendation-generation-flow.md
    └── decision-engine-loop.md
```
**Filename changes from source:**

- `06-trading-decisions-and-execution.md` → `06-decision-execution.md` (removes "trading")
- `diagrams/trading-engine-decision-loop.md` → `diagrams/decision-engine-loop.md` (removes "trading")
- All other filenames are already domain-neutral and remain unchanged
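The rename rule above is small enough to express as a lookup. The following is a minimal sketch, not part of the spec: the names `FILENAME_RENAMES` and `sanitized_path` are hypothetical, and paths are assumed to be relative to the doc group root.

```python
# Hypothetical helper: only the two renames listed above are recorded;
# every other relative path maps to itself.
FILENAME_RENAMES = {
    "06-trading-decisions-and-execution.md": "06-decision-execution.md",
    "diagrams/trading-engine-decision-loop.md": "diagrams/decision-engine-loop.md",
}


def sanitized_path(source_relative_path: str) -> str:
    """Map a path under the source doc group to its sanitized counterpart."""
    return FILENAME_RENAMES.get(source_relative_path, source_relative_path)
```

The same table can drive the reference pass, since a cross-reference to a renamed file needs exactly the same substitution as the file itself.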
### Transformation Process

The sanitization follows a three-pass approach for each file:

1. **Terminology pass**: Apply the terminology map to replace all financial/trading terms with domain-neutral equivalents. This covers inline text, headings, table cells, code blocks, and Mermaid diagram labels.
2. **Reference pass**: Update all internal cross-references to point to sanitized filenames (e.g., `06-trading-decisions-and-execution.md` → `06-decision-execution.md`, `trading-engine-decision-loop.md` → `decision-engine-loop.md`). Remove or neutralize references to external financial docs (e.g., links to `../llm-to-trade-pipeline.md` become neutral descriptions).
3. **Narrative pass**: Reframe example scenarios, inline illustrations, and narrative framing to use domain-neutral language. This pass handles context-dependent replacements that a simple find-and-replace cannot catch, e.g., "a bearish article about AAPL" becomes "a negative-sentiment article about Entity-A".
### Content Flow

The sanitized docs preserve the same page-to-page narrative flow as the originals:

```mermaid
flowchart LR
    P1["Page 1\nData Ingestion"] --> P2["Page 2\nAI Extraction"]
    P2 --> P3["Page 3\nSignal Scoring"]
    P3 --> P4["Page 4\nTrend Aggregation"]
    P4 --> P5["Page 5\nRecommendations"]
    P5 --> P6["Page 6\nDecision Execution"]
```
## Components and Interfaces

### Terminology Map

The core of the sanitization is a defined mapping from financial/trading terms to domain-neutral equivalents. The map is applied consistently across all files.

#### System and Provider Names

| Source Term | Sanitized Replacement |
|-------------|----------------------|
| Stonks Oracle / stonks | the platform / the system |
| Polygon.io / Polygon | external data provider / data source API |
| SEC EDGAR / SEC / EFTS | public records API / regulatory filings source |
| Alpaca / AlpacaBrokerAdapter | execution adapter / external execution API |
| Wall Street | (removed or reframed) |
#### Trading and Financial Actions

| Source Term | Sanitized Replacement |
|-------------|----------------------|
| buy | act |
| sell | defer |
| hold | monitor |
| watch | observe |
| trading engine | decision execution engine |
| paper trading / paper_eligible | simulation mode / simulation_eligible |
| live trading / live_eligible | live execution mode / production_eligible |
| trade / trading (as action) | decision / execution |
| order (broker order) | execution request |
| pre-trade checks | pre-execution checks |
#### Financial Concepts

| Source Term | Sanitized Replacement |
|-------------|----------------------|
| portfolio | resource pool / allocation pool |
| portfolio allocation | resource allocation |
| portfolio heat | pool exposure |
| portfolio snapshots | pool snapshots |
| position sizing | commitment sizing / resource allocation |
| position (open position) | commitment / active commitment |
| stop-loss | risk threshold / loss limit |
| take-profit | gain target |
| bullish | positive / favorable |
| bearish | negative / unfavorable |
| stock ticker / ticker symbol | entity identifier |
| stock market | (removed or reframed) |
| earnings / earnings call / earnings report | performance report / periodic disclosure |
| 10-K / 10-Q / 8-K | regulatory filing types |
| SEC filings | regulatory filings |
| broker / broker API | execution adapter / execution API |
| P&L | gain/loss |
| Sharpe ratio | risk-adjusted return ratio |
| drawdown | peak-to-trough decline |
| win rate | success rate |
#### Ticker Symbols and Company Names

| Source Term | Sanitized Replacement |
|-------------|----------------------|
| AAPL / Apple | Entity-A |
| TSLA / Tesla | Entity-B |
| NVDA / NVIDIA | Entity-C |
| XOM | Entity-D |
| META | Entity-E |
| Any other ticker | Entity-{letter} or "tracked entity" |
#### Redis Keys

| Source Pattern | Sanitized Pattern |
|----------------|-------------------|
| `stonks:queue:*` | `app:queue:*` |
| `stonks:dedupe:*` | `app:dedupe:*` |
| `stonks:ratelimit:*` | `app:ratelimit:*` |
| `stonks:trading:circuit_breaker:*` | `app:execution:circuit_breaker:*` |
| `stonks:dedupe:trading:*` | `app:dedupe:execution:*` |
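Two of these patterns overlap: `stonks:dedupe:trading:*` also matches `stonks:dedupe:*`, so the order in which the rules are applied matters. A minimal sketch of one way to handle that, assuming the rules are checked most-specific-first (the names `KEY_PREFIX_RULES` and `sanitize_redis_key` are hypothetical, not taken from the codebase):

```python
# Hypothetical rule list mirroring the Redis Keys table. The more specific
# "trading" prefixes come first so they win over the generic ones.
KEY_PREFIX_RULES = [
    ("stonks:trading:circuit_breaker:", "app:execution:circuit_breaker:"),
    ("stonks:dedupe:trading:", "app:dedupe:execution:"),
    ("stonks:queue:", "app:queue:"),
    ("stonks:dedupe:", "app:dedupe:"),
    ("stonks:ratelimit:", "app:ratelimit:"),
]


def sanitize_redis_key(key: str) -> str:
    """Rewrite the first matching source prefix; leave other keys alone."""
    for source_prefix, sanitized_prefix in KEY_PREFIX_RULES:
        if key.startswith(source_prefix):
            return sanitized_prefix + key[len(source_prefix):]
    return key
```

With the generic `stonks:dedupe:` rule first, a key such as `stonks:dedupe:trading:abc` would come out as `app:dedupe:trading:abc` and still carry a banned term.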
#### MinIO Buckets

| Source Bucket | Sanitized Bucket |
|---------------|-----------------|
| `stonks-raw-market` | `app-raw-data` |
| `stonks-raw-news` | `app-raw-content` |
| `stonks-raw-filings` | `app-raw-filings` |
| `stonks-normalized` | `app-normalized` |
| `stonks-llm-prompts` | `app-llm-prompts` |
| `stonks-llm-results` | `app-llm-results` |
#### Database Tables

| Source Table | Sanitized Table |
|-------------|----------------|
| `trading_decisions` | `execution_decisions` |
| `portfolio_snapshots` | `pool_snapshots` |
| `portfolio_pct` (column) | `allocation_pct` |

All other table names (`documents`, `document_intelligence`, `trend_windows`, `recommendations`, etc.) are already domain-neutral and remain unchanged.
#### Adapter and Source Type Names

| Source Term | Sanitized Replacement |
|-------------|----------------------|
| `PolygonNewsAdapter` | `ExternalNewsAdapter` |
| `PolygonMarketAdapter` | `ExternalDataAdapter` |
| `SECEdgarAdapter` | `RegulatoryFilingsAdapter` |
| `AlpacaBrokerAdapter` | `ExecutionAdapter` |
| `broker` (source_type) | `execution_api` |
| `market_api` (source_type) | `data_api` |
| `filings_api` (source_type) | `filings_api` (unchanged, already neutral) |
### Preserved Engineering Terms

The following terms are explicitly preserved because they describe engineering patterns, not financial concepts:

- **circuit breaker**: engineering safety pattern for rate limiting and cascading failure prevention
- **exponential backoff**: retry pattern
- **adapter pattern**: software design pattern (only the domain-specific adapter *names* are sanitized)
- **signal**: used in signal processing and scoring context
- **trend**, **sentiment**, **confidence**, **contradiction**, **evidence**: data analysis terms
- **recency decay**, **credibility weight**, **novelty bonus**: scoring algorithm terms
- **weighted sentiment average**: mathematical computation term
### Preserved Technical Content

All of the following are preserved verbatim (with only the terminology map applied to embedded financial terms):

- Composite signal scoring formula: `combined = gate × recency × credibility × (1 + novelty_bonus) × market_context_multiplier`
- Confidence computation formula with log₂ scaling and four components
- Weighted sentiment average formula
- All threshold values, configuration parameters, and numeric constants
- All Markdown table structures containing technical parameters
- All code module path references (e.g., `services/aggregation/scoring.py`)
- Three-layer signal architecture with weight ratios (1.0, 0.3, 0.2)
- Contradiction detection algorithm and evidence ranking methodology
- All PostgreSQL table structures and column descriptions (with sanitized names where needed)
- All Redis queue patterns and operations (`rpush`/`lpop`/`blpop`)
- All MinIO storage patterns (with sanitized bucket names)
- Ollama as the LLM inference provider
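To make concrete what "preserved verbatim" protects, here is a minimal sketch of the two formulas as plain arithmetic. The composite formula is the one quoted in the list above; the weighted average follows the form stated in Requirement 5 of the accompanying requirements document. The function names, the `ScoredSignal` container, the example inputs, and the zero-denominator fallback are assumptions for illustration; how each factor (recency, credibility, and so on) is derived is defined in the source pages, not here.

```python
from dataclasses import dataclass


def composite_weight(gate, recency, credibility, novelty_bonus, market_context_multiplier):
    """combined = gate x recency x credibility x (1 + novelty_bonus) x market_context_multiplier"""
    return gate * recency * credibility * (1.0 + novelty_bonus) * market_context_multiplier


@dataclass
class ScoredSignal:
    """Hypothetical container; field names follow the formula's terms."""
    combined_weight: float
    impact_score: float
    sentiment_value: float


def weighted_sentiment_average(signals):
    """weighted_avg = sum(w * impact * sentiment) / sum(w * impact)"""
    numerator = sum(s.combined_weight * s.impact_score * s.sentiment_value for s in signals)
    denominator = sum(s.combined_weight * s.impact_score for s in signals)
    if denominator == 0:
        return 0.0  # assumption: treat "no weighted evidence" as neutral
    return numerator / denominator


# Illustrative inputs only; none of these numbers come from the source docs.
example = [ScoredSignal(0.5, 0.8, 1.0), ScoredSignal(0.25, 0.4, -1.0)]
print(weighted_sentiment_average(example))
```

Note that `market_context_multiplier` keeps its name: it is a variable inside a formula, which is exactly the kind of identifier the terminology pass must leave alone.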
### Index Page Reframing

The sanitized `index.md` describes the system as an "AI-driven intelligence-to-decision pipeline" that:

1. Ingests data from multiple external data sources
2. Extracts structured intelligence via NLP/LLM
3. Scores and weights signals
4. Aggregates trends across time windows
5. Generates recommendations with quality gates
6. Executes decisions autonomously with safety mechanisms

References to "Stonks Oracle" are replaced with "the platform" or "the system". References to financial-specific APIs (Polygon.io, SEC EDGAR) are replaced with neutral descriptions. The "Related Documentation" section links are updated to use neutral descriptions or removed if they reference financial-specific content.
### Page 06 Reframing

Page 06 undergoes the most extensive reframing since it covers the trading engine. Key changes:

- Title: "Decision Execution" instead of "Trading Decisions and Execution"
- "Trading engine" → "decision execution engine"
- "Pre-trade checks" → "pre-execution checks"
- "Broker adapter" / "Alpaca" → "execution adapter" / "external execution API"
- "Paper trading" → "simulation mode"
- "Live trading" → "live execution mode"
- "Portfolio" → "resource pool" / "allocation pool"
- "Position" → "commitment" / "active commitment"
- "Stop-loss" → "risk threshold"
- "Take-profit" → "gain target"
- All order submission language reframed as "execution request submission"
### Diagram Sanitization

Each Mermaid diagram file receives the same terminology map treatment:

- Node labels containing financial terms are replaced
- Queue name labels (`stonks:queue:*` → `app:queue:*`)
- Bucket name labels (`stonks-raw-market` → `app-raw-data`)
- Table name labels (`trading_decisions` → `execution_decisions`)
- Adapter names in node labels
- Subgraph titles containing financial terms
- The `trading-engine-decision-loop.md` diagram is renamed to `decision-engine-loop.md`

Mermaid syntax, node relationships, subgraph structures, and flow directions are preserved exactly.
## Data Models

This feature produces only documentation files. There are no new data models, database tables, or schema changes.

The sanitized narrative pages reference the same data models as the originals, with terminology-mapped names where applicable:

- **`WeightedSignal`**: document reference + composite weight + sentiment + impact (unchanged)
- **`SignalWeight`**: breakdown of recency, credibility, novelty, confidence gate, market context multiplier (unchanged)
- **`TrendSummary`**: rolling trend for an entity across a time window (unchanged)
- **`Recommendation`**: actionable decision recommendation (reframed from "trade recommendation")
- **`execution_decisions`** table: audit record of every decision evaluation (sanitized from `trading_decisions`)
- **`pool_snapshots`** table: resource pool state snapshots (sanitized from `portfolio_snapshots`)
## Correctness Properties

*A property is a characteristic or behavior that should hold true across all valid executions of a system; essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*

The sanitized documentation set has one key universal property: the complete absence of financial/trading terminology across all output files. This is well-suited to property-based testing because the property must hold for *every* file in the output set, and the banned term list is large enough that systematic checking across all files provides high-value coverage.

### Property 1: Banned Financial Terminology Exclusion

*For any* file in the sanitized documentation set (`docs/sanitized-pipeline-deep-dive/`), the file content shall not contain any term from the comprehensive banned financial terminology list. The banned list includes:

- **Stock ticker symbols**: AAPL, TSLA, NVDA, XOM, META, and all 50 tracked tickers
- **Company names used as financial examples**: Apple, Tesla, NVIDIA
- **Trading action labels**: buy, sell, hold, watch as action labels (BUY, SELL, HOLD, WATCH in uppercase)
- **Financial system terms**: trading engine, paper trading, live trading, paper_eligible, live_eligible, portfolio, portfolio allocation, portfolio heat, portfolio snapshots, broker, Alpaca, broker adapter, broker API, stock market, Wall Street, bullish, bearish, position sizing, stop-loss
- **Financial event terms**: SEC EDGAR, SEC filings, 10-K, 10-Q, 8-K, earnings, earnings call, earnings report
- **Provider names**: Polygon.io, Polygon
- **System names**: Stonks Oracle, stonks
- **Infrastructure patterns containing financial terms**: `stonks:` prefix in Redis keys, `stonks-` prefix in MinIO buckets, `trading_decisions` table name, `portfolio_snapshots` table name

**Validates: Requirements 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, 6.2, 7.1, 7.2, 7.3, 8.1, 8.2**
## Error Handling

Since this is a documentation-only deliverable, there is no runtime error handling to design. The primary quality concerns are:

### Accuracy of Terminology Replacement

Every financial/trading term must be replaced with its domain-neutral equivalent. Missing a single instance of "stonks" in a Redis key pattern or "AAPL" in an example scenario would violate the sanitization requirements. The terminology map defined in the Components and Interfaces section serves as the authoritative reference.

### Preservation of Technical Content

The sanitization must not accidentally remove or alter engineering content. Key risks:

- **Formula corruption**: The composite weight formula contains `market_context_multiplier`. The word "market" must not be blindly replaced, since it is part of a technical variable name
- **Code path corruption**: Module paths like `services/trading/engine.py` contain "trading". These paths reference actual files and must be preserved as-is (the code files are not being renamed)
- **Table name corruption**: Database table names like `trading_decisions` are sanitized in the docs even though the underlying tables are not renamed. Property 1 bans the original names anywhere in the output, so SQL and code references must use the sanitized names too rather than being left as-is or dropped

**Design decision**: Code module paths (e.g., `services/trading/engine.py`) are preserved exactly as they appear in the source, since they reference actual files in the repository. Only narrative references to concepts (e.g., "the trading engine") are sanitized. Variable names within formulas and code blocks are preserved. Database table names are sanitized in narrative descriptions and table listings, and inline code references use the sanitized name as well.
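The sanitization itself is manual, but the distinction drawn in this design decision can be illustrated with a small sketch. It assumes inline code is delimited by single backticks, applies narrative replacements only outside those spans, and rewrites an inline code span only when it is exactly a mapped identifier. The two dictionaries are illustrative subsets of the terminology map, and `sanitize_line` is a hypothetical name.

```python
import re

# Illustrative subsets of the terminology map, not the full lists.
NARRATIVE_TERMS = {
    "trading engine": "decision execution engine",
    "pre-trade checks": "pre-execution checks",
}
IDENTIFIER_TERMS = {
    "trading_decisions": "execution_decisions",
    "portfolio_snapshots": "pool_snapshots",
}

CODE_SPAN = re.compile(r"(`[^`]*`)")


def sanitize_line(line: str) -> str:
    """Apply narrative terms outside backticks and identifier terms inside."""
    sanitized_parts = []
    for part in CODE_SPAN.split(line):
        if len(part) >= 2 and part.startswith("`") and part.endswith("`"):
            inner = part[1:-1]
            # Exact match only, so paths and formula variables pass through.
            sanitized_parts.append("`" + IDENTIFIER_TERMS.get(inner, inner) + "`")
        else:
            for source, replacement in NARRATIVE_TERMS.items():
                # Case-insensitive match; the replacement is always lowercase,
                # so sentence-initial capitals need a manual touch-up.
                part = re.sub(re.escape(source), replacement, part, flags=re.IGNORECASE)
            sanitized_parts.append(part)
    return "".join(sanitized_parts)
```

A path such as `services/trading/engine.py` survives because it sits inside a code span and is not an exact identifier match, which is the behavior the design decision asks for.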
### Cross-Reference Integrity

All internal links must resolve to files that exist in the sanitized output:

- Page-to-page links must use sanitized filenames
- Diagram links must use sanitized diagram filenames
- No links should point back to the source `docs/intelligence-pipeline-deep-dive/` directory
## Testing Strategy

### Why Limited PBT Applies

This is a documentation-only deliverable: the output is static Markdown files, not executable code with functions and data transformations. However, one universal property (banned term exclusion) is well-suited to property-based testing because it must hold across all files and involves checking a large set of terms against file content.

Most other requirements (structural checks, content preservation, narrative reframing) are better verified through example-based tests and manual review.

### Property-Based Tests

- **Library**: Hypothesis (Python, already in the project)
- **Configuration**: `@settings(max_examples=100)`
- **Property 1 implementation**: Generate random selections from the banned term list and random file selections from the sanitized docs, and verify that the term does not appear in the file content. Alternatively, exhaustively check all banned terms against all files.

**Practical note**: Given the small, fixed file set (13 files: the index, 6 pages, and 6 diagrams), the banned term exclusion property is most practically implemented as an exhaustive check (iterate all files × all banned terms) rather than a randomized property test. This provides complete coverage rather than probabilistic coverage.
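The exhaustive check described in the practical note could look roughly like the sketch below. It is an illustration under stated assumptions: `BANNED_TERMS` is a small subset of the real list, `find_banned_terms` is a hypothetical name, and matching uses letter/digit boundaries so that a term is not reported inside a longer word. Uppercase action labels are matched case-sensitively, following the wording of Property 1.

```python
import re
from pathlib import Path

# Illustrative subset of the banned list. The second field says whether
# the match ignores case (action labels are only banned in uppercase).
BANNED_TERMS = [
    ("stonks", True),
    ("trading engine", True),
    ("portfolio", True),
    ("AAPL", False),
    ("BUY", False),
]


def find_banned_terms(root: Path, banned_terms=BANNED_TERMS):
    """Return (relative_path, line_number, term) for every banned hit."""
    violations = []
    for path in sorted(root.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for line_number, line in enumerate(lines, start=1):
            for term, ignore_case in banned_terms:
                # A letter or digit on either side means a different word,
                # so "BUY" is not reported inside "BUYER".
                pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
                flags = re.IGNORECASE if ignore_case else 0
                if re.search(pattern, line, flags):
                    violations.append((path.relative_to(root).as_posix(), line_number, term))
    return violations
```

In a test this reduces to asserting that `find_banned_terms(Path("docs/sanitized-pipeline-deep-dive"))` returns an empty list. Returning the path, line number, and term makes a failure actionable without rerunning anything.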
### Example-Based Tests

1. **File structure verification**: Verify all expected files exist at the correct paths
2. **Cross-reference integrity**: Parse all sanitized files, extract markdown links, verify they resolve to existing sanitized files
3. **Mermaid syntax validation**: Verify each diagram file contains valid Mermaid `flowchart` declarations
4. **Technical content preservation**: Spot-check that key formulas, threshold values, and code module paths are present in the sanitized docs
5. **Terminology replacement verification**: Spot-check that key replacements appear (e.g., "decision execution engine" replaces "trading engine")
6. **Index page framing**: Verify the index describes the system as an "AI-driven intelligence-to-decision pipeline"
7. **Database table sanitization**: Verify `execution_decisions` appears where `trading_decisions` was, and `pool_snapshots` where `portfolio_snapshots` was
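As an illustration of test 2 above, the following sketch extracts inline Markdown links and reports any relative link that does not resolve to a file inside the sanitized directory. It is a simplified approach: the regular expression only covers the inline `[text](target)` form (no reference-style links), and `broken_links` is a hypothetical name.

```python
import re
from pathlib import Path

# Inline links of the form [text](target); reference-style links are ignored.
LINK_TARGET = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
URL_SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_links(root: Path):
    """Return (file, target) pairs whose target is missing or outside root."""
    root = root.resolve()
    problems = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for target in LINK_TARGET.findall(text):
            if URL_SCHEME.match(target) or target.startswith("#"):
                continue  # external URL or same-page anchor
            file_part = target.split("#", 1)[0]
            resolved = (path.parent / file_part).resolve()
            inside_root = root in resolved.parents
            if not inside_root or not resolved.is_file():
                problems.append((path.relative_to(root).as_posix(), target))
    return problems
```

Because the check rejects any target that resolves outside the root, it also covers the rule that no link may point back to `docs/intelligence-pipeline-deep-dive/`.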
### Manual Review

- Narrative coherence and readability of the sanitized content
- Consistency of domain-neutral framing across all pages
- Quality of example scenario replacements (e.g., "bearish article about AAPL" → "negative-sentiment article about Entity-A")
- Preservation of page-to-page transition flow
@@ -0,0 +1,202 @@
# Requirements Document

## Introduction

This feature produces a sanitized version of the existing 6-page intelligence pipeline deep dive documentation (`docs/intelligence-pipeline-deep-dive/`) for use in a work presentation. The sanitized version strips all financial, market, and trading language (stock tickers, buy/sell/hold actions, portfolio allocation, broker APIs, and domain-specific framing) and reframes the content as a general-purpose AI decision intelligence pipeline. The sanitized docs are stored as a separate doc group under `docs/sanitized-pipeline-deep-dive/`, preserving the original documents untouched. All engineering depth (algorithms, formulas, architectural patterns, queue topologies, database schemas, code module references, and Mermaid diagrams) is preserved. Only the domain-specific framing changes.

## Glossary

- **Source_Docs**: The original 6-page documentation set at `docs/intelligence-pipeline-deep-dive/`, including `index.md`, pages `01` through `06`, and the `diagrams/` subdirectory containing 6 Mermaid diagram files.
- **Sanitized_Docs**: The output documentation set at `docs/sanitized-pipeline-deep-dive/`, mirroring the structure of Source_Docs with all financial/market/trading language replaced by domain-neutral equivalents.
- **Sanitization_Engine**: The process (manual or automated) that transforms Source_Docs into Sanitized_Docs by applying the terminology mapping and content reframing rules defined in this document.
- **Terminology_Map**: The defined set of financial/market/trading terms and their domain-neutral replacements used by the Sanitization_Engine.
- **Entity_Identifier**: The domain-neutral replacement for stock ticker symbols (e.g., AAPL, TSLA) in Sanitized_Docs.
- **Decision_Term**: A domain-neutral action term (act, defer, monitor, observe) that replaces trading actions (buy, sell, hold, watch) in Sanitized_Docs.
- **Decision_Execution_Engine**: The domain-neutral name for the trading engine in Sanitized_Docs.
- **Execution_Adapter**: The domain-neutral name for broker adapters and broker API references in Sanitized_Docs.
- **Allocation_Pool**: The domain-neutral name for portfolio references in Sanitized_Docs.
- **Commitment_Sizing**: The domain-neutral name for position sizing in Sanitized_Docs.

---
## Requirements

### Requirement 1: Separate Output Directory

**User Story:** As a presenter, I want the sanitized docs stored in a separate directory from the originals, so that the original documentation remains untouched and both versions coexist.

#### Acceptance Criteria

1. THE Sanitization_Engine SHALL write all output files to `docs/sanitized-pipeline-deep-dive/`.
2. THE Sanitization_Engine SHALL NOT modify, overwrite, or delete any file under `docs/intelligence-pipeline-deep-dive/`.
3. THE Sanitized_Docs SHALL contain an `index.md` file at the root of `docs/sanitized-pipeline-deep-dive/`.
4. THE Sanitized_Docs SHALL contain a `diagrams/` subdirectory under `docs/sanitized-pipeline-deep-dive/`.

---
### Requirement 2: Mirror the 6-Page Structure

**User Story:** As a presenter, I want the sanitized docs to mirror the same 6-page structure as the originals, so that readers familiar with the original can navigate the sanitized version identically.

#### Acceptance Criteria

1. THE Sanitized_Docs SHALL contain exactly 6 numbered page files matching the naming pattern of Source_Docs: `01-*.md` through `06-*.md`.
2. THE Sanitized_Docs SHALL contain an `index.md` with a table of contents linking to all 6 pages and all diagrams, mirroring the structure of the Source_Docs index.
3. THE Sanitized_Docs SHALL contain one Mermaid diagram file in `diagrams/` for each diagram file present in `docs/intelligence-pipeline-deep-dive/diagrams/`.
4. WHEN a Source_Docs page contains internal cross-references to other pages or diagrams, THE Sanitized_Docs equivalent page SHALL contain corresponding cross-references pointing to the Sanitized_Docs versions of those pages and diagrams.
5. THE Sanitized_Docs page filenames SHALL use sanitized titles (e.g., `06-decision-execution.md` instead of `06-trading-decisions-and-execution.md`).

---
### Requirement 3: Strip Financial and Trading Terminology

**User Story:** As a presenter, I want all financial, market, and trading language removed from the sanitized docs, so that the presentation focuses on engineering without revealing the financial domain.

#### Acceptance Criteria

1. THE Sanitized_Docs SHALL NOT contain any stock ticker symbols (e.g., AAPL, TSLA, NVDA, XOM, META).
2. THE Sanitized_Docs SHALL NOT contain the trading action terms "buy", "sell", "hold", or "watch" when used as system action labels or decision outputs.
3. THE Sanitized_Docs SHALL NOT contain the terms "trading engine", "paper trading", "live trading", "paper_eligible", or "live_eligible".
4. THE Sanitized_Docs SHALL NOT contain the terms "portfolio", "portfolio allocation", "portfolio heat", or "portfolio snapshots" when referring to the resource management domain concept.
5. THE Sanitized_Docs SHALL NOT contain references to "broker", "Alpaca", "broker adapter", or "broker API".
6. THE Sanitized_Docs SHALL NOT contain the terms "stock market", "Wall Street", "bullish", "bearish", "position sizing" (as a financial concept label), or "stop-loss" (as a financial concept label).
7. THE Sanitized_Docs SHALL NOT contain company names used as financial examples (e.g., "Apple", "Tesla", "NVIDIA" when used in a stock/market context).
8. THE Sanitized_Docs SHALL NOT contain the terms "SEC EDGAR", "SEC filings", "10-K", "10-Q", "8-K", "earnings", "earnings call", or "earnings report" as domain-specific financial references.
9. THE Sanitized_Docs SHALL NOT contain references to "Polygon.io" or "Polygon" as a financial data provider name.
10. THE Sanitized_Docs SHALL NOT contain the term "Stonks Oracle" or "stonks" as a system name.

---
### Requirement 4: Apply Domain-Neutral Terminology Mapping

**User Story:** As a presenter, I want consistent domain-neutral replacements for all stripped terms, so that the sanitized docs read coherently as a general-purpose AI decision intelligence pipeline.

#### Acceptance Criteria

1. WHEN the Source_Docs use "stock ticker" or specific ticker symbols, THE Sanitized_Docs SHALL use "entity identifier" or "tracked entity".
2. WHEN the Source_Docs use "buy/sell/hold/watch" as action labels, THE Sanitized_Docs SHALL use "act/defer/monitor/observe" or equivalent neutral decision terms.
3. WHEN the Source_Docs use "trading engine", THE Sanitized_Docs SHALL use "decision execution engine" or "action engine".
4. WHEN the Source_Docs use "portfolio", THE Sanitized_Docs SHALL use "resource pool" or "allocation pool".
5. WHEN the Source_Docs use "broker" or "Alpaca", THE Sanitized_Docs SHALL use "execution adapter" or "external execution API".
6. WHEN the Source_Docs use "paper trading", THE Sanitized_Docs SHALL use "simulation mode" or "dry-run mode".
7. WHEN the Source_Docs use "live trading", THE Sanitized_Docs SHALL use "live execution mode" or "production mode".
8. WHEN the Source_Docs use "bullish" or "bearish", THE Sanitized_Docs SHALL use "positive" or "negative" (or "favorable"/"unfavorable").
9. WHEN the Source_Docs use "position sizing", THE Sanitized_Docs SHALL use "resource allocation" or "commitment sizing".
10. WHEN the Source_Docs use "stop-loss", THE Sanitized_Docs SHALL use "risk threshold" or "loss limit".
11. WHEN the Source_Docs use "Stonks Oracle" or "stonks", THE Sanitized_Docs SHALL use a neutral system name such as "the platform" or "the system".
12. WHEN the Source_Docs use "SEC EDGAR" or "SEC filings", THE Sanitized_Docs SHALL use "regulatory filings source" or "public records API".
13. WHEN the Source_Docs use "Polygon.io" or "Polygon", THE Sanitized_Docs SHALL use "external data provider" or "data source API".
14. WHEN the Source_Docs use "earnings" as a catalyst type or event, THE Sanitized_Docs SHALL use "performance report" or "periodic disclosure".
15. THE Sanitized_Docs SHALL apply the Terminology_Map consistently across all 6 pages, the index, and all diagram files.

---
|
||||
|
||||
### Requirement 5: Preserve Engineering and Technical Depth
|
||||
|
||||
**User Story:** As a presenter, I want all engineering concepts, algorithms, formulas, and architectural details preserved, so that the sanitized docs demonstrate the technical sophistication of the system.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
1. THE Sanitized_Docs SHALL preserve all references to Redis queue patterns, including queue names and `rpush`/`lpop`/`blpop` operations.
|
||||
2. THE Sanitized_Docs SHALL preserve all references to PostgreSQL tables, including table names and column descriptions.
|
||||
3. THE Sanitized_Docs SHALL preserve all references to MinIO buckets and storage patterns.
|
||||
4. THE Sanitized_Docs SHALL preserve all references to Ollama as the LLM inference provider.
|
||||
5. THE Sanitized_Docs SHALL preserve the composite signal scoring formula: `combined = gate × recency × credibility × (1 + novelty_bonus) × market_context_multiplier`.
|
||||
6. THE Sanitized_Docs SHALL preserve the confidence computation formula with log₂ scaling and its four components (unique source count, average extraction credibility, signal agreement with sample-size dampening, contradiction penalty).
|
||||
7. THE Sanitized_Docs SHALL preserve the weighted sentiment average formula: `weighted_avg = Σ(combined_weight × impact_score × sentiment_value) / Σ(combined_weight × impact_score)`.
|
||||
8. THE Sanitized_Docs SHALL preserve all code module path references (e.g., `services/aggregation/scoring.py`, `services/recommendation/eligibility.py`).
|
||||
9. THE Sanitized_Docs SHALL preserve the three-layer signal architecture, renaming the layers with domain-neutral labels (e.g., "Entity-Specific Signals", "Environmental Signals", "Relational Signals") while retaining the weight ratios (1.0, 0.3, 0.2).
|
||||
10. THE Sanitized_Docs SHALL preserve all threshold values, configuration parameters, and numeric constants (e.g., confidence gate of 0.2, recency half-lives per window, eligibility thresholds).
|
||||
11. THE Sanitized_Docs SHALL preserve all Markdown table structures containing technical parameters and thresholds.
|
||||
12. THE Sanitized_Docs SHALL preserve the contradiction detection algorithm, evidence ranking methodology, and trend projection computation.
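
The two formulas quoted in criteria 5 and 7 can be written out directly, which makes it easier to confirm that a sanitized page still states them term for term. The sketch below is illustrative only: the `Signal` dataclass and both function names are hypothetical, and returning `0.0` when the denominator is zero is an assumption, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    # Field names follow the formula terms in Requirement 5; the class itself is hypothetical.
    gate: float
    recency: float
    credibility: float
    novelty_bonus: float
    market_context_multiplier: float
    impact_score: float
    sentiment_value: float


def combined_weight(s):
    """combined = gate x recency x credibility x (1 + novelty_bonus) x market_context_multiplier"""
    return (
        s.gate
        * s.recency
        * s.credibility
        * (1 + s.novelty_bonus)
        * s.market_context_multiplier
    )


def weighted_sentiment_average(signals):
    """sum(w * impact * sentiment) / sum(w * impact), with w = combined_weight."""
    numerator = sum(combined_weight(s) * s.impact_score * s.sentiment_value for s in signals)
    denominator = sum(combined_weight(s) * s.impact_score for s in signals)
    # Assumption: an empty or fully gated-out set yields a neutral 0.0.
    return numerator / denominator if denominator else 0.0


signals = [
    Signal(1.0, 0.5, 0.8, 0.25, 1.0, 2.0, 1.0),
    Signal(1.0, 1.0, 0.5, 0.0, 1.0, 1.0, -1.0),
]
print(weighted_sentiment_average(signals))
```

Note that `market_context_multiplier` keeps its original name here, matching the instruction in Task 4.1 to preserve it as a technical variable name.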
|
||||
|
||||
---
|
||||
|
||||
### Requirement 6: Sanitize Mermaid Diagrams
|
||||
|
||||
**User Story:** As a presenter, I want the Mermaid diagrams sanitized with the same terminology mapping as the narrative pages, so that diagrams and text are consistent.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
1. THE Sanitized_Docs SHALL contain one sanitized Mermaid diagram file for each of the 6 diagram files in Source_Docs.
|
||||
2. WHEN a Source_Docs diagram contains financial/trading terminology (e.g., "trading engine", "buy/sell", "paper_eligible", "bullish/bearish", ticker symbols), THE corresponding Sanitized_Docs diagram SHALL use the same domain-neutral replacements defined in the Terminology_Map.
|
||||
3. THE Sanitized_Docs diagrams SHALL preserve all Mermaid syntax, node relationships, subgraph structures, and flow directions from the Source_Docs diagrams.
|
||||
4. THE Sanitized_Docs diagrams SHALL preserve all code module path references and service names within diagram nodes.
|
||||
5. THE Sanitized_Docs diagram filenames SHALL use sanitized names where the original names contain financial terms (e.g., `decision-engine-loop.md` instead of `trading-engine-decision-loop.md`).
|
||||
|
||||
---
|
||||
|
||||
### Requirement 7: Sanitize Redis Key and Queue Name References
|
||||
|
||||
**User Story:** As a presenter, I want Redis key patterns and queue names sanitized where they contain financial terms, so that even infrastructure-level references are domain-neutral.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
1. WHEN a Source_Docs Redis queue name contains "stonks" (e.g., `stonks:queue:ingestion`), THE Sanitized_Docs SHALL replace "stonks" with a neutral prefix (e.g., `app:queue:ingestion`).
|
||||
2. WHEN a Source_Docs Redis key pattern contains a trading-specific segment such as "trading" or "broker" (e.g., `stonks:queue:broker_orders`, `stonks:trading:circuit_breaker:*`), THE Sanitized_Docs SHALL replace that segment with a neutral equivalent (e.g., `app:queue:execution_orders`, `app:execution:circuit_breaker:*`).
|
||||
3. THE Sanitized_Docs SHALL apply Redis key sanitization consistently across all narrative pages and diagram files.
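
The ordering implied by these criteria matters: the segment-specific rewrites have to run before the generic prefix swap, otherwise `stonks:queue:broker_orders` would become `app:queue:broker_orders` and still leak a financial term. A minimal sketch, assuming the key pairs listed in this spec and a hypothetical helper name `sanitize_redis_key`:

```python
# Segment-specific rewrites taken from this spec; checked before the generic prefix rule.
SPECIFIC_KEY_REWRITES = [
    ("stonks:queue:broker_orders", "app:queue:execution_orders"),
    ("stonks:trading:circuit_breaker:", "app:execution:circuit_breaker:"),
    ("stonks:dedupe:trading:", "app:dedupe:execution:"),
]

OLD_PREFIX = "stonks:"
NEW_PREFIX = "app:"


def sanitize_redis_key(key):
    """Return the domain-neutral form of a Redis key or key pattern."""
    for old, new in SPECIFIC_KEY_REWRITES:
        if key.startswith(old):
            return new + key[len(old):]
    if key.startswith(OLD_PREFIX):
        return NEW_PREFIX + key[len(OLD_PREFIX):]
    # Keys without the old prefix are already neutral and pass through unchanged.
    return key


print(sanitize_redis_key("stonks:trading:circuit_breaker:*"))
```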
|
||||
|
||||
---
|
||||
|
||||
### Requirement 8: Sanitize MinIO Bucket Name References
|
||||
|
||||
**User Story:** As a presenter, I want MinIO bucket names sanitized where they contain financial terms, so that storage references are domain-neutral.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
1. WHEN a Source_Docs MinIO bucket name contains "stonks" (e.g., `stonks-raw-market`, `stonks-raw-news`, `stonks-normalized`), THE Sanitized_Docs SHALL replace "stonks" with a neutral prefix and rename any domain-specific suffix (e.g., `app-raw-data`, `app-raw-content`, `app-normalized`).
|
||||
2. THE Sanitized_Docs SHALL apply MinIO bucket name sanitization consistently across all narrative pages and diagram files.
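
Since some bucket names change their suffix as well as their prefix, an explicit lookup is safer than a blanket prefix replacement. The sketch below uses the bucket pairs named in this spec's tasks; `sanitize_bucket` is a hypothetical helper, and raising on an unmapped `stonks-` bucket is a design assumption meant to surface gaps rather than guess a name.

```python
# Bucket pairs as listed in the tasks of this spec.
BUCKET_MAP = {
    "stonks-raw-market": "app-raw-data",
    "stonks-raw-news": "app-raw-content",
    "stonks-raw-filings": "app-raw-filings",
    "stonks-normalized": "app-normalized",
    "stonks-llm-prompts": "app-llm-prompts",
    "stonks-llm-results": "app-llm-results",
}


def sanitize_bucket(name):
    """Map a source bucket name to its sanitized equivalent."""
    if name in BUCKET_MAP:
        return BUCKET_MAP[name]
    if name.startswith("stonks-"):
        # Fail loudly instead of inventing a replacement for an unlisted bucket.
        raise KeyError(f"no sanitized name defined for bucket {name!r}")
    return name


print(sanitize_bucket("stonks-raw-market"))
```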
|
||||
|
||||
---
|
||||
|
||||
### Requirement 9: Sanitize Database Table and Column References Where Needed
|
||||
|
||||
**User Story:** As a presenter, I want database table and column names that contain obvious financial terms sanitized, while preserving the overall schema structure.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
1. WHEN a Source_Docs database table name contains "trading" (e.g., `trading_decisions`), THE Sanitized_Docs SHALL use a neutral equivalent (e.g., `execution_decisions`).
|
||||
2. WHEN a Source_Docs database table or column references "portfolio" (e.g., `portfolio_snapshots`, `portfolio_pct`), THE Sanitized_Docs SHALL use a neutral equivalent (e.g., `pool_snapshots`, `allocation_pct`).
|
||||
3. THE Sanitized_Docs SHALL preserve all other database table names that do not contain financial-specific terms (e.g., `documents`, `document_intelligence`, `trend_windows`, `recommendations`).
|
||||
4. THE Sanitized_Docs SHALL apply database reference sanitization consistently across all narrative pages.
|
||||
|
||||
---
|
||||
|
||||
### Requirement 10: Sanitize Example Scenarios and Inline References
|
||||
|
||||
**User Story:** As a presenter, I want all inline examples, scenario walkthroughs, and narrative references sanitized, so that no financial context leaks through illustrative content.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
1. WHEN a Source_Docs page uses a specific company name or ticker in an example scenario (e.g., "a bearish article about AAPL"), THE Sanitized_Docs SHALL replace the reference with a generic entity (e.g., "a negative-sentiment article about Entity-A").
|
||||
2. WHEN a Source_Docs page describes a financial event as an example (e.g., "earnings miss", "tariff announcement affecting XOM"), THE Sanitized_Docs SHALL reframe the example using domain-neutral language (e.g., "a negative performance disclosure", "a regulatory policy change affecting Entity-B").
|
||||
3. WHEN a Source_Docs page references market-specific concepts in narrative flow (e.g., "markets move fast", "trading volume", "intraday swings"), THE Sanitized_Docs SHALL reframe using neutral language (e.g., "conditions change rapidly", "activity volume", "short-term fluctuations").
|
||||
4. THE Sanitized_Docs SHALL preserve the logical structure and teaching purpose of all example scenarios while removing the financial framing.
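
Example scenarios stay coherent only if each ticker maps to the same placeholder on every page. One way to keep that stable is a fixed, ordered ticker list, as in the sketch below. The ordering follows Task 3.1 (AAPL, TSLA, NVDA, XOM, META become Entity-A through Entity-E); the function name `replace_tickers` is hypothetical.

```python
import re
import string

# Order taken from Task 3.1 so that placeholders are identical across pages.
TICKERS = ["AAPL", "TSLA", "NVDA", "XOM", "META"]
ENTITY_MAP = {
    ticker: f"Entity-{string.ascii_uppercase[i]}" for i, ticker in enumerate(TICKERS)
}

# Case-sensitive with word boundaries, so "metadata" is not mistaken for META.
TICKER_RE = re.compile(r"\b(" + "|".join(TICKERS) + r")\b")


def replace_tickers(text):
    """Swap each known ticker for its fixed generic entity label."""
    return TICKER_RE.sub(lambda m: ENTITY_MAP[m.group(1)], text)


print(replace_tickers("a negative-sentiment article about AAPL"))
```

The same function also covers example paths, since `/` is a word boundary: `news_api/AAPL/...` becomes `news_api/Entity-A/...`.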
|
||||
|
||||
---
|
||||
|
||||
### Requirement 11: Preserve Acceptable Engineering Terms
|
||||
|
||||
**User Story:** As a presenter, I want general engineering terms that happen to overlap with financial language preserved when they describe engineering patterns, so that the technical accuracy is maintained.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
1. THE Sanitized_Docs SHALL preserve the term "circuit breaker" when it describes the engineering safety pattern (rate limiting, cascading failure prevention).
|
||||
2. THE Sanitized_Docs SHALL preserve the term "exponential backoff" and all retry/backoff patterns.
|
||||
3. THE Sanitized_Docs SHALL preserve all adapter pattern references (the software design pattern), renaming only the domain-specific adapter names (e.g., "AlpacaBrokerAdapter" becomes a neutral name).
|
||||
4. THE Sanitized_Docs SHALL preserve the term "signal" as used in the signal processing and scoring context.
|
||||
5. THE Sanitized_Docs SHALL preserve the terms "trend", "sentiment", "confidence", "contradiction", and "evidence" as used in the data analysis context.
|
||||
|
||||
---
|
||||
|
||||
### Requirement 12: Reframe the System Narrative
|
||||
|
||||
**User Story:** As a presenter, I want the overall system narrative reframed as a general-purpose AI decision intelligence pipeline, so that the presentation tells a coherent story without financial context.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
1. THE Sanitized_Docs index page SHALL describe the system as an "AI-driven intelligence-to-decision pipeline" that ingests data from multiple sources, extracts structured intelligence via NLP/LLM, scores and weights signals, aggregates trends across time windows, generates recommendations with quality gates, and executes decisions autonomously with safety mechanisms.
|
||||
2. THE Sanitized_Docs page 01 SHALL describe data ingestion from "multiple external data sources" rather than from financial-specific APIs.
|
||||
3. THE Sanitized_Docs page 06 SHALL describe "autonomous decision execution with safety mechanisms" rather than "trading decisions and execution".
|
||||
4. WHEN the Source_Docs conclusion references the "intelligence-to-decision pipeline in Stonks Oracle", THE Sanitized_Docs conclusion SHALL reference the "intelligence-to-decision pipeline" without a financial system name.
|
||||
5. THE Sanitized_Docs SHALL maintain the narrative flow where each page ends with a transition to the next page, preserving the end-to-end story structure.
|
||||
@@ -0,0 +1,47 @@
|
||||
# Tasks — Sanitized Pipeline Documentation
|
||||
|
||||
## Task 1: Create Output Directory and Index Page
|
||||
|
||||
- [x] 1.1 Create the `docs/sanitized-pipeline-deep-dive/` directory and `diagrams/` subdirectory
|
||||
- [x] 1.2 Create `docs/sanitized-pipeline-deep-dive/index.md` with sanitized content: replace "Stonks Oracle" with "the platform", replace Polygon.io/SEC EDGAR references with neutral descriptions, update all page links to use sanitized filenames (e.g., `06-decision-execution.md`), update diagram links to use sanitized names (e.g., `decision-engine-loop.md`), describe the system as an "AI-driven intelligence-to-decision pipeline", and either remove the Related Documentation section or rewrite it with neutral descriptions
|
||||
|
||||
## Task 2: Sanitize Page 01 — Data Ingestion and Preparation
|
||||
|
||||
- [x] 2.1 Create `docs/sanitized-pipeline-deep-dive/01-data-ingestion-and-preparation.md` by transforming the source page: replace "Stonks Oracle" with "the platform", replace "Polygon.io" with "external data provider", replace "SEC EDGAR"/"EFTS" with "public records API"/"regulatory filings source", replace "AlpacaBrokerAdapter" with "ExecutionAdapter", replace adapter class names (PolygonNewsAdapter → ExternalNewsAdapter, PolygonMarketAdapter → ExternalDataAdapter, SECEdgarAdapter → RegulatoryFilingsAdapter), replace all `stonks:` Redis key prefixes with `app:`, replace MinIO bucket names (stonks-raw-market → app-raw-data, stonks-raw-news → app-raw-content, stonks-raw-filings → app-raw-filings, stonks-normalized → app-normalized), replace ticker symbols (AAPL → Entity-A, etc.) and company names with generic entities, replace "broker" source_type with "execution_api", replace "SEC" references with "regulatory filings", replace "10-K"/"10-Q"/"8-K" with "regulatory filing types", replace "earnings" with "performance report", sanitize example paths (e.g., `news_api/AAPL/...` → `news_api/Entity-A/...`), update cross-references to use sanitized filenames, and preserve all engineering content (queue operations, table structures, quality scoring formula, code module paths)
|
||||
|
||||
## Task 3: Sanitize Page 02 — AI Agent Processing and Extraction
|
||||
|
||||
- [x] 3.1 Create `docs/sanitized-pipeline-deep-dive/02-ai-agent-processing-and-extraction.md` by transforming the source page: replace "Stonks Oracle" references, replace financial document type references (SEC filings → regulatory filings, earnings transcripts → performance transcripts), replace "financial document analyst" role description with "document analyst", replace ticker symbols and company names in examples (AAPL, TSLA, NVDA, XOM, META → Entity-A through Entity-E), replace "bearish"/"bullish" with "negative"/"positive", replace "earnings" catalyst type references with "performance_report", replace "stock ticker" with "entity identifier", replace "market implications" with neutral language, replace `stonks:queue:*` Redis keys with `app:queue:*`, replace MinIO bucket names (stonks-llm-prompts → app-llm-prompts, stonks-llm-results → app-llm-results, stonks-normalized → app-normalized), replace "tariff announcement affecting XOM" example with neutral equivalent, update cross-references, and preserve all engineering content (JSON repair pipeline, validation logic, AgentConfigResolver, Ollama references, code module paths, schema field descriptions)
|
||||
|
||||
## Task 4: Sanitize Page 03 — Signal Scoring and Weighted Signals
|
||||
|
||||
- [x] 4.1 Create `docs/sanitized-pipeline-deep-dive/03-signal-scoring-and-weighted-signals.md` by transforming the source page: replace "bullish"/"bearish" with "positive"/"negative" throughout, replace "trading recommendations" with "decision recommendations", replace ticker examples (AAPL, NVDA) with Entity-A/Entity-C, replace "market context" variable references carefully (preserve `market_context_multiplier` as a technical variable name but sanitize narrative references to "market conditions" → "environmental conditions"), replace "trading volume" with "activity volume", replace `stonks:queue:*` Redis keys with `app:queue:*`, replace "bullish_pct > bearish_pct" with "positive_pct > negative_pct" in signal propagation description, update cross-references, and preserve all engineering content (composite weight formula, recency decay formula, half-life tables, credibility weight computation, novelty bonus formula, weighted sentiment average formula, three-layer architecture with weight ratios 1.0/0.3/0.2, all threshold values and configuration parameters)
|
||||
|
||||
## Task 5: Sanitize Page 04 — Trend Aggregation and Accumulating Signals
|
||||
|
||||
- [x] 5.1 Create `docs/sanitized-pipeline-deep-dive/04-trend-aggregation-and-accumulating-signals.md` by transforming the source page: replace "bullish"/"bearish" with "positive"/"negative" in trend direction descriptions and TrendDirection enum values, replace "trading recommendations" with "decision recommendations", replace "BULLISH_THRESHOLD"/"BEARISH_THRESHOLD" with "POSITIVE_THRESHOLD"/"NEGATIVE_THRESHOLD", replace "paper_eligible"/"live_eligible" with "simulation_eligible"/"production_eligible", replace "paper trading"/"live trading" with "simulation mode"/"live execution mode", replace "buy"/"sell"/"hold"/"watch" action labels with "act"/"defer"/"monitor"/"observe", replace "trading_decisions" table with "execution_decisions", replace "portfolio" references, replace ticker examples (AAPL) with Entity-A, replace "earnings miss" example with "negative performance disclosure", replace `stonks:queue:*` Redis keys with `app:queue:*`, update cross-references, and preserve all engineering content (five time windows, trend direction derivation thresholds, contradiction detection algorithm, evidence ranking, confidence computation formula with log₂ scaling, trend projection computation, all persistence tables)
|
||||
|
||||
## Task 6: Sanitize Page 05 — Recommendation Generation
|
||||
|
||||
- [x] 6.1 Create `docs/sanitized-pipeline-deep-dive/05-recommendation-generation.md` by transforming the source page: replace "buy"/"sell"/"hold"/"watch" action labels with "act"/"defer"/"monitor"/"observe", replace "BUY"/"SELL"/"HOLD"/"WATCH" with "ACT"/"DEFER"/"MONITOR"/"OBSERVE", replace "paper_eligible"/"live_eligible" with "simulation_eligible"/"production_eligible", replace "paper trading"/"live trading" with "simulation mode"/"live execution mode", replace "trading engine" with "decision execution engine", replace "portfolio" with "resource pool"/"allocation pool", replace "portfolio_pct" with "allocation_pct", replace "position sizing" with "commitment sizing", replace "position" (as financial position) with "commitment", replace "stop-loss" with "risk threshold", replace "trading-eligible" with "execution-eligible", replace "trade" (as noun/verb) with "decision"/"execution", replace ticker examples (AAPL) with Entity-A, replace "earnings" catalyst references with "performance_report", replace `stonks:queue:*` Redis keys with `app:queue:*`, replace "broker adapter" with "execution adapter", replace "Alpaca" with "external execution API", update cross-references to use sanitized filenames (06-decision-execution.md), and preserve all engineering content (suppression thresholds, eligibility gates, position sizing formulas, thesis generation logic, risk classification computation, all persistence tables)
|
||||
|
||||
## Task 7: Sanitize Page 06 — Decision Execution
|
||||
|
||||
- [x] 7.1 Create `docs/sanitized-pipeline-deep-dive/06-decision-execution.md` by transforming the source page: change title to "Decision Execution", replace "trading engine" with "decision execution engine" throughout, replace "TradingEngine" class references with "DecisionEngine" in narrative (preserve code module path `services/trading/engine.py`), replace "trade"/"trading" with "decision"/"execution" in narrative, replace "pre-trade checks" with "pre-execution checks", replace "buy"/"sell" action labels with "act"/"defer", replace "paper trading"/"paper_eligible" with "simulation mode"/"simulation_eligible", replace "live trading"/"live_eligible" with "live execution mode"/"production_eligible", replace "broker"/"Alpaca" with "execution adapter"/"external execution API", replace "AlpacaBrokerAdapter" with "ExecutionAdapter" in narrative, replace "portfolio" with "resource pool"/"allocation pool", replace "portfolio heat" with "pool exposure", replace "portfolio_snapshots" with "pool_snapshots", replace "position"/"positions" (financial) with "commitment"/"commitments", replace "position sizing"/"PositionSizer" with "commitment sizing" in narrative, replace "stop-loss" with "risk threshold", replace "take-profit" with "gain target", replace "P&L" with "gain/loss", replace "Sharpe ratio" with "risk-adjusted return ratio", replace "win rate" with "success rate", replace "drawdown" with "peak-to-trough decline", replace "trading_decisions" table with "execution_decisions", replace `stonks:queue:broker_orders` with `app:queue:execution_orders`, replace `stonks:trading:circuit_breaker:*` with `app:execution:circuit_breaker:*`, replace `stonks:dedupe:trading:*` with `app:dedupe:execution:*`, replace all other `stonks:` Redis key prefixes with `app:`, replace "paper-api.alpaca.markets" with "execution-api.example.com", replace "Polygon API" with "data source API", replace ticker examples with Entity-{letter}, replace "earnings" references with "performance report"/"periodic disclosure", update cross-references to use sanitized filenames, update the Conclusion section to remove "Stonks Oracle" and financial framing, and preserve all engineering content (5 concurrent async tasks, circuit breaker algorithm, reserve pool logic, risk tier parameters table, position sizing pipeline, order submission flow, all code module paths, all threshold values)
|
||||
|
||||
## Task 8: Sanitize Mermaid Diagrams
|
||||
|
||||
- [x] 8.1 Create `docs/sanitized-pipeline-deep-dive/diagrams/ingestion-to-extraction-flow.md` by transforming the source diagram: replace `stonks:queue:*` with `app:queue:*`, replace MinIO bucket names (stonks-raw-market → app-raw-data, stonks-raw-news → app-raw-content, stonks-raw-filings → app-raw-filings, stonks-normalized → app-normalized), replace adapter names in node labels (PolygonMarketAdapter → ExternalDataAdapter, PolygonNewsAdapter → ExternalNewsAdapter, SECEdgarAdapter → RegulatoryFilingsAdapter, MacroNewsAdapter unchanged, WebScrapeAdapter unchanged), replace "AlpacaBrokerAdapter" if present, and preserve all Mermaid syntax, node relationships, subgraph structures, flow directions, and code module paths
|
||||
- [x] 8.2 Create `docs/sanitized-pipeline-deep-dive/diagrams/three-layer-signal-merging.md` by transforming the source diagram: replace `stonks:queue:*` with `app:queue:*`, replace "bullish_pct > bearish_pct" if present, and preserve all Mermaid syntax and structure
|
||||
- [x] 8.3 Create `docs/sanitized-pipeline-deep-dive/diagrams/weighted-signal-computation.md` by copying the source diagram with minimal changes (content is already domain-neutral — only replace any `stonks:` references if present), preserving all Mermaid syntax and structure
|
||||
- [x] 8.4 Create `docs/sanitized-pipeline-deep-dive/diagrams/trend-accumulation-escalation.md` by transforming the source diagram: replace "BULLISH"/"BEARISH" with "POSITIVE"/"NEGATIVE", replace "BUY / SELL" with "ACT / DEFER", replace "paper_eligible"/"live_eligible" if present, and preserve all Mermaid syntax and structure
|
||||
- [x] 8.5 Create `docs/sanitized-pipeline-deep-dive/diagrams/recommendation-generation-flow.md` by transforming the source diagram: replace `stonks:queue:*` with `app:queue:*`, replace "BUY"/"SELL"/"HOLD"/"WATCH" with "ACT"/"DEFER"/"MONITOR"/"OBSERVE", replace "paper_eligible"/"live_eligible" with "simulation_eligible"/"production_eligible", replace "portfolio" with "allocation pool", and preserve all Mermaid syntax and structure
|
||||
- [x] 8.6 Create `docs/sanitized-pipeline-deep-dive/diagrams/decision-engine-loop.md` (renamed from trading-engine-decision-loop.md) by transforming the source diagram: replace "Trading Engine" with "Decision Execution Engine", replace `stonks:queue:broker_orders` with `app:queue:execution_orders`, replace `stonks:dedupe:trading:*` with `app:dedupe:execution:*`, replace `stonks:trading:circuit_breaker:*` with `app:execution:circuit_breaker:*`, replace "buy, sell" with "act, defer", replace "paper_eligible, live_eligible" with "simulation_eligible, production_eligible", replace "Alpaca paper trading" with "external execution API (simulation)", replace "portfolio" references with "resource pool"/"allocation pool", replace "Portfolio heat" with "Pool exposure", replace "portfolio_snapshots" with "pool_snapshots", replace "trading_decisions" with "execution_decisions", replace "Sharpe ratio" with "risk-adjusted return ratio", replace "drawdown" with "peak-to-trough decline", replace "win rate" with "success rate", replace "P&L" with "gain/loss", and preserve all Mermaid syntax, node relationships, subgraph structures, flow directions, and code module paths
|
||||
|
||||
## Task 9: Verification and Cross-Reference Integrity
|
||||
|
||||
- [x] 9.1 Verify all sanitized files exist at the expected paths: index.md, 6 numbered pages (01-06), and 6 diagram files in diagrams/
|
||||
- [x] 9.2 Verify no sanitized file contains any banned financial term: scan all files for ticker symbols (AAPL, TSLA, NVDA, XOM, META), company names (Apple, Tesla, NVIDIA as financial references), system names (Stonks Oracle, stonks), provider names (Polygon.io, Polygon, SEC EDGAR, Alpaca), financial terms (trading engine, paper trading, live trading, paper_eligible, live_eligible, portfolio, broker, bullish, bearish, position sizing, stop-loss, stock market, Wall Street, earnings, 10-K, 10-Q, 8-K), and infrastructure patterns (stonks: prefix, stonks- prefix, trading_decisions, portfolio_snapshots)
|
||||
- [x] 9.3 Verify all internal cross-references resolve: parse all markdown links in sanitized files, confirm each link target exists in the sanitized output directory
|
||||
- [x] 9.4 Verify key engineering content is preserved: check that the composite weight formula, confidence computation formula, weighted sentiment average formula, three-layer weight ratios (1.0, 0.3, 0.2), and key threshold values (confidence gate 0.2, eligibility confidence 0.35) appear in the sanitized docs
|
||||
- [x] 9.5 Verify source files are unmodified: confirm that no files under `docs/intelligence-pipeline-deep-dive/` were changed
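
The checks in 9.2 and 9.3 lend themselves to a small script. The sketch below is one possible shape, not the verification actually used for this commit: the banned list is a subset of the terms in 9.2, both function names are hypothetical, and files are passed in as text and a set of known relative paths so the logic stays independent of the filesystem.

```python
import re

# Illustrative subset of the banned terms from task 9.2.
BANNED_TERMS = [
    "AAPL", "TSLA", "Stonks Oracle", "stonks", "Polygon", "Alpaca",
    "paper_eligible", "live_eligible", "portfolio", "broker",
    "bullish", "bearish", "stop-loss", "earnings", "10-K",
]

# Substring match on purpose, so "stonks:" prefixes and "portfolio_snapshots" are caught.
BANNED_RE = re.compile("|".join(re.escape(t) for t in BANNED_TERMS), re.IGNORECASE)

# Captures a Markdown link target up to any "#fragment", whitespace, or ")".
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+)")


def find_banned_terms(text):
    """Return (line_number, matched_term) for every banned term found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for match in BANNED_RE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def find_broken_links(text, existing_paths):
    """Return relative link targets that are not in the set of known output files."""
    targets = LINK_RE.findall(text)
    return [
        t for t in targets
        if not t.startswith(("http://", "https://")) and t not in existing_paths
    ]


page = "See [next](06-decision-execution.md#overview).\nQueue: stonks:queue:ingestion"
print(find_banned_terms(page))
print(find_broken_links(page, {"06-decision-execution.md"}))
```

Terms that Requirement 11 explicitly preserves, such as "circuit breaker" and "signal", are not in the banned list, so they pass the scan without needing an allowlist.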
|
||||