# Requirements Document

## Introduction

This feature produces a sanitized version of the existing 6-page intelligence pipeline deep dive documentation (`docs/intelligence-pipeline-deep-dive/`) for use in a work presentation. The sanitized version strips all financial, market, and trading language (stock tickers, buy/sell/hold actions, portfolio allocation, broker APIs, and domain-specific framing) and reframes the content as a general-purpose AI decision intelligence pipeline. The sanitized docs are stored as a separate doc group under `docs/sanitized-pipeline-deep-dive/`, preserving the original documents untouched.

All engineering depth (algorithms, formulas, architectural patterns, queue topologies, database schemas, code module references, and Mermaid diagrams) is preserved. Only the domain-specific framing changes.

## Glossary

- **Source_Docs**: The original 6-page documentation set at `docs/intelligence-pipeline-deep-dive/`, including `index.md`, pages `01` through `06`, and the `diagrams/` subdirectory containing 6 Mermaid diagram files.
- **Sanitized_Docs**: The output documentation set at `docs/sanitized-pipeline-deep-dive/`, mirroring the structure of Source_Docs with all financial/market/trading language replaced by domain-neutral equivalents.
- **Sanitization_Engine**: The process (manual or automated) that transforms Source_Docs into Sanitized_Docs by applying the terminology mapping and content reframing rules defined in this document.
- **Terminology_Map**: The defined set of financial/market/trading terms and their domain-neutral replacements used by the Sanitization_Engine.
- **Entity_Identifier**: The domain-neutral replacement for stock ticker symbols (e.g., AAPL, TSLA) in Sanitized_Docs.
- **Decision_Term**: A domain-neutral action term (act, defer, monitor, observe) that replaces trading actions (buy, sell, hold, watch) in Sanitized_Docs.
- **Decision_Execution_Engine**: The domain-neutral name for the trading engine in Sanitized_Docs.
- **Execution_Adapter**: The domain-neutral name for broker adapters and broker API references in Sanitized_Docs.
- **Allocation_Pool**: The domain-neutral name for portfolio references in Sanitized_Docs.
- **Commitment_Sizing**: The domain-neutral name for position sizing in Sanitized_Docs.

---

## Requirements

### Requirement 1: Separate Output Directory

**User Story:** As a presenter, I want the sanitized docs stored in a separate directory from the originals, so that the original documentation remains untouched and both versions coexist.

#### Acceptance Criteria

1. THE Sanitization_Engine SHALL write all output files to `docs/sanitized-pipeline-deep-dive/`.
2. THE Sanitization_Engine SHALL NOT modify, overwrite, or delete any file under `docs/intelligence-pipeline-deep-dive/`.
3. THE Sanitized_Docs SHALL contain an `index.md` file at the root of `docs/sanitized-pipeline-deep-dive/`.
4. THE Sanitized_Docs SHALL contain a `diagrams/` subdirectory under `docs/sanitized-pipeline-deep-dive/`.

---

### Requirement 2: Mirror the 6-Page Structure

**User Story:** As a presenter, I want the sanitized docs to mirror the same 6-page structure as the originals, so that readers familiar with the original can navigate the sanitized version identically.

#### Acceptance Criteria

1. THE Sanitized_Docs SHALL contain exactly 6 numbered page files matching the naming pattern of Source_Docs: `01-*.md` through `06-*.md`.
2. THE Sanitized_Docs SHALL contain an `index.md` with a table of contents linking to all 6 pages and all diagrams, mirroring the structure of the Source_Docs index.
3. THE Sanitized_Docs SHALL contain one Mermaid diagram file in `diagrams/` for each diagram file present in `docs/intelligence-pipeline-deep-dive/diagrams/`.
4. WHEN a Source_Docs page contains internal cross-references to other pages or diagrams, THE Sanitized_Docs equivalent page SHALL contain corresponding cross-references pointing to the Sanitized_Docs versions of those pages and diagrams.
5. THE Sanitized_Docs page filenames SHALL use sanitized titles (e.g., `06-decision-execution.md` instead of `06-trading-decisions-and-execution.md`).

---

### Requirement 3: Strip Financial and Trading Terminology

**User Story:** As a presenter, I want all financial, market, and trading language removed from the sanitized docs, so that the presentation focuses on engineering without revealing the financial domain.

#### Acceptance Criteria

1. THE Sanitized_Docs SHALL NOT contain any stock ticker symbols (e.g., AAPL, TSLA, NVDA, XOM, META).
2. THE Sanitized_Docs SHALL NOT contain the trading action terms "buy", "sell", "hold", or "watch" when used as system action labels or decision outputs.
3. THE Sanitized_Docs SHALL NOT contain the terms "trading engine", "paper trading", "live trading", "paper_eligible", or "live_eligible".
4. THE Sanitized_Docs SHALL NOT contain the terms "portfolio", "portfolio allocation", "portfolio heat", or "portfolio snapshots" when referring to the resource management domain concept.
5. THE Sanitized_Docs SHALL NOT contain references to "broker", "Alpaca", "broker adapter", or "broker API".
6. THE Sanitized_Docs SHALL NOT contain the terms "stock market", "Wall Street", "bullish", "bearish", "position sizing" (as a financial concept label), or "stop-loss" (as a financial concept label).
7. THE Sanitized_Docs SHALL NOT contain company names used as financial examples (e.g., "Apple", "Tesla", "NVIDIA" when used in a stock/market context).
8. THE Sanitized_Docs SHALL NOT contain the terms "SEC EDGAR", "SEC filings", "10-K", "10-Q", "8-K", "earnings", "earnings call", or "earnings report" as domain-specific financial references.
9. THE Sanitized_Docs SHALL NOT contain references to "Polygon.io" or "Polygon" as a financial data provider name.
10. THE Sanitized_Docs SHALL NOT contain the term "Stonks Oracle" or "stonks" as a system name.
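
The prohibitions in Requirement 3 lend themselves to an automated check. The following is a minimal sketch, assuming a Python helper run over each sanitized file's text; `FORBIDDEN_TERMS`, `TICKERS`, and `find_violations` are hypothetical names, and the lists cover only a subset of the terms above. Context-dependent terms (for example "buy", "hold", or "earnings") are left out because they need human review rather than a blanket match.

```python
import re

# Hypothetical subset of the unconditionally prohibited terms in Requirement 3.
FORBIDDEN_TERMS = [
    "trading engine", "paper trading", "live trading",
    "paper_eligible", "live_eligible",
    "broker", "Alpaca", "stock market", "Wall Street",
    "bullish", "bearish", "SEC EDGAR", "Polygon", "stonks",
]

# Example ticker symbols named in the requirement. These are matched
# case-sensitively as whole words so that "metadata" does not trip "META".
TICKERS = ["AAPL", "TSLA", "NVDA", "XOM", "META"]

TERM_RE = re.compile("|".join(re.escape(t) for t in FORBIDDEN_TERMS), re.IGNORECASE)
TICKER_RE = re.compile(r"\b(?:" + "|".join(TICKERS) + r")\b")


def find_violations(text):
    """Return (line_number, matched_text) pairs for every forbidden match."""
    violations = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for regex in (TERM_RE, TICKER_RE):
            for match in regex.finditer(line):
                violations.append((lineno, match.group(0)))
    return violations


if __name__ == "__main__":
    sample = "The trading engine saw a bearish article about AAPL."
    for lineno, term in find_violations(sample):
        print(f"line {lineno}: {term}")
```

A check like this only flags candidates. Terms that Requirement 11 preserves, such as "circuit breaker", pass through because they are not in either list.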

---

### Requirement 4: Apply Domain-Neutral Terminology Mapping

**User Story:** As a presenter, I want consistent domain-neutral replacements for all stripped terms, so that the sanitized docs read coherently as a general-purpose AI decision intelligence pipeline.

#### Acceptance Criteria

1. WHEN the Source_Docs use "stock ticker" or specific ticker symbols, THE Sanitized_Docs SHALL use "entity identifier" or "tracked entity".
2. WHEN the Source_Docs use "buy/sell/hold/watch" as action labels, THE Sanitized_Docs SHALL use "act/defer/monitor/observe" or equivalent neutral decision terms.
3. WHEN the Source_Docs use "trading engine", THE Sanitized_Docs SHALL use "decision execution engine" or "action engine".
4. WHEN the Source_Docs use "portfolio", THE Sanitized_Docs SHALL use "resource pool" or "allocation pool".
5. WHEN the Source_Docs use "broker" or "Alpaca", THE Sanitized_Docs SHALL use "execution adapter" or "external execution API".
6. WHEN the Source_Docs use "paper trading", THE Sanitized_Docs SHALL use "simulation mode" or "dry-run mode".
7. WHEN the Source_Docs use "live trading", THE Sanitized_Docs SHALL use "live execution mode" or "production mode".
8. WHEN the Source_Docs use "bullish" or "bearish", THE Sanitized_Docs SHALL use "positive" or "negative" (or "favorable"/"unfavorable").
9. WHEN the Source_Docs use "position sizing", THE Sanitized_Docs SHALL use "resource allocation" or "commitment sizing".
10. WHEN the Source_Docs use "stop-loss", THE Sanitized_Docs SHALL use "risk threshold" or "loss limit".
11. WHEN the Source_Docs use "Stonks Oracle" or "stonks", THE Sanitized_Docs SHALL use a neutral system name such as "the platform" or "the system".
12. WHEN the Source_Docs use "SEC EDGAR" or "SEC filings", THE Sanitized_Docs SHALL use "regulatory filings source" or "public records API".
13. WHEN the Source_Docs use "Polygon.io" or "Polygon", THE Sanitized_Docs SHALL use "external data provider" or "data source API".
14. WHEN the Source_Docs use "earnings" as a catalyst type or event, THE Sanitized_Docs SHALL use "performance report" or "periodic disclosure".
15. THE Sanitized_Docs SHALL apply the Terminology_Map consistently across all 6 pages, the index, and all diagram files.

---

### Requirement 5: Preserve Engineering and Technical Depth

**User Story:** As a presenter, I want all engineering concepts, algorithms, formulas, and architectural details preserved, so that the sanitized docs demonstrate the technical sophistication of the system.

#### Acceptance Criteria

1. THE Sanitized_Docs SHALL preserve all references to Redis queue patterns, including queue names and `rpush`/`lpop`/`blpop` operations.
2. THE Sanitized_Docs SHALL preserve all references to PostgreSQL tables, including table names and column descriptions.
3. THE Sanitized_Docs SHALL preserve all references to MinIO buckets and storage patterns.
4. THE Sanitized_Docs SHALL preserve all references to Ollama as the LLM inference provider.
5. THE Sanitized_Docs SHALL preserve the composite signal scoring formula: `combined = gate × recency × credibility × (1 + novelty_bonus) × market_context_multiplier`.
6. THE Sanitized_Docs SHALL preserve the confidence computation formula with log₂ scaling and its four components (unique source count, average extraction credibility, signal agreement with sample-size dampening, contradiction penalty).
7. THE Sanitized_Docs SHALL preserve the weighted sentiment average formula: `weighted_avg = Σ(combined_weight × impact_score × sentiment_value) / Σ(combined_weight × impact_score)`.
8. THE Sanitized_Docs SHALL preserve all code module path references (e.g., `services/aggregation/scoring.py`, `services/recommendation/eligibility.py`).
9. THE Sanitized_Docs SHALL preserve the three-layer signal architecture, renaming the layers with domain-neutral labels (e.g., "Entity-Specific Signals", "Environmental Signals", "Relational Signals") while retaining the weight ratios (1.0, 0.3, 0.2).
10. THE Sanitized_Docs SHALL preserve all threshold values, configuration parameters, and numeric constants (e.g., confidence gate of 0.2, recency half-lives per window, eligibility thresholds).
11. THE Sanitized_Docs SHALL preserve all Markdown table structures containing technical parameters and thresholds.
12. THE Sanitized_Docs SHALL preserve the contradiction detection algorithm, evidence ranking methodology, and trend projection computation.

---

### Requirement 6: Sanitize Mermaid Diagrams

**User Story:** As a presenter, I want the Mermaid diagrams sanitized with the same terminology mapping as the narrative pages, so that diagrams and text are consistent.

#### Acceptance Criteria

1. THE Sanitized_Docs SHALL contain one sanitized Mermaid diagram file for each of the 6 diagram files in Source_Docs.
2. WHEN a Source_Docs diagram contains financial/trading terminology (e.g., "trading engine", "buy/sell", "paper_eligible", "bullish/bearish", ticker symbols), THE corresponding Sanitized_Docs diagram SHALL use the same domain-neutral replacements defined in the Terminology_Map.
3. THE Sanitized_Docs diagrams SHALL preserve all Mermaid syntax, node relationships, subgraph structures, and flow directions from the Source_Docs diagrams.
4. THE Sanitized_Docs diagrams SHALL preserve all code module path references and service names within diagram nodes.
5. THE Sanitized_Docs diagram filenames SHALL use sanitized names where the original names contain financial terms (e.g., `decision-engine-loop.md` instead of `trading-engine-decision-loop.md`).

---

### Requirement 7: Sanitize Redis Key and Queue Name References

**User Story:** As a presenter, I want Redis key patterns and queue names sanitized where they contain financial terms, so that even infrastructure-level references are domain-neutral.

#### Acceptance Criteria

1. WHEN a Source_Docs Redis queue name contains "stonks" (e.g., `stonks:queue:ingestion`), THE Sanitized_Docs SHALL replace "stonks" with a neutral prefix (e.g., `app:queue:ingestion`).
2. WHEN a Source_Docs Redis key pattern contains a trading-specific segment such as "trading" or "broker" (e.g., `stonks:queue:broker_orders`, `stonks:trading:circuit_breaker:*`), THE Sanitized_Docs SHALL replace the trading-specific segment with a neutral equivalent (e.g., `app:queue:execution_orders`, `app:execution:circuit_breaker:*`).
3. THE Sanitized_Docs SHALL apply Redis key sanitization consistently across all narrative pages and diagram files.

---

### Requirement 8: Sanitize MinIO Bucket Name References

**User Story:** As a presenter, I want MinIO bucket names sanitized where they contain financial terms, so that storage references are domain-neutral.

#### Acceptance Criteria

1. WHEN a Source_Docs MinIO bucket name contains "stonks" (e.g., `stonks-raw-market`, `stonks-raw-news`, `stonks-normalized`), THE Sanitized_Docs SHALL replace "stonks" with a neutral prefix (e.g., `app-raw-data`, `app-raw-content`, `app-normalized`).
2. THE Sanitized_Docs SHALL apply MinIO bucket name sanitization consistently across all narrative pages and diagram files.

---

### Requirement 9: Sanitize Database Table and Column References Where Needed

**User Story:** As a presenter, I want database table and column names that contain obvious financial terms sanitized, while preserving the overall schema structure.

#### Acceptance Criteria

1. WHEN a Source_Docs database table name contains "trading" (e.g., `trading_decisions`), THE Sanitized_Docs SHALL use a neutral equivalent (e.g., `execution_decisions`).
2. WHEN a Source_Docs database table or column references "portfolio" (e.g., `portfolio_snapshots`, `portfolio_pct`), THE Sanitized_Docs SHALL use a neutral equivalent (e.g., `pool_snapshots`, `allocation_pct`).
3. THE Sanitized_Docs SHALL preserve all other database table names that do not contain financial-specific terms (e.g., `documents`, `document_intelligence`, `trend_windows`, `recommendations`).
4. THE Sanitized_Docs SHALL apply database reference sanitization consistently across all narrative pages.

---

### Requirement 10: Sanitize Example Scenarios and Inline References

**User Story:** As a presenter, I want all inline examples, scenario walkthroughs, and narrative references sanitized, so that no financial context leaks through illustrative content.

#### Acceptance Criteria

1. WHEN a Source_Docs page uses a specific company name or ticker in an example scenario (e.g., "a bearish article about AAPL"), THE Sanitized_Docs SHALL replace the reference with a generic entity (e.g., "a negative-sentiment article about Entity-A").
2. WHEN a Source_Docs page describes a financial event as an example (e.g., "earnings miss", "tariff announcement affecting XOM"), THE Sanitized_Docs SHALL reframe the example using domain-neutral language (e.g., "a negative performance disclosure", "a regulatory policy change affecting Entity-B").
3. WHEN a Source_Docs page references market-specific concepts in narrative flow (e.g., "markets move fast", "trading volume", "intraday swings"), THE Sanitized_Docs SHALL reframe using neutral language (e.g., "conditions change rapidly", "activity volume", "short-term fluctuations").
4. THE Sanitized_Docs SHALL preserve the logical structure and teaching purpose of all example scenarios while removing the financial framing.

---

### Requirement 11: Preserve Acceptable Engineering Terms

**User Story:** As a presenter, I want general engineering terms that happen to overlap with financial language preserved when they describe engineering patterns, so that the technical accuracy is maintained.

#### Acceptance Criteria

1. THE Sanitized_Docs SHALL preserve the term "circuit breaker" when it describes the engineering safety pattern (rate limiting, cascading failure prevention).
2. THE Sanitized_Docs SHALL preserve the term "exponential backoff" and all retry/backoff patterns.
3. THE Sanitized_Docs SHALL preserve all adapter pattern references (the software design pattern), renaming only the domain-specific adapter names (e.g., "AlpacaBrokerAdapter" becomes a neutral name).
4. THE Sanitized_Docs SHALL preserve the term "signal" as used in the signal processing and scoring context.
5. THE Sanitized_Docs SHALL preserve the terms "trend", "sentiment", "confidence", "contradiction", and "evidence" as used in the data analysis context.

---

### Requirement 12: Reframe the System Narrative

**User Story:** As a presenter, I want the overall system narrative reframed as a general-purpose AI decision intelligence pipeline, so that the presentation tells a coherent story without financial context.

#### Acceptance Criteria

1. THE Sanitized_Docs index page SHALL describe the system as an "AI-driven intelligence-to-decision pipeline" that ingests data from multiple sources, extracts structured intelligence via NLP/LLM, scores and weights signals, aggregates trends across time windows, generates recommendations with quality gates, and executes decisions autonomously with safety mechanisms.
2. THE Sanitized_Docs page 01 SHALL describe data ingestion from "multiple external data sources" rather than from financial-specific APIs.
3. THE Sanitized_Docs page 06 SHALL describe "autonomous decision execution with safety mechanisms" rather than "trading decisions and execution".
4. WHEN the Source_Docs conclusion references the "intelligence-to-decision pipeline in Stonks Oracle", THE Sanitized_Docs conclusion SHALL reference the "intelligence-to-decision pipeline" without a financial system name.
5. THE Sanitized_Docs SHALL maintain the narrative flow where each page ends with a transition to the next page, preserving the end-to-end story structure.
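
Because the Sanitization_Engine may be automated, the substitutions called for in Requirements 4 and 7 through 9 could be applied in a single pass. The sketch below is one possible approach, not a prescribed implementation: `TERMINOLOGY_MAP` and `sanitize` are hypothetical names, the map holds only a few entries taken from the examples in this document, and each entry picks one of the permitted replacements. Matching is case-sensitive, so capitalized prose variants would need their own entries.

```python
import re

# Hypothetical subset of the Terminology_Map, built from examples in this document.
TERMINOLOGY_MAP = {
    # Prose terms (Requirement 4)
    "trading engine": "decision execution engine",
    "paper trading": "simulation mode",
    "live trading": "live execution mode",
    "portfolio": "allocation pool",
    "bullish": "positive",
    "bearish": "negative",
    # Redis keys and queues (Requirement 7)
    "stonks:queue:broker_orders": "app:queue:execution_orders",
    "stonks:trading:": "app:execution:",
    "stonks:": "app:",
    # MinIO buckets (Requirement 8)
    "stonks-raw-market": "app-raw-data",
    "stonks-raw-news": "app-raw-content",
    "stonks-normalized": "app-normalized",
    # Database tables and columns (Requirement 9)
    "trading_decisions": "execution_decisions",
    "portfolio_snapshots": "pool_snapshots",
    "portfolio_pct": "allocation_pct",
}


def sanitize(text, mapping=TERMINOLOGY_MAP):
    """Replace every mapped term in one pass over the text."""
    # Longest keys first, so specific identifiers such as `portfolio_snapshots`
    # win over the generic terms they contain (`portfolio`).
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # A single pass means replacement text is never rescanned or re-replaced.
    return pattern.sub(lambda m: mapping[m.group(0)], text)


if __name__ == "__main__":
    print(sanitize("rpush stonks:queue:broker_orders"))
    print(sanitize("portfolio_snapshots tracks the portfolio"))
```

Ordering keys by length keeps identifier renames consistent with the prose renames, which supports the consistency criteria in Requirements 4, 7, 8, and 9. Terms that Requirement 11 preserves are untouched because they never appear as keys.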