Commit Graph

28 Commits

Author SHA1 Message Date
Celes Renata 693d9e0d60 fix: reduce LLM timeouts — truncate docs to 8k/6k chars, cut num_predict 16k→4k, tighten prompts, trim anti-hallucination rules 2026-04-16 18:56:11 +00:00
Celes Renata f83577480f fix: alternate extractor between macro and extraction queues (1:2 ratio) to prevent starvation 2026-04-16 17:45:25 +00:00
Celes Renata 3ff910433f fix: reject empty LLM classifications for global events
When the LLM returns empty summary and no key facts, raise ValueError
so the retry logic kicks in instead of persisting an empty event.
Also strip whitespace from summary and filter empty key_facts entries.

Cleaned up 17 empty events from the database.
2026-04-15 19:46:31 +00:00
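
A minimal sketch of the check described in 3ff910433f, assuming the classification arrives as a dict with `summary` and `key_facts` fields. The helper name `validate_classification` and the dict shape are hypothetical and may not match the repository:

```python
def validate_classification(raw: dict) -> dict:
    """Clean an LLM classification and reject it if it carries no content.

    Hypothetical helper: field names follow the commit message,
    not necessarily the real schema.
    """
    # Strip surrounding whitespace; treat a missing or null summary as empty.
    summary = (raw.get("summary") or "").strip()

    # Keep only non-empty string facts, stripped of whitespace.
    key_facts = [
        fact.strip()
        for fact in (raw.get("key_facts") or [])
        if isinstance(fact, str) and fact.strip()
    ]

    if not summary and not key_facts:
        # Raising hands control to the caller's retry logic
        # instead of letting an empty event be persisted.
        raise ValueError("LLM returned an empty classification")

    return {**raw, "summary": summary, "key_facts": key_facts}
```

The error is raised only when both fields are empty, matching the "empty summary and no key facts" wording of the commit.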
Celes Renata da86132f0c fix: add num_predict=16384 to prevent output truncation on large articles 2026-04-15 03:11:13 +00:00
Celes Renata b8a2cdc52a fix: fill default values for missing fields in truncated LLM output 2026-04-15 03:08:10 +00:00
Celes Renata 00044af993 fix: switch to think=false with json-repair — 20x faster extraction 2026-04-15 02:54:39 +00:00
Celes Renata 4f2ae23d42 fix: set num_predict=16384 so model has token budget for thinking + content 2026-04-15 01:47:00 +00:00
Celes Renata 46b069a748 fix: switch to non-streaming Ollama calls — streaming breaks thinking mode 2026-04-15 01:19:17 +00:00
Celes Renata ffe19eb23a fix: handle empty ticker in MinIO storage paths, clean up debug log 2026-04-15 00:39:53 +00:00
Celes Renata 8b5b692d3c fix: update stall timer during thinking phase to prevent premature stream abort 2026-04-15 00:06:49 +00:00
Celes Renata 01726af360 fix: remove think=false (Ollama bug #14645), bump max_tokens to 32k 2026-04-14 23:50:28 +00:00
Celes Renata d8ea58104c fix: lint errors (import sorting, unused vars) 2026-04-14 19:48:19 +00:00
Celes Renata f7a11d14ea feat: competitive intelligence & historical pattern matching layer 2026-04-14 19:42:48 +00:00
Celes Renata 4fbddc307a fix(extractor): fallback for any unrecognized impact_horizon value 2026-04-12 16:27:37 -07:00
Celes Renata 6ae8aa779e fix(extractor): add underscore variants to impact_horizon normalizer
Model returns long_term/short_term/medium_term instead of hyphenated versions
2026-04-12 16:08:25 -07:00
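
The normalizer in 6ae8aa779e might look like the sketch below. It assumes the canonical values are the hyphenated forms implied by the commit message. The function name, the `DEFAULT_HORIZON` value, and the fallback behaviour (added in the later commit 4fbddc307a, whose chosen default is not stated) are illustrative assumptions:

```python
# Assumed canonical values: the hyphenated forms the commit message refers to.
CANONICAL_HORIZONS = {"short-term", "medium-term", "long-term"}

# Assumed fallback; the log does not say which default the real code uses.
DEFAULT_HORIZON = "medium-term"


def normalize_impact_horizon(value: object) -> str:
    """Map model output such as 'long_term' onto a canonical hyphenated value."""
    if not isinstance(value, str):
        return DEFAULT_HORIZON
    # Lowercase, trim, and convert underscore variants to hyphens.
    candidate = value.strip().lower().replace("_", "-")
    # Anything still unrecognized falls back to the default.
    return candidate if candidate in CANONICAL_HORIZONS else DEFAULT_HORIZON
```

Replacing underscores up front avoids listing every alias by hand, at the cost of also accepting mixed forms that the model may never emit.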
Celes Renata cd782d1552 fix(extractor): streaming with guardrails + catalyst_type normalization
- Switch Ollama calls from non-streaming to streaming with early termination
- Add loop detection, max token limit, and stall timeout guards
- Add catalyst_type alias normalizer to handle model hallucinations
- Add explicit enum values in extraction prompt for catalyst_type
- Add streaming config knobs to OllamaConfig
2026-04-12 15:28:20 -07:00
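
The three guards listed in cd782d1552 (loop detection, max token limit, stall timeout) can be illustrated with a generic consumer of streamed text chunks. This is a sketch, not the project's `OllamaClient`: the function name, parameters, and thresholds are all hypothetical, each chunk is treated as roughly one token, and the loop check is a simple repeated-tail heuristic.

```python
import time
from typing import Callable, Iterable


def collect_stream(
    chunks: Iterable[str],
    max_chunks: int = 4096,
    stall_timeout_s: float = 30.0,
    loop_window: int = 40,
    loop_repeats: int = 4,
    clock: Callable[[], float] = time.monotonic,
) -> tuple[str, str]:
    """Consume streamed chunks and return (text, stop_reason)."""
    parts: list[str] = []
    last_progress = clock()
    for count, chunk in enumerate(chunks, start=1):
        now = clock()
        # Stall guard: too long since the last non-empty chunk.
        # Only evaluated when a chunk arrives; a real client would also
        # need a read timeout on the underlying connection.
        if now - last_progress > stall_timeout_s:
            return "".join(parts), "stall"
        if chunk:
            parts.append(chunk)
            last_progress = now
        # Token guard: each chunk is assumed to be about one token.
        if count >= max_chunks:
            return "".join(parts), "max_tokens"
        # Loop guard: the last `loop_window` characters recur at least
        # `loop_repeats` times within the recent output.
        text = "".join(parts)
        if len(text) >= loop_window:
            tail = text[-loop_window:]
            recent = text[-loop_window * loop_repeats * 2:]
            if recent.count(tail) >= loop_repeats:
                return text, "loop"
    return "".join(parts), "done"
```

Because this stall timer advances only on non-empty chunks, a model that streams empty content while thinking would trip it. That is the kind of premature abort the later commit 8b5b692d3c addresses.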
Celes Renata 6e2f174b19 phase 17: disable qwen3.5 thinking mode (think:false) to reduce latency and improve structured output 2026-04-12 12:35:24 -07:00
Celes Renata 45f0c03639 phase 17: add request-level URL logging to OllamaClient for proxy debugging 2026-04-12 12:32:44 -07:00
Celes Renata 1993bfdf3e phase 17: add extraction output normalization — clamp scores to 0-1, map impact_horizon alternatives 2026-04-12 10:15:38 -07:00
Celes Renata 66ed38bf18 phase 17: switch to gemma4:e4b, rewrite prompts for fill-the-fields style with forced ticker inclusion 2026-04-12 10:05:31 -07:00
Celes Renata 28b3361833 phase 17: remove embedded JSON schema from user prompt (4.7KB saved), Ollama format param handles it 2026-04-12 09:28:28 -07:00
Celes Renata 57d0fc7d33 phase 17: pass all tracked tickers to extractor, soften prompt for macro-to-company relevance 2026-04-12 09:18:08 -07:00
Celes Renata 48bf4f7e7e phase 17: extractor fetches normalized text from MinIO when not in job payload 2026-04-12 03:24:10 -07:00
Celes Renata 012b973bb7 phase 17: wire extractor→aggregation→recommendation queue chain, add company_id_map to extractor 2026-04-12 03:16:27 -07:00
Celes Renata 0ac4493bd4 phase 17: fix parser URL lookup from DB and extractor text field name mismatch 2026-04-12 02:54:23 -07:00
Celes Renata 109440c91e phase 15: fix ruff lint errors across services 2026-04-11 12:10:01 -07:00
Celes Renata ce10afa034 phase 14-15: docker build validation and helm deployment 2026-04-11 11:59:45 -07:00
Celes Renata ebea70573b phase 0+1: project scaffold, k8s manifests, CI pipeline, steering, hooks, tests
- Repository structure for all services, infra, lakehouse, dashboards
- K8s manifests targeting stonks-oracle namespace with GHCR images
- Ingress via Traefik with ca-issuer TLS for internal services
- ConfigMap wired to existing cluster services (pg, redis, minio, ollama)
- GitHub Actions workflow for lint, test, multi-service container builds
- Dockerfile with build-arg CMD per service
- Makefile for local build/push/deploy
- Steering rules for TDD workflow, K8s conventions, project context
- Agent hooks for lint-on-save, test-on-save, k8s-validate, phase-commit
- Ruff linter config, all lint issues fixed
- 14 passing tests for schemas, config, redis keys
- PostgreSQL migrations, Trino catalogs, Superset config, MinIO lifecycle
2026-04-11 03:25:08 -07:00