docs: add LLM provider config (Ollama/vLLM/mixed), fix risk network alias in compose
ci/woodpecker/push/test Pipeline was successful
ci/woodpecker/push/build-3 Pipeline was successful
ci/woodpecker/push/build-1 Pipeline was successful
ci/woodpecker/push/build-2 Pipeline was successful
ci/woodpecker/push/finalize Pipeline was successful
Build and Push / lint-and-test (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.adapters.broker_adapter name:broker-adapter]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.aggregation.worker name:aggregation]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.extractor.worker name:extractor]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.ingestion.worker name:ingestion]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.lake_publisher.worker name:lake-publisher]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.parser.worker name:parser]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.recommendation.worker name:recommendation]) (push) Has been cancelled
Build and Push / build-services (map[cmd:python -m services.scheduler.app name:scheduler]) (push) Has been cancelled
Build and Push / build-services (map[cmd:uvicorn services.api.app:app --host 0.0.0.0 --port 8000 name:query-api]) (push) Has been cancelled
Build and Push / build-services (map[cmd:uvicorn services.risk.app:app --host 0.0.0.0 --port 8000 name:risk]) (push) Has been cancelled
Build and Push / build-services (map[cmd:uvicorn services.symbol_registry.app:app --host 0.0.0.0 --port 8000 name:symbol-registry]) (push) Has been cancelled
Build and Push / build-services (map[cmd:uvicorn services.trading.app:app --host 0.0.0.0 --port 8000 name:trading-engine]) (push) Has been cancelled
Build and Push / build-dashboard (push) Has been cancelled
Build and Push / build-superset (push) Has been cancelled
Build and Push / integration-test (push) Has been cancelled
Build and Push / beta-gate (push) Has been cancelled

This commit is contained in:
Celes Renata
2026-04-29 03:08:54 +00:00
parent f151747d56
commit 11c6457559
2 changed files with 124 additions and 3 deletions
+4
View File
@@ -312,6 +312,10 @@ services:
<<: *app-env
ports:
- "8003:8000"
networks:
default:
aliases:
- risk
depends_on:
postgres:
condition: service_healthy
+120 -3
View File
@@ -178,9 +178,16 @@ All application services support additional environment variables loaded via `se
| `REDIS_DB` | `0` | Redis database number |
| `REDIS_PASSWORD` | (none) | Redis password (not needed in Docker Compose) |
| `MINIO_SECURE` | `false` | Use HTTPS for MinIO |
| `OLLAMA_BASE_URL` | `http://ollama:11434` | Ollama LLM server URL |
| `OLLAMA_MODEL` | `qwen3.5:9b` | Default LLM model for extraction |
| `OLLAMA_TIMEOUT` | `120` | Ollama request timeout (seconds) |
| `OLLAMA_MAX_RETRIES` | `2` | Max retries for Ollama requests |
| `VLLM_BASE_URL` | (empty) | vLLM server URL (if using vLLM instead of Ollama) |
| `VLLM_MODEL` | (empty) | vLLM model name (e.g. `AxionML/Qwen3.5-9B-NVFP4`) |
| `VLLM_TIMEOUT` | `120` | vLLM request timeout (seconds) |
| `VLLM_MAX_RETRIES` | `2` | Max retries for vLLM requests |
| `VLLM_TEMPERATURE` | `0.7` | vLLM sampling temperature |
| `VLLM_API_KEY` | (empty) | vLLM API key (if required) |
| `TRINO_HOST` | `localhost` | Trino hostname |
| `TRINO_PORT` | `8080` | Trino port |
| `TRINO_CATALOG` | `lakehouse` | Trino catalog name |
@@ -203,6 +210,103 @@ See `services/shared/config.py` for the complete list of all supported environme
---
## LLM Provider Configuration
Stonks Oracle supports two LLM backends: **Ollama** (local, self-hosted) and **vLLM** (high-performance inference server). The active provider is configured per-agent in the `ai_agents` database table, but the connection details come from environment variables.
### Option A: Bundled Ollama (default)
The `docker-compose.yml` includes an Ollama container. On first start, pull a model:
```bash
docker compose exec ollama ollama pull qwen3.5:9b-fast
```
No additional configuration needed — services connect to `http://ollama:11434` by default.
### Option B: External Ollama
If Ollama is already running on the host (e.g. with GPU access), create a `docker-compose.override.yml`:
```yaml
services:
ollama:
entrypoint: ["true"]
restart: "no"
ports: []
extractor:
depends_on:
postgres:
condition: service_healthy
redis:
condition: service_healthy
environment:
OLLAMA_BASE_URL: "http://host.docker.internal:11434"
extra_hosts:
- "host.docker.internal:host-gateway"
recommendation:
environment:
OLLAMA_BASE_URL: "http://host.docker.internal:11434"
extra_hosts:
- "host.docker.internal:host-gateway"
```
This disables the bundled Ollama container and routes services to the host's instance. Replace the port if your Ollama runs on a non-standard port.
### Option C: vLLM Server
For higher throughput or quantized models (e.g. `AxionML/Qwen3.5-9B-NVFP4`), point services at a vLLM server. Add to your `.env`:
```dotenv
VLLM_BASE_URL=http://192.168.42.254:8000
VLLM_MODEL=AxionML/Qwen3.5-9B-NVFP4
VLLM_TIMEOUT=120
VLLM_TEMPERATURE=0.7
```
Then update the `ai_agents` table to use the vLLM provider:
```sql
UPDATE ai_agents SET model_provider = 'vllm', model_name = 'AxionML/Qwen3.5-9B-NVFP4' WHERE active = true;
```
Or use the API:
```bash
curl -X PUT http://localhost:8004/api/admin/agents/document-extractor \
-H 'Content-Type: application/json' \
-d '{"model_provider": "vllm", "model_name": "AxionML/Qwen3.5-9B-NVFP4"}'
```
### Option D: Mixed (Ollama + vLLM)
You can run different agents on different providers. For example, use vLLM for the high-volume extractor and Ollama for the thesis rewriter:
```sql
UPDATE ai_agents SET model_provider = 'vllm', model_name = 'AxionML/Qwen3.5-9B-NVFP4' WHERE slug = 'document-extractor';
UPDATE ai_agents SET model_provider = 'vllm', model_name = 'AxionML/Qwen3.5-9B-NVFP4' WHERE slug = 'event-classifier';
UPDATE ai_agents SET model_provider = 'ollama', model_name = 'qwen3.5:9b-fast' WHERE slug = 'thesis-rewriter';
```
Both `OLLAMA_BASE_URL` and `VLLM_BASE_URL` must be set in the environment for mixed mode.
### Automated Deployment
The `deploy-docker.sh` script handles LLM configuration automatically:
```bash
# Auto-detect host Ollama, use default model
bash deploy-docker.sh
# Specify a remote Ollama instance
bash deploy-docker.sh --ollama-url http://10.1.1.12:2701 --ollama-model qwen3.6
# Specify a different host
bash deploy-docker.sh --host user@myserver --dir /opt/stonks
```
---
## Volume Mounts and Data Persistence
Docker Compose defines five named volumes for persistent data:
@@ -576,9 +680,11 @@ The dashboard container runs nginx with reverse proxy rules that route API reque
|------|-----------|---------|
| `/api/` | `http://query-api:8000` | Query API |
| `/registry/` | `http://symbol-registry:8000/` | Symbol Registry API |
| `/risk/` | `http://risk-engine:8000/` | Risk Engine API |
| `/risk/` | `http://risk:8000/` | Risk Engine (via network alias) |
| `/trading/` | `http://trading-engine:8000/` | Trading Engine API |
The `risk-engine` service has a network alias of `risk` in `docker-compose.yml` so the nginx upstream resolves correctly.
All other paths serve the React SPA with `try_files` fallback to `index.html`.
---
@@ -610,12 +716,23 @@ docker compose up -d # Migrations re-applied on fresh init
### Ollama model not available
The extractor service needs an LLM model loaded in Ollama. Pull a model manually:
The extractor service needs an LLM model loaded. Pull a model manually:
```bash
docker compose exec ollama ollama pull qwen3.5:9b
# If using bundled Ollama container:
docker compose exec ollama ollama pull qwen3.5:9b-fast
# If using host Ollama:
ollama pull qwen3.5:9b-fast
# If using vLLM, ensure the model is loaded on the vLLM server
curl http://your-vllm-host:8000/v1/models
```
### Ollama port conflict (address already in use)
If Ollama is already running on the host, the bundled container will fail to bind port 11434. Use the external Ollama configuration described in the "LLM Provider Configuration" section above, or use `deploy-docker.sh` which handles this automatically.
### Port conflicts
If a port is already in use, modify the host port mapping in `docker-compose.yml`: