ci: fix lint errors across project, update ruff.toml per-file ignores
@@ -50,6 +50,9 @@ alpaca.key
 alpaca.secret
 alpaca.url
 
+# Gitea OAuth2 credentials (generated by pipelines/gitea/setup.sh)
+pipelines/gitea/gitea-oauth2.env
+
 # Deploy scripts (live on gremlin-1, not in repo)
 runmefirst.sh
 runmelast.sh
@@ -0,0 +1 @@
|
|||||||
|
{"specId": "6864b7d1-ab86-473f-b6ad-7091eaabac76", "workflowType": "requirements-first", "specType": "feature"}
|
||||||
@@ -0,0 +1,519 @@
|
|||||||
|
# Local CI/CD Pipeline — Design
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
This design replaces the GitHub-dependent CI/CD pipeline (ARC + GHCR) with a fully local pipeline using Gitea as the Git forge, Woodpecker CI for pipeline execution, and the existing local Docker registry at `registry.celestium.life` for image storage. The existing ArgoCD and Kargo infrastructure is retained for GitOps deployment and staged promotion, with configuration updates to point at local sources instead of GitHub/GHCR.
|
||||||
|
|
||||||
|
The migration touches five areas:
|
||||||
|
|
||||||
|
1. **Gitea configuration** — Complete initial setup (admin user, OAuth2 app), create the `stonks-oracle` repository, and configure webhooks for Woodpecker CI. Gitea is already deployed in the `git-server` namespace but unconfigured.
|
||||||
|
2. **Woodpecker CI deployment** — Deploy server and agent via the `woodpecker/woodpecker` Helm chart in the `woodpecker` namespace. The server authenticates with Gitea via OAuth2. The agent uses the Kubernetes backend, executing each pipeline step as a standalone Pod.
|
||||||
|
3. **Pipeline file** — Create `.woodpecker.yml` translating the existing GitHub Actions workflow into Woodpecker's native format, targeting the local registry and adding a GitHub mirror step.
|
||||||
|
4. **ArgoCD/Kargo updates** — Update ArgoCD repo secret to point at Gitea, update ArgoCD Applications to source from Gitea, update Kargo Warehouse to watch the local registry.
|
||||||
|
5. **ARC teardown** — Remove ARC controller, runner scale set, RBAC, PV, and `arc-system` namespace.
|
||||||
|
|
||||||
|
### Key Design Decisions
|
||||||
|
|
||||||
|
1. **Woodpecker with Kubernetes backend (not Docker-in-Docker agent)** — The Woodpecker agent uses `WOODPECKER_BACKEND: kubernetes`, executing each pipeline step as a standalone Pod in the `woodpecker` namespace. A temporary PVC is created per pipeline run to transfer files between steps. This avoids DinD complexity for most steps. Image builds use the `woodpeckerci/plugin-docker-buildx` plugin with privileged mode for the build step only.
|
||||||
|
|
||||||
|
2. **Gitea API for initial setup** — Gitea's initial setup (admin user creation, OAuth2 app registration, repo creation) is automated via Gitea's REST API in `runmefirst.sh`. This avoids manual web UI interaction and makes the setup reproducible.
|
||||||
|
|
||||||
|
3. **Single Helm chart for Woodpecker** — The `woodpecker/woodpecker` chart contains both server and agent subcharts. One `helm install` deploys both components. The agent connects to the server via the in-cluster service `woodpecker-server:9000`.
|
||||||
|
|
||||||
|
4. **NFS PV for Woodpecker** — Woodpecker server data (SQLite database, build logs) persists on an NFS volume at `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/woodpecker`, surviving cluster rebuilds. The ARC PV is removed since ARC is being torn down.
|
||||||
|
|
||||||
|
5. **GitHub as read-only mirror** — After all CI steps pass, a final pipeline step pushes to GitHub via SSH key stored as a Woodpecker secret. GitHub mirror failure does not block image promotion or deployment.
|
||||||
|
|
||||||
|
6. **ArgoCD sources from Gitea** — ArgoCD's repo secret is updated to point at the Gitea repository URL. All three Applications (beta, paper, live) source Helm charts from Gitea instead of GitHub.
|
||||||
|
|
||||||
|
7. **Helm chart image registry update** — The base `values.yaml` changes `image.registry` from `ghcr.io/celesrenata/stonks-oracle` to `registry.celestium.life/stonks-oracle`. The `ghcrAuth` section and `ghcr-credentials` imagePullSecret are removed since the local registry requires no authentication.
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
```
┌─────────────────────────────────────────────────────────────────────────────────┐
│                           Gremlin Cluster (4x NixOS)                            │
│                                                                                 │
│   ┌─────────────────┐   ┌──────────────────┐   ┌───────────────────────────┐    │
│   │ git-server ns   │   │ woodpecker ns    │   │ argocd ns                 │    │
│   │ (pre-existing)  │   │ (NEW)            │   │ (existing, updated)       │    │
│   │                 │   │                  │   │                           │    │
│   │ Gitea           │   │ WP Server        │   │ ArgoCD Server             │    │
│   │ 10.1.1.x:30300  │   │ (StatefulSet)    │   │ (stonks-argocd.           │    │
│   │ :30022 (SSH)    │   │ stonks-ci.       │   │ celestium.life)           │    │
│   │                 │   │ celestium.life   │   │                           │    │
│   │ Local Registry  │   │                  │   │ Repo: Gitea (updated)     │    │
│   │ registry.       │   │ WP Agent         │   │                           │    │
│   │ celestium.life  │   │ (Deployment)     │   │                           │    │
│   │ :30500          │   │ K8s backend      │   │                           │    │
│   └─────────────────┘   └──────────────────┘   └───────────────────────────┘    │
│                                                                                 │
│   ┌─────────────────┐   ┌──────────────────┐   ┌───────────────────────────┐    │
│   │ kargo ns        │   │ stonks-beta ns   │   │ stonks-oracle ns          │    │
│   │ (existing,      │   │                  │   │ (live/production)         │    │
│   │ updated)        │   │ ArgoCD App:      │   │                           │    │
│   │                 │   │ stonks-beta      │   │ ArgoCD App: stonks-live   │    │
│   │ Warehouse:      │   │ images from      │   │ images from               │    │
│   │ local registry  │   │ local registry   │   │ local registry            │    │
│   │ (updated)       │   │                  │   │                           │    │
│   └─────────────────┘   └──────────────────┘   └───────────────────────────┘    │
│                                                                                 │
│ NFS: nfs://192.168.42.8:/volume1/Kubernetes/pipelines/{argocd,kargo,woodpecker} │
└─────────────────────────────────────────────────────────────────────────────────┘
```
|
||||||
|
|
||||||
|
### Pipeline Flow
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph LR
|
||||||
|
A[Git Push to Gitea] --> B[Webhook → Woodpecker CI]
|
||||||
|
B --> C[Lint + Test<br/>Python ruff + pytest<br/>Frontend vitest]
|
||||||
|
C --> D[Build + Push<br/>all images to<br/>local registry]
|
||||||
|
D --> E[Integration Tests<br/>run_pipeline.sh]
|
||||||
|
E -->|pass| F[GitHub Mirror<br/>git push]
|
||||||
|
E -->|fail| X[❌ Pipeline Failed]
|
||||||
|
F --> G[Kargo Warehouse<br/>detects new tag<br/>in local registry]
|
||||||
|
G --> H[Beta Stage<br/>auto-promote]
|
||||||
|
H --> I{Market Hours?}
|
||||||
|
I -->|outside| J[Paper Stage]
|
||||||
|
I -->|during| K[🚫 Blocked]
|
||||||
|
K -->|break-glass| J
|
||||||
|
J --> L{Market Hours?}
|
||||||
|
L -->|outside| M[Live Stage<br/>manual approval]
|
||||||
|
L -->|during| N[🚫 Blocked]
|
||||||
|
N -->|break-glass| M
|
||||||
|
```
|
||||||
|
|
||||||
|
## Components and Interfaces
|
||||||
|
|
||||||
|
### 1. Gitea Configuration (`pipelines/gitea/`)
|
||||||
|
|
||||||
|
Gitea is already deployed in the `git-server` namespace but needs initial setup. The configuration is automated via shell scripts that call Gitea's REST API.
|
||||||
|
|
||||||
|
**Setup Steps (in `runmefirst.sh`):**
|
||||||
|
|
||||||
|
1. **Complete initial setup** — POST to `http://<gitea-svc>:3000/` with admin credentials to complete the install wizard, or use the Gitea API to create the admin user if the instance is already initialized.
|
||||||
|
2. **Create OAuth2 application** — POST to `/api/v1/user/applications/oauth2` to register Woodpecker CI with callback URL `https://stonks-ci.celestium.life/authorize`. Store the returned `client_id` and `client_secret` for Woodpecker's Helm values.
|
||||||
|
3. **Create repository** — POST to `/api/v1/user/repos` to create `stonks-oracle` repository.
|
||||||
|
4. **Add Gitea remote to local repo** — Configure the local Git clone on gremlin-1 with the Gitea remote and push the existing codebase.
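The API calls in steps 2 and 3 can be sketched as small shell helpers. This is a minimal sketch, not the actual `setup.sh`: the function names and the `GITEA_URL`, `GITEA_USER`, and `GITEA_PASS` variables are assumptions introduced for illustration, and the request bodies carry only the fields named above.

```bash
#!/bin/bash
# Sketch only: GITEA_USER and GITEA_PASS are assumed to be provided by the caller.
GITEA_URL="${GITEA_URL:-http://gitea-http.git-server.svc.cluster.local:3000}"

# JSON body for POST /api/v1/user/applications/oauth2
oauth2_payload() {  # $1 = application name, $2 = redirect URI
  printf '{"name":"%s","redirect_uris":["%s"]}' "$1" "$2"
}

# JSON body for POST /api/v1/user/repos
repo_payload() {  # $1 = repository name
  printf '{"name":"%s"}' "$1"
}

# POST a JSON body to the Gitea API with basic auth; prints the response body.
gitea_post() {  # $1 = path under /api/v1, $2 = JSON body
  curl -sS -u "${GITEA_USER}:${GITEA_PASS}" \
    -H 'Content-Type: application/json' \
    -X POST "${GITEA_URL}/api/v1$1" -d "$2"
}

# Example calls (not executed here):
#   gitea_post /user/applications/oauth2 \
#     "$(oauth2_payload woodpecker https://stonks-ci.celestium.life/authorize)"
#   gitea_post /user/repos "$(repo_payload stonks-oracle)"
```

The response to the OAuth2 call contains the `client_id` and `client_secret`, so a real script would capture the output of `gitea_post` rather than discard it.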
|
||||||
|
|
||||||
|
**Gitea Service Access:**
|
||||||
|
- Web UI: `http://gitea-http.git-server.svc.cluster.local:3000` (cluster-internal) / `10.1.1.x:30300` (NodePort)
|
||||||
|
- SSH: `:30022` (NodePort)
|
||||||
|
- API: `http://gitea-http.git-server.svc.cluster.local:3000/api/v1/`
|
||||||
|
|
||||||
|
**Webhook Configuration:**
|
||||||
|
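Woodpecker CI registers the Gitea webhook automatically when a repository is activated through the Woodpecker dashboard or API. By default the webhook URL is derived from `WOODPECKER_HOST`; for Gitea to call the server's in-cluster service endpoint instead, set `WOODPECKER_WEBHOOK_HOST` to that address.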
|
||||||
|
|
||||||
|
### 2. Woodpecker CI Server and Agent (`pipelines/woodpecker/`)
|
||||||
|
|
||||||
|
**Namespace:** `woodpecker`
|
||||||
|
|
||||||
|
**Helm Chart:** `woodpecker/woodpecker` from the [woodpecker-ci/helm](https://github.com/woodpecker-ci/helm) repository. Contains two subcharts: `server` and `agent`.
|
||||||
|
|
||||||
|
**Server Configuration:**
|
||||||
|
- StatefulSet with 1 replica
|
||||||
|
- Persistent volume for SQLite database and build data at `/var/lib/woodpecker`
|
||||||
|
- NFS-backed PV at `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/woodpecker`
|
||||||
|
- Traefik ingress at `stonks-ci.celestium.life` with TLS via `ca-issuer`
|
||||||
|
- Gitea OAuth2 authentication via `WOODPECKER_GITEA=true`, `WOODPECKER_GITEA_URL`, `WOODPECKER_GITEA_CLIENT`, `WOODPECKER_GITEA_SECRET`
|
||||||
|
- `WOODPECKER_HOST=https://stonks-ci.celestium.life`
|
||||||
|
- `WOODPECKER_ADMIN=admin` (matches Gitea admin username)
|
||||||
|
|
||||||
|
**Agent Configuration:**
|
||||||
|
- Deployment with 2 replicas
|
||||||
|
- Kubernetes backend (`WOODPECKER_BACKEND: kubernetes`)
|
||||||
|
- Pipeline steps execute as standalone Pods in the `woodpecker` namespace
|
||||||
|
- Temporary PVC created per pipeline run for file transfer between steps
|
||||||
|
- `WOODPECKER_BACKEND_K8S_STORAGE_CLASS: ""` (use default)
|
||||||
|
- `WOODPECKER_BACKEND_K8S_VOLUME_SIZE: 10G`
|
||||||
|
- ServiceAccount with RBAC for creating Pods, Services, PVCs in the `woodpecker` namespace
|
||||||
|
- Additional ClusterRoleBinding for integration test steps that need to create ephemeral namespaces
|
||||||
|
|
||||||
|
**Helm Values Structure (`pipelines/woodpecker/values.yaml`):**
|
||||||
|
```yaml
server:
  enabled: true
  env:
    WOODPECKER_HOST: "https://stonks-ci.celestium.life"
    WOODPECKER_GITEA: "true"
    WOODPECKER_GITEA_URL: "http://gitea-http.git-server.svc.cluster.local:3000"
    WOODPECKER_GITEA_CLIENT: "<from-oauth2-setup>"
    WOODPECKER_GITEA_SECRET: "<from-oauth2-setup>"
    WOODPECKER_ADMIN: "admin"
  ingress:
    enabled: true
    ingressClassName: traefik
    hosts:
      - host: stonks-ci.celestium.life
        paths:
          - path: /
            backend:
              serviceName: woodpecker-server
              servicePort: 80
    tls:
      - secretName: woodpecker-tls
        hosts:
          - stonks-ci.celestium.life
    annotations:
      cert-manager.io/cluster-issuer: ca-issuer
  persistentVolume:
    enabled: true
    size: 5Gi
    storageClass: ""

agent:
  enabled: true
  replicaCount: 2
  env:
    WOODPECKER_SERVER: "woodpecker-server:9000"
    WOODPECKER_BACKEND: kubernetes
    WOODPECKER_BACKEND_K8S_NAMESPACE: woodpecker
    WOODPECKER_BACKEND_K8S_VOLUME_SIZE: 10G
    WOODPECKER_BACKEND_K8S_STORAGE_RWX: "true"
```
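The two `<from-oauth2-setup>` placeholders have to be filled in at install time. One way to do that, sketched here under assumptions, is to pass them as `--set-string` overrides on top of the values file. The release name `woodpecker` and the helper name are hypothetical; the value keys follow the structure shown above.

```bash
#!/bin/bash
# Sketch: print the helm arguments for installing Woodpecker with the
# Gitea OAuth2 credentials injected. One argument per output line.
woodpecker_helm_args() {  # $1 = OAuth2 client ID, $2 = OAuth2 client secret
  printf '%s\n' \
    upgrade --install woodpecker woodpecker/woodpecker \
    --namespace woodpecker \
    --values pipelines/woodpecker/values.yaml \
    --set-string "server.env.WOODPECKER_GITEA_CLIENT=$1" \
    --set-string "server.env.WOODPECKER_GITEA_SECRET=$2"
}

# Example (not executed here):
#   mapfile -t args < <(woodpecker_helm_args "$CLIENT_ID" "$CLIENT_SECRET")
#   helm "${args[@]}"
```

Passing the secret on the command line makes it visible in the process list while `helm` runs; a generated values file with restricted permissions would avoid that.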
|
||||||
|
|
||||||
|
**Network Policy:**
|
||||||
|
A NetworkPolicy in the `woodpecker` namespace allows Traefik ingress traffic to the Woodpecker server on its HTTP port (80).
|
||||||
|
|
||||||
|
### 3. Woodpecker Pipeline File (`.woodpecker.yml`)
|
||||||
|
|
||||||
|
The pipeline file translates the existing GitHub Actions workflow into Woodpecker's native format. Each step runs in its own container, which the Kubernetes backend schedules as a standalone Pod.
|
||||||
|
|
||||||
|
**Pipeline Structure:**
|
||||||
|
|
||||||
|
```
|
||||||
|
.woodpecker.yml
|
||||||
|
├── lint-python (ruff check services/)
|
||||||
|
├── test-python (pytest tests/)
|
||||||
|
├── test-frontend (npm ci && npx vitest --run)
|
||||||
|
├── build-<service> (×12 Python services, sequential or grouped)
|
||||||
|
├── build-dashboard (frontend/Dockerfile)
|
||||||
|
├── build-superset (docker/Dockerfile.superset)
|
||||||
|
├── integration-test (run_pipeline.sh)
|
||||||
|
└── mirror-github (git push to GitHub)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key Differences from GitHub Actions:**
|
||||||
|
- No `uses:` syntax — each step specifies an `image:` and `commands:` or uses a Woodpecker plugin
|
||||||
|
- Image builds use `woodpeckerci/plugin-docker-buildx` plugin with `settings.repo`, `settings.registry`, `settings.tags`
|
||||||
|
- Branch filtering via `when: { branch: main, event: push }` instead of GitHub's `if:` conditions
|
||||||
|
- Secrets referenced via `from_secret:` instead of `${{ secrets.X }}`
|
||||||
|
- No matrix build is used here (Woodpecker's `matrix` applies to whole workflows, not individual steps); services are built sequentially or via multiple steps
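Because the twelve service builds differ only in the service name, the build steps can be generated rather than hand-written. The helper below is a sketch under assumptions: `render_build_step` is a hypothetical name, the step uses the list-style `- name:` syntax of recent Woodpecker releases, and Dockerfile paths are left out because the source does not give them.

```bash
#!/bin/bash
# Sketch: emit one .woodpecker.yml build step for a given service.
render_build_step() {  # $1 = service name
  local registry="registry.celestium.life"
  cat <<EOF
  - name: build-$1
    image: woodpeckerci/plugin-docker-buildx
    settings:
      registry: ${registry}
      repo: ${registry}/stonks-oracle/$1
      tags:
        - "\${CI_COMMIT_SHA}"
        - latest
    when: { branch: main, event: push }
EOF
}

# Example (not executed here):
#   for svc in query-api; do render_build_step "$svc"; done
```

`${CI_COMMIT_SHA}` is escaped so that it reaches the YAML unchanged and is substituted by Woodpecker at run time, not by the shell.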
|
||||||
|
|
||||||
|
**Image Tagging:**
|
||||||
|
All images are pushed to `registry.celestium.life/stonks-oracle/<service>:<sha>` and `registry.celestium.life/stonks-oracle/<service>:latest`.
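As a small illustration of the tagging scheme, the following sketch prints both references for a service. The function name `image_refs` is hypothetical.

```bash
#!/bin/bash
# Sketch: print the two image references pushed for one service build.
image_refs() {  # $1 = service name, $2 = commit SHA
  local base="registry.celestium.life/stonks-oracle/$1"
  printf '%s:%s\n%s:latest\n' "$base" "$2" "$base"
}

# Example (not executed here):
#   image_refs query-api abc123
```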
|
||||||
|
|
||||||
|
**GitHub Mirror Step:**
|
||||||
|
Uses the `woodpeckerci/plugin-git-push` plugin or a custom step with `git push --mirror` using an SSH deploy key stored as a Woodpecker secret.
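If the custom-step route is taken, the non-blocking behaviour can be sketched as a wrapper that always exits successfully. This is an assumption-laden sketch: `mirror_to_github` is a hypothetical name, and it pushes only the current branch instead of using `git push --mirror`, since a mirror push from a CI clone would also push or delete every other ref.

```bash
#!/bin/bash
# Sketch: push the checked-out commit to GitHub without failing the pipeline.
mirror_to_github() {  # $1 = GitHub SSH remote URL, $2 = branch name
  if git push "$1" "HEAD:refs/heads/$2"; then
    echo "mirror: pushed $2"
  else
    echo "mirror: push failed (non-blocking)" >&2
  fi
  return 0
}

# Example (not executed here; the remote URL is an assumption):
#   mirror_to_github git@github.com:celesrenata/stonks-oracle.git main
```

Woodpecker's step-level `failure: ignore` setting may be another way to get the same effect without a wrapper.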
|
||||||
|
|
||||||
|
### 4. ArgoCD Updates
|
||||||
|
|
||||||
|
**Repo Secret Update (`pipelines/argocd/repo-secret.yaml`):**
|
||||||
|
Change the repository URL from GitHub to Gitea:
|
||||||
|
```yaml
stringData:
  url: http://gitea-http.git-server.svc.cluster.local:3000/admin/stonks-oracle.git
  type: git
  username: admin
  password: <gitea-admin-password>
```
|
||||||
|
|
||||||
|
**Application Updates (`pipelines/argocd/apps/*.yaml`):**
|
||||||
|
All three Applications (stonks-beta, stonks-paper, stonks-live) update `spec.source.repoURL` from `https://github.com/celesrenata/stonks-oracle.git` to the Gitea repository URL.
|
||||||
|
|
||||||
|
### 5. Kargo Warehouse Update
|
||||||
|
|
||||||
|
**Warehouse Update (`pipelines/kargo/warehouse.yaml`):**
|
||||||
|
Change the image subscription from GHCR to the local registry:
|
||||||
|
```yaml
spec:
  subscriptions:
    - image:
        repoURL: registry.celestium.life/stonks-oracle/query-api
```
|
||||||
|
|
||||||
|
Kargo stages, project, project-config, and market-hours AnalysisTemplate remain unchanged.
|
||||||
|
|
||||||
|
### 6. Helm Chart Updates (`infra/helm/stonks-oracle/`)
|
||||||
|
|
||||||
|
**`values.yaml` changes:**
|
||||||
|
```yaml
image:
  registry: registry.celestium.life/stonks-oracle  # was: ghcr.io/celesrenata/stonks-oracle
  pullPolicy: Always
  tag: latest

# REMOVED: imagePullSecrets, ghcrAuth sections
```
|
||||||
|
|
||||||
|
**`values-beta.yaml` and `values-paper.yaml`:**
|
||||||
|
No changes needed — they inherit `image.registry` from the base `values.yaml` and only override `image.tag`.
|
||||||
|
|
||||||
|
### 7. ARC Teardown
|
||||||
|
|
||||||
|
The `runmefirst.sh` script tears down ARC before installing Woodpecker:
|
||||||
|
|
||||||
|
1. `helm uninstall arc-runner-set --namespace arc-system || true`
|
||||||
|
2. `helm uninstall arc --namespace arc-system || true`
|
||||||
|
3. `kubectl delete -f arc/runner-rbac.yaml --ignore-not-found`
|
||||||
|
4. `kubectl delete pv pipeline-arc-pv --ignore-not-found`
|
||||||
|
5. `kubectl delete namespace arc-system --ignore-not-found`
|
||||||
|
|
||||||
|
The `pipelines/arc/` directory and `pipelines/pvs/arc-pv.yaml` are removed from the repo.
|
||||||
|
|
||||||
|
### 8. NFS Persistent Volumes
|
||||||
|
|
||||||
|
**Updated PV set** (ARC PV removed, Woodpecker PV added):
|
||||||
|
|
||||||
|
| PV Name | NFS Path | Capacity | Bound To |
|
||||||
|
|---|---|---|---|
|
||||||
|
| `pipeline-argocd-pv` | `/volume1/Kubernetes/pipelines/argocd` | 5Gi | PVC in `argocd` ns |
|
||||||
|
| `pipeline-kargo-pv` | `/volume1/Kubernetes/pipelines/kargo` | 2Gi | PVC in `kargo` ns |
|
||||||
|
| `pipeline-woodpecker-pv` | `/volume1/Kubernetes/pipelines/woodpecker` | 5Gi | PVC in `woodpecker` ns |
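
Each row of the table maps to one PersistentVolume manifest. The sketch below renders such a manifest from the table columns; the helper name, the `ReadWriteOnce` access mode, and the `Retain` reclaim policy are assumptions, chosen because the data is meant to survive teardown.

```bash
#!/bin/bash
# Sketch: render an NFS-backed PersistentVolume manifest on stdout.
render_nfs_pv() {  # $1 = PV name, $2 = NFS export path, $3 = capacity
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: $1
spec:
  capacity:
    storage: $3
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.42.8
    path: $2
EOF
}

# Example (not executed here):
#   render_nfs_pv pipeline-woodpecker-pv /volume1/Kubernetes/pipelines/woodpecker 5Gi \
#     | kubectl apply -f -
```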
|
||||||
|
|
||||||
|
### 9. Updated `runmefirst.sh`
|
||||||
|
|
||||||
|
```
|
||||||
|
#!/bin/bash
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
# 1. Tear down ARC (if present)
|
||||||
|
# - Uninstall ARC Helm releases
|
||||||
|
# - Delete RBAC, PV, namespace
|
||||||
|
|
||||||
|
# 2. Create namespaces (woodpecker, argocd, kargo, stonks-beta, stonks-paper)
|
||||||
|
|
||||||
|
# 3. Create NFS PVs (argocd, kargo, woodpecker)
|
||||||
|
|
||||||
|
# 4. Configure Gitea
|
||||||
|
# - Complete initial setup via API
|
||||||
|
# - Create admin user (if needed)
|
||||||
|
# - Create OAuth2 app for Woodpecker
|
||||||
|
# - Create stonks-oracle repository
|
||||||
|
|
||||||
|
# 5. Install Woodpecker CI via Helm
|
||||||
|
# - Inject Gitea OAuth2 client_id and client_secret into values
|
||||||
|
# - Apply NetworkPolicy for Traefik ingress
|
||||||
|
|
||||||
|
# 6. Install ArgoCD via Helm
|
||||||
|
# - Apply updated repo secret (pointing to Gitea)
|
||||||
|
# - Apply ArgoCD Applications
|
||||||
|
|
||||||
|
# 7. Install Kargo via Helm
|
||||||
|
# - Apply project, project-config, warehouse (local registry), stages
|
||||||
|
|
||||||
|
# 8. Apply Woodpecker agent RBAC for integration tests
|
||||||
|
```
|
||||||
|
|
||||||
|
### 10. Updated `runmelast.sh`
|
||||||
|
|
||||||
|
```
|
||||||
|
#!/bin/bash
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
# Reverse order: Kargo → ArgoCD → Woodpecker
|
||||||
|
# Preserves: PVs, NFS data, git-server namespace (Gitea + registry)
|
||||||
|
|
||||||
|
# 1. Remove Kargo resources + Helm release
|
||||||
|
# 2. Remove ArgoCD resources + Helm release
|
||||||
|
# 3. Remove Woodpecker Helm release
|
||||||
|
# 4. Delete namespaces (woodpecker, argocd, kargo)
|
||||||
|
# 5. PVs intentionally NOT deleted
|
||||||
|
```
|
||||||
|
|
||||||
|
### 11. Woodpecker Agent RBAC
|
||||||
|
|
||||||
|
The Woodpecker agent's service account needs:
|
||||||
|
- **Namespace-scoped RBAC** (auto-created by Helm chart): Create/delete Pods, Services, PVCs in the `woodpecker` namespace for pipeline step execution.
|
||||||
|
- **ClusterRoleBinding** (manually applied): Grant the agent service account `cluster-admin` for integration test steps that create ephemeral namespaces and deploy sandbox infrastructure. This mirrors the existing ARC runner RBAC pattern.
|
||||||
|
|
||||||
|
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: woodpecker-agent-inttest
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: woodpecker-agent
    namespace: woodpecker
```
|
||||||
|
|
||||||
|
## Data Models
|
||||||
|
|
||||||
|
### Pipeline Infrastructure Layout
|
||||||
|
|
||||||
|
```
~/sources/kube/pipelines/
├── runmefirst.sh                # Full install: ARC teardown → Gitea config → Woodpecker → ArgoCD → Kargo
├── runmelast.sh                 # Teardown: Kargo → ArgoCD → Woodpecker (preserves PVs, git-server)
├── gitea/
│   └── setup.sh                 # Gitea API setup: admin user, OAuth2 app, repo creation
├── woodpecker/
│   ├── values.yaml              # Woodpecker Helm values (server + agent)
│   ├── network-policy.yaml      # NetworkPolicy for Traefik → Woodpecker server
│   └── agent-rbac.yaml          # ClusterRoleBinding for integration test access
├── argocd/
│   ├── values.yaml              # ArgoCD Helm values (unchanged)
│   ├── repo-secret.yaml         # Updated: points to Gitea instead of GitHub
│   └── apps/
│       ├── stonks-beta.yaml     # Updated: repoURL → Gitea
│       ├── stonks-paper.yaml    # Updated: repoURL → Gitea
│       └── stonks-live.yaml     # Updated: repoURL → Gitea
├── kargo/
│   ├── values.yaml              # Kargo Helm values (unchanged)
│   ├── project.yaml             # Kargo Project (unchanged)
│   ├── project-config.yaml      # Kargo ProjectConfig (unchanged)
│   ├── warehouse.yaml           # Updated: watches local registry
│   ├── market-hours-check.yaml  # AnalysisTemplate (unchanged)
│   └── stages/
│       ├── beta.yaml            # Kargo Stage (unchanged)
│       ├── paper.yaml           # Kargo Stage (unchanged)
│       └── live.yaml            # Kargo Stage (unchanged)
└── pvs/
    ├── argocd-pv.yaml           # NFS PV for ArgoCD (unchanged)
    ├── kargo-pv.yaml            # NFS PV for Kargo (unchanged)
    └── woodpecker-pv.yaml       # NFS PV for Woodpecker (NEW, replaces arc-pv.yaml)
```
|
||||||
|
|
||||||
|
### Removed Files
|
||||||
|
|
||||||
|
```
|
||||||
|
pipelines/arc/ # Entire directory removed
|
||||||
|
├── values.yaml
|
||||||
|
├── runner-scaleset.yaml
|
||||||
|
└── runner-rbac.yaml
|
||||||
|
pipelines/pvs/arc-pv.yaml # ARC PV removed
|
||||||
|
```
|
||||||
|
|
||||||
|
### Image Tag Flow (Updated)
|
||||||
|
|
||||||
|
```
|
||||||
|
Git SHA (e.g., abc123)
|
||||||
|
→ Woodpecker builds: registry.celestium.life/stonks-oracle/<service>:abc123
|
||||||
|
→ Integration test: run_pipeline.sh --image-tag abc123
|
||||||
|
→ GitHub mirror: git push (non-blocking)
|
||||||
|
→ Kargo Warehouse detects: abc123 in local registry
|
||||||
|
→ Kargo Freight created: abc123
|
||||||
|
→ Beta: helm upgrade with image.tag=abc123
|
||||||
|
→ Paper: helm upgrade with image.tag=abc123 (after market-hours check)
|
||||||
|
→ Live: helm upgrade with image.tag=abc123 (after approval + market-hours check)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Kargo Resource Relationships (Updated)
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph TD
|
||||||
|
W[Warehouse: stonks-images<br/>watches LOCAL REGISTRY<br/>registry.celestium.life] -->|produces| F[Freight<br/>image tag = git SHA]
|
||||||
|
F -->|auto-promote| SB[Stage: beta<br/>ArgoCD App: stonks-beta]
|
||||||
|
SB -->|verified → available| SP[Stage: paper<br/>market-hours verification<br/>ArgoCD App: stonks-paper]
|
||||||
|
SP -->|verified → available| SL[Stage: live<br/>manual approval + market-hours<br/>ArgoCD App: stonks-live]
|
||||||
|
```
|
||||||
|
|
||||||
|
## Error Handling
|
||||||
|
|
||||||
|
### Gitea Setup Failures
|
||||||
|
|
||||||
|
| Failure | Detection | Recovery |
|
||||||
|
|---|---|---|
|
||||||
|
| Gitea not reachable | API call returns connection error | Check Gitea pod status in `git-server` namespace. Verify NodePort service. |
|
||||||
|
| Admin user already exists | API returns 422 | Script continues — idempotent. |
|
||||||
|
| OAuth2 app already exists | API returns 422 | Script queries existing apps and reuses credentials. |
|
||||||
|
| Repository already exists | API returns 409 | Script continues — idempotent. |
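
The idempotency rules in this table amount to treating "already exists" responses as success. A minimal sketch of that classification follows; the function name is hypothetical and the status codes are the ones listed in the table, plus the usual 200/201 success codes.

```bash
#!/bin/bash
# Sketch: classify the HTTP status of a Gitea "create" call.
gitea_create_outcome() {  # $1 = HTTP status code
  case "$1" in
    200|201) echo created ;;
    409|422) echo exists ;;          # resource already present: continue
    *)       echo error; return 1 ;;
  esac
}

# Example (not executed here), using curl's %{http_code} write-out:
#   status="$(curl -sS -o /dev/null -w '%{http_code}' ...)"
#   gitea_create_outcome "$status" || exit 1
```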
|
||||||
|
|
||||||
|
### Woodpecker Deployment Failures
|
||||||
|
|
||||||
|
| Failure | Detection | Recovery |
|
||||||
|
|---|---|---|
|
||||||
|
| Helm install fails | Non-zero exit | Check Helm chart repo access. Verify `woodpecker` namespace exists. |
|
||||||
|
| Server can't reach Gitea | OAuth2 login fails | Verify `WOODPECKER_GITEA_URL` resolves within cluster. Check Gitea service. |
|
||||||
|
| Agent can't connect to server | Agent logs show connection errors | Verify `WOODPECKER_SERVER` env var matches server service name. Check agent secret. |
|
||||||
|
| Pipeline step Pod fails to schedule | Pod stuck in Pending | Check node resources. Verify RBAC allows Pod creation in `woodpecker` namespace. |
|
||||||
|
| Image build fails (privileged) | Build step exits non-zero | Verify containerd/k3s allows privileged Pods. Check `plugin-docker-buildx` logs. |
|
||||||
|
|
||||||
|
### Pipeline Failures
|
||||||
|
|
||||||
|
| Failure | Detection | Recovery |
|
||||||
|
|---|---|---|
|
||||||
|
| Lint/test fails | Step exits non-zero | Fix code, push again. Build steps are skipped. |
|
||||||
|
| Image push to local registry fails | Plugin exits non-zero | Check registry health at `registry.celestium.life`. Verify DNS resolution. |
|
||||||
|
| Integration test fails | `run_pipeline.sh` exits non-zero | Check Woodpecker dashboard for step logs. Fix and re-push. |
|
||||||
|
| GitHub mirror fails | Mirror step exits non-zero | Non-blocking — images are already in local registry. Fix SSH key and re-run. |
|
||||||
|
|
||||||
|
### ArgoCD/Kargo Update Failures
|
||||||
|
|
||||||
|
| Failure | Detection | Recovery |
|
||||||
|
|---|---|---|
|
||||||
|
| ArgoCD can't clone from Gitea | Application shows "ComparisonError" | Verify repo secret credentials. Check Gitea accessibility from ArgoCD namespace. |
|
||||||
|
| Kargo can't reach local registry | Warehouse shows error | Verify `registry.celestium.life` DNS resolves. Check registry pod health. |
|
||||||
|
| Image pull fails (k3s nodes) | Pods stuck in ImagePullBackOff | Ensure k3s containerd trusts the local registry. Add registry mirror config if needed. |
|
||||||
|
|
||||||
|
### Rollback Strategy
|
||||||
|
|
||||||
|
Rollback works the same way as in the existing design:
|
||||||
|
- **Beta/Paper**: Promote a previous Freight in Kargo to roll back the image tag.
|
||||||
|
- **Live**: Same mechanism with manual approval required.
|
||||||
|
- **Emergency**: Direct `helm upgrade` with previous image tag.
|
||||||
|
|
||||||
|
## Testing Strategy
|
||||||
|
|
||||||
|
### Why Property-Based Testing Does Not Apply
|
||||||
|
|
||||||
|
This feature is entirely Infrastructure as Code: shell scripts, Kubernetes YAML manifests, Helm values files, and a Woodpecker pipeline YAML file. There are no pure functions, parsers, serializers, or business logic with meaningful input variation. Property-based testing (PBT) needs universal properties that hold across a wide input space, whereas this feature has fixed configuration values and Kubernetes resource states. Running 100 iterations of "does the Woodpecker ingress have TLS enabled" adds no value over running it once.
|
||||||
|
|
||||||
|
### Testing Approach
|
||||||
|
|
||||||
|
The testing strategy uses three tiers:
|
||||||
|
|
||||||
|
#### Tier 1: Smoke Tests (Configuration Validation)
|
||||||
|
|
||||||
|
Run locally or in CI without a live cluster.
|
||||||
|
|
||||||
|
| Test | What It Validates | How |
|
||||||
|
|---|---|---|
|
||||||
|
| Manifest syntax | All YAML files parse correctly | `kubectl apply --dry-run=client -f <file>` |
|
||||||
|
| Helm template rendering | Woodpecker values produce valid K8s resources | `helm template` with values file |
|
||||||
|
| Pipeline file syntax | `.woodpecker.yml` is valid | Woodpecker CLI lint or YAML parse |
|
||||||
|
| Namespace isolation | Pipeline namespaces distinct from `stonks-oracle` and `git-server` | Grep manifests for namespace fields |
|
||||||
|
| NFS path separation | PVs use distinct subdirectories | Inspect PV YAML |
|
||||||
|
| Image registry references | All manifests reference `registry.celestium.life` not `ghcr.io` | Grep all YAML for registry URLs |
|
||||||
|
| No GHCR auth remnants | `ghcrAuth` and `ghcr-credentials` removed from Helm chart | Grep values.yaml |
|
||||||
|
| ArgoCD repo URL | All Applications point to Gitea, not GitHub | Inspect Application YAML |
|
||||||
|
| Kargo warehouse URL | Warehouse watches local registry | Inspect warehouse YAML |
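
The two grep-based checks for registry references and GHCR remnants could be combined into one helper, sketched below. The function name is hypothetical, and the `--include` option assumes GNU or BSD grep.

```bash
#!/bin/bash
# Sketch: list YAML files under a directory that still mention GHCR.
find_ghcr_refs() {  # $1 = directory to scan
  grep -rlE --include='*.yaml' --include='*.yml' \
    'ghcr\.io|ghcrAuth|ghcr-credentials' "$1" | sort
}

# Example (not executed here): fail the smoke test if anything is found.
#   [ -z "$(find_ghcr_refs infra/helm/stonks-oracle)" ] || exit 1
```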
|
||||||
|
|
||||||
|
#### Tier 2: Integration Tests (Live Cluster Verification)
|
||||||
|
|
||||||
|
Run after `runmefirst.sh` on the Gremlin cluster.
|
||||||
|
|
||||||
|
| Test | What It Validates | How |
|
||||||
|
|---|---|---|
|
||||||
|
| Gitea accessible | Web UI responds | `curl http://10.1.1.x:30300` |
|
||||||
|
| Gitea repo exists | `stonks-oracle` repo created | Gitea API query |
|
||||||
|
| Woodpecker server running | Pods healthy in `woodpecker` namespace | `kubectl get pods -n woodpecker` |
|
||||||
|
| Woodpecker dashboard accessible | Web UI responds at `stonks-ci.celestium.life` | `curl -k https://stonks-ci.celestium.life` |
|
||||||
|
| Woodpecker OAuth2 works | Login redirects to Gitea | Browser test |
|
||||||
|
| ArgoCD accessible | Web UI responds at `stonks-argocd.celestium.life` | `curl -k https://stonks-argocd.celestium.life` |
|
||||||
|
| ArgoCD syncs from Gitea | Applications sync successfully | `argocd app get stonks-beta` |
|
||||||
|
| Kargo Warehouse | Discovers images from local registry | `kubectl get freight -n stonks-oracle` |
|
||||||
|
| Local registry accessible | Registry responds | `curl https://registry.celestium.life/v2/_catalog` |
|
||||||
|
| TLS certificates | Ingresses have valid certs from `ca-issuer` | `openssl s_client` or cert-manager status |
|
||||||
|
| PV binding | PVCs bound to NFS PVs | `kubectl get pvc -n woodpecker` |
|
||||||
|
| ARC removed | No ARC pods, no `arc-system` namespace | `kubectl get ns arc-system` returns NotFound |
|
||||||
|
| End-to-end pipeline | Push triggers build, images land in local registry | Push a commit, verify in Woodpecker dashboard |
|
||||||
|
| End-to-end promotion | Image flows beta → paper → live | Trigger promotion, verify deployments update |
|
||||||
|
| Teardown preservation | After `runmelast.sh`, PVs and NFS data intact | Run teardown, check PVs and NFS mount |
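
Two of the cluster checks above lend themselves to scripting. The sketch below wraps the ARC-removal and PV-binding rows; both function names are hypothetical, and the checks rely only on `kubectl get` with a JSONPath output.

```bash
#!/bin/bash
# Sketch: succeeds when the arc-system namespace no longer exists.
arc_removed() {
  ! kubectl get namespace arc-system >/dev/null 2>&1
}

# Sketch: succeeds when the namespace has at least one PVC and all are Bound.
pvcs_bound() {  # $1 = namespace
  local phases phase
  phases="$(kubectl get pvc -n "$1" -o jsonpath='{.items[*].status.phase}')" || return 1
  [ -n "$phases" ] || return 1
  for phase in $phases; do
    [ "$phase" = "Bound" ] || return 1
  done
}

# Example (not executed here):
#   arc_removed && pvcs_bound woodpecker && echo "cluster checks passed"
```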
|
||||||
|
|
||||||
|
#### Tier 3: Market-Hours and Break-Glass Tests
|
||||||
|
|
||||||
|
Unchanged from existing design — these tests validate Kargo behavior which is not modified.
|
||||||
|
|
||||||
|
| Test | What It Validates | How |
|
||||||
|
|---|---|---|
|
||||||
|
| Market-hours block | Promotion blocked during 09:30–16:00 ET | Run AnalysisTemplate during market hours |
|
||||||
|
| Market-hours allow | Promotion allowed outside hours | Run AnalysisTemplate outside hours |
|
||||||
|
| Break-glass override | Manual approval bypasses block | Use Kargo manual approval during hours |
|
||||||
|
| Break-glass audit | Records operator, timestamp, justification | Query Kargo audit trail |
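
The time window these tests exercise can be illustrated with a small predicate. This is only a sketch of the window logic, not the AnalysisTemplate itself: the function name is hypothetical, the interval is treated as 09:30 inclusive to 16:00 exclusive (an assumption), and market holidays are ignored.

```bash
#!/bin/bash
# Sketch: succeeds when the given Eastern Time moment falls in market hours.
is_market_hours() {  # $1 = ISO weekday (1=Mon..7=Sun), $2 = HHMM in ET
  local dow="$1" hhmm
  hhmm=$((10#$2))  # force base 10 so "0930" is not parsed as octal
  [ "$dow" -ge 1 ] && [ "$dow" -le 5 ] && [ "$hhmm" -ge 930 ] && [ "$hhmm" -lt 1600 ]
}

# Example (not executed here), evaluated for the current time:
#   is_market_hours "$(TZ=America/New_York date +%u)" \
#                   "$(TZ=America/New_York date +%H%M)" && echo "promotion blocked"
```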
|
||||||
@@ -0,0 +1,356 @@
|
|||||||
|
# Local CI/CD Pipeline — Requirements
|
||||||
|
|
||||||
|
## Introduction
|
||||||
|
|
||||||
|
This document specifies a fully local CI/CD pipeline for the Stonks Oracle platform that eliminates all dependency on GitHub's API for CI/CD orchestration. Gitea replaces GitHub as the primary Git remote, Woodpecker CI replaces ARC for CI execution and pipeline orchestration, and the local Docker registry at `registry.celestium.life` replaces GHCR for image storage. Woodpecker CI connects to Gitea via OAuth2 for authentication and receives push/PR webhooks to trigger pipelines. Woodpecker agents execute pipeline steps in containers on the Gremlin Cluster. GitHub becomes a read-only mirror updated only after successful CI runs. The existing ArgoCD and Kargo infrastructure is retained for GitOps deployment and staged promotion, with image sources updated to point at the local registry. All pipeline infrastructure scripts reside in `~/sources/kube/pipelines/` on gremlin-1 and persist state on NFS volumes that survive cluster rebuilds.
|
||||||
|
|
||||||
|
## Glossary
|
||||||
|
|
||||||
|
- **Gitea**: A self-hosted Git forge running in the `git-server` namespace at `10.1.1.x:30300` (web) and `:30022` (SSH), providing Git hosting and code review
|
||||||
|
- **Woodpecker_CI**: A CI/CD server that receives webhooks from Gitea and orchestrates pipelines, deployed as a Kubernetes StatefulSet in the `woodpecker` namespace, accessible via the Woodpecker_Dashboard at `stonks-ci.celestium.life`
|
||||||
|
- **Woodpecker_Agent**: The worker component of Woodpecker CI that connects to the Woodpecker_CI server and executes pipeline steps as Docker containers on the Gremlin_Cluster
|
||||||
|
- **Woodpecker_Dashboard**: The Woodpecker CI web UI at `stonks-ci.celestium.life` for viewing pipeline status, build logs, and managing repository settings
|
||||||
|
- **Local_Registry**: The Docker container registry running in the `git-server` namespace at `registry.celestium.life` (HTTPS via Traefik) and `:30500` (NodePort), backed by persistent storage at `/kubedata-local/registry`
|
||||||
|
- **GitHub_Mirror**: A post-CI pipeline step that pushes the repository to GitHub for public visibility after all CI checks pass, using `git push --mirror` or equivalent
|
||||||
|
- **ArgoCD**: A GitOps continuous delivery controller for Kubernetes that syncs cluster state from Git repositories
|
||||||
|
- **Kargo**: A promotion orchestration layer built on top of ArgoCD providing staged promotion gates, a visual web dashboard, and audit trails
|
||||||
|
- **Pipeline_Infrastructure**: The set of Kubernetes resources (Gitea, Woodpecker_CI, Woodpecker_Agent, Local_Registry, ArgoCD, Kargo) and their supporting manifests, PVs, and scripts that comprise the CI/CD system, deployed from `~/sources/kube/pipelines/`
|
||||||
|
- **Promotion**: The act of advancing a specific Image_Tag from one pipeline stage to the next (e.g., beta to paper)
|
||||||
|
- **Promotion_Blocker**: A time-based gate that prevents promotions during US equity market hours (09:30–16:00 ET, Monday–Friday)
|
||||||
|
- **Break_Glass**: An emergency override mechanism that bypasses the Promotion_Blocker, requiring explicit confirmation and an audit note
|
||||||
|
- **Stage**: One of the deployment environments in the pipeline: CI, Beta, Paper, Live
|
||||||
|
- **NFS_PV**: A Kubernetes PersistentVolume backed by the NFS share at `nfs://192.168.42.8:/volume1/Kubernetes/pipelines`, used to persist pipeline state across cluster rebuilds
|
||||||
|
- **Image_Tag**: A Docker image tag in the format `<sha>` (Git commit SHA) used to identify a specific build across all stages
|
||||||
|
- **Gremlin_Cluster**: The 4-node NixOS Kubernetes cluster (gremlin-1 through gremlin-4) at primary address 192.168.42.254
|
||||||
|
- **Market_Hours**: US equity market trading hours, 09:30–16:00 Eastern Time, Monday through Friday
|
||||||
|
- **Kargo_Dashboard**: The Kargo web UI providing visual promotion management, stage status, and audit history
|
||||||
|
- **ReviewBoard**: The code review tool running in the `docker-reviewboard` namespace at `cr.celestium.life`, used as an optional review gate before merge
|
||||||
|
- **Integration_Test_Runner**: The existing standalone script at `infra/inttest/run_pipeline.sh` that deploys an ephemeral sandbox, seeds data, runs API tests, and produces `inttest-results.json`
|
||||||
|
- **Woodpecker_Pipeline_File**: The `.woodpecker.yml` file in the repository root that defines CI pipeline steps in Woodpecker's YAML format, where each step is a Docker container with commands
|
||||||
|
|
||||||
|
## Requirements
|
||||||
|
|
||||||
|
### Requirement 1: Gitea Configuration and Repository Setup
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want Gitea configured with an admin user, the stonks-oracle repository, and webhook integration with Woodpecker CI, so that developers can push code to a fully local Git forge that triggers CI pipelines.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN the operator executes the pipeline install script, THE Pipeline_Infrastructure SHALL configure Gitea with an admin user account and complete the initial setup
|
||||||
|
2. WHEN Gitea is configured, THE Pipeline_Infrastructure SHALL create a `stonks-oracle` repository in Gitea with webhook integration to Woodpecker_CI
|
||||||
|
3. THE Gitea SHALL be accessible via the web UI at `10.1.1.x:30300` and via SSH at `:30022` for Git operations
|
||||||
|
4. WHEN a developer pushes code to the Gitea `stonks-oracle` repository, THE Gitea SHALL send a webhook event to Woodpecker_CI to trigger the matching pipeline
|
||||||
|
5. THE Pipeline_Infrastructure SHALL store Gitea configuration scripts and manifests in `~/sources/kube/pipelines/`
|
||||||
|
|
||||||
|
### Requirement 2: Woodpecker CI Server and Agent Deployment
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want Woodpecker CI server and agents deployed on the Gremlin_Cluster and connected to Gitea via OAuth2, so that CI pipelines execute locally on cluster resources without any external dependency.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN the pipeline install script executes, THE Pipeline_Infrastructure SHALL deploy the Woodpecker_CI server as a Kubernetes Deployment in the `woodpecker` namespace on the Gremlin_Cluster
|
||||||
|
2. WHEN the pipeline install script executes, THE Pipeline_Infrastructure SHALL deploy at least one Woodpecker_Agent as a Kubernetes Deployment in the `woodpecker` namespace
|
||||||
|
3. WHEN Woodpecker_CI is deployed, THE Woodpecker_CI SHALL authenticate with Gitea via OAuth2 and register webhooks for the `stonks-oracle` repository
|
||||||
|
4. WHEN a webhook event is received from Gitea, THE Woodpecker_CI SHALL schedule the pipeline on an available Woodpecker_Agent
|
||||||
|
5. THE Woodpecker_Agent SHALL execute each pipeline step as a Docker container on the Gremlin_Cluster
|
||||||
|
6. WHEN a pipeline completes, THE Woodpecker_Agent SHALL release cluster resources used by that pipeline's containers
|
||||||
|
|
||||||
|
### Requirement 3: Woodpecker CI Pipeline — Lint and Test
|
||||||
|
|
||||||
|
**User Story:** As a developer, I want every push to main or pull request to trigger automated linting and testing via Woodpecker CI, so that code quality is validated locally before images are built.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN a commit is pushed to the `main` branch or a pull request is opened in Gitea, THE Woodpecker_CI SHALL trigger the pipeline defined in `.woodpecker.yml` on a Woodpecker_Agent
|
||||||
|
2. WHEN the pipeline runs, THE Woodpecker_CI SHALL execute Python linting using `ruff check services/`
|
||||||
|
3. WHEN the pipeline runs, THE Woodpecker_CI SHALL execute Python unit tests using `pytest tests/`
|
||||||
|
4. WHEN the pipeline runs, THE Woodpecker_CI SHALL install frontend dependencies and execute frontend tests using `vitest`
|
||||||
|
5. IF any lint or test step fails, THEN THE Woodpecker_CI SHALL mark the pipeline as failed and skip image build steps
|
||||||
|
|
||||||
|
### Requirement 4: Woodpecker CI Pipeline — Image Build and Push to Local Registry
|
||||||
|
|
||||||
|
**User Story:** As a developer, I want Docker images for all services and the dashboard to be built and pushed to the Local_Registry on every successful main branch push, so that new images are available for local deployment without depending on GHCR.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN lint and tests pass on a push to `main`, THE Woodpecker_CI SHALL build Docker images for all 12 Python services (scheduler, symbol-registry, ingestion, parser, extractor, aggregation, recommendation, risk, broker-adapter, lake-publisher, query-api, trading-engine) using the `woodpeckerci/plugin-docker-buildx` plugin
|
||||||
|
2. WHEN lint and tests pass on a push to `main`, THE Woodpecker_CI SHALL build the dashboard Docker image from `frontend/Dockerfile`
|
||||||
|
3. WHEN lint and tests pass on a push to `main`, THE Woodpecker_CI SHALL build the superset Docker image from `docker/Dockerfile.superset`
|
||||||
|
4. WHEN images are built, THE Woodpecker_CI SHALL push each image to the Local_Registry with tags `registry.celestium.life/stonks-oracle/<service>:<git-sha>` and `registry.celestium.life/stonks-oracle/<service>:latest`
|
||||||
|
5. WHEN all images are pushed, THE Woodpecker_CI SHALL record the Git SHA as the Image_Tag for downstream stages
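
The tagging scheme in criterion 4 can be illustrated with a minimal sketch. It only builds the reference strings; it does not build or push anything. The image names `dashboard` and `superset` and the helper names are assumptions made for illustration, not names taken from the pipeline.

```python
# Sketch only: derive the two registry references expected per image.
REGISTRY = "registry.celestium.life/stonks-oracle"

# The 12 Python services listed in criterion 1.
PYTHON_SERVICES = [
    "scheduler", "symbol-registry", "ingestion", "parser", "extractor",
    "aggregation", "recommendation", "risk", "broker-adapter",
    "lake-publisher", "query-api", "trading-engine",
]

# Assumption: repository names for the two non-service images.
EXTRA_IMAGES = ["dashboard", "superset"]


def image_refs(service: str, git_sha: str) -> list[str]:
    """Return the <git-sha> and latest references for one image."""
    return [
        f"{REGISTRY}/{service}:{git_sha}",
        f"{REGISTRY}/{service}:latest",
    ]


def all_image_refs(git_sha: str) -> list[str]:
    """Return every reference a successful main-branch build should push."""
    refs: list[str] = []
    for name in PYTHON_SERVICES + EXTRA_IMAGES:
        refs.extend(image_refs(name, git_sha))
    return refs
```

A list like this could serve as a post-build check that every expected tag exists in the Local_Registry before the Image_Tag is recorded.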
|
||||||
|
|
||||||
|
### Requirement 5: Integration Test Stage
|
||||||
|
|
||||||
|
**User Story:** As a developer, I want the CI pipeline to automatically run integration tests against newly built images, so that functional correctness is validated before promotion to beta.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN all images are pushed to the Local_Registry for a given Image_Tag, THE Woodpecker_CI SHALL invoke the Integration_Test_Runner with `bash infra/inttest/run_pipeline.sh --image-tag <sha>`
|
||||||
|
2. WHEN the Integration_Test_Runner completes, THE Woodpecker_CI SHALL evaluate the `inttest-results.json` file for test counts and exit code
|
||||||
|
3. IF the Integration_Test_Runner exits with code 0, THEN THE Woodpecker_CI SHALL mark the Image_Tag as eligible for promotion to Beta
|
||||||
|
4. IF the Integration_Test_Runner exits with a non-zero code, THEN THE Woodpecker_CI SHALL block promotion to Beta and report the failure details in the Woodpecker_Dashboard
|
||||||
|
5. THE Woodpecker_CI SHALL store the `inttest-results.json` as a pipeline artifact accessible from the Woodpecker_Dashboard
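
A possible shape for the gate in criteria 2 to 4 is sketched below. The exit code alone decides eligibility, as the criteria state; the results file is read only to produce a summary. The JSON keys `passed` and `failed` are hypothetical, since the schema of `inttest-results.json` is not specified here.

```python
import json


def evaluate_inttest(exit_code: int, results_text: str) -> tuple[bool, str]:
    """Return (eligible_for_beta, summary) for one Integration_Test_Runner run."""
    try:
        data = json.loads(results_text)
        # Hypothetical keys; the real results schema may differ.
        counts = f"{data['passed']} passed, {data['failed']} failed"
    except (ValueError, KeyError, TypeError):
        # Unreadable or unexpected results do not change the gate decision.
        counts = "test counts unavailable"
    # Only a zero exit code makes the Image_Tag eligible for Beta.
    return exit_code == 0, f"{counts} (exit code {exit_code})"
```

Keeping the decision tied to the exit code means a malformed results file cannot accidentally unblock a promotion.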
|
||||||
|
|
||||||
|
### Requirement 6: GitHub Mirror Push After Successful CI
|
||||||
|
|
||||||
|
**User Story:** As a developer, I want the repository automatically mirrored to GitHub after all CI checks pass, so that the codebase remains publicly visible on GitHub without GitHub being in the critical CI/CD path.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN all pipeline steps (lint, test, image build, integration test) pass on a push to `main`, THE Woodpecker_CI SHALL push the repository to the GitHub remote at `github.com/celesrenata/stonks-oracle`
|
||||||
|
2. IF any pipeline step fails, THEN THE Woodpecker_CI SHALL skip the GitHub mirror push
|
||||||
|
3. THE GitHub_Mirror step SHALL use stored Git credentials (SSH key or token) managed via Woodpecker secrets to authenticate with GitHub
|
||||||
|
4. THE GitHub_Mirror step SHALL push all branches and tags to the GitHub remote
|
||||||
|
5. THE Pipeline_Infrastructure SHALL have zero dependency on GitHub API availability for CI/CD orchestration — GitHub mirror failure SHALL NOT block image promotion or deployment
|
||||||
|
|
||||||
|
### Requirement 7: ARC Teardown
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want ARC (GitHub Actions Runner Controller) removed from the cluster, so that the deprecated GitHub-dependent CI runner infrastructure is cleaned up.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN the pipeline install script executes, THE Pipeline_Infrastructure SHALL remove the ARC controller Helm release from the `arc-system` namespace
|
||||||
|
2. WHEN the pipeline install script executes, THE Pipeline_Infrastructure SHALL remove the ARC runner scale set Helm release from the `arc-system` namespace
|
||||||
|
3. WHEN ARC is removed, THE Pipeline_Infrastructure SHALL delete the `arc-system` namespace
|
||||||
|
4. WHEN ARC is removed, THE Pipeline_Infrastructure SHALL remove the ARC runner RBAC ClusterRoleBinding
|
||||||
|
5. THE Pipeline_Infrastructure SHALL remove the ARC NFS PersistentVolume (`pipeline-arc-pv`) since ARC data is no longer needed
|
||||||
|
|
||||||
|
### Requirement 8: Update Kargo Warehouse to Watch Local Registry
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want the Kargo Warehouse to watch the Local_Registry instead of GHCR for new image tags, so that the promotion pipeline sources images from the local infrastructure.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Kargo Warehouse `stonks-images` SHALL subscribe to image tags at `registry.celestium.life/stonks-oracle/query-api` instead of `ghcr.io/celesrenata/stonks-oracle/query-api`
|
||||||
|
2. WHEN a new Image_Tag is pushed to the Local_Registry, THE Kargo Warehouse SHALL detect the new tag and create a Freight resource
|
||||||
|
3. IF the Local_Registry is temporarily unavailable, THEN THE Kargo Warehouse SHALL retry image discovery and report the error in the Kargo_Dashboard
|
||||||
|
|
||||||
|
### Requirement 9: Update ArgoCD Applications to Use Local Registry
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want ArgoCD Applications to pull images from the Local_Registry instead of GHCR, so that all deployments source images from local infrastructure.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE ArgoCD Application `stonks-beta` SHALL deploy using images from `registry.celestium.life/stonks-oracle/` instead of `ghcr.io/celesrenata/stonks-oracle/`
|
||||||
|
2. THE ArgoCD Application `stonks-paper` SHALL deploy using images from `registry.celestium.life/stonks-oracle/` instead of `ghcr.io/celesrenata/stonks-oracle/`
|
||||||
|
3. THE ArgoCD Application `stonks-live` SHALL deploy using images from `registry.celestium.life/stonks-oracle/` instead of `ghcr.io/celesrenata/stonks-oracle/`
|
||||||
|
4. THE ArgoCD Applications SHALL source the Helm chart from the Gitea repository instead of the GitHub repository
|
||||||
|
5. WHEN ArgoCD syncs an Application, THE ArgoCD SHALL pull images from the Local_Registry using the Image_Tag set by Kargo during promotion
|
||||||
|
|
||||||
|
### Requirement 10: Update Helm Chart for Local Registry
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want the Helm chart's default image registry updated to the Local_Registry, so that all Kubernetes deployments pull images locally without requiring GHCR credentials.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Helm chart `values.yaml` SHALL set `image.registry` to `registry.celestium.life/stonks-oracle` instead of `ghcr.io/celesrenata/stonks-oracle`
|
||||||
|
2. THE Helm chart SHALL remove the `ghcrAuth` section and `ghcr-credentials` imagePullSecret since the Local_Registry does not require authentication
|
||||||
|
3. THE Helm chart `values-beta.yaml` and `values-paper.yaml` SHALL reference images from the Local_Registry
|
||||||
|
4. WHEN a Helm deployment is executed, THE Helm chart SHALL pull all service images from `registry.celestium.life/stonks-oracle/<service>:<tag>`
|
||||||
|
|
||||||
|
### Requirement 11: Pipeline Infrastructure Deployment
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want a single deployment script that installs all local CI/CD pipeline components (configures Gitea, deploys Woodpecker CI, updates ArgoCD and Kargo) onto the Gremlin_Cluster, so that the pipeline infrastructure can be stood up or rebuilt with one command.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN the operator executes `runmefirst.sh` from `~/sources/kube/pipelines/`, THE Pipeline_Infrastructure SHALL configure Gitea, deploy Woodpecker_CI server and Woodpecker_Agent, install ArgoCD, and install Kargo into the Gremlin_Cluster in dedicated namespaces
|
||||||
|
2. WHEN the operator executes `runmefirst.sh`, THE Pipeline_Infrastructure SHALL create NFS-backed PersistentVolumes at `nfs://192.168.42.8:/volume1/Kubernetes/pipelines` for ArgoCD, Kargo, and Woodpecker persistent data
|
||||||
|
3. WHEN ArgoCD is deployed, THE Pipeline_Infrastructure SHALL expose the ArgoCD web UI via Traefik ingress with TLS using the `ca-issuer` ClusterIssuer
|
||||||
|
4. WHEN Kargo is deployed, THE Pipeline_Infrastructure SHALL expose the Kargo_Dashboard via Traefik ingress with TLS using the `ca-issuer` ClusterIssuer
|
||||||
|
5. WHEN Woodpecker_CI is deployed, THE Pipeline_Infrastructure SHALL expose the Woodpecker_Dashboard via Traefik ingress with TLS using the `ca-issuer` ClusterIssuer at `stonks-ci.celestium.life`
|
||||||
|
6. THE Pipeline_Infrastructure SHALL store all deployment manifests and scripts in `~/sources/kube/pipelines/` on gremlin-1
|
||||||
|
7. WHEN `runmefirst.sh` executes, THE Pipeline_Infrastructure SHALL tear down ARC components (controller, runner scale set, namespace, RBAC, PV) before installing local CI components
|
||||||
|
|
||||||
|
### Requirement 12: Pipeline Infrastructure Teardown
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want a teardown script that removes pipeline components without destroying persistent pipeline data, so that pipeline state survives cluster rebuilds.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN the operator executes `runmelast.sh` from `~/sources/kube/pipelines/`, THE Pipeline_Infrastructure SHALL remove Woodpecker_CI server, Woodpecker_Agent, ArgoCD, and Kargo deployments from the Gremlin_Cluster
|
||||||
|
2. WHEN `runmelast.sh` executes, THE Pipeline_Infrastructure SHALL preserve all NFS_PV resources and the data stored on `nfs://192.168.42.8:/volume1/Kubernetes/pipelines`
|
||||||
|
3. WHEN `runmelast.sh` executes, THE Pipeline_Infrastructure SHALL leave the `stonks-oracle` application namespace and all application workloads untouched
|
||||||
|
4. WHEN `runmelast.sh` executes, THE Pipeline_Infrastructure SHALL leave the `git-server` namespace (Gitea and Local_Registry) untouched since those are managed separately
|
||||||
|
5. WHEN the application teardown script `~/sources/kube/stonks-oracle/runmelast.sh` executes, THE Pipeline_Infrastructure SHALL remain operational and unaffected
|
||||||
|
|
||||||
|
### Requirement 13: Pipeline Infrastructure Isolation
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want the pipeline infrastructure to be fully isolated from the application infrastructure, so that deploying or tearing down one does not affect the other.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Pipeline_Infrastructure SHALL deploy Woodpecker_CI, ArgoCD, and Kargo in namespaces separate from the `stonks-oracle` application namespace and the `git-server` namespace
|
||||||
|
2. THE Pipeline_Infrastructure SHALL use independent Helm releases or manifests that share no lifecycle with the `stonks-oracle` Helm chart
|
||||||
|
3. THE Pipeline_Infrastructure SHALL use NFS_PV paths under `pipelines/` that are distinct from any application storage paths
|
||||||
|
|
||||||
|
### Requirement 14: Woodpecker CI Pipeline File
|
||||||
|
|
||||||
|
**User Story:** As a developer, I want the existing CI pipeline translated into Woodpecker's `.woodpecker.yml` format, so that the pipeline runs natively on Woodpecker CI without GitHub Actions dependencies.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Pipeline_Infrastructure SHALL create a `.woodpecker.yml` file in the repository root defining all CI pipeline steps in Woodpecker's native YAML format
|
||||||
|
2. THE Woodpecker_Pipeline_File SHALL define each pipeline step as a Docker container with explicit image and commands (no GitHub Actions `uses:` syntax)
|
||||||
|
3. THE Woodpecker_Pipeline_File SHALL use `when` conditions to restrict image build and push steps to pushes on the `main` branch
|
||||||
|
4. THE Woodpecker_Pipeline_File SHALL use the `woodpeckerci/plugin-docker-buildx` plugin for building and pushing Docker images to the Local_Registry
|
||||||
|
5. WHEN the `.woodpecker.yml` is committed, THE Woodpecker_CI SHALL execute the pipeline for all 12 Python services, the dashboard, and the superset image
|
||||||
|
6. THE Woodpecker_Pipeline_File SHALL include a GitHub mirror step that runs only after all other steps succeed on the `main` branch
|
||||||
|
|
||||||
|
### Requirement 15: Beta Stage Deployment
|
||||||
|
|
||||||
|
**User Story:** As a developer, I want a beta environment where newly built images are deployed for smoke testing and manual verification before promotion to paper trading, so that regressions are caught early.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN an Image_Tag passes the integration test stage, THE Beta_Stage SHALL deploy the application with that Image_Tag to the `stonks-beta` namespace managed by ArgoCD
|
||||||
|
2. WHILE the Beta_Stage is active, THE Kargo_Dashboard SHALL display the currently deployed Image_Tag and its promotion status
|
||||||
|
3. WHEN a developer requests promotion from Beta to Paper via the Kargo_Dashboard, THE Beta_Stage SHALL verify that the Image_Tag passed integration tests before allowing promotion
|
||||||
|
4. THE Beta_Stage SHALL use the same Helm chart (`infra/helm/stonks-oracle/`) as production, with beta-specific value overrides pulling images from the Local_Registry
|
||||||
|
|
||||||
|
### Requirement 16: Paper Trading Stage Deployment
|
||||||
|
|
||||||
|
**User Story:** As a trader, I want a paper trading environment that uses the Alpaca paper broker, so that new builds can be validated against simulated market conditions before going live.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN an Image_Tag is promoted from Beta, THE Paper_Stage SHALL deploy the application with that Image_Tag to the `stonks-paper` namespace managed by ArgoCD
|
||||||
|
2. THE Paper_Stage SHALL configure the broker adapter with `BROKER_MODE=paper` and `BROKER_PROVIDER=alpaca` using Alpaca paper trading credentials
|
||||||
|
3. WHILE Market_Hours are active (09:30–16:00 ET, Monday–Friday), THE Paper_Stage SHALL block automatic and manual promotions to the Paper_Stage unless Break_Glass is activated
|
||||||
|
4. WHEN a promotion to Paper is attempted outside Market_Hours, THE Paper_Stage SHALL allow the promotion to proceed
|
||||||
|
5. THE Paper_Stage SHALL use the same Helm chart (`infra/helm/stonks-oracle/`) as production, with paper-specific value overrides pulling images from the Local_Registry
|
||||||
|
|
||||||
|
### Requirement 17: Live Stage Deployment
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want production deployments to require explicit manual approval with notes, so that live trading is protected from accidental or untested deployments.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN an Image_Tag is promoted from Paper, THE Live_Stage SHALL require explicit manual approval with a notes field before deploying to the `stonks-oracle` production namespace
|
||||||
|
2. THE Live_Stage SHALL deploy the application with the approved Image_Tag via ArgoCD syncing the production Helm release
|
||||||
|
3. WHILE Market_Hours are active (09:30–16:00 ET, Monday–Friday), THE Live_Stage SHALL block promotions to the Live_Stage unless Break_Glass is activated
|
||||||
|
4. WHEN a promotion to Live is attempted outside Market_Hours with valid approval, THE Live_Stage SHALL allow the promotion to proceed
|
||||||
|
5. THE Live_Stage SHALL use the existing `stonks-oracle` namespace and Helm chart with production values pulling images from the Local_Registry
|
||||||
|
|
||||||
|
### Requirement 18: Market-Hours Promotion Blocker
|
||||||
|
|
||||||
|
**User Story:** As a risk manager, I want promotions to paper and live environments to be blocked during US market hours, so that deployments do not disrupt active trading sessions.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHILE the current time is between 09:30 and 16:00 Eastern Time on a weekday, THE Promotion_Blocker SHALL prevent promotions to the Paper_Stage and Live_Stage
|
||||||
|
2. WHEN the current time is outside 09:30–16:00 ET or on a weekend, THE Promotion_Blocker SHALL allow promotions to proceed (subject to other gates)
|
||||||
|
3. WHEN a promotion is blocked by the Promotion_Blocker, THE Kargo_Dashboard SHALL display a visual indicator showing the block reason and the time until the market closes
|
||||||
|
4. THE Promotion_Blocker SHALL evaluate Eastern Time correctly, accounting for US daylight saving time transitions
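
The time check described above can be sketched in a few lines. This is an illustration under stated assumptions, not the gate as implemented: the function name is hypothetical, 09:30 is treated as inclusive and 16:00 as exclusive, and exchange holidays are ignored. Daylight saving time is handled by converting to the `America/New_York` zone rather than applying a fixed UTC offset.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")
MARKET_OPEN = time(9, 30)
MARKET_CLOSE = time(16, 0)


def promotion_blocked(now: datetime) -> bool:
    """Return True when `now` falls inside Market_Hours (Mon-Fri, 09:30-16:00 ET)."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    local = now.astimezone(EASTERN)
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    if local.weekday() >= 5:
        return False
    # Assumption: open is inclusive, close is exclusive.
    return MARKET_OPEN <= local.time() < MARKET_CLOSE


# The same UTC clock time lands on different sides of the open in winter and summer.
winter = datetime(2024, 1, 16, 14, 0, tzinfo=timezone.utc)  # 09:00 EST, a Tuesday
summer = datetime(2024, 7, 16, 14, 0, tzinfo=timezone.utc)  # 10:00 EDT, a Tuesday
print(promotion_blocked(winter), promotion_blocked(summer))
```

Requiring a timezone-aware input avoids silently interpreting a naive timestamp in whatever zone the evaluating container happens to use.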
|
||||||
|
|
||||||
|
### Requirement 19: Break-Glass Emergency Override
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want a break-glass mechanism to bypass market-hours blockers during emergencies, so that critical fixes can be deployed at any time.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN an operator activates Break_Glass via the Kargo_Dashboard, THE Pipeline_Infrastructure SHALL bypass the Promotion_Blocker for the target Stage
|
||||||
|
2. WHEN Break_Glass is activated, THE Kargo_Dashboard SHALL require a confirmation dialog before proceeding
|
||||||
|
3. WHEN Break_Glass is activated, THE Pipeline_Infrastructure SHALL require the operator to provide a written justification note
|
||||||
|
4. WHEN Break_Glass is used, THE Pipeline_Infrastructure SHALL record the operator identity, timestamp, target Stage, Image_Tag, and justification note in the audit trail
|
||||||
|
5. THE Break_Glass mechanism SHALL apply only to the single promotion for which it was activated and SHALL NOT disable the Promotion_Blocker for subsequent promotions
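
One way to picture the single-use semantics of Break_Glass is a decision function that takes the override as an argument for one promotion and never stores it. The class and function names below are hypothetical and are not part of Kargo's API; the sketch only models the rules listed above (confirmation, non-empty justification, audit fields).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class BreakGlass:
    operator: str
    justification: str
    confirmed: bool  # result of the confirmation dialog


@dataclass(frozen=True)
class AuditRecord:
    operator: str
    timestamp: datetime
    target_stage: str
    image_tag: str
    justification: str


def authorize_promotion(
    blocked: bool,
    target_stage: str,
    image_tag: str,
    override: Optional[BreakGlass] = None,
) -> tuple[bool, Optional[AuditRecord]]:
    """Return (allowed, audit_record); a record is produced only when Break_Glass is used."""
    if not blocked:
        return True, None
    if override is None or not override.confirmed:
        return False, None
    if not override.justification.strip():
        # A blank justification does not satisfy criterion 3.
        return False, None
    record = AuditRecord(
        operator=override.operator,
        timestamp=datetime.now(timezone.utc),
        target_stage=target_stage,
        image_tag=image_tag,
        justification=override.justification,
    )
    return True, record
```

Because the override is passed per call, a later promotion with no override is evaluated against the Promotion_Blocker again, which matches criterion 5.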
|
||||||
|
|
||||||
|
### Requirement 20: Per-Stage Enable/Disable Controls
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want to independently enable or disable each pipeline stage, so that the pipeline can be configured for different operational modes.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Pipeline_Infrastructure SHALL provide a configuration mechanism to independently enable or disable each of the four stages (CI, Beta, Paper, Live)
|
||||||
|
2. WHEN a Stage is disabled, THE Pipeline_Infrastructure SHALL skip that Stage during promotion and advance the Image_Tag to the next enabled Stage
|
||||||
|
3. WHEN a Stage is re-enabled, THE Pipeline_Infrastructure SHALL resume gating promotions through that Stage for new Image_Tags
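
The skip behaviour in criterion 2 amounts to finding the next enabled Stage in pipeline order. A minimal sketch follows, assuming the stage order CI, Beta, Paper, Live from the glossary and treating any Stage missing from the configuration as enabled (that default is an assumption, as is the function name).

```python
from typing import Optional

STAGE_ORDER = ["CI", "Beta", "Paper", "Live"]


def next_enabled_stage(current: str, enabled: dict[str, bool]) -> Optional[str]:
    """Return the next enabled Stage after `current`, or None if there is none."""
    idx = STAGE_ORDER.index(current)  # raises ValueError for an unknown Stage
    for stage in STAGE_ORDER[idx + 1:]:
        # Assumption: a Stage absent from the mapping counts as enabled.
        if enabled.get(stage, True):
            return stage
    return None
```

For example, with Beta disabled, an Image_Tag leaving CI would advance directly to Paper; re-enabling Beta changes the result for subsequent calls without touching tags already promoted.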
|
||||||
|
|
||||||
|
### Requirement 21: Revision Tracking
|
||||||
|
|
||||||
|
**User Story:** As a developer, I want to see which Image_Tag (Git SHA) is deployed at each pipeline stage, so that I can track exactly what code is running in each environment.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Kargo_Dashboard SHALL display the currently deployed Image_Tag for each active Stage
|
||||||
|
2. WHEN a promotion occurs, THE Kargo_Dashboard SHALL update the displayed Image_Tag for the target Stage within 60 seconds
|
||||||
|
3. THE Pipeline_Infrastructure SHALL maintain a mapping of Stage to current Image_Tag that is queryable via the Kargo API or ArgoCD
|
||||||
|
|
||||||
|
### Requirement 22: Audit Trail
|
||||||
|
|
||||||
|
**User Story:** As a compliance officer, I want a complete audit trail of all promotions including who promoted, when, with what notes, and whether break-glass was used, so that deployment decisions are traceable.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. WHEN a promotion occurs, THE Pipeline_Infrastructure SHALL record the operator identity, timestamp, source Stage, target Stage, Image_Tag, and any notes provided
|
||||||
|
2. WHEN Break_Glass is used for a promotion, THE Pipeline_Infrastructure SHALL record the break-glass justification alongside the standard promotion record
|
||||||
|
3. THE Kargo_Dashboard SHALL display the promotion history for each Stage, showing all recorded audit fields
|
||||||
|
4. THE Pipeline_Infrastructure SHALL persist audit trail data on NFS_PV so that promotion history survives cluster rebuilds
|
||||||
|
|
||||||
|
### Requirement 23: Kargo Visual Dashboard
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want a web dashboard showing all pipeline stages, their current revisions, and promotion controls, so that I can manage deployments visually.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Kargo_Dashboard SHALL display all active Stages with their current deployed Image_Tag and promotion status
|
||||||
|
2. THE Kargo_Dashboard SHALL provide a click-to-promote action for advancing an Image_Tag from one Stage to the next
|
||||||
|
3. WHILE Market_Hours are active, THE Kargo_Dashboard SHALL display block/allow indicators on the Paper_Stage and Live_Stage
|
||||||
|
4. THE Kargo_Dashboard SHALL provide a notes field when promoting or when a promotion is blocked
|
||||||
|
5. THE Kargo_Dashboard SHALL provide a Break_Glass button with a confirmation dialog for emergency overrides
|
||||||
|
6. THE Kargo_Dashboard SHALL be accessible via Traefik ingress at `stonks-kargo.celestium.life` with TLS via `ca-issuer`
|
||||||
|
|
||||||
|
### Requirement 24: NFS Persistent Storage
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want all pipeline state (ArgoCD app configs, Kargo promotion history, Woodpecker build data) to persist on NFS volumes, so that pipeline data survives cluster teardowns and rebuilds.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Pipeline_Infrastructure SHALL create PersistentVolumes backed by the NFS share at `nfs://192.168.42.8:/volume1/Kubernetes/pipelines` for ArgoCD server data, Kargo data, and Woodpecker_CI data
|
||||||
|
2. WHEN `runmelast.sh` is executed, THE NFS_PV resources and their underlying NFS data SHALL remain intact
|
||||||
|
3. WHEN `runmefirst.sh` is executed after a previous teardown, THE Pipeline_Infrastructure SHALL reattach to the existing NFS data and restore previous pipeline state
|
||||||
|
4. THE Pipeline_Infrastructure SHALL use separate NFS subdirectories for ArgoCD, Kargo, and Woodpecker to prevent data conflicts
|
||||||
|
|
||||||
|
### Requirement 25: ArgoCD GitOps Configuration with Gitea
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want ArgoCD to sync Kubernetes manifests from the Gitea repository instead of GitHub, so that the GitOps source of truth is the local Git forge.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE ArgoCD SHALL be configured with a repository secret pointing to the Gitea `stonks-oracle` repository instead of the GitHub repository
|
||||||
|
2. WHEN a change is committed to the Helm chart or values files in the Gitea repository, THE ArgoCD SHALL detect the change and sync the updated manifests to the target namespace
|
||||||
|
3. THE ArgoCD SHALL support multiple Application resources for beta, paper, and live environments, each with stage-specific value overrides
|
||||||
|
4. IF an ArgoCD sync fails, THEN THE ArgoCD SHALL report the failure status in the ArgoCD UI and the Kargo_Dashboard
|
||||||
|
|
||||||
|
### Requirement 26: Woodpecker Agent RBAC for Integration Tests
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want the Woodpecker agent to have sufficient Kubernetes RBAC permissions to run integration tests that create ephemeral namespaces and deploy sandbox infrastructure.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Pipeline_Infrastructure SHALL create a ClusterRoleBinding granting the Woodpecker_Agent service account permissions to create and delete namespaces for integration test sandboxes
|
||||||
|
2. THE Woodpecker_Agent RBAC SHALL be scoped to the minimum permissions required for integration test execution
|
||||||
|
3. WHEN the Woodpecker_Agent executes integration tests, THE Woodpecker_Agent SHALL have access to create deployments, services, and configmaps in ephemeral test namespaces
|
||||||
|
|
||||||
|
### Requirement 27: Woodpecker Pipeline File Creation
|
||||||
|
|
||||||
|
**User Story:** As a developer, I want a `.woodpecker.yml` file created that translates the existing CI pipeline into Woodpecker's native format, targeting the Local_Registry and including a GitHub mirror step.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Woodpecker_Pipeline_File SHALL define lint, test, image build, integration test, and GitHub mirror steps using Woodpecker's container-based step format
|
||||||
|
2. THE Woodpecker_Pipeline_File SHALL target all image pushes to `registry.celestium.life/stonks-oracle/` instead of `ghcr.io`
|
||||||
|
3. THE Woodpecker_Pipeline_File SHALL use Woodpecker secrets for Local_Registry credentials (if required) and GitHub mirror credentials
|
||||||
|
4. THE Woodpecker_Pipeline_File SHALL use `when` branch and event filters to restrict image builds and mirror pushes to `main` branch push events
|
||||||
|
5. THE Woodpecker_Pipeline_File SHALL support matrix builds or sequential steps for building all 12 Python services, the dashboard, and the superset image
|
||||||
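Acceptance criteria 2 and 5 can be illustrated with a minimal shell sketch that derives the two image references (commit SHA and `latest`) a build step would push for one service. The helper name `image_refs` and the sample SHA `abc1234` are hypothetical; only the registry prefix and the `query-api` service name are taken from this spec.

```shell
#!/usr/bin/env bash
# Sketch only: derive the image references a build step would push.
# REGISTRY_PREFIX comes from the spec; image_refs is a hypothetical helper.
REGISTRY_PREFIX="registry.celestium.life/stonks-oracle"

image_refs() {
  local service="$1" sha="$2"
  # One reference pinned to the commit, one floating "latest" tag.
  printf '%s/%s:%s\n' "$REGISTRY_PREFIX" "$service" "$sha"
  printf '%s/%s:latest\n' "$REGISTRY_PREFIX" "$service"
}

# Example call with a placeholder SHA.
image_refs query-api abc1234
```

Looping this helper over a list of service names is one way to cover all services sequentially; a Woodpecker matrix would achieve the same result declaratively.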
|
|
||||||
|
### Requirement 28: Woodpecker Dashboard Access via Traefik Ingress
|
||||||
|
|
||||||
|
**User Story:** As a platform operator, I want the Woodpecker CI dashboard accessible via a Traefik ingress with TLS, so that developers can view pipeline status and build logs from a browser.
|
||||||
|
|
||||||
|
#### Acceptance Criteria
|
||||||
|
|
||||||
|
1. THE Pipeline_Infrastructure SHALL expose the Woodpecker_Dashboard via Traefik ingress at `stonks-ci.celestium.life` with TLS using the `ca-issuer` ClusterIssuer
|
||||||
|
2. THE Woodpecker_Dashboard SHALL display pipeline status, build logs, and step-level output for all `stonks-oracle` repository builds
|
||||||
|
3. WHEN a pipeline fails, THE Woodpecker_Dashboard SHALL display the failing step and its container logs
|
||||||
|
4. THE Woodpecker_Dashboard SHALL authenticate users via the Gitea OAuth2 integration so that only Gitea users can access build information
|
||||||
|
5. THE Pipeline_Infrastructure SHALL create a Kubernetes NetworkPolicy allowing Traefik ingress traffic to the Woodpecker_CI server on its HTTP port
|
||||||
@@ -0,0 +1,92 @@
|
|||||||
|
# Implementation Plan: Local CI/CD Pipeline
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Migrate the Stonks Oracle CI/CD pipeline from GitHub-dependent infrastructure (ARC + GHCR) to a fully local pipeline using Gitea, Woodpecker CI, and the local Docker registry. Update ArgoCD and Kargo to source from local infrastructure. All pipeline scripts and manifests live in `~/sources/kube/pipelines/` on gremlin-1.
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
- [x] 1. Tear down ARC infrastructure
|
||||||
|
- [x] 1.1 Remove `pipelines/arc/` directory (values.yaml, runner-scaleset.yaml, runner-rbac.yaml)
|
||||||
|
- _Requirements: 7.1, 7.2, 7.4_
|
||||||
|
- [x] 1.2 Remove `pipelines/pvs/arc-pv.yaml` — ARC NFS PV no longer needed
|
||||||
|
- _Requirements: 7.5_
|
||||||
|
- [x] 1.3 Update `pipelines/runmefirst.sh` — add ARC teardown section at the beginning: uninstall `arc-runner-set` and `arc` Helm releases, delete `arc/runner-rbac.yaml` ClusterRoleBinding, delete `pipeline-arc-pv` PV, delete `arc-system` namespace. Use `|| true` and `--ignore-not-found` for idempotency.
|
||||||
|
- _Requirements: 7.1, 7.2, 7.3, 7.4, 7.5, 11.7_
|
||||||
|
|
||||||
|
- [x] 2. Create Woodpecker CI NFS PersistentVolume
|
||||||
|
- [x] 2.1 Create `pipelines/pvs/woodpecker-pv.yaml` — NFS PV for Woodpecker server data (5Gi, `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/woodpecker`, `persistentVolumeReclaimPolicy: Retain`, label `app: pipeline-woodpecker`)
|
||||||
|
- _Requirements: 24.1, 24.4_
|
||||||
|
|
||||||
|
- [x] 3. Create Gitea configuration script
|
||||||
|
- [x] 3.1 Create `pipelines/gitea/setup.sh` — Shell script that automates Gitea initial setup via REST API: complete install wizard or create admin user, create OAuth2 application for Woodpecker CI (callback URL `https://stonks-ci.celestium.life/authorize`), create `stonks-oracle` repository. Script outputs OAuth2 client_id and client_secret for use by Woodpecker Helm install. All operations idempotent (check-before-create pattern).
|
||||||
|
- _Requirements: 1.1, 1.2, 1.5, 2.3_
|
||||||
|
|
||||||
|
- [x] 4. Create Woodpecker CI manifests
|
||||||
|
- [x] 4.1 Create `pipelines/woodpecker/values.yaml` — Helm values for `woodpecker/woodpecker` chart in `woodpecker` namespace. Server: Gitea OAuth2 config (`WOODPECKER_GITEA=true`, `WOODPECKER_GITEA_URL`, `WOODPECKER_GITEA_CLIENT`, `WOODPECKER_GITEA_SECRET`), `WOODPECKER_HOST=https://stonks-ci.celestium.life`, `WOODPECKER_ADMIN=admin`, Traefik ingress at `stonks-ci.celestium.life` with TLS via `ca-issuer`, NFS-backed persistent volume (5Gi). Agent: 2 replicas, Kubernetes backend (`WOODPECKER_BACKEND=kubernetes`), `WOODPECKER_BACKEND_K8S_NAMESPACE=woodpecker`, `WOODPECKER_BACKEND_K8S_VOLUME_SIZE=10G`.
|
||||||
|
- _Requirements: 2.1, 2.2, 2.3, 2.5, 11.1, 11.5, 28.1_
|
||||||
|
- [x] 4.2 Create `pipelines/woodpecker/network-policy.yaml` — NetworkPolicy in `woodpecker` namespace allowing Traefik ingress (from `kube-system` namespace) to Woodpecker server on HTTP port
|
||||||
|
- _Requirements: 28.5_
|
||||||
|
- [x] 4.3 Create `pipelines/woodpecker/agent-rbac.yaml` — ClusterRoleBinding granting the Woodpecker agent service account `cluster-admin` for integration test steps that create ephemeral namespaces. Mirrors the existing ARC runner RBAC pattern.
|
||||||
|
- _Requirements: 26.1, 26.2, 26.3_
|
||||||
|
|
||||||
|
- [x] 5. Checkpoint — Verify Woodpecker and Gitea manifests
|
||||||
|
- Ensure all YAML manifests are syntactically valid. Verify Woodpecker Helm values include correct Gitea OAuth2 configuration. Verify NFS PV paths are distinct from ArgoCD and Kargo PVs. Ask the user if questions arise.
|
||||||
|
|
||||||
|
- [x] 6. Update ArgoCD manifests for Gitea
|
||||||
|
- [x] 6.1 Update `pipelines/argocd/repo-secret.yaml` — Change repository URL from `https://github.com/celesrenata/stonks-oracle.git` to the Gitea repository URL (`http://gitea-service.git-server.svc.cluster.local:3000/admin/stonks-oracle.git`). Update credentials to use Gitea admin username/password instead of GitHub token.
|
||||||
|
- _Requirements: 25.1, 9.4_
|
||||||
|
- [x] 6.2 Update `pipelines/argocd/apps/stonks-beta.yaml` — Change `spec.source.repoURL` from GitHub to Gitea repository URL
|
||||||
|
- _Requirements: 9.1, 25.3_
|
||||||
|
- [x] 6.3 Update `pipelines/argocd/apps/stonks-paper.yaml` — Change `spec.source.repoURL` from GitHub to Gitea repository URL
|
||||||
|
- _Requirements: 9.2, 25.3_
|
||||||
|
- [x] 6.4 Update `pipelines/argocd/apps/stonks-live.yaml` — Change `spec.source.repoURL` from GitHub to Gitea repository URL
|
||||||
|
- _Requirements: 9.3, 25.3_
|
||||||
|
|
||||||
|
- [x] 7. Update Kargo Warehouse for local registry
|
||||||
|
- [x] 7.1 Update `pipelines/kargo/warehouse.yaml` — Change `spec.subscriptions[0].image.repoURL` from `ghcr.io/celesrenata/stonks-oracle/query-api` to `registry.celestium.life/stonks-oracle/query-api`
|
||||||
|
- _Requirements: 8.1, 8.2_
|
||||||
|
|
||||||
|
- [x] 8. Update Helm chart for local registry
|
||||||
|
- [x] 8.1 Update `infra/helm/stonks-oracle/values.yaml` — Change `image.registry` from `ghcr.io/celesrenata/stonks-oracle` to `registry.celestium.life/stonks-oracle`. Remove `imagePullSecrets` list (remove `ghcr-credentials` entry). Remove `ghcrAuth` section entirely.
|
||||||
|
- _Requirements: 10.1, 10.2, 10.4_
|
||||||
|
|
||||||
|
- [x] 9. Checkpoint — Verify ArgoCD, Kargo, and Helm chart updates
|
||||||
|
- Ensure all updated YAML is syntactically valid. Verify no references to `ghcr.io` or `github.com` remain in ArgoCD apps, Kargo warehouse, or Helm values. Verify `values-beta.yaml` and `values-paper.yaml` don't need changes (they inherit `image.registry` from base). Ask the user if questions arise.
|
||||||
|
|
||||||
|
- [x] 10. Create Woodpecker pipeline file
|
||||||
|
- [x] 10.1 Create `.woodpecker.yml` in the repository root — Define all CI pipeline steps in Woodpecker's native YAML format: lint-python (ruff check), test-python (pytest), test-frontend (vitest), build steps for all 12 Python services using `woodpeckerci/plugin-docker-buildx` pushing to `registry.celestium.life/stonks-oracle/<service>:<sha>` and `:latest`, build-dashboard, build-superset, integration-test (run_pipeline.sh), and mirror-github. Use `when` conditions to restrict build/push/mirror steps to `main` branch push events. Use `depends_on` for step ordering. Use `from_secret` for registry and GitHub credentials.
|
||||||
|
- _Requirements: 14.1, 14.2, 14.3, 14.4, 14.5, 27.1, 27.2, 27.3, 27.4, 27.5, 3.1, 3.2, 3.3, 3.4, 3.5, 4.1, 4.2, 4.3, 4.4, 4.5, 5.1, 5.2, 5.3, 5.4, 5.5, 6.1, 6.2, 6.3, 6.4, 6.5_
|
||||||
|
|
||||||
|
- [x] 11. Update install script (`runmefirst.sh`)
|
||||||
|
- [x] 11.1 Rewrite `pipelines/runmefirst.sh` — New install order: (1) ARC teardown, (2) create namespaces (`woodpecker`, `argocd`, `kargo`, `stonks-beta`, `stonks-paper`), (3) apply NFS PVs (argocd, kargo, woodpecker), (4) run Gitea setup script (`gitea/setup.sh`), (5) add Woodpecker Helm repo and install Woodpecker with Gitea OAuth2 credentials injected, (6) apply Woodpecker network policy and agent RBAC, (7) install ArgoCD via Helm with updated values, apply repo secret and Applications, (8) install Kargo via Helm, apply project/warehouse/stages. Use `set -euo pipefail`, idempotent operations throughout.
|
||||||
|
- _Requirements: 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7, 1.1, 1.2, 1.4, 1.5_
|
||||||
|
|
||||||
|
- [x] 12. Update teardown script (`runmelast.sh`)
|
||||||
|
- [x] 12.1 Rewrite `pipelines/runmelast.sh` — New teardown order: (1) remove Kargo resources + Helm release, (2) remove ArgoCD resources + Helm release, (3) uninstall Woodpecker Helm release, delete Woodpecker agent RBAC and network policy, (4) delete namespaces (`woodpecker`, `argocd`, `kargo`). Preserve: NFS PVs, NFS data, `stonks-oracle` namespace, `stonks-beta`, `stonks-paper`, `git-server` namespace (Gitea + registry). Use `--ignore-not-found` and `|| true` for idempotency.
|
||||||
|
- _Requirements: 12.1, 12.2, 12.3, 12.4, 12.5, 13.1, 13.2, 13.3_
|
||||||
|
|
||||||
|
- [x] 13. Checkpoint — Verify all scripts and pipeline file
|
||||||
|
- Ensure `runmefirst.sh` install order matches design (ARC teardown → Gitea → Woodpecker → ArgoCD → Kargo). Ensure `runmelast.sh` teardown order is reverse (Kargo → ArgoCD → Woodpecker). Verify `.woodpecker.yml` covers all 12 services + dashboard + superset + integration test + GitHub mirror. Verify no references to ARC, `arc-system`, or GHCR remain in any pipeline scripts. Ask the user if questions arise.
|
||||||
|
|
||||||
|
- [x] 14. Final review
|
||||||
|
- Review all created and modified files for completeness. Verify the full pipeline flow: Gitea push → Woodpecker webhook → lint/test → build/push to local registry → integration test → GitHub mirror → Kargo detects new tag → beta auto-promote → paper/live with market-hours gates. Ensure all 28 requirements are addressed. Ask the user if questions arise.
|
||||||
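The idempotent teardown described in task 1.3 could look roughly like the sketch below. The `run` wrapper and the `DRY_RUN` switch are hypothetical additions so the command order can be reviewed without touching the cluster; the release names, ClusterRoleBinding, PV, and namespace are the ones named in this plan, and placing both Helm releases in `arc-system` is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: idempotent ARC teardown in the style of task 1.3.
# DRY_RUN and run() are hypothetical; with DRY_RUN=1 the commands are
# printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@" || true   # tolerate resources that are already gone
  fi
}

arc_teardown() {
  # Assumption: both Helm releases were installed into arc-system.
  run helm uninstall arc-runner-set -n arc-system
  run helm uninstall arc -n arc-system
  run kubectl delete clusterrolebinding arc-runner-inttest --ignore-not-found
  run kubectl delete pv pipeline-arc-pv --ignore-not-found
  run kubectl delete namespace arc-system --ignore-not-found
}

arc_teardown
```

Running with `DRY_RUN=0` would execute the commands; because each one either ignores missing resources or is followed by `|| true`, a second run should be harmless.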
|
|
||||||
|
## Notes
|
||||||
|
|
||||||
|
- Pipeline infrastructure scripts (`~/sources/kube/pipelines/`) are on gremlin-1, separate from the stonks-oracle repo
|
||||||
|
- The `.woodpecker.yml` pipeline file and Helm chart changes are in the stonks-oracle repo
|
||||||
|
- No property-based tests — this feature is entirely IaC (shell scripts, YAML manifests, Helm values)
|
||||||
|
- Gitea is already deployed in `git-server` namespace but unconfigured (still on install page)
|
||||||
|
- Local registry is already deployed and healthy at `registry.celestium.life`
|
||||||
|
- ArgoCD and Kargo are already deployed — only config updates needed (repo URL, image source)
|
||||||
|
- ARC is currently deployed and needs to be torn down before Woodpecker install
|
||||||
|
- The `.github/workflows/build.yml` remains in the repo for reference but won't be used by Woodpecker
|
||||||
|
- Woodpecker agents use the Kubernetes backend — each pipeline step runs as a standalone Pod
|
||||||
|
- Image builds use `woodpeckerci/plugin-docker-buildx` with privileged mode
|
||||||
|
- GitHub mirror is a post-CI step that pushes via SSH key — failure does not block promotion
|
||||||
|
- Break-glass, audit trail, and Kargo Dashboard features are provided by Kargo out of the box (Requirements 19, 20, 21, 22, 23)
|
||||||
|
- Market-hours AnalysisTemplate is unchanged from the existing pipeline
|
||||||
|
- Kargo stages, project, and project-config are unchanged
|
||||||
|
- `values-beta.yaml` and `values-paper.yaml` inherit `image.registry` from base `values.yaml` — no changes needed
|
||||||
|
- k3s nodes may need containerd mirror config to trust the local registry for image pulls
|
||||||
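For the last note, k3s reads per-node registry settings from `/etc/rancher/k3s/registries.yaml`. The sketch below only renders such a file to stdout; the helper name `render_registries_yaml` and the CA file path are hypothetical, and whether a custom CA entry is needed at all depends on how the local registry's certificate was issued.

```shell
#!/usr/bin/env bash
# Sketch only: render a k3s registries.yaml for the local registry.
# render_registries_yaml and the CA path below are hypothetical.
render_registries_yaml() {
  local host="$1" ca_file="$2"
  cat <<EOF
mirrors:
  "${host}":
    endpoint:
      - "https://${host}"
configs:
  "${host}":
    tls:
      ca_file: "${ca_file}"
EOF
}

# Example: print the config for the registry named in this plan.
render_registries_yaml registry.celestium.life /etc/ssl/certs/celestium-ca.crt
```

The output would be written to the k3s path on each node, followed by a restart of the k3s service, before image pulls from the local registry are expected to work.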
@@ -1,20 +1,9 @@
|
|||||||
## Global image settings
|
## Global image settings
|
||||||
image:
|
image:
|
||||||
registry: ghcr.io/celesrenata/stonks-oracle
|
registry: registry.celestium.life/stonks-oracle
|
||||||
pullPolicy: Always
|
pullPolicy: Always
|
||||||
tag: latest
|
tag: latest
|
||||||
|
|
||||||
imagePullSecrets:
|
|
||||||
- name: ghcr-credentials
|
|
||||||
|
|
||||||
## GHCR authentication for private registry
|
|
||||||
ghcrAuth:
|
|
||||||
enabled: true
|
|
||||||
registry: ghcr.io
|
|
||||||
username: celesrenata
|
|
||||||
# base64-encoded dockerconfigjson — override at install time
|
|
||||||
password: ""
|
|
||||||
|
|
||||||
## Service deployments — replicas and resource overrides
|
## Service deployments — replicas and resource overrides
|
||||||
services:
|
services:
|
||||||
scheduler:
|
scheduler:
|
||||||
|
|||||||
@@ -1,15 +0,0 @@
|
|||||||
# RBAC for ARC runner pods — allows integration tests to create
|
|
||||||
# ephemeral namespaces and deploy sandbox infrastructure.
|
|
||||||
# The service account is auto-created by the ARC runner scale set chart.
|
|
||||||
apiVersion: rbac.authorization.k8s.io/v1
|
|
||||||
kind: ClusterRoleBinding
|
|
||||||
metadata:
|
|
||||||
name: arc-runner-inttest
|
|
||||||
roleRef:
|
|
||||||
apiGroup: rbac.authorization.k8s.io
|
|
||||||
kind: ClusterRole
|
|
||||||
name: cluster-admin
|
|
||||||
subjects:
|
|
||||||
- kind: ServiceAccount
|
|
||||||
name: self-hosted-gremlin-gha-rs-no-permission
|
|
||||||
namespace: arc-system
|
|
||||||
@@ -1,102 +0,0 @@
|
|||||||
# Helm values for ARC runner scale set
|
|
||||||
# Chart: oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
|
|
||||||
# Namespace: arc-system
|
|
||||||
#
|
|
||||||
# Custom DinD template with resource requests to spread pods across nodes.
|
|
||||||
# containerMode is NOT set — we provide the full template ourselves.
|
|
||||||
# Based on the chart's default DinD template for Kubernetes >= v1.29 (sidecar containers).
|
|
||||||
|
|
||||||
githubConfigUrl: "https://github.com/celesrenata/stonks-oracle"
|
|
||||||
runnerScaleSetName: "self-hosted-gremlin"
|
|
||||||
|
|
||||||
githubConfigSecret:
|
|
||||||
github_token: "PLACEHOLDER"
|
|
||||||
|
|
||||||
template:
|
|
||||||
spec:
|
|
||||||
# Spread runner pods across nodes
|
|
||||||
affinity:
|
|
||||||
podAntiAffinity:
|
|
||||||
preferredDuringSchedulingIgnoredDuringExecution:
|
|
||||||
- weight: 100
|
|
||||||
podAffinityTerm:
|
|
||||||
labelSelector:
|
|
||||||
matchExpressions:
|
|
||||||
- key: actions.github.com/scale-set-name
|
|
||||||
operator: In
|
|
||||||
values:
|
|
||||||
- self-hosted-gremlin
|
|
||||||
topologyKey: kubernetes.io/hostname
|
|
||||||
|
|
||||||
initContainers:
|
|
||||||
- name: init-dind-externals
|
|
||||||
image: ghcr.io/actions/actions-runner:latest
|
|
||||||
command: ["cp", "-r", "/home/runner/externals/.", "/home/runner/tmpDir/"]
|
|
||||||
volumeMounts:
|
|
||||||
- name: dind-externals
|
|
||||||
mountPath: /home/runner/tmpDir
|
|
||||||
|
|
||||||
- name: dind
|
|
||||||
image: docker:dind
|
|
||||||
args:
|
|
||||||
- dockerd
|
|
||||||
- --host=unix:///var/run/docker.sock
|
|
||||||
- --group=$(DOCKER_GROUP_GID)
|
|
||||||
env:
|
|
||||||
- name: DOCKER_GROUP_GID
|
|
||||||
value: "123"
|
|
||||||
securityContext:
|
|
||||||
privileged: true
|
|
||||||
restartPolicy: Always
|
|
||||||
startupProbe:
|
|
||||||
exec:
|
|
||||||
command:
|
|
||||||
- docker
|
|
||||||
- info
|
|
||||||
initialDelaySeconds: 0
|
|
||||||
failureThreshold: 24
|
|
||||||
periodSeconds: 5
|
|
||||||
resources:
|
|
||||||
requests:
|
|
||||||
cpu: "2"
|
|
||||||
memory: 2Gi
|
|
||||||
limits:
|
|
||||||
cpu: "4"
|
|
||||||
memory: 4Gi
|
|
||||||
volumeMounts:
|
|
||||||
- name: work
|
|
||||||
mountPath: /home/runner/_work
|
|
||||||
- name: dind-sock
|
|
||||||
mountPath: /var/run
|
|
||||||
- name: dind-externals
|
|
||||||
mountPath: /home/runner/externals
|
|
||||||
|
|
||||||
containers:
|
|
||||||
- name: runner
|
|
||||||
image: ghcr.io/actions/actions-runner:latest
|
|
||||||
command: ["/home/runner/run.sh"]
|
|
||||||
env:
|
|
||||||
- name: DOCKER_HOST
|
|
||||||
value: unix:///var/run/docker.sock
|
|
||||||
- name: RUNNER_WAIT_FOR_DOCKER_IN_SECONDS
|
|
||||||
value: "120"
|
|
||||||
resources:
|
|
||||||
requests:
|
|
||||||
cpu: "2"
|
|
||||||
memory: 2Gi
|
|
||||||
limits:
|
|
||||||
cpu: "4"
|
|
||||||
memory: 8Gi
|
|
||||||
volumeMounts:
|
|
||||||
- name: work
|
|
||||||
mountPath: /home/runner/_work
|
|
||||||
- name: dind-sock
|
|
||||||
mountPath: /var/run
|
|
||||||
|
|
||||||
volumes:
|
|
||||||
- name: work
|
|
||||||
emptyDir: {}
|
|
||||||
- name: dind-sock
|
|
||||||
emptyDir: {}
|
|
||||||
- name: dind-externals
|
|
||||||
emptyDir: {}
|
|
||||||
@@ -1,16 +0,0 @@
|
|||||||
# Helm values for ARC controller
|
|
||||||
# Chart: oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller
|
|
||||||
# Namespace: arc-system
|
|
||||||
|
|
||||||
# Flags to enable cert-manager and TLS (disabled — not needed for controller)
|
|
||||||
flags:
|
|
||||||
logLevel: info
|
|
||||||
|
|
||||||
# NFS-backed persistence via the pipeline-arc-pv PersistentVolume
|
|
||||||
persistence:
|
|
||||||
enabled: true
|
|
||||||
accessMode: ReadWriteOnce
|
|
||||||
size: 2Gi
|
|
||||||
selector:
|
|
||||||
matchLabels:
|
|
||||||
app: pipeline-arc
|
|
||||||
@@ -6,7 +6,7 @@ metadata:
|
|||||||
spec:
|
spec:
|
||||||
project: default
|
project: default
|
||||||
source:
|
source:
|
||||||
repoURL: https://github.com/celesrenata/stonks-oracle.git
|
repoURL: http://gitea-service.git-server.svc.cluster.local:3000/admin/stonks-oracle.git
|
||||||
targetRevision: main
|
targetRevision: main
|
||||||
path: infra/helm/stonks-oracle
|
path: infra/helm/stonks-oracle
|
||||||
helm:
|
helm:
|
||||||
|
|||||||
@@ -6,7 +6,7 @@ metadata:
|
|||||||
spec:
|
spec:
|
||||||
project: default
|
project: default
|
||||||
source:
|
source:
|
||||||
repoURL: https://github.com/celesrenata/stonks-oracle.git
|
repoURL: http://gitea-service.git-server.svc.cluster.local:3000/admin/stonks-oracle.git
|
||||||
targetRevision: main
|
targetRevision: main
|
||||||
path: infra/helm/stonks-oracle
|
path: infra/helm/stonks-oracle
|
||||||
helm:
|
helm:
|
||||||
|
|||||||
@@ -6,7 +6,7 @@ metadata:
|
|||||||
spec:
|
spec:
|
||||||
project: default
|
project: default
|
||||||
source:
|
source:
|
||||||
repoURL: https://github.com/celesrenata/stonks-oracle.git
|
repoURL: http://gitea-service.git-server.svc.cluster.local:3000/admin/stonks-oracle.git
|
||||||
targetRevision: main
|
targetRevision: main
|
||||||
path: infra/helm/stonks-oracle
|
path: infra/helm/stonks-oracle
|
||||||
helm:
|
helm:
|
||||||
|
|||||||
@@ -7,6 +7,7 @@ metadata:
|
|||||||
argocd.argoproj.io/secret-type: repository
|
argocd.argoproj.io/secret-type: repository
|
||||||
type: Opaque
|
type: Opaque
|
||||||
stringData:
|
stringData:
|
||||||
url: https://github.com/celesrenata/stonks-oracle.git
|
url: http://gitea-service.git-server.svc.cluster.local:3000/admin/stonks-oracle.git
|
||||||
type: git
|
type: git
|
||||||
password: PLACEHOLDER # Filled at deploy time from gremlin-1's github_token
|
username: admin
|
||||||
|
password: St0nks0racl3!
|
||||||
|
|||||||
Executable
@@ -0,0 +1,283 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
# pipelines/gitea/setup.sh — Automate Gitea initial setup via REST API
|
||||||
|
#
|
||||||
|
# Steps:
|
||||||
|
# 1. Create admin user (idempotent — skip if already exists)
|
||||||
|
# 2. Create OAuth2 application for Woodpecker CI (idempotent — reuse if exists)
|
||||||
|
# 3. Create stonks-oracle repository (idempotent — skip if already exists)
|
||||||
|
#
|
||||||
|
# Outputs OAuth2 client_id and client_secret to stdout and to
|
||||||
|
# gitea-oauth2.env (sourceable by runmefirst.sh).
|
||||||
|
#
|
||||||
|
# Requirements: 1.1, 1.2, 1.5, 2.3
|
||||||
|
|
||||||
|
# -------------------------------------------------------
|
||||||
|
# Configuration (override via environment variables)
|
||||||
|
# -------------------------------------------------------
|
||||||
|
GITEA_URL="${GITEA_URL:-http://10.1.1.12:30300}"
|
||||||
|
GITEA_ADMIN_USER="${GITEA_ADMIN_USER:-admin}"
|
||||||
|
GITEA_ADMIN_PASSWORD="${GITEA_ADMIN_PASSWORD:-St0nks0racl3!}"
|
||||||
|
GITEA_ADMIN_EMAIL="${GITEA_ADMIN_EMAIL:-admin@celestium.life}"
|
||||||
|
|
||||||
|
OAUTH2_APP_NAME="woodpecker-ci"
|
||||||
|
OAUTH2_REDIRECT_URI="https://stonks-ci.celestium.life/authorize"
|
||||||
|
REPO_NAME="stonks-oracle"
|
||||||
|
|
||||||
|
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||||
|
OAUTH2_ENV_FILE="${SCRIPT_DIR}/gitea-oauth2.env"
|
||||||
|
|
||||||
|
API="${GITEA_URL}/api/v1"
|
||||||
|
AUTH_HEADER="Authorization: Basic $(echo -n "${GITEA_ADMIN_USER}:${GITEA_ADMIN_PASSWORD}" | base64)"
|
||||||
|
|
||||||
|
echo "=== Gitea Setup ==="
|
||||||
|
echo " URL: ${GITEA_URL}"
|
||||||
|
echo " Admin user: ${GITEA_ADMIN_USER}"
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# -------------------------------------------------------
|
||||||
|
# Helper: wait for Gitea to be reachable
|
||||||
|
# (Gitea returns 404 on /api/v1/version when on install page, so check root)
|
||||||
|
# -------------------------------------------------------
|
||||||
|
echo "--- Waiting for Gitea to be reachable ---"
|
||||||
|
for i in $(seq 1 30); do
|
||||||
|
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "${GITEA_URL}/" 2>/dev/null || echo "000")
|
||||||
|
if [ "$HTTP_STATUS" != "000" ]; then
|
||||||
|
echo " ✓ Gitea is reachable (HTTP ${HTTP_STATUS})"
|
||||||
|
break
|
||||||
|
fi
|
||||||
|
if [ "$i" -eq 30 ]; then
|
||||||
|
echo " ✗ Gitea not reachable after 30 attempts"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
echo " Attempt ${i}/30 — waiting 5s..."
|
||||||
|
sleep 5
|
||||||
|
done
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# -------------------------------------------------------
|
||||||
|
# Step 1: Create admin user
|
||||||
|
# -------------------------------------------------------
|
||||||
|
echo "--- Step 1: Create admin user ---"
|
||||||
|
|
||||||
|
# Check if admin user already exists by attempting to query the API with credentials
|
||||||
|
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 \
|
||||||
|
-H "${AUTH_HEADER}" \
|
||||||
|
"${API}/user")
|
||||||
|
|
||||||
|
if [ "$HTTP_CODE" = "200" ]; then
|
||||||
|
echo " ✓ Admin user '${GITEA_ADMIN_USER}' already exists and credentials are valid"
|
||||||
|
elif [ "$HTTP_CODE" = "404" ]; then
|
||||||
|
# Gitea is on the install page — API not available yet
|
||||||
|
echo " Gitea API not available (install page) — completing initial setup..."
|
||||||
|
INSTALL_RESPONSE=$(curl -s -w "\n%{http_code}" \
|
||||||
|
-X POST "${GITEA_URL}/" \
|
||||||
|
-H "Content-Type: application/x-www-form-urlencoded" \
|
||||||
|
--data-urlencode "db_type=sqlite3" \
|
||||||
|
--data-urlencode "db_host=localhost:3306" \
|
||||||
|
--data-urlencode "db_user=root" \
|
||||||
|
--data-urlencode "db_passwd=" \
|
||||||
|
--data-urlencode "db_name=gitea" \
|
||||||
|
--data-urlencode "ssl_mode=disable" \
|
||||||
|
--data-urlencode "db_path=/data/gitea/gitea.db" \
|
||||||
|
--data-urlencode "app_name=Gitea: Git with a cup of tea" \
|
||||||
|
--data-urlencode "repo_root_path=/data/git/repositories" \
|
||||||
|
--data-urlencode "lfs_root_path=/data/git/lfs" \
|
||||||
|
--data-urlencode "run_user=git" \
|
||||||
|
--data-urlencode "domain=gitea-service.git-server.svc.cluster.local" \
|
||||||
|
--data-urlencode "ssh_port=22" \
|
||||||
|
--data-urlencode "http_port=3000" \
|
||||||
|
--data-urlencode "app_url=http://gitea-service.git-server.svc.cluster.local:3000/" \
|
||||||
|
--data-urlencode "log_root_path=/data/gitea/log" \
|
||||||
|
--data-urlencode "admin_name=${GITEA_ADMIN_USER}" \
|
||||||
|
--data-urlencode "admin_passwd=${GITEA_ADMIN_PASSWORD}" \
|
||||||
|
--data-urlencode "admin_confirm_passwd=${GITEA_ADMIN_PASSWORD}" \
|
||||||
|
--data-urlencode "admin_email=${GITEA_ADMIN_EMAIL}")
|
||||||
|
|
||||||
|
INSTALL_CODE=$(echo "$INSTALL_RESPONSE" | tail -n 1)
|
||||||
|
|
||||||
|
if [ "$INSTALL_CODE" = "200" ] || [ "$INSTALL_CODE" = "302" ]; then
|
||||||
|
echo " ✓ Gitea initial install completed with admin user"
|
||||||
|
# Gitea restarts after install — wait for API to become available
|
||||||
|
echo " Waiting for Gitea API to become available after install..."
|
||||||
|
for j in $(seq 1 30); do
|
||||||
|
API_STATUS=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "${API}/version" 2>/dev/null || echo "000")
|
||||||
|
if [ "$API_STATUS" = "200" ]; then
|
||||||
|
echo " ✓ Gitea API is ready"
|
||||||
|
break
|
||||||
|
fi
|
||||||
|
if [ "$j" -eq 30 ]; then
|
||||||
|
echo " ✗ Gitea API not available after install (waited 150s)"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
sleep 5
|
||||||
|
done
|
||||||
|
else
|
||||||
|
echo " ✗ Install wizard returned HTTP ${INSTALL_CODE}"
|
||||||
|
echo " Response: $(echo "$INSTALL_RESPONSE" | head -n -1 | tail -5)"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
else
|
||||||
|
echo " Admin user not found or credentials invalid — creating via admin API..."
|
||||||
|
|
||||||
|
# Try creating the user via Gitea's admin create-user API
|
||||||
|
# This works when Gitea has been initialized but no admin user exists,
|
||||||
|
# or when using the built-in admin creation endpoint
|
||||||
|
RESPONSE=$(curl -s -w "\n%{http_code}" \
|
||||||
|
-X POST "${API}/admin/users" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-H "${AUTH_HEADER}" \
|
||||||
|
-d "{
|
||||||
|
\"username\": \"${GITEA_ADMIN_USER}\",
|
||||||
|
\"password\": \"${GITEA_ADMIN_PASSWORD}\",
|
||||||
|
\"email\": \"${GITEA_ADMIN_EMAIL}\",
|
||||||
|
\"must_change_password\": false,
|
||||||
|
\"login_name\": \"${GITEA_ADMIN_USER}\",
|
||||||
|
\"source_id\": 0,
|
||||||
|
\"visibility\": \"public\"
|
||||||
|
}")
|
||||||
|
|
||||||
|
BODY=$(echo "$RESPONSE" | head -n -1)
|
||||||
|
CODE=$(echo "$RESPONSE" | tail -n 1)
|
||||||
|
|
||||||
|
if [ "$CODE" = "201" ]; then
|
||||||
|
echo " ✓ Admin user '${GITEA_ADMIN_USER}' created"
|
||||||
|
elif [ "$CODE" = "422" ]; then
|
||||||
|
echo " ✓ Admin user '${GITEA_ADMIN_USER}' already exists (422)"
|
||||||
|
else
|
||||||
|
echo " ✗ Unexpected response creating admin user: HTTP ${CODE}"
|
||||||
|
echo " Response: ${BODY}"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# -------------------------------------------------------
|
||||||
|
# Step 2: Create OAuth2 application for Woodpecker CI
|
||||||
|
# -------------------------------------------------------
|
||||||
|
echo "--- Step 2: Create OAuth2 application '${OAUTH2_APP_NAME}' ---"
|
||||||
|
|
||||||
|
# Check if the OAuth2 app already exists
|
||||||
|
EXISTING_APPS=$(curl -s \
|
||||||
|
-H "${AUTH_HEADER}" \
|
||||||
|
"${API}/user/applications/oauth2")
|
||||||
|
|
||||||
|
EXISTING_APP=$(echo "$EXISTING_APPS" | python3 -c "
|
||||||
|
import sys, json
|
||||||
|
apps = json.load(sys.stdin)
|
||||||
|
for app in apps:
|
||||||
|
if app.get('name') == '${OAUTH2_APP_NAME}':
|
||||||
|
print(json.dumps(app))
|
||||||
|
break
|
||||||
|
" 2>/dev/null || echo "")
|
||||||
|
|
||||||
|
if [ -n "$EXISTING_APP" ]; then
|
||||||
|
echo " ✓ OAuth2 app '${OAUTH2_APP_NAME}' already exists"
|
||||||
|
OAUTH2_CLIENT_ID=$(echo "$EXISTING_APP" | python3 -c "import sys,json; print(json.load(sys.stdin)['client_id'])")
|
||||||
|
# Note: Gitea does not return client_secret for existing apps.
|
||||||
|
# If we need the secret, we must delete and recreate.
|
||||||
|
OAUTH2_CLIENT_SECRET=$(echo "$EXISTING_APP" | python3 -c "
|
||||||
|
import sys, json
|
||||||
|
data = json.load(sys.stdin)
|
||||||
|
print(data.get('client_secret', ''))
|
||||||
|
" 2>/dev/null || echo "")
|
||||||
|
|
||||||
|
if [ -z "$OAUTH2_CLIENT_SECRET" ]; then
|
||||||
|
echo " ⚠ Client secret not available for existing app — recreating..."
|
||||||
|
APP_ID=$(echo "$EXISTING_APP" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
|
||||||
|
curl -s -X DELETE \
|
||||||
|
-H "${AUTH_HEADER}" \
|
||||||
|
"${API}/user/applications/oauth2/${APP_ID}" > /dev/null
|
||||||
|
echo " Deleted existing OAuth2 app (id=${APP_ID})"
|
||||||
|
EXISTING_APP=""
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [ -z "${EXISTING_APP:-}" ]; then
|
||||||
|
# Create the OAuth2 application
|
||||||
|
RESPONSE=$(curl -s -w "\n%{http_code}" \
|
||||||
|
-X POST "${API}/user/applications/oauth2" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-H "${AUTH_HEADER}" \
|
||||||
|
-d "{
|
||||||
|
\"name\": \"${OAUTH2_APP_NAME}\",
|
||||||
|
\"redirect_uris\": [\"${OAUTH2_REDIRECT_URI}\"],
|
||||||
|
\"confidential_client\": true
|
||||||
|
}")
|
||||||
|
|
||||||
|
BODY=$(echo "$RESPONSE" | head -n -1)
|
||||||
|
CODE=$(echo "$RESPONSE" | tail -n 1)
|
||||||
|
|
||||||
|
if [ "$CODE" = "201" ]; then
|
||||||
|
OAUTH2_CLIENT_ID=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin)['client_id'])")
|
||||||
|
OAUTH2_CLIENT_SECRET=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin)['client_secret'])")
|
||||||
|
echo " ✓ OAuth2 app '${OAUTH2_APP_NAME}' created"
|
||||||
|
else
|
||||||
|
echo " ✗ Failed to create OAuth2 app: HTTP ${CODE}"
|
||||||
|
echo " Response: ${BODY}"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
|
||||||
|
echo " Client ID: ${OAUTH2_CLIENT_ID}"
|
||||||
|
echo " Client Secret: ${OAUTH2_CLIENT_SECRET:0:8}..."
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# -------------------------------------------------------
|
||||||
|
# Step 3: Create stonks-oracle repository
|
||||||
|
# -------------------------------------------------------
|
||||||
|
echo "--- Step 3: Create '${REPO_NAME}' repository ---"
|
||||||
|
|
||||||
|
# Check if repo already exists
|
||||||
|
REPO_CHECK=$(curl -s -o /dev/null -w "%{http_code}" \
|
||||||
|
-H "${AUTH_HEADER}" \
|
||||||
|
"${API}/repos/${GITEA_ADMIN_USER}/${REPO_NAME}")
|
||||||
|
|
||||||
|
if [ "$REPO_CHECK" = "200" ]; then
|
||||||
|
echo " ✓ Repository '${REPO_NAME}' already exists"
|
||||||
|
else
|
||||||
|
RESPONSE=$(curl -s -w "\n%{http_code}" \
|
||||||
|
-X POST "${API}/user/repos" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-H "${AUTH_HEADER}" \
|
||||||
|
-d "{
|
||||||
|
\"name\": \"${REPO_NAME}\",
|
||||||
|
\"description\": \"Stonks Oracle — AI market intelligence and paper-trading platform\",
|
||||||
|
\"private\": false,
|
||||||
|
\"auto_init\": false
|
||||||
|
}")
|
||||||
|
|
||||||
|
BODY=$(echo "$RESPONSE" | head -n -1)
|
||||||
|
CODE=$(echo "$RESPONSE" | tail -n 1)
|
||||||
|
|
||||||
|
if [ "$CODE" = "201" ]; then
|
||||||
|
echo " ✓ Repository '${REPO_NAME}' created"
|
||||||
|
elif [ "$CODE" = "409" ]; then
|
||||||
|
echo " ✓ Repository '${REPO_NAME}' already exists (409)"
|
||||||
|
else
|
||||||
|
echo " ✗ Failed to create repository: HTTP ${CODE}"
|
||||||
|
echo " Response: ${BODY}"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# -------------------------------------------------------
|
||||||
|
# Output OAuth2 credentials
|
||||||
|
# -------------------------------------------------------
|
||||||
|
echo "--- OAuth2 Credentials ---"
|
||||||
|
echo "GITEA_CLIENT_ID=${OAUTH2_CLIENT_ID}"
|
||||||
|
echo "GITEA_CLIENT_SECRET=${OAUTH2_CLIENT_SECRET}"
|
||||||
|
|
||||||
|
# Write to env file for runmefirst.sh to source
|
||||||
|
cat > "${OAUTH2_ENV_FILE}" <<EOF
|
||||||
|
# Generated by gitea/setup.sh — do not edit manually
|
||||||
|
GITEA_CLIENT_ID=${OAUTH2_CLIENT_ID}
|
||||||
|
GITEA_CLIENT_SECRET=${OAUTH2_CLIENT_SECRET}
|
||||||
|
EOF
|
||||||
|
echo ""
|
||||||
|
echo " ✓ Credentials written to ${OAUTH2_ENV_FILE}"
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "=== Gitea Setup Complete ==="
|
||||||
@@ -6,4 +6,4 @@ metadata:
|
|||||||
spec:
|
spec:
|
||||||
subscriptions:
|
subscriptions:
|
||||||
- image:
|
- image:
|
||||||
repoURL: ghcr.io/celesrenata/stonks-oracle/query-api
|
repoURL: registry.celestium.life/stonks-oracle/query-api
|
||||||
|
|||||||
@@ -1,15 +1,16 @@
|
|||||||
apiVersion: v1
|
apiVersion: v1
|
||||||
kind: PersistentVolume
|
kind: PersistentVolume
|
||||||
metadata:
|
metadata:
|
||||||
name: pipeline-arc-pv
|
name: pipeline-woodpecker-pv
|
||||||
labels:
|
labels:
|
||||||
app: pipeline-arc
|
app: pipeline-woodpecker
|
||||||
spec:
|
spec:
|
||||||
capacity:
|
capacity:
|
||||||
storage: 2Gi
|
storage: 5Gi
|
||||||
accessModes:
|
accessModes:
|
||||||
- ReadWriteOnce
|
- ReadWriteOnce
|
||||||
persistentVolumeReclaimPolicy: Retain
|
persistentVolumeReclaimPolicy: Retain
|
||||||
|
storageClassName: ""
|
||||||
nfs:
|
nfs:
|
||||||
server: 192.168.42.8
|
server: 192.168.42.8
|
||||||
path: /volume1/Kubernetes/pipelines/arc
|
path: /volume1/Kubernetes/pipelines/woodpecker
|
||||||
@@ -0,0 +1,15 @@
|
|||||||
|
# ClusterRoleBinding: Grant Woodpecker agent cluster-admin for integration tests
|
||||||
|
# Integration test steps create ephemeral namespaces and deploy sandbox infrastructure.
|
||||||
|
# Mirrors the existing ARC runner RBAC pattern.
|
||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
|
kind: ClusterRoleBinding
|
||||||
|
metadata:
|
||||||
|
name: woodpecker-agent-inttest
|
||||||
|
roleRef:
|
||||||
|
apiGroup: rbac.authorization.k8s.io
|
||||||
|
kind: ClusterRole
|
||||||
|
name: cluster-admin
|
||||||
|
subjects:
|
||||||
|
- kind: ServiceAccount
|
||||||
|
name: woodpecker-agent
|
||||||
|
namespace: woodpecker
|
||||||
@@ -0,0 +1,20 @@
|
|||||||
|
# NetworkPolicy: Allow Traefik ingress to Woodpecker server
|
||||||
|
apiVersion: networking.k8s.io/v1
|
||||||
|
kind: NetworkPolicy
|
||||||
|
metadata:
|
||||||
|
name: allow-traefik-to-woodpecker
|
||||||
|
namespace: woodpecker
|
||||||
|
spec:
|
||||||
|
podSelector:
|
||||||
|
matchLabels:
|
||||||
|
app.kubernetes.io/name: server
|
||||||
|
policyTypes:
|
||||||
|
- Ingress
|
||||||
|
ingress:
|
||||||
|
- from:
|
||||||
|
- namespaceSelector:
|
||||||
|
matchLabels:
|
||||||
|
kubernetes.io/metadata.name: kube-system
|
||||||
|
ports:
|
||||||
|
- protocol: TCP
|
||||||
|
port: 8000
|
||||||
@@ -0,0 +1,53 @@
|
|||||||
|
# Helm values for Woodpecker CI
|
||||||
|
# Chart: woodpecker/woodpecker
|
||||||
|
# Namespace: woodpecker
|
||||||
|
|
||||||
|
# --- Server ---
|
||||||
|
server:
|
||||||
|
enabled: true
|
||||||
|
|
||||||
|
env:
|
||||||
|
WOODPECKER_HOST: "https://stonks-ci.celestium.life"
|
||||||
|
WOODPECKER_SERVER_ADDR: "0.0.0.0:8000"
|
||||||
|
WOODPECKER_GITEA: "true"
|
||||||
|
WOODPECKER_GITEA_URL: "http://gitea-service.git-server.svc.cluster.local:3000"
|
||||||
|
WOODPECKER_GITEA_CLIENT: "<GITEA_CLIENT_ID>"
|
||||||
|
WOODPECKER_GITEA_SECRET: "<GITEA_CLIENT_SECRET>"
|
||||||
|
WOODPECKER_ADMIN: "admin"
|
||||||
|
WOODPECKER_PLUGINS_PRIVILEGED: "woodpeckerci/plugin-docker-buildx"
|
||||||
|
|
||||||
|
# Traefik ingress with TLS via cert-manager
|
||||||
|
ingress:
|
||||||
|
enabled: true
|
||||||
|
ingressClassName: traefik
|
||||||
|
hosts:
|
||||||
|
- host: stonks-ci.celestium.life
|
||||||
|
paths:
|
||||||
|
- path: /
|
||||||
|
backend:
|
||||||
|
serviceName: woodpecker-server
|
||||||
|
servicePort: 80
|
||||||
|
tls:
|
||||||
|
- secretName: woodpecker-tls
|
||||||
|
hosts:
|
||||||
|
- stonks-ci.celestium.life
|
||||||
|
annotations:
|
||||||
|
cert-manager.io/cluster-issuer: ca-issuer
|
||||||
|
|
||||||
|
# NFS-backed persistent volume for SQLite database and build data
|
||||||
|
persistentVolume:
|
||||||
|
enabled: true
|
||||||
|
size: 5Gi
|
||||||
|
storageClass: ""
|
||||||
|
|
||||||
|
# --- Agent ---
|
||||||
|
agent:
|
||||||
|
enabled: true
|
||||||
|
replicaCount: 2
|
||||||
|
|
||||||
|
env:
|
||||||
|
WOODPECKER_SERVER: "woodpecker-server:9000"
|
||||||
|
WOODPECKER_BACKEND: kubernetes
|
||||||
|
WOODPECKER_BACKEND_K8S_NAMESPACE: woodpecker
|
||||||
|
WOODPECKER_BACKEND_K8S_VOLUME_SIZE: 10G
|
||||||
|
WOODPECKER_BACKEND_K8S_STORAGE_RWX: "true"
|
||||||
@@ -5,5 +5,9 @@ line-length = 120
|
|||||||
select = ["E", "F", "I", "W"]
|
select = ["E", "F", "I", "W"]
|
||||||
ignore = ["E501"]
|
ignore = ["E501"]
|
||||||
|
|
||||||
|
[lint.per-file-ignores]
|
||||||
|
"tests/**" = ["E402", "F841", "F811", "E741"]
|
||||||
|
"scripts/**" = ["E722", "E402"]
|
||||||
|
|
||||||
[lint.isort]
|
[lint.isort]
|
||||||
known-first-party = ["services"]
|
known-first-party = ["services"]
|
||||||
|
|||||||
@@ -1,5 +1,7 @@
|
|||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
from minio import Minio
|
from minio import Minio
|
||||||
import os, json
|
|
||||||
|
|
||||||
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
||||||
objs = list(mc.list_objects("stonks-raw-filings", recursive=True))
|
objs = list(mc.list_objects("stonks-raw-filings", recursive=True))
|
||||||
|
|||||||
@@ -1,5 +1,7 @@
|
|||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
from minio import Minio
|
from minio import Minio
|
||||||
import os, json
|
|
||||||
|
|
||||||
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
||||||
|
|
||||||
@@ -26,7 +28,7 @@ for o in objs:
|
|||||||
if raw:
|
if raw:
|
||||||
print(f" output ({len(raw)} chars): {raw[:200]}")
|
print(f" output ({len(raw)} chars): {raw[:200]}")
|
||||||
else:
|
else:
|
||||||
print(f" output: (empty)")
|
print(" output: (empty)")
|
||||||
|
|
||||||
if not objs:
|
if not objs:
|
||||||
print(f"No LLM result found for {target}")
|
print(f"No LLM result found for {target}")
|
||||||
|
|||||||
@@ -1,11 +1,15 @@
|
|||||||
|
import os
|
||||||
|
|
||||||
from minio import Minio
|
from minio import Minio
|
||||||
import os, json
|
|
||||||
|
|
||||||
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
||||||
|
|
||||||
# Check the most recent extraction - what text did the model get?
|
# Check the most recent extraction - what text did the model get?
|
||||||
# Look at the normalized text for a known doc
|
# Look at the normalized text for a known doc
|
||||||
import asyncio, asyncpg
|
import asyncio
|
||||||
|
|
||||||
|
import asyncpg
|
||||||
|
|
||||||
|
|
||||||
async def main():
|
async def main():
|
||||||
pool = await asyncpg.create_pool(
|
pool = await asyncpg.create_pool(
|
||||||
@@ -15,20 +19,20 @@ async def main():
|
|||||||
user=os.environ["POSTGRES_USER"],
|
user=os.environ["POSTGRES_USER"],
|
||||||
password=os.environ["POSTGRES_PASSWORD"],
|
password=os.environ["POSTGRES_PASSWORD"],
|
||||||
)
|
)
|
||||||
|
|
||||||
# Get a recently extracted doc
|
# Get a recently extracted doc
|
||||||
row = await pool.fetchrow(
|
row = await pool.fetchrow(
|
||||||
"SELECT id, title, normalized_storage_ref, parse_quality_score "
|
"SELECT id, title, normalized_storage_ref, parse_quality_score "
|
||||||
"FROM documents WHERE source_type = 'news_api' AND parse_quality_score > 0.8 "
|
"FROM documents WHERE source_type = 'news_api' AND parse_quality_score > 0.8 "
|
||||||
"ORDER BY updated_at DESC LIMIT 1"
|
"ORDER BY updated_at DESC LIMIT 1"
|
||||||
)
|
)
|
||||||
|
|
||||||
if row:
|
if row:
|
||||||
print(f"Doc: {row['id']}")
|
print(f"Doc: {row['id']}")
|
||||||
print(f"Title: {row['title']}")
|
print(f"Title: {row['title']}")
|
||||||
print(f"Quality: {row['parse_quality_score']}")
|
print(f"Quality: {row['parse_quality_score']}")
|
||||||
print(f"Ref: {row['normalized_storage_ref']}")
|
print(f"Ref: {row['normalized_storage_ref']}")
|
||||||
|
|
||||||
ref = row["normalized_storage_ref"]
|
ref = row["normalized_storage_ref"]
|
||||||
parts = ref.replace("s3://", "").split("/", 1)
|
parts = ref.replace("s3://", "").split("/", 1)
|
||||||
if len(parts) == 2:
|
if len(parts) == 2:
|
||||||
@@ -37,9 +41,9 @@ async def main():
|
|||||||
obj.close()
|
obj.close()
|
||||||
obj.release_conn()
|
obj.release_conn()
|
||||||
print(f"Text length: {len(text)} chars")
|
print(f"Text length: {len(text)} chars")
|
||||||
print(f"First 500 chars:")
|
print("First 500 chars:")
|
||||||
print(text[:500])
|
print(text[:500])
|
||||||
|
|
||||||
await pool.close()
|
await pool.close()
|
||||||
|
|
||||||
asyncio.run(main())
|
asyncio.run(main())
|
||||||
|
|||||||
@@ -1,5 +1,7 @@
|
|||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
from minio import Minio
|
from minio import Minio
|
||||||
import os, json
|
|
||||||
|
|
||||||
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
||||||
raw_objs = list(mc.list_objects("stonks-llm-results", recursive=True))
|
raw_objs = list(mc.list_objects("stonks-llm-results", recursive=True))
|
||||||
|
|||||||
@@ -1,5 +1,7 @@
|
|||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
from minio import Minio
|
from minio import Minio
|
||||||
import os, json
|
|
||||||
|
|
||||||
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
||||||
raw_objs = list(mc.list_objects("stonks-llm-results", recursive=True))
|
raw_objs = list(mc.list_objects("stonks-llm-results", recursive=True))
|
||||||
|
|||||||
@@ -1,5 +1,7 @@
|
|||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
from minio import Minio
|
from minio import Minio
|
||||||
import os, json
|
|
||||||
|
|
||||||
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
||||||
raw_objs = list(mc.list_objects("stonks-llm-results", recursive=True))
|
raw_objs = list(mc.list_objects("stonks-llm-results", recursive=True))
|
||||||
|
|||||||
@@ -1,4 +1,7 @@
|
|||||||
import redis, os
|
import os
|
||||||
|
|
||||||
|
import redis
|
||||||
|
|
||||||
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
||||||
for q in ["ingestion","parsing","extraction","aggregation","recommendation","lake_publish","broker_orders"]:
|
for q in ["ingestion","parsing","extraction","aggregation","recommendation","lake_publish","broker_orders"]:
|
||||||
depth = r.llen(f"stonks:queue:{q}")
|
depth = r.llen(f"stonks:queue:{q}")
|
||||||
|
|||||||
@@ -1,5 +1,7 @@
|
|||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
from minio import Minio
|
from minio import Minio
|
||||||
import os, json
|
|
||||||
|
|
||||||
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
mc = Minio(os.environ["MINIO_ENDPOINT"], access_key=os.environ["MINIO_ACCESS_KEY"], secret_key=os.environ["MINIO_SECRET_KEY"], secure=False)
|
||||||
|
|
||||||
@@ -15,11 +17,11 @@ for o in raw_objs[:3]:
|
|||||||
raw_out = last.get("raw_output", "")
|
raw_out = last.get("raw_output", "")
|
||||||
ticker = o.object_name.split("/")[1]
|
ticker = o.object_name.split("/")[1]
|
||||||
doc_id = o.object_name.split("/")[-2]
|
doc_id = o.object_name.split("/")[-2]
|
||||||
|
|
||||||
print(f"=== {ticker} / {doc_id[:8]} ===")
|
print(f"=== {ticker} / {doc_id[:8]} ===")
|
||||||
print(f" success: {data.get('success')}")
|
print(f" success: {data.get('success')}")
|
||||||
print(f" duration: {data.get('total_duration_ms')}ms")
|
print(f" duration: {data.get('total_duration_ms')}ms")
|
||||||
|
|
||||||
try:
|
try:
|
||||||
parsed = json.loads(raw_out)
|
parsed = json.loads(raw_out)
|
||||||
print(f" summary: {parsed.get('summary', '')[:120]}")
|
print(f" summary: {parsed.get('summary', '')[:120]}")
|
||||||
|
|||||||
@@ -1,4 +1,7 @@
|
|||||||
import redis, os
|
import os
|
||||||
|
|
||||||
|
import redis
|
||||||
|
|
||||||
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
||||||
keys = list(r.scan_iter("stonks:dedupe:*"))
|
keys = list(r.scan_iter("stonks:dedupe:*"))
|
||||||
if keys:
|
if keys:
|
||||||
|
|||||||
@@ -1,4 +1,7 @@
|
|||||||
import redis, os
|
import os
|
||||||
|
|
||||||
|
import redis
|
||||||
|
|
||||||
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
||||||
for q in ["ingestion","parsing","extraction","aggregation","recommendation","lake_publish","broker_orders"]:
|
for q in ["ingestion","parsing","extraction","aggregation","recommendation","lake_publish","broker_orders"]:
|
||||||
key = f"stonks:queue:{q}"
|
key = f"stonks:queue:{q}"
|
||||||
|
|||||||
@@ -1,4 +1,10 @@
|
|||||||
import asyncio, asyncpg, json, os, redis
|
import asyncio
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
|
import asyncpg
|
||||||
|
import redis
|
||||||
|
|
||||||
|
|
||||||
async def main():
|
async def main():
|
||||||
pool = await asyncpg.create_pool(
|
pool = await asyncpg.create_pool(
|
||||||
@@ -9,19 +15,19 @@ async def main():
|
|||||||
password=os.environ["POSTGRES_PASSWORD"],
|
password=os.environ["POSTGRES_PASSWORD"],
|
||||||
)
|
)
|
||||||
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
||||||
|
|
||||||
rows = await pool.fetch(
|
rows = await pool.fetch(
|
||||||
"SELECT d.id, dcm.ticker FROM documents d "
|
"SELECT d.id, dcm.ticker FROM documents d "
|
||||||
"LEFT JOIN document_company_mentions dcm ON d.id = dcm.document_id "
|
"LEFT JOIN document_company_mentions dcm ON d.id = dcm.document_id "
|
||||||
"WHERE d.status = 'parsed'"
|
"WHERE d.status = 'parsed'"
|
||||||
)
|
)
|
||||||
|
|
||||||
for row in rows:
|
for row in rows:
|
||||||
r.rpush("stonks:queue:extraction", json.dumps({
|
r.rpush("stonks:queue:extraction", json.dumps({
|
||||||
"document_id": str(row["id"]),
|
"document_id": str(row["id"]),
|
||||||
"ticker": row["ticker"] or "",
|
"ticker": row["ticker"] or "",
|
||||||
}))
|
}))
|
||||||
|
|
||||||
print(f"Enqueued {len(rows)} parsed docs for extraction")
|
print(f"Enqueued {len(rows)} parsed docs for extraction")
|
||||||
await pool.close()
|
await pool.close()
|
||||||
|
|
||||||
|
|||||||
@@ -1,4 +1,10 @@
|
|||||||
import asyncio, asyncpg, json, os, redis
|
import asyncio
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
|
import asyncpg
|
||||||
|
import redis
|
||||||
|
|
||||||
|
|
||||||
async def main():
|
async def main():
|
||||||
pool = await asyncpg.create_pool(
|
pool = await asyncpg.create_pool(
|
||||||
@@ -9,26 +15,26 @@ async def main():
|
|||||||
password=os.environ["POSTGRES_PASSWORD"],
|
password=os.environ["POSTGRES_PASSWORD"],
|
||||||
)
|
)
|
||||||
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD','')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
||||||
|
|
||||||
# Reset filing docs to ingested
|
# Reset filing docs to ingested
|
||||||
await pool.execute(
|
await pool.execute(
|
||||||
"UPDATE documents SET status = 'ingested', parse_quality_score = NULL, parse_confidence = NULL "
|
"UPDATE documents SET status = 'ingested', parse_quality_score = NULL, parse_confidence = NULL "
|
||||||
"WHERE source_type = 'filings_api' AND status = 'low_quality' AND url IS NOT NULL"
|
"WHERE source_type = 'filings_api' AND status = 'low_quality' AND url IS NOT NULL"
|
||||||
)
|
)
|
||||||
|
|
||||||
rows = await pool.fetch(
|
rows = await pool.fetch(
|
||||||
"SELECT d.id, dcm.ticker FROM documents d "
|
"SELECT d.id, dcm.ticker FROM documents d "
|
||||||
"LEFT JOIN document_company_mentions dcm ON d.id = dcm.document_id "
|
"LEFT JOIN document_company_mentions dcm ON d.id = dcm.document_id "
|
||||||
"WHERE d.source_type = 'filings_api' AND d.status = 'ingested' "
|
"WHERE d.source_type = 'filings_api' AND d.status = 'ingested' "
|
||||||
"LIMIT 20" # Start with 20 to test
|
"LIMIT 20" # Start with 20 to test
|
||||||
)
|
)
|
||||||
|
|
||||||
for row in rows:
|
for row in rows:
|
||||||
r.rpush("stonks:queue:parsing", json.dumps({
|
r.rpush("stonks:queue:parsing", json.dumps({
|
||||||
"document_id": str(row["id"]),
|
"document_id": str(row["id"]),
|
||||||
"ticker": row["ticker"] or "",
|
"ticker": row["ticker"] or "",
|
||||||
}))
|
}))
|
||||||
|
|
||||||
print(f"Enqueued {len(rows)} filing docs for parsing (test batch)")
|
print(f"Enqueued {len(rows)} filing docs for parsing (test batch)")
|
||||||
await pool.close()
|
await pool.close()
|
||||||
|
|
||||||
|
|||||||
@@ -1,4 +1,10 @@
|
|||||||
import asyncio, asyncpg, json, os, redis
|
import asyncio
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
|
import asyncpg
|
||||||
|
import redis
|
||||||
|
|
||||||
|
|
||||||
async def main():
|
async def main():
|
||||||
pool = await asyncpg.create_pool(
|
pool = await asyncpg.create_pool(
|
||||||
@@ -9,20 +15,20 @@ async def main():
|
|||||||
password=os.environ["POSTGRES_PASSWORD"],
|
password=os.environ["POSTGRES_PASSWORD"],
|
||||||
)
|
)
|
||||||
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD', '')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
r = redis.from_url(f"redis://:{os.environ.get('REDIS_PASSWORD', '')}@{os.environ['REDIS_HOST']}:{os.environ['REDIS_PORT']}/0")
|
||||||
|
|
||||||
rows = await pool.fetch(
|
rows = await pool.fetch(
|
||||||
"SELECT d.id, dcm.ticker FROM documents d "
|
"SELECT d.id, dcm.ticker FROM documents d "
|
||||||
"LEFT JOIN document_company_mentions dcm ON d.id = dcm.document_id "
|
"LEFT JOIN document_company_mentions dcm ON d.id = dcm.document_id "
|
||||||
"WHERE d.source_type = 'news_api' AND d.parse_quality_score > 0.7 "
|
"WHERE d.source_type = 'news_api' AND d.parse_quality_score > 0.7 "
|
||||||
"ORDER BY d.parse_quality_score DESC LIMIT 5"
|
"ORDER BY d.parse_quality_score DESC LIMIT 5"
|
||||||
)
|
)
|
||||||
|
|
||||||
for row in rows:
|
for row in rows:
|
||||||
r.rpush("stonks:queue:extraction", json.dumps({
|
r.rpush("stonks:queue:extraction", json.dumps({
|
||||||
"document_id": str(row["id"]),
|
"document_id": str(row["id"]),
|
||||||
"ticker": row["ticker"] or "",
|
"ticker": row["ticker"] or "",
|
||||||
}))
|
}))
|
||||||
|
|
||||||
print(f"Enqueued {len(rows)} high-quality docs for re-extraction")
|
print(f"Enqueued {len(rows)} high-quality docs for re-extraction")
|
||||||
await pool.close()
|
await pool.close()
|
||||||
|
|
||||||
|
|||||||
@@ -1,4 +1,10 @@
|
|||||||
import asyncio, asyncpg, json, os, redis
|
import asyncio
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
|
||||||
|
import asyncpg
|
||||||
|
import redis
|
||||||
|
|
||||||
|
|
||||||
async def main():
|
async def main():
|
||||||
pool = await asyncpg.create_pool(
|
pool = await asyncpg.create_pool(
|
||||||
|
|||||||