ci: fix lint errors across project, update ruff.toml per-file ignores
# Local CI/CD Pipeline — Design

## Overview

This design replaces the GitHub-dependent CI/CD pipeline (ARC + GHCR) with a fully local pipeline using Gitea as the Git forge, Woodpecker CI for pipeline execution, and the existing local Docker registry at `registry.celestium.life` for image storage. The existing ArgoCD and Kargo infrastructure is retained for GitOps deployment and staged promotion, with configuration updates to point at local sources instead of GitHub/GHCR.

The migration touches five areas:

1. **Gitea configuration**: Complete initial setup (admin user, OAuth2 app), create the `stonks-oracle` repository, and configure webhooks for Woodpecker CI. Gitea is already deployed in the `git-server` namespace but unconfigured.
2. **Woodpecker CI deployment**: Deploy server and agent via the `woodpecker/woodpecker` Helm chart in the `woodpecker` namespace. The server authenticates with Gitea via OAuth2. The agent uses the Kubernetes backend, executing each pipeline step as a standalone Pod.
3. **Pipeline file**: Create `.woodpecker.yml` translating the existing GitHub Actions workflow into Woodpecker's native format, targeting the local registry and adding a GitHub mirror step.
4. **ArgoCD/Kargo updates**: Update ArgoCD repo secret to point at Gitea, update ArgoCD Applications to source from Gitea, update Kargo Warehouse to watch the local registry.
5. **ARC teardown**: Remove ARC controller, runner scale set, RBAC, PV, and `arc-system` namespace.

### Key Design Decisions

1. **Woodpecker with Kubernetes backend (not Docker-in-Docker agent)**: The Woodpecker agent uses `WOODPECKER_BACKEND: kubernetes`, executing each pipeline step as a standalone Pod in the `woodpecker` namespace. A temporary PVC is created per pipeline run to transfer files between steps. This avoids DinD complexity for most steps. Image builds use the `woodpeckerci/plugin-docker-buildx` plugin with privileged mode for the build step only.

2. **Gitea API for initial setup**: Gitea's initial setup (admin user creation, OAuth2 app registration, repo creation) is automated via Gitea's REST API in `runmefirst.sh`. This avoids manual web UI interaction and makes the setup reproducible.

3. **Single Helm chart for Woodpecker**: The `woodpecker/woodpecker` chart contains both server and agent subcharts. One `helm install` deploys both components. The agent connects to the server via the in-cluster service `woodpecker-server:9000`.

4. **NFS PV for Woodpecker**: Woodpecker server data (SQLite database, build logs) persists on an NFS volume at `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/woodpecker`, surviving cluster rebuilds. The ARC PV is removed since ARC is being torn down.

5. **GitHub as read-only mirror**: After all CI steps pass, a final pipeline step pushes to GitHub via an SSH key stored as a Woodpecker secret. GitHub mirror failure does not block image promotion or deployment.

6. **ArgoCD sources from Gitea**: ArgoCD's repo secret is updated to point at the Gitea repository URL. All three Applications (beta, paper, live) source Helm charts from Gitea instead of GitHub.

7. **Helm chart image registry update**: The base `values.yaml` changes `image.registry` from `ghcr.io/celesrenata/stonks-oracle` to `registry.celestium.life/stonks-oracle`. The `ghcrAuth` section and `ghcr-credentials` imagePullSecret are removed since the local registry requires no authentication.

## Architecture

```
┌─────────────────────────────────────────────────────────────────────────────────┐
│                           Gremlin Cluster (4x NixOS)                            │
│                                                                                 │
│   ┌─────────────────┐   ┌──────────────────┐   ┌───────────────────────────┐    │
│   │ git-server ns   │   │ woodpecker ns    │   │ argocd ns                 │    │
│   │ (pre-existing)  │   │ (NEW)            │   │ (existing, updated)       │    │
│   │                 │   │                  │   │                           │    │
│   │ Gitea           │   │ WP Server        │   │ ArgoCD Server             │    │
│   │ 10.1.1.x:30300  │   │ (StatefulSet)    │   │ (stonks-argocd.           │    │
│   │ :30022 (SSH)    │   │ stonks-ci.       │   │ celestium.life)           │    │
│   │                 │   │ celestium.life   │   │                           │    │
│   │ Local Registry  │   │                  │   │ Repo: Gitea (updated)     │    │
│   │ registry.       │   │ WP Agent         │   │                           │    │
│   │ celestium.life  │   │ (Deployment)     │   │                           │    │
│   │ :30500          │   │ K8s backend      │   │                           │    │
│   └─────────────────┘   └──────────────────┘   └───────────────────────────┘    │
│                                                                                 │
│   ┌─────────────────┐   ┌──────────────────┐   ┌───────────────────────────┐    │
│   │ kargo ns        │   │ stonks-beta ns   │   │ stonks-oracle ns          │    │
│   │ (existing,      │   │                  │   │ (live/production)         │    │
│   │ updated)        │   │ ArgoCD App:      │   │                           │    │
│   │                 │   │ stonks-beta      │   │ ArgoCD App: stonks-live   │    │
│   │ Warehouse:      │   │ images from      │   │ images from               │    │
│   │ local registry  │   │ local registry   │   │ local registry            │    │
│   │ (updated)       │   │                  │   │                           │    │
│   └─────────────────┘   └──────────────────┘   └───────────────────────────┘    │
│                                                                                 │
│ NFS: nfs://192.168.42.8:/volume1/Kubernetes/pipelines/{argocd,kargo,woodpecker} │
└─────────────────────────────────────────────────────────────────────────────────┘
```

### Pipeline Flow

```mermaid
graph LR
    A[Git Push to Gitea] --> B[Webhook → Woodpecker CI]
    B --> C[Lint + Test<br/>Python ruff + pytest<br/>Frontend vitest]
    C --> D[Build + Push<br/>all images to<br/>local registry]
    D --> E[Integration Tests<br/>run_pipeline.sh]
    E -->|pass| F[GitHub Mirror<br/>git push]
    E -->|fail| X[❌ Pipeline Failed]
    F --> G[Kargo Warehouse<br/>detects new tag<br/>in local registry]
    G --> H[Beta Stage<br/>auto-promote]
    H --> I{Market Hours?}
    I -->|outside| J[Paper Stage]
    I -->|during| K[🚫 Blocked]
    K -->|break-glass| J
    J --> L{Market Hours?}
    L -->|outside| M[Live Stage<br/>manual approval]
    L -->|during| N[🚫 Blocked]
    N -->|break-glass| M
```

## Components and Interfaces

### 1. Gitea Configuration (`pipelines/gitea/`)

Gitea is already deployed in the `git-server` namespace but needs initial setup. The configuration is automated via shell scripts that call Gitea's REST API.

**Setup Steps (in `runmefirst.sh`):**

1. **Complete initial setup**: POST to `http://<gitea-svc>:3000/` with admin credentials to complete the install wizard, or use the Gitea API to create the admin user if the instance is already initialized.
2. **Create OAuth2 application**: POST to `/api/v1/user/applications/oauth2` to register Woodpecker CI with callback URL `https://stonks-ci.celestium.life/authorize`. Store the returned `client_id` and `client_secret` for Woodpecker's Helm values.
3. **Create repository**: POST to `/api/v1/user/repos` to create the `stonks-oracle` repository.
4. **Add Gitea remote to local repo**: Configure the local Git clone on gremlin-1 with the Gitea remote and push the existing codebase.

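The API calls in steps 2 and 3 might be scripted roughly as follows. This is a minimal sketch, not the contents of `runmefirst.sh`: the function names, the `GITEA_URL`, `GITEA_ADMIN_USER`, and `GITEA_ADMIN_PASSWORD` variables, the `private` flag, and the use of basic auth are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: function names, variables, and defaults are illustrative assumptions.

GITEA_URL="${GITEA_URL:-http://gitea-http.git-server.svc.cluster.local:3000}"
GITEA_ADMIN_USER="${GITEA_ADMIN_USER:-admin}"

# JSON body for POST /api/v1/user/applications/oauth2.
# Inputs are interpolated as-is, so they must not contain quotes or backslashes.
oauth2_app_payload() {
  local name="$1" redirect_uri="$2"
  printf '{"name":"%s","redirect_uris":["%s"],"confidential_client":true}' \
    "$name" "$redirect_uri"
}

# JSON body for POST /api/v1/user/repos (private repository is an assumption).
repo_payload() {
  local name="$1"
  printf '{"name":"%s","private":true}' "$name"
}

# POST a JSON body to a Gitea API path.
# Prints the response body followed by the HTTP status code on its own line.
gitea_post() {
  local path="$1" body="$2"
  curl -sS -u "${GITEA_ADMIN_USER}:${GITEA_ADMIN_PASSWORD}" \
    -H 'Content-Type: application/json' \
    -w '\n%{http_code}' \
    -X POST "${GITEA_URL}${path}" -d "$body"
}
```

With these helpers, registering the OAuth2 app would be a call such as `gitea_post /api/v1/user/applications/oauth2 "$(oauth2_app_payload woodpecker https://stonks-ci.celestium.life/authorize)"`, where the application name `woodpecker` is a placeholder. Printing the status code on the last line lets the caller decide how to treat "already exists" responses.
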

**Gitea Service Access:**

- Web UI: `http://gitea-http.git-server.svc.cluster.local:3000` (cluster-internal) / `10.1.1.x:30300` (NodePort)
- SSH: `:30022` (NodePort)
- API: `http://gitea-http.git-server.svc.cluster.local:3000/api/v1/`

**Webhook Configuration:**

Woodpecker CI automatically registers webhooks when a repository is activated through the Woodpecker dashboard or API. The webhook URL points to the Woodpecker server's internal service endpoint.

### 2. Woodpecker CI Server and Agent (`pipelines/woodpecker/`)

**Namespace:** `woodpecker`

**Helm Chart:** `woodpecker/woodpecker` from the [woodpecker-ci/helm](https://github.com/woodpecker-ci/helm) repository. Contains two subcharts: `server` and `agent`.

**Server Configuration:**

- StatefulSet with 1 replica
- Persistent volume for SQLite database and build data at `/var/lib/woodpecker`
- NFS-backed PV at `nfs://192.168.42.8:/volume1/Kubernetes/pipelines/woodpecker`
- Traefik ingress at `stonks-ci.celestium.life` with TLS via `ca-issuer`
- Gitea OAuth2 authentication via `WOODPECKER_GITEA=true`, `WOODPECKER_GITEA_URL`, `WOODPECKER_GITEA_CLIENT`, `WOODPECKER_GITEA_SECRET`
- `WOODPECKER_HOST=https://stonks-ci.celestium.life`
- `WOODPECKER_ADMIN=admin` (matches Gitea admin username)

**Agent Configuration:**

- Deployment with 2 replicas
- Kubernetes backend (`WOODPECKER_BACKEND: kubernetes`)
- Pipeline steps execute as standalone Pods in the `woodpecker` namespace
- Temporary PVC created per pipeline run for file transfer between steps
- `WOODPECKER_BACKEND_K8S_STORAGE_CLASS: ""` (use default)
- `WOODPECKER_BACKEND_K8S_VOLUME_SIZE: 10G`
- ServiceAccount with RBAC for creating Pods, Services, PVCs in the `woodpecker` namespace
- Additional ClusterRoleBinding for integration test steps that need to create ephemeral namespaces

**Helm Values Structure (`pipelines/woodpecker/values.yaml`):**

```yaml
server:
  enabled: true
  env:
    WOODPECKER_HOST: "https://stonks-ci.celestium.life"
    WOODPECKER_GITEA: "true"
    WOODPECKER_GITEA_URL: "http://gitea-http.git-server.svc.cluster.local:3000"
    WOODPECKER_GITEA_CLIENT: "<from-oauth2-setup>"
    WOODPECKER_GITEA_SECRET: "<from-oauth2-setup>"
    WOODPECKER_ADMIN: "admin"
  ingress:
    enabled: true
    ingressClassName: traefik
    hosts:
      - host: stonks-ci.celestium.life
        paths:
          - path: /
            backend:
              serviceName: woodpecker-server
              servicePort: 80
    tls:
      - secretName: woodpecker-tls
        hosts:
          - stonks-ci.celestium.life
    annotations:
      cert-manager.io/cluster-issuer: ca-issuer
  persistentVolume:
    enabled: true
    size: 5Gi
    storageClass: ""

agent:
  enabled: true
  replicaCount: 2
  env:
    WOODPECKER_SERVER: "woodpecker-server:9000"
    WOODPECKER_BACKEND: kubernetes
    WOODPECKER_BACKEND_K8S_NAMESPACE: woodpecker
    WOODPECKER_BACKEND_K8S_VOLUME_SIZE: 10G
    WOODPECKER_BACKEND_K8S_STORAGE_RWX: "true"
```

**Network Policy:**

A NetworkPolicy in the `woodpecker` namespace allows Traefik ingress traffic to the Woodpecker server on its HTTP port (80).

### 3. Woodpecker Pipeline File (`.woodpecker.yml`)

The pipeline file translates the existing GitHub Actions workflow into Woodpecker's native format. Each step runs in its own container image; with the Kubernetes backend described above, that container is scheduled as a standalone Pod.


**Pipeline Structure:**

```
.woodpecker.yml
├── lint-python        (ruff check services/)
├── test-python        (pytest tests/)
├── test-frontend      (npm ci && npx vitest --run)
├── build-<service>    (×12 Python services, sequential or grouped)
├── build-dashboard    (frontend/Dockerfile)
├── build-superset     (docker/Dockerfile.superset)
├── integration-test   (run_pipeline.sh)
└── mirror-github      (git push to GitHub)
```

**Key Differences from GitHub Actions:**

- No `uses:` syntax: each step specifies an `image:` and `commands:` or uses a Woodpecker plugin
- Image builds use the `woodpeckerci/plugin-docker-buildx` plugin with `settings.repo`, `settings.registry`, `settings.tags`
- Branch filtering via `when: { branch: main, event: push }` instead of GitHub's `if:` conditions
- Secrets referenced via `from_secret:` instead of `${{ secrets.X }}`
- No per-step matrix builds: Woodpecker's `matrix:` applies to a whole workflow rather than a single step, so in this design services are built sequentially or via multiple steps

**Image Tagging:**

All images are pushed to `registry.celestium.life/stonks-oracle/<service>:<sha>` and `registry.celestium.life/stonks-oracle/<service>:latest`.

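The tagging scheme can be expressed as a small helper. This is an illustrative sketch: the function name `image_refs` and the `REGISTRY` variable are assumptions, while the registry path and the two tags come from the text above.

```shell
#!/usr/bin/env bash
# Sketch only: helper name and variable are illustrative.

REGISTRY="registry.celestium.life/stonks-oracle"

# Print the two image references pushed for one service:
# one tagged with the commit SHA and one tagged "latest".
image_refs() {
  local service="$1" sha="$2"
  printf '%s/%s:%s\n' "$REGISTRY" "$service" "$sha"
  printf '%s/%s:latest\n' "$REGISTRY" "$service"
}
```

For example, `image_refs query-api abc123` prints the SHA-tagged reference on the first line and the `latest` reference on the second. Only the SHA tag is immutable, which is why promotion keys on it rather than on `latest`.
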

**GitHub Mirror Step:**

Uses the `woodpeckerci/plugin-git-push` plugin or a custom step with `git push --mirror` using an SSH deploy key stored as a Woodpecker secret.

### 4. ArgoCD Updates

**Repo Secret Update (`pipelines/argocd/repo-secret.yaml`):**

Change the repository URL from GitHub to Gitea:

```yaml
stringData:
  url: http://gitea-http.git-server.svc.cluster.local:3000/admin/stonks-oracle.git
  type: git
  username: admin
  password: <gitea-admin-password>
```

**Application Updates (`pipelines/argocd/apps/*.yaml`):**

All three Applications (stonks-beta, stonks-paper, stonks-live) update `spec.source.repoURL` from `https://github.com/celesrenata/stonks-oracle.git` to the Gitea repository URL.

### 5. Kargo Warehouse Update

**Warehouse Update (`pipelines/kargo/warehouse.yaml`):**

Change the image subscription from GHCR to the local registry:

```yaml
spec:
  subscriptions:
    - image:
        repoURL: registry.celestium.life/stonks-oracle/query-api
```

Kargo stages, project, project-config, and market-hours AnalysisTemplate remain unchanged.

### 6. Helm Chart Updates (`infra/helm/stonks-oracle/`)

**`values.yaml` changes:**

```yaml
image:
  registry: registry.celestium.life/stonks-oracle  # was: ghcr.io/celesrenata/stonks-oracle
  pullPolicy: Always
  tag: latest

# REMOVED: imagePullSecrets, ghcrAuth sections
```

**`values-beta.yaml` and `values-paper.yaml`:**

No changes needed: they inherit `image.registry` from the base `values.yaml` and only override `image.tag`.

### 7. ARC Teardown

The `runmefirst.sh` script tears down ARC before installing Woodpecker:

1. `helm uninstall arc-runner-set --namespace arc-system || true`
2. `helm uninstall arc --namespace arc-system || true`
3. `kubectl delete -f arc/runner-rbac.yaml --ignore-not-found`
4. `kubectl delete pv pipeline-arc-pv --ignore-not-found`
5. `kubectl delete namespace arc-system --ignore-not-found`

The `pipelines/arc/` directory and `pipelines/pvs/arc-pv.yaml` are removed from the repo.

### 8. NFS Persistent Volumes

**Updated PV set** (ARC PV removed, Woodpecker PV added):

| PV Name | NFS Path | Capacity | Bound To |
|---|---|---|---|
| `pipeline-argocd-pv` | `/volume1/Kubernetes/pipelines/argocd` | 5Gi | PVC in `argocd` ns |
| `pipeline-kargo-pv` | `/volume1/Kubernetes/pipelines/kargo` | 2Gi | PVC in `kargo` ns |
| `pipeline-woodpecker-pv` | `/volume1/Kubernetes/pipelines/woodpecker` | 5Gi | PVC in `woodpecker` ns |

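Each row of the table corresponds to one PersistentVolume manifest. The sketch below renders such a manifest from the table's columns; the function name `render_nfs_pv`, the `ReadWriteOnce` access mode, and the `Retain` reclaim policy are assumptions rather than values taken from the actual `pvs/*.yaml` files.

```shell
#!/usr/bin/env bash
# Sketch only: access mode and reclaim policy are assumed, not taken from the repo.

NFS_SERVER="192.168.42.8"

# Render a PersistentVolume manifest for one row of the PV table.
# Arguments: PV name, NFS export path, capacity (for example 5Gi).
render_nfs_pv() {
  local name="$1" path="$2" capacity="$3"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${name}
spec:
  capacity:
    storage: ${capacity}
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: ${NFS_SERVER}
    path: ${path}
EOF
}
```

The output could be piped to `kubectl apply -f -`, for example `render_nfs_pv pipeline-woodpecker-pv /volume1/Kubernetes/pipelines/woodpecker 5Gi | kubectl apply -f -`. A `Retain` policy is consistent with the teardown script leaving PVs and NFS data in place.
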

### 9. Updated `runmefirst.sh`

```
#!/bin/bash
set -euo pipefail

# 1. Tear down ARC (if present)
#    - Uninstall ARC Helm releases
#    - Delete RBAC, PV, namespace

# 2. Create namespaces (woodpecker, argocd, kargo, stonks-beta, stonks-paper)

# 3. Create NFS PVs (argocd, kargo, woodpecker)

# 4. Configure Gitea
#    - Complete initial setup via API
#    - Create admin user (if needed)
#    - Create OAuth2 app for Woodpecker
#    - Create stonks-oracle repository

# 5. Install Woodpecker CI via Helm
#    - Inject Gitea OAuth2 client_id and client_secret into values
#    - Apply NetworkPolicy for Traefik ingress

# 6. Install ArgoCD via Helm
#    - Apply updated repo secret (pointing to Gitea)
#    - Apply ArgoCD Applications

# 7. Install Kargo via Helm
#    - Apply project, project-config, warehouse (local registry), stages

# 8. Apply Woodpecker agent RBAC for integration tests
```

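Step 5 of the outline injects the OAuth2 credentials at install time. One way this could look is sketched below; the release name `woodpecker`, the relative values path, and both function names are assumptions, and the chart reference `woodpecker/woodpecker` presumes a Helm repo alias of that name has already been added.

```shell
#!/usr/bin/env bash
# Sketch only: release name, values path, and function names are assumptions.

# Print the helm arguments, one per line, for installing Woodpecker with the
# Gitea OAuth2 credentials supplied on the command line instead of being
# committed to values.yaml.
woodpecker_helm_args() {
  local client_id="$1" client_secret="$2"
  printf '%s\n' \
    upgrade --install woodpecker woodpecker/woodpecker \
    --namespace woodpecker \
    --values woodpecker/values.yaml \
    --set-string "server.env.WOODPECKER_GITEA_CLIENT=${client_id}" \
    --set-string "server.env.WOODPECKER_GITEA_SECRET=${client_secret}"
}

# Run helm with those arguments (requires bash 4+ for mapfile).
install_woodpecker() {
  local -a args
  mapfile -t args < <(woodpecker_helm_args "$1" "$2")
  helm "${args[@]}"
}
```

Using `helm upgrade --install` keeps the step safe to re-run. Passing the secret as a command-line argument exposes it to the local process list, so a pre-created Kubernetes Secret may be preferable where that matters.
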

### 10. Updated `runmelast.sh`

```
#!/bin/bash
set -euo pipefail

# Reverse order: Kargo → ArgoCD → Woodpecker
# Preserves: PVs, NFS data, git-server namespace (Gitea + registry)

# 1. Remove Kargo resources + Helm release
# 2. Remove ArgoCD resources + Helm release
# 3. Remove Woodpecker Helm release
# 4. Delete namespaces (woodpecker, argocd, kargo)
# 5. PVs intentionally NOT deleted
```

### 11. Woodpecker Agent RBAC

The Woodpecker agent's service account needs:

- **Namespace-scoped RBAC** (auto-created by Helm chart): Create/delete Pods, Services, PVCs in the `woodpecker` namespace for pipeline step execution.
- **ClusterRoleBinding** (manually applied): Grant the agent service account `cluster-admin` for integration test steps that create ephemeral namespaces and deploy sandbox infrastructure. This mirrors the existing ARC runner RBAC pattern.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: woodpecker-agent-inttest
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: woodpecker-agent
    namespace: woodpecker
```

## Data Models

### Pipeline Infrastructure Layout

```
~/sources/kube/pipelines/
├── runmefirst.sh                  # Full install: ARC teardown → Gitea config → Woodpecker → ArgoCD → Kargo
├── runmelast.sh                   # Teardown: Kargo → ArgoCD → Woodpecker (preserves PVs, git-server)
├── gitea/
│   └── setup.sh                   # Gitea API setup: admin user, OAuth2 app, repo creation
├── woodpecker/
│   ├── values.yaml                # Woodpecker Helm values (server + agent)
│   ├── network-policy.yaml        # NetworkPolicy for Traefik → Woodpecker server
│   └── agent-rbac.yaml            # ClusterRoleBinding for integration test access
├── argocd/
│   ├── values.yaml                # ArgoCD Helm values (unchanged)
│   ├── repo-secret.yaml           # Updated: points to Gitea instead of GitHub
│   └── apps/
│       ├── stonks-beta.yaml       # Updated: repoURL → Gitea
│       ├── stonks-paper.yaml      # Updated: repoURL → Gitea
│       └── stonks-live.yaml       # Updated: repoURL → Gitea
├── kargo/
│   ├── values.yaml                # Kargo Helm values (unchanged)
│   ├── project.yaml               # Kargo Project (unchanged)
│   ├── project-config.yaml        # Kargo ProjectConfig (unchanged)
│   ├── warehouse.yaml             # Updated: watches local registry
│   ├── market-hours-check.yaml    # AnalysisTemplate (unchanged)
│   └── stages/
│       ├── beta.yaml              # Kargo Stage (unchanged)
│       ├── paper.yaml             # Kargo Stage (unchanged)
│       └── live.yaml              # Kargo Stage (unchanged)
└── pvs/
    ├── argocd-pv.yaml             # NFS PV for ArgoCD (unchanged)
    ├── kargo-pv.yaml              # NFS PV for Kargo (unchanged)
    └── woodpecker-pv.yaml         # NFS PV for Woodpecker (NEW, replaces arc-pv.yaml)
```

### Removed Files

```
pipelines/arc/                     # Entire directory removed
├── values.yaml
├── runner-scaleset.yaml
└── runner-rbac.yaml
pipelines/pvs/arc-pv.yaml          # ARC PV removed
```

### Image Tag Flow (Updated)

```
Git SHA (e.g., abc123)
  → Woodpecker builds: registry.celestium.life/stonks-oracle/<service>:abc123
  → Integration test: run_pipeline.sh --image-tag abc123
  → GitHub mirror: git push (non-blocking)
  → Kargo Warehouse detects: abc123 in local registry
  → Kargo Freight created: abc123
  → Beta: helm upgrade with image.tag=abc123
  → Paper: helm upgrade with image.tag=abc123 (after market-hours check)
  → Live: helm upgrade with image.tag=abc123 (after approval + market-hours check)
```

### Kargo Resource Relationships (Updated)

```mermaid
graph TD
    W[Warehouse: stonks-images<br/>watches LOCAL REGISTRY<br/>registry.celestium.life] -->|produces| F[Freight<br/>image tag = git SHA]
    F -->|auto-promote| SB[Stage: beta<br/>ArgoCD App: stonks-beta]
    SB -->|verified → available| SP[Stage: paper<br/>market-hours verification<br/>ArgoCD App: stonks-paper]
    SP -->|verified → available| SL[Stage: live<br/>manual approval + market-hours<br/>ArgoCD App: stonks-live]
```

## Error Handling

### Gitea Setup Failures

| Failure | Detection | Recovery |
|---|---|---|
| Gitea not reachable | API call returns connection error | Check Gitea pod status in `git-server` namespace. Verify NodePort service. |
| Admin user already exists | API returns 422 | Script continues (idempotent). |
| OAuth2 app already exists | API returns 422 | Script queries existing apps and reuses credentials. |
| Repository already exists | API returns 409 | Script continues (idempotent). |

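The idempotency rules in this table reduce to a mapping from HTTP status to outcome. A minimal sketch follows; the function name and the three outcome labels are illustrative, and treating every 422 as "already exists" follows the table rather than a guarantee from the Gitea API, which can also return 422 for other validation errors.

```shell
#!/usr/bin/env bash
# Sketch only: function name and outcome labels are illustrative.

# Map the HTTP status of a Gitea "create" call to an outcome.
# Prints one of: created | exists | failed
classify_create_status() {
  case "$1" in
    200|201) echo created ;;
    409|422) echo exists ;;
    *)       echo failed ;;
  esac
}
```

A setup script could then continue on `created` or `exists` and abort only on `failed`, which is what makes repeated runs safe.
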

### Woodpecker Deployment Failures

| Failure | Detection | Recovery |
|---|---|---|
| Helm install fails | Non-zero exit | Check Helm chart repo access. Verify `woodpecker` namespace exists. |
| Server can't reach Gitea | OAuth2 login fails | Verify `WOODPECKER_GITEA_URL` resolves within cluster. Check Gitea service. |
| Agent can't connect to server | Agent logs show connection errors | Verify `WOODPECKER_SERVER` env var matches server service name. Check agent secret. |
| Pipeline step Pod fails to schedule | Pod stuck in Pending | Check node resources. Verify RBAC allows Pod creation in `woodpecker` namespace. |
| Image build fails (privileged) | Build step exits non-zero | Verify containerd/k3s allows privileged Pods. Check `plugin-docker-buildx` logs. |

### Pipeline Failures

| Failure | Detection | Recovery |
|---|---|---|
| Lint/test fails | Step exits non-zero | Fix code, push again. Build steps are skipped. |
| Image push to local registry fails | Plugin exits non-zero | Check registry health at `registry.celestium.life`. Verify DNS resolution. |
| Integration test fails | `run_pipeline.sh` exits non-zero | Check Woodpecker dashboard for step logs. Fix and re-push. |
| GitHub mirror fails | Mirror step exits non-zero | Non-blocking: images are already in the local registry. Fix SSH key and re-run. |

### ArgoCD/Kargo Update Failures

| Failure | Detection | Recovery |
|---|---|---|
| ArgoCD can't clone from Gitea | Application shows "ComparisonError" | Verify repo secret credentials. Check Gitea accessibility from ArgoCD namespace. |
| Kargo can't reach local registry | Warehouse shows error | Verify `registry.celestium.life` DNS resolves. Check registry pod health. |
| Image pull fails (k3s nodes) | Pods stuck in ImagePullBackOff | Ensure k3s containerd trusts the local registry. Add registry mirror config if needed. |

### Rollback Strategy

Same as existing design:

- **Beta/Paper**: Promote a previous Freight in Kargo to roll back the image tag.
- **Live**: Same mechanism with manual approval required.
- **Emergency**: Direct `helm upgrade` with previous image tag.

## Testing Strategy

### Why Property-Based Testing Does Not Apply

This feature is entirely Infrastructure as Code: shell scripts, Kubernetes YAML manifests, Helm values files, and a Woodpecker pipeline YAML file. There are no pure functions, parsers, serializers, or business logic with meaningful input variation. Property-based testing (PBT) requires universal properties across a wide input space; this feature has fixed configuration values and Kubernetes resource states. Running 100 iterations of "does the Woodpecker ingress have TLS enabled" adds no value over running it once.

### Testing Approach

The testing strategy uses three tiers:

#### Tier 1: Smoke Tests (Configuration Validation)

Run locally or in CI without a live cluster.

| Test | What It Validates | How |
|---|---|---|
| Manifest syntax | All YAML files parse correctly | `kubectl apply --dry-run=client -f <file>` |
| Helm template rendering | Woodpecker values produce valid K8s resources | `helm template` with values file |
| Pipeline file syntax | `.woodpecker.yml` is valid | Woodpecker CLI lint or YAML parse |
| Namespace isolation | Pipeline namespaces distinct from `stonks-oracle` and `git-server` | Grep manifests for namespace fields |
| NFS path separation | PVs use distinct subdirectories | Inspect PV YAML |
| Image registry references | All manifests reference `registry.celestium.life` not `ghcr.io` | Grep all YAML for registry URLs |
| No GHCR auth remnants | `ghcrAuth` and `ghcr-credentials` removed from Helm chart | Grep values.yaml |
| ArgoCD repo URL | All Applications point to Gitea, not GitHub | Inspect Application YAML |
| Kargo warehouse URL | Warehouse watches local registry | Inspect warehouse YAML |

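The two grep-based rows ("Image registry references" and "No GHCR auth remnants") could be combined into a single check along these lines. The function name and the exact pattern are assumptions; the sketch relies only on `grep -r` with `--include`, which GNU and BSD grep both provide.

```shell
#!/usr/bin/env bash
# Sketch only: function name and search pattern are illustrative.

# Succeeds when no YAML file under the given directory mentions GHCR.
# Any matching lines are printed with file name and line number.
no_ghcr_references() {
  local dir="$1"
  if grep -rnE --include='*.yaml' --include='*.yml' \
      'ghcr\.io|ghcrAuth|ghcr-credentials' "$dir"; then
    return 1
  fi
  return 0
}
```

Note that a leftover comment such as `# was: ghcr.io/...` in `values.yaml` would also trip this check, so either such comments are dropped or the pattern is narrowed to non-comment lines.
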

#### Tier 2: Integration Tests (Live Cluster Verification)

Run after `runmefirst.sh` on the Gremlin cluster.

| Test | What It Validates | How |
|---|---|---|
| Gitea accessible | Web UI responds | `curl http://10.1.1.x:30300` |
| Gitea repo exists | `stonks-oracle` repo created | Gitea API query |
| Woodpecker server running | Pods healthy in `woodpecker` namespace | `kubectl get pods -n woodpecker` |
| Woodpecker dashboard accessible | Web UI responds at `stonks-ci.celestium.life` | `curl -k https://stonks-ci.celestium.life` |
| Woodpecker OAuth2 works | Login redirects to Gitea | Browser test |
| ArgoCD accessible | Web UI responds at `stonks-argocd.celestium.life` | `curl -k https://stonks-argocd.celestium.life` |
| ArgoCD syncs from Gitea | Applications sync successfully | `argocd app get stonks-beta` |
| Kargo Warehouse | Discovers images from local registry | `kubectl get freight -n stonks-oracle` |
| Local registry accessible | Registry responds | `curl https://registry.celestium.life/v2/_catalog` |
| TLS certificates | Ingresses have valid certs from `ca-issuer` | `openssl s_client` or cert-manager status |
| PV binding | PVCs bound to NFS PVs | `kubectl get pvc -n woodpecker` |
| ARC removed | No ARC pods, no `arc-system` namespace | `kubectl get ns arc-system` returns NotFound |
| End-to-end pipeline | Push triggers build, images land in local registry | Push a commit, verify in Woodpecker dashboard |
| End-to-end promotion | Image flows beta → paper → live | Trigger promotion, verify deployments update |
| Teardown preservation | After `runmelast.sh`, PVs and NFS data intact | Run teardown, check PVs and NFS mount |

#### Tier 3: Market-Hours and Break-Glass Tests

Unchanged from the existing design: these tests validate Kargo behavior, which is not modified.

| Test | What It Validates | How |
|---|---|---|
| Market-hours block | Promotion blocked during 09:30–16:00 ET | Run AnalysisTemplate during market hours |
| Market-hours allow | Promotion allowed outside hours | Run AnalysisTemplate outside hours |
| Break-glass override | Manual approval bypasses block | Use Kargo manual approval during hours |
| Break-glass audit | Records operator, timestamp, justification | Query Kargo audit trail |

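When scheduling these tests it helps to know whether the cluster is currently inside the blocked window. The helper below is a sketch of that check only, not the AnalysisTemplate itself: the function names are hypothetical, the window is assumed to include 09:30 and exclude 16:00, and weekends and market holidays are ignored.

```shell
#!/usr/bin/env bash
# Sketch only: boundary handling is assumed; weekends and holidays are ignored.

# Succeeds (exit 0) when the given Eastern-time clock reading falls inside
# the blocked window 09:30-16:00. Arguments: hour (0-23) and minute (0-59).
in_market_hours() {
  # 10# forces base 10 so values such as "09" are not parsed as octal.
  local hour=$((10#$1)) minute=$((10#$2))
  local now=$((hour * 60 + minute))
  (( now >= 9 * 60 + 30 && now < 16 * 60 ))
}

# Same check against the current wall-clock time in America/New_York.
market_open_now() {
  local hour minute
  read -r hour minute < <(TZ=America/New_York date '+%H %M')
  in_market_hours "$hour" "$minute"
}
```

A test runner could call `market_open_now` to pick between the "block" and "allow" cases, for example `if market_open_now; then echo "expect block"; else echo "expect allow"; fi`.
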