26 Commits

Author SHA1 Message Date
devops f95463ef50 fix: permanent owner password persistence with SeedAudit guard
CI - Build & Test / Backend (.NET) (push) Successful in 28s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 18s
CI - Build & Test / Security Check (push) Successful in 2s
Root cause: Dual-source architecture for owner password (Gitea secret
ENV_OWNER_PASSWORD vs host .env OWNER_PASSWORD) caused drift when
the DB was ever re-seeded or the volume recreated.

Changes:
- Add SeedAudit entity + migration to track one-time seed operations
- EnsureDatabaseAsync checks SeedAudit BEFORE seeding — owner is never
  re-created even if the Users table is wiped
- Deploy and rollback workflows now read OWNER_PASSWORD from the host's
  persistent .env (single source of truth) instead of Gitea secrets
- compose.yaml documented: OWNER_PASSWORD only used during initial seed
- Cleanup: .gitignore extended for core dumps, changelog/deployment.md
  updated with 2026-06-20 session notes

After this fix the DB is the single source of truth for the owner
password after initial seed. The host .env is the single reference
for the initial value.
2026-06-21 10:15:36 +02:00
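The seed guard described in f95463ef50 can be sketched roughly as follows. This is a minimal Python illustration, not the project's C# code: the SeedAudit table is modeled as a set of operation names, and `ensure_owner_seeded`, `OWNER_SEED`, and the dict-based user store are hypothetical names chosen for the example.

```python
from typing import Dict, Set

OWNER_SEED = "owner-seed"  # hypothetical audit key for the one-time owner seed


def ensure_owner_seeded(
    seed_audit: Set[str],
    users: Dict[str, str],
    initial_password: str,
) -> bool:
    """Seed the owner once; return True only if seeding happened.

    The audit record is checked BEFORE the users table is touched, so a
    wiped users table alone never triggers a re-seed.
    """
    if OWNER_SEED in seed_audit:
        return False  # seeded at some point in the past: never re-create
    users["owner"] = initial_password  # a real system would store a hash
    seed_audit.add(OWNER_SEED)
    return True


audit: Set[str] = set()
users: Dict[str, str] = {}
first = ensure_owner_seeded(audit, users, "from-host-env")
users.clear()  # simulate the Users table being wiped
second = ensure_owner_seeded(audit, users, "drifted-value")
```

In this sketch the second call is a no-op even though the user store is empty, which is the behavior the commit message describes.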
devops f0023ac033 fix: use external deploy script to avoid nested quoting errors
CI - Build & Test / Backend (.NET) (push) Successful in 29s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 18s
CI - Build & Test / Security Check (push) Successful in 4s
The inner shell script run via docker:cli had complex escaping
that caused 'unterminated quoted string' errors at runtime.
Moved the deploy logic to an external script file (heredoc in
the workflow YAML), mounted read-only into the docker:cli
container. Pass BUILD_ARGS and SERVICE via environment
variables instead of shell interpolation.
2026-06-20 19:00:53 +02:00
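Passing values through the environment instead of interpolating them into an inline script, as f0023ac033 describes, might look like the sketch below. The image name `docker:cli` and the variable names `BUILD_ARGS` and `SERVICE` come from the commit message; the script path, the mount target `/deploy.sh`, and the helper `build_deploy_command` are assumptions for illustration, and other flags a real deploy would need are omitted.

```python
from typing import Dict, List, Tuple


def build_deploy_command(
    script_path: str, build_args: str, service: str
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for running a deploy script in docker:cli.

    Values travel as environment variables, so quotes or spaces inside
    them are never re-parsed by an intermediate shell.
    """
    argv = [
        "docker", "run", "--rm",
        "-v", f"{script_path}:/deploy.sh:ro",  # script mounted read-only
        "-e", "BUILD_ARGS",  # name only: docker copies the value from env
        "-e", "SERVICE",
        "docker:cli", "sh", "/deploy.sh",
    ]
    env = {"BUILD_ARGS": build_args, "SERVICE": service}
    return argv, env


argv, env = build_deploy_command("/tmp/deploy.sh", "--no-cache", "api")
```

A caller could then run `subprocess.run(argv, env={**os.environ, **env}, check=True)`; because `argv` is a list, no shell quoting is involved at any layer.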
devops 73c5eb69d7 fix: ensure zombie container cleanup before deploy + verbose pg_resetwal
CI - Build & Test / Backend (.NET) (push) Successful in 34s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 20s
CI - Build & Test / Security Check (push) Successful in 4s
2026-06-20 18:57:54 +02:00
devops 06eac66baa fix: postgres WAL corruption recovery + memory bump + researcher/executor
CI - Build & Test / Backend (.NET) (push) Successful in 30s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 19s
CI - Build & Test / Security Check (push) Successful in 4s
- Postgres memory: 256M→384M limits, 64M→96M reservations
- Added pg_resetwal -f pre-deploy step to recover from corrupt WAL
  ('PANIC: could not locate a valid checkpoint record' caused by
  force-killed postgres during --force-recreate)
- Added data-checksums initdb arg for future corruption detection
- api→postgres and web→api depends_on: service_healthy→service_started
- Deploy wait loop: fail fast on unhealthy, wait on starting (180s)
- Added researcher/executor to ValidAssignees and frontend dropdowns
2026-06-20 18:56:11 +02:00
devops b95bec7915 fix: relax web→api dependency + smarter wait loop
CI - Build & Test / Backend (.NET) (push) Successful in 31s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 18s
CI - Build & Test / Security Check (push) Successful in 4s
- web's depends_on on api: change from service_healthy to
  service_started+restart (same as api→postgres fix)
- deploy wait loop: fail fast on unhealthy, wait on starting,
  increased timeout to 180s (36×5s)
2026-06-20 18:50:29 +02:00
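The wait loop from b95bec7915 (fail fast on unhealthy, keep waiting on starting, 36 attempts at 5s) can be sketched as below. The actual workflow presumably does this in shell against `docker compose ps` output; here the status source is an injected callable, and `wait_for_healthy` and `get_states` are hypothetical names.

```python
import time
from typing import Callable, Iterable


def wait_for_healthy(
    get_states: Callable[[], Iterable[str]],
    attempts: int = 36,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll container health states until none is still starting.

    Returns True once no container reports "starting", raises at once
    if any reports "unhealthy", and returns False after all attempts.
    """
    for _ in range(attempts):
        states = list(get_states())
        if "unhealthy" in states:
            raise RuntimeError("container unhealthy, aborting deploy")
        if "starting" not in states:
            return True
        sleep(interval)
    return False


# Simulated status sequence standing in for parsed `docker compose ps` output.
sequence = iter([
    ["starting", "healthy"],
    ["starting", "healthy"],
    ["healthy", "healthy"],
])
slept = []
ok = wait_for_healthy(lambda: next(sequence), sleep=slept.append)
```

Injecting `sleep` keeps the loop testable without real delays; with the defaults the worst case is 36 × 5s = 180s.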
devops baf4008d97 fix: remove --wait flag causing premature deploy failure, use manual health loop
CI - Build & Test / Backend (.NET) (push) Successful in 28s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 18s
CI - Build & Test / Security Check (push) Successful in 4s
The docker compose --wait flag times out before postgres can
become healthy (start_period=30s). Replaced with explicit
poll loop (5s interval, up to 120s) that checks ps output
for unhealthy/starting states.
2026-06-20 18:46:27 +02:00
devops 12998170e3 fix: update DEPLOY_PATH in all workflows from /opt/openclaw to /home/projekte_bao/openclaw
CI - Build & Test / Backend (.NET) (push) Successful in 27s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 17s
CI - Build & Test / Security Check (push) Successful in 3s
2026-06-18 21:44:33 +02:00
reviewer 88cafc7b8e review: remove version-bump from deploy workflow — VERSION is read-only source of truth
CI - Build & Test / Backend (.NET) (push) Successful in 27s
CI - Build & Test / Frontend (Vue/TS) (push) Has been cancelled
CI - Build & Test / Security Check (push) Has been cancelled
2026-06-14 11:31:04 +02:00
reviewer 63319e1046 fix: stream deploy env into docker cli
CI - Build & Test / Backend (.NET) (push) Successful in 29s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 17s
CI - Build & Test / Security Check (push) Successful in 3s
2026-06-14 09:27:56 +02:00
reviewer 5ea7aa9611 fix(ops): mount temp env directory for compose
CI - Build & Test / Backend (.NET) (push) Failing after 23s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 17s
CI - Build & Test / Security Check (push) Successful in 2s
2026-06-14 08:48:23 +02:00
reviewer db62354c97 fix(ops): pass temp env via compose --env-file
CI - Build & Test / Backend (.NET) (push) Failing after 25s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
2026-06-14 08:44:42 +02:00
reviewer 4ad0f9e493 refactor: SOLID architecture — backend service layer + frontend V2 components
CI - Build & Test / Backend (.NET) (push) Failing after 25s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 17s
CI - Build & Test / Security Check (push) Successful in 2s
## Backend — Service Layer & Repository Refactoring

### New services (21 new files)

**Interfaces & implementations:**
- `IOpenClawGatewayClient`: interface for OpenClawGatewayClient (DIP fix: DashboardController depended on the concrete class)
- `IAgentConfigService` / `AgentConfigService`: agent config file I/O extracted from AgentsController
- `IProjectService` / `ProjectService`: project CRUD + activity logging (SRP)
- `ITaskService` / `TaskService`: task state machine, approve/reject, dashboard operations (eliminates duplication between TasksController and DashboardController)
- `IDashboardService` / `DashboardService`: queue aggregation, priority normalization, gateway delegation
- `IOperationsService` / `OperationsService`: metrics calculation extracted from OperationsController
- `ITeamService` / `TeamService`: IDENTITY.md reading extracted from TeamController
- `IMemoryService` / `MemoryService`: file I/O extracted from MemoryController
- `IIncidentService` / `IncidentService`: file parsing (regex source generators) extracted from IncidentsController
- `IDocService` / `DocService`: directory scan extracted from DocsController
- `ICalendarService` / `CalendarService`: gateway HTTP calls + fallback data extracted from CalendarController

### Repository fixes

**IUserRepository / UserRepository:**
- `SaveChangesAsync` removed (leaky abstraction: callers should never control SaveChanges)
- `RevokeTokenAsync(tokenHash)`: atomic token revoke, including SaveChanges
- `RevokeFamilyAsync(familyId)`: batch revoke of a token family, including SaveChanges
- `RemoveExpiredTokensAsync` now saves on its own (previously depended on a subsequent save)

### AuthService fixes
- `GetUserAsync`: unnecessary `Task.Run` removed → calls `_users.GetByIdAsync().AsTask()` directly
- `RevokeAsync`: now delegates to `IUserRepository.RevokeTokenAsync`
- `RefreshAsync`: token reuse detection delegated to `IUserRepository.RevokeFamilyAsync`

### Bug fix
- `OpenClawGatewayClient.ReadAgentGoalAsync`: pre-existing `CS1656` fixed (`reader` was a `using` variable and was being reassigned; renamed to `reader2`)

### Controllers (16 total, all slim)
All controllers reduced to: validate input → call service → return HTTP result.
No business logic, no file I/O, no direct repository use (except AgentsController for the activity log).

**Program.cs, new registrations:**
- `AddHttpClient<IOpenClawGatewayClient, OpenClawGatewayClient>` (previously the concrete class)
- Scoped: IDashboardService, IProjectService, ITaskService, IOperationsService, ITeamService, ICalendarService
- Singleton: IAgentConfigService, IMemoryService, IIncidentService, IDocService

---

## Frontend — Dashboard V2 Components

**AgentDetailModal.vue, IrisChat.vue, TaskStrip.vue:**
- V2 design system: dark space theme, glass panels, gradient accents
- Stores (agents, chat, tasks) use the service + mapper pattern
- NexusLayout, FlowBoard, Topbar: layout fixes for fullHeight route meta

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-14 08:34:58 +02:00
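The token-family revocation mentioned under the repository and AuthService fixes of 4ad0f9e493 can be illustrated with a small in-memory model. This is a sketch under assumptions, not the project's implementation: it assumes refresh tokens are rotated on use and that replaying a revoked token revokes its whole family. `TokenStore`, `RefreshToken`, and `refresh` are hypothetical names; only the idea of per-token and per-family revoke methods that persist their own changes comes from the commit message.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RefreshToken:
    family_id: str
    revoked: bool = False


class TokenStore:
    """In-memory stand-in for the repository; each method persists itself."""

    def __init__(self) -> None:
        self.tokens: Dict[str, RefreshToken] = {}

    def revoke_token(self, token_hash: str) -> None:
        self.tokens[token_hash].revoked = True

    def revoke_family(self, family_id: str) -> int:
        """Revoke every active token in a family; return how many changed."""
        changed = 0
        for token in self.tokens.values():
            if token.family_id == family_id and not token.revoked:
                token.revoked = True
                changed += 1
        return changed


def refresh(store: TokenStore, token_hash: str, new_hash: str) -> bool:
    """Rotate a refresh token; reuse of a revoked token kills its family."""
    token = store.tokens.get(token_hash)
    if token is None:
        return False
    if token.revoked:
        store.revoke_family(token.family_id)  # reuse detected
        return False
    store.revoke_token(token_hash)
    store.tokens[new_hash] = RefreshToken(token.family_id)
    return True


store = TokenStore()
store.tokens["t1"] = RefreshToken("fam-a")
rotated = refresh(store, "t1", "t2")   # normal rotation
replayed = refresh(store, "t1", "t3")  # t1 was already revoked
```

Because the store methods complete their own writes, the caller never needs a separate save step, which mirrors the removal of `SaveChangesAsync` from the repository interface.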
reviewer d169cbe9d5 feat(ops): production resilience - healthchecks, restart_policy, log-rotation, --wait deploy [skip ci]
2026-06-13 20:04:42 +02:00
devops 802d2cef3f fix(ci): remove swagger from smoke test — disabled in production
CI - Build & Test / Backend (.NET) (push) Successful in 25s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Swagger (/swagger) is only enabled in Development mode (Program.cs
gates it behind app.Environment.IsDevelopment()). In production,
nginx serves the frontend catch-all (index.html), so the check
always returns 200 but never actually validates the API layer.

/health already covers API + database + runtime health checks.
No replacement endpoint needed — the smoke test still validates
both the dashboard and the backend API via /health.
2026-06-09 21:24:27 +02:00
devops 84bf9b7fba fix(ci): correct swagger path + add deploy concurrency guard
CI - Build & Test / Backend (.NET) (push) Successful in 24s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Iteration 2 fix: /api/swagger → /swagger (correct ASP.NET default).

Iteration 3 — Concurrency guard:
- concurrency group 'deploy-production': ensures only one deploy
  runs at a time (cancel-in-progress: false so queued deploys
  wait instead of being cancelled).
- Why: prevents race conditions when CI-triggered workflow_run
  and manual workflow_dispatch overlap. Without this, parallel
  deploys could corrupt docker compose state or conflict on
  shared resources (ports, volumes, version tags).
2026-06-09 21:20:54 +02:00
devops cf00318f23 feat(ci): robust health check + multi-endpoint smoke test + rollback hint
CI - Build & Test / Backend (.NET) (push) Successful in 26s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Iteration 2 — Deploy robustness:
- Health check: Fibonacci-ish backoff (1,2,3,5,8,13s) instead of fixed
  5s intervals. Why: containers need variable warmup time; fixed intervals
  either wait too long or give up too early. Total budget ~32s vs 30s before.
- Smoke test: now checks /dashboard, /health, and /api/swagger. Why: a
  single endpoint check can miss backend-only outages; API Swagger confirms
  the ASP.NET layer is healthy.
- Rollback hint: on any failure, prints previous git tag + docker compose
  commands for quick manual rollback. Why: reduces MTTR by providing the
  exact recovery steps inline.
2026-06-09 21:19:07 +02:00
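The backoff schedule from cf00318f23 can be sketched as follows. The delays (1, 2, 3, 5, 8, 13 seconds) are taken from the commit message; the function name `check_with_backoff`, the injected `probe`, and the choice to make one final attempt after the last wait are assumptions for illustration.

```python
import time
from typing import Callable, Sequence

BACKOFF_DELAYS = (1, 2, 3, 5, 8, 13)  # seconds, from the commit message


def check_with_backoff(
    probe: Callable[[], bool],
    delays: Sequence[float] = BACKOFF_DELAYS,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it succeeds, sleeping delays[i] after failure i."""
    for delay in delays:
        if probe():
            return True
        sleep(delay)
    return probe()  # final attempt after the last wait


# Simulated probe: fails twice, then succeeds.
results = iter([False, False, True])
waited = []
healthy = check_with_backoff(lambda: next(results), sleep=waited.append)
total_budget = sum(BACKOFF_DELAYS)
```

The delays sum to 32 seconds, matching the "~32s" budget stated in the commit message, while a container that comes up quickly is detected after only a second or two.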
devops 045e36b014 fix(ci): remove backslash escapes from Gitea expressions in Build step
CI - Build & Test / Backend (.NET) (push) Successful in 27s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
The \$ escape before ${{ inputs.service }} prevented Gitea from
evaluating the expression, passing literal backslash to the shell.
Also use ${BUILD_ARGS} (shell expansion) instead of \$BUILD_ARGS
so the outer shell passes the actual build args to the DIND container.
2026-06-09 21:08:33 +02:00
devops 5a72399136 fix(ci): create .env in workspace before sync (DIND path issue)
CI - Build & Test / Backend (.NET) (push) Successful in 25s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 2s
Phase 1 — .env provisioning fix:
The previous approach tried to write .env directly to
/opt/openclaw/data/openclaw/workspace/nexus from inside the
runner's job container, but that host path is not mounted there.

Fix: write .env from Gitea secrets into the workspace first,
then sync it along with the source code via the existing
Docker-in-Docker pattern (which can access the host path).

Combined the separate '.env creation' and 'sync code' steps
into a single atomic 'Sync code + .env to host' step.
2026-06-09 21:06:04 +02:00
devops 3646521a75 fix(ci): version bump from git tags + .env from secrets
CI - Build & Test / Backend (.NET) (push) Successful in 29s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Phase 1 — Deploy reliability:
- Version bump: derive current version from 'git describe --tags' instead of
  VERSION file. This eliminates race conditions where the VERSION file is
  stale but the tag already exists from a previous failed run.
- Tag creation: use 'git tag -f' + 'git push --force --tags' to handle
  retries gracefully when tags already exist.
- Environment: provision .env at the host deploy path from Gitea secrets
  (ENV_POSTGRES_PASSWORD, ENV_JWT_KEY, ENV_OWNER_PASSWORD, ENV_OPENCLAW_TOKEN).
  This ensures .env always exists on the host even though it's excluded from
  the sync step for security.

Runner label was already fixed in previous commit (runs-on: ubuntu-latest).
2026-06-09 21:03:15 +02:00
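Deriving the next version from `git describe --tags` output, as 3646521a75 describes, might be sketched like this. The tag format (`vMAJOR.MINOR.PATCH`) and the choice of a patch bump are assumptions, since the commit message does not state them; `next_patch_version` is a hypothetical helper.

```python
import re

# Matches "v1.4.2" as well as "v1.4.2-3-g1a2b3c4" (commits past the tag).
DESCRIBE_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-\d+-g[0-9a-f]+)?$")


def next_patch_version(describe_output: str) -> str:
    """Parse `git describe --tags` output and return the next patch tag."""
    match = DESCRIBE_RE.match(describe_output.strip())
    if match is None:
        raise ValueError(f"unrecognized describe output: {describe_output!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"v{major}.{minor}.{patch + 1}"


on_tag = next_patch_version("v1.4.2\n")
past_tag = next_patch_version("v1.4.2-3-g1a2b3c4")
```

Since the input comes from the tags themselves, a stale VERSION file cannot influence the result, which is the race the commit set out to eliminate.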
devops c13d730aa0 fix(ci): change deploy runs-on to ubuntu-latest for reliable label matching
CI - Build & Test / Backend (.NET) (push) Successful in 25s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
The runner registers with labels [linux, dotnet, node, ubuntu-latest, ...],
none of which is 'deploy'. Changed the workflow to use the consistently
available ubuntu-latest label. Also added the 'deploy' label to the runner
registration for future compatibility.
2026-06-09 20:56:44 +02:00
devops b41992ec0a fix: deploy via Docker-in-Docker with host-mounted nexus path
CI - Build & Test / Backend (.NET) (push) Successful in 27s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Runner job containers don't have the /workspace/nexus mount.
- Sync code to host path using a docker run helper (preserves .env)
- Build & deploy from host path using docker:cli image
- Health check with retry loop for slow container startup
2026-06-09 20:33:42 +02:00
devops 3e0db0dfd1 fix: deploy from checkout dir instead of /workspace/nexus
CI - Build & Test / Backend (.NET) (push) Successful in 29s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
The runner job container does not have /workspace/nexus mounted.
Run everything from the checkout directory which has .git and compose.yaml.
- Removed rsync sync step (not needed)
- Version bump uses checkout dir with full git history
- Docker compose runs from checkout dir
- Added fetch-depth:0 and fetch-tags for version tagging
2026-06-09 20:30:31 +02:00
devops 247dddc2fc fix: install rsync before deploy sync step
CI - Build & Test / Backend (.NET) (push) Successful in 24s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 2s
The Gitea runner ubuntu-latest image lacks rsync, causing
the Sync-to-deploy-path step to fail with exit code 127.
Added apt-get install rsync before the sync step.
2026-06-09 20:27:08 +02:00
devops c01c6c990e ci: CD auto-deploy via workflow_run trigger
CI - Build & Test / Backend (.NET) (push) Successful in 27s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
- deploy.yaml now triggers automatically after successful CI completion
- Adds workflow_run event listener for 'CI - Build & Test'
- Guards deploy to only run when CI conclusion == success
- Preserves manual workflow_dispatch for targeted deploys
- Adds CI/CD note to README
2026-06-09 20:25:32 +02:00
iris edda569536 ci: version bump semantics in deploy workflow
CI - Build & Test / Backend (.NET) (push) Successful in 24s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
2026-06-09 20:04:36 +02:00
bao ca86c4c310 feat: Add CI/CD pipeline with Gitea Actions
CI - Build & Test / Backend (.NET) (push) Failing after 14s
CI - Build & Test / Frontend (Vue/TS) (push) Failing after 37s
CI - Build & Test / Security Check (push) Failing after 1s
- ci.yaml: Backend build+test, Frontend type-check+build, Security scan
- deploy.yaml: Manual deploy with health check + smoke test
- Deploy supports per-service deploy and --no-cache option
2026-06-09 16:43:43 +02:00