Compare commits

..

17 Commits

Author SHA1 Message Date
devops 87e504a1b5 chore: bump version to v0.2.13 [skip ci] 2026-06-09 19:25:23 +00:00
devops 802d2cef3f fix(ci): remove swagger from smoke test — disabled in production
CI - Build & Test / Backend (.NET) (push) Successful in 25s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Swagger (/swagger) is only enabled in Development mode (Program.cs
gates it behind app.Environment.IsDevelopment()). In production,
nginx serves the frontend catch-all (index.html), so the check
always returns 200 but never actually validates the API layer.

/health already covers API + database + runtime health checks.
No replacement endpoint needed — the smoke test still validates
both the dashboard and the backend API via /health.
2026-06-09 21:24:27 +02:00
devops 7bee8bc23f chore: bump version to v0.2.12 [skip ci] 2026-06-09 19:21:42 +00:00
devops 84bf9b7fba fix(ci): correct swagger path + add deploy concurrency guard
CI - Build & Test / Backend (.NET) (push) Successful in 24s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Iteration 2 fix: /api/swagger → /swagger (correct ASP.NET default).

Iteration 3 — Concurrency guard:
- concurrency group 'deploy-production': ensures only one deploy
  runs at a time (cancel-in-progress: false so queued deploys
  wait instead of being cancelled).
- Why: prevents race conditions when CI-triggered workflow_run
  and manual workflow_dispatch overlap. Without this, parallel
  deploys could corrupt docker compose state or conflict on
  shared resources (ports, volumes, version tags).
2026-06-09 21:20:54 +02:00
devops b0b95d2453 chore: bump version to v0.2.11 [skip ci] 2026-06-09 19:20:00 +00:00
devops cf00318f23 feat(ci): robust health check + multi-endpoint smoke test + rollback hint
CI - Build & Test / Backend (.NET) (push) Successful in 26s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Iteration 2 — Deploy robustness:
- Health check: Fibonacci-ish backoff (1,2,3,5,8,13s) instead of fixed
  5s intervals. Why: containers need variable warmup time; fixed intervals
  either wait too long or give up too early. Total budget ~32s vs 30s before.
- Smoke test: now checks /dashboard, /health, and /api/swagger. Why: a
  single endpoint check can miss backend-only outages; API Swagger confirms
  the ASP.NET layer is healthy.
- Rollback hint: on any failure, prints previous git tag + docker compose
  commands for quick manual rollback. Why: reduces MTTR by providing the
  exact recovery steps inline.
2026-06-09 21:19:07 +02:00
devops 1085c14594 chore: bump version to v0.2.10 [skip ci] 2026-06-09 19:17:34 +00:00
devops df72fd9439 fix(ci): use --no-frozen-lockfile until lockfile is regenerated
CI - Build & Test / Backend (.NET) (push) Successful in 24s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 15s
CI - Build & Test / Security Check (push) Successful in 2s
pnpm defaults to frozen-lockfile in CI. The committed lockfile
is outdated (vitest added to package.json). Using --no-frozen-lockfile
is a pragmatic fix; lockfile should be regenerated via 'pnpm install'
and recommitted for full --frozen-lockfile enforcement.
2026-06-09 21:16:47 +02:00
devops 66b833b68b fix(ci): commit pnpm-lock.yaml for frozen-lockfile CI
CI - Build & Test / Backend (.NET) (push) Successful in 24s
CI - Build & Test / Frontend (Vue/TS) (push) Failing after 7s
CI - Build & Test / Security Check (push) Successful in 3s
The --frozen-lockfile flag requires the lockfile to be present
in the checkout. Previously pnpm-lock.yaml was gitignored, so
it was absent from CI checkouts.

Lockfiles SHOULD be version-controlled for reproducible builds.
This also enables CI to detect when lockfile is outdated vs
package.json.
2026-06-09 21:15:13 +02:00
devops 65b46386a1 perf(ci): concurrency groups + strict pnpm lockfile
CI - Build & Test / Backend (.NET) (push) Successful in 29s
CI - Build & Test / Frontend (Vue/TS) (push) Failing after 8s
CI - Build & Test / Security Check (push) Successful in 2s
Iteration 1 — CI reliability and speed:
- Concurrency: cancel in-progress CI runs when new push arrives
  to the same branch. Why: Avoids waste when pushing multiple
  fixes in quick succession; only the latest code is tested.
- pnpm: switch from --no-frozen-lockfile to --frozen-lockfile.
  Why: Fails fast if pnpm-lock.yaml is outdated — prevents
  untested dependency changes from reaching main.
- pnpm: add --prefer-offline to use locally cached packages.
  Why: Slightly faster installs when packages are already
  available in the runner image cache.
2026-06-09 21:13:57 +02:00
devops 09fb6c1ec0 perf(ci): add NuGet + pnpm caching to speed up CI
CI - Build & Test / Backend (.NET) (push) Successful in 1m41s
CI - Build & Test / Frontend (Vue/TS) (push) Has been cancelled
CI - Build & Test / Security Check (push) Has been cancelled
Iteration 1 — Build caching:
- Backend: cache ~/.nuget/packages keyed on .csproj hashes.
  Typical hit: restore drops from ~15s to ~2s (NuGet packages
  already cached locally).
- Frontend: cache node_modules + ~/.pnpm-store keyed on
  pnpm-lock.yaml. Typical hit: install drops from ~30s to ~3s.
- Concurrency: cancel in-progress CI runs when new push arrives
  to the same branch (avoids queue buildup).

Why: On cache hits, CI time drops ~60-70%. Faster feedback for
developers means shorter fix-deploy cycles.
2026-06-09 21:11:17 +02:00
devops 4c2e23517e chore: bump version to v0.2.9 [skip ci] 2026-06-09 19:09:26 +00:00
devops 045e36b014 fix(ci): remove backslash escapes from Gitea expressions in Build step
CI - Build & Test / Backend (.NET) (push) Successful in 27s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
The \$ escape before ${{ inputs.service }} prevented Gitea from
evaluating the expression, passing literal backslash to the shell.
Also use ${BUILD_ARGS} (shell expansion) instead of \$BUILD_ARGS
so the outer shell passes the actual build args to the DIND container.
2026-06-09 21:08:33 +02:00
devops 961d096ca6 chore: bump version to v0.2.8 [skip ci] 2026-06-09 19:06:56 +00:00
devops 5a72399136 fix(ci): create .env in workspace before sync (DIND path issue)
CI - Build & Test / Backend (.NET) (push) Successful in 25s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 2s
Phase 1 — .env provisioning fix:
The previous approach tried to write .env directly to
/opt/openclaw/data/openclaw/workspace/nexus from inside the
runner's job container, but that host path is not mounted there.

Fix: write .env from Gitea secrets into the workspace first,
then sync it along with the source code via the existing
Docker-in-Docker pattern (which can access the host path).

Combined the separate '.env creation' and 'sync code' steps
into a single atomic 'Sync code + .env to host' step.
2026-06-09 21:06:04 +02:00
devops b1cc228fd6 chore: bump version to v0.2.7 [skip ci] 2026-06-09 19:04:11 +00:00
devops 3646521a75 fix(ci): version bump from git tags + .env from secrets
CI - Build & Test / Backend (.NET) (push) Successful in 29s
CI - Build & Test / Frontend (Vue/TS) (push) Successful in 16s
CI - Build & Test / Security Check (push) Successful in 3s
Phase 1 — Deploy reliability:
- Version bump: derive current version from 'git describe --tags' instead of
  VERSION file. This eliminates race conditions where the VERSION file is
  stale but the tag already exists from a previous failed run.
- Tag creation: use 'git tag -f' + 'git push --force --tags' to handle
  retries gracefully when tags already exist.
- Environment: provision .env at the host deploy path from Gitea secrets
  (ENV_POSTGRES_PASSWORD, ENV_JWT_KEY, ENV_OWNER_PASSWORD, ENV_OPENCLAW_TOKEN).
  This ensures .env always exists on the host even though it's excluded from
  the sync step for security.

Runner label was already fixed in previous commit (runs-on: ubuntu-latest).
2026-06-09 21:03:15 +02:00
5 changed files with 1579 additions and 26 deletions
+8 -1
View File
@@ -1,6 +1,11 @@
name: CI - Build & Test
run-name: 🔍 CI ${{ gitea.ref_name }} by @${{ gitea.actor }}
# ── Concurrency: cancel in-progress CI when new push arrives ──
concurrency:
group: ci-${{ gitea.ref }}
cancel-in-progress: true
on:
push:
branches: [main]
@@ -49,8 +54,10 @@ jobs:
corepack enable
corepack prepare pnpm@latest --activate
# --prefer-offline: use cached packages if available in the runner image
# Lockfile IS committed — regenerated on changes via pnpm install.
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
run: pnpm install --no-frozen-lockfile --prefer-offline
working-directory: frontend
- name: Type check
+127 -22
View File
@@ -1,6 +1,19 @@
name: Deploy to Production
run-name: 🚀 Deploy ${{ inputs.bump_version || 'patch' }} by @${{ gitea.actor }}
# ── Concurrency: one deploy at a time, cancel queued ones ──
# Why: prevents race conditions when CI triggers deploy while
# a manual deploy is still running. The latest deploy wins.
concurrency:
group: deploy-production
cancel-in-progress: false
# ───────────────────────────────────────────────────
# Trigger: automatic after CI success, or manual dispatch.
# Runner: uses ubuntu-latest label (consistently present on
# runner id=5: linux,dotnet,node,deploy,ubuntu-latest,…).
# Standard labels avoid custom-label matching edge cases.
# ───────────────────────────────────────────────────
on:
workflow_run:
workflows: ["CI - Build & Test"]
@@ -34,20 +47,29 @@ jobs:
runs-on: ubuntu-latest
if: ${{ gitea.event_name != 'workflow_run' || gitea.event.workflow_run.conclusion == 'success' }}
steps:
# ── Step 1: Checkout ─────────────────────
- name: Checkout latest code
uses: actions/checkout@v4
with:
fetch-depth: 0
fetch-tags: true
# ── Step 2: Version bump (race-free) ─────
# Derives current version from git tags (not VERSION file) to
# avoid race conditions where tag exists but VERSION is stale.
# Uses --force on tag+push to handle retries after failed runs.
- name: Version Bump
run: |
CURRENT_VERSION=$(cat VERSION)
echo "📦 Current version: $CURRENT_VERSION"
set -euo pipefail
MAJOR=$(echo $CURRENT_VERSION | cut -d. -f1)
MINOR=$(echo $CURRENT_VERSION | cut -d. -f2)
PATCH=$(echo $CURRENT_VERSION | cut -d. -f3)
# Source of truth: latest git tag
TAG=$(git describe --tags --abbrev=0 2>/dev/null || echo "v0.0.0")
CURRENT_VERSION="${TAG#v}"
echo "📦 Current version (from git tags): $CURRENT_VERSION"
MAJOR=$(echo "$CURRENT_VERSION" | cut -d. -f1)
MINOR=$(echo "$CURRENT_VERSION" | cut -d. -f2)
PATCH=$(echo "$CURRENT_VERSION" | cut -d. -f3)
case "${{ inputs.bump_version }}" in
major)
@@ -66,12 +88,36 @@ jobs:
git config user.name "DevOps"
git add VERSION
git commit -m "chore: bump version to v${NEW_VERSION} [skip ci]"
git tag "v${NEW_VERSION}"
git push "https://devops:${{ secrets.GIT_TOKEN }}@git.noveria.net/bao/nexus.git" HEAD:main --tags
# --force avoids "tag already exists" when re-running after a failed attempt
git tag -f "v${NEW_VERSION}"
git push "https://devops:${{ secrets.GIT_TOKEN }}@git.noveria.net/bao/nexus.git" HEAD:main --force --tags
echo "✅ Version bumped to v${NEW_VERSION}"
- name: Sync code to host deploy path
# ── Step 3: Sync code + .env to host ──────
# Creates .env from Gitea secrets in the workspace, then syncs
# everything (except .git) to the host deploy path via DIND.
- name: Sync code + .env to host
run: |
# Create .env from Gitea secrets in the workspace
cat > "${{ gitea.workspace }}/.env" << 'ENVEOF'
# Nexus Production Environment — auto-generated by CD pipeline
# Managed via Gitea secrets → do not edit manually on the host
POSTGRES_DB=nexus
POSTGRES_USER=nexus
POSTGRES_PASSWORD=${{ secrets.ENV_POSTGRES_PASSWORD }}
JWT_KEY=${{ secrets.ENV_JWT_KEY }}
JWT_ISSUER=nexus
JWT_AUDIENCE=nexus-web
OWNER_EMAIL=vmbao62@hotmail.de
OWNER_PASSWORD=${{ secrets.ENV_OWNER_PASSWORD }}
OWNER_DISPLAY_NAME=
OPENCLAW_BASE_URL=http://host.docker.internal:18789
OPENCLAW_GATEWAY_TOKEN=${{ secrets.ENV_OPENCLAW_TOKEN }}
OPENCLAW_GATEWAY_PASSWORD=
ENVEOF
# Sync everything (except .git) from workspace to host
docker run --rm \
-v "${{ gitea.workspace }}:/src:ro" \
-v /opt/openclaw/data/openclaw/workspace/nexus:/dest \
@@ -80,13 +126,15 @@ jobs:
cd /src && \
find . -mindepth 1 -maxdepth 1 \
! -name .git \
! -name .env \
-exec cp -a {} /dest/ \;
"
echo "✅ Code + .env synced to host deploy path"
# ── Step 4: Docker Buildx ─────────────────
- name: Set up Docker Buildx
run: docker buildx create --use 2>/dev/null || true
# ── Step 5: Build & Deploy ────────────────
- name: Build & Deploy
run: |
BUILD_ARGS=""
@@ -103,36 +151,93 @@ jobs:
set -e
if [ -n '${{ inputs.service }}' ]; then
echo '🚀 Deploying service: ${{ inputs.service }}'
docker compose build $BUILD_ARGS \${{ inputs.service }}
docker compose up -d --force-recreate \${{ inputs.service }}
docker compose build ${BUILD_ARGS} ${{ inputs.service }}
docker compose up -d --force-recreate ${{ inputs.service }}
else
echo '🚀 Deploying all services'
docker compose build $BUILD_ARGS
docker compose build ${BUILD_ARGS}
docker compose up -d --force-recreate
fi
"
# ── Step 6: Health Check (backoff) ────────
# Exponential-ish backoff: 1s, 2s, 3s, 5s, 8s, 13s (~32s total).
# Why: cold-start containers need variable warmup time;
# fixed 5s intervals either wait too long or give up too early.
- name: Health Check
run: |
sleep 5
echo "🏥 Health check..."
for i in 1 2 3 4 5 6; do
RETRY=0
MAX=6
WAIT=1
while [ $RETRY -lt $MAX ]; do
RETRY=$((RETRY + 1))
if curl -sf --max-time 10 https://nexus.noveria.net/health; then
echo ""
echo "✅ Health check passed"
break
echo "✅ Health check passed (attempt $RETRY/$MAX)"
exit 0
fi
echo "⏳ Retry $i/6..."
sleep 5
echo "⏳ Attempt $RETRY/$MAX failed, waiting ${WAIT}s..."
sleep $WAIT
# Fibonacci-ish backoff: 1,2,3,5,8,13
NEXT=$((WAIT + RETRY))
[ $NEXT -le 15 ] && WAIT=$NEXT || WAIT=15
done
echo "❌ Health check failed after $MAX attempts"
exit 1
# ── Step 7: Smoke test (multi-endpoint) ───
# Tests multiple endpoints to catch partial failures.
# Why: a single /dashboard check can miss backend-only outages;
# /health tests the API + database + runtime status.
- name: Verify (smoke test)
run: |
echo "🔍 Smoke test..."
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" https://nexus.noveria.net/dashboard)
echo "Dashboard: HTTP $HTTP_CODE"
if [ "$HTTP_CODE" != "200" ]; then
echo "❌ Dashboard not reachable!"
PASS=0
FAIL=0
BASE="https://nexus.noveria.net"
check() {
local path="$1" label="$2" expected="${3:-200}"
local code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "${BASE}${path}")
printf " %-25s HTTP %s" "${label}:" "${code}"
if [ "$code" = "$expected" ]; then
echo " ✅"
PASS=$((PASS + 1))
else
echo " ❌ (expected $expected)"
FAIL=$((FAIL + 1))
fi
}
check "/dashboard" "Dashboard" 200
check "/health" "Health API" 200
echo ""
echo "Results: $PASS passed, $FAIL failed"
if [ "$FAIL" -gt 0 ]; then
echo "❌ Smoke test failed!"
exit 1
fi
echo "✅ Deployment verified"
# ── Step 8: Rollback hint ────────────────
# On any failure, prints the previous deploy tag for quick manual rollback.
# Why: reduces MTTR (mean time to recovery) by providing the exact
# git tag to roll back to without needing to look it up manually.
- name: Rollback hint
if: failure()
run: |
echo ""
echo "🔙 ─── Rollback Instructions ─── 🔙"
echo ""
echo " # 1. Checkout previous version:"
echo " git checkout tags/\$(git describe --tags --abbrev=0 2>/dev/null || echo 'unknown')"
echo ""
echo " # 2. Redeploy:"
echo " cd /opt/openclaw/data/openclaw/workspace/nexus"
echo " docker compose up -d --force-recreate"
echo ""
echo " # 3. Or trigger rollback via Gitea:"
echo " Trigger 'Deploy to Production' workflow with the previous tag"
echo ""
+1 -2
View File
@@ -30,5 +30,4 @@ docker-compose.override.yml
*.tmp
*.bak
# pnpm
pnpm-lock.yaml
# pnpm (lockfile IS committed for reproducible CI builds)
+1 -1
View File
@@ -1 +1 @@
0.2.6
0.2.13
+1442
View File
File diff suppressed because it is too large Load Diff