
Operations

Daily operating checks for the review service: health, queue, logs, metrics, dashboards, and context services.

Health endpoints

/health
Liveness. Use for simple process checks.
/ready
Readiness. Use for orchestration health checks, because it waits for the database and migrations before reporting ready.
/metrics
Prometheus metrics for queues, jobs, HTTP requests, uptime, and AI usage.
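
As a minimal sketch of how a deploy script might gate on readiness, the function below polls /ready until curl gets a successful response. It assumes the endpoint returns a non-2xx status (or refuses the connection) until the service is ready, which this page does not state explicitly. The function name wait_for_ready, the attempt count, and the delay are illustrative choices, not part of the service.

```shell
# Poll a readiness URL until it answers with a 2xx status or attempts run out.
# Assumption: the service is on localhost:8787, as in the commands on this page.
# Assumption: /ready answers non-2xx (or not at all) while DB/migrations are pending.
wait_for_ready() {
  local url="${1:-http://localhost:8787/ready}"
  local attempts="${2:-30}"   # illustrative default
  local delay="${3:-2}"       # seconds between attempts, illustrative default
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s silences progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "ready after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after ${attempts} attempt(s)" >&2
  return 1
}
```

A script could call `wait_for_ready http://localhost:8787/ready 30 2` after `docker compose up -d` and stop on a non-zero exit status.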

Useful commands

docker compose ps
docker compose logs -f gittensory
curl http://localhost:8787/ready
curl http://localhost:8787/metrics

Important log events

selfhost_listening
selfhost_migrations_applied
selfhost_ai_provider
selfhost_ai_review_plan
selfhost_embed_provider
selfhost_vectorize
selfhost_job_dead
selfhost_cron_error
review_context_fetch_failed
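
One way to scan for the events above is to count the ones whose names suggest a failure. This is a sketch only: treating selfhost_job_dead, selfhost_cron_error, and review_context_fetch_failed as the failure set is an inference from their names, and the function name summarize_failure_events is hypothetical. It matches on the event name alone, so it makes no assumption about the log line format.

```shell
# Read log lines on stdin and print "event=count" for each failure-type event.
# Assumption: these three event names are the ones worth alerting on.
summarize_failure_events() {
  grep -oE 'selfhost_job_dead|selfhost_cron_error|review_context_fetch_failed' \
    | sort \
    | uniq -c \
    | awk '{ print $2 "=" $1 }'   # normalize "count name" to "name=count"
}
```

For example, `docker compose logs --no-color --since 1h gittensory | summarize_failure_events` would summarize the last hour. No output means none of the three events appeared.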

Observability profile

The observability profile starts Prometheus, Alertmanager, Loki, Promtail, and Grafana with dashboards for infra, review activity, and AI usage.

When OpenTelemetry and Sentry are enabled, job audit logs and Sentry events include trace_id/span_id fields so an operator can jump from a failed job or issue to the matching trace in Grafana or Tempo.

docker compose --profile observability up -d

Sentry cron monitors

When SENTRY_DSN is set, the self-host runtime emits Sentry monitor check-ins for the recurring loops where silent stoppage matters most. Leaving SENTRY_DSN unset keeps monitor reporting off.

Scheduled loop
The two-minute maintenance tick that fans out sweeps, backfills, and refresh jobs.
Orb export
The hourly outcome export loop used by brokered self-host deployments.
Orb relay drain
The pull-mode relay loop for installations that pull events from Orb over an outbound connection.

A missed monitor means the process may still be alive but the recurring work is not checking in on schedule. Pair the monitor with queue depth, dead-job counts, and the structured error log for the same subsystem.

Routine checks

  • Pending queue count is not growing without jobs being processed.
  • Dead jobs stay at zero or are investigated promptly.
  • Webhook deliveries are recent and have 2xx responses.
  • AI usage matches expected review volume and model/effort choices.
  • REES and RAG failures are visible and bounded.
  • Backups are recent and restore-tested.
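
The queue and dead-job checks above can be scripted against /metrics. The sketch below sums every sample of one metric from Prometheus text output read on stdin. The function name metric_sum is hypothetical, and this page does not list metric names, so the names in the usage note are placeholders; look up the real ones in the /metrics output of your deployment.

```shell
# Sum all samples of one metric from Prometheus text exposition format on stdin.
# Exits non-zero if the metric is not present at all.
metric_sum() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP/TYPE comment lines
    {
      line = $0
      sub(/[{].*[}]/, "", line)          # drop the label set, which may contain spaces
      n = split(line, parts, " ")
      if (n >= 2 && parts[1] == name) { sum += parts[2]; found = 1 }
    }
    END { if (found) print sum; else exit 1 }
  '
}
```

With placeholder names, `curl -fs http://localhost:8787/metrics | metric_sum example_jobs_dead_total` prints the total across all label sets. Comparing two readings of a pending-queue metric taken a few minutes apart shows whether the queue is draining.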

If an operating check fails, go to Self-host troubleshooting.