# Docker Setup for GridPilot

This document describes the Docker setup for local development and production deployment of GridPilot.

## Quick Start

### Development

Start all services with hot-reloading:

```bash
npm run docker:dev:build
```

This will:
- Start PostgreSQL database on port 5432
- Start API on port 3001 (container port 3000, debugger 9229)
- Start Website on port 3000
- Enable hot-reloading for both apps

Access:
- Website: http://localhost:3000
- API: http://localhost:3001
- Database: localhost:5432

### Production

Start all services in production mode:

```bash
npm run docker:prod:build
```

This will:
- Build optimized Docker images
- Start PostgreSQL, Redis, API, Website, and Nginx
- Enable health checks, auto-restart, and resource limits
- Configure caching and performance optimizations

Access:
- Nginx (Website + API): http://localhost:80

## Available Commands

### Development

- `npm run docker:dev` - Start dev environment (alias of `docker:dev:up`)
- `npm run docker:dev:up` - Start dev environment
- `npm run docker:dev:postgres` - Start dev environment with `GRIDPILOT_API_PERSISTENCE=postgres`
- `npm run docker:dev:inmemory` - Start dev environment with `GRIDPILOT_API_PERSISTENCE=inmemory`
- `npm run docker:dev:build` - Rebuild and start
- `npm run docker:dev:restart` - Restart services
- `npm run docker:dev:ps` - Show service status
- `npm run docker:dev:down` - Stop services
- `npm run docker:dev:logs` - View logs
- `npm run docker:dev:clean` - Stop and remove volumes

### Production

- `npm run docker:prod` - Start prod environment
- `npm run docker:prod:build` - Rebuild and start
- `npm run docker:prod:down` - Stop services
- `npm run docker:prod:logs` - View logs
- `npm run docker:prod:clean` - Stop and remove volumes

### Testing (Docker)

#### Available Commands

**Unified E2E Testing (Recommended):**
- `npm run test:e2e:website` - Run complete e2e test suite
- `npm run docker:e2e:up` - Start all e2e services
- `npm run docker:e2e:down` - Stop e2e services
- `npm run docker:e2e:logs` - View e2e service logs
- `npm run docker:e2e:ps` - Check e2e service status
- `npm run docker:e2e:clean` - Clean e2e environment

**Legacy Testing (Deprecated):**
- `npm run test:docker:website` - Run legacy hybrid tests
- `npm run docker:test:up` - Start legacy API/DB
- `npm run docker:test:down` - Stop legacy services
- `npm run docker:test:clean` - Clean legacy environment

#### Quick Comparison

| Feature | Legacy (Hybrid) | Unified (E2E) |
|---------|-----------------|---------------|
| Website | Local (Playwright webServer) | Docker container |
| API | Docker container | Docker container |
| Database | Docker container | Docker container |
| Playwright | Local | Docker container |
| SWC Issues | ❌ Yes | ✅ No |
| CI Compatible | ❌ No | ✅ Yes |
| Single Command | ❌ No | ✅ Yes |
| Port Conflicts | ❌ Possible | ✅ No |

#### Unified E2E Test Environment (Recommended)

The new unified e2e test environment runs **everything in Docker** - website, API, database, and Playwright tests. This eliminates the hybrid approach and solves Next.js SWC compilation issues.


**Quick Start:**

```bash
# Run complete e2e test suite
npm run test:e2e:website

# Run specific test file (fast, no rebuild)
npm run test:e2e:run -- tests/e2e/website/website-pages.e2e.test.ts

# Or step-by-step:
npm run docker:e2e:up      # Start all services (fast, uses cache)
npm run docker:e2e:build   # Force rebuild website image
npm run docker:e2e:logs    # View logs
npm run docker:e2e:down    # Stop services
npm run docker:e2e:clean   # Clean everything
```

**What this does:**
- Builds optimized website image with all SWC dependencies (cached unless source changes)
- Starts PostgreSQL database (port 5434)
- Starts API server (port 3101)
- Starts website server (port 3100)
- Runs Playwright tests in container
- All services communicate via isolated Docker network

**Architecture:**

```
┌─────────────────────────────────────────────────┐
│  Docker Network: gridpilot-e2e-network          │
│                                                 │
│  ┌────────────┐   ┌───────────┐   ┌──────────┐  │
│  │ Playwright │ → │  Website  │ → │   API    │  │
│  │   Runner   │   │ (Next.js) │   │ (NestJS) │  │
│  └────────────┘   └───────────┘   └──────────┘  │
│        ↓                ↓              ↓        │
│        └────────────────┬──────────────┘        │
│                         ↓                       │
│                   PostgreSQL DB                 │
└─────────────────────────────────────────────────┘
```

**Benefits:**
- ✅ **Fully containerized** - identical to CI environment
- ✅ **No SWC issues** - optimized Dockerfile with build tools
- ✅ **No port conflicts** - isolated network and unique ports
- ✅ **Single command** - one script runs everything
- ✅ **Deterministic** - no local dependencies

#### Legacy Testing (Deprecated)

The old hybrid approach (API/DB in Docker, website locally) is still available but deprecated:
- `npm run test:docker:website` - Start API/DB in Docker, run website locally via Playwright
- Uses [`docker-compose.test.yml`](docker-compose.test.yml:1)
- **Note**: This approach has SWC compilation issues and won't work in CI

**Supporting scripts (legacy):**
- `npm run docker:test:deps` - Verify monorepo dependencies
- `npm run docker:test:up` - Start API and PostgreSQL
- `npm run docker:test:wait` - Wait for API health
- `npm run docker:test:down` - Stop containers

**Recommendation**: Use the unified e2e environment above instead.

## Environment Variables

### "Mock vs Real" (Website & API)

There is **no** `AUTOMATION_MODE` equivalent for the Website/API runtime.

- **Website "mock vs real"** is controlled purely by *which API base URL you point it at* via [`getWebsiteApiBaseUrl()`](apps/website/lib/config/apiBaseUrl.ts:6):
  - Browser calls use `NEXT_PUBLIC_API_BASE_URL`
  - Server/Next.js calls use `API_BASE_URL ?? NEXT_PUBLIC_API_BASE_URL`
- **API "mock vs real"** is controlled by API runtime env:
  - Persistence: `GRIDPILOT_API_PERSISTENCE=postgres|inmemory` in [`AppModule`](apps/api/src/app.module.ts:25)
  - Optional bootstrapping: `GRIDPILOT_API_BOOTSTRAP=0|1` in [`AppModule`](apps/api/src/app.module.ts:35)

Practical presets:
- **Website + real API (Docker dev)**: `npm run docker:dev:build` (Website `3000`, API `3001`, Postgres required).
  - Website browser → API: `NEXT_PUBLIC_API_BASE_URL=http://localhost:3001`
  - Website container → API container: `API_BASE_URL=http://api:3000`
- **Website + mock API (Docker smoke)**: `npm run test:docker:website` (Website `3100`, API mock `3101`).
  - API mock is defined inline in [`docker-compose.test.yml`](docker-compose.test.yml:24)
  - Website browser → API mock: `NEXT_PUBLIC_API_BASE_URL=http://localhost:3101`
  - Website container → API mock container: `API_BASE_URL=http://api:3000`

### Website ↔ API Connection

The website talks to the API via `fetch()` in [`BaseApiClient`](apps/website/lib/api/base/BaseApiClient.ts:11), and it always includes cookies (`credentials: 'include'`). That means:

- The **browser** must be pointed at a host-accessible API URL via `NEXT_PUBLIC_API_BASE_URL`
- The **server** (Next.js / Node) must be pointed at a container-network API URL via `API_BASE_URL` (when running in Docker)

The single source of truth for "what base URL should I use?"
is [`getWebsiteApiBaseUrl()`](apps/website/lib/config/apiBaseUrl.ts:6):

- Browser: reads `NEXT_PUBLIC_API_BASE_URL`
- Server: reads `API_BASE_URL ?? NEXT_PUBLIC_API_BASE_URL`
- In Docker/CI/test: throws if missing (no silent localhost fallback)

#### Dev Docker defaults (docker-compose.dev.yml)

- Website: `http://localhost:3000`
- API: `http://localhost:3001` (maps to container `api:3000`)
- `NEXT_PUBLIC_API_BASE_URL=http://localhost:3001` (browser → host port)
- `API_BASE_URL=http://api:3000` (website container → api container)

#### E2E Docker defaults (docker-compose.e2e.yml)

This stack runs **everything in Docker** for fully containerized e2e testing:

- Website: `http://website:3000` (containerized Next.js, exposed as `localhost:3100`)
- API: `http://api:3000` (containerized NestJS, exposed as `localhost:3101`)
- PostgreSQL: `db:5432` (containerized, exposed as `localhost:5434`)
- Playwright: Runs in container, connects via Docker network
- `NEXT_PUBLIC_API_BASE_URL=http://api:3000` (browser → container)
- `API_BASE_URL=http://api:3000` (website → API container)

**Key differences from legacy approach**:
- ✅ Website runs in Docker (no SWC issues)
- ✅ Playwright runs in Docker (identical to CI)
- ✅ All services on isolated network
- ✅ No port conflicts with local dev
- ✅ Single command execution

**Accessing services during development**:
- Website: `http://localhost:3100`
- API: `http://localhost:3101`
- Database: `localhost:5434`

#### Test Docker defaults (docker-compose.test.yml) - Legacy

**Deprecated**: Use `docker-compose.e2e.yml` instead.


The legacy test stack is intended for deterministic smoke tests and uses different host ports to avoid colliding with `docker:dev`:

- Website: `http://localhost:3000` (started by Playwright webServer, not Docker)
- API: `http://localhost:3101` (maps to container `api:3000`)
- PostgreSQL: `localhost:5433` (maps to container `5432`)
- `NEXT_PUBLIC_API_BASE_URL=http://localhost:3101` (browser → host port)
- `API_BASE_URL=http://localhost:3101` (Playwright webServer → host port)

**Important**:
- The website runs locally via Playwright's `webServer` config to avoid Next.js SWC compilation issues in Docker.
- The API is a real TypeORM/PostgreSQL server (not a mock) for testing actual database interactions.
- Playwright automatically starts the website server before running tests.

#### Troubleshooting (E2E)

**Common Issues:**
- **"Website not building"**: Ensure Docker has enough memory (4GB+). SWC compilation is memory-intensive.
- **"Port already in use"**: Use `npm run docker:e2e:down` to stop conflicting services.
- **"Module not found"**: Run `npm run docker:e2e:clean` to rebuild from scratch.
- **"Database connection failed"**: Wait for health checks. Use `npm run docker:e2e:logs` to check status.
- **"Playwright timeout"**: Increase timeout in `playwright.website.config.ts` if needed.

**Debug Commands:**

```bash
# View all service logs
npm run docker:e2e:logs

# Check service status
npm run docker:e2e:ps

# Clean everything and restart
npm run docker:e2e:clean && npm run docker:e2e:up

# Follow logs for a specific service
docker-compose -f docker-compose.e2e.yml logs -f website
docker-compose -f docker-compose.e2e.yml logs -f api
docker-compose -f docker-compose.e2e.yml logs -f db
```

**Migration from Legacy to Unified:**

If you were using the old `test:docker:website` approach:
1. **Stop old services**: `npm run docker:test:down`
2. **Clean up**: `npm run docker:test:clean`
3. **Use new approach**: `npm run test:e2e:website`

The new approach is:
- ✅ More reliable (no SWC issues)
- ✅ Faster (no local server startup)
- ✅ CI-compatible (identical environment)
- ✅ Simpler (single command)

#### Troubleshooting (Legacy - Deprecated)

- **Port conflicts**: If `docker:dev` is running, use `npm run docker:dev:down` before `npm run test:docker:website` to avoid port conflicts (dev uses 3001, test uses 3101).
- **Website not starting**: Playwright's webServer may fail if dependencies are missing. Run `npm install` first.
- **Cookie errors**: The `WebsiteAuthManager` requires both `url` and `path` properties for cookies. Check Playwright version compatibility.
- **Docker volumes stuck**: Run `npm run docker:test:down` (uses `--remove-orphans` + `rm -f`).
- **SWC compilation issues**: If website fails to start in Docker, use the local webServer approach (already configured in `playwright.website.config.ts`).

### API "Real vs In-Memory" Mode

The API can run in one of two persistence modes:
- **postgres**: loads [`DatabaseModule`](apps/api/src/domain/database/DatabaseModule.ts:1) (requires Postgres)
- **inmemory**: does not load `DatabaseModule` (no Postgres required)

Control it with:
- `GRIDPILOT_API_PERSISTENCE=postgres|inmemory` (defaults to `postgres` if `DATABASE_URL` is set, otherwise `inmemory`)
- Optional: `GRIDPILOT_API_BOOTSTRAP=0` to skip [`BootstrapModule`](apps/api/src/domain/bootstrap/BootstrapModule.ts:1)

### Development (.env.development)

Copy and customize as needed. Default values work out of the box.
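
The environment-driven rules described in this section (which API base URL the website uses, and which persistence mode the API defaults to) can be summarized in code. The following is a minimal TypeScript sketch under stated assumptions, not the project's actual implementation: `resolveApiBaseUrl`, `resolvePersistence`, the `strict` flag (standing in for however Docker/CI/test is detected), and the `devFallback` option are hypothetical names introduced for illustration. Only the variable names and their precedence come from the text above.

```typescript
// Illustrative sketch only; function and option names are hypothetical.
type Env = Record<string, string | undefined>;

interface BaseUrlOptions {
  isBrowser: boolean;   // true when resolving for browser-side calls
  strict: boolean;      // assumed stand-in for "running in Docker/CI/test"
  devFallback?: string; // assumed fallback for plain local development
}

function resolveApiBaseUrl(env: Env, opts: BaseUrlOptions): string {
  // Browser: NEXT_PUBLIC_API_BASE_URL
  // Server:  API_BASE_URL ?? NEXT_PUBLIC_API_BASE_URL
  const configured = opts.isBrowser
    ? env["NEXT_PUBLIC_API_BASE_URL"]
    : (env["API_BASE_URL"] ?? env["NEXT_PUBLIC_API_BASE_URL"]);
  if (configured) return configured;

  // In strict environments a missing value is an error (no silent fallback).
  if (opts.strict || !opts.devFallback) {
    throw new Error("API base URL is not configured");
  }
  return opts.devFallback;
}

type Persistence = "postgres" | "inmemory";

function resolvePersistence(env: Env): Persistence {
  const explicit = env["GRIDPILOT_API_PERSISTENCE"];
  if (explicit === "postgres" || explicit === "inmemory") return explicit;
  // Default: postgres if DATABASE_URL is set, otherwise inmemory.
  return env["DATABASE_URL"] ? "postgres" : "inmemory";
}

// Example: the dev Docker defaults listed in this document.
const devEnv: Env = {
  NEXT_PUBLIC_API_BASE_URL: "http://localhost:3001",
  API_BASE_URL: "http://api:3000",
};
console.log(resolveApiBaseUrl(devEnv, { isBrowser: true, strict: true }));  // http://localhost:3001
console.log(resolveApiBaseUrl(devEnv, { isBrowser: false, strict: true })); // http://api:3000
```

Passing the environment in as a parameter, rather than reading `process.env` directly, keeps the sketch easy to exercise against each of the dev, e2e, and legacy presets.
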

### Production (.env.production)

**IMPORTANT**: Update these before deploying:
- Database credentials (`POSTGRES_PASSWORD`, `DATABASE_URL`)
- Website/API URLs (`NEXT_PUBLIC_API_BASE_URL`, `NEXT_PUBLIC_SITE_URL`)
- Vercel KV credentials (`KV_REST_API_URL`, `KV_REST_API_TOKEN`) (required for production email signups and rate limiting)

## Architecture

### Development Setup

- Hot-reloading enabled via volume mounts
- Source code changes reflect immediately
- Database persisted in named volume
- Debug port exposed for API (9229)

### Production Setup

- Multi-stage builds for optimized images
- Only production dependencies included
- Nginx reverse proxy for both services
- Health checks for all services
- Auto-restart on failure

## Docker Services

### API (NestJS)

- Dev: `apps/api/Dockerfile.dev`
- Prod: `apps/api/Dockerfile.prod`
- Port: 3000 (container; published on host port 3001 in dev)
- Debug: 9229 (dev only)

### Website (Next.js)

- Dev: `apps/website/Dockerfile.dev`
- Prod: `apps/website/Dockerfile.prod`
- Port: 3000 (dev and prod)

### Database (PostgreSQL)

- Image: postgres:15-alpine
- Port: 5432 (internal)
- Data: Persisted in Docker volume
- Optimized with performance tuning parameters

### Redis (Production only)

- Image: redis:7-alpine
- Port: 6379 (internal)
- Configured with:
  - LRU eviction policy
  - 512MB max memory
  - AOF persistence
  - Password protection

### Nginx (Production only)

- Reverse proxy for website + API
- Features:
  - Rate limiting (API: 10r/s, General: 30r/s)
  - Security headers (XSS, CSP, Frame-Options)
  - Gzip compression
  - Static asset caching
  - Connection pooling
  - Request buffering
- Port: 80, 443

## Troubleshooting

### Services won't start

```bash
# Clean everything and rebuild
npm run docker:dev:clean
npm run docker:dev:build
```

### Hot-reloading not working

Check that the volume mounts are correct in `docker-compose.dev.yml`.

### Database connection issues

Ensure `DATABASE_URL` in `.env` matches the database service configuration.

### Check logs

```bash
# All services
npm run docker:dev:logs

# Specific service
docker-compose -f docker-compose.dev.yml logs -f api
docker-compose -f docker-compose.dev.yml logs -f website
docker-compose -f docker-compose.dev.yml logs -f db
```

### Database Migration for Media References

If you have existing seeded data with old URL formats (e.g., `/api/avatar/{id}`, `/api/media/teams/{id}/logo`), you need to migrate to the new `MediaReference` format.

#### Option 1: Migration Script (Preserve Data)

Run the migration script to convert old URLs to proper `MediaReference` objects:

```bash
# Test mode (dry run - shows what would change)
npm run migrate:media:test

# Execute migration (applies changes)
npm run migrate:media:exec
```

The script handles:
- **Driver avatars**: `/api/avatar/{id}` → `system-default` (deterministic variant)
- **Team logos**: `/api/media/teams/{id}/logo` → `generated`
- **League logos**: `/api/media/leagues/{id}/logo` → `generated`
- **Unknown formats** → `none`

#### Option 2: Wipe and Reseed (Clean Slate)

For development environments, you can wipe all data and start fresh:

```bash
# Stop services and remove volumes
npm run docker:dev:clean

# Rebuild and start fresh
npm run docker:dev:build
```

This will:
- Delete all existing data
- Run fresh seed with correct `MediaReference` format
- No migration needed

#### When to Use Each Option

**Use Migration Script** when:
- You have production data you want to preserve
- You want to understand what changes will be made
- You need a controlled, reversible process

**Use Wipe and Reseed** when:
- You're in development/testing
- You don't care about existing data
- You want the fastest path to a clean state

## Tips

1. **First time setup**: Use `docker:dev:build` to ensure images are built
2. **Clean slate**: Use `docker:dev:clean` to remove all data and start fresh
3. **Production testing**: Test prod setup locally before deploying
4. **Database access**: Use any PostgreSQL client with credentials from the `.env` file
5. **Debugging**: Attach debugger to port 9229 for API debugging

## Production Deployment

Before deploying to production:

1. Update `.env.production` with real credentials
2. Configure SSL certificates in `nginx/ssl/`
3. Update Nginx configuration for HTTPS
4. Set proper domain names in environment variables
5. Consider using Docker secrets for sensitive data

## File Structure

```
.
├── docker-compose.dev.yml          # Development orchestration
├── docker-compose.prod.yml         # Production orchestration
├── docker-compose.e2e.yml          # E2E testing orchestration (NEW)
├── docker-compose.test.yml         # Legacy test orchestration (deprecated)
├── .env.development                # Dev environment variables
├── .env.production                 # Prod environment variables
├── .env.test.example               # Test env template
├── apps/
│   ├── api/
│   │   ├── Dockerfile.dev          # API dev image
│   │   ├── Dockerfile.prod         # API prod image
│   │   └── .dockerignore
│   └── website/
│       ├── Dockerfile.dev          # Website dev image
│       ├── Dockerfile.prod         # Website prod image
│       ├── Dockerfile.e2e          # E2E optimized image (NEW)
│       └── .dockerignore
├── playwright.website.config.ts    # E2E test config (updated)
├── playwright.website-integration.config.ts
├── playwright.smoke.config.ts
├── package.json                    # Updated scripts (NEW commands)
└── nginx/
    └── nginx.conf                  # Nginx configuration
```

**Key Changes for E2E Testing:**
- `docker-compose.e2e.yml` - Unified test environment
- `apps/website/Dockerfile.e2e` - SWC-optimized Next.js image
- Updated `playwright.website.config.ts` - Containerized setup
- New npm scripts in `package.json`
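
As a complement to the Production Deployment checklist above, a small pre-deploy check can catch values in `.env.production` that were never filled in. This is a hedged sketch rather than part of the repository: `REQUIRED_PROD_VARS` and `findMissingProdVars` are hypothetical names, and treating every variable listed under "Production (.env.production)" as mandatory is an assumption.

```typescript
// Illustrative sketch only. The variable names come from the
// ".env.production" section of this document; everything else is hypothetical.
const REQUIRED_PROD_VARS: readonly string[] = [
  "POSTGRES_PASSWORD",
  "DATABASE_URL",
  "NEXT_PUBLIC_API_BASE_URL",
  "NEXT_PUBLIC_SITE_URL",
  "KV_REST_API_URL",
  "KV_REST_API_TOKEN",
];

// Returns the names of required variables that are unset, empty,
// or whitespace-only, in the order they are listed above.
function findMissingProdVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_PROD_VARS.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a Node.js script this could be called as `findMissingProdVars(process.env)` once `.env.production` has been loaded, exiting with a non-zero status if the returned list is not empty.
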