gridpilot.gg/README.docker.md
2026-01-17 01:04:36 +01:00


# Docker Setup for GridPilot
This document describes the Docker setup for local development and production deployment of GridPilot.
## Quick Start
### Development
Start all services with hot-reloading:
```bash
npm run docker:dev:build
```
This will:
- Start PostgreSQL database on port 5432
- Start API on port 3001 (container port 3000, debugger 9229)
- Start Website on port 3000
- Enable hot-reloading for both apps
Access:
- Website: http://localhost:3000
- API: http://localhost:3001
- Database: localhost:5432
### Production
Start all services in production mode:
```bash
npm run docker:prod:build
```
This will:
- Build optimized Docker images
- Start PostgreSQL, Redis, API, Website, and Nginx
- Enable health checks, auto-restart, and resource limits
- Configure caching and performance optimizations
Access:
- Nginx (Website + API): http://localhost:80
## Available Commands
### Development
- `npm run docker:dev` - Start dev environment (alias of `docker:dev:up`)
- `npm run docker:dev:up` - Start dev environment
- `npm run docker:dev:postgres` - Start dev environment with `GRIDPILOT_API_PERSISTENCE=postgres`
- `npm run docker:dev:inmemory` - Start dev environment with `GRIDPILOT_API_PERSISTENCE=inmemory`
- `npm run docker:dev:build` - Rebuild and start
- `npm run docker:dev:restart` - Restart services
- `npm run docker:dev:ps` - Show service status
- `npm run docker:dev:down` - Stop services
- `npm run docker:dev:logs` - View logs
- `npm run docker:dev:clean` - Stop and remove volumes
### Production
- `npm run docker:prod` - Start prod environment
- `npm run docker:prod:build` - Rebuild and start
- `npm run docker:prod:down` - Stop services
- `npm run docker:prod:logs` - View logs
- `npm run docker:prod:clean` - Stop and remove volumes
### Testing (Docker)
#### Available Commands
**Unified E2E Testing (Recommended):**
- `npm run test:e2e:website` - Run complete e2e test suite
- `npm run docker:e2e:up` - Start all e2e services
- `npm run docker:e2e:down` - Stop e2e services
- `npm run docker:e2e:logs` - View e2e service logs
- `npm run docker:e2e:ps` - Check e2e service status
- `npm run docker:e2e:clean` - Clean e2e environment
**Legacy Testing (Deprecated):**
- `npm run test:docker:website` - Run legacy hybrid tests
- `npm run docker:test:up` - Start legacy API/DB
- `npm run docker:test:down` - Stop legacy services
- `npm run docker:test:clean` - Clean legacy environment
#### Quick Comparison
| Feature | Legacy (Hybrid) | Unified (E2E) |
|---------|-----------------|---------------|
| Website | Local (Playwright webServer) | Docker container |
| API | Docker container | Docker container |
| Database | Docker container | Docker container |
| Playwright | Local | Docker container |
| SWC Issues | ❌ Yes | ✅ No |
| CI Compatible | ❌ No | ✅ Yes |
| Single Command | ❌ No | ✅ Yes |
| Port Conflicts | ❌ Possible | ✅ No |
#### Unified E2E Test Environment (Recommended)
The new unified e2e test environment runs **everything in Docker** - website, API, database, and Playwright tests. This eliminates the hybrid approach and solves Next.js SWC compilation issues.
**Quick Start:**
```bash
# Run complete e2e test suite
npm run test:e2e:website
# Run specific test file (fast, no rebuild)
npm run test:e2e:run -- tests/e2e/website/website-pages.e2e.test.ts
# Or step-by-step:
npm run docker:e2e:up # Start all services (fast, uses cache)
npm run docker:e2e:build # Force rebuild website image
npm run docker:e2e:logs # View logs
npm run docker:e2e:down # Stop services
npm run docker:e2e:clean # Clean everything
```
**What this does:**
- Builds optimized website image with all SWC dependencies (cached unless source changes)
- Starts PostgreSQL database (port 5434)
- Starts API server (port 3101)
- Starts website server (port 3100)
- Runs Playwright tests in container
- All services communicate via isolated Docker network
**Architecture:**
```
┌──────────────────────────────────────────────┐
│  Docker Network: gridpilot-e2e-network       │
│                                              │
│  ┌────────────┐   ┌───────────┐   ┌────────┐ │
│  │ Playwright │ → │  Website  │ → │  API   │ │
│  │   Runner   │   │ (Next.js) │   │(NestJS)│ │
│  └────────────┘   └───────────┘   └────────┘ │
│        ↓                ↓             ↓      │
│        └────────────────┴─────────────┘      │
│                         ↓                    │
│                   PostgreSQL DB              │
└──────────────────────────────────────────────┘
```
**Benefits:**
- ✅ **Fully containerized** - identical to CI environment
- ✅ **No SWC issues** - optimized Dockerfile with build tools
- ✅ **No port conflicts** - isolated network and unique ports
- ✅ **Single command** - one script runs everything
- ✅ **Deterministic** - no local dependencies
#### Legacy Testing (Deprecated)
The old hybrid approach (API/DB in Docker, website locally) is still available but deprecated:
- `npm run test:docker:website` - Start API/DB in Docker, run website locally via Playwright
- Uses [`docker-compose.test.yml`](docker-compose.test.yml:1)
- **Note**: This approach has SWC compilation issues and won't work in CI
**Supporting scripts (legacy):**
- `npm run docker:test:deps` - Verify monorepo dependencies
- `npm run docker:test:up` - Start API and PostgreSQL
- `npm run docker:test:wait` - Wait for API health
- `npm run docker:test:down` - Stop containers
**Recommendation**: Use the unified e2e environment above instead.
## Environment Variables
### "Mock vs Real" (Website & API)
There is **no** `AUTOMATION_MODE` equivalent for the Website/API runtime.
- **Website "mock vs real"** is controlled purely by *which API base URL you point it at* via [`getWebsiteApiBaseUrl()`](apps/website/lib/config/apiBaseUrl.ts:6):
  - Browser calls use `NEXT_PUBLIC_API_BASE_URL`
  - Server/Next.js calls use `API_BASE_URL ?? NEXT_PUBLIC_API_BASE_URL`
- **API "mock vs real"** is controlled by API runtime env:
  - Persistence: `GRIDPILOT_API_PERSISTENCE=postgres|inmemory` in [`AppModule`](apps/api/src/app.module.ts:25)
  - Optional bootstrapping: `GRIDPILOT_API_BOOTSTRAP=0|1` in [`AppModule`](apps/api/src/app.module.ts:35)

Practical presets:
- **Website + real API (Docker dev)**: `npm run docker:dev:build` (Website `3000`, API `3001`, Postgres required).
  - Website browser → API: `NEXT_PUBLIC_API_BASE_URL=http://localhost:3001`
  - Website container → API container: `API_BASE_URL=http://api:3000`
- **Website + mock API (Docker smoke)**: `npm run test:docker:website` (Website `3100`, API mock `3101`).
  - API mock is defined inline in [`docker-compose.test.yml`](docker-compose.test.yml:24)
  - Website browser → API mock: `NEXT_PUBLIC_API_BASE_URL=http://localhost:3101`
  - Website container → API mock container: `API_BASE_URL=http://api:3000`
### Website ↔ API Connection
The website talks to the API via `fetch()` in [`BaseApiClient`](apps/website/lib/api/base/BaseApiClient.ts:11), and it always includes cookies (`credentials: 'include'`). That means:
- The **browser** must be pointed at a host-accessible API URL via `NEXT_PUBLIC_API_BASE_URL`
- The **server** (Next.js / Node) must be pointed at a container-network API URL via `API_BASE_URL` (when running in Docker)
The single source of truth for "what base URL should I use?" is [`getWebsiteApiBaseUrl()`](apps/website/lib/config/apiBaseUrl.ts:6):
- Browser: reads `NEXT_PUBLIC_API_BASE_URL`
- Server: reads `API_BASE_URL ?? NEXT_PUBLIC_API_BASE_URL`
- In Docker/CI/test: throws if missing (no silent localhost fallback)
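
The resolution rules above can be sketched as a small pure function. This is an illustration only, not the code in `apps/website/lib/config/apiBaseUrl.ts`: the name `resolveApiBaseUrl`, the `EnvVars` type, and the explicit `isBrowser` flag are assumptions made to keep the example self-contained. For simplicity it always throws on a missing value, whereas the real helper is described as throwing in Docker/CI/test.

```typescript
// Hypothetical sketch of the lookup order described above.
type EnvVars = Record<string, string | undefined>;

function resolveApiBaseUrl(env: EnvVars, isBrowser: boolean): string {
  // Browser code can only see NEXT_PUBLIC_* variables;
  // server code prefers API_BASE_URL and falls back to the public one.
  const url = isBrowser
    ? env.NEXT_PUBLIC_API_BASE_URL
    : env.API_BASE_URL ?? env.NEXT_PUBLIC_API_BASE_URL;

  if (!url) {
    // Fail loudly instead of silently falling back to localhost.
    throw new Error(
      "API base URL is not configured: set NEXT_PUBLIC_API_BASE_URL (and API_BASE_URL in Docker)"
    );
  }
  return url;
}

// Values taken from the dev Docker defaults in this document.
const devEnv: EnvVars = {
  NEXT_PUBLIC_API_BASE_URL: "http://localhost:3001",
  API_BASE_URL: "http://api:3000",
};
const browserUrl = resolveApiBaseUrl(devEnv, true);  // host-accessible URL
const serverUrl = resolveApiBaseUrl(devEnv, false);  // container-network URL
```

Passing the environment in as a parameter, rather than reading a global, keeps the sketch easy to test; the real helper may well read the process environment directly.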
#### Dev Docker defaults (docker-compose.dev.yml)
- Website: `http://localhost:3000`
- API: `http://localhost:3001` (maps to container `api:3000`)
- `NEXT_PUBLIC_API_BASE_URL=http://localhost:3001` (browser → host port)
- `API_BASE_URL=http://api:3000` (website container → api container)
#### E2E Docker defaults (docker-compose.e2e.yml)
This stack runs **everything in Docker** for fully containerized e2e testing:
- Website: `http://website:3000` (containerized Next.js, exposed as `localhost:3100`)
- API: `http://api:3000` (containerized NestJS, exposed as `localhost:3101`)
- PostgreSQL: `db:5432` (containerized, exposed as `localhost:5434`)
- Playwright: Runs in container, connects via Docker network
- `NEXT_PUBLIC_API_BASE_URL=http://api:3000` (browser → container)
- `API_BASE_URL=http://api:3000` (website → API container)
**Key differences from legacy approach**:
- ✅ Website runs in Docker (no SWC issues)
- ✅ Playwright runs in Docker (identical to CI)
- ✅ All services on isolated network
- ✅ No port conflicts with local dev
- ✅ Single command execution
**Accessing services during development**:
- Website: `http://localhost:3100`
- API: `http://localhost:3101`
- Database: `localhost:5434`
#### Test Docker defaults (docker-compose.test.yml) - Legacy
**Deprecated**: Use `docker-compose.e2e.yml` instead.
This stack is intended for deterministic smoke tests and uses different host ports to avoid colliding with `docker:dev`:
- Website: `http://localhost:3000` (started by Playwright webServer, not Docker)
- API: `http://localhost:3101` (maps to container `api:3000`)
- PostgreSQL: `localhost:5433` (maps to container `5432`)
- `NEXT_PUBLIC_API_BASE_URL=http://localhost:3101` (browser → host port)
- `API_BASE_URL=http://localhost:3101` (Playwright webServer → host port)
**Important**:
- The website runs locally via Playwright's `webServer` config to avoid Next.js SWC compilation issues in Docker.
- The API is a real TypeORM/PostgreSQL server (not a mock) for testing actual database interactions.
- Playwright automatically starts the website server before running tests.
#### Troubleshooting (E2E)
**Common Issues:**
- **"Website not building"**: Ensure Docker has enough memory (4GB+). SWC compilation is memory-intensive.
- **"Port already in use"**: Use `npm run docker:e2e:down` to stop conflicting services.
- **"Module not found"**: Run `npm run docker:e2e:clean` to rebuild from scratch.
- **"Database connection failed"**: Wait for health checks. Use `npm run docker:e2e:logs` to check status.
- **"Playwright timeout"**: Increase timeout in `playwright.website.config.ts` if needed.
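
For the timeout case, Playwright has a per-test `timeout` and a separate `expect.timeout` for assertions. The fragment below is a hedged sketch of the relevant part of a config object. The values are illustrative, and the real `playwright.website.config.ts` contains other settings that are not shown here.

```typescript
// Illustrative values only; tune them to your machine and CI runner.
const config = {
  // Maximum time for a single test, in milliseconds (Playwright's default is 30s).
  timeout: 60_000,
  expect: {
    // Maximum time for each expect() assertion to pass (Playwright's default is 5s).
    timeout: 10_000,
  },
};

// In an actual Playwright config file this object would be the default export,
// typically wrapped in defineConfig() from "@playwright/test".
```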
**Debug Commands:**
```bash
# View all service logs
npm run docker:e2e:logs
# Check service status
npm run docker:e2e:ps
# Clean everything and restart
npm run docker:e2e:clean && npm run docker:e2e:up
# Run specific service logs
docker-compose -f docker-compose.e2e.yml logs -f website
docker-compose -f docker-compose.e2e.yml logs -f api
docker-compose -f docker-compose.e2e.yml logs -f db
```
**Migration from Legacy to Unified:**
If you were using the old `test:docker:website` approach:
1. **Stop old services**: `npm run docker:test:down`
2. **Clean up**: `npm run docker:test:clean`
3. **Use new approach**: `npm run test:e2e:website`
The new approach is:
- ✅ More reliable (no SWC issues)
- ✅ Faster (no local server startup)
- ✅ CI-compatible (identical environment)
- ✅ Simpler (single command)
#### Troubleshooting (Legacy - Deprecated)
- **Port conflicts**: If `docker:dev` is running, use `npm run docker:dev:down` before `npm run test:docker:website` to avoid port conflicts (dev uses 3001, test uses 3101).
- **Website not starting**: Playwright's webServer may fail if dependencies are missing. Run `npm install` first.
- **Cookie errors**: The `WebsiteAuthManager` requires both `url` and `path` properties for cookies. Check Playwright version compatibility.
- **Docker volumes stuck**: Run `npm run docker:test:down` (uses `--remove-orphans` + `rm -f`).
- **SWC compilation issues**: If website fails to start in Docker, use the local webServer approach (already configured in `playwright.website.config.ts`).
### API "Real vs In-Memory" Mode
The API can run in one of two persistence modes:
- **postgres**: loads [`DatabaseModule`](apps/api/src/domain/database/DatabaseModule.ts:1) (requires Postgres)
- **inmemory**: does not load `DatabaseModule` (no Postgres required)
Control it with:
- `GRIDPILOT_API_PERSISTENCE=postgres|inmemory` (defaults to `postgres` if `DATABASE_URL` is set, otherwise `inmemory`)
- Optional: `GRIDPILOT_API_BOOTSTRAP=0` to skip [`BootstrapModule`](apps/api/src/domain/bootstrap/BootstrapModule.ts:1)
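
The defaulting rule can be read as the following sketch. It is not the code in `AppModule`: the function names, the `ApiEnv` type, and the choice to throw on an unrecognized value are assumptions made for illustration.

```typescript
// Hypothetical sketch of the persistence and bootstrap switches described above.
type ApiEnv = Record<string, string | undefined>;
type Persistence = "postgres" | "inmemory";

function resolvePersistence(env: ApiEnv): Persistence {
  const explicit = env.GRIDPILOT_API_PERSISTENCE;
  if (explicit === "postgres" || explicit === "inmemory") {
    return explicit; // an explicit setting always wins
  }
  if (explicit !== undefined && explicit !== "") {
    // Assumption: reject typos rather than silently picking a default.
    throw new Error(`Unsupported GRIDPILOT_API_PERSISTENCE: ${explicit}`);
  }
  // Default: postgres when a DATABASE_URL is present, otherwise in-memory.
  return env.DATABASE_URL ? "postgres" : "inmemory";
}

function shouldBootstrap(env: ApiEnv): boolean {
  // Bootstrapping is skipped only when the flag is exactly "0".
  return env.GRIDPILOT_API_BOOTSTRAP !== "0";
}

const withDb = resolvePersistence({ DATABASE_URL: "postgres://example" });
const withoutDb = resolvePersistence({});
```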
### Development (.env.development)
Copy and customize as needed. Default values work out of the box.
### Production (.env.production)
**IMPORTANT**: Update these before deploying:
- Database credentials (`POSTGRES_PASSWORD`, `DATABASE_URL`)
- Website/API URLs (`NEXT_PUBLIC_API_BASE_URL`, `NEXT_PUBLIC_SITE_URL`)
- Vercel KV credentials (`KV_REST_API_URL`, `KV_REST_API_TOKEN`) (required for production email signups/rate limit)
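
A pre-deploy check along these lines can catch a forgotten value early. This is a minimal sketch: `missingProductionVars` is a hypothetical helper that does not exist in the repo, and the variable list is copied from the bullets above.

```typescript
// Variables this document says must be set before a production deploy.
const REQUIRED_PRODUCTION_VARS = [
  "POSTGRES_PASSWORD",
  "DATABASE_URL",
  "NEXT_PUBLIC_API_BASE_URL",
  "NEXT_PUBLIC_SITE_URL",
  "KV_REST_API_URL",
  "KV_REST_API_TOKEN",
] as const;

// Returns the names of required variables that are unset or blank.
function missingProductionVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_PRODUCTION_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example: everything set except the KV token.
const missing = missingProductionVars({
  POSTGRES_PASSWORD: "change-me",
  DATABASE_URL: "postgres://example",
  NEXT_PUBLIC_API_BASE_URL: "https://api.example.com",
  NEXT_PUBLIC_SITE_URL: "https://example.com",
  KV_REST_API_URL: "https://kv.example.com",
});
```

The check only confirms that values are present; it cannot tell whether a password is still a placeholder.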
## Architecture
### Development Setup
- Hot-reloading enabled via volume mounts
- Source code changes reflect immediately
- Database persisted in named volume
- Debug port exposed for API (9229)
### Production Setup
- Multi-stage builds for optimized images
- Only production dependencies included
- Nginx reverse proxy for both services
- Health checks for all services
- Auto-restart on failure
## Docker Services
### API (NestJS)
- Dev: `apps/api/Dockerfile.dev`
- Prod: `apps/api/Dockerfile.prod`
- Port: 3000 (container); published on host port 3001 in dev
- Debug: 9229 (dev only)
### Website (Next.js)
- Dev: `apps/website/Dockerfile.dev`
- Prod: `apps/website/Dockerfile.prod`
- Port: 3000 (dev and prod)
### Database (PostgreSQL)
- Image: postgres:15-alpine
- Port: 5432 (internal)
- Data: Persisted in Docker volume
- Optimized with performance tuning parameters
### Redis (Production only)
- Image: redis:7-alpine
- Port: 6379 (internal)
- Configured with:
  - LRU eviction policy
  - 512MB max memory
  - AOF persistence
  - Password protection
### Nginx (Production only)
- Reverse proxy for website + API
- Features:
  - Rate limiting (API: 10r/s, General: 30r/s)
  - Security headers (XSS, CSP, Frame-Options)
  - Gzip compression
  - Static asset caching
  - Connection pooling
  - Request buffering
- Port: 80, 443
## Troubleshooting
### Services won't start
```bash
# Clean everything and rebuild
npm run docker:dev:clean
npm run docker:dev:build
```
### Hot-reloading not working
Check that the volume mounts in `docker-compose.dev.yml` point at the correct source directories.
### Database connection issues
Ensure `DATABASE_URL` in your `.env` file matches the database service configuration (host, port, user, password, and database name).
### Check logs
```bash
# All services
npm run docker:dev:logs
# Specific service
docker-compose -f docker-compose.dev.yml logs -f api
docker-compose -f docker-compose.dev.yml logs -f website
docker-compose -f docker-compose.dev.yml logs -f db
```
### Database Migration for Media References
If you have existing seeded data with old URL formats (e.g., `/api/avatar/{id}`, `/api/media/teams/{id}/logo`), you need to migrate to the new `MediaReference` format.
#### Option 1: Migration Script (Preserve Data)
Run the migration script to convert old URLs to proper `MediaReference` objects:
```bash
# Test mode (dry run - shows what would change)
npm run migrate:media:test
# Execute migration (applies changes)
npm run migrate:media:exec
```
The script handles:
- **Driver avatars**: `/api/avatar/{id}` → `system-default` (deterministic variant)
- **Team logos**: `/api/media/teams/{id}/logo` → `generated`
- **League logos**: `/api/media/leagues/{id}/logo` → `generated`
- **Unknown formats** → `none`
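
As a rough illustration of that mapping, the sketch below classifies a legacy URL into one of the target reference types. It is not the migration script itself: `classifyLegacyMediaUrl` and `MediaReferenceKind` are hypothetical names, the real `MediaReference` object carries more than a type string, and how the "deterministic variant" is chosen is not covered here.

```typescript
// Hypothetical sketch: maps a legacy media URL to the reference type listed above.
type MediaReferenceKind = "system-default" | "generated" | "none";

function classifyLegacyMediaUrl(url: string): MediaReferenceKind {
  // Driver avatars: /api/avatar/{id}
  if (/^\/api\/avatar\/[^/]+$/.test(url)) {
    return "system-default";
  }
  // Team and league logos: /api/media/teams/{id}/logo, /api/media/leagues/{id}/logo
  if (/^\/api\/media\/(teams|leagues)\/[^/]+\/logo$/.test(url)) {
    return "generated";
  }
  // Anything else is treated as an unknown format.
  return "none";
}

const avatarKind = classifyLegacyMediaUrl("/api/avatar/driver-1");
const teamLogoKind = classifyLegacyMediaUrl("/api/media/teams/team-1/logo");
const unknownKind = classifyLegacyMediaUrl("https://example.com/logo.png");
```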
#### Option 2: Wipe and Reseed (Clean Slate)
For development environments, you can wipe all data and start fresh:
```bash
# Stop services and remove volumes
npm run docker:dev:clean
# Rebuild and start fresh
npm run docker:dev:build
```
This will:
- Delete all existing data
- Run fresh seed with correct `MediaReference` format
- No migration needed
#### When to Use Each Option
**Use Migration Script** when:
- You have production data you want to preserve
- You want to understand what changes will be made
- You need a controlled, reversible process
**Use Wipe and Reseed** when:
- You're in development/testing
- You don't care about existing data
- You want the fastest path to a clean state
## Tips
1. **First time setup**: Use `docker:dev:build` to ensure images are built
2. **Clean slate**: Use `docker:dev:clean` to remove all data and start fresh
3. **Production testing**: Test prod setup locally before deploying
4. **Database access**: Use any PostgreSQL client with credentials from .env file
5. **Debugging**: Attach debugger to port 9229 for API debugging
## Production Deployment
Before deploying to production:
1. Update `.env.production` with real credentials
2. Configure SSL certificates in `nginx/ssl/`
3. Update Nginx configuration for HTTPS
4. Set proper domain names in environment variables
5. Consider using Docker secrets for sensitive data
## File Structure
```
.
├── docker-compose.dev.yml          # Development orchestration
├── docker-compose.prod.yml         # Production orchestration
├── docker-compose.e2e.yml          # E2E testing orchestration (NEW)
├── docker-compose.test.yml         # Legacy test orchestration (deprecated)
├── .env.development                # Dev environment variables
├── .env.production                 # Prod environment variables
├── .env.test.example               # Test env template
├── apps/
│   ├── api/
│   │   ├── Dockerfile.dev          # API dev image
│   │   ├── Dockerfile.prod         # API prod image
│   │   └── .dockerignore
│   └── website/
│       ├── Dockerfile.dev          # Website dev image
│       ├── Dockerfile.prod         # Website prod image
│       ├── Dockerfile.e2e          # E2E optimized image (NEW)
│       └── .dockerignore
├── playwright.website.config.ts    # E2E test config (updated)
├── playwright.website-integration.config.ts
├── playwright.smoke.config.ts
├── package.json                    # Updated scripts (NEW commands)
└── nginx/
    └── nginx.conf                  # Nginx configuration
```
**Key Changes for E2E Testing:**
- `docker-compose.e2e.yml` - Unified test environment
- `apps/website/Dockerfile.e2e` - SWC-optimized Next.js image
- Updated `playwright.website.config.ts` - Containerized setup
- New npm scripts in `package.json`