# Testing Strategy

## Overview

GridPilot employs a comprehensive BDD (Behavior-Driven Development) testing strategy across three distinct layers: **Unit**, **Integration**, and **End-to-End (E2E)**. Each layer validates different aspects of the system while maintaining a consistent Given/When/Then approach that emphasizes behavior over implementation.

This document provides practical guidance on testing philosophy, test organization, tooling, and execution patterns for GridPilot.

---

## BDD Philosophy

### Why BDD for GridPilot?

GridPilot manages complex business rules around league management, team registration, event scheduling, result processing, and standings calculation. These rules must be:

- **Understandable** by non-technical stakeholders (league admins, race organizers)
- **Verifiable** through automated tests that mirror real-world scenarios
- **Maintainable** as business requirements evolve

BDD provides a shared vocabulary (Given/When/Then) that bridges the gap between domain experts and developers, ensuring tests document expected behavior rather than technical implementation details.

### Given/When/Then Format

All tests, regardless of layer, follow this structure:

```typescript
// Given: Establish initial state/context
// When: Perform the action being tested
// Then: Assert the expected outcome
```

**Example (Unit Test):**

```typescript
describe('League Domain Entity', () => {
  it('should add a team when team limit not reached', () => {
    // Given
    const league = new League('Summer Series', { maxTeams: 10 });
    const team = new Team('Racing Legends');

    // When
    const result = league.addTeam(team);

    // Then
    expect(result.isSuccess()).toBe(true);
    expect(league.teams).toContain(team);
  });
});
```

This pattern applies equally to integration tests (with real database operations) and E2E tests (with full UI workflows).

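To make the unit example above easier to follow, here is a minimal sketch of the domain pieces it relies on. Everything in it is an assumption inferred from the examples in this document rather than GridPilot's actual code: the `Result` helper, the constructor shapes of `League` and `Team`, and the "Team limit reached" message are all hypothetical. The point is only to show that the Given/When/Then steps exercise a domain invariant directly, with no mocks and no infrastructure.

```typescript
// Minimal sketch under stated assumptions; not GridPilot's real classes.

// Hypothetical Result type exposing the isSuccess()/isFailure() calls used in the tests.
class Result<T> {
  private constructor(
    private readonly ok: boolean,
    readonly value?: T,
    readonly error?: Error,
  ) {}

  static success<T>(value: T): Result<T> {
    return new Result<T>(true, value);
  }

  static failure<T>(message: string): Result<T> {
    return new Result<T>(false, undefined, new Error(message));
  }

  isSuccess(): boolean {
    return this.ok;
  }

  isFailure(): boolean {
    return !this.ok;
  }
}

// Hypothetical Team entity: only the name is modeled here.
class Team {
  constructor(readonly name: string) {}
}

// Hypothetical League entity with a single invariant.
class League {
  readonly teams: Team[] = [];

  constructor(
    readonly name: string,
    private readonly options: { maxTeams: number },
  ) {}

  // Domain invariant: a league never holds more than maxTeams teams.
  addTeam(team: Team): Result<Team> {
    if (this.teams.length >= this.options.maxTeams) {
      return Result.failure<Team>('Team limit reached');
    }
    this.teams.push(team);
    return Result.success(team);
  }
}

// Given: a league with room for exactly one team
const league = new League('Summer Series', { maxTeams: 1 });

// When: two teams try to join
const first = league.addTeam(new Team('Racing Legends'));
const second = league.addTeam(new Team('Late Entry'));

// Then: the first succeeds, the second is rejected
console.log(first.isSuccess(), second.isFailure(), league.teams.length); // true true 1
```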

---

## Test Types & Organization

### Unit Tests (`/tests/unit`)

**Scope:** Domain entities, value objects, and application use cases with mocked ports (repositories, external services).

**Tooling:** Vitest (fast, TypeScript-native, ESM support)

**Execution:** Parallel, target <1 second total runtime

**Purpose:**
- Validate business logic in isolation
- Ensure domain invariants hold (e.g., team limits, scoring rules)
- Test use case orchestration with mocked dependencies

**Examples from Architecture:**

1. **Domain Entity Test:**
   ```gherkin
   # League.addTeam() validation
   Given a League with maxTeams=10 and 9 current teams
   When addTeam() is called with a valid Team
   Then the team is added successfully

   Given a League with maxTeams=10 and 10 current teams
   When addTeam() is called
   Then a DomainError is returned with "Team limit reached"
   ```

2. **Use Case Test:**
   ```gherkin
   # GenerateStandingsUseCase
   Given a League with 5 teams and completed races
   When execute() is called
   Then LeagueRepository.findById() is invoked
   And ScoringRule.calculatePoints() is called for each team
   And sorted standings are returned
   ```

3. **Scoring Rule Test:**
   ```gherkin
   # ScoringRule.calculatePoints()
   Given an F1-style scoring rule (25-18-15-12-10-8-6-4-2-1)
   When calculatePoints(position=1) is called
   Then 25 points are returned

   Given the same rule
   When calculatePoints(position=11) is called
   Then 0 points are returned
   ```

**Key Practices:**
- Mock only at architecture boundaries (ports like `ILeagueRepository`)
- Never mock domain entities or value objects
- Keep tests fast (<10ms per test)
- Use in-memory test doubles for simple cases

---

### Integration Tests (`/tests/integration`)

**Scope:** Repository implementations, infrastructure adapters (PostgreSQL, Redis, OAuth clients, result importers).


**Tooling:** Vitest + Testcontainers (spins up real PostgreSQL/Redis in Docker)

**Execution:** Sequential, ~10 seconds per suite

**Purpose:**
- Validate that infrastructure adapters correctly implement port interfaces
- Test database queries, migrations, and transaction handling
- Ensure external API clients handle authentication and error scenarios

**Examples from Architecture:**

1. **Repository Test:**
   ```gherkin
   # PostgresLeagueRepository
   Given a PostgreSQL container is running
   When save() is called with a League entity
   Then the league is persisted to the database
   And findById() returns the same league with correct attributes
   ```

2. **OAuth Client Test:**
   ```gherkin
   # IRacingOAuthClient
   Given valid iRacing credentials
   When authenticate() is called
   Then an access token is returned
   And the token is cached in Redis for 1 hour

   Given expired credentials
   When authenticate() is called
   Then an AuthenticationError is thrown
   ```

3. **Result Importer Test:**
   ```gherkin
   # EventResultImporter
   Given an Event exists in the database
   When importResults() is called with iRacing session data
   Then Driver entities are created/updated
   And EventResult entities are persisted with correct positions/times
   And the Event status is updated to 'COMPLETED'
   ```

**Key Practices:**
- Use Testcontainers to spin up real databases (not mocks)
- Clean database state between tests (truncate tables or use transactions)
- Seed minimal test data via SQL fixtures
- Test both success and failure paths (network errors, constraint violations)

---

### End-to-End Tests (`/tests/e2e`)

**Scope:** Full user workflows spanning web-client → web-api → database.

**Tooling:** Playwright + Docker Compose (orchestrates all services)

**Execution:** ~2 minutes per scenario

**Purpose:**
- Validate complete user journeys from UI interactions to database changes
- Ensure services integrate correctly in a production-like environment
- Catch regressions in multi-service workflows

**Examples from Architecture:**

1. **League Creation Workflow:**
   ```gherkin
   Given an authenticated league admin
   When they navigate to "Create League"
   And fill in league name, scoring system, and team limit
   And submit the form
   Then the league appears in the admin dashboard
   And the database contains the new league record
   And the league is visible to other users
   ```

2. **Team Registration Workflow:**
   ```gherkin
   Given a published league with 5/10 team slots filled
   When a team captain navigates to the league page
   And clicks "Join League"
   And fills in team name and roster
   And submits the form
   Then the team appears in the league's team list
   And the team count updates to 6/10
   And the captain receives a confirmation email
   ```

3. **Automated Result Import:**
   ```gherkin
   Given a League with an upcoming Event
   And iRacing OAuth credentials are configured
   When the scheduled import job runs
   Then the job authenticates with iRacing
   And fetches session results for the Event
   And creates EventResult records in the database
   And updates the Event status to 'COMPLETED'
   And triggers standings recalculation
   ```

4. **Companion App Login Automation:**
   ```gherkin
   Given a League Admin enables companion app login automation
   When the companion app is launched
   Then the app polls for a generated login token from web-api
   And auto-fills iRacing credentials from the admin's profile
   And logs into iRacing automatically
   And confirms successful login to web-api
   ```

**Key Practices:**
- Use the Page Object pattern with Playwright for reusable UI interactions
- Test both happy paths and error scenarios (validation errors, network failures)
- Clean database state between scenarios (via API or direct SQL)
- Run E2E tests in CI before merging to main branch

---

## Test Data Strategy

### Fixtures & Seeding

**Unit Tests:**
- Use in-memory domain objects (no database)
- Factory functions for common test entities:
  ```typescript
  // LeagueOptions is an assumed name for the League constructor's options type
  function createTestLeague(overrides?: Partial<LeagueOptions>): League {
    return new League('Test League', { maxTeams: 10, ...overrides });
  }
  ```

**Integration Tests:**
- Use Testcontainers to spin up fresh PostgreSQL instances
- Seed minimal test data via SQL scripts:
  ```sql
  -- tests/integration/fixtures/leagues.sql
  INSERT INTO leagues (id, name, max_teams)
  VALUES ('league-1', 'Test League', 10);
  ```
- Clean state between tests (truncate tables or rollback transactions)

**E2E Tests:**
- Pre-seed database via migrations before Docker Compose starts
- Use API endpoints to create test data when possible (validates API behavior)
- Database cleanup between scenarios:
  ```typescript
  // tests/e2e/support/database.ts
  export async function cleanDatabase() {
    await sql`TRUNCATE TABLE event_results CASCADE`;
    await sql`TRUNCATE TABLE events CASCADE`;
    await sql`TRUNCATE TABLE teams CASCADE`;
    await sql`TRUNCATE TABLE leagues CASCADE`;
  }
  ```

---

## Docker E2E Setup

### Architecture

E2E tests run against a full stack orchestrated by `docker-compose.test.yml`:

```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: gridpilot_test
      POSTGRES_USER: test
      POSTGRES_PASSWORD: test

  redis:
    image:
      redis:7-alpine

  web-api:
    build: ./src/apps/web-api
    depends_on:
      - postgres
      - redis
    environment:
      DATABASE_URL: postgres://test:test@postgres:5432/gridpilot_test
      REDIS_URL: redis://redis:6379
    ports:
      - "3000:3000"
```

### Execution Flow

1. **Start Services:** `docker compose -f docker-compose.test.yml up -d`
2. **Run Migrations:** `npm run migrate:test` (seeds database)
3. **Execute Tests:** Playwright targets `http://localhost:3000`
4. **Teardown:** `docker compose -f docker-compose.test.yml down -v`

### Environment Setup

```typescript
// tests/e2e/setup.ts
import { exec as execCallback } from 'node:child_process';
import { promisify } from 'node:util';

// child_process.exec is callback-based; promisify it so it can be awaited
const exec = promisify(execCallback);

// waitForService and runMigrations are project helpers (not shown here)
export async function globalSetup() {
  // Wait for web-api to be ready
  await waitForService('http://localhost:3000/health');

  // Run database migrations
  await runMigrations();
}

export async function globalTeardown() {
  // Stop Docker Compose services
  await exec('docker compose -f docker-compose.test.yml down -v');
}
```

---

## BDD Scenario Examples

### 1. League Creation (Success + Failure)

```gherkin
Scenario: Admin creates a new league
  Given an authenticated admin user
  When they submit a league form with:
    | name          | Summer Series 2024 |
    | maxTeams      | 12                 |
    | scoringSystem | F1                 |
  Then the league is created successfully
  And the admin is redirected to the league dashboard
  And the database contains the new league

Scenario: League creation fails with duplicate name
  Given a league named "Summer Series 2024" already exists
  When an admin submits a league form with name "Summer Series 2024"
  Then the form displays error "League name already exists"
  And no new league is created in the database
```

### 2. Team Registration (Success + Failure)

```gherkin
Scenario: Team registers for a league
  Given a published league with 5/10 team slots
  When a team captain submits registration with:
    | teamName | Racing Legends    |
    | drivers  | Alice, Bob, Carol |
  Then the team is added to the league
  And the team count updates to 6/10
  And the captain receives a confirmation email

Scenario: Registration fails when league is full
  Given a published league with 10/10 team slots
  When a team captain attempts to register
  Then the form displays error "League is full"
  And the team is not added to the league
```

### 3. Automated Result Import (Success + Failure)

```gherkin
Scenario: Import results from iRacing
  Given a League with an Event scheduled for today
  And iRacing OAuth credentials are configured
  When the scheduled import job runs
  Then the job authenticates with iRacing API
  And fetches session results for the Event
  And creates EventResult records for each driver
  And updates the Event status to 'COMPLETED'
  And triggers standings recalculation

Scenario: Import fails with invalid credentials
  Given an Event with expired iRacing credentials
  When the import job runs
  Then an AuthenticationError is logged
  And the Event status remains 'SCHEDULED'
  And an admin notification is sent
```

### 4. Parallel Scoring Calculation

```gherkin
Scenario: Calculate standings for multiple leagues concurrently
  Given 5 active leagues with completed events
  When the standings recalculation job runs
  Then each league's standings are calculated in parallel
  And the process completes in <5 seconds
  And all standings are persisted correctly
  And no race conditions occur (validated via database integrity checks)
```

### 5. Companion App Login Automation

```gherkin
Scenario: Companion app logs into iRacing automatically
  Given a League Admin enables companion app login automation
  And provides their iRacing credentials
  When the companion app is launched
  Then the app polls web-api for a login token
  And retrieves the admin's encrypted credentials
  And auto-fills the iRacing login form
  And submits the login request
  And confirms successful login to web-api
  And caches the session token for 24 hours
```

---

## Coverage Goals

### Target Coverage Levels

- **Domain/Application Layers:** >90% (critical business logic)
- **Infrastructure Layer:** >80% (repository implementations, adapters)
- **Presentation Layer:** Smoke tests (basic rendering, no exhaustive UI coverage)

### Running Coverage Reports

```bash
# Unit + Integration coverage
npm run test:coverage

# View HTML report
open coverage/index.html

# E2E coverage (via Istanbul)
npm run test:e2e:coverage
```

### What to Prioritize

1. **Domain Entities:** Invariants, validation rules, state transitions
2. **Use Cases:** Orchestration logic, error handling, port interactions
3. **Repositories:** CRUD operations, query builders, transaction handling
4. **Adapters:** External API clients, OAuth flows, result importers

**What NOT to prioritize:**
- Trivial getters/setters
- Framework boilerplate (Express route handlers)
- UI styling (covered by visual regression tests if needed)

---

## Continuous Testing

### Watch Mode (Development)

```bash
# Auto-run unit tests on file changes
npm run test:watch

# Auto-run integration tests (slower, but useful for DB work)
npm run test:integration:watch
```

### CI/CD Pipeline

```mermaid
graph LR
    A[Code Push] --> B[Unit Tests]
    B --> C[Integration Tests]
    C --> D[E2E Tests]
    D --> E[Deploy to Staging]
```

**Execution Order:**
1. **Unit Tests** (parallel, <1 second): fail fast on logic errors
2. **Integration Tests** (sequential, ~10 seconds): catch infrastructure issues
3. **E2E Tests** (sequential, ~2 minutes): validate full workflows
4. **Deploy**: only if all tests pass

**Parallelization:**
- Unit tests run in parallel (Vitest default)
- Integration tests run sequentially (avoid database conflicts)
- E2E tests run sequentially (UI interactions are stateful)

---

## Testing Best Practices

### 1. Test Behavior, Not Implementation

**❌ Bad (overfitted to implementation):**
```typescript
it('should call repository.save() once', async () => {
  const repo = mock<ILeagueRepository>();
  const useCase = new CreateLeagueUseCase(repo);

  await useCase.execute({ name: 'Test' });

  expect(repo.save).toHaveBeenCalledTimes(1);
});
```

**✅ Good (tests observable behavior):**
```typescript
it('should persist the league to the repository', async () => {
  const repo = new InMemoryLeagueRepository();
  const useCase = new CreateLeagueUseCase(repo);

  const result = await useCase.execute({ name: 'Test' });

  expect(result.isSuccess()).toBe(true);
  const league = await repo.findById(result.value.id);
  expect(league?.name).toBe('Test');
});
```

### 2. Mock Only at Architecture Boundaries

**Ports (interfaces)** should be mocked in use case tests:
```typescript
// mock<T>() is assumed to come from a helper library such as vitest-mock-extended;
// vi.fn() is Vitest's equivalent of jest.fn()
const mockRepo = mock<ILeagueRepository>({
  save: vi.fn().mockResolvedValue(undefined),
});
```

**Domain entities** should NEVER be mocked:
```typescript
// ❌ Don't do this
const mockLeague = mock<League>();

// ✅ Do this
const league = new League('Test League', { maxTeams: 10 });
```

### 3. Keep Tests Readable and Maintainable

**Arrange-Act-Assert Pattern:**
```typescript
it('should calculate standings correctly', () => {
  // Arrange: Set up test data
  const league = createTestLeague();
  const teams = [createTestTeam('Team A'), createTestTeam('Team B')];
  const results = [createTestResult(teams[0], { position: 1 })];

  // Act: Perform the action
  const standings = league.calculateStandings(results);

  // Assert: Verify the outcome
  expect(standings[0].team).toBe(teams[0]);
  expect(standings[0].points).toBe(25);
});
```

### 4. Test Error Scenarios

Don't just test the happy path:

```typescript
describe('League.addTeam()', () => {
  it('should add team successfully', () => { /* ... */ });

  it('should fail when team limit reached', () => {
    const league = createTestLeague({ maxTeams: 1 });
    league.addTeam(createTestTeam('Team A'));

    const result = league.addTeam(createTestTeam('Team B'));

    expect(result.isFailure()).toBe(true);
    expect(result.error.message).toBe('Team limit reached');
  });

  it('should fail when adding duplicate team', () => { /* ... */ });
});
```

---

## Common Patterns

### Setting Up Test Fixtures

**Factory Functions:**
```typescript
// tests/support/factories.ts
// LeagueOptions is an assumed name for the League constructor's options type
export function createTestLeague(overrides?: Partial<LeagueOptions>): League {
  return new League('Test League', {
    maxTeams: 10,
    scoringSystem: 'F1',
    ...overrides,
  });
}

export function createTestTeam(name: string): Team {
  return new Team(name, { drivers: ['Driver 1', 'Driver 2'] });
}
```

### Mocking Ports in Use Case Tests

```typescript
// tests/unit/application/CreateLeagueUseCase.test.ts
import { beforeEach, describe, expect, it, vi, type Mocked } from 'vitest';

describe('CreateLeagueUseCase', () => {
  let mockRepo: Mocked<ILeagueRepository>;
  let useCase: CreateLeagueUseCase;

  beforeEach(() => {
    mockRepo = {
      save: vi.fn().mockResolvedValue(undefined),
      findById: vi.fn().mockResolvedValue(null),
      findByName: vi.fn().mockResolvedValue(null),
    };
    useCase = new CreateLeagueUseCase(mockRepo);
  });

  it('should create a league when name is unique', async () => {
    const result = await useCase.execute({ name: 'New League' });

    expect(result.isSuccess()).toBe(true);
    expect(mockRepo.save).toHaveBeenCalledWith(
      expect.objectContaining({ name: 'New League' })
    );
  });
});
```

### Database Cleanup Strategies

**Integration Tests:**
```typescript
// tests/integration/setup.ts
import { beforeEach } from 'vitest';
import { sql } from './database';

export async function cleanDatabase() {
  await sql`TRUNCATE TABLE event_results CASCADE`;
  await sql`TRUNCATE TABLE events CASCADE`;
  await sql`TRUNCATE TABLE teams CASCADE`;
  await sql`TRUNCATE TABLE leagues CASCADE`;
}

beforeEach(async () =>
{
  await cleanDatabase();
});
```

**E2E Tests:**
```typescript
// tests/e2e/support/hooks.ts
import { test as base } from '@playwright/test';

export const test = base.extend({
  page: async ({ page }, use) => {
    // Clean database before each test
    await fetch('http://localhost:3000/test/cleanup', { method: 'POST' });
    await use(page);
  },
});
```

### Playwright Page Object Pattern

```typescript
// tests/e2e/pages/LeaguePage.ts
import type { Page } from '@playwright/test';

export class LeaguePage {
  constructor(private page: Page) {}

  async navigateToCreateLeague() {
    await this.page.goto('/leagues/create');
  }

  async fillLeagueForm(data: { name: string; maxTeams: number }) {
    await this.page.fill('[name="name"]', data.name);
    await this.page.fill('[name="maxTeams"]', data.maxTeams.toString());
  }

  async submitForm() {
    await this.page.click('button[type="submit"]');
  }

  async getSuccessMessage() {
    return this.page.textContent('.success-message');
  }
}

// Usage in test (`test` and `expect` are imported from '@playwright/test')
test('should create league', async ({ page }) => {
  const leaguePage = new LeaguePage(page);
  await leaguePage.navigateToCreateLeague();
  await leaguePage.fillLeagueForm({ name: 'Test', maxTeams: 10 });
  await leaguePage.submitForm();

  expect(await leaguePage.getSuccessMessage()).toBe('League created');
});
```

---

## Real E2E Testing Strategy (No Mocks)

GridPilot requires two distinct E2E testing strategies due to the nature of its automation adapters:

1. **Strategy A (Docker)**: Test `BrowserDevToolsAdapter` with Puppeteer against a fixture server
2. **Strategy B (Native macOS)**: Test `NutJsAutomationAdapter` on real hardware with display access

### Constraint: iRacing Terms of Service

- **Production**: nut.js OS-level automation only (no Puppeteer/CDP for actual iRacing automation)
- **Testing**: Puppeteer CAN be used to test `BrowserDevToolsAdapter` against static HTML fixtures

### Test Architecture Overview

```mermaid
graph TB
    subgraph Docker E2E - CI
        FX[Static HTML Fixtures] --> FS[Fixture Server Container]
        FS --> HC[Headless Chrome Container]
        HC --> BDA[BrowserDevToolsAdapter Tests]
    end

    subgraph Native E2E - macOS Runner
        SCR[Screen Capture] --> TM[Template Matching Tests]
        WF[Window Focus Tests] --> NJA[NutJsAutomationAdapter Tests]
        KB[Keyboard/Mouse Tests] --> NJA
    end
```

---

### Strategy A: Docker-Based E2E Tests

#### Purpose

Test the complete 18-step workflow using `BrowserDevToolsAdapter` against real HTML fixtures without mocks.

#### Architecture

```yaml
# docker/docker-compose.e2e.yml
services:
  # Headless Chrome with remote debugging enabled
  chrome:
    image: browserless/chrome:latest
    ports:
      - "9222:3000"
    environment:
      - CONNECTION_TIMEOUT=600000
      - MAX_CONCURRENT_SESSIONS=1
      - PREBOOT_CHROME=true
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/json/version"]
      interval: 5s
      timeout: 10s
      retries: 3

  # Static server for iRacing HTML fixtures
  fixture-server:
    build:
      context: ./fixture-server
      dockerfile: Dockerfile
    ports:
      - "3456:80"
    volumes:
      - ../resources/iracing-hosted-sessions:/usr/share/nginx/html:ro
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:80/01-hosted-racing.html"]
      interval: 5s
      timeout: 10s
      retries: 3
```

#### Fixture Server Configuration

```dockerfile
# docker/fixture-server/Dockerfile
FROM nginx:alpine

# Configure nginx for static HTML serving
COPY nginx.conf /etc/nginx/conf.d/default.conf

EXPOSE 80
```

```nginx
# docker/fixture-server/nginx.conf
server {
    listen 80;
    server_name localhost;
    root /usr/share/nginx/html;

    location / {
        try_files $uri $uri/ =404;
        add_header
            Access-Control-Allow-Origin *;
    }
}
```

#### BDD Scenarios for Docker E2E

```gherkin
Feature: BrowserDevToolsAdapter Workflow Automation
  As the automation engine
  I want to execute the 18-step hosted session workflow
  So that I can verify browser automation against real HTML fixtures

  Background:
    Given the Docker E2E environment is running
    And the fixture server is serving iRacing HTML pages
    And the headless Chrome container is connected

  Scenario: Complete workflow navigation through all 18 steps
    Given the BrowserDevToolsAdapter is connected to Chrome
    When I execute step 2 HOSTED_RACING
    Then the adapter should navigate to the hosted racing page
    And the page should contain the create race button
    When I execute step 3 CREATE_RACE
    Then the wizard modal should open
    When I execute step 4 RACE_INFORMATION
    And I fill the session name field with "Test Race"
    Then the form field should contain "Test Race"
    # ... steps 5-17 follow same pattern
    When I execute step 18 TRACK_CONDITIONS
    Then the automation should stop at the safety checkpoint
    And the checkout button should NOT be clicked

  Scenario: Modal step handling - Add Car modal
    Given the automation is at step 8 SET_CARS
    When I click the "Add Car" button
    Then the ADD_CAR modal should open
    When I search for "Dallara F3"
    And I select the first result
    Then the modal should close
    And the car should be added to the selection

  Scenario: Form field validation with real selectors
    Given I am on the RACE_INFORMATION page
    Then the selector "input[name='sessionName']" should exist
    And the selector ".form-group:has label:has-text Session Name input" should exist

  Scenario: Error handling when element not found
    Given I am on a blank page
    When I try to click selector "#nonexistent-element"
    Then the result should indicate failure
    And the error message should contain "not found"
```

#### Test Implementation Structure

```typescript
// tests/e2e/docker/browserDevToolsAdapter.e2e.test.ts
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { BrowserDevToolsAdapter } from '@infrastructure/adapters/automation/BrowserDevToolsAdapter';
import { StepId } from '@domain/value-objects/StepId';

describe('E2E: BrowserDevToolsAdapter - Docker Environment', () => {
  let adapter: BrowserDevToolsAdapter;
  const CHROME_WS_ENDPOINT = process.env.CHROME_WS_ENDPOINT || 'ws://localhost:9222';
  const FIXTURE_BASE_URL = process.env.FIXTURE_BASE_URL || 'http://localhost:3456';

  beforeAll(async () => {
    adapter = new BrowserDevToolsAdapter({
      browserWSEndpoint: CHROME_WS_ENDPOINT,
      defaultTimeout: 30000,
    });
    await adapter.connect();
  });

  afterAll(async () => {
    await adapter.disconnect();
  });

  describe('Step Workflow Execution', () => {
    it('should navigate to hosted racing page - step 2', async () => {
      const result = await adapter.navigateToPage(`${FIXTURE_BASE_URL}/01-hosted-racing.html`);
      expect(result.success).toBe(true);
    });

    it('should fill race information form - step 4', async () => {
      await adapter.navigateToPage(`${FIXTURE_BASE_URL}/03-race-information.html`);
      const stepId = StepId.create(4);
      const result = await adapter.executeStep(stepId, {
        sessionName: 'E2E Test Session',
        password: 'testpass123',
        description: 'Automated E2E test session',
      });
      expect(result.success).toBe(true);
    });

    // ...
    // additional step tests
  });

  describe('Modal Operations', () => {
    it('should handle ADD_CAR modal - step 9', async () => {
      await adapter.navigateToPage(`${FIXTURE_BASE_URL}/09-add-a-car.html`);
      const stepId = StepId.create(9);
      const result = await adapter.handleModal(stepId, 'open');
      expect(result.success).toBe(true);
    });
  });

  describe('Safety Checkpoint', () => {
    it('should stop at step 18 without clicking checkout', async () => {
      await adapter.navigateToPage(`${FIXTURE_BASE_URL}/18-track-conditions.html`);
      const stepId = StepId.create(18);
      const result = await adapter.executeStep(stepId, {});
      expect(result.success).toBe(true);
      expect(result.metadata?.safetyStop).toBe(true);
    });
  });
});
```

---

### Strategy B: Native macOS E2E Tests

#### Purpose

Test OS-level screen automation using nut.js on real hardware. These tests CANNOT run in Docker because nut.js requires actual display access.

#### Requirements

- macOS CI runner with display access
- Screen recording permissions granted
- Accessibility permissions enabled
- Real Chrome/browser window visible

#### BDD Scenarios for Native E2E

```gherkin
Feature: NutJsAutomationAdapter OS-Level Automation
  As the automation engine
  I want to perform OS-level screen automation
  So that I can interact with iRacing without browser DevTools

  Background:
    Given I am running on macOS with display access
    And accessibility permissions are granted
    And screen recording permissions are granted

  Scenario: Screen capture functionality
    When I capture the full screen
    Then a valid image buffer should be returned
    And the image dimensions should match screen resolution

  Scenario: Window focus management
    Given a Chrome window titled "iRacing" is open
    When I focus the browser window
    Then the Chrome window should become the active window

  Scenario: Template matching detection
    Given I have a template image for the "Create Race" button
    And the iRacing hosted racing page is visible
    When I search for the template on screen
    Then the template should be found
    And the location should have confidence > 0.8

  Scenario: Mouse click at detected location
    Given I have detected a button at coordinates 500,300
    When I click at that location
    Then the mouse should move to 500,300
    And a left click should be performed

  Scenario: Keyboard input simulation
    Given a text field is focused
    When I type "Test Session Name"
    Then the text should be entered character by character
    And there should be appropriate delays between keystrokes

  Scenario: Login state detection
    Given the iRacing login page is displayed
    When I detect the login state
    Then the result should indicate logged out
    And the login form indicator should be detected

  Scenario: Safe automation - no checkout
    Given I am on the Track Conditions step
    When I execute step 18
    Then no click should be performed on the checkout button
    And the automation should report safety stop
```

#### Test Implementation Structure

```typescript
// tests/e2e/native/nutJsAdapter.e2e.test.ts
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { NutJsAutomationAdapter } from '@infrastructure/adapters/automation/NutJsAutomationAdapter';

// Run only on macOS with display access. An early return inside beforeAll
// would not skip the tests, so the whole suite is gated with describe.runIf.
const canRunNative = process.platform === 'darwin' && Boolean(process.env.DISPLAY_AVAILABLE);

describe.runIf(canRunNative)('E2E: NutJsAutomationAdapter - Native macOS', () => {
  let adapter: NutJsAutomationAdapter;

  beforeAll(async () => {
    adapter = new NutJsAutomationAdapter({
      mouseSpeed: 500,
      keyboardDelay: 25,
      defaultTimeout: 10000,
    });
    await adapter.connect();
  });

  afterAll(async () => {
    if (adapter?.isConnected()) {
      await adapter.disconnect();
    }
  });

  describe('Screen Capture', () => {
    it('should capture full screen', async () => {
      const result = await adapter.captureScreen();
      expect(result.success).toBe(true);
      expect(result.imageData).toBeDefined();
      expect(result.dimensions.width).toBeGreaterThan(0);
    });

    it('should capture specific region', async () => {
      const region = { x: 100, y: 100, width: 200, height: 200 };
      const result = await adapter.captureScreen(region);
      expect(result.success).toBe(true);
    });
  });

  describe('Window Focus', () => {
    it('should focus Chrome window', async () => {
      const result = await adapter.focusBrowserWindow('Chrome');
      // May fail if Chrome not open, which is acceptable
      expect(result).toBeDefined();
    });
  });

  describe('Template Matching', () => {
    it('should find element by template', async () => {
      const template = {
        id: 'test-button',
        imagePath: './resources/templates/test-button.png',
        confidence: 0.8,
      };
      const location = await adapter.findElement(template);
      // Template may not be on screen - test structure only
      expect(location === null || location.confidence > 0).toBe(true);
    });
  });
});
```

---

### Test File Structure

```
tests/
├── e2e/
│   ├── docker/                          # Docker-based E2E tests
│   │   ├── browserDevToolsAdapter.e2e.test.ts
│   │   ├── workflowSteps.e2e.test.ts
│   │   ├── modalHandling.e2e.test.ts
│   │   └── selectorValidation.e2e.test.ts
│   ├── native/                          # Native OS automation tests
│   │   ├── nutJsAdapter.e2e.test.ts
│   │   ├── screenCapture.e2e.test.ts
│   │   ├── templateMatching.e2e.test.ts
│   │   └── windowFocus.e2e.test.ts
│   ├── automation.e2e.test.ts           # Existing selector validation
│   └── features/                        # Gherkin feature files
│       └── hosted-session-automation.feature
├── integration/
│   └── infrastructure/
│       └── BrowserDevToolsAdapter.test.ts
└── unit/
    └── ...

docker/
├── docker-compose.e2e.yml               # E2E test environment
└── fixture-server/
    ├── Dockerfile
    └── nginx.conf

.github/
└── workflows/
    ├── e2e-docker.yml                   # Docker E2E workflow
    └── e2e-macos.yml                    # macOS native E2E workflow
```

---

### CI/CD Integration

#### Docker E2E Workflow

```yaml
# .github/workflows/e2e-docker.yml
name: E2E Tests - Docker

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  e2e-docker:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Start Docker E2E environment
        run: |
          docker compose -f docker/docker-compose.e2e.yml up -d
          docker compose -f docker/docker-compose.e2e.yml ps

      - name: Wait for services to be healthy
        run: |
          # -f makes curl fail on HTTP errors, so a 404 or 5xx does not count as "ready"
          timeout 60 bash -c 'until curl -sf http://localhost:9222/json/version; do sleep 2; done'
          timeout 60 bash -c 'until curl -sf http://localhost:3456/01-hosted-racing.html; do sleep 2; done'

      - name: Run Docker E2E tests
        run: npm run test:e2e:docker
        env:
          CHROME_WS_ENDPOINT: ws://localhost:9222
          FIXTURE_BASE_URL: http://localhost:3456

      - name: Stop Docker environment
        if: always()
        run: docker compose -f docker/docker-compose.e2e.yml down -v
```

#### macOS Native E2E Workflow

```yaml
# .github/workflows/e2e-macos.yml
name: E2E Tests - macOS Native

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  e2e-macos:
    runs-on: macos-latest

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Grant screen recording permissions
        run: |
          # Note: GitHub Actions macOS runners have limited permission support
          # Some tests may be skipped if permissions cannot be granted
          sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db \
            "INSERT OR REPLACE INTO access VALUES('kTCCServiceScreenCapture','com.apple.Terminal',0,2,0,1,NULL,NULL,0,'UNUSED',NULL,0,$(date +%s));" 2>/dev/null || true

      - name: Run native E2E tests
        run: npm run test:e2e:native
        env:
          DISPLAY_AVAILABLE: "true"

      - name: Upload screenshots on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: e2e-screenshots
          path: tests/e2e/native/screenshots/
```

---

### NPM Scripts

```json
{
  "scripts": {
    "test:e2e": "vitest run --config vitest.e2e.config.ts",
    "test:e2e:docker": "vitest run --config vitest.e2e.config.ts tests/e2e/docker/",
    "test:e2e:native": "vitest run --config vitest.e2e.config.ts tests/e2e/native/",
    "docker:e2e:up": "docker compose -f docker/docker-compose.e2e.yml up -d",
    "docker:e2e:down": "docker compose -f docker/docker-compose.e2e.yml down -v",
    "docker:e2e:logs": "docker compose -f docker/docker-compose.e2e.yml logs -f"
  }
}
```

---

### Environment Configuration

```bash
# .env.test.example

# Docker E2E Configuration
CHROME_WS_ENDPOINT=ws://localhost:9222
FIXTURE_BASE_URL=http://localhost:3456
E2E_TIMEOUT=120000

# Native E2E Configuration
DISPLAY_AVAILABLE=true
NUT_JS_MOUSE_SPEED=500
NUT_JS_KEYBOARD_DELAY=25
```

---

## Cross-References

- **[`ARCHITECTURE.md`](./ARCHITECTURE.md)**: Layer boundaries, port definitions, and dependency rules that guide test structure
- **[`TECH.md`](./TECH.md)**: Detailed tooling specifications (Vitest, Playwright, Testcontainers configuration)
- **[`package.json`](../package.json)**: Test scripts and commands (`test:unit`, `test:integration`, `test:e2e`, `test:coverage`)

---

## Summary

GridPilot's testing strategy ensures:

- **Business logic is correct** (unit tests for domain/application layers)
- **Infrastructure works reliably** (integration tests for repositories/adapters)
- **User workflows function end-to-end** (E2E tests for full stack)
- **Browser automation works correctly** (Docker E2E tests with real fixtures)
- **OS-level automation works correctly** (Native macOS E2E tests with display access)

By following BDD principles and maintaining clear test organization, the team can confidently evolve
GridPilot while preserving correctness and stability.