# Feature Availability (Modes + Feature Flags)

This document defines a clean, consistent system for enabling/disabling functionality across:

- API endpoints
- Website links/navigation
- Website components

It is designed to support:

- test mode
- maintenance mode
- disabling features due to risk/issues
- coming soon features
- future super admin flag management

It is aligned with the hard separation of responsibilities in `Blockers & Guards`:

- Frontend uses Blockers (UX best-effort)
- Backend uses Guards (authoritative enforcement)

See: docs/architecture/BLOCKER_GUARDS.md

---

## 1) Core Principle

Availability is decided once, then applied in multiple places.

- Backend Guards enforce availability for correctness and security.
- Frontend Blockers reflect availability for UX, but must never be relied on for enforcement.

If it must be enforced, it is a Guard.
If it only improves UX, it is a Blocker.

---

## 2) Definitions (Canonical Vocabulary)

### 2.1 Operational Mode (system-level)

A small, global state representing operational posture.

Recommended enum:

- normal
- maintenance
- test

Operational Mode is:

- authoritative in backend
- typically environment-scoped
- required for rapid response (maintenance must be runtime-changeable)

### 2.2 Feature State (capability-level)

A per-feature state machine (not a boolean).

Recommended enum:

- enabled
- disabled
- coming_soon
- hidden

Semantics:

- enabled: feature is available and advertised
- disabled: feature exists but must not be used (safety kill switch)
- coming_soon: may be visible in UI as teaser, but actions are blocked
- hidden: not visible/advertised; actions are blocked (safest default)

### 2.3 Capability

A named unit of functionality (stable key) used consistently across API + website.

Examples:

- races.create
- payments.checkout
- sponsor.portal
- stewarding.protests

A capability key is a contract.

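
The vocabulary in 2.1 to 2.3 could be captured as shared types along the following lines. This is a minimal sketch, not the project's actual module: the constant, type, and function names are hypothetical, and only the enum values and the four example keys are taken from the definitions above.

```typescript
// Hypothetical shared vocabulary module. Names are illustrative assumptions;
// the values mirror the recommended enums in sections 2.1 and 2.2.

const OPERATIONAL_MODES = ['normal', 'maintenance', 'test'] as const;
type OperationalMode = (typeof OPERATIONAL_MODES)[number];

const FEATURE_STATES = ['enabled', 'disabled', 'coming_soon', 'hidden'] as const;
type FeatureState = (typeof FEATURE_STATES)[number];

// Only the example keys from section 2.3; a real registry would list every capability.
const CAPABILITY_KEYS = [
  'races.create',
  'payments.checkout',
  'sponsor.portal',
  'stewarding.protests',
] as const;
type CapabilityKey = (typeof CAPABILITY_KEYS)[number];

// Runtime check for values that arrive from config or storage as plain strings.
function isFeatureState(value: unknown): value is FeatureState {
  return (
    typeof value === 'string' &&
    (FEATURE_STATES as readonly string[]).indexOf(value) !== -1
  );
}

// Unknown or malformed values fall back to 'hidden', the safest state.
function parseFeatureState(value: unknown): FeatureState {
  return isFeatureState(value) ? value : 'hidden';
}

// Runtime check that a string is one of the registered capability keys.
function isCapabilityKey(value: string): value is CapabilityKey {
  return (CAPABILITY_KEYS as readonly string[]).indexOf(value) !== -1;
}
```

Deriving the union types from `as const` arrays keeps the compile-time types and the runtime validation lists in one place, which fits the idea that a capability key is a contract shared by API and website.
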
### 2.4 Action Type

Availability decisions vary by the type of action:

- view: read-only operations (pages, GET endpoints)
- mutate: state-changing operations (POST/PUT/PATCH/DELETE)

---

## 3) Policy Model (What Exists)

### 3.1 FeatureAvailabilityPolicy (single evaluation model)

One evaluation function produces a decision.

Inputs:

- environment (dev/test/prod)
- operationalMode (normal/maintenance/test)
- capabilityKey (string)
- actionType (view/mutate)
- actorContext (anonymous/authenticated; roles later)

Outputs:

- allow: boolean
- publicReason: one of maintenance | disabled | coming_soon | hidden | not_configured
- uxHint: optional { messageKey, redirectPath, showTeaser }

The same decision model is reused by:

- API Guard enforcement
- Website navigation visibility
- Website component rendering/disablement

### 3.2 Precedence (where values come from)

To avoid “mystery behavior”, use strict precedence:

1. runtime overrides (highest priority)
2. build-time environment configuration
3. code defaults (lowest priority, should be safe: hidden/disabled)

Rationale:

- runtime overrides enable emergency response without rebuild
- env config enables environment-specific defaults
- code defaults keep behavior deterministic if config is missing

---

## 4) Evaluation Rules (Deterministic, Explicit)

### 4.1 Maintenance mode rules

Maintenance must be able to block the platform fast and consistently.

Default behavior:

- mutate actions: denied unless explicitly allowlisted
- view actions: allowed only for a small allowlist (status page, login, health, static public routes)

This creates a safe “fail closed” posture.

Optional refinement:

- define a maintenance allowlist for critical reads (e.g., dashboards for operators)

### 4.2 Test mode rules

Test mode should primarily exist in non-prod, and should be explicit in prod.

Recommended behavior:

- In prod, test mode should not be enabled accidentally.
- In test environments, test mode may:
  - enable test-only endpoints
  - bypass external integrations (through adapters)
  - relax rate limits
  - expose test banners in UI (Blocker-level display)

### 4.3 Feature state rules (per capability)

Given a capability state:

- enabled:
  - allow view + mutate (subject to auth/roles)
  - visible in UI
- coming_soon:
  - allow view of teaser pages/components
  - deny mutate and deny sensitive reads
  - visible in UI with Coming Soon affordances
- disabled:
  - deny view + mutate
  - hidden in nav by default
- hidden:
  - deny view + mutate
  - never visible in UI

Note:

- “disabled” and “hidden” are both blocked; the difference is UI and information disclosure.

### 4.4 Missing configuration

If a capability is not configured:

- treat as hidden (fail closed)
- optionally log a warning (server-side)

---

## 5) Enforcement Mapping (Where Each Requirement Lives)

This section is the “wiring contract” across layers.

### 5.1 API endpoints (authoritative)

- Enforce via Backend Guards (NestJS CanActivate).
- Endpoints must declare the capability they require.

Mapping to HTTP:

- maintenance: 503 Service Unavailable (preferred for global maintenance)
- disabled/hidden: 404 Not Found (avoid advertising unavailable capabilities)
- coming_soon: 404 Not Found publicly, or 409 Conflict internally if you want explicit semantics for trusted clients later

Guideline:

- External clients should not get detailed feature availability information unless explicitly intended.

### 5.2 Website links / navigation (UX)

- Enforce via Frontend Blockers.
- Hide links when state is disabled/hidden.
- For coming_soon, show link but route to teaser page or disable with explanation.

Rules:

- Never assume hidden in UI equals enforced on server.
- UI should degrade gracefully (API may still block).

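
The rules in 4.1 to 4.4, the HTTP mapping in 5.1, and the link filtering in 5.2 can be sketched as one pure function plus two small helpers. This is an illustrative sketch under several assumptions, not the actual Guard: all names (`evaluate`, `denialStatus`, `visibleLinks`, `PolicySnapshot`, `NavLink`) are hypothetical, the maintenance check is assumed to run before the feature-state check, every `view` on a `coming_soon` capability is treated as a teaser view (sensitive reads are not modelled), and `environment`, `actorContext`, and test mode are left out for brevity.

```typescript
type OperationalMode = 'normal' | 'maintenance' | 'test';
type FeatureState = 'enabled' | 'disabled' | 'coming_soon' | 'hidden';
type ActionType = 'view' | 'mutate';
type PublicReason = 'maintenance' | 'disabled' | 'coming_soon' | 'hidden' | 'not_configured';

// Hypothetical snapshot shape; allowlists hold capability keys.
interface PolicySnapshot {
  operationalMode: OperationalMode;
  capabilities: Record<string, FeatureState>;
  maintenanceAllowlist: { view: string[]; mutate: string[] };
}

interface Decision {
  allow: boolean;
  publicReason?: PublicReason; // set only when the action is denied
  uxHint?: { showTeaser?: boolean };
}

function evaluate(policy: PolicySnapshot, capabilityKey: string, actionType: ActionType): Decision {
  // 4.1: in maintenance, anything not explicitly allowlisted is denied.
  if (policy.operationalMode === 'maintenance') {
    const allowlist = policy.maintenanceAllowlist[actionType];
    if (allowlist.indexOf(capabilityKey) === -1) {
      return { allow: false, publicReason: 'maintenance' };
    }
  }
  // 4.4: a missing capability fails closed.
  if (!Object.prototype.hasOwnProperty.call(policy.capabilities, capabilityKey)) {
    return { allow: false, publicReason: 'not_configured' };
  }
  // 4.3: per-capability state rules.
  switch (policy.capabilities[capabilityKey]) {
    case 'enabled':
      return { allow: true };
    case 'coming_soon':
      return actionType === 'view'
        ? { allow: true, uxHint: { showTeaser: true } }
        : { allow: false, publicReason: 'coming_soon', uxHint: { showTeaser: true } };
    case 'disabled':
      return { allow: false, publicReason: 'disabled' };
    default:
      // 'hidden' and any unexpected value are both treated as hidden.
      return { allow: false, publicReason: 'hidden' };
  }
}

// 5.1: 503 for maintenance, 404 for everything else so unavailable features are not advertised.
function denialStatus(reason: PublicReason): number {
  return reason === 'maintenance' ? 503 : 404;
}

interface NavLink {
  label: string;
  href: string;
  capabilityKey: string;
}

// 5.2: keep only links whose capability can currently be viewed (UX only, not enforcement).
function visibleLinks(policy: PolicySnapshot, links: NavLink[]): NavLink[] {
  return links.filter((link) => evaluate(policy, link.capabilityKey, 'view').allow);
}
```

Because `evaluate` is pure and takes the snapshot as an argument, the same function could back both a backend Guard and a frontend Blocker, with only the backend result treated as authoritative.
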
### 5.3 Website components (UX)

- Use Blockers to:
  - hide components for hidden/disabled
  - show teaser content for coming_soon
  - disable buttons or flows for coming_soon/disabled, with consistent messaging

Recommendation:

- Provide a single reusable component (FeatureBlocker) that consumes policy decisions and renders:
  - children when allowed
  - teaser when coming_soon
  - null or fallback when disabled/hidden

---

## 6) Build-Time vs Runtime (Clean, Predictable)

### 6.1 Build-time flags (require rebuild/redeploy)

What they are good for:

- preventing unfinished UI code from shipping in a bundle
- cutting entire routes/components from builds for deterministic releases

Limitations:

- NEXT_PUBLIC_* values are compiled into the client bundle; changing them does not update clients without rebuild.

Use build-time flags for:

- experimental UI
- “not yet shipped” components/routes
- simplifying deployments (pre-launch vs alpha style gating)

### 6.2 Runtime flags (no rebuild)

What they are for:

- maintenance mode
- emergency disable for broken endpoints
- quickly hiding risky features

Runtime flags must be available to:

- API Guards (always)
- Website SSR/middleware optionally
- Website client optionally (for UX only)

Key tradeoff:

- runtime access introduces caching and latency concerns
- treat runtime policy reads as cached, fast, and resilient

Recommended approach:

- API is authoritative source of runtime policy
- website can optionally consume a cached policy snapshot endpoint

---

## 7) Storage and Distribution (Now + Future Super Admin)

### 7.1 Now (no super admin UI)

Use a single “policy snapshot” stored in one place and read by the API, with caching.

Options (in priority order):

1. Remote KV/DB-backed policy snapshot (preferred for true runtime changes)
2. Environment variable JSON (simpler, but changes require restart/redeploy)
3. Static config file in repo (requires rebuild/redeploy)

### 7.2 Future (super admin UI)

Super admin becomes a writer to the same store.

Non-negotiable:

- The storage schema must be stable and versioned.

Recommended schema (conceptual):

- policyVersion
- operationalMode
- capabilities: map of capabilityKey -> featureState
- allowlists: maintenance view/mutate allowlists
- optional targeting rules later (by role/user)

---

## 8) Data Flow (Conceptual)

```mermaid
flowchart LR
  UI[Website UI] --> FB[Frontend Blockers]
  FB --> PC[Policy Client]
  UI --> API[API Request]
  API --> FG[Feature Guard]
  FG --> AS[API Application Service]
  AS --> UC[Core Use Case]
  PC --> PS[Policy Snapshot]
  FG --> PS
```

Interpretation:

- Website reads policy for UX (best-effort).
- API enforces policy (authoritative) before any application logic.

---

## 9) Implementation Checklist (For Code Mode)

Backend (apps/api):

- Define capability keys and feature states as shared types in a local module.
- Create FeaturePolicyService that resolves the current policy snapshot (cached).
- Add FeatureFlagGuard (or FeatureAvailabilityGuard) that:
  - reads required capability metadata for an endpoint
  - evaluates allow/deny with actionType
  - maps denial to the chosen HTTP status codes

Frontend (apps/website):

- Add a small PolicyClient that fetches policy snapshot from API (optional for phase 1).
- Add FeatureBlocker component for consistent UI behavior.
- Centralize navigation link definitions and filter them via policy.

Ops/Config:

- Define how maintenance mode is toggled (KV/DB entry or config endpoint restricted to operators later).
- Ensure defaults are safe (fail closed).

---

## 10) Non-Goals (Explicit)

- This system is not an authorization system.
- Roles/permissions are separate (but can be added as actorContext inputs later).
- Blockers never replace Guards.
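
As a closing illustration, the conceptual schema from 7.2 and the precedence order from 3.2 could be combined roughly as follows. This is a sketch under stated assumptions rather than a prescribed storage format: the interface and function names (`PolicySnapshot`, `PolicyLayer`, `resolvePolicy`, `stateOf`) and the exact field layout of `allowlists` are hypothetical, and targeting rules are omitted.

```typescript
type OperationalMode = 'normal' | 'maintenance' | 'test';
type FeatureState = 'enabled' | 'disabled' | 'coming_soon' | 'hidden';

// Hypothetical concrete form of the conceptual schema in section 7.2.
interface PolicySnapshot {
  policyVersion: number;
  operationalMode: OperationalMode;
  capabilities: Record<string, FeatureState>;
  allowlists: { maintenanceView: string[]; maintenanceMutate: string[] };
}

// A partial layer: env config and runtime overrides only set what they change.
interface PolicyLayer {
  operationalMode?: OperationalMode;
  capabilities?: Record<string, FeatureState>;
}

// Section 3.2 precedence: runtime overrides > env config > code defaults.
// Later spreads win, so the highest-priority layer is spread last.
function resolvePolicy(
  codeDefaults: PolicySnapshot,
  envConfig: PolicyLayer,
  runtimeOverrides: PolicyLayer,
): PolicySnapshot {
  return {
    policyVersion: codeDefaults.policyVersion,
    operationalMode:
      runtimeOverrides.operationalMode ??
      envConfig.operationalMode ??
      codeDefaults.operationalMode,
    capabilities: {
      ...codeDefaults.capabilities,
      ...(envConfig.capabilities ?? {}),
      ...(runtimeOverrides.capabilities ?? {}),
    },
    allowlists: codeDefaults.allowlists,
  };
}

// Section 4.4: a capability with no entry is treated as hidden (fail closed).
function stateOf(policy: PolicySnapshot, capabilityKey: string): FeatureState {
  return Object.prototype.hasOwnProperty.call(policy.capabilities, capabilityKey)
    ? policy.capabilities[capabilityKey]
    : 'hidden';
}
```

Keeping the layers as partial objects means a runtime override can flip a single capability or the operational mode without restating the whole snapshot, while the code defaults still decide everything that no layer mentions.
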