auth
247
docs/MESSAGING.md
Normal file
@@ -0,0 +1,247 @@
# GridPilot — Messaging & Communication System

**Design Document (Code-First, Admin-Safe)**

---

## 1. Goals

The messaging system must:

- be **code-first**
- be **fully versioned**
- be **safe by default**
- prevent admins from breaking tone, structure, or legality
- support **transactional emails**, **announcements**, and **votes**
- give admins **visibility**, not creative control

This is **not** a marketing CMS.
It is infrastructure.

---

## 2. Core Principles

### 2.1 Code is the Source of Truth
- All email templates live in the repository
- No WYSIWYG editors
- No runtime editing by admins
- Templates are reviewed like any other code

### 2.2 Admins Trigger, They Don’t Author
Admins can:
- preview
- test
- trigger
- audit

Admins cannot:
- edit wording
- change layout
- inject content

This guarantees:
- consistent voice
- legal safety
- no accidental damage

---

## 3. Template System

### 3.1 Template Structure

Each template defines:

- unique ID
- version
- subject
- body (HTML + plain text)
- allowed variables
- default values
- fallback behavior

Example (conceptual):

- `league_invite_v1`
- `season_start_v2`
- `penalty_applied_v1`

Templates are immutable once deprecated.
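
The fields listed above could be modelled as a frozen record that lives in the repository. This is only a sketch: the class name `EmailTemplate`, the field names, the `$variable` placeholder syntax, and the `REGISTRY` dictionary are assumptions for illustration, not something this document specifies.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class EmailTemplate:
    """One versioned template (hypothetical shape); frozen so fields cannot be reassigned."""

    template_id: str                    # e.g. "league_invite"
    version: int                        # e.g. 1, giving the key "league_invite_v1"
    subject: str
    body_html: str
    body_text: str
    allowed_variables: frozenset[str]   # explicit allow-list
    defaults: Mapping[str, str] = field(default_factory=dict)
    fallback: str = "[unavailable]"     # used when a value is missing and has no default

    @property
    def key(self) -> str:
        # Combines ID and version in the same style as the conceptual examples.
        return f"{self.template_id}_v{self.version}"


LEAGUE_INVITE_V1 = EmailTemplate(
    template_id="league_invite",
    version=1,
    subject="You are invited to $league_name",
    body_html="<p>Hi $driver_name, you are invited to <b>$league_name</b>.</p>",
    body_text="Hi $driver_name, you are invited to $league_name.",
    allowed_variables=frozenset({"driver_name", "league_name"}),
    defaults={"driver_name": "driver"},
)

# Assumed lookup table; a wording change would be added as a new version, not edited in place.
REGISTRY = {t.key: t for t in (LEAGUE_INVITE_V1,)}
```

Because the dataclass is frozen, an attempt such as `LEAGUE_INVITE_V1.subject = "..."` raises an error at runtime, which fits the rule that templates change only through reviewed code.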

---

### 3.2 Variables

- Strictly typed
- Explicit allow-list
- Required vs optional
- Default values for previews

Missing variables:
- never crash delivery
- always fall back safely
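
One way to read these rules is as a resolution step that runs before rendering. The sketch below assumes `$name` placeholders rendered with Python's standard `string.Template`; the variable names, the `FALLBACK` marker, and the `resolve_variables` / `render` helpers are hypothetical.

```python
from string import Template

# Hypothetical allow-list: variable name -> expected type.
ALLOWED = {"driver_name": str, "league_name": str, "race_count": int}
REQUIRED = {"league_name"}
DEFAULTS = {"driver_name": "driver", "race_count": 0}
FALLBACK = "[unavailable]"


def resolve_variables(supplied: dict) -> tuple[dict, list[str]]:
    """Return safe values for every allowed variable plus a list of problems to log."""
    resolved, problems = {}, []
    for name, expected_type in ALLOWED.items():
        value = supplied.get(name)
        if isinstance(value, expected_type):
            resolved[name] = value
            continue
        if name in supplied:
            problems.append(f"wrong type for {name}")
        elif name in REQUIRED:
            problems.append(f"missing required {name}")
        # Fall back instead of raising, so delivery never crashes.
        resolved[name] = DEFAULTS.get(name, FALLBACK)
    # Names outside the allow-list are reported and never copied into the output.
    problems.extend(f"ignored unknown {n}" for n in supplied if n not in ALLOWED)
    return resolved, problems


def render(template_text: str, supplied: dict) -> str:
    values, _problems = resolve_variables(supplied)
    # safe_substitute leaves unknown placeholders untouched rather than raising.
    return Template(template_text).safe_substitute(values)
```

For example, `render("Hi $driver_name, welcome to $league_name", {"league_name": "GT3 Cup"})` fills in the default driver name, and an unexpected key in the input is dropped rather than injected.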

---

## 4. Admin Preview & Testing

### 4.1 Preview Mode

Admins can:

- open any template
- see rendered output
- switch between HTML / text
- inspect subject line

Preview uses:
- **test data only**
- never real user data by default

---

### 4.2 Test Send

Admins may:

- send a test email to themselves
- choose a predefined test dataset
- never inject arbitrary values

Purpose:
- sanity check
- formatting validation
- confidence before triggering
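
Preview and test send can share one code path in which the admin supplies only the name of a predefined dataset. The following is a minimal sketch under that assumption; the dataset names, the sample template text, and the in-memory `outbox` list standing in for a mail transport are all hypothetical.

```python
from string import Template

# Predefined datasets live in code; admins pick a name, never the values.
TEST_DATASETS = {
    "default": {"driver_name": "Alex Example", "league_name": "Example League"},
    "long_names": {
        "driver_name": "Maximilian Alexander von Beispiel",
        "league_name": "International Endurance Championship Series",
    },
}

SUBJECT = Template("You are invited to $league_name")
BODY_TEXT = Template("Hi $driver_name, you are invited to $league_name.")


def preview(dataset_name: str) -> dict:
    """Render with test data only; an unknown dataset name raises KeyError."""
    data = TEST_DATASETS[dataset_name]
    return {"subject": SUBJECT.substitute(data), "text": BODY_TEXT.substitute(data)}


def send_test_email(admin_email: str, dataset_name: str, outbox: list) -> dict:
    """Queue a test message addressed to the requesting admin only."""
    message = {"to": admin_email, "is_test": True, **preview(dataset_name)}
    outbox.append(message)
    return message
```

Since the function signature accepts a dataset name rather than a dictionary of values, there is no parameter through which arbitrary content could be injected.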

---

## 5. Delivery & Audit Trail

Every sent message is logged.

For each send event, store:
- template ID + version
- timestamp
- triggered by (admin/system)
- recipient(s)
- delivery status
- error details (if any)

Admins can view:
- delivery history
- failures
- resend eligibility
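
The stored fields map naturally onto an append-only record. In this sketch the `SendEvent` class, the status strings `"delivered"` / `"failed"`, and the in-memory list used as the log are assumptions; the document does not define the storage or the rule for resend eligibility, so the latter is left out.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SendEvent:
    """One audit row per send attempt (hypothetical shape)."""

    template_id: str
    template_version: int
    timestamp: datetime
    triggered_by: str               # e.g. "system" or "admin:42"
    recipients: tuple[str, ...]
    delivery_status: str            # assumed values: "delivered" or "failed"
    error_details: str | None = None


def record_send(log: list, *, template_id: str, template_version: int,
                triggered_by: str, recipients: list[str],
                error: str | None = None) -> SendEvent:
    """Append an immutable event to the log and return it."""
    event = SendEvent(
        template_id=template_id,
        template_version=template_version,
        timestamp=datetime.now(timezone.utc),
        triggered_by=triggered_by,
        recipients=tuple(recipients),
        delivery_status="failed" if error else "delivered",
        error_details=error,
    )
    log.append(event)
    return event


def failures(log: list) -> list:
    """The subset an admin would see under a 'failures' view."""
    return [e for e in log if e.delivery_status == "failed"]
```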

---

## 6. Trigger Types

### 6.1 Automatic Triggers
- season start
- race reminder
- protest resolved
- penalty applied
- standings updated

### 6.2 Manual Triggers
- league announcement
- sponsor message
- admin update
- vote launch

Manual triggers are:
- explicit
- logged
- rate-limited
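
A manual trigger that is both logged and rate-limited might look like the sliding-window gate below. The class name `ManualTriggerGate`, the per-admin limit, and the injectable clock are assumptions chosen to keep the sketch testable; the document does not state actual limits.

```python
import time


class ManualTriggerGate:
    """Allows at most `max_per_window` manual triggers per admin within `window_seconds`."""

    def __init__(self, max_per_window, window_seconds, clock=time.monotonic):
        self._max = max_per_window
        self._window = window_seconds
        self._clock = clock
        self._history = {}      # admin_id -> timestamps of allowed triggers
        self.audit_log = []     # every attempt is recorded, including refusals

    def try_trigger(self, admin_id, trigger_type):
        now = self._clock()
        # Keep only the triggers that still fall inside the window.
        recent = [t for t in self._history.get(admin_id, []) if now - t < self._window]
        allowed = len(recent) < self._max
        if allowed:
            recent.append(now)
        self._history[admin_id] = recent
        self.audit_log.append(
            {"admin": admin_id, "trigger": trigger_type, "at": now, "allowed": allowed}
        )
        return allowed
```

Logging refused attempts as well as accepted ones keeps the audit trail complete even when the limit is hit.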

---

## 7. Newsletter Handling

Newsletters follow the same system.

Characteristics:
- predefined formats
- fixed structure
- optional sections
- no free-text editing

Admins can:
- choose newsletter type
- select audience
- trigger send

Admins cannot:
- rewrite copy
- add arbitrary sections
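
A fixed structure with optional sections can be expressed as data, so the admin's choice is reduced to switching known sections on or off. The format name `season_recap` and the section names below are invented for illustration.

```python
from __future__ import annotations

# Hypothetical newsletter formats, defined in code.
NEWSLETTER_FORMATS = {
    "season_recap": {
        "fixed": ("header", "standings", "footer"),
        "optional": ("sponsor_note", "next_race"),
        # The order is part of the format and cannot be changed by admins.
        "order": ("header", "standings", "sponsor_note", "next_race", "footer"),
    },
}


def plan_newsletter(newsletter_type: str, enabled_optional: set[str]) -> list[str]:
    """Return the ordered sections to render; unknown sections are rejected."""
    fmt = NEWSLETTER_FORMATS[newsletter_type]
    unknown = enabled_optional - set(fmt["optional"])
    if unknown:
        raise ValueError(f"sections not defined for {newsletter_type}: {sorted(unknown)}")
    return [s for s in fmt["order"] if s in fmt["fixed"] or s in enabled_optional]
```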

---

## 8. Voting & Poll Messaging

Polls are also template-driven.

Flow:
1. Poll defined in code
2. Admin starts poll
3. System sends notification
4. Users vote
5. Results summarized automatically

Messaging remains:
- neutral
- consistent
- auditable
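
The five-step flow can be sketched as a small state machine. Everything here is an assumption made for illustration: the `Poll` class, the state names, the notification key `poll_opened_v1`, and the choice that a repeat vote from the same user replaces the earlier one.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Poll:
    """Step 1: the poll (question and options) is defined in code."""

    poll_id: str
    question: str
    options: tuple[str, ...]
    state: str = "defined"          # defined -> open -> closed
    votes: dict[str, str] = field(default_factory=dict)
    audit_log: list[str] = field(default_factory=list)

    def start(self, admin_id: str) -> str:
        """Steps 2 and 3: an admin opens the poll; returns the notification to send."""
        if self.state != "defined":
            raise ValueError("poll can only be started once")
        self.state = "open"
        self.audit_log.append(f"started by {admin_id}")
        return f"poll_opened_v1:{self.poll_id}"

    def vote(self, user_id: str, option: str) -> None:
        """Step 4: one vote per user; only predefined options are accepted."""
        if self.state != "open":
            raise ValueError("poll is not open")
        if option not in self.options:
            raise ValueError("unknown option")
        self.votes[user_id] = option

    def close(self) -> dict[str, int]:
        """Step 5: summarize results automatically, including options with zero votes."""
        if self.state != "open":
            raise ValueError("poll is not open")
        self.state = "closed"
        counts = Counter(self.votes.values())
        return {option: counts[option] for option in self.options}
```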

---

## 9. Admin UI Scope

Admin interface provides:

- template list
- preview button
- test send
- send history
- delivery status
- trigger actions

Admin UI explicitly excludes:
- template editing
- layout controls
- copywriting fields

---

## 10. Why This Matters

This approach ensures:

- trust
- predictability
- legal safety
- consistent brand voice
- low operational risk
- no CMS hell

GridPilot communicates like a tool, not a marketing department.

---

## 11. Non-Goals

This system will NOT:
- support custom admin HTML
- allow per-league copy editing
- replace marketing platforms
- become a newsletter builder

That is intentional.

---

## 12. Summary

**Code defines communication.
Admins execute communication.
Users receive communication they can trust.**

Simple. Stable. Scalable.
199
docs/OBSERVABILITY.md
Normal file
@@ -0,0 +1,199 @@
GridPilot — Observability & Data Separation Design

Purpose

This document defines how GridPilot separates business-critical domain data from infrastructure / observability data, while keeping operations simple, self-hosted, and cognitively manageable.

Goals:
• protect domain data at all costs
• avoid tool sprawl
• keep one clear mental model for operations
• enable debugging without polluting business logic
• ensure long-term maintainability

⸻

Core Principle

Domain data and infrastructure data must never share the same storage, lifecycle, or access path.

They serve different purposes, have different risk profiles, and must be handled independently.

⸻

Data Categories

1. Domain (Business) Data

Includes
• users
• leagues
• seasons
• races
• results
• penalties
• escrow balances
• sponsorship contracts
• payments & payouts

Characteristics
• legally relevant
• trust-critical
• user-facing
• must never be lost
• requires strict migrations and backups

Storage
• Relational database (PostgreSQL)
• Strong consistency (ACID)
• Backups and disaster recovery mandatory

Access
• Application backend
• Custom Admin UI (primary control surface)

⸻

2. Infrastructure / Observability Data

Includes
• application logs
• error traces
• metrics (latency, throughput, failures)
• background job status
• system health signals

Characteristics
• high volume
• ephemeral by design
• not user-facing
• safe to rotate or delete
• supports debugging, not business logic

Storage
• Dedicated observability stack
• Completely separate from domain database

Access
• Grafana UI only
• Never exposed to users
• Never queried by application logic

⸻

Observability Architecture (Self-Hosted)

GridPilot uses a single consolidated self-hosted observability stack.

Components
• Grafana
    • Central UI
    • Dashboards
    • Alerting
    • Single login
• Loki
    • Log aggregation
    • Append-only
    • Schema-less
    • Optimized for high-volume logs
• Prometheus
    • Metrics collection
    • Time-series data
    • Alert rules
• Tempo (optional)
    • Distributed traces
    • Request flow analysis

All components are accessed exclusively through Grafana.

⸻

Responsibility Split

Custom Admin (GridPilot)

Handles:
• business workflows
• escrow state visibility
• payment events
• league integrity checks
• moderation actions
• audit views

Never handles:
• raw logs
• metrics
• system traces

⸻

Observability Stack (Grafana)

Handles:
• system health
• performance bottlenecks
• error rates
• background job failures
• infrastructure alerts

Never handles:
• business decisions
• user-visible data
• domain state

⸻

Logging & Metrics Policy

What is logged
• errors and exceptions
• payment and escrow failures
• background job failures
• unexpected external API responses
• startup and shutdown events

What is not logged
• user personal data
• credentials
• domain state snapshots
• high-frequency debug spam
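
Part of this policy can be enforced mechanically before log lines leave the process. The sketch below uses a filter from Python's standard `logging` module; the two regular expressions and the name `RedactingFilter` are assumptions, and pattern-based redaction is only a safety net, not a replacement for keeping personal data out of log calls in the first place.

```python
import logging
import re

# Hypothetical patterns: email addresses and key=value style secrets.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SECRET_RE = re.compile(r"(password|token|secret)=\S+", re.IGNORECASE)


class RedactingFilter(logging.Filter):
    """Rewrites each record so emails and obvious credentials never reach the log sink."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()          # message with its arguments applied
        message = EMAIL_RE.sub("[redacted-email]", message)
        message = SECRET_RE.sub(lambda m: f"{m.group(1)}=[redacted]", message)
        record.msg = message
        record.args = ()                       # arguments are already merged in
        return True                            # keep the record, now redacted


logger = logging.getLogger("gridpilot")
logger.setLevel(logging.INFO)                  # drops DEBUG-level noise
logger.addFilter(RedactingFilter())
```

Attaching the filter to the logger means it applies to every handler behind that logger, whichever log shipper is used.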

⸻

Alerting Philosophy

Alerts are:
• minimal
• actionable
• rare

Examples:
• payment failure spike
• escrow release delay
• background jobs failing repeatedly
• sustained error rate increase

No vanity alerts.
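
"Sustained" is what keeps an alert rare: a single bad sample should not page anyone. The idea can be sketched as a check that fires only after several consecutive samples breach a threshold. The function name, threshold, and sample values are illustrative assumptions; in Prometheus the same effect is normally expressed with the `for` clause of an alerting rule rather than in application code.

```python
def sustained_breach(error_rates, threshold, min_consecutive):
    """Return True if `min_consecutive` samples in a row exceed `threshold`."""
    streak = 0
    for rate in error_rates:
        # A sample at or below the threshold resets the streak.
        streak = streak + 1 if rate > threshold else 0
        if streak >= min_consecutive:
            return True
    return False


# A single spike is ignored, while three breaching samples in a row would fire.
spike_only = [0.01, 0.20, 0.01, 0.02]
sustained = [0.01, 0.08, 0.09, 0.12]
```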

⸻

Rationale

This separation ensures:
• domain data remains clean and safe
• observability data can scale freely
• infra failures never corrupt business data
• operational complexity stays manageable

The system favors clarity over completeness and stability over tooling hype.

⸻

Summary
• Domain data lives in PostgreSQL
• Observability data lives in a dedicated stack
• Grafana is the single infra control surface
• Custom Admin is the single business control surface
• No shared storage, no shared lifecycle

This design minimizes risk, cognitive load, and operational overhead while remaining fully extensible.