gridpilot.gg/plans/ratings-architecture-concept.md
2025-12-27 19:39:23 +01:00


Ratings Architecture Concept (Multi-Rating + Transparency + Eligibility)

This concept defines a clean, extendable architecture for ratings in GridPilot with:

  • Our own platform ratings (computed only from GridPilot league activity).
  • External per-game ratings (e.g. iRacing iRating/SR) stored separately for display + eligibility filtering only.
  • A transparent rating ledger so users can see exactly why they gained/lost rating.

It is designed to fit the project's Clean Architecture + CQRS Light rules in:

It is also aligned with the principles in:


1. Requirements Summary

1.1 Must Have (now)

  • Platform ratings
    • driving: combines clean + fast driving (and also accounts for AFK/DNS/DNF/DSQ).
    • adminTrust: administrative trust score.
  • Per-game ratings
    • Stored per game (e.g. iRacing iRating, safetyRating) for display + eligibility filters.
    • Not used to compute platform ratings.
  • Transparency
    • UI must show “why did my rating change” with plus/minus, reason, and reference context.
    • A persisted rating ledger is required.

1.2 Future (design for, do not implement now)

  • stewardTrust
  • broadcasterTrust

1.3 Non-Functional

  • The architecture must be easy to maintain and easy to consume, because ratings are read in many places across the codebase.
  • Strong separation of concerns: domain is pure; commands enforce invariants; queries are pragmatic.
  • Extendability: new rating dimensions and new event types should not cause rewrites.

2. Key Architectural Decisions

2.1 Platform ratings are computed only from GridPilot events

External game ratings are:

  • Stored independently,
  • Displayed and queried,
  • Usable in eligibility filters,
  • Not inputs to platform rating computation.

2.2 Ledger-first transparency

Every rating adjustment is represented as an immutable rating event in a ledger, with:

  • Who: userId (subject)
  • What: dimension (driving/adminTrust/…)
  • Why: reason code + human-readable summary + structured metadata
  • How much: delta (+/-) and optional weight
  • Where: reference to a domain object (raceId, penaltyId, voteId, adminActionId)

Snapshots are derived from ledger events, not the other way around.

2.3 CQRS Light split

  • Commands record rating events and recompute snapshots.
  • Queries provide fast read models for UI and eligibility evaluation, without loading domain aggregates.

2.4 Evolution path from existing code

There is already a multi-dimensional value object UserRating and a domain service RatingUpdateService triggered by CompleteRaceUseCaseWithRatings.execute().

This concept treats the existing UserRating as an early “snapshot-like” model and proposes a controlled evolution:

  • Keep a snapshot object (can stay named UserRating or be renamed later).
  • Add a ledger model + repositories + calculators.
  • Gradually redirect the write flow from “direct updates” to “record events + recompute snapshot”.

No “big bang rewrite”.


3. Domain Model (Core Concepts)

3.1 Bounded contexts

  • Identity context owns user reputation/ratings (consistent with current placement of UserRating).
  • Racing context emits race outcomes (finishes, incidents, statuses) and penalties/DSQ information; it does not own rating logic.
  • Admin/Competition context emits admin actions and vote outcomes; it does not own rating logic.

3.2 Rating dimensions (extendable)

Define a canonical dimension key set (enum-like union) for platform ratings:

  • driving
  • adminTrust
  • stewardTrust (future)
  • broadcasterTrust (future)

Rule: adding a dimension should require:

  • A new calculator strategy, and
  • New event taxonomy entries, not structural redesign.
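
The rule above can be sketched in TypeScript as a closed union of dimension keys plus a calculator registry. This is a minimal illustration, not the project's code: the names `RatingCalculator`, `calculators`, `applyDeltas`, and the sum-and-clamp formula are assumptions made for the example.

```typescript
// Hypothetical sketch: dimension keys as a closed union plus a calculator registry.
type RatingDimensionKey = "driving" | "adminTrust"; // future: "stewardTrust" | "broadcasterTrust"

interface DimensionDelta {
  dimension: RatingDimensionKey;
  delta: number;
}

// One strategy per dimension; each is a pure function of the current value and its deltas.
type RatingCalculator = (current: number, deltas: readonly number[]) => number;

// Clamp to the 0..100 scale recommended for RatingValue.
const clampRating = (value: number): number => Math.min(100, Math.max(0, value));

// Placeholder formula (assumption): add all deltas to the current value, then clamp.
const sumAndClamp: RatingCalculator = (current, deltas) =>
  clampRating(deltas.reduce((acc, d) => acc + d, current));

// Record<RatingDimensionKey, ...> makes the compiler reject a new key that has no calculator.
const calculators: Record<RatingDimensionKey, RatingCalculator> = {
  driving: sumAndClamp,
  adminTrust: sumAndClamp,
};

function applyDeltas(
  current: Record<RatingDimensionKey, number>,
  deltas: readonly DimensionDelta[],
): Record<RatingDimensionKey, number> {
  const next = { ...current };
  for (const key of Object.keys(calculators) as RatingDimensionKey[]) {
    const own = deltas.filter((d) => d.dimension === key).map((d) => d.delta);
    next[key] = calculators[key](current[key], own);
  }
  return next;
}
```

Under this shape, adding stewardTrust means extending the union and registering one more calculator; `applyDeltas` itself stays unchanged.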

3.3 Domain objects (suggested)

Domain objects below follow the rules in Domain Objects.

Value Objects

  • RatingDimensionKey (e.g. driving, adminTrust)
  • RatingValue (0..100 or 0..N; pick one standard scale; recommend 0..100 aligned with UserRating)
  • RatingDelta (signed float/decimal; stored and displayed)
  • RatingEventId (uuid-like string)
  • RatingReference (typed reference union: raceId, penaltyId, voteId, adminActionId)
  • ExternalRating (per-game rating data point, e.g. iRating, safety rating)
  • GameKey (e.g. iracing, future acc, etc.)

Entities / Aggregate Roots

  • RatingLedger (aggregate root for a user's rating events)
    • Identity: userId
    • Holds a list/stream of RatingEvent (not necessarily loaded fully; repository can stream)
  • RatingEvent (entity inside ledger or separate entity persisted in table)
    • Identity: ratingEventId
    • Immutable once persisted
  • AdminVoteSession (aggregate root, scoped to league + admin candidate + window)
    • Identity: voteSessionId
    • Controls who can vote, dedup, time window, and closure
    • Emits outcome events that convert to rating ledger events
  • ExternalGameRatingProfile (aggregate root per user)
    • Identity: userId + gameKey
    • Stores latest known per-game ratings + provenance

Domain Services

  • DrivingRatingCalculator (pure, stateless)
  • AdminTrustRatingCalculator (pure, stateless)
  • RatingSnapshotCalculator (applies ordered events to snapshot)
  • RatingEventFactory (turns domain facts into rating events)
  • EligibilityEvaluator (pure evaluation over rating snapshots and external ratings, but invoked from application layer for “decisions”)
  • Keep these services similar in spirit to AverageStrengthOfFieldCalculator.calculate(), and apply the kind of constraints that value objects such as StrengthOfField.create() enforce.

3.4 Rating snapshot (current UserRating)

A snapshot is what most screens need:

  • latest rating value per dimension,
  • confidence/sample size/trend,
  • lastUpdated.

This already exists in UserRating. Conceptually, the snapshot is derived from events:

  • value: derived
  • confidence + sampleSize: derived from count/weights and recentness rules
  • trend: derived from recent deltas

Snapshots are persisted for fast reads; events are persisted for transparency.


4. Rating Ledger (Transparency Backbone)

4.1 Rating event structure (conceptual schema)

A RatingEvent should contain:

  • id: RatingEventId
  • userId: subject of the rating
  • dimension: RatingDimensionKey
  • delta: RatingDelta
  • weight: numeric (optional; for sample size / confidence)
  • occurredAt: Date
  • createdAt: Date
  • source:
    • sourceType: race | penalty | vote | adminAction | manualAdjustment
    • sourceId: string
  • reason:
    • code: stable machine code (for i18n and filtering)
    • summary: human text (or key + template params)
    • details: structured JSON (for UI)
  • visibility:
    • public: boolean (default true)
    • redactedFields: list (for sensitive moderation info)
  • version: schema version for forward compatibility
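
As a rough illustration, the conceptual schema above could map to a TypeScript type like the one below. The field names follow the list; the `createRatingEvent` helper, its defaults, and the schema version constant are assumptions added for the example.

```typescript
// Hypothetical sketch of the conceptual RatingEvent schema as a read-only type.
type RatingDimensionKey = "driving" | "adminTrust";
type RatingSourceType = "race" | "penalty" | "vote" | "adminAction" | "manualAdjustment";

interface RatingEvent {
  readonly id: string;
  readonly userId: string;
  readonly dimension: RatingDimensionKey;
  readonly delta: number;
  readonly weight?: number;
  readonly occurredAt: Date;
  readonly createdAt: Date;
  readonly source: { readonly sourceType: RatingSourceType; readonly sourceId: string };
  readonly reason: {
    readonly code: string; // stable machine code, e.g. DRIVING_DNF_PENALTY
    readonly summary: string;
    readonly details: Readonly<Record<string, unknown>>;
  };
  readonly visibility: { readonly public: boolean; readonly redactedFields: readonly string[] };
  readonly version: number;
}

// Callers supply the facts; createdAt, visibility, and version get defaults.
type RatingEventInput = Omit<RatingEvent, "createdAt" | "visibility" | "version"> &
  Partial<Pick<RatingEvent, "visibility">>;

const RATING_EVENT_SCHEMA_VERSION = 1; // assumption: first schema version

function createRatingEvent(input: RatingEventInput, now: Date = new Date()): RatingEvent {
  if (!Number.isFinite(input.delta)) {
    throw new Error("delta must be a finite number");
  }
  // Object.freeze is shallow; it only guards the top-level fields at runtime.
  return Object.freeze({
    ...input,
    createdAt: now,
    visibility: input.visibility ?? { public: true, redactedFields: [] },
    version: RATING_EVENT_SCHEMA_VERSION,
  });
}
```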

4.2 Ledger invariants

  • Immutable events (append-only); corrections happen via compensating events.
  • Deterministic ordering rule (by occurredAt, then createdAt, then id).
  • The snapshot is always reproducible from events (within the same calculator version).
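
A small TypeScript sketch of these invariants, under stated assumptions: the `LedgerEntry` shape is reduced to the fields the ordering rule needs, the starting value of 50 and the clamp-and-sum replay are placeholders, and `compensate` is a hypothetical helper showing how a correction can be appended instead of editing history.

```typescript
// Hypothetical sketch: deterministic ordering, replay, and compensating events.
interface LedgerEntry {
  id: string;
  delta: number;
  occurredAt: Date;
  createdAt: Date;
}

// Ordering rule from the invariants: occurredAt, then createdAt, then id.
function compareLedgerEntries(a: LedgerEntry, b: LedgerEntry): number {
  return (
    a.occurredAt.getTime() - b.occurredAt.getTime() ||
    a.createdAt.getTime() - b.createdAt.getTime() ||
    (a.id < b.id ? -1 : a.id > b.id ? 1 : 0)
  );
}

// Replays entries in canonical order; the input array is never mutated.
// Assumption: ratings start at 50 and are clamped to 0..100 after each step.
function replay(entries: readonly LedgerEntry[], startValue = 50): number {
  return [...entries]
    .sort(compareLedgerEntries)
    .reduce((value, entry) => Math.min(100, Math.max(0, value + entry.delta)), startValue);
}

// A correction is a new entry with the opposite delta; the original stays untouched.
function compensate(original: LedgerEntry, id: string, now: Date): LedgerEntry {
  return { id, delta: -original.delta, occurredAt: original.occurredAt, createdAt: now };
}
```

Because the result depends only on the set of entries and the ordering rule, replaying the same ledger in any load order gives the same snapshot value.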

4.3 Calculator versioning

To remain maintainable over time:

  • Events reference a calculatorVersion used when they were generated (optional but recommended).
  • Snapshot stores the latest calculatorVersion.
  • When the algorithm changes, snapshots can be recomputed in background; events remain unchanged.

5. Platform Rating Definitions

5.1 Driving rating (clean + fast + reliability)

Driving rating is the platform's main driver identity rating (as described in GridPilot Rating).

It is derived from ledger events sourced from race facts:

  • Finishing position vs field strength (fast driving component)
  • Incidents and penalty involvement (clean driving component)
  • Attendance and reliability (DNS/DNF/DSQ/AFK)

5.1.1 Driver status inputs

We must explicitly model:

  • AFK
  • DNS (did not start)
  • DNF (did not finish)
  • DSQ (disqualified)

These should become explicit event types, not hidden inside one “performance score”.

5.1.2 Driving event taxonomy (initial)

Examples of ledger event reason codes (illustrative; final list is a product decision):

Performance:

  • DRIVING_FINISH_STRENGTH_GAIN
  • DRIVING_POSITIONS_GAINED_BONUS
  • DRIVING_PACE_RELATIVE_GAIN (optional)

Clean driving:

  • DRIVING_INCIDENTS_PENALTY
  • DRIVING_MAJOR_CONTACT_PENALTY (if severity exists)
  • DRIVING_PENALTY_INVOLVEMENT_PENALTY

Reliability:

  • DRIVING_DNS_PENALTY
  • DRIVING_DNF_PENALTY
  • DRIVING_DSQ_PENALTY
  • DRIVING_AFK_PENALTY
  • DRIVING_SEASON_ATTENDANCE_BONUS (optional later)

Each event must reference source facts:

  • raceId always for race-derived events
  • penaltyId for steward/admin penalty events
  • additional metadata: start position, finish position, incident count, etc.
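
To make the taxonomy concrete, here is a hedged TypeScript sketch of how a RatingEventFactory might turn one race result into driving event drafts. The reason codes come from the list above; the `RaceResultFact` and `DrivingEventDraft` shapes and every delta magnitude are placeholder assumptions, since the final values are a product decision.

```typescript
// Hypothetical sketch: one race result becomes explicit, separately explained events.
type DriverStatus = "finished" | "afk" | "dns" | "dnf" | "dsq";

interface RaceResultFact {
  userId: string;
  raceId: string;
  status: DriverStatus;
  startPosition: number;
  finishPosition: number;
  incidents: number;
}

interface DrivingEventDraft {
  userId: string;
  dimension: "driving";
  reasonCode: string;
  delta: number;
  source: { sourceType: "race"; sourceId: string };
  details: Record<string, number>;
}

// Placeholder magnitudes (assumption).
const STATUS_PENALTIES: Record<Exclude<DriverStatus, "finished">, { code: string; delta: number }> = {
  afk: { code: "DRIVING_AFK_PENALTY", delta: -4 },
  dns: { code: "DRIVING_DNS_PENALTY", delta: -3 },
  dnf: { code: "DRIVING_DNF_PENALTY", delta: -2 },
  dsq: { code: "DRIVING_DSQ_PENALTY", delta: -6 },
};
const POSITION_GAIN_BONUS = 0.5; // per position gained (assumption)
const INCIDENT_PENALTY = 0.25; // per incident (assumption)

function draftDrivingEvents(fact: RaceResultFact): DrivingEventDraft[] {
  const { status } = fact;
  const draft = (reasonCode: string, delta: number, details: Record<string, number>): DrivingEventDraft => ({
    userId: fact.userId,
    dimension: "driving",
    reasonCode,
    delta,
    source: { sourceType: "race", sourceId: fact.raceId },
    details,
  });

  const drafts: DrivingEventDraft[] = [];
  if (status !== "finished") {
    // Reliability: each non-finish status gets its own reason code.
    const penalty = STATUS_PENALTIES[status];
    drafts.push(draft(penalty.code, penalty.delta, {}));
  } else if (fact.finishPosition < fact.startPosition) {
    // Performance: bonus only for finishers who gained positions.
    const gained = fact.startPosition - fact.finishPosition;
    drafts.push(
      draft("DRIVING_POSITIONS_GAINED_BONUS", gained * POSITION_GAIN_BONUS, {
        startPosition: fact.startPosition,
        finishPosition: fact.finishPosition,
      }),
    );
  }
  if (fact.incidents > 0) {
    // Clean driving: incidents are penalized regardless of finish status.
    drafts.push(draft("DRIVING_INCIDENTS_PENALTY", -fact.incidents * INCIDENT_PENALTY, { incidentCount: fact.incidents }));
  }
  return drafts;
}
```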

5.1.3 Field strength support

Driving performance should consider strength of field similar to the existing value object StrengthOfField and its service pattern in StrengthOfFieldCalculator.

Concept: the driving calculator receives:

  • driver finish data
  • field rating inputs (which can be platform driving snapshot values or external iRating for SoF only, depending on product choice)

Given the earlier decision “platform rating does not use external ratings”, we can still compute SoF using:

  • platform driving snapshot values (for users with sufficient data), and/or
  • a neutral default for new users, so that external ratings never become an input to the driving rating itself.

(If SoF must use iRating for accuracy, it still does not violate “independent” as long as SoF is a race context signal and not a direct driver rating input. This is a design choice to confirm later.)
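
The platform-only option could look roughly like the TypeScript sketch below. It is not the existing StrengthOfField implementation: the plain average, the neutral default of 50, and the minimum sample size of 3 are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch: strength of field from platform driving snapshots only.
interface FieldEntry {
  userId: string;
  drivingRating?: number; // platform driving snapshot value, if any
  sampleSize: number; // number of rated races behind that value
}

const NEUTRAL_DRIVING_RATING = 50; // assumption: midpoint of the 0..100 scale
const MIN_SAMPLE_SIZE = 3; // assumption: below this, the snapshot is not trusted

// New or sparsely rated drivers count as neutral instead of pulling in external ratings.
function effectiveRating(entry: FieldEntry): number {
  if (entry.drivingRating === undefined || entry.sampleSize < MIN_SAMPLE_SIZE) {
    return NEUTRAL_DRIVING_RATING;
  }
  return entry.drivingRating;
}

// Plain average of effective ratings; returns null for an empty field.
function strengthOfField(field: readonly FieldEntry[]): number | null {
  if (field.length === 0) return null;
  const total = field.reduce((sum, entry) => sum + effectiveRating(entry), 0);
  return total / field.length;
}
```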

5.2 Admin trust rating (hybrid system signals + votes)

Admin trust is separate from driving.

It must include:

  • System-derived actions (timeliness, reversals, consistency, completion of tasks)
  • Driver votes among participants in a league

5.2.1 Voting model (anti-abuse, league-scoped)

Votes are generated within a league, but the rating is global. To avoid abuse:

  • Only eligible voters: drivers who participated in the league (membership + minimum participation threshold).
  • 1 vote per voter per admin per voting window.
  • Voting windows are timeboxed (e.g. weekly/monthly/season-end).
  • Votes have reduced weight if the voter has low trust (optional later).
  • Votes should be explainable: aggregated outcome + distribution; individual votes may be private.

Votes produce ledger events:

  • ADMIN_VOTE_OUTCOME_POSITIVE
  • ADMIN_VOTE_OUTCOME_NEGATIVE

with reference voteSessionId and metadata including:

  • leagueId
  • eligibleVoterCount
  • voteCount
  • percentPositive
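
The voting rules can be sketched as a small in-memory aggregate in TypeScript. The class name `AdminVoteSessionSketch`, its constructor shape, and the 50% threshold for a positive outcome are assumptions; vote weighting and the minimum participation check are left out.

```typescript
// Hypothetical sketch of a vote session: eligibility, dedup, time window, closure.
interface AdminVoteOutcome {
  reasonCode: "ADMIN_VOTE_OUTCOME_POSITIVE" | "ADMIN_VOTE_OUTCOME_NEGATIVE";
  voteSessionId: string;
  leagueId: string;
  eligibleVoterCount: number;
  voteCount: number;
  percentPositive: number;
}

interface VoteSessionConfig {
  voteSessionId: string;
  leagueId: string;
  eligibleVoterIds: readonly string[];
  opensAt: Date;
  closesAt: Date;
}

class AdminVoteSessionSketch {
  private readonly votes = new Map<string, boolean>();
  private readonly eligible: Set<string>;
  private readonly config: VoteSessionConfig;
  private closed = false;

  constructor(config: VoteSessionConfig) {
    this.config = config;
    this.eligible = new Set(config.eligibleVoterIds);
  }

  cast(voterId: string, positive: boolean, at: Date): void {
    if (this.closed) throw new Error("session is closed");
    const time = at.getTime();
    if (time < this.config.opensAt.getTime() || time > this.config.closesAt.getTime()) {
      throw new Error("outside voting window");
    }
    if (!this.eligible.has(voterId)) throw new Error("voter is not eligible");
    if (this.votes.has(voterId)) throw new Error("voter has already voted");
    this.votes.set(voterId, positive);
  }

  // Returns only the aggregate, so individual votes can stay private.
  close(): AdminVoteOutcome | null {
    this.closed = true;
    const voteCount = this.votes.size;
    if (voteCount === 0) return null; // nothing to record in the ledger
    const positives = Array.from(this.votes.values()).filter((v) => v).length;
    const percentPositive = (positives / voteCount) * 100;
    return {
      // Assumption: 50% or more positive votes counts as a positive outcome.
      reasonCode: percentPositive >= 50 ? "ADMIN_VOTE_OUTCOME_POSITIVE" : "ADMIN_VOTE_OUTCOME_NEGATIVE",
      voteSessionId: this.config.voteSessionId,
      leagueId: this.config.leagueId,
      eligibleVoterCount: this.eligible.size,
      voteCount,
      percentPositive,
    };
  }
}
```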

5.2.2 Admin system-signal taxonomy (initial)

Examples:

  • ADMIN_ACTION_SLA_BONUS (responded within SLA)
  • ADMIN_ACTION_REVERSAL_PENALTY (frequent reversals)
  • ADMIN_ACTION_RULE_CLARITY_BONUS (published rules/changes; if tracked)
  • ADMIN_ACTION_ABUSE_REPORT_PENALTY (validated abuse reports)

All of these should be “facts” emitted by admin/competition workflows, not computed in rating domain from raw infra signals.


6. External Game Ratings (Per-Game Profiles)

6.1 Purpose

External ratings exist to:

  • Display on user profiles
  • Be used in eligibility filters

They do not affect platform ratings.

6.2 Data model (conceptual)

ExternalGameRatingProfile per userId + gameKey stores:

  • gameKey: e.g. iracing
  • ratings: map of rating type -> numeric value
    • e.g. iracing.iRating, iracing.safetyRating
  • provenance:
    • source: iracing-api | manual | import
    • lastSyncedAt
    • confidence/verified flag (optional)
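
A minimal TypeScript sketch of this data model, assuming an in-memory store keyed by userId + gameKey. The store class, its method names, and the merge behaviour (new rating types are added, existing ones overwritten, provenance replaced) are illustrative assumptions.

```typescript
// Hypothetical sketch: per-user, per-game external rating profiles.
type GameKey = "iracing"; // future keys (e.g. "acc") would extend this union
type ExternalRatingSource = "iracing-api" | "manual" | "import";

interface ExternalGameRatingProfile {
  userId: string;
  gameKey: GameKey;
  ratings: Record<string, number>; // rating type -> value, e.g. iRating, safetyRating
  provenance: { source: ExternalRatingSource; lastSyncedAt: Date; verified?: boolean };
}

class InMemoryExternalRatingStore {
  private readonly profiles = new Map<string, ExternalGameRatingProfile>();

  // Merges rating values into any existing profile and replaces the provenance.
  upsert(update: ExternalGameRatingProfile): ExternalGameRatingProfile {
    const key = `${update.userId}:${update.gameKey}`;
    const existing = this.profiles.get(key);
    const merged: ExternalGameRatingProfile = {
      ...update,
      ratings: { ...(existing?.ratings ?? {}), ...update.ratings },
    };
    this.profiles.set(key, merged);
    return merged;
  }

  latest(userId: string, gameKey: GameKey): ExternalGameRatingProfile | undefined {
    return this.profiles.get(`${userId}:${gameKey}`);
  }
}
```

Nothing in this store feeds a platform calculator, which matches the rule that external ratings are for display and eligibility only.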

6.3 Read surfaces

Queries should provide:

  • “latest ratings by game”
  • “rating history by game” (optional future)
  • “last sync status”

7. Application Layer (Commands and Queries)

7.1 Command side (write model)

Commands are use-cases that:

  • validate permissions
  • load required domain facts (race outcomes, votes)
  • create rating events
  • append to ledger
  • recompute snapshot(s)
  • persist results

Commands must follow the rules in Use Cases: output goes through a presenter/output port, with no DTO leakage.
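
The command steps above could be wired together roughly as in this TypeScript sketch. It is deliberately simplified: all port names are hypothetical, permission checks are omitted, the ports are synchronous for brevity (real ports would likely return Promises), and the inline clamp-and-sum stands in for a real calculator.

```typescript
// Hypothetical sketch of a command use case: load facts, append, recompute, present.
interface RaceRatingFact {
  userId: string;
  reasonCode: string;
  delta: number;
}
interface LedgerRecord extends RaceRatingFact {
  raceId: string;
}

interface RaceFactsPort {
  loadRatingFacts(raceId: string): RaceRatingFact[];
}
interface RatingLedgerPort {
  append(records: readonly LedgerRecord[]): void;
  findByUser(userId: string): LedgerRecord[];
}
interface SnapshotPort {
  save(userId: string, driving: number): void;
}
interface RecordRaceRatingEventsResult {
  raceId: string;
  eventCount: number;
  userIds: string[];
}
interface RecordRaceRatingEventsOutputPort {
  present(result: RecordRaceRatingEventsResult): void;
}

class RecordRaceRatingEventsUseCaseSketch {
  private readonly facts: RaceFactsPort;
  private readonly ledger: RatingLedgerPort;
  private readonly snapshots: SnapshotPort;
  private readonly output: RecordRaceRatingEventsOutputPort;

  constructor(
    facts: RaceFactsPort,
    ledger: RatingLedgerPort,
    snapshots: SnapshotPort,
    output: RecordRaceRatingEventsOutputPort,
  ) {
    this.facts = facts;
    this.ledger = ledger;
    this.snapshots = snapshots;
    this.output = output;
  }

  execute(raceId: string): void {
    // 1. Load domain facts and turn them into ledger records.
    const records = this.facts.loadRatingFacts(raceId).map((fact) => ({ ...fact, raceId }));
    // 2. Append to the ledger before touching any snapshot.
    this.ledger.append(records);
    // 3. Recompute each affected user's snapshot from the full ledger.
    const userIds = Array.from(new Set(records.map((record) => record.userId)));
    for (const userId of userIds) {
      const driving = this.ledger
        .findByUser(userId)
        .reduce((value, record) => Math.min(100, Math.max(0, value + record.delta)), 50);
      this.snapshots.save(userId, driving);
    }
    // 4. Report through the output port instead of returning a DTO.
    this.output.present({ raceId, eventCount: records.length, userIds });
  }
}
```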

7.1.1 Command use cases (proposed)

Driving:

  • RecordRaceRatingEventsUseCase
    • Input: raceId
    • Loads race results (positions, incidents, statuses)
    • Produces ledger events for driving
  • ApplyPenaltyRatingEventUseCase
    • Input: penaltyId
    • Produces event(s) affecting driving and/or fairness dimension

Admin trust:

  • OpenAdminVoteSessionUseCase
  • CastAdminVoteUseCase
  • CloseAdminVoteSessionUseCase
    • On close: create ledger event(s) from aggregated vote outcome
  • RecordAdminActionRatingEventUseCase
    • Called by admin workflows to translate system events into rating events

Snapshots:

  • RecomputeUserRatingSnapshotUseCase
    • Input: userId (or batch)
    • Replays ledger events through calculator to update snapshot

External ratings:

  • UpsertExternalGameRatingUseCase
    • Input: userId, gameKey, rating values, provenance

7.2 Query side (read model)

Queries must be pragmatic per CQRS Light, and should not use domain entities.

7.2.1 Query use cases (proposed)

User-facing:

  • GetUserRatingsSummaryQuery
    • returns current platform snapshot values + external game ratings + last updated timestamps
  • GetUserRatingLedgerQuery
    • returns paginated ledger events, filterable by dimension, date range, reason code
  • GetUserRatingChangeExplanationQuery
    • returns a “why” view for a time window (e.g. last race), pre-grouped by race/vote/penalty

League-facing:

  • GetLeagueEligibilityPreviewQuery
    • evaluates candidate eligibility for a league filter and returns explanation (which condition failed)

Leaderboards:

  • GetTopDrivingRatingsQuery
  • GetTopAdminTrustQuery
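
For the “why” view returned by GetUserRatingChangeExplanationQuery, the pre-grouping step might look like the TypeScript sketch below. It works on flat read-model rows rather than domain entities; the `LedgerRow` and `ChangeGroup` shapes and the `groupBySource` name are assumptions.

```typescript
// Hypothetical sketch: group flat ledger rows by their source for the "why" view.
interface LedgerRow {
  sourceType: string; // race | penalty | vote | ...
  sourceId: string;
  reasonCode: string;
  delta: number;
}

interface ChangeGroup {
  sourceType: string;
  sourceId: string;
  totalDelta: number;
  entries: LedgerRow[];
}

function groupBySource(rows: readonly LedgerRow[]): ChangeGroup[] {
  const groups = new Map<string, ChangeGroup>();
  for (const row of rows) {
    const key = `${row.sourceType}:${row.sourceId}`;
    let group = groups.get(key);
    if (!group) {
      group = { sourceType: row.sourceType, sourceId: row.sourceId, totalDelta: 0, entries: [] };
      groups.set(key, group);
    }
    group.entries.push(row);
    group.totalDelta += row.delta;
  }
  // Map preserves insertion order, so groups appear in the order their first row was seen.
  return Array.from(groups.values());
}
```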

8. Eligibility Filters (Leagues)

8.1 Requirements

Leagues can define eligibility filters against:

  • Platform driving rating (and future dimensions)
  • External per-game ratings (e.g. iRating threshold)

Eligibility decisions should be explainable (audit trail and UI explanation).

8.2 Filter DSL (typed, explainable)

Define a small filter language that supports:

  • target:

    • platform.driving
    • platform.adminTrust
    • external.iracing.iRating
    • external.iracing.safetyRating
  • operators:

    • >=, >, <=, <, between, exists
  • composition:

    • and, or

Each evaluation returns:

  • eligible: boolean
  • reasons: a list of entries, each with:
    • target
    • operator
    • expected
    • actual
    • pass/fail

This makes it UI-transparent and debuggable.
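
A compact TypeScript sketch of such a filter language and its evaluator follows. The targets and operators mirror the lists above; the type names, the `evaluateFilter` function, and the choice to report every condition (including those inside an `or`) are assumptions for illustration.

```typescript
// Hypothetical sketch: typed filter DSL with an explainable evaluation result.
type RatingTarget =
  | "platform.driving"
  | "platform.adminTrust"
  | "external.iracing.iRating"
  | "external.iracing.safetyRating";

type Condition =
  | { target: RatingTarget; operator: ">=" | ">" | "<=" | "<"; expected: number }
  | { target: RatingTarget; operator: "between"; expected: [number, number] }
  | { target: RatingTarget; operator: "exists" };

type EligibilityFilter = Condition | { and: EligibilityFilter[] } | { or: EligibilityFilter[] };

interface EligibilityReason {
  target: RatingTarget;
  operator: Condition["operator"];
  expected: number | [number, number] | null;
  actual: number | null;
  pass: boolean;
}
interface EligibilityResult {
  eligible: boolean;
  reasons: EligibilityReason[];
}
type KnownRatings = Partial<Record<RatingTarget, number>>;

function passes(condition: Condition, actual: number | undefined): boolean {
  if (condition.operator === "exists") return actual !== undefined;
  if (actual === undefined) return false; // a missing rating fails every comparison
  if (condition.operator === "between") {
    return actual >= condition.expected[0] && actual <= condition.expected[1]; // inclusive bounds
  }
  switch (condition.operator) {
    case ">=": return actual >= condition.expected;
    case ">": return actual > condition.expected;
    case "<=": return actual <= condition.expected;
    default: return actual < condition.expected;
  }
}

function combine(parts: EligibilityResult[], mode: "and" | "or"): EligibilityResult {
  const eligible = mode === "and" ? parts.every((p) => p.eligible) : parts.some((p) => p.eligible);
  const reasons = parts.reduce<EligibilityReason[]>((all, p) => all.concat(p.reasons), []);
  return { eligible, reasons };
}

function evaluateFilter(filter: EligibilityFilter, ratings: KnownRatings): EligibilityResult {
  if ("and" in filter) return combine(filter.and.map((child) => evaluateFilter(child, ratings)), "and");
  if ("or" in filter) return combine(filter.or.map((child) => evaluateFilter(child, ratings)), "or");
  const actual = ratings[filter.target];
  const pass = passes(filter, actual);
  const expected = "expected" in filter ? filter.expected : null;
  return {
    eligible: pass,
    reasons: [{ target: filter.target, operator: filter.operator, expected, actual: actual ?? null, pass }],
  };
}
```

Each reason carries enough data for the UI to render a sentence such as “iRating 2200 is below required 2500” without re-evaluating the filter.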


9. Website / UI Transparency Contract

Per View Models, UI should consume view models built from query DTOs.

9.1 “Ratings” surfaces (suggested)

  • User profile:
    • Platform driving rating + trend + confidence
    • Admin trust rating (if relevant)
    • External game ratings section (iRating/SR)
  • “Why did my rating change?” page:
    • Ledger list with grouping by race/vote/penalty
    • Each entry: delta, reason, context (race link), and explanation
  • League eligibility panel:
    • Filter configured + explanation of pass/fail for a given user
    • Should be able to show: “iRating 2200 is below required 2500” and/or “driving 61 is above required 55”

10. Event Flow Examples

10.1 Race completion updates driving rating

Triggered today by CompleteRaceUseCaseWithRatings.execute() which calls RatingUpdateService.updateDriverRatingsAfterRace().

Target flow (conceptually):

flowchart LR
  RaceCompleted[Race completed]
  Cmd[RecordRaceRatingEventsUseCase]
  Ledger[Append rating events]
  Calc[DrivingRatingCalculator]
  Snap[Persist snapshot]
  Query[GetUserRatingLedgerQuery]
  UI[Profile and Why view]

  RaceCompleted --> Cmd
  Cmd --> Ledger
  Cmd --> Calc
  Calc --> Snap
  Snap --> Query
  Ledger --> Query
  Query --> UI

10.2 Admin vote updates admin trust

flowchart LR
  Open[OpenAdminVoteSessionUseCase]
  Cast[CastAdminVoteUseCase]
  Close[CloseAdminVoteSessionUseCase]
  Ledger[Append vote outcome event]
  Calc[AdminTrustRatingCalculator]
  Snap[Persist snapshot]
  UI[Admin trust breakdown]

  Open --> Cast
  Cast --> Close
  Close --> Ledger
  Close --> Calc
  Calc --> Snap
  Snap --> UI
  Ledger --> UI

11. Maintainability Notes

11.1 Keep calculators pure

All rating computations should be pure functions of:

  • Events
  • Inputs (like race facts)
  • Current snapshot (optional)

No repositories, no IO.

11.2 Stable reason codes

Reason codes must be stable to support:

  • filtering
  • analytics
  • translations
  • consistent UI explanation

11.3 Explicit extendability

Adding stewardTrust later should follow the same template:

  • Add event taxonomy
  • Add calculator
  • Add ledger reasons
  • Add snapshot dimension
  • Add queries and UI

No architecture changes.


12. Fit with existing UserRating and RatingUpdateService

12.1 Current state

This preserves the public API while improving transparency and extensibility.


13. Open Decisions (to confirm before implementation)

  1. Strength of Field inputs:

    • Should SoF use platform driving snapshots only, or may it use external iRating as a contextual “field difficulty” signal while still keeping platform ratings independent?
  2. Scale:

    • Keep 0..100 scale for platform ratings (consistent with UserRating)?
  3. Privacy:

    • Which admin trust vote details are public (aggregates only) vs private (individual votes)?
  4. Penalty integration:

    • Which penalties affect driving vs admin trust, and how do we ensure moderation-sensitive info can be redacted while keeping rating transparency?

14. Next Step: Implementation Planning Checklist

Implementation should proceed in small vertical slices:

  • Ledger persistence + query read models
  • Driving rating events from race completion including DNS/DNF/DSQ/AFK
  • Admin vote sessions and rating events
  • Eligibility filter DSL + evaluation query

Each slice should follow the project's CQRS Light patterns described in CQRS Light.