
How routing works end to end

The full reference routing flow from request input to decision and observability output.

The reference router in role-model-router/packages/core/src/router.ts is a clear implementation of the protocol's routing model.

End-to-end flow

  1. Routing input: request plus candidates, role definitions, task definitions, and role bindings.
  2. Build policy snapshot: normalize strategy, locality, and effective capability requirements.
  3. Evaluate eligibility: apply hard checks for status, policy, role/task compatibility, capabilities, modalities, tools, context, and budget.
  4. Compare and score: score eligible candidates across quality, latency, throughput, cost, reliability, and preference.
  5. Apply tie-breaks: resolve close scores with quality, latency, reliability, then stable endpoint ID.
  6. RouterDecision: emit policy snapshot, eligibility, scored candidates, chosen endpoint, fallbacks, and reason codes.
  7. Fallback ordering: remaining eligible candidates in ranked order.
  8. Observability outputs: trace and usage artifacts for execution.
  9. Future profile updates: new measurements that inform later routing.

The reference router turns a request plus protocol context into an explainable decision and ordered fallbacks.

Step 1: Normalize request intent into policy

The router first computes:

  • the effective compute preference
  • the effective required capabilities
  • the effective preferred capabilities
  • a canonical policy strategy

This becomes the policy_snapshot embedded in the final decision.
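A minimal sketch of this normalization step, assuming illustrative type and field names (`RoutingRequest`, `buildPolicySnapshot`, a `"balanced"` default strategy, and so on are assumptions, not the protocol's actual schema):

```typescript
// Sketch of Step 1: normalize request intent into a policy snapshot.
// All names here are illustrative, not the reference router's real types.
type ComputePreference = "local" | "remote" | "any";

interface RoutingRequest {
  computePreference?: ComputePreference;
  requiredCapabilities?: string[];
  preferredCapabilities?: string[];
}

interface RoleDefinition {
  requiredCapabilities?: string[];
  preferredCapabilities?: string[];
}

interface PolicySnapshot {
  strategy: string;
  computePreference: ComputePreference;
  requiredCapabilities: string[];
  preferredCapabilities: string[];
}

export function buildPolicySnapshot(req: RoutingRequest, role?: RoleDefinition): PolicySnapshot {
  // Effective requirements are the union of request-level and role-level lists,
  // deduplicated and sorted so later checks see a canonical form.
  const union = (a: string[] = [], b: string[] = []) => [...new Set([...a, ...b])].sort();
  return {
    strategy: "balanced", // assumed canonical default when the request names no strategy
    computePreference: req.computePreference ?? "any",
    requiredCapabilities: union(req.requiredCapabilities, role?.requiredCapabilities),
    preferredCapabilities: union(req.preferredCapabilities, role?.preferredCapabilities),
  };
}
```

The snapshot is computed once, up front, so every later phase reasons against the same frozen view of intent.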

Step 1b: Narrow to role-eligible model-serving endpoints

If the request names a role, the router first narrows the comparison set to endpoints where that role is active and compatible.

Those endpoints may represent:

  • different models
  • or multiple endpoints serving the same model

At this stage, the router is not choosing a bare model name. It is constructing the set of concrete model-serving endpoints that are allowed to compete.
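The narrowing can be sketched as a filter over role bindings; the `Endpoint` and `RoleBinding` shapes below are assumptions for illustration:

```typescript
// Sketch of Step 1b: keep only endpoints where the requested role is
// actively bound. Two endpoints may share a modelId and still compete
// separately. Field names are illustrative.
interface Endpoint {
  endpointId: string;
  modelId: string;
}

interface RoleBinding {
  endpointId: string;
  role: string;
  status: "active" | "disabled";
}

export function narrowToRole(candidates: Endpoint[], bindings: RoleBinding[], role?: string): Endpoint[] {
  if (!role) return candidates; // no role named: every candidate stays in play
  const active = new Set(
    bindings.filter((b) => b.role === role && b.status === "active").map((b) => b.endpointId),
  );
  return candidates.filter((c) => active.has(c.endpointId));
}
```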

Step 2: Evaluate eligibility

Every candidate is checked for:

  • status
  • policy denies and allow lists
  • local/remote restrictions
  • role-binding status
  • task support and role allowance
  • capability and modality compatibility
  • context window sufficiency
  • tool support
  • budget compatibility

This phase produces the eligibility array and the set of still-eligible candidates.
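A condensed sketch of the gate-and-record pattern, covering a few of the checks above; the candidate fields, threshold names, and reason-code strings are assumptions, not the reference router's actual codes:

```typescript
// Sketch of Step 2: hard eligibility gates. Each failed check records a
// reason code; any failure removes the candidate, but all reasons are kept
// so the decision stays explainable. Names are illustrative.
interface Candidate {
  endpointId: string;
  status: "ready" | "down";
  capabilities: string[];
  contextWindow: number;
  costPerToken: number;
}

interface EligibilityOutcome {
  endpointId: string;
  eligible: boolean;
  reasons: string[]; // one reason code per failed gate
}

export function evaluateEligibility(
  candidates: Candidate[],
  required: { capabilities: string[]; minContext: number; maxCostPerToken: number },
): { outcomes: EligibilityOutcome[]; eligible: Candidate[] } {
  const outcomes = candidates.map((c) => {
    const reasons: string[] = [];
    if (c.status !== "ready") reasons.push("STATUS_NOT_READY");
    for (const cap of required.capabilities) {
      if (!c.capabilities.includes(cap)) reasons.push(`MISSING_CAPABILITY:${cap}`);
    }
    if (c.contextWindow < required.minContext) reasons.push("CONTEXT_TOO_SMALL");
    if (c.costPerToken > required.maxCostPerToken) reasons.push("OVER_BUDGET");
    return { endpointId: c.endpointId, eligible: reasons.length === 0, reasons };
  });
  const eligibleIds = new Set(outcomes.filter((o) => o.eligible).map((o) => o.endpointId));
  return { outcomes, eligible: candidates.filter((c) => eligibleIds.has(c.endpointId)) };
}
```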

Step 3: Compute metric scores

Eligible candidates receive per-metric scores for:

  • quality
  • latency
  • throughput
  • cost
  • reliability
  • preference

Scoring compares eligible endpoints, not abstract model families. That matters because the same model may be available through multiple endpoints with different observed performance and policy implications.

Measured evidence is used when present; neutral defaults are used when it is absent.
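Two of the per-metric scorers can be sketched as follows; the normalization formulas, the 0.5 neutral default, and the field names are illustrative assumptions, not the reference implementation's actual math:

```typescript
// Sketch of Step 3: per-metric scoring over eligible endpoints. Measured
// evidence maps into [0, 1]; a neutral 0.5 stands in when no evidence
// exists. The scaling below is illustrative only.
interface MetricEvidence {
  latencyMs?: number;   // measured latency for this endpoint, when observed
  successRate?: number; // observed success rate in [0, 1], when available
}

const NEUTRAL = 0.5; // assumed neutral default for missing evidence

export function latencyScore(e: MetricEvidence, worstMs = 5000): number {
  if (e.latencyMs === undefined) return NEUTRAL; // no evidence: neutral default
  return Math.max(0, 1 - e.latencyMs / worstMs); // lower latency scores higher
}

export function reliabilityScore(e: MetricEvidence): number {
  return e.successRate ?? NEUTRAL;
}
```

Because evidence is attached per endpoint, two endpoints serving the same model can legitimately receive different latency and reliability scores.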

Step 4: Redistribute missing-metric weight

If an entire metric is unknown for all eligible candidates, the reference router removes that metric's weight and redistributes it proportionally across the remaining metrics. This keeps scoring from being dominated by evidence that does not exist.
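The redistribution is proportional: each surviving metric's weight is scaled up so the kept weights still sum to the original total. A sketch, with illustrative names:

```typescript
// Sketch of Step 4: drop metrics with no evidence across all eligible
// candidates and redistribute their weight proportionally across the rest.
// For example, weights {quality: 0.4, latency: 0.4, cost: 0.2} with cost
// unknown everywhere become {quality: 0.5, latency: 0.5}.
export function redistributeWeights(
  weights: Record<string, number>,
  knownMetrics: Set<string>,
): Record<string, number> {
  const kept = Object.entries(weights).filter(([metric]) => knownMetrics.has(metric));
  const keptTotal = kept.reduce((sum, [, w]) => sum + w, 0);
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  const out: Record<string, number> = {};
  // Scale each surviving weight so the kept weights sum to the original total.
  for (const [metric, w] of kept) out[metric] = (w / keptTotal) * total;
  return out;
}
```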

Step 5: Score and annotate candidates

Each eligible candidate gets:

  • a numeric score
  • selection-reason annotations such as MEASURED_PROFILE_USED or ROLE_PREFERENCE_APPLIED
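The combination of weighted scores and reason annotations might look like the sketch below. `MEASURED_PROFILE_USED` comes from the text; the counterpart code `DEFAULT_PROFILE_USED` and everything else here is an illustrative assumption:

```typescript
// Sketch of Step 5: fold weighted metric scores into one number and record
// why the score looks the way it does. Names are illustrative.
interface ScoredCandidate {
  endpointId: string;
  score: number;
  selectionReasons: string[];
}

export function scoreCandidate(
  endpointId: string,
  metricScores: Record<string, number>,
  weights: Record<string, number>,
  hasMeasuredProfile: boolean,
): ScoredCandidate {
  const score = Object.entries(weights).reduce(
    (sum, [metric, w]) => sum + w * (metricScores[metric] ?? 0.5), // 0.5 = assumed neutral default
    0,
  );
  const selectionReasons = hasMeasuredProfile
    ? ["MEASURED_PROFILE_USED"]
    : ["DEFAULT_PROFILE_USED"]; // hypothetical counterpart code
  return { endpointId, score, selectionReasons };
}
```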

Step 6: Sort and tie-break

The router sorts by total score, but if two candidates are within SCORE_TIE_EPSILON = 0.01, it breaks ties by:

  1. higher quality score
  2. lower effective latency
  3. higher reliability score
  4. lexicographic order of the stable endpoint_id
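The epsilon tie-break and cascade above can be expressed as a comparator; `SCORE_TIE_EPSILON = 0.01` is from the text, while the field names are illustrative:

```typescript
// Sketch of Step 6: sort by total score, then break near-ties (within
// SCORE_TIE_EPSILON) by quality, latency, reliability, and finally the
// stable endpoint ID so ordering is deterministic.
const SCORE_TIE_EPSILON = 0.01;

interface Ranked {
  endpointId: string;
  score: number;
  qualityScore: number;
  effectiveLatencyMs: number;
  reliabilityScore: number;
}

export function compareCandidates(a: Ranked, b: Ranked): number {
  // Scores that differ by more than epsilon decide the order outright.
  if (Math.abs(a.score - b.score) > SCORE_TIE_EPSILON) return b.score - a.score;
  if (a.qualityScore !== b.qualityScore) return b.qualityScore - a.qualityScore;
  if (a.effectiveLatencyMs !== b.effectiveLatencyMs) return a.effectiveLatencyMs - b.effectiveLatencyMs;
  if (a.reliabilityScore !== b.reliabilityScore) return b.reliabilityScore - a.reliabilityScore;
  return a.endpointId < b.endpointId ? -1 : a.endpointId > b.endpointId ? 1 : 0;
}
```

Because the final tie-break is a stable ID comparison, repeated routing of the same request against the same evidence always yields the same order.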

Step 7: Emit the decision

The final RouterDecision contains:

  • the policy snapshot
  • full eligibility outcomes
  • ranked scored candidates
  • the chosen endpoint
  • fallbacks
  • selection reasons
  • evidence flags
  • scoring version
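A reduced sketch of the emission step, showing only how the chosen endpoint and fallback ordering relate to the ranked list; the `Decision` shape and version string are illustrative stand-ins for the full RouterDecision:

```typescript
// Sketch of Step 7: the top-ranked candidate is chosen; the remaining
// eligible candidates, still in ranked order, become the fallbacks.
interface Decision {
  chosenEndpointId: string | null;
  fallbackEndpointIds: string[];
  scoringVersion: string;
}

export function emitDecision(rankedIds: string[], scoringVersion = "example-v1"): Decision {
  const [chosen, ...fallbacks] = rankedIds;
  return {
    chosenEndpointId: chosen ?? null,
    fallbackEndpointIds: fallbacks,
    scoringVersion,
  };
}
```

Carrying the fallbacks inside the decision means an executor can retry down the ranked list without re-running the router.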
