Key Takeaway
A structured ADR for LLM provider selection creates a referenceable decision record that prevents re-litigation of the choice and documents the trade-offs accepted.
When to Use This Template
Use this ADR when your team is selecting an LLM provider for a new product feature, migrating between providers, or evaluating whether to add a secondary provider for redundancy. The template is designed for decisions that involve commercial API providers (Anthropic, OpenAI, Google, Cohere, Mistral), self-hosted open-weight models, or hybrid approaches. It is particularly valuable when multiple stakeholders have opinions about provider choice, because it forces structured evaluation rather than advocacy-based decisions.
ADR Template
# ADR: LLM Provider Selection
## Status
[Proposed | Accepted | Deprecated | Superseded by ADR-XXX]
## Date
YYYY-MM-DD
## Decision Makers
- [Name, Role — e.g., "Jane Smith, Engineering Director"]
- [Name, Role]
## Context
### Business Requirements
- Primary use case: [e.g., customer-facing conversational AI, internal document processing]
- Quality bar: [e.g., "must match or exceed current human agent accuracy"]
- Latency SLA: [e.g., "first token within 500ms, complete response within 5s"]
- Throughput: [e.g., "sustained 100 requests/minute, peak 500 requests/minute"]
- Data sensitivity: [e.g., "PII present, must not be used for training"]
### Technical Constraints
- Integration requirements: [existing tech stack, SDK availability]
- Compliance requirements: [SOC 2, HIPAA, GDPR, data residency]
- Budget envelope: [monthly budget ceiling for inference costs]
## Options Considered
| Criterion | Provider A | Provider B | Provider C (Self-Hosted) |
|-----------|-----------|-----------|------------------------|
| Model capability | | | |
| Input pricing (per 1M tokens) | | | |
| Output pricing (per 1M tokens) | | | |
| Latency (p50 / p99) | | | |
| Rate limits | | | |
| Data handling policy | | | |
| Fine-tuning support | | | |
| SDK maturity | | | |
| SLA guarantee | | | |
| Vendor lock-in risk | | | |
## Evaluation Criteria (Weighted)
1. **Capability fit** (30%) — Does the model meet quality requirements?
2. **Total cost of ownership** (25%) — Inference + integration + maintenance
3. **Data handling** (20%) — Privacy, compliance, training data usage
4. **Operational reliability** (15%) — Uptime SLA, rate limits, support
5. **Migration flexibility** (10%) — Ease of switching if needed
## Decision
We will use [Provider] because [rationale tied to weighted criteria].
### Migration Path
[If switching providers: rollout plan, prompt adaptation work, testing approach]
## Consequences
### Accepted Trade-offs
- [e.g., "Higher per-token cost in exchange for better quality on our evaluation set"]
- [e.g., "Vendor lock-in on function calling format"]
### Risks
- [e.g., "Single provider dependency — mitigated by abstraction layer"]
### Action Items
- [ ] Implement provider abstraction layer
- [ ] Set up cost monitoring and alerts
- [ ] Establish prompt regression test suite
- [ ] Document fallback procedure
## Review Trigger
Re-evaluate this decision when:
- [ ] Monthly inference cost exceeds $[threshold]
- [ ] A new model generation is released by any evaluated provider
- [ ] Quality metrics drop below [threshold] for [duration]
- [ ] Contract renewal approaches (review 90 days prior)
Section-by-Section Guidance
Status and Decision Makers
Always list decision makers by name and role. This creates accountability and makes it clear who should be consulted if the decision needs revisiting. The status field should be updated as the ADR moves through its lifecycle. Mark it as Proposed during review, Accepted once approved, Deprecated if the decision no longer applies, and Superseded if a newer ADR replaces it.
Context Section
The context section is the most important part of the ADR because it documents why this decision was necessary at this point in time. Be specific about requirements. Avoid vague statements like 'we need good performance' in favor of measurable criteria like 'p99 latency under 3 seconds for responses up to 500 tokens.' Include budget constraints even if they feel uncomfortable to document, because cost is almost always a binding constraint in LLM selection.
Options and Evaluation
The comparison table should include at least three options, even if one is clearly leading. Including a self-hosted option forces the team to articulate why managed APIs are worth the premium, or reveals that self-hosting is viable. Weight the evaluation criteria before filling in the comparison table to prevent post-hoc rationalization. If your team cannot agree on weights, that disagreement itself is valuable to surface before making the decision.
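Once weights are agreed and the table is filled in, the decision reduces to a weighted sum per option. A minimal Python sketch of that calculation, where the per-criterion 1-5 scores below are illustrative placeholders rather than real provider data:

```python
# Weights mirror the evaluation criteria in the template (must sum to 1.0).
WEIGHTS = {
    "capability_fit": 0.30,
    "total_cost_of_ownership": 0.25,
    "data_handling": 0.20,
    "operational_reliability": 0.15,
    "migration_flexibility": 0.10,
}

# Hypothetical 1-5 scores per option; replace with your team's assessments.
scores = {
    "provider_a": {"capability_fit": 5, "total_cost_of_ownership": 3,
                   "data_handling": 4, "operational_reliability": 4,
                   "migration_flexibility": 2},
    "provider_b": {"capability_fit": 4, "total_cost_of_ownership": 4,
                   "data_handling": 4, "operational_reliability": 3,
                   "migration_flexibility": 3},
    "self_hosted": {"capability_fit": 3, "total_cost_of_ownership": 2,
                    "data_handling": 5, "operational_reliability": 2,
                    "migration_flexibility": 5},
}

def weighted_total(option_scores: dict) -> float:
    """Sum of score * weight across all criteria."""
    return sum(option_scores[c] * w for c, w in WEIGHTS.items())

# Rank options by weighted total, highest first.
ranking = sorted(scores, key=lambda o: weighted_total(scores[o]), reverse=True)
for option in ranking:
    print(f"{option}: {weighted_total(scores[option]):.2f}")
```

Keeping the weights in a separate structure, agreed before scoring begins, is what makes post-hoc rationalization visible: changing a weight after the scores are in shows up in the diff.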
Consequences and Review Triggers
Document accepted trade-offs explicitly. Every provider choice involves trade-offs, and writing them down prevents future debates about whether the team 'considered' a particular concern. Review triggers are concrete conditions that should prompt re-evaluation. Avoid vague triggers like 'when things change' in favor of specific thresholds tied to cost, quality metrics, or calendar dates.
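Review triggers of this kind can be codified as a periodic automated check rather than relying on someone remembering the ADR. A minimal sketch under assumed thresholds (the ceiling, floor, and field names here are illustrative, not part of the template):

```python
from datetime import date

def should_review(monthly_cost: float, quality_score: float,
                  contract_renewal: date, today: date,
                  cost_ceiling: float = 5000.0,
                  quality_floor: float = 0.90) -> list:
    """Return the list of triggered review conditions; empty means no review needed."""
    triggered = []
    if monthly_cost > cost_ceiling:
        triggered.append("monthly cost exceeds ceiling")
    if quality_score < quality_floor:
        triggered.append("quality below floor")
    if (contract_renewal - today).days <= 90:
        triggered.append("contract renewal within 90 days")
    return triggered
```

Wiring a check like this into an existing cost-monitoring job turns the ADR's review section from documentation into an alert.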
Build a provider abstraction layer before you need it. The most painful LLM provider migrations happen when prompt formats, function calling schemas, and response parsing are tightly coupled to a single provider's API. Even a thin abstraction layer makes switching providers a configuration change rather than a rewrite.
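The abstraction layer can be as thin as one interface plus one adapter per vendor. A minimal Python sketch, where the class names, request shape, and stub provider are hypothetical (real SDKs from Anthropic, OpenAI, and others differ in message format and parameters):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class CompletionResult:
    """Provider-neutral response: text plus token counts for cost tracking."""
    text: str
    input_tokens: int
    output_tokens: int

class LLMProvider(ABC):
    """The interface application code depends on; one adapter subclass per vendor."""

    @abstractmethod
    def complete(self, system: str, user: str, max_tokens: int = 1024) -> CompletionResult:
        ...

class FakeProvider(LLMProvider):
    """Stand-in implementation for tests and local development."""

    def complete(self, system: str, user: str, max_tokens: int = 1024) -> CompletionResult:
        reply = f"[stub reply to: {user[:40]}]"
        return CompletionResult(
            text=reply,
            input_tokens=len(system.split()) + len(user.split()),
            output_tokens=len(reply.split()),
        )

# Application code never touches a vendor SDK directly, so switching
# providers means adding one adapter class and changing configuration.
def summarize(provider: LLMProvider, document: str) -> str:
    return provider.complete("Summarize the following text.", document).text
```

The normalized `CompletionResult` is the important design choice: because token counts and text come back in one shape, cost monitoring and regression tests work unchanged across vendors.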
Do not rely solely on public benchmarks for capability evaluation. Run your own evaluation suite against your actual use cases with representative data. A model that leads on general benchmarks may underperform on your specific domain or task structure.
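A domain evaluation suite does not need to be elaborate to be useful. A minimal sketch with illustrative gold cases and an exact-match metric (real suites typically use representative production data and richer scoring such as rubrics or model-graded checks):

```python
# Hypothetical gold cases drawn from your own use case.
EVAL_CASES = [
    {"prompt": "Classify sentiment: 'The refund arrived quickly.'", "expected": "positive"},
    {"prompt": "Classify sentiment: 'Support never replied.'", "expected": "negative"},
]

def evaluate(model_fn, cases) -> float:
    """Fraction of cases where the model's output exactly matches the expected label."""
    hits = sum(
        1 for c in cases
        if model_fn(c["prompt"]).strip().lower() == c["expected"]
    )
    return hits / len(cases)

# Example: a stub model that always answers "positive" gets half of these cases.
always_positive = lambda prompt: "positive"
print(f"accuracy: {evaluate(always_positive, EVAL_CASES):.2f}")  # prints "accuracy: 0.50"
```

Running the same suite against each candidate, via the abstraction layer's interface, produces the capability-fit scores for the comparison table from your data rather than from public leaderboards.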
Version History
1.0.0 · 2026-03-01
- Initial ADR template for LLM provider selection