Key Takeaway
By the end of this blueprint you will have a prompt management system with a versioned registry, an approval workflow, A/B testing via consistent hashing, rollback capabilities, and a runtime SDK that applications use to fetch prompts without code deployments.
Prerequisites
- PostgreSQL 15+ for the prompt registry
- Redis for the low-latency prompt cache
- Python 3.11+ or TypeScript 5+ for the SDK
- Basic understanding of content versioning and deployment pipelines
The Problem with Hardcoded Prompts
Hardcoded prompts create three problems as your team scales. First, every prompt change requires a code deployment, meaning your prompt engineer needs to go through the full CI/CD pipeline to test a wording tweak. Second, there is no audit trail — you cannot answer the question 'who changed the customer support prompt last Tuesday and what did it say before?' Third, A/B testing prompt variants requires feature flag infrastructure that most teams bolt on as an afterthought. A centralized prompt management system addresses all three by treating prompts as first-class managed artifacts.
Architecture Overview
The system consists of a prompt registry backed by a PostgreSQL database, a management UI for authoring and reviewing prompt versions, a deployment pipeline that syncs approved versions to a low-latency Redis cache, and an SDK that applications use to fetch prompts at runtime. Traffic splitting for A/B tests is handled at the SDK layer using consistent hashing on user or session identifiers.
Prompt Registry Schema
The registry stores prompts as named entities with multiple versions. Each version has a template (the prompt text with variable placeholders), metadata (author, change description, model compatibility), and a lifecycle status (draft, review, approved, deployed, archived). Only one version per prompt can be in the 'deployed' status at any time, which is the version that production applications receive.
-- Prompt registry schema
CREATE TYPE prompt_status AS ENUM (
    'draft', 'review', 'approved', 'deployed', 'archived'
);

CREATE TABLE prompts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    slug TEXT UNIQUE NOT NULL,            -- e.g., 'customer-support-v2'
    name TEXT NOT NULL,
    description TEXT,
    created_by TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE prompt_versions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    prompt_id UUID REFERENCES prompts(id) ON DELETE CASCADE,
    version INTEGER NOT NULL,
    template TEXT NOT NULL,               -- Prompt text with {{variable}} placeholders
    variables JSONB DEFAULT '[]',         -- Expected variables and their types
    model_hint TEXT,                      -- Recommended model
    status prompt_status DEFAULT 'draft',
    change_note TEXT,
    created_by TEXT NOT NULL,
    reviewed_by TEXT,
    deployed_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ DEFAULT now(),
    UNIQUE (prompt_id, version)
);

-- Only one deployed version per prompt
CREATE UNIQUE INDEX idx_one_deployed
    ON prompt_versions (prompt_id)
    WHERE status = 'deployed';

CREATE TABLE prompt_ab_tests (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    prompt_id UUID REFERENCES prompts(id),
    control_id UUID REFERENCES prompt_versions(id),
    variant_id UUID REFERENCES prompt_versions(id),
    traffic_pct INTEGER DEFAULT 10,       -- Percentage of traffic routed to the variant
    started_at TIMESTAMPTZ DEFAULT now(),
    ended_at TIMESTAMPTZ,
    winner_id UUID REFERENCES prompt_versions(id)
);
Runtime SDK
Applications never hardcode prompts. Instead, they use an SDK that fetches the current deployed version of a prompt by slug, renders the template with the provided variables, and returns the final prompt string. The SDK reads from the Redis cache with a fallback to PostgreSQL, so prompt fetches add sub-millisecond latency in the hot path. For A/B tests, the SDK uses consistent hashing on a provided identifier (user ID or session ID) to deterministically assign users to control or variant groups.
"""Prompt management SDK for runtime prompt fetching."""
from __future__ import annotations

import hashlib
import json
import re
from typing import Any

import redis.asyncio as redis


class PromptClient:
    """SDK for fetching and rendering managed prompts."""

    def __init__(self, redis_url: str, db_fallback=None):
        self._redis = redis.from_url(redis_url)
        self._db = db_fallback
        self._local_cache: dict[str, dict] = {}

    async def get_prompt(
        self,
        slug: str,
        variables: dict[str, Any] | None = None,
        ab_identifier: str | None = None,
    ) -> str:
        """Fetch and render a prompt by slug.

        Args:
            slug: The prompt slug (e.g., 'customer-support').
            variables: Template variables to render.
            ab_identifier: User/session ID for A/B test assignment.

        Returns:
            The rendered prompt string.
        """
        prompt_data = await self._fetch(slug)

        # Check for an active A/B test
        if ab_identifier and prompt_data.get("ab_test"):
            ab = prompt_data["ab_test"]
            bucket = self._hash_to_bucket(ab_identifier)
            if bucket < ab["traffic_pct"]:
                template = ab["variant_template"]
            else:
                template = prompt_data["template"]
        else:
            template = prompt_data["template"]

        return self._render(template, variables or {})

    async def _fetch(self, slug: str) -> dict:
        """Fetch prompt data from cache with DB fallback."""
        # Level 1: in-process cache
        if slug in self._local_cache:
            return self._local_cache[slug]

        # Level 2: Redis cache
        cached = await self._redis.get(f"prompt:{slug}")
        if cached:
            data = json.loads(cached)
            self._local_cache[slug] = data
            return data

        # Level 3: database fallback
        if self._db:
            data = await self._db.fetch_deployed_prompt(slug)
            await self._redis.setex(
                f"prompt:{slug}", 300, json.dumps(data)
            )
            self._local_cache[slug] = data
            return data

        raise ValueError(f"Prompt '{slug}' not found")

    @staticmethod
    def _hash_to_bucket(identifier: str) -> int:
        """Deterministic hash to 0-99 for A/B assignment."""
        h = hashlib.sha256(identifier.encode()).hexdigest()
        return int(h[:8], 16) % 100

    @staticmethod
    def _render(template: str, variables: dict) -> str:
        """Render template variables like {{name}}."""
        def replacer(match):
            key = match.group(1).strip()
            return str(variables.get(key, match.group(0)))

        return re.sub(r"\{\{(.+?)\}\}", replacer, template)
A/B Testing Workflow
1. Create a prompt variant. Author a new version with the proposed changes. Set its status to 'approved' after review.
2. Configure the A/B test. Specify the control (current deployed) and variant versions, and set the traffic percentage (start at 5-10%).
3. Deploy the test. The deployment pipeline pushes both versions to cache with the A/B test configuration.
4. Monitor quality metrics. Track evaluation scores, user feedback, and task completion rates for both groups.
5. Promote or roll back. If the variant wins, promote it to deployed status. If it loses, archive it and end the test.
Use consistent hashing for A/B assignment rather than random sampling. This ensures the same user always sees the same prompt variant within a test, which is essential for measuring longitudinal quality metrics and avoids confusing users with inconsistent behavior.
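The assignment logic can be sketched standalone, mirroring the hashing scheme the SDK uses (`assign_group` is an illustrative helper name, not part of the SDK):

```python
"""Sketch: deterministic A/B group assignment via consistent hashing."""
import hashlib


def assign_group(identifier: str, traffic_pct: int) -> str:
    """Map an identifier to 'variant' or 'control', stably across calls."""
    # Hash the identifier to a stable bucket in 0-99.
    digest = hashlib.sha256(identifier.encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100
    # Identifiers whose bucket falls below the traffic threshold see the variant.
    return "variant" if bucket < traffic_pct else "control"
```

Because the bucket depends only on the identifier, a given user's group is stable for the life of the test, and raising `traffic_pct` only moves users from control to variant, never the reverse.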
CI/CD Integration
Prompt changes can be managed through a Git-based workflow. Store prompt templates as YAML files in a dedicated repository. A CI pipeline validates the template syntax, runs the prompt against an evaluation dataset, and creates a prompt_version record in the registry with status 'review'. Reviewers approve or reject through the management UI or via a PR approval. Approved prompts are deployed to cache by a CD pipeline that also handles cache invalidation across all application instances.
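One way the CI validation step might check a definition file like the one below: confirm that every `{{placeholder}}` used in the template is declared in the `variables` list, and that every declared variable is actually used. A minimal sketch, assuming that convention (`validate_definition` is a hypothetical helper):

```python
"""Sketch: cross-check a prompt template against its declared variables."""
import re


def validate_definition(template: str, variables: list[dict]) -> list[str]:
    """Return a list of validation errors; an empty list means the file is valid."""
    declared = {v["name"] for v in variables}
    # Collect every {{placeholder}} referenced in the template text.
    used = {m.group(1).strip() for m in re.finditer(r"\{\{(.+?)\}\}", template)}
    errors = []
    for name in sorted(used - declared):
        errors.append(f"template uses undeclared variable '{name}'")
    for name in sorted(declared - used):
        errors.append(f"declared variable '{name}' is never used in the template")
    return errors
```

A real pipeline would also check types and `enum` constraints; this only covers the name cross-check.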
# Prompt definition file for CI/CD pipeline
slug: customer-support
name: Customer Support Agent
description: System prompt for the customer-facing support assistant
model_hint: claude-sonnet-4-20250514
template: |
  You are a helpful customer support agent for {{company_name}}.

  Guidelines:
  - Be empathetic and professional
  - Answer questions using only the provided context
  - If you do not know the answer, say so honestly
  - Never make up product features or policies
  - Escalate billing and account deletion requests to a human agent

  Customer tier: {{customer_tier}}

  Context from knowledge base:
  {{retrieved_context}}
variables:
  - name: company_name
    type: string
    required: true
  - name: customer_tier
    type: string
    required: true
    enum: [free, starter, pro, enterprise]
  - name: retrieved_context
    type: string
    required: true
Rollback Strategy
Every deployed prompt version is immutable — you never edit a deployed version. To roll back, you re-deploy the previous version by changing its status back to 'deployed' and archiving the problematic version. The Redis cache is invalidated, and all applications pick up the previous version on their next prompt fetch. This gives you a clear audit trail and the ability to roll back in seconds rather than waiting for a code deployment.
Cache invalidation is the hard part. When you deploy a new prompt version, every application instance holding a local cache copy must refresh. Use a Redis pub/sub channel to broadcast invalidation events, and set a TTL on local caches as a safety net (5 minutes is a reasonable default).
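A sketch of the subscriber side, assuming `redis.asyncio` and an illustrative channel name (`prompt-invalidations`); the pure eviction logic is separated from the subscription loop so it can be tested without a Redis server:

```python
"""Sketch: local-cache invalidation driven by a Redis pub/sub channel."""
import json
from typing import Any

INVALIDATION_CHANNEL = "prompt-invalidations"  # Illustrative channel name


def apply_invalidation(local_cache: dict[str, Any], message: dict) -> None:
    """Evict the slug named in an invalidation message from the local cache."""
    payload = json.loads(message["data"])
    local_cache.pop(payload["slug"], None)


async def listen_for_invalidations(redis_client, local_cache: dict[str, Any]) -> None:
    """Subscribe to deployment events and evict affected slugs as they arrive."""
    pubsub = redis_client.pubsub()
    await pubsub.subscribe(INVALIDATION_CHANNEL)
    async for message in pubsub.listen():
        if message["type"] == "message":
            apply_invalidation(local_cache, message)
```

The CD pipeline would publish `{"slug": "customer-support"}` to the channel after each deploy; instances that miss the message still converge within the local-cache TTL.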
Version History
1.0.0 · 2026-03-01
- Initial publication with prompt registry schema and runtime SDK
- A/B testing workflow with consistent hashing
- CI/CD integration with YAML prompt definitions
- Rollback strategy and cache invalidation patterns