Build a quality-controlled translation workflow

communications · intermediate · emerging

The problem

You're translating newsletters, web updates, and service communications into multiple languages every month. One-off translations are inconsistent: sometimes formal, sometimes casual, with terminology that varies from one document to the next. You need quality control but can't afford professional translation for everything, so your multilingual comms feel disjointed.

The solution

Build a three-stage AI translation workflow: translate with controlled tone of voice and terminology, review for accuracy and cultural appropriateness, then check comprehension (does it retain the original meaning and language level?). Use a structured prompt system to maintain consistency. Save your terminology glossary so 'support worker' is always translated the same way.
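
One way to save the glossary is as a JSON file, paired with a simple check that approved terms actually appear in a translation. This is a minimal sketch: the function names and the file layout (a dictionary keyed by language pair, such as "en-fr") are assumptions for illustration, and the check is a plain substring match, so plurals or inflected forms may trigger false alarms.

```python
import json
from pathlib import Path

def save_glossary(glossary, path):
    """Write the glossary to disk as UTF-8 JSON so accents stay readable."""
    Path(path).write_text(
        json.dumps(glossary, ensure_ascii=False, indent=2), encoding="utf-8"
    )

def load_glossary(path):
    """Read the glossary back; returns a dict keyed by language pair."""
    return json.loads(Path(path).read_text(encoding="utf-8"))

def missing_glossary_terms(source, translation, pair_glossary):
    """Return approved translations that should appear but do not.

    A term is only checked when its English form occurs in the source.
    Matching is a case-insensitive substring test.
    """
    source_lower = source.lower()
    translation_lower = translation.lower()
    return [
        approved
        for english, approved in pair_glossary.items()
        if english.lower() in source_lower
        and approved.lower() not in translation_lower
    ]
```

Running missing_glossary_terms on each translation gives you a cheap, deterministic signal alongside the AI review: an empty list means every glossary term found in the source has its approved translation in the output.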

What you get

A repeatable translation workflow with quality gates. Documents translated consistently in your organisation's voice, with terminology that stays stable across all communications. A record of what was checked and approved. Confidence that translations maintain the meaning and accessibility of the original.
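
The record of what was checked and approved could be as simple as a JSON Lines file with one entry per translation. The sketch below assumes you pass in the report dictionary returned by translate_with_quality_control in the example code; the function names, the approved_by field, and the log format are illustrative choices, not part of any library.

```python
import json
from datetime import datetime, timezone

def append_audit_record(report, log_path, approved_by=None):
    """Append one quality report to a JSON Lines file and return the record."""
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "approved_by": approved_by,  # None until a human signs off
        **report,
    }
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record

def read_audit_log(log_path):
    """Load every record from the log, one JSON object per line."""
    with open(log_path, encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]
```

Appending rather than overwriting means earlier decisions are never lost, and the file can be opened in a spreadsheet tool or filtered with a few lines of Python when someone asks who approved a given translation.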

Before you start

  • Regular need for translations (monthly newsletters, ongoing web content, etc.)
  • Native speakers who can review translations in each language
  • A Claude or OpenAI API key
  • Basic Python skills or willingness to adapt example code
  • A terminology glossary or willingness to build one
  • IMPORTANT: Review your data protection policy before sending content to US-based AI providers. Strip any personally identifiable information (names, addresses, case details) from source text before translation. Only translate generic communications, not individual beneficiary correspondence.
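
As a first line of defence for the data protection point above, a small redaction pass can catch the most mechanical identifiers before any text leaves your systems. This is a sketch under clear limits: the two patterns are illustrative and only cover common email addresses and UK-style phone numbers. They will not find names, addresses, or case details, so a human check of the source text is still needed.

```python
import re

# Illustrative patterns only: common email and UK-style phone formats.
# Names, postal addresses, and case details are NOT detected.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_PATTERN = re.compile(r"(?<!\w)(?:\+44\s?|0)\d(?:[\s-]?\d){8,9}(?!\w)")

def redact_contact_details(text):
    """Replace emails and phone numbers with placeholders.

    Returns the redacted text and a count of what was removed, so you
    can log that redaction happened without logging the details.
    """
    text, email_count = EMAIL_PATTERN.subn("[EMAIL]", text)
    text, phone_count = PHONE_PATTERN.subn("[PHONE]", text)
    return text, {"emails": email_count, "phones": phone_count}
```

If the counts are non-zero for something you believed was a generic communication, treat that as a prompt to re-read the source before translating it.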

When to use this

  • You're translating regularly and consistency matters
  • You need to control tone of voice across languages
  • You want to ensure translations maintain meaning and accessibility
  • You have native speakers available for review but can't afford full professional translation

When not to use this

  • You only translate occasionally - a simple translation recipe is fine
  • You don't have native speakers to review - quality control won't work
  • You're translating legally binding documents - get professional translation
  • Your content changes so much that terminology consistency doesn't matter
  • You're translating safeguarding instructions, emergency procedures, or crisis communications - these require professional human translation to ensure life-critical information is accurate
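
To stop high-risk content slipping into the automated route by accident, you could screen source text before it enters the pipeline. The sketch below is a rough keyword filter: the keyword list is hypothetical and should be replaced with terms your own team agrees on, and a match is only a prompt to stop and ask a person, not a reliable classifier.

```python
import re

# Hypothetical keyword list: replace with terms your own team agrees on.
HIGH_RISK_KEYWORDS = (
    "safeguarding procedure",
    "emergency",
    "evacuat",
    "crisis",
    "legally binding",
    "terms and conditions",
)

def high_risk_matches(text, keywords=HIGH_RISK_KEYWORDS):
    """Return the keywords found in the text.

    Matching is case-insensitive and anchored to the start of a word,
    so "evacuat" matches "evacuate" and "evacuation".
    """
    found = []
    for keyword in keywords:
        if re.search(r"\b" + re.escape(keyword), text, flags=re.IGNORECASE):
            found.append(keyword)
    return found
```

A non-empty result is a signal to send the document for professional human translation rather than through the AI workflow.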

Steps

  1. Build your terminology glossary

    List key terms that must be translated consistently: your service names, common concepts, technical terms. For each term, specify the approved translation in each language. Include notes about formality (tu/vous in French, tú/usted in Spanish). This is your translation style guide.

  2. Define your tone of voice rules

    Document how your organisation speaks: Are you formal or approachable? Do you use technical terms or plain language? What reading level do you target? These rules go into your translation prompt so every language matches your voice, not just the words.

  3. Create the translation prompt

    Write a structured prompt that includes: target language, your terminology glossary, tone of voice rules, audience description, formality level. Tell the model to maintain the original's language complexity and flag any terms it's unsure about. Test this on 3-4 documents and refine based on the output.

  4. Create the review prompt

    Second pass: Ask a fresh AI instance to review the translation. Does it match the terminology glossary? Is tone consistent? Are there cultural references that don't translate? Are there better word choices? This catches issues the first pass missed.

  5. Create the comprehension check prompt

    Third pass: Ask the AI to back-translate key points to English, then compare them to the original. Check: Does it maintain the same meaning? Same language level? Same key messages? This verifies nothing was lost or distorted in translation. Note: Running all three passes roughly triples API costs compared with single-pass translation. For added independence, consider using a different model for this stage - e.g. if Stage 1 uses an OpenAI model, use Claude for Stage 3 so the AI is not simply confirming its own logic.

  6. Build the automated pipeline

    Use the example code to chain the three stages: translate → review → comprehension check. The script takes your source text, runs it through all three stages, and outputs the final translation plus a quality report flagging anything uncertain.

  7. Test with native speaker review

    Run 10-20 documents through your pipeline and have native speakers check the final output. Do they agree with the terminology choices? Does the tone feel right? What needs adjusting? Use this feedback to refine your prompts and glossary.

  8. Establish review checkpoints

    Decide which translations always need human review (major announcements, policy changes) vs which are low-risk (event reminders, standard updates). High-priority content goes through your workflow plus human approval. Routine content can go straight out after the three-stage check.

  9. Update your glossary regularly (optional)

    When new terms appear or reviewers suggest better translations, update your glossary and re-run affected documents. Your workflow improves over time as you refine terminology and tone rules.
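
For the last step, working out which documents to re-run after a glossary change can be done with a small comparison. This is a sketch that assumes you keep your English source texts in a dictionary keyed by document name; the function names are illustrative, and the term search is a simple case-insensitive substring match.

```python
def changed_terms(old_glossary, new_glossary):
    """Return English terms whose approved translation was added or changed."""
    return sorted(
        term
        for term, approved in new_glossary.items()
        if old_glossary.get(term) != approved
    )

def documents_to_rerun(documents, terms):
    """Map each document name to the changed terms its source text contains."""
    affected = {}
    for name, source_text in documents.items():
        source_lower = source_text.lower()
        hits = [term for term in terms if term.lower() in source_lower]
        if hits:
            affected[name] = hits
    return affected
```

Pass the glossary for one language pair before and after the update to changed_terms, then feed the result to documents_to_rerun. Only the documents it returns need another trip through the pipeline, which keeps re-translation costs down.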

Example code

Three-stage translation workflow with quality control

This implements a translate → review → comprehension check pipeline. Adapt the glossary and tone rules to your organisation. Note: It uses JSON mode with gpt-4o-mini, which requires the prompt to explicitly include the word 'JSON' (each prompt says 'Return JSON with:').

from openai import OpenAI
import json

client = OpenAI()

# Your terminology glossary - maintain this
GLOSSARY = {
    "en-fr": {
        "support worker": "travailleur de soutien",
        "service user": "usager du service",
        "safeguarding": "protection"
    },
    "en-es": {
        "support worker": "trabajador de apoyo",
        "service user": "usuario del servicio",
        "safeguarding": "salvaguardia"
    }
}

# Your tone of voice rules
TONE_RULES = {
    "formality": "approachable but professional",
    "register": "use the informal form: tu in French, tú in Spanish (we speak directly to our community)",
    "complexity": "plain language, aim for B1 level",
    "avoid": "jargon, complex sentences, passive voice"
}

def stage_1_translate(text, target_lang):
    """Stage 1: Initial translation with terminology control"""

    glossary_for_lang = GLOSSARY.get(f"en-{target_lang}", {})

    prompt = f"""Translate this to {target_lang.upper()}.

Mandatory terminology (always use these translations):
{json.dumps(glossary_for_lang, indent=2)}

Tone of voice rules:
- {TONE_RULES['formality']}
- {TONE_RULES['register']}
- Keep language simple and accessible ({TONE_RULES['complexity']})
- Avoid: {TONE_RULES['avoid']}

Source text:
{text}

Return JSON with:
- translation: the translated text
- terminology_used: which glossary terms you used
- uncertain_terms: any terms you weren't sure about
- confidence: overall confidence (0-100)"""

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"}
    )

    return json.loads(response.choices[0].message.content)

def stage_2_review(original, translation, target_lang):
    """Stage 2: Quality review by fresh AI instance"""

    glossary_for_lang = GLOSSARY.get(f"en-{target_lang}", {})

    prompt = f"""Review this {target_lang.upper()} translation for quality.

Original English:
{original}

Translation:
{translation}

Check against these criteria:
1. Mandatory terminology used correctly: {json.dumps(glossary_for_lang, indent=2)}
2. Tone is {TONE_RULES['formality']}
3. Register: {TONE_RULES['register']}
4. Language complexity matches the original (target: {TONE_RULES['complexity']})
5. No cultural references that don't translate

Return JSON with:
- issues_found: list of any problems
- suggested_changes: specific improvements
- terminology_compliance: did it use the glossary correctly?
- tone_assessment: does it match our tone rules?
- overall_quality: score (0-100)"""

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"}
    )

    return json.loads(response.choices[0].message.content)

def stage_3_comprehension_check(original, translation, target_lang):
    """Stage 3: Check meaning is preserved"""

    prompt = f"""Check if this translation preserves the original meaning.

Original English:
{original}

Translation ({target_lang.upper()}):
{translation}

Tasks:
1. Back-translate the {target_lang.upper()} to English
2. Compare to the original:
   - Are all key messages preserved?
   - Is the language level similar?
   - Is anything added or lost?

Return JSON with:
- back_translation: your English version of the translation
- meaning_preserved: yes/no
- key_differences: any important changes in meaning
- complexity_comparison: is the translation simpler/harder than original?
- recommendation: approve/revise/flag-for-review"""

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"}
    )

    return json.loads(response.choices[0].message.content)

# Full workflow
def translate_with_quality_control(text, target_lang):
    """Run complete three-stage workflow"""

    print(f"\nTranslating to {target_lang}...")

    # Stage 1: Translate
    print("Stage 1: Translating...")
    stage1 = stage_1_translate(text, target_lang)
    translation = stage1['translation']

    # Stage 2: Review
    print("Stage 2: Reviewing...")
    stage2 = stage_2_review(text, translation, target_lang)

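    # Report suggested changes; they are not applied automatically.
    # In practice, you might auto-apply these or flag them for human review.
    suggestions = stage2.get('suggested_changes')
    if suggestions:
        # The model may return a list or a single string, so count carefully
        count = len(suggestions) if isinstance(suggestions, list) else 1
        print(f"  - Found {count} suggestion(s)")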

    # Stage 3: Comprehension check
    print("Stage 3: Comprehension check...")
    stage3 = stage_3_comprehension_check(text, translation, target_lang)

    # Compile quality report
    report = {
        'target_language': target_lang,
        'translation': translation,
        'stage1_confidence': stage1.get('confidence'),
        'stage2_quality': stage2.get('overall_quality'),
        'stage3_recommendation': stage3.get('recommendation'),
        'issues_flagged': {
            'uncertain_terms': stage1.get('uncertain_terms', []),
            'review_issues': stage2.get('issues_found', []),
            'meaning_differences': stage3.get('key_differences', [])
        },
        # Normalise case and whitespace so "Approve" or " approve " still counts
        'needs_human_review': str(stage3.get('recommendation', '')).strip().lower() != 'approve'
    }

    return report

# Example usage
source_text = """
We're here to support you. Our support workers can help with housing,
benefits advice, and safeguarding concerns. All service users are welcome.
"""

for lang in ['fr', 'es']:
    result = translate_with_quality_control(source_text, lang)

    print(f"\n{'='*50}")
    print(f"Language: {result['target_language']}")
    print(f"Translation: {result['translation']}")
    print(f"Quality scores: Stage1={result['stage1_confidence']}%, Stage2={result['stage2_quality']}%")
    print(f"Recommendation: {result['stage3_recommendation']}")
    print(f"Needs human review: {result['needs_human_review']}")

    if result['issues_flagged']['uncertain_terms']:
        print(f"Uncertain terms: {result['issues_flagged']['uncertain_terms']}")
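
The example above decides on human review from the Stage 3 recommendation alone. If you also want the review checkpoints from step 8 to apply, a routing function can combine content type with the other scores in the report. This is a sketch: the content categories, the thresholds of 80, and the function names are all hypothetical and should be set to match your own policy.

```python
# Hypothetical content categories and thresholds: adjust to your own policy.
ALWAYS_REVIEW = {"major_announcement", "policy_change"}
MIN_CONFIDENCE = 80
MIN_QUALITY = 80

def _score(value):
    """Coerce a model-reported score to a number; treat anything else as 0."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0

def review_decision(report, content_type):
    """Return (needs_human_review, reasons) for one quality report."""
    reasons = []
    if content_type in ALWAYS_REVIEW:
        reasons.append(f"content type {content_type} always gets human approval")
    recommendation = str(report.get("stage3_recommendation") or "").strip().lower()
    if recommendation != "approve":
        shown = recommendation or "missing"
        reasons.append(f"stage 3 recommendation was {shown}")
    if _score(report.get("stage1_confidence")) < MIN_CONFIDENCE:
        reasons.append("stage 1 confidence below threshold")
    if _score(report.get("stage2_quality")) < MIN_QUALITY:
        reasons.append("stage 2 quality below threshold")
    if report.get("issues_flagged", {}).get("uncertain_terms"):
        reasons.append("translator flagged uncertain terms")
    return bool(reasons), reasons
```

Returning the reasons alongside the decision gives your native-speaker reviewers a starting point, and missing or malformed scores fall back to zero so an unreadable model response sends the document to a human rather than straight out.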

Tools

Claude API · service · paid
OpenAI API · service · paid
Google Colab · platform · freemium

At a glance

Time to implement: weeks
Setup cost: low
Ongoing cost: low
Cost trend: decreasing
Organisation size: medium, large
Target audience: comms-marketing, operations-manager, it-technical

API costs are ~£0.01-0.05 per document per language, depending on length (3 passes × cost per pass). For 50 documents/month across 4 languages (200 translations), that's roughly £2-10/month. The main investment is setup time and building your glossary.
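
To adapt the estimate to your own volumes, the arithmetic can be written out as a small helper. This sketch takes the per-document range quoted above at face value and assumes it applies to one document in one language with all three passes included; the function name is illustrative, and real bills will vary with document length, model choice, and any retries.

```python
def monthly_cost_range(documents, languages, cost_per_document=(0.01, 0.05)):
    """Return (low, high) monthly API cost in pounds.

    cost_per_document is the estimate for one document in one language,
    already covering all three passes.
    """
    translations = documents * languages
    low, high = cost_per_document
    return round(translations * low, 2), round(translations * high, 2)

# 50 documents a month into 4 languages is 200 translations
print(monthly_cost_range(50, 4))  # (2.0, 10.0)
```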