Process documents in bulk with LLM APIs
The problem
You've got 100 grant applications to review, 50 case notes to summarise, or 200 beneficiary feedback forms to extract themes from. Each is a separate document (PDF, Word doc, text file). Reading and analysing them manually would take weeks. You need to apply consistent AI analysis across all documents, but copying each into Claude.ai is still too slow.
The solution
Write a script that loops through a folder of documents, extracts the text, sends it to an LLM API with a consistent prompt (e.g., 'score this grant application' or 'summarise this case note'), and saves the structured results to a spreadsheet. This is the same batch-processing pattern you would use for CSV rows, applied to whole documents.
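In outline, the whole pipeline is a single loop. Here is a minimal sketch only - extract_text and ask_llm are placeholders that the full examples further down fill in:

import os
import pandas as pd

folder = './documents'                                  # your folder of documents
rows = []
for filename in os.listdir(folder):
    text = extract_text(os.path.join(folder, filename))  # PDF/DOCX/TXT extraction
    result = ask_llm(PROMPT, text)                        # one consistent prompt per document
    rows.append({'filename': filename, **result})

pd.DataFrame(rows).to_csv('results.csv', index=False)     # one row per document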
What you get
A CSV file with one row per document containing AI-generated analysis. For grant applications: score, strengths, weaknesses, recommendation. For case notes: summary, key issues, follow-up actions. For feedback forms: themes, sentiment, key quotes. Typically processes 50-200 documents in 20-60 minutes.
Before you start
- Folder of documents to process (PDF, DOCX, or TXT files)
- API key from OpenAI or Anthropic (budget £10-50 depending on volume)
- Clear rubric: what should the AI extract or assess from each document?
- Python environment (Google Colab works, or local installation)
- Documents are in English or another language supported by the LLM
When to use this
- You have 20+ documents that need the same analysis applied
- Documents are in standard formats (PDF, Word, plain text)
- The analysis is well-defined (score against rubric, extract specific information, summarise)
- Manual reading would take days or weeks
- You can tolerate 85-95% accuracy with human review of edge cases
When not to use this
- Fewer than 20 documents (manual processing is faster)
- Documents are scanned images without OCR (need to extract text first)
- Each document needs highly customised assessment (not a standard template)
- Documents contain highly sensitive information you can't send to external APIs
- You need 100% accuracy (always budget for human review)
- Documents are very long (over 100 pages), which may exceed API context limits
Steps
1. Create your assessment rubric or extraction template
Define exactly what you want from each document. For grant applications: 'Score 1-10 on: alignment with our priorities, feasibility, budget reasonableness. List 3 strengths and 3 weaknesses. Recommend: fund, maybe, reject.' For case notes: 'Summarise in 2-3 sentences, list key issues flagged, suggest follow-up actions.' Test this rubric manually on 3 documents to ensure it works.
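It helps to capture the rubric as a reusable prompt plus a list of the fields you expect back before writing any batch code. A sketch (the field names here are illustrative, not required by any API):

RUBRIC_PROMPT = """Score this grant application 1-10 on: alignment with our priorities, feasibility, budget reasonableness.
List 3 strengths and 3 weaknesses.
Recommend: FUND, MAYBE, or REJECT.
Return JSON with keys: alignment_score, feasibility_score, budget_score, strengths, weaknesses, recommendation."""

# The expected fields double as your output CSV columns
EXPECTED_FIELDS = ['alignment_score', 'feasibility_score', 'budget_score',
                   'strengths', 'weaknesses', 'recommendation']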
2. Prepare your documents in one folder
Put all documents to process in a single folder. Use clear filenames (e.g., 'grant-app-001.pdf', 'case-note-2024-01-15.docx'). If documents are scanned PDFs without text, you'll need to OCR them first (separate step). Organise subfolders if you have different document types needing different prompts.
3. Set up code with text extraction
Use the example code below. It handles PDF, DOCX, and TXT files. For PDFs: uses PyPDF2 to extract text. For DOCX: uses python-docx. For TXT: reads directly. Test text extraction on 3 documents first - check the extracted text looks right (no garbled characters, structure preserved). Some complex PDFs extract poorly.
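A minimal sketch of the extraction step, assuming PyPDF2 and python-docx are installed (the full examples below wire this into the batch loop):

from pathlib import Path
from PyPDF2 import PdfReader
from docx import Document

def extract_text(path):
    """Extract plain text from a PDF, DOCX, or TXT file."""
    suffix = Path(path).suffix.lower()
    if suffix == '.pdf':
        reader = PdfReader(path)
        return '\n'.join(page.extract_text() or '' for page in reader.pages)
    if suffix == '.docx':
        return '\n'.join(p.text for p in Document(path).paragraphs)
    if suffix == '.txt':
        return Path(path).read_text(encoding='utf-8', errors='replace')
    raise ValueError(f'Unsupported file type: {suffix}')

# Sanity check on a few files before running the batch - complex or scanned PDFs may extract poorly
print(extract_text('grant-applications/grant-app-001.pdf')[:500])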
4. Test your prompt on 5 documents
Modify the code to process only 5 documents (the example code already includes a line that limits the run to the first five files). Run it. Check the outputs carefully: does the AI follow your rubric? Are the scores reasonable? Is the extracted information accurate? Common issues: the AI misses key details in long documents, scores are too generous, or formatting is inconsistent. Refine the prompt and re-test, using a spot-check like the sketch below.
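A quick spot-check sketch for the 5-document test run (the column names follow the grant-scoring example below; adjust them to your own rubric):

import pandas as pd

test = pd.read_csv('grant_scores.csv')
for _, row in test.iterrows():
    print(f"{row['filename']}: {row['recommendation']} "
          f"(alignment {row['alignment_score']}, feasibility {row['feasibility_score']}, budget {row['budget_score']})")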
5. Run the full batch
Remove the document limit and run on all documents. For 100 documents this typically takes 30-60 minutes depending on document length and API speed. The code shows progress and pauses between calls to stay within rate limits. When it completes, you'll have a CSV with one row per document containing the AI analysis.
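If the full run does hit rate limits, a simple retry-with-backoff wrapper around the assessment call is usually enough. A sketch, assuming the assess_document function from the grant example below, which returns a dict containing an 'error' key when a call fails:

import time

def assess_with_retry(text, filename, client, max_retries=3):
    for attempt in range(max_retries):
        result = assess_document(text, filename, client)
        if not result.get('error'):
            return result
        wait = 5 * (2 ** attempt)  # 5s, 10s, 20s
        print(f"Retrying {filename} in {wait}s ({result['error']})")
        time.sleep(wait)
    return result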
6. Review outputs and flag anomalies
Open the results CSV. Sort by AI scores or key fields. Spot-check 15-20 documents against the original: is the analysis accurate? Flag any documents where the AI clearly failed (extracted nonsense, missed key information, contradicted itself). Review these manually.
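A sketch of flagging rows for manual review (again using the grant-scoring column names; swap in your own fields):

import pandas as pd

df = pd.read_csv('grant_scores.csv')

failed = df[df['error'].notna()]                       # rows the script marked as failed
suspect = df[df['alignment_score'].isna()
             | ~df['alignment_score'].between(1, 10)   # scores outside the rubric range
             | df['recommendation'].isna()]

print(f"{len(failed)} failed rows, {len(suspect)} suspect rows to review manually")
suspect.to_csv('for_manual_review.csv', index=False)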
7. Handle failed extractions (optional)
Some documents might have failed text extraction (complex PDFs, protected documents, corrupt files). The code logs these. For failed documents: try opening in Word and saving as plain text, or use OCR if they're scanned images. Re-run just the failed documents.
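A sketch of re-running just the failed documents after you've fixed them (keep the same filenames so the new results can be matched back up):

import pandas as pd

df = pd.read_csv('grant_scores.csv')
failed_files = df.loc[df['error'].notna(), 'filename'].tolist()
print(f"{len(failed_files)} documents to re-run: {failed_files}")

# In the main script, point the batch at just these files instead of the whole folder:
# pdf_files = failed_files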
Example code
Process grant applications in bulk
Batch process grant applications against a scoring rubric. Install: pip install openai PyPDF2 pandas tqdm
import os
import json
import time
import pandas as pd
from openai import OpenAI
from PyPDF2 import PdfReader
from tqdm import tqdm
# Configuration
API_KEY = 'your-openai-api-key'  # better: load from an environment variable, e.g. os.environ.get('OPENAI_API_KEY')
DOCS_FOLDER = './grant-applications' # Folder with PDFs
OUTPUT_CSV = 'grant_scores.csv'
# Your assessment rubric
ASSESSMENT_PROMPT = """You are reviewing a grant application. Assess it using this rubric:
1. Alignment with our priorities (1-10 score): How well does it match our funding criteria?
2. Feasibility (1-10 score): Is the project realistic and achievable?
3. Budget quality (1-10 score): Is the budget clear, reasonable, and well-justified?
Also provide:
- 3 key strengths (bullet points)
- 3 key weaknesses (bullet points)
- Overall recommendation: FUND, MAYBE, or REJECT
- Brief rationale (2-3 sentences)
Return as JSON:
{
"alignment_score": 7,
"feasibility_score": 8,
"budget_score": 6,
"strengths": ["...", "...", "..."],
"weaknesses": ["...", "...", "..."],
"recommendation": "MAYBE",
"rationale": "..."
}"""
def extract_text_from_pdf(pdf_path):
    """Extract text from a PDF file. Returns None if extraction fails."""
    try:
        reader = PdfReader(pdf_path)
        text = ""
        for page in reader.pages:
            text += (page.extract_text() or "") + "\n"
        return text
    except Exception:
        return None

def assess_document(text, filename, client):
    """Send document text to OpenAI for assessment against the rubric."""
    try:
        response = client.chat.completions.create(
            model='gpt-4o-mini',
            messages=[
                {'role': 'system', 'content': ASSESSMENT_PROMPT},
                {'role': 'user', 'content': f"Document: {filename}\n\n{text}"}
            ],
            temperature=0.3,
            max_tokens=1000,
            response_format={'type': 'json_object'}  # ask the model for valid JSON
        )
        return json.loads(response.choices[0].message.content)
    except Exception as e:
        return {'error': str(e)}
# Initialize
client = OpenAI(api_key=API_KEY)
results = []
# Get all PDF files
pdf_files = [f for f in os.listdir(DOCS_FOLDER) if f.endswith('.pdf')]
# For testing: process first 5 only
# Remove this line when ready for full batch
pdf_files = pdf_files[:5]
print(f"Processing {len(pdf_files)} documents...")
# Process each document
for pdf_file in tqdm(pdf_files):
    pdf_path = os.path.join(DOCS_FOLDER, pdf_file)

    # Extract text
    print(f"\nExtracting text from {pdf_file}...")
    text = extract_text_from_pdf(pdf_path)
    if not text:
        print(f"  Failed to extract text from {pdf_file}")
        results.append({
            'filename': pdf_file,
            'error': 'Text extraction failed'
        })
        continue

    # Assess document
    print(f"  Assessing {pdf_file}...")
    assessment = assess_document(text, pdf_file, client)

    # Store results
    results.append({
        'filename': pdf_file,
        'alignment_score': assessment.get('alignment_score'),
        'feasibility_score': assessment.get('feasibility_score'),
        'budget_score': assessment.get('budget_score'),
        'strengths': ' | '.join(assessment.get('strengths', [])),
        'weaknesses': ' | '.join(assessment.get('weaknesses', [])),
        'recommendation': assessment.get('recommendation'),
        'rationale': assessment.get('rationale'),
        'error': assessment.get('error')
    })

    # Pause briefly between calls to stay within rate limits
    time.sleep(1)
# Save results
df = pd.DataFrame(results)
df.to_csv(OUTPUT_CSV, index=False)
print(f"\nDone! Results saved to {OUTPUT_CSV}")
print(f"Successfully processed: {df['error'].isna().sum()} documents")
print(f"Errors: {df['error'].notna().sum()} documents")
# Show summary statistics
if df['error'].isna().sum() > 0:
    print(f"\nAverage scores:")
    print(f"  Alignment: {df['alignment_score'].mean():.1f}/10")
    print(f"  Feasibility: {df['feasibility_score'].mean():.1f}/10")
    print(f"  Budget: {df['budget_score'].mean():.1f}/10")
    print(f"\nRecommendations:")
    print(df['recommendation'].value_counts())

Summarise case notes in bulk using Claude
Summarise case notes in bulk using Claude. Install: pip install anthropic python-docx pandas tqdm
import os
import json
import time
import pandas as pd
import anthropic
from docx import Document
from tqdm import tqdm
# Configuration
API_KEY = 'your-anthropic-api-key'  # better: load from an environment variable, e.g. os.environ.get('ANTHROPIC_API_KEY')
DOCS_FOLDER = './case-notes' # Folder with DOCX files
OUTPUT_CSV = 'case_summaries.csv'
SUMMARY_PROMPT = """Summarise this case note following this template:
1. Brief summary (2-3 sentences): What happened in this interaction?
2. Key issues identified (bullet list): What concerns or needs were flagged?
3. Actions taken (bullet list): What was done or arranged?
4. Follow-up required (bullet list): What needs to happen next?
5. Urgency (LOW, MEDIUM, HIGH): How urgent is follow-up?
Return only JSON with these keys: summary, key_issues, actions_taken, follow_up, urgency. Do not include any other text."""
def extract_text_from_docx(docx_path):
    """Extract text from a Word document. Returns None if extraction fails."""
    try:
        doc = Document(docx_path)
        return '\n'.join(para.text for para in doc.paragraphs)
    except Exception:
        return None

def summarise_case_note(text, filename, client):
    """Send a case note to Claude for summarisation."""
    try:
        message = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1000,
            temperature=0.3,
            messages=[
                {
                    "role": "user",
                    "content": f"{SUMMARY_PROMPT}\n\nCase note: {filename}\n\n{text}"
                }
            ]
        )
        raw = message.content[0].text.strip()
        # The model may wrap JSON in a markdown code fence - strip it if present
        if raw.startswith("```"):
            raw = raw.split("```")[1].removeprefix("json").strip()
        return json.loads(raw)
    except Exception as e:
        return {'error': str(e)}
# Initialize
client = anthropic.Anthropic(api_key=API_KEY)
results = []
# Get all Word docs
docx_files = [f for f in os.listdir(DOCS_FOLDER) if f.endswith('.docx')]
# Test mode: first 5 only - remove the next line for the full batch
docx_files = docx_files[:5]
print(f"Processing {len(docx_files)} case notes...")
for docx_file in tqdm(docx_files):
    docx_path = os.path.join(DOCS_FOLDER, docx_file)

    # Extract text
    text = extract_text_from_docx(docx_path)
    if not text:
        results.append({'filename': docx_file, 'error': 'Text extraction failed'})
        continue

    # Summarise
    summary = summarise_case_note(text, docx_file, client)
    results.append({
        'filename': docx_file,
        'summary': summary.get('summary'),
        'key_issues': ' | '.join(summary.get('key_issues', [])),
        'actions_taken': ' | '.join(summary.get('actions_taken', [])),
        'follow_up': ' | '.join(summary.get('follow_up', [])),
        'urgency': summary.get('urgency'),
        'error': summary.get('error')
    })

    # Pause briefly between calls to stay within rate limits
    time.sleep(1)
# Save
df = pd.DataFrame(results)
df.to_csv(OUTPUT_CSV, index=False)
print(f"\nSaved {len(df)} summaries to {OUTPUT_CSV}")
Resources
- PyPDF2 documentation: Extract text from PDF files in Python.
- python-docx documentation: Read and write Word documents in Python.
- Handling long documents with LLMs (tutorial): Best practices for processing long documents with Claude.
- OCR tools for scanned PDFs: Add a text layer to scanned PDFs before processing.
At a glance
- Time to implement: hours
- Setup cost: low
- Ongoing cost: low
- Cost trend: decreasing
- Organisation size: small, medium, large
- Target audience: operations-manager, program-delivery, fundraising, data-analyst
Cost depends on document length and model used. Typical grant application (3-5 pages): £0.05-0.10 with GPT-4o-mini, £0.20-0.40 with GPT-4. Claude Opus handles longer documents better but costs more. For 100 documents averaging 4 pages: £5-40 depending on model. Always test with 5 documents first to estimate costs.
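A rough back-of-the-envelope estimate before committing to a full run (the per-token prices below are placeholders - check your provider's current pricing page before relying on them):

docs = 100
pages_per_doc = 4
tokens_per_page = 600           # rough assumption for typical prose
output_tokens_per_doc = 500     # the JSON assessment returned per document

price_in_per_1k = 0.0005        # placeholder £ per 1,000 input tokens
price_out_per_1k = 0.0015       # placeholder £ per 1,000 output tokens

input_tokens = docs * pages_per_doc * tokens_per_page
output_tokens = docs * output_tokens_per_doc
estimate = (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k
print(f"Estimated cost: £{estimate:.2f} for {docs} documents")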