Create an AI assistant with web search and document tools
The problem
Your team wastes hours researching grant opportunities, finding sector benchmarks, or answering policy questions from your 50-page handbook. Each time someone asks 'Are there grants for youth mental health?' or 'What's our safeguarding escalation process?', it's a manual research task. You want an AI assistant that can actually search the web and your documents to find accurate answers.
The solution
Build an AI assistant with tools for web search and document retrieval. When asked a question, the AI decides whether to search online (for grants, benchmarks, news) or search your internal documents (policies, reports, case studies). It retrieves relevant information, synthesises it, and provides sourced answers. This is research-as-a-service for your team.
What you get
A conversational research assistant that answers questions by searching the web and your documents. Example uses: Grant research bot that finds relevant funding opportunities and explains eligibility. Policy Q&A that cites specific sections of your handbook. Impact research assistant that finds sector benchmarks. All answers include sources so you can verify.
Before you start
- Clear use case: what will people ask this assistant?
- For web search: API key for search service (SerpAPI, Brave Search API, or similar)
- For document search: Your documents in searchable format (PDFs, Word docs, or text files)
- API key from OpenAI or Anthropic
- Either: n8n account OR Python environment for custom build
When to use this
- Team regularly asks research questions with findable answers (grants, policies, sector data)
- You have documents that need to be searchable (policy handbooks, reports, procedures)
- Research tasks are time-consuming but follow patterns
- Answers can be verified (you need sources cited, not just AI opinions)
- You want to democratise access to knowledge without everyone reading 50-page handbooks
When not to use this
- Questions require expert judgement, not just information retrieval
- Documents change constantly (assistant would give outdated answers)
- Fewer than 10-20 research queries per week (manual is fine)
- You need 100% accuracy (AI can misinterpret sources - always verify critical info)
- Documents contain highly confidential information you can't expose via chatbot
- Web searches would surface sensitive organisational information
Steps
1. Define your use case and gather sources
Choose ONE use case to start: Grant research assistant OR Policy Q&A OR Sector benchmark finder. Don't try to do everything at once. Gather your sources: For grants, you'll search the web. For policies, collect your handbook PDFs. For benchmarks, identify trusted websites (NCVO, Charity Commission, sector bodies).
2. Choose your tech stack
Easiest: n8n with AI Agent node + Google Search tool or Document Search tool. No code required, visual workflow builder. Reference: https://docs.n8n.io/integrations/builtin/cluster-nodes/root-nodes/n8n-nodes-langchain.agent/. More control: Python with LangChain for custom agents and tools. Choose based on technical skills and customisation needs.
3. Set up web search tool
Get API key from SerpAPI, Brave Search, or use n8n's built-in search. Test it manually: search for 'youth mental health grants UK' and check you get relevant results. Configure: how many results to retrieve (5-10), what to extract (title, snippet, URL). This becomes a tool the AI can call.
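If you want to sanity-check the search API on its own before wiring it to the AI, a short script like the sketch below will do. It assumes SerpAPI's JSON endpoint and its organic_results field names (title, snippet, link); Brave Search uses different parameters, so adjust accordingly.

# Minimal sketch: test the web search tool manually before connecting it to the AI.
# Assumes SerpAPI's JSON endpoint; needs SERPAPI_API_KEY set in your environment.
import os
import requests

def web_search(query, num_results=5):
    """Return title, snippet and URL for the top organic results."""
    response = requests.get(
        "https://serpapi.com/search.json",
        params={"q": query, "num": num_results, "api_key": os.environ["SERPAPI_API_KEY"]},
        timeout=30,
    )
    response.raise_for_status()
    results = response.json().get("organic_results", [])
    return [{"title": r.get("title"), "snippet": r.get("snippet"), "url": r.get("link")} for r in results]

# Quick manual check: are the results actually relevant?
for hit in web_search("youth mental health grants UK"):
    print(hit["title"], "-", hit["url"])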
4. Set up document search tool
Index your documents: Convert PDFs/Word docs to searchable text, split into chunks (paragraphs or sections), create embeddings (numerical representations for semantic search). n8n: Use Vector Store nodes. Custom code: Use LangChain document loaders + FAISS or Chroma vector store. Test: search for a phrase and verify you get relevant chunks.
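The sketch below is a condensed version of that indexing pipeline, reduced to one file so you can test retrieval quickly (the full Policy Q&A example further down does the same for a whole folder). The file name is a placeholder; it assumes the classic LangChain imports and an OPENAI_API_KEY in your environment.

# Minimal sketch: index one PDF and test retrieval. The file name is a placeholder.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

pages = PyPDFLoader("volunteer-handbook.pdf").load()  # one document per page
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(pages)
store = FAISS.from_documents(chunks, OpenAIEmbeddings())  # embedding calls need OPENAI_API_KEY

# Test: does a phrase you know is in the handbook come back as a relevant chunk?
for doc in store.similarity_search("safeguarding escalation", k=3):
    print(doc.metadata.get("page"), "-", doc.page_content[:150])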
5. Connect AI with both tools
Configure AI agent with access to both search tools. Provide clear tool descriptions: 'web_search: Search the internet for current information about grants, sector news, or benchmarks' and 'document_search: Search our internal policy handbook and reports'. The AI will decide which tool(s) to use based on the question.
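As a rough sketch of how both tools sit behind one agent: the code below uses the same classic LangChain API as the examples further down, and assumes the vectorstore built in step 4 plus OPENAI_API_KEY and SERPAPI_API_KEY in your environment.

# Sketch: one agent, two tools. The tool names and descriptions are what the model
# uses to decide which tool to call, so keep them specific.
from langchain.agents import initialize_agent, AgentType, Tool
from langchain.chat_models import ChatOpenAI
from langchain.utilities import SerpAPIWrapper

llm = ChatOpenAI(model="gpt-4o", temperature=0.3)
web = SerpAPIWrapper()

def search_documents(query: str) -> str:
    """Return the top matching chunks from the internal document index."""
    docs = vectorstore.similarity_search(query, k=3)  # vectorstore built in step 4
    return "\n\n".join(d.page_content for d in docs)

tools = [
    Tool(
        name="web_search",
        func=web.run,
        description="Search the internet for current information about grants, sector news, or benchmarks."
    ),
    Tool(
        name="document_search",
        func=search_documents,
        description="Search our internal policy handbook and reports."
    ),
]

agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_parsing_errors=True)
print(agent.run("What's our volunteer safeguarding policy?"))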
6. Test with realistic questions
Ask questions your team would actually ask: 'Find grants for youth mental health in London', 'What's our volunteer safeguarding policy?', 'What's the sector average for admin costs?'. Check: Does it use the right tool? Are sources relevant? Is the answer accurate? Common issues: AI searches when it should know, or doesn't search when it should.
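One way to check tool choice systematically is a small test harness that records which tool the agent called for each question. The sketch below assumes the two-tool setup from step 5 and rebuilds the agent with return_intermediate_steps=True so the tool calls are visible; the expected-tool labels are illustrative.

# Sketch: run test questions and report which tool the agent actually used.
from langchain.agents import initialize_agent, AgentType

agent = initialize_agent(
    tools, llm,  # tools and llm from the step 5 setup
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    return_intermediate_steps=True,
    handle_parsing_errors=True,
)

test_cases = [
    ("Find grants for youth mental health in London", "web_search"),
    ("What's our volunteer safeguarding policy?", "document_search"),
]

for question, expected_tool in test_cases:
    result = agent({"input": question})
    tools_used = [action.tool for action, _ in result["intermediate_steps"]]
    status = "PASS" if expected_tool in tools_used else "CHECK"
    print(f"{status}: {question} -> tools used: {tools_used}")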
7. Add source citation and verification
Critical: Make the AI cite sources. Prompt: 'Always provide sources for your answers. For web search: include URLs. For documents: cite the document name and section.' Add a 'verify this' button in your interface so users can check sources. Never present AI answers as fact without attribution.
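A simple way to enforce this is to prepend the citation rules to every request rather than relying on users to ask. The wording below is illustrative, and the agent is the two-tool agent from step 5.

# Sketch: bake the citation requirement into every request.
CITATION_RULES = """Always provide sources for your answers.
- For web search results: include the URL of each page you relied on.
- For internal documents: cite the document name and section or page.
- If you cannot find a source, say so rather than guessing."""

def ask_with_sources(question: str) -> str:
    return agent.run(f"{CITATION_RULES}\n\nQuestion: {question}")

print(ask_with_sources("Are there grants for youth mental health in London?"))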
8. Build simple interface and gather feedback
n8n: Expose as webhook or chat widget. Custom code: Build Streamlit interface or Slack bot. Launch to 3-5 users first. Gather feedback: What questions work well? What fails? Are sources helpful? Iterate based on real use. Don't worry about polish yet - focus on whether it saves time.
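For a quick pilot interface, something like the Streamlit sketch below is enough. Here answer_question is a placeholder for whichever function you built in the earlier steps; it should return the answer text with sources included.

# Minimal Streamlit sketch for a pilot interface. Run with: streamlit run app.py
import streamlit as st

def answer_question(question: str) -> str:
    # Placeholder: call your agent or QA chain here and return the answer text.
    return "..."

st.title("Research assistant (pilot)")
question = st.text_input("Ask about grants, policies or sector benchmarks")
if question:
    with st.spinner("Searching..."):
        st.write(answer_question(question))
    st.caption("Check the cited sources before acting on an answer.")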
Example code
Grant research assistant with web search
Grant research assistant using LangChain and web search. Install: pip install langchain openai google-search-results. Note: the imports follow the classic LangChain API; recent releases (0.1+) move them into the langchain-community and langchain-openai packages, so adapt the imports if you are on a newer version.
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.utilities import SerpAPIWrapper
import os
# Configuration
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["SERPAPI_API_KEY"] = "your-serpapi-key"
# Initialize LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0.3)
# Set up web search tool
search = SerpAPIWrapper()
tools = [
    Tool(
        name="Web Search",
        func=search.run,
        description="Search the internet for information about grants, funding opportunities, or sector news. Use this for questions about current grant programmes, funder priorities, or deadlines."
    )
]
# Create agent
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True
)
# Custom prompt for grant research
prefix = """You are a grant research assistant for a UK charity. Your job is to find relevant grant opportunities and funding information.
When answering:
1. Always search for current information (grants change frequently)
2. Cite your sources with URLs
3. Summarise eligibility criteria clearly
4. Note application deadlines if found
5. If information is unclear, say so
Answer the question below:"""
def find_grants(question):
    """Research grants using web search"""
    full_prompt = f"{prefix}\n\n{question}"
    response = agent.run(full_prompt)
    return response
# Example usage
questions = [
    "Find grant opportunities for youth mental health projects in London",
    "What are Comic Relief's current funding priorities?",
    "Are there any grants for refugee support closing in the next month?"
]
for q in questions:
    print(f"\nQuestion: {q}")
    print("\nResearch:")
    print(find_grants(q))
    print("\n" + "=" * 80)

Policy Q&A bot with document search
Policy Q&A using document search with citations. Install: pip install langchain openai faiss-cpu pypdf (the same note about LangChain versions applies).
from langchain.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
import os
# Configuration
os.environ["OPENAI_API_KEY"] = "your-api-key"
DOCS_FOLDER = "./policy-docs" # Folder with your PDFs
# Load and process documents
print("Loading policy documents...")
loader = DirectoryLoader(
    DOCS_FOLDER,
    glob="**/*.pdf",
    loader_cls=PyPDFLoader
)
documents = loader.load()
# Split into chunks for better retrieval
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = text_splitter.split_documents(documents)
print(f"Loaded {len(documents)} documents, split into {len(chunks)} chunks")
# Create vector store for semantic search
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)
# Create Q&A chain
llm = ChatOpenAI(model="gpt-4o", temperature=0.2)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True
)
def ask_policy_question(question):
    """Answer question using policy documents"""
    result = qa_chain({"query": question})
    answer = result["result"]
    sources = result["source_documents"]
    print(f"\nAnswer: {answer}\n")
    print("Sources:")
    for i, doc in enumerate(sources, 1):
        source_file = doc.metadata.get("source", "Unknown")
        page = doc.metadata.get("page", "?")
        print(f"  {i}. {source_file} (page {page})")
        print(f"     Extract: {doc.page_content[:200]}...\n")
# Example usage
policy_questions = [
    "What is our safeguarding escalation process?",
    "What are the requirements for volunteer DBS checks?",
    "What expenses can volunteers claim?"
]

for q in policy_questions:
    print(f"\nQuestion: {q}")
    ask_policy_question(q)
    print("=" * 80)

Tools and resources
- Build AI agents with web search and document tools in n8n (low-code).
- LangChain retrieval agents (tutorial): Building agents with document retrieval and web search tools.
- SerpAPI for web search (documentation): Google search API with generous free tier.
- Building a RAG chatbot (tutorial): Retrieval-augmented generation for Q&A over documents.
At a glance
- Time to implement: days
- Setup cost: low
- Ongoing cost: low
- Cost trend: stable
- Organisation size: small, medium, large
- Target audience: operations-manager, fundraising, program-delivery, it-technical
n8n free tier works for testing. SerpAPI: 100 free searches/month, then $50/month for 5000 searches. LLM costs: £0.02-0.10 per conversation depending on model and sources retrieved. For 50 queries/day (roughly 1,500 queries/month), expect £30-150/month total. Self-hosting n8n and using open search APIs reduces costs significantly.