Skip to main content

Tool Calling: Giving Agents Abilities

In the previous article, we gave Aria a personality and taught her to return reliable, structured answers. But try asking her something like "What's in my inbox?" and you'll hit a wall immediately — she'll either make something up or admit she has no way to know. Aria can talk beautifully. She just can't do anything yet.

That changes today. We're giving Aria her first real ability: checking an inbox. By the end of this article, you'll understand exactly what a "tool" is in LangChain, how to define one, and — this is the part beginners usually find surprising — how an agent actually decides whether to use it.

🟢 Skill level: Beginner.

Quick Reference

When to use this: Whenever your agent needs to do something beyond generating text — fetch data, run a calculation, call an API, check a database.

Basic syntax:

from langchain.tools import tool

@tool
def check_inbox() -> str:
"""Check the inbox for recent emails."""
return "..."

agent = create_agent(model="gpt-5-nano", tools=[check_inbox])

Common patterns:

  • A tool is just a regular Python function with a @tool decorator and a docstring
  • The agent reads the docstring to decide when a tool is relevant — it never sees your function's code
  • The agent doesn't run the tool itself; LangChain executes it and feeds the result back to the model

Gotchas:

  • ⚠️ A missing or vague docstring is the #1 reason a tool never gets called — the model only knows what the docstring tells it.
  • ⚠️ Adding a tool to tools=[...] doesn't mean it's used every time — the agent decides per-question whether it's relevant.

See also: Prompting Agents: System Prompts and Structured Output

What You Need to Know First

  • Everything from Article 1 — system prompts, create_agent, and what an agent is
  • Basic Python functions and type hints (e.g., def my_func(x: int) -> str:)
  • An OpenAI API key set up in a .env file — see Article 1's setup walkthrough if you haven't done this yet

What We'll Cover in This Article

  • What a "tool" actually is, conceptually
  • How to define a tool with LangChain's @tool decorator
  • How to test a tool directly, without involving an agent at all
  • How to give a tool to an agent and watch it decide to use it
  • How to inspect exactly what happened — what the agent called, and with what arguments

What We'll Explain Along the Way

  • What "tool calling" / "function calling" means at the model level
  • The difference between the agent deciding to call a tool and the tool actually running
  • What a docstring is, for readers newer to Python, and why it matters so much here

What's a Tool, Really?

Here's the mental model: a tool is just a regular Python function that you've made available to the agent. The agent can't reach into your codebase and run arbitrary functions — it can only use functions you've explicitly handed it, and even then, only when it decides one is relevant to what's being asked.

Here's the part that trips people up: the language model never sees your function's actual code. It only sees the function's name, its docstring (the description in triple quotes), and its parameters. Based on that information alone, the model decides: "Does this look relevant to what the user just asked? If so, I should call it — with these specific argument values."

Once the model decides to call a tool, LangChain — not the model — actually runs your Python function, takes the real return value, and sends it back to the model. The model then uses that real result to write its final answer.

Diagram: The model never runs your function itself — it only requests that a specific tool be called. LangChain executes the real function and hands the actual result back to the model, which then writes a response grounded in that real data.

This is the key difference between a tool-using agent and a plain chat model: a plain model can only talk about what it learned during training. A tool-using agent can reach out and get real, current information — and then talk about that.

Defining Your First Tool: check_inbox

Let's give Aria a way to check Julie's inbox. For now, we'll keep it simple with a tool that returns one fixed email — in a later article, this will connect to something real.

# Purpose: Define a tool the agent can call to check the inbox
# Context: Aria's first real ability — no more guessing what's in the inbox
# Input: None (this tool takes no arguments)
# Output: The text of the most recent email, as a string

from langchain.tools import tool

@tool
def check_inbox() -> str:
"""Check the inbox for recent emails."""
# In a real application, this would call an email API.
# For now, we return a fixed example so we can focus on tool calling itself.
return """
Hi Julie,
I'm going to be in town next week and was wondering if we could grab a coffee?
- best, Jane (jane@example.com)
"""

Three things make this a tool instead of just a function:

  1. The @tool decorator, imported from langchain.tools
  2. A docstring — the text in triple quotes right after the function definition. This is what the model reads to decide if the tool is relevant.
  3. A type hint on the return value (-> str) — this tells LangChain (and indirectly, the model) what shape of data to expect back.

💡 The docstring isn't a comment for other developers — it's the only description the model ever sees of what this function does. Write it the way you'd explain the tool's purpose to someone who can't read your code.

Testing a Tool Directly

Before involving an agent at all, you can call a tool exactly like testing any other function — useful for confirming your tool actually works before adding the unpredictability of a language model on top of it.

# Purpose: Test the tool in isolation, with no agent involved
# Context: Confirms the tool itself works before wiring it into Aria
# Input: None
# Output: The raw return value of the tool

result = check_inbox.invoke({})
print(result)

Notice we call .invoke({}) rather than check_inbox() directly — tools expect their arguments as a dictionary (empty here, since check_inbox takes none), which keeps the calling convention consistent for tools that do take arguments. This is exactly how LangChain will call it internally once an agent decides to use it.

Giving Aria the Tool

Now let's actually hand this tool to Aria, and update her system prompt to mention what she can now do:

# Purpose: Create an agent that has access to the check_inbox tool
# Context: Aria's first agent configuration with a real ability attached
# Input: None yet — we'll ask a question next
# Output: An agent instance, configured with one tool

from langchain.agents import create_agent

aria_system_prompt = """
You are Aria, a personal email assistant for Julie.
You are warm, concise, and a little formal — like an excellent
executive assistant. You never ramble, and you get straight to
the point while staying friendly.

You can check Julie's inbox when asked.
"""

agent = create_agent(
model="gpt-5-nano",
system_prompt=aria_system_prompt,
tools=[check_inbox],
)

That single tools=[check_inbox] line is the whole wiring job. Let's see what happens when we ask Aria something that requires checking the inbox.

Watching Aria Decide to Use the Tool

Let's ask two different questions — one that should trigger the tool, and one that shouldn't — to see the decision-making in action.

# Purpose: Observe the agent choosing whether to call the tool
# Context: Compares a question that needs the tool vs. one that doesn't
# Input: Two different user questions
# Output: Aria's responses, one using the tool, one not

from langchain.messages import HumanMessage

# This question requires real, current information — Aria should use the tool
question_one = HumanMessage(content="What's in my inbox right now?")
response_one = agent.invoke({"messages": [question_one]})
print(response_one["messages"][-1].content)

# This question doesn't need the inbox at all — Aria should just answer directly
question_two = HumanMessage(content="What's a polite way to decline a coffee invite?")
response_two = agent.invoke({"messages": [question_two]})
print(response_two["messages"][-1].content)

Take a moment to predict what happens with each question before running this.

...

For the first question, Aria calls check_inbox, gets back Jane's real email text, and answers based on that actual content — she'll mention Jane by name and reference the coffee invite specifically, because she's working from real data, not a guess. For the second question, Aria just answers directly from her own knowledge — there was nothing for the tool to help with, so she never calls it. This is exactly the behavior we want: the tool gets used when it's relevant, and ignored otherwise.

Looking Under the Hood: Inspecting Tool Calls

How do we know, for certain, that the tool was actually called rather than Aria just making something up that happened to sound plausible? We can inspect the full message history.

# Purpose: Confirm exactly what tool was called, and with what arguments
# Context: Useful for debugging and for understanding what actually happened
# Input: The response from the inbox question above
# Output: The structured tool call details, not just the final text

from pprint import pprint

# The full conversation, including the tool call and its result
pprint(response_one["messages"])

# The AI message that requested the tool call specifically
# tool_calls is a list because an agent can call multiple tools at once
ai_message_with_tool_call = response_one["messages"][1]
print(ai_message_with_tool_call.tool_calls)

You'll see something like [{'name': 'check_inbox', 'args': {}, 'id': '...'}] — concrete, inspectable proof of exactly which tool was called and with what arguments. This isn't just useful for learning; it's how you'll debug real tool-calling agents when something doesn't behave the way you expect.

Common Misconceptions

❌ Misconception: Giving an agent a tool means the tool runs on every message

Reality: The agent decides, per-question, whether a given tool is relevant. Adding a tool to tools=[...] makes it available, not mandatory.

Why this matters: If you ask Aria something unrelated to email, she won't call check_inbox just because it exists — and that's correct behavior, not a bug.

Example:

# ❌ Wrong assumption: "I gave Aria a tool, so every response will use it"
# ✅ Correct: Aria evaluates relevance per-message and calls tools only when needed

Explanation: This is what makes tool-calling agents useful instead of rigid — the model is making a judgment call about relevance every time, the same way a human assistant would decide whether they need to actually go check something versus just answering from what they already know.

❌ Misconception: The model runs your Python function

Reality: The model only ever requests that a tool be called, with specific arguments. LangChain is the thing that actually executes your Python function and returns the real result.

Why this matters: This separation is what makes tool calling safe and predictable — your actual code, with real logic and real access to your systems, runs exactly as written. The model can't bypass it or run something else instead.

Example:

# ❌ Wrong mental model: "the AI executes my code"
# ✅ Correct: the AI requests a call; LangChain executes your real function

Troubleshooting Common Issues

Problem: The agent never calls the tool, even when it seems obviously relevant

Symptoms: Aria answers a question that should require check_inbox without ever calling it — and likely makes something up instead.

Common Causes:

  1. The docstring is missing, vague, or doesn't clearly describe when the tool is useful (most common)
  2. The tool isn't actually included in the tools=[...] list passed to create_agent
  3. The system prompt doesn't mention the agent has this capability, which can make the model less likely to consider using it

Diagnostic Steps:

# Step 1: Confirm the tool has a clear, specific docstring
@tool
def check_inbox() -> str:
"""Check the inbox for recent emails.""" # ✅ clear and specific
# vs.
"""Does inbox stuff.""" # ❌ too vague for the model to judge relevance

# Step 2: Confirm the tool is actually passed to create_agent
agent = create_agent(model="gpt-5-nano", tools=[check_inbox]) # ✅
agent = create_agent(model="gpt-5-nano") # ❌ tool never made available

Solution: Write docstrings the way you'd explain the tool's purpose to a new coworker — specific about what it does and when it's useful. Double-check the tool actually appears in the tools=[...] list.

Prevention: Always test a new tool with a question that obviously requires it, and inspect tool_calls (as shown above) to confirm it actually fired before moving on.

Problem: TypeError or unexpected argument errors when the agent calls a tool

Symptoms: An error appears mentioning missing or unexpected arguments when the agent tries to use a tool.

Common Causes:

  1. The tool's parameters are missing type hints, which makes it harder for the model to infer correct argument values (most common)
  2. A parameter name is ambiguous (e.g., x instead of email_address), causing the model to guess the wrong kind of value

Diagnostic Steps:

# Step 1: Confirm every parameter has an explicit type hint
@tool
def example_tool(query: str) -> str: # ✅ clear type hint
"""Look something up."""
...

# Step 2: Use descriptive parameter names, not single letters
def example_tool(search_query: str) -> str: # ✅ self-explanatory
def example_tool(x: str) -> str: # ❌ ambiguous

Solution: Add explicit type hints to every parameter, and use descriptive names that hint at what kind of value is expected.

Prevention: Treat tool function signatures as part of the model's instructions, not just internal code — they're being read and interpreted, not just executed.

Check Your Understanding

Quick Quiz

  1. What does the model actually see when deciding whether to use a tool?

    Show Answer

    Only the tool's name, its docstring, and its parameter names/types — never the function's actual code. This is why the docstring is so important: it's the model's entire understanding of what the tool does.

  2. If you add a tool to tools=[...] but never ask a question related to it, what happens?

    Show Answer

    Nothing — the tool is never called. The agent decides per-message whether a tool is relevant, and an unrelated question simply won't trigger it.

  3. What's wrong with this tool definition?

    @tool
    def lookup(x: str) -> str:
    """Looks stuff up."""
    return database.query(x)
    Show Answer

    The docstring is too vague for the model to reliably judge when this tool is relevant, and x is an unhelpfully generic parameter name. A better version: def search_customer_records(customer_name: str) -> str: with a docstring like "Search the customer database by customer name."

Hands-On Exercise

Challenge: Add a second tool, get_current_time, that returns today's date and time as a string, and confirm Aria calls it only when asked a time-related question.

Starter Code:

from datetime import datetime

@tool
def get_current_time() -> str:
# Your docstring and implementation here
...
Show Solution
from datetime import datetime
from langchain.tools import tool
from langchain.agents import create_agent
from langchain.messages import HumanMessage

@tool
def get_current_time() -> str:
"""Get the current date and time."""
return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

agent = create_agent(
model="gpt-5-nano",
tools=[check_inbox, get_current_time],
)

# This should trigger get_current_time, not check_inbox
response = agent.invoke(
{"messages": [HumanMessage(content="What time is it right now?")]}
)
print(response["messages"][-1].content)
print(response["messages"][1].tool_calls) # should show get_current_time, not check_inbox

Explanation: With two tools available, the agent picks the one that actually matches the question being asked — this is the same relevance judgment from earlier, just now with a choice between two options instead of one.

Summary: Key Takeaways

  • A tool is a regular Python function made available to an agent via the @tool decorator
  • The model only ever sees a tool's name, docstring, and parameters — never its actual code
  • The agent decides per-question whether a tool is relevant; adding a tool doesn't force its use
  • LangChain — not the model — actually executes the tool and returns the real result
  • You can test a tool directly with .invoke({}) before ever involving an agent
  • response["messages"][n].tool_calls lets you inspect exactly what was called and with what arguments
  • Aria can now check a (fake, for now) inbox — real inbox data and sending replies come next

Version Information

Tested with:

  • Python: >=3.10, <4.0
  • langchain: >=1.1.3 (latest stable as of writing: 1.3.4) — @tool and create_agent are both part of the core langchain package, no extra install needed beyond what Article 1 already set up

Known issues:

  • ⚠️ Tools without docstrings will either fail to register correctly or be effectively invisible to the model, depending on your langchain version — always include one.

What's Next?

You now understand how to give an agent real abilities, and exactly how it decides whether to use them.

The natural next step is Web Search: Real-Time Knowledge for Agents — Aria can check a fixed, fake inbox right now, but the same tool-calling pattern you just learned is what lets her reach out to the live internet for real-time information too.

References