OpenAI Function Calling: Getting Started
How function calling works in the OpenAI API, from defining functions to handling responses.
What Is Function Calling
Function calling lets GPT models “call” your functions. The model doesn’t execute code — it generates JSON with the function name and parameters, and you execute the call on your side.
This is the foundation of OpenAI agents. Without function calling, GPT is a text generator. With it, GPT becomes a tool-using agent that can fetch data, call APIs, and take actions.
How It Works
User: "What's the weather in London?"
↓
GPT: "I want to call get_weather(city='London')"
↓
Your code: calls weather API → {temp: 15, condition: "cloudy"}
↓
GPT: "It's 15°C and cloudy in London right now."
The flow:
- You describe available functions in JSON Schema
- Send the user’s message + function definitions to GPT
- GPT decides if it needs to call a function (or just responds directly)
- If yes — GPT returns the function name + parameters as JSON
- You execute the function and send the result back
- GPT uses the result to generate the final response
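To make the flow concrete, here is a sketch of how the message list might look across one round trip, written as plain Python data with no API call. The call ID `call_abc123` and the weather payload are made-up placeholders; only the overall message shapes are meant to mirror the Chat Completions format.

```python
import json

# Steps 1-2: the user's message (function definitions are sent alongside it).
messages = [{"role": "user", "content": "What's the weather in London?"}]

# Steps 3-4: instead of text, the model replies with a tool call.
# The ID and arguments below are illustrative placeholders.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            # Arguments arrive as a JSON string, not a dict.
            "arguments": json.dumps({"city": "London"}),
        },
    }],
}
messages.append(assistant_turn)

# Step 5: your code runs the function and reports the result under the same ID.
call = assistant_turn["tool_calls"][0]
args = json.loads(call["function"]["arguments"])
result = {"temp": 15, "condition": "cloudy"}  # stand-in for a real weather lookup
messages.append({
    "role": "tool",
    "tool_call_id": call["id"],
    "content": json.dumps(result),
})

# Step 6: this list is what you send back so the model can write the final answer.
print([m["role"] for m in messages])
```

The key detail is that the tool message carries the same ID as the tool call it answers.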
Defining Functions
Each function needs a name, description, and parameter schema. The description is critical — GPT uses it to decide when to call the function.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Gets the current weather for a given city. Use this when the user asks about weather conditions, temperature, or forecasts.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g.: London, New York, Tokyo"
                    },
                    "units": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature units. Default: celsius."
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_products",
            "description": "Searches the product catalog by name, category, or price range. Returns matching products with prices.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "books", "home"],
                        "description": "Product category to filter by"
                    },
                    "max_price": {
                        "type": "number",
                        "description": "Maximum price in USD"
                    }
                },
                "required": ["query"]
            }
        }
    }
]
Pro tip: Use enum wherever possible. It tells the model exactly which values are valid, which means fewer errors and better results. Still check the values on your side before using them.
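Since the schema already lists required fields and enum values, the same schema can be reused to check the arguments the model sends back. The sketch below is a hand-rolled check that covers only `required` and `enum`; the helper name `validate_arguments` is an assumption for illustration, and a full JSON Schema validator would be more thorough.

```python
def validate_arguments(schema: dict, arguments: dict) -> list:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    properties = schema.get("properties", {})
    # Every required parameter must be present.
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required parameter: {name}")
    # Every supplied parameter must be declared and respect its enum, if any.
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected parameter: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}, got {value!r}")
    return problems


# Same shape as the "parameters" object of get_weather.
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

print(validate_arguments(weather_schema, {"city": "London", "units": "kelvin"}))
```

Returning a list of problems rather than raising makes it easy to send the whole list back to the model as a structured error.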
Making the API Call
from openai import OpenAI
import json

client = OpenAI()
messages = [
    {"role": "user", "content": "What's the weather in London?"}
]
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto",  # Let GPT decide whether to call a function
)
message = response.choices[0].message
The tool_choice parameter controls behavior:
- "auto" lets GPT decide (default, recommended)
- "required" forces GPT to call at least one function
- {"type": "function", "function": {"name": "get_weather"}} forces a specific function
- "none" disables function calling for this request
Handling the Response
When GPT decides to call a function, message.tool_calls will be populated:
if message.tool_calls:
    # GPT wants to call one or more functions
    messages.append(message)  # Add GPT's response to conversation
    for tool_call in message.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        print(f"Calling: {function_name}({arguments})")
        # Execute the function
        if function_name == "get_weather":
            result = fetch_weather(arguments["city"], arguments.get("units", "celsius"))
        elif function_name == "search_products":
            result = search_catalog(arguments["query"], arguments.get("category"))
        else:
            result = {"error": f"Unknown function: {function_name}"}
        # Return the result to GPT
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })
    # Get GPT's final response with the function results
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    print(final_response.choices[0].message.content)
else:
    # GPT responded directly without calling a function
    print(message.content)
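As the number of functions grows, the if/elif chain can be swapped for a lookup table. The sketch below is one possible way to do that; `TOOL_HANDLERS` and `dispatch` are names invented here, and `fetch_weather` and `search_catalog` are stubbed with placeholder return values so the example runs on its own.

```python
import json


def fetch_weather(city, units="celsius"):
    """Stub standing in for a real weather lookup."""
    return {"city": city, "temp": 15, "units": units}


def search_catalog(query, category=None):
    """Stub standing in for a real catalog search."""
    return {"query": query, "category": category, "results": []}


# Map each function name to a callable that unpacks the parsed arguments.
TOOL_HANDLERS = {
    "get_weather": lambda args: fetch_weather(args["city"], args.get("units", "celsius")),
    "search_products": lambda args: search_catalog(args["query"], args.get("category")),
}


def dispatch(function_name: str, raw_arguments: str) -> dict:
    """Look up the handler by name and call it with the decoded JSON arguments."""
    handler = TOOL_HANDLERS.get(function_name)
    if handler is None:
        return {"error": f"Unknown function: {function_name}"}
    return handler(json.loads(raw_arguments))


print(dispatch("get_weather", '{"city": "London"}'))
print(dispatch("delete_everything", "{}"))
```

Adding a new tool then means adding one entry to the table instead of another branch.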
Parallel Function Calls
GPT can call multiple functions in a single turn. If you ask “What’s the weather in London and Paris?”, GPT returns two tool_calls:
# GPT returns:
# tool_calls = [
# {name: "get_weather", arguments: {city: "London"}},
# {name: "get_weather", arguments: {city: "Paris"}},
# ]
# Handle all calls and return all results before getting the final response
for tool_call in message.tool_calls:
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)
    result = execute_function(function_name, arguments)
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    })
This is faster than sequential calls — GPT gets all data at once and generates one combined response.
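When the model returns several independent tool calls, the functions themselves can also run concurrently on your side. The following is a minimal sketch using a thread pool, assuming the tools are I/O-bound and safe to run in parallel. The `execute_function` stub, the fake tool-call objects, and the helper name `run_tool_calls_concurrently` are all illustrative.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace


def execute_function(name, arguments):
    """Stub tool: sleeps briefly to imitate a slow network call."""
    time.sleep(0.1)
    return {"city": arguments["city"], "temp": 15}


def run_tool_calls_concurrently(tool_calls):
    """Run every tool call in its own thread and return the tool messages in order."""
    def run_one(tool_call):
        arguments = json.loads(tool_call.function.arguments)
        result = execute_function(tool_call.function.name, arguments)
        return {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        }

    with ThreadPoolExecutor(max_workers=max(1, len(tool_calls))) as pool:
        # pool.map yields results in the same order as the input calls.
        return list(pool.map(run_one, tool_calls))


# Fake objects shaped like SDK tool calls (id, function.name, function.arguments).
fake_calls = [
    SimpleNamespace(
        id=f"call_{i}",
        function=SimpleNamespace(name="get_weather", arguments=json.dumps({"city": city})),
    )
    for i, city in enumerate(["London", "Paris"])
]

tool_messages = run_tool_calls_concurrently(fake_calls)
print([m["tool_call_id"] for m in tool_messages])
```

Each result still goes back under its own tool_call_id, so the order of execution does not matter to the model.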
Building the Agent Loop
A single tool call is useful, but real agents need multiple rounds. Here’s the full agent pattern:
def run_agent(user_message: str, max_turns: int = 10):
    messages = [
        {"role": "system", "content": "You are a helpful assistant. Use tools when needed."},
        {"role": "user", "content": user_message},
    ]
    for turn in range(max_turns):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto",
        )
        message = response.choices[0].message
        # If GPT responded without tools, we're done
        if not message.tool_calls:
            return message.content
        # Execute all tool calls
        messages.append(message)
        for tool_call in message.tool_calls:
            function_name = tool_call.function.name
            try:
                # Parse inside the try so malformed JSON also becomes an error result
                arguments = json.loads(tool_call.function.arguments)
                result = execute_function(function_name, arguments)
            except Exception as e:
                result = {"error": str(e)}
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result),
            })
    return "Max turns reached; agent stopped."
This loop lets the agent chain multiple function calls. Ask “Find cheap electronics under $50 and check if they’re in stock” and, assuming a check_inventory function is also defined in tools, the agent calls search_products, then check_inventory, then responds with a combined answer.
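The chaining behavior can be tried offline by replacing the API with a scripted list of replies. The sketch below is a simplified stand-in for the loop, not the real thing: the scripted responses, the stub tools, and the `check_inventory` function with its `product_id` parameter are all hypothetical.

```python
import json
from types import SimpleNamespace


def make_tool_call(call_id, name, arguments):
    """Build an object shaped like an SDK tool call (id, function.name, function.arguments)."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=json.dumps(arguments)),
    )


# Scripted "model" replies: two tool-call turns, then a plain-text answer.
SCRIPT = [
    SimpleNamespace(content=None, tool_calls=[
        make_tool_call("call_1", "search_products", {"query": "electronics", "max_price": 50}),
    ]),
    SimpleNamespace(content=None, tool_calls=[
        make_tool_call("call_2", "check_inventory", {"product_id": "usb-hub"}),
    ]),
    SimpleNamespace(content="The USB hub is under $50 and in stock.", tool_calls=None),
]

# Stub tools with placeholder data.
TOOLS = {
    "search_products": lambda args: [{"id": "usb-hub", "price": 19}],
    "check_inventory": lambda args: {"product_id": args["product_id"], "in_stock": True},
}


def run_scripted_agent(script, max_turns=10):
    """Walk the scripted replies the way the agent loop walks API responses."""
    messages, calls_made = [], []
    for message in script[:max_turns]:  # each item stands in for one API response
        if not message.tool_calls:
            return message.content, calls_made
        messages.append(message)
        for tool_call in message.tool_calls:
            arguments = json.loads(tool_call.function.arguments)
            result = TOOLS[tool_call.function.name](arguments)
            calls_made.append(tool_call.function.name)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result),
            })
    return "Max turns reached.", calls_made


answer, calls_made = run_scripted_agent(SCRIPT)
print(calls_made)
print(answer)
```

Lowering max_turns below the number of scripted turns exercises the turn cap without spending any tokens.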
Error Handling
Functions fail. APIs time out, data doesn’t exist, parameters are wrong. Always handle errors gracefully:
def execute_function(name: str, arguments: dict) -> dict:
    try:
        if name == "get_weather":
            return fetch_weather(arguments["city"])
        elif name == "search_products":
            return search_catalog(arguments["query"])
        else:
            return {"error": f"Unknown function: {name}"}
    except ConnectionError:
        return {"error": "Service unavailable. Please try again."}
    except KeyError as e:
        return {"error": f"Missing required parameter: {e}"}
    except Exception as e:
        return {"error": f"Failed to execute {name}: {str(e)}"}
Return errors as structured data, not exceptions. GPT will explain the error to the user or try an alternative approach.
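For transient failures such as timeouts, it can help to retry a few times before giving up and returning the structured error. Below is a minimal sketch of that idea with exponential backoff; `call_with_retry`, its parameters, and the flaky stub are assumptions made for illustration.

```python
import time


def call_with_retry(func, *args, attempts=3, base_delay=0.5, **kwargs):
    """Call func, retrying on ConnectionError; return an error dict if all attempts fail."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except ConnectionError as exc:
            if attempt == attempts:
                return {"error": f"Service unavailable after {attempts} attempts: {exc}"}
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))


# Stub that fails twice and then succeeds, to imitate a flaky API.
attempts_seen = {"count": 0}


def flaky_weather(city):
    attempts_seen["count"] += 1
    if attempts_seen["count"] < 3:
        raise ConnectionError("timed out")
    return {"city": city, "temp": 15}


def always_down(city):
    raise ConnectionError("down")


result = call_with_retry(flaky_weather, "London", base_delay=0)
failure = call_with_retry(always_down, "London", attempts=2, base_delay=0)
print(result)
print(failure)
```

Only ConnectionError is retried here; errors like a missing parameter will not fix themselves and are better reported to the model right away.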
Streaming with Function Calls
For real-time applications, use streaming. Function calls arrive as deltas:
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    stream=True,
)
tool_calls_buffer = {}
for chunk in stream:
    # Some chunks carry no choices; skip them
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # Text content: print immediately
    if delta.content:
        print(delta.content, end="", flush=True)
    # Tool call chunks: accumulate fragments by index
    if delta.tool_calls:
        for tc in delta.tool_calls:
            idx = tc.index
            if idx not in tool_calls_buffer:
                tool_calls_buffer[idx] = {"id": "", "name": "", "arguments": ""}
            # Keep the call ID; the tool response needs it later
            if tc.id:
                tool_calls_buffer[idx]["id"] = tc.id
            if tc.function and tc.function.name:
                tool_calls_buffer[idx]["name"] = tc.function.name
            if tc.function and tc.function.arguments:
                tool_calls_buffer[idx]["arguments"] += tc.function.arguments
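Once the stream ends, the buffered argument fragments still have to be parsed before any function can run. The helper below is a sketch of that last step, assuming a buffer keyed by index with `name` and `arguments` strings as in the loop above. `finalize_tool_calls` is a name invented here, and the sample buffer is made-up data.

```python
import json


def finalize_tool_calls(tool_calls_buffer: dict) -> list:
    """Turn buffered fragments into parsed calls, ordered by stream index."""
    calls = []
    for idx in sorted(tool_calls_buffer):
        entry = tool_calls_buffer[idx]
        try:
            # An empty argument string is treated as "no arguments".
            arguments = json.loads(entry["arguments"] or "{}")
            error = None
        except json.JSONDecodeError as exc:
            arguments, error = None, f"Invalid JSON arguments: {exc.msg}"
        calls.append({
            "id": entry.get("id"),  # present only if the buffer recorded it
            "name": entry["name"],
            "arguments": arguments,
            "error": error,
        })
    return calls


sample_buffer = {
    1: {"name": "get_weather", "arguments": '{"city": "Par'},  # cut off mid-stream
    0: {"name": "get_weather", "arguments": '{"city": ' + '"London"}'},
}

calls = finalize_tool_calls(sample_buffer)
for call in calls:
    print(call["name"], call["arguments"], call["error"])
```

Parsing only after the stream has finished matters because any individual fragment is rarely valid JSON on its own.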
OpenAI vs Claude: Function Calling Comparison
| Aspect | OpenAI Function Calling | Claude Tool Use |
|---|---|---|
| JSON Schema | Yes | Yes |
| Parallel calls | Yes | Yes |
| Streaming | Yes | Yes |
| Force specific function | tool_choice | tool_choice |
| Managed state | Assistants API (threads) | Stateless (you manage) |
| Built-in RAG | Assistants API file search | No (use MCP or custom) |
| Code execution | Code Interpreter | No |
| Open ecosystem | Proprietary | MCP (open standard) |
Both platforms are capable. OpenAI has the Assistants API with managed threads and built-in RAG. Claude has MCP with a massive open ecosystem of pre-built servers. Choose based on your needs.
Best Practices
- Descriptions matter most: GPT chooses functions based on descriptions. Write them like documentation, not variable names.
- Validate parameters: don’t blindly trust JSON from the model. Check types and ranges, and sanitize input.
- Use enums: restrict values wherever possible. "enum": ["celsius", "fahrenheit"] is better than "type": "string".
- Handle errors gracefully: return structured error objects. GPT will communicate the issue to the user.
- Set turn limits: always cap the agent loop. 10-15 turns handles most scenarios. Runaway loops waste tokens and money.
- Log everything: log every function call with input/output. Essential for debugging and cost monitoring.
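The logging advice can be sketched as a thin wrapper around each tool invocation, using Python's standard logging module. The wrapper name `logged_call` and the log format are assumptions; a real setup would likely add request IDs and token counts.

```python
import json
import logging
import time

logger = logging.getLogger("tool_calls")


def logged_call(name, arguments, func):
    """Run func(arguments) and log the input, output, and duration."""
    start = time.perf_counter()
    result = func(arguments)
    elapsed_ms = (time.perf_counter() - start) * 1000
    logger.info(
        "tool=%s args=%s result=%s elapsed_ms=%.1f",
        name, json.dumps(arguments), json.dumps(result), elapsed_ms,
    )
    return result


logging.basicConfig(level=logging.INFO)
result = logged_call(
    "get_weather",
    {"city": "London"},
    lambda args: {"temp": 15, "condition": "cloudy"},  # stub tool
)
```

Serializing arguments and results as JSON keeps each log line easy to search and parse later.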
Common Mistakes
- Too many functions: GPT tends to pick tools less reliably when offered 20+ functions. Group related operations or use routing.
- Vague descriptions: “Does stuff with data” won’t work. Be specific about when and why to use each function.
- No error handling: a crashing function breaks the whole agent. Always return clean error messages.
- Forgetting tool_call_id: every tool response must include the matching tool_call_id. Without it, results can’t be matched to calls, and the API will typically reject the request.
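A quick check before each request can catch the last mistake early. The sketch below scans a message list for tool calls that have no matching tool response. It assumes every message is a plain dict, so SDK message objects would need converting first, and `missing_tool_responses` is a hypothetical helper name.

```python
def missing_tool_responses(messages: list) -> set:
    """Return the IDs of tool calls that have no tool message answering them."""
    expected, answered = set(), set()
    for message in messages:
        if message.get("role") == "assistant":
            for call in message.get("tool_calls") or []:
                expected.add(call["id"])
        elif message.get("role") == "tool":
            answered.add(message.get("tool_call_id"))
    return expected - answered


# Made-up history: two tool calls, but only one has been answered.
history = [
    {"role": "user", "content": "Weather in London and Paris?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "London"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp": 15}'},
]

print(missing_tool_responses(history))
```

An empty set means every tool call in the history has a response and the list is ready to send.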
Next Steps
- Compare with Claude: Claude Tool Use
- Build a full agent from scratch: Build Your First Agent
- Learn about the OpenAI Assistants API: OpenAI Assistants API Guide