What Is Function Calling

Function calling lets GPT models “call” your functions. The model doesn’t execute code — it generates JSON with the function name and parameters, and you execute the call on your side.

This is the foundation of OpenAI agents. Without function calling, GPT is a text generator. With it, GPT becomes a tool-using agent that can fetch data, call APIs, and take actions.

How It Works

User: "What's the weather in London?"
  ↓
GPT: "I want to call get_weather(city='London')"
  ↓
Your code: calls weather API → {temp: 15, condition: "cloudy"}
  ↓
GPT: "It's 15°C and cloudy in London right now."

The flow:

1. You describe the available functions in JSON Schema.
2. You send the user’s message plus the function definitions to GPT.
3. GPT decides whether it needs to call a function (or just responds directly).
4. If it does, GPT returns the function name and parameters as JSON.
5. You execute the function and send the result back.
6. GPT uses the result to generate the final response.
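
The six steps above map onto a growing list of chat messages. The sketch below builds that list by hand for the weather example, without any API call, just to show the shape of each message. The tool call ID `call_1` and the weather payload are made-up illustration values, not real API output.

```python
import json

# Step 2: the conversation starts with the user's message.
messages = [{"role": "user", "content": "What's the weather in London?"}]

# Step 4: the model's reply carries a tool call instead of text.
# "call_1" is a made-up ID; real IDs are generated by the API.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {
            "name": "get_weather",
            # Arguments arrive as a JSON string, not as a dict.
            "arguments": json.dumps({"city": "London"}),
        },
    }],
}
messages.append(assistant_turn)

# Step 5: your code runs the function and reports the result,
# echoing the same tool call ID.
result = {"temp": 15, "condition": "cloudy"}  # illustrative data
messages.append({
    "role": "tool",
    "tool_call_id": "call_1",
    "content": json.dumps(result),
})

# Step 6: this full list is what you send back for the final answer.
print([m["role"] for m in messages])
```

Note that the arguments are a JSON string, which is why the handling code later in this article calls json.loads on them.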

Defining Functions

Each function needs a name, description, and parameter schema. The description is critical — GPT uses it to decide when to call the function.

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Gets the current weather for a given city. Use this when the user asks about weather conditions, temperature, or forecasts.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g.: London, New York, Tokyo"
                    },
                    "units": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature units. Default: celsius."
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_products",
            "description": "Searches the product catalog by name, category, or price range. Returns matching products with prices.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "books", "home"],
                        "description": "Product category to filter by"
                    },
                    "max_price": {
                        "type": "number",
                        "description": "Maximum price in USD"
                    }
                },
                "required": ["query"]
            }
        }
    }
]

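Pro tip: Use enum wherever possible. It narrows the model’s output to a known set of values, which means fewer invalid arguments. The model can still occasionally return something outside the schema, so validate on your side as well.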
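
Writing the nested dictionaries by hand gets repetitive once you have more than a couple of tools. One possible approach is a small builder like the sketch below. `make_tool` is a hypothetical helper introduced here for illustration, not part of the OpenAI SDK; it only wraps a parameter schema in the format shown above and checks that every required parameter is actually declared.

```python
def make_tool(name: str, description: str, properties: dict, required: list) -> dict:
    """Wrap a parameter schema in the tool format shown above."""
    missing = [key for key in required if key not in properties]
    if missing:
        raise ValueError(f"Required parameters not in properties: {missing}")
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


weather_tool = make_tool(
    name="get_weather",
    description="Gets the current weather for a given city.",
    properties={
        "city": {"type": "string", "description": "City name, e.g.: London"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    required=["city"],
)
print(weather_tool["function"]["name"])
```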

Making the API Call

from openai import OpenAI
import json

client = OpenAI()

messages = [
    {"role": "user", "content": "What's the weather in London?"}
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto",  # Let GPT decide whether to call a function
)

message = response.choices[0].message

The tool_choice parameter controls behavior:

"auto" — GPT decides (default, recommended)
"required" — GPT must call at least one function
{"type": "function", "function": {"name": "get_weather"}} — force a specific function
"none" — disable function calling for this request

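When forcing a specific function, a misspelled name is easier to catch locally than from an API error. The sketch below shows one way to build the forced tool_choice value and check it against your tools list first. `force_tool` and `example_tools` are hypothetical names used only for this illustration.

```python
def force_tool(tools: list, name: str) -> dict:
    """Build a tool_choice value that forces one named function."""
    known = {tool["function"]["name"] for tool in tools}
    if name not in known:
        raise ValueError(f"{name!r} is not defined in tools: {sorted(known)}")
    return {"type": "function", "function": {"name": name}}


# Minimal stand-in for the tools list defined earlier.
example_tools = [
    {"type": "function", "function": {"name": "get_weather"}},
    {"type": "function", "function": {"name": "search_products"}},
]

choice = force_tool(example_tools, "get_weather")
print(choice)
```

You would then pass the result as tool_choice=choice in the request.
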
Handling the Response

When GPT decides to call a function, message.tool_calls will be populated:

if message.tool_calls:
    # GPT wants to call one or more functions
    messages.append(message)  # Add GPT's response to conversation

    for tool_call in message.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)

        print(f"Calling: {function_name}({arguments})")

        # Execute the function
        if function_name == "get_weather":
            result = fetch_weather(arguments["city"], arguments.get("units", "celsius"))
        elif function_name == "search_products":
            result = search_catalog(arguments["query"], arguments.get("category"))
        else:
            result = {"error": f"Unknown function: {function_name}"}

        # Return the result to GPT
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })

    # Get GPT's final response with the function results
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    print(final_response.choices[0].message.content)
else:
    # GPT responded directly without calling a function
    print(message.content)
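
The if/elif chain above grows with every new tool. A common alternative is a lookup table that maps each function name to a handler. The sketch below assumes stub versions of fetch_weather and search_catalog (the article does not show their real implementations), and `HANDLERS` and `dispatch` are hypothetical names.

```python
import json
from typing import Optional


# Hypothetical stand-ins for the real implementations.
def fetch_weather(city: str, units: str = "celsius") -> dict:
    return {"city": city, "temp": 15, "units": units}


def search_catalog(query: str, category: Optional[str] = None) -> dict:
    return {"query": query, "category": category, "items": []}


# Map each tool name to a callable that takes the parsed arguments dict.
HANDLERS = {
    "get_weather": lambda args: fetch_weather(args["city"], args.get("units", "celsius")),
    "search_products": lambda args: search_catalog(args["query"], args.get("category")),
}


def dispatch(function_name: str, raw_arguments: str) -> dict:
    """Look up the handler by name and call it with the parsed arguments."""
    handler = HANDLERS.get(function_name)
    if handler is None:
        return {"error": f"Unknown function: {function_name}"}
    return handler(json.loads(raw_arguments))


print(dispatch("get_weather", '{"city": "London"}'))
```

Adding a tool then means adding one entry to the table rather than another branch.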

Parallel Function Calls

GPT can call multiple functions in a single turn. If you ask “What’s the weather in London and Paris?”, GPT returns two tool_calls:

# GPT returns:
# tool_calls = [
#   {name: "get_weather", arguments: {city: "London"}},
#   {name: "get_weather", arguments: {city: "Paris"}},
# ]

# Handle all calls and return all results before getting the final response
for tool_call in message.tool_calls:
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)
    result = execute_function(function_name, arguments)

    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    })

This is faster than sequential calls — GPT gets all data at once and generates one combined response.
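
The loop above still runs the functions themselves one after another. When the functions are I/O-bound (for example, HTTP requests), you may be able to cut the wait further by running them concurrently. The sketch below uses a thread pool from the standard library. It assumes a stub execute_function and uses fake tool call objects shaped like the SDK's (id, function.name, function.arguments); `run_tool_calls` is a hypothetical helper.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace


def execute_function(name: str, arguments: dict) -> dict:
    # Hypothetical stand-in; a real version would call your APIs.
    return {"function": name, "city": arguments.get("city")}


def run_tool_calls(tool_calls) -> list:
    """Run every tool call in a thread pool and build the tool messages."""
    def run_one(tool_call):
        arguments = json.loads(tool_call.function.arguments)
        result = execute_function(tool_call.function.name, arguments)
        return {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        }

    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map yields results in input order, so the messages
        # line up with the original tool_calls.
        return list(pool.map(run_one, tool_calls))


# Fake tool calls for illustration only.
fake_calls = [
    SimpleNamespace(id="call_1", function=SimpleNamespace(
        name="get_weather", arguments='{"city": "London"}')),
    SimpleNamespace(id="call_2", function=SimpleNamespace(
        name="get_weather", arguments='{"city": "Paris"}')),
]

tool_messages = run_tool_calls(fake_calls)
print([m["tool_call_id"] for m in tool_messages])
```

This only helps when the functions are thread-safe and spend most of their time waiting; for quick local functions the plain loop is simpler.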

Building the Agent Loop

A single tool call is useful, but real agents need multiple rounds. Here’s the full agent pattern:

def run_agent(user_message: str, max_turns: int = 10):
    messages = [
        {"role": "system", "content": "You are a helpful assistant. Use tools when needed."},
        {"role": "user", "content": user_message},
    ]

    for turn in range(max_turns):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto",
        )

        message = response.choices[0].message

        # If GPT responded without tools — we're done
        if not message.tool_calls:
            return message.content

        # Execute all tool calls
        messages.append(message)
        for tool_call in message.tool_calls:
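            function_name = tool_call.function.name

            try:
                # Parse inside the try block: malformed JSON arguments
                # should become a tool error, not crash the loop
                arguments = json.loads(tool_call.function.arguments)
                result = execute_function(function_name, arguments)
            except Exception as e:
                result = {"error": str(e)}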

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result),
            })

    return "Max turns reached — agent stopped."

This loop lets the agent chain multiple function calls. Ask “Find cheap electronics under $50 and check if they’re in stock” and the agent can call search_products, then an inventory tool such as check_inventory (not defined above; you would add it to tools), and finally respond with a combined answer.
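
An agent loop is easier to trust once you can exercise it without network access. One option is to pass the client in as a parameter and substitute a fake that replays scripted replies. Everything in the sketch below is an assumption made for illustration: `FakeClient`, `run_agent_with`, `tool_turn`, the stub execute_function, and the check_inventory arguments. The loop is a trimmed copy of the pattern above (no system prompt, no tools argument).

```python
import json
from types import SimpleNamespace


def execute_function(name, arguments):
    # Hypothetical stand-in for real tool implementations.
    return {"tool": name, "ok": True}


class FakeClient:
    """Replays scripted assistant messages instead of calling the API."""

    def __init__(self, scripted_messages):
        self._scripted = list(scripted_messages)
        # Mirror the client.chat.completions.create call path.
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create))

    def _create(self, **kwargs):
        message = self._scripted.pop(0)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def run_agent_with(client, user_message, max_turns=10):
    # Same loop shape as above, with the client injected so it can be faked.
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        response = client.chat.completions.create(model="gpt-4o", messages=messages)
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content
        messages.append(message)
        for tool_call in message.tool_calls:
            arguments = json.loads(tool_call.function.arguments)
            result = execute_function(tool_call.function.name, arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result),
            })
    return "Max turns reached; agent stopped."


def tool_turn(call_id, name, arguments):
    """Build a fake assistant message that requests one tool call."""
    call = SimpleNamespace(id=call_id, function=SimpleNamespace(
        name=name, arguments=json.dumps(arguments)))
    return SimpleNamespace(content=None, tool_calls=[call])


script = [
    tool_turn("call_1", "search_products", {"query": "electronics", "max_price": 50}),
    tool_turn("call_2", "check_inventory", {"product_id": "p1"}),
    SimpleNamespace(content="Two items are in stock.", tool_calls=None),
]
answer = run_agent_with(FakeClient(script), "Find cheap electronics under $50")
print(answer)
```

The same fake also lets you confirm the turn limit: script more tool turns than max_turns allows and check that the loop stops.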

Error Handling

Functions fail. APIs time out, data doesn’t exist, parameters are wrong. Always handle errors gracefully:

def execute_function(name: str, arguments: dict) -> dict:
    try:
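        if name == "get_weather":
            # Pass the optional parameters too, so the model's choice is honored
            return fetch_weather(arguments["city"], arguments.get("units", "celsius"))
        elif name == "search_products":
            return search_catalog(arguments["query"], arguments.get("category"))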
        else:
            return {"error": f"Unknown function: {name}"}
    except ConnectionError:
        return {"error": "Service unavailable. Please try again."}
    except KeyError as e:
        return {"error": f"Missing required parameter: {e}"}
    except Exception as e:
        return {"error": f"Failed to execute {name}: {str(e)}"}

Return errors as structured data, not exceptions. GPT will explain the error to the user or try an alternative approach.
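
One failure happens before execute_function is even reached: the arguments string itself may not be valid JSON, or may not be a JSON object. A minimal sketch of handling that in the same structured style is shown below. `parse_arguments` is a hypothetical helper, and the exact error wording is only a suggestion.

```python
import json


def parse_arguments(raw: str):
    """Return (arguments, error). Exactly one of the two is None."""
    try:
        # Treat an empty string as "no arguments".
        parsed = json.loads(raw or "{}")
    except json.JSONDecodeError as exc:
        return None, {"error": f"Arguments were not valid JSON: {exc.msg}"}
    if not isinstance(parsed, dict):
        return None, {"error": "Arguments must be a JSON object."}
    return parsed, None


arguments, error = parse_arguments('{"city": "London"}')
print(arguments, error)

arguments, error = parse_arguments('{"city": ')
print(arguments, error)
```

If error is not None, send it back as the tool result instead of calling the function, so GPT gets a chance to retry with corrected arguments.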

Streaming with Function Calls

For real-time applications, use streaming. Function calls arrive as deltas:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    stream=True,
)

tool_calls_buffer = {}
for chunk in stream:
    # Some chunks carry no choices (for example, a usage-only chunk)
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta

    # Text content: print immediately
    if delta.content:
        print(delta.content, end="", flush=True)

    # Tool call chunks: accumulate by index
    if delta.tool_calls:
        for tc in delta.tool_calls:
            idx = tc.index
            if idx not in tool_calls_buffer:
                tool_calls_buffer[idx] = {"id": "", "name": "", "arguments": ""}
            # The ID is needed later for the matching tool response
            if tc.id:
                tool_calls_buffer[idx]["id"] = tc.id
            if tc.function and tc.function.name:
                tool_calls_buffer[idx]["name"] = tc.function.name
            if tc.function and tc.function.arguments:
                tool_calls_buffer[idx]["arguments"] += tc.function.arguments
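
Once the stream ends, the accumulated argument fragments still have to be parsed before you can execute anything. The sketch below replays a few fake deltas (shaped like entries of delta.tool_calls, with made-up IDs) through the same accumulation logic and then finalizes the buffer. It also keeps each call's id, which the tool response needs. `fake_delta` and `finalize` are hypothetical names.

```python
import json
from types import SimpleNamespace


def fake_delta(index, id=None, name=None, arguments=None):
    # Shaped like one entry of delta.tool_calls in a streamed chunk.
    return SimpleNamespace(
        index=index, id=id,
        function=SimpleNamespace(name=name, arguments=arguments))


# Two tool calls whose argument JSON arrives in fragments.
deltas = [
    fake_delta(0, id="call_1", name="get_weather", arguments=""),
    fake_delta(0, arguments='{"city": '),
    fake_delta(0, arguments='"London"}'),
    fake_delta(1, id="call_2", name="get_weather", arguments='{"city": "Paris"}'),
]

tool_calls_buffer = {}
for tc in deltas:
    entry = tool_calls_buffer.setdefault(
        tc.index, {"id": "", "name": "", "arguments": ""})
    if tc.id:
        entry["id"] = tc.id
    if tc.function.name:
        entry["name"] = tc.function.name
    if tc.function.arguments:
        entry["arguments"] += tc.function.arguments


def finalize(buffer: dict) -> list:
    """Parse each accumulated call once the stream has ended."""
    calls = []
    for index in sorted(buffer):
        entry = buffer[index]
        calls.append({
            "id": entry["id"],
            "name": entry["name"],
            # Fragments only form valid JSON after the last chunk.
            "arguments": json.loads(entry["arguments"] or "{}"),
        })
    return calls


print(finalize(tool_calls_buffer))
```

Parsing is deferred to the end on purpose: a partial fragment such as '{"city": ' is not valid JSON on its own.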

OpenAI vs Claude: Function Calling Comparison

Aspect	OpenAI Function Calling	Claude Tool Use
JSON Schema	Yes	Yes
Parallel calls	Yes	Yes
Streaming	Yes	Yes
Force specific function	`tool_choice`	`tool_choice`
Managed state	Assistants API (threads)	Stateless (you manage)
Built-in RAG	Assistants API file search	No (use MCP or custom)
Code execution	Code Interpreter	No
Open ecosystem	Proprietary	MCP (open standard)

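Both platforms are capable. OpenAI offers the Assistants API with managed threads and built-in file search. Claude offers MCP, an open standard with a large ecosystem of pre-built servers. If you want managed state and retrieval out of the box, OpenAI is the closer fit; if you prefer an open tool ecosystem and are comfortable managing state yourself, look at Claude.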

Best Practices

Descriptions matter most: GPT chooses functions based on their descriptions. Write them like documentation, not variable names.
Validate parameters: don’t blindly trust JSON from the model. Check types and ranges, and sanitize input.
Use enums: restrict values wherever possible. Adding "enum": ["celsius", "fahrenheit"] is better than a bare "type": "string".
Handle errors gracefully: return structured error objects so GPT can communicate the issue to the user.
Set turn limits: always cap the agent loop. 10-15 turns handle most scenarios, and runaway loops waste tokens and money.
Log everything: log every function call with its input and output. This is essential for debugging and cost monitoring.
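
To make the "validate parameters" advice concrete, here is a minimal sketch of a checker that compares parsed arguments against the same parameter schema you send to GPT. It covers only required keys, basic types, and enums; `validate_arguments` and `JSON_TYPES` are hypothetical names, and a full JSON Schema validator would handle far more cases.

```python
# Map JSON Schema type names to Python types (subset used in this article).
JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def validate_arguments(arguments: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the arguments pass."""
    problems = []
    properties = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in arguments:
            problems.append(f"missing required parameter: {key}")
    for key, value in arguments.items():
        spec = properties.get(key)
        if spec is None:
            problems.append(f"unexpected parameter: {key}")
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_stray_bool = isinstance(value, bool) and spec.get("type") != "boolean"
        if expected and (not isinstance(value, expected) or is_stray_bool):
            problems.append(f"{key}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{key}: must be one of {spec['enum']}")
    return problems


weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

print(validate_arguments({"city": "London", "units": "kelvin"}, weather_schema))
```

Returning the problem list as a structured error gives GPT enough detail to correct its next call.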

Common Mistakes

Too many functions: GPT tends to perform worse with 20+ functions. Group related operations or route requests to a smaller set of tools.
Vague descriptions: “Does stuff with data” won’t work. Be specific about when and why to use each function.
No error handling: an unhandled exception in one function crashes the whole agent loop. Always return clean error messages.
Forgetting tool_call_id: every tool response must include the matching tool_call_id. Without it, results cannot be matched to calls, and the API will typically reject the request.
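
The last mistake can be caught before the request is sent. The sketch below scans a conversation for tool calls that have no matching tool message. It assumes every message is a plain dict (for example, after converting SDK message objects to dicts), and `missing_tool_responses` is a hypothetical helper.

```python
def missing_tool_responses(messages: list) -> set:
    """Return tool call IDs that have no matching tool message."""
    requested, answered = set(), set()
    for message in messages:
        if message.get("role") == "assistant":
            for call in message.get("tool_calls") or []:
                requested.add(call["id"])
        elif message.get("role") == "tool":
            answered.add(message.get("tool_call_id"))
    return requested - answered


# Illustrative conversation: two calls requested, only one answered.
conversation = [
    {"role": "user", "content": "Weather in London and Paris?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "London"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp": 15}'},
]

print(missing_tool_responses(conversation))
```

An empty set means every requested call has a response and the conversation is safe to send.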

Next Steps

Compare with Claude: Claude Tool Use
Build a full agent from scratch: Build Your First Agent
Learn about the OpenAI Assistants API: OpenAI Assistants API Guide