
OpenAI Function Calling: Getting Started

How function calling works in the OpenAI API, from defining functions to handling responses.

What Is Function Calling

Function calling lets GPT models “call” your functions. The model doesn’t execute code — it generates JSON with the function name and parameters, and you execute the call on your side.

This is the foundation of OpenAI agents. Without function calling, GPT is a text generator. With it, GPT becomes a tool-using agent that can fetch data, call APIs, and take actions.

How It Works

User: "What's the weather in London?"

GPT: "I want to call get_weather(city='London')"

Your code: calls weather API → {temp: 15, condition: "cloudy"}

GPT: "It's 15°C and cloudy in London right now."

The flow:

  1. You describe available functions in JSON Schema
  2. Send the user’s message + function definitions to GPT
  3. GPT decides if it needs to call a function (or just responds directly)
  4. If yes — GPT returns the function name + parameters as JSON
  5. You execute the function and send the result back
  6. GPT uses the result to generate the final response
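Steps 4 and 5 can be tried without a network call. The sketch below assumes a tool call shaped like the one the Chat Completions API returns (a name plus a JSON-encoded arguments string). The id value and fake_get_weather are hypothetical stand-ins, not part of any real API.

```python
import json

# What the model sends back in step 4: the function name plus arguments
# encoded as a JSON *string* (hand-written here for illustration).
tool_call = {
    "id": "call_123",  # hypothetical id
    "name": "get_weather",
    "arguments": '{"city": "London"}',
}

def fake_get_weather(city: str) -> dict:
    """Hypothetical stand-in for a real weather API call."""
    return {"city": city, "temp": 15, "condition": "cloudy"}

# Step 5: decode the arguments, run the function, and build the message
# that carries the result back to the model.
arguments = json.loads(tool_call["arguments"])
result = fake_get_weather(**arguments)
tool_message = {
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": json.dumps(result),
}
```

The model never sees Python objects: both directions travel as JSON text, which is why the arguments are decoded and the result is encoded.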

Defining Functions

Each function needs a name, description, and parameter schema. The description is critical — GPT uses it to decide when to call the function.

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Gets the current weather for a given city. Use this when the user asks about weather conditions, temperature, or forecasts.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g.: London, New York, Tokyo"
                    },
                    "units": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature units. Default: celsius."
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_products",
            "description": "Searches the product catalog by name, category, or price range. Returns matching products with prices.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "books", "home"],
                        "description": "Product category to filter by"
                    },
                    "max_price": {
                        "type": "number",
                        "description": "Maximum price in USD"
                    }
                },
                "required": ["query"]
            }
        }
    }
]

Pro tip: Use enum wherever possible. It restricts the model’s output to valid values — fewer errors, better results.
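One way to keep an enum in the schema and the values your code accepts from drifting apart is to derive both from a single Python Enum. This is a sketch of that idea, not something the API requires; the Units class and parse_units helper are names invented for illustration.

```python
from enum import Enum
from typing import Optional

class Units(str, Enum):
    """Single source of truth for the allowed unit values."""
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"

# Schema fragment for the "units" parameter, built from the Enum
units_schema = {
    "type": "string",
    "enum": [u.value for u in Units],
    "description": "Temperature units. Default: celsius.",
}

def parse_units(raw: Optional[str]) -> Units:
    """Convert the model's string into a Units member.

    Raises ValueError if the value is not one of the allowed options.
    """
    return Units(raw) if raw is not None else Units.CELSIUS
```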

Making the API Call

from openai import OpenAI
import json

client = OpenAI()

messages = [
    {"role": "user", "content": "What's the weather in London?"}
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto",  # Let GPT decide whether to call a function
)

message = response.choices[0].message

The tool_choice parameter controls behavior:

  • "auto" — GPT decides (default, recommended)
  • "required" — GPT must call at least one function
  • {"type": "function", "function": {"name": "get_weather"}} — force a specific function
  • "none" — disable function calling for this request
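The forced-function form is easy to mistype, so a small helper can build it. The sketch below is illustrative only: force_tool and pick_tool_choice are hypothetical helpers, and the keyword rule is a toy example of when you might force a call rather than a recommended routing strategy.

```python
def force_tool(name: str) -> dict:
    """Build the tool_choice value that forces one specific function."""
    return {"type": "function", "function": {"name": name}}

def pick_tool_choice(user_text: str):
    """Toy rule: force get_weather when the user mentions weather,
    otherwise let the model decide."""
    if "weather" in user_text.lower():
        return force_tool("get_weather")
    return "auto"
```

The returned value can be passed directly as the tool_choice argument of client.chat.completions.create.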

Handling the Response

When GPT decides to call a function, message.tool_calls will be populated:

if message.tool_calls:
    # GPT wants to call one or more functions
    messages.append(message)  # Add GPT's response to conversation

    for tool_call in message.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)

        print(f"Calling: {function_name}({arguments})")

        # Execute the function
        if function_name == "get_weather":
            result = fetch_weather(arguments["city"], arguments.get("units", "celsius"))
        elif function_name == "search_products":
            result = search_catalog(arguments["query"], arguments.get("category"))
        else:
            result = {"error": f"Unknown function: {function_name}"}

        # Return the result to GPT
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })

    # Get GPT's final response with the function results
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    print(final_response.choices[0].message.content)
else:
    # GPT responded directly without calling a function
    print(message.content)
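As the number of tools grows, the if/elif chain can be swapped for a lookup table. The sketch below is one possible arrangement; fetch_weather and search_catalog are stubbed here with made-up return values, since their real implementations are not shown in this article.

```python
import json

def fetch_weather(city, units="celsius"):
    """Hypothetical stub standing in for the real weather lookup."""
    return {"city": city, "units": units, "temp": 15}

def search_catalog(query, category=None, max_price=None):
    """Hypothetical stub standing in for the real catalog search."""
    return {"query": query, "category": category, "max_price": max_price, "items": []}

# Map each tool name to a callable that takes the decoded arguments dict
HANDLERS = {
    "get_weather": lambda a: fetch_weather(a["city"], a.get("units", "celsius")),
    "search_products": lambda a: search_catalog(
        a["query"], a.get("category"), a.get("max_price")
    ),
}

def dispatch(function_name: str, raw_arguments: str) -> dict:
    """Decode the JSON arguments and route to the matching handler."""
    handler = HANDLERS.get(function_name)
    if handler is None:
        return {"error": f"Unknown function: {function_name}"}
    return handler(json.loads(raw_arguments))
```

Adding a tool then means adding one schema entry and one HANDLERS entry, with no change to the dispatch logic.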

Parallel Function Calls

GPT can call multiple functions in a single turn. If you ask “What’s the weather in London and Paris?”, GPT returns two tool_calls:

# GPT returns:
# tool_calls = [
#   {name: "get_weather", arguments: {city: "London"}},
#   {name: "get_weather", arguments: {city: "Paris"}},
# ]

# Append the assistant message that contains the tool_calls first,
# then handle all calls and return all results before getting the final response
messages.append(message)
for tool_call in message.tool_calls:
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)
    result = execute_function(function_name, arguments)

    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    })

This is faster than sequential calls — GPT gets all data at once and generates one combined response.
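When the model requests several calls at once, your side can also run them concurrently instead of one after another. The sketch below uses a thread pool from the standard library and assumes the tool functions are I/O-bound and safe to run in parallel. execute_function is stubbed, and the SimpleNamespace objects only imitate the id, function.name, and function.arguments attributes of real tool calls.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace

def execute_function(name, arguments):
    """Hypothetical stand-in for the real dispatcher."""
    return {"function": name, "city": arguments.get("city")}

def run_tool_calls(tool_calls):
    """Run every requested call in a thread pool and return the tool
    messages in the same order the model requested them."""
    def run_one(tool_call):
        arguments = json.loads(tool_call.function.arguments)
        result = execute_function(tool_call.function.name, arguments)
        return {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        }

    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves input order even if calls finish out of order
        return list(pool.map(run_one, tool_calls))

# Fake tool calls for demonstration only
fake_calls = [
    SimpleNamespace(id="call_1", function=SimpleNamespace(
        name="get_weather", arguments='{"city": "London"}')),
    SimpleNamespace(id="call_2", function=SimpleNamespace(
        name="get_weather", arguments='{"city": "Paris"}')),
]
tool_messages = run_tool_calls(fake_calls)
```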

Building the Agent Loop

A single tool call is useful, but real agents need multiple rounds. Here’s the full agent pattern:

def run_agent(user_message: str, max_turns: int = 10):
    messages = [
        {"role": "system", "content": "You are a helpful assistant. Use tools when needed."},
        {"role": "user", "content": user_message},
    ]

    for turn in range(max_turns):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto",
        )

        message = response.choices[0].message

        # If GPT responded without tools — we're done
        if not message.tool_calls:
            return message.content

        # Execute all tool calls
        messages.append(message)
        for tool_call in message.tool_calls:
            function_name = tool_call.function.name

            try:
                # Parse inside the try: the model can emit malformed JSON
                arguments = json.loads(tool_call.function.arguments)
                result = execute_function(function_name, arguments)
            except Exception as e:
                result = {"error": str(e)}

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result),
            })

    return "Max turns reached — agent stopped."

This loop lets the agent chain multiple function calls. Ask “Find cheap electronics under $50 and check if they’re in stock” and, provided a check_inventory tool is defined alongside search_products, the agent calls search_products, then check_inventory, then responds with a combined answer.
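The tools list defined earlier covers get_weather and search_products only, so the chained example needs one more entry. A definition for check_inventory might look like the sketch below. The tool name comes from the example above; the product_id parameter and both descriptions are assumptions made for illustration.

```python
# Hypothetical schema for the check_inventory tool mentioned in the example
check_inventory_tool = {
    "type": "function",
    "function": {
        "name": "check_inventory",
        "description": (
            "Checks whether a product is currently in stock. "
            "Use this after search_products when the user asks about availability."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "product_id": {
                    "type": "string",
                    "description": "Identifier of a product returned by search_products",
                },
            },
            "required": ["product_id"],
        },
    },
}

# Appending it to the existing list would make it available to the agent:
# tools.append(check_inventory_tool)
```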

Error Handling

Functions fail. APIs time out, data doesn’t exist, parameters are wrong. Always handle errors gracefully:

def execute_function(name: str, arguments: dict) -> dict:
    try:
        if name == "get_weather":
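            # Pass the optional parameters too, matching the earlier handler
            return fetch_weather(arguments["city"], arguments.get("units", "celsius"))
        elif name == "search_products":
            return search_catalog(arguments["query"], arguments.get("category"))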
        else:
            return {"error": f"Unknown function: {name}"}
    except ConnectionError:
        return {"error": "Service unavailable. Please try again."}
    except KeyError as e:
        return {"error": f"Missing required parameter: {e}"}
    except Exception as e:
        return {"error": f"Failed to execute {name}: {str(e)}"}

Return errors as structured data, not exceptions. GPT will explain the error to the user or try an alternative approach.
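For transient failures it can be worth retrying before handing an error back to the model. The sketch below is one way to do that with exponential backoff. call_with_retry is a hypothetical helper, and the attempt count and delays are arbitrary defaults rather than recommended values.

```python
import time

def call_with_retry(func, *args, attempts=3, base_delay=0.5, sleep=time.sleep, **kwargs):
    """Call func, retrying on ConnectionError with exponential backoff.

    Returns func's result, or a structured error dict once the final
    attempt has failed. The sleep parameter exists so tests can avoid
    real waiting.
    """
    for attempt in range(attempts):
        try:
            return func(*args, **kwargs)
        except ConnectionError:
            if attempt == attempts - 1:
                return {"error": "Service unavailable. Please try again."}
            # Wait 0.5s, 1s, 2s, ... between attempts (with the defaults)
            sleep(base_delay * 2 ** attempt)
```

Only ConnectionError is retried here; errors such as a missing parameter will not fix themselves and are better returned to the model immediately.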

Streaming with Function Calls

For real-time applications, use streaming. Function calls arrive as deltas:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    stream=True,
)

tool_calls_buffer = {}
for chunk in stream:
    # Some chunks (e.g. a trailing usage chunk) carry no choices
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta

    # Text content: print immediately
    if delta.content:
        print(delta.content, end="", flush=True)

    # Tool call chunks: accumulate by index
    if delta.tool_calls:
        for tc in delta.tool_calls:
            idx = tc.index
            if idx not in tool_calls_buffer:
                tool_calls_buffer[idx] = {"id": "", "name": "", "arguments": ""}
            # The id is needed later for the matching tool_call_id
            if tc.id:
                tool_calls_buffer[idx]["id"] = tc.id
            if tc.function and tc.function.name:
                tool_calls_buffer[idx]["name"] = tc.function.name
            if tc.function and tc.function.arguments:
                tool_calls_buffer[idx]["arguments"] += tc.function.arguments
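Once the stream ends, the accumulated argument strings still have to be decoded before anything can be executed. The sketch below shows one way to turn the buffer into a list of ready-to-run calls. finalize_tool_calls is a hypothetical helper, and the example buffer is hand-written rather than captured from a real stream.

```python
import json

def finalize_tool_calls(buffer: dict) -> list:
    """Convert the streaming buffer into decoded calls, ordered by index.

    A call whose arguments fail to parse gets arguments=None so the
    caller can report an error instead of crashing.
    """
    calls = []
    for idx in sorted(buffer):
        entry = buffer[idx]
        try:
            arguments = json.loads(entry["arguments"] or "{}")
        except json.JSONDecodeError:
            arguments = None  # Incomplete or malformed JSON from the stream
        calls.append({
            "id": entry.get("id"),  # None if the buffer did not record ids
            "name": entry["name"],
            "arguments": arguments,
        })
    return calls

# Hand-written buffer imitating two accumulated calls
example_buffer = {
    1: {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    0: {"name": "get_weather", "arguments": '{"city": "London"}'},
    2: {"name": "get_weather", "arguments": '{"city": "Tok'},
}
decoded = finalize_tool_calls(example_buffer)
```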

OpenAI vs Claude: Function Calling Comparison

Aspect                   OpenAI Function Calling      Claude Tool Use
JSON Schema              Yes                          Yes
Parallel calls           Yes                          Yes
Streaming                Yes                          Yes
Force specific function  tool_choice                  tool_choice
Managed state            Assistants API (threads)     Stateless (you manage)
Built-in RAG             Assistants API file search   No (use MCP or custom)
Code execution           Code Interpreter             No
Open ecosystem           Proprietary                  MCP (open standard)

Both platforms are capable. OpenAI has the Assistants API with managed threads and built-in RAG. Claude has MCP with a massive open ecosystem of pre-built servers. Choose based on your needs.

Best Practices

  1. Descriptions matter most — GPT chooses functions based on descriptions. Write them like documentation, not variable names.

  2. Validate parameters — don’t blindly trust JSON from the model. Check types, ranges, and sanitize input.

  3. Use enums — restrict values wherever possible. "enum": ["celsius", "fahrenheit"] is better than "type": "string".

  4. Handle errors gracefully — return structured error objects. GPT will communicate the issue to the user.

  5. Set turn limits — always cap the agent loop. 10-15 turns handles most scenarios. Runaway loops waste tokens and money.

  6. Log everything — log every function call with input/output. Essential for debugging and cost monitoring.
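Practice 2 can be sketched as a small check that runs before a function is executed. The validator below is deliberately minimal and covers only required fields, basic types, and enums. validate_arguments is a hypothetical helper, and a full JSON Schema library would be a more complete choice in production.

```python
def validate_arguments(arguments: dict, schema: dict) -> list:
    """Return a list of problems found in the model's arguments.

    An empty list means the arguments passed these basic checks.
    """
    errors = []
    properties = schema.get("properties", {})
    type_map = {"string": str, "number": (int, float), "integer": int, "boolean": bool}

    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"Missing required parameter: {name}")

    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            errors.append(f"Unexpected parameter: {name}")
            continue
        expected = type_map.get(spec.get("type"))
        if expected is not None:
            # bool is a subclass of int in Python, so reject it for numbers
            is_bool_as_number = isinstance(value, bool) and spec["type"] != "boolean"
            if not isinstance(value, expected) or is_bool_as_number:
                errors.append(f"{name}: expected {spec['type']}")
                continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
    return errors

# Parameter schema copied from the get_weather definition
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}
```

When the list is non-empty, returning it to the model as a structured error gives it a chance to correct the call on the next turn.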

Common Mistakes

  • Too many functions — GPT performs worse with 20+ functions. Group related operations or use routing.
  • Vague descriptions — “Does stuff with data” won’t work. Be specific about when and why to use each function.
  • No error handling — a crashing function breaks the whole agent. Always return clean error messages.
  • Forgetting tool_call_id — every tool response must include the matching tool_call_id. Without it, GPT can’t match results to calls.
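The last mistake can be caught before a request is sent. The sketch below scans a conversation for tool calls that have no matching tool response. It assumes every message is a plain dict; if you append SDK message objects directly, as the earlier snippets do, they would need converting first. unanswered_tool_calls is a hypothetical helper.

```python
def unanswered_tool_calls(messages: list) -> set:
    """Return the ids of tool calls that have no matching tool message."""
    requested, answered = set(), set()
    for m in messages:
        if m.get("role") == "assistant":
            for call in m.get("tool_calls") or []:
                requested.add(call["id"])
        elif m.get("role") == "tool":
            answered.add(m.get("tool_call_id"))
    return requested - answered

# Hand-written conversation: two calls requested, only one answered
conversation = [
    {"role": "user", "content": "Weather in London and Paris?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "London"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp": 15}'},
]
missing = unanswered_tool_calls(conversation)
```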

Next Steps