Response Types and Formats

Chimeric normalises all provider responses into consistent types so your code works identically regardless of which provider is behind the model.

CompletionResponse

Non-streaming calls return a CompletionResponse:

from chimeric import Chimeric

client = Chimeric()
response = client.generate(model="gpt-4o", messages="Explain quantum physics")

print(response.content)   # str | list — generated text
print(response.model)     # str | None — model that responded
print(response.metadata)  # dict | None — provider-specific extras

if response.usage:
    print(response.usage.prompt_tokens)      # int — input tokens
    print(response.usage.completion_tokens)  # int — output tokens
    print(response.usage.total_tokens)       # int — total tokens
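
Those usage fields make lightweight accounting straightforward. A minimal sketch that totals tokens across a few calls:

total_tokens = 0
for prompt in ["Define entropy", "Define enthalpy"]:
    reply = client.generate(model="gpt-4o", messages=prompt)
    if reply.usage:  # usage can be None, so guard before adding
        total_tokens += reply.usage.total_tokens
print(f"Tokens spent so far: {total_tokens}")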

str(response) returns the content text directly, so you can use the response object wherever a string is expected.
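
A quick sketch of that in practice (reusing the client and model from above):

response = client.generate(model="gpt-4o", messages="Name a prime number")
print(f"Model said: {response}")  # f-string calls str(response) for you
text = str(response)              # explicit conversion when a plain str is required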

StreamChunk

Streaming calls yield StreamChunk objects:

stream = client.generate(model="gpt-4o", messages="Write a story", stream=True)

for chunk in stream:
    print(chunk.delta or "", end="", flush=True)  # incremental text

Field          Type                   Description
content        str | list[Any]        Accumulated text up to this chunk
delta          str | None             New text added in this chunk
finish_reason  str | None             Present only on the final chunk
metadata       dict[str, Any] | None  Provider-specific extras

str(chunk) returns delta (or "" for metadata-only chunks).
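
Putting the chunk fields together: a sketch that joins deltas via str() and watches finish_reason for the final chunk:

stream = client.generate(model="gpt-4o", messages="Write a haiku", stream=True)

parts = []
for chunk in stream:
    parts.append(str(chunk))  # "" for metadata-only chunks, so joining is safe
    if chunk.finish_reason:   # set only on the final chunk
        print(f"\nFinished: {chunk.finish_reason}")

full_text = "".join(parts)    # the joined deltas reconstruct the full text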

Cross-Provider Consistency

The unified types work the same across all providers:

def summarize(model: str, text: str) -> str:
    response = client.generate(model=model, messages=f"Summarize: {text}")
    if response.usage:  # usage may be None, so guard before reading it
        print(f"Used {response.usage.total_tokens} tokens")
    return str(response)  # str() yields the content text, matching the -> str annotation

# All three use the same interface
summarize("gpt-4o", "...")
summarize("claude-3-5-sonnet-20241022", "...")
summarize("gemini-1.5-pro", "...")

Provider-specific details that don't map to the standard fields are available in response.metadata.
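
Because those extras vary by provider, treat metadata as an opaque dict and probe it defensively; the key used below is purely illustrative:

response = client.generate(model="gpt-4o", messages="Hello")

if response.metadata:
    print(sorted(response.metadata))          # see what this provider actually sent
    request_id = response.metadata.get("id")  # hypothetical key, not guaranteed to exist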

EmbeddingResponse

embed() and aembed() return an EmbeddingResponse:

result = client.embed(model="text-embedding-3-small", input="Hello")

print(result.embedding)  # list[float] — single vector
print(result.model)      # str | None — model used

if result.usage:  # usage may be None, so guard before reading it
    print(result.usage.prompt_tokens)  # int
    print(result.usage.total_tokens)   # int

For batch input, the vectors arrive in embeddings and embedding is None:

result = client.embed(
    model="text-embedding-3-small",
    input=["first text", "second text"],
)

print(result.embedding)   # None
print(result.embeddings)  # list[list[float]] — one vector per input

Field       Type                   Description
embedding   list[float] | None     Single vector; None for batch
embeddings  list[list[float]]      All vectors for batch; empty for single
model       str | None             Model that produced the response
usage       EmbeddingUsage | None  prompt_tokens and total_tokens
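
Since embeddings holds one vector per input, in input order, downstream math is plain list work. A pure-Python sketch computing cosine similarity between the two batch vectors above:

import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

first, second = result.embeddings  # one vector per input, in input order
print(f"Similarity: {cosine(first, second):.3f}")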

Async Support

agenerate() and aembed() are the async counterparts with the same return types:

import asyncio


async def main():
    # Non-streaming completion
    response = await client.agenerate(model="gpt-4o", messages="Hello")
    print(response.content)

    # Streaming completion
    stream = await client.agenerate(model="gpt-4o", messages="Tell a story", stream=True)
    async for chunk in stream:
        print(chunk.delta or "", end="", flush=True)

    # Embedding
    result = await client.aembed(model="text-embedding-3-small", input="Hello")
    print(len(result.embedding))


asyncio.run(main())
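
Because the async variants return awaitables, independent calls can run concurrently. A sketch using asyncio.gather (reusing the asyncio import and client from the example above):

async def fan_out():
    # Issue two completions and one embedding concurrently.
    story, fact, vectors = await asyncio.gather(
        client.agenerate(model="gpt-4o", messages="Tell a short story"),
        client.agenerate(model="gpt-4o", messages="State a fun fact"),
        client.aembed(model="text-embedding-3-small", input="Hello"),
    )
    print(story.content)
    print(fact.content)
    print(len(vectors.embedding))


asyncio.run(fan_out())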