API Reference

Complete reference for all AGON methods and classes.


AGON Class

The main entry point for encoding and decoding operations.

AGON.encode()

Encode data into the most token-efficient format.

Signature:

AGON.encode(
    data: object,
    format: Format = "auto",
    force: bool = False,
    min_savings: float = 0.10,
    encoding: Encoding | None = None
) -> AGONEncoding

Parameters:

  • data (object, required): JSON-serializable Python data to encode
  • format (Format, default "auto"): Format to use: "auto", "json", "rows", "columns", or "struct"
  • force (bool, default False): If True with format="auto", never fall back to JSON
  • min_savings (float, default 0.10): Minimum token savings (0.0-1.0) required to use a specialized format instead of JSON
  • encoding (Encoding | None, default None): Token encoding for accurate counting (e.g., "o200k_base"). If None, a fast byte-length estimation is used

Returns: AGONEncoding - Result object with encoded text and metadata

Examples:

from agon import AGON

data = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]

# Auto-select best format
result = AGON.encode(data, format="auto")
print(f"Selected: {result.format}")  # → "rows"
print(f"Tokens: {AGON.count_tokens(result.text)}")
print(result)  # Use directly in LLM prompts

# Force a specific format
result_rows = AGON.encode(data, format="rows")
result_columns = AGON.encode(data, format="columns")
result_struct = AGON.encode(data, format="struct")
result_json = AGON.encode(data, format="json")

# Each returns AGONEncoding with the specified format

# Require 20% savings before using specialized format
result = AGON.encode(data, format="auto", min_savings=0.20)

# Lower threshold for aggressive optimization
result = AGON.encode(data, format="auto", min_savings=0.05)

# Never fall back to JSON, always use best specialized format
result = AGON.encode(data, format="auto", force=True)

# Useful when you know your data is structured
# and want maximum token savings
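
The examples above use the default fast byte-length estimation when comparing formats. If exact token counts should drive the "auto" decision, pass the encoding parameter from the signature; a minimal sketch, assuming "o200k_base" (one of the encodings listed under Type Aliases below):

# Use exact tiktoken counts rather than byte-length estimation
# when "auto" weighs the specialized formats against JSON
result = AGON.encode(data, format="auto", encoding="o200k_base")
print(result.format)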

AGON.decode()

Decode AGON-encoded data back to original Python objects.

Signatures:

# Overload 1: Decode AGONEncoding result
AGON.decode(payload: AGONEncoding) -> object

# Overload 2: Decode string with auto-detection
AGON.decode(payload: str, format: ConcreteFormat | None = None) -> object

Parameters:

  • payload (AGONEncoding | str, required): Encoded data to decode
  • format (ConcreteFormat | None, default None): Optional format override ("json", "rows", "columns", "struct")

Returns: object - Decoded Python data (list, dict, etc.)

Examples:

data = [{"id": 1, "name": "Alice"}]

# Encode
result = AGON.encode(data, format="rows")

# Decode - automatically uses result's format
decoded = AGON.decode(result)
assert decoded == data  # Lossless

# AGON-encoded string with header
agon_string = """@AGON rows

[2]{id  name}
1   Alice
2   Bob"""

# Auto-detects "rows" format from @AGON header
decoded = AGON.decode(agon_string)
# → [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
# Decode without header by specifying format
agon_rows_without_header = """[2]{id    name}
1   Alice
2   Bob"""

decoded = AGON.decode(agon_rows_without_header, format="rows")
# → [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]

AGON.project_data()

Filter data to keep only specific fields, supporting dotted paths for nested access.

Signature:

AGON.project_data(
    data: list[dict],
    keep_paths: list[str]
) -> list[dict]

Parameters:

  • data (list[dict]): List of dictionaries to filter
  • keep_paths (list[str]): List of field paths to keep (supports dot notation)

Returns: list[dict] - Filtered data with only specified fields

Examples:

data = [
    {"id": 1, "name": "Alice", "email": "alice@example.com", "age": 28},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "age": 32},
]

# Keep only id and name
projected = AGON.project_data(data, ["id", "name"])
# → [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
data = [
    {
        "user": {
            "profile": {"name": "Alice", "age": 28},
            "settings": {"theme": "dark"}
        },
        "status": "active"
    }
]

# Extract nested fields with dot notation
projected = AGON.project_data(data, ["user.profile.name", "status"])
# → [{"user": {"profile": {"name": "Alice"}}, "status": "active"}]
data = [
    {
        "type": "DAY_GAINERS",
        "quotes": [
            {"symbol": "AAPL", "price": 150.0, "volume": 1000000},
            {"symbol": "GOOGL", "price": 2800.0, "volume": 500000}
        ]
    }
]

# Project fields from nested arrays
projected = AGON.project_data(data, ["quotes.symbol", "quotes.price"])
# → [{"quotes": [{"symbol": "AAPL", "price": 150.0},
#                {"symbol": "GOOGL", "price": 2800.0}]}]

Use Before Encoding

Project data before encoding to reduce token count further:

# Filter to essential fields, then encode
projected = AGON.project_data(full_data, ["id", "name", "score"])
result = AGON.encode(projected, format="auto")

AGON.count_tokens()

Count tokens in text using the specified encoding.

Signature:

AGON.count_tokens(
    text: str,
    encoding: Encoding = "o200k_base"
) -> int

Parameters:

  • text (str, required): Text to count tokens for
  • encoding (Encoding, default "o200k_base"): Tiktoken encoding name ("o200k_base", "cl100k_base", etc.)

Returns: int - Number of tokens

Example:

text = "Hello, world!"
tokens = AGON.count_tokens(text)
print(f"Tokens: {tokens}")  # → 4

# Use different encoding
tokens_gpt4 = AGON.count_tokens(text, encoding="cl100k_base")

AGONEncoding Class

Result object returned by AGON.encode().

Attributes:

  • format (ConcreteFormat): Format used: "json", "rows", "columns", or "struct"
  • text (str): Encoded output (ready for LLM prompts)
  • header (str): Format header (e.g., "@AGON rows")
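
The attributes can be read directly from an encode result; a small illustration using the fields listed above:

data = [{"id": 1, "name": "Alice"}]
result = AGON.encode(data, format="rows")

result.format   # "rows"
result.header   # e.g. "@AGON rows"
result.text     # encoded body, ready to drop into a prompt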

Methods:

__str__()

Returns the encoded text (without header) for direct use in prompts.

result = AGON.encode(data, format="rows")
prompt = f"Analyze this data:\n\n{result}"  # Converts to string via __str__()

__len__()

Returns character count of the encoded text.

result = AGON.encode(data, format="rows")
char_count = len(result)  # Character count

__repr__()

Returns debug representation.

result = AGON.encode(data, format="rows")
print(repr(result))
# → AGONEncoding(format='rows', length=45)

with_header()

Returns encoded text with header prepended (for auto-detect decoding).

result = AGON.encode(data, format="rows")

# Without header (for sending to LLM)
print(result.text)
# → [2]{id  name}
#   1   Alice
#   2   Bob

# With header (for decoding)
print(result.with_header())
# → @AGON rows
#
#   [2]{id  name}
#   1   Alice
#   2   Bob

Use cases:

  • Without header (result.text or str(result)): Send to LLM prompts
  • With header (result.with_header()): Store for later decoding, or ask LLM to return in same format

hint()

Get prescriptive generation instructions for LLMs (experimental feature for asking LLMs to return AGON-formatted data).

result = AGON.encode(data, format="auto")

# Get hint for the selected format
hint = result.hint()
print(hint)
# → "Return in AGON rows format: Start with @AGON rows header,
#    encode arrays as name[N]{fields} with tab-delimited rows"

Example use in LLM prompts:

data = [{"id": 1, "name": "Alice", "role": "admin"}]
result = AGON.encode(data, format="auto")

# Ask LLM to respond in AGON format
prompt = f"""Analyze this data and return enriched results in AGON format.

Instructions: {result.hint()}

Example output:
{result.with_header()}

Task: Add a "seniority" field (junior/mid/senior) based on role.
"""

Experimental Feature

LLMs have not been trained on AGON format, so generation accuracy cannot be guaranteed. This is experimental—always validate LLM-generated AGON data.

Prefer sending AGON to LLMs (reliable) over asking LLMs to generate AGON (experimental).
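
When you do ask an LLM to produce AGON, one way to validate the response is to round-trip it through AGON.decode() and handle failures explicitly; a hedged sketch using only the decode API and the AGONError exception documented below:

from agon import AGON, AGONError

llm_response = "..."  # placeholder: text returned by the model

try:
    enriched = AGON.decode(llm_response)  # auto-detects format from the @AGON header
except AGONError:
    enriched = None  # retry, or fall back to asking for plain JSON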


Format-Specific Encoders

For advanced use cases, you can access format-specific encoders directly.

AGONRows

from agon import AGONRows

# Direct encoding with custom options
encoded = AGONRows.encode(
    data,
    delimiter="\t",  # Default: tab
    include_header=False  # Default: False
)

# Direct decoding
decoded = AGONRows.decode(encoded)

AGONColumns

from agon import AGONColumns

# Direct encoding
encoded = AGONColumns.encode(
    data,
    delimiter="\t",  # Default: tab
    include_header=False
)

decoded = AGONColumns.decode(encoded)

AGONStruct

from agon import AGONStruct

# Direct encoding
encoded = AGONStruct.encode(
    data,
    include_header=False
)

decoded = AGONStruct.decode(encoded)

When to Use Direct Encoders

Use direct format encoders when:

  • You want guaranteed format selection (bypass auto mode)
  • You need format-specific options (custom delimiters)
  • You're benchmarking or comparing formats

For most use cases, AGON.encode(data, format="rows") is preferred.
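
For example, when comparing formats, the direct encoders pair naturally with AGON.count_tokens(); a short benchmarking sketch, assuming each encoder returns a plain string:

from agon import AGON, AGONRows, AGONColumns, AGONStruct

for name, encoder in [("rows", AGONRows), ("columns", AGONColumns), ("struct", AGONStruct)]:
    encoded = encoder.encode(data)  # default options
    print(name, AGON.count_tokens(encoded))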


Error Handling

AGON defines a hierarchy of exceptions for error handling.

AGONError

Base exception for all AGON errors.

from agon import AGONError

try:
    result = AGON.encode(data, format="auto")
except AGONError as e:
    print(f"AGON error: {e}")

Format-Specific Exceptions

  • AGONRowsError - Errors specific to AGONRows format
  • AGONColumnsError - Errors specific to AGONColumns format
  • AGONStructError - Errors specific to AGONStruct format

from agon import AGONRowsError, AGONColumnsError, AGONStructError

try:
    result = AGON.decode(malformed_agon_rows, format="rows")
except AGONRowsError as e:
    print(f"Rows format error: {e}")

Constants & Defaults

  • DEFAULT_ENCODING = "o200k_base": Default token encoding (GPT-4o, o1, o3)
  • DEFAULT_DELIMITER = "\t": Default field delimiter (tab character)
  • DEFAULT_MIN_SAVINGS = 0.10: Default minimum token savings threshold (10%)
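
If these constants are exported from the top-level package (an assumption; the import path is not shown above), they can keep call sites aligned with the library defaults:

# Assumption: these names are importable from the agon package
from agon import AGON, DEFAULT_ENCODING, DEFAULT_MIN_SAVINGS

result = AGON.encode(data, min_savings=DEFAULT_MIN_SAVINGS, encoding=DEFAULT_ENCODING)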

Type Aliases

from agon import Format, ConcreteFormat, Encoding

# Format includes "auto"
Format = Literal["auto", "json", "rows", "columns", "struct"]

# ConcreteFormat excludes "auto" (actual encoding formats)
ConcreteFormat = Literal["json", "rows", "columns", "struct"]

# Encoding - supported tiktoken encodings
Encoding = Literal[
    "o200k_base",      # GPT-4o, o1, o3
    "o200k_harmony",   # GPT-OSS
    "cl100k_base",     # GPT-4, GPT-3.5-turbo
    "p50k_base",       # Codex, text-davinci-003
    "p50k_edit",       # text-davinci-edit-001
    "r50k_base",       # GPT-3 (davinci, curie, babbage, ada)
]
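
These aliases are handy for typing your own wrappers; a small sketch (the helper function below is hypothetical):

from agon import AGON, Format

def encode_for_prompt(data: object, format: Format = "auto") -> str:
    """Hypothetical helper: encode data and return prompt-ready text."""
    return str(AGON.encode(data, format=format))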

Next Steps

JSON Fallback

View how JSON is used as a safety net

AGONRows Format

Complete guide to row-based encoding

Core Concepts

Understand AGON's adaptive approach and design principles

Benchmarks

See real-world performance across multiple datasets