Usage Extraction modes
The usage_extract directive applies native parsing strategies tailored to standard provider schemas. Its value is either a usage_mode preset name or custom.
The default repository config focuses on the more specific presets from config/modes/usage_modes.conf, such as openai_chat_completions, openai_prompt_completion, openai_responses, openai_responses_stream, anthropic_messages, anthropic_messages_stream, gemini_generate_content, and gemini_generate_content_stream.
Generic names such as openai, anthropic, and gemini are no longer special builtin usage_extract modes. If you want them, define them explicitly as global usage_mode presets.
Inside metrics, declaring usage_fact, *_tokens_path, or *_tokens_expr without usage_extract is equivalent to usage_extract custom;.
Migration notes for the old generic provider modes:
- gemini: the current default preset behavior can be fully replaced by custom; input token should read the total input token count, and multimodal details can be exposed as input.image / input.audio / input.video token facts.
- anthropic: custom can cover the same core token/cache extraction, including streaming, but the config is more verbose because stream events may place usage under either message.usage or top-level usage.
- openai: custom covers the core token/cache extraction, but image/audio/tool supplemental facts still require extra explicit usage_fact rules.
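To illustrate the anthropic note above, the following Python sketch looks for a usage object under message.usage first and then under top-level usage. It only models the lookup a custom config has to express and is not ONR's own code; the helper name find_usage and the sample events are hypothetical.

```python
from typing import Any, Optional


def find_usage(event: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the usage object from a stream event, or None if absent.

    Hypothetical helper: checks message.usage first, then top-level usage.
    """
    message = event.get("message")
    if isinstance(message, dict) and isinstance(message.get("usage"), dict):
        return message["usage"]
    usage = event.get("usage")
    return usage if isinstance(usage, dict) else None


# Made-up sample events shaped like the two placements described above.
start_event = {"type": "message_start", "message": {"usage": {"input_tokens": 12}}}
delta_event = {"type": "message_delta", "usage": {"output_tokens": 34}}
other_event = {"type": "ping"}

print(find_usage(start_event))
print(find_usage(delta_event))
print(find_usage(other_event))
```

Because either placement can occur within one stream, a custom config needs a rule for each, which is why it ends up more verbose than a preset.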
custom mode:
custom sketch:
custom sketches:
Custom Token Extraction
If a provider reports tokens at an unusual JSON path, you can map them dynamically.
Use $.items[*].x JSONPath syntax to sum all numeric occurrences in an array.
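As a rough model of what summing over $.items[*].x means, the Python sketch below adds up every numeric x found in the items array and skips anything else. The function name sum_wildcard and the sample payload are hypothetical; ONR's actual JSONPath evaluation may treat edge cases differently.

```python
from typing import Any


def sum_wildcard(doc: dict[str, Any], array_key: str, field: str) -> float:
    """Sum numeric `field` values across doc[array_key], like $.<array_key>[*].<field>."""
    total = 0
    for item in doc.get(array_key, []):
        value = item.get(field) if isinstance(item, dict) else None
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            total += value
    return total


# Made-up payload: two numeric x values, one string, one item without x.
response = {"items": [{"x": 3}, {"x": 4.5}, {"x": "n/a"}, {"y": 10}]}
print(sum_wildcard(response, "items", "x"))
```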
Usage Facts
For the new custom-first flow, use usage_fact to describe each measurable item explicitly.
Use usage_root first when the upstream response has a nested usage object and most facts should read from that object.
- usage_root path="..." extracts a usage JSON object before facts run.
- Multiple usage_root rules merge into one usage object; later non-zero fields can fill or replace earlier zero fields.
- In stream extraction, ONR merges usage roots from chunks first, then runs default-source / source=usage facts once at stream end. Explicit source=response, request, and derived facts still run during chunk processing.
- event="a|b" can restrict usage_root or usage_fact to specific SSE event names.
- path, count_path, sum_path, and expr are supported.
- String values may use either double quotes or single quotes.
- count_path can be combined with type and status filters.
- event="..." optionally restricts a usage_fact rule to matching SSE event: names.
- event_optional=true may be combined with event="..." when an upstream stream sometimes omits SSE event: framing.
- attr.ttl distinguishes Anthropic cache write tiers.
- Multiple usage_fact rules may share the same dimension + unit; all matched non-fallback rules are summed.
- fallback=true uses a total field only when the more specific facts do not exist.
- source defaults to the merged usage_root when configured; otherwise it defaults to response.
- source currently supports usage, response, request, and derived.
- dimension is a flat registry key; . is part of the name and does not imply nesting.
- For filter JSONPath, single-quoted DSL strings avoid escaping inner double quotes, for example:
path='$.usageMetadata.promptTokensDetails[?(@.modality=="AUDIO")].tokenCount'
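The merge and fallback rules in the list above can be modeled with a short Python sketch. It is an illustration under stated assumptions, not ONR's implementation: the function names are hypothetical, a later non-zero value is assumed to replace any earlier value (the text only guarantees this for earlier zero fields), and when several fallback rules match, the first is assumed to win.

```python
from typing import Iterable, Optional


def merge_usage_roots(roots: Iterable[dict[str, int]]) -> dict[str, int]:
    """Merge usage objects in order; a later zero never overwrites an earlier value.

    Assumption: a later non-zero value replaces whatever came before.
    """
    merged: dict[str, int] = {}
    for root in roots:
        for key, value in root.items():
            if value != 0 or key not in merged:
                merged[key] = value
    return merged


def resolve_fact(rules: list[tuple[Optional[int], bool]]) -> int:
    """Resolve one dimension + unit from (value, is_fallback) rule results.

    None means the rule did not match. Matched non-fallback values are summed;
    a fallback value is used only when no non-fallback rule matched.
    Assumption: with several matched fallbacks, the first one is used.
    """
    specific = [v for v, is_fallback in rules if v is not None and not is_fallback]
    if specific:
        return sum(specific)
    fallback = [v for v, is_fallback in rules if v is not None and is_fallback]
    return fallback[0] if fallback else 0


# Made-up stream chunks: each carries part of the usage object.
chunks = [
    {"input_tokens": 12, "output_tokens": 0},
    {"input_tokens": 0, "output_tokens": 34},
]
print(merge_usage_roots(chunks))

# Two specific facts match, so the fallback total is ignored.
print(resolve_fact([(5, False), (7, False), (100, True)]))
# No specific fact matches, so the fallback total is used.
print(resolve_fact([(None, False), (100, True)]))
```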
Supported dimensions
- input
- output
- input.image
- input.video
- input.audio
- output.image
- output.video
- output.audio
- cache_read
- cache_write
- server_tool.web_search
- server_tool.file_search
- image.generate
- image.edit
- image.variation
- audio.tts
- audio.stt
- audio.translate
Supported dimension + unit pairs
The current registry intentionally accepts a limited set of dimension + unit pairs:
- Token and cache:
  - input token
  - output token
  - input.image token
  - input.video token
  - input.audio token
  - output.image token
  - output.video token
  - output.audio token
  - cache_read token
  - cache_write token
- Tool usage:
  - server_tool.web_search call
  - server_tool.file_search call
- Image and audio:
  - image.generate image
  - image.edit image
  - image.variation image
  - audio.tts second
  - audio.stt second
  - audio.translate second
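One way to picture the registry is as a flat set of (dimension, unit) pairs, where a dotted name such as input.image is a single key rather than a nested lookup. The Python sketch below copies the pairs from the list above; the names SUPPORTED_PAIRS and is_supported are hypothetical, and ONR's internal representation may differ.

```python
# Flat registry of accepted (dimension, unit) pairs, copied from the list above.
# The dot in a dimension is part of the key; nothing is nested.
TOKEN_DIMENSIONS = (
    "input", "output",
    "input.image", "input.video", "input.audio",
    "output.image", "output.video", "output.audio",
    "cache_read", "cache_write",
)
CALL_DIMENSIONS = ("server_tool.web_search", "server_tool.file_search")
IMAGE_DIMENSIONS = ("image.generate", "image.edit", "image.variation")
SECOND_DIMENSIONS = ("audio.tts", "audio.stt", "audio.translate")

SUPPORTED_PAIRS = frozenset(
    [(d, "token") for d in TOKEN_DIMENSIONS]
    + [(d, "call") for d in CALL_DIMENSIONS]
    + [(d, "image") for d in IMAGE_DIMENSIONS]
    + [(d, "second") for d in SECOND_DIMENSIONS]
)


def is_supported(dimension: str, unit: str) -> bool:
    """Return True when the exact (dimension, unit) pair is in the registry."""
    return (dimension, unit) in SUPPORTED_PAIRS


print(is_supported("input.image", "token"))
print(is_supported("input", "call"))
```

The second check fails because input is only registered with the token unit; a dimension being valid does not make every unit valid for it.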
Source examples
OpenAI Responses tool usage:
OpenAI-specific presets and supplemental facts
The repository’s OpenAI-specific presets commonly model canonical facts such as:
- images.generations -> image.generate image
- images.edits -> image.edit image
- audio.transcriptions -> audio.stt second
- audio.translations -> audio.translate second
- audio.speech -> audio.tts second when derived runtime usage is available
- responses -> server_tool.web_search call
Formula & Arithmetic Configuration
Alternatively, you can compute tokens with arithmetic expressions rather than reading a single exact path.
total_tokens is derived from input + output automatically. In most cases, avoid setting total_tokens_expr explicitly, because it introduces a second total fact source that can drift from the totals derived from input and output.
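The drift risk can be seen in a small Python sketch. The numbers are made up and the function name derive_total is hypothetical; the point is only that an independently sourced total may disagree with input + output.

```python
def derive_total(input_tokens: int, output_tokens: int) -> int:
    """Derive the total the way the text describes: input + output."""
    return input_tokens + output_tokens


# Made-up values for illustration.
usage = {"input": 120, "output": 30}
derived_total = derive_total(usage["input"], usage["output"])

# Hypothetical explicit total taken from a separate upstream field.
explicit_total = 140

drift = explicit_total - derived_total
print(derived_total, explicit_total, drift)
```

With a single derived total there is nothing to reconcile; with an explicit expression, any non-zero drift means two parts of the metrics disagree about the same request.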
Finish Reason Extraction
Similar to usage extraction, you can tell ONR where to look for the finish reason string (e.g. stop, length, tool_calls) that indicates why generation ceased.
Inside metrics, declaring finish_reason_path without finish_reason_extract is equivalent to finish_reason_extract custom;.
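Conceptually, a finish reason path points at one string inside the response body. The Python sketch below walks a list of keys and indexes through a made-up response shaped like a chat completion; the helper name read_path and the list-of-steps form are hypothetical and do not reflect the syntax finish_reason_path accepts.

```python
from typing import Any, Optional, Union


def read_path(doc: Any, steps: list[Union[str, int]]) -> Optional[Any]:
    """Follow dict keys and list indexes in order; return None if any step is missing."""
    current = doc
    for step in steps:
        if isinstance(step, int) and isinstance(current, list):
            if not -len(current) <= step < len(current):
                return None
            current = current[step]
        elif isinstance(step, str) and isinstance(current, dict) and step in current:
            current = current[step]
        else:
            return None
    return current


# Made-up response body for illustration.
response = {"choices": [{"index": 0, "finish_reason": "length"}]}

print(read_path(response, ["choices", 0, "finish_reason"]))
print(read_path(response, ["candidates", 0, "finishReason"]))
```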
Like usage extraction, finish reason extraction also supports reusable top-level presets:
The repository's presets live in config/modes/finish_reason_modes.conf, which is included by config/onr.conf.
The default repository config focuses on path-specific presets such as openai_chat_completions, openai_completions, openai_responses, anthropic_messages, anthropic_messages_stream, gemini_generate_content, and gemini_generate_content_stream.
Generic names such as openai, anthropic, and gemini are no longer special builtin finish_reason_extract modes. If you want them, define them explicitly as global finish_reason_mode presets.