ahvn.llm package

class ahvn.llm.LLM(preset=None, model=None, provider=None, cache=True, cache_exclude=None, name=None, **kwargs)[source]

Bases: object

High-level chat LLM client with retry, caching, proxy, and streaming support.

This class wraps a litellm-compatible chat API and provides two access modes:

  • stream: incremental (delta) results as they arrive

  • oracle: the full (final) result aggregated from the stream

Key features:

  • Retry: automatic retries via tenacity on retryable exceptions.

  • Caching: memoizes successful results keyed by all request inputs and a user-defined name. Excluded keys can be configured via cache_exclude.

  • Streaming-first: always uses stream=True under the hood for stability; oracle aggregates the stream.

  • Proxies: optional http_proxy and https_proxy support per request.

  • Flexible messages: accepts multiple message formats and normalizes them.

  • Output shaping: include and reduce control which fields are returned and whether single-field results are flattened.

Parameters:
  • preset (str | None) -- Named preset from configuration (if supported by resolve_llm_config).

  • model (str | None) -- Model identifier (e.g., "gpt-4o"). Overrides preset when provided.

  • provider (str | None) -- Provider name used by the underlying client.

  • cache (Union[bool, str, BaseCache] | None) -- Cache implementation. Defaults to True. If True, uses DiskCache with the default cache directory ("core.cache_path"). If a string is provided, it is treated as the path for DiskCache. If None/False, uses NoCache (no caching).

  • cache_exclude (list[str] | None) -- Keys to exclude from cache key construction.

  • name (str | None) -- Logical name for this LLM instance. Used to namespace the cache. Defaults to "llm".

  • **kwargs -- Additional provider/client config (e.g., temperature, top_p, n, tools, tool_choice, http_proxy, https_proxy, and any litellm client options). These act as defaults and can be overridden per call.

Note

  • Caching: only successful executions are cached. The cache key includes the normalized messages, the full effective configuration, and name, minus any keys listed in cache_exclude.

  • Set name differently for semantically distinct use-cases to avoid cache collisions.
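The cache-key construction described above can be sketched as follows. `make_cache_key` is a hypothetical helper, not the library's actual implementation; the real serialization details may differ.

```python
import hashlib
import json

def make_cache_key(messages, config, name, cache_exclude=None):
    # Hypothetical sketch: key = normalized messages + effective config
    # + instance name, minus any excluded keys, hashed deterministically.
    excluded = set(cache_exclude or [])
    payload = {
        "name": name,
        "messages": messages,
        "config": {k: v for k, v in config.items() if k not in excluded},
    }
    # sort_keys makes the serialization independent of dict insertion order
    blob = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

This illustrates why distinct use-cases should get distinct name values: name participates in the key, so two instances with identical config but different names never collide.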

__init__(preset=None, model=None, provider=None, cache=True, cache_exclude=None, name=None, **kwargs)[source]
stream(messages, tools=None, tool_choice=None, include=None, verbose=False, reduce=True, **kwargs)[source]

Stream LLM responses (deltas) for the given messages.

Features:

  • Retry: automatic retries for transient failures.

  • Caching: memoizes successful runs keyed by inputs and name.

  • Streaming-first: uses stream=True for stability; yields deltas as they arrive.

  • Tool support: when tools are provided, tool_calls are aggregated and yielded at the end.

  • Proxies: supports http_proxy and https_proxy in kwargs.

  • Flexible input: accepts multiple message formats and normalizes them.

  • Output shaping: control returned fields with include and flattening with reduce.

Parameters:
  • messages (Union[str, Dict[str, Any], Any, List[Union[str, Dict[str, Any], Any]]]) --

    Conversation content, normalized by format_messages:

      1. str -> treated as a single user message

      2. list -> each item is processed as follows:

    • litellm.Message -> converted via json()

    • str -> treated as user message

    • dict -> used as-is and must include "role"

  • tools (Optional[List[Union[Dict, ToolSpec]]]) -- Optional list of tools, each can be a ToolSpec or jsonschema dict. When provided, include defaults to ["think", "text", "tool_calls"].

  • tool_choice (Optional[str]) -- Tool choice setting. Defaults to "auto" if tools present, otherwise None.

  • include (Optional[List[Literal['text', 'think', 'tool_calls', 'content', 'message', 'structured', 'tool_messages', 'tool_results', 'delta_messages', 'messages']]]) -- Fields to include in each streamed delta. Can be a str or list[str]. Allowed: "text", "think", "tool_calls", "content", "message", "structured", "tool_messages", "tool_results", "delta_messages", "messages". Default: ["text"] without tools, ["think", "text", "tool_calls"] with tools.

  • verbose (bool) -- If True, logs the resolved request config.

  • reduce (bool) -- If True and len(include) == 1, returns a single value instead of a dict. If False, always returns a dict.

  • **kwargs -- Per-call overrides for LLM config (e.g., temperature, top_p, http_proxy, https_proxy, etc.).

Yields:

LLMResponse --

  • dict if len(include) > 1 or reduce == False

  • single value if len(include) == 1 and reduce == True

When tools are present, tool_calls/tool_messages/tool_results are yielded at the end, after all text.

Raises:
  • ValueError -- if include is empty or contains unsupported fields (e.g., "messages").

  • ValueError -- if tool_messages or tool_results is in include but some tools are not ToolSpec.

Return type:

Generator[Union[str, Dict[str, Any], List[Union[str, Dict[str, Any]]]], None, None]
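The include/reduce shaping rule above can be illustrated with a small helper. `shape_output` is a hypothetical name used only for this sketch, not part of the package.

```python
def shape_output(fields: dict, include: list, reduce: bool):
    # Sketch of the output-shaping rule: with exactly one requested field
    # and reduce=True, return the bare value; otherwise return a dict of
    # the requested fields.
    if not include:
        raise ValueError("include must not be empty")
    selected = {k: fields[k] for k in include}
    if reduce and len(include) == 1:
        return selected[include[0]]
    return selected
```

So `include=["text"]` with the default reduce=True yields plain strings, which is why simple streaming loops can concatenate deltas directly.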

async astream(messages, tools=None, tool_choice=None, include=None, verbose=False, reduce=True, **kwargs)[source]

Asynchronously stream LLM responses (deltas) for the given messages.

Mirrors stream() but returns an async generator suitable for async workflows.

Warning: tools are not yet supported in async mode; providing them raises NotImplementedError.

Return type:

AsyncGenerator[Union[str, Dict[str, Any], List[Union[str, Dict[str, Any]]]], None]

oracle(messages, tools=None, tool_choice=None, include=None, verbose=False, reduce=True, **kwargs)[source]

Get the final LLM response for the given messages (aggregated from a stream).

Features:

  • Retry: automatic retries for transient failures.

  • Caching: memoizes successful runs keyed by inputs and name.

  • Streaming-first: uses stream=True under the hood and aggregates the result.

  • Tool support: can include tools and tool_results in the response.

  • Proxies: supports http_proxy and https_proxy in kwargs.

  • Flexible input: accepts multiple message formats and normalizes them.

  • Output shaping: control returned fields with include and flattening with reduce.

Parameters:
  • messages (Union[str, Dict[str, Any], Any, List[Union[str, Dict[str, Any], Any]]]) -- Conversation content, normalized by format_messages.

  • tools (Optional[List[Union[Dict, ToolSpec]]]) -- Optional list of tools, each can be a ToolSpec or jsonschema dict. When provided, include defaults to ["think", "text", "tool_calls"].

  • tool_choice (Optional[str]) -- Tool choice setting. Defaults to "auto" if tools are present, otherwise None.

  • include (Optional[List[Literal['text', 'think', 'tool_calls', 'content', 'message', 'structured', 'tool_messages', 'tool_results', 'delta_messages', 'messages']]]) -- Fields to include in the final result. Can be a str or list[str]. Allowed: "text", "think", "tool_calls", "content", "message", "structured", "tool_messages", "tool_results", "delta_messages", "messages". Default: ["text"] without tools, ["think", "text", "tool_calls"] with tools.

  • verbose (bool) -- If True, logs the resolved request config.

  • reduce (bool) -- If True and len(include) == 1, returns a single value instead of a dict. If False, always returns a dict.

  • **kwargs -- Per-call overrides for LLM config.

Returns:

  • dict if len(include) > 1 or reduce == False

  • single value if len(include) == 1 and reduce == True

Return type:

LLMResponse

Raises:
  • ValueError -- if include is empty or contains unsupported fields.

  • ValueError -- if tool_messages or tool_results is in include but some tools are not ToolSpec.
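Since oracle is documented as aggregating the underlying stream, its core behavior amounts to collecting deltas into one final value. The sketch below assumes a plain iterable of string deltas (as produced by stream with the default include=["text"] and reduce=True); `aggregate_stream` is a hypothetical name.

```python
def aggregate_stream(deltas):
    # Sketch: concatenate incremental text deltas into the final response,
    # the way oracle() aggregates the stream it consumes internally.
    parts = []
    for delta in deltas:
        parts.append(delta)
    return "".join(parts)
```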

async aoracle(messages, tools=None, tool_choice=None, include=None, verbose=False, reduce=True, **kwargs)[source]

Asynchronously retrieve the final LLM response (aggregated from the async stream).

Mirrors oracle() and shares its configuration, caching, and reduction semantics.

Return type:

Union[str, Dict[str, Any], List[Union[str, Dict[str, Any]]]]

embed(inputs, verbose=False, **kwargs)[source]

Get embeddings for the given inputs.

Parameters:
  • inputs (Union[str, List[str]]) -- A single string or a list of strings to embed.

  • verbose (bool) -- If True, logs the resolved request config.

  • **kwargs -- Additional parameters for the embedding request.

Returns:

A list of embeddings, one for each input string.

Return type:

List[List[float]]

async aembed(inputs, verbose=False, **kwargs)[source]

Get embeddings for the given inputs asynchronously.

Provides parity with embed() using litellm.aembedding under the hood while respecting caching behavior.

Return type:

List[List[float]]

tooluse(messages, tools, tool_choice='required', include=None, verbose=False, reduce=True, **kwargs)[source]

Execute tool calls with the LLM.

This is a convenience method that forces the LLM to use tools and returns the executed tool messages. It sets tool_choice="required" and returns tool_messages by default.

Parameters:
  • messages (Union[str, Dict[str, Any], Any, List[Union[str, Dict[str, Any], Any]]]) -- Conversation content.

  • tools (List[Union[Dict, ToolSpec]]) -- List of tools (ToolSpec instances required for execution).

  • tool_choice (str) -- Tool choice setting. Defaults to "required".

  • include (Union[str, List[str], None]) -- Fields to include in the result. Defaults to ["tool_messages"].

  • verbose (bool) -- If True, logs the resolved request config.

  • reduce (bool) -- If True, simplifies the output when possible.

  • **kwargs -- Per-call overrides for LLM config.

Returns:

List of tool result messages in OpenAI format:

[{"role": "tool", "tool_call_id": ..., "name": ..., "content": ...}, ...]

Return type:

List[Dict]

Raises:

ValueError -- if tools are not ToolSpec instances.

Example

>>> tool_messages = llm.tooluse("Calculate fib(10)", tools=[fib_tool])
>>> print(tool_messages)
[{"role": "tool", "tool_call_id": "...", "name": "fib", "content": "55"}]
>>> # For repeated tool use iteration:
>>> messages.append({"role": "assistant", "tool_calls": ...})
>>> messages.extend(tool_messages)
>>> tool_messages = llm.tooluse(messages, tools=[fib_tool])
async atooluse(messages, tools, verbose=False, **kwargs)[source]

Asynchronously execute tool calls with the LLM.

Mirrors tooluse() but awaits async streaming.

Return type:

List[Dict]

property dim

Get the dimensionality of the embeddings produced by this LLM. It is determined by making a test embedding call (embedding the string "<TEST>").

Warning

For efficiency, this is computed only once and cached. If the LLM config is edited after the first access (which is not recommended), the cached value may be stale.

Returns:

The dimensionality of the embeddings.

Return type:

int

Raises:

ValueError -- if the embedding dimension cannot be determined.

property embed_empty: List[float]

Get a fixed embedding vector for empty strings.

This is a simple heuristic embedding consisting of a 1 followed by zeros, with the length equal to the LLM's embedding dimensionality.

Returns:

The embedding vector for an empty string.

Return type:

List[float]
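The heuristic described above (a 1 followed by zeros, length equal to dim) can be written out directly; `embed_empty_vector` and the example dimensionality are illustrative, not the library's code.

```python
def embed_empty_vector(dim: int):
    # Sketch of the empty-string embedding heuristic: a 1.0 followed by
    # zeros, with total length equal to the embedding dimensionality.
    return [1.0] + [0.0] * (dim - 1)
```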

ahvn.llm.gather_assistant_message(message_chunks)[source]

Gather assistant message chunks (as returned by _LLMChunk.to_message()) from a list of message dictionaries into a single assistant message.

Parameters:

message_chunks (List[Dict]) -- A list of message dictionaries to gather.

Returns:

A dictionary containing the gathered assistant message.

Return type:

Dict[str, Any]

ahvn.llm.resolve_llm_config(preset=None, model=None, provider=None, **kwargs)[source]

Compile an LLM configuration dictionary based on the following order of priority:

  1. kwargs

  2. preset

  3. provider

  4. model

  5. global configuration

When a parameter is specified in multiple places, the one with the highest priority is used; for example, if a parameter is specified in both kwargs and preset, the value from kwargs wins. When missing, the preset falls back to the default preset, the model falls back to the default model, and the provider falls back to the model's default provider.

Parameters:
  • preset (str, optional) -- The preset name to use.

  • model (str, optional) -- The model name to use.

  • provider (str, optional) -- The provider name to use.

  • encrypt (bool, optional) -- Whether to encrypt the configuration. Defaults to False.

  • **kwargs -- Additional parameters to override in the configuration.

Returns:

The resolved LLM configuration dictionary.

Return type:

Dict[str, Any]
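The priority order above can be sketched as a layered dict merge, applying lower-priority layers first so higher-priority layers overwrite them. `merge_config` is a hypothetical helper; the real resolver also handles preset/model/provider fallbacks.

```python
def merge_config(global_cfg, model_cfg, provider_cfg, preset_cfg, kwargs):
    # Later layers override earlier ones, giving the documented priority:
    # kwargs > preset > provider > model > global configuration.
    config = {}
    for layer in (global_cfg, model_cfg, provider_cfg, preset_cfg, kwargs):
        config.update(layer or {})
    return config
```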

ahvn.llm.format_messages(messages)[source]

Unify messages for the LLM from diverse formats into the OpenAI message format.

  1. If messages is a single string, it is treated as a single user message.

  2. If messages is a list, each item is processed as follows:

    • If the item is a litellm.Message object, it is converted to dict using its json() method.

    • If the item is a string, it is treated as a user message.

    • If the item is a dict, it is used as is, but must contain a "role" field.

    • If the item is of any other type, a TypeError is raised.

  3. If a message dict contains "tool_calls", its "function.arguments" field is converted to a JSON string if it is not already a string.

Parameters:

messages (Union[str, Dict[str, Any], Any, List[Union[str, Dict[str, Any], Any]]]) -- List of messages that can be either dict or Message objects

Returns:

List of formatted messages in OpenAI format

Return type:

List[dict]

Raises:
  • ValueError -- If messages are invalid or missing required fields

  • TypeError -- If an unsupported message type is encountered
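The normalization rules above can be sketched in plain Python. This omits the litellm.Message and tool_calls-argument cases (which require the library) and uses the hypothetical name `normalize_messages`.

```python
def normalize_messages(messages):
    # Sketch of the documented rules: a bare string becomes a single user
    # message; list items may be strings (user messages) or dicts (which
    # must already carry a "role"); anything else is a TypeError.
    if isinstance(messages, str):
        return [{"role": "user", "content": messages}]
    out = []
    for item in messages:
        if isinstance(item, str):
            out.append({"role": "user", "content": item})
        elif isinstance(item, dict):
            if "role" not in item:
                raise ValueError("message dict must contain a 'role' field")
            out.append(item)
        else:
            raise TypeError(f"unsupported message type: {type(item)!r}")
    return out
```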

Submodules