ahvn.llm.base module¶
- class ahvn.llm.base.LLM(preset=None, model=None, provider=None, cache=True, cache_exclude=None, name=None, **kwargs)[source]¶
Bases: object
High-level chat LLM client with retry, caching, proxy, and streaming support.
This class wraps a litellm-compatible chat API and provides two access modes:
- stream: incremental (delta) results as they arrive
- oracle: full (final) result collected from the stream
Key features:
- Retry: automatic retries via tenacity on retryable exceptions.
- Caching: memoizes successful results keyed by all request inputs and a user-defined name. Excluded keys can be configured via cache_exclude.
- Streaming-first: always uses stream=True under the hood for stability; oracle aggregates the stream.
- Proxies: optional http_proxy and https_proxy support per request.
- Flexible messages: accepts multiple message formats and normalizes them.
- Output shaping: include and reduce control what is returned and whether to flatten lists.
- Parameters:
preset (str | None) -- Named preset from configuration (if supported by resolve_llm_config).
model (str | None) -- Model identifier (e.g., "gpt-4o"). Overrides preset when provided.
provider (str | None) -- Provider name used by the underlying client.
cache (Union[bool, str, BaseCache] | None) -- Cache implementation. Defaults to True. If True, uses DiskCache with the default cache directory ("core.cache_path"). If a string is provided, it is treated as the path for DiskCache. If None/False, uses NoCache (no caching).
cache_exclude (list[str] | None) -- Keys to exclude from cache key construction.
name (str | None) -- Logical name for this LLM instance. Used to namespace the cache. Defaults to "llm".
**kwargs -- Additional provider/client config (e.g., temperature, top_p, n, tools, tool_choice, http_proxy, https_proxy, and any litellm client options). These act as defaults and can be overridden per call.
Note
- Caching: Only successful executions are cached. The cache key includes the normalized messages, the full effective configuration, and name, minus any keys listed in cache_exclude. Set name differently for semantically distinct use cases to avoid cache collisions.
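The cache-key construction described in the note above can be illustrated with a minimal, self-contained sketch. The function name `make_cache_key` and the hashing scheme are assumptions for illustration, not the library's actual implementation; the point is that the key covers messages, the effective config, and name, while keys in cache_exclude are dropped before hashing.

```python
import hashlib
import json

def make_cache_key(messages, config, name="llm", cache_exclude=None):
    """Illustrative cache key: hash of normalized messages + config + name,
    minus any excluded config keys."""
    cfg = {k: v for k, v in config.items() if k not in set(cache_exclude or [])}
    payload = json.dumps(
        {"messages": messages, "config": cfg, "name": name},
        sort_keys=True, default=str,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Excluding a volatile key (e.g. a proxy) keeps the key stable across environments.
key_a = make_cache_key(
    [{"role": "user", "content": "hi"}],
    {"model": "gpt-4o", "temperature": 0.0, "http_proxy": "http://proxy:8080"},
    name="llm", cache_exclude=["http_proxy"],
)
key_b = make_cache_key(
    [{"role": "user", "content": "hi"}],
    {"model": "gpt-4o", "temperature": 0.0},
    name="llm",
)
assert key_a == key_b  # excluded keys do not affect the cache key
```

This is also why distinct use cases should get distinct name values: name is hashed into the key, so two call sites with identical messages and config but different names never collide.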
- __init__(preset=None, model=None, provider=None, cache=True, cache_exclude=None, name=None, **kwargs)[source]¶
- stream(messages, tools=None, tool_choice=None, include=None, verbose=False, reduce=True, **kwargs)[source]¶
Stream LLM responses (deltas) for the given messages.
Features:
- Retry: automatic retries for transient failures.
- Caching: memoizes successful runs keyed by inputs and name.
- Streaming-first: uses stream=True for stability; yields deltas as they arrive.
- Tool support: when tools are provided, tool_calls are aggregated and yielded at the end.
- Proxies: supports http_proxy and https_proxy in kwargs.
- Flexible input: accepts multiple message formats and normalizes them.
- Output shaping: control returned fields with include and flattening with reduce.
- Parameters:
messages (Union[str, Dict[str, Any], Any, List[Union[str, Dict[str, Any], Any]]]) -- Conversation content, normalized by format_messages: 1) str -> treated as a single user message; 2) list items: a litellm.Message is converted via json(), a str is treated as a user message, and a dict is used as-is and must include "role".
tools (Optional[List[Union[Dict, ToolSpec]]]) -- Optional list of tools; each can be a ToolSpec or a jsonschema dict. When provided, include defaults to ["think", "text", "tool_calls"].
tool_choice (Optional[str]) -- Tool choice setting. Defaults to "auto" if tools are present, otherwise None.
include (Optional[List[Literal['text', 'think', 'tool_calls', 'content', 'message', 'structured', 'tool_messages', 'tool_results', 'delta_messages', 'messages']]]) -- Fields to include in each streamed delta. Can be a str or list[str]. Default: ["text"] without tools, ["think", "text", "tool_calls"] with tools.
verbose (bool) -- If True, logs the resolved request config.
reduce (bool) -- If True and len(include) == 1, yields a single value instead of a dict. If False, always yields a dict.
**kwargs -- Per-call overrides for LLM config (e.g., temperature, top_p, http_proxy, https_proxy, etc.).
- Yields:
LLMResponse -- a dict if len(include) > 1 or reduce == False; a single value if len(include) == 1 and reduce == True. When tools are present, tool_calls/tool_messages/tool_results are yielded at the end, after all text.
- Raises:
ValueError -- if include is empty or contains unsupported fields (e.g., "messages").
ValueError -- if tool_messages or tool_results is in include but some tools are not ToolSpec.
- Return type:
Generator[Union[str, Dict[str, Any], List[Union[str, Dict[str, Any]]]], None, None]
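The interaction of include and reduce on each streamed delta can be sketched with a small helper. `shape_delta` is a hypothetical name invented for illustration; it mimics the documented behavior (single-field results flatten when reduce is True, otherwise a dict is yielded) without reproducing the library's internals.

```python
def shape_delta(delta: dict, include: list, reduce: bool = True):
    """Illustrative output shaping: keep only `include` fields,
    flattening to a single value when exactly one field is requested."""
    if not include:
        raise ValueError("include must not be empty")
    shaped = {k: delta.get(k) for k in include}
    if reduce and len(include) == 1:
        return shaped[include[0]]  # single value instead of a dict
    return shaped

# One field + reduce=True -> the bare value (the default ["text"] case).
assert shape_delta({"text": "Hello", "think": None}, ["text"]) == "Hello"
# Multiple fields -> always a dict.
assert shape_delta({"text": "Hi", "think": "…"}, ["think", "text"]) == {"think": "…", "text": "Hi"}
# reduce=False -> a dict even for a single field.
assert shape_delta({"text": "Hello"}, ["text"], reduce=False) == {"text": "Hello"}
```

Consumers that later consolidate a stream (e.g. with gather_stream) want the dict form, which is why reduce=False is required there.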
- async astream(messages, tools=None, tool_choice=None, include=None, verbose=False, reduce=True, **kwargs)[source]¶
Asynchronously stream LLM responses (deltas) for the given messages.
Mirrors stream() but returns an async generator suitable for async workflows.
Warning: tools are not yet supported in async mode and will raise NotImplementedError if provided.
- Return type:
AsyncGenerator[Union[str, Dict[str, Any], List[Union[str, Dict[str, Any]]]], None]
- oracle(messages, tools=None, tool_choice=None, include=None, verbose=False, reduce=True, **kwargs)[source]¶
Get the final LLM response for the given messages (aggregated from a stream).
Features:
- Retry: automatic retries for transient failures.
- Caching: memoizes successful runs keyed by inputs and name.
- Streaming-first: uses stream=True under the hood and aggregates the result.
- Tool support: can include tools and tool_results in the response.
- Proxies: supports http_proxy and https_proxy in kwargs.
- Flexible input: accepts multiple message formats and normalizes them.
- Output shaping: control returned fields with include and flattening with reduce.
- Parameters:
messages (Union[str, Dict[str, Any], Any, List[Union[str, Dict[str, Any], Any]]]) -- Conversation content, normalized by format_messages.
tools (Optional[List[Union[Dict, ToolSpec]]]) -- Optional list of tools; each can be a ToolSpec or a jsonschema dict. When provided, include defaults to ["think", "text", "tool_calls"].
tool_choice (Optional[str]) -- Tool choice setting. Defaults to "auto" if tools are present.
include (Optional[List[Literal['text', 'think', 'tool_calls', 'content', 'message', 'structured', 'tool_messages', 'tool_results', 'delta_messages', 'messages']]]) -- Fields to include in the final result. Can be a str or list[str]. Default: ["text"] without tools, ["think", "text", "tool_calls"] with tools.
verbose (bool) -- If True, logs the resolved request config.
reduce (bool) -- If True and len(include) == 1, returns a single value instead of a dict. If False, always returns a dict.
**kwargs -- Per-call overrides for LLM config.
- Returns:
dict if len(include) > 1 or reduce == False
single value if len(include) == 1 and reduce == True
- Return type:
LLMResponse
- Raises:
ValueError -- if include is empty or contains unsupported fields.
ValueError -- if tool_messages or tool_results is in include but some tools are not ToolSpec.
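Since oracle is documented as aggregating the underlying stream rather than making a separate non-streaming call, its core text path reduces to joining the streamed deltas. The sketch below shows that aggregation in isolation; `aggregate_text` is a hypothetical helper name, not part of the library's API.

```python
def aggregate_text(deltas):
    """Illustrative oracle-style aggregation: join streamed text deltas
    into the final response text, skipping empty chunks."""
    return "".join(d for d in deltas if d)

# A stream of partial text deltas becomes one final string.
chunks = ["The answer", " is", "", " 42."]
assert aggregate_text(chunks) == "The answer is 42."
```

Non-text fields (tool_calls, tool_messages, tool_results) aggregate differently: per the stream() docs they are collected across the stream and attached at the end rather than concatenated.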
- async aoracle(messages, tools=None, tool_choice=None, include=None, verbose=False, reduce=True, **kwargs)[source]¶
Asynchronously retrieve the final LLM response (aggregated from the async stream).
Mirrors oracle() and shares its configuration, caching, and reduction semantics.
- Return type:
Union[str, Dict[str, Any], List[Union[str, Dict[str, Any]]]]
- async aembed(inputs, verbose=False, **kwargs)[source]¶
Get embeddings for the given inputs asynchronously.
Provides parity with embed(), using litellm.aembedding under the hood while respecting caching behavior.
- tooluse(messages, tools, tool_choice='required', include=None, verbose=False, reduce=True, **kwargs)[source]¶
Execute tool calls with the LLM.
This is a convenience method that forces the LLM to use tools and returns the executed tool messages. It sets tool_choice="required" and returns tool_messages by default.
- Parameters:
messages (Union[str, Dict[str, Any], Any, List[Union[str, Dict[str, Any], Any]]]) -- Conversation content.
tools (List[Union[Dict, ToolSpec]]) -- List of tools (ToolSpec instances are required for execution).
tool_choice (str) -- Tool choice setting. Defaults to "required".
include (Union[str, List[str], None]) -- Fields to include in the result. Defaults to ["tool_messages"].
verbose (bool) -- If True, logs the resolved request config.
reduce (bool) -- If True, simplifies the output when possible.
**kwargs -- Per-call overrides for LLM config.
- Returns:
- List of tool result messages in OpenAI format:
[{"role": "tool", "tool_call_id": ..., "name": ..., "content": ...}, ...]
- Return type:
List[Dict]
- Raises:
ValueError -- if tools are not ToolSpec instances.
Example
>>> tool_messages = llm.tooluse("Calculate fib(10)", tools=[fib_tool])
>>> print(tool_messages)
[{"role": "tool", "tool_call_id": "...", "name": "fib", "content": "55"}]
>>> # For repeated tool-use iteration:
>>> messages.append({"role": "assistant", "tool_calls": ...})
>>> messages.extend(tool_messages)
>>> tool_messages = llm.tooluse(messages, tools=[fib_tool])
- async atooluse(messages, tools, verbose=False, **kwargs)[source]¶
Asynchronously execute tool calls with the LLM.
Mirrors tooluse() but awaits async streaming.
- property dim¶
Get the dimensionality of the embeddings produced by this LLM. This is determined by making a test embedding call (i.e., "<TEST>").
Warning
Due to efficiency considerations, this is only computed once and cached. If the LLM config is edited after the first call (which is not recommended), the result may be incorrect.
- Returns:
The dimensionality of the embeddings.
- Return type:
- Raises:
ValueError -- if the embedding dimension cannot be determined.
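The compute-once-then-cache behavior warned about above is the standard cached-property pattern. This sketch (with a hypothetical `EmbeddingClient` and a stand-in embed call) shows both the probe-once mechanics and why editing config after the first access can leave a stale value.

```python
from functools import cached_property

class EmbeddingClient:
    """Illustrative client: the embedding dimension is probed once and cached."""

    def __init__(self):
        self.calls = 0

    def embed(self, text: str) -> list:
        # Stand-in for a real embedding request; counts invocations.
        self.calls += 1
        return [0.0] * 1536

    @cached_property
    def dim(self) -> int:
        # Probe with a test input; computed on first access, cached thereafter.
        return len(self.embed("<TEST>"))

client = EmbeddingClient()
assert client.dim == 1536
assert client.dim == 1536   # second access hits the cache
assert client.calls == 1    # the probe embedding ran only once
```

If the underlying model were swapped after the first access, `dim` would keep returning the old value, which is exactly the staleness the docstring warns about.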
- ahvn.llm.base.exec_tool_calls(tool_calls, toolspec_dict)[source]¶
Execute tool calls and return standardized tool messages/results.
Compatibility:
- Accepts tool calls with or without a function layer (e.g., {"name": "foo", "arguments": "{}"}).
- Missing or empty id defaults to an empty string.
- arguments may be a dict or a JSON string; non-dict inputs are parsed via json.loads with graceful error handling.
- Parameters:
- Returns:
- (tool_messages, tool_results)
tool_messages: List of tool message dicts in OpenAI format for conversation continuation.
tool_results: List of result content strings (just the returned values).
- Return type:
- Raises:
ValueError -- If a tool name is missing or the ToolSpec is unavailable.
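The compatibility rules above (optional `function` layer, defaulted `id`, dict-or-JSON-string `arguments`) can be captured in a small normalizer. `normalize_tool_call` is a hypothetical helper written to illustrate those rules, not the library's actual internals.

```python
import json

def normalize_tool_call(call: dict):
    """Illustrative normalization of a tool call, mirroring the documented
    compatibility rules: tolerate a missing `function` layer, default a
    missing/empty id to "", and parse string `arguments` as JSON."""
    fn = call.get("function", call)      # flat {"name": ..., "arguments": ...} also works
    name = fn.get("name")
    if not name:
        raise ValueError("tool call is missing a name")
    call_id = call.get("id") or ""       # missing or empty id -> ""
    args = fn.get("arguments", {})
    if not isinstance(args, dict):
        args = json.loads(args or "{}")  # arguments may arrive as a JSON string
    return call_id, name, args

# Flat form, no function layer, no id:
assert normalize_tool_call({"name": "foo", "arguments": "{}"}) == ("", "foo", {})
# OpenAI-style nested form with string arguments:
assert normalize_tool_call(
    {"id": "c1", "function": {"name": "fib", "arguments": '{"n": 10}'}}
) == ("c1", "fib", {"n": 10})
```

After normalization, the name is looked up in toolspec_dict and the parsed arguments are passed to the matching ToolSpec; a missing name or spec is where the documented ValueError comes from.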
- ahvn.llm.base.gather_assistant_message(message_chunks)[source]¶
Gather assistant message_chunks (returned by _LLMChunk.to_message()) from a list of message dictionaries.
- ahvn.llm.base.gather_stream(stream, include=None, reduce=True)[source]¶
Gather an iterable of LLM.stream responses into a single consolidated LLM.oracle response. To use gather_stream, the stream must use reduce=False so that each delta is returned as a dictionary.
- Parameters:
stream (Iterable[LLMResponse]) -- An iterable of LLM responses from LLM.stream.
include (List[LLMIncludeType] | None) -- Fields to include in the final output. If None, includes all fields found in the stream. This can usually be omitted if the stream was generated with the desired include fields. However, when the streaming fails (empty), this ensures the final output has the expected structure.
reduce (bool) -- Whether to reduce the final output if only one field is included.
- Returns:
The consolidated LLM response.
- Return type:
LLMResponse
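A minimal sketch of the consolidation gather_stream performs, under assumed semantics: string-valued fields concatenate across deltas, other fields accumulate into lists, and a single-field result is flattened when reduce is True. The function name `gather` and the type-based merge rule are illustrative assumptions, not the library's exact algorithm.

```python
def gather(deltas, include=None, reduce=True):
    """Illustrative consolidation of dict deltas from a stream run with
    reduce=False into one final response."""
    deltas = list(deltas)
    # If include is omitted, use every field observed in the stream; passing
    # include explicitly guarantees the output shape even for an empty stream.
    fields = list(include) if include else sorted({k for d in deltas for k in d})
    out = {}
    for field in fields:
        parts = [d[field] for d in deltas if d.get(field) is not None]
        if all(isinstance(p, str) for p in parts):
            out[field] = "".join(parts)  # text-like fields concatenate
        else:
            out[field] = [x for p in parts
                          for x in (p if isinstance(p, list) else [p])]
    if reduce and len(fields) == 1:
        return out[fields[0]]
    return out

deltas = [{"text": "Hel"}, {"text": "lo"}]
assert gather(deltas, include=["text"]) == "Hello"
assert gather(deltas, include=["text"], reduce=False) == {"text": "Hello"}
# Empty stream: an explicit include still yields the expected structure.
assert gather([], include=["text"]) == ""
```

The last assertion shows why the docstring recommends passing include when a stream may fail early: it pins the output shape even when no deltas arrive.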
- ahvn.llm.base.resolve_llm_config(preset=None, model=None, provider=None, **kwargs)[source]¶
Compile an LLM configuration dictionary based on the following order of priority:
1. kwargs
2. preset
3. provider
4. model
5. global configuration
When a parameter is specified in multiple places, the one with the highest priority is used. For example, if a parameter is specified in both kwargs and preset, the value from kwargs is used. When missing, the preset falls back to the default preset, the model falls back to the default model, and the provider falls back to the default provider of the model.
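The highest-priority-wins merge described above maps naturally onto a layered lookup. This sketch uses `collections.ChainMap` with three of the five layers and invented example values; the real function also resolves preset/provider/model layers from configuration, which is omitted here.

```python
from collections import ChainMap

# Hypothetical config layers (highest priority first): kwargs > preset > global.
global_config = {"model": "gpt-4o", "temperature": 1.0, "timeout": 60}
preset_config = {"temperature": 0.2}
call_kwargs = {"temperature": 0.7}

# ChainMap returns the value from the first (highest-priority) map containing a key.
resolved = dict(ChainMap(call_kwargs, preset_config, global_config))
assert resolved == {"model": "gpt-4o", "temperature": 0.7, "timeout": 60}
```

As in the documented example, temperature set in both kwargs and the preset resolves to the kwargs value, while keys absent from the upper layers fall through to the global configuration.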
- ahvn.llm.base.format_messages(messages)[source]¶
Unify messages for LLM in diverse formats to OpenAI message format.
If messages is a single string, it is treated as a single user message.
If messages is a list, each item is processed as follows:
If the item is a litellm.Message object, it is converted to dict using its json() method.
If the item is a string, it is treated as a user message.
If the item is a dict, it is used as is, but must contain a "role" field.
If the item is of any other type, a TypeError is raised.
If a message dict contains "tool_calls", its "function.arguments" field is converted to a JSON string if it is not already a string.
- Parameters:
messages (Union[str, Dict[str, Any], Any, List[Union[str, Dict[str, Any], Any]]]) -- Messages that can be strings, dicts, or Message objects
- Returns:
List of formatted messages in OpenAI format
- Return type:
List[dict]
- Raises:
ValueError -- If messages are invalid or missing required fields
TypeError -- If an unsupported message type is encountered
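The normalization rules for plain strings and dicts can be demonstrated with a self-contained subset. `to_openai_messages` is a hypothetical stand-in covering only the str and dict branches; the real format_messages additionally handles litellm.Message objects (via json()) and rewrites tool_calls arguments to JSON strings.

```python
def to_openai_messages(messages):
    """Illustrative subset of format_messages: normalize a str or a list of
    str/dict items into OpenAI-format message dicts."""
    if isinstance(messages, str):
        messages = [messages]            # single string -> one user message
    out = []
    for item in messages:
        if isinstance(item, str):
            out.append({"role": "user", "content": item})
        elif isinstance(item, dict):
            if "role" not in item:
                raise ValueError("message dict must contain a 'role' field")
            out.append(item)             # dicts pass through as-is
        else:
            raise TypeError(f"unsupported message type: {type(item).__name__}")
    return out

assert to_openai_messages("hi") == [{"role": "user", "content": "hi"}]
assert to_openai_messages([{"role": "system", "content": "s"}, "q"]) == [
    {"role": "system", "content": "s"},
    {"role": "user", "content": "q"},
]
```

Mixed inputs are therefore convenient at call sites: a bare string for one-shot prompts, or a list that freely interleaves raw strings with explicit role dicts.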