ahvn.utils.basic.serialize_utils module

ahvn.utils.basic.serialize_utils.load_txt(path, encoding=None, strict=False)

Load text from a file. If the file does not exist and strict is False, returns an empty string.

Parameters:
  • path (str) -- The path to the file.

  • encoding (str) -- The encoding to use for reading the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns an empty string.

Returns:

The contents of the file, or an empty string if the file does not exist.

Return type:

str

Raises:

FileNotFoundError -- If the file does not exist and strict is True.
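
The strict flag described above can be illustrated with a minimal standard-library sketch. The name load_txt_sketch is hypothetical, and the fixed "utf-8" default is an assumption standing in for the "core.encoding" config lookup; this is not the library's implementation.

```python
import os
import tempfile


def load_txt_sketch(path, encoding="utf-8", strict=False):
    """Hypothetical stand-in that mirrors the documented load_txt behavior."""
    if not os.path.exists(path):
        if strict:
            # Strict mode: a missing file is an error.
            raise FileNotFoundError(path)
        # Lenient mode: a missing file reads as empty text.
        return ""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# A path inside a fresh temporary directory is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "missing.txt")
lenient_result = load_txt_sketch(missing)

try:
    load_txt_sketch(missing, strict=True)
    strict_raised = False
except FileNotFoundError:
    strict_raised = True
```

The lenient default suits optional files such as caches, while strict=True is the safer choice when a missing file indicates a real problem.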

ahvn.utils.basic.serialize_utils.iter_txt(path, encoding=None, strict=False)

Iterate over a text file, yielding each line with its trailing newline character stripped.

Parameters:
  • path (str) -- The path to the file.

  • encoding (str) -- The encoding to use for reading the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns an empty generator.

Yields:

str -- Each line in the text file.

Raises:

FileNotFoundError -- If the file does not exist and strict is True.

Return type:

Generator[str, None, None]
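
As a rough illustration of lazy line iteration, the sketch below yields lines one at a time and strips only the trailing newline. The name iter_txt_sketch and the "utf-8" default are assumptions. Because this sketch is itself a generator, its FileNotFoundError surfaces on first iteration rather than at call time; whether the library behaves the same way is not stated in this reference.

```python
import os
import tempfile
from typing import Generator


def iter_txt_sketch(path, encoding="utf-8", strict=False) -> Generator[str, None, None]:
    """Hypothetical stand-in that mirrors the documented iter_txt behavior."""
    if not os.path.exists(path):
        if strict:
            raise FileNotFoundError(path)
        return  # empty generator for a missing file
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            # Strip only the line terminator, keeping other whitespace intact.
            yield line.rstrip("\n")


lines_path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(lines_path, "w", encoding="utf-8") as f:
    f.write("alpha\n  beta  \n")

lines = list(iter_txt_sketch(lines_path))
```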

ahvn.utils.basic.serialize_utils.save_txt(obj, path, encoding=None)

Save text to a file. If the file does not exist, it will be created.

Warning

A trailing newline is appended to the text, consistent with the behavior of append_txt.

Parameters:
  • obj (Any) -- The text to save.

  • path (str) -- The path to the file.

  • encoding (str) -- The encoding to use for writing the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

ahvn.utils.basic.serialize_utils.append_txt(obj, path, encoding=None)

Append text to a file. If the file does not exist, it will be created.

Parameters:
  • obj (Any) -- The text to append.

  • path (str) -- The path to the file.

  • encoding (str) -- The encoding to use for writing the file. Defaults to None, which will use the encoding in the config file ("core.encoding").
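
The trailing-newline convention shared by save_txt and append_txt can be sketched as follows. Both function names are hypothetical, and converting obj with str formatting is an assumption based on the Any type shown in the signature.

```python
import os
import tempfile


def save_txt_sketch(obj, path, encoding="utf-8"):
    """Hypothetical: overwrite the file with the text plus a trailing newline."""
    with open(path, "w", encoding=encoding) as f:
        f.write(f"{obj}\n")


def append_txt_sketch(obj, path, encoding="utf-8"):
    """Hypothetical: append the text plus a trailing newline."""
    with open(path, "a", encoding=encoding) as f:
        f.write(f"{obj}\n")


log_path = os.path.join(tempfile.mkdtemp(), "log.txt")
save_txt_sketch("first", log_path)
append_txt_sketch("second", log_path)

with open(log_path, encoding="utf-8") as f:
    log_text = f.read()
```

Because every write ends with a newline, a later append always starts on its own line instead of being glued to the previous entry.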

ahvn.utils.basic.serialize_utils.loads_yaml(s, **kwargs)

Load a YAML string into a Python object.

Parameters:
  • s (str) -- The YAML string to load.

  • **kwargs -- Additional keyword arguments to pass to yaml.safe_load.

Returns:

The loaded Python object.

Return type:

Any

ahvn.utils.basic.serialize_utils.dumps_yaml(obj, sort_keys=False, indent=4, allow_unicode=True, **kwargs)

Serialize a Python object to a YAML string.

Parameters:
  • obj (Any) -- The Python object to serialize.

  • sort_keys (bool) -- Whether to sort the keys in the YAML output. Defaults to False.

  • indent (int) -- The number of spaces to use for indentation. Defaults to 4.

  • allow_unicode (bool) -- Whether to allow Unicode characters in the output. Defaults to True.

  • **kwargs -- Additional keyword arguments to pass to yaml.safe_dump.

Returns:

The YAML string representation of the object.

Return type:

str

ahvn.utils.basic.serialize_utils.load_yaml(path, encoding=None, strict=False, **kwargs)

Load a YAML file into a Python object.

Parameters:
  • path (str) -- The path to the YAML file.

  • encoding (str) -- The encoding to use for reading the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns an empty dictionary.

  • **kwargs -- Additional keyword arguments to pass to yaml.safe_load.

Returns:

The Python object represented by the YAML file.

Return type:

Any

Raises:

FileNotFoundError -- If the file does not exist and strict is True.

ahvn.utils.basic.serialize_utils.dump_yaml(obj, path, sort_keys=False, indent=4, allow_unicode=True, **kwargs)

Save a Python object to a YAML file.

Parameters:
  • obj (Any) -- The Python object to save.

  • path (str) -- The path to the YAML file.

  • sort_keys (bool) -- Whether to sort the keys in the YAML output. Defaults to False.

  • indent (int) -- The number of spaces to use for indentation. Defaults to 4.

  • allow_unicode (bool) -- Whether to allow Unicode characters in the output. Defaults to True.

  • **kwargs -- Additional keyword arguments to pass to yaml.safe_dump.

ahvn.utils.basic.serialize_utils.save_yaml(obj, path, sort_keys=False, indent=4, allow_unicode=True, **kwargs)

Alias for dump_yaml. Saves a Python object to a YAML file.

Parameters:
  • obj (Any) -- The Python object to save.

  • path (str) -- The path to the YAML file.

  • sort_keys (bool) -- Whether to sort the keys in the YAML output. Defaults to False.

  • indent (int) -- The number of spaces to use for indentation. Defaults to 4.

  • allow_unicode (bool) -- Whether to allow Unicode characters in the output. Defaults to True.

  • **kwargs -- Additional keyword arguments to pass to yaml.safe_dump.

ahvn.utils.basic.serialize_utils.load_pkl(path, strict=False, **kwargs)

Load a Python object from a pickle file.

Parameters:
  • path (str) -- The path to the pickle file.

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns None.

  • **kwargs -- Additional keyword arguments to pass to pickle.load.

Returns:

The Python object represented by the pickle file.

Return type:

Any

ahvn.utils.basic.serialize_utils.dump_pkl(obj, path, **kwargs)

Save a Python object to a pickle file.

Parameters:
  • obj (Any) -- The Python object to save.

  • path (str) -- The path to the pickle file.

  • **kwargs -- Additional keyword arguments to pass to pickle.dump.

ahvn.utils.basic.serialize_utils.save_pkl(obj, path, **kwargs)

Alias for dump_pkl. Saves a Python object to a pickle file.

Parameters:
  • obj (Any) -- The Python object to save.

  • path (str) -- The path to the pickle file.

  • **kwargs -- Additional keyword arguments to pass to pickle.dump.

ahvn.utils.basic.serialize_utils.load_hex(path, strict=False, **kwargs)

Load the binary contents of a file as a hexadecimal string.

Parameters:
  • path (str) -- The path to the file.

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns an empty string.

  • **kwargs -- Additional keyword arguments to pass to bytes.hex.

Returns:

The hexadecimal string representation of the file's contents.

Return type:

str
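
To show what forwarding keyword arguments to bytes.hex can look like, here is a small standard-library sketch. The name load_hex_sketch is hypothetical, and raising FileNotFoundError in strict mode is an assumption made by analogy with the text loaders; this reference does not name the exception for load_hex.

```python
import os
import tempfile


def load_hex_sketch(path, strict=False, **kwargs):
    """Hypothetical stand-in: read raw bytes and render them as hex."""
    if not os.path.exists(path):
        if strict:
            raise FileNotFoundError(path)  # assumed exception type
        return ""
    with open(path, "rb") as f:
        # bytes.hex accepts optional sep and bytes_per_sep arguments.
        return f.read().hex(**kwargs)


hex_path = os.path.join(tempfile.mkdtemp(), "blob.bin")
with open(hex_path, "wb") as f:
    f.write(b"\x00\xffAB")

plain_hex = load_hex_sketch(hex_path)
spaced_hex = load_hex_sketch(hex_path, sep=" ")
```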

ahvn.utils.basic.serialize_utils.dump_hex(obj, path, **kwargs)

Save a string or bytes object as a hexadecimal string to a file.

Parameters:
  • obj (str) -- The string or bytes object to save as hexadecimal.

  • path (str) -- The path to the file.

  • **kwargs -- Additional keyword arguments to pass to binascii.hexlify.

ahvn.utils.basic.serialize_utils.save_hex(obj, path, **kwargs)

Alias for dump_hex. Saves a string or bytes object as a hexadecimal string to a file.

Parameters:
  • obj (str) -- The string or bytes object to save as hexadecimal.

  • path (str) -- The path to the file.

  • **kwargs -- Additional keyword arguments to pass to binascii.hexlify.

ahvn.utils.basic.serialize_utils.load_b64(path, strict=False)

Load the binary contents of a file as a Base64-encoded string.

Parameters:
  • path (str) -- The path to the file.

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns an empty string.

Returns:

The Base64 string representation of the file's contents.

Return type:

str

ahvn.utils.basic.serialize_utils.dump_b64(obj, path)

Save a Base64 string to a file by decoding it into binary content.

Parameters:
  • obj (str) -- The Base64 string to decode and save.

  • path (str) -- The path to the output file.

ahvn.utils.basic.serialize_utils.save_b64(obj, path)

Alias for dump_b64. Saves a Base64 string to a file by decoding it into binary content.

Parameters:
  • obj (str) -- The Base64 string to decode and save.

  • path (str) -- The path to the output file.
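
The Base64 helpers are asymmetric in a way that is easy to misread: loading encodes the file's bytes into Base64 text, while dumping decodes Base64 text back into raw bytes on disk. A minimal sketch of that round trip, using hypothetical function names and only the standard base64 module:

```python
import base64
import os
import tempfile


def load_b64_sketch(path, strict=False):
    """Hypothetical: read raw bytes and return them as Base64 text."""
    if not os.path.exists(path):
        if strict:
            raise FileNotFoundError(path)  # assumed exception type
        return ""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


def dump_b64_sketch(obj, path):
    """Hypothetical: decode Base64 text and write the raw bytes."""
    with open(path, "wb") as f:
        f.write(base64.b64decode(obj))


b64_path = os.path.join(tempfile.mkdtemp(), "payload.bin")
dump_b64_sketch("aGVsbG8=", b64_path)  # Base64 for b"hello"

with open(b64_path, "rb") as f:
    raw_bytes = f.read()
round_trip = load_b64_sketch(b64_path)
```

The file on disk holds the decoded bytes, not the Base64 text, so the two helpers are inverses of each other.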

ahvn.utils.basic.serialize_utils.serialize_path(path)

Serialize the contents of a directory hierarchy into a dictionary mapping relative paths to Base64-encoded file contents.

Directories are recorded with a value of None so that the structure can be restored later.

Parameters:
  • path (str) -- Directory (or file) path to serialize.

Returns:

Mapping of relative paths to Base64 payloads (or None for directories).

Return type:

Dict[str, Optional[str]]

ahvn.utils.basic.serialize_utils.deserialize_path(serialized, path)

Materialize the files and directories described by serialized under path.

Parameters:
  • serialized (Dict[str, Optional[str]]) -- Mapping emitted by serialize_path().

  • path (str) -- Destination directory where content should be written.
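
The mapping format shared by serialize_path and deserialize_path can be sketched with os.walk and base64. Everything below is an illustrative assumption: the function names are hypothetical, keys use "/" as the separator, and only the directory case is handled (the single-file case mentioned above is left out).

```python
import base64
import os
import tempfile
from typing import Dict, Optional


def serialize_path_sketch(path: str) -> Dict[str, Optional[str]]:
    """Hypothetical: map relative paths to Base64 payloads (None for directories)."""
    serialized: Dict[str, Optional[str]] = {}
    for root, dirs, files in os.walk(path):
        for name in dirs:
            rel = os.path.relpath(os.path.join(root, name), path)
            serialized[rel.replace(os.sep, "/")] = None
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, path)
            with open(full, "rb") as f:
                payload = base64.b64encode(f.read()).decode("ascii")
            serialized[rel.replace(os.sep, "/")] = payload
    return serialized


def deserialize_path_sketch(serialized: Dict[str, Optional[str]], path: str) -> None:
    """Hypothetical: recreate the directories and files under path."""
    for rel, payload in serialized.items():
        target = os.path.join(path, *rel.split("/"))
        if payload is None:
            os.makedirs(target, exist_ok=True)
        else:
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as f:
                f.write(base64.b64decode(payload))


# Build a tiny tree: pkg/empty/ (empty directory) and pkg/data.bin.
src = tempfile.mkdtemp()
os.makedirs(os.path.join(src, "pkg", "empty"))
with open(os.path.join(src, "pkg", "data.bin"), "wb") as f:
    f.write(b"hi")

snapshot = serialize_path_sketch(src)
dst = tempfile.mkdtemp()
deserialize_path_sketch(snapshot, dst)

with open(os.path.join(dst, "pkg", "data.bin"), "rb") as f:
    restored = f.read()
```

Recording directories explicitly with None is what allows empty directories such as pkg/empty to survive the round trip.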

ahvn.utils.basic.serialize_utils.serialize_func(func, **kwargs)

Serialize a function to a descriptor dictionary using dill for source code and cloudpickle for binary content.

Parameters:
  • func (Callable) -- The function to serialize.

  • **kwargs -- Additional keyword arguments to pass to cloudpickle.dumps.

Returns:

A dictionary representation of the serialized function. It contains the following attributes:

Built-in attributes:
  • name: The function's name.
  • qualname: The qualified name of the function.
  • doc: The function's docstring.
  • module: The qualified name of the module where the function is defined.
  • defaults: Default values for the function's positional arguments.
  • kwdefaults: Default values for the function's keyword-only arguments.
  • annotations: Type annotations for the function's arguments and return value.
  • code: The source code of the function (as a string, via dill).
  • dict: The function's __dict__ (excluding __source__), with all values stringified.

Extra attributes:
  • stream: Whether the function is a generator function (bool).
  • hex_dumps: The function serialized as a hex string using cloudpickle.

Return type:

Dict
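
Several of the descriptor fields listed above correspond to standard function attributes. The sketch below is a partial, standard-library-only illustration with a hypothetical name (describe_func_sketch); it omits code and hex_dumps, which the description attributes to dill and cloudpickle, and the exact way the library fills each field is an assumption.

```python
import inspect


def describe_func_sketch(func):
    """Hypothetical partial descriptor built from plain introspection."""
    return {
        "name": func.__name__,
        "qualname": func.__qualname__,
        "doc": func.__doc__,
        "module": func.__module__,
        "defaults": func.__defaults__,      # positional defaults, as a tuple
        "kwdefaults": func.__kwdefaults__,  # keyword-only defaults, as a dict
        "annotations": dict(func.__annotations__),
        "stream": inspect.isgeneratorfunction(func),
    }


def countdown(n: int = 3, *, step: int = 1):
    """Yield n, n - step, ... while the value stays positive."""
    while n > 0:
        yield n
        n -= step


descriptor = describe_func_sketch(countdown)
```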

ahvn.utils.basic.serialize_utils.deserialize_func(func, prefer='hex_dumps')

Deserialize a function from a descriptor dictionary.

Parameters:
  • func (Dict) -- The function descriptor dictionary.

  • prefer (Literal['code','hex_dumps']) -- Which method to try first.

Returns:

The deserialized function.

Return type:

Callable

Raises:

FunctionDeserializationError -- If deserialization fails.

class ahvn.utils.basic.serialize_utils.AhvnJsonEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: JSONEncoder

encode(obj)

Return a JSON string representation of a Python data structure.

>>> from json.encoder import JSONEncoder
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'

static transform(obj)

class ahvn.utils.basic.serialize_utils.AhvnJsonDecoder(*args, **kwargs)

Bases: JSONDecoder

__init__(*args, **kwargs)

object_hook, if specified, will be called with the result of every JSON object decoded and its return value will be used in place of the given dict. This can be used to provide custom deserializations (e.g. to support JSON-RPC class hinting).

object_pairs_hook, if specified will be called with the result of every JSON object decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders. If object_hook is also defined, the object_pairs_hook takes priority.

parse_float, if specified, will be called with the string of every JSON float to be decoded. By default this is equivalent to float(num_str). This can be used to use another datatype or parser for JSON floats (e.g. decimal.Decimal).

parse_int, if specified, will be called with the string of every JSON int to be decoded. By default this is equivalent to int(num_str). This can be used to use another datatype or parser for JSON integers (e.g. float).

parse_constant, if specified, will be called with one of the following strings: -Infinity, Infinity, NaN. This can be used to raise an exception if invalid JSON numbers are encountered.

If strict is false (true is the default), then control characters will be allowed inside strings. Control characters in this context are those with character codes in the 0-31 range, including '\t' (tab), '\n', '\r' and '\0'.

Return type:

None

static transform(obj)

ahvn.utils.basic.serialize_utils.loads_json(s, **kwargs)

Load a JSON string into a Python object.

Parameters:
  • s (str) -- The JSON string to load.

  • **kwargs -- Additional keyword arguments to pass to json.loads.

Returns:

The Python object represented by the JSON string.

Return type:

Any

ahvn.utils.basic.serialize_utils.dumps_json(obj, sort_keys=False, indent=4, ensure_ascii=False, **kwargs)

Serialize a Python object to a JSON string.

Parameters:
  • obj (Any) -- The Python object to serialize.

  • sort_keys (bool) -- Whether to sort the keys in the JSON output. Defaults to False.

  • indent (int) -- The number of spaces to use for indentation. Defaults to 4.

  • ensure_ascii (bool) -- Whether to escape non-ASCII characters. Defaults to False.

  • **kwargs -- Additional keyword arguments to pass to json.dumps.

Returns:

The JSON string representation of the object.

Return type:

str
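
The documented defaults (indent=4, ensure_ascii=False) differ from those of the standard json.dumps. The effect can be seen with the standard library alone; the snippet below calls json.dumps directly and does not involve this module's custom encoder.

```python
import json

record = {"name": "café", "tags": ["a", "b"]}

# Mirrors the documented defaults: 4-space indent, non-ASCII kept as-is.
pretty = json.dumps(record, sort_keys=False, indent=4, ensure_ascii=False)

# Standard-library default for comparison: non-ASCII characters are escaped.
escaped = json.dumps(record, ensure_ascii=True)
```

Keeping non-ASCII characters unescaped makes the output readable for multilingual text, at the cost of requiring a Unicode-capable encoding when the string is written to disk.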

ahvn.utils.basic.serialize_utils.load_json(path, encoding=None, strict=False, **kwargs)

Load a JSON file into a Python object.

Parameters:
  • path (str) -- The path to the JSON file.

  • encoding (str) -- The encoding to use for reading the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns an empty dictionary.

  • **kwargs -- Additional keyword arguments to pass to json.load.

Returns:

The Python object represented by the JSON file.

Return type:

Any

Raises:

FileNotFoundError -- If the file does not exist and strict is True.

ahvn.utils.basic.serialize_utils.dump_json(obj, path, sort_keys=False, indent=4, encoding=None, ensure_ascii=False, **kwargs)

Save a Python object to a JSON file.

Parameters:
  • obj (Any) -- The Python object to save.

  • path (str) -- The path to the JSON file.

  • sort_keys (bool) -- Whether to sort the keys in the JSON output. Defaults to False.

  • indent (int) -- The number of spaces to use for indentation. Defaults to 4.

  • encoding (str) -- The encoding to use for writing the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • ensure_ascii (bool) -- Whether to escape non-ASCII characters. Defaults to False.

  • **kwargs -- Additional keyword arguments to pass to json.dump.

ahvn.utils.basic.serialize_utils.save_json(obj, path, sort_keys=False, indent=4, encoding=None, ensure_ascii=False, **kwargs)

Alias for dump_json. Saves a Python object to a JSON file.

Parameters:
  • obj (Any) -- The Python object to save.

  • path (str) -- The path to the JSON file.

  • sort_keys (bool) -- Whether to sort the keys in the JSON output. Defaults to False.

  • indent (int) -- The number of spaces to use for indentation. Defaults to 4.

  • encoding (str) -- The encoding to use for writing the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • ensure_ascii (bool) -- Whether to escape non-ASCII characters. Defaults to False.

  • **kwargs -- Additional keyword arguments to pass to json.dump.

ahvn.utils.basic.serialize_utils.escape_json(s, args, **kwargs)

Repair corrupted JSON by escaping the string values of known keys.

  • Only processes keys listed in args

  • Only escapes string values

  • Leaves non-string values untouched

  • Handles unescaped quotes and newlines inside strings

Parameters:
  • s (str) -- The corrupted JSON string.

  • args (List[str]) -- List of keys whose string values need to be escaped.

  • **kwargs -- Additional keyword arguments to pass to json.dumps when returning the repaired JSON.

Returns:

The repaired JSON string.

Return type:

str

ahvn.utils.basic.serialize_utils.loads_jsonl(s, **kwargs)

Load a JSON Lines string into a list of Python objects.

Parameters:
  • s (str) -- The JSON Lines string to load.

  • **kwargs -- Additional keyword arguments to pass to json.loads.

Returns:

A list of Python objects represented by the JSON Lines string.

Return type:

List[Any]

ahvn.utils.basic.serialize_utils.dumps_jsonl(obj, sort_keys=False, ensure_ascii=False, **kwargs)

Serialize a list of Python objects to a JSON Lines string.

Warning

A trailing newline is added at the end of the string, consistent with the behavior of append_jsonl. indent is not supported, because JSON Lines requires each record to occupy a single line; if passed, it is ignored.

Parameters:
  • obj (List[Any]) -- The list of Python objects to serialize.

  • sort_keys (bool) -- Whether to sort the keys in the JSON output. Defaults to False.

  • ensure_ascii (bool) -- Whether to escape non-ASCII characters. Defaults to False.

  • **kwargs -- Additional keyword arguments to pass to json.dumps.

Returns:

The JSON Lines string representation of the list.

Return type:

str
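
The JSON Lines layout described in the warning (one record per line, trailing newline, indent ignored) can be sketched with the standard json module. Both function names are hypothetical, and skipping blank lines when parsing is an assumption of this sketch.

```python
import json


def dumps_jsonl_sketch(objs, sort_keys=False, ensure_ascii=False, **kwargs):
    """Hypothetical: one compact JSON document per line, newline-terminated."""
    # Indentation would spread a record over several lines, so drop it.
    kwargs.pop("indent", None)
    return "".join(
        json.dumps(o, sort_keys=sort_keys, ensure_ascii=ensure_ascii, **kwargs) + "\n"
        for o in objs
    )


def loads_jsonl_sketch(s, **kwargs):
    """Hypothetical: parse each non-blank line as its own JSON document."""
    return [json.loads(line, **kwargs) for line in s.splitlines() if line.strip()]


rows = [{"id": 1}, {"id": 2, "note": "naïve"}]
jsonl_text = dumps_jsonl_sketch(rows, indent=4)  # indent is silently ignored
parsed_rows = loads_jsonl_sketch(jsonl_text)
```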

ahvn.utils.basic.serialize_utils.load_jsonl(path, encoding=None, strict=False, **kwargs)

Load a JSON Lines file into a list of Python objects.

Parameters:
  • path (str) -- The path to the JSON Lines file.

  • encoding (str) -- The encoding to use for reading the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns an empty list.

  • **kwargs -- Additional keyword arguments to pass to json.load.

Returns:

A list of Python objects represented by the JSON Lines file.

Return type:

List[Any]

Raises:

FileNotFoundError -- If the file does not exist and strict is True.

ahvn.utils.basic.serialize_utils.iter_jsonl(path, encoding=None, strict=False, **kwargs)

Iterate over a JSON Lines file, yielding each Python object.

Parameters:
  • path (str) -- The path to the JSON Lines file.

  • encoding (str) -- The encoding to use for reading the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • strict (bool) -- If True, raises an error if the file does not exist. Otherwise, returns an empty generator.

  • **kwargs -- Additional keyword arguments to pass to json.load.

Yields:

Any -- Each Python object represented by a line in the JSON Lines file.

Raises:

FileNotFoundError -- If the file does not exist and strict is True.

Return type:

Generator[Any, None, None]
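
Iterating record by record avoids loading a large JSON Lines file into memory at once. A minimal sketch of that pattern follows; the name iter_jsonl_sketch is hypothetical, the "utf-8" default stands in for the "core.encoding" lookup, and skipping blank lines is an assumption.

```python
import json
import os
import tempfile
from typing import Any, Generator


def iter_jsonl_sketch(path, encoding="utf-8", strict=False, **kwargs) -> Generator[Any, None, None]:
    """Hypothetical: lazily parse one JSON document per line."""
    if not os.path.exists(path):
        if strict:
            raise FileNotFoundError(path)
        return  # empty generator for a missing file
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            if line.strip():
                yield json.loads(line, **kwargs)


events_path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
with open(events_path, "w", encoding="utf-8") as f:
    f.write('{"level": "info", "msg": "start"}\n')
    f.write('{"level": "error", "msg": "boom"}\n')
    f.write('{"level": "info", "msg": "end"}\n')

# Filter while streaming; only one record is held in memory at a time.
error_msgs = [e["msg"] for e in iter_jsonl_sketch(events_path) if e["level"] == "error"]
```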

ahvn.utils.basic.serialize_utils.dump_jsonl(obj, path, sort_keys=False, ensure_ascii=False, encoding=None, **kwargs)

Save a list of Python objects to a JSON Lines file.

Warning

A trailing newline is added at the end of the file, consistent with the behavior of append_jsonl. indent is not supported, because JSON Lines requires each record to occupy a single line; if passed, it is ignored.

Parameters:
  • obj (List[Any]) -- The list of Python objects to save.

  • path (str) -- The path to the JSON Lines file.

  • sort_keys (bool) -- Whether to sort the keys in the JSON output. Defaults to False.

  • ensure_ascii (bool) -- Whether to escape non-ASCII characters. Defaults to False.

  • encoding (str) -- The encoding to use for writing the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • **kwargs -- Additional keyword arguments to pass to json.dump.

ahvn.utils.basic.serialize_utils.save_jsonl(obj, path, sort_keys=False, ensure_ascii=False, encoding=None, **kwargs)

Alias for dump_jsonl. Saves a list of Python objects to a JSON Lines file.

Warning

A trailing newline is added at the end of the file, consistent with the behavior of append_jsonl. indent is not supported, because JSON Lines requires each record to occupy a single line; if passed, it is ignored.

Parameters:
  • obj (List[Any]) -- The list of Python objects to save.

  • path (str) -- The path to the JSON Lines file.

  • sort_keys (bool) -- Whether to sort the keys in the JSON output. Defaults to False.

  • ensure_ascii (bool) -- Whether to escape non-ASCII characters. Defaults to False.

  • encoding (str) -- The encoding to use for writing the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • **kwargs -- Additional keyword arguments to pass to json.dump.

ahvn.utils.basic.serialize_utils.append_jsonl(obj, path, sort_keys=False, ensure_ascii=False, encoding=None, **kwargs)

Append one or more Python objects to a JSON Lines file. If the file does not exist, it will be created. If the object is a dictionary, a single line is added with the dictionary serialized as JSON. If the object is a list, each item in the list is serialized as a separate line.

Parameters:
  • obj (Union[Dict,List[Any]]) -- A single dictionary, or a list of Python objects, to append.

  • path (str) -- The path to the JSON Lines file.

  • sort_keys (bool) -- Whether to sort the keys in the JSON output. Defaults to False.

  • ensure_ascii (bool) -- Whether to escape non-ASCII characters. Defaults to False.

  • encoding (str) -- The encoding to use for writing the file. Defaults to None, which will use the encoding in the config file ("core.encoding").

  • **kwargs -- Additional keyword arguments to pass to json.dump.
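
The dictionary-versus-list rule for append_jsonl can be sketched as follows. The name append_jsonl_sketch is hypothetical and the "utf-8" default is an assumption; the sketch only illustrates the documented dispatch, not the library's code.

```python
import json
import os
import tempfile


def append_jsonl_sketch(obj, path, sort_keys=False, ensure_ascii=False, encoding="utf-8", **kwargs):
    """Hypothetical: a dict becomes one line, a list becomes one line per item."""
    records = [obj] if isinstance(obj, dict) else list(obj)
    with open(path, "a", encoding=encoding) as f:  # "a" creates the file if needed
        for record in records:
            line = json.dumps(record, sort_keys=sort_keys, ensure_ascii=ensure_ascii, **kwargs)
            f.write(line + "\n")


jsonl_path = os.path.join(tempfile.mkdtemp(), "data.jsonl")
append_jsonl_sketch({"id": 1}, jsonl_path)               # one line
append_jsonl_sketch([{"id": 2}, {"id": 3}], jsonl_path)  # two more lines

with open(jsonl_path, encoding="utf-8") as f:
    appended = [json.loads(line) for line in f]
```

Treating a dictionary as a single record prevents it from being iterated key by key, which would otherwise write each key as its own line.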