Operator guide

With the help of the devkit, a Python function can be converted into an AI Studio operator. An operator in AI Studio often requires a parameter, an input port, and an output port.

Operator Parameters

Arguments are converted to Parameters in AI Tools. A function input is treated as an argument if its type appears in the following table:

| Python type | JSON value |
|---|---|
| int | "integer" |
| float | "real" |
| str | "string" |
| bool | "boolean" |
| EnumType | "category" |

Inputs/Outputs

Inputs: Any function input not mentioned in the arguments table above gets converted into an input port of the AI Tool operator.

Outputs: Only classes are supported as outputs.

For an input/output variable of class C, the parsed JSON value is given by: C_MODULE_NAME.C_CLASS_NAME
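
As a rough illustration of the rules above, the following sketch classifies a function's inputs using only the standard library. It is not the devkit's implementation: `classify`, `Mode`, `Model`, and `train` are hypothetical names introduced for this example, and only the type-to-value mapping is taken from the table above.

```python
import inspect
from enum import Enum

# Mapping copied from the parameter table; Enum subclasses map to "category".
PARAMETER_TYPES = {int: "integer", float: "real", str: "string", bool: "boolean"}


class Mode(Enum):  # hypothetical enum parameter type
    FAST = "fast"
    EXACT = "exact"


class Model:  # hypothetical class that travels through a port
    pass


def train(model: Model, rate: float = 0.1, mode: Mode = Mode.FAST) -> Model:
    """Hypothetical operator function: one input port, two parameters."""
    return model


def classify(func):
    """Split function inputs into parameters and input ports (illustrative only)."""
    parameters, ports = {}, {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation in PARAMETER_TYPES:
            parameters[name] = PARAMETER_TYPES[annotation]
        elif inspect.isclass(annotation) and issubclass(annotation, Enum):
            parameters[name] = "category"
        else:
            # Port types are identified as C_MODULE_NAME.C_CLASS_NAME.
            ports[name] = f"{annotation.__module__}.{annotation.__name__}"
    return parameters, ports


parameters, ports = classify(train)
print(parameters)  # {'rate': 'real', 'mode': 'category'}
print(ports)       # a single entry for 'model', ending in '.Model'
```

Here `rate` and `mode` would become operator parameters, while `model` would become an input port.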

The following types have special handling (this list is likely to expand in the future):

| Python type | Description |
|---|---|
| pandas.core.frame.DataFrame | Datatable in AI Tools |
| pathlib.Path | Generic File representation using its path |
| altair_aitools.ext.io.File | Type-specific File representation (see below for details) |

Important: Function arguments are passed via keywords at operator execution; therefore, positional-only and *args arguments are not supported for Python Extensions.

Although **kwargs parameters can be defined, they can only be set by the runtime (e.g. random seed), not by the user.
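
The restriction follows from how keyword invocation works in Python. The sketch below, which assumes nothing about the devkit itself, uses `inspect` to flag arguments that cannot be supplied by keyword; `unsupported_arguments`, `scale`, and `scale_positional` are hypothetical names used only for illustration.

```python
import inspect

# Parameter kinds that cannot be filled through keyword arguments.
UNSUPPORTED_KINDS = (
    inspect.Parameter.POSITIONAL_ONLY,  # def f(a, /)
    inspect.Parameter.VAR_POSITIONAL,   # def f(*args)
)


def unsupported_arguments(func):
    """Return the names of arguments that cannot be passed by keyword."""
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in UNSUPPORTED_KINDS
    ]


def scale(value: float, factor: float = 2.0, **kwargs) -> float:
    """Every argument here can be passed by keyword."""
    return value * factor


def scale_positional(value: float, /, *extras: float) -> float:
    """Neither argument here can be passed by keyword."""
    return value + sum(extras)


# Keyword-based invocation, similar in spirit to how an operator is executed.
print(scale(**{"value": 1.5, "factor": 4.0}))   # 6.0
print(unsupported_arguments(scale))             # []
print(unsupported_arguments(scale_positional))  # ['value', 'extras']
```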

Extension configuration

There are some attributes that must be specified in the configuration file in the scope [extension]:

| Key | Description | Type |
|---|---|---|
| name | display name of the extension | string |
| namespace | namespace of the extension | string |
| version | version of the extension | string |
| environment | ID of Python environment (key:version) | string |
| module | name of the module containing the operator functions | string |

The ID of the Python environment consists of the key and optional version of the Python distribution required to run the contained Python code.

These are valid environment IDs: sample-environment or sample-environment:0.1.0
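
To make the key:version format concrete, here is a minimal sketch of how such an ID could be split into its parts. `EnvironmentId` and `parse_environment_id` are hypothetical helpers written for this example, not part of the devkit.

```python
from typing import NamedTuple, Optional


class EnvironmentId(NamedTuple):
    key: str
    version: Optional[str]  # None when no version is given


def parse_environment_id(value: str) -> EnvironmentId:
    """Split 'key' or 'key:version' into its components."""
    key, _, version = value.partition(":")
    if not key:
        raise ValueError(f"environment ID has no key: {value!r}")
    return EnvironmentId(key, version or None)


print(parse_environment_id("sample-environment"))
print(parse_environment_id("sample-environment:0.1.0"))
```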

Optional attributes

In addition to the mandatory attributes, extension developers are allowed to specify the following configuration properties in the [extension] scope:

| Key | Description | Type |
|---|---|---|
| min_core_version | minimum version of AI Studio Core | string |
| license | license identifier (see below for details) | string |

Override operator attributes

The configuration file can be used to override the following attributes for each operator in the scope [operators.operator_name]:

| Key | Description | Type | Default | Constraints |
|---|---|---|---|---|
| name | display name of the operator | string | titled name of the function | - |
| icon | ID of the desired icon | string | null | - |
| outputs | names of the output ports | list of strings | result{i}, where i is the port index | the length of the list must match the number of ports |
| group_key | key of container group | string | null | allowed characters: [a-zA-Z0-9_.]; not allowed: whitespace, two or more consecutive dots (.), a leading or trailing dot |
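
The group_key constraints can be expressed as a single regular expression. The following is a sketch for checking a key before building an extension; `is_valid_group_key` is a hypothetical helper, and treating an empty string as invalid is an assumption that the constraints above do not state.

```python
import re

# One or more segments of [A-Za-z0-9_], separated by single dots.
# This rules out whitespace, consecutive dots, and leading or trailing dots.
GROUP_KEY_PATTERN = re.compile(r"[A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)*")


def is_valid_group_key(key: str) -> bool:
    """Check a group_key against the documented character constraints."""
    return GROUP_KEY_PATTERN.fullmatch(key) is not None


print(is_valid_group_key("models.trees"))   # True
print(is_valid_group_key("models..trees"))  # False: consecutive dots
print(is_valid_group_key(".models"))        # False: leading dot
print(is_valid_group_key("my group"))       # False: whitespace
```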

Extra configuration with docstrings

The builder inspects the docstrings of the operator functions and collects optional metadata from them. The following configuration options are parsed from each docstring:

  • Synopsis: A brief overview of the operator.

    The first row(s) of the docstring followed by an empty line.

  • Description: A detailed description of the operator.

    The row(s) between the Synopsis and the Arguments section. If there is no Arguments section, all rows after the Synopsis are used as the description.

  • Arguments: Description of the input parameters (only parsed for operator parameters not for input ports).

    Parameter descriptions are parsed from a structured format. Only the Google and Sphinx docstring formats are supported for this feature. Both typed and multiline descriptions are supported.
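
The sketch below shows how a Google-style docstring breaks down into these parts. It is a simplified approximation, not the builder's actual parser: `split_docstring` and `rescale` are hypothetical names, and only the `Args:` header of the Google format is recognized.

```python
import inspect


def rescale(scale: float = 1.0) -> None:
    """Rescale numeric values.

    Rescales every value so that the largest
    absolute value equals the given scale.

    Args:
        scale (float): Target magnitude of the
            largest absolute value.
    """


def split_docstring(func):
    """Return (synopsis, description) following the rules listed above."""
    lines = (inspect.getdoc(func) or "").splitlines()

    # Synopsis: the first row(s), up to the first empty line.
    synopsis = []
    index = 0
    while index < len(lines) and lines[index].strip():
        synopsis.append(lines[index].strip())
        index += 1

    # Description: everything after the synopsis, up to the Arguments section.
    description = []
    for line in lines[index:]:
        if line.strip() == "Args:":
            break
        if line.strip():
            description.append(line.strip())

    return " ".join(synopsis), " ".join(description)


synopsis, description = split_docstring(rescale)
print(synopsis)     # Rescale numeric values.
print(description)  # Rescales every value so that the largest absolute value equals the given scale.
```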

Extension licensing (optional)

To specify the license for the Python Extension, include a LICENSE file in the root directory and define the license identifier in the configuration file under the extension.license scope. Third-party package licenses can also be defined by placing the license files in the licenses folder, using the format package_name.license_id.license (e.g., licenses/pandas.BSD-3-Clause.license). Users will be able to view these licenses.

Package resources

It is possible to add resource files to the extension archive by placing the desired files into the root level resources folder. These files become available inside the operators (see below for details).

Advanced features

File I/O

This feature facilitates the transfer of file objects between AI Studio and Python processes. The altair_aitools.ext.io module provides predefined classes specifically designed for this purpose. These classes, which inherit from the abstract base class altair_aitools.ext.io.File, can be utilized as inputs or outputs for Operators.

  1. Input Handling:

    • When a class derived from altair_aitools.ext.io.File is used as an input in an Operator, the SDK ensures that the file is automatically transferred from AI Studio to the Python process.
    • The file is loaded into the memory of the Python process, allowing for immediate access of its content within the Operator.
  2. Output Handling:

    • When an Operator generates an output in the form of an altair_aitools.ext.io.File object, the SDK handles the transfer of this file back to AI Studio.
    • This ensures that the output file is available in AI Studio for further use.

Available File types

| Class | Module | Backend | Status |
|---|---|---|---|
| TextFile | altair_aitools.ext.io | str | |
| BinaryFile | altair_aitools.ext.io | bytes | |
| ImageFile | altair_aitools.ext.io.vision | PIL.Image | |
| VideoFile | altair_aitools.ext.io.vision | TBD | |
| AudioFile | altair_aitools.ext.io.audio | TBD | |

File types Example

from altair_aitools.ext.io import BinaryFile, TextFile


def convert_file(file: BinaryFile) -> TextFile:
    """Converts a BinaryFile into a TextFile.

    Args:
        file (BinaryFile): Input binary file.

    Returns:
        TextFile: Output text file.
    """
    b: bytes = file.get_content() # Returns the content of the file as bytes
    text: str = b.decode() # Decodes bytes into a string
    return TextFile.from_content(text) # Creates and returns a TextFile object with the decoded content

Conditional Parameter Definition

In Python extensions, all operator parameters are defined as function inputs (optionally with default values). However, a parameter may be conditional on the value of others, meaning that it should be hidden if a specified condition is not met. For instance, different dropdown items may require different parameters to be set.

This feature enables Python extension developers to define such conditional parameters. To do so, use typing.Annotated (or typing_extensions.Annotated for Python versions before 3.9) as a wrapper around the parameter, and then specify the conditions with ConditionalAnnotation from altair_aitools.ext.annotations. If a parameter has multiple condition annotations, the final condition is obtained by joining all conditions with AND operators.

some_integer_parameter: Annotated[int, CONDITION_1, CONDITION_2, ..., CONDITION_N]

Supported logical operators for condition definition

Condition annotations can be defined by calling logical operators. The annotation class altair_aitools.ext.annotations.ConditionalAnnotation supports the following logical operators:

| Name | Description |
|---|---|
| eq | equals |
| ne | not equals |

These operators should be called at parameter definition with two arguments:

  • key (str): Name of the parameter to depend on.
  • value (Any): Value to compare the parameter with.

Annotation Example

from typing import Annotated
from altair_aitools.ext.annotations import ConditionalAnnotation


def do_something_with_names(
    name: str = "",
    surname: Annotated[str, ConditionalAnnotation.ne("name", "")] = "",  # Will only be shown if name is set (name != "")
) -> None: ...

Note: For Python versions before 3.9, use typing_extensions.Annotated instead!
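
To illustrate how several conditions on one parameter combine with AND, the following sketch evaluates eq and ne conditions against a set of current parameter values. It runs without the devkit: `Condition` is a hypothetical stand-in for the real annotation class, and `configure` and `is_visible` are names invented for this example. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass
from typing import Annotated, Any, Dict, get_type_hints


@dataclass(frozen=True)
class Condition:
    """Hypothetical stand-in for a condition annotation (not the devkit class)."""

    operator: str  # "eq" or "ne"
    key: str       # name of the parameter to depend on
    value: Any     # value to compare the parameter with

    def holds(self, values: Dict[str, Any]) -> bool:
        matches = values.get(self.key) == self.value
        return matches if self.operator == "eq" else not matches


def configure(
    mode: str = "basic",
    name: str = "",
    suffix: Annotated[
        str,
        Condition("eq", "mode", "advanced"),
        Condition("ne", "name", ""),
    ] = "",
) -> None: ...


def is_visible(func, parameter: str, values: Dict[str, Any]) -> bool:
    """A parameter is visible only if all of its conditions hold (AND)."""
    hints = get_type_hints(func, include_extras=True)
    metadata = getattr(hints[parameter], "__metadata__", ())
    return all(c.holds(values) for c in metadata if isinstance(c, Condition))


print(is_visible(configure, "suffix", {"mode": "advanced", "name": "Ada"}))  # True
print(is_visible(configure, "suffix", {"mode": "advanced", "name": ""}))     # False
print(is_visible(configure, "suffix", {"mode": "basic", "name": "Ada"}))     # False
```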

DataFrame Column Selector

String parameters can be annotated with SelectedColumnAnnotation from altair_aitools.ext.annotations to indicate that the parameter represents a column name of another (DataFrame) input. The annotation is constructed by calling it with a single str argument: the name of the DataFrame input from which the user selects the column in the GUI.

Important: the name of the DataFrame in the annotation, and the name of the DataFrame parameter in the function's signature must match!

Column Annotation Example

from pandas import DataFrame
from typing import Annotated
from altair_aitools.ext.annotations import SelectedColumnAnnotation

def process_df_with_selected_column(
        my_dataframe: DataFrame,
        selected_column: Annotated[str, SelectedColumnAnnotation("my_dataframe")]
) -> None: ...

Note: For Python versions before 3.9, use typing_extensions.Annotated instead!

Long Text Parameter

String parameters can be annotated with TextParameterAnnotation from altair_aitools.ext.annotations indicating that the parameter holds a longer text. This leads to a button opening a text editor on the GUI. Also, the text type can be specified by passing a value of the enum TextType to the annotation initializer. The text type is only used for syntax highlighting on the GUI. Supported types: PLAIN (default), JSON, XML, HTML, SQL and PYTHON.

Text Type Example

from typing import Annotated
from altair_aitools.ext.annotations import TextParameterAnnotation, TextType

def use_longer_texts(
        plain_text: Annotated[str, TextParameterAnnotation()], # Same as providing TextType.PLAIN
        json_text: Annotated[str, TextParameterAnnotation(TextType.JSON)],
) -> None: ...

Note: For Python versions before 3.9, use typing_extensions.Annotated instead!

Accessing Resources

To access a resource file from the resources folder (see above), use the Resource class from altair_aitools.ext.io. Its initializer takes the path of the file relative to the resources folder, and the class implements the context manager pattern, providing the resource file as a byte stream.

Example

import pandas
from altair_aitools.ext.io import Resource
from pandas import DataFrame

def sample_data() -> DataFrame:
    with Resource("data.csv") as resource:
        return pandas.read_csv(resource)
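
Because the resource is provided as a byte stream, any reader that accepts a binary file object should be able to consume it. The sketch below uses io.BytesIO as a stand-in for that stream so that it runs without the devkit; `load_settings`, `read_lines`, and the sample contents are hypothetical.

```python
import io
import json
from typing import BinaryIO, List


def load_settings(stream: BinaryIO) -> dict:
    """Parse JSON directly from a binary stream."""
    return json.load(stream)


def read_lines(stream: BinaryIO, encoding: str = "utf-8") -> List[str]:
    """Decode a binary stream as text and return its lines."""
    text = io.TextIOWrapper(stream, encoding=encoding)
    return [line.rstrip("\n") for line in text]


# io.BytesIO stands in for the stream a resource would provide.
with io.BytesIO(b'{"threshold": 0.5}') as fake_resource:
    print(load_settings(fake_resource))  # {'threshold': 0.5}

with io.BytesIO(b"alpha\nbeta\n") as fake_resource:
    print(read_lines(fake_resource))  # ['alpha', 'beta']
```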

Depending on other Python Extensions

Functionality can be imported from other Python Extensions. First, declare which extension(s) to depend on by listing them with their minimum versions in the configuration file extension.toml under the extension.dependencies scope:

[extension]
name = "Dependent Extension"
namespace = "depex"
version = "0.1.0"
environment = "sample-environment:0.1.0"
module = "depex"

[extension.dependencies]
pytensors = "0.3.0"

The extension archives of the extensions listed here must be included in the dependencies folder:

depex/
├── extension.toml
├── depex
│   ├── __init__.py
│   └── operator.py
└── dependencies
    └──  pytensors-0.3.0.zip

Finally, Python's importing system can be used to import objects from other extensions:

from pandas import DataFrame
import tensors

def sample_random_data(rows: int = 5, cols: int = 2) -> DataFrame:
    return tensors.tensor_to_df(tensors.random_tensor(rows, cols))

Important Notes:

  • While the configuration file uses the namespace to declare a dependency, code can only be imported using the module attribute of the extension.
  • Instead of directly importing functions from a dependency, import modules (otherwise the imported functions are exposed as operators).
  • Only import objects from the configured dependencies directly, not from their dependencies (those must also be explicitly configured).
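
The second note can be illustrated with plain Python modules. The sketch below assumes that operator discovery looks at the functions present in the extension module's namespace, which is an assumption consistent with the note and not a description of the builder's actual code. All module and function names are hypothetical stand-ins.

```python
import inspect
import types


def functions_in(module) -> list:
    """List the public functions found in a module's namespace."""
    return sorted(
        name
        for name, obj in vars(module).items()
        if inspect.isfunction(obj) and not name.startswith("_")
    )


def random_tensor(rows, cols):
    """Stand-in for a function provided by a dependency."""
    return [[0.0] * cols for _ in range(rows)]


def sample_random_data(rows=5, cols=2):
    """Stand-in for an operator function of the dependent extension."""
    return random_tensor(rows, cols)


# Stand-in for the dependency's module.
tensors = types.ModuleType("tensors")
tensors.random_tensor = random_tensor

# Equivalent of `from tensors import random_tensor`: the function itself
# lands in the extension module's namespace.
direct_import = types.ModuleType("depex_direct")
direct_import.random_tensor = tensors.random_tensor
direct_import.sample_random_data = sample_random_data

# Equivalent of `import tensors`: only the module object lands there.
module_import = types.ModuleType("depex_module")
module_import.tensors = tensors
module_import.sample_random_data = sample_random_data

print(functions_in(direct_import))  # ['random_tensor', 'sample_random_data']
print(functions_in(module_import))  # ['sample_random_data']
```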