Advanced Operator Examples

Custom Data Objects
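
The example below defines a simple custom data object by subclassing BaseDataObject, operators that produce and consume it, and a serializer for a third-party type (scipy's CSR sparse matrix) implemented by subclassing ObjectSerializer with object_to_dict / dict_to_object conversions.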


from typing import Dict, Any
from scipy.sparse import csr_matrix
import numpy as np
from altair_aitools.ext.io import ObjectSerializer, BaseDataObject


class TensorShape2D(BaseDataObject):
    """Simple data object to represent a 2D tensor shape."""

    height: int
    width: int


def create_tensor_shape(height: int, width: int) -> TensorShape2D:
    """An operator to create a 2D tensor shape object.

    Args:
        height (int): The height of the tensor.
        width (int): The width of the tensor.

    Returns:
        TensorShape2D: The created tensor shape object.
    """
    return TensorShape2D(height=height, width=width)


def random_tensor(shape: TensorShape2D) -> np.ndarray:
    """An operator to create a random 2D tensor.

    Args:
        shape (TensorShape2D): The shape of the tensor.

    Returns:
        np.ndarray: The created random tensor.
    """
    return np.random.rand(shape.height, shape.width)


class SparseMatrixSerializer(ObjectSerializer[csr_matrix]):
    """Serializer for scipy sparse CSR matrix objects."""

    @staticmethod
    def object_to_dict(object: csr_matrix) -> Dict[str, Any]:
        return {
            "data": object.data,  # Keep as numpy array
            "indices": object.indices,  # Keep as numpy array
            "indptr": object.indptr,  # Keep as numpy array
            "shape": list(object.shape),
        }

    @staticmethod
    def dict_to_object(object: Dict[str, Any]) -> csr_matrix:
        return csr_matrix(
            (
                object["data"],
                object["indices"],
                object["indptr"],
            ),  # Unpack numpy arrays
            shape=tuple(object["shape"]),
        )


def np_to_csr(array: np.ndarray) -> csr_matrix:
    """Operator to convert a numpy array to a CSR sparse matrix."""
    return csr_matrix(array)


def csr_to_np(matrix: csr_matrix) -> np.ndarray:
    """Operator to convert a CSR sparse matrix to a numpy array."""
    return matrix.toarray()


def concat_tensors(arrays: list[np.ndarray], axis: int | None = 0) -> np.ndarray:
    """Operator to concatenate a list of numpy arrays."""
    return np.concatenate(arrays, axis=axis)
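
Outside the workflow engine, these operators are plain Python functions, so the serializer round trip can be sanity-checked directly; a minimal sketch using only the definitions above:

shape = create_tensor_shape(height=3, width=4)
tensor = random_tensor(shape)       # 3x4 random ndarray
sparse = np_to_csr(tensor)          # convert to a CSR sparse matrix

# Round-trip through the serializer's dict representation
as_dict = SparseMatrixSerializer.object_to_dict(sparse)
restored = SparseMatrixSerializer.dict_to_object(as_dict)
assert np.array_equal(csr_to_np(restored), tensor)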

Conditional Parameter Definition

In Python extensions, all operator parameters are defined as function inputs (optionally with default values). However, a parameter may be conditional on the value of other parameters, meaning that it should be hidden if a specified condition is not met. For instance, different dropdown items may require different parameters to be set.

This feature enables Python extension developers to define such conditional parameters. To this end, wrap the parameter type in typing.Annotated and specify the conditions with ConditionalAnnotation from altair_aitools.ext.annotations. If a parameter has multiple condition annotations, the final condition is obtained by joining all of them with logical AND.

some_integer_parameter: Annotated[int, CONDITION_1, CONDITION_2, ..., CONDITION_N]

Supported logical operators for condition definition

Condition annotations are created by calling logical operators on the annotation class altair_aitools.ext.annotations.ConditionalAnnotation. The following logical operators are supported:

  • eq: equals
  • ne: not equals

These operators should be called at parameter definition with two arguments:

  • key (str): Name of the parameter to depend on.
  • value (Any): Value to compare the parameter with.

from typing import Annotated
from altair_aitools.ext.annotations import ConditionalAnnotation


def do_something_with_names(
    name: str = "",
    surname: Annotated[str, ConditionalAnnotation.ne("name", "")] = "",  # Will only be shown if name is set (name != "")
) -> None: ...
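
Multiple condition annotations on the same parameter are joined with logical AND; a minimal sketch (the operator and parameter names are illustrative):

from typing import Annotated
from altair_aitools.ext.annotations import ConditionalAnnotation


def do_something_conditionally(
    mode: str = "advanced",
    name: str = "",
    # Shown only if mode == "advanced" AND name != ""
    nickname: Annotated[
        str,
        ConditionalAnnotation.eq("mode", "advanced"),
        ConditionalAnnotation.ne("name", ""),
    ] = "",
) -> None: ...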

DataFrame Column Selector

String parameters can be annotated with SelectedColumnAnnotation from altair_aitools.ext.annotations, indicating that the parameter represents a column name of another (DataFrame) input. The annotation object is constructed with a single str argument: the name of the DataFrame parameter from which the user can select a column on the GUI.

Important: the name of the DataFrame given in the annotation and the name of the DataFrame parameter in the function's signature must match!

from pandas import DataFrame
from typing import Annotated
from altair_aitools.ext.annotations import SelectedColumnAnnotation

def process_df_with_selected_column(
        my_dataframe: DataFrame,
        selected_column: Annotated[str, SelectedColumnAnnotation("my_dataframe")]
) -> None: ...
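
A complete operator that actually uses the selected column could look like the following; a minimal sketch relying only on standard pandas indexing (the operator name is illustrative):

from pandas import DataFrame
from typing import Annotated
from altair_aitools.ext.annotations import SelectedColumnAnnotation


def keep_selected_column(
    my_dataframe: DataFrame,
    selected_column: Annotated[str, SelectedColumnAnnotation("my_dataframe")],
) -> DataFrame:
    """Operator that keeps only the column selected on the GUI."""
    return my_dataframe[[selected_column]]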

Long Text Parameter

String parameters can be annotated with TextParameterAnnotation from altair_aitools.ext.annotations, indicating that the parameter holds a longer text. On the GUI, this adds a button that opens a text editor. The text type can be specified by passing a value of the enum TextType to the annotation initializer; it only affects syntax highlighting on the GUI. Supported types: PLAIN (default), JSON, XML, HTML, SQL and PYTHON.

from typing import Annotated
from altair_aitools.ext.annotations import TextParameterAnnotation, TextType

def use_longer_texts(
        plain_text: Annotated[str, TextParameterAnnotation()], # Same as providing TextType.PLAIN
        json_text: Annotated[str, TextParameterAnnotation(TextType.JSON)],
) -> None: ...
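
The annotated parameter is still received as a plain str, so the operator itself is responsible for parsing it; a minimal sketch using the standard library json module and pandas (the operator name is illustrative):

import json
from typing import Annotated

import pandas as pd

from altair_aitools.ext.annotations import TextParameterAnnotation, TextType


def json_records_to_df(
    json_text: Annotated[str, TextParameterAnnotation(TextType.JSON)],
) -> pd.DataFrame:
    """Operator that parses a JSON array of records into a DataFrame."""
    return pd.DataFrame(json.loads(json_text))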

Column Roles

The concept of attribute/column roles is supported in Python extensions. Roles are stored in the built-in attrs dictionary of pandas.DataFrame, so users can access and set them under the key "role" in that dictionary. For convenience, the DevKit provides the following helper functions in the altair_aitools.ext.metadata package to manage column roles:

  • get_all_roles(df: pd.DataFrame, ref: bool = False) -> dict[str, ColumnRole]: Get column roles for a DataFrame.
    • ref: If True, returns a reference to the roles dictionary. If False, returns a copy of the roles dictionary.
  • get_role(df: pd.DataFrame, col: str) -> ColumnRole: Get the role of a column in a DataFrame.
  • set_role(df: pd.DataFrame, col: str, role: ColumnRole) -> None: Set a column role for a DataFrame.

The enum altair_aitools.ext.metadata.ColumnRole contains the supported values for column roles. If a column does not have a special role, ColumnRole.REGULAR is used by default.

import pandas as pd
from typing import Annotated
from altair_aitools.ext.metadata import ColumnRole, set_role
from altair_aitools.ext.annotations import SelectedColumnAnnotation

def set_role_example(
    df: pd.DataFrame,
    col: Annotated[str, SelectedColumnAnnotation("df")],
    role: ColumnRole,
) -> pd.DataFrame:
    """Operator to set the role of the specified column in the DataFrame."""
    set_role(df, col, role)
    return df
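
The getter helpers can be used in the same way; a minimal sketch based only on the signatures listed above (the operator name is illustrative):

import pandas as pd
from typing import Annotated
from altair_aitools.ext.metadata import get_all_roles, get_role
from altair_aitools.ext.annotations import SelectedColumnAnnotation


def print_column_roles(
    df: pd.DataFrame,
    col: Annotated[str, SelectedColumnAnnotation("df")],
) -> pd.DataFrame:
    """Operator that logs column roles without modifying the DataFrame."""
    print(f"Role of {col!r}: {get_role(df, col)}")
    print(f"All roles: {get_all_roles(df)}")  # copy of the {column: ColumnRole} mapping
    return df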

Accessing Resources

In order to access a resource file from the resources folder, one can use the Resource class from altair_aitools.ext.io. Its initializer takes the path of the file relative to the resources folder, and the class implements the context manager pattern, providing the resource file as a byte stream. If the resource needs to be available as an actual file, call tempfile() on the Resource instance; the resource is extracted to a temporary file whose lifecycle is managed automatically.

import pandas as pd
from altair_aitools.ext.io import Resource


def sample_data() -> pd.DataFrame:
    with Resource("data.csv") as resource: # resource: IO[bytes]
        return pd.read_csv(resource)

def sample_data_as_file() -> pd.DataFrame:
    with Resource("data.csv").tempfile() as file_path: # file_path: str
        return pd.read_csv(file_path)