Operators and Parameters
Understanding Operators
An operator is a fundamental building block in the Altair AI Tools platform. Each operator:
- Processes data through defined input and output ports
- Modifies its behavior using parameters
- Performs specific functions on your data
The altair-aitools-devkit makes it simple to transform any Python function into an Altair AI Tools operator.
From Python Function to Operator
Converting a Python function to an operator is straightforward. Consider this example:
    import pandas as pd


    def hello_world(data: pd.DataFrame, x: int) -> pd.DataFrame:
        """
        Add x to each numeric value in the dataframe.

        Args:
            data: Input dataframe to process
            x: Value to add to each cell

        Returns:
            Modified dataframe with x added to each numeric value
        """
        # Create a copy to avoid modifying the original
        result = data.copy()

        # Add x to numeric columns
        for column in result.select_dtypes(include=['number']).columns:
            result[column] += x

        return result
When transformed into an operator:
- Input port: data (pandas DataFrame) becomes an input connection point
- Parameter: x (integer) becomes a configurable parameter in the UI
- Output port: The return value (pandas DataFrame) becomes an output connection
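Before packaging, the function can be exercised as ordinary Python. The sketch below repeats hello_world so that it runs on its own; the sample frame and its column names are made up for illustration. Both arguments are passed by keyword, which is also how operator arguments are supplied at runtime.

```python
import pandas as pd


def hello_world(data: pd.DataFrame, x: int) -> pd.DataFrame:
    """Add x to each numeric value in the dataframe."""
    result = data.copy()
    for column in result.select_dtypes(include=["number"]).columns:
        result[column] += x
    return result


# Illustrative input: one numeric column and one text column.
frame = pd.DataFrame({"age": [30, 41], "name": ["Ann", "Bob"]})

# Pass every argument by keyword.
shifted = hello_world(data=frame, x=5)
print(shifted)
```

The text column passes through unchanged, and the input frame is left untouched because the function works on a copy.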
Making Your Function an Operator
To register your function as an operator:
1. Declare it in your extension's __init__.py file:

       from .sample import hello_world

2. (Optional) Define it in your extension.toml configuration file:

       [operator.hello_world]
       name = "Hello World"
       icon = "message.png"
       group = "Examples/Basic"
After building and installing your extension, the operator will appear in Altair AI Studio's operator tree.
Working with Parameters
Parameters control how your operators process data. They appear in the parameters panel in Altair AI Studio's design view and can be adjusted by users.
Supported Parameter Types
The altair-aitools-devkit automatically converts Python types to appropriate UI controls:
Python type | Parameter type | UI representation |
---|---|---|
int | integer | Number field with integer validation |
float | real | Number field supporting decimals |
str | string | Text input field |
bool | boolean | Checkbox |
Enum | category | Dropdown selection |
Note: While other types can be used as IO ports, they are considered "internal Python objects" and can only be passed between Python operators inside the same environment.
Parameter Best Practices
- Use type hints for clear parameter definitions
- Provide default values where appropriate
- Consider adding validation via docstrings
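A minimal sketch of these practices, using only the standard library. The function name filter_lines, the Case enum, and every argument name are hypothetical. The mapping noted in the comments follows the table above; how the devkit renders defaults or Enum members in the UI is an assumption, not documented behavior.

```python
import io
from enum import Enum


class Case(Enum):
    """Choices for the (assumed) dropdown parameter."""
    UNCHANGED = "unchanged"
    UPPER = "upper"
    LOWER = "lower"


def filter_lines(
    source: io.BytesIO,            # not a parameter type: would become an input port
    keep_fraction: float = 1.0,    # float -> real
    min_length: int = 0,           # int   -> integer
    prefix: str = "",              # str   -> string
    skip_blank: bool = True,       # bool  -> boolean
    case: Case = Case.UNCHANGED,   # Enum  -> category
) -> io.BytesIO:
    """
    Filter and decorate the lines of a UTF-8 text stream.

    Args:
        source: Text to process
        keep_fraction: Share of the remaining lines to keep (0.0-1.0)
        min_length: Drop lines shorter than this many characters
        prefix: Text placed in front of every kept line
        skip_blank: Drop lines that contain only whitespace
        case: Letter case applied to every kept line
    """
    lines = source.getvalue().decode("utf-8").splitlines()
    if skip_blank:
        lines = [line for line in lines if line.strip()]
    lines = [line for line in lines if len(line) >= min_length]
    lines = lines[: int(len(lines) * keep_fraction)]
    if case is Case.UPPER:
        lines = [line.upper() for line in lines]
    elif case is Case.LOWER:
        lines = [line.lower() for line in lines]
    text = "\n".join(prefix + line for line in lines)
    return io.BytesIO(text.encode("utf-8"))


result = filter_lines(source=io.BytesIO(b"a\n\nbb\n"), prefix="- ", case=Case.UPPER)
```

Every argument except source has a default, so the function can be called with the input alone and adjusted from there.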
Input and Output Ports
Input Ports
Any function parameter that isn't converted to an operator parameter (see table above) becomes an input port. This lets your operator accept data from upstream operators.
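The rule can be illustrated with plain introspection. This is only a sketch of the rule as stated here, not the devkit's implementation: classify_arguments, Table, Mode, and demo are hypothetical names, and Table stands in for a port type such as pandas.DataFrame.

```python
import inspect
from enum import Enum
from typing import get_type_hints

# Python types that the parameter table maps to operator parameters.
PARAMETER_TYPES = {int: "integer", float: "real", str: "string", bool: "boolean"}


def classify_arguments(func):
    """Split a function's arguments into operator parameters and input ports."""
    hints = get_type_hints(func)
    parameters, input_ports = {}, []
    for name in inspect.signature(func).parameters:
        annotation = hints.get(name)
        if annotation in PARAMETER_TYPES:
            parameters[name] = PARAMETER_TYPES[annotation]
        elif isinstance(annotation, type) and issubclass(annotation, Enum):
            parameters[name] = "category"
        else:
            # Anything that is not a parameter type is treated as an input port.
            input_ports.append(name)
    return parameters, input_ports


class Mode(Enum):
    FAST = "fast"
    EXACT = "exact"


class Table:
    """Stand-in for a data type such as pandas.DataFrame."""


def demo(data: Table, x: int, ratio: float = 0.5, mode: Mode = Mode.FAST) -> Table:
    return data


parameters, input_ports = classify_arguments(demo)
print(parameters)   # names mapped to parameter types
print(input_ports)  # names that would become input ports
```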
Output Ports
Function return values become output ports, allowing data to flow to downstream operators. Only class instances are supported as outputs.
Special Data Types
The following types receive special handling:
Python type | AI Tools representation |
---|---|
pandas.core.frame.DataFrame | Datatable |
numpy.ndarray | Numpy array |
pathlib.Path | File reference |
io.BytesIO/io.TextIOWrapper | Binary/text streams |
altair_aitools.ext.io.BaseDataObject | Custom Data Object |
To represent collections, wrap inputs/outputs in a Python list.
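For instance, an operator that accepts a collection of binary streams might be annotated as follows. combine_streams is a hypothetical function, and the list[io.BytesIO] spelling (Python 3.9+) is an assumption about how the annotation should be written; the devkit may expect a different form.

```python
import io


def combine_streams(parts: list[io.BytesIO]) -> io.BytesIO:
    """
    Concatenate several binary streams into one.

    Args:
        parts: Streams to join, in order
    """
    combined = io.BytesIO()
    for part in parts:
        combined.write(part.getvalue())
    combined.seek(0)  # rewind so downstream readers start at the beginning
    return combined


merged = combine_streams(parts=[io.BytesIO(b"ab"), io.BytesIO(b"cd")])
```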
Function Argument Requirements
Important considerations:
- Function arguments are passed via keywords at runtime
- Positional-only parameters and *args are not supported
- **kwargs can be used, but only for system-defined values (e.g., random seeds)
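These constraints can be checked on a signature before building an extension. The helper below is a hypothetical sketch using the standard inspect module, not part of the devkit; the two sample functions are also made up for illustration.

```python
import inspect


def keyword_incompatible(func):
    """Return the names of arguments that cannot be supplied by keyword."""
    blocked = (inspect.Parameter.POSITIONAL_ONLY, inspect.Parameter.VAR_POSITIONAL)
    return [
        parameter.name
        for parameter in inspect.signature(func).parameters.values()
        if parameter.kind in blocked
    ]


def supported(data, x: int = 1, **kwargs):
    """Every argument here can be passed by keyword."""
    return [value + x for value in data]


def unsupported(data, /, x, *extras):
    """data is positional-only and extras collects positional arguments."""
    return data


print(keyword_incompatible(supported))
print(keyword_incompatible(unsupported))

# Keyword passing, as described in the first bullet above.
arguments = {"data": [1, 2], "x": 2}
result = supported(**arguments)
```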
Documenting Your Operators
Well-documented operators are easier to use. The system parses your Python docstrings to generate operator documentation.
Docstring Components
- Synopsis: The first paragraph of your docstring
- Description: Detailed explanation following the synopsis
- Arguments: Parameter descriptions (parsed for operator parameters)
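To make the split concrete, here is a rough sketch of how a synopsis and description could be separated from a docstring. It is not the devkit's parser: split_docstring is a hypothetical helper, and it assumes that paragraphs are separated by blank lines and that argument sections begin with Args: or Returns:.

```python
import inspect


def split_docstring(func):
    """Return (synopsis, description) taken from a function's docstring."""
    doc = inspect.getdoc(func) or ""
    paragraphs = [part.strip() for part in doc.split("\n\n") if part.strip()]
    synopsis = paragraphs[0] if paragraphs else ""
    # Everything after the first paragraph, minus the argument sections.
    description = "\n\n".join(
        part for part in paragraphs[1:] if not part.startswith(("Args:", "Returns:"))
    )
    return synopsis, description


def example_operator(data, threshold: float = 0.5):
    """
    Remove outliers from a dataset.

    This operator filters rows based on the specified threshold value.

    Args:
        data: Input dataframe to process
        threshold: Values above this level are considered outliers (0.0-1.0)

    Returns:
        Filtered dataframe with outliers removed
    """


synopsis, description = split_docstring(example_operator)
```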
Supported Documentation Formats
The parser supports both Google and Sphinx docstring formats:
    def example_operator(data: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
        """
        Remove outliers from a dataset.

        This operator filters rows based on the specified threshold value.

        Args:
            data: Input dataframe to process
            threshold: Values above this level are considered outliers (0.0-1.0)

        Returns:
            Filtered dataframe with outliers removed
        """
        # Implementation
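The example above uses the Google style. For comparison, the same documentation written with Sphinx (reStructuredText) field lists might look like this. The function name example_operator_sphinx is hypothetical, and the body is left out, as in the example above.

```python
import pandas as pd


def example_operator_sphinx(data: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    """
    Remove outliers from a dataset.

    This operator filters rows based on the specified threshold value.

    :param data: Input dataframe to process
    :param threshold: Values above this level are considered outliers (0.0-1.0)
    :returns: Filtered dataframe with outliers removed
    """
    # Implementation
```

In both styles the first paragraph serves as the synopsis and the second as the description; only the argument section changes.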
By following these guidelines, you can create well-structured operators that are intuitive for users to work with in Altair AI Tools.
Next, let's look at how to configure extensions.