
Transforms

TransformFunction

from tempora.utils.transform import TransformFunction

Callable protocol for transforming context-window or target-window data.

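from typing import Literal, Protocol, TypedDict

import pandas as pd

class TransformMetadata(TypedDict):
    columns: list[str]                      # column names present in the window
    source: Literal['context', 'target']    # which window the data comes from
    time_column: str | None
    entity_keys: list[str] | None

class TransformFunction(Protocol):
    def __call__(
        self, data: pd.DataFrame, metadata: TransformMetadata,
    ) -> pd.DataFrame:
        ...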

Parameters

Name      Description
data      Input window data as a pandas DataFrame.
metadata  A TransformMetadata dict describing the window (columns, source, time_column, entity_keys).

Returns

Type          Description
pd.DataFrame  Transformed output data.

Notes:

  • The function must be serializable so that it can be transported between client and server.
  • metadata['source'] is 'context' for sampler transforms and 'target' for target-spec transforms.
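
As an illustration of the protocol, here is a minimal sketch of a function with the documented call signature. The name zscore_numeric and the sample data are hypothetical, and TransformMetadata is redefined locally (mirroring the definition above) only so the snippet runs without tempora installed. It standardizes numeric columns and leaves the time column and entity keys untouched.

```python
from typing import List, Literal, Optional, TypedDict

import pandas as pd


class TransformMetadata(TypedDict):
    # Local mirror of the documented TypedDict (assumption: same four keys).
    columns: List[str]
    source: Literal["context", "target"]
    time_column: Optional[str]
    entity_keys: Optional[List[str]]


def zscore_numeric(data: pd.DataFrame, metadata: TransformMetadata) -> pd.DataFrame:
    """Standardize numeric columns; skip the time column and entity keys."""
    protected = set(metadata["entity_keys"] or [])
    if metadata["time_column"] is not None:
        protected.add(metadata["time_column"])

    out = data.copy()  # never mutate the caller's window
    for col in metadata["columns"]:
        if col in protected or col not in out.columns:
            continue
        if not pd.api.types.is_numeric_dtype(out[col]):
            continue
        std = out[col].std(ddof=0)
        if std > 0:
            out[col] = (out[col] - out[col].mean()) / std
        else:
            # Zero or undefined spread: emit 0.0 instead of dividing by zero.
            out[col] = 0.0
    return out


# Hypothetical context window for a single entity.
window = pd.DataFrame(
    {
        "ts": pd.date_range("2024-01-01", periods=4, freq="D"),
        "store_id": [1, 1, 1, 1],
        "sales": [10.0, 20.0, 30.0, 40.0],
    }
)
meta: TransformMetadata = {
    "columns": ["ts", "store_id", "sales"],
    "source": "context",
    "time_column": "ts",
    "entity_keys": ["store_id"],
}
result = zscore_numeric(window, meta)
```

Defining the transform as a plain module-level function, rather than a lambda or a closure over local state, generally makes it easier to serialize.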

TransformSpec

from tempora.utils.transform import TransformSpec

Serializable transform specification used by samplers and target specifications.

TransformSpec(
    transform: TransformFunction,
    output_schema: pa.Schema | None = None,
    on_schema_mismatch: Literal['raise', 'coerce', 'skip'] = 'raise'
)

Parameters

Name                Description
transform           A callable implementing the TransformFunction protocol.
output_schema       Optional PyArrow schema used for validation and schema inference of the transformed output.
on_schema_mismatch  Policy applied when the transformed output's schema differs from the expected one. 'raise' raises an exception. 'coerce' attempts a column-wise cast and sets uncastable values to null; if any required column is missing, it falls back to 'skip' for that example. 'skip' skips the example and logs a warning. Defaults to 'raise'.
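
To make the three on_schema_mismatch policies concrete, the sketch below mimics the described behavior in plain pandas. It is not tempora's implementation: apply_policy is a hypothetical helper, a dict mapping column names to pandas dtype strings stands in for the PyArrow schema, every expected column is assumed to be required, and only float64 targets are handled.

```python
import logging
from typing import Dict, Literal, Optional

import pandas as pd

logger = logging.getLogger(__name__)

Policy = Literal["raise", "coerce", "skip"]


def apply_policy(
    df: pd.DataFrame, expected: Dict[str, str], policy: Policy = "raise"
) -> Optional[pd.DataFrame]:
    """Return a frame matching `expected`, or None when the example is skipped."""
    missing = [c for c in expected if c not in df.columns]
    mismatched = [
        c for c in expected if c in df.columns and str(df[c].dtype) != expected[c]
    ]
    if not missing and not mismatched:
        return df

    if policy == "raise":
        raise ValueError(
            f"schema mismatch: missing={missing}, mismatched={mismatched}"
        )

    if policy == "coerce" and not missing:
        out = df.copy()
        for col in mismatched:
            # Column-wise cast; values that cannot be parsed become NaN (null).
            out[col] = pd.to_numeric(out[col], errors="coerce").astype(expected[col])
        return out

    # Reached for 'skip', and for 'coerce' when a required column is missing.
    logger.warning("skipping example: missing=%s, mismatched=%s", missing, mismatched)
    return None


# Hypothetical transformed output with one unparseable value.
raw = pd.DataFrame({"sales": ["1.5", "oops", "3"]})

# 'coerce': the bad value becomes null, the rest are cast to float64.
coerced = apply_policy(raw, {"sales": "float64"}, policy="coerce")

# 'coerce' with a missing required column falls back to 'skip' (returns None).
skipped = apply_policy(raw, {"sales": "float64", "price": "float64"}, policy="coerce")
```

The default 'raise' surfaces schema problems immediately, whereas 'coerce' and 'skip' trade strictness for the ability to keep going past bad examples.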