Transforms
TransformFunction
from tempora.utils.transform import TransformFunction
Callable protocol for transforming context-window or target-window data.
class TransformMetadata(TypedDict):
    columns: list[str]
    source: Literal['context', 'target']
    time_column: str | None
    entity_keys: list[str] | None

class TransformFunction(Protocol):
    def __call__(
        self, data: pd.DataFrame, metadata: TransformMetadata,
    ) -> pd.DataFrame:
        ...
Parameters
| Name | Description |
|---|---|
| data | Input window data as a pandas DataFrame. |
| metadata | Metadata describing the window (columns, source, time_column, entity_keys). |
Returns
| Name | Description |
|---|---|
| pd.DataFrame | Transformed output data. |
Notes:
- Function must be serializable for client/server transport.
- source is 'context' for sampler transforms and 'target' for target-spec transforms.
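
As a rough illustration of the call signature described above, the following sketch defines a function with the documented `(data, metadata) -> pd.DataFrame` shape. It does not import the library: `TransformMetadata` is re-declared locally so the snippet runs on its own, and `demean_values`, the sample window, and its column names are hypothetical. It is written as a plain module-level function on the assumption that this is the easiest form to serialize; the actual serialization mechanism is not specified here.

```python
from typing import Literal, Optional, TypedDict

import pandas as pd


# Local mirror of the documented TransformMetadata shape, so the sketch runs
# without the library installed. Optional[X] is equivalent to `X | None`.
class TransformMetadata(TypedDict):
    columns: list[str]
    source: Literal["context", "target"]
    time_column: Optional[str]
    entity_keys: Optional[list[str]]


def demean_values(data: pd.DataFrame, metadata: TransformMetadata) -> pd.DataFrame:
    """Hypothetical transform: center numeric feature columns on the window mean."""
    out = data.copy()  # do not mutate the caller's frame
    # Leave the time column and entity keys untouched.
    protected = set(metadata["entity_keys"] or [])
    if metadata["time_column"] is not None:
        protected.add(metadata["time_column"])
    for col in metadata["columns"]:
        if col in protected or col not in out.columns:
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col] - out[col].mean()
    return out


# Hypothetical context window with one entity and one numeric feature.
window = pd.DataFrame(
    {"ts": [1, 2, 3], "store": ["a", "a", "a"], "sales": [10.0, 20.0, 30.0]}
)
meta: TransformMetadata = {
    "columns": ["ts", "store", "sales"],
    "source": "context",
    "time_column": "ts",
    "entity_keys": ["store"],
}
result = demean_values(window, meta)
```

Here only `sales` is centered; `ts` and `store` pass through unchanged because the metadata marks them as the time column and an entity key.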
TransformSpec
from tempora.utils.transform import TransformSpec
Serializable transform specification used by samplers and target specifications.
TransformSpec(
    transform: TransformFunction,
    output_schema: pa.Schema | None = None,
    on_schema_mismatch: Literal['raise', 'coerce', 'skip'] = 'raise'
)
Parameters
| Name | Description |
|---|---|
| transform | TransformFunction callable. |
| output_schema | Optional PyArrow schema used to validate or infer the schema of the transformed output. |
| on_schema_mismatch | Policy applied when the transformed schema differs from the expected one: 'raise' (raise an exception; the default), 'coerce' (attempt a column-wise cast and set uncastable values to null; if any required column is missing, fall back to 'skip' for that segment), or 'skip' (skip the example and log a warning). |
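
The three `on_schema_mismatch` policies can be pictured with a small stand-alone sketch. This is not the library's implementation: `apply_policy` is a hypothetical helper, the expected schema is simplified to a mapping from column name to a pandas dtype string instead of a `pa.Schema`, and the coercion path only handles float targets.

```python
import logging
from typing import Literal, Optional

import pandas as pd

Policy = Literal["raise", "coerce", "skip"]


def apply_policy(
    df: pd.DataFrame, expected: dict[str, str], policy: Policy = "raise"
) -> Optional[pd.DataFrame]:
    """Hypothetical stand-in for schema checking; returns None when skipped.

    `expected` maps column name -> pandas dtype string, a simplification of
    the PyArrow schema described in the table above.
    """
    actual = {col: str(dtype) for col, dtype in df.dtypes.items()}
    if actual == expected:
        return df
    if policy == "raise":
        raise ValueError(f"schema mismatch: expected {expected}, got {actual}")
    if policy == "coerce":
        missing = [col for col in expected if col not in df.columns]
        if not missing:
            out = pd.DataFrame(index=df.index)
            for col, dtype in expected.items():
                # Uncastable values become NaN (null) instead of raising.
                # Assumes float targets; integer targets would need a
                # nullable dtype to hold the nulls.
                out[col] = pd.to_numeric(df[col], errors="coerce").astype(dtype)
            return out
        # A required column is missing: fall back to the 'skip' behavior.
    logging.warning("skipping example: schema %s does not match %s", actual, expected)
    return None


raw = pd.DataFrame({"x": ["1", "oops", "3"]})
expected = {"x": "float64"}

coerced = apply_policy(raw, expected, "coerce")  # "oops" becomes null
skipped = apply_policy(raw, expected, "skip")    # example dropped, warning logged
fallback = apply_policy(pd.DataFrame({"y": [1.0]}), expected, "coerce")  # "x" missing
```

With `'coerce'`, the uncastable string turns into a null while the other values are cast; when the required column is absent, the same policy behaves like `'skip'` and returns nothing for that example.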