Add support for Feature Transformation on Writes for On Demand Feature Views #4376

Closed
@franciscojavierarceo

The problem

As discussed in #4365, we should add the ability for an On Demand Feature View (ODFV) to write the output of its transformation to the online store.

The solution

The ideal solution would add a boolean to the ODFV decorator, as metadata, to control the write behavior, and another boolean to the get_online_features method to let users force features to be recomputed.

The writes would be done by calling push() or write_to_online_store() with the underlying raw data (inputs); the transformed feature values (outputs) would then be stored in the online store. The ODFV transformation would run before the write is executed.
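The write path described above can be sketched without Feast at all. This is only an illustration under stated assumptions: the `OnlineStore` dict, the `write_with_transform` helper, and the `add_values` transformation are hypothetical names, not Feast APIs, and the real implementation would go through the registry and the configured online store.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical in-memory stand-in for an online store,
# keyed by (feature view name, entity key value).
OnlineStore = Dict[Tuple[str, Any], Dict[str, Any]]
Columns = Dict[str, List[Any]]


def write_with_transform(
    store: OnlineStore,
    view_name: str,
    entity_key: str,
    transform: Callable[[Columns], Columns],
    rows: List[Dict[str, Any]],
) -> None:
    """Run the transformation on the raw input rows, then persist only its outputs."""
    if not rows:
        return
    # Pivot row-oriented inputs into the column-oriented dict
    # that a python-mode transformation receives.
    columns: Columns = {name: [row[name] for row in rows] for name in rows[0]}
    outputs = transform(columns)
    for i, row in enumerate(rows):
        store[(view_name, row[entity_key])] = {
            name: values[i] for name, values in outputs.items()
        }


def add_values(inputs: Columns) -> Columns:
    """Hypothetical transformation: add val_to_add to conv_rate, element-wise."""
    return {
        "conv_rate_plus_val1_python": [
            conv_rate + val_to_add
            for conv_rate, val_to_add in zip(inputs["conv_rate"], inputs["val_to_add"])
        ]
    }


online_store: OnlineStore = {}
write_with_transform(
    online_store,
    "transformed_conv_rate_python",
    "driver_id",
    add_values,
    [{"driver_id": 1001, "conv_rate": 0.5, "val_to_add": 1}],  # made-up sample row
)
```

After the call, the store holds the transformed value for driver 1001 and none of the raw inputs, which is the behavior the proposal asks for.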

The change for the ODFV definition would be:

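from typing import Any, Dict

from feast import Field
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float64

# driver_hourly_stats_view and input_request are assumed to be defined elsewhere in the repo.
@on_demand_feature_view(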
    sources=[
        driver_hourly_stats_view,
        input_request
    ],
    schema=[
        Field(name='conv_rate_plus_val1_python', dtype=Float64),
        Field(name='conv_rate_plus_val2_python', dtype=Float64),
    ],
    mode="python",
    write_to_online_store=True,                #  THIS IS THE FIRST NEW BOOLEAN
)
def transformed_conv_rate_python(inputs: Dict[str, Any]) -> Dict[str, Any]:
    output: Dict[str, Any] = {
        "conv_rate_plus_val1_python": [
            conv_rate + val_to_add
            for conv_rate, val_to_add in zip(
                inputs["conv_rate"], inputs["val_to_add"]
            )
        ],
        "conv_rate_plus_val2_python": [
            conv_rate + val_to_add
            for conv_rate, val_to_add in zip(
                inputs["conv_rate"], inputs["val_to_add_2"]
            )
        ]
    }
    return output
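To make the shape of the inputs and outputs concrete, the same function body can be exercised without the decorator or a feature store. This is a standalone sketch: the sample values are made up, and in Feast the `conv_rate` column would come from `driver_hourly_stats_view` rather than being passed by hand.

```python
from typing import Any, Dict


def transformed_conv_rate_python(inputs: Dict[str, Any]) -> Dict[str, Any]:
    # Same logic as the ODFV above, without the Feast decorator.
    return {
        "conv_rate_plus_val1_python": [
            conv_rate + val_to_add
            for conv_rate, val_to_add in zip(inputs["conv_rate"], inputs["val_to_add"])
        ],
        "conv_rate_plus_val2_python": [
            conv_rate + val_to_add
            for conv_rate, val_to_add in zip(inputs["conv_rate"], inputs["val_to_add_2"])
        ],
    }


# Column-oriented inputs: one list per feature, one element per entity row.
sample_inputs = {"conv_rate": [0.5], "val_to_add": [1], "val_to_add_2": [2]}
result = transformed_conv_rate_python(sample_inputs)
print(result)
# {'conv_rate_plus_val1_python': [1.5], 'conv_rate_plus_val2_python': [2.5]}
```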

And the change for the get_online_features call would be:

entity_rows = [
    {
        "driver_id": 1001,
        "val_to_add": 1,
        "val_to_add_2": 2,
    }
]

online_response = store.get_online_features(
    entity_rows=entity_rows,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "transformed_conv_rate_python:conv_rate_plus_val1_python",
        "transformed_conv_rate_python:conv_rate_plus_val2_python",
    ],
    force_compute=False,                         #  THIS IS THE SECOND NEW BOOLEAN
).to_dict()

To recap, the write_to_online_store: bool parameter would dictate whether this ODFV writes its outputs to the online store, and the force_compute: bool parameter would dictate whether the ODFV always recalculates the features at read time. There's an argument to be made that we could skip write_to_online_store in the feature view declaration, but this metadata would be useful for users to have in the registry.
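One way to read how the two booleans interact at retrieval time is the decision function below. It is a sketch, not the Feast implementation: `read_feature_row` is a hypothetical helper, and falling back to recomputation when nothing has been stored yet is an assumption the proposal does not spell out.

```python
from typing import Any, Callable, Dict, Optional


def read_feature_row(
    stored: Optional[Dict[str, Any]],
    compute: Callable[[], Dict[str, Any]],
    write_to_online_store: bool,
    force_compute: bool = False,
) -> Dict[str, Any]:
    """Return precomputed values when allowed and present, otherwise recompute."""
    if write_to_online_store and not force_compute and stored is not None:
        return stored
    # Either the view is not stored on write, the caller forced a recompute,
    # or no stored value exists yet (assumed fallback).
    return compute()


stored_row = {"conv_rate_plus_val1_python": 1.5}  # value written earlier


def recompute() -> Dict[str, Any]:
    # Stand-in for running the ODFV transformation at read time.
    return {"conv_rate_plus_val1_python": 2.5}


cached = read_feature_row(stored_row, recompute, write_to_online_store=True)
fresh = read_feature_row(
    stored_row, recompute, write_to_online_store=True, force_compute=True
)
```

Here `cached` is the stored row and `fresh` is the recomputed one, so `force_compute=True` acts as an explicit override of the stored values.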

The write call would be the standard:

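from feast.data_source import PushMode

store.push("transformed_conv_rate_python", entity_rows, to=PushMode.ONLINE)
# or, alternatively (write_to_online_store always targets the online store and takes no `to` argument)
store.write_to_online_store("transformed_conv_rate_python", entity_rows)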

Alternatives

We discussed creating a different feature view type for this behavior altogether, but extending the existing ODFV lets us reuse a lot of existing code and documentation. Moreover, the industry has already adopted this terminology, so building on it feels more natural than introducing entirely new terminology.

Additional context

Once this is implemented, it would be ideal to add it as an example in the local Credit Scoring tutorial.
