Skip to content

feat: Offline store update pull_all_from_table_or_query to make timestampfield optional #5281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Apr 20, 2025

Conversation

HaoXuAI
Copy link
Collaborator

@HaoXuAI HaoXuAI commented Apr 18, 2025

What this PR does / why we need it:

The major change in this PR is making the start_date and end_date in pull_all_from_table_or_query optional:

    def pull_all_from_table_or_query(
        config: RepoConfig,
        data_source: DataSource,
        join_key_columns: List[str],
        feature_name_columns: List[str],
        timestamp_field: str,
        created_timestamp_field: Optional[str] = None,
        start_date: Optional[datetime] = None,
        end_date: Optional[datetime] = None,
    ) -> RetrievalJob:

The reason is the API should be able to let the user skip the time range selection, and return all entities with features.
This will also make the current materialization and get_historical_features in Compute Engine to be able to use this API.
The change is non-breaking, and the API currently is only used in retrieve_feature_service_logs, and retrieve_saved_dataset.

Which issue(s) this PR fixes:

Misc

@HaoXuAI HaoXuAI requested review from sudohainguyen and a team as code owners April 18, 2025 06:00
@HaoXuAI HaoXuAI requested review from shuchu, ejscribner and tokoko April 18, 2025 06:00
HaoXuAI added 5 commits April 17, 2025 23:10
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
@@ -95,25 +94,3 @@ def build_output_nodes(self, input_node):
node = LocalOutputNode("output")
node.add_input(input_node)
self.nodes.append(node)

def build(self) -> ExecutionPlan:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove it as it is the same with the Parent build

@@ -15,5 +15,5 @@ class HistoricalRetrievalTask:
feature_view: Union[BatchFeatureView, StreamFeatureView]
full_feature_name: bool
registry: Registry
start_time: datetime
end_time: datetime
start_time: Optional[datetime] = None
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retrieval job doesn't have to provide start or end date

if isinstance(self.task, HistoricalRetrievalTask):
last_node = self.build_join_node(last_node)
# Join entity_df with source if needed
last_node = self.build_join_node(last_node)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without entity_df provided, the join node is just a pass through node

) = context.column_info

# 📥 Reuse Feast's robust query resolver
retrieval_job = offline_store.pull_all_from_table_or_query(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reuse the pull_all_from_table_or_query to read data from all feast offline_store now

@@ -206,10 +207,13 @@ def pull_all_from_table_or_query(
+ BigQueryOfflineStore._escape_query_columns(feature_name_columns)
+ [timestamp_field]
)
timestamp_filter = get_timestamp_filter_sql(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

major change: get the timestamp filter from get_timestamp_filter_sql

HaoXuAI added 11 commits April 17, 2025 23:51
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
HaoXuAI added 2 commits April 18, 2025 13:48
Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: HaoXuAI <[email protected]>
@franciscojavierarceo franciscojavierarceo merged commit 4b94608 into master Apr 20, 2025
23 checks passed
franciscojavierarceo pushed a commit that referenced this pull request Apr 29, 2025
# [0.49.0](v0.48.0...v0.49.0) (2025-04-29)

### Bug Fixes

* Adding brackets to unit tests ([c46fea3](c46fea3))
* Adding logic back for a step ([2bb240b](2bb240b))
* Adjustment for unit test action ([a6f78ae](a6f78ae))
* Allow get_historical_features with only On Demand Feature View ([#5256](#5256)) ([0752795](0752795))
* CI adjustment ([3850643](3850643))
* Embed Query configuration breaks when switching between DataFrame and SQL ([#5257](#5257)) ([32375a5](32375a5))
* Fix for proto issue in utils ([1b291b2](1b291b2))
* Fix milvus online_read ([#5233](#5233)) ([4b91f26](4b91f26))
* Fix tests ([431d9b8](431d9b8))
* Fixed Permissions object parameter in example ([#5259](#5259)) ([045c100](045c100))
* Java CI [#12](#12) ([d7e44ac](d7e44ac))
* Java PR [#15](#15) ([a5da3bb](a5da3bb))
* Java PR [#16](#16) ([e0320fe](e0320fe))
* Java PR [#17](#17) ([49da810](49da810))
* Materialization logs ([#5243](#5243)) ([4aa2f49](4aa2f49))
* Moving to custom github action for checking skip tests ([caf312e](caf312e))
* Operator - remove default replicas setting from Feast Deployment ([#5294](#5294)) ([e416d01](e416d01))
* Patch java pr [#14](#14) ([592526c](592526c))
* Patch update for test ([a3e8967](a3e8967))
* Remove conditional from steps ([995307f](995307f))
* Remove misleading HTTP prefix from gRPC endpoints in logs and doc ([#5280](#5280)) ([0ee3a1e](0ee3a1e))
* removing id ([268ade2](268ade2))
* Renaming workflow file ([5f46279](5f46279))
* Resolve `no pq wrapper` import issue ([#5240](#5240)) ([d5906f1](d5906f1))
* Update actions to remove check skip tests ([#5275](#5275)) ([b976f27](b976f27))
* Update docling demo ([446efea](446efea))
* Update java pr [#13](#13) ([fda7db7](fda7db7))
* Update java_pr ([fa138f4](fa138f4))
* Update repo_config.py ([6a59815](6a59815))
* Update unit tests workflow ([06486a0](06486a0))
* Updated docs for docling demo ([768e6cc](768e6cc))
* Updating action for unit tests ([0996c28](0996c28))
* Updating github actions to filter at job level ([0a09622](0a09622))
* Updating Java CI ([c7c3a3c](c7c3a3c))
* Updating java pr to skip tests ([e997dd9](e997dd9))
* Updating workflows ([c66bcd2](c66bcd2))

### Features

* Add date_partition_column_format for spark source ([#5273](#5273)) ([7a61d6f](7a61d6f))
* Add Milvus tutorial with Feast integration ([#5292](#5292)) ([a1388a5](a1388a5))
* Add pgvector tutorial with PostgreSQL integration ([#5290](#5290)) ([bb1cbea](bb1cbea))
* Add ReactFlow visualization for Feast registry metadata ([#5297](#5297)) ([9768970](9768970))
* Add retrieve online documents v2 method into  pgvector  ([#5253](#5253)) ([6770ee6](6770ee6))
* Compute Engine Initial Implementation ([#5223](#5223)) ([64bdafd](64bdafd))
* Enable write node for compute engine ([#5287](#5287)) ([f9baf97](f9baf97))
* Local compute engine ([#5278](#5278)) ([8e06dfe](8e06dfe))
* Make transform on writes configurable for ingestion ([#5283](#5283)) ([ecad170](ecad170))
* Offline store update pull_all_from_table_or_query to make timestampfield optional ([#5281](#5281)) ([4b94608](4b94608))
* Serialization version 2 deprecation notice ([#5248](#5248)) ([327d99d](327d99d))
* Vector length definition moved to Feature View from Config  ([#5289](#5289)) ([d8f1c97](d8f1c97))
j-wine pushed a commit to j-wine/feast that referenced this pull request Jun 7, 2025
…tampfield optional (feast-dev#5281)

* Update offline store pull all API date field optional

Signed-off-by: HaoXuAI <[email protected]>

* Update offline store pull all API date field optional

Signed-off-by: HaoXuAI <[email protected]>

* update

Signed-off-by: HaoXuAI <[email protected]>

* update

Signed-off-by: HaoXuAI <[email protected]>

* update

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

* update source read node

Signed-off-by: HaoXuAI <[email protected]>

* fix linting

Signed-off-by: HaoXuAI <[email protected]>

* fix linting

Signed-off-by: HaoXuAI <[email protected]>

* fix linting

Signed-off-by: HaoXuAI <[email protected]>

* fix linting

Signed-off-by: HaoXuAI <[email protected]>

* fix linting

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

* fix test

Signed-off-by: HaoXuAI <[email protected]>

---------

Signed-off-by: HaoXuAI <[email protected]>
Signed-off-by: Jacob Weinhold <[email protected]>
j-wine pushed a commit to j-wine/feast that referenced this pull request Jun 7, 2025
# [0.49.0](feast-dev/feast@v0.48.0...v0.49.0) (2025-04-29)

### Bug Fixes

* Adding brackets to unit tests ([c46fea3](feast-dev@c46fea3))
* Adding logic back for a step ([2bb240b](feast-dev@2bb240b))
* Adjustment for unit test action ([a6f78ae](feast-dev@a6f78ae))
* Allow get_historical_features with only On Demand Feature View ([feast-dev#5256](feast-dev#5256)) ([0752795](feast-dev@0752795))
* CI adjustment ([3850643](feast-dev@3850643))
* Embed Query configuration breaks when switching between DataFrame and SQL ([feast-dev#5257](feast-dev#5257)) ([32375a5](feast-dev@32375a5))
* Fix for proto issue in utils ([1b291b2](feast-dev@1b291b2))
* Fix milvus online_read ([feast-dev#5233](feast-dev#5233)) ([4b91f26](feast-dev@4b91f26))
* Fix tests ([431d9b8](feast-dev@431d9b8))
* Fixed Permissions object parameter in example ([feast-dev#5259](feast-dev#5259)) ([045c100](feast-dev@045c100))
* Java CI [feast-dev#12](feast-dev#12) ([d7e44ac](feast-dev@d7e44ac))
* Java PR [feast-dev#15](feast-dev#15) ([a5da3bb](feast-dev@a5da3bb))
* Java PR [feast-dev#16](feast-dev#16) ([e0320fe](feast-dev@e0320fe))
* Java PR [feast-dev#17](feast-dev#17) ([49da810](feast-dev@49da810))
* Materialization logs ([feast-dev#5243](feast-dev#5243)) ([4aa2f49](feast-dev@4aa2f49))
* Moving to custom github action for checking skip tests ([caf312e](feast-dev@caf312e))
* Operator - remove default replicas setting from Feast Deployment ([feast-dev#5294](feast-dev#5294)) ([e416d01](feast-dev@e416d01))
* Patch java pr [feast-dev#14](feast-dev#14) ([592526c](feast-dev@592526c))
* Patch update for test ([a3e8967](feast-dev@a3e8967))
* Remove conditional from steps ([995307f](feast-dev@995307f))
* Remove misleading HTTP prefix from gRPC endpoints in logs and doc ([feast-dev#5280](feast-dev#5280)) ([0ee3a1e](feast-dev@0ee3a1e))
* removing id ([268ade2](feast-dev@268ade2))
* Renaming workflow file ([5f46279](feast-dev@5f46279))
* Resolve `no pq wrapper` import issue ([feast-dev#5240](feast-dev#5240)) ([d5906f1](feast-dev@d5906f1))
* Update actions to remove check skip tests ([feast-dev#5275](feast-dev#5275)) ([b976f27](feast-dev@b976f27))
* Update docling demo ([446efea](feast-dev@446efea))
* Update java pr [feast-dev#13](feast-dev#13) ([fda7db7](feast-dev@fda7db7))
* Update java_pr ([fa138f4](feast-dev@fa138f4))
* Update repo_config.py ([6a59815](feast-dev@6a59815))
* Update unit tests workflow ([06486a0](feast-dev@06486a0))
* Updated docs for docling demo ([768e6cc](feast-dev@768e6cc))
* Updating action for unit tests ([0996c28](feast-dev@0996c28))
* Updating github actions to filter at job level ([0a09622](feast-dev@0a09622))
* Updating Java CI ([c7c3a3c](feast-dev@c7c3a3c))
* Updating java pr to skip tests ([e997dd9](feast-dev@e997dd9))
* Updating workflows ([c66bcd2](feast-dev@c66bcd2))

### Features

* Add date_partition_column_format for spark source ([feast-dev#5273](feast-dev#5273)) ([7a61d6f](feast-dev@7a61d6f))
* Add Milvus tutorial with Feast integration ([feast-dev#5292](feast-dev#5292)) ([a1388a5](feast-dev@a1388a5))
* Add pgvector tutorial with PostgreSQL integration ([feast-dev#5290](feast-dev#5290)) ([bb1cbea](feast-dev@bb1cbea))
* Add ReactFlow visualization for Feast registry metadata ([feast-dev#5297](feast-dev#5297)) ([9768970](feast-dev@9768970))
* Add retrieve online documents v2 method into  pgvector  ([feast-dev#5253](feast-dev#5253)) ([6770ee6](feast-dev@6770ee6))
* Compute Engine Initial Implementation ([feast-dev#5223](feast-dev#5223)) ([64bdafd](feast-dev@64bdafd))
* Enable write node for compute engine ([feast-dev#5287](feast-dev#5287)) ([f9baf97](feast-dev@f9baf97))
* Local compute engine ([feast-dev#5278](feast-dev#5278)) ([8e06dfe](feast-dev@8e06dfe))
* Make transform on writes configurable for ingestion ([feast-dev#5283](feast-dev#5283)) ([ecad170](feast-dev@ecad170))
* Offline store update pull_all_from_table_or_query to make timestampfield optional ([feast-dev#5281](feast-dev#5281)) ([4b94608](feast-dev@4b94608))
* Serialization version 2 deprecation notice ([feast-dev#5248](feast-dev#5248)) ([327d99d](feast-dev@327d99d))
* Vector length definition moved to Feature View from Config  ([feast-dev#5289](feast-dev#5289)) ([d8f1c97](feast-dev@d8f1c97))

Signed-off-by: Jacob Weinhold <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants