feat(bigquery): Support DATE-type event timestamp columns #6362

Open
Jwrede wants to merge 9 commits into feast-dev:master from Jwrede:feat/bq-date-timestamp-type

Jwrede (Contributor) commented May 3, 2026

What this PR does / why we need it:

When the event_timestamp column in BigQuery is a DATE type (not TIMESTAMP), the generated SQL wraps comparison values in TIMESTAMP(), causing a type mismatch error. This makes DATE-partitioned summary tables unusable without creating views or duplicate tables.

This PR adds an optional timestamp_field_type parameter to BigQuerySource. When set to "DATE", SQL generation uses DATE('YYYY-MM-DD') comparisons instead of TIMESTAMP('...'), both in direct queries (pull_latest_from_table_or_query, pull_all_from_table_or_query) and in the point-in-time join Jinja template.

Usage:

BigQuerySource(
    table="project:dataset.daily_features",
    timestamp_field="event_date",
    timestamp_field_type="DATE",
)

Changes:

  • Proto: add timestamp_field_type string field (field 28) to DataSource
  • DataSource base class: add timestamp_field_type attribute, equality check, and __init__ parameter
  • BigQuerySource: wire timestamp_field_type through __init__, from_proto, and _to_proto_impl
  • get_timestamp_filter_sql(): add "date_func" cast style that generates DATE('YYYY-MM-DD')
  • BigQueryOfflineStore: select cast style based on timestamp_field_type
  • Jinja template: conditional DATE() comparisons for DATE-type timestamp fields
  • FeatureViewQueryContext: propagate timestamp_field_type to template context

Backward-compatible: when timestamp_field_type is unset, behavior is unchanged.
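The difference between the two comparison styles can be sketched as follows. This is a minimal illustration, not the Feast implementation: the function name `timestamp_filter_sql`, its signature, and the `"timestamp_func"` style name are assumptions made for the example; only the `"date_func"` name and the `DATE('YYYY-MM-DD')` output shape come from the description above.

```python
from datetime import datetime, timezone
from typing import Literal

# "date_func" is named in this PR; "timestamp_func" is an assumed name
# for the pre-existing TIMESTAMP('...') style.
CastStyle = Literal["timestamp_func", "date_func"]


def timestamp_filter_sql(
    column: str,
    start: datetime,
    end: datetime,
    cast_style: CastStyle = "timestamp_func",
) -> str:
    """Hypothetical helper: build a BETWEEN filter for a timestamp column."""
    if cast_style == "date_func":
        # DATE columns must be compared against DATE literals.
        low = f"DATE('{start.strftime('%Y-%m-%d')}')"
        high = f"DATE('{end.strftime('%Y-%m-%d')}')"
    else:
        # Default path: wrap ISO-8601 strings in TIMESTAMP().
        low = f"TIMESTAMP('{start.isoformat()}')"
        high = f"TIMESTAMP('{end.isoformat()}')"
    return f"{column} BETWEEN {low} AND {high}"


start = datetime(2026, 5, 1, tzinfo=timezone.utc)
end = datetime(2026, 5, 3, tzinfo=timezone.utc)
date_sql = timestamp_filter_sql("event_date", start, end, cast_style="date_func")
default_sql = timestamp_filter_sql("event_timestamp", start, end)
```

Keeping the TIMESTAMP() path as the default mirrors the backward-compatibility claim: callers that never request the DATE style get the same SQL as before.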

Which issue(s) this PR fixes:

Fixes #2530 (part 2 -- DATE type event_timestamp support; part 1 was addressed by #6076)

How to test:

python -m pytest sdk/python/tests/unit/infra/offline_stores/test_bigquery.py -v

4 new tests added:

  • test_pull_latest_date_type_timestamp_field -- verifies DATE() cast in pull_latest
  • test_pull_all_date_type_timestamp_field -- verifies DATE() cast in pull_all
  • test_pull_latest_date_type_with_partition_column -- DATE type combined with partition pruning
  • test_bigquery_source_date_type_proto_roundtrip -- proto serialization roundtrip

When the event_timestamp column in BigQuery is a DATE type, the
generated SQL wraps comparison values in TIMESTAMP(), causing a type
mismatch error. This adds a timestamp_field_type parameter to
BigQuerySource that, when set to "DATE", generates DATE() comparisons
instead.

Closes feast-dev#2530 (part 2)

Signed-off-by: Jonathan Wrede <wrede.jonathan00@gmail.com>
@Jwrede Jwrede requested review from a team and sudohainguyen as code owners May 3, 2026 08:47
@Jwrede Jwrede requested review from HaoXuAI, nquinn408 and shuchu and removed request for a team May 3, 2026 08:47
Jwrede added 2 commits May 3, 2026 08:55
The proto files were regenerated with protobuf 6.31.1 / grpcio-tools
1.80.0, which imports runtime_version -- a module that does not exist
in protobuf 4.25.x used by the project. Revert generated code to
4.25.1 format while keeping the new timestamp_field_type field.

Signed-off-by: Jonathan Wrede <wrede.jonathan00@gmail.com>
Mypy infers str from the ternary expression; annotate with the
exact Literal union so the call to get_timestamp_filter_sql passes
type checking.

Signed-off-by: Jonathan Wrede <wrede.jonathan00@gmail.com>
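The typing issue in this commit can be reproduced in isolation. The sketch below uses hypothetical stand-ins (`CastStyle`, `describe_cast`, `cast_for`) rather than the real Feast functions, and `"timestamp_func"` is an assumed style name; it only illustrates why the explicit Literal annotation is needed.

```python
from typing import Literal

# Hypothetical alias; only "date_func" is taken from the PR text.
CastStyle = Literal["timestamp_func", "date_func"]


def describe_cast(cast_style: CastStyle) -> str:
    """Stand-in for a function that accepts only the literal cast styles."""
    if cast_style == "date_func":
        return "DATE('YYYY-MM-DD')"
    return "TIMESTAMP('...')"


def cast_for(timestamp_field_type: str) -> str:
    # Without the annotation, mypy infers plain `str` for the ternary
    # and rejects the describe_cast() call on the next statement.
    cast_style: CastStyle = (
        "date_func" if timestamp_field_type == "DATE" else "timestamp_func"
    )
    return describe_cast(cast_style)
```

The annotation changes nothing at runtime; it only gives the type checker the narrower declared type.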
franciscojavierarceo and others added 6 commits May 3, 2026 10:42
…text

Callers that do not use DATE-typed timestamp fields (e.g. Spark offline
store tests) should not be forced to pass timestamp_field_type. Adding
a default keeps the new field backward-compatible.

Signed-off-by: Jonathan Wrede <wrede.jonathan00@gmail.com>
A default value on timestamp_field_type breaks the
SparkFeatureViewQueryContext subclass because its non-default fields
(min_date_partition, max_date_partition) would follow a field with a
default. Instead, keep it required and update the Spark test to pass it.

Signed-off-by: Jonathan Wrede <wrede.jonathan00@gmail.com>
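The field-ordering constraint behind this revert is a general property of Python dataclasses. The sketch below uses simplified stand-in classes (`ContextWithDefault`, `Context`, `SparkContext`), not the real FeatureViewQueryContext definitions, to show both the failure and the approach taken in the commit.

```python
from dataclasses import dataclass

# Attempt 1: give the base-class field a default. Defining the subclass
# then fails, because its required fields would follow a defaulted one.
error_message = ""
try:

    @dataclass
    class ContextWithDefault:
        name: str
        timestamp_field_type: str = ""

    @dataclass
    class SparkContextBroken(ContextWithDefault):
        min_date_partition: str
        max_date_partition: str

except TypeError as exc:
    error_message = str(exc)


# Attempt 2: keep the field required, so subclasses can add required fields.
@dataclass
class Context:
    name: str
    timestamp_field_type: str


@dataclass
class SparkContext(Context):
    min_date_partition: str
    max_date_partition: str


# Callers now pass the field explicitly, even when it is empty.
ctx = SparkContext("driver_stats", "", "2026-05-01", "2026-05-03")
```

On Python 3.10 and later, `@dataclass(kw_only=True)` is another way around this ordering rule, at the cost of requiring keyword arguments at every call site.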

Development

Successfully merging this pull request may close these issues.

Handle BigQuery partitions when event_timestamp is not the partition column, deal with event_timestamp columns of DATE type

2 participants