本教學課程將說明如何搭配 BigQuery ML 的內建 TimesFM 單變數模型,使用 AI.FORECAST
函式,根據指定資料欄的歷來值,預測該資料欄的未來值。
本教學課程使用公開bigquery-public-data.san_francisco_bikeshare.bikeshare_trips
資料表中的資料。
目標
本教學課程會逐步引導您使用 AI.FORECAST 函式和內建的 TimesFM 模型,預測自行車共享行程。前兩節說明如何預測及顯示單一時間序列的結果。第三節說明如何預測多個時間序列。
費用
本教學課程使用 Google Cloud的計費元件,包括:
- BigQuery
- BigQuery ML
如要進一步瞭解 BigQuery 費用,請參閱 BigQuery 定價頁面。
如要進一步瞭解 BigQuery ML 費用,請參閱 BigQuery ML 定價。
事前準備
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
- 新專案會自動啟用 BigQuery。如要在現有的專案中啟用 BigQuery,請
Enable the BigQuery API.
預測單一時間序列的共享單車行程
使用 AI.FORECAST
函式預測未來時間序列值。
下列查詢會根據過去四個月的歷史資料,預測下個月 (約 720 小時) 每小時的訂閱者單車共享行程數。confidence_level
引數表示查詢會產生信賴水準為 95% 的預測間隔。
如要使用 TimesFM 模型預測資料,請按照下列步驟操作:
前往 Google Cloud 控制台的「BigQuery」頁面。
在查詢編輯器中貼上以下查詢,然後點選「執行」:
SELECT * FROM AI.FORECAST( ( SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, COUNT(*) as num_trips FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips` WHERE subscriber_type = 'Subscriber' AND start_date >= TIMESTAMP('2018-01-01') GROUP BY TIMESTAMP_TRUNC(start_date, HOUR) ), horizon => 720, confidence_level => 0.95, timestamp_col => 'trip_hour', data_col => 'num_trips');
結果類似下方:
+-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+ | forecast_timestamp | forecast_value | confidence_level | prediction_interval_lower_bound | prediction_interval_upper_bound | ai_forecast_status | +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+ | 2018-05-01 00:00:00 UTC | 26.3045959... | 0.95 | 21.7088378... | 30.9003540... | | +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+ | 2018-05-01 01:00:00 UTC | 34.0890502... | 0.95 | 2.47682913... | 65.7012714... | | +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+ | 2018-05-01 02:00:00 UTC | 24.2154693... | 0.95 | 2.87621605... | 45.5547226... | | +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+ | ... | ... | ... | ... | ... | | +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
比較預測資料與輸入資料
繪製 AI.FORECAST
函式輸出內容的圖表,並與部分函式輸入資料進行比較。
請按照下列步驟繪製函式輸出內容的圖表:
前往 Google Cloud 控制台的「BigQuery」頁面。
在查詢編輯器中貼上以下查詢,然後點選「執行」:
WITH historical AS ( SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, COUNT(*) as num_trips FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips` WHERE subscriber_type = 'Subscriber' AND start_date >= TIMESTAMP('2018-01-01') GROUP BY TIMESTAMP_TRUNC(start_date, HOUR) ORDER BY TIMESTAMP_TRUNC(start_date, HOUR) ) SELECT * FROM ( (SELECT trip_hour as date, num_trips AS historical_value, NULL as forecast_value, 'historical' as type, NULL as prediction_interval_low, NULL as prediction_interval_upper_bound FROM historical ORDER BY historical.trip_hour DESC LIMIT 400) UNION ALL (SELECT forecast_timestamp AS date, NULL as historical_value, forecast_value as forecast_value, 'forecast' as type, prediction_interval_lower_bound, prediction_interval_upper_bound FROM AI.FORECAST( ( SELECT * FROM historical ), horizon => 720, confidence_level => 0.99, timestamp_col => 'trip_hour', data_col => 'num_trips'))) ORDER BY date asc;
查詢執行完畢後,按一下「Query results」(查詢結果) 窗格中的「Chart」(圖表) 分頁標籤。產生的圖表看起來類似下列內容:
您會發現輸入資料和預測資料顯示的單車共享使用情形相似。您也可以看到,預測時間點越往未來延伸,預測區間下限和上限就會增加。
預測多個自行車共享行程的時間序列
下列查詢會根據過去四個月的歷史資料,預測下個月 (約 720 小時) 各訂閱者類型每小時的單車共享行程數。confidence_level
引數表示查詢會產生信賴水準為 95% 的預測間隔。
如要使用 TimesFM 模型預測資料,請按照下列步驟操作:
前往 Google Cloud 控制台的「BigQuery」頁面。
在查詢編輯器中貼上以下查詢,然後點選「執行」:
SELECT * FROM AI.FORECAST( ( SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, subscriber_type, COUNT(*) as num_trips FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips` WHERE start_date >= TIMESTAMP('2018-01-01') GROUP BY TIMESTAMP_TRUNC(start_date, HOUR), subscriber_type ), horizon => 720, confidence_level => 0.95, timestamp_col => 'trip_hour', data_col => 'num_trips', id_cols => ['subscriber_type']);
結果類似下方:
+---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+ | subscriber_type | forecast_timestamp | forecast_value | confidence_level | prediction_interval_lower_bound | prediction_interval_upper_bound | ai_forecast_status | +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+ | Subscriber | 2018-05-01 00:00:00 UTC | 26.3045959... | 0.95 | 21.7088378... | 30.9003540... | | +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+ | Subscriber | 2018-05-01 01:00:00 UTC | 34.0890502... | 0.95 | 2.47682913... | 65.7012714... | | +---------------------+-------------------+------------------+-------------------------+---------------------------------+---------------------------------+--------------------+ | Subscriber | 2018-05-01 02:00:00 UTC | 24.2154693... | 0.95 | 2.87621605... | 45.5547226... | | +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+ | ... | ... | ... | ... | ... | ... | | +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
清除所用資源
如要避免系統向您的 Google Cloud 帳戶收取本教學課程中所用資源的相關費用,請刪除含有該項資源的專案,或者保留專案但刪除個別資源。
刪除專案
如要刪除專案,請進行以下操作:
- In the Google Cloud console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
後續步驟
- 如需 BigQuery ML 的總覽,請參閱 BigQuery 中的 AI 和 ML 簡介。