新聞中心
SQL 時(shí)區(qū)問題
1.SQL 時(shí)區(qū)解決的問題
首先說一下這個(gè)問題的背景:

創(chuàng)新互聯(lián)建站專業(yè)為企業(yè)提供尼勒克網(wǎng)站建設(shè)、尼勒克做網(wǎng)站、尼勒克網(wǎng)站設(shè)計(jì)、尼勒克網(wǎng)站制作等企業(yè)網(wǎng)站建設(shè)、網(wǎng)頁設(shè)計(jì)與制作、尼勒克企業(yè)網(wǎng)站模板建站服務(wù),十載尼勒克做網(wǎng)站經(jīng)驗(yàn),不只是建網(wǎng)站,更提供有價(jià)值的思路和整體網(wǎng)絡(luò)服務(wù)。
大家想一下離線 Hive 環(huán)境中,有遇到過時(shí)區(qū)時(shí)區(qū)相關(guān)的問題嗎?
至少博主目前沒有碰到過,因?yàn)檫@個(gè)問題在底層的數(shù)據(jù)集成系統(tǒng)都已經(jīng)給解決了,小伙伴萌拿到手的 ODS 層表都是已經(jīng)按照所在地區(qū)的時(shí)區(qū)給格式化好的了。
舉個(gè)例子:小伙伴萌看到日期分區(qū)為 2022-01-01 的 Hive 表時(shí),可以默認(rèn)認(rèn)為該分區(qū)中的數(shù)據(jù)就對應(yīng)到你所在地區(qū)的時(shí)區(qū)的 2022-01-01 日的數(shù)據(jù)。
但是 Flink 中時(shí)區(qū)問題要特別引起關(guān)注,不加小心就會誤用。
而本節(jié) SQL 時(shí)區(qū)旨在幫助大家了解到以下兩個(gè)場景的問題:
- 在 1.13 之前,DDL create table 中使用 PROCTIME() 指定處理時(shí)間列時(shí),返回值類型為 TIMESTAMP(3) 類型,而 TIMESTAMP(3) 是不帶任何時(shí)區(qū)信息的,默認(rèn)為 UTC 時(shí)間(0 時(shí)區(qū))。
- 使用 StreamTableEnvironment::createTemporaryView 將 DataStream 轉(zhuǎn)為 Table 時(shí),注冊處理時(shí)間(proctime.proctime)、事件時(shí)間列(rowtime.rowtime)時(shí),兩列時(shí)間類型也為 TIMESTAMP(3) 類型,不帶時(shí)區(qū)信息。
而以上兩個(gè)場景就會導(dǎo)致:
- 在北京時(shí)區(qū)的用戶使用 TIMESTAMP(3) 類型的時(shí)間列開最常用的 1 天的窗口時(shí),劃分出來的窗口范圍是北京時(shí)間的 [2022-01-01 08:00:00, 2022-01-02 08:00:00],而不是北京時(shí)間的 [2022-01-01 00:00:00, 2022-01-02 00:00:00]。因?yàn)?TIMESTAMP(3) 是默認(rèn)的 UTC 時(shí)間,即 0 時(shí)區(qū)。
- 北京時(shí)區(qū)的用戶將 TIMESTAMP(3) 類型時(shí)間屬性列轉(zhuǎn)為 STRING 類型的數(shù)據(jù)展示時(shí),也是 UTC 時(shí)區(qū)的,而不是北京時(shí)間的。
因此充分了解本節(jié)的知識內(nèi)容可以很好的幫你避免時(shí)區(qū)問題錯(cuò)誤。
2.SQL 時(shí)間類型
- Flink SQL 支持 TIMESTAMP(不帶時(shí)區(qū)信息的時(shí)間)、TIMESTAMP_LTZ(帶時(shí)區(qū)信息的時(shí)間)
- TIMESTAMP(不帶時(shí)區(qū)信息的時(shí)間):是通過一個(gè) 年, 月, 日, 小時(shí), 分鐘, 秒 和 小數(shù)秒 的字符串來指定。舉例:1970-01-01 00:00:04.001。
- 為什么要使用字符串來指定呢?因?yàn)榇朔N類型不帶時(shí)區(qū)信息,所以直接用一個(gè)字符串指定就好了?
- 那 TIMESTAMP 字符串的時(shí)間代表的是什么時(shí)區(qū)的時(shí)間呢?UTC 時(shí)區(qū),也就是默認(rèn) 0 時(shí)區(qū),對應(yīng)中國北京是東八區(qū)。
- TIMESTAMP_LTZ(帶時(shí)區(qū)信息的時(shí)間):沒有字符串來指定,而是通過 java 標(biāo)準(zhǔn) epoch 時(shí)間 1970-01-01T00:00:00Z 開始計(jì)算的毫秒數(shù)。舉例:1640966400000。
- 其時(shí)區(qū)信息是怎么指定的呢?是通過本次任務(wù)中的時(shí)區(qū)配置參數(shù) table.local-time-zone 設(shè)置的。
- 時(shí)間戳本身也不帶有時(shí)區(qū)信息,為什么要使用時(shí)間戳來指定呢?就是因?yàn)闀r(shí)間戳不帶有時(shí)區(qū)信息,所以我們通過配置 table.local-time-zone 時(shí)區(qū)參數(shù)之后,就能將一個(gè)不帶有時(shí)區(qū)信息的時(shí)間戳轉(zhuǎn)換為帶有時(shí)區(qū)信息的字符串了。舉例:table.local-time-zone 為 Asia/Shanghai 時(shí),4001 時(shí)間戳轉(zhuǎn)化為字符串的效果是 1970-01-01 08:00:04.001。
3.時(shí)區(qū)參數(shù)生效的 SQL 時(shí)間函數(shù)
以下 SQL 中的時(shí)間函數(shù)都會受到時(shí)區(qū)參數(shù)的影響,從而做到最后顯示給用戶的時(shí)間、窗口的劃分都按照用戶設(shè)置時(shí)區(qū)之內(nèi)的時(shí)間。
- LOCALTIME;
- LOCALTIMESTAMP;
- CURRENT_DATE;
- CURRENT_TIME;
- CURRENT_TIMESTAMP;
- CURRENT_ROW_TIMESTAMP();
- NOW();
- PROCTIME():其中 PROCTIME() 在 1.13 版本及之后版本,返回值類型是 TIMESTAMP_LTZ(3)。
在 Flink SQL client 中執(zhí)行結(jié)果如下:
Flink SQL> SET sql-client.execution.result-mode=tableau;
Flink SQL> CREATE VIEW MyView1 AS SELECT LOCALTIME, LOCALTIMESTAMP, CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, CURRENT_ROW_TIMESTAMP(), NOW(), PROCTIME();
Flink SQL> DESC MyView1;
+------------------------+-----------------------------+-------+-----+--------+-----------+
| name | type | null | key | extras | watermark |
+------------------------+-----------------------------+-------+-----+--------+-----------+
| LOCALTIME | TIME(0) | false | | | |
| LOCALTIMESTAMP | TIMESTAMP(3) | false | | | |
| CURRENT_DATE | DATE | false | | | |
| CURRENT_TIME | TIME(0) | false | | | |
| CURRENT_TIMESTAMP | TIMESTAMP_LTZ(3) | false | | | |
|CURRENT_ROW_TIMESTAMP() | TIMESTAMP_LTZ(3) | false | | | |
| NOW() | TIMESTAMP_LTZ(3) | false | | | |
| PROCTIME() | TIMESTAMP_LTZ(3) *PROCTIME* | false | | | |
+------------------------+-----------------------------+-------+-----+--------+-----------+
Flink SQL> SET table.local-time-zone=UTC;
Flink SQL> SELECT * FROM MyView1;
+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
| LOCALTIME | LOCALTIMESTAMP | CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP | CURRENT_ROW_TIMESTAMP() | NOW() | PROCTIME() |
+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
| 15:18:36 | 2021-04-15 15:18:36.384 | 2021-04-15 | 15:18:36 | 2021-04-15 15:18:36.384 | 2021-04-15 15:18:36.384 | 2021-04-15 15:18:36.384 | 2021-04-15 15:18:36.384 |
+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
Flink SQL> SET table.local-time-zone=Asia/Shanghai;
Flink SQL> SELECT * FROM MyView1;
+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
| LOCALTIME | LOCALTIMESTAMP | CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP | CURRENT_ROW_TIMESTAMP() | NOW() | PROCTIME() |
+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
| 23:18:36 | 2021-04-15 23:18:36.384 | 2021-04-15 | 23:18:36 | 2021-04-15 23:18:36.384 | 2021-04-15 23:18:36.384 | 2021-04-15 23:18:36.384 | 2021-04-15 23:18:36.384 |
+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
Flink SQL> CREATE VIEW MyView2 AS SELECT TO_TIMESTAMP_LTZ(4001, 3) AS ltz, TIMESTAMP '1970-01-01 00:00:01.001' AS ntz;
Flink SQL> DESC MyView2;
+------+------------------+-------+-----+--------+-----------+
| name | type | null | key | extras | watermark |
+------+------------------+-------+-----+--------+-----------+
| ltz | TIMESTAMP_LTZ(3) | true | | | |
| ntz | TIMESTAMP(3) | false | | | |
+------+------------------+-------+-----+--------+-----------+
Flink SQL> SET table.local-time-zone=UTC;
Flink SQL> SELECT * FROM MyView2;
+-------------------------+-------------------------+
| ltz | ntz |
+-------------------------+-------------------------+
| 1970-01-01 00:00:04.001 | 1970-01-01 00:00:01.001 |
+-------------------------+-------------------------+
Flink SQL> SET table.local-time-zone=Asia/Shanghai;
Flink SQL> SELECT * FROM MyView2;
+-------------------------+-------------------------+
| ltz | ntz |
+-------------------------+-------------------------+
| 1970-01-01 08:00:04.001 | 1970-01-01 00:00:01.001 |
+-------------------------+-------------------------+
Flink SQL> CREATE VIEW MyView3 AS SELECT ltz, CAST(ltz AS TIMESTAMP(3)), CAST(ltz AS STRING), ntz, CAST(ntz AS TIMESTAMP_LTZ(3)) FROM MyView2;
Flink SQL> DESC MyView3;
+-------------------------------+------------------+-------+-----+--------+-----------+
| name | type | null | key | extras | watermark |
+-------------------------------+------------------+-------+-----+--------+-----------+
| ltz | TIMESTAMP_LTZ(3) | true | | | |
| CAST(ltz AS TIMESTAMP(3)) | TIMESTAMP(3) | true | | | |
| CAST(ltz AS STRING) | STRING | true | | | |
| ntz | TIMESTAMP(3) | false | | | |
| CAST(ntz AS TIMESTAMP_LTZ(3)) | TIMESTAMP_LTZ(3) | false | | | |
+-------------------------------+------------------+-------+-----+--------+-----------+
Flink SQL> SELECT * FROM MyView3;
+-------------------------+---------------------------+-------------------------+-------------------------+-------------------------------+
| ltz | CAST(ltz AS TIMESTAMP(3)) | CAST(ltz AS STRING) | ntz | CAST(ntz AS TIMESTAMP_LTZ(3)) |
+-------------------------+---------------------------+-------------------------+-------------------------+-------------------------------+
| 1970-01-01 08:00:04.001 | 1970-01-01 08:00:04.001 | 1970-01-01 08:00:04.001 | 1970-01-01 00:00:01.001 | 1970-01-01 00:00:01.001 |
+-------------------------+---------------------------+-------------------------+-------------------------+-------------------------------+
4.事件時(shí)間和時(shí)區(qū)應(yīng)用案例
這里分兩類,分別是 TIMESTAMP(不帶時(shí)區(qū)信息的時(shí)間)、TIMESTAMP_LTZ(帶時(shí)區(qū)信息的時(shí)間) 的事件時(shí)間 Flink SQL 任務(wù)。
- TIMESTAMP(不帶時(shí)區(qū)信息的時(shí)間)
Flink SQL> CREATE TABLE MyTable2 (
item STRING,
price DOUBLE,
ts TIMESTAMP(3), -- TIMESTAMP 類型的時(shí)間戳
WATERMARK FOR ts AS ts - INTERVAL '10' SECOND
) WITH (
'connector' = 'socket',
'hostname' = '127.0.0.1',
'port' = '9999',
'format' = 'csv'
);
Flink SQL> CREATE VIEW MyView4 AS
SELECT
TUMBLE_START(ts, INTERVAL '10' MINUTES) AS window_start,
TUMBLE_END(ts, INTERVAL '10' MINUTES) AS window_end,
TUMBLE_ROWTIME(ts, INTERVAL '10' MINUTES) as window_rowtime,
item,
MAX(price) as max_price
FROM MyTable2
GROUP BY TUMBLE(ts, INTERVAL '10' MINUTES), item;
Flink SQL> DESC MyView4;
+----------------+------------------------+------+-----+--------+-----------+
| name | type | null | key | extras | watermark |
+----------------+------------------------+------+-----+--------+-----------+
| window_start | TIMESTAMP(3) | true | | | |
| window_end | TIMESTAMP(3) | true | | | |
| window_rowtime | TIMESTAMP(3) *ROWTIME* | true | | | |
| item | STRING | true | | | |
| max_price | DOUBLE | true | | | |
+----------------+------------------------+------+-----+--------+-----------+
將數(shù)據(jù)寫入到 MyTable2 中:
> nc -lk 9999
A,1.1,2021-04-15 14:01:00
B,1.2,2021-04-15 14:02:00
A,1.8,2021-04-15 14:03:00
B,2.5,2021-04-15 14:04:00
C,3.8,2021-04-15 14:05:00
C,3.8,2021-04-15 14:11:00
最終結(jié)果如下:
Flink SQL> SET table.local-time-zone=UTC;
Flink SQL> SELECT * FROM MyView4;
+-------------------------+-------------------------+-------------------------+------+-----------+
| window_start | window_end | window_rowtime | item | max_price |
+-------------------------+-------------------------+-------------------------+------+-----------+
| 2021-04-15 14:00:00.000 | 2021-04-15 14:10:00.000 | 2021-04-15 14:09:59.999 | A | 1.8 |
| 2021-04-15 14:00:00.000 | 2021-04-15 14:10:00.000 | 2021-04-15 14:09:59.999 | B | 2.5 |
| 2021-04-15 14:00:00.000 | 2021-04-15 14:10:00.000 | 2021-04-15 14:09:59.999 | C | 3.8 |
+-------------------------+-------------------------+-------------------------+------+-----------+
Flink SQL> SET table.local-time-zone=Asia/Shanghai;
Flink SQL> SELECT * FROM MyView4;
+-------------------------+-------------------------+-------------------------+------+-----------+
| window_start | window_end | window_rowtime | item | max_price |
+-------------------------+-------------------------+-------------------------+------+-----------+
| 2021-04-15 14:00:00.000 | 2021-04-15 14:10:00.000 | 2021-04-15 14:09:59.999 | A | 1.8 |
| 2021-04-15 14:00:00.000 | 2021-04-15 14:10: 本文題目:FlinkSQL知其所以然:SQL的時(shí)區(qū)問題!
本文地址:http://www.fisionsoft.com.cn/article/cdppcho.html


咨詢
建站咨詢
