【累积型快照事实表】订单表

tech2022-09-06  99

1、concat

concat函数在连接字符串的时候,只要其中一个是NULL,那么将返回NULL

hive> select concat('a','b'); ab hive> select concat('a','b',null); NULL

2、concat_ws

concat_ws函数在连接字符串的时候,只要有一个字符串不是NULL,就不会返回NULL。

concat_ws函数需要指定分隔符。

hive> select concat_ws('-','a','b'); a-b hive> select concat_ws('-','a','b',null); a-b hive> select concat_ws('','a','b',null); ab

3、STR_TO_MAP

语法

STR_TO_MAP(VARCHAR text, VARCHAR listDelimiter, VARCHAR keyValueDelimiter)

功能 使用listDelimiter将text分隔成K-V对,然后使用keyValueDelimiter分隔每个K-V对,组装成MAP返回。默认listDelimiter为( ,),keyValueDelimiter为(=)。 案例 str_to_map('1001=2020-03-10,1002=2020-03-10', ',' , '=') 输出 {"1001":"2020-03-10","1002":"2020-03-10"}

4、案例

第一步:

hive> select order_id, concat(order_status,'=', operate_time) from order_status_log where dt='2020-03-10'; 3210 1001=2020-03-10 00:00:00.0 3211 1001=2020-03-10 00:00:00.0 3212 1001=2020-03-10 00:00:00.0 3210 1002=2020-03-10 00:00:00.0 3211 1002=2020-03-10 00:00:00.0 3212 1002=2020-03-10 00:00:00.0 3210 1005=2020-03-10 00:00:00.0 3211 1004=2020-03-10 00:00:00.0 3212 1004=2020-03-10 00:00:00.0

第二步:

hive > select order_id, collect_set(concat(order_status,'=',operate_time)) from order_status_log where dt='2020-03-10' group by order_id; 3210 ["1001=2020-03-10 00:00:00.0","1002=2020-03-10 00:00:00.0","1005=2020-03-10 00:00:00.0"] 3211 ["1001=2020-03-10 00:00:00.0","1002=2020-03-10 00:00:00.0","1004=2020-03-10 00:00:00.0"] 3212 ["1001=2020-03-10 00:00:00.0","1002=2020-03-10 00:00:00.0","1004=2020-03-10 00:00:00.0"]

第三步:

hive> select order_id, concat_ws(',', collect_set(concat(order_status,'=',operate_time))) from order_status_log where dt='2020-03-10' group by order_id; 3210 1001=2020-03-10 00:00:00.0,1002=2020-03-10 00:00:00.0,1005=2020-03-10 00:00:00.0 3211 1001=2020-03-10 00:00:00.0,1002=2020-03-10 00:00:00.0,1004=2020-03-10 00:00:00.0 3212 1001=2020-03-10 00:00:00.0,1002=2020-03-10 00:00:00.0,1004=2020-03-10 00:00:00.0

 

第四步:

hive > select order_id, str_to_map(concat_ws(',',collect_set(concat(order_status,'=',operate_time))), ',' , '=') tms from order_status_log where dt='2020-03-10' group by order_id; 3210 {"1001":"2020-03-10 00:00:00.0","1002":"2020-03-10 00:00:00.0","1005":"2020-03-10 00:00:00.0"} 3211 {"1001":"2020-03-10 00:00:00.0","1002":"2020-03-10 00:00:00.0","1004":"2020-03-10 00:00:00.0"} 3212 {"1001":"2020-03-10 00:00:00.0","1002":"2020-03-10 00:00:00.0","1004":"2020-03-10 00:00:00.0"}

第五步:取值

tms['1001']:创建时间(未支付状态)tms['1002']:支付时间(支付状态)tms['1003']:取消时间(已取消状态)tms['1004']:完成时间(已完成状态)tms['1005']:退款时间(退款状态)tms['1006']:退款完成时间(退款完成状态)
最新回复(0)