=====================================lateral view==================================
分类:视图
功能:配合UDTF来使用,把某一行数据拆分成多行数据
与UDTF直接使用的区别:
很多的UDTF不能将结果与源表进行关联,使用lateral view
可以将UDTF拆分的单个字段数据与原始表数据关联上
使用方式:
tabelA lateral view UDTF(xxx) 视图名 as a,b,c
案例:
--准备数据:vim /export/datas/lateral.txt
1 http://facebook.com/path/p1.php?query=1
2 http://www.baidu.com/news/index.jsp?uuid=frank
3 http://www.jd.com/index?source=baidu
--创建表
create table tb_url(
id int,
url string
) row format delimited fields terminated by '\t';
--加载数据
load data local inpath '/export/datas/lateral.txt' into table tb_url;
--使用UDTF解析
SELECT parse_url_tuple(url, 'HOST') from tb_url;
--使用UDTF+lateral view
select a.id,b.* from tb_url a lateral view parse_url_tuple(url, 'HOST') b as host;
--对比
SELECT id,parse_url_tuple(url, 'HOST') from tb_url;--失败,UDTF函数不能与字段连用
=====================================explode==================================
分类:UDTF
功能:函数可以将一个array或者map展开
explode(array):
将array列表里的每个元素生成一行
explode(map):
每一对元素作为一行,key为一列,value为一列
使用方式:
1-直接使用
2-与lateral view连用
案例:
1-实现wordcount【explode(array)】
select explode(split(word," ")) from wc1;
2-将兴趣爱好展开【explode(map)】
select explode(hobby) from complex_map;
3-与侧视图连用
select a.name,b.* from complex_map a lateral view explode(hobby) b as hobby,deep;
=====================================行列转换==================================
行转列:
案例1:多行转多列
-》创建数据:vim /export/datas/r2c1.txt
a c 1
a d 2
a e 3
b c 4
b d 5
b e 6
-》希望结果
+-------+----+----+----+--+
| col1 | c | d | e |
+-------+----+----+----+--+
| a | 1 | 2 | 3 |
| b | 4 | 5 | 6 |
+-------+----+----+----+--+
-》创建表
create table row2col1(
col1 string,
col2 string,
col3 int
)row format delimited fields terminated by '\t';
-》加载数据
load data local inpath '/export/datas/r2c1.txt' into table row2col1;
-》实现
select
col1 as col1,
max(case col2 when 'c' then col3 else 0 end) as c,
max(case col2 when 'd' then col3 else 0 end) as d,
max(case col2 when 'e' then col3 else 0 end) as e
from row2col1
group by col1;
案例2:多行转单列【重要】
-》创建数据:vim /export/datas/r2c2.txt
a b 1
a b 2
a b 3
c d 4
c d 5
c d 6
-》希望结果
+-------+-------+--------+--+
| col1 | col2 | col3 |
+-------+-------+--------+--+
| a | b | 1,2,3 |
| c | d | 4,5,6 |
+-------+-------+--------+--+
-》创建表
create table row2col2(
col1 string,
col2 string,
col3 int
)row format delimited fields terminated by '\t';
-》加载数据
load data local inpath '/export/datas/r2c2.txt' into table row2col2;
-》实现
select
col1,
col2,
concat_ws(',', collect_set(cast(col3 as string))) as col3
from
row2col2
group by
col1, col2;
concat_ws(参数1,参数2):用于进行字符的拼接
参数1—指定分隔符
参数2—拼接的内容
collect_set(col):将某字段的值进行去重汇总,产生array类型字段
如果不想去重可用collect_list()
列转行:
案例1:多列转多行
-》创建数据:vim /export/datas/c2r1.txt
co11 col2 col3 col4
a 1 2 3
b 4 5 6
-》希望结果
+-----------+-----------+-----------+--+
| _u1.col1 | _u1.col2 | _u1.col3 |
+-----------+-----------+-----------+--+
| a | c | 1 |
| a | d | 2 |
| a | e | 3 |
| b | c | 4 |
| b | d | 5 |
| b | e | 6 |
+-----------+-----------+-----------+--+
-》创建表
create table col2row1(
col1 string,
col2 int,
col3 int,
col4 int
)row format delimited fields terminated by '\t';
-》加载数据
load data local inpath '/export/datas/c2r1.txt' into table col2row1;
-》实现
select col1, 'c' as col2, col2 as col3 from col2row1
UNION ALL
select col1, 'd' as col2, col3 as col3 from col2row1
UNION ALL
select col1, 'e' as col2, col4 as col3 from col2row1;
案例2:单列转多行(重要)
-》创建数据:vim /export/datas/c2r2.txt
col1 col2 col3
a b 1,2,3
c d 4,5,6
-》希望结果
+-------+-------+-------+--+
| col1 | col2 | col3 |
+-------+-------+-------+--+
| a | b | 1 |
| a | b | 2 |
| a | b | 3 |
| c | d | 4 |
| c | d | 5 |
| c | d | 6 |
+-------+-------+-------+--+
-》创建表
create table col2row2(
col1 string,
col2 string,
col3 string
)row format delimited fields terminated by '\t';
-》加载数据
load data local inpath '/export/datas/c2r2.txt' into table col2row2;
-》实现
select
col1,
col2,
lv.col3 as col3
from
col2row2
lateral view
explode(split(col3, ',')) lv as col3;