ByConity ELT 小测

借着 ByConity 的邀测活动体验一下 bsp 能力,ByConity 也很贴心的提供了标准环境,下面开始体验。

测试环境

版本

配置

ByConity v1.0.1

集群规格

Worker:4 * 16core 64G

Server:1 * 16core 64G

TSO:1 * 4core 16G

Daemon Manager:1 * 4core 16G

Resource Manager:1 * 8core 32G

存储:对象存储 TOS

FoundationDB:3 * 4core 16G

TPD-DS 测试

登录 ByConity

乍一看咋是 ByteHouse,不过了解的应该都知道,ByConity 是 ByteHouse 的开源版,不纠结继续。

切换到测试库

已经贴心的创建好了库表,准备体验。

查看数据量

看数据量应该是 1T 的 tpcds 数据集。

指定方言

set dialect_type = 'ANSI';

如果写错了,也给了友好的提示,告诉用户支持 'MYSQL', 'ANSI', 'CLICKHOUSE' 三种

SQL 测试

先来个简单点的 q3

select  dt.d_year 
       ,item.i_brand_id brand_id 
       ,item.i_brand brand
       ,sum(ss_ext_sales_price) sum_agg
 from  date_dim dt 
      ,store_sales
      ,item
 where dt.d_date_sk = store_sales.ss_sold_date_sk
   and store_sales.ss_item_sk = item.i_item_sk
   and item.i_manufact_id = 128
   and dt.d_moy=11
 group by dt.d_year
      ,item.i_brand
      ,item.i_brand_id
 order by dt.d_year
         ,sum_agg desc
         ,brand_id
limit 100;

结果如下:

再来个 q1

with customer_total_return as
(
    select
        sr_customer_sk as ctr_customer_sk,
        sr_store_sk as ctr_store_sk
        ,sum(sr_return_amt) as ctr_total_return
    from store_returns, date_dim
    where sr_returned_date_sk = d_date_sk and d_year = 2000
    group by sr_customer_sk,sr_store_sk)
select  c_customer_id
from customer_total_return ctr1, store, customer
where ctr1.ctr_total_return > (
    select avg(ctr_total_return) *1.2
    from customer_total_return ctr2
    where ctr1.ctr_store_sk = ctr2.ctr_store_sk
)
and s_store_sk = ctr1.ctr_store_sk
and s_state = 'TN'
and ctr1.ctr_customer_sk = c_customer_sk
order by c_customer_id
limit 100;

结果如下:

居然报错了!!!

当然 ByConity 不会出现这么低级的问题,是因为 dialect_type 默认是 CLICKHOUSE

是因为重新登录没有修改 dialect_type 导致的,改成 ANSI,正常运行,结果如下:

再来个复杂的 q78

with ws as
        (select d_year AS ws_sold_year, ws_item_sk,
        ws_bill_customer_sk ws_customer_sk,
        sum(ws_quantity) ws_qty,
        sum(ws_wholesale_cost) ws_wc,
        sum(ws_sales_price) ws_sp
        from web_sales
        left join web_returns on wr_order_number=ws_order_number and ws_item_sk=wr_item_sk
        join date_dim on ws_sold_date_sk = d_date_sk
        where wr_order_number is null
        group by d_year, ws_item_sk, ws_bill_customer_sk
        ),
        cs as
        (select d_year AS cs_sold_year, cs_item_sk,
        cs_bill_customer_sk cs_customer_sk,
        sum(cs_quantity) cs_qty,
        sum(cs_wholesale_cost) cs_wc,
        sum(cs_sales_price) cs_sp
        from catalog_sales
        left join catalog_returns on cr_order_number=cs_order_number and cs_item_sk=cr_item_sk
        join date_dim on cs_sold_date_sk = d_date_sk
        where cr_order_number is null
        group by d_year, cs_item_sk, cs_bill_customer_sk
        ),
        ss as
        (select d_year AS ss_sold_year, ss_item_sk,
        ss_customer_sk,
        sum(ss_quantity) ss_qty,
        sum(ss_wholesale_cost) ss_wc,
        sum(ss_sales_price) ss_sp
        from store_sales
        left join store_returns on sr_ticket_number=ss_ticket_number and ss_item_sk=sr_item_sk
        join date_dim on ss_sold_date_sk = d_date_sk
        where sr_ticket_number is null
        group by d_year, ss_item_sk, ss_customer_sk
        )
        select
        ss_sold_year, ss_item_sk, ss_customer_sk,
        round(ss_qty/(coalesce(ws_qty,0)+coalesce(cs_qty,0)),2) ratio,
        ss_qty store_qty, ss_wc store_wholesale_cost, ss_sp store_sales_price,
        coalesce(ws_qty,0)+coalesce(cs_qty,0) other_chan_qty,
        coalesce(ws_wc,0)+coalesce(cs_wc,0) other_chan_wholesale_cost,
        coalesce(ws_sp,0)+coalesce(cs_sp,0) other_chan_sales_price
        from ss
        left join ws on (ws_sold_year=ss_sold_year and ws_item_sk=ss_item_sk and ws_customer_sk=ss_customer_sk)
        left join cs on (cs_sold_year=ss_sold_year and cs_item_sk=ss_item_sk and cs_customer_sk=ss_customer_sk)
        where (coalesce(ws_qty,0)>0 or coalesce(cs_qty, 0)>0) and ss_sold_year=2000
        order by
        ss_sold_year, ss_item_sk, ss_customer_sk,
        ss_qty desc, ss_wc desc, ss_sp desc,
        other_chan_qty,
        other_chan_wholesale_cost,
        other_chan_sales_price,
        ratio
        LIMIT 100;

结果如下:

内存超限了。。。

ByConity 在这种场景下提供了两个参数来解决这个问题

在 q78 最后加上

SETTINGS bsp_mode = 1, distributed_max_parallel_size = 12;

其中参数distributed_max_parallel_size可以设置为 4 的其他整数倍(因为 Worker 的数量为4)

再次执行

成功执行。

这里提高了并发,并减少了内存使用,但是在执行时候看不到 CPU 和内存的使用情况,可以像 clickhouse 一样在客户端显示实时使用情况,例如:

在 tpcds 测试中,除了上述 SQL ,还有 q64、q67、q68、q75、q79 也都是很复杂的。

尝试跑下 q64

with cs_ui as
        (select cs_item_sk
        ,sum(cs_ext_list_price) as sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund
        from catalog_sales
        ,catalog_returns
        where cs_item_sk = cr_item_sk
        and cs_order_number = cr_order_number
        group by cs_item_sk
        having sum(cs_ext_list_price)>2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit)),
        cross_sales as
        (select i_product_name product_name
        ,i_item_sk item_sk
        ,s_store_name store_name
        ,s_zip store_zip
        ,ad1.ca_street_number b_street_number
        ,ad1.ca_street_name b_street_name
        ,ad1.ca_city b_city
        ,ad1.ca_zip b_zip
        ,ad2.ca_street_number c_street_number
        ,ad2.ca_street_name c_street_name
        ,ad2.ca_city c_city
        ,ad2.ca_zip c_zip
        ,d1.d_year as syear
        ,d2.d_year as fsyear
        ,d3.d_year s2year
        ,count(*) cnt
        ,sum(ss_wholesale_cost) s1
        ,sum(ss_list_price) s2
        ,sum(ss_coupon_amt) s3
        FROM   store_sales
        ,store_returns
        ,cs_ui
        ,date_dim d1
        ,date_dim d2
        ,date_dim d3
        ,store
        ,customer
        ,customer_demographics cd1
        ,customer_demographics cd2
        ,promotion
        ,household_demographics hd1
        ,household_demographics hd2
        ,customer_address ad1
        ,customer_address ad2
        ,income_band ib1
        ,income_band ib2
        ,item
        WHERE  ss_store_sk = s_store_sk AND
        ss_sold_date_sk = d1.d_date_sk AND
        ss_customer_sk = c_customer_sk AND
        ss_cdemo_sk= cd1.cd_demo_sk AND
        ss_hdemo_sk = hd1.hd_demo_sk AND
        ss_addr_sk = ad1.ca_address_sk and
        ss_item_sk = i_item_sk and
        ss_item_sk = sr_item_sk and
        ss_ticket_number = sr_ticket_number and
        ss_item_sk = cs_ui.cs_item_sk and
        c_current_cdemo_sk = cd2.cd_demo_sk AND
        c_current_hdemo_sk = hd2.hd_demo_sk AND
        c_current_addr_sk = ad2.ca_address_sk and
        c_first_sales_date_sk = d2.d_date_sk and
        c_first_shipto_date_sk = d3.d_date_sk and
        ss_promo_sk = p_promo_sk and
        hd1.hd_income_band_sk = ib1.ib_income_band_sk and
        hd2.hd_income_band_sk = ib2.ib_income_band_sk and
        cd1.cd_marital_status <> cd2.cd_marital_status and
        i_color in ('purple','burlywood','indian','spring','floral','medium') and
        i_current_price between 64 and 64 + 10 and
        i_current_price between 64 + 1 and 64 + 15
        group by i_product_name
        ,i_item_sk
        ,s_store_name
        ,s_zip
        ,ad1.ca_street_number
        ,ad1.ca_street_name
        ,ad1.ca_city
        ,ad1.ca_zip
        ,ad2.ca_street_number
        ,ad2.ca_street_name
        ,ad2.ca_city
        ,ad2.ca_zip
        ,d1.d_year
        ,d2.d_year
        ,d3.d_year
        )
        select cs1.product_name
        ,cs1.store_name
        ,cs1.store_zip
        ,cs1.b_street_number
        ,cs1.b_street_name
        ,cs1.b_city
        ,cs1.b_zip
        ,cs1.c_street_number
        ,cs1.c_street_name
        ,cs1.c_city
        ,cs1.c_zip
        ,cs1.syear
        ,cs1.cnt
        ,cs1.s1 as s11
        ,cs1.s2 as s21
        ,cs1.s3 as s31
        ,cs2.s1 as s12
        ,cs2.s2 as s22
        ,cs2.s3 as s32
        ,cs2.syear
        ,cs2.cnt
        from cross_sales cs1,cross_sales cs2
        where cs1.item_sk=cs2.item_sk and
        cs1.syear = 1999 and
        cs2.syear = 1999 + 1 and
        cs2.cnt <= cs1.cnt and
        cs1.store_name = cs2.store_name and
        cs1.store_zip = cs2.store_zip
        order by cs1.product_name
        ,cs1.store_name
        ,cs2.cnt
        ,cs1.s1
        ,cs2.s1;

结果如下:

确实复杂,执行了 84s,看下执行计划

┌─explain────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Projection Est. 21719222 rows, cost 6.970330e+09                                                                                                                                                                                                           │
│ │     Expressions: b_city:=ca_city_2, b_street_name:=ca_street_name_2, b_street_number:=ca_street_number_2, b_zip:=ca_zip_2, c_city:=ca_city_3, c_street_name:=ca_street_name_3, c_street_number:=ca_street_number_3, c_zip:=ca_zip_3, cnt:=`expr#count()_1`, cnt_1:=`expr#count()_2`, product_name:=i_product_name_1, s11:=`expr#sum(ss_wholesale_cost)_1`, s12:=`expr#sum(ss_wholesale_cost)_2`, s21:=`expr#sum(ss_list_price)_1`, s22:=`expr#sum(ss_list_price)_2`, s31:=`expr#sum(ss_coupon_amt)_1`, s32:=`expr#sum(ss_coupon_amt)_2`, store_name:=s_store_name_2, store_zip:=s_zip_2, syear:=d_year_3, syear_1:=d_year_6 │
│ └─ Sorting Est. 21719222 rows, cost 6.968723e+09                                                                                                                                                                                                           │
│    │     Order by: {i_product_name_1 ASC NULLS LAST, s_store_name_2 ASC NULLS LAST, expr#count()_2 ASC NULLS LAST, expr#sum(ss_wholesale_cost)_1 ASC NULLS LAST, expr#sum(ss_wholesale_cost)_2 ASC NULLS LAST}                                             │
│    └─ Gather Exchange Est. 21719222 rows, cost 6.968723e+09                                                                                                                                                                                                │
│       └─ Sorting Est. 21719222 rows, cost 6.965248e+09                                                                                                                                                                                                     │
│          │     Order by: {i_product_name_1 ASC NULLS LAST, s_store_name_2 ASC NULLS LAST, expr#count()_2 ASC NULLS LAST, expr#sum(ss_wholesale_cost)_1 ASC NULLS LAST, expr#sum(ss_wholesale_cost)_2 ASC NULLS LAST}                                       │
│          └─ Inner Join Est. 21719222 rows, cost 6.965248e+09                                                                                                                                                                                               │
│             │     Condition: i_item_sk_2 == i_item_sk_1, s_store_name_2 == s_store_name_1, s_zip_2 == s_zip_1                                                                                                                                              │
│             │     Filter: `expr#count()_2` <= `expr#count()_1`                                                                                                                                                                                             │
│             ├─ Repartition Exchange Est. 154787 rows, cost 3.572475e+05                                                                                                                                                                                    │
│             │  │     Partition by: {i_item_sk_2, s_store_name_2, s_zip_2}                                                                                                                                                                                  │
│             │  └─ Filter Est. 77393 rows, cost 3.324814e+05                                                                                                                                                                                                │
│             │     │     Condition: d_year_6 = cast(2000, 'UInt32')                                                                                                                                                                                         │
│             │     └─ Buffer Est. 309573 rows, cost 3.095730e+05                                                                                                                                                                                            │
│             │        └─ CTERef[0] Est. 309573 rows, cost 3.095730e+05                                                                                                                                                                                      │
│             └─ Repartition Exchange Est. 154787 rows, cost 3.572475e+05                                                                                                                                                                                    │
│                │     Partition by: {i_item_sk_1, s_store_name_1, s_zip_1}                                                                                                                                                                                  │
│                └─ Filter Est. 77393 rows, cost 3.324814e+05                                                                                                                                                                                                │
│                   │     Condition: d_year_3 = 1999                                                                                                                                                                                                         │
│                   └─ Buffer Est. 309573 rows, cost 3.095730e+05                                                                                                                                                                                            │
│                      └─ CTERef[0] Est. 309573 rows, cost 3.095730e+05                                                                                                                                                                                      │
│ CTEDef [0]                                                                                                                                                                                                                                                 │
│    Projection Est. 309573 rows, cost 6.956253e+09                                                                                                                                                                                                          │
│    │     Expressions: [ca_city, ca_city_1, ca_street_name, ca_street_name_1, ca_street_number, ca_street_number_1, ca_zip, ca_zip_1, d_year, expr#count(), expr#sum(ss_coupon_amt), expr#sum(ss_list_price), expr#sum(ss_wholesale_cost), i_product_name, s_store_name, s_zip], i_item_sk:=ss_item_sk │
│    └─ MergingAggregated Est. 309573 rows, cost 6.956230e+09                                                                                                                                                                                                │
│       └─ Repartition Exchange Est. 364649 rows, cost 6.956230e+09                                                                                                                                                                                          │
│          │     Partition by: {i_product_name, ss_item_sk, s_store_name, s_zip, ca_street_number, ca_street_name, ca_city, ca_zip, ca_street_number_1, ca_street_name_1, ca_city_1, ca_zip_1, d_year, d_year_1, d_year_2}                                   │
│          └─ Aggregating Est. 364649 rows, cost 6.956171e+09                                                                                                                                                                                                │
│             │     Group by: {i_product_name, ss_item_sk, s_store_name, s_zip, ca_street_number, ca_street_name, ca_city, ca_zip, ca_street_number_1, ca_street_name_1, ca_city_1, ca_zip_1, d_year, d_year_1, d_year_2}                                    │
│             │     Aggregates: expr#count():=AggNull(count)(), expr#sum(ss_wholesale_cost):=AggNull(sum)(ss_wholesale_cost), expr#sum(ss_list_price):=AggNull(sum)(ss_list_price), expr#sum(ss_coupon_amt):=AggNull(sum)(ss_coupon_amt)                     │
│             └─ Inner Join Est. 364649 rows, cost 6.956171e+09                                                                                                                                                                                              │
│                │     Condition: ca_address_sk_1 == c_current_addr_sk                                                                                                                                                                                       │
│                │     Runtime Filters Builder: {c_current_addr_sk}                                                                                                                                                                                          │
│                ├─ Repartition Exchange Est. 6000000 rows, cost 5.844000e+06                                                                                                                                                                                │
│                │  │     Partition by: {ca_address_sk_1}                                                                                                                                                                                                    │
│                │  └─ Filter Est. 6000000 rows, cost 4.884000e+06                                                                                                                                                                                           │
│                │     │     Condition: Runtime Filters: {ca_address_sk_1}                                                                                                                                                                                   │
│                │     └─ TableScan test_elt.customer_address Est. 6000000 rows, cost 4.440000e+06                                                                                                                                                           │
│                │              Where: Runtime Filters: {ca_address_sk}                                                                                                                                                                                      │
│                │              Outputs: ca_address_sk_1:=ca_address_sk, ca_street_number_1:=ca_street_number, ca_street_name_1:=ca_street_name, ca_city_1:=ca_city, ca_zip_1:=ca_zip                                                                        │
│                └─ Repartition Exchange Est. 364649 rows, cost 6.947531e+09                                                                                                                                                                                 │
│                   │     Partition by: {c_current_addr_sk}                                                                                                                                                                                                  │
│                   └─ Inner Join Est. 355608 rows, cost 6.947473e+09                                                                                                                                                                                        │
│                      │     Condition: ca_address_sk == ss_addr_sk                                                                                                                                                                                          │
│                      │     Runtime Filters Builder: {ss_addr_sk}                                                                                                                                                                                           │
│                      ├─ Repartition Exchange Est. 6000000 rows, cost 5.844000e+06                                                                                                                                                                          │
│                      │  │     Partition by: {ca_address_sk}                                                                                                                                                                                                │
│                      │  └─ Filter Est. 6000000 rows, cost 4.884000e+06                                                                                                                                                                                     │
│                      │     │     Condition: Runtime Filters: {ca_address_sk}                                                                                                                                                                               │
│                      │     └─ TableScan test_elt.customer_address Est. 6000000 rows, cost 4.440000e+06                                                                                                                                                     │
│                      │              Where: Runtime Filters: {ca_address_sk}                                                                                                                                                                                │
│                      │              Outputs: [ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip]                                                                                                                                            │
│                      └─ Repartition Exchange Est. 355608 rows, cost 6.938847e+09                                                                                                                                                                           │
│                         │     Partition by: {ss_addr_sk}                                                                                                                                                                                                   │
│                         └─ Inner Join Est. 354159 rows, cost 6.938790e+09                                                                                                                                                                                  │
│                            │     Condition: cd_demo_sk_1 == c_current_cdemo_sk                                                                                                                                                                             │
│                            │     Filter: cd_marital_status != cd_marital_status_1                                                                                                                                                                          │
│                            │     Runtime Filters Builder: {c_current_cdemo_sk}                                                                                                                                                                             │
│                            ├─ Repartition Exchange Est. 1920800 rows, cost 1.870859e+06                                                                                                                                                                    │
│                            │  │     Partition by: {cd_demo_sk_1}                                                                                                                                                                                           │
│                            │  └─ Filter Est. 1920800 rows, cost 1.563531e+06                                                                                                                                                                               │
│                            │     │     Condition: Runtime Filters: {cd_demo_sk_1}                                                                                                                                                                          │
│                            │     └─ TableScan test_elt.customer_demographics Est. 1920800 rows, cost 1.421392e+06                                                                                                                                          │
│                            │              Where: Runtime Filters: {cd_demo_sk}                                                                                                                                                                             │
│                            │              Outputs: cd_demo_sk_1:=cd_demo_sk, cd_marital_status_1:=cd_marital_status                                                                                                                                        │
│                            └─ Repartition Exchange Est. 708317 rows, cost 6.935220e+09                                                                                                                                                                     │
│                               │     Partition by: {c_current_cdemo_sk}                                                                                                                                                                                     │
│                               └─ Inner Join Est. 706525 rows, cost 6.935107e+09                                                                                                                                                                            │
│                                  │     Condition: cd_demo_sk == ss_cdemo_sk                                                                                                                                                                                │
│                                  │     Runtime Filters Builder: {ss_cdemo_sk}                                                                                                                                                                              │
│                                  ├─ Repartition Exchange Est. 1920800 rows, cost 1.870859e+06                                                                                                                                                              │
│                                  │  │     Partition by: {cd_demo_sk}                                                                                                                                                                                       │
│                                  │  └─ Filter Est. 1920800 rows, cost 1.563531e+06                                                                                                                                                                         │
│                                  │     │     Condition: Runtime Filters: {cd_demo_sk}                                                                                                                                                                      │
│                                  │     └─ TableScan test_elt.customer_demographics Est. 1920800 rows, cost 1.421392e+06                                                                                                                                    │
│                                  │              Where: Runtime Filters: {cd_demo_sk}                                                                                                                                                                       │
│                                  │              Outputs: [cd_demo_sk, cd_marital_status]                                                                                                                                                                   │
│                                  └─ Repartition Exchange Est. 706525 rows, cost 6.931409e+09                                                                                                                                                               │
│                                     │     Partition by: {ss_cdemo_sk}                                                                                                                                                                                      │
│                                     └─ Inner Join Est. 706525 rows, cost 6.931296e+09                                                                                                                                                                      │
│                                        │     Condition: c_first_shipto_date_sk == d_date_sk_2                                                                                                                                                              │
│                                        ├─ Repartition Exchange Est. 706525 rows, cost 6.930619e+09                                                                                                                                                         │
│                                        │  │     Partition by: {c_first_shipto_date_sk}                                                                                                                                                                     │
│                                        │  └─ Inner Join Est. 706525 rows, cost 6.930506e+09                                                                                                                                                                │
│                                        │     │     Condition: c_first_sales_date_sk == d_date_sk_1                                                                                                                                                         │
│                                        │     ├─ Repartition Exchange Est. 706525 rows, cost 6.929829e+09                                                                                                                                                   │
│                                        │     │  │     Partition by: {c_first_sales_date_sk}                                                                                                                                                                │
│                                        │     │  └─ Inner Join Est. 706525 rows, cost 6.929716e+09                                                                                                                                                          │
│                                        │     │     │     Condition: hd_income_band_sk_1 == ib_income_band_sk_1                                                                                                                                             │
│                                        │     │     ├─ Inner Join Est. 706525 rows, cost 6.929193e+09                                                                                                                                                       │
│                                        │     │     │  │     Condition: c_current_hdemo_sk == hd_demo_sk_1                                                                                                                                                  │
│                                        │     │     │  ├─ Inner Join Est. 706525 rows, cost 6.928626e+09                                                                                                                                                    │
│                                        │     │     │  │  │     Condition: c_customer_sk == ss_customer_sk                                                                                                                                                  │
│                                        │     │     │  │  │     Runtime Filters Builder: {ss_customer_sk}                                                                                                                                                   │
│                                        │     │     │  │  ├─ Repartition Exchange Est. 12000000 rows, cost 1.168800e+07                                                                                                                                     │
│                                        │     │     │  │  │  │     Partition by: {c_customer_sk}                                                                                                                                                            │
│                                        │     │     │  │  │  └─ Filter Est. 12000000 rows, cost 9.768000e+06                                                                                                                                                │
│                                        │     │     │  │  │     │     Condition: Runtime Filters: {c_customer_sk}                                                                                                                                           │
│                                        │     │     │  │  │     └─ TableScan test_elt.customer Est. 12000000 rows, cost 8.880000e+06                                                                                                                        │
│                                        │     │     │  │  │              Where: Runtime Filters: {c_customer_sk}                                                                                                                                            │
│                                        │     │     │  │  │              Outputs: [c_customer_sk, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk, c_first_shipto_date_sk, c_first_sales_date_sk]                                                 │
│                                        │     │     │  │  └─ Repartition Exchange Est. 703646 rows, cost 6.911385e+09                                                                                                                                       │
│                                        │     │     │  │     │     Partition by: {ss_customer_sk}                                                                                                                                                           │
│                                        │     │     │  │     └─ Inner Join Est. 703646 rows, cost 6.911272e+09                                                                                                                                              │
│                                        │     │     │  │        │     Condition: ss_store_sk == s_store_sk                                                                                                                                                  │
│                                        │     │     │  │        ├─ Inner Join Est. 703646 rows, cost 6.910745e+09                                                                                                                                           │
│                                        │     │     │  │        │  │     Condition: hd_income_band_sk == ib_income_band_sk                                                                                                                                  │
│                                        │     │     │  │        │  ├─ Inner Join Est. 703646 rows, cost 6.910224e+09                                                                                                                                        │
│                                        │     │     │  │        │  │  │     Condition: ss_hdemo_sk == hd_demo_sk                                                                                                                                            │
│                                        │     │     │  │        │  │  ├─ Inner Join Est. 703646 rows, cost 6.909659e+09                                                                                                                                     │
│                                        │     │     │  │        │  │  │  │     Condition: ss_promo_sk == p_promo_sk                                                                                                                                         │
│                                        │     │     │  │        │  │  │  ├─ Inner Join Est. 709321 rows, cost 6.909127e+09                                                                                                                                  │
│                                        │     │     │  │        │  │  │  │  │     Condition: ss_sold_date_sk == d_date_sk                                                                                                                                   │
│                                        │     │     │  │        │  │  │  │  │     Runtime Filters Builder: {d_date_sk}                                                                                                                                      │
│                                        │     │     │  │        │  │  │  │  ├─ Inner Join Est. 1778671 rows, cost 6.908143e+09                                                                                                                              │
│                                        │     │     │  │        │  │  │  │  │  │     Condition: ss_item_sk == sr_item_sk, ss_ticket_number == sr_ticket_number                                                                                              │
│                                        │     │     │  │        │  │  │  │  │  │     Runtime Filters Builder: {sr_item_sk,sr_ticket_number}                                                                                                                 │
│                                        │     │     │  │        │  │  │  │  │  ├─ Filter Est. 2879987999 rows, cost 2.344310e+09                                                                                                                            │
│                                        │     │     │  │        │  │  │  │  │  │  │     Condition: Runtime Filters: {ss_item_sk, ss_sold_date_sk, ss_ticket_number}                                                                                         │
│                                        │     │     │  │        │  │  │  │  │  │  └─ TableScan test_elt.store_sales Est. 2879987999 rows, cost 2.131191e+09                                                                                                 │
│                                        │     │     │  │        │  │  │  │  │  │           Where: Runtime Filters: {ss_item_sk, ss_sold_date_sk, ss_ticket_number}                                                                                          │
│                                        │     │     │  │        │  │  │  │  │  │           Outputs: [ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_wholesale_cost, ss_list_price, ss_coupon_amt] │
│                                        │     │     │  │        │  │  │  │  │  └─ Inner Join Est. 185279 rows, cost 3.497355e+09                                                                                                                            │
│                                        │     │     │  │        │  │  │  │  │     │     Condition: sr_item_sk == cs_item_sk                                                                                                                                 │
│                                        │     │     │  │        │  │  │  │  │     │     Runtime Filters Builder: {cs_item_sk}                                                                                                                               │
│                                        │     │     │  │        │  │  │  │  │     ├─ Filter Est. 287999764 rows, cost 2.344318e+08                                                                                                                          │
│                                        │     │     │  │        │  │  │  │  │     │  │     Condition: Runtime Filters: {sr_item_sk}                                                                                                                         │
│                                        │     │     │  │        │  │  │  │  │     │  └─ TableScan test_elt.store_returns Est. 287999764 rows, cost 2.131198e+08                                                                                             │
│                                        │     │     │  │        │  │  │  │  │     │           Where: Runtime Filters: {sr_item_sk}                                                                                                                          │
│                                        │     │     │  │        │  │  │  │  │     │           Outputs: [sr_item_sk, sr_ticket_number]                                                                                                                       │
│                                        │     │     │  │        │  │  │  │  │     └─ Inner Join Est. 195 rows, cost 3.156294e+09                                                                                                                            │
│                                        │     │     │  │        │  │  │  │  │        │     Condition: cs_item_sk == i_item_sk                                                                                                                               │
│                                        │     │     │  │        │  │  │  │  │        │     Runtime Filters Builder: {i_item_sk}                                                                                                                             │
│                                        │     │     │  │        │  │  │  │  │        ├─ Projection Est. 75000 rows, cost 3.156022e+09                                                                                                                       │
│                                        │     │     │  │        │  │  │  │  │        │  │     Expressions: [cs_item_sk]                                                                                                                                     │
│                                        │     │     │  │        │  │  │  │  │        │  └─ Filter Est. 75000 rows, cost 3.156016e+09                                                                                                                        │
│                                        │     │     │  │        │  │  │  │  │        │     │     Condition: `expr#sum(cs_ext_list_price)` > (2 * `expr#sum(plus(plus(cr_refunded_cash, cr_reversed_charge), cr_store_credit))`)                             │
│                                        │     │     │  │        │  │  │  │  │        │     └─ Aggregating Est. 300000 rows, cost 3.155994e+09                                                                                                               │
│                                        │     │     │  │        │  │  │  │  │        │        │     Group by: {cs_item_sk}                                                                                                                                  │
│                                        │     │     │  │        │  │  │  │  │        │        │     Aggregates: expr#sum(cs_ext_list_price):=AggNull(sum)(cs_ext_list_price), expr#sum(plus(plus(cr_refunded_cash, cr_reversed_charge), cr_store_credit)):=AggNull(sum)(expr#plus(plus(cr_refunded_cash, cr_reversed_charge), cr_store_credit)) │
│                                        │     │     │  │        │  │  │  │  │        │        └─ Projection Est. 978035432 rows, cost 2.430634e+09                                                                                                          │
│                                        │     │     │  │        │  │  │  │  │        │           │     Expressions: [cs_ext_list_price, cs_item_sk], expr#plus(plus(cr_refunded_cash, cr_reversed_charge), cr_store_credit):=(cr_refunded_cash + cr_reversed_charge) + cr_store_credit │
│                                        │     │     │  │        │  │  │  │  │        │           └─ Inner (PARALLEL_HASH) Join Est. 978035432 rows, cost 2.358259e+09                                                                                       │
│                                        │     │     │  │        │  │  │  │  │        │              │     Condition: cs_item_sk == cr_item_sk, cs_order_number == cr_order_number                                                                           │
│                                        │     │     │  │        │  │  │  │  │        │              │     Runtime Filters Builder: {cr_order_number}                                                                                                        │
│                                        │     │     │  │        │  │  │  │  │        │              ├─ Filter Est. 1439980416 rows, cost 1.172144e+09                                                                                                       │
│                                        │     │     │  │        │  │  │  │  │        │              │  │     Condition: Runtime Filters: {cs_item_sk, cs_order_number}                                                                                      │
│                                        │     │     │  │        │  │  │  │  │        │              │  └─ TableScan test_elt.catalog_sales Est. 1439980416 rows, cost 1.065586e+09                                                                          │
│                                        │     │     │  │        │  │  │  │  │        │              │           Where: Runtime Filters: {cs_item_sk, cs_order_number}                                                                                       │
│                                        │     │     │  │        │  │  │  │  │        │              │           Outputs: [cs_item_sk, cs_order_number, cs_ext_list_price]                                                                                   │
│                                        │     │     │  │        │  │  │  │  │        │              └─ Filter Est. 143996756 rows, cost 1.172134e+08                                                                                                        │
│                                        │     │     │  │        │  │  │  │  │        │                 │     Condition: Runtime Filters: {cr_item_sk}                                                                                                       │
│                                        │     │     │  │        │  │  │  │  │        │                 └─ TableScan test_elt.catalog_returns Est. 143996756 rows, cost 1.065576e+08                                                                         │
│                                        │     │     │  │        │  │  │  │  │        │                          Where: Runtime Filters: {cr_item_sk}                                                                                                        │
│                                        │     │     │  │        │  │  │  │  │        │                          Outputs: [cr_item_sk, cr_order_number, cr_refunded_cash, cr_reversed_charge, cr_store_credit]                                               │
│                                        │     │     │  │        │  │  │  │  │        └─ Projection Est. 195 rows, cost 2.442144e+05                                                                                                                         │
│                                        │     │     │  │        │  │  │  │  │           │     Expressions: [i_item_sk, i_product_name]                                                                                                                      │
│                                        │     │     │  │        │  │  │  │  │           └─ Filter Est. 195 rows, cost 2.442000e+05                                                                                                                          │
│                                        │     │     │  │        │  │  │  │  │              │     Condition: (i_color IN ('burlywood', 'floral', 'indian', 'medium', 'purple', 'spring')) AND (i_current_price >= 65.) AND (i_current_price <= 74.)          │
│                                        │     │     │  │        │  │  │  │  │              └─ TableScan test_elt.item Est. 300000 rows, cost 2.220000e+05                                                                                                   │
│                                        │     │     │  │        │  │  │  │  │                       Where: (i_color IN ('burlywood', 'floral', 'indian', 'medium', 'purple', 'spring')) AND (i_current_price >= 65.) AND (i_current_price <= 74.)           │
│                                        │     │     │  │        │  │  │  │  │                       Outputs: [i_item_sk, i_current_price, i_color, i_product_name]                                                                                          │
│                                        │     │     │  │        │  │  │  │  └─ Broadcast Exchange Est. 729 rows, cost 5.992909e+04                                                                                                                          │
│                                        │     │     │  │        │  │  │  │     └─ Filter Est. 729 rows, cost 5.946189e+04                                                                                                                                   │
│                                        │     │     │  │        │  │  │  │        │     Condition: (d_year = 1999) OR (d_year = cast(2000, 'UInt32'))                                                                                                       │
│                                        │     │     │  │        │  │  │  │        └─ TableScan test_elt.date_dim Est. 73049 rows, cost 5.405626e+04                                                                                                         │
│                                        │     │     │  │        │  │  │  │                 Where: (d_year = 1999) OR (d_year = cast(2000, 'UInt32'))                                                                                                        │
│                                        │     │     │  │        │  │  │  │                 Outputs: [d_date_sk, d_year]                                                                                                                                     │
│                                        │     │     │  │        │  │  │  └─ Broadcast Exchange Est. 1500 rows, cost 2.070640e+03                                                                                                                            │
│                                        │     │     │  │        │  │  │     └─ TableScan test_elt.promotion Est. 1500 rows, cost 1.110000e+03                                                                                                               │
│                                        │     │     │  │        │  │  │              Outputs: [p_promo_sk]                                                                                                                                                  │
│                                        │     │     │  │        │  │  └─ Broadcast Exchange Est. 7200 rows, cost 9.936640e+03                                                                                                                               │
│                                        │     │     │  │        │  │     └─ TableScan test_elt.household_demographics Est. 7200 rows, cost 5.328000e+03                                                                                                     │
│                                        │     │     │  │        │  │              Outputs: [hd_demo_sk, hd_income_band_sk]                                                                                                                                  │
│                                        │     │     │  │        │  └─ Broadcast Exchange Est. 20 rows, cost 2.824000e+01                                                                                                                                    │
│                                        │     │     │  │        │     └─ TableScan test_elt.income_band Est. 20 rows, cost 1.480000e+01                                                                                                                     │
│                                        │     │     │  │        │              Outputs: [ib_income_band_sk]                                                                                                                                                 │
│                                        │     │     │  │        └─ Broadcast Exchange Est. 1002 rows, cost 1.383400e+03                                                                                                                                     │
│                                        │     │     │  │           └─ TableScan test_elt.store Est. 1002 rows, cost 7.414800e+02                                                                                                                            │
│                                        │     │     │  │                    Outputs: [s_store_sk, s_store_name, s_zip]                                                                                                                                      │
│                                        │     │     │  └─ Broadcast Exchange Est. 7200 rows, cost 9.936640e+03                                                                                                                                              │
│                                        │     │     │     └─ TableScan test_elt.household_demographics Est. 7200 rows, cost 5.328000e+03                                                                                                                    │
│                                        │     │     │              Outputs: hd_demo_sk_1:=hd_demo_sk, hd_income_band_sk_1:=hd_income_band_sk                                                                                                                │
│                                        │     │     └─ Broadcast Exchange Est. 20 rows, cost 2.824000e+01                                                                                                                                                   │
│                                        │     │        └─ TableScan test_elt.income_band Est. 20 rows, cost 1.480000e+01                                                                                                                                    │
│                                        │     │                 Outputs: ib_income_band_sk_1:=ib_income_band_sk                                                                                                                                             │
│                                        │     └─ Repartition Exchange Est. 73049 rows, cost 6.574434e+04                                                                                                                                                    │
│                                        │        │     Partition by: {d_date_sk_1}                                                                                                                                                                          │
│                                        │        └─ TableScan test_elt.date_dim Est. 73049 rows, cost 5.405626e+04                                                                                                                                          │
│                                        │                 Outputs: d_date_sk_1:=d_date_sk, d_year_1:=d_year                                                                                                                                                 │
│                                        └─ Repartition Exchange Est. 73049 rows, cost 6.574434e+04                                                                                                                                                          │
│                                           │     Partition by: {d_date_sk_2}                                                                                                                                                                                │
│                                           └─ TableScan test_elt.date_dim Est. 73049 rows, cost 5.405626e+04                                                                                                                                                │
│                                                    Outputs: d_date_sk_2:=d_date_sk, d_year_2:=d_year                                                                                                                                                       │
│ note: Runtime Filter is applied for 12 times.                                                                                                                                                                                                              │
│ note: CTE(Common Table Expression) is applied for 2 times.                                                                                                                                                                                                 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

执行计划看着就很复杂,不过 ByConity 支持了 RBO + CBO 优化,以及复杂查询,包括 CTE 的优化,可以生成分布式查询计划,不过把 CTE 计划放在主计划下面还是和其他的优化器做法有点不一样。

下面通过减小内存让 q64 OOM,看看 bsp 可不可以让小内存场景依然正常执行

在 q64 后加上以下参数

SETTINGS max_memory_usage=300000000;

执行结果如下:

再加上 bsp 配置参数

SETTINGS max_memory_usage=300000000, bsp_mode = 1, distributed_max_parallel_size = 12;

再次执行结果如下:

虽然这两次都执行出了结果,但是一个是 11492 行,一个是 2 行,暂时不知道是配置没有对还是有了 bug。

总结

总体测试体验还是不错的,bsp 确实能在资源有限的情况下一定程度的解决大查询或复杂查询的问题,但是在想构造一个 OOM 且 bsp 还能运行的参数确实有点难,尝试了好多次,建议可以在查询失败时给出建议,甚至能够自适应开启 bsp 并设置并行度,让用户收到 OOM 时就知道这个 SQL 确实是不能跑了。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

一只努力的微服务

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值