[chore](routine-load) increase routine load job default max batch size and rows #36632

Merged
merged 1 commit into from
Jun 21, 2024

sollhui
Contributor

@sollhui sollhui commented Jun 20, 2024

Most users only care about max_batch_interval, but to get the intended interval behavior they also have to configure max_batch_rows and max_batch_size to suit the characteristics of their data. This PR increases the default values of these two properties so that, in most scenarios, users no longer need to configure them.
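
To illustrate why the two limits matter, here is a minimal sketch. It assumes that a batch ends as soon as any one of the three thresholds is reached. The function name, the ingest rate, and every threshold value below are hypothetical and chosen only for illustration; they are not taken from this PR.

```python
def first_limit_hit(rows_per_sec, bytes_per_row, max_batch_interval,
                    max_batch_rows, max_batch_size):
    """Return (limit_name, seconds) for the threshold that ends a batch first.

    Assumption: a batch closes when ANY of the three limits is reached.
    All inputs are illustrative, not real defaults.
    """
    seconds_until = {
        # Time limit, in seconds.
        "max_batch_interval": max_batch_interval,
        # Seconds needed to accumulate max_batch_rows rows.
        "max_batch_rows": max_batch_rows / rows_per_sec,
        # Seconds needed to accumulate max_batch_size bytes.
        "max_batch_size": max_batch_size / (rows_per_sec * bytes_per_row),
    }
    name = min(seconds_until, key=seconds_until.get)
    return name, seconds_until[name]


# Hypothetical stream: 100,000 rows/s at 100 bytes per row, 10 s interval.
# With small row/size limits the row limit cuts the batch short:
print(first_limit_hit(100_000, 100, 10, 200_000, 100 * 1024 * 1024))
# → ('max_batch_rows', 2.0)

# With much larger row/size limits the interval is what ends the batch:
print(first_limit_hit(100_000, 100, 10, 20_000_000, 1024 ** 3))
# → ('max_batch_interval', 10)
```

Under these assumptions, larger row and size limits leave max_batch_interval as the only setting a typical user has to think about.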

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the documentation has been moved to doris-website.
See Doris Document.

@sollhui
Contributor Author

sollhui commented Jun 20, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 39499 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ff45493e587341345f5597b29770a68af355786d, data reload: false

------ Round 1 ----------------------------------
q1	17639	4309	4227	4227
q2	2023	193	194	193
q3	10443	1144	1098	1098
q4	10192	844	905	844
q5	7451	2664	2713	2664
q6	216	134	136	134
q7	948	596	593	593
q8	9217	2030	2070	2030
q9	8891	6475	6395	6395
q10	8799	3688	3719	3688
q11	452	235	234	234
q12	404	243	229	229
q13	17883	2996	2965	2965
q14	256	217	220	217
q15	513	494	474	474
q16	525	381	376	376
q17	956	653	731	653
q18	7939	7367	7288	7288
q19	7310	1393	1503	1393
q20	670	328	319	319
q21	4884	3146	3902	3146
q22	400	355	339	339
Total cold run time: 118011 ms
Total hot run time: 39499 ms
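
The bot output does not label its columns. The sketch below assumes the four numbers per query are: one cold run, two hot runs, and the faster of the two hot runs. That reading is inferred from the numbers, not documented here. Under this assumption the Round 1 totals can be recomputed from the rows; the helper name `summarize` is hypothetical.

```python
# Round 1 rows copied from the table above (times in ms).
ROUND_1 = """
q1 17639 4309 4227 4227
q2 2023 193 194 193
q3 10443 1144 1098 1098
q4 10192 844 905 844
q5 7451 2664 2713 2664
q6 216 134 136 134
q7 948 596 593 593
q8 9217 2030 2070 2030
q9 8891 6475 6395 6395
q10 8799 3688 3719 3688
q11 452 235 234 234
q12 404 243 229 229
q13 17883 2996 2965 2965
q14 256 217 220 217
q15 513 494 474 474
q16 525 381 376 376
q17 956 653 731 653
q18 7939 7367 7288 7288
q19 7310 1393 1503 1393
q20 670 328 319 319
q21 4884 3146 3902 3146
q22 400 355 339 339
"""


def summarize(table):
    """Return (cold_total, hot_total) under the assumed column meaning."""
    cold_total = 0
    hot_total = 0
    for line in table.strip().splitlines():
        _name, cold, hot_a, hot_b, best = line.split()
        cold, hot_a, hot_b, best = int(cold), int(hot_a), int(hot_b), int(best)
        # Assumed: the last column is the faster of the two hot runs.
        assert best == min(hot_a, hot_b)
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


print(summarize(ROUND_1))  # → (118011, 39499)
```

Both sums match the reported "Total cold run time" and "Total hot run time" for Round 1, which is consistent with the assumed column meaning.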

----- Round 2, with runtime_filter_mode=off -----
q1	4412	4251	4242	4242
q2	375	267	265	265
q3	3014	2869	2870	2869
q4	2006	1694	1732	1694
q5	5576	5488	5435	5435
q6	217	125	130	125
q7	2237	1889	1844	1844
q8	3251	3401	3445	3401
q9	8633	8734	8727	8727
q10	4039	3735	3718	3718
q11	597	500	490	490
q12	825	633	633	633
q13	16147	3181	3189	3181
q14	295	278	263	263
q15	522	456	490	456
q16	498	445	450	445
q17	1814	1513	1486	1486
q18	8100	7900	7641	7641
q19	1830	1608	1504	1504
q20	3089	1878	1852	1852
q21	5119	4867	4782	4782
q22	620	538	530	530
Total cold run time: 73216 ms
Total hot run time: 55583 ms

@doris-robot

TPC-DS: Total hot run time: 173416 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ff45493e587341345f5597b29770a68af355786d, data reload: false

query1	932	377	374	374
query2	6450	2343	2404	2343
query3	6628	205	205	205
query4	19356	17301	17211	17211
query5	3588	488	466	466
query6	238	161	166	161
query7	4584	296	293	293
query8	331	288	295	288
query9	8589	2419	2439	2419
query10	575	319	279	279
query11	10565	9977	10013	9977
query12	125	85	85	85
query13	1645	353	361	353
query14	10171	7149	7815	7149
query15	230	189	186	186
query16	7454	266	271	266
query17	1802	535	514	514
query18	1220	269	267	267
query19	201	170	151	151
query20	95	88	85	85
query21	205	147	126	126
query22	4198	4128	4015	4015
query23	33780	33541	33705	33541
query24	10999	2936	2871	2871
query25	602	376	376	376
query26	720	159	153	153
query27	2349	320	322	320
query28	6256	2118	2107	2107
query29	904	636	640	636
query30	256	155	157	155
query31	977	798	760	760
query32	93	55	56	55
query33	755	302	329	302
query34	897	489	473	473
query35	734	648	653	648
query36	1168	966	943	943
query37	138	78	71	71
query38	2972	2886	2819	2819
query39	881	842	848	842
query40	212	127	124	124
query41	55	46	43	43
query42	118	97	100	97
query43	591	554	565	554
query44	1154	736	734	734
query45	202	162	162	162
query46	1068	725	732	725
query47	1885	1764	1767	1764
query48	387	286	287	286
query49	838	398	407	398
query50	749	398	389	389
query51	6923	6706	6743	6706
query52	102	91	92	91
query53	363	284	291	284
query54	882	438	437	437
query55	74	76	71	71
query56	279	254	268	254
query57	1122	1022	1059	1022
query58	248	249	247	247
query59	3349	3156	3143	3143
query60	302	278	273	273
query61	96	93	98	93
query62	612	429	449	429
query63	315	285	293	285
query64	8576	2358	1708	1708
query65	3441	3097	3097	3097
query66	741	329	329	329
query67	15244	14941	14832	14832
query68	4596	543	538	538
query69	590	445	394	394
query70	1207	1157	1180	1157
query71	459	281	270	270
query72	7238	5330	5288	5288
query73	741	319	325	319
query74	5958	5457	5507	5457
query75	3356	2660	2674	2660
query76	2795	963	945	945
query77	590	297	310	297
query78	10424	9801	9664	9664
query79	2408	503	504	503
query80	1212	467	458	458
query81	583	225	221	221
query82	1409	107	101	101
query83	270	168	171	168
query84	241	83	85	83
query85	1282	286	280	280
query86	470	288	339	288
query87	3317	3104	3108	3104
query88	3854	2334	2340	2334
query89	478	376	394	376
query90	1704	191	189	189
query91	129	103	99	99
query92	57	49	55	49
query93	2224	510	498	498
query94	1047	189	188	188
query95	411	314	325	314
query96	582	266	263	263
query97	3207	3010	3032	3010
query98	214	201	201	201
query99	1112	850	838	838
Total cold run time: 267756 ms
Total hot run time: 173416 ms

@doris-robot

ClickBench: Total hot run time: 30.14 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ff45493e587341345f5597b29770a68af355786d, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.22	0.05	0.05
query4	1.68	0.07	0.07
query5	0.50	0.49	0.49
query6	1.13	0.74	0.71
query7	0.02	0.02	0.01
query8	0.06	0.04	0.04
query9	0.55	0.50	0.48
query10	0.54	0.53	0.56
query11	0.15	0.11	0.11
query12	0.14	0.12	0.12
query13	0.60	0.59	0.59
query14	0.77	0.76	0.80
query15	0.83	0.81	0.80
query16	0.36	0.35	0.36
query17	0.97	1.02	0.98
query18	0.20	0.27	0.22
query19	1.82	1.72	1.73
query20	0.02	0.01	0.01
query21	15.40	0.67	0.67
query22	4.62	7.42	1.53
query23	18.25	1.41	1.26
query24	2.08	0.23	0.23
query25	0.16	0.08	0.09
query26	0.28	0.18	0.18
query27	0.08	0.07	0.08
query28	13.35	1.02	1.00
query29	12.63	3.30	3.31
query30	0.25	0.06	0.06
query31	2.87	0.38	0.38
query32	3.27	0.46	0.48
query33	2.93	2.90	2.94
query34	17.14	4.43	4.40
query35	4.43	4.44	4.50
query36	0.66	0.46	0.46
query37	0.19	0.16	0.16
query38	0.16	0.15	0.16
query39	0.04	0.04	0.03
query40	0.19	0.14	0.14
query41	0.10	0.05	0.05
query42	0.05	0.05	0.04
query43	0.05	0.04	0.04
Total cold run time: 109.86 s
Total hot run time: 30.14 s

@sollhui
Contributor Author

sollhui commented Jun 21, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 40019 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 378ca94ef1f4d50fa9d6e805d2769ed471e3201a, data reload: false

------ Round 1 ----------------------------------
q1	17974	4800	4333	4333
q2	2660	198	190	190
q3	11669	1133	1150	1133
q4	11103	796	835	796
q5	7492	2725	2651	2651
q6	222	140	144	140
q7	975	627	630	627
q8	9242	2069	2047	2047
q9	8856	6449	6443	6443
q10	8970	3703	3708	3703
q11	449	246	242	242
q12	411	241	239	239
q13	17772	2972	3001	2972
q14	280	221	214	214
q15	531	472	472	472
q16	514	392	378	378
q17	966	697	618	618
q18	8014	7564	7435	7435
q19	7463	1505	1542	1505
q20	654	324	347	324
q21	5025	3219	3894	3219
q22	389	338	347	338
Total cold run time: 121631 ms
Total hot run time: 40019 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4389	4234	4201	4201
q2	377	281	274	274
q3	3025	2778	2725	2725
q4	1840	1631	1599	1599
q5	5258	5282	5278	5278
q6	219	131	133	131
q7	2113	1677	1766	1677
q8	3168	3315	3329	3315
q9	8301	8308	8262	8262
q10	3921	3716	3667	3667
q11	594	478	491	478
q12	789	600	596	596
q13	16647	2988	2985	2985
q14	286	256	267	256
q15	521	501	471	471
q16	470	415	430	415
q17	1783	1492	1450	1450
q18	7564	7485	7332	7332
q19	1680	1563	1470	1470
q20	1992	1779	1789	1779
q21	4928	4707	4662	4662
q22	616	538	553	538
Total cold run time: 70481 ms
Total hot run time: 53561 ms

@sollhui
Contributor Author

sollhui commented Jun 21, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 39780 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4cd104f1538d7e41db99538590e539ed94f1cd33, data reload: false

------ Round 1 ----------------------------------
q1	18193	4569	4441	4441
q2	2986	194	212	194
q3	11696	1142	1129	1129
q4	10552	783	843	783
q5	7485	2765	2587	2587
q6	229	142	139	139
q7	968	637	617	617
q8	9270	2071	2068	2068
q9	8789	6426	6422	6422
q10	9008	3782	3727	3727
q11	450	241	235	235
q12	452	232	237	232
q13	18809	2953	2946	2946
q14	275	221	233	221
q15	512	465	487	465
q16	522	394	370	370
q17	950	660	724	660
q18	7974	7513	7259	7259
q19	3028	1487	1463	1463
q20	648	316	338	316
q21	4928	3180	3855	3180
q22	377	326	340	326
Total cold run time: 118101 ms
Total hot run time: 39780 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4446	4327	4336	4327
q2	372	275	262	262
q3	2935	2754	2678	2678
q4	1898	1636	1590	1590
q5	5231	5231	5265	5231
q6	212	126	128	126
q7	2085	1714	1742	1714
q8	3160	3306	3312	3306
q9	8274	8247	8240	8240
q10	3922	3623	3666	3623
q11	582	479	478	478
q12	785	611	591	591
q13	16316	2983	3017	2983
q14	288	266	268	266
q15	549	476	474	474
q16	463	402	426	402
q17	1763	1501	1448	1448
q18	7638	7521	7253	7253
q19	1702	1593	1610	1593
q20	1979	1823	1777	1777
q21	4949	4706	4662	4662
q22	629	551	540	540
Total cold run time: 70178 ms
Total hot run time: 53564 ms

@doris-robot

TPC-DS: Total hot run time: 172386 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4cd104f1538d7e41db99538590e539ed94f1cd33, data reload: false

query1	926	380	373	373
query2	6467	2472	2369	2369
query3	6647	208	214	208
query4	18991	17388	17403	17388
query5	4162	496	474	474
query6	244	161	163	161
query7	4583	309	317	309
query8	307	296	292	292
query9	8764	2434	2452	2434
query10	605	317	288	288
query11	10596	10074	10050	10050
query12	143	91	84	84
query13	1671	365	361	361
query14	8559	7711	6809	6809
query15	238	192	187	187
query16	7800	267	269	267
query17	1885	539	544	539
query18	1904	275	267	267
query19	197	153	155	153
query20	93	83	82	82
query21	212	125	125	125
query22	4371	4088	4123	4088
query23	33583	33177	33079	33079
query24	11715	2798	2775	2775
query25	701	362	363	362
query26	1825	149	148	148
query27	3032	317	315	315
query28	7670	2056	2050	2050
query29	1109	616	595	595
query30	293	151	148	148
query31	955	752	728	728
query32	94	53	58	53
query33	772	283	279	279
query34	1005	478	468	468
query35	759	628	634	628
query36	1121	940	954	940
query37	214	68	71	68
query38	2853	2767	2753	2753
query39	836	784	805	784
query40	281	124	125	124
query41	55	54	51	51
query42	118	101	100	100
query43	584	544	547	544
query44	1270	725	740	725
query45	201	165	162	162
query46	1100	725	706	706
query47	1883	1812	1755	1755
query48	369	295	289	289
query49	1183	401	420	401
query50	762	380	387	380
query51	6825	6767	6730	6730
query52	98	97	99	97
query53	357	277	286	277
query54	1031	427	437	427
query55	79	76	76	76
query56	278	254	259	254
query57	1169	1037	1067	1037
query58	248	242	238	238
query59	3342	3138	3191	3138
query60	289	279	263	263
query61	92	89	88	88
query62	659	444	436	436
query63	321	288	291	288
query64	9836	2242	1734	1734
query65	3198	3124	3074	3074
query66	1379	336	332	332
query67	15402	14979	14948	14948
query68	4564	519	523	519
query69	460	325	303	303
query70	1209	1111	1133	1111
query71	373	283	270	270
query72	7187	5196	5823	5196
query73	736	329	323	323
query74	5912	5561	5394	5394
query75	3387	2649	2658	2649
query76	2613	894	885	885
query77	465	305	292	292
query78	10227	9791	9644	9644
query79	2341	517	508	508
query80	991	457	450	450
query81	581	217	223	217
query82	678	103	99	99
query83	260	173	168	168
query84	243	84	86	84
query85	1863	275	268	268
query86	505	317	303	303
query87	3233	3136	3090	3090
query88	4295	2355	2354	2354
query89	475	385	391	385
query90	1824	188	182	182
query91	129	98	100	98
query92	66	49	50	49
query93	2443	510	499	499
query94	1259	188	195	188
query95	408	327	326	326
query96	597	266	266	266
query97	3218	3096	3101	3096
query98	227	256	192	192
query99	1266	823	836	823
Total cold run time: 274346 ms
Total hot run time: 172386 ms

@doris-robot

ClickBench: Total hot run time: 30.67 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4cd104f1538d7e41db99538590e539ed94f1cd33, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.22	0.04	0.05
query4	1.68	0.07	0.07
query5	0.50	0.49	0.50
query6	1.13	0.72	0.72
query7	0.03	0.01	0.02
query8	0.05	0.04	0.05
query9	0.55	0.50	0.48
query10	0.54	0.53	0.52
query11	0.14	0.11	0.11
query12	0.15	0.12	0.12
query13	0.60	0.58	0.59
query14	0.78	0.78	0.76
query15	0.84	0.81	0.81
query16	0.35	0.37	0.36
query17	0.97	1.02	0.95
query18	0.23	0.25	0.24
query19	1.85	1.71	1.81
query20	0.01	0.01	0.02
query21	15.41	0.67	0.66
query22	3.92	6.53	2.25
query23	18.33	1.35	1.17
query24	2.16	0.22	0.22
query25	0.15	0.08	0.08
query26	0.26	0.18	0.18
query27	0.09	0.08	0.08
query28	13.26	1.02	0.99
query29	12.64	3.27	3.26
query30	0.26	0.07	0.06
query31	2.85	0.39	0.38
query32	3.33	0.48	0.46
query33	2.85	2.92	2.85
query34	17.07	4.43	4.45
query35	4.54	4.43	4.54
query36	0.65	0.46	0.46
query37	0.19	0.15	0.15
query38	0.16	0.16	0.15
query39	0.04	0.04	0.04
query40	0.16	0.14	0.14
query41	0.10	0.05	0.05
query42	0.06	0.05	0.05
query43	0.04	0.05	0.04
Total cold run time: 109.26 s
Total hot run time: 30.67 s

Contributor

@liaoxin01 liaoxin01 left a comment

LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Jun 21, 2024
Contributor

PR approved by anyone and no changes requested.

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit a43ef44 into apache:master Jun 21, 2024
26 of 29 checks passed
dataroaring pushed a commit that referenced this pull request Jun 26, 2024
…e and rows (#36632)

dataroaring pushed a commit that referenced this pull request Jul 7, 2024
…h size and rows (#37388)

pick #36632

dataroaring pushed a commit that referenced this pull request Jul 8, 2024
…e and rows (#36632) (#37459)

pick #36632

mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024
…e and rows (apache#36632) (apache#37459)

pick apache#36632


4 participants