Flink 1.10 Exception Notes

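This article analyzes common exceptions in Flink on YARN, such as "Could not build the program from JAR file" and "Could not deploy Yarn job cluster", and provides solutions, including adjusting YARN configuration parameters.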


1. Console error on application submission: Could not build the program from JAR file.

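This error is quite misleading. Often the problem is not the JAR file being run; rather, an exception occurred during submission, and the logs must be examined to find it. The most common cause is that the Hadoop dependency JARs were not added to the CLASSPATH, so a required class cannot be found (for example, ClassNotFoundException: org.apache.hadoop.yarn.exceptions.YarnException), which makes loading the client entry class (FlinkYarnSessionCli) fail.
The Flink on YARN client usually needs two environment variables, HADOOP_CONF_DIR and HADOOP_CLASSPATH, so that it can load the Hadoop configuration and dependency JARs. Example (assuming the environment variable HADOOP_HOME already points to the Hadoop installation directory):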

export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_CLASSPATH=`${HADOOP_HOME}/bin/hadoop classpath`
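
Before submitting a job, it can help to confirm that both variables actually took effect. The following is a minimal sketch; check_flink_yarn_env is a hypothetical helper name, not part of Flink or Hadoop, and it only checks that the variables are set, not that every required class is present.

```shell
# Sketch: sanity-check the client environment before submitting a job.
# check_flink_yarn_env is a hypothetical helper, not a Flink or Hadoop command.
check_flink_yarn_env() {
  status=0
  # HADOOP_CONF_DIR should point at the directory holding the cluster's *-site.xml files.
  if [ -z "${HADOOP_CONF_DIR:-}" ] || [ ! -d "${HADOOP_CONF_DIR}" ]; then
    echo "HADOOP_CONF_DIR is unset or not a directory"
    status=1
  fi
  # An empty HADOOP_CLASSPATH usually means the `hadoop classpath` command failed
  # or HADOOP_HOME points to the wrong place.
  if [ -z "${HADOOP_CLASSPATH:-}" ]; then
    echo "HADOOP_CLASSPATH is empty"
    status=1
  fi
  [ "$status" -eq 0 ] && echo "environment looks ok"
  return "$status"
}
```

Running check_flink_yarn_env in the same shell that will run the Flink client prints which variable is missing and returns a non-zero status if either check fails.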

2. Where are the client logs, and how are they configured?
Client logs are usually in the log folder under the Flink installation directory:

Log location: ${FLINK_HOME}/log/flink-${USER}-client-.log
log4j configuration: ${FLINK_HOME}/conf/log4j-cli.properties
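
When several client logs have accumulated, the most recently written one is usually the one of interest. A minimal sketch, assuming FLINK_HOME is set and the log files follow the naming pattern above; latest_client_log is a hypothetical helper name:

```shell
# Sketch: print the most recently modified Flink client log.
# latest_client_log is a hypothetical helper, not shipped with Flink.
latest_client_log() {
  if [ -z "${FLINK_HOME:-}" ]; then
    echo "FLINK_HOME is not set" >&2
    return 1
  fi
  # ls -t sorts by modification time, newest first; head keeps the first entry.
  ls -t "${FLINK_HOME}"/log/flink-*-client-*.log 2>/dev/null | head -n 1
}
```

The result can be passed to a pager or to tail, for example tail -n 100 "$(latest_client_log)".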

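In more complex client environments, where the log location and configuration are hard to pin down, you can set the following environment variable to enable log4j's DEBUG output and trace its initialization and loading process in detail: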

export JVM_ARGS="-Dlog4j.debug=true"

3. The main method caused an error: Could not deploy Yarn job cluster.

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Could not deploy Yarn job cluster.
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
	at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
	at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
	at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
	at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
	at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Could not deploy Yarn job cluster.
	at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:397)
	at org.apache.flink.client.deployment.executors.AbstractJobClusterExecutor.execute(AbstractJobClusterExecutor.java:70)
	at org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:944)
	at org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:84)
	at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:53)
	at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:844)
	at org.apache.flink.api.java.DataSet.collect(DataSet.java:413)
	at org.apache.flink.api.java.DataSet.print(DataSet.java:1652)
	at org.apache.flink.api.scala.DataSet.print(DataSet.scala:1864)
	at com.cjy.flink.T1_WordCount$.main(T1_WordCount.scala:22)
	at com.cjy.flink.T1_WordCount.main(T1_WordCount.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:321)
	... 11 more
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment. 
Diagnostics from YARN: Application application_1598606942477_0001 failed 1 times due to AM Container for appattempt_1598606942477_0001_000001 exited with  exitCode: -103
For more detailed output, check application tracking page:http://hadoop203:8088/cluster/app/application_1598606942477_0001Then, click on links to logs of each attempt.
Diagnostics: Container [pid=6920,containerID=container_1598606942477_0001_01_000001] is running beyond virtual memory limits. Current usage: 220.0 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1598606942477_0001_01_000001 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 6940 6920 6920 6920 (java) 528 18 2213388288 56015 /opt/module/jdk1.8.0_181/bin/java -Xms424m -Xmx424m -Dlog.file=/opt/module/hadoop-2.7.2/logs/userlogs/application_1598606942477_0001/container_1598606942477_0001_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint 
	|- 6920 6919 6920 6920 (bash) 0 0 115838976 306 /bin/bash -c /opt/module/jdk1.8.0_181/bin/java -Xms424m -Xmx424m -Dlog.file=/opt/module/hadoop-2.7.2/logs/userlogs/application_1598606942477_0001/container_1598606942477_0001_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint 1> /opt/module/hadoop-2.7.2/logs/userlogs/application_1598606942477_0001/container_1598606942477_0001_01_000001/jobmanager.out 2> /opt/module/hadoop-2.7.2/logs/userlogs/application_1598606942477_0001/container_1598606942477_0001_01_000001/jobmanager.err 

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1598606942477_0001
	at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:999)
	at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:488)
	at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:390)
	... 26 more

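Solution: the diagnostics show that the AM container was killed for exceeding its virtual memory limit. After increasing the following YARN parameters, the job was submitted successfully.
In yarn-site.xml: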

<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>10240</value>
</property>

<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>

<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>10240</value>
</property>
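
The numbers in the diagnostics can be checked by hand. The sketch below assumes yarn.nodemanager.vmem-pmem-ratio is at its Hadoop default of 2.1, which would give a 1 GB container the 2.1 GB virtual memory limit seen in the log. vmem_check is a hypothetical helper that roughly mirrors the NodeManager's comparison; it is not the NodeManager's actual code.

```shell
# Sketch: reproduce the virtual-memory comparison for the killed container.
# Assumption: yarn.nodemanager.vmem-pmem-ratio is at its default of 2.1.

# vmem_check CONTAINER_MB RATIO BYTES...
# Prints "exceeded" if the summed virtual memory of the listed processes is
# above CONTAINER_MB * RATIO, otherwise "ok".
vmem_check() {
  container_mb=$1
  ratio=$2
  shift 2
  # Sum the VMEM_USAGE(BYTES) column of every process in the tree.
  total_bytes=0
  for b in "$@"; do
    total_bytes=$((total_bytes + b))
  done
  awk -v mb="$container_mb" -v r="$ratio" -v used="$total_bytes" 'BEGIN {
    limit = mb * r * 1024 * 1024   # limit in bytes
    if (used > limit) print "exceeded"; else print "ok"
  }'
}

# Values from the process-tree dump: the java process and its bash wrapper.
vmem_check 1024 2.1 2213388288 115838976
```

With the values from the dump, the two processes together use about 2221 MB of virtual memory against a limit of about 2150 MB, so the call prints exceeded, which matches the "2.2 GB of 2.1 GB virtual memory used" message.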

References consulted:
Environment setup: https://www.cnblogs.com/quchunhui/p/12463455.html
https://developer.aliyun.com/article/719703
