How Spark Submits an Application to YARN, Step by Step

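The spark-submit script calls the spark-class script, which in turn launches SparkSubmit.scala.

What happens next depends on whether the deploy mode is client or cluster.

client: the driver runs in a process started on the machine where spark-class was invoked.

cluster: the driver is submitted to the cluster and runs there.

The core logic lives in the prepareSubmitEnvironment method of SparkSubmit.scala. The excerpts below show how it handles the client and YARN cluster cases.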

    // In client mode, launch the application main class directly
    // In addition, add the main application jar and any added jars (if any) to the classpath
    if (deployMode == CLIENT) {
      childMainClass = args.mainClass
      if (localPrimaryResource != null && isUserJar(localPrimaryResource)) {
        childClasspath += localPrimaryResource
      }
      if (localJars != null) { childClasspath ++= localJars.split(",") }
    }

In client mode, childMainClass is the user's main class, that is, the class whose main method starts the driver.
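To make the client-mode branch concrete, here is a minimal sketch that models it as a pure function. The names ClientModeSketch, ChildLaunch, and prepareClient are hypothetical and not part of Spark. The isUserJar check from the excerpt is left out, so every primary resource is treated as a user jar.

```scala
// A simplified, hypothetical model of the client-mode branch shown above.
// None of these names exist in Spark; they only illustrate the data flow.
object ClientModeSketch {
  // What the child process needs: the main class and its classpath entries.
  final case class ChildLaunch(mainClass: String, classpath: Seq[String])

  def prepareClient(
      mainClass: String,
      primaryJar: Option[String], // stands in for localPrimaryResource
      jars: Option[String]        // stands in for the comma-separated localJars
  ): ChildLaunch = {
    // The primary jar comes first, followed by each additional jar.
    val extraJars = jars.toSeq.flatMap(_.split(",")).filter(_.nonEmpty)
    ChildLaunch(mainClass, primaryJar.toSeq ++ extraJars)
  }
}
```

Under these assumptions, calling prepareClient with the main class com.example.Main, a primary jar, and two extra jars yields that same main class together with a three-entry classpath.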

Next, look at YARN cluster mode:

    // In yarn-cluster mode, use yarn.Client as a wrapper around the user class
    if (isYarnCluster) {
      childMainClass = YARN_CLUSTER_SUBMIT_CLASS
      if (args.isPython) {
        childArgs += ("--primary-py-file", args.primaryResource)
        childArgs += ("--class", "org.apache.spark.deploy.PythonRunner")
      } else if (args.isR) {
        val mainFile = new Path(args.primaryResource).getName
        childArgs += ("--primary-r-file", mainFile)
        childArgs += ("--class", "org.apache.spark.deploy.RRunner")
      } else {
        if (args.primaryResource != SparkLauncher.NO_RESOURCE) {
          childArgs += ("--jar", args.primaryResource)
        }
        childArgs += ("--class", args.mainClass)
      }
      if (args.childArgs != null) {
        args.childArgs.foreach { arg => childArgs += ("--arg", arg) }
      }
    }
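The effect of the cluster-mode branch is easier to see on a concrete input. The sketch below is a simplified model, not Spark code: YarnClusterArgsSketch, SubmitArgs, and prepare are hypothetical names, the R branch and the NO_RESOURCE check are omitted, and only the flag names and class names are taken from the excerpt.

```scala
// A simplified, hypothetical model of the yarn-cluster branch shown above.
object YarnClusterArgsSketch {
  // Hypothetical stand-in for the parsed spark-submit arguments.
  final case class SubmitArgs(
      primaryResource: String,
      mainClass: String,
      isPython: Boolean = false,
      childArgs: Seq[String] = Nil)

  val YarnClusterSubmitClass = "org.apache.spark.deploy.yarn.YarnClusterApplication"

  // Returns (childMainClass, childArgs) for cluster mode.
  def prepare(args: SubmitArgs): (String, Seq[String]) = {
    val resourceArgs =
      if (args.isPython) {
        Seq("--primary-py-file", args.primaryResource,
          "--class", "org.apache.spark.deploy.PythonRunner")
      } else {
        Seq("--jar", args.primaryResource, "--class", args.mainClass)
      }
    // Every user argument is forwarded behind its own --arg flag.
    val userArgs = args.childArgs.flatMap(arg => Seq("--arg", arg))
    (YarnClusterSubmitClass, resourceArgs ++ userArgs)
  }
}
```

For a jar application, the user's main class no longer becomes childMainClass. It is passed along as the value of --class, and the wrapper class is launched instead.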

At this point childMainClass has become:

YARN_CLUSTER_SUBMIT_CLASS = "org.apache.spark.deploy.yarn.YarnClusterApplication"

private[spark] class YarnClusterApplication extends SparkApplication {
  override def start(args: Array[String], conf: SparkConf): Unit = {
    // SparkSubmit would use yarn cache to distribute files & jars in yarn mode,
    // so remove them from sparkConf here for yarn mode.
    conf.remove(JARS)
    conf.remove(FILES)
    new Client(new ClientArguments(args), conf, null).run()
  }
}

The source shows that YarnClusterApplication ultimately delegates to deploy/yarn/Client.scala.

Client.run calls Client.submitApplication, which submits the application to the YARN cluster.

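def submitApplication(): ApplicationId = {
  // ... (client setup elided)
  // Get a new application from the ResourceManager
  val newApp = yarnClient.createApplication()
  val newAppResponse = newApp.getNewApplicationResponse()
  // Set up the appropriate contexts to launch our AM
  val containerContext = createContainerLaunchContext(newAppResponse)
  val appContext = createApplicationSubmissionContext(newApp, containerContext)
  // ... (submission of appContext and return of the application id elided)
}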

The important part is the createContainerLaunchContext method:

  /**
   * Set up a ContainerLaunchContext to launch our ApplicationMaster container.
   * This sets up the launch environment, java options, and the command for launching the AM.
   */
  private def createContainerLaunchContext(newAppResponse: GetNewApplicationResponse)
    : ContainerLaunchContext = {
    // ... (launch environment, localized resources, and javaOpts elided)

    // Only cluster mode passes the user class to the AM
    val userClass =
      if (isClusterMode) {
        Seq("--class", YarnSparkHadoopUtil.escapeForShell(args.userClass))
      } else {
        Nil
      }
    // ... (userJar, primaryPyFile, primaryRFile, and userArgs elided)

    // The AM entry point differs between cluster and client mode
    val amClass =
      if (isClusterMode) {
        Utils.classForName("org.apache.spark.deploy.yarn.ApplicationMaster").getName
      } else {
        Utils.classForName("org.apache.spark.deploy.yarn.ExecutorLauncher").getName
      }
    val amArgs =
      Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ primaryRFile ++ userArgs ++
      Seq("--properties-file",
        buildPath(Environment.PWD.$$(), LOCALIZED_CONF_DIR, SPARK_CONF_FILE)) ++
      Seq("--dist-cache-conf",
        buildPath(Environment.PWD.$$(), LOCALIZED_CONF_DIR, DIST_CACHE_CONF_FILE))

    // Command for the ApplicationMaster
    val commands = prefixEnv ++
      Seq(Environment.JAVA_HOME.$$() + "/bin/java", "-server") ++
      javaOpts ++ amArgs ++
      Seq(
        "1>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout",
        "2>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")
    // ... (the commands are set on the container context, which is returned; elided)
  }

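This produces the command to execute, held in commands. What does the code above mean?

(1) cluster mode

ApplicationMaster is started, and it launches userClass.

(2) client mode

ExecutorLauncher is started. It only brings up the executors; no userClass is passed, because the driver already runs on the client.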
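The difference between the two modes can be sketched as a small function that assembles the ApplicationMaster command. This is an illustration under simplifying assumptions, not Spark's implementation: AmCommandSketch and amCommand are hypothetical names, the strings "{{JAVA_HOME}}" and "<LOG_DIR>" stand in for the YARN expansion variables used in the excerpt, and javaOpts, shell escaping, the user jar, and the properties-file arguments are omitted.

```scala
// A simplified, hypothetical model of how the AM command differs by deploy mode.
object AmCommandSketch {
  def amCommand(isClusterMode: Boolean, userClass: String, userArgs: Seq[String]): Seq[String] = {
    // Cluster mode runs the full ApplicationMaster; client mode runs ExecutorLauncher.
    val amClass =
      if (isClusterMode) "org.apache.spark.deploy.yarn.ApplicationMaster"
      else "org.apache.spark.deploy.yarn.ExecutorLauncher"
    // Only cluster mode tells the AM which user class (the driver) to start.
    val userClassArgs = if (isClusterMode) Seq("--class", userClass) else Nil
    val forwardedArgs = userArgs.flatMap(arg => Seq("--arg", arg))

    Seq("{{JAVA_HOME}}/bin/java", "-server", amClass) ++
      userClassArgs ++ forwardedArgs ++
      Seq("1>", "<LOG_DIR>/stdout", "2>", "<LOG_DIR>/stderr")
  }
}
```

Joining the result with spaces, for example with amCommand(true, "com.example.Main", Seq("x")).mkString(" "), gives a single readable command line for the cluster-mode case.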

Cluster mode is the case of interest here, and the picture is now clear: in cluster mode, ApplicationMaster wraps userClass and launches it inside the YARN cluster. userClass is the driver.
