This article walks through debugging and troubleshooting a WordCount jar that is built in IDEA and submitted to Spark. The steps are described in detail and may serve as a reference.
Based on:
macOS
Spark 2.4.3
(Spark runs in standalone mode; reference blog: http://blog.itpub.net/69908925/viewspace-2644303/ )
Scala 2.12.8
IDEA 2019
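1 IDEA - File - Project Structure - Libraries - Scala SDK
select version 2.11.12
The version selected here must match the Scala version that Spark itself runs on. IDEA defaults to the locally installed Scala 2.12.8, and a jar built against that version fails with a main-class error when it runs on Spark.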
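2 Create a new project and add the dependencies to pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.ny.service</groupId>
    <artifactId>scala517</artifactId>
    <version>1.0</version>
    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.11.12</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.4.3</version>
        </dependency>
    </dependencies>
    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <plugins>
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <useUniqueVersions>false</useUniqueVersions>
                            <classpathPrefix>lib/</classpathPrefix>
                            <mainClass>com.ny.service.WordCount</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>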
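For scala-library, use the Scala version that ships with Spark, 2.11.12, which is also the most recent version this Spark installation supports.
The org.apache.spark artifact must use the matching _2.11 suffix (spark-core_2.11).
Otherwise the job fails with a main-class error: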
19/05/16 10:52:03 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:60010 (size: 22.9 KB, free: 366.3 MB)
19/05/16 10:52:03 INFO SparkContext: Created broadcast 0 from textFile at WordCount.scala:18
Exception in thread "main" java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction2$mcIII$sp
at com.nyc.WordCount$.main(WordCount.scala:24)
at com.nyc.WordCount.main(WordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
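How to check the Scala version that Spark uses
Go to the directory below and look at the version number in the scala-library jar file name (for example scala-library-2.11.12.jar):
/usr/local/opt/spark-2.4.3/jars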
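The version can also be confirmed from code rather than from the jar file name. The following is a minimal sketch, assuming it is pasted into spark-shell or run as a small standalone program. The object name ScalaVersionCheck and the helper binaryVersion are hypothetical names introduced for illustration. scala.util.Properties is part of the Scala standard library.

```scala
object ScalaVersionCheck {
  // Full version of the Scala library found on the current classpath
  def scalaVersion: String = scala.util.Properties.versionNumberString

  // The first two components form the binary version, which is what
  // the artifact suffix in spark-core_2.11 refers to
  def binaryVersion(full: String): String =
    full.split("\\.").take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    println(s"Scala library version: $scalaVersion")
    println(s"Binary version: ${binaryVersion(scalaVersion)}")
  }
}
```

If the binary version reported on the Spark side differs from the one selected in IDEA and in pom.xml, the NoClassDefFoundError shown above is a likely result.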
3 WordCount test script
package com.ny.service

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // 1 Create the configuration (the master is supplied by spark-submit)
    val conf = new SparkConf().setAppName("wc")
    // 2 Create the SparkContext sc
    val sc = new SparkContext(conf)
    // 3 Processing logic
    // Read the input file
    val lines = sc.textFile(args(0))
    // Flatten each line into words
    val words = lines.flatMap(_.split(" "))
    // Map each word to a (word, 1) pair
    val k2v = words.map((_, 1))
    // Sum the counts per word
    val results = k2v.reduceByKey(_ + _)
    // Save the results
    results.saveAsTextFile(args(1))
    // 4 Close the connection
    sc.stop()
  }
}
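Before packaging, the counting logic can be checked without a cluster. The following is a minimal sketch that mirrors the flatMap / map / reduce steps of the job on an in-memory Seq, using only the Scala standard library. It is not the Spark job itself: groupBy plus a sum stands in for reduceByKey, and the object name WordCountLocalCheck and the sample lines are hypothetical.

```scala
object WordCountLocalCheck {
  // Same steps as the Spark job, applied to a local collection instead of an RDD
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // flatten lines into words
      .map((_, 1))             // pair each word with a count of 1
      .groupBy(_._1)           // group the pairs by word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum counts per word

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input, for illustration only
    val sample = Seq("hello scala", "hello java")
    countWords(sample).foreach(println)
  }
}
```

Running this in IDEA exercises the transformation logic in isolation, so an error that only appears after spark-submit points to packaging or version issues rather than to the code.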
4 Package
Copy the jar to the Spark home directory. Because Spark runs in standalone mode here, the Hadoop cluster was not started.
nancylulululu:spark-2.4.3 nancy$ mv /Users/nancy/IdeaProjects/scala517/target/original-scala517-1.0.jar wc.jar
5 Run with spark-submit
bin/spark-submit \
  --class com.ny.service.WordCount \
  --master spark://localhost:7077 \
  ./wc.jar \
  file:///usr/local/opt/spark-2.4.3/test/1test \
  file:///usr/local/opt/spark-2.4.3/test/out
If the files are stored in Hadoop, replace the file:// paths with HDFS file system paths.
View the result files:
nancylulululu:out nancy$ ls
_SUCCESS    part-00000    part-00001
nancylulululu:out nancy$ cat part-00000
(scala,2)
(hive,1)
(MySQL,1)
(hello,5)
(java,2)