This article explains how to handle common exceptions when importing data with Sqoop. Each case below shows the error output first, followed by the fix.
1. Error message (the json.jar package is missing)
19/01/30 11:59:48 INFO manager.DirectMySQLManager: Beginning mysqldump fast path import
19/01/30 11:59:48 INFO mapreduce.ImportJobBase: Beginning import of t3
Exception in thread "main" java.lang.NoClassDefFoundError: org/json/JSONObject
at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43)
at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:780)
at org.apache.sqoop.mapreduce.JobBase.putSqoopOptionsToConfiguration(JobBase.java:392)
at org.apache.sqoop.mapreduce.JobBase.createJob(JobBase.java:378)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:256)
at org.apache.sqoop.manager.DirectMySQLManager.importTable(DirectMySQLManager.java:92)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:513)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
Solution:
Reference: https://github.com/stleary/JSON-java
Download the json package: https://search.maven.org/search?q=g:org.json%20AND%20a:json&core=gav
Upload the downloaded json JAR to /opt/cloudera/parcels/CDH/lib/sqoop/lib.
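To confirm that the JAR actually landed where Sqoop looks for it, a check along the following lines may help. This is only a minimal sketch: `has_jar` is a hypothetical helper, not part of Sqoop, and the `json*.jar` pattern assumes the downloaded file keeps a name that starts with `json`.

```shell
# has_jar DIR PATTERN
# Succeeds if DIR contains at least one file matching the glob PATTERN.
# (Hypothetical helper for illustration; not part of Sqoop.)
has_jar() {
  for f in "$1"/$2; do
    # If nothing matches, the glob stays literal and -e fails.
    [ -e "$f" ] && return 0
  done
  return 1
}

# Directory taken from the step above.
SQOOP_LIB=/opt/cloudera/parcels/CDH/lib/sqoop/lib
if has_jar "$SQOOP_LIB" 'json*.jar'; then
  echo "org.json JAR found in $SQOOP_LIB"
else
  echo "org.json JAR missing from $SQOOP_LIB"
fi
```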
2. The other CDH nodes do not have the mysqldump command (the import was run with the --direct parameter)
19/01/30 13:59:29 INFO mapreduce.Job: map 0% reduce 0%
19/01/30 13:59:33 INFO mapreduce.Job: Task Id : attempt_1545874390047_0006_m_000000_0, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at java.lang.Runtime.exec(Runtime.java:620)
at java.lang.Runtime.exec(Runtime.java:485)
at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
Solution: copy mysqldump to each node that lacks it.
[root@jingong01 ~]# scp /usr/bin/mysqldump root@192.168.7.32:/usr/bin/
mysqldump 100% 3102KB 3.0MB/s 00:00
[root@jingong01 ~]#
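Before copying the binary around, it can be useful to see which hosts actually lack `mysqldump`. The sketch below is illustrative: `check_cmd` is a hypothetical helper, and the commented `ssh` line assumes SSH access to the worker (only the address 192.168.7.32 comes from the example above). Since the failure stems from `--direct`, dropping that flag so Sqoop falls back to a plain JDBC import is another way to avoid the dependency.

```shell
# check_cmd NAME
# Prints "found" if NAME is on the PATH of the current host, else "missing".
# (Hypothetical helper for illustration.)
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

echo "mysqldump on $(uname -n): $(check_cmd mysqldump)"

# To check a remote worker (assumes SSH access to that host):
#   ssh root@192.168.7.32 'command -v mysqldump || echo missing'
```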
3. Sqoop is missing the Hive JARs
19/01/30 14:15:50 WARN hive.TableDefWriter: Column CREATETIME had to be cast to a less precise type in Hive
19/01/30 14:15:50 WARN hive.TableDefWriter: Column UPDATETIME had to be cast to a less precise type in Hive
19/01/30 14:15:50 INFO hive.HiveImport: Loading uploaded data into Hive
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/shims/ShimLoader
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.
at org.apache.hadoop.hive.conf.HiveConf.
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
Solution:
[root@jingong01 ~]# cp -a /opt/cloudera/parcels/CDH/lib/hive/lib/hive-shims* /opt/cloudera/parcels/CDH/lib/sqoop/lib/    # copy all hive-shims JARs from Hive's lib directory into Sqoop's lib directory
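To see which shim JARs are still absent from Sqoop's lib directory, before or after the copy, something like the following could be used. It is a sketch under assumptions: `missing_jars` is a hypothetical helper, and the two directories in the usage line are the ones from the command above.

```shell
# missing_jars SRC_DIR DST_DIR PREFIX
# Prints the name of every PREFIX*.jar that exists in SRC_DIR but not in DST_DIR.
# (Hypothetical helper for illustration.)
missing_jars() {
  for f in "$1"/"$3"*.jar; do
    [ -e "$f" ] || continue        # glob matched nothing
    name=${f##*/}                  # strip the directory part
    [ -e "$2/$name" ] || echo "$name"
  done
}

# Usage with the directories from the step above:
#   missing_jars /opt/cloudera/parcels/CDH/lib/hive/lib \
#                /opt/cloudera/parcels/CDH/lib/sqoop/lib hive-shims
```

Empty output means every matching JAR in the source directory has a same-named copy in the destination.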
4. The table already exists in Hive, but the import was run with the create-table parameter
19/01/30 14:30:20 INFO hive.HiveImport: WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
19/01/30 14:30:20 INFO hive.HiveImport: WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.
19/01/30 14:30:21 ERROR tool.ImportTool: Import failed: java.io.IOException: Hive exited with status 64
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:384)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
Solution:
[hdfs@jingong01 ~]$ sqoop import --connect jdbc:mysql://172.16.8.93:3306/db_stktag --username wangying --password wangying --table t3 --target-dir /user/tong/123 --hive-import --create-hive-table --num-mappers 1 --hive-table TT3 -m 1 --split-by date --direct    # remove the --create-hive-table parameter, because the Hive table already exists
5. When extracting data with Sqoop, the connection to port 8032 is refused
19/03/19 10:22:42 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/03/19 10:22:42 WARN ipc.Client: Failed to connect to server: 0.0.0.0/0.0.0.0:8032: retries get failed due to exceeded maximum allowed retries number: 10
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:685)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788)
at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550)
Solution: set the ResourceManager addresses explicitly in yarn-site.xml.
[root@node1 ~]# vim /opt/hadoop-2.8.5/etc/hadoop/yarn-site.xml
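<property>
  <name>yarn.resourcemanager.address</name>
  <value>node1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>node1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>node1:8031</value>
</property>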
[root@node1 ~]#
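The 0.0.0.0:8032 in the log suggests the client is falling back to a default ResourceManager address, so it is worth confirming what the configuration file actually contains on the machine that submits the job. The following is a rough sketch: `get_prop` is a hypothetical helper, and it assumes the usual layout where each `<name>` and `<value>` element sits on a single line.

```shell
# get_prop FILE NAME
# Prints the <value> that follows <name>NAME</name> in a Hadoop XML config file.
# Prints nothing if the property is not set.
# (Hypothetical helper; assumes each <name>/<value> element is on one line.)
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Skip the 7-character <value> tag and drop the 8-character </value> tag.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Path taken from the step above.
CONF=/opt/hadoop-2.8.5/etc/hadoop/yarn-site.xml
if [ -f "$CONF" ]; then
  echo "yarn.resourcemanager.address = $(get_prop "$CONF" yarn.resourcemanager.address)"
fi
```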
6. When using HBase and Hive, the job hangs at "Running job" and does not proceed
19/03/19 11:20:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552965562217_0001
19/03/19 11:20:10 INFO impl.YarnClientImpl: Submitted application application_1552965562217_0001
19/03/19 11:20:10 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552965562217_0001/
19/03/19 11:20:10 INFO mapreduce.Job: Running job: job_1552965562217_0001
Solution:
[root@node1 ~]# vim /opt/hadoop-2.8.5/etc/hadoop/yarn-site.xml    # limit the memory and CPU resources, sync the configuration file to the other nodes, then restart the Hadoop services
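<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>2</value>
</property>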
[root@node1 ~]#
7. Importing MySQL data into Hive fails because the Hive classes cannot be found
19/03/19 14:34:25 INFO hive.HiveImport: Loading uploaded data into Hive
19/03/19 14:34:25 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
19/03/19 14:34:25 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
Solution:
[root@node1 ~]# vim /etc/profile    # add the Hive lib directory to the Hadoop classpath
export HADOOP_CLASSPATH=/opt/hive-2.3.4/lib/*
[root@node1 ~]# source /etc/profile
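Note that the `export` line above overwrites any value `HADOOP_CLASSPATH` already had. If other tools on the node rely on that variable, appending may be safer. The sketch below shows one way to do that; `append_classpath` is a hypothetical helper, not something Hadoop or Sqoop provides.

```shell
# append_classpath CURRENT ENTRY
# Prints CURRENT with ENTRY appended, unless ENTRY is already present.
# (Hypothetical helper for illustration.)
append_classpath() {
  case ":$1:" in
    *":$2:"*) echo "$1" ;;      # already present: leave unchanged
    "::")     echo "$2" ;;      # CURRENT was empty
    *)        echo "$1:$2" ;;   # append with a colon separator
  esac
}

# Usage in /etc/profile (Hive path taken from the step above):
#   export HADOOP_CLASSPATH="$(append_classpath "$HADOOP_CLASSPATH" '/opt/hive-2.3.4/lib/*')"
```

The entry is kept in quotes so that the trailing `*` reaches Hadoop as a literal classpath wildcard instead of being expanded by the shell.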
8. Jackson JAR conflict when importing into Hive with Sqoop
19/03/19 15:32:11 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
19/03/19 15:32:11 INFO ql.Driver: Executing command(queryId=root_20190319153153_63feddd9-a2c8-4217-97d4-23dd9840a54b): CREATE TABLE `tt` ( `TBL_GRANT_ID` BIGINT, `CREATE_TIME` INT,
`GRANT_OPTION` INT, `GRANTOR` STRING, `GRANTOR_TYPE` STRING, `PRINCIPAL_NAME` STRING, `PRINCIPAL_TYPE` STRING, `TBL_PRIV` STRING, `TBL_ID` BIGINT) COMMENT 'Imported by sqoop on 2019/03/19
15:31:49' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE
19/03/19 15:32:11 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
19/03/19 15:32:12 ERROR exec.DDLTask: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.readerFor(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/ObjectReader;
at org.apache.hadoop.hive.common.StatsSetupConst$ColumnStatsAccurate.
at org.apache.hadoop.hive.common.StatsSetupConst.parseStatsAcc(StatsSetupConst.java:297)
at org.apache.hadoop.hive.common.StatsSetupConst.setBasicStatsState(StatsSetupConst.java:230)
at org.apache.hadoop.hive.common.StatsSetupConst.setBasicStatsStateForCreateTable(StatsSetupConst.java:292)
Solution: move Sqoop's jackson JARs out of the way and replace them with the ones shipped with Hive.
[root@node1 ~]# mv /opt/sqoop-1.4.7/lib/jackson-* /home/
[root@node1 ~]# cp -a /opt/hive-2.3.4/lib/jackson-* /opt/sqoop-1.4.7/lib/
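The two commands above can be wrapped so that the old JARs go to a dedicated backup directory rather than straight into /home, which makes a rollback easier. This is a sketch only: `swap_jars` and the backup path in the usage line are hypothetical.

```shell
# swap_jars SQOOP_LIB HIVE_LIB BACKUP_DIR
# Moves SQOOP_LIB/jackson-* into BACKUP_DIR, then copies HIVE_LIB/jackson-*
# into SQOOP_LIB. (Hypothetical helper for illustration.)
swap_jars() {
  mkdir -p "$3" || return 1
  for f in "$1"/jackson-*; do
    [ -e "$f" ] && mv "$f" "$3"/
  done
  for f in "$2"/jackson-*; do
    [ -e "$f" ] && cp -a "$f" "$1"/
  done
  return 0
}

# Usage with the paths from the step above (backup directory is an assumption):
#   swap_jars /opt/sqoop-1.4.7/lib /opt/hive-2.3.4/lib /home/sqoop-jackson-backup
```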
9. No field delimiter was specified when creating the table and importing the data into Hive
19/03/19 18:38:40 INFO metastore.HiveMetaStore: 0: Done cleaning up thread local RawStore
19/03/19 18:38:40 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=Done cleaning up thread local RawStore
19/03/19 18:38:40 ERROR tool.ImportTool: Import failed: java.io.IOException: Hive CliDriver exited with status=1
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:355)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:537)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
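Solution: use the same field delimiter in the Hive table definition and in the Sqoop import.
create table t1(a int,b int) row format delimited fields terminated by '\t'; -- a field delimiter must be specified when creating the table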
sqoop import --connect jdbc:mysql://172.16.9.100/hive --username hive --password system --table TBL_PRIVS --target-dir /user/sqoop --direct -m 1 --fields-terminated-by '\t'
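A quick way to confirm that the imported files really use the delimiter the Hive table expects is to count the fields per line in a sample. The sketch below assumes a sample of the output has first been copied to a local file (for instance with `hdfs dfs -cat` on one of the part files under the target directory, whose exact name is an assumption here); `bad_rows` is a hypothetical helper.

```shell
# bad_rows FILE N
# Prints how many lines of FILE do not have exactly N tab-separated fields.
# (Hypothetical helper for illustration.)
bad_rows() {
  awk -F '\t' -v n="$2" 'NF != n { bad++ } END { print bad + 0 }' "$1"
}

# Usage for a two-column table such as t1(a int, b int):
#   bad_rows /tmp/sample.tsv 2
```

A result of 0 means every sampled line splits into the expected number of columns. A non-zero count points to a delimiter mismatch or to tab characters embedded in the data.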
That concludes this walkthrough of handling Sqoop import exceptions. The fixes are easiest to absorb by trying them out on a real cluster.