This article explains how to migrate Hive data. Since this question comes up often in day-to-day work, the steps below lay out a simple, workable approach; hopefully it clears up the confusion around migrating Hive data. Let's walk through it.
The task: migrate Hive data from a cdh4u5 cluster into the Hive warehouse on a cdh6.1 cluster. Because distcp cannot be used between the two clusters, the data has to be exported and re-imported manually.
On hadoop4, pull the partition directories out of HDFS into a local staging directory and pack them into a tarball:
cd /tmp/test/people_payment_log
hadoop fs -get /data/warehouse/userdb.db/people_payment/hour=201309* .
hadoop fs -get /data/warehouse/userdb.db/people_payment/hour=201310* .
hadoop fs -get /data/warehouse/userdb.db/people_payment/hour=201311* .
hadoop fs -get /data/warehouse/userdb.db/people_payment/hour=201312* .
hadoop fs -get /data/warehouse/userdb.db/people_payment/hour=201401* .
hadoop fs -get /data/warehouse/userdb.db/people_payment/hour=201402* .
hadoop fs -get /data/warehouse/userdb.db/people_payment/hour=201403* .
cd /tmp/test
tar -czf people_payment_log.tgz people_payment_log
Copy the archive over to /home/abc/cdh/people_payment on hdp7 and unpack it there. On hdp7:
scp -Cr hadoop4:/tmp/test/people_payment_log.tgz /home/abc/cdh/people_payment
cd /home/abc/cdh/people_payment;tar -xzf people_payment_log.tgz
Then upload the unpacked data into the people_payment table's directory on the cdh6 cluster with the following shell script:
base_dir=/home/abc/cdh/people_payment
data_dir=$base_dir/people_payment_log
ls $data_dir >$base_dir/hour.txt
cd $data_dir
cat $base_dir/hour.txt |while read oneHour
do
echo $oneHour
hadoop fs -put $oneHour /user/hive/warehouse/userdb.db/people_payment/
done
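Optionally, a quick listing confirms that the partition directories landed under the table's warehouse path (the same path used in the loop above):
hadoop fs -ls /user/hive/warehouse/userdb.db/people_payment/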
Next, the Hive metastore has to be told that these partitions exist, so generate the ALTER TABLE ... ADD PARTITION statements (written to alert.txt):
base_dir=/home/abc/cdh/people_payment
cd $base_dir
echo "use userdb;">$base_dir/alert.txt
cat $base_dir/hour.txt |while read oneHour
do
real_hour=`echo $oneHour|awk -F '=' '{print $2}'`
echo "ALTER TABLE people_payment ADD PARTITION (hour = '$real_hour');">>$base_dir/alert.txt
done
The generated alert.txt looks like:
use userdb;
ALTER TABLE people_payment ADD PARTITION (hour = '2013090100');
ALTER TABLE people_payment ADD PARTITION (hour = '2013090101');
Then run hive -f alert.txt to apply all of the ALTER TABLE ... ADD PARTITION statements in a single batch.
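For example, the batch call plus a quick sanity check that the metastore now sees the partitions (SHOW PARTITIONS is standard HiveQL; the hive binary path matches the one used later in this article):
/home/abc/cdh/hive/bin/hive -f /home/abc/cdh/people_payment/alert.txt
/home/abc/cdh/hive/bin/hive -e "use userdb; SHOW PARTITIONS people_payment;"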
If the data files already exist on the local filesystem of the target cluster, they can be loaded into Hive directly with LOAD DATA. The script:
base_dir=/home/abc/cdh/people_payment
data_dir=/data/login/data_login_raw
hive_db=userdb
table=user_login
ls $data_dir/a.bc.d.201408*|awk -F '.' '{print $5}'>$base_dir/hour.txt
cat $base_dir/hour.txt |while read oneHour
do
echo $oneHour
sql="use $hive_db;LOAD DATA LOCAL INPATH '$data_dir/a.bc.d.$oneHour' OVERWRITE INTO table $table partition ( hour=$oneHour);"
echo "===================================================$sql"
/home/abc/cdh/hive/bin/hive -e "$sql"
done
Better still, generate one batch file of LOAD DATA LOCAL INPATH ... statements and run it with a single hive -f call, which avoids starting the Hive client once per file; a sketch of that variant follows.
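A minimal sketch of the batched variant, reusing the paths, table names, and hive binary location from the script above (load.sql is a hypothetical output file name introduced here):
base_dir=/home/abc/cdh/people_payment
data_dir=/data/login/data_login_raw
hive_db=userdb
table=user_login
# write all LOAD statements into one file instead of invoking hive once per hour
echo "use $hive_db;" > $base_dir/load.sql
cat $base_dir/hour.txt |while read oneHour
do
echo "LOAD DATA LOCAL INPATH '$data_dir/a.bc.d.$oneHour' OVERWRITE INTO TABLE $table PARTITION (hour=$oneHour);" >> $base_dir/load.sql
done
# one hive client start-up for the whole batch
/home/abc/cdh/hive/bin/hive -f $base_dir/load.sql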
That wraps up the walkthrough of how to migrate Hive data; hopefully it has cleared up the question. Pairing the theory with some hands-on practice is the best way to make it stick, so give it a try. For more practical articles like this one, keep following the 創(chuàng)新互聯(lián) site.