How to Import and Export Hive Data to and from MySQL

This article covers how to import and export Hive data, with MySQL as the other end. I found the material practical, so I am sharing it here as a reference.

Hive's role: an ETL (data warehouse) tool
A tool that extracts, transforms, and loads data from a source into a destination, similar to Kettle.

DML

Bulk insert / bulk import
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
Note: filepath can be an HDFS path or an S3 path, e.g. hdfs://namenode:9000/user/hive/project/data1
1. Load a local file into a table
load data local inpath 'test.txt' into table test;
2. Load a file from HDFS into a table
load data inpath '/home/test/add.txt' into table test;
3. Insert the result of a query into a table
insert into table test select id, name, tel from test;
4. Insert query results into several tables in one pass
from source_table src
insert into table dest1_table select src.* where src.id < 100
insert into table dest2_table select src.* where src.id < 100
insert into table dest3_table select src.* where src.id < 100;
5. Load data while creating a table (CREATE TABLE AS SELECT)
create table test4 as select id, name, tel from test;
Export data with a specified delimiter
insert overwrite local directory '/home/hadoop/export_hive'
row format delimited
fields terminated by '\t'
select * from test;
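To carry the exported file over to MySQL, one option is MySQL's own LOAD DATA statement. The following is only a sketch under several assumptions: a MySQL table named test with matching columns already exists, local_infile is enabled on both the MySQL client and server, and Hive wrote a single output file named 000000_0 into the export directory (the usual naming, but check the directory first).

```sql
-- Run this in the mysql client, not in Hive.
-- Assumptions: MySQL table `test` exists with the same column order,
-- and the file name 000000_0 matches what Hive actually wrote.
LOAD DATA LOCAL INFILE '/home/hadoop/export_hive/000000_0'
INTO TABLE test
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';
```

If the export produced several files, the statement would need to be repeated once per file.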
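Delete / clear
1. Delete the rows in table1 that do not meet a condition (only rows matching XXXX are kept)
insert overwrite table table1
select * from table1 where XXXX;
2. Empty a table
insert overwrite table t_table1
select * from t_table1 where 1=0;
3. Truncate a table (note: external tables cannot be truncated)
truncate table table_name;
4. Empty a table by deleting its data in HDFS (the table definition remains)
hdfs dfs -rm -r /user/hive/warehouse/test

Note: methods 1 and 2 both clear data by overwriting the table.
delete and update
Hive does not support transactions by default, so DELETE and UPDATE are unavailable by default; to use them, transaction support must be turned on in hive-site.xml.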
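As a rough illustration of what turning on transactions involves, the sketch below sets the commonly documented properties at session level and then creates a transactional table. The table name test_acid and its columns are hypothetical, and the exact set of required properties (including compactor settings on the metastore side) depends on the Hive version, so treat this as a starting point rather than a complete configuration.

```sql
-- Session-level equivalents of the hive-site.xml entries (assumes Hive 0.14 or later).
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical table: transactional tables are stored as ORC and flagged
-- with the 'transactional' property; older versions also require bucketing.
CREATE TABLE test_acid (id INT, name STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Row-level changes are accepted only on such a table.
UPDATE test_acid SET name = 'new_name' WHERE id = 1;
DELETE FROM test_acid WHERE id = 2;
```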

DDL

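Databases / tables / indexes / views / partitions / buckets

Databases

List / create / alter / drop / describe
1. List all databases
show databases;
2. Create a database
create database test;
3. Drop a database
drop database test;

For safety, dropping a database that still contains data fails with an error; add the cascade keyword to force the drop, which removes its tables as well
drop database if exists test cascade;
4. Show database information
describe database test;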

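Tables

List / create / alter / drop / describe
1. List tables

All tables in the current database
show tables;

All tables in a given database
show tables in db_name;

Pattern matching is supported
show tables '.*s';
2. Create a table
create table test
(id int,
a string
)
ROW FORMAT DELIMITED        -- delimited row format
FIELDS TERMINATED BY ','    -- field delimiter
LINES TERMINATED BY '\n'    -- line delimiter
STORED AS TEXTFILE;         -- store as plain text
Create a table whose fields are split out of each line by a regular expression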
add jar ../build/contrib/hive_contrib.jar;

CREATE TABLE apachelog (
host STRING,
identity STRING,
user STRING,
time STRING,
request STRING,
status STRING,
size STRING,
referer STRING,
agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
"output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
)
STORED AS TEXTFILE;
3. Alter
Add a new column
ALTER TABLE test ADD COLUMNS (new_col2 INT COMMENT 'a comment');

Rename a table
ALTER TABLE old_name RENAME TO new_name;
4. Drop
drop table test;
5. Describe

Show column information
desc test;

Show detailed table information
desc formatted test;

Indexes

Create an index
CREATE INDEX index_name
ON TABLE base_table_name (col_name, ...)
AS 'index.handler.class.name'
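A concrete instance of the syntax above might look like the following sketch. The index and table names are the ones used in the rebuild example of this section; the indexed column id is an assumption, and the handler shown is Hive's built-in compact index handler. Note that index support was removed in Hive 3.0, so this applies to earlier releases only.

```sql
-- Hypothetical column `id`; names follow the rebuild example in this section.
CREATE INDEX index1_index_test
ON TABLE index_test (id)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;
```

Because the index is created WITH DEFERRED REBUILD, it stays empty until an ALTER INDEX ... REBUILD is run.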


Rebuild an index
ALTER INDEX index_name ON table_name [PARTITION (...)] REBUILD

For example: alter index index1_index_test on index_test rebuild;

Drop an index
DROP INDEX index_name ON table_name

List indexes
show index on index_test;

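Views

CREATE VIEW [IF NOT EXISTS] view_name [ (column_name [COMMENT column_comment], ...) ] [COMMENT view_comment] [TBLPROPERTIES (property_name = property_value, ...)] AS SELECT ...

Note: the views described here are logical views only, not materialized views (materialized views were introduced only in Hive 3.0)
- Creating a view
- If no column names are given, the view's column names are derived automatically from the defining SELECT expression
- If the base table's properties are changed, the change is not reflected in the view, and queries that have become invalid will fail
- Views are read-only and cannot be the target of LOAD/INSERT/ALTER
- Dropping a view: DROP VIEW view_name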
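The points above can be sketched as follows. The base table test with columns id, name, tel is the one used in the DML examples; the view name test_view and the filter are hypothetical.

```sql
-- Hypothetical view over the `test` table from the DML section.
CREATE VIEW IF NOT EXISTS test_view (id, name)
COMMENT 'rows of test with small ids'
AS SELECT id, name FROM test WHERE id < 100;

-- A view is queried like a table; the SELECT runs against the base table.
SELECT * FROM test_view;

-- Dropping the view leaves the base table and its data untouched.
DROP VIEW IF EXISTS test_view;
```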

Partitions (important)

List / create / alter / drop
1. List all partitions of a table
show partitions test;
2. Create a partitioned table
create table test
(id int,
a string
)
partitioned by (b string, c int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
3. Add a partition to an existing table
ALTER TABLE test ADD IF NOT EXISTS
PARTITION (year = 2017) LOCATION '/hiveuser/hive/warehouse/data_zh.db/data_zh/2017.txt';
4. Drop a partition
ALTER TABLE test DROP IF EXISTS PARTITION(year =2017);
5. Load data into a partitioned table
LOAD DATA INPATH '/data/2017.txt' INTO TABLE test PARTITION(year=2017);
6. Import data from an unpartitioned table into a partitioned table
insert overwrite table part_table partition (YEAR,MONTH) select * from no_part_table;
7. Dynamic partition settings
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
-- set hive.enforce.bucketing = true;

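Once dynamic partitioning is enabled, the partition values can be left out when importing through INSERT ... SELECT; Hive takes them from the trailing columns of the query. LOAD DATA only moves files, so in most Hive versions it still needs an explicit value such as PARTITION(year=2017).
-- the last column returned by the SELECT must be the partition column year
INSERT INTO TABLE test PARTITION (year) SELECT * FROM no_part_table;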

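Buckets

CREATE TABLE bucketed_user (id INT, name STRING)
CLUSTERED BY (id) INTO 4 BUCKETS;
Any table or partition can be further organized into buckets, which are a finer-grained division of the data. Bucketing is also defined on a column: Hive hashes the column value and takes the remainder after dividing by the number of buckets to decide which bucket a record is stored in.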
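The hash-then-modulo idea can be made visible with Hive's built-in hash and pmod functions. This is only an approximation for illustration: the hash function Hive applies internally when writing buckets depends on the Hive version and column type, so the computed number is not guaranteed to match the physical bucket file.

```sql
-- Illustration only: compute a bucket number in the range 0..3 for each row
-- of the 4-bucket table defined above. The internal bucketing hash may differ.
SELECT id,
       pmod(hash(id), 4) AS bucket_no
FROM bucketed_user;
```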
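There are two reasons to organize a table (or partition) into buckets:
(1) More efficient query processing. Buckets add extra structure to a table, and Hive can exploit that structure for some queries. In particular, a join of two tables that are bucketed on the same columns (which include the join columns) can be implemented efficiently as a map-side join. Because both tables are bucketed on the shared column, only the buckets holding the same column values need to be joined with each other, which greatly reduces the amount of data the JOIN has to process.
(2) More efficient sampling. When working with large data sets, it is very convenient to be able to try out a query on a small fraction of the data while developing or revising it.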
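Both uses can be sketched against the bucketed_user table defined above. The second table, bucketed_user_2, is hypothetical and assumed to be bucketed on id into the same number of buckets (or a multiple of it). Whether the MAPJOIN hint is honored depends on the Hive version and configuration, so take the join part as an assumption to verify with EXPLAIN.

```sql
-- Sampling: read only bucket 1 of the 4 buckets, roughly a quarter of the rows.
SELECT * FROM bucketed_user TABLESAMPLE(BUCKET 1 OUT OF 4 ON id);

-- Map-side join between two tables bucketed on the join column.
-- bucketed_user_2 is a hypothetical second table bucketed on id.
SET hive.optimize.bucketmapjoin = true;
SELECT /*+ MAPJOIN(b) */ a.id, a.name, b.name
FROM bucketed_user a
JOIN bucketed_user_2 b ON a.id = b.id;
```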

Source: http://weahome.cn/article/jhshdg.html
