I. Key concepts for importing data with Sqoop
1) connector: Sqoop2 ships with a set of predefined connectors, which act as configuration templates.
    sqoop:000> show connector
    +------------------------+---------+------------------------------------------------------------+----------------------+
    | Name                   | Version | Class                                                      | Supported Directions |
    +------------------------+---------+------------------------------------------------------------+----------------------+
    | oracle-jdbc-connector  | 1.99.7  | org.apache.sqoop.connector.jdbc.oracle.OracleJdbcConnector | FROM/TO              |
    | sftp-connector         | 1.99.7  | org.apache.sqoop.connector.sftp.SftpConnector              | TO                   |
    | kafka-connector        | 1.99.7  | org.apache.sqoop.connector.kafka.KafkaConnector            | TO                   |
    | kite-connector         | 1.99.7  | org.apache.sqoop.connector.kite.KiteConnector              | FROM/TO              |
    | ftp-connector          | 1.99.7  | org.apache.sqoop.connector.ftp.FtpConnector                | TO                   |
    | hdfs-connector         | 1.99.7  | org.apache.sqoop.connector.hdfs.HdfsConnector              | FROM/TO              |
    | generic-jdbc-connector | 1.99.7  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector       | FROM/TO              |
    +------------------------+---------+------------------------------------------------------------+----------------------+
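The most basic of these is generic-jdbc-connector, the connector for MySQL and other relational databases reached over JDBC. It can both read from and write to the database (FROM/TO). The other connectors that work in both directions are hdfs-connector, kite-connector and oracle-jdbc-connector. sftp-connector, ftp-connector and kafka-connector support only the TO direction: they can be the destination of a job but never its source.
A link to a concrete data source is created from one of these templates. MySQL, for example, is accessed over JDBC, so its link is derived from the generic-jdbc-connector template. A link to HDFS is derived from the hdfs-connector template.
2) link: an object derived from a connector that holds the connection settings for one specific data source.
3) job: a single import or export task. It must name a source link and a destination link and set the job parameters, and it is then submitted to MapReduce for execution.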
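The relationship between the three concepts can be sketched in a few lines of Python. This is an illustrative model only, not Sqoop's API: the classes `Link` and `Job`, the `SUPPORTED` table and all link and job names are hypothetical, and the direction data is copied from the connector table above. It shows that a link is tied to one connector and that a job is only meaningful when its source connector supports FROM and its destination connector supports TO.

```python
from dataclasses import dataclass

# Supported directions per connector, copied from the `show connector` output.
SUPPORTED = {
    "oracle-jdbc-connector": {"FROM", "TO"},
    "sftp-connector": {"TO"},
    "kafka-connector": {"TO"},
    "kite-connector": {"FROM", "TO"},
    "ftp-connector": {"TO"},
    "hdfs-connector": {"FROM", "TO"},
    "generic-jdbc-connector": {"FROM", "TO"},
}


@dataclass(frozen=True)
class Link:
    """A named connection derived from one connector template (hypothetical model)."""
    name: str
    connector: str

    def __post_init__(self):
        if self.connector not in SUPPORTED:
            raise ValueError(f"unknown connector: {self.connector}")


@dataclass(frozen=True)
class Job:
    """A transfer from a source link to a destination link (hypothetical model)."""
    name: str
    from_link: Link
    to_link: Link

    def __post_init__(self):
        # The source connector must be readable, the destination writable.
        if "FROM" not in SUPPORTED[self.from_link.connector]:
            raise ValueError(f"{self.from_link.connector} cannot be a source")
        if "TO" not in SUPPORTED[self.to_link.connector]:
            raise ValueError(f"{self.to_link.connector} cannot be a destination")


mysql = Link("mysql-link", "generic-jdbc-connector")
hdfs = Link("hdfs-link", "hdfs-connector")
kafka = Link("kafka-link", "kafka-connector")

import_job = Job("mysql-to-hdfs", mysql, hdfs)  # valid: JDBC source, HDFS destination

try:
    Job("kafka-to-hdfs", kafka, hdfs)  # rejected: kafka-connector is TO-only
except ValueError as err:
    print(err)
```

In the real shell the same constraint surfaces when the job is created from two existing links. The model here only makes the rule explicit.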
II. Frequently used commands
    # Set
    set [server|option|truststore]

    # Show
    show [server|version|connector|driver|link|job|submission|option|role|principal|privilege]

    # Create
    create [link|job|role]

    # Delete
    delete [link|job|role]

    # Update
    update [link|job]

    # Clone
    clone [link|job]

    # Start a job
    start [job]

    # Stop a job
    stop [job]

    # Show job status
    status [job]

    # Enable
    enable [link|job]

    # Disable
    disable [link|job]

    # Grant
    grant [role|privilege]

    # Revoke
    revoke [role|privilege]
    set option --name verbose --value true
    set option --name poll-timeout --value 20000
III. Viewing
View the server
    sqoop:000> show server -all
    Server host: localhost
    Server port: 12000
    Server webapp: sqoop
View the version
    sqoop:000> show version -all
    client version:
      Sqoop 1.99.7 source revision 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
      Compiled by abefine on Tue Jul 19 16:08:27 PDT 2016
    server version:
      Sqoop 1.99.7 source revision 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
      Compiled by abefine on Tue Jul 19 16:08:27 PDT 2016
    API versions:
      [v1]
View the built-in connectors
    sqoop:000> show connector
    +------------------------+---------+------------------------------------------------------------+----------------------+
    | Name                   | Version | Class                                                      | Supported Directions |
    +------------------------+---------+------------------------------------------------------------+----------------------+
    | oracle-jdbc-connector  | 1.99.7  | org.apache.sqoop.connector.jdbc.oracle.OracleJdbcConnector | FROM/TO              |
    | sftp-connector         | 1.99.7  | org.apache.sqoop.connector.sftp.SftpConnector              | TO                   |
    | kafka-connector        | 1.99.7  | org.apache.sqoop.connector.kafka.KafkaConnector            | TO                   |
    | kite-connector         | 1.99.7  | org.apache.sqoop.connector.kite.KiteConnector              | FROM/TO              |
    | ftp-connector          | 1.99.7  | org.apache.sqoop.connector.ftp.FtpConnector                | TO                   |
    | hdfs-connector         | 1.99.7  | org.apache.sqoop.connector.hdfs.HdfsConnector              | FROM/TO              |
    | generic-jdbc-connector | 1.99.7  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector       | FROM/TO              |
    +------------------------+---------+------------------------------------------------------------+----------------------+
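For scripting, the table printed by `show connector` can be turned into structured data. The following is a minimal Python sketch and not part of Sqoop: the function name `parse_connector_table` and the shortened sample text are assumptions made for illustration, and the parser relies only on the pipe-delimited layout shown above.

```python
def parse_connector_table(text):
    """Parse the pipe-delimited table printed by `show connector`."""
    connectors = []
    for line in text.splitlines():
        line = line.strip()
        # Skip the prompt line, blank lines and the +----+ border lines.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row and anything that is not a four-column row.
        if len(cells) != 4 or cells[0] == "Name":
            continue
        name, version, class_name, directions = cells
        connectors.append({
            "name": name,
            "version": version,
            "class": class_name,
            "directions": set(directions.split("/")),
        })
    return connectors


# Shortened copy of the output shown above (border lines omitted).
SAMPLE = """
| Name                   | Version | Class                                                | Supported Directions |
| kafka-connector        | 1.99.7  | org.apache.sqoop.connector.kafka.KafkaConnector      | TO                   |
| hdfs-connector         | 1.99.7  | org.apache.sqoop.connector.hdfs.HdfsConnector        | FROM/TO              |
| generic-jdbc-connector | 1.99.7  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO              |
"""

connectors = parse_connector_table(SAMPLE)
# Connectors that can act as the source of a job.
source_capable = sorted(c["name"] for c in connectors if "FROM" in c["directions"])
print(source_capable)  # ['generic-jdbc-connector', 'hdfs-connector']
```

Keeping the directions as a set makes it easy to ask whether a connector can serve as a source ("FROM") or a destination ("TO").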
View links (none have been created yet, so the table is empty)
    sqoop:000> show link
    +------+----------------+---------+
    | Name | Connector Name | Enabled |
    +------+----------------+---------+
    +------+----------------+---------+
    show server --all
    show option --name verbose
    show version --all
    show connector --all or show connector
    show driver
    show link --all
    show link --name linkName
    show job --all
    show job --name jobName
    show submission
    show submission -j jobName
    show submission --job jobName --detail
IV. Creating
Create a link
    create link --connector connectorName
    create link -c connectorName
For example:
    create link --connector generic-jdbc-connector
    create link --connector hdfs-connector
Create a job
    create job --from fromLinkName --to toLinkName
    create job -f fromLinkName -t toLinkName
V. Updating
    update link --name linkName
    update job --name jobName
VI. Deleting
    delete link --name linkName
    delete job --name jobName
VII. Cloning
    clone link --name linkName
    clone job --name jobName
VIII. Starting a job
    start job --name jobName
    start job --name jobName --synchronous
IX. Stopping a job
    stop job --name jobName
X. Checking job status
    status job --name jobName
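When several jobs have to be run one after another, the start and status commands can be generated instead of typed by hand. Below is a minimal Python sketch under stated assumptions: the helper `job_run_commands` and the job names are hypothetical, it emits only the command forms listed in this article, and it accepts only simple job names because the shell's quoting rules are not covered here. How the generated lines are fed to the Sqoop shell is left open.

```python
import re

# Conservative assumption: allow only letters, digits, '_', '.' and '-'
# so that no quoting is needed in the generated commands.
_SIMPLE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def job_run_commands(job_names, synchronous=True):
    """Return the shell commands that start each job and then query its status."""
    lines = []
    for name in job_names:
        if not _SIMPLE_NAME.fullmatch(name):
            raise ValueError(f"unsupported job name: {name!r}")
        start = f"start job --name {name}"
        if synchronous:
            # Same flag as in the "Starting a job" section.
            start += " --synchronous"
        lines.append(start)
        lines.append(f"status job --name {name}")
    return lines


commands = job_run_commands(["mysql-to-hdfs", "hdfs-to-mysql"])
print("\n".join(commands))
```

Each job contributes a start line followed by a status line, so the output can be read top to bottom in execution order.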
Reposted from: http://www.cnblogs.com/avivaye/p/6196922.html