Sqoop
Apache Sqoop™ is a tool designed to efficiently transfer bulk data between Apache Hadoop and structured datastores, such as relational databases.
This documentation provides information for using Sqoop and Sqoop2, but does not duplicate the Apache Sqoop™ documentation on the Apache Sqoop website.
The following table describes the differences between Sqoop1 and Sqoop2:
Feature | Sqoop1 | Sqoop2 |
---|---|---|
Specialized connectors for all major RDBMS | Available. | Not available. However, you can use the generic JDBC connector, which should work with any JDBC-compliant database, although specialized connectors probably give better performance. |
Data transfer from RDBMS to Hive | Done automatically. | Must be done manually in two stages. |
Data transfer from Hive to RDBMS | Must be done manually in two stages. | Must be done manually in two stages. |
Integrated Kerberos security | Supported. | Supported. |
Password encryption | Not supported. | Supported as of Sqoop 1.99.7. |
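
As an illustration of the "Done automatically" entry for Sqoop1, the following is a minimal sketch of how a Sqoop1 import with direct Hive loading might be assembled. It only builds and prints the command line and does not run Sqoop. The helper name `build_sqoop1_hive_import`, the JDBC URL, the user name, the password file path, and the table names are all hypothetical placeholders; the flags shown are standard Sqoop1 `import` options, but check them against the Apache Sqoop documentation for your version.

```python
import shlex


def build_sqoop1_hive_import(jdbc_url, username, password_file, table, hive_table):
    """Return the argument list for a Sqoop1 import that loads a table into Hive.

    All argument values are supplied by the caller; nothing here contacts
    a database or a cluster.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the source RDBMS
        "--username", username,
        "--password-file", password_file,  # file holding the password, instead of passing it inline
        "--table", table,                  # source table in the RDBMS
        "--hive-import",                   # ask Sqoop1 to load the imported data into Hive
        "--hive-table", hive_table,        # target Hive table name
    ]


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    argv = build_sqoop1_hive_import(
        "jdbc:mysql://db.example.com/sales",
        "etl_user",
        "/user/etl/.db-password",
        "orders",
        "orders",
    )
    # Print a shell-safe command line that could be reviewed before running it.
    print(shlex.join(argv))
```

Building the arguments as a list rather than a single string keeps quoting explicit, and the list can be passed to a process launcher once the values have been reviewed.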