Sqoop

IMPORTANT: This component is deprecated. Hewlett Packard Enterprise recommends using an alternate product. For more information, see Discontinued Ecosystem Components.

Apache Sqoop™ is a tool designed to efficiently transfer bulk data between Apache Hadoop and structured datastores, such as relational databases.

This documentation provides information for using Sqoop, but does not duplicate the Apache Sqoop™ documentation on the Apache Sqoop website.

The following table describes the features of Sqoop1:

Feature | Sqoop1
Specialized connectors for all major RDBMS | Available.
Data transfer from RDBMS to Hive | Done automatically.
Data transfer from Hive to RDBMS | Must be done manually in two stages: (1) Extract data from Hive into the file system, as a text file or as an Avro file. (2) Export the output of step 1 to an RDBMS using Sqoop.
Integrated Kerberos security | Supported.
Password encryption | Not supported.
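
The two-stage transfer from Hive to an RDBMS can be sketched as a pair of command lines. The following Python snippet only builds the commands and does not run them. It assumes that step 1 uses Hive's INSERT OVERWRITE DIRECTORY statement to write delimited text files, which is one possible way to extract the data and is not prescribed by this documentation. The JDBC URL, user name, table names, and export directory are hypothetical placeholders.

```python
import shlex


def hive_extract_command(hive_table, export_dir, delimiter=","):
    """Stage 1: write a Hive table to delimited text files under export_dir."""
    # Assumption: INSERT OVERWRITE DIRECTORY is used for the extraction step.
    query = (
        f"INSERT OVERWRITE DIRECTORY '{export_dir}' "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        f"SELECT * FROM {hive_table}"
    )
    return ["hive", "-e", query]


def sqoop_export_command(jdbc_url, username, rdbms_table, export_dir, delimiter=","):
    """Stage 2: export the files written in stage 1 to an RDBMS table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "-P",  # prompt for the password on the console
        "--table", rdbms_table,
        "--export-dir", export_dir,
        "--input-fields-terminated-by", delimiter,
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    export_dir = "/tmp/orders_export"
    stage1 = hive_extract_command("orders", export_dir)
    stage2 = sqoop_export_command(
        "jdbc:mysql://db.example.com/sales", "etl_user", "orders", export_dir
    )
    # Print shell-quoted command lines that could be reviewed before running.
    print(shlex.join(stage1))
    print(shlex.join(stage2))
```

The field delimiter passed to Sqoop must match the one used when the files were written in step 1. The target RDBMS table is expected to exist before the export is run.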