maexadata - Welcome to the world of mega-exabytes!!!


Sep 23, 2014

Loading bulk data into Hadoop from production systems or accessing it from map-reduce applications running on large clusters can be a challenging task. Transferring data using scripts is inefficient and time-consuming.

How do we efficiently move data from an external storage into HDFS or Hive or HBase?

Meet Apache Sqoop. Sqoop allows easy import and export of data from structured data stores such as relational databases, enterprise data warehouses, and NoSQL systems. The dataset being transferred is sliced up into different partitions and a map-only job is launched with individual mappers responsible for transferring a slice of this dataset.

Go Back