Monday, March 4, 2019

Sqoop Import Analysis

The analysis below compares four Sqoop file formats for importing the same set of records (1114 rows) from MySQL into HDFS.
--as-textfile: (Size in HDFS: 61 KB) (Time taken: 70 seconds)
19/03/04 04:27:43 INFO mapreduce.ImportJobBase: Transferred 60.8369 KB in 69.8476 seconds (891.8985 bytes/sec)
19/03/04 04:27:43 INFO mapreduce.ImportJobBase: Retrieved 1114 records.

--as-avrodatafile: (Size in HDFS: 58 KB) (Time taken: 63 seconds)
19/03/04 04:24:07 INFO mapreduce.ImportJobBase: Transferred 58.124 KB in 62.5104 seconds (952.1461 bytes/sec)
19/03/04 04:24:07 INFO mapreduce.ImportJobBase: Retrieved 1114 records.

--as-sequencefile: (Size in HDFS: 81 KB) (Time taken: 109 seconds)
19/03/04 04:14:57 INFO mapreduce.ImportJobBase: Transferred 80.7959 KB in 109.3956 seconds (756.2921 bytes/sec)
19/03/04 04:14:57 INFO mapreduce.ImportJobBase: Retrieved 1114 records.

--as-parquetfile: (Size in HDFS: 28 KB) (Time taken: 82 seconds)
19/03/04 04:17:17 INFO mapreduce.ImportJobBase: Transferred 32.2197 KB in 82.2192 seconds (401.281 bytes/sec)
19/03/04 04:17:17 INFO mapreduce.ImportJobBase: Retrieved 1114 records.
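
The four summary lines above can be compared programmatically. The following is a minimal Python sketch, not part of the original test: it parses the "Transferred ... KB in ... seconds" text copied from the logs and recomputes the throughput as KB * 1024 / seconds, which matches the bytes/sec figures Sqoop printed. The names LOG_LINES, PATTERN and parse_summary are made up for this illustration, and the pattern assumes the size is reported in KB, as it is in all four lines here.

```python
import re

# Summary text copied from the Sqoop log lines above, keyed by import flag.
LOG_LINES = {
    "--as-textfile": "Transferred 60.8369 KB in 69.8476 seconds (891.8985 bytes/sec)",
    "--as-avrodatafile": "Transferred 58.124 KB in 62.5104 seconds (952.1461 bytes/sec)",
    "--as-sequencefile": "Transferred 80.7959 KB in 109.3956 seconds (756.2921 bytes/sec)",
    "--as-parquetfile": "Transferred 32.2197 KB in 82.2192 seconds (401.281 bytes/sec)",
}

# Only the KB unit is handled, because that is the unit in the lines above.
PATTERN = re.compile(r"Transferred ([\d.]+) KB in ([\d.]+) seconds")


def parse_summary(line):
    """Return (kilobytes, seconds) from one 'Transferred ...' summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognised summary line: {line!r}")
    return float(match.group(1)), float(match.group(2))


results = {flag: parse_summary(line) for flag, line in LOG_LINES.items()}

# Smallest transfer and shortest run across the four formats.
smallest = min(results, key=lambda flag: results[flag][0])
fastest = min(results, key=lambda flag: results[flag][1])

for flag, (kb, seconds) in results.items():
    throughput = kb * 1024 / seconds  # bytes per second
    print(f"{flag:<20} {kb:>8.2f} KB {seconds:>8.2f} s {throughput:>8.1f} bytes/sec")

print("smallest transfer:", smallest)
print("shortest run:", fastest)
```

Note that the parquet log line reports 32.2197 KB transferred while the size measured in HDFS is 28 KB; the sketch uses the log value.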

Conclusion (from this single run of 1114 records):
If space is the concern, use --as-parquetfile (smallest output).
If speed is the concern, use --as-avrodatafile (shortest import time, about 63 seconds).
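
The post does not show the commands that were run, so the following Python sketch only illustrates how such a comparison might be repeated with one import per format. The flags --connect, --username, --password-file, --table and --target-dir are standard Sqoop import options; the JDBC URL, user, password file, table name and target directory are all hypothetical placeholders, as is the helper name build_import_command. The script only prints the commands and does not execute them.

```python
import shlex

# The four format flags compared in this post.
FORMAT_FLAGS = [
    "--as-textfile",
    "--as-avrodatafile",
    "--as-sequencefile",
    "--as-parquetfile",
]


def build_import_command(format_flag, jdbc_url, username, password_file, table, target_root):
    """Build a 'sqoop import' argument list that differs only in the format flag."""
    if format_flag not in FORMAT_FLAGS:
        raise ValueError(f"unknown format flag: {format_flag}")
    # One target directory per format, e.g. <target_root>/parquetfile
    target_dir = f"{target_root}/{format_flag[len('--as-'):]}"
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--target-dir", target_dir,
        format_flag,
    ]


# All connection details below are hypothetical placeholders.
for flag in FORMAT_FLAGS:
    command = build_import_command(
        flag,
        jdbc_url="jdbc:mysql://localhost/testdb",
        username="sqoop_user",
        password_file="/user/sqoop/.password",
        table="customers",
        target_root="/tmp/sqoop_format_test",
    )
    print(shlex.join(command))
```

Keeping every other option identical across the four runs means that any difference in size or time can be attributed to the file format alone.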
