Example: Create an ORC file in the file system by storing the data in a Hive table and uploading it to Pig
About this task
IMPORTANT This component is deprecated. Hewlett Packard
Enterprise recommends using an alternate product. Deprecated components are either in
maintenance or have reached the end of their maintenance lifecycle. For more information,
see Discontinued Ecosystem Components.
You can create an ORC format file in the file system by using Hive to load a text file into a table with ORC storage. Then, you can upload the resulting ORC format file to Pig.
Procedure
-
Create a sample test data file:
cd /home/mapr
nano test_pig.data
chown mapr:mapr test_pig.data
-
Add data to the file, one record per line:
John,Smith
Brian,May
Rodger,Taylor
John,Deacon
Max,Plank
Freddie,Mercury
Albert,Einstein
Fedor,Dostoevsky
Lev,Tolstoy
Niccolo,Paganini
NOTE Do not include any extra lines at the end of the file.
-
Upload the test data to a Hive table:
sudo -u mapr hive
hive> create table test_pig(first_name string, last_name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
hive> load data local inpath '/home/mapr/test_pig.data' overwrite into table test_pig;
-
Create a Hive table with ORC storage:
hive> create table test_pig_orc(first_name string, last_name string) stored as orc tblproperties ("orc.compress"="NONE");
hive> insert overwrite table test_pig_orc select * from test_pig;
hive> select * from test_pig_orc;
-
Check that the ORC file was created:
hadoop fs -ls /user/hive/warehouse/test_pig_orc
-
Upload the ORC file to Pig:
sudo -u mapr pig
grunt> B = load '/user/hive/warehouse/test_pig_orc/000000_0' using OrcStorage();
grunt> dump B;
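The first two steps (creating the data file and making sure it has no extra lines at the end) can also be done without an interactive editor. The following is a minimal sketch, not part of the original procedure. The DATA_FILE, RECORDS, and BLANKS variables are hypothetical names introduced for illustration, and the file is written to the current directory by default; set DATA_FILE=/home/mapr/test_pig.data to match the path used in the steps above.

```shell
#!/bin/sh
# Sketch: write the sample records without opening an editor.
# DATA_FILE is a hypothetical variable; the procedure above uses
# /home/mapr/test_pig.data.
DATA_FILE="${DATA_FILE:-./test_pig.data}"

# printf reuses the format string for every argument, so each record
# ends with exactly one newline and no blank line follows the last one.
printf '%s\n' \
  'John,Smith' \
  'Brian,May' \
  'Rodger,Taylor' \
  'John,Deacon' \
  'Max,Plank' \
  'Freddie,Mercury' \
  'Albert,Einstein' \
  'Fedor,Dostoevsky' \
  'Lev,Tolstoy' \
  'Niccolo,Paganini' > "$DATA_FILE"

# Count the records and any empty lines before loading the file into Hive.
# grep -c exits non-zero when nothing matches, hence the "|| true".
RECORDS=$(wc -l < "$DATA_FILE" | tr -d ' ')
BLANKS=$(grep -c '^$' "$DATA_FILE" || true)
echo "records=$RECORDS blank_lines=$BLANKS"
```

If the reported number of blank lines is not zero, remove the empty lines before running the load step, because Hive would otherwise load each of them as a row.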