How to csvwrite an array in HDFS

Hi,
I have Matlab connected to a Hadoop cluster. I want to save an array as a csv file to HDFS. The only commend I found was:
location = 'hdfs://master/user/savecsvfile';
write(location, Array1 );
Then I checked the HDFS directory and found a file "part-1-snapshot.seq" whose contents differ from Array1.
Thanks,

 Accepted Answer

The write command can only write tall arrays, and it saves them as .seq files. So if you want to save CSV files to HDFS, first save the CSV file locally, then upload it to HDFS using Hadoop commands. From MATLAB:
csvwrite('hoge.csv', Array1);
!hadoop fs -copyFromLocal ./hoge.csv /user/savecsvfile

6 Comments

Thanks, it worked! Do you know how to read the .seq file back from HDFS?
To read .seq files from HDFS, use datastore. For example,
location = 'hdfs://master/some/path/to/seqfiles';
ds = datastore(location);
For details on datastore, please see this doc.
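For completeness, a minimal sketch of reading the .seq contents back into memory. This assumes the sequence files were produced by write from a tall array, so datastore resolves them automatically; the path is the same placeholder used above:

```matlab
% Point a datastore at the HDFS location holding the .seq files
location = 'hdfs://master/some/path/to/seqfiles';
ds = datastore(location);

% readall loads the entire datastore into memory at once;
% for data too large to fit in memory, call read in a hasdata loop instead
data = readall(ds);
```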
Thanks again! Sorry to bother you, but is there a way to add a timestamp to this command (so it is saved as "savecsvfile16-01-18 14:10")?
!hadoop fs -copyFromLocal ./hoge.csv /user/savecsvfile
Do you want to create an HDFS directory with a timestamp? If so, try this:
formatOut = 'dd-mm-yy_HHMM';
command = ['hadoop fs -copyFromLocal ./hoge.csv /user/savecsvfile' datestr(now,formatOut)];
status = system(command);
In this case, the csv file will be stored in savecsvfile16-01-18_1410. (Note the underscore: spaces in HDFS paths would need extra quoting in the shell command.)
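Since system returns a nonzero status when the shell command fails, a slightly more defensive version of the same upload (same hypothetical file name and path as above) might look like:

```matlab
formatOut = 'dd-mm-yy_HHMM';
% Build the hadoop command with the timestamp appended to the target directory
command = ['hadoop fs -copyFromLocal ./hoge.csv /user/savecsvfile' datestr(now,formatOut)];
status = system(command);

% A nonzero exit status means the copy failed (e.g. hadoop not on PATH,
% target already exists, or HDFS is unreachable)
if status ~= 0
    error('hadoop fs -copyFromLocal failed with status %d', status);
end
```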
Thanks a lot! I'm so grateful to you
How do I connect MATLAB to a Hadoop cluster, and how can I fix this issue?
Parallel mapreduce execution on the Hadoop cluster:
********************************
* MAPREDUCE PROGRESS *
********************************
Map 0% Reduce 0%
Error using mapreduce (line 125)
The HADOOP job failed to submit. It is possible that there is some issue with the HADOOP configuration.
OS: windows

