How to append the contents of array cells to .csv files?

Hello, I would like to export the values in the cells of an array into a .csv file, such that the contents of each cell array make up a single "value" in the .csv file. For example
data =
4 x 3 cell array
{'information'} {'labels'} {'odds and ends'}
{'more information'} {'more labels'} {'odds and ends"}
{'info'} {'a label'} {'odds and ends'}
{'data'} {'output'} {'abcd'}
If the file ('data.csv') already has data in it, how do I append the data so that I get the following:
information,labels,odds and ends
more information,more labels,odds and ends
info,a label,odds and ends
data,output,abcd
I have tried using dlmwrite('data.csv',data,'-append')
but this produces
i,n,f,o,r,m,a,t,i,o,n,l,a,b,e,l,s,o,d,d,,a,n,d,,e,n,d,s
m,o,r,e,,i,n,f,o,r,m,a...... and so on
I know I can use something like xlswrite, but the file is being opened and closed many times, with a lot of data (and using xlswrite seems to slow the process down).
Any suggestions? Thanks!
Robert

1 Comment

You'll have to write with fprintf the content of the cell array dereferenced to the strings...there's a rudimentary example "Export Cell Array to Text File" link at the bottom of the doc page for fprintf that illustrates the basic idea; you'll have to add the comma delimiter to the format as it's space-delimited and you would be well served to add the double-quotes around the strings since you do seem to have embedded blanks that are problematical on import oftentimes.

Sign in to comment.

 Accepted Answer

Robert
Robert on 7 Mar 2019
Edited: Robert on 7 Mar 2019
Thanks to you both! I was hoping there was simply a syntax error I was using that would allow me to read the cell array (or restructure it) to take advantage of the append function in dlmwrite(). Sounds like there is not a quick way to do this after all?
I ended up going a slightly different route. It's simple to place this data in an .csv file (or Excel file) using the xlswrite() function, but opening and closing it slows down the process. I ended up having the data process in sets, then saving the data in fewer, larger pieces to a .csv file using xlswrite(), and a script to identify the first empty line, in order to append the data. As follows (crude and simple, but it worked). Thanks for the suggestions!
% Identify the file name. If the file is a .csv file, the sheet name will simply be the file name
fileid = 'dataoutput.csv'
sheet = 'dataoutput'
line = xlsread(fileid,sheet,'A1:A10000')
linenumber = size(line)
linenumber = linenumber(1,1)
linenumber = num2str(linenumber)
firstemptyrow = ['A',linenumber]
% Then save the data to the file
xlswrite(fileid,dataset,sheet,firstemptyrow)

20 Comments

writetable is the simpler routine these days. dlmwrite was only designed for numeric data and will not be changed to support more complex use.
But writetable hasn't yet been extended to append to text files has it, Walter?
I think you can only use the 'Range' parameter with Excel files?
I've thought for a long time TMW should work on the output side of the import/export side to catch up on the writing end of things to what they have done for data import...
Choosing to leave csvwrite/dlmwrite in the dark ages just isn't very enlightened and while writetable is great for some purposes, it requires first casting to the table and then isn't totally general in its abilities.
I think you can use the Range value in any format that Excel can use, as long as you call it using the xlswrite() function. And I think this can be done even if Excel is not on the computer (is this correct)? Too bad Excel dumped .dbf files as a format, that was really handy for getting into Arc Shapefiles and working with data in the attribute tables.
if excel is not present or system is not Windows then xlswrite delegates to dlmwrite of csv file .
That's true, but then you're back to the high overhead/slowness of xlswrite
There's a FileExchange submittal that avoids the real bottleneck of builtin xlswrite in that it opens/closes the connection every call when you have multiple reads (iirc) I don't recall on the write side.
I'm not COM whizard but I know there have been Answers posted that illustrate doing it directly oneself with the result that performance is much better but I don't have a direct link to any of those saved, unfortunately.
xlwrite() in the File Exchange caches the connection. However, one of the R2017* releases added that internally so that there became no advantage to using the FEX contribution.
Ah! That's nice to know, Walter...I think I remember that being mentioned before but had slipped my mind...
I still think TMW should have general higher-level interfaces for CSV files, etc., that can deal with more than just plain vanilla numeric arrays.
Well, a bit of a follow up, I am hoping its something dumb and simple (I excel and dumb and simple).
I am using the process described above, and its going well and at a reasonable speed. However, there is data in the array, that is not being added to the file. For example, the following:
Capture.PNG
Is being output to the file like this:
Capture2.PNG
The 'R2' strings are being handled the same was the other strings are being handled. The output code I am using is the following:
Capture3.PNG
when the xlswrite() function is executed, the 'R2' string is in the overalloutputdata matrix, but it does not turn up in the file. Sorry to keep bothering you, but I am stuck on why this is happening.
Side note: the way that xlsx files are written as text (.xml) internally ends up coding numeric values as character vectors, typically without any special coding internally to indicate that they are intended as character rather than as numeric. Then when the xlsx file is read, each field is examined, and if it looks like it could be a valid number, it is converted to numeric -- even if there were leading or trailing spaces. That is why your '2013' become numeric 2013
At the moment I do not know where the 'R2' are going. You are using Excel on Windows, so xlswrite is using ActiveX to talk to Excel to write the data. I do not have Excel running on any of my Windows virtual machines, so I cannot test the combination.
I'm at a loss as to what the "'R2' string" is referring to...I don't see a variable R2 or anything containing the string R2 anywhere in the entire thread???
I do have Excel on Win7 so could piddle some, but would need an example dataset that you're trying to write and what you get vis a vis what you think you should get.
I'd note, however, that XLSWRITE is documented to just fail silently if any cell in the output cell array contains anything other than a scalar numeric or text by simply leaving the corresponding cell in the spreadsheet unwritten/empty.
Look in Robert's most recent posting of about an hour ago, in the first image posted, look in the column with the header of 8. All entries in the column are the string 'R2'. Then in the second image, showing the effect of xlswrite and examining the data in Excel, that column is all blank, whereas the surrounding columns are fine (other than some strings having been converted to numeric form.)
dpb
dpb on 8 Mar 2019
Edited: dpb on 8 Mar 2019
Would have to have the actual data...as in another poster's case earlier, what one sees in an image may not actually represent the real data at all...or in its entirety.
I'll bet the issue will be shown to be something like the previous limitation with xlswrite. I've never seen it fail as long as the data follow the specific rules.
OK, so perhaps we are getting closer to an answer? The issue is in the data itself. I am loading two data sets here. Once loaded, one is called data, and one is called overalloutputdata.
data is a synthetic data set I built by hand using the following code (sorry for the neophyte inefficiencies). When it is saved to a .xlsx file or .csv file, it works correctly!
overalloutputdata is an actual output from the scripts I am running. It is built using the same tools as the data set but obviously has some difference imbedded, since the 'R2' column does not export.
My apologies for including one of the files twice. Have not figured out out to remove it from the message.
>> overalloutputdata
overalloutputdata =
5×9 cell array
{'Russia'} {'Vladivostok'} {'RSM00031969'} {'2013'} {'15.3267'} {'15.3267'} {' 26'} {1×1 cell} {'M'}
{'Russia'} {'Vladivostok'} {'RSM00031969'} {'2014'} {'11.7679'} {'11.7679'} {'365'} {1×1 cell} {'M'}
{'Russia'} {'Vladivostok'} {'RSM00031969'} {'2015'} {'8.76443'} {'8.76443'} {'365'} {1×1 cell} {'M'}
{'Russia'} {'Vladivostok'} {'RSM00031969'} {'2016'} {' 13.136'} {' 13.136'} {'365'} {1×1 cell} {'M'}
{'Russia'} {'Vladivostok'} {'RSM00031969'} {'2017'} {'23.4883'} {'23.4883'} {'104'} {1×1 cell} {'M'}
>> overalloutputdata{1,8}
ans = 1×1 cell array
{'R2'}
As suspected, the offending column is a cell containing a cell array, NOT a cell containing a numeric or string...hence, XLSWRITE fails for that column.
Don't know how you built the array, but you somehow got an extra set of curlies "{}" or cellstr call for it that doesn't belong.
Time to dig through some of the code. Thanks for your help on this, the problem seems a bit anticlimatic.
Silent errors are not helpful!
Not sure one would be pleased with a hard error altho I'll agree a warning could be useful.
The lesson to be learned is to inspect the data itself carefully. I don't know how the image was made that appears to lie by showing the underlying data in the cell inside the cell, but clearly the data array itself doesn't follow the rules.
I don't use many (as in "any") of the user interface thingies TMW has built into Matlab so I'm not sure if that is a case of one of those tools being too helpful or what it was that led to the false impression of what your array actually was. I'd place the blame on that tool for lying to me if I were trying to actually pinpoint where the bug is. I don't know if that falls back onto TMW or not since don't know where the image actually came from or how was generated.
Robert's "Answer" moved here:
I understand wanting to avoid hard errors, but I admit I would prefer those to what is currently in place, since it would at least identify where to start looking for the problem.
I did find the problem in my code where the additional brackets were being added, making each cell an array. Now it works! Thanks again for all of your help, and your patience.
The image I provided of the datta is a clipped screen shot from MatLab, not sure why your version showed the additional brackets, and mine did not. Something in preferences or settings perhaps? Or version? (I am running 2018a). At the risk of further exposing my ignorance, what is TMW?
Thanks,
Robert
@Robert: TMW = The MathWorks, i.e. the company that produce MATLAB and Simulink.
You are currently looking at The MathWorks website (look at the page URL).
Ah yes. Once again, showing my ability to ignore the obvious! Cheers.
My output was from the straight command window; the command window doesn't have any of the presentation of data in a spreadsheet-like format. I'm guessing (since I don't use them for precisely such reasons) that you were using the variable editor window or whatever it is that TMW calls it, not just typing at the command line.
That tool is what "tried to help" by showing the root content of the cell but that disguised/hid that the actual array element was a cell containing a cell containing a string, not a cell containing a string.
Sometimes trying to be clever and helpful is more troublesome than it is actual help.

Sign in to comment.

More Answers (1)

If you must use dlmwrite(), then use
strjoin(arrayfun(@(IDX) strjoin(data(IDX,:), ','), 1:size(data,1), 'uniform', 0), '\n')
and dlmwrite that with a delimiter of '' (empty). Another way of phrasing this is that you need to pre-format all of the output and write it as a single vector with empty delimiter.
... Which is to say that using dlmwrite() for this is not recommended. Just fopen() with 'a' or 'at' mode and write the data yourself.

Asked:

on 7 Mar 2019

Commented:

dpb
on 8 Mar 2019

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!