Parsing text files for Beginners

I am new to Matlab. Looking for something simple.
This is my data
14:27:45.535 -> Color Temp: 9888 K - Lux: 9155 - R: 23431 G: 20757 B: 21860 C: 47209
14:27:46.258 -> Color Temp: 9813 K - Lux: 9317 - R: 23760 G: 21059 B: 22144 C: 47781
14:27:46.979 -> Color Temp: 9111 K - Lux: 10240 - R: 25986 G: 22832 B: 23719 C: 50101
14:27:47.668 -> Color Temp: 7065 K - Lux: 6485 - R: 18168 G: 14743 B: 14873 C: 26192
14:27:48.394 -> Color Temp: 6879 K - Lux: 6104 - R: 17014 G: 13777 B: 13823 C: 24495
14:27:49.084 -> Color Temp: 6760 K - Lux: 8271 - R: 22906 G: 18545 B: 18530 C: 33597
14:27:49.813 -> Color Temp: 7217 K - Lux: 8778 - R: 24217 G: 19835 B: 20038 C: 37326
14:27:50.500 -> Color Temp: 7352 K - Lux: 9219 - R: 25368 G: 20858 B: 21131 C: 39681
14:27:51.223 -> Color Temp: 7360 K - Lux: 9070 - R: 24427 G: 20262 B: 20467 C: 38348
I want to parse R, G, B, C into a 4-dimensional matrix.
Please help, the online examples are too hard for a novice like me.

12 Comments

What are the four axes? Time, color temperature, lux, and color plane? If so, you will find the result to be very sparse, since you do not appear to duplicate times. You do not appear to duplicate color temperature or lux either, so it is not at all clear what the four axes should be.
I would suspect that you want a 2D matrix with 4 columns, rather than a 4D matrix.
I do not need time, color temperature, lux,
I just need red, green, blue, and clear, which are respectively represented as R: , G:, B:, C:.
Thanks for your help!
Assuming this is a text file, you can use the textscan function as follows.
fid = fopen('abc.txt');
out = textscan(fid,'%*D -> Color Temp: %*d K - Lux: %*d - R: %d G: %d B: %d C: %d');
% out will be a 1x4 cell array
out2 = horzcat(out{:});
% out2 will be 2d matrix of size Nx4
fclose(fid);
Thanks for your help. Did you try this code on my data? I'm new to MATLAB, but I wish to learn fast.
In the textscan function, what does '%*D -> Color Temp: %*d K - Lux: %*d - ' do?
Do we still need to include this when we just want R, G, B, C?
For the out2 line, what is out{:}?
Thank you!
out = textscan(fid,'%*D -> Color Temp: %*d K - Lux: %*d - R: %d G: %d B: %d C: %d');
This captures the data for all bits of information, including time, color temp, K, and Lux. I know you said you didn't need that information, but you can always discard the rows later.
out2 = horzcat(out{:});
out{:} is MATLAB syntax that expands the contents of all cells in 'out' into a comma-separated list. This is necessary when concatenating cell array contents; without it, horzcat(out) just hands you back the same array of cells.
Thanks for your help! After studying the code you wrote and researching for hours, I believe I have come to an understanding, which I would like you to verify.
fid = fopen('abc.txt');
% Just like in C++, we have to open the text file before we can use it, and fid is the file identifier. However, we do not import the data of the txt file here; we actually add the file to the MATLAB directory path so that it can be used.
out = textscan(fid,'%*D -> Color Temp: %*d K - Lux: %*d - R: %d G: %d B: %d C: %d');
% This uses the textscan function to parse the file and feed the result to the variable out. Everything we want to parse is in either the format %d (integer) or %D (datetime); the rest of the text on the line we just write out literally, for instance "-> Color Temp:" and "K - Lux:", since we do not want to parse it. However, my point of confusion is when you say that if we do not need the data we can discard it later, since we have 7 data points now: time, color temp, lux, R, G, B, C. But I believe that adding the * to the first three (%*D, %*d, %*d) discards the data. I tried it without the *, and it gives me a 1x7 matrix, so adding the * gives the right answer of a 1x4 matrix.
out2 = horzcat(out{:});
% For this line, out{:} gets all the array elements out. When I did answer = out{:}, I checked the variable and it gives the last row in a 1xN matrix. This is weird, since I believe that out{:} should give four Nx1 matrices, and when we horzcat these four Nx1 matrices, it should give an Nx4 matrix. Please enlighten me.
fclose(fid);
% Standard closing of the file, maybe to prevent memory leaks. Good practice.
I learned a lot. Please correct me if I said anything stupid. However, I believe I have a much better understanding now.
Everything you have in there looks correct to me.
I apologize, I missed the * on the first three data pieces when I skimmed the code. This does indeed automatically discard the data, but we need to keep that portion of the textscan call in the code because that allows us to make sure we are properly parsing the entire line.
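To see the effect of the * in isolation, here is a minimal sketch that runs textscan on a short character vector instead of a file. The sample text and the variable names (txt, kept, everything) are made up for illustration and are not part of the original data.

```matlab
% Two made-up lines of four integers each (illustrative data only)
txt = sprintf('10 20 30 40\n11 21 31 41');

% %*d reads an integer but discards it; %d reads and keeps it
kept = textscan(txt, '%*d %*d %d %d');       % 1x2 cell array
everything = textscan(txt, '%d %d %d %d');   % 1x4 cell array

disp(size(kept));        % number of cells equals the number of kept fields
disp(size(everything));
```

The skipped fields still have to appear in the format so that textscan can walk across the whole line; they simply do not produce an output cell.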
out{:} will indeed return something different from [out{:}]. Others will probably be able to give you a more specific answer, but the first should still display the entire set of results, just perhaps not as one array. Using horzcat, or [out{:}], brings all of the cell contents into one array and then displays that array.
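A small sketch of the difference, using a hand-built cell array as a hypothetical stand-in for the textscan output (only the first two data rows are included, and the names Rcol, Gcol, Bcol, Ccol are illustrative):

```matlab
% Stand-in for the textscan result: a 1x4 cell of Nx1 int32 columns
out = {int32([23431; 23760]), int32([20757; 21059]), ...
       int32([21860; 22144]), int32([47209; 47781])};

% out{:} expands to the list out{1}, out{2}, out{3}, out{4},
% so it can fill four separate output variables at once
[Rcol, Gcol, Bcol, Ccol] = out{:};

% Passing the same list to horzcat (or square brackets) glues the
% four Nx1 columns side by side into one Nx4 matrix
out2 = horzcat(out{:});
alsoOut2 = [out{:}];
```

Here out2 is a 2x4 int32 matrix whose first column equals Rcol, which matches the "four Nx1 matrices become one Nx4 matrix" intuition in the question.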
I have one last question, about k-means clustering.
I tried idx = kmeans(X,k);
For X I used the matrix out2,
and for k I put 2, for two clusters: grass and not grass.
However, this resulted in an error. Do you get the same error on your computer?
No idea. What is the entire error message?
Have you worked with K means clustering before?
Error using +
Integers can only be combined with integers of the same class, or scalar doubles.
Error in kmeans2>distfun (line 564)
D(:,i) = D(:,i) + (X(:,j) - C(i,j)).^2;
Error in kmeans2/loopBody (line 149)
minDist = min(minDist,distfun(X,C(ii-1,:),distance));
Error in internal.stats.parallel.smartForReduce (line 136)
reduce = loopbody(iter, S);
Error in kmeans2 (line 53)
ClusterBest = internal.stats.parallel.smartForReduce(...
Error in kmeans (line 322)
[varargout{1:nargout}] = kmeans2(X,k, distance, emptyact,reps,start,...
Error in Parsing (line 7)
idx = kmeans(out2,2); %out2 is the matrix ; 2 is the number of clusters
After adding one line to cast the matrix to double, the code ran. Thanks for your help! Do you have any resources or links on k-means clustering? Anyway, this post is concluded, as I now have a good base and working code. Thanks for being my first mentor, as I am a complete novice in MATLAB.
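The exact line that was added is only visible in the attached screenshot, so the following is a guess at what it probably looked like rather than the poster's actual code. It assumes out2 is an int32 matrix (which is what %d in textscan produces) and that the Statistics and Machine Learning Toolbox is installed; the variable X and the rng call are additions for illustration.

```matlab
% R, G, B, C values copied from the data in the question, stored as int32
% to mimic what textscan returns for %d fields
out2 = int32([23431 20757 21860 47209
              23760 21059 22144 47781
              25986 22832 23719 50101
              18168 14743 14873 26192
              17014 13777 13823 24495
              22906 18545 18530 33597
              24217 19835 20038 37326
              25368 20858 21131 39681
              24427 20262 20467 38348]);

disp(class(out2));   % shows the integer class that triggers the error

X = double(out2);    % kmeans does floating-point arithmetic internally
rng(0);              % fix the random start so repeated runs agree
idx = kmeans(X, 2);  % one cluster label (1 or 2) per row
```

Which rows end up labelled 1 and which 2 is arbitrary, so the labels still need to be matched to "grass" and "not grass" by inspection.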

Answers (1)

There are several ways to read this data in. While you can do this using a series of commands, an easier approach (especially if you only need to read this file in once) is to use the interactive data importing tool. I did this for your file and the steps I followed were:
  • Open the file in the Import Tool
  • Tell MATLAB that the file is space-delimited rather than fixed-width
  • Select (using Ctrl-click) just the "important" columns (A, E, I, L, N, P, and R)
  • Give those selected columns informative names (Time, Temp, Lux, R, G, B, and C)
  • Modify the format of column A slightly, to add in the fractional seconds
  • Tell MATLAB to import the selected data.
Here's what the result of that importing looked like as a table array.
sampledata =
  9×7 table
        Time        Temp     Lux       R        G        B        C
    ____________    ____    _____    _____    _____    _____    _____
    14:27:45.535    9888     9155    23431    20757    21860    47209
    14:27:46.258    9813     9317    23760    21059    22144    47781
    14:27:46.979    9111    10240    25986    22832    23719    50101
    14:27:47.668    7065     6485    18168    14743    14873    26192
    14:27:48.394    6879     6104    17014    13777    13823    24495
    14:27:49.084    6760     8271    22906    18545    18530    33597
    14:27:49.813    7217     8778    24217    19835    20038    37326
    14:27:50.500    7352     9219    25368    20858    21131    39681
    14:27:51.223    7360     9070    24427    20262    20467    38348
From there I can use normal table indexing to retrieve data.
>> sampledata{5, 'R'}
ans =
17014
>> R = sampledata.R;
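If the goal is the N-by-4 numeric matrix asked for in the question, curly-brace indexing with several variable names should return it in one step. The sketch below rebuilds a three-row version of the table by hand (an assumption, since the real table comes from the Import Tool) and uses an illustrative variable name, rgbc.

```matlab
% Hand-built stand-in for the imported table (first three rows only)
R = [23431; 23760; 25986];
G = [20757; 21059; 22832];
B = [21860; 22144; 23719];
C = [47209; 47781; 50101];
sampledata = table(R, G, B, C);

% Curly braces extract the contents; listing four variables gives
% a numeric matrix with one column per variable
rgbc = sampledata{:, {'R', 'G', 'B', 'C'}};
disp(size(rgbc));
```

Because the columns here are stored as doubles, rgbc is a double matrix.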
If you need to import multiple files with that same format, you could use Import Tool on the first file to get the formatting handled correctly then generate code from the Import Tool to read in the rest of the files.
If you want to see how you would do this purely via commands, I can show that.
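One possible command-only route, offered as a sketch rather than the approach the Import Tool generates, is to pull the four numbers out of each line with a regular expression. The text is defined inline so the example runs on its own; for a real file, something like txt = fileread('abc.txt') would take its place. The names lines, txt, pattern, tok, and rgbc are illustrative.

```matlab
% Two lines copied from the question
lines = { ...
    '14:27:45.535 -> Color Temp: 9888 K - Lux: 9155 - R: 23431 G: 20757 B: 21860 C: 47209', ...
    '14:27:46.258 -> Color Temp: 9813 K - Lux: 9317 - R: 23760 G: 21059 B: 22144 C: 47781'};
txt = strjoin(lines, sprintf('\n'));

% Capture the digits that follow each of the R:, G:, B:, C: labels
pattern = 'R:\s*(\d+)\s+G:\s*(\d+)\s+B:\s*(\d+)\s+C:\s*(\d+)';
tok = regexp(txt, pattern, 'tokens');   % one 1x4 cell of strings per line

% Stack the per-line cells into an Nx4 cell, then convert to numbers
rgbc = str2double(vertcat(tok{:}));     % Nx4 double matrix
disp(rgbc);
```

Since str2double returns doubles, the resulting matrix can be passed to functions such as kmeans without a separate cast.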

5 Comments

Thanks for showing this. Now I have the matrix from the txt file.
I have one last question, about k-means clustering.
I tried idx = kmeans(X,k);
For X I used the matrix out2,
and for k I put 2, for two clusters: grass and not grass.
However, this resulted in an error. Do you get the same error on your computer?
What's the full and exact text of the error message that you received when you tried to call kmeans? Show us all the text displayed in red and/or orange when you executed that command. The exact wording may be important, so copy it exactly as it is shown in the Command Window.
Error using +
Integers can only be combined with integers of the same class, or scalar doubles.
Error in kmeans2>distfun (line 564)
D(:,i) = D(:,i) + (X(:,j) - C(i,j)).^2;
Error in kmeans2/loopBody (line 149)
minDist = min(minDist,distfun(X,C(ii-1,:),distance));
Error in internal.stats.parallel.smartForReduce (line 136)
reduce = loopbody(iter, S);
Error in kmeans2 (line 53)
ClusterBest = internal.stats.parallel.smartForReduce(...
Error in kmeans (line 322)
[varargout{1:nargout}] = kmeans2(X,k, distance, emptyact,reps,start,...
Error in Parsing (line 7)
idx = kmeans(out2,2); %out2 is the matrix ; 2 is the number of clusters
What's the class of out2?
If it's a table array or a cell array, use class to determine the class of the first variable in the table or the first cell in the cell array. The following should work:
class(out2{1, 1})
After adding one line to cast the matrix to double, the code ran. Thanks for your help! Do you have any resources or links on k-means clustering? Anyway, this post is concluded, as I now have a good base and working code. Thanks for being my first mentor, as I am a complete novice in MATLAB.

Asked:

on 24 Dec 2019

Commented:

on 16 Jan 2020
