Code covered by the BSD License  

4.8 | 29 ratings | 253 Downloads (last 30 days) | File Size: 2.94 KB | File ID: #28518

xml2struct

20 Aug 2010 (Updated )

Convert an xml file into a MATLAB structure for easy access to the data.

File Information
Description

Convert an xml file into a MATLAB structure for easy access to the data.

Acknowledgements

This file inspired Freehand Prostate Annotation and Collada Parser.

MATLAB release: MATLAB 7.9 (R2009b)
Other requirements: xmlread
Comments and Ratings (37)
12 Jun 2014 Simon du Plooy

Some of the attributes in the XML file had underscores at the beginning, which caused an error because such names are not allowed as field names. A simple strrep solved the problem.

Great!
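
MATLAB field names must begin with a letter, so an attribute such as `_id` cannot be used as a dynamic field name as-is. The following is a minimal sketch of one way to work around that, not the commenter's actual change: the attribute name `_id` and the `x` prefix are hypothetical, and `regexprep` is used here instead of `strrep` so that only a leading underscore is touched.

```matlab
% Hypothetical attribute name as it might appear in an XML file.
rawName = '_id';

% Prefix a leading underscore with a letter so the name becomes a valid
% field name. The prefix 'x' is an arbitrary choice for this sketch.
safeName = regexprep(rawName, '^_', 'x_');

attributes = struct();
attributes.(safeName) = '42';   % dynamic field assignment now succeeds

% isvarname reports whether a string is a valid variable/field name.
isValid = isvarname(safeName);
```

A plain `strrep(rawName, '_', '')` would also remove underscores in the middle of a name, which may or may not be what is wanted.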

11 Mar 2014 Luis Miguel Escobar Falcón

Excellent

06 Mar 2014 Fredrik  
06 Mar 2014 Rody Oldenhuis  
25 Feb 2014 Timo Dörsam  
05 Jan 2014 simba forrest  
13 Aug 2013 Anders Bergåker  
02 Aug 2013 Mark Mikofski

Stop using XML and use the static XML.toJSONObject() method [2] from json.org/java [1]; there's a precompiled jar file in my Dropbox [3]. Alternatively, use James Newton-King's Json.NET [4], which he has already precompiled and made available on CodePlex [5]: just download and unzip it, then use the version for the .NET Framework on your machine. Converting between XML and JSON is described in the documentation [6] and in this SO post [7]. See the MATLAB documentation for more information on using Java [8] or .NET [9] in MATLAB. It's super easy!
[1] http://json.org/java/
[2] http://json.org/javadoc/org/json/XML.html#toJSONObject(java.lang.String)
[3] https://dl.dropboxusercontent.com/u/19049582/JSON.jar
[4] http://james.newtonking.com/pages/json-net.aspx
[5] https://json.codeplex.com/
[6] http://james.newtonking.com/projects/json/help/index.html?topic=html/ConvertingJSONandXML.htm
[7] http://stackoverflow.com/a/814027/1020470
[8] http://www.mathworks.com/help/matlab/using-java-libraries-in-matlab.html
[9] http://www.mathworks.com/help/matlab/using-net-libraries-in-matlab.html

01 Aug 2013 Fábio Nery

I've seen some other users report this issue but could not find how to fix this:

Undefined function 'toCharArray' for input arguments of type 'double'.

Any idea?

Regards

03 Jul 2013 Varoujan

Works well.
Didn't fully test for empty field cases like some commenters but I got a nice structure out of my input file.

I am disappointed that similar functionality isn't built into MATLAB. With xmlread and xmlwrite alone, it is such a pain to access and/or update XML data.

19 Jun 2013 Adam

Hi,
Thanks for the file, it works great.
But I also have the same problem as Erik with empty data fields. Does anyone know how to fix this?

06 May 2013 Yu

Faster than xml_read, recommended!

24 Apr 2013 Erik

Thanks for the file, however I'm having an issue with empty data fields.

I have a 100x50 XML data set which I can easily import into Excel. However, a few fields are empty. For example, at (5,35:40) the XML data is empty.

When I use xml2struct and then try to create a cell array in the same format (100x50), the data in row 5 at 40:50 shifts to positions 35:45, leaving 5 empty cells at 45:50, so the data is misaligned.

Any idea on how to deal with empty fields in order to maintain their position in the original file?

Thanks!
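
One possible cause of this kind of shift is that empty values are silently dropped when results are concatenated or appended one after another. The sketch below fills a preallocated cell array by explicit index, so an empty element keeps its column. It is an illustration under assumptions: the input `cells` is a hand-built stand-in for parsed data, and the `Text` field name and the idea that an empty element appears as a struct without that field are not confirmed by this page.

```matlab
% Hand-built stand-in for one parsed row: the second element is empty,
% modelled here as a struct without a 'Text' field (an assumption).
cells = {struct('Text', 'a'), struct(), struct('Text', 'c')};

% Preallocate so every column has a slot, then fill by index.
row = cell(1, numel(cells));
for k = 1:numel(cells)
    if isfield(cells{k}, 'Text')
        row{k} = cells{k}.Text;
    else
        row{k} = '';            % keep the position, store an empty string
    end
end
```

Filling by index rather than appending only the non-empty values keeps column `k` of the output tied to element `k` of the input.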

22 Apr 2013 Rosie Vakasilimi

I was just wondering if someone could confirm that what I am doing is correct. When I want to convert XML into a MATLAB array, I type:
data = xml2struct('name of the file I want to convert');
Is that all?

12 Apr 2013 Michael Pelz-Sherman

We are encountering the same issue reported by Raoul Herzog: Undefined function or method 'toCharArray' for input arguments of type 'double'. Is there a fix for this?

10 Dec 2012 Neill Weiss

For the comment bug, @Sirius3, I changed the following code block from:

if (~strcmp(name,'#text') && ~strcmp(name,'#comment') && ~strcmp(name,'#cdata_dash_section'))
    %XML allows the same elements to be defined multiple times,
    %put each in a different cell
    if (isfield(children,name))
        if (~iscell(children.(name)))
            %put existing element into cell format
            children.(name) = {children.(name)};
        end
        index = length(children.(name))+1;
        %add new element
        children.(name){index} = childs;
        if(~isempty(fieldnames(text)))
            children.(name){index} = text;
        end
        if(~isempty(attr))
            children.(name){index}.('Attributes') = attr;
        end
    else
        %add previously unknown (new) element to the structure
        children.(name) = childs;
        if(~isempty(text) && ~isempty(fieldnames(text)))
            children.(name) = text;
        end
        if(~isempty(attr))
            children.(name).('Attributes') = attr;
        end
    end
else

to

if (~strcmp(name,'#text') && ~strcmp(name,'#comment') && ~strcmp(name,'#cdata_dash_section'))
    %XML allows the same elements to be defined multiple times,
    %put each in a different cell
    if (isfield(children,name))
        if (~iscell(children.(name)))
            %put existing element into cell format
            children.(name) = {children.(name)};
        end
        index = length(children.(name))+1;
        %add new element
        children.(name){index} = childs;
        %copy the text fields one by one instead of overwriting the element
        textFieldNames = fieldnames(text);
        for t = 1:length(textFieldNames)
            textFieldName = textFieldNames{t};
            children.(name){index}.(textFieldName) = text.(textFieldName);
        end
        if(~isempty(attr))
            children.(name){index}.('Attributes') = attr;
        end
    else
        %add previously unknown (new) element to the structure
        children.(name) = childs;
        if(~isempty(text) && ~isempty(fieldnames(text)))
            %copy the text fields one by one instead of overwriting the element
            textFieldNames = fieldnames(text);
            numTextFieldNames = length( textFieldNames );
            for i = 1:numTextFieldNames
                thisFieldName = textFieldNames{i};
                children.(name).(thisFieldName) = text.(thisFieldName);
            end
        end
        if(~isempty(attr))
            children.(name).('Attributes') = attr;
        end
    end
else

Now, the children.(name) properties are not blown away when a comment is parsed.

24 Nov 2012 Sirius3

Bug: child nodes get lost when there are comments between them (line 95).

27 Oct 2012 Gledi

First of all, thanks for the excellent code.
I have a "small" problem regarding the cells. In your code, a cell is created only if there is MORE THAN ONE child. What should I change so that a cell (with one element) is created even if the node has ONLY ONE child?
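
One way to get uniform handling without modifying xml2struct is to wrap the value on the caller's side. This is only a sketch under assumptions: `node`, `item`, and the `Text` field are hypothetical names standing in for a parsed element with a single child.

```matlab
% Hypothetical parsed node with a single <item> child (a struct, not a cell).
node = struct('item', struct('Text', 'only child'));

% Normalise to a cell array so one child and many children look the same.
items = node.item;
if ~iscell(items)
    items = {items};
end

% The same loop now works for one or many children.
texts = cell(1, numel(items));
for k = 1:numel(items)
    texts{k} = items{k}.Text;
end
```

The trade-off is that every access site needs the wrapper, whereas a change inside the parser would apply everywhere at once.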

24 Sep 2012 Matthew

Worked very well for me. Thank you so much.

26 Jul 2012 Raoul Herzog

There seems to be a bug in xml2struct :
I can provide you the corresponding xml file if needed.

??? Undefined function or method 'toCharArray' for input arguments of type 'double'.

Error in ==> xml2struct>parseAttributes at 174
str = toCharArray(toString(item(theAttributes,count-1)))';

Error in ==> xml2struct>getNodeData at 141
attr = parseAttributes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct>getNodeData at 147
[childs,text,textflag] = parseChildNodes(theNode);

Error in ==> xml2struct>parseChildNodes at 72
[text,name,attr,childs,textflag] = getNodeData(theChild);

Error in ==> xml2struct at 57
s = parseChildNodes(xDoc);

12 May 2012 Xiaohu  
20 Mar 2012 Ivan Smirnov

One of the problems that I personally encountered is that xml2struct can't handle CDATA blocks.

It can be easily fixed: replace line 67 with:
if (~strcmp(name,'#text') && ~strcmp(name,'#comment') && ~strcmp(name,'#cdata_dash_section'))
and line 94 with:
elseif (strcmp(name,'#text') || strcmp(name, '#cdata_dash_section'))

Works great otherwise, thanks.

27 Feb 2012 ali

Excellent! I was pulling my hair out trying to read numbers from an XML file, and with this I did it in one minute.

18 Jan 2012 Kevin Moerman

Works great for small files. I tested it for some larger files with >100000 entries and this takes around 178 seconds.

18 Jan 2012 Kevin Moerman  
12 Sep 2011 Brad  
04 Jul 2011 Wouter Falkena

Thank you for this suggestion, Mr. Wanner. I have updated the file and it is currently under review by MATLAB Central. It will appear here shortly.

14 Jun 2011 Adrian Wanner

Thanks for your work.
You might want to speed up the attribute parsing by about 40% by replacing lines 152-154 with the following:
str=theAttributes.item(count-1).toString.toCharArray()';
k=strfind(str,'=');
attr_name = regexprep(str(1:(k(1)-1)),'[-:.]','_');
attributes.(attr_name) =str((k(1)+2):(end-1));
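
The string handling in the snippet above can be traced on a plain character array. The sketch below assumes the attribute's string form is `name="value"`, which is what the `+2` and `end-1` offsets imply; the sample attribute `xml:lang="en"` is made up for illustration and no Java object is involved.

```matlab
% Made-up attribute text in the assumed form name="value".
str = 'xml:lang="en"';

% Locate the first '=' that separates the name from the quoted value.
k = strfind(str, '=');

% Everything before '=' is the name; replace characters that are not
% allowed in field names with underscores.
attr_name = regexprep(str(1:(k(1)-1)), '[-:.]', '_');

% Skip '=' and the opening quote, and drop the closing quote.
attr_value = str((k(1)+2):(end-1));

attributes = struct();
attributes.(attr_name) = attr_value;
```

Using `k(1)` rather than `k` matters when the value itself contains an `=` character, since `strfind` returns every match.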

22 May 2011 Mark

Thanks, your auto field naming system worked great for me to work with data parsed out from XML files.

01 Apr 2011 Bernard

Thanks a lot! I finally came across a tool that can extract info from a ISO19115/19139 xml file.

05 Mar 2011 Joao Henriques

Simple and works pretty well! The structures are a bit verbose but they're supposed to be parsed by my program anyway; any attempts to collapse some of the nested structures would only slow down the code (some similar submissions do this but are much slower). Thanks!

24 Nov 2010 Krishnan Suresh

Thanks very much! I used it to read a Collada file (a geometry file from Google SketchUp). Worked like a charm!

22 Nov 2010 Wouter Falkena

You are correct. I have removed the '.xml' extension assumption, unless the file cannot be found. The updated file is currently under review by MATLAB Central and should appear here soon.
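
The behaviour described here (use the name as given, and fall back to a '.xml' suffix only when the file cannot be found) might look roughly like the sketch below. This is not the code in xml2struct; the variable names and the temporary file are assumptions made so the example runs on its own.

```matlab
% Create a temporary file ending in '.xml' for the demonstration.
base = tempname;                  % full path without an extension
fid = fopen([base '.xml'], 'w');
fprintf(fid, '<root/>');
fclose(fid);

file = base;                      % the caller omitted the extension
if exist(file, 'file') ~= 2
    % Fall back to the '.xml' suffix only when the name as given is missing.
    candidate = [file '.xml'];
    if exist(candidate, 'file') == 2
        file = candidate;
    else
        error('The file %s could not be found.', file);
    end
end

delete([base '.xml']);            % clean up the temporary file
```

Checking the name as given first means files with other extensions, or none at all, are opened unchanged.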

22 Nov 2010 Mathieu

Warning: not all XML files have a '.xml' extension.

03 Nov 2010 Joanne

Worked on the first try for loading an OSM data file.

26 Sep 2010 TideMan

I was tearing my hair out trying to figure out how to automatically access one tiny piece of data in a .xml file until I found this routine.

31 Aug 2010 Yanai  
Updates
21 Aug 2010

Decreased the processing time for large XML files

22 Nov 2010

Removed the assumption that the filename should have a '.xml' extension

24 Nov 2010

Corrected the uploaded file

04 Jul 2011

Attribute parsing speed increased by 40%

02 Jan 2012

The function now replaces element and attribute names containing - by _dash_, . by _dot_ and : by _colon_
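
The replacement rule in this update can be sketched with three `strrep` calls. This illustrates the rule as stated, not the function's actual code, and the sample element name is hypothetical.

```matlab
% Hypothetical element name containing all three special characters.
rawName = 'gmd:MD_Metadata.file-identifier';

% Apply the replacements described in the update note.
safeName = strrep(rawName,  '-', '_dash_');
safeName = strrep(safeName, '.', '_dot_');
safeName = strrep(safeName, ':', '_colon_');

% The same dash rule applied to the DOM node name '#cdata-section'.
cdataName = strrep('#cdata-section', '-', '_dash_');
```

The second result matches the `#cdata_dash_section` string used in the comparisons quoted in the comments, which suggests the node names pass through the same replacement before they are tested.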

14 May 2012

Speed improvement due to X. Mo and added support for cdata and comments.

15 May 2012

Small bugfix in the CDATA and Comment structure fields.
