
4.8 | 24 ratings | 198 downloads (last 30 days) | File Size: 204.87 KB | File ID: #12907

xml_io_tools

by Jaroslaw Tuszynski

 

06 Nov 2006 (Updated 26 Jun 2009)

Code covered by BSD License  

Reads XML files into MATLAB structs and writes MATLAB data types to XML


File Information
Description

Reads XML files into MATLAB structs and writes MATLAB data types to XML files through a simple interface to MATLAB's xmlread and xmlwrite functions.

Two functions simplify reading and writing XML files from MATLAB:

    * Function xml_read first calls MATLAB's xmlread function and then converts its output (a 'Document Object Model' tree of Java objects) to a tree of MATLAB structs. The output usually consists of nested structs and cells, with field names based on the XML tags.

    * Function xml_write first converts the input tree of MATLAB structs, cells and other types to a tree of 'Document Object Model' nodes, and then writes the resulting object to an XML file using MATLAB's xmlwrite function.
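
The round trip described by the two bullets above can be sketched as follows. This is a minimal example, assuming xml_write.m and xml_read.m from the package are on the MATLAB path and that the call forms match those used in the comments on this page, xml_write(filename, tree) and xml_read(filename). The file name round_trip_example.xml and the field names are made up for illustration.

```matlab
% Round-trip sketch. The guard lets the script run even when the package is missing.
MyTree = struct();
MyTree.MyNumber = 13;              % numeric leaf
MyTree.MyString = 'Hello World';   % character leaf

if exist('xml_write', 'file') == 2 && exist('xml_read', 'file') == 2
    % Write the struct tree to disk (call form taken from the comments on this page).
    xml_write('round_trip_example.xml', MyTree);

    % Read it back; field names of the result are based on the XML tags.
    tree = xml_read('round_trip_example.xml');
    disp(tree)
end
```

As the "This package can't" list notes, the struct read back is not guaranteed to be identical to the one that was written.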

This package can:
    * Read any XML file, possibly created outside of MATLAB, and convert it to MATLAB data structures.
    * Write any MATLAB struct tree to an XML file
    * Handle XML attributes in the same way as the xml_toolbox package
    * Handle special node types like CDATA_SECTIONs, PROCESSING_INSTRUCTIONs and COMMENTs.
    * Read and write XML files with base64 encoded binary objects
    * Be studied, modified, customized, rewritten and used in other packages without any limitations. All code is included and documented. Software is distributed under MIT License (included).

This package can't:
    * Guarantee to recover the same MATLAB objects that were saved. If you need to recover an exact copy of the structure that was saved, then you will have to use one of the packages that store a special set of tags as XML attributes to guide the parsing of the XML code. This package does not do that.
    * Guarantee to work with MATLAB versions older than the package (2006/11). The code does not work with older versions of MATLAB.

MATLAB release MATLAB 7.6 (R2008a)
Zip File Content  
Published M Files Tutorial for xml_io_tools Package
Other Files
base64decode.m,
base64encode.m,
gen_object_display.m,
html/xml_tutorial_script.png,
html/xml_tutorial_script_01.png,
license.txt,
MIT_Licence.txt,
test_file.xml,
xml_read.m,
xml_tutorial_script.m,
xml_write.m,
xmlwrite_xerces.m
Comments and Ratings (37)
06 Dec 2006 Stephen Morris

This did exactly what I wanted - actually, what I'd originally hoped that the native Matlab functions would do! Very easy to use; saved me no end of time crafting similar routines for myself.

30 May 2007 Anna Kelbert

Great routines; exactly what I need and very easy to use, no knowledge of Java or DOM needed. They create a wonderful link between Matlab structures and *.xml files. Unfortunately, there is a tiny bug in the xml_write routine: the attributes of the root node (i.e. xmlns and schema location) are not written (although they are saved as attributes by xml_read). Preferences to allow for custom style for writing real values would also help, but the default is reasonable.

08 Jun 2007 Jennifer Cooper

Does exactly what I needed and the preferences options already cover the options I'm likely to need!

27 Jun 2007 Jeff Kirschner

Hi, apologies if this is the wrong place for this, but xml_read is giving me this error:

 xml_read.m Line: 132 Column: 13
"]" expected, "identifier" found.

Would anyone happen to have a solution for this pls? Thanks in advance.
I'm using Matlab 6.5 R13

19 Jul 2007 Nick Rhee

I am getting an error "Unable to parse XML file " from xml_read. Would appreciate your help!

27 Sep 2007 shlomi israel

Works well and excellently documented.

24 Oct 2007 Martin Bergtholdt

Nice work!

12 Dec 2007 Eduard Rudyk

Excellent tool. Thanks.

02 Jan 2008 Hugo Costelha

Nice work ;)

In line 286 of xml_read.m, I added an "if (Pref.Str2Num)", to honor the preferences.

05 Feb 2008 taba taba

very good

06 Mar 2008 Mark Neil

Need some help understanding how to properly use this to store/restore matrices.

I see mostly positive reviews, so I'm really hoping I'm just doing something wrong (as matrices are somewhat common in MATLAB).

Consider the following code...

a.Val = 10;
a.OneByThree = [0 0 0];
a.ThreeByOne = [0; 0; 0];
class(a.Val)
class(a.OneByThree)
class(a.ThreeByOne)
gen_object_display(a)
xml_write('a.xml', a);
b = xml_read('a.xml');
class(b.Val)
class(b.OneByThree)
class(b.ThreeByOne)
gen_object_display(b)

You get an XML file which looks like this...

<?xml version="1.0" encoding="utf-8"?>
<a>
   <Val>10</Val>
   <OneByThree>0 0 0</OneByThree>
   <ThreeByOne>0 0 0</ThreeByOne>
</a>

And output which looks (compressed) like this...

ans = double
ans = double
ans = double
           Val: [10]
    OneByThree: [0 0 0]
    ThreeByOne: [3x1 double]
ans = double
ans = char
ans = char
           Val: [10]
    OneByThree: '0 0 0'
    ThreeByOne: '0 0 0'

Two issues...

1) Your double matrices aren't double anymore, they are character arrays.
2) No preservation of the dimensionality of the matrices.

Is there a solution to this? Some flag I'm not using correctly (although I tried several).

Thanks in advance,

Mark
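
One possible workaround for the two issues Mark lists is to save the size next to the data and rebuild the array after reading. The helper below is a sketch only: restore_matrix is a hypothetical name, not part of xml_io_tools, and it assumes the values come back either as numbers or as whitespace-separated text such as '0 0 0'.

```matlab
function M = restore_matrix(value, dims)
% RESTORE_MATRIX  Rebuild a numeric array from data read back out of XML.
% Hypothetical helper, not part of xml_io_tools.
%   value : numeric array, or text such as '0 0 0'
%   dims  : size saved next to the data, numeric or text such as '3 1'
if ischar(value)
    value = sscanf(value, '%f');   % whitespace-separated numbers -> column vector
end
if ischar(dims)
    dims = sscanf(dims, '%f');
end
M = reshape(value, dims(:).');     % reshape fills the result column by column
```

Before writing, the matrix would be stored flattened in column order together with its size, for example a.ThreeByOne.dims = size(x) and a.ThreeByOne.data = x(:).'. After xml_read, restore_matrix(b.ThreeByOne.data, b.ThreeByOne.dims) returns an array of the original shape.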

30 Apr 2008 John Reego

Works great. We have tested with structures that are as large as 6 megabytes with tens of thousands of variables. Takes a couple of seconds to transfer - but everything works.

30 May 2008 Andrej Mosat

Parsing my cell 'a' containing mixed types :
>> size(a)
ans =
   134 12
>> a(1,:)
ans =
[501][-1153850764] 'Cartridge-1' [9] [11] [0] [1] [-1] [1] [0] 'Filter - In-Line' 'none'

>> xml_write ...
results in:
<?xml version="1.0" encoding="utf-8"?>
<a>
   <item>501</item>
   <item>502</item>
   <item>503</item>
   ...
and finish.

No support for heterogeneous cells? If you added this, you`d have 200% boost in the downloads.
I know it`s not that hard parsing each element of the variable to be saved. Matlab does not have other way of saving complex data than .mat files, which is kind of scary if you want to save & use in another programming language. For now, I have to look for other solution. At the end, copy&paste some 17 structures with 8 subfields will take shorter than trying things here.
Just do it, please ;-)
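
Until mixed cell arrays are handled directly, one option is to turn each row of the cell into a struct with named fields, since struct trees are what the package is designed to write. The sketch below uses MATLAB's cell2struct. The twelve field names are invented for illustration, and whether xml_write stores the resulting struct array in the layout you need is an assumption to check against your own data.

```matlab
% One row of the mixed-type cell, copied from the comment above.
a = {501, -1153850764, 'Cartridge-1', 9, 11, 0, 1, -1, 1, 0, ...
     'Filter - In-Line', 'none'};

% Hypothetical names for the 12 columns.
fields = {'Id', 'Code', 'Name', 'C4', 'C5', 'C6', 'C7', 'C8', 'C9', 'C10', ...
          'Type', 'Option'};

% Fold the columns (dimension 2) into fields: an N-by-12 cell becomes an
% N-by-1 struct array with one element per row.
rows = cell2struct(a, fields, 2);

Tree = struct();
Tree.Cartridge = rows;

if exist('xml_write', 'file') == 2
    xml_write('cartridges.xml', Tree);   % file name is an assumption
end
```

The same call works unchanged for the full 134-by-12 cell.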

10 Nov 2008 Tracy

I get different results when i write to a file and when i write to a buffer.

>> MyTree

MyTree =

    CDATA_SECTION: '<A>txt</A>'
         MyNumber: 13
         MyString: 'Hello World'

when writing to the buffer my CDATA is not correct.

xx=xml_write([], MyTree, {'MyTree', [], 'This is a global comment'}, Pref);

xmlwrite(xx)

ans =

<?xml version="1.0" encoding="utf-8"?><!--This is a global comment-->
<MyTree><A>txt</A><MyNumber>13</MyNumber>
   <MyString>Hello World</MyString>
</MyTree>

when writing to a file it is

>> xml_write('xx.xml', MyTree, {'MyTree', [], 'This is a global comment'}, Pref);
>> type ('xx.xml')

<?xml version="1.0" encoding="UTF-8"?>
<!--This is a global comment-->
<MyTree><![CDATA[<A>txt</A>]]><MyNumber>13</MyNumber>
    <MyString>Hello World</MyString>
</MyTree>

am i doing something wrong??

10 Nov 2008 Tracy

sorry - i used the wrong function. But when i use the right one i get this error

>> xmlwrite_xerces(xx)
51 objFile = java.io.File(result);
??? No constructor 'java.io.File' with matching signature found.

Error in ==> xmlwrite_xerces at 51
objFile = java.io.File(result);

18 Nov 2008 Mike

perfect

05 Jan 2009 Armel Mevellec  
03 Feb 2009 Neilen Marais

Hi,

This is a great tool, but I'm having a problem. I'm trying to use it to save/load figure style structures. The problem is that the style structures include many empty strings. xml_write saves these as empty tags. When xml_read loads the XML, empty tags are converted to empty matrices.

I.e., if I save an empty string ('') to XML and load it, I get an empty matrix ([]). This makes, e.g.,

xml_write('figstyle.xml', figstyle);
figstyle_from_xml = xml_read('figstyle.xml')
set(gcf,figstyle_from_xml);

complain. However, converting all empty matrices to empty strings when loading would not work, since some of the struct members _should_ be empty matrices.

I hope this could easily be fixed :)

Thanks
Neilen
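
A possible workaround on the reading side is to convert [] back to '' only for the fields known to hold strings, which leaves the fields that should stay empty matrices untouched. The helper below is a sketch: empty_to_string is a hypothetical name, not part of xml_io_tools, and it assumes a scalar struct and a caller-supplied list of field names.

```matlab
function s = empty_to_string(s, names)
% EMPTY_TO_STRING  Replace empty non-char values with '' in selected fields.
% Hypothetical helper, not part of xml_io_tools.
%   s     : scalar struct, e.g. the output of xml_read
%   names : cell array of field names that are expected to hold strings
for k = 1:numel(names)
    f = names{k};
    if isfield(s, f) && isempty(s.(f)) && ~ischar(s.(f))
        s.(f) = '';   % restore the empty string lost in the round trip
    end
end
```

With the example above, the call would look like figstyle_from_xml = empty_to_string(xml_read('figstyle.xml'), {'Tag', 'Name'}), where the two field names are examples only.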

25 Feb 2009 Yan

Thanks for the great work!

12 Mar 2009 Michael Heistand  
12 Mar 2009 Michael Heistand

This worked perfectly. Every detail is exactly the way I needed it. Thanks!

29 Apr 2009 Peter

Very helpful - what a perfect tool!

05 May 2009 Leonardo Glavina

Works really nice... exactly what I wanted!!

05 May 2009 Ryan Sharp

I would love to try this, but it appears to not work correctly with matlab 7.4?
I noticed there are 'catch ME' statements, which I think were not introduced yet in 7.4?

14 May 2009 Daniel Lyddy

Does anyone know how to use the base64 encoding features in xml_write? I have downloaded and installed Peter J. Acklam's toolbox, but I don't see anything in the documentation, and I can't find any calls to 'base64encode' in the xml_write.m file.

15 May 2009 Daniel Lyddy

Jarek:

I am guessing that your new base64 instructions work perfectly for the case you describe in the documentation, but in my case, I have an image already stored as a two-dimensional MATLAB uint8 array. Peter Acklam's base64encode only seems to work with vector data of dimension one, so here is what I had to do to get things to work. Assume 'img' is a 2D array of uint8 that already exists in MATLAB's workspace.

[imgHeight, imgWidth] = size(img);
MyTree.MyImage.height = imgHeight;
MyTree.MyImage.width = imgWidth;
MyTree.MyImage.data.ATTRIBUTE.EncodingMIMEType = 'base64';
MyTree.MyImage.data.CONTENT = base64encode(img(:));
xml_write('testImage.xml', MyTree);
tree = xml_read('testImage.xml');
newImg = uint8(reshape(base64decode(tree.MyImage.data.CONTENT), [tree.MyImage.height, tree.MyImage.width]));

When I do this, newImg matches img pixel-for-pixel.

Thanks,
Daniel

19 May 2009 Thomas Pilutti

Even though xml_read.m does a check to pass ver >=7.1, the ME in catch ME (as pointed out by Ryan Sharp on 5May2009) fails for 7.1. I commented out the ME and also removed getReport() call, which is not on standard path for 7.1. Seems to work with these edits, but maybe there is some unintended side effect?
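
For readers on releases that reject the 'catch ME' form, a try-catch block can be written with a bare catch and the older lasterror function instead. The sketch below only illustrates that pattern. read_dom_compat is a hypothetical name and this is not the code used inside xml_read.m.

```matlab
function DOMnode = read_dom_compat(xmlfile)
% READ_DOM_COMPAT  Call xmlread with error handling that avoids 'catch ME'.
% Hypothetical helper, not part of xml_io_tools.
try
    DOMnode = xmlread(xmlfile);
catch
    err = lasterror;   % struct describing the most recent error
    error('read_dom_compat:parse', ...
          'Unable to parse XML file %s: %s', xmlfile, err.message);
end
```

Note that lasterror is discouraged in current MATLAB releases, so this form is only worth using where the newer syntax is unavailable.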

02 Jun 2009 Arpad Andrassy

Thanks Jaroslaw! Works like a charm!

19 Jun 2009 Shlomi

I work with these functions all the time. You did a magnificent job.
I have a suggestion for improvement: the item notation works only for struct arrays with more than one element, so the XML hierarchy differs depending on the structure size. When some other program needs to read this XML, that becomes a problem.
So... I suggest that you add a preference entry that forces item notation even for structs of length 1.

Thanks a lot for this submission & keep up the good work.

21 Jun 2009 Val Schmidt

This package works well for us - reading xml that fails under other xml packages.

However it is EXTREMELY slow (for me).

In parsing a 976K xml file with a parent function that calls xml_read(), I find in profiling that "xml_read>DOMnode2struct" consumes 206 seconds of the total 210.

It appears that the process of unwrapping the results of the DOM object into the matlab structure is what kills it. Oddly, my cpu is rarely maxed out, which seems to indicate some memory management bottleneck.

Is there any chance this is fixable?

22 Jun 2009 Jaroslaw Tuszynski

Reply to Val Schmidt Comments:

By default xml_read is configured for the most common use. If you need performance, DOMnode2struct can sometimes be sped up significantly if you use Pref.Str2Num='never' and/or trim your output by changing Pref.ReadAttr, Pref.ReadSpec and Pref.NumLevels.
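
The settings named in this reply could be tried out as sketched below. The Pref field names and the 'never' value come from the reply. The logical values for ReadAttr and ReadSpec, the xml_read(filename, Pref) call form and the file name large_file.xml are assumptions, so check them against the help text of xml_read.m.

```matlab
xmlfile = 'large_file.xml';   % hypothetical input file

Pref = struct();
Pref.Str2Num  = 'never';  % skip string-to-number conversion (value quoted in the reply)
Pref.ReadAttr = false;    % assumed logical flag: do not read attributes
Pref.ReadSpec = false;    % assumed logical flag: do not read special nodes

if exist('xml_read', 'file') == 2 && exist(xmlfile, 'file') == 2
    tic;
    treeDefault = xml_read(xmlfile);      % default preferences
    tDefault = toc;

    tic;
    treeFast = xml_read(xmlfile, Pref);   % trimmed preferences
    tFast = toc;

    fprintf('default: %.1f s, trimmed: %.1f s\n', tDefault, tFast);
end
```

With Str2Num disabled every leaf comes back as text, so numeric fields have to be converted by the caller.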

06 Aug 2009 John Perko

I get an error with multiple preferences--if Pref structure contains both StructItem='false' and CellItem='false'

23 Oct 2009 Sebastiaan

    <Misc>
        <property value="Comments"> </property>
    </Misc>
    <EmptyTag>
    </EmptyTag>
</PD>

Which is read in Matlab as:
tree =
         Misc: [1x1 struct]
     EmptyTag: []
    ATTRIBUTE: [1x1 struct]

tree.Misc.property =
      CONTENT: []
    ATTRIBUTE: [1x1 struct]

So far, so good. However, on writing, the XML file is malformed:
<?xml version="1.0" encoding="UTF-8"?>
<PD version="4.1">
    <Misc>
        <property value="Comments"/>
    </Misc>
    <EmptyTag/>
</PD>

I do not understand the DOM structure enough to try to find a cure for this. Also, setting tree.EmptyTag=' ' (one or more spaces) results in the same output.

How can this be corrected?

EDIT
Ok, it is amazing how fast bright ideas come after posting a problem.

I noticed that setting:
set(objOutputFormat, 'PreserveSpace', 'on')
in xmlwrite_xerces.m ~ line 48 deals with the problem and outputs a genuine XML file.

Unfortunately, the layout is unreadable (for larger xml files) to the eye:
<PD version="4.1"><Misc><property value="Comments"></property></Misc><EmptyTag></EmptyTag></PD>

23 Oct 2009 Sebastiaan

Hmm, having problems commenting. This is what the question used to be:

VERY nice tool indeed, especially the Pref settings which let you modify the output. However, I have a problem with empty tags or contents. My XML file has lines like this:

<?xml version="1.0" encoding="ISO-8859-1" ?><!DOCTYPE PD>
<PD version="4.1">
    <Misc>
        <property value="Comments"> </property>
    </Misc>
    <EmptyTag>
    </EmptyTag>
</PD>

Which is read in Matlab as:
tree =
         Misc: [1x1 struct]
     EmptyTag: []
    ATTRIBUTE: [1x1 struct]

tree.Misc.property =
      CONTENT: []
    ATTRIBUTE: [1x1 struct]

So far, so good. However, on writing, the XML file is malformed:
<?xml version="1.0" encoding="UTF-8"?>
<PD version="4.1">
    <Misc>
        <property value="Comments"/>
    </Misc>
    <EmptyTag/>
</PD>

I do not understand the DOM structure enough to try to find a cure for this. Also, setting tree.EmptyTag=' ' (one or more spaces) results in the same output.

How can this be corrected?

31 Oct 2009 Yuan Ren

I think there's some problem in this part:

digits = '[Inf,NaN,pi,\t,\n,\d,\+,\-,\*,\.,e,i, ,E,I,\[,\],\;,\,]';
s = regexprep(str, digits, ''); % remove all the digits and other allowed characters

I found this when one of my LeafNode called "IN" was recognized as "NaN".

I suggest the following modification.

>> regexprep('IN','[Inf,NaN]','')

ans =

     ''

>> regexprep('IN','(Inf)|(NaN)','')

ans =

IN

One other thing is that "num = str2num(str)" is a somewhat dangerous function. I'm not sure if it is worth using it to provide some fancy functionality.
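
One way to avoid both the character-class problem and str2num is to split the text on whitespace and convert each token with str2double, which does not evaluate its input. The function below is a sketch of that idea, not the package's code: parse_leaf is a hypothetical name, and it deliberately handles only whitespace-separated real numbers (no brackets, commas, semicolons or complex values).

```matlab
function [value, isNumeric] = parse_leaf(str)
% PARSE_LEAF  Convert text to a real row vector only if every token is a number.
% Hypothetical helper, not part of xml_io_tools.
value = str;          % default: return the text unchanged
isNumeric = false;
str = strtrim(str);
if isempty(str)
    return
end
tokens = regexp(str, '\s+', 'split');   % split on runs of whitespace
nums = str2double(tokens);              % NaN for tokens that are not numbers
literalNaN = strcmpi(tokens, 'NaN');    % tokens that really spell NaN
if all(~isnan(nums) | literalNaN) && isreal(nums)
    value = nums;
    isNumeric = true;
end
```

With this approach a leaf such as 'IN' stays text, because str2double returns NaN for it and the token is not literally 'NaN'. Strings such as '2007-01-01' or '00-11-88-88-42-40' also stay text, since nothing is evaluated as an arithmetic expression.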

02 Nov 2009 Mark Weber

Hi,

I am using your tool very often and I want to thank you for that.

This week I found a little problem and do not know how to solve it. First of all my xml file:
<measurement>
      <values>
        <value>
          <rssi>-90</rssi>
          <identifier>
            <name>00-11-88-87-F4-82</name>
          </identifier>
        </value>
      </values>
    </measurement>
<measurement>
      <values>
        <value>
          <rssi>-81</rssi>
          <identifier>
            <name>00-11-88-88-42-40</name>
          </identifier>
        </value>
      </values>
    </measurement>

Reading the first part works, so I get a string '00-11-88-87-F4-82' with the RSSI value. With the second part there is a problem. Because the string contains only digits and hyphens, xml_read does not interpret it as a string. It calculates the value -269. But I need it as a string because it is not a number.
How can I solve the problem? Changing the name is not possible because it is given by the application where the xml is coming from.

Thank you in advance.

Kind regards

Mark Weber

04 Nov 2009 Sebastiaan

Mark: have you tried using the Pref.Str2Num=false with xml_read?
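
Sebastiaan's suggestion could look roughly like the sketch below. It assumes the two measurement elements are wrapped in a single root element and stored in a file called measurements.xml, that xml_read accepts Pref as its second argument, and that repeated tags are returned as a struct array. All three are assumptions to verify. Because every leaf is then returned as text, the numeric rssi field is converted by hand.

```matlab
xmlfile = 'measurements.xml';   % hypothetical file name

Pref = struct();
Pref.Str2Num = false;   % value suggested above; the author's reply uses 'never'

if exist('xml_read', 'file') == 2 && exist(xmlfile, 'file') == 2
    tree = xml_read(xmlfile, Pref);   % call form assumed
    % Assumed layout: repeated <measurement> tags become a struct array.
    for k = 1:numel(tree.measurement)
        v = tree.measurement(k).values.value;
        name = v.name;                 % stays text, e.g. '00-11-88-88-42-40'
        rssi = v.rssi;
        if ischar(rssi)
            rssi = str2double(rssi);   % convert only the field known to be numeric
        end
        fprintf('%s  %g\n', name, rssi);
    end
end
```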

Updates
24 Jan 2007

Made changes to allow handling special node types like CDATA_SECTIONs and COMMENTs.

12 Mar 2007

Fix by Alberto Amaro of xml_write to allow writing CDATA sections.

21 Jun 2007

Fixed problem reported by Anna Kelbert in Reviews. Also: added support for Processing Instructions; added support for global text nodes (Processing Instructions and Comments); allowed writing tag names with special characters.

20 Jul 2007

Added tutorial "Published M-file", few minor code changes adding more customization parameters

23 Jan 2008

Fixed problem reported by Anna Krewet of converting dates in format '2007-01-01' to numbers. Improved and added warning messages. Added detection of old Matlab versions incompatible with the library. Expanded documentation.

23 Jun 2008

Fixed problem with writing 1D arrays and 2D cell arrays. Extended Pref.Num2Str to: never, smart and always. Added parameter Pref.KeepNS for keeping or ignoring namespace data.

11 Sep 2008

Resubmitting last upload

25 Feb 2009

More error handling. More robust in case of large binary objects. Added support for Base64 encoding/decoding of binary objects (using functions by Peter J. Acklam).

15 May 2009

Reupload of 25 Feb 2009 version. Something did not work.

26 Jun 2009

Changes to xml_read: added CellItem parameter to allow better control of reading files with 'item' notation (see comment by Shlomi); changed try-catch statements so xml_read works with MATLAB versions prior to 7.5 (see Thomas Pilutti's comment).

Tag Activity for this File
Tag Applied By Date/Time
data import Jaroslaw Tuszynski 22 Oct 2008 08:48:12
data export Jaroslaw Tuszynski 22 Oct 2008 08:48:12
xml Jaroslaw Tuszynski 22 Oct 2008 08:48:12
tools Jaroslaw Tuszynski 22 Oct 2008 08:48:12
data Jaroslaw Tuszynski 22 Oct 2008 08:48:12
xmlread Jaroslaw Tuszynski 22 Oct 2008 08:48:12
utilities Jaroslaw Tuszynski 22 Oct 2008 08:48:12
xmlwrite Jaroslaw Tuszynski 22 Oct 2008 08:48:12
base64 Jaroslaw Tuszynski 25 Feb 2009 14:16:09
 
