How to convert HTML report to Excel

V Manohar on 23 Jul 2020
Commented: dpb on 24 Jul 2020
Hello,
We have an HTML report generated by Polyspace and we want to convert it to Excel format. The problem is that the HTML page holds some of its data in tables, and when we read the file none of the table data is accessible; we are able to read only the text data.
%% Read in the HTML file.
% '*.html' filters the dialog by extension; the folder is returned separately.
[fileHTML, pathHTML] = uigetfile('*.html');
if isequal(fileHTML, 0)
    error('No HTML file selected.');
end
txt = fileread(fullfile(pathHTML, fileHTML));
%% Remove HTML tags, header text, and last section (pertaining to images).
% Note: the '<.*?>' step also discards all table markup (<table>, <tr>, <td>).
txt = regexprep(txt,'<script.*?/script>','');
txt = regexprep(txt,'<style.*?/style>','');
txt = regexprep(txt,'<.*?>','');
txt = regexprep(txt,'.*#\n','');
txt = regexprep(txt,'--.*?\n','');
txt = regexprep(txt,'\n\n.*','');
%% Set up delimiters and format specification to read columns of data as text.
delimiter = {' = '};
formatSpec = '%q%q%[^\n\r]';
%% Read columns of data according to the format.
dataArray = textscan(txt, formatSpec, 'Delimiter', delimiter);
% Optional arguments left disabled: 'TextType', 'char', 'ReturnOnError', false
raw = repmat({''},length(dataArray{1}),length(dataArray)-1); % preallocation before loop
for col = 1:(length(dataArray)-1)
    raw(1:length(dataArray{col}),col) = dataArray{col};
end
%% Write data to Excel spreadsheet.
filenameSpreadsheet = 'Example.xlsx';
xlswrite(filenameSpreadsheet,raw);
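Because the script above strips every tag before parsing, any table structure is lost even when it is present. A minimal sketch of the alternative is to pull rows and cells out of the `<tr>` and `<td>`/`<th>` markup first. The sample markup is made up for illustration, and the approach only helps if the table markup actually exists in the saved file:

```matlab
% Sketch: extract table cells from raw HTML before any tags are stripped.
% The markup below is hypothetical sample data, not Polyspace output.
html = ['<table><tr><th>Check</th><th>Count</th></tr>' ...
        '<tr><td>Red</td><td>3</td></tr>' ...
        '<tr><td>Orange</td><td>12</td></tr></table>'];

% One token per table row (lazy match, case-insensitive).
rows = regexpi(html, '<tr(?:\s[^>]*)?>(.*?)</tr>', 'tokens');

raw = {};
for r = 1:numel(rows)
    % One token per header or data cell within this row.
    cells = regexpi(rows{r}{1}, '<t[dh](?:\s[^>]*)?>(.*?)</t[dh]>', 'tokens');
    for c = 1:numel(cells)
        % Drop nested tags and surrounding whitespace from the cell text.
        raw{r, c} = strtrim(regexprep(cells{c}{1}, '<[^>]*>', ''));
    end
end
% Rows with fewer cells than others leave [] in the unfilled positions.
```

The resulting `raw` cell array has one row per `<tr>` and could be passed to `xlswrite` in place of the `textscan` output. Regular expressions are a fragile way to parse HTML, so nested tables or unusual markup would need a proper parser.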
  3 Comments
V Manohar on 24 Jul 2020
The attached file is the original file from which I am trying to extract data. Please find the attachment.
dpb on 24 Jul 2020
Well, the following comment at the bottom of the file appears to be pretty much the whole story:
<!-- This template library is designed to work with the JavaScript TOC and autonumber
scripts included in this template. The Javascript autonumber script replaces the
autonumber elements used in this template with actual numbers when the report generated
from this template is loaded into a browser. The autonumber script implements the
autonumber behavior defined by the DOM AutoNumber class. -->
There isn't any table data in the file; it's all dynamic on the server to populate the browser view. Looks to me like the implementor would have to have provided a "Download" function or you would have to scrape the page to get the actual displayed data.
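One way to check this for a given report is to count table-related opening tags in the raw file contents before stripping anything. The following is a minimal sketch; the helper name `countTag` and the sample text are assumptions for illustration, and in practice `txt` would come from `fileread` on the saved report:

```matlab
% Sketch: count opening tags of a given name in raw HTML text.
% countTag is a hypothetical helper; the lookahead keeps '<tr' from
% matching longer tag names such as '<track'.
countTag = @(txt, tag) numel(regexpi(txt, ['<' tag '(?=[\s>])'], 'match'));

% Made-up sample standing in for txt = fileread(...) on the real report.
txt = ['<html><body><p>Summary</p>' ...
       '<script src="toc.js"></script>' ...
       '<!-- autonumber --></body></html>'];

tags = {'table', 'tr', 'td', 'script'};
counts = zeros(1, numel(tags));
for k = 1:numel(tags)
    counts(k) = countTag(txt, tags{k});
    fprintf('<%s>: %d\n', tags{k}, counts(k));
end
```

If the counts for `table`, `tr`, and `td` are all zero, the saved file holds no table markup to extract, and the data would have to come from the rendered page or from another export.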

Answers (0)

Release

R2016b