Load COVID-19 case data from John Hopkins database

version 0.63 (6.49 KB) by Axel Ahrens
These functions load, process, and plot data from the Johns Hopkins COVID-19 database.

16 Downloads

Updated 14 Apr 2020

From GitHub

# Load COVID-19 case data from John Hopkins database

Load, process, and plot data from the Johns Hopkins COVID-19 database. The data is read automatically from the online repository, so you need an internet connection. The data can be found here: https://github.com/CSSEGISandData/COVID-19.

# How to (see runAll.m):
type = 'confirmed'; % 'confirmed','deaths','recovered'
[dataMatrix] = readCoronaData(type);
[dataTable,timeVector,mergedData] = processCoronaData(dataMatrix);
plotCoronaData(timeVector,mergedData,{'Denmark','US','Germany','China'},type);

Cite As

Axel Ahrens (2021). Load COVID-19 case data from John Hopkins database (https://github.com/aahr/covid-19_data_analysis), GitHub. Retrieved .

Comments and Ratings (23)

Ayman AL-KHAZRAJI

Wonderful and practical code.
I would add two lines to the end of runAll.m to get time series data as a row vector for a given country (US for example)
=============================
idx = find(strcmp(mergedData, "US"));
time_series=mergedData{idx, 2}(1,:)
============================
Regards

Axel Ahrens

@safiul mollick: I tested with a couple of recent MATLAB versions, but none as old as 2016. Do you have access to a newer version? If not, please contact me by email and I will try to fix the problem.

safiul mollick

I am new here, I am using Matlab 2016

I got this error too

Undefined function or variable 'newline'.

Error in readCoronaData (line 87)
if m == 1 && contains(dataMatrix{m,n},newline)

Error in runAll (line 4)
[dataMatrix] = readCoronaData(type);
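
A possible workaround for older releases, offered as a sketch rather than as the author's fix: `newline` and `contains` were both introduced in R2016b, so this error suggests an earlier release. `char(10)` is the same character that `newline` returns, and `strfind` can stand in for `contains`. The variable `headerCell` below is a hypothetical stand-in for `dataMatrix{m,n}`.

```matlab
% Sketch for releases before R2016b (assumption: only the line-break
% test on line 87 of readCoronaData needs replacing).
nl = char(10);                           % same character as newline

% Hypothetical cell content with a trailing line break.
headerCell = ['Province/State' nl];

% strfind returns the match positions; a non-empty result means "contains".
hasLineBreak = ~isempty(strfind(headerCell, nl));
```

In the original condition this would read `if m == 1 && ~isempty(strfind(dataMatrix{m,n}, char(10)))`. Other parts of the code may still rely on newer functions.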

Alex Backer

I updated to the latest version of MATLAB. I get this error now:
Error using cell2table (line 69)
The VariableNames property is a cell array of character vectors. To assign multiple
variable names, specify nonempty names in a string array or a cell array of character
vectors.

Error in processCoronaData (line 21)
dataTable = cell2table(dataMatrix(2:end,:),'VariableNames',varNames);

Wiebke Lamping

Axel Ahrens

@Alex: What MATLAB version do you use? Is the error fixed with the newest version of the code?

Alex Backer

Thanks Axel!

I got this error, any ideas?

Undefined function or variable 'newline'.

Error in readCoronaData (line 83)
if m == 1 && contains(dataMatrix{m,n},newline)

Or Shamir

Hartwig Harder

Thank you very much Axel for the code, very interesting
Used it the first time today with commit 18ef41f & Matlab 2019a (9.6, update 5)

I did encounter the following issues:

1. Handling of strings and cell arrays in processCoronaData

Error using strrep
Cell elements must be character vectors.
Error in processCoronaData (line 17)
varNames = strrep(varNames,'/','_');
Error in runAll (line 7)
[dataTable,timeVector,mergedData] = processCoronaData(dataMatrix);

--> fixed it by usage of

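function str = strrep2(str, old, new)
% Apply strrep to every char cell; leave non-char cells unchanged.
str = cellfun(@(s) rep(s, old, new), str, 'UniformOutput', false);

function str = rep(str, old, new)
% strrep only accepts text, so skip entries that are not char.
if ischar(str)
    str = strrep(str, old, new);
end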
https://www.gomatlab.de/strrep-cell-elements-must-be-character-arrays-t44174.html

2. data in dataset of today in processCoronaData

25: for m = 1:size(dataMatrix,1)
26: for n = 1:size(dataMatrix,2)

fix for me -> skip last row 'malawi' entry and the last column in the datamatrix
for m = 1:size(dataMatrix,1)-1
for n = 1:size(dataMatrix,2)-1

3. Issue with the time format when assigning to timeVector
38: timeVector(n-4) = dataMatrix{1,n};

fix for me :
dataMatrix{m,n} = datetime(dataMatrix{m,n},'InputFormat','MM/dd/yy');
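
To illustrate the date conversion in item 3, here is a minimal sketch that parses header cells of the form used in the JHU time series files (for example `1/22/20`). The cell array `headerDates` is made up for the example, and the format `'M/d/yy'` is my choice because the headers carry no leading zeros; it is not the format used in the comment above.

```matlab
% Hypothetical date headers as they might appear in dataMatrix(1,5:end).
headerDates = {'1/22/20', '1/23/20', '3/30/20'};

% Convert all headers at once; 'M/d/yy' accepts one- or two-digit
% months and days and a two-digit year.
timeVector = datetime(headerDates, 'InputFormat', 'M/d/yy');

% Elapsed time between the first and last header, as a duration.
span = timeVector(end) - timeVector(1);
```

Converting the whole header row in one call avoids assigning text into a datetime array element by element.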

Cheers
Hartwig

Wolfgang Gross

Starting from today, I get the following error:
Error using cell2table (line 69)
The VariableNames property is a cell array of character vectors. To assign multiple variable names, specify names in a
string array or a cell array of character vectors.

Error in processCoronaData (line 311)
dataTable = cell2table(dataMatrix(2:end,:),'VariableNames',varNames);

The issue is that the data matrix is not read properly. Changing
if contains(lineBreakData(lineBreakIdx+2:end-1),'"') %there is a comma within state/province
to
if contains(lineBreakData(lineBreakIdx:end-1),'"') %there is a comma within state/province
in the read function fixes the issue for me. However, I don't know if there are any side effects.
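
As an alternative way to look at the quoted-comma problem described above, the following sketch splits one CSV line while ignoring commas inside double quotes. It is not the code from readCoronaData; `csvLine` is a made-up row, and the approach assumes that quoted fields do not span several lines.

```matlab
% Hypothetical CSV row; the first field is quoted and contains a comma.
csvLine = '"Korea, South",Asia,36.0,128.0,1';

% Split only at commas followed by an even number of quote characters,
% i.e. commas that lie outside any quoted field.
pattern = ',(?=(?:[^"]*"[^"]*")*[^"]*$)';
fields = regexp(csvLine, pattern, 'split');

% Remove the surrounding quotes once the fields are separated.
fields = strrep(fields, '"', '');
```

Here `fields` is a 1-by-5 cell array whose first entry is `Korea, South`.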

I think the issue is coming from your version of Matlab and, particularly, the function cell2table.
"Starting in R2019b, you can specify table variable names that are not valid MATLAB® identifiers. Such variable names can include spaces, non-ASCII characters, and can have any character as the leading character. When you access such a variable name, enclose it quotation marks.", from: https://www.mathworks.com/help/matlab/ref/table.html

Axel Ahrens

@Christopher: This seems to be an issue only in Mac OS. But it should be fixed now.

Christopher Hoen

@Axel
I'm running on Mac OSX Catalina 10.15.4. Just tried R2019a. Same error, but with error message as Holger reported March 25th.

Axel Ahrens

@Christopher: That's odd. I will add a fix in 10 mins.

Christopher Hoen

Hi Axel
I currently use R2018b.
It seems to me that it is the backslash that causes problems when I run. Anyway I can fix it myself for my version of Matlab.

Axel Ahrens

Hi Christopher, for me everything runs smoothly with the current GitHub version. What MATLAB version do you use?

Christopher Hoen

Hi Axel
Downloaded today 2020-03-30
Problems still present in processCoronaData

>> runAll
Error using cell2table (line 57)
'Province/State' is not a valid variable name.

Error in processCoronaData (line 16)
dataTable = cell2table(dataMatrix(2:end,:),'VariableNames',varNames);

Error in runAll (line 6)
[dataTable,timeVector,mergedData] = processCoronaData(dataMatrix);

Axel Ahrens

@Holger: Yes, indeed. They changed it again. It's fixed now. I am checking the code at least once per day to keep it running.

Holger

Hi Axel, I fetched your latest version, but the problem still remains. I'm not sure if there have been further changes in data format by JHU... Many thanks for your efforts!

Axel Ahrens

Hi Holger. Try the most recent version. There was a change in the Johns Hopkins repository data format. Please get in touch if it still does not work.

Holger

Error using cell2table (line 58)
'Province/State' is not a valid table variable name. See the documentation for isvarname or matlab.lang.makeValidName for more information.

Error in processCoronaData (line 16)
dataTable = cell2table(dataMatrix(2:end,:),'VariableNames',varNames);

Error in runAll (line 6)
[dataTable,timeVector,mergedData] = processCoronaData(dataMatrix);
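
The error message points to `matlab.lang.makeValidName`. A minimal sketch of that route, assuming the header row looks like the hypothetical `varNames` below (this is not necessarily how the author fixed processCoronaData):

```matlab
% Hypothetical header row and data row in the style of the JHU files.
varNames = {'Province/State', 'Country/Region', 'Lat', 'Long', '1/22/20'};
dataRow  = {'Hubei', 'China', 30.9, 112.2, 444};

% Turn every header into a valid MATLAB identifier, then make sure no
% two headers collide after the replacement.
validNames = matlab.lang.makeValidName(varNames);
validNames = matlab.lang.makeUniqueStrings(validNames);

% cell2table now accepts the names on releases before R2019b as well.
dataTable = cell2table(dataRow, 'VariableNames', validNames);
```

The original headers can be kept in a separate variable if the exact JHU spelling is needed later, for example for plot labels.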

Mark Smith

Thank you for this set of functions. These allow customization for those of us playing with Covid-19 tracking and prediction.

MATLAB Release Compatibility
Created with R2019b
Compatible with R2018b and later releases
Platform Compatibility
Windows macOS Linux
