MATLAB Answers

How to extract locations from strings of addresses?

Asked by pietro
on 10 Jan 2019
Latest activity Edited by Stephen Cobeldick on 13 Jan 2019
Hi all,
I have more than a thousand strings containing addresses. How can I programmatically extract the locations, such as cities (e.g. Rome, London) and countries (e.g. Italy, UK)?
Thanks
Pietro

  8 Comments

There might be one, but I have never specifically investigated MATLAB for that. I do not know of one, and I cannot think of a specific way to have MATLAB search the internet for one.
How about the following approach?
  1. Split each line at the ';' character into segments (1 or 2 in the examples below)
  2. Split each segment at the ',' character into several fragments
  3. Extract the last two fragments
Here is my attempt:
str = [...
"Sustainable Industrial Systems, School of Chemical Engineering and Analytical Science, The University of Manchester, Manchester, United Kingdom; Dipartimento di Scienze Agrarie e Ambientali - Produzione, Territorio, Agroenergia, Università degli Studi di Milano, Milan, Italy";
"Department of Agricultural and Environmental Sciences - Production, Landscape, Agroenergy, Università degli Studi di Milano, Via G. Celoria 2, Milano, Italy; Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, Milano, Italy";
"Department of Land and Agriculture and Forestry Systems, University of Padua, Viale dell'Università 16, 35020 Legnaro (PD), Italy";
"Dipartimento di Agraria, University of Sassari, Sassari, Italy"];
for kk1 = 1:numel(str)
    % Step 1: split the line into one segment per affiliation
    strSplit = strsplit(str(kk1),';');
    strOut = repelem("",1,numel(strSplit));
    for kk2 = 1:numel(strSplit)
        % Steps 2 and 3: split at ', ' and keep the last two fragments
        strTmp = strsplit(strSplit(kk2),', ');
        strOut(kk2) = strjoin(strTmp(end-1:end),', ');
    end
    disp(strjoin(strOut,'; '))
end
The output is as follows:
Manchester, United Kingdom; Milan, Italy
Milano, Italy; Milano, Italy
35020 Legnaro (PD), Italy
Sassari, Italy
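The third result still carries a postal code and a province abbreviation ("35020 Legnaro (PD)"). If only the city name is wanted, one possible post-processing step is sketched below. It assumes that the postal code is a leading run of digits and that the province appears in trailing parentheses, which holds for this sample but may not hold for every address. The variable names loc, parts, city and country are illustrative only.

```matlab
% Hypothetical clean-up of one "city, country" result from the loop above
loc = "35020 Legnaro (PD), Italy";

% Separate the two fragments and trim surrounding whitespace
parts = strtrim(strsplit(loc, ','));

% Assumption: a leading run of digits is a postal code and a trailing
% "(..)" group is a province code; remove both from the city fragment
city = regexprep(parts(1), '^\d+\s*|\s*\([^)]*\)$', '');
country = parts(2);

disp(city + " / " + country)
```

Addresses that put the postal code after the city, or that omit the country, would need a different rule.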
Hi Akira,
thank you for the answer. Well, it is a good solution. I think I might adopt it.
Cheers,
Michele

Release

R2017b

1 Answer

Answer by Stephen Cobeldick on 13 Jan 2019
Edited by Stephen Cobeldick on 13 Jan 2019
 Accepted Answer

Using one simple regular expression:
>> C = {...
'Sustainable Industrial Systems, School of Chemical Engineering and Analytical Science, The University of Manchester, Manchester, United Kingdom; Dipartimento di Scienze Agrarie e Ambientali - Produzione, Territorio, Agroenergia, Università degli Studi di Milano, Milan, Italy';
'Department of Agricultural and Environmental Sciences - Production, Landscape, Agroenergy, Università degli Studi di Milano, Via G. Celoria 2, Milano, Italy; Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, Milano, Italy';
'Department of Land and Agriculture and Forestry Systems, University of Padua, Viale dell''Università 16, 35020 Legnaro (PD), Italy';
'Dipartimento di Agraria, University of Sassari, Sassari, Italy'};
>> M = regexp(C,'[^,]+,[^,]+(?=;|$)','match');
>> M{:}
ans =
' Manchester, United Kingdom'
' Milan, Italy'
ans =
' Milano, Italy'
' Milano, Italy'
ans =
' 35020 Legnaro (PD), Italy'
ans =
' Sassari, Italy'
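Note that each match above keeps the space that follows the preceding comma. As a possible variation (a sketch, not part of the accepted answer), the city and country can be captured as separate tokens with that whitespace excluded. The pattern and the variable names tok and pairs are assumptions introduced for illustration, and the approach still assumes the city and country are the last two comma-separated fields of each affiliation.

```matlab
% Two of the address strings from the answer above
C = {...
'Sustainable Industrial Systems, School of Chemical Engineering and Analytical Science, The University of Manchester, Manchester, United Kingdom; Dipartimento di Scienze Agrarie e Ambientali - Produzione, Territorio, Agroenergia, Università degli Studi di Milano, Milan, Italy';
'Dipartimento di Agraria, University of Sassari, Sassari, Italy'};

% Token 1 = city, token 2 = country; the \s* parts keep the spaces
% after the commas out of the captured text
tok = regexp(C, '\s*([^,;]+),\s*([^,;]+)(?=;|$)', 'tokens');

tok = [tok{:}];            % flatten to a 1-by-N cell of 1-by-2 token cells
pairs = vertcat(tok{:});   % N-by-2 cell: column 1 = city, column 2 = country

disp(pairs)
```

Here pairs(:,1) holds the cities and pairs(:,2) the countries, in the order in which the affiliations appear in C.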
