How can I match similar events between 2 matrices?
So I have two matrices, each with 5 columns of data. Both contain latitude, longitude, depth, time, and magnitude values. Matrix A has around 30,000 events or rows (each event is represented by lat, lon, dep, time, and mag) and matrix B has around 50,000 events. Both datasets represent the same sequence of earthquake data, but matrix B was created with less stringent error parameters, so more events (earthquakes) were located and included in that matrix. So the 30,000 events in matrix A are also in matrix B, along with ~20,000 others.
I need to match the earthquakes from each catalog. That is, an earthquake will have a unique lat, lon, depth, and time. I need to find the events in each catalog that have the same location and time and call those the same event. Now of course earthquakes can happen simultaneously so matching times alone won't cut it. I will need to match times (with some small amount of error) and locations to confidently say the events are the same.
Before I delve much deeper: any suggestions on how to implement this? I have some working code that is slow, so I'm looking to optimize my solution.
I need to calculate distances between each lat, lon, depth in matrix A and each lat, lon, depth in matrix B. An event in one catalog should have essentially the same location in the other catalog. There may be some small discrepancies, but anything within a few meters is likely the same event. Right now, I'm using a nearest neighbor search to find the distances from all the locations in one matrix to those in the other.
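For reference, the matching criterion described above (close in time, then close in space) could be sketched roughly as follows. This is only an illustration under assumptions that are not stated in the question: the column order [lat lon depth time mag], the units (degrees, degrees, km, seconds), the tolerances `maxDt` and `maxDist`, and the toy data are all hypothetical, and the distance uses a small-offset flat-earth approximation.

```matlab
% Hypothetical toy catalogs; columns assumed to be [lat lon depth time mag]
% with assumed units: degrees, degrees, km, seconds.
A = [35.0000 -118.0000 5.000 100.0 2.1;
     35.1000 -118.2000 8.000 250.0 1.7];
B = [35.1000 -118.2000 8.001 250.2 1.7;
     35.0000 -118.0000 5.000 100.1 2.0;
     36.0000 -117.0000 3.000 100.0 1.2];

maxDt    = 1;       % assumed time tolerance (s)
maxDist  = 0.01;    % assumed distance tolerance (km), i.e. 10 m
kmPerDeg = 111.19;  % approximate km per degree of latitude

matchInB = zeros(size(A, 1), 1);    % 0 means "no match found"
for i = 1:size(A, 1)
    % keep only the events of B that are close in time to event i of A
    cand = find(abs(B(:, 4) - A(i, 4)) <= maxDt);
    if isempty(cand), continue; end
    % approximate separation in km (valid for small offsets only)
    dNorth = (B(cand, 1) - A(i, 1)) * kmPerDeg;
    dEast  = (B(cand, 2) - A(i, 2)) * kmPerDeg * cosd(A(i, 1));
    dDown  =  B(cand, 3) - A(i, 3);
    dist   = sqrt(dNorth.^2 + dEast.^2 + dDown.^2);
    % nearest candidate, accepted only if it is within the distance tolerance
    [dmin, k] = min(dist);
    if dmin <= maxDist
        matchInB(i) = cand(k);
    end
end
```

Filtering on time first keeps the distance computation limited to a handful of candidates per event, which is usually where a brute-force comparison of every pair loses most of its time.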
3 Comments
per isakson
on 14 Dec 2016
Edited: per isakson
on 14 Dec 2016
How important is speed?
There is an old trick (by John D'Errico, I think):
- Create new matrices of whole numbers by converting one column at a time with round(A(:,jj)/tol). This allows a different tolerance value for each column.
- Search for matches with intersect(...,'rows'), or ismember(...,'rows')
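The two steps above could look something like this. It is a minimal sketch with made-up data and made-up per-column tolerances, using only the location and time columns; ismember with 'rows' is used here, and intersect(Ai, Bi, 'rows') would work along the same lines.

```matlab
% Hypothetical toy data; columns assumed to be [lat lon depth time].
A = [35.00001 -118.00002 5.0004 100.02;
     35.10000 -118.20000 8.0000 250.00];
B = [36.00000 -117.00000 3.0000 100.00;
     35.00000 -118.00000 5.0000 100.00;
     35.10001 -118.20001 8.0002 250.04];
tol = [1e-3 1e-3 0.01 0.5];   % assumed tolerance for each column

% Scale every column by its own tolerance and round to whole numbers
Ai = round(bsxfun(@rdivide, A, tol));
Bi = round(bsxfun(@rdivide, B, tol));

% Exact row matching on the rounded values
[tf, loc] = ismember(Ai, Bi, 'rows');   % loc(i) is the matching row of B, 0 if none
```

One caveat of the rounding approach: two values that differ by less than the tolerance can still round to different whole numbers when they fall on either side of a bin boundary, so a few genuine matches may be missed.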
Accepted Answer
Guillaume
on 14 Dec 2016
[isinB, rowinB] = ismembertol(A, B, 'ByRows', true)
You can specify a tolerance and a 'DataScale' vector to vary the amplitude of the tolerance for each column.
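As a rough illustration of the per-column tolerance (the data and tolerance values below are made up, and only the first four columns are compared on the assumption that magnitude should not take part in the match): with 'DataScale' given, two values u and v are within tolerance when abs(u-v) <= tol*DataScale, so a tol of 1 turns the 'DataScale' vector into absolute tolerances.

```matlab
% Hypothetical toy catalogs; columns assumed to be [lat lon depth time mag].
A = [35.00001 -118.00002  5.0004 100.02 2.1;
     35.10000 -118.20000  8.0000 250.00 1.7;
     40.00000 -120.00000 10.0000 999.00 3.0];
B = [36.00000 -117.00000  3.0000 100.00 1.2;
     35.00000 -118.00000  5.0000 100.00 2.0;
     35.10001 -118.20001  8.0002 250.04 1.7];

absTol = [1e-3 1e-3 0.01 0.5];   % assumed absolute tolerance for each column
cols   = 1:4;                    % compare lat, lon, depth and time only

% tol = 1, so the effective tolerance of column j is absTol(j)
[isinB, rowinB] = ismembertol(A(:, cols), B(:, cols), 1, ...
    'ByRows', true, 'DataScale', absTol);
```

Here rowinB(i) is the row of B matched to row i of A, and 0 where no row of B is within tolerance.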
5 Comments
Guillaume
on 14 Dec 2016
There is absolutely no reason for the inputs to ismembertol (and ismember) to be the same length. It simply tells you which rows (with the 'rows' / 'ByRows' option) of the first input are found somewhere in the second input.
The link to the documentation of ismembertol is in my answer. As it says at the end, Introduced in R2015a.
You need the tol version since you don't want exact comparison. Time to upgrade? Replicating the full behaviour of ismembertol particularly with the 'ByRows' option is not going to be trivial.
Here's an attempt that loses the automatic tolerance, magnitude scaling and other niceties:
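function [isfound, where] = ismembertolbyrow(A, B, tol)
%A, B: two matrices with the same number of columns
%tol: a row vector with one absolute tolerance per column of A and B
%u and v are within range if abs(u-v) <= tol
%isfound(i) is true if row i of A matches some row of B
%where(i) is the index of a matching row of B, 0 if there is none
%Note: builds a size(A,1) x size(A,2) x size(B,1) array, so large inputs need a lot of memory
validateattributes(A, {'numeric'}, {'2d'});
validateattributes(B, {'numeric'}, {'2d', 'ncols', size(A, 2)});
validateattributes(tol, {'numeric'}, {'positive', 'row', 'numel', size(A, 2)});
%absolute difference between every row of A and every row of B
d = abs(bsxfun(@minus, A, permute(B, [3 2 1])));
%bsxfun for the comparison as well, since older releases have no implicit expansion
%permute rather than squeeze so that a single-row A keeps its orientation
intol = permute(all(bsxfun(@le, d, tol), 2), [1 3 2]);
isfound = any(intol, 2);
where = zeros(size(isfound));
[r, c] = find(intol);
where(r) = c;
end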
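For catalogs of the size in the question (about 30,000 by 50,000 rows), the intermediate array of a single all-pairs comparison would not fit in memory on most machines. One possible workaround, sketched here with made-up data and a hypothetical `chunk` size, is to run the same comparison on a block of rows of A at a time:

```matlab
% Hypothetical toy data; columns assumed to be [lat lon depth time].
A = [35.00001 -118.00002  5.0004 100.02;
     35.10000 -118.20000  8.0000 250.00;
     40.00000 -120.00000 10.0000 999.00];
B = [36.00000 -117.00000  3.0000 100.00;
     35.00000 -118.00000  5.0000 100.00;
     35.10001 -118.20001  8.0002 250.04];
tol   = [1e-3 1e-3 0.01 0.5];   % assumed absolute tolerance per column
chunk = 2;                      % rows of A per block; tune to available memory

nA      = size(A, 1);
isfound = false(nA, 1);
where   = zeros(nA, 1);
Bp      = permute(B, [3 2 1]);  % rows of B moved to the third dimension

for first = 1:chunk:nA
    rows  = first:min(first + chunk - 1, nA);
    % same comparison as in the function above, restricted to this block
    d     = abs(bsxfun(@minus, A(rows, :), Bp));
    intol = permute(all(bsxfun(@le, d, tol), 2), [1 3 2]);
    [r, c] = find(intol);
    isfound(rows(r)) = true;
    where(rows(r))   = c;
end
```

Each block needs chunk x size(A,2) x size(B,1) elements of temporary storage, so the block size trades memory against the number of loop iterations.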