Merging Two Data Sets Where Identical Entries from the First Data Set correlate to Unique Values in the Second Data Set

1 view (last 30 days)
Hello all,
I have two data sets that I'm trying to merge that have the following conditions:
1) The data is located in two separate excel books.
2) Identical Entries in the 1st data set (a Date+Time column, some identical to each other, some not) correlate to Unique Entries in the 2nd data set (a column of unique Dates+Times)
3)There is a linking variable between the two (a phone number), which is present in both data sets.
4) The 2nd data set has data points that do NOT correlate to data points in the 1st data sets (ie. there are some phone numbers in the 2nd data set that are not in the 1st data set. All the phone numbers in the 1st set correlate to numbers in the 2nd set. I'm only interested in phone numbers present in BOTH data sets.
5) The 2nd data set can have more dates than the 1st data set for the same phone number. (ie. 2nd data set may have 5 dates, whereas the first data set may only have 2)
WHAT I NEED: (thank you for reading all of that) I need code that will take the data points from BOTH data sets and do ALL of the following:
1) Match them according to phone number.
2) Pair the dates+times of both data sets in a way that minimizes the time difference.
3) Match the dates+times of both data sets in a one-to-one fashion.
4) Output the data into a table containing all the information for each row (1st data set has 10 columns, all with different info, 2nd data set has 16 columns, all with different info), preferable in excel. I know how to use xlswrite to output the information, so I'm less concerned about this condition of the output.
I KNOW THIS IS A COMPLEX PROBLEM, BUT PLEASE KNOW THAT IT'LL MEAN THE WORLD TO ME IF YOU CAN HELP ME WITH ANY/ALL PARTS OF IT.
This is a terrific community of brilliant individuals, and I thank all of you for contributing to make Matlab one of the best programs out their!
Please help! Chris
  1 Comment
Sean de Wolski
Sean de Wolski on 16 Jun 2011
Too long, got confused. Please provide a small sample set of data: inputs/operation (or reason for what you want)/ outputs.
We understand data sets much better than essays!

Sign in to comment.

Answers (1)

Christopher
Christopher on 20 Jun 2011
I totally see what you mean: Here's essentially my biggest problem:
Data Set Alpha: Data Set Beta:
111-111-1111.............6/17/2011 11:26:16 AM ///////////////////// 111-111-1111.............6/17/2011 11:26:45 AM
111-111-1111.............6/17/2011 11:26:16 AM ////////////////////// 111-111-1111.............6/17/2011 11:29:55 AM
111-111-1111.............6/17/2011 11:26:16 AM //////////////////////111-111-1111.............6/17/2011 12:18:20 AM
I have data points in the data set Alpha that are the same, yet correlate to different data points in Data Set Beta. There's are lots of phone numbers, and I know that if I can pair them by phone number, all I need to do is to minimize the date difference.
What I have so far:
for ii=1:length(Alpha)
.......for jj=1:length(Beta)
..............if strcmp(Alpha(ii,1),SoftRawData(jj,1)) == 1
% checks to see if the phone numbers match)%
I don't know the proper function to use (and can't seem to find one) for minimizing time difference of this nature of data. Thanks for any help you can give! Once this gets tackled, we'll try to conquer the rest of this beast.
Cheers,
Chris

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!