Combing Algorithm: Finding Unique Equaltiy between Vectors of Different Sizes Using Ranges

Question

Midimistro on 22 Dec 2015

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/261393-combing-algorithm-finding-unique-equaltiy-between-vectors-of-different-sizes-using-ranges

Edited: Midimistro on 22 Jan 2016

I need to make an algorithm in MATLAB that can compare two vectors of different sizes, but of the same type of data. The algorithm goes as follows:

There are 3 times as many variables in vector B as there are in vector A.
We are trying to figure out what offset of vector A makes all of the elements in vector A equal to the same proportion of elements in vector B. In other words, A(1)=B(1), A(2)=B(4), and so on for each element in A; basically A needs to have an equal in B.
Elements in vector A can be within certain ranges of vector B, where those ranges are defined by another vector we'll call vector C. To understand this visually, think of a single index of B as a static block that doesn't move, and an index of A as a bristle that must fit in the range of any given index B where each respective value in the offset vector A fit into specific indexes of vector B.
To find this fit, these instances where A falls between [B,B+C] are then saved as a conditional (0s&1s) vector called "instances" which is then compared to a static instance vector called "siginstances". If instances == siginstances, then the offset of A has been found, else continue to offset A until a given number or until a match has been found.

To better understand the approach of this algorithm, you may want to consider that the vectors mentioned are being used to compare times caught by two different computers recording the same information. The relative algorithm goes as follows:

%Postive Incrimentation%
%test the offset of sigA between 0 and 90 seconds after sigA
IncrementA = sigA;
for a = 0.0:.000001:90
  IncrementA = sigA+a; %offset vector A
  instance = (IncrementA>=sigB) & (IncrementA<=sigB+sigC);
  %see if the values of vector A are within [sigB,sigB+sigC]
  %Think of sigB as a block and sigA as bristles. The proper bristles of sigA need to fit into 
  %the range of block sigB, where each of these blocks and bristles are a single index of that vector.
  %The range in which an index of sigA mus fit into sigB is defined by sigC, whose size is the same
  %as sigB, but contains different ranges for each instance of B.
  if (instance == siginstances)
      %compare the resulting conditional vector to the static vector; if true print out offset,
      %else continue algorithm till an offset is found, if any.
    disp(a);
      break;
  end %end if statement
end %end for statement
%use the same approach for decrementation (aka subtracting a from sigA)

I hope that this approach and request is clear to you so that you can help me create the proper algorithm that can properly process the above mathematical nightmare in MATLAB. Let me know if you need further clarification. Thank you.

5 Comments
Show 3 older commentsHide 3 older comments

Geoff Hayes on 4 Jan 2016

Midimistro - is this homework? While you have explained the problem that you need to solve, you haven't indicated where in the algorithm you are getting stuck. Please clarify which part is causing you a problem and please show what you have tried so far.

Midimistro on 4 Jan 2016

I wouldn't say I am necessarily stuck, but rather unsure if I am taking the right approach in the code above. I want an efficient algorithm that can analyze several data points quickly and find an alignment point were all the points align according to the process described in my first comment.

And this more for personal confidence in my programming and analysis skills more than anything. Not Homework.

Sign in to comment.

Sign in to answer this question.

Answer 1

Guillaume on 4 Jan 2016

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/261393-combing-algorithm-finding-unique-equaltiy-between-vectors-of-different-sizes-using-ranges#answer_205077

Open in MATLAB Online

Despite your claim, it's not entirely clear what you want. Particularly I've not understood at all what your truth vectors correspond to. A simple example would really help.

If I understood correctly you have as input something like:

A = [1 4 7 10];
B = 0:12;
C = [0.5 1 2 0.5 0 1 2 1 1 0.5 1 0 1];

Creating a truth table that tells you which element of A is between which elements of [B, B+C] is easy:

inrange = bsxfun(@ge, A', B) & bsxfun(@le, A', B+C)

The rows of inrange corresponds to elements of A, the columns correspond to elements of B. A 1 indicates that the corresponding element of A is within the range of B. If you have several 1 in a row, that means that element of A can match several elements of B.

You can create all combinations of potential match with ndgrid:

Bmatches = cellfun(@find, num2cell(inrange, 2), 'UniformOutput', false);
[Bmatches{:}] = ndgrid(Bmatches{:});
Bcombs = reshape(cat(numel(A)+1, Bmatches{:}), [], numel(A))

Each row of BCombs is a combination of B indices that match A

Note that ndgrid will run out of memory for A with a moderate number of elements and you may have to use a loop instead to generate the combinations.

8 Comments
Show 6 older commentsHide 6 older comments

Midimistro on 5 Jan 2016

Edited: Midimistro on 5 Jan 2016

Say I have an time array "A" that contains both noise and non-noise, in which each instance acts as a prong. I use the ge function to filter out noise, which generates a truth vector (1s & 0s) of the same size called "staticTruth". I have another time array called "B", which contains up to 3 times as many non-noise data points as "A", but contains only non-noise in the form of blocks (hence the need for evaluation of less than and greater than). Say "C" is the same size as B and contains the time added to B, thus creating the block. What the algorithm is supposed to do is see if A is between B & C to generate a "changingTruth" vector (of 1s & 0s) that at some point should match up with "staticTruth" only when A & B are aligned. Does that make it more clear? I can't get much simpler than that.

Note that the size of A (aka, the prongs; with noise) is ~66000 and the size of B & C (aka, the blocks; no noise) are ~63000. Not all prongs should fit in the blocks, hence the use of the two "Truth" vectors.

I hope that clears up confusion. The output I am looking for from the comparison is a truth vector the same size as "A", so that it can be compared to the "staticTruth" vector.

Midimistro on 6 Jan 2016

Edited: Midimistro on 6 Jan 2016

Here Is an example of the vectors:

ATime: contains the time for a signal reading (in this case offset by 1 second, which is unknown to me, the analyzer)
Example: ATime = {0.012, 0.023, 0.032, 0.042}
ASig: Same size as ATime containing signal values. Includes noise (aka -100).
Example: ASig = {-14, -100, -14.2, -14.1}
ATruth: A truth vector same size as ASig and ATime that is used to filter out the noise, where noise is 0.
Example: ATruth = {1, 0, 1, 1}
BTime: A vector that contains only time of the same reading, but contains no noise. This vector also contains up to 3x as many no-noise points.
Example: BTime = {1.012, 1.013, 1.014, 1.022, 1.024, 1.025, 1.032, 1.033, 1.040, 1.041, 1.042}
BLife: A vector that contains the life of Btime and is same size as BTime:
Example: BLife ={.001, .001, .003, .001, .002, .003, .001, .004, .001, .001, .005}
CTime: BTime+BLife (no need for values here since its simple math.) To save memory, do this calculation outside the loop.

The purpose of the algorithm is to generate a truth vector when the prongs, or ATime, fit in the so-called blocks (space between BTime and CTime). This truth vector needs to be the same size as vector ATime.

For example when the times are not aligned, we get the following vector, which is not equal to ATruth:

Example: DynamicTruth = {0, 1, 0, 0}

Basically, we keep adding .001 to a temporary vector called Atemp, which contains the original time of ATime, plus however many .001s that have been added by the loop as a result of a failed match between DynamicTruth and ATruth. When the times finally align, DynamicTruth should equal ATruth, thus displaying the output of how much time was added to ATime, even if its up to 100 seconds.

In other words, the algorithm should find that the ATime = {0.012, 0.023, 0.032, 0.042} should become ATime = {1.012, 1.023, 1.032, 1.042} because of the 1 second offset. Does that make sense? I seriously can't get much clearer than that.

Midimistro on 6 Jan 2016

Edited: Midimistro on 6 Jan 2016

Even though it is helpful and provides an output, I am unsure if bsxfun will work in my case (based on its documentation) since all my vectors are simple arrays (aka one_column X Many_Rows). The issue with my original code in the question above is that sigB is not the same size as IncrementA, thus the le and ge operations couldn't be done. Is there any other function other than bsxfun that can get around this, while still providing the proper array format?

I've gotten the bsx function to work (after a few minor recalculations elsewhere), however I am not sure its doing what I want it to do. Please verify that what you provided me will work in tandem with the code I provided in the question (replacing the lessthan/greaterthan protion, of course).

Specifically, the dynamicTruth (the truth array generated by the evaluation) needs to also be in "one_column X Many_Rows" format so that it can be compared to the staticTruth array. This doesn't seem to be the case with the bsx function.

Guillaume on 7 Jan 2016

Open in MATLAB Online

Your latest explanation does make things a lot clearer, thanks. But, unless I've misunderstood something, it seems to me that you're over complicating things a bit with your noise/non-noise and truth vectors.

You could just remove the noise values from ATime, and then try to match the resulting ATime with BTime. No need for DynamicTruth and you don't care about the signal values anymore.

My bsxfun method would work in combination with your loop approach:

ATime = [0.012, 0.023, 0.032, 0.042];
ASig = [-14, -100, -14.2, -14.1];
ATimeValid = ATime(ASig > -100);
BTime = [1.012, 1.013, 1.014, 1.022, 1.024, 1.025, 1.032, 1.033, 1.040, 1.041, 1.042];
BLife = [.001, .001, .003, .001, .002, .003, .001, .004, .001, .001, .005];
BUpper = BTime + BLife;
for offset = 0 : .001 : 90
   AOffset = ATimeValid + offset;  
   if all(any(bsxfun(@ge, AOffset, BTime') & bsxfun(@le, AOffset, BUpper')))
      fprintf('found a valid offset: %g\n', offset);        
   end
end

I'm not convinced that a trial and error method is the most efficient way of going about it. You're basically trying to match two signals, one being sampled at a lower and random frequency. I'm sure that there are some more efficient methods of doing that.

Midimistro on 22 Jan 2016

Edited: Midimistro on 22 Jan 2016

Open in MATLAB Online

The approach above would work fine if neither set of data was prone to error in its data points. However, because of the nature of my data, it seems that even with the offset, certain points will never align properly (in other-words, some prongs (even w/o noise), may very well be an erroneous/inaccurate (late) data point.

This means that bsxfun will never produce a matrix of all 1s. For this reason, I still need to take the same approach, but instead of finding an exact alignment, I need to produce a vector of percentages of when a non-noise prong of A falls between the block of BTime/BUpper to a certain degree (doesn't need to be 100%). We can then use the highest percentage to find the best fit offset.

Is an approach you and I took above similar to what should be done, or is there an extremely simple approach for getting this percentage that is completely flying over my head?

Note that getting the percentage of points that are aligned, so that we can use those percentages to be compared to the offset, (were the highest percentage is the best offset)is the main goal of the algorithm now.

Relative Code:

Increment = a; %clear/define Increment before
percent = double.empty; %define an empty vector to to append each resulting percentage value
offset = double.empty; %define an empty vector to to append each offset value
for a  = 0:.000001:90
   Increment = A+a;
   %find the percentage of points in Increment that meet this condition:
   (Increment >= B) & (Increment <= C);
   %this "finding" function is the part that I am stumbling on. Then continue below:
   offset = [offset a]; %adds an additional element of the offset "a" to the growing vector of "offset" to be used for later comparison
   percent = [percent a]; %does same thing as previous line.
end

Hope this slightly new approach is clear to you.

Sign in to comment.

Combing Algorithm: Finding Unique Equaltiy between Vectors of Different Sizes Using Ranges

5 Comments
Show 3 older commentsHide 3 older comments

Answers (1)

8 Comments
Show 6 older commentsHide 6 older comments

See Also

Categories

Tags

Products

Community Treasure Hunt

Combing Algorithm: Finding Unique Equaltiy between Vectors of Different Sizes Using Ranges

5 Comments Show 3 older commentsHide 3 older comments

Answers (1)

8 Comments Show 6 older commentsHide 6 older comments

See Also

Categories

Tags

Products

Community Treasure Hunt

5 Comments
Show 3 older commentsHide 3 older comments

8 Comments
Show 6 older commentsHide 6 older comments