Path: news.mathworks.com!not-for-mail
From: "Siva " <sivaathome@gmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: Removing duplicates
Date: Mon, 16 Apr 2012 03:03:06 +0000 (UTC)
Organization: Roche Diagnostics
Lines: 41
Message-ID: <jmg25a$7p1$1@newscl01ah.mathworks.com>
References: <jmfvu3$sl3$1@newscl01ah.mathworks.com>
Reply-To: "Siva " <sivaathome@gmail.com>
NNTP-Posting-Host: www-02-blr.mathworks.com
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Trace: newscl01ah.mathworks.com 1334545386 7969 172.30.248.47 (16 Apr 2012 03:03:06 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Mon, 16 Apr 2012 03:03:06 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 11031
Xref: news.mathworks.com comp.soft-sys.matlab:764620

"Mary Thompson" wrote in message <jmfvu3$sl3$1@newscl01ah.mathworks.com>...
> I was wondering if it would be possible to do the following.
> 
> I have a set of data in one column with ID numbers:
> 
> ID:
> 22
> 22
> 33
> 33
> 44
> 44
> 55
> 55
> 66
> 66
> 66
> 77
> 77
> 88
> 88
> 88
> 
> The first and second row should be the same. However, there are cases, such as 66 and 88, where the identifier and the data that comes along with it repeat three times. I would like to remove the middle duplicate. I have not been able to do this in Excel and was wondering whether MATLAB has any kind of checking/verifying for it.
> 
> thanks.

I'm not sure how big your data sets are, but for small data sets this might work:

% Assume DATA is a matrix where column 1 contains the ID and the remaining
% columns contain the associated data.
uniqueIDs = unique(DATA(:, 1));           % all distinct IDs
for k = 1:numel(uniqueIDs)
  % Rows holding the k'th unique ID (recomputed on every pass, so earlier
  % deletions do not leave stale row indices).
  idx = find(DATA(:, 1) == uniqueIDs(k));
  if numel(idx) == 3                      % exactly three rows share this ID
    DATA(idx(2), :) = [];                 % discard the middle (second) row
  end
end

After this loop has run, DATA no longer contains the middle (second) row of any ID that appeared exactly three times. IDs that appear any other number of times are left untouched.
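
If the matrix is larger, deleting rows one at a time inside a loop can get slow, because each deletion has to shift the rest of DATA. A possible variation, offered only as a sketch, is to mark the rows to drop first and delete them all in one step. The sample DATA below is made up for illustration (column 2 is a dummy payload), and the names grp, counts and toDrop are mine, not anything from your data:

```matlab
% Hypothetical sample: column 1 is the ID, column 2 is made-up payload data.
DATA = [22 1; 22 2; 33 3; 33 4; 66 5; 66 6; 66 7; ...
        77 8; 77 9; 88 10; 88 11; 88 12];

ids = DATA(:, 1);
[~, ~, grp] = unique(ids);          % grp(r) = index of the unique ID in row r
counts = accumarray(grp(:), 1);     % number of rows for each unique ID

toDrop = false(size(ids));          % logical mask of rows to delete
for g = find(counts == 3).'         % only IDs that occur exactly three times
  rows = find(grp == g);            % row numbers of this ID, in original order
  toDrop(rows(2)) = true;           % mark the middle (second) occurrence
end

DATA(toDrop, :) = [];               % remove all marked rows in one step
```

With this sample, the rows with payload 6 (ID 66) and payload 11 (ID 88) are removed and the other ten rows keep their original order. Like the loop above, this only touches IDs that occur exactly three times.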