Thread Subject: kaplan meier density

Subject: kaplan meier density

From: leo nidas

Date: 11 Nov, 2009 13:45:07

Message: 1 of 2


Hi there,

I would like to nonparametrically estimate the density of some censored data. (I also would like not to use kernels). So I simply thought that a solution is through the Kaplan Meier since S(x)=1-F(x)=> f(x)=-S'(x).

So I did the following, in order to estimate an exponential distribution with mean 3, but the plot does not seem right.. I am expecting some sort of step function similar to a histogram of the exponential distribution.

Could you please help me out? Thanx in advance!!

n=200;
x=exprnd(3,n,1); %real data
c=exprnd(3,n,1); %censoring variable
xcen=min(x,c); % data to be used
status=(x>c); %status indicator

[fi,xi,flo,fup] = ecdf(xcen,'censoring',status);
stairs(xi,1-fi,'LineWidth',2) %Kaplan Meier
title('Kaplan Meier estimate of S(x)')

S=[1-fi xi]; %Kaplan Meier with times of deaths
Sx=S(:,1);
x=S(:,2);
x=x(2:end);
for i=1:(length(find(status==0))-1)
fx(i)=(Sx(i)-Sx(i+1))/(x(i+1)-x(i)); % esimate of f(x)
end
fx(length(find(status==0)))=0;

stairs(x,fx,'LineWidth',2) %plot f(x)

Subject: kaplan meier density

From: Tom Lane

Date: 11 Nov, 2009 16:42:17

Message: 2 of 2

> I would like to nonparametrically estimate the density of some censored
> data. (I also would like not to use kernels). So I simply thought that a
> solution is through the Kaplan Meier since S(x)=1-F(x)=> f(x)=-S'(x).
>
> So I did the following, in order to estimate an exponential distribution
> with mean 3, but the plot does not seem right.. I am expecting some sort
> of step function similar to a histogram of the exponential distribution.

Leo, it appears you are estimating the density at data points with
increments based on the data spacing. I would expect this to be noisy.

If you want something more like a histogram, I recommend taking differences
in the survivor function over bigger intervals, comparable to histogram bin
widths. Here's a bit of code that tries to interpolate into the survivor
function, and compute finite differences over larger intervals. I don't
guarantee this but you may want to play around with it as a starting point.

% Set up t and S for interpolation, so that when S jumps at time t we
% record this as S(t) taking the value before the jump and S(t+eps) taking
% the value after the jump
xi(1) = 0; % avoid possibly repeated value here
tint = [-100,(xi+eps(xi))'; xi',100]; tint = tint(:);
Sint = 1-[0,fi';fi',1]; Sint = Sint(:);

% Compute finite differences of the survivor function over a range of t
% values
t = linspace(0,9,10);
S = interp1(tint,Sint,t);
ft = -diff(S)./diff(t);
stairs(t,[ft,ft(end)]);

-- Tom

Tags for this Thread

Everyone's Tags:

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Tag Activity for This Thread
Tag Applied By Date/Time
kaplan meier de... Eva-Maria 20 Jun, 2011 10:20:43
kaplan meier de... leo nidas 11 Nov, 2009 08:49:04
rssFeed for this Thread

Contact us at files@mathworks.com