From: "Steven_Lord" <slord@mathworks.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: PDF in matlab not the same as PDF in Excel
Date: Fri, 13 Aug 2010 10:02:04 -0400
Organization: MathWorks
Lines: 87
Message-ID: <i43j8t$k17$1@fred.mathworks.com>
References: <i3vaeo$ggp$1@fred.mathworks.com> <i3vfek$qa5$1@fred.mathworks.com> <i3vgp9$io5$1@fred.mathworks.com> <i41uug$6ln$1@fred.mathworks.com>



"Safa " <enxss10@nottingham.ac.uk> wrote in message 
news:i41uug$6ln$1@fred.mathworks.com...
> "Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in 
> message <i3vgp9$io5$1@fred.mathworks.com>...
>> > Thanks for your suggestion. I have read histc documentation several 
>> > times. It gives the following equation:
>> > n(k) counts the value x(i) if edges(k) <= x(i) < edges(k+1). What I 
>> > would like to be able to do, is to tweak the histc command so that it 
>> > gives the same frequency distribution as in excel. Is this possible? At 
>> > the moment, the Excel and Matlab are counting the numbers differently, 
>> > and I am at a loss to why it is doing this. Appreciate any further 
>> > advice you could give on this matter. Thanks in advance.
>> - - - - - - - -
>>   I repeat!  This is something you are entirely capable of finding out for 
>> yourself with the use of the sort function.  Pin down the individual data 
>> value or values where Excel made one decision and histc a different one 
>> and then you are well on your way to solving your own problem.
>>
>> Roger Stafford
>
> I must say I wasn't happy with the tone of your second message as 
> this is a serious query about the operation of Matlab. I will respond to 
> it, by submitting the following example.
>
> Y=[0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 
> 0.9 0.95 1 1.2]
> Bins=[0.2 0.4 0.6 0.8 1]
> Histc(Y,Bins) gives 4,4,4,4,1
>
> Frequency command in Excel gives 3,4,4,4,4
>
> I am more of an Excel user, and I already know that the frequency command 
> counts the instance of numbers that are less than or equal to the upper 
> limit of each bin. Obviously Matlab is using the formula: n(k) counts the 
> value x(i) if edges(k) <= x(i) < edges(k+1).

Indeed; that is the documented behavior of the function.

http://www.mathworks.com/access/helpdesk/help/techdoc/ref/histc.html

> It is unclear however what Matlab does for the last bin, does it just 
> count instances of 1 exactly?

That too is documented on the page above:

"The last bin counts any values of x that match edges(end). Values outside 
the values in edges are not counted."
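
As a quick check of that rule, here is a small sketch that runs HISTC on the 
data from the quoted message.  The name nUncounted is made up for this 
illustration; everything else is copied from the message above.

```matlab
% Data and edges copied from the quoted message.
Y = [0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 ...
     0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 1.2];
Bins = [0.2 0.4 0.6 0.8 1];

% n(k) counts x(i) when Bins(k) <= x(i) < Bins(k+1);
% the last element counts values equal to Bins(end).
counts = histc(Y, Bins);
disp(counts)        % the quoted message reports 4 4 4 4 1

% Values outside the edges (here 0.1, 0.15 and 1.2) are not counted.
nUncounted = numel(Y) - sum(counts);
disp(nUncounted)
```

The three values that fall outside the edges are the reason the counts sum 
to less than the number of elements in Y.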

> As I mentioned in my previous message, I already looked histc in Matlab 
> help, and I requested  a way to CHANGE the histc so that it matches Excel. 
> Histc is an inbuilt command in Matlab and I don't know how to change the 
> above inbuilt equation.

We do not provide the source code for HISTC and so you cannot change what 
HISTC does short of shadowing it (which I do NOT recommend.)  I suppose you 
could create your own HISTC subfunction or private function to shadow it 
without making your shadowed version globally visible, but I'd still be wary 
of doing so.  What I would recommend is creating your own function that does 
the equivalent of Excel's FREQUENCY command, but you would likely need to 
test it thoroughly as the documentation for the FREQUENCY command leaves 
unsaid (at least in my cursory glance) what the command does in certain 
(potentially common) scenarios.
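
A minimal sketch of what such a function might compute, written as a script 
so it can be pasted and run.  This is not Excel's implementation.  It 
assumes the rule described in the quoted message (a bin counts values 
greater than the previous limit and less than or equal to its own limit), 
that Bins is already sorted in ascending order, and that one extra trailing 
element counts values above the last limit.  The names lower, upper and freq 
are made up for this illustration.

```matlab
% Data and bin limits copied from the quoted message.
Y = [0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 ...
     0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 1.2];
Bins = [0.2 0.4 0.6 0.8 1];

% Assumption: Bins is sorted ascending.  Each interval is
% (lower(k), upper(k)], plus one overflow interval (Bins(end), Inf].
lower = [-Inf, Bins(:).'];
upper = [Bins(:).', Inf];

freq = zeros(1, numel(upper));
for k = 1:numel(upper)
    % Count values in the half-open interval (lower(k), upper(k)].
    freq(k) = sum(Y > lower(k) & Y <= upper(k));
end

disp(freq)          % 3 4 4 4 4 1 for this data
```

The first five elements match the 3,4,4,4,4 reported for Excel above.  
Unsorted limits, empty input and non-numeric data are not handled here and 
would need to be tested against Excel separately.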

For your particular issue, you will need to be extra careful, as many of the 
numbers in your Y vector and almost all of the numbers in your Bins vector 
cannot be exactly represented in double-precision floating point.  I'm 
assuming that for your real problem (rather than the demonstration example 
above) you will not be typing in the Y vector but computing it somehow; if 
that's the case, you may think the third element of Y is the same as the 
first element of the Bins vector, but it may not be.

x = 0:0.1:1;
x == 0.3 % returns all false values, even though it _looks_ like 0.3 is the
         % fourth element of x.  This is the CORRECT behavior.

See question 6.1 in the newsgroup FAQ for more information on this issue 
related to floating-point arithmetic.
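
One common way around this is to compare against a tolerance instead of 
testing for exact equality.  A short sketch follows; the names tol and idx 
and the tolerance value 1e-12 are arbitrary choices for this illustration, 
and a suitable tolerance depends on the scale of your data.

```matlab
x = 0:0.1:1;

% Treat values within tol of the target as equal to it.
tol = 1e-12;                      % arbitrary; choose to suit your data
idx = find(abs(x - 0.3) < tol);   % index of the element "equal" to 0.3

disp(idx)                         % 4
```

The same idea applies to bin limits: if computed values should count as 
landing on a limit, the comparison against that limit needs a tolerance too.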

-- 
Steve Lord
slord@mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
To contact Technical Support use the Contact Us link on 
http://www.mathworks.com