In scientific work, once the data have been gathered from a population of interest, it is often difficult to get a sense of what they indicate while they are presented in an unorganized fashion.
Assembling the raw data into a meaningful form, such as a frequency distribution, makes the data easier to understand and interpret. Frequency distributions are also where the need to convey the numerical information contained in the data succinctly becomes apparent.
Grouped data are data that have been organized into groups known as classes. The raw dataset can be organized by constructing a table showing the frequency distribution of the variable (whose values are given in the raw dataset). Such a frequency table is often referred to as grouped data.
A geometric mean, unlike an arithmetic mean, tends to dampen the effect of very high or low values, which might bias the mean if a straight average (arithmetic mean) were calculated. As an example, this is helpful when analyzing bacteria concentrations, because levels may vary anywhere from 10 to 10,000 fold over a given period. The geometric mean is equivalent to averaging the log-transformed data and transforming the result back, which enables meaningful statistical evaluation of such data.
Besides being used by scientists and biologists, geometric means are also used in many other fields, most notably financial reporting. This is because when evaluating investment returns and fluctuating interest rates, it is the geometric mean, not the arithmetic mean, that tells you what the constant rate of return would have to have been over the entire investment period to achieve the same end result.
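The point about investment returns can be illustrated with a small sketch. The function name average_growth_rate and the two return values are hypothetical, chosen only for illustration: a gain of 50% followed by a loss of 50% has an arithmetic mean return of zero, yet the investment ends below its starting value, and the geometric mean of the growth factors reflects that.

```python
import math

def average_growth_rate(returns):
    """Constant per-period return that compounds to the same end value.

    Hypothetical helper: takes periodic returns as fractions (0.10 = 10%).
    """
    growth = [1.0 + r for r in returns]            # growth factor per period
    log_mean = sum(math.log(g) for g in growth) / len(growth)
    return math.exp(log_mean) - 1.0                # geometric mean factor minus 1

# Illustrative (made-up) returns: +50% in period 1, -50% in period 2.
returns = [0.50, -0.50]
arithmetic = sum(returns) / len(returns)
geometric = average_growth_rate(returns)

end_value = 1.0
for r in returns:
    end_value *= 1.0 + r                           # actual compounded result

print("arithmetic mean return:", arithmetic)
print("geometric mean return: ", geometric)
print("end value per unit invested:", end_value)
```

Compounding the geometric mean return over the two periods reproduces the actual end value, whereas compounding the arithmetic mean return would suggest, wrongly, that nothing was lost.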
The geometric mean is often used to evaluate data covering several orders of magnitude, and sometimes for evaluating ratios, percentages, or other data sets bounded by zero. If your data cover a narrow range, or if the data are concentrated around high values (i.e., skewed to the left), geometric means and log transformations may not be appropriate. Do not use the geometric mean on data that are already log transformed, such as pH or decibels (dB).
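The relationships among the arithmetic (AM), geometric (GM) and harmonic (HM) means of positive values are:
AM >= GM >= HM (with equality only when all the values are equal)
GM^2 = AM*HM (exact for two values; it does not hold in general for more than two).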
The geometric and harmonic means penalise uneven performances, but the harmonic mean penalises them more heavily.
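A short numerical check may make these relationships concrete. The scores below are made-up values and the helper names are hypothetical; the sketch compares an even and an uneven pair with the same arithmetic mean, and also shows that the product identity, which is exact for two values, need not hold for three.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # exp of the mean of the logs; requires strictly positive values
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    return len(xs) / sum(1.0 / x for x in xs)

# Two hypothetical score pairs with the same arithmetic mean.
even = [50.0, 50.0]
uneven = [10.0, 90.0]

for name, xs in (("even", even), ("uneven", uneven)):
    print(name, arithmetic_mean(xs), geometric_mean(xs), harmonic_mean(xs))

am, gm, hm = arithmetic_mean(uneven), geometric_mean(uneven), harmonic_mean(uneven)

# Three values: the ordering still holds, the product identity generally does not.
three = [1.0, 2.0, 3.0]
am3, gm3, hm3 = arithmetic_mean(three), geometric_mean(three), harmonic_mean(three)
print("GM^2 =", gm3 ** 2, " AM*HM =", am3 * hm3)
```

For the even pair all three means coincide. For the uneven pair the geometric mean falls below the arithmetic mean and the harmonic mean falls further still.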
Landwehr (1978) describes the properties of the geometric mean. More generally (Jensen, 1998), one can define the power mean, p-norm, or generalized mean
Mp = [E[x^p]]^(1/p)
which reduces to the harmonic, geometric and arithmetic means for p = -1, p -> 0 (the limit approached along, e.g., p = 1/2, 1/3, 1/4, ..., 1/20000, ..., 1/n) and p = 1, respectively.
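The generalized mean can be sketched as follows, assuming equally weighted, strictly positive values so that the expectation E[x^p] is a plain average. The function name power_mean and the sample values are hypothetical. The case p = 0 is handled separately as the limit, and a very small p is evaluated to show the value approaching the geometric mean.

```python
import math

def power_mean(xs, p):
    """Generalized mean M_p = (mean(x^p))^(1/p) for strictly positive xs.

    p = 0 is treated as the limit p -> 0, i.e. the geometric mean.
    """
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

xs = [1.0, 4.0, 16.0]               # illustrative values only

harmonic = power_mean(xs, -1)       # p = -1
geometric = power_mean(xs, 0)       # limit p -> 0
arithmetic = power_mean(xs, 1)      # p = 1
near_zero = power_mean(xs, 1e-6)    # small p, close to the geometric mean

print(harmonic, geometric, arithmetic, near_zero)
```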
Here, we developed an m-code to calculate the geometric mean of grouped data.
As input, one can use the vectors n and xout returned by the hist m-function (or modified versions of them), which contain the frequency counts and the bin locations, arranged as the columns of a matrix.
The geometric mean is calculated with the formula
G = [(MC1^F1)*(MC2^F2)*...*(MCk^Fk)]^(1/N)
or, equivalently, in logarithmic form,
Gl = Sum[Fi*log10(MCi)]/N, i = 1,2,...,k, so that G = 10^Gl
where
Fi = class frequency
MCi = class mark
N = sample size [sum(Fi)]
k = number of classes
Syntax: function y = ggeomean(x)
x - data matrix (Size of matrix must be n-by-2; absolute frequency=column 1, class mark=column 2)
y - geometric mean of the values in x
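The calculation can be sketched in Python as below. This is not the ggeomean m-code itself, only a minimal illustration of the same formula; the function name grouped_geomean, the input checks, and the example table are assumptions made for the sketch. The input mirrors the described layout: one row per class, with the absolute frequency first and the class mark second.

```python
import math

def grouped_geomean(table):
    """Geometric mean of grouped data via the log10 form of the formula.

    table: iterable of (absolute frequency Fi, class mark MCi) rows.
    Hypothetical sketch; class marks with nonzero frequency must be positive.
    """
    n = 0              # N = sum(Fi)
    log_sum = 0.0      # Sum[Fi * log10(MCi)]
    for f, mc in table:
        if f < 0:
            raise ValueError("frequencies must be non-negative")
        if f == 0:
            continue   # empty classes do not contribute
        if mc <= 0:
            raise ValueError("class marks must be positive")
        n += f
        log_sum += f * math.log10(mc)
    if n == 0:
        raise ValueError("total frequency is zero")
    return 10.0 ** (log_sum / n)

# Made-up frequency table: (frequency, class mark)
table = [(2, 1.0), (1, 10.0), (1, 100.0)]
g = grouped_geomean(table)

# Same quantity from the product form G = [prod(MCi^Fi)]^(1/N)
g_product = (1.0 ** 2 * 10.0 ** 1 * 100.0 ** 1) ** (1.0 / 4)
print(g, g_product)
```

Working with the sum of logarithms rather than the product avoids overflow when the frequencies or class marks are large, which is presumably why the logarithmic form is given alongside the product form.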