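The idea of sufficiency is that if we observe a random variable $X$ (using a sample $X_1, \dots, X_n$, or $\mathbf{X}$) whose distribution depends on $\theta$, then $\mathbf{X}$ can often be reduced via a function $T(\mathbf{X})$ without losing any information about $\theta$. For example, in repeated success/failure trials the number of successes carries all the information in the sample about the success probability.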
Definition 7.1
A statistic $T = T(\mathbf{X})$ is said to be sufficient for a family of distributions if and only if the conditional distribution of $\mathbf{X}$ given the value of $T$ is the same for all members of the family (that is, does not depend on $\theta$).
Equivalent definitions for the discrete and continuous cases are Definitions 7.2 and 7.3 below, respectively.
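Definition 7.2
Let $\{f(x;\theta) : \theta \in \Theta\}$ be a family of distributions of the discrete type. For a random sample $X_1, \dots, X_n$ from $f(x;\theta)$, define $T = T(X_1, \dots, X_n)$. Then $T$ is a sufficient statistic for $\theta$ if, for all $\theta \in \Theta$ and all possible sample points,
$$P\bigl(X_1 = x_1, \dots, X_n = x_n \mid T = t(x_1, \dots, x_n)\bigr) \tag{7.1}$$
does not depend on $\theta$.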
Consider here the role of $\theta$. Its job is to represent all the stochastic information in the data. Other information, such as the scale of measurement, should not be random. So if $T$ is sufficient for $\theta$, then interpreting the data conditional on $T$ should remove all the stochastic bits that involve $\theta$, leaving only parts whose behaviour is the same whatever the value of $\theta$.
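The discrete-case definition can be checked by brute force on a small case. The sketch below assumes a sample of $n = 3$ Bernoulli trials with $T$ equal to the number of successes; the function name `conditional_given_sum` and the two parameter values are illustrative choices, not part of the notes. It enumerates every sample point, computes the conditional probability given $T$ exactly, and compares the result for two different values of $\theta$.

```python
from fractions import Fraction
from itertools import product


def conditional_given_sum(n, theta):
    """Return {t: {x: P(X = x | T = t)}} for n Bernoulli(theta) trials, T = sum(x)."""
    joint = {}
    for x in product((0, 1), repeat=n):
        t = sum(x)
        # Joint pmf of the sample point x: theta^t * (1 - theta)^(n - t)
        joint[x] = theta ** t * (1 - theta) ** (n - t)

    # Marginal pmf of T: sum the joint pmf over each level set {x : sum(x) = t}
    marginal = {}
    for x, p in joint.items():
        marginal[sum(x)] = marginal.get(sum(x), 0) + p

    # Conditional pmf of the sample given T = t
    cond = {}
    for x, p in joint.items():
        cond.setdefault(sum(x), {})[x] = p / marginal[sum(x)]
    return cond


# Two different members of the family, in exact rational arithmetic
cond_a = conditional_given_sum(3, Fraction(1, 5))
cond_b = conditional_given_sum(3, Fraction(7, 10))

# Sufficiency requires the two conditional distributions to coincide
same = cond_a == cond_b
print(same)
```

Exact fractions are used rather than floats so that the comparison is an equality test and not a tolerance check. In this small case each sample point with $t$ successes has conditional probability $1/\binom{3}{t}$ regardless of $\theta$.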
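Definition 7.3
Let $X_1, \dots, X_n$ be a random sample from a continuous distribution, $f(x;\theta)$. Let $T = T(X_1, \dots, X_n)$ be a statistic with pdf $g(t;\theta)$. Then $T$ is sufficient for $\theta$ if and only if
$$\frac{f(x_1;\theta)\, f(x_2;\theta) \cdots f(x_n;\theta)}{g\bigl(t(x_1, \dots, x_n);\theta\bigr)} \tag{7.2}$$
does not depend on $\theta$.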
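Example 7.1
Given $X_1, \dots, X_n$ is a random sample from a binomial distribution with parameters $(m, \theta)$, show that $T = \sum_{i=1}^{n} X_i$ is a sufficient statistic for $\theta$.
Solution. From Definition 7.2, we need to consider the conditional probability $P(X_1 = x_1, \dots, X_n = x_n \mid T = t)$ and check that it does not depend on $\theta$.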
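A sketch of this calculation follows. It assumes that each $X_i$ is binomial with a known number of trials $m$ and success probability $\theta$ (the symbol $m$ is introduced here for illustration), and that the statistic is the sample total $T = \sum_{i=1}^{n} X_i$, which is then binomial with parameters $(nm, \theta)$. For a sample point with $\sum_i x_i = t$,

$$P(X_1 = x_1, \dots, X_n = x_n \mid T = t) = \frac{\prod_{i=1}^{n} \binom{m}{x_i} \theta^{x_i} (1-\theta)^{m - x_i}}{\binom{nm}{t} \theta^{t} (1-\theta)^{nm - t}} = \frac{\prod_{i=1}^{n} \binom{m}{x_i}}{\binom{nm}{t}},$$

since the numerator contains $\theta^{\sum_i x_i} (1-\theta)^{nm - \sum_i x_i} = \theta^{t} (1-\theta)^{nm - t}$, which cancels with the denominator. For sample points with $\sum_i x_i \neq t$ the conditional probability is zero. In either case the result is free of $\theta$, so under these assumptions $T$ is sufficient for $\theta$.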
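Example 7.2
Let $X_1, \dots, X_n$ be a random sample from the truncated exponential distribution, where
$$f(x;\theta) = e^{-(x-\theta)}, \quad x > \theta.$$
Show that the smallest order statistic $Y_1 = \min(X_1, \dots, X_n)$ is sufficient for $\theta$.
Solution. In Definition 7.3, $T = Y_1$, and to examine (7.2) we need $g(y_1;\theta)$, the pdf of the smallest order statistic. Now for the pdf above,
$$F(x;\theta) = \int_{\theta}^{x} e^{-(u-\theta)}\, du = 1 - e^{-(x-\theta)}, \quad x > \theta.$$
From Distribution Theory (4.6), the pdf of $Y_1$ is
$$g(y_1;\theta) = n\,[1 - F(y_1;\theta)]^{n-1} f(y_1;\theta) = n e^{-n(y_1-\theta)}, \quad y_1 > \theta.$$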
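To complete the check, here is a sketch assuming the density is $f(x;\theta) = e^{-(x-\theta)}$ for $x > \theta$ and that $Y_1 = \min(X_1, \dots, X_n)$ has pdf $g(y_1;\theta) = n e^{-n(y_1-\theta)}$ for $y_1 > \theta$. The ratio required by the continuous-case definition is then

$$\frac{\prod_{i=1}^{n} e^{-(x_i-\theta)}}{n e^{-n(y_1-\theta)}} = \frac{e^{-\sum_{i} x_i + n\theta}}{n e^{-n y_1 + n\theta}} = \frac{e^{-\sum_{i} x_i}}{n e^{-n y_1}}, \quad y_1 = \min_i x_i > \theta.$$

The factor $e^{n\theta}$ cancels, and the support condition $\min_i x_i > \theta$ is the same in the numerator and the denominator, so the ratio is free of $\theta$ wherever it is defined. Under these assumptions $Y_1$ is sufficient for $\theta$.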
In establishing that a particular statistic is sufficient, we do not usually use the above definition(s) directly. Instead, a factorization criterion is preferred and this is described in 1.3.