ks.test {stats} | R Documentation |

Performs one or two sample Kolmogorov-Smirnov tests.

ks.test(x, y, ..., alternative = c("two.sided", "less", "greater"), exact = NULL)

`x` |
a numeric vector of data values. |

`y` |
either a numeric vector of data values, or a character string naming a distribution function. |

`...` |
parameters of the distribution specified (as a character
string) by `y` . |

`alternative` |
indicates the alternative hypothesis and must be
one of `"two.sided"` (default), `"less"` , or
`"greater"` . You can specify just the initial letter of the
value, but the argument name must be given in full.
See Details for the meanings of the possible values. |

`exact` |
`NULL` or a logical indicating whether an exact
p-value should be computed. See Details for the meaning of
`NULL` . Not used for the one-sided two-sample case. |

If `y` is numeric, a two-sample test of the null hypothesis
that `x` and `y` were drawn from the same *continuous*
distribution is performed.

Alternatively, `y` can be a character string naming a continuous
distribution function. In this case, a one-sample test is carried
out of the null hypothesis that the distribution function which generated
`x` is distribution `y` with parameters specified by `...`.

The presence of ties generates a warning, since continuous distributions do not generate them.
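As a small illustration of the tie warning, the sketch below runs a one-sample test on data containing a repeated value and captures the warning instead of printing it. The vector `x_tied` and the variable `tie_warning` are illustrative names, not part of `ks.test`, and the wording of the warning message may differ between R versions.

```r
# Data with a deliberate tie (0.5 appears twice); illustrative values only
x_tied <- c(0.1, 0.5, 0.5, 0.9, 1.3, 1.7)

# Capture the warning text rather than letting it print
tie_warning <- NULL
res <- withCallingHandlers(
  ks.test(x_tied, "pnorm"),
  warning = function(w) {
    tie_warning <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

tie_warning   # non-NULL: the test warned about the tied values
res$p.value   # still returned, but should be read with caution
```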

The possible values `"two.sided"`, `"less"` and `"greater"` of
`alternative` specify the null hypothesis
that the true distribution function of `x` is equal to, not less
than or not greater than the hypothesized distribution function
(one-sample case) or the distribution function of `y` (two-sample
case), respectively. This is a comparison of cumulative distribution
functions, and the test statistic is the maximum difference in value,
with the statistic in the `"greater"` alternative being
*D^+ = max_u [ F_x(u) - F_y(u) ]*. Thus in the two-sample case
`alternative="greater"` includes distributions for which `x`
is stochastically *smaller* than `y` (the CDF of `x` lies
above and hence to the left of that for `y`), in contrast to
`t.test` or `wilcox.test`.
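To make the definition of *D^+* concrete, the following sketch computes it by hand from the two empirical CDFs and compares it with the statistic reported by `ks.test`. The data values and the names `Fx`, `Fy`, `u`, `d_plus` and `d_two` are made up for illustration; evaluating the difference only at the pooled sample points is an assumption that suffices here because the empirical CDFs are step functions that change only at those points.

```r
# Small hand-made samples without ties (illustrative values only)
x <- c(0.1, 0.4, 0.7, 1.2, 1.9)
y <- c(0.3, 0.9, 1.5, 2.2, 2.8, 3.1)

Fx <- ecdf(x)        # empirical CDF of x
Fy <- ecdf(y)        # empirical CDF of y
u  <- sort(c(x, y))  # the step functions only change at pooled sample points

d_plus <- max(Fx(u) - Fy(u))       # D^+ = max_u [ F_x(u) - F_y(u) ]
d_two  <- max(abs(Fx(u) - Fy(u)))  # two-sided statistic

res_greater <- ks.test(x, y, alternative = "greater")
res_two     <- ks.test(x, y)

# Both comparisons should agree up to floating-point error
all.equal(unname(res_greater$statistic), d_plus)
all.equal(unname(res_two$statistic), d_two)
```

Here the empirical CDF of `x` lies above that of `y` throughout, so `x` is the stochastically smaller sample and `"greater"` is the alternative that picks this up.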

Exact p-values are not available for the one-sided two-sample case, or
in the case of ties. If `exact = NULL` (the default), an exact
p-value is computed if the sample size is less than 100 in the
one-sample case, and if the product of the sample sizes is less than
10000 in the two-sample case. Otherwise, asymptotic distributions are
used whose approximations may be inaccurate in small samples. In the
one-sample two-sided case, exact p-values are obtained as described in
Marsaglia, Tsang & Wang (2003). The formula of Birnbaum & Tingey
(1951) is used for the one-sample one-sided case.
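The effect of `exact` can be seen on a small one-sample example. This is a minimal sketch with simulated data; the seed, the sample size of 20 and the variable names are arbitrary choices for illustration.

```r
set.seed(1)           # arbitrary seed, only for reproducibility
x_small <- rnorm(20)  # n = 20 < 100, so exact = NULL selects the exact p-value

p_default <- ks.test(x_small, "pnorm")$p.value                 # exact = NULL
p_exact   <- ks.test(x_small, "pnorm", exact = TRUE)$p.value   # forced exact
p_asymp   <- ks.test(x_small, "pnorm", exact = FALSE)$p.value  # asymptotic

# The default and forced-exact p-values coincide for this sample size;
# the asymptotic one is an approximation and will generally differ.
c(default = p_default, exact = p_exact, asymptotic = p_asymp)
```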

If a single-sample test is used, the parameters specified in
`...` must be pre-specified and not estimated from the data.
There is some more refined distribution theory for the KS test with
estimated parameters (see Durbin, 1973), but that is not implemented
in `ks.test`.
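A small simulation can illustrate why the parameters must not be estimated from the data. The sketch below draws normal samples, so the null hypothesis is true, and compares p-values obtained with parameters fixed in advance against p-values obtained after plugging in the sample mean and standard deviation. The seed, the number of replications and all variable names are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed for reproducibility
n_rep <- 500  # number of simulated samples (illustrative)
n_obs <- 30   # observations per sample (illustrative)

p_fixed <- numeric(n_rep)
p_est   <- numeric(n_rep)
for (i in seq_len(n_rep)) {
  x_sim <- rnorm(n_obs)
  # Valid use: mean and sd fixed before seeing the data
  p_fixed[i] <- ks.test(x_sim, "pnorm", 0, 1)$p.value
  # Invalid use: mean and sd estimated from the same data
  p_est[i] <- ks.test(x_sim, "pnorm", mean(x_sim), sd(x_sim))$p.value
}

# Proportion of p-values below 0.05 under a true null hypothesis
c(fixed = mean(p_fixed < 0.05), estimated = mean(p_est < 0.05))
```

With fixed parameters the rejection rate should be close to the nominal 5%. With estimated parameters the fitted distribution is pulled towards the sample, so the p-values tend to be too large and the test rejects far less often than its nominal level suggests.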

A list with class `"htest"` containing the following components:

`statistic` |
the value of the test statistic. |

`p.value` |
the p-value of the test. |

`alternative` |
a character string describing the alternative hypothesis. |

`method` |
a character string indicating what type of test was performed. |

`data.name` |
a character string giving the name(s) of the data. |
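The components listed above can be read from the returned object with `$`. A brief sketch on simulated data follows; the seed and the names `x_val` and `res` are illustrative.

```r
set.seed(7)          # arbitrary seed for reproducibility
x_val <- rnorm(25)   # simulated data, for illustration only
res <- ks.test(x_val, "pnorm")

class(res)           # "htest"
res$statistic        # value of the test statistic
res$p.value          # p-value of the test
res$alternative      # description of the alternative hypothesis
res$method           # description of the test performed
res$data.name        # name(s) of the data
```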

Z. W. Birnbaum & Fred H. Tingey (1951),
One-sided confidence contours for probability distribution functions.
*The Annals of Mathematical Statistics*, **22**/4, 592–596.

William J. Conover (1971),
*Practical Nonparametric Statistics*.
New York: John Wiley & Sons.
Pages 295–301 (one-sample “Kolmogorov” test),
309–314 (two-sample “Smirnov” test).

J. Durbin (1973),
*Distribution theory for tests based on the sample distribution
function*. SIAM.

George Marsaglia, Wai Wan Tsang & Jingbo Wang (2003),
Evaluating Kolmogorov's distribution.
*Journal of Statistical Software*, **8**/18.
http://www.jstatsoft.org/v08/i18/.

`shapiro.test`, which performs the Shapiro-Wilk test for normality.

x <- rnorm(50)
y <- runif(30)
# Do x and y come from the same distribution?
ks.test(x, y)
# Does x come from a shifted gamma distribution with shape 3 and rate 2?
ks.test(x+2, "pgamma", 3, 2) # two-sided, exact
ks.test(x+2, "pgamma", 3, 2, exact = FALSE)
ks.test(x+2, "pgamma", 3, 2, alternative = "gr")

# test if x is stochastically larger than x2
x2 <- rnorm(50, -1)
plot(ecdf(x), xlim=range(c(x, x2)))
plot(ecdf(x2), add=TRUE, lty="dashed")
t.test(x, x2, alternative="g")
wilcox.test(x, x2, alternative="g")
ks.test(x, x2, alternative="l")

[Package *stats* version 2.5.0 Index]