histogram {lattice} R Documentation

## Histograms and Kernel Density Plots

### Description

Draw Histograms and Kernel Density Plots, possibly conditioned on other variables.

### Usage

```r
histogram(formula,
          data,
          type = c("percent", "count", "density"),
          nint = if (is.factor(x)) length(levels(x))
                 else round(log2(length(x)) + 1),
          endpoints = range(x[!na.x]),
          breaks = if (is.factor(x)) seq(0.5, length = length(levels(x)) + 1)
                   else do.breaks(endpoints, nint),
          equal.widths = FALSE,
          ...)
densityplot(formula, data, n = 50, plot.points = TRUE, ref = FALSE,
            ...)
do.breaks(endpoints, nint)
```

### Arguments

`formula`: A formula of the form `~ x | g1 * g2 * ...` indicates that histograms or kernel density estimates of `x` should be produced conditioned on the levels of the (optional) variables `g1, g2, ...`. When the conditioning variables `g1, g2, ...` are missing, the leading `~` can be dropped. `x` can be numeric (or a factor for `histogram`), and each of `g1, g2, ...` must be either a factor or a shingle. As a special case, the right-hand side of the formula (`x`) can contain more than one variable separated by a `+` sign. What happens in this case is described in detail in the documentation for `xyplot`.

`data`: Optional data frame in which variables are to be evaluated.

`type`: Character string indicating the type of histogram to be drawn. `"percent"` and `"count"` give relative frequency and frequency histograms, and can be misleading when breakpoints are not equally spaced. `"density"` produces a density scale histogram. `type` defaults to `"percent"`, except when the breakpoints are unequally spaced or `breaks = NULL`, when it defaults to `"density"`.

`nint`: Number of bins. Applies only when `breaks` is unspecified in the call.

`endpoints`: Vector of length 2 indicating the range of x-values to be covered by the histogram. Again, applies only when `breaks` is unspecified. In `do.breaks`, this specifies the interval that is to be divided up.

`breaks`: Numeric vector of length (number of bins + 1) defining the breakpoints of the bins. Note that when breakpoints are not equally spaced, the only value of `type` that makes sense is `"density"`. Usually all panels use the same breakpoints. This can be changed by specifying `breaks = NULL`, which lets each panel choose its own breakpoints. The choice of these breakpoints is made as follows: the number of bins is calculated by the formula for `nint` above, and then breakpoints are chosen according to the value of `equal.widths`.

`equal.widths`: Logical, relevant only when `breaks = NULL`. If `TRUE`, equally spaced bins will be selected; otherwise, approximately equal area bins will be selected (so the breakpoints will not be equally spaced).

`n`: Number of points at which the density is to be evaluated.

`plot.points`: Logical specifying whether the `x` values should be plotted along the `y = 0` line.

`ref`: Logical specifying whether a reference x-axis should be drawn.

`...`: Other arguments, passed along to the panel function. In the case of `densityplot`, if the default panel function is used, then arguments appropriate to `density` can be included. These control the details of how the kernel density estimates are calculated. See the documentation for `density` for details.
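The interaction between `breaks`, `type`, and `equal.widths` can be illustrated with a minimal sketch. It assumes the `singer` data set that ships with lattice; the particular breakpoints and the value `nint = 10` are arbitrary choices made for illustration and do not come from this page.

```r
library(lattice)

# Unequally spaced breakpoints (illustrative values): request the density
# scale explicitly, since "percent" and "count" can mislead here.
h_unequal <- histogram(~ height, data = singer,
                       breaks = c(59, 62, 64, 66, 68, 70, 77),
                       type = "density")

# breaks = NULL lets each panel choose its own breakpoints;
# equal.widths = TRUE asks for equally spaced bins within each panel.
h_free <- histogram(~ height | voice.part, data = singer,
                    breaks = NULL, equal.widths = TRUE, nint = 10)
```

Both calls only construct the objects; printing `h_unequal` or `h_free` draws them.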

### Details

`histogram` draws conditional histograms, while `densityplot` draws conditional kernel density plots. The density estimate in `densityplot` is calculated by the function `density`, and any argument that `density` accepts can be passed (as `...`) in the call to `densityplot` to control the output. See the documentation of `density` for details. (Note: `densityplot` changes the default value of the argument `n` of `density` to 50.)
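As a rough sketch of passing `density` arguments through `densityplot`, the call below sets a bandwidth, a kernel, and the number of evaluation points. The values `bw = 2`, `kernel = "epanechnikov"`, and `n = 100` are illustrative assumptions, not recommendations from this page; the direct call to `density` is included only to show the computation being controlled.

```r
library(lattice)
require(stats)

# Arguments such as bw and kernel are forwarded to density().
d <- densityplot(~ height | voice.part, data = singer,
                 bw = 2, kernel = "epanechnikov", n = 100,
                 plot.points = FALSE,
                 xlab = "Height (inches)")

# The same settings applied directly to the pooled heights:
# the estimate is evaluated at n = 100 points.
est <- density(singer$height, bw = 2, kernel = "epanechnikov", n = 100)
length(est$x)
```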

These and all other high level Trellis functions have several arguments in common. These are extensively documented only in the help page for `xyplot`, which should be consulted to learn more detailed usage.

`do.breaks` is a utility function that calculates breakpoints given an interval and the number of pieces to break it into.
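A short sketch of `do.breaks`, first on a toy interval and then with the default `nint` formula from the Usage section applied to the `singer` heights. The variable names `toy`, `x`, `nint`, `endpoints`, and `brk` are introduced here for illustration only.

```r
library(lattice)

# Split the interval [0, 10] into 5 equal pieces.
toy <- do.breaks(c(0, 10), 5)
toy  # 0 2 4 6 8 10

# Reproduce the default breakpoints for a numeric variable:
# nint from the log2 rule, endpoints from the data range.
x <- singer$height
nint <- round(log2(length(x)) + 1)
endpoints <- range(x[!is.na(x)])
brk <- do.breaks(endpoints, nint)
length(brk) == nint + 1  # one more breakpoint than bins
```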

### Value

An object of class `"trellis"`. The `update` method can be used to update components of the object, and the `print` method (usually called by default) will plot it on an appropriate plotting device.
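A minimal sketch of working with the returned object: store it, modify it with `update`, and draw it with `print`. Writing to a temporary PDF file is an assumption made here so the sketch also runs non-interactively; at the console, typing the object's name is usually enough to plot it.

```r
library(lattice)

# The call returns an object; nothing is drawn until it is printed.
h <- histogram(~ height | voice.part, data = singer)
class(h)

# update() changes components of the stored object.
h2 <- update(h, xlab = "Height (inches)", layout = c(2, 4))

# print() draws the object on the current device (a temporary PDF here).
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
print(h2)
dev.off()
```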

### Note

The form of the arguments accepted by the default panel function `panel.histogram` is different from that in S-PLUS. Whereas S-PLUS calculates the heights inside `histogram` and passes only the breakpoints and the heights to the panel function, here the original variable `x` is passed along with the breakpoints. This allows plots as in the second example below.

### Author(s)

Deepayan Sarkar Deepayan.Sarkar@R-project.org

### See Also

`xyplot`, `panel.histogram`, `density`, `panel.densityplot`, `panel.mathdensity`, `Lattice`

### Examples

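```r
library(lattice)  # provides histogram(), densityplot() and the singer data
require(stats)

# Conditional histograms with 17 bins covering 59.5 to 76.5 inches
histogram( ~ height | voice.part, data = singer, nint = 17,
          endpoints = c(59.5, 76.5), layout = c(2,4), aspect = 1,
          xlab = "Height (inches)")

# Density scale histograms; the custom panel function receives the raw x
# values and overlays a normal density with each panel's mean and sd
histogram( ~ height | voice.part, data = singer,
          xlab = "Height (inches)", type = "density",
          panel = function(x, ...) {
              panel.histogram(x, ...)
              panel.mathdensity(dmath = dnorm, col = "black",
                                args = list(mean=mean(x),sd=sd(x)))
          } )

# Conditional kernel density plots; bw is passed on to density()
densityplot( ~ height | voice.part, data = singer, layout = c(2, 4),
            xlab = "Height (inches)", bw = 5)
```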

[Package lattice version 0.11-6 Index]