histogram {lattice}                R Documentation

Histograms and Kernel Density Plots

Description

Draw Histograms and Kernel Density Plots, possibly conditioned on other variables.


Usage

          type = c("percent", "count", "density"),
          nint = if(is.factor(x)) length(levels(x))
                 else round(log2(length(x))+1),
          endpoints = range(x[!na.x]),
          breaks = if(is.factor(x)) seq(0.5, length = length(levels(x))
          + 1) else do.breaks(endpoints, nint),
          equal.widths = FALSE, 
densityplot(formula, data, n = 50, plot.points = TRUE, ref = FALSE,
do.breaks(endpoints, nint)


Arguments

formula: A formula of the form ~ x | g1 * g2 * ... indicates that histograms or kernel density estimates of x should be produced conditioned on the levels of the (optional) variables g1, g2, .... When the conditioning variables g1, g2, ... are missing, the leading ~ can be dropped. x can be numeric (or a factor for histogram), and each of g1, g2, ... must be either a factor or a shingle. As a special case, the right-hand side of the formula (x) can contain more than one variable separated by a '+' sign. What happens in this case is described in detail in the documentation for xyplot.

data: Optional data frame in which variables are to be evaluated.

type: Character string indicating the type of histogram to be drawn. "percent" and "count" give relative frequency and frequency histograms, and can be misleading when breakpoints are not equally spaced. "density" produces a density scale histogram. type defaults to "percent", except when the breakpoints are unequally spaced or breaks = NULL, when it defaults to "density".

nint: Number of bins. Applies only when breaks is unspecified in the call.

endpoints: Vector of length 2 indicating the range of x-values to be covered by the histogram. Again, applies only when breaks is unspecified. In do.breaks, this specifies the interval that is to be divided up.

breaks: Numeric vector of length (number of bins + 1) defining the breakpoints of the bins. Note that when breakpoints are not equally spaced, the only value of type that makes sense is "density". Usually all panels use the same breakpoints. This can be changed by specifying breaks = NULL, which lets each panel choose its own breakpoints. The choice is made as follows: the number of bins is calculated by the formula for nint above, and then breakpoints are chosen according to the value of equal.widths.

equal.widths: Logical, relevant only when breaks = NULL. If TRUE, equally spaced bins will be selected; otherwise, approximately equal area bins will be selected (which means that the breakpoints will not be equally spaced).

n: Number of points at which the density is to be evaluated.

plot.points: Logical specifying whether the x values should be plotted along the y = 0 line.

ref: Logical specifying whether a reference x-axis should be drawn.

...: Other arguments, passed along to the panel function. In the case of densityplot, if the default panel function is used, then arguments appropriate to density can be included. These control the details of how the kernel density estimates are calculated. See the documentation for density for details.
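
As a rough illustration of the breaks and type arguments described above, the following sketch supplies unequally spaced breakpoints and asks for a density scale. The breakpoint values are hypothetical and chosen only to cover the heights in the singer data; the sketch assumes an installed lattice in which histogram accepts breaks and type as documented here.

```r
library(lattice)  # provides histogram() and the singer data

# Hypothetical, unequally spaced breakpoints (not taken from this page)
brk <- c(59.5, 62.5, 64.5, 66.5, 68.5, 71.5, 76.5)

# With unequal bin widths, a density scale is the only meaningful choice
p <- histogram(~ height | voice.part, data = singer,
               breaks = brk, type = "density",
               xlab = "Height (inches)")

# Printing the object draws it on the current device
if (interactive()) print(p)
```

Storing the result in p rather than printing it directly keeps the construction of the plot separate from drawing it.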


Details

histogram draws conditional histograms, while densityplot draws conditional kernel density plots. The density estimate in densityplot is calculated by the function density, and all arguments accepted by density can be passed (as ...) in the call to densityplot to control the output. See the documentation of density for details. (Note: the default value of the argument n of density is changed to 50.)
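
The pass-through of density arguments can be sketched as follows. The bandwidth value bw = 2 is an arbitrary choice for illustration, and the direct call to density on all heights is shown only for comparison; within densityplot the estimate is computed separately on the data of each panel.

```r
library(lattice)  # provides densityplot() and the singer data

# density() called directly on all heights, evaluated at n = 50 points
d <- density(singer$height, bw = 2, n = 50)
length(d$x)  # number of evaluation points, equal to n

# The same arguments passed through densityplot (bw = 2 is illustrative)
p <- densityplot(~ height | voice.part, data = singer,
                 layout = c(2, 4), xlab = "Height (inches)",
                 bw = 2, n = 50)

# Printing the object draws it on the current device
if (interactive()) print(p)
```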

These and all other high-level Trellis functions have several arguments in common. These are documented extensively only in the help page for xyplot, which should be consulted for details of their usage.

do.breaks is a utility function that calculates breakpoints given an interval and the number of pieces to break it into.
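
A minimal sketch of do.breaks in use, assuming an installed lattice. The first call reuses the endpoints and nint from the first example on this page; the second applies the default nint formula from the usage section to the singer heights.

```r
library(lattice)  # provides do.breaks() and the singer data

# Split the interval [59.5, 76.5] into 17 pieces
brk <- do.breaks(c(59.5, 76.5), 17)
length(brk)  # nint + 1 breakpoints
diff(brk)    # widths of the resulting bins

# Default number of bins for a numeric x, then the matching breakpoints
x <- singer$height
nint <- round(log2(length(x)) + 1)
brk_default <- do.breaks(range(x), nint)
```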


Value

An object of class "trellis". The update method can be used to update components of the object, and the print method (usually called by default) will plot it on an appropriate plotting device.
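
The following sketch shows one way the returned object might be stored, modified with update, and then printed. The variable names p and p2 are illustrative only.

```r
library(lattice)  # provides histogram() and the singer data

# Build the plot object without drawing it
p <- histogram(~ height | voice.part, data = singer)
class(p)

# Change components of the existing object instead of rebuilding it
p2 <- update(p, xlab = "Height (inches)", layout = c(2, 4))

# Printing draws the plot; at the top level this happens automatically
if (interactive()) print(p2)
```

Inside functions or loops the object is not printed automatically, so an explicit call to print is needed there.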


Note

The form of the arguments accepted by the default panel function panel.histogram is different from that in S-PLUS. Whereas S-PLUS calculates the heights inside histogram and passes only the breakpoints and the heights to the panel function, here the original variable x is passed along with the breakpoints. This allows plots as in the second example below.


Author(s)

Deepayan Sarkar Deepayan.Sarkar@R-project.org

See Also

xyplot, panel.histogram, density, panel.densityplot, panel.mathdensity, Lattice


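Examples

library(lattice)  # provides the singer data used below

# Conditional histograms with 17 equal-width bins between 59.5 and 76.5
histogram( ~ height | voice.part, data = singer, nint = 17,
          endpoints = c(59.5, 76.5), layout = c(2,4), aspect = 1,
          xlab = "Height (inches)")

# Density scale histograms with a normal density overlaid in each panel,
# using the mean and sd of the x values passed to the panel function
histogram( ~ height | voice.part, data = singer,
          xlab = "Height (inches)", type = "density",
          panel = function(x, ...) {
              panel.histogram(x, ...)
              panel.mathdensity(dmath = dnorm, col = "black",
                                args = list(mean=mean(x),sd=sd(x)))
          } )

# Conditional kernel density plots; bw is passed along to density
densityplot( ~ height | voice.part, data = singer, layout = c(2, 4),  
            xlab = "Height (inches)", bw = 5)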

[Package lattice version 0.11-6]