model.matrix {stats} | R Documentation |

`model.matrix` creates a design matrix.

model.matrix(object, ...)

## Default S3 method:
model.matrix(object, data = environment(object), contrasts.arg = NULL, xlev = NULL, ...)

`object` |
an object of an appropriate class. For the default method, a model formula or terms object. |

`data` |
a data frame created with `model.frame`. |

`contrasts.arg` |
a list, whose entries are contrasts suitable for
input to the `contrasts` replacement function and
whose names are the names of columns of `data` containing
`factor`s. |

`xlev` |
to be used as argument of `model.frame` if
`data` has no `"terms"` attribute. |

`...` |
further arguments passed to or from other methods. |

`model.matrix` creates a design matrix from the description given
in `terms(formula)`, using the data in `data`, which must
contain columns with the same names as would be created by a call to
`model.frame(formula)` or, more precisely, by evaluating
`attr(terms(formula), "variables")`. There may be other columns
and the order is not important.
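As an illustrative sketch (the data frame `df` and its column names are invented for this example), the following builds a design matrix once from a model frame and once from a plain data frame that carries an extra column in a different order. It relies on the behaviour noted under `xlev`, namely that `model.frame` is called when `data` has no `"terms"` attribute.

```r
# Hypothetical data: 'id' is an extra column that the formula never uses.
df <- data.frame(id = 1:4, x = c(1.5, 2.0, 3.5, 4.0), y = c(2, 4, 5, 8))

# Route 1: build the model frame explicitly, then the design matrix.
mf <- model.frame(y ~ x, df)
mm <- model.matrix(y ~ x, mf)

# Route 2: pass a plain data frame with the columns reordered.
df2 <- df[, c("y", "id", "x")]
mm2 <- model.matrix(y ~ x, df2)

print(colnames(mm))                  # intercept column plus 'x'
print(isTRUE(all.equal(mm, mm2)))    # both routes give the same matrix
```

The extra `id` column and the column order do not affect the result, because only the variables named in the formula are looked up.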

If `contrasts.arg` is specified for a factor it overrides the
default factor coding for that variable and any `"contrasts"`
attribute set by `C` or `contrasts`.
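A minimal sketch of this override, using a made-up one-factor data frame `dd`: the factor first receives a `"contrasts"` attribute through the `contrasts` replacement function, and `contrasts.arg` then replaces that coding for a single call.

```r
# Hypothetical one-factor data frame.
dd <- data.frame(a = gl(3, 2))

# Attach Helmert coding to the factor itself.
contrasts(dd$a) <- "contr.helmert"
m_attr <- model.matrix(~ a, dd)   # uses the attribute set above

# contrasts.arg takes precedence over the attribute for this call only.
m_arg <- model.matrix(~ a, dd, contrasts.arg = list(a = "contr.sum"))

print(m_attr[, "a1"])             # Helmert-coded column
print(m_arg[, "a1"])              # sum-to-zero coded column
print(attr(m_arg, "contrasts"))   # records the coding actually used
```

The factor stored in `dd` keeps its Helmert attribute; only the matrix produced by the second call uses `contr.sum`.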

In interactions, the variable whose levels vary fastest is the first
one to appear in the formula (and not in the term), so in
`~ a + b + b:a` the interaction will have `a` varying fastest.
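The ordering rule can be checked on a small balanced layout. This is a sketch under two assumptions: the data frame `dd` is invented for the example, and the session uses the default treatment contrasts for unordered factors.

```r
# Hypothetical balanced 3 x 3 layout.
dd <- data.frame(a = gl(3, 1, 9), b = gl(3, 3))

# 'a' appears first in the formula, so its levels vary fastest
# in the interaction columns even though the term is written b:a.
m_ab <- model.matrix(~ a + b + b:a, dd)
print(colnames(m_ab))

# Putting 'b' first in the formula reverses the roles.
m_ba <- model.matrix(~ b + a + b:a, dd)
print(colnames(m_ba))
```

In the first matrix the interaction columns step through the levels of `a` within each level of `b`; in the second they step through the levels of `b` within each level of `a`.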

By convention, if the response variable also appears on the right-hand side of the formula it is dropped (with a warning), although interactions involving the term are retained.

The value is the design matrix for a regression model with the
specified formula and data.

There is an attribute `"assign"`, an integer vector with an entry
for each column in the matrix giving the term in the formula which
gave rise to the column.
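For illustration, the sketch below (with an invented data frame `dd`) reads the `"assign"` attribute and maps each column back to a term label. The value 0 marks the intercept column; the lookup vector `term_names` is a helper introduced only for this example.

```r
# Hypothetical data: one three-level factor and one numeric covariate.
dd <- data.frame(a = gl(3, 2), x = c(0.5, 1.2, 2.3, 3.1, 4.8, 5.0))
mm <- model.matrix(~ a + x, dd)

asgn <- attr(mm, "assign")
print(asgn)   # one entry per column of mm

# Map each column to the term that produced it (0 = intercept).
term_names <- c("(Intercept)", attr(terms(~ a + x), "term.labels"))
col_terms <- term_names[asgn + 1]
print(data.frame(column = colnames(mm), term = col_terms))
```

The factor `a` contributes two columns under contrast coding, so its term index appears twice.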

If there are any factors in terms in the model, there is an attribute
`"contrasts"`, a named list with an entry for each factor. This
specifies the contrasts that would be used in terms in which the
factor is coded by contrasts (in some terms dummy coding may be used),
either as a character vector naming a function or as a numeric matrix.
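Both forms of entry can be seen in one call. In this sketch, which assumes the default `options("contrasts")` setting and reuses the balanced two-way layout `dd` from the examples, factor `a` is left at its default coding while `b` is given an explicit numeric contrast matrix.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))   # balanced 2-way

# 'a' keeps the session default; 'b' gets a numeric contrast matrix.
mm <- model.matrix(~ a + b, dd, contrasts.arg = list(b = contr.sum(4)))

cons <- attr(mm, "contrasts")
print(names(cons))   # one entry per factor
print(cons$a)        # character: name of the contrast function
print(cons$b)        # numeric matrix with one row per level of 'b'
```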

Chambers, J. M. (1992)
*Data for models.*
Chapter 3 of *Statistical Models in S*
eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

`model.frame`, `model.extract`, `terms`
ff <- log(Volume) ~ log(Height) + log(Girth)
str(m <- model.frame(ff, trees))
mat <- model.matrix(ff, m)

dd <- data.frame(a = gl(3,4), b = gl(4,1,12)) # balanced 2-way
options("contrasts")
model.matrix(~ a + b, dd)
# contrasts.arg is spelled out in full rather than relying on partial matching
model.matrix(~ a + b, dd, contrasts.arg = list(a = "contr.sum"))
model.matrix(~ a + b, dd, contrasts.arg = list(a = "contr.sum", b = "contr.poly"))
m.orth <- model.matrix(~ a + b, dd, contrasts.arg = list(a = "contr.helmert"))
crossprod(m.orth) # m.orth is ALMOST orthogonal

[Package *stats* version 2.1.0 Index]