|
The R package
R, closely related to the commercial package S-Plus, is the largest
and most comprehensive freely available statistical computing
environment. It provides a coherent, flexible programming environment
for data analysis, applied mathematics, statistical analysis, and
graphics. Unlike some menu-driven statistical packages, the user
interacts with R through a C-like command language with pop-up
graphical windows. The core R package is enhanced by several hundred
user-supplied add-on packages in the Comprehensive R Archive Network
(CRAN) and the Omegahat Project for Statistical Computing. Binary
executables and open source code for Linux, Windows and MacOS can be
downloaded for instant use. R has extensive documentation.
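As a brief illustration of the command language, the following sketch creates a vector, applies vectorized arithmetic, summarizes the data, and draws a histogram. The variable names and data values are invented for illustration only.

```r
# Hypothetical measurements; the values are made up for illustration
flux <- c(12.1, 9.8, 11.4, 10.7, 13.2, 10.1)

# Vectorized arithmetic: the operation applies to every element at once
log_flux <- log10(flux)

# Basic summary statistics
flux_mean <- mean(flux)
flux_sd   <- sd(flux)
print(summary(flux))

# In an interactive session this opens a pop-up graphics window;
# in a batch run the plot goes to the default graphics device instead
hist(flux, main = "Histogram of flux", xlab = "flux")
```

Commands like these can be typed one at a time at the R prompt or saved in a script file and run as a batch.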
Here we list some of its capabilities that may be of interest to the
physical scientist.
The base R package includes:
- arithmetic (scalar/vector/array)
- bootstrap resampling and confidence intervals (basic, ABC,
percentile, studentized, tilted, jackknife)
- correlation coefficients (Pearson, Kendall, Spearman)
- distributions (Gaussian, Poisson, and many other
statistical distributions and special functions, including random
deviates)
- empirical distribution tests (Anderson-Darling, Cramer-von Mises,
Kolmogorov-Smirnov) and quantiles
- exploratory data analysis
- generalized linear & generalized additive modelling
- graphics, publication-quality (scatter, dendrograms, lattice, etc.)
- integration and interpolation
- linear algebra and equation solutions (extensive methods)
- linear mixed-effects modelling
- linear modelling (including nonlinear functions), resistant
regression, robust M-estimators
- linear & quadratic programming (simplex, penalized
constraints)
- local and ridge regression (loess, variograms)
- maximum likelihood estimation (AIC, BIC)
- multivariate analysis (tabulations, ANOVA, discriminant, factor,
principal components, Mahalanobis distances, MANOVA)
- multivariate cluster analyses (agglomerative and divisive
clustering, dissimilarity matrix, fuzzy, k-nearest neighbor, k-means
& k-medoid partitioning, monothetic, recursive partitioning,
regression trees, self-organizing maps)
- neural networks (censored, least-squares, entropy, log-linear,
maximum likelihood, perceptron)
- nonlinear least-squares regression
- smoothing (cross-validation, histograms, kernel, local
regression, variogram)
- sorting
- spatial analysis & point processes (correlogram, kriging,
Moran's I, Geary's C, pattern analysis, polynomial surface, simulation,
variogram)
- splines (B-spline, periodic, polynomial)
- statistical tests, parametric & nonparametric (Ansari,
Bartlett, binomial, Box, F, Fisher, Fligner, Friedman, Mantel-Haenszel,
Mauchly, McNemar, Mood, proportions, Shapiro, t, Wilcoxon, signed
rank)
- survival analysis for censored data (Cox regression, Kaplan-Meier &
Fleming-Harrington survival curves, life table, linear regression,
ridge regression, tobit modelling, Weibull & other survival curve
fitting, k-sample tests)
- time series analysis (ARMA, acf, Box-Jenkins, FFT, Kalman
filter, lags, mixed-effects, prediction, smoothing, spectral
analysis)
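A few of the capabilities listed above can be sketched in a short session: a rank correlation, a linear model fit, a Kolmogorov-Smirnov test, and a percentile bootstrap interval. The data are simulated and the variable names are assumptions made for illustration. The bootstrap is written by hand with `sample` and `replicate` rather than with a dedicated bootstrap function.

```r
set.seed(1)                      # make the simulated data reproducible

# Simulated data (an assumption for illustration): a noisy linear relation
x <- runif(50, min = 0, max = 10)
y <- 2 + 0.5 * x + rnorm(50, sd = 1)

# Rank-based correlation coefficient (Kendall's tau)
tau <- cor(x, y, method = "kendall")

# Ordinary least-squares linear model
fit <- lm(y ~ x)
res <- residuals(fit)

# Kolmogorov-Smirnov test of the residuals against a Gaussian.
# The scale is estimated from the same data, so the p-value is
# only approximate.
ks <- ks.test(res, "pnorm", mean = 0, sd = sd(res))

# Percentile bootstrap interval for the slope
slopes <- replicate(500, {
  i <- sample(length(x), replace = TRUE)   # resample rows with replacement
  unname(coef(lm(y[i] ~ x[i]))[2])         # slope of the refitted model
})
ci <- quantile(slopes, c(0.025, 0.975))

print(coef(fit))
print(tau)
print(ks$p.value)
print(ci)
```

Each call returns an ordinary R object, so results such as `fit` or `ks` can be inspected, plotted, or passed to further functions.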
CRAN add-on packages (see Chapter 5 for brief individual descriptions)
treat:
- adaptive quadrature
- ARIMA modeling
- Bayesian computation (empirical Bayes, MCMC calculations &
diagnostics, survival regression, logit/probit, networks)
- Boolean hypotheses
- boosting
- bootstrap modelling
- classification and regression trees
- convex clustering & convex hulls
- conditional inference
- combinatorics
- elliptical confidence regions
- energy statistical tests
- extreme value distribution
- fixed point clusters
- genetic algorithms
- geostatistical modelling
- GUIs (Rcmdr, SciViews)
- heteroscedastic t-regression
- hidden Markov models
- hierarchical partitioning & clustering
- independent component analysis
- interpolation
- irregular time series
- kernel smoothing
- kernel-based machine learning
- k-nearest neighbor tree classifier
- Kolmogorov-Zurbenko adaptive filtering
- least-angle and lasso regression
- linear programming (simplex)
- likelihood ratios
- local regression density estimators
- logistic regression
- map projections
- Matlab emulator
- matrices, sparse matrices, tensor decomposition
- Markov chain Monte Carlo
- mixture models
- mixture discriminant analysis
- model-based clustering
- nonlinear least squares
- Markov multistate models
- mixture models & regression
- multidimensional analysis
- multimodality test
- multivariate time series
- multivariate Shapiro-Wilk test
- multivariate outlier detection
- multivariate normal partitioning
- multivariate normals with missing data
- neural networks
- non-linear time series analysis
- nonparametric multiple comparisons
- omnibus tests for normality
- orientation data, outlier detection
- parallel coordinates plots
- partial least squares
- periodic autoregression analysis
- Poisson-Gamma additive models
- polychoric and polyserial correlations
- principal component regression
- principal curve fits
- projection pursuit
- proportional hazards modelling
- quantile regression
- quasi-variances
- random fields
- random forest classification
- ridge regression
- robust regression
- Sampford sampling
- segmented regression break points
- self-organizing maps
- shape analysis
- space-time ecological data analysis
- spatial analysis and kriging
- spline fits & regressions (MARS, BRUTO)
- structural regression with splines
- tessellations & Delaunay triangulation
- three-dimensional visualization
- two-stage least squares regression
- unit root tests
- variogram diagnostics
- wavelet toolbox & denoising
- weighted likelihood robust inference
CRAN includes code and datasets associated with textbooks on:
- Bayesian statistics
- bootstrapping
- circular statistics
- contingency tables
- data analysis
- engineering statistics
- econometrics
- kernel smoothing
- generalized additive models
- image analysis
- linear regression
- relative distribution methods
- smoothing
- survival analysis (censored data)
- time-frequency analysis
Through base R, CRAN and the Omegahat Project, R interfaces to the
following languages, formats and protocols:
- Languages: BUGS, C, Fortran, Java, Python, Perl, XLisp
- Headers: XML
- I/O file structures: ASCII, binary, bitmapped images, ftp, gzip, MIM,
Oracle, SAS, S-Plus, SPSS, Systat, Stata, URL, .wav
- Web formats: cgi, HTML, Netscape, SOAP
- Statistics packages: GRASS, Matlab (emulator), XGobi
- Spreadsheets: Excel, Gnumeric
- Graphics: Grace, Gtk, OpenGL, Tcl/Tk
- Databases: MySQL, SQL, SQLite
- Science/math libraries: GSL, lsoda, LAPACK
- Parallel processing: PVM
- Text processors: LaTeX
- Network connections: sockets, DCOM, CORBA
|