Open Source for you

An Introduction to R


Welcome to a new series on ‘R, statistics and machine learning’. R is a programming language that was primarily designed for statistical computing and graphics. It is a multi-paradigm programming language that supports imperative, object-oriented, array and functional styles of programming. R is dynamically typed and is primarily written in C, Fortran and R itself.

R is an official GNU package and is released under the GNU General Public License (GPL), version 2 or 3. It was first released in 1993 and, at the time of writing, the latest stable release is 4.0.4. The official home page of the R project is https://www.r-project.org/. In this new series of articles, we will explore the syntax and semantics of R, and also the various libraries available for statistics, graphics and machine learning.

Installation

Parabola GNU/Linux-libre: You can install R on Parabola GNU/Linux-libre using the Pacman package manager, as shown below:

$ sudo pacman -S r

The latest version that gets installed is 4.0.4-1, as indicated below:

extra/r 4.0.4-1 [installed]
    Language and environment for statistical computing and graphics

Debian/Ubuntu: The ‘r-base’ package needs to be installed on Debian or Ubuntu to get R on your system:

$ sudo apt install r-base

Fedora: The latest R version can be installed on Fedora using:

$ sudo dnf install R

Mac OS X: The ‘R.app’ application can be installed from https://mac.r-project.org/ for Mac OS X. The website provides both the -devel and -stable releases for installation. Periodic nightly builds of the R releases are made available as a .pkg file. Please note that these releases for Mac OS X are still experimental in nature.

Windows: The ‘bin/windows/base’ directory on a CRAN mirror (see https://cran.r-project.org/mirrors.html for the list) provides the R-4.0.4-win.exe installer for R on Windows. If you would like to test the latest software, you can install the ‘r-patched’ or ‘r-devel’ snapshot releases as well. R is supported on Windows 7 or later, and the installation takes at least 150MB of disk space.

Emacs: As an Emacs user, you can install the ‘Emacs Speaks Statistics’ (ESS) package that provides support for working on R source files. The add-on includes syntax highlighting, code formatting, searching for documentation, displaying results, etc. The project website is available at https://ess.r-project.org/. With a Cask setup, you can simply add the following to your Cask file to install ESS:

(depends-on "ess")

You can also execute R code in an Emacs Org Babel code block. The following needs to be added to your Emacs configuration file:

(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t)
   (R . t)))

Consider the given code snippet in an Emacs Org file. When you use C-c C-c in the code block, it will execute the commands in an R environment and produce the result:

#+BEGIN_SRC R
sqrt(2)
#+END_SRC

#+RESULTS:
: 1.4142135623731

Usage

On Parabola GNU/Linux-libre, open a terminal and type ‘R’ at the shell prompt to invoke the R interpreter as shown below:

$ R

R version 4.0.4 (2021-02-15) -- "Lost Library Book"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type ‘license()’ or ‘licence()’ for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type ‘contributors()’ for more information and
‘citation()’ on how to cite R or R packages in publications.

Type ‘demo()’ for some demos, ‘help()’ for on-line help, or
‘help.start()’ for an HTML browser interface to help.
Type ‘q()’ to quit R.

>

You can type q() at the prompt to exit from the session. It will then ask you if you would like to save the workspace image and you can either press y or n.

> q()
Save workspace image? [y/n/c]: n
$

You can obtain the version of R that is installed from the terminal prompt using the R --version command, as shown below:

$ R --version
R version 4.0.4 (2021-02-15) -- "Lost Library Book"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

If you are at the R prompt, you can obtain the version information with the ‘version’ built-in as follows:

> version

               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          4
minor          0.4
year           2021
month          02
day            15
svn rev        80002
language       R
version.string R version 4.0.4 (2021-02-15)
nickname       Lost Library Book
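The same details can also be read from a script, which is handy when code needs a minimum R version. The following is a small sketch using the built-in R.version list, R.version.string and getRversion(); the variable names and the 3.5.0 threshold are illustrative assumptions, not something the article prescribes.

```r
# R.version is a list holding the same fields that 'version' prints.
major_minor <- paste(R.version$major, R.version$minor, sep = ".")
cat("Running R", major_minor, "\n")

# R.version.string is the ready-made one-line summary.
cat(R.version.string, "\n")

# getRversion() returns a comparable version object, so it can be
# checked against a minimum requirement (3.5.0 is an arbitrary example).
is_recent <- getRversion() >= "3.5.0"
cat("At least 3.5.0:", is_recent, "\n")
```

Comparing against getRversion() is usually safer than comparing the major and minor strings by hand, since those are stored as character values.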

There is also built-in help documentation that you can use with the ‘help’ function as mentioned below:

> help()
help                   package:utils                   R Documentation

Documentation

Description:

     ‘help’ is the primary interface to the help systems.

Usage:

     help(topic, package = NULL, lib.loc = NULL,
          verbose = getOption("verbose"),
          try.all.packages = getOption("help.try.all.packages"),
          help_type = getOption("help_type"))

Arguments:

   topic: usually, a name or character string specifying the topic for
          which help is sought. A character string (enclosed in
          explicit single or double quotes) is always taken as naming a
          topic.

          If the value of ‘topic’ is a length-one character vector the
          topic is taken to be the value of the only element.
          Otherwise ‘topic’ must be a name or a reserved word (if
          syntactically valid) or character string.

          See ‘Details’ for what happens if this is omitted.

...
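In practice you will usually pass a topic to help. The sketch below shows two common forms, a plain topic and a topic restricted to one package through the package argument listed in the usage above. The topics ‘mean’ and ‘sd’ are only examples chosen for illustration.

```r
# Look up a topic by name; a quoted string is always taken as a topic.
h_mean <- help("mean")

# Restrict the lookup to a single package ('stats' here).
h_sd <- help("sd", package = "stats")

# At the interactive prompt, ?mean is shorthand for help(mean).
# help() returns an object; printing it is what displays the page,
# so here only its class is shown.
print(class(h_mean))
print(class(h_sd))
```

Because the help page is displayed by the print method, assigning the result to a variable (as above) looks up the topic without opening a pager.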

You can search for specific help using the help.search function, as shown below:

> help.search("histogram")

Help files with alias or concept or title matching ‘histogram’ using fuzzy matching:

graphics::hist                    Histograms
graphics::hist.POSIXt             Histogram of a Date or Date-Time Object
graphics::plot.histogram          Plot Histograms
    Aliases: plot.histogram, lines.histogram
grDevices::nclass.Sturges         Compute the Number of Classes for a Histogram
KernSmooth::dpih                  Select a Histogram Bin Width
lattice::histogram                Histograms and Kernel Density Plots
    Aliases: histogram, histogram.factor, histogram.numeric, histogram.formula
lattice::panel.histogram          Default Panel Function for histogram
    Aliases: panel.histogram
lattice::prepanel.default.bwplot  Default Prepanel Functions
    Aliases: prepanel.default.histogram
MASS::hist.scott                  Plot a Histogram with Automatic Bin Width Selection
MASS::ldahist                     Histograms or Density Plots of Multiple Groups
MASS::truehist                    Plot a Histogram

Type ‘?PKG::FOO’ to inspect entries ‘PKG::FOO’, or ‘TYPE?PKG::FOO’ for entries like ‘PKG::FOO-TYPE’.
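The search can also be narrowed down. As a rough sketch, help.search accepts a package argument to limit the search to one package, and the related apropos function (from the ‘utils’ package, not covered above) matches object names on the search path against a regular expression instead of searching help text. The pattern and package used here are just examples.

```r
# Limit the help search to the 'graphics' package.
res <- help.search("histogram", package = "graphics")
print(class(res))

# apropos() looks at the names of visible objects rather than help
# pages; "^hist" matches names that start with "hist".
hist_names <- apropos("^hist")
print(hist_names)
```

At the interactive prompt, ??histogram is shorthand for help.search("histogram").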

The information on operators (arithmetic, for example) can be obtained with the question mark symbol followed by the operator, enclosed within backquotes as illustrated below:

> ?`%%`

Arithmetic                package:base                 R Documentation

Arithmetic Operators

Description:

     These unary and binary operators perform arithmetic on numeric or
     complex vectors (or objects which can be coerced to them).

Usage:

     + x
     - x
     x + y
     x - y
     x * y
     x / y
     x ^ y
     x %% y
     x %/% y

Arguments:

    x, y: numeric or complex vectors or objects which can be coerced to
          such, or other objects for which methods have been written.
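To make the two less familiar operators from the usage list concrete, here is a short sketch of %% (remainder) and %/% (integer division). The numbers are arbitrary examples; the variable names are only for illustration.

```r
# %% gives the remainder and %/% the integer quotient.
rem  <- 17 %% 5
quot <- 17 %/% 5

# Like the other arithmetic operators, both work element-wise on vectors.
vec_rem <- c(10, 11, 12) %% 3

# With a negative dividend, the result of %% takes the sign of the
# divisor, and %/% rounds towards minus infinity.
neg_rem  <- -7 %% 3
neg_quot <- -7 %/% 3

# The two are consistent: x == (x %/% y) * y + x %% y
rebuilt <- neg_quot * 3 + neg_rem

print(c(rem = rem, quot = quot))
print(vec_rem)
print(c(neg_rem = neg_rem, neg_quot = neg_quot, rebuilt = rebuilt))
```

The behaviour for negative operands differs from the remainder operator in C, which is worth keeping in mind when porting code.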

R’s standard packages, such as ‘base’, ‘graphics’ and ‘grDevices’, come with a number of demos that you can try out from the R console. You can list them using the demo function:

> demo()
Demos in package ‘base’:

error.catching   More examples on catching and handling errors
is.things        Explore some properties of R objects and is.FOO() functions. Not for newbies!
recursion        Using recursion for adaptive integration
scoping          An illustration of lexical scoping.

Demos in package ‘graphics’:

Hershey          Tables of the characters in the Hershey vector fonts
Japanese         Tables of the Japanese characters in the Hershey vector fonts
graphics         A show of some of R’s graphics capabilities
image            The image-like graphics builtins of R
persp            Extended persp() examples
plotmath         Examples of the use of mathematics annotation

Demos in package ‘grDevices’:

colors           A show of R’s predefined colors()
hclColors        Exploration of hcl() space

...

The following is an example of a rotated sinc function:

> demo(persp)

        demo(persp)
        ---- ~~~~~

Type <Return> to start :

> ### Demos for persp() plots -- things not in example(persp)
> ### ------------------------
>
> require(datasets)

> require(grDevices); require(graphics)

> ## (1) The Obligatory Mathematical surface.
> ##     Rotated sinc function.

It produces the graphical output shown in Figure 1. If you would like to see example code from R’s online documentation, you can use the ‘example’ function. For instance, different shades of blue can be seen from the colours example illustrated below:

> example(colors)

colors> cl <- colors()

colors> length(cl); cl[1:20]
[1] 657
 [1] "white"         "aliceblue"     "antiquewhite"  "antiquewhite1"
 [5] "antiquewhite2" "antiquewhite3" "antiquewhite4" "aquamarine"
 [9] "aquamarine1"   "aquamarine2"   "aquamarine3"   "aquamarine4"
[13] "azure"         "azure1"        "azure2"        "azure3"
[17] "azure4"        "beige"         "bisque"        "bisque1"

colors> length(cl. <- colors(TRUE))
[1] 502

colors> ## only 502 of the 657 named ones
colors>
colors> ## ----------- Show all named colors and more:
colors> demo("colors")

        demo(colors)
        ---- ~~~~~~

Type <Return> to start :
...
> plotCol(nearRcolor("deepskyblue", dist=50))
Hit <Return> to see next plot:
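If you only want the blue shades as text rather than as a plot, the colour names can be filtered directly. The following is a minimal sketch using colors() together with grep and col2rgb from the standard packages; the variable names are chosen for illustration.

```r
# All built-in colour names, and the subset without duplicates.
cl <- colors()
n_all <- length(cl)
n_distinct <- length(colors(distinct = TRUE))

# Keep only the names that contain "blue".
blues <- grep("blue", cl, value = TRUE)
print(head(blues))

# col2rgb() converts a colour name to its red, green and blue values.
dsb <- col2rgb("deepskyblue")
print(dsb)
```

The counts in n_all and n_distinct correspond to the 657 and 502 reported by the example output above.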

Figure 1: Sinc function
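A surface like the one in Figure 1 can be drawn with a few lines of your own. The sketch below is written in the spirit of the demo rather than copied from it: the grid size, the viewing angles, the colour and the helper names sinc and rotsinc are assumptions made for this illustration.

```r
# Sample the x-y plane on a regular grid.
x <- seq(-10, 10, length.out = 51)
y <- x

# sinc(r) = sin(r) / r, with the undefined value at r = 0 set to 1.
sinc <- function(r) {
  s <- sin(r) / r
  s[is.nan(s)] <- 1
  s
}

# Rotate the one-dimensional sinc around the z axis by evaluating it
# on the distance from the origin; the factor 10 scales the height.
rotsinc <- function(x, y) 10 * sinc(sqrt(x^2 + y^2))

# outer() evaluates rotsinc on every (x, y) pair of the grid.
z <- outer(x, y, rotsinc)

# Draw the surface as a perspective plot.
persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue",
      xlab = "X", ylab = "Y", zlab = "Z")
```

Changing theta and phi rotates the viewpoint, which is a quick way to explore the shape of the surface.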
