Programming/R

Revision as of 15:13, 22 July 2020

Programming Details

R is a programming language and software environment for statistical analysis, graphical representation and reporting.

R is freely available under the GNU General Public License. On Viper (Linux) it is loaded as a module; it is also available for Windows (PC) and Mac.

  • Versions 3.3.0, 3.4.1 and 3.5.1 are listed for Viper; the interactive session shown further down this page uses 4.0.2.

R-Studio is also provided as an interactive session on Viper.


Programming example


x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y~x)

print(relation)
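
Printing the model shows only the fitted coefficients. As a short sketch of what can be done next with the same data, the lines below extract the coefficients, print a fuller summary and predict y for a new x value. The value 170 is an illustrative assumption and is not taken from this page; only base R functions are used.

```r
# Same data as the example above.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Fit the linear model y = a + b * x.
relation <- lm(y ~ x)

# Named numeric vector with the intercept and the slope for x.
coefs <- coef(relation)
print(coefs)

# Residuals, standard errors and R-squared.
print(summary(relation))

# Predict y for a new x value (170 is an illustrative value only).
new_point <- data.frame(x = 170)
predicted <- predict(relation, newdata = new_point)
print(predicted)
```

The newdata argument must be a data frame whose column name matches the predictor used in the formula, here x.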


Modules Available

The following modules are used on this page:

  • module add R/3.4.1
  • module add R/4.0.2

Batch Mode

A typical SLURM batch file looks like the one below; in this case it is named 'rscript.job'.


#!/bin/bash
#SBATCH -J compute-single-node
#SBATCH -N 1
#SBATCH --ntasks-per-node 1
#SBATCH -D /home/pysdlb/CODE_SAMPLES/R
#SBATCH -o %N.%j.%a.out
#SBATCH -e %N.%j.%a.err
#SBATCH -p compute

echo $SLURM_JOB_NODELIST

module purge
module add R/3.4.1

export I_MPI_DEBUG=5
export I_MPI_FABRICS=shm:tmi
export I_MPI_FALLBACK=no

Rscript r.rsc

To run this on the cluster, submit the batch file with the command: sbatch rscript.job
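
The batch file runs an R script called r.rsc, whose contents are not shown on this page. The following is a minimal sketch of what such a script might contain, assuming it reuses the regression from the programming example. The output file names regression.pdf and coefficients.csv are hypothetical. Results are written to files because a batch job has no display attached.

```r
# r.rsc: hypothetical contents, reusing the data from the programming example.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

relation <- lm(y ~ x)

# Anything printed here ends up in the job's .out file.
print(summary(relation))

# Save the coefficients as a CSV file in the job's working directory.
coefs <- coef(relation)
write.csv(data.frame(term = names(coefs), estimate = unname(coefs)),
          "coefficients.csv", row.names = FALSE)

# Draw the scatter plot and fitted line into a PDF instead of a screen device.
pdf("regression.pdf")
plot(x, y, main = "y against x")
abline(relation)
invisible(dev.off())
```

Both files appear in the directory given by the -D option of the batch file, since that is the job's working directory.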


Interactive Mode

R programs are interpreted by the R runtime, so no compilation step is needed. The session below starts an interactive job on a compute node, loads the R module and starts R.


[pysdlb@login01 ~]$ interactive
salloc: Granted job allocation 3039458
Job ID 3039458 connecting to c120, please wait...
[pysdlb@c120 ~]$ module load R/4.0.2
[pysdlb@c120 ~]$ R

R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> 

Libraries

To list the installed libraries, type the following command within R: library()


Packages in library '/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library':

AnnotationDbi           Manipulation of SQLite-based annotations in
                        Bioconductor
BH                      Boost C++ Header Files
Biobase                 Biobase: Base functions for Bioconductor
BiocFileCache           Manage Files Across Sessions
BiocGenerics            S4 generic functions used in Bioconductor
BiocManager             Access the Bioconductor Project Package
                        Repository
BiocParallel            Bioconductor facilities for parallel evaluation
BiocStyle               Standard styles for vignettes and other
                        Bioconductor documents
BiocVersion             Set the appropriate version of Bioconductor
                        packages
Biostrings              Efficient manipulation of biological strings
DBI                     R Database Interface
DelayedArray            A unified framework for working transparently
                        with on-disk and in-memory array-like datasets
GenomeInfoDb            Utilities for manipulating chromosome names,
                        including modifying them to follow a particular
                        naming style
GenomeInfoDbData        Species and taxonomy ID look up tables used by
                        GenomeInfoDb
GenomicAlignments       Representation and manipulation of short
                        genomic alignments
GenomicFeatures         Conveniently import and query gene models
GenomicRanges           Representation and manipulation of genomic
                        intervals
IRanges                 Foundation of integer range manipulation in
                        Bioconductor
KEGG.db                 A set of annotation maps for KEGG
KernSmooth              Functions for Kernel Smoothing Supporting Wand
                        & Jones (1995)
MASS                    Support Functions and Datasets for Venables and
                        Ripley's MASS
Matrix                  Sparse and Dense Matrix Classes and Methods
R6                      Encapsulated Classes with Reference Semantics
RColorBrewer            ColorBrewer Palettes
RCurl                   General Network (HTTP/FTP/...) Client Interface
                        for R
RNAseqData.HNRNPC.bam.chr14
                        Aligned reads from RNAseq experiment:
                        Transcription profiling by high throughput
                        sequencing of HNRNPC knockdown and control HeLa
                        cells
RSQLite                 'SQLite' Interface for R
RUnit                   R Unit Test Framework
Rcpp                    Seamless R and C++ Integration
Rhtslib                 HTSlib high-throughput sequencing library as an
                        R package
Rsamtools               Binary alignment (BAM), FASTA, variant call
                        (BCF), and tabix file import
S4Vectors               Foundation of vector-like and list-like
                        containers in Bioconductor
SummarizedExperiment    SummarizedExperiment container
XML                     Tools for Parsing and Generating XML Within R
                        and S-Plus
XVector                 Foundation of external vector representation
                        and manipulation in Bioconductor
askpass                 Safe Password Entry for R, Git, and SSH
assertthat              Easy Pre and Post Assertions
:
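
Called without arguments, library() only lists packages. To use one of them, pass its name. The sketch below uses MASS, which appears in the listing above; Matrix is used as an example of an optional dependency and is assumed to be installed, as it is in the listing.

```r
# Directories that R searches for packages, in order.
print(.libPaths())

# Attach a package so its functions can be called directly.
library(MASS)
print(packageVersion("MASS"))

# For an optional package, test for it first rather than failing outright.
if (requireNamespace("Matrix", quietly = TRUE)) {
  message("Matrix is available")
} else {
  message("Matrix is not installed")
}
```

library() stops with an error if the package is missing, whereas requireNamespace() returns FALSE, which makes it the safer choice inside scripts submitted as batch jobs.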


Installing Additional Libraries

If you need an additional library installed that would be used by more than just yourself, contact the Viper team on help@hull.ac.uk with the details of the library and the version of R you require it for.

Alternatively, you can install an R library into your own personal library. Within R, run the install.packages command with the name of the package you wish to install.


> install.packages("R.matlab")
Warning in install.packages("R.matlab") :
  'lib = "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"' is not writable
Would you like to use a personal library instead?  (y/n) y
Would you like to create a personal library
~/R/x86_64-pc-linux-gnu-library/3.3
to install packages into?  (y/n) y
--- Please select a CRAN mirror for use in this session ---
HTTPS CRAN mirror

 1: 0-Cloud [https]                 2: Algeria [https]
 3: Australia (Canberra) [https]    4: Australia (Melbourne) [https]
 5: Australia (Perth) [https]       6: Austria [https]
 7: Belgium (Ghent) [https]         8: Brazil (RJ) [https]
 9: Brazil (SP 1) [https]          10: Bulgaria [https]
11: Canada (MB) [https]            12: Chile 1 [https]
13: Chile 2 [https]                14: China (Beijing) [https]
15: China (Hefei) [https]          16: China (Lanzhou) [https]
17: Colombia (Cali) [https]        18: Czech Republic [https]
19: Denmark [https]                20: France (Lyon 1) [https]
21: France (Lyon 2) [https]        22: France (Marseille) [https]
23: France (Montpellier) [https]   24: France (Paris 2) [https]
25: Germany (Münster) [https]      26: Iceland [https]
27: India [https]                  28: Indonesia (Jakarta) [https]
29: Ireland [https]                30: Italy (Padua) [https]
31: Japan (Tokyo) [https]          32: Malaysia [https]
33: Mexico (Mexico City) [https]   34: New Zealand [https]
35: Norway [https]                 36: Philippines [https]
37: Russia (Moscow) [https]        38: Serbia [https]
39: Spain (A Coruña) [https]       40: Spain (Madrid) [https]
41: Sweden [https]                 42: Switzerland [https]
43: Taiwan (Chungli) [https]       44: Turkey (Denizli) [https]
45: Turkey (Mersin) [https]        46: UK (Bristol) [https]
47: UK (Cambridge) [https]         48: UK (London 1) [https]
49: USA (CA 1) [https]             50: USA (IA) [https]
51: USA (IN) [https]               52: USA (KS) [https]
53: USA (MI 1) [https]             54: USA (OR) [https]
55: USA (TN) [https]               56: USA (TX 1) [https]
57: USA (TX 2) [https]             58: (HTTP mirrors)


Selection: 48
also installing the dependencies ‘R.methodsS3’, ‘R.oo’, ‘R.utils’

trying URL 'https://cran.ma.imperial.ac.uk/src/contrib/R.methodsS3_1.7.1.tar.gz'
Content type 'unknown' length 25731 bytes (25 KB)
==================================================
downloaded 25 KB

trying URL 'https://cran.ma.imperial.ac.uk/src/contrib/R.oo_1.21.0.tar.gz'
Content type 'unknown' length 403410 bytes (393 KB)
==================================================
downloaded 393 KB

trying URL 'https://cran.ma.imperial.ac.uk/src/contrib/R.utils_2.5.0.tar.gz'
Content type 'unknown' length 389402 bytes (380 KB)
==================================================
downloaded 380 KB

trying URL 'https://cran.ma.imperial.ac.uk/src/contrib/R.matlab_3.6.1.tar.gz'
Content type 'unknown' length 109818 bytes (107 KB)
==================================================
downloaded 107 KB

* installing *source* package ‘R.methodsS3’ ...
** package ‘R.methodsS3’ successfully unpacked and MD5 sums checked
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (R.methodsS3)
* installing *source* package ‘R.oo’ ...
** package ‘R.oo’ successfully unpacked and MD5 sums checked
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (R.oo)
* installing *source* package ‘R.utils’ ...
** package ‘R.utils’ successfully unpacked and MD5 sums checked
** R
** inst
** preparing package for lazy loading
Warning in setGenericS3.default(name, export = exportGeneric, envir = envir,  :
  Renamed the preexisting function warnings to warnings.default, which was defined in environment base.
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (R.utils)
* installing *source* package ‘R.matlab’ ...
** package ‘R.matlab’ successfully unpacked and MD5 sums checked
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (R.matlab)

The downloaded source packages are in
        ‘/tmp/RtmpO7iqto/downloaded_packages’
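
The prompts in the session above can be avoided, for example in a batch job, by naming the library directory and the CRAN mirror explicitly. This is a sketch under assumptions: it takes the personal library location from the R_LIBS_USER environment variable, and the mirror https://cloud.r-project.org is chosen for illustration only. The helper names personal_library and install_personal are hypothetical.

```r
# Return the personal library directory, creating it if necessary.
personal_library <- function() {
  lib <- path.expand(Sys.getenv("R_LIBS_USER"))
  if (!nzchar(lib)) stop("R_LIBS_USER is not set")
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib, .libPaths()))  # search the personal library first
  lib
}

# Install a package into the personal library without interactive prompts.
install_personal <- function(pkg, repos = "https://cloud.r-project.org") {
  install.packages(pkg, lib = personal_library(), repos = repos)
}

# Uncomment to install; this needs network access from the node it runs on.
# install_personal("R.matlab")
```

The personal library directory includes the R major and minor version, so packages installed under one R module are not automatically visible after loading a module with a different minor version.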


Listing packages installed

If you need a package that will be used frequently, email help@hull.ac.uk with your request.

To view the packages already installed, run the following within R:

> installed.packages()

                 Package
abind            "abind"
acepack          "acepack"
ADGofTest        "ADGofTest"
ape              "ape"
assertthat       "assertthat"
backports        "backports"
base             "base"
base64enc        "base64enc"
BH               "BH"
bindr            "bindr"
bindrcpp         "bindrcpp"
Biobase          "Biobase"
BiocGenerics     "BiocGenerics"
BiocInstaller    "BiocInstaller"
bit              "bit"
bit64            "bit64"
....
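
The full listing is long. As a sketch, the matrix returned by installed.packages() can be narrowed to a few columns or queried for a single package. The helper name is_installed is hypothetical.

```r
# installed.packages() returns a character matrix with one row per package.
pkgs <- installed.packages()

# Show only the name, version and library directory of the first few packages.
print(head(pkgs[, c("Package", "Version", "LibPath")]))

# Check whether a given package is installed (row names are package names).
is_installed <- function(name) {
  name %in% rownames(installed.packages())
}

print(is_installed("base"))
print(is_installed("R.matlab"))
```

The LibPath column shows whether a package comes from the central library or from your personal library.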


Further Information
