Difference between revisions of "Programming/R"
Revision as of 09:18, 24 May 2021
Programming Details
R is a programming language and software environment for statistical analysis, graphical representation, and reporting.
R is freely available under the GNU General Public License. It runs on Linux, Windows (PC), and Mac; on Viper (Linux) it is loaded as a module.
- Viper provides R versions 3.5.1 and 4.0.2. (Earlier versions will be retired.)
R-Studio is also provided as an interactive session on Viper.
Programming example
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y ~ x)

print(relation)
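As a minimal sketch of how the fitted model might be used afterwards, the code below refits the same data, extracts the intercept and slope with coef(), and predicts y for a new x. The new x value of 170 is an illustrative assumption and is not part of the original example.

```r
# Same data as the example above.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Fit the linear model y = intercept + slope * x.
relation <- lm(y ~ x)

# Named numeric vector with elements "(Intercept)" and "x".
coefs <- coef(relation)
print(coefs)

# Predict y for a new observation (170 is an assumed, illustrative value).
new_point <- data.frame(x = 170)
predicted <- predict(relation, newdata = new_point)
print(predicted)
```

The column name in the newdata data frame has to match the predictor name used in the model formula (here x).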
Modules Available
The following modules are available:
- module add R/3.4.1
Batch Mode
A typical SLURM batch file looks like the one below; in this case it is named 'rscript.job'.
#!/bin/bash
#SBATCH -J compute-single-node
#SBATCH -N 1
#SBATCH --ntasks-per-node 1
#SBATCH -D /home/pysdlb/CODE_SAMPLES/R
#SBATCH -o %N.%j.%a.out
#SBATCH -e %N.%j.%a.err
#SBATCH -p compute
#SBATCH --mail-user= your email address here

echo $SLURM_JOB_NODELIST

module purge
module add R/3.4.1

export I_MPI_DEBUG=5
export I_MPI_FABRICS=shm:tmi
export I_MPI_FALLBACK=no

Rscript r.rsc
To run this on the cluster, submit the batch file with the command: sbatch rscript.job
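The batch file runs an R script called r.rsc, whose contents are not shown on this page. The following is only a sketch of what such a script could contain, assuming it repeats the linear model from the programming example. The output file name and the fallback value "local" are hypothetical; SLURM_JOB_ID is the environment variable SLURM sets inside a job.

```r
# Hypothetical contents for r.rsc; adapt to your own analysis.

# SLURM sets SLURM_JOB_ID inside a job; fall back to "local" when run by hand.
job_id <- Sys.getenv("SLURM_JOB_ID", unset = "local")

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# Anything printed goes to the file named by the #SBATCH -o line.
print(summary(relation))

# Save the coefficients to a CSV file (file name is an assumption).
out_file <- paste0("lm_coefficients_", job_id, ".csv")
coefs <- coef(relation)
write.csv(data.frame(term = names(coefs), estimate = unname(coefs)),
          out_file, row.names = FALSE)
cat("Wrote", out_file, "\n")
```

Because the batch file sets the working directory with #SBATCH -D, a relative file name such as the one above would be written into that directory.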
Interactive Mode
R code is interpreted by the R runtime, so there is no compilation step.
[pysdlb@login01 ~]$ interactive
salloc: Granted job allocation 3039458
Job ID 3039458 connecting to c120, please wait...
[pysdlb@c120 ~]$ module load R/4.0.2
[pysdlb@c120 ~]$ R

R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

>
Libraries
To list the libraries that are included, type the following command within R: library()
Packages in library '/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library':

AnnotationDbi                 Manipulation of SQLite-based annotations in Bioconductor
BH                            Boost C++ Header Files
Biobase                       Biobase: Base functions for Bioconductor
BiocFileCache                 Manage Files Across Sessions
BiocGenerics                  S4 generic functions used in Bioconductor
BiocManager                   Access the Bioconductor Project Package Repository
BiocParallel                  Bioconductor facilities for parallel evaluation
BiocStyle                     Standard styles for vignettes and other Bioconductor documents
BiocVersion                   Set the appropriate version of Bioconductor packages
Biostrings                    Efficient manipulation of biological strings
DBI                           R Database Interface
DelayedArray                  A unified framework for working transparently with on-disk and in-memory array-like datasets
GenomeInfoDb                  Utilities for manipulating chromosome names, including modifying them to follow a particular naming style
GenomeInfoDbData              Species and taxonomy ID lookup tables used by GenomeInfoDb
GenomicAlignments             Representation and manipulation of short genomic alignments
GenomicFeatures               Conveniently import and query gene models
GenomicRanges                 Representation and manipulation of genomic intervals
IRanges                       Foundation of integer range manipulation in Bioconductor
KEGG.db                       A set of annotation maps for KEGG
KernSmooth                    Functions for Kernel Smoothing Supporting Wand & Jones (1995)
MASS                          Support Functions and Datasets for Venables and Ripley's MASS
Matrix                        Sparse and Dense Matrix Classes and Methods
R6                            Encapsulated Classes with Reference Semantics
RColorBrewer                  ColorBrewer Palettes
RCurl                         General Network (HTTP/FTP/...) Client Interface for R
RNAseqData.HNRNPC.bam.chr14   Aligned reads from RNAseq experiment: Transcription profiling by high throughput sequencing of HNRNPC knockdown and control HeLa cells
RSQLite                       'SQLite' Interface for R
RUnit                         R Unit Test Framework
Rcpp                          Seamless R and C++ Integration
Rhtslib                       HTSlib high-throughput sequencing library as an R package
Rsamtools                     Binary alignment (BAM), FASTA, variant calling (BCF), and tabix file import
S4Vectors                     Foundation of vector-like and list-like containers in Bioconductor
SummarizedExperiment          SummarizedExperiment container
XML                           Tools for Parsing and Generating XML Within R and S-Plus
XVector                       Foundation of external vector representation and manipulation in Bioconductor
askpass                       Safe Password Entry for R, Git, and SSH
assertthat                    Easy Pre and Post Assertions
:
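The listing above is shown through a pager. As a sketch of an alternative, the same information can be captured as a data frame, and a listed package can then be loaded. Choosing MASS here is an assumption for illustration; any package from the listing could be substituted.

```r
# library() with no arguments returns an object whose $results element is a
# character matrix with the columns "Package", "LibPath" and "Title".
listing <- library()$results

titles <- data.frame(Package = listing[, "Package"],
                     Title = listing[, "Title"],
                     stringsAsFactors = FALSE)
print(head(titles))

# Load a package only if it is actually available (MASS is an example choice).
if (requireNamespace("MASS", quietly = TRUE)) {
  library(MASS)
} else {
  message("MASS is not installed in any library on the search path")
}
```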
Installing Additional Libraries
If you need an additional library installed that would be used by more than just yourself, contact the Viper team on Topdesk with the details of the library and the version of R you require it for.
Alternatively, you can install an R library into your own personal library. Within R, run the install.packages command with the name of the package you wish to install.
> install.packages("R.matlab")
Warning in install.packages("R.matlab") :
  'lib = "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"' is not writable
Would you like to use a personal library instead? (y/n) y
Would you like to create a personal library
~/R/x86_64-pc-linux-gnu-library/3.3
to install packages into? (y/n) y
--- Please select a CRAN mirror for use in this session ---
HTTPS CRAN mirror

 1: 0-Cloud [https]
 2: Algeria [https]
 3: Australia (Canberra) [https]
 4: Australia (Melbourne) [https]
 5: Australia (Perth) [https]
 6: Austria [https]
 7: Belgium (Ghent) [https]
 8: Brazil (RJ) [https]
 9: Brazil (SP 1) [https]
10: Bulgaria [https]
11: Canada (MB) [https]
12: Chile 1 [https]
13: Chile 2 [https]
14: China (Beijing) [https]
15: China (Hefei) [https]
16: China (Lanzhou) [https]
17: Colombia (Cali) [https]
18: Czech Republic [https]
19: Denmark [https]
20: France (Lyon 1) [https]
21: France (Lyon 2) [https]
22: France (Marseille) [https]
23: France (Montpellier) [https]
24: France (Paris 2) [https]
25: Germany (Münster) [https]
26: Iceland [https]
27: India [https]
28: Indonesia (Jakarta) [https]
29: Ireland [https]
30: Italy (Padua) [https]
31: Japan (Tokyo) [https]
32: Malaysia [https]
33: Mexico (Mexico City) [https]
34: New Zealand [https]
35: Norway [https]
36: Philippines [https]
37: Russia (Moscow) [https]
38: Serbia [https]
39: Spain (A Coruña) [https]
40: Spain (Madrid) [https]
41: Sweden [https]
42: Switzerland [https]
43: Taiwan (Chungli) [https]
44: Turkey (Denizli) [https]
45: Turkey (Mersin) [https]
46: UK (Bristol) [https]
47: UK (Cambridge) [https]
48: UK (London 1) [https]
49: USA (CA 1) [https]
50: USA (IA) [https]
51: USA (IN) [https]
52: USA (KS) [https]
53: USA (MI 1) [https]
54: USA (OR) [https]
55: USA (TN) [https]
56: USA (TX 1) [https]
57: USA (TX 2) [https]
58: (HTTP mirrors)

Selection: 48
also installing the dependencies ‘R.methodsS3’, ‘R.oo’, ‘R.utils’

trying URL 'https://cran.ma.imperial.ac.uk/src/contrib/R.methodsS3_1.7.1.tar.gz'
Content type 'unknown' length 25731 bytes (25 KB)
==================================================
downloaded 25 KB

trying URL 'https://cran.ma.imperial.ac.uk/src/contrib/R.oo_1.21.0.tar.gz'
Content type 'unknown' length 403410 bytes (393 KB)
==================================================
downloaded 393 KB

trying URL 'https://cran.ma.imperial.ac.uk/src/contrib/R.utils_2.5.0.tar.gz'
Content type 'unknown' length 389402 bytes (380 KB)
==================================================
downloaded 380 KB

trying URL 'https://cran.ma.imperial.ac.uk/src/contrib/R.matlab_3.6.1.tar.gz'
Content type 'unknown' length 109818 bytes (107 KB)
==================================================
downloaded 107 KB

* installing *source* package ‘R.methodsS3’ ...
** package ‘R.methodsS3’ successfully unpacked and MD5 sums checked
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (R.methodsS3)
* installing *source* package ‘R.oo’ ...
** package ‘R.oo’ successfully unpacked and MD5 sums checked
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (R.oo)
* installing *source* package ‘R.utils’ ...
** package ‘R.utils’ successfully unpacked and MD5 sums checked
** R
** inst
** preparing package for lazy loading
Warning in setGenericS3.default(name, export = exportGeneric, envir = envir, :
  Renamed the preexisting function warnings to warnings.default, which was defined in environment base.
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (R.utils)
* installing *source* package ‘R.matlab’ ...
** package ‘R.matlab’ successfully unpacked and MD5 sums checked
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (R.matlab)

The downloaded source packages are in
        ‘/tmp/RtmpO7iqto/downloaded_packages’
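The interactive prompts above cannot be answered inside a batch job. As a hedged sketch, the same installation could be scripted by passing the library directory and the CRAN mirror explicitly. The helper name install_if_missing and the choice of https://cloud.r-project.org as the mirror are assumptions for illustration, not something this page prescribes.

```r
# Hypothetical helper: install `pkg` into a personal library if it is missing.
# By default `lib` is taken from R_LIBS_USER, which R sets at startup.
install_if_missing <- function(pkg,
                               lib = path.expand(Sys.getenv("R_LIBS_USER")),
                               repos = "https://cloud.r-project.org") {
  # Create the personal library directory if it does not exist yet.
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)

  # Put the personal library first on the library search path.
  .libPaths(c(lib, .libPaths()))

  # Only download and install when the package cannot already be found.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, lib = lib, repos = repos)
  }
  invisible(lib)
}

# Example call, commented out so that nothing is downloaded by accident:
# install_if_missing("R.matlab")
```

Supplying repos avoids the mirror selection menu, and supplying lib avoids the two y/n questions about the personal library.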
Listing packages installed
If you need a package that will be used frequently, raise a request on Topdesk.
To view the packages already installed, run the following within R:
> installed.packages()
AnnotationDbi               "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
BH                          "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
Biobase                     "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
BiocFileCache               "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
BiocGenerics                "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
BiocManager                 "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
BiocParallel                "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
BiocStyle                   "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
BiocVersion                 "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
Biostrings                  "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
DBI                         "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
DelayedArray                "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
GenomeInfoDb                "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
GenomeInfoDbData            "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
GenomicAlignments           "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
GenomicFeatures             "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
GenomicRanges               "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
IRanges                     "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
KEGG.db                     "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
KernSmooth                  "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
MASS                        "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
Matrix                      "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
R6                          "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
RColorBrewer                "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
RCurl                       "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
RNAseqData.HNRNPC.bam.chr14 "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
RSQLite                     "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
RUnit                       "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
Rcpp                        "/trinity/clustervision/CentOS/7/apps/R/4.0.2/lib64/R/library"
....
Further Information
- https://en.wikipedia.org/wiki/R_(programming_language)
- https://www.udemy.com/r-basics/
- https://www.r-project.org/about.html