# RaMS and friends

The strength of RaMS is its simple data format. Table-like data structures are common in most programming languages, and they can always be converted to the nigh-universal matrix format. The goal of this vignette is to illustrate this strength by exporting MS data to several formats that can be used outside of R.

## Standard export to CSV

As with all rectangular data, RaMS objects can be easily exported to CSV files with base R functions. This works best with a few chromatograms at a time, as the millions of data points found in most MS files can overwhelm common file readers.

library(RaMS)

# Locate an MS file
single_file <- system.file("extdata", "LB12HL_AB.mzML.gz", package = "RaMS")

# Grab the MS data
msdata <- grabMSdata(single_file, grab_what = "everything")
##
## Reading file LB12HL_AB.mzML.gz... 0.06 secs
## Total time: 0.19 secs
# Write out MS1 data to .csv file
write.csv(x = msdata$MS1, file = "MS1_data.csv") # Clean up afterward file.remove("MS1_data.csv") ## [1] TRUE ## Fancier export to Excel Excel workbooks are a common format because of their intuitive GUI and widespread adoption. They can also encode more information than CSV files due to their multiple “sheets” within a single workbook - perfect for encoding both MS1 and MS2 information in one place. This vignette uses the openxlsx package, although there are several alternatives with identical functionality. library(openxlsx) # Locate an MS2 file MS2_file <- system.file("extdata", "DDApos_2.mzML.gz", package = "RaMS") # Grab the MS1 and MS2 data msdata <- grabMSdata(MS2_file, grab_what=c("MS1", "MS2")) ## ## Reading file DDApos_2.mzML.gz... 0.55 secs ## Reading MS1 data...0.19 secs ## Reading MS2 data...0.48 secs ## Total time: 1.21 secs # Write out MS data to Excel file # openxlsx writes each object in a list to a unique sheet # Produces one sheet for MS1 and one for MS2 write.xlsx(msdata, file = "MS2_data.xlsx") # Clean up afterward file.remove("MS2_data.xlsx") ## [1] TRUE ## Exporting to SQL database For more robust data processing and storage, or to work with larger-than-memory data sets, SQL databases are an excellent choice. This vignette will demo the RSQLite package’s engine, although several other database engines have similar functionality. library(DBI) # Create the sqlite database and connect to it MSdb <- dbConnect(RSQLite::SQLite(), "MSdata.sqlite") # Export MS1 and MS2 data to sqlite tables dbWriteTable(MSdb, "MS1", msdata$MS1)
dbWriteTable(MSdb, "MS2", msdata$MS2) dbListTables(MSdb) ## [1] "MS1" "MS2" # Perform a simple query to ensure data was exported correctly dbGetQuery(MSdb, 'SELECT * FROM MS1 LIMIT 3') ## rt mz int filename ## 1 4.00085 80.05009 12057.776 DDApos_2.mzML.gz ## 2 4.00085 80.26269 8178.767 DDApos_2.mzML.gz ## 3 4.00085 80.94841 19075.213 DDApos_2.mzML.gz # Perform EIC extraction in SQL rather than in R EIC_query <- 'SELECT * FROM MS1 WHERE mz BETWEEN :lower_bound AND :upper_bound' query_params <- list(lower_bound=118.086, upper_bound=118.087) EIC <- dbGetQuery(MSdb, EIC_query, params = query_params) # Disconnect after export dbDisconnect(MSdb) # Clean up afterward unlink("MSdata.sqlite") ## Interfacing with Python via reticulate R and Python are commonly used together, and the reticulate package makes this even easier by enabling a Python interpreter within R. RStudio, in which this vignette was written, supports both R and Python code chunks as shown below. ### R code chunk: {r} # Locate a couple MS files data_dir <- system.file("extdata", package = "RaMS") file_paths <- list.files(data_dir, pattern = "HL.*mzML", full.names = TRUE) msdata <- grabMSdata(files = file_paths, grab_what = "BPC")$BPC
##
|
|                                                                      |   0%
|
|=======================                                               |  33%
|
|===============================================                       |  67%
|
|======================================================================| 100%
## Total time: 0.25 secs

### Python code chunk: {python}

# Not run to pass R CMD check on GitHub
# Make sure python, matplotlib, and seaborn are installed

import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(data=r.msdata, kind="line", x="rt", y="int", hue="filename")
plt.show()