Commit d83ad069 authored by michele

clean version for Bioc 2.13

parent 6f613afd
Package: RMassBank Package: RMassBank
Type: Package Type: Package
Title: Workflow to process tandem MS files and build MassBank records Title: Workflow to process tandem MS files and build MassBank records
Version: 1.3.1 Version: 1.3.2
Authors@R: c( Authors@R: c(
person(given = "RMassBank at Eawag", email = "massbank@eawag.ch", person(given = "RMassBank at Eawag", email = "massbank@eawag.ch",
role=c("cre")), role=c("cre")),
...@@ -13,6 +13,9 @@ Authors@R: c( ...@@ -13,6 +13,9 @@ Authors@R: c(
"aut", email = "erik.mueller@student.uni-halle.de"), person(given = "aut", email = "erik.mueller@student.uni-halle.de"), person(given =
"Tobias", family = "Schulze", role = "ctb", email = "Tobias", family = "Schulze", role = "ctb", email =
"tobias.schulze@ufz.de") ) "tobias.schulze@ufz.de") )
Author: Michael Stravs, Emma Schymanski, Steffen Neumann, Erik Mueller, with
contributions from Tobias Schulze
Maintainer: RMassBank at Eawag <massbank@eawag.ch>
Description: Workflow to process tandem MS files and build MassBank records. Description: Workflow to process tandem MS files and build MassBank records.
Functions include automated extraction of tandem MS spectra, formula Functions include automated extraction of tandem MS spectra, formula
assignment to tandem MS fragments, recalibration of tandem MS spectra with assignment to tandem MS fragments, recalibration of tandem MS spectra with
...@@ -22,11 +25,11 @@ License: Artistic-2.0 ...@@ -22,11 +25,11 @@ License: Artistic-2.0
SystemRequirements: OpenBabel SystemRequirements: OpenBabel
biocViews: Bioinformatics, MassSpectrometry, Metabolomics, Software biocViews: Bioinformatics, MassSpectrometry, Metabolomics, Software
Depends: Depends:
rcdk,yaml,mzR,methods mzR,rcdk,yaml,methods
Imports: Imports:
XML,RCurl,rjson XML,RCurl,rjson
Suggests: Suggests:
gplots,RMassBankData (>= 0.99.3), gplots,RMassBankData,
xcms (>= 1.37.1), xcms (>= 1.37.1),
CAMERA, CAMERA,
ontoCAT, ontoCAT,
......
export(CTS.externalIdSubset)
export(CTS.externalIdTypes)
export(RmbDefaultSettings)
export(RmbSettingsTemplate)
export(add.formula) export(add.formula)
export(addMB) export(addMB)
export(addPeaks) export(addPeaks)
...@@ -12,8 +16,6 @@ export(cleanElnoise) ...@@ -12,8 +16,6 @@ export(cleanElnoise)
export(combineMultiplicities) export(combineMultiplicities)
export(compileRecord) export(compileRecord)
export(createMolfile) export(createMolfile)
export(CTS.externalIdSubset)
export(CTS.externalIdTypes)
export(dbe) export(dbe)
export(deprofile) export(deprofile)
export(deprofile.fwhm) export(deprofile.fwhm)
...@@ -31,7 +33,6 @@ export(findMass) ...@@ -31,7 +33,6 @@ export(findMass)
export(findMsMsHR) export(findMsMsHR)
export(findMsMsHR.direct) export(findMsMsHR.direct)
export(findMsMsHR.mass) export(findMsMsHR.mass)
export(findMsMsHRperxcms.direct)
export(findMz) export(findMz)
export(findMz.formula) export(findMz.formula)
export(findName) export(findName)
...@@ -83,19 +84,19 @@ export(recalibrateSingleSpec) ...@@ -83,19 +84,19 @@ export(recalibrateSingleSpec)
export(recalibrateSpectra) export(recalibrateSpectra)
export(resetInfolists) export(resetInfolists)
export(resetList) export(resetList)
export(RmbDefaultSettings)
export(RmbSettingsTemplate)
export(smiles2mass) export(smiles2mass)
export(to.limits.rcdk) export(to.limits.rcdk)
export(toMassbank) export(toMassbank)
export(toRMB)
export(validate) export(validate)
exportClasses(mbWorkspace) exportClasses(mbWorkspace)
exportClasses(msmsWorkspace) exportClasses(msmsWorkspace)
exportMethods(show) exportMethods(show)
import(mzR)
import(RCurl) import(RCurl)
import(rjson)
import(XML) import(XML)
import(methods)
import(mzR)
import(rcdk)
import(rjson)
import(yaml)
importClassesFrom(mzR) importClassesFrom(mzR)
importMethodsFrom(mzR) importMethodsFrom(mzR)
#' @import methods
NULL
#' Workspace for \code{msmsWorkflow} data #' Workspace for \code{msmsWorkflow} data
#' #'
......
#' @import rcdk
NULL
......
...@@ -40,7 +40,7 @@ archiveResults <- function(w, fileName, settings = getOption("RMassBank")) ...@@ -40,7 +40,7 @@ archiveResults <- function(w, fileName, settings = getOption("RMassBank"))
#' workflow. #' workflow.
#' #'
#' @usage msmsWorkflow(w, mode="pH", steps=c(1:8), confirmMode = FALSE, newRecalibration = TRUE, #' @usage msmsWorkflow(w, mode="pH", steps=c(1:8), confirmMode = FALSE, newRecalibration = TRUE,
#' useRtLimit = TRUE, archivename=NA, readMethod = "mzR", findPeaksArgs = NA, plots = FALSE, #' useRtLimit = TRUE, archivename=NA, readMethod = "mzR",
#' precursorscan.cf = FALSE, #' precursorscan.cf = FALSE,
#' settings = getOption("RMassBank"), analyzeMethod = "formula", #' settings = getOption("RMassBank"), analyzeMethod = "formula",
#' progressbar = "progressBarHook") #' progressbar = "progressBarHook")
...@@ -56,15 +56,11 @@ archiveResults <- function(w, fileName, settings = getOption("RMassBank")) ...@@ -56,15 +56,11 @@ archiveResults <- function(w, fileName, settings = getOption("RMassBank"))
#' to reuse the currently stored curve (\code{FALSE}, useful e.g. for adduct-processing runs.) #' to reuse the currently stored curve (\code{FALSE}, useful e.g. for adduct-processing runs.)
#' @param useRtLimit Whether to enforce the given retention time window. #' @param useRtLimit Whether to enforce the given retention time window.
#' @param archivename The prefix under which to store the analyzed result files. #' @param archivename The prefix under which to store the analyzed result files.
#' @param readMethod Several methods are available to get peak lists from the files. #' @param readMethod Several methods are available to get peak lists from the input files.
#' Currently supported are "mzR", "xcms", "MassBank" and "peaklist". #' Currently supported are "mzR" and "peaklist".
#' The first two read MS/MS raw data, and differ in the strategy #' "mzR" reads MS/MS raw data. An alternative raw data reader "xcms" is not yet released.
#' used to extract peaks. MassBank will read existing records, #' "peaklist" reads a CSV with two columns and the column header "mz", "int".
#' so that e.g. a recalibration can be performed, and "peaklist" #' @param precursorscan.cf Whether to fill precursor scan entries (cf = carry forward). To be used with files which for
#' just requires a CSV with two columns and the column header "mz", "int".
#' @param findPeaksArgs A list of arguments that will be handed to the xcms-method findPeaks via do.call
#' @param plots A parameter that determines whether the spectra should be plotted or not (This parameter is only used for the xcms-method)
#' @param precursorscan.cf Whether to fill precursor scans. To be used with files which for
#' some reasons do not contain precursor scan IDs in the mzML, e.g. AB Sciex converted #' some reasons do not contain precursor scan IDs in the mzML, e.g. AB Sciex converted
#' files. #' files.
#' @param settings Options to be used for processing. Defaults to the options loaded via #' @param settings Options to be used for processing. Defaults to the options loaded via
...@@ -77,7 +73,7 @@ archiveResults <- function(w, fileName, settings = getOption("RMassBank")) ...@@ -77,7 +73,7 @@ archiveResults <- function(w, fileName, settings = getOption("RMassBank"))
#' @author Michael Stravs, Eawag <michael.stravs@@eawag.ch> #' @author Michael Stravs, Eawag <michael.stravs@@eawag.ch>
#' @export #' @export
msmsWorkflow <- function(w, mode="pH", steps=c(1:8), confirmMode = FALSE, newRecalibration = TRUE, msmsWorkflow <- function(w, mode="pH", steps=c(1:8), confirmMode = FALSE, newRecalibration = TRUE,
useRtLimit = TRUE, archivename=NA, readMethod = "mzR", findPeaksArgs = NA, plots = FALSE, useRtLimit = TRUE, archivename=NA, readMethod = "mzR",
precursorscan.cf = FALSE, precursorscan.cf = FALSE,
settings = getOption("RMassBank"), analyzeMethod = "formula", settings = getOption("RMassBank"), analyzeMethod = "formula",
progressbar = "progressBarHook") progressbar = "progressBarHook")
...@@ -124,28 +120,28 @@ msmsWorkflow <- function(w, mode="pH", steps=c(1:8), confirmMode = FALSE, newRec ...@@ -124,28 +120,28 @@ msmsWorkflow <- function(w, mode="pH", steps=c(1:8), confirmMode = FALSE, newRec
do.call(progressbar, list(object=pb, close=TRUE)) do.call(progressbar, list(object=pb, close=TRUE))
} }
if(readMethod == "xcms"){ # if(readMethod == "xcms"){
splitfn <- strsplit(w@files,'_') # splitfn <- strsplit(w@files,'_')
cpdIDs <- sapply(splitfn, function(splitted){as.numeric(return(splitted[length(splitted)-1]))}) # cpdIDs <- sapply(splitfn, function(splitted){as.numeric(return(splitted[length(splitted)-1]))})
files <- list() # files <- list()
wfiles <- vector() # wfiles <- vector()
for(i in 1:length(unique(cpdIDs))) { # for(i in 1:length(unique(cpdIDs))) {
indices <- sapply(splitfn,function(a){return(unique(cpdIDs)[i] %in% a)}) # indices <- sapply(splitfn,function(a){return(unique(cpdIDs)[i] %in% a)})
files[[i]] <- w@files[indices] # files[[i]] <- w@files[indices]
} # }
#
w@files <- sapply(files,function(files){return(files[1])}) # w@files <- sapply(files,function(files){return(files[1])})
#
for(i in 1:length(unique(cpdIDs))){ # for(i in 1:length(unique(cpdIDs))){
specs <- list() # specs <- list()
for(j in 1:length(files[[i]])){ # for(j in 1:length(files[[i]])){
specs[[j]] <- findMsMsHRperxcms.direct(files[[i]][j], unique(cpdIDs)[i], mode=mode, findPeaksArgs=findPeaksArgs, plots) # specs[[j]] <- findMsMsHRperxcms.direct(files[[i]][j], unique(cpdIDs)[i], mode=mode, findPeaksArgs=findPeaksArgs, plots)
} # }
w@specs[[i]] <- toRMB(unlist(specs, recursive = FALSE), unique(cpdIDs)[i], mode=mode) # w@specs[[i]] <- toRMB(unlist(specs, recursive = FALSE), unique(cpdIDs)[i], mode=mode)
} # }
names(w@specs) <- basename(as.character(w@files)) # names(w@specs) <- basename(as.character(w@files))
} # }
#
##if(readMethod == "MassBank"){ ##if(readMethod == "MassBank"){
## for(i in 1:length(w@files)){ ## for(i in 1:length(w@files)){
## w <- addMB(w, w@files[i], mode) ## w <- addMB(w, w@files[i], mode)
......
...@@ -32,7 +32,7 @@ NULL ...@@ -32,7 +32,7 @@ NULL
#' mzCoarse = getOption("RMassBank")$findMsMsRawSettings$mzCoarse, #' mzCoarse = getOption("RMassBank")$findMsMsRawSettings$mzCoarse,
#' fillPrecursorScan = getOption("RMassBank")$findMsMsRawSettings$fillPrecursorScan, #' fillPrecursorScan = getOption("RMassBank")$findMsMsRawSettings$fillPrecursorScan,
#' rtMargin = getOption("RMassBank")$rtMargin, #' rtMargin = getOption("RMassBank")$rtMargin,
#' deprofile = getOption("RMassBank")$deprofile) #' deprofile = getOption("RMassBank")$deprofile, headerCache = NA)
#' #'
#' @aliases findMsMsHR.mass findMsMsHR.direct findMsMsHR #' @aliases findMsMsHR.mass findMsMsHR.direct findMsMsHR
#' @param fileName The file to open and search the MS2 spectrum in. #' @param fileName The file to open and search the MS2 spectrum in.
...@@ -260,110 +260,6 @@ findMsMsHR.direct <- function(msRaw, cpdID, mode = "pH", confirmMode = 0, useRtL ...@@ -260,110 +260,6 @@ findMsMsHR.direct <- function(msRaw, cpdID, mode = "pH", confirmMode = 0, useRtL
return(sp) return(sp)
} }
#' Read in mz-files using XCMS
#'
#' Picks peaks from mz-files and returns the pseudospectra that CAMERA creates with the help of XCMS
#'
#' @usage findMsMsHRperxcms.direct(fileName, cpdID, mode="pH", findPeaksArgs = NULL, plots = FALSE)
#' @param fileName The path to the mz-file that should be read
#' @param cpdID The compoundID of the compound that has been used for the file
#' @param mode The ionization mode that has been used for the spectrum represented by the peaklist
#' @param findPeaksArgs A list of arguments that will be handed to the xcms-method findPeaks via do.call
#' @param plots A parameter that determines whether the spectra should be plotted or not
#' @return The \code{msmsWorkspace} with the additional peaklist added to the right spectrum
#' @seealso \code{\link{msmsWorkflow}}
#' @author Erik Mueller
#' @examples \dontrun{
#' fileList <- list.files(system.file("XCMSinput", package = "RMassBank"), "Glucolesquerellin", full.names=TRUE)[3]
#' loadList(system.file("XCMSinput/compoundList.csv",package="RMassBank"))
#' psp <- findMsMsHRperxcms.direct(fileList,2184)
#' }
#' @export
findMsMsHRperxcms.direct <- function(fileName, cpdID, mode="pH", findPeaksArgs = NULL, plots = FALSE) {
require(CAMERA)
require(xcms)
parentMass <- findMz(cpdID)$mzCenter
RT <- findRt(cpdID)$RT * 60
mzabs <- 0.1
getRT <- function(xa) {
rt <- sapply(xa@pspectra, function(x) {median(peaks(xa@xcmsSet)[x, "rt"])})
}
##
## MS
##
##
## MSMS
##
xrmsms <- xcmsRaw(fileName, includeMSn=TRUE)
print("File read")
## Where is the wanted isolation ?
precursorrange <- range(which(xrmsms@msnPrecursorMz == parentMass)) ## TODO: add ppm one day
## Fake MS1 from MSn scans
## xrmsmsAsMs <- msn2xcmsRaw(xrmsms)
xrs <- split(msn2xcmsRaw(xrmsms), f=xrmsms@msnCollisionEnergy)
## Fake s simplistic xcmsSet
setReplicate <- xcmsSet(files=fileName, method="MS1")
xsmsms <- as.list(replicate(length(xrs),setReplicate))
candidates <- list()
anmsms <- list()
psp <- list()
spectra <- list()
whichmissing <- vector()
for(i in 1:length(xrs)){
peaks(xsmsms[[i]]) <- do.call(findPeaks,c(findPeaksArgs, object = xrs[[i]]))
if (nrow(peaks(xsmsms[[i]])) == 0) {
spectra[[i]] <- matrix(0,2,7)
next
}
## Get pspec
pl <- peaks(xsmsms[[i]])[,c("mz", "rt"), drop=FALSE]
## Best: find precursor peak
candidates[[i]] <- which( pl[,"mz", drop=FALSE] < parentMass + mzabs & pl[,"mz", drop=FALSE] > parentMass - mzabs
& pl[,"rt", drop=FALSE] < RT * 1.1 & pl[,"rt", drop=FALSE] > RT * 0.9 )
print(paste("Candidates:",candidates[[i]]))
anmsms[[i]] <- xsAnnotate(xsmsms[[i]])
anmsms[[i]] <- groupFWHM(anmsms[[i]])
if(length(candidates[[i]]) > 0)
closestCandidate <- which.min (abs( RT - pl[candidates[[i]], "rt", drop=FALSE] ) )
else(closestCandidate <- 1)
## Now find the pspec for compound
psp[[i]] <- which(sapply(anmsms[[i]]@pspectra, function(x) {candidates[[i]][closestCandidate] %in% x}))
print(paste("Pseudospectra:",psp[[i]]))
## 2nd best: Spectrum closest to MS1
##psp <- which.min( abs(getRT(anmsms) - actualRT))
## 3rd Best: find pspec closest to RT from spreadsheet
##psp <- which.min( abs(abs(getRT(anmsms) - RT) )
if((plots == TRUE) && (length(psp[[i]]) > 0)){
plotPsSpectrum(anmsms[[i]], psp[[i]], log=TRUE, mzrange=c(0, findMz(cpdID)[[3]]), maxlabel=10)
}
if(length(psp[[i]]) != 0){
spectra[[i]] <- getpspectra(anmsms[[i]], psp[[i]])
} else {whichmissing <- c(whichmissing,i)}
}
if(length(spectra) != 0){
for(i in whichmissing){
spectra[[i]] <- matrix(0,2,7)
}
}
return(spectra)
}
# Finds the EIC for a mass trace with a window of x ppm. # Finds the EIC for a mass trace with a window of x ppm.
# (For ppm = 10, this is +5 / -5 ppm from the non-recalibrated mz.) # (For ppm = 10, this is +5 / -5 ppm from the non-recalibrated mz.)
#' Extract EICs #' Extract EICs
...@@ -408,120 +304,6 @@ findEIC <- function(msRaw, mz, limit = NULL, rtLimit = NA) ...@@ -408,120 +304,6 @@ findEIC <- function(msRaw, mz, limit = NULL, rtLimit = NA)
return(data.frame(rt = rt, intensity=pks_t, scan=scan)) return(data.frame(rt = rt, intensity=pks_t, scan=scan))
} }
#' Conversion of XCMS-pseudospectra into RMassBank-spectra
#'
#' Converts a pseudospectrum extracted from XCMS using CAMERA into the msmsWorkspace(at)specs-format that RMassBank uses
#'
#' @usage toRMB(msmsXCMSspecs, cpdID, mode, MS1spec)
#' @param msmsXCMSspecs The compoundID of the compound that has been used for the peaklist
#' @param cpdID The compound ID of the substance of the given spectrum
#' @param mode The ionization mode that has been used for the spectrum
#' @param MS1spec The MS1-spectrum from XCMS, which can be optionally supplied
#' @return One list element of the (at)specs-entry from an msmsWorkspace
#' @seealso \code{\link{msmsWorkspace-class}}
#' @author Erik Mueller
#' @examples \dontrun{
#' XCMSpspectra <- findmsmsHRperxcms.direct("Glucolesquerellin_2184_1.mzdata", 2184)
#' wspecs <- toRMB(XCMSpspectra)
#' }
#' @export
toRMB <- function(msmsXCMSspecs = NA, cpdID = NA, mode="pH", MS1spec = NA){
ret <- list()
ret$mz <- findMz(cpdID,mode=mode)
ret$id <- cpdID
ret$formula <- findFormula(cpdID)
print(paste("Length of msmsXCMSspecs:",length(msmsXCMSspecs)))
if(length(msmsXCMSspecs) == 0){
ret$foundOK <- FALSE
print("blabla")
return(ret)
}
if(is.na(msmsXCMSspecs)){
stop("You need a readable spectrum!")
}
if(is.na(cpdID)){
stop("Please supply the compoundID!")
}
numScan <- length(msmsXCMSspecs)
ret$foundOK <- TRUE
ret$parentscan <- 1
ret$parentHeader <- matrix(0, ncol = 20, nrow = 1)
rownames(ret$parentHeader) <- 1
colnames(ret$parentHeader) <- c("seqNum", "acquisitionNum", "msLevel", "peaksCount", "totIonCurrent", "retentionTime", "basepeakMZ",
"basePeakIntensity", "collisionEnergy", "ionisationEnergy", "lowMZ", "highMZ", "precursorScanNum",
"precursorMZ", "precursorCharge", "precursorIntensity", "mergedScan", "mergedResultScanNum",
"mergedResultStartScanNum", "mergedResultEndScanNum")
ret$parentHeader[1,1:3] <- 1
##Write nothing in the parents if there is no MS1-spec
if(is.na(MS1spec)){
ret$parentHeader[1,4:20] <- 0
ret$parentHeader[1,6] <- NA
} else { ##Else use the MS1spec spec to write everything into the parents
ret$parentHeader[1,4] <- length(MS1spec[,1])
ret$parentHeader[1,5] <- 0
ret$parentHeader[1,6] <- findRt(cpdID)
ret$parentHeader[1,7] <- MS1spec[which.max(MS1spec[,7]),1]
ret$parentHeader[1,8] <- max(MS1spec[,7])
ret$parentHeader[1,9] <- 0
ret$parentHeader[1,10] <- 0
ret$parentHeader[1,11] <- min(MS1spec[,1])
ret$parentHeader[1,12] <- max(MS1spec[,1])
ret$parentHeader[1,13:20] <- 0 ##Has no precursor and merge is not yet implemented
}
ret$parentHeader <- as.data.frame(ret$parentHeader)
##Write the peaks into the childscans
ret$childScans <- 2:(numScan+1)
childHeader <- t(sapply(msmsXCMSspecs, function(spec){
header <- vector()
header[3] <- 2
header[4] <- length(spec[,1])
header[5] <- 0 ##Does this matter?
header[6] <- median(spec[,4])
header[7] <- spec[which.max(spec[,7]),1]
header[8] <- max(spec[,7])
header[9] <- 0 ##Does this matter?
header[10] <- 0 ##Does this matter?
header[11] <- min(spec[,1])
header[12] <- max(spec[,1])
header[13] <- 1
header[14] <- findMz(cpdID)[[3]]
header[15] <- 1 ##Will be changed for different charges
header[16] <- 0 ##There sadly isnt any precursor intensity to find in the msms-scans. Workaround? msmsXCMS@files[1]
header[17:20] <- 0 ##Will be changed if merge is wanted
return(header)
}))
childHeader[,1:2] <- 2:(length(msmsXCMSspecs)+1)
ret$childHeader <- as.data.frame(childHeader)
rownames(ret$childHeader) <- 2:(numScan+1)
colnames(ret$childHeader) <- c("seqNum", "acquisitionNum", "msLevel", "peaksCount", "totIonCurrent", "retentionTime", "basepeakMZ",
"basePeakIntensity", "collisionEnergy", "ionisationEnergy", "lowMZ", "highMZ", "precursorScanNum",
"precursorMZ", "precursorCharge", "precursorIntensity", "mergedScan", "mergedResultScanNum",
"mergedResultStartScanNum", "mergedResultEndScanNum")
if (is.na(ret$parentHeader[1,"retentionTime"])) {
## Overwrite MS1 RT with average from MS2
ret$parentHeader[1,"retentionTime"] <- median(ret$childHeader[which(ret$childHeader[,"retentionTime"] != 0), "retentionTime"])
}
ret$parentPeak <- matrix(nrow = 1, ncol = 2)
colnames(ret$parentPeak) <- c("mz","int")
ret$parentPeak[1,] <- c(findMz(cpdID,mode=mode)$mzCenter,100)
ret$peaks <- list()
ret$peaks <- lapply (msmsXCMSspecs, function(specs){
peaks <- matrix(nrow = length(specs[,1]), ncol = 2)
colnames(peaks) <- c("mz","int")
peaks[,1] <- specs[,1]
peaks[,2] <- specs[,7]
return(peaks)
})
return(ret)
}
#' Addition of manual peaklists #' Addition of manual peaklists
#' #'
......
#' @import yaml
NULL
.checkMbSettings <- function() .checkMbSettings <- function()
{ {
......
\name{RmbDefaultSettings} \name{RmbDefaultSettings}
\alias{loadRmbSettings}
\alias{loadRmbSettingsFromEnv}
\alias{RmbDefaultSettings} \alias{RmbDefaultSettings}
\alias{RmbSettingsTemplate} \alias{RmbSettingsTemplate}
\alias{loadRmbSettings}
\alias{loadRmbSettingsFromEnv}
\title{RMassBank settings} \title{RMassBank settings}
\usage{ \usage{
loadRmbSettings(file_or_list) loadRmbSettings(file_or_list)
......
...@@ -39,7 +39,7 @@ ...@@ -39,7 +39,7 @@
w1 <- msmsWorkflow(w, steps=c(1:7), mode="pH") w1 <- msmsWorkflow(w, steps=c(1:7), mode="pH")
w2 <- msmsWorkflow(w, steps=c(1:7), mode="pH", confirmMode = 1) w2 <- msmsWorkflow(w, steps=c(1:7), mode="pH", confirmMode = 1)
wTotal <- combineMultiplicities(c(w1, w2)) wTotal <- combineMultiplicities(c(w1, w2))
wTotal <- msmsWorkflow(wTotal, steps=8, mode="pH", archiveName = "output") wTotal <- msmsWorkflow(wTotal, steps=8, mode="pH", archivename = "output")
# continue here with mbWorkflow # continue here with mbWorkflow
} }
} }
......
...@@ -27,7 +27,7 @@ ...@@ -27,7 +27,7 @@
fillPrecursorScan = fillPrecursorScan =
getOption("RMassBank")$findMsMsRawSettings$fillPrecursorScan, getOption("RMassBank")$findMsMsRawSettings$fillPrecursorScan,
rtMargin = getOption("RMassBank")$rtMargin, deprofile = rtMargin = getOption("RMassBank")$rtMargin, deprofile =
getOption("RMassBank")$deprofile) getOption("RMassBank")$deprofile, headerCache = NA)
} }
\arguments{ \arguments{
\item{fileName}{The file to open and search the MS2 \item{fileName}{The file to open and search the MS2
......
\name{findMsMsHRperxcms.direct}
\alias{findMsMsHRperxcms.direct}
\title{Read in mz-files using XCMS}
\usage{
findMsMsHRperxcms.direct(fileName, cpdID, mode="pH",
findPeaksArgs = NULL, plots = FALSE)
}
\arguments{
\item{fileName}{The path to the mz-file that should be
read}
\item{cpdID}{The compoundID of the compound that has been
used for the file}
\item{mode}{The ionization mode that has been used for
the spectrum represented by the peaklist}
\item{findPeaksArgs}{A list of arguments that will be
handed to the xcms-method findPeaks via do.call}
\item{plots}{A parameter that determines whether the
spectra should be plotted or not}
}
\value{
The \code{msmsWorkspace} with the additional peaklist
added to the right spectrum
}
\description{
Picks peaks from mz-files and returns the pseudospectra
that CAMERA creates with the help of XCMS
}
\examples{
\dontrun{
fileList <- list.files(system.file("XCMSinput", package = "RMassBank"), "Glucolesquerellin", full.names=TRUE)[3]
loadList(system.file("XCMSinput/compoundList.csv",package="RMassBank"))
psp <- findMsMsHRperxcms.direct(fileList,2184)
}
}
\author{
Erik Mueller
}
\seealso{
\code{\link{msmsWorkflow}}
}
...@@ -16,6 +16,11 @@ ...@@ -16,6 +16,11 @@
the \code{workspace} object and finds out which steps the \code{workspace} object and finds out which steps
have already been processed on it. have already been processed on it.
} }
\examples{
\dontrun{
findProgress(w)
}
}
\author{ \author{
Stravs MA, Eawag <michael.stravs@eawag.ch> Stravs MA, Eawag <michael.stravs@eawag.ch>
} }
......
...@@ -4,10 +4,9 @@ ...@@ -4,10 +4,9 @@
\usage{ \usage{
msmsWorkflow(w, mode="pH", steps=c(1:8), confirmMode = msmsWorkflow(w, mode="pH", steps=c(1:8), confirmMode =
FALSE, newRecalibration = TRUE, useRtLimit = TRUE, FALSE, newRecalibration = TRUE, useRtLimit = TRUE,
archivename=NA, readMethod = "mzR", findPeaksArgs = NA, archivename=NA, readMethod = "mzR", precursorscan.cf =
plots = FALSE, precursorscan.cf = FALSE, settings = FALSE, settings = getOption("RMassBank"), analyzeMethod
getOption("RMassBank"), analyzeMethod = "formula", = "formula", progressbar = "progressBarHook")
progressbar = "progressBarHook")
} }
\arguments{ \arguments{
\item{w}{A \code{msmsWorkspace} to work with.} \item{w}{A \code{msmsWorkspace} to work with.}
...@@ -35,25 +34,16 @@ ...@@ -35,25 +34,16 @@
analyzed result files.} analyzed result files.}
\item{readMethod}{Several methods are available to get \item{readMethod}{Several methods are available to get
peak lists from the files. Currently supported are peak lists from the input files. Currently supported are
"mzR", "xcms", "MassBank" and "peaklist". The first two "mzR" and "peaklist". "mzR" reads MS/MS raw data. An
read MS/MS raw data, and differ in the strategy used to alternative raw data reader "xcms" is not yet released.
extract peaks. MassBank will read existing records, so "peaklist" reads a CSV with two columns and the column
that e.g. a recalibration can be performed, and header "mz", "int".}
"peaklist" just requires a CSV with two columns and the
column header "mz", "int".}
\item{findPeaksArgs}{A list of arguments that will be \item{precursorscan.cf}{Whether to fill precursor scan
handed to the xcms-method findPeaks via do.call} entries (cf = carry forward). To be used with files which
for some reasons do not contain precursor scan IDs in the
\item{plots}{A parameter that determines whether the mzML, e.g. AB Sciex converted files.}
spectra should be plotted or not (This parameter is only
used for the xcms-method)}
\item{precursorscan.cf}{Whether to fill precursor scans.
To be used with files which for some reasons do not
contain precursor scan IDs in the mzML, e.g. AB Sciex
converted files.}
\item{settings}{Options to be used for processing. \item{settings}{Options to be used for processing.
Defaults to the options loaded via Defaults to the options loaded via
......
\name{toRMB}
\alias{toRMB}
\title{Conversion of XCMS-pseudospectra into RMassBank-spectra}
\usage{
toRMB(msmsXCMSspecs, cpdID, mode, MS1spec)
}
\arguments{
\item{msmsXCMSspecs}{The compoundID of the compound that
has been used for the peaklist}
\item{cpdID}{The compound ID of the substance of the
given spectrum}
\item{mode}{The ionization mode that has been used for
the spectrum}
\item{MS1spec}{The MS1-spectrum from XCMS, which can be
optionally supplied}
}
\value{
One list element of the (at)specs-entry from an
msmsWorkspace
}
\description{
Converts a pseudospectrum extracted from XCMS using
CAMERA into the msmsWorkspace(at)specs-format that
RMassBank uses
}
\examples{
\dontrun{
XCMSpspectra <- findmsmsHRperxcms.direct("Glucolesquerellin_2184_1.mzdata", 2184)
wspecs <- toRMB(XCMSpspectra)
}
}
\author{
Erik Mueller
}
\seealso{
\code{\link{msmsWorkspace-class}}
}
% \VignetteIndexEntry{RMassBank using XCMS walkthrough}
% \VignettePackage{rcdk}
% \VignetteKeywords{}
%% To generate the Latex code
%library(RMassBank)
%Rnwfile<- file.path("RMassBankXCMS.Rnw")
%Sweave(Rnwfile,pdf=TRUE,eps=TRUE,stylepath=TRUE,driver=RweaveLatex())
\documentclass[letterpaper, 11pt]{article}
\usepackage{times}
\usepackage{url}
\usepackage[pdftex,bookmarks=true]{hyperref}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\funcarg}[1]{{\texttt{#1}}}
\newcommand{\Rvar}[1]{{\texttt{#1}}}
\newcommand{\rclass}[1]{{\textit{#1}}}
<<echo=FALSE>>=
options(width=74)
#library(xtable)
@
\parindent 0in
\parskip 1em
\begin{document}
\title{RMassBank for XCMS}
\author{Erik M\"uller}
\maketitle
\tableofcontents
\newpage
\section{Introduction}
As the RMassBank workflow is described in the other manual, this document mainly explains how to use the
XCMS, MassBank, and peaklist read methods for step 1 of the workflow.
\section{Input files}
\subsection{LC/MS data}
\Rpackage{RMassBank} handles high-resolution LC/MS spectra in mzML or mzdata format in
centroid\footnote{The term "centroid" here refers to any kind of data which are
not in profile mode, i.e. don't have continuous m/z data. It does not refer to
the (mathematical) centroid peak, i.e. the area-weighted mass peak.} or in
profile mode.
Data in the examples was acquired using a QTOF instrument.
In the standard workflow, the file names are used to identify a
compound: file names must be in the format \funcarg{xxxxxxxx\_1234\_xxx.mzXML},
where the xxx parts denote anything and the 1234 part denotes the compound ID in
the compound list (see below). Advanced and alternative uses can be implemented;
consult the implementation of \Rvar{msmsWorkflow} and \Rvar{findMsMsHRperxcms.direct} for
more information.
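
The naming convention above can be illustrated with a minimal base R sketch. It assumes that the compound ID is always the second-to-last underscore-separated field of the file name; the file names below are hypothetical examples, and the code is not the package's own implementation.

```r
# Hypothetical file names following the xxxxxxxx_1234_xxx pattern
files <- c("Glucolesquerellin_2184_1.mzdata",
           "Glucolesquerellin_2184_2.mzdata",
           "Standard_0042_pos.mzML")

# Split each base name on underscores and take the second-to-last field
fields <- strsplit(basename(files), "_", fixed = TRUE)
cpdIDs <- vapply(fields, function(f) as.numeric(f[length(f) - 1]), numeric(1))

print(cpdIDs)
```

Converting with \Rfunction{as.numeric} drops leading zeros, so an ID written as 0042 in the file name is treated as compound 42.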
\section{Additional Workflow-Methods}
The data used in the following example is available as a package \Rpackage{RMassBankData},
so both libraries have to be installed to run this vignette.
<<>>=
library(RMassBank)
library(RMassBankData)
@
\subsection{Options}
In the first part of the workflow, spectra are extracted from the files and processed. In the following example, we will process the Glucolesquerellin spectra from the provided files.
For the workflow to work correctly, we use the default settings and modify them to match the data acquisition method. The settings have to contain the same parameters as they would for the mzR method.
<<echo=TRUE,eval=TRUE>>=
RmbDefaultSettings()
rmbo <- getOption("RMassBank")
rmbo$spectraList <- list(
list(mode="CID", ces="10eV", ce="10eV", res=12000),
list(mode="CID", ces="20eV", ce="20eV", res=12000)
)
rmbo$filterSettings$prelimCut <- 0
rmbo$annotations$instrument <- "Bruker micrOTOFq"
rmbo$annotations$instrument_type <- "LC-ESI-QTOF"
options("RMassBank" = rmbo)
@
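
The chunk above follows a fetch, modify, write-back pattern on a nested settings list. The following sketch shows the same pattern with base R only; the option name \Rvar{demoSettings} and its initial values are hypothetical and are used only so that the code runs without \Rpackage{RMassBank} installed.

```r
# Hypothetical option name and initial values (not RMassBank defaults)
options(demoSettings = list(
  filterSettings = list(prelimCut = 1000),
  annotations    = list(instrument = "unknown")
))

s <- getOption("demoSettings")                  # 1. fetch the whole settings list
s$filterSettings$prelimCut <- 0                 # 2. change individual entries
s$annotations$instrument <- "Bruker micrOTOFq"
options(demoSettings = s)                       # 3. write the modified list back

str(getOption("demoSettings"))
```

Changes made to the local copy \Rvar{s} have no effect until the list is written back with \Rfunction{options}.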
\subsection{XCMS-workflow}
First, a workspace for the \Rvar{msmsWorkflow} must be created:
<<>>=
msmsList <- newMsmsWorkspace()
@
The full paths of the files must be loaded into the container in the array
\Rvar{files}:
<<>>=
msmsList@files <- list.files(system.file("spectra.Glucolesquerellin",
package = "RMassBankData"),
"Glucolesquerellin.*mzData", full.names=TRUE)
@
Note the position of the compound IDs in the filenames. Historically, the "\Rvar{pos}" at the end was used to denote the polarity; it is obsolete now, but the ID must be terminated with an underscore.
If you have multiple files for one compound, give them the same ID; because the field after the ID no longer encodes the polarity, you can use it to number the files.
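
As a rough sketch of how several files that share an ID can be collected per compound, the following base R code groups hypothetical file names by the ID field. The regular expression is an assumption based on the naming convention described above, not code taken from the package.

```r
# Hypothetical file names: two files for compound 2184, one for compound 2185
files <- c("Glucolesquerellin_2184_1.mzdata",
           "Glucolesquerellin_2184_2.mzdata",
           "Glucolesquerellin_2185_1.mzdata")

# Capture the digits between the last two underscores of each base name
ids <- sub(".*_([0-9]+)_[^_]*$", "\\1", basename(files))

# One list element per compound ID, holding all files for that compound
groups <- split(files, ids)

print(groups)
```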
Additionally, the compound list must be loaded using \Rfunction{loadList}:
<<>>=
loadList(system.file("list/PlantDataset.csv",package="RMassBankData"))
@
The changes to the workflow when using XCMS can be described as follows:
The MS2 spectra (and optionally the MS1 spectrum) are extracted and peak-picked using XCMS. You can pass parameters for the \Rfunction{findPeaks} function of XCMS using the \funcarg{findPeaksArgs} argument to detect actual peaks. Then, CAMERA processes the peak lists and creates pseudospectra (or compound spectra). The obtained pseudospectra are stored in the array \Rvar{specs}.
Please note that \funcarg{findPeaksArgs} has to be a list whose elements are named after the arguments of the peak-picking method you want to use, because \Rfunction{findPeaks} is called via \Rfunction{do.call}.
For example, if you want to use centWave with a peak width from 5 to 12 and 25 ppm, \funcarg{findPeaksArgs} would look like this:
<<eval=TRUE>>=
Args <- list(method="centWave",
peakwidth=c(5,12),
prefilter=c(0,0),
ppm=25, snthr=2)
@
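
To illustrate why the list elements must be named, here is a minimal sketch of how \Rfunction{do.call} maps a named list onto function arguments. The function \Rfunction{pickPeaks} is a hypothetical stand-in that only echoes its arguments; it is not the XCMS \Rfunction{findPeaks} method.

```r
# Hypothetical stand-in for a peak picker; it only returns what it received
pickPeaks <- function(object, method = "matchedFilter", ppm = 10, ...) {
  list(object = object, method = method, ppm = ppm, extra = list(...))
}

args <- list(method = "centWave", ppm = 25, peakwidth = c(5, 12))

# Append the data object under the name "object", then call by argument name
res <- do.call(pickPeaks, c(args, list(object = "rawData")))

str(res)
```

Each list element is matched to the argument with the same name; elements that match no formal argument, such as \funcarg{peakwidth} here, end up in the dots.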
If you want to use XCMS for step 1 of the workflow, set the \funcarg{readMethod} parameter to "xcms" and, if you don't want to use the default values for \Rfunction{findPeaks}, pass \funcarg{findPeaksArgs} to the workflow.
<<eval=TRUE>>=
msmsList <- msmsWorkflow(msmsList, steps=1:8,
mode="mH", readMethod="xcms",
findPeaksArgs = Args)
@
You can of course run the rest of the workflow as usual by setting \funcarg{steps} to 1:8, as done here.
\subsection{peaklist-workflow}
The peaklist workflow works like the normal mzR workflow; the only difference is that the supplied data has to be in .csv format and contain two columns, "mz" and "int".
You can look at an example file in the RMassBankData package in spectra.Glucolesquerellin. Please note that the csv files have to be named like the mzdata files, differing only in the filename extension.
The \funcarg{readMethod} name for this is "peaklist".
<<eval=FALSE>>=
msmsPeaklist <- newMsmsWorkspace()
msmsPeaklist@files <- list.files(system.file("spectra.Glucolesquerellin",
package = "RMassBankData"),
"Glucolesquerellin.*csv", full.names=TRUE)
msmsPeaklist <- msmsWorkflow(msmsPeaklist, steps=1:8,
mode="mH", readMethod="peaklist")
@
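
A peak list file with the required two columns can be produced with base R, as in the following sketch. The m/z and intensity values are made up for illustration, and the file is written to a temporary directory.

```r
# Made-up peaks; only the column names "mz" and "int" follow the required layout
peaks <- data.frame(mz  = c(85.0284, 133.0318, 208.0643),
                    int = c(1200, 560, 9800))

# Same naming scheme as the raw data files, but with a .csv extension
csvFile <- file.path(tempdir(), "Glucolesquerellin_2184_1.csv")
write.csv(peaks, csvFile, row.names = FALSE)

# Read the file back to confirm the header and the number of rows
check <- read.csv(csvFile)
print(check)
```

Setting \funcarg{row.names} to \Rvar{FALSE} keeps \Rfunction{write.csv} from adding a third, unnamed column to the file.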
\subsection{Export the records}
This section is just to debug the record creation with XCMS, and hence very terse.
<<>>=
mb <- newMbWorkspace(msmsList)
mb <- resetInfolists(mb)
mb <- loadInfolist(mb,system.file("infolists/PlantDataset.csv",
package = "RMassBankData"))
## Step
mb <- mbWorkflow(mb, steps=3:4)
@
\section{Session information}
<<>>=
sessionInfo()
@
\end{document}