
Messages

This section allows you to view all Messages made by this member. Note that you can only see Messages made in areas you currently have access to.

Messages - trljcl

1
MS-DIAL / MS-DIAL versions after 4.48 crash after processing and on loading alignment file
Hello,
I've been processing Agilent GCMS data (1 to 5 files, as .abf files), using retention-time identification and alignment (i.e. no RI index files).  In MS-DIAL versions up to and including 4.48 (I've tested multiple versions back to 4.24, i.e. 4.24, 4.36, 4.38, 4.48), the software behaves as expected, without any issues.  However, in the current version (4.80) and all versions after 4.48 (i.e. 4.60, 4.70, 4.80), I get the following errors:

1)  After finishing Data Processing, MS-DIAL closes (crashes).  The .mtd2 file is automatically saved before the crash, as MS-DIAL can then be re-started and the .mtd2 project file opened, with processed data spots showing as expected for individual files.  This happens whether I process a single file, or multiple files as input.

2) After successfully processing multiple files (and re-starting MS-DIAL and loading the saved .mtd2 project file as outlined above), the spots from individual files can be viewed by double clicking on files in the File Navigator.  However, after double-clicking the alignment file in Alignment Navigator, MS-DIAL crashes.  So it is not possible to view alignment results.  I can generate an alignment file in the new versions (e.g. 4.80), and open the saved .mtd2 file in versions <= 4.48 to view the alignment result, but not in any versions >= 4.60.

I have installed all MS-DIAL versions on Win 10 (build version 1909, Intel i5-7200U 2.50 GHz processor with 16 GB RAM).

It looks to me as though the alignment results are being successfully generated in the latest MS-DIAL versions, but are simply not openable in the GUI.  Whatever caused this problem happened in the upgrade from version 4.48 to 4.60, so at the moment I am restricted to using 4.48 until this is resolved.  Are there any user-functionality differences between 4.48 and 4.80?

I've attached a screenshot of the .mtd2 file opened in 4.80, where I processed 5 .abf files.  Double-clicking on any of the alignmentResults will crash MS-DIAL.  The screenshot looks identical in 4.48 (i.e. the 4.80-generated .mtd2 file opened in 4.48), and there the alignmentResults all open with no issues.

Many thanks
Tony

2
CAMERA / annotate() pipeline inconsistencies and findIsotopes filter
Hi all,
I am using CAMERA 1.26.0 and noticed there are some inconsistencies in the annotate() defaults compared to running the individual functions; namely:

groupFWHM()
default intval = "maxo"; annotate() passes down default intval = "into"

findIsotopes()
default intval = "maxo"; annotate() passes down default intval = "into"
default mzabs = 0.01; annotate() passes down default mzabs = 0.015
default filter = TRUE; annotate() does not recognise "filter" as an argument, so it cannot be changed from the default (e.g. to FALSE) in the annotate() pipeline

groupCorr()
default cor_exp_thr = 0.75; annotate() does not recognise "cor_exp_thr" as an argument so it cannot be changed from the default in the annotate() pipeline

I had no end of trouble running a sequence of the core functions with their default values (xsAnnotate(), groupFWHM(), findIsotopes(), groupCorr(), findAdducts()) vs annotate(), and getting different results because of these inconsistencies.  Please could the defaults be harmonised, and the filter and cor_exp_thr arguments be made available to pass through in annotate()?
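To make the discrepancy concrete, here is a minimal sketch of the two routes side by side.  The xcmsSet object `xs` is hypothetical, and the overrides in the step-by-step route are my reading of what annotate() actually passes down in CAMERA 1.26.0:

```r
library(CAMERA)  # assumes an existing xcmsSet object `xs` (hypothetical)

# Route 1: the wrapper, with its own internal defaults
an.wrap <- annotate(xs, polarity = "positive")

# Route 2: the individual functions; note the explicit overrides
# needed to match what annotate() passes down
an <- xsAnnotate(xs)
an <- groupFWHM(an, intval = "into")                    # not the "maxo" default
an <- findIsotopes(an, intval = "into", mzabs = 0.015)  # filter stays at TRUE
an <- groupCorr(an)                                     # cor_exp_thr stuck at its default
an <- findAdducts(an, polarity = "positive")
```

With these overrides the two routes should agree; with the bare function defaults they do not, and filter/cor_exp_thr cannot be matched at all from within annotate().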

Also, I am having issues with the "filter" argument in findIsotopes().  As far as I can tell, this traces to the following snippet from the CAMERA:::findIsotopesPspec() function (called inside findIsotopes()):

theo.mass <- spectra[j, 1] * charge
numC <- abs(round(theo.mass / 12))
inten.max <- int.c12 * numC * 0.011
inten.min <- int.c12 * 1 * 0.011
if ((int.c13 < inten.max && int.c13 > inten.min) ||
    !params$filter) {
I read this as: the maximum allowed C13 isotope intensity is estimated by taking the total number of carbons in the compound * the natural abundance of the C13 isotope, as if every carbon could be a C13.  This will be an overestimate, since numC will always be greater than the actual number of carbons (not all the mass is carbon, and not all the carbons are C13); e.g. for a C12 alkane, numC as calculated from the expected [M+H]+ = 14.  This is fine and allows for a margin of error.  However, in contrast to inten.max, inten.min is a rather hard cutoff: it assumes a minimum of one C13 (fine) but then sets the intensity threshold to the assumed natural abundance * the C12 intensity.  That is not so fine, because any instrumental measurement that slightly underestimates the C13 isotope intensity (e.g. on my Orbitrap, where isotope intensity ratios can deviate +/- 20% from theoretical) will fail this minimum estimate.  Why not do this instead:

theo.mass <- spectra[j, 1] * charge
numC <- abs(round(theo.mass / 12))
inten.max <- int.c12 * numC * 0.011
if ((int.c13 < inten.max && int.c13 < int.c12) ||
    !params$filter) {

This way the C13 intensity just has to be less than the C12 intensity to pass the minimum check (OK for small molecules, where the number of carbons is < 90)?

Or alternatively, apply some minimum-intensity fudge factor; e.g. allow for up to a 50% measurement error on the minimum intensity:

theo.mass <- spectra[j, 1] * charge
numC <- abs(round(theo.mass / 12))
inten.max <- int.c12 * numC * 0.011
inten.min <- int.c12 * 1 * 0.011 * 0.5
if ((int.c13 < inten.max && int.c13 > inten.min) ||
    !params$filter) {
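To put numbers on it, here is a back-of-envelope illustration in plain R for the worst case: a hypothetical one-carbon ion (e.g. protonated formaldehyde, m/z ~ 31) whose measured C13 peak reads 20% low.  All intensities are made up for illustration:

```r
int.c12 <- 1e6                       # hypothetical C12 peak intensity
theo.mass <- 31
numC <- abs(round(theo.mass / 12))   # 3: overestimates the single real carbon
true.c13 <- int.c12 * 1 * 0.011      # 11000: theoretical C13 intensity (1 carbon)
int.c13 <- true.c13 * 0.8            # 8800: instrument reads 20% low

inten.max <- int.c12 * numC * 0.011  # 33000: generous upper bound
inten.min <- int.c12 * 1 * 0.011     # 11000: the hard lower cutoff

c(current = int.c13 < inten.max && int.c13 > inten.min,       # FALSE: isotope rejected
  relaxed = int.c13 < inten.max && int.c13 < int.c12,         # TRUE:  isotope kept
  fudged  = int.c13 < inten.max && int.c13 > inten.min * 0.5) # TRUE:  isotope kept
```

So a perfectly plausible measurement fails the current filter purely through the hard inten.min, while either proposed variant keeps it.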

thanks
Tony
3
XCMS / Re: looking for xcms setup help for untargeted metabolomics
Hi Nat,
Regarding point (2), I'm not sure what you're talking about (I don't usually use diffreport).  Could you be more specific?

Regarding point (1), this is trickier. The rectUnique() function within xcms effectively discards all duplicates in an ordered matrix, keeping only the first hit. So if it is ordered by the biggest peak, all smaller peaks are discarded. What you want to do is keep multiple peaks in preference to one integrated peak encompassing them all, which is quite a different task. In that case, you would not use rectUnique(). What you could do instead is iteratively select "duplicated" peaks within the same sample and, instead of picking the peak with the largest area, pick the fwhm parameter that gives the highest number of peaks in the group of duplicates, discarding the other possibilities. This is certainly doable in R without too much effort (e.g. using a while loop on an intensity-ordered peak matrix and table() to select by peak frequency). But I'm a bit snowed under at the moment. I could spend an hour on it next week maybe....
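Just to sketch the idea in base R (toy data; the column names and the crude binning are hypothetical, and a real version would work on the xcms peak matrix per sample with proper m/z and rt tolerances):

```r
# Toy peaks from three matchedFilter rounds; fwhm records which round found each
peaks <- data.frame(
  mz   = c(100.1, 100.1, 100.1, 200.2, 200.2),
  rt   = c(50, 51, 50, 120, 121),
  fwhm = c(10, 10, 30, 10, 30),
  into = c(5e4, 4e4, 9e4, 2e4, 3e4)
)

# crude duplicate grouping: bin by rounded m/z and a 5 s rt window
peaks$group <- paste(round(peaks$mz), round(peaks$rt / 5))

keep <- unlist(lapply(split(seq_len(nrow(peaks)), peaks$group), function(idx) {
  tab  <- table(peaks$fwhm[idx])                  # peak count per fwhm setting
  best <- as.numeric(names(tab)[which.max(tab)])  # fwhm with most peaks wins
  idx[peaks$fwhm[idx] == best]                    # (ties broken arbitrarily)
}))
deduped <- peaks[sort(keep), ]
# here the fwhm = 10 peaks win both duplicate groups, so the single large
# fwhm = 30 peak spanning the first two is discarded rather than kept
```

That is, the group of duplicates votes by frequency for an fwhm setting, the opposite of keeping the one biggest peak.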

All the best,
Tony
4
XCMS / Re: looking for xcms setup help for untargeted metabolomics
Hi Nat, Laura and Paul,

I've been using xcms on low-res LCQ data for years, using both the original (matchedFilter) and centWave algorithms.  The problems are thus:

1) matchedFilter works well with low res data but is not very adaptive in finding peaks with very different peak width (i.e. the fwhm setting)
2) centWave is good at finding a range of peak widths but doesn't work well with low-res data.  You can set ppm to 600-800 to find peaks, but it's a fine line between exceeding the mass-accuracy limits of a low-res instrument and getting centWave-specific peak-insertion problems.

The solution I've come up with that works best for me is to do multiple rounds of peak picking using matchedFilter with different FWHM settings, combine the peak lists and remove redundant peaks (i.e. peaks found more than once within some m/z and rt limits), and then proceed as normal with grouping, alignment, etc.  E.g.:

#make 3 xcmsSet objects using 3 FWHM values keeping all else the same
set1a <- xcmsSet(files = mzXML.files, method = "matchedFilter", fwhm = 10, max = 500, snthresh = 10, step = 0.1, steps = 2, mzdiff = 0.8)
set1b <- xcmsSet(files = mzXML.files, method = "matchedFilter", fwhm = 30, max = 500, snthresh = 10, step = 0.1, steps = 2, mzdiff = 0.8)
set1c <- xcmsSet(files = mzXML.files, method = "matchedFilter", fwhm = 60, max = 500, snthresh = 10, step = 0.1, steps = 2, mzdiff = 0.8)

#combine into one xcmsSet by using one of the above as a template and overriding its peaklist with a combination of all three
set1 <- set1c
set1@peaks <- rbind(set1a@peaks, set1b@peaks, set1c@peaks)
set1@peaks <- set1@peaks[order(set1@peaks[, "sample"], decreasing = FALSE), ]

#remove redundant peaks, in this case where there are any peaks within an absolute m/z value of 0.2 and within 3 s for any one sample in the xcmsSet (the largest peak is kept)
set2 <- deDuper(set1, mz.abs = 0.2, rt.abs = 3)

#then group, etc.

the deDuper function is something I've written that you are welcome to try:

deDuper <- function(object, mz.abs = 0.1, rt.abs = 2)
{
require("xcms")

mzdiff = 0

peaks.mat <- object@peaks
mz.min <- peaks.mat[, "mz"] - mz.abs
mz.max <- peaks.mat[, "mz"] + mz.abs
rt.min <- peaks.mat[, "rt"] - rt.abs
rt.max <- peaks.mat[, "rt"] + rt.abs

peaks.mat.out <- NULL

samples <- unique(peaks.mat[,"sample"])

cat("\n", "Duplicate peak removal; % complete: ")
percplus <- -1

for(i in 1:length(samples))
        {
        perc <- round(i / length(samples) * 100)
        if(perc %% 10 == 0 && perc != percplus)
                {
                cat(perc, " ")
                }
        percplus <- perc

        peaks.mat.i <- peaks.mat[which(peaks.mat[, "sample"] == samples[i]), , drop = FALSE]
        mz.min.i <- mz.min[which(peaks.mat[, "sample"] == samples[i])]
        mz.max.i <- mz.max[which(peaks.mat[, "sample"] == samples[i])]
        rt.min.i <- rt.min[which(peaks.mat[, "sample"] == samples[i])]
        rt.max.i <- rt.max[which(peaks.mat[, "sample"] == samples[i])]

        uorder.i <- order(peaks.mat.i[, "into"], decreasing = TRUE)
        uindex.i <- xcms:::rectUnique(cbind(mzmin = mz.min.i, mzmax = mz.max.i, rtmin = rt.min.i, rtmax = rt.max.i), uorder.i, mzdiff)
        peaks.mat.i <- peaks.mat.i[uindex.i, , drop = FALSE]
        peaks.mat.out <- rbind(peaks.mat.out, peaks.mat.i)
        }

cat("\n")
object@peaks <- peaks.mat.out
return(object)

}


cheers
Tony
5
XCMS / Re: xcms & mzR installation problem
OK - thought so.

I've managed a dirty workaround instead that works.  Basically, I did 3 consecutive installs of netcdf, configured each time for my ~/usr directory, ~/usr/bin, and ~/usr/bin/lib64/R.  Interestingly, the ncdf library in R loaded fine with just the ~/usr install, but mzR (and xcms) additionally required the other two locations in order to install properly.

This was all with the latest netCDF build (4.2) with the --disable-netcdf-4 option

cheers
Tony
6
XCMS / xcms & mzR installation problem
Hi,
We recently upgraded to a new remote linux box, so I took the opportunity to install R 2.15.0 and then my selected bioconductor packages, including xcms 1.32.0 via the biocLite() method.  However, xcms failed to install, citing that mzR wasn't available.  I at first thought this was because I had forgotten to re-install the netCDF library (the paths had changed on the new box), so I reinstalled netCDF and then tried the xcms installation again - it still fails.  I'm not sure what to do next.

Of possible note: I have used the --prefix options to install R and netcdf to my user-defined locations; it's not possible to install either as root as I am using a networked shared box.

Some details:
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8      LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats    graphics  grDevices utils    datasets  methods  base

other attached packages:
[1] BiocInstaller_1.4.4

loaded via a namespace (and not attached):
[1] tools_2.15.0

netCDF installation method (it all checks out on make check):

tar -zxvf netcdf-3.6.3.tar.gz
cd netcdf-3.6.3
./configure --prefix=/bdc/people/trl1/bin --enable-shared CFLAGS="-fPIC"
make
make install

failed xcms installation (also note that setting dependencies=F has no effect in preventing the mzR installation attempt):

> biocLite("xcms")
BioC_mirror: http://bioconductor.org
Using R version 2.15, BiocInstaller version 1.4.4.
Installing package(s) 'xcms'
also installing the dependency ‘mzR’

trying URL 'http://www.bioconductor.org/packages/2.10/bioc/src/contrib/mzR_1.2.1.tar.gz'
Content type 'application/x-gzip' length 4893749 bytes (4.7 Mb)
opened URL
==================================================
downloaded 4.7 Mb

trying URL 'http://www.bioconductor.org/packages/2.10/bioc/src/contrib/xcms_1.32.0.tar.gz'
Content type 'application/x-gzip' length 1320111 bytes (1.3 Mb)
opened URL
==================================================
downloaded 1.3 Mb

* installing *source* package ‘mzR’ ...
** libs
rm -f cramp.o  ramp_base64.o  ramp.o  RcppRamp.o RcppRampModule.o rnetCDF.o ./boost/system/src/error_code.o ./boost/regex/src/posix_api.o ./boost/regex/src/fileiter.o ./boost/regex/src/regex_raw_buffer.o ./boost/regex/src/cregex.o ./boost/regex/src/regex_debug.o ./boost/regex/src/instances.o ./boost/regex/src/icu.o ./boost/regex/src/usinstances.o ./boost/regex/src/regex.o ./boost/regex/src/wide_posix_api.o ./boost/regex/src/regex_traits_defaults.o ./boost/regex/src/winstances.o ./boost/regex/src/wc_regex_traits.o ./boost/regex/src/c_regex_traits.o ./boost/regex/src/cpp_regex_traits.o ./boost/regex/src/static_mutex.o ./boost/regex/src/w32_regex_traits.o ./pwiz/data/msdata/Version.o ./pwiz/utility/minimxml/XMLWriter.o ./pwiz/utility/minimxml/SAXParser.o ./boost/iostreams/src/zlib.o ./boost/thread/src/pthread/once.o ./boost/filesystem/src/operations.o ./pwiz/data/common/MemoryIndex.o ./pwiz/data/common/CVTranslator.o ./pwiz/data/common/cv.o ./pwiz/data/common/ParamTypes.o ./pwiz/data/common/BinaryIndexStream.o ./pwiz/data/common/diff_std.o ./pwiz/data/msdata/SpectrumList_MGF.o ./pwiz/data/msdata/DefaultReaderList.o ./pwiz/data/msdata/ChromatogramList_mzML.o ./pwiz/data/msdata/examples.o ./pwiz/data/msdata/Serializer_mzML.o ./pwiz/data/msdata/Serializer_MSn.o ./pwiz/data/msdata/Reader.o ./pwiz/data/msdata/Serializer_MGF.o ./pwiz/data/msdata/Serializer_mzXML.o ./pwiz/data/msdata/SpectrumList_mzML.o ./pwiz/data/msdata/SpectrumList_MSn.o ./pwiz/data/msdata/BinaryDataEncoder.o ./pwiz/data/msdata/Diff.o ./pwiz/data/msdata/MSData.o ./pwiz/data/msdata/References.o ./pwiz/data/msdata/SpectrumList_mzXML.o ./pwiz/data/msdata/IO.o ./pwiz/data/msdata/SpectrumList_BTDX.o ./pwiz/data/msdata/SpectrumInfo.o ./pwiz/data/msdata/RAMPAdapter.o ./pwiz/data/msdata/LegacyAdapter.o ./pwiz/data/msdata/SpectrumIterator.o ./pwiz/data/msdata/MSDataFile.o ./pwiz/data/msdata/SpectrumListCache.o ./pwiz/utility/misc/IntegerSet.o ./pwiz/utility/misc/Base64.o ./pwiz/utility/misc/IterationListener.o 
./pwiz/utility/misc/MSIHandler.o ./pwiz/utility/misc/Filesystem.o ./pwiz/utility/misc/TabReader.o ./pwiz/utility/misc/random_access_compressed_ifstream.o ./pwiz/utility/misc/SHA1.o ./pwiz/utility/misc/SHA1Calculator.o ./pwiz/utility/misc/sha1calc.o ./random_access_gzFile.o rampR.o
g++ -I/bdc/people/trl1/bin/lib64/R/include -DNDEBUG -D_LARGEFILE_SOURCE -I./boost_aux/ -I. -DHAVE_PWIZ_MZML_LIB -I/usr/local/include -I"/bdc/people/trl1/bin/lib64/R/library/Rcpp/include"  -fpic  -g -O2  -c cramp.cpp -o cramp.o
g++ -I/bdc/people/trl1/bin/lib64/R/include -DNDEBUG -D_LARGEFILE_SOURCE -I./boost_aux/ -I. -DHAVE_PWIZ_MZML_LIB -I/usr/local/include -I"/bdc/people/trl1/bin/lib64/R/library/Rcpp/include"  -fpic  -g -O2  -c ramp_base64.cpp -o ramp_base64.o
g++ -I/bdc/people/trl1/bin/lib64/R/include -DNDEBUG -D_LARGEFILE_SOURCE -I./boost_aux/ -I. -DHAVE_PWIZ_MZML_LIB -I/usr/local/include -I"/bdc/people/trl1/bin/lib64/R/library/Rcpp/include"  -fpic  -g -O2  -c ramp.cpp -o ramp.o
g++ -I/bdc/people/trl1/bin/lib64/R/include -DNDEBUG -D_LARGEFILE_SOURCE -I./boost_aux/ -I. -DHAVE_PWIZ_MZML_LIB -I/usr/local/include -I"/bdc/people/trl1/bin/lib64/R/library/Rcpp/include"  -fpic  -g -O2  -c RcppRamp.cpp -o RcppRamp.o
g++ -I/bdc/people/trl1/bin/lib64/R/include -DNDEBUG -D_LARGEFILE_SOURCE -I./boost_aux/ -I. -DHAVE_PWIZ_MZML_LIB -I/usr/local/include -I"/bdc/people/trl1/bin/lib64/R/library/Rcpp/include"  -fpic  -g -O2  -c RcppRampModule.cpp -o RcppRampModule.o
gcc -std=gnu99 -I/bdc/people/trl1/bin/lib64/R/include -DNDEBUG -D_LARGEFILE_SOURCE -I./boost_aux/ -I. -DHAVE_PWIZ_MZML_LIB -I/usr/local/include -I"/bdc/people/trl1/bin/lib64/R/library/Rcpp/include"  -fpic  -g -O2  -c rnetCDF.c -o rnetCDF.o
rnetCDF.c:2:20: fatal error: netcdf.h: No such file or directory
compilation terminated.
make: *** [rnetCDF.o] Error 1
ERROR: compilation failed for package ‘mzR’
* removing ‘/bdc/people/trl1/bin/lib64/R/library/mzR’
ERROR: dependency ‘mzR’ is not available for package ‘xcms’
* removing ‘/bdc/people/trl1/bin/lib64/R/library/xcms’

The downloaded source packages are in
        ‘/tmp/RtmpcsZOLu/downloaded_packages’
Updating HTML index of packages in '.Library'
Making packages.html  ... done
Warning messages:
1: In install.packages(pkgs = pkgs, lib = lib, repos = repos, ...) :
  installation of package ‘mzR’ had non-zero exit status
2: In install.packages(pkgs = pkgs, lib = lib, repos = repos, ...) :
  installation of package ‘xcms’ had non-zero exit status
>
7
XCMS / Multiple class x features p-value corrections
Hi All,
This is really a stats question but reflects a situation I commonly come up against in interpreting xcms results, and I'm sure many forum members do too.

Scenario: I generate an xcmsSet with 100 samples (5 replicates in each of 20 classes), generating 200 features.  One class is a control class, and the other 19 are different treatments.  I want to determine which features in each of the treatment classes are significantly different from the same features in the control class.  What I ultimately want to do is produce a separate chart or table for each class, listing only the features significantly different from the control class.  The scripting for this is straightforward; how to apply the underlying stats is not.

A 2-class problem would be easy: I would do a t-test on each feature, followed by a Bonferroni or FDR adjustment on the list of pairwise p-values to determine significance for each feature.  However, for the multiple-class scenario, how should the p-value adjustment be carried out?

What I've been doing to date is an ANOVA on each feature and, if the ANOVA p-value is < 0.05, performing a post-hoc test using the TukeyHSD procedure to produce class-pairwise adjusted p-values to determine which treatment classes are significantly different from the control class.  In this case, there is no p-value adjustment on the ANOVAs, even though I am conducting multiple ANOVAs.  I'm worried that although Tukey's test corrects for family-wise error between classes, I am making no allowance for error-rate correction between features.

In my approach, should I first adjust the ANOVA p-values with a multiple-testing correction across features (e.g. Bonferroni, FDR, etc.), and then go on to the post-hoc Tukey test only if the adjusted ANOVA p-value is < 0.05?  Or is there a better recommended approach?
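In code, the adjust-first version of what I'm describing would look something like this (simulated stand-in data; BH/FDR chosen purely for illustration):

```r
set.seed(42)
# simulated stand-in for the real data: 200 features x 100 samples,
# 20 classes (class1 = control) with 5 replicates each
classes <- factor(rep(paste0("class", 1:20), each = 5))
X <- matrix(rnorm(200 * 100), nrow = 200,
            dimnames = list(paste0("feature", 1:200), NULL))

# one ANOVA per feature, then a feature-wise correction on those p-values
p.anova <- apply(X, 1, function(y) anova(aov(y ~ classes))[["Pr(>F)"]][1])
p.adj   <- p.adjust(p.anova, method = "BH")

# post-hoc Tukey HSD only for features surviving the corrected ANOVA;
# the "classX-class1" rows give the treatment-vs-control comparisons
sig   <- which(p.adj < 0.05)
tukey <- lapply(sig, function(i) TukeyHSD(aov(X[i, ] ~ classes))$classes)
```

With pure-noise simulated data, sig will usually be empty, which is exactly the behaviour a feature-wise correction should give.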



thanks
Tony
8
XCMS / migration of old xcms posts?
Is there any intention to migrate archived and future posts from the existing xcms mailing list to here, so all past xcms mailing list content will be searchable through this forum?  Also, will the existing xcms mailing list continue to be maintained and accessed separately from this forum?  It doesn't make much sense to keep them separated...

thanks
Tony