I'm seeing a noticeable number of duplicate annotations for the analytes generated by Compound Discoverer. Specifically, there are cases where the molecular weights are the same, but some (or all) of the retention times are different.
What is the best way of handling such cases during downstream analyses?
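To make the pattern concrete, here is a minimal sketch of how such cases could be flagged before deciding how to handle them. The field names (`name`, `mw`, `rt`) and the example rows are hypothetical placeholders, not the actual Compound Discoverer export headers or real data, and the rounding precision is an arbitrary assumption.

```python
from collections import defaultdict

# Hypothetical rows; field names and values are placeholders, not a real export.
rows = [
    {"name": "Compound A", "mw": 180.0634, "rt": 1.52},
    {"name": "Compound A", "mw": 180.0634, "rt": 3.87},
    {"name": "Compound B", "mw": 146.0691, "rt": 0.95},
]

def find_duplicate_annotations(rows, mw_decimals=4):
    """Group rows by (name, rounded MW); keep groups with more than one distinct RT."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["name"], round(row["mw"], mw_decimals))
        groups[key].append(row["rt"])
    return {key: sorted(rts) for key, rts in groups.items() if len(set(rts)) > 1}

duplicates = find_duplicate_annotations(rows)
for (name, mw), rts in duplicates.items():
    print(f"{name} (MW {mw}): retention times {rts}")
```

This only lists the candidates; whether to merge them, keep the most intense peak, or treat them as separate isomers is exactly the open question.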
Given that, how different are the distributions? Say there are 3 conditions to which all samples were randomly assigned. It would be fine to have one batch with 30/40/50 samples from each treatment (assuming they are run in a randomized order). However, a distribution like 50/0/60 wouldn't be good, and 10/0/100 would be even worse.
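To put rough numbers on these three examples, one possible score is the normalized Shannon entropy of the per-condition counts within a batch (1.0 means perfectly even, 0.0 means only one condition is present). This metric is an illustrative choice made here, not an established standard for batch design, and it implies no particular cutoff.

```python
import math

def balance_score(counts):
    """Normalized Shannon entropy of condition counts in one batch (range 0.0 to 1.0)."""
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        raise ValueError("need at least two conditions and at least one sample")
    entropy = 0.0
    for c in counts:
        if c > 0:  # empty conditions contribute nothing to the entropy
            p = c / total
            entropy -= p * math.log(p)
    return entropy / math.log(len(counts))

# The three per-batch distributions discussed above.
examples = {"30/40/50": [30, 40, 50], "50/0/60": [50, 0, 60], "10/0/100": [10, 0, 100]}
scores = {label: balance_score(counts) for label, counts in examples.items()}
for label, score in scores.items():
    print(f"{label}: balance = {score:.2f}")
```

The score ranks the three cases in the same order as described above: 30/40/50 is close to even, 50/0/60 is clearly lower, and 10/0/100 is the lowest.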
I'm talking about the extreme case (e.g. 10/0/100). So is it correct to say that, even if the pooled QC was created from all of the samples in the experiment, the resulting data would have issues (after correction, normalization, etc.) because of the unbalanced distribution of the samples?
If this is the case, what happens during the experiment as a result of the imbalance that leads to unusable or low-quality data?
Thank you so much for your feedback! Learning so much from you all.
It's not an experiment where we have 'paired samples', so we are okay with respect to that.
I wonder if it would be okay to set up an experiment where:
There are multiple batches.
Within each batch, samples with different conditions are not evenly distributed.
A pooled sample consisting of small amounts of all samples across the batches is created. QC samples are analyzed in addition to the samples in each batch (which ideally would allow correction for inter- and intra-batch drift).
After the batch correction (inter- and intra-batch) using the QC samples, will the quality of the data be okay given that the distribution of the conditions was not even?
I have been thinking about setting up an experiment with around 600 samples (3 batches of 200) and 3 different conditions. Would it be optimal to set up the batches such that the proportions of the 3 conditions are the same across all batches?
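One way such a layout could be generated is sketched below: each condition is shuffled and dealt evenly across the batches, and the run order inside each batch is then randomized. The equal split of 200 samples per condition, the sample IDs, and the function name `assign_batches` are assumptions for illustration only; the post gives just the total of about 600 samples.

```python
import random
from collections import Counter

def assign_batches(samples_by_condition, n_batches, seed=0):
    """Spread each condition evenly over batches, then randomize run order per batch."""
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    batches = [[] for _ in range(n_batches)]
    for offset, (condition, samples) in enumerate(samples_by_condition.items()):
        shuffled = list(samples)
        rng.shuffle(shuffled)
        for i, sample in enumerate(shuffled):
            # Offset the starting batch per condition so leftovers do not pile up in batch 1.
            batches[(i + offset) % n_batches].append((sample, condition))
    for batch in batches:
        rng.shuffle(batch)  # randomized run order within the batch
    return batches

# Assumed layout: 200 samples per condition, with placeholder IDs.
samples_by_condition = {
    cond: [f"{cond}_{i:03d}" for i in range(200)] for cond in ("A", "B", "C")
}
batches = assign_batches(samples_by_condition, n_batches=3)
for n, batch in enumerate(batches, start=1):
    print(f"batch {n}: {len(batch)} samples, {dict(Counter(c for _, c in batch))}")
```

With these assumed numbers, each batch holds 200 samples and 66 or 67 samples of every condition. Pooled QC injections would still need to be interleaved into each batch's run order separately.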