Q1. Which normalization method is used in the guided workflow?

Agilent Single Color: Percentile Shift
Agilent Two Color: No normalization in GeneSpring
Affymetrix Expression: RMA (summarization method)
Affymetrix Exon Expression: RMA16 (summarization method)
Illumina Single Color: Percentile Shift
Agilent miRNA: Percentile Shift

Q2. Why is RMA chosen as the default summarization algorithm in the guided workflow?

RMA is the default summarization algorithm in the guided workflow because it is the method most widely used by the microarray community.

RMA considers only perfect match probes and hence uses positive signal intensities for probe-level normalization.
By not considering mismatch probes, it reduces noise.

Q3. In percentile shift normalization, why is the default set to the 75th percentile?

The 75th percentile is a more robust intensity value for normalizing the data. In any tissue you analyze, a certain percentage of genes is not expressed. Even when a gene is not expressed, a probe matching that gene still reports an intensity value. These intensity values will most likely fall in the lower percentiles of the array; they are considered noise and are less reliable. Taking the intensity value at a higher percentile such as the 75th therefore approximates the median of only the probes with reliable intensity values, that is, the genes that are expressed.

Q4. How is Percentile shift normalization performed in GeneSpring?

GeneSpring performs percentile shift normalization in the following order:
1. Transforms the signal values to log base 2.
2. Arranges the log-transformed signal values in increasing order.
3. Computes the rank R of the required percentile (the Pth percentile).
4. If the rank is an integer, the Pth percentile is the value with rank R. If the rank is not an integer, the tool calculates the value in additional steps.
5. Subtracts the value of the Pth percentile from each log-transformed signal value. The result is the normalized intensity value.
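
The steps above can be sketched in a few lines of Python. This is only an illustration, not GeneSpring's implementation: the rank formula R = P/100 × (N + 1) and the linear interpolation used for non-integer ranks are assumptions, and the function names are hypothetical.

```python
import math

def percentile_value(sorted_vals, p):
    """Value at the p-th percentile of an ascending list.

    Assumption: rank R = p/100 * (N + 1), with linear interpolation
    between the neighbouring ranks when R is not an integer.
    """
    n = len(sorted_vals)
    rank = p / 100.0 * (n + 1)
    rank = min(max(rank, 1.0), float(n))  # clamp to valid 1-based ranks
    lower = int(math.floor(rank))
    frac = rank - lower
    if frac == 0 or lower == n:
        return sorted_vals[lower - 1]
    below, above = sorted_vals[lower - 1], sorted_vals[lower]
    return below + frac * (above - below)

def percentile_shift(raw_signals, p=75):
    """Log2-transform, then subtract the p-th percentile of the log values."""
    logs = [math.log2(v) for v in raw_signals]  # step 1: log transform
    ordered = sorted(logs)                       # step 2: increasing order
    shift = percentile_value(ordered, p)         # steps 3-4: Pth percentile
    return [v - shift for v in logs]             # step 5: subtract

normalized = percentile_shift([1, 2, 4, 8, 16, 32, 64], p=75)
```

With seven values, the assumed rank for the 75th percentile is 6, so the sixth-smallest log value (5) is subtracted from every log value.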

Q5. Can I change the normalization type or settings of an experiment once it is created?

Normalization can be performed only while creating an experiment. To change the normalization settings, you have to create a new experiment.

Q6. What is the difference between PLIER and IterPLIER?

IterPLIER differs from PLIER in that it does not use all the probes for summarization. It iteratively discards the bad probes and keeps only the good ones.

Q7. Why is the normalization algorithm for Affymetrix arrays called "RMA16" instead of RMA? What does the number 16 represent?

The name RMA16 refers to the addition of the value 16 to the expression values. This is done to achieve variance stabilization.

Q8. What are Core, Extended and Full Meta Probe sets?

In order to establish a hierarchy of gene confidence levels, the sources of input transcript annotations are partitioned into three types. From the highest to the lowest confidence, the types are labeled as Core, Extended, and Full.
Core: The Core list comprises 17,800 transcript clusters from RefSeq and full-length GenBank mRNAs.
Extended: The Extended list comprises 129k transcript clusters including cDNA transcripts, syntenic rat and mouse mRNA, and Ensembl, microRNA, Mitomap, Vegagene and VegaPseudogene annotations.
Full: The Full list comprises 262k transcript clusters including ab-initio predictions from Geneid, Genscan, GENSCAN Suboptimal, Exoniphy, RNAgene, SgpGene and TWINSCAN.

Q9. I know that GeneSpring GX 7.3 has the “Data transformation; Set measurement less 0.01 to 0.01” option for normalization. Is there a similar option in the latest GeneSpring version?

As part of the pre-processing step of experiment creation, thresholding is performed: raw values less than 1 are set to 1. This is the default setting in GeneSpring, but the user can choose a different threshold value.
The option to transform values to 0.01 is unavailable in the current GeneSpring version.

Q10. I am working with Agilent Single Color technology. I am unable to specify a threshold value of less than one; I get the 'Threshold should be more than 1' error. Is this possible in the current GeneSpring version?

A threshold value of less than 1 cannot be specified in a standard experiment, but it can be changed in a Generic experiment. Create a custom technology and then a Generic experiment to be able to change the threshold value.

Q11. How does GeneSpring handle dye swap arrays for two color data?

While creating an experiment for two color data, GeneSpring allows you to load the data files and select the dye swapped arrays.
Ratio computation for two color data is done as follows:

Samples without dye swap:

Cy5(test) / Cy3(control)

Samples with dye swap:

Cy3(test) / Cy5(control)
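
As a minimal sketch of the ratio computation above (the function name and the boolean flag are hypothetical, chosen only for illustration):

```python
def two_color_ratio(cy5, cy3, dye_swapped=False):
    """Return the test/control ratio for one spot.

    Without dye swap the test sample is in Cy5 and the control in Cy3;
    with dye swap the channels are exchanged.
    """
    if dye_swapped:
        return cy3 / cy5  # Cy3(test) / Cy5(control)
    return cy5 / cy3      # Cy5(test) / Cy3(control)

# The same four-fold increase in the test sample, measured on a normal
# array and on its dye-swapped counterpart, yields the same ratio.
normal = two_color_ratio(cy5=400.0, cy3=100.0)
swapped = two_color_ratio(cy5=100.0, cy3=400.0, dye_swapped=True)
```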

Q12. When I have multiple controls in my RT-PCR experiment, how does GeneSpring calculate the control signal value for normalization?

If you have multiple endogenous controls, their Ct values are averaged (arithmetic mean). That average is then subtracted from the target Ct values for normalization.
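
A short sketch of this calculation, with a hypothetical function name and made-up Ct values:

```python
from statistics import mean

def normalize_ct(target_ct, control_cts):
    """Subtract the arithmetic mean of the endogenous control Ct values
    from the target Ct value."""
    return target_ct - mean(control_cts)

# Two endogenous controls with Ct 18 and 20 average to 19.
normalized_ct = normalize_ct(25.0, [18.0, 20.0])
```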

Q13. What does raw and normalized data in GeneSpring mean for Agilent two color technology?

The term raw signal value, used in the context of Agilent Two Color data, refers to the linear data after thresholding and summarization for the individual channels (Cy3 and Cy5).
Summarization is performed by computing the geometric mean.
The term Normalized signal value refers to the data after ratio computation, log transformation and Baseline Transformation.
In GeneSpring, the sequence of events involved in the processing of the Agilent two color text data files is: Thresholding → Summarization → dye swap → ratio computation → log transformation → Baseline Transformation.

Q14. Does GeneSpring output GC-RMA expression data in log2?

In GeneSpring, the intensity values obtained after the pre-processing steps are displayed as log base 2 values. These values are then used for the analysis.

Q15. Why does GeneSpring add the variance stabilization value 16 to the expression values for exon arrays but not for expression arrays?

For Exon arrays, the background subtraction is done on a pool of probes having similar GC content (which is not the case with expression arrays). This typically results in probe sets having small expression values, leading to an unreliable estimate of the variance. To offset this, an ad hoc value of 16 is added to the expression values of all the probe sets.

Q16. Why is the value 16, rather than some other value, added to the expression values of exon arrays for variance stabilization?

The value 16 is generally considered low enough to provide the required stabilization effect without suppressing true signal values. 8 and 32 are other commonly used options. Values smaller than 16 are usually due to noise, and values such as 8 and 16 would produce a fold change of 2 purely due to noise.
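
The effect of the offset can be illustrated with a small calculation. The function below is a hypothetical helper, not part of GeneSpring, and the signal values are made up.

```python
import math

def log2_fold_change(a, b, offset=0.0):
    """Log2 ratio of b to a after adding a constant offset to both."""
    return math.log2((b + offset) / (a + offset))

# Noise-level values 8 and 16: a two-fold change without the offset,
# much less once 16 is added to both values.
noise_plain = log2_fold_change(8, 16)               # log2(16/8)
noise_offset = log2_fold_change(8, 16, offset=16)   # log2(32/24)

# High signal values 1000 and 2000: the two-fold change is almost unaffected.
signal_offset = log2_fold_change(1000, 2000, offset=16)  # log2(2016/1016)
```

In this example the offset shrinks the noise-driven change from 1 to roughly 0.42 on the log2 scale, while the change between the two high values stays close to 1.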

Q17. Even after applying baseline transformation to the median of all samples, the results of my statistical analysis are the same, although the profile plot differs between baseline transformations. What is the reason for this?

Baseline transformation to the median of all samples does not change the results of statistical analysis, because the actual deviation between the conditions for a given entity does not change. The p-values therefore remain the same across all experiments, whatever the baseline transformation. Baseline transformation gives the user a better visualization when comparing two groups, without affecting the downstream analysis.
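
The following sketch illustrates why, for a single entity: subtracting the same median from every sample shifts both group means equally and leaves the variances untouched, so a test statistic such as Welch's t is unchanged. The function names and data are hypothetical, and Welch's t is used only as an example of a statistical test.

```python
import math
from statistics import mean, median, variance

def baseline_to_median(values):
    """Subtract the median across all samples from each sample's value."""
    m = median(values)
    return [v - m for v in values]

def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of values."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Log2 values of one entity: three control and three treated samples.
control = [7.0, 7.5, 8.0]
treated = [9.0, 9.5, 10.0]

shifted = baseline_to_median(control + treated)
t_before = welch_t(control, treated)
t_after = welch_t(shifted[:3], shifted[3:])
```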

Q18. We would like to skip 'Quantile Normalization' for Affymetrix Exon Expression data and normalize the data using some other normalization method. Is this possible in the current GeneSpring version?

Follow these steps to disable quantile normalization and select another normalization method.
1. Disable the "Perform Quantile Normalization" option under Tools → Options → Affymetrix Exon Summarization Algorithms → Exon PLIER/Iter PLIER → uncheck 'Perform Quantile Normalization'.
2. Create the Exon Expression experiment in GeneSpring.
3. After the data is loaded, export 'All Entities' using the right-click → Export entity list option.
4. Import it back in as a Generic Experiment (i.e. create a custom technology using the exported data).
Note: when you import the data back into GX11, it is already in log scale, so while creating the generic experiment you should explicitly select the option "Please select if your data is in log scale" so that log transformation is not performed on the data again.
5. Once you have your data as a custom experiment, you can perform any of the normalization methods available for Generic single color data.

Q19. When I threshold my raw signals to 0 in the import data I get missing values, whereas when I threshold them to 1 I do not. Why is this and what is the difference between thresholding the data to 1 or 0? How do I decide what to threshold the raw signals to?

Thresholding the data to 1 converts values that are less than 1 to 1. This is done because a value less than 1 can give a large negative value after log transformation.
An entity with the value 1 gives a value of 0 after log transformation in GeneSpring. However, when the threshold is 0, a value of 0 gives no value after log transformation, and hence empty boxes are observed.
So, if you would like to filter out entities with missing values, you could threshold to 0 and then go to Utilities and select “Remove entities with missing value”.
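
The difference between the two thresholds can be sketched as follows. This is an illustration only: the function name is hypothetical, log base 2 is assumed, and a missing value is represented here by None.

```python
import math

def threshold_and_log(raw, threshold=1.0):
    """Raise values below the threshold to the threshold, then take log2.

    Returns None when the thresholded value is 0, because log2(0) is
    undefined; this corresponds to an empty (missing) cell.
    """
    clipped = max(raw, threshold)
    if clipped <= 0:
        return None
    return math.log2(clipped)

low_at_1 = threshold_and_log(0.2, threshold=1.0)    # raised to 1, log2 gives 0
low_at_0 = threshold_and_log(-3.0, threshold=0.0)   # raised to 0, no log value
high_at_0 = threshold_and_log(8.0, threshold=0.0)   # above threshold, unchanged
```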

Q20. I would like to import pre-normalized data into GeneSpring. How should I do this?

Follow the steps below to import pre-normalized data:
Create a custom technology with the data file from:
GeneSpring Menu Bar → Annotations → Create Technology → Custom From File
Once the technology is created, create a new experiment with those files to import them into GeneSpring.
While creating the experiment, choose 'Generic Single color' as the Experiment type.
In step 2 of 4, check the option 'Please select if your data is in log scale' and set the Normalization algorithm to 'None'.
In step 4 of 4, choose the option 'Do not perform baseline transformation'.
The experiment is now created with the pre-normalized data.
