g:Profiler help

Statistical significance

Statistical significance is the notion used in hypothesis testing to reject or accept hypotheses. Generally, a null hypothesis H0 may be rejected in favour of an alternative hypothesis H1 if the observed events provide sufficient statistical evidence favouring H1 over H0. Statistical significance is expressed as a p-value: the probability, assuming H0 holds, of observing the given event or any more extreme event.

In graphical output, significant results are printed in black and insignificant results in gray. In textual output, significant p-values are preceded by an exclamation mark `!'.

In g:GOSt, the analysis tests the following pair of hypotheses:

g:GOSt uses Fisher's one-tailed test, also known as cumulative hypergeometric probability, as the p-value measuring the randomness of the observed intersection G&Q. The p-value is the probability of the observed intersection plus the probabilities of all larger, more extreme intersections. Generally, the smaller the p-value, the stronger the evidence that the match between the term and the input query is important and has not appeared by mere chance.
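The cumulative hypergeometric probability described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the g:GOSt implementation: G is taken to be the set of genes annotated to a term and Q the query gene list, and the function name `hypergeom_tail` and its parameters N (background size), K (size of G), n (size of Q), and k (size of G&Q) are hypothetical names chosen for this example.

```python
from math import comb

def hypergeom_tail(N, K, n, k):
    """One-tailed hypergeometric p-value P(X >= k).

    Assumed meaning of the parameters (illustrative names only):
      N: number of genes in the background
      K: number of background genes annotated to the term (size of G)
      n: number of genes in the query (size of Q)
      k: number of query genes annotated to the term (size of G&Q)
    """
    upper = min(K, n)      # the intersection cannot exceed either set
    total = comb(N, n)     # number of ways to draw a query of size n
    # Sum the observed intersection and every larger, more extreme one.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Hypothetical numbers: 20000 background genes, a term with 200 genes,
# a query of 100 genes, 10 of which are annotated to the term.
p_value = hypergeom_tail(20000, 200, 100, 10)
print(p_value)
```

The sum runs from the observed intersection size upward, which is what makes the test one-tailed: only overlaps at least as large as the observed one count as "more extreme".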

The null hypothesis may be rejected if the p-value is sufficiently small. The borderline between "normal" and "significantly small" is called the significance threshold. Traditional significance thresholds include 0.05, 0.01, and 0.001.

The problem of multiple testing occurs when an analysis involves several rounds of testing the same pairs of hypotheses. It is rather intuitive that in a long series of tests, sooner or later one may observe quite a good p-value that has actually occurred by random chance. Therefore it is reasonable to lower significance thresholds as the number of performed tests grows.
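A rough calculation illustrates how quickly the problem grows. The sketch below assumes that all tests are independent and that the null hypothesis is true in every one of them, which is a simplification (annotation terms overlap heavily in practice); the function name `prob_any_false_positive` is hypothetical.

```python
def prob_any_false_positive(alpha, m):
    """Chance of at least one p-value below alpha among m independent
    tests in which the null hypothesis is true every time."""
    return 1.0 - (1.0 - alpha) ** m

# With a threshold of 0.05, a single test is wrong 5% of the time,
# but a series of tests almost certainly yields a spurious "hit".
for m in (1, 10, 100, 1000):
    print(m, prob_any_false_positive(0.05, m))
```

Under these assumptions, the chance of at least one false positive exceeds 99% once a hundred tests are performed at the 0.05 threshold.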

Every analysis of a gene list in g:GOSt involves a series of comparisons, as the intersection G&Q and the corresponding p-value are calculated for a large number of terms from GO, KEGG, TRANSFAC, and other data sources. Since all of these p-values are compared against a threshold, the g:GOSt analysis involves multiple testing, which renders the traditional significance thresholds useless unless they are corrected.

g:GOSt uses multiple testing correction algorithms to distinguish significant results from random matches. These include Bonferroni correction, Benjamini-Hochberg False Discovery Rate, and the native g:GOSt method g:SCS. The latter method is used by default.

In multiple testing correction, all p-values in a series of tests are transformed to more conservative values based on the number and distribution of the initial p-values. In g:GOSt, the cutoff value applied after correction is 0.05, denoting the tolerated level of false positives in a normal g:GOSt analysis.
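The two standard corrections named above can be sketched as follows. These are textbook versions of Bonferroni and Benjamini-Hochberg adjustment, not the code used by g:GOSt, and the function names are hypothetical. The default g:SCS method is not shown, because its procedure is not described in this text.

```python
def bonferroni(p_values):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]   # made-up p-values for illustration
cutoff = 0.05
for name, corrected in (("Bonferroni", bonferroni(raw)),
                        ("Benjamini-Hochberg", benjamini_hochberg(raw))):
    significant = [p <= cutoff for p in corrected]
    print(name, corrected, significant)
```

Both functions only ever raise p-values, which is the sense in which the corrected values are more conservative. Bonferroni depends only on the number of tests, while Benjamini-Hochberg also depends on how the p-values are distributed.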