g:Profiler help

g:SCS algorithm

The g:SCS method is the default multiple testing correction for p-values obtained from GO and pathway enrichment analysis. It corresponds to an experiment-wide threshold of a=0.05, i.e. at least 95% of matches above threshold are statistically significant.

This approach is based on the observation that standard multiple testing corrections, such as the Bonferroni correction and the Benjamini-Hochberg False Discovery Rate, are designed for multiple tests that are independent of each other. This assumption does not hold for the analysis in g:GOSt, since GO consists of hierarchically related general and specific terms. The True Path Rule of GO states that genes associated with a given GO term t are implicitly associated with all more general parents of term t.
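The dependence introduced by the True Path Rule can be illustrated with a small sketch. The function name, the toy hierarchy and the gene names below are hypothetical and are not part of g:Profiler; the sketch only shows how propagating annotations to parent terms makes the gene sets of related terms overlap, so that tests on them are not independent.

```python
from collections import defaultdict


def propagate_annotations(direct, parents):
    """Apply the True Path Rule: a gene annotated to a term is also
    annotated to every more general ancestor of that term."""
    full = defaultdict(set)
    for term, genes in direct.items():
        stack = [term]
        seen = set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            full[current] |= genes
            # Walk upwards to all parents of the current term.
            stack.extend(parents.get(current, ()))
    return dict(full)


# Hypothetical toy hierarchy: "specific" -> "general" -> "root"
parents = {"specific": ["general"], "general": ["root"]}
direct = {"specific": {"geneA"}, "general": {"geneB"}}

annotations = propagate_annotations(direct, parents)
for term in ("specific", "general", "root"):
    print(term, sorted(annotations[term]))
```

After propagation, the gene set of "specific" is contained in that of "general", which in turn is contained in that of "root". An enrichment in the specific term therefore tends to appear in its parents as well.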

The g:SCS threshold is pre-calculated for query list sizes of up to 1000 genes. Given a fixed input query size, g:SCS analytically approximates a threshold t corresponding to the 5% upper quantile of randomly generated queries of that size. All actual p-values resulting from the query are transformed to corrected p-values by multiplying them by the ratio of the initial experiment-wide threshold a=0.05 to the approximate threshold t, so that a raw p-value equal to t maps to a corrected p-value of 0.05.
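The transformation of p-values can be sketched as follows. This is a minimal illustration, not the g:Profiler implementation. It assumes the ratio is taken as a/t, so that a raw p-value equal to the threshold t maps to a=0.05. The function name, the example threshold of 0.001 and the capping of corrected values at 1 are assumptions made for the example.

```python
ALPHA = 0.05  # experiment-wide threshold a from the text


def correct_p_values(p_values, threshold_t, alpha=ALPHA):
    """Rescale raw p-values so that a raw value equal to threshold_t
    becomes exactly alpha (assumed direction of the ratio: alpha / t)."""
    if not 0 < threshold_t <= 1:
        raise ValueError("threshold_t must lie in (0, 1]")
    factor = alpha / threshold_t
    # Capping at 1.0 is an assumption, so results stay valid probabilities.
    return [min(1.0, p * factor) for p in p_values]


# Hypothetical threshold for some query size; real values are pre-calculated.
example_t = 0.001
raw = [0.0005, 0.001, 0.01, 0.5]
corrected = correct_p_values(raw, example_t)
significant = [c <= ALPHA for c in corrected]
print(corrected, significant)
```

With this scaling, comparing a corrected p-value against 0.05 is equivalent to comparing the raw p-value against t.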

The algorithm considers the set structure underlying the gene sets annotated to the terms of each organism, and should therefore give a tighter threshold for significant results. g:SCS thresholds agreed perfectly with simulations based on randomly generated gene sets of fixed input query sizes.
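A simulation of this kind might look like the sketch below. It is an illustration under stated assumptions, not the procedure used to validate g:SCS. It assumes that each random query is scored by its smallest hypergeometric p-value over all terms, and that the empirical threshold is the 5% quantile of those minima. The function names, the toy gene universe and the term sets are hypothetical.

```python
import random
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K belong to the term."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def empirical_threshold(term_sets, universe, query_size,
                        n_queries=200, alpha=0.05, seed=0):
    """Estimate the alpha-quantile of the minimum p-value over all terms
    for random queries of a fixed size (assumed reading of the simulation)."""
    rng = random.Random(seed)
    N = len(universe)
    minima = []
    for _ in range(n_queries):
        query = set(rng.sample(universe, query_size))
        best = min(
            hypergeom_tail(len(query & genes), N, len(genes), query_size)
            for genes in term_sets.values()
        )
        minima.append(best)
    minima.sort()
    return minima[int(alpha * n_queries)]


# Hypothetical toy data: 100 genes, two nested terms and one unrelated term.
universe = [f"g{i}" for i in range(100)]
term_sets = {
    "general": set(universe[:40]),
    "specific": set(universe[:10]),
    "other": set(universe[50:70]),
}
t_estimate = empirical_threshold(term_sets, universe, query_size=10)
print(t_estimate)
```

Because the nested terms share genes, their p-values are correlated, and an estimate obtained this way reflects that set structure rather than treating every term as an independent test.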