Q: There has been an update of g:Profiler and my analysis no longer yields the same results as earlier. Why is that?
A: It is quite common for results to change between g:Profiler / Ensembl releases (we strive for a g:Profiler release in sync with every Ensembl release). For this reason, we provide archived instances of recent g:Profiler releases. If you would rather not use the archives, you can avoid relying on the default multiple testing correction: request all results (uncheck "Significant only") and set a custom p-value threshold in Advanced options. Alternatively, you can select a different multiple testing correction in Advanced options (FDR and Bonferroni are available; FDR is the more lenient of the two). Note that we also provide all annotations in GMT format (Advanced options -> Download g:Profiler data as GMT), in case you wish to run a custom downstream analysis.
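For a custom downstream analysis, the GMT download can be read with a few lines of Python. This is only a minimal sketch: it assumes the usual GMT layout (one gene set per line, tab-separated, with the set name first, a description second and the member genes after that), and the term and gene names in the example are made up for illustration.

```python
import io


def parse_gmt(handle):
    """Parse GMT text into {term_id: (description, set_of_genes)}.

    Assumes the usual tab-separated layout: name, description, genes...
    """
    terms = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip malformed lines that lack a description column
        term_id, description, *genes = fields
        terms[term_id] = (description, {g for g in genes if g})
    return terms


# Hypothetical two-term input; a real file would be opened with open(path).
example = (
    "TERM:1\tfirst example term\tGENE_A\tGENE_B\n"
    "TERM:2\tsecond example term\tGENE_B\tGENE_C\tGENE_D\n"
)
terms = parse_gmt(io.StringIO(example))
for term_id, (description, genes) in terms.items():
    print(term_id, description, len(genes))
```

The resulting dictionary of gene sets can then be passed to whatever enrichment test and p-value threshold you choose.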
Q: Which identifiers does g:Profiler accept? Why are my identifiers not recognized?
A: g:Profiler relies on Ensembl to provide all identifier namespaces, hence it only recognizes identifiers present in Ensembl. To make sure that g:Profiler can parse your input list, you may pass it through the g:Convert tool. All available identifier namespaces are listed in the "Target database" dropdown menu in g:Convert.
Q: Can I use g:Profiler to analyse an organism that is not in the list? Can I use my own annotations as input to g:Profiler?
A: Unfortunately, this is not possible. g:Profiler is based on Ensembl and can only use organisms present in its database. The annotations have been built into static, high performance indexes backing g:Profiler and cannot be modified at runtime.
Q: I plug the population size, sample size, population successes and sample successes into R's phyper function, but receive a different p-value than what g:Profiler outputs for the term. Why?
A: This is because g:GOSt runs multiple testing correction after the analysis to discard false positives. After finding a significance threshold T using the selected multiple testing algorithm (the custom g:SCS by default), all p-values are scaled so that the threshold is constant at 0.05. For each p-value Px:
Px_scaled = 0.05 * Px / T
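The difference can be illustrated with a small Python sketch. It computes an unscaled hypergeometric upper-tail p-value from the four counts and then applies the scaling described above. The counts and the threshold value are invented for illustration, and the threshold that g:GOSt actually derives (for example with g:SCS) is not reproduced here; the function names are hypothetical as well.

```python
from math import comb


def hypergeom_upper_tail(pop_size, pop_successes, sample_size, sample_successes):
    """P(X >= sample_successes) for a hypergeometric X.

    This corresponds to the upper tail that an enrichment test uses.
    """
    upper = min(pop_successes, sample_size)
    total = comb(pop_size, sample_size)
    favourable = sum(
        comb(pop_successes, k) * comb(pop_size - pop_successes, sample_size - k)
        for k in range(sample_successes, upper + 1)
    )
    return favourable / total


def scale_p_value(p, threshold, target=0.05):
    """Rescale p so that a p-value equal to `threshold` maps to `target`."""
    return target * p / threshold


# Hypothetical counts: 20 genes in total, 5 annotated to the term,
# 4 genes in the query, 2 of them annotated to the term.
p_raw = hypergeom_upper_tail(20, 5, 4, 2)

# Hypothetical threshold; the real value depends on the correction method.
T = 0.01
p_scaled = scale_p_value(p_raw, T)
print(p_raw, p_scaled)
```

With a threshold below 0.05 the scaled value is larger than the raw one, which is why the number reported for a term differs from a direct phyper calculation.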
Q: What are some of the columns in g:GOSt output?
A: n. of term genes (graphical) / n. of query genes / n. of common genes / Q&T: These fields contain the number of genes in a term, the number of recognized genes in the input query, and the number of genes in the overlap of the two. If a custom statistical background has been passed, these sets have been intersected with the background, so each value is smaller than or equal to the value without a background.
t type: This field is mostly useful for GO results. It denotes the GO branch of the term: either BP (biological process), CC (cellular component) or MF (molecular function).
t group (textual only): Subgraph number. Since GO is essentially a large DAG (directed acyclic graph), any g:Profiler result can be segmented into a number of subgraphs. Each is assigned a successive integer.
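The effect of a custom background on the term, query and overlap counts described above can be sketched in Python as follows. The function name and the gene identifiers are hypothetical, and this only illustrates the set intersection, not how g:GOSt is implemented.

```python
def overlap_counts(term_genes, query_genes, background=None):
    """Return (term size, query size, overlap size).

    If a background is given, both sets are first restricted to it.
    """
    term = set(term_genes)
    query = set(query_genes)
    if background is not None:
        bg = set(background)
        term &= bg
        query &= bg
    return len(term), len(query), len(term & query)


# Hypothetical gene sets.
term_genes = {"A", "B", "C", "D"}
query_genes = {"B", "C", "E"}
background = {"A", "B", "E"}

without_bg = overlap_counts(term_genes, query_genes)
with_bg = overlap_counts(term_genes, query_genes, background)
print(without_bg, with_bg)
```

Every count computed with the background is at most the corresponding count computed without it, because intersecting a set can only remove elements.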
Q: I receive intermittent HTTP 5xx errors when using the web interface or R / Python packages with long-running queries. What's the problem?
A: If g:Profiler's example queries work, but you sometimes receive HTTP 502 or other 5xx series errors, the issue is most likely a server timeout. We do not enforce any rules on query size or other input parameters, but a time limit of around 10-15 minutes applies per analysis to avoid overloading the server. Unfortunately, the current stable g:Profiler does not communicate this limit clearly to the user, but it is usually the reason behind HTTP 5xx errors after long-running queries.