MetaRef Database v 1.0
MetaRef is a resource to comprehensively catalog and characterize
clade-specific microbial genes. We identify and provide
all core genes associated with all microbial species and genera with
available reference genomes (final or draft). A subset of
these gene families are consistently present in one or more
taxonomic clades, which allows us to further designate them
as marker genes.
The MetaRef paper is now available on PubMed.
Usage and Examples
For tutorials on the MetaRef website, please visit our Help page.
For example, you can look at the core gene families
and marker gene families of
Staphylococcus aureus, as well as download
all genes in its pan-genome with the
corresponding functional associations.
When focusing on a specific gene family in this clade (e.g. this enzyme),
you can access:
- the aggregated functional annotations (see for example this beta-lyase that has
been flagged as hypothetical in 7 genomes).
- the prevalence and abundance in the human microbiome (most S. aureus core
families are present in the nose of about one quarter of the healthy
population, see for example 646054415 or 645660926).
- the KEGG, COG, and EC consensus annotation.
- the browsable phylogenetic distribution of the family in the whole microbial
tree of life (with the linkable static version).
- all of the above information in FASTA or in tab-delimited, functionally annotated format.
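
As a minimal sketch of working with a FASTA download, the Python below reads records into a dictionary keyed by header. The header layout (a bare identifier after ">") is an assumption, not something the MetaRef site documents here. The function name read_fasta is hypothetical, and the sequences in the example are placeholders rather than real MetaRef data.

```python
import io
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without the leading '>') to its full sequence."""
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them for joining.
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical two-record file. The headers reuse identifiers mentioned
# above; the sequences are made-up placeholders, not MetaRef data.
example = io.StringIO(
    ">646054415\nMKT\nAIL\n"
    ">645660926\nMSE\n"
)
records = read_fasta(example)
for identifier, sequence in records.items():
    print(identifier, len(sequence))
```

To read a real download, pass an open file handle instead of the StringIO object, since any iterable of lines works.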
Genomic Data Info
# Number of microbial genomes: 2,706 bacterial and 112 archaeal genomes
# Number of genes: 10,880,874
# Number of Non-redundant MetaRef Gene Families: 5,006,295 (3,798,644 are single-gene clusters)
# Number of Core Gene Families (at species level and above): 3,600,814 (2,607,806 are single-gene clusters)
# Number of Marker Gene Families (at species level and above): 1,028,534 (880,332 are single-gene clusters)
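
The counts above can be turned into proportions with a few lines of arithmetic. The Python sketch below is only an illustration: the numbers are copied from the list above, while the dictionary and function names are hypothetical.

```python
# (total families, single-gene clusters), copied from the figures above.
FAMILY_COUNTS = {
    "non-redundant": (5_006_295, 3_798_644),
    "core": (3_600_814, 2_607_806),
    "marker": (1_028_534, 880_332),
}


def singleton_share(total: int, singletons: int) -> float:
    """Fraction of gene families that consist of a single gene."""
    return singletons / total


def multi_gene_families(total: int, singletons: int) -> int:
    """Number of gene families that contain more than one gene."""
    return total - singletons


for name, (total, singletons) in FAMILY_COUNTS.items():
    share = singleton_share(total, singletons)
    multi = multi_gene_families(total, singletons)
    print(f"{name}: {share:.1%} single-gene, {multi:,} multi-gene families")
```

Single-gene clusters make up roughly 76% of the non-redundant families, 72% of the core families, and 86% of the marker families.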
Metagenomic Data Info
# Number of Metagenomes: 691 samples
# Number of Human Body Sites Covered: 6 sites
# Number of Human Contaminant-Screened Reads: >35 billion