Towards the biogeography of prokaryotic genes
Coelho, Luis Pedro; Alves, Renato; del Rio, Alvaro Rodriguez; Myers, Pernille Neve; Cantalapiedra, Carlos P.; Giner-Lamia, Joaquin; Schmidt, Thomas Sebastian; Mende, Daniel R.; Orakov, Askarbek; Letunic, Ivica; Hildebrand, Falk; Van Rossum, Thea; Forslund, Sofia K.; Khedkar, Supriya; Maistrenko, Oleksandr M.; Pan, Shaojun; Jia, Longhao; Ferretti, Pamela; Sunagawa, Shinichi; Zhao, Xing-Ming; Nielsen, Henrik Bjorn; Huerta-Cepas, Jaime; Bork, Peer
Publication: NATURE
2021
Volume: 601
Abstract
Microbial genes encode the majority of the functional repertoire of life on Earth. However, despite increasing efforts in metagenomic sequencing of various habitats (1-3), little is known about the distribution of genes across the global biosphere, with implications for human and planetary health. Here we constructed a non-redundant gene catalogue of 303 million species-level genes (clustered at 95% nucleotide identity) from 13,174 publicly available metagenomes across 14 major habitats and used it to show that most genes are specific to a single habitat. The small fraction of genes found in multiple habitats is enriched in antibiotic-resistance genes and markers for mobile genetic elements. By further clustering these species-level genes into 32 million protein families, we observed that a small fraction of these families contains the majority of the genes (0.6% of families account for 50% of the genes). The majority of species-level genes and protein families are rare. Furthermore, species-level genes, and in particular the rare ones, show low rates of positive (adaptive) selection, supporting a model in which most genetic variability observed within each protein family is neutral or nearly neutral.
Biology & Biochemistry