TIGRFAMs is a database of protein families designed to support manual and automated genome annotation.[1][2][3] Each entry includes a multiple sequence alignment and a hidden Markov model (HMM) built from that alignment. A sequence that scores above the cutoffs defined for a given TIGRFAMs HMM is assigned to that protein family and may receive the corresponding annotations. Most models describe protein families found in Bacteria and Archaea.
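
The cutoff-based assignment described above can be illustrated with a minimal Python sketch. It assumes that each search hit has already been reduced to a sequence identifier, a model identifier, and a bit score, and that each model has a single score cutoff. The names `Hit`, `assign_families`, `MODEL_A`, and `MODEL_B`, and all numeric values, are hypothetical and are not taken from TIGRFAMs itself. Treating a score equal to the cutoff as passing is also an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One sequence-versus-model search result (hypothetical structure)."""
    sequence_id: str
    model_id: str
    bit_score: float


def assign_families(hits, cutoffs):
    """Map each sequence ID to the models whose cutoff its score meets.

    `cutoffs` maps a model ID to its minimum accepted bit score.
    Hits against models with no known cutoff are skipped.
    """
    assignments = {}
    for hit in hits:
        cutoff = cutoffs.get(hit.model_id)
        if cutoff is None:
            continue  # no cutoff defined for this model, so no assignment
        # Assumption: a score equal to the cutoff counts as passing.
        if hit.bit_score >= cutoff:
            assignments.setdefault(hit.sequence_id, []).append(hit.model_id)
    return assignments


# Hypothetical cutoffs and scores, for illustration only.
EXAMPLE_CUTOFFS = {"MODEL_A": 250.0, "MODEL_B": 80.0}
EXAMPLE_HITS = [
    Hit("seq1", "MODEL_A", 312.4),  # above MODEL_A cutoff
    Hit("seq1", "MODEL_B", 41.0),   # below MODEL_B cutoff
    Hit("seq2", "MODEL_B", 95.7),   # above MODEL_B cutoff
    Hit("seq3", "MODEL_A", 120.0),  # below MODEL_A cutoff
]

if __name__ == "__main__":
    for seq_id, models in assign_families(EXAMPLE_HITS, EXAMPLE_CUTOFFS).items():
        print(seq_id, models)
```

In this sketch a sequence that passes no cutoff is simply absent from the result, which mirrors the idea that it receives no family assignment from these models.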

Like Pfam, TIGRFAMs uses the HMMER package written by Sean Eddy.[4]

History

TIGRFAMs was originally produced at The Institute for Genomic Research (TIGR) and its successor, the J. Craig Venter Institute (JCVI), and moved to the National Center for Biotechnology Information (NCBI) in April 2018. TIGRFAMs remains a member database of InterPro. The last version from JCVI, release 15.0, contained 4488 models. TIGRFAMs now continues at NCBI as part of a larger collection of HMMs, called NCBIFAMs, used in its RefSeq and PGAP genome annotation pipelines.[5] Active curation and revision of existing TIGRFAMs models continue at NCBI, but the creation of new TIGRFAMs models per se has ended, because HMMs newly constructed by the RefSeq group receive different designations when added to NCBIFAMs.

References

  1. Haft, DH; Selengut, JD; White, O (2003). "The TIGRFAMs database of protein families". Nucleic Acids Research. 31 (1): 371–3. doi:10.1093/nar/gkg128. PMC 165575. PMID 12520025.
  2. Selengut, JD; Haft, DH; Davidsen, T; Ganapathy, A; Gwinn-Giglio, M; Nelson, WC; Richter, AR; White, O (2007). "TIGRFAMs and Genome Properties: Tools for the assignment of molecular function and biological process in prokaryotic genomes". Nucleic Acids Research. 35 (Database issue): D260–4. doi:10.1093/nar/gkl1043. PMC 1781115. PMID 17151080.
  3. Haft, DH; Selengut, JD; Richter, RA; Harkins, DM; Basu, MK; Beck, E (2012). "TIGRFAMs and Genome Properties in 2013". Nucleic Acids Research. 41 (Database issue): D387–95. doi:10.1093/nar/gks1234. PMC 3531188. PMID 23197656.
  4. Eddy, SR (2009). "A new generation of homology search tools based on probabilistic inference". Genome Informatics. International Conference on Genome Informatics. 23 (1): 205–11. PMID 20180275.
  5. Li, W; O'Neill, KR; Haft, DH; DiCuccio, M; Chetvernin, V; Badretdin, A; et al. (2021). "RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation". Nucleic Acids Research. 49 (D1): D1020–D1028. doi:10.1093/nar/gkaa1105. PMC 7779008. PMID 33270901.

