Non-Coding RNA Analysis Using the Rfam Database
Ioanna Kalvari
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
Eric P. Nawrocki
National Center for Biotechnology Information, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland
Joanna Argasinska
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
Natalia Quinones-Olvera
Systems Biology Graduate Program, Harvard University, Cambridge, Massachusetts
Robert D. Finn
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
Alex Bateman
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
Anton I. Petrov
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
Abstract
Rfam is a database of non-coding RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. Using a combination of manual and literature-based curation and a custom software pipeline, Rfam converts descriptions of RNA families found in the scientific literature into computational models that can be used to annotate RNAs belonging to those families in any DNA or RNA sequence. Valuable research outputs that are often locked up in figures and supplementary information files are encapsulated in Rfam entries and made accessible through the Rfam Web site. The data produced by Rfam have broad applications, from genome annotation to providing training sets for algorithm development. This article gives an overview of how to search and navigate the Rfam Web site, and how to annotate sequences with RNA families. The Rfam database is freely available at http://rfam.org. © 2018 by John Wiley & Sons, Inc.
Supporting Information
Filename | Size | Description |
---|---|---|
batch-search-example.fasta | 13.7 KB | |
my.fa | 17.1 MB | |
sequence-search-example.fa | 98 B | |
Please note: The publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.
- Aken, B. L., Achuthan, P., Akanni, W., Amode, M. R., Bernsdorff, F., Bhai, J., … Flicek, P. (2017). Ensembl 2017. Nucleic Acids Research, 45(D1), D635–D642. doi: 10.1093/nar/gkw1104.
- Barquist, L., Burge, S. W., & Gardner, P. P. (2016). Studying RNA homology and conservation with infernal: From single sequences to RNA families. Current Protocols in Bioinformatics, 54, 12.13.1–12.13.25. doi: 10.1002/cpbi.4.
- Bernhart, S. H., Hofacker, I. L., Will, S., Gruber, A. R., & Stadler, P. F. (2008). RNAalifold: Improved consensus structure prediction for RNA alignments. BMC Bioinformatics, 9, 474. doi: 10.1186/1471-2105-9-474.
- Cech, T. R., & Steitz, J. A. (2014). The noncoding RNA revolution—trashing old rules to forge new ones. Cell, 157(1), 77–94. doi: 10.1016/j.cell.2014.03.008.
- Federhen, S. (2012). The NCBI Taxonomy database. Nucleic Acids Research, 40(Database issue), D136–D143. doi: 10.1093/nar/gkr1178.
- Gardner, P. P., Daub, J., Tate, J., Moore, B. L., Osuch, I. H., Griffiths-Jones, S., … Bateman, A. (2011). Rfam: Wikipedia, clans and the “decimal” release. Nucleic Acids Research, 39(suppl_1), D141–D145. doi: 10.1093/nar/gkq1129.
- Gardner, P. P., & Eldai, H. (2015). Annotating RNA motifs in sequences and alignments. Nucleic Acids Research, 43(2), 691–698. doi: 10.1093/nar/gku1327.
- Kalvari, I., Argasinska, J., Quinones-Olvera, N., Nawrocki, E. P., Rivas, E., Eddy, S. R., … Petrov, A. I. (2018). Rfam 13.0: Shifting to a genome-centric resource for non-coding RNA families. Nucleic Acids Research, 46, D335–D342. doi: 10.1093/nar/gkx1038.
- Kortmann, J., & Narberhaus, F. (2012). Bacterial RNA thermometers: Molecular zippers and switches. Nature Reviews. Microbiology, 10(4), 255–265. doi: 10.1038/nrmicro2730.
- Lai, D., Proctor, J. R., Zhu, J. Y. A., & Meyer, I. M. (2012). R-CHIE: A web server and R package for visualizing RNA secondary structures. Nucleic Acids Research, 40(12), e95. doi: 10.1093/nar/gks241.
- McCown, P. J., Corbino, K. A., Stav, S., Sherlock, M. E., & Breaker, R. R. (2017). Riboswitch diversity and distribution. RNA, 23(7), 995–1011. doi: 10.1261/rna.061234.117.
- Nawrocki, E. P., Burge, S. W., Bateman, A., Daub, J., Eberhardt, R. Y., Eddy, S. R., … Finn, R. D. (2015). Rfam 12.0: Updates to the RNA families database. Nucleic Acids Research, 43(Database issue), D130–D137. doi: 10.1093/nar/gku1063.
- Nawrocki, E. P., & Eddy, S. R. (2013). Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics, 29(22), 2933–2935. doi: 10.1093/bioinformatics/btt509.
- Quinlan, A. R. (2014). BEDTools: The swiss-army tool for genome feature analysis. Current Protocols in Bioinformatics, 47, 11.12.1–11.12.34. doi: 10.1002/0471250953.bi1112s47.
- Rivas, E., Clements, J., & Eddy, S. R. (2017). A statistical test for conserved RNA structure shows lack of evidence for structure in lncRNAs. Nature Methods, 14(1), 45–48. doi: 10.1038/nmeth.4066.
- The RNAcentral Consortium (2017). RNAcentral: A comprehensive database of non-coding RNA sequences. Nucleic Acids Research, 45(D1), D128–D134. doi: 10.1093/nar/gkw1008.
- Weinberg, Z., Lünse, C. E., Corbino, K. A., Ames, T. D., Nelson, J. W., Roth, A., … Breaker, R. R. (2017). Detection of 224 candidate structured RNAs by comparative analysis of specific subsets of intergenic regions. Nucleic Acids Research, 45(18), 10811–10823. Retrieved from http://academic.oup.com/nar/article/4080188. doi: 10.1093/nar/gkx699.
Key References
- Kalvari et al. (2018). See above.
  Describes Rfam release 13.0, which introduced a genome-centric sequence database.
- Nawrocki et al. (2015). See above.
  Describes Rfam release 12.0, including the addition of RNA motifs to Rfam and the migration to Infernal 1.1.
- Gardner et al. (2011). See above.
  Describes the use of Wikipedia for creating family descriptions and the introduction of Rfam clans.
- Griffiths-Jones, S., Bateman, A., Marshall, M., Khanna, A., & Eddy, S. R. (2003). Rfam: An RNA family database. Nucleic Acids Research, 31(1), 439–441. doi: 10.1093/nar/gkg006.
  Describes the first version of Rfam.
- Nawrocki, E. P. (2014). Annotating functional RNAs in genomes using Infernal. Methods in Molecular Biology, 1097, 163–197. doi: 10.1007/978-1-62703-709-9_9.
  Describes using Infernal to annotate genomes with non-coding RNAs using the Rfam database.
- Barquist et al. (2016). See above.
  Describes building RNA families using Infernal and introduces related tools and workflows.
Internet Resources
- http://rfam.org
  Rfam database.
- http://rfam.org/help
  Rfam help and documentation.
- http://ftp.ebi.ac.uk/pub/databases/Rfam/
  Rfam FTP archive.
- http://eddylab.org/infernal
  Infernal homepage and User's Guide.