Technological advances have enabled the use of DNA sequencing as a flexible tool to characterize genetic variation and to measure the activity of diverse cellular phenomena such as gene isoform expression and transcription factor binding. Extracting biological insight from the experiments enabled by these advances demands the analysis of large, multi-dimensional datasets. This unit describes the use of the BEDTools toolkit for the exploration of high-throughput genomics datasets. Several protocols are presented for common genomic analyses, demonstrating how simple BEDTools operations may be combined to create bespoke pipelines addressing complex questions. Curr. Protoc. Bioinform. 47:11.12.1-11.12.34. © 2014 by John Wiley & Sons, Inc.