
Play 'guess the dataset'

OK, now that you are an expert in different types of data, it's time to test yourself.

To get started

  1. First, select all the tracks on the left and click 'Remove Track' to get rid of the existing data tracks.

  2. Next, I suggest starting by pointing your IGV session at the gene SPHK2 ([Sphingosine Kinase 2](https://en.wikipedia.org/wiki/SPHK2)), which "mediates many cellular processes including migration, proliferation and apoptosis, and also plays a role in several types of cancer by promoting angiogenesis and tumorigenesis".

Note

You can also look elsewhere in the genome, of course, but remember that these data are only present in a limited set of gene regions.

  3. Lastly, point your web browser at this folder:
https://www.chg.ox.ac.uk/bioinformatics/training/gms/data/sequence_data_sightseeing_tour/quiz/
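
If you would rather paste file URLs into IGV's load-from-URL option than click through the folder, the following is a minimal sketch that prints one URL per quiz dataset. It assumes the files sit directly inside the folder above and are named `1.bam` to `7.bam`, as the challenge text suggests. The function name `quiz_bam_urls` is made up for illustration, and whether index files are available alongside the BAMs is not something this sketch checks.

```python
# Folder URL copied from the tutorial text.
BASE_URL = (
    "https://www.chg.ox.ac.uk/bioinformatics/training/gms/data/"
    "sequence_data_sightseeing_tour/quiz/"
)


def quiz_bam_urls(base_url=BASE_URL, n_datasets=7):
    """Return the URLs for 1.bam ... <n_datasets>.bam under base_url.

    Assumption: the files are named <number>.bam and live directly
    in the folder (hypothetical helper, not part of the tutorial).
    """
    base = base_url.rstrip("/") + "/"  # ensure exactly one trailing slash
    return [f"{base}{i}.bam" for i in range(1, n_datasets + 1)]


if __name__ == "__main__":
    for url in quiz_bam_urls():
        print(url)
```

Each printed line can then be loaded into IGV as a separate track.
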
Challenge

For each dataset from 1.bam to 7.bam, decide what type of data it is. Is it:

A. Illumina short-read genome sequence data?
B. Illumina short-read RNA-seq data?
C. Illumina short-read ATAC-seq data?
D. PacBio long-read genome sequence data?
E. PacBio long-read RNA-seq data?
F. Nanopore long-read genome sequence data?
G. Illumina short-read sequencing of 10X 'linked read' molecules?

Good luck!