Data Processing:
Illumina CASAVA 1.8.2 software was used for basecalling. Sequenced reads were trimmed for adaptor sequence, masked for low-complexity or low-quality sequence, and then mapped to the hg19 whole genome using TopHat2 with --segment-mismatches 2 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 -r 70 --library-type fr-unstranded. Reads per kilobase of exon per million mapped reads (RPKM) were calculated following the protocol of Chepelev et al., Nucleic Acids Research, 2009. In short, exons from all isoforms of a gene were merged to create one meta-transcript. The number of reads falling in the exons of this meta-transcript was counted and normalized by the size of the meta-transcript and by the size of the library.
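The meta-transcript RPKM calculation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes half-open exon coordinates, assigns each read by a single position (for example its start), and takes the library size to be the total number of mapped reads. The function names (merge_exons, count_reads, rpkm) and the example numbers are hypothetical.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), an assumption of this sketch


def merge_exons(exons: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent exons from all isoforms into one meta-transcript."""
    merged: List[Interval] = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def meta_transcript_length(merged: List[Interval]) -> int:
    """Total exonic length of the meta-transcript in base pairs."""
    return sum(end - start for start, end in merged)


def count_reads(merged: List[Interval], read_positions: Iterable[int]) -> int:
    """Count reads whose position falls inside any merged exon.

    Assigning a read by one position is a simplification made here.
    """
    return sum(
        any(start <= pos < end for start, end in merged)
        for pos in read_positions
    )


def rpkm(read_count: int, length_bp: int, library_size: int) -> float:
    """Reads per kilobase of meta-transcript per million reads in the library."""
    return read_count * 1e9 / (length_bp * library_size)


# Hypothetical gene with two isoforms sharing an overlapping exon.
isoform_exons = [(100, 200), (500, 600), (150, 300)]
merged = merge_exons(isoform_exons)           # [(100, 300), (500, 600)]
length_bp = meta_transcript_length(merged)    # 300
reads_in_gene = count_reads(merged, [120, 250, 550, 400])  # 3 (400 is intronic)
value = rpkm(reads_in_gene, length_bp, library_size=1_000_000)
```

Merging exons first means that bases shared between isoforms are counted once in the length, so the normalization reflects the union of exonic sequence rather than the sum over isoforms.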