Data information

  To construct a Korean control database, we collected whole-exome sequencing (WES) and whole-genome sequencing (WGS) data generated by multiple projects. The samples originated from normal tissue of cancer patients (40.16 %), healthy parents of rare disease patients (28.4 %), or healthy volunteers (31.44 %). Raw reads from 6,654 sequencing datasets (4,258 WES and 2,396 WGS) were collected, processed, and filtered according to criteria derived from our previous experience and from other studies. After filtering out 1,349 samples (20.27 %), variants from the remaining 5,305 samples (3,409 WES and 1,896 WGS) were used for subsequent analyses. The mean coverage depth of the runs was 100x for WES and 30x for WGS. A total of 40,414,379 SNVs (874,026 coding and 39,540,353 noncoding) and 2,888,275 indels (37,663 coding and 2,850,612 noncoding) were called and annotated.
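The counts above can be cross-checked with simple arithmetic. The following is a minimal sketch that only recomputes the totals and the filtering rate from the figures quoted in this section; the variable names are illustrative and are not part of the database or its pipeline.

```python
# Figures copied from the paragraph above; all names here are illustrative.
collected = {"WES": 4258, "WGS": 2396}   # sequencing datasets collected
retained = {"WES": 3409, "WGS": 1896}    # samples kept after filtering

total_collected = sum(collected.values())
total_retained = sum(retained.values())
filtered_out = total_collected - total_retained
filtered_pct = round(100 * filtered_out / total_collected, 2)

# Sample origin shares (percent) as stated in the text.
origin_pct = {"cancer_normal_tissue": 40.16, "healthy_parents": 28.4, "healthy_volunteers": 31.44}
origin_total = round(sum(origin_pct.values()), 2)

# Variant counts split into coding and noncoding.
snvs = {"coding": 874026, "noncoding": 39540353}
indels = {"coding": 37663, "noncoding": 2850612}
total_snvs = sum(snvs.values())
total_indels = sum(indels.values())

print(f"collected={total_collected}, retained={total_retained}, "
      f"filtered={filtered_out} ({filtered_pct} %)")
print(f"SNVs={total_snvs}, indels={total_indels}, origin share sum={origin_total} %")
```

The recomputed totals agree with the figures reported in the text: 6,654 collected, 5,305 retained, 1,349 filtered out (20.27 %), 40,414,379 SNVs, and 2,888,275 indels.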

Download