Detection of rare disease-related genetic variants using the birthday model

Berstein, Yael, McCarthy, Shane E, Kramer, Melissa, McCombie, W Richard (November 2018) Detection of rare disease-related genetic variants using the birthday model. bioRxiv. ISSN 2692-8205 (Submitted)

10.1101.464842.pdf - Submitted Version
Available under License Creative Commons Attribution Non-commercial.

Abstract

Motivation: Exome sequencing is a powerful technique for identifying disease-causing genes, and a number of Mendelian disease genes have been identified with it. However, it remains a challenge to leverage exome sequencing for the study of complex disorders, such as schizophrenia and bipolar disorder, because of their genetic and phenotypic heterogeneity. Sequencing large samples (>10,000) may improve the statistical power to associate more variants, but it is not feasible for many studies, and the aggregation of distinct rare variants associated with a given disease can make the identification of causal genes statistically challenging. New methods for rare variant association are therefore needed to identify causative genes of complex disorders.

Results: Here we propose a method to predict causative rare variants using a popular probability problem, the Birthday Model, which estimates the probability that multiple individuals in a group share the same birthday. We treat the chance that samples share a variant as analogous to the chance that individuals share a birthday. We investigated the effects of the model's parameters, providing guidelines for its use and for the interpretation of its results. Using published data on autism spectrum disorder and hypertriglyceridemia, in addition to a current case-control study on bipolar disorder, we evaluated this probabilistic method for identifying potential causative variants. Several genes in the top results of the case-control study were associated with autism spectrum disorder and bipolar disorder. Because the core probability based on the birthday model is very sensitive to low recurrence, the method successfully tests the association of rare variants, which generally do not provide enough signal in commonly used statistical tests. Importantly, the simplicity of the model allows quick interpretation of genomic data, enabling users to select gene candidates for further biological validation of specific mutations and for downstream functional or other studies.
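
For readers unfamiliar with the birthday problem, the following is a minimal sketch of the classical calculation the abstract refers to. It is not the authors' implementation. The function name `prob_shared` and the mapping of "days" to equally likely variant positions are illustrative assumptions. The paper's actual model and parameters are described in the PDF.

```python
def prob_shared(n: int, d: int) -> float:
    """Classical birthday problem.

    Returns the probability that at least two of n samples land on the
    same one of d equally likely slots (days, or, by analogy, variant
    positions). The uniform-slot assumption is a simplification made
    here for illustration only.
    """
    if n < 0 or d <= 0:
        raise ValueError("n must be non-negative and d must be positive")
    if n > d:
        # Pigeonhole principle: more samples than slots forces a match.
        return 1.0
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th sample must avoid the i slots already taken.
        p_all_distinct *= (d - i) / d
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    # The textbook case: 23 people, 365 days, probability just over 0.5.
    print(round(prob_shared(23, 365), 3))
    # With many slots, a coincidence among a few samples is very unlikely.
    print(prob_shared(5, 1_000_000))
```

In this classical setting a shared slot is improbable when the number of slots is large compared with the number of samples, so even a small number of recurrences carries information. This is consistent with the abstract's remark that the core probability is very sensitive to low recurrence.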

Item Type: Paper
Subjects: bioinformatics
Communities: CSHL labs > McCombie lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 7 November 2018
Date Deposited: 24 Mar 2026 19:19
Last Modified: 24 Mar 2026 19:19
URI: https://repository.cshl.edu/id/eprint/42115
