Grosse, I., Bernaola-Galvan, P., Carpena, P., Roman-Roldan, R., Oliver, J., Stanley, H. E. (April 2002) Analysis of symbolic sequences using the Jensen-Shannon divergence. Physical Review E, 65 (4). ISSN 1063-651X
Abstract
We study statistical properties of the Jensen-Shannon divergence D, which quantifies the difference between probability distributions, and which has been widely applied to analyses of symbolic sequences. We present three interpretations of D in the framework of statistical physics, information theory, and mathematical statistics, and obtain approximations of the mean, the variance, and the probability distribution of D in random, uncorrelated sequences. We present a segmentation method based on D that is able to segment a nonstationary symbolic sequence into stationary subsequences, and apply this method to DNA sequences, which are known to be nonstationary on a wide range of different length scales.
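As an illustration of the segmentation idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the weighted form of the divergence, D = H(π1 p1 + π2 p2) − π1 H(p1) − π2 H(p2), with weights π1 and π2 taken as the relative lengths of the two subsequences. The function names `shannon_entropy`, `js_divergence`, and `best_cut` are hypothetical. The sketch only locates the single cut point that maximizes D; it does not include the significance assessment or the recursive splitting that a full segmentation procedure would need.

```python
from collections import Counter
from math import log2


def shannon_entropy(counts, total):
    """Shannon entropy (in bits) of the empirical distribution given by symbol counts."""
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def js_divergence(left, right):
    """Weighted Jensen-Shannon divergence between two subsequences.

    Assumption of this sketch: the weights are the relative lengths
    of the two subsequences.
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    c1, c2 = Counter(left), Counter(right)
    # Entropy of the length-weighted mixture equals the entropy of the pooled counts.
    h_mixture = shannon_entropy(c1 + c2, n)
    return h_mixture - (n1 / n) * shannon_entropy(c1, n1) - (n2 / n) * shannon_entropy(c2, n2)


def best_cut(sequence):
    """Return (position, divergence) for the cut point that maximizes D.

    Brute force, quadratic in the sequence length; fine for a toy example.
    """
    best_pos, best_d = None, -1.0
    for i in range(1, len(sequence)):
        d = js_divergence(sequence[:i], sequence[i:])
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d
```

For the toy sequence `"AAAAAAAATTTTTTTT"`, `best_cut` places the cut at position 8, where the two halves have completely different compositions and D reaches 1 bit. For two subsequences with identical composition, such as `"ACGT"` and `"ACGT"`, D is zero.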
Item Type: | Paper |
---|---|
Subjects: | bioinformatics; bioinformatics > genomics and proteomics > analysis and processing > Sequence Data Processing |
CSHL Authors: | |
Communities: | CSHL labs > Zhang lab |
Depositing User: | Matt Covey |
Date: | April 2002 |
Date Deposited: | 08 Jan 2014 17:48 |
Last Modified: | 08 Jan 2014 17:48 |
Related URLs: | |
URI: | https://repository.cshl.edu/id/eprint/28709 |