Grosse, I., Bernaola-Galvan, P., Carpena, P., Roman-Roldan, R., Oliver, J., Stanley, H. E. (April 2002) Analysis of symbolic sequences using the Jensen-Shannon divergence. Physical Review E, 65 (4). ISSN 1063-651X
Abstract
We study statistical properties of the Jensen-Shannon divergence D, which quantifies the difference between probability distributions, and which has been widely applied to analyses of symbolic sequences. We present three interpretations of D in the framework of statistical physics, information theory, and mathematical statistics, and obtain approximations of the mean, the variance, and the probability distribution of D in random, uncorrelated sequences. We present a segmentation method based on D that is able to segment a nonstationary symbolic sequence into stationary subsequences, and apply this method to DNA sequences, which are known to be nonstationary on a wide range of different length scales.
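The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used length-weighted form of the divergence between two adjacent subsequences, D = H(whole) - (n1/N) H(left) - (n2/N) H(right), where H is the Shannon entropy of the symbol frequencies. The function names `entropy`, `js_divergence`, and `best_split` are hypothetical and do not come from the paper. The code only finds the single cut point that maximizes D. A full segmentation procedure would also need a stopping criterion, such as a significance threshold on D, which is omitted here.

```python
from collections import Counter
from math import log2


def entropy(seq):
    """Shannon entropy (in bits) of the symbol frequencies in seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two subsequences.

    Assumed form: D = H(left + right) - (n1/N) H(left) - (n2/N) H(right).
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    if n == 0:
        return 0.0
    whole = entropy(left + right)
    return whole - (n1 / n) * entropy(left) - (n2 / n) * entropy(right)


def best_split(seq):
    """Return (position, D) for the cut that maximizes D.

    Brute force: recomputes the entropies at every cut, so it costs
    O(N^2) time. Running symbol counts would bring this down to O(N).
    """
    best_pos, best_d = None, 0.0
    for i in range(1, len(seq)):
        d = js_divergence(seq[:i], seq[i:])
        if best_pos is None or d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


if __name__ == "__main__":
    # Toy sequence with an obvious composition change in the middle.
    pos, d = best_split("AAAAAAAATTTTTTTT")
    print(pos, d)
```

On the toy sequence in the usage lines, the cut falls at the boundary between the run of `A` and the run of `T`, where the two halves have zero entropy and the pooled sequence has one bit.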
| Item Type: | Paper | 
|---|---|
| Subjects: | bioinformatics; bioinformatics > genomics and proteomics > analysis and processing > Sequence Data Processing | 
| Communities: | CSHL labs > Zhang lab | 
| Depositing User: | Matt Covey | 
| Date: | April 2002 | 
| Date Deposited: | 08 Jan 2014 17:48 | 
| Last Modified: | 08 Jan 2014 17:48 | 
| URI: | https://repository.cshl.edu/id/eprint/28709 | 
