Analysis of symbolic sequences using the Jensen-Shannon divergence

Grosse, I., Bernaola-Galvan, P., Carpena, P., Roman-Roldan, R., Oliver, J., Stanley, H. E. (April 2002) Analysis of symbolic sequences using the Jensen-Shannon divergence. Physical Review E, 65 (4). ISSN 1063-651X

Abstract

We study statistical properties of the Jensen-Shannon divergence D, which quantifies the difference between probability distributions, and which has been widely applied to analyses of symbolic sequences. We present three interpretations of D in the framework of statistical physics, information theory, and mathematical statistics, and obtain approximations of the mean, the variance, and the probability distribution of D in random, uncorrelated sequences. We present a segmentation method based on D that is able to segment a nonstationary symbolic sequence into stationary subsequences, and apply this method to DNA sequences, which are known to be nonstationary on a wide range of different length scales.
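As an illustration of the quantity the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes the standard length-weighted form of the Jensen-Shannon divergence between two subsequences, D = H(whole) - (n1/N) H(left) - (n2/N) H(right), where H is the Shannon entropy of the symbol frequencies. The function names `shannon_entropy`, `js_divergence`, and `best_cut` are hypothetical, and the search for a single cut point is a naive O(N^2) scan chosen for clarity.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the symbol frequencies in seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two subsequences.

    Assumption: weights are the relative lengths n1/N and n2/N.
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    return (shannon_entropy(left + right)
            - (n1 / n) * shannon_entropy(left)
            - (n2 / n) * shannon_entropy(right))


def best_cut(seq):
    """Return (position, D) for the single cut that maximizes D."""
    best_pos, best_d = None, -1.0
    for i in range(1, len(seq)):
        d = js_divergence(seq[:i], seq[i:])
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d
```

For a toy sequence such as `"AAAAAAAATTTTTTTT"`, `best_cut` places the cut at position 8, where the two halves have entirely different compositions. A full segmentation into stationary subsequences would presumably apply such a cut repeatedly and accept it only when D is statistically significant; that step is not shown here.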

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > analysis and processing > Sequence Data Processing
Communities: CSHL labs > Zhang lab
Depositing User: Matt Covey
Date: April 2002
Date Deposited: 08 Jan 2014 17:48
Last Modified: 08 Jan 2014 17:48
URI: https://repository.cshl.edu/id/eprint/28709
