https://doi.org/10.1140/epjb/e2012-20969-5
Regular Article
Segmentation of time series with long-range fractal correlations
1 Dpto. de Física Aplicada II, Universidad de Málaga, 29071 Málaga, Spain
2 Dpto. de Genética, Inst. de Biotecnología, Universidad de Granada, 18071 Granada, Spain
3 Harvard Medical School, Division of Sleep Medicine, Brigham Women’s Hospital, 02115 Boston, MA, USA
4 Department of Physics and Center for Polymer Studies, Boston University, 02215 Boston, MA, USA
5 Institute of Solid State Physics, Bulgarian Academy of Sciences, 1784 Sofia, Bulgaria
e-mail: rick@uma.es
Received: 28 November 2011
Received in final form: 9 April 2012
Published online: 25 June 2012
Segmentation is a standard method of data analysis used to identify change-points that divide a nonstationary time series into homogeneous segments. However, for series with long-range fractal correlations, most segmentation techniques detect spurious change-points that are due simply to the heterogeneities induced by the correlations and not to real nonstationarities. To avoid this oversegmentation, we present a segmentation algorithm which takes as a reference for homogeneity, instead of a random i.i.d. series, a correlated series modeled by a fractional noise with the same degree of correlations as the series to be segmented. We apply our algorithm to artificial series with long-range correlations and show that it systematically detects only the change-points produced by real nonstationarities and not those created by the correlations of the signal. Further, we apply the method to the sequence of the long arm of human chromosome 21, which is known to have long-range fractal correlations. We obtain only three segments that clearly correspond to the three regions of different G + C composition revealed by means of a multi-scale wavelet plot. Similar results have been obtained when segmenting all human chromosome sequences, showing the existence of previously unknown huge compositional superstructures in the human genome.
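The central idea of the abstract, testing a candidate change-point against a correlated reference rather than an i.i.d. one, can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the article: the use of a maximum t-statistic between left and right means as the split criterion, the Fourier filtering approximation of fractional noise, the Monte Carlo estimate of significance, and the hypothetical names `fractional_noise`, `max_t_statistic` and `split_p_value`. The correlation exponent (`hurst`) is assumed to be known or estimated beforehand.

```python
import numpy as np


def fractional_noise(n, hurst, rng):
    """Approximate fractional Gaussian noise via Fourier filtering (assumed method)."""
    freqs = np.fft.rfftfreq(n)              # frequencies 0, 1/n, ..., 0.5
    beta = 2.0 * hurst - 1.0                # power spectrum S(f) ~ f**(-beta)
    amp = np.zeros_like(freqs)
    amp[1:] = freqs[1:] ** (-beta / 2.0)    # drop f = 0 so the series has zero mean
    spectrum = np.fft.rfft(rng.standard_normal(n)) * amp
    x = np.fft.irfft(spectrum, n)
    return (x - x.mean()) / x.std()         # normalize to unit variance


def max_t_statistic(x, min_len=10):
    """Return (t_max, cut): the largest two-sample t over all admissible cuts."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.arange(min_len, n - min_len + 1)  # candidate left-segment lengths
    s1, s2 = np.cumsum(x), np.cumsum(x * x)
    left_sum, left_sq = s1[k - 1], s2[k - 1]
    right_sum, right_sq = s1[-1] - left_sum, s2[-1] - left_sq
    m = n - k                                # right-segment lengths
    mean_l, mean_r = left_sum / k, right_sum / m
    # Pooled within-segment sum of squares for each candidate cut.
    ss = (left_sq - k * mean_l**2) + (right_sq - m * mean_r**2)
    pooled_sd = np.sqrt(ss / (n - 2))
    t = np.abs(mean_l - mean_r) / (pooled_sd * np.sqrt(1.0 / k + 1.0 / m))
    i = int(np.argmax(t))
    return float(t[i]), int(k[i])


def split_p_value(x, hurst, n_surrogates=200, seed=0):
    """Monte Carlo p-value of the best cut against a correlated (not i.i.d.) null."""
    rng = np.random.default_rng(seed)
    t_obs, cut = max_t_statistic(x)
    t_null = np.array([
        max_t_statistic(fractional_noise(len(x), hurst, rng))[0]
        for _ in range(n_surrogates)
    ])
    # Fraction of homogeneous correlated surrogates with a cut at least as strong.
    p = (1 + np.sum(t_null >= t_obs)) / (1 + n_surrogates)
    return float(p), cut
```

In this sketch a cut would be accepted only if `split_p_value` returns a small p-value, and the procedure would then be repeated recursively on each resulting segment. Because the null distribution is built from surrogates with the same correlation exponent, large fluctuations that arise purely from the correlations are less likely to be flagged as change-points than under an i.i.d. reference.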
Key words: Statistical and Nonlinear Physics
© EDP Sciences, Società Italiana di Fisica and Springer-Verlag, 2012