
Physica A 382 (2007) 9–15 www.elsevier.com/locate/physa

Detrending moving average algorithm: A closed-form approximation of the scaling law

Sergio Arianos¹, Anna Carbone

Dipartimento di Fisica, Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy

Available online 1 March 2007

Abstract

The Hurst exponent H of long-range correlated series can be estimated by means of the detrending moving average (DMA) method. The computational tool on which the algorithm is based is the generalized variance $\sigma^2_{DMA} = \frac{1}{N-n}\sum_{i=n}^{N}[y(i)-\tilde{y}_n(i)]^2$, with $\tilde{y}_n(i) = \frac{1}{n}\sum_{k=0}^{n} y(i-k)$ being the average over the moving window n and N the dimension of the stochastic series y(i). The ability to yield H relies on the property of $\sigma^2_{DMA}$ to vary as $n^{2H}$ over a wide range of scales [E. Alessio, A. Carbone, G. Castelli, V. Frappietro, Eur. J. Phys. B 27 (2002) 197]. Here, we give a closed form proof that $\sigma^2_{DMA}$ is equivalent to $C_H n^{2H}$ and provide an explicit expression for $C_H$. We furthermore compare the values of $C_H$ with those obtained by applying the DMA algorithm to artificial self-similar signals. © 2007 Published by Elsevier B.V.

Keywords: Hurst exponent; Moving average; DMA algorithm

1. Introduction

Long-memory stochastic processes are ubiquitous in fields as different as condensed matter, biophysics, social sciences, climate change and finance [1–4]. The development of methods able to quantify the statistical properties and, in particular, to extract the Hurst exponent of long-range correlated signals therefore continues to draw attention, and not only from the physics community [5–15]. For long-memory correlated processes, the Hurst exponent H lies in the range 0 < H < 0.5 for negative persistence and in the range 0.5 < H < 1 for positive persistence; H = 0.5 is found in fully uncorrelated signals. Several techniques have been proposed in the literature to study the scaling properties of time series. We mention here only a few of them: the seminal work by Hurst on rescaled range statistical (R/S) analysis, the modified R/S analysis, the multi-affine analysis, the detrended fluctuation analysis (DFA), the periodogram regression (GPH) method, the (m, k)-Zipf method and the detrended moving average (DMA) analysis. The challenge is to get the Hurst exponent H, which is related to the fractal dimension by D = 2 − H, by means of more and more accurate

Corresponding author.

E-mail addresses: [email protected] (S. Arianos), [email protected] (A. Carbone). URL: http://www.polito.it/noiselab (A. Carbone).
¹ Permanent address: Theoretical Physics Department, Università di Torino, Italy.
0378-4371/$ - see front matter © 2007 Published by Elsevier B.V. doi:10.1016/j.physa.2007.02.074

ARTICLE IN PRESS S. Arianos, A. Carbone / Physica A 382 (2007) 9–15


and fast algorithms. The methods for extracting the scaling exponents from a random signal exploit suitable statistical functions of the series itself.

Recently, a method called the DMA technique has been proposed for the analysis of persistence. The striking difference between the DMA and other variance methods (R/S, DFA) is that the DMA algorithm does not need a division of the series into boxes. The scaling property is obtained by using a simple continuous function: the moving average. This makes the DMA algorithm highly efficient from the computational point of view. The scaling properties of the DMA variance have been studied and applications have been demonstrated in previous work [13,14].

The purpose of this work is to derive a closed form approximation of the scaling behavior of the DMA variance at large n, i.e., $\sigma^2_{DMA} \sim C_H n^{2H}$. Furthermore, the expression $C_H n^{2H}$ is compared with the data obtained by applying the DMA algorithm to surrogate series with assigned H. For this comparison, we use 30 samples of surrogate series with $N = 2^{23}$. Very long signals ensure the consistency of the simulation results with the expression $C_H n^{2H}$, which holds at large n. The surrogate series are generated by the random midpoint displacement (RMD) algorithm [16], which is preferred over other signal generators for its high execution speed, although it is less accurate [17].
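The RMD generator mentioned above can be illustrated with a short Python sketch. It is not the authors' code: the function name `rmd_fbm`, its parameters and the use of NumPy are assumptions made for illustration. It follows the textbook midpoint-displacement recursion, in which the offset added at refinement level n has variance $(1 - 2^{2H-2})/2^{2Hn}$ times the variance of the initial Gaussian increment.

```python
import numpy as np

def rmd_fbm(levels, hurst, sigma=1.0, rng=None):
    """Approximate fractional Brownian path on 2**levels + 1 points (RMD sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    size = 2 ** levels
    y = np.zeros(size + 1)
    # Endpoint of the path: a single Gaussian increment with variance sigma**2.
    y[size] = sigma * rng.standard_normal()
    step = size
    for level in range(1, levels + 1):
        half = step // 2
        # Standard deviation of the midpoint offset at this refinement level.
        delta = sigma * np.sqrt(1.0 - 2.0 ** (2 * hurst - 2)) / 2.0 ** (hurst * level)
        mids = np.arange(half, size, step)
        # Each midpoint is the mean of its two neighbours plus a random offset.
        y[mids] = 0.5 * (y[mids - half] + y[mids + half]) \
            + delta * rng.standard_normal(mids.size)
        step = half
    return y
```

With `levels=23` the path has the length used in this work; smaller values are more practical for quick experiments.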

2. Method

First we describe the main steps of the DMA algorithm. The technique is based on the function:

$$\sigma^2_{DMA} = \frac{1}{N-n}\sum_{i=n}^{N}\left[y(i)-\tilde{y}_n(i)\right]^2, \tag{1a}$$

$$\tilde{y}_n(i) = \frac{1}{n}\sum_{k=0}^{n} y(i-k). \tag{1b}$$

Eq. (1a) defines a generalized variance of the random path $y(i)$ with respect to the moving average $\tilde{y}_n(i)$ (Eq. (1b)). The function $\tilde{y}_n(i)$ is calculated by averaging the n past values in each sliding window of length n. In so doing, the reference point of the averaging process is the last point of the window. The dynamic averaging process and the DMA algorithm can, however, be referred to any point within the window, by generalizing Eqs. (1a), (1b) as follows:

$$\sigma^2_{DMA} = \frac{1}{N-n}\sum_{i=n(1-\theta)}^{N-n\theta}\left[y(i)-\tilde{y}_n(i)\right]^2, \tag{1c}$$

$$\tilde{y}_n(i) = \frac{1}{n}\sum_{k=-n\theta}^{n(1-\theta)} y(i-k). \tag{1d}$$

Upon variation of the parameter $\theta$ in the range [0, 1], the reference point of $\tilde{y}_n(i)$ is accordingly set within the moving window n. In particular, we will consider the following three relevant cases: (i) $\theta = 0$, where $\tilde{y}_n(i)$ is calculated over all the past points within the window n; (ii) $\theta = 1/2$, where $\tilde{y}_n(i)$ is calculated over n/2 past and n/2 future points within the window n; and (iii) $\theta = 1$, where $\tilde{y}_n(i)$ is calculated over all the future points within the window n.

In order to calculate the Hurst exponent of the series, the DMA algorithm is implemented as follows. The moving average $\tilde{y}_n(i)$ is calculated for different values of the window n, with n ranging from 2 to a maximum value $n_{max}$ depending upon the size of the series. For each window n, the function $\sigma_{DMA}$ defined by Eqs. (1) is then calculated over the interval [n, N], and the resulting values are plotted as a function of n on log–log axes. The most remarkable property of the log–log plot is that it exhibits a power-law dependence on n, i.e., $\sigma^2_{DMA} \sim n^{2H}$, which allows the scaling exponent H of the signal $y(i)$ to be calculated.
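The procedure just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `dma_variance` and `dma_hurst` are hypothetical, the window is taken to contain exactly n points (so that the moving average is a true mean), and the number of future points in the window is rounded from $\theta(n-1)$.

```python
import numpy as np

def dma_variance(y, n, theta=0.0):
    """Mean squared deviation of y from its n-point moving average.

    theta = 0 uses only past points, theta = 1 only future points,
    theta = 0.5 centres the window on the reference point.
    """
    y = np.asarray(y, dtype=float)
    if not 2 <= n <= y.size:
        raise ValueError("window n must satisfy 2 <= n <= len(y)")
    ahead = int(round(theta * (n - 1)))   # future points in the window
    back = n - 1 - ahead                  # past points in the window
    csum = np.concatenate(([0.0], np.cumsum(y)))
    moving_avg = (csum[n:] - csum[:-n]) / n       # one value per full window
    reference = y[back: y.size - ahead]           # y(i) aligned with each window
    return np.mean((reference - moving_avg) ** 2)

def dma_hurst(y, windows, theta=0.0):
    """Fit log(sigma^2) against log(n); the slope is 2H."""
    windows = np.asarray(windows)
    sigma2 = np.array([dma_variance(y, int(n), theta) for n in windows])
    slope, intercept = np.polyfit(np.log(windows), np.log(sigma2), 1)
    return 0.5 * slope, np.exp(intercept)   # (H estimate, fitted prefactor)
```

For example, `dma_hurst(np.cumsum(np.random.default_rng(0).standard_normal(2**17)), [10, 20, 40, 80, 160, 320])` applies the estimator to an ordinary random walk, for which an exponent close to 0.5 is expected.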


3. Derivation of the scaling relationship at large n

A closed form approximation of Eqs. (1) will be deduced in the limit of large n using the properties of the fractional Brownian path. We will obtain the following expression:

$$\sigma^2_{DMA} \sim C_H\, n^{2H}, \qquad n \gg 1, \tag{2}$$

with

$$C_H = \frac{(1-\theta)^{2H+1} + \theta^{2H+1}}{2H+1} - \frac{1}{2(H+1)(2H+1)}. \tag{3}$$

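By simple transformations, Eq. (1c) can be written as

$$(N-n)\,\sigma^2_{DMA} = \sum_{i=n-\theta n}^{N-\theta n} y^2(i) - \frac{2}{n}\sum_{i=n-\theta n}^{N-\theta n} y(i)\sum_{k=-\theta n}^{n-\theta n} y(i-k) + \frac{1}{n^2}\sum_{i=n-\theta n}^{N-\theta n}\left[\sum_{k=-\theta n}^{n-\theta n} y(i-k)\right]^2. \tag{4}$$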

Let us consider each term on the right-hand side of Eq. (4) separately. The first term reads

$$\sum_{i=n-\theta n}^{N-\theta n} y^2(i) = \sum_{i=n-\theta n}^{N-\theta n} i^{2H} \simeq \frac{1}{2H+1}\left[(N-\theta n)^{2H+1} - (n-\theta n)^{2H+1}\right]. \tag{5}$$

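The second term reads

$$\begin{aligned}
\frac{2}{n}\sum_{i=n-\theta n}^{N-\theta n} y(i)\sum_{k=-\theta n}^{n-\theta n} y(i-k)
&= \frac{2}{n}\sum_{i=n-\theta n}^{N-\theta n} y(i)\sum_{j=i-n+\theta n}^{i+\theta n} y(j) \\
&\simeq \frac{1}{2H+1}\left[(N-\theta n)^{2H+1} - (n-\theta n)^{2H+1}\right] \\
&\quad + \frac{1}{2(H+1)(2H+1)}\,\frac{1}{n}\left[N^{2H+2} - n^{2H+2} - (N-n)^{2H+2}\right] \\
&\quad - \frac{n^{2H}}{2H+1}\left[(1-\theta)^{2H+1} + \theta^{2H+1}\right](N-n).
\end{aligned} \tag{6}$$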

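The third term reads

$$\begin{aligned}
\frac{1}{n^2}\sum_{i=n-\theta n}^{N-\theta n}\left[\sum_{k=-\theta n}^{n-\theta n} y(i-k)\right]^2
&= \frac{1}{n^2}\sum_{i=n-\theta n}^{N-\theta n}\left[\sum_{j=i+\theta n-n}^{i+\theta n} y(j)\right]^2 \\
&\simeq \frac{1}{2(H+1)(2H+1)}\,\frac{1}{n}\left[N^{2H+2} - n^{2H+2} - (N-n)^{2H+2}\right] \\
&\quad - \frac{n^{2H}}{2(H+1)(2H+1)}\,(N-n).
\end{aligned} \tag{7}$$

Summing the contributions from each term, with the sign each carries in Eq. (4), and dividing by N − n, one obtains

$$\sigma^2_{DMA} \simeq \left[\frac{(1-\theta)^{2H+1} + \theta^{2H+1}}{2H+1} - \frac{1}{2(H+1)(2H+1)}\right] n^{2H}. \tag{8}$$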
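As a numerical sanity check of Eq. (8), the expectation of $[y(i)-\tilde{y}_n(i)]^2$ can be evaluated exactly from the covariance of a fractional Brownian motion with variance $t^{2H}$ and compared with $C_H n^{2H}$. The sketch below is an illustration only: the function names are hypothetical, and the window is assumed to hold exactly n integer-spaced points, so agreement is expected only up to discretization corrections that vanish for large n.

```python
import numpy as np

def c_h(hurst, theta):
    """Prefactor C_H in square brackets in Eq. (8)."""
    p = 2.0 * hurst + 1.0
    return ((1.0 - theta) ** p + theta ** p) / p - 1.0 / (2.0 * (hurst + 1.0) * p)

def expected_dma_variance(n, hurst, theta):
    """Exact E[(y(i) - moving average)^2] for fBm sampled at integer times."""
    ahead = int(round(theta * (n - 1)))
    # Offsets u = j - i of the window points relative to the reference point.
    u = np.arange(-(n - 1 - ahead), ahead + 1, dtype=float)
    single = np.abs(u) ** (2 * hurst)
    cross = np.abs(u[:, None] - u[None, :]) ** (2 * hurst)
    # E[(y_i - y_{i+u})(y_i - y_{i+v})] = (|u|^2H + |v|^2H - |u - v|^2H) / 2,
    # averaged over all pairs (u, v) in the window.
    return single.mean() - 0.5 * cross.mean()

n = 1000
for hurst in (0.1, 0.5, 0.9):
    for theta in (0.0, 0.5, 1.0):
        exact = expected_dma_variance(n, hurst, theta)
        approx = c_h(hurst, theta) * n ** (2 * hurst)
        print(f"H={hurst} theta={theta} ratio={exact / approx:.4f}")
```

The printed ratios should be close to 1 at this window size, which is the behavior the large-n derivation predicts.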

One can easily check that the term in square brackets in Eq. (8) takes, respectively, the following expressions:

for $\theta = 0$ or $\theta = 1$, i.e., when the moving average is referred to the last or to the first point of the window, it is:

$$C_H = \frac{1}{2(H+1)}, \tag{9}$$


Fig. 1. (Color online). Log–log plot of the function $\sigma_{DMA}$ defined by Eqs. (1) for artificial series generated by the random midpoint displacement (RMD) algorithm. The series have length $N = 2^{23}$ and Hurst exponent varying from 0.1 to 0.9 with step 0.1. The parameter $\theta$ is taken equal to 0, 0.5 and 1, respectively.


for $\theta = 1/2$ (i.e., when the moving average is referred to the center of the window) it is:

$$C_H = \frac{1}{2(H+1)} - \frac{1 - 2^{-2H}}{2H+1}. \tag{10}$$
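For readers who want the intermediate algebra, the two special cases follow from the bracketed term of Eq. (8) in a few lines. The steps below are a worked check written for this purpose, not text from the original derivation.

```latex
% Case theta = 0 (or theta = 1): (1-\theta)^{2H+1} + \theta^{2H+1} = 1
C_H = \frac{1}{2H+1} - \frac{1}{2(H+1)(2H+1)}
    = \frac{2(H+1) - 1}{2(H+1)(2H+1)}
    = \frac{1}{2(H+1)}

% Case theta = 1/2: (1-\theta)^{2H+1} + \theta^{2H+1} = 2 \cdot 2^{-(2H+1)} = 2^{-2H}
C_H = \frac{2^{-2H}}{2H+1} - \frac{1}{2(H+1)(2H+1)}
    = \frac{2^{-2H}}{2H+1} + \frac{1}{2(H+1)} - \frac{1}{2H+1}
    = \frac{1}{2(H+1)} - \frac{1 - 2^{-2H}}{2H+1}
```

The second case uses the identity $\frac{1}{2(H+1)} - \frac{1}{2H+1} = -\frac{1}{2(H+1)(2H+1)}$.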

The above calculations have been performed for a fractional Brownian motion with variance $\sigma^2 = t^{2H}$. For the general case of a fractional Brownian motion with variance $\sigma^2 = D_H t^{2H}$, Eq. (8) asymptotically behaves as $\sigma^2_{DMA} = D_H C_H n^{2H}$.

4. Results and discussion

In this section, the values of $C_H$ obtained by calculating the DMA variance of artificial fractional Brownian motions are compared with those calculated using Eq. (8). Fig. 1 shows the results of the DMA algorithm applied to artificial fractional random walks generated by the RMD algorithm. We apply the DMA algorithm to 30 samples of random walks with $N = 2^{23}$ and the Hurst exponent ranging from 0.1 to 0.9 with step 0.1. The curves in the three panels refer, respectively, to three values of the parameter $\theta$, namely $\theta = 0$, $\theta = 0.5$ and $\theta = 1$. The slopes of the logarithms of the data plotted in Fig. 1 are shown in Fig. 2. The slopes and the intercepts of the logarithms of the data plotted in Fig. 1 are reported in Table 1. From the data shown in Table 1, it is possible to deduce that the DMA with $\theta = 0.5$ performs better with positively correlated signals (0.5 < H < 1), while the DMA with $\theta = 0$ and $\theta = 1$ performs better with negatively correlated signals (0 < H < 0.5).

In Fig. 3, the theoretical values of $C_H$, calculated by using Eq. (8), are compared with those obtained from the intercepts of the curves plotted in Fig. 1 for $\theta = 0.5$ (data of the 2nd column of Table 1). Since the fractional Brownian motions used for the simulations plotted in Fig. 1 have been generated by the RMD algorithm, in the calculation of $C_H$ it must be kept in mind that $\sigma^2_{RMD} = \frac{1 - 2^{2H-2}}{2^{2Hn}}\,\sigma^2_{Gauss}$, n being the number of steps of the RMD algorithm.

It is interesting to compare Eqs. (9), (10) with the corresponding ones obtained for the DFA algorithm. According to the DFA method, the integrated profile $y(i)$ is divided into boxes of equal length n.
In each box, the signal $y(i)$ is best-fitted by a polynomial $y_{n,\ell}(i)$ of order $\ell$, which represents the local trend in that box. The order of the DFA-$\ell$ (e.g., DFA-0 if $\ell = 0$, DFA-1 if $\ell = 1$, DFA-2 if $\ell = 2$, etc.) is given by the order of the polynomial fit. Finally, the variance:

$$\sigma^2_{DFA\text{-}\ell} = \frac{1}{N}\sum_{i=1}^{N}\left[y(i) - y_{n,\ell}(i)\right]^2 \tag{11}$$

Fig. 2. (Color online). Values of the slopes of the function $\sigma_{DMA}$ plotted in Fig. 1 for three different values of the parameter $\theta$, respectively, equal to 0, 0.5 and 1.


Table 1
Intercept (A) and slope (B) of the logarithms of the data plotted in Fig. 1

        θ = 0                θ = 0.5              θ = 1
H       A         B          A         B          A         B
0.1     0.73211   0.13220    0.84280   0.14151    0.71759   0.12861
0.2     1.48755   0.21675    1.62576   0.22023    1.47214   0.21294
0.3     2.23912   0.30962    2.41792   0.30952    2.22270   0.30555
0.4     2.99192   0.40900    3.20887   0.40212    2.97326   0.40421
0.5     3.72180   0.50752    4.00245   0.50058    3.70336   0.50301
0.6     4.45987   0.61307    4.80327   0.60187    4.44063   0.60843
0.7     5.17084   0.71296    5.60555   0.70333    5.14974   0.70776
0.8     5.84043   0.79625    6.40763   0.79829    5.82297   0.79277
0.9     6.51500   0.87257    7.27298   0.89698    6.49215   0.86705

Fig. 3. (Color online). Values of $C_H$ for the DMA with $\theta = 0.5$ (red squares) and for the DFA-1 (blue circles) algorithms applied to the same series. The solid lines represent the values of $C_H$ calculated by using expressions (8) and (14), respectively.

is calculated for each box n. The calculation is then repeated for different box lengths n, yielding the behavior of $\sigma_{DFA\text{-}\ell}$ over a broad range of scales. For scale-invariant signals with power-law correlations, the following relationship between the function $\sigma_{DFA\text{-}\ell}$ and the scale n holds:

$$\sigma^2_{DFA\text{-}\ell} \sim n^{2H}. \tag{12}$$
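For comparison with the DMA, the DFA-$\ell$ variance described above can be sketched in Python. This is an illustrative sketch, not a reference implementation: the name `dfa_variance` is hypothetical, the input is assumed to be the already integrated profile $y(i)$, the boxes are non-overlapping, and trailing samples that do not fill a whole box are discarded.

```python
import numpy as np

def dfa_variance(y, n, order=1):
    """Mean squared residual of y about a per-box polynomial fit of given order."""
    y = np.asarray(y, dtype=float)
    if n <= order:
        raise ValueError("box length n must exceed the polynomial order")
    boxes = y.size // n
    if boxes < 1:
        raise ValueError("series shorter than one box")
    # One column per box, each holding n consecutive samples.
    segments = y[: boxes * n].reshape(boxes, n).T
    x = np.arange(n, dtype=float)
    # np.polyfit fits every column at once; coefficients are highest power first.
    coeffs = np.polyfit(x, segments, order)
    trend = np.vander(x, order + 1) @ coeffs
    return np.mean((segments - trend) ** 2)
```

Evaluating `dfa_variance` over a range of box lengths and fitting the logarithm of the result against log n gives a slope of 2H, in the same way as for the DMA.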

The asymptotic behavior of the DFA-0 and DFA-1 functions has been derived in [17,18]. The following relation (Eq. (21) of Ref. [18]) has been worked out for the $\sigma_{DFA\text{-}0}$ function:

$$\sigma^2_{DFA\text{-}0} \simeq \left[\frac{1}{2H+1} - \frac{1}{2(H+1)}\right] n^{2H}. \tag{13}$$

It is easy to check that the "scaled windowed variance without any trend correction" in Ref. [18] is indeed equivalent to the DFA-0 variance. The function obtained by fitting the random walk $y(i)$ by a constant segment in each box corresponds indeed to a zero-order approximation of the trend of $y(i)$. The asymptotic behavior of the DFA-1 function, as reported in the Appendix of Ref. [17], is

$$\sigma^2_{DFA\text{-}1} \simeq \left[\frac{2}{2H+1} + \frac{1}{H+2} - \frac{2}{H+1}\right] n^{2H}. \tag{14}$$
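The prefactors of Eqs. (13) and (14) can be tabulated next to the DMA prefactor of Eq. (8) with a few lines of Python. The function names below are hypothetical and the snippet only evaluates the closed-form expressions; it does not reproduce the simulation data.

```python
import numpy as np

def c_dfa0(hurst):
    """Prefactor of Eq. (13)."""
    return 1.0 / (2.0 * hurst + 1.0) - 1.0 / (2.0 * (hurst + 1.0))

def c_dfa1(hurst):
    """Prefactor of Eq. (14)."""
    return 2.0 / (2.0 * hurst + 1.0) + 1.0 / (hurst + 2.0) - 2.0 / (hurst + 1.0)

def c_dma(hurst, theta=0.5):
    """Prefactor of Eq. (8) for reference point theta."""
    p = 2.0 * hurst + 1.0
    return ((1.0 - theta) ** p + theta ** p) / p - 1.0 / (2.0 * (hurst + 1.0) * p)

for hurst in np.arange(1, 10) / 10.0:
    print(f"H={hurst:.1f}  DMA(0.5)={c_dma(hurst):.5f}  "
          f"DFA-0={c_dfa0(hurst):.5f}  DFA-1={c_dfa1(hurst):.5f}")
```

At H = 0.5 the three expressions reduce to 1/12, 1/6 and 1/15, respectively, which is a convenient spot check.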


In Fig. 3, the values of $C_H$ for the DMA (with $\theta = 0.5$) and the DFA-1 are shown. It can be observed that the behavior of $C_H$ obtained from the simulations (squares and circles) follows the analytical curves (solid lines) quite well around $H \simeq 0.5$. Deviations are observed at the extrema of the H range. Such deviations might be related to the accuracy either of the DFA and DMA techniques or of the RMD signal generator.

5. Conclusions

We have derived the asymptotic scaling behavior of the DMA algorithm (Eq. (8)) for an arbitrary value of the reference point of the function $\tilde{y}_n(i)$. The values of $C_H$ are compared with those yielded by the simulations of fractional Brownian paths with assigned values of H generated by the RMD algorithm. A comparison between the behavior of $C_H$ for the DMA (with $\theta = 0.5$) and the DFA (with $\ell = 1$) functions is also provided.

References

[1] R.N. Mantegna, H.E. Stanley, An Introduction to Econophysics: Correlations and Complexity in Finance, Cambridge University Press, New York, 2000.
[2] W. Willinger, R. Govindan, S. Jamin, V. Paxson, S. Shenker, PNAS 99 (2002) 2573.
[3] W.S. Lam, W. Ray, P.N. Guzdar, R. Roy, Phys. Rev. Lett. 94 (2005) 010602.
[4] S.O. Ferreira, et al., Appl. Phys. Lett. 88 (2006) 244102.
[5] J. Feder, Fractals, Plenum, New York, 1988.
[6] H.E. Hurst, Trans. Am. Soc. Civil Eng. 116 (1951) 770.
[7] B.B. Mandelbrot, J.R. Wallis, Water Resources Res. 5 (2) (1969) 321.
[8] C.-K. Peng, S.V. Buldyrev, S. Havlin, M. Simons, H.E. Stanley, A.L. Goldberger, Phys. Rev. E 49 (1994) 1685.
[9] N. Vandewalle, M. Ausloos, Phys. Rev. E 58 (1998) 6832.
[10] K. Ivanova, M. Ausloos, Physica A 265 (1999) 279; K. Ivanova, M. Ausloos, Eur. Phys. J. B 8 (1999) 665.
[11] G. Rangarajan, M. Ding, Phys. Rev. E 61 (2000) 004991.
[12] C. Heneghan, G. McDarby, Phys. Rev. E 62 (2000) 6103.
[13] E. Alessio, A. Carbone, G. Castelli, V. Frappietro, Eur. J. Phys. B 27 (2002) 197.
[14] A. Carbone, G. Castelli, H.E. Stanley, Phys. Rev. E 69 (2004) 0161; A. Carbone, G. Castelli, H.E. Stanley, Physica A 344 (2004) 267.
[15] T. Di Matteo, T. Aste, M.M. Dacorogna, J. Banking Finance 29 (2005) 827.
[16] R. Voss, Random fractal forgeries, in: R.A. Earnshaw (Ed.), Fundamental Algorithms for Computer Graphics, NATO ASI Series F, vol. 17, Springer, New York, 1985, pp. 805–835.
[17] P. Abry, P. Flandrin, M. Taqqu, D. Veitch, Long Range Dependence: Theory and Applications, Birkhäuser, Boston, 2003; M.S. Taqqu, V. Teverovsky, W. Willinger, Fractals 3 (1995) 785.
[18] G.M. Raymond, J.B. Bassingthwaighte, Physica A 265 (1999) 85.
