User-friendly programs for nonlinear time series analysis
The following programs are available: delay coordinate embedding, nonlinear noise reduction, mutual information method, false nearest neighbor method, maximal Lyapunov exponent, recurrence plot, determinism test, and stationarity test. The source code is available upon request. Electronic address: matjaz.perc@gmail.com
GENERAL INSTRUCTIONS
The input file for all programs should be a single-column ASCII file with the extension DAT (*.dat).
The input file must be in the same directory (working directory) as the program.
Each program has an integrated file-search routine that, upon start, lists all
files with extension DAT in the working directory.
Each of the listed files can be used as the input file. The individual file is chosen
simply by a single click. Note, however, that none of the programs checks whether the file really
is a single-column ASCII file. Therefore, bear in mind that, although all DAT files are listed in the parameter window, not all qualify as valid input files. If, for example, a chosen input file consists of more than one column, the second value read will be the next value in the same row, not the first value in the second row.
There are no error messages for such instances, so prepare your input files with care!
The numbers can be either simple dot-separated decimal numbers (e.g. 4.618)
or written in scientific notation (e.g. 1.328e+0001). For further details please inspect the sample files included in the download packages.
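Since the programs themselves do not validate the input, it can help to check a file before using it. The following is a minimal Python sketch of such a check, not part of the distributed programs. The function name parse_single_column is hypothetical, and skipping blank lines is an assumption made here; how the programs treat blank lines is not documented.

```python
import math


def parse_single_column(text):
    """Parse text as one number per line and return the values as floats.

    Raises ValueError naming the first line that is not a single finite number.
    """
    values = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored by this check
        if len(fields) != 1:
            raise ValueError(f"line {lineno}: expected 1 column, found {len(fields)}")
        try:
            value = float(fields[0])  # accepts 4.618 as well as 1.328e+0001
        except ValueError:
            raise ValueError(f"line {lineno}: not a number: {fields[0]!r}") from None
        if not math.isfinite(value):
            raise ValueError(f"line {lineno}: value is not finite: {fields[0]!r}")
        values.append(value)
    return values


if __name__ == "__main__":
    sample = "4.618\n1.328e+0001\n"  # the two number formats mentioned above
    print(len(parse_single_column(sample)), "values parsed")
```

To check a real file, read it first (for example with open("series.dat", encoding="ascii").read(), where series.dat is a placeholder name) and pass the text to the function. The number of values returned is also the "number of points in the input file" that most programs ask for.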
To run a program, just double-click on it. A parameter window will appear, in which you can enter the parameter values and choose an input file. Each program remembers the last used parameter settings, so the parameters do not have to be entered over and over again. To make this possible, each program writes an INI (*.ini) file to the disk, in which the parameters are stored. No harm is done if you delete these files. At start-up, each program checks whether its INI file is present in the working directory. If it is, the parameters from this file are used. If not, the program starts with the default settings and writes a new INI file containing them (in future runs the default values are replaced by the ones specified by the user). Don't forget to select the appropriate input file each time you run a program, since this "parameter" is not remembered.
Finally, click the OK button to run the program. If for some reason you do not want the program to execute, just uncheck the little box by the OK button (it is checked by default) and then click OK. In this case the program will terminate without execution. Once the program is set to execute, a maximized window and a progress bar will appear. When the stripe reaches the end of the progress bar, the program is finished. The progress bar then disappears and the results are displayed graphically.
IMPORTANT: Sometimes the progress bar freezes because all CPU capacity is used for the calculations. If this happens, do not assume that the whole program has frozen. It will almost certainly finish anyway; you just won't know exactly how long it will take. All programs incorporate a warning system, which almost always produces an error message if a fatal error occurs. A frozen progress bar, however, is not one of those errors.
All programs also write the relevant results to DAT (*.dat) files, where the number at the beginning of each file name indicates in which consecutive run of the program the file was recorded. The exact content of these files is explained below separately for each program. Basically, this is all you have to know to run the programs. When you have finished examining the results, just close the window and the program will terminate. Alternatively, you can use the "exit" link in the menu bar. If more runs are necessary, just click on the "repeat" link in the menu bar. The parameter window will then appear again, giving you the chance to enter different parameter values or choose a different input file.
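Because every run adds a new numbered set of output files, a working directory can fill up quickly. As an illustration of the naming scheme only, the following Python sketch finds the highest run number for a given output file. The helper latest_run is hypothetical and simply assumes names of the form n_name.dat, as described above.

```python
import re


def latest_run(filenames, base):
    """Return the highest run number n among files named n_<base>.dat, or None."""
    pattern = re.compile(rf"^(\d+)_{re.escape(base)}\.dat$", re.IGNORECASE)
    runs = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            runs.append(int(match.group(1)))
    return max(runs) if runs else None


if __name__ == "__main__":
    # Invented directory listing, for illustration only.
    listing = ["1_mutual.dat", "2_mutual.dat", "1_autocorr.dat", "series.dat"]
    print(latest_run(listing, "mutual"))
```

In practice the listing could come from os.listdir() on the working directory. Comparing the run numbers as integers rather than as text keeps run 10 after run 9.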
It should be emphasized that the procedure described above is only enough to run the programs.
To understand and correctly interpret the results, you will need some knowledge of nonlinear time series analysis.
Recommended books are:
AVAILABLE PROGRAMS
Delay coordinate embedding
The program embedd.exe reads the time series from a single-column ASCII input file and draws 2D phase space projections for 4 different embedding delays, whereby the coordinates to be drawn can be chosen arbitrarily. Parameters that have to be provided are the number of points in the input file, 4 embedding delays, and 4 pairs of coordinates (each pair is used for one particular 2D projection of the whole phase space). Additionally, you can change the size of the drawings and choose whether they are drawn in colour or in black and white. You can also choose if the program should remember the last set parameters or not. After completion, the program writes 4 files to the disk (1_embedding1.dat, 1_embedding2.dat, 1_embedding3.dat and 1_embedding4.dat), each consisting of two ASCII columns. The first column lists the nth embedding coordinate, whereas the second column lists the mth embedding coordinate, where (n, m) is the coordinate pair used for the 2D phase space projection. The number at the beginning of each output file indicates in which consecutive run of the program the file was recorded. In the second run, the files written to the disk would be 2_embedding1.dat, 2_embedding2.dat, 2_embedding3.dat and 2_embedding4.dat.
Nonlinear noise reduction
The program noisered.exe reads the time series from a single-column ASCII input file and draws 2D phase space projections obtained from the original (upper two pictures) and the "clean" time series (lower two pictures). Parameters that have to be provided are the number of points in the input file, embedding parameters, and the neighbourhood size for searching neighbours. Additionally, you can choose if the program should remember the last set parameters or not. After completion, the program draws the above-described graphs. Files that are written to disk are 1_noisered.dat and 1_series_clean.dat. The first file consists of three ASCII columns.
The first column is the consecutive number of each data point, the second column lists the "clean" time series, whereas the third column lists how many points were found inside the neighbourhood. The second file has only one ASCII column, which contains the "clean" time series. The second file is written separately to allow immediate use of the "clean" time series for other applications.
Mutual information method
The program mutual.exe reads the time series from a single-column ASCII input file and draws the mutual information (M. I.) and the autocorrelation (Ac.) as a function of the embedding delay (tau). Parameters that have to be provided are the number of points in the input file, the number of bins into which the time series is partitioned for the calculation of the mutual information, and the maximal embedding delay for which the mutual information and autocorrelation are calculated. Additionally, you can choose if the program should remember the last set parameters or not, and if the autocorrelation should be calculated in addition to the mutual information. After completion, the program draws the above-described graphs and returns all minima of the mutual information as well as the value of the embedding delay at which the autocorrelation decays to 1/e. Files that are written to disk are 1_mutual.dat and 1_autocorr.dat, which both consist of two ASCII columns. The first column in both files is the embedding delay. The second column lists the mutual information in the 1_mutual.dat file and the pertaining autocorrelation in the 1_autocorr.dat file.
False nearest neighbor method
The program fnn.exe reads the time series from a single-column ASCII input file and draws the fraction of false nearest neighbors (FNN) as a function of the embedding dimension (DIM).
Parameters that have to be provided are the number of points in the input file, the embedding delay, the minimal and the maximal embedding dimension for which the fraction of false nearest neighbors is to be determined, the starting neighbor distance, the factor for increasing the starting neighbor distance, the threshold for false neighbors, and the percent of data that is allowed to be wasted (that is, how many points, at most, are allowed not to have a close neighbor to still obtain a relevant statistic). Additionally, you can choose if the program should remember the last set parameters or not. After completion, the program draws the above-described graph and returns the standard deviation of the data to allow a better estimation of the starting neighbor distance and the factor for increasing it. The written file 1_fnn.dat consists of five ASCII columns. The first column lists the embedding dimension, while the second column lists the pertaining fraction of false nearest neighbors. The third and fourth columns are the number of points that have a false nearest neighbor and the number of points for which an initially close-enough neighbor has been found, respectively. If you divide the third column by the fourth, you should obtain the fraction of false nearest neighbors. The fifth column lists the largest neighborhood size that was used for finding neighbors. If the neighborhood size increases above [(std. of data)/(threshold)], an additional entry is made in the output file. If the neighborhood size increases above [(std. of data)/2.0], the program terminates (prior to that a warning message is displayed). In this case, try to enlarge the "amount of data that is allowed to be wasted" parameter.
Determinism test
The program determinism.exe reads the time series from a single-column ASCII input file and draws the embedding space as well as the pertaining approximated directional vector field.
Parameters that have to be provided are the number of points in the input file, the embedding delay, the embedding dimension, the number of boxes in one dimension (the whole reconstructed phase space is then partitioned into [(number of boxes) to the power of embedding dimension] boxes), and the so-called significance, which determines at least how many times a box must be visited by the trajectory to be included in the statistic for the determinism factor. Additionally, you can choose if the program should remember the last set parameters or not. Besides these parameters pertaining to the calculations, you may also adjust certain parameters pertaining to the drawings. In particular, the main size of the drawings, the length of the unit vector, the arrowhead length, the arrowhead angle, and the color of drawings can be adjusted. After completion, the program draws the above-described graph and returns the calculated determinism factor. Files that are written to disk are 1_determinism.dat and 1_vectfield.dat. The first file consists of two ASCII columns; the first column lists the number of times each occupied box was visited by the trajectory, while the second column lists the pertaining average vector size for that box. The 1_vectfield.dat file consists of four ASCII columns; the first two columns represent the first (x, y) coordinate and the second two columns the second (x, y) coordinate of each vector (of the vector field), respectively.
Stationarity test
The program stationarity.exe reads the time series from a single-column ASCII input file and draws a colour map, where the colour of each map segment indicates the cross-prediction error of using segment i as the neighbour source for making predictions in segment j.
Parameters that have to be provided are the number of points in the input file, the embedding delay, the embedding dimension, the number of points in one segment (number of points in the file/number of points in one segment = number of segments), the minimal number of neighbours that have to be found in order to make a prediction, the starting neighbour distance, the factor for increasing the starting neighbour distance, the number of time steps (ahead) for prediction, and the percent of data that is allowed to be wasted (that is, how many points, at most, are allowed not to have enough close neighbours to make a prediction). If this last parameter is set >50%, the starting neighbour size is left constant, and each time a point is encountered for which not enough close neighbours are found to make the prediction, the predicted value is simply set to the average value of the data segment in which neighbours are searched for. Additionally, you can choose to rescale the data to unit variance, but in this case make sure you also change the starting neighbour size accordingly (0.25 is then usually recommended). As always, you can also choose if the program should remember the last set parameters or not. You may also choose if the colour map should be displayed in colour or in black/white contrast. After completion, the program also returns the standard deviation of the data, and the minimal, maximal and average cross-prediction error. Files that are written to disk are 1_stationarity.dat and 1_stdev.dat. The first file consists of three ASCII columns; the first two columns list the data segments used for cross-predictions, while the third column lists the pertaining cross-prediction errors. The 1_stdev.dat file consists of three ASCII columns; the first column indexes the various data segments, while the second and the third column list the running average and standard deviation, respectively.
The first two lines in the 1_stdev.dat file, however, hold information about the mean and standard deviation of the whole time series.
Maximal Lyapunov exponent (Wolf et al.)
The program lyapmax.exe reads the time series from a single-column ASCII input file and draws the embedding space as well as the convergence of the maximal Lyapunov exponent as a function of time. Parameters that have to be provided are the number of points in the input file, the sampling time of the data (for correct scaling of the exponent), the embedding delay, the embedding dimension, the evolution time that determines how long each initial length element is iterated, the minimal and the maximal initial size of the length element, the maximally allowed angle separation between each successive length element, and the maximal multipliers for the maximally allowed size of the initial length element and angle separation (these multipliers come into effect if the procedure cannot find a close enough neighbor with a small enough angle separation for a particular phase space point). As always, you can also choose if the program should remember the last set parameters or not. After completion, the program draws the above-described graph and returns the calculated maximal Lyapunov exponent. Note that no least-squares scheme is implemented for the latter task; only the last calculated value of the exponent is returned. If the convergence of the maximal Lyapunov exponent in the presented drawing is not good, the returned value is most likely not a correct estimate! In such cases, you should try to obtain the best fit by visually inspecting the data. The written 1_lyapmax.dat file consists of two ASCII columns; the first column lists the number of time steps, while the second column lists the pertaining average maximal Lyapunov exponent at that time.
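As a rough numerical supplement to the visual check of convergence, one could look at how much the last few estimates in 1_lyapmax.dat still vary. The following Python sketch is only a heuristic suggested here, not something implemented in lyapmax.exe. The function names and the tail_points parameter are hypothetical, and the sample numbers are invented for illustration.

```python
def read_second_column(text):
    """Return the second column of whitespace-separated two-column text as floats."""
    rows = (line.split() for line in text.splitlines())
    return [float(fields[1]) for fields in rows if len(fields) >= 2]


def tail_spread(exponents, tail_points=100):
    """Return (last_value, spread) over the final tail_points estimates.

    spread is max - min of the tail; a spread that is small compared with
    last_value suggests, but does not prove, that the estimate has settled.
    """
    if not exponents:
        raise ValueError("no estimates given")
    tail = exponents[-tail_points:]
    return tail[-1], max(tail) - min(tail)


if __name__ == "__main__":
    # Invented numbers in the two-column layout described above:
    # number of time steps, running estimate of the exponent.
    sample = "1 0.90\n2 0.70\n3 0.52\n4 0.50\n5 0.51\n"
    last, spread = tail_spread(read_second_column(sample), tail_points=3)
    print("last estimate:", last, "spread over tail:", spread)
```

What counts as a small enough spread depends on the data, so the number is best read alongside the convergence plot rather than instead of it.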
Maximal Lyapunov exponent (Kantz)
The program lyapmaxk.exe reads the time series from a single-column ASCII input file and draws the S(n) vs. n graph (see the book by Kantz and Schreiber for details). Parameters that have to be provided are the number of points in the input file, the embedding delay, the minimal and the maximal embedding dimension, the number of iterations that determines how long each close-by trajectory is iterated through the attractor, the minimal and the maximal size of the neighbourhoods in which neighbours are searched for, the minimal number of neighbours (starting points for nearby trajectories) needed to evaluate the average divergence of nearby trajectories, and the number of reference points, which determines for at most how many points the minimal number of neighbours has to be found before the averaged value of averaged divergences of nearby trajectories is returned. After completion, the program draws the above-described graph. In order to estimate the maximal Lyapunov exponent, you have to calculate the slope of the linear part (if it exists!) of the presented graph manually. The written 1_lyapmaxk.dat file consists of four ASCII columns; the first column lists the number of iterations, the second column lists the pertaining S(n), the third column lists the actual number of reference points used, while the fourth column lists the neighbourhood size for a particular run. The runs for different embedding dimensions are separated by a blank row in the output file (the embedding dimensions go from the minimal to the maximal value).
Recurrence plot
The program recurrplot.exe reads the time series from a single-column ASCII input file and draws the recurrence plot of the system, whereby those pairs (i, j) whose distance from one another is smaller than some fixed threshold (neighbourhood size) are plotted in blue and all others in white.
Parameters that have to be provided are the number of points in the input file, the embedding delay, the embedding dimension, and the neighbourhood size. After completion, the program draws the recurrence plot in a maximized window, whereby different scales can be observed in detail by clicking on the "Zoom In" or "Zoom Out" links on the toolbar. The written n_points.dat file (n counts consecutive runs) consists of two ASCII columns listing the (i, j) pairs whose distance from one another is smaller than the provided "neighbourhood size" parameter. Additionally, the program writes the file n_parameters.dat, which lists the parameters used for the nth consecutive run.
DOWNLOAD PACKAGES
The programs in the different packages available below are identical.
The ZIP (*.zip) files containing the programs differ only in the INI files,
which store parameters that appear most appropriate for the time series to be analysed.
All programs were last updated 2007-03-22.
REFERENCES AND DISCLAIMER
The above programs have been developed in the hope that they will be used in the classroom, but also in areas of science
where methods of nonlinear time series analysis could be beneficial yet their potential is still underexplored.
I will be happy if some of these goals are met. You may use the programs for whatever purpose, but be aware that you do so without any warranty and completely at your own risk. The source code for all programs is available upon request via the electronic address: matjaz.perc@gmail.com. If the programs are used to obtain results presented in scientific publications, please acknowledge their use by citing the articles in which they were first presented. The references are:
