Network-based methods for high-dimensional nonlinear time series
Networks have emerged as popular tools in econometrics and applied economics for analysing financial, macroeconomic and microeconomic data. For instance, the prediction of leading economic indicators and the estimation of economic risk by central banks and financial institutions, or of health risks by insurance companies and public agencies, incorporate assumptions about the way processes and entities in a system are connected and influence each other. In the context of economic time series, the nodes of a network are the component processes and the edges represent interactions between components.
In this research, we aim to develop and apply methods that analyse complex time series data by focusing on their representation as networks or graphs. Linear time series models are often used to model temporal processes. In these cases, the information contained in the autoregressive coefficients and the error covariance matrix is typically used to build graph representations that encode linear dependence between variables. In this project, we aim to extend the analysis to nonlinear and possibly non-stationary time series, which allow for more realistic assumptions about the data generating process.
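As a small illustration of this linear baseline, the sketch below builds a directed graph from the nonzero autoregressive coefficients of a VAR(1) and an undirected graph from the inverse error covariance (precision) matrix. The coefficient matrix `A` and covariance `Sigma` are hypothetical values chosen purely for illustration, not estimates from any data set.

```python
import numpy as np

# Hypothetical VAR(1) coefficient matrix and error covariance for 3 series.
A = np.array([[0.5, 0.3, 0.0],
              [0.0, 0.4, 0.0],
              [0.2, 0.0, 0.6]])
Sigma = np.array([[1.0, 0.2, 0.0],
                  [0.2, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

# Directed edges: a nonzero off-diagonal A[i, j] means series j helps
# predict series i one step ahead, drawn as an edge j -> i.
directed_edges = {(j, i) for i in range(3) for j in range(3)
                  if i != j and abs(A[i, j]) > 1e-8}

# Undirected edges: nonzero off-diagonal entries of the precision matrix
# (inverse error covariance) indicate conditional dependence between
# the contemporaneous innovations.
Theta = np.linalg.inv(Sigma)
undirected_edges = {(i, j) for i in range(3) for j in range(i + 1, 3)
                    if abs(Theta[i, j]) > 1e-8}

print(sorted(directed_edges))    # [(0, 2), (1, 0)]
print(sorted(undirected_edges))  # [(0, 1)]
```

In practice the zero pattern would come from thresholded or penalized estimates rather than exact zeros; the point is only how coefficient and precision matrices translate into directed and undirected edges.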
To do this, appropriate approaches to modelling and estimating the dependence structure are necessary, so that the observed data can be represented as nodes and edges in a graph. This flexibility often comes at the price of increased dimensionality. Moreover, small samples of high-dimensional panel data make the estimated dependence difficult to interpret. To handle high-dimensional graph structures, we want to investigate dimension-reduction techniques that operate directly on graphs, e.g., latent factors, shrinkage methods or economic restrictions. The purpose is to create a framework in which tractable causal and prediction tasks involving time series can be addressed.
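One simple example of a dimension-reduction technique that operates directly on a graph is spectral embedding: nodes are mapped to the leading non-trivial eigenvectors of the graph Laplacian. The sketch below uses a hypothetical 4-node adjacency matrix and is a minimal illustration of the idea, not the method developed in the project.

```python
import numpy as np

# Hypothetical symmetric adjacency matrix of a small undirected graph.
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

D = np.diag(W.sum(axis=1))  # degree matrix
L = D - W                   # unnormalised graph Laplacian

# eigh returns eigenvalues in ascending order; for a connected graph the
# smallest eigenvalue is 0 with a constant eigenvector.
eigvals, eigvecs = np.linalg.eigh(L)

# Drop the constant eigenvector and keep the next k columns as a
# k-dimensional Euclidean embedding of the nodes.
k = 2
embedding = eigvecs[:, 1:1 + k]
print(embedding.shape)  # (4, 2)
```

The same eigen-decomposition underlies spectral clustering and Laplacian eigenmaps; for time series graphs the adjacency matrix would itself be estimated from the dependence structure.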
Graphical models; Network embeddings; Learning directed and undirected information graphs; Latent factors; Dimension reduction; Robust estimation.
The availability of ever-larger collected data sets has evolved in parallel with the development of statistical and computational methods aimed at making sense of the information encoded in them. These methods are tailored to handle high-dimensional and heterogeneous data with complicated underlying dependence structures, such as intertemporal dynamics, nonlinearity and/or non-stationarity.
Among these, graph-based methods have emerged as popular approaches that translate the relational features of the original variables into a non-Euclidean space that can be better suited to represent salient dependencies. For instance, covariance-based graphs encode linear dependence between stochastic processes. Depending on their focus, the graphs are causal or non-causal. Much progress has been made in understanding undirected graphs, which do not examine the direction of influence between variables. Learning directed graphs, which often carry causal interpretations, requires the existence of highly informative signals in the sample, as well as the use of ancillary data (exogenous covariates).
Furthermore, in the presence of latent variables, adequate restrictions need to be imposed. Statistical learning on graphs is complicated by the fact that the observed data are asynchronous and noisy. While much applied work has been done in this context, particularly in epidemiology and genetics, there is still a need to establish a sound theoretical basis for estimation and inference.
In all phases of the project, our research will be characterized by the following steps:
- Development of efficient estimation algorithms
- Theoretical statistical analysis
- Application to empirical data and substantive interpretation of the results.
The approach for this project is interdisciplinary and involves methods from time series econometrics, nonparametric statistics, graph theory, spectral methods and deep learning.
- Acemoglu, D., V. M. Carvalho, A. E. Ozdaglar, and A. Tahbaz-Salehi (2011). The Network Origins of Aggregate Fluctuations. Econometrica
- Brownlees, C. T. and G. Mesters (2017). Detecting Granular Time Series in Large Panels. Working paper
- Cucuringu, M., I. Koutis, S. Chawla, G. L. Miller, and R. Peng (2016). Simple and Scalable Constrained Clustering: A Generalized Spectral Method. AISTATS
- Diebold, F. X. and K. Yilmaz (2014). On the network topology of variance decompositions: Measuring the connectedness of financial firms. Journal of Econometrics
- Exterkate, P., P. J. Groenen, C. Heij, and D. van Dijk (2016). Nonlinear forecasting with many predictors using kernel ridge regression. International Journal of Forecasting
- Fan, J., Y. Liao, and M. Mincheva (2013). Large covariance estimation by thresholding principal orthogonal complements. Journal of the Royal Statistical Society. Series B: Statistical Methodology
- Grith, M. and M. Eckardt. Graphical Models for Multivariate Time Series Using Wavelets. In progress
- Jones, B. and M. West (2005). Covariance decomposition in undirected Gaussian graphical models. Biometrika
- Karasuyama, M. and H. Mamitsuka (2018). Factor Analysis on a Graph. AISTATS
- Kipf, T.N. and M. Welling (2017). Semi-Supervised Classification with Graph Convolutional Networks. Conference proceedings at ICLR
- Federal Reserve Bank of St. Louis
- OptionMetrics - IvyDB database
- Oxford-Man Institute's realized library
- Stock and Watson (2012) - macroeconomic time series
University of Amsterdam, University of Oxford, University of Pennsylvania, Turing Institute
The research is expected to lead to two methodological and two applied papers.
In this project, we address the possibly nonlinear dependence between time series processes, which allows for more realistic and flexible modelling approaches. We propose dimension-reduction techniques and methods for inferring the structure of interactions that go beyond traditional linear approaches. By doing so, we aim to promote network-based techniques in econometrics, proposing new approaches to traditional and novel modelling setups and providing a rigorous theoretical characterization of the circumstances in which particular methods are applicable and work best.
Analysing the interactions between high-dimensional data in a network context will improve our understanding of the structural dependence of the processes involved. This will in turn lead to better insights into risk factors, their sources, and their dynamics. The proposed methodology will result in improved predictions of economic indicators and in better decisions by policy-makers, firms, and individuals as a result of more coherent knowledge of the underlying mechanisms. These improvements will have positive effects on economic and financial stability, health and well-being.
PhD candidate profile
The candidate must have a solid background in econometrics, statistics and/or computer science, is expected to be proficient in programming, and should have substantial experience with real-data analyses. Broader knowledge of finance and/or medical research would be a plus.
Prof. dr. Dick van Dijk
T: +31 (0)10 4081263
Dr. Maria Grith
T: +31 (0)10 4081339
This project is affiliated with the Tinbergen Institute graduate school; applicants for this project need to meet the Tinbergen Institute's admission requirements before they can be considered for a PhD position at ESE.
Note that the Tinbergen Institute requires valid GRE General Test results from all applicants. Be aware that available seats for this test fill up very fast, so book your test well in advance. Please contact the GRE programme for specific questions about the GRE test.
Application deadline: 15 January 2019
Apply for this project using our online application form. Please use the project code below to apply for this project.
Tinbergen project code
TI PhD 2019 ESE DvD MG