Quantile Regression, LMS Method and Robust Statistics in the 21st Century
19-23 June 2006, Edinburgh

Scientific Programme and Participants

Workshop structure
The workshop will begin with Registration (09.00-09.45) on Monday 19 June and the first talk will be at 10.00. It will finish on the afternoon of Friday 23 June. Each day there will be two one-hour keynote presentations followed by discussion and contributed sessions.

Workshop topics:
Keynote speakers
Gib Bassett (Department of Finance, Illinois at Chicago)
Stef van Buuren (TNO Quality of Life in Leiden and University of Utrecht)
Brian Cade (US Geological Survey)
Probal Chaudhuri (Indian Statistical Institute)
Tim Cole (Institute of Child Health, London)
Xuming He (Department of Statistics, Illinois at Urbana-Champaign)
Chris Jones (Department of Statistics, Open University)
Roger Koenker (Department of Economics, Illinois at Urbana-Champaign)
Ivan Mizera (Department of Mathematical and Statistical Sciences, University of Alberta)
Steve Portnoy (Department of Statistics, Illinois at Urbana-Champaign)

Programme
Monday 19 June
Tuesday 20 June
Wednesday 21 June
Thursday 22 June
Friday 23 June
Confirmed Participants 7 June
Abstracts

Cade, Brian
Quantile regression applications in ecology and permutation tests for the linear model
Heterogeneous response distributions are common in statistical models applied to observational data in ecology because important, interacting biological processes often are excluded from the models. Quantile regression provides an enlightened way to think about intervals of organism response in these models by allowing us to directly estimate the heterogeneous rates of change and resulting predictions with minimal assumptions. When the focal process being modeled is a limiting factor constraining organism responses, we may be more interested in predictions and rates of change associated with one end of the [0, 1] interval of quantiles, e.g., tau greater than or equal to 0.80. I will discuss example applications that vary from small to large in sample sizes, spatial extent, and policy implications, e.g., bivalve (Macomona liliana) habitat on a tidal sand flat in a New Zealand harbor (n = 200) to endangered Cape Sable seaside sparrow (Ammodramus maritimus mirabilis) response to changing water depths in the Florida Everglades (n > 5,000). The latter application used the quantile count model of Machado and Santos Silva (2005) as an alternative to zero-inflated distributional models. Changes in timing of bird migration in northern Europe as a potential effect of decadal climate change were estimated with b-splines in a weighted linear model, where changes in earliest (tau equal to or less than 0.25) migrants had greatest import. To improve inferences for lower or upper quantiles (e.g., tau equal to or less than 0.20 or tau greater than or equal to 0.80) for small to moderate sample size (n = 20 - 300) applications, two permutation procedures were developed for testing hypotheses or estimating confidence intervals on parameters in linear quantile regression models (Cade et al. 2006, Cade and Richards 2006).
The F test is a permutation evaluation of an F-ratio version of the chi-square distributed quantile rank score test T. The D test uses an "F-ratio" like statistic that is the proportionate reduction in the sums minimized by the objective function between the reduced parameter null and full parameter alternative models. Because the D test used the magnitude of the residuals, it provided greater power (shorter CI) relative to the F and T rank score tests for some hypotheses at some sample size (n) and quantile (tau) combinations. All three tests required weighted estimates in their construction to maintain correct Type I error rates (confidence interval coverage) with heterogeneous errors. Two modifications of permuting residuals or rank scores against the full parameter design matrix (X) were required to maintain approximate exchangeability when null models were constrained through the origin (double permutation in F and D) or when null models had >1 parameter (dropping all but 1 zero residual in D).

Cai, Yuzhi
A forecasting method for quantile time series models
Quantile regression is an important statistical technique which offers a mechanism for estimating models for the conditional median function and the full range of other conditional quantile functions. Compared with the estimation of conditional mean functions, quantile regression is capable of providing a more complete statistical analysis of the stochastic relationships among random variables. Some of the work on quantile autoregression methods in time series can be found in the literature. One of the purposes of establishing a time series model is forecasting. Many forecasting methods for time series are based on mean models. Forecasting with nonlinear time series models is more involved than forecasting with linear models. In this talk we present a forecasting method for quantile time series models.

Chaudhuri, Probal
Mahalanobis' fractile graphs, monotone index models and multivariate quantiles
Fractile graphs introduced by P. C. Mahalanobis in the middle of the 20th century are nonparametric regression tools that regress the dependent variable on the fractiles (i.e., quantiles) of the independent variables. The use of fractiles of the independent variables leads to a universal distribution-free standardization device that facilitates comparison of regression functions even if the independent variables are not on comparable scales, as sometimes happens in econometric and biostatistical applications. The problem becomes challenging in multiple regression situations when there are several independent variables, and Mahalanobis had only limited success in extending the fractile curves into fractile surfaces or hyper-surfaces. Quantile regression on the other hand is a method for regressing the fractiles of the dependent variable on the independent variables. It has a fundamental connection with monotone index models that are well known in econometrics. A challenging problem there is the extension of monotone index models for multivariate response problems. In this talk, I shall explore some intriguing links between these two problems and their possible solutions using some versions of multivariate quantiles.

Chen, Colin
Automatic Bayesian quantile regression curve fitting
Quantile regression, including median regression, as a more complete statistical model than mean regression, is now well known for its widespread applications. Bayesian inference on quantile regression or Bayesian quantile regression has attracted much interest recently.
Most existing research in Bayesian quantile regression focuses on parametric quantile regression, although there are differing discussions of modeling the model error by a parametric distribution, the asymmetric Laplace distribution, or by a nonparametric alternative, the scale mixture asymmetric Laplace distribution. This paper discusses Bayesian inference for nonparametric quantile regression. The general approach to quantile regression curves is to use piecewise polynomial functions with an unknown number of knots at unknown locations, all treated as parameters to be inferred through the reversible jump Markov chain Monte Carlo (RJMCMC) of Green (1995). This method extends the work in automatic Bayesian mean curve fitting to quantile regression. The techniques can also be extended to the multivariate case by using additive models. Numerical results show that this automatic Bayesian quantile smoothing technique is competitive with quantile smoothing splines.

Cole, Tim
The LMS method - past, present and future
The LMS method is a statistical technique for constructing age-related reference ranges that allows the median, variability and skewness of the distribution at each age to vary continuously with age. It assumes an underlying normal distribution with a Box-Cox transformation, and when it was first described by Cole (JRSS A 1988) the skewness adjustment was novel. Since then the technique has been refined, notably by Cole and Green (Stat Med 1992), and the LMS method is now widely used for the construction of reference centiles, mainly in the application area of growth chart assessment. The talk will describe how the LMS method was originally developed, identifying the key factors that influenced it; how it is used now, in different application areas, and its strengths and weaknesses; and how its defining concept has since been extended e.g. by Rigby and Stasinopoulos in GAMLSS.
The similarities and differences between the LMS method and quantile regression will also be discussed. The practical basis for the original development will be emphasised using an example of height in boys during puberty.

Furno, Marilena
Parameter instability and quantile regression
Angrist et al. (2004) show that, in case of misspecification, the quantile regression (QR) provides the best linear approximation to the chosen conditional quantile using a weighted mean squared error loss function, just as ordinary least squares (OLS) is the best mean squared error linear approximation to the conditional mean under misspecification. In their empirical implementation, the authors point out an interesting rule of thumb to signal the presence of structural breaks. They look at the QR estimates at the same quantile but over different sample periods and notice that these regressions are far apart: the confidence intervals of the two estimated equations do not overlap and do not have any common region. They interpret this as evidence of the existence of a structural break. Bai (1995) provides consistency of QR estimators in the presence of structural break, even in the case of an estimated break, for both i.i.d. and n.i.i.d. errors. This result allows us to define a test for structural break based on QR estimates. We analyze the null and the alternative models, where the null imposes stability, while the alternative allows the regression coefficients to change in response to the break. The test compares the objective functions under the null and the alternative. It relies on the increase of the objective function and the worsening of the fit when unnecessary constraints are imposed, and is asymptotically distributed as an F test. This test can be extended to verify the predictive ability of a model, exclusion restrictions and other forms of misspecification, provided we can specify the model under the alternative.
An omnibus test that avoids the specification of the model under the alternative could be defined by nonparametrically computing the unrestricted model, but this is left to future research. A Monte Carlo study analyzes the behavior of the test with non-normal errors, comparing least squares and quantile regression results. The gain of QR with respect to OLS relies on its robustness, on its absence of distributional assumptions, and on its capability to focus on different quantiles. Estimation and inference based on OLS may average out important features, for instance effects that take place in the tails and that on average cancel out.
References:
Angrist, J., Chernozhukov, V., Fernandez-Val, I., 2006, "Quantile regression under misspecification, with an application to the U.S. wage structure", Econometrica.
Bai, J., 1995, "Least absolute deviation estimation of a shift", Econometric Theory, 403-436.

Gannoun, Ali
An affine equivariant estimator of conditional spatial median
An affine equivariant modification of the conditional spatial median is proposed and studied. This modification uses an adaptive transformation-retransformation (TR) procedure based on a data-driven coordinate system. This new estimate of the multivariate conditional median improves upon the performance of the nonequivariant spatial median, especially when there is correlation among the real-valued components of the vector of interest as well as when the scales of those components are different. The proposed approach is based on minimizing a loss function equivalent to that in the univariate case. We indicate how to compute the proposed estimate and study its asymptotic properties. We also suggest an adaptive procedure to select the optimal data-driven coordinate system. We discuss the performance of our estimator with the help of a finite sample simulation study and illustrate our methodology by a data-set on blood pressure measurements.

Hallin, Marc
Local linear spatial quantile regression (joint with Zudi Lu and Keming Yu)
Let $\{(Y_{\mathbf{i}}, \mathbf{X}_{\mathbf{i}}),\ \mathbf{i} \in \mathbb{Z}^N\}$ be a stationary real-valued $(d+1)$-dimensional spatial process. Denote by $\mathbf{x} \mapsto q_p(\mathbf{x})$, $p \in (0,1)$, $\mathbf{x} \in \mathbb{R}^d$, the spatial quantile regression function of order $p$, characterized by $\mathrm{P}\{Y_{\mathbf{i}} \leq q_p(\mathbf{x}) \mid \mathbf{X}_{\mathbf{i}} = \mathbf{x}\} = p$. Assume that the process has been observed over an $N$-dimensional rectangular domain of the form $\mathcal{I}_{\mathbf{n}} := \{\mathbf{i} = (i_1, \ldots, i_N) \in \mathbb{Z}^N \mid 1 \leq i_k \leq n_k,\ k = 1, \ldots, N\}$, $\mathbf{n} = (n_1, \ldots, n_N) \in \mathbb{Z}^N$. We propose a local linear kernel estimator of $q_p$, which extends to random fields with unspecified and possibly highly complex spatial dependence structure the weighted quantile regression methods considered in the context of independent samples or time series. Under mild regularity assumptions, we obtain a Bahadur representation for the estimators of $q_p$ and its derivatives, from which we establish consistency and asymptotic normality. The spatial process is assumed to satisfy some very general mixing conditions, generalizing classical time-series strong mixing concepts. The size of the rectangular domain $\mathcal{I}_{\mathbf{n}}$ is allowed to tend to infinity at different rates depending on the direction in $\mathbb{Z}^N$ (non-isotropic asymptotics). The method provides much richer information than the traditional mean regression approach.

Haupt, Harry
Quantile regression asymptotics for dependent data (joint with Walter Oberhofer)
We derive the consistency and asymptotic normality of the nonlinear quantile regression estimator with weakly dependent errors. The required assumptions are weak, and it is neither assumed that the error process is stationary nor that it is mixing.
In fact, the notion of weak dependence introduced in this paper can be considered in part as a quantile-specific local variant of recently introduced concepts. There are obvious connections of the derived asymptotic results to corresponding, well-known results from generalized least squares estimation.

He, Xuming
Power transformation towards a linear regression quantile
We consider a power transformation towards a linear quantile regression model. Like the classical Box-Cox transformation, this approach extends the applicability of linear models without resorting to nonparametric smoothing, yet transformations on the quantile models are more natural due to the equivariance property of the quantiles under monotone transformations. We propose an estimation procedure and establish its consistency and asymptotic normality under regularity conditions. The objective function employed in the estimation can also be used to check inadequacy of a power-transformed linear quantile regression model and to obtain inference on the transformation parameter. The proposed approach is shown to be valuable through illustrative examples. The talk is based on joint work with Yunming Mu at Texas A&M University.

Heiler, Siegfried
Local quantile estimation with examples and discussion of the bandwidth selection problem
Two general approaches for local (polynomial) quantile regression are discussed and illustrated with a few examples on daily finance data. The choice of bandwidths for iid data sets is then discussed in some detail. There, ideas developed for plug-in estimators in local linear regression are transferred to the case of local quantile regression. They seem to work well in practical applications.

Jones, Chris
Parametric families of distributions and their interaction with the Workshop title
I intend to start with a bit of distribution theory and then discuss some of its potential practical ramifications.
I will describe a couple of novel ways of generating three- and/or four-parameter families of continuous univariate distributions. These will afford skewness and a variety of tail weights, heavy if required, with obvious consequences for providing alternatives to more ad hoc methods of robust statistics. I will then consider some aspects of the interaction between these distributions and quantile estimation and regression. At the time of writing this abstract, I expect to be able to link kernel quantile estimation and regression in with my parametric ramblings. Permeating the talk will probably be one particular underused, but far from novel, family of distributions, the log F distributions. Look out too for special guest star appearances by a possibly surprising specific distribution.

Jureckova, Jana
Regression rank scores in nonlinear models
Consider the nonlinear regression model $Y_i = g(x_i, \theta) + e_i$, $i = 1, \ldots, n$ (1), with $x_i \in \mathbb{R}^k$, $\theta = (\theta_0, \theta_1, \ldots, \theta_p) \in \Theta$ (compact in $\mathbb{R}^{p+1}$), where $g(x, \theta) = \theta_0 + g(x, \theta_1, \ldots, \theta_p)$ is continuous, twice differentiable in $\theta$ and monotone in components of $\theta$. Following Gutenbrunner and Jurecková (1992) and Jurecková and Procházka (1994), we introduce regression rank scores for model (1), and consider their asymptotic behaviour under some regularity conditions. We shall consider some tests in models with a nuisance nonlinear regression.

Knight, Keith
How many regression quantile breakpoints are there?
Portnoy (1991) showed that the number of regression quantile breakpoints is $O(n \ln(n))$ where $n$ is the number of observations. In this talk, we will consider the limiting distribution of the length of the interval on which the regression quantile estimator remains constant; this limiting distribution depends on the limit of the empirical distribution of the design.
This result leads to a conjecture on the asymptotic normality of the number of regression quantile breakpoints.

Koenker, Roger
Quantile autoregression
We consider quantile autoregression (QAR) models in which the autoregressive coefficients can be expressed as monotone functions of a single, scalar random variable. The models can capture systematic influences of conditioning variables on the location, scale and shape of the conditional distribution of the response, and therefore constitute a significant extension of classical constant coefficient linear time series models in which the effect of conditioning is confined to a location shift. The models may be interpreted as a special case of the general random coefficient autoregression model with strongly dependent coefficients. Statistical properties of the proposed model and associated estimators are studied. The limiting distributions of the autoregression quantile process are derived. Quantile autoregression inference methods are also investigated. Empirical applications of the model to the U.S. unemployment rate and U.S. gasoline prices highlight the potential of the model.

Lee, Simon
Nonparametric instrumental variables estimation of a quantile regression model
We consider nonparametric estimation of a regression function that is identified by requiring a specified quantile of the regression "error" conditional on an instrumental variable to be zero. The resulting estimating equation is a nonlinear integral equation of the first kind, which generates an ill-posed inverse problem. The integral operator and distribution of the instrumental variable are unknown and must be estimated nonparametrically. We show that the estimator is mean-square consistent, derive its rate of convergence in probability, and give conditions under which this rate is optimal in a minimax sense. The results of Monte Carlo experiments show that the estimator behaves well in finite samples.

Machado, José
Identifying asset price booms and busts with quantile regressions
This paper presents a methodology for detecting asset price booms and busts using non-parametric quantile regressions. The method consists in estimating the distribution of real stock prices as a function of fundamental determinants of stock returns, namely real economic activity and real interest rates. It is shown that changes in fundamentals affect not only the location but also the shape of the conditional distribution of stock prices. Asset price booms and busts are identified as realizations on the tails of that distribution. Then we use indicators to analyse the behaviour of money and credit around the boom and bust episodes.

Mizera, Ivan
Nonparametric regression quantiles: thou shalt not cross?
Contrary to sometimes propagated impressions, there is not one, but many ways to fit nonparametric quantile regressions; quite a few standard approaches can be "quantilified", and many of the resulting prescriptions are effectively computable too. The question is not so much how to do it, but how to do it and achieve something reasonable. The guidelines from the standard nonparametric regression methodology might be useful here, were there not a schism that offers rather useless theory on one side, and rather subjective practical criteria on the other; nevertheless, a tolerant attitude admitting a potential multitude of objectives, expressed through various methods, is possible too. After all, the quantile context wouldn't have much else to add, except for one detail - important to some: that quantile lines for different quantile indices may intersect. The talk will review some existing and potential solutions that address this aspect.

Ng, Pin
A fast and efficient implementation of qualitatively constrained quantile smoothing splines
Exploiting the sparse structure of the design matrices involved in the Frisch-Newton method, we implement a fast and efficient algorithm to compute qualitatively constrained smoothing and regression splines for quantile regression. In a previous implementation (He and Ng, 1999), the linear program involved was solved using the non-simplex active set algorithm for quantile smoothing splines proposed by Ng (1996). The current implementation uses the Frisch-Newton algorithm described in Koenker and Ng (2005a, 2005b). It is a variant of the interior-point algorithm proposed by Portnoy and Koenker (1997) which has been shown to outperform the simplex method in many applications. The current implementation relies on the R package SparseM of Koenker and Ng (2003), which contains a collection of basic linear algebra routines for sparse matrices, to exploit the sparse structure of the matrices involved in the linear program and so further speed up computation and reduce memory usage. A small simulation illustrates the superior performance of the new implementation.

Portnoy, Steve and Neocleous, Tereza
Recent advances in quantile regression for survival analysis
The JASA paper (2003, p. 1001) introduced a new approach to analyzing censored data using regression quantiles. Unlike the Cox proportional hazards model (imposing global structure), the Censored Regression Quantile approach allows local quantile effects to be analyzed under the usual assumption that censoring times are conditionally independent of responses given the covariates. Recent work has extended the basic ideas to partly linear models and to doubly censored cases. Some improved algorithms and new asymptotic results are also available. These developments will be sketched and new directions suggested.

Stasinopoulos, Mikis
Modelling skew and kurtotic data using GAMLSS
Bob Rigby and Mikis Stasinopoulos
Here we model the distribution of a dependent variable Y [given the value(s) of explanatory variable(s)] generally using a four parameter distribution, where the four parameters may relate to location, scale, skewness and kurtosis respectively. Each of the four distribution parameters is modelled using parametric and/or additive smooth nonparametric function(s) of the explanatory variable(s). Centiles of Y are computed from the distribution, providing smooth centile curves for Y as functions of the explanatory variables. The models are special cases of the generalized additive model for location, scale and shape, GAMLSS (Rigby and Stasinopoulos, 2005), which is implemented as a package in the R language, R Development Core Team (2005). The latest version is gamlss1.1-0, Stasinopoulos, Rigby and Akantziliotou (2006a, 2006b). Specific four parameter continuous distributions considered include the Box-Cox power exponential (Rigby and Stasinopoulos, 2004) and Box-Cox t (Rigby and Stasinopoulos, 2006) distributions, providing generalizations of the LMS method of centile estimation (Cole, 1988 and Cole and Green, 1992) allowing modelling of kurtosis as well as skewness, and denoted LMSP and LMST methods respectively. Other four parameter distributions considered include the Johnson Su (Johnson, 1949), skew exponential power (DiCiccio and Monti, 2004), skew t (Jones and Faddy, 2003) and sinh-arcsinh (Jones, 2005) distributions. A family of mixed Poisson discrete distributions is considered for modelling overdispersed Poisson count data, including the three parameter Sichel and Delaporte distributions and the four parameter shifted generalized inverse Gaussian Poisson distribution (Rigby, Stasinopoulos and Akantziliotou, 2006). Modelling the skewness and kurtosis can provide a robust method of fitting the location and scale.
Model comparison and diagnostics are also considered.

Taylor, James
Estimating value at risk and expected shortfall using expectiles
Value at risk (VaR) and expected shortfall (ES) are measures of financial market risk. VaR is a tail quantile of the conditional distribution of returns, and ES is the conditional expectation of the returns that exceed the VaR. In this paper, we avoid distributional assumptions by estimating VaR and ES using asymmetric least squares (ALS) regression, which is the least squares analogue of quantile regression. The ALS solution is known as an expectile. In view of the existence of a one-to-one mapping from expectiles to quantiles, it has been proposed that the theta quantile be estimated by the expectile for which the proportion of in-sample observations lying below the expectile is theta. In this way, an expectile can be used to estimate VaR. We show that the corresponding ES estimator is a simple function of the expectile. As the basis for conditional VaR and ES modelling, we introduce two new classes of univariate expectile models: conditional autoregressive expectiles and exponentially weighted ALS. Empirical results indicate that the new expectile-based methods compare well with the established VaR and ES methods.

van Buuren, Stef
Statistical methods for growth diagrams
Growth diagrams are widely used in child health to monitor growth and development of children. This lecture provides an overview of various types of growth diagrams (distance, velocity, conditional), describes ways to evaluate the fit between the diagram and the data (Q-statistics, worm plot), and outlines methods for investigating the diagnostic performance of referral rules based on diagrams (sensitivity, specificity, time to detect).

Wade, Angie
Using correlated data to create age-related standards
Repeat measurements from individuals are often correlated.
Where the number of measurements per individual is variable and/or related to the value of those measurements, a biased dataset may be obtained. We have modified a maximum-likelihood based LMS approach to incorporate and model correlations between measurements. Using this technique we are able to use longitudinal data in the creation of cross-sectional age-related reference ranges to make adjustment for bias in the dataset. Data simulation is used to illustrate the effectiveness of the process when the degree of bias within the data generation process is large.

Wei, Ying
Bivariate time-dependent growth charts based on quantile regression approach
Growth charts have been widely used in clinics and medical centers to monitor an individual's growth status in the context of population values. Typical growth charts consider only one measurement at a time, although it is well recognized that more informative readings can be obtained by considering multiple measurements simultaneously. We propose to construct bivariate growth charts by a nested sequence of time-dependent reference quantile contours on the joint distribution of bivariate measurements. A two stage method based on quantile regression is proposed to estimate such time-dependent bivariate growth charts from reference data with possibly irregular measurement times. The method is also flexible enough to include, whenever necessary, potentially important covariates. The performance of the proposed methodology was demonstrated by a Monte-Carlo simulation study, as well as an application to height-weight screening of young children in the United States.

Yang, Sanchao
Nonparametric estimation of expected shortfall (Shanchao Yang and Keming Yu)
Since the concept of expected shortfall (ES) was proposed by Artzner et al. (1997) and followed by Artzner et al. (1999), it has attracted wide attention and has been developed in both market practice and theory.
Some scholars have presented nonparametric estimators of expected shortfall based on the structure of expectation, such as Scaillet (2004), further studied by Song (2005). However, it is now well known that ES can be expressed as an integral of quantiles (Acerbi and Tasche, 2001). This expression provides useful mathematical tractability for studying the analytical properties of ES. Moreover, we are able to introduce conditional expected shortfall and investigate expected shortfall conditional on "information" such as interest rates. In this paper, we propose a nonparametric estimator of ES based on the integral structure. We first illustrate a problem in analyzing the mean square error of the direct "plug-in" estimator. Then we present an adjusted expected shortfall and its nonparametric estimator. Some simulations show that the estimator has small error and a rapid rate of convergence.