Template-type: ReDIF-Paper 1.0
Author-Name: Jennifer L. Castle
Author-Workplace-Name: Dept of Economics, Institute for New Economic Thinking at the Oxford Martin School and Magdalen College, University of Oxford
Author-Email: jennifer.castle@magd.ox.ac.uk
Author-Name: Jurgen A. Doornik
Author-Workplace-Name: Dept of Economics, Institute for New Economic Thinking at the Oxford Martin School and Climate Econometrics, Nuffield College, University of Oxford
Author-Email: jurgen.doornik@nuffield.ox.ac.uk
Author-Name: David F. Hendry
Author-Workplace-Name: Dept of Economics, Institute for New Economic Thinking at the Oxford Martin School and Climate Econometrics, Nuffield College, University of Oxford
Author-Email: david.hendry@nuffield.ox.ac.uk
Title: Robust Discovery of Regression Models
Abstract: Since complete and correct a priori specifications of models for observational data never exist, model selection is unavoidable in that context. The target of selection needs to be the process generating the data for the variables under analysis, while retaining the objective of the study, often a theory-based formulation. Successful selection requires robustness against many potential problems jointly, including outliers and shifts; omitted variables; incorrect distributional shape; non-stationarity; misspecified dynamics; and non-linearity, as well as inappropriate exogeneity assumptions. The aim is to seek parsimonious final representations that retain the relevant information, are well specified, encompass alternative models, and evaluate the validity of the study. Our approach to doing so inevitably leads to more candidate variables than observations, handled by iteratively switching between contracting and expanding multi-path searches, here programmed in Autometrics.
We investigate the ability of indicator saturation to discriminate between measurement errors and outliers, between outliers and large observations arising from non-linear responses (illustrated by artificial data), and apparent outliers due to alternative distributional assumptions. We illustrate the approach by exploring empirical models of the Boston housing market and inflation for the UK (both tackling outliers and non-linearities that can distort other estimation methods). We re-analyze the ‘local instability’ in the robust method of least median of squares shown by Hettmansperger and Sheather (1992) using indicator saturation to explain their findings.
Classification-JEL: C51, C22
Keywords: Model Selection; Robustness; Outliers; Location Shifts; Indicator Saturation; Autometrics.
Length: 33 pages
Creation-Date: 2020-04-15
Number: 2020-W04
File-URL: https://www.nuffield.ox.ac.uk/economics/Papers/2020/2020W04_RobustDiscovery.pdf
File-Format: application/pdf
Handle: RePEc:nuf:econwp:2004