BORIS Theses

Bern Open Repository and Information System

Set-valued Data: Regression, Design and Outliers

Li, Qiyu (2021). Set-valued Data: Regression, Design and Outliers. (Thesis). Universität Bern, Bern

21li_q.pdf - Thesis
Available under License Creative Commons: Attribution-Noncommercial-No Derivative Works (CC-BY-NC-ND 4.0).


The focus of this dissertation is the study of set-valued data from three aspects: regression, optimal design and outlier identification. The dissertation consists of three peer-reviewed published articles, each addressing one aspect. Their titles and abstracts are listed below.

1. Local regression smoothers with set-valued outcome data: This paper proposes a method to conduct local linear regression smoothing in the presence of set-valued outcome data. The proposed estimator is shown to be consistent, and its mean squared error and asymptotic distribution are derived. A method to build error tubes around the estimator is provided, and a small Monte Carlo exercise is conducted to confirm the good finite-sample properties of the estimator. The usefulness of the method is illustrated on a novel dataset from a clinical trial to assess the effect of the expression of certain genes on the outcomes of different lung cancer treatments.

2. Optimal design for multivariate multiple linear regression with set-identified response: We consider the partially identified regression model with set-identified responses, where the estimator is the set of the least squares estimators obtained for all possible choices of points sampled from set-identified observations. We address the issue of determining the optimal design for this case and show that, for objective functions mimicking those for several classical optimal designs, their set-identified analogues coincide with the optimal designs for point-identified real-valued responses.

3. Depth and outliers for samples of sets and random sets distributions: We suggest several constructions suitable to define the depth of set-valued observations with respect to a sample of convex sets or with respect to the distribution of a random closed convex set. With a concept of depth, it is possible to determine whether a given convex set should be regarded as an outlier with respect to a sample of convex closed sets. Some of our constructions are motivated by the known concepts of half-space depth and band depth for function-valued data. A novel construction derives the depth from a family of non-linear expectations of random sets. Furthermore, we address the role of the positions of sets in evaluating their depth. Two case studies concern interval regression for Greek wine data and detection of outliers in a sample of particles.

Item Type: Thesis
Dissertation Type: Cumulative
Date of Defense: 5 March 2021
Subjects: 300 Social sciences, sociology & anthropology > 310 Statistics; 500 Science > 510 Mathematics
Institute / Center: 08 Faculty of Science > Department of Mathematics and Statistics > Institute of Mathematical Statistics and Actuarial Science
Depositing User: Hammer Igor
Date Deposited: 27 May 2021 17:00
Last Modified: 05 Mar 2022 01:30
