,
Arnab Bhattacharyya, Mathews Boban, Venkatesan Guruswami, Esty Kelman
Creative Commons Attribution 4.0 International license
We study the problem of robust multivariate polynomial regression: let p: ℝⁿ → ℝ be an unknown n-variate polynomial of degree at most d in each variable. We are given as input a set of random samples (𝐱_i,y_i) ∈ [-1,1]ⁿ × ℝ that are noisy versions of (𝐱_i,p(𝐱_i)). More precisely, each 𝐱_i is sampled independently from some distribution χ on [-1,1]ⁿ, and for each i independently, y_i is arbitrary (i.e., an outlier) with probability at most ρ < 1/2, and otherwise satisfies |y_i-p(𝐱_i)| ≤ σ. The goal is to output a polynomial p̂, of degree at most d in each variable, within an 𝓁_∞-distance of at most O(σ) from p.
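As an illustration of the sampling model described above, the following Python sketch generates noisy samples (𝐱_i, y_i). It is a minimal sketch under stated assumptions, not the authors' code: it assumes the n-dimensional Chebyshev distribution is the product of one-dimensional Chebyshev (arcsine) distributions on [-1,1], it draws outliers at random with probability exactly ρ (the model allows arbitrary outliers with probability at most ρ), and the names `sample_chebyshev_point`, `make_samples`, `outlier_scale`, and the target polynomial `p` are hypothetical.

```python
import math
import random


def sample_chebyshev_point(n, rng):
    """Draw one point from [-1, 1]^n.

    Assumption: each coordinate follows the 1-D Chebyshev (arcsine) law,
    obtained as cos(pi * U) with U uniform on [0, 1).
    """
    return [math.cos(math.pi * rng.random()) for _ in range(n)]


def make_samples(p, n, m, rho, sigma, rng, outlier_scale=100.0):
    """Generate m samples (x, y, is_outlier) from the noisy model.

    With probability rho the label is an outlier (here: a random value in
    [-outlier_scale, outlier_scale], one simple choice among many);
    otherwise |y - p(x)| <= sigma.
    """
    samples = []
    for _ in range(m):
        x = sample_chebyshev_point(n, rng)
        if rng.random() < rho:
            y = rng.uniform(-outlier_scale, outlier_scale)
            is_outlier = True
        else:
            y = p(x) + rng.uniform(-sigma, sigma)
            is_outlier = False
        samples.append((x, y, is_outlier))
    return samples


def p(x):
    """Hypothetical target: degree at most 2 in each of n = 2 variables."""
    return 1.0 + x[0] * x[1] - 0.5 * x[0] ** 2 * x[1] ** 2


rng = random.Random(0)
data = make_samples(p, n=2, m=1000, rho=0.2, sigma=0.01, rng=rng)
```

The `is_outlier` flag is kept only so the generated data can be inspected; a regression algorithm in this setting would see just the pairs (𝐱_i, y_i).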
Kane, Karmalkar, and Price [FOCS'17] solved this problem for n = 1. We generalize their results to the n-variate setting, showing an algorithm that achieves a sample complexity of O_n(dⁿ log d), where the hidden constant depends on n, if χ is the n-dimensional Chebyshev distribution. The sample complexity is O_n(d^{2n} log d) if the samples are drawn from the uniform distribution instead. The approximation error is guaranteed to be at most O(σ), and the run-time depends on log(1/σ). In the setting where each 𝐱_i and y_i are known up to N bits of precision, the run-time’s dependence on N is linear. We also show that our sample complexities are optimal in terms of dⁿ. Furthermore, we show that it is possible to have the run-time be independent of 1/σ, at the cost of a higher sample complexity.
@InProceedings{arora_et_al:LIPIcs.ESA.2024.12,
author = {Arora, Vipul and Bhattacharyya, Arnab and Boban, Mathews and Guruswami, Venkatesan and Kelman, Esty},
title = {{Outlier Robust Multivariate Polynomial Regression}},
booktitle = {32nd Annual European Symposium on Algorithms (ESA 2024)},
pages = {12:1--12:17},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-338-6},
ISSN = {1868-8969},
year = {2024},
volume = {308},
editor = {Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.12},
URN = {urn:nbn:de:0030-drops-210830},
doi = {10.4230/LIPIcs.ESA.2024.12},
annote = {Keywords: Robust Statistics, Polynomial Regression, Sample Efficient Learning}
}