Outlier Robust Multivariate Polynomial Regression

Arora, Vipul; Bhattacharyya, Arnab; Boban, Mathews; Guruswami, Venkatesan; Kelman, Esty

doi:10.4230/LIPIcs.ESA.2024.12

Abstract

We study the problem of robust multivariate polynomial regression: let p: ℝⁿ → ℝ be an unknown n-variate polynomial of degree at most d in each variable. We are given as input a set of random samples (𝐱_i,y_i) ∈ [-1,1]ⁿ × ℝ that are noisy versions of (𝐱_i,p(𝐱_i)). More precisely, each 𝐱_i is sampled independently from some distribution χ on [-1,1]ⁿ, and for each i independently, y_i is arbitrary (i.e., an outlier) with probability at most ρ < 1/2, and otherwise satisfies |y_i-p(𝐱_i)| ≤ σ. The goal is to output a polynomial p̂, of degree at most d in each variable, within an 𝓁_∞-distance of at most O(σ) from p.
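The sampling model above can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code: the helper names `eval_poly` and `sample_noisy`, the coefficient-tensor representation, the choice of the uniform distribution for χ, and the way outlier values are drawn are all assumptions made for illustration (in the model, an outlier value may be arbitrary, and the outlier probability is only bounded by ρ).

```python
import numpy as np

def eval_poly(coeffs, X):
    """Evaluate p at each row of X.

    coeffs is an n-dimensional array of shape (d+1, ..., d+1);
    coeffs[k1, ..., kn] multiplies x1**k1 * ... * xn**kn, so p has
    degree at most d in each variable.
    """
    m, n = X.shape
    assert coeffs.ndim == n
    d = coeffs.shape[0] - 1
    out = np.broadcast_to(coeffs, (m,) + coeffs.shape)
    for j in range(n):
        V = X[:, j, None] ** np.arange(d + 1)       # (m, d+1) powers of x_j
        out = np.einsum('mk,mk...->m...', V, out)   # contract one variable
    return out

def sample_noisy(coeffs, m, rho, sigma, rng):
    """Draw m samples (x_i, y_i) under the model in the abstract.

    Assumptions for illustration: chi is uniform on [-1, 1]^n, inlier
    noise is uniform in [-sigma, sigma], and outliers are large Gaussian
    values (any other choice is equally allowed by the model).
    """
    n = coeffs.ndim
    X = rng.uniform(-1.0, 1.0, size=(m, n))
    y = eval_poly(coeffs, X) + rng.uniform(-sigma, sigma, size=m)
    is_outlier = rng.random(m) < rho                # here: exactly rho
    y[is_outlier] = rng.normal(scale=100.0, size=is_outlier.sum())
    return X, y, is_outlier

rng = np.random.default_rng(0)
coeffs = rng.normal(size=(3, 3))                    # n = 2, d = 2
X, y, is_outlier = sample_noisy(coeffs, m=200, rho=0.2, sigma=0.1, rng=rng)
```

A regression algorithm sees only `X` and `y`; the mask `is_outlier` is returned here purely so the simulation can be checked.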

Kane, Karmalkar, and Price [FOCS'17] solved this problem for n = 1. We generalize their results to the n-variate setting, showing an algorithm that achieves a sample complexity of O_n(dⁿlog d), where the hidden constant depends on n, if χ is the n-dimensional Chebyshev distribution. The sample complexity is O_n(d^{2n}log d), if the samples are drawn from the uniform distribution instead. The approximation error is guaranteed to be at most O(σ), and the run-time depends on log(1/σ). In the setting where each 𝐱_i and y_i are known up to N bits of precision, the run-time’s dependence on N is linear. We also show that our sample complexities are optimal in terms of dⁿ. Furthermore, we show that it is possible to have the run-time be independent of 1/σ, at the cost of a higher sample complexity.
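To illustrate the two sampling distributions and the gap between the two bounds, here is a minimal sketch. It assumes the n-dimensional Chebyshev distribution is the product of one-dimensional Chebyshev (arcsine) distributions with density 1/(π√(1−x²)) on [-1,1], which can be sampled as cos(πU) for U uniform on [0,1]. The function names are hypothetical, and `growth_rates` returns only the leading terms dⁿ log d and d^{2n} log d with the n-dependent constants hidden by O_n omitted, so the values are growth rates, not usable sample counts.

```python
import math
import numpy as np

def sample_chebyshev(m, n, rng):
    """m points from the (assumed product) Chebyshev distribution on [-1, 1]^n."""
    return np.cos(np.pi * rng.random((m, n)))

def sample_uniform(m, n, rng):
    """m points from the uniform distribution on [-1, 1]^n."""
    return rng.uniform(-1.0, 1.0, size=(m, n))

def growth_rates(d, n):
    """Leading terms of the two sample-complexity bounds (constants omitted)."""
    chebyshev = d**n * math.log(d)
    uniform = d**(2 * n) * math.log(d)
    return chebyshev, uniform

rng = np.random.default_rng(1)
cheb = sample_chebyshev(100_000, 1, rng)
unif = sample_uniform(100_000, 1, rng)

# Chebyshev samples concentrate near the endpoints of [-1, 1].
edge = 1.0 / math.sqrt(2.0)
frac_cheb = np.mean(np.abs(cheb) > edge)   # about 1/2 in expectation
frac_unif = np.mean(np.abs(unif) > edge)   # about 1 - 1/sqrt(2) in expectation

cheb_rate, unif_rate = growth_rates(d=4, n=2)
```

The extra mass that the Chebyshev distribution places near the boundary of the cube is consistent with the smaller bound stated for it; the ratio of the two leading terms is exactly dⁿ.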

Cite As

Vipul Arora, Arnab Bhattacharyya, Mathews Boban, Venkatesan Guruswami, and Esty Kelman. Outlier Robust Multivariate Polynomial Regression. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 12:1-12:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/LIPIcs.ESA.2024.12

Author Details

Vipul Arora

School of Computing, National University of Singapore, Singapore

Arnab Bhattacharyya

School of Computing, National University of Singapore, Singapore

Mathews Boban

School of Computing, National University of Singapore, Singapore

Venkatesan Guruswami

Department of EECS, and Department of Mathematics, University of California, Berkeley, CA, USA

Esty Kelman

CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA
Department of Computer Science, and Faculty of Computing & Data Sciences, Boston University, MA, USA

Funding

Arora, Vipul: Supported in part by NRF-AI Fellowship R-252-100-B13-281.
Bhattacharyya, Arnab: Supported in part by NRF-AI Fellowship R-252-100-B13-281, Amazon Faculty Research Award, and Google South & Southeast Asia Research Award.
Boban, Mathews: Supported in part by NRF-AI Fellowship R-252-100-B13-281.
Guruswami, Venkatesan: Supported in part by NSF CCF-2211972 and a Simons Investigator Award.
Kelman, Esty: Supported in part by an Amazon Faculty Research Award to AB, in part by ERC grant 834735, and in part by NSF TRIPODS program (award DMS-2022448).

Acknowledgements

The authors would like to thank Yuval Filmus for fruitful discussions about some aspects of the robust regression problem.

References

Sanjeev Arora and Subhash Khot. Fitting algebraic curves to noisy data. Journal of Computer and System Sciences, 67(2):325-340, 2003. Special Issue on STOC 2002. URL: https://doi.org/10.1016/S0022-0000(03)00012-6.
Vipul Arora, Arnab Bhattacharyya, Mathews Boban, Venkatesan Guruswami, and Esty Kelman. Outlier Robust Multivariate Polynomial Regression, 2024. URL: https://arxiv.org/abs/2403.09465.
Hadassa Daltrophe, Shlomi Dolev, and Zvi Lotker. Big data interpolation using functional representation. Acta Informatica, 55:213-225, 2018. URL: https://doi.org/10.1007/s00236-016-0288-8.
Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Jacob Steinhardt, and Alistair Stewart. Sever: A robust meta-algorithm for stochastic optimization. In International Conference on Machine Learning, pages 1596-1606. PMLR, 2019. URL: https://arxiv.org/abs/1803.02815.
Ilias Diakonikolas, Weihao Kong, and Alistair Stewart. Efficient algorithms and lower bounds for robust linear regression. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2745-2754. SIAM, 2019. URL: https://arxiv.org/abs/1806.00040.
V. Guruswami and D. Zuckerman. Robust Fourier and Polynomial Curve Fitting. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 751-759, Los Alamitos, CA, USA, October 2016. IEEE Computer Society. URL: https://doi.org/10.1109/FOCS.2016.75.
Helmut. Norms on 𝒫_N Vector Space of Polynomials up to Order N. Mathematics Stack Exchange. URL: https://math.stackexchange.com/q/2693954.
Daniel Kane, Sushrut Karmalkar, and Eric Price. Robust Polynomial Regression up to the Information Theoretic Limit. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 391-402, 2017. URL: https://doi.org/10.1109/FOCS.2017.43.
Adam R. Klivans, Pravesh K. Kothari, and Raghu Meka. Efficient algorithms for outlier-robust regression. In Sébastien Bubeck, Vianney Perchet, and Philippe Rigollet, editors, Conference On Learning Theory, COLT 2018, Stockholm, Sweden, 6-9 July 2018, volume 75 of Proceedings of Machine Learning Research, pages 1420-1430. PMLR, 2018. URL: http://proceedings.mlr.press/v75/klivans18a.html.
Andrey Andreyevich Markov. On a question by D. I. Mendeleev. Zap. Imp. Akad. Nauk. St. Petersburg, 62:1-24, 1890. URL: https://history-of-approximation-theory.com/fpapers/markov4.pdf.
Paul G. Nevai. Bernstein's inequality in L^p for 0 < p < 1. Journal of Approximation Theory, 27(3):239-243, 1979. URL: https://doi.org/10.1016/0021-9045(79)90105-9.
Adarsh Prasad, Arun Sai Suggala, Sivaraman Balakrishnan, and Pradeep Ravikumar. Robust Estimation via Robust Gradient Estimation. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82(3):601-627, 2020. URL: https://arxiv.org/abs/1802.06485.
John Wolberg. Data analysis using the method of least squares: extracting the most information from experiments. Springer Science & Business Media, 2006. URL: https://doi.org/10.1007/3-540-31720-1.
Achim Zielesny. From curve fitting to machine learning, volume 18. Springer, 2011. URL: https://doi.org/10.1007/978-3-319-32545-3.
