dorsal/arxiv
Optimal Recovery of Local Truth
| Authors | Carlos C. Rodriguez |
|---|---|
| Categories | physics.data-an |
| ArXiv ID | physics/0010063 |
| URL | https://arxiv.org/abs/physics/0010063 |
| DOI | 10.1063/1.1381851 |
Abstract
Probability mass curves the data space with horizons. Let f be a multivariate probability density function with continuous second order partial derivatives. Consider the problem of estimating the true value of f(z) > 0 at a single point z, from n independent observations. It is shown that the fastest possible estimators (like the k-nearest neighbor and kernel) have minimum asymptotic mean square errors when the space of observations is thought of as conformally curved. The optimal metric is shown to be generated by the Hessian of f in the regions where the Hessian is definite. Thus, the peaks and valleys of f are surrounded by singular horizons when the Hessian changes signature from Riemannian to pseudo-Riemannian. Adaptive estimators based on the optimal variable metric show considerable theoretical and practical improvements over traditional methods. The formulas simplify dramatically when the dimension of the data space is 4. The similarities with General Relativity are striking but possibly illusory at this point. However, these results suggest that nonparametric density estimation may have something new to say about current physical theory.
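To make the estimators named in the abstract concrete, here is a minimal sketch of a k-nearest-neighbor density estimate at a single point z, with distances measured by a fixed positive-definite matrix A. This is an illustration under stated assumptions, not the paper's adaptive estimator: the function name `knn_density`, its parameters, and the choice of a constant A (standing in for a metric such as one derived from the Hessian of f where it is definite) are all hypothetical.

```python
import math
import numpy as np


def knn_density(z, data, k, A=None):
    """k-NN density estimate at z under the metric d(x, z)^2 = (x - z)^T A (x - z).

    A is assumed symmetric positive definite; A=None means the Euclidean metric.
    Hypothetical helper for illustration only.
    """
    data = np.asarray(data, dtype=float)
    z = np.asarray(z, dtype=float)
    n, d = data.shape
    if A is None:
        A = np.eye(d)

    # Squared distance from every observation to z under the metric A.
    diff = data - z
    sq_dist = np.einsum("ij,jk,ik->i", diff, A, diff)

    # Radius of the smallest metric ball around z containing k observations.
    r_k = math.sqrt(np.partition(sq_dist, k - 1)[k - 1])

    # Euclidean volume of the ellipsoid {x : d(x, z) <= r_k}.
    unit_ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    volume = unit_ball * r_k ** d / math.sqrt(np.linalg.det(A))

    # Fraction of the sample in the ball, divided by the ball's volume.
    return k / (n * volume)
```

Calling `knn_density(z, data, k)` without `A` gives the ordinary Euclidean k-NN estimate; passing a different positive-definite `A` changes the shape of the neighborhood around z, which is the degree of freedom the abstract's optimal metric acts on.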
{
"annotation_id": "552d3ed7-cb41-4d91-a3bf-ace1cb68f9f5",
"date_created": "2026-03-02T18:00:32.241000Z",
"date_modified": "2026-03-02T18:00:32.241000Z",
"file_hash": "2bd39e94afe705c16acf4be0366765ce1972a26827509daab46fbd6ba50b64a1",
"private": false,
"record": {
"abstract": "Probability mass curves the data space with horizons. Let f be a multivariate\nprobability density function with continuous second order partial derivatives.\nConsider the problem of estimating the true value of f(z) \u003e 0 at a single point\nz, from n independent observations. It is shown that, the fastest possible\nestimators (like the k-nearest neighbor and kernel) have minimum asymptotic\nmean square errors when the space of observations is thought as conformally\ncurved. The optimal metric is shown to be generated by the Hessian of f in the\nregions where the Hessian is definite. Thus, the peaks and valleys of f are\nsurrounded by singular horizons when the Hessian changes signature from\nRiemannian to pseudo-Riemannian. Adaptive estimators based on the optimal\nvariable metric show considerable theoretical and practical improvements over\ntraditional methods. The formulas simplify dramatically when the dimension of\nthe data space is 4. The similarities with General Relativity are striking but\npossibly illusory at this point. However, these results suggest that\nnonparametric density estimation may have something new to say about current\nphysical theory.",
"arxiv_id": "physics/0010063",
"authors": [
"Carlos C. Rodriguez"
],
"categories": [
"physics.data-an"
],
"doi": "10.1063/1.1381851",
"title": "Optimal Recovery of Local Truth",
"url": "https://arxiv.org/abs/physics/0010063"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "678f8280-19e7-47f8-a436-4f77f888fb78",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}