dorsal/arxiv
Information theory and learning: a physical approach
| Authors | Ilya Nemenman |
|---|---|
| Categories | physics.data-an, cond-mat.dis-nn, cs.LG, nlin.AO |
| ArXiv ID | physics/0009032 |
| URL | https://arxiv.org/abs/physics/0009032 |
Abstract
We try to establish a unified information theoretic approach to learning and to explore some of its applications. First, we define *predictive information* as the mutual information between the past and the future of a time series, discuss its behavior as a function of the length of the series, and explain how other quantities of interest studied previously in learning theory - as well as in dynamical systems and statistical mechanics - emerge from this universally definable concept. We then prove that predictive information provides the *unique measure for the complexity* of dynamics underlying the time series and show that there are classes of models characterized by *power-law growth of the predictive information* that are qualitatively more complex than any of the systems that have been investigated before. Further, we investigate numerically the learning of a nonparametric probability density, which is an example of a problem with power-law complexity, and show that the proper Bayesian formulation of this problem provides for the 'Occam' factors that punish overly complex models and thus allow one *to learn not only a solution within a specific model class, but also the class itself* using the data only and with very few a priori assumptions. We study a possible *information theoretic method* that regularizes the learning of an undersampled discrete variable, and show that learning in such a setup goes through stages of very different complexities. Finally, we discuss how all of these ideas may be useful in various problems in physics, statistics, and, most importantly, biology.
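The abstract defines predictive information as the mutual information between the past and the future of a time series. As a rough illustration of that definition only, and not a method taken from the paper, the sketch below computes it exactly for a toy model chosen here as an assumption: a stationary, symmetric binary Markov chain with flip probability `p_flip`. The function names `sequence_probability` and `predictive_information` are hypothetical.

```python
import math
from itertools import product


def sequence_probability(seq, p_flip):
    """Probability of a binary sequence under a stationary symmetric Markov chain.

    Assumption (toy model, not from the paper): each symbol differs from the
    previous one with probability p_flip, and the first symbol is uniform.
    """
    prob = 0.5  # uniform stationary distribution over {0, 1}
    for prev, curr in zip(seq, seq[1:]):
        prob *= p_flip if prev != curr else 1.0 - p_flip
    return prob


def predictive_information(p_flip, past_len, future_len):
    """Mutual information I(past; future) in bits, by exhaustive enumeration.

    The past is the first past_len symbols and the future is the next
    future_len symbols of one contiguous window.
    """
    joint = {}
    past_marginal = {}
    future_marginal = {}
    for seq in product((0, 1), repeat=past_len + future_len):
        prob = sequence_probability(seq, p_flip)
        past, future = seq[:past_len], seq[past_len:]
        joint[(past, future)] = prob
        past_marginal[past] = past_marginal.get(past, 0.0) + prob
        future_marginal[future] = future_marginal.get(future, 0.0) + prob

    info = 0.0
    for (past, future), prob in joint.items():
        if prob > 0.0:  # zero-probability terms contribute nothing
            ratio = prob / (past_marginal[past] * future_marginal[future])
            info += prob * math.log2(ratio)
    return info


if __name__ == "__main__":
    for window in (1, 2, 4):
        bits = predictive_information(0.1, window, window)
        print(f"window={window}: I(past; future) = {bits:.6f} bits")
```

For this toy chain the result should not depend on the window lengths: by the Markov property, only the symbol at the boundary carries information about the future, so the value stays at one bit minus the binary entropy of `p_flip`. That saturating case is the simple end of the spectrum; the power-law growth mentioned in the abstract is a different, richer regime that this sketch does not attempt to reproduce.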
{
"annotation_id": "68ea30ed-9577-44c2-8713-5f0c301ab024",
"date_created": "2026-03-02T18:00:32.204000Z",
"date_modified": "2026-03-02T18:00:32.204000Z",
"file_hash": "9f347f53620bb56766b31053cf0a385aae379b701836c2265b634a3d30627fb3",
"private": false,
"record": {
"abstract": "We try to establish a unified information theoretic approach to learning and\nto explore some of its applications. First, we define {\\em predictive\ninformation} as the mutual information between the past and the future of a\ntime series, discuss its behavior as a function of the length of the series,\nand explain how other quantities of interest studied previously in learning\ntheory - as well as in dynamical systems and statistical mechanics - emerge\nfrom this universally definable concept. We then prove that predictive\ninformation provides the {\\em unique measure for the complexity} of dynamics\nunderlying the time series and show that there are classes of models\ncharacterized by {\\em power-law growth of the predictive information} that are\nqualitatively more complex than any of the systems that have been investigated\nbefore. Further, we investigate numerically the learning of a nonparametric\nprobability density, which is an example of a problem with power-law\ncomplexity, and show that the proper Bayesian formulation of this problem\nprovides for the `Occam\u0027 factors that punish overly complex models and thus\nallow one {\\em to learn not only a solution within a specific model class, but\nalso the class itself} using the data only and with very few a priori\nassumptions. We study a possible {\\em information theoretic method} that\nregularizes the learning of an undersampled discrete variable, and show that\nlearning in such a setup goes through stages of very different complexities.\nFinally, we discuss how all of these ideas may be useful in various problems in\nphysics, statistics, and, most importantly, biology.",
"arxiv_id": "physics/0009032",
"authors": [
"Ilya Nemenman"
],
"categories": [
"physics.data-an",
"cond-mat.dis-nn",
"cs.LG",
"nlin.AO"
],
"title": "Information theory and learning: a physical approach",
"url": "https://arxiv.org/abs/physics/0009032"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "89751ba3-0b95-4ad3-93d8-6f3420f1a2e0",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}