dorsal/arxiv
Predictability, complexity and learning
| Field | Value |
|---|---|
| Authors | William Bialek, Ilya Nemenman, Naftali Tishby |
| Categories | physics.data-an, cond-mat.dis-nn, cond-mat.other, cs.LG, nlin.AO, q-bio.OT |
| ArXiv ID | physics/0007070 |
| URL | https://arxiv.org/abs/physics/0007070 |
| Journal | Neural Computation 13, 2409-2463 (2001) |
Abstract
We define *predictive information* $I_{\rm pred}(T)$ as the mutual information between the past and the future of a time series. Three qualitatively different behaviors are found in the limit of large observation times $T$: $I_{\rm pred}(T)$ can remain finite, grow logarithmically, or grow as a fractional power law. If the time series allows us to learn a model with a finite number of parameters, then $I_{\rm pred}(T)$ grows logarithmically with a coefficient that counts the dimensionality of the model space. In contrast, power-law growth is associated, for example, with the learning of infinite parameter (or nonparametric) models such as continuous functions with smoothness constraints. There are connections between the predictive information and measures of complexity that have been defined both in learning theory and in the analysis of physical systems through statistical mechanics and dynamical systems theory. Further, in the same way that entropy provides the unique measure of available information consistent with some simple and plausible conditions, we argue that the divergent part of $I_{\rm pred}(T)$ provides the unique measure for the complexity of dynamics underlying a time series. Finally, we discuss how these ideas may be useful in different problems in physics, statistics, and biology.
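The logarithmic case in the abstract can be checked numerically on a toy model. The sketch below is an illustration, not code from the paper: it takes $N$ coin flips whose bias is drawn once from a uniform prior (a one-parameter model) and computes the entropy of the whole sequence exactly. It assumes that, for large $N$, the predictive information can be read off as the subextensive part of that entropy, an identification the abstract itself does not state. The function names are hypothetical, and entropies are in nats.

```python
import math


def sequence_entropy(n: int) -> float:
    """Entropy (nats) of n coin flips whose bias is uniform on [0, 1]."""
    # Averaging p^k (1-p)^(n-k) over the uniform prior gives each specific
    # sequence with k heads probability k! (n-k)! / (n+1)!, and each head
    # count k = 0..n carries total probability 1 / (n+1).
    log_fact = [math.lgamma(k + 1) for k in range(n + 2)]
    mean_log = sum(log_fact[k] + log_fact[n - k] for k in range(n + 1)) / (n + 1)
    return log_fact[n + 1] - mean_log


def subextensive_entropy(n: int) -> float:
    """Sequence entropy minus its extensive (linear in n) part."""
    # The entropy per flip, averaged over the uniform prior, is 1/2 nat.
    return sequence_entropy(n) - 0.5 * n


def log_growth_coefficient(n: int, factor: int = 4) -> float:
    """Estimate c in S_1(n) ~ c * ln(n) from two sequence lengths."""
    delta = subextensive_entropy(factor * n) - subextensive_entropy(n)
    return delta / math.log(factor)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(log_growth_coefficient(n), 4))
```

With one free parameter, the estimated coefficient should approach 1/2 as `n` grows, in line with a logarithmic term whose coefficient counts the dimensionality of the model space. The choice of a uniform prior only shifts the constant term in this toy case, not the coefficient of the logarithm.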
{
"annotation_id": "2f3c4999-021d-4fa4-a269-1f652b4e3c75",
"date_created": "2026-03-02T18:00:31.605000Z",
"date_modified": "2026-03-02T18:00:31.605000Z",
"file_hash": "5c32ff3f9c7799eb56a7e3acccf25e769ae7432fd3dc96b4a5fd5b67864f298e",
"private": false,
"record": {
"abstract": "We define {\\em predictive information} $I_{\\rm pred} (T)$ as the mutual\ninformation between the past and the future of a time series. Three\nqualitatively different behaviors are found in the limit of large observation\ntimes $T$: $I_{\\rm pred} (T)$ can remain finite, grow logarithmically, or grow\nas a fractional power law. If the time series allows us to learn a model with a\nfinite number of parameters, then $I_{\\rm pred} (T)$ grows logarithmically with\na coefficient that counts the dimensionality of the model space. In contrast,\npower--law growth is associated, for example, with the learning of infinite\nparameter (or nonparametric) models such as continuous functions with\nsmoothness constraints. There are connections between the predictive\ninformation and measures of complexity that have been defined both in learning\ntheory and in the analysis of physical systems through statistical mechanics\nand dynamical systems theory. Further, in the same way that entropy provides\nthe unique measure of available information consistent with some simple and\nplausible conditions, we argue that the divergent part of $I_{\\rm pred} (T)$\nprovides the unique measure for the complexity of dynamics underlying a time\nseries. Finally, we discuss how these ideas may be useful in different problems\nin physics, statistics, and biology.",
"arxiv_id": "physics/0007070",
"authors": [
"William Bialek",
"Ilya Nemenman",
"Naftali Tishby"
],
"categories": [
"physics.data-an",
"cond-mat.dis-nn",
"cond-mat.other",
"cs.LG",
"nlin.AO",
"q-bio.OT"
],
"journal_ref": "Neural Computation 13, 2409-2463 (2001)",
"title": "Predictability, complexity and learning",
"url": "https://arxiv.org/abs/physics/0007070"
},
"schema_id": "dorsal/arxiv",
"source": {
"execution_id": "0888f529-6406-49c0-9208-e7f3207726ca",
"id": "arXiv Dataset IDs",
"type": "Model",
"variant": "snapshot-2026-03-01",
"version": "0.1.0"
},
"user_id": 1000002
}