A prescriptive theory for brain-like inference

    Published 10/28/2024 by Hadi Vafaii, Dekel Galor, Jacob L. Yates

    Overview

    • This paper presents a prescriptive theory for brain-like inference, drawing insights from neuroscience and machine learning.
    • It explores the connections between the evidence lower bound (ELBO) in variational inference and entropy in the brain's information processing.
    • The key contributions include a new objective function and an algorithm for brain-like inference.

    Iterative inference improves upon amortized inference in VAEs.

    Original caption: Figure 1: Amortized versus iterative inference. (a) Standard VAEs learn an approximate posterior through an encoder neural network, “amortizing” inference across the dataset. Inference components are color-coded in red, while generative components are in blue. $\bm{x}$, input (e.g., an image); $\hat{\bm{x}}$, reconstruction; $\bm{z}$, latent samples. (b) The iterative Poisson VAE (i𝒫-VAE) replaces the encoder network with a parameter-free adaptive iterative algorithm, performing inference via “Analysis-by-Synthesis” approach (Yuille & Kersten, 2006). Starting top-right, the process begins by sampling spikes from the prior, $\bm{z}_t$, generating predictions via the decoder, $f_\theta(\bm{z}_t)$, and updating the state using $\bm{\delta u}_t \coloneqq \bm{J}_\theta(\bm{z}_t) \cdot (\bm{x}_t - f_\theta(\bm{z}_t))$, where $\bm{J}_\theta(\bm{z}) = \partial f_\theta(\bm{z}) / \partial \bm{z}$ is the Jacobian of the decoder (see eq. 7 and section B.4). After the update, a new sample from the posterior is drawn to generate the reconstruction and compute the ELBO loss. See Fig. 8 and Algorithm 1 for additional details.
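
    The update rule in the caption can be illustrated with a small sketch. This is not the authors' implementation: it assumes a linear decoder (so the Jacobian is just the weight matrix, applied here as its transpose), treats the state u as log firing rates, uses expected spike counts in place of Poisson samples to keep the run deterministic, and picks a hypothetical step size and a toy 3-pixel, 2-latent decoder.

```python
import math

def decode(phi, z):
    """Linear decoder f(z) = phi @ z; phi is a list of D rows with K weights each."""
    return [sum(w * zk for w, zk in zip(row, z)) for row in phi]

def delta_u(phi, x, z):
    """Prediction error (x - f(z)) mapped into latent space through the Jacobian.

    For a linear decoder the Jacobian df/dz is phi itself, so this is phi^T @ residual.
    """
    residual = [xi - fi for xi, fi in zip(x, decode(phi, z))]
    n_latents = len(phi[0])
    return [sum(row[k] * r for row, r in zip(phi, residual)) for k in range(n_latents)]

def infer(phi, x, n_iters=64, step=0.1):
    """Run the iterative update and record the squared error before each step."""
    u = [0.0] * len(phi[0])  # state, treated here as log firing rates (assumption)
    errors = []
    for _ in range(n_iters):
        # Expected spike counts stand in for Poisson samples (simplification).
        z = [math.exp(uk) for uk in u]
        errors.append(sum((xi - fi) ** 2 for xi, fi in zip(x, decode(phi, z))))
        u = [uk + step * dk for uk, dk in zip(u, delta_u(phi, x, z))]
    return u, errors

phi = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hypothetical toy decoder weights
x = decode(phi, [2.0, 0.5])                 # input generated from known latents
u, errors = infer(phi, x)
print(f"squared error: {errors[0]:.4f} -> {errors[-1]:.6f}")
```

    No encoder network appears anywhere in the loop: the only learned component is the decoder, and inference is driven entirely by the prediction error, which is the contrast with amortized inference that the figure draws.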

    Model performance and efficiency on natural images, using sparse representations and a 512-dimensional latent space.

    Model β Architecture # params MSE (train) MSE (test) Sparsity # iters
    i𝒫-VAE 24.00 <jacob|lin> 0.13M 12.0 ± 2.6 0.79 ± 0.03 60.0 64
    i𝒫-VAE 3.00 <jacob|lin> 0.13M 27.5 ± 7.1 0.85 ± 0.02 8 73.2
    i𝒫-VAE 1.50 <jacob|lin> 0.13M 50.4 ± 15.5 0.90 ± 0.03 4 83.3
    i𝒫-VAE 0.50 <conv|lin> 3.44M 101.9 ± 25.3 0.76 ± 0.16 1 65.9
    i𝒫-VAE 0.75 <conv|lin> 3.44M 119.4 ± 26.4 0.83 ± 0.09 1 77.7
    i𝒫-VAE 1.00 <conv|lin> 3.44M 131.8 ± 31.2 0.90 ± 0.08 1 84.1
    LCA 0.28 - 0.13M 16.1 ± 8.1 0.79 ± 0.02 65.6 1K
    LCA 0.44 - 0.13M 28.5 ± 14.1 0.86 ± 0.02 73.9 1K
    LCA 0.70 - 0.13M 50.1 ± 25.2 0.92 ± 0.01 83.4 1K
    ia-VAE (s) 1.00 <mlp|mlp> 39.55M 80.08 ± 21.06 ∼0.0 5 10
    sa-VAE 1.00 <conv|conv> 1.67M 97.74 ± 38.97 ∼0.0 20 20

    Original caption: Table 1: Model performance and efficiency. We prefer lightweight models that achieve low reconstruction loss using sparse representations and fewer parameters. We reported results on natural image patches extracted from the van Hateren dataset (Van Hateren & van der Schaaf, 1998). All models have $K = 512$ dimensional latent space. For the i𝒫-VAE models, we scaled the $\beta$ parameter proportional to the number of training inference iterations. Specifically, we chose $\beta = 3/8 * T_{\mathrm{train}}$, since this choice led to more stable convergence. We also tested other values of $\beta$ and found that i𝒫-VAE results were robust to variations in $\beta$. Entries formatted as mean ± std.
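
    As a quick arithmetic check of the scaling rule in the caption, the sketch below recomputes β for the three <jacob|lin> rows of the table. It assumes that the values 64, 8, and 4 appearing in those rows are the training iteration counts; the function name is hypothetical.

```python
def beta_for(t_train):
    """Caption rule: beta = 3/8 * T_train."""
    return 3 / 8 * t_train

# (assumed T_train, beta listed in the table) for the three <jacob|lin> rows
rows = [(64, 24.00), (8, 3.00), (4, 1.50)]
checks = [(t, beta_for(t), listed) for t, listed in rows]

for t, computed, listed in checks:
    print(f"T_train={t:>2}  computed beta={computed:.2f}  listed beta={listed:.2f}")
```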

    Plain English Explanation

    The paper examines how the brain might perform inference, the process of drawing conclusions from available information. It looks at the similarities between variational inference, a mathematical technique used in machine learning, and the way the brain may handle information.

    In machine learning, variational inference uses an "evidence lower bound" (ELBO) to guide the training of models. The paper shows how this ELBO concept relates to the brain's tendency to minimize the uncertainty or "entropy" of its internal representations.

    Based on this insight, the paper proposes a new objective function and algorithm for brain-like inference. The idea is that the brain might use a similar mathematical approach to efficiently process information and make inferences, just as machine learning models do.

    Key Findings

    • Maximizing the ELBO in variational inference corresponds to minimizing the entropy (uncertainty) of the model's internal representations, a principle the paper argues the brain may share.
    • The paper introduces a new objective function and algorithm for brain-like inference, inspired by this connection between ELBO and entropy.
    • This approach aims to capture how the brain might perform inference efficiently.

    Technical Explanation

    The paper draws parallels between variational inference in machine learning and the brain's information processing. In variational inference, a model is trained to maximize the ELBO, a lower bound on the log probability of the observed data (the "evidence"). The ELBO balances how well the model explains the data against how far the approximate posterior strays from the prior, a penalty often read as a complexity cost.
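
    The bound itself can be checked numerically on a toy problem. The sketch below is a generic illustration of the ELBO, not the paper's model: it assumes a single binary latent variable with a hypothetical prior and likelihood, computes the ELBO as a reconstruction term minus a KL divergence, and compares it with the exact log evidence.

```python
import math

def elbo(q, prior, lik):
    """ELBO = E_q[log p(x|z)] - KL(q(z) || p(z)) for a discrete latent z."""
    recon = sum(qz * math.log(l) for qz, l in zip(q, lik) if qz > 0)
    kl = sum(qz * math.log(qz / pz) for qz, pz in zip(q, prior) if qz > 0)
    return recon - kl

def log_evidence(prior, lik):
    """Exact log p(x) = log sum_z p(z) p(x|z)."""
    return math.log(sum(pz * l for pz, l in zip(prior, lik)))

def exact_posterior(prior, lik):
    """Bayes' rule: p(z|x) is proportional to p(z) p(x|z)."""
    evidence = sum(pz * l for pz, l in zip(prior, lik))
    return [pz * l / evidence for pz, l in zip(prior, lik)]

prior = [0.5, 0.5]  # hypothetical p(z) over z in {0, 1}
lik = [0.9, 0.2]    # hypothetical p(x|z) for one fixed observation x

loose = elbo([0.5, 0.5], prior, lik)                    # a poor posterior guess
tight = elbo(exact_posterior(prior, lik), prior, lik)   # the exact posterior
print(f"log evidence: {log_evidence(prior, lik):.4f}")
print(f"ELBO with a uniform guess: {loose:.4f}")
print(f"ELBO with the exact posterior: {tight:.4f}")
```

    The ELBO never exceeds the log evidence and meets it only when the approximate posterior matches the exact one, which is why improving the posterior estimate, whether by an encoder or by iteration, tightens the bound.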

    The authors show that this ELBO objective is equivalent to minimizing the entropy, or uncertainty, of the model's internal representations. They argue that the brain may use a similar principle to efficiently process information and make inferences.

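    Based on this insight, the paper proposes a new objective function and algorithm for brain-like inference. As Figure 1 describes, the resulting model, the iterative Poisson VAE (i𝒫-VAE), replaces the encoder network with a parameter-free iterative procedure: it samples spikes, decodes them into a prediction, and updates its state using the prediction error passed through the decoder's Jacobian, with the ELBO serving as the loss. The authors demonstrate how this approach can be implemented in a practical algorithm and discuss its potential advantages over standard, amortized variational inference.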

    Implications for the Field

    This research explores the fundamental connections between machine learning techniques and the brain's information processing. By drawing these parallels, the paper offers a new perspective on how the brain might perform inference efficiently.

    The proposed objective function and algorithm for brain-like inference represent a novel approach that could inspire new developments in machine learning, cognitive science, and our understanding of the brain's information processing capabilities.

    Critical Analysis

    The paper provides a compelling theoretical framework for understanding the brain's inference processes, but it remains to be seen how well this approach would perform beyond the natural image patches reported in Table 1. The authors acknowledge that further research is needed to validate the assumptions and test the proposed algorithm on real-world tasks.

    Additionally, the reported evaluation is limited in scope. Table 1 covers image patches from the van Hateren dataset with a 512-dimensional latent space, so it is unclear how the algorithm would handle more complex, higher-dimensional data or how it would scale to larger problems.

    Conclusion

    This paper presents a thought-provoking connection between variational inference in machine learning and the brain's information processing. By framing brain-like inference as a problem of minimizing internal representation entropy, the authors offer a new perspective on how the brain may perform efficient, probabilistic reasoning.

    While further research is needed to validate and refine the proposed approach, this work represents an important step towards a deeper understanding of the brain's computational principles and their potential applications in machine learning and cognitive science.

    Full paper

    Read original: arXiv:2410.19315



    This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!
