Neural networks surround us, in the form of large language models, speech transcription systems, molecular discovery algorithms, robotics, and much more. Stripped of anything else, neural networks are compositions of differentiable primitives, and studying them means learning how to program and how to interact with these models, a particular example of what is called differentiable programming.
This primer is an introduction to this fascinating field imagined for someone, like Alice, who has just ventured into this strange differentiable wonderland. I overview the basics of optimizing a function via automatic differentiation, and a selection of the most common designs for handling sequences, graphs, text, and audio. The focus is on an intuitive, self-contained introduction to the most important design techniques, including convolutional, attentional, and recurrent blocks, hoping to bridge the gap between theory and code (PyTorch and JAX) and leaving the reader capable of understanding some of the most advanced models out there, such as large language models (LLMs) and multimodal architectures.
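To make "compositions of differentiable primitives" and "optimizing a function via automatic differentiation" concrete, here is a minimal sketch in plain Python. It is not taken from the book: the `Value` class, the `loss_fn` function, and the learning rate are hypothetical names and choices made for illustration, and frameworks such as PyTorch and JAX provide the same idea for tensors at scale.

```python
import math


class Value:
    """A scalar that remembers how it was computed (hypothetical helper)."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents    # inputs of the primitive that produced this value
        self._grad_fns = grad_fns  # one local chain-rule function per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def tanh(self):
        t = math.tanh(self.data)
        return Value(t, (self,), (lambda g: g * (1.0 - t * t),))

    def backward(self):
        # Order the graph so inputs come before outputs, then walk it in reverse.
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for parent, fn in zip(v._parents, v._grad_fns):
                parent.grad += fn(v.grad)


def loss_fn(w):
    # A composition of primitives: multiply, add, tanh, add, multiply.
    pred = (w * 2.0 + (-1.0)).tanh()
    err = pred + (-0.5)
    return err * err


# Plain gradient descent on a single parameter (assumed step size 0.1).
w = 0.0
for _ in range(200):
    wv = Value(w)
    loss_fn(wv).backward()
    w -= 0.1 * wv.grad

final_loss = loss_fn(Value(w)).data
```

Each primitive only knows its own local derivative; `backward` chains them together, which is the essence of reverse-mode automatic differentiation.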

## Overview

- This book is a self-contained introduction to the design of modern (deep) neural networks, also referred to as "differentiable models" to avoid historical baggage.
- The focus is on constructing efficient building blocks for processing n-dimensional data, including convolutions, transformers, graph layers, and modern recurrent models.
- The author aims to strike a balance between theory and code, historical considerations and recent trends, assuming the reader has some exposure to machine learning and linear algebra.
- The book is a refined version of lecture notes for a course on Neural Networks for Data Science Applications. It does not cover advanced topics like generative modeling, explainability, prompting, and agents, which will be published separately.

## Plain English Explanation

This book is a comprehensive guide to the design of modern neural networks, which the author prefers to call "differentiable models" to avoid the historical baggage associated with the term "neural". The focus is on creating efficient building blocks for processing multi-dimensional data, such as convolutions, transformers, graph layers, and advanced recurrent models. 

The author has tried to strike a balance between theory and practical implementation, as well as between historical context and the latest developments in the field. The book assumes the reader has some familiarity with machine learning and linear algebra, but covers the necessary preliminaries when needed.

This book is based on lecture notes for a course on [Neural Networks for Data Science Applications](https://aimodels.fyi/papers/arxiv/deep-neural-networks-via-complex-network-theory), and does not delve into more advanced topics like [generative modeling](https://aimodels.fyi/papers/arxiv/stretched-measured-neural-predictions-complex-network-dynamics), [explainability](https://aimodels.fyi/papers/arxiv/road-to-clarity-exploring-explainable-ai-world), prompting, and agents, which will be covered on a companion website.

## Technical Explanation

The book is a comprehensive introduction to the design and implementation of modern neural networks, referred to as "differentiable models" to avoid the historical baggage associated with the term "neural". The author focuses on constructing efficient building blocks for processing n-dimensional data, including [convolutions](https://aimodels.fyi/papers/arxiv/cellular-automata-many-valued-logic-deep-neural), [transformers](https://aimodels.fyi/papers/arxiv/singular-riemannian-geometry-approach-to-deep-neural), graph layers, and [modern recurrent models](https://aimodels.fyi/papers/arxiv/stretched-measured-neural-predictions-complex-network-dynamics).
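As a rough illustration of what such building blocks compute, the sketch below implements two of them on plain Python lists: a 1D convolution and a self-attention layer. It is an assumption-laden toy and not the book's code: the function names `conv1d`, `softmax`, and `self_attention` are hypothetical, and the attention layer uses identity projections where a real transformer block would use learned query, key, and value matrices.

```python
import math


def conv1d(x, kernel):
    """'Valid' 1D cross-correlation: each output mixes a local window of the input."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections (no learned weights)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this token to every token in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        # Each output is a weighted average of all tokens.
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(d)])
    return out


edges = conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0])
mixed = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

The contrast is the design point: the convolution mixes only a fixed local window with shared weights, while attention lets every token mix with every other token using weights computed from the data itself.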

The book aims to strike a balance between theory and practical implementation, as well as between historical context and the latest developments in the field. The author assumes the reader has some familiarity with machine learning and linear algebra, but covers the necessary preliminaries when needed.

The content is based on refined lecture notes from a course called "Neural Networks for Data Science Applications" taught by the author at Sapienza University. The book does not cover more advanced topics like generative modeling, explainability, prompting, and agents, which will be published separately on a companion website.

## Critical Analysis

The author's decision to avoid the term "neural" in favor of "differentiable models" is an interesting approach that may help readers approach the subject with a fresh perspective, unencumbered by the historical baggage associated with the field of neural networks.

The focus on building efficient building blocks for processing n-dimensional data is a practical and relevant approach, as many real-world applications involve complex, high-dimensional data. The inclusion of transformers, graph layers, and modern recurrent models suggests the book will cover a broad range of cutting-edge techniques in neural network design.

One potential limitation of the book is its scope, as the author has chosen to exclude advanced topics like generative modeling, explainability, prompting, and agents. While this decision may have been made to maintain a focused and manageable volume, it could leave some readers wanting more in-depth coverage of these important areas of research and development.

Overall, this book appears to be a well-designed and comprehensive introduction to the modern design of neural networks, with a balanced approach between theory and practice. The author's expertise and the refinement of the content from a university course suggest the book will be a valuable resource for students, researchers, and practitioners in the field of machine learning and data science.

## Conclusion

This book offers a self-contained and up-to-date introduction to the design of modern neural networks, or "differentiable models" as the author prefers to call them. By focusing on the construction of efficient building blocks for processing n-dimensional data, the book provides a practical and relevant approach to neural network design, covering convolutions, transformers, graph layers, and modern recurrent models.

The book does not delve into more advanced topics like generative modeling, explainability, prompting, and agents, but within its scope it aims to strike a balance between theory and code, as well as between historical context and recent trends. The author's expertise and the refinement of the content from a university course suggest this book will be a valuable resource for students, researchers, and practitioners in the field of machine learning and data science.