Mechanistic Explanation in Deep Learning (Millière)

Raphaël Millière, Philosophy, Macquarie University, 14 September 2024

VIDEO


Abstract: Deep neural networks such as large language models (LLMs) have achieved impressive performance across almost every domain of natural language processing, but there remains substantial debate about which cognitive capabilities can be ascribed to these models. Drawing inspiration from mechanistic explanations in life sciences, the nascent field of “mechanistic interpretability” seeks to reverse-engineer human-interpretable features to explain how LLMs process information. This raises some questions: (1) Are causal claims about neural network components, based on coarse intervention methods (such as “activation patching”), genuine mechanistic explanations? (2) Does the focus on human-interpretable features risk imposing anthropomorphic assumptions? My answer will be “yes” to (1) and “no” to (2), closing with a discussion of some ongoing challenges.
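
The abstract's first question turns on "activation patching", a coarse causal intervention. As a rough illustration only (not code from the talk), the sketch below applies the idea to a hypothetical two-layer toy model in NumPy: cache an activation from a run on a "clean" input, overwrite the corresponding activation during a run on a "corrupted" input, and check how much of the clean behaviour is restored. In real interpretability work the intervention targets particular attention heads or MLP blocks inside an LLM.

```python
# Minimal, hypothetical illustration of activation patching (not code from the talk).
# A toy two-layer network stands in for a transformer; in real work the intervention
# targets a specific attention head or MLP block inside the LLM.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 2))

def forward(x, patch_hidden=None):
    """Run the toy model; optionally overwrite the hidden activation (the 'patch')."""
    h = np.tanh(x @ W1)
    if patch_hidden is not None:
        h = patch_hidden          # causal intervention: swap in a cached activation
    return h @ W2, h

x_clean = rng.normal(size=4)      # stands in for a prompt the model handles correctly
x_corrupt = rng.normal(size=4)    # stands in for a minimally different, broken prompt

y_clean, h_clean = forward(x_clean)
y_corrupt, _ = forward(x_corrupt)
y_patched, _ = forward(x_corrupt, patch_hidden=h_clean)

# Here patching the whole hidden layer trivially restores the clean output; in an LLM
# one patches a single small component and asks how much of the behaviour comes back.
gap_before = np.linalg.norm(y_corrupt - y_clean)
gap_after = np.linalg.norm(y_patched - y_clean)
print(f"distance to clean output: {gap_before:.3f} -> {gap_after:.3f}")
```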

Raphaël Millière is Lecturer in Philosophy of Artificial Intelligence at Macquarie University in Sydney, Australia. His interests are in the philosophy of artificial intelligence, cognitive science, and mind, particularly in understanding artificial neural networks based on deep learning architectures such as Large Language Models. He has investigated syntactic knowledge, semantic competence, compositionality, variable binding, and grounding.

Elhage, N., et al. (2021). A mathematical framework for transformer circuits. Transformer Circuits Thread.

Machamer, P., Darden, L., & Craver, C. F. (2000). Thinking about Mechanisms. Philosophy of Science, 67(1), 1–25.

Millière, R. (2023). The Alignment Problem in Context. arXiv preprint arXiv:2311.02147

Mollo, D. C., & Millière, R. (2023). The vector grounding problem. arXiv preprint arXiv:2304.01481.

Yousefi, S., et al. (2023). In-Context Learning in Large Language Models: A Neuroscience-inspired Analysis of Representations. arXiv preprint arXiv:2310.00313.

Continual Learning and Cognitive Control (Alexandre)

Frédéric Alexandre, Inria, Bordeaux, 14-Dec 2023

Abstract: I explore the difference between the efficiency of human learning and that of large language models in terms of computation time and energy costs. The study focuses on the continual nature of human learning and its associated challenges, such as catastrophic forgetting. Two types of memory, working memory and episodic memory, are examined. The prefrontal cortex is described as essential for cognitive control and working memory, while the hippocampus is central to episodic memory. I suggest that these two regions collaborate to enable continual, efficient learning, thus facilitating thought and imagination.
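
As a concrete illustration of "catastrophic forgetting" (a toy example of my own, not material from the talk), the NumPy sketch below trains a single linear classifier on task A, then continues training it on a conflicting task B with no rehearsal, and shows accuracy on task A falling back toward chance:

```python
# Schematic, NumPy-only illustration of catastrophic forgetting (not from the talk):
# one linear classifier trained sequentially on two conflicting tasks, no rehearsal.
import numpy as np

rng = np.random.default_rng(1)

def make_task(n, axis):
    """Task 'axis': the label is the sign of feature `axis`."""
    X = rng.normal(size=(n, 2))
    y = (X[:, axis] > 0).astype(float)
    return X, y

def train(w, b, X, y, epochs=500, lr=0.5):
    """Plain full-batch gradient descent on the logistic-regression loss."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        w = w - lr * X.T @ (p - y) / len(y)
        b = b - lr * np.mean(p - y)
    return w, b

def accuracy(w, b, X, y):
    return np.mean(((X @ w + b) > 0) == (y == 1))

Xa, ya = make_task(500, axis=0)   # task A
Xb, yb = make_task(500, axis=1)   # task B (conflicts with A for a linear model)

w, b = np.zeros(2), 0.0
w, b = train(w, b, Xa, ya)
print("task A accuracy after training on A:", accuracy(w, b, Xa, ya))

w, b = train(w, b, Xb, yb)        # continue training on B only, no rehearsal of A
print("task A accuracy after training on B:", accuracy(w, b, Xa, ya))  # falls toward chance
```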

Frédéric Alexandre is a research director at Inria and heads the Mnemosyne team in Bordeaux, which specializes in Artificial Intelligence and Computational Neuroscience. The team studies the brain's different forms of memory and their role in cognitive functions such as reasoning and decision-making. They explore the dichotomy between explicit and implicit memories and how these interact. Their recent projects range from language acquisition to planning and deliberation. The models they create are validated experimentally and have medical and industrial applications, as well as applications in the humanities and social sciences, notably education, law, linguistics, economics, and philosophy.

Frédéric Alexandre. A global framework for a systemic view of brain modeling. Brain Informatics, 2021, 8(1).

Snigdha Dagar, Frédéric Alexandre, Nicolas P. Rougier. From concrete to abstract rules: A computational sketch. 15th International Conference on Brain Informatics, Jul 2022.

Randa Kassab, Frédéric Alexandre. Pattern Separation in the Hippocampus: Distinct Circuits under Different Conditions. Brain Structure and Function, 2018, 223(6), pp. 2785–2808.

Hugo Chateau-Laurent, Frédéric Alexandre. The Opportunistic PFC: Downstream Modulation of a Hippocampus-inspired Network is Optimal for Contextual Memory Recall. 36th Conference on Neural Information Processing Systems, Dec 2022.

Pramod Kaushik, Jérémie Naudé, Surampudi Bapi Raju, Frédéric Alexandre. A VTA GABAergic computational model of dissociated reward prediction error computation in classical conditioning. Neurobiology of Learning and Memory, 2022, 193, 107653.

Falsifying the Integrated Information Theory of Consciousness (Hanson)

Jake R. Hanson, Senior Data Scientist (PhD in Astrophysics), 07-Dec 2023

VIDEO

Abstract: Integrated Information Theory (IIT) is a prominent theory of consciousness in contemporary neuroscience, based on the premise that feedback, quantified by a mathematical measure called Phi, corresponds to subjective experience. A straightforward application of the mathematical definition of Phi fails to produce a unique solution, owing to unresolved degeneracies inherent in the theory; this undermines nearly all published Phi values to date. Regarding the mathematical relationship between feedback and input-output behavior, automata theory shows that feedback can always be disentangled from the input-output behavior of a finite-state system, resulting in Phi = 0 for all possible input-output behaviors. This process, known as "unfolding," can be accomplished without increasing the system's size, leading to the conclusion that Phi measures something fundamentally disconnected from anything that could ground the theory experimentally. These findings demonstrate that IIT lacks a well-defined mathematical framework and may either be already falsified or inherently unfalsifiable by scientific standards.
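
To give a flavour of the "unfolding" idea, here is a toy example of my own (not the speaker's construction): a recurrent parity accumulator is compared with a feed-forward version that reproduces its input-output behaviour over a fixed horizon. Since IIT assigns Phi = 0 to feed-forward systems, input-output equivalences of this kind are what put pressure on the theory's falsifiability. The talk's result is stronger, unfolding any finite-state system without increasing its size; this sketch merely unrolls over a fixed horizon.

```python
# Toy illustration (mine, not the speaker's construction) of "unfolding" feedback.
# A recurrent parity accumulator and a feed-forward unrolled version produce
# identical input-output behaviour over a fixed horizon; IIT assigns Phi = 0 to
# feed-forward systems. The talk's theorem is far more general than this sketch.
from itertools import product
from functools import reduce
from operator import xor

T = 6  # fixed horizon

def recurrent(bits):
    """Running parity with explicit feedback: the output is fed back as state."""
    state, outs = 0, []
    for b in bits:
        state = state ^ b      # feedback: the next state depends on the previous output
        outs.append(state)
    return outs

def feedforward(bits):
    """Unrolled computation: each output reads the inputs directly, nothing loops back."""
    return [reduce(xor, bits[: t + 1]) for t in range(len(bits))]

# Brute-force check: identical input-output behaviour on every length-T input.
assert all(recurrent(seq) == feedforward(seq) for seq in product([0, 1], repeat=T))
print("identical I/O behaviour for all", 2 ** T, "input sequences")
```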

Jake Hanson is a Senior Data Scientist at a financial tech company in Salt Lake City, Utah. His doctoral research in Astrophysics at Arizona State University focused on the origin of life via the relationship between information processing and fundamental physics. He demonstrated that there are multiple foundational issues with IIT, ranging from poorly defined mathematics to problems with experimental falsifiability and pseudoscientific handling of core ideas.

Hanson, J.R., & Walker, S.I. (2019). Integrated information theory and isomorphic feed-forward philosophical zombies. Entropy, 21(11), 1073.

Hanson, J.R., & Walker, S.I. (2021). Formalizing falsification for theories of consciousness across computational hierarchies. Neuroscience of Consciousness, 2021(2), niab014.

Hanson, J.R. (2021). Falsification of the Integrated Information Theory of Consciousness. Doctoral dissertation, Arizona State University.

Hanson, J.R., & Walker, S.I. (2023). On the non-uniqueness problem in Integrated Information Theory. Neuroscience of Consciousness, 2023(1), niad014.

LLMs, Patterns, and Understanding (Durt)

Christoph Durt, Philosophy, U. Heidelberg, 30-Nov 2023

VIDEO

ABSTRACT: It is widely known that the performance of LLMs is contingent on their being trained with very large text corpora. But what in the text corpora allows LLMs to extract the parameters that enable them to produce text that sounds as if it had been written by an understanding being? In my presentation, I argue that the text corpora reflect not just “language” but language use. Language use is permeated with patterns, and the statistical contours of the patterns of written language use are modelled by LLMs. LLMs do not model understanding directly, but statistical patterns that correlate with patterns of language use. Although the recombination of statistical patterns does not require understanding, it enables the production of novel text that continues a prompt and conforms to patterns of language use, and thus can make sense to humans.
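
As a deliberately crude illustration of the claim (mine, not the speaker's), the sketch below recombines bigram statistics from a tiny toy corpus into novel word sequences. LLMs model vastly richer statistics of language use, but the contrast is the same in kind: conformity to patterns of use requires no understanding.

```python
# A deliberately crude illustration (not the speaker's): bigram statistics of a tiny
# "corpus" recombined into novel text. The generated string conforms to the corpus's
# patterns of language use without anything resembling understanding.
import random
from collections import defaultdict

corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat chased the dog . the dog chased the cat ."
).split()

# Collect bigram continuation counts: which words follow which.
follows = defaultdict(list)
for w1, w2 in zip(corpus, corpus[1:]):
    follows[w1].append(w2)

random.seed(0)
word, generated = "the", ["the"]
for _ in range(12):
    word = random.choice(follows[word])   # sample a continuation seen in the corpus
    generated.append(word)

print(" ".join(generated))   # a novel sequence that nonetheless "sounds right"
```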

Christoph Durt is a philosophical and interdisciplinary researcher at Heidelberg University. He investigates the human mind and its relation to technology, especially AI. Going beyond the usual side-by-side comparison of artificial and human intelligence, he studies the multidimensional interplay between the two. This involves the study of human experience and language, as well as the relation between them. If you would like to join an international online exchange on these issues, please check the "courses and lectures" section on his website.

Durt, Christoph, Tom Froese, and Thomas Fuchs. preprint. “Against AI Understanding and Sentience: Large Language Models, Meaning, and the Patterns of Human Language Use.”

Durt, Christoph. 2023. “The Digital Transformation of Human Orientation: An Inquiry into the Dawn of a New Era.” Winner of the $10,000 HFPO Essay Prize.

Durt, Christoph. 2022. “Artificial Intelligence and Its Integration into the Human Lifeworld.” In The Cambridge Handbook of Responsible Artificial Intelligence, Cambridge University Press.

Durt, Christoph. 2020. “The Computation of Bodily, Embodied, and Virtual Reality.” Winner of the Essay Prize “What Can Corporality as a Constitutive Condition of Experience (Still) Mean in the Digital Age?” Phänomenologische Forschungen, no. 2: 25–39.

LLMs: Indication or Representation? (Søgaard)

Anders Søgaard, Computer Science & Philosophy, U. Copenhagen, 23-Nov 2023

VIDEO

ABSTRACT: People talk to LLMs – their new assistants, tutors, or partners – about the world they live in, but are LLMs parroting, or do they (also) have internal representations of the world? There are five popular views, it seems:

  • (i) LLMs are all syntax, no semantics. 
  • (ii) LLMs have inferential semantics, no referential semantics. 
  • (iii) LLMs (also) have referential semantics through picturing. 
  • (iv) LLMs (also) have referential semantics through causal chains. 
  • (v) Only chatbots have referential semantics (through causal chains). 

I present three sets of experiments to suggest LLMs induce inferential and referential semantics and do so by inducing human-like representations, lending some support to view (iii). I briefly compare the representations that seem to fall out of these experiments to the representations to which others have appealed in the past. 
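
One common way to test such claims (for instance, in the Abdou et al. paper listed below, on colour terms) is a representational-similarity comparison: correlate pairwise distances among a model's representations with distances in an independently measured perceptual or worldly space. The sketch below uses made-up stand-in data and is only meant to show the shape of the analysis:

```python
# Schematic representational-similarity sketch (made-up numbers, not the talk's data):
# compare pairwise distances among word embeddings with pairwise distances in an
# independent reference space (e.g. perceptual colour space). A reliable correlation
# would suggest the embeddings mirror world/perceptual structure, not text statistics alone.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
words = ["red", "orange", "yellow", "green", "blue"]

# Hypothetical stand-ins: in a real analysis these come from an LLM and from
# perceptual measurements (e.g. CIELAB coordinates), respectively.
model_embeddings = rng.normal(size=(len(words), 16))
perceptual_coords = rng.normal(size=(len(words), 3))

def pairwise_distances(X):
    return np.array([np.linalg.norm(X[i] - X[j])
                     for i, j in combinations(range(len(X)), 2)])

def spearman(a, b):
    """Spearman rank correlation (no ties expected with continuous data)."""
    ra, rb = a.argsort().argsort(), b.argsort().argsort()
    return np.corrcoef(ra, rb)[0, 1]

d_model = pairwise_distances(model_embeddings)
d_world = pairwise_distances(perceptual_coords)
print(f"representational alignment (Spearman rho): {spearman(d_model, d_world):.2f}")
```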

Anders Søgaard is University Professor of Computer Science and Philosophy and leads the newly established Center for Philosophy of Artificial Intelligence at the University of Copenhagen. Known primarily for work on multilingual NLP, multi-task learning, and using cognitive and behavioral data to bias NLP models, Søgaard is an ERC Starting Grant and Google Focused Research Award recipient and the author of Semi-Supervised Learning and Domain Adaptation for NLP (2013), Cross-Lingual Word Embeddings (2019), and Explainable Natural Language Processing (2021). 

Søgaard, A. (2023). Grounding the Vector Space of an Octopus. Minds and Machines 33, 33-54.

Li, J.; et al. (2023) Large Language Models Converge on Brain-Like Representations. arXiv preprint arXiv:2306.01930

Abdou, M.; et al. (2021) Can Language Models Encode Perceptual Structure Without Grounding? CoNLL

Garneau, N.; et al. (2021) Analogy Training Multilingual Encoders. AAAI

Fuzzy Causal Deep Learning Algorithms (Faghihi)

Usef Faghihi, Computer Science, UQTR, 16-Nov 2023

ABSTRACT: I will give a brief overview of causal inference and of how fuzzy logic rules can improve causal reasoning (Faghihi, Robert, Poirier & Barkaoui, 2020). I will then explain how we integrated fuzzy logic rules with deep learning algorithms such as the Big Bird transformer architecture (Zaheer et al., 2020). I will show how our fuzzy deep learning causality model outperformed ChatGPT on various datasets in reasoning tasks (Kalantarpour, Faghihi, Khelifi & Roucaut, 2023). I will also present some applications of our model in domains such as healthcare and industry. Finally, time permitting, I will present two essential components of our causal reasoning model that we have recently developed: the Probabilistic Easy Variational Causal Effect (PEACE) and the Probabilistic Variational Causal Effect (PACE) (Faghihi & Saki, 2023).
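
For readers unfamiliar with the ingredients, here is a generic, self-contained example of a fuzzy rule (the variables and thresholds are invented for illustration; this is not the model presented in the talk): graded membership functions combined with min as a fuzzy AND, the kind of soft rule that fuzzy-causal approaches can combine with scores learned by deep networks.

```python
# Generic illustration of a fuzzy rule (invented variables, not the talk's model):
# graded memberships combined with min (fuzzy AND), yielding a degree of truth in [0, 1]
# rather than a hard 0/1 decision.

def triangular(x, a, b, c):
    """Triangular membership function peaking at b, zero outside [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def rule_high_risk(dose, age):
    """IF dose is high AND age is elderly THEN risk is high (to some degree)."""
    dose_high = triangular(dose, 50, 100, 150)
    age_elderly = triangular(age, 60, 80, 100)
    return min(dose_high, age_elderly)      # fuzzy AND

print(rule_high_risk(dose=90, age=75))      # partially true: a graded degree
print(rule_high_risk(dose=20, age=75))      # antecedent fails: degree 0.0
```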

Usef Faghihi is an assistant professor at the Université du Québec à Trois-Rivières (UQTR). Previously, he was a professor at the University of Indianapolis in the United States. Usef obtained his PhD in Cognitive Computer Science at UQAM, then went to Memphis, in the United States, for a postdoctoral fellowship with Professor Stan Franklin, one of the pioneers of artificial intelligence. His research interests are cognitive architectures and their integration with deep learning algorithms.

Faghihi, U., Robert, S., Poirier, P., & Barkaoui, Y. (2020). From Association to Reasoning, an Alternative to Pearl’s Causal Reasoning. In Proceedings of AAAI-FLAIRS 2020, North Miami Beach, Florida.

Faghihi, U., & Saki, A. (2023). Probabilistic Variational Causal Effect as A new Theory for Causal Reasoning. arXiv preprint arXiv:2208.06269

Kalantarpour, C., Faghihi, U., Khelifi, E., & Roucaut, F.-X. (2023). Clinical Grade Prediction of Therapeutic Dosage for Electroconvulsive Therapy (ECT) Based on Patient’s Pre-Ictal EEG Using Fuzzy Causal Transformers. Paper presented at the International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2023, Tenerife, Canary Islands, Spain. 

Zaheer, M., Guruganesh, G., Dubey, K. A., Ainslie, J., Alberti, C., Ontanon, S., . . . Yang, L. (2020). Big bird: Transformers for longer sequences. Advances in neural information processing systems, 33, 17283-17297. 

Robotic Grounding and LLMs: Advancements and Challenges (Kennington)

Casey Kennington, Computer Science, Boise State, 09-Nov 2023

VIDEO

ABSTRACT: Large Language Models (LLMs) are primarily trained using large amounts of text, but there have also been noteworthy advancements in incorporating vision and other sensory information into LLMs. Does that mean LLMs are ready for embodied agents such as robots? While there have been important advancements, technical and theoretical challenges remain, including the use of closed language models like ChatGPT, model size requirements, data size requirements, speed requirements, representing the physical world, and updating the model with information about the world in real time. In this talk, I explain recent advances in incorporating LLMs into robot platforms, the challenges that remain, and opportunities for future work.

Casey Kennington is associate professor in the Department of Computer Science at Boise State University where he does research on spoken dialogue systems on embodied platforms. His long-term research goal is to understand what it means for humans to understand, represent, and produce language. His National Science Foundation CAREER award focuses on enriching small language models with multimodal information such as vision and emotion for interactive learning on robotic platforms. Kennington obtained his PhD in Linguistics from Bielefeld University, Germany. 

Josue Torres-Foncesca, Catherine Henry, Casey Kennington. Symbol and Communicative Grounding through Object Permanence with a Mobile Robot. In Proceedings of SigDial, 2022. 

Clayton Fields and Casey Kennington. Vision Language Transformers: A Survey. arXiv, 2023.

Casey Kennington. Enriching Language Models with Visually-grounded Word Vectors and the Lancaster Sensorimotor Norms. In Proceedings of CoNLL, 2021

Casey Kennington. On the Computational Modeling of Meaning: Embodied Cognition Intertwined with Emotion. arXiv, 2023. 

Machine Psychology (Schulz)

Eric Schulz, MPI Tuebingen, 02-Nov 2023

VIDEO

ABSTRACT: Large language models are on the cusp of transforming society as they permeate many applications. Understanding how they work is therefore of great value. We propose to use insights and tools from psychology to study and better understand these models. Psychology can add to our understanding of LLMs and provide a new toolkit for explaining them by supplying theoretical concepts, experimental designs, and computational analysis approaches. This can lead to a machine psychology for foundation models that focuses on computational insights and precise experimental comparisons instead of performance measures alone. I will showcase the utility of this approach by examining how current LLMs behave across a variety of cognitive tasks, as well as how one can make them more human-like by fine-tuning on psychological data directly.
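
A minimal sketch of the methodological idea (not the speaker's code): administer a classic behavioural task, here a two-armed bandit, to a language model trial by trial, and log its choices exactly as one would a human participant's. The `query_llm` function is a placeholder for whatever model API is used; here it simply answers randomly so the script runs on its own.

```python
# Minimal sketch of the "machine psychology" methodology (not the speaker's code):
# give an LLM a two-armed bandit task, trial by trial, and log its choices like a
# participant's. `query_llm` is a placeholder for any model API; here it is random.
import random

random.seed(0)
REWARD_PROBS = {"A": 0.8, "B": 0.3}    # hidden payoff probabilities of the two arms

def query_llm(prompt: str) -> str:
    """Placeholder: replace with a real model call. Here, a random policy."""
    return random.choice(["A", "B"])

history, choices = [], []
for trial in range(20):
    prompt = (
        "You are playing a game with two slot machines, A and B.\n"
        f"Your past choices and outcomes so far: {history}\n"
        "Which machine do you choose next? Answer with a single letter, A or B."
    )
    choice = query_llm(prompt).strip().upper()[:1]
    reward = int(random.random() < REWARD_PROBS.get(choice, 0.0))
    history.append((choice, reward))
    choices.append(choice)

# The behavioural record can then be analysed like human data: learning curves,
# exploration/exploitation indices, computational model fits, and so on.
print("proportion of 'A' choices:", choices.count("A") / len(choices))
```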

Eric Schulz, Max Planck Research Group Leader in Tübingen, works on the building blocks of intelligence using a mixture of computational, cognitive, and neuroscientific methods. He has worked with Maarten Speekenbrink on generalization as function learning, as well as with Sam Gershman and Josh Tenenbaum.

Binz, M., & Schulz, E. (2023). Using cognitive psychology to understand GPT-3. Proceedings of the National Academy of Sciences, 120(6), e2218523120.

Akata, E., Schulz, L., Coda-Forno, J., Oh, S. J., Bethge, M., & Schulz, E. (2023). Playing repeated games with Large Language Models. arXiv preprint arXiv:2305.16867.

Allen, K. R., Brändle, F., Botvinick, M., Fan, J., Gershman, S. J., Griffiths, T. L., … & Schulz, E. (2023). Using Games to Understand the Mind.

Binz, M., & Schulz, E. (2023). Turning large language models into cognitive models. arXiv preprint.

Enactivist Symbol Grounding: From Attentional Anchors to Mathematical Discourse (Abrahamson)

Dor Abrahamson, Faculty of Education, UC Berkeley, 26-Oct 2023

VIDEO

ABSTRACT: According to the embodiment hypothesis, knowledge is the capacity for perceptuomotor enactment, situated in the world as much as in the body: a way of engaging the environment in anticipation of accomplishing interactions. What does this mean for educational practice? What is the embodiment or enactment of abstract ideas, like justice, photosynthesis, or algebra? What is the teacher’s role in embodied designs for learning? I will describe my lab’s educational design-based collaborative research on mathematical learning, and how an embodied perspective came to inform our analysis and promotion of content learning. I will describe how students spontaneously generate perceptual solutions to motor-control problems; these solutions then become verbal through the adoption of symbolic artifacts provided by the teacher. This approach can also help students with diverse sensorimotor capacities.

Dor Abrahamson is Professor in the Graduate School of Education at the University of California, Berkeley, where he established the Embodied Design Research Laboratory, devoted to pedagogical technologies for teaching and learning mathematics. He is particularly interested in relations between learning to move in new ways and learning mathematical concepts. His research draws on embodied cognition, dynamic systems theory, and sociocultural theory.

Abrahamson, D., & Sánchez-García, R. (2016). Learning is moving in new ways: The ecological dynamics of mathematics education. Journal of the Learning Sciences, 25(2), 203–239.

Abrahamson, D. (2021). Grasp actually: An evolutionist argument for enactivist mathematics education. Human Development, 65(2), 1–17. https://doi.org/10.1159/000515680

Shvarts, A., & Abrahamson, D. (2023). Coordination dynamics of semiotic mediation: A functional dynamic systems perspective on mathematics teaching/learning. In T. Veloz, R. Videla, & A. Riegler (Eds.), Education in the 21st century [Special issue]. Constructivist Foundations, 18(2), 220–234. https://constructivist.info/18/2 

The Debate Over “Understanding” in AI’s Large Language Models (Mitchell)

Melanie Mitchell, Santa Fe Institute, 19-Oct 2023

VIDEO

ABSTRACT:  I will survey a current, heated debate in the AI research community on whether large pre-trained language models can be said — in any important sense — to “understand” language and the physical and social situations language encodes. I will describe arguments that have been made for and against such understanding, and, more generally, will discuss what methods can be used to fairly evaluate understanding and intelligence in AI systems.  I will conclude with key questions for the broader sciences of intelligence that have arisen in light of these discussions. 

Melanie Mitchell is Professor at the Santa Fe Institute. Her current research focuses on conceptual abstraction and analogy-making in artificial intelligence systems.  Melanie is the author or editor of six books and numerous scholarly papers in the fields of artificial intelligence, cognitive science, and complex systems. Her 2009 book Complexity: A Guided Tour (Oxford University Press) won the 2010 Phi Beta Kappa Science Book Award, and her 2019 book Artificial Intelligence: A Guide for Thinking Humans (Farrar, Straus, and Giroux) is a finalist for the 2023 Cosmos Prize for Scientific Writing. 

Mitchell, M. (2023). How do we know how smart AI systems are? Science, 381(6654), adj5957.

Mitchell, M., & Krakauer, D. C. (2023). The debate over understanding in AI’s large language models. Proceedings of the National Academy of Sciences, 120(13), e2215907120.

Millhouse, T., Moses, M., & Mitchell, M. (2022). Embodied, Situated, and Grounded Intelligence: Implications for AI. arXiv preprint arXiv:2210.13589.