Machine Psychology (Schulz)

Eric Schulz, MPI Tuebingen, 02-Nov 2023

VIDEO

ABSTRACT: Large language models are on the cusp of transforming society while they permeate into many applications. Understanding how they work is, therefore, of great value. We propose to use insights and tools from psychology to study and better understand these models. Psychology can add to our understanding of LLMs and provide a new toolkit for explaining LLMs by providing theoretical concepts, experimental designs, and computational analysis approaches. This can lead to a machine psychology for foundation models that focuses on computational insights and precise experimental comparisons instead of performance measures alone. I will showcase the utility of this approach by showing how current LLMs behave across a variety of cognitive tasks, as well as how one can make them more human-like by fine-tuning on psychological data directly.

Eric Schulz, Max-Planck Research Group Leader, Tuebingen University, works on the building blocks of intelligence using a mixture of computational, cognitive, and neuroscientific methods. He has worked with Maarten Speekenbrink on generalization as function learning, and with Sam Gershman and Josh Tenenbaum.

Binz, M., & Schulz, E. (2023). Using cognitive psychology to understand GPT-3. Proceedings of the National Academy of Sciences, 120(6), e2218523120.

Akata, E., Schulz, L., Coda-Forno, J., Oh, S. J., Bethge, M., & Schulz, E. (2023). Playing repeated games with Large Language Models. arXiv preprint arXiv:2305.16867.

Allen, K. R., Brändle, F., Botvinick, M., Fan, J., Gershman, S. J., Griffiths, T. L., … & Schulz, E. (2023). Using Games to Understand the Mind

Binz, M., & Schulz, E. (2023). Turning large language models into cognitive models. arXiv preprint.

The Debate Over “Understanding” in AI’s Large Language Models (Mitchell)

Melanie Mitchell, Santa Fe Institute, 19-Oct

VIDEO

ABSTRACT:  I will survey a current, heated debate in the AI research community on whether large pre-trained language models can be said — in any important sense — to “understand” language and the physical and social situations language encodes. I will describe arguments that have been made for and against such understanding, and, more generally, will discuss what methods can be used to fairly evaluate understanding and intelligence in AI systems.  I will conclude with key questions for the broader sciences of intelligence that have arisen in light of these discussions. 

Melanie Mitchell is Professor at the Santa Fe Institute. Her current research focuses on conceptual abstraction and analogy-making in artificial intelligence systems.  Melanie is the author or editor of six books and numerous scholarly papers in the fields of artificial intelligence, cognitive science, and complex systems. Her 2009 book Complexity: A Guided Tour (Oxford University Press) won the 2010 Phi Beta Kappa Science Book Award, and her 2019 book Artificial Intelligence: A Guide for Thinking Humans (Farrar, Straus, and Giroux) is a finalist for the 2023 Cosmos Prize for Scientific Writing. 

Mitchell, M. (2023). How do we know how smart AI systems are? Science, 381(6654), adj5957.

Mitchell, M., & Krakauer, D. C. (2023). The debate over understanding in AI’s large language models. Proceedings of the National Academy of Sciences, 120(13), e2215907120.

Millhouse, T., Moses, M., & Mitchell, M. (2022). Embodied, Situated, and Grounded Intelligence: Implications for AI. arXiv preprint arXiv:2210.13589.

Rethinking the Physical Symbol Systems Hypothesis (Rosenbloom)

Paul Rosenbloom, Computer Science, USC, 12-Oct 2023

VIDEO

ABSTRACT: It is now more than a half-century since the Physical Symbol Systems Hypothesis (PSSH) was first articulated as an empirical hypothesis.  More recent evidence from work with neural networks and cognitive architectures has weakened it, but it has not yet been replaced in any satisfactory manner.  Based on a rethinking of the nature of computational symbols – as atoms or placeholders – and thus also of the systems in which they participate, a hybrid approach is introduced that responds to these challenges while also helping to bridge the gap between symbolic and neural approaches, resulting in two new hypotheses, one – the Hybrid Symbol Systems Hypothesis (HSSH) – that is to replace the PSSH and the other focused more directly on cognitive architectures.  This overall approach has been inspired by how hybrid symbol systems are central in the Common Model of Cognition and the Sigma cognitive architectures, both of which will be introduced – along with the general notion of a cognitive architecture – via “flashbacks” during the presentation.

Paul S. Rosenbloom is Professor Emeritus of Computer Science in the Viterbi School of Engineering at the University of Southern California (USC).  His research has focused on cognitive architectures (models of the fixed structures and processes that together yield a mind), such as Soar and Sigma; the Common Model of Cognition (a partial consensus about the structure of a human-like mind); dichotomic maps (structuring the space of technologies underlying AI and cognitive science); “essential” definitions of key concepts in AI and cognitive science (such as intelligence, theories, symbols, and architectures); and the relational model of computing as a great scientific domain (akin to the physical, life and social sciences).

Rosenbloom, P. S. (2023). Rethinking the Physical Symbol Systems Hypothesis. In Proceedings of the 16th International Conference on Artificial General Intelligence (pp. 207-216). Cham, Switzerland: Springer.

Laird, J. E., Lebiere, C. & Rosenbloom, P. S. (2017). A Standard Model of the Mind: Toward a Common Computational Framework across Artificial Intelligence, Cognitive Science, Neuroscience, and Robotics. AI Magazine, 38(4), 13-26.

Rosenbloom, P. S., Demski, A. & Ustun, V. (2016). The Sigma cognitive architecture and system: Towards functionally elegant grand unification. Journal of Artificial General Intelligence, 7, 1-103.

Rosenbloom, P. S., Demski, A. & Ustun, V. (2016). Rethinking Sigma’s graphical architecture: An extension to neural networks.  Proceedings of the 9th Conference on Artificial General Intelligence (pp. 84-94).  

Symbols and Grounding in LLMs (Pavlick)

Ellie Pavlick, Computer Science, Brown, 05-Oct 2023

VIDEO

ABSTRACT: Large language models (LLMs) appear to exhibit human-level abilities on a range of tasks, yet they are notoriously considered to be “black boxes”, and little is known about the internal representations and mechanisms that underlie their behavior. This talk will discuss recent work which seeks to illuminate the processing that takes place under the hood. I will focus in particular on questions related to LLMs’ ability to represent abstract, compositional, and content-independent operations of the type assumed to be necessary for advanced cognitive functioning in humans.

Ellie Pavlick is an Assistant Professor of Computer Science at Brown University. She received her PhD from University of Pennsylvania in 2017, where her focus was on paraphrasing and lexical semantics. Ellie’s research is on cognitively-inspired approaches to language acquisition, focusing on grounded language learning and on the emergence of structure (or lack thereof) in neural language models. Ellie leads the language understanding and representation (LUNAR) lab, which collaborates with Brown’s Robotics and Visual Computing labs and with the Department of Cognitive, Linguistic, and Psychological Sciences.

Tenney, Ian, Dipanjan Das, and Ellie Pavlick. “BERT Rediscovers the Classical NLP Pipeline.” Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. https://arxiv.org/pdf/1905.05950.pdf

Pavlick, Ellie. “Symbols and grounding in large language models.” Philosophical Transactions of the Royal Society A 381.2251 (2023): 20220041. https://royalsocietypublishing.org/doi/pdf/10.1098/rsta.2022.0041

Lepori, Michael A., Thomas Serre, and Ellie Pavlick. “Break it down: evidence for structural compositionality in neural networks.” arXiv preprint arXiv:2301.10884 (2023). https://arxiv.org/pdf/2301.10884.pdf

Merullo, Jack, Carsten Eickhoff, and Ellie Pavlick. “Language Models Implement Simple Word2Vec-style Vector Arithmetic.” arXiv preprint arXiv:2305.16130 (2023). https://arxiv.org/pdf/2305.16130.pdf

Grounding in Large Language Models:  Functional Ontologies for AI (Mollo)

Dimitri Coelho Mollo, Philosophy of AI, Umeå University, 21 sept 2023

VIDEO

ABSTRACT: I will describe joint work with Raphaël Millière, arguing that language grounding (but not language understanding) is possible in some current Large Language Models (LLMs). This does not mean, however, that the way language grounding works in LLMs is similar to how grounding works in humans. The differences open up two options: narrowing the notion of grounding to only the phenomenon in humans; or pluralism about grounding, extending the notion more broadly to systems that fulfil the requirements for intrinsic content. Pluralism invites applying recent work in comparative and cognitive psychology to AI, especially the search for appropriate ontologies to account for cognition and intelligence. This can help us better understand the capabilities and limitations of current AI systems, as well as potential ways forward.

Dimitri Coelho Mollo is Assistant Professor with a focus on Philosophy of Artificial Intelligence at the Department of Historical, Philosophical and Religious Studies at Umeå University, Sweden, and focus area coordinator at TAIGA (Centre for Transdisciplinary AI) for the area ‘Understanding and Explaining Artificial Intelligence’. He is also an external Principal Investigator at the Science of Intelligence Cluster in Berlin, Germany. His research focuses on foundational and epistemic questions within artificial intelligence and cognitive science, looking for ways to improve our understanding of mind, cognition, and intelligence in biological and artificial systems. His work often intersects with issues in Ethics of Artificial Intelligence, Philosophy of Computing, and Philosophy of Biology.

Coelho Mollo and Millière (2023), The Vector Grounding Problem

Francken, Slors, Craver (2022), Cognitive ontology and the search for neural mechanisms: three foundational problems

LLMs are impressive but we still need grounding to explain human cognition (Bergen)

Benjamin Bergen, Cognitive Science, UCSD, 14 sept 2023

VIDEO

ABSTRACT: Human cognitive capacities are often explained as resulting from grounded, embodied, or situated learning. But Large Language Models, which only learn on the basis of word co-occurrence statistics, now rival human performance in a variety of tasks that would seem to require these very capacities. This raises the question: is grounding still necessary to explain human cognition? I report on studies addressing three aspects of human cognition: Theory of Mind, Affordances, and Situation Models. In each case, we run both human and LLM participants on the same task and ask how much of the variance in human behavior is explained by the LLMs. As it turns out, in all cases, human behavior is not fully explained by the LLMs. This entails that, at least for now, we need grounding (or, more accurately, something that goes beyond statistical language learning) to explain these aspects of human cognition. I’ll conclude by asking but not answering a number of questions, like, How long will this remain the case? What are the right criteria for an LLM that serves as a proxy for human statistical language learning? and, How could one tell conclusively whether LLMs have human-like intelligence?
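As a rough illustration of the "variance explained" question raised in the abstract, the following is a minimal sketch and not the analysis used in the cited studies. It assumes one numeric score per item for both groups and a simple least-squares fit of human scores on LLM scores; the function name `variance_explained` and the example scores are hypothetical.

```python
from statistics import mean


def variance_explained(human, llm):
    """Return R^2 for a simple least-squares fit of human ~ a + b * llm.

    `human` and `llm` are equal-length sequences of per-item scores
    (hypothetical data; the real studies may use richer models).
    """
    if len(human) != len(llm) or len(human) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")

    h_bar, m_bar = mean(human), mean(llm)
    # Sums of squares and cross-products around the means.
    sxy = sum((m - m_bar) * (h - h_bar) for m, h in zip(llm, human))
    sxx = sum((m - m_bar) ** 2 for m in llm)
    syy = sum((h - h_bar) ** 2 for h in human)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for R^2 to be defined")

    # For a one-predictor linear fit, R^2 equals the squared correlation.
    return (sxy ** 2) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-item scores, for illustration only.
    human_scores = [0.9, 0.7, 0.4, 0.2, 0.8]
    llm_scores = [0.8, 0.6, 0.5, 0.4, 0.6]
    r2 = variance_explained(human_scores, llm_scores)
    print(f"share of human variance explained by the LLM: {r2:.2f}")
    print(f"share left unexplained: {1 - r2:.2f}")
```

Under this reading, a value clearly below 1 corresponds to the abstract's conclusion that human behavior is not fully explained by the LLMs.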

Ben Bergen is Professor of Cognitive Science at UC San Diego, where he directs the Language and Cognition Lab. His research focuses on language processing and production with a special interest in meaning. He’s also the author of ‘Louder than Words: The New Science of How the Mind Makes Meaning’ and ‘What the F: What Swearing Reveals about Our Language, Our Brains, and Ourselves.’

Trott, S., Jones, C., Chang, T., Michaelov, J., & Bergen, B. (2023). Do Large Language Models know what humans know? Cognitive Science 47(7): e13309.

Chang, T. & B. Bergen (2023). Language Model Behavior: A Comprehensive Survey. Computational Linguistics.

Michaelov, J., S. Coulson, & B. Bergen (2023). Can Peanuts Fall in Love with Distributional Semantics? Proceedings of the 45th Annual Meeting of the Cognitive Science Society. Austin, TX: Cognitive Science Society.

Jones, C., Chang, T., Coulson, S., Michaelov, J., Trott, S., & Bergen, B. (2022). Distributional Semantics Still Can’t Account for Affordances. Proceedings of the 44th Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.

Conscious processing, inductive biases and generalization in deep learning (Bengio)

Yoshua Bengio, MILA, Université de Montréal, 17 Feb 2023

VIDEO

Abstract: Humans are very good at “out-of-distribution” generalization (compared to current AI systems). It would be useful to determine the inductive biases they exploit and translate them into machine-learning architectures, training frameworks and experiments. I will discuss several of these hypothesized inductive biases. Many exploit notions in causality and connect abstractions in representation learning (perception and interpretation) with reinforcement learning (abstract actions). Systematic generalizations may arise from efficient factorization of knowledge into recomposable pieces. This is partly related to symbolic AI (as seen in the errors and limitations of reasoning in humans, as well as in our ability to learn to do this at scale, with distributed representations and efficient search). Sparsity of the causal graph and locality of interventions (observable in the structure of sentences) may reduce the computational complexity of both inference (including planning) and learning. This may be why evolution incorporated this as “consciousness.” I will also suggest some open research questions to stimulate further research and collaborations.

Yoshua Bengio is Professor at the University of Montreal, founder and scientific director of Mila – Institut québécois d’AI, and co-director of CIFAR’s Machine Learning, Biological Learning program as a Senior Fellow. He also serves as scientific director of IVADO.

Butlin, P., Long, R., Elmoznino, E., Bengio, Y., Birch, J., Constant, A., … & VanRullen, R. (2023). Consciousness in artificial intelligence: insights from the science of consciousness. arXiv preprint arXiv:2308.08708.

Zador, A., Escola, S., Richards, B., Ölveczky, B., Bengio, Y., Boahen, K., … & Tsao, D. (2023). Catalyzing next-generation artificial intelligence through NeuroAI. Nature Communications, 14(1), 1597.

Active inference and artificial curiosity (Friston)

Karl Friston, UCL, 8 December, 2022

VIDEO

Abstract: This talk offers a formal account of insight and learning in terms of active (Bayesian) inference. It deals with the dual problem of inferring states of the world and learning its statistical structure. In contrast to current trends in machine learning (e.g., deep learning), we focus on how agents learn from a small number of ambiguous outcomes to form insight. I will use simulations of abstract rule-learning and approximate Bayesian inference to show that minimising (expected) free energy leads to active sampling of novel contingencies. This epistemic, curiosity-directed behaviour closes ‘explanatory gaps’ in knowledge about the causal structure of the world, thereby reducing ignorance, in addition to resolving uncertainty about states of the known world. We then move from inference to model selection or structure learning to show how abductive processes emerge when agents test plausible hypotheses about symmetries in their generative models of the world. The ensuing Bayesian model reduction evokes mechanisms associated with sleep and has all the hallmarks of aha moments.

Karl Friston, Professor, Institute of Neurology, UCL, models functional integration in the human brain and the principles that underlie neuronal interactions. His main contribution to theoretical neurobiology is a free-energy principle for action and perception (active inference).

Medrano, J., Friston, K., & Zeidman, P. (2024). Linking fast and slow: the case for generative models. Network Neuroscience, 8(1), 24-43.

Pezzulo, G., Parr, T., & Friston, K. (2024). Active inference as a theory of sentient behavior. Biological Psychology, 108741.

Cognitive architectures and their applications (Lebière)

Christian Lebière, Carnegie-Mellon, 20 October, 2022

VIDEO

Abstract: Cognitive architectures are computational implementations of unified theories of cognition. Being able to represent human cognition in computational form enables a wide range of applications when humans and machines interact. Using cognitive models to represent common ground between deep learners and human users enables adaptive explanations. Cognitive models representing the behavior of cyber attackers can be used to optimize cyber defenses including techniques such as deceptive signaling. Cognitive models of human-automation interaction can improve robustness of human-machine teams by predicting disruptions to measures of trust under various adversarial situations. Finally, the consensus of 50 years of research in cognitive architectures can be captured in the form of a Common Model of Cognition that can provide a guide for neuroscience, artificial intelligence and robotics. 

Christian Lebière is a Research Faculty member in the Psychology Department at Carnegie Mellon University. His main research interests are cognitive architectures and their applications to psychology, artificial intelligence, human-computer interaction, decision-making, intelligent agents, network science, cognitive robotics and neuromorphic engineering. 

Cranford, E. A., Gonzalez, C., Aggarwal, P., Tambe, M., Cooney, S., & Lebiere, C. (2021). Towards a cognitive theory of cyber deception. Cognitive Science, 45(7), e13013.

Cranford, E., Gonzalez, C., Aggarwal, P., Cooney, S., Tambe, M., & Lebiere, C. (2020). Adaptive cyber deception: Cognitively informed signaling for cyber defense.

Lebiere, C., Blaha, L. M., Fallon, C. K., & Jefferson, B. (2021). Adaptive cognitive mechanisms to maintain calibrated trust and reliance in automation. Frontiers in Robotics and AI, 8, 652776.

Laird, J. E., Lebiere, C., & Rosenbloom, P. S. (2017). A standard model of the mind: Toward a common computational framework across artificial intelligence, cognitive science, neuroscience, and robotics. AI Magazine, 38(4), 13-26.

Lebiere, C., Pirolli, P., Thomson, R., Paik, J., Rutledge-Taylor, M., Staszewski, J., & Anderson, J. R. (2013). A functional model of sensemaking in a neurocognitive architecture. Computational Intelligence and Neuroscience, 2013.

Constraining networks biologically to explain grounding (Pulvermüller)

Friedemann Pulvermueller, FU Berlin, 3 December, 2020

VIDEO

Abstract: Meaningful use of symbols requires grounding in action and perception through learning. The mechanisms of this sensorimotor grounding, however, are rarely specified in mechanistic terms; and mathematically precise formal models of the relevant learning processes are scarce. As the brain is the device that is critical for mechanistically supporting and indeed implementing grounding, modelling needs to take into account realistic neuronal processes in the human brain. This makes it desirable to use not just ‘neural’ networks that are vaguely similar to some aspects of real networks of neurons, but models implementing constraints imposed by neuronal structure and function, that is, biologically realistic learning and brain structure along with local and global structural connectivity and functional interaction. After discussing brain constraints for cognitive modelling, the talk will focus on the biological implementation of grounding, in order to address the following questions: Why do the brains of humans — but not those of their closest relatives — allow for verbal working memory and learning of huge vocabularies of symbols? Why do different word and concept types seem to depend on different parts of the brain (‘category-specific’ semantic mechanisms)? Why are there ‘semantic and conceptual hubs’ in the brain where general semantic knowledge is stored — and why would these brain areas be different from those areas where grounding information is present (i.e., the sensory and motor cortices)? And why should sensory deprivation shift language and conceptual processing toward ‘grounding areas’ — for example toward the visual cortex in the blind? I will argue that brain-constrained modelling is necessary to answer (some of) these questions and, more generally, to explain the mechanisms of grounding. 

Friedemann Pulvermüller is professor in the neuroscience of language and pragmatics at the Freie Universität Berlin, where he also directs the ‘Brain Language Laboratory’. 

Carota, F., Nili, H., Kriegeskorte, N., & Pulvermüller, F. (2023). Experientially-grounded and distributional semantic vectors uncover dissociable representations of conceptual categories. Language, Cognition and Neuroscience, 1-25.

Pulvermüller, F., Garagnani, M., & Wennekers, T. (2014). Thinking in circuits: Towards neurobiological explanation in cognitive neuroscience. Biological Cybernetics, 108(5), 573-593. doi: 10.1007/s00422-014-0603-9