Rethinking the Physical Symbol Systems Hypothesis (Rosenbloom)

Paul Rosenbloom, Computer Science, USC, 12-Oct 2023

VIDEO

ABSTRACT: It is now more than a half-century since the Physical Symbol Systems Hypothesis (PSSH) was first articulated as an empirical hypothesis.  More recent evidence from work with neural networks and cognitive architectures has weakened it, but it has not yet been replaced in any satisfactory manner.  Based on a rethinking of the nature of computational symbols – as atoms or placeholders – and thus also of the systems in which they participate, a hybrid approach is introduced that responds to these challenges while also helping to bridge the gap between symbolic and neural approaches.  The result is two new hypotheses: one – the Hybrid Symbol Systems Hypothesis (HSSH) – that is to replace the PSSH, and another focused more directly on cognitive architectures.  This overall approach has been inspired by how hybrid symbol systems are central in the Common Model of Cognition and the Sigma cognitive architecture, both of which will be introduced – along with the general notion of a cognitive architecture – via “flashbacks” during the presentation.

Paul S. Rosenbloom is Professor Emeritus of Computer Science in the Viterbi School of Engineering at the University of Southern California (USC).  His research has focused on cognitive architectures (models of the fixed structures and processes that together yield a mind), such as Soar and Sigma; the Common Model of Cognition (a partial consensus about the structure of a human-like mind); dichotomic maps (structuring the space of technologies underlying AI and cognitive science); “essential” definitions of key concepts in AI and cognitive science (such as intelligence, theories, symbols, and architectures); and the relational model of computing as a great scientific domain (akin to the physical, life and social sciences).

 Rosenbloom, P. S. (2023). Rethinking the Physical Symbol Systems Hypothesis.  In Proceedings of the 16th International Conference on Artificial General Intelligence (pp. 207-216).  Cham, Switzerland: Springer.  

Laird, J. E., Lebiere, C. & Rosenbloom, P. S. (2017). A Standard Model of the Mind: Toward a Common Computational Framework across Artificial Intelligence, Cognitive Science, Neuroscience, and Robotics. AI Magazine, 38(4), 13-26.

Rosenbloom, P. S., Demski, A. & Ustun, V. (2016). The Sigma cognitive architecture and system: Towards functionally elegant grand unification. Journal of Artificial General Intelligence, 7, 1-103.

Rosenbloom, P. S., Demski, A. & Ustun, V. (2016). Rethinking Sigma’s graphical architecture: An extension to neural networks.  Proceedings of the 9th Conference on Artificial General Intelligence (pp. 84-94).  

Symbols and Grounding in LLMs (Pavlick)

Ellie Pavlick, Computer Science, Brown, 05-Oct 2023

VIDEO

ABSTRACT: Large language models (LLMs) appear to exhibit human-level abilities on a range of tasks, yet they are notoriously considered to be “black boxes”, and little is known about the internal representations and mechanisms that underlie their behavior. This talk will discuss recent work which seeks to illuminate the processing that takes place under the hood. I will focus in particular on questions related to LLMs’ ability to represent abstract, compositional, and content-independent operations of the type assumed to be necessary for advanced cognitive functioning in humans.
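To make concrete the kind of analysis this line of work involves (an illustrative sketch only, not the lab’s actual method), a linear “probe” asks whether an abstract property can be decoded from a model’s hidden states by a simple classifier. The placeholder activations below stand in for real layer-wise LLM states.

```python
# Minimal linear-probing sketch (illustrative; placeholder data, not real LLM states).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder "hidden states": in practice these would be layer-wise activations
# extracted from an LLM for inputs that do or do not instantiate some abstract,
# content-independent operation.
n_examples, hidden_dim = 1000, 256
hidden_states = rng.normal(size=(n_examples, hidden_dim))
labels = rng.integers(0, 2, size=n_examples)  # the abstract property to decode

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, labels, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")
# Accuracy reliably above chance on held-out items is taken as evidence that the
# property is linearly represented at the layer being probed (with random vectors,
# as here, it should stay near 0.5).
```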

Ellie Pavlick is an Assistant Professor of Computer Science at Brown University. She received her PhD from University of Pennsylvania in 2017, where her focus was on paraphrasing and lexical semantics. Ellie’s research is on cognitively-inspired approaches to language acquisition, focusing on grounded language learning and on the emergence of structure (or lack thereof) in neural language models. Ellie leads the language understanding and representation (LUNAR) lab, which collaborates with Brown’s Robotics and Visual Computing labs and with the Department of Cognitive, Linguistic, and Psychological Sciences.

Tenney, Ian, Dipanjan Das, and Ellie Pavlick. “BERT Rediscovers the Classical NLP Pipeline.” Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. https://arxiv.org/pdf/1905.05950.pdf

Pavlick, Ellie. “Symbols and grounding in large language models.” Philosophical Transactions of the Royal Society A 381.2251 (2023): 20220041. https://royalsocietypublishing.org/doi/pdf/10.1098/rsta.2022.0041

Lepori, Michael A., Thomas Serre, and Ellie Pavlick. “Break it down: evidence for structural compositionality in neural networks.” arXiv preprint arXiv:2301.10884 (2023). https://arxiv.org/pdf/2301.10884.pdf

Merullo, Jack, Carsten Eickhoff, and Ellie Pavlick. “Language Models Implement Simple Word2Vec-style Vector Arithmetic.” arXiv preprint arXiv:2305.16130 (2023). https://arxiv.org/pdf/2305.16130.pdf

From the History of Philosophy to AI: Does Thinking Require Sensing? (Chalmers)

David Chalmers, Center for Mind, Brain & Consciousness, NYU, 28-Sep 2023

VIDEO

ABSTRACT: There has recently been widespread discussion of whether large language models might be sentient or conscious. Should we take this idea seriously? I will discuss the underlying issue and will break down the strongest reasons for and against. I suggest that given mainstream assumptions in the science of consciousness, there are significant obstacles to consciousness in current models: for example, their lack of recurrent processing, a global workspace, and unified agency. At the same time, it is quite possible that these obstacles will be overcome in the next decade or so. I conclude that while it is somewhat unlikely that current large language models are conscious, we should take seriously the possibility that extensions and successors to large language models may be conscious in the not-too-distant future.

David Chalmers is University Professor of Philosophy and Neural Science and co-director of the Center for Mind, Brain, and Consciousness at New York University. He is the author of The Conscious Mind (1996), Constructing the World (2012), and Reality+: Virtual Worlds and the Problems of Philosophy (2022). He is known for formulating the “hard problem” of consciousness, and (with Andy Clark) for the idea of the “extended mind,” according to which the tools we use can become parts of our minds.

Chalmers, D. J. (2023). Could a large language model be conscious?. arXiv preprint arXiv:2303.07103.

Chalmers, D. J. (2022). Reality+: Virtual Worlds and the Problems of Philosophy. Penguin.

Chalmers, D. J. (1995). Facing up to the problem of consciousness. Journal of Consciousness Studies, 2(3), 200-219.

Clark, A., & Chalmers, D. (1998). The extended mind. Analysis, 58(1), 7-19.

Grounding in Large Language Models:  Functional Ontologies for AI (Mollo)

Dimitri Coelho Mollo, Philosophy of AI, Umeå University, 21 Sep 2023

VIDEO

ABSTRACT: I will describe joint work with Raphaël Millière, arguing that language grounding (but not language understanding) is possible in some current Large Language Models (LLMs). This does not mean, however, that the way language grounding works in LLMs is similar to how grounding works in humans.  The differences open up two options: narrowing the notion of grounding to only the phenomenon in humans; or pluralism about grounding, extending the notion more broadly to systems that fulfil the requirements for intrinsic content. Pluralism invites applying recent work in comparative and cognitive psychology to AI, especially the search for appropriate ontologies to account for cognition and intelligence. This can help us better understand the capabilities and limitations of current AI systems, as well as potential ways forward.

Dimitri Coelho Mollo is Assistant Professor in Philosophy of Artificial Intelligence at the Department of Historical, Philosophical and Religious Studies, Umeå University, Sweden, and focus area coordinator at TAIGA (Centre for Transdisciplinary AI) for the area ‘Understanding and Explaining Artificial Intelligence’. He is also an external Principal Investigator at the Science of Intelligence Cluster in Berlin, Germany. His research focuses on foundational and epistemic questions within artificial intelligence and cognitive science, looking for ways to improve our understanding of mind, cognition, and intelligence in biological and artificial systems. His work often intersects issues in Ethics of Artificial Intelligence, Philosophy of Computing, and Philosophy of Biology.

Coelho Mollo and Millière (2023), The Vector Grounding Problem

Francken, Slors, Craver (2022), Cognitive ontology and the search for neural mechanisms: three foundational problems

LLMs are impressive but we still need grounding to explain human cognition (Bergen)

Benjamin Bergen, Cognitive Science, UCSD, 14 Sep 2023

VIDEO

ABSTRACT: Human cognitive capacities are often explained as resulting from grounded, embodied, or situated learning. But Large Language Models, which only learn on the basis of word co-occurrence statistics, now rival human performance in a variety of tasks that would seem to require these very capacities. This raises the question: is grounding still necessary to explain human cognition? I report on studies addressing three aspects of human cognition: Theory of Mind, Affordances, and Situation Models. In each case, we run both human and LLM participants on the same task and ask how much of the variance in human behavior is explained by the LLMs. As it turns out, in all cases, human behavior is not fully explained by the LLMs. This entails that, at least for now, we need grounding (or, more accurately, something that goes beyond statistical language learning) to explain these aspects of human cognition. I’ll conclude by asking, but not answering, a number of questions: How long will this remain the case? What are the right criteria for an LLM that serves as a proxy for human statistical language learning? And how could one tell conclusively whether LLMs have human-like intelligence?
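As a deliberately simplified illustration of the variance-explained logic described above (invented data, not the studies’ actual analysis pipeline), the sketch below regresses per-item human scores on an LLM-derived predictor and reports R².

```python
# Variance-explained sketch with invented data (not the studies' actual pipeline).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n_items = 200

# Hypothetical per-item LLM-derived predictor (e.g., the probability a model
# assigns to a target completion) and hypothetical mean human responses.
llm_predictor = rng.normal(size=(n_items, 1))
human_scores = 0.6 * llm_predictor[:, 0] + rng.normal(scale=1.0, size=n_items)

reg = LinearRegression().fit(llm_predictor, human_scores)
r2 = reg.score(llm_predictor, human_scores)
print(f"variance in human behavior explained by the LLM: R^2 = {r2:.2f}")
# The studies then ask whether this R^2 accounts for all of the explainable
# variance in human behavior; in the reported work it does not, which is what
# motivates appealing to grounding (or something beyond statistical language
# learning).
```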

Ben Bergen is Professor of Cognitive Science at UC San Diego, where he directs the Language and Cognition Lab. His research focuses on language processing and production with a special interest in meaning. He’s also the author of ‘Louder than Words: The New Science of How the Mind Makes Meaning’ and ‘What the F: What Swearing Reveals about Our Language, Our Brains, and Ourselves.’

Trott, S., Jones, C., Chang, T., Michaelov, J., & Bergen, B. (2023). Do Large Language Models know what humans know? Cognitive Science 47(7): e13309.

Chang, T. & B. Bergen (2023). Language Model Behavior: A Comprehensive Survey. Computational Linguistics.

Michaelov, J., S. Coulson, & B. Bergen (2023). Can Peanuts Fall in Love with Distributional Semantics? Proceedings of the 45th Annual Meeting of the Cognitive Science Society. Austin, TX: Cognitive Science Society.

Jones, C., Chang, T., Coulson, S., Michaelov, J., Trott, S., & Bergen, B. (2022). Distributional Semantics Still Can’t Account for Affordances. Proceedings of the 44th Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.

Conscious processing, inductive biases and generalization in deep learning (Bengio)

Yoshua Bengio, Mila, Université de Montréal, 17 Feb 2023

VIDEO

Abstract: Humans are very good at “out-of-distribution” generalization (compared to current AI systems). It would be useful to determine the inductive biases they exploit and translate them into machine learning architectures, training frameworks and experiments. I will discuss several of these hypothesized inductive biases. Many exploit notions in causality and connect abstractions in representation learning (perception and interpretation) with reinforcement learning (abstract actions). Systematic generalization may arise from efficient factorization of knowledge into recomposable pieces. This is partly related to symbolic AI (as seen in the errors and limitations of reasoning in humans, as well as in our ability to learn to do this at scale, with distributed representations and efficient search). Sparsity of the causal graph and locality of interventions — observable in the structure of sentences — may reduce the computational complexity of both inference (including planning) and learning. This may be why evolution incorporated this as “consciousness.” I will also suggest some open research questions to stimulate further research and collaborations.
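For reference, the formal point behind “factorization of knowledge into recomposable pieces” can be stated in standard causal-modeling notation (generic notation, not specific to the talk):

```latex
% With a sparse causal graph, the joint distribution over variables factorizes
% into local mechanisms, each conditioning only on a few parents:
p(x_1, \dots, x_n) \;=\; \prod_{i=1}^{n} p\big(x_i \mid \mathrm{pa}(x_i)\big).
% A local intervention on x_j replaces only the single factor
% p(x_j \mid \mathrm{pa}(x_j)), leaving the other mechanisms (the recomposable
% pieces) unchanged, which is why adaptation and inference after such changes
% can be computationally cheap.
```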

Yoshua Bengio is Professor at the Université de Montréal, founder and scientific director of Mila – Institut québécois d’intelligence artificielle, and co-director of CIFAR’s Learning in Machines & Brains program as a Senior Fellow. He also serves as scientific director of IVADO.

Butlin, P., Long, R., Elmoznino, E., Bengio, Y., Birch, J., Constant, A., … & VanRullen, R. (2023). Consciousness in artificial intelligence: Insights from the science of consciousness. arXiv preprint arXiv:2308.08708.

Zador, A., Escola, S., Richards, B., Ölveczky, B., Bengio, Y., Boahen, K., … & Tsao, D. (2023). Catalyzing next-generation artificial intelligence through NeuroAI. Nature Communications, 14(1), 1597.

Active inference and artificial curiosity (Friston)

Karl Friston, UCL, 8 December, 2022

VIDEO

Abstract: This talk offers a formal account of insight and learning in terms of active (Bayesian) inference. It deals with the dual problem of inferring states of the world and learning its statistical structure. In contrast to current trends in machine learning (e.g., deep learning), we focus on how agents learn from a small number of ambiguous outcomes to form insight. I will use simulations of abstract rule-learning and approximate Bayesian inference to show that minimising (expected) free energy leads to active sampling of novel contingencies. This epistemic, curiosity-directed behaviour closes ‘explanatory gaps’ in knowledge about the causal structure of the world, thereby reducing ignorance, in addition to resolving uncertainty about states of the known world. We then move from inference to model selection or structure learning to show how abductive processes emerge when agents test plausible hypotheses about symmetries in their generative models of the world. The ensuing Bayesian model reduction evokes mechanisms associated with sleep and has all the hallmarks of ‘aha’ moments.
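For readers unfamiliar with the formalism, the quantities the abstract appeals to have the following standard textbook form (generic notation; not the exact discrete-state formulation used in the talk’s simulations):

```latex
% Variational free energy for a generative model p(o,s) over outcomes o and
% hidden states s, with approximate posterior q(s): an upper bound on surprise.
F \;=\; \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o,s)\big]
  \;=\; D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big] \;-\; \ln p(o).

% Expected free energy of a policy \pi, whose minimisation yields the epistemic,
% curiosity-directed behaviour described above: a negative epistemic
% (information-gain) term plus a negative pragmatic (preference-satisfying) term.
G(\pi) \;=\; -\,\mathbb{E}_{q(o,s \mid \pi)}\big[\ln q(s \mid o, \pi) - \ln q(s \mid \pi)\big]
        \;-\; \mathbb{E}_{q(o \mid \pi)}\big[\ln p(o)\big].
```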

Karl Friston, Professor at the Institute of Neurology, UCL, models functional integration in the human brain and the principles that underlie neuronal interactions. His main contribution to theoretical neurobiology is a free-energy principle for action and perception (active inference).

Medrano, J., Friston, K., & Zeidman, P. (2024). Linking fast and slow: The case for generative models. Network Neuroscience, 8(1), 24-43.

Pezzulo, G., Parr, T., & Friston, K. (2024). Active inference as a theory of sentient behavior. Biological Psychology, 108741.

Cognitive architectures and their applications (Lebière)

Christian Lebière, Carnegie Mellon, 20 October, 2022

VIDEO

Abstract: Cognitive architectures are computational implementations of unified theories of cognition. Being able to represent human cognition in computational form enables a wide range of applications when humans and machines interact. Using cognitive models to represent common ground between deep learners and human users enables adaptive explanations. Cognitive models representing the behavior of cyber attackers can be used to optimize cyber defenses including techniques such as deceptive signaling. Cognitive models of human-automation interaction can improve robustness of human-machine teams by predicting disruptions to measures of trust under various adversarial situations. Finally, the consensus of 50 years of research in cognitive architectures can be captured in the form of a Common Model of Cognition that can provide a guide for neuroscience, artificial intelligence and robotics. 

Christian Lebière is a Research Faculty member in the Psychology Department at Carnegie Mellon University. His main research interests are cognitive architectures and their applications to psychology, artificial intelligence, human-computer interaction, decision-making, intelligent agents, network science, cognitive robotics and neuromorphic engineering. 

Cranford, E. A., Gonzalez, C., Aggarwal, P., Tambe, M., Cooney, S., & Lebiere, C. (2021). Towards a cognitive theory of cyber deception. Cognitive Science, 45(7), e13013.

Cranford, E., Gonzalez, C., Aggarwal, P., Cooney, S., Tambe, M., & Lebiere, C. (2020). Adaptive cyber deception: Cognitively informed signaling for cyber defense.

Lebiere, C., Blaha, L. M., Fallon, C. K., & Jefferson, B. (2021). Adaptive cognitive mechanisms to maintain calibrated trust and reliance in automation. Frontiers in Robotics and AI, 8, 652776.

Laird, J. E., Lebiere, C., & Rosenbloom, P. S. (2017). A standard model of the mind: Toward a common computational framework across artificial intelligence, cognitive science, neuroscience, and robotics. AI Magazine, 38(4), 13-26.

Lebiere, C., Pirolli, P., Thomson, R., Paik, J., Rutledge-Taylor, M., Staszewski, J., & Anderson, J. R. (2013). A functional model of sensemaking in a neurocognitive architecture. Computational Intelligence and Neuroscience, 2013.

Constraining networks biologically to explain grounding (Pulvermüller)

Friedemann Pulvermüller, FU Berlin, 3 December, 2020

VIDEO

Abstract: Meaningful use of symbols requires grounding in action and perception through learning. The mechanisms of this sensorimotor grounding, however, are rarely specified in mechanistic terms; and mathematically precise formal models of the relevant learning processes are scarce. As the brain is the device that is critical for mechanistically supporting and indeed implementing grounding, modelling needs to take into account realistic neuronal processes in the human brain. This makes it desirable to use not just ‘neural’ networks that are vaguely similar to some aspects of real networks of neurons, but models implementing constraints imposed by neuronal structure and function, that is, biologically realistic learning and brain structure along with local and global structural connectivity and functional interaction. After discussing brain constraints for cognitive modelling, the talk will focus on the biological implementation of grounding, in order to address the following questions: Why do the brains of humans — but not those of their closest relatives — allow for verbal working memory and learning of huge vocabularies of symbols? Why do different word and concept types seem to depend on different parts of the brain (‘category-specific’ semantic mechanisms)? Why are there ‘semantic and conceptual hubs’ in the brain where general semantic knowledge is stored — and why would these brain areas be different from those areas where grounding information is present (i.e., the sensory and motor cortices)? And why should sensory deprivation shift language and conceptual processing toward ‘grounding areas’ — for example toward the visual cortex in the blind? I will argue that brain-constrained modelling is necessary to answer (some of) these questions and, more generally, to explain the mechanisms of grounding. 
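To give a flavour of the correlation-learning mechanism that such models build on, here is a minimal Hebbian sketch (illustrative only; the brain-constrained models discussed in the talk add realistic between-area connectivity, inhibition, and other neurobiological constraints):

```python
# Minimal Hebbian correlation-learning sketch (illustrative only; not the
# architecture used in the lab's simulations).
import numpy as np

rng = np.random.default_rng(2)
n_units = 50
weights = np.zeros((n_units, n_units))
learning_rate = 0.01

# Two co-occurring activity patterns: a word-form pattern and a sensorimotor
# pattern, as when a word is used while acting or perceiving (grounding).
word_pattern = (rng.random(n_units) < 0.2).astype(float)
sensorimotor_pattern = (rng.random(n_units) < 0.2).astype(float)

for _ in range(100):
    activity = np.clip(word_pattern + sensorimotor_pattern, 0, 1)
    # Hebbian update: units that fire together wire together.
    weights += learning_rate * np.outer(activity, activity)
    np.fill_diagonal(weights, 0.0)

# After learning, presenting the word-form pattern alone sends input to the
# associated sensorimotor units via the learned connections, but not to units
# that never took part in the co-activation.
recalled = weights @ word_pattern
print("mean input to sensorimotor units:", recalled[sensorimotor_pattern > 0].mean())
print("mean input to uninvolved units:  ",
      recalled[(word_pattern + sensorimotor_pattern) == 0].mean())
```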

Friedemann Pulvermüller is professor in the neuroscience of language and pragmatics at the Freie Universität Berlin, where he also directs the ‘Brain Language Laboratory’. 

Carota, F., Nili, H., Kriegeskorte, N., & Pulvermüller, F. (2023). Experientially-grounded and distributional semantic vectors uncover dissociable representations of conceptual categories. Language, Cognition and Neuroscience, 1-25.

Pulvermüller, F., Garagnani, M., & Wennekers, T. (2014). Thinking in circuits: Towards neurobiological explanation in cognitive neuroscience. Biological Cybernetics, 108(5), 573-593. doi: 10.1007/s00422-014-0603-9 

Grounded Language Learning in Virtual Environments (Clark)

Stephen Clark, U Cambridge and Quantinuum, 19 November, 2020

VIDEO

Abstract: Natural Language Processing is currently dominated by the application of text-based language models such as GPT. One feature of these models is that they rely entirely on the statistics of text, without making any connection to the world, which raises the interesting question of whether such models could ever properly “understand” the language. One way these models can be grounded is to connect them to images or videos, for example by conditioning the language models on visual input and using them for captioning. In this talk I extend the grounding idea to a simulated virtual world: an environment which an agent can perceive, explore and interact with. A neural-network-based agent is trained — using distributed deep reinforcement learning — to associate words and phrases with things that it learns to see and do in the virtual world. The world is 3D, built in Unity, and contains recognisable objects, including some from the ShapeNet repository of assets. One of the difficulties in training such networks is that they have a tendency to overfit to their training data. We demonstrate how the interactive, first-person perspective of an agent helps it to generalize to out-of-distribution settings. Training the agents typically requires a huge number of training examples. We show how meta-learning can be used to teach the agents to bind words to objects in a one-shot setting. The agent is able to combine its knowledge of words obtained one-shot with its stable knowledge of word meanings learned over many episodes, providing a form of grounded language learning which is both “fast and slow”. Joint work with Felix Hill.
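For a concrete picture of the kind of agent described above, here is a toy sketch (invented names and sizes, and none of the distributed deep RL training machinery) of a policy network that conditions jointly on a language instruction and a visual observation:

```python
# Toy sketch of a grounded agent: the policy conditions on both a language
# instruction and a visual observation. Names, shapes, and the stand-in vision
# encoder are invented for illustration; the real agents are trained with
# distributed deep reinforcement learning in Unity environments.
import torch
import torch.nn as nn


class GroundedAgent(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, vision_dim=128, n_actions=8):
        super().__init__()
        # Language pathway: embed instruction tokens and summarise them.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.language_encoder = nn.GRU(embed_dim, embed_dim, batch_first=True)
        # Vision pathway: stand-in for a CNN over first-person pixels.
        self.vision_encoder = nn.Linear(vision_dim, embed_dim)
        # Policy head over the fused multimodal representation.
        self.policy = nn.Sequential(
            nn.Linear(2 * embed_dim, 128), nn.ReLU(), nn.Linear(128, n_actions)
        )

    def forward(self, instruction_tokens, visual_features):
        _, h = self.language_encoder(self.embedding(instruction_tokens))
        lang = h[-1]                            # (batch, embed_dim)
        vis = torch.relu(self.vision_encoder(visual_features))
        fused = torch.cat([lang, vis], dim=-1)  # grounding: fuse both modalities
        return self.policy(fused)               # action logits


agent = GroundedAgent()
tokens = torch.randint(0, 1000, (1, 5))         # placeholder tokenised instruction
pixels = torch.randn(1, 128)                    # placeholder visual features
print(agent(tokens, pixels).shape)              # torch.Size([1, 8])
```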

Clark, S., Lerchner, A., von Glehn, T., Tieleman, O., Tanburn, R., Dashevskiy, M., & Bosnjak, M. (2021). Formalising Concepts as Grounded Abstractions. arXiv preprint arXiv:2101.05125.

Tull, S., Shaikh, R. A., Zemljic, S. S., & Clark, S. (2024). From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models. arXiv preprint arXiv:2401.08585.