Building machines that learn and think like people


Lake BM, Ullman TD, Tenenbaum JB, Gershman SJ (2017) Building machines that learn and think like people. Behavioral and Brain Sciences 40:1-70. URL: https://cims.nyu.edu/~brenden/LakeEtAl2017BBS.pdf

“The difference between pattern recognition and model building, between prediction and explanation, is central to our view of human intelligence. Just as scientists seek to explain nature, not simply predict it, we see human thought as fundamentally a model building activity.” (p. 2).

“The central goal of this article is to propose a set of core ingredients for building more human-like learning and thinking machines. We elaborate on each of these ingredients and topics in Section 4, but here we briefly overview the key ideas. The first set of ingredients focuses on developmental “start-up software,” or cognitive capabilities present early in development. … We focus on two pieces of developmental start-up software (see Wellman & Gelman [1992] for a review of both). First is intuitive physics (sect. 4.1.1): Infants have primitive object concepts that allow them to track objects over time and to discount physically implausible trajectories. … A second type of software present in early development is intuitive psychology (sect. 4.1.2): Infants understand that other people have mental states like goals and beliefs, and this understanding strongly constrains their learning and predictions. … Our second set of ingredients focus on learning. Although there are many perspectives on learning, we see model building as the hallmark of human-level learning, or explaining observed data through the construction of causal models of the world (sect. 4.2.2). From this perspective, the early-present capacities for intuitive physics and psychology are also causal models of the world. A primary job of learning is to extend and enrich these models and to build analogous causally structured theories of other domains. Compared with state-of-the-art algorithms in machine learning, human learning is distinguished by its richness and its efficiency. Children come with the ability and the desire to uncover the underlying causes of sparsely observed events and to use that knowledge to go far beyond the paucity of the data. It might seem paradoxical that people are capable of learning these richly structured models from very limited amounts of experience. We suggest that compositionality and learning-to-learn are ingredients that make this type of rapid model learning possible (sects. 4.2.1 and 4.2.3, respectively). A final set of ingredients concerns how the rich models our minds build are put into action, in real time (sect. 4.3). It is remarkable how fast we are to perceive and to act.” (p. 4).
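
A toy sketch can make the compositionality point concrete. The stroke-like primitives and the combination rule below are invented for illustration and are not the authors' model; the point is only that a small inventory of reusable parts yields a combinatorially large space of candidate concepts, which is one reason sparse data can still pin down rich structure.

```python
from itertools import permutations

# Toy illustration (not Lake et al.'s model): reusable "stroke" primitives
# compose into character-like concepts. With p parts and k ordered slots,
# the concept space grows combinatorially, so one example is informative.
PRIMITIVES = ["arc", "line", "dot", "hook", "loop"]  # invented part inventory

def compositions(primitives, k):
    """All ordered, non-repeating arrangements of k parts."""
    return list(permutations(primitives, k))

print(len(compositions(PRIMITIVES, 2)))  # 20 two-part concepts from 5 parts
print(len(compositions(PRIMITIVES, 3)))  # 60 three-part concepts from 5 parts

# Learning-to-learn enters when the part inventory itself is distilled from
# prior concepts, so each new concept is learned against richer primitives.
```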

“Here we present two challenge problems for machine learning and AI: learning simple visual concepts (Lake et al. 2015a) and learning to play the Atari game Frostbite (Mnih et al. 2015).” (p. 5).

“Figure 1. The Characters Challenge: Human-level learning of novel handwritten characters (A), with the same abilities also illustrated for a novel two-wheeled vehicle (B). A single example of a new visual concept (red box) can be enough information to support the (i) classification of new examples, (ii) generation of new examples, (iii) parsing an object into parts and relations, and (iv) generation of new concepts from related concepts. Adapted from Lake et al. (2015a).” (p. 6).
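
Operationally, task (i) asks: given one labeled example per novel class, label a new image. The sketch below is a deliberately naive nearest-neighbor baseline, not Lake et al.'s Bayesian Program Learning model; the 28x28 image shape and raw pixel distance are placeholder assumptions (a learned embedding would normally stand in for the latter).

```python
import numpy as np

def one_shot_classify(support, query):
    """Nearest-neighbor baseline for one-shot classification.

    support: dict mapping class label -> its single example image (2-D array).
    query:   image to classify (same shape).
    Returns the label whose lone example is closest in pixel space.
    """
    return min(support, key=lambda label: np.linalg.norm(support[label] - query))

# Usage with random stand-in "characters" (28x28 grids):
rng = np.random.default_rng(0)
support = {c: (rng.random((28, 28)) > 0.5).astype(float) for c in "ABC"}
query = support["B"] + rng.normal(0.0, 0.1, (28, 28))  # noisy copy of class B
print(one_shot_classify(support, query))  # expected: B
```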

“In Frostbite, players control an agent (Frostbite Bailey) tasked with constructing an igloo within a time limit. The igloo is built piece by piece as the agent jumps on ice floes in water (Fig. 2A–C). … The Frostbite example is a particularly telling contrast when compared with human play. Even the best deep networks learn gradually over many thousands of game episodes, take a long time to reach good performance, and are locked into particular input and goal patterns. Humans, after playing just a small number of games over a span of minutes, can understand the game and its goals well enough to perform better than deep networks do after almost a thousand hours of experience. Even more impressively, people understand enough to invent or accept new goals, generalize over changes to the input, and explain the game to others. Why are people different?” (p. 7-9).
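
For scale, a minimal loop like the one below counts raw frames of Frostbite experience, assuming the `gymnasium` and `ale-py` packages and the `ALE/Frostbite-v5` environment id; the random policy and five-episode budget are illustrative. The "almost a thousand hours" in the text corresponds to roughly 200 million emulator frames at 60 Hz for the DQN of Mnih et al. (2015), against minutes of human play.

```python
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # recent ale-py versions need explicit registration
env = gym.make("ALE/Frostbite-v5")

total_frames = 0
for episode in range(5):  # a human-scale budget: a handful of games
    obs, info = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()  # random policy as a placeholder
        obs, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated
        total_frames += 1
env.close()

print(f"frames of experience: {total_frames}")  # vs ~2e8 for a trained DQN
```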

“4. Core ingredients of human intelligence: In the Introduction, we laid out what we see as core ingredients of intelligence. Here we consider the ingredients in detail and contrast them with the current state of neural network modeling. Although these are hardly the only ingredients needed for human-like learning and thought (see our discussion of language in sect. 5), they are key building blocks, which are not present in most current learning-based AI systems – certainly not all present together – and for which additional attention may prove especially fruitful. We believe that integrating them will produce significantly more powerful and more human-like learning and thinking abilities than we currently see in AI systems. Before considering each ingredient in detail, it is important to clarify that by “core ingredient” we do not necessarily mean an ingredient that is innately specified by genetics or must be “built in” to any learning algorithm.” (p. 9).

“We have focused on how cognitive science can motivate and guide efforts to engineer human-like AI, in contrast to some advocates of deep neural networks who cite neuroscience for inspiration. Our approach is guided by a pragmatic view that the clearest path to a computational formalization of human intelligence comes from understanding the “software” before the “hardware.” In the case of this article, we proposed key ingredients of this software in previous sections. Nonetheless, a cognitive approach to intelligence should not ignore what we know about the brain. Neuroscience can provide valuable inspirations for both cognitive models and AI researchers: The centrality of neural networks and model-free reinforcement learning in our proposals for “thinking fast” (sect. 4.3) are prime exemplars.” (p. 20).

“We believe that understanding language and its role in intelligence goes hand-in-hand with understanding the building blocks discussed in this article. It is also true that language builds on the core abilities for intuitive physics, intuitive psychology, and rapid learning with compositional, causal models that we focus on. These capacities are in place before children master language, and they provide the building blocks for linguistic meaning and language acquisition (Carey 2009; Jackendoff 2003; Kemp 2007; O’Donnell 2015; Pinker 2007; Xu & Tenenbaum 2007).” (p. 21).

“There has been recent interest in integrating psychological ingredients with deep neural networks, especially selective attention (Bahdanau et al. 2015; Mnih et al. 2014; Xu et al. 2015), augmented working memory (Graves et al. 2014; 2016; Grefenstette et al. 2015; Sukhbaatar et al. 2015; Weston et al. 2015b), and experience replay (McClelland et al. 1995; Mnih et al. 2015).” (p. 22).
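
Of the three ingredients listed, experience replay is the simplest to show in a few lines. The sketch below is a generic uniform replay buffer in the spirit of Mnih et al. (2015), not any particular library's implementation; the class name and default capacity are illustrative.

```python
import random
from collections import deque

class ReplayBuffer:
    """Uniform experience replay: store transitions, sample random minibatches.

    Sampling at random decorrelates consecutive experiences and lets each
    transition be reused, loosely analogous to hippocampal replay
    (McClelland et al. 1995).
    """

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions fall away

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        # Transpose the list of transitions into per-field tuples.
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)

# Typical use inside an agent loop:
#   buffer.push(s, a, r, s_next, done)
#   if len(buffer) >= batch_size:
#       update_network(*buffer.sample(batch_size))
```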

“One worthy goal would be to build an AI system that beats a world-class player with the amount and kind of training human champions receive, rather than overpowering them with Google-scale computational resources. AlphaGo is initially trained on 28.4 million positions and moves from 160,000 unique games played by human experts; it then improves through reinforcement learning, playing 30 million more games against itself. Between the publication of Silver et al. (2016) and facing world champion Lee Sedol, AlphaGo was iteratively retrained several times in this way. The basic system always learned from 30 million games, but it played against successively stronger versions of itself, effectively learning from 100 million or more games altogether (D. Silver, personal communication, 2017). In contrast, Lee has probably played around 50,000 games in his entire life. Looking at numbers like these, it is impressive that Lee can even compete with AlphaGo. What would it take to build a professional-level Go AI that learns from only 50,000 games?” (p. 23).
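
The gap is worth making explicit. The figures below are taken directly from the quote; the arithmetic is just a ratio.

```python
# Experience budgets as quoted above.
alphago_games = 100_000_000  # ~100 million games, counting iterated self-play
lee_sedol_games = 50_000     # ~50,000 games over a human career

ratio = alphago_games / lee_sedol_games
print(f"AlphaGo saw ~{ratio:,.0f}x more games than Lee Sedol")  # ~2,000x
```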

[Open Peer Commentary: The architecture challenge: Future artificial-intelligence systems will require sophisticated architectures, and knowledge of the brain might guide their construction. Gianluca Baldassarre, Vieri Giuliano Santucci, Emilio Cartoni, and Daniele Caligiore] “We agree with the claim of Lake et al. that to obtain human-level learning speed and cognitive flexibility, future artificial-intelligence (AI) systems will have to incorporate key elements of human cognition: from causal models of the world, to intuitive psychological theories, compositionality, and knowledge transfer. However, the authors largely overlook the importance of a major challenge to implementation of the functions they advocate: the need to develop sophisticated architectures to learn, represent, and process the knowledge related to those functions. Here we call this the architecture challenge. In this commentary, we make two claims: (1) tackling the architecture challenge is fundamental to success in developing human-level AI systems; (2) looking at the brain can furnish important insights on how to face the architecture challenge. The difficulty of the architecture challenge stems from the fact that the space of the architectures needed to implement the several functions advocated by Lake et al. is huge.” (p. 25-26).

[Open Peer Commentary: Thinking like animals or thinking like colleagues? Daniel C. Dennett and Enoch Lambert] “The step up to human-style comprehension carries moral implications that are not mentioned in Lake et al.’s telling. Even the most powerful of existing AIs are intelligent tools, not colleagues, and whereas they can be epistemically authoritative (within limits we need to characterize carefully), and hence will come to be relied on more and more, they should not be granted moral authority or responsibility because they do not have skin in the game: they do not yet have interests, and simulated interests are not enough. We are not saying that an AI could not be created to have genuine interests, but that is down a very long road (Dennett 2017; Hurley et al. 2011). Although some promising current work suggests that genuine human consciousness depends on a fundamental architecture that would require having interests (Deacon 2012; Dennett 2013), long before that day arrives, if it ever does, we will have AIs that can communicate with natural language with their users (not collaborators).” (p. 34-35).

[Open Peer Commentary: Understand the cogs to understand cognition. Adam H. Marblestone, Greg Wayne, and Konrad P. Kording] “We argue that the study of evolutionarily conserved neural structures will provide a means to identify the brain’s true, fundamental inductive biases and how they actually arise.” (p. 43).

[Open Peer Commentary: Avoiding frostbite: It helps to learn from others. Michael Henry Tessler, Noah D. Goodman, and Michael C. Frank] “Learning from others also does more than simply “speed up” learning about the world. Human knowledge seems to accumulate across generations, hence permitting progeny to learn in one lifetime what no generation before them could learn (Boyd et al., 2011; Tomasello, 1999). We hypothesize that language – and particularly its flexibility to refer to abstract concepts – is key to faithful transmission of knowledge, between individuals and through generations.” (p. 48-49).