AI Progress = Leaderboards + Compute + Data + Algorithms

Companion presentation: AI Progress = Leaderboards + Compute + Data + Algorithms (2018-08-17, v3)
Paper: Rouse & Spohrer, Automating Versus Augmenting Intelligence (draft of 12-21-17)
Also see: https://www.slideshare.net/spohrer/ai-progress-leaderboards-compute-data-algorithms-20180817-v3

Opening quote:

Michael Witbrock, a manager of Cognitive Systems at IBM Research, says about two-thirds of the advances in AI over the past 10 years have come from increases in computer processing power, much of that from the use of graphics processing units (GPUs). About 20% of the gain came from bigger datasets, and 10% from better algorithms, he estimates. That’s changing, he says; “Advances in the fundamental algorithms for learning are now the main driver of progress.”

From: Anthes, G. (2017). Artificial Intelligence Poised to Ride a New Wave. Communications of the ACM, 60(7), 19–21.

AI progress can be measured by tracking scores on AI leaderboards

What is an AI leaderboard challenge? A challenge typically provides (1) input: a labeled data set, which is used to create (2) output: an AI model that is scored, ranked, and placed on the leaderboard website for the world to see. The team with the best-scoring AI model is ranked #1 and celebrated, until the next team of AI researchers/data scientists/software developers comes along and knocks them out of first place. The result of this competition is measurable AI progress. An example of a leaderboard is SQuAD (Stanford Question Answering Dataset). As the presentation above shows, AI leaderboard challenges span a wide range, from tasks where AI is already at super-human performance levels to tasks where AI performance is barely better than random guessing. Over time, the set of tasks that AI systems can perform at super-human, about human-level (par-human), and below human-level performance changes. The scores of the #1 ranked teams on leaderboards change regularly because of three main factors: more/better compute power, more/better labeled data, and more/better algorithm building blocks for competitors to use.
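
To make the scoring step concrete, here is a minimal sketch of how a leaderboard might score a submitted model's answers against hidden labeled data, assuming a SQuAD-style exact-match metric. The question IDs, answers, and function names are illustrative, not any leaderboard's actual API.

```python
# Minimal sketch of how a leaderboard might score a submission.
# Assumes a SQuAD-style exact-match metric; the data and function
# names are illustrative, not any leaderboard's actual API.

def exact_match_score(predictions, gold_answers):
    """Fraction of questions where the predicted answer exactly
    matches the labeled (gold) answer, after normalizing case."""
    correct = sum(
        1 for qid, pred in predictions.items()
        if pred.strip().lower() == gold_answers[qid].strip().lower()
    )
    return correct / len(gold_answers)

# Hidden test labels held by the leaderboard organizers.
gold = {"q1": "Denver Broncos", "q2": "1963"}

# A team's submitted model outputs.
submission = {"q1": "Denver Broncos", "q2": "1966"}

print(f"Exact match: {exact_match_score(submission, gold):.2f}")  # 0.50
```

Whichever team's submission scores highest on the hidden labels takes the #1 spot, until a better submission arrives.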

AI progress depends on more compute power

Moore’s law has been the primary way to describe the trend in computing power growth for about sixty years. As the diagrams in the presentation and paper above show, Moore’s law over that period can be summarized (approximately) as: the cost of computing decreases by a factor of 1000x every two decades, or equivalently, the amount of computing power that $1K buys increases by 1000x every two decades. For example, a “terascale” computer (a million million, or 10^12, instructions per second) costs about $1K today (2018). The human brain is estimated at about an “exascale” (a billion billion, or 10^18, instructions per second), though some estimates of the brain as a computer are as low as a “petascale” (see Scientific American). So one can estimate that in about twenty years (2038), an exascale computer will cost about $1M. Of course, no one can predict the future: while experts expect computing costs to continue to drop, and computers to perform ever more instructions per second, it is still a bit risky to predict when one might be able to buy a computer with the computing power equivalent of one human brain (estimated at an exascale). Nevertheless, in the presentation and paper, we estimate that around 2060 an exascale of computing may cost $1K. Still, no one can predict the future, and one danger of AI predictions/hype is contributing to AI bubbles. PowerAI is now used in the fastest supercomputer in the world, Summit, which IBM helped build at Oak Ridge National Laboratory in the USA; Summit runs at about 200 petaflops, or roughly 1/5 of the exascale human-brain estimate.
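
The dates above follow directly from the 1000x-every-two-decades rule. A small back-of-envelope calculation, using only the estimates in this post, makes the projection explicit:

```python
# Back-of-envelope projection of the 1000x-per-two-decades rule
# described above. The rule itself is this post's approximation of
# Moore's law; the years and scales below follow from it.

TERA, PETA, EXA = 1e12, 1e15, 1e18  # instructions per second

ops_per_1k_dollars = TERA  # ~$1K buys a terascale machine in 2018
year = 2018
while ops_per_1k_dollars < EXA:
    year += 20
    ops_per_1k_dollars *= 1000  # 1000x more compute per $1K every 20 years
    print(f"{year}: $1K buys ~{ops_per_1k_dollars:.0e} ops/sec")

# Output:
# 2038: $1K buys ~1e+15 ops/sec  (petascale; an exascale costs ~$1M)
# 2058: $1K buys ~1e+18 ops/sec  (exascale; rounded to "around 2060" above)
```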

AI progress depends on more labeled data

Labeled data can be thought of as thousands, millions, billions, or even trillions of input-output pairs, from which AI models can be built that produce the correct output when given an input. For example, ImageNet includes over ten million labeled images. Mozilla’s Common Voice project is creating large open speech data sets. The business of labeling data is growing (see Figure Eight). It is also possible to generate useful labeled data sets automatically using a number of techniques (see GANs).
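
As a concrete illustration of input-output pairs, here is a toy sketch: a handful of invented labeled examples and a simple nearest-neighbor “model” built from them. Real data sets like ImageNet hold millions of such pairs.

```python
# Minimal sketch of labeled data as input-output pairs, with a toy
# 1-nearest-neighbor "model" built from them. The tiny data set is
# invented for illustration only.

# (input features, label) pairs: here, [height_cm, weight_kg] -> species
labeled_pairs = [
    ([30, 4], "cat"), ([55, 25], "dog"),
    ([28, 5], "cat"), ([60, 30], "dog"),
]

def predict(x):
    """Return the label of the nearest training example."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = min(labeled_pairs, key=lambda pair: dist(pair[0], x))
    return nearest[1]

print(predict([32, 6]))   # "cat"
print(predict([58, 28]))  # "dog"
```

With more (and better) labeled pairs, the same simple procedure produces correct outputs for a wider range of inputs, which is why data is one of the three drivers of leaderboard progress.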

AI progress depends on more algorithm building blocks

Algorithms have progressed from hand-crafted instructions, to learning procedures, to composition of AI models. For example, Kaggle (acquired by Google) is probably the largest set of leaderboard challenges in the world, and Kaggle Masters (and Grandmasters), the people who win the most challenges, are expert at combining a set of lower-performing models into a composite model that scores better than any of the single models alone (see ensembling, sketched below). Kaggle Masters thus view earlier models as building blocks that can be combined with learning algorithms to create new, better-performing AI models. The result is progress in algorithms: more and better algorithm building blocks over time. These building blocks can be found in so-called model “zoos” and model asset exchanges (see IBM MAX and the CODAIT.org website, as well as the UIUC MLModelScope website, or ventures like Algorithmia). Another type of competition is building a particular type of model much faster than before, that is, finding techniques to train models faster and faster (see the fast.ai result). Some see the beginnings of a “Software 2.0” stack, as the progression from (a) hand-crafted instructions to (b) labeled data and learning algorithms to (c) composition of AI models unfolds.
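
Here is a minimal sketch of the ensembling idea: three simulated noisy “models” (invented for illustration, each correct about 70% of the time) are combined by majority vote into a composite that typically scores better than any single model.

```python
# Minimal sketch of ensembling, the technique Kaggle Masters use to
# combine lower-performing models into a stronger composite. The three
# "models" here are simulated noisy predictors, for illustration only.
import random

random.seed(0)
truth = [random.randint(0, 1) for _ in range(10_000)]  # true binary labels

def noisy_model(accuracy):
    """Simulate a model that predicts each label correctly with
    probability `accuracy`."""
    return [t if random.random() < accuracy else 1 - t for t in truth]

models = [noisy_model(0.70), noisy_model(0.72), noisy_model(0.68)]

# Composite model: majority vote across the three base models.
ensemble = [1 if sum(preds) >= 2 else 0 for preds in zip(*models)]

def score(preds):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

for i, m in enumerate(models):
    print(f"model {i}: {score(m):.3f}")
print(f"ensemble: {score(ensemble):.3f}")  # beats every single model here
```

The vote helps because the individual models’ errors are (here, by construction) independent, so they tend to cancel out, which is why combining building blocks can outperform any one of them.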

AI progress is advancing to single AI models that can do multiple tasks/leaderboards

As AI progress continues, a relatively new phenomenon is happening more and more: AI researchers are creating a single model that can perform multiple tasks (see Salesforce’s Einstein Natural Language Decathlon result and Stanford-Berkeley’s vision taskonomy result). This means that someday a single AI model may be ranked #1 on dozens of AI leaderboard challenges. This phenomenon marks the transition from narrow intelligence to broad intelligence for AI models/systems. Think about all the different tasks that a child must learn to do well before becoming an adult. The transition from child to adult in today’s world typically takes about 10 million minutes of experience (18 years). Also think of all the different types of models involved: models of tasks, models of the physical world, models of themselves and others, and certainly models of the social world, cultures, norms, institutions, and laws. A mature person has a brain and mind full of integrated models of the complex physical and social world. Furthermore, for adults, the transition from novice to expert in a wide range of work occupations takes about 2 million minutes of experience (4 years). In the future, it may be possible for AI models/systems to perform not only all the tasks of a single person, but also of “fictitious people” such as organizations or even whole nations. By some estimates, a modern national economy depends on about one thousand different types of occupations to function (see O*NET, the USA occupation network). So the USA economy, with a population of roughly 350 million and about 100 million workers (algorithms), represents about a hundred million billion billion (10^26) instructions per second (compute), with a combined experience on the order of a million billion (10^15) minutes (data), and this system is on the “leaderboard” that compares nations by economic output. Thomas Friedman, in his book “Thank You for Being Late,” mentions that what used to take the combined resources of a nation can now be done by a company, and someday may be doable by individuals using advanced technologies; for example, space flight to launch satellites or even the production of nuclear weapons. It is for this reason, rapidly growing technological capabilities, that the resilience of societal systems, which includes the ability to rapidly rebuild from scratch after a natural or human-made disaster, becomes so important (see the Call for Code).
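
The national-scale numbers above can be checked with simple arithmetic, using only this post’s own rough estimates (all are orders of magnitude, not measurements):

```python
# Back-of-envelope arithmetic for the national-scale comparison above,
# using only this post's own estimates. All figures are rough orders
# of magnitude, not measurements.

EXA = 1e18                     # ops/sec, the post's human-brain estimate
workers = 100e6                # ~100 million US workers ("algorithms")

child_to_adult_min = 10e6      # ~10 million minutes (18 years)
novice_to_expert_min = 2e6     # ~2 million minutes (4 years)

compute = workers * EXA        # combined "compute" of the workforce
experience = workers * (child_to_adult_min + novice_to_expert_min)

print(f"combined compute:    {compute:.0e} ops/sec")    # ~1e+26
print(f"combined experience: {experience:.0e} minutes")  # ~1e+15
```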

Concluding Remarks

AI progress can be measured by monitoring a portfolio of AI leaderboard challenges (see the Electronic Frontier Foundation’s AI Progress Measurement website). The three main drivers of AI advances/progress are (a) compute power, (b) labeled data, and (c) algorithm building blocks (see the CACM article and opening quote above). This short blog post includes links to more materials, including a presentation and a paper that suggest AI will be “solved” in a narrow sense by 2040 and in a broader sense by 2060. “Solved” relates to the range of tasks a single AI model can accomplish at adult human-level performance, as well as expert occupation-level performance. Solving AI will increase the need to improve the resilience of service systems (making them smarter and wiser) to both natural and human-made disasters.