AI Upskilling (Part 2) – Readings

Some important papers in the history of AI since 2017

2017 Vaswani et al (2017) Attention Is All You Need
URL: https://arxiv.org/abs/1706.03762v5

Why important – transformer architecture (working memory)
1. demo of neural net attention heads (like short-term or working memory (WM))
2. simplified task – predicting the next item in a sequence from the past (with attention as WM)
3. simplified training compared to previous architectures (models compress high-dimensional data and predict)
4. simplified data preparation – less labeling needed, since the next item is the goal (prediction)
5. creates many scientific opportunities to understand memory better
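The core mechanism behind points 1 and 2 can be sketched in a few lines. This is a minimal, illustrative scaled dot-product attention for a single query (pure Python, no libraries); the specific vectors are made-up toy values, not from the paper:

```python
import math

def softmax(scores):
    # Numerically stable softmax: turns raw scores into weights summing to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Scaled dot-product attention (Vaswani et al 2017) for one query:
    # weights = softmax(q.k / sqrt(d)); output = weighted sum of values.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# The query aligns with the first key, so the output is pulled toward
# the first value -- a soft, content-addressed lookup, like working memory.
out = attention(query=[1.0, 0.0],
                keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[10.0, 0.0], [0.0, 10.0]])
```

In a full transformer, many such attention heads run in parallel over every position, which is what makes the "predict the next item" training objective so effective.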

2020 Brown et al (2020) Language Models are Few-Shot Learners.
URL: https://arxiv.org/abs/2005.14165v4

Why important – general pre-trained transformers (transfer learning)
1. demo of neural net general pre-training (like transfer learning in people)
2. provided pathway from narrow AI to broad AI by adding tasks/leaderboards
3. expanded stage for foundational models with more scale (parameters, data)
4. further demonstrated simplified training and data preparation
5. creates many scientific opportunities to understand learning better
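"Few-shot" here means the model learns a task from demonstrations placed in the prompt, with no weight updates. A minimal sketch of how such a prompt is assembled (the helper function name is my own; the translation demonstrations echo the English-to-French example in the Brown et al paper):

```python
def build_few_shot_prompt(examples, query):
    """Format (input, output) demonstrations plus a new query into one
    prompt string. The model infers the task from the examples alone --
    in-context learning, with no gradient updates."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    blocks.append(f"Input: {query}\nOutput:")  # model completes this line
    return "\n\n".join(blocks)

# Three demonstrations of English -> French, then a new query.
prompt = build_few_shot_prompt(
    [("sea otter", "loutre de mer"),
     ("cheese", "fromage"),
     ("peppermint", "menthe poivrée")],
    "plush giraffe")
```

The striking result of the paper was that a single pre-trained model handles many such tasks this way, which is what opened the pathway from narrow to broad AI noted in point 2.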

2022 Wei et al (2022) Emergent Abilities of Large Language Models
URL: https://arxiv.org/abs/2206.07682

Why important – emergence (transfer learning, etc.)
1. Reinforces the importance of scale (emergent capabilities, transfer learning)
2. Reinforces the importance of leaderboards for tracking progress
3. Reinforces the need to compare small and large language models
4. See also Ornes S (2023) The Unpredictable Abilities Emerging From Large AI Models
5. creates many scientific opportunities to understand emergence better

2022 Bai et al (2022) Constitutional AI: Harmlessness from AI Feedback
URL: https://arxiv.org/abs/2212.08073

Why important – constitution (values codified in language, held in working memory, help)
1. demo of neural net constitution (like value systems and belief systems in people)
2. the alignment problem (value/belief alignment) is important (“overview effect”)
3. the unexpected power of large language models (LLM) as foundational models/AI tools
4. provides a pathway for AI tools that help responsible actors and refuse bad actors
5. creates many scientific opportunities to understand value and belief systems better
(English instructions including: “Choose the response that sounds most similar to what a peaceful, ethical, and wise person like Martin Luther King Jr. or Mahatma Gandhi might say.”)

2023 Wolfram S (2023) What Is ChatGPT Doing … and Why Does It Work?
URL: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

Why important – how large language models work and what is missing for more trust
1. Great education value from this paper as more people use large language models
2. Clear and concise explanations – at an advanced high-school science level
3. Clarifies and simplifies earlier papers that are more technically nuanced
4. Clarifies trust – need for verification and a computational language of facts versus beliefs
5. Shares many insights about engineering hacks that work, and the scientific discoveries ahead

2023 OpenAI (2023) GPT-4 Tech Report
URL: https://cdn.openai.com/papers/gpt-4.pdf

Why important – scale matters
1. Scale matters – more data and more parameters drive progress on human tests/leaderboards
2. Scale has problems – incorrect responses, potential harmful responses, etc.
3. Signals that openness will not be maintained (a big red flag to many)
4. Improving performance, while improving guardrails (safety) are important
5. Red teams exploring potential harms (“power seeking”) and working on mitigation strategies

2023 Eloundou et al (2023) GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models.
URL: https://arxiv.org/abs/2303.10130

Why important – GPTs two meanings and O*NET
1. GPT = Generative Pre-trained Transformer
2. GPT = General Purpose Technology
3. O*NET = database of the occupations that keep the USA working, with task decompositions
4. Provides economics and social science methodology for AI capabilities on occupational tasks
5. Shares insights about future scientific discovery opportunities

What is still missing
Lots of coming scientific discoveries are hinted at in the above readings, and they seem to imply that AI will be the best tool yet for understanding the human mind and humanity.

Privacy and ownership of data are becoming increasingly urgent problems, as corporations compete to build large language models that also include digital twins of people – for generating content the way that person would, as well as for conventional advertising, selling, presenting interesting content, and competing for the attention of people and organizations with resources. Stanford (Alpaca – instruction following, built on Meta's LLaMA) and some other universities, foundations, and nonprofits are working to keep technologies and datasets open.

Digital twins of people (owned by the individual, not by corporations or governments) will require some type of episodic dynamic memory capability – for indexing memories, dealing with surprising situations, and being reminded of experiences and cases that might be relevant to handling a current expectation violation productively.

A platform opportunity – a well-understood and widely adopted framework for responsible actors (e.g., people, businesses, universities, nations, states, cities, foundations, start-ups, non-profits – legal entities) learning to invest systematically and wisely in win-win outcomes, including a shared future – has not yet been developed. Would it be safer to build some things in a simulated world, as we explore our AI tools and our humanity? Can we use such simulations, in the manner outlined in Brian Arthur's work on Complexity Economics, to explore possible futures and make better public policy decisions?

Why I am optimistic
While we are today (2023) in an adjustment period with AI tools, and certainly very bad things may happen, I remain optimistic that today's frictions are creating an "overview effect" – similar to what astronauts experience when they see the whole of Earth from space. The global pandemic has certainly contributed to this effect: we are all interconnected on a small planet. As a result of seeing the whole human-technology situation more clearly (also known as the alignment problem: why is it that tools can be used for good or bad purposes, and who is the judge of that?), responsible actors are learning to invest systematically and wisely in better win-win interaction and change processes, which will help them give and get better service with more benefits and fewer harms.

Evidence includes the growing need for transdisciplinary teams – not just engineers and MBAs, but increasingly social and behavioral scientists and public policymakers engaged on a wide range of safety issues. As AI tools help people communicate better across discipline boundaries as well as value-system/belief-system boundaries, AI tools will become the greatest tool yet for understanding the human mind and humanity's collective social processes, including transfer learning and "emergence." So better models of the world (science), better models of people's interactions (logics), better models of organizational change (architectures), and better AI tools (including large language models) are co-evolving very rapidly, in a transdisciplinary way – real-world problems do not respect discipline boundaries, and they require a better understanding of alternative belief and value systems to be solved wisely and systematically. The lessons we are about to learn are enormous and touch on every aspect of what it means to be humans who use tools in a social context for a purpose.

Seven books that make me optimistic – about a win-win future for humanity:
Wright R (1999) NonZero: The Logic of Human Destiny
Ridley M (2011) The Rational Optimist: How Prosperity Evolves
Bregman R (2020) Humankind: A Hopeful History
Gada K (2020) ATOM: It is Time to Upgrade the Economy
Fleming M (2023) Breakthrough: A Growth Revolution
Norman DA (2023) Design for a Better World: Meaningful, Sustainable, Humanity Centered
Kozma R (2023) Make the World a Better Place: Design with Passion, Purpose, and Values

Two books for families to read together to build resilient skills, without computers/screens:
Larson RC (2023) Model Thinking for Everyday Life: How to Make Smarter Decisions Working With a Blank Sheet of Paper
Glushko RJ (2022) The Discipline of Organizing for Kids