Remembering Jean Paul Jacob (1937-2019): Brazil, IBM, and Berkeley’s Invisible Hand

Berkeley School of Engineering, June 20th, 2019

Remembering Jean Paul Jacob – the invisible hand guiding Brazil, IBM, and Berkeley into the future. Jean Paul would surely have us smiling and laughing in spite of this solemn occasion today. Thank you so much, Berkeley friends.

Jean Paul was a master communicator, like Carl Sagan in my mind. However, instead of billions and billions of stars, Jean Paul spoke of billions and billions of transistors.

Jean Paul was cool – he looked a bit like “the most interesting man in the world” from the beer commercials, with his well-groomed beard – which he slowly pulled on as he contemplated his next witty remark.

We miss Jean Paul greatly – his razor-sharp wit and passion for explaining the future of technology to business and government leaders, as well as to journalists and students of diverse backgrounds. Carolyn Wallace, who is here today, was our Almaden customer briefing leader, and she scheduled Jean Paul for hundreds of presentations over the years.

Jean Paul had 57 years of service at IBM, more than anyone else I have ever known: forty years as a regular employee and seventeen as IBM Research Emeritus.

After earning his bachelor’s degree in electrical engineering in Brazil and moving to France in 1960, Jean Paul started at IBM Nordics in Stockholm, Sweden, in 1962. The next year, he moved to the IBM Scientific Computing Research Center in California to work on the NASA Space Mission Simulator while earning his Berkeley degrees. He spent the 1970s back in Brazil working with universities on scientific computing, and in 1980 helped establish the IBM Brazil Scientific Center and Institute for Software Engineering, where he met Fabio Gandour, a doctor who later came to work for IBM (Camille Crittendon read Fabio’s eulogy earlier). In the mid-1980s he returned to California, with offices both at the newly built IBM Almaden Research Center and at the IBM Cottle Road storage systems center down the hill in San Jose. In 1991, he met Mike Foster.

I met Jean Paul when I started at IBM Almaden in 1998, and Jean Paul took on the big challenge of helping me learn to make a proper presentation. This proved to be a lifelong challenge for Jean Paul – but I hope he is smiling down on us all today, since I used no slides.

Thank you, Berkeley friends, for this wonderful event. I would just like to share a few of the comments that have been collected from those who worked with Jean Paul over the years at IBM:

Robin Williams (IBM Research – Almaden, Retired): “Jean Paul would go to Brazil about once a year and later I would hear that he was on TV there being interviewed, that he gave great talks about the future of technology and got rave reviews.  He was treated like a rock star. “

Sergio Borger (IBM Brazil, Executive): “JP was a role model for me, since I was in high school.”

John Cohn (IBM Fellow): “In honor of Jean Paul, I made this video about the Pickle Lightbulb, which I know was one of his favorite science experiments.”

Fernando Koch (IBM): “Jean Paul Jacob was a master communicator. When he spoke about the future of technology, people listened.”

Laura Anderson (IBM Almaden): “He was legendary.”

Ted Selker (IBM Fellow, Retired): “Jean Paul Jacob’s energy and caring were infectious… I so  appreciated him introducing me to fascinating opportunities and experiences… The most surprising experience was when he had Playboy Brazil interview me about our research project called Room With A View.”

Mike Ross (IBM Almaden Communications, Retired): “JP’s entertaining, ever-evolving talk, ‘The Future Is Not What It Used To Be!’ – would love to have an image of the JP-2000 universal portable computer/communicator he predicted in the early 1990s … which was so outrageous when he predicted it … but in retrospect, clearly something similar to what smartphones are today.”

Dan Russell (Google, formerly IBM, author of “The Joy of Search” book): “My favorite JP memory is that for years I had a standing Thursday afternoon meeting with him.  When I’d show up he’d say “Where’s my Dan Russell list??” and look for a piece of paper with my name on it. “

Ethevaldo Siqueira (Brazilian Journalist): “Few scientists in Brazil and the United States have contributed more than Jean-Paul Jacob to enrich our knowledge of digital technologies.”

Robert Morris (IBM, Retired, former Director of IBM Almaden): “Thanks for forwarding the sad news regarding Jean Paul.  He was such a towering force and was so truly caring about people and institutions.  He will be missed.”

Rich Pasco (IBM, Retired): “Jean Paul Jacob was my boss of sorts, in that he arranged my two trips to Brazil in the 1980’s to teach. I deeply respected his work in bringing about international collaboration on scientific and technical topics.”

Sonia Sachs (former IBM Almaden Researcher): “He always made me laugh. Jean Paul was incredibly selfless, never saying much about himself. Always listening. And he never let his illness reduce his incredible sense of humor, his interest in making our conversation light and happy. He said that he would find a way to communicate with me from the “Beleleu,” i.e., in the afterlife… We made a lot of jokes about the Beleleu conversations, all of which delighted Jean Paul. “


Timeline

Young Jean Paul Jacob

1937 (Jan) Brazil – Born São Paulo

1960 Brazil – Electrical Engineering degree from the Instituto Tecnológico de Aeronáutica

1960-1962 Europe, France – Industry and academic positions, including a possible master’s degree at the Sorbonne in aeronautical engineering.

1962 Europe, Sweden, Stockholm – IBM Europe Nordics

1963 USA, CA, San Jose – IBM San Jose – NASA Space Mission Simulations; PhD from Berkeley (1966) in Math & Engineering

1969 Brazil – Faculty University of São Paulo (USP), the Instituto Tecnológico de Aeronáutica (ITA) and the Federal University of Rio de Janeiro (UFRJ) – Systems Department.

1980 Brazil – Founded IBM Brazil Scientific Center and Institute for Software Engineering
(Fabio Gandour: “Jean Paul was the second person that I met when, in the early 80’s, I, still a practicing MD, went to IBM Brasil to propose a partnership between the Hospital Foundation of the Federal District and the then IBM Brasil Scientific Center.”)

1986 USA, CA, San Jose – IBM Research Almaden – Research Staff Member (Visiting Scholar Stanford and Berkeley)
(Mike Foster, email of June 18, 2019: “In 1991, when I started at Almaden, he had offices in Almaden and in Building 28 on plant site.”)

2002, USA, CA, Emeryville – Retired – IBM Researcher Emeritus; Faculty, Berkeley

2019, April 7 – Passed away, Emeryville, CA USA after a life well-lived.

Additional information: https://alchetron.com/Jean-Paul-Jacob

Jean Paul, on the right in the top photo – “the most interesting man in the world”

Journals, Conferences, Books

Journal of Service Research https://journals.sagepub.com/home/jsr Editor-in-Chief Michael K. Brady, Florida State University, USA

INFORMS Journal of Service Science https://pubsonline.informs.org/journal/serv Editor-in-Chief Saif Benjaafar, University of Minnesota, USA

International Journal of Systems and Service-Oriented Engineering (IJSSOE) https://www.igi-global.com/journal/international-journal-systems-service-oriented/1155 Editor-in-Chief: Dickson K.W. Chiu (The University of Hong Kong, Hong Kong) [Additional Contact Katelyn Hoover khoover@igi-global.com]

Originally posted by Jim Spohrer on 2 January 2013, 7:15 pm

Calls for papers with Service Science themes

Journals

Journal of Service Research
Editor-in-chief: Mary Jo Bitner
Founding Editor: Roland Rust
Impact Factor: 2.732
Ranked: 16 out of 113 in Business
Source: 2011 Journal Citation Reports® (Thomson Reuters, 2012)
News: About 25 articles a year since about 1990

Journal of Service Science (INFORMS)
Founding Editor: Robin Qiu
News: About 33 articles per year since 2009

International Journal of Information Systems in the Service Sector (IJISSS)
An Official Publication of the Information Resources Management Association
Editor-in-chief: John Wang
News: About 33 articles per year since 2009

International Journal of Service Science, Management, Engineering, and Technology (IJSSMET)
An Official Publication of the Information Resources Management Association
Contact: Miguel-Angel Sicilia (University of Alcalá, Spain)
News: About 30 articles per year since 2010

International Journal of E-Services and Mobile Applications
Editor-in-chief: Ada Scupola
News: About 20 articles per year since 2009

International Journal of Services Sciences
Inderscience Publishers
Editor-in-chief: Desheng (Dash) Wu
News: About 12 articles per year since 2008

Service Science
Online electronic journal
Editor-in-chief: Minder Chen
News: About 4 articles per year since 2008

Journal of Service Science
Clute Institute
Contact: Ronald Clute
News: About 11 articles per year since 2008

International Journal of u- and e-Service, Science and Technology
Science and Research Support Society (SERSC)
Contact: Jianhua Ma, Hosei University, Japan
Contact: Byeong-ho Kang, University of Tasmania, Australia
News: About 25 articles per year since 2008

Journal of Service Science and Management
Contact: Editor-in-Chief Prof. Samuel Mendlinger Boston University, USA
News: About 50 articles per year since 2008

Service Science and Management Research
Contact: Editorial Board: Dr. Rocío Pérez de Prado, University of Jaén, Spain
News: About 2 articles per year since 2012

International Journal of Quality and Service Sciences
Contact: Editor Professor Su Mi Dahlgaard Park, Lund University, Sweden ijqss@ch.lu.se
News: About 25 articles per year since 2008

Journal of Service Science Research
Contact: Editor-in-Chief: Daihwan Min
Society: The Society of Service Science
News: About 12 articles a year since 2008

Production Planning and Control
Organisational transformation in servitization
Deadline for submission: January 14th, 2013
Contact: “Paolo Gaiardelli” <paolo.gaiardelli@unibg.it>

Conferences, Workshops, Seminars:

Naples Forum on Service 2013 
Contact: Francesco Polese
Contact: Cristina Mele
Contact: Evert Gummesson
News: final deadline to submit proposals is 15 January 2013.
‘2013 Naples Forum on Service’ to be held in Ischia, June 2013

POMS College
Contact: Ravi Behara
Contact: Gang Li
News: deadline for abstract submission is January 18, 2013.
24th Annual Meeting in Denver, Colorado, May 3-6, 2013

MIT and the Digital Economy
Contact: @ErikB
Friday, January 18, 2013, Noon – 7:00 p.m.
Grand Hyatt San Francisco, 345 Stockton Street, San Francisco, CA
Participating speakers at this time include:
Rod Brooks – Founder, Chairman, and CTO, Rethink Robotics
Erik Brynjolfsson, PhD ’91 – Director, The MIT Center for Digital Business,
Schussel Family Professor of Management Science, MIT Sloan School of Management
Douglas Leone, SM ’88 – General Partner, Sequoia Capital
Andrew McAfee, ’88, ’89, LGO ’90 – Associate Director and Principal Research Scientist, MIT Center for Digital Business
Gokul Rajaram, MBA ’01 – Product Director, Ads, Facebook

Service oriented Enterprise Architecture for Enterprise Engineering (SoEA4EE 2012)
Contact: Selmin Nurcan
Contact: Rainer Schmidt
Info: Working on a 2013 special issue for IJISSS on SoEA4EE

ICIW 2013,
The Eighth International Conference on Internet and Web Applications and Services
Contact: Steffen Fries
June 23 – 28, 2013 – Rome, Italy

5th Annual International Service Innovation and Design Seminar on March 14, 2013!
Contact: Laurea – Uuden edellä | Prime mover
5th International SID Seminar | March 14, 2013 at 8:30-17:30
• What is the role of design in value creation?
• How do you ensure sustained value creation for all stakeholders?
• How do you improve your competitive advantage?

MSI’s conference Beyond the Product: Designing Customer Experiences at Stanford University on February 19-20, 2013 in Stanford, CA.
Contact: #custexpMSI

4th Summer School of the European Social Simulation Association (ESSA)
Hamburg University of Technology, July 15-19, 2013
Matthias Meyer and Iris Lorscheid
Hamburg University of Technology
Institute of Management Control and Accounting
http://www.cur.tu-harburg.de

THROUGH-LIFE ENGINEERING SERVICES (TESconf)
2nd International Conference
5th & 6th November 2013
Cranfield, Cranfield University, UK
Sponsor: EPSRC Centre for Innovative Manufacturing in Through-life Engineering Services
Contact: Rajkumar Roy
Deadline: 15th February 2013
Center first annual report:
1st Annual Report for 2011-12

University-Industry Demonstration Partnership Project Summit
January 15-17, 2013
National Academies’ Keck Building
500 5th St NW, Washington, DC 20001
Contact: Anthony Boccanfuso
UIDP: University-Industry Demonstration Partnership

SERVICE COMPUTATION 2013, The Fifth International Conferences on Advanced Service Computing
May 27 – June 1, 2013 – Valencia, Spain
Submission deadline: January 22, 2013
Sponsored by IARIA,
Contact: ?

First International Conference of Serviceology
Contact: @Yurikos

22nd annual International Conference on Management of Technology,
in Porto Alegre, Brazil (April 14-18, 2013)
Contact: “iamot@miami.edu”, Yasser.Hosni@ucf.edu, tkhalil@nileuniversity.edu.eg

15th IEEE Conference on Business Informatics

[successor of the IEEE Conference of e-Commerce and Enterprise Computing (CEC)]

Vienna (Austria), 15 – 18 July 2013
Contacts: Huemer Christian <huemer@big.tuwien.ac.at>, “Birgit Hofreiter” <birgit.hofreiter@tuwien.ac.at>
* Paper Submission: March 1, 2013

10th WSEAS International Conference on Engineering Education (EDUCATION ’13)
University of Cambridge, UK, February 20-22, 2013

Books:

Service Science: Research and Innovations in the Service Economy
Springer, Series Editors: Bill Hefley and Wendy Murphy
Website:

Service Systems & Innovations in Business & Society
Business Expert Press (BEP)
BEP, Series Editors: Haluk Demirkan and Jim Spohrer

Encyclopedia of Quality and the Service Economy (SAGE)
Contact: Su Mi Dahlgaard-Park

Finland 2019: OpenTech AI Workshop (May 6-7, Helsinki)

Join the live stream here.

Presentation on the Future of AI – Perspective 1: David Cox (IBM-MIT Partnership) – Narrow AI, Broad AI, General AI; Perspective 2: Jim Spohrer (IBM) – Better Building Blocks

Highlights include keynotes from the European Commission, Business Finland, VTT projects, and IBM on AI & ethics and the future of AI. Please consider attending and/or submitting a poster or tutorial.
Name: AI Workshop (2019 Finland #OpenTechAI)
Date: May 6-7, 2019
Location: IBM Finland HQ, Laajalahdentie 23, Helsinki
Size: 120+ Attendees, 1/3 industry, 1/3 academic, 1/3 government and other
For Invitation Contact: Jim Spohrer <spohrer@us.ibm.com>
Registration: http://ibm.biz/IBMOpentechAI2019
Sponsors: IBM Finland and VTT Finland (~3000 Researchers)
Focus: Open Data and AI Technologies
Note: Finland is a leader in open data, and aspires to be the world leader in “applying” artificial intelligence in real use cases to benefit society.

Agenda
Day 1 (Monday May 6)
1:00 pm-4:00 Tutorials (multiple tracks)
4:00-5:00 Free time
5:00-7:00 Reception and Poster session

Day 2 (Tuesday May 7) – will be live streamed
7:30am Breakfast
8:30 Welcome from Mirva Antila (IBM Finland Country Manager) and Antti Vasara (President and CEO VTT Finland)
9:00 Morning keynotes
Keynote: EC: Data for AI and related EU policies and programmes, Speaker: Kimmo Rossi (Head Research & Innovation Sector, European Commission)
Keynote: Finland: Finnish AI Landscape and Roadmap, Speaker: Mika Klemettinen (Director, Digitalisation, Business Finland)
Keynote: IBM: Put AI to Work for Business, Trust and Transparency, Co-speakers: Ann-Elise Delbec (IBM) and Dr. Jean-Francois Puget (IBM DE, Kaggle Grand Master)
12:00 Lunch
1:00 Afternoon three panels (each with one moderator, three panelists)
Panel 1: AI and Healthcare Projects
Miikka Kiiski, Janne Huttunen, Mikael von und zu Fraunberg, Aleksi Kopponen
Panel 2: AI and Industry/Energy Projects
Laura Sutinen, Shuli Goodman (Linux Foundation), Caj Södergård, Juho Korpela, Juha Rokka, Samuli Savo (Stora Enso)
Panel 3: AI and Open Source Projects
Haddad (Linux Foundation), Swirtun (FOSSID), Pakkala (VTT), Peret (Nokia), Roter (Mozilla)
4:00 Closing keynote – Future of AI, Jim Spohrer (IBM)
5:00 Closing thank-you

Program Committee
    Daniel Pakkala (VTT) and Jim Spohrer (IBM)
    Päivi Cederberg and Teppo Seesto (IBM Finland)
    Susan Malaika (IBM) and Tuomo Tuikka (VTT)
    Ibrahim Haddad (Linux Foundation)
    Eveliina Paljärvi (IBM) – publicity

How to Participate

Just to emphasize: we’d love posters too – or even a small tutorial. Please submit proposals here: https://easychair.org/conferences/?conf=otai2019

You can see photos from last year’s successful poster session here https://twitter.com/sumalaika/status/1113606625942806533

Link to last year’s event: https://developer.ibm.com/opentech/2018/01/29/helsinki-march-2018-opentech-ai-workshop/

Link to this year’s call for tutorials and posters: https://developer.ibm.com/opentech/2019/03/25/helsinki-may-2019-opentech-ai-workshop/



ISSIP Guide to Service/Systems/Innovation Degrees and Certifications

Motivation:
People from around the world email me every week asking about “service/systems/innovation degrees and certifications.” Their questions range from (1) where to get a PhD/Masters/Bachelors as a full-time student, to (2) the same as a remote, part-time, working professional, to (3) short-term certifications or certificates of completion for a one-week workshop, a few days at a conference, or a full- or half-day conference tutorial. Next they want to know how long such a program takes and its estimated cost, and whether there is an email address or URL for next steps. Finally, they ask for a textbook recommendation, or one book they could easily purchase and read to start preparing to apply to degree or certification programs. Some ask if there are online courses from Coursera or Udacity that I could recommend.

Traditional response:
In response to these queries, I have a short list of universities (by country) and textbooks that I usually point people to. However, I would like to do better than that.

Better response:
To do better, my first thought was to have ISSIP design a survey and then email members and colleagues to collect a first draft of the information more systematically. Compiling an ISSIP guide to service/systems/innovation degrees and certifications seems worth investigating. Then I realized that someone, or some organization, may have already done this in part or completely.

Questions:
(1) Has this already been done? What is the best that exists?

(2) Does it need to be done? Would the survey below be an OK first pass?

Proposed Survey:
Estimated time to complete the survey: 20 minutes, if your institution offers service/systems/innovation degrees or certifications.

Definition (casting a wide net, since it is a big tent): A “service/systems/innovation degree or certification” includes a service-related emphasis with a specialized expert teaching/research faculty or instructors for traditional academic degrees such as design, marketing, management, engineering, operations, economics, computing, web services, information systems, industrial engineering, operations research, analytics, decision-making, data sciences, artificial intelligence, law, public policy, anthropology, humanities, ethics, leadership, entrepreneurship, innovation, open innovation, complex systems, sustainable systems, or other traditional or non-traditional degree or certification. The faculty or instructors for these programs may be from academia or from industry, government, or other areas of practice or industrial research or professional development. The expert instructors must have distinguished themselves, in some way such as: (a) publications in service/systems/innovation research literature, (b) positions of responsibility and distinction in business, government or practice, or (c) some other form of professional success.

For “service/systems/innovation degree or certification” as defined above…

  1. Degrees/Certifications: For your institution, please indicate which of these are granted:
    (a) PhD degree
    (b) Masters degree
    (c) Bachelor degree
    (d) Associate degree
    (e) Certificate of Completion, Certification, or Badge
    (f) Other
  2. Student Options: For each of the above selected, indicate student options…
    (a) PhD degree – full-time, night-school (working professional), remote students, online
    (b) Masters degree – full-time, night-school (working professional), remote students, online
    (c) Bachelor degree – full-time, night-school (working professional), remote students, online
    (d) Associate degree – full-time, night-school (working professional), remote students, online
    (e) Certificate of Completion, Certification, or Badge – full-time, night-school (working professional), remote students, online
    (f) Other – full-time, night-school (working professional), remote students, online
  3. Time, Cost: Please indicate estimated time to complete and cost
    (a) PhD degree- time (years), cost ($)
    (b) Masters degree- time (years), cost ($)
    (c) Bachelor degree- time (years), cost ($)
    (d) Associate degree — time (years), cost ($)
    (e) Certificate of Completion, Certification, or Badge – time (months, weeks, days, hours, as appropriate), cost ($)
    (f) Other – time (years, months, weeks, days, hours, as appropriate), cost ($)
  4. Center: For your institution, do you have an existing center of excellence with service/systems/innovation related research, and if so is there a URL?
    (a) Yes, name:
    (b) Yes, URL:
    (c) No
    (d) Other
  5. Textbook: Do you recommend a specific textbook or book for students preparing to get a degree/certification, and if so is there a URL?
    (a) Yes, name:
    (b) Yes, URL:
    (c) No
    (d) Other
  6. Workshops/Conferences: Do you recommend a specific workshop or conference that students/professionals can attend to get a certificate of completion, certification, badge, etc., and if so is there a URL?
    (a) Yes, name:
    (b) Yes, URL:
    (c) No
    (d) Other
  7. Can you recommend institutions or people to contact to survey?
    (a) Suggestion 1:
    (b) Suggestion 2:
    (c) Suggestion 3:
    (d) Other
  8. Are you an ISSIP member? If so, would you like to see an annual ISSIP Guide to Service/Systems/Innovations Degrees and Certifications?
    (a) Yes member, Yes like to see annual guide
    (b) Yes member, No need for annual guide
    (c) No, not a member
    (d) Other
  9. Any final thoughts, or comments?
    (a) Comments:

From Handbook of Service Science diagram:


Page 706, Handbook of Service Science (2010)

From Cambridge SSME Report:

Pages 23-24 from Cambridge SSME report (2008)

Example of a well-designed anthropology online Masters from University of North Texas: http://anthropology.unt.edu/graduate/online-masters-program

Others can be found here: https://www.guidetoonlineschools.com/degrees

SERVSIG lists these: http://www.servsig.org/wordpress/teaching/services-marketing-syllabi/

The New Foundational Skills of the Digital Economy & Universities Respond

Universities are responding to the need for all graduates to have foundational skills for a data-driven, AI-powered, digital economy. These new university programs will create graduates with depth in traditional disciplines, as well as broader boundary spanning skills – resulting in T-shapes. Over time, our data will become our AI helper.

Two Skills Reports

Two skills reports are especially relevant to the breadth and depth of skills of T-shaped Adaptive Innovators, from BHEF and NESTA:

BHEF (2018): Markow W, Hughes D, Bundy A (2018) The New Foundational Skills of the Digital Economy: Professionals of the Future. Burning Glass and Business Higher Education Forum Report (BHEF). URL: http://www.bhef.com/sites/default/files/BHEF_2018_New_Foundational_Skills.pdf

“Modern jobs integrate an array of broadly demanded skills. These are not the specialized skills of the engineer or the physicist, working with advanced mathematical models, so much as they are those of the analyzer of complex bodies of data, the software programmer, the project manager, and the critical thinker.”

Oddly worded, since engineers and physicists are typically critical thinkers who know how to analyze complex bodies of data. That said, software programming (especially in Python) and project management (especially Agile methods with scrums) are not always taught to engineers and physicists at the bachelor’s level.

NESTA (2017): Bakhshi H, Downing J, Osborne M, Schneider P (2017) The Future of Skills: Employment in 2030. London: Pearson and Nesta. URL: https://www.nesta.org.uk/report/the-future-of-skills-employment-in-2030/

  • Around one-tenth of the workforce are in occupations that are likely to grow as a percentage of the workforce, and around one-fifth are in occupations that will likely shrink.
  • Education, healthcare, and wider public sector occupations are likely to grow while some low-skilled jobs, in fields like construction and agriculture, are less likely to suffer poor labor market outcomes than has been assumed in the past.
  • The report highlights the skills that are likely to be in greater demand in the future, which include interpersonal skills, higher-order cognitive skills, and systems skills.
  • We also identify how the skills make up of different occupations can be altered to improve the odds that they will be in higher demand in the future.
  • The future workforce will need broad-based knowledge in addition to the more specialised skills that are needed for specific occupations.

The last bullet point in the above NESTA report is especially relevant to T-Shaped Adaptive Innovators with breadth (“broad-based knowledge”) and depth (“more specialised skills”).

Systems thinking and collaborative problem-solving are also characteristics of T-Shaped Adaptive Innovators:

  • “Interestingly, systems skills, relatively underexplored in the literature, all feature in the top 10. Systems thinking emphasises the ability to recognise and understand socio-technical systems – their interconnections and feedback effects – and choose appropriate actions in light of them. It marks a shift from more reductionist and mechanistic forms of analysis and lends itself to pedagogical approaches such as game design and case method, with evidence that it can contribute to interdisciplinary learning (Tekinbas et al., 2014; Capra and Luisi, 2014; Arnold and Wade, 2015).
  • The combined importance of these skills and interpersonal skills supports the view that the demand for collaborative problem-solving skills may experience higher growth in the future (Nesta, 2017).”

Four Universities Respond

The importance of data sciences and artificial intelligence to all disciplines, occupations, and yes, even cultures (values), is becoming increasingly apparent to universities, so they are starting AI colleges, sub-universities, and centers to explore AI’s impact across the board and/or to manage complex systems from a transdisciplinary perspective.

For example, consider MIT, Berkeley, Stanford, UC Merced.

MIT (Oct. 15, 2018), see: https://www.nytimes.com/2018/10/15/technology/mit-college-artificial-intelligence.html

The goal of the college, said L. Rafael Reif, the president of M.I.T., is to “educate the bilinguals of the future.” He defines bilinguals as people in fields like biology, chemistry, politics, history and linguistics who are also skilled in the techniques of modern computing that can be applied to them.

But, he said, “to educate bilinguals, we have to create a new structure.”

Academic departments still tend to be silos, Mr. Reif explained, despite interdisciplinary programs that cross the departmental boundaries. Half the 50 faculty positions will focus on advancing computer science, and the other half will be jointly appointed by the college and by other departments across M.I.T.

Traditionally, departments hold sway in hiring and tenure decisions at universities. So, for example, a researcher who applied A.I.-based text analysis tools in a field like history might be regarded as too much a computer scientist by the humanities department and not sufficiently technical by the computer science department.

Berkeley (Nov 2, 2018), see: https://www.insidehighered.com/news/2018/11/02/big-data-ai-prompt-major-expansions-uc-berkeley-and-mit

Berkeley provost Paul Alivisatos said that simply expanding the university’s existing computer sciences department would not be enough to match the surge of interest.

“Pretty much any field of inquiry and knowledge connects to [data science],” he said. “We wanted to create a structure that would allow that new methodological development to grow more, but also allow it to be widely used everywhere, where it can be beneficial.”

He said Berkeley envisions incorporating faculty members from fields as varied as sociology, public health and physics into a kind of “data science commons” to deepen their research. “From what we can tell, pretty much every part of this university wants to be involved, which is great.”

The field, Alivisatos said, is forcing other disciplines to come to terms not just with the widespread availability of data from diverse sources, but with “new methods that allow it to be sifted and analyzed.”

David Culler, Berkeley’s interim dean for data sciences, said the new division will be a peer of the university’s other schools and colleges. “But rather than standing apart from them, it’s really integrated with them,” he said, since these days, data science “touches almost every domain of inquiry.”

Culler said Berkeley, like most major universities, has been “grappling with this for at least five years” as it tried to figure out how to fit new computational disciplines into the broader world of other academic fields.

“The frontiers of knowledge are extremely integrative, and yet to a large extent, institutions of higher learning are very hierarchical,” he said.

Stanford (Mar 15, 2019), see: https://www.mercurynews.com/2019/03/15/stanford-unveils-new-ai-institute-built-to-create-a-better-future-for-all-humanity/amp/

“The scope and scale of impact of the Age of AI will be more profound than any other period of transformation in our history,” Li and co-director John Etchemendy said in an online note about the new institute. “AI has the potential to radically transform every industry and every society.”

The institute will take advantage of Stanford’s strength in a variety of disciplines, including AI, computer science, engineering, robotics, business, economics, genomics, law, literature, medicine, neuroscience and philosophy, according to promotional materials.

“Our goal is for Stanford HAI to become an interdisciplinary, global hub for AI thinkers, learners, researchers, developers, builders and users from academia, government and industry, as well as leaders and policymakers who want to understand and leverage AI’s impact and potential,” the institute said.

UC Merced (Dec 12, 2018), which adds complex systems thinking, see: https://news.ucmerced.edu/news/2018/uc-merced-designing-management-school-future

The planning initiative is a faculty-led effort to create a new, transdisciplinary school that draws upon the expertise of scientists, researchers and practitioners from broad backgrounds to instill the next generations of leaders with the skills and knowledge needed to understand, design and manage complex systems.

The process will take several years, but Professor Paul Maglio, recently named director of the Gallo School Planning Initiative, said it’s time to look to the future and the next big development at UC Merced.

“We think the time is right to establish a new Gallo school at UC Merced to carry forward the interdisciplinary mission and vision of the campus and that relates broadly to management, decision making, information, communication and sustainability, and embraces the complexities of real interactions between people, institutions, technologies and the natural world,” Maglio said.

Brian Fitzgerald (BHEF) just sent me this, with more universities responding in the DC area: Cardenas-Navia I, Fitzgerald BK (2019) The digital dilemma: Winning and losing strategies in the digital talent race. Industry and Higher Education. 2019 Mar 25:0950422219836669. This was very interesting: in the study cited (Kim, 2018), 60% of acquired employees left within three years, double the rate of direct hires. The study also found that acquired employees were more likely to found their own companies, many of which later appeared to compete against the acquiring company. In Figure 1, the blended professional's domain knowledge looks like academic disciplines. Industry knowledge, for example healthcare, retail, finance, etc., is what IBM would be looking for.

My Advice

My advice to students and lifelong learners of all ages:

Skills: Build: Data science and Python programming for AI, to build next-generation learning systems.
Skills: Teach: Learning sciences for social-emotional learning (SEL) skills.
Skills: Collaborate/Lead: Agile scrum mastery and positive leadership.
Skills: Understand: Complex systems: smarter/wiser service systems, service science, and a service-dominant logic mindset.
Skills: Memberships: GitHub, Kaggle, Wikipedia, ISSIP.org; become active in these communities of builders, teachers, and collaborators.
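To make the first list item concrete, here is a minimal sketch of what "building a learning system" means in Python: a perceptron trained on a made-up, linearly separable toy dataset. The function names and data are illustrative assumptions, not any particular curriculum or IBM system.

```python
# Minimal sketch of "learning from data": a perceptron classifier.
# All names and the toy dataset below are illustrative, not from any course.

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Learn weights w and bias b so that sign(w.x + b) matches the labels."""
    n = len(samples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:  # update weights only on mistakes
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Toy data: points above the line x2 = x1 are labeled +1, below are -1.
X = [(0.0, 1.0), (1.0, 2.0), (1.0, 0.0), (2.0, 1.0)]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
print([predict(w, b, xi) for xi in X])  # recovers the training labels
```

Real learning systems use richer models and libraries, but the loop is the same: predict, compare to the label, adjust.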

Working to become a T-Shaped Adaptive Innovator and learning about “service science” may also be helpful to those interested in working at IBM some day. IBM is always looking for high-integrity individuals who are global citizens interested in building a smarter/wiser planet. Collaborating across industries, disciplines, and cultures is hard, hard work, so it is not for everyone. See https://service-science.info/archives/3328

Acknowledgements

Thanks to Steve Kwan (SJSU Emeritus) for suggesting the BHEF and Stanford reports, and Christine Leitner for suggesting the NESTA Report.

Annotated Bibliography Item: Language Models are Unsupervised Multitask Learners

FYI: Annotated Bibliography Item: Language Models are Unsupervised Multitask Learners

Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2019) Language Models are Unsupervised Multitask Learners. URL: https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf

Good summary of the above here: https://towardsdatascience.com/one-language-model-to-rule-them-all-26f802c90660

“Abstract. Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText… These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.” (p. 1);

“1. Introduction. Current systems are better characterized as narrow experts rather than competent generalists. We would like to move towards more general systems which can perform many tasks – eventually without the need to manually create and label a training dataset for each one. The dominant approach to creating ML systems is to collect a dataset of training examples demonstrating correct behavior for a desired task, train a system to imitate these behaviors, and then test its performance on independent and identically distributed (IID) held-out examples… Multitask learning (Caruana, 1997) is a promising framework for improving general performance. However, multitask training in NLP is still nascent… This suggests that multitask training may need just as many effective training pairs to realize its promise with current approaches. It will be very difficult to continue to scale the creation of datasets and the design of objectives to the degree that may be required to brute force our way there with current techniques. This motivates exploring additional setups for performing multitask learning.” (p. 1);

“2. Approach. Learning to perform a single task can be expressed in a probabilistic framework as estimating a conditional distribution p(output|input). Since a general system should be able to perform many different tasks, even for the same input, it should condition not only on the input but also on the task to be performed. That is, it should model p(output|input, task). This has been variously formalized in multitask and meta-learning settings. Task conditioning is often implemented at an architectural level, such as the task specific encoders and decoders in (Kaiser et al., 2017) or at an algorithmic level such as the inner and outer loop optimization framework of MAML (Finn et al., 2017). But as exemplified in McCann et al. (2018), language provides a flexible way to specify tasks, inputs, and outputs all as a sequence of symbols. For example, a translation training example can be written as the sequence (translate to french, english text, french text). Likewise, a reading comprehension training example can be written as (answer the question, document, question, answer).” (p. 2).
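The passage's key idea, that a task, its input, and its output can all be written as one symbol sequence, can be sketched in a few lines. The helper and the example strings below are illustrative assumptions, not code from the paper.

```python
# Sketch of "tasks as sequences of symbols" (in the style of McCann et al., 2018).
# The formatting convention below is invented for illustration.

def as_sequence(task, *fields):
    """Serialize a (task, input, output) training example as one string."""
    return " ".join((task,) + fields)

# A translation example written as (translate to french, english text, french text):
translation = as_sequence("translate to french:", "the cat sat", "=", "le chat s'est assis")

# A reading-comprehension example written as (answer the question, document, question, answer):
qa = as_sequence("answer the question:",
                 "Doc: Water boils at 100 C.",
                 "Q: When does water boil?",
                 "A: At 100 C.")

print(translation)
print(qa)
```

Once every task looks like plain text, a single language model trained to predict the next symbol can, in principle, absorb all of them without task-specific architecture.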

“Language modeling is also able to, in principle, learn the tasks of McCann et al. (2018) without the need for explicit supervision of which symbols are the outputs to be predicted. Since the supervised objective is the same as the unsupervised objective but only evaluated on a subset of the sequence, the global minimum of the unsupervised objective is also the global minimum of the supervised objective. In this slightly toy setting, the concerns with density estimation as a principled training objective discussed in (Sutskever et al., 2015) are side stepped. The problem instead becomes whether we are able to, in practice, optimize the unsupervised objective to convergence. Preliminary experiments confirmed that sufficiently large language models are able to perform multitask learning in this toy-ish setup but learning is much slower than in explicitly supervised approaches.” (p. 2).

“While it is a large step from the well-posed setup described above to the messiness of “language in the wild”, Weston (2016) argues, in the context of dialog, for the need to develop systems capable of learning from natural language directly and demonstrated a proof of concept – learning a QA task without a reward signal by using forward prediction of a teacher’s outputs. While dialog is an attractive approach, we worry it is overly restrictive. The internet contains a vast amount of information that is passively available without the need for interactive communication. Our speculation is that a language model with sufficient capacity will begin to learn to infer and perform the tasks demonstrated in natural language sequences in order to better predict them, regardless of their method of procurement. If a language model is able to do this it will be, in effect, performing unsupervised multitask learning. We test whether this is the case by analyzing the performance of language models in a zero-shot setting on a wide variety of tasks.” (p. 2);

“2.1. Training Dataset. Most prior work trained language models on a single domain of text, such as news articles (Jozefowicz et al., 2016), Wikipedia (Merity et al., 2016), or fiction books (Kiros et al., 2015). Our approach motivates building as large and diverse a dataset as possible in order to collect natural language demonstrations of tasks in as varied of domains and contexts as possible.” (p. 3);

“Instead, we created a new web scrape which emphasizes document quality. To do this we only scraped web pages which have been curated/filtered by humans. Manually filtering a full web scrape would be exceptionally expensive so as a starting point, we scraped all outbound links from Reddit, a social media platform, which received at least 3 karma. This can be thought of as a heuristic indicator for whether other users found the link interesting, educational, or just funny.” (p. 3);
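The karma heuristic described above amounts to a simple threshold filter over outbound links. This toy sketch uses invented post data.

```python
# Toy version of the WebText curation heuristic: keep only outbound links
# whose Reddit post received at least 3 karma. The posts below are made up.

MIN_KARMA = 3

posts = [
    ("https://example.org/a", 5),  # kept
    ("https://example.org/b", 1),  # dropped
    ("https://example.org/c", 3),  # kept: the threshold is "at least 3"
]

kept = [url for url, karma in posts if karma >= MIN_KARMA]
print(kept)
```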

“The resulting dataset, WebText, contains the text subset of these 45 million links. To extract the text from HTML responses we use a combination of the Dragnet (Peters & Lecocq, 2013) and Newspaper content extractors. All results presented in this paper use a preliminary version of WebText which does not include links created after Dec 2017 and which after de-duplication and some heuristic based cleaning contains slightly over 8 million documents for a total of 40 GB of text. We removed all Wikipedia documents from WebText since it is a common data source for other datasets and could complicate analysis due to overlapping training data with test evaluation tasks.” (p. 3-4);

“2.2 Input Representation. A general language model (LM) should be able to compute the probability of (and also generate) any string. Current large scale LMs include pre-processing steps such as lower-casing, tokenization, and out-of-vocabulary tokens which restrict the space of model-able strings… Byte Pair Encoding (BPE) (Sennrich et al., 2015) is a practical middle ground between character and word level language modeling which effectively interpolates between word level inputs for frequent symbol sequences and character level inputs for infrequent symbol sequences. Despite its name, reference BPE implementations often operate on Unicode code points and not byte sequences… To avoid this, we prevent BPE from merging across character categories for any byte sequence. We add an exception for spaces which significantly improves the compression efficiency while adding only minimal fragmentation of words across multiple vocab tokens. This input representation allows us to combine the empirical benefits of word-level LMs with the generality of byte-level approaches. Since our approach can assign a probability to any Unicode string, this allows us to evaluate our LMs on any dataset regardless of pre-processing, tokenization, or vocab size.” (p. 4);
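The core of Byte Pair Encoding mentioned above is a loop that repeatedly merges the most frequent adjacent pair of symbols. The sketch below shows only that merge loop on a toy string; it omits GPT-2's byte-level handling and the character-category merge restriction the quote describes.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent pair of symbols, or None if too short."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(tokens, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = list("low low lower lowest")  # start from individual characters
for _ in range(4):                     # a few merge steps for illustration
    tokens = merge_pair(tokens, most_frequent_pair(tokens))
print(tokens)  # frequent substrings like "low" become single vocabulary symbols
```

Run long enough on a large corpus, this yields a vocabulary that spends single symbols on common words and falls back to characters (or bytes) for rare strings.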

“2.3. Model. We use a Transformer (Vaswani et al., 2017) based architecture for our LMs. The model largely follows the details of the OpenAI GPT model (Radford et al., 2018) with a few modifications.” (p. 4);

“3. Experiments. We trained and benchmarked four LMs with approximately log-uniformly spaced sizes. The architectures are summarized in Table 2. The smallest model is equivalent to the original GPT, and the second smallest equivalent to the largest model from BERT (Devlin et al., 2018).” (p. 4);

“3.1. Language Modeling. As an initial step towards zero-shot task transfer, we are interested in understanding how WebText LMs perform at zero-shot domain transfer on the primary task they are trained for – language modeling. Since our model operates on a byte level and does not require lossy pre-processing or tokenization, we can evaluate it on any language model benchmark.” (p. 4);

“For many of these datasets, WebText LMs would be tested significantly out-of-distribution, having to predict aggressively standardized text, tokenization artifacts such as disconnected punctuation and contractions, shuffled sentences, and even the string <UNK> which is extremely rare in WebText – occurring only 26 times in 40 billion bytes. We report our main results in Table 3 using invertible de-tokenizers which remove as many of these tokenization / pre-processing artifacts as possible.” (p. 4-5);

“[3.2 – 3.8 Tasks.] 3.2. Children’s Book Test. The Children’s Book Test (CBT) (Hill et al., 2015) was created to examine the performance of LMs on different categories of words: named entities, nouns, verbs, and prepositions… 3.3. LAMBADA. The LAMBADA dataset (Paperno et al., 2016) tests the ability of systems to model long-range dependencies in text… 3.4. Winograd Schema Challenge. The Winograd Schema challenge (Levesque et al., 2012) was constructed to measure the capability of a system to perform commonsense reasoning by measuring its ability to resolve ambiguities in text… 3.5. Reading Comprehension. The Conversation Question Answering dataset (CoQA) Reddy et al. (2018) consists of documents from 7 different domains paired with natural language dialogues between a question asker and a question answerer about the document… 3.6. Summarization. We test GPT-2’s ability to perform summarization on the CNN and Daily Mail dataset (Nallapati et al., 2016). To induce summarization behavior we add the text TL;DR: after the article and generate 100 tokens with Top-k random sampling (Fan et al., 2018) with k = 2 which reduces repetition and encourages more abstractive summaries than greedy decoding. 3.7. Translation. We test whether GPT-2 has begun to learn how to translate from one language to another. In order to help it infer that this is the desired task, we condition the language model on a context of example pairs of the format english sentence = french sentence and then after a final prompt of english sentence = we sample from the model with greedy decoding and use the first generated sentence as the translation. 3.8. Question Answering. A potential way to test what information is contained within a language model is to evaluate how often it generates the correct answer to factoid-style questions.” (p. 5-7).
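The “Top-k random sampling … with k = 2” used for summarization above can be sketched as follows. For simplicity this toy version samples in proportion to already-positive scores rather than applying a softmax to real logits, and the vocabulary is invented.

```python
import random

def top_k_sample(scores_by_token, k=2, rng=random):
    """Keep only the k highest-scoring tokens, then sample among them
    in proportion to their (assumed positive) scores."""
    top = sorted(scores_by_token.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(score for _, score in top)
    r = rng.uniform(0, total)
    for token, score in top:
        r -= score
        if r <= 0:
            return token
    return top[-1][0]

random.seed(0)
scores = {"the": 5.0, "a": 3.0, "igloo": 0.5, "floe": 0.1}
picks = [top_k_sample(scores, k=2) for _ in range(10)]
print(picks)  # with k=2, only "the" or "a" can ever be sampled
```

Restricting each step to the k most likely candidates keeps generation close to high-probability text while, as the paper notes, avoiding the repetition of pure greedy decoding.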

“4. Generalization vs Memorization. Recent work in computer vision has shown that common image datasets contain a non-trivial amount of near-duplicate images.” (p. 8).

“5. Related Work. A significant portion of this work measured the performance of larger language models trained on larger datasets.” (p. 8).

“6. Discussion. Much research has been dedicated to learning (Hill et al., 2016), understanding (Levy & Goldberg, 2014), and critically evaluating (Wieting & Kiela, 2019) the representations of both supervised and unsupervised pre-training methods. Our results suggest that unsupervised task learning is an additional promising area of research to explore.” (p. 9);

“7. Conclusion. When a large language model is trained on a sufficiently large and diverse dataset it is able to perform well across many domains and datasets. GPT-2 zero-shots to state of the art performance on 7 out of 8 tested language modeling datasets. The diversity of tasks the model is able to perform in a zero-shot setting suggests that high-capacity models trained to maximize the likelihood of a sufficiently varied text corpus begin to learn how to perform a surprising amount of tasks without the need for explicit supervision.” (p. 10);

References include one of my favorites from IBM Research (circa 1980, based on a lot of work in the 1970s): Jelinek F, Mercer RL (1980) Interpolated estimation of Markov source parameters from sparse data. In: Proceedings of the Workshop on Pattern Recognition in Practice, Amsterdam, The Netherlands: North-Holland, May 1980.

Books for those redesigning education

Books for those redesigning education:


Kline SJ (1995) Conceptual foundations for multidisciplinary thinking. Stanford University Press. URL https://www.amazon.com/Conceptual-Foundations-Multidisciplinary-Thinking-Stephen/dp/0804724091/

Dartnell L (2014) The knowledge: How to rebuild our world from scratch. Random House. URL: https://www.amazon.com/Knowledge-How-Rebuild-World-Scratch/dp/159420523X/

Open Source Ecology: https://wiki.opensourceecology.org/wiki/Book#Disclaimer
Open Source Ecology introduction, key points:
This book treats the perennial question of making a better world, and is an applied experiment inviting you to the journey.
Artificial scarcity: ending it.
Evolving as humans: the core of the message is that we will not move forward with technology alone, but by gaining in meaningful, full lives. This takes wisdom. Technology can help, but current technology will not do; it must be appropriate. The GVCS is designed to fill this gap: the prerequisite for a civilization advancing in its human potential beyond artificial scarcity.

Book for CEOs to learn AI:
Domingos P (2015) The master algorithm: How the quest for the ultimate learning machine will remake our world. Basic Books. URL: https://www.amazon.com/Master-Algorithm-Ultimate-Learning-Machine/dp/0465094279

Presentations:
Spohrer (IBM) – Artificial Intelligence: https://www.slideshare.net/spohrer/ypo-20190131-v1
Ezell (IFTF) – Silicon Valley: http://www2.itif.org/2015-innovation-ecosystem-success.pdf

More reading recommendations:
https://service-science.info/archives/4416

Service System Evolution in the Era of AI

Tom Malone’s perspective in “Superminds” seems the most well thought out to me – it is a very service-science-oriented perspective, since organizations were the first superminds. See: https://sloanreview.mit.edu/article/how-human-computer-superminds-are-redefining-the-future-of-work

Malone TW (2018) How Human-Computer ‘Superminds’ Are Redefining the Future of Work. MIT Sloan Management Review. 2018 Jul 1;59(4):34-41.

A few thoughts….

(1) For some reason, I prefer the word “people” to “humans” as a noun. “Human” as an adjective is fine – as in Human Factors or the “Human Side of Service Engineering” – but as a noun, “people” seems better to me.

(2) Service systems and cognitive mediators can be defined/introduced as “dynamic configurations of resources (people, technology, organizations, and information) connected internally and externally by value propositions to other service system entities. Every service system entity has a focal decision-making authority that is a person. For example, even in a business there is the CEO, and for a nation the President.”

The progression from tool to assistant to collaborator to coach to mediator in AI systems is the progression of both capabilities (models) and trust (earned). For example, whom do you trust today to make certain decisions on your behalf? Your doctor, your lawyer, your spouse? Someday people will trust their cognitive mediators to make certain types of decisions on their behalf.

By the way, in governments with a president, a congress, and a supreme court, you can see the dimension of time in decision-making outcomes: the president is for fast decisions; the congress allows more debate for longer-term decisions about what laws to make and what to invest in; and the supreme court is for really long-term decisions that reflect reaffirmations of, or changes in, the cultural values of a society. Governments are cognitive systems with hidden, partially visible, and explicit/recorded case-based decision-making as cognitive forms, supported by organizational structures. Governments are also service systems, because they have rights and responsibilities both to their citizens and to the other governmental entities with which they interact.

The evolution of cognitive system entities and service system entities is intertwined, and Haluk Demirkan and I have proposed studying them from a service science perspective in terms of the AEIOU framework (Abstract-Entity-Interaction-Outcome-Universals). The speed of decision-making between different types of entities (people, AI) is something to watch as new configurations of resources unfold into new composite types of cognitive system entities and service system entities.

(3) All service system entities are cognitive system entities, but not all cognitive system entities are service system entities. The difference boils down to “rights and responsibilities,” which relate to societal norms and individual accountability (the ability level of local cognitive resources). A service system entity is a cognitive system entity with rights and responsibilities; not all cognitive system entities have them. Children, the elderly, animals, and yes, even some early AI systems may have some cognitive abilities, but those abilities have to reach a certain level in a large enough population of such entities before the societal or business decision is made to give them rights and responsibilities. When an entity is given rights and responsibilities, it can be held accountable for its actions, rewarded or punished. Therefore, increased cognitive ability leads to increased accountability, which can lead to rights and responsibilities, with which an entity can become the focal decision-making resource/authority in a service system entity.

(4) To understand the impact of organizations (first) and later AI (second) on the evolution of the service system ecology, one has to understand localized and distributed service systems. Organizations shifted expertise from people into distributed organizations (this required specialization, increasing the concentration of expertise in individuals while simultaneously increasing the diversity of types of expertise in the ecology). Enter AI, which simultaneously reverses and amplifies this trend/evolutionary force. AI has the ability to concentrate general expertise into local entities, like smartphones. So while organizations created service supply chains of expertise flows between specialized entities, AI allows the re-concentration of general expertise in a local form (effectively reversing the need for some types of organizations). Imagine, for example, a family farm with service robots that know how to repair themselves. The evolution of service systems has gone from local to global, and in the era of AI, there will be a double re-invention of both local systems and global systems with AI.

(5) So in summary, as Tom Malone suggests in “Superminds,” cities, businesses, and other types of organizations of people were the first superminds. From a service science perspective, the types of service systems with many people – families, tribes, cities, etc. – were the first superminds. Now we are entering the era of AI, and AI systems (entities) can someday, in the decades ahead, become superminds as well. This represents the miniaturization of superminds, so they can at once become local again as well as continue to grow in a distributed, global form.

(6) So “service systems and innovations for business and society” is going into a new evolutionary mode powered by AI technology innovations.

Young Presidents Organization

Delighted by a visit from a dozen energetic members of YPO yesterday at IBM’s Silicon Valley Lab briefing center. YPO is “the premier leadership organization of chief executives in the world” – and I must say they are one of the most inquisitive groups that I have ever presented to: https://www.ypo.org/about-ypo/ – my summary of a few of their questions and my responses here:

Q: Who will make money in a world of advanced AI capabilities? Those who use it wisely. Our data is becoming our AI.
Three Laws of Robo-Economics
https://www.emeraldinsight.com/doi/pdfplus/10.1108/JPEO-04-2018-0015
My summary here: https://service-science.info/archives/5021
Summary of open source and AI at IBM: https://developer.ibm.com/blogs/2018/12/12/open-source-ibm-and-ai/

Q: History of IBM? History (and future) of IBM in AI? A long journey of innovations that matter to business and society.
IBM History Video (shown at beginning)
https://www.youtube.com/watch?v=-eWxUWJgfzk
Future of AI (my presentation to the group): https://www.slideshare.net/spohrer/ypo-20190131-v1

Q: How will we trust our AI? Working together in the open.
Partnership on AI (ensuring fair and trusted AI)
https://www.partnershiponai.org/
IBM’s AI Fairness 360 software on GitHub (open source): https://github.com/IBM/AIF360
Also, note Linux Foundation Deep Learning landscape: https://github.com/LFDLFoundation/lfdl-landscape

Q: If the future of AI has a large open source component, how will IBM make money? Two models.
How RedHat makes money ($3B annually, >10% CAGR) with an open source product
IBM intent to acquire: https://www.redhat.com/en/blog/monumental-day-open-source-and-red-hat
RedHat’s model to make money (contribute and value-add subscriptions): https://www.techrepublic.com/article/heres-red-hats-open-secret-on-how-to-make-3b-selling-free-stuff/
The alternative model to make money with open source (Amazon – run it in public cloud): https://stratechery.com/2019/aws-mongodb-and-the-economic-realities-of-open-source/

Q: Are patents still important in a world of open source? You bet.
IBM patents (#1 in world for 26 years in a row)
https://www.ibm.com/blogs/research/2019/01/2018-patent/
AI-related patents: https://www.wipo.int/pressroom/en/articles/2019/article_0001.html
IBM Research summary tweet: https://twitter.com/IBMResearch/status/1090987118728491008

Last but not least, since I just turned 63, I have increasingly noticed that I am the oldest person in the room when I am speaking to groups these days. But the dozen members of the YPO were all substantially older than I am, full of energy and inquisitive, and this put a big smile on my face – it was a wonderful, fun visit we all had together. Only Marc Boegner (IBM) and Sean, the YPO guide leader, were younger than me. What a pleasant surprise!

Building machines that learn and think like people

FYI: Building machines that learn and think like people

Lake BM, Ullman TD, Tenenbaum JB, Gershman SJ (2017) Building machines that learn and think like people. Behavioral and Brain Sciences. 40:1-70. URL: https://cims.nyu.edu/~brenden/LakeEtAl2017BBS.pdf

“The difference between pattern recognition and model building, between prediction and explanation, is central to our view of human intelligence. Just as scientists seek to explain nature, not simply predict it, we see human thought as fundamentally a model building activity.” (p. 2).

“The central goal of this article is to propose a set of core ingredients for building more human-like learning and thinking machines. We elaborate on each of these ingredients and topics in Section 4, but here we briefly overview the key ideas. The first set of ingredients focuses on developmental “start-up software,” or cognitive capabilities present early in development. … We focus on two pieces of developmental start-up software (see Wellman & Gelman [1992] for a review of both). First is intuitive physics (sect. 4.1.1): Infants have primitive object concepts that allow them to track objects over time and to discount physically implausible trajectories. … A second type of software present in early development is intuitive psychology (sect. 4.1.2): Infants understand that other people have mental states like goals and beliefs, and this understanding strongly constrains their learning and predictions. … Our second set of ingredients focus on learning. Although there are many perspectives on learning, we see model building as the hallmark of human-level learning, or explaining observed data through the construction of causal models of the world (sect. 4.2.2). From this perspective, the early-present capacities for intuitive physics and psychology are also causal models of the world. A primary job of learning is to extend and enrich these models and to build analogous causally structured theories of other domains. Compared with state-of-the-art algorithms in machine learning, human learning is distinguished by its richness and its efficiency. Children come with the ability and the desire to uncover the underlying causes of sparsely observed events and to use that knowledge to go far beyond the paucity of the data. It might seem paradoxical that people are capable of learning these richly structured models from very limited amounts of experience. We suggest that compositionality and learning-to-learn are ingredients that make this type of rapid model learning possible (sects. 4.2.1 and 4.2.3, respectively). A final set of ingredients concerns how the rich models our minds build are put into action, in real time (sect. 4.3). It is remarkable how fast we are to perceive and to act.” (p. 4).

“Here we present two challenge problems for machine learning and AI: learning simple visual concepts (Lake et al. 2015a ) and learning to play the Atari game Frostbite (Mnih et al. 2015).” (p. 5).

“Figure 1. The Characters Challenge: Human-level learning of novel handwritten characters (A), with the same abilities also illustrated for a novel two-wheeled vehicle (B). A single example of a new visual concept (red box) can be enough information to support the (i) classification of new examples, (ii) generation of new examples, (iii) parsing an object into parts and relations, and (iv) generation of new concepts from related concepts. Adapted from Lake et al. (2015a).” (p. 6).

“In Frostbite, players control an agent (Frostbite Bailey) tasked with constructing an igloo within a time limit. The igloo is built piece by piece as the agent jumps on ice floes in water (Fig. 2A–C). … The Frostbite example is a particularly telling contrast when compared with human play. Even the best deep networks learn gradually over many thousands of game episodes, take a long time to reach good performance, and are locked into particular input and goal patterns. Humans, after playing just a small number of games over a span of minutes, can understand the game and its goals well enough to perform better than deep networks do after almost a thousand hours of experience. Even more impressively, people understand enough to invent or accept new goals, generalize over changes to the input, and explain the game to others. Why are people different? ” (p. 7-9).

“4. Core ingredients of human intelligence: In the Introduction, we laid out what we see as core ingredients of intelligence. Here we consider the ingredients in detail and contrast them with the current state of neural network modeling. Although these are hardly the only ingredients needed for human-like learning and thought (see our discussion of language in sect. 5), they are key building blocks, which are not present in most current learning-based AI systems – certainly not all present together – and for which additional attention may prove especially fruitful. We believe that integrating them will produce significantly more powerful and more human-like learning and thinking abilities than we currently see in AI systems. Before considering each ingredient in detail, it is important to clarify that by “core ingredient” we do not necessarily mean an ingredient that is innately specified by genetics or must be “built in” to any learning algorithm.” (p. 9).

“We have focused on how cognitive science can motivate and guide efforts to engineer human-like AI, in contrast to some advocates of deep neural networks who cite neuroscience for inspiration. Our approach is guided by a pragmatic view that the clearest path to a computational formalization of human intelligence comes from understanding the “software” before the “hardware.” In the case of this article, we proposed key ingredients of this software in previous sections. Nonetheless, a cognitive approach to intelligence should not ignore what we know about the brain. Neuroscience can provide valuable inspirations for both cognitive models and AI researchers: The centrality of neural networks and model-free reinforcement learning in our proposals for “thinking fast” (sect. 4.3) are prime exemplars.” (p. 20).

“We believe that understanding language and its role in intelligence goes hand-in-hand with understanding the building blocks discussed in this article. It is also true that language builds on the core abilities for intuitive physics, intuitive psychology, and rapid learning with compositional, causal models that we focus on. These capacities are in place before children master language, and they provide the building blocks for linguistic meaning and language acquisition (Carey 2009; Jackendoff 2003; Kemp 2007; O’Donnell 2015; Pinker 2007; Xu & Tenenbaum 2007).” (p. 21).

“There has been recent interest in integrating psychological ingredients with deep neural networks, especially selective attention (Bahdanau et al. 2015; Mnih et al. 2014; Xu et al. 2015), augmented working memory (Graves et al. 2014; 2016; Grefenstette et al. 2015; Sukhbaatar et al. 2015; Weston et al. 2015b), and experience replay (McClelland et al. 1995; Mnih et al. 2015).” (p. 22).
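Of the psychological ingredients listed, experience replay is the easiest to sketch in a few lines. The class below is a minimal, assumed interface in the spirit of the replay memory used by Mnih et al. (2015), not their actual implementation: transitions are stored in a bounded buffer and sampled uniformly, which breaks the temporal correlation of online experience before gradient updates.

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal experience-replay buffer (illustrative sketch)."""

    def __init__(self, capacity=10000):
        # deque with maxlen evicts the oldest transition automatically.
        self.buf = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buf.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling decorrelates consecutive experiences.
        return random.sample(self.buf, batch_size)

    def __len__(self):
        return len(self.buf)
```

In a DQN-style loop one would `push` each environment transition and periodically `sample` a minibatch for the learning step; the consolidation story in McClelland et al. (1995) is the cognitive analogue of this replayed rehearsal.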

“One worthy goal would be to build an AI system that beats a world-class player with the amount and kind of training human champions receive, rather than overpowering them with Google-scale computational resources. AlphaGo is initially trained on 28.4 million positions and moves from 160,000 unique games played by human experts; it then improves through reinforcement learning, playing 30 million more games against itself. Between the publication of Silver et al. (2016) and facing world champion Lee Sedol, AlphaGo was iteratively retrained several times in this way. The basic system always learned from 30 million games, but it played against successively stronger versions of itself, effectively learning from 100 million or more games altogether (D. Silver, personal communication, 2017). In contrast, Lee has probably played around 50,000 games in his entire life. Looking at numbers like these, it is impressive that Lee can even compete with AlphaGo. What would it take to build a professional-level Go AI that learns from only 50,000 games?” (p. 23).
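The self-play phase described in this passage can be illustrated on a game small enough to solve exactly. The sketch below is a toy analogue, not AlphaGo's method: tabular Q-learning with a negamax-style target on one-pile Nim (remove 1 or 2 coins per move; taking the last coin wins). A single value table improves by playing games against itself, the same structural idea as AlphaGo's self-play, minus the neural networks and tree search.

```python
import random

def selfplay_nim(n=10, episodes=3000, alpha=0.3, gamma=1.0, eps=0.2):
    """Self-play Q-learning on one-pile Nim (toy AlphaGo analogue).

    q[s][a] is the value, to the player about to move, of removing
    a + 1 coins from a pile of s. Both "players" share this table.
    """
    q = [[0.0, 0.0] for _ in range(n + 1)]
    for _ in range(episodes):
        s = n
        while s > 0:
            legal = [a for a in (0, 1) if a + 1 <= s]
            if random.random() < eps:
                a = random.choice(legal)
            else:
                a = max(legal, key=lambda x: q[s][x])
            s2 = s - (a + 1)
            if s2 == 0:
                target = 1.0  # mover took the last coin and wins
            else:
                # Opponent moves next; their best value is our loss.
                target = -gamma * max(q[s2][b] for b in (0, 1) if b + 1 <= s2)
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q
```

After a few thousand self-play games the table recovers the known strategy (leave the opponent a multiple of 3), which underlines the authors' point: the toy game needs thousands of games, and scaling the same recipe to Go took tens of millions.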

[Open Peer Commentary: The architecture challenge: Future artificial-intelligence systems will require sophisticated architectures, and knowledge of the brain might guide their construction. Gianluca Baldassarre, Vieri Giuliano Santucci, Emilio Cartoni, and Daniele Caligiore] “We agree with the claim of Lake et al. that to obtain human-level learning speed and cognitive flexibility, future artificial-intelligence (AI) systems will have to incorporate key elements of human cognition: from causal models of the world, to intuitive psychological theories, compositionality, and knowledge transfer. However, the authors largely overlook the importance of a major challenge to implementation of the functions they advocate: the need to develop sophisticated architectures to learn, represent, and process the knowledge related to those functions. Here we call this the architecture challenge. In this commentary, we make two claims: (1) tackling the architecture challenge is fundamental to success in developing human-level AI systems; (2) looking at the brain can furnish important insights on how to face the architecture challenge. The difficulty of the architecture challenge stems from the fact that the space of the architectures needed to implement the several functions advocated by Lake et al. is huge.” (pp. 25-26).

[Open Peer Commentary: Thinking like animals or thinking like colleagues? Daniel C. Dennett and Enoch Lambert] “The step up to human-style comprehension carries moral implications that are not mentioned in Lake et al.’s telling. Even the most powerful of existing AIs are intelligent tools, not colleagues, and whereas they can be epistemically authoritative (within limits we need to characterize carefully), and hence will come to be relied on more and more, they should not be granted moral authority or responsibility because they do not have skin in the game: they do not yet have interests, and simulated interests are not enough. We are not saying that an AI could not be created to have genuine interests, but that is down a very long road (Dennett 2017; Hurley et al. 2011). Although some promising current work suggests that genuine human consciousness depends on a fundamental architecture that would require having interests (Deacon 2012; Dennett 2013), long before that day arrives, if it ever does, we will have AIs that can communicate with natural language with their users (not collaborators).” (pp. 34-35).

[Open Peer Commentary: Understand the cogs to understand cognition. Adam H. Marblestone, Greg Wayne, and Konrad P. Kording] “We argue that the study of evolutionarily conserved neural structures will provide a means to identify the brain’s true, fundamental inductive biases and how they actually arise.” (p. 43).

[Open Peer Commentary: Avoiding frostbite: It helps to learn from others. Michael Henry Tessler, Noah D. Goodman, and Michael C. Frank] “Learning from others also does more than simply “speed up” learning about the world. Human knowledge seems to accumulate across generations, hence permitting progeny to learn in one lifetime what no generation before them could learn (Boyd et al., 2011; Tomasello, 1999). We hypothesize that language – and particularly its flexibility to refer to abstract concepts – is key to faithful transmission of knowledge, between individuals and through generations.” (pp. 48-49).