Current challenges from the perspective of Google
2010 Google Faculty Summit: Opening Remarks, same video from Techminer
The challenges are:
Innovation begins with
- commitment to advancing technology
- rich domain of work due to our mission (Google’s mission is to organize the world’s information and make it universally accessible and useful.)
- grand challenge problems
- internal consensus that production issues are often as challenging / fun as pure invention
- technical leverage
- a focus on services
- Google's common distributed system
- empiricism and a holistic approach to design
Blurring of the border between research and engineering
Scale / prodigiousness
- giga: 10^9, tera: 10^12, peta: 10^15, exa: 10^18, zetta: 10^21
- publicized: Bigtable of 70 petabytes, 10M ops/sec
- warehouse computing possibilities? 100 x 10 x 20 x 20 x 40 = 16 000 000 nodes
- some representative numbers
- storage: 10^18 → 10^20-21
- users: 10^9 → 10^10
- devices: 10^? → 10^12 (100 per user)
- network: 10^20, now → 10^21/year, 32KB/sec for 1B people
- apps: 10^5 → 10^6-7 or more
- e.g. embedded car systems: 30-50 ECUs, 100M lines of code
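The figures above can be sanity-checked with a few lines of arithmetic. This is a rough sketch, not part of the talk. The notes do not say what each factor in the node product stands for, so the factors are left unlabeled. The network line is read here as a sustained 32 KB/sec per person, with 1 KB = 1000 bytes and a 365-day year; all three readings are assumptions.

```python
from math import log10, prod

# Warehouse computing product quoted in the notes.
# The notes do not say what each factor represents, so they stay unlabeled.
factors = [100, 10, 20, 20, 40]
nodes = prod(factors)

# Devices: projected user count times 100 devices per user.
users = 10**10
devices_per_user = 100
devices = users * devices_per_user

# Network: 32 KB/sec for 1B people, summed over one year.
# Assumptions: 1 KB = 1000 bytes, sustained rate, 365-day year.
BYTES_PER_KB = 1000
SECONDS_PER_YEAR = 365 * 24 * 3600
people = 10**9
bytes_per_year = 32 * BYTES_PER_KB * people * SECONDS_PER_YEAR

print(f"nodes          = {nodes:,}")
print(f"devices        = 10^{log10(devices):.0f}")
print(f"bytes per year = 10^{log10(bytes_per_year):.2f}")
```

Under these assumptions the product gives the 16 000 000 nodes stated in the notes, the device count comes out at 10^12, and the yearly traffic lands just above 10^21 bytes, which matches the order of magnitude quoted for the network.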
Research Challenges in Ideal Distributed Computing
- alternative design that would give better energy efficiency at lower utilization
- server OS design aimed at many highly-connected machines in one building
- unifying abstractions for exploiting parallelism beyond inter-transaction parallelism and map-reduce
- latency reduction
- a general model of replication including consistency choices, explained and codified
- machine learning techniques applied to monitoring/controlling such systems
- automatic dynamic world-wide placement of data & computation to minimize latency and/or cost given constraints on government regulations
- building retrieval systems that deal with ACLs efficiently and usably
- holistic models of privacy
- the user interface to a user’s diverse processing and state
Totally Transparent Processing
- the set of all user devices
- the set of all human languages
- the set of all modalities (text, voice, image, video)
- the set of all corpora (normal web, deep web, books, periodicals, blogs, geo-data, health data, scientific data)
“Hybrid” Intelligence
Research Challenges in Transparent Computing and Hybrid Intelligence
- endless applications, with new user interface implications
- addressing limits to data
- techniques to integrate user-feedback in acceptable fashions
- approaches to new signal
- explanation, scale, and variance minimization in machine learning
- information fusion/learning across diverse signals - The Combination Hypothesis, more generally
- usability: devices and subpopulations
- privacy
Domains of Applications
- search engines
- translation
- speech recognition
- vision
- remedial education / personal health / epidemiology / economic prediction / societal/environmental optimization / social networks in ever more clever/useful ways / humanities and social sciences / multi-player gaming
Focused Program: Culturomics
- Digital Humanities Culturomics
- Goal: Advance the field of culturomics as an important interface between history, linguistics, sociology, and the theoretical and computational sciences
Focused Program: Worldly Knowledge
- World Knowledge: extracting facts in context
- Goal: Teach computers to learn how to read about specific places and the people and events associated with them
Focused Center Program: AMP Lab (Berkeley)
Bringing Technology to Schools & Universities
Technology Leadership: Google Code University