1805
Legendre lays the groundwork for machine learning
French mathematician Adrien-Marie Legendre publishes the least squares method for regression, which he uses to determine, from astronomical observations, the orbits of bodies around the sun. Although this method was developed as a statistical framework, it would provide the basis for many of today’s machine-learning models.
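Legendre’s idea survives essentially unchanged in modern regression: choose the parameters that minimize the sum of squared errors. For a line y = a + bx there is a closed-form answer via the normal equations. A minimal sketch in Python (function name is illustrative, not from any historical source):

```python
def least_squares_line(xs, ys):
    # Fit y = a + b*x by ordinary least squares using the
    # closed-form normal equations (Legendre's 1805 method).
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    a = (sy - b * sx) / n                          # intercept
    return a, b
```

Given noisy observations, the returned line minimizes the total squared vertical distance to the points, exactly the criterion Legendre proposed.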
1958
Rosenblatt develops the first self-learning algorithm
American psychologist and computer scientist Frank Rosenblatt creates the perceptron algorithm, an early type of artificial neural network (ANN), which stands as the first algorithmic model that could learn on its own. American computer scientist Arthur Samuel would coin the term “machine learning” the following year for these types of self-learning models (as well as develop a groundbreaking checkers program seen as an early success in AI).
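Rosenblatt’s learning rule is simple enough to state in a few lines: when the perceptron misclassifies an example, nudge the weights toward the correct label; for linearly separable data this provably converges. A minimal sketch (illustrative names, not Rosenblatt’s original formulation):

```python
def predict(w, b, x):
    # Perceptron output: +1 if the weighted sum clears the threshold.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def train_perceptron(data, epochs=20, lr=1.0):
    # data: list of (features, label) pairs with labels in {-1, +1}.
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            if predict(w, b, x) != y:
                # Rosenblatt's update: move weights toward the true label.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b
```

Trained on a linearly separable task such as logical AND, the loop stops updating once every example is classified correctly.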
1965
Birth of deep learning
Ukrainian mathematician Alexey Grigorevich Ivakhnenko develops the first general working learning algorithms for supervised multilayer artificial neural networks (ANNs), in which several ANNs are stacked on top of one another and the output of one ANN layer feeds into the next. The architecture is very similar to today’s deep-learning architectures.
1986
Backpropagation takes hold
American psychologist David Rumelhart, British cognitive psychologist and computer scientist Geoffrey Hinton, and American computer scientist Ronald Williams publish on backpropagation, popularizing this key technique for training artificial neural networks (ANNs) that was originally proposed by American scientist Paul Werbos in 1982. Backpropagation allows an ANN to optimize itself without human intervention (in this case, it found features in family-tree data that weren’t obvious or provided to the algorithm in advance). Still, a lack of computational power and of the massive amounts of data needed to train these multilayered networks prevents ANNs that leverage backpropagation from seeing wide use.
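The core of backpropagation is the chain rule: compute the loss on a forward pass, differentiate it backward through the network, and step the weights downhill. The idea can be sketched with a single sigmoid neuron (a minimal illustration, not the family-tree network from the paper):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(data, epochs=2000, lr=0.5):
    # One sigmoid neuron trained by backpropagation:
    # forward pass, squared-error loss, chain-rule gradient, weight update.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            a = sigmoid(w * x + b)          # forward pass
            grad_z = (a - y) * a * (1 - a)  # dLoss/dz via the chain rule
            w -= lr * grad_z * x            # gradient-descent updates
            b -= lr * grad_z
    return w, b
```

In a multilayer network the same gradient is propagated layer by layer, which is where the method gets its name.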
1989
Birth of CNNs for image recognition
French computer scientist Yann LeCun, now director of AI research for Facebook, and others publish a paper describing how a type of artificial neural network called a convolutional neural network (CNN) is well suited for shape-recognition tasks. LeCun and team apply CNNs to the task of recognizing handwritten characters, with the initial goal of building automatic mail-sorting machines. Today, CNNs are the state-of-the-art model for image recognition and classification.
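What makes CNNs suited to shape recognition is the convolution operation: a small learned filter slides across the image and responds wherever a local pattern (an edge, a pen stroke) appears, regardless of its position. A minimal sketch of the operation itself (not LeCun’s network):

```python
def conv2d(image, kernel):
    # "Valid" 2-D convolution (strictly, cross-correlation, as in most
    # deep-learning libraries): slide the kernel over the image and sum
    # the elementwise products at each position.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]
```

With a tiny edge-detecting kernel such as [[1, -1]], the output map lights up exactly where the image transitions from dark to bright; a CNN learns many such filters automatically.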
1992
Upgraded SVMs provide early natural-language-processing solution
Computer engineers Bernhard E. Boser (Swiss), Isabelle M. Guyon (French), and Russian mathematician Vladimir N. Vapnik discover that algorithmic models called support vector machines (SVMs) can easily be upgraded to deal with nonlinear problems by using a technique called the kernel trick, leading to widespread use of SVMs in many natural-language-processing problems, such as classifying sentiment and understanding human speech.
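The kernel trick can be demonstrated directly: a kernel function computes the inner product of two inputs in a higher-dimensional feature space without ever constructing that space. A sketch with a degree-2 polynomial kernel (illustrative names, not the authors’ code):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, y):
    # Degree-2 polynomial kernel: k(x, y) = (x . y)^2,
    # computed entirely in the original input space.
    return dot(x, y) ** 2

def phi(x):
    # Explicit feature map for 2-D inputs; the kernel above equals
    # a plain dot product between phi(x) and phi(y) in this 3-D space.
    return [x[0] * x[0], math.sqrt(2) * x[0] * x[1], x[1] * x[1]]
```

Because an SVM needs only inner products between examples, it can operate in the richer feature space implicitly, at the cost of the cheaper input-space computation; higher-degree kernels make the implicit space far larger still.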
1997
RNNs get a “memory,” positioning them to advance speech to text
In 1991, German computer scientist Sepp Hochreiter showed that a special type of artificial neural network (ANN) called a recurrent neural network (RNN) could be useful for sequencing tasks (speech to text, for example) if it could better remember earlier parts of an input sequence. In 1997, Hochreiter and fellow computer scientist Jürgen Schmidhuber solve the problem by developing long short-term memory (LSTM). Today, RNNs with LSTM are used in many major speech-recognition applications.
1998
Brin and Page publish PageRank algorithm
The algorithm, which ranks web pages higher the more other web pages link to them, forms the initial prototype of Google’s search engine. This brainchild of Google founders Sergey Brin and Larry Page revolutionizes Internet searches, opening the door to the creation and consumption of more content and data on the World Wide Web. The algorithm would also go on to become one of the most important for businesses as they vie for attention on an increasingly sprawling Internet.
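At its core, PageRank is a simple iteration: each page’s score is redistributed along its outgoing links, with a damping factor modeling a surfer who occasionally jumps to a random page. A minimal sketch of that iteration (not Brin and Page’s implementation):

```python
def pagerank(links, damping=0.85, iters=50):
    # links: dict mapping each page to the list of pages it links to.
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}  # start with uniform rank
    for _ in range(iters):
        # Each page keeps a base share, then receives rank from inlinks.
        new = {u: (1 - damping) / n for u in nodes}
        for u in nodes:
            out = links[u]
            if not out:  # dangling page: spread its rank evenly
                for v in nodes:
                    new[v] += damping * rank[u] / n
            else:
                for v in out:
                    new[v] += damping * rank[u] / len(out)
        rank = new
    return rank
```

Pages with many (and highly ranked) inbound links accumulate the most score, which is exactly the ordering a search engine wants.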
2006
Hinton reenergizes the use of deep-learning models
To speed the training of deep-learning models, Geoffrey Hinton develops a way to pretrain them with a deep-belief network (a class of neural network) before employing backpropagation. While his method would become obsolete when computational power increased to a level that allowed for efficient deep-learning-model training, Hinton’s work popularized the use of deep learning worldwide—and many credit him with coining the phrase “deep learning.”
1991
Opening of the World Wide Web
The European Organization for Nuclear Research (CERN) begins opening up the World Wide Web to the public.
EARLY 2000s
Broadband adoption begins among home Internet users
Broadband allows users access to increasingly speedy Internet connections, up from the paltry 56 kbps available for downloading through dial-up in the late 1990s. Today, available broadband speeds can surpass 100 Mbps (1 Mbps = 1,000 kbps). Bandwidth-hungry applications like YouTube could not have become commercially viable without the advent of broadband.
2004
Web 2.0 hits its stride, launching the era of user-generated data
Web 2.0 refers to the shifting of the Internet paradigm from passive content viewing to interactive and collaborative content creation, social media, blogs, video, and other channels. Publishers Tim O'Reilly and Dale Dougherty popularize the term, though it was coined by designer Darcy DiNucci in 1999.
2004
Facebook debuts
Harvard student Mark Zuckerberg and team launch “Thefacebook,” as it was originally dubbed. By the end of 2005, the number of data-generating Facebook users approaches six million.
2005
Number of Internet users worldwide passes one-billion mark
The total marks nearly a twofold increase over the number of users just a few years earlier.
2005
YouTube debuts
Within about 18 months, the site would serve up almost 100 million views per day.
2007
Introduction of the iPhone propels smartphone revolution—and amps up data generation
Apple cofounder and CEO Steve Jobs introduces the iPhone in January 2007. The total number of smartphones sold in 2007 reaches about 122 million. The era of around-the-clock consumption and creation of data and content by smartphone users begins.
1965
Moore recognizes exponential growth in chip power
Intel cofounder Gordon Moore notices that the number of transistors per square inch on integrated circuits has doubled every year since their invention. His observation becomes Moore’s law, which predicts the trend will continue into the foreseeable future (although the doubling later proves to occur roughly every 18 months). At the time, state-of-the-art computational speed is on the order of three million floating-point operations per second (FLOPS).
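Moore’s law is exponential arithmetic: at one doubling every 18 months, capacity grows by a factor of 2^(months/18). A quick illustrative calculation (the 18-month cadence is the commonly cited variant noted above):

```python
def moore_projection(start, months, doubling_period=18):
    # Project a capacity figure (transistor count, FLOPS, ...) forward
    # assuming one doubling every `doubling_period` months.
    return start * 2 ** (months / doubling_period)
```

At that rate, a figure doubles 20 times over 30 years (360 months / 18), a roughly millionfold increase, which is why the mid-1960s benchmark of three million FLOPS looks so quaint beside later entries in this timeline.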
1997
Increase in computing power drives IBM’s Deep Blue victory over Garry Kasparov
Deep Blue’s success against the world chess champion stems largely from masterful engineering and the tremendous computing power available at the time. Deep Blue achieves around 11 gigaFLOPS (11 billion FLOPS).
1999
More computing power for AI algorithms arrives … but no one realizes it yet
Nvidia releases the GeForce 256 graphics card, marketed as the world’s first true graphics processing unit (GPU). The technology will later prove fundamental to deep learning by performing computations much faster than central processing units (CPUs).
2002
Amazon brings cloud storage and computing to the masses
Amazon launches Amazon Web Services, offering cloud-based storage and computing power to users. Cloud computing would come to revolutionize and democratize data storage and computation, giving millions of users access to powerful IT systems—previously available only to big tech companies—at a relatively low cost.
2004
Dean and Ghemawat introduce the MapReduce algorithm to cope with data explosion
With the World Wide Web taking off, Google seeks out novel ideas to deal with the resulting proliferation of data. Computer scientist Jeff Dean (current head of Google Brain) and Google software engineer Sanjay Ghemawat develop MapReduce to deal with immense amounts of data by parallelizing processes across large data sets using a substantial number of computers.
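The MapReduce pattern is often shown in miniature with the canonical word-count example: a map step emits key-value pairs, a shuffle groups them by key, and a reduce step aggregates each group. In the real system, map and reduce calls run in parallel across many machines; this single-process sketch (illustrative names) shows only the data flow:

```python
from collections import defaultdict

def map_phase(doc):
    # Map: emit a (word, 1) pair for every word in a document.
    return [(word, 1) for word in doc.split()]

def reduce_phase(word, counts):
    # Reduce: aggregate all values emitted for one key.
    return word, sum(counts)

def word_count(docs):
    # Shuffle: group intermediate pairs by key. In a real MapReduce
    # cluster the framework does this across many machines.
    groups = defaultdict(list)
    for doc in docs:
        for word, count in map_phase(doc):
            groups[word].append(count)
    return dict(reduce_phase(w, c) for w, c in groups.items())
```

Because each map call and each reduce call is independent, the framework can scale the same program from one machine to thousands without changing the user’s code.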
2005
Cost of one gigabyte of disk storage drops to $0.79, from $277 ten years earlier
And the price of DRAM, a type of random-access memory (RAM) commonly used in PCs, drops to $158 per gigabyte, from $31,633 in 1995.
2006
Cutting and Cafarella introduce Hadoop to store and process massive amounts of data
Inspired by Google’s MapReduce, computer scientists Doug Cutting and Mike Cafarella develop the Hadoop software to store and process enormous data sets. Yahoo uses it first, to deal with the explosion of data coming from indexing web pages and online data.
2009
UC Berkeley introduces Spark to handle big data
Developed by Romanian-Canadian computer scientist Matei Zaharia at UC Berkeley’s AMPLab, Spark processes huge amounts of data in RAM, making it much faster than software that must read from and write to hard drives. It revolutionizes the ability to update big data and perform analytics in real time.
2009
Ng uses GPUs to train deep-learning models more efficiently
American computer scientist Andrew Ng and his team at Stanford University show that training deep-belief networks with 100 million parameters on GPUs is more than 70 times faster than doing so on CPUs, a finding that would reduce training that once took weeks to only one day.
2010
Microsoft and Google introduce their clouds
Cloud computing and storage take another step toward ubiquity when Microsoft makes Azure available and Google launches its Google Cloud Storage (the Google Cloud Platform would come online about a year later).
2010
Worldwide IP traffic exceeds 20 exabytes (20 billion gigabytes) per month
Internet protocol (IP) traffic is aided by the growing adoption of broadband, particularly in the United States, where adoption reaches 65 percent, according to Cisco, which reports the monthly figure as well as an annual total of 242 exabytes.
2010
Number of smartphones sold in the year nears 300 million
This represents a nearly 2.5-fold increase over the number sold in 2007.
2011
IBM Watson beats humans at Jeopardy!
IBM’s question answering system, Watson, defeats the two greatest Jeopardy! champions, Brad Rutter and Ken Jennings, by a significant margin. IBM Watson uses ten racks of IBM Power 750 servers capable of 80 teraFLOPS (that’s 80 trillion FLOPS—the state of the art in the mid-1960s was around three million FLOPS).
2012
Deep-learning system wins renowned image-classification contest for the first time
Geoffrey Hinton’s team wins ImageNet’s image-classification competition by a large margin, with an error rate of 15.3 percent versus the second-best error rate of 26.2 percent, using a convolutional neural network (CNN). Hinton’s team trained its CNN on 1.2 million images using two GPU cards.
2012
Google demonstrates the effectiveness of deep learning for image recognition
Google uses 16,000 processors to train a deep artificial neural network with one billion connections on ten million randomly selected YouTube video thumbnails over the course of three days. Without receiving any information about the images, the network starts recognizing pictures of cats, marking the beginning of significant advances in image recognition.
2012
Number of Facebook users hits one billion
The amount of data processed by the company’s systems soars past 500 terabytes.
2013
DeepMind teaches an algorithm to play Atari using reinforcement learning and deep learning
While reinforcement learning dates to the late 1950s, it gains in popularity this year when Canadian research scientist Vlad Mnih from DeepMind (not yet a Google company) applies it in conjunction with a convolutional neural network to play Atari video games at superhuman levels.
2014
Number of mobile devices exceeds number of humans
As of October 2014, GSMA reports the number of mobile devices at around 7.22 billion, while the US Census Bureau reports the number of people globally at around 7.20 billion.
2017
Electronic-device users generate 2.5 quintillion bytes of data per day
According to this estimate, about 90 percent of the world’s data were produced in the past two years. And, every minute, YouTube users watch more than four million videos and mobile users send more than 15 million texts.
2017
AlphaZero beats AlphaGo Zero after learning to play three different games in less than 24 hours
While creating AI software with full general intelligence remains decades off (if possible at all), Google’s DeepMind takes another step closer to it with AlphaZero, which learns to play three board games: Go, chess, and shogi. Unlike the original AlphaGo, which learned in part from games played by human experts, AlphaZero learns strictly by playing against itself, and it goes on to defeat its predecessor AlphaGo Zero at Go (after eight hours of self-play) as well as some of the world’s best chess- and shogi-playing computer programs (after four and two hours of self-play, respectively).
2017
Google introduces upgraded TPU that speeds machine-learning processes
Google first introduced its tensor processing unit (TPU) in 2016, which it used to run its own machine-learning models at a reported 15 to 30 times faster than GPUs and CPUs. In 2017, Google announced an upgraded version of the TPU that was faster (180 teraFLOPS—more when multiple TPUs are combined), could be used to train models in addition to running them, and would be offered to the paying public via the cloud. TPU availability could spawn even more (and more powerful and efficient) machine-learning-based business applications.
2013–2017
Business use cases
Understand product sales drivers
Optimize price points
Diagnose disease
Predict customer churn
Recommend next product to buy
Optimize power grids
Forecast demand more accurately
Optimize salesforce coverage
Personalize marketing campaigns
Detect fraud
Apply predictive maintenance
Improve capital allocation
Timeline categories: algorithmic advancements; exponential increases in computing power and storage; explosion of data; business use cases.
A convergence of algorithmic advances, data proliferation, and tremendous increases in computing power and storage has propelled AI from hype to reality.
2012–2017
Why AI now?
