1805
Legendre lays the groundwork for machine learning
French mathematician Adrien-Marie Legendre publishes the least squares method for regression, which he uses to determine, from astronomical observations, the orbits of bodies around the sun. Although this method was developed as a statistical framework, it would provide the basis for many of today’s machine-learning models.
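Legendre’s idea survives essentially unchanged in modern regression: choose the parameters that minimize the sum of squared errors. For a line y = a + bx there is a closed-form answer via the normal equations. A minimal sketch in Python (function name is illustrative, not from any historical source):

```python
def least_squares_line(xs, ys):
    # Fit y = a + b*x by ordinary least squares using the
    # closed-form normal equations (Legendre's 1805 method).
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    a = (sy - b * sx) / n                          # intercept
    return a, b
```

Given noisy observations, the returned line minimizes the total squared vertical distance to the points, exactly the criterion Legendre proposed.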
1958
Rosenblatt develops the first self-learning algorithm
American psychologist and computer scientist Frank Rosenblatt creates the perceptron algorithm, an early type of artificial neural network (ANN), which stands as the first algorithmic model that could learn on its own. American computer scientist Arthur Samuel would coin the term “machine learning” the following year for these types of self-learning models (as well as develop a groundbreaking checkers program seen as an early success in AI).
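Rosenblatt’s learning rule is simple enough to state in a few lines: when the perceptron misclassifies an example, nudge the weights toward the correct label; for linearly separable data this provably converges. A minimal sketch (illustrative names, not Rosenblatt’s original formulation):

```python
def predict(w, b, x):
    # Perceptron output: +1 if the weighted sum clears the threshold.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def train_perceptron(data, epochs=20, lr=1.0):
    # data: list of (features, label) pairs with labels in {-1, +1}.
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            if predict(w, b, x) != y:
                # Rosenblatt's update: move weights toward the true label.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b
```

Trained on a linearly separable task such as logical AND, the loop stops updating once every example is classified correctly.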
1965
Birth of deep learning
Ukrainian mathematician Alexey Grigorevich Ivakhnenko develops the first general working learning algorithms for supervised multilayer artificial neural networks (ANNs), in which several ANNs are stacked on top of one another and the output of one ANN layer feeds into the next. The architecture is very similar to today’s deep-learning architectures.
1986
Backpropagation takes hold
American psychologist David Rumelhart, British cognitive psychologist and computer scientist Geoffrey Hinton, and American computer scientist Ronald Williams publish on backpropagation, popularizing this key technique for training artificial neural networks (ANNs) that was originally proposed by American scientist Paul Werbos in 1982. Backpropagation allows an ANN to optimize itself without human intervention (in this case, it found features in family-tree data that weren’t obvious or provided to the algorithm in advance). Still, a lack of computational power and of the massive amounts of data needed to train these multilayered networks prevents ANNs that leverage backpropagation from seeing wide use.
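The core of backpropagation is the chain rule: compute the loss on a forward pass, differentiate it backward through the network, and step the weights downhill. The idea can be sketched with a single sigmoid neuron (a minimal illustration, not the family-tree network from the paper):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(data, epochs=2000, lr=0.5):
    # One sigmoid neuron trained by backpropagation:
    # forward pass, squared-error loss, chain-rule gradient, weight update.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            a = sigmoid(w * x + b)          # forward pass
            grad_z = (a - y) * a * (1 - a)  # dLoss/dz via the chain rule
            w -= lr * grad_z * x            # gradient-descent updates
            b -= lr * grad_z
    return w, b
```

In a multilayer network the same gradient is propagated layer by layer, which is where the method gets its name.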
1989
Birth of CNNs for image recognition
French computer scientist Yann LeCun, now director of AI research for Facebook, and others publish a paper describing how a type of artificial neural network called a convolutional neural network (CNN) is well suited for shape-recognition tasks. LeCun and team apply CNNs to the task of recognizing handwritten characters, with the initial goal of building automatic mail-sorting machines. Today, CNNs are the state-of-the-art model for image recognition and classification.
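What makes CNNs suited to shape recognition is the convolution operation: a small learned filter slides across the image and responds wherever a local pattern (an edge, a pen stroke) appears, regardless of its position. A minimal sketch of the operation itself (not LeCun’s network):

```python
def conv2d(image, kernel):
    # "Valid" 2-D convolution (strictly, cross-correlation, as in most
    # deep-learning libraries): slide the kernel over the image and sum
    # the elementwise products at each position.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]
```

With a tiny edge-detecting kernel such as [[1, -1]], the output map lights up exactly where the image transitions from dark to bright; a CNN learns many such filters automatically.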
1992
Upgraded SVMs provide early natural-language-processing solution
Computer engineers Bernhard E. Boser (Swiss), Isabelle M. Guyon (French), and Russian mathematician Vladimir N. Vapnik discover that algorithmic models called support vector machines (SVMs) can easily be upgraded to deal with nonlinear problems by using a technique called the kernel trick, leading to widespread use of SVMs in many natural-language-processing problems, such as classifying sentiment and understanding human speech.
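The kernel trick can be demonstrated directly: a kernel function computes the inner product of two inputs in a higher-dimensional feature space without ever constructing that space. A sketch with a degree-2 polynomial kernel (illustrative names, not the authors’ code):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, y):
    # Degree-2 polynomial kernel: k(x, y) = (x . y)^2,
    # computed entirely in the original input space.
    return dot(x, y) ** 2

def phi(x):
    # Explicit feature map for 2-D inputs; the kernel above equals
    # a plain dot product between phi(x) and phi(y) in this 3-D space.
    return [x[0] * x[0], math.sqrt(2) * x[0] * x[1], x[1] * x[1]]
```

Because an SVM needs only inner products between examples, it can operate in the richer feature space implicitly, at the cost of the cheaper input-space computation; higher-degree kernels make the implicit space far larger still.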
1997
RNNs get a “memory,” positioning them to advance speech to text
In 1991, German computer scientist Sepp Hochreiter showed that a special type of artificial neural network (ANN) called a recurrent neural network (RNN) could be useful for sequencing tasks (speech to text, for example) if it could better remember earlier parts of an input sequence. In 1997, Hochreiter and fellow computer scientist Jürgen Schmidhuber solve the problem by developing long short-term memory (LSTM). Today, RNNs with LSTM are used in many major speech-recognition applications.
1998
Brin and Page publish PageRank algorithm
The algorithm, which ranks web pages higher the more other web pages link to them, forms the initial prototype of Google’s search engine. This brainchild of Google founders Sergey Brin and Larry Page revolutionizes Internet searches, opening the door to the creation and consumption of more content and data on the World Wide Web. The algorithm would also go on to become one of the most important for businesses as they vie for attention on an increasingly sprawling Internet.
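At its core, PageRank is a simple iteration: each page’s score is redistributed along its outgoing links, with a damping factor modeling a surfer who occasionally jumps to a random page. A minimal sketch of that iteration (not Brin and Page’s implementation):

```python
def pagerank(links, damping=0.85, iters=50):
    # links: dict mapping each page to the list of pages it links to.
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}  # start with uniform rank
    for _ in range(iters):
        # Each page keeps a base share, then receives rank from inlinks.
        new = {u: (1 - damping) / n for u in nodes}
        for u in nodes:
            out = links[u]
            if not out:  # dangling page: spread its rank evenly
                for v in nodes:
                    new[v] += damping * rank[u] / n
            else:
                for v in out:
                    new[v] += damping * rank[u] / len(out)
        rank = new
    return rank
```

Pages with many (and highly ranked) inbound links accumulate the most score, which is exactly the ordering a search engine wants.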
2006
Hinton reenergizes the use of deep-learning models
To speed the training of deep-learning models, Geoffrey Hinton develops a way to pretrain them with a deep-belief network (a class of neural network) before employing backpropagation. While his method would become obsolete when computational power increased to a level that allowed for efficient deep-learning-model training, Hinton’s work popularized the use of deep learning worldwide—and many credit him with coining the phrase “deep learning.”
1991
Opening of the World Wide Web
The European Organization for Nuclear Research (CERN) begins opening up the World Wide Web to the public.
EARLY 2000s
Broadband adoption begins among home Internet users
Broadband allows users access to increasingly speedy Internet connections, up from the paltry 56 kbps available for downloading through dial-up in the late 1990s. Today, available broadband speeds can surpass 100 Mbps (1 Mbps = 1,000 kbps). Bandwidth-hungry applications like YouTube could not have become commercially viable without the advent of broadband.
2004
Web 2.0 hits its stride, launching the era of user-generated data
Web 2.0 refers to the shifting of the Internet paradigm from passive content viewing to interactive and collaborative content creation, social media, blogs, video, and other channels. Publishers Tim O'Reilly and Dale Dougherty popularize the term, though it was coined by designer Darcy DiNucci in 1999.
2004
Facebook debuts
Harvard student Mark Zuckerberg and team launch “Thefacebook,” as it was originally dubbed. By the end of 2005, the number of data-generating Facebook users approaches six million.
2005
Number of Internet users worldwide passes one-billion mark
The total marks nearly a twofold increase over the number of users just a few years earlier.
2005
YouTube debuts
Within about 18 months, the site would serve up almost 100 million views per day.
2007
Introduction of the iPhone propels smartphone revolution—and amps up data generation
Apple cofounder and CEO Steve Jobs introduces the iPhone in January 2007. The total number of smartphones sold in 2007 reaches about 122 million. The era of around-the-clock consumption and creation of data and content by smartphone users begins.
1965
Moore recognizes exponential growth in chip power
Intel cofounder Gordon Moore notices that the number of transistors per square inch on integrated circuits has doubled every year since their invention. His observation becomes Moore’s law, which predicts the trend will continue into the foreseeable future (although the doubling later proves to occur roughly every 18 months). At the time, state-of-the-art computational speed is on the order of three million floating-point operations per second (FLOPS).
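Moore’s law is exponential arithmetic: at one doubling every 18 months, capacity grows by a factor of 2^(months/18). A quick illustrative calculation (the 18-month cadence is the commonly cited variant noted above):

```python
def moore_projection(start, months, doubling_period=18):
    # Project a capacity figure (transistor count, FLOPS, ...) forward
    # assuming one doubling every `doubling_period` months.
    return start * 2 ** (months / doubling_period)
```

At that rate, a figure doubles 20 times over 30 years (360 months / 18), a roughly millionfold increase, which is why the mid-1960s benchmark of three million FLOPS looks so quaint beside later entries in this timeline.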
1997
Increase in computing power drives IBM’s Deep Blue victory over Garry Kasparov
Deep Blue’s success against the world chess champion stems largely from masterful engineering and the tremendous computing power available at the time. Deep Blue achieves around 11 gigaFLOPS (11 billion FLOPS).
1999
More computing power for AI algorithms arrives … but no one realizes it yet
Nvidia releases the GeForce 256 graphics card, marketed as the world’s first true graphics processing unit (GPU). The technology will later prove fundamental to deep learning by performing computations much faster than central processing units (CPUs).
2002
Amazon brings cloud storage and computing to the masses
Amazon launches Amazon Web Services, offering cloud-based storage and computing power to users. Cloud computing would come to revolutionize and democratize data storage and computation, giving millions of users access to powerful IT systems—previously available only to big tech companies—at a relatively low cost.
2004
Dean and Ghemawat introduce the MapReduce algorithm to cope with data explosion
With the World Wide Web taking off, Google seeks out novel ideas to deal with the resulting proliferation of data. Computer scientist Jeff Dean (current head of Google Brain) and Google software engineer Sanjay Ghemawat develop MapReduce to deal with immense amounts of data by parallelizing processes across large data sets using a substantial number of computers.
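The MapReduce pattern is often shown in miniature with the canonical word-count example: a map step emits key-value pairs, a shuffle groups them by key, and a reduce step aggregates each group. In the real system, map and reduce calls run in parallel across many machines; this single-process sketch (illustrative names) shows only the data flow:

```python
from collections import defaultdict

def map_phase(doc):
    # Map: emit a (word, 1) pair for every word in a document.
    return [(word, 1) for word in doc.split()]

def reduce_phase(word, counts):
    # Reduce: aggregate all values emitted for one key.
    return word, sum(counts)

def word_count(docs):
    # Shuffle: group intermediate pairs by key. In a real MapReduce
    # cluster the framework does this across many machines.
    groups = defaultdict(list)
    for doc in docs:
        for word, count in map_phase(doc):
            groups[word].append(count)
    return dict(reduce_phase(w, c) for w, c in groups.items())
```

Because each map call and each reduce call is independent, the framework can scale the same program from one machine to thousands without changing the user’s code.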
2005
Cost of one gigabyte of disk storage drops to $0.79, from $277 ten years earlier
And the price of DRAM, a type of random-access memory (RAM) commonly used in PCs, drops to $158 per gigabyte, from $31,633 in 1995.
2006
Cutting and Cafarella introduce Hadoop to store and process massive amounts of data
Inspired by Google’s MapReduce, computer scientists Doug Cutting and Mike Cafarella develop the Hadoop software to store and process enormous data sets. Yahoo uses it first, to deal with the explosion of data coming from indexing web pages and online data.
2009
UC Berkeley introduces Spark to handle big data
Developed by Romanian-Canadian computer scientist Matei Zaharia at UC Berkeley’s AMPLab, Spark processes huge amounts of data in RAM, making it much faster than software that must read from and write to hard drives. It revolutionizes the ability to update big data and perform analytics in real time.
2009
Ng uses GPUs to train deep-learning models more efficiently
American computer scientist Andrew Ng and his team at Stanford University show that training deep-belief networks with 100 million parameters on GPUs is more than 70 times faster than doing so on CPUs, a finding that would reduce training that once took weeks to only one day.
2010
Microsoft and Google introduce their clouds
Cloud computing and storage take another step toward ubiquity when Microsoft makes Azure available and Google launches its Google Cloud Storage (the Google Cloud Platform would come online about a year later).
2010
Worldwide IP traffic exceeds 20 exabytes (20 billion gigabytes) per month
Internet protocol (IP) traffic is aided by the growing adoption of broadband, particularly in the United States, where adoption reaches 65 percent, according to Cisco, which reports the monthly figure as well as an annual total of 242 exabytes.
2010
Number of smartphones sold in the year nears 300 million
This represents a nearly 2.5-fold increase over the number sold in 2007.
2011
IBM Watson beats humans at Jeopardy!
IBM’s question answering system, Watson, defeats the two greatest Jeopardy! champions, Brad Rutter and Ken Jennings, by a significant margin. IBM Watson uses ten racks of IBM Power 750 servers capable of 80 teraFLOPS (that’s 80 trillion FLOPS—the state of the art in the mid-1960s was around three million FLOPS).
2012
Deep-learning system wins renowned image-classification contest for the first time
Geoffrey Hinton’s team wins ImageNet’s image-classification competition by a large margin, with an error rate of 15.3 percent versus the second-best error rate of 26.2 percent, using a convolutional neural network (CNN). Hinton’s team trained its CNN on 1.2 million images using two GPU cards.
2012
Google demonstrates the effectiveness of deep learning for image recognition
Google uses 16,000 processors to train a deep artificial neural network with one billion connections on ten million randomly selected YouTube video thumbnails over the course of three days. Without receiving any information about the images, the network starts recognizing pictures of cats, marking the beginning of significant advances in image recognition.
2012
Number of Facebook users hits one billion
The amount of data processed by the company’s systems soars past 500 terabytes.
2013
DeepMind teaches an algorithm to play Atari using reinforcement learning and deep learning
While reinforcement learning dates to the late 1950s, it gains in popularity this year when Canadian research scientist Vlad Mnih from DeepMind (not yet a Google company) applies it in conjunction with a convolutional neural network to play Atari video games at superhuman levels.
2014
Number of mobile devices exceeds number of humans
As of October 2014, GSMA reports the number of mobile devices at around 7.22 billion, while the US Census Bureau reports the number of people globally at around 7.20 billion.
2017
Electronic-device users generate 2.5 quintillion bytes of data per day
According to this estimate, about 90 percent of the world’s data were produced in the past two years. And, every minute, YouTube users watch more than four million videos and mobile users send more than 15 million texts.
2017
AlphaZero beats AlphaGo Zero after learning to play three different games in less than 24 hours
While creating AI software with full general intelligence remains decades off (if possible at all), Google’s DeepMind takes another step closer to it with AlphaZero, which learns to play three board games: Go, chess, and shogi. Unlike the original AlphaGo, which learned in part from games played by human experts, AlphaZero learns strictly by playing against itself, and it goes on to defeat its predecessor AlphaGo Zero at Go (after eight hours of self-play) as well as some of the world’s best chess- and shogi-playing computer programs (after four and two hours of self-play, respectively).
2017
Google introduces upgraded TPU that speeds machine-learning processes
Google first introduced its tensor processing unit (TPU) in 2016, which it used to run its own machine-learning models at a reported 15 to 30 times faster than GPUs and CPUs. In 2017, Google announced an upgraded version of the TPU that was faster (180 teraFLOPS—more when multiple TPUs are combined), could be used to train models in addition to running them, and would be offered to the paying public via the cloud. TPU availability could spawn even more (and more powerful and efficient) machine-learning-based business applications.
2013–2017
Business use cases
Understand product sales drivers
Optimize price points
Diagnose disease
Predict customer churn
Recommend next product to buy
Optimize power grids
Forecast demand more accurately
Optimize salesforce coverage
Personalize marketing campaigns
Detect fraud
Apply predictive maintenance
Improve capital allocation
Timeline categories: algorithmic advancements; exponential increases in computing power and storage; explosion of data; business use cases.
A convergence of algorithmic advances, data proliferation, and tremendous increases in computing power and storage has propelled AI from hype to reality.
2012–2017
Why AI now?
