Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save pengan1987/174e9ed29fb177c1fadb to your computer and use it in GitHub Desktop.

Select an option

Save pengan1987/174e9ed29fb177c1fadb to your computer and use it in GitHub Desktop.
1
00:00:04,320 --> 00:00:08,040
现代生活被接管已经是一个不言而喻的事实
2
00:00:10,320 --> 00:00:14,960
当我们寻找爱情,在线购物
3
00:00:14,960 --> 00:00:18,280
环球旅行
4
00:00:18,280 --> 00:00:20,640
甚至是拯救生命
5
00:00:20,640 --> 00:00:25,360
幕后都有按部就班的指令悄悄运行着
6
00:00:27,400 --> 00:00:30,640
这样的情形越来越多,正在控制着我们的生活
7
00:00:30,640 --> 00:00:33,240
它们被称为算法
8
00:00:34,680 --> 00:00:36,920
无处不在的算法
9
00:00:36,920 --> 00:00:39,960
这些数学中的小魔术,已经成为今天
10
00:00:39,960 --> 00:00:41,800
日常生活的中心
11
00:00:41,800 --> 00:00:45,080
但是因为它们是看不见的,我们往往将其视为理所当然
12
00:00:45,080 --> 00:00:46,520
甚至误解它们
13
00:00:51,240 --> 00:00:52,240
(笑)
14
00:00:53,560 --> 00:00:57,240
他们是数字世界的密码,而且不止于此
15
00:00:59,000 --> 00:01:02,440
在这一期节目里,我将演示一些我最喜欢的
16
00:01:02,440 --> 00:01:06,160
算法和他们的来历
17
00:01:06,160 --> 00:01:08,760
算法有着悠久的历史
18
00:01:08,760 --> 00:01:10,080
'..如何工作...'
19
00:01:10,080 --> 00:01:12,280
面临的挑战是寻找最短路径...
20
00:01:12,280 --> 00:01:14,960
这就是粗略的说明,然后你会用来
21
00:01:14,960 --> 00:01:18,120
带你回到出发点
22
00:01:18,120 --> 00:01:20,520
'..它们未来所能做到的.'
23
00:01:20,520 --> 00:01:23,760
算法可以自己创造自己吗?或者...? 正是如此.
24
00:01:23,760 --> 00:01:26,400
'..我们已经无法离开它们生存'
25
00:01:26,400 --> 00:01:29,920
即使我们烤蛋糕的时候,我们也遵循着某种算法
26
00:01:29,920 --> 00:01:32,800
作为一名数学家,我热爱算法
27
00:01:32,800 --> 00:01:35,440
然而不仅是因为它们解决问题的本领令人印象深刻
28
00:01:35,440 --> 00:01:39,080
而且由于他们奇特的美感,展示着
29
00:01:39,080 --> 00:01:42,520
宇宙运行背后的数学原理
30
00:01:42,520 --> 00:01:45,960
欢迎来到奇妙又美丽的算法世界
31
00:01:54,920 --> 00:01:57,960
我们大多数人都随身携带着它
32
00:01:57,960 --> 00:02:00,440
今天,你在用手机拍照的时候,你可能会发现
33
00:02:00,440 --> 00:02:06,120
手机会在脸上画一个方框,就像这样
34
00:02:06,120 --> 00:02:09,760
这是特殊的人脸识别算法的结果
35
00:02:09,760 --> 00:02:13,400
这使得面部总在画面的焦点上
36
00:02:14,760 --> 00:02:18,600
就像所有算法一样,它解决了一个问题
37
00:02:18,600 --> 00:02:21,800
对这个算法来说,是寻找面部图像
38
00:02:21,800 --> 00:02:25,040
它并不会被水果拼出的脸所迷惑
39
00:02:25,040 --> 00:02:28,640
但能识别出照片中的人脸
40
00:02:28,640 --> 00:02:31,480
不过,它是如何做到的呢?
41
00:02:31,480 --> 00:02:34,440
从根本上讲,算法并不只是
42
00:02:34,440 --> 00:02:37,600
一系列按部就班的指令
43
00:02:37,600 --> 00:02:40,840
这是一种扫描图像的方法
44
00:02:40,840 --> 00:02:43,800
寻找与脸部相关的
45
00:02:43,800 --> 00:02:46,240
四个特殊的抽象图案
46
00:02:46,240 --> 00:02:48,600
当检测出其中的一个紧跟着另一个的时候
47
00:02:48,600 --> 00:02:52,840
算法就认为它找到了人脸
48
00:02:52,840 --> 00:02:57,160
这个过程依靠的是所有面孔所共有的模式
49
00:02:57,160 --> 00:02:59,760
而无关乎形状和尺寸
50
00:02:59,760 --> 00:03:03,120
最终的结果就是算法
51
00:03:03,120 --> 00:03:05,880
让我们的生活更容易的一个例子
52
00:03:05,880 --> 00:03:09,240
I'll do it! I'll do it! I was here
first! OK.
53
00:03:09,240 --> 00:03:10,920
So, off you go.
54
00:03:10,920 --> 00:03:14,520
我们倾向于将算法与电脑、智能手机
55
00:03:14,520 --> 00:03:15,760
或者互联网联系起来
56
00:03:15,760 --> 00:03:19,640
但它们并不仅仅与科技有关
57
00:03:19,640 --> 00:03:24,440
我平时的工作是牛津大学的数学教授
58
00:03:24,440 --> 00:03:27,160
'And one of the things I enjoy most
is keeping
59
00:03:27,160 --> 00:03:29,360
'the students on their toes.'
60
00:03:29,360 --> 00:03:31,360
OK, I'll take one.
61
00:03:31,360 --> 00:03:34,240
现在我们正在玩一个数学游戏,
62
00:03:34,240 --> 00:03:37,320
装满巧克力和红辣椒的罐子
63
00:03:38,720 --> 00:03:43,000
游戏的目标是让罐子里最后只剩下红辣椒
64
00:03:43,000 --> 00:03:44,680
但学生们却不知道
65
00:03:44,680 --> 00:03:49,800
我在游戏中依靠的是算法的帮助
66
00:03:49,800 --> 00:03:52,080
准备好了吗?(齐声说): 好了.
67
00:03:52,080 --> 00:03:54,840
那好,我先开始了,请记住,你们一次可以拿
68
00:03:54,840 --> 00:03:57,480
一颗、两颗、或者三颗巧克力。
69
00:03:57,480 --> 00:04:01,160
我并不是贪吃鬼,所以我只拿一颗。现在轮到你了。
70
00:04:01,160 --> 00:04:05,760
每个玩家在他们的回合中拿走一到三颗巧克力
71
00:04:05,760 --> 00:04:09,400
你拿了两颗,那么我呢……也拿两颗
72
00:04:09,400 --> 00:04:12,520
不管我的对手怎样做,我得算法都告诉我
73
00:04:12,520 --> 00:04:14,440
应当如何应对
74
00:04:14,440 --> 00:04:16,800
好的,我拿两颗
75
00:04:16,800 --> 00:04:19,320
然后又轮到你了
SHE LAUGHS
76
00:04:19,320 --> 00:04:20,960
欧也
77
00:04:20,960 --> 00:04:24,360
然后,我拿……三颗,是三颗。然后我拿一颗
78
00:04:24,360 --> 00:04:27,440
然后就只剩下辣椒了... 所以说,等下是我了?是的,所以你得
79
00:04:27,440 --> 00:04:29,920
把辣椒吃掉!噢,不要。所以,请了。
80
00:04:29,920 --> 00:04:33,240
让我解释一下算法是如何让我获胜的
81
00:04:33,240 --> 00:04:34,720
这是学会他的唯一办法
82
00:04:35,960 --> 00:04:40,840
问题的关键是思考如何将物品四个一组分开
83
00:04:42,320 --> 00:04:47,040
十三颗巧克力四个一组,可以分三组,剩下一颗。
84
00:04:47,040 --> 00:04:50,440
所以,第一轮的时候拿走一颗,接下来每轮四颗
85
00:04:50,440 --> 00:04:54,440
减去对手所拿的,就是这一回合我所要拿的数量
86
00:04:54,440 --> 00:04:57,320
这个算法可以保证我的对手
87
00:04:57,320 --> 00:04:59,200
总是会剩下辣椒
88
00:04:59,200 --> 00:05:01,400
算法神奇魔力背后
89
00:05:01,400 --> 00:05:04,480
的本质,如果你喜欢的话,是数学
90
00:05:04,480 --> 00:05:07,760
最好的算法是总是会挖掘到那些
91
00:05:07,760 --> 00:05:10,640
隐藏在问题背后的数学结构
92
00:05:11,880 --> 00:05:13,280
OK,辣椒的问题就讲到这里
93
00:05:14,760 --> 00:05:18,080
我将为您介绍一些在现代生活中
94
00:05:18,080 --> 00:05:20,560
已经如同心跳一般不可或缺的算法
95
00:05:22,160 --> 00:05:25,200
但一开始,我要讲解的是在现代算法应用之前
96
00:05:25,200 --> 00:05:28,600
算法其实是一门极其古老的学问
97
00:05:29,960 --> 00:05:33,840
实际上,算法在计算机出现的几千年之前就已经存在了
98
00:05:35,720 --> 00:05:38,520
我们所知道的最古老的算法是用来
99
00:05:38,520 --> 00:05:40,800
解决一个数学难题
100
00:05:40,800 --> 00:05:45,080
它最早是由古埃及数学家欧几里得所记载的
101
00:05:45,080 --> 00:05:47,800
欧几里得的算法,据我们所知
102
00:05:47,800 --> 00:05:50,880
是一种寻找最大公约数的方法
103
00:05:52,640 --> 00:05:55,960
最大公约数就是可以同时除以两个数字
104
00:05:55,960 --> 00:06:00,560
并且都不留余数的最大的数字
105
00:06:00,560 --> 00:06:03,280
比如,在这个例子里,四除以八
106
00:06:03,280 --> 00:06:06,400
和十二都不会有余数
107
00:06:06,400 --> 00:06:08,800
对于较小的数字来说这很容易被找到
108
00:06:08,800 --> 00:06:10,840
但对于大数字来说就更难了
109
00:06:12,400 --> 00:06:15,800
欧几里得是当时最伟大的数学家
110
00:06:15,800 --> 00:06:18,840
他的算法使他作为瓦工的发了一笔大财
111
00:06:20,120 --> 00:06:22,560
让我告诉你他是如何做到的
112
00:06:22,560 --> 00:06:25,680
想象一下你有一片四边形的地板
113
00:06:25,680 --> 00:06:27,080
并且你希望找到
114
00:06:27,080 --> 00:06:30,640
最高效的方法用方砖将其铺满
115
00:06:30,640 --> 00:06:34,360
换句话说,就是选用最大的方砖,
116
00:06:34,360 --> 00:06:38,280
能够正好分割地板,而且不留边角料
117
00:06:38,280 --> 00:06:40,720
这实际上是几何学版本
118
00:06:40,720 --> 00:06:43,360
的最大公约数问题
119
00:06:43,360 --> 00:06:46,600
地板的变长是两个数字
120
00:06:46,600 --> 00:06:48,880
而我们试图找到的方砖的尺寸
121
00:06:48,880 --> 00:06:52,120
是他们的最大公约数
122
00:06:54,360 --> 00:06:57,800
我们将逐步重现欧几里得的算法,来展示
123
00:06:57,800 --> 00:07:01,720
他是如何找到最适合这块地板的方砖的
124
00:07:03,200 --> 00:07:07,080
According to Euclid's Algorithm, we
need to start filling the rectangle
125
00:07:07,080 --> 00:07:11,160
with square tiles corresponding to
the smallest of the two dimensions.
126
00:07:14,000 --> 00:07:16,080
This is the first stage of the job.
127
00:07:17,400 --> 00:07:19,360
Euclid's Algorithm then tells us
128
00:07:19,360 --> 00:07:22,560
to do exactly the same thing again
with this rectangle.
129
00:07:24,360 --> 00:07:28,880
At each stage, the algorithm tells us
to select square tiles
130
00:07:28,880 --> 00:07:31,840
corresponding to the shortest
side of the rectangle.
131
00:07:33,880 --> 00:07:39,160
So this time, our square tiles
perfectly fill the leftover space.
132
00:07:39,160 --> 00:07:43,240
Now, my square tile has
dimensions 15x15.
133
00:07:43,240 --> 00:07:46,040
So Euclid's Algorithm tells us
134
00:07:46,040 --> 00:07:50,680
that the greatest common devisor
of 150 and 345 is 15.
135
00:07:53,560 --> 00:07:56,400
I'm not suggesting you use
Euclid's Algorithm every time
136
00:07:56,400 --> 00:07:58,640
you need to order some tiles,
137
00:07:58,640 --> 00:08:02,800
but the amazing thing is that this
simple step-by-step method
138
00:08:02,800 --> 00:08:06,600
finds the perfect square tile
whatever the dimensions of the floor.
139
00:08:08,080 --> 00:08:11,920
Euclid's Algorithm may appear to be
just a mathematical technique,
140
00:08:11,920 --> 00:08:16,680
but it very elegantly fulfils all
the criteria for an algorithm.
141
00:08:16,680 --> 00:08:20,440
It's a precisely stated
set of instructions, the procedure
142
00:08:20,440 --> 00:08:25,040
always finishes, and it can be
proven that it works in all cases.
143
00:08:28,640 --> 00:08:32,040
The power of algorithms is
that you don't have to reinvent
144
00:08:32,040 --> 00:08:36,760
the wheel each time. They're general
solutions to problems.
145
00:08:36,760 --> 00:08:40,160
This holds as true for ancient
algorithms as for modern ones.
146
00:08:45,640 --> 00:08:49,760
In 1998, in this
garage in Menlo Park in California,
147
00:08:49,760 --> 00:08:52,600
an important piece of algorithmic
history was made.
148
00:08:54,800 --> 00:08:58,840
Inside were two PHD
students from Stamford University.
149
00:08:58,840 --> 00:09:00,960
Larry Page and Sergey Brin.
150
00:09:02,640 --> 00:09:05,920
Their aim was to come up with
a search engine that could find
151
00:09:05,920 --> 00:09:08,680
things efficiently
on the World Wide Web.
152
00:09:11,320 --> 00:09:14,080
Out of these humble beginnings,
Google was born.
153
00:09:15,720 --> 00:09:19,080
But Google wouldn't be Google
if it wasn't for the algorithm that
154
00:09:19,080 --> 00:09:21,800
Larry and Sergey created,
called PageRank.
155
00:09:31,160 --> 00:09:34,360
PageRank was the algorithm
at the heart of the first
156
00:09:34,360 --> 00:09:37,480
incarnation of the Google
search engine.
157
00:09:37,480 --> 00:09:42,400
Now, technically, it's not a search
algorithm, but a ranking algorithm.
158
00:09:42,400 --> 00:09:45,880
So when you type
a query into a search engine,
159
00:09:45,880 --> 00:09:50,160
then there are literally millions
of pages which will match that query.
160
00:09:50,160 --> 00:09:53,920
What PageRank does is to rank
all of those pages so the one
161
00:09:53,920 --> 00:09:57,120
at the top is the one you're more
likely to be interested in.
162
00:09:58,840 --> 00:10:01,720
Larry and Sergey came up with
the idea to do PageRank
163
00:10:01,720 --> 00:10:06,120
and to use it as a ranking system to
improve the quality of web search.
164
00:10:06,120 --> 00:10:08,080
I remember myself at the time,
165
00:10:08,080 --> 00:10:10,960
you used a web search engine
like AltaVista.
166
00:10:10,960 --> 00:10:13,080
You would have to click
the Next Page link
167
00:10:13,080 --> 00:10:15,160
a lot of times to find
what you were looking for.
168
00:10:15,160 --> 00:10:17,560
PageRank was one of the reasons
why Google was
169
00:10:17,560 --> 00:10:20,960
so much better than the existing
search engines at the time.
170
00:10:22,120 --> 00:10:25,120
The inner workings of PageRank
are hidden from view
171
00:10:25,120 --> 00:10:26,960
on the World Wide Web.
172
00:10:26,960 --> 00:10:30,680
So to reveal how it does its job,
we're going to use the PageRank
173
00:10:30,680 --> 00:10:33,640
algorithm to rank
the players of a football team.
174
00:10:34,800 --> 00:10:37,200
PageRank looks at two things.
175
00:10:37,200 --> 00:10:42,280
It looks at the incoming links to
a web page, that is the other pages
176
00:10:42,280 --> 00:10:46,520
that link to the page, and it looks
at how important those pages are.
177
00:10:52,200 --> 00:10:55,120
In our demonstration to show
the cleverness of the PageRank
178
00:10:55,120 --> 00:10:59,560
algorithm, the players in the
football team are the web pages
179
00:10:59,560 --> 00:11:03,160
and the passes between them
are the web links.
180
00:11:03,160 --> 00:11:06,000
The input for the algorithm.
181
00:11:06,000 --> 00:11:09,520
Generally speaking, the PageRank
algorithm will give a higher
182
00:11:09,520 --> 00:11:13,520
rank to a website if it's got a lot
of links coming from other websites.
183
00:11:13,520 --> 00:11:16,320
So in the case of football,
if a player gets more
184
00:11:16,320 --> 00:11:20,280
passes from the rest of the team,
then they'll be ranked higher.
185
00:11:20,280 --> 00:11:21,960
It's not quite that simple.
186
00:11:21,960 --> 00:11:25,240
Because the PageRank algorithm
actually gives more weight to
187
00:11:25,240 --> 00:11:29,160
a link from a website that itself
has a high page rank.
188
00:11:29,160 --> 00:11:32,840
So actually, a pass from a popular
player is worth more than
189
00:11:32,840 --> 00:11:36,120
a pass from a player who's hardly
involved in the game at all.
190
00:11:37,400 --> 00:11:41,240
This is a visualisation
of the algorithm at work.
191
00:11:41,240 --> 00:11:46,160
The stats are the players' current
ranking. The output of the algorithm.
192
00:11:46,160 --> 00:11:50,520
And every time there's a pass,
these rankings are updated.
193
00:11:50,520 --> 00:11:56,640
When Google uses this algorithm, it
only changes once thing - the input.
194
00:11:56,640 --> 00:11:59,440
In place of passes,
it uses web links.
195
00:12:01,560 --> 00:12:04,560
Note that the importance of a page
depends on the importance
196
00:12:04,560 --> 00:12:06,760
of the pages that link to it.
197
00:12:06,760 --> 00:12:09,400
This means that you have to
compute page rank for all
198
00:12:09,400 --> 00:12:11,520
the pages at the same time.
199
00:12:11,520 --> 00:12:14,480
And you actually have to repeat
the computation because, each time,
200
00:12:14,480 --> 00:12:16,880
you'll update
the importance of all the pages.
201
00:12:16,880 --> 00:12:19,320
And that in turn will influence
202
00:12:19,320 --> 00:12:22,320
the importance of the pages
that those pages link to.
203
00:12:30,920 --> 00:12:34,040
At the end of the match,
the job of the algorithm is done.
204
00:12:37,040 --> 00:12:40,160
If we wanted to search for the key
player in the team,
205
00:12:40,160 --> 00:12:42,000
this is PageRank's answer.
206
00:12:44,080 --> 00:12:46,560
Player 11 has the highest
PageRank score.
207
00:12:48,600 --> 00:12:50,840
I think the PageRank algorithm
is probably
208
00:12:50,840 --> 00:12:52,880
my favourite algorithm of all time.
209
00:12:52,880 --> 00:12:55,240
And it's amazing that it can be
applied not just to
210
00:12:55,240 --> 00:12:58,800
the World Wide Web, but analysing
a football match, as well.
211
00:12:58,800 --> 00:13:01,600
But for me, it's the fact that
there's a beautiful bit of
212
00:13:01,600 --> 00:13:04,160
mathematics at its heart that always
seems to find
213
00:13:04,160 --> 00:13:06,160
the website I'm looking for.
214
00:13:08,320 --> 00:13:09,600
Within Google, I think
215
00:13:09,600 --> 00:13:14,520
PageRank is seen as a very important
part of Google's early development.
216
00:13:15,800 --> 00:13:18,880
PageRank was the secret to why
the search engine that Larry
217
00:13:18,880 --> 00:13:22,440
and Sergey built in the 1990s
was so successful.
218
00:13:24,240 --> 00:13:28,920
Now, Google handles over 3.5 billion
searches every day.
219
00:13:28,920 --> 00:13:32,280
It's the world's most poplar
search engine.
220
00:13:32,280 --> 00:13:36,680
And the company is worth more
than $450 billion.
221
00:13:37,840 --> 00:13:40,960
Not bad for two PhD students
working out of a garage.
222
00:13:49,280 --> 00:13:52,880
Algorithms are simple
step-by-step recipes.
223
00:13:52,880 --> 00:13:57,160
Inventing them requires incredible
creativity and genius.
224
00:13:57,160 --> 00:14:01,320
But using them is just
a matter of following instructions.
225
00:14:01,320 --> 00:14:04,760
And this is why algorithms
are perfect for computers.
226
00:14:08,520 --> 00:14:10,480
Computers are just machines.
227
00:14:10,480 --> 00:14:14,280
They just do repetitive
tasks at phenomenal speeds.
228
00:14:14,280 --> 00:14:15,840
Unbelievable speeds.
229
00:14:15,840 --> 00:14:20,360
So they're absolutely perfect for
performing these repetitive tasks
230
00:14:20,360 --> 00:14:23,400
that are unambiguously defined
231
00:14:23,400 --> 00:14:27,480
and can be done in
a finite amount of time.
232
00:14:29,360 --> 00:14:32,320
Computer code is basically
making an algorithm specific.
233
00:14:32,320 --> 00:14:34,080
So the algorithm is
the kind of idea.
234
00:14:34,080 --> 00:14:35,600
How would you solve the problem?
235
00:14:35,600 --> 00:14:37,960
These are the rough instructions
that you would use.
236
00:14:37,960 --> 00:14:40,960
And then that can be
translated into particular code.
237
00:14:44,200 --> 00:14:48,040
Lots of types of algorithms have been
created with a computer in mind.
238
00:14:50,080 --> 00:14:53,560
And some of the most
important are sorting algorithms.
239
00:14:55,160 --> 00:14:59,160
Now, the job of a sorting algorithm
is to put things in order.
240
00:14:59,160 --> 00:15:00,840
And they have lots of uses.
241
00:15:00,840 --> 00:15:04,000
For example, on the internet,
information gets
242
00:15:04,000 --> 00:15:08,960
broken down into packets of data
which then get sent across the web.
243
00:15:08,960 --> 00:15:11,280
Now, to reassemble that data,
244
00:15:11,280 --> 00:15:15,400
sorting algorithms are absolutely
crucial to putting this data
245
00:15:15,400 --> 00:15:19,040
back in the correct order
so that we can view the picture,
246
00:15:19,040 --> 00:15:21,720
or read the email
we've just been sent.
247
00:15:26,440 --> 00:15:30,280
This is the System Development
Corporation in California.
248
00:15:30,280 --> 00:15:35,840
It's considered to be the world's
first computer software company.
249
00:15:35,840 --> 00:15:40,960
And it was here in 1963 that two
computer scientists first formally
250
00:15:40,960 --> 00:15:44,560
wrote down one of the most iconic
sorting algorithms of all-time.
251
00:15:48,520 --> 00:15:50,600
It's called bubble sort.
252
00:15:50,600 --> 00:15:53,800
And here's an example
of bubble sort in action,
253
00:15:53,800 --> 00:15:56,080
sorting blocks instead of numbers.
254
00:15:58,000 --> 00:16:01,520
It gets its name because with
each round of the algorithm,
255
00:16:01,520 --> 00:16:05,600
the largest unsorted object
bubbles to the top.
256
00:16:05,600 --> 00:16:09,200
Like all our algorithms so far,
there's method in the madness.
257
00:16:14,960 --> 00:16:16,960
To see how this algorithm works,
258
00:16:16,960 --> 00:16:19,280
we're going to use it to
sort eight objects.
259
00:16:21,040 --> 00:16:25,000
Now, the bubble sort algorithm says
to consider the objects in pairs
260
00:16:25,000 --> 00:16:27,760
and swap them over
if they're in the wrong order.
261
00:16:27,760 --> 00:16:32,120
So we're going to start at this end
here and work our way to the top.
262
00:16:32,120 --> 00:16:36,080
So I consider these two, they're in
the wrong order, so I swap them over.
263
00:16:37,840 --> 00:16:40,280
Consider the next pair,
they're in the right order,
264
00:16:40,280 --> 00:16:42,600
so I leave them as they are.
265
00:16:42,600 --> 00:16:46,120
Consider this pair, they're
in the wrong order, so I swap them.
266
00:16:49,200 --> 00:16:51,200
And we just continue doing this.
267
00:16:58,440 --> 00:17:01,880
Now the bubble sort algorithm says
to go back to the beginning
268
00:17:01,880 --> 00:17:05,960
and repeat the process over and over
again until the objects are in order.
269
00:17:20,040 --> 00:17:24,400
The algorithm stops when there are
no pairs to swap round.
270
00:17:24,400 --> 00:17:28,160
So the bubble sort algorithm has
successfully done its job.
271
00:17:28,160 --> 00:17:31,000
I've now got the objects
perfectly ordered,
272
00:17:31,000 --> 00:17:32,760
according to ascending height.
273
00:17:34,440 --> 00:17:37,920
Bubble sort is elegantly simple
and straightforward.
274
00:17:37,920 --> 00:17:42,160
But if the scale of the sorting task
is huge, say, organising vast swathes
275
00:17:42,160 --> 00:17:45,920
of data, then there might be better
sorting algorithms for the job.
276
00:17:51,040 --> 00:17:52,960
This is John von Neumann,
277
00:17:52,960 --> 00:17:56,800
the scientific genius who helped
pioneer the modern computer,
278
00:17:56,800 --> 00:17:59,040
game theory, the atomic bomb
279
00:17:59,040 --> 00:18:02,400
and, as it turns out,
invented a sorting algorithm.
280
00:18:05,080 --> 00:18:08,360
He devised it to work on this,
one of the world's earliest
281
00:18:08,360 --> 00:18:12,080
electronic computers,
which he'd helped design.
282
00:18:12,080 --> 00:18:15,000
The algorithm is called merge sort.
283
00:18:17,080 --> 00:18:21,480
The merge sort algorithm works
on a principle of divide and conquer.
284
00:18:21,480 --> 00:18:26,440
And it consists of two parts.
The first bit is the dividing part.
285
00:18:28,840 --> 00:18:32,080
This involves splitting
everything into smaller groups.
286
00:18:35,480 --> 00:18:38,360
And now comes the conquering bit.
287
00:18:41,040 --> 00:18:43,920
The groups are now merged
back together.
288
00:18:43,920 --> 00:18:47,720
But as I merge the two groups,
I compare the sizes of the objects
289
00:18:47,720 --> 00:18:51,560
one pair at a time so that the merged
group becomes sorted.
290
00:19:00,720 --> 00:19:03,480
Now, the merge sort algorithm might
look rather similar to the
291
00:19:03,480 --> 00:19:07,520
bubble sort, but where it comes
into its own is that with a larger
292
00:19:07,520 --> 00:19:10,560
number of objects,
it's much, much faster.
293
00:19:10,560 --> 00:19:15,800
So let's see how merge sort compares
in speed to bubble sort.
294
00:19:15,800 --> 00:19:18,200
It's time for a battle
of the algorithms!
295
00:19:22,160 --> 00:19:26,320
Here we've got bubble sort on the
bottom and merge sort on the top.
296
00:19:26,320 --> 00:19:29,040
And we've got them
sorting 1,000 objects.
297
00:19:29,040 --> 00:19:32,120
Now, although they'll both produce
the same end result,
298
00:19:32,120 --> 00:19:35,520
you can already see merge sort
is getting there much faster.
299
00:19:35,520 --> 00:19:39,040
And this difference in performance
gets more pronounced
300
00:19:39,040 --> 00:19:41,320
the more objects
they're asked to sort.
301
00:19:53,280 --> 00:19:55,360
LAUGHTER
302
00:19:57,800 --> 00:19:59,840
Well, er...
303
00:19:59,840 --> 00:20:03,160
I'm sorry, maybe...
No, no, no, no, no.
304
00:20:03,160 --> 00:20:05,280
I-I think...I think, er...
305
00:20:05,280 --> 00:20:08,600
I think the bubble sort
would be the wrong way to go.
306
00:20:08,600 --> 00:20:10,360
LAUGHTER
307
00:20:10,360 --> 00:20:11,880
APPLAUSE
308
00:20:12,960 --> 00:20:15,520
Come on. Who told him this?
309
00:20:22,800 --> 00:20:25,000
Merge sort beats bubble sort
hands down
310
00:20:25,000 --> 00:20:27,000
for sorting large amounts of data.
311
00:20:28,840 --> 00:20:31,480
But in the crazy world of algorithms,
there are many,
312
00:20:31,480 --> 00:20:33,720
many different ways to sort.
313
00:20:36,240 --> 00:20:37,920
At the last count,
314
00:20:37,920 --> 00:20:41,360
there were over 20 different
types of sorting algorithms.
315
00:20:43,200 --> 00:20:46,920
All weirdly achieving the same
result, but by different means.
316
00:20:58,560 --> 00:21:02,960
So there's bubble sort,
there's merge sort. Insertion sort.
317
00:21:02,960 --> 00:21:06,760
There's heap sort,
there's quick sort. Timsort.
318
00:21:06,760 --> 00:21:08,160
You've got gnome sort.
319
00:21:08,160 --> 00:21:11,120
There's pigeonhole sort, which
is also called radix sort.
320
00:21:11,120 --> 00:21:13,640
There's bogosort,
which might never finish.
321
00:21:19,680 --> 00:21:23,560
There's no such thing as the best
sorting algorithm.
322
00:21:23,560 --> 00:21:25,680
Each has its own pros and cons.
323
00:21:26,920 --> 00:21:28,400
And which one gets used
324
00:21:28,400 --> 00:21:31,280
often depends on the specifics
of the problem.
325
00:21:33,080 --> 00:21:36,960
I think the beauty of studying
algorithms is to try to aspire
326
00:21:36,960 --> 00:21:40,760
for solutions that are as
elegant and efficient as possible.
327
00:21:40,760 --> 00:21:44,960
I actually think bubble sort's
very pretty. I like it.
328
00:21:44,960 --> 00:21:46,520
Merge sort's beautiful.
329
00:21:49,800 --> 00:21:52,120
We really couldn't live
without them.
330
00:21:52,120 --> 00:21:55,000
Sorting algorithms bring
order to the world.
331
00:22:05,520 --> 00:22:08,240
So far,
we've seen algorithms tackle the tiny
332
00:22:08,240 --> 00:22:11,520
problems of sizing our bathroom tiles
and sorting our data.
333
00:22:13,200 --> 00:22:16,280
But how well do they cope with
the messy world of love?
334
00:22:18,360 --> 00:22:21,200
Online dating is really
popular these days.
335
00:22:21,200 --> 00:22:23,920
In fact, one survey suggests that
over a third
336
00:22:23,920 --> 00:22:26,560
of recent marriages started online.
337
00:22:27,720 --> 00:22:31,080
How these dating websites work
is that they use something called
338
00:22:31,080 --> 00:22:33,320
a matching algorithm.
339
00:22:33,320 --> 00:22:36,480
They search through the profiles,
try to match people up according
340
00:22:36,480 --> 00:22:40,640
to their likes and dislikes,
personality traits and so on.
341
00:22:40,640 --> 00:22:43,480
In fact, the algorithms
seem to be better than humans.
342
00:22:43,480 --> 00:22:46,760
Because recent research has shown
those who meet online
343
00:22:46,760 --> 00:22:49,400
tend to be happier
and have longer marriages.
344
00:22:52,680 --> 00:22:56,960
I'll ask you to receive your prizes
from His Majesty the King.
345
00:22:56,960 --> 00:23:01,440
In fact, matching algorithms
have quite a lot to brag about.
346
00:23:01,440 --> 00:23:06,080
Because in 2012, for the first time,
a Nobel Prize was awarded
347
00:23:06,080 --> 00:23:08,120
because of an algorithm.
348
00:23:08,120 --> 00:23:11,560
A matching algorithm
created by the late David Gale
349
00:23:11,560 --> 00:23:13,800
and mathematician Lloyd Shapley,
350
00:23:13,800 --> 00:23:16,440
seen here receiving his share
of the prize.
351
00:23:20,320 --> 00:23:24,040
The story begins in the 1960s when
Gale and Shapley wanted to
352
00:23:24,040 --> 00:23:28,120
solve a problem concerned with
college admissions.
353
00:23:28,120 --> 00:23:32,080
How to match up students to colleges
so that everyone got a place.
354
00:23:33,200 --> 00:23:35,680
But, more importantly,
was happy, even if
355
00:23:35,680 --> 00:23:37,720
they didn't get their first choice.
356
00:23:40,800 --> 00:23:44,480
They called it
the stable marriage problem.
357
00:23:44,480 --> 00:23:46,960
The stable marriage problem
goes like this.
358
00:23:46,960 --> 00:23:49,360
Suppose you've got four women
and four men
359
00:23:49,360 --> 00:23:51,320
and they want to get married.
360
00:23:51,320 --> 00:23:54,280
Now, they've ranked each other
according to their preferences.
361
00:23:54,280 --> 00:23:56,200
So, for example,
the Queen of Hearts here,
362
00:23:56,200 --> 00:23:58,240
first choice is the King of Clubs.
363
00:23:58,240 --> 00:24:00,360
Second choice, King of Diamonds,
364
00:24:00,360 --> 00:24:03,160
and her last choice is
the King of Hearts.
365
00:24:03,160 --> 00:24:06,440
So the challenge here is to play
Cupid and pair up the kings
366
00:24:06,440 --> 00:24:10,160
and queens so that each one gets
a partner, but, more importantly,
367
00:24:10,160 --> 00:24:12,840
so that the marriages are stable.
368
00:24:12,840 --> 00:24:15,960
A stable marriage means that the
kings and queens don't
369
00:24:15,960 --> 00:24:21,000
necessarily get their first choice,
but they get the best on offer.
370
00:24:21,000 --> 00:24:25,520
For example, if I paired the King
of Hearts and the Queen of Hearts
371
00:24:25,520 --> 00:24:28,520
and the King of Spades
and the Queen of Spades,
372
00:24:28,520 --> 00:24:31,320
this would be an unstable marriage.
373
00:24:31,320 --> 00:24:34,760
Because the King of Spades doesn't
really like the Queen of Spades.
374
00:24:34,760 --> 00:24:36,880
He'd prefer the Queen of Hearts.
375
00:24:38,360 --> 00:24:40,360
The Queen of Hearts, in her turn,
376
00:24:40,360 --> 00:24:42,280
doesn't really like
the King of Hearts.
377
00:24:42,280 --> 00:24:45,120
She'd prefer the King of Spades.
378
00:24:45,120 --> 00:24:48,320
So these two are going to
run off together in this pairing.
379
00:24:52,240 --> 00:24:56,800
Where there's a problem,
there's an algorithm not far behind.
380
00:24:56,800 --> 00:24:59,400
In 1962, Gale and Shapley
came up with
381
00:24:59,400 --> 00:25:03,120
their Nobel-Prize-winning algorithm.
382
00:25:03,120 --> 00:25:09,880
A step-by-step recipe which always
finds perfectly-stable marriages.
383
00:25:09,880 --> 00:25:11,520
So in the first
round of the algorithm,
384
00:25:11,520 --> 00:25:14,680
the queens all proposed to
their first-choice kings.
385
00:25:14,680 --> 00:25:18,960
So the Queen of Spades'
first choice is the King of Spades.
386
00:25:18,960 --> 00:25:21,480
She proposes to the King of Spades.
387
00:25:21,480 --> 00:25:24,600
The Queen of Hearts'
first choice is the King of Clubs,
388
00:25:24,600 --> 00:25:27,080
so she proposes to the King of Clubs.
389
00:25:27,080 --> 00:25:30,680
The Queen of Diamonds'
first choice is the King of Spades.
390
00:25:30,680 --> 00:25:33,600
And the Queen of Clubs' first choice
is also the King of Spades.
391
00:25:33,600 --> 00:25:36,800
So King of Spades seems to be
the Darcy of this royal court.
392
00:25:38,040 --> 00:25:40,720
Now, the King of Spades has
got three proposals.
393
00:25:42,000 --> 00:25:45,160
So he chooses his most popular queen,
394
00:25:45,160 --> 00:25:48,840
who is actually the Queen of
Diamonds, and rejects the other two.
395
00:25:51,720 --> 00:25:55,920
So we have two provisional
engagements, two rejections.
396
00:25:55,920 --> 00:25:59,560
We now remove the rejected
queen's first choices.
397
00:25:59,560 --> 00:26:01,280
And it's time for round two.
398
00:26:02,840 --> 00:26:07,280
So the Queen of Spades is going to
propose to the King of Diamonds.
399
00:26:07,280 --> 00:26:10,400
And the Queen of Clubs
proposes to the King of Clubs.
400
00:26:11,840 --> 00:26:14,520
But now the King of Clubs has
got two proposals
401
00:26:14,520 --> 00:26:17,680
and actually prefers
the Queen of Clubs.
402
00:26:17,680 --> 00:26:20,560
So he rejects the Queen of Hearts,
his provisional
403
00:26:20,560 --> 00:26:23,160
engagement on the first round of the
algorithm,
404
00:26:23,160 --> 00:26:24,640
and we have to start again.
405
00:26:26,280 --> 00:26:28,440
In each round, the rejected queens
406
00:26:28,440 --> 00:26:31,680
propose to the next king
on their list.
407
00:26:31,680 --> 00:26:34,680
And the kings always
go for the best offer they get.
408
00:26:35,920 --> 00:26:40,280
In this round of the algorithm,
she proposes to the King of Hearts
409
00:26:40,280 --> 00:26:44,320
and finally, everyone's paired up
with a single queen and king
410
00:26:44,320 --> 00:26:46,160
and all the marriages are stable.
411
00:26:49,400 --> 00:26:53,760
The Gale-Shapley algorithm is now
used all over the world.
412
00:26:53,760 --> 00:26:57,120
In Denmark,
to match children to day-care places.
413
00:26:57,120 --> 00:27:00,360
In Hungary,
to match students to schools.
414
00:27:00,360 --> 00:27:03,720
In New York, to allocate
rabbis to synagogues.
415
00:27:03,720 --> 00:27:07,600
And in China, Germany and Spain,
to match students to universities.
416
00:27:10,760 --> 00:27:13,840
Whilst in the UK,
it's led to the development
417
00:27:13,840 --> 00:27:18,600
of a matching algorithm that, for
some people, has saved their lives.
418
00:27:23,360 --> 00:27:27,080
At the age of 20, Seraya
in south London was diagnosed
419
00:27:27,080 --> 00:27:31,360
with a chronic kidney disease
and told she needed a transplant.
420
00:27:33,200 --> 00:27:37,280
I was on dialysis for 18 months
and very unwell.
421
00:27:37,280 --> 00:27:40,520
I couldn't go to work.
I had no social life.
422
00:27:40,520 --> 00:27:44,400
It was literally hospital three
times a week for treatment and home.
423
00:27:45,720 --> 00:27:48,240
A close friend was willing to donate,
424
00:27:48,240 --> 00:27:51,080
but their tissue types
were not compatible.
425
00:27:53,720 --> 00:27:56,160
In St Albans, Tamir was seriously ill
426
00:27:56,160 --> 00:27:59,080
and his wife, Lyndsey,
wanted to donate.
427
00:27:59,080 --> 00:28:00,800
But they had the same problem.
428
00:28:02,280 --> 00:28:05,120
We went through all the blood tests
and all the workup
429
00:28:05,120 --> 00:28:08,280
and it turned out that we were
incompatible blood groups.
430
00:28:10,600 --> 00:28:13,440
Often, kidney patients
who are fortunate enough
431
00:28:13,440 --> 00:28:16,360
to have a would-be donor find
there's a mismatch
432
00:28:16,360 --> 00:28:19,080
between their donor's
blood group or tissue type.
433
00:28:21,000 --> 00:28:26,640
But since 2007, the NHS has been
using a special matching algorithm
434
00:28:26,640 --> 00:28:29,400
to find potential matches
for willing donors
435
00:28:29,400 --> 00:28:31,680
to kidney patients all over the UK.
436
00:28:35,640 --> 00:28:37,920
When we first looked at
this problem,
437
00:28:37,920 --> 00:28:41,640
we really underestimated
the complexity.
438
00:28:41,640 --> 00:28:46,640
And originally, we just started
with swaps between two pairs.
439
00:28:46,640 --> 00:28:48,400
So it was very simple,
440
00:28:48,400 --> 00:28:53,240
but it soon became obvious that we
needed something much more complex.
441
00:28:57,200 --> 00:29:00,280
I became in touch with
Rachel Johnson at the NHS
442
00:29:00,280 --> 00:29:03,000
and we then got involved at that
stage in being able to design
443
00:29:03,000 --> 00:29:05,840
algorithms which would allow
not just pair-wise exchanges,
444
00:29:05,840 --> 00:29:08,280
but also exchanges among
three couples, as well.
445
00:29:10,400 --> 00:29:13,320
The algorithm considers
several scenarios.
446
00:29:13,320 --> 00:29:15,680
The simplest is a two-way swap
447
00:29:15,680 --> 00:29:18,600
with two couples exchanging kidneys.
448
00:29:21,840 --> 00:29:24,120
More complicated is a three-way swap,
449
00:29:24,120 --> 00:29:26,960
where the kidneys get passed around
in a cycle.
450
00:29:30,240 --> 00:29:35,280
There are 200 patients
in each of our matching runs.
451
00:29:35,280 --> 00:29:39,120
We need to look for
all the possible transplants.
452
00:29:40,560 --> 00:29:42,720
And it's surprising
how many there are.
453
00:29:42,720 --> 00:29:44,720
There are literally, you know,
hundreds,
454
00:29:44,720 --> 00:29:47,320
sometimes thousands
of possibilities.
455
00:29:47,320 --> 00:29:51,600
It's something that just could not
be achieved without the algorithm.
456
00:29:53,360 --> 00:29:57,440
One day, Seraya received the call
that a match had been found
457
00:29:57,440 --> 00:30:02,440
400 miles away with Linda, a donor
living in Bowness near Edinburgh.
458
00:30:04,000 --> 00:30:07,040
My husband's dad needed
a new kidney.
459
00:30:07,040 --> 00:30:11,520
He'd been ill for a bit of time.
And I wasn't a perfect match.
460
00:30:11,520 --> 00:30:17,200
And I then got a phone call
and it was all go from there.
461
00:30:19,320 --> 00:30:21,200
We got the initial phone call saying
462
00:30:21,200 --> 00:30:23,840
we'd been matched up
in the three-way pool.
463
00:30:23,840 --> 00:30:26,840
You're just nervous
that it's not going to go ahead
464
00:30:26,840 --> 00:30:28,480
because your life depends on it.
465
00:30:30,240 --> 00:30:31,920
For the matching couples,
466
00:30:31,920 --> 00:30:35,360
all the operations had to happen
simultaneously.
467
00:30:35,360 --> 00:30:38,600
It was a major logistical challenge.
468
00:30:38,600 --> 00:30:41,640
When my donor went to theatre,
they called over to check
469
00:30:41,640 --> 00:30:44,880
that my donor was also in Newcastle
going to theatre.
470
00:30:44,880 --> 00:30:47,240
And they both got it
at the exact same time.
471
00:30:47,240 --> 00:30:49,640
And they make the call
and the kidneys come out.
472
00:30:49,640 --> 00:30:51,480
I think they went by motorbike.
473
00:30:51,480 --> 00:30:53,480
We were told
they might go by helicopter,
474
00:30:53,480 --> 00:30:56,920
so I thought at least one bit of me
might have been in a helicopter,
475
00:30:56,920 --> 00:30:59,160
but, no, it went by motorbike.
476
00:31:03,200 --> 00:31:06,480
And it eventually went ahead,
thankfully, in December.
477
00:31:06,480 --> 00:31:09,440
The best Christmas present. Hm!
478
00:31:09,440 --> 00:31:12,720
Personally, I just imagined
it was doctors behind there
479
00:31:12,720 --> 00:31:15,160
matching people up off this list.
480
00:31:15,160 --> 00:31:17,960
So, yeah, it's a bit strange
481
00:31:17,960 --> 00:31:20,520
that it comes down to maths
at the end of the day.
482
00:31:20,520 --> 00:31:24,000
It's a great scheme
and it's still fairly recent.
483
00:31:24,000 --> 00:31:27,440
And many years ago,
I wouldn't have had this chance.
484
00:31:27,440 --> 00:31:31,760
I feel a lot of gratitude to Linda
and also to the algorithm.
485
00:31:31,760 --> 00:31:33,640
So, yeah, I'm very grateful.
486
00:31:34,960 --> 00:31:39,960
So far, more than 400 patients have
benefited from the NHS scheme
487
00:31:39,960 --> 00:31:42,800
and its special matching algorithm.
488
00:31:42,800 --> 00:31:45,120
It was only
when we actually seen media articles
489
00:31:45,120 --> 00:31:47,480
and we actually started to think,
"Oh, hold on,
490
00:31:47,480 --> 00:31:49,760
"that person might have actually
had that match
491
00:31:49,760 --> 00:31:53,440
"through the October matching run's
pair-wise exchange," and so on,
492
00:31:53,440 --> 00:31:55,560
that you actually start to see
the stories
493
00:31:55,560 --> 00:31:57,480
that are behind the anonymous data.
494
00:31:57,480 --> 00:32:00,840
It's quite funny because
David's always really concerned
495
00:32:00,840 --> 00:32:03,680
that the algorithm will take
a long time to run.
496
00:32:03,680 --> 00:32:07,560
And, you know, it's been up to
30 minutes and he gets concerned.
497
00:32:07,560 --> 00:32:10,680
But actually, 30 minutes,
you know, to us,
498
00:32:10,680 --> 00:32:14,320
it's incredible that it can do
all of that in 30 minutes.
499
00:32:25,320 --> 00:32:29,560
So far, we have seen how algorithms
are capable of amazing feats.
500
00:32:30,720 --> 00:32:33,840
From solving abstract
mathematical problems
501
00:32:33,840 --> 00:32:37,600
to helping us find stuff
on the World Wide Web.
502
00:32:37,600 --> 00:32:41,520
And they key thing for all of
these algorithms is their speed.
503
00:32:41,520 --> 00:32:44,800
So the important feature
of a good algorithm is first
504
00:32:44,800 --> 00:32:47,680
that it'd better be correct,
but once you know it's correct,
505
00:32:47,680 --> 00:32:49,680
it's also important
that it runs quickly.
506
00:32:49,680 --> 00:32:52,880
There's no good having an algorithm
that takes longer
507
00:32:52,880 --> 00:32:57,200
than your lifetime to run if
you're wanting the result tomorrow.
508
00:32:58,600 --> 00:33:02,960
This face-detection algorithm is
an example of an efficient algorithm.
509
00:33:02,960 --> 00:33:06,040
Because it's efficient,
it's able to run in real time.
510
00:33:06,040 --> 00:33:07,920
And that's what makes it useful.
511
00:33:09,920 --> 00:33:14,480
But just as in real life,
some problems are harder than others.
512
00:33:14,480 --> 00:33:17,680
Every now and then,
algorithms meet their match.
513
00:33:19,560 --> 00:33:22,240
I think the most common
misconception about algorithms
514
00:33:22,240 --> 00:33:24,600
is just that
algorithms can do anything.
515
00:33:24,600 --> 00:33:27,520
I think people don't really
know about the limits.
516
00:33:27,520 --> 00:33:30,920
Some problems simply cannot be
solved by efficient algorithms.
517
00:33:32,880 --> 00:33:37,160
There are some places where
efficient algorithms cannot go.
518
00:33:37,160 --> 00:33:40,280
Lines in the sand
that can't be crossed.
519
00:33:40,280 --> 00:33:43,480
The trouble is knowing
which problems they can solve
520
00:33:43,480 --> 00:33:44,840
and which they can't.
521
00:33:48,400 --> 00:33:51,600
Take this Rubik's Cube and imagine
the more general challenge
522
00:33:51,600 --> 00:33:54,280
of trying to solve a cube
of arbitrary dimensions.
523
00:33:54,280 --> 00:33:57,280
So, for example,
with 50 squares down each side.
524
00:33:57,280 --> 00:33:58,840
Now, you might expect this
525
00:33:58,840 --> 00:34:01,920
to be one of the really
fiendishly difficult problems,
526
00:34:01,920 --> 00:34:04,280
but actually,
it belongs in the easy camp.
527
00:34:04,280 --> 00:34:08,200
We know an algorithm that can
solve the general Rubik's Cube
528
00:34:08,200 --> 00:34:10,000
in a reasonable amount of time.
529
00:34:13,560 --> 00:34:14,960
Although it looks hard,
530
00:34:14,960 --> 00:34:18,120
this problem can be
cracked by efficient algorithms.
531
00:34:23,080 --> 00:34:25,480
However,
here's one that definitely can't.
532
00:34:27,680 --> 00:34:30,640
Imagine you've got
a draughts board of arbitrary size
533
00:34:30,640 --> 00:34:33,040
and an arrangement of pieces
on the board.
534
00:34:33,040 --> 00:34:34,680
The challenge is to work out
535
00:34:34,680 --> 00:34:38,520
whether white can force a win
from this position.
536
00:34:38,520 --> 00:34:40,360
Now, draughts is a pretty easy game,
537
00:34:40,360 --> 00:34:42,680
but it's been mathematically proven
538
00:34:42,680 --> 00:34:46,880
that there's no algorithm that
can solve this problem efficiently.
539
00:34:46,880 --> 00:34:49,240
It's an inherently
difficult problem.
540
00:34:51,440 --> 00:34:55,920
The only way to solve this puzzle
is through sheer hard slog -
541
00:34:55,920 --> 00:34:58,480
working out
all the millions of possibilities.
542
00:35:00,360 --> 00:35:05,080
So this problem lies firmly beyond
the reach of efficient algorithms.
543
00:35:05,080 --> 00:35:06,760
It can't be solved quickly.
544
00:35:10,600 --> 00:35:14,880
But for some problems,
how hard they are is not clear cut.
545
00:35:14,880 --> 00:35:19,320
This is a large sudoku.
It's got 625 squares.
546
00:35:20,600 --> 00:35:24,680
One of the nice things about sudoku
is that once you've found a solution,
547
00:35:24,680 --> 00:35:28,320
it's relatively straightforward
to check whether or not it's right.
548
00:35:28,320 --> 00:35:30,520
And this is true
however large the puzzle.
549
00:35:32,640 --> 00:35:35,080
In this case,
I've just got to check each row,
550
00:35:35,080 --> 00:35:38,560
column and block doesn't feature
a number twice.
551
00:35:38,560 --> 00:35:42,520
Sudoku belongs to a very special
category of problems
552
00:35:42,520 --> 00:35:45,160
that all share this characteristic.
553
00:35:45,160 --> 00:35:49,040
Once you've come up with a solution,
it's always easy to check it.
554
00:35:50,160 --> 00:35:53,440
The mystery is whether
there's an efficient algorithm
555
00:35:53,440 --> 00:35:55,720
to find the solution
in the first place.
556
00:35:58,640 --> 00:36:02,800
And sudoku is not alone.
There are lots of problems like this.
557
00:36:02,800 --> 00:36:05,320
The most intensely studied
of them all
558
00:36:05,320 --> 00:36:08,680
is known as the travelling salesman
problem.
559
00:36:13,600 --> 00:36:17,240
A travelling salesman travels
door to door, city to city,
560
00:36:17,240 --> 00:36:20,680
selling anything from brushes
and Hoovers to double-glazing.
561
00:36:22,760 --> 00:36:25,280
It sounds like a straightforward job.
562
00:36:25,280 --> 00:36:29,080
But all travelling salesmen
face the same question.
563
00:36:29,080 --> 00:36:31,800
What's the shortest route to take?
564
00:36:33,800 --> 00:36:37,680
So important is this problem
that the Clay Mathematics Institute
565
00:36:37,680 --> 00:36:42,400
has offered $1 million for whoever
can find an efficient algorithm,
566
00:36:42,400 --> 00:36:44,720
or prove that none exists.
567
00:36:46,680 --> 00:36:49,200
The travelling salesman problem
goes like this.
568
00:36:49,200 --> 00:36:50,800
Imagine you're a salesman
569
00:36:50,800 --> 00:36:55,400
and you've got to visit a list of
cities represented by the red dots.
570
00:36:55,400 --> 00:36:57,920
The challenge is to find
the shortest route
571
00:36:57,920 --> 00:37:02,320
so you visit each city once before
returning to your starting point.
572
00:37:02,320 --> 00:37:04,800
Now, you might imagine
the best thing is
573
00:37:04,800 --> 00:37:07,720
to just consider all the routes,
like this.
574
00:37:14,240 --> 00:37:18,800
The method of checking all
possibilities is a type of algorithm.
575
00:37:18,800 --> 00:37:20,720
And for three cities, it works fine
576
00:37:20,720 --> 00:37:23,840
because there are only
three possible routes to check.
577
00:37:27,360 --> 00:37:30,400
But what if we add
two more cities to the list?
578
00:37:33,240 --> 00:37:36,560
With five cities, there are
60 different possible routes.
579
00:37:39,440 --> 00:37:44,320
And if we add another city,
then there are 360 possible routes.
580
00:37:44,320 --> 00:37:49,600
And for ten cities, there are
over 1.8 million possible routes.
581
00:37:49,600 --> 00:37:51,880
If our algorithm
chugged through them,
582
00:37:51,880 --> 00:37:55,000
checking all of these
at a rate of ten per second,
583
00:37:55,000 --> 00:37:58,600
it would take two days
before it found the shortest.
584
00:37:58,600 --> 00:38:02,040
So you can see a method of trying
all the different possibilities,
585
00:38:02,040 --> 00:38:06,600
a kind of brute-force algorithm, if
you like, is just simply impractical.
586
00:38:08,000 --> 00:38:11,160
If somebody found a fast algorithm
for the travelling salesman problem,
587
00:38:11,160 --> 00:38:12,600
it would be hugely significant.
588
00:38:12,600 --> 00:38:15,480
If one of my students came up
with an efficient algorithm
589
00:38:15,480 --> 00:38:17,520
for the travelling salesman problem,
590
00:38:17,520 --> 00:38:20,560
I would get him to explain it to me,
591
00:38:20,560 --> 00:38:23,440
I would kill him
and then I'd go and claim
592
00:38:23,440 --> 00:38:25,960
the Clay prize, $1 million.
593
00:38:25,960 --> 00:38:28,560
But I think my students are safe.
594
00:38:29,920 --> 00:38:32,920
The problem crops up
in lots of areas.
595
00:38:32,920 --> 00:38:35,200
From soldering circuit boards...
596
00:38:37,640 --> 00:38:40,960
..to planning the routes
for supermarket deliveries.
597
00:38:40,960 --> 00:38:45,520
But has the travelling salesman
problem secretly already been solved?
598
00:38:50,240 --> 00:38:54,360
A team of scientists working
at Rothamsted Research in Harpenden
599
00:38:54,360 --> 00:38:57,720
have turned to nature to see
if it has found the answer.
600
00:39:03,440 --> 00:39:06,480
They're carrying out an elaborate
experiment to study
601
00:39:06,480 --> 00:39:10,520
how the travelling salesman problem
is tackled by the bumblebee.
602
00:39:13,760 --> 00:39:17,920
Bees have to forage for nectar
in order to provision their hive.
603
00:39:17,920 --> 00:39:20,200
And so they have to visit
604
00:39:20,200 --> 00:39:22,760
possibly hundreds of flowers
on each trip.
605
00:39:22,760 --> 00:39:25,560
What they want to do is
find an efficient way
606
00:39:25,560 --> 00:39:28,200
to go between all these flowers
that they visit.
607
00:39:31,600 --> 00:39:35,920
The humble bumblebee faces its own
travelling salesman problem.
608
00:39:35,920 --> 00:39:38,680
The flowers are just like the cities.
609
00:39:38,680 --> 00:39:41,760
And the bee is
the travelling salesman.
610
00:39:41,760 --> 00:39:45,840
One bee will go out foraging
many, many times every day.
611
00:39:45,840 --> 00:39:47,640
So over the course of a day,
612
00:39:47,640 --> 00:39:52,000
it really helps to take
the most efficient possible route.
613
00:39:52,000 --> 00:39:54,240
So what we're doing
is trying to figure out
614
00:39:54,240 --> 00:39:58,200
exactly what rules they're using
to narrow down the possibilities.
615
00:40:00,720 --> 00:40:04,360
Joe has laid out five feeders
which play the role of flowers.
616
00:40:05,800 --> 00:40:10,440
Each feeder has just enough nectar to
ensure the bee has to visit all five
617
00:40:10,440 --> 00:40:12,520
to give it a full honey stomach.
618
00:40:13,800 --> 00:40:16,560
And how are you actually knowing
where it's going?
619
00:40:16,560 --> 00:40:19,200
For this,
we're using a harmonic radar.
620
00:40:19,200 --> 00:40:22,560
So as that spins round and round,
it's emitting a radar signal.
621
00:40:22,560 --> 00:40:25,520
And we've attached a small antenna
to the back of the bee,
622
00:40:25,520 --> 00:40:28,160
which then reflects the signal
from the radar.
623
00:40:28,160 --> 00:40:31,400
And so this allows us to see
exactly where the bee has gone
624
00:40:31,400 --> 00:40:33,000
as she moves around the field.
625
00:40:34,520 --> 00:40:38,280
So, how does the bumblebee tackle
the travelling salesman problem?
626
00:40:38,280 --> 00:40:40,320
OK, we're switching it on now.
627
00:40:47,360 --> 00:40:51,920
With five feeders, there are
a total of 60 possible routes.
628
00:40:51,920 --> 00:40:54,680
The shortest
is around the outer edge.
629
00:40:58,320 --> 00:41:02,760
This heat map shows the path
taken by a single bee.
630
00:41:02,760 --> 00:41:06,480
At first, it's simply discovering
the positions of the feeders.
631
00:41:08,160 --> 00:41:12,640
Then the bee appears to methodically
change different parts of the route
632
00:41:12,640 --> 00:41:14,880
to see if it can make it shorter.
633
00:41:17,200 --> 00:41:20,920
Within 20 trips,
it's honed in on an efficient route.
634
00:41:26,720 --> 00:41:30,080
This route is not always
the absolute shortest,
635
00:41:30,080 --> 00:41:31,920
but, for the bee, it's good enough.
636
00:41:36,720 --> 00:41:40,320
That's amazing that just after
a very few tries, they've got
637
00:41:40,320 --> 00:41:44,320
to something which is efficient
enough for them to do their foraging.
638
00:41:44,320 --> 00:41:48,200
Yes, that's right. They can't spend
days or even, you know,
639
00:41:48,200 --> 00:41:50,840
it could take months or years
to try every possibility.
640
00:41:50,840 --> 00:41:53,240
So they have to very quickly
find a route
641
00:41:53,240 --> 00:41:56,040
that they can do again
and again and again
642
00:41:56,040 --> 00:42:00,080
in order to efficiently
provide food. Fantastic.
643
00:42:00,080 --> 00:42:02,280
I think the bee's become
my favourite insect now.
644
00:42:02,280 --> 00:42:05,680
It's obviously a mathematician
at heart. Absolutely.
645
00:42:07,200 --> 00:42:11,960
Let's be clear. Bees are not about
to be awarded $1 million.
646
00:42:11,960 --> 00:42:15,400
They've not miraculously solved
the travelling salesman problem
647
00:42:15,400 --> 00:42:18,240
because they don't always find
the shortest route.
648
00:42:19,720 --> 00:42:22,000
But their algorithm is
a clever approach.
649
00:42:22,000 --> 00:42:25,360
In maths, it's known as heuristics.
650
00:42:25,360 --> 00:42:29,560
Algorithms that are efficient,
that don't find the perfect solution,
651
00:42:29,560 --> 00:42:31,240
but get as close as they can.
652
00:42:44,800 --> 00:42:47,000
The same heuristic approach
653
00:42:47,000 --> 00:42:50,160
has been used to develop
an algorithm for Heathrow airport.
654
00:42:51,680 --> 00:42:54,320
DISPATCHER: 'Clear for takeoff...'
655
00:42:54,320 --> 00:42:58,120
Heathrow handles
over 1,300 flights a day.
656
00:42:58,120 --> 00:43:00,280
It's Europe's busiest airport.
657
00:43:00,280 --> 00:43:04,880
'..430 clear for takeoff. Surface
wind 247 degrees at three knots.'
658
00:43:13,080 --> 00:43:15,360
The challenge for air traffic control
659
00:43:15,360 --> 00:43:18,960
is to maximise the number of aircraft
departing every hour
660
00:43:18,960 --> 00:43:23,080
and ensure that the airport operates
both efficiently and safely.
661
00:43:23,080 --> 00:43:29,760
'..behind the British Airways 747,
line up 27 right behind.'
662
00:43:29,760 --> 00:43:33,800
One of the key decisions
is the order of takeoff.
663
00:43:33,800 --> 00:43:37,000
We're currently departing
a group of medium aircraft,
664
00:43:37,000 --> 00:43:40,000
which will be separated
one minute apart.
665
00:43:40,000 --> 00:43:43,600
Behind that, then, you can see
a 747, which is a large aircraft.
666
00:43:45,120 --> 00:43:48,440
Medium aircraft need to be
separated from the turbulence
667
00:43:48,440 --> 00:43:50,640
produced by larger aircraft.
668
00:43:50,640 --> 00:43:52,880
So the ordering of sizes is crucial.
669
00:43:54,120 --> 00:43:56,400
The ideal sequence for takeoff
involves
670
00:43:56,400 --> 00:43:59,120
really blocking together
groups of aircraft.
671
00:43:59,120 --> 00:44:01,440
So you want large aircraft
to be grouped together,
672
00:44:01,440 --> 00:44:03,760
medium aircraft
to be grouped together.
673
00:44:03,760 --> 00:44:05,480
And that allows the separation
674
00:44:05,480 --> 00:44:07,880
between those aircraft
to be minimised.
675
00:44:10,960 --> 00:44:14,400
The other factor that needs to be
considered where planning takeoff
676
00:44:14,400 --> 00:44:16,200
is where the planes are heading.
677
00:44:20,240 --> 00:44:22,600
We want one to be going
to the north, one to the south,
678
00:44:22,600 --> 00:44:24,680
the next going to the north,
then the south.
679
00:44:24,680 --> 00:44:29,320
If all the aircraft were going in
the same direction, the separation
would be much greater
680
00:44:29,320 --> 00:44:31,880
and we wouldn't use the runways
as efficiently.
681
00:44:31,880 --> 00:44:34,920
All controllers are sitting
in the control towers thinking,
682
00:44:34,920 --> 00:44:38,160
"I've all these aircraft going
north, all these going south.
683
00:44:38,160 --> 00:44:39,920
"I've got these that are large ones,
684
00:44:39,920 --> 00:44:42,480
"so I want to try and group
all the large ones together
685
00:44:42,480 --> 00:44:44,920
"so I don't have to go from
a large one to a small one."
686
00:44:44,920 --> 00:44:48,240
And it's a very complex problem
to solve in their heads.
687
00:44:48,240 --> 00:44:50,760
'..906 November...'
688
00:44:50,760 --> 00:44:54,560
In 2013, an algorithm
joined the team.
689
00:44:54,560 --> 00:44:58,480
Its job is to predict
the most likely order for takeoff
690
00:44:58,480 --> 00:45:00,760
and advise air traffic control
691
00:45:00,760 --> 00:45:03,560
when aircraft should push back
from the gates.
692
00:45:03,560 --> 00:45:06,520
To do this involves nothing less
than simulating
693
00:45:06,520 --> 00:45:09,680
the entire outward-bound operation
of the airport.
694
00:45:11,560 --> 00:45:14,520
Carrying out millions
of calculations every second.
695
00:45:14,520 --> 00:45:17,200
FAINT DISPATCHER
696
00:45:22,000 --> 00:45:25,400
The algorithm works
by trying to predict
697
00:45:25,400 --> 00:45:28,600
what order the aircraft are going
to take off in.
698
00:45:28,600 --> 00:45:30,920
If it knows what order
they can take off in,
699
00:45:30,920 --> 00:45:32,840
then it can work backwards and say,
700
00:45:32,840 --> 00:45:34,880
"If it needs to take off
at this time,
701
00:45:34,880 --> 00:45:37,760
"then it needs to enter
the runway queue at this time,
702
00:45:37,760 --> 00:45:40,160
"then it needs finishing its taxi
at this time,
703
00:45:40,160 --> 00:45:42,840
"so it needs to start
its taxi operation at this time.
704
00:45:42,840 --> 00:45:45,760
"In that case, it needs to
finish its pushback by this time,
705
00:45:45,760 --> 00:45:47,880
"so it needs to start its pushback
by this time."
706
00:45:47,880 --> 00:45:50,920
And it can work all the way back
from what time it should take off
707
00:45:50,920 --> 00:45:52,840
to what time it should start
pushing back.
708
00:45:55,720 --> 00:45:59,000
The output of the algorithm is given
to air traffic control
709
00:45:59,000 --> 00:46:01,840
through the airport's
internal computer system
710
00:46:01,840 --> 00:46:06,080
and displayed to the pilot
at the gate in the form of the TSAT,
711
00:46:06,080 --> 00:46:08,040
the recommended pushback time.
712
00:46:10,280 --> 00:46:13,080
The pilot can look
on the stand-entry system
713
00:46:13,080 --> 00:46:16,160
to actually see what time
he is expecting to depart.
714
00:46:18,160 --> 00:46:21,480
The biggest benefit of the algorithm
is that it means you can
715
00:46:21,480 --> 00:46:25,320
hold aircraft on stand for longer
without them taking off any later.
716
00:46:25,320 --> 00:46:28,720
So there's no loss for
any passengers in terms of delays.
717
00:46:28,720 --> 00:46:31,040
What you can do is you can
start your engines later.
718
00:46:33,360 --> 00:46:35,800
In actual fact, if we save
two minutes' taxi time
719
00:46:35,800 --> 00:46:38,160
on the way to the end of
the runway, over a year,
720
00:46:38,160 --> 00:46:40,720
that's actually £15 million
worth of fuel savings.
721
00:46:42,560 --> 00:46:46,520
The Heathrow sequencing algorithm
shows just what can be accomplished
722
00:46:46,520 --> 00:46:48,120
with the heuristic approach.
723
00:46:49,320 --> 00:46:52,640
Just like the bees,
the algorithm is not finding
724
00:46:52,640 --> 00:46:55,720
the absolute perfect solution
all the time,
725
00:46:55,720 --> 00:46:58,920
but nevertheless makes a tough job
that bit easier.
726
00:47:00,560 --> 00:47:02,440
We're very proud of the algorithm
727
00:47:02,440 --> 00:47:05,880
because it actually now, we feel,
models the real world and is of use.
728
00:47:16,440 --> 00:47:19,360
In the beginning,
algorithms were created
729
00:47:19,360 --> 00:47:21,920
by mathematicians for mathematicians.
730
00:47:21,920 --> 00:47:24,080
And over the last century,
731
00:47:24,080 --> 00:47:26,600
algorithms have been created
for computers.
732
00:47:29,520 --> 00:47:34,160
But perhaps our relationship is about
to go through a dramatic revolution.
733
00:47:39,960 --> 00:47:42,240
At Microsoft Research in Cambridge,
734
00:47:42,240 --> 00:47:46,640
scientists are using new techniques
to develop algorithms...
735
00:47:46,640 --> 00:47:50,600
blurring the boundary between
inventor and the algorithm itself.
736
00:47:56,880 --> 00:48:00,200
This is the Kinect
skeletal-tracking algorithm.
737
00:48:00,200 --> 00:48:03,000
The amazing thing is that it's able
to identify
738
00:48:03,000 --> 00:48:05,200
the different parts of my body.
739
00:48:05,200 --> 00:48:08,600
So you can see it's coloured
the top of my head in red
740
00:48:08,600 --> 00:48:11,320
and my right hand here in blue.
741
00:48:11,320 --> 00:48:13,920
You can see it's coloured
my neck green.
742
00:48:13,920 --> 00:48:16,360
Now, this algorithm
has never met me before,
743
00:48:16,360 --> 00:48:19,080
doesn't know how I'm going to move
in space,
744
00:48:19,080 --> 00:48:22,320
but just using the data
coming from this special camera here,
745
00:48:22,320 --> 00:48:25,800
measuring the distance
from the camera to my body,
746
00:48:25,800 --> 00:48:28,320
it's able to produce this map.
747
00:48:30,800 --> 00:48:34,240
Whatever posture I take,
using nothing more than the input
748
00:48:34,240 --> 00:48:36,680
from the special
depth-sensing camera,
749
00:48:36,680 --> 00:48:39,720
the algorithm is able
to accurately identify,
750
00:48:39,720 --> 00:48:42,920
pixel by pixel,
the different parts of my body.
751
00:48:46,920 --> 00:48:50,000
It was developed
for the Microsoft Xbox console
752
00:48:50,000 --> 00:48:53,800
to track the movement of a player's
body posture in real time.
753
00:48:58,760 --> 00:49:01,920
But just as remarkable as
what this algorithm can do
754
00:49:01,920 --> 00:49:04,760
is the process behind
how it was created,
755
00:49:04,760 --> 00:49:07,320
as researcher Jamie Shotton explains.
756
00:49:09,960 --> 00:49:12,920
What's happening is that
every pixel in the image,
757
00:49:12,920 --> 00:49:16,360
we are running an algorithm
called a decision tree.
758
00:49:16,360 --> 00:49:19,840
And you can think of a decision tree
as a game of 20 questions.
759
00:49:19,840 --> 00:49:22,840
So the decision tree is sort of
taking a pixel, say, on my hand,
760
00:49:22,840 --> 00:49:25,640
and trying to decide,
OK, I've got to colour that blue
761
00:49:25,640 --> 00:49:28,760
because that's on the hand
rather than on my body. Yes.
762
00:49:28,760 --> 00:49:31,480
The key to a decision tree is
the fact that the 20 questions
763
00:49:31,480 --> 00:49:34,160
that you ask are not the same
764
00:49:34,160 --> 00:49:37,320
for every pixel
that we're trying to classify.
765
00:49:37,320 --> 00:49:39,960
And the full set
of the possible questions
766
00:49:39,960 --> 00:49:43,360
that could be answered
is exponential.
767
00:49:43,360 --> 00:49:46,640
It's two to the twenty. Right, OK.
That's over a million questions,
768
00:49:46,640 --> 00:49:49,520
a lot of questions you're going to
have to program in there.
769
00:49:49,520 --> 00:49:51,360
Yes. It would take far too long
770
00:49:51,360 --> 00:49:55,400
and be far too error-prone for us
as humans to program that by hand.
771
00:49:55,400 --> 00:49:58,960
So, the algorithm's kind of writing
itself, or...? Absolutely.
772
00:50:03,200 --> 00:50:05,800
The algorithm was not
designed by Jamie
773
00:50:05,800 --> 00:50:09,120
but instead through a process
called machine learning.
774
00:50:11,720 --> 00:50:16,040
It involved showing the algorithm
millions of training images,
775
00:50:16,040 --> 00:50:19,640
of bodies in different poses
and of various shapes and sizes,
776
00:50:19,640 --> 00:50:23,760
from the very fat to the very thin,
the very short to the very tall.
777
00:50:24,920 --> 00:50:29,120
And from this, the algorithm
essentially learned by example,
778
00:50:29,120 --> 00:50:31,200
devising its own rules.
779
00:50:34,480 --> 00:50:38,040
Where our intelligence comes in
as the designers of the system
780
00:50:38,040 --> 00:50:41,520
is not in programming the algorithm,
per se,
781
00:50:41,520 --> 00:50:44,480
but in designing
the training data set
782
00:50:44,480 --> 00:50:48,480
to capture all of the kind
of variations that we expect to see
783
00:50:48,480 --> 00:50:51,280
when we deploy this system
in people's living rooms
784
00:50:51,280 --> 00:50:52,640
to play their games.
785
00:50:52,640 --> 00:50:55,880
So in the end, do you actually
know what the algorithm is doing?
786
00:50:55,880 --> 00:50:58,040
We can get a sense
of what it's trying to do
787
00:50:58,040 --> 00:50:59,680
and how it's roughly working,
788
00:50:59,680 --> 00:51:03,120
but we couldn't possibly really
understand what exactly is going on.
789
00:51:05,240 --> 00:51:10,240
The same approach of machine learning
has been used in other applications.
790
00:51:10,240 --> 00:51:14,920
For example, this algorithm is able
to do something that for a long time
791
00:51:14,920 --> 00:51:19,840
was thought to be a skill exclusive
to neurosurgeons and radiologists.
792
00:51:19,840 --> 00:51:23,040
From an MRI scan,
the algorithm can identify
793
00:51:23,040 --> 00:51:26,760
and map a brain tumour in 3-D.
794
00:51:26,760 --> 00:51:29,520
Meaning that a job that normally
takes an hour
795
00:51:29,520 --> 00:51:31,520
can be done in a matter of minutes.
796
00:51:34,920 --> 00:51:37,920
Professor Chris Bishop is
interested in developing
797
00:51:37,920 --> 00:51:41,160
the concept of machine learning
even further.
798
00:51:41,160 --> 00:51:44,920
To create algorithms
that can learn just like we do,
799
00:51:44,920 --> 00:51:46,800
directly from experience.
800
00:51:49,400 --> 00:51:52,400
So this demonstration, I think,
illustrates the direction
801
00:51:52,400 --> 00:51:54,400
that algorithms will go
in the years ahead.
802
00:51:54,400 --> 00:51:57,960
OK, I can see a lot of films up here,
so what is the algorithm going to do?
803
00:51:57,960 --> 00:52:01,000
We've got a couple of hundred
of the most commonly watched films,
804
00:52:01,000 --> 00:52:02,520
and what it's going to do,
805
00:52:02,520 --> 00:52:06,840
it's going to learn about
your personal likes and dislikes.
806
00:52:06,840 --> 00:52:08,320
It's already been trained,
807
00:52:08,320 --> 00:52:11,360
so it's a machine-learning algorithm
behind the scenes,
808
00:52:11,360 --> 00:52:14,800
but it's already been trained
on data from about 10,000 people.
809
00:52:14,800 --> 00:52:18,440
What it's going to do now is
to learn about your preferences.
810
00:52:18,440 --> 00:52:20,520
At the moment
it knows nothing about you,
811
00:52:20,520 --> 00:52:23,040
so these films are just arranged
at random on the screen.
812
00:52:23,040 --> 00:52:25,720
What I need you to do is
to find one of these films,
813
00:52:25,720 --> 00:52:28,360
either one that you like
or one that you don't like.
814
00:52:28,360 --> 00:52:31,440
If you like it, you can drag
it across to the green region,
815
00:52:31,440 --> 00:52:33,880
if you don't like it,
across to the red region.
816
00:52:33,880 --> 00:52:35,880
Rushmore, I'm a big fan of Rushmore.
817
00:52:35,880 --> 00:52:37,840
You like Rushmore? OK, right.
818
00:52:37,840 --> 00:52:41,360
So what's happening now is that if
a film is down the right-hand side
819
00:52:41,360 --> 00:52:45,040
near the green region, it's very
confident you'll like it. OK.
820
00:52:45,040 --> 00:52:46,880
So down here
close to the red region,
821
00:52:46,880 --> 00:52:48,840
it's very confident
you won't like it.
822
00:52:48,840 --> 00:52:51,640
In the middle, it's 50-50.
It doesn't really know.
823
00:52:51,640 --> 00:52:54,600
So if I choose
a movie in the middle here,
824
00:52:54,600 --> 00:52:57,960
I'm not a great Austin Powers fan,
so let's shoot that one...
825
00:52:57,960 --> 00:53:01,120
So you see, they're beginning
to spread out sideways,
826
00:53:01,120 --> 00:53:04,720
it's going to be a little bit
more confident. It's pretty good.
827
00:53:04,720 --> 00:53:07,760
I'm a big fan of Dr Strangelove
828
00:53:07,760 --> 00:53:11,760
and I'm a big fan of Woody Allen,
829
00:53:11,760 --> 00:53:14,800
but Spinal Tap,
it thinks I'll like that.
830
00:53:14,800 --> 00:53:18,320
So that's interesting, so when
it was confident you liked them
831
00:53:18,320 --> 00:53:20,080
and you said you liked them,
832
00:53:20,080 --> 00:53:23,240
not much happened
because it didn't learn much.
833
00:53:23,240 --> 00:53:26,120
When it was confident you'd like it,
in the case of Spinal Tap
834
00:53:26,120 --> 00:53:28,520
and you said, "I don't like it,"
there was a big change.
835
00:53:28,520 --> 00:53:30,480
It's learning things from me.
836
00:53:30,480 --> 00:53:33,360
I'm actually changing the algorithm
as I interact with it.
837
00:53:33,360 --> 00:53:36,760
Exactly. Whereas Kinect was trained
in the laboratory and then frozen,
838
00:53:36,760 --> 00:53:38,840
this algorithm continues to adapt
839
00:53:38,840 --> 00:53:41,560
and continues to evolve
throughout its life.
840
00:53:41,560 --> 00:53:44,400
The more films that you rate
as like and don't like,
841
00:53:44,400 --> 00:53:46,280
the more it knows
about you personally
842
00:53:46,280 --> 00:53:49,080
and the better able it is
to make good recommendations.
843
00:53:49,080 --> 00:53:52,560
This algorithm is beginning
to feel much more human
844
00:53:52,560 --> 00:53:55,120
in the way that it interacts
with the world.
845
00:53:55,120 --> 00:53:58,120
Is that your aim, to find
a way to produce algorithms
846
00:53:58,120 --> 00:54:00,840
that are a bit like the way
that we negotiate the world?
847
00:54:00,840 --> 00:54:04,000
Exactly. It's a step down that very
long road to producing machines
848
00:54:04,000 --> 00:54:06,160
that really are as capable
as the human brain.
849
00:54:06,160 --> 00:54:08,960
We've a long way to go, but this is
a small step in that direction
850
00:54:08,960 --> 00:54:10,400
because it's not fixed any more.
851
00:54:10,400 --> 00:54:12,760
It's now continuing to learn
just the same way
852
00:54:12,760 --> 00:54:15,000
that we continue to learn
in our daily lives.
853
00:54:19,920 --> 00:54:22,000
I think we're just starting
854
00:54:22,000 --> 00:54:24,520
to realise the full potential
of algorithms
855
00:54:24,520 --> 00:54:26,840
and I have one more place
I want to visit,
856
00:54:26,840 --> 00:54:29,160
which I'm told will give me a glimpse
857
00:54:29,160 --> 00:54:31,920
of just how much they are able
to do for us.
858
00:54:40,880 --> 00:54:43,840
It's a world where almost
everything is automated.
859
00:54:47,160 --> 00:54:49,640
Where algorithms are in control.
860
00:54:49,640 --> 00:54:54,200
It's the largest automated
grocery warehouse on earth.
861
00:54:54,200 --> 00:54:57,880
It belongs to the online grocery
retailer Ocado
862
00:54:57,880 --> 00:55:01,200
and it's the equivalent
of 45 supermarkets in one.
863
00:55:02,960 --> 00:55:06,920
Over two million items flow
through this warehouse every day.
864
00:55:06,920 --> 00:55:10,560
At any one time, there are
something like 7,000 crates
865
00:55:10,560 --> 00:55:13,080
going over 25 kilometres of track,
866
00:55:13,080 --> 00:55:18,560
and controlling every aspect of this
astonishing spectacle are algorithms.
867
00:55:25,800 --> 00:55:29,440
Each of those red crates
is part of a customer order
868
00:55:29,440 --> 00:55:33,120
and they may go on from here
to find other items
869
00:55:33,120 --> 00:55:35,400
that they want across the warehouse,
870
00:55:35,400 --> 00:55:37,560
until they are eventually finished,
871
00:55:37,560 --> 00:55:41,600
loaded onto a van and then
driven out by our routing system
872
00:55:41,600 --> 00:55:44,000
on a route, which in many ways,
873
00:55:44,000 --> 00:55:47,680
is solving problems like
the travelling salesman problem.
874
00:55:47,680 --> 00:55:49,960
There are decisions being made
all over the place
875
00:55:49,960 --> 00:55:52,560
as a red crate goes this way
and then that way.
876
00:55:52,560 --> 00:55:55,880
The complexity behind all this
is beyond
877
00:55:55,880 --> 00:55:59,040
what any human could control
or solve,
878
00:55:59,040 --> 00:56:02,040
and that is where
these algorithms,
879
00:56:02,040 --> 00:56:04,160
these problem-solving techniques
come in
880
00:56:04,160 --> 00:56:06,160
to overcome those challenges.
881
00:56:11,280 --> 00:56:15,680
Everywhere you look, the invisible
hand of the algorithm is at work.
882
00:56:16,840 --> 00:56:20,640
Forecasting algorithms monitor
and replenish the stock
883
00:56:20,640 --> 00:56:24,880
of more than 43,000 products,
anticipating customer demand.
884
00:56:27,040 --> 00:56:30,160
Control system algorithms
manage the traffic
885
00:56:30,160 --> 00:56:33,560
of the more than 7,000 crates
around the warehouse.
886
00:56:36,680 --> 00:56:40,080
And van routing algorithms
control the movement of the fleet
887
00:56:40,080 --> 00:56:42,240
of over 1,500 vans,
888
00:56:42,240 --> 00:56:46,400
testing over four million different
route combinations every second.
889
00:56:48,400 --> 00:56:51,400
You can almost see
the mind of the machine at work
890
00:56:51,400 --> 00:56:54,640
and it's not a static process,
so that's why there is a huge amount
891
00:56:54,640 --> 00:56:59,800
of machine learning in here, so it's
like a self-adapting organism.
892
00:56:59,800 --> 00:57:02,440
It's constantly having to learn
how to do it better.
893
00:57:02,440 --> 00:57:04,560
People couldn't do that.
894
00:57:04,560 --> 00:57:06,840
The machine has to tune itself.
895
00:57:10,960 --> 00:57:14,360
So who would you say was actually
in control of the whole thing?
896
00:57:14,360 --> 00:57:17,680
Ultimately, it's the algorithms
that are in control.
897
00:57:17,680 --> 00:57:20,200
I think I'm getting
algorithmic hot flushes
898
00:57:20,200 --> 00:57:22,280
by looking at this amazing thing!
899
00:57:24,640 --> 00:57:26,840
In some sense, this warehouse is like
900
00:57:26,840 --> 00:57:29,160
a little microcosm
of the modern world.
901
00:57:29,160 --> 00:57:32,920
Algorithms are running everything
from search engines on the internet,
902
00:57:32,920 --> 00:57:35,960
sat nav, even keeping
our credit cards secure.
903
00:57:35,960 --> 00:57:39,840
Our world wouldn't function without
the power of these algorithms.
904
00:57:45,680 --> 00:57:49,200
The Open University have produced
a free pack for you to learn,
905
00:57:49,200 --> 00:57:53,160
create and discover more about
digital technology past and present.
906
00:57:53,160 --> 00:57:55,440
To order your copy, phone...
907
00:57:58,840 --> 00:58:00,320
..or follow the link below
908
00:58:00,320 --> 00:58:01,840
to the Open University.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment