Page 20 - 04_May-2025

City
The Power of Positive Reinforcement
Continued from pg 19
receiving the Alan M. Turing Award from the
Association for Computing Machinery. The
award — which comes with $1 million — is often
referred to as the Nobel Prize in computing
and Sutton won it for his decades-long,
fundamental work in developing the AI field
of reinforcement learning, which takes a trial-
and-error approach to a desirable outcome.
That outcome has shifted, possibly
profoundly. In 2018, the breakthroughs were
in gaming (Sutton’s U of A team’s research
contributed to AI beating checkers, chess and
the Chinese board game Go). Today, they’re
in generative AI software like ChatGPT and
Midjourney that has been fine-tuned by
reinforcement learning.
The Father of Reinforcement Learning
Many of the biggest AI advancements are
directly tied to the work of the Ohio-born
researcher, who moved to Canada in 2003.
While introducing Sutton at the press conference,
Amii CEO Cam Linke mentions these
and other “Sputnik moments” from around the
world (like DeepSeek’s chatbot, in China), and
the proliferation of local startups (RL Core,
Artificial Agency) that have relied upon Sutton’s
AI research in some way.
“Rich has been an incredible mentor to
both his students and his community,” Linke
says. “That inspiration, I think, is one of the
reasons we’re seeing reinforcement learning
take off today.”
But where is it headed? While text- and
image-generating AI may wow the world, it’s
all a long way off from Sutton’s “science of
mind” convergence. To him, any AI that takes
in large data sets (say, millions of pictures
of cats) so that it can later identify what
such data contains hasn’t learned anything
beyond pattern recognition and mimicry.
“And any kind of mimicking is not going to be
powerful,” says Sutton. “Influencing the world,
picking your actions, solving a puzzle of ‘How
do you get reward from this environment?’ —
that’s what intelligence is. I don’t want to claim
deep insight. I just want to claim that I’m seeing
the obvious.”
20 EDify. MAY.25
To Go Where No AI’s Gone Before
It’s also obvious to Sutton that, if he’s right,
the community he has helped foster will play
an even bigger role in the future of AI.
“If reinforcement learning is going to be
essential to AI, then the University of Alberta,
far more than any other place, is going to be
the leader in reinforcement learning,” he says.
“It seems almost inevitable that a lot of
responsibility falls to us — we’re the ones to do
it or to fail to do it.”
Sutton still thinks “convergence” is inevitable,
and believes the AI world is doing its part.
He’s just waiting for the world of psychology
to catch up. “We should already have a science
of mind — I think it should be a new field.
But psychology doesn’t seem interested,” he
argues. “They just want to do a natural science.
The lack of ambition of these people!”
ED.
–Cory Schachtel
Software and Startups
Rich Sutton has spent decades proving that trial and error is the smartest way for machines to learn
ALPHAGO (2016)
Through reinforcement learning (RL), Google DeepMind’s AlphaGo plays millions of games of Go against itself to become the game’s first non-human (and unbeatable) champion in 2,500 years. Lead researcher David Silver was one of Sutton’s PhD students.
DEEPSEEK ZERO (2025)
Eschewing large data sets, DeepSeek instead used nothing but trial and error (a.k.a. reinforcement learning) to achieve success in writing, editing and summarization, validating Sutton’s belief that RL would usher in the new era of AI.
ARTIFICIAL AGENCY (2024)
The Edmonton-based startup, which combines generative AI with videogames, was co-founded by former Sutton students Brian Tanner and Alex Kearney. It launched in 2024 with US$16 million in funding.
RL CORE (2023)
Using RL, the company monitors wastewater control in real time and continually adapts to plant conditions, allowing human operators to focus on maintenance and emergency response. Cofounders Martha and Adam White were Sutton colleagues.
photo CHRIS ONCIUL