Big Blue is now moving its Project Debater technology on to commercial uses.
The subject under debate was whether the government should subsidize preschools. But the real question was whether a machine called IBM Debater could out-argue a top-ranked human debater.
The answer, on Monday night, was no.
Harish Natarajan, the grand finalist at the 2016 World Debating Championships, swayed more of an audience of hundreds toward his point of view than IBM Debater did toward its. Humans, at least those equipped with degrees from Oxford and Cambridge universities, can still prevail when it comes to the subtleties of knowledge, persuasion and argument.
It wasn’t a momentous headline victory like the ones we saw when IBM’s Deep Blue computer beat the best human chess player in 1997 or when Google’s AlphaGo vanquished the world’s best human players of the ancient game of Go in 2017. But IBM still showed that AI can be useful in handling situations where there’s ambiguity and debate, not just an easy score to judge who won a game.
“What really struck me is the potential value of IBM Debater when [combined] with a human being,” Natarajan said after the debate. IBM’s AI was able to dig through mountains of information and offer useful context for that knowledge, he said.
It was the second time IBM Debater took on humans in public, though it’s held dozens of debates behind Big Blue’s walls. In the first IBM Debater competition, the AI defeated one human debater soundly while losing a closer competition with another. This time, though, the human opponent was tougher — indeed IBM researchers involved in the years-long project expected their AI would lose.
Computer persuasion
IBM Debater lost, but there’s no question it won in a way: Listening to it, you evaluate what it’s saying, not just that it’s a computer saying it. It marshaled its argument, broke it down into a few points and backed them up with data from various studies. It wasn’t perfect, but it was on point.
And, weirdly for an AI, it told us how Homo sapiens ought to behave.
“Giving opportunities to the less fortunate should be a moral obligation for any human being,” IBM Debater said.
In the debate, each side had 15 minutes to prepare, though only IBM Debater had the advantage of being able to draw upon 10 billion sentences’ worth of publications from news articles and academic research. Each side took turns making its case, rebutting the other, then presenting a closing argument.
The debate is scored based on how many people change their minds. Before the debate, 79 percent agreed with the position in favor of preschool subsidies, but afterward, the figure dropped 17 percentage points to 62 percent.
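For readers who want the scoring arithmetic spelled out, here is a minimal Python sketch. It assumes the score is simply the percentage-point change in support for a side between the two audience polls. The function name `point_swing` is made up for illustration, and the article reports figures only for the pro-subsidy side, so that is all the example computes.

```python
def point_swing(before_pct: int, after_pct: int) -> int:
    """Percentage-point change in audience support for one side.

    A negative result means the side lost ground during the debate.
    """
    return after_pct - before_pct


# Figures reported for the pro-subsidy position, the side IBM Debater argued.
pro_before = 79  # percent in favor before the debate
pro_after = 62   # percent in favor after the debate

pro_swing = point_swing(pro_before, pro_after)
print(f"Pro-subsidy side moved {pro_swing:+d} percentage points")
```

The swing for Natarajan's side would be computed the same way from its own before and after figures, and under this assumed rule the side with the larger gain is the winner.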
In an age where Apple’s Siri, Amazon’s Alexa and the Google Assistant listen to our questions and answer in human-sounding voices, it’s easy to forget how remarkable it is that we can converse with computers. IBM Debater goes a step beyond, speaking for minutes.
“She was surprisingly charming and human-sounding,” said John Donvan, host of the debate and moderator of Intelligence Squared Debates, which runs debates and broadcasts them through a radio show.
Don’t expect to run something like Project Debater on your laptop any time soon. It ran mainly on a powerful server with 28 processing cores and a whopping 768GB of memory, roughly 50 times that of a high-end laptop. It was supported by a quartet of servers, each with 64GB of memory and 2-terabyte hard drives packed with text.
Preschool subsidies
IBM Debater argued in favor of the view that we should subsidize preschools, and Natarajan argued against it.
In Debater’s view, preschools “carry benefits for society as a whole. It is our duty to support them.” Good preschools mean kids — especially poor kids — do better in life.
Natarajan countered that preschool subsidies are “little more than a politically motivated giveaway to members of the middle class … and not to the individuals who are most underprivileged.” He also poked holes in Debater’s assumptions, for example that a subsidy will meaningfully improve education for the poor.
Debater showed improvements over its 2018 debate. One new trick up its sleeve was the ability to offer a parallel argument — in this case that subsidizing health care can be beneficial. Another was improved rebuttal skills. After Natarajan argued that some kids might not benefit from immersion into the potentially competitive world of preschool at age 3 or 4, IBM grasped that view and took issue with it: “My opponent argued that preschools are harmful,” it said.
“We were working very hard since June to improve the system,” said Noam Slonim, the Project Debater principal investigator at IBM Research. Debater’s source material (academic publications and news articles) has also been expanded with another year’s worth of data, through the end of 2018.
Most challenging contest so far
The competition was the most challenging yet for IBM’s AI.
Natarajan “is at a different level compared to the debaters we faced so far,” said Ranit Aharonov, IBM’s manager of Project Debater. “He’s the most decorated debater in the history of university debate competitions with the world record in the number of victories.”
The event, at IBM’s Think conference in San Francisco, is IBM Debater’s last big debate. “Debater is nice, and it’s good to showcase, but we should be focusing on how to take that technology and make something that’s commercially viable,” Aharonov said.
“We are at the stage where we’ll finalize the first use case we’ll work on,” Aharonov said. That could be something like helping a company understand the views of its employees or customers, or helping the news media or governments engage people in discussion about contested issues, she said.
That’s because the technology behind Project Debater is all about the messiness and nuance of the real world we humans live in, not the black-and-white realm of games.
“We are going out of the comfort zone of AI into territory which is more gray,” Slonim said.