Alibaba’s neural network, a type of AI loosely modeled on the human brain, beat human competitors last week on a Stanford University comprehension test meant to measure computers’ growing reading abilities.
This is the first time machines have outperformed humans on such a test. The Chinese tech conglomerate’s program scored 82.44 percent on the exam, while humans scored 82.304 percent.
That may sound like a minuscule difference, but given the scope of the test, the AI’s performance is impressive. The Stanford Question Answering Dataset (SQuAD) features more than 100,000 questions based on over 500 Wikipedia articles. Each answer is a segment of text taken from the corresponding article.
Alibaba’s AI showed strong comprehension of individual words, along with longer sentences and paragraphs. This kind of comprehension is one of the hallmarks of natural language processing, the field that helps computers process large amounts of linguistic data, such as the Game of Thrones books.
Admittedly, most of the questions on Stanford’s test are objective and don’t require critical thinking. Sample questions include “Which group headlined the Super Bowl 50 halftime show?” (Coldplay), “Who invented the Turing machine?” (Alan Turing) and “What is Doctor Who’s spaceship called?” (the TARDIS).
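Because the answers are short and factual, a test like this can be graded automatically. The sketch below is an illustration, not Stanford’s or Alibaba’s actual code: it assumes the reported percentages are exact-match scores, meaning the share of questions where the program’s answer matches an accepted answer after light cleanup (lowercasing, dropping punctuation and the words “a,” “an” and “the”). The function names and the sample predictions are hypothetical.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_percent(predictions, references):
    """Percentage of predictions that equal any accepted reference answer."""
    hits = 0
    for pred, accepted in zip(predictions, references):
        if any(normalize(pred) == normalize(ref) for ref in accepted):
            hits += 1
    return 100.0 * hits / len(predictions)


# Hypothetical program answers to the three sample questions above;
# the third one is deliberately wrong.
predictions = ["coldplay.", "Alan Turing", "the Enterprise"]
references = [["Coldplay"], ["Alan Turing"], ["the TARDIS", "TARDIS"]]

score = exact_match_percent(predictions, references)
print(f"{score:.2f}")  # two of three answers match
```

With scoring this strict, a gap as small as the one between 82.44 and 82.304 percent comes down to a small number of questions out of many thousands.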
But Alibaba’s AI has already shown promise in more complex arenas. The program helped answer customer service questions on Singles Day, China’s biggest shopping day of the year.
Soon after Alibaba’s program took the test, Microsoft’s AI also bested humans with a score of 82.65 percent. The tech giant is using AI to make the answers on its Bing search engine more accurate. It’s also teaching AI how to answer multi-step questions, such as when and where someone was born.
Both companies are also exploring more complex AI applications in areas like law (finding legal precedent in documents) and medicine (scanning medical records). So your doctor’s office might soon have no need for a receptionist.
Alibaba and Microsoft have not responded to Observer requests for comment.