Yahoo Taught a Machine to Detect Abuse Online
Gareth Andrews / 8 years ago
The internet is a wide and open place, capable of capturing the imaginations and dreams of millions of people with videos, pictures and words alike. The problem is that not everyone uses the internet to bring people together; some use it to spread hate and break groups apart. The people at Yahoo Labs may have just figured out a way to help detect and police the abuse that now swarms the internet.
As the internet grows and more people gain access to it, there has been a sharp rise in the abuse found across the web. To help detect and police it, Yahoo turned away from blacklisted terms and syntax clues and looked to machine learning instead, aiming to catch abusive comments that those older methods would miss.
The technique is called “word embedding”, which represents words as vectors instead of simply labelling each one as positive or negative. This way the system doesn’t rely on flagging a single word or sentence; it picks up on the entire string of words, so a comment can be caught even if every word in it is completely innocent on its own.
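To give a rough feel for the idea, here is a minimal sketch in Python. It is not Yahoo's system: the two-dimensional vectors, the example comments, the blacklist and the nearest-centroid classifier are all made up for illustration, whereas real word embeddings are learned from large amounts of text and have far more dimensions. The sketch only shows how a comment with no blacklisted word can still land close to known abusive comments once its words are turned into vectors and averaged.

```python
import math

# Hypothetical, hand-written 2-D "embeddings" for illustration only.
# Real embeddings are learned from text, not typed in by hand.
TOY_EMBEDDINGS = {
    "go": (0.6, 0.2), "back": (0.7, 0.1), "where": (0.5, 0.3),
    "you": (0.5, 0.5), "came": (0.6, 0.2), "from": (0.5, 0.3),
    "nobody": (0.8, 0.1), "wants": (0.6, 0.3), "here": (0.5, 0.4),
    "thanks": (0.1, 0.9), "for": (0.4, 0.5), "sharing": (0.1, 0.8),
    "this": (0.4, 0.5), "great": (0.1, 0.9), "article": (0.3, 0.6),
    "welcome": (0.1, 0.9),
}

# Hypothetical blacklist, standing in for the older keyword approach.
BLACKLIST = {"idiot", "scum"}


def tokenize(comment):
    """Lowercase the comment and split it on whitespace."""
    return comment.lower().split()


def blacklist_flags(comment):
    """Keyword approach: flag only if a blacklisted word appears."""
    return any(word in BLACKLIST for word in tokenize(comment))


def mean_vector(vectors):
    """Component-wise average of a non-empty list of 2-D vectors."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def embed(comment):
    """Represent a whole comment as the average of its known word vectors."""
    vectors = [TOY_EMBEDDINGS[w] for w in tokenize(comment) if w in TOY_EMBEDDINGS]
    return mean_vector(vectors) if vectors else (0.0, 0.0)


# Tiny made-up labelled examples, one list per class.
ABUSIVE_EXAMPLES = ["go back where you came from", "nobody wants you here"]
CLEAN_EXAMPLES = ["thanks for sharing this", "great article"]

ABUSIVE_CENTROID = mean_vector([embed(c) for c in ABUSIVE_EXAMPLES])
CLEAN_CENTROID = mean_vector([embed(c) for c in CLEAN_EXAMPLES])


def classify(comment):
    """Label a comment by whichever class centroid its vector is closer to."""
    vector = embed(comment)
    if math.dist(vector, ABUSIVE_CENTROID) < math.dist(vector, CLEAN_CENTROID):
        return "abusive"
    return "clean"


if __name__ == "__main__":
    for comment in ["nobody wants you back", "welcome thanks for this article"]:
        print(comment, "| blacklist:", blacklist_flags(comment),
              "| vectors:", classify(comment))
```

In this toy setup the comment "nobody wants you back" contains nothing from the blacklist, yet its averaged vector sits nearer the abusive examples than the clean ones. Averaging and nearest-centroid matching are just about the simplest choices available; they are assumptions of this sketch, not a description of what Yahoo Labs built.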
The results are promising, with the system detecting abusive language 90 percent of the time within a single data set. Because human speech can change from sentence to sentence, it won’t be 100 percent accurate, but it may help detect words and phrases that previously would have gone unnoticed and alert sites to them.