Over two billion people use Facebook every month, which means the social media giant has data on a broad cross section of the world’s population. But how exactly does it use that data?
A new Chrome extension wants to help users find out. Data Selfie tracks the Facebook activity of anyone who installs it to show them how much of a trace they leave on the site. It then shows how machine learning algorithms can use that data to gain insights into the user’s personality.
“The tool explores our relationship to the online data we leave behind as a result of media consumption and social networks—the information you share consciously and unconsciously,” Data Selfie’s website reads.
Once it’s installed, Data Selfie runs in the background while you’re on Facebook. It records what you type, how much time you spend on the site, what you click and like in your news feed, and which links to external sites you follow. The more you use Facebook, the more data is recorded.
This data is then anonymized to remove personal information and passed through Data Selfie’s servers to two machine learning services. The first, IBM Watson, became famous for beating Ken Jennings at Jeopardy! Data Selfie uses two Watson services in particular:
- Personality Insights, which uses social media data to identify psychological traits that shape purchase decisions, intent and behavior.
- Tone Analyzer, which uses linguistic analysis to identify a variety of tones at both the sentence and document level, with the aim of refining communications. It can detect emotion, social openness and language styles in text.
The second service, Apply Magic Sauce, predicts users’ psycho-demographic traits based on their digital footprints. It can detect traits such as personality, satisfaction with life, intelligence, political views and religion.
Data Selfie was created by coder Hang Do Thi Duc and scientist Regina Flores Mir, both of whom have an interest in design, according to their bios. The creators write in their pitch that they hope to “provide a personal perspective on data mining, predictive analytics and our online data identity.”
That personal aspect is particularly crucial: Data Selfie doesn’t store your data once you’ve closed the extension, so the information stays on your own computer, where it can be imported, exported or deleted at any time. Data Selfie’s open-source code is also freely available on GitHub for extra transparency.
This individualized approach to big data is gaining popularity—almost 78,000 people have downloaded Data Selfie.
The extension received seed funding from the New York City Economic Development Corporation, the Mayor’s Office of Media and Entertainment and the NYC Media Lab’s Combine program, in which entrepreneurs develop their ideas through mentorship from city media companies.