MIT Personas: more than a stunning data mining and visualization project

MIT Personas proves that your publicy exists and reminds you to be conscious of it. But it also shows the opportunity, and the possibility, to control just that.

By way of some interweb connections, I came across this amazing MIT project, Personas. It’s actually quite a strange thing, one that leaves you pondering a few internet-related questions. To quote the project’s site:

Personas uses sophisticated natural language processing and the internet to create a data portrait of one’s aggregated online identity. In short, Personas shows you how the Internet sees you.

MIT Personas result on Wim Van Leuven

What is Personas?

The basic idea of Personas is actually rather simple:

Just enter your name and Personas scours the web for information and attempts to characterize the person – to fit them to a predetermined set of categories that an algorithmic process created from a massive corpus of data. The computational process is visualized with each stage of the analysis, finally resulting in the presentation of a seemingly authoritative personal profile.
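To make the "fit them to a predetermined set of categories" step a bit more tangible, here is a minimal Python sketch. It is not how Personas works internally: the project describes sophisticated natural language processing and categories derived from a massive corpus, whereas this toy simply counts keyword hits. The category names, the keywords, and the `profile` function are all assumptions made up for this illustration.

```python
import re
from collections import Counter

# Hypothetical categories and keywords, invented for this illustration.
# Personas derives its categories algorithmically from a large corpus.
CATEGORIES = {
    "technology": {"software", "java", "data", "programming"},
    "sports": {"cycling", "football", "race"},
    "music": {"band", "concert", "guitar"},
}


def profile(snippets):
    """Return each category's share of keyword hits across all snippets."""
    hits = Counter()
    for snippet in snippets:
        # Lowercase the text and split it into plain alphabetic words.
        for word in re.findall(r"[a-z]+", snippet.lower()):
            for category, keywords in CATEGORIES.items():
                if word in keywords:
                    hits[category] += 1
    total = sum(hits.values())
    if total == 0:
        return {}  # nothing recognizable was found
    return {category: hits[category] / total for category in CATEGORIES}


# Made-up snippets standing in for search results on one name.
snippets = [
    "Jane Doe writes software and talks about data",
    "Jane Doe finished the cycling race",
]
print(profile(snippets))
```

Note that the sketch happily merges snippets about two different people who share a name into one profile, which is exactly the kind of mischaracterization the project sets out to expose.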

So, from a technical perspective, Personas is much more than a clean website design and a Flash application. It is intrinsically an amazing combination of huge dataset processing, stunning data visualization, and gorgeous algorithm visualization. On the latter topic: you actually see the algorithm mining the dataset, which is fabulous! What I find so intriguing about appealing infographics is that they tend to address the right hemisphere of our brain more than the left.

MIT Personas working on Wim Van Leuven

But what does Personas mean?

If you think about this neat technological result, there’s more to Personas than just “showing how the internet sees you”. On the one hand, it gives you an idea of the ubiquity and long-term persistence of your online profile. Meet your publicy, as this facet of the internet has been coined. Stowe Boyd has written a very nice article about the 3 facets of one’s person: secrecy, privacy and publicy. Welcome to the decade of publicy, where interactions and tools “default to things being open and with open access, rather than concealing things and limiting access to those explicitly invited”. So, Personas is again a very good reminder to be careful with the breadcrumb trail you leave across the internet. Well, not so much careful as conscious. Being conscious of your publicy gives you control, because you pay attention to the trail you leave behind.

On the other hand, the Personas project also points out another very important aspect of the current state of the internet:

In a world where fortunes are sought through data-mining vast information repositories, the computer is our indispensable but far from infallible assistant. Personas demonstrates the computer’s uncanny insights and its inadvertent errors, such as the mischaracterizations caused by the inability to separate data from multiple owners of the same name. It is meant for the viewer to reflect on our current and future world, where digital histories are as important if not more important than oral histories, and computational methods of condensing our digital traces are opaque and socially ignorant.

As the internet is run and scavenged by computers, which are just algorithmic and not intelligent (“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” – Edsger W. Dijkstra), one has to be careful about the information retrieved by searching or mining the net. Moreover, regarding publicy and its constant tension with privacy, the Personas project proves there’s a real risk of retrieving confusing information coming from alternate sources. So it’s a matter of being master of Google’s first pages (Dutch), as Bart De Waele puts it. In other words, control the conversation about your profile. Become the only source of information about yourself, or about your company for that matter. However, as this might be a very labour-intensive undertaking, there is, as a last resort, not only a possibility but also an opportunity: not only in falsifying information (p. 28) to protect one’s privacy, but also in smoke-screening your publicy by actively dropping false information, maybe even using mirror identities.

Self regulation

This proves a conclusive point, also stated by Laurent Haug: “Self regulation is already underway”. And it works in both directions: “You want to spy on me? I will feed you with fake data to push the envelope to where I want it to be … In the contrary, if you give users a system they can trust, one where they can control what is controllable, then they will share the data advertisers need.”

What can I do with the concrete Personas result?

In short? Nothing! It is what it is: an animation and a final visualization … that indicate something, but nothing concrete. It can give you an idea or overview of what can be found on the interweb when people search for you. You can use it as such in your internet toolchest.