Social data users, beware
It’s easy to see why social data has got marketeers and researchers so excited. Social media users are expressing what’s important to them, in their own words, on a massive scale — and much of what they’re sharing can be accessed for free.
However, using social data isn’t that simple (even if we leave the potential ethical issues to one side). The data generated on social media is far from context-free, and unless we take that into account, social data on its own might not be able to tell us much more than how people want to be seen.
5 reasons to worry about context when using social data:
1. Context collapse
People don’t post online in a bubble. If you share online, you’re probably aware that what you share could be seen by others coming from all sorts of contexts that don’t usually mix: your mum, friends, employer, or even the odd lurking brand or government agency. When these groups are potentially reading every idle thought you post online, saying what you want in a way that’s appropriate for all of them suddenly becomes much harder. This is where self-censorship kicks in, and we start thinking about how to present ourselves. The end result? Far from being an unfiltered digital version of ourselves, much of the data shared voluntarily online is, to some extent, curated.
2. The problem of explicit articulation
Social media encourages us to explicitly articulate who we are and how we see ourselves (via danah boyd, whose observations on how people use social media provide a useful context for understanding how to interpret their data). Twitter user bios ask you to sum up your life and work in just 160 characters. Instagram is pretty much a living pinboard of how you want other people to perceive you. The problem is that this is somewhat unnatural: building a Facebook profile requires a degree of introspection that most people don’t normally indulge in during their day-to-day lives.
In other words, the way we think about how to present ourselves online (and the fact that we spend time thinking about it at all) probably doesn’t mirror the way we act in other areas. So when it comes to trying to understand the people behind the data, that data has already been through an unhelpful degree of rationalisation that most of our day-to-day behaviour hasn’t.
3. A perceived lack of privacy
The potential publicity of what we post online — even to a public we’ve selected ourselves — may limit how freely we’re willing to express ourselves. Way back in 2008, one academic study found that increasing privacy controls on social media increased how much users were willing to share — so as we move into a world where a lack of privacy seems to be becoming a norm, a backlash isn’t surprising. This means that the data that is freely available online is likely to be less revealing than the stuff that lies behind the privacy settings.
This can also have implications for brands trying to learn more about their consumers online. Brands using data to infer things about their customers (as in the infamous Target-pregnancy-gate) run the risk of crossing the ‘creepy’ line, when in fact customers might be more engaged with a brand that doesn’t use every opportunity to engage with them.
4. Specialised social media
Social media is fragmenting into increasingly specialised channels, which makes it harder to get a full picture of internet users from what they post online. If you only use Twitter to read the news, only ever interact with brands on Facebook, and only use Pinterest to keep track of things you want to buy, it becomes harder to create a joined-up picture of you as a person. A lot of social data analysis is done at an aggregated level, where the specialisation of social media could actually be useful for refining focus (e.g. to a particular social networking site), but it also means that the picture generated from individual sites is increasingly likely to have gaps.
5. Much of the information shared online is essentially shallow
The reasons above may help to explain why much of the data posted online is essentially shallow — articles read, music listened to, pictures of meals just cooked. This type of information is more a list of ‘likes’ than anything deeper — and it’s perhaps no surprise that this is the type of information (“basic purchasing habits” and media liked) that a Pew research study found to be considered “least sensitive” amongst Americans, compared to more sensitive topics like health, physical location, relationship history, religious and political views. The same study also found that 81% of Americans consider social networks to be “not at all” or “not very” secure — suggesting why they might be the dumping ground for so much non-sensitive, shallow information.
Whilst it might seem obvious that people are more willing to post pictures of their cats than their doctors’ notes, we shouldn’t take what social media has become for granted. The social web didn’t start as an appeal to the lowest common denominator, and its content now (hopefully) isn’t indicative of a corresponding dumbing down of the people behind it — actually, it’s probably a fairly narrow expression of those people. We may yet see that as social norms (or incentives) change, sharing other types of information becomes more common.
When using social data, it’s important to remember that this is only part of the picture: the picture of an individual that they want you to see, and that they think is worth sharing and appropriate to share. Data on web activity, location and transactions (that is, the less self-selected trail that internet users leave online) can add to the picture. But we still need context on why people post about certain things and not others, who they intend to read their posts, and what meanings they attach to it all. Unless we understand all that (tip: make friends with qualitative research), all the data on the internet might not tell us that much about the people behind it.