What Does Facebook Actually Know About Me?

Using Data Science to investigate the 7,500 files of data Facebook has collected about me


Chris Brownlie

3 years ago | 6 min read

Unless you have had your head buried in the sand for the last decade, you will know that tech companies, and social media companies in particular, hold a significant amount of information about you.

If you have ever had an account with the likes of Facebook, Google, Amazon etc., the chances are they know more about you than you realise.

In this series of articles, I will pick apart the roughly 7,500 files (2.7GB) of data from Facebook which they allow me to download, all of it on a particularly boring topic - yours truly. That said, I’m hoping there will be some interesting insights and some thought-provoking visualisations at the end of this and that when all is said and done, I am not too scared and horrified by what I find.

It could be that Facebook knows a lot more about me than I previously realised, that my analysis highlights patterns in my usage of the platform which I have never considered before, or simply that I’m horrendously embarrassed by my social media activity as a teenager - we’ll see. I’ll be posting a part of the series on each day this week (7th-11th September), so give me a follow if you’re interested!

The data set in question dates back to when I first joined Facebook in January 2009 (when I was 12 years old - scary). So that is over 11.5 years of interactions, activity (both on and off Facebook - more details on that later) and personal information, all of which I downloaded in JSON file format and analysed using R.

Note: I have also developed a very simple, primitive R package to help with extracting your own Facebook data into an easy-to-analyse format. Check that out on my GitHub here.

First Impressions

Rather than type them out, I think it’s easier for me to show you the names of the various folders that were included in the data I downloaded from Facebook. This will give you an idea of the kind of information we’ll be analysing.

The folders included in my data download from Facebook

As you would expect, the list is fairly extensive and covers just about anything you can do on Facebook. Note that several of these (e.g. archive, short_videos, stories, voice_recording_and_transcription) are empty because I have never used these features on Facebook.

Throughout the rest of this first article, I will analyse my own personal use of Facebook. This will give you an idea of how much I used my account and how that might be reflected in what they as a company know about me as an individual.

I hope that in reading this series you become more aware of what social media platforms know about you, so that your own decisions can be better informed.

My Facebook Fingerprint

To start with, there is some basic information that Facebook has about me which can’t be visualised, so I will quickly mention it here.

  • They have a file which they use for facial recognition - the contents of which mean nothing without access to Facebook’s own machine learning algorithm which they use for recognising faces. Nevertheless, it shows that they do have this information and that it was generated using 236 examples of my face (where I have been tagged in photos).
  • My designated ‘friend peer group’ is classed as ‘Starting adult life’.
  • I have not used any secret conversations on Messenger.

F for Facebook

Next, let’s visualise my use of Facebook over time. Hopefully this will give you an idea of how much I use the platform and how the results in the rest of this series might differ if your own data were used instead.

The most obvious way to look at activity is to visualise how much I have posted and commented across the years. Both can be seen below, with key stages of my life annotated:

Number of posts made on Facebook over time since I joined in January 2009
Number of comments made on Facebook over time since I joined in January 2009

A few points to note:

  • For context, it is probably worth saying that, academically speaking, I under-performed at secondary school and university but not at sixth form. With that in mind, the first graph can be read as showing a negative relationship between my academic performance and my social media usage.
  • The second graph shows a similar pattern but is conspicuously empty during my secondary school years. This reflects how my use of Facebook has changed: around the time I went to university, ‘tagging friends in comments on videos/posts you find funny’ became a trend, and the number of comments I made rose significantly as a result.
  • Note that the large spikes in the first graph (i.e. where there are more than ~20 posts in a month) come from the fact that when multiple photos were uploaded at once, Facebook counted them as individual posts.
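
For readers who want to reproduce this kind of monthly count, here is a minimal sketch. It is written in Python purely for illustration (the analysis in this article was done in R), and it assumes each post record carries a `timestamp` field in Unix seconds. The sample records, the helper names and the idea of collapsing posts that share an identical timestamp are all assumptions made for the example, not a description of how the graphs above were produced.

```python
from collections import Counter
from datetime import datetime, timezone


def to_ts(year, month, day, hour=12):
    """Build a Unix timestamp (seconds, UTC) for the sample data."""
    return int(datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp())


def month_key(ts):
    """Convert a Unix timestamp to a 'YYYY-MM' label in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")


def posts_per_month(posts, collapse_bulk_uploads=False):
    """Count posts per calendar month.

    With collapse_bulk_uploads=True, posts sharing an identical timestamp
    are treated as one upload. This is an assumption about how bulk photo
    uploads appear in the data, not something the export guarantees.
    """
    timestamps = [post["timestamp"] for post in posts]
    if collapse_bulk_uploads:
        timestamps = set(timestamps)
    return Counter(month_key(ts) for ts in timestamps)


# Hypothetical records: three photos uploaded in one go, plus two ordinary posts.
sample_posts = [
    {"timestamp": to_ts(2009, 1, 15)},
    {"timestamp": to_ts(2009, 1, 15)},
    {"timestamp": to_ts(2009, 1, 15)},
    {"timestamp": to_ts(2009, 1, 20)},
    {"timestamp": to_ts(2009, 2, 10)},
]

raw = posts_per_month(sample_posts)
collapsed = posts_per_month(sample_posts, collapse_bulk_uploads=True)
print(dict(raw))
print(dict(collapsed))
```

Comparing the raw and collapsed counts gives a rough sense of how much of a spike comes from bulk uploads rather than from separate posts.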

Although posts and comments don’t necessarily equate to social media usage, I think they are a good enough proxy for my level of ‘involvement’ in social media. An alternative way to consider social media usage would be likes and reactions.

I like it like that

Below you can see the number of times I have reacted to something on Facebook (e.g. ‘liked’ a post) over time.

The number of times I have reacted to something on Facebook since I first joined.

This again seems to show higher Facebook use at times in my life when my academic performance was lower.

In February 2016, Facebook introduced some new ‘reactions’: ‘haha’, ‘wow’, ‘sorry’, ‘anger’ and ‘love’. As well as the overall picture above, we can also break this down by type. The graphs below show how much I have used each reaction in the past 4.5 years (since all the reactions became available), both over time and in total.

The number of times I have used each Facebook reaction since February 24th, 2016.

Although I don’t tend to use ‘non-like’ reactions very much, liking has remained fairly common. This reflects the fact that the ‘like’ has become the de facto default response on Facebook, since it has been around longest and is the least emotive of the six possible reactions.

You can also see in the first graph that while over time I’ve been using reactions less, this has meant there is more variety in the way I react and non-like reactions have started to play a bigger part.
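
The two views described above (totals per reaction type, and the growing role of non-like reactions) can be sketched as follows. This is a Python illustration rather than the R code behind the graphs, and the record layout is an assumption: each reaction is taken to have a `timestamp` in Unix seconds and a `reaction` label, with `"LIKE"`, `"HAHA"` and `"LOVE"` used here as hypothetical label values.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def to_ts(year, month, day, hour=12):
    """Build a Unix timestamp (seconds, UTC) for the sample data."""
    return int(datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp())


def year_of(ts):
    """Return the calendar year (UTC) of a Unix timestamp."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).year


def reaction_totals(reactions):
    """Count how many times each reaction type was used."""
    return Counter(r["reaction"] for r in reactions)


def non_like_share_by_year(reactions):
    """For each year, the fraction of reactions that were not a plain like."""
    per_year = defaultdict(Counter)
    for r in reactions:
        per_year[year_of(r["timestamp"])][r["reaction"]] += 1
    shares = {}
    for year, counts in sorted(per_year.items()):
        total = sum(counts.values())
        shares[year] = (total - counts["LIKE"]) / total
    return shares


# Hypothetical records: fewer reactions in 2019 than 2016, but more variety.
sample_reactions = [
    {"timestamp": to_ts(2016, 6, 1), "reaction": "LIKE"},
    {"timestamp": to_ts(2016, 6, 2), "reaction": "LIKE"},
    {"timestamp": to_ts(2016, 7, 1), "reaction": "LIKE"},
    {"timestamp": to_ts(2016, 8, 1), "reaction": "HAHA"},
    {"timestamp": to_ts(2019, 3, 1), "reaction": "LIKE"},
    {"timestamp": to_ts(2019, 4, 1), "reaction": "LOVE"},
]

totals = reaction_totals(sample_reactions)
shares = non_like_share_by_year(sample_reactions)
print(dict(totals))
print(shares)
```

Looking at the share rather than the raw count is one way to separate "I react less overall" from "a larger fraction of my reactions are something other than a like".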

You’ve got a Friend in me

Finally, I looked at how the number of Facebook friends I added each month related to the different stages of my life.

The number of Facebook friends who I have added or accepted each month since I joined Facebook.

Unsurprisingly, the first month and a half of University is when I added the most friends, although I personally expected the number to be higher than ~60 (if you include the first three months of that section).

The second highest period was when I first joined Facebook. It is interesting to see that in my second month on the platform I added almost nobody; perhaps this suggests a ‘friend hangover’ from adding a lot of people in the first month. Again, I kind of expected this number to be higher than 30 though.

I’m also surprised that there are relatively few months where I haven’t added anyone. Especially since I finished University, I feel like I very rarely add people on Facebook and, if I do, it’s several at the same time. It is interesting to see that this isn’t the case and that there are several months through 2018 and 2019 where I added one or two new Facebook friends.
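
Spotting the months with no new friends needs one extra step compared with a plain count: months with zero additions never appear in the data, so they have to be filled in explicitly. Below is a small Python sketch of that idea (again an illustration, not the R code used here). It assumes each friend record has a `timestamp` in Unix seconds; the sample records and function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def to_ts(year, month, day, hour=12):
    """Build a Unix timestamp (seconds, UTC) for the sample data."""
    return int(datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp())


def month_key(ts):
    """Convert a Unix timestamp to a 'YYYY-MM' label in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")


def month_range(first, last):
    """Yield every 'YYYY-MM' label from first to last, inclusive."""
    year, month = map(int, first.split("-"))
    end_year, end_month = map(int, last.split("-"))
    while (year, month) <= (end_year, end_month):
        yield f"{year:04d}-{month:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1


def friends_per_month(friends, first, last):
    """Friends added per month, including months with zero additions."""
    counts = Counter(month_key(f["timestamp"]) for f in friends)
    return {label: counts.get(label, 0) for label in month_range(first, last)}


# Hypothetical records: two friends added in one month, one in another.
sample_friends = [
    {"timestamp": to_ts(2018, 11, 3)},
    {"timestamp": to_ts(2018, 11, 20)},
    {"timestamp": to_ts(2019, 1, 9)},
]

monthly = friends_per_month(sample_friends, "2018-10", "2019-02")
quiet_months = [label for label, n in monthly.items() if n == 0]
print(monthly)
print(quiet_months)
```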

That’s enough of that

This introductory post has hopefully given you an idea of what this series will be like. Although it is not surprising that Facebook knows how much I have posted, commented and reacted, I think it was worth mentioning again as deductions can still be made from this data - as I have hopefully shown here.

The remaining articles will give perhaps more interesting answers to the title question of this series. In the next part of the series, I take a look at the classic Big Brother question - are Facebook tracking my movements (spoiler: yes) and if so, how much do they know about my location?

If you enjoyed this post or want to make sure you see the rest of the series, please give me and my publication Data Slice a follow to stay up to date. Thanks for reading!

Created by

Chris Brownlie

Data Scientist working in the UK public sector. https://medium.com/@chris.brownlie

