More than 3 years after the historically significant referendum, the UK finally left the European Union officially last Friday (31 Jan 2020). As the Economist states, “nothing much will change after 2300 GMT on 31 Jan {…} people, goods and services will continue to move between Britain and the EU.” And that’s indeed true; life still goes on, as usual, the only little different is that Brexit became the most discussed topics for a while. But the divided public that turned the referendum upside down more than 3 years ago is still divided.

#Brexit for sure was the trending hashtag and topic on Twitter for a few days. After cleaning the data and removing #Brexit, #brexit or some typo Brexit hashtags, the most used hashtag related to Brexit on Twitter are shown below.

png

While we can see one of the popular hashtags is #BrexitCelebration, we can also see #NotMyBrexit or #BlackHistoryMonth.

To make sure the tweets' context while the hashtag #NotMyBrexit is used, I extract the words that used together with #NotMyBrexit and analyse the most frequent words into the word cloud below.

png

As we can see, when users used the#NotMyBrexit, it is mostly associated with #iameuropean, #stilleuropean, #rejoineu or words such as bad and blame. But of course, we can meanwhile see words such as good and happy is used as well. At first, I thought maybe people can also mean “not happy” or “not bad”, but if that is the case, the word “not” should appear in the word cloud as well, which as we can see is not the case.

png

And out of curiosity, I tried to produce a word cloud based on every tweet (~ > 200,000) I scrapped related to Brexit. Words such as people, Britain and done are used the most.

Now let’s look at this from the UK politicians' perspective

I scrapped the Twitter timeline of each of the UK politicians below.

png

It is obvious (and makes perfect sense) that Boris Johnson is the most frequent users on Twitter. Especially before the official Brexit.

png

Different from the average Twitter users, the politicians tend to use the hashtag #GetBrexitDone or #RealChange a lot.

Well, after 3 years wait, this makes sense.

png

Each of them seems to like to tag themselves, and their parties as well, such as conservative and labour. But also we can see Boris Johnson mentioned “done” quite frequently, and “NHS” and “pm” by Jeremy Corbyn and Theresa May respectively.

This can be confirmed by the chi-square keyness graph I generated below. Which measure the chi-sq of the most used word by that particular politician with reference to the other 2 politicians.

For Borish Johnson

png

For Jeremy Corbyn

png

For Theresa May

png

It is clear that the chi-sq of Theresa May @ing herself is the highest compared to everything else, which is in line with the word cloud above. And we can see the reference section of each of the graphs are usually the words used the most by the other 2 politicians.

Lastly, let’s look at the emotions of the tweets by each of these politicians.

png

Using the dictionary library in R or Python, we can obtain the positivity/ negativity of an individual’s tweets. While Theresa May sounds the most positive on Twitter, Jeremy Corbyn sounds relatively pessimistic in comparison.

But there might be more to look at if we dig deeper.

png

Through the Moral Foundation Dictionary, a clearer categorisation is revealed about each of the politicians.

I found it interesting that Theresa May has the highest score in every measure, Jeremy Corbyn is always ranked second and followed by Boris Johnson. Overall, their tweets ranked the highest in the care virtue and lowest in the sanctity virtue. I am guessing this is related to sound more appealing to their followers, and more importantly, citizens who are in the middle and do not have a particular political preference.

From a data scientist perspective, the first thing I think of when talking about Brexit is the movie Brexit: The Uncivil War by Benedict Cumberbatch or The Great Hac on Netflix. The first time I heard about Cambridge Analytica was back in the summer of 2017 while I was at Stanford University and everyone was talking about Michal Kosinski. About 1 year after, The Observer and the New York Times both wrote a report on Cambridge Analytica and individuals finally began to be aware of the issue. Undoubted, it is splendid that people finally pay attention to these technology giants and data privacy issues, but meanwhile, there are many more concerning issues emerging every day. For example, Deep Fake and its applications.

Recently, when I was talking with my friends about what researches they have been doing lately, technology companies such as Uber or Tinder frequently came up in the discussion. My friend told me that Uber will adjust their price based on users' phone battery level. Basically, if your phone is at a low battery level, Uber will ask for a higher price for your ride since they know you have a high possibility to accept the price anyway. Furthermore, based on the previous locations we have used the app, Uber will adjust and personalised every user’s acceptable price range. Last month when my partner and I were at Liverpool travelling, we ordered Uber at the same place together, and my Uber account is always asking around 3~5 £ more.

Alas, after knowing, and constantly utilising the power of data, I’m still a frequent user of Uber, Instagram and Twitter.