  • Anton Chernetskiy

Building your Facebook Ego Network

Hi there.


Here at Broscorp, we like to mess around with data and look for patterns. Today I want to show you how to build your ego network, that is, the graph of your friends and the connections between them, and run some graph analysis on it. We will look for natural groups and communities among my friends and try to guess what they have in common.

So our plan for today is the following:


  1. Scrape Facebook for a list of friends and the connections between them.

  2. Convert the scraped data to a graph that Gephi can read and run community detection algorithms.

  3. Visualise the graph to get some pretty infographics (everyone loves that).


Scraping Facebook data

You have not been able to get a list of friends from the Facebook API since 2014, so we'll have to scrape the actual website for this information. I've built many scrapers, and for this task I chose Selenium WebDriver. The reason is the abundance of dynamic content and JavaScript on Facebook pages. We can either reverse engineer the protocol the pages use to communicate with the server or use a real browser to prepare the page for us. The second option is better: it is easier to implement, and this kind of scraper is more stable (it won't break if the site changes its protocol).
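To make the "let a real browser prepare the page" idea concrete, here is a minimal sketch of one common trick: keep scrolling to the bottom until the page stops growing, so that lazily loaded entries end up in the DOM. This is my illustration, not necessarily what the script on GitHub does. The function name scroll_until_stable and its parameters are made up for this example; the only Selenium call it relies on is execute_script.

```python
import time


def scroll_until_stable(driver, pause_seconds=2.0, max_rounds=50):
    """Scroll to the bottom until the page height stops growing.

    `driver` is assumed to be a Selenium WebDriver (anything exposing
    `execute_script` works). Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause_seconds)  # give the page time to load more entries
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new appeared, so we assume the list is fully loaded.
            return round_number
        last_height = new_height
    return max_rounds
```

With Selenium installed, you would create a driver (for example webdriver.Firefox()), open the page with driver.get(...), call this function, and only then read the elements you need from the page.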

So, what you need to do is set up WebDriver, download my code from GitHub, and insert your Facebook ID and password at the beginning of the script. The code uses Python 3.

When you run the script, it grabs all your friends' names and the connections between them and stores them in a file. The script can run for quite a long time, depending on your internet connection speed, but you can stop it at any time and continue later: it stores intermediate results in the file state.pkl.
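The stop-and-continue behaviour can be sketched roughly like this. It is a simplified illustration under my own assumptions, not the exact code from the repository: the state is assumed to be a dict mapping each friend to the set of mutual friends, and fetch_mutual_friends is a hypothetical callable standing in for the actual scraping step.

```python
import os
import pickle


def load_state(path="state.pkl"):
    """Return the saved {friend: set_of_mutual_friends} dict, or an empty one."""
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return pickle.load(handle)
    return {}


def save_state(state, path="state.pkl"):
    """Write to a temporary file first, then rename it over the real one,
    so stopping the script mid-write cannot corrupt the checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        pickle.dump(state, handle)
    os.replace(tmp_path, path)


def crawl(friends, fetch_mutual_friends, path="state.pkl"):
    """Fetch connections for every friend, skipping those already saved."""
    state = load_state(path)
    for friend in friends:
        if friend in state:
            continue  # done in an earlier run
        state[friend] = set(fetch_mutual_friends(friend))
        save_state(state, path)  # checkpoint after every friend
    return state
```

Saving after every friend costs a little extra disk I/O, but it means an interrupted run loses at most the friend that was being processed.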


Converting data to GraphML

All right, now we have the list of friends and their connections, so we want to take a look at the graph. The most convenient way to do that is to store the graph in a "*.graphml" file, which can be visualized by a number of graph processing tools. We will use Gephi because it is a free and handy tool for working with graphs. To create the file, we need to execute "create_graph.py" in the same directory. This script creates a graph out of the downloaded data and performs community detection with a few different algorithms.
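If you are curious what ends up inside a GraphML file, here is a small sketch that writes one using only the Python standard library. It is not the code of "create_graph.py" (which may well use a graph library and also adds community labels); the write_graphml function and the shape of its input, a dict mapping each friend's name to the names they are connected to, are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def _tag(name):
    """Qualify a tag name with the GraphML namespace."""
    return f"{{{GRAPHML_NS}}}{name}"


def write_graphml(connections, path="ego_network.graphml"):
    """Write an undirected graph; returns (node_count, edge_count)."""
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(_tag("graphml"))
    # Declare a string attribute "name" on nodes, usable as a label in Gephi.
    ET.SubElement(root, _tag("key"), {
        "id": "d0", "for": "node", "attr.name": "name", "attr.type": "string",
    })
    graph = ET.SubElement(root, _tag("graph"),
                          {"id": "G", "edgedefault": "undirected"})

    names = sorted(set(connections)
                   | {n for friends in connections.values() for n in friends})
    ids = {name: f"n{index}" for index, name in enumerate(names)}
    for name in names:
        node = ET.SubElement(graph, _tag("node"), {"id": ids[name]})
        ET.SubElement(node, _tag("data"), {"key": "d0"}).text = name

    # Each friendship is usually seen from both sides; keep it only once.
    edges = sorted({tuple(sorted((a, b)))
                    for a, friends in connections.items()
                    for b in friends if a != b})
    for a, b in edges:
        ET.SubElement(graph, _tag("edge"),
                      {"source": ids[a], "target": ids[b]})

    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    return len(names), len(edges)
```

For example, write_graphml({"alice": ["bob"], "bob": ["alice"]}) produces a file with two nodes and a single edge between them.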


Visualising graph with Gephi

Gephi is a powerful open-source tool for exploring and visualizing graphs. We can open "ego_network.graphml" and play with it. First of all, I would set the username as the label of each node and then apply a layout.

Gephi visualisation

Then we can explore the graph, see detected communities, find 'influencers' and so forth.
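Gephi can compute all of this interactively, but as a quick sanity check you can also rank people by their number of connections in a few lines of Python. This is just the simplest possible notion of an 'influencer' (node degree), and the rank_by_degree function and its input format, a dict from a friend's name to the names they are connected to, are assumptions for this sketch.

```python
from collections import Counter


def rank_by_degree(connections, top=5):
    """Return the `top` people with the most connections as (name, degree)."""
    # Collapse both directions of a friendship into one undirected edge.
    edges = {frozenset((a, b))
             for a, friends in connections.items()
             for b in friends if a != b}
    degree = Counter()
    for edge in edges:
        for name in edge:
            degree[name] += 1
    # Highest degree first; ties are broken alphabetically for stable output.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top]
```

Keep in mind that in an ego network everybody is connected to you, so the interesting hubs are the friends who link otherwise separate groups.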

Let's go back to my case. I found seven more or less distinct communities of my friends and tried to guess where these people know each other from. All of them correspond to some event or interest I have. For example, I've got a very tight community of people from the summer school LVDS, because it took place in another city and none of them knows any of my other friends. However, many people I know from the university have also been my coworkers at some point.


Ego network with detected communities

That's all for today. If you find some parts particularly interesting and want more details, leave us a comment or write us an email.
