First of all happy new year! Lets make it a great Power BI year, one of my resolves is to do more blogging this year.
My wife was talking to me about the need for her to get some information from a Facebook page she is admin for. A project that is getting more and more traction across the us and the globe is the “buy nothing project” (http://buynothingproject.org/) . This project is a hyperlocal Facebook group that enables members to post anything you’d like to give away, lend, or share among neighbors in the same city. Instead of buying it in a store you can get it from a neighbor who doesn’t need to have that specific item any more.
In the project admins group a thread got started on how to get more information on these local group. So I thought lets see if I can get this information from Facebook using Power Query and into Excel. Some information that might be useful is to be able to see if the members of the local group actually live in the same city as the group or what is the percentage of male / female posts or what is the most active hour etc etc.
And I managed to get it working so lets take a look at how this works.
Lets get started, open Excel 2013, open the Power Query tab and select import from other sources, From Facebook:
Now you need to log in with your Facebook credentials that will be stored on your computer, The window that opens next asks for a “object” in the Facebook graph.
Facebook graph is the primary way for external apps to talk with Facebook. This allows you to get all the information you can think off from Facebook and Power Query can handle all this information. For more information see: https://developers.facebook.com/docs/graph-api/
In order to figure out what the ID of the object is what you want to get information on is open that page in the browser and copy the number at the end. This is your Object ID. I created a test group where I added some colleagues and friends.
Now each object in Facebook has many connections, you can find all the connections that are available from the API documentation mentioned above or you can select the connection you want to use from the dropdown in Power Query. In our case I want to see the member list:
I press OK and this opened the Power Query Editor which loaded the data from my Facebook group which I named “Members” (I got rid f the name column for privacy reasons):
I wanted to get some additional information so I click on the header of the object_link column, this opens a small list of items that get returned by Facebook:
One thing that I found out is that some items are not shown because Facebook wont return items that are not available for all members. In my case I want to get the location of the member if available. Many members don’t have their location so you want be able to select this by default. So what I did is filter the list down to only me by clicking on the header above the name field and selecting “Kasper de Jonge”. When I now click the object link we get a much larger list where I select “gender” and “location”:
This adds two columns but the location is still not usable as Facebook returns data in a hierarchical shape and not a neat row that we expect in Excel. Luckily Power Query knows how to handle this.
I clicked on the icons in the header again and select name:
This gives me the members with the information that I need for my group. I select Load to Data Model and press “Apply & Close”:
This loads the data into Excel and the Power Pivot model. I open the query again by clicking on the Workbook button in the Power Query tab. This opens a pane with all the queries in my workbook, I hove over the one that I want and click “Edit Query”:
Now that I have all the columns I remove the filter by clicking on the X mark next to the filtered row step:
This now loads in all the rows into the Power Pivot Model, I open the Power Pivot window by clicking on the PowerPivot tab and select “Manage”:
Now the next thing that I want to get is the posts done to this group. I again go to import from Facebook and this time I select “Feed” as connection. This will download all the posts done in this group. Again I make a selection of the fields that I want, similar as before and import the data into the Power Pivot model:
This gives us all kinds of interesting details on the posts made. One additional action that I took when importing the data was add some aggregations to the list. For example comments is not a cell in this table but actually a reference to another table, instead of importing all the comments made on this post I want to see the nr of comments. I can click on the arrows icons
This will give me a UI that allows me to choose what I want to do. Do I either want to Expand the data or Aggregate the rows. In this case I choose Aggregate by the number of ID’s of the comments:
Now the last thing I want to do is create a relationship between the members table and the feeds table so I can combine the information from both together. I open the Power Pivot window again and switch to the diagram view:
Here I drag the memberID on top of the ID column of members, thus creating a relationship. Some data I already know I want to report on is the number of members and number of feeds. I open the Data view again and add two calculated fields in the calculation area:
Nr of posts:=COUNTROWS(Feeds)
Nr of members:=COUNTROWS(‘Members’)
Now I am ready to do some analysis. I go back to Excel and create some simple reports:
From here there are tons of things we can do and create all kinds of reports from the data.