Getting data into Power Query with the twitter search API (how to hack PQ to use OAuth)

Last week I decided to try an interesting experiment. Lets see if I can get data from the Twitter REST API into Power Pivot using Power Query. Power Query already supports two out of three things that are needed for me to import this data:

  1. Import from a web url is already supported
  2. Parsing JSON data is already supported
  3. Logging into the service with OAuth is not supported for a datasource that is not developed by the Power Query team. Unfortunately even though most products use OAuth, no OAuth implementation is the same. Thanks to some hacking with Power Query I managed to work around this.

So lets go take a look at how we can solve this.

The first thing you need to have is a twitter account. That’s easy, next you should think about what kind of data you want to get, there is extensive API documentation that you can look though here to find the one that interest you the most: https://dev.twitter.com/rest/public

I decided to go with the Search API to get a list recent PowerPivot posts: https://dev.twitter.com/rest/public/search. Now the fun begins, first lets get familiar to what the search API is all about. Twitter provides a great API console from APIGee that you can use to play around with the API’s without any code. You can find that here: https://dev.twitter.com/rest/tools/console

I opened the console and selected the search API:

image

Next I signed in using my Twitter username and password. Typed in PowerPivot as the keyword to search for and pressed send. This will execute the API under my credentials:

image

Now this is “all” that  we need to get the data loaded into the data model and Power Query. I open Power Query and use “From Web” and paste in the URL for the API: “https://api.twitter.com/1.1/search/tweets.json?q=PowerPivot

image

This will give you an error message: bad request:

image

If you want to see what is really going on install a sniffer tool like fiddler, there we can see that twitter returns that we aren’t authenticated properly. I will spare you those details but if you want to try it yourself download fiddler here: Fiddler – Official Site.

Now how we can get around this? If the datasource would have been supported by PQ would have gotten a credentials popup or you could choose the right credentials in the Edit Setting dialog you can popup using the button on the screen. Alas these options are not available for us today, we need to start writing M code. So lets pop open the editor, this is the code we have so far:

let
Source = Web.Contents(“https://api.twitter.com/1.1/search/tweets.json?q=PowerPivot”)
in
Source

Very simple right? Now we make twitter aware of who we are by adding authentication to the request. This happens in OAuth by adding headers to the request. Here is where the hacking part starts, we can do this in Power Query by adding some options to the web.contents call as you can read in the help documentation here.

So we change the code by adding the headers that are shown in the dev console above in this fashion:

let

Source = Web.Contents(“https://api.twitter.com/1.1/search/tweets.json?q=PowerPivot&count=100″, [Headers=[#”Authorization”=”OAuth oauth_consumer_key=””YOURKEY””, oauth_nonce=””YOURCODE””, oauth_signature=””YOURSIGNATURE””, oauth_signature_method=””HMAC-SHA1″”, oauth_timestamp=””TIMESTAMP””, oauth_token=””YOURTOKEN””, oauth_version=””1.0″””]]
)
in
Source

 

You can just copy and paste the values from the dev console into Power Query and it will work. Unfortunately it will only work for a small period of time until the token is invalid again, if this was a supported data source in Power Query they would take care of refreshing the token and adding the right values to the header.

Now we see results actually coming back:

image

From here on it is just normal Power Query functionality as it returns JSON format

image

Now I can save and close and analyze the Twitter data inside Excel and Power View. Pretty cool right?