I recently got access to a csv file that contains twitter feeds filtered by Azure. I want to do some simple analytics on where the twitter users live in a visual way.
I am going to load the csv file into Excel using DataExplorer (download add-in for Excel 2012 here).
And I select Import from Csv, and select the file that I received. That opens the DataExplorer query editor and in this case I want to import the csv file completely without any changes. I do want to Use the first Row as Headers :
Now I import this data into Excel,
after importing the data into Excel I want to rename my query by double clicking Query1 and rename it to “Twitter Data” and press Load to data model to load it into the AS engine.
Now I want to plot the tweets on a map. I am going to use GeoFlow for that, as this is data that I got from the Internet there are some data quality issues. The current version of GeoFlow doesn’t allow us to filter data. So I am going to use a Excel trick to create a filtered down table by using a Query Table where I filter out all tweets where no region is defined.
So we go back to Excel, select an empty sheet and click on Data, Existing Connections, Tables and double click the Twitter data table:
And press OK in the next dialog to add a table. This adds a table to the Excel sheet:
Now right mouse click on the table and select Table, Edit DAX:
Now I change what is shown on the sheet by editing the DAX expression for this table:
Evaluate
calculatetable(
Summarize(
‘Twitter data’,
‘Twitter data'[ID]
,’Twitter data'[Location]
)
, ISBLANK(‘Twitter data'[Location]) = false)
Press OK and this will only give me that ID’s and Locations where the Location is not blank:
Comment: the results of the query will have the limitation of 1,000,000 rows in Excel but you could always aggregate the results by location using DAX. I don’t think many people want to plot more than a 1,000,000 locations.
Now go to Insert GeoFlow (Install add-in here):
In GeoFlow you now can pick columns from the filtered table:
Now select location as other and Map it:
Now select ID as height:
This plots all the tweets on Azure for the last week on the map, you might notice the Netherlands and Belgium are quite some hotspots on Azure (no I did not rig the results!)
Hope that you some idea’s on how to use Excel, DataExplorer and GeoFlow.