Luca Zavarella Blog

Let’s keep informed with “Data Explorer”

At Pass Summit 2011 a new project was announced. It’s a Microsoft SQL Azure Lab and its codename is Microsoft “Data Explorer”. According to the official blog (http://blogs.msdn.com/b/dataexplorer/), this new tool provides an innovative way to acquire new knowledge from the data that interest you. In a nutshell, Data Explorer allows you to combine data from multiple sources, to publish and share the result. In addition, you can generate data streams in the RESTful open format (Open Data Protocol), and they can then be used by other applications. Nonetheless we can still use Excel or PowerPivot to analyze the results.

Sources can be varied: Excel spreadsheets, text files, databases, Windows Azure Marketplace, etc.. For those who are not familiar with this resource, I strongly suggest you to keep an eye on the data services available to the Marketplace:

https://datamarket.azure.com/browse/Data

To tell the truth, as I read the above blog post, I was tempted to think of the Data Explorer as a "SSIS on Azure" addressed to the Power User. In fact, reading the response from Tim Mallalieu (Group Program Manager of Data Explorer) to the comment made to his post, I had a positive response to my first impression:

“…we originally thinking of ourselves as Self-Service ETL. As we talked to more folks and started partnering with other teams we realized that would be an area that we can add value but that there were more opportunities emerging.”

The typical operations of the ETL phase ( processing and organization of data in different formats) can be obtained thanks to Data Explorer Mashup. This is an image of the tool:

The flexibility in the manipulation of information is given by Data Explorer Formula Language. This is a formula-based Excel-style specific language:

Anyone wishing to know more can check the project page in addition to aforementioned blog:

http://www.microsoft.com/en-us/sqlazurelabs/labs/dataexplorer.aspx

In light of this new project, there is no doubt about the intention of Microsoft to get closer and closer to the Power User, providing him flexible and very easy to use tools for data analysis. The prime example of this is PowerPivot.

The question that remains is always the same: having in a company more Power User will implicitly mean having different data models representing the same reality. But this would inevitably lead to anarchical data management... What do you think about that?