Transforming Your Business with Microsoft: A Guide to Leveraging the Full Suite

The Microsoft analytics platform, led by Power BI, is generally seen as one of the most capable and advanced data platforms out there and has been for a while. Supporting Power BI is an array of other technologies and resources to help manage and prepare data for reporting.

With the multitude of options offered by Microsoft, it’s easy to feel overwhelmed when trying to decide which resource to use. This blog post aims to alleviate that confusion by providing a review of the significant offerings. The content is organized based on the typical ‘role’ each resource plays, although many could easily fit into various areas.

Sections

Storing Data

Data needs to be stored somewhere. The classic option is a relational database. Microsoft’s flagship product for this is SQL Server. SQL Server has a long history as an on-premises database. It is also available on Azure in several different forms, from a single database to massive servers that fully implement SQL Server in the cloud.

Stepping beyond relational databases, Cosmos DB emerges as a versatile NoSQL database designed for largescale, distributed, real-time, and always-on applications. Its flexibility is further enhanced by offering multiple APIs, allowing users to interact with it in a way that best suits their needs. For real-time analytics, Data Explorer is there for you. It uses the Kusto Query Language (KQL) to retrieve data stored in JSON files.

Of course, the simplest way to store data is as files. Azure’s best offering for this is Data Lake Gen2, which is a form of Azure’s blob storage but allows for the creation of a folder structure. Beyond its own database systems, Azure can also host several non-Microsoft databases such as MySQL, PostgreSQL, NoSQL, MongoDB, and more.

Moving Data

Any analytics process will require moving data from the source to one or more storage locations. The traditional tool for this is SQL Server Integration Services (SSIS), an on-premises tool for moving and transforming data. In the cloud, Azure Data Factory fills the role of SSIS for moving and doing basic data transformations. It is also a good orchestration tool for scheduling various tasks in other Azure resources.

Transforming Data

The tools I mentioned in the previous section can perform some data transformations; however, the ones in this section should be discussed separately because they offer more data transformation abilities and can also be used to move data in some cases.

HDInsight is a platform that implements several open-source Apache frameworks for storing, analyzing, and transforming data. Of note here is Apache Spark, which is a framework for working with data in notebooks through several different programming languages. Where Apache Spark is open-source, Databricks is the proprietary version of it. Azure can host Databricks clusters and run all sorts of code on data.

Analyzing real-time data can be done with Stream Analytics. It can rapidly perform analysis on streaming data. Machine learning generates predictions based on data. As part of SQL Server, SQL Server Machine Learning Services can handle machine learning workloads on-prem. Online, Machine Learning in Azure offers several options for using your data. Of course, many of the databases listed above in the ‘storage’ section can also transform data within themselves.

Viewing Data

The last step in the line is viewing data. On-premises, SQL Server Reporting Services (SSRS) has been around a long time and is still a good option for generating tightly formatted reports.

Power BI, as mentioned earlier, is a powerful report and dashboarding tool which can display data in a wide variety of visuals and perform complex calculations to implement specific logic in the report. In addition, there is also a Paginated Reports tool which implements SSRS in the Power BI portal. Although it is technically a type of database, Analysis Services is an in-memory data store used to feed reports. It is available both on-prem (SSAS) and in Azure. Also, Power BI semantic models are effectively instances of Analysis Services.

Combined Solutions

Over the last few years, Microsoft has moved towards combining several of its Azure offerings into a single ‘Swiss army knife’ style resource. These resources provide all the abilities of the individual resources while making it easier for them to work together.

The first of these is Azure Synapse, a convenient solution that combines Data Factory, Spark Notebooks (from HDInsight), Data Explorer, and a few SQL options while using Data Lake Gen2 for storage. It’s a one-stop-shop for moving, transforming, and storing data, making your data analytics tasks more manageable and efficient. Plus, it’s cost-efficient because Data Lake storage is cheap, and the compute options only incur costs when they are running, unlike a normal DB system.

Recently, Microsoft combined tools further with Fabric. Fabric brings everything from Synapse and more into the Power BI portal, so it is now a complete analytics offering under one roof. It also includes machine learning support and new features such as Direct Lake, which makes getting data into reports easier.

Conclusion

Microsoft has many offerings for working with data, covering the entire process from collecting to presenting data. Knowing which one to use can be difficult with so many options, but my hope is that this guide will give you an idea about where to start and which resources to use.

Share this post

Let's get started with a consultation

Get started with training