Utilize Azure SQL Data Warehouse & PolyBase for your Big-Data needs…

[1] Azure SQL Data Warehouse

Azure SQL Data Warehouse is a cloud-based, scale-out database capable of processing massive volumes of data, both relational and non-relational. Built on Microsoft's massively parallel processing (MPP) architecture, SQL Data Warehouse is designed to handle enterprise workloads.

SQL Data Warehouse:

Combines the SQL Server relational database with Azure cloud scale-out capabilities. You can increase, decrease, pause, or resume compute in seconds. You save costs by scaling out CPU when you need it, and cutting back usage during non-peak times.

Leverages the Azure platform. It is easy to deploy, seamlessly maintained, and fully fault tolerant thanks to automatic backups.

Complements the SQL Server ecosystem. You can develop with familiar SQL Server Transact-SQL (T-SQL) and tools.


The architecture consists of the following components:

Control node: The Control node manages and optimizes queries. It is the front end that interacts with all applications and connections. Under the surface, the Control node coordinates all of the data movement and computation required to run parallel queries on your distributed data. When you submit a T-SQL query to SQL Data Warehouse, the Control node transforms it into separate queries that run on each Compute node in parallel.

Compute nodes: The Compute nodes serve as the power behind SQL Data Warehouse. They are SQL Databases that store your data and process your query. The Compute nodes are the workers that run the parallel queries on your data. After processing, they pass the results back to the Control node. To finish the query, the Control node aggregates the results and returns the final result.

Storage: Your data is stored in Azure Blob storage. When Compute nodes interact with your data, they read and write directly from and to Blob storage. Since compute and storage are independent, SQL Data Warehouse can scale storage automatically and separately from compute, and vice versa. Azure Blob storage is also fully fault tolerant, and it streamlines the backup and restore process.

Data Movement Service (DMS): DMS moves data between the nodes. It gives the Compute nodes access to the data they need for joins and aggregations. DMS is not an Azure service. It is a Windows service that runs alongside SQL Database on all the nodes.
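
To make the division of labour between the nodes more concrete, here is a minimal Python sketch of the scatter-gather pattern described above. It is a toy simulation, not how SQL Data Warehouse is implemented: the cluster size, the function names, the hashing scheme, and the sample rows are all assumptions made for illustration. Rows are spread over "Compute nodes" by hashing a distribution key, each node computes a partial aggregate, the "Control node" merges the partial results, and a DMS-like shuffle moves rows so that equal key values end up on the same node.

```python
import zlib
from collections import defaultdict

NUM_COMPUTE_NODES = 4  # hypothetical cluster size, chosen for illustration


def node_for(value, num_nodes=NUM_COMPUTE_NODES):
    """Pick a node by hashing the distribution key (deterministic across runs)."""
    return zlib.crc32(str(value).encode("utf-8")) % num_nodes


def distribute(rows, key, num_nodes=NUM_COMPUTE_NODES):
    """Spread rows across nodes according to the hash of row[key]."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key], num_nodes)].append(row)
    return nodes


def compute_node_sum(local_rows):
    """Work done on ONE Compute node: partial SUM(amount) GROUP BY region."""
    partial = defaultdict(int)
    for row in local_rows:
        partial[row["region"]] += row["amount"]
    return dict(partial)


def control_node_sum(nodes):
    """Control node role: run the per-node query on every node, then merge."""
    final = defaultdict(int)
    for local_rows in nodes:  # conceptually these run in parallel
        for region, subtotal in compute_node_sum(local_rows).items():
            final[region] += subtotal
    return dict(final)


def shuffle(nodes, key):
    """DMS-like step: move rows so that equal values of `key` share a node."""
    all_rows = [row for local_rows in nodes for row in local_rows]
    return distribute(all_rows, key, len(nodes))


# Made-up sample data, initially distributed by customer_id.
sales = [
    {"customer_id": 1, "region": "east", "amount": 10},
    {"customer_id": 2, "region": "west", "amount": 20},
    {"customer_id": 3, "region": "east", "amount": 5},
    {"customer_id": 4, "region": "west", "amount": 7},
    {"customer_id": 5, "region": "east", "amount": 1},
]

nodes = distribute(sales, "customer_id")
totals = control_node_sum(nodes)      # east sums to 16, west sums to 27
by_region = shuffle(nodes, "region")  # each region now lives on a single node
print(totals)
```

The final totals do not depend on how the rows were distributed, which is why the Control node can safely merge partial results. The shuffle only matters for operations, such as joins, that need matching keys to sit on the same node.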


[2] Azure PolyBase

PolyBase lets you work with data from different sources by using familiar T-SQL commands. With it, you can query non-relational data held in Azure Blob storage as though it were a regular table, or import that data into SQL Data Warehouse. It also lets you run queries on external data in Hadoop, and those queries are optimized to push computation down to Hadoop.

Using only Transact-SQL (T-SQL) statements, you can import and export data between relational tables in SQL Server and non-relational data stored in Hadoop or Azure Blob storage. You can also query the external data from within a T-SQL query and join it with relational data.
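
As a rough illustration of what those T-SQL statements look like, the Python sketch below assembles the usual sequence for exposing a delimited text file in Azure Blob storage as an external table and then importing it with CREATE TABLE AS SELECT. Every object name, the storage account, the container, the columns, and the helper functions themselves are hypothetical. The sketch also assumes a publicly readable container: a private one would additionally need a database scoped credential referenced from the data source. The helpers use plain string formatting, so they are only suitable for trusted, hard-coded values.

```python
def external_data_source(name, container, account):
    """T-SQL that registers an Azure Blob storage container with PolyBase."""
    return (
        f"CREATE EXTERNAL DATA SOURCE {name}\n"
        "WITH (\n"
        "    TYPE = HADOOP,\n"
        f"    LOCATION = 'wasbs://{container}@{account}.blob.core.windows.net'\n"
        ");"
    )


def external_file_format(name, field_terminator=","):
    """T-SQL that describes the layout of the delimited text files."""
    return (
        f"CREATE EXTERNAL FILE FORMAT {name}\n"
        "WITH (\n"
        "    FORMAT_TYPE = DELIMITEDTEXT,\n"
        f"    FORMAT_OPTIONS (FIELD_TERMINATOR = '{field_terminator}')\n"
        ");"
    )


def external_table(name, columns, location, data_source, file_format):
    """T-SQL that maps a folder of files to a table-shaped schema."""
    column_list = ",\n".join(f"    {col} {sql_type}" for col, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {name} (\n{column_list}\n)\n"
        "WITH (\n"
        f"    LOCATION = '{location}',\n"
        f"    DATA_SOURCE = {data_source},\n"
        f"    FILE_FORMAT = {file_format}\n"
        ");"
    )


def import_with_ctas(target, source):
    """T-SQL that copies the external data into a distributed internal table."""
    return (
        f"CREATE TABLE {target}\n"
        "WITH (DISTRIBUTION = ROUND_ROBIN)\n"
        f"AS SELECT * FROM {source};"
    )


# All names below are placeholders chosen for this example.
statements = [
    external_data_source("SalesBlobStore", "salesdata", "examplestorageacct"),
    external_file_format("CsvFormat"),
    external_table(
        "dbo.ExtSales",
        [("customer_id", "INT"), ("region", "NVARCHAR(20)"), ("amount", "INT")],
        "/sales/",
        "SalesBlobStore",
        "CsvFormat",
    ),
    import_with_ctas("dbo.Sales", "dbo.ExtSales"),
]

print("\n\n".join(statements))
```

Once the external table exists, it can be queried and joined like any other table. The final CTAS statement is only needed when you want the data loaded into SQL Data Warehouse itself.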

