Enterprise Data Pipelines using Azure PaaS Services – An Introduction

September 19, 2017, 11:53 pm

≫ Next: Unattended install and configuration for SQL Server 2017 on Linux

≪ Previous: Instant Log Initialization for SQL Server in Azure

Authors: Rangarajan Srirangam, Mandar Inamdar, John Hoang
Reviewers: Murshed Zaman, Sanjay Mishra, Dimitri Furman, Mike Weiner, Kun Cheng

Overview

You might have worked with enterprise data pipelines using the SQL Server suite of products on-premises, or using virtual machines in the cloud. Now, you can build a similar enterprise data pipeline on Azure, comprised purely of Platform as a Service (PaaS) services. This article discusses data pipelines composed using Azure PaaS services, the main benefits, hybrid pipelines and some best practices.

Traditional Enterprise Data Pipeline

Enterprise data pipelines based on the SQL Server suite of products have been traditionally used on-premises to meet organizational needs of transaction processing, data analytics and reporting. SQL Server Database Engine is a favorite choice to host online transactional databases serving high volume transactions. With data marts hosted in SQL Server Analysis Services, Online Analytical Processing (OLAP), cubes can be built for analytical needs. With the addition of capabilities such as ColumnStore, using SQL Server Database Engine to host Data marts is another viable alternative. With SQL Server APS (formerly PDW), the scale and performance benefits inherent in a massively parallel processing (MPP) architecture can be realized. Reports with rich and powerful visualizations can be built using SQL Server Reporting Services. SQL Server Integration Services can be used to implement data transformation and/or data movement pipelines across the SQL Server product suite and can connect to external data sources and sinks. SQL Server Agent is a popular choice to schedule automated jobs for data maintenance and movement. Windows Server Active Directory provides a uniform identity management and single sign on solution. The SQL Server product suite (except for SQL Server APS which is an appliance) can be hosted on physical or virtual machines in an on-premises datacenter, or on virtual machines in the public cloud. This style of enterprise data warehouse (EDW) construction is depicted in figure 1 below

Figure 1:

While the above continues to be a perfectly valid style of deployment on-premises or in the Azure Cloud, there is now an alternate, modern way to construct the same architectural pipeline using Azure PaaS Services.

Enterprise Data Pipeline on Azure PaaS

Azure SQL Database which supports most of the Transact-SQL features available in SQL Server is your first choice OLTP RDBMS in Azure. Azure SQL Data Warehouse provides a cloud-based MPP data platform similar to SQL Server APS. Azure Analysis services, which is compatible with SQL Server Analysis Services, enables enterprise-grade data modeling in the cloud. Power BI is a suite of business analytics tools to deliver insights by connecting to a variety of data sources, enabling ad hoc analysis, creating rich visualizations, and publishing reports for web/mobile consumption. Azure Automation provides a way for users to automate and schedule tasks on the cloud. Using Azure Scheduler, you can run your jobs on simple or complex recurring schedules. Azure Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data. Azure Active Directory is a multi-tenant cloud based directory and identity management service integrated with Azure PaaS Services. The services mentioned above are made available in Azure regions as autonomous Microsoft managed PaaS services and are accessed through named service end points. This style of enterprise EDW construction on Cloud using PaaS services is depicted in figure 2 below. Note that this is not the only one way of composing PaaS services and is intended to illustrate a like-for-like replacement for equivalents in a traditional pipeline. With a large number of varied cloud services in Azure, many other possibilities open up – for example, your cloud Analytics pipeline could leverage Azure Data Lake Storage or Azure HD Insight depending on the requirement

Figure 2:

Benefits

Let us look at a few benefits you can get by constructing an enterprise data pipeline on Azure PaaS.

Rapid provisioning: Instances of PaaS services can be created quickly – typically seconds or minutes. For example, an Azure Analysis Services instance can be created and deployed usually within seconds.

Ease of Scale up/down: The services can be scaled up or scaled down easily. For example, databases in an Azure SQL Database Elastic Pool can automatically scale up from a few to thousands to meet demand. Azure SQL Database can scale up and down through changes to service tier and performance level or scale-out via elastic scale and elastic pool.

Global Availability: Each service is made available in multiple Azure regional datacenters across the world, so you can choose a region close to your user base. Check the Azure Regions page to find out more.

Cost Management: PaaS Services are costed on a pay-as-you go model, rather than a fixed cost model. You can terminate (or in some cases also pause) services once they are no-longer needed. For example, Azure SQL Data Warehouse is costed based on the provisioned Data Warehouse Units, which can be scaled up or down depending on your demand.

Hardware/Software update and patch management: The updating and patching of the hardware and software constituents of PaaS services is done by Microsoft, removing this concern from you.

Integration: The PaaS services provide mechanisms to integrate with each other, as applicable, and with other non-Azure services. Power BI, for example, can connect directly to Azure Analysis Services, Azure SQL Database, and Azure Data Warehouse. Azure Data Factory supports a variety of data sources including Amazon Redshift, DB2, MySQL, Oracle, PostgreSQL, SAP HANA, Sybase and Teradata.

Security and Compliance: Security features and compliance certifications continue to be added to each PaaS service. Azure SQL Database, for example, supports multiple security features such as authentication, authorization, row-level security, data masking, and auditing. The SQL Threat Detection feature in Azure SQL Database and in Azure SQL Data Warehouse helps detect potential security threats. Data movement with Azure Data Factory has been certified for HIPAA/HITECH, ISO/IEC 27001, ISO/IEC 27018 and CSA STAR.

Common Tools: There are common tools that can be used with most services. For example, the Azure Portal can be used for creating, deleting, modifying and monitoring resources. Most services also support deployment and management through Azure Resource Manager. Familiar tools like SQL Server Management Studio, DMVs etc., continue work with Azure SQL Database, Azure SQL Data Warehouse and Azure Analysis Services. Azure Data Factory is a common mechanism to move data between services. There is a common place to view the health of these and other Azure services.

High Availability & Disaster Recovery: Specific services offer built-in High Availability and Disaster Recovery options. For example, Azure SQL Database Active geo-replication enables you to configure readable secondary databases in the same or different data center locations (regions). Azure SQL Data Warehouse offers backups that can be locally and geographically redundant, while blob snapshots provide a way to restore a SQL Server database to a point in time.

SLAs: Each service provides an SLA. This SLA is specific to each service. For example, Azure Automation provides an SLA for the maximum time for a runbook job to start, and for the availability of the Azure Automation DSC agent.

Which style wins – Physical Server/VM based or PaaS service based?

Your requirements and goals determine what style is most applicable to you. Box products as well as Azure PaaS services continue to improve and interoperate, expanding your possibilities rather than narrowing them down. For example, in SQL Server the Managed Backup to Microsoft Azure manages and automates SQL Server backups to Microsoft Azure Blob storage. As another example, Azure Analysis Services provides data connections to both cloud data sources and on-premises data sources. These and many more mechanisms allow you to think of one style as complementing or supplementing another, letting you build hybrid architectures if needed.

Hybrid data pipeline Architectures

You can have hybrid architectures with a combination of PaaS and IaaS services in the same solution. Customers are already implementing such architectures today. For example, SQL Server can be deployed in Azure Virtual Machines working seamlessly with Azure SQL Data Warehouse or Azure Analysis Services. Data may reside in multiple SQL Servers on virtual machines representing application databases. Data from these databases relevant to DW could be consolidated into a data warehouse on Azure SQL Data Warehouse. Conversely Azure SQL Data Warehouse could act a Hub for warehouse data and feed multiple SQL Servers/SQL Server Analysis Services/Azure Analysis Services systems as spokes for analysis / ad hoc / reporting needs. Again, for ETL and data movement orchestration you can use SQL Server Integration Services as an alternative to ADF, or for better performance you can use PolyBase to load the data. Hybrid architectures allow a staged migration of on-premises solutions to Azure, and enable a part of the deployment to be on-premises during the transition. For example, customers who have traditional warehouses on premises and are in the process of migrating to cloud can choose to keep some of the data sources on premises. ETL processes can move data over a Site to Site VPN (S2S VPN) or Express Route connection between Azure and the on-premises data center.

Customer implementations in the real world

Let’s explore two different real-world Enterprise data pipelines built by customers

Cloud PaaS Centered Architecture

The first example, depicted above, caters to the back-end analytical needs of an Azure customer who decided to adopt a PaaS based solution architecture after considering the ease of provisioning and elastic scale abilities. There is an on-premise Data warehouse implemented in Hive that holds current and historical data. Data relevant to analytics is copied from the on-premise data warehouse to Azure blob storage using Azure Blob Storage REST APIs over HTTPS.

This data (consisting of sale transactions, financial data, web user clickstreams, master data and prior historical data) is loaded into Azure SQL Data Warehouse using Polybase every night. In-memory tabular data models are then built in Azure Analysis Services over aggregates extracted from Azure SQL Data Warehouse (every night post data loading) with incremental processing of Tabular data model. These compute and memory intensive aggregates are executed on Azure SQL Data Warehouse for efficiency while Azure Analysis Services stores these aggregates and supports a much larger user/connection concurrency than allowed by Azure SQL Data Warehouse. The data models could be hosted in Azure Analysis Services or on Power BI. The customer chose to host on Azure Analysis Services considering the size of data to be stored in memory.

Power BI is used to build visually appealing dashboards for Sales Performance, Click Stream Analysis, Web-site Traffic Patterns etc. A few hundred organization users access the Power BI dashboards. Power BI users and the Service Accounts to administer Azure Services are provisioned in Azure Active Directory.

Automated scheduled jobs needed for Azure SQL Data Warehouse and Azure Analysis Services (data loading, index maintenance, statistics update, scheduled cube processing, etc.) are executed using runbooks hosted in Azure Automation. A Disaster Recovery (DR) setup in a secondary Azure region can be easily implemented by periodically copying backups of Azure Analysis Services data and by enabling geo-redundant backups in Azure SQL Data Warehouse.

Hybrid IaaS – PaaS Architecture

The second example depicted above caters to the backend analytical needs of a customer in the financial sector. The solution is a hybrid in using IaaS and PaaS services on Azure. The on-premise deployment continues to host the multiple data sources including SQL Server 2016, Oracle, flat files etc., for legacy application compatibility reasons though there is thinking on migrating these sources eventually to cloud. These data sources store customer, financial, CRM, user clickstream and credit bureau data.

A few GBs of data need to be transferred every night and around 50 GB data over the weekend from on-premise to Azure. The customer has connected the on-premise data center to Azure via Azure Express Route given the data volumes involved and a need for SLAs on network availability. Data is moved from on-premise data sources into flat files using Windows Scheduled Jobs. This data is copied onto a local file server and compressed using algorithms like gzip on-premise. Compressed files are copied onto Azure Storage over HTTPS using the AzCopy tool.

The data loading workflow from Azure Blob Storage to Azure SQL Data Warehouse is orchestrated by SQL Server Integration services (SSIS) hosted in an Azure Virtual Machine. This allows the customer to reuse existing SSIS packages developed for SQL Server and available skill sets in the technical team on SSIS. The customer is also considering migrating these jobs to Azure Data Factory in the future. The data load jobs are scheduled using SQL Server Agent on the same SQL Server Virtual Machine.

SSIS then loads the data into Azure SQL Data Warehouse using the PolyBase engine. Some dimension tables are loaded without PolyBase engine (using direct Inserts) as the number of records are very small. For larger dimension tables with a considerable number of rows and columns, the customer chose to vertically divide the tables to overcome certain limitations of the PolyBase engine. Azure Automation is used to schedule on-cloud maintenance activities such as re-indexing, and statistics updating for Azure SQL Data Warehouse.

This customer implemented a “Hub and Spoke” pattern with heavy analytical queries served from the “hub” and smaller queries (fetching few rows with FILTER conditions) served from the “spoke”. Azure SQL Data Warehouse functions as the Hub and SQL Server 2016 running on Azure Virtual Machines functions as the Spoke. This SQL Server 2016 instance has a subset of the data from Azure SQL Data Warehouse. For Ad hoc reporting /dashboards another spoke using Azure Analysis Services with the In-Memory tabular data model option is being evaluated. More spokes can be added in future if needed.

Azure Active Directory Authentication is used for connecting to Azure SQL Data Warehouse. The user accounts are assigned specific resource classes in Azure SQL Data Warehouse depending on the quantum of the workload. Based on factors such as the nature of operations (data loading, maintenance activities, ad hoc queries, application queries, etc.) and priority, a relevant user account with the appropriate resource class connects to Azure SQL Data Warehouse.

Azure SQL Data Warehouse firewall rules are used to restrict IP ranges that connect to Azure SQL Data Warehouse. Encryption at rest is used with Azure Storage and Azure SQL Data Warehouse. All IaaS VMs and PaaS services are chosen from same Azure region. A Disaster Recovery (DR) setup in a secondary Azure region can be easily implemented by using Always on for SQL Server 2016, periodically copying backups of Azure Analysis Services data and by enabling geo-redundant backups in Azure SQL Data Warehouse.

Best Practices

To realize the benefits from a PaaS based solution architecture, it is important to follow a few best practices.

Co-locate Services: When you allocate a set of PaaS services that need to communicate with each other, consider co-locating the services in the same Azure region to reduce inter-service latencies and any cross-datacenter network traffic charges. Certain services may not be available in all regions, so you may need to plan your deployment accordingly.

Understand Service Limits: Each PaaS service works within certain limits, and these limits keep getting regularly raised. For example, there are resource limits on Azure SQL Database. You should design your application to work within the limits. There can be some difference between the capability in the SQL Server product suite and equivalent cloud service capabilities. For e.g. Azure Analysis Services currently supports only tabular models at the 1200 and 1400 Preview compatibility levels, whereas SQL Server Analysis Services also supports Multi-dimensional models

Design to Scale Out: As a general best practice on any public cloud, you can realize greater benefits by designing for scale out. Using multiple smaller data stores instead of a single large one, doing horizontal/vertical partitioning etc., are common strategies to consider when you need to work within service limits. Azure SQL Database even provides elastic database tools to scale out Azure SQL Databases.

Pausing and Scaling: Some services offer a pause and resume feature – you should consider using it to reduce costs. When you have a period of inactivity (say night times or weekends as applicable), you can pause Azure SQL Data Warehouse further reducing your bill. Not all services offer a pause feature though. But there are other options to control costs based on demand, such as changing performance levels within a tier, and moving from a higher performance tier to a lower tier. The reverse (scaling up or moving to a higher tier) applies for periods of high activity. If you know that the pausing/scaling needs to be done periodically, you can consider automating the pause/scale using Azure Automation.

Semantic layer with Azure Analysis Services: Creating an abstraction, or a semantic layer, between the data warehouse and the end user, as supported by Azure Analysis Services, makes it easier for users to query data. In addition, Azure Analysis Services in Cache mode supports more concurrent users than Azure SQL Data Warehouse. If the number of report users is large, and/or a lot of ad hoc reporting queries are used, then instead of connecting Power BI directly to your Azure SQL Data Warehouse, it is good to buffer such access though Azure Analysis Services.

Follow the best practices specific to each individual service: It is important to follow the documented best practices for each individual service, if mentioned in the documentation. For example, with Azure SQL Data Warehouse it is important to drain transactions before pausing or scaling an Azure SQL Data Warehouse, else the pause or scale can take a long time to complete. There is a number of other individual best practices for Azure SQL Data Warehouse as documented here and patterns and anti-patterns that are important to understand.

Refer to the product documentation of other services and AzureCAT and SQLCAT blogs for more information.

↧

Unattended install and configuration for SQL Server 2017 on Linux

October 3, 2017, 1:34 pm

≫ Next: Secure your on-premises network outbound connection to Azure SQL Database by locking down target IP addresses

≪ Previous: Enterprise Data Pipelines using Azure PaaS Services – An Introduction

SQL 2017 bits are generally available to customers today. One of the most notable milestones in the 2017 release is SQL Server on Linux. Setup has been relatively simple for SQL Server on Linux, but often there are questions around unattended install. For SQL Server on Linux, there are several capabilities that are useful in unattended install scenarios:

You can specify environment variables prior to the install that are picked up by the install process, to enable customization of SQL Server settings such as TCP port, Data/Log directories, etc.
You can pass command line options to Setup.
You can create a script that installs SQL Server and then customizes parameters post-install with the mssql-conf

In SQL Server documentation, we have a sample of what an unattended install process would look like for specific distributions here.

Given multiple installs SQLCAT had to do as a part of working with early versions of SQL Server on Linux, multiple supported platforms, and some common post-install tasks, we created a sample script that would further ease the unattended install process, and would allow you to:

Have one script across multiple distributions
Choose the components installed (SQL Server, SQL Agent, SQL FTS, SQL HA)
Configure common install parameters via a config file
Set up some SQL Server on Linux best practices we have documented such as tempdb configuration and processor affinity, which are not part of the core install
Enable you to specify a custom post-install .sql file to run once SQL Server is installed

Note: If you choose to install HA components, for RHEL have to enable subscription manager and add the right HA repository and for SLES you need to add the HA add-on. The configuration file has links to the documentation in both cases.

Here is how to use the unattended install script:

a. Download the script:

         git clone https://github.com/denzilribeiro/sqlunattended.git
         cd sqlunattended

b. To prevent sudo password prompts during unattended install:

sudo chown root:root sqlunattended.sh
sudo chmod 4755 sqlunattended.sh

c. Modify the conf file to specify the configuration options required, including what components to install, data/log directories, etc. Here is a snippet from the sqlunattended.conf:

#Components to install
INSTALL_SQL_AGENT=YES
INSTALL_FULLTEXT=NO
INSTALL_HA=NO

# This will set SQL processor affinity for all CPUs, we have seen perf improvements doing that on Linux
SQL_CPU_AFFINITY=YES

# This creates 8 tempdb files if NumCPUS >=8, or as many tempdb files as there are CPUs if NumCPUS < 8
SQL_CONFIGURE_TEMPDB_FILES=YES
SQL_TEMPDB_DATA_FOLDER=/mnt/data
SQL_TEMPDB_LOG_FOLDER=/mnt/log
SQL_TEMPDB_DATA_FILE_SIZE_MB=500
SQL_TEMPDB_LOG_FILE_SIZE_MB=100

d. Run the unattended install

/bin/bash sqlunattended.sh

We hope that this will help to further simplify the customization of the install process. Your feedback is appreciated!

↧

Secure your on-premises network outbound connection to Azure SQL Database by locking down target IP addresses

October 6, 2017, 3:32 pm

≫ Next: Performance implications of using multi-Statement TVFs with optional parameters

≪ Previous: Unattended install and configuration for SQL Server 2017 on Linux

Reviewed by: Dimitri Furman, John Hoang, Mike Weiner, Rajesh Setlem

Most people familiar with Azure SQL DB (aka SQL Database) are aware of the firewall setting requirements of SQL DB, which are very important to lock down connections to SQL DB on Azure. Details are documented here. The firewall setting helps restrict inbound connections to clients with specific IP addresses. Note that this refers to the firewall on the Azure SQL DB side. But when the SQL client is running on-premises behind internet gateway/proxy server, the firewall of the gateway/proxy server needs to be configured as well, to allow outbound connections to Azure. Other than opening TCP port 1433, which is the port SQL DB listens on, customers may also limit the IP addresses of target SQL DB that are allowed for outbound connections. This is important from security perspective, since it blocks unwanted network traffic and limits the extent of damage that a potential hacker/malware could do to copy confidential data out of the on-premises network. So how do you know your intended target Azure SQL DB IP addresses for outbound connection?

I had a customer using a simple approach to get his SQL DB IP address. Let me explain with an example. When you spin up a SQL database on Azure, it’s always associated with a logical SQL Server, which has a DNS name like this (shown on Azure portal SQL DB overview blade):

It’s easy to look up its IP address by ping.

The customer got his SQL DB IP address and configured his on-premises network firewall to allow outbound connection only to the IP address for his SQL client application. And it worked! But only until one day that IP address changed without notice. What happened?

SQL DB actually has a fixed set of public gateway IP addresses. These IP addresses have been published in this article on SQL DB connectivity (It’s also a very good read for you to understand the connectivity architecture of Azure SQL DB). Note that for each region, there could be up to 2 IP addresses, which provide enhanced high availability for the SQL DB gateway. The aforementioned customer had connection problem in the end because when the gateway failover occurred, its IP address switched to the secondary. So it’s very important to configure your on-premises firewall to allow outbound connections to both gateway IP addresses of each region. And here is the important part: these IP addresses won’t change. But equally important note: the regions with only primary IP address today eventually will have secondary IP address as well. For future updates on these secondary IPs please refer to this article.

If your on-premises application uses other Azure services besides SQL DB (e.g., storage, compute), and you desire to control outbound connections to all IP addresses of each Azure region, you may download the current IP addresses here. Note:

It includes SQL DB IP addresses as mentioned above.
It’s updated on a weekly basis so you need to update your on-premises firewall rules accordingly (unlike SQL DB, other Azure service IP addresses might change over time)

Summary: to help secure your on-premises network environment, it’s a best practice to configure your on-premises firewall and allow outbound connections on port 1433 only to your target SQL DB IP addresses listed here. To allow outbound connections to other Azure services besides SQL DB, the IP ranges can be downloaded here, which is updated every week.

↧

Performance implications of using multi-Statement TVFs with optional parameters

October 16, 2017, 8:00 am

≫ Next: Using SQL Service Broker for asynchronous external script (R / Python) execution in OLTP systems

≪ Previous: Secure your on-premises network outbound connection to Azure SQL Database by locking down target IP addresses

Authored by Arvind Shyamsundar (Microsoft)
Credits: Prasad Wagle, Srinivasa Babu Valluri, Arun Jayapal, Ranga Bondada, Anand Joseph (members of the Sayint by Zen3 team)
Reviewers: Rajesh Setlem, Joe Sack, Dimitri Furman, Denzil Ribeiro (Microsoft)

This blog post was inspired our recent work with the Sayint dev team, who are a part of Zen3 Infosolutions. SQLCAT has been working with them in their process of adopting SQL Server 2017. During a recent lab engagement, we found a set of seemingly innocuous query (anti-)patterns which were actually having a significant impact on overall performance. In this blog post, we show you what these anti-patterns are and how to remedy them.

Background

Table valued functions (TVFs) are a popular way for T-SQL developers to abstract their queries into reusable objects. There are two types of TVFs: inline and multi-statement. Broadly speaking, multi-statement TVFs offer more capabilities but come with a cost. That’s what we will look at in this post. Do note that we have previously alluded to some of these complications: our SQLCAT Guide to the Relational Engine has a classic article “Table-Valued Functions and tempdb Contention” which has a lot of detail and background – so, it is highly recommended that you read that article first! (Download the PDF linked in that older blog post and go to page 115 in the PDF).

What we saw

During the lab engagement, we were stress-testing a query which basically selected all the rows from a TVF. For reproduction purposes we have sample code below which works with the WideWorldImporters sample database. Here’s the code for the TVF which is equivalent to the initial state of the TVF initially implemented by the customer:

CREATE FUNCTION MSTVF
(@startOrderID INT=NULL, @endOrderID INT=NULL)
RETURNS
    @result TABLE (
        OrderID    INT,
        CustomerID INT)
AS
BEGIN
    INSERT @result
    SELECT OrderID,
           CustomerID
    FROM   Sales.Orders AS Ord
    WHERE  Ord.OrderID BETWEEN ISNULL(@startOrderID, Ord.OrderID) AND ISNULL(@endOrderID, Ord.OrderID);
    RETURN;
END

The actual call from the application is made via a stored procedure as defined below:

CREATE OR ALTER PROCEDURE getOrders(@startOrderID INT, @endOrderID INT)
AS
SELECT *
FROM   dbo.MSTVF(@startOrderID, @endOrderID);

And here is a sample procedure call with actual parameters:

EXEC getOrders 1, 100

If we review the execution plan (we used SET SHOWPLAN_ALL ON for convenience to obtain the below output) for the above, you will notice a couple of interesting things. Let’s take a look at the plan first:

EXEC getOrders 1, 100
  CREATE   PROCEDURE getOrders(@startOrderID INT, @endOrderID INT)  AS  SELECT *  FROM   dbo.MSTVF(@startOrderID, @endOrderID);
       |--Sequence
            |--Table-valued function(OBJECT:([WideWorldImporters].[dbo].[MSTVF]))
            |--Table Scan(OBJECT:([WideWorldImporters].[dbo].[MSTVF]))
    UDF: [WideWorldImporters].[dbo].[MSTVF]
      CREATE FUNCTION MSTVF  (@startOrderID INT=NULL, @endOrderID INT=NULL)  RETURNS       @result TABLE (          OrderID    INT,          CustomerID INT)  AS  BEGIN      INSERT @result      SELECT OrderID,             CustomerID      FROM   Sales.Orders AS Ord      WHERE  Ord.OrderID BETWEEN ISNULL(@startOrderID, Ord.OrderID) AND ISNULL(@endOrderID, Ord.OrderID)      OPTION (RECOMPILE);
                 |--Table Insert(OBJECT:([WideWorldImporters].[dbo].[MSTVF]), SET:([OrderID] = [WideWorldImporters].[Sales].[Orders].[OrderID] as [Ord].[OrderID],[CustomerID] = [WideWorldImporters].[Sales].[Orders].[CustomerID] as [Ord].[CustomerID]))
                      |--Index Scan(OBJECT:([WideWorldImporters].[Sales].[Orders].[FK_Sales_Orders_CustomerID] AS [Ord]))

From the above showplan, notice the following:

The query is scanning the Sales.Orders table (notice the Index Scan with the WHERE predicate) even though there is a clustered primary key on the OrderID column which could potentially have been used to seek to the range of records being accessed;
There is a Table Insert operation though we are not really inserting anything from our query above!

In fact, if you turn on STATISTICS IO and actually execute the query, you will notice the below:

Table '#A25547B1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Notice this is a ‘temp’ table! Keep this in mind, we tie back to this later in the post. We then used the OStress utility from the RML Utilities package to simulate a number of connections all querying the TVF at the same time. Here is the command-line we used:

ostress -S.\SQL2017 -dWideWorldImporters -q -Q"EXEC getOrders 1, 100" -n100 -r100

The above command line simulates 100 users, and repeats the SELECT query 100 times for each user. This query takes around 16.5 seconds to complete on a laptop with an SSD and an i7 CPU. Of great interest to us was the fact that TEMPDB was being used (we found this by using monitoring queries like the ones specified in our older blog post mentioned at the start of this article.)

Fix #1: Parameter Embedding

If you notice carefully, the above query is an example of ‘optional parameters’ wherein the same query caters to situations where there are specific values for the parameters as well as other cases where there are none. Due to the implementation of the query (specifically the usage of ISNULL(@paramname, ColName)) what ends up happening is that the query plan thus generated will not leverage any indexes on the table. While this query can be refactored to separate versions for cases where the parameter values are supplied, and where they are not, another viable option is to use OPTION (RECOMPILE) on the statement level. This is an acceptable solution in most cases because the cost of scanning the table is often far higher than the cost of recompiling this query. So here is how we used OPTION RECOMPILE in this case:

ALTER FUNCTION MSTVF
(@startOrderID INT=NULL, @endOrderID INT=NULL)
RETURNS
    @result TABLE (
        OrderID    INT,
        CustomerID INT)
AS
BEGIN
    INSERT @result
    SELECT OrderID,
           CustomerID
    FROM   Sales.Orders AS Ord
    WHERE  Ord.OrderID BETWEEN ISNULL(@startOrderID, Ord.OrderID) AND ISNULL(@endOrderID, Ord.OrderID)
    OPTION (RECOMPILE);
    RETURN;
END

We leave the rest of the code (the wrapper stored procedure and the actual parameters for the stored procedure call) unchanged. Here is the execution plan for the revised stored procedure / TVF:

EXEC getOrders 1, 100
  CREATE   PROCEDURE getOrders(@startOrderID INT, @endOrderID INT)  AS  SELECT *  FROM   dbo.MSTVF(@startOrderID, @endOrderID);
       |--Sequence
            |--Table-valued function(OBJECT:([WideWorldImporters].[dbo].[MSTVF]))
            |--Table Scan(OBJECT:([WideWorldImporters].[dbo].[MSTVF]))
    UDF: [WideWorldImporters].[dbo].[MSTVF]
        CREATE FUNCTION MSTVF  (@startOrderID INT=NULL, @endOrderID INT=NULL)  RETURNS       @result TABLE (          OrderID    INT,          CustomerID INT)  AS  BEGIN      INSERT @result      SELECT OrderID,             CustomerID      FROM   Sales.Orders AS Ord      WHERE  Ord.OrderID BETWEEN ISNULL(@startOrderID, Ord.OrderID) AND ISNULL(@endOrderID, Ord.OrderID)      OPTION (RECOMPILE);
                 |--Table Insert(OBJECT:([WideWorldImporters].[dbo].[MSTVF]), SET:([OrderID] = [WideWorldImporters].[Sales].[Orders].[OrderID] as [Ord].[OrderID],[CustomerID] = [WideWorldImporters].[Sales].[Orders].[CustomerID] as [Ord].[CustomerID]))
                      |--Clustered Index Seek(OBJECT:([WideWorldImporters].[Sales].[Orders].[PK_Sales_Orders] AS [Ord]), SEEK:([Ord].[OrderID] &amp;gt;= (1) AND [Ord].[OrderID] &amp;lt;= (100)) ORDERED FORWARD)

After using the above hint, the query does not scan the table (notice the Clustered Index Seek above) any more but we still see a some TEMPDB usage and consequent overall slowness. The OStress test now completes in around 3.3 seconds, which is obviously much better, but can improve it further?

Fix #2: Inline TVFs

If you read the article linked to initially, you probably guessed the real answer to this problem, which is the fact that we have actually implemented a multi-statement TVF (which still has only 1 statement in this case!) While multi-statement TVFs are not always bad, and recently there have been specific improvements in processing such TVFs, we really don’t need a multi-statement TVF here given we just have a single SELECT statement inside! Therefore, we proceeded to refactor the query as an inline TVF, and this is how the result looks:

CREATE FUNCTION ITVF
(@startOrderID INT=NULL, @endOrderID INT=NULL)
RETURNS TABLE
AS
RETURN
    (SELECT OrderID,
            CustomerID
     FROM   Sales.Orders AS Ord
     WHERE  Ord.OrderID BETWEEN ISNULL(@startOrderID, Ord.OrderID) AND ISNULL(@endOrderID, Ord.OrderID))

We alter the wrapper stored procedure as below:

CREATE OR ALTER PROCEDURE getOrders(@startOrderID INT, @endOrderID INT)
AS
SELECT *
FROM   dbo.ITVF(@startOrderID, @endOrderID)

Let’s look at the execution plan for the same set of parameters now:

EXEC getOrders 1, 100
    CREATE   PROCEDURE getOrders(@startOrderID INT, @endOrderID INT)  AS  SELECT *  FROM   dbo.ITVF(@startOrderID, @endOrderID)
       |--Index Scan(OBJECT:([WideWorldImporters].[Sales].[Orders].[FK_Sales_Orders_CustomerID] AS [Ord]),  WHERE:([WideWorldImporters].[Sales].[Orders].[OrderID] as [Ord].[OrderID]&amp;gt;=isnull([@startOrderID],[WideWorldImporters].[Sales].[Orders].[OrderID] as [Ord].[OrderID]) AND [WideWorldImporters].[Sales].[Orders].[OrderID] as [Ord].[OrderID]&amp;lt;=isnull([@endOrderID],[WideWorldImporters].[Sales].[Orders].[OrderID] as [Ord].[OrderID])))

We then used an equivalent OStress command line to stress test this new inline TVF:

ostress -S.\SQL2017 -dWideWorldImporters -q -Q"EXEC getOrders 1, 100" -n100 -r100

This version goes back to around 14 seconds in total. We no longer see the TEMPDB I/O though – by making the TVF as an inline one, we completely avoid having to store results in a temporary table variable, and directly ‘stream’ the results to the client. You would have probably spotted one remaining problem, which is the fact that the optional parameter anti-pattern is not actually addressed by the re-written inline TVF. This is by design, because an inline TVF cannot specify query hints. If we forced a query hint in the stored procedure and then run OStress, what do you think will happen? Let’s modify the stored procedure first:

CREATE OR ALTER PROCEDURE getOrders(@startOrderID INT, @endOrderID INT)
AS
SELECT *
FROM   dbo.ITVF(@startOrderID, @endOrderID)
OPTION (RECOMPILE)

The execution plan is below; as expected the parameter embedding optimization that the OPTION (RECOMPILE) enables, results in a Clustered Index Seek:

EXEC getOrders 1, 100
  CREATE   PROCEDURE getOrders(@startOrderID INT, @endOrderID INT)  AS  SELECT *  FROM   dbo.ITVF(@startOrderID, @endOrderID)  OPTION (RECOMPILE)
       |--Clustered Index Seek(OBJECT:([WideWorldImporters].[Sales].[Orders].[PK_Sales_Orders] AS [Ord]), SEEK:([Ord].[OrderID] >= (1) AND [Ord].[OrderID] <= (100)) ORDERED FORWARD)

Then we run the same OStress command line as before. If you try this, you will see that introducing OPTION (RECOMPILE) in the stored procedure further reduces execution time for query to around 1.8 seconds. This is because in this specific case, the cost of recompiling the statement each time was much lower than the cost of scanning the Sales.Orders table. In other cases where the cost of scanning was not as high, you may find OPTION (RECOMPILE) actually degrades performance. But that has to be checked in each case.

Side note: it is useful to note that the ‘magic parameter value’ of NULL has been used here to denote an ‘select all’ case – but we also see other ‘special’ values being used to represent these optional filters. In those cases instead of using ISNULL, the optional parameter is checked by using a Boolean OR operator. Also, this ‘optional ‘ pattern has other manifestations which are listed below:

ColumnName = @Param OR @Param IS NULL
ColumnName = COALESCE (@Param, ColumnName)

“Fix” #3: Mandatory parameters

Depending on your perspective, this last iteration can be taken as a true fix or just a test. For this, let’s write a version of the inline TVF which has all parameters mandatory, so that means we no longer need the ISNULL handling in the TVF.

CREATE FUNCTION ITVF_MandatoryParams
(@startOrderID INT, @endOrderID INT)
RETURNS TABLE
AS
RETURN
    (SELECT OrderID,
            CustomerID
     FROM   Sales.Orders AS Ord
     WHERE  Ord.OrderID BETWEEN @startOrderID AND @endOrderID)

Let’s make a corresponding stored procedure as well:

CREATE OR ALTER PROCEDURE getOrders_MandatoryParams(@startOrderID INT, @endOrderID INT)
AS
SELECT *
FROM   dbo.ITVF_MandatoryParams(@startOrderID, @endOrderID)

Here is the execution plan for this version with mandatory params:

EXEC getOrders_MandatoryParams 1, 100
      CREATE   PROCEDURE getOrders_MandatoryParams(@startOrderID INT, @endOrderID INT)  AS  SELECT *  FROM   dbo.ITVF_MandatoryParams(@startOrderID, @endOrderID)
       |--Clustered Index Seek(OBJECT:([WideWorldImporters].[Sales].[Orders].[PK_Sales_Orders] AS [Ord]), SEEK:([Ord].[OrderID] &amp;gt;= [@startOrderID] AND [Ord].[OrderID] &amp;lt;= [@endOrderID]) ORDERED FORWARD)

And then we repeat the OStress test with the same parameters but calling the new ITVF above:

ostress -S.\SQL2017 -dWideWorldImporters -q -Q"EXEC getOrders_MandatoryParams 1, 100" -n100 -r100

This version performs the best (as expected) (805 milliseconds). However, refactoring code into special versions for mandatory parameters is not always possible, which is why the OPTION (RECOMPILE) hint is in many cases the most viable option for a lot of customers.

Summary

Here are the key points we discussed above:

Multi-statement TVFs use table variables to store the result; thereby causing TEMPDB contention at scale
Inline TVFs do not need such temporary storage and scale much better
Beware of optional parameters in your TVFs. These can compound your performance problems. Whenever possible, replace TVFs which use optional parameters with specialized versions (with mandatory parameter values) for each case
OPTION (RECOMPILE) can be very handy to deal with the optional parameter problem, but that comes with its own cost. Be sure to test thoroughly at scale before finalizing the code

The OStress test results are also summarized below:

Scenario	Total test time
Multi-statement TVF with optional parameters	16.493 seconds
Multi-statement TVF with OPTION(RECOMPILE)	3.093 seconds
Inline TVF with optional parameters	14.707 seconds
Inline TVF with OPTION(RECOMPILE) at query level	1.803 seconds
Inline TVF with mandatory parameters	805 millisec

We want to hear from you!

In closing, the SQL team is eager to hear from you if you would like some improvements in scenarios involving optional parameters. Do you have a lot of such cases? Is OPTION (RECOMPILE) something which is acceptable for you? If not, what would you like to see in SQL Server and Azure SQL DB to help you deal with these optional parameter cases? Do let us know by leaving a comment!

↧

Using SQL Service Broker for asynchronous external script (R / Python) execution in OLTP systems

October 19, 2017, 11:06 am

≫ Next: Top Questions from New Users of Azure SQL Database

≪ Previous: Performance implications of using multi-Statement TVFs with optional parameters

Authored by Arvind Shyamsundar (Microsoft)
Credits: Prasad Wagle, Srinivasa Babu Valluri, Arun Jayapal, Ranga Bondada, Anand Joseph (members of the Sayint by Zen3 team)
Reviewers: Nellie Gustafsson, Umachandar Jayachandran, Dimitri Furman (Microsoft)

This blog post was inspired our recent work with the Sayint dev team, who are a part of Zen3 Infosolutions. SQLCAT has been working with them in their process of adopting SQL Server 2017. Microsoft introduced the ability to invoke external Python scripts in SQL Server 2017, and this capability to effectively move ‘intelligence’ closer to the data, was a big motivation factor for the Sayint team to adopt SQL Server 2017.

Sayint application Overview

The Sayint application has many layers through which data flows:

It all starts with the Automatic Speech Recognition (ASR) server. This server takes audio recordings of conversations between a customer and a call center executive, and then converts that to text.
This text is then recorded into a database and in parallel sent to a message queue.
The Sayint application uses several Python scripts which are operationalized in a separate application server to read the same call transcript from the message queue.
Once the message is read by the Python script; it then processes that call transcript in various ways, one of which is to find ‘n-grams’ in the text.
These n-grams are then recorded into the same database. The diagram below shows the data flow at a high level:

The multiple paths that the data takes, and the external application server, are moving parts which add to the latency and complexity of the solution. This is one of the places where the in-database execution of Python code in SQL Server 2017 appeared to be a valuable simplification for the Sayint development team.

Motivation

Given the capability for external script execution in SQL Server, one scenario is that some external (Python) scripts which are fairly lightweight, might be invoked very frequently in a OLTP system. In fact there are special optimizations now in SQL Server 2017 to directly invoke specific types of ML models using the new native / real-time scoring functions.

But what if you had a (more expensive / long running) external script which you ideally wanted to run as ‘quickly’ as possible when new data arrives, but did not want that expensive external script to block the OLTP insert operation?

That is the scenario we address in this blog post. Here, we will show you how you can use the asynchronous execution mechanism offered by SQL Server Service Broker to ‘queue’ up data inside SQL Server which can then be asynchronously passed to a Python script, and the results of that Python script then stored back into SQL Server.

This is effectively similar to the external message queue pattern but has some key advantages:

The solution is integrated within the data store, leading to fewer moving parts and lower complexity
Because the solution is in-database, we don’t need to make copies of the data. We just need to know what data has to be processed (effectively a ‘pointer to the data’ is what we need).

Service Broker also offers options to govern the number of readers of the queue, thereby ensuring predictable throughput without affecting core database operations.

Implementation Details

We assume that you have some background about how SQL Server Service Broker helps decouple execution of code inside of SQL Server. If not, please take some time to review the documentation on the feature. There are also many excellent articles in the public domain on the general application of Service Broker.

The chart below shows the sequence of operations that is involved in our solution:

With that overview in our mind, let us take a look at sample code which illustrates exactly how this works!

Tables and Indexes

First let’s create a table which contains the raw call data. The ASR application inserts into this table. Note the call_id which will serve as a ‘unique key’ for incoming calls.

CREATE TABLE dbo.BaseTable (
     call_id    INT PRIMARY KEY NOT NULL,
     callrecord NVARCHAR (MAX)
);

Next, let’s create a table which will store the ‘shredded’ n-grams corresponding to each call_id:

CREATE TABLE dbo.NGramTable (
     call_id   INT            NOT NULL,
     ngramid   INT            IDENTITY (1, 1) NOT NULL,
     ngramtext NVARCHAR (MAX),
     CONSTRAINT PK_NGramTable PRIMARY KEY (ngramid),
     INDEX CCI_NGramTable CLUSTERED COLUMNSTORE,
     INDEX NC_NGramTable NONCLUSTERED (call_id)
);

In the above table, note that we have a Clustered Columnstore Index declared on this table. Here’s why:

The primary purpose of that is to enable high performance querying on aggregates (such as COUNT.) The second purpose is to save on space.
Secondly, given that n-grams will be naturally repetitive in nature, the dictionary compression offered by Columnstore Indexes is really effective in saving storage space.

In the Sayint application, there is a need to perform full-text search queries over the n-grams (“show me all n-grams which contain a specific keyword”). To support those queries, we proceed to create a SQL Full-Text Index on the n-gram text column:

CREATE FULLTEXT CATALOG [mydb_catalog] WITH ACCENT_SENSITIVITY = ON

CREATE FULLTEXT INDEX ON [dbo].NGramTable
     (ngramtext)
     KEY INDEX PK_NGramTable
     ON mydb_catalog;

Service Broker Objects

With the tables and indexes created, we now proceed to create the necessary Service Broker constructs. First, we enable Service Broker for this database:

ALTER DATABASE mydb SET ENABLE_BROKER;

Well, that was simple enough! Next, let’s create message types and associated contracts:

CREATE MESSAGE TYPE [RequestMessage]
     VALIDATION = WELL_FORMED_XML;

CREATE MESSAGE TYPE [ReplyMessage]
     VALIDATION = WELL_FORMED_XML;

CREATE CONTRACT [SampleContract]
     ([RequestMessage] SENT BY INITIATOR, [ReplyMessage] SENT BY TARGET);

Lastly we create the Queues and associated Services. We have to do this for both the target and initiator, though the initiator queue will not really store any messages in our specific implementation.

CREATE QUEUE TargetQueue;

CREATE SERVICE [TargetService]
     ON QUEUE TargetQueue
     ([SampleContract]);

CREATE QUEUE InitiatorQueue;

CREATE SERVICE [InitiatorService]
     ON QUEUE InitiatorQueue;

Stored Procedures

We first create a stored procedure which takes a list of call_ids and compute the n-grams for those calls. This stored procedure is the one that actually invokes the Python code using sp_execute_external_script.

CREATE PROCEDURE GenerateNGrams
(
     -- varchar is okay here because we only have numbers
     @listOfCallIds varchar(max)
)
AS
BEGIN
     EXECUTE sp_execute_external_script @language = N'Python', @script = N'
import nltk
from nltk import corpus
import pandas as pd
import string
from revoscalepy import RxSqlServerData, rx_data_step
mydict = []
translate_table = dict((ord(char), None) for char in string.punctuation)
def generate_ngrams(row):
     inputText = row["InputText"]
     nopunc = inputText.translate(translate_table)
     raw_words = nltk.word_tokenize(nopunc)

     words = [word for word in raw_words if word not in corpus.stopwords.words("english")]

     my_bigrams = list(nltk.bigrams(words))
     my_trigrams = nltk.trigrams(words)

     for bigram in my_bigrams:
         rowdict = {}
         rowdict["call_id"] = row["call_id"]
         rowdict["ngram_text"] = (bigram[0] + " " + bigram[1])
         mydict.append(rowdict)
     return
mydict = []
result = InputDataSet.apply(lambda row: generate_ngrams(row), axis=1)
OutputDataSet = pd.DataFrame(mydict)
'
, @input_data_1 = N' SELECT B.call_id, B.callrecord as InputText FROM BaseTable B
WHERE EXISTS(SELECT * FROM STRING_SPLIT(@listOfCallIds, '','') as t WHERE CAST(t.value as int) = B.call_id)'
, @params = N'@listOfCallIds varchar(max)'
, @listOfCallIds = @listOfCallIds
END

Next, we create an “activation stored procedure” which is invoked by the Service Broker queue whenever there are messages on the queue. This activation procedure reads from the queue, constructs a list of call_id values and then invokes the previous “n-gram creator” stored procedure:

CREATE PROCEDURE TargetActivProc
AS
BEGIN
DECLARE @RecvReqDlgHandle AS UNIQUEIDENTIFIER;
DECLARE @RecvReqMsg AS XML;
DECLARE @RecvReqMsgName AS sysname;
DECLARE @listOfCallIds varchar(max);
WHILE (1 = 1)
     BEGIN
         BEGIN TRANSACTION;
         WAITFOR (RECEIVE TOP (1) @RecvReqDlgHandle = conversation_handle, @RecvReqMsg = message_body, @RecvReqMsgName = message_type_name FROM TargetQueue),  TIMEOUT 5000;
         IF (@@ROWCOUNT = 0)
             BEGIN
                 ROLLBACK;
                 BREAK;
             END
         IF @RecvReqMsgName = N'RequestMessage'
             BEGIN
                 -- Populate a local variable with a comma separated list of {call_id} values which are new
                 -- so that the external script invoked by GenerateNGrams can operate on the associated data
                 -- We avoid making a copy of the InputText itself as it will then occupy space in a temp table
                 -- as well as in a Pandas DF later on
                 SELECT @listOfCallIds = STRING_AGG(T.c.value('./@call_id', 'varchar(100)'), ',')
                 FROM   @RecvReqMsg.nodes('/Inserted') AS T(c);
                -- Call the SPEES wrapper procedure to generate n-grams and then insert those into the n-gram table
                 INSERT NGramTable (call_id, ngramtext)
                 EXECUTE GenerateNGrams @listOfCallIds;
                END CONVERSATION @RecvReqDlgHandle;
             END
         ELSE
             IF @RecvReqMsgName = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
                 BEGIN
                     END CONVERSATION @RecvReqDlgHandle;
                 END
             ELSE
                 IF @RecvReqMsgName = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error'
                     BEGIN
                         END CONVERSATION @RecvReqDlgHandle;
                     END
         COMMIT TRANSACTION;
     END
END

We then associate the activation procedure with the queue:

ALTER QUEUE TargetQueue WITH ACTIVATION (STATUS = ON, PROCEDURE_NAME = TargetActivProc, MAX_QUEUE_READERS = 5, EXECUTE AS SELF);

Trigger

We finally create a trigger on the BaseTable which will be invoked automatically for each INSERT operation. This trigger takes the list of call_id values which have been inserted and transforms that into a XML document (using FOR XML) which is then inserted into the Service Broker queue:

CREATE OR ALTER TRIGGER trg_Insert_BaseTable
     ON dbo.BaseTable
     FOR INSERT
     AS BEGIN
            DECLARE @InitDlgHandle AS UNIQUEIDENTIFIER;
            DECLARE @RequestMsg AS XML;

            SELECT @RequestMsg = (SELECT call_id
                                  FROM   Inserted
                                  FOR    XML AUTO);

            BEGIN TRANSACTION;
           BEGIN DIALOG @InitDlgHandle
                FROM SERVICE [InitiatorService]
                TO SERVICE N'TargetService'
                ON CONTRACT [SampleContract]
                WITH ENCRYPTION = OFF;

            -- Send a message on the conversation
            SEND ON CONVERSATION (@InitDlgHandle) MESSAGE TYPE [RequestMessage] (@RequestMsg);
           COMMIT TRANSACTION;
        END

Unit Testing

Let’s now unit test the operation of the trigger, activation procedure and the n-gram computation using Python! Let’s insert 4 rows in one go into the base table, simulating what the ASR application might do:

TRUNCATE TABLE BaseTable;
TRUNCATE TABLE NGramTable;
INSERT  INTO BaseTable
VALUES (1, 'the quick brown fox jumps over the lazy dog'),
(2, 'the lazy dog now jumps over the quick brown fox'),
(3, 'But I must explain to you how all this mistaken idea of denouncing of a pleasure and praising pain was born and I will give you a complete account of the system, and expound the actual teachings of the great explorer of the truth, the master-builder of human happiness. No one rejects, dislikes, or avoids pleasure itself, because it is pleasure, but because those who do not know how to pursue pleasure rationally encounter consequences that are extremely painful. Nor again is there anyone who loves or pursues or desires to obtain pain of itself, because it is pain, but occasionally circumstances occur in which toil and pain can procure him some great pleasure. To take a trivial example, which of us ever undertakes laborious physical exercise, except to obtain some advantage from it? But who has any right to find fault with a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids a pain that produces no resultant pleasure?'),
(4, 'On the other hand, we denounce with righteous indignation and dislike men who are so beguiled and demoralized by the charms of pleasure of the moment, so blinded by desire, that they cannot foresee the pain and trouble that are bound to ensue; and equal blame belongs to those who fail in their duty through weakness of will, which is the same as saying through shrinking from toil and pain. These cases are perfectly simple and easy to distinguish. In a free hour, when our power of choice is untrammeled and when nothing prevents our being able to do what we like best, every pleasure is to be welcomed and every pain avoided. But in certain circumstances and owing to the claims of duty or the obligations of business it will frequently occur that pleasures have to be repudiated and annoyances accepted. The wise man therefore always holds in these matters to this principle of selection: he rejects pleasures to secure other greater pleasures, or else he endures pains to avoid worse pains.');

(In case you are wondering, the last 2 text samples in the unit test are taken from this Wikipedia page!)

Then we can take a look at the n-gram table:

SELECT *
FROM   NGramTable;

As you can see below, n-grams have been computed for multiple call_id values! (The results below have been truncated for readability; there are values for call_id 4 as well in the actual table.)

Now, let’s try a representative full-text query:

SELECT T.*
FROM   CONTAINSTABLE(NGramTable, ngramtext, 'fox') FT
JOIN NGramTable T
ON T.ngramid = FT.[Key];

This query returns the following:

Stress Test

Let’s put this to the ultimate test – multiple threads concurrently inserting records. The ideal outcome is to see that the INSERT operations complete quickly, and the Python.exe processes run asynchronously and finish the n-gram generation in due course of time. To facilitate the stress test, we create a SEQUENCE object in SQL and an associated Stored procedure:

CREATE SEQUENCE CallIdsForTest AS INT
    START WITH 1
    INCREMENT BY 1;
GO

CREATE OR ALTER PROCEDURE InsertCall
AS
BEGIN
	SET NOCOUNT ON

	INSERT  INTO BaseTable
	VALUES (NEXT VALUE FOR CallIdsForTest, 'But I must explain to you how all this mistaken idea of denouncing of a pleasure and praising pain was born and I will give you a complete account of the system, and expound the actual teachings of the great explorer of the truth, the master-builder of human happiness. No one rejects, dislikes, or avoids pleasure itself, because it is pleasure, but because those who do not know how to pursue pleasure rationally encounter consequences that are extremely painful. Nor again is there anyone who loves or pursues or desires to obtain pain of itself, because it is pain, but occasionally circumstances occur in which toil and pain can procure him some great pleasure. To take a trivial example, which of us ever undertakes laborious physical exercise, except to obtain some advantage from it? But who has any right to find fault with a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids a pain that produces no resultant pleasure?')
END;

To simulate workload, we use the OStress utility from the RML Utilities package to simulate a number of connections. The specific OStress command line used is shown below:

ostress -S.\SQL2017 -dmydb -Q"exec InsertCall" -n100 -q

Each execution of the stored procedure should eventually result in 84 n-gram values being populated in the NGramTable. So with the above stress test, we should see 8400 records in the table. When we run the OStress command line, the 100 concurrent calls to the InsertCall stored procedure finish very quickly – in 1.403 seconds. This is good, because we do not want the front-end stored procedure to block in any way. The time it takes the NGramTable to be completely populated with the 8400 records is around 88 seconds. Almost all that time is actually spent in Python.exe crunching away at the text (converting to lower case, tokenizing, generating n-grams etc.)

Resource Governance

When running the above stress test (with default settings for SQL Server), you will notice that the Python.exe instances end up using almost 100% of the CPU on the machine. This is expected because text processing is a CPU-bound activity. To balance the resource usage and restrict the external Python processes from using too much CPU, we can use the external resource governor setting for maximum CPU used by external instances of R / Python. It is as easy as running the code below, no restart required!

ALTER EXTERNAL RESOURCE POOL [default] WITH (max_cpu_percent=30, AFFINITY CPU = AUTO)

ALTER RESOURCE GOVERNOR RECONFIGURE;

After setting this external pool CPU limit to 30%, the overall CPU usage in the system is reduced, with the obvious tradeoff in overall latency in populating the n-grams into their table. In this case, the total time to populate the 8400 n-grams is around 234 seconds, which when you think about it is just right (we would expect processing time to increase to around 3x the old duration, given we have just 30% of CPU resources.)

So by adjusting the Resource Governor limit you can achieve the right balance between resource usage on the system and external script processing time. More information about Resource Pools for external scripts is here and here.

Summary

In the actual Sayint application, the ability to query for n-grams containing a specific keyword is crucial. With this workflow within SQL Server 2017, that requirement is elegantly accomplished using a combination of several SQL Server features alongside the in-DB Python code – with the advantages of lower latency, data security and no data duplication.

↧

SQLCAT @PASS Summit 2017

October 30, 2017, 8:57 am

≫ Next: Recent Updates to setting up SQL Server Availability Groups in Azure VM with AAD Domain Services

≪ Previous: Top Questions from New Users of Azure SQL Database

Are you coming to the PASS Summit 2017 in Seattle? SQLCAT will be in full force at the PASS Summit 2017, and we will also bring along our colleagues from the broader AzureCAT team, including the newly formed DataCAT team.

SQLCAT Sessions

SQLCAT sessions are unique. We bring in real customer stories and present their deployments, architectures, challenges and lessons learned. This year at the PASS Summit, we will have 4 sessions – each one filled with rich learnings from real world customer deployments.

SQL Server on Linux: DBA Focused Lessons Learned from Early Deployments

Building a Graph Database Application with SQL Server 2017 and Azure SQL Database

Real-world SQL Server R Services

How Microsoft Sales Evolved from a Monolithic On-Prem Solution to a Scaleout Solution in Azure

SQL Clinic

Have a technical question, a troubleshooting challenge, want to have an architecture discussion, want to find ways to move your data solutions to Azure, or want to find best ways to upgrade your SQL Server? SQL Clinic is the place you want to be at. SQL Clinic is the hub of technical experts from SQLCAT, Tiger team, SQL PG, CSS and others. Whether you want a makeover of your deployment or need an open-heart surgery, the experts at SQL Clinic will have the right advice for you. Find all your answers in one place!

Bonus

AzureCAT is hiring for various opportunities across the team, including data experts in the areas of Oracle, Teradata, open source data solutions, big data technologies, data science and related areas. Come talk to a CAT!

And More …

That’s not all. SQLCAT will be involved in many more events and customer conversations during the Summit. If you have a suggestion on how we can make your experience at the PASS Summit more effective and more productive, don’t hesitate to leave a note.

Thanks, and see you all at the PASS Summit 2017 in Seattle. You are coming, right?

↧

Recent Updates to setting up SQL Server Availability Groups in Azure VM with AAD Domain Services

October 30, 2017, 4:07 pm

≫ Next: SQL Server VLDB in Azure: DBA Tasks Made Simple

≪ Previous: SQLCAT @PASS Summit 2017

Reviewed by: Dimitri Furman, Kun Cheng

This blog is an extension to the one that was published in February 2017 . In these eight months there has been some notable improvements to AAD Domain Services (AAD DS) and Azure Virtual Network (VNET) capabilities. Now, it’s even more easier and convenient to leverage AAD DS for setting up SQL Server Availability Groups (AG) in Azure VMs.

As a recap, following scenarios were covered in the blog linked above:

Scenario 1: Enabling AAD Domain Services in Classic Virtual Network, deploying two SQL Server 2016 Classic VMs (Windows Server 2012), and then setting up AG

Scenario 2: Leveraging AAD Domain Services enabled in Classic Virtual Network from an ARM virtual network by adding an ARM based SQL Server 2016 VM as a replica to the existing AG.

Summary of recent improvements to AAD DS and new VNET capability:

AAD DS is generally available in many Azure regions now. Check here to find out availability by region (AAD DS is under Security + Identity).
General availability of AAD DS support for Azure Resource Manager (ARM) based VNET. This was eagerly expected by customers. You can read more about it here.
General availability of AAD DS in the new Azure Portal (portal.azure.com).
Public preview for Global VNET Peering. VNET peering is a powerful feature allowing you to peer virtual networks in Azure. Peering VNETs in the same region is generally available, and the ability to peer VNETs across different Azure regions is in preview. You can read more about it here.

With these improvements, scenarios covered in previous blog can be done in a different way. A new scenario is enabled as well.

Scenario 1: This can be implemented in ARM only mode with no need to create any classic resources. Good news is it can be done in new Azure Portal (portal.azure.com).

Scenario 2: Since AAD DS can be enabled on ARM VNET now, you can peer it with another ARM VNET, or with a Classic VNET if there is a specific need.

Scenario 3: This is a new scenario possible with Global VNET Peering to connect an Azure region designated as your DR with your primary Azure region. Asynchronous AG replica in your Azure DR region would connect to primary replica over VNET Peering and provide failover capabilities in the event of disaster. We tried this setup and it works just as expected.

Go ahead and explore these new capabilities and scenarios and let us know if you have any questions or comments.

↧

SQL Server VLDB in Azure: DBA Tasks Made Simple

January 22, 2018, 2:29 pm

≫ Next: Azure SQL DB Managed Instance – sp_readmierrorlog

≪ Previous: Recent Updates to setting up SQL Server Availability Groups in Azure VM with AAD Domain Services

Reviewed by: Rajesh Setlem, Mike Weiner, Xiaochen Wu

With thanks to Joey D’Antoni (blog) for asking a great question that prompted this article.

As any experienced DBA knows, supporting a very large database (VLDB) tends to be exponentially more complex than supporting smaller databases. The definition of VLDB has been shifting in recent years, but most DBAs would agree that complexities become significant starting with database size in the 3-5 TB range. A few obvious examples of tasks that are more challenging with a VLDB are backup/restore, consistency checking, HA/DR, partitioning, etc.

While these challenges are inherently present in any VLDB, the platform used for hosting the database can sometimes make them easier. In this article, we will show how using Azure as the cloud platform for SQL Server can simplify some of these common challenges.

To be clear from the start, we will be talking about databases in the 3-30 TB range running on Azure IaaS VMs. At those sizes, the traditional database operation methods tend to be inefficient, or sometimes break down completely. For smaller databases, the approaches described in this article may be unnecessary (though they would still work).

A VLDB database

In the examples in this article, we will be using a 12 TB database, appropriately named VLDB01. The database was created on a SQL Server VM in Azure by loading data from text files in Azure Blob Storage, using PolyBase. The database was created with files stored directly in Azure blob storage, using a Premium Storage account.

The ability to store database files directly in blob storage, rather than on disk, is what enables the scenarios described in this article. Starting with SQL Server 2014, in addition to using files stored on a local disk or on a UNC path, SQL Server also supports placing database files as page blobs in Azure Blob Storage, and accessing them over an http(s) endpoint. This lets SQL Server use certain features of Azure Blob Storage that are not necessarily available in the more traditional storage subsystems. One example of this is described in Instant Log Initialization for SQL Server in Azure. This article will cover additional scenarios, with emphasis on simplifying operational tasks for a VLDB.

Database file layout for the VLDB01 database is described by the output from the following query:

SELECT name,
       physical_name,
       SUM(CAST(FILEPROPERTY(name, 'SpaceUsed') AS bigint) * 8192.0/1024/1024/1024/1024) AS space_used_TB,
       SUM(CAST((size/128.0) AS decimal(15,2))/1024/1024) AS file_size_TB
FROM sys.database_files
GROUP BY GROUPING SETS ((name, physical_name),());

name	physical_name	space_used_TB	file_size_TB
VLDB01	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01.mdf	0.0000068	0.0000687
VLDB01_File1	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File1.ndf	1.5332245	2.0000000
VLDB01_File2	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File2.ndf	1.5351921	2.0000000
VLDB01_File3	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File3.ndf	1.5322309	2.0000000
VLDB01_File4	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File4.ndf	1.5393468	2.0000000
VLDB01_File5	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File5.ndf	1.5361704	2.0000000
VLDB01_File6	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File6.ndf	1.5354084	2.0000000
VLDB01_File7	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File7.ndf	1.5347077	2.0000000
VLDB01_File8	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File8.ndf	1.5317377	2.0000000
VLDB01_log	https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_log.ldf	0.0281003	0.1874998
		12.3061257	16.1875685

Each data file, other than the PRIMARY filegroup data file and the log file, uses a 2 TB Premium Storage page blob, the equivalent of a P40 disk. This provides maximum per-blob storage throughput and IOPS available in Azure Premium Storage today. The number of data files, eight for this particular database, was chosen to allow the database to grow by another ~4 TB without any changes in the storage layout. Even if the data could fit into fewer data files, using more files lets us increase the overall storage throughput and IOPS for the database (subject to VM level limits; more on this in the Limitations section below).

Backup and restore

Let’s see how long it would take to back up this 12 TB database using the traditional streaming backup to URL. Since each SQL Server backup blob cannot exceed ~195 GB (see Backing up a VLDB to Azure Blob Storage for details), we have to stripe the backup over multiple blobs. In this case, we are using the maximum allowed number of stripes, 64, to ensure we do not hit the 195 GB per-stripe limit. That said, for a 12 TB database, we could use fewer stripes, assuming reasonably high backup compression ratio. Here is the abbreviated BACKUP DATABASE command (omitting most stripes), with parameters optimized for backup to URL as described in Backing up a VLDB to Azure Blob Storage:

BACKUP DATABASE VLDB01 TO
URL = 'https://storageaccount.blob.core.windows.net/backup/vldb01_stripe1.bak',
...
URL = 'https://storageaccount.blob.core.windows.net/backup/vldb01_stripe64.bak'
WITH COMPRESSION, CHECKSUM, MAXTRANSFERSIZE = 4194304, BLOCKSIZE = 65536, STATS = 1;

On a GS5 VM (32 cores, 448 GB RAM, 20000 Mbps network bandwidth), this took nearly 17 hours:

...
99 percent processed.
Processed 848 pages for database 'VLDB01', file 'VLDB01' on file 1.
Processed 206278432 pages for database 'VLDB01', file 'VLDB01_File2' on file 1.
Processed 206014904 pages for database 'VLDB01', file 'VLDB01_File1' on file 1.
Processed 205881752 pages for database 'VLDB01', file 'VLDB01_File3' on file 1.
Processed 206835160 pages for database 'VLDB01', file 'VLDB01_File4' on file 1.
Processed 206409496 pages for database 'VLDB01', file 'VLDB01_File5' on file 1.
Processed 206307512 pages for database 'VLDB01', file 'VLDB01_File6' on file 1.
Processed 206213568 pages for database 'VLDB01', file 'VLDB01_File7' on file 1.
Processed 205815696 pages for database 'VLDB01', file 'VLDB01_File8' on file 1.
100 percent processed.
Processed 2 pages for database 'VLDB01', file 'VLDB01_log' on file 1.
BACKUP DATABASE successfully processed 1649757370 pages in 60288.832 seconds (213.783 MB/sec).

While for some applications this might be acceptable (after all, this is a potentially infrequent full backup; more frequent differential and transaction log backups would take less time), a restore of this database would take at least as long as the full backup took, i.e. close to a day. There are few applications that can accept a downtime that long.

Let’s see what would happen if instead of a streaming backup, we use a file-snapshot backup, available if database files are stored directly in Azure Blob Storage.

BACKUP DATABASE VLDB01
TO URL = 'https://standardstorageaccount.blob.core.windows.net/backup/VLDB01_snapshot1.bak'
WITH FILE_SNAPSHOT;

Processed 0 pages for database 'VLDB01', file 'VLDB01' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File1' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File2' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File3' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File4' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File5' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File6' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File7' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File8' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_log' on file 1.
BACKUP DATABASE successfully processed 0 pages in 0.151 seconds (0.000 MB/sec).

This backup completed in 16 seconds! This is possible because no data movement has actually occurred, unlike in the streaming backup case. Instead, a storage snapshot was created for each database file, with SQL Server ensuring backup consistency for the entire database by briefly freezing write IO on the database. The backup file referenced in the BACKUP DATABASE statement above (VLDB01_snapshot1.bak) stores pointers to these snapshots, and other backup metadata.

We should note that the database was idle during this backup. When taking the same backup on the same GS5 VM running queries with 60% CPU utilization, and reading ~1 GB/s from database files in blob storage, the backup took 1 minute 29 seconds. The backup took 1 minute 55 seconds when a bulk load into a table was running. But even though file-snapshot backups get slower on a busy instance, they are still orders of magnitude faster than streaming backups.

Let’s try restoring this backup on another SQL Server instance on a different VM:

RESTORE DATABASE VLDB01 FROM URL = 'https://standardstorageaccount.blob.core.windows.net/backup/VLDB01_snapshot1.bak'
WITH
MOVE 'VLDB01' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01.mdf',
MOVE 'VLDB01_File1' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File1.ndf',
MOVE 'VLDB01_File2' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File2.ndf',
MOVE 'VLDB01_File3' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File3.ndf',
MOVE 'VLDB01_File4' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File4.ndf',
MOVE 'VLDB01_File5' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File5.ndf',
MOVE 'VLDB01_File6' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File6.ndf',
MOVE 'VLDB01_File7' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File7.ndf',
MOVE 'VLDB01_File8' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File8.ndf',
MOVE 'VLDB01_log' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_log.ldf',
RECOVERY, REPLACE;

Processed 0 pages for database 'VLDB01', file 'VLDB01' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File1' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File2' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File3' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File4' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File5' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File6' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File7' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File8' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_log' on file 1.
RESTORE DATABASE successfully processed 0 pages in 1.530 seconds (0.000 MB/sec).

This restore completed in 8 seconds on an idle instance. For restore, concurrent load does not affect timing as much as it does for backup. On a busy instance with the same CPU and storage utilization as above, restore completed in 10 seconds. During restore, no data movement is required if the same storage account is used for the restored database, even though a new set of database files (blobs) is created. Azure Blob Storage ensures that the initial content of these new blobs is identical to the content of the backup snapshots.

You might notice that the output from the backup and restore commands above looks a little unusual, with the page and throughput statistics reported as zeroes. This is because SQL Server does not have them. The backup and restore are actually performed at the Azure Blob Storage level, with SQL Server calling blob storage API to create storage snapshots of all database files as a part of backup, or create blobs from snapshots as a part of restore. You may also notice that the backup and restore times in the output are much shorter compared to the actual duration. This is because they are reported for a specific phase (data copy) of the overall backup/restore process, which for file-snapshot backup/restore is very quick, compared to the overall duration of the operation. See this blog by Nacho Alonso Portillo for details.

These examples show that for a VLDB with files stored directly in Azure Blob Storage, backup/restore scenarios can be greatly simplified by using file-snapshot backups, in much the same way as SAN storage snapshots, if used correctly, simplify these scenarios for large on-premises databases.

Consistency checking

What about another task that is challenging with a VLDB, namely consistency checking (DBCC CHECKDB)? For a multi-TB database, the traditional on-premises solutions usually involve some combination of limited (PHYSICAL_ONLY) and/or partial (at the table/filegroup level) checks. This compromise is often required to complete the check within the available maintenance window, and without causing excessive resource contention between the check and the regular database workload. Sometimes, the check is done only on a replica (if using availability groups), or on a restored copy of the database. Neither is ideal: using an AG replica increases resource utilization on the replica, potentially affecting other workloads running on the replica and AG replication itself (not to mention that corruption could exist on some replicas but not on others). Checking a copy of the database restored on a separate instance requires a lengthy (hours, days) restore process, implying that corruption on the source database could potentially occur while this restore is in progress, and remain undetected until the next check.

How does this look in Azure? We have just shown that the entire backup/restore process for a 12 TB database with files in Azure blob storage completed in a couple of minutes, even on a busy instance. This provides an elegant solution for checking consistency of a VLDB in Azure: restore a file-snapshot backup on a separate VM, and run consistency check against the restored copy of the database. This VM can be stopped most of the time to save costs, and only started periodically to run consistency checks. The size of this VM can be different from the size of the VM hosting the database, to strike the right balance between the VM cost and the duration of the check, which is determined by available compute resources. Most importantly, the checks will be done on a “fresh” copy of the database, minimizing the likelihood of corruption happening on the source database while the copy is being created.

Reporting and analytics

If we take a step back, we will realize that with file-snapshot backups, we effectively have a way to create a copy of the database in near real-time, within minutes or even seconds. This opens a possibility to use such a copy (or multiple copies, in a scale-out fashion) for reporting and analytics, to offload the part of these workloads that can tolerate some data latency from the primary database. Other scenarios may be possible as well, e.g. quickly creating an “as-of” copy of the VLDB for research and audit purposes.

Limitations

As promising as all this may look, before everyone starts deploying VLDBs with files stored directly in blob storage, we must describe the limitations.

1. Each Premium Storage account is limited to 35 TB, and all files for a database must reside in the same storage account. Using Standard storage with its 500 TB per-account limit, while supported, is usually not recommended for performance reasons. For the restore from file-snapshot backup to be fast, the files of the restored database must reside in the same storage account as the files of the source database. Restoring to a different storage account is possible, but requires a size-of-data operation in blob storage, which for a VLDB could take hours or days. This implies that for practical purposes, there is a limited number of database copies that can be created using this approach. For example, for a 12 TB database used in the examples above, we can create at most one other copy in the same storage account via file-snapshot backup/restore, and for databases larger than ~17 TB, no additional copies may be created. Attempting to restore a file-snapshot backup when there is not enough space in the storage account results in an error:

Msg 5120, Level 16, State 152, Line 127
Unable to open the physical file "https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File3.ndf". Operating system error 1450: "1450(Insufficient system resources exist to complete the requested service.)".
Msg 3013, Level 16, State 1, Line 127
RESTORE DATABASE is terminating abnormally.

The mitigating factor here is that the snapshots taken for file-snapshot backups are not counted towards the 35 TB limit. As mentioned in the Scalability and Performance section of Premium Storage documentation, the snapshot capacity of a Premium Storage account is 10 TB. The extent to which this additional space is used depends on the volume of changes in the database between each pair of consecutive file-snapshot backups. This is analogous to the way space is consumed by sparse files used in SQL Server database snapshots (not to be confused with Azure Blob Storage snapshots used in file-snapshot backups).

2. Since we create database files directly in a storage account, we cannot take advantage of the Managed Disks capabilities. In particular, for databases using availability groups, we cannot guarantee that the different storage accounts behind each availability replica are on different storage scale units. Using the same storage scale unit for more than one replica creates a single point of failure. While the likelihood of storage scale unit failure is not high, there are known cases when database availability had been negatively affected due to this, before Managed Disks became available to avoid this issue.

3. Because database files are accessed over http, disk IO becomes a part of network traffic for all intents and purposes. This means that unlike with Azure VM disks, there is no local SSD cache to potentially improve storage performance. Additionally, the same network bandwidth is shared between the VM network traffic, and the database storage traffic, which can affect network intensive applications. Further, the VM limits on disk throughput and IOPS that are published in Azure VM documentation do not apply to SQL Server database files stored directly in blob storage, though the network bandwidth limits are still applicable.

4. There are additional considerations and limitations to consider, documented in File-Snapshot Backups for Database Files in Azure.

That said, even with these limitations, managing a SQL Server VLDB in Azure can be greatly simplified if files are stored directly in blob storage, allowing the use of techniques described in this article. We strongly recommend considering this option for your next project involving a SQL Server VLDB in Azure.

↧

Azure SQL DB Managed Instance – sp_readmierrorlog

May 4, 2018, 3:18 pm

≫ Next: CPU and Memory Allocation on Azure SQL Database Managed Instance

≪ Previous: SQL Server VLDB in Azure: DBA Tasks Made Simple

Reviewed by: Kun Cheng, Borko Novakovic, Arvind Shyamsundar, Mike Weiner

Azure SQL Database Managed Instance is a new offering that provides an instance-based SQL PaaS service in Azure. If you are not yet familiar with this new Azure SQL Database capability, you can start with the What is Managed Instance documentation topic. Since early private preview of Managed Instance (MI), SQLCAT has been working with early adopter customers to help them evaluate MI as a new platform for their applications, gather their feedback, and improve the offering for everyone. Azure SQL Database Managed Instance is now available in public preview.

SQL error log is available on MI

The primary goal of MI is close compatibility with the traditional SQL Server, to help facilitate migrations from on-premises environments to the Azure SQL Database PaaS service. In this article, we will discuss one MI capability that it shares with SQL Server, namely the ability to see the instance error log. For SQL Server DBAs, the error log is one of the first things to check when troubleshooting application issues, and that is still the case when using a managed service such as MI. The availability of the instance error log highlights the MI focus on compatibility with SQL Server; by comparison, in Azure SQL Database, diagnostics and error information are exposed in different ways, i.e. using the sys.event_log DMV and diagnostics logging.

When you connect to an MI instance, you can right away look at its error log, using either Log File Viewer in SSMS (latest SSMS is strongly recommended), or the sp_readerrorlog stored procedure. If you do that on an instance of SQL Server that has just started, you will typically see 100-200 lines of output, depending on server configuration, the number of databases, etc. But when you look at the MI error log, you may be surprised by the large volume of messages. For example, in the first minute after instance startup, more than 2500 messages are logged. While some of them are the familiar messages you would find in a SQL Server error log, many others might look a bit cryptic, and are not actionable from an end-user standpoint.

Why is all this information in the MI error log? This diagnostic data is needed for Microsoft engineers to manage the service and troubleshoot any problems efficiently. As an aside, the MI error log is so detailed, that a curious user with relatively advanced knowledge of SQL Server internals can glean many interesting details about internal workings of MI, even without full knowledge of the MI architecture.

An important note is that in current MI preview, error logs are not persisted across instance restarts and failovers. Therefore, when looking at older logs, gaps are possible.

A new way to look at the MI error log

But for most customers, the high volume of debug-level messages in the log just makes it harder to see the messages relevant to their applications/databases, which are lost in the noise, as it were. This is the reason why we wrote sp_readmierrorlog. This is a simple stored procedure that returns the contents of the instance error log, filtering out messages that are unlikely to be useful to an MI customer.

sp_readmierrorlog has the same familiar parameters and result set that sp_readerrorlog has. It can be created in the master database of each MI instance you use, making it easy to call the procedure from the context of any database on the instance. As an example, on a mostly idle MI instance, sp_readmierrorlog reduced a 150,000-line error log, generated over one day, to 760 lines. This is much more manageable, while still providing useful diagnostic information to MI customers. For customers requiring a more in-depth look into instance behavior and diagnostics, access to the unfiltered log is still available using the built-in sp_readerrorlog procedure.

The procedure is open source and is hosted on GitHub. The filtering logic used by the procedure can be easily customized to include/exclude specific messages from the result set. If you find that the procedure is filtering out too much or too little, we would welcome pull requests with changes that make the stored procedure better for all MI customers.

↧

CPU and Memory Allocation on Azure SQL Database Managed Instance

May 4, 2018, 3:44 pm

≫ Next: Turbo boost data loads from Spark using SQL Spark connector

≪ Previous: Azure SQL DB Managed Instance – sp_readmierrorlog

Reviewed By: Ajay Kalhan, Borko Novakovic, Drazen Sumic, Branislav Uzelac

In the current Azure SQL Database Managed Instance (MI) preview, when customers create a new instance, they can allocate a certain number of CPU vCores and a certain amount of disk storage space for the instance. However, there is no explicit configuration option for the amount of memory allocated to the instance, because on MI, memory allocation is proportional to the number of vCores used.

How can a customer determine the actual amount of memory their MI instance can use, in GB? The answer is less obvious than it may seem. Using the traditional SQL Server methods will not provide the right answer on MI. In this article, we will go over the technical details of CPU and memory allocation on MI, and describe the correct way to answer this question.

The information and behavior described in this article are as of the time of writing (April 2018). Some aspects of MI behavior, including the visibility of certain compute resource allocations, may be temporary and will likely change as MI progresses from the current preview to general availability and beyond. Nevertheless, customers using MI in preview will find that this article answers some of the common questions about MI resource allocation.

First glance at CPU and memory on MI

We will use a MI instance with 8 vCores as an example. On the traditional SQL Server, most customers would look at the Server Properties dialog in SSMS to see the compute resources available to the instance. On our example MI instance, this is what we see:

We should right away note that the resource numbers in this dialog, as well as in several other sources (DMVs) described later, can change over the lifetime of a given MI instance. These changes can be relatively frequent. Customers should not take any dependencies, or make any conclusions based on these numbers. Later in the article, we will describe the correct way to determine actual compute resource allocation on MI.

An immediate question is why we see 24 processors here, when we have created this instance with only 8 vCores/processors. To determine the actual number of logical processors available to this instance, we can look at the number of VISIBLE ONLINE schedulers in the sys.dm_os_schedulers DMV:

SELECT COUNT(1) AS SchedulerCount
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';

SchedulerCount
--------------
8

This is in line with the number of vCores we have for this MI instance. Then why does SSMS show that 24 processors are available?

CPU and memory resources are managed differently on MI

To answer this question, we need to take a high-level look at the MI architecture. Each MI instance runs in a virtual machine (VM). Each VM may host multiple MI instances of varying sizes, in terms of compute resources allocated to the instance.

It is important to note here that all MI instances on a given VM always belong to the same customer; there is no multi-tenancy at the VM level. In effect, the VM hosting MI instances serves as an additional isolation boundary for customer workloads. This does not mean that if a customer creates multiple MI instances, they will necessarily be packed on the same VM. In reality, this does not happen very often. The service intelligently allocates instances to VMs to always provide guaranteed SLAs and ensure good customer experience.

What SSMS shows in the Server Properties dialog is the number of processors and the amount of memory at the OS level on the VM that happens to currently host the instance. This works the same way for the traditional SQL Server, where SSMS also shows OS level numbers. SQL Server error log, which is accessible on MI, shows the same information during MI instance startup:

SQL Server detected 2 sockets with 12 cores per socket and 12 logical processors per socket, 24 total logical processors; using 24 logical processors based on SQL Server licensing. This is an informational message; no user action is required.

Detected 172031 MB of RAM. This is an informational message; no user action is required.

For our example MI instance, this means that there are 24 processors accessible to the OS on the underlying VM. However, given the number of visible online schedulers, the MI instance can only use 8 of these processors, as expected given its current provisioned size.

To reiterate, the number of processors and the amount of memory at the VM level (24 processors and 172031 MB in this example) is not fixed. It can change over time as the MI instance moves across VMs allocated to the customer, for example when it is scaled up or scaled down. These values will, however, always be larger than or equal to the resource values actually allocated to the instance.

But what about memory? Does this MI instance have 168 GB of memory, as shown in SSMS and in the error log? Let’s look at some DMVs.

SELECT cpu_count,
       physical_memory_kb,
       committed_target_kb
FROM sys.dm_os_sys_info;

cpu_count   physical_memory_kb   committed_target_kb
----------- -------------------- --------------------
8           176,160,308          48,586,752

SELECT cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
      AND
      counter_name = 'Target Server Memory (KB)';

cntr_value
----------
48,586,752

Both of these show that the server target memory, which is commonly used to measure the amount of memory available to the instance, is about 46 GB, while the total physical memory at the OS level is 168 GB, as seen in SSMS and in the error log. This shows that not all memory available at the VM OS level is allocated to this MI instance.

On the traditional SQL Server, the usual reason for the target memory to be much lower than the available OS physical memory is configuring a limit on server maximum memory using sp_configure. Is that the case here?

SELECT value_in_use
FROM sys.configurations
WHERE name = 'max server memory (MB)';

value_in_use
------------
2147483647

This large value shows that the maximum server memory for this instance is not limited, and the instance should be able to allocate all available physical memory. What is causing target memory to be much less than the total physical memory for this instance? Is there some other mechanism that can impose a lower limit?

For MI, this mechanism is Job Objects. Running a process such as SQL Server in a job object provides resource governance for the process at the OS level, including CPU, memory, and IO. This resource governance is what allows the service to share the same VM among multiple instances belonging to the same customer, without resource contention and “noisy neighbor” problems. At the same time, this mechanism guarantees a dedicated allocation of vCores and memory for each instance. Memory is allocated according to the GB/vCore ratio for the instance size selected by the customer. There is no overcommit of either vCores or memory across instances on the same VM. In other words, the instance always gets the resources specified during provisioning.

Can we see the configuration of the job object that contains our MI instance? Yes, we can:

SELECT cpu_rate,
       cpu_affinity_mask,
       process_memory_limit_mb,
       non_sos_mem_gap_mb
FROM sys.dm_os_job_object;

cpu_rate    cpu_affinity_mask    process_memory_limit_mb non_sos_mem_gap_mb
----------- -------------------- ----------------------- --------------------
100         255                  57,176                  9,728

The sys.dm_os_job_object DMV exposes the configuration of the job object that hosts the MI instance. For our current topic, there are four columns in the DMV that are particularly relevant:

cpu_rate: this is set to 100%, showing that each vCore can be utilized by the MI instance to its full capacity.

cpu_affinity_mask: this is set to 11111111 (in binary), showing that only eight OS level processors can be used by the process hosted in the job object (SQL Server). This is in line with the number of vCores provisioned for this instance.

process_memory_limit_mb: this is set to about 56 GB, and is the total amount of memory allocated to the process within the job object. Note that this is larger than the server target memory, which, as we saw earlier, is about 46 GB. The next column provides the explanation.

non_sos_mem_gap_mb: this is the amount of memory that is a part of total process memory, but is not available for SQL Server SOS (SQL OS) memory allocations, i.e. is reserved for things like thread stacks and DLLs loaded into the SQL Server process space. The difference between process_memory_limit_mb and non_sos_mem_gap_mb is 46 GB, which is exactly the server target memory that we saw earlier.

To elaborate on the last point, even though SQL Server target memory visible in DMVs and in the output of DBCC MEMORYSTATUS is less than the total memory allocated to the instance, this difference, known as the non-SOS memory gap, is still being used by the instance. In fact, a sufficiently large allocation of non-SOS memory is required for the instance to function reliably.

Conclusion

To summarize the technical details above:

1. MI compute resource values shown in SSMS and the instance error log reflect resources at the underlying OS level, not the actual resources available to the MI instance. The resource values at the OS level can change over time, without affecting the resources allocated to the MI instance in any way.

2. MI instance is resource-governed at the OS level using a job object.

3. The sys.dm_os_job_object DMV exposes job object configuration for the MI instance. This is the DMV that should be used to determine the actual compute resources (vCores and memory) allocated to the MI instance, as described above.

We hope that the information in this article will help customers using Managed Instance to accurately and confidently determine the amount of compute resources allocated to their instances, and avoid potential confusion in this area due to architectural differences between the traditional SQL Server and Managed Instance.

↧

Turbo boost data loads from Spark using SQL Spark connector

May 12, 2018, 10:40 am

≫ Next: DataCAT team at Data Platform Summit 2018

≪ Previous: CPU and Memory Allocation on Azure SQL Database Managed Instance

Reviewed by: Dimitri Furman, Xiaochen Wu

Apache Spark is a distributed processing framework commonly found in big data environments. Spark is often used to transform, manipulate, and aggregate data. This data often lands in a database serving layer like SQL Server or Azure SQL Database, where it is consumed by dashboards and other reporting applications. Prior to the release of the SQL Spark connector, access to SQL databases from Spark was implemented using the JDBC connector, which gives the ability to connect to several relational databases. However, compared to the SQL Spark connector, the JDBC connector isn’t optimized for data loading, and this can substantially affect data load throughput.

As an example, utilizing the SQLBulkCopy API that the SQL Spark connector uses, dv01, a financial industry customer, was able to achieve 15X performance improvements in their ETL pipeline, loading millions of rows into a columnstore table that is used to provide analytical insights through their application dashboards.

In this blog, we will describe several experiments that demonstrate the major performance improvement provided by the SQL Spark connector.

You can download the SQL Spark Connector here

Dataset

Dataset size: 117 million rows
Source: Azure Blob Storage containing 50 parquet files.
Spark Cluster: 8+1 node cluster, each node is a DS3V2 Azure VM (4 cores, 17 GB RAM)

Scenario 1: Loading data into SQL Server

SQL Version: SQL Server 2017 CU 5 on RedHat 7.4
Azure VM Size: DS16sV3 (16 cores, 64 GB RAM)
Storage: 8 P30 disks in Azure Blob Storage
Database Recovery Model: Simple

Performance in SQL on windows v/s SQL on Linux is comparable and for brevity we only depict results on SQL Server on Linux.

Loading into a heap

Using Spark JDBC connector

Here is a snippet of the code to write out the Data Frame when using the Spark JDBC connector. We used the batch size of 200,000 rows. Changing the batch size to 50,000 did not produce a material difference in performance.

dfOrders.write.mode("overwrite").format("jdbc") 
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .option("url", "jdbc:sqlserver://server.westus.cloudapp.azure.com;databaseName=TestDB")
  .option("dbtable", "TestDB.dbo.orders")
  .option("user", "myuser")
  .option("batchsize","200000") 
  .option("password", "MyComplexPassword!001") 
  .save()

Since the load was taking longer than expected, we examined the sys.dm_exec_requests DMV while load was running, and saw that there was a fair amount of latch contention on various pages, which wouldn’t not be expected if data was being loaded via a bulk API.

Examining the statements being executed, we saw that the JDBC driver uses sp_prepare followed by sp_execute for each inserted row; therefore, the operation is not a bulk insert. One can further example the Spark JDBC connector source code, it builds a batch consisting of singleton insert statements, and then executes the batch via the prep/exec model.

This is an 8-node Spark cluster, each executor with 4 CPU’s and due to sparks default parallelism, there were 32 tasks running simultaneously with multiple insert statements batched together. The primary contention was PAGELATCH_EX and just like any latch contention the more parallel sessions requesting for the same resource, the more the contention.

Using SQL Spark connector

The SQL Spark connector also uses the Microsoft JDBC driver. However, unlike the Spark JDBC connector, it specifically uses the JDBC SQLServerBulkCopy class to efficiently load data into a SQL Server table. Given that in this case the table is a heap, we also use the TABLOCK hint ( “bulkCopyTableLock” -> “true”) in the code below to enable parallel streams to be able to bulk load, as discussed here. It is a best practice to use the BulkCopyMetadata class to define the structure of the table. Otherwise, there is additional overhead querying the database to determine table schema.

import com.microsoft.azure.sqldb.spark.config._
import com.microsoft.azure.sqldb.spark.connect._
import com.microsoft.azure.sqldb.spark.query._
import com.microsoft.azure.sqldb.spark._
import com.microsoft.azure.sqldb.spark.bulkcopy._

var bulkCopyMetadata = new BulkCopyMetadata
bulkCopyMetadata.addColumnMetadata(1, "o_orderkey", java.sql.Types.INTEGER, 0, 0)
bulkCopyMetadata.addColumnMetadata(2, "o_custkey", java.sql.Types.INTEGER, 0, 0)
//trimming other columns for brevity… only showing the first 2 columns being added to BulkCopyMetadata

val bulkCopyConfig = Config(Map(
  "url"               -> "server.westus.cloudapp.azure.com",
  "databaseName"      -> "testdb",
  "user"              -> "denzilr",
  "password"          -> "MyComplexPassword1!",
  "dbTable"           -> "dbo.orders",
  "bulkCopyBatchSize" -> "200000",
  "bulkCopyTableLock" -> "true",
  "bulkCopyTimeout"   -> "600"
))
dfOrders.bulkCopyToSqlDB(bulkCopyConfig)

Looking at the Spark UI, we see a total of 50 tasks that this DataFrame write is broken into, each loading a subset of the data:

Further investigating the statements, we see the familiar INSERT BULK statement, which is an internal statement used by the SQL Server bulk load APIs. This proves that we are indeed using the SQLBulkCopy API.

There is no longer page latch contention, rather now we are waiting on network IO or client fetching the results

Loading into a clustered columnstore table

With JDBC connector

Load performance in the columnstore case will be far worse with the JDBC connector than in the heap case. Given that the JDBC connector emits single row insert statements, all this data lands in the delta store. And we are back with more severe latch contention this time around. You can read about many more details of loading data into columnstore tables in this blog .

If we examine the sys.dm_db_column_store_index_physical_stats DMV, we notice that all rows are going into an OPEN rowgroup that is a delta store until that rowgroup is filled up and closed. These rows will then have to be compressed by tuple mover into compressed segments later.

Using SQL Spark connector

For the bulk load into clustered columnstore table, we adjusted the batch size to 1048576 rows, which is the maximum number of rows per rowgroup, to maximize compression benefits. Having batch size > 102400 rows enables the data to go into a compressed rowgroup directly, bypassing the delta store. Also, you have to set TABLOCK hint to false, else you will be serializing the parallel streams.

import com.microsoft.azure.sqldb.spark.config._
import com.microsoft.azure.sqldb.spark.connect._
import com.microsoft.azure.sqldb.spark.query._
import com.microsoft.azure.sqldb.spark._
import com.microsoft.azure.sqldb.spark.bulkcopy._
val bulkCopyConfig = Config(Map(
  "url"               -> "server.westus.cloudapp.azure.com",
  "databaseName"      -> "testdb",
  "user"              -> "denzilr",
  "password"          -> "MyComplexPassword1!",
  "dbTable"           -> "dbo.orders",
  "bulkCopyBatchSize" -> "1048576", 
  "bulkCopyTableLock" -> "false",
  "bulkCopyTimeout"   -> "600"
))
dfOrders.bulkCopyToSqlDB(bulkCopyConfig)

When bulk loading in parallel into a columnstore table, there are a few considerations:

Memory grants and RESOURCE_SEMAPHORE waits. Depending on how many parallel streams, you could run into this issue, and it could end up bulk inserting into delta row groups. For more information, see this blog.
Compressed rowgroups could be trimmed due to memory pressure. You would see the trim_reason_description column in the sys.dm_db_column_store_row_group_physical_stats DMV as “MEMORY_LIMITATION”.

In this case, you see that rows land in the compressed rowgroup directly. We had 32 parallel streams going against a DS16sV3 VM (64GB Ram). There are cases where having too many parallel streams, as a result of memory requirements for a bulk insert can cause rowgroups to get trimmed due to memory limitations.

Scenario 2: Loading data into Azure SQL Database

SQL version: Azure SQL Database
Database performance level: P11

There are a couple of differences that need to be noted, which make the Azure SQL Database tests fundamentally different than the tests with SQL Server on a VM.

Database recovery model is Full, vs. Simple recovery model that was used with SQL Server in VM. This prevents the use of minimal logging in Azure SQL Database.
P11 is a Premium database, therefore it is in an availability group used to provide built-in HA within the service. This adds an overhead of committing every transaction on multiple replicas.

When loading into Azure SQL Database, depending on the performance level of the database, you may see other wait types such as LOG_RATE_GOVERNOR, which would be an indicator of a bottleneck. There are multiple ways to monitor resource utilization in Azure SQL Database to detect resource bottleneck, e.g. the sys.dm_db_resource_stats DMV. If a resource bottleneck exists, the database can be easily scaled up to a higher performance level to achieve higher throughput during data loads. More on Azure SQL Database monitoring can be found here.

Recapping a few considerations relevant to data loading from Spark into SQL Server or Azure SQL Database:

Use the Spark SQL connector. We have just shown that in the bulk insert scenario, we get fundamentally better performance, by an order of magnitude, than with the Spark JDBC connector.
For tables that are heaps, use the TABLOCK hint to allow parallel streams. This is particularly relevant for staging tables, which tend to be heaps.
For bulk loads into columnstore tables, do not use the TABLOCK hint, as that would serialize parallel streams.
For bulk loads into columnstore tables, ensure that batch size is >= 102400 so that row go directly into a compressed rowgroup. Ideally start with the batch size of 1048576
For partitioned tables, see the section on partitioned tables in the Data Loading performance considerations with Clustered Columnstore indexes Depending on the number of rows per partition, they could land in the delta store, which would affect bulk insert performance.

↧

DataCAT team at Data Platform Summit 2018

July 16, 2018, 6:44 pm

≫ Next: Storage performance best practices and considerations for Azure SQL DB Managed Instance (General Purpose)

≪ Previous: Turbo boost data loads from Spark using SQL Spark connector

Team members from Microsoft AzureCAT (a.k.a. DataCAT / SQLCAT) will be presenting a full-day pre-conference on “Building A Scalable Data Architecture In Azure” at the upcoming DPS 2018 conference in Bengaluru, India. We are very excited to be part of the 4th edition of Asia’s largest Microsoft data-focused conference, with over 1000 attendees expected overall.

The motivation for this full-day pre-conference training is to share with attendees typical scenarios and considerations in architecting and implementing solutions on Microsoft Azure. Whether you are migrating an existing on-premises data pipeline, or implementing a greenfield project in the cloud, the variety of choices in terms of technologies can be bewildering and confusing. We intend to share scenarios and patterns and practices which have worked in those scenarios. With this background, the attendee has a head-start on the technical front and more importantly, confidence that the architectural patterns work in real-world scenarios.

The pre-conference is structured into the following sections:

Overall Azure Data Services landscape
Transactional systems and structured data with Azure SQL DB
Not Only SQL systems (NoSQL) with Cosmos DB
Relational Data Warehousing using Azure SQL DW
Handling streaming data in Azure with Event Hubs, Kafka etc.
Building Data Lakes in Azure
Data Analytics and AI / ML at scale using Apache Spark, Azure Data Lake Analytics etc.
Options for orchestrating data pipelines: Azure Data Factory etc.
Interactive Querying using Power BI, Azure Analysis Services etc.
Top lessons learnt from real-world customer engagements

Each section will be accompanied by demos highlighting practical considerations and recommendations beyond just the basics. The team members who are part of this conference are all very experienced in the industry and are true believers in sharing information and learnings:

Sanjay Mishra leads the DataCAT group within AzureCAT and is based in the Microsoft HQ at Redmond, USA. His current focus is on overall architecture patterns in the cloud.
Rangarajan Srirangam is part of AzureCAT based in Bengaluru, India. He has helped major customers adopt and use Microsoft Azure. He brings to the table practical considerations and architectural recommendations based on his hard-earned experiences.
Arvind Shyamsundar is part of the DataCAT group based in Redmond, USA. His focus areas include relational (SQL) systems and distributed computing using Apache Spark.
Mandar Inamdar is part of the AzureCAT team and is based in Mumbai, India. His focus areas are Azure SQL DW, Interactive querying using Power BI and AAS.
Guru Charan Bulusu is part of AzureCAT and works in the Microsoft Bengaluru office. His key focus areas are Streaming data, NoSQL, AI and ML.

We hope you are as excited as we are to bring this content to you. We hope you can join us on August 8th at the DPS 10 pre-conference!

↧

Storage performance best practices and considerations for Azure SQL DB Managed Instance (General Purpose)

July 20, 2018, 9:44 am

≫ Next: Real-time performance monitoring for Azure SQL Database Managed Instance

≪ Previous: DataCAT team at Data Platform Summit 2018

Reviewed by: Kun Cheng, Borko Novakovic, Jovan Popovic, Denzil Ribeiro, Rajesh Setlem, Arvind Shyamsundar, Branislav Uzelac

In this article, we describe database storage architecture on Azure SQL Database Managed Instance (MI), for General Purpose (GP) instances specifically. We also provide a set of best practices to help optimize storage performance.

At the time of this writing (July 2018), Managed Instance, both General Purpose and Business Critical, is in public preview. Some aspects of MI storage architecture will likely change as MI progresses from the current preview to general availability and beyond. This article reflects the current state of the offering.

Database storage architecture on MI GP

MI GP uses Azure Premium Storage to store database files for all databases, except for the tempdb database. From the perspective of the database engine, this storage type is remote, i.e. it is accessed over the network, using Azure network infrastructure. To use Azure Premium Storage, MI GP takes advantage of SQL Server native capability to use database files directly in Azure Blob Storage. This means that there is not a disk or a network share that hosts database files; instead, file path is an HTTPS URL, and each database file is a page blob in Azure Blob Storage.

Since Azure Premium Storage is used, its performance characteristics, limits, and scalability goals fully apply to MI GP. The High-performance Premium Storage and managed disks for VMs documentation article includes a section describing Premium Storage disk limits. While the topic is written in the context of VMs and Azure disks, which is the most common usage scenario for Azure Premium Storage, the documented limits are also applicable to blobs. As shown in the limits table in the documentation, the size of the blob determines the maximum IOPS and throughput that can be achieved against the blob. For MI GP, this means that the size of a database file determines the maximum IOPS and throughput that is achievable against the file.

The disk/blob size shown in the limits table is the maximum size for which the corresponding limit applies. For example, a blob that is > 64 GB and <= 128 GB (equivalent to a P10 disk) can achieve up to 500 IOPS and up to 100 MB/second throughput.

The current implementation of MI GP does not use blobs smaller than 128 GB (P10). The system will use 128 GB (P10) blobs even for very small database files, to avoid negative performance impact that would be likely with smaller blob sizes (P4 and P6). Additionally, when allocating blobs in Premium Storage, MI always uses the maximum blob size within a storage performance tier. For example, if database file size is 900 GB, MI GP will use a 1 TB (P30) blob for that file. In other words, blob size is snapped up to the maximum size of each storage performance tier. If the file grows above that limit, the system automatically increases blob size to the maximum of the next performance tier. Conversely, if the file is shrunk below the maximum size of a performance tier, blob size is automatically reduced as well.

For billing purposes, MI uses the configured instance file size limit (8 TB or less), not the larger blob size allocated in Azure Premium Storage.

As mentioned earlier, the tempdb database is not using Azure Premium Storage. It is located on the local SSD storage, which provides very low latency and high IOPS/throughput. This article focuses on databases that use remote storage.

Managed Instance Business Critical (BC) instances do not use remote Azure Premium Storage, but instead use local SSD storage. Storage performance considerations discussed in this article do not apply to BC instances.

Azure Storage throttling

When the IOPS or throughput generated by the database workload exceed the limits of a database file/blob, storage throttling occurs. For MI GP instances, a typical symptom of storage throttling is high IO latency. It is possible to see IO latency spike to hundreds of milliseconds when being throttled, while without throttling, average storage latency at the OS level would be in the 2-10 millisecond range. Another symptom of storage throttling is long PAGEIOLATCH waits (for data file IO) and WRITELOG waits (for the transaction log file IO). Needless to say, when storage throttling is occurring, there is a substantial negative effect on database performance.

If you observe the symptoms of storage throttling when running workloads on MI GP, we recommend increasing database file size to get more IOPS/throughput from Azure Premium Storage. More specific recommendations, including a script to determine if storage IO generated by the database workload approaches Azure Premium Storage IOPS/throughput limits, are provided further in the article.

Storage limits on MI GP

There are two storage-related limits that should be considered when using MI GP.

The first limit is on the total size of all database files on the instance. System databases including tempdb are counted towards this limit, and all data files and transaction log files are considered. Currently, this limit is 8 TB per instance, though a customer can configure their instance to use a lower limit. Note that this limit applies specifically to file size as it appears in the size column in the sys.master_files DMV, not to the space actually used to store data within each file.

The second limit is the Azure Premium Storage limit on the maximum space in a storage account, which is currently limited to 35 TB. Each MI GP instance uses a single Premium Storage account.

The up to 8 TB file size limit implies that if there are many databases on the instance, or if a database has many files, then it may be necessary to reduce the size of individual database files to stay within this limit. In that case, the IOPS/throughput against these files would be reduced as well. The more files an MI instance has, the more pronounced is the impact of instance file size limit on storage performance. In the worst case, the size of individual database files may have to be reduced to the point where they all end up using 128 GB (P10) blobs, even if the workload could take advantage of better performance available with larger files.

Whether this limit is relevant for a particular workload depends on its storage performance needs. Many workloads will perform sufficiently well even with smaller database files. The IOPS/throughput that can be obtained for a particular file size can be determined in advance from the Azure Premium Storage limits table.

In the context of migration to MI, customers can measure actual IOPS/throughput on the source system being migrated, to see if it is necessary to increase database file size on MI GP to provide comparable storage performance. For SQL Server on Windows, use the Disk Transfers/sec and Disk Bytes/sec PerfMon counters, for IOPS and throughput respectively. For SQL Server on Linux, use iostat, looking at the sum of r/s and w/s values for IOPS, and the sum of rKB/s and wKB/s for throughput. Alternatively, the sys.dm_io_virtual_file_stats() DMF can be used as well. The script referenced in the best practices section below provides an example of using this DMF to determine IOPS and throughput per database file.

The fact that MI GP does not use blobs smaller than 128 GB (P10), coupled with the fact that the blob size is snapped up to the maximum size of its performance tier, can make the 35 TB limit relevant for some MI GP deployments. As described in documentation, when there are many database files with sizes smaller than the sizes of the blobs they use, then the 35 TB blob size limit may be reached sooner than the 8 TB file size limit. To see the total file and blob sizes for an instance, customers can use the following query. It returns a single row with two columns, one showing the total file size, where the 8 TB limit is relevant, and the other showing the total blob size, where the 35 TB limit is relevant:

WITH DatabaseFile AS
(
SELECT database_id AS DatabaseID,
CAST(size * 8. / 1024 / 1024 AS decimal(12,4)) AS FileSizeGB
FROM sys.master_files
)
SELECT SUM(FileSizeGB) AS FileSizeGB,
       SUM(
          IIF(
             DatabaseID <> 2,
             CASE WHEN FileSizeGB <= 128 THEN 128
                  WHEN FileSizeGB > 128 AND FileSizeGB <= 256 THEN 256
                  WHEN FileSizeGB > 256 AND FileSizeGB <= 512 THEN 512
                  WHEN FileSizeGB > 512 AND FileSizeGB <= 1024 THEN 1024
                  WHEN FileSizeGB > 1024 AND FileSizeGB <= 2048 THEN 2048
                  ELSE 4096
             END,
             0
             )
          )
       AS BlobSizeGB
FROM DatabaseFile;

For example, this result set shows different file and blob total sizes for a MI GP instance:

FileSizeGB  BlobSizeGB
----------- -----------
2048.4474   2944

Comparison with SQL Server in Azure IaaS VM

In a typical deployment of SQL Server on an Azure IaaS VM (SQL VM), database files are placed on Azure disks attached to the VM. Sometimes, a Storage Spaces pool (on Windows) or an LVM volume (on Linux) is created using multiple disks, to aggregate IOPS/throughput provided by multiple disks, and to increase storage size. There are notable differences between this model, and the model used by MI GP. For SQL VMs, available IOPS/throughput are shared among all database files using a disk (or a storage pool), whereas for MI GP, each file/blob gets its own allocation of IOPS/throughput.

Each model has its pros and cons. When there are multiple databases on the instance, MI GP makes it possible to provide a fixed allocation of IOPS/throughput per database file, avoiding the “noisy neighbor” effect of other databases on the instance. On the other hand, each database on a SQL VM can take advantage of higher aggregated IOPS/throughput provided by multiple disks, assuming that all databases do not create spikes in storage utilization at the same time.

One other important difference between MI GP and SQL VM is that the per-VM IOPS/throughput limits, documented for each VM type as Max uncached disk throughput: IOPS / MBps, do not apply to MI. When we talked about throttling earlier in this article, we referred specifically to Azure Premium Storage throttling, and not VM-level throttling. VM-level throttling can occur on SQL VM when cumulative IO requests against all Azure disks attached to the VM exceed the per-VM limits. However, MI GP does not use Azure disks for database files; therefore, the per-VM disk limits are not applicable, and MI instances do not experience VM-level throttling in the way that SQL VMs do.

Performance impact of database file size on MI GP – an example

To illustrate the performance impact of database file size on MI GP, we ran a series of tests using a generic OLTP workload on an 8 vCore instance with 8 TB storage. This example is not intended as a benchmark, and the numbers shown should not be taken as representative for MI GP in general. Storage performance is highly dependent on workload characteristics, and the results obtained using the same MI instance type and storage layout used in these tests, but a different workload, may be very different. These results are shown only to demonstrate the relative difference in performance when different file size configurations on MI GP are used.

In these tests, we used a database with 3 data files and 1 log file. In the initial test, the size of every file was 32 GB (as noted earlier, the system actually uses 128 GB P10 blobs in this case). In following tests, we first increased the size of the transaction log file, and then gradually increased the size of data files. Finally, we increased the size of all files to over 1 TB (to use 2 TB P40 blobs) to get maximum storage performance possible with this file layout. We observed the changes in workload throughput and behavior using metrics such as batch requests/second (BRPS), CPU utilization percentage, PAGEIOLATCH and WRITELOG wait percentage relative to all other non-ignorable waits, and write latency for data files and log file (measured using the sys.dm_io_virtual_file_stats() DMF).

At the storage level, the workload had a 60/40 read/write ratio with cold cache. The database was small enough to fit in memory, so during the test, that ratio changed to nearly 100% writes as most of the data pages became cached in the buffer pool. For this reason, we show only write latency in test results.

Test results are presented in the following table:

File layout	Average BRPS	CPU utilization (%)	WRITELOG waits (%)	PAGEIOLATCH waits (%)	Average data write latency (ms/write)	Average log write latency (ms/write)
3 x 32 GB data files, 32 GB log file	1457	10	21	70	132	16
3 x 32 GB data files, 512 GB log file	1732	12	13	85	477	4.5
3 x 256 GB data files, 512 GB log file	2422	12	12	80	155	4.7
3 x 512 GB data files, 512 GB log file	2706	12	12	77	165	4.4
3 x 1.1 TB data files, 2 TB log file	7022	46	74	4	49	4.5

We can see that for this workload, incrementally growing data and log files up to 512 GB in the first four tests provided gradual improvements. Using 512 GB files, workload throughput (BRPS) was almost twice as high as with the 32 GB files used initially. But increasing the size of all files from 512 GB to over 1 TB provided a 4.8x increase in workload throughput, compared to the initial test throughput. In this last test, removing the bottleneck in data file IO drastically increased both BRPS and CPU utilization, and shifted the dominant wait type from PAGEIOLATCH to WRITELOG as many more write transactions were processed.

To reiterate, the degree of performance improvement from increasing file size on MI GP is workload-dependent. For some workloads, sufficient performance levels can be achieved without using large files. Furthermore, other factors besides storage performance could affect the overall workload performance. For example, increasing file size on a CPU-constrained MI GP instance would not necessarily provide a significant performance increase.

Storage performance best practices for MI GP

In closing, we present a set of storage performance best practices for MI GP instances. These are not absolutes. We encourage the readers to understand the reasons behind these best practices that were discussed earlier in the article, and consider each recommendation in the context of their specific application/workload.

1. Determine if the storage IO generated by the database workload is exceeding Azure Premium Storage limits. A script that examines IOPS/throughput per database file over a period of time, and compares them against the limits, is available on GitHub. If storage IO is near the limits, allocate more IOPS/throughput for the database as described below.

2. When possible, increase space allocation at the MI instance level. This will allow you to increase the size of individual database files, resulting in higher IOPS/throughput limits per file.

3. Increase the size of individual database files as much as possible to utilize available instance space, even if a large portion of each file would remain empty. But consider Azure Premium Storage limits as well. For example, if the size of a file is 150 GB, then increasing it up to 256 GB would not provide any additional IOPS/throughput. MI already uses a 256 GB (P15) blob in this case. This larger file would take some of the instance storage space that may be better used to increase IOPS/throughput for other database files on the same instance. In this example, to get the advantage of the next storage performance tier for this file, it would be necessary and sufficient to increase file size to 257 GB. MI GP would then snap up the blob size for this file to 512 GB (P20).

4. For many write-intensive workloads, increasing the size of the transaction log file provides higher performance improvement than increasing the size of data files.

5. Make all data files equally sized with the same growth increment. While this recommendation is not specific to MI, it is particularly relevant here. On MI GP, a single small file may negatively impact performance of the entire workload, due to storage throttling becoming more likely for that small file.

6. Avoid using many small files for a database. While the aggregate IOPS/throughput from many small files may be comparable to the IOPS/throughput provided by fewer large files, each individual small file is more likely to encounter storage throttling, which affects the entire workload.

7. Consider not only increasing file size, but also adding data files. For example, consider an MI instance with a single database. If this database has a single data file, it can get at most 7500 IOPS for data IO, by making that file larger than 1 TB in size (a P40 or P50 blob would be used depending on the exact file size). But using N files, each larger than 1 TB, would provide 7500*N IOPS for data IO, and 7500 IOPS for log IO. For example, you could use six 1.1 TB data files, and one 1.1 TB log file. The total file size would be 7.7 TB, while the total blob size would be 14 TB, remaining within the 8 TB and 35 TB limits respectively. Keep in mind that simply adding empty files to an existing database would not improve performance for most workloads; it is also necessary to reallocate existing data over all files.

Conclusion

In this article, we show how database file size and layout can have a significant performance impact on MI GP instances. We provide recommendations to help customers achieve optimal storage performance, and to avoid common pitfalls. This article applies to General Purpose MI instances. The complexities and limitations we describe do not apply to Business Critical instances, which are recommended for mission critical workloads with high storage performance requirements. If sufficient storage performance cannot be achieved on a General Purpose instance even after following best practices described above, then switching to a Business Critical instance is recommended.

↧

Real-time performance monitoring for Azure SQL Database Managed Instance

September 26, 2018, 4:38 pm

≫ Next: SQLCAT at PASS Summit 2018

≪ Previous: Storage performance best practices and considerations for Azure SQL DB Managed Instance (General Purpose)

This article was co-authored by Dimitri Furman and Denzil Ribeiro

Reviewed by: Danimir Ljepava, Borko Novakovic, Jovan Popovic, Rajesh Setlem, Mike Weiner

Introduction

In our ongoing engagements with the SQL DB Managed Instance preview customers a common requirement has been to monitor database workload performance in real time. This includes not just monitoring of a single (or a few) performance metrics, but being able to look at many relevant metrics at once, within seconds of events occurring. While much of the needed performance data is available in various DMVs, consuming it via T-SQL queries and looking at the results in SSMS query windows is quite cumbersome, and does not provide a simple way to look at real time data, or at recent historical data and trends.

More than a year ago, we had similar monitoring requirements for SQL Server on Linux, and had blogged about it here: How the SQLCAT Customer Lab is Monitoring SQL on Linux. For monitoring Managed Instance, we used Telegraf. Telegraf is a general-purpose agent for collecting metrics, and is broadly adopted in the community. It can be used for monitoring SQL Server on both Windows and Linux using the SQL Server plugin. In Telegraf version 1.8.0, this plugin has been updated to support SQL DB Managed Instance as well. Telegraf connects to Managed Instance remotely, and collects performance data from DMVs. InfluxDB is the time series database for the monitoring data collected by Telegraf, while new Grafana dashboards, specific to Managed Instance, were developed to visualize this data. Big thanks to Daniel Nelson and Mark Wilkinson for accepting pull requests with the changes to the Telegraf SQL Server plugin needed to support Managed Instance.

This blog will describe a solution for real time performance monitoring of Managed Instance in sufficient detail for customers to use it for the same purpose. The code we developed is open source and is hosted in the sqlmimonitoring GitHub repo. We welcome community contributions to improve and expand this solution. Contributions to the Telegraf collector should be made via pull requests to the SQL Server Input Plugin for Telegraf. For changes to Grafana dashboards, pull requests should be made to the sqlmimonitoring GitHub repo.

The detailed discussion below is intended specifically for real time performance monitoring and troubleshooting. For other monitoring needs, such as query-level performance monitoring, historical trend investigations, and resource consumption monitoring, consider using Azure SQL Analytics. In near future, Azure Monitor will provide support for monitoring of Managed Instance metrics and sending notifications.

Dashboard examples

Before describing the solution in detail, here are a few examples of Managed Instance Grafana dashboards, captured while a customer workload is running.

Figure 1. Performance dashboard. Additional panels not shown: Memory Manager, Memory Clerks, PLE, Buffer Manager, Access Methods, SQL Statistics.

Figure 2. Database IO Stats dashboard. Additional panels not shown: Data File Size, Log File Size, Log File Used.

Figure 3. HADR dashboard.

Setup and configuration

Here is the conceptual diagram of the solution. The arrows show the flow of monitoring data from source to destination. Telegraf pulls data from Managed Instances and loads it into InfluxDB. Grafana queries InfluxDB to visualize the data on web dashboards.

Prerequisites

Basic knowledge of Linux.
Linux VM(s) with internet connectivity, which is needed to download Docker images during initial configuration. In a typical configuration shown on the diagram, a single VM runs all three components (Telegraf, InfluxDB, and Grafana).
Network connectivity between the Telegraf VM and monitored Managed Instances. Traffic from the Telegraf VM to the Managed Instances on TCP port 1433 must be allowed. The Telegraf VM can be in the same VNet where monitored instances are, or in a peered VNet, or on premises if the on-prem network is connected to the VNet over ExpressRoute or VPN. For details on supported Managed Instance connectivity scenarios, see this documentation article.
Inbound traffic to the VM(s) on the following ports must be allowed. If deploying in Azure, modify NSGs accordingly.
- TCP port 8086 – this is the default InfluxDB port. This is only required on the InfluxDB VM, and only if lnfluxDB is running on a machine different from the Grafana machine.
- TCP port 3000 – this is the Grafana default web port to access the web-based dashboards. This is only required on the Grafana VM.

In our test environment, we use a single VM in the same Azure VNet as the monitored instances. If monitoring just a couple of instances, the VM can be quite small and inexpensive. A single F1 VM (1 vCore and 2 GB of memory) with one P10 (128 GB) data disk for InfluxDB was sufficient for monitoring two instances. However, what we describe is a central monitoring solution, and as such it can be used to monitor many Managed Instance and SQL Server instances. For large deployments, you will likely need to use larger VMs, and/or use a separate VM for InfluxDB. If dashboard refreshes start taking too long, then more resources for InfluxDB are needed.

Step 1

Install Git and Docker, if they are not already installed, otherwise skip this step.

Ubuntu:

sudo apt-get install git -y
wget -qO- https://get.docker.com/ | sudo sh

RHEL/CentOS:

sudo yum install git -y
sudo yum install docker -y
sudo systemctl enable docker
sudo systemctl start docker

Note: On RHEL, SELinux is enabled by default. For the InfluxDB and Grafana containers to start successfully, we either have to disable SELinux, or add the right SELinux policy. With the default RHEL configuration, SELINUX configuration is set to enforcing. Changing this configuration to permissive or disabled will work around this issue. Alternatively set the right SELinux policy as described here. See this documentation article for more information.

sudo vi /etc/selinux/config

Step 2

Clone the GitHub repo.

cd $HOME
sudo git clone https://github.com/denzilribeiro/sqlmimonitoring.git/

Step 3

Install, configure, and start InfluxDB.

a. When creating the Azure VM, we added a separate data disk for InfluxDB. Now, mount this disk as /influxdb, following the steps in this documentation article (replace datadrive with influxdb).

b. Optionally, edit the runinfluxdb.sh file and modify the INFLUXDB_HOST_DIRECTORY variable to point to the directory where you want the InfluxDB volume to be mounted, if it is something other than the default of /influxdb.

cd $HOME/sqlmimonitoring/influxdb
sudo vi runinfluxdb.sh

If you are new to vi (a text editor), use the cheat sheet.

c. Pull the InfluxDB Docker image and start InfluxDB.

cd $HOME/sqlmimonitoring/influxdb
sudo ./runinfluxdb.sh

Step 4

Pull the Grafana Docker image and start Grafana.

cd $HOME/sqlmimonitoring/grafana
sudo ./rungrafana.sh

Step 5

Optionally, if the firewall is enabled on the VM, create an exception for TCP port 3000 to allow web connections to Grafana.

Ubuntu:

sudo ufw allow 3000/tcp
sudo ufw reload

RHEL/CentOS:

sudo firewall-cmd --zone=public --add-port=3000/tcp -permanent
sudo firewall-cmd --reload

Step 6

Install Telegraf version 1.8.0 or later. Support for MI was first introduced in this version.

Ubuntu:

cd $HOME
wget https://dl.influxdata.com/telegraf/releases/telegraf_1.8.0-1_amd64.deb
sudo dpkg -i telegraf_1.8.0-1_amd64.deb

RHEL/CentOS:

cd $HOME&lt;b&gt;&lt;/b&gt;
wget https://dl.influxdata.com/telegraf/releases/telegraf-1.8.0-1.x86_64.rpm
sudo yum localinstall telegraf-1.8.0-1.x86_64.rpm -y

Step 7

Create a login for Telegraf on each instance you want to monitor, and grant permissions. Here we create a login named telegraf, which is referenced in the Telegraf config file later.

USE master;
CREATE LOGIN telegraf WITH PASSWORD = N'MyComplexPassword1!', CHECK_POLICY = ON;
GRANT VIEW SERVER STATE TO telegraf;
GRANT VIEW ANY DEFINITION TO telegraf;

Step 8

Configure and start Telegraf.

a. The Telegraf configuration file (telegraf.conf) is included in sqlmimonitoring/telegraf, and includes the configuration settings used by the solution. Edit the file to specify your monitored instances in the [[inputs.sqlserver]] section, and then copy it to /etc/telegraf/telegraf.conf, overwriting the default configuration file.

sudo mv /etc/telegraf/telegraf.conf /etc/telegraf/telegraf_original.conf
sudo cp $HOME/sqlmimonitoring/telegraf/telegraf.conf /etc/telegraf/telegraf.conf
sudo chown root:root /etc/telegraf/telegraf.conf
sudo chmod 644 /etc/telegraf/telegraf.conf

b. Alternatively, edit the /etc/telegraf/telegraf.conf file in place. This is preferred if you would like to keep all other Telegraf settings present in the file as commented lines.

sudo vi /etc/telegraf/telegraf.conf

Uncomment every line shown below, and add a connection string for every instance/server you would like to monitor in the [[inputs.sqlserver]] section. When editing telegraf.conf in the vi editor, search for [[inputs.sqlserver]] by typing “/” followed by “inputs.sqlserver”. Ensure the URL and database name for InfluxDB in the outputs.influxdb section are correct.

[[inputs.sqlserver]]
servers = ["Server=server1.xxx.database.windows.net;User Id=telegraf;Password=MyComplexPassword1!;app name=telegraf;"
,"Server=server2.xxx.database.windows.net;User Id=telegraf;Password=MyComplexPassword1!;app name=telegraf;"]

query_version = 2

[[outputs.influxdb]]
urls = ["http://127.0.0.1:8086"]
database = "telegraf"

c. Note that the default polling interval in Telegraf is 10 seconds. If you want to change this, i.e. for more precise metrics, you will have to change the interval parameter in the [agent] section in the telegraf.conf file.

d. Once the changes are made, start the service.

sudo systemctl start telegraf

e. Confirm that Telegraf is running.

sudo systemctl status telegraf

Step 9

In Grafana, create the data source for InfluxDB, and import Managed Instance dashboards.

In this step, [GRAFANA_HOSTNAME_OR_IP_ADDRESS] refers to the public hostname or IP address of the Grafana VM, and [INFLUXDB_HOSTNAME_OR_IP_ADDRESS] refers to the hostname or IP address of the InfluxDB VM (either public or private) that is accessible from the Grafana VM. If using a single VM for all components, these values are the same.

a. Browse to your Grafana instance – http://[GRAFANA_IP_ADDRESS_OR_SERVERNAME]:3000.

Login with the default user admin with password admin. Grafana will prompt you to change the password on first login.

b. Add a data source for InfluxDB. Detailed instructions are at http://docs.grafana.org/features/datasources/influxdb/

Click “Add data source”
Name: influxdb-01
Type: InfluxDB
URL: http://[INFLUXDB_HOSTNAME_OR_IP_ADDRESS]:8086. The default of http://localhost:8086 works if Grafana and InfluxDB are on the same machine; make sure to explicitly enter this URL in the field.
Database: telegraf

Click “Save & Test”. You should see the message “Data source is working”.

c. Download Grafana dashboard JSON definitions from the repo dashboards folder for all three dashboards, and then import them into Grafana. When importing each dashboard, make sure to use the dropdown labeled InfluxDB-01 to select the data source created in the previous step.

Step 10

Optionally, configure a retention policy for data in InfluxDB. By default, monitoring data is maintained indefinitely.

a. Start InfluxDB console.

sudo docker exec -it influxdb influx

b. View current retention policies in InfluxDB.

use telegraf;
show retention policies;

c. Create a retention policy and exit InfluxDB console.

create retention policy retain30days on telegraf duration 30d replication 1 default;
quit

In this example, we created a 30-day retention policy for the telegraf database, setting it as the default policy.

Step 11

You are done! The dashboards should now display data from all instances specified in the Telegraf config file. You can switch between monitored instances using the dropdown at the top of each dashboard. You can change the time scale by using the pull-down menu in upper right. You can switch between different dashboards using the “Other Managed Instance Dashboards” button.

Troubleshooting examples

Here we will show some examples of actual troubleshooting scenarios on Managed Instance where this monitoring solution helped us identify the root cause.

A six-minute workload stall

In this case, we were testing an OLTP application workload. During one test, we noticed a complete stall in workload throughput.

During the stall, we saw no current waits on the instance at all, so this was not the kind of problem where the workload was stalled because of waiting for some resource, such as disk IO or memory allocation.

Then, after about six minutes, the workload resumed at the previous rate.

Looking at the waits again, we now saw a very large spike in LCK_IX_M waits, dwarfing all other waits.

This immediately pointed to a blocking problem, which was indeed the case. As a part of each transaction, the application was inserting a row into a history table. That table was created as a heap. We found that during the stall, a single DELETE statement affecting a large number of rows in the history table was executed under the serializable transaction isolation level. Running a DML statement targeting a heap table under serializable isolation always causes lock escalation. Thus, while the DELETE query ran for six minutes, it held an X lock at the table level. This OLTP workload inserts a row into the history table as a part of each transaction, therefore it was completely blocked.

What is interesting about monitoring this kind of typical blocking scenario is that the LCK_IX_M waits did not appear on the dashboard until after the stall has ended. This is because SQL Server reports waits in sys.dm_os_wait_stats DMV only once they have completed, and none of the LCK_IX_M waits completed until the X lock went away and the workload had resumed.

Periodic drops in throughput

The next example is more interesting, and is specific to Managed Instance. During an otherwise steady high volume OLTP workload on a General Purpose instance, we observed sharp drops in workload throughput at five-minute intervals.

Looking at the graph of wait types, we saw corresponding spikes in BACKUPIO and WRITELOG waits, at the same five-minute intervals.

Since log backups on Managed Instance run every five minutes, this seemed to imply a strong correlation between drops in throughput and log backups. But why do log backups cause drops in throughput? Looking at the Database IO Stats dashboard, we saw corresponding five-minute spikes in Read Bytes/sec specifically for the transaction log file, which is indeed expected while a transaction log backup is running.

The magnitude of these spikes in read throughput pointed to the root cause of the problem. The spikes were close to the throughput limit of the transaction log blob. At the high level of workload throughput we were driving (~ 8K batch requests/second), the combined read and write IO against the log blob during log backup was enough to cause Azure Storage throttling. When throttling occurs, it affects both reads and writes. This made transaction log writes slower by orders of magnitude, which naturally caused major drops in workload throughput. This problem is specific to General Purpose Managed Instance with its remote storage, and does not occur on Business Critical Managed Instance. For General Purpose Managed Instance, the engineering team is looking for ways to minimize or eliminate this performance impact in a future update.

Conclusion

In this article, we describe a solution for real time performance monitoring of Azure SQL Database Managed Instance. In our customer engagements, we found it indispensable for monitoring workload performance, and for diagnosing a broad range of performance issues in customer workloads. We hope that customers using Managed Instance will find it useful for database workload monitoring and troubleshooting, as they migrate their on-premises applications to Azure. We hope many of you will start your journey with Managed Instance soon, and are looking forward to hearing about your experiences, as well as feedback and contributions to this monitoring solution.

↧

SQLCAT at PASS Summit 2018

November 5, 2018, 4:23 pm

≪ Previous: Real-time performance monitoring for Azure SQL Database Managed Instance

This week in Seattle, PASS Summit 2018 will be happening for the twentieth time. As for many years before, the SQLCAT (a.k.a. DataCAT) team will be at the conference. This year, we will present three sessions:

On Thursday Nov 8 at 4:45 PM in room 6B, Alexei Khalyako, John Hoang, and Mike Weiner will talk about Learnings and Best Practices in Building an Enterprise Data Warehouse Using Azure SQL DW. Our team has had the opportunity to work with a number of customers moving their data warehouse workloads to Azure SQL DW. In this session we’ll share learnings around the most common design patterns with Azure SQL DW and go deeper on data loading and performance tuning and optimization considerations with Azure SQL DW (Gen2). Come to this session to learn from our real-world experiences in working with customers on adopting Azure SQL DW.

On Friday Nov 9 at 8:00 AM in room 615, Arvind Shyamsundar will show how to tackle Advanced SQL Server Troubleshooting with SQLCallStackResolver. This Level 400 session is a product of many real-world customer engagements that our team has worked on. Arvind will show you some tools and techniques to tackle the hardest-to-investigate performance issues in SQL Server and he will demo one such scenario involving spinlock contention on high-end servers. Most importantly, the information in the session is very practical and directly usable. Guaranteed to help you be much more efficient and precise when troubleshooting SQL Server performance!

On Friday Nov 9 at 9:30 AM in the TCC room Tahoma 5, Kun Cheng, Dimitri Furman, and Mike Weiner will talk about SQL DB Managed Instance – Best Practices and Lessons Learned. For well over a year, since the early private preview in summer of 2017 and through General Availability in fall of 2018, we have been collaborating with Microsoft customers who are the early adopters of Managed Instance, and with the Managed Instance engineering team, to ensure successful implementations and to improve the offering. If you want to learn how Managed Instance works and how to use it optimally, from performance tips to migration best practices, this session is for you.

In addition to presenting, we will also be at the Data Clinic, where together with our colleagues from the SQL Server engineering team, the Tiger team, the CSS team, and many others we will help you solve real life problems and answer hard (or easy ) questions about SQL Server and other data technologies we work with.

See you at the Summit!

↧

Overview

Traditional Enterprise Data Pipeline

Enterprise Data Pipeline on Azure PaaS

Benefits

Which style wins – Physical Server/VM based or PaaS service based?

Hybrid data pipeline Architectures

Customer implementations in the real world

Cloud PaaS Centered Architecture

Hybrid IaaS – PaaS Architecture

Best Practices

Background

What we saw

Fix #1: Parameter Embedding

Fix #2: Inline TVFs

“Fix” #3: Mandatory parameters

Summary

We want to hear from you!

Sayint application Overview

Motivation

Implementation Details

Tables and Indexes

Service Broker Objects

Stored Procedures

Trigger

Unit Testing

Stress Test

Resource Governance

Summary

Q1. How can I check if my SQL Server database is compatible with Azure SQL Database?

Q2. How should I choose the service tier and the performance level for my database(s)?

Q3. How do I monitor resource consumption and query performance?

Q4. Should I use Elastic Pools?

Q5. How should I configure firewall rules?

Q6. Can I create a database in a VNet?

Q7. Can I use Windows Authentication in Azure SQL Database?

Q8. How should I implement retry logic?

Q9. What is the best way to load large amounts of data in Azure SQL Database?

Q10. What if my application is chatty?

Q11. What kind of workload is appropriate for Azure SQL Database?

Q12. How do I scale out my database?

SQLCAT Sessions

SQL Clinic

Bonus

And More …

A VLDB database

Backup and restore

Consistency checking

Reporting and analytics

Limitations

SQL error log is available on MI

A new way to look at the MI error log

First glance at CPU and memory on MI

CPU and memory resources are managed differently on MI

Conclusion

Dataset

Scenario 1: Loading data into SQL Server

Loading into a heap

Using Spark JDBC connector

Using SQL Spark connector

Loading into a clustered columnstore table

With JDBC connector

Using SQL Spark connector

Scenario 2: Loading data into Azure SQL Database

Database storage architecture on MI GP

Azure Storage throttling

Storage limits on MI GP

Comparison with SQL Server in Azure IaaS VM

Performance impact of database file size on MI GP – an example

Storage performance best practices for MI GP

Conclusion

Introduction

Dashboard examples

Setup and configuration

Prerequisites

Step 1

Step 2

Step 3

Step 4

Step 5

Step 6