Following Microsoft’s recommendations, you should update statistics in Azure Synapse Analytics SQL Pools to improve the performance of your queries. In this blog post, I’ll share two scripts you can schedule to update statistics.
Why do I need to update statistics in Azure Synapse Analytics?
To begin, you need to update your statistics to improve the performance of your queries.
Keeping statistics up to date helps the query engine select the best query plans to execute.
Azure Synapse Analytics SQL Pool’s automatic creation of statistics is turned on by default. However, this does not include maintenance or statistics updates.
You can verify if the creation of stats is on by executing the following script:
SELECT name, is_auto_create_stats_on FROM sys.databases
Remember that Auto Update Statistics is not enabled yet in Azure Synapse Analytics.
When should I update my statistics?
Ideally, you should maintain Azure Synapse Analytics statistics as part of your data movement process. Include logic to maintain the statistics as part of your stored procedures.
When you are creating a stored procedure, that’s the best moment to maintain your statistics. For example, after T-SLQ statements that load, delete and update data.
To avoid updating statistics one at a time, I suggest using a command that will update all the stats in a table.
Update Statistics SchemaName.TableName
This statement updates all the statistics for your Azure Synapse Analytics SQL Pool table.
The scripts shared in this blog post will help you complement this routine. Although this shouldn’t be the only way that you maintain your statistics.
The scripts are built on top of the logic provided here by Microsoft.
Updating Statistics in Azure Synapse Analytics SQL Pools
This script updates statistics when the variance of the number of rows is greater than the value for the parameter @percent_deviation_from_actual.
exec [dbo].[syn_update_stats] @percent_deviation_from_actual = 10, @execute= 0
I suggest running this script once daily.
Updating Old Statistics in Azure Synapse Analytics SQL Pools
This script updates old statistics in your Azure Synapse Analytics SQL Pools when the value of the parameter @number_of_days is the number of days.
exec [dbo].[syn_update_old_stats] @number_of_days = 7, @execute= 1
I suggest running this script once a week.
To sum up, the two scripts mentioned above to update statistics in your Azure Synapse Analytics SQL Pools are a must when working with this fantastic service.
You can schedule them using Azure Data Factory or Azure Synapse Analytics pipelines.
In upcoming blog posts, you’ll learn how to easily enable Continuous Integration and Continuous for this solution.
Please follow Tech Talk Corner on Twitter for blog updates, virtual presentations, and more!
As always, please leave any comments or questions below.