Azure Synapse Analytics replicated tables play an important role in Azure Synapse Analytics SQL Pools. They avoid shuffle move operations that are extremely time consuming for the engine. For this reason, you want to make sure that the data is replicated across different notes and up-to-date.
Replication takes place after the first query hits the replicated table. This means that query will perform slowly.
I always recommend executing a “select top 1 * from table” after inserting, deleting, updating, and altering the table. This is so that you don’t penalize the data warehouse user who runs the first query using this table. You also need to kick off the replication in case you assign more compute nodes.
The following query allows you to identify the status of your Azure Synapse Analytics SQL Pool replicated tables.
-- This query returns the state for the tables that are replicated SELECT [SchemaName] = schema_name(schema_id) ,[TableName] = t.name ,[CompleteTableName] = QUOTENAME(schema_name(schema_id)) + '.' + QUOTENAME(t.name) ,[State] = STATE ,[SyncDMLStatement] = CASE WHEN STATE = 'NotReady' THEN 'select top 1 * from ' + schema_name(schema_id) + '.' + t.name END FROM sys.tables t JOIN sys.pdw_replicated_table_cache_state c ON c.object_id = t.object_id JOIN sys.pdw_table_distribution_properties p ON p.object_id = t.object_id WHERE distribution_policy = 3 ORDER BY t.name
In upcoming blog posts, we’ll continue to explore some of the features within Azure Services.
Please follow Tech Talk Corner on Twitter for blog updates, virtual presentations, and more!
As always, please leave any comments or questions below.