Azure SQL Server 2022 Upgrade: Fixing Performance Degradation from SQL 2016 August 5, 202533 views0 By IG Share Share The recent migration from SQL Server 2016 SP3 to SQL Server 2022 CU16 has introduced significant performance challenges. This detailed analysis explores the root causes, from the new Query Optimizer and operating system to specific code patterns, and provides a clear path forward for optimization. SQL Server 2022 Performance Degradation Analysis | GigXP.com GigXP.com Home Services Blog Contact SQL Server 2022 Performance Degradation: An In-Depth Analysis Posted on August 5, 2025 by The GigXP Team The recent migration from SQL Server 2016 SP3 to SQL Server 2022 CU16 has introduced significant performance challenges for our client. Despite a well-executed log-shipping migration and the strategic decision to retain compatibility mode 2012, average CPU utilization has increased from 30% to 50% during business hours. A key observation is that many queries, while retaining the same execution plan shape, are now taking significantly longer to execute. This detailed analysis explores the root causes, from the new Query Optimizer and operating system to specific code patterns, and provides a clear path forward for optimization. The Performance Shift Old CPU 30% New CPU 50% 66% Increase in CPU Query Latency Old Query Time 6ms New Query Time 50ms ~8x Slower Migration Comparison Summary Feature Old SQL 2016 Server New SQL 2022 Server Operating System Windows Server 2016 Windows Server 2025 Datacenter SQL Server Version 2016, service pack 3 2022, cumulative update 16 Compatibility Mode 2012 for most databases Identical, 2012 for most databases Host & Cores AWS Z1D.12XLarge, 48 logical cores Identical, AWS Z1D.12XLarge, 48 logical cores TempDB 24 files on RAM disk 24 files on NVMe disk Average CPU 30% at peak 50% at peak Interactive Performance Metrics Select a metric below to view its simulated trend based on our analysis. CPU Utilization Lock Wait Times Detailed Analysis: The 'Silent Saboteurs' The core issue is a paradox: the execution plans appear stable, yet queries are running slower. This points to subtle, underlying changes in the SQL Server 2022 engine itself. The new Cardinality Estimator (CE), even in compatibility mode 2012, can influence runtime decisions, leading to suboptimal memory grants and increased I/O. A Note on Temp Tables The observed high recompilation rate for a stored procedure using a temp table is a critical symptom. While SQL Server 2022 has enhanced temp table caching, this behavior suggests the existing code is likely violating the rules for effective caching (e.g., explicitly dropping tables or altering schemas), which the new engine is more sensitive to. Furthermore, the use of the new Windows Server 2025 is an unquantified risk. The operating system's kernel and memory management can have a profound impact on SQL Server's performance, and this new platform may not be fully optimized for the SQL Server 2022 workload, potentially contributing to memory allocation and CPU scheduling inefficiencies. Deep Dive: Root Cause Analysis The performance degradation is not a single issue, but a combination of factors related to the new SQL Server 2022 engine and the operating system. Query Optimizer and Cardinality Estimator Behavior A key part of SQL Server 2022 is the new Cardinality Estimator (CE160). Even in compatibility mode 2012 (level 110), the underlying SQL Server engine can still be influenced by its native CE, leading to different, and sometimes less accurate, estimates about intermediate result sets. This can cause: Suboptimal memory grants for queries. Inefficient join orders, even with the same plan shape. Increased I/O operations due to inaccurate row estimates. This is the "same plan, slower execution" paradox, where the plan looks identical but the internal execution is less efficient, directly causing higher CPU usage and longer query times. TempDB Configuration and NVMe vs. RAM Disk The migration of TempDB from a RAM disk to a high-speed NVMe disk should, in theory, improve performance. RAM disks can lead to "double RAM caching," which reduces the memory available for the SQL Server buffer pool. The NVMe disk eliminates this issue and provides excellent I/O performance. However, SQL Server 2022 also has enhanced TempDB caching and latch-free allocation. The fact that recompilations are *increasing* for a temp table query suggests the existing code is not properly utilizing these new features, likely by violating temp table caching rules like explicitly dropping tables or altering schemas. This behavior can overwhelm the improved TempDB subsystem. Windows Server 2025: The Bleeding Edge Risk The new environment runs on the very recent Windows Server 2025. This is a critical, unquantified variable. The interaction between SQL Server 2022 and this new OS is not well-documented and could be a source of performance issues. External reports have noted similar slowness on newer Windows Server versions, often linked to memory allocation (`MEMORY_ALLOCATION_EXT`) and CPU scheduling inefficiencies (`SOS_SCHEDULER_YIELD`). Deep Dive: Temp Table Recompilations The high rate of recompilations for a stored procedure using a temporary table is a critical symptom and a direct contributor to increased CPU. Each recompile consumes CPU and adds latency. Why This is Happening SQL Server 2022 has enhanced temp table caching to prevent recompiles. However, this caching mechanism is sensitive to certain coding patterns. If the stored procedure is: Explicitly using a `DROP TABLE` statement for the temp table. Altering the temp table's schema after its initial creation (e.g., adding an index in a separate statement). The new engine is likely more sensitive to these violations, triggering a recompile each time the procedure runs, which was not the case on SQL Server 2016. The client's test of converting to a table variable is a valid workaround, as table variables are not subject to these same recompilation triggers. Analysis of Monitoring Metrics and Wait Statistics The monitoring dashboards reveal key metrics that provide direct evidence of the performance degradation. The following table summarizes the most critical observations. Metric Name Observed Value (Peak/Average) Interpretation/Significance CPU Utilization - Overall 50% (Avg), 80-90% (Peak Cores) Significant increase from 30%, indicates high CPU pressure. Batch Requests/sec ~7.5K (Peak) Consistent with old server, workload volume not the cause. Lock Wait Time >2.5K (Peak) Significant blocking occurring, queries waiting for locks. Latch Waits/sec >9K (Peak) High contention on internal SQL Server in-memory structures. Memory Available ~35-40K MB OS has free memory, but SQL Server might have internal allocation issues. Buffer Cache Hit Ratio ~99% (Avg) Good data caching, but doesn't account for TempDB or other memory pressure. The most critical metrics are the spikes in Lock Wait Time and Latch Waits/sec. The Lock Wait Time graph shows that queries are frequently being blocked, which directly contributes to their longer execution times. The high Latch Waits indicate contention on SQL Server's internal memory structures, suggesting that the underlying engine is struggling to manage resources under the increased load. This combination of waits, alongside the CPU increase, strongly points to a cascade of issues where less efficient query execution (caused by the new CE and OS) is leading to increased resource contention. Recommendations and Next Steps You can expand this section to view a detailed list of actionable steps. Show Recommendations Immediate Troubleshooting Verify and Adjust TempDB Configuration: Ensure TempDB files are properly sized and there are no underlying I/O bottlenecks on the NVMe disk. Investigate Temp Table Recompilations: Review the problematic stored procedure for `DROP TABLE` statements or schema changes. Monitor for Specific Wait Types: Actively look for `MEMORY_ALLOCATION_EXT` and `SOS_SCHEDULER_YIELD` waits using `sys.dm_os_wait_stats`. Long-Term Strategy Analyze Execution Plans: Use `SET STATISTICS XML ON` to compare actual plans and look for cardinality estimation issues. Test `LEGACY_CARDINALITY_ESTIMATION`: Try explicitly forcing the legacy CE to see if performance improves for problematic queries. Plan Gradual Upgrades: Develop a phased plan to move to a higher compatibility level to take advantage of new IQP features. Disclaimer: The Questions and Answers provided on https://gigxp.com are for general information purposes only. We make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the website or the information, products, services, or related graphics contained on the website for any purpose. Share What's your reaction? Excited 0 Happy 0 In Love 0 Not Sure 0 Silly 0 IG Website Twitter
SQL Server 2022 Performance Degradation: An In-Depth Analysis Posted on August 5, 2025 by The GigXP Team The recent migration from SQL Server 2016 SP3 to SQL Server 2022 CU16 has introduced significant performance challenges for our client. Despite a well-executed log-shipping migration and the strategic decision to retain compatibility mode 2012, average CPU utilization has increased from 30% to 50% during business hours. A key observation is that many queries, while retaining the same execution plan shape, are now taking significantly longer to execute. This detailed analysis explores the root causes, from the new Query Optimizer and operating system to specific code patterns, and provides a clear path forward for optimization. The Performance Shift Old CPU 30% New CPU 50% 66% Increase in CPU Query Latency Old Query Time 6ms New Query Time 50ms ~8x Slower Migration Comparison Summary Feature Old SQL 2016 Server New SQL 2022 Server Operating System Windows Server 2016 Windows Server 2025 Datacenter SQL Server Version 2016, service pack 3 2022, cumulative update 16 Compatibility Mode 2012 for most databases Identical, 2012 for most databases Host & Cores AWS Z1D.12XLarge, 48 logical cores Identical, AWS Z1D.12XLarge, 48 logical cores TempDB 24 files on RAM disk 24 files on NVMe disk Average CPU 30% at peak 50% at peak Interactive Performance Metrics Select a metric below to view its simulated trend based on our analysis. CPU Utilization Lock Wait Times Detailed Analysis: The 'Silent Saboteurs' The core issue is a paradox: the execution plans appear stable, yet queries are running slower. This points to subtle, underlying changes in the SQL Server 2022 engine itself. The new Cardinality Estimator (CE), even in compatibility mode 2012, can influence runtime decisions, leading to suboptimal memory grants and increased I/O. A Note on Temp Tables The observed high recompilation rate for a stored procedure using a temp table is a critical symptom. While SQL Server 2022 has enhanced temp table caching, this behavior suggests the existing code is likely violating the rules for effective caching (e.g., explicitly dropping tables or altering schemas), which the new engine is more sensitive to. Furthermore, the use of the new Windows Server 2025 is an unquantified risk. The operating system's kernel and memory management can have a profound impact on SQL Server's performance, and this new platform may not be fully optimized for the SQL Server 2022 workload, potentially contributing to memory allocation and CPU scheduling inefficiencies. Deep Dive: Root Cause Analysis The performance degradation is not a single issue, but a combination of factors related to the new SQL Server 2022 engine and the operating system. Query Optimizer and Cardinality Estimator Behavior A key part of SQL Server 2022 is the new Cardinality Estimator (CE160). Even in compatibility mode 2012 (level 110), the underlying SQL Server engine can still be influenced by its native CE, leading to different, and sometimes less accurate, estimates about intermediate result sets. This can cause: Suboptimal memory grants for queries. Inefficient join orders, even with the same plan shape. Increased I/O operations due to inaccurate row estimates. This is the "same plan, slower execution" paradox, where the plan looks identical but the internal execution is less efficient, directly causing higher CPU usage and longer query times. TempDB Configuration and NVMe vs. RAM Disk The migration of TempDB from a RAM disk to a high-speed NVMe disk should, in theory, improve performance. RAM disks can lead to "double RAM caching," which reduces the memory available for the SQL Server buffer pool. The NVMe disk eliminates this issue and provides excellent I/O performance. However, SQL Server 2022 also has enhanced TempDB caching and latch-free allocation. The fact that recompilations are *increasing* for a temp table query suggests the existing code is not properly utilizing these new features, likely by violating temp table caching rules like explicitly dropping tables or altering schemas. This behavior can overwhelm the improved TempDB subsystem. Windows Server 2025: The Bleeding Edge Risk The new environment runs on the very recent Windows Server 2025. This is a critical, unquantified variable. The interaction between SQL Server 2022 and this new OS is not well-documented and could be a source of performance issues. External reports have noted similar slowness on newer Windows Server versions, often linked to memory allocation (`MEMORY_ALLOCATION_EXT`) and CPU scheduling inefficiencies (`SOS_SCHEDULER_YIELD`). Deep Dive: Temp Table Recompilations The high rate of recompilations for a stored procedure using a temporary table is a critical symptom and a direct contributor to increased CPU. Each recompile consumes CPU and adds latency. Why This is Happening SQL Server 2022 has enhanced temp table caching to prevent recompiles. However, this caching mechanism is sensitive to certain coding patterns. If the stored procedure is: Explicitly using a `DROP TABLE` statement for the temp table. Altering the temp table's schema after its initial creation (e.g., adding an index in a separate statement). The new engine is likely more sensitive to these violations, triggering a recompile each time the procedure runs, which was not the case on SQL Server 2016. The client's test of converting to a table variable is a valid workaround, as table variables are not subject to these same recompilation triggers. Analysis of Monitoring Metrics and Wait Statistics The monitoring dashboards reveal key metrics that provide direct evidence of the performance degradation. The following table summarizes the most critical observations. Metric Name Observed Value (Peak/Average) Interpretation/Significance CPU Utilization - Overall 50% (Avg), 80-90% (Peak Cores) Significant increase from 30%, indicates high CPU pressure. Batch Requests/sec ~7.5K (Peak) Consistent with old server, workload volume not the cause. Lock Wait Time >2.5K (Peak) Significant blocking occurring, queries waiting for locks. Latch Waits/sec >9K (Peak) High contention on internal SQL Server in-memory structures. Memory Available ~35-40K MB OS has free memory, but SQL Server might have internal allocation issues. Buffer Cache Hit Ratio ~99% (Avg) Good data caching, but doesn't account for TempDB or other memory pressure. The most critical metrics are the spikes in Lock Wait Time and Latch Waits/sec. The Lock Wait Time graph shows that queries are frequently being blocked, which directly contributes to their longer execution times. The high Latch Waits indicate contention on SQL Server's internal memory structures, suggesting that the underlying engine is struggling to manage resources under the increased load. This combination of waits, alongside the CPU increase, strongly points to a cascade of issues where less efficient query execution (caused by the new CE and OS) is leading to increased resource contention. Recommendations and Next Steps You can expand this section to view a detailed list of actionable steps. Show Recommendations Immediate Troubleshooting Verify and Adjust TempDB Configuration: Ensure TempDB files are properly sized and there are no underlying I/O bottlenecks on the NVMe disk. Investigate Temp Table Recompilations: Review the problematic stored procedure for `DROP TABLE` statements or schema changes. Monitor for Specific Wait Types: Actively look for `MEMORY_ALLOCATION_EXT` and `SOS_SCHEDULER_YIELD` waits using `sys.dm_os_wait_stats`. Long-Term Strategy Analyze Execution Plans: Use `SET STATISTICS XML ON` to compare actual plans and look for cardinality estimation issues. Test `LEGACY_CARDINALITY_ESTIMATION`: Try explicitly forcing the legacy CE to see if performance improves for problematic queries. Plan Gradual Upgrades: Develop a phased plan to move to a higher compatibility level to take advantage of new IQP features.
Azure Azure SQL MI vs. VM Performance Gap: Migration Estimator Tool It’s a common and frustrating scenario for teams migrating to Azure SQL PaaS. A workload ...
Azure CLI Command Generator Tool | Free Build & Copy CMDLETs Tired of searching for the right syntax for your Azure CLI commands? Our interactive Azure ...
Azure Azure Arc Data Services Sizing Tool & Calculator for SQL MI PostGreSQL Planning your Azure Arc Data Services deployment is a critical first step. This interactive sizing ...
Azure On‑premises DNS → Azure DNS Migration Tool Estimator Checklist Move your authoritative DNS from on‑premises Linux/BIND (or similar) to Azure DNS with confidence. This ...
Microsoft Azure Private Link Cost Calculator & TCO Guide Price Estimator The Total Cost of Ownership (TCO) for an Azure Private Link deployment is a complex ...
Azure SQL Server 2025 Upgrade & Backwards Compatibility Guide Steps The release of SQL Server 2025 is more than a version bump—it’s a strategic leap ...
Azure Windows Server 2025 Hotpatching: On-Prem Readiness & Cost Calculator Thinking about implementing Windows Server 2025‘s new Hotpatching feature for your on-premise servers? This interactive ...
Azure Azure Egress Network Cost Calculator | Estimate Data Transfer Cost Struggling to predict your monthly Azure egress costs? You’re not alone. Azure’s data transfer pricing ...
Azure Azure AI Token Cost Calculator & Estimator | OpenAI & Foundry Models Planning your budget for an AI project? Our Azure AI Token Cost Estimator is a ...
Azure ASR & GRS: The Hidden Gap in Your Azure Disaster Recovery Plan A common belief in IT is that protecting on-premises workloads with Azure Site Recovery (ASR) ...
Azure Azure Files Lifecycle Management: A Guide to NAS Migration Strategies Customers migrating from on-premises NAS to Azure Files want to leverage the cloud’s scalability and ...
Azure Migrate VMware to Azure Stack (Local) Azure VMware Solution (AVS) The recent acquisition of VMware by Broadcom has sent ripples through the virtualization market. With ...