Troubleshoot SQL Server Cardinality Estimates
Today, I will show you why a good cardinality estimate is essential for an efficient execution plan and how I troubleshoot inaccurate cardinality estimates.
I will use the database StackOverflow2010 to demonstrate.
INSERT 50,000 rows
I insert 50,000 rows with OwnerUserID 26837 into the Posts table. Let's pretend this is a real-life workload!
DECLARE @date AS DATETIME = GETDATE();
INSERT INTO posts
(body,
lastactivitydate,
creationdate,
score,
viewcount,
posttypeid,
owneruserid)
VALUES ('Some question',
@date,
@date,
0,
4,
1,
26837)
GO 50000
Run a query
The following query will be run:
DROP TABLE IF EXISTS #tempposts;
CREATE TABLE #tempposts
(
id INT
)
INSERT INTO #tempposts
SELECT id FROM dbo.posts WHERE owneruserid = 26837
SELECT title, u.displayname, pt.type
FROM dbo.posts p
INNER JOIN #tempposts temp ON p.id = temp.id
INNER JOIN dbo.users u ON p.owneruserid = u.id
INNER JOIN dbo.posttypes pt ON p.posttypeid = pt.id
OPTION (RECOMPILE, USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_150'))
Examine the execution plan
Execution Plan (Before statistics updated)
The Cardinality Estimator, which I will call the CE from here on, failed to accurately estimate the number of rows being processed. While 50,319 rows were returned from the join, the estimate was only 202 rows. This underestimated row count propagated to downstream operators in the execution plan (e.g., the Sort operator), whose estimates were off as well.
The query performed 457,732 logical reads. Whether this is optimal we cannot say yet, but we do know that the estimates are significantly off.
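As an aside, one way to capture logical-read counts like the 457,732 above is SET STATISTICS IO; a minimal sketch:

-- Report per-table I/O (and timing) in the Messages tab
SET STATISTICS IO, TIME ON;

-- ... run the query under investigation here ...

SET STATISTICS IO, TIME OFF;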
Additionally, based on the assumption that only 202 rows would be returned, the optimizer chose to perform seeks on the PostTypes and Users tables instead of scans.
Check out the OptimizerStatsUsage property
Since SQL Server 2017, we can identify which statistics were used during compilation through the execution plan.
To see this property, I make sure the query recompiles and runs under the newer compatibility level (e.g., 150) via the query hints shown above.
The OptimizerStatsUsage property is located under the properties of the root node of the plan:

Notice there have been 50,000 modifications to the Posts table since its statistics were last updated, corresponding to the rows we just inserted. Updating the statistics might improve the accuracy of the estimate.
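If you want to check the modification counter without opening the plan, the sys.dm_db_stats_properties DMF exposes the same information; a quick sketch:

-- Rows modified since each statistics object on dbo.Posts was last updated
SELECT s.name,
       sp.last_updated,
       sp.rows,
       sp.modification_counter
FROM   sys.stats AS s
       CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE  s.object_id = OBJECT_ID('dbo.Posts');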
Investigate with DBCC SHOW_STATISTICS
But we can dive deeper into how the CE arrives at the estimate of 202 rows, using the command DBCC SHOW_STATISTICS. First, the statistics associated with the Posts primary key:
DBCC SHOW_STATISTICS ('dbo.Posts', pk_posts__id);
Next are the statistics associated with the temp table:
DBCC SHOW_STATISTICS ('#tempPosts', _wa_sys_00000001_b7a2b873);
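The _WA_Sys_* name above is the auto-created statistics object on the temp table's id column. Since that name varies between sessions, you may need to look yours up first; a sketch, assuming the temp table exists in your session:

-- Auto-created statistics on temp tables live in tempdb
SELECT name, stats_id
FROM   tempdb.sys.stats
WHERE  [object_id] = OBJECT_ID('tempdb..#tempposts');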

For simple joins with a single equality or inequality predicate, the new CE uses a coarse alignment algorithm that aligns the histograms using minimum and maximum matching histogram boundaries to estimate the number of matching rows.
The maximum matching histogram step of the temporary table on the right could have extended to the 5th step, but it stopped at the 2nd step because the left table's maximum boundary was only 12,496,713. As a result, the 50,000 newly inserted rows, whose IDs lie beyond that boundary, were not included in the calculation.
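If you would rather compare the histogram steps programmatically instead of eyeballing the DBCC output, sys.dm_db_stats_histogram (SQL Server 2016 SP1 CU2 and later) returns them as a result set; a sketch, assuming the primary key statistics object has stats_id = 1 (check sys.stats if it differs):

-- Histogram steps for the PK statistics on dbo.Posts
SELECT step_number,
       range_high_key,
       range_rows,
       equal_rows
FROM   sys.dm_db_stats_histogram(OBJECT_ID('dbo.Posts'), 1)
ORDER  BY step_number;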
Update statistics
Let's update the statistics and review the execution plan and the performance change:
UPDATE STATISTICS dbo.posts
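By default, UPDATE STATISTICS samples the table. If a sampled update still misrepresents the data, scanning every row is an option, at the cost of a slower update:

-- Build the histogram from all rows instead of a sample
UPDATE STATISTICS dbo.posts WITH FULLSCAN;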

The number of logical reads dropped to 221,217, roughly a 2x improvement.
Trace flag 2371
By default, SQL Server updates statistics automatically once about 20% of a table's rows have been modified. Trace flag 2371 replaces that fixed threshold with a dynamic percentage based on the number of rows in the table: the more rows, the lower the threshold.
The Posts table has 3.7 million rows. Without the trace flag, we would need (3.7 million * 20%) = 740,000 inserts before the system updates the statistics.
With the trace flag enabled, the threshold drops to sqrt(1,000 * 3.7 million) ≈ 60,000 inserts.
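To try this yourself, the flag can be turned on globally. Note that on SQL Server 2016 and later, databases at compatibility level 130 or higher already use this dynamic threshold by default, so the flag matters mostly for older versions or lower compatibility levels:

-- Enable trace flag 2371 server-wide (-1 = global)
DBCC TRACEON (2371, -1);

-- Confirm it is active
DBCC TRACESTATUS (2371);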
- With the trace flag enabled, the statistics are updated once the number of inserts hits roughly 60k. With the up-to-date statistics, the query plan is more efficient, causing the drop in the number of logical reads.
- The more data the query reads, the worse the performance, as indicated by the linear growth while the number of rows inserted into the table (and qualifying for the query) increases.
- With up-to-date statistics and an optimal query plan, the slope of that growth is smaller, indicating better scalability.

To better understand cardinality estimates, check out the Microsoft article Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator | Microsoft Learn.