SQL SERVER INDEX (Part 2): Key Lookup
Firstly, I recommend watching this How to Think Like the Engine - YouTube by Brent Ozar on YouTube (all parts). It is incredibly helpful foundational material before you begin your SQL performance tuning journey.
To demonstrate concepts about key lookups, I will use the StackOverflow2010 database. Feel free to follow along with my examples. If you haven't, please check out my post Tools I Used During Query Performance Troubleshooting to see how to setup environment to obtain execution plans and track logical reads.
Now, let's consider the following index and query:
ON dbo.posts (PostTypeId);
SELECT ID,
Title,
LastEditDate,
Score
FROM dbo.Posts
WHERE posttypeid = 3
The index IX_PostTypeID
has PostTypeID as its key column. To execute the query, the optimizer can choose to use the index to seek post where the PostTypeID is 3.
However, the index doesn't contain other columns required by the query, specifically, Title, LastEditdate, Score. Therefore, additional key lookups on the clustered index PK_Post_ID are necessary.
That is how a key lookup works. Using the non-clustered index IX_PostTypeID, SQL Server seeks post ids with PostTypeID = 3. Once those post IDs are retrieved, the optimizer performs key lookups on the clustered index PK_Post_ID to fetch the additional column data (See the demonstration below).
However, key lookups can be extremely costly if there are a large number of rows to lookup.
When Key Lookups Are Acceptable
Key lookups are generally acceptable when there are only a few rows to retrieve.

PostTypeID = 3
. Therefore, 28 key lookups are required to retrieve Title
, LastEditDate
, and Score
. The optimizer determines that this is manageable.When Key Lookups Become a Problem
PostTypeID
that matches millions of rows, the optimizer decides using index seek with key lookups is no longer efficient, as it would require millions of key lookups. Instead, it chooses a clustered index scan - scanning the entire table and filter rows (with logical reads totaling 800,856).
A key point to remember is that if cardinality estimates are accurate, the optimizer usually makes the right choice—unless a query hint forces it otherwise.
However, when cardinality is underestimated, the optimizer might mistakenly choose an index seek with millions of key lookups, when a scan would have been more efficient.
OPTION (RECOMPILE)
when analyzing cardinality estimates.The Solution: Covering Index
CREATE INDEX IX_PostTypeID
ON dbo.posts (PostTypeID)
include (Title, LastEditDate, Score);

This index covers the query, eliminating the need for key lookups on the clustered index PK_Post_ID
.
Why Not Include All Columns in an Index?
IX_PostTypeID
is to support faster seeks for queries filtered by PostTypeID
. However, we don’t want the index to become bloated, which would lead to longer reads and writes. That’s why we include only the columns required by the query.A useful rule of thumb I learned from DBA guru Brent Ozar:
A table should have no more than 5 non-clustered indexes, and each index should ideally include no more than 5 columns —unless you're running a high-performance machine (e.g., 12–16 CPU cores, 64GB RAM, 2TB SSD).
Fewer supporting indexes may require some queries to scan the table to retrieve data (while seeks would have read a smaller number of pages and search the data faster). More indexes can slow down updates, deletes, and inserts; potentially lead to slower storage, and reduce memory efficiency. It’s important to find the right balance so performance doesn't suffer (helpful tool for index tuning and query tuning: sp_blitzindex and sp_blitzcache)
Comments
Post a Comment