Once in a while, most of SQL Server DBA/developer might have encountered an issue where SQL Server host is using almost all memory and CPU when checked in Windows Task Manager. A memory bottleneck can lead to excessive paging that ultimately impacts performance
Usually, on SQL Server machine we can see that memory used is high while CPU and disk usage is low, which is what we expect on SQL Server machine. SQL Server uses as much memory as it can to improve performance by caching data into memory so that it can avoid reading from disk. It acts like a memory hog and will hold on all the memory we have allocated to it, even if it is not using it. SQL Server used Buffer pool to efficiently manage memory requests for SQL Server processes and is one of the massive consumers of memory for SQL Server. In 32 bit system, there is some memory which is allocated outside buffer pool and is allocated during system start up and is referred to as Reserved memory aka memory to leave. So “Buffer pool” and “Reserved memory” contributes to total memory consumed by SQL Server.
Default memory setting during installation of SQL Server is for it to use all of the available memory. When Windows OS requests for memory then SQL gives it back in non-preemptive mode and that too in huge amounts and then proceed to build it back again to the maximum amount. This can happen if you have not given enough memory for OS to work. That is why it is very important to configure MAX server setting for SQL Server instance leaving enough memory for OS to work smoothly. Sometimes some drives leak memory or some DLL’s which are loaded leak memory which is installed on OS if such is the case then we need to find it and get it removed.
Rather than relying on Task Manager, we can check some other metrics to investigate if there is any memory related performance concern. For any bottleneck, we need to identify the problem and then iterate possible causes. For testing that your SQL Server is facing memory crunch, you can refer to Error log or Perfmon.
- Error Log: If you see Out of memory error(OOM), then it might point to memory issue. It can be due to poorly written query, an inappropriate setting of max memory in SQL Server configuration or if there is actual memory crunch on OS due to which SQL Server is kicking.
- Perfmon: Best place to see how SQL Server is performing with current memory configuration.
Below list highlights few metrics, some of which can be queried from T-SQL or can be followed with Perfmon on which we can focus our attention:
- Buffer Cache Hit Ratio
- Page Life Expectancy
- Checkpoint Pages/Sec
- Target and Total Server Memory
- Memory Grants Pending
- Batch Requests and Compilations per second
Buffer Cache Hit Ratio (BCHR)
This counter averages (since the last restart of your SQL instance) how often SQL Server goes to the buffer pool to get data, and actually finds that data in memory, instead of having to go to disk. If your BCHR is high 90 to 100 then it points to fact that no memory pressure. If somebody runs a query which requests a lot of pages in that case momentarily BCHR might come down to 60 or 70 may be less but that does not mean it is a memory pressure it means your query requires large memory and will take it. After that query completes you will see BCHR rising again.
SELECT [cntr_value]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'
AND [counter_name] = ‘Buffer cache hit ratio’
If this number is consistently below the 90% mark, you may need to add more RAM to your server to boost performance.
Page Life Expectancy (PLE)
PLE shows for how long page resides in the buffer pool. The longer it stays the better it is. It’s a general misconception to take 300 as a baseline for PLE. The 300 was the baseline for SQL Server version 2000 when maximum RAM was used to be 4-6 GB. For calculating baseline, there is a custom formula(tentative) provided in Jonathan Kehayias book (troubleshooting SQL Server). Take the base counter value of 300 presented by most resources, and then determine a multiple of this value based on the configured buffer cache size, which is the ‘max server memory’ sp_ configure option in SQL Server, divided by 4 GB. So, for a server with 32 GB allocated to the buffer pool, the PLE value should be at least (32/4)*300 = 2400.
SELECT [cntr_value]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'
AND [counter_name] = 'Page life expectancy'
When SQL Server runs out of memory PLE value drops immediately and raises back gradually. But if PLE value stays down the baseline value consistently then the server may have the memory problem and may need additional RAM.
Checkpoint Pages/Sec
Checkpoint pages /sec counter measures the number of pages read from or written to disk. This counter provides us details of memory pressure because if buffer cache is low then lots of new pages need to be brought into and flushed out from the buffer pool, due to load, checkpoint’s work will increase and will start flushing out dirty pages very frequently. If this counter is high then your SQL Server buffer pool is not able to operate up with requests coming and we need to increase it by increasing buffer pool memory or by increasing physical RAM and then making adequate changes in Buffer pool size. Technically this value should be low if you are looking at line graph in perfmon this value should always touch base for the well-built system.
Target and Total Server Memory
Target server memory is amount of memory SQL Server is trying to obtain and Total Server memory is current amount of memory obtained by SQL Server
SELECT [counter_name], [cntr_value]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Memory Manager%'
AND [counter_name] IN ('Total Server Memory (KB)','Target Server Memory (KB)')
The ratio of Total Server Memory and Target Server Memory should be close to 1. If Total Server memory stays significantly less than Target Server memory, then it may lead to couple of thoughts
- Allocated memory for SQL server is much more than it can use, this will be a situation when databases are smaller than the amount of memory on the machine.
- SQL cannot grow buffer pool because of memory pressure from outside of SQL. In this case, you need to either increase Max Server memory or increase RAM to boost performance.
Memory Grants Pending
Query memory grant is a part of SQL server memory used to store temporary row data while sorting and joining rows. It is called “grant” because the server requires those queries to “reserve” before actually using memory. This reservation improves query reliability under server load because a query with reserved memory is less likely to hit out-of-memory while running, and the server prevents one query from dominating entire server memory. If you see memory grants pending in buffer pool your server is facing SQL Server memory crunch. Value for counter should always be less than 1, anything above 1 indicates there are SQL processes waiting for memory for further transactions.
SELECT [cntr_value]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Memory Manager%'
AND [counter_name] = 'Memory Grants Pending'
Memory grants pending could be due to bad queries, missing indexes, wrong key ordering, sorts or hashes. To investigate this, you can query the sys.dm_exec_query_memory_grants view, which will show which queries (if any) that require a memory grant to execute.
--Find all queries waiting in the memory queue
SELECT * FROM sys.dm_exec_query_memory_grants WHERE grant_time IS NULL
I hope this article will help you in troubleshooting SQL Server memory issues, and instead of looking only at stats from Task Manager, above measurements would help you to make critical decisions about resources on your SQL Server.
Comments
Post a Comment