Network Monitoring - Broadcast Equipment Monitoring

With MTD Systems On-Air Failures Don’t Always Have To Happen!

Finding Memory Leaks

July 30th, 2008

A memory leak is one of the most vicious problems a PC-based system can have. This post explains what a memory leak is and how it can be detected.

Memory leaks happen when a program, process, or thread requests memory from the operating system and fails to return it after using it; when the process needs memory again, it asks for more. The operating system doesn’t matter: Windows, Mac, and Linux are all vulnerable to this programming error. Each leak may be very small, but when it repeats over hours, days, weeks, or months, the unreturned memory adds up and at some point consumes all system memory. As the total is approached, the application becomes sluggish. Finally, the whole computer crashes.

Typically, if you are tracking the issue on a Windows system, you might bring up Performance Monitor or Task Manager on the PC’s console. With careful viewing of the page file graph, you might discern a slight upward trend:

InvisibleMemoryLeak

This graph is showing a substantial leak, yet it is difficult to see because Task Manager, set for normal update speed, only shows a few minutes of data.

Next is a graph taken over a 24-hour interval. On the left, the trace is up to 35%; it then drops down to 18% and stays relatively flat before dropping to 5% or so on the extreme right. This doesn’t show a memory leak, or does it?

DailyLeak

Here is the same machine tracked over a period of a week, with the one-day interval from above displayed on the right side of the graph:

WeeklyLeak

You can see that memory increased over most of the week, followed by a sudden drop-off on Monday (when the system was rebooted). Putting the samples into an even larger context, you can see the above week (labeled as week 42) preceded by other weeks with the same leak pattern:

MonthlyLeak
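The climb-then-drop pattern in these graphs can also be picked out of the numbers. The following Python sketch is one possible way to do it, not the method used to produce the graphs: it splits a series of memory-use percentages wherever the value falls sharply (taken here as a reboot) and reports how far memory rose within each stretch. The function names, the 10-point drop threshold, and the sample values are all assumptions chosen for illustration.

```python
def split_at_reboots(samples, drop_threshold=10.0):
    """Split a list of memory-use percentages into segments, starting a new
    segment whenever the value falls by more than drop_threshold points
    between two consecutive samples."""
    segments = [[samples[0]]] if samples else []
    for prev, cur in zip(samples, samples[1:]):
        if prev - cur > drop_threshold:
            segments.append([cur])  # sharp drop: treat as a reboot
        else:
            segments[-1].append(cur)
    return segments


def rise_per_segment(samples, drop_threshold=10.0):
    """Return last-minus-first for each segment between sharp drops."""
    return [seg[-1] - seg[0] for seg in split_at_reboots(samples, drop_threshold)]


if __name__ == "__main__":
    # Hypothetical daily readings covering a reboot partway through.
    readings = [5, 10, 15, 20, 25, 30, 35, 5, 9, 14, 20]
    print(rise_per_segment(readings))
```

A steady positive rise in every segment, repeated week after week, is the signature described above.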

What may not be visible in short-term monitoring clearly shows itself when performance counters are tracked over long intervals. We have seen leaks play out over several months without being visible even in a weekly view. This particular problem turned out to be a time service that someone had downloaded from the web. To find it, we had the system snapshot the running processes, with memory use and other statistics for each, when the memory threshold was reached. It was then a simple matter to see where the trouble was. Using this approach we have found unintended problems like this one, personnel playing games on mission-critical machines, and genuine application problems. We have also seen memory problems in individual processes. In this case the process itself may start getting a little fat with memory, yet not enough to show up on the machine’s virtual memory graph. When the process gets fat enough, it fails and the application stops.
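The snapshot-at-threshold idea can be sketched in a few lines of Python. This is not the MTD implementation; it only illustrates the logic. The `ProcessSample` type, the `snapshot_if_over` function, and the sample process names are hypothetical, and in practice the process list would have to come from the operating system (for example through WMI).

```python
from dataclasses import dataclass


@dataclass
class ProcessSample:
    name: str
    pid: int
    memory_mb: float


def snapshot_if_over(used_percent, processes, threshold=80.0, top_n=3):
    """Return the top_n processes by memory use when used_percent has
    reached the threshold; return an empty list otherwise."""
    if used_percent < threshold:
        return []
    return sorted(processes, key=lambda p: p.memory_mb, reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical process list for illustration only.
    running = [
        ProcessSample("playout.exe", 412, 220.0),
        ProcessSample("timesync.exe", 1288, 910.0),
        ProcessSample("explorer.exe", 1640, 35.0),
    ]
    for proc in snapshot_if_over(86.0, running):
        print(f"{proc.name} (pid {proc.pid}): {proc.memory_mb:.0f} MB")
```

Taking the snapshot only when the threshold is crossed keeps the stored data small while still capturing the process list at the moment it matters.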

If you suspect a memory leak because of periodic, unexplained crashes and your budget doesn’t support a monitoring system, there are certain things you can do with the tools available in Windows.

1. See our blog post on Getting WMI Data From Remote Machines. It describes the setup necessary to get performance data from a remote machine.
2. Open a command prompt and run: services.msc. Browse to the Performance Logs and Alerts service, right-click it, select Properties, and set the startup type to Automatic. Then select the Log On tab and set the service to log on with the common account you set up in step 1 above. Finally, go back to the General tab and start the service.
3. Open a command prompt and run: perfmon

4. Select Performance Logs and Alerts and right-click on Counter Logs; a dialog like the next one will open. Notice the green barrel: this is a set of counters that has been set up for a device, and green means the counters are successfully logging data. We will be adding another one of these in the next step:

PerfmonCounters

5. Select new and give it the target PC name; a dialog will open with the “General” tab selected. Click the add counters button. On the dialog that opens, click the add counters from computer radio button and enter your PC name (in this case, Mikegw has been entered).

VMCounter

6. After adding the counters and closing that dialog, select the Log Files tab and set the log file type to Text File (Comma Delimited), then apply and close.

CSV Logging

There are other counters that could be added, but in the context of memory monitoring Virtual Bytes is a good choice.

You will also notice a Schedule tab; this lets you set a monitoring interval and even rotate the logs. If you have more PCs you’d like to monitor, repeat the above process to add them. Perfmon will allow you to add up to 64 machines using this method. There are many other options in perfmon’s various dialogs, but we are only introducing the tool here.

Now to play with the data. Notice the path to your log files in the dialog: c:\PerfLogs. Browse to this folder and open the CSV file with the name scheme that you set up, and you will see your data. If you have Microsoft Office installed, the file will open by default in Excel (if not, congratulations on the nice numbers that you can review).

RawCSVData

Select both columns. The first column is a time stamp with minutes first (I collected data for about 30 minutes; in your case you might collect for a longer time), and the second column is the virtual memory data. Then select Insert and Line Graph, and Excel will do the rest:

Excel Generated Memory Leak Graph

This graph covers about 30 minutes and shows a substantial memory leak that we set up: VM went from about 750 MB at the beginning to 1.1 GB at the end.
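If Excel is not available, the same two-column log can be turned into a single leak-rate number. The Python sketch below is one way to do that, under stated assumptions: the time stamp format in `TIME_FORMAT`, the header names, and the three sample rows are guesses made for illustration and should be adjusted to match the actual log file. It fits a least-squares line through the samples and reports the slope in bytes per minute.

```python
import csv
import io
from datetime import datetime

# Assumption: time stamps look like "07/30/2008 10:00:00.000".
TIME_FORMAT = "%m/%d/%Y %H:%M:%S.%f"

# Hypothetical data shaped like the 30-minute run described above.
SAMPLE_CSV = '''"timestamp","virtual_bytes"
"07/30/2008 10:00:00.000","750000000"
"07/30/2008 10:15:00.000","925000000"
"07/30/2008 10:30:00.000","1100000000"
'''


def load_samples(csv_text):
    """Parse (timestamp, value) pairs, skipping the header and blank values."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # header row
    samples = []
    for row in reader:
        if len(row) < 2 or not row[1].strip():
            continue
        samples.append((datetime.strptime(row[0], TIME_FORMAT), float(row[1])))
    return samples


def leak_rate_bytes_per_minute(samples):
    """Return the least-squares slope of value against elapsed minutes."""
    start = samples[0][0]
    xs = [(stamp - start).total_seconds() / 60.0 for stamp, _ in samples]
    ys = [value for _, value in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    spread = sum((x - mean_x) ** 2 for x in xs)
    if spread == 0:
        raise ValueError("need samples taken at two or more distinct times")
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / spread


if __name__ == "__main__":
    rate = leak_rate_bytes_per_minute(load_samples(SAMPLE_CSV))
    print(f"about {rate / 1e6:.1f} MB per minute")
```

A rise from about 750 MB to 1.1 GB over 30 minutes works out to roughly 11.7 MB per minute, which is the kind of figure the sample rows are built to reproduce. A slope that stays positive across long logs is a stronger signal than any single reading.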

You can learn more about Perfmon at Microsoft’s TechNet.

The MTD System automatically tracks memory, virtual memory, and many other parameters on hundreds of monitored devices. It can also track the memory use of individual processes.

MTD Best Practice: Monitor virtual memory on all PC-based systems in your house and trigger an alert at 80%. This is usually a good, safe generic trip point, although certain systems may need a lower threshold and may need specific processes watched.
