Before SQL Server 2005 became available, I often ran into two situations when using either Profiler or System Monitor that were completely frustrating. First, I'd see a long duration event in Profiler and I wanted to know how it affected server performance. Or second, I would see a spike in server activity in System Monitor and wondered what query or other activity caused the spike to happen. In either case, I was unable to find out my answer because there was no way to correlate Profiler events and System Monitor counter activity. But in SQL Server 2005, there is.
In this article, we take a look at how to collect both System Monitor and Profiler data, then correlate the two so you can see exactly how SQL Server activity is directly affecting server performance, and vice versa. For this article, I am assuming you already know the basics of how to use System Monitor and Profiler.
Collecting Data in System Monitor
If you want to correlate System Monitor counter data with SQL Server 2005 Profiler Trace Events, you must capture both types of data during the same time period. While you can start either System Monitor or Profiler first, for the following example, I am going to start with System Monitor first, then Profiler second.
To begin collecting counter performance data with System Monitor, launch System Monitor, then expand "Performance Logs and Alerts." See the figure below. This is where you create a new System Monitor log file, which is needed to correlate to the Profiler Trace data we will also be collecting.
Next, right-click on "Counter Logs" and select "New Log Settings." The following dialog box appears.
Above, enter the name you want to assign to the log file you are going to be collecting. Then click "OK." The following dialog box appears.
Now, you can add whatever Objects or Counters you want to collect, and specify the length of the sample data interval you want to use to collect the data. When you select "Add Objects" or "Add Counters" to specify the data you want to capture in the log file, keep in mind that the more counters you collect, the more resources it will take away from SQL Server, so don't get carried away with trying to collect every Object or Counter there is. The same applies to selecting a sample data collection interval. While the more often you collect data, the easier it will be to correlate System Monitor Counters with Profiler Events, doing so also adds to SQL Server's resource overhead. You may want to experiment with different time intervals to see what works best for you.
In this demonstration, I have selected to collect counter data on three Objects, and to collect them every five seconds. Once you are done specifying the above two options, and any other options you choose to modify, select "OK." The following window appears.
You are done creating the Counter Log. Now, to manually start the counter, right-click on the Counter Log name and select "Start." At this time, the counter log will begin collecting the counter data you selected.
Collecting Data in SQL Server 2005 Profiler
Now that System Monitor is collecting data, it is time to create a Profiler Trace and have it begin collecting activities and logging them.
To do this, start Profiler. The following dialog box appears.
In the above dialog box, you will first need to assign a "Trace Name." Next, you need to select the "Save to file" option so that the trace you are collecting will be saved. Now click on "RUN," and the following dialog box appears.
In this dialog box, you select what events and data columns you want to capture. In my case, I limited my data collection to TSQL events, both SQL:BatchCompleted and SQL:BatchStarting. When specifying events, it is important that you select both their start and completed times so that you can see the proper correlation between System Monitor and Profiler data.
Once you are happy with the events, data columns, and any filters you want to use, select "Run," and the following window appears. This is your standard Profiler data collection screen. Profiler events are now being captured and stored in a trace log.
Ending the Traces
At this point, both System Monitor and Profiler are collecting data. Collect the data for as long as it is appropriate to meet your needs. When you have collected all the data you want, end both System Monitor and Profiler, in any order, it doesn't matter.
At this point you now have two log files: a System Monitor Counter Log and a Profiler Trace Log. I would also suggest that you exit both System Monitor and Profiler at this time. Doing so now can save some frustrations a little later.
How to Correlate System Monitor and Profiler Data
Now for the fun part. This is where you get to correlate the data in both of the logs you just collected, seeing how they relate to each other.
To perform the data correlation, start Profiler and load the Trace Log file that you just created. Next, click on "File" and then select "Import Performance Data." The following dialog box appears. Here, you must select the System Monitor Counter Log file you created earlier.
Once you have selected the correct log, select "Open," and the following dialog box appears.
In the above dialog box, you will see all the counters available to you. The only counters you will see will be the ones in the log you just opened. Select which counters you want to see correlated with Profiler event data then click "OK." The following window appears.
In the above window, you now see a correlation between Profiler Events (at the top of the dialog box) and System Monitor Counter data in the graph (in the middle). Depending on how many events and counters you are viewing, the dialog box above can become very busy. The larger your screen, and the fewer the events and counters displayed, the easier the data will be to view and analyze.
You can view the data two different ways. First, you can click anywhere on the graph and a red line will appear. This indicates the counter activity at a specific point in time. And then at the top, you will see an event highlighted that occurred closest to the time the counter data was collected. This option allows you to see what event (or events) was occurring at a specific point in time. You can then see what query or other activity was going on during this time and how it was affecting server resources.
Second, you can click on any of the Profiler events at the top of the dialog box, and then the red line in the graph will move to the time that particular event took place. This way, you can see how a particular event affected server resources.
For example, in the figure below, notice that I have clicked on an event at the top of the window that had a duration of 3364934 milliseconds. Notice where the red line is. Here, you can see that this particular event, which was very long, was the likely cause that the % Processor Time ran about 60% or so for a very long period.
Fully Understanding What You Are Seeing
After you begin playing around with this correlation window, you may discover some oddities. For example, above, how do I know that this long running query was the cause for the CPU running for 60% for such a long time? Actually, I don't. It is an educated guess. This tool does not have the ability to granularly determine exactly how much server resources were used up by a particular query. But you can make some educated guesses. In my case, I sorted all of the queries captured by Profiler by duration. By doing this, I found out that one of the queries ran for 3364934 milliseconds. Most of the others did not run that long. In the graph, I can see that the % CPU Utilization was high for about this same period of time. Thus, I can conclude that this particular query was the main cause of this excess CPU utilization. Especially when I can see that the CPU utilization went down to almost zero after the query finished executing.
You will also want keep in mind that multiple queries will probably be running at the same time, and that there may not be an exact time correlation between counters and events, as counters are measured over an interval of time you specify, and while events occur at discreet points in time. If you keep all of this in mind, and do a little practicing with the data, I think you will come to find that this correlation capability is very useful indeed. Now, finally, I can fairly easily determine what query (or queries) is causing my server to get stressed out.
No comments:
Post a Comment