Monday, August 30, 2021

Testing stored procedures - getting great coverage with zero external tools

I've been working on performance tuning stored procedures recently. These are stored procedures that only return data, with no data modification, mostly for reporting. Often the performance tuning isn't the hard part. The hard part is making sure that the changes that I make don't introduce bugs into the output.

Here's a process I use to make the testing process easier for this type of procedure.

Step 1: Creating a select statement that outputs an executable script, with a random assortment of IDs

Most of the stored procedures take, as a parameter, a UserID (or some other ID field). In order to get a set of stored procedure calls with a random UserID, you can use a script like the following:

 Select Top 10  
   'exec api_GetTiers '  
   + '''' + convert(varchar(100), UserID) + ''''  
 From Users  
 Order By NEWID()  

When you run the above script, the output is a set of calls to your target stored procedure. The NEWID() function in the Order By clause randomizes the row order, which gives you a random sample of UserIDs for the parameter.
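The same pattern extends to procedures that take more than one parameter. Here's a sketch of what that might look like; the procedure name api_GetTierHistory and its second parameter are hypothetical, used only to illustrate the pattern:

```sql
-- Sketch: generate calls for a procedure with two parameters.
-- api_GetTierHistory and its date parameter are hypothetical.
Select Top 10
  'exec api_GetTierHistory '
  + '''' + convert(varchar(100), UserID) + ''''
  + ', ''' + convert(varchar(10), GETDATE() - 30, 120) + ''''  -- 30 days ago, yyyy-mm-dd
From Users
Order By NEWID()
```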

Step 2: Execute the stored procedure, with random UserIDs

The output of this script will look something like this:

 exec api_GetTiers '40273230'  
 exec api_GetTiers '60372087'  
 exec api_GetTiers '30128477'  
 exec api_GetTiers '60008969'  
 exec api_GetTiers '60121799'  
 exec api_GetTiers '00303810'  
 exec api_GetTiers '60466614'  
 exec api_GetTiers '60147429'  
 exec api_GetTiers '70278452'  
 exec api_GetTiers '50542343'  

Copy and paste the output into a new query window in SQL Server Management Studio, and change the output to Text mode instead of Grid Mode (right-click in the query window, click on Results To, then choose Text). 

Next, execute the set of stored procedure calls. 

Copy and paste the output (since the Results are now in text mode, it's easy to do) into a text editing tool. My tool of choice is Notepad++. 


The final step is to update the stored procedure that you're working on, and then follow the above procedure again. You will now have two chunks of text, one with the output of the old stored procedure and one with the output of the new stored procedure, both called with the same UserIDs. Compare the output from the old and the new versions of the stored procedure with your preferred text comparison tool (I use the Compare plugin, in Notepad++).

Any differences that show up are potential bugs.
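A text diff works well when the output is small enough to eyeball. If you'd rather stay inside SQL Server, a sketch of an alternative is to capture both result sets with Insert ... Exec and compare them with Except. This assumes both versions return the same column layout; the columns below (TierID, TierName) and the name of the updated procedure are hypothetical placeholders:

```sql
-- Sketch: compare old and new output inside SQL Server instead of a text diff.
-- Column layout and api_GetTiers_New are hypothetical.
Create Table #OldOutput (TierID int, TierName varchar(100));
Create Table #NewOutput (TierID int, TierName varchar(100));

Insert Into #OldOutput Exec api_GetTiers '40273230';
Insert Into #NewOutput Exec api_GetTiers_New '40273230';

-- Rows present in one version but not the other are potential bugs.
Select * From #OldOutput Except Select * From #NewOutput;
Select * From #NewOutput Except Select * From #OldOutput;
```

One caveat with this approach: Except treats the results as sets, so it won't catch differences in row order or duplicate row counts the way a text diff will.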



Monday, August 23, 2021

Performance tuning a stored procedure with user defined function calls in SQL Server

UPDATE: It turns out that there are easier ways of doing this. Take a look at this post instead of following the below.

******************************************* *******************************************

Have you ever tried to do performance tuning on a stored procedure that has user defined function calls?  It can be very challenging. This is mainly because one of the most basic tools of performance tuning—checking the number of logical reads using Set Statistics IO—is just not accurate in this scenario. The logical reads of any user defined functions called during a stored procedure are not reported by Set Statistics IO, so you get output that doesn't reflect their true cost.  

Before I get into the details of how to work around this, let's just get this out of the way—yes, user defined functions have a very mixed reputation because of performance issues. Functions in the Select list or Order By clause are likely to be okay. But functions have no place in the Where clause, where they can prevent index seeks and force row-by-row evaluation, causing huge performance issues.

Yet you may be working on a system that has hundreds of user defined functions scattered everywhere, including the Where clause, within hundreds of stored procedures. That's what I was recently faced with.

It's not possible to fix all these issues quickly, so you need to target the worst performers, and figure out how to improve them. I use Query Store (usually the Top Resource Consuming Queries report) to find the worst performers in general. But once you have a stored procedure targeted for performance improvement, how do you get the full performance picture if it contains user defined functions?

Here's a process you can use to get data that accurately reflects performance (including logical reads), even when there are user defined functions inside your stored procedure. The key is to use the built-in reporting in Query Store rather than Statistics IO.

Step 1: Purge Query Store data

Make sure Query Store is turned on for your database. Information on how to do this is easy to find online. After ensuring it's turned on, click on the Purge Query Data button. (This is, of course, assuming you're not working on a production server). This will give you a blank slate in the Query Store.
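If you'd rather script this than click through the UI, both steps are available in T-SQL. The following is the equivalent of enabling Query Store and then pressing the Purge Query Data button; again, only do this on a non-production server:

```sql
-- Turn Query Store on for the current database (no effect if already on).
Alter Database Current Set QUERY_STORE = ON;

-- Equivalent of the Purge Query Data button: clears all Query Store data.
Alter Database Current Set QUERY_STORE CLEAR;
```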



Step 2: Run the stored procedure

Execute the stored procedure in a query window. This will populate the previously empty Query Store with just the performance metrics for this specific call.

Step 3: Check the Query Store: Top Resource Consuming Queries

Next, go to the Top Resource Consuming Queries report in Query Store, and open it. By default it gives you the graph output, which doesn't give you numbers. Instead, switch to the simple grid output, and copy and paste the results to a spreadsheet. I usually do this for both Logical Reads and CPU.
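If you want the numbers without the copy-and-paste step, a sketch of a query against the Query Store catalog views can pull the same metrics directly. Note that avg_cpu_time in these views is reported in microseconds:

```sql
-- Sketch: total logical reads and CPU per query text from Query Store.
-- With a freshly purged Query Store, this reflects only your test run.
Select qt.query_sql_text,
       Sum(rs.count_executions) As executions,
       Sum(rs.avg_logical_io_reads * rs.count_executions) As total_logical_reads,
       Sum(rs.avg_cpu_time * rs.count_executions) As total_cpu_time_us
From sys.query_store_query_text qt
Join sys.query_store_query q On qt.query_text_id = q.query_text_id
Join sys.query_store_plan p On q.query_id = p.query_id
Join sys.query_store_runtime_stats rs On p.plan_id = rs.plan_id
Group By qt.query_sql_text
Order By total_logical_reads Desc;
```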

Now what? When I first ran through this with my stored procedure, I found that the user defined function calls in the stored procedure were actually not as big a problem as I had suspected. Instead of focusing on user defined functions, I targeted other areas of the stored procedure and got a very substantial performance improvement. 

To validate the improvement, I ran through the same steps as above, but using the updated stored procedure. Comparing Logical Reads and CPU numbers between the old and new versions of the stored procedure gave me the results I was looking for.