Most medium to large sized businesses have a reporting solution that provides timely financial information in a user-friendly format for sales people, managers and customers. Usually, the "solution" is a dedicated reporting employee who manually queries source databases for statistical data.
This paper offers two software specific solutions to overcome limitations exposed in Microsoft SQL Server's Reporting Services. These solutions offer the benefit of providing employees with the ability to quickly generate dynamic report templates and focus their energy on future enhancements instead of continuous maintenance.
(Note: This article applies to Microsoft SQL Server Reporting Services 2000 w/SP4.)
The Framework
Microsoft SQL Server Reporting Services provides the development environment needed to develop report templates and to parse, retrieve and display the data through Internet Information Server (IIS).
The Reporting Services development environment is a Visual Studio (VS) application that provides report specific methods and functions, including two-dimensional tables, multi-dimensional matrices, list boxes, dropdown selection boxes and other visual report tools.
Reports are created with VS, uploaded to the Reporting Services server and then accessed through IIS by end-users.
Management of developed reports is restricted to the Reporting Services Web site, which lets you configure scheduled report deliveries through subscriptions, manage the security of individual reports and display basic information. Bug fixes, enhancements or further development must take place within the development environment.
The Limitations
To understand the restrictions inherent in Reporting Services, one must have an understanding of the customer's report requirements. In most cases, the tools and methods provided by Reporting Services are sufficient to author typical reports. However, in specific circumstances the report requirements can eclipse the built-in tools and methods. One of these specific circumstances occurs when the source data on which a report depends consists of multidimensional data, such as the data retrieved from OLAP cubes. This data is often horizontally dynamic, which means that the number, format and name of the columns in an explicitly named, two-dimensional result set can be different from any other result set within the same report.
For example, in a standard table format, rows represent individual entities, whereas columns represent characteristics of the rows. In an "Inkjet Product" table, each row may denote an individual hardware component, such as a printer cartridge. Each column in that row would denote a characteristic of that component, such as "manufacturing date," "size" or "cost." In a "LaserJet Product" table, there may be similar columns, but there may also be additional columns to denote different characteristics.
In multidimensional structures, we are nesting these separate datasets hierarchically. Each product may belong to a product category, a product line, a sales channel or a geographic region. We are also nesting these datasets within a time hierarchy, which adds the key limitation that my solutions will resolve.
Fig. 1 In this example of a nested product hierarchy, "A" is the top level of the Product Line, "B" is the Product Category, "C" is the Product Sales Motion, "D" is the Product SKU and "E" is the current inventory. Each D-level Product SKU could belong to any, all or none of the C-level Sales Motions.
The limitation of Reporting Services appears when the concept of a "rolling average" or "rolling window" is introduced. This concept is quite simple and very common in financial reports. Basically, the idea is to provide a dataset that encompasses data from an ever-changing range of dates. If I were to ask for a list of the dates included in the "Last 13 weeks of sales," the result set would be a static list: "2005 Week 50," "2005 Week 51," "2005 Week 52," "2006 Week 1," "2006 Week 2," and so on. If I were to ask for the same list a month from now, it would be different: "2005 Week 50" would fall off the list and a new one would be added.
Fig. 2 In this report of the "Last 4 Weeks," each date in column "A," "B," "C" and "D" are valid today. But what about a week from now? Column A would no longer be the oldest, or appear on the report. Each column and its corresponding data rows shift to the left each week.
A Reporting Services template does not allow for this. When a report is requested and rendered through Reporting Services, a dataset is retrieved from the database. At this point, the dataset is a table with statically named columns based on the data available in the OLAP cube at the exact time the report is made, so each time you render the report, the columns could potentially change.
This limitation would make manual authoring imperative for a report with horizontally dynamic datasets. For each time period, an employee must manually edit the report definition to include the latest date and remove the oldest date. Then the employee must re-render and upload the newly modified report.
The second limitation is simpler to understand than the first limitation. Quite simply, the Reporting Services feature set provides dataset "filtering" for use with two-dimensional datasets but does not provide post-retrieval filtering for multidimensional datasets. This would require re-sorting, re-ordering and hiding specific rows or columns denoted within the filter language. However, column filtering is not designed to work on a multidimensional dataset; the Reporting Services filtering method only works on two-dimensional data. This means that any filtering of OLAP data must be made during the data retrieval step, and not after Reporting Services has rendered the report.
Reporting Services offers a Report Parameter feature set that provides the ability to pass a "filtering" value to the dataset query, which occurs before the report is rendered. This causes problems because the language used to query a multidimensional database or OLAP cube does not provide a simple to use filtering element, such as a "where" clause in ANSI97 SQL standards. In some circumstances, it is not feasible to use the MDX "where" filtering clause, due to the nature of multidimensional queries.
The solutions offered in this paper exceed these limitations and provide the Reporting Services programmer with a more complete set of tools to author realistic reports.
Solution 1: Parameterization of Multidimensional Sourced Reports
Before we begin to add report parameters to our OLAP query, let's look at a common OLAP query. Understanding MDX and OLAP queries is beyond the scope of this article; please seek this information elsewhere.
SELECT {
{ [13 Weeks] } *
{ [Measures].[Sell To Units], [Measures].[Sell To Units Last Year] }} ON COLUMNS,
NON EMPTY {
{ { [Sales Type].[All Sales Types].CHILDREN }*
{ DESCENDANTS( [Product].[All Product], [Product].[Product Line] ) }} } ON ROWS
FROM [Financial Cube] WHERE [Account].[Account Family Id].[SampleAccount]
In this example, "13 Weeks" is a named set defined on the OLAP cube that returns a subset of the Time dimension containing the last 13 weeks from the current date, reversed. We're returning two calculated measures "Sell to Units" and "Sell to Units Last Year" across the X-axis and grouping all our products by "Product Line" and "Sales Type." Our only filter is located in the WHERE clause, which denotes returning the data that intersects with the "SampleAccount."
This MDX query is quite easy to parameterize. We would simply create a Report Parameter named "Account" and fill it with the list of accounts retrieved from a separate dataset. This dataset could be simply defined as
SELECT { [13 Weeks] } ON COLUMNS , NON EMPTY { [Account].[All Account].CHILDREN } ON ROWS FROM [Financial Cube]
In our main dataset, we would replace the existing WHERE clause with the following:
WHERE [Account].[Account Family Id].[" + Parameters!Account.Value + "]"
And add =" to the beginning of the entire dataset, so we can trick Reporting Services into treating our entire query as a string to execute, hence the query is wrapped inside a pair of quotes: ="
The only drawback to this method is that the query dataset designer will not properly evaluate the query, so you won't be able to preview the rows. This requires you to confirm that you have a decent, working MDX query before you start adding in parameters.
Adding a Report Parameter to a multidimensional report doesn't appear to be much different from a standard two-dimensional query string with report parameters, until we require additional report parameters NOT filterable from the WHERE clause.
Consider the following MDX:
SELECT NON EMPTY
{ { {[Measures].[Sell To Units], [Measures].[Inventory Units]} *
{ DESCENDANTS( [13 Weeks], [Time].[Week].[Days] )}}
} ON COLUMNS,
NON EMPTY
{ [Product].[Category].[Paper],
DESCENDANTS( [Product].[Category].[Product Line 1],
[Product].[SKU] )
} ON ROWS
FROM [Staples]
WHERE ([Account].[Account Family Id].[Account Family Name 1] , [Sales Type].[Type].[Sales Type 1]
Here we're asking for the last 13 weeks of data on a daily level, including all products from Product Line 1, and filtering out anything that isn't intersected with Account Family Name 1 and Sales Type 1. Let's pretend that the customer is requesting a report that allows a choice of Account Families and Sales Types.
This is easily added to our query parameters . The WHERE clause (above) becomes
="…WHERE ([Account].[Account Family Id].["+ Parameters!Account_Family.Value + "] , [Sales Type].[Type].[" + Parameters!Sales_Type.Value + "])"
What if in addition to these parameters, a customer requested the ability to filter on Product Category and Product Line (our vertical elements)? Could we simply substitute [Paper] and [Product Line 1] with escaped strings? Yes.
Our new Y-axis would be represented as
NON EMPTY
{ [Product].[Category].&["+ Parameters!Product_Line.Value +"],
DESCENDANTS( [Product].[Category].&["+
Parameters!Product_Line.Value +"],
[Product].[SKU] )
} ON ROWS
The report parameters would need to be defined, and the values available. Using report parameters in your MDX query means you do not need to filter the data using the Reporting Services filter method, which can't be done on multidimensional data anyway. If there were a way to use post query filtering through MDX, you would still need to return ALL of the data first, and then apply the Reporting Services filter method. This could lead to some very long running MDX queries that may be detrimental to the user experience, and in some cases, may cause the Reporting Services/IIS server to stop responding long enough to timeout, resulting in a failure to render the report.
Solution 2: Multidimensional Report Dynamic Columns
Next, we're going to extend our query string code to return additional data that we're going to use in our report headers. The trick we're using works due to the way a query string can be defined. Instead of writing a simple SELECT statement to return data, we're going to create a stored procedure type logic set. Recall that in our previous solution, we used the ="
Consider the following stored procedure, broken into three sections for readability:
DECLARE @rec_num int, @weeknm varchar(255)
IF EXISTS
(
SELECT * FROM dbo.sysobjects
WHERE id = object_id(N'[Rpt_Header]') AND OBJECTPROPERTY(id,
N'IsUserTable') = 1)
DROP TABLE [Rpt_Header]
CREATE TABLE [Rpt_header](rec_num INT IDENTITY (1, 1), weeknm
VARCHAR(40)
)
--
INSERT [Rpt_Header](weeknm)
SELECT
cast([weekyear] as varchar(4))+'_'+weeknm + '_'+ cast(year([date]) as
varchar(4)) + '_'
+ case when len(datepart(mm, [date])) < 2 then '0' + cast(month([date]) as
varchar(2)) else cast(month([date]) as varchar(2)) end
+ '_' + case when len(datepart(dd, [date])) < 2 then '0' + cast(day([date]) as
varchar(2)) else cast(day([date]) as varchar(2)) end
+ '_00_00_00'
FROM olapcube_time_dimension_table
WHERE [Date] BETWEEN getdate() — 100 AND getdate() AND [weekpart] = 6
order by [weeknm] desc
--
SELECT c1.weeknm AS C1, c2.weeknm AS C2, c3.weeknm AS C3, c4.weeknm
AS C4, c5.weeknm AS C5, c6.weeknm AS C6,
c7.weeknm AS C7, c8.weeknm AS C8, c9.weeknm C9, c10.weeknm C10,
c11.weeknm AS C11, c12.weeknm AS C12,
c13.weeknm AS C13
FROM rpt_header AS c1, rpt_header AS c2, rpt_header AS c3, rpt_header AS
c4, rpt_header AS c5, rpt_header AS c6,
rpt_header AS c7, rpt_header AS c8, rpt_header AS c9, rpt_header AS c10,
rpt_header AS c11, rpt_header AS c12,
rpt_header AS c13
WHERE c1.rec_num = 13 AND c2.rec_num = 12 AND c3.rec_num = 11 AND
c4.rec_num = 10 AND c5.rec_num = 9 AND
c6.rec_num = 8 AND c7.rec_num = 7 AND c8.rec_num = 6 AND
c9.rec_num = 5 AND c10.rec_num = 4 AND
c11.rec_num = 3 AND c12.rec_num = 2 AND c13.rec_num = 1
Let's look at this procedure step by step.
The first section simply declares a couple of local variables and creates a local table. This section could be modified to create a temporary global or local table if database permissions for the Reporting Services account aren't set to allow the creation of objects.
The second section fills our newly created table with an identity value and a date value written as a string. We're using our WHERE clause to return the thirteen most recent date fields from the database. The string written will be similar to
2006_2006W06_2006_02_09_00_00_00
At this point, our table contains 13 rows on two columns. Each row contains an identity value and a date string. The last section of the procedure uses the self-join method (also known as a pivot) to distribute the rows across columns. This technique can cause the select statement to be quite messy and confusing, but it provides the functionality we're looking for. After this runs, we'll have the following table available:
C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 |
200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
Why are we getting this table? We're going to substitute the date string values from this table into our report definition. More specifically, we're going to take each date string and put them in the report header that we'll use to construct our main query. Since this procedure is dynamic in nature due to our WHERE clause using the "getdate()" function, each time the query is run, it will return the most recent thirteen weeks.
Before we start using these strings in our report, we need to set up the other query strings that pull the actual data from the OLAP cube. If we use the example from the first solution above in a new dataset, our report data will contain something like the following:
| Measures_Sell | Measures_Sell | Measures_Sell |
Product1 | 192 | 190 | 112 |
Product2 | 10 | 12 | 9 |
If we created a matrix within our report and started to drag the dataset fields to our matrix columns, everything would be fine — until next week. Then the value in the first column would be from last week. The report designer would have to go into Visual Studio, rerun the OLAP dataset query, then drag and drop the dataset fields again to the matrix data section.
Inspecting the matrix properties, in the first column, we would see that the value is derived from this code:
=Fields(Measures_Sell_ToTime_2006_2006W05_2006_02_02_00_00_00).Value
Can we simply modify this code somehow to pass in a string? Yes. We can escape out of the code as before with the ="" method, but with a small modification. Since Reporting Services doesn't like ="Fields(blahblah)", we'll use =Fields("blahblah").value so it doesn't complain.
We'll take our dynamic date string dataset (named DataSet1 in this example) and pass it into our matrix data section code.
=Fields("Measures_Sell_To_Time_" + First(Fields!C1.Value, "DataSet1") + "").Value
Since datasets execute before the report definition evaluates report items, we're getting the list of dates as strings and passing them immediately to the report headers, which then evaluate the newly made string. We can also use the date string anywhere else we want, most likely in the column header names of the report.
The code =First(Fields!C1.Value, "DataSet1") can be put in the header textboxes of the matrix column to display the string for our customers.
To review, this solution simply gets a list of the last thirteen dates, converts them to strings, pivots the result set and substitutes these strings into report items.
This solution can definitely be improved upon. No exception handling is done by our stored procedure, a custom pivot function could be used in place of the messy self-join and our derived table could include better-formatted date strings.
Conclusion
With a bit of effort, Reporting Services reports can become self sufficient to the point where they only need to be authored once. Introducing logic into your datasets can provide the functionality you need to make self sufficient, intelligent reports.
No comments:
Post a Comment