Wednesday, July 2, 2014

Missing parameter values in Reporting Services Linked Reports

If, one day, you're creating a linked report and you specify a default value for one (or more) of the parameters, you might receive an error stating 'The parameter [name] is missing a value' when you test the report.

No matter what you try, nothing works until you check the XML in the Parameter column of the ReportServer.dbo.Catalog table. You'll need to remove this particular string:


This key/value pair effectively deletes the default value you've specified for the parameter, hence the error message. Copy the current XML from that field, remove the offending string, and update the field with the modified XML.
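If you want to look at (or fix) that XML with a query, something along these lines will do it - noting that editing the ReportServer catalog directly isn't supported by Microsoft, and the report name below is just a placeholder:

-- Inspect the parameter XML for the linked report
SELECT Name, Path, CAST(Parameter AS NVARCHAR(MAX)) AS ParameterXml
FROM ReportServer.dbo.Catalog
WHERE Name = 'My Linked Report'

-- Once you've removed the offending string, write the corrected XML back
UPDATE ReportServer.dbo.Catalog
SET Parameter = '<corrected XML goes here>'
WHERE Name = 'My Linked Report'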

Fun times.

Monday, June 30, 2014

Finding the biggest tables in a database

There will always be times as a Database Administrator or Developer when you need to know where the bulk of the data lies in any given database. There are plenty of tools available that will give you that information, but there's no need for them. Below is a very simple script I wrote that will give you what you need, ordered from largest to smallest by row count.

CREATE TABLE #tabledata
(
 tablename VARCHAR(255)
 ,numrows INT
 ,reserved VARCHAR(127)
 ,data VARCHAR(127)
 ,index_size VARCHAR(127)
 ,unused VARCHAR(127)
)

EXEC sp_MSforeachtable 'INSERT INTO #tabledata EXEC sp_spaceused ''?'''

SELECT * FROM #tabledata ORDER BY numrows DESC

DROP TABLE #tabledata

This script can come in handy if you want to appease data-hungry managers' appetites with low-level database growth figures: store the output in a data warehouse somewhere on a daily basis, write some reports showing where the major growth is, and then go back to the developers and say "Hey, 75% of the database is audit records - stop doing so much crap", or some similarly tactful assertion.
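As a minimal sketch of that daily capture idea (the dbo.TableGrowthHistory table is an assumption, not part of the original script):

-- Hypothetical history table for daily snapshots
CREATE TABLE dbo.TableGrowthHistory
(
 capturedate DATE NOT NULL
 ,tablename VARCHAR(255)
 ,numrows INT
 ,reserved VARCHAR(127)
 ,data VARCHAR(127)
 ,index_size VARCHAR(127)
 ,unused VARCHAR(127)
)

-- After populating #tabledata with the script above (and before dropping it),
-- store today's figures
INSERT INTO dbo.TableGrowthHistory
SELECT CAST(GETDATE() AS DATE), tablename, numrows, reserved, data, index_size, unused
FROM #tabledata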

Monday, May 12, 2014

A Rant: Bad Database Design

Sometimes, there's just no excuse for bad database design. Design that shuns any best practices, past and present (or even future), and dives headlong into a design that requires far more CPU time than is ever necessary, causes data file sizes to grow at an exponential rate, and generally leaves users blaming a DBA or an IT department in general when things are slow.

While I won't mention the names of the company or the product, I need to at least divulge some of the particulars about the table design to illustrate some of its terrible aspects - perhaps I'll be as specifically unspecific as possible.

The offending table, which gives rise to all other offending tables, relates to Employee data. In this particular table there is an Employee ID column containing the employee's all-numeric ID. Debates will rage over whether a fixed-length identifier containing only numeric characters should be stored in an INT or a VARCHAR column, but this design uses CHAR. I hear cries of "Page splits! Query performance! Data file usage!" over VARCHAR column types. Of course, the space for data in a CHAR column is allocated in full for each row, so there's no need to shuffle data around to accommodate a changed value, as there may be for VARCHAR types. However, with the speed of commercial-grade hard drives and CPUs these days, with appropriate data file growth and fill factor settings, and with CHAR data types used only when the average data length is consistent, I feel this becomes a moot point.

Back to the design. So, we've got a CHAR type for the Employee ID. However, this ID isn't used as the primary key. There's a column in the table that is referred to as a surrogate key column, which stores VARBINARY data and has been set as the primary key. However, this VARBINARY column only stores the employee ID which is already stored as CHAR data. Digging further, tables that contain employee data store the same Employee ID, plus additional data separated by a space, in a VARBINARY column. Now SQL Server or the application needs to convert from one data type to another for each row in the main table, and on child tables it also needs to use LEFT() to retrieve the Employee ID from the surrogate column.
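To illustrate the kind of query this forces (all names here are hypothetical, not the vendor's actual schema), joining the main table to a child table ends up looking something like this:

-- Hypothetical schema: SurrogateKey is VARBINARY holding character data
SELECT e.EmployeeID, c.SomeValue
FROM dbo.Employee AS e
JOIN dbo.EmployeeChild AS c
 ON e.EmployeeID = LEFT(CAST(c.SurrogateKey AS VARCHAR(50)),  -- VARBINARY back to characters
                        LEN(e.EmployeeID))                    -- child key stores 'EmployeeID<space>extra'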

I've thought about this for a while, and I can't really see why this has been done (or at the very least, continued through three major upgrades and perhaps 20 minor revisions). It requires more than double the storage for the same data (going against typical normalisation forms), and requires greater CPU time to get the data in and out of particular formats.

Then there are the aforementioned upgrades to the database and application themselves. Each major upgrade has resulted in a large increase in database size - the most recent one being 9GB. I know, I know - that's not a lot considering the size of hard drives these days. However, we don't spend money on appropriately sized SANs, we only have one SAN for the entire company (a few thousand people), and I wasn't asked how big I thought it should be, so I've been given just 1.3TB in total for data, logs/TempDB and backups across three SQL instances. This compounds the problem of arbitrarily increasing the size of the database during an upgrade (which generally involves increasing the size of CHAR and VARBINARY columns by a few characters each time).

In these instances, VARCHAR columns are a much better fit than CHAR, because simply increasing the maximum size of a VARCHAR column won't grow the data files: there's no padding to add to the existing rows.
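For example (hypothetical table and column names), widening a VARCHAR column is a metadata-only change, whereas widening a CHAR column typically has to rewrite every existing row to pad it out to the new fixed length:

-- Metadata-only: existing rows are untouched
ALTER TABLE dbo.Employee ALTER COLUMN Notes VARCHAR(600) NOT NULL

-- Size-of-data change: every existing row is typically padded to the new length
ALTER TABLE dbo.Employee ALTER COLUMN EmployeeID CHAR(12) NOT NULL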

My biggest gripe with this particular database design is how foreign keys, lookup values and other frequently accessed pieces of data are stored. In some tables (not all, so there's a lack of consistency in the design), often-accessed data is stored in a long VARBINARY column which, in a previous iteration of the application, was CHAR but was changed for some unknown reason. Within this VARBINARY column, character data is stored in either tab- or space-separated format, so now you need to CAST() or CONVERT() the data and then use SUBSTRING() to get the particular value you want. I've no idea why, but it now takes a LOT longer to run reports over this database than it did previously.
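A rough sketch of what every report query now has to do (the table, column, separator and widths are all assumptions):

-- Pull the second space-separated value out of a VARBINARY blob of character data
SELECT SUBSTRING(
        CONVERT(VARCHAR(200), t.PackedData),                      -- VARBINARY back to characters
        CHARINDEX(' ', CONVERT(VARCHAR(200), t.PackedData)) + 1,  -- start after the first separator
        10)                                                       -- assumed width of the value
FROM dbo.SomeTable AS t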

These points might sound like basic things that don't really matter, but having watched this database grow 2000% over four years, and having watched changes made to it that have severely affected performance, I just wanted to keep bad design near the front of people's minds so they can perhaps remember to prevent it where possible.

Sunday, May 4, 2014

Audit Tables, Clustered Indexes, GUIDs

After a weekend full of application and database updates, upgrades and maintenance, I have refreshed some things, and learned others.

Firstly: Don't use a GUID as the ID for an audit table. If it REALLY is just an audit table, INT is completely acceptable. If you want to insert 58,494,241,735 rows per second, every second, for 5 years, never truncate the table, and can store that much data, then use BIGINT, which is still only half the size of UNIQUEIDENTIFIER. Incidentally, at that rate you're creating 35.91 petabytes of data every day. You could probably do this if you were somehow able to harness all the storage in the world.
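For the curious, the arithmetic behind those figures (assuming 365-day years and 8 bytes per BIGINT key) works out roughly like this:

-- Max BIGINT value divided by the number of seconds in 5 years
SELECT CAST(9223372036854775807 AS FLOAT) / (5 * 365 * 24 * 3600)   -- ~58,494,241,735 rows per second

-- At 8 bytes per key, that's roughly this many petabytes (2^50 bytes) per day
SELECT (58494241735.0 * 8 * 86400) / POWER(CAST(2 AS FLOAT), 50)    -- ~35.91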

Second: Don't leave the fill factor for the unique clustered index at 5%, because it wastes a TON of space. For a table that had 148,000,000 rows, sp_spaceused reported 35GB of data and 15GB of indexes. However, this is VERY misleading: with the fill factor set to 5% on the clustered index, most of that space was mostly-empty pages, and with index fragmentation at 100% there was a massive performance hit.

Third: Reorganise your index every once in a while, especially if there are lots of inserts/updates/deletes. This is a given, but when you get sidetracked with non-DBA work for six months and the developers say they've got it under control, spend a little bit of time outside normal hours to make sure the index has an adequate fill factor and isn't too fragmented.
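A quick check-and-fix along those lines might look like the following (the table, index name and 80% fill factor are placeholders, not recommendations from this post):

-- How fragmented are the indexes on a given table?
SELECT i.name, ps.avg_fragmentation_in_percent, ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.AuditTable'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i ON i.object_id = ps.object_id AND i.index_id = ps.index_id

-- Reorganise if it's moderately fragmented...
ALTER INDEX IX_AuditTable_ID ON dbo.AuditTable REORGANIZE

-- ...or rebuild with a sensible fill factor if it's badly fragmented
ALTER INDEX IX_AuditTable_ID ON dbo.AuditTable REBUILD WITH (FILLFACTOR = 80)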

Fourth: If you've got one or more tables that are heavily used, but whose data doesn't form part of essential business data (again: audit data, or transactional data to be synced with an ESB like Neuron), create another filegroup and put the table(s) there. It will make a MASSIVE difference to performance. There's less unnecessary data to wade through if you're running reports against the live database (which should be warehoused where possible, of course), and the applications that use the database will run quicker. Everyone will be happy.

For those of you who are curious about moving tables between filegroups, and find Books Online too tedious to go through, this article gives a concise overview of the process.
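The gist of it (with placeholder database, table and index names) is to add the filegroup and a file for it, then rebuild the table's clustered index onto that filegroup:

-- Add a filegroup and a data file to hold it
ALTER DATABASE MyDatabase ADD FILEGROUP AuditFG
ALTER DATABASE MyDatabase ADD FILE
 (NAME = 'AuditData', FILENAME = 'F:\Data\AuditData.ndf', SIZE = 10GB)
 TO FILEGROUP AuditFG

-- Rebuilding the clustered index on the new filegroup moves the table's data with it
-- (if the clustered index backs a PRIMARY KEY constraint there are extra steps)
CREATE UNIQUE CLUSTERED INDEX IX_AuditTable_ID
 ON dbo.AuditTable (AuditID)
 WITH (DROP_EXISTING = ON)
 ON AuditFG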

Wednesday, December 18, 2013

The best laid plans of mice and men...

It would be nice if people didn't do this...

Tuesday, October 29, 2013


I've found two minor quirks within SQL Server today that, if I had hair, would have caused me to tear it (the hair) out.

1. SQL returns "*" when casting/converting an integer to data type varchar and the data length is too short. These two lines of code will show what I mean:

SELECT CAST(123 AS VARCHAR(3)) -- Returns 123
SELECT CAST(123 AS VARCHAR(2)) -- Returns *
This caused me issues because I (unfortunately) have to create a massive dynamic SQL statement to do some ridiculous data transformation from a flat file data source to an XLS file for importing into another database. Yes, I understand the best way to do it would be to insert directly into the necessary tables; however, the vendor doesn't want us doing that.

2. You need to have the same number of columns numbered in your Excel spreadsheet destination as you are selecting from the database.

As I mentioned above, I need to output data to an XLS file, and each column that I am placing data in has to be numbered. So if I select 61 columns of data from SQL, there have to be columns numbered 1 to 61 on the first row of the spreadsheet. 60 or 62 causes errors.

Tuesday, August 13, 2013

Using BCP to export files stored as image from SQL Server

At the moment, I'm working on retrieving files stored in a Sharepoint database and placing them in a file share, because we're no longer using that particular Sharepoint site, and keeping the files in the database takes up more space than just having them stored on disk.

One option was to use WebDAV, but that can be quite cumbersome, and it also means that the files are created temporarily (so that they exist to be copied), and then they're copied to the location they need to go to. We can avoid the middle man and send them straight from SQL Server to the file share location using BCP.

For those who don't know what BCP is, have a look at this TechNet article.

Roughly outlined, we'll be doing the following:

1. Generate a format file (used by bcp to determine how to render the data in the file)
2. Enable xp_cmdshell access
3. Write your query to retrieve the file data from the database
4. Combine the two and use bcp to export that data to a file share

1. Generate a format file

You can do this on any machine that has SQL Server installed. Open a command prompt and run this command (substituting your parameters as necessary)

C:\Documents and Settings\User>bcp DatabaseName.dbo.TableName format nul -f C:\FileLocation\format.fmt -T -S servername\instancename

The first parameter is either the table name or a query (I just use the table name). Second is the direction (this can be in, out, queryout or format nul), third is the format file switch (-f) and the location for the file, fourth is the authentication switch (-T, or -U and -P), and finally the server name (and instance name if required) with -S.

When this runs, depending on how many columns you have, you'll see these three lines:

Enter the file storage type of field Id [uniqueidentifier]:
Enter prefix-length of field Id [1]:
Enter field terminator [none]:

When I create a format file, I press enter for the first two options (leaving them as their default), and then use a pipe, |, as the field terminator. Repeat for all columns.

Once this is done, the format file will be created in the location you specified. For ease of keeping everything together, I place the format file in the same location that I'm exporting the files from the database to.

2. Enable xp_cmdshell access

Note: xp_cmdshell is disabled by default in SQL Server for security reasons, so as soon as you're finished with it, disable it again. To enable it, run this:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'xp_cmdshell', 1
RECONFIGURE

And to disable it when you're finished, run this:

EXEC sp_configure 'xp_cmdshell', 0
RECONFIGURE
EXEC sp_configure 'show advanced options', 0
RECONFIGURE

3. & 4. Write your SQL query & Combine with bcp

This I'll have to leave to you, but to give you an idea of what I've done to export files from a database, here's the query I've written:


-- Variable types and the row source are assumptions: #NumberedDocs stands in for a
-- hypothetical staging table of SharePoint document metadata with a RowNumber column
DECLARE @ID VARCHAR(50), @DirName VARCHAR(260), @FileName VARCHAR(260), @Sql VARCHAR(4000)

DECLARE CurPhoto CURSOR FOR
 SELECT CAST(Id AS VARCHAR(50)), DirName, LeafName
 FROM #NumberedDocs
 WHERE RowNumber BETWEEN 1 AND 1000

OPEN CurPhoto

FETCH NEXT FROM CurPhoto INTO @ID, @DirName, @FileName

WHILE @@FETCH_STATUS = 0
BEGIN
 SET @Sql = 'bcp "SELECT TOP 1 CAST([Content] AS VARBINARY(MAX)) AS [Content] FROM Pictures.dbo.AllDocStreams WHERE Id = ''' + @ID + '''" queryout "F:\Pictures\' + RTRIM(@DirName) + '\' + LTRIM(@FileName) + '" -T -f F:\Pictures\format.fmt -S servername\instancename'
 EXEC master.dbo.xp_cmdshell @Sql

 FETCH NEXT FROM CurPhoto INTO @ID, @DirName, @FileName
END

CLOSE CurPhoto
DEALLOCATE CurPhoto

I've used a cursor for this (where I manually change the row number range) because I needed to export 41GB of image files, and SSMS started giving out-of-memory errors when I exported more than 7GB of data (while running on my machine). I could probably have done it in larger batches, but there were only 20 batches of 1000, and it only took 5 minutes per batch to create a little over 2GB of files each time.

When you're exporting from SQL to a data file, I found things worked properly when I converted from IMAGE to VARBINARY(MAX) data type. I haven't looked into WHY that is yet, but that's what worked for me.

The above query wasn't particularly resource intensive, and no other sites running on the server experienced performance issues, so if you absolutely need to, you could run it during business hours - but I would recommend doing it outside of business hours anyway.