This weekend I came across an unusual circumstance that I thought I would share with many of those part-time SQL Server admins. I currently maintain more than a couple of SQL servers. Because SQL Server has a good maintenance program I don’t spend the money on a third party software for backup. Instead I setup the maintenance plan to create a backup every 6 hours and then push the file to a network share. For some reason or another, the network share came detached and the backups filled up the local data volume. This effectively shutdown the server. I cleared up the space, restored the mapping, and didn’t think much more about the problem. I noticed that I was getting a backup file from each database but failed to pay attention to the transaction log.
This is where my new problem that consumed my weekend started. Friday night at 7pm I got another phone call about the SQL server being out of disk space again. Again I had no space on the volume, but the space wasn’t consumed by the backups. Instead, the transaction log which is normally a couple of gigs in size had ballooned to 100GB in size. I had attached an external USB drive to push a backup of the transaction log to and tried to shrink the transaction log from the SQL Server Manager. This only give me about 3 GB of storage back, but they were quickly consumed as soon as the end users started using in their application again. I then kicked off a backup of the database and then transaction log. I now had 99% of the space free in the transaction log file, but still could not shrink the database. I had fought and fought with the database trying to get that free space back.
Finally at about 2am, running out of ideas, I deleted the transaction log file and started up the database again which effectively locked the database for a lot of people. Having migrated the database before, and knowing that a simple restore of the database could easily fix the problem, I took the most recent backup which was actually taken after end users were cut off from the server and restored the database. After the restore, I again had the same problem of a database with a 100 GB transaction log file. This time however, I for some reason threw caution to the wind and performed yet another shrink to the transaction log file. Finally, I freed up 75% of the space on the volume which allowed everything to return to normal.
Why I had to backup and restore the database before I could perform an effective shrink of the database, I do not know. If this has happened to other people, I would like to know the reason behind this.
My corrective actions include scripting a compression command on the backups to reduce their size. I also plan on creating a trigger to notify me by email when the disk space is low, 20% is one of my favorite guide lines as far as that is concerned. I am considering running a network mapping script to reattach the volume of the server before the files are moved over so that the network volume that I monitor won’t be so easily missed with the other backup files that I file on the backup storage volume. I don’t like using compression because of how having to decompress a file to restore it effectively adds to the lengthy process of getting the database back to working order. Then again, having a few extra copies of the database around is also handy.
I am open to other input. I thought I would just share my wonderful late night experience with others in hope to get some improvements or perhaps help out other admins who might run into the same problem.