Thursday, July 16, 2015

Replication - To init or not to init that is the question


We use replication quite heavily at work. We are just getting into availability groups now, so previously replication was the only way to get readable data onto a secondary server. Last week I was primary on-call and there were two changes to be implemented on Saturday dealing with replication. One change was adding a table to replication, and the other was altering an existing table and an existing view to add a column to both. I was curious whether any or all of the changes would require a re-initialization, or at the very least require running the snapshot agent. Like anything I do, I wanted to be sure before I had to do the actual work, so I did some testing. I have two test servers with the AdventureWorks database set up to be replicated between them.

First Test: Adding an article to a publication

  1. I created a new table in the publisher database 
  2. I added the table as an article in the publication 
  3. I checked replication status on the Subscriber and it says "The initial snapshot for article 'JohnReplTest' is not yet available." 
  4. Ran the Replication Snapshot Agent for the publication on the publisher 
  5. Check the subscriber and the table was there and synced. 
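The steps above can be scripted as well. Here is a minimal sketch; the publication name 'AW_Pub' is made up, and sp_addarticle takes many more optional parameters than shown:

```sql
-- Assumed names: publication 'AW_Pub', new table dbo.JohnReplTest
USE AdventureWorks;
EXEC sp_addarticle
    @publication   = N'AW_Pub',
    @article       = N'JohnReplTest',
    @source_owner  = N'dbo',
    @source_object = N'JohnReplTest';

-- Generate the snapshot; with no re-initialization pending,
-- it only covers the newly added article
EXEC sp_startpublication_snapshot @publication = N'AW_Pub';
```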

Second Test: Adding a view to an existing publication

  1. Created a view in the publisher database 
  2. Added the view as an article in the publication 
  3. Checked the subscriber but the view was not there and the status said: The initial snapshot for article 'vw_JohnsReplView' is not yet available. (I did not realize you needed a snapshot for a view) 
  4. Ran the Replication snapshot agent job for the publication 
  5. Checked the Subscriber and the view is now there. 
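Scripting the view is the same sp_addarticle call with a schema-only article type (publication name again made up):

```sql
USE AdventureWorks;
EXEC sp_addarticle
    @publication   = N'AW_Pub',
    @article       = N'vw_JohnsReplView',
    @source_owner  = N'dbo',
    @source_object = N'vw_JohnsReplView',
    @type          = N'view schema only';  -- views replicate schema, not data

EXEC sp_startpublication_snapshot @publication = N'AW_Pub';
```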

Third Test: Adding a Column to a Table and to a View

  1. Altered the table to add a column 
  2. Altered the view to add the same column 
  3. Checked the subscriber and the information was there. 
I did not need to do anything additional to get the data across.
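This works because transactional replication forwards supported DDL to subscribers when the publication's replicate_ddl option is on (the default). A sketch, with the column name made up:

```sql
-- Run at the publisher; the ALTER flows to the subscriber automatically
ALTER TABLE dbo.JohnReplTest
ADD ModifiedBy varchar(50) NULL;
```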

Fourth Test: Adding a column with a default value

  1. Altered the table to add a column with a default value 
  2. Altered the view to add the same column 
  3. Checked the subscriber and the column was there with the data. 
I didn't need to do anything additional to get the data across.
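The default-value variant is the same idea; the column and constraint names below are made up:

```sql
ALTER TABLE dbo.JohnReplTest
ADD IsActive bit NOT NULL
    CONSTRAINT DF_JohnReplTest_IsActive DEFAULT (1);
-- Existing rows on both sides get the default when the DDL is applied
```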

Sum Up

I find replication a difficult technology to work with; there are so many facets that change based on what you are doing. It's easy to go the route of a full re-initialization when something goes wrong, but if you have an overseas subscriber you have to consider the time it will take to push all the data out.
Out of my tests I was most surprised that adding a view required running the snapshot agent. I did not think that creating an object that had no data would require a snapshot. The saving grace is that if you have not re-initialized any of the subscribers, the snapshot agent will only publish the new articles, not the whole publication. Try these tests for yourself. The best way to learn about something is not just to read but to actually experiment.

Hasta Luego,

Wednesday, July 1, 2015

TempDB - Round Robin and Proportional Fill

Proportional Fill

My post from last week sparked quite a bit of discussion at work about how the data files for TempDB fill up when they are not of equal size. Pretty much all of the documentation and recommendations out there say to keep the data files the same size so that the round-robin data flow works properly. What we were having trouble deciding is what happens when they are not the same size, for example if you have one file that is 5 GB and three files that are 500 MB.

The discussions ended up in two different camps.

  1. TempDB would allocate pages in the first larger file until the free space was equal to the three smaller files then the round robin would continue.
  2. The allocations would still be handled in a round robin fashion with the larger file just getting more so that they would fill up evenly (from a percentage standpoint). 
Like any good discussion, we decided to take this to our test boxes and see if we could figure it out. The answer I found comes down to a little guy named proportional fill.

The Test

I took my test server and created four TempDB data files. Look at the image below to see how they looked when I started out. Notice they all have a very similar free space percentage.
(The script I used to view this info came from Dave Turpin)
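A query in the same spirit as that script looks like this (FILEPROPERTY reports space used in 8 KB pages, so divide by 128 for MB):

```sql
USE tempdb;
SELECT  name,
        size / 128                                               AS size_mb,
        FILEPROPERTY(name, 'SpaceUsed') / 128                    AS used_mb,
        (size - FILEPROPERTY(name, 'SpaceUsed')) * 100.0 / size  AS free_pct
FROM sys.database_files
WHERE type_desc = 'ROWS';
```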

Then I wrote a simple script that would create a temp table and add data to it:

DECLARE @x int
SET @x = 1

CREATE TABLE #MyTempTable
(  id BIGINT, SomeValue char(5000))

WHILE @x < 362846
BEGIN
    INSERT INTO #MyTempTable ( id, SomeValue )
    VALUES  ( 99999999999, 'ddddddddddddddddd')
    SET @x = @x + 1
END

I picked the number 362846 because I knew that would get the files near 50% full, and I wanted to add something to the test halfway through. After this ran, I checked TempDB again using my script and got the below results.

As you can see, it looks like the second camp was correct. Proportional fill was used to allocate pages to each data file according to its free space percentage and not its size. The result is that there were a lot more allocations to the larger file, but at the end they all ended up around the same free space percentage.

Add to the Test

Now I wanted to see what would happen if you added a new TempDB data file, so I added another 500 MB file to TempDB. My thought was: now that I had four files at 55% free and one file at almost 100% free, would SQL Server fill up the new file until it was at 55%, or continue to allocate pages in all the files?
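Adding the file itself is a one-liner; the logical name and path here are made up:

```sql
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev5, FILENAME = N'T:\TempDB\tempdb5.ndf', SIZE = 500MB);
```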
After the file was added I ran my test again and checked the file usage twice
Once about halfway

And when the script finished

As you can see, the allocations were made to all of the files, not just the new one. It is just that more allocations were made to the new 500 MB file than to the old 500 MB files during this second test (361 MB as opposed to about 192 MB). Proportional fill strikes again! If you look at how the numbers are trending, you can see that if I hadn't set a limit on my while loop, all of the files would have been almost equally full when the first one completely filled up.

Sum Up

This was a neat little experiment for me. I enjoy being able to test things out rather than just taking as fact what I read somewhere. In fact, quite a bit of the documentation I found was misleading on this subject. If you would like a good (and more thorough) look at our friend Fill, check out this post by Jonathan Kehayias:

He uses extended events to show this so you can actually see the number of file writes per data file. His test is not specifically on TempDB but the allocation should work the same way.
Also check out this article by Paul Randal for more info about proportional fill:

I'll be back in two weeks. Hopefully I can squeeze one more blog post out of all my TempDB testing.

Hasta Luego,

Tuesday, June 16, 2015

Resetting TempDB data files

I'm Back!

Have you ever missed a personal deadline for something (like making a doctor's or dentist appointment) and the further you get past it, the harder it is to do it? That happened to me, and I found that I can go a LONG time before I finally break down and do it; 2 years and 8 months, apparently. Well, in that time I have learned quite a bit about SQL and hope to be able to post something informative about every two weeks that isn't just a rehash of what you can find on other people's blogs. Anyway, on with the post...

TempDB Configuration

Lately I have been doing some research and testing into the configuration of TempDB. I won't rehash all the details of how or why you should change the default settings, since there are people who have already covered it and I would just be paraphrasing them:

What I am going to do is tell you about an error I kept getting when I was testing out some different configurations and how I resolved it.

Almost all of the information that you find on the internet about TempDB configuration tells you that you need to have more than one data file for TempDB so you don't suffer performance issues related to page allocation contention. There are some varying opinions on how many you need to have. Some say 1 file per core up to 8, some say 1 file for every 2 cores. There are also opinions on how to properly size the data files and the log file. You can read about them in the links above or with a Google search. One important thing to note is that most everyone says you need to right-size your data files and turn off auto-growth. This allows TempDB to allocate space in a round-robin fashion. If one file is much bigger and has more free space, it will get more of the writes until the free space is reduced to the size of the other files.

My current task at work is to come up with a script that can automatically create the TempDB files and size them. In our current build doc we just create 8 TempDB files (even if we only have 4 cores) and size them at 1 GB with auto-grow at 1 GB. My plan was to mess around with some different configurations and try out different numbers of data files. In the first test, I just took our existing 8-file standard and sized the files to 80% of the disk they are on.

--- Declare the variables used below
DECLARE @number_of_files int, @drive_size_mb bigint, @individ_file_size int
--- Set the number of files
SET @number_of_files = 8
--- Get the drive size for TempDB (database_id 2 = tempdb, file_id 1)
SELECT @drive_size_mb = total_bytes/1024/1024 FROM sys.dm_os_volume_stats (2,1)
--- Calculate 80% of the total drive size and divide that by the number of files to get individual file size
SELECT @individ_file_size = (@drive_size_mb*.8)/@number_of_files
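From there, a loop can apply the computed size to each file. A sketch that assumes the logical names follow the default tempdev, tempdev2, tempdev3 pattern:

```sql
DECLARE @i int = 1, @sql nvarchar(max)
WHILE @i <= @number_of_files
BEGIN
    SET @sql = N'ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev'
             + CASE WHEN @i = 1 THEN N'' ELSE CAST(@i AS nvarchar(2)) END
             + N', SIZE = ' + CAST(@individ_file_size AS nvarchar(20)) + N'MB)'
    EXEC (@sql)
    SET @i = @i + 1
END
```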

Once I was finished with that script I wanted to reset TempDB so that it only had one file so I could try something different. I used DBCC SHRINKFILE with the EMPTYFILE option to empty out each TempDB file and then REMOVE FILE to remove them:

USE tempdb;
DBCC SHRINKFILE (tempdev8, EMPTYFILE);
ALTER DATABASE tempdb
REMOVE FILE tempdev8;

The problem is that when I tried to do that I got an error message:

DBCC SHRINKFILE: Page 9:32 could not be moved because it is a work table page.
Msg 2555, Level 16, State 1, Line 4
Cannot move all contents of file "tempdev8" to other places to complete the emptyfile operation.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Msg 5042, Level 16, State 1, Line 6
The file 'tempdev8' cannot be removed because it is not empty.

It looks like something is using my TempDB files, so I can't get rid of them. (This is a test server, but it does have some other non-prod stuff running on it.) I tried restarting the SQL service and running the script as soon as it started, but I still got the error on most of the files. Something started using the TempDB data files right when SQL started, and I couldn't get them deleted. I started thinking about how I could force everything to use just one of the TempDB files so I could delete the other ones. Can you figure out what I did based on something I have already said in this article?

The solution that worked was increasing the size of the main TempDB file so that it was much larger than the other files. Then when I restarted the SQL service, all the TempDB writes went to that file since it had more free space than the other ones. Right after the restart I ran my delete script and was left with one TempDB data file, so I could start working with another configuration.
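In script form the workaround looked something like this (the 8 GB size is just an example, picked to dwarf the 1 GB files):

```sql
-- Before restarting: make the primary file far larger than the rest
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 8192MB);
-- After the restart, immediately empty and drop each extra file
USE tempdb;
DBCC SHRINKFILE (tempdev8, EMPTYFILE);
ALTER DATABASE tempdb REMOVE FILE tempdev8;
```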

Sum Up

I know this isn't a groundbreaking post (gimme a break, it's my first one back), but I love it when you can take something you have been learning or just read about and then apply it in a real-life situation like this. I'll be back in two weeks; no idea of a topic yet, but I'll think of something.

Hasta Luego,