Thursday, April 28, 2011

Finding The Top One Value Per Group

I ran into an issue the other day where I had a table like this:

IDnamelocationcommentcomment_time
1Johnhomehello1/1/2011 01:16:23
2Johnworkcomment21/12/2011 13:23:10
3Johnhomeblah1/13/2011 02:02:02
4Billworksmurf1/15/2011 23:59:00
5Billhomejello1/17/2011 13:30:00
6Billhomesnafu1/18/2011 12:00:19
7Frankworkbatman1/19/2011 14:15:16


Somebody came up to me and said "Hey John, I would like to see the last comment made by each person in the database and where they were when they made the comment". The logical answer is to use a Group By query. The problem is that you can't include a non-aggregate field in a Group By query which means I can't get at both the location and the comment in single query. I need to get a set of data that I can query against to get the result set i need. So each row in the return set has to be unique. I would love to use this subquery:
SELECT TOP(1) id, name
FROM table1
GROUP BY name
ORDER BY comment_time desc

But you get the following error:
Msg 8120, Level 16, State 1, Line 63
Column 'dbo.table1.id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

So I have to use an aggregate function (SUM, AVG, MIN, MAX or COUNT). At first I thought about using the MAX function on the ID to give me the highest ID but that assumes that the comment_time field is generated on INSERT and isn't updateable (for example, it wouldn't work if the comment_times for ids 2 and 3 were switched). Since I don't know and can't control that I decided to use MAX on the comment_time field, like this:

SELECT MAX(comment_time) as comment_time, name
FROM table1
GROUP BY name

Which gives me:


comment_timename
1/13/2011 02:02:02John
1/19/2011 14:15:16Frank
1/18/2011 12:00:19Bill

You have to alias the MAX(comment_time) or else it will return the column with the header "(No Column Name)" which wold make it difficult to use it in the next step.

Now I can use that set to get the result set i want like this:

SELECT name, location, comment, date
FROM table1
JOIN
(SELECT MAX(comment_time) as comment_time, name
FROM table1
GROUP BY name) as sub
ON table1.name = sub.name
AND table1.comment_time = sub.comment_time
ORDER BY
table1.comment_time

Which gives us:

namelocationcommentcomment_time
JohnHomeblah1/13/2011 02:02:02
Billhomesnafu1/18/2011 12:00:19
Frankworkbatman1/19/2011 14:15:16

The only problem might be if you have two comments by the same person in the same day at exactly the same time. If the completion time is generated by the server on insert, you shouldn't have this problem but if it is generated by the application you might (especially if the field is editable). My answer is that if two comments were made at the same time, they are both the last comment and should both be returned so my solution works fine (for my needs). The other problem is that my solution doesn't work if you need to return more than one row per group.

Let me know in the comments if you have a different solution.

Hasta Luego,
RutherfordTX

Wednesday, April 20, 2011

Professional Development Planning

I went to a Cactuss meeting last night (the local Austin SQL users group) and heard a talk by A.J Mendo (twitter|website) about Professional Development planning and it put a name to what I am trying to do in my career right now so I thought I would share what I took away from it.

1. Own Your Career
Nobody is going to walk up to you and give you advancement opportunities (unless your last name is the same as the CEO's). You need to make yourself valuable and decide exactly what you want your career to look like. Do you want to be a DBA? A Manager? A VP? Think about what you want to gain out of your career and work towards it. That brings us to the next point

2. Set Goals
Take the time to sit down and map out some goals for yourself. Think about what you can do today. Think about what you can do in five years. Think about what you can do in 10 years. Goals can be as simple as joining Linkedin or Twitter or more complex like getting certified. Don't beat yourself up if you don't meet your goals. If you wanted to get certified by April but weren't ready, Don't Sweat It, reevaluate your goals and give yourself a new deadline. The important part is to keep evaluating where you are and where you want to be.
WRITE IT ALL DOWN. Get a journal and make a plan in it. Then go back and re-read it and revise it as needed.

3. Join the community
The best way to learn is to find people who do what you do only better and talk to them. There is a User Group in almost every major city for almost anything. You have your SUGs and your LUGs and your MUGs and your Pugs. Join one, go to the meetings and get involved. Raise your hand and ask questions. Answer questions. Introduce yourself. Stay late and meet people. Exchange Business Cards. Network. You never know what can come out of a chance encounter. Just by going to the CACTUSS meeting last night I came away with two job opportunities for a friend of mine who is currently unemployed.
Join online groups. There are tons of forums on the internet. Find one and join it and spend some time there. Many of us in the IT industry, especially those of us at small to mid-size businesses, don't have people to talk to about what we do and what we are doing wrong and how we can do it better. Forums are a great place to find that needed camaraderie.

4. Learn and Make Time to Learn
This one should be a no-brainer for anyone in the tech industry. We need to learn constantly just to keep up with our jobs. I buy books and subscribe to magazines and read blogs and try to listen when I can. It goes beyond just learning how to do what you do. Learn about different or better ways to do it. Learn about things you didn't even know you could do. I wish I had researched SSRS a year ago, I have been doing things with third party apps like Cognos and Crystal I could have been doing straight from SQL.
Make time to learn. Find a time of day you can stop working and read something. I use the netvibes website to group rss feeds of the blogs I read into one page so when i get to work in the morning I can get all the new posts from one place. Here is the public page to my SQL tab in netvibes: http://www.netvibes.com/rutherford#SQL .
Don't be afraid to learn at work. I have never met a boss who would get angry at an employee trying to do his job better (as long as productivity doesn't suffer).

5. Build a brand.
I read an excellent article by Andy Warren (website|twitter) which can be found here about building your brand. That article actually inspired me to join twitter, start this blog and join my local user group. I won't try to repeat what he says, I'll just ask you to read it and hopefully it will change your career life like it did mine.

If you are a clock watcher who thinks of his/her job as a way to pay the bills, this post probably isn't for you. But if you love your career life (not necessarily your job) as much as you love your home life, I hope you give this some thought.

Hasta Luego,
RutherfordTX

Friday, April 8, 2011

Updating a Table Using a Select

I recently had a situation at work where someone had exported data from our SQL Database into excel so they could update a column. Then they asked me to figure out how to update the data "my way", as they called it, so they didn't have to manually update all the changes. Coming from more of a programming background my first thought is to do things procedurally. I have done quite a bit of Perl and PHP coding so I would usually reach for a nice For Each loop and do it that way. Since T-SQL is set based I am having to retrain my brain to think in sets rather than procedures.

So I had a table like this:

pk_numnew_val
1NULL
2NULL
3NULL

And I was given back a table like this:

pk_numnew_val
1Frank
2Joe
3Sam

After scratching my head and looking through my big stack of books I decided the best way to do this would be a subquery. Lets call the original table table1, I imported the new table into SQL and called it table2. Here is the code I ended up using:

UPDATE table1
SET
new_val = table2.new_val
FROM
(SELECT pk_num, new_val
FROM table2) AS table2
WHERE
table1.pk_num = table2.pk_num

---------
EDIT: 8/16/2011
Since I am not using a where clause in the subquery, it is not actually needed. The query could be rewritten like it is below. I'm leaving it as an example and because it is the basis for the next part
UPDATE table1

SET
new_val = table2.new_val
FROM table2
WHERE
table1.pk_num = table2.pk_num
---------

It was a bit more complicated than that in that I didn't want to overwrite existing values in the new_val field in the original table if there was no value in the field in the update table so the end result actually looked like this:

UPDATE table1
SET
new_val = table2.new_val
FROM
(SELECT pk_num, new_val
FROM table2
WHERE new_val IS NOT null) AS table2
WHERE
table1.pk_num = table2.pk_num

Hope this helps someone. I know this is pretty basic stuff, but when you are just starting with T-SQL or, like me, are use to working with procedural or object oriented programming languages its's hard to train your brain to think in concepts like subqueries.

Hasta Luego,
RutherfordTX

Wednesday, April 6, 2011

First Post!

Howdy. I have been in IT for over 15 years but have just recently developed an interest with MS SQL server. I am in the process of studying to get certified and have been reading SQL books like crazy to help me learn. I want to use this blog to help me in my learning process. I remember from school that I always learned a subject better if I had to write a paper at the end of it. I also will post some of the real world problems I encounter as I try to manage, write code and create reports for the two production SQL Databases that we run at my office.
I thought for this first post (besides introducing myself) I would list some of the resources that I have used in the last year as I have begun to familiarize myself with MS SQL server and T-SQL.

Books:
Twitter - I joined twitter and started to follow some SQL pros that I had heard of:
Training:
This list is tiny compared to the resources available out there for the budding SQL star. Hit Google and see what you can find. Let me know of you favorite sites.

Hasta Luego,
RutherfordTX