I had some SQL trouble selecting distinct values from a single column while also selecting the remaining columns from the table.
After trying without any real breakthrough I turn to stackoverflow and luckily Peter Lang was the man with an fast answer.
The question was (http://stackoverflow.com/questions/2052158/ordered-sql-select-columnwise-distinct-but-return-of-all-columns)
I have a little SQL Distinct puzzle that i cannot solve (or at least not in an very elegant way).
I have two tables (try to ignore the simplicity of the example). I’m using MSSQL 2008 if that makes much of a difference.
Table: Category
| categoryId (uniqueidentifier) PK |
| Name varchar(50) |
Table: Download
| downloadId (uniqueidentifier) PK |
| categoryId (uniqueidentifier) FK |
| url (varchar(max)) |
| createdate (datetime) |
I have a few categories in the Category table and potentially a lot of download URLs in the Download table. I’m interested in selecting the newest using the createdate (or a top 5 if that is possible) download url for each category from the Download table.
Currently I’m doing the following, but that is not very nice, and can hardly be the correct way to do it.
SELECT
categoryId,
max(convert(BINARY(16),downloadId)) as downloadId,
max(createdate) as createdate
INTO tmp
FROM Download
GROUP BY categoryId
ORDER BY createdate
SELECT url
FROM Download
WHERE downloadId IN
(SELECT CONVERT(uniqueidentifier, downloadId) FROM tmp)
DROP Table tmp
Any suggestions would be much appreciated.
The ingenious solution is to use ROW_NUMBER and Partition i had never figured that out myself.
So my stupid tmp table attempt should be reformed into:
SELECT categoryId, downloadId, createdate, url
FROM (
SELECT
categoryId, downloadId, createdate, url,
ROW_NUMBER() OVER(PARTITION BY categoryId ORDER BY createdate DESC) rownum
FROM Download
) d
WHERE d.rownum <= 1
More on the topic can be found here: http://weblogs.sqlteam.com/jeffs/archive/2007/03/28/60146.aspx