SQL Group with Most Recent Record Each
Say you have a table with student test data.
SELECT t.*
FROM tests t
ORDER BY t.student_id,
t.taken_on
| student_id | score | taken_on |
|---|---|---|
| 1 | 1750 | 2021-08-30 |
| 1 | 1717 | 2021-11-16 |
| 1 | 1786 | 2022-02-08 |
| 2 | 1550 | 2021-08-30 |
| 2 | 1493 | 2021-11-16 |
| 2 | 1627 | 2022-02-08 |
| 3 | 1441 | 2021-08-30 |
| 3 | 1528 | 2022-02-07 |
Imagine that you want to see the most recent score each student received.
To put it another way, you want to query this table and see only 1 row per
student_id. The row should be the one with the most recent taken_on.
The desired result looks like this:
| student_id | score | taken_on |
|---|---|---|
| 1 | 1786 | 2022-02-08 |
| 2 | 1627 | 2022-02-08 |
| 3 | 1528 | 2022-02-07 |
Let’s start simple. You need to get a query that shows the most recent
taken_on grouped by student_id first.
SELECT student_id,
MAX(taken_on) latest_taken_on
FROM tests
GROUP BY student_id
Result:
| student_id | latest_taken_on |
|---|---|
| 1 | 2022-02-08 |
| 2 | 2022-02-08 |
| 3 | 2022-02-07 |
You could also add things like a WHERE clause to that query if you need only
the most recent test where a certain condition was met.
Next, you should revisit the original query.
SELECT t.*
FROM tests t
ORDER BY t.student_id,
t.taken_on
But this time, JOIN the “latest date” query you just wrote. You’ll want to
join on the student_id and taken_on date columns.
SELECT t.*
FROM tests t
JOIN (
SELECT student_id,
MAX(taken_on) latest_taken_on
FROM tests
GROUP BY student_id
) l
ON t.student_id = l.student_id
AND t.taken_on = l.latest_taken_on
ORDER BY t.student_id
Those JOIN parameters ensure the query only returns rows where the
student_id and taken_on columns match the latest date for each student.
Note that t.taken_on in the ORDER BY clause is no longer necessary because
you’re already ordering on t.student_id and there will only be one each.
| student_id | score | taken_on |
|---|---|---|
| 1 | 1786 | 2022-02-08 |
| 2 | 1627 | 2022-02-08 |
| 3 | 1528 | 2022-02-07 |
That’s it. The results are exactly what we were aiming for; the row with the
latest taken_on for each student_id is returned.
Travis Horn