Week 18: CST 363 - Introduction to Database Systems
WK02: Learning more about SQL
This week we kept building on SQL fundamentals and stepped into more complex territory. We explored joins - inner, outer, left, right, and self - seeing how they stitch related tables together and how predicate placement in ON versus WHERE can change results. We also practiced set operations like UNION/UNION ALL to combine or compare result sets. On top of that, we wrote nested subqueries with IN and EXIST and discussed how NULL and three-valued logic affect predicates. The big theme was translating real questions into set-based thinking: composing queries step by step, choosing the right join, and validating assumptions.
A common case for joining on something other than keys is a range join, where rows match based on whether a value falls within an interval rather than by equality.
For example: For every employee, find the salary grade whose range contains the employee's salary. Here there's no foreign-key relationship between Employees and SalaryGrades; instead we match each employee to the grade whose [min_salary, max_salary] interval includes the employee's salary. This pattern is typical when mapping numeric bands (e.g., scores to letter grades) or time windows (e.g., orders to promotions active on the order date). In SQL, you express this with a non-equality predicate in the ON clause:
SELECT E.emp_id, E.emp_name, E.salary, G.grade_name
FROM Employees E
JOIN SalaryGrades G ON E.salary BETWEEN G.min_salary AND G.max_salary;
This tells the database to pair each employee with exactly the grade whose range encloses the salary, producing a result without relying on primary/foreign keys.
My opinion about SQL
I'm familiar with SQL and have worked about a year with Microsoft SQL Server (T-SQL). At first glance SQL feels simple - SELECT … FROM … WHERE … - and you can get productive quickly for straightforward filters and basic joins. But the moment a query combines many tables, multiple functions, JOINs of different types, EXISTS/NOT EXISTS, subqueries, window functions, and temp tables in one statement, it stops feeling simple. It still feels a bit miraculous to me how the optimizer rearranges everything and makes it work.
As a language, I appreciate SQL's declarative style: you state what you want, not how to compute it. That's powerful and concise. The flip side is that translating an English question to SQL is hardest when the request hides tricky set logic. My most challenging categories are:
- GREATEST-N-PER-GROUP (for each customer, get the most recent order)
- Semantics with NULLs and three-valued logic - predicates don't always behave as first intuition suggests.
- Aggregations with conditions (conditional sums/counts)
Overall, SQL is elegant and immensely capable, but real-world queries quickly demand a solid grasp of set theory, join semantics, and the optimizer's behavior.
Comments
Post a Comment