nested queries lecture
TRANSCRIPT
Topics
● Nested queries
● CREATE
● UPDATE
● DELETE
● The Subquery as Scalar Operand
● Comparisons Using Subqueries
● Subqueries with ANY, IN, or SOME
● Subqueries with ALL
● Row Subqueries
● Subqueries with EXISTS or NOT EXISTS
● Correlated Subqueries
● Subqueries in the FROM Clause
● Subquery Errors
● Optimizing Subqueries
● Rewriting Subqueries as Joins
What are Nested Queries?
● A Subquery or Inner query or Nested query is a query within another SQL query and embedded within the WHERE clause.
● Subqueries (also known as inner queries or nested queries) are a tool for performing operations in multiple steps.
– Subqueries can be used in several places within a query, but it’s easiest to start with the FROM statement.
● Subqueries can return individual values or a list of records
Advantages
● The main advantages of subqueries are:
– They allow queries that are structured so that it is possible to isolate each part of a statement.
– They provide alternative ways to perform operations that would otherwise require complex joins and unions.
– More readable than complex joins or unions.● Indeed, it was the innovation of subqueries that gave
people the original idea of calling the early SQL “Structured Query Language.”
Return
● Scalar (a single value)
● Single row
● Single column
● Table (one or more rows of one or more columns).
Other info
● Subqueries must be enclosed with parenthesis
● A subquery can contain many of the clauses that an ordinary SELECT can: DISTINCT, GROUP BY, ORDER BY, LIMIT, joins, index hints, UNION constructs, comments, functions, and so on.
● A subquery's outer statement can be any one of: SELECT, INSERT, UPDATE, DELETE, SET, or DO.
● In MySQL, you cannot modify a table and select from the same table in a subquery
UPDATE
● How can I get the updated fertility rate for each country?
● SELECT country, year, fertility FROM fertility WHERE year = 2015 GROUP BY country
● How can I update my country table using subquery?
● UPDATE countries c SET c.fertility = (SELECT fertilityFROM fertility fWHERE YEAR =2015AND f.country = c.countryGROUP BY country),c.year =2015
UPDATE
● How to get one column (continent) from the second source?
● UPDATE countries c SET c.continent = (SELECT continent_code FROM countries2 c2WHERE c2.name = c.country )
DELETE
● How to delete countries which has no data
● DELETE FROM countries WHERE country IN (SELECT country FROM fertility GROUP BY country HAVING SUM(fertility) IS NULL) *
● * It might give error on MySQL workbench because of safe update mode. (Error Code: 1175) – You must disable safe mode.
The Subquery as Scalar Operand
● A scalar subquery is a simple operand, and you can use it almost anywhere a single column value or literal is legal, and you can expect it to have those characteristics that all operands have: a data type, a length, an indication that it can be NULL, and so on. For example:
● CREATE TABLE t1 (s1 INT, s2 CHAR(5) NOT NULL);
● INSERT INTO t1 VALUES(100, 'abcde');
● SELECT (SELECT s2 FROM t1);
● The subquery in this SELECT returns a single value ('abcde') that has a data type of CHAR, a length of 5, a character set and collation equal to the defaults in effect at CREATE TABLE time, and an indication that the value in the column can be NULL.
Example
● Fertility average of Europe and South America
– SELECT (SELECT AVG(fertility) FROM countries WHERE continent = 'EU'),(SELECT AVG(fertility) FROM countries WHERE continent = 'SA')
Comparisons Using Subqueries
● = > < >= <= <> != <=>
● What are the countries in South America which have fertility smaller than Max european fertility?
● SELECT * FROM countriesWHERE fertility < (SELECT MAX(fertility)FROM countriesWHERE continent='EU')AND continent = 'SA';
Subqueries with ANY, IN, or SOME
● Which countries in Africa have fertility rate < than any european country?
● SELECT * FROM countriesWHERE fertility < ANY (SELECT fertility FROM countries WHERE continent='EU')AND continent = 'AF';
● When used with a subquery, IN is an alias for = ANY
Subqueries with ALL
● SELECT s1 FROM t1 WHERE s1 > ALL(SELECT s1 FROM t2);
● Which countries in Africa have the fertility rate bigger than all countries of Asia and South America
● SELECT s1 FROM t1 WHERE s1 <> ALL(SELECT s1 FROM t2);
● NOT IN is an alias for <> ALL
Row Subqueries
● A row subquery is a subquery variant that returns a single row and can thus return more than one column value
● SELECT * FROM t1 WHERE (col1, col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);
● Which countries are on the average of 2015?
● SELECT country, year, fertility FROM fertilityWHERE (TRUNCATE(fertility,1), year) =(SELECT TRUNCATE(AVG(fertility),1), year FROM countries);
Row Subqueries
● SELECT * FROM t1 WHERE ROW(col1, col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);
● The row constructor and the row returned by the subquery must contain the same number of values.
● The following query answers the request, “find all rows in table t1 that also exist in table t2”:
● SELECT column1, column2, column3FROM t1WHERE (column1, column2, column3) IN(SELECT column1, column2, column3 FROM t2);
Row Subqueries
● Which countries have the same fertility rate than Estonia (rounding)?
● SELECT * FROM fertilityWHERE ROW(TRUNCATE(fertility,1), year) =(SELECT TRUNCATE(fertility,1), yearFROM countriesWHERE country='Estonia');
Subqueries with EXISTS or NOT EXISTS
● SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);
● What countries exists in my two datasources?
– SELECT DISTINCT country FROM countries c
WHERE EXISTS (SELECT name FROM countries2 c2 WHERE c.country = c2.name);
● Now the opposite
– SELECT DISTINCT country FROM countries c
WHERE NOT EXISTS (SELECT name FROM countries2 c2 WHERE c.country = c2.name);
Correlated Subqueries
● A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query.
● SELECT * FROM t1WHERE column1 = ANY (SELECT column1 FROM t2WHERE t2.column2 = t1.column2);
● In which years Estonia had fertility rate bigger than it's historical average.
● SELECT * FROM fertility fWHERE fertility > (SELECT AVG(fertility) FROM fertility f2 WHERE f.country = f2.country AND f2.country = 'Estonia' GROUP BY f2.country);
Correlated Subqueries
● In which years Estonia had fertility smaller than average for 2000 to 2015 (1.53)?
● SELECT * FROM fertility fWHERE fertility < (SELECT AVG(fertility)FROM fertility f2WHERE f.country = f2.country AND f2.country = 'Estonia'AND f2.year BETWEEN 2000 AND 2015GROUP BY f2.country);
Subqueries in the FROM Clause
● SELECT ... FROM (subquery) [AS] name …
● Average of fertility for each continent using historical average for each country
● SELECT continent, AVG(avg_fertility)FROM (SELECT AVG(fertility) as avg_fertility, country FROM fertility f WHERE year BETWEEN 2000 AND 2015GROUP BY country) AS avgfertJOIN countries c ON (c.country = avgfert.country)GROUP BY continent
If we have time
● https://www.google.com/fusiontables/DataSource?docid=1tVN1toVTUb1Ju3gaLxIHTtlcST_bdaR7UgU2OfJO#rows:id=1
● https://www.google.com/fusiontables/DataSource?docid=1kg8Pn9JEheqA8whqsmZBgM3quEiPTyFrasfUv5hQ
References
● MySQL 5.7 Reference Manual - http://dev.mysql.com/doc/refman/5.7/en/
● Data Country and Continentshttp://www.geekality.net/2011/08/21/country-names-continent-names-and-iso-3166-codes-for-mysql/
● Historical Fertility Rate - http://www.gapminder.org/data/https://ourworldindata.org/grapher/total-fertility-rate?tab=map
● Nested Querieshttp://www.w3resource.com/sql/subqueries/nested-subqueries.php
● Subqueries - https://sqlschool.modeanalytics.com/advanced/subqueries/
● Using Nested Queries - http://sqlzoo.net/wiki/Using_nested_SELECT