Types of subqueries (subqueries) in SQL

Lecture



Nested subqueries

Types of nested subqueries

A nested subquery is a subquery enclosed in parentheses and enclosed in a WHERE (HAVING) phrase of a SELECT clause or other clauses using a WHERE phrase. A nested subquery may contain in its WHERE (HAVING) phrase another nested subquery, etc. It is easy to guess that the embedded subquery was created so that when selecting the rows of the table formed by the main query, you could use data from other tables (for example, when selecting dishes for the menu, use the data on the availability of food in the pantry room).

There are simple and correlated nested subqueries. They are included in the WHERE (HAVING) phrase using IN, EXISTS conditions or one of the comparison conditions (= | <> | <| <| |> |> =). Simple nested subqueries are processed from the bottom up. The nested subquery of the lowest level is processed first. The set of values ​​obtained as a result of its execution is used when implementing a subquery of a higher level, etc.

Requests with correlated nested subqueries are processed by the system in the reverse order. First, the first row of the working table formed by the main query is selected, and the values ​​of those columns used in the nested subquery (nested subqueries) are selected from it. If these values ​​satisfy the conditions of the subquery, then the selected string is included in the result. Then the second row is selected, and so on, until all the rows matching the nested subquery (a sequence of nested subqueries) are included in the result.

It should be noted that SQL is highly redundant in the sense that it often provides several different ways of formulating the same query. Therefore, in many examples of this chapter, conceptual formulations of queries already familiar to us from the previous chapter will be used. And despite the fact that some of them are more successfully implemented with the help of connections, their variants using nested subqueries will still be given here. This is due to the need for detailed acquaintance with the creation and implementation of nested subqueries, since there are many tasks (especially to delete and modify data) that cannot be implemented in another way. In addition, different formulations of the same query require different memory resources for their execution and may differ significantly in implementation time in different DBMS.

1) Nested subqueries

SQL allows you to nest each other. Typically, a subquery returns a single value that is checked for the predicate's truth.

Types of search conditions:
• Comparison with the result of a subquery (=, <>, <, <=,>,> =)
• Check for belonging to the results of a subquery (IN)
• Existence check (EXISTS)
• Multiple (quantitative) comparison (ANY, ALL)

Notes on subqueries:
• A subquery must select only one column (with the exception of a subquery with an EXISTS predicate), and the data type of its result must match the data type of the value specified in the predicate.
• In some cases, you can use the DISTINCT keyword to ensure that you get a single value.
• You cannot include the ORDER BY and UNION clauses in a subquery.
• The subquery can be located on the left and right side of the search term.
• In subqueries, aggregation functions without a GROUP BY clause can be used, which automatically produce a special value for any number of rows, a special IN predicate, as well as column-based expressions.
• If possible, join JOIN tables instead of subqueries.

Examples for subqueries :

SELECT * FROM Orders WHERE SNum = (SELECT SNum FROM SalesPeople WHERE SName = 'Motika')
SELECT * FROM Orders WHERE SNum IN (SELECT SNum FROM SalesPeople WHERE City = 'London')
SELECT * FROM Orders WHERE SNum = (SELECT DISTINCT SNum FROM Orders WHERE CNum = 2001)
SELECT * FROM Orders WHERE Amt> (SELECT AVG (Amt) FROM Orders WHERE Odate = 10/04/1990)
SELECT * FROM Customer WHERE CNum = (SELECT SNum + 1000 FROM SalesPeople WHERE SName = 'Serres')

2) Related Subqueries

In SQL, you can create subqueries with a link to a table from an external query. In this case, the subquery is executed multiple times, once for each row of the table from the external query. Therefore, it is important that the subquery uses an index. A subquery can refer to the same table as an external one. If an external query returns a relatively small number of rows, then the associated subquery will run faster than the unbound. If the subquery returns a small number of rows, then the associated query will execute more slowly than the unbound.

Examples for related subqueries:

SELECT * FROM SalesPeople Main WHERE 1 <(SELECT COUNT (*) FROM Customer WHERE SNum = Main.SNum) // returns all sellers who served more than one customer SELECT * FROM Orders O1 WHERE Amt> (SELECT AVG (Amt) FROM Orders O2 WHERE O2.CNum = O1.CNum) // returns all orders whose value exceeds the average order value for a given buyer

3) Predicate EXISTS Queries using EXISTS

Syntax: [NOT] EXISTS ()

The predicate uses a subquery as an argument and evaluates it as true if the subquery has output, and otherwise as false. A subquery is executed once and may contain several columns, since their values ​​are not checked, but the result of the presence of rows is simply recorded.

The EXISTS quantifier (exists) is a concept borrowed from formal logic. In SQL, a predicate with a quantifier of existence is represented by the expression EXISTS (SELECT * FROM ...).

Such an expression is considered true only when the result of evaluating "SELECT * FROM ..." is a non-empty set, i.e. when there is any record in the table specified in the FROM clause of the subquery that satisfies the WHERE clause of the subquery. (In practice, this subquery will always be a correlated set.)

Consider examples. Give the names of suppliers supplying product number 11.

Result:
  SELECT Title
 FROM Suppliers
 WHERE EXISTS
	 (SELECT *
		 FROM Supplies
		 WHERE PS = Suppliers.PPS
		 AND OL = 11); 
Title
SATISFYING
CROP
SMELT
SUMMER

The system sequentially selects the rows of the Suppliers table, extracts the values ​​of the Name and PS columns from them, and then checks whether the existence condition is true, i.e. Is there at least one row in the Supply table with the PR value = 11 and the PS value equal to the PS value selected from the Suppliers table? If the condition is met, the resulting value of the Name column is included in the result.

Suppose that the first values ​​of the Name and PS fields are, respectively, 'HUMAN' and 1. Since the Supply table has a row with PR = 11 and PS = 1, the value 'HUMAN' should be included in the result.

Although this first example only shows a different way of formulating a query for a problem that can be solved in other ways (using the IN operator or join operator), EXISTS is one of the most important SQL features. In fact, any query that is expressed in terms of IN can also be formulated alternatively using EXISTS. However, the reverse statement is unfair.

Give the name and status of suppliers that do not supply product number 11.

Result:
  SELECT Title, Status
   FROM Suppliers
   WHERE NOT EXISTS
	 (SELECT *
		 FROM Supplies
		 WHERE PS = Suppliers.PPS
		 AND OL = 11); 
Title Status
PORTHOS cooperative
Shushary state farm
TULA supermarket
Cucumber farm

Notes on the EXISTS predicate:
• EXISTS is a predicate that returns a value of TRUE or FALSE, and it can be used separately or together with other boolean expressions.
• EXISTS cannot use aggregation functions in its subquery.
• In correlated (related, dependent - Correlated) subqueries, the EXISTS predicate is executed for each row of the outer table.
• You can combine the EXISTS predicate with table joins.

EXISTS predicate examples:

SELECT * FROM Customer WHERE EXISTS (SELECT * FROM Customer WHERE City = 'San Jose') - returns all customers if one of them lives in San Jose.
SELECT DISTINCT SNUM FROM Customer First WHERE NOT EXISTS (SELECT * FROM Customer Send WHERE Send.SNum = First.SNum AND Send.CNum <> First.CNum) - returns the numbers of sellers who have served only one buyer.
SELECT DISTINCT F.SNum, SName, F.City FROM SalesPeople F, Customer S WHERE EXISTS (SELECT * FROM Customer T WHERE S.SNum = T.SNum AND S.CNum <> T.CNum AND F.SNum = S.SNum ) - returns the numbers, names and cities of residence of all sellers who have served several buyers.
SELECT * FROM SalesPeople Frst WHERE EXISTS (SELECT * FROM Customer Send WHERE Frst.SNum = Send.SNum AND 1 <(SELECT COUNT (*) FROM Orders WHERE Orders.CNum = Send.CNum)) - returns all sellers who served buyers, made more than one order.

4) Predicates of quantitative comparison

Syntax: [NOT] {= |> | <|> = | <= | <>} ANY | ALL ()

These predicates use a subquery as an argument, however, compared to the EXISTS predicate, they are used in conjunction with relation predicates (=, <>, <, <=,>,> =). In this sense, they are similar to the IN predicate, but are used only with subqueries. The standard allows the use of the SOME keyword instead of ANY, but not all DBMSs support it.

Notes on comparison predicates:
• ALL predicate is TRUE if each value selected during the execution of a subquery satisfies the condition specified in the outer query predicate. Most often it is used with inequalities.
• ANY predicate takes the value TRUE if at least one value selected during the execution of the subquery satisfies the condition specified in the predicate of the outer query. Most often it is used with inequalities.
• If the subquery does not return rows, then ALL is automatically TRUE (it is assumed that the comparison condition is fulfilled), and for ANY - FALSE.
• If the comparison does not have the value TRUE for one row and there is one or more rows with a NULL value, ANY returns UNKNOWN.
• If the comparison does not have a FALSE value for one row and there is one or more rows with a NULL value, then ALL returns UNKNOWN.

Examples of quantitative comparison predicate:

SELECT * FROM SalesPeople WHERE City = ANY (SELECT City FROM Customer)
SELECT * FROM Orders WHERE Amt <any (select="" amt="" from="" orders="" a,="" customer="" b="" where="" a.cnum="B.CNum" and="" city="'San" jose')="" select="" *="" rating=""> ALL (SELECT Rating FROM Customer WHERE City = 'Rome')

5) Predicate of uniqueness

UNIQUE | DISTINCT ()

The predicate is used to verify the uniqueness (absence of duplicates) in the output of the subquery. Moreover, in the UNIQUT predicate, rows with NULL values ​​are considered unique, and in the DISTINCT predicate, two undefined values ​​are considered equal to each other.

6) Predicate of coincidence

MATCH [UNIQUE] [PARTIAL | FULL] ()

The MATCH predicate checks whether the value of the query string matches the value of any string obtained as a result of a subquery. This subquery differs from the IN and ANY predicates in that it allows you to process “partial” (PARTIAL) matches that can occur among rows that have a portion of NULL values.

7) Requests in the FROM section

In fact, it is permissible to use a subquery wherever a reference to a table is allowed.

Example:

SELECT CName, Tot_Amt FROM Customer, (SELECT CNum, SUM (Amt) AS Tot_Amt FROM Orders GROUP BY CNum) WHERE City = 'London' AND Customer.CNum = Orders.CNum
// the subquery returns the total value of orders made by each customer from London.

8) Recursive queries

WITH RECURSIVE
Q1 AS SELECT ... FROM ... WHERE ...
Q2 AS SELECT ... FROM ... WHERE ...

9.Simple nested subqueries

Simple nested subqueries are used to represent a set of values, the study of which should be carried out in any IN predicate, which is illustrated in the following example: to give the name and status of suppliers of a product with number 11, i.e. tomatoes

Result:
  SELECT Title, Status
 FROM Suppliers
 WHERE PS IN
	 (SELECT PS
		 FROM Supplies
		 WHERE OL = 11); 
Title Status
SATISFYING market
CROP cooptorg
SUMMER agrofirm
SMELT cooperative

As noted in clause 3.3.1, when processing a complete request, the system performs primarily a subquery. This subquery returns a set of numbers of suppliers who supply the product with the code PR = 11, namely the set (1, 5, 6, 8). Therefore, the initial query is equivalent to such a simple query:

  SELECT Title, Status
 FROM Suppliers
 WHERE PS IN (1, 5, 6, 8);

A subquery with several levels of nesting can be illustrated with the same example. Suppose you want to know not suppliers of product 11, as was done in previous requests, but suppliers of tomatoes, which are product number 11. To do this, you can give a request

  SELECT Title, Status
 FROM Suppliers
 WHERE PS IN
	 (SELECT PS
		 FROM Supplies
		 WHERE PR IN
			 (SELECT PR
				 FROM Products
				 WHERE Product = 'Tomatoes')); 

In this case, the result of the innermost subquery is only one value (11). As already shown above, the next level subquery in turn results in a set (1, 5, 6, 8). The last, outermost SELECT, computes the above final result. In general, any depth of subqueries is allowed.

The same result can be obtained by connecting

  SELECT Title, Status
 FROM Suppliers, Supplies, Products
 WHERE Suppliers.PS = Deliveries.PS
 AND Supplies. PR = Products. PR
 AND Product = 'Tomatoes';

When executing this compact query, the system must simultaneously process data from three tables, whereas in the previous example these tables are processed in turn. Naturally, various memory and time resources are required for their implementation, however, this cannot be felt when working with a limited amount of data in the PANSION illustrative database.

Using the same table in an external and subquery

Issue the numbers of suppliers who supply at least one product supplied by the supplier 6.

Result:
  SELECT DISTINCT PS
 FROM Supplies
 WHERE PR IN
	 (SELECT PR
		 FROM Supplies
		 WHERE PS = 6); 
PS
one
3
five
6
eight

Note that the reference to shipments in an embedded subquery means not the same thing as the link to shipments in the external query. In fact, the two Supply names designate different meanings. To make this fact explicit, it is useful to use pseudonyms, for example, X and Y:

  SELECT DISTINCT X.PS
 FROM Supplies X
 WHERE X.PR IN
	 (SELECT Y.PR
		 FROM Supplies Y
		 WHERE Y.PPS = 6); 

Here, X and Y are arbitrary nicknames of the Supply table, defined in the FROM clause and used as explicit qualifiers in the SELECT and WHERE clauses. Recall that aliases are defined only within a single query.

10 Nested subquery with comparison operator other than IN

Issue numbers of suppliers located in the same city as supplier number 6.

Result:
  SELECT PS
 FROM Suppliers
 WHERE City =	
	 (SELECT City
		 FROM Suppliers
		 WHERE PS = 6); 
PS
one
four
6

In such queries, you can use other comparison operators (<>, <=, <,> = or>), however, if the nested subquery returns more than one value and does not use the IN operator, an error will occur.

11 Correlated nested subqueries

Issue the name and status of suppliers of product number 11.

  SELECT Title, Status
 FROM Suppliers
 WHERE 11 IN
	 (SELECT PR
		 FROM Supplies
		 WHERE PS = Suppliers. PS); 

Such a subquery differs from that considered in clause 3.3.2 in that the nested subquery cannot be processed before the external subquery is processed. This is due to the fact that the subquery depends on the value of Suppliers. PS and it changes as the system checks the various rows of the Suppliers table. Therefore, from a conceptual point of view, the processing is as follows:

  1. The system checks the first row of the Suppliers table. Suppose that this is a vendor line number 1. Then the value Suppliers. The PS will currently have a value of 1, and the system processes the internal request
      (SELECT PR
    	 FROM Supplies
    	 WHERE PS = 1); 
    resulting in a set (9, 11, 12, 15). Now the system can complete the processing for vendor number 1. The Name and Status values ​​for PS = 1 (DEMAND and market) will be sampled if and only if PR = 11 belongs to this set, which is obviously true.
  2. Further, the system will repeat processing of this kind for the next supplier, etc. until all rows of the Suppliers table are considered.

Such subqueries are called correlated, since their result depends on the values ​​defined in the external subquery. Processing of the correlated subquery, therefore, must be repeated for each value retrieved from the external subquery, and not performed once and for all.

Consider an example of using the same table in an external subquery and a correlated nested subquery.

Issue the numbers of all products supplied by only one supplier.

Result:
  SELECT DISTINCT X.PR
 FROM Supplies X
 WHERE X.PR NOT IN
	 (SELECT Y.PR	
		 FROM Supplies Y	
		 WHERE Y.PS <> X.PS); 
X. PR
17

The effect of this query can be explained as follows: "For each row of the Supply table, say X, alternately, select the value of the product number (PR), if and only if this value is not included in some row, say Y, of the same table, and the value of the column the number of the supplier (PS) in line Y is not equal to its value in line X ".

Note that in this formulation at least one alias must be used - either X or Y.

12. Functions in a subquery

Now, after getting acquainted with the various formulations of the embedded subqueries and pseudonyms, it is easier to understand the text and algorithm for implementing the query (p. 3.1) to receive those food product suppliers for Syrniki who supply these products for a minimal price:

  SELECT Product, Price, Name, Status
 FROM Products, Ingredients, Meals, Supplies, Suppliers
 WHERE Products.PR = Composition.PR
 AND Ingredients.BL = Dishes.BL
 AND Deliveries.PR = Composition.PR
 AND Deliveries.PS = Suppliers.PS
 AND Dish = 'Cheesecakes'
 AND Price = (SELECT MIN (Price)
			 FROM Supplies X
			 WHERE X.PR = Deliveries.PR); 

Naturally, this is a correlated subquery: here the minimum price of a product that is part of Syrnikov is first determined, and only then its supplier is found out.

In this example, we end up with the subqueries nested, suggesting that we try our hand at compiling a number of queries using the mechanism of such subqueries:

  1. Give the names of all meat dishes.
  2. Give the number of all dishes, which include tomatoes.
  3. To issue dishes, products for which are supplied by LETO agrofirma.

Comments


To leave a comment
If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.
To reply

Databases, knowledge and data warehousing. Big data, DBMS and SQL and noSQL

Terms: Databases, knowledge and data warehousing. Big data, DBMS and SQL and noSQL