数据管理基础笔记

Textbook: 数据库原理、编程与性能

Week 1

  • 常见数据模型

    1. 层次数据模型
    2. 网状数据模型
    3. 关系模型
    4. 面向对象模型
    5. 对象关系模型
  • Terminology

    • Table (Relation) Old: file of records
    • Column names (Attributes) Old: field names of records
    • Rows (Tuples) Old: records of a file
    • Table heading (Schema) Old: set of attributes
    关系模型 SQL 文件系统
    Relation Table File of Records
    Attribute Column Field
    Tuple Row Record
    Schema Table Heading Type of Record
  • Column Type

    can be real, integer, char, data, etc..

  • Problems of Column type

    Most commercial database systems don’t support types consisting of enumerated sets(as with city, month).

  • Relation Algebra

    关系代数是一阶逻辑的分支,是闭合于运算下的关系的集合。运算作用于一个或多个关系上来生成一个关系。关系代数是计算机科学的一部分。

    在纯数学中的关系代数是有关于数理逻辑和集合论的代数结构。

    Domain of column is like an enumerated type.

    • Example:
      • Domain(City) → all the city names in the U.S.
  • Cartesian Product

    Set: CID = Domain(cid), CNAME = Domain(cname), CITY = Domain(city), DISCNT = Domain(discnt), then consider:
    $$
    CID \times CNAME \times CITY \times DISCNT
    $$
    consisting of all tuples: $$(w, x, y, z)$$, $w$ in CID, $x$ in CNAME, $y$ in CITY, $z$ in DISCNT.

    sno name dept
    1 A a
    2 B b

    如果只考虑上述Table,Domain(sno) = {1, 2, 3},Domain(name) = {A, B},Domain(dept) = {a, b}.

    若将上述三个值域进行笛卡尔乘积运算,可以得到如下Table:

    sno name dept
    1 A a
    1 A b
    1 B a
    1 B b
    2 A a
    2 A b
    2 B a
    2 B b

    NOTE: 原关系S显然只是Cartesian Product结果关系的一个真子集

  • 关系规则

    关系规则 (Relation Rules) 阐述了关系模型中那些操作是允许的,哪些是受限的。

  • 关系规则——1. 第一范式规则 First Normal Form Rule

    在定义的表中,关系模型坚持不允许含有多值属性(有时称为重复字段)和含有内部结构(比如记录类型)的。遵守这样规则的表称为第一范式

    多值属性:例如sno为1的同学,其行(sno为1的行唯一)的dept属性有两个值,中间划了一条线变成上下两个。那么dept列就是多值属性列。

    内部结构:即一列的值中不能中间从上往下划一刀变成左右两个,通过$A.l$和$A.r$访问。

  • 关系规则——2. 只能基于内容存取行规则 Access Rows by Content Only Rule

    关系模型的该条规则说明我们只可以通过行的内容,即每一行中所存在的属性值来检索列。

    这一规定类似$HashMap$,只能通过$Key$访问$Value$,而不能像数组一样说访问$map[i]$

    列和行均没有顺序

    不允许指向行的指针(但很多商业数据库允许)

  • 关系规则——3. 行唯一性规则 The Unique Row Rule

    该条规则要求关系中的任何连个元组(表中的行)的值在同一时刻不能是完全相同的。(类似泡利不相容原理)

    So that a relation is an unordered SET of tuples

    但是仍有一些商业数据库允许顺序检索以在某些场景下提高效率。

  • 超键、键、空值

    键 (Key):表的键是表的一些Head的组合,该组合对于任意行都是不同的,且不可含有冗余Head

    超键 (Super Key):是可以含有冗余Head的键。

    键是一个比超键更严格的概念,键同时也是一个超键。

    一般来讲,表的键的选定或设计是遵从设计者的意图的,故而在表的某个瞬间,即是有另一个Head满足上述键的定义,但由于row是时刻可以变化的,故不可推断其为键,必须遵从设计者的意图。只有整个存活周期都不变的表才能按照定义推断键

    • 定理:每一表都至少有一个键。

    主键 (Primary Key):关系的各种键称为候选键,而被数据库设计者选择出来作为表T中特定行的唯一性标识符的候选键就是主键

    通常主键标识符被用来作为别的表中的引用

    空值 (NULL):空值用来表示未知的或尚未定义的,一般意思为当前没有足够信息能填写该项,但数据有需要入库,以后知道更多情况会填写。空值不是0。

  • 关系规则——4. 实体完整性规则 Entity Integrity Rule

    表T中的任意行在主键列的取值都不允许为空值。

  • Fundamental Operations of Relational Algebra

    1. set-theoretic operations

      depend on fact that table is a set of rows

      Name Symbol
      Union $\bigcup$
      Intersection $\bigcap$
      Difference
      Product $\times$
    2. native relational operations

      depend on structure of table

      Name Symbol Example
      Project $R[\ \ ],\ or\ \ \pi$ $R[A_{i1,…,A_{ik}}],\ \pi_{A_{i1},…A_{ik}}(R)$
      Select $R\ where\ C,\ or\ \ \delta(R)$ $R\ where\ A_1=5,\ \delta_{A_1=5}(R$
      Join $\infty,\ or\ \ JOIN$ $R\ \infty\ S$
      Division $\div$ $R\div S$
  • Compatible Tables (相容表)

    Tables R and S are compatible if they have the same headings.

    that is , if Head(R)=Head(S), with attributes chosen from the same domains and with the same meanings.

    *SAME DOMAINS *is essential.

    Head的顺序不影响是否相容。

  • Union, Intersection, Difference

    Let R and S be two compatible tables, where Head(R) = Head(S)

    即相容表的的交并补运算。

    $R\bigcup S$ is a table with the same heading as R (or S), for each row t in R or in S, t in $R\bigcup S$
    $R\bigcap S$ is a table with the same heading as R (or S), for each row t in R, if t appear in S, then t in $R\bigcap S$
    $R-S$ is a table with the same heading as R (or S), for each row t in R, if t don’t appear in S, then t in $R-S$

    • $R-S\neq S-R$
    • $R\bigcap S=R-(R-S)=S-(S-R)$
  • Assignment, Alias

    $S:=R$

    The content of the new table S is exactly the same as the old table R.

  • Product Operation

    Let $R$ be:

    A B C
    1 a A
    2 b B

    Let $S$ be:

    B C D
    5 c C
    -1 f F

    Then $R\times S$:

    R.A R.B R.C S.B S.C S.D
    1 a A 5 c C
    1 a A -1 f F
    2 b B 5 c C
    2 b B -1 f F

    $col_num_of_(R×S) == col_num_of_R + col_num_of_S.$

    $row_num_of_(R×S) == row_num_of_R × row_num_of_S.$

  • Projection

    投影某一列,特点是去除重复的元素,显示的是元素的集合。

    $R[A_{i1},A_{i2},…,A_{ik}]$其中$A_{i1},A_{i2},…,A_{ik}$是$R$的$Head$。

  • Selection

    选择符合要求的rows。

    可以使用逻辑$AND,OR,NOT$.

    可以使用比较$<,>,=,<=,>=,<>$.

    Selection and Projection used together: $T:=CUSTOMERS\ where\ city=’Kyoto’$

  • 关系运算的优先级 Precedence of Relational Operations

    Precedence Operations
    Highest Project
    | Select
    | Product
    | Join, DivideBy
    Intersection
    Lowest Union, Difference
  • Join (联接,等值联接,自然联接)

    Let $R$ be:

    A B1 B2
    a1 b1 b1
    a1 b2 b1
    a2 b1 b2

    Let $S$ be:

    B1 B2 C
    b1 b1 c1
    b1 b1 c2
    b1 b2 c3
    b2 b2 c4

    Then $R\ \infty\ S$:

    A B1 B2 C
    a1 b1 b1 c1
    a1 b1 b1 c2
    a2 b1 b2 c3

    可以发现,Join就是发现R和S的对应重复行,然后进行各种可能的连接。

    If $Head(R)\bigcap Head(S)=\Phi$, then $R\ \infty\ S=R\times S$.

    If $Head(R)=Head(S)$, then $R\ \infty\ S=R\bigcap S$.

    If $Head(R)\subset Head(S)$, then $R\ \infty\ S\subset S$.

    • Outer Join

      Outer Join, Left Outer Join, Right Outer Join

      Let $R,S$ be

      R S
      A B B C
      a1 b1 b1 c1
      a2 b2 b2 c2
      a3 b5 b3 c3
      b4 c4

      Then:

      Outer Join Left Outer Join Right Outer Join
      A B C A B C A B C
      a1 b1 c1 a1 b1 c1 a1 b1 c1
      a2 b2 c2 a2 b2 c2 a2 b2 c2
      a3 b5 null a3 b5 null null b3 c3
      null b3 c3 null b4 c4
      null b4 c4
  • NOTE:

    The Product and Join operations are commutative and associative.

    乘积和联接符合交换律和结合律。

  • Division

    Let $R$ be:

    A B C
    a1 b1 c1
    a2 b1 c1
    a1 b2 c1
    a1 b2 c2
    a2 b1 c2
    a1 b2 c3
    a1 b2 c4
    a1 b1 c5

    Let $S_1,S_2,S_3$ be

    $S_1$ $S_2$ $S_3$
    C C C
    c1 c1 c2
    c2

    Then

    $R\div S_1$ $R\div S_2$ $R\div S_3$
    A B A B A B
    a1 b1 a1 b2 a1 b2
    a2 b1 a2 b1 a2 b1
    a1 b2

    可以明显看到,除以的结果,是同时持有除数组合的行的其他元素的集合

  • 定理

    If $R=T\times S$, then $T=R\div S$, and $S=R\div T$.

    If $T=R\div S$, then $T\times S\subset R$.

  • スキル

    • CAP:

      Customers: C(cid, cname, city, discnt)

      Agents: A(aid, aname, city, percent)

      Products: P(pid, pname, city, quantity, price)

      Orders: O(ordno, month, cid, aid, pid, qty, dollars)

    • Example of Operations of Relational Algebra

      1. Get aids of agents who do not supply product p02.

        $A[aid]-(O\ where\ pid=’p02’)[aid]$

      2. Get aids of agents who supply only product p02.

        $O[aid]-(O\ where\ pid <> ‘p02’)[aid]$

      3. Get aids of agents who take orders on at least that set of products ordered by c004.

        $O[aid, pid] ÷ (O\ where\ cid = ‘c004’)[pid]$

      4. Get cids of customers who order p01 and p07.

        $(O\ where\ pid=’p01’)[cid]\bigcap (O\ where\ pid=’p07’)[cid]$

      5. Get cids of customers who order p01 or p07.

        $(O\ where\ pid=’p01’)[cid]\bigcup (O\ where\ pid=’p07’)[cid]$

        $(O\ where\ pid=’p01’\ or\ pid=’p07’)[cid]$

      6. List all cities inhabited by customers who order product p02 or agents who place an order for p02.

        • cities inhabited by customers who order product:

          $T_1:=((O\ where\ pid=’p02’)\ \infty\ C)[city]$

          要在Order里寻找是否订购了’p02’,但是Order里没有city属性,故而需要Join Customer表,注意到Order和Customer表的相同项只有cid,所以Join后得到的就是相同cid行的Join。

        • agents who place an order for p02:

          $T_2:=((O\ where\ pid=’p02’)\ \infty\ A)[city]$

        • result

          $T:=T_1\bigcup T_2$

      7. Get aids of agents who place an order for at least one customer that uses product p01.

        $((O\ where\ pid=’p01’)[cid]\ \infty\ O)[aid]$

        $O[aid, pid]\div (P\ where\ pid=’p01’)[pid]$

      8. Get aids of agents who place orders for all customers that uses product p01.

        $O[cid,aid]\div (O\ where\ pid=’p01’)[cid]$

      9. Retrieve product ids for *all products that** are not ordered by any customers living in a city beginning with the letter “D”.*

        多重条件,英文要从后往前读。

        city begin with letter “D”,则需要用到字典序大小比较:

        $T_1:=C\ where\ C.city>=’D’\ and\ C.city<’E’$

        再看到not关键字,显然,需要用到minus操作

        $T_2:=P[pid]-(O\ \infty\ T_1)[pid]$

        所有的减去not的就是要求的

      10. Retrieve cids of customers with the largest discounts.

        largest表示需要和自己比较,意味着需要使用Cartesian Product不同行组成的相同行,意味着需要创造自身的副本。

        1. $CY:=C$ 创建C的副本
        2. $T_1:=((CY\times C)\ where\ CY.discnt>C.discnt)[C.cid]$
        3. $T_2:=C[cid]-T_1$

        可以看到,上述求取极值的方法是找到有比自己更大或更小值得值,这样的值就不是极大或极小,因为有比它更大或小的值,最后用全部值减去不符合要求的值即是符合要求的值。

        1. $CY:=C$
        2. $T_1(cyid,cid):=((CY\times C)\ where\ CY.discnt>=C.discnt)[CY.cid,C.cid]$
        3. $T_2:=T_1\div C[cid]$

        上述是另外一种求取极值的方法,这次Project的pair $[CY.cid,C.cid]$左边总是大于等于右边,而明显地,同时大于等于所有元素(包括自身)的元素,就是最大值,涉及到all故而用division.

      11. Get the names of customers who order at least one product priced at $0.50.

        $(((P\ where\ price=0.50)[pid]\ \infty\ O)\ \infty\ C)[cname]$

        还是从后往前看英文,先求出价格是0.05的Product,然后Join Order绑定相同的pid,得到对应的cid,再Join Customer,通过cid绑定,最后投影cname。

      12. Find cids of all customers who *don’t place any order through agent a03.***

        $O[cid]-(O\ where\ aid=’a03’)[cid]$

        这是我的第一反应,但是这实际上是错误的,注意到,获得cid的途径有O和C两个,根据提议,如果按照上述从O获取,就不能获取到没有订购任何商品的cid

        $C[cid]-(O\ where\ aid=’a03’)[cid]$

      13. Retrieve customers who place orders only through a03.

        $O[cid]-(O\ where\ aid\neq ‘a03’) [cid]$

        这里与12恰好相反,如果从C获取,则可能获取到没有订购任何商品的cid。

      14. Find products that have never been ordered by a customer based in New York through an agent based in Boston.

        1. $AB:=(A\ where\ city=’Boston’)[aid]$

        2. $CN:=(C\ where\ city=’New\ York’)[cid]$

        3. $PR:=(AB\ \infty\ O)\ \infty\ CN$

          得到指定agent和customer的order

        4. $R:=P[pid]-PR[pid]$

          同样的,这里也要格外注意,因为获取pid的途径有P和O两个。

      15. Get *names** of customers who order all products priced at $0.50.*

        1. $RP:=(P\ where\ price=’0.05’)[pid]$
        2. $RCI:=O[cid,pid]\div RP$
        3. $R:=(RCI\ \infty\ C)[cname]$
      16. Get cids of customers who order all products that anybody orders.

        $O[cid, pid]\div O[pid]$

      17. Get aids of agents who take orders on at least that set of products ordered by c004.

        $O[aid,pid]\div (O\ where\ cid=’c004’)[pid]$

      18. Get cids of customers who order both products p01 and p07.

        $(O\ where\ pid=’p01’)[cid]\bigcap (\ where\ pid=’p07’)[cid]$

        $O[cid,pid]\div (P\ where\ (pid=’p01’\ or\ pid=’p07’))[pid]$

      19. Get cids of customers who place an order through at least one agent who places an order for product p03.

        1. $T_1:=(O\ where\ pid=p03)[aid]$
        2. $T_2:=(T_1\ \infty\ O)[cid]$

        这是 Join的一大作用!即如果存在需要寻找X[y]=T1任意元素都行,这种情况时,可用 Join,因为只要等于一个T1中的都会被联接。

      20. Get cids of all customers who have the same discount as any customer in Dallas or Boston.

        1. $T1 := C\ where\ city=‘Dallas’\ or\ city=‘Boston’$
        2. $T2 := (T1 [discnt]\ \infty\ C) [cid]$
      21. List pids of products that are ordered through agents who place orders for (possibly different) customers who order at least one product from an agent who has placed an order for customer c001.

        1. $T_1:=(O\ where\ cid=’c001’)[aid]$
        2. $T_2:=(O\ \infty\ T_1)[cid]$
        3. $T_3:=(O\ \infty\ T_2)[aid]$
        4. $T_4:=(O\ \infty\ T_3)[pid]$

Week 4

  • Basic SQL Query Language

  • Data Type

    • ANSI SQL Datatype
      1. CHARACTER DataType
      2. NUMERIC DataType
    • ANSI CHARACTER Datatype
      1. CHARACTER(n), CHAR(n)
      2. CHARACTER VARING(n)
      3. CHAR VARYING(n)
    • ANSI NUMERIC Datatype
      1. NUMERIC(p, s), DECIMAL(p, s), DEC(p, s)
        • precision: total number of digits
        • scale: number of digits to the right of the decimal point
      2. INTEGER, INT, SMALLINT
      3. FLOAT(p)
      4. REAL
      5. DOUBLE PRECISION
    • Oracle: Character Datatype
      1. CHAR(n)
      2. VARCHAR(n)
      3. LONG
    • Oracle: NUMBER Datatype
      1. NUMBER
      2. NUMBER(precision, scale)
      3. NUMBER(*, scale)
        • precision: total number of digits
        • scale: number of digits to the right of the decimal point
          1. default value is zero
          2. negative scale: rounds the actual data to the specified number of places to the left of the decimal point
  • SQL create table

    1
    2
    3
    4
    5
    6
    7
    CREATE TABLE agents(
    aid CHAR(3) NOT NULL,
    aname VARCHAR(13),
    city VARCHAR(20),
    percent SMALLINT,
    PRIMARY KEY(aid);
    )
  • SQL query in relational algebra

    1. single relation

      $(R\ where\ Condition)[A_1,A_2,…,A_m]$

      1
      2
      3
      SELECT	A1,A2,...,Am
      FROM R
      WHERE Condition;
      1. product

        $((R_1\times R_2\times …\times R_n)\ where\ Condition) [A1,A2,…,Am]$

        1
        2
        3
        SELECT	A1,A2,...,Am
        FROM R1,R2,...,Rn
        WHERE Condition;
        1. join

        $(( R\ \infty \ S ) where Condition ) [ A1, A2, …, Am ]$

        1
        2
        3
        4
        SELECT	A1,A2,...,Am
        FROM R,S
        WHERE Condition and
        R.B1=S.B1 and R.B2=S.B2... and R.Bk=S.Bk;
  • SQL Examples

    1. Find aid and names of agents that are based in New York.

      $(AGENTS\ where\ city=’New\ York’) [aid,aname]$

      1
      2
      3
      SELECT  aid, aname
      FROM agents
      WHERE city = 'New York’ ;
    2. Display all values of customers in table CUSTOMERS.

      $(CUSTOMERS) [ cid, cname, city, discnt ]$

      1
      2
      SELECT	cid, cname, city, discnt
      FROM customers ;

      1
      2
      SELECT	*
      FROM customers ;

      $* == all$

    3. Select product ids of products for which orders are placed.

      $ORDERS[pid]$

      1
      2
      SELECT	pid FROM orders;  --: defult all
      SELECT distinct pid FROM orders; --: now distinct
    4. Retrieve all (cname, aname) pairs where the customer places an order through the agent.

      $( C[cid, cname]\ \infty\ O )\ \infty\ A ) [ cname, aname ]$

      $((C\times O\times A)\ where\ C.cid=O.cid\ and\ O.aid=A.aid) [ cname, aname ]$

      1
      2
      3
      SELECT	distinct  cname, aname
      FROM customers, orders, agents
      WHERE customers.cid=orders.cid and orders.aid=agents.aid ;

      table alias in FROM clause:

      • table_name as alias_name
      • table_name alias_name

      column alias in SELECT clause

      • expression as alias_name
      1
      2
      3
      SELECT	distinct  cname, aname
      FROM customers c, orders o, agents a
      WHERE c.cid=o.cid and o.aid=a.aid ;
      1
      2
      3
      SELECT	ordno, dollars, o.qty*p.price*(1-c.discnt*0.01) as mydollars
      FROM customers c, orders o, products p
      WHERE c.cid=o.cid and o.pid=p.pid ;
    5. List all pairs of customer cids based in the same city.

      1
      2
      3
      SELECT	c1.cid, c2.cid
      FROM customers c1, customers c2
      WHERE c1.city = c2.city and c1.cid < c2.cid ;

      实现原理:

      实现原理

    6. Find pids of products ordered by at least two customers.

      1
      2
      3
      SELECT  distinct  pid
      FROM orders x1, orders x2
      WHERE x1.pid = x2.pid and x1.cid < x2.cid ;

      1
      2
      3
      SELECT	distinct  pid
      FROM orders x1, orders x2
      WHERE x1.pid = x2.pid and x1.cid <> x2.cid ;
  • Subqueries

    • The IN Predicate

      1
      expr [NOT] IN ( subquery )
    • The Quantified Comparison Predicate

      1
      expr θ SOME|ANY|ALL ( subquery )
    • The EXISTS Predicate

      1
      [NOT] EXISTS ( subquery )
    • The BETWEEN Predicate

      1
      expr [NOT] BETWEEN expr1 AND expr2
    • The IS NULL Predicate

      1
      column IS [NOT] NULL
    • The LIKE Predicate

      1
      column [NOT] LIKE val1 [ESCAPE val2]
  • Subqueries Examples

    • The IN Predicate

      1. Retrieve cids of customers who place orders with agents in Duluth or Dallas.

        SQL1(SLOW)

        1
        2
        3
        select	distinct cid
        from orders o, agents a
        where a.aid=o.aid and (a.city=‘Duluth’ or a.city=‘Dallas’)

        SQL1-SLOW

        SQL2(FAST)

        1
        2
        3
        4
        5
        select	distinct cid from orders
        where aid IN
        (
        select aid from agents where city= ‘Duluth’ or city=‘Dallas’
        )

        SQL2-FAST

      2. Get all information concerning agents based in Duluth or Dallas.

        1
        2
        3
        SELECT	*
        FROM agents
        WHERE city IN { ‘Duluth’, ‘Dallas’ } ;
      3. Get the names and discounts of all customers who place orders through agents in Duluth or Dallas.

        1
        2
        3
        4
        5
        6
        7
        8
        9
        SELECT	cname,  discnt
        FROM customers
        WHERE cid IN (
        SELECT o.cid
        FROM orders o
        WHERE o.aid IN (
        SELECT a.aid
        FROM agents a
        WHERE a.city IN (‘Duluth’, ‘Dallas’))) ;
      4. Find the names of customers who order product p05.

        多表连接子查询

        1
        2
        3
        SELECT	distinct  cname
        FROM customers c, orders o
        WHERE c.cid=o.cid and o.pid=‘p05’ ;

        相关子查询

        1
        2
        3
        4
        5
        6
        SELECT	distinct cname
        FROM customers c
        WHERE ‘p05’ IN (
        select pid
        from orders o
        where o.cid=c.cid );

        独立子查询

        1
        2
        3
        4
        5
        6
        SELECT	distinct  cname
        FROM customers c
        WHERE cid IN (
        select cid
        from orders o
        where o.pid=‘p05’ );

        独立子查询最快

      5. Get the names of customers who order product ‘p07’ from agent ‘a03’.

        查询中的作用域问题

        image-20200311181553233

      6. Find ordno values for all orders placed by customers in Duluth through agents in New York.

        1
        2
        3
        4
        5
        6
        7
        SELECT	ordno
        FROM orders
        WHERE (cid, aid) IN
        (select cid, aid
        from customers c, agents a
        where c.city=‘Duluth’ and
        a.city=‘New York’) ;
    • The Quantified Comparison Predicate

      1
      2
      3
      4
      5
      IN  ≡  =SOME ≡ =ANY
      NOT IN ≡ <>ALL
      All: 对所有数据都满足条件,整个条件才成立 ,例如:5000大于所有返回的薪水
      ANY: 只要有一条数据满足条件,整个条件成立,例如,5000大于薪水中的3000,10000,7000
      SOME:的作用和Any一样 。
      1. Find aid values of agents with a minimum percent commission.

        1
        2
        3
        4
        5
        SELECT	aid
        FROM agents
        WHERE percent <= ALL (
        select percent
        from agents ) ;
      2. Find all customers who have the same discount as that of any of the customers in Dallas or Boston.

        1
        2
        3
        4
        5
        6
        SELECT	cid,  cname
        FROM customers
        WHERE discnt = SOME (
        select discnt
        from customers
        where city=‘Dallas’ or city=‘Boston’ ) ;
      3. Get cid values of customers with discnt smaller than those of any customers who live in Duluth.

        1
        2
        3
        4
        5
        6
        SELECT	cid,  cname
        FROM customers
        WHERE discnt < (
        select discnt
        from customers
        where city=‘Duluth’ ) ;
      4. Get cid values of customers with discnt smaller than those of any customers who live in Duluth.

        1
        2
        3
        4
        5
        6
        SELECT	cid,  cname
        FROM customers
        WHERE discnt < ALL (
        select discnt
        from customers
        where city=‘Duluth’ ) ;

        句中any代表所有的。

    • The EXIST Predicate

      1
      [NOT] EXISTS

      The predicate EXISTS (subquery) is TRUE if and only if the subquery returns a non-empty set.

      The predicate NOT EXISTS (subquery) is TRUE if and only if the subquery returns a empty set.

    • The BETWEEN Predicate

      1
      expr  [ NOT ] BETWEEN  expr1  AND  expr2
    • The IS NULL Predicate

      1
      colame IS [NoT] NULL
    • The LIKE Predicate

      1
      colname  [ NOT ]  LIKE  val1  [ ESCAPE  val2 ]
      • –Character in pattern

        –Underscore ( _ ) : Wildcard for any single character

        –Percent ( % ) : Wildcard for any sequence of zero or more characters

        –Escape character : Precedes quoted literal character

        –All other characters : Represent themselves

    • SQL Examples

      1. Find cid values of customers with discnt smaller than those of any customers who live in Duluth.

        1
        2
        3
        4
        5
        6
        7
        SELECT	cid
        FROM customers c
        WHERE c.discnt < ALL(
        SELECT discnt
        FROM customer x
        WHERE x.city = 'Duluth'
        );

        Find cid values of customers with discnt smaller than a customer who lives in Duluth.

        1
        2
        3
        4
        5
        6
        7
        SELECT	cid
        FROM customers c
        WHERE c.discnt < SOME(
        SELECT discnt
        FROM customer x
        WHERE x.city = 'Duluth'
        );
      2. Retrieve all customer names where the customer places an order through agent a05.

        1
        2
        3
        4
        5
        6
        7
        SELECT	distinct cnames
        FROM customers c
        WHERE c.cid IN (
        SELECT cid
        FROM order o
        WHERE o.aid = 'a05'
        );

        1
        2
        3
        4
        5
        6
        7
        SELECT	distinct cnames
        FROM customers c
        WHERE EXISTS(
        SELECT *
        FROM orders o
        WHERE o.cid = c.cid and o.aid = 'a05'
        );

        如果使用关系代数,会要使用到Join操作或Cartesian Product,在使用SQL查询时,最好避免使用效率低的表连接查询

        这里可以看出,EXISTS的意义不是”xx exists in c”而是”exists such … satisfying ….”,同时可以引用外部作用域变量。

      3. Get cids of customers who order both products p01 and p07.

        1
        2
        3
        4
        5
        6
        7
        (
        SELECT cid FROM orders WHERE pid = 'p01';
        )
        INTERSECT
        (
        SELECT cid FROM orders WHERE pid = 'p07'
        );
        1
        2
        3
        4
        5
        6
        SELECT	o1.cid
        FROM orders o1
        WHERE o1.pid=‘p01’ and o1.cid IN (
        SELECT o2.cid FROM orders o2
        WHERE o2.pid=‘p07’
        );
      4. Find all customer names where the customer does not place an order through agent a05.

        $((C[cid]-(O where aid = ‘a05’) [cid])\ \infty\ C) [cname]$

        <>ALL来代替减号

        1
        2
        3
        4
        5
        6
        7
        SELECT	cname
        FROM customers
        WHERE cid NOT IN(
        SELECT o.cid
        FROM orders o
        WHERE o.aid = ‘a05’
        );

        或者用户NOT EXISTS来代替减号

        1
        2
        3
        4
        5
        6
        7
        SELECT	cname
        FROM customers c
        WHERE NOT EXISTS (
        SELECT *
        FROM orders o
        WHERE o.cid = c.cid and o.aid = ‘a05’
        );
      5. Find all cid, aid pairs where the customer does not place an order through the agent.

        1
        2
        3
        4
        5
        6
        7
        SELECT	cid, aid
        FROM customers c, agents a
        WHERE NOT EXISTS(
        SELECT *
        FROM orders o
        WHERE o.cid = c.cid and o.aid = a.aid
        );

        1
        2
        3
        4
        5
        6
        SELECT	cid, aid
        FROM customers c, agents a
        WHERE (cid, aid) NOT IN(
        SELECT o.cid, o.aid
        FROM orders o
        );
      6. Find cids of all customers who don’t place any order through agent a03.

        1
        2
        3
        4
        5
        6
        7
        SELECT	cid
        FROM customers
        WHERE cid NOT IN (
        SELECT o.cid
        FROM orders o
        WHERE o.aid = ‘a03’
        );

        1
        2
        3
        4
        5
        6
        7
        SELECT	cid
        FROM customers c
        WHERE NOT EXISTS (
        SELECT *
        FROM orders o
        WHERE o.cid = c.cid and o.aid = ‘a03’
        );
      7. Retrieve the city names containing customers who order product p01.

        1
        2
        SELECT	city	FROM	customers c, orders o
        WHERE c.cid = o.cid and o.pid = ‘p01’;

        不推荐多表连接查询$\Uparrow$

        1
        2
        3
        4
        5
        6
        7
        SELECT	city
        FROM customers
        WHERE cid IN(
        SELECT cid
        FROM orders
        WHERE pid = ‘p01’
        );
        1
        2
        3
        4
        5
        6
        7
        SELECT	city
        FROM customers
        WHERE cid = SOME(
        SELECT cid
        FROM orders
        WHERE pid = ‘p01’
        );
        1
        2
        3
        4
        5
        6
        7
        SELECT	city
        FROM customers c
        WHERE ‘p01’ IN (
        SELECT o.pid
        FROM orders o
        WHERE o.cid = c.cid
        );
        1
        2
        3
        4
        5
        6
        7
        SELECT	city
        FROM customers c
        WHERE EXISTS (
        SELECT *
        FROM orders o
        WHERE o.cid = c.cid and o.pid = ‘p01’
        );
      8. Get the names of customers who order at least one product priced at $0.50.

        1
        2
        3
        SELECT	cname
        FROM products p, orders o, customers c
        WHERE price=0.50 and p.pid=o.pid and o.cid=c.cid ;
      9. Retrieve customers who place orders only through agent a03.

        1
        2
        3
        4
        5
        6
        7
        SELECT	o1.cid
        FROM orders o1
        WHERE o1.cid NOT IN (
        SELECT o2.cid
        FROM orders o2
        WHERE o2.aid <> ‘a03’
        );
      10. Find products that have never been ordered by a customer based in New York through an agent based in Boston.

        1
        2
        3
        4
        5
        6
        7
        8
        SELECT	p.pid
        FROM products p
        WHERE p.pid NOT IN (
        SELECT o.pid
        FROM customers c, agents a, orders o
        WHERE c.city=‘New York’ and a.city=‘Boston’ and c.cid=o.cid
        and o.aid=a.aid
        );
      11. Get cids of customers who place an order through at least one agent who places an order for product p03.

        1
        2
        3
        4
        5
        6
        7
        SELECT	o2.cid
        FROM orders o2
        WHERE o2.aid IN(
        SELECT o1.aid
        FROM orders o1
        WHERE o1.pid=‘p03’
        );
      12. List pids of products that are ordered through agents who place orders for (possibly different) customers who order at least one product from an agent who has placed an order for customer c001.

        1
        2
        3
        4
        SELECT	o4.pid
        FROM orders o1, orders o2, orders o3, orders o4
        WHERE o1.cid=‘c001’ and o1.aid=o2.aid
        and o2.cid=o3.cid and o3.aid=o4.aid ;

        1
        2
        3
        4
        5
        6
        7
        8
        9
        10
        11
        12
        13
        14
        15
        SELECT	o4.pid
        FROM orders o4
        WHERE o4.aid IN (
        SELECT o3.aid
        FROM orders o3
        WHERE o3.cid IN (
        SELECT o2.cid
        FROM orders o2
        WHERE o2.aid IN (
        SELECT o1.aid
        FROM orders o1
        WHERE o1.cid=‘c001’
        )
        )
        );
      13. Get pids of products not ordered by any customer living in a city whose name begin with the letter D.

        1
        2
        3
        4
        5
        6
        7
        SELECT	p.pid
        FROM products p
        WHERE p.pid NOT IN (
        SELECT o.pid
        FROM orders o, customers c
        WHERE o.cid = c.cid and c.city LIKE 'D%'
        );