Here’s an example of why relational databases need multiset (or multirelation) semantics, rather than set (or relation) semantics.
Suppose I have two tables:
owner :: String
and quantity :: Int
owner :: String
and quantity :: Int
If I want to work out how many fruits each owner has I can do
SELECT owner, SUM(quantity)
FROM (apples UNION ALL oranges)
GROUP BY owner
I can’t do the same with UNION
in the place of UNION ALL
because if an owner has the same number of apples as oranges then the UNION
will coalesce those two rows into one! In set semantics we would have to artificially add arbitrary identifiers to the apples
and oranges
tables so the UNION
can distinguish their rows, and discard the identifier after aggregation.