OQL LEAGUE
OQL LEAGUE
1.1
Introduction
In this chapter, we
describe an object query language named OQL, which supports the ODMG data
model. It is complete and simple. It deals with complex objects without
privileging the set construct and the select-from-where clause.
We first describe the
design principles of the language in Section 1.2, then we introduce in the next
sections the main features of OQL. We explain the input and result of a query
in Section 1.3. Section 1.4 deals with object identity. Section 1.5 presents
the path expressions. In Section 1.6, we show how OQL can invoke operations and
Section 1.7 describes how polymorphism is managed by OQL. Section 1.8 concludes
this part of the presentation of the main concepts by exemplifying the property
of operators composition.
Finally, a formal and
complete definition of the language is given in Section 1.9. For each feature
of the language, we give the syntax, its semantics, and an example. Alternate
syntax for some features are described in Section 1.10, which completes OQL in
order to accept any syntactical form of SQL. The chapter ends with the formal
syntax, which is given in Section 1.11.
Our design is based on
the following principles and assumptions:
- OQL relies on the ODMG object model.
- OQL is a superset of the part of standard SQL
that deals with database queries. Thus, any select SQL
sentence which runs on relational tables, works with the same syntax and
semantics on collections of ODMG objects. Extensions concern
object-oriented notions, like complex objects, object identity, path
expression, polymorphism, operation invocation, late binding, etc.
- OQL provides high-level primitives to deal with
sets of objects but is not restricted to this collection construct. It
also provides primitives to deal with structures, lists, arrays, and
treats such constructs with the same efficiency.
- OQL is a functional language where operators can
freely be composed, as long as the operands respect the type system. This
is a consequence of the fact that the result of any query has a type which
belongs to the ODMG type model, and thus can be queried again.
- OQL is not computationally complete. It is a
simple-to-use query language which provides easy access to an ODBMS.
- Based on the same type system, OQL can be invoked
from within programming languages for which an ODMG binding is defined.
Conversely, OQL can invoke operations programmed in these languages.
- OQL does not provide explicit update operators
but rather invokes operations defined on objects for that purpose, and
thus does not breach the semantics of an ODBMS which, by definition, is
managed by the "methods" defined on the objects.
- OQL provides declarative access to objects. Thus
OQL queries can be easily optimized by virtue of this declarative nature.
- The formal semantics of OQL can easily be
defined.
As a stand-alone
language, OQL allows querying denotable objects starting from their names,
which act as entry points into a database. A name may denote any kind of
object, i.e., atomic, structure, collection, or literal.
As an embedded language,
OQL allows querying denotable objects which are supported by the native
language through expressions yielding atoms, structures, collections, and
literals. An OQL query is a function which delivers an object whose type may be
inferred from the operator contributing to the query expression. This point is
illustrated with two short examples.
The schema defines the
types Person and Employee. These types have the extents Persons and Employees
respectively. One of these persons is the chairman (and there is an entry-point
Chairman to that person). The type Person defines the name, birth-date, and
salary as attributes and the operation age. The type Employee, a subtype of
Person, defines the relationship subordinates and the operation seniority.
select distinct x.age from Persons x where x.name = "Pat"
This selects the set of
ages of all persons named Pat, returning a literal of
type set<integer>.
select distinct struct(a:x.age, s: x.sex) from Persons x where
x.name = "Pat";
This does about the
same, but for each person, it builds a structure containing age and sex. It
returns a literal of type set<struct>.
select distinct struct(name: x.name, hps: (select y from x.subordinates as
y where y.salary >100000)) from Employees x
This is the same type of
example, but now we use a more complex function. For each employee we build a
structure with the name of the employee and the set of the employee's highly
paid subordinates. Notice we have used a select-from-where clause in the select
part. For each employee x, to compute hps, we traverse the
relationship subordinates and select among this set the employees with a salary
superior to $100,000. The result of this query is therefore a literal of the
type set<struct>, namely:
set<struct(name: string, hps: bag<Employee>)>
We could also use a select operator in the from part:
select struct (a:
x.age, s: x.sex)
from(select y from y in Employees where y.seniority ="10") as x
where x.name = "Pat"
from(select y from y in Employees where y.seniority ="10") as x
where x.name = "Pat"
Of course, you do not always have to use a
select-from-where clause:
Chairman
retrieves the Chairman object.
Chairman.subordinates
retrieves the set of subordinates of the Chairman.
Persons
1.4 Dealing with Object Identity
The query language
supports both types of objects: mutable (i.e., having an OID) and literal
(identity = their value), depending on the way these objects are constructed or
selected.
To create an object with
identity a type name constructor is used. For instance, to create a Person
defined in the previous example, simply write
Person(name: "Pat", birthdate: "3/28/56", salary:
100,000)
The parameters in
parentheses allow you to initialize certain properties of the object. Those
which are not explicitly initialized are given a default value.
You distinguish such a
construction from the construction expressions that yield objects without
identity. For instance,
struct (a: 10,b:"Pat")
creates a structure with two valued fields.
If you now return to the
example in Section 1.3, instead of computing literals, you can build objects.
For example, assuming that these mutable object types are defined:
type vectint: set<integer>;
type stat
attributes
type stat
attributes
a: integer
s:char
s:char
end_type;
type stats: bag<stat>;
type stats: bag<stat>;
you can carry out the following queries:
vectint(select distinct.age from Persons where name = "Pat")
which returns a mutable object of type vectint and
stats(select stat (a: age, s: sex) from Persons where name =
"Pat")
which returns a mutable object of type stats.
The extraction
expressions may return:
- A collection of objects with identity,
e.g., select x from Persons x where x.name
="Pat" returns a collection of persons whose name is Pat.
- An object with identity,
e.g., element(select x from Persons x where
x.passport_number=1234567) returns the person whose passport number
is 1234567.
- A collection of literals, e.g., select
x.passport_number from Persons x where x.name="Pat" returns
a collection of integers giving the passport numbers of people named Pat.
- A literal, e.g., Chairman.salary.
Therefore the result of
a query is an object with or without object identity: some objects are
generated by the query language interpreter, and others produced from the
current database.
As explained above, one
can enter a database through a named object, but more generally as long as one
gets an object (which comes, for instance, from a C++ expression), one needs a
way to navigate from it and reach the right data one needs. To
do this in OQL, we use the "." (or indifferently "->")
notation which enables us to go inside complex objects, as well as to follow
simple relationships. For example, we have a Person p and we want to know the
name of the city where this person's spouse lives.
Example:
p.spouse.address.city.name
This query starts from a Person, gets his/her spouse,
a Person again, goes inside the complex attribute of type Address to get the
City object whose name is then accessed.
This example treated 1-1
relationship, let us now look at n-p relationships. Assume we want the names of
the children of the person p. We cannot write p.children.name because children
is a list of references, so the interpretation of the result of this query
would be undefined. Intuitively, the result should be a collection of names,
but we need an unambiguous notation to traverse such a multiple relationship
and we use the select-from-where clause to handle collections just as in SQL.
Example:
select c.name
from p.children c
from p.children c
The result of this query is a value of type Bag<String>. If we want to get a Set, we simply drop
duplicates, like in SQL by using the distinct keyword.
Example:
select distinct
c.name
from p.children c
from p.children c
Now we have a means to
navigate from an object to any object following any relationship and entering
any complex subvalues of an object. For instance, we want the set of addresses
of the children of each Person of the database. We know the collection named
Persons contains all the persons of the database. We have now to traverse two
collections: Persons and Person: :children. Like in SQL, the select-from
operator
allows us to query more than one collection. These
collections then appear in the from part. In OQL, a collection in the from part
can be derived from a previous one by following a path which starts from it.
select c.address
from Persons p, p.children c
from Persons p, p.children c
This query inspects all children of all persons. Its
result is a value whose type isBag<Address>.
Of course, the where
clause can be used to define any predicate which then serves to select only the
data matching the predicate. For example, we want to restrict the previous
result to the people living on Main Street, and having at least two children.
Moreover we are only interested in the addresses of the children who do not live
in the same city as their parents.
Example:
select c.address
from Persons p, p.children c
where p.address.street = "Main Street" and count(p.children) >= 2 and c.address.city != p.address.city
from Persons p, p.children c
where p.address.street = "Main Street" and count(p.children) >= 2 and c.address.city != p.address.city
In the from clause,
collections which are not directly related can also be declared. As in SQL,
this allows computation of joins between these collections.
This example selects the people who bear the name of a flower, assuming there
exists a set of all flowers called Flowers.
Example:
select p
from Persons p, Flowers f
where p.name = f.name
from Persons p, Flowers f
where p.name = f.name
OQL allows us to call a
method with or without parameters anywhere the result type of the method
matches the expected type in the query. The notation for calling a method is
exactly the same as for accessing an attribute or traversing a relationship, in
the case where the method has no parameter. If it has parameters, these are
given between parentheses. This flexible syntax frees the user from knowing
whether the property is stored (an attribute) or computed (a method, such as
age in the following example). This example returns the age of the oldest child
of the person "Paul".
Example:
select max(select
c.age from p.children c)
from Persons p
where p.name = "Paul"
from Persons p
where p.name = "Paul"
Of course, a method can
return a complex object or a collection and then its call can be embedded in a
complex path expression. For instance, if oldest_child is a method defined on
the class Person which returns an object of class Person, the following example
computes the set of street names where the oldest children of Parisian people
are living.
Example:
select
p.oldest_child.address.street
from Persons p
where p.lives_in("Paris")
from Persons p
where p.lives_in("Paris")
Although oldest_child is
a method we traverse it as if it were a relationship.
Moreover, lives_in is a method with one parameter.
A major contribution of
object orientation is the possibility of manipulating polymorphic collections,
and thanks to the late binding mechanism, to carry out generic
actions on the elements of these collections. For instance, the set Persons
contains objects of class Person, Employee, and Student. So far, all the
queries against the Persons extent dealt with the three possible objects of the
collection.
If one wants to restrict
a query on a subclass of Person, either the schema provides an extent for this
subclass which can then be queried directly, or else the superclass extent can
be filtered to select only the objects of the subclass, as shown in the class
indicator example.
A query is an expression
whose operators operate on typed operands. A query is correct if the types of
operands match those required by the operators. In this sense, OQL is a typed
query language. This is a necessary condition for an efficient query
optimizer. When a polymorphic collection is filtered
(for instance Persons), its elements are statically known to be of that class
(for instance Person). This means that a property of a subclass (attribute or
method) cannot be applied to such an element, except in two important cases:
late binding to a method or explicit class indication.
Give the activities of
each person.
Example:
select
p.activities
from Persons p
from Persons p
where activities is a method which has three
incarnations. Depending on the kind of person of the current p, the right
incarnation is called. If p is an Employee, OQL calls the operation activities
defined on this object, or else if p is a Student, OQL calls the operation
activities defined for Students, or else, p is a Person and OQL calls the
method activities of the type Person.
To go down the class
hierarchy, a user may explicitly declare the class of an object that cannot be
inferred statically. The evaluator then has to check at runtime that this
object actually belongs to the indicated class (or one of its subclasses). For
example, assuming we know that only Students spend their time in following a
course of study, we can select those Persons and get their grade. We explicitly
indicate in the query that these Persons are in Student:
Example:
select ((Student)p).
grade
from Persons p
where "course of study" in p.activities
from Persons p
where "course of study" in p.activities
OQL is a purely
functional language. All operators can be composed freely as long as the type
system is respected. This is why the language is so simple and its manual so
short. This philosophy is different from SQL, which is an ad-hoc language whose
composition rules are not orthogonal. Adopting a complete orthogonality allows
OQL to not restrict the power of expression and makes the language easier to
learn without losing the SQL syntax for the simplest queries. However, when
very specific SQL syntax does not enter in a pure functional category, OQL
accepts these SQL peculiar-ities as possible syntactical variations. This is
explained more specifically in Section 1.10.
Among the operators
offered by OQL but not yet introduced, we can mention the set operators (union,
intersect, except), the universal (for all) and existential quantifiers
(exists), the sort and group by operators, and the aggregation operators
(count, sum, min, max, and avg).
To illustrate this free
composition of operators, let us write a rather complex query. We want to know
the name of the street where employees live and have the smallest salary on
average, compared to employees living in other streets. We proceed by steps and
then do it as one query. We use OQL define instruction to evaluate temporary
results.
Example:
- Build the extent of class Employee (assuming that
it is not supported directly by the schema):
define Employees as
select (Employee)
p
from Persons p
where "has a job" in p.activities
from Persons p
where "has a job" in p.activities
- Group the employees by street and compute the
average salary in each street:
define salary_map as
select street,
average_salary:avg(select e.salary from partition)
from Employees e
group by street: e.address.street
from Employees e
group by street: e.address.street
The result is of
type Bag<struct(street: string, average_salary:float)>. The group by
operator splits the employees into partitions, according to the criterion (the
name of the street where this person lives). The select clause computes, in
each partition, the average of the salaries of the employees belonging to the
partition.
- Sort this set by salary:
define
sorted_salary_map as
select s
from salary_mnap s
order by s.average_salary
from salary_mnap s
order by s.average_salary
The result is now of
typeList<struct(street: string, average_salary:float)>.
- Now get the smallest salary (the first in the
list) and take the corresponding street name. This is the final result.
first(sortecLsalary_map).street
Example as a single
query:
first(
select street,
average_salary: avg(select e.salary from partition)
from (select (Employee) p from Persons p where "has a job" in p.activities ) as e
group by street: e.address.street
order by average_salary).street
from (select (Employee) p from Persons p where "has a job" in p.activities ) as e
group by street: e.address.street
order by average_salary).street
OQL is an expression language. A query expression is built from typed
operands composed recursively by operators. We will use the termexpression to
designate a valid query in this section.
The examples are based on the schema described in Chapter 3.
A query consists of a (possibly empty) set of query definition expressions
followed by an expression, which is evaluated as the query itself. The set of
query definition expressions is nonrecursive (although a query may call an
operation which issues a query recursively).
Example:
define jones as select
distinct x from Students x where x.name = "Jones";
select distinct student_id from jones
select distinct student_id from jones
This defines the set jones of students named Jones and gets the set of
their student_ids.
If q is a query name and e is a query expression, then
define q as e is a query definition expression which
defines the query with name q.
Example:
define Does as select
x from Student x where x.name ="Doe"
This statement defines Does as a query returning a bag containing
all the students whose name is Doe.
define Doe as
element(select x from Student x where x.name="Doe")
This statement defines Doe as a query which returns the student
whose name is Doe (if there is only one, otherwise an exception is raised).
If I is an atomic literal, then I is an expression whose value is the
literal itself. Literals have the usual syntax:
- Object Literal: nil
- Boolean Literal: false, true
- Integer Literal: sequence of digits, e.g. 27
- Float Literal: mantissa/exponent. The exponent
is optional, e.g., 3.14 or 314.16e-2
- Character Literal: character between single
quotes, e.g., 'z'
- String Literal: character string between double
quote, e.g.,"a string"
If e is a named object, then e is an expression. It defines the entity
attached to the name.
Example:
Students
This query defines the set of students. We have assumed here that there
exists a name Students corresponding to the extent of objects of the class
Student.
If x is a variable declared in a from part of a
select-from-where, then xis an expression whose value is the current
element of the iteration over the corresponding collection.
If define q as e is a query definition expression,
then q is an expression.
Example:
Doe
This query returns the student with name Doe. It refers to the query
definition expression declared in Section 1.9.2.
If t is a type name, p1, p2, ...,pn are
properties of t, and e1,e2, ...,en are
expressions, then t(p1: e1..., pn: en) is
an expression.
This defines a new object of type t whose properties p1,
p2, ...,pnare initialized with the expression e1,
e2, ...,e2. The type of ei must be
compatible with the type of pi.
If t is a type name of a collection and e is a
collection literal, then t(e)is a collection object. The type
of e must be compatible with t.
Examples:
Employee(name:
"Peter", boss: Chairman)
This creates a mutable Employee object.
vectint(set(1,3,10))
This creates a mutable set object (see the definition
of vectint in Section 1.4.1).
If p1, p2,.... pn are property
names, and e1, e2,..., en are
expressions, then struct(p1: e1, p2: e2,
..., pn:en) is an expression. It defines the
structure taking values e1, e2, ..., en on
the properties p1, p2, ..., pn.
Note that this dynamically creates an instance of the typestruct(p1:
t1, p2: t2, ..., pn: tn) if ti is
the type of ei.
Example:
struct(name:
"Peter", age: 25);
This returns a structure with two
attributes name and age taking respective
values Peter and 25.
See also abbreviated syntax for some contexts in Section 1.10.1.
If e1, e2, ..., en are
expressions, then set(e1, e2,.... en) is
an expression. It defines the set containing the elements e1, e2,
..., en. It creates a set instance.
Example:
set(1,2,3)
This returns a set consisting of the three elements 1, 2,
and 3.
If e1, e2, ..., en are
expressions, then list(e1, e2, ..., en)or
simply (e1, e2, ..., en) are
expressions. They define the list having elements e1, e2,
..., en. They create a list instance.
If min, max are two expressions of integer or character
types, such that min < max, then list(min .. max) or
simply (min .. max) is an expression of value: list(min,
min+1,... max-1, max)
Example:
list(1,2,2,3)
This returns a list of four elements.
Example:
list(3 .. 5)
This returns the list(3,4,5)
If e1, e2, ..., en are
expressions, then bag(e1, e2, ...,en) is
an expression. It defines the bag having elements e1, e2,
..., en. It creates a bag instance.
Example:
bag(1,1,2,3,3)
This returns a bag of five elements.
If e1, e2, ..., en are
expressions, thenarray(e1, e2, ..., en) is
an expression. It defines an array having elements e1, e2,...,
en. It creates an array instance.
Example:
array(3,4,2,1,1)
This returns an array of five elements.
If e is an expression and <op> is a unary
operation valid for the type ofe, then <op> e is an expression.
It defines the result of applying <op> toe.
Arithmetic
unary
|
+, -, abs
|
Boolean
unary operator:
|
not
|
Example:
not true
This returns false.
This returns false.
1.9.5.2 Binary
Expressions
If e1 and e2 are expressions
and <op> is a binary operation, thene1<op>e2 is
an expression. It defines the result of applying <op> to e1and e2.
Arithmetic
integer binary operators:
|
+, -, *,
/, mod (modulo)
|
Floating
point binary operators:
|
+, -, *, /
|
Relational
binary operators:
|
=, !=,
<, <=, >,>=
|
These
operators are defined on all atomic types.
|
|
Boolean
binary operators:
|
and, or
|
Example:
count(Students) -
count(TA)
This returns the difference between the number of students and the number
of TAs.
If s1 and s2 are expressions of
type string, then s1 || s2, ands1 +
s2 are equivalent expressions of type string whose value is the
concatenation of the two strings.
If c is an expression of type character, and s an
expression of type string, then c in s is an expression of type
boolean whose value is true if the character belongs to the string, else false.
If s is an expression of type string, and i is an
expression of type integer, then s[i] is an expression of type
character whose value is thei+lth character of the string.
If s is an expression of type string,
and low and up are expressions of type integer,
then s[low:up] is an expression of type string whose value is the
substring of s from the low+lth character up to
the up+lthcharacter.
If s is an expression of type string, and pattern a
string literal which may include the wildcard characters: "?" or
"_", meaning any character, and "*" or "%",
meaning any substring including an empty substring, thens like pattern is
an expression of type boolean whose value is true ifs matches
the pattern, else false.
Example:
"a nice
string" like "%nice%str_ng"
1.9.6
Object Expressions
If e1 and e2 are expressions
which denote mutable objects (objects with identity) of the same type,
then e1 = e2 and e1 !=
e2 are expressions which return a boolean. The second
expression is equivalent to not(e1 = e2).
Likewise e1 = e2 is true if they designate
the same object.
Example:
Doe = element(select s
from Students s where s.name = "Doe")
is true.
If e1> and e2 are expressions
which denote immutable objects (literals) of the same type, then e1 =
e2 and e1 != e2 are
expressions which return a boolean. The second expression is equivalent
to not(e1 = e2). Likewise, e1 =
e2 is true if the value e1 is equal to the
value e2.
If e is an expression, if p is a property name,
then e->p and e.p are expressions. These are alternate syntax
to extract property p of an object e.
If e happens to designate a deleted or a nonexisting object,
i.e., nil, an attempt to access the attribute or to traverse the
relationship raises an exception. However, a query may test explicitly if an
object is different from nil before accessing a property.
Example:
Doe.name
This returns Doe.
Example:
Doe->spouse != nil
and Doe->spouse->name = "Carol"
This returns true, if Doe has a spouse whose name is Carol, or
else false.
If e is an expression and f is an operation name,
then e->f and e.fare expressions. These are alternate syntax
to apply an operation on an object. The value of the expression is the one
returned by the operation or else the object nil, if the operation returns
nothing.
Example:
jones->number_of_students
This applies the operation number_of_students to jones.
If e is an expression, if e1, e2, ...,
en are expressions and f is an operation name,
then e->f(e1, e2, ..., en) ande.f(e1,
e2, ..., en) are expressions that apply
operation f with parameters e1, e2, ..., en to
object e. The value of the expression is the one returned by the operation
or else the object nil, if the operation returns nothing.
In both cases, if e happens to designate a deleted or a
nonexisting object, i.e., nil, an attempt to apply an operation to it
raises an exception. However, a query may test explicitly if an object is
different from nil before applying an operation.
Example:
Doe->apply_course("Math",
Turing)->number
This query calls the operation apply_course on
class Student for the object Doe. It passes two parameters, a
string and an object of classProfessor. The operation returns an object of
type Course and the query returns the number of this course.
If e is an expression which denotes an object with identity (a
mutable object), then *e is an expression which delivers the value of
the object (a literal). Given two variables of type Person, p1 and p2,
the predicatep1 = p2 is true if both variables refer to the same object,
while*p1 = *p2 is true if the objects have the same values, even if they
are not the same object.
If x is a variable name, e1 and e2 are
expressions, e1 denotes a collection, and e2 a
predicate, then for all x in e1: e2 is an
expression. It returns true if all the elements of collection e1 satisfy e2 and
false otherwise.
Example:
for all x in Students:
x.student_id > 0
This returns true if all the objects in the Students set have a
positive value for their student_id attribute. Otherwise it returns
false.
If x is a variable name, if e1 and e2 are
expressions, e1 denotes a collection, and e2 a
predicate, then exists x in e1: e2 is an
expression. It returns true if there is at least one element of
collection e1that satisfies e2 and false
otherwise.
Example:
exists x in Doe.takes:
x.taught_by.name = "Turing"
This returns true if at least one course Doe takes is taught by
someone named Turing.
If e is a collection expression,
then exists(e) and unique(e) are expressions which return a
boolean value. The first one returns true if there exists at least one element
in the collection, while the second one returns true if there exists only one
element in the collection.
Note that these operators accept the SQL syntax for nested queries like
select... from col
where exists (select... from col1 where predicate)
The nested query returns a bag to which the operator exists is applied.
This is of course the task of an optimizer to recognize that it is useless to
compute effectively the intermediate bag result.
If e1 and e2 are
expressions, e2 is a collection, and e1 has
the type of its elements, then e1 in e2 is
an expression. It returns true if element e1belongs to
collection e2.
Example:
Doe in Does
This returns true.
Doe in TA
This returns true if Doe is a Teaching Assistant.
If e is an expression which denotes a collection,
if <op> is an operator from {min, max, count, sum, avg},
then <op>(e) is an expression.
Example:
max (select salary
from Professors)
This returns the maximum salary of the Professors.
1.9.8
Select From Where
If e, e', e1, e2, ..., en are
expressions which denote collections, and x1, x2,...,
xn are variable names if e' is an expression of type
boolean, and if projection is an expression or the character
*, then
select projection from
e1 as x1, e2 as x2, ...,
en as xn where e'
and
select distinct projection from
e1 as x1, e2 as x2, ...,
en as xn where e'
are expressions.
The result of the query is a set for a select distinct or a bag
for aselect.
If you assume e1, e2, ..., en are
all set and bag expressions, then the result is obtained as follows: take the
Cartesian product 1 of
the setse1, e2,..., en; filter that product by
expression e' (i.e., eliminate from the result all objects that do
not satisfy boolean expression e'); apply the projection to
each one of the elements of this filtered set and get the result. When the
result is a set (distinct case) duplicates are automatically eliminated.
The situation where one or more of the collections e1, e2,...,
en is an indexed collection is a little more complex. The
select operator first converts all these collections into bags and applies the
previous operation. The result is a set (distinct case) or else a bag. So,
in this case, we simply transform each of the ei's into a bag and
apply the previous definition.
1. The Cartesian product between a set and
a bag is defined by first converting the set into a bag, and then get the
resulting bag, which is the Cartesian product of the two bags.
Before the projection, the result of the iteration over the from variables
is of type
bag< struct(x1:
type_of(e1 elements), ..., xn: type_of(en elements))
>
The projection is constructed by an expression which can then refer
implicitly to the current element of this bag, using the
variables xi. If forei neither explicit nor
implicit variable is declared, then xi is given an
internal system name (which is not accessible by the query anyway).
By convention, if the projection is simply "*", then the result
of the select is the same as the result of the iteration.
If the projection is "distinct *", the result of the select is
this bag converted into a set. In all other cases, the projection is explicitly
computed by the given expression.
Example:
select couple(student:
x.name, professor: z.name)
from Students as x,
x.takes as y,
y.taught_by as z
y.taught_by as z
where z.rank =
"full professor"
This returns a bag of objects of type couple giving student names and the
names of the full professors from which they take classes.
Example:
select *
from Students as x,
x.takes as y,
y.taught_by as z
y.taught_by as z
where z.rank =
"full professor"
This returns a bag of structures, giving for each student
"object", the section object followed by the student and the full
professor "object" teaching in this section:
bag< struct(x:
Student, y: Section, z: Professor) >
A variable, xi declared in the from part
ranges over the collection ei, and thus has the type of the
elements of this collection. Such a variable can be used in any other part of
the query to evaluate any other expressions (see the Scope Rules in Section 1.9.15).
Syntactical variations are accepted for declaring these variables, exactly as
with SQL. The as keyword may be omitted. Moreover, the
variable itself can be omitted too, and in this case, the name of the
collection itself serves as a variable name to range over it.
Example:
select couple(student:
Students.name, professor: z.name)
from Students,
Students.takes y,
y.taught_by z
y.taught_by z
where z.rank =
"full professor"
In a select-from-where query, the where clause can be
omitted, with the meaning of a true predicate.
If select_query is a select-from-where query, partition_attributes is
a structure expression and predicate a boolean expression,
then
select_query group by
partition_attributes
is an expression and
select_query group by
partition_attributes having predicate
is an expression.
The Cartesian product visited by the select operator is split into
partitions. For each element of the Cartesian product, the partition attributes
are evaluated. All elements which match the same values according to the given
partition attributes, belong to the same partition. Thus the partitioned set,
after the grouping operation, is a set of structures: each structure has the
valued properties for this partition (the valued partition_attributes),
completed by a property which is conventionally called partition and
which is the bag of all objects matching this particular valued partition.
If the partition attributes are att1: e1, att2:
e2, ..., attn: en, then the result of the
grouping is of type
set< struct(att1:
type_of(e1), att2: type_of(e2), ..., attn:
type_of(en), partition: bag< type_of(grouped elements) >)>
The type of grouped elements is defined as follows.
If the from clause declares the variables v1 on
collection col1, v2 oncol2,
..., vn on coln, the grouped elements is a
structure with one attribute, vk, for each collection having
the type of the elements of the corresponding collection partition:
bag< struct(v1:
type_of(col1 elements), ..., vn: type_of(coln elements))>
If a collection colk has no variable declared the
corresponding attribute has an internal system name.
This partitioned set may then be filtered by the predicate of a havingclause.
Finally, the result is computed by evaluating the select clause
for this partitioned and filtered set.
The having clause can thus apply aggregate functions on partition;likewise
the select clause can refer to partition to compute the final
result. Both clauses can refer also to the partition attributes.
Example:
select *
from Employees e
group by low: salary < 1000,
from Employees e
group by low: salary < 1000,
medium: salary >=
1000 and salary < 10000,
high: salary >= 10000
high: salary >= 10000
This gives a set of three elements, each of which has a property calledpartition which
contains the bag of employees that enter in this category. So the type of the
result is
set<struct(low:
boolean, medium: boolean, high: boolean, partition: bag<struct(e:
Employee)>)>
The second form enhances the first one with a having clause
which enables you to filter the result using aggregative functions which
operate on each partition.
Example:
select department,
avg_salary: avg(select e.salary from partition)
from Employees e
group by department: e.deptno
having avg(select e.salary from partition) > 30000
from Employees e
group by department: e.deptno
having avg(select e.salary from partition) > 30000
This gives a set of couples: department and average of the salaries of the
employees working in this department, when this average is more than 30000. So
the type of the result is
bag<struct(department:
integer, avg_salary: float)>
Note that to compute the average salary, we have used a shortcut notation
allowed by the Scope Rules defined in Section 4.9.15. The fully developed
notation would read
avg_salary: avg(select
x.e.salary from partition x)
If select_query is a select-from-where or a
select-from-where-group_by query, and if e1, e2,....
en are expressions, thenselect_query order by e1,
e2,..., en is an expression. It returns a list of
the selected elements sorted by the functions e1, and inside
each subset yielding the same e1, sorted by e2,
..., and the final subsub...set, sorted by en.
Example:
select p from Persons
p order by p.age, p.name
This sorts the set of persons on their age, then on their name and puts the
sorted objects into the result as a list.
Each sort expression criterion can be followed by the keyword asc ordesc,
specifying respectively an ascending or descending order. The default order is
that of the previous declaration. For the first expression, the default is
ascending.
Example:
select * from Persons
order by age desc, name asc, department
If e1 and e2 are
expressions, e1 is a list or an array, e2 is
an integer, then e1[e2] is an expression. This
extracts the e2+1 element of the indexed collection e1.
Notice that the first element has the rank 0.
Example:
list(a,b,c,d)[1]
This returns b.
Example:
element (select x
from Courses x
where x.name = "Math" and x.number = "101").requires[2]
where x.name = "Math" and x.number = "101").requires[2]
This returns the third prerequisite of Math 101.
1.9.11.2 Extracting a
Subcollection of an Indexed Collection
If e1, e2, and e3 are
expressions, e1 is a list or an array, and e2 and e3are
integers, then e1[e2:e3] is an
expression. This extracts the subcollection of e1 starting
at position e2 and ending at position e3.
Example:
list(a,b,c,d)[1:3]
This returns list(b,c,d).
Example:
element (select
x
from Courses x
where x.name="Math" and x.number="101").requires[0:2]
where x.name="Math" and x.number="101").requires[0:2]
This returns the list consisting of the first three prerequisites of Math
101.
If e is an expression, <op> is an operator
from {first, last}, ande is a list or an array,
then <op>(e) is an expression. This extracts the first and last
element of a collection.
Example:
first(element(select x
from Courses x
where x.name="Math" and x.number="101").requires)
where x.name="Math" and x.number="101").requires)
This returns the first prerequisite of Math 101.
If e1 and e2 are expressions
and e1 and e2 are both lists or both
arrays, then e1+e2 is an expression. This
computes the concatenation ofe1 and e2.
Example:
list(1,2)+list(2,3)
This query
generates list(1,2,2,3).
If e1 and e2 are expressions,
if <op> is an operator from {union, except, intersect},
if e1 and e2 are sets or bags,
then e1 <op> e2is an expression. This
computes set theoretic operations, union, difference, and intersection
on e1 and e2, as defined in Chapter 2.
When the operand's collection types are different (bag and set), the set is
first converted into a bag and the result is a bag.
Examples:
Student except TA
This returns the set of students who are not Teaching Assistants.
bag(2,2,3,3,3) union
bag(2,3,3,3)
This bag expression returns bag(2,2,3,3,3,2,3,3,3).
bag(2,2,3,3,3)
intersect bag(2,3,3,3)
The intersection of two bags yields a bag that contains the minimum for
each of the multiple values. So the result is bag(2,3,3,3).
bag(2,2,3,3,3) except
bag(2,3,3,3)
This bag expression returns bag(2).
If e1 and e2 are expressions
which denote sets or bag and if <op> is an operator
from {<, <=, >, >=}, then e1 <op> e2 is
an expression whose value is a boolean.
When the operands are different kinds of collections (bag and set), the set
is first converted into a bag.
e1 <
e2 is true if e1 is included into e2 but
not equal to e2
e1 <= e2 is true if e1 is included into e2
e1 <= e2 is true if e1 is included into e2
Example:
set(1,2,3) <
set(3,4,2,1)
is true.
If e is a collection-valued expression, element(e) is
an expression. This takes the singleton e and returns its element.
If e is not a singleton this raises an exception.
Example:
element(select x from
Professors x where x.name = "Turing")
This returns the professor whose name is Turing (if there is only one).
If e is a list expression, listtoset(e) is an
expression. This converts the list into a set, by forming the set containing
all the elements of the list.
Example:
listtoset(list(
1,2,3,2))
This returns the set containing 1, 2, and 3.
If e is an expression whose value is a collection,
then distinct(e) is an expression whose value is the same collection
after removing the duplicated elements. If e is a
bag, distinct(e) is a set. If e is an ordered collection,
the relative ordering of the remaining elements is preserved.
1.9.13.4 Flattening
Collection of Collections
If e is a collection-valued expression, flatten(e) is
an expression. This converts a collection of collections of t into a
collection of t. So flattening operates at the first level only.
Assuming the type of e to be col1<col2<t>>,
the result offlatten(e) is:
- If col2 is a set (resp. a
bag), the union of all col2<t> is done and the
result is a set<t> (resp. bag<t>).
- If col2 is a list or an
array and col1 is a list or an array, the
concatenation of all col2<t> is done following
the order in col1 and the result is col2<t>,
which is thus a list or an array. Of course duplicates, if any, are
maintained by this operation.
- If col2 is a list or an
array and col1 is a set (resp. a bag), the lists or
arrays are converted into sets (resp. bags), the union of all these sets
(resp. bags) is done and the result is
a set<t> (resp. bag<t>).
Examples:
flatten(list(set(1,2,3),
set(3,4,5,6), set(7)))
This returns the set containing 1,2,3,4,5,6,7.
flatten(list(list(1,2),
list(1,2,3)))
This returns list(1,2,1,2,3).
flatten(set(list(1,2),
list(1,2,3)))
This returns the set containing 1,2,3.
1.9.13.5 Typing an
Expression
If e is an expression and c is a type name,
then (c)e is an expression. This asserts that e is an
object of class type c.
If it turns out that it is not true, an exception is raised at runtime.
This is useful to access a property of an object which is statically known to
be of a superclass of the specified class.
Example:
select ((Employee)
s).salary
from Students s
where s in (select sec.assistant from Sections sec)
from Students s
where s in (select sec.assistant from Sections sec)
This returns the set of salaries of all students who are teaching
assistants, assuming that Students and Sections are the
extents of the classes Student and Section.
If f is a function name, if e1 , e2,
..., en are expressions, thenf() and f(e1,
e2, ..., e2) are expressions whose value is the
value returned by the function, or the object nil, when the function does
not return any value. The first form calls a function without a parameter,
while the second one calls a function with the parameters e1, e2,
..., en.
OQL does not define in which language the body of such a function is
written. This allows one to extend the functionality of OQL without changing
the language.
The from part of a select-from-where query introduces
explicit or implicit variables to range over the filtered collections. An
example of an explicit variable is
select... from Persons
p ... while
an implicit declaration
would be
select... from Persons
...
The scope of these variables spreads over all the parts of the
select-from-where expression including nested sub-expressions.
The group by part of a select-from-where-group_by query
introduces the name partition along with possible explicit
attribute names which characterize the partition. These names are visible in
the correspondinghaving and select parts,
including nested sub-expressions within these parts.
Inside a scope, you use these variable names to construct path expressions
and reach properties (attributes and operations) when these variables denote
complex objects. For instance, in the scope of the first from clause above, you
access the age of a person by p.age.
When the variable is implicit, like in the second from clause, you directly
use the name of the collection by Persons.age.
However, when no ambiguity exists, you can use the property name directly
as a shortcut, without using the variable name to open the scope (this is made
implicitly), writing simply: age. There is no ambiguity when a property
name is defined for one and only one object denoted by a visible variable.
To summarize, a name appearing in a (nested) query is looked up as follows:
- a variable in the current scope, or
- a named query introduced by the define clause,
or
- a named object, i.e., an entry point in the
database, or
- an attribute name or an operation name of a
variable in the current scope, when there is no ambiguity, i.e., this
property name belongs to only one variable in the scope.
Example:
Assuming that in the current schema the
names Persons and Citiesare defined.
select scope1
from Persons,
from Persons,
Cities c
where
exists(select scope2 from children as child) or
count (select scope3,
(select scope4 from partition)
from children p,
scope5 v
group by age: scope6
)
)
In scope1, we see the names: Persons, c, Cities,
all property names of class Person and class City as long
as they are not present in both classes, and they are not called
"Persons", "c", nor "Cities".
In scope2, we see the names: child, Persons, c, Cities,
the property names of the class City which are not property of the
classPerson. No attribute of the class Person can be accessed
directly since they are ambiguous between "child" and
"Persons".
In scope3, we see the names: age, partition, and the
same names from scope1, except "age" and
"partition", if they exist.
In scope4, we see the
names: age, partition, p, v, and the same names from scope1,
except "age", "partition", "p" and "v",
if they exist.
In scope5, we see the names: p, and the same names
from scope1, except "p", if it exists.
In scope6, we see the
names: p, v, Persons, c, Cities, the property names of
the class City which are not property of the classPerson. No
attribute of the class Person can be accessed directly since they are
ambiguous between "child" and "Persons".
OQL defines an orthogonal expression language, in the sense that all
operators can be composed with each other as long as the types of the operands
are correct. To achieve this property, we have defined a functional language
with simple operators such as "+" or composite operators such as
"select from where", "group_by", and "order_by"
which always deliver a result in the same type system and which thus can be
recursively operated with other operations in the same query.
In order to accept the whole DML query part of SQL, as a valid syntax for
OQL, we have added ad-hoc constructions each time SQL introduces a syntax which
cannot be considered in the category of true operators. This section gives the
list of these constructions that we call "abbreviations," since they
are completely equivalent to a functional OQL expression. At the same time, we
give the semantics of these constructions, since all operators used for this
description have been previously defined.
The structure constructor has been introduced in Section 4.9.4.2. An
alternate syntax is allowed in two contexts: select clause and group-by clause.
In both contexts, the SQL syntax is accepted, along with the one already
defined.
select projection{, projection}
...
select ... group by projection {, projection}
select ... group by projection {, projection}
where projection is
one of the forms:
- expression as identifier
- identifier:
expression
- expression
This is an alternate syntax for
struct(identifier:
expression {, identifier: expression})
If there is only one projection and the syntax (3) is used
in a select clause, then it is not interpreted as a structure construction but
rather the expression stands as is. Furthermore, a (3) expression is only valid
if it is possible to infer the name of the corresponding attribute (the
identifier). This requires that the expression denotes a path expression
(possibly of length one) ending by a property whose name is then chosen as the
identifier.
Example:
select p.name, salary,
student_id
from Professors p, p.teaches
from Professors p, p.teaches
This query returns a bag of structures:
bag<struct(name:
string, salary: float, studentJd: integer)>
These operators have been introduced in Section 4.9.7.4. SQL adopts a
notation which is not functional for them. So OQL accepts this syntax, too. If
we define aggregate as one of min, max, count, sum and avg.
select count(*) from ...
|
is
equivalent to
|
count(select * from
...)
|
select aggregate(query)
from ...
|
is
equivalent to
|
aggregate(select query from
...)
|
select aggregate(distinct
query) from ...
|
is
equivalent to
|
aggregate (distinct(select
query from ...)
|
If e1 and e2 are
expressions, e2 is a collection, e1 has
the type of its elements, if relation is a relational operator
(=, !=, <, <=, > , >=), then e1 relation some
e2 and e1 relation any e2 ande1 relation all
e2 are expressions whose value is a boolean.
The two first predicates are equivalent to
exists x in eg:
ei relation x
The last predicate is equivalent to
for all x in e2:
e1 relation x
Example:
10 < some (8,15, 7,
22)
1.10.4
String Literal
OQL accepts single quotes as well to delineate a string (see Section
4.9.3.1), like SQL does. This introduces an ambiguity for a string with one
character which then has the same syntax as a character literal. This ambiguity
is solved by context.
The OQL grammar is given using a rather informal BNF notation.
- { symbol } is a sequence of 0 or n symbol(s).
- [symbol] is an optional symbol. Do
not confuse this with the separators [].
- keyword is a terminal of the grammar.
- xxx_name is the syntax of an identifier.
- xxx_literal is self-explanatory, e.g., "a
string" is a string_literal.
- bind_argument stands for a parameter when
embedded in a programming language, e.g., $3i.
The non-terminal query stands for a valid query expression. The grammar is
presented as recursive rules producing valid queries. This explains why most of
the time this nonterminal appears on the left side of ::=. Of course, each
operator expects its "query" operands to be of the right types. These
type constraints have been introduced in the previous sections.
These rules must be completed by the priority of OQL operators which is
given after the grammar. Some syntactical ambiguities are solved semantically
from the types of the operands.
query_program ::= {define_query;} query
define_query ::=define identifier as query
query ::= nil
query ::= true
query ::= false
query ::= integer_literal
query ::= float_literal
query ::= character_literal
query ::= string_literal
query ::= entry_name
query ::= query_name
query ::= from_variable_name
query ::= (query)
1.11.1.3 Simple
Expression (see 1.9.5)
query ::= query + query3
query ::= query - query
query ::= query * query
query ::= query / query
query ::= - query
query ::= query mod query
query ::= abs(query)
query ::= query || query
1.11.1.4 Comparison (see
1.9.5)
query ::= query comparison_operator query
query ::= query like string_literal
comparison_operator ::= =
comparison_operator ::= !=
comparison_operator ::= >
comparison_operator ::= <
comparison_operator ::= >=
comparison_operator ::= <=
query ::= not query
query ::= query and query
query ::= query or query
query ::= type_name([query])
query ::= type_name (identifier: query {,identifier: query})
query ::= struct (identifier: query {,identifier: query})
query ::= set ([query {, query}])
query ::= bag ([query {, query}])
query ::= list ([query {, query}])
query ::= (query, query {, query})
query ::= [list](query .. query)
query ::= array ([query {,query}])
query ::= query dot attribute_name
query ::= query dot relationship_name
query ::= query dot operation_name(query {,query})
dot ::= . | ->
query ::= * query
query ::= query [query]
query ::= query [query:query]
query ::= first(query)
query ::= last(query)
query ::= function_name([query {,query}])
query ::= for all identifier in query: query
query ::= exists identifier in query: query
query ::= exists(query)
query ::= unique(query)
query ::= query in query
query ::= query comparison_operator quantifier query
quantifier ::= some
quantifier ::= any
quantifier ::= all
query ::= count(query)
query ::= count(*)
query ::= sum(query)
query ::= min(query)
query ::= max(query)
query ::= avg(query)
1.11.1.9 Select
Expression (see 1.9.8,1.9.9,1.9.10)
query ::=
|
select [ distinct]
projection_attributes
from variable_declaration {,
variable_declaration}
[where query]
[group by partition_atlributes]
[having query]
[order by sort_criterion {, sort_criterion}]
|
projection_attributes ::= projection {, projection}
proiection_attributes ::= *
projection ::= query
projection ::= identifier: query
projection ::= query as identifier
variable_declaration ::= query [[ as] identifier]
partition_attributes ::= projection {, projection}
sort_criterion ::= query [ordering]
ordering ::= asc
ordering ::= desc
query ::= query intersect query
query ::= query union query
query ::= query except query
query ::= listtoset(query)
query ::= element(query)
query ::= distinct(e)
query ::= flatten(query)
query ::= (class_name) query
The following operators are sorted by decreasing priority. Operators on the
same line have the same priority and group left-to-right.
() [] . ->
not -(unary) + (unary)
in
* / mod intersect
+ - union except ||
< > <= >= <some <any <all
= != like
and exists for all
or
.. :
,
(identifier) this is the cast operator
order
having
group by
where
from
select
Comments
Post a Comment