EdgeQL is the primary language of EdgeDB. It is used to define, mutate, and query data.
EdgeQL input consists of a sequence of commands, and the database returns a specific response to each command in sequence.
For example, the following EdgeQL SELECT command would return a set of all User objects with the value of the name property equal to "John".
SELECT User FILTER User.name = 'John';
EdgeQL is a strongly typed language. Every value in EdgeQL has a type, which is determined statically from the database schema and the expression that defines that value. Refer to Data Model for details about the type system.
Every value in EdgeQL is viewed as a set of elements. A set may be empty (empty set), contain a single element (a singleton), or contain multiple elements. Strictly speaking, EdgeQL sets are multisets, as they do not require the elements to be unique.
A set cannot contain elements of different base types. Mixing objects and primitive types, as well as primitive types with a different base type, is not allowed.
In SQL databases, NULL is a special value denoting an absence of data. EdgeDB works with sets, so an absence of data is just an empty set.
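The set model described above can be illustrated with a small Python sketch. This is only a mental model, not EdgeQL and not how EdgeDB is implemented: a set is represented as a plain list (so duplicates are kept), and missing data is an empty list rather than a NULL-like marker. The names `users` and `select_names` and the sample data are hypothetical.

```python
# Illustrative model only: an EdgeQL set is represented as a Python list,
# which keeps duplicates (multiset semantics). Element order carries no meaning.
from collections import Counter

# Hypothetical data: the second user has no name at all.
users = [
    {"id": 1, "name": ["John"]},  # singleton set
    {"id": 2, "name": []},        # empty set, not a NULL marker
    {"id": 3, "name": ["John"]},
]


def select_names(objects):
    """Rough analogue of `SELECT User.name`: flatten the per-object name sets."""
    return [name for obj in objects for name in obj["name"]]


names = select_names(users)   # duplicates are preserved
name_counts = Counter(names)  # how many times each element occurs
```

In this model the user without a name simply contributes nothing to the result, so no special NULL handling is needed.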
A set reference is a name (a simple identifier or a qualified schema name) that represents a set of values. It can be the name of an object type or an expression alias (defined in a statement WITH block or in the schema via an alias declaration).
For example, a reference to the User object type in the following query will resolve to a set of all User objects:
SELECT User;
Note that, unlike SQL, no explicit FROM clause is needed.
A set reference can be an expression alias:
WITH odd_numbers := {1, 3, 5, 7, 9}
SELECT odd_numbers;
See with block for more information on expression aliases.
A path expression (or simply a path) is an expression followed by a sequence of dot-separated link or property traversal specifications. It represents a set of values reachable from the source set. See Paths for more information on path syntax and behavior.
A simple path is a path which begins with a set reference.
In EdgeQL a name can either be fully-qualified, i.e. of the form module_name::entity_name, or in short form of just entity_name (for more details see Names and keywords). Any short name is ultimately resolved to some fully-qualified name in the following manner:
1. Look for a match to the short name in the current module (typically default, but it can be changed).
2. Look for a match to the short name in the std module.
Normally the current module is called default, which is automatically created in any new database. It is possible to override the current module globally on the session level with a SET MODULE my_module command. It is also possible to override the current module on a per-query basis using a WITH MODULE my_module clause.
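The lookup order for short names can be sketched in Python as follows. This is a simplified model under stated assumptions: the schema is reduced to a set of fully-qualified names, and `SCHEMA`, `resolve_name`, and the `Invoice` type are hypothetical names introduced only for illustration.

```python
# Hypothetical schema, reduced to a set of fully-qualified names.
SCHEMA = {"default::User", "std::str", "std::count", "my_module::Invoice"}


def resolve_name(name, current_module="default", schema=SCHEMA):
    """Resolve a short or fully-qualified name; return None if nothing matches."""
    if "::" in name:
        # Already fully-qualified: no module lookup is needed.
        return name if name in schema else None
    # First the current module, then the std module.
    for module in (current_module, "std"):
        candidate = f"{module}::{name}"
        if candidate in schema:
            return candidate
    return None


in_default = resolve_name("User")    # found in the current module
in_std = resolve_name("count")       # falls back to std
not_found = resolve_name("Invoice")  # not in default or std
overridden = resolve_name("Invoice", current_module="my_module")
```

Passing a different `current_module` plays the role of SET MODULE or WITH MODULE in this sketch.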
A function parameter or an operand of an operator can be declared as an aggregate parameter. An aggregate parameter means that the function or operator is called once on the entire set passed as the corresponding argument, rather than being called sequentially on each element of the argument set. A function or an operator with an aggregate parameter is called an aggregate. Non-aggregate functions and operators are regular functions and operators.
For example, basic arithmetic operators are regular operators, while the sum() function and the DISTINCT operator are aggregates.
An aggregate parameter is specified using the SET OF modifier in the function or operator declaration. See CREATE FUNCTION for details.
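The difference between regular and aggregate calls can be modelled in Python as shown below. This is a sketch of the semantics only, with sets represented as lists; `apply_regular` and `apply_aggregate` are hypothetical helper names, and the aggregate helper assumes the function returns a single value, as sum() does.

```python
from itertools import product


def apply_regular(func, *arg_sets):
    """Regular call: invoke `func` once per combination of argument elements."""
    return [func(*combo) for combo in product(*arg_sets)]


def apply_aggregate(func, arg_set):
    """Aggregate call: invoke `func` exactly once with the whole set.

    Assumes `func` returns one value, which is wrapped as a singleton set.
    """
    return [func(arg_set)]


# A regular operator such as `+` runs for every pair of elements.
plus = apply_regular(lambda a, b: a + b, [1, 2], [10, 20])
# An aggregate such as sum() sees the entire set at once.
total = apply_aggregate(sum, [1, 2, 3])
# With an empty argument the regular call never happens...
empty_plus = apply_regular(lambda a, b: a + b, [1, 2], [])
# ...while the aggregate is still called once, on the empty set.
empty_total = apply_aggregate(sum, [])
```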
Normally, if a non-aggregate argument of a function or an operator is empty, then the function will not be called and the result will be empty.
A function parameter or an operand of an operator can be declared as OPTIONAL, in which case the function is called normally when the corresponding argument is empty.
A notable example of a function that gets called on empty input is the coalescing operator.
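As a rough Python model of OPTIONAL parameters, the sketch below passes a single `None` placeholder for an empty OPTIONAL argument so that the function is still called, while an empty required argument produces an empty result. The helper names `call_regular` and `coalesce` are hypothetical, and the coalescing example is a simplification in which the right operand is treated as an ordinary argument.

```python
from itertools import product


def call_regular(func, arg_sets, optional=frozenset()):
    """Call `func` over the Cartesian product of its argument sets.

    `optional` holds the positions of OPTIONAL parameters. An empty OPTIONAL
    argument contributes one `None` placeholder, so it does not empty the
    product and the function is still invoked.
    """
    axes = []
    for index, arg in enumerate(arg_sets):
        if not arg and index in optional:
            axes.append([None])
        else:
            axes.append(arg)  # an empty required argument yields no calls
    return [func(*combo) for combo in product(*axes)]


def coalesce(value, default):
    """Simplified model of coalescing: use `default` when `value` is absent."""
    return default if value is None else value


skipped = call_regular(str.upper, [[]])  # required and empty: no call
fallback = call_regular(coalesce, [[], ["unknown"]], optional={0})
present = call_regular(coalesce, [["John"], ["unknown"]], optional={0})
```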
EdgeQL is a functional language in the sense that every expression is a composition of one or more queries.
Queries can be explicit, such as a SELECT statement, or implicit, as dictated by the semantics of a function, operator or a statement clause.
An implicit SELECT subquery is assumed in the following situations:
expressions passed as an argument for an aggregate function parameter or operand;
the right side of the assignment operator (:=) in expression aliases and shape element declarations;
the majority of statement clauses.
A nested query is called a subquery. Here, the phrase “appearing directly in the query” means “appearing directly in the query rather than in the subqueries”.
A query is evaluated recursively using the following procedure:
Make a list of simple paths appearing directly in the query. For every path in the list, find all paths which begin with the same set reference and treat their longest common prefix as an equivalent set reference.
Example:
SELECT (
    User.firstname,
    User.friends.firstname,
    User.friends.lastname,
    Issue.priority.name,
    Issue.number,
    Status.name
);
In the above query, the longest common prefixes are: User, User.friends, Issue, and Status.name.
Make a query input list of all unique set references which appear directly in the query (including the common path prefixes identified above). The set references and path prefixes in this list are called input set references, and the sets they represent are called input sets. Order this list such that any input set reference comes before any other input set reference for which it is a prefix (sorting lexicographically works).
Compute a set of input tuples.
    Begin with a set containing a single empty tuple.
    For each input set reference, we compute a dependent Cartesian product of the input tuple set (X) so far and the input set Y being considered. In this dependent product, we pair each tuple x in the input tuple set X with each element of the subset of the input set Y corresponding to the tuple x. (For instance, in the example above, computing the dependent product of User and User.friends would pair each user with all of their friends.) (Mathematically, X' = {(x, y) | x ∈ X, y ∈ f(x)}, where f(x) selects the appropriate subset.)
    The set produced becomes the new input tuple set and we continue down the list.
    As a caveat to the above, if an input set appears exclusively as an OPTIONAL argument, it produces pairs with a placeholder value Missing instead of an empty Cartesian product in the above set. (Mathematically, this corresponds to having f(x) = {Missing} whenever it would otherwise produce an empty set.)
Iterate over the set of input tuples, and on every iteration:
    in the query and its subqueries, replace each input set reference with the corresponding value from the input tuple or an empty set if the value is Missing;
    evaluate the query expression in the order of precedence using the following rules:
        subqueries are evaluated recursively from step 1;
        a function or an operator is evaluated in a loop over a Cartesian product of its non-aggregate arguments (empty OPTIONAL arguments are excluded from the product);
        aggregate arguments are passed as a whole set;
        the results of the invocations are collected to form a single set.
Collect the results of all iterations to obtain the final result set.
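The construction of the input tuple set in the procedure above can be sketched in Python. This is an illustrative model, not the actual implementation: tuples are flattened (each step appends one element instead of nesting pairs), the `FRIENDS` data is hypothetical, and `dependent_product` and `MISSING` are names introduced here to stand for the dependent Cartesian product and the Missing placeholder.

```python
MISSING = object()  # stands for the Missing placeholder

# Hypothetical data: each user mapped to the users they list as friends.
FRIENDS = {"alice": ["bob", "carol"], "bob": [], "carol": ["alice"]}


def dependent_product(tuples, select_subset, optional=False):
    """Compute X' = {(x, y) | x in X, y in f(x)}.

    `select_subset` plays the role of f. When `optional` is true, an empty
    f(x) is replaced by {MISSING} so the tuple x is not dropped.
    """
    result = []
    for x in tuples:
        subset = select_subset(x)
        if not subset and optional:
            subset = [MISSING]
        for y in subset:
            result.append(x + (y,))  # extend the tuple with the new element
    return result


start = [()]  # a set containing a single empty tuple
# Input set reference User: every user pairs with the empty tuple.
with_users = dependent_product(start, lambda x: list(FRIENDS))
# Input set reference User.friends: each user pairs with their own friends.
required = dependent_product(with_users, lambda x: FRIENDS[x[0]])
# Same step when User.friends appears only as an OPTIONAL argument.
optional = dependent_product(with_users, lambda x: FRIENDS[x[0]], optional=True)
```

In the `required` case the user without friends disappears from the input tuples, whereas in the `optional` case that user is kept and paired with the placeholder.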
A link target can be an abstract type, thus allowing objects of different extending types to be referenced. This necessitates writing polymorphic queries that could fetch different data depending on the type of the actual objects. Consider the following schema:
abstract type Named {
    required property name -> str {
        delegated constraint exclusive;
    }
}
type User extending Named {
    property avatar -> str;
    multi link favorites -> Named;
}
type Game extending Named {
    property price -> int64;
}
type Article extending Named {
    property url -> str;
}
Every User can have its favorites link point to other User, Game, or Article objects. To fetch data related to different types of objects in the favorites link, the following syntax can be used:
SELECT User {
    name,
    avatar,
    favorites: {
        # common to all Named
        name,
        # specific to Games
        [IS Game].price,
        # specific to Article
        [IS Article].url,
        # specific to User
        [IS User].avatar,
        # a computable value tracking how many favorites
        # does my favorite User have?
        favorites_count := count(
            # start the path at the root of the shape
            User.favorites[IS User].favorites)
    }
}
The [IS TypeName] construct can be used in paths to restrict the target to a specific type. When it is used in shapes, it allows the creation of polymorphic nested queries.
Another scenario where polymorphic queries may be useful is when a link target is a union type.
It is also possible to fetch data that contains only one of the possible types of favorites even if a particular User has a mix of everything:
# User + favorite Articles only
SELECT User {
    name,
    favorites[IS Article]: {
        name,
        url
    }
}
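The effect of [IS TypeName] can be approximated in Python with class inheritance and `isinstance` checks. This is only an analogy for the schema above, not EdgeDB behavior in every detail: `type_filter` and `favorite_shape` are hypothetical helpers, the sample objects are invented, and type-specific properties are modelled as empty lists for objects of other types.

```python
from dataclasses import dataclass, field


@dataclass
class Named:
    name: str


@dataclass
class Game(Named):
    price: int = 0


@dataclass
class Article(Named):
    url: str = ""


@dataclass
class User(Named):
    avatar: str = ""
    favorites: list = field(default_factory=list)


def type_filter(objects, target_type):
    """Rough analogue of `[IS TypeName]` in a path: keep matching objects only."""
    return [obj for obj in objects if isinstance(obj, target_type)]


def favorite_shape(obj):
    """Rough analogue of the polymorphic shape: type-specific fields are
    empty sets (empty lists here) for objects of other types."""
    return {
        "name": obj.name,
        "price": [obj.price] if isinstance(obj, Game) else [],
        "url": [obj.url] if isinstance(obj, Article) else [],
        "avatar": [obj.avatar] if isinstance(obj, User) else [],
    }


# Hypothetical sample data with a mix of favorite types.
alice = User("Alice", favorites=[
    Game("Chess", price=10),
    Article("Intro", url="https://example.com/intro"),
])
articles_only = type_filter(alice.favorites, Article)
shapes = [favorite_shape(fav) for fav in alice.favorites]
```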