Language specification

Lexical Rules

Whitespace and comments

Whitespace is defined as in Java: any character c for which java.lang.Character.isWhitespace(c) == true (space, tab, end-of-line, etc.).

Comments are like in Java.

  • Single-line comment: starts with //, ends with end-of-line or end-of-file.
  • Multiline comment: starts with /*, ends with */.
  • Error if there is no */ after /*.
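
The whitespace and comment rules above can be sketched in Java as a single skipping routine. This is only an illustration: the class name TriviaSkipper, the method skipTrivia and the choice of exception are assumptions, not part of the specification.

```java
final class TriviaSkipper {
    /** Returns the index of the first character at or after pos that is neither whitespace nor part of a comment. */
    static int skipTrivia(String src, int pos) {
        int n = src.length();
        while (pos < n) {
            if (Character.isWhitespace(src.charAt(pos))) {
                pos++;
            } else if (src.startsWith("//", pos)) {
                // Single-line comment: runs to end-of-line or end-of-file.
                int eol = src.indexOf('\n', pos);
                pos = (eol < 0) ? n : eol + 1;
            } else if (src.startsWith("/*", pos)) {
                // Multiline comment: must be closed by */.
                int end = src.indexOf("*/", pos + 2);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated comment at offset " + pos);
                }
                pos = end + 2;
            } else {
                break;
            }
        }
        return pos;
    }
}
```

A lexer built this way would call skipTrivia before reading each token.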

Identifiers

Identifiers are like in Java: first character has java.lang.Character.isJavaIdentifierStart(c) == true, other characters have java.lang.Character.isJavaIdentifierPart(c) == true. Simple definition: sequence of letters, digits and underscores (_), first character is not a digit. Can contain non-English letters. Identifiers are case-sensitive.

Keywords

A keyword is one of the following reserved identifiers:

and break class create delete else false for function if in index key limit list map mutable not null operation or query return set sort true update val var while
  • A keyword cannot be used as a general identifier, i.e. as a name of a class, function, variable, etc.
  • Longest possible keyword/identifier is taken, i.e. string “format” is an identifier “format”, not keyword “for” and identifier “mat”.
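
A possible Java sketch of the identifier and keyword rules: the longest identifier-like word is read first, and only then is it compared against the keyword list. The names WordScanner, readWord and isKeyword are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class WordScanner {
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "and", "break", "class", "create", "delete", "else", "false", "for", "function",
        "if", "in", "index", "key", "limit", "list", "map", "mutable", "not", "null",
        "operation", "or", "query", "return", "set", "sort", "true", "update", "val",
        "var", "while"));

    /** Reads the longest identifier-like word starting at pos, or returns null if none starts there. */
    static String readWord(String src, int pos) {
        if (pos >= src.length() || !Character.isJavaIdentifierStart(src.codePointAt(pos))) {
            return null;
        }
        int end = pos + Character.charCount(src.codePointAt(pos));
        while (end < src.length() && Character.isJavaIdentifierPart(src.codePointAt(end))) {
            end += Character.charCount(src.codePointAt(end));
        }
        return src.substring(pos, end);
    }

    /** Case-sensitive keyword test, applied to a complete word only. */
    static boolean isKeyword(String word) {
        return KEYWORDS.contains(word);
    }
}
```

Because the whole word is read before the lookup, "format" is never split into "for" and "mat".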

Operators and delimiters

List of operators and delimiters:

!! != % %= ( ) * *= + += , - -= . / /= : ; < <= = == > >= ? ?. ?: @ [ ] { }
  • Longest possible operator/delimiter is taken, i.e. string <= is a single operator <=, not two operators < and =.
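
The longest-match rule for operators can be sketched in Java by trying the longer candidate first. The sketch relies on the fact that no operator in the list above is longer than two characters; the names OperatorScanner and readOperator are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class OperatorScanner {
    static final Set<String> OPERATORS = new HashSet<>(Arrays.asList(
        "!!", "!=", "%", "%=", "(", ")", "*", "*=", "+", "+=", ",", "-", "-=", ".", "/", "/=",
        ":", ";", "<", "<=", "=", "==", ">", ">=", "?", "?.", "?:", "@", "[", "]", "{", "}"));

    /** Returns the longest operator or delimiter starting at pos, or null if there is none. */
    static String readOperator(String src, int pos) {
        for (int len = 2; len >= 1; len--) {
            if (pos + len <= src.length()) {
                String candidate = src.substring(pos, pos + len);
                if (OPERATORS.contains(candidate)) {
                    return candidate;
                }
            }
        }
        return null;
    }
}
```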

Integer literals

  • Decimal: regex /[0-9]+/.

Maximum decimal value: 9223372036854775807 (2^63 - 1). Error if the value is greater.

  • Hex: regex /0x[0-9A-Fa-f]+/, e.g. 0x0, 0xABCD, etc.

Maximum hex value: 0x7FFFFFFFFFFFFFFF (2^63 - 1). Error if the value is greater.

  • Cannot have a letter directly after an integer literal, e.g. 1234X is an error, not two tokens 1234 and X.
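
A minimal Java sketch of the integer literal checks, assuming the lexer has already collected the literal together with any letters or digits directly following it. The name IntegerLiterals and the exception type are assumptions.

```java
import java.math.BigInteger;

final class IntegerLiterals {
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE); // 2^63 - 1

    /** Parses token text such as "1234" or "0xABCD"; throws on malformed or out-of-range input. */
    static long parse(String text) {
        boolean hex = text.startsWith("0x");
        String digits = hex ? text.substring(2) : text;
        String pattern = hex ? "[0-9A-Fa-f]+" : "[0-9]+";
        if (!digits.matches(pattern)) {
            // Covers a trailing letter (1234X) as well as an empty hex body (0x).
            throw new IllegalArgumentException("Invalid integer literal: " + text);
        }
        BigInteger value = new BigInteger(digits, hex ? 16 : 10);
        if (value.compareTo(MAX) > 0) {
            throw new IllegalArgumentException("Integer literal out of range: " + text);
        }
        return value.longValueExact();
    }
}
```

BigInteger is used so that the range check itself cannot overflow.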

String literals

  • Enclosed in single (') or double (") quotes.
  • There is no difference between single-quoted and double-quoted strings, i.e. 'Hello' and "Hello" are equal string literals.
  • Cannot contain an end-of-line character (0x0A), i.e. the closing quote must be on the same line as the opening quote.
  • Error if there is no closing quote on the same line.

Escape sequences

Simple escape sequences: \b \t \r \n \" \' \\

Unicode escape sequence: \u followed by exactly 4 hex digits, e.g. \u1234, \uABCD, \uAbCd.

Error if an invalid escape sequence is used (a \ character that does not start one of the valid escape sequences).
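
The escape rules can be sketched in Java as a decoder for the text between the quotes. The names StringLiterals and unescape are hypothetical, and the error messages are illustrative only.

```java
final class StringLiterals {
    /** Decodes the text found between the quotes of a string literal. */
    static String unescape(String body) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i++);
            if (c == '\n') throw new IllegalArgumentException("End-of-line inside string literal");
            if (c != '\\') { out.append(c); continue; }
            if (i >= body.length()) throw new IllegalArgumentException("Dangling backslash");
            char e = body.charAt(i++);
            switch (e) {
                case 'b': out.append('\b'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'n': out.append('\n'); break;
                case '"': out.append('"'); break;
                case '\'': out.append('\''); break;
                case '\\': out.append('\\'); break;
                case 'u':
                    // Exactly four hex digits must follow.
                    if (i + 4 > body.length()) throw new IllegalArgumentException("Incomplete unicode escape");
                    String hex = body.substring(i, i + 4);
                    if (!hex.matches("[0-9A-Fa-f]{4}")) throw new IllegalArgumentException("Bad unicode escape: " + hex);
                    out.append((char) Integer.parseInt(hex, 16));
                    i += 4;
                    break;
                default:
                    throw new IllegalArgumentException("Invalid escape sequence: \\" + e);
            }
        }
        return out.toString();
    }
}
```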

Byte array literals

  • Syntax: x"..." or x'...', only hex digits (upper or lower case) can be used within quotes.

Examples: x'' x"123456" x"DeadBeef"

  • Must start with lower-case x, not upper-case X.
  • Must contain an even (2*N) number of hex digits (because 1 byte = 2 hex digits).
  • Cannot contain escape sequences or end-of-lines.
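
A Java sketch of byte array literal validation, operating on the whole token text. It assumes that the opening and closing quotes must match, which the syntax above suggests but does not state; the name ByteArrayLiterals is hypothetical.

```java
final class ByteArrayLiterals {
    /** Parses a complete token such as x"DeadBeef" or x'' into bytes. */
    static byte[] parse(String token) {
        if (token.length() < 3 || token.charAt(0) != 'x') {
            throw new IllegalArgumentException("Byte array literal must start with lower-case x");
        }
        char quote = token.charAt(1);
        if ((quote != '"' && quote != '\'') || token.charAt(token.length() - 1) != quote) {
            throw new IllegalArgumentException("Missing or mismatched quotes");
        }
        String hex = token.substring(2, token.length() - 1);
        if (!hex.matches("[0-9A-Fa-f]*")) {
            // Also rejects escape sequences and end-of-line characters.
            throw new IllegalArgumentException("Only hex digits are allowed");
        }
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```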

Types

General

A type of an attribute, parameter, variable, etc. can be:

  • name of a built-in or user-defined type (Identifier)
  • nullable type
  • tuple type
  • collection type

Built-in types

Basic built-in types are:

boolean byte_array integer json range text

Built-in type aliases are:

name = text
pubkey = byte_array
timestamp = integer
tuid = text

Type alias A = T means that entities (attributes, variables, etc.) of type A will effectively have type T during compilation.

Special types

Special types cannot be used in code explicitly (in attribute declarations, etc.), but they are used by the compiler internally as types of some expressions.

  • Special types are: unit, null.
  • Names of special types cannot be used in code as types. Using “unit” as a type causes an error such as “Unknown type name”. “null” is a keyword, so using it as a type is a syntax error.

Nullable type

The idea was taken from Kotlin.

Syntax:

NullableType: Type "?"

Examples:

  • integer?
  • list<text>?

Error if the underlying type is already nullable, e.g. integer??.

Tuple type

Consists of one or more fields. Each field must have a type and may have a name.

Syntax:

TupleType: "(" TupleTypeField ( "," TupleTypeField )* ")"

TupleTypeField: ( Identifier ":" )? Type

Examples:

  • (integer)
  • (integer, text)
  • (x: integer, y: integer)
  • (p: text, q: byte_array, list<integer>)

Error if same field name is used more than once.

Collection types

Collection types are: list, set, map.

Syntax:

  • "list" "<" Type ">"
  • "set" "<" Type ">"
  • "map" "<" Type "," Type ">"

Examples:

  • list<integer>
  • set<text>
  • map<text, byte_array>

Subtypes

Purpose: if type B is a subtype of type A, a value of type B can be assigned to a variable of type A.

  1. T is subtype of T.

  2. T is subtype of T?.

  3. null is subtype of T?.

  4. Tuple type T1 is subtype of tuple type T2 if:

    • the number of fields is the same
    • names of corresponding fields are the same (if a field has no name, the other field must have no name)
    • type of each field of T1 is a subtype of the type of the corresponding field of T2

Examples:

  • (integer, text) is subtype of (integer, text?)
  • (integer, text?) is subtype of (integer?, text?)
  • (integer, text?) is not subtype of (integer, text), because text? is not subtype of text
  • (x: integer, y: integer) is subtype of (x: integer?, y: integer?)
  • (x: integer, y: integer) is not subtype of (p: integer, q: integer), because field names differ
  • (integer, text) is not subtype of (x: integer, y: integer)
  • (x: integer, y: integer) is not subtype of (integer, text)
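
The four subtype rules can be sketched in Java over a small type model. The sketch applies the rules literally and does not add transitivity or any other rule that is not listed; all class and method names (Subtyping, Named, Nullable, NullType, Tuple, Field, isSubtype) are hypothetical, and collection types are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

final class Subtyping {
    abstract static class Type {
        abstract String render();
        // Two types are the same if they print identically.
        boolean sameAs(Type other) { return render().equals(other.render()); }
    }
    static final class Named extends Type {
        final String name;
        Named(String name) { this.name = name; }
        String render() { return name; }
    }
    static final class NullType extends Type {
        String render() { return "null"; }
    }
    static final class Nullable extends Type {
        final Type inner;
        Nullable(Type inner) { this.inner = inner; }
        String render() { return inner.render() + "?"; }
    }
    static final class Field {
        final String name; // null when the field has no name
        final Type type;
        Field(String name, Type type) { this.name = name; this.type = type; }
    }
    static final class Tuple extends Type {
        final List<Field> fields;
        Tuple(Field... fields) { this.fields = Arrays.asList(fields); }
        String render() {
            StringBuilder sb = new StringBuilder("(");
            for (Field f : fields) {
                if (sb.length() > 1) sb.append(", ");
                if (f.name != null) sb.append(f.name).append(": ");
                sb.append(f.type.render());
            }
            return sb.append(")").toString();
        }
    }

    /** True if a value of type b can be assigned to a variable of type a. */
    static boolean isSubtype(Type b, Type a) {
        if (b.sameAs(a)) return true;                                        // rule 1
        if (a instanceof Nullable) {                                         // rules 2 and 3
            return b instanceof NullType || b.sameAs(((Nullable) a).inner);
        }
        if (a instanceof Tuple && b instanceof Tuple) {                      // rule 4
            List<Field> fb = ((Tuple) b).fields;
            List<Field> fa = ((Tuple) a).fields;
            if (fb.size() != fa.size()) return false;
            for (int i = 0; i < fb.size(); i++) {
                if (!Objects.equals(fb.get(i).name, fa.get(i).name)) return false;
                if (!isSubtype(fb.get(i).type, fa.get(i).type)) return false;
            }
            return true;
        }
        return false;
    }
}
```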

Classes

Class has a name and zero or more member definitions.

  • When a class with name A is defined, A can be used as a type name in the code after the class definition.
  • Error if there already is a built-in or user-defined type with same name.
  • Class members are: attribute, key, index.

Class syntax

ClassDefinition: "class" Identifier "{" ClassMemberDefinition* "}"

ClassMemberDefinition :
    AttributeDefinition
    KeyDefinition
    IndexDefinition

Example:

class user {
    name: text;
    address: text;
    key name;
    index address;
}

Attributes

An attribute definition consists of a name and, optionally, a type, a default value expression and modifiers (e.g. mutable).

Syntax:

AttributeDefinition: "mutable"? FieldDefinition ("=" Expression)? ";"

FieldDefinition: Identifier (":" Type)?

  • If type is not specified, same type as the attribute name is taken (built-in or user-defined). Error if there is no such type.
  • Error if there already is another attribute with same name in the same class.
  • If default value expression is specified, the type of the expression must be a subtype of the attribute’s type.
  • The expression syntax will be specified later. For now, the simplest expressions can be used for testing: integer literal, string literal, true, false, null, etc.

Examples:

name;            // same as "name: name;", there is a built-in type "name"
address: text;
mutable age: integer;
mutable status: text = 'Unknown';

Keys, indices

Keys and indices consist of one or more fields.

Syntax:

KeyDefinition: "key" FieldDefinition ("," FieldDefinition)* ";"

IndexDefinition: "index" FieldDefinition ("," FieldDefinition)* ";"

Handling of fields

  • Error if same field name is used more than once within one key/index.
  • If there is no attribute with such name, an attribute is added to the class implicitly. The added attribute is not mutable, has no default value.
  • If there is an attribute with such name, the key/index field cannot have a type specified.

No error:

key foo: integer;

Error:

foo: integer;
key foo: integer;

Error if there already is a key/index with same set of fields.

Not an error:

index a;
index a, b;

Error:

index a, b;
index b, a;
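
The two uniqueness rules for key/index fields can be sketched in Java by storing each definition as an unordered set of field names. The sketch assumes one checker instance per kind (one for keys, one for indices); whether a key and an index may share the same field set is not stated here. The name FieldSetChecker is hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FieldSetChecker {
    private final Set<Set<String>> seen = new HashSet<>();

    /** Registers the field names of one key (or index) definition; throws if a rule is violated. */
    void add(List<String> fieldNames) {
        Set<String> fieldSet = new HashSet<>(fieldNames);
        if (fieldSet.size() != fieldNames.size()) {
            throw new IllegalArgumentException("Field name used more than once: " + fieldNames);
        }
        // Sets compare without regard to order, so (a, b) and (b, a) collide.
        if (!seen.add(fieldSet)) {
            throw new IllegalArgumentException("Same set of fields already defined: " + fieldNames);
        }
    }
}
```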

It does not matter whether a key/index is defined before or after an attribute used in it.

Code:

x: integer;
key x;

is equivalent to:

key x;
x: integer;

The same holds for the field type restriction: it applies whether the key/index comes before or after the attribute definition.

No error:

key x: integer;

No error:

x: integer;
key x;

Error:

x: integer;
key x: integer; // ERROR

Error:

key x: integer; // ERROR
x: integer;
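
One way to make these rules independent of definition order is to collect all members first and resolve them afterwards. The Java sketch below does that for attributes and key fields (index fields would be handled the same way). It omits the check that the type actually exists, and the names ClassMembers, attribute, keyField and resolve are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ClassMembers {
    private final Map<String, String> attributes = new LinkedHashMap<>(); // explicit attributes: name -> type
    private final List<String[]> keyFields = new ArrayList<>();           // entries are {name, typeOrNull}

    void attribute(String name, String typeOrNull) {
        String type = (typeOrNull != null) ? typeOrNull : name; // type defaults to the attribute name
        if (attributes.putIfAbsent(name, type) != null) {
            throw new IllegalStateException("Duplicate attribute: " + name);
        }
    }

    void keyField(String name, String typeOrNull) {
        keyFields.add(new String[] { name, typeOrNull });
    }

    /** Runs after all members are collected, so definition order cannot matter. */
    Map<String, String> resolve() {
        Map<String, String> result = new LinkedHashMap<>(attributes);
        for (String[] field : keyFields) {
            String name = field[0];
            String type = field[1];
            if (attributes.containsKey(name)) {
                if (type != null) {
                    throw new IllegalStateException("Key field '" + name + "' cannot specify a type: attribute exists");
                }
            } else {
                result.putIfAbsent(name, type != null ? type : name); // implicit attribute
            }
        }
        return result;
    }
}
```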

Operations, Queries, Functions

Operations, queries and functions are collectively called routines. Some rules are common to all routines, while others are specific to operations, queries or functions.

Syntax

Module : Definition*

Definition : ClassDefinition | RecordDefinition | RoutineDefinition

RoutineDefinition : Operation | Query | Function

ClassDefinition syntax is covered above.

  • Each routine has a name.
  • Error if a routine is defined with the same name as an existing routine.

Built-in functions are also taken into account when checking this rule. (The list of built-in functions will be given in a future chapter.)

Operations

Operation: "operation" Identifier "(" FormalParams? ")" BlockStatement

FormalParams: FieldDefinition ( "," FieldDefinition )*

BlockStatement: "{" Statement* "}"

  • FieldDefinition syntax is given in chapter 3 (it’s the same as for class fields).
  • Statement syntax will be given in a future chapter about statements.
  • The return type of an operation is “unit”, so an operation cannot return a value. A return statement cannot have an expression, even if the expression has type unit:

return; // OK
return print('Hello'); // Error, even though print() returns unit.

Examples

operation foo(user, value: integer) {
    if (value == 0) return;
    update account @ { user } ( score += value );
}

Queries

Query: "query" Identifier "(" FormalParams? ")" (":" Type)? QueryBody

QueryBody: SimpleBody | ComplexBody

SimpleBody: "=" Expression ";"

ComplexBody: BlockStatement

Return type

  • A query has a specific return type and always returns a value.
  • If the return type is not specified explicitly, it is inferred from the return expressions:
    • For a simple body: the return type is the type of the expression. Error if the type of the expression is “unit”.
    • For a complex body: the return type is the common type of all expressions used in return statements. Error if the return expression types have no common type.
  • If an explicit return type is specified:
    • For a simple body: error if the type of the expression is not a subtype of the explicit return type.
    • For a complex body: error if the type of the expression in any return statement is not a subtype of the explicit return type.
  • For a complex body: error if there is no return statement.

Examples

query getUserCount(company) = (user @* { company }).size(); // Returns integer.

query getUserCount(companyName) {
    if (companyName == "") return 0;
    return (user @* { company.name == companyName }).size();
}

Error: no common return type
query q(x: integer) {
    if (x < 0) return 'Hello';
    return 123;    // Error on this line.
}

Error: actual return type differs from the declared one
query q(): integer = 'Hello';
query q(): integer { return 'Hello'; }

Functions

Function: "function" Identifier "(" FormalParams? ")" (":" Type)? FunctionBody

FunctionBody: SimpleBody | ComplexBody

Return type

  • If return type is not specified, the return type of the function is “unit”.

Simple body

  • The type of the expression must be a subtype of the return type of the function.
  • The type of the expression cannot be “unit”.

Complex body

  • If return type is not specified (thus, it is “unit”), return statements must have no expression (i.e. must use “return;”, not “return X;”).
  • If return type is specified, type of expressions in return statements must be a subtype of the return type.
  • Order of function definitions does not matter, all functions defined in a module are visible everywhere in the module.

This allows recursive and mutually-recursive functions:

function a(x: integer) {
    if (x > 0) b(x - 1); // b() is visible here, but it is defined below.
}

function b(x: integer) {
    if (x > 0) a(x - 1);
}

Common things for routines

  • Queries and non-unit functions must always return a value.
  • Error if some code path has no return statement:

function f(x: integer): integer {
    print(x);
} // Error: no return statement at all.

function f(x: integer): integer {
    if (x > 0) return x * x;
} // Error: no return statement for one of code paths.

function f(x: integer): integer {
    if (x > 0) {
        return x * x;
    } else {
        print('invalid argument');
    } // Error: no return statement for this branch.
}
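
The formal rules are deferred to the chapter on statements, so the following Java sketch is only one conventional, conservative way to check that every code path returns. It covers just the statement kinds used in the examples above (return, if/else, block, other statements); loops are not modelled, and all names (ReturnCheck, Stmt, alwaysReturns, etc.) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class ReturnCheck {
    interface Stmt {}
    static final class Return implements Stmt {}
    static final class Other implements Stmt {}   // any non-returning statement, e.g. print(x);
    static final class Block implements Stmt {
        final List<Stmt> body;
        Block(Stmt... body) { this.body = Arrays.asList(body); }
    }
    static final class If implements Stmt {
        final Stmt thenBranch;
        final Stmt elseBranch;                    // null when there is no else
        If(Stmt thenBranch, Stmt elseBranch) {
            this.thenBranch = thenBranch;
            this.elseBranch = elseBranch;
        }
    }

    /** True if every execution path through s ends in a return statement. */
    static boolean alwaysReturns(Stmt s) {
        if (s instanceof Return) return true;
        if (s instanceof Block) {
            // A block returns on all paths if any of its statements does.
            for (Stmt child : ((Block) s).body) {
                if (alwaysReturns(child)) return true;
            }
            return false;
        }
        if (s instanceof If) {
            If branch = (If) s;
            return branch.elseBranch != null
                && alwaysReturns(branch.thenBranch)
                && alwaysReturns(branch.elseBranch);
        }
        return false;
    }
}
```

Under this sketch, a query or non-unit function would be rejected when alwaysReturns is false for its body.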

More formal rules for checking that a value is always returned will be given in the future chapter on statements.