The GeomLab language

This page contains a concise and fairly formal description of the programming language that is part of GeomLab. Reading it is almost certainly not the best way to learn to write programs for GeomLab: instead, it is better to follow the examples that are contained in the worksheets.

The GeomLab language is a general-purpose programming language in the sense that it is not tailored towards the graphics programming that is the theme of the worksheets, and it is capable of describing any function that can be computed. Its chief feature is that it is purely applicative: there are no "variables" in the language that can be assigned different values at different times in the execution of a program, but like the variables of ordinary mathematics, each variable has a single value. As in mathematics, the same equation or definition can be used several times in a calculation, with the variables standing for different values each time it is used.

1. Syntax

An expression or definition consists of a sequence of tokens made up of ASCII characters. Each token belongs to one of these classes: names, numbers, strings, operators and delimiters. Blanks and line breaks may not occur within tokens, except in comments and for blanks in strings, but are otherwise ignored unless they separate two consecutive tokens that might otherwise be read as one. Upper and lower case letters are treated as distinct.

An identifier is any sequence of letters, digits and underscores that begins with a letter or underscore, except for the reserved words that are listed below.

A number token is any sequence of decimal digits, followed optionally by a decimal point and a further (possibly empty) sequence of decimal digits.

A string is a sequence of characters enclosed between two double-quote characters. A string may not contain a double-quote character or a line break.

An operator or delimiter is one of the reserved words and special symbols in the list that follows. Reserved words appear entirely in lower case, and may not be used as names.

and define div else if in lambda let mod not op or then when
_ = + - $ * / & ~ : @ < <= <> > >= ( ) [ ] , ; |

In most places where an identifier may appear in a GeomLab program, the keyword op may also appear, followed by an operator symbol; this acts as the name of the operator.

Name = Ident | "op" Operator

The syntax of expressions and definitions in the GeomLab language is given in subsequent sections of this document using EBNF notation. In this notation, square brackets [ ... ] enclose optional text, and curly brackets { ... } enclose text that may be repeated zero or more times.

2. Values

The following kinds of value are denoted by expressions in the GeomLab language:

2.1. Numbers

Numbers in the GeomLab language are represented in double-precision floating point format, even if they are integers. Numbers are denoted by number tokens, and are yielded as the results of arithmetic operations.

2.2. Booleans

The Boolean values true and false are denoted by pre-defined names, and are yielded as the results of comparison operators.

2.3. Strings

Strings are denoted by string tokens. Strings are not much used in GeomLab programming, and are included mostly for internal use "under the bonnet" in implementing the other language features.

2.4. Lists

The empty list is denoted by the expression [ ], and non-empty lists may be constructed with the operator ":". A list expression [x₁, x₂, ..., x_n] is an abbreviation for a list of length n constructed in this way. Because the GeomLab language is untyped, there is no guarantee that arbitrary lists are proper, in the sense that they end properly with the value [ ].

2.5. Functions

Names in GeomLab programs may denote functions, either primitive functions that are part of the initial environment or installed as a plug-in extension, or functions that are defined as part of the GeomLab program. Each function takes a fixed number of values as its arguments and deliviers a single value as its result. Functions are created by function definitions (see Section 6.2) and by lambda expressions (see Section 8.2).

2.6. Other values

Other kinds of value may be added to the GeomLab languages as plug-in extensions. Typically, such extensions come with a collection of primitive functions for creating and manipulating the new values. In the GeomLab environment, colours and pictures are provided as additional kinds of value in this way.

3. Scope rules

Expressions are evaluated in the context of an environment that gives values to the variables that appear in the expression. An environment has two parts: the global part contains the pre-defined names that are built in to the GeomLab system, together with additional names that have been added by top-level definitions. There is also a local part of the environment that contains names that have been defined by pattern-matching in a function definition, or by one of the additional forms of expression covered in Section 8. Unless these additional forms of expression are used, there is no nesting of environments in GeomLab, and the scope rules can be summarised as follows:

The local part of the environment takes precedence over the global part. Thus, if the same name is used for a global variable and for one of the formal parameters of a function, then it is the formal parameter that is denoted by the name within the function body.
If a function is defined, and then later applied after further definitions have been added to the global environment, then it is the global environment at the time the function is applied that is used for evaluating the function body -- a form of dynamic binding. This rule is a natural one for interactive programming, and allows naturally for mutual recursion among top-level functions.

The additional forms of expression introduced in Section 8 allow the possiblility of nested scopes in which function values are first-class citizens. For these forms of expression, it is necessary to add the rule that the local part of the environment is treated according to static binding. Thus, function values capture the local part of the environment at the point of definition, and it is this local part, and not the one in force at the point of application, that is used to evaluate the function body. Recursion is allowed in local function definitions but not in local value definitions.

4. Expressions

Primary    = Name | Number | String
           | Name { "(" [ Expression { "," Expression } ] ")" }
           | "(" Expression ")" 
           | "[" [ Expression { "," Expression } ] "]"
 
Factor     = ( "-" | "~" ) Factor | Primary 
 
Term0      = { Factor ":" } Factor
Term1      = Term0 { ( "*" | "/" | "$" ) Term0
Term2      = Term1 { ( "+" | "-" | "&" ) Term1 }
Term3      = { Term2 "@" } Term2
Term4      = Term3 { ( "=" | "<" | "<=" | "<>" | ">" | ">=" ) Term3 }
Term5      = Term4 { and Term4 }
Term       = Term5 { or Term5 }

BasicExpr  = Term | if Term then BasicExpr else BasicExpr

Expression = BasicExpr | LetExpr | LambdaExpr

4.1. Primary expressions

The smallest expressions are names, which denote the value to which they are bound in the environment; number tokens, which denote a numeric constant; and string tokens, which denote a constant string. Any expression may appear in parentheses as an operand in a larger expression.

An expression [e₁, e₂, ..., e_n] denotes a list made up of the n values that are denoted by the expressions e1, e2, ... en.

This list is constructed using the empty list (denoted [ ]) by n applications of the construction operator ":".

4.3. Function application

An expression f(e₁, e₂, ..., e_n) denotes the application of a function f to the n arguments e₁, ..., e_n. The number of arguments must match the number expected by f. The arguments and the function f are first evaluated. If f is a function that has been defined as part of the GeomLab program, then the argument values are matched with the patterns that appear in the definition of f, and the value of the application is that value of the appropriate function body. Other functions are implemented as primitives in the initial environment of the GeomLab program, and deliver a result that cannot be expressed as the value of another expression.

4.4. Unary and binary operators

Simple expressions may be combined with various unary and binary operators to form more complex expressions; these operators have the binding powers that are implied by the syntax rules above. All binary operators associate to the left, except for ":" and @, which associate to the right. An expression written with a prefix or infix operator is just an abbreviation for the application of the same operator as a function. Thus, the expression x + y is an abbreviation for the binary function application (op +)(x, y); this expression uses the name op + that is bound to the addition primitive. All the operators of GeomLab are bound to different primitive functions in the initial environment.

An exception to this rule is made in the case of the operators and and or, which are evaluated in a 'short-circuit' fashion: thus, in the expression e₁ and e₂, the sub-expression e₁ must yield a Boolean value. If this value is true, then the value of the expression is whatever value is yielded by evaluating e₂; otherwise the value of the expression is false. Similarly, the value of e₁ or e₂ is the value of e₂ if e₁ yields false, and otherwise it is true. This means that e₁ and e₂ is equivalent to the conditional expression

if e₁ then e₂ else false

and e₁ or e₂ is equivalent to

if e₁ then true else e₂

4.5. Conditional expressions

An expression if e₁ then e₂ else e₃ is evaluated by first finding the value of e₁. This should be a Boolean value, either true or false; an error is reported if it is not. Then either e₂ or e₃ is chosen for evaluation, depending on the Boolean value, and the value of the chosen expression becomes the value of the whole conditional expression. The other sub-expression is not evaluated.

4.6. Other forms of expression

Two additional forms of expression -- let expressions and lambda expressions -- are allowed. They are described on Section 8.

5. Patterns

Patterns are used in function definitions (see Section 6) and lambda expressions (see Section 8) to match the arguments of a function. An attempt to match a pattern against a value may either succeed or fail; if it succeeds, then the names that appear in the expression become bound to parts of the original value. A name may appear more than once in a pattern or list of patterns; in that case, matching fails unless the values that it matches are all equal. The equality test that is applied is the same as the one used for the = operator.

PattPrimary = Name | "_" 
            | [ "-" | "~" ] Number 
            | String
            | "(" Pattern ")" 
            | "[" [ Pattern { "," Pattern } ] "]"

PattFactor  = PattPrimary { ":" PattPrimary }

Pattern     = PattFactor { "+" Number }

5.1. Primary patterns

The simplest patterns are names, which match any value and bind the name to it; the anonymous pattern _ which matches any value but does not bind a name; positive and negative numbers and strings, which match the single, constant values denoted by the number or string. Any pattern may also be enclosed in parentheses and used as a primary part of a larger pattern.

A list pattern [p₁, p₂, ..., p_n] matches any list of length n whose elements are matched by the patterns p₁, p₂, ... p_n respectively.

5.2. Cons patterns

A "cons" pattern p₁:p₂ matches any non-empty list whose head is matched by p₁ and whose tail is matched by p₂.

5.3. Plus patterns

A "plus" pattern has the form p+n, where n is a number. It matches a number x if n > 0 and the difference y = x - n is an integer such that y >= 0 and the pattern p matches y.

6. Definitions

Definition = ValueDef | FuncDef

Definitions appear in define paragraphs to add a definition to the global environment, and also in let expressions to define a name locally to an expression.

6.1. Value definitions

ValueDef = Name "=" Expression

A value definition defines a name as standing for the value of a certain expression. The expression is evaluated immediately, and the name becomes bound to its value.

6.2. Function definitions

FuncDef = Clause { "|" Clause }

Clause  = Name Formals "=" Expression [ when Expression ]

Formals = "(" [ Pattern { "," Pattern } ] ")"

A function definition defines a name as standing for a function. The function is defined by a sequence of clauses, each containing a list of patterns that are matched against the arguments of the function, and expression that gives the corresponding value yielded by the function, and optionally a boolean-valued guard expression after when that specifies a condition under which the clause applies. For a function definition to be syntactically valid, all the clauses must contain the same function name, and all must contain the same number of argument patterns, the number of arguments that is expected by the function.

When the function is applied to arguments, the clauses in the definition are considered in order, and the first applicable one determines the value that is yielded by the application. To apply a clause, the patterns in the clause are first matched with the incoming arguments. If they all match, then the guard (if any) is evaluated; if the value of the guard is false, then the clause does not apply. Finally, the right-hand side expression is evaluated, and its value becomes the value yielded by the function application. If any guard that is evaluated fails to return a Boolean result, or if no clause is applicable to the arguments, then the evaluation fails.

The definition of a function may have several clauses, each one matching a different pattern of arguments. For example, here is the definition of a function pow(a, b) that computes a raised to the power b:

define pow(a, b) = a * pow(a, b-1) when b > 0
 | pow(a, 0) = 1

The first clause deals with the case where b > 0, defining pow(a, b) in terms of pow(a, b-1); the second clause deals with the case where b = 0, giving the result directly, and providing a place for the recursion to stop. (The function is not defined at all if b < 0.)

The first clause in this definition has patterns (a and b) that will match any arguments that are supplied, but the guard b > 0 rules out those where b <= 0. The second clause matches those argument lists where the second argument is equal to 0.

7. Paragraphs

Paragraph = Expression [ ";" ] 
          | define Definition [ ";" ]

A program in the GeomLab language consists of a sequence of paragraphs that are entered at the top-level prompt or read from one or more text files. Each paragraph is either an expression to be evaluated in the current global environment, or a definition that adds to that environment.

When paragraphs are written on a file for use with the File/Load command of GeomLab, each paragraph must end with a semicolon. The semicolon can be omitted when paragraphs are entered at the interactive prompt.

8. Additional forms of expression

There are two additional forms of expression -- let expressions and lambda expressions -- that are not needed for the worksheets, but are useful in more advanced programming. These forms of expression do not add to the expressive power of the language, but they do make some kinds of program easier to write.

8.1. `let` expressions

LetExpr = let Definition in Expression

An expression let d in e allows the name defined by the definition d to be used in the expression e. For example, the value of the expression

let y = x + 1 in y * y

is the square of whatever is the value of x + 1. The advantages of using a let expression is that it is often clearer to do so, and sometimes shorter and more efficient than writing out the expression and substituting the right-hand side of the definition for the left-hand side, like this:

(x + 1) * (x + 1)

Also, let expressions can be used to define functions that are local to a single expression.

8.2. `lambda` expressions

LambdaExpr = lambda Formals Expression

A lambda expression denotes a function that is defined by a single clause with no guard. A lambda expression

lambda (p₁, p₂, ..., p_n) e

denotes the same function f as is defined by the definition

(p₁, p₂, ..., p_n) = e

There is no need, however, to invent a fresh name f in order to write the function as a lambda expression. Lambda expressions are mainly used in more advanced programming to specify arguments to higher-order functions.

9. Syntax summary

Paragraph   = Expression [ ";" ] 
            | define Definition [ ";" ]

Definition  = ValueDef | FuncDef

ValueDef    = Name "=" Expression

FuncDef     = Clause { "|" Clause }

Clause      = Name Formals "=" Expression [ when Expression ]

Expression  = Term 
            | if Term then BasicExpr else BasicExpr
            | let Definition in Expression
            | lambda Formals Expression

Term        = Term5 { or Term5 }
Term5       = Term4 { and Term4 }
Term4       = Term3 { ( "=" | "<" | "<=" | "<>" | ">" | ">=" ) Term3 }
Term3       = { Term2 "@" } Term2
Term2       = Term1 { ( "+" | "-" | "&" ) Term1 }
Term1       = Term0 { ( "*" | "/" | "$" ) Term0
Term0       = { Factor ":" } Factor

Factor      = ( "-" | "~" ) Factor | Primary
 
Primary     = Name | Number | String
	    | Name { "(" [ Expression { "," Expression } ] ")" }
            | "(" Expression ")" 
            | "[" [ Expression { "," Expression } ] "]"
 
Formals     = "(" [ Pattern { "," Pattern } ] ")"

Pattern     = PattFactor { "+" Number }

PattFactor  = PattPrimary { ":" PattPrimary }

PattPrimary = Name | "_" 
            | [ "-" | "~" ] Number 
            | String
            | "(" Pattern ")" 
            | "[" [ Pattern { "," Pattern } ] "]"

Name        = Ident | "op" Operator

Operator    = and | div | mod | not | or | "=" | "+" | "-" 
            | "$" | "*" | "/" | "&" | "~" | ":" | "@" 
            | "<" | "<=" | "<>" | ">" | ">="