Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Introduction

Purpose and Audience

Core Solidity is an update to the Solidity programming language. It both extends the language with new features (ADTs, Parametric polymorphism, Typeclasses), and diverges from Classic Solidity in incompatible ways (most notably by removing inheritance). This document targets compiler implementers, tooling developers, and language researchers. It is a reference specification, not a tutorial.

A prototype implementation exists at argotorg/solcore. The implementation is a research prototype. It contains bugs, lacks optimizations, and will change.

Compilation Architecture

Compilation proceeds by succesive translation between the following representations:

  1. Core Solidity: The user-facing language with all syntactic features. Supports higher-order functions, pattern matching, type classes, and modules. This is a superset of SAIL, with all additional language constructs being defined in terms of desugaring passes (syntactic transformations) into SAIL.

  2. SAIL: A minimal first-order intermediate representation. All syntactic sugar has been removed. Type classes are resolved to concrete instances. Polymorphic functions are specialized to monomorphic versions. Pattern matching is compiled to simple case trees.

  3. Hull: A low-level representation close to Yul. Retains algebraic data types (binary sums and products) but otherwise resembles Yul. Hull code is translated directly to Yul, which is then compiled to EVM bytecode.

  4. Yul: The existing Yul assembly language currently used in solc. Core Solidity is designed in a way that allows the use of alternative or additional representations at this level in the future.

Each transformation step simplifies the language and moves closer to the EVM execution model.

Document Structure

  • Compilation Pipeline: Describes the compilation pipeline and how representations relate
  • SAIL: Specifies the desugared intermediate representation
  • Core Solidity: Specifies the user-facing language features
  • Hull: Specifies the low-level IR before Yul emission

Status

This is a research prototype. The language design is not stable. Features may be added, removed, or changed. The implementation has known bugs and missing functionality. Do not use this for production systems.

SAIL

SAIL (Solidity Advanced Intermediate Language) is the source language of the Core Solidity compiler. It extends Solidity's surface syntax with a statically-typed functional core: parametric polymorphism via forall quantifiers, type classes for constrained overloading, and algebraic data types with exhaustive pattern matching. Every SAIL program is compiled to monomorphic Core IR, named Hull, through specialization, then translated to Yul and assembled into EVM bytecode.

This section documents the SAIL language itself. The chapters are ordered from the most foundational concepts to the most advanced:

  • Syntax covers lexical conventions, literals, and the overall structure of a source file.
  • Built-ins lists the types, values, and operators available without any import, along with the full set of EVM opcodes accessible inside assembly blocks.
  • Variable Declaration and Assignment explains let bindings and mutation.
  • Functions describes function signatures, free functions, contract methods, and recursive definitions.
  • Assembly Blocks details the rules for embedding raw Yul inside a SAIL function body.
  • Datatypes introduces algebraic data type declarations and pattern matching.
  • Parametric Polymorphism explains forall quantifiers, type variable instantiation, and the specialization strategy.
  • Type Classes covers class declarations, instance declarations, superclass constraints, and the three soundness conditions the compiler enforces.
  • Modules describes the import and export system, qualified names, and visibility rules.
  • Type Inference explains the bidirectional constraint-based algorithm, what must be annotated explicitly, and the error messages the compiler produces when inference fails.

Syntax

This page is a complete grammar reference for SAIL. Every grammar rule appears as its own section, followed by a railroad diagram. Rounded boxes denote terminal tokens (keywords and punctuation); rectangular boxes denote non-terminals and link to the corresponding rule. The notation follows standard EBNF conventions: [ … ] marks optional elements and { … } marks zero-or-more repetition.


Parser Rules

CompilationUnit

A SAIL source file is a sequence of import declarations and top-level declarations in any order. Imports are not required to precede declarations.

CompilationUnit


TopDecl

A top-level declaration is one of: a contract, a free function, a type class, an instance, an algebraic data type, a type synonym, an export declaration, a pragma, or an operator declaration.

TopDecl


Import

An import declaration makes names from another module available in the current module. The @package. prefix selects an external package; the lib. prefix selects a standard library module.

Import


ModulePath

A module path is a dot-separated sequence of identifiers that locates a module within a package.

ModulePath


ImportItems

The list of names to import from a module, enclosed in braces.

ImportItems


ImportItem

A single item to import. Four forms are available:

  • * imports all exported names (wildcard).
  • Name imports a single name into the unqualified scope.
  • Name as Alias imports a single name under a local alias. The original name is not placed in scope.
  • (sym) imports an operator symbol and enables its infix syntax in this file.

ImportItem


OperatorSymbol

A non-empty sequence of operator characters (+-*/%<>=!&|^~#? or Unicode mathematical symbols in the range U+2200 to U+23FF), written between parentheses in import and export item lists.

OperatorSymbol


HidingList

A comma-separated list of names to exclude from an import.

HidingList


ExportDecl

An export declaration controls which names this module exposes to other modules.

ExportDecl


ExportItems

The list of names to export from the current module, enclosed in braces.

ExportItems


ExportItem

A single export entry. It may be the wildcard *, a plain name, a name together with its constructors, all names from a module, or an operator symbol in parentheses.

ExportItem


ExportConstructors

Selects which data constructors to re-export alongside a type name: either all constructors (*) or an explicit list.

ExportConstructors


ExportFromItems

The list of names re-exported from another module.

ExportFromItems


ExportFromItem

A single entry in a re-export list: the wildcard *, a plain name, or a name with explicit constructor re-exports.

ExportFromItem


Pragma

A pragma adjusts compiler behaviour for type class constraint checking. Without targets the pragma applies to all classes; with a list of identifiers it applies only to those specific classes.

Pragma


PragmaKind

The three available pragma kinds relax, respectively, the coverage condition, the Patterson condition, and the bound-variable condition for type class instance resolution.

PragmaKind


PragmaTargets

A comma-separated list of class names to which a pragma applies.

PragmaTargets


Type

A type is one of: a named type constructor applied to zero or more type arguments, a function type of the form (T₁, …, Tₙ) -> T, a tuple or unit type written as a parenthesised comma-separated list, or a proxy type @T.

Type


TypeList

A comma-separated (possibly empty) list of types, used as arguments to type constructors and as the parameter list of function types.

TypeList


TypeVarSeq

A space-separated sequence of type-variable names following a forall keyword. All listed names are universally quantified over the scope of the accompanying signature.

TypeVarSeq


TypeVarParams

A comma-separated list of type-variable names enclosed in parentheses. Used in data, type, and class declarations to introduce parametric type arguments.

TypeVarParams


TypeName

A possibly qualified type name. Simple names are single identifiers; qualified names chain module components with ..

TypeName


DataDef

An algebraic data type declaration. The optional parameter list introduces type variables. The optional body lists the constructors separated by |.

DataDef


DataConstrs

One or more data constructor definitions separated by |.

DataConstrs


DataConstr

A single data constructor: a name optionally followed by a parenthesised, comma-separated list of field types.

DataConstr


TypeSynonym

A type synonym introduces an alias for an existing type. The optional parameter list introduces type variables that may appear in the right-hand side.

TypeSynonym


Pattern

A pattern appears in match equations to deconstruct a value by its constructor. The dot-prefix form (.Name) is a contextual shorthand: the constructor is resolved from the type being matched.

Pattern


PatternList

A comma-separated list of patterns used as the argument list of a constructor pattern or as the simultaneous arguments of a match equation.

PatternList


Expr

An expression computes a value. Binary operators follow standard precedence: arithmetic binds tighter than comparison, which binds tighter than logical. All binary operators are left-associative except if-then-else, which is right-associative.

Expr


ExprList

A comma-separated (possibly empty) list of expressions used as function arguments.

ExprList


Literal

A literal value: a decimal or hexadecimal integer, or a double-quoted string.

Literal


Stmt

A statement is an executable step inside a function body. Assignment operators =, +=, and -= require a terminating ;. let declares a local variable, optionally with a type annotation and an initialiser. The for statement provides a C-style counted loop; its initialisation and post-iteration clauses obey the ForInitStmt and ForPostStmt grammars respectively.

Stmt


Body

A brace-enclosed sequence of zero or more statements forming the body of a function, branch, or constructor.

Body


ForInitStmt

The initialisation clause of a for loop. It may be an assignment, a compound assignment, a let binding (typed or untyped, with or without an initialiser), or a plain expression. Unlike a regular statement, there is no trailing ; — the semicolons are written explicitly in the for(…; …; …) header.

ForInitStmt


ForPostStmt

The post-iteration clause executed after each loop body. It follows the same grammar as ForInitStmt. A let binding introduced here is scoped to the body of that single iteration.

ForPostStmt


MatchArgs

One or more comma-separated expressions forming the scrutinees of a match statement.

MatchArgs


Equation

A single match arm: a |-prefixed list of patterns followed by => and a sequence of statements. The patterns are matched positionally against the scrutinee list.

Equation


Param

A single function parameter: a name with an explicit type annotation, or an untyped name whose type will be inferred.

Param


ParamList

A comma-separated list of function parameters.

ParamList


Function

A function definition. The long form uses a brace-enclosed statement block as the body. The short form uses a single expression whose value is returned implicitly (Rust-style).

Function


Signature

A function signature declares the function name, its parameter list, and the optional return type. It may be preceded by a polymorphism prefix to introduce type variables and constraints.

Signature


SigPrefix

An optional forall quantifier that precedes a function or method signature. It introduces universally quantified type variables and, optionally, a list of type class constraints that callers must satisfy.

SigPrefix


ConstraintList

A comma-separated list of type class constraints.

ConstraintList


Constraint

A single type class constraint of the form Type : ClassName or Type : ClassName(T₁, …, Tₙ). It asserts that the given type is an instance of the named class, possibly with additional type parameters.

Constraint


ClassDef

A type class declaration. The self-variable (the first identifier after class) is the main type being constrained. The optional comma-separated list in parentheses introduces auxiliary associated type variables. The body lists method signatures, each terminated by ;.

ClassDef


InstDef

An instance declaration provides method implementations for a specific type. The optional default keyword marks the instance as an overlappable fallback when no more specific instance is found.

InstDef


Contract

A contract groups fields, nested data types, methods, and an optional constructor. The optional parameter list makes the contract generic over type variables.

Contract


ContractDecl

A single declaration inside a contract body: a field, a data type, a function, or a constructor.

ContractDecl


FieldDecl

A contract field declaration. The type annotation is mandatory; the initialiser expression is optional.

FieldDecl


Constructor

A contract constructor is invoked exactly once at deployment time. It has an explicit parameter list and a statement block body.

Constructor


AsmBlock

An inline assembly block embeds Yul statements directly in SAIL source code, giving direct access to EVM opcodes.

AsmBlock


YulStmt

A statement in the Yul sublanguage. Yul provides low-level control flow (if, switch, for, break, continue, leave) and variable declarations and assignments using :=.

YulStmt


YulCase

A single case arm in a Yul switch statement: a literal value followed by a block of Yul statements.

YulCase


YulExpr

A Yul expression: a literal, a variable reference, a function call, or a call to the special return built-in.

YulExpr


YulNames

A comma-separated list of identifiers used as the left-hand side of a Yul multi-assignment or the names in a Yul let declaration.

YulNames


YulExprList

A comma-separated list of Yul expressions used as arguments to a Yul function call.

YulExprList


YulLiteral

A Yul literal value: a decimal or hexadecimal integer, or a string.

YulLiteral


Lexer Rules

Identifier

An identifier begins with a letter (upper or lower case) and may contain letters, decimal digits, and underscores. Identifiers are used for variable names, function names, type names, module components, and constructor names.

Identifier


Integer

An integer literal is either a sequence of decimal digits or a hexadecimal literal prefixed with 0x.

Integer


StringLiteral

A string literal is a sequence of characters enclosed in double quotes. Supported escape sequences are \n (newline), \t (tab), and \" (literal double quote).

StringLiteral

Built-ins

SAIL provides a small set of types, values, and operators that are available in every source file without any import. These are wired into the compiler's initial environment. In addition, every assembly block has access to the full set of EVM opcodes through Yul primitives.


Primitive Types

Five types are built into the language kernel.

TypeDescription
word256-bit unsigned integer; the EVM's native machine word
boolBoolean type with constructors true and false
()Unit type; used as the return type of functions that produce no value
pair a bGeneric product type, also written (a, b) in tuple syntax
sum a bGeneric disjoint union with constructors inl and inr

pair and sum are the internal representation of all user-defined algebraic data types. A data type with multiple constructors is encoded as a nested sum, and a constructor with multiple fields is encoded as a nested pair. User code rarely names pair or sum directly; they appear implicitly through data declarations and tuple syntax.

word

word is the only numeric type at the kernel level. Every integer literal in SAIL has type word. There is no numeric overloading: 42, 0xff, and 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff are all word values.

function decimals() -> word {
    return 18;
}

bool

bool has exactly two constructors, true and false, both of which are always in scope.

function isActive(paused : bool) -> bool {
    match paused {
    | false => return true;
    | true  => return false;
    }
}

() — Unit

The unit type () has a single value, also written (). Functions that perform side effects and return nothing use () as their return type.

function setOwner(newOwner : word) -> () {
    assembly { sstore(0, newOwner) }
}

Tuple Syntax

The syntax (a, b) is shorthand for pair a b. Tuples with more than two elements are right-nested pairs: (a, b, c) means pair a (pair b c).

data Transfer = Transfer(word, word, word);

function unpack(t : Transfer) -> (word, word, word) {
    match t {
    | Transfer(from, to, amount) => return (from, to, amount);
    }
}

Infix Operators

SAIL provides eight infix operators as syntactic sugar over ordinary function calls. The parser rewrites each operator to the corresponding function call before name resolution; the functions themselves must be in scope at the point of use.

OperatorEquivalent callType
e1 < e2lt(e1, e2)bool -> bool -> bool
e1 > e2gt(e1, e2)bool -> bool -> bool
e1 <= e2le(e1, e2)bool -> bool -> bool
e1 >= e2ge(e1, e2)bool -> bool -> bool
e1 != e2ne(e1, e2)bool -> bool -> bool
e1 && e2and(e1, e2)bool -> bool -> bool
e1 || e2or(e1, e2)bool -> bool -> bool
!enot(e)bool -> bool

Because the operators desugar to function calls, the compiler resolves them through the normal type class and name resolution machinery. The functions lt, gt, le, ge, ne, and, or, and not are not built into the kernel; they must be brought into scope before use.

Note The && and || operators do not short-circuit in the current implementation. Both operands are always evaluated before the logical operation is performed.

import std.{lt, ge, and};

function isValidAmount(amount : word, balance : word) -> bool {
    return amount > 0 && amount <= balance;
}

The invokable Class

The kernel defines one built-in type class:

forall self args ret . class self:invokable(args, ret) {
    function invoke(self : self, args : args) -> ret;
}

invokable is the compiler's mechanism for encoding higher-order functions. When a function-typed value is passed as an argument or stored in a data structure, the compiler generates an invokable instance that captures the closure and implements invoke. User code rarely interacts with invokable directly. A dedicated chapter covers higher-order functions and the defunctionalization transformation in detail; see Lambda Functions.


Assembly Primops

Inside an assembly { } block, all EVM opcodes are available as Yul primitives. Each opcode is treated as a function that operates exclusively on word values. Variables declared in the surrounding SAIL scope are accessible by name inside the block.

The sections below list every available opcode grouped by category, along with its Yul type signature.

Arithmetic

OpcodeSignatureDescription
add(x, y)word -> word -> wordAddition modulo 2^256
sub(x, y)word -> word -> wordSubtraction modulo 2^256
mul(x, y)word -> word -> wordMultiplication modulo 2^256
div(x, y)word -> word -> wordInteger division; 0 if y = 0
sdiv(x, y)word -> word -> wordSigned integer division
mod(x, y)word -> word -> wordModulo; 0 if y = 0
smod(x, y)word -> word -> wordSigned modulo
exp(x, y)word -> word -> wordx raised to the power y
addmod(x, y, m)word -> word -> word -> word(x + y) mod m
mulmod(x, y, m)word -> word -> word -> word(x * y) mod m
signextend(b, x)word -> word -> wordSign-extend from bit b
function checkedAdd(x : word, y : word) -> word {
    let result : word;
    let overflow : word;
    assembly {
        result   := add(x, y)
        overflow := lt(result, x)
    }
    if (overflow != 0) {
        assembly { revert(0, 0) }
    }
    return result;
}

Bitwise

OpcodeSignatureDescription
and(x, y)word -> word -> wordBitwise AND
or(x, y)word -> word -> wordBitwise OR
xor(x, y)word -> word -> wordBitwise XOR
not(x)word -> wordBitwise NOT
byte(n, x)word -> word -> wordnth byte of x (0 = most significant)
shl(shift, value)word -> word -> wordLeft shift
shr(shift, value)word -> word -> wordLogical right shift
sar(shift, value)word -> word -> wordArithmetic right shift

Comparison

Comparison opcodes return 1 if the condition holds and 0 otherwise. The result type is word, not bool; use tobool or a match on the result to convert.

OpcodeSignatureDescription
lt(x, y)word -> word -> word1 if x < y (unsigned)
gt(x, y)word -> word -> word1 if x > y (unsigned)
slt(x, y)word -> word -> word1 if x < y (signed)
sgt(x, y)word -> word -> word1 if x > y (signed)
eq(x, y)word -> word -> word1 if x = y
iszero(x)word -> word1 if x = 0
function isOwner(account : word) -> bool {
    let owner : word;
    let result : word;
    assembly {
        owner  := sload(0)
        result := eq(account, owner)
    }
    if (result) {
        return true;
    } else {
        return false;
    }
}

Hashing

OpcodeSignatureDescription
keccak256(offset, size)word -> word -> wordKeccak-256 hash of size bytes starting at memory offset
function storageSlot(account : word) -> word {
    let slot : word;
    assembly {
        mstore(0, account)
        slot := keccak256(0, 32)
    }
    return slot;
}

Memory

OpcodeSignatureDescription
mload(p)word -> wordLoad word from memory at offset p
mstore(p, v)word -> word -> ()Store word v to memory at offset p
mstore8(p, v)word -> word -> ()Store the least significant byte of v at offset p
msize()wordSize of active memory in bytes
mcopy(dst, src, size)word -> word -> word -> ()Copy size bytes from src to dst
memoryguard(n)word -> wordDeclare minimum memory usage to the optimizer

Storage

OpcodeSignatureDescription
sload(slot)word -> wordLoad value from storage slot
sstore(slot, value)word -> word -> ()Store value to storage slot
function transfer(to : word, amount : word) -> () {
    let callerSlot : word;
    let toSlot : word;
    let senderBal : word;
    let recipientBal : word;
    assembly {
        callerSlot   := caller()
        toSlot       := to
        senderBal    := sload(callerSlot)
        recipientBal := sload(toSlot)
        sstore(callerSlot, sub(senderBal, amount))
        sstore(toSlot,     add(recipientBal, amount))
    }
}

Call Data

OpcodeSignatureDescription
calldataload(p)word -> wordRead 32 bytes from calldata at offset p
calldatasize()wordTotal size of calldata in bytes
calldatacopy(dst, src, size)word -> word -> word -> ()Copy calldata into memory

Return Data

OpcodeSignatureDescription
returndatasize()wordSize of the most recent return data
returndatacopy(dst, src, size)word -> word -> word -> ()Copy return data into memory

Code

OpcodeSignatureDescription
codesize()wordSize of the current contract's bytecode
codecopy(dst, src, size)word -> word -> word -> ()Copy bytecode into memory
extcodesize(addr)word -> wordBytecode size of external contract at addr
extcodecopy(addr, dst, src, size)word -> word -> word -> word -> ()Copy external bytecode into memory
extcodehash(addr)word -> wordKeccak-256 hash of external contract's bytecode
datasize(name)string -> wordSize of a named Yul data object
dataoffset(name)string -> wordOffset of a named Yul data object

Control Flow

OpcodeSignatureDescription
stop()()Halt execution successfully
return(offset, size)word -> word -> aReturn size bytes from memory at offset and halt
revert(offset, size)word -> word -> aRevert with size bytes from memory at offset
invalid()()Trigger the invalid opcode; consumes all remaining gas
selfdestruct(addr)word -> ()Destroy the contract and send balance to addr

return and revert have a polymorphic return type a because execution does not resume after them; they can appear in any expression position regardless of the expected type.

Stack and Program Counter

OpcodeSignatureDescription
pop(v)word -> ()Discard a value
pc()wordCurrent program counter value

Gas

OpcodeSignatureDescription
gas()wordRemaining gas for the current execution

External Calls

OpcodeSignatureDescription
call(gas, addr, value, inOffset, inSize, outOffset, outSize)word^7 -> wordCall external contract; returns 1 on success
callcode(gas, addr, value, inOffset, inSize, outOffset, outSize)word^7 -> wordLike call, but runs in the caller's context
delegatecall(gas, addr, inOffset, inSize, outOffset, outSize)word^6 -> wordDelegatecall; preserves caller and value
staticcall(gas, addr, inOffset, inSize, outOffset, outSize)word^6 -> wordRead-only external call

Contract Creation

OpcodeSignatureDescription
create(value, offset, size)word -> word -> word -> wordDeploy new contract; returns address
create2(value, offset, size, salt)word -> word -> word -> word -> wordDeploy with deterministic address

Logging

OpcodeSignatureDescription
log0(offset, size)word -> word -> ()Emit log with no topics
log1(offset, size, topic1)word -> word -> word -> ()Emit log with 1 topic
log2(offset, size, topic1, topic2)word^4 -> ()Emit log with 2 topics
log3(offset, size, topic1, topic2, topic3)word^5 -> ()Emit log with 3 topics
log4(offset, size, topic1, topic2, topic3, topic4)word^6 -> ()Emit log with 4 topics
function emitTransfer(from : word, to : word, amount : word) -> () {
    let transferTopic : word;
    assembly {
        transferTopic := 0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef
        mstore(0x00, amount)
        log3(0x00, 0x20, transferTopic, from, to)
    }
}

Blockchain Context

These opcodes expose information about the current transaction and block. All return word.

OpcodeDescription
address()Address of the executing contract
balance(addr)Ether balance of addr in wei
selfbalance()Ether balance of the executing contract
caller()Address of the direct caller (msg.sender)
callvalue()Ether sent with the call in wei (msg.value)
origin()Address that originated the transaction (tx.origin)
gasprice()Gas price of the transaction
chainid()Chain identifier
basefee()Base fee of the current block
blockhash(blockNumber)Hash of the given block (only last 256 blocks)
coinbase()Beneficiary address of the current block
timestamp()Unix timestamp of the current block
number()Current block number
difficulty()Difficulty of the current block
prevrandao()Previous RANDAO value (post-Merge randomness source)
gaslimit()Gas limit of the current block
function onlyOwner(ownerSlot : word) -> () {
    let owner : word;
    let msgSender : word;
    let isAuth : word;
    assembly {
        owner     := sload(ownerSlot)
        msgSender := caller()
        isAuth    := eq(msgSender, owner)
    }
    if (isAuth) {
        return ();
    } else {
        assembly { revert(0, 0) }
    }
}

Variable Declaration and Assignment

SAIL distinguishes two kinds of mutable state: local variables declared inside function bodies and field variables declared inside contract bodies. Local variables exist only for the duration of a function call; field variables persist in contract storage across transactions.


Local Variable Declaration

A local variable is introduced with the let keyword inside a function body. The type annotation and the initialiser are both optional.

Declaration without annotation or initialiser

let x;

The compiler assigns a fresh type variable to x and infers its type from subsequent uses. The variable must be assigned before it is read; the compiler does not insert a default value.

Declaration with a type annotation

let bal : word;

The type is fixed to word at the point of declaration. The variable is still uninitialized; it must be assigned before use.

function loadBalance(account : word) -> word {
    let bal : word;
    assembly { bal := sload(account) }
    return bal;
}

Declaration with an initialiser

An initialiser provides a value at declaration time. The type may still be omitted and will be inferred from the initialiser expression.

let amount = 100;        // type inferred as word
let fee : word = 3;      // type annotation and initialiser together

Initialised declarations are useful when the right-hand side is an expression whose type would otherwise be ambiguous:

data Result = Ok(word) | Err(word);

function safeTransfer(from : word, to : word, amount : word) -> Result {
    let result = Result.Err(0);   // type inferred as Result from constructor
    let bal : word;
    assembly { bal := sload(from) }
    if (gte(bal, amount)) {
        result = Result.Ok(amount);
    }
    return result;
}

Assignment

Simple assignment

An assignment statement writes a new value to an existing variable or to a contract field. The left-hand side must be an lvalue: a name or an indexed expression.

x = expr;

The type of expr must match the declared type of x.

contract Vault {
    balance : word;

    function deposit(amount : word) -> () {
        let next : word;
        next = balance;
        balance = add(next, amount);
    }
}

Compound assignment

The += and -= operators combine a read, an arithmetic operation, and a write in a single statement.

balance += amount;   // equivalent to balance = balance + amount
balance -= amount;   // equivalent to balance = balance - amount

Compound assignment is most commonly used with contract fields:

contract ERC20 {
    totalSupply : word;
    feePool     : word;

    function mint(amount : word) -> () {
        totalSupply += amount;
        feePool     += div(amount, 100);
    }
}

Contract Field Variables

A field variable is declared inside a contract body with a mandatory type annotation and an optional initialiser. Fields are stored in contract storage and retain their values between calls.

contract Token {
    owner   : word;
    supply  : word;
    paused  : bool;
}

Fields are accessed and assigned by name from any function inside the same contract. A field cannot be accessed from a free function.

contract Token {
    supply : word;

    function mint(amount : word) -> () {
        supply += amount;
    }

    function totalSupply() -> word {
        return supply;
    }
}

Field initialisers

An optional initialiser sets the field's value at deployment time. It is evaluated once when the contract is deployed.

contract Token {
    supply : word = 0;
}

Contextual Assignment

The left-hand side of an assignment may be any expression that denotes an lvalue. When the expected type of the right-hand side is determined by the left-hand side, the contextual constructor shorthand .Constructor can be used on the right-hand side.

data Result = Ok(word) | Err(word);

function main() -> Result {
    let r : Result;
    r = .Ok(0);         // equivalent to Result.Ok(0)
    return r;
}

Conditional Statement

The if statement executes a block conditionally on a boolean expression. An optional else branch handles the false case.

if (condition) {
    // executed when condition is true
}

if (condition) {
    // true branch
} else {
    // false branch
}

Both branches must produce the same type if the if statement appears in a context where a value is expected. When used purely for side effects the types need only be consistent:

contract Token {
    paused : bool;

    function transfer(to : word, amount : word) -> () {
        if (paused) {
            assembly { revert(0, 0) }
        }
    }
}

Note The condition must be of type bool. SAIL does not implicitly convert word to bool. Use an explicit comparison (x != 0) when the condition originates from a word value.


For Loop

The for statement provides a C-style counted loop. Its header has three clauses separated by ;:

for ( ForInitStmt ; Condition ; ForPostStmt ) Body

The condition is any expression of type bool; the loop runs while it is true.

Initialisation clause

The init clause runs once before the first iteration. It may:

  • Declare a new local variable (typed or untyped, with or without an initialiser):

    for (let i = 0; i < 10; i = i + 1) { ... }
    for (let i : word; i < 10; i = i + 1) { ... }
    
  • Assign to an already-declared variable:

    let i : word;
    for (i = 0; i < 10; i = i + 1) { ... }
    
  • Use a compound assignment or a plain expression.

Post-iteration clause

The post clause runs after each iteration, before the condition is re-tested. It follows the same grammar as the init clause. A let binding introduced here creates a fresh variable scoped to the body of that iteration only:

for (i = 0; i <= 0; let j = 1) {
    s = j;    // j is in scope here and rebound on every iteration
    i = i + 1;
}

Complete examples

Accumulate a sum from 1 to 10:

import std.{Num, Add, Sub, Eq, Ord, Bounded, Typedef, le};

contract Sum {
    function main() -> word {
        let s = 0;
        for (let i = 1; i <= 10; i = i + 1) { s = s + i; }
        return s;    // 55
    }
}

Loop variable declared before the for:

contract Sum {
    function main() -> word {
        let i : word;
        let s = 0;
        for (i = 1; i <= 10; i = i + 1) { s = s + i; }
        return s;
    }
}

Loop variable shadows an outer declaration:

contract Shadow {
    function main() -> word {
        let i = 100;
        let s = 0;
        for (let i = 1; i <= 10; i = i + 1) { s = s + i; }
        // i is 100 here again
        return s;
    }
}

Nested if inside a for body:

contract ForInner {
    function main() -> word {
        let result = 0;
        for (let height = 0; height < 7; height = height + 1) {
            if (true) { result = height; } else {}
        }
        return result;
    }
}

Scope rules

A variable declared in the init clause is in scope for the condition expression, the post-iteration clause, and the entire body.

A variable declared in the post clause is in scope only for the body of the current iteration — it is re-bound at the start of each subsequent one.

The loop body is its own block; declarations inside it do not escape to the enclosing function.

Note The condition must be of type bool. SAIL does not implicitly convert word to bool. Use an explicit comparison (le(i, 10) or i <= 10) when comparing integer counters.


Expression Statements

Any expression may appear as a statement. The expression is evaluated for its side effects and the result is discarded. This is the standard way to call a function whose return type is ().

function emitTransfer(from : word, to : word, amount : word) -> () {
    assembly {
        mstore(0x00, amount)
        log3(0x00, 0x20, 0xddf252ad, from, to)
    }
}

function main(to : word, amount : word) -> () {
    emitTransfer(caller(), to, amount);    // expression statement: result () is discarded
}

Scope and Shadowing

Local variables are in scope from their declaration to the end of the enclosing block. A variable declared in an inner block shadows an outer declaration of the same name for the duration of that block.

function computeFee(amount : word) -> word {
    let fee : word = 1;
    {
        let fee : word = div(amount, 100);   // shadows outer fee inside this block
    }
    return fee;                              // refers to the outer fee; returns 1
}

Note The compiler uses unique identifiers internally, so shadowing is safe and does not cause name collisions in the generated code.

Functions

A function definition introduces a named computation that takes zero or more typed parameters and returns a value of a declared type. Functions can be defined at the top level of a source file, called free functions, or inside a contract body.

function name(param1 : Type1, param2 : Type2) -> ReturnType {
    // body
}

Every top-level function must carry a complete type signature: every parameter must be annotated with its type, and the return type must be provided after ->. The compiler rejects any top-level definition that omits an annotation.

Note The complete-annotation requirement applies to free functions and contract methods. It does not apply to lambda expressions or to local bindings inside a function body, where the compiler infers types from context.


Parameters

Parameters are declared as a comma-separated list enclosed in parentheses. Each parameter has the form name : Type.

function transfer(to : word, amount : word) -> () {
    let bal : word;
    assembly { bal := sload(caller()) }
    assembly { sstore(caller(), sub(bal, amount)) }
    assembly { sstore(to, add(sload(to), amount)) }
}

A function that takes no arguments is written with an empty parameter list:

function sender() -> word {
    let s : word;
    assembly { s := caller() }
    return s;
}

Return Type

The return type follows the parameter list after ->. Every top-level function must declare its return type explicitly.

A function that returns no meaningful value uses the unit type ():

function emitTransfer(from : word, to : word, amount : word) -> () {
    assembly {
        mstore(0x00, amount)
        log3(0x00, 0x20, 0xddf252ad, from, to)
    }
}

Every execution path through the body must end with a return statement whose expression has the declared return type.


Free Functions

A function defined outside any contract body is called a free function. Free functions are visible throughout the file in which they are defined and can be imported by other modules.

function isContract(addr : word) -> bool {
    let size : word;
    assembly { size := extcodesize(addr) }
    return gt(size, 0);
}

contract Token {
    function onlyContract(addr : word) -> () {
        if (isContract(addr)) {
            return ();
        } else {
            assembly { revert(0, 0) }
        }
    }
}

Polymorphic Functions

A function that works uniformly over multiple types can be made polymorphic with a forall quantifier placed before the function keyword. The quantifier lists the type variables that appear in the signature.

forall a . function identity(x : a) -> a {
    return x;
}

forall a b . function fst(p : (a, b)) -> a {
    match p {
    | (x, y) => return x;
    }
}

Type variables introduced by forall are instantiated at each call site. The compiler specializes the function for every concrete type combination that appears in the program.

Note Polymorphic functions are monomorphized by the specializer before code generation. Each distinct instantiation produces a separate function in the output. A call to identity with a word argument becomes identity$word in the compiled output. No polymorphism survives to the generated Yul.


Constrained Functions

A function may require that one or more of its type variables satisfy a type class constraint. Constraints are written after the type variable list, separated from the function keyword by =>.

forall a . class a:Checked {
    function checkedAdd(x : a, y : a) -> a;
}

forall t . t:Checked => function safeTransfer(from : word, to : word, amount : t) -> t {
    return Checked.checkedAdd(amount, amount);
}

Multiple constraints on different type variables are separated by commas:

forall a b . a:Eq, b:Eq => function transfersEqual(x : (a, b), y : (a, b)) -> bool {
    match (x, y) {
    | ((xa, xb), (ya, yb)) => return Eq.eq(xa, ya);
    }
}

At each call site the compiler checks that the supplied types satisfy all listed constraints. If no instance is found a type error is reported.


Recursive Functions

A function may call itself recursively. The compiler adds the function name to the typing context before checking the body.

function sumBalances(slot : word, count : word) -> word {
    if (eq(count, 0)) {
        return 0;
    } else {
        let bal : word;
        assembly { bal := sload(slot) }
        return add(bal, sumBalances(add(slot, 1), sub(count, 1)));
    }
}

Mutually recursive functions are also supported. The compiler detects mutual dependencies automatically through strongly-connected-component analysis and type-checks the group as a unit. Both functions must be defined in the same file.

data TxStatus = Pending | Confirmed;

function isPending(s : TxStatus) -> bool {
    match s {
    | TxStatus.Pending   => return isNotConfirmed(s);
    | TxStatus.Confirmed => return false;
    }
}

function isNotConfirmed(s : TxStatus) -> bool {
    match s {
    | TxStatus.Confirmed => return isPending(TxStatus.Pending);
    | TxStatus.Pending   => return true;
    }
}

Contract Functions

Functions defined inside a contract body have access to the contract's field variables. They follow the same signature rules as free functions.

contract ERC20 {
    totalSupply : word;

    function mint(amount : word) -> () {
        totalSupply = add(totalSupply, amount);
    }

    function getTotalSupply() -> word {
        return totalSupply;
    }
}

Contract functions may read and write field variables. Free functions can only operate on their parameters and locally declared variables.


Pattern Matching in Function Bodies

Functions may use match statements to deconstruct algebraic data type values.

data Result = Ok(word) | Err(word);

function unwrapOrZero(r : Result) -> word {
    match r {
    | Result.Ok(v)  => return v;
    | Result.Err(_) => return 0;
    }
}

Patterns may be nested arbitrarily. The wildcard pattern _ matches any value without binding it. The compiler checks that the set of patterns covers all possible constructors of the scrutinee type and reports an error for incomplete matches.


Assembly in Function Bodies

Functions may contain assembly blocks to access EVM opcodes directly. Inside an assembly block, Yul syntax is used. Variables declared in the surrounding SAIL scope are accessible by name inside the block.

function loadBalance(account : word) -> word {
    let bal : word;
    assembly {
        bal := sload(account)
    }
    return bal;
}

Variables assigned inside an assembly block must be declared with let in the enclosing SAIL scope before the block opens. The type of such variables must be word, since Yul operates exclusively on 256-bit machine words.

Warning The type checker cannot verify the semantic correctness of Yul code. Incorrect assembly can produce contracts that silently compute wrong results or revert unexpectedly. Minimize the size of assembly blocks and document any non-obvious invariants.


Missing Annotation Error

Omitting a parameter type or the return type on a top-level function is a compile-time error. The compiler reports the offending signature and explains what is missing.

// Error: parameter 'x' has no type annotation.
function bad(x) -> word {
    return x;
}
Top-level function must have complete type annotations:
  bad(x) -> word
Annotate every parameter (name : Type) and provide a return type (-> Type).

Omitting the return type is equally rejected:

// Error: return type is missing.
function alsobad(x : word) {
    return x;
}

Type inference remains available inside function bodies for local variables and intermediate expressions. Only the function signature itself requires explicit annotations at the top level.

Assembly Blocks

SAIL allows inline assembly blocks that embed Yul statements directly in a function body. Assembly blocks provide unrestricted access to EVM opcodes and are the primary mechanism for operations that SAIL has no built-in syntax for, such as storage reads and writes, event emission, and ABI encoding helpers.

function loadBalance(account : word) -> word {
    let bal : word;
    assembly {
        bal := sload(account)
    }
    return bal;
}

Warning The type checker cannot verify the semantic correctness of Yul code. Incorrect assembly can produce contracts that silently compute wrong results or revert unexpectedly. Minimize the size of assembly blocks and document any non-obvious invariants.


Yul Sublanguage

The contents of an assembly { … } block are written in Yul, the low-level intermediate language used by the Solidity compiler. Yul operates exclusively on 256-bit machine words, the native value type of the EVM.

SAIL variables whose type is word are accessible by name inside the block. Variables of other SAIL types cannot be referenced directly in Yul.


Variable Declaration

Inside an assembly block, Yul variables are declared with let and assigned with :=. Yul let is separate from SAIL let: Yul variables exist only within the enclosing Yul block.

assembly {
    let ptr := mload(0x40)   // Yul variable; exists only in this block
}

Multiple names may appear on the left-hand side of a single let to receive the multiple return values of a built-in opcode:

assembly {
    let success, returndata := call(gas(), target, value, argOffset, argSize, 0, 0)
}

Assignment

An assignment in Yul uses :=. The left-hand side must be either a Yul variable or a SAIL word variable in scope.

function storeBalance(account : word, amount : word) -> () {
    assembly {
        sstore(account, amount)   // EVM opcode: write amount to storage slot account
    }
}

Assigning to a SAIL variable communicates a result back to the SAIL scope:

function getFreeMemPtr() -> word {
    let ptr : word;
    assembly {
        ptr := mload(0x40)
    }
    return ptr;
}

Conditionals

Yul provides an if statement that executes a block when a condition is non-zero. There is no else branch in Yul; use switch for multi-way dispatch.

assembly {
    if iszero(success) {
        revert(0, 0)
    }
}

Switch

The Yul switch statement dispatches on a value. Each case arm matches a literal. An optional default arm matches any value not handled by a case.

assembly {
    switch selector
    case 0x70a08231 {
        // balanceOf selector
    }
    case 0xa9059cbb {
        // transfer selector
    }
    default {
        revert(0, 0)
    }
}

For Loops

Yul's for statement provides a general loop with an initialisation block, a condition expression, a post-iteration block, and a body block.

contract ERC20 {
    function sumSlots(startSlot : word, count : word) -> word {
        let endSlot : word;
        let total   : word;
        assembly {
            endSlot := add(startSlot, count)
        }
        assembly {
            for { let i := startSlot } lt(i, endSlot) { i := add(i, 1) }
            {
                total := add(total, sload(i))
            }
        }
        return total;
    }
}

The initialisation block may be empty ({}). break, continue, and leave control loop execution:

StatementEffect
breakExit the innermost for loop immediately
continueSkip to the post-iteration block of the loop
leaveReturn from the enclosing Yul function

Nested Blocks

An assembly block may contain nested Yul blocks { … }. Variables declared inside a nested block are not visible outside it.

assembly {
    {
        let tmp := mload(0x00)   // tmp is scoped to this inner block
        mstore(0x20, tmp)
    }
    // tmp is not in scope here
}

Accessing SAIL Variables

Only SAIL variables of type word can be read or written inside an assembly block. This restriction is enforced at compile time: every SAIL name that appears inside Yul must resolve to a variable or parameter whose type is word. Parameters, local variables declared with let, and contract field variables are all subject to this rule.

function transfer(account : word, amount : word) -> () {
    let bal : word;
    assembly {
        bal    := sload(account)      // account and bal are SAIL word variables
        sstore(account, sub(bal, amount))
    }
}

Note Variables assigned inside an assembly block must have been declared with let in the enclosing SAIL scope before the block opens.

Rejected: parameter of type bool

A parameter of type bool cannot be named inside Yul. The compiler reports a type mismatch because Yul has no boolean type and cannot represent the value.

// Error: bool is not word.
function bad(paused : bool) -> () {
    assembly {
        sstore(0, paused)
    }
}
Types: bool and word do not unify
 - in: function bad (paused : bool) -> () { ... }

To work with a bool value inside an assembly block, convert it to a word first using an explicit conditional in SAIL.

Rejected: variable of an algebraic data type

A local variable whose type is a user-defined data type is equally rejected. Sum and product types are not EVM words and have no direct Yul representation.

data Result = Ok(word) | Err(word);

// Error: Result is not word.
function bad(r : Result) -> word {
    let res : word;
    assembly {
        res := r
    }
    return res;
}
Types: Result and word do not unify
 - in: function bad (r : Result) -> word { ... }

To operate on structured values from assembly, extract the relevant word fields first in SAIL, pass them as word parameters or local variables, and perform the Yul computation on those.


Common EVM Operations

The following table lists the EVM opcodes most frequently used in assembly blocks. For the full list see the EVM opcode reference.

OpcodeDescription
add(a, b)256-bit addition (wrapping)
sub(a, b)256-bit subtraction (wrapping)
mul(a, b)256-bit multiplication (wrapping)
div(a, b)256-bit unsigned integer division
mod(a, b)256-bit unsigned modulo
mload(p)Read 32 bytes from memory at offset p
mstore(p, v)Write 32 bytes v to memory at offset p
sload(k)Read storage slot k
sstore(k, v)Write value v to storage slot k
caller()Address of the message sender
callvalue()Value (in wei) sent with the call
calldataload(p)Read 32 bytes from calldata at offset p
iszero(x)1 if x == 0, else 0
revert(p, s)Abort execution; return s bytes from p
return(p, s)Halt execution; return s bytes from p

Assembly and the Hull IR

Assembly blocks pass through the compilation pipeline unchanged. The SAIL compiler includes them verbatim in the Hull IR, and the Yul code generator reproduces them without transformation. This means the programmer has full control over the generated Yul but also bears full responsibility for its correctness.

Datatypes

SAIL provides algebraic data types (ADTs) for defining structured values. An ADT declares a named type together with one or more constructors. Each constructor describes one way to build a value of that type and may carry zero or more fields of arbitrary types.

data Option(a) = None | Some(a);

Data types may be defined at the top level of a source file or inside a contract body.


Enumeration Types

The simplest kind of algebraic data type has only nullary constructors with no fields. Such a type acts as a finite enumeration.

data TokenStatus = Active | Paused | Deprecated;

Each constructor is a distinct value of the type. Enumerations are commonly used wherever Solidity uses enum.

contract Registry {
    data TokenStatus = Active | Paused | Deprecated;

    function statusCode(s : TokenStatus) -> word {
        match s {
        | TokenStatus.Active     => return 1;
        | TokenStatus.Paused     => return 2;
        | TokenStatus.Deprecated => return 0;
        }
    }

    function main() -> word {
        return statusCode(TokenStatus.Active);
    }
}

Constructors with Fields

A constructor can carry one or more fields. The field types are listed in parentheses, separated by commas.

data TxStatus  = Pending | Settled | Failed;
data TxOutcome = Success(TxStatus) | Revert(TxStatus) | Unknown;

A constructor with fields is applied like a function: TxOutcome.Success(TxStatus.Settled) produces a value of type TxOutcome wrapping a value of type TxStatus.

Fields are extracted by pattern matching; there is no record-style field access. The pattern mirrors the constructor application:

function outcomeCode(x : TxOutcome) -> word {
    match x {
    | TxOutcome.Success(TxStatus.Settled) => return 1;
    | TxOutcome.Revert(TxStatus.Failed)   => return 2;
    | _                                   => return 0;
    }
}

Parametric Data Types

A data type can be parameterized by one or more type variables, making it a generic or parametric type. The type variables are listed in parentheses after the type name.

data Option(a) = None | Some(a);

Here a is a type variable. Option(word) is the type of optional words, Option(bool) is the type of optional booleans, and so on. The type variable a may appear in the field types of any constructor.

contract Option {
    data Option(a) = None | Some(a);

    function just(x : a) -> Option(a) {
        return Option.Some(x);
    }

    function maybe(default : word, opt : Option(word)) -> word {
        match opt {
        | Option.None    => return default;
        | Option.Some(x) => return x;
        }
    }

    function main() -> word {
        return maybe(0, Option.Some(42));
    }
}

The type checker verifies that every use of a parametric type supplies the correct number of type arguments. Applying Option to two arguments, for instance, is a type error.


Nested Pattern Matching

Patterns may be nested to arbitrary depth to match inside multiple layers of constructors in a single arm.

data Option(a) = None | Some(a);

// Unwrap an approval amount nested in two Option layers.
function resolveApproval(outer : Option(Option(word))) -> Option(word) {
    match outer {
    | Option.Some(Option.Some(x)) => return Option.Some(x);
    | _                           => return Option.None;
    }
}

The wildcard _ matches any value without binding it. It can appear at any depth in a pattern.


Opaque Wrappers

A single-constructor, single-field type is the standard idiom for introducing a distinct type that is represented by an existing type at runtime. This is similar to Haskell's newtype or Solidity's user-defined value types.

data uint256 = uint256(word);

uint256 is a type distinct from word even though it carries exactly one word field. The type checker treats them as incompatible, preventing accidental mixing. The wrapper is removed during compilation: uint256 values occupy exactly one EVM word, just like word.

Wrapping and unwrapping are done explicitly with the constructor and a pattern:

function wrap(x : word) -> uint256 {
    return uint256(x);
}

function unwrap(x : uint256) -> word {
    match x {
    | uint256(w) => return w;
    }
}

The standard library defines uint256, bytes4, bytes32, and address this way, each wrapping word.


Phantom Type Parameters

A type parameter that does not appear in any constructor field is called a phantom type parameter. It carries no runtime information but allows the type system to distinguish values that would otherwise be identical.

// 'a' is a phantom type parameter: the constructor Proxy carries no field of type 'a'.
data Proxy(a) = Proxy;

Proxy(word) and Proxy(bool) are distinct types at compile time but produce the same runtime value. Phantom types are useful for passing type information to functions without allocating extra memory.

forall a . class a:MemoryType {
    function size(prx : Proxy(a)) -> word;
}

instance word:MemoryType {
    function size(prx : Proxy(word)) -> word {
        return 32;
    }
}

The Proxy(a) argument lets the caller select which MemoryType instance to use without passing an actual value of type a.

Note Because phantom type parameters leave the constructor's result type partially undetermined, the type checker requires an explicit type annotation whenever a Proxy value is constructed in a context where the type cannot be inferred from surrounding expressions. Use the expression annotation form Proxy : Proxy(word) to resolve the ambiguity.


Tuples

SAIL has built-in support for product types (tuples). A tuple type is written as a parenthesised, comma-separated list of component types. Tuple values are written the same way.

function swap(p : (word, bool)) -> (bool, word) {
    match p {
    | (x, b) => return (b, x);
    }
}

Tuples of more than two elements are right-nested pairs internally. The type (word, bool, word) is represented as pair(word, pair(bool, word)).

The unit type () is the zero-element tuple. It carries no information and is used as the return type of functions that exist only for their side effects.

function storeBalance(account : word, amount : word) -> () {
    assembly { sstore(account, amount) }
}

Note Tuple patterns may appear anywhere a pattern is expected, including inside constructor patterns:

forall a b . instance Zero:Nth((a, b), a) {
    function nth(idx : Proxy(Zero), tup : (a, b)) -> a {
        match tup {
        | (x, _) => return x;
        }
    }
}

Contextual Constructor Syntax

In a context where the expected type is known, the module qualifier can be omitted from a constructor name by prefixing it with .. The compiler resolves the constructor to the appropriate type automatically.

data Option(a) = None | Some(a);

function just(x : word) -> Option(word) {
    return .Some(x);   // equivalent to Option.Some(x)
}

function nothing() -> Option(word) {
    return .None;      // equivalent to Option.None
}

The same shorthand works in patterns:

function isNone(o : Option(word)) -> bool {
    match o {
    | .None    => return true;
    | .Some(_) => return false;
    }
}

The compiler reports an error if the expected type is not known or if the constructor name is ambiguous.


Type Synonyms

A type synonym introduces a new name for an existing type. Synonyms are purely a compile-time device: the compiler expands them before type checking and they leave no trace in the generated code.

type Int   = word;
type Point = pair(Int, Int);

function makePoint(x : Int, y : Int) -> Point {
    return (x, y);
}

function getX(p : Point) -> Int {
    match p {
    | (x, _) => return x;
    }
}

Like data types, synonyms can have type parameters:

type Map(k, v) = pair(k, v);   // toy example

Warning Recursive type synonyms are not allowed. A synonym must not refer directly or indirectly to itself. Attempting to define type A = B and type B = A simultaneously is a compile-time error.


Runtime Encoding

Algebraic data types compile to a uniform binary encoding in the generated Hull/Yul code.

Sum types (types with more than one constructor) are encoded as nested binary sums using inl (left injection) and inr (right injection). A type with n constructors becomes a right-nested binary tree of depth ⌈log₂ n⌉. For example, a three-constructor type data T = A | B | C is encoded as:

A  →  inl ()
B  →  inr (inl ())
C  →  inr (inr ())

Product types (constructor fields, tuples) are encoded as right-nested pairs. The three-field constructor data T = T(word, bool, word) becomes pair(word, pair(bool, word)).

This uniform encoding is what the match compiler and the Hull back-end operate on. It is not visible at the SAIL level.

Parametric Polymorphism

A parametric polymorphic function works uniformly over any type. The caller does not need to know which concrete type is used; the function behaves identically for all instantiations. SAIL supports parametric polymorphism through forall quantifiers in function signatures.


Type Variables and forall

A type variable is a placeholder for any concrete type. To introduce type variables in a function signature, place a forall quantifier before the function keyword. The quantifier lists the type variable names separated by spaces and terminated by a period.

forall a . function id(x : a) -> a {
    return x;
}

The signature forall a . a -> a states that id accepts one argument of any type a and returns a value of the same type a. The same type variable a appears in both the parameter and the return position, so the caller knows that the output type equals the input type.

Multiple type variables are listed in the same quantifier:

forall a b . function fst(p : (a, b)) -> a {
    match p {
    | (x, y) => return x;
    }
}

forall a b . function snd(p : (a, b)) -> b {
    match p {
    | (x, y) => return y;
    }
}

Type variables may appear in parameter types, the return type, and in type arguments to other type constructors:

data Option(a) = None | Some(a);

forall a . function just(x : a) -> Option(a) {
    return Option.Some(x);
}

forall a . function fromOption(default : a, opt : Option(a)) -> a {
    match opt {
    | Option.None    => return default;
    | Option.Some(x) => return x;
    }
}

Call-Site Instantiation

At each call site, the compiler determines the concrete type for every type variable from the types of the supplied arguments. No explicit type application is needed; the inference engine handles instantiation automatically.

forall a . function id(x : a) -> a {
    return x;
}

contract C {
    function main() -> word {
        return id(42);      // a is instantiated to word
    }
}

Each combination of concrete types produces a distinct specialization during compilation. A call to id with a word argument becomes id$word in the generated output, and a call with a pair type becomes a separate function with a distinct name. No polymorphism survives to the generated Yul.


Polymorphic Functions with Pattern Matching

Polymorphic functions frequently deconstruct structured values through pattern matching. The match compiler operates on the inferred type at each call site after instantiation.

data Pair(a, b) = Pair(a, b);

forall a b . function fst(p : Pair(a, b)) -> a {
    match p {
    | Pair(x, y) => return x;
    }
}

forall a b . function snd(p : Pair(a, b)) -> b {
    match p {
    | Pair(x, y) => return y;
    }
}

function addAmounts(x : word, y : word) -> word {
    let res : word;
    assembly { res := add(x, y) }
    return res;
}

// A transfer record holds (sender address, amount).
function totalTransferred(p : Pair(word, word)) -> word {
    return addAmounts(fst(p), snd(p));
}

contract ERC20 {
    function main() -> word {
        return totalTransferred(Pair(100, 200));
    }
}

Mutually Recursive Polymorphic Functions

Polymorphic functions may call each other recursively. The compiler resolves mutual dependencies through strongly-connected-component analysis and checks all functions in a group together. Both functions must be defined in the same file.

data Option(a) = None | Some(a);

forall a . function orElse(primary : Option(a), fallback : Option(a)) -> Option(a) {
    match primary {
    | Option.Some(v) => return Option.Some(v);
    | Option.None    => return pickFirst(fallback, primary);
    }
}

forall a . function pickFirst(x : Option(a), y : Option(a)) -> Option(a) {
    match x {
    | Option.Some(v) => return orElse(x, y);
    | Option.None    => return y;
    }
}

The Subsumption Test

When a function carries a forall annotation, the compiler verifies that the body is at least as polymorphic as the declared signature. This check is called the subsumption test. It prevents signatures that claim more generality than the body actually provides.

The test works in three steps:

  1. The declared type is skolemised: each type variable is replaced by a fresh rigid constant that cannot be unified with any other type.
  2. The body is type-checked independently, producing an inferred type.
  3. The inferred type must unify with the skolemised declared type. If a rigid constant would need to be unified with a concrete type (such as word), the body is not polymorphic enough and the compiler reports an error.

Error: return type is more polymorphic than the body

The most common subsumption failure occurs when the annotation promises that the function works for any type a, but the body always produces a specific type such as word.

// Error: the body always returns word, but the annotation says a.
forall a . function wrong(x : word) -> a {
    return x;
}
Type not polymorphic enough! The annotated type is:
forall a . word -> a
but the infered type is:
word -> word
in:
forall a . function wrong (x : word) -> a

The body return x has type word -> word because x is declared as word. The skolemised declared type requires the result to be a rigid variable a, which cannot be unified with word. The compiler rejects the definition.

Error: wrong type variable in the return position

A function that swaps the return type variable is caught by the same test.

// Error: the body returns the first component (type a),
//        but the annotation declares the return type as b.
forall a b . function fst(p : (a, b)) -> b {
    match p {
    | (x, y) => return x;
    }
}
Type not polymorphic enough! The annotated type is:
forall a b . (a, b) -> b
but the infered type is:
forall $t . ($t, $t) -> $t
in:
forall a b . function fst (p : (a, b)) -> b

The body returns x, which has the type of the first component. The inferred type therefore unifies both components and the return, making them all the same variable $t. The skolemised declared type requires the return to be the rigid variable b (the second component), which is distinct from a. The unification fails and the compiler reports the error.

Error: type variable forced to word by an assembly block

Assembly blocks operate exclusively on word values. If the body uses a type variable as if it were word inside an assembly block, the inference engine forces that variable to word, making the function monomorphic in the body while the annotation still declares a type variable.

// Error: the assembly block forces a to word,
//        so the body is monomorphic.
forall a . function double(x : a) -> a {
    let res : word;
    assembly { res := add(x, x) }
    return res;
}
Type not polymorphic enough! The annotated type is:
forall a . a -> a
but the infered type is:
word -> word
in:
forall a . function double (x : a) -> a

The correct way to write this function is to restrict the parameter type to word explicitly and drop the forall:

function double(x : word) -> word {
    let res : word;
    assembly { res := add(x, x) }
    return res;
}

If a computation must be polymorphic in a type class sense (working for all types that support addition), use a constrained type variable instead of an assembly block. See the Type Classes section for details.


Specialization and Naming

The compiler eliminates all polymorphism before code generation through a process called specialization (or monomorphization). Every call site that instantiates a polymorphic function at a concrete type combination produces a separate function definition in the output. The compiler chooses names of the form name$Type for each specialization, for example id$word or fst$word$bool.

This means:

  • There is no runtime representation of type variables.
  • Each specialized version is compiled independently and can be optimized on its own.
  • Whole-program compilation is required: the specializer must see all call sites to determine which instantiations to generate.

Note A polymorphic function that is never called is not emitted at all. Only the specializations that are actually needed by the program appear in the compiled output.

Type Classes

A type class defines a named set of operations that a type must implement. Any type that provides implementations for all required operations is said to be an instance of the class. Type classes enable constrained polymorphism: a function may be parameterized over a type variable and simultaneously require that the variable belongs to one or more classes.


Class Declarations

A class declaration introduces a class name, a main type variable, and zero or more method signatures. The forall quantifier is required when the class declaration introduces type variables.

forall a . class a:Eq {
    function eq(x : a, y : a) -> bool;
    function ne(x : a, y : a) -> bool;
}

The type variable a that appears immediately before the colon is the main type argument of the class. Every instance must supply a concrete type for this variable.

A class with no methods defines a pure marker class:

forall a . class a:Serializable {}

Superclass Constraints

A class may require that its main type argument already belongs to another class. This constraint is called a superclass constraint and is written before the class keyword with =>.

forall a . a:Eq => class a:Ord {
    function lt(x : a, y : a) -> bool;
    function lte(x : a, y : a) -> bool;
}

Any instance of Ord must also be an instance of Eq. The compiler verifies this at each instance declaration. If a function requires a:Ord, the constraint a:Eq is automatically available without listing it explicitly.


Instance Declarations

An instance declaration provides implementations for all methods of a class for a specific type. The instance head names the class and supplies a concrete type for the main type variable.

instance word:Eq {
    function eq(x : word, y : word) -> bool {
        let res : word;
        assembly { res := eq(x, y) }
        return res;
    }
    function ne(x : word, y : word) -> bool {
        let res : word;
        assembly { res := iszero(eq(x, y)) }
        return res;
    }
}

A polymorphic instance applies to a family of types. The forall quantifier lists the type variables that appear in the instance head:

data Pair(a, b) = Pair(a, b);

forall a b . a:Eq, b:Eq => instance Pair(a, b):Eq {
    function eq(x : Pair(a, b), y : Pair(a, b)) -> bool {
        match x, y {
        | Pair(xa, xb), Pair(ya, yb) =>
            return Eq.eq(xa, ya);
        }
    }
    function ne(x : Pair(a, b), y : Pair(a, b)) -> bool {
        return Eq.ne(x, y);
    }
}

Calling Class Methods

Class methods are called with a qualified name of the form ClassName.method. The compiler resolves the correct instance from the types of the arguments:

data Option(a) = None | Some(a);

forall a . a:Eq => function senderMatches(sender : a, expected : Option(a)) -> bool {
    match expected {
    | Option.None    => return false;
    | Option.Some(e) => return Eq.eq(sender, e);
    }
}

Overlapping Instances

SAIL does not support overlapping instances. Two instances overlap when the same type can match both instance heads. The compiler reports an error at the second declaration:

data Box(a) = Box(word);
forall a . class a:C {}

forall a . instance Box(a):C {}

// Error: overlaps with the more general instance above.
instance Box(word):C {}
Overlapping instances are not supported
instance:
Box(word) : C
overlaps with:
Box(?$3) : C

Main and Weak Type Arguments

When a class has more than one type parameter, the parameter immediately before the colon in the class head is called the main type argument. The remaining parameters, listed after the class name in parentheses, are called weak type arguments.

//          main ──┐         ┌── weak
forall a b . class a:Convert(b) {
    function convert(x : a) -> b;
}

The distinction matters for instance resolution and for the three soundness conditions the compiler enforces.

Main type argument (a in a:Convert(b)): used as the primary key for instance lookup. The compiler selects an instance by matching the main type first. It must be determinable independently of the weak arguments.

Weak type arguments (b in a:Convert(b)): represent additional types involved in the relationship. They may be determined by the main type argument through the coverage condition, but they cannot introduce type variables that are unconstrained at the call site.

Example: weak argument determined by main type

In the following instance, the main type Wei uniquely determines the weak type Ether. The instance is well formed because the weak type variable is replaced by a concrete type:

forall a b . class a:Convert(b) {
    function convert(x : a) -> b;
}

data Wei   = Wei(word);
data Ether = Ether(word);

instance Wei:Convert(Ether) {
    function convert(x : Wei) -> Ether {
        match x {
        | Wei.Wei(w) =>
            let e : word;
            assembly { e := div(w, 1000000000000000000) }
            return Ether.Ether(e);
        }
    }
}

contract C {
    function main() -> word {
        let result = Convert.convert(Wei.Wei(2000000000000000000));
        match result {
        | Ether.Ether(v) => return v;
        }
    }
}

Instance Soundness Conditions

To guarantee that instance resolution terminates and remains coherent, the compiler enforces three conditions on every instance declaration. Violating any of them is a compile-time error. Each condition can be relaxed by a pragma when a specific instance is known to be safe.

Coverage Condition

Every type variable that appears in a weak type argument position must be determined by the main type argument. The set of type variables bound by the main type must cover all type variables bound by the weak types.

Rejected example

data Box(a) = Box(word);
forall a b . class a:MyClass(b) {}

// Error: b appears only in the weak position; Box(a) does not determine b.
forall a b . instance Box(a):MyClass(b) {}
Coverage condition fails for class:
MyClass
- the type:
Box(a)
does not determine:
b

Accepted example

Replacing the unconstrained variable b with a concrete type eliminates the violation:

forall a . instance Box(a):MyClass(word) {}

Patterson Condition

For each constraint in the instance context, the measure of the constraint must be strictly smaller than the measure of the instance head. The measure of a predicate is the total number of type constructors and type variables it contains, counting repetitions. Each type constructor or type variable contributes 1 to the measure, regardless of nesting.

This condition prevents instance search from entering an infinite loop when the same type class is used in both the context and the head.

Rejected example

forall a . class a:C1 {}
forall a . class a:C2 {}

// Context: U:C1 has measure 2, U:C2 has measure 2, total 4.
// Head:    U:C1 has measure 2.
// Context measure (4) is not strictly smaller than head measure (2).
forall U . U:C1, U:C2 => instance U:C1 {}
Instance
U : C1
does not satisfy the Patterson conditions.

Accepted example

Wrapping the main type in a constructor increases the head measure so that each context constraint is strictly smaller:

data Wrap(a) = Wrap(a);

// Context: U:C1 has measure 2.
// Head:    Wrap(U):C1 has measure 3 (Wrap + U + C1 name).
// 2 < 3, so the Patterson condition holds.
forall U . U:C1 => instance Wrap(U):C1 {}

Bound Variable Condition

Every type variable that appears in the instance context must also appear in the instance head. A type variable present only in the context cannot be determined from the types at the call site, making instance resolution ambiguous.

Rejected example

data Box(a) = Box(word);
forall a . class a:Eq {}
forall a b . class a:Container(b) {}

// Error: c appears in the context constraint c:Eq
//        but not in the instance head Box(a):Container(a).
forall a c . c:Eq => instance Box(a):Container(a) {}
Bounded variable condition fails!

Accepted example

Remove the unused variable from the context, or include it in the head:

// No context needed.
forall a . instance Box(a):Container(a) {}

// Or: bring c into the head through the weak argument.
forall a c . c:Eq => instance Box(a):Container(c) {}

Pragmas

A pragma is a compiler directive that relaxes one of the three instance soundness conditions. Pragmas are written at the top level of a source file, before any declarations.

There are three pragmas, one per condition:

Pragma keywordCondition disabled
no-coverage-conditionCoverage condition
no-patterson-conditionPatterson condition
no-bounded-variable-conditionBound variable condition

Each pragma has two forms:

// Disable for a specific list of classes (comma-separated).
pragma no-coverage-condition ClassName1, ClassName2;

// Disable globally for all classes in this file.
pragma no-coverage-condition;

Pragmas apply only to the file in which they appear. Importing a file does not inherit its pragmas, and the importing file's pragmas do not affect the imported declarations.

Warning Disabling these conditions can allow instances that cause the compiler's instance resolution to loop or produce incoherent results. Use pragmas only when you understand the implications for the specific class and instance involved.

pragma no-coverage-condition

Disables the coverage check for the listed classes. Use this when a weak type argument is deliberately left undetermined by the main type, for example in open type-indexed families where the relationship is established by context rather than by the instance itself.

pragma no-coverage-condition MyClass;

data Box(a) = Box(word);
forall a b . class a:MyClass(b) {}

// Accepted: coverage condition is disabled for MyClass.
forall a b . instance Box(a):MyClass(b) {}

Without the pragma, this declaration would produce:

Coverage condition fails for class:
MyClass
- the type:
Box(a)
does not determine:
b

pragma no-patterson-condition

Disables the Patterson measure check for the listed classes. Use this for class hierarchies where the instance search is known to terminate through structural arguments not captured by the simple measure metric.

pragma no-patterson-condition C1;

forall a . class a:C1 {}
forall a . class a:C2 {}

// Accepted: Patterson condition is disabled for C1.
forall U . U:C1, U:C2 => instance U:C1 {}

Without the pragma, this declaration would produce:

Instance
U : C1
does not satisfy the Patterson conditions.

pragma no-bounded-variable-condition

Disables the bound variable check for the listed classes. Use this when a context variable is intentionally existential, meaning it is chosen by the instance rather than derived from the call site.

pragma no-bounded-variable-condition Container;

data Box(a) = Box(word);
forall a . class a:Eq {}
forall a b . class a:Container(b) {}

// Accepted: bound variable condition is disabled for Container.
forall a c . c:Eq => instance Box(a):Container(a) {}

Without the pragma, this declaration would produce:

Bounded variable condition fails!

Combining Pragmas

Multiple pragmas may appear in the same file and may target the same class from different directives. All specified conditions are disabled independently:

pragma no-coverage-condition MyClass;
pragma no-patterson-condition MyClass;
pragma no-bounded-variable-condition MyClass;

data Box(a) = Box(word);
forall a . class a:Eq {}
forall a . class a:C1 {}
forall a b . class a:MyClass(b) {}

// Accepted: all three conditions are disabled for MyClass.
forall a b c . c:Eq, (a, b):C1 => instance Box(a):MyClass(b) {}

Modules

Every SAIL source file is a module. A module can import definitions from other modules and control which of its own definitions are visible to importers through export declarations. The module system provides explicit namespace management: a name imported from another module is not automatically available without qualification unless the import form places it directly into scope.


File Layout

A source file is a sequence of import declarations and top-level declarations in any order. Import declarations may appear before, after, or interleaved with top-level declarations.

( import-declaration | top-level-declaration )*

The conventional style places all imports at the top of the file, but this is not enforced by the compiler.


Import Forms

SAIL provides four import forms. Each form controls how the imported names are placed into scope and whether they require qualification to use.

Full module import

import modname;

Loads the module and makes all its exported names available under the qualified prefix modname. No names are introduced into the unqualified scope.

// token.solc exports: Token, transfer
import token;

function doTransfer(t : token.Token, to : word, amount : word) -> () {
    return token.transfer(t, to, amount);
}

Module import with alias

import modname as Alias;

Same as a full import but assigns a shorter alias to the module. All qualified references must use the alias; the original module name is not available as a qualifier.

import token as T;

function doTransfer(t : T.Token, to : word, amount : word) -> () {
    return T.transfer(t, to, amount);
}

After this import, writing token.transfer(t, to, amount) is an error because token is not a known qualifier in this file.

Selective import

import modname.{Name1, Name2};

Loads the listed names directly into the unqualified scope. They can be used without any prefix.

import token.{Token, transfer};

function doTransfer(t : Token, to : word, amount : word) -> () {
    return transfer(t, to, amount);
}

Each item in the selector list may optionally be renamed with as:

import selectlib.{keep as keep_, drop as drop_};

function main(x : word) -> word {
    return drop_(keep_(x));
}

The original name keep is not placed into scope. Only the alias keep_ is available. Multiple items may be renamed independently in the same selector list.

Constructors of an imported type must still be qualified with the type name even when the type itself was selectively imported:

import token.{Token};

function makeActive() -> Token {
    return Token.Active;    // correct
}

// Error: unqualified constructor.
function makeBad() -> Token {
    return Active;
}
Unqualified constructor:
Active
Use Type.Constructor form.

Wildcard selective import

import modname.{*};

Places every exported name from the module into the unqualified scope. Individual names may be excluded using hiding:

import globlib.{*} hiding {idWord};

function main(x : word) -> word {
    let y : T = mkT(x);    // mkT is in scope; idWord is not
    match y {
    | T.T(v) => return v;
    }
}

The hiding clause accepts a comma-separated list of names to suppress:

import selectlib.{keep, drop} hiding {drop};

function main(x : word) -> word {
    return keep(x);    // drop is not in scope
}

Importing operator symbols

User-defined operator symbols are exported and imported using parenthesised syntax (sym), where sym is the operator character sequence. Importing an operator brings both its precedence/fixity declaration and its bound function into scope.

// math.solc  (declares and exports the operator)
infixl 70 (^^) => pow;
export { pow, (^^) };

function pow(b : word, e : word) -> word { ... }
// main.solc  (imports the operator by symbol)
import math.{pow, (^^)};

contract Main {
    function main() -> word {
        return 2 ^^ 10;    // 1024
    }
}

Operator symbols may be mixed freely with ordinary names in the same selector list. Importing the function name alone (without the symbol) makes the implementation callable as a regular function but does not enable infix syntax.


Module Paths

A module path identifies the source file of a module relative to a root directory. Dots in the path correspond to directory separators.

Relative paths

A name without a lib. prefix is a relative path. The compiler resolves it relative to the directory that contains the importing file.

import foo.bar;          // loads foo/bar.solc
import foo.bar.baz;      // loads foo/bar/baz.solc

After a plain import foo.bar, the module is accessible under the full dotted qualifier:

import foo.bar;

function main() -> word {
    return foo.bar.value();
}

Library paths

A path that begins with lib. is treated as an absolute library path, resolved from the root of the current library rather than the current directory.

export lib.some.module;          // re-exports some/module.solc from the library root

Library paths are mainly used in re-export declarations to expose a module from a different directory tree. They are not commonly used in import statements directly.

External library paths

A path that begins with @libname. refers to a module in a separately configured external library root. External libraries are registered in the build configuration; the compiler resolves them to absolute paths at build time.

import @extlib.math.api;

contract External {
    function main() -> word {
        return math.api.sum(39);
    }
}

An alias keeps the reference concise:

import @extlib.math.api as MathApi;

function main() -> word {
    return MathApi.sum(39);
}

Standard library

The name std and any name that begins with std. are resolved to the standard library. The standard library root is configured separately from user libraries.

import std;

function main() -> word {
    return std.addWord(21, 21);
}

Qualified Names

When a module is imported with a full or aliased import, its exported definitions are accessed through a dotted qualifier. The qualifier may prefix types, functions, and constructors.

Qualified type names

import token;

function doTransfer(t : token.Token, to : word, amount : word) -> () {
    return token.transfer(t, to, amount);
}

Qualified constructor expressions

Constructors are written as qualifier.TypeName.Constructor:

import token;

function makeActive() -> token.Token {
    return token.Token.Active;
}

Qualified constructor patterns

The same qualified form is used in pattern matching:

import token;

function isActive(t : token.Token) -> word {
    match t {
    | token.Token.Active => return 1;
    | token.Token.Paused => return 0;
    }
}

Qualified names with aliases

When the import carries an alias, replace the module name with the alias in all qualified references:

import token as T;

function makeActive() -> T.Token {
    return T.Token.Active;
}

Export Declarations

An export declaration controls which definitions an importing module can see. Definitions that are not listed in an export declaration are private to the file.

Note A file without any export declaration exports nothing. All definitions are private unless explicitly exported.

Explicit export list

export { Name1, Name2, TypeName };

Names are listed by their unqualified identifier.

export { Bool, not, C, D, id };

Exporting a type with constructors

By default, exporting a type name makes the type visible but keeps its constructors private. An importer can use the type in signatures but cannot construct or pattern-match on its values.

To export constructors explicitly, list them in parentheses after the type name:

export { Token(Ok) };           // exports only the Ok constructor
export { Token(Ok, Err) };      // exports both constructors
export { Bool(*) };             // exports Bool and all its constructors

Exporting operator symbols

Operator symbols are separate exportable entities from the functions they implement. To make an operator available to importers as infix syntax, both the function name and the operator symbol must appear in the export list:

infixl 70 (^^) => pow;
export { pow, (^^) };

Exporting only pow (without (^^)) makes the function callable by name but does not grant importers access to the infix ^^ syntax.

Wildcard export

export { * };

Exports everything defined in the file. Constructors of all types are also exported.

Re-exporting another module

A module may forward its imports so that importers receive them as if they came from the re-exporting module.

Re-export a whole module:

// api.solc: makes all of util's exports available under api.util.*
export lib.reexport_module.pkg.util;

An importer of api then accesses the re-exported names through the full chain:

import reexport_module.pkg.api;

function main() -> word {
    return api.util.unwrap(api.util.Wrap.Mk(1));
}

Re-export a module under an alias:

// api_alias.solc
export lib.reexport_module.pkg.util as Utils;
import reexport_module.pkg.api_alias;

function main() -> word {
    return api_alias.Utils.unwrap(api_alias.Utils.Wrap.Mk(1));
}

Re-export selected names from a module:

import hidden_ctor_lib;

export hidden_ctor_lib.{Token(Ok)};    // re-exports Token type with Ok constructor only
export hidden_ctor_lib.{mkErr};        // re-exports only the mkErr function

Hidden Constructors

When a constructor is not exported, importers receive an opaque type: they can name the type and pass values around, but they cannot construct new values directly or inspect existing ones through pattern matching. The only way to create or examine values of an opaque type is through the functions the module chooses to export.

// hidden_ctor_lib.solc
export {Token(Ok), mkOk, mkErr};

data Token = Ok(word) | Err(word);

function mkOk(x : word) -> Token { return Token.Ok(x); }
function mkErr(x : word) -> Token { return Token.Err(x); }

The module exports Token with only the Ok constructor visible. The Err constructor is private.

An importer that selects only the type cannot use the hidden constructor:

import hidden_ctor_lib.{Token};

// Error: Err is not exported.
function bad() -> Token {
    return .Err(1);
}
No matching constructor for shorthand expression:
.Err

Pattern matching on the hidden constructor is equally rejected:

import hidden_ctor_lib.{Token, mkErr};

// Error: Token.Err is not in scope.
function bad(x : word) -> word {
    match mkErr(x) {
    | Token.Err(v) => return v;
    | _ => return 0;
    }
}
Undefined name: Token.Err

Transitive Imports

Importing a module does not automatically make its own imports visible. If module A imports module B, and module C imports module A, then C sees only the names that A chose to export. Names that B exported to A but that A did not re-export are not visible in C.

// transitive_dep_base.solc
export { g };
function g() -> word { return 1; }

// transitive_dep_mid.solc
import transitive_dep_base.{g};
export { f };
function f() -> word { return g(); }

// transitive_dep_main_select.solc
import transitive_dep_mid.{f};
function main() -> word { return f(); }    // g is not in scope here

Name Shadowing

A locally defined function or parameter shadows an imported name of the same identifier. The imported name remains accessible through its qualified form.

import token;

// Local 'transfer' shadows token.transfer for unqualified calls.
function transfer(to : word, amount : word) -> () { return (); }

function main(to : word, amount : word) -> () {
    return token.transfer(to, amount);    // uses the imported transfer, not the local one
}

A locally defined name also shadows a selectively imported name:

import erc20lib.{balanceOf};

// Local 'balanceOf' shadows the imported one.
function balanceOf(account : word) -> word { return 0; }

function main(account : word) -> word {
    return balanceOf(account);    // calls the local balanceOf
}

Common Errors

Using an unqualified name after a full import

A full import (import modname;) requires all names to be qualified. Using an imported name without the qualifier is an error:

import token;

// Error: 'transfer' is not in scope unqualified.
function main(to : word, amount : word) -> () {
    return transfer(to, amount);
}
Undefined name: transfer

The fix is to qualify the call: return token.transfer(to, amount);.

Using the original name after aliasing

When an alias replaces the module name, the original name is not a valid qualifier:

import erc20.token as T;

// Error: erc20 is not a qualifier in this file.
function main(to : word, amount : word) -> () {
    return erc20.token.transfer(to, amount);
}
Undefined name: erc20

The fix is to use the alias: return T.transfer(to, amount);.

Using a type without qualifying its constructor

Selectively importing a type name does not bring its constructors into the unqualified scope. Constructors must always be written as TypeName.Constructor:

import token.{Token};

// Error: unqualified constructor.
function makeActive() -> Token {
    return Active;
}
Unqualified constructor:
Active
Use Type.Constructor form.

Using a type not in scope at all

Importing a module without qualification and then using a name without the module qualifier fails:

import token;

// Error: Token is not in the unqualified scope.
function bad(t : Token) -> word {
    return 0;
}
Undefined type constructor:
Token

Use token.Token, or switch to a selective import.

Type Inference

SAIL uses a constraint-based bidirectional type inference algorithm. The algorithm divides type information into two flows: a bottom-up flow that generates a type from an expression, and a top-down flow that pushes an expected type down into an expression. Both flows run simultaneously through unification, which merges type information from different sources into a consistent solution.

The scope of inference is deliberately limited. Top-level function signatures require complete explicit annotations. Inference operates freely inside function bodies: local variable types, intermediate expression types, and the types of arguments to generic functions are all inferred automatically without any annotation.


What Requires Annotations

Every top-level function must carry a complete type signature: every parameter must be annotated and the return type must be declared. The compiler rejects any top-level definition that omits an annotation. This rule applies to free functions and to functions defined inside a contract body.

// Required: every parameter and the return type are annotated.
function transfer(to : word, amount : word) -> () {
    let bal : word;
    assembly { bal := sload(caller()) }
    assembly { sstore(caller(), sub(bal, amount)) }
    assembly { sstore(to, add(sload(to), amount)) }
}

For a full description of the annotation requirement and the error it produces, see Functions.


What Is Inferred

Inside a function body, the compiler infers types for:

  • Local variables declared with let, whether or not they carry an annotation.
  • Intermediate expressions and subexpressions.
  • Type arguments to polymorphic function calls.
  • Type class constraints required by the body.

The inferred types are propagated through unification. Whenever the type of an expression is used in a position where another type is expected, the two are unified: the algorithm finds the most general substitution that makes them equal, or reports an error if no such substitution exists.


Local Variable Inference

A let declaration without a type annotation introduces a fresh type variable. The compiler assigns a concrete type to that variable as soon as enough information is available from the surrounding context.

Inferred from initialiser

When a let declaration includes an initialiser, the type is taken from the initialiser expression. Integer literals always have type word.

function demo() -> word {
    let amount = 100;     // amount : word, from the integer literal
    let flag   = true;    // flag : bool, from the boolean literal
    return amount;
}

Inferred from subsequent use

When no initialiser is present, the type is inferred from the first use of the variable.

function loadBalance(account : word) -> word {
    let bal : word;           // annotated; no inference needed
    assembly { bal := sload(account) }
    return bal;
}

function compute(account : word) -> word {
    let x;                    // no annotation, no initialiser
    x = sload(account);       // first assignment: x : word
    return x;
}

Inferred from return context

A variable whose type depends on an algebraic data type can have its type fixed by the expected return type.

data Result = Ok(word) | Err(word);

function demo() -> Result {
    let x = Result.Err(0);     // x : Result, inferred from constructor
    return x;
}

Contextual Constructor Shorthand

A constructor expression written as .Constructor (with a leading dot and no type prefix) is resolved using the expected type at the point of use. The compiler uses the context to determine which type the constructor belongs to.

In a return statement

The declared return type provides the expected type:

data TxStatus = Pending | Confirmed | Reverted;

function initialStatus() -> TxStatus {
    return .Pending;    // resolved as TxStatus.Pending from the return type
}

In a typed assignment

The declared type of the left-hand side provides the expected type:

data TxStatus = Pending | Confirmed | Reverted;

function demo() -> TxStatus {
    let s : TxStatus;
    s = .Confirmed;     // resolved as TxStatus.Confirmed from the declared type of s
    return s;
}

Error: ambiguous shorthand

If no expected type is available, the shorthand cannot be resolved and the compiler reports an error:

data TxStatus = Pending | Confirmed | Reverted;

function bad() -> word {
    let x = .Pending;   // no expected type available for x
    return 0;
}
Cannot resolve shorthand constructor expression without expected constructor type:
.Pending

The fix is to annotate the variable: let x : TxStatus = .Pending;.


Integer Literals

An integer literal always has type word. There is no numeric type class or overloading for integer literals in SAIL. Every integer literal that appears in source code is a 256-bit EVM word value.

function demo() -> word {
    let amount   = 1000;
    let decimals = 18;
    let mask     = 0xffffffffffffffffffffffffffffffffffffffff;
    return amount;     // amount, decimals, mask all have type word
}

Polymorphic Function Call Inference

When a polymorphic function is called, the compiler instantiates the type variables from the types of the supplied arguments. No explicit type application is needed.

forall a . function id(x : a) -> a {
    return x;
}

function demo() -> word {
    return id(42);    // a instantiated to word at this call site
}

For a pair function with two type variables, both are instantiated independently:

data Pair(a, b) = Pair(a, b);

forall a b . function fst(p : Pair(a, b)) -> a {
    match p {
    | Pair(x, y) => return x;
    }
}

function demo() -> word {
    return fst(Pair(42, true));    // a = word, b = bool
}

Constraint Inference

When a function body calls a type class method, the compiler generates a constraint on the type variable involved. If the function is top-level and already carries an annotation, the annotation must list the constraint explicitly. If the annotation omits the constraint, the subsumption check catches the mismatch.

At a call site where the type variable is fixed to a concrete type, the compiler checks that an instance exists for that type. If no instance is found, the compiler reports an unsolved constraint error.

Constraint resolved at call site

forall a . class a:Encodable {
    function encode(x : a) -> word;
}

instance word:Encodable {
    function encode(x : word) -> word { return x; }
}

forall a . a:Encodable => function encodeField(x : a) -> word {
    return Encodable.encode(x);
}

contract ERC20 {
    function main() -> word {
        return encodeField(42);    // a = word; word:Encodable resolved
    }
}

Error: no instance for the required type

If the concrete type at the call site has no matching instance, the compiler reports which constraint could not be satisfied and which instances are defined:

forall a . class a:SafeArith {
    function safeAdd(x : a, y : a) -> a;
}

// No instance for word is declared.
function bad(x : word, y : word) -> word {
    return SafeArith.safeAdd(x, y);
}
Cannot entail:
word : SafeArith
using defined instances:

The fix is either to declare instance word:SafeArith { ... } or to add the constraint to the calling function's signature so the obligation is propagated to the caller:

forall a . a:SafeArith => function bad(x : a, y : a) -> a {
    return SafeArith.safeAdd(x, y);
}

Phantom Type Variables

A phantom type parameter is a type parameter of a data type that does not appear in any constructor field. When a phantom constructor is used inside a function body and the phantom parameter cannot be determined from the context, the compiler reports an ambiguous type variable error.

data TypedSlot(a) = TypedSlot(word);    // a is phantom: it appears in no field

function bad() -> word {
    let s = TypedSlot.TypedSlot(0);     // a is unconstrained; no context fixes it
    return 0;
}
Ambiguous type variable(s) $1 in definition of bad.
This typically occurs when a constructor has phantom type parameters.
Please, add a type signature to fix the ambiguous type variable.

The fix is to annotate the let declaration with the full type, giving the phantom parameter a concrete value:

function good() -> word {
    let s : TypedSlot(word) = TypedSlot.TypedSlot(0);
    return 0;
}

Unification Errors

A unification error occurs when two types that must be equal turn out to be incompatible. The compiler reports the two types and the expression that triggered the failure.

Return type mismatch

function bad(amount : word) -> bool {
    return amount;    // amount : word; expected bool
}
Types: bool and word do not unify
 - in: function bad (amount : word) -> bool { return amount; }

Match arm return type mismatch

All arms of a match expression must produce the same type. Returning different types in different arms is a unification error:

data Result = Ok(word) | Err(word);

function bad(r : Result) -> word {
    match r {
    | Result.Ok(v)  => return v;
    | Result.Err(_) => return false;    // word expected; bool returned
    }
}
Types: bool and word do not unify
 - in: false
 - in: function bad (r : Result) -> word { ... }

Algebraic data type vs primitive mismatch

User-defined types and primitive types such as word are never interchangeable:

data TxStatus = Pending | Confirmed;

function bad(n : word) -> TxStatus {
    return n;    // word is not TxStatus
}
Types: TxStatus and word do not unify
 - in: function bad (n : word) -> TxStatus { return n; }

Scope of Inference

Inference is local to function bodies. Each function is checked independently, and no type information flows between sibling functions except through their declared signatures. The declared signature of a function is the only interface that callers see; the body is invisible to inference in other functions.

This means that the order of function definitions in a file does not affect the types that inference assigns to expressions inside any given function. Mutually recursive functions are type-checked as a group (see Functions), but even then each function's signature must carry complete annotations.

Core Solidity

Lambda Functions

Numeric Types

Arithmetic

Data Locations

ABI Encoding / Decoding

Contracts

Hull

Hull is an intermediate language used by the Core Solidity compiler between SAIL and the Yul backend. Every SAIL program is lowered to Hull after type-checking, monomorphization, and match compilation. Hull is then translated to Yul for final code generation.

Hull retains algebraic data types (products and sums) from SAIL, but eliminates polymorphism, type classes, higher-order functions, and all surface syntax sugar. The result is a first-order, monomorphic language whose structure maps directly onto Yul constructs.

Note Hull is produced by the compiler and is not intended to be written by hand. The concrete syntax described in this document exists to make diagnostic .hull output readable. It can be inspected by passing the -dump-hull flag to the sol-core binary.


Relation to the Compilation Pipeline

Hull compilation pipeline

The Core Solidity performs all stages up to and including Hull emission. The yule compiler takes Hull as input and produces Yul. Yul is then compiled to EVM bytecode by the standard solc compiler.


Types

All types in Hull are monomorphic. Type variables are fully instantiated by the specializer before Hull is produced.

word

word is a 256-bit unsigned integer, corresponding to the native word size of the EVM. All arithmetic values, storage addresses, and ABI-encoded data are ultimately represented as word.

bool

bool is the Boolean type. Its only values are the literals true and false.

unit

unit is the zero-size type. Its only value is (). It carries no information and occupies no storage. unit appears as the payload type of nullary constructors in sum types. For example, the None branch of Option(word) has payload type unit.

Product Types

A product type is written as:

(T1 * T2)

It holds two values simultaneously, one of type T1 and one of type T2. The * operator is right-associative, so A * B * C is parsed as A * (B * C).

The projections fst and snd extract the left and right components of a product value, respectively.

Note N-ary products are not yet implemented. Products with more than two components are encoded as right-nested pairs: (A, B, C) is represented as (A * (B * C)).

Sum Types

A sum type is written as:

(T1 + T2)

It holds a value of either type T1, tagged as the left branch, or type T2, tagged as the right branch. The + operator is right-associative, so A + B + C is parsed as A + (B + C).

Sum values are constructed using injection expressions (inl, inr, in(k)) and deconstructed using match statements.

An n-ary sum can also be written as sum(T1, T2, ..., Tn), but the binary right-nested encoding is what the Yul code generator uses internally.

Named Types

A named type is written as:

Name{T}

It attaches a human-readable label to an underlying structural type T. The name is purely documentary: the Yul backend strips it and operates on the underlying structure. Named types originate from user-defined data declarations in Core Solidity; the specializer fills in the concrete type arguments and wraps the result in a named type to preserve the original name in diagnostic output.

For example, the Core Solidity declaration:

data Option(a) = None | Some(a);

when specialized at a = word produces the Hull named type:

Option{(unit + word)}

Function Types

Function types are written as:

(T1, T2, ..., Tn -> R)

Function types appear only in function definitions; they are not first-class values. Hull has no higher-order functions and no closures.


Expressions

Integer Literals

An integer literal is a non-negative decimal integer that denotes a word value. The value must fit in 256 bits.

42
1000000

Boolean Literals

true and false are the two values of type bool.

Unit

The expression () is the single value of type unit.

Variables

A variable refers to a locally declared name. Variable names follow the same identifier rules as Yul: they may contain letters, digits, _, and $.

Note The compiler conventionally uses names beginning with $ for compiler-generated variables, such as $alt for the payload variable introduced by a match alternative. User-visible names do not begin with $.

Pairs

A tuple expression constructs a product value:

(e1, e2)

The type of (e1, e2) is (T1 * T2) where T1 and T2 are the types of e1 and e2. Tuples with more than two elements are right-nested: (e1, e2, e3) is equivalent to (e1, (e2, e3)).

Projections

fst(e) evaluates to the first component of the pair e. snd(e) evaluates to the second component.

fst(e)
snd(e)

Both e must have a product type.

Sum Injections

A sum injection constructs a sum value. The following three forms are available:

FormMeaning
inl<T>(e)Injects e as the left branch of T
inr<T>(e)Injects e as the right branch of T
in(k)<T>(e)k-th injection into the n-ary sum T

The type annotation T is the target sum type, not the type of e. This annotation is mandatory: the Yul code generator needs the full sum type to compute the correct memory representation of the injected value without re-running type inference.

For example, inr<Option{(unit + word)}>(42) injects the word 42 into the right branch of Option{(unit + word)}, representing Some(42). inl<Option{(unit + word)}>(()) injects () into the left branch, representing None.

Function Calls

A function call applies a named function to a list of arguments:

f(e1, e2, ..., en)

All call targets in Hull are statically known names produced by the specializer. There is no dynamic dispatch or virtual call mechanism.

Conditional Expressions

A conditional expression selects between two alternatives based on a boolean value:

if<T> cond then e1 else e2

cond must have type bool. Both e1 and e2 must have type T. The type annotation T is mandatory. Conditional expressions are generated by the if-desugaring pass from Core Solidity if/else expressions.


Statements

Statements are executed sequentially inside function bodies and blocks. There are no statement separators; statements are delimited by whitespace and the structure of the enclosing block.

Variable Declaration

let x : T

Declares a mutable local variable x of type T. The declaration does not initialize the variable; it must be assigned before it is read. This corresponds directly to Yul's uninitialized let declaration.

Warning Reading a variable before it has been assigned is undefined behaviour at the Yul level. The compiler always assigns variables before reading them, but hand-written Hull must respect this invariant.

Assignment

lhs := rhs

Assigns the value of rhs to lhs. The left-hand side is most commonly a variable name, written as x := e.

Expression Statement

A bare expression can be used as a statement:

e

The expression is evaluated for its side effects and the result is discarded. This form is used for calls whose return type is unit.

Return

return e

Returns the value of e from the enclosing function. Every execution path in a Hull function must end with a return or revert.

Block

{
  stmt1
  stmt2
  ...
}

A block groups a sequence of statements. Blocks are used as the bodies of match alternatives and function definitions.

Match

match<T> e with {
  inl $alt => { ... }
  inr $alt => { ... }
}

Deconstructs the sum-typed expression e of type T. Each alternative names the payload of its branch and executes the corresponding block. The following constructor patterns are available:

PatternMatches
inlLeft injection
inrRight injection
in(k)k-th injection (n-ary)

The type annotation <T> is mandatory. It is used by the Yul translator to determine the memory layout of the sum type.

Note Sum types with more than two constructors are represented as right-nested binary sums. Deconstructing them requires nested match expressions: the outer match separates the first constructor (left branch) from the remainder (right branch), and inner matches continue the decomposition.

Function Definition

function f(x1 : T1, x2 : T2, ...) -> R {
  stmt1
  stmt2
  ...
}

Defines a named function with explicitly typed parameters and a single return type. Hull functions are translated directly to Yul functions.

The following properties hold for all Hull functions:

  • Parameters and the return type must be given explicit Hull types.
  • Functions are first-order: no function-typed parameters or return values.
  • Each function has exactly one return type.
  • The specializer produces unique names for each monomorphic instantiation, for example map$word or maybe$Word.

Assembly Block

assembly {
  <Yul statements>
}

An inline assembly block embeds raw Yul statements directly in the Hull output. The block passes through unchanged into the generated Yul. Assembly blocks are used to access EVM primitives such as mload, mstore, add, iszero, revert, and similar opcodes.

See the Yul documentation for the complete overview of Yul syntax.

Revert

revert "message"

Immediately aborts execution. The string literal is a diagnostic label. The match compiler generates revert statements for branches that are unreachable at the Core Solidity level but must be given a code path in the Hull output.

Comment

/* text */

A block comment. Comments carry no semantics and are stripped during Yul translation. The compiler inserts comments to annotate which source-level constructor each $alt variable corresponds to, for example:

inl $alt => { /* None */
              return n
            }

Alternatives

An alternative is a branch in a match statement:

Con $var => { body }

Con is one of inl, inr, or in(k). $var is a fresh variable name that is bound to the payload of the matched constructor. body is a block of statements. The bound variable has the payload type of the matched branch: for inl and inr on a sum type (T1 + T2), the payload types are T1 and T2 respectively.


Contracts and Objects

At the top level, a Hull program consists of named objects following the Yul object model. An object has a name, a code block containing statements, and zero or more inner objects:

object "ContractName" {
  code {
    ...deployment code...
  }
  object "ContractName_deployed" {
    code {
      ...runtime code...
    }
  }
}

The outer object's code block is the deployment (constructor) code. The inner object's code block is the runtime code. This structure corresponds exactly to the Yul object notation used by solc. During Hull emission, the Core Solidity contract body is split into these two objects automatically.


Concrete Syntax Reference

The grammar diagrams below use the railroad diagram convention: rounded boxes denote terminals (keywords and punctuation literals) and rectangular boxes denote non-terminals.

Types

PrimaryType: the base forms that can appear in any type position:

PrimaryType railroad diagram

Type: combines primary types with * (product) and + (sum), both right-associative:

Type railroad diagram

Expressions

PrimaryExpr: literals, variables, tuples, and calls:

PrimaryExpr railroad diagram

Expr: the complete expression grammar, including injections, projections, conditionals, and primary expressions:

Expr railroad diagram

Statements

Stmt: the complete statement grammar:

Stmt railroad diagram

Block: a brace-enclosed sequence of statements:

block railroad diagram

Auxiliaries

Arg: a typed parameter in a function definition:

Arg railroad diagram

Con: a constructor tag in a match alternative:

Con railroad diagram

Alt: a match alternative, binding the payload to a name:

Alt railroad diagram

Objects

Object: a named code container, optionally containing inner objects:

Object railroad diagram


Examples

The following examples show Core Solidity source programs alongside the Hull they produce. All examples are accepted by the Core Solidity prototype.

Identity Function

The simplest possible Hull function passes its argument through unchanged:

function id(x : word) -> word {
    return x;
}

Hull output:

function id (x : word) -> word {
  return x
}

Optional Value

The following Core Solidity program defines an Option type and a maybe function that extracts the contained value or returns a default:

data Option(a) = None | Some(a);

function maybe(n : word, o : Option(word)) -> word {
    match o {
        | None    => return n;
        | Some(v) => return v;
    }
}

After specialization at a = word, the Hull output is:

function maybe$Word (n : word, o : Option{(unit + word)}) -> word {
  match<Option{(unit + word)}> o with {
    inl $alt => { /* None */
                  return n
                }
    inr $alt => { /* Some */
                  let var_1 : word
                  var_1 := $alt
                  return var_1
                }
  }
}

Note that the function is renamed from maybe to maybe$Word by the specializer to reflect the type instantiation. The Option{(unit + word)} named type preserves the original constructor name in the type annotation.

Enumeration Type

The following program defines a three-constructor enumeration and converts it to an integer:

data Color = Red | Green | Blue;

function fromEnum(c : Color) -> word {
    match c {
        | Red   => return 0;
        | Green => return 1;
        | Blue  => return 2;
    }
}

Hull output:

function fromEnum (c : Color{(unit + (unit + unit))}) -> word {
  match<Color{(unit + (unit + unit))}> c with {
    inl $alt => { /* Red */
                  return 0
                }
    inr $alt => match<(unit + unit)> $alt with {
                  inl $alt => { /* Green */
                                return 1
                              }
                  inr $alt => { /* Blue */
                                return 2
                              }
                }
  }
}

The three-constructor type Color is encoded as the right-nested binary sum (unit + (unit + unit)). The outer match separates Red (left) from the remaining constructors (right). An inner match then separates Green (left) from Blue (right).

EVM Arithmetic via Assembly

The following contract uses an inline assembly block to invoke the add EVM opcode:

contract Add1 {
    function main() -> word {
        let res : word;
        assembly { res := add(40, 2) }
        return res;
    }
}

Hull output:

object "Add1" {
  code {
    ...deployment code...
  }
  object "Add1_deployed" {
    code {
      function main () -> word {
        let res : word
        assembly {
          res := add(40, 2)
        }
        return res
      }
    }
  }
}

The assembly block is reproduced verbatim in the Hull output and passes through unchanged into the generated Yul.

Compilation Pipeline