Introduction
Purpose and Audience
Core Solidity is an update to the Solidity programming language. It both extends the language with new features (ADTs, Parametric polymorphism, Typeclasses), and diverges from Classic Solidity in incompatible ways (most notably by removing inheritance). This document targets compiler implementers, tooling developers, and language researchers. It is a reference specification, not a tutorial.
A prototype implementation exists at argotorg/solcore. The
implementation is a research prototype. It contains bugs, lacks optimizations, and will change.
Compilation Architecture
Compilation proceeds by succesive translation between the following representations:
-
Core Solidity: The user-facing language with all syntactic features. Supports higher-order functions, pattern matching, type classes, and modules. This is a superset of SAIL, with all additional language constructs being defined in terms of desugaring passes (syntactic transformations) into SAIL.
-
SAIL: A minimal first-order intermediate representation. All syntactic sugar has been removed. Type classes are resolved to concrete instances. Polymorphic functions are specialized to monomorphic versions. Pattern matching is compiled to simple case trees.
-
Hull: A low-level representation close to Yul. Retains algebraic data types (binary sums and products) but otherwise resembles Yul. Hull code is translated directly to Yul, which is then compiled to EVM bytecode.
-
Yul: The existing Yul assembly language currently used in solc. Core Solidity is designed in a way that allows the use of alternative or additional representations at this level in the future.
Each transformation step simplifies the language and moves closer to the EVM execution model.
Document Structure
- Compilation Pipeline: Describes the compilation pipeline and how representations relate
- SAIL: Specifies the desugared intermediate representation
- Core Solidity: Specifies the user-facing language features
- Hull: Specifies the low-level IR before Yul emission
Status
This is a research prototype. The language design is not stable. Features may be added, removed, or changed. The implementation has known bugs and missing functionality. Do not use this for production systems.
SAIL
SAIL (Solidity Advanced Intermediate Language) is the source language of the
Core Solidity compiler. It extends Solidity's surface syntax with a
statically-typed functional core: parametric polymorphism via forall
quantifiers, type classes for constrained overloading, and algebraic data types
with exhaustive pattern matching. Every SAIL program is compiled to monomorphic
Core IR, named Hull, through specialization, then translated to Yul and
assembled into EVM bytecode.
This section documents the SAIL language itself. The chapters are ordered from the most foundational concepts to the most advanced:
- Syntax covers lexical conventions, literals, and the overall structure of a source file.
- Built-ins lists the types, values, and operators available without any
import, along with the full set of EVM opcodes accessible inside
assemblyblocks. - Variable Declaration and Assignment explains
letbindings and mutation. - Functions describes function signatures, free functions, contract methods, and recursive definitions.
- Assembly Blocks details the rules for embedding raw Yul inside a SAIL function body.
- Datatypes introduces algebraic data type declarations and pattern matching.
- Parametric Polymorphism explains
forallquantifiers, type variable instantiation, and the specialization strategy. - Type Classes covers class declarations, instance declarations, superclass constraints, and the three soundness conditions the compiler enforces.
- Modules describes the import and export system, qualified names, and visibility rules.
- Type Inference explains the bidirectional constraint-based algorithm, what must be annotated explicitly, and the error messages the compiler produces when inference fails.
Syntax
This page is a complete grammar reference for SAIL. Every grammar rule appears
as its own section, followed by a railroad diagram. Rounded boxes denote
terminal tokens (keywords and punctuation); rectangular boxes denote
non-terminals and link to the corresponding rule. The notation follows standard
EBNF conventions: [ … ] marks optional elements and { … } marks
zero-or-more repetition.
Parser Rules
CompilationUnit
A SAIL source file is a sequence of import declarations and top-level declarations in any order. Imports are not required to precede declarations.
TopDecl
A top-level declaration is one of: a contract, a free function, a type class, an instance, an algebraic data type, a type synonym, an export declaration, a pragma, or an operator declaration.
Import
An import declaration makes names from another module available in the current
module. The @package. prefix selects an external package; the lib. prefix
selects a standard library module.
ModulePath
A module path is a dot-separated sequence of identifiers that locates a module within a package.
ImportItems
The list of names to import from a module, enclosed in braces.
ImportItem
A single item to import. Four forms are available:
*imports all exported names (wildcard).Nameimports a single name into the unqualified scope.Name as Aliasimports a single name under a local alias. The original name is not placed in scope.(sym)imports an operator symbol and enables its infix syntax in this file.
OperatorSymbol
A non-empty sequence of operator characters (+-*/%<>=!&|^~#? or Unicode
mathematical symbols in the range U+2200 to U+23FF), written between
parentheses in import and export item lists.
HidingList
A comma-separated list of names to exclude from an import.
ExportDecl
An export declaration controls which names this module exposes to other modules.
ExportItems
The list of names to export from the current module, enclosed in braces.
ExportItem
A single export entry. It may be the wildcard *, a plain name, a name together
with its constructors, all names from a module, or an operator symbol in
parentheses.
ExportConstructors
Selects which data constructors to re-export alongside a type name: either all
constructors (*) or an explicit list.
ExportFromItems
The list of names re-exported from another module.
ExportFromItem
A single entry in a re-export list: the wildcard *, a plain name, or a name
with explicit constructor re-exports.
Pragma
A pragma adjusts compiler behaviour for type class constraint checking. Without targets the pragma applies to all classes; with a list of identifiers it applies only to those specific classes.
PragmaKind
The three available pragma kinds relax, respectively, the coverage condition, the Patterson condition, and the bound-variable condition for type class instance resolution.
PragmaTargets
A comma-separated list of class names to which a pragma applies.
Type
A type is one of: a named type constructor applied to zero or more type
arguments, a function type of the form (T₁, …, Tₙ) -> T, a tuple or unit
type written as a parenthesised comma-separated list, or a proxy type @T.
TypeList
A comma-separated (possibly empty) list of types, used as arguments to type constructors and as the parameter list of function types.
TypeVarSeq
A space-separated sequence of type-variable names following a forall keyword.
All listed names are universally quantified over the scope of the accompanying
signature.
TypeVarParams
A comma-separated list of type-variable names enclosed in parentheses. Used in
data, type, and class declarations to introduce parametric type
arguments.
TypeName
A possibly qualified type name. Simple names are single identifiers; qualified
names chain module components with ..
DataDef
An algebraic data type declaration. The optional parameter list introduces
type variables. The optional body lists the constructors separated by |.
DataConstrs
One or more data constructor definitions separated by |.
DataConstr
A single data constructor: a name optionally followed by a parenthesised, comma-separated list of field types.
TypeSynonym
A type synonym introduces an alias for an existing type. The optional parameter list introduces type variables that may appear in the right-hand side.
Pattern
A pattern appears in match equations to deconstruct a value by its
constructor. The dot-prefix form (.Name) is a contextual shorthand: the
constructor is resolved from the type being matched.
PatternList
A comma-separated list of patterns used as the argument list of a constructor
pattern or as the simultaneous arguments of a match equation.
Expr
An expression computes a value. Binary operators follow standard precedence:
arithmetic binds tighter than comparison, which binds tighter than logical. All
binary operators are left-associative except if-then-else, which is
right-associative.
ExprList
A comma-separated (possibly empty) list of expressions used as function arguments.
Literal
A literal value: a decimal or hexadecimal integer, or a double-quoted string.
Stmt
A statement is an executable step inside a function body. Assignment operators
=, +=, and -= require a terminating ;. let declares a local variable,
optionally with a type annotation and an initialiser. The for statement provides
a C-style counted loop; its initialisation and post-iteration clauses obey the
ForInitStmt and ForPostStmt grammars respectively.
Body
A brace-enclosed sequence of zero or more statements forming the body of a function, branch, or constructor.
ForInitStmt
The initialisation clause of a for loop. It may be an assignment, a compound
assignment, a let binding (typed or untyped, with or without an initialiser),
or a plain expression. Unlike a regular statement, there is no trailing ; — the
semicolons are written explicitly in the for(…; …; …) header.
ForPostStmt
The post-iteration clause executed after each loop body. It follows the same
grammar as ForInitStmt. A let binding introduced here is scoped to the body
of that single iteration.
MatchArgs
One or more comma-separated expressions forming the scrutinees of a match
statement.
Equation
A single match arm: a |-prefixed list of patterns followed by => and a
sequence of statements. The patterns are matched positionally against the
scrutinee list.
Param
A single function parameter: a name with an explicit type annotation, or an untyped name whose type will be inferred.
ParamList
A comma-separated list of function parameters.
Function
A function definition. The long form uses a brace-enclosed statement block as the body. The short form uses a single expression whose value is returned implicitly (Rust-style).
Signature
A function signature declares the function name, its parameter list, and the optional return type. It may be preceded by a polymorphism prefix to introduce type variables and constraints.
SigPrefix
An optional forall quantifier that precedes a function or method signature. It
introduces universally quantified type variables and, optionally, a list of type
class constraints that callers must satisfy.
ConstraintList
A comma-separated list of type class constraints.
Constraint
A single type class constraint of the form Type : ClassName or
Type : ClassName(T₁, …, Tₙ). It asserts that the given type is an instance
of the named class, possibly with additional type parameters.
ClassDef
A type class declaration. The self-variable (the first identifier after
class) is the main type being constrained. The optional comma-separated list
in parentheses introduces auxiliary associated type variables. The body lists
method signatures, each terminated by ;.
InstDef
An instance declaration provides method implementations for a specific type.
The optional default keyword marks the instance as an overlappable fallback
when no more specific instance is found.
Contract
A contract groups fields, nested data types, methods, and an optional constructor. The optional parameter list makes the contract generic over type variables.
ContractDecl
A single declaration inside a contract body: a field, a data type, a function, or a constructor.
FieldDecl
A contract field declaration. The type annotation is mandatory; the initialiser expression is optional.
Constructor
A contract constructor is invoked exactly once at deployment time. It has an explicit parameter list and a statement block body.
AsmBlock
An inline assembly block embeds Yul statements directly in SAIL source code, giving direct access to EVM opcodes.
YulStmt
A statement in the Yul sublanguage. Yul provides low-level control flow (if,
switch, for, break, continue, leave) and variable declarations and
assignments using :=.
YulCase
A single case arm in a Yul switch statement: a literal value followed by a
block of Yul statements.
YulExpr
A Yul expression: a literal, a variable reference, a function call, or a call
to the special return built-in.
YulNames
A comma-separated list of identifiers used as the left-hand side of a Yul
multi-assignment or the names in a Yul let declaration.
YulExprList
A comma-separated list of Yul expressions used as arguments to a Yul function call.
YulLiteral
A Yul literal value: a decimal or hexadecimal integer, or a string.
Lexer Rules
Identifier
An identifier begins with a letter (upper or lower case) and may contain letters, decimal digits, and underscores. Identifiers are used for variable names, function names, type names, module components, and constructor names.
Integer
An integer literal is either a sequence of decimal digits or a hexadecimal
literal prefixed with 0x.
StringLiteral
A string literal is a sequence of characters enclosed in double quotes.
Supported escape sequences are \n (newline), \t (tab), and \" (literal
double quote).
Built-ins
SAIL provides a small set of types, values, and operators that are available in
every source file without any import. These are wired into the compiler's
initial environment. In addition, every assembly block has access to the full
set of EVM opcodes through Yul primitives.
Primitive Types
Five types are built into the language kernel.
| Type | Description |
|---|---|
word | 256-bit unsigned integer; the EVM's native machine word |
bool | Boolean type with constructors true and false |
() | Unit type; used as the return type of functions that produce no value |
pair a b | Generic product type, also written (a, b) in tuple syntax |
sum a b | Generic disjoint union with constructors inl and inr |
pair and sum are the internal representation of all user-defined algebraic
data types. A data type with multiple constructors is encoded as a nested sum,
and a constructor with multiple fields is encoded as a nested pair. User code
rarely names pair or sum directly; they appear implicitly through data
declarations and tuple syntax.
word
word is the only numeric type at the kernel level. Every integer literal in
SAIL has type word. There is no numeric overloading: 42, 0xff, and
0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff are all
word values.
function decimals() -> word {
return 18;
}
bool
bool has exactly two constructors, true and false, both of which are
always in scope.
function isActive(paused : bool) -> bool {
match paused {
| false => return true;
| true => return false;
}
}
() — Unit
The unit type () has a single value, also written (). Functions that
perform side effects and return nothing use () as their return type.
function setOwner(newOwner : word) -> () {
assembly { sstore(0, newOwner) }
}
Tuple Syntax
The syntax (a, b) is shorthand for pair a b. Tuples with more than two
elements are right-nested pairs: (a, b, c) means pair a (pair b c).
data Transfer = Transfer(word, word, word);
function unpack(t : Transfer) -> (word, word, word) {
match t {
| Transfer(from, to, amount) => return (from, to, amount);
}
}
Infix Operators
SAIL provides eight infix operators as syntactic sugar over ordinary function calls. The parser rewrites each operator to the corresponding function call before name resolution; the functions themselves must be in scope at the point of use.
| Operator | Equivalent call | Type |
|---|---|---|
e1 < e2 | lt(e1, e2) | bool -> bool -> bool |
e1 > e2 | gt(e1, e2) | bool -> bool -> bool |
e1 <= e2 | le(e1, e2) | bool -> bool -> bool |
e1 >= e2 | ge(e1, e2) | bool -> bool -> bool |
e1 != e2 | ne(e1, e2) | bool -> bool -> bool |
e1 && e2 | and(e1, e2) | bool -> bool -> bool |
e1 || e2 | or(e1, e2) | bool -> bool -> bool |
!e | not(e) | bool -> bool |
Because the operators desugar to function calls, the compiler resolves them
through the normal type class and name resolution machinery. The functions
lt, gt, le, ge, ne, and, or, and not are not built into the
kernel; they must be brought into scope before use.
Note The
&&and||operators do not short-circuit in the current implementation. Both operands are always evaluated before the logical operation is performed.
import std.{lt, ge, and};
function isValidAmount(amount : word, balance : word) -> bool {
return amount > 0 && amount <= balance;
}
The invokable Class
The kernel defines one built-in type class:
forall self args ret . class self:invokable(args, ret) {
function invoke(self : self, args : args) -> ret;
}
invokable is the compiler's mechanism for encoding higher-order functions.
When a function-typed value is passed as an argument or stored in a data
structure, the compiler generates an invokable instance that captures the
closure and implements invoke. User code rarely interacts with invokable
directly. A dedicated chapter covers higher-order functions and the
defunctionalization transformation in detail; see
Lambda Functions.
Assembly Primops
Inside an assembly { } block, all EVM opcodes are available as Yul
primitives. Each opcode is treated as a function that operates exclusively on
word values. Variables declared in the surrounding SAIL scope are accessible
by name inside the block.
The sections below list every available opcode grouped by category, along with its Yul type signature.
Arithmetic
| Opcode | Signature | Description |
|---|---|---|
add(x, y) | word -> word -> word | Addition modulo 2^256 |
sub(x, y) | word -> word -> word | Subtraction modulo 2^256 |
mul(x, y) | word -> word -> word | Multiplication modulo 2^256 |
div(x, y) | word -> word -> word | Integer division; 0 if y = 0 |
sdiv(x, y) | word -> word -> word | Signed integer division |
mod(x, y) | word -> word -> word | Modulo; 0 if y = 0 |
smod(x, y) | word -> word -> word | Signed modulo |
exp(x, y) | word -> word -> word | x raised to the power y |
addmod(x, y, m) | word -> word -> word -> word | (x + y) mod m |
mulmod(x, y, m) | word -> word -> word -> word | (x * y) mod m |
signextend(b, x) | word -> word -> word | Sign-extend from bit b |
function checkedAdd(x : word, y : word) -> word {
let result : word;
let overflow : word;
assembly {
result := add(x, y)
overflow := lt(result, x)
}
if (overflow != 0) {
assembly { revert(0, 0) }
}
return result;
}
Bitwise
| Opcode | Signature | Description |
|---|---|---|
and(x, y) | word -> word -> word | Bitwise AND |
or(x, y) | word -> word -> word | Bitwise OR |
xor(x, y) | word -> word -> word | Bitwise XOR |
not(x) | word -> word | Bitwise NOT |
byte(n, x) | word -> word -> word | nth byte of x (0 = most significant) |
shl(shift, value) | word -> word -> word | Left shift |
shr(shift, value) | word -> word -> word | Logical right shift |
sar(shift, value) | word -> word -> word | Arithmetic right shift |
Comparison
Comparison opcodes return 1 if the condition holds and 0 otherwise. The result
type is word, not bool; use tobool or a match on the result to convert.
| Opcode | Signature | Description |
|---|---|---|
lt(x, y) | word -> word -> word | 1 if x < y (unsigned) |
gt(x, y) | word -> word -> word | 1 if x > y (unsigned) |
slt(x, y) | word -> word -> word | 1 if x < y (signed) |
sgt(x, y) | word -> word -> word | 1 if x > y (signed) |
eq(x, y) | word -> word -> word | 1 if x = y |
iszero(x) | word -> word | 1 if x = 0 |
function isOwner(account : word) -> bool {
let owner : word;
let result : word;
assembly {
owner := sload(0)
result := eq(account, owner)
}
if (result) {
return true;
} else {
return false;
}
}
Hashing
| Opcode | Signature | Description |
|---|---|---|
keccak256(offset, size) | word -> word -> word | Keccak-256 hash of size bytes starting at memory offset |
function storageSlot(account : word) -> word {
let slot : word;
assembly {
mstore(0, account)
slot := keccak256(0, 32)
}
return slot;
}
Memory
| Opcode | Signature | Description |
|---|---|---|
mload(p) | word -> word | Load word from memory at offset p |
mstore(p, v) | word -> word -> () | Store word v to memory at offset p |
mstore8(p, v) | word -> word -> () | Store the least significant byte of v at offset p |
msize() | word | Size of active memory in bytes |
mcopy(dst, src, size) | word -> word -> word -> () | Copy size bytes from src to dst |
memoryguard(n) | word -> word | Declare minimum memory usage to the optimizer |
Storage
| Opcode | Signature | Description |
|---|---|---|
sload(slot) | word -> word | Load value from storage slot |
sstore(slot, value) | word -> word -> () | Store value to storage slot |
function transfer(to : word, amount : word) -> () {
let callerSlot : word;
let toSlot : word;
let senderBal : word;
let recipientBal : word;
assembly {
callerSlot := caller()
toSlot := to
senderBal := sload(callerSlot)
recipientBal := sload(toSlot)
sstore(callerSlot, sub(senderBal, amount))
sstore(toSlot, add(recipientBal, amount))
}
}
Call Data
| Opcode | Signature | Description |
|---|---|---|
calldataload(p) | word -> word | Read 32 bytes from calldata at offset p |
calldatasize() | word | Total size of calldata in bytes |
calldatacopy(dst, src, size) | word -> word -> word -> () | Copy calldata into memory |
Return Data
| Opcode | Signature | Description |
|---|---|---|
returndatasize() | word | Size of the most recent return data |
returndatacopy(dst, src, size) | word -> word -> word -> () | Copy return data into memory |
Code
| Opcode | Signature | Description |
|---|---|---|
codesize() | word | Size of the current contract's bytecode |
codecopy(dst, src, size) | word -> word -> word -> () | Copy bytecode into memory |
extcodesize(addr) | word -> word | Bytecode size of external contract at addr |
extcodecopy(addr, dst, src, size) | word -> word -> word -> word -> () | Copy external bytecode into memory |
extcodehash(addr) | word -> word | Keccak-256 hash of external contract's bytecode |
datasize(name) | string -> word | Size of a named Yul data object |
dataoffset(name) | string -> word | Offset of a named Yul data object |
Control Flow
| Opcode | Signature | Description |
|---|---|---|
stop() | () | Halt execution successfully |
return(offset, size) | word -> word -> a | Return size bytes from memory at offset and halt |
revert(offset, size) | word -> word -> a | Revert with size bytes from memory at offset |
invalid() | () | Trigger the invalid opcode; consumes all remaining gas |
selfdestruct(addr) | word -> () | Destroy the contract and send balance to addr |
return and revert have a polymorphic return type a because execution does
not resume after them; they can appear in any expression position regardless of
the expected type.
Stack and Program Counter
| Opcode | Signature | Description |
|---|---|---|
pop(v) | word -> () | Discard a value |
pc() | word | Current program counter value |
Gas
| Opcode | Signature | Description |
|---|---|---|
gas() | word | Remaining gas for the current execution |
External Calls
| Opcode | Signature | Description |
|---|---|---|
call(gas, addr, value, inOffset, inSize, outOffset, outSize) | word^7 -> word | Call external contract; returns 1 on success |
callcode(gas, addr, value, inOffset, inSize, outOffset, outSize) | word^7 -> word | Like call, but runs in the caller's context |
delegatecall(gas, addr, inOffset, inSize, outOffset, outSize) | word^6 -> word | Delegatecall; preserves caller and value |
staticcall(gas, addr, inOffset, inSize, outOffset, outSize) | word^6 -> word | Read-only external call |
Contract Creation
| Opcode | Signature | Description |
|---|---|---|
create(value, offset, size) | word -> word -> word -> word | Deploy new contract; returns address |
create2(value, offset, size, salt) | word -> word -> word -> word -> word | Deploy with deterministic address |
Logging
| Opcode | Signature | Description |
|---|---|---|
log0(offset, size) | word -> word -> () | Emit log with no topics |
log1(offset, size, topic1) | word -> word -> word -> () | Emit log with 1 topic |
log2(offset, size, topic1, topic2) | word^4 -> () | Emit log with 2 topics |
log3(offset, size, topic1, topic2, topic3) | word^5 -> () | Emit log with 3 topics |
log4(offset, size, topic1, topic2, topic3, topic4) | word^6 -> () | Emit log with 4 topics |
function emitTransfer(from : word, to : word, amount : word) -> () {
let transferTopic : word;
assembly {
transferTopic := 0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef
mstore(0x00, amount)
log3(0x00, 0x20, transferTopic, from, to)
}
}
Blockchain Context
These opcodes expose information about the current transaction and block. All
return word.
| Opcode | Description |
|---|---|
address() | Address of the executing contract |
balance(addr) | Ether balance of addr in wei |
selfbalance() | Ether balance of the executing contract |
caller() | Address of the direct caller (msg.sender) |
callvalue() | Ether sent with the call in wei (msg.value) |
origin() | Address that originated the transaction (tx.origin) |
gasprice() | Gas price of the transaction |
chainid() | Chain identifier |
basefee() | Base fee of the current block |
blockhash(blockNumber) | Hash of the given block (only last 256 blocks) |
coinbase() | Beneficiary address of the current block |
timestamp() | Unix timestamp of the current block |
number() | Current block number |
difficulty() | Difficulty of the current block |
prevrandao() | Previous RANDAO value (post-Merge randomness source) |
gaslimit() | Gas limit of the current block |
function onlyOwner(ownerSlot : word) -> () {
let owner : word;
let msgSender : word;
let isAuth : word;
assembly {
owner := sload(ownerSlot)
msgSender := caller()
isAuth := eq(msgSender, owner)
}
if (isAuth) {
return ();
} else {
assembly { revert(0, 0) }
}
}
Variable Declaration and Assignment
SAIL distinguishes two kinds of mutable state: local variables declared inside function bodies and field variables declared inside contract bodies. Local variables exist only for the duration of a function call; field variables persist in contract storage across transactions.
Local Variable Declaration
A local variable is introduced with the let keyword inside a function body.
The type annotation and the initialiser are both optional.
Declaration without annotation or initialiser
let x;
The compiler assigns a fresh type variable to x and infers its type from
subsequent uses. The variable must be assigned before it is read; the compiler
does not insert a default value.
Declaration with a type annotation
let bal : word;
The type is fixed to word at the point of declaration. The variable is still
uninitialized; it must be assigned before use.
function loadBalance(account : word) -> word {
let bal : word;
assembly { bal := sload(account) }
return bal;
}
Declaration with an initialiser
An initialiser provides a value at declaration time. The type may still be omitted and will be inferred from the initialiser expression.
let amount = 100; // type inferred as word
let fee : word = 3; // type annotation and initialiser together
Initialised declarations are useful when the right-hand side is an expression whose type would otherwise be ambiguous:
data Result = Ok(word) | Err(word);
function safeTransfer(from : word, to : word, amount : word) -> Result {
let result = Result.Err(0); // type inferred as Result from constructor
let bal : word;
assembly { bal := sload(from) }
if (gte(bal, amount)) {
result = Result.Ok(amount);
}
return result;
}
Assignment
Simple assignment
An assignment statement writes a new value to an existing variable or to a contract field. The left-hand side must be an lvalue: a name or an indexed expression.
x = expr;
The type of expr must match the declared type of x.
contract Vault {
balance : word;
function deposit(amount : word) -> () {
let next : word;
next = balance;
balance = add(next, amount);
}
}
Compound assignment
The += and -= operators combine a read, an arithmetic operation, and a
write in a single statement.
balance += amount; // equivalent to balance = balance + amount
balance -= amount; // equivalent to balance = balance - amount
Compound assignment is most commonly used with contract fields:
contract ERC20 {
totalSupply : word;
feePool : word;
function mint(amount : word) -> () {
totalSupply += amount;
feePool += div(amount, 100);
}
}
Contract Field Variables
A field variable is declared inside a contract body with a mandatory type annotation and an optional initialiser. Fields are stored in contract storage and retain their values between calls.
contract Token {
owner : word;
supply : word;
paused : bool;
}
Fields are accessed and assigned by name from any function inside the same contract. A field cannot be accessed from a free function.
contract Token {
supply : word;
function mint(amount : word) -> () {
supply += amount;
}
function totalSupply() -> word {
return supply;
}
}
Field initialisers
An optional initialiser sets the field's value at deployment time. It is evaluated once when the contract is deployed.
contract Token {
supply : word = 0;
}
Contextual Assignment
The left-hand side of an assignment may be any expression that denotes an
lvalue. When the expected type of the right-hand side is determined by the
left-hand side, the contextual constructor shorthand .Constructor can be
used on the right-hand side.
data Result = Ok(word) | Err(word);
function main() -> Result {
let r : Result;
r = .Ok(0); // equivalent to Result.Ok(0)
return r;
}
Conditional Statement
The if statement executes a block conditionally on a boolean expression.
An optional else branch handles the false case.
if (condition) {
// executed when condition is true
}
if (condition) {
// true branch
} else {
// false branch
}
Both branches must produce the same type if the if statement appears in a
context where a value is expected. When used purely for side effects the
types need only be consistent:
contract Token {
paused : bool;
function transfer(to : word, amount : word) -> () {
if (paused) {
assembly { revert(0, 0) }
}
}
}
Note The condition must be of type
bool. SAIL does not implicitly convertwordtobool. Use an explicit comparison (x != 0) when the condition originates from awordvalue.
For Loop
The for statement provides a C-style counted loop. Its header has three
clauses separated by ;:
for ( ForInitStmt ; Condition ; ForPostStmt ) Body
The condition is any expression of type bool; the loop runs while it is
true.
Initialisation clause
The init clause runs once before the first iteration. It may:
-
Declare a new local variable (typed or untyped, with or without an initialiser):
for (let i = 0; i < 10; i = i + 1) { ... } for (let i : word; i < 10; i = i + 1) { ... } -
Assign to an already-declared variable:
let i : word; for (i = 0; i < 10; i = i + 1) { ... } -
Use a compound assignment or a plain expression.
Post-iteration clause
The post clause runs after each iteration, before the condition is re-tested. It
follows the same grammar as the init clause. A let binding introduced here
creates a fresh variable scoped to the body of that iteration only:
for (i = 0; i <= 0; let j = 1) {
s = j; // j is in scope here and rebound on every iteration
i = i + 1;
}
Complete examples
Accumulate a sum from 1 to 10:
import std.{Num, Add, Sub, Eq, Ord, Bounded, Typedef, le};
contract Sum {
function main() -> word {
let s = 0;
for (let i = 1; i <= 10; i = i + 1) { s = s + i; }
return s; // 55
}
}
Loop variable declared before the for:
contract Sum {
function main() -> word {
let i : word;
let s = 0;
for (i = 1; i <= 10; i = i + 1) { s = s + i; }
return s;
}
}
Loop variable shadows an outer declaration:
contract Shadow {
function main() -> word {
let i = 100;
let s = 0;
for (let i = 1; i <= 10; i = i + 1) { s = s + i; }
// i is 100 here again
return s;
}
}
Nested if inside a for body:
contract ForInner {
function main() -> word {
let result = 0;
for (let height = 0; height < 7; height = height + 1) {
if (true) { result = height; } else {}
}
return result;
}
}
Scope rules
A variable declared in the init clause is in scope for the condition expression, the post-iteration clause, and the entire body.
A variable declared in the post clause is in scope only for the body of the current iteration — it is re-bound at the start of each subsequent one.
The loop body is its own block; declarations inside it do not escape to the enclosing function.
Note The condition must be of type
bool. SAIL does not implicitly convertwordtobool. Use an explicit comparison (le(i, 10)ori <= 10) when comparing integer counters.
Expression Statements
Any expression may appear as a statement. The expression is evaluated for
its side effects and the result is discarded. This is the standard way to
call a function whose return type is ().
function emitTransfer(from : word, to : word, amount : word) -> () {
assembly {
mstore(0x00, amount)
log3(0x00, 0x20, 0xddf252ad, from, to)
}
}
function main(to : word, amount : word) -> () {
emitTransfer(caller(), to, amount); // expression statement: result () is discarded
}
Scope and Shadowing
Local variables are in scope from their declaration to the end of the enclosing block. A variable declared in an inner block shadows an outer declaration of the same name for the duration of that block.
function computeFee(amount : word) -> word {
let fee : word = 1;
{
let fee : word = div(amount, 100); // shadows outer fee inside this block
}
return fee; // refers to the outer fee; returns 1
}
Note The compiler uses unique identifiers internally, so shadowing is safe and does not cause name collisions in the generated code.
Functions
A function definition introduces a named computation that takes zero or more typed parameters and returns a value of a declared type. Functions can be defined at the top level of a source file, called free functions, or inside a contract body.
function name(param1 : Type1, param2 : Type2) -> ReturnType {
// body
}
Every top-level function must carry a complete type signature: every parameter
must be annotated with its type, and the return type must be provided after
->. The compiler rejects any top-level definition that omits an annotation.
Note The complete-annotation requirement applies to free functions and contract methods. It does not apply to lambda expressions or to local bindings inside a function body, where the compiler infers types from context.
Parameters
Parameters are declared as a comma-separated list enclosed in parentheses.
Each parameter has the form name : Type.
function transfer(to : word, amount : word) -> () {
let bal : word;
assembly { bal := sload(caller()) }
assembly { sstore(caller(), sub(bal, amount)) }
assembly { sstore(to, add(sload(to), amount)) }
}
A function that takes no arguments is written with an empty parameter list:
function sender() -> word {
let s : word;
assembly { s := caller() }
return s;
}
Return Type
The return type follows the parameter list after ->. Every top-level function
must declare its return type explicitly.
A function that returns no meaningful value uses the unit type ():
function emitTransfer(from : word, to : word, amount : word) -> () {
assembly {
mstore(0x00, amount)
log3(0x00, 0x20, 0xddf252ad, from, to)
}
}
Every execution path through the body must end with a return statement whose
expression has the declared return type.
Free Functions
A function defined outside any contract body is called a free function. Free functions are visible throughout the file in which they are defined and can be imported by other modules.
function isContract(addr : word) -> bool {
let size : word;
assembly { size := extcodesize(addr) }
return gt(size, 0);
}
contract Token {
function onlyContract(addr : word) -> () {
if (isContract(addr)) {
return ();
} else {
assembly { revert(0, 0) }
}
}
}
Polymorphic Functions
A function that works uniformly over multiple types can be made polymorphic
with a forall quantifier placed before the function keyword. The quantifier
lists the type variables that appear in the signature.
forall a . function identity(x : a) -> a {
return x;
}
forall a b . function fst(p : (a, b)) -> a {
match p {
| (x, y) => return x;
}
}
Type variables introduced by forall are instantiated at each call site. The
compiler specializes the function for every concrete type combination that
appears in the program.
Note Polymorphic functions are monomorphized by the specializer before code generation. Each distinct instantiation produces a separate function in the output. A call to
identitywith awordargument becomesidentity$wordin the compiled output. No polymorphism survives to the generated Yul.
Constrained Functions
A function may require that one or more of its type variables satisfy a type
class constraint. Constraints are written after the type variable list,
separated from the function keyword by =>.
forall a . class a:Checked {
function checkedAdd(x : a, y : a) -> a;
}
forall t . t:Checked => function safeTransfer(from : word, to : word, amount : t) -> t {
return Checked.checkedAdd(amount, amount);
}
Multiple constraints on different type variables are separated by commas:
forall a b . a:Eq, b:Eq => function transfersEqual(x : (a, b), y : (a, b)) -> bool {
match (x, y) {
| ((xa, xb), (ya, yb)) => return Eq.eq(xa, ya);
}
}
At each call site the compiler checks that the supplied types satisfy all listed constraints. If no instance is found a type error is reported.
Recursive Functions
A function may call itself recursively. The compiler adds the function name to the typing context before checking the body.
function sumBalances(slot : word, count : word) -> word {
if (eq(count, 0)) {
return 0;
} else {
let bal : word;
assembly { bal := sload(slot) }
return add(bal, sumBalances(add(slot, 1), sub(count, 1)));
}
}
Mutually recursive functions are also supported. The compiler detects mutual dependencies automatically through strongly-connected-component analysis and type-checks the group as a unit. Both functions must be defined in the same file.
data TxStatus = Pending | Confirmed;
function isPending(s : TxStatus) -> bool {
match s {
| TxStatus.Pending => return isNotConfirmed(s);
| TxStatus.Confirmed => return false;
}
}
function isNotConfirmed(s : TxStatus) -> bool {
match s {
| TxStatus.Confirmed => return isPending(TxStatus.Pending);
| TxStatus.Pending => return true;
}
}
Contract Functions
Functions defined inside a contract body have access to the contract's field variables. They follow the same signature rules as free functions.
contract ERC20 {
totalSupply : word;
function mint(amount : word) -> () {
totalSupply = add(totalSupply, amount);
}
function getTotalSupply() -> word {
return totalSupply;
}
}
Contract functions may read and write field variables. Free functions can only operate on their parameters and locally declared variables.
Pattern Matching in Function Bodies
Functions may use match statements to deconstruct algebraic data type values.
data Result = Ok(word) | Err(word);
function unwrapOrZero(r : Result) -> word {
match r {
| Result.Ok(v) => return v;
| Result.Err(_) => return 0;
}
}
Patterns may be nested arbitrarily. The wildcard pattern _ matches any value
without binding it. The compiler checks that the set of patterns covers all
possible constructors of the scrutinee type and reports an error for incomplete
matches.
Assembly in Function Bodies
Functions may contain assembly blocks to access EVM opcodes directly. Inside
an assembly block, Yul syntax is used. Variables declared in the surrounding
SAIL scope are accessible by name inside the block.
function loadBalance(account : word) -> word {
let bal : word;
assembly {
bal := sload(account)
}
return bal;
}
Variables assigned inside an assembly block must be declared with let in the
enclosing SAIL scope before the block opens. The type of such variables must be
word, since Yul operates exclusively on 256-bit machine words.
Warning The type checker cannot verify the semantic correctness of Yul code. Incorrect assembly can produce contracts that silently compute wrong results or revert unexpectedly. Minimize the size of assembly blocks and document any non-obvious invariants.
Missing Annotation Error
Omitting a parameter type or the return type on a top-level function is a compile-time error. The compiler reports the offending signature and explains what is missing.
// Error: parameter 'x' has no type annotation.
function bad(x) -> word {
return x;
}
Top-level function must have complete type annotations:
bad(x) -> word
Annotate every parameter (name : Type) and provide a return type (-> Type).
Omitting the return type is equally rejected:
// Error: return type is missing.
function alsobad(x : word) {
return x;
}
Type inference remains available inside function bodies for local variables and intermediate expressions. Only the function signature itself requires explicit annotations at the top level.
Assembly Blocks
SAIL allows inline assembly blocks that embed Yul statements directly in a function body. Assembly blocks provide unrestricted access to EVM opcodes and are the primary mechanism for operations that SAIL has no built-in syntax for, such as storage reads and writes, event emission, and ABI encoding helpers.
function loadBalance(account : word) -> word {
let bal : word;
assembly {
bal := sload(account)
}
return bal;
}
Warning The type checker cannot verify the semantic correctness of Yul code. Incorrect assembly can produce contracts that silently compute wrong results or revert unexpectedly. Minimize the size of assembly blocks and document any non-obvious invariants.
Yul Sublanguage
The contents of an assembly { … } block are written in
Yul, the low-level
intermediate language used by the Solidity compiler. Yul operates exclusively
on 256-bit machine words, the native value type of the EVM.
SAIL variables whose type is word are accessible by name inside the block.
Variables of other SAIL types cannot be referenced directly in Yul.
Variable Declaration
Inside an assembly block, Yul variables are declared with let and assigned
with :=. Yul let is separate from SAIL let: Yul variables exist only
within the enclosing Yul block.
assembly {
let ptr := mload(0x40) // Yul variable; exists only in this block
}
Multiple names may appear on the left-hand side of a single let to receive
the multiple return values of a built-in opcode:
assembly {
let success, returndata := call(gas(), target, value, argOffset, argSize, 0, 0)
}
Assignment
An assignment in Yul uses :=. The left-hand side must be either a Yul
variable or a SAIL word variable in scope.
function storeBalance(account : word, amount : word) -> () {
assembly {
sstore(account, amount) // EVM opcode: write amount to storage slot account
}
}
Assigning to a SAIL variable communicates a result back to the SAIL scope:
function getFreeMemPtr() -> word {
let ptr : word;
assembly {
ptr := mload(0x40)
}
return ptr;
}
Conditionals
Yul provides an if statement that executes a block when a condition is
non-zero. There is no else branch in Yul; use switch for multi-way
dispatch.
assembly {
if iszero(success) {
revert(0, 0)
}
}
Switch
The Yul switch statement dispatches on a value. Each case arm matches a
literal. An optional default arm matches any value not handled by a case.
assembly {
switch selector
case 0x70a08231 {
// balanceOf selector
}
case 0xa9059cbb {
// transfer selector
}
default {
revert(0, 0)
}
}
For Loops
Yul's for statement provides a general loop with an initialisation block, a
condition expression, a post-iteration block, and a body block.
contract ERC20 {
function sumSlots(startSlot : word, count : word) -> word {
let endSlot : word;
let total : word;
assembly {
endSlot := add(startSlot, count)
}
assembly {
for { let i := startSlot } lt(i, endSlot) { i := add(i, 1) }
{
total := add(total, sload(i))
}
}
return total;
}
}
The initialisation block may be empty ({}). break, continue, and leave
control loop execution:
| Statement | Effect |
|---|---|
break | Exit the innermost for loop immediately |
continue | Skip to the post-iteration block of the loop |
leave | Return from the enclosing Yul function |
Nested Blocks
An assembly block may contain nested Yul blocks { … }. Variables declared
inside a nested block are not visible outside it.
assembly {
{
let tmp := mload(0x00) // tmp is scoped to this inner block
mstore(0x20, tmp)
}
// tmp is not in scope here
}
Accessing SAIL Variables
Only SAIL variables of type word can be read or written inside an assembly
block. This restriction is enforced at compile time: every SAIL name that
appears inside Yul must resolve to a variable or parameter whose type is
word. Parameters, local variables declared with let, and contract field
variables are all subject to this rule.
function transfer(account : word, amount : word) -> () {
let bal : word;
assembly {
bal := sload(account) // account and bal are SAIL word variables
sstore(account, sub(bal, amount))
}
}
Note Variables assigned inside an assembly block must have been declared with
letin the enclosing SAIL scope before the block opens.
Rejected: parameter of type bool
A parameter of type bool cannot be named inside Yul. The compiler reports a
type mismatch because Yul has no boolean type and cannot represent the value.
// Error: bool is not word.
function bad(paused : bool) -> () {
assembly {
sstore(0, paused)
}
}
Types: bool and word do not unify
- in: function bad (paused : bool) -> () { ... }
To work with a bool value inside an assembly block, convert it to a word
first using an explicit conditional in SAIL.
Rejected: variable of an algebraic data type
A local variable whose type is a user-defined data type is equally
rejected. Sum and product types are not EVM words and have no direct Yul
representation.
data Result = Ok(word) | Err(word);
// Error: Result is not word.
function bad(r : Result) -> word {
let res : word;
assembly {
res := r
}
return res;
}
Types: Result and word do not unify
- in: function bad (r : Result) -> word { ... }
To operate on structured values from assembly, extract the relevant word
fields first in SAIL, pass them as word parameters or local variables, and
perform the Yul computation on those.
Common EVM Operations
The following table lists the EVM opcodes most frequently used in assembly blocks. For the full list see the EVM opcode reference.
| Opcode | Description |
|---|---|
add(a, b) | 256-bit addition (wrapping) |
sub(a, b) | 256-bit subtraction (wrapping) |
mul(a, b) | 256-bit multiplication (wrapping) |
div(a, b) | 256-bit unsigned integer division |
mod(a, b) | 256-bit unsigned modulo |
mload(p) | Read 32 bytes from memory at offset p |
mstore(p, v) | Write 32 bytes v to memory at offset p |
sload(k) | Read storage slot k |
sstore(k, v) | Write value v to storage slot k |
caller() | Address of the message sender |
callvalue() | Value (in wei) sent with the call |
calldataload(p) | Read 32 bytes from calldata at offset p |
iszero(x) | 1 if x == 0, else 0 |
revert(p, s) | Abort execution; return s bytes from p |
return(p, s) | Halt execution; return s bytes from p |
Assembly and the Hull IR
Assembly blocks pass through the compilation pipeline unchanged. The SAIL compiler includes them verbatim in the Hull IR, and the Yul code generator reproduces them without transformation. This means the programmer has full control over the generated Yul but also bears full responsibility for its correctness.
Datatypes
SAIL provides algebraic data types (ADTs) for defining structured values. An ADT declares a named type together with one or more constructors. Each constructor describes one way to build a value of that type and may carry zero or more fields of arbitrary types.
data Option(a) = None | Some(a);
Data types may be defined at the top level of a source file or inside a contract body.
Enumeration Types
The simplest kind of algebraic data type has only nullary constructors with no fields. Such a type acts as a finite enumeration.
data TokenStatus = Active | Paused | Deprecated;
Each constructor is a distinct value of the type. Enumerations are commonly used
wherever Solidity uses enum.
contract Registry {
data TokenStatus = Active | Paused | Deprecated;
function statusCode(s : TokenStatus) -> word {
match s {
| TokenStatus.Active => return 1;
| TokenStatus.Paused => return 2;
| TokenStatus.Deprecated => return 0;
}
}
function main() -> word {
return statusCode(TokenStatus.Active);
}
}
Constructors with Fields
A constructor can carry one or more fields. The field types are listed in parentheses, separated by commas.
data TxStatus = Pending | Settled | Failed;
data TxOutcome = Success(TxStatus) | Revert(TxStatus) | Unknown;
A constructor with fields is applied like a function:
TxOutcome.Success(TxStatus.Settled) produces a value of type TxOutcome
wrapping a value of type TxStatus.
Fields are extracted by pattern matching; there is no record-style field access. The pattern mirrors the constructor application:
function outcomeCode(x : TxOutcome) -> word {
match x {
| TxOutcome.Success(TxStatus.Settled) => return 1;
| TxOutcome.Revert(TxStatus.Failed) => return 2;
| _ => return 0;
}
}
Parametric Data Types
A data type can be parameterized by one or more type variables, making it a generic or parametric type. The type variables are listed in parentheses after the type name.
data Option(a) = None | Some(a);
Here a is a type variable. Option(word) is the type of optional words,
Option(bool) is the type of optional booleans, and so on. The type variable
a may appear in the field types of any constructor.
contract Option {
data Option(a) = None | Some(a);
function just(x : a) -> Option(a) {
return Option.Some(x);
}
function maybe(default : word, opt : Option(word)) -> word {
match opt {
| Option.None => return default;
| Option.Some(x) => return x;
}
}
function main() -> word {
return maybe(0, Option.Some(42));
}
}
The type checker verifies that every use of a parametric type supplies the
correct number of type arguments. Applying Option to two arguments, for
instance, is a type error.
Nested Pattern Matching
Patterns may be nested to arbitrary depth to match inside multiple layers of constructors in a single arm.
data Option(a) = None | Some(a);
// Unwrap an approval amount nested in two Option layers.
function resolveApproval(outer : Option(Option(word))) -> Option(word) {
match outer {
| Option.Some(Option.Some(x)) => return Option.Some(x);
| _ => return Option.None;
}
}
The wildcard _ matches any value without binding it. It can appear at any
depth in a pattern.
Opaque Wrappers
A single-constructor, single-field type is the standard idiom for introducing a
distinct type that is represented by an existing type at runtime. This is
similar to Haskell's newtype or Solidity's user-defined value types.
data uint256 = uint256(word);
uint256 is a type distinct from word even though it carries exactly one
word field. The type checker treats them as incompatible, preventing
accidental mixing. The wrapper is removed during compilation: uint256 values
occupy exactly one EVM word, just like word.
Wrapping and unwrapping are done explicitly with the constructor and a pattern:
function wrap(x : word) -> uint256 {
return uint256(x);
}
function unwrap(x : uint256) -> word {
match x {
| uint256(w) => return w;
}
}
The standard library defines uint256, bytes4, bytes32, and address this
way, each wrapping word.
Phantom Type Parameters
A type parameter that does not appear in any constructor field is called a phantom type parameter. It carries no runtime information but allows the type system to distinguish values that would otherwise be identical.
// 'a' is a phantom type parameter: the constructor Proxy carries no field of type 'a'.
data Proxy(a) = Proxy;
Proxy(word) and Proxy(bool) are distinct types at compile time but produce
the same runtime value. Phantom types are useful for passing type information to
functions without allocating extra memory.
forall a . class a:MemoryType {
function size(prx : Proxy(a)) -> word;
}
instance word:MemoryType {
function size(prx : Proxy(word)) -> word {
return 32;
}
}
The Proxy(a) argument lets the caller select which MemoryType instance to
use without passing an actual value of type a.
Note Because phantom type parameters leave the constructor's result type partially undetermined, the type checker requires an explicit type annotation whenever a
Proxyvalue is constructed in a context where the type cannot be inferred from surrounding expressions. Use the expression annotation formProxy : Proxy(word)to resolve the ambiguity.
Tuples
SAIL has built-in support for product types (tuples). A tuple type is written as a parenthesised, comma-separated list of component types. Tuple values are written the same way.
function swap(p : (word, bool)) -> (bool, word) {
match p {
| (x, b) => return (b, x);
}
}
Tuples of more than two elements are right-nested pairs internally. The type
(word, bool, word) is represented as pair(word, pair(bool, word)).
The unit type () is the zero-element tuple. It carries no information and is
used as the return type of functions that exist only for their side effects.
function storeBalance(account : word, amount : word) -> () {
assembly { sstore(account, amount) }
}
Note Tuple patterns may appear anywhere a pattern is expected, including inside constructor patterns:
forall a b . instance Zero:Nth((a, b), a) { function nth(idx : Proxy(Zero), tup : (a, b)) -> a { match tup { | (x, _) => return x; } } }
Contextual Constructor Syntax
In a context where the expected type is known, the module qualifier can be
omitted from a constructor name by prefixing it with .. The compiler resolves
the constructor to the appropriate type automatically.
data Option(a) = None | Some(a);
function just(x : word) -> Option(word) {
return .Some(x); // equivalent to Option.Some(x)
}
function nothing() -> Option(word) {
return .None; // equivalent to Option.None
}
The same shorthand works in patterns:
function isNone(o : Option(word)) -> bool {
match o {
| .None => return true;
| .Some(_) => return false;
}
}
The compiler reports an error if the expected type is not known or if the constructor name is ambiguous.
Type Synonyms
A type synonym introduces a new name for an existing type. Synonyms are purely a compile-time device: the compiler expands them before type checking and they leave no trace in the generated code.
type Int = word;
type Point = pair(Int, Int);
function makePoint(x : Int, y : Int) -> Point {
return (x, y);
}
function getX(p : Point) -> Int {
match p {
| (x, _) => return x;
}
}
Like data types, synonyms can have type parameters:
type Map(k, v) = pair(k, v); // toy example
Warning Recursive type synonyms are not allowed. A synonym must not refer directly or indirectly to itself. Attempting to define
type A = Bandtype B = Asimultaneously is a compile-time error.
Runtime Encoding
Algebraic data types compile to a uniform binary encoding in the generated Hull/Yul code.
Sum types (types with more than one constructor) are encoded as nested
binary sums using inl (left injection) and inr (right injection). A type
with n constructors becomes a right-nested binary tree of depth ⌈log₂ n⌉. For
example, a three-constructor type data T = A | B | C is encoded as:
A → inl ()
B → inr (inl ())
C → inr (inr ())
Product types (constructor fields, tuples) are encoded as right-nested
pairs. The three-field constructor data T = T(word, bool, word) becomes
pair(word, pair(bool, word)).
This uniform encoding is what the match compiler and the Hull back-end operate
on. It is not visible at the SAIL level.
Parametric Polymorphism
A parametric polymorphic function works uniformly over any type. The caller
does not need to know which concrete type is used; the function behaves
identically for all instantiations. SAIL supports parametric polymorphism
through forall quantifiers in function signatures.
Type Variables and forall
A type variable is a placeholder for any concrete type. To introduce type
variables in a function signature, place a forall quantifier before the
function keyword. The quantifier lists the type variable names separated by
spaces and terminated by a period.
forall a . function id(x : a) -> a {
return x;
}
The signature forall a . a -> a states that id accepts one argument of any
type a and returns a value of the same type a. The same type variable a
appears in both the parameter and the return position, so the caller knows that
the output type equals the input type.
Multiple type variables are listed in the same quantifier:
forall a b . function fst(p : (a, b)) -> a {
match p {
| (x, y) => return x;
}
}
forall a b . function snd(p : (a, b)) -> b {
match p {
| (x, y) => return y;
}
}
Type variables may appear in parameter types, the return type, and in type arguments to other type constructors:
data Option(a) = None | Some(a);
forall a . function just(x : a) -> Option(a) {
return Option.Some(x);
}
forall a . function fromOption(default : a, opt : Option(a)) -> a {
match opt {
| Option.None => return default;
| Option.Some(x) => return x;
}
}
Call-Site Instantiation
At each call site, the compiler determines the concrete type for every type variable from the types of the supplied arguments. No explicit type application is needed; the inference engine handles instantiation automatically.
forall a . function id(x : a) -> a {
return x;
}
contract C {
function main() -> word {
return id(42); // a is instantiated to word
}
}
Each combination of concrete types produces a distinct specialization during
compilation. A call to id with a word argument becomes id$word in the
generated output, and a call with a pair type becomes a separate function with
a distinct name. No polymorphism survives to the generated Yul.
Polymorphic Functions with Pattern Matching
Polymorphic functions frequently deconstruct structured values through pattern matching. The match compiler operates on the inferred type at each call site after instantiation.
data Pair(a, b) = Pair(a, b);
forall a b . function fst(p : Pair(a, b)) -> a {
match p {
| Pair(x, y) => return x;
}
}
forall a b . function snd(p : Pair(a, b)) -> b {
match p {
| Pair(x, y) => return y;
}
}
function addAmounts(x : word, y : word) -> word {
let res : word;
assembly { res := add(x, y) }
return res;
}
// A transfer record holds (sender address, amount).
function totalTransferred(p : Pair(word, word)) -> word {
return addAmounts(fst(p), snd(p));
}
contract ERC20 {
function main() -> word {
return totalTransferred(Pair(100, 200));
}
}
Mutually Recursive Polymorphic Functions
Polymorphic functions may call each other recursively. The compiler resolves mutual dependencies through strongly-connected-component analysis and checks all functions in a group together. Both functions must be defined in the same file.
data Option(a) = None | Some(a);
forall a . function orElse(primary : Option(a), fallback : Option(a)) -> Option(a) {
match primary {
| Option.Some(v) => return Option.Some(v);
| Option.None => return pickFirst(fallback, primary);
}
}
forall a . function pickFirst(x : Option(a), y : Option(a)) -> Option(a) {
match x {
| Option.Some(v) => return orElse(x, y);
| Option.None => return y;
}
}
The Subsumption Test
When a function carries a forall annotation, the compiler verifies that the
body is at least as polymorphic as the declared signature. This check is called
the subsumption test. It prevents signatures that claim more generality than
the body actually provides.
The test works in three steps:
- The declared type is skolemised: each type variable is replaced by a fresh rigid constant that cannot be unified with any other type.
- The body is type-checked independently, producing an inferred type.
- The inferred type must unify with the skolemised declared type. If a rigid
constant would need to be unified with a concrete type (such as
word), the body is not polymorphic enough and the compiler reports an error.
Error: return type is more polymorphic than the body
The most common subsumption failure occurs when the annotation promises that
the function works for any type a, but the body always produces a specific
type such as word.
// Error: the body always returns word, but the annotation says a.
forall a . function wrong(x : word) -> a {
return x;
}
Type not polymorphic enough! The annotated type is:
forall a . word -> a
but the infered type is:
word -> word
in:
forall a . function wrong (x : word) -> a
The body return x has type word -> word because x is declared as
word. The skolemised declared type requires the result to be a rigid
variable a, which cannot be unified with word. The compiler rejects the
definition.
Error: wrong type variable in the return position
A function that swaps the return type variable is caught by the same test.
// Error: the body returns the first component (type a),
// but the annotation declares the return type as b.
forall a b . function fst(p : (a, b)) -> b {
match p {
| (x, y) => return x;
}
}
Type not polymorphic enough! The annotated type is:
forall a b . (a, b) -> b
but the infered type is:
forall $t . ($t, $t) -> $t
in:
forall a b . function fst (p : (a, b)) -> b
The body returns x, which has the type of the first component. The inferred
type therefore unifies both components and the return, making them all the
same variable $t. The skolemised declared type requires the return to be the
rigid variable b (the second component), which is distinct from a. The
unification fails and the compiler reports the error.
Error: type variable forced to word by an assembly block
Assembly blocks operate exclusively on word values. If the body uses a type
variable as if it were word inside an assembly block, the inference engine
forces that variable to word, making the function monomorphic in the body
while the annotation still declares a type variable.
// Error: the assembly block forces a to word,
// so the body is monomorphic.
forall a . function double(x : a) -> a {
let res : word;
assembly { res := add(x, x) }
return res;
}
Type not polymorphic enough! The annotated type is:
forall a . a -> a
but the infered type is:
word -> word
in:
forall a . function double (x : a) -> a
The correct way to write this function is to restrict the parameter type to
word explicitly and drop the forall:
function double(x : word) -> word {
let res : word;
assembly { res := add(x, x) }
return res;
}
If a computation must be polymorphic in a type class sense (working for all types that support addition), use a constrained type variable instead of an assembly block. See the Type Classes section for details.
Specialization and Naming
The compiler eliminates all polymorphism before code generation through a
process called specialization (or monomorphization). Every call site that
instantiates a polymorphic function at a concrete type combination produces a
separate function definition in the output. The compiler chooses names of the
form name$Type for each specialization, for example id$word or
fst$word$bool.
This means:
- There is no runtime representation of type variables.
- Each specialized version is compiled independently and can be optimized on its own.
- Whole-program compilation is required: the specializer must see all call sites to determine which instantiations to generate.
Note A polymorphic function that is never called is not emitted at all. Only the specializations that are actually needed by the program appear in the compiled output.
Type Classes
A type class defines a named set of operations that a type must implement. Any type that provides implementations for all required operations is said to be an instance of the class. Type classes enable constrained polymorphism: a function may be parameterized over a type variable and simultaneously require that the variable belongs to one or more classes.
Class Declarations
A class declaration introduces a class name, a main type variable, and zero or
more method signatures. The forall quantifier is required when the class
declaration introduces type variables.
forall a . class a:Eq {
function eq(x : a, y : a) -> bool;
function ne(x : a, y : a) -> bool;
}
The type variable a that appears immediately before the colon is the main
type argument of the class. Every instance must supply a concrete type for this
variable.
A class with no methods defines a pure marker class:
forall a . class a:Serializable {}
Superclass Constraints
A class may require that its main type argument already belongs to another
class. This constraint is called a superclass constraint and is written before
the class keyword with =>.
forall a . a:Eq => class a:Ord {
function lt(x : a, y : a) -> bool;
function lte(x : a, y : a) -> bool;
}
Any instance of Ord must also be an instance of Eq. The compiler verifies
this at each instance declaration. If a function requires a:Ord, the
constraint a:Eq is automatically available without listing it explicitly.
Instance Declarations
An instance declaration provides implementations for all methods of a class for a specific type. The instance head names the class and supplies a concrete type for the main type variable.
instance word:Eq {
function eq(x : word, y : word) -> bool {
let res : word;
assembly { res := eq(x, y) }
return res;
}
function ne(x : word, y : word) -> bool {
let res : word;
assembly { res := iszero(eq(x, y)) }
return res;
}
}
A polymorphic instance applies to a family of types. The forall quantifier
lists the type variables that appear in the instance head:
data Pair(a, b) = Pair(a, b);
forall a b . a:Eq, b:Eq => instance Pair(a, b):Eq {
function eq(x : Pair(a, b), y : Pair(a, b)) -> bool {
match x, y {
| Pair(xa, xb), Pair(ya, yb) =>
return Eq.eq(xa, ya);
}
}
function ne(x : Pair(a, b), y : Pair(a, b)) -> bool {
return Eq.ne(x, y);
}
}
Calling Class Methods
Class methods are called with a qualified name of the form ClassName.method.
The compiler resolves the correct instance from the types of the arguments:
data Option(a) = None | Some(a);
forall a . a:Eq => function senderMatches(sender : a, expected : Option(a)) -> bool {
match expected {
| Option.None => return false;
| Option.Some(e) => return Eq.eq(sender, e);
}
}
Overlapping Instances
SAIL does not support overlapping instances. Two instances overlap when the same type can match both instance heads. The compiler reports an error at the second declaration:
data Box(a) = Box(word);
forall a . class a:C {}
forall a . instance Box(a):C {}
// Error: overlaps with the more general instance above.
instance Box(word):C {}
Overlapping instances are not supported
instance:
Box(word) : C
overlaps with:
Box(?$3) : C
Main and Weak Type Arguments
When a class has more than one type parameter, the parameter immediately before the colon in the class head is called the main type argument. The remaining parameters, listed after the class name in parentheses, are called weak type arguments.
// main ──┐ ┌── weak
forall a b . class a:Convert(b) {
function convert(x : a) -> b;
}
The distinction matters for instance resolution and for the three soundness conditions the compiler enforces.
Main type argument (a in a:Convert(b)): used as the primary key for
instance lookup. The compiler selects an instance by matching the main type
first. It must be determinable independently of the weak arguments.
Weak type arguments (b in a:Convert(b)): represent additional types
involved in the relationship. They may be determined by the main type argument
through the coverage condition, but they cannot introduce type variables that
are unconstrained at the call site.
Example: weak argument determined by main type
In the following instance, the main type Wei uniquely determines the weak
type Ether. The instance is well formed because the weak type variable is
replaced by a concrete type:
forall a b . class a:Convert(b) {
function convert(x : a) -> b;
}
data Wei = Wei(word);
data Ether = Ether(word);
instance Wei:Convert(Ether) {
function convert(x : Wei) -> Ether {
match x {
| Wei.Wei(w) =>
let e : word;
assembly { e := div(w, 1000000000000000000) }
return Ether.Ether(e);
}
}
}
contract C {
function main() -> word {
let result = Convert.convert(Wei.Wei(2000000000000000000));
match result {
| Ether.Ether(v) => return v;
}
}
}
Instance Soundness Conditions
To guarantee that instance resolution terminates and remains coherent, the compiler enforces three conditions on every instance declaration. Violating any of them is a compile-time error. Each condition can be relaxed by a pragma when a specific instance is known to be safe.
Coverage Condition
Every type variable that appears in a weak type argument position must be determined by the main type argument. The set of type variables bound by the main type must cover all type variables bound by the weak types.
Rejected example
data Box(a) = Box(word);
forall a b . class a:MyClass(b) {}
// Error: b appears only in the weak position; Box(a) does not determine b.
forall a b . instance Box(a):MyClass(b) {}
Coverage condition fails for class:
MyClass
- the type:
Box(a)
does not determine:
b
Accepted example
Replacing the unconstrained variable b with a concrete type eliminates the
violation:
forall a . instance Box(a):MyClass(word) {}
Patterson Condition
For each constraint in the instance context, the measure of the constraint must be strictly smaller than the measure of the instance head. The measure of a predicate is the total number of type constructors and type variables it contains, counting repetitions. Each type constructor or type variable contributes 1 to the measure, regardless of nesting.
This condition prevents instance search from entering an infinite loop when the same type class is used in both the context and the head.
Rejected example
forall a . class a:C1 {}
forall a . class a:C2 {}
// Context: U:C1 has measure 2, U:C2 has measure 2, total 4.
// Head: U:C1 has measure 2.
// Context measure (4) is not strictly smaller than head measure (2).
forall U . U:C1, U:C2 => instance U:C1 {}
Instance
U : C1
does not satisfy the Patterson conditions.
Accepted example
Wrapping the main type in a constructor increases the head measure so that each context constraint is strictly smaller:
data Wrap(a) = Wrap(a);
// Context: U:C1 has measure 2.
// Head: Wrap(U):C1 has measure 3 (Wrap + U + C1 name).
// 2 < 3, so the Patterson condition holds.
forall U . U:C1 => instance Wrap(U):C1 {}
Bound Variable Condition
Every type variable that appears in the instance context must also appear in the instance head. A type variable present only in the context cannot be determined from the types at the call site, making instance resolution ambiguous.
Rejected example
data Box(a) = Box(word);
forall a . class a:Eq {}
forall a b . class a:Container(b) {}
// Error: c appears in the context constraint c:Eq
// but not in the instance head Box(a):Container(a).
forall a c . c:Eq => instance Box(a):Container(a) {}
Bounded variable condition fails!
Accepted example
Remove the unused variable from the context, or include it in the head:
// No context needed.
forall a . instance Box(a):Container(a) {}
// Or: bring c into the head through the weak argument.
forall a c . c:Eq => instance Box(a):Container(c) {}
Pragmas
A pragma is a compiler directive that relaxes one of the three instance soundness conditions. Pragmas are written at the top level of a source file, before any declarations.
There are three pragmas, one per condition:
| Pragma keyword | Condition disabled |
|---|---|
no-coverage-condition | Coverage condition |
no-patterson-condition | Patterson condition |
no-bounded-variable-condition | Bound variable condition |
Each pragma has two forms:
// Disable for a specific list of classes (comma-separated).
pragma no-coverage-condition ClassName1, ClassName2;
// Disable globally for all classes in this file.
pragma no-coverage-condition;
Pragmas apply only to the file in which they appear. Importing a file does not inherit its pragmas, and the importing file's pragmas do not affect the imported declarations.
Warning Disabling these conditions can allow instances that cause the compiler's instance resolution to loop or produce incoherent results. Use pragmas only when you understand the implications for the specific class and instance involved.
pragma no-coverage-condition
Disables the coverage check for the listed classes. Use this when a weak type argument is deliberately left undetermined by the main type, for example in open type-indexed families where the relationship is established by context rather than by the instance itself.
pragma no-coverage-condition MyClass;
data Box(a) = Box(word);
forall a b . class a:MyClass(b) {}
// Accepted: coverage condition is disabled for MyClass.
forall a b . instance Box(a):MyClass(b) {}
Without the pragma, this declaration would produce:
Coverage condition fails for class:
MyClass
- the type:
Box(a)
does not determine:
b
pragma no-patterson-condition
Disables the Patterson measure check for the listed classes. Use this for class hierarchies where the instance search is known to terminate through structural arguments not captured by the simple measure metric.
pragma no-patterson-condition C1;
forall a . class a:C1 {}
forall a . class a:C2 {}
// Accepted: Patterson condition is disabled for C1.
forall U . U:C1, U:C2 => instance U:C1 {}
Without the pragma, this declaration would produce:
Instance
U : C1
does not satisfy the Patterson conditions.
pragma no-bounded-variable-condition
Disables the bound variable check for the listed classes. Use this when a context variable is intentionally existential, meaning it is chosen by the instance rather than derived from the call site.
pragma no-bounded-variable-condition Container;
data Box(a) = Box(word);
forall a . class a:Eq {}
forall a b . class a:Container(b) {}
// Accepted: bound variable condition is disabled for Container.
forall a c . c:Eq => instance Box(a):Container(a) {}
Without the pragma, this declaration would produce:
Bounded variable condition fails!
Combining Pragmas
Multiple pragmas may appear in the same file and may target the same class from different directives. All specified conditions are disabled independently:
pragma no-coverage-condition MyClass;
pragma no-patterson-condition MyClass;
pragma no-bounded-variable-condition MyClass;
data Box(a) = Box(word);
forall a . class a:Eq {}
forall a . class a:C1 {}
forall a b . class a:MyClass(b) {}
// Accepted: all three conditions are disabled for MyClass.
forall a b c . c:Eq, (a, b):C1 => instance Box(a):MyClass(b) {}
Modules
Every SAIL source file is a module. A module can import definitions from other modules and control which of its own definitions are visible to importers through export declarations. The module system provides explicit namespace management: a name imported from another module is not automatically available without qualification unless the import form places it directly into scope.
File Layout
A source file is a sequence of import declarations and top-level declarations in any order. Import declarations may appear before, after, or interleaved with top-level declarations.
( import-declaration | top-level-declaration )*
The conventional style places all imports at the top of the file, but this is not enforced by the compiler.
Import Forms
SAIL provides four import forms. Each form controls how the imported names are placed into scope and whether they require qualification to use.
Full module import
import modname;
Loads the module and makes all its exported names available under the
qualified prefix modname. No names are introduced into the unqualified
scope.
// token.solc exports: Token, transfer
import token;
function doTransfer(t : token.Token, to : word, amount : word) -> () {
return token.transfer(t, to, amount);
}
Module import with alias
import modname as Alias;
Same as a full import but assigns a shorter alias to the module. All qualified references must use the alias; the original module name is not available as a qualifier.
import token as T;
function doTransfer(t : T.Token, to : word, amount : word) -> () {
return T.transfer(t, to, amount);
}
After this import, writing token.transfer(t, to, amount) is an error because token
is not a known qualifier in this file.
Selective import
import modname.{Name1, Name2};
Loads the listed names directly into the unqualified scope. They can be used without any prefix.
import token.{Token, transfer};
function doTransfer(t : Token, to : word, amount : word) -> () {
return transfer(t, to, amount);
}
Each item in the selector list may optionally be renamed with as:
import selectlib.{keep as keep_, drop as drop_};
function main(x : word) -> word {
return drop_(keep_(x));
}
The original name keep is not placed into scope. Only the alias keep_ is available.
Multiple items may be renamed independently in the same selector list.
Constructors of an imported type must still be qualified with the type name even when the type itself was selectively imported:
import token.{Token};
function makeActive() -> Token {
return Token.Active; // correct
}
// Error: unqualified constructor.
function makeBad() -> Token {
return Active;
}
Unqualified constructor:
Active
Use Type.Constructor form.
Wildcard selective import
import modname.{*};
Places every exported name from the module into the unqualified scope.
Individual names may be excluded using hiding:
import globlib.{*} hiding {idWord};
function main(x : word) -> word {
let y : T = mkT(x); // mkT is in scope; idWord is not
match y {
| T.T(v) => return v;
}
}
The hiding clause accepts a comma-separated list of names to suppress:
import selectlib.{keep, drop} hiding {drop};
function main(x : word) -> word {
return keep(x); // drop is not in scope
}
Importing operator symbols
User-defined operator symbols are exported and imported using parenthesised
syntax (sym), where sym is the operator character sequence. Importing an
operator brings both its precedence/fixity declaration and its bound function
into scope.
// math.solc (declares and exports the operator)
infixl 70 (^^) => pow;
export { pow, (^^) };
function pow(b : word, e : word) -> word { ... }
// main.solc (imports the operator by symbol)
import math.{pow, (^^)};
contract Main {
function main() -> word {
return 2 ^^ 10; // 1024
}
}
Operator symbols may be mixed freely with ordinary names in the same selector list. Importing the function name alone (without the symbol) makes the implementation callable as a regular function but does not enable infix syntax.
Module Paths
A module path identifies the source file of a module relative to a root directory. Dots in the path correspond to directory separators.
Relative paths
A name without a lib. prefix is a relative path. The compiler resolves
it relative to the directory that contains the importing file.
import foo.bar; // loads foo/bar.solc
import foo.bar.baz; // loads foo/bar/baz.solc
After a plain import foo.bar, the module is accessible under the full
dotted qualifier:
import foo.bar;
function main() -> word {
return foo.bar.value();
}
Library paths
A path that begins with lib. is treated as an absolute library path,
resolved from the root of the current library rather than the current
directory.
export lib.some.module; // re-exports some/module.solc from the library root
Library paths are mainly used in re-export declarations to expose a module from a different directory tree. They are not commonly used in import statements directly.
External library paths
A path that begins with @libname. refers to a module in a separately
configured external library root. External libraries are registered in the
build configuration; the compiler resolves them to absolute paths at build
time.
import @extlib.math.api;
contract External {
function main() -> word {
return math.api.sum(39);
}
}
An alias keeps the reference concise:
import @extlib.math.api as MathApi;
function main() -> word {
return MathApi.sum(39);
}
Standard library
The name std and any name that begins with std. are resolved to the
standard library. The standard library root is configured separately from
user libraries.
import std;
function main() -> word {
return std.addWord(21, 21);
}
Qualified Names
When a module is imported with a full or aliased import, its exported definitions are accessed through a dotted qualifier. The qualifier may prefix types, functions, and constructors.
Qualified type names
import token;
function doTransfer(t : token.Token, to : word, amount : word) -> () {
return token.transfer(t, to, amount);
}
Qualified constructor expressions
Constructors are written as qualifier.TypeName.Constructor:
import token;
function makeActive() -> token.Token {
return token.Token.Active;
}
Qualified constructor patterns
The same qualified form is used in pattern matching:
import token;
function isActive(t : token.Token) -> word {
match t {
| token.Token.Active => return 1;
| token.Token.Paused => return 0;
}
}
Qualified names with aliases
When the import carries an alias, replace the module name with the alias in all qualified references:
import token as T;
function makeActive() -> T.Token {
return T.Token.Active;
}
Export Declarations
An export declaration controls which definitions an importing module can see. Definitions that are not listed in an export declaration are private to the file.
Note A file without any export declaration exports nothing. All definitions are private unless explicitly exported.
Explicit export list
export { Name1, Name2, TypeName };
Names are listed by their unqualified identifier.
export { Bool, not, C, D, id };
Exporting a type with constructors
By default, exporting a type name makes the type visible but keeps its constructors private. An importer can use the type in signatures but cannot construct or pattern-match on its values.
To export constructors explicitly, list them in parentheses after the type name:
export { Token(Ok) }; // exports only the Ok constructor
export { Token(Ok, Err) }; // exports both constructors
export { Bool(*) }; // exports Bool and all its constructors
Exporting operator symbols
Operator symbols are separate exportable entities from the functions they implement. To make an operator available to importers as infix syntax, both the function name and the operator symbol must appear in the export list:
infixl 70 (^^) => pow;
export { pow, (^^) };
Exporting only pow (without (^^)) makes the function callable by name
but does not grant importers access to the infix ^^ syntax.
Wildcard export
export { * };
Exports everything defined in the file. Constructors of all types are also exported.
Re-exporting another module
A module may forward its imports so that importers receive them as if they came from the re-exporting module.
Re-export a whole module:
// api.solc: makes all of util's exports available under api.util.*
export lib.reexport_module.pkg.util;
An importer of api then accesses the re-exported names through the full
chain:
import reexport_module.pkg.api;
function main() -> word {
return api.util.unwrap(api.util.Wrap.Mk(1));
}
Re-export a module under an alias:
// api_alias.solc
export lib.reexport_module.pkg.util as Utils;
import reexport_module.pkg.api_alias;
function main() -> word {
return api_alias.Utils.unwrap(api_alias.Utils.Wrap.Mk(1));
}
Re-export selected names from a module:
import hidden_ctor_lib;
export hidden_ctor_lib.{Token(Ok)}; // re-exports Token type with Ok constructor only
export hidden_ctor_lib.{mkErr}; // re-exports only the mkErr function
Hidden Constructors
When a constructor is not exported, importers receive an opaque type: they can name the type and pass values around, but they cannot construct new values directly or inspect existing ones through pattern matching. The only way to create or examine values of an opaque type is through the functions the module chooses to export.
// hidden_ctor_lib.solc
export {Token(Ok), mkOk, mkErr};
data Token = Ok(word) | Err(word);
function mkOk(x : word) -> Token { return Token.Ok(x); }
function mkErr(x : word) -> Token { return Token.Err(x); }
The module exports Token with only the Ok constructor visible. The Err
constructor is private.
An importer that selects only the type cannot use the hidden constructor:
import hidden_ctor_lib.{Token};
// Error: Err is not exported.
function bad() -> Token {
return .Err(1);
}
No matching constructor for shorthand expression:
.Err
Pattern matching on the hidden constructor is equally rejected:
import hidden_ctor_lib.{Token, mkErr};
// Error: Token.Err is not in scope.
function bad(x : word) -> word {
match mkErr(x) {
| Token.Err(v) => return v;
| _ => return 0;
}
}
Undefined name: Token.Err
Transitive Imports
Importing a module does not automatically make its own imports visible. If
module A imports module B, and module C imports module A, then C
sees only the names that A chose to export. Names that B exported to A
but that A did not re-export are not visible in C.
// transitive_dep_base.solc
export { g };
function g() -> word { return 1; }
// transitive_dep_mid.solc
import transitive_dep_base.{g};
export { f };
function f() -> word { return g(); }
// transitive_dep_main_select.solc
import transitive_dep_mid.{f};
function main() -> word { return f(); } // g is not in scope here
Name Shadowing
A locally defined function or parameter shadows an imported name of the same identifier. The imported name remains accessible through its qualified form.
import token;
// Local 'transfer' shadows token.transfer for unqualified calls.
function transfer(to : word, amount : word) -> () { return (); }
function main(to : word, amount : word) -> () {
return token.transfer(to, amount); // uses the imported transfer, not the local one
}
A locally defined name also shadows a selectively imported name:
import erc20lib.{balanceOf};
// Local 'balanceOf' shadows the imported one.
function balanceOf(account : word) -> word { return 0; }
function main(account : word) -> word {
return balanceOf(account); // calls the local balanceOf
}
Common Errors
Using an unqualified name after a full import
A full import (import modname;) requires all names to be qualified. Using
an imported name without the qualifier is an error:
import token;
// Error: 'transfer' is not in scope unqualified.
function main(to : word, amount : word) -> () {
return transfer(to, amount);
}
Undefined name: transfer
The fix is to qualify the call: return token.transfer(to, amount);.
Using the original name after aliasing
When an alias replaces the module name, the original name is not a valid qualifier:
import erc20.token as T;
// Error: erc20 is not a qualifier in this file.
function main(to : word, amount : word) -> () {
return erc20.token.transfer(to, amount);
}
Undefined name: erc20
The fix is to use the alias: return T.transfer(to, amount);.
Using a type without qualifying its constructor
Selectively importing a type name does not bring its constructors into the
unqualified scope. Constructors must always be written as TypeName.Constructor:
import token.{Token};
// Error: unqualified constructor.
function makeActive() -> Token {
return Active;
}
Unqualified constructor:
Active
Use Type.Constructor form.
Using a type not in scope at all
Importing a module without qualification and then using a name without the module qualifier fails:
import token;
// Error: Token is not in the unqualified scope.
function bad(t : Token) -> word {
return 0;
}
Undefined type constructor:
Token
Use token.Token, or switch to a selective import.
Type Inference
SAIL uses a constraint-based bidirectional type inference algorithm. The algorithm divides type information into two flows: a bottom-up flow that generates a type from an expression, and a top-down flow that pushes an expected type down into an expression. Both flows run simultaneously through unification, which merges type information from different sources into a consistent solution.
The scope of inference is deliberately limited. Top-level function signatures require complete explicit annotations. Inference operates freely inside function bodies: local variable types, intermediate expression types, and the types of arguments to generic functions are all inferred automatically without any annotation.
What Requires Annotations
Every top-level function must carry a complete type signature: every parameter must be annotated and the return type must be declared. The compiler rejects any top-level definition that omits an annotation. This rule applies to free functions and to functions defined inside a contract body.
// Required: every parameter and the return type are annotated.
function transfer(to : word, amount : word) -> () {
let bal : word;
assembly { bal := sload(caller()) }
assembly { sstore(caller(), sub(bal, amount)) }
assembly { sstore(to, add(sload(to), amount)) }
}
For a full description of the annotation requirement and the error it produces, see Functions.
What Is Inferred
Inside a function body, the compiler infers types for:
- Local variables declared with
let, whether or not they carry an annotation. - Intermediate expressions and subexpressions.
- Type arguments to polymorphic function calls.
- Type class constraints required by the body.
The inferred types are propagated through unification. Whenever the type of an expression is used in a position where another type is expected, the two are unified: the algorithm finds the most general substitution that makes them equal, or reports an error if no such substitution exists.
Local Variable Inference
A let declaration without a type annotation introduces a fresh type variable.
The compiler assigns a concrete type to that variable as soon as enough
information is available from the surrounding context.
Inferred from initialiser
When a let declaration includes an initialiser, the type is taken from the
initialiser expression. Integer literals always have type word.
function demo() -> word {
let amount = 100; // amount : word, from the integer literal
let flag = true; // flag : bool, from the boolean literal
return amount;
}
Inferred from subsequent use
When no initialiser is present, the type is inferred from the first use of the variable.
function loadBalance(account : word) -> word {
let bal : word; // annotated; no inference needed
assembly { bal := sload(account) }
return bal;
}
function compute(account : word) -> word {
let x; // no annotation, no initialiser
x = sload(account); // first assignment: x : word
return x;
}
Inferred from return context
A variable whose type depends on an algebraic data type can have its type fixed by the expected return type.
data Result = Ok(word) | Err(word);
function demo() -> Result {
let x = Result.Err(0); // x : Result, inferred from constructor
return x;
}
Contextual Constructor Shorthand
A constructor expression written as .Constructor (with a leading dot and no
type prefix) is resolved using the expected type at the point of use. The
compiler uses the context to determine which type the constructor belongs to.
In a return statement
The declared return type provides the expected type:
data TxStatus = Pending | Confirmed | Reverted;
function initialStatus() -> TxStatus {
return .Pending; // resolved as TxStatus.Pending from the return type
}
In a typed assignment
The declared type of the left-hand side provides the expected type:
data TxStatus = Pending | Confirmed | Reverted;
function demo() -> TxStatus {
let s : TxStatus;
s = .Confirmed; // resolved as TxStatus.Confirmed from the declared type of s
return s;
}
Error: ambiguous shorthand
If no expected type is available, the shorthand cannot be resolved and the compiler reports an error:
data TxStatus = Pending | Confirmed | Reverted;
function bad() -> word {
let x = .Pending; // no expected type available for x
return 0;
}
Cannot resolve shorthand constructor expression without expected constructor type:
.Pending
The fix is to annotate the variable: let x : TxStatus = .Pending;.
Integer Literals
An integer literal always has type word. There is no numeric type class or
overloading for integer literals in SAIL. Every integer literal that appears
in source code is a 256-bit EVM word value.
function demo() -> word {
let amount = 1000;
let decimals = 18;
let mask = 0xffffffffffffffffffffffffffffffffffffffff;
return amount; // amount, decimals, mask all have type word
}
Polymorphic Function Call Inference
When a polymorphic function is called, the compiler instantiates the type variables from the types of the supplied arguments. No explicit type application is needed.
forall a . function id(x : a) -> a {
return x;
}
function demo() -> word {
return id(42); // a instantiated to word at this call site
}
For a pair function with two type variables, both are instantiated independently:
data Pair(a, b) = Pair(a, b);
forall a b . function fst(p : Pair(a, b)) -> a {
match p {
| Pair(x, y) => return x;
}
}
function demo() -> word {
return fst(Pair(42, true)); // a = word, b = bool
}
Constraint Inference
When a function body calls a type class method, the compiler generates a constraint on the type variable involved. If the function is top-level and already carries an annotation, the annotation must list the constraint explicitly. If the annotation omits the constraint, the subsumption check catches the mismatch.
At a call site where the type variable is fixed to a concrete type, the compiler checks that an instance exists for that type. If no instance is found, the compiler reports an unsolved constraint error.
Constraint resolved at call site
forall a . class a:Encodable {
function encode(x : a) -> word;
}
instance word:Encodable {
function encode(x : word) -> word { return x; }
}
forall a . a:Encodable => function encodeField(x : a) -> word {
return Encodable.encode(x);
}
contract ERC20 {
function main() -> word {
return encodeField(42); // a = word; word:Encodable resolved
}
}
Error: no instance for the required type
If the concrete type at the call site has no matching instance, the compiler reports which constraint could not be satisfied and which instances are defined:
forall a . class a:SafeArith {
function safeAdd(x : a, y : a) -> a;
}
// No instance for word is declared.
function bad(x : word, y : word) -> word {
return SafeArith.safeAdd(x, y);
}
Cannot entail:
word : SafeArith
using defined instances:
The fix is either to declare instance word:SafeArith { ... } or to add the
constraint to the calling function's signature so the obligation is propagated
to the caller:
forall a . a:SafeArith => function bad(x : a, y : a) -> a {
return SafeArith.safeAdd(x, y);
}
Phantom Type Variables
A phantom type parameter is a type parameter of a data type that does not appear in any constructor field. When a phantom constructor is used inside a function body and the phantom parameter cannot be determined from the context, the compiler reports an ambiguous type variable error.
data TypedSlot(a) = TypedSlot(word); // a is phantom: it appears in no field
function bad() -> word {
let s = TypedSlot.TypedSlot(0); // a is unconstrained; no context fixes it
return 0;
}
Ambiguous type variable(s) $1 in definition of bad.
This typically occurs when a constructor has phantom type parameters.
Please, add a type signature to fix the ambiguous type variable.
The fix is to annotate the let declaration with the full type, giving the
phantom parameter a concrete value:
function good() -> word {
let s : TypedSlot(word) = TypedSlot.TypedSlot(0);
return 0;
}
Unification Errors
A unification error occurs when two types that must be equal turn out to be incompatible. The compiler reports the two types and the expression that triggered the failure.
Return type mismatch
function bad(amount : word) -> bool {
return amount; // amount : word; expected bool
}
Types: bool and word do not unify
- in: function bad (amount : word) -> bool { return amount; }
Match arm return type mismatch
All arms of a match expression must produce the same type. Returning
different types in different arms is a unification error:
data Result = Ok(word) | Err(word);
function bad(r : Result) -> word {
match r {
| Result.Ok(v) => return v;
| Result.Err(_) => return false; // word expected; bool returned
}
}
Types: bool and word do not unify
- in: false
- in: function bad (r : Result) -> word { ... }
Algebraic data type vs primitive mismatch
User-defined types and primitive types such as word are never
interchangeable:
data TxStatus = Pending | Confirmed;
function bad(n : word) -> TxStatus {
return n; // word is not TxStatus
}
Types: TxStatus and word do not unify
- in: function bad (n : word) -> TxStatus { return n; }
Scope of Inference
Inference is local to function bodies. Each function is checked independently, and no type information flows between sibling functions except through their declared signatures. The declared signature of a function is the only interface that callers see; the body is invisible to inference in other functions.
This means that the order of function definitions in a file does not affect the types that inference assigns to expressions inside any given function. Mutually recursive functions are type-checked as a group (see Functions), but even then each function's signature must carry complete annotations.
Core Solidity
Lambda Functions
Numeric Types
Arithmetic
Data Locations
ABI Encoding / Decoding
Contracts
Hull
Hull is an intermediate language used by the Core Solidity compiler between SAIL and the Yul backend. Every SAIL program is lowered to Hull after type-checking, monomorphization, and match compilation. Hull is then translated to Yul for final code generation.
Hull retains algebraic data types (products and sums) from SAIL, but eliminates polymorphism, type classes, higher-order functions, and all surface syntax sugar. The result is a first-order, monomorphic language whose structure maps directly onto Yul constructs.
Note Hull is produced by the compiler and is not intended to be written by hand. The concrete syntax described in this document exists to make diagnostic
.hulloutput readable. It can be inspected by passing the-dump-hullflag to thesol-corebinary.
Relation to the Compilation Pipeline
The Core Solidity performs all stages up to and including Hull emission. The
yule compiler takes Hull as input and produces Yul. Yul is then compiled to
EVM bytecode by the standard solc compiler.
Types
All types in Hull are monomorphic. Type variables are fully instantiated by the specializer before Hull is produced.
word
word is a 256-bit unsigned integer, corresponding to the native word size of
the EVM. All arithmetic values, storage addresses, and ABI-encoded data are
ultimately represented as word.
bool
bool is the Boolean type. Its only values are the literals true and false.
unit
unit is the zero-size type. Its only value is (). It carries no information
and occupies no storage. unit appears as the payload type of nullary
constructors in sum types. For example, the None branch of Option(word) has
payload type unit.
Product Types
A product type is written as:
(T1 * T2)
It holds two values simultaneously, one of type T1 and one of type T2. The
* operator is right-associative, so A * B * C is parsed as A * (B * C).
The projections fst and snd extract the left and right components of a
product value, respectively.
Note N-ary products are not yet implemented. Products with more than two components are encoded as right-nested pairs:
(A, B, C)is represented as(A * (B * C)).
Sum Types
A sum type is written as:
(T1 + T2)
It holds a value of either type T1, tagged as the left branch, or type T2,
tagged as the right branch. The + operator is right-associative, so
A + B + C is parsed as A + (B + C).
Sum values are constructed using injection expressions (inl, inr, in(k))
and deconstructed using match statements.
An n-ary sum can also be written as sum(T1, T2, ..., Tn), but the binary
right-nested encoding is what the Yul code generator uses internally.
Named Types
A named type is written as:
Name{T}
It attaches a human-readable label to an underlying structural type T. The
name is purely documentary: the Yul backend strips it and operates on the
underlying structure. Named types originate from user-defined data
declarations in Core Solidity; the specializer fills in the concrete type
arguments and wraps the result in a named type to preserve the original name in
diagnostic output.
For example, the Core Solidity declaration:
data Option(a) = None | Some(a);
when specialized at a = word produces the Hull named type:
Option{(unit + word)}
Function Types
Function types are written as:
(T1, T2, ..., Tn -> R)
Function types appear only in function definitions; they are not first-class values. Hull has no higher-order functions and no closures.
Expressions
Integer Literals
An integer literal is a non-negative decimal integer that denotes a word
value. The value must fit in 256 bits.
42
1000000
Boolean Literals
true and false are the two values of type bool.
Unit
The expression () is the single value of type unit.
Variables
A variable refers to a locally declared name. Variable names follow the same
identifier rules as Yul: they may contain letters, digits, _, and $.
Note The compiler conventionally uses names beginning with
$for compiler-generated variables, such as$altfor the payload variable introduced by amatchalternative. User-visible names do not begin with$.
Pairs
A tuple expression constructs a product value:
(e1, e2)
The type of (e1, e2) is (T1 * T2) where T1 and T2 are the types of e1
and e2. Tuples with more than two elements are right-nested: (e1, e2, e3) is
equivalent to (e1, (e2, e3)).
Projections
fst(e) evaluates to the first component of the pair e. snd(e) evaluates to
the second component.
fst(e)
snd(e)
Both e must have a product type.
Sum Injections
A sum injection constructs a sum value. The following three forms are available:
| Form | Meaning |
|---|---|
inl<T>(e) | Injects e as the left branch of T |
inr<T>(e) | Injects e as the right branch of T |
in(k)<T>(e) | k-th injection into the n-ary sum T |
The type annotation T is the target sum type, not the type of e. This
annotation is mandatory: the Yul code generator needs the full sum type to
compute the correct memory representation of the injected value without
re-running type inference.
For example, inr<Option{(unit + word)}>(42) injects the word 42 into the
right branch of Option{(unit + word)}, representing Some(42).
inl<Option{(unit + word)}>(()) injects () into the left branch, representing
None.
Function Calls
A function call applies a named function to a list of arguments:
f(e1, e2, ..., en)
All call targets in Hull are statically known names produced by the specializer. There is no dynamic dispatch or virtual call mechanism.
Conditional Expressions
A conditional expression selects between two alternatives based on a boolean value:
if<T> cond then e1 else e2
cond must have type bool. Both e1 and e2 must have type T. The type
annotation T is mandatory. Conditional expressions are generated by the
if-desugaring pass from Core Solidity if/else expressions.
Statements
Statements are executed sequentially inside function bodies and blocks. There are no statement separators; statements are delimited by whitespace and the structure of the enclosing block.
Variable Declaration
let x : T
Declares a mutable local variable x of type T. The declaration does not
initialize the variable; it must be assigned before it is read. This corresponds
directly to Yul's uninitialized let declaration.
Warning Reading a variable before it has been assigned is undefined behaviour at the Yul level. The compiler always assigns variables before reading them, but hand-written Hull must respect this invariant.
Assignment
lhs := rhs
Assigns the value of rhs to lhs. The left-hand side is most commonly a
variable name, written as x := e.
Expression Statement
A bare expression can be used as a statement:
e
The expression is evaluated for its side effects and the result is discarded.
This form is used for calls whose return type is unit.
Return
return e
Returns the value of e from the enclosing function. Every execution path in a
Hull function must end with a return or revert.
Block
{
stmt1
stmt2
...
}
A block groups a sequence of statements. Blocks are used as the bodies of
match alternatives and function definitions.
Match
match<T> e with {
inl $alt => { ... }
inr $alt => { ... }
}
Deconstructs the sum-typed expression e of type T. Each alternative names
the payload of its branch and executes the corresponding block. The following
constructor patterns are available:
| Pattern | Matches |
|---|---|
inl | Left injection |
inr | Right injection |
in(k) | k-th injection (n-ary) |
The type annotation <T> is mandatory. It is used by the Yul translator to
determine the memory layout of the sum type.
Note Sum types with more than two constructors are represented as right-nested binary sums. Deconstructing them requires nested
matchexpressions: the outermatchseparates the first constructor (left branch) from the remainder (right branch), and inner matches continue the decomposition.
Function Definition
function f(x1 : T1, x2 : T2, ...) -> R {
stmt1
stmt2
...
}
Defines a named function with explicitly typed parameters and a single return type. Hull functions are translated directly to Yul functions.
The following properties hold for all Hull functions:
- Parameters and the return type must be given explicit Hull types.
- Functions are first-order: no function-typed parameters or return values.
- Each function has exactly one return type.
- The specializer produces unique names for each monomorphic instantiation, for
example
map$wordormaybe$Word.
Assembly Block
assembly {
<Yul statements>
}
An inline assembly block embeds raw Yul statements directly in the Hull output.
The block passes through unchanged into the generated Yul. Assembly blocks are
used to access EVM primitives such as mload, mstore, add, iszero,
revert, and similar opcodes.
See the Yul documentation for the complete overview of Yul syntax.
Revert
revert "message"
Immediately aborts execution. The string literal is a diagnostic label. The
match compiler generates revert statements for branches that are unreachable
at the Core Solidity level but must be given a code path in the Hull output.
Comment
/* text */
A block comment. Comments carry no semantics and are stripped during Yul
translation. The compiler inserts comments to annotate which source-level
constructor each $alt variable corresponds to, for example:
inl $alt => { /* None */
return n
}
Alternatives
An alternative is a branch in a match statement:
Con $var => { body }
Con is one of inl, inr, or in(k). $var is a fresh variable name that
is bound to the payload of the matched constructor. body is a block of
statements. The bound variable has the payload type of the matched branch: for
inl and inr on a sum type (T1 + T2), the payload types are T1 and T2
respectively.
Contracts and Objects
At the top level, a Hull program consists of named objects following the Yul
object model. An object has a name, a code block containing statements, and
zero or more inner objects:
object "ContractName" {
code {
...deployment code...
}
object "ContractName_deployed" {
code {
...runtime code...
}
}
}
The outer object's code block is the deployment (constructor) code. The inner
object's code block is the runtime code. This structure corresponds exactly to
the Yul object notation used by solc. During Hull emission, the Core Solidity
contract body is split into these two objects automatically.
Concrete Syntax Reference
The grammar diagrams below use the railroad diagram convention: rounded boxes denote terminals (keywords and punctuation literals) and rectangular boxes denote non-terminals.
Types
PrimaryType: the base forms that can appear in any type position:
Type: combines primary types with * (product) and + (sum), both
right-associative:
Expressions
PrimaryExpr: literals, variables, tuples, and calls:
Expr: the complete expression grammar, including injections, projections, conditionals, and primary expressions:
Statements
Stmt: the complete statement grammar:
Block: a brace-enclosed sequence of statements:
Auxiliaries
Arg: a typed parameter in a function definition:
Con: a constructor tag in a match alternative:
Alt: a match alternative, binding the payload to a name:
Objects
Object: a named code container, optionally containing inner objects:
Examples
The following examples show Core Solidity source programs alongside the Hull they produce. All examples are accepted by the Core Solidity prototype.
Identity Function
The simplest possible Hull function passes its argument through unchanged:
function id(x : word) -> word {
return x;
}
Hull output:
function id (x : word) -> word {
return x
}
Optional Value
The following Core Solidity program defines an Option type and a maybe
function that extracts the contained value or returns a default:
data Option(a) = None | Some(a);
function maybe(n : word, o : Option(word)) -> word {
match o {
| None => return n;
| Some(v) => return v;
}
}
After specialization at a = word, the Hull output is:
function maybe$Word (n : word, o : Option{(unit + word)}) -> word {
match<Option{(unit + word)}> o with {
inl $alt => { /* None */
return n
}
inr $alt => { /* Some */
let var_1 : word
var_1 := $alt
return var_1
}
}
}
Note that the function is renamed from maybe to maybe$Word by the
specializer to reflect the type instantiation. The Option{(unit + word)} named
type preserves the original constructor name in the type annotation.
Enumeration Type
The following program defines a three-constructor enumeration and converts it to an integer:
data Color = Red | Green | Blue;
function fromEnum(c : Color) -> word {
match c {
| Red => return 0;
| Green => return 1;
| Blue => return 2;
}
}
Hull output:
function fromEnum (c : Color{(unit + (unit + unit))}) -> word {
match<Color{(unit + (unit + unit))}> c with {
inl $alt => { /* Red */
return 0
}
inr $alt => match<(unit + unit)> $alt with {
inl $alt => { /* Green */
return 1
}
inr $alt => { /* Blue */
return 2
}
}
}
}
The three-constructor type Color is encoded as the right-nested binary sum
(unit + (unit + unit)). The outer match separates Red (left) from the
remaining constructors (right). An inner match then separates Green (left)
from Blue (right).
EVM Arithmetic via Assembly
The following contract uses an inline assembly block to invoke the add EVM
opcode:
contract Add1 {
function main() -> word {
let res : word;
assembly { res := add(40, 2) }
return res;
}
}
Hull output:
object "Add1" {
code {
...deployment code...
}
object "Add1_deployed" {
code {
function main () -> word {
let res : word
assembly {
res := add(40, 2)
}
return res
}
}
}
}
The assembly block is reproduced verbatim in the Hull output and passes
through unchanged into the generated Yul.