In this post, I’ll try to summarize all the user-facing features that were added to Sophia in the past two years. The work on the Sophia compiler was not only focused on adding new features, but also on bug fixing, higher stability, and internal developments that make the life of the compiler developers easier.
It’s not always convenient for a user of a language to go through the changelog of every release and try to see what new features were added and figure out use cases for them. So I’ll try here to list all the features that were added, along with the motivation to add them, and possibly some use cases.
Additions to the standard library
The Set standard library
Sophia has no builtin set type but it has a map that could be used as a set if we ignore the values of this map (i.e. map('a, unit)). The new set type is a wrapper around map('a, unit), and the Set standard library provides a bunch of useful functions that could be used when the problem at hand needs a set rather than a map of keys and values.
Before:
entrypoint uniq_count() =
  let elems = [1, 2, 3, 2, 3, 4]
  let l = List.map((x) => (x, ()), elems)
  let m = Map.from_list(l)
  Map.size(m)
After:
entrypoint uniq_count() =
  let elems = [1, 2, 3, 2, 3, 4]
  let s = Set.from_list(elems)
  Set.size(s)
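As a further illustration, here is a minimal sketch of a duplicate check built on the same two Set functions used above. It assumes the standard library files are pulled in with include "Set.aes" and include "List.aes"; the contract name SetExample and the entrypoint has_duplicates are made up for this example.

```sophia
include "List.aes"
include "Set.aes"

contract SetExample =
  // A list contains duplicates if collapsing it into a set loses elements.
  entrypoint has_duplicates(elems : list(int)) : bool =
    let s = Set.from_list(elems)
    Set.size(s) < List.length(elems)
```

The comparison works because a set keeps each element at most once, so its size can only be smaller than the list length when some element was repeated.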
The Bitwise standard library
The Bitwise standard library includes the most used operations on arbitrary precision integers. You can learn more about the functions of the Bitwise namespace in the standard library documentation.
New syntax
Loading namespaces with the using keyword
The using
statement allows bringing functions from different namespaces to the current namespace. If a function makes multiple calls to Pair.fst
for example, then it would be possible to type less characters by bringing the fst
function into the current namespace with using Pair for [fst]
. The using
statement could be used in a top-level scope, a contract scope, or a function scope.
Before:
entrypoint f() =
  let x = (1, 2)
  let y = (4, 5)
  Pair.fst(x) * Pair.snd(x) + Pair.fst(y) * Pair.snd(y)
After:
entrypoint g() =
  using Pair
  let x = (1, 2)
  let y = (4, 5)
  fst(x) * snd(x) + fst(y) * snd(y)
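The example above brings the whole Pair namespace into scope. The selective form mentioned earlier, using Pair for [fst], could be sketched as follows. The contract name UsingExample and the entrypoint sum_firsts are hypothetical, and the sketch assumes Pair is made available with include "Pair.aes".

```sophia
include "Pair.aes"

contract UsingExample =
  entrypoint sum_firsts() : int =
    // Only fst is brought into scope; snd still needs the Pair. prefix.
    using Pair for [fst]
    let x = (1, 2)
    let y = (4, 5)
    fst(x) + Pair.snd(y)
```

Importing only the names that are actually used keeps the rest of the namespace from clashing with local definitions.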
Allow assigning patterns to variables
Sometimes when we’re doing pattern matching, we are interested in multiple parts of the pattern. For example, if we are interested in the first two elements of a list, along with the tail of the list (the list without its first element), we would have to do pattern matching twice: once to get the first element along with the tail, and a second time to get the second element of the list. So we would write something like:
entrypoint f() =
  let l = [1, 2, 3, 4]
  let x::rest = l
  let y::_ = rest
  (x + y, rest)
After introducing the feature of assigning patterns to variables, we can instead match the first two elements of the list with let x::y::_ = [1, 2, 3, 4], and assign a name to the sub-pattern that makes up the tail of the original list with let x::(rest = y::_) = [1, 2, 3, 4]. The above example would look like this:
entrypoint f() =
  let l = [1, 2, 3, 4]
  let x::(rest = y::_) = l
  (x + y, rest)
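The let form only matches lists with at least two elements. A minimal sketch of the same named pattern inside a switch, with a fallback case for shorter lists, might look like this. It assumes the named-pattern syntax is accepted in switch cases in the same way as in let; the contract name PatternExample and the entrypoint first_two_sum are hypothetical.

```sophia
contract PatternExample =
  entrypoint first_two_sum(l : list(int)) : option(int * list(int)) =
    switch(l)
      // rest names the tail y::_, so x, y and the tail are bound in one match.
      x::(rest = y::_) => Some((x + y, rest))
      // Lists with fewer than two elements fall through to here.
      _ => None
```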
Pattern guards for functions and switch statements
Pattern guards restrict a branch so that it is only executed when some additional conditions are met. They can be used when pattern matching on function arguments or in the cases of a switch statement. In the following example, we want to match a list against the pattern a::[] only when the value of a is either between 10 and 20 or less than 0. Before pattern guards were added, we had to enter the branch and check the value of a using if-elif-else statements:
Before:
entrypoint f(l) =
  switch(l)
    a::[] =>
      if (a > 10 && a < 20)
        "ok1"
      elif (a < 0)
        "ok2"
      else
        switch(l)
          a::b::[] => "ok3"
          _ => "fail"
    a::b::[] => "ok3"
    _ => "fail"
Using guards, we can skip the branch entirely when the conditions are not met, so the above code becomes simpler:
entrypoint f(l) =
  switch(l)
    a::[]
      | a > 10, a < 20 => "ok1"
      | a < 0 => "ok2"
    a::b::[] => "ok3"
    _ => "fail"
Introduce the pipe operator |>
The pipe operator is a feature that is available in many functional programming languages. It’s a shortcut for feeding expressions into functions, which avoids wrapping each expression in parentheses. The value of this feature can be demonstrated with an example. Assume we have a few functions that perform transformations on some datatype:
contract C =
  function transform1(val) = ...
  function transform2(val) = ...
  function transform3(val) = ...
  function transform4(val) = ...
  function transform5(val) = ...
It’s possible to call these functions on some value by nesting the calls:
entrypoint f(val) =
  transform5(transform4(transform3(transform2(transform1(val)))))
A much cleaner way to do the above is to use the pipe operator:
entrypoint f(val) =
  val |> transform1
      |> transform2
      |> transform3
      |> transform4
      |> transform5
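Since the transform functions above are left abstract, here is a small self-contained sketch with concrete bodies. The contract name PipeExample and the helper functions double, inc and square are made up for illustration.

```sophia
contract PipeExample =
  function double(x : int) : int = x * 2
  function inc(x : int) : int = x + 1
  function square(x : int) : int = x * x

  // Equivalent to square(inc(double(x))); for x = 3 that is square(7) = 49.
  entrypoint f(x : int) : int =
    x |> double
      |> inc
      |> square
```

The pipeline reads top to bottom in the order the functions are applied, whereas the nested form has to be read inside out.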
Allow binary operators to be used as lambdas
Sometimes a binary operator needs to be passed as an argument to another function, but this was not possible in Sophia. The workaround was to create a lambda that does the exact same job as the binary operator. For example, if we wanted to pass the binary operator + as an argument to a function f, we would pass it as f((x, y) => x + y). So if we are trying to write a function that sums the elements of a list, we would write it as:
function sum(l : list(int)) : int =
  List.foldl((x, y) => x + y, 0, l)
But since the lambda (x, y) => x + y does the same job as the + operator, we have allowed binary operators to be used as lambdas when surrounded by parentheses (e.g. (+) is a lambda). The above example would look like this:
function sum(l : list(int)) : int =
  List.foldl((+), 0, l)
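The same idea should carry over to other binary operators. As a sketch, assuming (*) is accepted in the same way as (+), a product over a list could be written like this. The contract name OperatorExample and the entrypoint product are hypothetical, and List is assumed to be included with include "List.aes".

```sophia
include "List.aes"

contract OperatorExample =
  // (*) stands in for the lambda (x, y) => x * y.
  entrypoint product(l : list(int)) : int =
    List.foldl((*), 1, l)
```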
Add hole expression
Since Sophia is a strongly typed language, new users might sometimes hit a wall trying to figure out the exact types needed to compile their contracts. Hole expressions (written in Sophia as ???) are useful in that they tell you the exact type you need for the compilation to succeed. For example, when someone is not sure about the type of the first argument of the function List.map that would make the following code compile:
List.sum(List.map(f, [1,2,3]))
It’s possible to replace f with the hole expression ???:
List.sum(List.map(???, [1,2,3]))
The compiler would then produce the error message Found a hole of type (int) => int, meaning that a function of the type (int) => int should be used as the first argument of List.map in the above example.
Introduce contract-level compile-time constants
When a contract-level constant was needed, the main solution was to define a function with no arguments that returns the required constant.
contract C =
  function constant() = 42
  entrypoint f(x) =
    x * constant()
Calling a function every time the constant is needed is unnecessary if constants can be defined at the contract level. The above definition of the constant() function can be changed to a constant defined using let:
contract C =
  let constant = 42
  entrypoint f(x) =
    x * constant
Polymorphism support
A more detailed explanation of this feature, along with examples, can be found in the polymorphism section of the documentation. The Wikipedia page on polymorphism gives some insight into what polymorphism is as well.
Compiler flags
Add compiler warnings
A few compiler flags were introduced to make the compiler warn you about things that you should not be doing in your code, but that are not really errors. For example, if you have included some file but are not making any use of it, or if you have some function that you no longer use, the compiler will warn you about it. In some cases, it might be useful to treat such warnings as errors, so we have also added the compiler flag warn_error to treat all warnings as errors. Below is a list of all available warnings; the names of the flags are self-explanatory:
warn_unused_includes
warn_unused_stateful
warn_unused_variables
warn_unused_typedefs
warn_unused_return_value
warn_unused_functions
warn_shadowing
warn_division_by_zero
warn_negative_spend (in Chain.spend)
warn_all (Enable all of the above)
warn_error (Treat all warnings as errors)
Add options to enable/disable certain optimizations
This feature is not meant for people writing production smart contracts. Instead, it’s meant for developers who are trying to understand which FATE instructions are generated for their smart contracts. The Sophia compiler implements a bunch of optimizations that are all enabled by default, but this feature makes it possible to turn off some or all optimizations using compiler flags. You can find a comprehensive list of the optimizations in the compiler docs.
Additional changes coming in the Ceres protocol upgrade
Bitwise operations built-in
The Bitwise standard library will be replaced with built-in (binary) operations:
band, bor, bxor, bnot, <<, >>
These will be cheaper to use, and code can be written in a more natural way. For example, compare a band b to Bitwise.and(a, b).
Adding Int.mulmod
This will combine multiplication with modulus, which will save gas on computation, but most prominently on space, since you don’t need to allocate memory for the large product value. I.e. Int.mulmod(a, b, n) will be equivalent to (a * b) mod n.
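The equivalence can be sketched as two entrypoints that should return the same value. Note that Sophia spells the modulus operator as mod. The contract name MulmodExample and both entrypoint names are made up for this example, and the second entrypoint assumes a compiler and protocol version that already ship Int.mulmod.

```sophia
contract MulmodExample =
  // Computes the full product first, then reduces it.
  entrypoint mulmod_naive(a : int, b : int, n : int) : int =
    (a * b) mod n

  // Same result per the description above, without building the large product.
  entrypoint mulmod_builtin(a : int, b : int, n : int) : int =
    Int.mulmod(a, b, n)
```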
Adding Crypto.poseidon
The addition of a ZK(SNARK) friendly hash function is crucial to be able to write interesting and at the same time efficient zero-knowledge proof applications. The Crypto.poseidon hash function would take approximately 600 operations to implement in the underlying integer arithmetic circuit, while for example sha256 would require 20000 operations!
Adding Address.to_bytes
Easily converting an address to its byte array representation should be useful when hashing etc.
Replacing AENS with AENSv2
The introduction of raw data pointers to AENS means that, in order not to make old code invalid, we have to bump the AENS standard library. The new functionality is available in the namespace AENSv2.
Adding arbitrary sized binary data (still WIP, no link)
In order to make byte arrays more useful, we’re introducing the type bytes(), meaning a byte array of any size (a size not known at compile time). We’re extending Bytes.concat to also handle concatenation of arbitrary sized byte arrays, and we’re adding Bytes.split_any(b : bytes(), at : int) => option(bytes() * bytes()) that will (if possible) split an arbitrary byte array. There will also be functions to convert between fixed size byte arrays and arbitrary sized byte arrays. And we will add the functions Int.to_bytes(val : int, byte_size : int) => bytes() and String.to_bytes(s : string) => bytes().