Courses

This page is used for preparing the online Nemerle courses that will be held beginning with October 1st. Since that is a collaborative wiki, feel free to add any information you find useful, or discuss it on the forums. Please don't ask any question here about the course itself, this is just for creating the texts that will be used. Course attendants will get a mailing list and a teacher to check the exercises and answer questions.

Beginners Course

This course is intended for people who don't have any expertise or knowledge on application development. So it introduces terms such as variable, method, wanting to become a reference for a starting student.

Nemerle for OOP Programmers

This course is the first one that will be held, and it's intended for people who already have expertise in some OOP language, such as C#, C++, VB.NET or Java. This course aims to make students get in touch with Nemerle syntax and functional constructs. They will also get some information on .NET Framework development and macros.

Developing with Nemerle

While the other two courses aim to provide a good understanding on the language, this course is rather about how to use Nemerle to develop real applications. Some of the subjects that will be talked about are GTK#, Windows Forms, ADO.NET...

Nemerle for OOP Programmers Introduction

Welcome to the "Nemerle for OOP Programmers" course. This course aims to provide a solid introduction to Nemerle for developers that already have proficiency with an object oriented programming (OOP) language, especially C#, Java or C++. Care has been taken, though, to make the explanations clear, and simple enough even for VB programmers ; )

Each week, one PDF file (or maybe two, depending on the material), will be sent via the course mailing list. This PDF contains both the contents and the "homework" for a given week.

Mailing lists

This course has a special mailing list, where all members are registered by the course leaders. This is the main place to ask questions and get answers, so everyone can learn from others. So, please, use the mailing list provided and you'll receive an answer very soon.

The mailing list for this course is beginners-course@nemerle.org, so please update your mail program configuration so it knows it is not spam.

Exercises

Please send your exercises to michal.moskal@gmail.com, or trupill@yahoo.es (it is best to email both of them, except if you are willing to use Spanish (ask trupill) or Polish (ask michal)). These are the leaders for this course. They will try to correct them as soon as possible.

Please do not share the solutions on the mailing list - solving a problem when you have the solution laying around isn't much fun.

Nemerle for OOP Programmers Week 0

In this first lesson, we will get in touch with Nemerle syntax, and some of the basic features of the language. At the end, we will also learn how to install and configure a usable Nemerle compiler.

We assume that you have general knowledge about all the items that are treated here, so we won't dig so much into them.

Contents

Values and Variables

Just as a review: a value is a place in memory that holds some data. Each value has an identifier, or name, which allows us to refer to its contents, and operate on it. In most languages, identifiers declared in code are read-write variables, but this is not the default scheme in Nemerle. Indeed, there are two kinds of identifiers: mutable and immutable.

Immutable identifiers (values), far more commonly used in Nemerle, are defined with the following syntax. There's always an initializer, because if there wasn't, we wouldn't be able to use it at all:

def name : type = value

For mutable identifiers (variables) we just change the def keyword to mutable, like this:

mutable name : type = value

Immutable values are typically used for naming some object, and using that same instance for the life of the value. Mutable variables are used as place holders for some data which is meant to be changed in further computation. You will see that such a distinction leads to a different approach to solving programming problems.

Type Inference

You might have wondered why we use a special keyword for declaring values, instead of just writing type name; as in C. This is because the type can be skipped in most cases. So instead of writing:

def x : int = 42;

you write:

def x = 42;

This is not limited to literals, as you can also write something like:

mutable y = null;
// do something
y = SomeComplexObject ();

This is still OK.

Methods

As you surely know, a method or function is a group of statements, executed sequentially, which are given a meaningful name. Optionally, methods can have a return value as the final value of the computation. Methods can also have parameters, that is, values that are given to the method so it can know what to do. The general syntax for methods is:

name ( parameter1 : type, parameter2 : type ... ) : return_type

As you can see, the only difference when compared against C#, C++ or Java is that types are written after the parameter name, instead of before them. This is just a matter of convention; Visual Basic, for example, also places it after the name.

NOTE: type inference is not available for parameters and return types of methods. Having type inference for public methods could lead to accidently changing exported interfaces of classes — which is Not Good™. We plan to implement it for private methods though.

Returning Values

Continuing in the functional programming tradition, Nemerle doesn't use return or any other keyword to specify which value a function gives back as it's result. This fits very well in Nemerle: there are no goto, break or label statements that could disrupt normal program flow (well, there are some macros extending the language to allow them, but for now let us just ignore them).

The question then arises: if there is no keyword, how does the compiler know which value to return? The answer is very easy: the last value that has been computed. The easiest example to understand this is:

add (number1 : int, number2 : int) : int {
    number1 + number2
}

So the function's return type should agree with the type of last expression in its body. If it does not yield any value, then the function does not return anything (and its return type is called void).

Primitive Types

Well, up to this point we have used the word type lots of times, but we have not really defined it. Type is a fundamental concept in Object Oriented Programming. It represents an abstract idea of what a value is. For now, we will only introduce the primitive types:

Short name Full name Description
sbyte System.SByte A signed 8-bit wide integer.
short System.Int16 A signed 16-bit wide integer.
int System.Int32 A signed 32-bit wide integer.
long System.Int64 A signed 64-bit wide integer.
byte System.Byte An unsigned 8-bit wide integer.
ushort System.UInt16 An unsigned 16-bit wide integer.
uint System.UInt32 An unsigned 32-bit wide integer.
ulong System.UInt64 An unsigned 64-bit wide integer.
string System.String An immutable string of characters.
char System.Char A character, represented internally by a 16-bit integer.
void System.Void A type with exactly one value, named () in Nemerle. Used when a method doesn't return anything sensible.
float System.Single A 32-bit IEEE floating point number.
double System.Double A 64-bit IEEE floating point number.
decimal System.Decimal A type used for money representation.

During this course you probably won't touch anything beside int, string, char and void, so don't bother remembering the above :-)

The Full Name is the name used in the Base Class Libraries (BCL), which you can use to lookup information in the .NET Framework SDK Documentation, or similar reference. In code it is easier to use the short name, but the longer one is equally valid. They are interchangeable.

Nemerle is based on Microsoft .NET Framework, or its free implementation, Mono. That means that these types are not really just for Nemerle, but are common for any language targeting the .NET Framework. You have to adapt to this new platform, because while the syntactical parts of the language come from Nemerle, the practical parts (interaction with the user, communication over a network) come from the BCL. If you come from C#, you very likely know the Base Class Libraries well already. If you are new to .NET, you will need an introduction, so we will start with some working examples.

Simple Examples

For example, if I wanted to ask the user for her/his name and present a greeting, I would write a program something like this:

def name = System.Console.ReadLine ();
System.Console.WriteLine ("Hello " + name);

You can save these two lines into a file, say greeting.n. Compile it using:

 ncc greeting.n

and then run it (at the Windows cmd prompt):

 out

(or on Linux with mono, in a terminal window):

 mono out.exe

Nemerle does not require you to provide any classes or methods, you can just throw expressions to evaluate into a file and it will run it.

Another way to start a program's execution is to write a static method called Main in some class. So here, the example would be:

class Greeter {
  public static Main () : void
  {
    def name = System.Console.ReadLine ();
    System.Console.WriteLine ("Hello " + name);
  }
}

You cannot mix these two techniques, or for that matter, place top-level code in more than one file. What you can do is define some classes before (but not in the middle, or after) the actual code to be executed. This is what we will do in this course, as it is more concise.

So the most sophisticated way to write this simple example would be:

class Asker {
  public static Run () : string
  {
    System.Console.ReadLine ()
  }
}
 
def name = Asker.Run ();
System.Console.WriteLine ("Hello " + name);

Decision Statements

One of the main functionalities in every programming language is the ability to execute instructions depending on some values or some input from the user. If you know some other language, this should be a familiar concept. The relational operators in Nemerle are the ones found in C#, C++ and Java:

==Equal to
!=Not equal to
>Greater than
>=Greater or equal than
<Less than
<=Less or equal than

NOTE: other languages, such as Visual Basic or Pascal, use = as the logical operator for equal to. This is an assignment operator in Nemerle, therefore it is an error to use it as a relational operator.

Results from relational operators can be saved in bool values, and these values can be used like any other:

def number = int.Parse (System.Console.ReadLine ());
def isEven = (number % 2) == 0;
System.Console.Write ("The number is even :");
System.Console.Write (isEven);

This code will show True or False depending on the input from the user.

Combining relational operators

Sometimes it is necessary to combine more than one condition. This can be done using the following operators:

! c Negation, returns true if c was false, and false if c was true.
c1 && c2 And, checks c1, if it is false returns false otherwise returns c2.
c1 || c2Or, checks c1, if it is true returns true otherwise returns c2.

Both && and || are evaluated lazily, which means the following code:

when (foo != null && foo.Length > 0)
  System.Console.WriteLine (foo);

will never yield NullReferenceException, because the foo.Length will not be evaluated if foo != null was false.

This is the same behavior as in all C-like languages.

Using conditions

Once we have learned how to check for conditions, it's time to learn how to execute a piece of code depending on them. The three main conditional constructs for this task are:

when (condition)
  code

when executes the code only when the condition is true.

unless (condition)
  code

unless executes the code only when the condition is false.

if (condition)
  code1
else
  code2

if..else executes the first block of code when the condition is true, or the second if its result is false. On the main C-family languages, the notion of when or unless clauses does not exist, so they are managed using ifs without else. In Nemerle, every if has its else.

As you can see:

unless (something)
  DoSomethingElse ()

is exactly the same as:

when (!something)
  DoSomethingElse ()

Whether or not you use unless is up to you: use what reads better to you. No styling guidelines here.

In Nemerle, conditional statements are also functions (as in other functional languages), so the type of the return value of each branch must be the same. However, these constructs are mostly used as they are in C# or Java (the type of both branches is void), so there should be no problem if you don't miss your final ;. This is specially important in when/unless, because, as stated in Language Reference, they really translate to an if...else statement, which returns ().

Code blocks

The final word of this week are code blocks. Conditional statements support just a single statement written within their body. However, most of the time, you will need to execute more than one statement. This is achieved using code blocks: a series of statements written inside { and }, that are evaluated sequentially, and whose return type is the last value computed (remember, it's a functional language).

Exercises

The homework for this week is very easy to do. The intention is that the hardest part is the Nemerle installation.

Install Nemerle

Install the latest stable version of Nemerle in Linux, Windows, MacOS or whatever operating system you use. If you use Linux or MacOS, you will need to install Mono first. If you use Windows, you will need the latest release of .NET Framework 2.0 (Beta 2 is too old, use at least the August 2005 CTP).

The Step by step guide to get Nemerle compiler running on our Webpage has all the details, including MS.NET/Mono installation instructions.

You can ask any questions on the course mailing list. You can also configure some editors (such as SciTE, VIM, XEmacs, Kate, GEdit, mcedit) for Nemerle syntax highlighting. You can find configuration files in the SVN repository.

Exercise 1

Create a console application that asks the user for two numbers and then shows them added, subtracted, multiplied, divided and its modulus. You can also build upon it a program that shows the sin, cosine and tangent of a specified angle (look in the System.Math class).

Exercise 2

Add a menu (a list of possible operations and a prompt for a choice) to this program so the user can choose which operation to perform. Implements separate methods for reading numbers and displaying the menu.

Nemerle for OOP Programmers Week 1

During this lesson we will learn how to define good ol' classes in Nemerle. It works much like in C# (or Java, C++) but there are some differences.

We also take a look at loops and a few basic macros.

This week still doesn't introduce any functional features. We'll find out about them next week.

Contents

Class definitions

We assume you know what a class is.

Classes in Nemerle are defined with the class keyword. The basic form is:

class MyClass {
  // something inside
}

There are several possible entities that can be defined inside classes. All of them can be proceeded by access modifiers, and some by other modifiers.

Modifiers

There are several kinds of modifiers. Note that you're probably only going to use public, static and mutable at first.

We will first discuss access modifiers. Modifiers specify the kind of access other classes have to given entity. Access attributes are valid on all kinds of entities defined in classes. Everything is private by default. The public and internal modifiers are also valid on top-level classes (which are internal by default).

Next are the modifiers changing the semantics of a given entity:

Next are the modifiers used to control virtual methods and name hiding. We only explain new here. The others are described in the section about virtual calls.

Side note: mutability

One way to think about the mutable modifier is that it's much like default-private fields. That is, you need to make a conscious decision to make a field mutable, just as you need a conscious decision to make it public. Therefore fields will end up non-mutable only when they need to. As you continue programming in Nemerle, you will find yourself not needing that much mutability.

Accessing instance vs. static members

Values created from classes are known as instances of the class. Classes work like models, and instances are real representations of that model in memory. If you want to access a instance method, that is, a method not declared static, you first have to create an instance of the class using a constructor (see below). Then, you use the . operator to "dive into" the class members: properties, fields, methods...

On the other hand, static members can only be accessed through the name of the class. Trying to call a static method from an instance will always fail in Nemerle. Some other languages allow both uses, so be careful!

Custom attributes

This is rather advanced topic, but this syntax is reused by top-level macros, so we briefly mention it here.

In .NET all metadata elements (classes, methods, fields and so on) can have a special, additional information attached. This information is (in most cases) not interpreted in any way by the runtime, therefore it is good place for placing documentation, authorship/copyright tags, and so on. This information can be accessed by user applications at runtime.

The syntax for attaching this information in Nemerle (and in C# for that matter) is:

[MyCustomAttribute (7, "foobar")]
public class FooBar {
  [OtherCustomAttribute]
  SomeMethod () : void
  {
  }
}

Custom attributes are classes subtyping System.Attribute, so the user can define new custom attributes, though some are provided by the base class library.

If you're interested in this mechanism, it may be a good idea to consult MSDN attributes tutorial, after going through this course. It's about C#, but, except for minor syntactic differences it also applies to Nemerle.

In Nemerle, certain custom attributes trigger execution of macros - that is special compiler extensions. Later this week we will learn how to automatically create constructors and accessors.

Methods

Methods are used for grouping code together for later reuse. We've already seen how to define methods in the previous week.

As you recall, method calls are written using the method name, and then writing the parameters between parenthesis.

Fields

There is nothing special about fields in Nemerle (compared to fields in C#/Java/C++), except for their immutability. If you do not declare a field to be mutable, they can be only set inside the class or instance constructor (depending on if the field is static or not). This is much like the readonly modifier in C# or final in Java.

You can also initialize fields at the place of their definition, which causes the appropriate initialization code to be added to the class or instance constructor.

Fields are defined with the following syntax:

modifiers name : type;

or with the initializer:

modifiers name : type = expression;

Modifiers are optional, and fields are private by default. If you omit the initializer fields have the value of 0, 0.0 or null for reference types.

In fact, even if you supply the initializer you can still observe this default value, for example:

class Test {
  static public x : int = y + 1;
  static public y : int = x + 2;
}
System.Console.WriteLine (Test.x);
System.Console.WriteLine (Test.y);

will print 1 and 3, because when initializing x, y still has its default value of 0. As you can see, the initializations are performed top-down.

Instance constructors

Unlike in C#/Java/C++, the constructor has a fixed name, it's called this.

Example:

class Point {
  public x : float;
  public y : float;
  public this (ini_x : float, ini_y : float)
  {
    x = ini_x;
    y = ini_y;
  }
}

Instead of inventing this ini_ stuff one can also use this pointer directly:

class Point {
  public x : float;
  public y : float;
  public this (x : float, y : float)
  {
    this.x = x;
    this.y = y;
  }
}

The constructor is later called by the class name:

def the_point = Point (3.14, -3.14);

There is no new keyword, as is present in C# or Java.

The [Record] macro

If all your constructor is going to do is assign field values, which will all supplied by the user, you can use the [Record] macro. So the example above can be reduced to:

[Record]
class Point {
  public x : float;
  public y : float;
}

This useful especially for creating small helper classes.

Default constructor

If you do not provide an instance constructor at all, the compiler adds a default, public, parameter-less, empty one. Therefore the following piece of code is valid:

class MyClass {} 
def c = MyClass ();

Example

Now, now that we know about fields, methods and constructors, we can start some examples. We will try to program a robot called Marvin.

class Marvin {
  // Marvin starts with a fairly positive attitude
  mutable attitude : int = 100;
  mutable x : int;
  mutable y : int;
  mutable direction : int;
  
  public this (ini_x : int, ini_y : int)
  {
    x = ini_x;
    y = ini_y;
    // head north
    direction = 0;
  }
}

This example is completely equivalent to:

class Marvin {
  mutable attitude : int; // CHANGED
  mutable x : int;
  mutable y : int;
  mutable direction : int;
  
  public this (ini_x : int, ini_y : int)
  {
    attitude = 100; // CHANGED
    x = ini_x;
    y = ini_y;
    // CHANGED
  }
}

As you can see we moved initialization of attitude to the constructor. This does exactly the same thing that the field initializer does. We also removed direction = 0, because 0 is the default value for int anyway.

A little prize: string interpolation

Now we can add some more complicated behavior to Marvin, we start with a Turn method:

public Turn (delta : int) : void
{
  // we need to protect our engine, Marvin cannot
  // turn too much at once!
  if (delta < -90 || delta > 90)
    System.Console.WriteLine ($ "Cannot turn that much ($delta degrees)!");
  else {
    // having to turn makes our robot feel worse
    attitude -= 5;
    direction += delta;
    // call the robot's firmware here
  }
}

(Because of our NDA with Sirius Cybernetics, we cannot reveal how to call Marvin's firmware to actually make it move, therefore we will just skip this :-)

The interesting part is the usage of the $ macro. It takes a string as an argument, and interprets the dollar signs in it in a special way -- it evaluates them as expressions, calls ToString() on them and puts the results into the string. If you need to place a more complicated expression after $, you need to use parens, like in:

System.Console.WriteLine ($ "Cannot turn that much ($(delta / (180 * 3.1415)) radians)!");

This works mostly like in PHP, perl or Bourne Shell, but it is checked at the compile time, so if you put an undefined variable you will get a compile time error, not empty string at runtime (check it out!).

The Framework provides also another way to interpolate strings. Inside the string, you can include {n} directives, and they will be changed with the "n + 2"-th parameter. Console.WriteLine is one of the methods that allow this. In other places, one can use:

def addition = string.Format ("{0} + {1} = {2}", n1, n2, n1 + n2);

As you can imagine, it is easy to mess up matching the numbers, so better just stick with $ :-)

Some loops

Now we can add another method:

public TurnBack () : void
{
  for (mutable i = 0; i < 2; ++i)
    Turn (90);
}

The for loop works much like other C-like languages -- the first expression is evaluated before the loop, the second is the condition and the last one is the increment.

So what about a general turning method without limits?

public TurnAny (mutable delta : int) : void
{
  while (delta != 0)
    if (delta > 90) {
      Turn (90);
      delta -= 90;
    } else if (delta > 0) {
      Turn (delta);
      delta = 0;
    } else if (delta < -90) {
      Turn (-90);
      delta += 90;
    } else {
      Turn (delta);
      delta = 0;
    }
}

As you can see, we have a while loop too. Moreover, in order to assign values to a parameter, we need to mark it mutable.

More about classes!

Static constructors

A static constructor is executed once for the program (or more exactly, AppDomain) lifetime, when the class is first referenced. It's a good place to include some class-wide initialization code.

Initializers specified for static fields are added to the static constructor.

Static constructors need to be private, parameter-less and ergh... static. For example:

using System.Console;
 
class Hello {
  static this ()
  {
    WriteLine ("static ctor");
  }
 
  static public Run () : void
  {
    WriteLine ("Run");
  }
}
 
WriteLine ("hello");
Hello.Run ();
WriteLine ("bye");
 
/* Produces:
hello
static ctor
Run
bye
*/

Note how "hello" is written before static ctor is run, but before the Run method.

Properties

Properties provide language level support for the get/set accessor design pattern commonly found in Java and C++. Most .NET languages (including C#) provide properties. A property called Foo of type int is internally a set of two methods get_Foo() : int and set_Foo (value : int). However instead of writing:

def my_value = some_object.get_Foo ();
some_object.set_Foo (42);

we write:

def my_value = some_object.Foo;
some_object.Foo = 42;

As you can see this is only syntactic sugar. There is also a special syntax for defining properties:

using System.Console;
 
class MyClass {
  mutable m_foo : int;
  
  public Foo : int
  {
    get { 
      WriteLine ("get_Foo called");
      m_foo
    }
    set {
      when (value > 3)
        WriteLine ("value bigger than three");
      m_foo = value;
    }
  }
}
 
def c = MyClass ();
c.Foo = 1;
WriteLine ($ "c.Foo = $(c.Foo)");
c.Foo += 42;
WriteLine ($ "c.Foo = $(c.Foo)");
 
/* Output:
get_Foo called
c.Foo = 1
get_Foo called
value bigger than three
get_Foo called
c.Foo = 43
*/

Note how += is changed to c.Foo = c.Foo + 42, which results in call to get_Foo before setting the value.

Also note, that the set method has implicit parameter called value.

You can omit get or set, but not both. If you omit set, trying to write to the property will result in an error. Omitting get means you can't read the property.

The [Accessor] macro

If your property is only going to return the value of a field, you can use the [Accessor] macro. It sits in Nemerle.Utility. You place it on a field. If you don't supply any parameters, it generates a public getter:

using Nemerle.Utility;
 
class MyClass {
  [Accessor]
  foo_bar : string = "qux";
}
 
def c = MyClass ();
System.Console.WriteLine (c.FooBar);
 
/* Output: qux */

You can also supply another name for the accessor, request a setter, or make them internal. Please refer to accessor macros page for details.

Inheritance and virtual calls

Inheritance is one of the central concepts of OOP. It is a means of extending the functionality of an existing class. A class can inherit all the behavior of some original class, then extend it by adding new methods, fields and/or properties.

For example we can define a class called Robot and later define two classes inheriting from it: Marvin and R2D2. Robot is then said to be supertype of Marvin and R2D2, and Marvin and R2D2 are subtypes of Robot.

The inheriting class can also modify the behavior of the original class. This is done through virtual methods. For example, Robot can define a virtual SelfDestruct method, which R2D2 can override to refuse self-destruction, while Marvin can reuse the method from Robot (by just doing nothing).

The syntax is:

class Robot {
  public virtual SelfDestruct () : void
  {
    // call robot firmware here
  }
}
 
class R2D2 : Robot {
  public override SelfDestruct () : void
  { 
    System.Console.WriteLine ("refusing");
  }
}
 
class Marvin : Robot {
 // don't override anything
}

As you can see, the colon (:) is used to mark the class to inherit (Java uses extends keyword here). Methods need to marked virtual, otherwise they cannot be overridden (unlike in Java). Additionally, the overriding method needs to be marked override (unlike in C++).

Abstract methods

Methods can also be marked abstract. You mustn't provide any body in this case. It means the method has to be eventually overridden by an inheriting class. A class is marked abstract when it has one or more abstract methods, either defined in the class itself, or an abstract method defined in the super type that is not overridden. One cannot construct instances of abstract classes.

Example:

abstract class Robot {
  abstract SelfDestruct () : void;
}
 
class R2D2 : Robot {
  public override SelfDestruct () : void
  { 
    System.Console.WriteLine ("refusing");
  }
}
 
class Marvin : Robot {
  public override SelfDestruct () : void
  { 
    System.Console.WriteLine ("self destructing in 5 seconds...");
  }
}

Modules

Another common pattern found in some C# classes, like Console, is to provide neither public nor default instance constructors, and make all members static. This prevents every type of instantiation, and works very well for resources where it doesn't make sense to have more than one object at once (like console, file system in UNIX/Linux systems...).

In Nemerle, this pattern has a special construct: modules. A module is declared exactly like a class, but every method or field you declared on has an automagically added static modifier. For example:

module Operations {
  public Add (n1 : int, n2 : int) : int {
    n1 + n2
  }
 
  public Substract (n1 : int, n2 : int) : int {
    n1 - n2
  }
}
 
// Note that module members are accessed through class name
System.Console.WriteLine (Operations.Add (1, 2));

Type conversions

Sometimes you need to convert an object from one type to other. You might want to convert an int to a double, or a class to its parent. In Nemerle, there are two conversion operators (or casting, as you C# guys like to call them), depending on whether the cast is checked at compile or runtime:

Both operators are written the same way. First, the object you want to convert, then the operator, and finally, the result type.

def memoryStream = System.IO.MemoryStream ();
def stream = memoryStream : System.IO.Stream; // MemoryStream is a subclass of Stream
 
def n1 = int.Parse System.Console.ReadLine());
def n2 = int.Parse System.Console.ReadLine());
def resultAsShort = (n1 + n2) :> short; // We don't know if n1 + n2 fits on short, so we use :>

Exercises

Exercise 1

Create a Person class. This class needs to have FirstName, LastName and Telephone properties, as well as a BirthDate. BirthDate should be a record Date with the necessary fields. Additionally, add an Age property that computes the age of the person using the system current date. This property must not be writable.

Exercise 2

Play a little with inheritance and virtual calls. The traditional example is to create an Animal class with Legs property and a virtual Speak method. Then, create some more classes, like Dog, Cat..., overriding the Speak method. Watch the results!

Exercise 3

Take this small example and add ImperialTrooper subclass to it. DarthVader should have a radio connection (a reference) to imperial trooper, so he can ask (with a method) if he can see Luke.

Extra Exercise

Find out a little about XML Serialization (check MSDN, or Monodoc in the System.Xml.Serialization namespace), and write a method for opening and saving a contact.

Nemerle for OOP Programmers Week 2

This week we will learn some basics of functional programming. You're not assumed to know anything about this topic, as we will go into much detail here. There is a lot of new material this week -- this is where we leave the safe and known OO land.

Contents

Introduction to Functional Programming

The basic idea of Functional Programming (FP) is to consider the entire program as an expression to be evaluated, and not to think about computation as being performed step by step and updating memory. So the functional program in general does not update this and that memory location, but rather creates new objects (this is why virtually all functional languages rely on the garbage collector.) This is how functions work in math -- they take values and return values, but don't change anything in particular.

There are pure functional languages, which prohibit any change to existing objects (in other words both fields of objects and variables are always immutable.) They are also very often lazy, which means parameters passed to a function are not evaluated until the function actually needs them (which may seem like an optimization to you, unless you've ever tried Haskell :-).

Nemerle neither pure nor lazy, but it does support the key FP concept of immutability.

The second key concept is the ability to use functions as data, that is, to pass functions as parameters and store them in data structures. This feature is known as first class functions. Nemerle provides this.

There are other related concepts -- for example type inference and parametric polymorphism (generics) -- which are very often used in functional languages. Variant data structures and pattern matching are also notable concepts. We will find out more about these later, but for now we will focus on core features of FP.

To recap, the basic ideas behind FP in Nemerle are that:

OK, I can see this is getting a little boring, so let's proceed with some examples.

Local functions

In addition to defining methods in classes or modules, one can define some small helper functions inside other functions. For example:

class LocalFunctionExample {
  public static Main () : void
  {
    def say_hello () : void {
      System.Console.WriteLine ("Hello!");
    }
 
    say_hello ();
    say_hello ();
  }
}

It shouldn't come as a big surprise to see "Hello!" written twice by this program. The say_hello local function is defined inside Main, and can only be referenced within Main (it's local to it, right?).

You can also skip Main altogether, as we've seen earlier:

def say_hello () : void {
  System.Console.WriteLine ("Hello!");
}
 
say_hello ();
say_hello ();

Moreover, because say_hello is local, type inference is supported, so we can skip the void type annotation:

def say_hello () {
  System.Console.WriteLine ("Hello!");
}
 
say_hello ();
say_hello ();

Passing functions as parameters

Once you have defined a local function, you can pass it to another function as a parameter. For example we can have a special function that runs the passed function f twice:

def run_twice (f) {
  f ();
  f ();
}

We didn't specify a type for f, because type inference takes care of this. Nemerle sees that f is used as a function in the body of run_twice.

Now, we can define a function say_hello, and pass it to run_twice:

def say_hello () {
  System.Console.WriteLine ("Hello!");
}
 
run_twice (say_hello);

Lexical scoping

Local functions can see all the identifiers defined in their parent function. For example:

def hello = "Hello!";
def say_hello () {
  System.Console.WriteLine (hello);
}
 
say_hello ();
say_hello ();

When a local function references an identifier, it sees the nearest one defined before it (in program text). This property is called lexical scoping. So here, when say_hello refers to hello, it gets the closest previous definition: "Hello!". Note that other hello identifiers may be defined in other contexts and used without conflict. The rules of lexical scoping keep them separate.

Lists

Lists are a very important data structure in FP, and are used even more often than arrays in imperative programming.

Lists are constructed out of list cells. Each cell has a head (the element held in the cell) and tail -- a pointer to another cell. There is a special cell called nil which serves as an empty list. List cells cannot be changed once they are constructed. On the one hand, it means you cannot update the cell number or order of a list once it is constructed, but on the other hand mischief can't be done to a list that you pass to others.

While lists cannot be changed once constructed, they can share the tail. For example:

      42
      |
      V
 10-->34-->27-->nil

Here the lists [10, 34, 27] and [42, 34, 27] share the list [34, 27]. Because lists are immutable this sharing cannot do any harm. All Nemerle lists share the empty list.

The list constructor is ::, while nil is referred to as []. Therefore 1 :: [] means we want a list cell with 1 in the head and the empty list in the tail, which is another way of saying we want a single element list with 1. After this object is constructed we can make a two element list:

def l1 = 1 :: [];
def l2 = 42 :: l1;

Here, we chain list l1 to l2 to make a longer list. Of course, you can define this list in one expression: 42 :: 1 :: []. If you don't like all those colons, there is shorter notation: [42, 1]. This short form is convenient to use in pattern matching.

Pattern matching

The basic idea behind a matching like this:

match (some_value) {
  | pattern1 => expr1
  | pattern2 => expr2
  ...
}

is to inspect some_value, check if it looks like pattern1, and if so, evaluate expr1 return it as the result. Patterns are checked until a match is found, and the corresponding expression is returned. If all patterns fail to match (which is uncommon, because patterns should be exhaustive, that is, cover all possible values), an exception is thrown.

There are several kinds of patterns, we shall briefly introduce some of them here.

The wildcard pattern

Wildcards are written as _ and match any value, including the null pointer (don't confuse this with the nil list cell). It's like the default: case in switch in C-like languages.

Literal patterns

Each literal (like 42, 3.1415, "foobar", and null) is a valid pattern. This allows one to implement switch functionality with match:

match (some_number) {
  | 1 => "one"
  | 2 => "two"
  | 3 => "three"
  | _ => "a lot"
}

List patterns

When matching list patterns, both the :: constructor and the [1, 2] syntax are available:

match (some_list) {
  | [42, 42] => "two forty-two"
  | [42, _] => "forty-two on first position of two-element list"
  | [_, 42] => "forty-two on second position of two-element list"
  | 42 :: _ => "forty-two on first position"
  | _ :: 42 :: _ => "forty-two on second position"
  | [] => "an empty list!"
  | _ => "another list"
}

Note that the ordering of pattern matching clauses matters, because it is possible they can overlap (here the first case overlaps the second and third, while the second overlaps the fourth, and so on). In the case of multiple possible matches, the first match wins.

Variable patterns

The variable pattern matches any value, just like the wildcard pattern, but additionally binds the value matched to a specified identifier. The identifiers used have to start with a lowercase letter. They are always immutable (there is no mutable modifier allowed here).

For example a function to display all elements of a list would be:

using System.Console;
 
def display (l) {
  match (l) {
    | head :: tail =>
      Write ($ "$head, ");
      display (tail)
    | [] =>
      WriteLine ("end")
  }
}
 
display ([1, 2, 3])
 
// Output: 1, 2, 3, end

The display function takes a list and matches it against the list head :: tail. If the match list is not empty, the element of the current cell is bound to head and the rest of the list is bound to tail. The function prints the value of head and recursively calls itself to print out the rest of the list.

When the function hits the end of the list, nil [] is matched, end is written, and the function terminates.

Simple list manipulation

We now know the basic tools for list manipulation -- list construction and matching. Therefore we can proceed with some example functions.

Filter

filter is a function that is given a list and a functional value as a parameter. It will return the list of elements from the original list for which the functional value returned true.

For example:

def is_even (x) { x % 2 == 0 }
System.Console.WriteLine (filter ([1, 2, 3, 4], is_even));

should print out [2, 4].

In Nemerle, all basic data structures have the ToString method overridden to return contents of the data structure in a human readable form, therefore WriteLine does a good job displaying the list.

So we're back to the filter code:

def filter (l, f) {
  match (l) {
    | x :: xs =>
      def xs' = filter (xs, f);
      if (f (x)) x :: xs'
      else xs'
    | [] => []
  }
}

First, the function does a match on the list argument. For the non- empty list case the head is bound to x and the tail to xs. It is a common convention to call the head x or y and the rest of the list xs or ys.

Once the head and tail are bound the function calls itself to filter the rest of the list. The result is put in value xs'. In Nemerle the apostrophe (') character is a valid part of identifier names. A transform of x is often called x' (say x-prime); this convention comes from mathematics.

Finally, depending on the return value of f (x), the function adds element x to the result, or skips it. If you're looking for a return statement here, you won't find it! As you will recall from Week 0, Nemerle functions simply return the the last value computed, which in this example is either x :: xs' or xs'.

Of course when the list is empty, filtering it yields an empty list.

Side note: anonymous functions

While testing the filter example, you might want to use a shorthand that allows you to filter on any arbitrary expression on the fly. To that end, it is possible to rewrite the even filter as:

System.Console.WriteLine (filter ([1, 2, 3, 4], fun (x) { x % 2 == 0 }));

If we use a local function just once, it is possible to make it anonymous. The expression:

fun (some_parms) { some_expression }

is equivalent to:

{
  def tmp (some_parms) { some_expression }
  tmp
}

that is, defining a local function and returning it right away. Anonymous functions allow us to skip the superflous naming of truly one time, 'throw-away' functions.

First N

The first_n function returns first N elements of the list passed.

def first_n (l, n) {
  if (n == 0)
    []
  else
    match (l) {
      | x :: xs => x :: first_n (xs, n - 1)
      | [] => throw System.ArgumentException ("List too short")
    }
}
 
System.Console.WriteLine (first_n (["foo", "bar", "baz"], 2))
 
// Output: ["foo", "bar"]

Note how both arguments to first_n change as the function executes: it recursively calls itself for the rest of the list with the counter decremented. It is also interesting to see how the function return value includes a call to itself.

Another new thing here is the throw expression. It raises an exception passed as an argument. System.ArgumentException is a class from BCL. Handling exceptions will be discussed later (next week probably).

Types and Generics

We have used local functions which rely on type inference throughout this lesson. However type inference is not available for global functions; therefore it is good to know the exact way things get typed in Nemerle.

Parametric types

The list type is parametric (generic), so a list of integers has type list [int] and a list of strings has type list [string]. Note how square brackets ([]) are used for specifying the argument to the type. This is unlike C++, Java and C#, where the triangle brackets (<>) are used.

There are other generic types provided by the BCL and standard Nemerle library. For example Nemerle.Collections.Hashtable [K, V] is a mutable collection mapping values of type K to values of type V (it inherits from System.Collections.Generic.Dictionary [K, V]).

Another important example of parametric types are arrays. Array is a list-like type, so we felt there is no reason to make a special case of it. The syntax for array types in Nemerle is array [type], where type is the type of the elements in the array.

Function types

The type of a function taking a string and returning an int, like the int.Parse method, is: string -> int. The -> symbol is called the arrow, so you can read the previous type as string arrow int. When a function returns nothing, we use the void type. For example the System.Threading.Thread.Sleep method, which sleeps for n milliseconds has the type int -> void. Similarly, when a function takes no arguments, we use the void type before the arrow. So the type of Main is void -> void.

We will talk about types of multi-argument functions next week.

We now know how to write basic function types, so we can now write a fully typed version of filter:

def filter (l : list [int], f : int -> bool) : list [int] {
  match (l) {
    | x :: xs =>
      def xs' = filter (xs, f);
      if (f (x)) x :: xs'
      else xs'
    | [] => []
  }
}

The body is exactly the same as previously, but we have provided explicit types for the arguments.

OK, the question here is: why type the function as int? Because we intend to use it with a list of integers. Fine, you might respond, but why limit our function to just integers? As you correctly observe, the structure of the filter function would work equally well for, say, a list of strings or a list of FooBars, if only there was a way to type it less restrictively. This leads us naturally to the idea of generic functions.

Generic functions

What we would really like for our filter function is a generic type, one that would allow the function to work on all kinds of data. Unfortunately, because of subtyping and related theoretically difficult problems, the type inferencer in Nemerle cannot infer generic types for you. No worries, you can still have generics, you just have to specify them yourself, using a generic type declaration for your function:

Here, we define a generic type parameter, gentype0, to be used in the function. Think of gentype0 as a placeholder for any type. Substitute gentype0 any place you would use a single specific type in a function you want to make generic.

With this knowledge, let's make a generic filter:

using System.Console;
 
def filter[T] (l : list [T], f : T -> bool) : list [T] {
  match (l) {
    | x :: xs =>
      def xs' = filter (xs, f);
      if (f (x)) x :: xs'
      else xs'
    | [] => []
  }
}
 
WriteLine (filter ([1, 2, 3, 4], fun (x) { x % 2 == 0 }));
WriteLine (filter (["foo", "bar", "foox"], fun (x) { x.StartsWith ("foo") }));
 
/* Output:
[2, 4]
["foo", "foox"]
*/

The new, generic definition of filter can be read: for any type T, the first parameter has the type list [T], the second T -> bool and the result type is list [T]. This allows a list of integers to be filtered as easily as a list of strings. However, the function type passed has to match the type of the list (try passing the even lambda expression and a list of strings). This is because we use the same generic type parameter T for list l and function f.

Using the same syntax, you can define generic parameter types for these contexts:

Because generic parameter declarations use the compact list syntax, you can also declare multiple generics:

The use of generics in FP can bring a level of abstractness and compactness to code that can be hard to match using the object-oriented paradigm by itself.

Side note: names of generic parameters

There is a convention, originating from ML, to call type parameters 'a, 'b, 'c... which reads alpha, beta, gamma... As Nemerle supports apostrophes in identifiers, you're free to follow it.

In C++ there is a convention to call generic type parameters T, while in C# one mostly uses descriptive names (something like ElementType). Nemerle is still new, so perhaps a generic naming convention for it will emerge, and come into common use. The standard library uses the ML convention.

Exercises

In the first few exercises you should write a handful of short list manipulation functions. This is widely recognized as a good introduction to FP and I couldn't come out with anything better.

The functions should throw an exception for invalid data (empty lists, element not found and so on). System.ArgumentException will be OK.

Functions should come with one or two example invocations.

Exercise 1

Exercise 2: iterators

Exercise 3: flatten

Write the function: flatten[T] (l : list [list [T]]) : list [T] concatenating given lists together. For example:

System.Console.WriteLine (flatten ([[1, 2], [3, 4], [5]]))

should output [1, 2, 3, 4, 5].

Exercise 4: SW again

Take the Star Wars example from the previous week, possibly enriched with your Imperial Trooper and modify it to hold a lists of persons instead of 3 persons.

Nemerle for OOP Programmers Week 3

This week we will learn more about functional programming in Nemerle. We will find out about tuples, variants, and advanced pattern matching.

Contents

Tuples

Tuples in Nemerle are similar to tuples in math, or in SQL for that matter. They are immutable, heterogeneous (a tuple can hold values of several types at once) finite sequences of objects. The most common examples of tuples is a pair or triple:

def our_pair = (1, 2);
def our_triple = (1, 12, -10);

Up to 9 elements can be held in a tuple (the syntax looks clumsy for 5 or 6 elements already, and you cannot use loops for iterating over tuples).

A tuple can hold values of several types:

def another_pair = ("foo", 3.0);

Tuples are useful for returning multiple values from functions or holding several things in a single cell in containers. One can always use record-like classes, however this requires additional hassle of a separate class definition.

Tuple pattern

The most common way of decomposing a tuple is by using a tuple pattern. It looks mostly like the tuple definition:

def divmod (x, y) {
  def div = x / y;
  def mod = x % y;
  (div, mod)
}
 
match (divmod (142, 13)) {
  | (d, m) =>
    System.Console.WriteLine ($ "div=$d, mod=$m");
}
 
// Output: div=10, mod=12

Here we define a local function -- introduced in Week 2 -- called divmod, which returns the tuple (div, mod). The result is then matched against a tuple pattern. As you can see, the pattern binds the first element of the tuple to d and the second to m.

Single-case matching has a special, shorter syntax in Nemerle, as shown here:

def (d, m) = divmod (142, 13);
System.Console.WriteLine ($ "div=$d, mod=$m");

The divmod function could be written more compactly as well:

def divmod (x, y) {
  (x / y, x % y)
}

The arguments to the tuple constructor are just expressions, and the names div and mod used for elements didn't really matter.

Tuple indexer

For cases where pattern matching doesn't seem right (for example, you want just the first element of a tuple returned by a nested call), there is a special tuple indexer. Using it, the example above could be rewritten as:

def divmod (x, y) {
  (x / y, x % y)
}
 
def r = divmod (142, 13);
System.Console.WriteLine ($ "div=$(r[0]), mod=$(r[1])");

Now, the tuple result gets assigned to r, and the $ macro expands it's elements by index. As with arrays, tuple indexing is 0-based (the first element of a tuple has index 0).

Indexing rules

In case tuples have started to look too much like arrays, here are some rules about tuple indexers to help clarify things.

Unlike an array indexer, it is not possible to supply anything beside a constant to the tuple indexer. For example, the following code will not work:

def pair = (2, "foo");
def idx = int.Parse (System.Console.ReadLine ());
def element = pair [idx]; // dynamic access is invalid

In this contrived example, it should be clear why this is prohibited: depending on user input the type of element would be string or int. The only solution would be to use a compile-time type of object for element, however we felt that this wouldn't be useful enough to mandate implementation.

Because tuples are immutable, you also cannot assign values to tuple elements, as in:

def p = (3, 4);
p [0] = 17; // incorrect assignment to tuple

Tuple type

The constructor of tuple types is the star (*). For example the tuple ("foo", 42, 3.1415) has type string * int * double. This notion comes from math, where the times (×) symbol is used, as in dimensions L × W × H.

A fully-typed standalone version of divmod would look like:

static divmod (x : int, y : int) : int * int {
   (x / y , x % y)
}

Side note: relationship between tuple and function types

The tuple type constructor is also used in function types, which is not an accident. We consider a function taking more than one argument exactly the same as a function taking a single tuple argument of the corresponding type. For example the following code is perfectly valid:

def print_two_ints (x, y) {
  System.Console.WriteLine ($ "first=$x, second=$y");
}
def divmod (x, y) {
  (x / y, x % y)
}
print_two_ints (divmod (143, 77))

Side note: tuple assignment

Nemerle supports multiple assignment. That is, one can use a pseudo-tuple of possible assignment targets. For example:

mutable x = 17, y = 32;
(x, y) = (y, x);          // swap x and y
(x, y) = (y + 12, 2 * x); // or even change their values

All the source expressions are evaluated first and the assignment is done afterwards (this is why the swap works fine).

Variants

This section borrows some text from the Grokking Nemerle tutorial.

Variants (called data types or sum types in SML and OCaml) are forms of expressing data of several different kinds.

The simplest example of variants are enum types known from C (or C#).

// C
enum Color {
  Red, 
  Yellow, 
  Green 
}

You can define C#-like enum types in Nemerle too, but we will talk about it next week. Now let us look at the simplest variant type:

// Nemerle
variant Color {
   | Red
   | Yellow
   | Green
}

However, the variant options might be more useful because they can carry some extra data with them:

variant Color {
  | Red
  | Yellow
  | Green
  | Different {
      red : float;
      green : float;
      blue : float;
    }
}

So if the color is neither red, yellow nor green, it can be represented with RGB. You can create variant objects just like any other object, by using its constructor. All variant options have an implicit [Record] macro invocation on them. We talked about this macro 2 weeks ago, it adds a constructor assigning to each field of a class:

def blue = Color.Different (0, 0, 1);
def red = Color.Red ();

In the OO world, modeling variants with sub classing can sometimes be seen:

// C#
class Color {
  class Red : Color { }
  class Green : Color { }
  class Yellow : Color { }
  class Different : Color {
    public float red;
    public float green;
    public float blue;
  
    public Different (float red, float green, float blue) {
      this.red = red;
      this.green = green;
      this.blue = blue;
    }
  }
}

Of course, you need to write a constructor, mark fields as public, and so on. When you're done -- using this kind of stuff can get quite involved -- you will need to use lots of runtime type checks.

On the other hand, Nemerle provides an easy and convenient method of dealing with variants -- pattern matching.

Matching over variants

We already used pattern matching on lists, so you can imagine doing a switch over variant options. For example, a function returning string representation of a color could look like this:

variant Color {
  | Red
  | Yellow
  | Green
  | Different {
      red : float;
      green : float;
      blue : float;
    }
}
 
def string_of_color (color)
{
  match (color) {
    | Color.Red => "red"
    | Color.Yellow => "yellow"
    | Color.Green => "green"
    | Color.Different (red = r, green = g, blue = b) => 
      $ "rgb($r, $g, $b)"
  }
}
 
System.Console.WriteLine (string_of_color (Color.Red ()));
System.Console.WriteLine (string_of_color (Color.Different (1, 1, 0)));
 
/* Output:
red
rgb(1, 1, 0)
*/

The first three patterns state we're not interested in any possible fields in the case of Red, Yellow and Green. The last pattern, for the Different case, binds values of the red, green and blue to r, g and b respectively. You can omit matched fields at will, as well as change the ordering.

It is also possible to use a shortcut here:

| Color.Different (r, g, b) =>

This is only available when you specify all the fields in the given object, and in the right order.

Using variants as trees

The example above, while simple, is not the best usage of variants. Variants are best at handling tree-like data structures. A common example of tree data structures are XML documents. However, we will deal with plain binary trees first.

The following example defines the type of trees of integers (representing sets).

variant Tree {
  | Node {
      left  : Tree;
      elem  : int;
      right : Tree;
    }
  | Null
 
  public override ToString () : string
  {
    match (this) {
      | Tree.Node (l, e, r) => $ "($l $e $r)"
      | Tree.Null => "."
    }
  }
}

So a tree node is either an inside node with an element and two subtrees or a null tree without any elements inside. We have also overridden the ToString method, so the $ macro and WriteLine work properly on trees (the default implementation would yield "Tree.Node" or "Tree.Null" only).

We can now define a method for inserting elements to the tree. It should be defined inside the Tree variant:

  public Insert (e : int) : Tree
  {
    match (this) {
      | Tree.Node (l, cur, r) =>
        if (e < cur)
          Tree.Node (l.Insert (e), cur, r)
        else if (e > cur)
          Tree.Node (l, cur, r.Insert (e))
        else
          this
      | Tree.Null =>
        Tree.Node (Tree.Null (), e, Tree.Null ())
    }
  }

The function checks if the element to insert is smaller than the element in the current node, and if so, inserts it in the left subtree. If it's bigger, it inserts the element in the right subtree. Otherwise, it has to be equal, so it doesn't insert it again: it just returns the unchanged tree.

If the function hits an empty subtree, it creates a new leaf in that place.

Having a Contains check wouldn't be a bad idea, either:

  public Contains (e : int) : bool
  {
    match (this) {
      | Tree.Node (l, cur, _) when e < cur => 
        l.Contains (e)
 
      | Tree.Node (_, cur, r) when e > cur => 
        r.Contains (e)
 
      | Tree.Node => true
      | Tree.Null => false
    }
  }

This function works much like insert -- it checks if the element could be found in the left or in the right subtree, and looks for it there. If not, either it has found the matching node, or null, and returns the appropriate result.

There is however one new feature used here -- when guards in matching. Any matching clause can have an additional condition attached. The function first checks if the pattern matches the value, and if so, the condition is evaluated. If that yields true, the given branch is taken, otherwise it proceeds with further match clauses.

This example could also be written with regular ifs:

  public Contains (e : int) : bool
  {
    match (this) {
      | Tree.Node (l, cur, r) =>
        if (e < cur)
          l.Contains (e)
	else if (e > cur)
          r.Contains (e)
	else
	  true
 
      | Tree.Null => false
    }
  }

Finally we can test our example:

// we start with an empty tree
def t = Tree.Null ();
// add a few elements
def t = t.Insert (13).Insert (34).Insert (23);
// and display it
System.Console.WriteLine (t); 
System.Console.WriteLine (t.Contains (13)); 
System.Console.WriteLine (t.Contains (42)); 
 
/* Output:
(. 13 ((. 23 .) 34 .))
True
False
*/

XML trees

As you can see binary trees are not very interesting, so we will go on to XML. Whether XML is interesting remains a doubtful question, but at least it is somewhat more practical.

variant Node {
  | Text { 
      value : string; 
    }
  | Element {
      name : string; 
      children : list [Node];
    }
}

This variant defines a simplistic data structure to hold XML trees. An XML node is either a text node with some specified text inside, or an element node with a name and zero or more children. A sequence of children is represented as a Nemerle list.

For example the following tree:

<tree>
  <branch>
    <leaf/>
  </branch>
  <branch>
    Foo
  </branch>
</tree>

would be represented by:

Node.Element ("tree", 
  [Node.Element ("branch", [Node.Element ("leaf", [])]),
   Node.Element ("branch", [Node.Text ("Foo")])])

Of course XML by itself is just a data format. Using data in the above form wouldn't be too easy. So we want some different internal representation of data, and use XML only to save it or send it over the network.

variant RefrigeratorContent
{
  | Beer { name : string; volume : float; }
  | Chips { weight : int; }
  | Ketchup
 
  public static FromXml (node : Node) : RefrigeratorContent
  {
    match (node) {
      | Node.Element ("ketchup", []) =>
        RefrigeratorContent.Ketchup ()
        
      | Node.Element ("beer", 
          [Node.Element ("name", [Node.Text (name)]),
           Node.Element ("volume", [Node.Text (volume)])]) =>
        RefrigeratorContent.Beer (name, float.Parse (volume))
        
      | Node.Element ("chips",
          [Node.Element ("weight", [Node.Text (weight)])]) =>
        RefrigeratorContent.Chips (int.Parse (weight))
        
      | _ =>
        throw System.ArgumentException ()
    }
  }
}

The most interesting thing here are the nested patterns. Until now we have always used very shallow patterns -- we just checked the top-level object and possibly its bound fields. However it is possible to look very deeply in the structure of object. Instead of binding values of fields to variables, this function checks if they match given patterns. This is all that is happening here -- nested patterns.

Probably the most annoying thing about the example above is the amount of times you have to say RefrigeratorContent and Node. Fortunately both can be skipped, however for quite different reasons.

RefrigeratorContent used for object construction can be skipped, because we are in the body of the RefrigeratorContent variant definition. If we wanted to skip it elsewhere, we would have to say using RefrigeratorContent; first.

On the other hand, Node in matching can be skipped, because we provided the type of the node parameter in the match statement, therefore the compiler will just look up the appropriate variant options in this type.

So the shorter example would be:

variant RefrigeratorContent
{
  | Beer { name : string; volume : float; }
  | Chips { weight : int; }
  | Ketchup
 
  public static FromXml (node : Node) : RefrigeratorContent
  {
    match (node) {
      | Element ("ketchup", []) => Ketchup ()
        
      | Element ("beer", [Element ("name", [Text (name)]),
                          Element ("volume", [Text (volume)])]) =>
        Beer (name, float.Parse (volume))
        
      | Element ("chips", [Element ("weight", [Text (weight)])]) =>
        Chips (int.Parse (weight))
        
      | _ =>
        throw System.ArgumentException ()
    }
  }
}

Once we have something to put into a refrigerator, we can now build the refrigerator.

[Record]
class Refrigerator
{
  public minimal_temperature : float;
  public content : list [RefrigeratorContent];
  
  public static FromXml (node : Node) : Refrigerator
  {
    match (node) {
      | Element ("refrigerator", 
          Element ("minimal-temperature", [Text (min_temp)]) :: content) =>
        def parse_content (lst) {
	  match (lst) {
	    | x :: xs =>
	      RefrigeratorContent.FromXml (x) :: parse_content (xs)
	    | [] => []
	  }
        }
        Refrigerator (float.Parse (min_temp), parse_content (content)); // (*)
      | _ =>
        throw System.ArgumentException ("node")
    }
  }
}

The reader will easily note that: a) the XML deserialization code looks a bit like a junk b) it can be generated automatically, and c) without matching it would be even worse. Later we will learn how to write macros to generate this kind of code automatically.

We can however still make it somewhat better, without resorting to macros. First off, if you wrote the map example from the previous week, then the parse_content function should familiar to you. In fact it can be replaced with map altogether, like this:

Refrigerator (float.Parse (min_temp), map (content, RefrigeratorContent.FromXml));

We have passed a static global function as a functional value. It works exactly the same as with local functions.

Because the map function is generally useful, it is included in the standard library as a member function of the list data structure. So we remove parse_content, and replace line marked by (*) with:

Refrigerator (float.Parse (min_temp), content.Map (RefrigeratorContent.FromXml));

Side note: skipping match

Because it is quite a common pattern to do matching directly on arguments, Nemerle provides a way to skip match in such cases. If your function starts with | it is implicitly surrounded with match (single_parm) { ... }, or in the case there is more than one parameter, with match ((parm1, parm2, ..., parmN)) { ... } (so you can use tuple patterns to get to a particular parameter). So we can further shorten the above example by two more lines:

  public static FromXml (node : Node) : Refrigerator
  {
    | Element ("refrigerator", 
        Element ("minimal-temperature", [Text (min_temp)]) :: content) =>
      Refrigerator (float.Parse (min_temp), content.Map (RefrigeratorContent.FromXml));
    | _ =>
      throw System.ArgumentException ("node")
  }

Exercises

Exercise 1: Tree.Iter

Add an Iter (f : int -> void) : void method to the Tree variant, implementing inorder traversal (that is you first traverse the left subtree, then call f on current element, and traverse the right subtree).

Exercise 2: XML parsing

Write a function that reads XML from specified files and puts it into the Node variant defined above. Then write a function to dump your data in a lispy format, something like:

(tree
(branch
(leaf)
)
(branch
($text "Foo")
)
)

Then you can implement indentation of output.

(tree
  (branch
    (leaf))
  (branch
    ($text "Foo")))


Then copy the refridgerator variants from the above, fix any errors you find in them and try to parse the following file:

<?xml version='1.0' encoding='utf-8' ?>
<refrigerator>
  <minimal-temperature>-3.0</minimal-temperature>
  <beer>
    <name>Hyneken</name>
    <volume>0.6</volume>
  </beer>
  <beer>
    <name>Bydweisser</name>
    <volume>0.5</volume>
  </beer>
  <beer>
    <name>Plsner</name>
    <volume>0.5</volume>
  </beer>
  <chips>
    <weight>500</weight>
  </chips>
  <ketchup/>
</refrigerator>

Warning: you do not need to write the XML parser. You should not do it, actually. Use the System.Xml namespace from the .NET Framework. In order to link with the System.Xml library you need to compile with the -r:System.Xml option. For example:

 ncc -r System.Xml myprogram.n

should do.

Nemerle for OOP Programmers Week 4

This week we will learn about the remaining patterns, get some insights about performance of functional programs, and finally learn how to define polymorphic (generic) types.

This is the last lesson about FP during this course.

Contents

Advanced pattern matching

This section discusses a few matching constructs that remain to be covered.

Type check pattern

The type check pattern checks if the given value has the given type. It is similar to is in C# and Java, even the syntax looks alike:

using System.Console;
 
def check (o : object) {
  match (o) {
    | i is int => WriteLine ($ "an int, $(i * 2) / 2!");
    | s is string => WriteLine ($ "a string: $s!");
    | _ => WriteLine ("something else");
  }
}
 
check (21);
check ("foo");
check (3.0);
 
/* Output:
an int, 42 / 2!
a string: foo!
something else
*/

In this matching pattern, you can see how i is used as an int, and s as a string. The is pattern checks if the match value (o, an object) has one of the given types, and if so, binds the value to specified identifier. The identifier is given a static type for each branch in the match. The compiler implicitly casts the value to the type of the matching identifier. So the common C# pattern:

if (x is Foo) {
  Foo y = (Foo)x;
  ... use y ...
} else { ... }

or:

Foo y = x as Foo;
if (y != null) {
  ... use y ...
} else { ... }

becomes:

match (x) {
  | y is Foo => ... use y ...
  | _ => ...
}

This is useful when you need to cast an object as one of its inherited base classes, or as an interface implemented by the object.

Note that the C# construction as doesn't have anything to do with the Nemerle as pattern, presented below.

Record pattern

We have already seen the record pattern as a subpattern of the constructor pattern used for variants. The idea is to take an object and match on its fields:

class MyClass {
  public foo : int;
  public bar : string;
}
 
def c = MyClass ();
match (c) {
  | (foo = 0, bar = null) =>
    System.Console.WriteLine ("fine");
  | _ =>
    System.Console.WriteLine ("oops");
}

Field matching is quite flexible: you can skip some fields and reorder them.

You can also explicitly state the type of the class you're matching on. This is useful when the compiler complains about unknown types (there are some limitations of type inference in the current implementation, that may prevent it from working here). For instance, you could write a variation of the match above as:

def check_class (c) {
  match (c) {
    | MyClass where (foo = 0) => true
    | _ => false
  }
}
 
def result = check_class (MyClass ())

The record pattern can also match on individual readable properties:

class MyClass {
  [Accessor]
  public foo : int;
  public bar : string;
}
 
def c = MyClass ();
match (c) {
  | (Foo = 0) =>
    System.Console.WriteLine ("fine");
  | _ =>
    System.Console.WriteLine ("oops");
}

Properties do not take part in tuple-pattern -> record-pattern automatic conversion, where you can write a tuple pattern that is automatically translated to record pattern (we talked about this in Week 3).

as pattern

The as pattern binds a value matched by a subpattern to an identifier. This is useful when you want to check if a value has a given structure, but you're interested in still handling it as a whole, for example:

match (construct_list ()) {
  | [1, _] as l =>
    handle_one_something (l)
  | _ => {}
}

Another example, perhaps more interesting, is:

variant Foo {
  | A { x : int; mutable y : string; }
  | B
}
 
match (some_foo ()) {
  | A (3, _) as a =>
    a.y = "three";
  | _ => {}
}

Here the compiler knows the static type of a is Foo.A, so you can assign values to its y field.

Type hint pattern

This pattern matches to explicitly declared types. It is used for hinting the compiler about the static type of given matched value or subvalue. If the compiler gets the wrong idea about a type you have left for it to infer, it will scream. If it doesn't know the type, it will appreciate the hint. Example:

def foo (l) {
  | (s : string) :: _ => s [0]
  | [] => '?'
}
 
assert (foo (["foo", "bar"]) == 'f');

Here we hint the compiler that the type of s (the first element of the list) will be string so we can use the indexer to get its first character (at the time of this writing, there is a bug in our type inference engine that prevents deducing the type of s early enough for this snippet to work unhinted).

The assert macro checks if given condition is true, and if it's not, it throws an exception.

The example above can be rewritten as:

def foo (l : list [string]) {
  | s :: _ => s [0]
  | [] => '?'
}
 
assert (foo (["foo", "bar"]) == 'f');

with the same effect, although it is slightly longer (because we have to specify the list part of the type).

Alternative clauses

You can combine several matching clauses in a single branch. For example:

match (my_list) {
  | [1]
  | [1, 2] => WriteLine ("one or one, two");
  
  | [x]
  | [7, x] => WriteLine ($ "$x or seven, $x");
 
  | _ => WriteLine ("something else");
}

Here, the first two branches match against multiple lists. The second branch even matches against some value x, appearing in both its clauses. If you use a value in a matching clause, it can't change types, and it has to appear for each clause in the branch. In other words, you couldn't do this:

  | [7]   // x needs to appear here,
  | [7, x] => WriteLine ($ "$x or seven, $x");   // or not here

There is a way to work around this limitation, by using a with clause.

with clauses

When using alternative clauses it is sometimes useful to match some smaller pattern with a value x, and a bigger pattern with x and another value y. Now when we construct a single matching branch for the two patterns, we cannot use y in the smaller, as it doesn't occur there. The rub is we have to include y for all matches in the branch, because of the rule on matching values explained above.

The with clause solves this problem by providing a way of specifying a default y value in such cases. For example:

def foo (_) {
  | [x] with y = 3
  | [x, y] => x * y
  | _ => 42
}
 
assert (foo (3) == 9);
assert (foo (3, 4) == 12);
assert (foo (3, 4, 5) == 42);

The ability to match irregular length lists by providing with clause defaults comes in handy when creating more complex patterns.

Performance considerations

We now switch gears to the topic of FP performance. Tail calls, an FP method of implementing functions, are covered in detail. We will also examine looping and in-lining, which are compiler optimizations for tail calls that minimize stack use and function call overhead. Accumulators, an FP construct implemented with tail calls, are also introduced here.

Tail calls vs loops

When you have a while loop, like this:

while (true)
  // do something

it is possible to rewrite it like this:

def my_loop () {
  // do something
  my_loop ();
}
 
my_loop ();

This code calls my_loop which first executes the body of the loop, and at the end calls itself (and executes the body of the loop, and calls itself, and...). As you can see the result is the same as with the while loop.

If the while loop has a condition, it is also possible to rewrite:

while (some_condition)
  // do something

into:

def my_loop () {
  when (some_condition) {
    // do something
    my_loop ();
  }
}
 
my_loop ();

The performance of the two pieces of code is exactly the same. The Nemerle compiler can see that there is nothing left to do after call to my_loop inside my_loop, and replaces this call with an internal jump. Moreover, it notes that my_loop is called just once outside the body of the function, and therefore inlines it, replacing the call with the function body code itself. These optimizations together eliminate stack overhead in this example.

In fact, the compiler internally transforms all kinds of loops into local functions.

Calling some function f within f, in such a place that there is nothing more left to do in the body of f, is called tail recursion. It can be always replaced with a jump (internally, by the compiler).

To make functional programs run fast, tail recursion is a good idea for loops that are going to execute, say, one million times or more. Of course, there is no point rewriting things that are more readable as while loops to tail recursion, except when you want to obey some strict FP rules, like "never use mutable variables".

However there are problems where a solution with tail recursion is actually shorter and more readable (well, readablity depends on expertise with FP) than one with a loop.

Programming with accumulators

An accumulator is a parameter of a function that is used to hold the result constructed so far. When the function executes for the last time, it returns the accumulator, possibly transformed using some other function.

Reverse list

def rev (l, acc = []) {
  match (l) {
    | x :: xs => rev (xs, x :: acc)
    | [] => acc
  }
}
 
System.Console.WriteLine (rev ([1, 2, 3]));
 
// Output: [3, 2, 1]

This function adds the first element of the input list l to the accumulator acc, and calls itself with the rest of the list. It accumulates head elements until the list is exhausted, at which point it returns acc.

Because the result list is built from the end, while the input list is traversed from the beginning, the function returns a reversed list.

This is a common property of programming with list accumulators -- the result lists are reversed.

Note how rev is called inside itself in a tail position. This means the call is transformed to a jump, so it is very fast and doesn't consume stack space.

Length of list

It is also possible to use types other than list for accumulators:

def length (l, acc = 0) {
  match (l) {
    | _ :: xs => length (xs, acc + 1)
    | [] => acc
  }
}
 
System.Console.WriteLine (length ([4, 44, 22]));
 
// Output: 3

Here we use an integer as a accumulator, to compute the length of the list.

Fold

There is one more important concept related to programming with accumulators: the fold function. Fold is a built-in method for most Nemerle collections, so it is a handy shortcut to writing certain accumulator functions yourself. We will talk about fold for lists here. The signature is:

List.FoldLeft['a,'b] (l : list ['a], ini : 'b, f : 'a * 'b) : 'b

FoldLeft takes a list of type 'a, an initial value of type 'b, a function relating 'a to 'b, and returns a value of type 'b.

The expression List.FoldLeft ([x1, x2, ..., xN], ini, f) is equivalent to: f (xN, f (... f (x2, f (x1, ini))...). So, folding is a way to efficiently nest function calls on a list of arguments. When folding, the function is applied to the first element of the list, then successively to the result of that call plus the next element, until the whole list is evaluated.

This definition in code may be easier to understand:

def fold (l, acc, f) {
  match (l) {
    | x :: xs =>
      fold (xs, f (x, acc), f)
    | [] => acc
  }
}

This has a very similar form to both functions shown above. The rev function replaces f (x, acc) with x :: acc, and length replaces it with 1 + acc. Now, because function f is a parameter to fold, you can implement both rev and length with the built-in FoldLeft method:

def rev (l) { List.FoldLeft (l, [], fun (x, xs) { x :: xs }) }
def length (l) { List.FoldLeft (l, 0, fun (_, acc) { 1 + acc }) }

One can also use the FoldLeft member of the list variant:

def rev (l) { l.FoldLeft ([], fun (x, xs) { x :: xs }) }
def length (l) { l.FoldLeft (0, fun (_, acc) { 1 + acc }) }

which makes the implementations even shorter.

Gaining proficiency with the fold function may be a little hard at first. It's good idea to take a function using direct tail recursion and try to rewrite it using fold to learn the cases where it is useful (this is only to learn things, of course there is no point rewriting already working stuff just to use fold).

Side note: redefining symbols

As we wrap up our exploration of functional performance, let's take a moment to look at the topic of symbol redefinition.

From Week 2, recall this list filter example:

def filter (l, f) {
  match (l) {
    | x :: xs =>
      def xs' = filter (xs, f);
      if (f (x)) x :: xs'
      else xs'
    | [] => []
  }
}

For clarity, we used xs' as the name for the filtered result of list xs. But, we could have simply reused xs. It would look like:

def xs = filter (xs, f);

While this reminds us of assignment in imperative programming (from this point on in the program the new xs is in scope, so it looks as if the original was changed), it is quite different from the theoretical point of view.

Note that you can rebind an identifier to a different type. For example, the fragment:

def x = [1, 2];
def x = "foo";

is perfectly valid. You could even do that with mutable variables, though a warning would be generated. However, the more common use is to rebind an identifier to the results of some computations on its previous meaning. For example:

def result = 2 * 2;
def result = $ "the result is: $result";

Both bindings of result have similar semantic meaning, though they are quite different from the compiler point of view. The second result is an entirely new entity, and not related to the first in any physical way.

The advantage of this approach is that you won't accidently use xs instead of xs' in your code. The disadvantage is that it is less clear what is going on as xs changes its meaning.

The Nemerle compiler will warn about redefinitions of mutable symbols (and redefinitions with mutable symbols), because it is not a very mainstream programming concept, and the relative likelihood of misunderstanding the difference between redefinition and assignment.

Parametric polymorphism aka generics

Generic types (and immutable collections)

Generic types allow the definition of strongly typed containers that can hold values of arbitrary types. An example of such a container is the array -- the type of elements stored must match the type of the array. Similarly, lists contain only elements that match the list's type. The difference is that for arrays and lists, the type must be specified at compile time.

Restricting a list's type improves static code safety. Instead of using a hold-anything list type, object, you can restrict your list to a single type, say list [int]. This prevents adding a string to this list, and gives the guarantee that when you read an element from it, it will be an integer. This saves you a runtime cast, which is costly and can crash the application if you don't provide extra error handling to handle miscasts.

Of course there are cases when you would want ints and strings in a single list; you can use list [object] in such cases. Then you can add anything to the list, but you need to cast the values taken from it during runtime, with the drawbacks previously mentioned.

Generics offer a middle road: they allow you to build classes that enforce type consistency internally, but don't explicitly define the type, allowing the user to use instances of the class with any arbitrary type at runtime. With generics, you can build type flexibility into your classes, while avoiding the problems associated with runtime casting.

Let's start with a naive Set implementation:

class Set ['a]
{
  mutable storage : list ['a] = [];
  public Add (e : 'a) : void
  {
    when (! Contains (e))
      storage ::= e;
  }
  public Contains (e : 'a) : bool
  {
    storage.Contains (e)
  }
}
 
def s1 = Set ();
s1.Add (3);
s1.Add (42);
assert (s1.Contains (3));
// s1.Add ("foo"); // error here!
def s2 = Set ();
s2.Add ("foo");
assert (s2.Contains ("foo"));

Instances of the same Set class are used at runtime to handle sets of integers and strings. Having type affinity, however, the same set can't store elements of different types.

Here, the generic parameter 'a is used to define the type of the storage list and method parameters. We talked about generic parameters in Week 2, refer to it for more details.

We can add elements to the set and check if a given element is already there. This is not a very functional approach as the set is updated in-place. So another implementation (and interface!) would be:

class Set ['a]
{
  storage : list ['a] = [];
  
  public this ()
  {
    this ([])
  }
 
  this (s : list ['a])
  {
    storage = s;
  }
  
  public Add (e : 'a) : Set ['a]
  {
    if (Contains (e)) this
    else Set (e :: storage)
  }
 
  public Contains (e : 'a) : bool
  {
    storage.Contains (e)
  }
}
 
def s1 = Set ();
def s1 = s1.Add (3).Add (42);
assert (s1.Contains (3));
// s1.Add ("foo"); // error here!
def s2 = Set ().Add ("foo");
assert (s2.Contains ("foo"));

A few details of this class bear some explanation:

The line def s1 = s1.Add (3).Add (42); may also raise questions. Because . binds to the left, elements are added from left to right, with the last element added being the first in the list. It is equivalent to:

def s1 = s1.Add (3);
def s1 = s1.Add (42);

The key idea here is that we create a new set instance for each element added. When we don't explicitly save older instances, they are garbage collected, as is the case here. However, when we do save an instance of a set by maintaining a reference to it, it will not be affected by adding new elements later. There are cases when this is useful.

For example, a map holding name -> value bindings in the compiler needs to be restored when the scope ends. If we use an immutable map for this, all we have to do is save the state before we enter the scope, and restore the saved reference when we leave.

Additionally, it is easy to wrap immutable structures in a mutable object. This is exactly what we did in the first example -- we wrapped an immutable list in a mutable set. The Nemerle standard library contains a handful of immutable collections.

Generic methods

We've already seen how to define generic methods. You just need to specify the type parameters:

public Length['a] (l : list ['a], acc = 0) : int
{
  match (l) {
    | _ :: xs => Length (xs, acc + 1)
    | [] => acc
  }
}

You are required to specify the generic parameters you're going to use. So, the following declaration is not valid:

class C {
  public foo (x : list [T]) : void { }
}

but this one is valid:

class C {
  public foo[T] (x : list [T]) : void { }
}

Generic specifier

There are three kinds of generic specifiers. They are all used to specify generic parameters explicitly in the code. All three cases are optional, the compiler will always try to infer the generic parameters.

For constructors

This is a very obvious one. If you have a Set['a], like the one before, and you want to ensure a set of ints is being created use:

def s = Set.[int] ();
s.Add ("foo"); // error!
def s2 = Set.[int] ();
s2.Add (3); // OK

Note that unlike in C# you don't have to specify this type, compiler will happily do it for you in most cases.

For methods

This is also quite simple, when a method has generic parameters (like the Length method above) you can supply them:

def l1 = Length.[string] (["foo","bar"]); // OK
def l2 = Length.[string] ([1, 2, 3]); // error

For types in static member references

And this one is more complicated. The type parameters are in scope also in static members. This means the following code is valid:

class Foo['a] {
  static public bar () : void
  {
    System.Console.WriteLine (typeof ('a));
  }
}

When you try to call Foo.bar () the compiler cannot know what 'a you had in mind (in this case it will assume System.Object). However you cannot do Foo.bar.[int] () because the bar method is not generic. Therefore you can do: Foo[int].bar (). You also can do: Foo.[int].bar (), so the dot is optional (but only in this case, we know this sucks, we're working on it).

Constraints on type variables

For certain projects, like model-view-controller frameworks, it is sometimes necessary for types to be substituted with typed variables that also conform to some specific interface. We will address this issue in more detail, as it is probably new for most readers.

For example, elements stored in an ordered binary tree must provide a comparison method so they can be properly sorted. To do this, we define an appropriate interface, IComparable, and then build a variant structure, Tree, which defines a node that can store generic type 'a. Further, Tree requires that any 'a to be stored must implement the IComparable interface:

interface IComparable ['a] {
  CompareTo (elem : 'a) : int;
}
 
variant Tree['a] 
  where 'a : IComparable['a]
{
  | Node {
      left : Tree['a];
      elem : 'a;
      right : Tree['a];
    }
  | Tip
}

In our variant declaration, we have added a where parameter to further constrain the type of 'a that the variant will accept. Such lower bounds on type variables, where the type variables can occur in types that bound them, are called F-bounded polymorphism in type theory.

In fact the IComparable interface is already defined in the standard library, but that is beside the point.

Once we have ensured that all elements in the tree have both subtype IComparable and a generic type 'a, we can use the CompareTo method to insert an element into a tree:

module TreeOperations {
  public Insert['a] (t : Tree['a], e : 'a) : Tree['a]
    where 'a : IComparable['a]
  {
    match (t) {
      | Node (l, c, r) =>
        if (e.CompareTo (c) < 0)
          Node (Insert (l, e), c, r)
        else if (e.CompareTo (c) > 0)
          Node (r, c, Insert (r, e))
        else
          Node (r, e, l)
      | Tip =>
        Node (Tip (), e, Tip ())
    }
  }
}

Now, people familiar with C# or Java might ask, why not simply use something like:

interface IComparable {
  CompareTo (elem : IComparable) : int;
}
 
variant Tree
{
  | Node {
      left : Tree;
      elem : IComparable;
      right : Tree;
    }
  | Tip
}

But this is only half good. The most common use for a tree is to store elements of some specific type, for example strings. We don't want integers and strings to be stored in the same tree, for the very simple reason that we cannot compare integers with strings in a reasonable way. Well, even if we could, we plainly cannot predict what other types beside integers and strings might implement IComparable, and thus have their value passed to a string's CompareTo.

The above design makes it impossible to ensure statically whether we're using the tree with consistent types. In this scheme, when inserting nodes to the tree, we upcast all elements to IComparable, regardless of type. But we will get a runtime exception when one type's CompareTo method is passed an argument from another, incompatible type. The second drawback is that when we extract elements out of the tree, we need to downcast them to a specific type. This is another possibility for runtime errors in trees of mixed type.

So, to guarantee runtime safety in such a design, we would have to implement CompareTo for all permutations of types we wish to support, plus build a downcast that would somehow coerce all types into the target type. Not only is this a lot of work, it is not likely to work very well. F-bounded polymorphism neatly avoids this mess by ensuring not only conformance to an interface, but to an underlying type as well. This two-layered checking makes type safety possible in systems that are bounded by interface and type.

To better understand this issue, look at the following example:

interface IFrobincatable {
  Frobnicate (x : int) : void;
}
 
class C1 : IFrobincatable 
{
  public this () {}
  public Frobnicate (_ : int) : void {}
}
 
class C2 : IFrobincatable 
{
  public this () {}
  public Frobnicate (_ : int) : void {}
}
 
module M {
  f1['a] (o : 'a) : 'a
    where 'a : IFrobincatable
  {
    o.Frobnicate (3);
    o
  }
 
  f2 (o : IFrobincatable) : IFrobincatable
  {
    o.Frobnicate (3);
    C1 ()
  }
 
  Main () : void
  {
    def x1 = f1 (C1 ()); // x1 : C1
    def x2 = f1 (C2 ()); // x2 : C2
    def x3 = f2 (C1 ()); // x3 : IFrobincatable
    def x4 = f2 (C2 ()); // x4 : IFrobincatable
  }
}

In the Main function, calls to f1 always return a value of the type passed to it. However, f2 doesn't give this guarantee: even though the line def x4 = f2 (C2 ()) passed in a value of type C2, what it got back was C1. Even though f2 compiles, users of f2 may get runtime errors or misbehavior because their type assumptions aren't correctly supported.

Excercises

One

Write a version of the functional Set class above using (unbalanced) trees, like the ones we've seen last week. Use the System.IComparable [T] interface.

Two

Write the following tail recursive functions:

Three

Rewrite the functions above (except for Map) to use FoldLeft.

Nemerle for OOP Programmers Week 5

This is the last lesson in this course. It discusses topics mostly related to object-oriented programming in Nemerle. This should be fairly easy for C# programmers, while people with Java/C++ background will have a slightly harder time.

This lesson again borrows topics from Grokking Nemerle.


Contents

Namespaces and the using directive

In Nemerle, types can live in namespaces. If type X is defined in namespace Y, then its full name is Y.X. It is a good idea to put all your code in some unique namespace if you're going to distribute it, to avoid name clashes. All classes in the Base Class Library live in some namespace (mostly in System and its sub-namespaces).

The syntax for defining namespaces is:

namespace FooBarCorp.TurboLib {
  class SomeClass { }
}

The full name of SomeClass is FooBarCorp.TurboLib.SomeClass. If this is going to be a library, it should be placed in FooBarCorp.TurboLib.dll. One of the great advantages of Nemerle or C# over Java is that you don't need to create any folder that matches the name of the namespace. Indeed, you can create more than one namespace in a file:

namespace FuBar {
  namespace Example {
    // This is FuBar.Example
  }
}
namespace OtherFoo {
  // This is just OtherFoo
}

Nemerle requires all objects to be referenced using their fully qualified name. To use SomeClass, you would have to reference FooBarCorp.TurboLib. But, to avoid typing FooBarCorp.TurboLib again and again to access its classes, there is a special using directive that imports the given namespace. So the following is valid:

using FooBarCorp.TurboLib;
def c = SomeClass ();

(Code inside the FooBarCorp.TurboLib namespace doesn't require using FooBarCorp.TurboLib; -- that's done automatically.)

You can also use:

using FooBarCorp;
def c = TurboLib.SomeClass ();

One of the features Nemerle has is the ability to import a namespace and use the namespaces inside it without qualifying them with the parent name. In C#, you would have to add one or more using directives to for the classes you wish to use to get this effect.

Another form of the using directive allows you to give a shorter name to a namespace:

using TL = FooBarCorp.TurboLib;
def c = TL.SomeClass ();

Finally, using is not restricted only to namespaces -- with using you can also reference a class or module so you can use its static members:

using System.Console;
WriteLine ("Hey, we just used Console.WriteLine without writing Console!");

Enums

Enums are like restricted variants. Enums contain a list of options, and cannot have methods, fields, or constructors. Example:

enum Color {
  | Red
  | Green
  | Blue
}
 
def string_of_color (e) {
  match (e) {
    | Color.Red => "red"
    | Color.Green => "green"
    | Color.Blue => "blue"
  }
}
 
System.Console.WriteLine (string_of_color (Color.Red));

Enums are internally represented as integers, which makes them more efficient than variants. It also means you can convert integers to enums, but beware: if an integer doesn't represent a valid enum option, strange things can happen, so it's better to check a enum variable for allowed ranges before using it. For example;

System.Console.WriteLine (string_of_color (13 :> Color));

This will raise MatchFailure exception.

However, the way enums work can help on some situations. One of the most used is to combine two or more values into one, using the | (OR) operator. For this, you must use values that do not overlap. Powers of 2 (1, 2, 4, ...) are ideal, because their individual bits do not coincide (in binary, 1 => 0b001, 2 => 0b010, 4 => 0b100), so the | operator can use them unambigously. The Mono and .NET guidelines also state that this type of enum should be marked with the Flags attribute:

using System;
 
[FlagsAttribute]
enum Seasons {
  | Winter = 0b0001
  | Spring = 0b0010
  | Summer = 0b0100
  | Fall = 0b1000
  | Autumn = Fall
  | AllSeasons = Winter + Spring + Summer + Fall
}
 
def seasons = Seasons.Winter | Seasons.Fall;
Console.WriteLine ($"Seasons: $(seasons.ToString())");
 
/* Output:
Seasons: Winter, Autumn
*/

As you can see in this example, you can also use more than one alias for the same value, and even create combinations of values in the enum, like we have done with AllSeasons. Also, since we were able to use the [FlagsAttribute], we saved ourselves the trouble of having to write our own string_of_season function.

Function parameters

Ref/out parameters

Normally, when you pass some value as a function parameter, a copy of the value is created. This is call-by-value, and is preferred because it makes certain that the function cannot modify the calling code's values. But sometimes you do want the function to change the original value passed to it, not the copy. This is possible by passing the parameter using the ref keyword. Such a parameter is passed by reference, which means that when the parameter value is changed by the function, the calling code sees it.

class C {
  public static Inc (x : ref int) : void
  {
    x++;
    System.Console.WriteLine (x);
  }
 
  public static Main () : void
  {
    mutable q = 13;
    Inc (ref q);
    Inc (ref q);
  }
}
 
/* Output:
14
15
*/

First, you need to add a ref annotation to the parameter type when you declare the function. When you call the function, you need to pass a mutable value (a local variable or a mutable field), and prefix it with ref keyword. Be aware that this just works with value types. In Mono/.NET, there is a strong divison between reference and value types. When you pass a reference type (object) as a parameter, you are only passing a reference to an object, so every change to the object in the function will also change the calling object. Value types (numbers, strings) are always copied when passed, so you need the ref keyword to change this default behavior.

There is also a similar out modifier. It means that the function is not going to read what's in the parameter, it is just going to return some value through it. In C# this is handy when you want to return more than one value from a function. In Nemerle, tuples are used for the same purpose, so the use of out parameters is much less common. Anyway, the usage is:

class C {
  public static read_line (line : out string) : bool
  {
    line = System.Console.ReadLine ();
    if (line == "quit") false
    else true
  }
 
  public static Main () : void
  {
    mutable s = null;
    while (read_line (out s)) {
      System.Console.WriteLine (s);
    }
  }
}

Named parameters

This feature comes from Python. You can name the parameters in a function call, which allows putting them in a different order than in the definition. This can make your code more readable, especially when using default parameters.

frobnicate (foo : int, 
            do_qux : bool, 
            do_baz : bool, 
            do_bar : bool) : int
{
  // this is completely meaningless
  if (do_qux && !do_baz) foo * 2
  else if (do_bar) foo * 7
  else if (do_baz) foo * 13
  else 42
}
 
Main () : void
{
  // Parameters' names can be omitted.
  def res1 = frobnicate (7, true, false, true);
  
  // This is the intended usage -- the first 
  // (main) parameter comes without a name
  // and the following flags with names
  def res2 = frobnicate (7, do_qux = true,
                            do_baz = false, 
                            do_bar = true);
                            
  // You can however name every parameter:
  def res3 = frobnicate (foo = 7, do_qux = true, 
                         do_baz = false, do_bar = true);
                         
  // And permute them:
  def res3 = frobnicate (do_qux = true, do_bar = true, 
                         do_baz = false, foo = 7);
                         
  // You can also omit names for any number of leading 
  // parameters and permute the trailing ones.
  def res2 = frobnicate (7, true,
                            do_bar = true,
                            do_baz = false);
  ()
}

The rules behind named parameters are simple:

Named parameters can be used only when names of parameters are known (which basically means they do not work in conjunction with functional values).

This feature is particularly useful with default parameters:

frobnicate (foo : int, do_qux : bool = false, 
                       do_baz : bool = false, 
                       do_bar : bool = false) : int { ... }
frobnicate (3, do_baz = true);
frobnicate (3, do_baz = true, do_qux = true);

Operator overloading

You can overload existing operators or add new ones simply by specifying them as a static method in some type. The name of method must be quoted with the @ operator. For example, the following class definition:

 class Operand {
   public val : int;
   public this (v : int) { val = v }
 
   public static @<-< (x : Operand, y : Operand) : Operand {
     Operand (x.val + y.val);
   }
 }

contains the <-< binary operator processing Operand objects. It can be used like:

 def x = Operand (2);
 def y = Operand (3);
 def z = x <-< y;
 assert (z.val == 5);

Unary operators can be created by giving only one parameter to the method.

It is also possible to overload built-in operators like + and -. Some of the classes exposed in the Base Class Library (like Point) use this operator overloading. If you use them, you should also provide another way to get its functionality, to mantain compatibility with other languages. For example:

using Nemerle.Utility;
 
[Record]
class Vector {
  [Accessor] x : double;
  [Accessor] y : double;
  [Accessor] z : double;
 
  public Add (v : Vector) : Vector {
    Vector (this.X + v.X, this.Y + v.Y, this.Z + v.Z)
  }
 
  public static @+ (v1 : Vector, v2 : Vector) : Vector {
    v1.Add (v2)
  }
}
 
def v1 = Vector (1, 2, 3);
def v2 = v1 + v1;
def v3 = v2.Add (v1);

Interfaces

The .NET Framework supports only single inheritance. This means that any given class can derive from just one base class. However, a class may sometimes be required to be two or more different things, depending on context. .NET supports this (just like Java) through interfaces. An interface is a contract specifying a set of methods that given class must implement. A class can implement any number of interfaces, in addition to deriving from some base class.

Implementing an interface implies subtyping it. That is, if you have a class A implementing interface I, and method taking I as parameter, then you can pass A as this parameter.

Interfaces most commonly state some ability of type. For example, the ability to convert itself to some other type, or to compare with some other types.

using Nemerle.IO;
 
interface IPrintable {
  Print () : void;
}
 
class RefrigeratorNG : Refrigerator, IPrintable
{
  public Print () : void
  {
    printf ("I'm the refrigerator!\n")
  }
 
  public this ()
  {
  }
}
 
module RP {
  PrintTenTimes (p : IPrintable) : void
  {
    for (mutable i = 0; i < 10; ++i)
      p.Print ()
  }
  
  Main () : void
  {
    def refr = RefrigeratorNG ();
    PrintTenTimes (refr)
  }
}

In the class definition, after the colon, the base class must come first, followed by the supported interfaces in any order.

Design by contract macros

There are several macros defined in the Nemerle standard library helping improve code safety. The most basic one is the assert macro known from C. It is available by default (in Nemerle.Core), no using is necessary. It checks if given expression is true and if it's not, it throws an exception stating the expression, file and line where the problem occurred. You can also supply a second parameter to this macro with an error message you want to include in the exception.

There are other macros, providing requires and ensures contracts, they are described in the wiki page about Design by contract macros.


Properties

Properties are syntactic sugar for get/set design pattern commonly found in Java. Accessing a property looks like accessing a field, but under the hood it is translated to a method call.

class Button {
  text : string;
  public Text : string {
    get { text }
    set { text = value; Redraw () }
  }
}
 
def b = Button ();
b.Text = b.Text + "..."

Setter or getter blocks are not a must (of course, you need at least one :-). As we said in Week 1, for the traditional properties that just take care of saving or loading the value, the Accessor macro is preferred over plain writing.

Indexers

Indexers are another form of property, but instead of exposing a simple field, array access syntax is used. They must take parameters, a feature that normal properties can't have. The main use of indexers is exposing collections to the outside, which is why the array syntax was chosen. All .NET indexers have names, though one indexer can be marked as the default indexer (well, in fact one indexer name is marked as default, but there can be several indexers with that name, subject to the regular overloading rules).

For example, the System.Hashtable default indexer is called Item and System.String one is called Chars. The C# language allows definition of a default indexer only. Nemerle allows you to define other indexers (like VB.NET). In the current release, the default indexer is always Item.

class Table {
  store : array [array [string]];
  public Item [row : int, column : int] : string
  {
    get {
      System.Console.WriteLine ($"get ($row, $column)");
      store[row][column]
    }
    set {
      System.Console.WriteLine ($"set ($row, $column)");
      store[row][column] = value
    }
  }
  public this ()
  {
    store = array (10);
    for (mutable i = 0; i < store.Length; ++i)
      store [i] = array (10);
  }
}
 
def t = Table ();
t[2, 3] = "foo";
t[2, 2] = t[2, 3] + "bar";
 
/* Output:
set (2, 3)
get (2, 3)
set (2, 2)
*/

Delegates

Delegates are half-baked functional values. There is little place for using delegates in Nemerle itself, because it supports true functional values. However other .NET languages all speak delegates, so they are good way to expose Nemerle functional values to them, and vice versa.

Delegates are in essence named functional types. They are defined with delegate keyword:

delegate Foo (_ : int, _ : string) : void;
delegate Callback () : void;

Later, the delegate name can be used to construct delegate instances. Any functional value of corresponding type can be used to create a delegate instance (local functions and lambda expressions in particular):

def my_fun (i : int, s : string) {
  // do something with i and s
}
 
def my_Foo_delegate = Foo (my_fun);
def another_Foo_delegate = Foo (fun (i, s) { ... use i, s ... });

Delegate instances can be invoked using standard function call syntax, as well as the Invoke special method. Please consult the Base Class Library documentation (on MSDN) for details.

The += operator has special meaning on delegate instances -- it calls the System.Delegate.Combine method that makes one function that calls two others in sequence. This operator is probably more useful with events.

module X {
  some_fun () : void
  {
    mutable f = Foo (fun (_, _) {});
    f (3, "foo"); // same as f.Invoke (3, "foo");
    f += fun (_, s) { System.Console.WriteLine (s) };
    f (42, "bar");
  }
}

Please note that creating a delegate is indeed creating a new class, so it can have the same level of visibility that you assign to any other class. You should also write delegates as stand-alone classes to prevent nesting.

You also have to think about accessibility when working with delegates. There is a requirement that if an event has a certain level of accessibility, then its delegate must have equal or greater accessibility.

This is a general rule in Nemerle (taken from C#), that when you define an entity E of type T, then access rights on E cannot be more permissive than access right on T. Otherwise it would be possible that a user would have access to E, but he wouldn't be able to use it, because the T would be inaccessible.

Events

Events are specialized properties for delegates. They are used in a GUI system for connecting signals (setting function to call when a button is pressed, etc.). They can be also used to call user-defined functions on when certain things occur, like a class being loaded by the runtime system.

Events are defined like fields, but they are marked with the event keyword, and always have a delegate type. Inside the class, events are seen as fields of these delegate type, but outside the class only the += and -= operators are available. They are used to connect and disconnect delegates.

class Button {
  public event OnClick : Callback;
}
 
def b = Button ();
// bind OnClick to some anonymous function ...
b.OnClick += Callback (fun () { ... })

Disconnecting is more tricky, as you cannot just do -= Callback (fun () { ... }). If you want to ever disconnect a delegate from an event, you will need to store it:

def b = Button ();
def my_cb = Callback (fun () { ... });
b.OnClick += my_cb;
// ...
b.OnClick -= my_cb;

Implicit conversion from functional values to delegate instances is provided, so you can skip delegate creation:

def b = Button ();
b.OnClick += fun () { ... }

Keep two things in mind when using events: