Code in this tutorial runs under Rust 0.8 as of October 31, 2013.
Rust is a compiled, strongly-typed language with first-class functions and closures. It has shared-nothing light-weight thread-based concurrency with message-passing for communication. It has memory safety but with this complicated three-kinds-of-pointers arrangement. It has a neat modern type system with structs (records) and enums (variants) and Traits (interfaces/type-classes).
So, it’s got all the modern features and focuses on system-level stuff and pervasive parallelism. That “pervasive parallelism” is what I’m going to focus on here.
As my first original Rust program after reading the language tutorial, this is going to be a bit rough. I am quite sure that I don’t understand the pointer types well enough to use them properly. Closures confuse me a bit, especially there being three types that …capture differently?
Anyway, at least the compiler tells me when I get it wrong.
##The Plan I’m going to implement a solution to this challenge problem. This means a program that:
- Spawns 4 threads.
- Thread 1: multiplies 10 * 2, sends result to thread 4
- Thread 2: multiplies 20 * 2, sends result to thread 4
- Thread 3: adds 30 + 40, sends result to thread 4
- Thread 4: receives the three results, sends string to main of format:
"%d + %d + %d = %d" % x y z (x+y+z)
- main receives string from thread 4 and prints it to stdout
##My Code I like getting the big picture to start with, so here’s the code in full:
use std::comm::SharedChan;
fn main() {
let (out,input): (Port<int>, Chan<int>) = stream();
let input = SharedChan::new(input);
let (strout,strin): (Port<~str>, Chan<~str>) = stream();
do spawn {
let x = out.recv();
let y = out.recv();
let z = out.recv();
let result = format!("{:d} + {:d} + {:d} = {:d}", x, y, z, x+y+z);
strin.send(result);
}
let my_in = input.clone();
do spawn {
my_in.send(2 * 10);
}
let my_in = input.clone();
do spawn {
my_in.send(2 * 20);
}
let my_in = input.clone();
do spawn {
my_in.send(30 + 40);
}
let str = strout.recv();
println(str);
}
##Imports
use std::comm::SharedChan;
This use
statement pull modules/functions/types into the local context.
That means that rather than saying std::comm::SharedChan
, I just say SharedChan
.
As a note, use
statements can also appear
at the beginning of function bodies (not just at the top of the file).
You can see what other functions are in the std::comm
module by reading
this official documentation.
##Main Function
fn main() {
The fn
keyword is for creating a function.
Functions are not closures.
This means that they do not grab variables from the enclosing scope
(you can only use the explicit arguments).
The main
function takes no arguments
and does not define a return type
(which means it returns a value of type ()
,
whose only value is ()
,
which is called nil or the empty tuple or unit).
As in most languages, main
is the function that
gets run automatically when you run an executable.
A Rust file without a main
function
cannot be compiled to an executable,
but could be compiled as a library.
##Setting Up Pipes
Our “threads” will actually be “tasks”, which is Rust-speak for a light-weight thread with no shared memory. The way to communicate between tasks is by sending “messages”, meaning values, across a pipe.
A pipe is a one-way stream of values, of a particular type. This means that the pipe has two ends, one that you can put things into and another that you can receive things on.
let (out,input): (Port<int>, Chan<int>) = stream();
You create the two ends of the pipe by calling stream
.
It returns a tuple, which is unpacked here into the variable input
and out
.
The out
side is a Port<int>
.
A Port
has a recv
method, which you can use
to try to get a value out of the pipe (it will wait for one if the pipe is empty).
The <int>
part is Rust’s type-generic notation.
It means that calling recv
on this port will return an int
The in
side is a Chan<int>
.
A Chan
has a send
method, which you can use
to send a value of the proper type across the pipe.
The <int>
part means that you can only send int
s.
Only one task can hold each end of the pipe. This means that a pipe can connect one unique sending task to one unique receiving task. However, we want to have three senders (the first three threads) and one receiver (the fourth thread).
let input = SharedChan(input);
This is a pretty common concurrency pattern,
so there’s an easy built-in way to let multiple tasks send on a pipe.
Running SharedChan(input)
return a SharedChan<int>
that will send its output to out
, the Port
that matches input
.
After this, you won’t be allowed to use input
anyway, so I just reused the name for the SharedChan
.
I’ll show you a little further on how you divide a SharedChan
into the multiple write ends you need.
let (strout,strin): (Port<~str>, Chan<~str>) = stream();
Here, we’re just setting a pipe for the fourth thread to speak to the main thread.
It sends strings (str
) instead of int
s.
The ~
means that we’re sending “owned” pointers to strings,
rather than copies of strings, between the tasks.
Remember that the tasks do not share memory. This means that a pointer to the task-local heap will be meaningless when sent across a pipe. Copying an entire string to send it across could also be very wasteful.
An owned pointer is a pointer to a common “exchange heap”. Only one task at a time can “own” an owned pointer. That means that once we sent a string across the pipe, we won’t be allowed to use it in the sending task anymore – because we gave it away.
Sending the pointer across is quick, and no data gets copied, since we’re just sending the pointer.
The weird pointer types in Rust seem generally really confusing to me. I just guessed at punctuation marks until the compiler was happy, but I think I get why it works now that I’m explaining it.
##Thread 4
Our goal for Thread 4 is to receive three values
and then create and send a string back to main
.
The way to create and run a new task is the spawn
function.
You give it the function/closure to run, and then it does so.
do spawn {
let x = out.recv();
let y = out.recv();
let z = out.recv();
let result = format!("{:d} + {:d} + {:d} = {:d}", x, y, z, x+y+z);
strin.send(result);
}
Here, we’re doing a lot of stuff implicitly.
The stuff in the curly braces is a closure.
That’s like a function, but it gets to grab variables from the enclosing scope.
In this case, we’re grabbing our Chan<~str>
, strin
and out Port<int>
, out
.
The compiler will not let you use strin
or out
subsequently in main
.
It understands that that those values now belong to our Thread/Task 4.
Also, while I call this “Thread 4”, in the code it’s an anonymous closure and (I guess) it’d be called an anonymous task too. There is no way for us to talk about this task in the code after this.
let x = out.recv();
let y = out.recv();
let z = out.recv();
The first thing out new task does is wait.
It’s going to wait to receive three int
s from the pipe.
Once we have the numbers, we need to build a string:
let result = format!("{:d} + {:d} + {:d} = {:d}", x, y, z, x+y+z);
The format!
function is a special syntax extension called a macro.
That’s why it has a !
in the name.
It acts like a function that returns a string.
The idea is that it’s equivalent to sprintf
in C or similar languages.
However, rather than using %d
, you use {:d}
.
There is an additional formatting thing, :?
, which can print anything.
The formating things, like :d
, are type-checked at compile time.
You can see fancier usages of this function in the documentation.
strin.send(result);
Once we send result to main
, we can’t use it anymore.
We gave it away.
However, we don’t care about that here, since we’re all done anyway.
##Threads 1,2,3
These three threads are basically the same, so I’m just going to look at Thread 1 in detail.
Before we split of a new task,
we need to make a Chan<int>
for it to send on.
let my_in = input.clone();
The clone
method of SharedChan
is how we get a new Chan
to give to our new task.
We’ll still be able to use input
in our main
function
after we give my_in
away to the new task.
do spawn {
my_in.send(2 * 10);
}
Here we spawn
using another anonymous zero-argument closure.
This one captures the my_in
that we just made.
All it does is multiply the two literals and send the value across the pipe.
As you can see by the fact that I reuse the name my_in
further down,
the compiler is not banning a name once you give it away to another task.
It just cares about the value.
##Main Thread: Printing
let str = strout.recv();
println(str);
Back in main
, after we create all those child tasks,
we just wait for the final answer.
We won’t be waiting long since there’s so little work to do.
Once we have the string, we print it.
##Conclusion
As you can see if you run rustc
on the file and then run the executable it produces,
the output is the line:
20 + 70 + 40 = 130
Or the line:
20 + 40 + 70 = 130
Those are the two orderings I got from running it 8 times. The changing ordering tells us that Rust really is parallelizing our puny tasks. I’m pretty excited about that.
This was a successful first foray for me into writing in Rust, and writing parallel things in Rust. I hope that it helps you wade in too. :)