Why Rust for Research?

whoami

  • Alex Coleman
  • Research Software Engineer in IT Services
  • Background in cell biology, then data science, now software engineering
  • Software Sustainability Institute Fellow 2023 looking at software security
  • Languages I’m into: Python🐍, Rust🦀, R®

What this talk is

  • An introduction to the Rust language and its features
  • A perspective on why it could be a good language for research
  • Some thoughts on why it might not be the right language for your research

What this talk is not

  • A deep dive into the language
  • A comprehensive comparison of Rust and other languages
  • A call for you to write everything in Rust

What is Rust?

  • Compiled programming language that came out of Mozilla
  • Designed with an emphasis on security, performance and usability
  • Initially popular for systems programming but very flexible

Features

  • Memory safety
  • Speed
  • Strongly and statically typed
  • Zero-cost abstractions
  • A helpful compiler
  • Modern toolchain

Features: Ownership and borrowing

  • System by which Rust manages memory and avoids common issues
  • Rules of ownership:
    1. Each value must have an owner
    2. There can only be one owner at a time
    3. When the owner goes out of scope, the value is dropped
// create some data on the heap
let s1 = String::from("hello");
// this invalidates s1 because ownership of the data has moved to s2
let s2 = s1;

// this errors because s1 is no longer accessible
println!("{}, world!", s1);
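
For contrast, a minimal sketch of a version that does compile. The helper names `consume` and `clone_and_join` are hypothetical and only illustrate the rules above: cloning keeps the original usable, while passing by value moves ownership.

```rust
// Hypothetical helper: takes ownership of its argument and returns its length.
fn consume(s: String) -> usize {
    s.len()
    // `s` goes out of scope here, so the heap data is dropped (rule 3)
}

// Hypothetical helper: clones the string so both copies stay usable.
fn clone_and_join(s1: String) -> String {
    let s2 = s1.clone(); // explicit deep copy; s1 remains the owner of its data
    format!("{}, {}!", s1, s2)
}

fn main() {
    let s1 = String::from("hello");
    // Pass a clone so s1 is still valid afterwards.
    println!("{}", clone_and_join(s1.clone()));
    // Ownership of s1 moves into `consume`; s1 cannot be used after this line.
    let n = consume(s1);
    println!("{}", n);
}
```

Cloning costs an extra allocation, so it is a trade-off rather than a default fix.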

Ownership and borrowing (cont.)

  • When passing data to a function we pass ownership to that function
  • Sometimes we don’t want data to go out of scope after being used in a function, so we borrow it with a reference
  • Compiler checks these references via the borrow checker
fn main() {
    let s1 = String::from("hello");
    let len = calculate_length(&s1);
    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}
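
Borrowing can also be mutable. A minimal sketch, assuming a hypothetical helper `append_world` that changes the string in place without taking ownership:

```rust
// Hypothetical helper: borrows the string mutably and appends to it in place.
fn append_world(s: &mut String) {
    s.push_str(", world");
}

fn main() {
    let mut s1 = String::from("hello");
    // The mutable borrow lasts only for the duration of the call.
    append_world(&mut s1);
    // s1 is still owned by main, so it can be used afterwards.
    let len = s1.len();
    println!("'{}' has length {}.", s1, len);
}
```

The borrow checker allows either one mutable reference or any number of shared references to a value at a time, never both.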

Fearless Concurrency

  • This ownership model extends to writing concurrent code
  • The compiler works to guarantee against data races in concurrent code
  • Using threads still requires us to be careful but the compiler protects against common issues
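
A minimal sketch of what this can look like using only the standard library. The function `parallel_count` is a hypothetical example: sharing the counter across threads without `Arc` and `Mutex` would be rejected at compile time.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: each thread adds 1 to a shared counter `per_thread` times.
fn parallel_count(n_threads: usize, per_thread: usize) -> usize {
    // Arc gives shared ownership across threads; Mutex guards mutation.
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..n_threads {
        let counter = Arc::clone(&counter);
        // `move` transfers ownership of this thread's Arc handle into the closure.
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    // Wait for every thread to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```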

No NULL type

  • Lots of programming languages have a NULL or None type
  • This means we often have to remember to test if something is None
AttributeError: 'NoneType' object has no attribute 'foo'
  • Rust instead uses the Option type and forces you to handle the None case
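
A minimal sketch of handling a possibly missing value. The function `describe` and the sample names are hypothetical; `HashMap::get` is the standard library call that returns an `Option`.

```rust
use std::collections::HashMap;

// Hypothetical helper: describes a sample's reading, covering the missing case explicitly.
fn describe(readings: &HashMap<&str, f64>, sample: &str) -> String {
    // `get` returns Option<&f64>; the match must cover both Some and None.
    match readings.get(sample) {
        Some(value) => format!("{} = {}", sample, value),
        None => format!("{} has no reading", sample),
    }
}

fn main() {
    let mut readings = HashMap::new();
    readings.insert("a", 1.5);
    println!("{}", describe(&readings, "a"));
    println!("{}", describe(&readings, "b"));
}
```

Leaving out the `None` arm is a compile error, so the missing case cannot be forgotten.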

No NULL type (cont.)

  • Take this example in R
val <- switch("foo", "foo" = 1, "bar" = 2, "foobar" = 3)
val
[1] 1
  • But if I give it an option that is not enumerated
val <- switch("magic", "foo" = 1, "bar" = 2, "foobar" = 3)
val
[1] NULL

No NULL type (cont.)

  • Rust won’t compile this without the final catch-all arm
let input: &str = "foo";
let val: i32 = match input {
  "foo" => 1,
  "bar" => 2,
  "foobar" => 3,
  _ => panic!("Something bad happened here!")
};
println!("{}", val);
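
An alternative to panicking in the catch-all arm is to return an `Option`. This is a sketch, not the only way to do it, and the function name `lookup` is hypothetical.

```rust
// Hypothetical function mirroring the R `switch` example:
// unknown inputs yield None instead of a panic.
fn lookup(input: &str) -> Option<i32> {
    match input {
        "foo" => Some(1),
        "bar" => Some(2),
        "foobar" => Some(3),
        _ => None,
    }
}

fn main() {
    // The caller decides what a missing value means; here it falls back to -1.
    let val = lookup("magic").unwrap_or(-1);
    println!("{}", val);
}
```

Unlike the NULL in the R example, the missing case is visible in the return type.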

What is great about Rust?

fn main() {
    println!("Rust is great!")
}

The toolchain

  • Rustup, toolchain installer and manager
  • Cargo, build system and package manager
  • crates.io, package repository
  • Clippy, Rust linter

A little Rust in your… C

A little Rust in your… Python

A little Rust in your… R

The community

What is not so great about Rust…

error[E0382]: borrow of moved value: `hard`
 --> src/main.rs:4:26
  |
2 |     let hard = String::from("Rust is hard");
  |         ---- move occurs because `hard` has type `String`, which does not implement the `Copy` trait
3 |     let size = count_length(hard);
  |                             ---- value moved here
4 |     println!("{}: {:?}", hard, size);
  |                          ^^^^ value borrowed here after move
  |
note: consider changing this parameter type in function `count_length` to borrow instead if owning the value isn't necessary
 --> src/main.rs:8:20
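
A sketch of the fix the compiler note points towards: borrow the string instead of moving it. The body of `count_length` is an assumption, since only the call site is visible in the error output.

```rust
// Assumed body: the error output only shows the call site, not this function.
// Taking a reference means the caller keeps ownership.
fn count_length(s: &str) -> usize {
    s.len()
}

fn main() {
    let hard = String::from("Rust is hard");
    // Borrow `hard` rather than moving it into the function.
    let size = count_length(&hard);
    // `hard` is still valid here because it was only borrowed.
    println!("{}: {:?}", hard, size);
}
```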

The compiler

  • People often describe “fighting with the borrow checker”
  • The strictness of the compiler can be infuriating
  • Compilation is slow compared to C and C++

Bounds checking

  • To avoid buffer overflow Rust implements bounds checking when indexing arrays
let array = [0, 1, 2, 3];
let a = array[1]; // works
let b = array[6]; // not a valid index
  • This adds an overhead at runtime
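
A minimal sketch of two common ways to work with this. The helper names `safe_get` and `total` are hypothetical; `get`, `copied`, `iter` and `sum` are standard library methods.

```rust
// Hypothetical helper: checked access that returns None for an
// out-of-range index instead of panicking.
fn safe_get(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

// Hypothetical helper: iterating instead of indexing, which usually
// lets the compiler skip per-element bounds checks.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let array = [0, 1, 2, 3];
    println!("{:?}", safe_get(&array, 1));
    println!("{:?}", safe_get(&array, 6));
    println!("{}", total(&array));
}
```

Whether the overhead matters depends on the workload; it is worth measuring before optimising.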

The learning curve

Rust for research?

  • The toolchain makes it easy to distribute code
  • Great for performance critical code
  • Lots of resources to help learn
  • Better resource use means greener🌳

Already happening

Do I have to rewrite it in Rust?

No…

Why might Rust not be for your research?

  • Learning the language is hard
  • Is the tradeoff of safety vs. speed of development worth it?
  • Is writing Rust sustainable in your research field/group?
  • Lacks the interactive programming experience of Jupyter/Rmarkdown

Thanks for listening

Your next steps

Further reading/watching

Questions