Rustlings

45 min read

Rust

Even though compiler errors can be frustrating, they only mean your program isn’t safely doing what you want it to do yet; they do not mean that you’re not a good programmer! - The Rust Book

Intro Commands

Every executable Rust program must have a main function - it's always the first code that runs.

fn main() {

}

Semicolons indicate the end of a statement - most lines of Rust code will end with one.

To compile and run your program in Rust:

// This will compile the program
rustc filename.rs

// Then this will run the program on macOS or Linux (no extension)
./filename
// On Windows, the executable has an .exe extension
.\filename.exe

Cargo is the Rust package manager and build system, which will give us some easier ways to create, compile, and manage our programs. A few common commands:

// create a new project - will give us a Cargo.toml file for configuration, a src directory with a main.rs file inside, and it initializes a new git repo with a .gitignore file
cargo new hello_cargo

// To build our program - creates the executable in target/debug, and creates a Cargo.lock file that records the exact versions of our dependencies
cargo build

// To build and run the program, one after the other
cargo run

// To make sure our code compiles, but does not create the .exe file
cargo check

// When ready to send our program to others or put it in production, can add a release flag that will optimize the program so it's faster, and will put it in target/release
cargo build --release

Cargo expects your source files to be in a src directory, and uses the root folder for README and configuration files.

Variables

By default in Rust, variables are immutable - you can't change the value once they're assigned. Variables are declared with the let keyword.

let x = 5;

However, we can allow a variable to be mutable, if we want to. We do this with the mut keyword.

let mut x = 5;

The difference between variables & constants in Rust

In a similar way, constants are values bound to a name and are immutable. However, you can't use mut with constants - they're always immutable.

We use the const keyword to declare them, and they always need their type defined (as opposed to variables, where it's good to define the type but can be inferred).

Constants can be declared in any scope (including globally). They can only be set to a constant expression, not the result of a regular function call or anything else that would have to be computed at runtime. The compiler has to be able to work out exactly what the value is at compile time.

// convention says to name const in screaming snake case
const MAX_POINTS: u32 = 10_000;

Shadowing

Interestingly, there's a concept called shadowing in Rust, where we can re-declare a variable using the same name, to take the initial value and do something new to it (using it in a function or changing the type). The new declaration is what will show when that variable is used.

let x = 5;
let x = x + 1;

Though it seems similar to mutating a variable, this gives us a way to transform the initial value and then have the new value be immutable - so we can change it each time we specifically want to, but then have it be the same anytime we use it like a regular variable.

This also effectively creates a new variable by using let again, which is what lets us change the type. If we had a mutable variable and tried to change the type, the compiler would give us an error - but if we shadow it, it counts it as a new variable instead, and allows it like it would any new variable declaration.
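A minimal sketch of the difference, using made-up names (count_spaces and spaces are just for illustration): shadowing lets the same name go from a &str to a usize, while the equivalent assignment to a mut variable is rejected by the compiler.

```rust
// Shadowing: the second `let` creates a new variable, so the type can change.
fn count_spaces() -> usize {
    let spaces = "   ";        // &str
    let spaces = spaces.len(); // usize, shadows the &str above
    spaces
}

fn main() {
    let x = 5;
    let x = x + 1; // x is now 6, and still immutable
    println!("x = {}, spaces = {}", x, count_spaces());

    // With `mut`, the type is fixed by the first assignment:
    // let mut spaces = "   ";
    // spaces = spaces.len(); // error: mismatched types
}
```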

Data Types

Rust is a statically typed language, so it needs to know the type of all variables at compile time. It can infer some of these based on how we use them, but it's often best to declare it.

There are two main types of values in Rust:

Scalar Types

Represents a single value. There are 4 primary scalar types:

For numbers, you can use an _ as a visual separator, like 1_000.

  • Integers A whole number. Will be signed (can be negative or positive, so the sign matters) or unsigned (will only ever be zero or positive). You can use 8, 16, 32, 64, or 128 bit numbers, plus isize and usize, whose size depends on the machine's architecture. Integers default to i32. A program will panic (crash) in debug mode if an integer overflows the size you set (a number is out of range); in release mode the value wraps around instead. ex: i8 or u16

  • Floating-point numbers There are two types for floating point numbers: f32 and f64. The default is f64. The 32 bit has single precision, while 64 bit has double precision.

  • Booleans bool, Can be either true or false. One byte in size.

  • Characters char, Four bytes in size, and represents a single Unicode Scalar Value. Can be used for English letters, accented letters, emoji, zero width spaces, and more.

    Char literals are specified with single quotes; String literals are specified with double quotes.

Compound Types

These group multiple values into one type. There are two primitive compound types - tuples and arrays.

Tuples

A general way of grouping together a number of values with various types into one compound type. They have a fixed length: once declared, they can't grow or shrink in size.

let tup: (i32, f64, u8) = (500, 6.4, 1);

// to get individual values, we use destructuring with pattern matching
let (x, y, z) = tup;
// or can use dot notation for the index of the item we want
let one = tup.2;

Arrays

Will also have a fixed length, but each element must have the same type. Will be a single chunk of memory allocated on the stack.

// type is written as [type; # of elements]
let a: [i32; 5] = [1, 2, 3, 4, 5];
// if we want each value to be the same, can declare by giving the value then the # of elements
let b = [3; 5];
// will access items in the array with [] notation
let first = a[0];

If you try to access something outside of the array (like with an index that's too large), Rust will check at runtime whether that index is less than the array length. If not, it will panic and end the program. This prevents invalid memory access.
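If a panic isn't what we want, the standard get method returns an Option instead of panicking. A small sketch - the element_or_default helper and the -1 fallback are assumptions for illustration only:

```rust
// Looks up an index without panicking; -1 is an arbitrary fallback for this sketch.
fn element_or_default(a: &[i32; 5], index: usize) -> i32 {
    match a.get(index) {
        Some(value) => *value, // index was in range
        None => -1,            // index was past the end of the array
    }
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    println!("{}", element_or_default(&a, 0));
    println!("{}", element_or_default(&a, 10));
    // a[10] here would fail instead: a compile error for a constant index,
    // or a panic at runtime if the index is only known while running.
}
```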

Control Flow

If Statements

Similar to JS, will run a block of code if a condition is met. The blocks of code attached to each condition are sometimes called arms. A few key things to note:

  • Conditions in if statements must be a bool.
  • Because if is an expression, we can use it on the right side of a let statement.
  • Values that have the potential to be results from each arm of the if statement must be the same type.
  let number = 6;

  if number < 10 {
    println!("Smaller number");
  } else if number == 10 {
    println!("Equal");
  } else {
    println!("Larger number");
  }
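Since if is an expression, the same branching can feed a let directly. A short sketch (the describe function is a made-up name) where every arm produces a &str, so the types line up:

```rust
// Each arm evaluates to a &'static str, so the whole `if` has one type.
fn describe(number: i32) -> &'static str {
    let label = if number < 10 {
        "smaller"
    } else if number == 10 {
        "equal"
    } else {
        "larger"
    };
    label
}

fn main() {
    println!("{}", describe(6));
    // Mixing types would not compile:
    // let bad = if true { 5 } else { "six" };
}
```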

Functions

Functions are defined with fn, and can be declared either above or below the main function.

  fn another_function(x: i32) {
    // some code
  }

Functions can take parameters, and each parameter's type must be declared. If there are multiple parameters, use a , to separate them. Parameters do not all need to be the same type.

Function bodies are made up of a series of statements, optionally ending in an expression. There is a distinction between the two words.

Statements are instructions that perform some action and do not return a value. Expressions evaluate to a resulting value.

Defining a variable and assigning it a value with let is a statement. Function definitions are also statements. Since statements don't return values, you can't assign a let statement to another variable.

Numbers, math operations, calling a function, or calling a macro are all expressions - they all evaluate to something. Expressions do not include ending semicolons; adding a semicolon turns an expression into a statement, which no longer returns a value.

Functions can return values - if they do, we declare the type after an ->. Rust by default will return the final expression in the function, though if you want to return earlier you can use the return keyword.

  fn sum_num(a: i16, b: i16) -> i16 {
    a + b
  }
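A quick sketch of both return styles in one function (absolute is a hypothetical name, not something from the standard library): return exits early, and the final expression without a semicolon is the normal return value.

```rust
fn absolute(n: i32) -> i32 {
    if n < 0 {
        return -n; // early return with the `return` keyword
    }
    n // final expression, no semicolon, so this is the return value
}

fn main() {
    println!("{} {}", absolute(-5), absolute(3));
}
```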

Ownership

Ownership allows Rust to make memory safety guarantees without needing a garbage collector.

Some languages will collect "garbage", meaning they regularly look for no longer used memory and recycle that as your program runs. Others make you explicitly allocate and free memory yourself. Rust manages memory with ownership, a set of rules checked during compiling.

The Stack and the Heap

Whether a value is on the stack or the heap makes a difference in how Rust behaves. Both are parts of memory available for your code to use, but are structured in different ways. So a quick rundown of what each is and why it matters:

  • The Stack stores values in the order it gets them, and removes values from the end first (last in, first out). Like a stack of plates - you put one on top of the previous one, and when you need one you grab the one on top. Adding is called pushing onto the stack, and removing is popping off the stack. Anything stored on the stack must have a known, fixed size.

  • The Heap is used by asking for a certain amount of memory space, then receiving a pointer to a location on the heap that has enough space to meet your needs. This is used for data with an unknown size, or where the size might change over time. The process of finding a space for the data and returning a pointer is called allocating. Like being seated at a restaurant - you tell the staff how many people are in your party, and they find you a table with enough room. People coming later can ask for your table and be pointed in the right direction.

Since the pointer to the heap is a known, fixed size, you can store it on the stack - but to get at the actual data, the program has to follow the pointer.

Pushing to the stack is faster since data always goes on top - we never have to search for where to put something. Allocating to the heap takes a bit more work since we have to find a space, get the pointer, and then update the memory records to prepare for the next allocation.

Accessing heap data is also a bit slower, since we have to follow the pointer to get there (as opposed to just grabbing what's on top of the stack). The more we can work with less memory jumping, the better for our code's performance.

When we call a function, the values passed in and local values are all pushed onto the stack, then popped off once the function is done running.

Ownership helps with keeping track of what's on the heap, minimizing duplicates, and cleaning up unused data so we don't run out of space.

Ownership Rules

  • Each value in Rust has a variable that's called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value will be dropped.

Scope is the range in a program where an item is valid. For variables, they're valid from the moment they're declared until the current scope ends.

The String Type

String literals (like covered earlier) are immutable, and are typically hard coded. Works great in some cases, but sometimes (like taking in user input) we won't know the exact string or the size, and so will want to make a string from the String type. This type is allocated on the heap and is an adjustable size.

If we want to make a String type from a string literal, we can do that like this:

let mut s = String::from("hello"); // needs mut so we can add to it
// to then add to this, we can do something like this:
s.push_str(", world!");

Here, we call String::from, and this requests the memory we need from the memory allocator at runtime.

We'll also need a way to return this memory once we're done with it. Historically this has been difficult, since we can do this too early, or not at all, or more than once, all of which have their own issues.

We need to pair exactly one allocate with exactly one free.

In Rust, the memory is automatically returned once the variable goes out of scope. Rust will call a function called drop to clear this up for us.

Example: if we were to do this:

let x = 5;
let y = x;

This works like you might think - we bind x to the value 5, then we bind y to a copy of the value stored in x. Because these are integers and pushed on the stack, it works easily and copies the data directly.

String is a little different. They're made up of 3 parts: the pointer to the memory that holds our data; a length; and a capacity. This group of info is stored on the stack when we declare a new String.

The length is how much memory, in bytes, the contents currently use. The capacity is the total amount of memory. If we try to copy a String to another variable like we did with numbers above, what will copy is actually this group of information - not the data itself.
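To see length and capacity as separate numbers, here's a small sketch using the standard String::with_capacity, len, and capacity methods. The function name and the 16-byte request are arbitrary choices for illustration.

```rust
// Returns (length, capacity) for a String that was given room to grow.
fn len_and_capacity() -> (usize, usize) {
    let mut s = String::with_capacity(16); // ask for at least 16 bytes up front
    s.push_str("hello");                   // only 5 bytes are in use
    (s.len(), s.capacity())
}

fn main() {
    let (len, cap) = len_and_capacity();
    // len is 5; cap is at least 16 (the allocator may hand back more)
    println!("len = {}, capacity = {}", len, cap);
}
```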

If we left it like this, we'd get a double free error when the variables go out of scope, because the drop function would be called on both and they both point to the same spot in memory. So when we assign a String to another variable, Rust no longer considers the first variable valid, and so won't try to free anything when it goes out of scope.

This is called a move - since we're basically moving the first version of our string into the second version. Rust has this as the default, so it's always inexpensive and makes sure we don't cause a double call to free the memory.

Now if we did want to have two copies of the same string value, we can use a method called clone. This makes it obvious that we're intending to do this, and it might be memory-expensive.

let s1 = String::from("hello");
let s2 = s1.clone();

Passing a value to a function works in a similar way as assigning it to a variable. It will either move or copy the value, depending on the type of data. Returning values can also transfer ownership.

fn main() {
  let s = String::from("hello"); // s comes into scope
  takes_ownership(s); // s's value moves to the function, and s is no longer valid

  let x = 5; // x comes into scope
  makes_copy(x); // x is copied into the function since i32 is a Copy type, so x is still valid
} // x goes out of scope, then s - but since s is already invalid, nothing special happens to it

// A second, separate example - returning values also transfers ownership
fn main() {
  let s1 = gives_ownership(); // if this function returns a value, that value will be given to s1
  let s2 = String::from("hi"); // s2 comes into scope
  let s3 = takes_and_returns(s2); // s2 is moved into the function and marked invalid; if this function returns a value it will be given to s3
} // s3 goes out of scope; s2 is checked but already invalid so nothing happens to it; s1 goes out of scope
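For a version that actually compiles, here's one possible set of definitions for the functions called above. The bodies (and the return values on takes_ownership and makes_copy) are assumptions added so the result can be observed - the notes above only show the calls.

```rust
fn takes_ownership(some_string: String) -> usize {
    some_string.len()
} // some_string goes out of scope here and its memory is freed

fn makes_copy(some_integer: i32) -> i32 {
    some_integer + 1
}

fn gives_ownership() -> String {
    String::from("yours") // ownership moves out to the caller
}

fn takes_and_returns(a_string: String) -> String {
    a_string // ownership moves in, then straight back out
}

fn main() {
    let s = String::from("hello");
    let len = takes_ownership(s); // s is moved and can't be used after this line

    let x = 5;
    let y = makes_copy(x); // x is copied, so it's still usable
    println!("len = {}, x = {}, y = {}", len, x, y);

    let s1 = gives_ownership();
    let s2 = String::from("hi");
    let s3 = takes_and_returns(s2); // s2 is moved in, and the value comes back as s3
    println!("{} {}", s1, s3);
}
```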

References and Borrowing

Rust has references, where we can refer to a value without taking ownership of it - noted using &.

fn main() {
  let s1 = String::from("hello");
  let len = calculate_length(&s1);
}
fn calculate_length(s: &String) -> usize {
  s.len()
}

Since we're passing in a reference, the value it points to will not be dropped when that reference goes out of scope. Since the function has a reference and not the actual value, we also don't have to return the value to give ownership back - we never took it in the first place.

Creating a reference is called borrowing - like borrowing something in real life, we have to give it back when we're done, since we don't own it.

We also can't edit a referenced value by default - references are immutable unless we say otherwise. If we want to be able to change something through our reference, we need to specifically make it a mutable reference with &mut. We can only have 1 mutable reference to a particular piece of data at a time.

fn main() {
  let mut s = String::from("hello");

  change(&mut s);
}
fn change(some_string: &mut String) {
  some_string.push_str(", world");
}

You also can't have a mutable reference while immutable references to the same value are still in use.

A reference's scope starts from where it's declared and continues through the last time it's used. So if we make an immutable reference, use it, and then make a mutable reference, that will work.
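A sketch of that last case (borrow_in_order is a made-up name): the immutable references are used for the last time before the mutable one is created, so the compiler accepts it.

```rust
fn borrow_in_order() -> String {
    let mut s = String::from("hello");

    let r1 = &s;
    let r2 = &s;
    let summary = format!("{} and {}", r1, r2); // last use of r1 and r2

    let r3 = &mut s; // fine: the immutable borrows are no longer in use
    r3.push_str(", world");

    // Using r1 or r2 again down here would make the `&mut s` above an error.
    format!("{} / {}", summary, s)
}

fn main() {
    println!("{}", borrow_in_order());
}
```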

Dangling References

A dangling reference is a pointer to memory that has already been freed or handed to something else. Rust makes sure this can't happen - if the data would go out of scope (and be dropped) while a reference to it is still in use, we get a compile error.
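A small sketch of what the compiler rejects, and the usual fix of returning the owned value instead of a reference to it. The function names here are only for illustration.

```rust
// This would not compile: s is dropped at the end of the function,
// so the returned reference would point at freed memory.
// fn dangle() -> &String {
//     let s = String::from("hello");
//     &s
// }

// Returning the String itself moves ownership out to the caller instead.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

fn main() {
    println!("{}", no_dangle());
}
```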

Slices

Slices also don't have ownership - they let you reference a contiguous sequence of elements in a collection.

let s = String::from("hello, world");
// this is a reference to a section of the String, grabbing the range [starting_index..ending_index]
let hello = &s[0..5];

In the range we selected, the starting index is where we start, and the ending index is 1 more than the last position we want. Internally, this structure stores the starting position and the length of the slice (ending minus starting).

In the .. range syntax, if we want to start from the first index (0), we can drop the first number - [..2]. This also works if we want to go to the end of the string - [3..].

The type for a string slice is &str. This type is an immutable reference - and is the type of a string literal!

The slice and range terms we just used also work on arrays.

let a = [1, 2, 3];
let slice = &a[0..2];
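One place slices come in handy is returning part of a string without copying it. A sketch of a first_word helper (the name is just for this example, and it assumes words are separated by plain ASCII spaces):

```rust
// Returns a slice of the input up to the first space, or the whole input if there is none.
fn first_word(s: &str) -> &str {
    for (i, &byte) in s.as_bytes().iter().enumerate() {
        if byte == b' ' {
            return &s[..i]; // same as &s[0..i]
        }
    }
    s
}

fn main() {
    let s = String::from("hello world");
    // &String is accepted where &str is expected
    println!("{}", first_word(&s));
}
```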

Strings

// to create a new empty string we can load data into
let mut s = String::new();
// add initial data to a string
let s = "initial contents".to_string();
let mut s = String::from("initial contents"); // mut so we can add to it below
// add data to strings
// push_str takes a string slice (no ownership)
s.push_str("bar");
// push takes a single character
s.push('l');
// concatenate strings - + and format
let s1 = String::from("Hello, ");
let s2 = String::from("world!");
// s1 will be moved here, so no longer valid
let s3 = s1 + &s2;
// or use format for multiple strings - works like println but returns a String instead of printing, and doesn't take ownership of its arguments
// (s1 can't be used here since it was moved above)
let s = format!("{}-{}", s2, s3);

Rust strings don't support indexing like s[0]. Since strings are stored in UTF-8, a character can take more than one byte, so indexing like that doesn't always return what we might expect. Instead, Rust has you use a slice with a range, to specifically grab the bytes you want.

// will grab first 4 bytes of a string - may not always be 4 characters, since characters outside ASCII take more than one byte each
// (here each Cyrillic letter is 2 bytes, so s is "Зд"; slicing through the middle of a character would panic)
let hello = "Здравствуйте";
let s = &hello[0..4];
// other ways to access elements
// to perform operations on individual characters, best to use chars
for c in "test".chars() {
  println!("{}", c);
}
//  if you want bytes, can use .bytes()

Structs

A struct is a custom data type that lets us name and package together multiple related values, similar to an object's data attributes. In structs, the keys are called fields.

Structs are similar to tuples in that each piece can be a different type. However, in structs each piece is named, making it more flexible than a tuple.

struct User {
  username: String,
  email: String,
  sign_in_count: u64,
  active: bool,
}

To then use a struct, we create an instance of it by giving a value for each field. If we make it a mutable instance, we can also change fields with dot notation.

let mut user1 = User {
  email: String::from("example@test.com"),
  username: String::from("user124"),
  active: true,
  sign_in_count: 1,
};

user1.email = String::from("newemail@text.com");

Similar to JS, can use a shorthand syntax if the field name and a parameter name in a function use the same name:

fn build_user(email: String, username: String) -> User {
  User {
    email,
    username,
    active: true,
    sign_in_count: 1, // every field needs a value, or it won't compile
  }
}

We can also make new instances based on other instances - so if we want to use most of the same values of an instance, but change a few, we can do that.

let user2 = User {
  email: String::from("another@example.com"),
  username: String::from("differentuser2"),
  ..user1
};

// if we wanted to do this more long hand, it would look like
// active: user1.active,

Each struct you define is its own type.

Tuple Structs

If you've got an instance where you want to name a tuple, and it would be verbose to name the fields, we can make a tuple struct - will behave like a tuple, but is named and its own type, separate from any other tuples.

struct Color(i32, i32, i32);
let black = Color(0, 0, 0);

Unit Structs

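A unit-like struct has no fields at all, and is declared with just the keyword and a name followed by a semicolon. These are useful when you need a type to implement a trait on, but don't have any data to store in it.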

Methods

Methods are similar to functions - they both use fn, are named, can have parameters and a return value, & contain code to run. But methods are defined in the context of a struct, & their first parameter is always self (the instance of the struct it's called on).

struct Rectangle {
  width: u32,
  height: u32,
}

impl Rectangle {
  fn area(&self) -> u32 {
    self.width * self.height
  }
}

// if we wanted to change the values in self, we would use:
// fn area(&mut self)...

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };

    println!(
        "The area of the rectangle is {} square pixels.",
        rect1.area()
    );
}

Methods can take ownership of self, borrow self immutably, or borrow self mutably, just as they can with any other parameter.

Can have multiple methods in an impl block - helps keep our code easier to read, since any method that can be used on our struct is tied to it and in the same place. (Though you can use multiple impl blocks for the same struct if you'd like.)

In impl blocks, can also define associated functions, that don't take self as a parameter - they won't have an instance of the struct to work with. Often used for constructors that return a new instance of the struct. To call these, we use the :: syntax (String::from is an example of this).
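A sketch of an associated function used as a constructor next to a regular method. The square name is an assumption for illustration; the struct mirrors the Rectangle above.

```rust
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // Associated function: no self parameter, called as Rectangle::square(..)
    fn square(size: u32) -> Rectangle {
        Rectangle { width: size, height: size }
    }

    // Method: takes &self, called on an instance as rect.area()
    fn area(&self) -> u32 {
        self.width * self.height
    }
}

fn main() {
    let sq = Rectangle::square(4);
    println!("area = {}", sq.area());
}
```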

Enums

Enums (enumerations) allow you to define a type by enumerating its possible variants. They can encode meaning along with data, and with pattern matching can help in running different code for different values.

// This basically makes a custom data type we can use
enum IpAddrKind {
  V4,
  V6
}
// Then to create an instance of a variant:
let four = IpAddrKind::V4;
// variants are namespaced like this, so that if we want to do something to any IpAddrKind, we can
fn route(ip_kind: IpAddrKind) {}

We can also set up our enums so that they have values associated with each type. Each variant can have different types of data associated with it, as well. Can put any kind of data in an enum - strings, numerics, structs, even another enum.

enum IpAddr {
  V4(u8, u8, u8, u8),
  V6(String),
}

let home = IpAddr::V4(127, 0, 0, 1);
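To get the data back out of a variant, we can pattern match on it. A minimal sketch (the describe function is hypothetical) that turns either variant into a printable String:

```rust
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

// Each arm binds the data stored inside that variant.
fn describe(ip: &IpAddr) -> String {
    match ip {
        IpAddr::V4(a, b, c, d) => format!("{}.{}.{}.{}", a, b, c, d),
        IpAddr::V6(text) => text.clone(),
    }
}

fn main() {
    let home = IpAddr::V4(127, 0, 0, 1);
    let loopback = IpAddr::V6(String::from("::1"));
    println!("{} {}", describe(&home), describe(&loopback));
}
```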

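Can also attach methods to an enum. self will be the enum value (whichever variant it is) that we call the method on.

// a small Message enum to attach the method to (variants assumed for this example)
enum Message {
  Quit,
  Write(String),
}

impl Message {
  fn call(&self) {
    // ...
  }
}
let m = Message::Write(String::from("hello"));
m.call();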

Match Control Flow

match compares values against a series of patterns, and executes code depending on the first one it matches.

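// UsState needs Debug so it can be printed with {:?} (the variants here are just examples)
#[derive(Debug)]
enum UsState {
  Alabama,
  Alaska,
}

enum Coin {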
  Penny,
  Nickel,
  Dime,
  Quarter(UsState),
}

fn value_in_cents(coin: Coin) -> u8 {
  match coin {
    Coin::Penny => 1,
    Coin::Nickel => 5,
    Coin::Dime => 10,
    Coin::Quarter(state) => {
      println!("State quarter from {:?}!", state);
      25
    }
  }
}

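Looks similar to if, but the condition in an if has to be a bool, while match can compare a value of any type. For each arm, it lists the pattern to match (Coin::Variant), then the arrow points to the value/code to run if it matches. The arms have to cover every possible value, or the code won't compile.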

If Let

fn main() {
    let mut res = 42;
    let option = Some(12);
    if let Some(x) = option {
        res += x;
    }
    println!("{}", res);
}

The if let phrasing here is similar to destructuring - it will check to see if the pattern we've provided (in the code above, Some(x)) matches the value we've passed. If so, it will run the code inside, binding the inner value to x. If not, the block is skipped.
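Put another way, if let is shorthand for a match that only cares about one pattern. A sketch with two hypothetical helpers that do the same thing:

```rust
fn add_with_if_let(base: i32, option: Option<i32>) -> i32 {
    let mut res = base;
    if let Some(x) = option {
        res += x;
    }
    res
}

// The same logic written as a match, with an explicit do-nothing arm.
fn add_with_match(base: i32, option: Option<i32>) -> i32 {
    let mut res = base;
    match option {
        Some(x) => res += x,
        None => {}
    }
    res
}

fn main() {
    println!("{}", add_with_if_let(42, Some(12)));
    println!("{}", add_with_match(42, None));
}
```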

Modules

Modules are a way to organize code within a crate into groups, improving readability and reusability. They also control the privacy of items, if they can be used by outside code (public) or internally only (private).

Modules are most useful for grouping related definitions together - can hold other modules, functions, structs, enums, constants, traits, etc.

mod hosting {
  // needs pub so code outside the hosting module can call it
  pub fn add_to_waitlist() {}
}

To use something inside a module, you'll need to know the path. You can use absolute paths (starting from the crate root), or use the relative path (starting from the current module). Both will use :: in between identifiers.

Items inside a module are private by default. Items in a parent module can't use private items in their children (though children can use items in their ancestor modules). Making a module public doesn't make its contents public - pub on a module only lets code in its ancestor modules refer to it, so each item inside needs its own pub as well.

// setting pub in front marks this as public
pub fn eat_at_restaurant() {
  // absolute path
  crate::hosting::add_to_waitlist();
  // relative path
  hosting::add_to_waitlist();
}

Can also construct relative paths using super, which starts from the parent module - sort of like using .. in a file path.

fn serve_order() {}

mod back_of_house {
  fn fix_incorrect_order() {
    cook_order();
    super::serve_order();
  }

  fn cook_order() {}
}

Another privacy note in regards to structs and enums: if you use pub on a struct it will make the struct public, but its fields will still be private - you can adjust these individually. If any field stays private, you'll also need a public associated function that constructs an instance of the struct, since outside code won't be able to set the private fields.

With enums, if you make it public all of its variants are public.

use allows us to bring an item into scope once, and then use it like it's a local item, simplifying the rest of the calls. With functions, you'll often bring the parent module into scope instead of going all the way to the function itself, so it's clear at the call site that it's not locally defined. With structs, enums, and such you'll often specify the full path in use.

If two types have the same name (like a Result type in two different modules), you can either bring in just the parent modules and call Result through them, or you can use the keyword as and provide a new local name / alias.

If you bring something into scope with use, the name will be private to the new scope. So if you need the code that calls your code to also be able to use it, you can use pub use - this is called re-exporting.

If you need to bring in multiple items from the same library, can use curly brackets around a list.

use std::{cmp::Ordering, io};
// to bring in one complete path and one that goes farther:
use std::io::{self, Write};
// or can bring in all public items from a path
use std::collections::*;

Vectors

Vec<T> - allows you to store multiple values in a single data structure that puts all values next to each other in memory. Values must be of the same type.

// create a new, empty vector
// We put a type in < > so Rust knows what type we intend to store
let v: Vec<i32> = Vec::new();
// if we have starting values, Rust can usually infer the type
let mut v = vec![1, 2, 3];
// to add elements, use push (this is why v is declared with mut)
v.push(5);
// Also - if you want an independent copy without moving the original, can make a clone!
let v2 = v.clone();

When a vector gets dropped (goes out of scope), all of its contents are also dropped.

To reference values in a vector, there are two ways:

let v = vec![1, 2, 3, 4, 5];
// can use indexing with & - gives a reference
let third: &i32 = &v[2];
// or can use get - gives an Option
match v.get(2) {
  Some(third) => println!("The third element is {}", third),
  None => println!("There is no third element."),
}

There are two ways because there are two behaviors you might want if you try to access something past the end of the vector. If you use [] and try to grab something outside the range, the program will panic. If you use get, it will return None.

Will need to remember that you can't have a mutable reference while an immutable one is still in use - so if you store a reference to an indexed value, then add something to the original vector, then use that reference, it will cause an error. Since a vector's values sit next to each other in memory, adding an item might mean moving the whole vector somewhere with more room, which would leave the old reference pointing at freed memory.
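A sketch of the error and one way around it (first_then_push is a made-up name). Copying the value out works here because i32 is a Copy type; for other element types a clone might be needed instead.

```rust
fn first_then_push() -> (i32, usize) {
    let mut v = vec![1, 2, 3, 4, 5];

    // This version would not compile: `first` borrows from v,
    // and push needs a mutable borrow while `first` is still in use.
    // let first = &v[0];
    // v.push(6);
    // println!("{}", first);

    let first = v[0]; // copies the i32 out, so no borrow is held
    v.push(6);
    (first, v.len())
}

fn main() {
    println!("{:?}", first_then_push());
}
```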

To access all values in a vector:

// immutable - read only
let v = vec![100, 32, 57];
for i in &v {
  println!("{}", i);
}
// mutable - if we need to make changes
let mut v = vec![100, 32, 57];
for i in &mut v {
  // * is the dereference operator - follows the reference in i to get at the value so we can change it
  *i += 50;
}

Enums can also be stored in vectors, since they count as the same type! Useful if we need to keep information together that have different individual types.

enum SpreadsheetCell {
  Int(i32),
  Float(f64),
  Text(String),
}

let row = vec![
  SpreadsheetCell::Int(3),
  SpreadsheetCell::Text(String::from("blue")),
  SpreadsheetCell::Float(10.12),
];

Errors

Unrecoverable Errors with Panic

// if we need our program to panic (print a failure message, unwind and clean up stack, then quit):
panic!("error message!");

Recoverable Errors with Result

Result is an enum with two variants - an Ok and an Err. Both will have their own type that they expect.

enum Result<T, E> {
  Ok(T),
  Err(E),
}

// an example way to use this
use std::fs::File;

fn main() {
  let f = File::open("hello.txt");

  let f = match f {
    Ok(file) => file,
    Err(error) => panic!("Problem opening the file: {:?}", error),
  };
}

//  can also detect different failure reasons
use std::fs::File;
use std::io::ErrorKind;

fn main() {
  let f = File::open("hello.txt");

  let f = match f {
    Ok(file) => file,
    Err(error) => match error.kind() {
      ErrorKind::NotFound => match File::create("hello.txt") {
        Ok(fc) => fc,
        Err(e) => panic!("Problem creating the file: {:?}", e),
      },
      other_error => {
        panic!("Problem opening the file: {:?}", other_error)
      }
    }
  };
}

The Result type has helper methods for various tasks. One is unwrap - will work like our first example above, where it will return the file if all goes well, or call panic! for us if there's an error. There's also expect, which works similarly but lets you provide the error message to use in the panic! call, so you can provide more information.

let f = File::open("hello.txt").unwrap();
let f = File::open("hello.txt").expect("Failed to open hello.txt");

Can also return the error to the code that called it, called propagating. The calling code might have more information or other logic that can handle the error better. This is common enough that there's a shortcut operator ? for it. It goes after an expression that produces a Result value. If the Result is Ok, the value inside is unwrapped and we continue; if it's an Err, that error will be returned as if we'd called return on it and the function will end. It can only be used in functions whose return type is compatible, like Result.

use std::fs::File;
use std::io::{self, Read};

fn read_username_from_file() -> Result<String, io::Error> {
  let mut f = File::open("hello.txt")?;
  let mut s = String::new();
  f.read_to_string(&mut s)?;
  Ok(s)

  // Could also chain these and rewrite the body as:
  // let mut s = String::new();
  // File::open("hello.txt")?.read_to_string(&mut s)?;
  // Ok(s)
}
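The same operator works with any Result, not just file operations. A self-contained sketch that doesn't need a file on disk (parse_and_double is a hypothetical helper):

```rust
use std::num::ParseIntError;

// If parse fails, ? returns the ParseIntError to the caller right away.
fn parse_and_double(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

fn main() {
    match parse_and_double("21") {
        Ok(value) => println!("doubled: {}", value),
        Err(e) => println!("could not parse: {}", e),
    }
}
```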

Tests

Tests can be written directly in your Rust code, to verify your non-test code is doing what you expect. Typically will perform 3 actions:

  • Set up any needed data or state.
  • Run the code you want to test.
  • Assert the results are what you expect.
#[cfg(test)]
mod tests {
  // will often see this line in test modules - since it's an inner module, we need to bring the code under test in the outer module into the scope of the inner module - this does that
  use super::*;
  // to make a function a test, add this line before it:
  #[test]
  fn it_works() {
    assert_eq!(2 + 2, 4);
  }
}

// A few different assertion macros, and what they do:
// assert!    - ensures a condition evaluates to true; does nothing if true, panics if false
// assert_eq! - passes if both values are equal
// assert_ne! - passes if both values are not equal

// to test that a function should panic, add another attribute
// can add an expected parameter to check that the panic message contains certain text
#[test]
#[should_panic(expected = "Guess value is below 100")]
fn panics_with_message() {
  // placeholder: call the code that should panic with that message here
}

// can use Result for a bit more finesse
#[test]
fn it_works() -> Result<(), String> {
  if 2 + 2 == 4 {
    Ok(())
  } else {
    Err(String::from("two plus two does not equal four"))
  }
}
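For a fuller picture of should_panic, here's a sketch with something to actually test. The check_guess function and its panic message are assumptions made up for this example; expected only needs to be a substring of the real panic message.

```rust
// Hypothetical function under test: panics when the value is above 100.
fn check_guess(value: i32) -> i32 {
    if value > 100 {
        panic!("Guess value must be at most 100, got {}", value);
    }
    value
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn accepts_small_guess() {
        assert_eq!(check_guess(42), 42);
    }

    #[test]
    #[should_panic(expected = "at most 100")]
    fn rejects_large_guess() {
        check_guess(200); // the test passes only if this panics with a matching message
    }
}

fn main() {
    println!("{}", check_guess(42));
}
```

Running cargo test would run both tests in the tests module.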

Macros

There are declarative macros macro_rules! and three kinds of procedural macros:

  • custom #[derive] macros, that specify code added with the derive attribute used on structs and enums
  • attribute-like, define custom attributes usable on any item
  • function-like, look like function calls but operate on tokens specified as their argument

Macros are a way of writing code that writes other code (metaprogramming). They let us produce more code than what we write manually. They're similar to functions, but there are some differences:

  • function signatures must declare the number & type of parameters - macros can take a variable number of parameters.
  • macros are expanded before the compiler interprets the meaning of the code, so a macro can, for example, implement a trait on a type. A function can't, because it gets called at runtime and a trait needs to be implemented at compile time.
  • macros are more complex, since you're writing Rust code that writes Rust code, so are generally more difficult to read & maintain.
  • must define macros or bring them into scope before you call them in a file; functions can be defined anywhere & called anywhere.

Declarative

Declarative macros act similarly to a match expression - they compare the code passed in against patterns, and expand to the code associated with the pattern that matches.

// makes macro available whenever the crate it's in is brought into scope
#[macro_export]
// start with this line, then the name we want to use
macro_rules! vec {
  // similar to a match expression, we have one arm with a pattern, followed by a block of code associated with the pattern
  // $x:expr matches any Rust expression and binds it to the name $x - the comma after $( ... ) is a literal separator that may appear between the expressions - the * means the pattern inside $( ... ) can repeat zero or more times
  ( $( $x:expr ),* ) => {
    {
      let mut temp_vec = Vec::new();
      $(
        temp_vec.push($x);
      )*
      temp_vec
    }
  };
}
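
The same repetition syntax works for other small macros. A minimal sketch, assuming a made-up macro named sum_all that adds up one or more expressions:

```rust
// Hypothetical macro: expands sum_all!(a, b, c) into a + b + c.
macro_rules! sum_all {
    // one required expression, then zero or more ", expr" repetitions
    ( $first:expr $(, $rest:expr )* ) => {
        $first $( + $rest )*
    };
}

// Small wrapper so the expansion is easy to call from other code.
fn sum_three(a: i32, b: i32, c: i32) -> i32 {
    sum_all!(a, b, c)
}
```

Because the macro works on tokens and not on typed parameters, the same definition accepts any number of arguments and any type that supports +.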

Procedural

Procedural macros are more similar to functions - they accept input, operate on it, and produce output.

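// definitions must reside in their own crate with a special crate type
use proc_macro::TokenStream;

// some_attribute is a placeholder for the kind of procedural macro being defined
#[some_attribute]
pub fn some_name(input: TokenStream) -> TokenStream {
  // operate on the tokens here - returning the input unchanged is the simplest placeholder
  input
}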

Loops

Rust has 3 ways to loop over things:

  1. Loop - Just loops continuously until it reaches a break keyword. If a value follows break, that value is returned from the loop expression (not from the function), so it can be assigned to a variable.
fn main() {
  let mut counter = 0;

  let result = loop {
    counter += 1;

    if counter == 10 {
      break counter * 2;
    }
  };
}
  2. While - Similar to JS, will run through the loop while the condition is true, then once it's false the loop ends on its own.
fn main() {
  let mut number = 3;

  while number != 0 {
    println!("{}!", number);

    number -= 1;
  }

  println!("LIFTOFF!");
}
  3. For - will loop over each item in a collection
fn main() {
  let a = [10, 20, 30, 40, 50];

  for element in a.iter() {
    println!("the value is: {}", element);
  }
}
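
Here are the three loop kinds again as small functions that return values instead of printing, so the results are easy to check. The function names are made up for this sketch.

```rust
// loop: break hands a value back out of the loop expression
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut value = 1;
    loop {
        if value > limit {
            break value;
        }
        // assumes limit is small enough that doubling doesn't overflow a u32
        value *= 2;
    }
}

// while: keeps going as long as the condition is true
fn countdown(start: u32) -> Vec<u32> {
    let mut out = Vec::new();
    let mut n = start;
    while n != 0 {
        out.push(n);
        n -= 1;
    }
    out
}

// for: walks over each item of a collection or range
fn countdown_with_for(start: u32) -> Vec<u32> {
    let mut out = Vec::new();
    for n in (1..=start).rev() {
        out.push(n);
    }
    out
}
```

The for version has no counter to manage by hand, which is why it's usually the safest choice when looping over a known set of items.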

Smart Pointers

Using Box to Point to Data on the Heap

The most straightforward smart pointer is a box - Box<T>; it allows you to store data on the heap instead of the stack - all the stack keeps is a pointer to the heap data. Most often, you'll use this in these situations:

  • if you have a type whose size can't be known at compile time & you want to use a value of that type in a context that needs an exact size (might be a recursive type, where you won't know the size right away)
  • with large amounts of data where you want to transfer ownership but not copy the data when you do (better for performance to only need to move the pointer & not a large amount of data)
  • when you want to own a value and only care that it's a type w/ a particular trait rather than being a specific type
// simple example of how to make a new box
fn main() {
  let b = Box::new(5);
  println!("b = {}", b);
}

Data is accessed like we've done before, and will also go out of scope (both the box and the data it points to) once main closes.

Enabling Recursive Types with Boxes

Rust needs to know how much space a type takes up at compile time. A recursive type (one that contains another value of the same type) could nest infinitely, so Rust can't know its exact size at compile time. So we use a box, since it does have a known size.

// If we tried to define an enum type like this:
enum List {
  Cons(i32, List),
  Nil,
}
// it won't compile. Because the type contains itself directly, Rust can't figure out how much space it needs to store this type.

To determine how much space to allocate for a particular type, Rust will go through each variant and see which needs the most space. Since only one variant type will be used at a time, it just needs to know which is the largest and reserve that much space. So we need "indirection", or changing the data structure to store the value indirectly by using a pointer instead. Pointers are always the same size, so we know how much space we need.

// so a way that will compile to make a list is like so:
enum List {
  Cons(i32, Box<List>),
  Nil,
}
// then to use it, we'd call it like this - we pass in the Nil value when we reach the end of the list we want to make
use crate::List::{Cons, Nil};

fn main() {
  let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
}

Boxes provide only indirection and heap allocation, no other special capabilities. This means they also don't have the performance overhead those extra capabilities would bring, so they can be useful when the indirection is all you really need. They implement the Deref trait, which allows Box values to be treated like references. When a Box goes out of scope, the heap data it points to will be cleaned up as well because of the Drop trait implementation.
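
To show the boxed list being used and not just built, here's a minimal sketch that walks it recursively. The functions sum and sample_list are assumptions added for illustration.

```rust
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Hypothetical helper: add up every value in the list.
// Passing rest (a &Box<List>) where &List is expected works through Deref.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

// Builds the list 1 -> 2 -> 3 -> Nil.
fn sample_list() -> List {
    List::Cons(
        1,
        Box::new(List::Cons(2, Box::new(List::Cons(3, Box::new(List::Nil))))),
    )
}
```

When the value returned by sample_list goes out of scope, each Box drops the node it owns, so the whole chain is freed without any manual cleanup.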

Fearless Concurrency

Shared-State Concurrency

Shared memory concurrency is like multiple ownership: multiple threads can access the same memory location at the same time. Naturally, this becomes complex since each owner needs managing. Mutexes are one of the common concurrency primitives for shared memory.

Mutex is short for "mutual exclusion" - it allows only one thread to access some data at any given time. A thread must first signal that it wants access by asking to get the mutex's lock (a data structure that keeps track of who currently has exclusive access to the data). There are two rules with mutexes:

  • You must attempt to acquire the lock before using the data
  • When you're done, you must unlock the data so other threads can use it
// using a mutex in a single threaded context
use std::sync::Mutex;

fn main() {
  let m = Mutex::new(5);

  {
    // calling lock will block this thread, so we can't do anything else until we get access
    // we use unwrap because if another thread holding the lock first panics, we'll never get access. this way, this line will panic as well, so we'll know
    let mut num = m.lock().unwrap();
    *num = 6;
  }
  // the lock will be released automatically when it goes out of scope, so we don't have to implement that ourselves - this is why this call is wrapped in braces

  println!("m = {:?}", m);
}

Mutex is a smart pointer - more accurately, the call to lock returns a smart pointer called MutexGuard, that's wrapped in a LockResult (handled above with unwrap). MutexGuard implements Deref to give us access to the data stored inside, and implements Drop so it will release the lock automatically once it goes out of scope.

If we have multiple threads that need to own the same data, we'll want to use an Arc<T>, or "atomically reference counted" type. It works like Rc<T> but is safe to share across threads.

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
  let counter = Arc::new(Mutex::new(0));
  let mut handles = vec![];

  for _ in 0..10 {
    let counter = Arc::clone(&counter);
    let handle = thread::spawn(move || {
      let mut num = counter.lock().unwrap();

      *num += 1;
    });
    // keep each handle so we can join the thread later
    handles.push(handle);
  }
  // joining each handle makes sure all the threads are finished before we look for the final result
  for handle in handles {
    handle.join().unwrap();
  }

  println!("Result: {}", *counter.lock().unwrap());
}

Mutex provides interior mutability, meaning we can mutate contents inside of it, even if the variable holding it is not mutable. You also might run into a deadlock situation, where an operation needs locks to two resources, and two threads each have one lock, so they're stuck waiting for the other forever.
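
One common way to avoid that kind of deadlock is to have every thread take the locks in the same order. A minimal sketch of that idea - the names shift_between and run_shifts are made up, and this is just one approach, not the only one:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: every caller locks `left` first and `right` second,
// so two threads can never each hold one lock while waiting on the other.
fn shift_between(left: &Mutex<i32>, right: &Mutex<i32>, amount: i32) {
    let mut l = left.lock().unwrap();
    let mut r = right.lock().unwrap();
    *l -= amount;
    *r += amount;
}

// Spawns four threads that all move a different amount from left to right.
fn run_shifts() -> (i32, i32) {
    let left = Arc::new(Mutex::new(100));
    let right = Arc::new(Mutex::new(0));
    let mut handles = Vec::new();

    for amount in 1..=4 {
        let left = Arc::clone(&left);
        let right = Arc::clone(&right);
        handles.push(thread::spawn(move || shift_between(&left, &right, amount)));
    }
    for handle in handles {
        handle.join().unwrap();
    }

    let l = *left.lock().unwrap();
    let r = *right.lock().unwrap();
    (l, r)
}
```

Neither left nor right is declared mut, yet both values change - that's the interior mutability mentioned above.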

Iterators and Closures

Iterators

The iterator pattern lets you perform a task on a sequence of items in turn, and determines when the sequence is finished. Iterators are lazy - they have no effect until you call methods that consume the iterator.

let v1 = vec![1, 2, 3];
// the iter is declared here, but doesn't do anything yet
let v1_iter = v1.iter();
// now it gets used by the for loop
for val in v1_iter {
  println!("Got: {}", val);
}

All iterators implement a trait named Iterator, which defines the type of the item and a next method, which returns one item at a time wrapped in Some, unless it's done (then it returns None).

You can call next yourself if you want - however, you'll need to make the variable holding the iterator mutable to do so, since each call to next changes the iterator's internal position. In for loops, we don't have to worry about that, because the loop takes ownership of the iterator and handles it for us. Also note - values we get from calls to next on an iterator made with iter are immutable references to the values. If we want an iterator that returns owned values, we can call into_iter instead. Or, if we want to iterate over mutable references, we can use iter_mut.
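
A minimal sketch of the three ways to iterate described above. The function names are made up for illustration.

```rust
// iter + next by hand: the iterator variable has to be mut,
// and the items are immutable references
fn first_two(values: &[i32]) -> (Option<&i32>, Option<&i32>) {
    let mut it = values.iter();
    let first = it.next();
    let second = it.next();
    (first, second)
}

// iter_mut: mutable references let us change the items in place
fn double_in_place(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

// into_iter: takes ownership of the vector and yields owned values
fn into_labels(values: Vec<i32>) -> Vec<String> {
    values.into_iter().map(|n| format!("#{}", n)).collect()
}
```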

Methods that Consume the Iterator

Methods that call next are called consuming adaptors, since calling them uses up the iterator. sum is one example - takes ownership of the iterator & goes through the items by repeatedly calling next, adds each item to a running total and returns the total when done. collect is another - collects resulting values into a collection data type.

let v1 = vec![1, 2, 3];
let v1_iter = v1.iter();
let total: i32 = v1_iter.sum();
assert_eq!(total, 6);

Methods that Produce other Iterators

Iterator adaptors allow you to change iterators into different kinds of iterators. Can chain multiple calls to iterator adaptors for complex actions. But because iterators are lazy, you have to call a consuming adaptor method to get results from them. map is one of these adaptors - takes a closure to call on each item and produces a new iterator of the results. filter is another - takes a closure that gets each item and returns a boolean - true if the item should be included in the resulting iterator, false if not.

let v1: Vec<i32> = vec![1, 2, 3];
let v2: Vec<_> = v1.iter().map(|x| x + 1).collect();
assert_eq!(v2, vec![2, 3, 4]);
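
filter and map can be chained before a single collect. A minimal sketch, with even_squares as a made-up name:

```rust
// Keep only the even numbers, square them, and collect the results.
// Nothing runs until collect consumes the chain.
fn even_squares(values: &[i32]) -> Vec<i32> {
    values
        .iter()
        // filter passes a reference to each item, and the items are already
        // references here, hence the double dereference
        .filter(|x| **x % 2 == 0)
        .map(|x| x * x)
        .collect()
}
```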

You can also implement the Iterator trait on your own types. The only method you're required to provide a definition for is the next method. After that, you can use all other methods w/ default implementations provided by the Iterator trait.

// Here we make a new struct, with one field - we keep count private so only Counter instances can change it
struct Counter {
  count: u32,
}
// then we're going to define a new function so we always start with a default value of 0 in count
impl Counter {
  fn new() -> Counter {
    Counter { count: 0 }
  }
}

// now we implement the Iterator trait - our next function will check if the value is < 5 - if so, we add 1 and return Some w/ the new value. Once we hit 5, we return None
impl Iterator for Counter {
  type Item = u32;

  fn next(&mut self) -> Option<Self::Item> {
    if self.count < 5 {
      self.count += 1;
      Some(self.count)
    } else {
      None
    }
  }
}

// to use it, we can make a new Counter instance and call next on it
fn calling_next_directly() {
  let mut counter = Counter::new();

  assert_eq!(counter.next(), Some(1));
  // ... Go until we get None
}
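
Since Counter only defines next, everything else comes from the trait's default methods. A sketch that repeats the Counter definition so it stands alone; sum_of_products is a made-up name:

```rust
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.count < 5 {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}

// Pairs 1..=5 with 2..=5, multiplies each pair, keeps the products
// divisible by 3, and adds them up: (2*3) + (3*4) = 18.
fn sum_of_products() -> u32 {
    Counter::new()
        .zip(Counter::new().skip(1))
        .map(|(a, b)| a * b)
        .filter(|x| x % 3 == 0)
        .sum()
}
```

zip stops as soon as either side returns None, so the pair that would include the first counter's 5 is never produced.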

Generic Types, Traits, & Lifetimes

Generics are abstract stand-ins for concrete types or properties - similar to how you'll pass parameters to a function. We don't know what specific value or type will be there, but we can express the behavior we want or how they relate to other generics.

Can use generics to create definitions for items like function signatures or structs, which can then be used with different data types.

Say we want to find the largest value in an array. We can make a function for number arrays, and a function for char arrays, that both do the same looping over the list, but have specific types defined. Or, we can use generics to combine those into one function that can work with both types!

To parameterize types in a function, we'll need to name the type parameter - in Rust, T is pretty common for this, though you can use any identifier/name you want. We have to declare the type parameter name (inside angle brackets, after the function name) before we can use it.

// function largest is generic over some type T; it has one parameter named list (a slice of values of type T), and returns a value of the same type T. The bounds say T must be comparable (PartialOrd) and cheap to copy out of the slice (Copy)
fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
  let mut largest = list[0];

  for &item in list {
    if item > largest {
      largest = item;
    }
  }
  largest
}

// can then use this function on various lists
fn main() {
  let number_list = vec![34, 50, 25, 100, 65];
  let result = largest(&number_list);
  let char_list = vec!['y', 'm', 'a', 'q'];
  let result = largest(&char_list);
}
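
If we don't want to require Copy, one alternative is to return a reference into the slice instead. A minimal sketch - largest_ref is a made-up name, and like the version above it assumes the list is not empty:

```rust
// Only needs PartialOrd, so it also works for types like String.
// Panics on an empty slice because of the list[0] access.
fn largest_ref<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];

    for item in list {
        if item > largest {
            largest = item;
        }
    }
    largest
}
```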

Can also use generic types in structs, methods, and enums. Rust doesn't run any slower with generics than it would with concrete types. When it compiles, Rust fills in the concrete types that are actually used and generates a specific version of the code for each one (called monomorphization).

// because we use one generic type here, both x and y will have to have the same type when used
struct Point<T> {
  x: T,
  y: T,
}

// can also use generics in methods - we declare the generic type directly after impl, so it works for any type we pass to our Point
impl<T> Point<T> {
  fn x(&self) -> &T {
    &self.x
  }
}

// could also make methods that only work for certain types - so if we pass Point an f32 type, it will have this method, but other uses with different types won't
impl Point<f32> {
  fn distance_from_origin(&self) -> f32 {
    (self.x.powi(2) + self.y.powi(2)).sqrt()
  }
}

fn main() {
  let integer = Point { x: 5, y: 10 };
}

// if we wanted to allow for multiple types, can use more than one generic - can still have both types be the same, but can also let them be different
struct Point<T, U> {
  x: T,
  y: U,
}

// can also mix references to generics on the struct, and generics on the method
impl<T, U> Point<T, U> {
    fn mixup<V, W>(self, other: Point<V, W>) -> Point<T, W> {
        Point {
            x: self.x,
            y: other.y,
        }
    }
}

fn main() {
  let both_types = Point { x: 5, y: 4.2 };
}

// enums work similarly: the standard Option and Result show using one or multiple generics
enum Option<T> {
  Some(T),
  None,
}

enum Result<T, E> {
  Ok(T),
  Err(E),
}

Traits: Defining Shared Behavior

A trait tells the compiler about functionality a particular type has and can share with other types. Can use bounds to specify that a generic can be any type with a certain behavior.

// this declares the trait and its name. Inside the braces, we declare the method signatures that describe behaviors of types w/ this trait. Each type is responsible for declaring its behavior for these methods - the trait says what method it expects and what it should return; the impl block for each type determines how it's implemented
pub trait Summary {
  // can simply declare the method signature, like so:
  // fn summarize(&self) -> String;
  // or can define it with a default behavior (a trait can't contain both versions of the same method)
  fn summarize(&self) -> String {
    String::from("(Read more...)")
  }
}

// if we wanted to use the default method, we write the impl with an empty block
impl Summary for NewsArticle {}

// using it might look like this:
pub struct NewsArticle {
    pub headline: String,
    pub location: String,
    pub author: String,
    pub content: String,
}
// alternatively (instead of the empty impl), this would override the default implementation
impl Summary for NewsArticle {
    fn summarize(&self) -> String {
        format!("{}, by {} ({})", self.headline, self.author, self.location)
    }
}

// calling the method looks the same either way - with the empty impl block, this would print the default text:
println!("New article available! {}", article.summarize());

Since these are all defined in the same file, they're all in the same scope. If our trait were in a library crate, we'd have to bring the trait into scope with a use statement before we could call its methods.

Can implement a trait on a type only if the trait or the type is local to our crate. Can't implement external traits on external types - at least one has to be defined in our own crate.

Default implementations can call other methods in the same trait.

pub trait Summary {
  fn summarize_author(&self) -> String;

  fn summarize(&self) -> String {
    format!("(Read more from {}...)", self.summarize_author())
  }
}

impl Summary for Tweet {
  fn summarize_author (&self) -> String {
    format!("@{}", self.username)
  }
}

println!("1 new tweet: {}", tweet.summarize());
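
Filled in so it compiles on its own, the snippet above could look like this. The Tweet fields and the sample_tweet helper are assumptions for illustration.

```rust
pub trait Summary {
    fn summarize_author(&self) -> String;

    // default implementation that relies on the required method above
    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.summarize_author())
    }
}

// Hypothetical type with just enough fields for the example.
pub struct Tweet {
    pub username: String,
    pub content: String,
}

// Only summarize_author is written here; summarize comes from the trait.
impl Summary for Tweet {
    fn summarize_author(&self) -> String {
        format!("@{}", self.username)
    }
}

fn sample_tweet() -> Tweet {
    Tweet {
        username: String::from("rustlang"),
        content: String::from("hello"),
    }
}
```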

Can use traits to define functions that accept different types. For example, could define a notify function that calls summarize on a parameter, which is of a type that implements the Summary trait. Instead of providing a concrete type, we say it will work for any type that implements the provided trait.

pub fn notify(item: &impl Summary) {
  println!("Breaking news! {}", item.summarize());
}

// this is syntax sugar for the trait bound:
pub fn notify<T: Summary>(item: &T) { /* ... */ }

// can specify multiple trait bounds
pub fn notify(item: &(impl Summary + Display)) { /* ... */ }
pub fn notify<T: Summary + Display>(item: &T) { /* ... */ }

// can also use impl Trait in the return position, to return a value of a type that implements the trait - can only be used if the function returns a single concrete type
fn returns_summarizable() -> impl Summary { /* ... */ }
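
Here are the signatures above with bodies, so they can be compiled together. The Headline type and the function names notify_loud and latest are made up for this sketch, and the functions return a String instead of printing so the result is easy to check.

```rust
use std::fmt::Display;

pub trait Summary {
    fn summarize(&self) -> String;
}

// Hypothetical type that implements both Summary and Display.
pub struct Headline(pub String);

impl Summary for Headline {
    fn summarize(&self) -> String {
        format!("{} (Read more...)", self.0)
    }
}

impl Display for Headline {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.0)
    }
}

// impl Trait in argument position
pub fn notify(item: &impl Summary) -> String {
    format!("Breaking news! {}", item.summarize())
}

// explicit trait bound with two traits
pub fn notify_loud<T: Summary + Display>(item: &T) -> String {
    format!("{}! {}", item.to_string().to_uppercase(), item.summarize())
}

// impl Trait in return position: callers only know it implements Summary
pub fn latest() -> impl Summary {
    Headline(String::from("Penguins win"))
}
```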

Using multiple bounds with multiple types can get confusing, so Rust provides a where clause:

// instead of
fn some_func<T: Display + Clone, U: Clone + Debug>(t: &T, u: &U) -> i32
// we do
fn some_func<T, U>(t: &T, u: &U) -> i32
  where T: Display + Clone,
        U: Clone + Debug
{}

Traits can be used to conditionally implement methods. By putting trait bounds on an impl block for a generic type, some methods exist for every type, and some only exist when the inner type implements the required traits.
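
A minimal sketch of a conditional method, assuming a made-up Pair type. The new function exists for every Pair<T>, while describe_largest only exists when T can be compared and displayed.

```rust
use std::fmt::Display;

struct Pair<T> {
    x: T,
    y: T,
}

// available for every Pair<T>
impl<T> Pair<T> {
    fn new(x: T, y: T) -> Self {
        Self { x, y }
    }
}

// only available when T implements both Display and PartialOrd
impl<T: Display + PartialOrd> Pair<T> {
    fn describe_largest(&self) -> String {
        if self.x >= self.y {
            format!("The largest member is x = {}", self.x)
        } else {
            format!("The largest member is y = {}", self.y)
        }
    }
}
```

Calling describe_largest on a Pair whose type doesn't meet the bounds is a compile error, not a runtime one.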

Lifetimes ensure references are valid as long as we need them to be.
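
A minimal sketch of a lifetime annotation, using a longest function as the example:

```rust
// 'a ties the returned reference to both inputs: the result is only
// valid for as long as both x and y are valid.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
```

The annotation doesn't change how long anything lives - it only tells the compiler how the lifetimes of the inputs and the output relate, so it can reject code that would leave a dangling reference.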
