Rust in Pieces

A top-down approach to learning Rust coming from Python (and vice-versa).

Introduction

This book's goal is to bring the Python and Rust developer communities closer together, and to help more developers from one language learn how to leverage the benefits of the other.

The book and accompanying code are organized into a collection of small projects, termed pieces. Each piece is a self-contained task with Python and Rust implementations that each perform the same task. The aim is to help Python developers gain familiarity with Rust, and vice-versa, by comparing and contrasting the two languages in a top-down manner.

Importantly, the pieces in this book build towards unifying Python and Rust code bases via PyO3, a highly popular open source library that allows you to call Rust bindings from Python or the Python interpreter from Rust. Using one language does not preclude using the other!

As you go through the pieces, you'll find yourself becoming proficient in writing clean, tested, production-worthy code using engineering best practices in either language, moving between them at will. Over time, you can make a more informed choice regarding when to use one or the other language for parts of a larger project.

We believe that Rust 🦀 is among the most approachable lower-level programming languages for Python developers, and that Python is one of the most valuable high-level languages for Rust developers who are looking to build tooling for the burgeoning data, AI and ML ecosystems. The arrival of tools like PyO3 has made it highly feasible for a developer to straddle both worlds, combining their best parts, thus helping build more efficient and scalable software.

What's covered in this book?

Rust's learning curve is considerably steeper than Python's, so the table below is provided to show a mapping between each piece and its corresponding concept in Rust. As can be seen, structs, serialization, deserialization, vectors and traits are ubiquitous concepts in Rust.

| Piece | Category | Key Rust concepts |
|---|---|---|
| Hello world | Intro | macros |
| Data structures & constructs | Intro | crates, structs, traits, implementations |
| Simple CSV parsing | Files | serde, vec |
| Regex JSON | Files | match, regex |
| Mock data generation | Files | RNG, sampling |
| Age grouping | Files | enums |
| Datetime parsing | Files | chrono, lifetimes |
| Extract pronouns from text | Files | rayon, parallelism |
| Postgres | Databases | async, sqlx, tokio |
| Meilisearch | CLIs | async, async-std, clap |
| REST API to Postgres | APIs | axum, async, tokio |
| PyO3 mock data generation | Unification | PyO3, Maturin |
| PyO3 parallel computation | Unification | PyO3, Maturin |

Why use Rust with Python?

Python is a dynamically typed, interpreted programming language that's known for its flexibility, ease of use and low barrier to entry. It's by far the most popular language for AI, ML and data science, and has been the go-to language for researchers and innovators in these fields for quite a while now.

It's possible to write relatively high-performance code in Python these days by leveraging its rich library ecosystem (many of these libraries are wrappers around C/C++/Cython code). However, performance and concurrency are not Python's strong suits, and this requires performance-critical code to be implemented in lower-level languages. For many Python developers, using languages like C, C++ and Cython is a daunting prospect.

Rust is a statically typed, compiled programming language that's known for its relatively steep learning curve. Its design philosophy is centered around three core functions: performance, safety, and fearless concurrency. It offers a modern, high-level syntax and a rich type system that makes it possible to write code that runs really fast without the need for manual memory management, eliminating entire classes of bugs.

Although it's possible to write all sorts of complex tools and applications in Rust, it's not the best option for every situation. In cases like research and prototyping, where speed of iteration is important, Rust's strict compiler can slow down development, and Python is still the better choice.

We believe that Python 🐍 and Rust 🦀 form a near-perfect pair to address either side of the so-called "two-world problem", explained below.

The two-world problem

The programming world often finds itself divided in two: those who prefer high-level, dynamically typed languages, and those who prefer low-level, statically typed languages.

Many high-level languages are interpreted (i.e., they execute each line as it's read, sequentially). These languages are generally easier to learn because they abstract away the details of memory management, allowing for rapid prototyping and development.

Lower-level languages, on the other hand, tend to be ahead-of-time (AOT) compiled. They offer the programmer more control over memory management, resulting in much more performant code at the cost of a steeper learning curve.

It's for these reasons that scientists, researchers, data scientists, data analysts, quants, etc. have traditionally preferred high-level languages like Python, R and Julia. On the other hand, systems programmers, OS developers, embedded systems engineers, game developers and software engineers tend to prefer lower-level languages like C, C++ and Rust.

The image above is a figurative representation of two distributions of people, typically disparate individuals from either background (with the languages listed in no specific order).

Has the two-world problem been solved before?

A lot of readers will have heard of Julia, a dynamically typed, just-in-time (JIT) compiled alternative to Python that is often touted as a "high-level language with the performance of C". While Julia is no doubt a great language, its popularity is largely limited to the scientific community, and its library ecosystem and user community haven't yet matured to the extent that Python's have. As such, the "two-language problem" that Julia attempts to solve is still largely unsolved.

Other languages like Mojo explain in their vision how they aim to solve the two-world problem by providing a single unified language (acting like a superset of Python) that can be compiled to run on any hardware. However, Mojo is still very much in its infancy as a language, hasn't gained widespread adoption, and its user community is still very small.

Rust and PyO3

The most interesting aspect of PyO3 in combination with Rust is that they offer a new way to think about the two-world problem. Rather than trying to solve the problem by creating a new language that offers the best of many worlds, Rust and PyO3 embrace the problem by allowing a developer to move between the worlds and choose the best tool for parts of a larger task.

Rust's design philosophy and features make it an ideal candidate to bring people from these worlds (high-level and low-level languages) closer together. Rust's strict compiler, rich type system and ownership principles eliminate the need to manually manage memory without requiring a garbage collector, making it possible for a larger community of analytical and scientifically-minded developers to write high-performance code without sacrificing safety.

The image above shows a distribution of the same potential set of developers who can straddle both worlds. Those who are already proficient in Python and require fast iteration for prototyping can choose to write only very specific, performance-critical parts of their code in Rust. Conversely, those who are already proficient in Rust and require high-performance, safe code for their workflows can choose to interface with Python for only very specific parts that need access to the Python ecosystem.

In our view, the interface that PyO3 provides is fundamentally different from earlier approaches to interoperability with Python (such as pybind11, SWIG or Cython), because unlike the earlier tools, PyO3 and Rust are far more accessible to Python developers. We hope this becomes clearer and clearer as you progress through the book.

How to approach learning a new language

Typically, the first steps in learning a programming language involve understanding its syntax, data structures and control flow expressions before tackling a specific problem. This is termed bottom-up learning, and it's essential for understanding the terminology of the language and its ecosystem.

Bottom-up learning resources typically include:

  • Books
  • Documentation
  • Tutorials
  • Interactive exercises

Although necessary to master a topic, bottom-up learning has the issue that learners often get stuck in tutorial hell, where they're constantly being exposed to new concepts without actually building anything end-to-end. In addition, bottom-up learning can leave learners frustrated because they're not able to see the bigger picture and how the different parts come together as a whole.

In contrast, top-down learning follows a more pragmatic approach that's grounded in the real world. For this, a more abstract way of thinking is required. The learner identifies a problem statement in their domain and learns just enough of the language to solve the problem at hand, going deeper only as needed.

Top-down learning resources typically include learning by example, using any of the following resources:

  • Existing codebases
  • Blogs
  • Podcasts
  • Videos

Top-down learning is not a replacement for bottom-up learning. The best way to become proficient and productive in a language is to combine both approaches and do them together.

Don't reinvent the wheel

When learning a new language, it's tempting to start from scratch and build everything that's required to solve a problem from the ground up. This is a slow, sometimes painful process if you're just starting off, and can result in inefficient, unidiomatic code.

Both Rust and Python have rich package ecosystems, and this book leverages them to the fullest extent possible. After all, most great software is built on the shoulders of giants. Reading through existing codebases for libraries, tools and frameworks is a great way to learn how to write idiomatic code in a language, and to understand the performance implications of your code.

Prior reading

It's recommended to have a basic understanding of Python and Rust before reading this book. If you're new to either language, consider reading the following resources. It's okay if you don't absorb everything in the first pass -- the goal is to get a high-level understanding of either language and their ecosystems.

Once you have a handle on the terminology, you can start by getting your hands dirty with the pieces provided in this book. Or better yet, use this framework to create your own pieces for your domain of interest and start writing some code!

How to read this book

The pieces are not meant to be read in a specific order. However, they are roughly organized in order of increasing complexity, and each piece, by and large, utilizes concepts that may have been introduced in an earlier piece.

One of the challenges with learning (and teaching) Rust is that concepts such as ownership, borrowing, traits and lifetimes can be quite difficult for a new learner to grasp, yet they are so ubiquitous in the language that they tend to appear all at once. Because the learning approach provided here is top-down, the best way to get familiar with these concepts is to try and apply them to your own projects, as done in each piece.

As such, we've tried to introduce these concepts in a way that's as gradual as possible, though it's still possible that you may find yourself having to refer to the Rust book or other resources to understand certain concepts from the bottom-up as you go along.

Setup & installation

This section provides an opinionated guide to setting up your development environment for working with Rust and Python. If you're an experienced developer in either language, feel free to skip this section.

Python

For macOS/Linux users, it's recommended to manage Python versions using pyenv. pyenv lets you easily switch between multiple versions of Python. It's simple, unobtrusive, and follows the UNIX tradition of single-purpose tools that do one thing well.

Follow the instructions from the installation steps section of the README to install pyenv on your system.

Windows users can use pyenv-win, a fork of pyenv that allows you to install and manage Windows-native Python versions.

Python version

This book uses Python 3.11.x, though code run from Python 3.8+ should also work without issues. You can install a specific patch release using pyenv:

pyenv install 3.11.7

Virtual environments

It's recommended to use virtual environments to manage your Python dependencies. This allows you to create isolated environments for each project, and avoid dependency conflicts between projects.

The venv module is included in the Python standard library, so you don't need to install anything extra to use it.

To create a virtual environment on Unix systems, run the following command:

# Setup a new environment for the first time
python -m venv venv
# Activate the environment
source venv/bin/activate

On Windows, it's more or less the same:

py -m venv .venv
.venv\Scripts\activate

You can deactivate the environment by running deactivate in your shell.

Rust

For macOS/Linux users, rustup is the recommended tool for managing Rust versions. Using this tool, you can easily switch between multiple versions of Rust, and it also ships with the cargo package manager.

See the Rust Book for instructions on how to install rustup on Windows.

Rust version

This book uses Rust 1.75.x. You can install a specific patch release using rustup:

rustup install 1.75.0

You can start a new Rust project in your local directory by running cargo new <project-name>, and you're ready to go!

Pieces

What is the purpose of each piece?

A piece is a self-contained project with Python and Rust implementations that each perform the same tasks. The purpose of each piece is to help Python developers gain familiarity with Rust, and vice-versa, by comparing and contrasting the two languages in a top-down manner.

A piece's directory structure is organized as follows:

pieces
├── intro
│   ├── python
│   │   ├── main.py
│   │   └── test_main.py
│   └── rust
│       ├── Cargo.toml
│       └── src
│           └── main.rs
├── hello_world
├── simple_csv
└── ...

Each piece comes with Python and Rust source code, and their associated tests. When using Rust's test client, test code is placed in the same file as the code it's testing, and is marked with the #[cfg(test)] attribute. When using Python's test client, test code is placed in a separate file, and is marked with the test_ prefix.

For Python, pip is the package manager of choice, and pytest is the test client used throughout the code. For Rust, cargo is the package manager of choice, and Rust's inbuilt test client, invoked by cargo test, is used throughout.
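As a minimal sketch of the Rust test layout described above, the following file keeps a function and its test side by side. The greeting function is a hypothetical helper used only for this illustration and isn't taken from any of the pieces.

```rust
// Hypothetical helper used only for this illustration.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    println!("{}", greeting("world"));
}

// This module is only compiled when running `cargo test`.
#[cfg(test)]
mod tests {
    // Bring the items from the parent module into scope
    use super::*;

    #[test]
    fn greets_by_name() {
        assert_eq!(greeting("world"), "Hello, world!");
    }
}
```

Running cargo test compiles the tests module and executes every function marked with #[test], while cargo run ignores the module entirely.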

Hello world!

This is the conventional first program that you write when learning a new programming language. It's a simple program that prints the text Hello, world! to the console.

Navigate to the pieces/hello_world directory in the repo to get started.

Python

The file main.py has just one line of code:

print("Hello, world!")

The program is run as follows:

python main.py

Rust

The file main.rs has just three lines of code:

fn main() {
    println!("Hello, world!");
}

The program is run via cargo:

cargo run

Output

Hello, world!

Takeaways

Rust's println! is similar to Python's print function, but it's a macro, not a function. It simply prints the formatted text to standard output, followed by a newline character.

Macros are a powerful Rust feature that allow you to write code that writes other code. We'll see more examples of macros in later pieces, but for now, it's enough to know that in Rust, macros are invoked with an exclamation mark ! at the end of their name.
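To make "code that writes other code" a little more concrete, here's a minimal sketch of a user-defined macro next to the built-in format! and println! macros. The square! macro is a hypothetical example that isn't part of the hello_world piece.

```rust
// A hypothetical macro, defined only for this illustration: at compile time,
// it expands to code that multiplies the given expression by itself.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

fn main() {
    // format! and println! are macros from the standard library;
    // square! is the macro defined above
    let message = format!("{} squared is {}", 4, square!(4));
    println!("{}", message); // 4 squared is 16
}
```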

Introduction

This piece is meant to be a quick introduction to simple constructs that are more or less similar between Python and Rust.

The following constructs are covered:

| Python | Rust |
|---|---|
| Protocols/special methods | Traits |
| Enumerate | Enumerate |
| Zip | Zip |
| Tuple | Tuples |
| Lambdas | Closures |
| List comprehensions | Map/filter |
| Dictionary | HashMap |
| Set | HashSet |

The code is available in the src/intro directory of the repo.

Rust's traits don't have a direct equivalent in Python, but they are similar enough to protocols or special methods in that they allow us to define a set of methods that a type must implement, allowing us to customize the behavior of the type.
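As a rough sketch of what "a set of methods that a type must implement" looks like in practice, consider the following user-defined trait. The names Describe and Dog are hypothetical and are used purely for illustration.

```rust
// `Describe` and `Dog` are hypothetical names used only for illustration.
trait Describe {
    // Every type that implements the trait must provide this method
    fn name(&self) -> String;

    // A default method that implementing types get for free
    fn describe(&self) -> String {
        format!("This is {}", self.name())
    }
}

struct Dog;

impl Describe for Dog {
    fn name(&self) -> String {
        String::from("a dog")
    }
}

fn main() {
    println!("{}", Dog.describe()); // This is a dog
}
```

If the name method were left out of the impl block, the code wouldn't compile, which is how the compiler enforces the contract that the trait describes.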

Rust embraces functional programming more than Python does, so it has a number of functional constructs that are commonly used. Where Python prefers list comprehensions, Rust prefers map/filter. Rust's closures are, at the surface level, similar enough to Python's lambda functions, but they are also a lot more complex and can be viewed as a superset of anonymous functions.

Hopefully, as you read through the examples, you'll see that Rust and Python are not as different as they may seem at first glance!
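As a quick taste of the map/filter style mentioned above, here's a minimal sketch that mirrors a Python list comprehension. The function name double_adult_ages and the threshold of 18 are assumptions made for this example.

```rust
// Rough equivalent of the Python list comprehension:
//   [age * 2 for age in ages if age >= 18]
// The function name and the threshold are hypothetical.
fn double_adult_ages(ages: &[u32]) -> Vec<u32> {
    ages.iter()
        .copied() // turn the &u32 items into u32 values
        .filter(|&age| age >= 18) // keep only the items that pass the test
        .map(|age| age * 2) // transform each remaining item
        .collect() // gather the results into a new vector
}

fn main() {
    println!("{:?}", double_adult_ages(&[12, 18, 30])); // [36, 60]
}
```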

Protocols vs. Traits

Python has a concept called protocols, sometimes referred to as special methods, or "dunder methods" implemented on built-in types in the standard library. For example, the __str__ method is used to implement the str() function, which returns the string representation of an object. The __repr__ method is used to implement the repr() function, which returns a string containing a printable representation of an object.

Python: Protocols

In Python, we start by defining a simple Person class that has a name and an age attribute. To make the output of the print statement more interesting, we implement the following __str__ and __repr__ methods that are translated to the str() and repr() functions respectively.

class Person:
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        # Check the type first, so that a non-integer age raises ValueError
        if isinstance(age, int) and age > 0:
            self.age = age
        else:
            raise ValueError("Age must be a positive integer")

    def __str__(self) -> str:
        return f"{self.name} is {self.age} years old"

    def __repr__(self) -> str:
        return f"Person: {self.name}, {self.age}"

One limitation of Python's type system that's worth noting is that it has a single int type for all integers, with no separate unsigned integer type. In this case, the age of a person should be a positive integer, so we need to check for this by using an if statement in the class constructor defined in the __init__ method. Rust's type system, as we'll see, is more powerful, while also being stricter than Python's.

We can now create a Person object via a function and print it to the console by running the code via main.py.

def run1() -> None:
    person = Person("Megan", 28)
    print(person)
    print(repr(person))
    """
    Megan is 28 years old
    Person: Megan, 28
    """

When we print the person object, the __str__ method is called, and when we print repr(person), the __repr__ method is called, thus producing slightly different outputs depending on what we want to display. Generally, repr() is used for debugging, and str() is used for displaying something to the user.

Rust: Traits

In Rust, we start by defining a Person struct with a name and an age attribute, in a similar way to the Python example.

struct Person {
    name: String,
    age: u8,
}

Unlike a Python class which always provides __init__, Rust doesn't provide constructors on structs, so we need to define an implementation block (shown below) for the Person struct via the impl keyword.

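As noted earlier, Rust allows us to declare the age variable as an unsigned 8-bit integer (u8), which is more appropriate for this use case. Negative ages (and ages above 255) cannot be represented by this type, eliminating the need to check for them in the constructor. Note that, unlike in the Python version, an age of zero is still permitted. This makes the code more concise and easier to read in this case.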

impl Person {
    fn new(name: &str, age: u8) -> Self {
        Self {
            name: name.to_string(),
            age,
        }
    }
}

Two things stand out in the impl block defined. We provide an argument &str, which represents a string slice, and we use the to_string() method to convert the string slice to a String type.

Because Rust is a statically typed language, it needs to know the type and size of all variables at compile time. When we input a person's name during initialization, we don't know how long the name will be, so the struct stores it as an owned, heap-allocated String. The constructor, however, takes a borrowed string slice (&str), which lets the caller pass in a string literal without giving up ownership of anything, and to_string() copies the slice into a new String. The compiler keeps track of all this, so if you forget to call the to_string() method, you'll get a nice compiler error!
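The difference between a borrowed string slice and an owned String can be sketched as follows. The shout function is a hypothetical helper written for this illustration and isn't part of the piece's code.

```rust
// A hypothetical helper for illustration: it borrows a string slice and
// returns a newly allocated, owned String.
fn shout(name: &str) -> String {
    name.to_uppercase()
}

fn main() {
    // A string literal is a borrowed slice
    let literal: &str = "Megan";
    // to_string() copies the slice into an owned, heap-allocated String
    let owned: String = literal.to_string();

    // Both a &str and a borrowed String can be passed where a &str is expected
    println!("{}", shout(literal)); // MEGAN
    println!("{}", shout(&owned)); // MEGAN

    // `owned` is still usable here because it was only borrowed above
    println!("{}", owned); // Megan
}
```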

Rust has its own versions of Python's __str__ and __repr__ methods, but they're called Display and Debug traits. A trait is similar to an interface in other languages, and vaguely similar to a protocol in Python, because it describes an object's behavior.

use std::fmt;

impl fmt::Display for Person {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} is {} years old", self.name, self.age)
    }
}

impl fmt::Debug for Person {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Person: {}, {}", self.name, self.age)
    }
}

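The Display and Debug traits are not automatically implemented for custom types in Rust. Display must always be implemented by hand, because only the author of a type knows how it should be presented to a user. Debug can either be implemented by hand, as shown above, or generated by the compiler by adding the #[derive(Debug)] attribute to the struct.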

With these bits in place, we can now create a Person object via a function and print it to the console by running the code via main.rs.

fn run1() {
    let p = Person::new("Megan", 28);
    println!("{}", p);
    println!("{:?}", p);
    /*
    Megan is 28 years old
    Person: Megan, 28
    */
}

Note that in Rust, for printing Debug traits, we use the {:?} format specifier, whereas for Display traits, we can just use {}.

The above output is identical to the Python output!
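When a custom Debug format isn't needed, the trait can be derived rather than written by hand. The sketch below uses a hypothetical Point struct (not part of this piece) to show what the derived output looks like.

```rust
// `Point` is a hypothetical type used only for illustration.
// The derive attribute asks the compiler to generate the Debug implementation.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    // The derived format prints the struct name followed by its fields
    println!("{:?}", p); // Point { x: 1, y: 2 }
}
```

Display has no derive equivalent in the standard library, so using {} on Point would fail to compile until Display is implemented for it.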

Takeaways

  • Rust's type system is stricter and more powerful than Python's, allowing us to define unsigned integers and other types that are not available in Python's standard library.
  • Python is object-oriented, so it uses classes in many cases to keep related data and methods together.
  • Rust isn't an object-oriented language and doesn't use classes, but it does have the concept of traits and implementations.
  • In Rust, constructors aren't defined on custom structs, so we need to define the struct's constructor via an implementation block using the impl keyword.
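The first takeaway can be sketched with a small example. When an age arrives as an arbitrary integer at runtime (say, from user input), the conversion to u8 has to be made explicit, and it can fail. The parse_age function and its error message are assumptions made for this illustration.

```rust
use std::convert::TryFrom;

// Hypothetical helper: converts an untrusted integer into a u8 age.
// Negative values and values above 255 are rejected by the conversion.
fn parse_age(raw: i64) -> Result<u8, String> {
    u8::try_from(raw).map_err(|_| format!("{} is not a valid age", raw))
}

fn main() {
    println!("{:?}", parse_age(28)); // Ok(28)
    println!("{:?}", parse_age(-1)); // Err("-1 is not a valid age")
}
```

This plays the same role as the if statement in the Python constructor, except that the range check comes from the type itself.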

Enumerate

In both Python and Rust, the enumerate function exists to iterate over a list while keeping track of the index of the current item.

Python

Recall from the first example that we defined a Person class with a name and an age attribute.

We can instantiate a list of Person objects and iterate over them using enumerate.

def run2() -> None:
    persons = [Person("James", 33), Person("Salima", 31)]
    for i, person in enumerate(persons):
        print(f"Person {i}: {str(person)}")

Running the above function via main.py gives us the following output:

Person 0: James is 33 years old
Person 1: Salima is 31 years old

The enumerate function returns an iterator that yields an (index, item) tuple for each item in the list, allowing us to access the index of the current item as we iterate over the list in a for loop.

Rust

Recall from the first example that we defined a Person struct with a name and an age attribute, in a similar way to the Python example.

We can instantiate a vector of Person objects and iterate over them using enumerate.

What is a vector? Like many other languages, Rust provides arrays, but arrays in Rust are fixed-size and allocated on the stack. Vectors are dynamic arrays that are allocated on the heap, and can grow and shrink as needed, similar to Python lists.

For most purposes, vectors in Rust perform the same function as Python lists. Unlike a Python list, a vector in Rust can only contain objects of the same type, in this case, Person.

fn run2() {
    let persons = vec![Person::new("James", 33), Person::new("Salima", 31)];
    for (i, p) in persons.iter().enumerate() {
        println!("Person {}: {}", i, p)
    }
}

The vec! macro creates a new vector that's initialized with the given elements, which is more concise than calling Vec::new() and pushing each Person object onto it. Additionally, the iter method returns an iterator over references to the vector's items, which is required before we can call the enumerate method on it.

Running the above function via main.rs gives us the same output as in Python:

Person 0: James is 33 years old
Person 1: Salima is 31 years old

Takeaways

  • Both Python and Rust contain a convenience method called enumerate to iterate over a list while keeping track of the index of the current item.
  • Python lists are dynamic arrays that can contain objects of any type.
  • Rust vectors are heap-allocated dynamic arrays that can only contain objects of the same type.
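The following minimal sketch shows a vector growing and shrinking, along with enumerate used in a map/collect chain. The label_names function is a hypothetical helper that isn't part of the piece's code.

```rust
// Hypothetical helper: labels each name with its position, mirroring the
// Python expression [f"{i}: {name}" for i, name in enumerate(names)]
fn label_names(names: &[&str]) -> Vec<String> {
    names
        .iter()
        .enumerate()
        .map(|(i, name)| format!("{}: {}", i, name))
        .collect()
}

fn main() {
    // Start with an empty vector and grow it one item at a time
    let mut names: Vec<&str> = Vec::new();
    names.push("James");
    names.push("Salima");
    // The vec! macro builds the same vector in one step
    assert_eq!(names, vec!["James", "Salima"]);

    // Vectors can shrink as well as grow
    names.push("Extra");
    names.pop();

    println!("{:?}", label_names(&names)); // ["0: James", "1: Salima"]
}
```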

Zip

In both Python and Rust, the zip function exists to construct an iterator over two or more iterables.

Python

Recall from the first example that we defined a Person class with a name and an age attribute.

Consider that we have two lists, one containing names and one containing ages. zip conveniently allows us to iterate over both lists.

def run3() -> None:
    names = ["Alice", "Charlie"]
    ages = [24, 45]
    persons = []
    for name, age in zip(names, ages):
        person = Person(name, age)
        persons.append(person)
    print(f"{repr(persons)}")

The append method is used to add a new item to the end of the list, similar to push in Rust.

Running the above function via main.py gives us the following output:

[Person: Alice, 24, Person: Charlie, 45]

Note that the zip function returns an iterator of tuples that stops as soon as the shortest iterable passed to it is exhausted. So, if we'd passed one list with 3 items and one list with 2 items, the resulting iterator would yield 2 tuples.

Rust

Recall from the first example that we defined a Person struct with a name and an age attribute, in a similar way to the Python example.

Consider that we have two arrays, one containing names and one containing ages. zip conveniently allows us to iterate over both arrays.

fn run3() {
    let names = ["Alice", "Charlie"];
    let ages = [24, 45];
    let mut persons = vec![];
    for (name, age) in names.iter().zip(ages.iter()) {
        persons.push(Person::new(name, *age));
    }
    println!("{:?}", persons);
}

  • The zip method can only be called on an iterator, so we need to call iter on names before we can call zip. We also call iter on ages, so both sides yield references (hence the *age dereference).
  • The push method is used to add a new item to the end of the vector, just like append in Python.

Again, there's no need to "remember" any of this: the Rust compiler is super helpful in calling you out on common mistakes, while offering a helpful solution!

Running the function via main.rs gives us the same output as in Python:

[Person: Alice, 24, Person: Charlie, 45]

Takeaways

The functionality of zip is largely the same in both Python and Rust.

There really aren't too many differences, but it's worth noting that Rust's zip is a method defined on iterators, so we first need to obtain an iterator (for example, via iter) before calling it. Python's zip function, on the other hand, can be called directly on any iterable (lists, tuples, dictionaries, and so on).
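As in Python, zip in Rust stops as soon as the shorter of the two inputs runs out. The sketch below illustrates this with a hypothetical pair_up function that isn't part of the piece's code.

```rust
// Hypothetical helper: pairs each name with an age. If the inputs differ in
// length, zip stops once the shorter one is exhausted.
fn pair_up(names: &[&str], ages: &[u8]) -> Vec<(String, u8)> {
    names
        .iter()
        // The argument to zip can be anything that converts into an iterator
        .zip(ages)
        .map(|(name, age)| (name.to_string(), *age))
        .collect()
}

fn main() {
    // Three names but only two ages: the third name is dropped
    let pairs = pair_up(&["Alice", "Charlie", "Zoe"], &[24, 45]);
    println!("{:?}", pairs); // [("Alice", 24), ("Charlie", 45)]
}
```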

Tuple unpacking

Both Python and Rust support tuple unpacking in similar ways.

Python

Consider the following function in which we unpack the youngest and oldest age from a sorted tuple of ages:

def run4() -> None:
    sorted_ages = (18, 41, 65)
    youngest, _, oldest = sorted_ages
    print(f"Youngest age: {youngest}, oldest age: {oldest}")
    print(f"Middle age: {sorted_ages[1]}")

By convention, _ is used in Python as a throwaway variable name that indicates that we don't care about the value. Note that _ is still an ordinary variable that's bound to the middle value; it's simply never used. We can still access the middle age via the index operator for tuples, sorted_ages[1].

Running the above function via main.py gives us the following output:

Youngest age: 18, oldest age: 65
Middle age: 41

Rust

We can write the following function in which we unpack the youngest and oldest age from a sorted tuple of ages:

fn run4() {
    let sorted_ages: (u8, u8, u8) = (18, 41, 65);
    let (youngest, _, oldest) = sorted_ages;
    println!("Youngest age: {}, oldest age: {}", youngest, oldest);
    println!("Middle age: {}", sorted_ages.1);
}

Just like in Python, the _ indicates that we don't care about the middle value. The difference is that in Rust, _ is a wildcard pattern that doesn't bind the value to a variable at all. Also, we annotate the type of each age element as an unsigned 8-bit integer (u8); without the annotation, Rust would infer its default integer type, i32.

The index operator for tuples in Rust is ., so we can access the middle age via sorted_ages.1.

Running the function via main.rs gives us the same output as in Python:

Youngest age: 18, oldest age: 65
Middle age: 41

Takeaways

  • Tuple unpacking is largely the same in Python and Rust.
  • There are some minor differences between Python and Rust tuples:
    • In Rust, the elements of a tuple can be modified if the tuple is declared with let mut, while in Python, tuples are always immutable (lists are mutable in Python).
    • In Rust, the index operator for tuples is ., while in Python, it's [].
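These differences can be sketched in a few lines. The swap_ends function below is a hypothetical helper written for this illustration.

```rust
// Hypothetical helper: returns a copy of the tuple with its first and last
// elements swapped.
fn swap_ends(ages: (u8, u8, u8)) -> (u8, u8, u8) {
    // Elements can only be reassigned because the binding is declared `mut`
    let mut swapped = ages;
    swapped.0 = ages.2;
    swapped.2 = ages.0;
    swapped
}

fn main() {
    let sorted_ages: (u8, u8, u8) = (18, 41, 65);
    // `_` is a wildcard: the middle value is never bound to a name
    let (youngest, _, oldest) = sorted_ages;
    println!("{} {}", youngest, oldest); // 18 65

    // A tuple of u8 values is copied when passed by value, so `sorted_ages`
    // remains usable afterwards
    println!("{:?}", swap_ends(sorted_ages)); // (65, 41, 18)
    println!("{}", sorted_ages.1); // 41
}
```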

Lambdas vs. closures

Anonymous functions are functions that are not bound to a name. In Python, they are called lambdas. In Rust, they are called closures. Both are useful for short, one-off functions that are not used anywhere else.

Python

Recall from the first example that we defined a Person class with a name and an age attribute.

In the following example, we use the sorted function to sort a list of Person objects by their age.

def run5() -> None:
    persons = [Person("Aiko", 41), Person("Rohan", 18)]
    sorted_by_age = sorted(persons, key=lambda person: person.age)
    youngest_person = sorted_by_age[0]
    print(f"{youngest_person.name} is the youngest person at {youngest_person.age} years old")

The sorted function takes an optional key argument, which is a function that is called on each item in the list to determine the value to sort by. In this case, we use a lambda to return the age attribute of each Person object.

Rohan is the youngest person at 18 years old

Rust

Recall from the first example that we defined a Person struct with a name and an age attribute, in a similar way to the Python example.

In the following example, we use the sort_by_key method to sort a vector of Person objects by their age.

fn run5() {
    let mut persons = vec![Person::new("Aiko", 41), Person::new("Rohan", 18)];
    // Sort by age
    persons.sort_by_key(|p| p.age);
    let youngest_person = persons.first().unwrap();
    println!(
        "{} is the youngest person at {} years old",
        youngest_person.name, youngest_person.age
    );
}

The sort_by_key method takes a closure that is called on each item in the vector to determine the value to sort by. In this case, we use the closure |p| p.age, in which the parameter p is written between vertical bars, to return the age attribute of each Person object.

Rohan is the youngest person at 18 years old

Takeaways

  • Lambdas and closures are anonymous functions that are not bound to a name, and are commonly passed as arguments to other functions.
  • Lambdas and closures are useful for short, one-off functions that are not used anywhere else.
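  • Rust closures are more flexible than Python lambdas: they can contain multiple statements, and the compiler tracks how they capture variables from their environment (by reference, by mutable reference or by value). A deeper treatment is out of scope for this book, but you can read more about it in the closures chapter of the Rust book.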
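For a brief glimpse of what capturing the environment means, consider the sketch below. The older_than function and the counter are hypothetical examples that aren't part of the piece's code.

```rust
// Hypothetical helper: returns a closure that captures `min_age` by value
// (via the `move` keyword), so the closure can outlive this function call.
fn older_than(min_age: u8) -> impl Fn(u8) -> bool {
    move |age| age > min_age
}

fn main() {
    let is_adult = older_than(17);
    println!("{}", is_adult(18)); // true
    println!("{}", is_adult(10)); // false

    // A closure that captures `count` by mutable reference
    let mut count = 0;
    let mut increment = || count += 1;
    increment();
    increment();
    println!("{}", count); // 2
}
```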

Single line if-else

Both Python and Rust support single line if-else statements. This is especially useful when performing simple operations on a value, allowing for more concise code.
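In Rust, if is an expression, so a single-line if-else can be used anywhere a value is expected. The minimal sketch below uses two hypothetical functions, is_leap_year and leap_label, and applies the full Gregorian leap year rule rather than the simplified check used in this piece.

```rust
// Hypothetical helper implementing the full Gregorian rule: a year is a leap
// year if it's divisible by 4 but not by 100, or if it's divisible by 400.
fn is_leap_year(year: u16) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

// Hypothetical helper: the if-else is an expression whose value is returned.
fn leap_label(year: u16) -> &'static str {
    if is_leap_year(year) { "leap year" } else { "common year" }
}

fn main() {
    println!("2024 is a {}", leap_label(2024)); // 2024 is a leap year
    println!("1900 is a {}", leap_label(1900)); // 1900 is a common year
}
```

Both branches of the if-else must evaluate to the same type, which the compiler checks for us.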

Python

Consider the following function in which we print a message depending on whether a person is born in a leap year or not.

To do this, we first define a function approx_year_of_birth that returns the person's approximate year of birth, computed from the current year and their age.

from datetime import datetime

def approx_year_of_birth(person: Person) -> int:
    birth_year_approx = datetime.now().year - person.age
    return birth_year_approx

The leap year logic used in the function below is simplistic and does not account for edge cases. It's used here purely for the purposes of illustration.

We can use this function after initializing a list of Person objects.

def run6() -> None:
    persons = [Person("Josephine", 20), Person("Wesley", 31)]
    for person in persons:
        # Check if person is born in a leap year using simplistic leap year logic
        birth_year = approx_year_of_birth(person)
        person_is_born_in_leap_year = True if birth_year % 4 == 0 else False
        print(f"{person}. Born in a leap year?: {person_is_born_in_leap_year}")

Running the above function via main.py gives us the following output:

Josephine is 20 years old. Born in a leap year?: True
Wesley is 31 years old. Born in a leap year?: False
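
A conditional expression can return any value, not only a boolean. The sketch below pairs it with the full Gregorian leap year rule, which the simplistic check above omits. Here is_leap_year is a hypothetical helper name, and the standard library's calendar.isleap is used only as a cross-check.

```python
import calendar


def is_leap_year(year: int) -> bool:
    # Full Gregorian rule: divisible by 4, except century years,
    # which must also be divisible by 400
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


for year in (1900, 2000, 2023, 2024):
    # Conditional expression: <value_if_true> if <condition> else <value_if_false>
    label = "a leap year" if is_leap_year(year) else "not a leap year"
    print(f"{year} is {label}")
    # Cross-check against the standard library
    assert is_leap_year(year) == calendar.isleap(year)
```

Because the condition already evaluates to a boolean, the helper returns it directly rather than wrapping it in True if ... else False.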

Rust

We can define the below function in Rust, where we print a message depending on whether a person is born in a leap year or not.

use chrono::prelude::*;

fn approx_year_of_birth(person: &Person) -> u16 {
    let now = chrono::Utc::now();
    let year = now.year() - (person.age as i32);
    year as u16
}

Note that in Rust, we need to use the chrono crate to handle datetimes, unlike in Python where the datetime module comes with the standard library.

We then use this function after initializing a vector of Person objects.

fn run6() {
    let persons = vec![Person::new("Josephine", 20), Person::new("Wesley", 31)];
    for person in persons {
        // check if person is born in a leap year using simplistic leap year logic
        let birth_year = approx_year_of_birth(&person);
        let person_is_born_in_leap_year = birth_year % 4 == 0;
        println!(
            "{}. Born in a leap year?: {}",
            person, person_is_born_in_leap_year
        );
    }
}

Running the function via main.rs gives us the same output as in Python:

Josephine is 20 years old. Born in a leap year?: true
Wesley is 31 years old. Born in a leap year?: false

Takeaways

  • Single line if-else statements are useful for performing simple operations on a value while remaining concise.
  • In some cases, Rust requires an external crate (such as chrono for datetimes) for functionality that Python provides in its standard library.

List comprehensions vs map/filter

One of Python's most popular features is its list comprehensions. They are a concise way to create lists from existing lists. Rust leans more towards a functional style than Python, so it achieves the same result by chaining the map and filter methods on iterators. Although map and filter functions are available in Python, they are not as commonly used as list comprehensions.

Python

Consider the following function in which we print a message depending on which persons from a list of Person objects are born after the year 1995, based on their current age.

def run7() -> None:
    """
    1. List comprehensions
    """
    persons = [Person("Issa", 39), Person("Ibrahim", 26)]
    persons_born_after_1995 = [
        (person.name, person.age) for person in persons if approx_year_of_birth(person) > 1995
    ]
    print(f"Persons born after 1995: {persons_born_after_1995}")

The list comprehension in the above function essentially does the following:

  1. Iterate over the list of Person objects
  2. For each person, check if their approximate year of birth is greater than 1995
  3. If the above condition is true, create a tuple of their name and age

Running the above function via main.py gives us the following output:

Persons born after 1995: [('Ibrahim', 26)]
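
For comparison with the Rust version that follows, the same selection can be written with Python's built-in filter and map. This is a sketch under assumptions: Person is a hypothetical stand-in for the class defined earlier, and approx_year_of_birth takes a fixed current_year parameter (not in the original) so that the result does not depend on when the code is run.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical stand-in for the Person class defined earlier
    name: str
    age: int


def approx_year_of_birth(person: Person, current_year: int = 2023) -> int:
    # current_year is an assumed, fixed reference year for reproducibility
    return current_year - person.age


persons = [Person("Issa", 39), Person("Ibrahim", 26)]

# List comprehension: filter and transform in one expression
comprehension = [
    (person.name, person.age) for person in persons if approx_year_of_birth(person) > 1995
]

# Equivalent pipeline: filter first, then map, then materialize as a list
functional = list(
    map(
        lambda person: (person.name, person.age),
        filter(lambda person: approx_year_of_birth(person) > 1995, persons),
    )
)
print(functional)
```

Both forms produce the same list; the comprehension is usually preferred in Python because the condition and the transformation read in one place.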

Rust

We can define the below function in Rust, where we print a message depending on which persons from a vector of Person objects are born after the year 1995, based on their current age.

fn run7() {
    let persons = vec![Person::new("Issa", 39), Person::new("Ibrahim", 26)];
    let result = persons
        .into_iter()
        .filter(|p| approx_year_of_birth(p) > 1995)
        .map(|p| (p.name, p.age))
        .collect::<Vec<(String, u8)>>();
    println!("Persons born after 1995: {:?}", result);
}

The filter and map functions in the above function essentially do the following:

  1. Turn the persons vector into an iterator and iterate over the Person objects
  2. For each person, check if their approximate year of birth is greater than 1995
  3. If the above condition is true, then create a tuple of their name and age
  4. Collect all the tuples into a vector of (String, u8) tuples

Running the function via main.rs gives us the same output as in Python:

Persons born after 1995: [("Ibrahim", 26)]

The Rust version is a little more verbose than the Python version, but it's still quite readable.

Takeaways

  • Both Python and Rust have convenient ways to create iterables without having to use explicit loops.
  • Python's list comprehensions are more concise than Rust's map and filter functions in most cases.
  • Rust's map and filter iterator methods reflect its more functional style compared to Python.

Dicts vs. hashmaps

Python's dict is essentially a hash table, which is a data structure that maps keys to values. Rust's HashMap performs the same function. Both are collections of key-value pairs where the keys must be unique, but the values can be duplicated. The purpose of dicts and hashmaps is to allow for fast lookup of values by key.

Python

Consider the below function in Python, where we define a dictionary of processors and their corresponding market names.

def run8() -> None:
    processors = {
        "13900KS": "Intel Core i9",
        "13700K": "Intel Core i7",
        "13600K": "Intel Core i5",
        "1800X": "AMD Ryzen 7",
        "1600X": "AMD Ryzen 5",
        "1300X": "AMD Ryzen 3",
    }

    # Check for presence of value
    is_item_in_dict = "AMD Ryzen 3" in processors.values()
    print(f'Is "AMD Ryzen 3" in the dict of processors?: {is_item_in_dict}')
    # Lookup by key
    key = "13900KS"
    lookup_by_key = processors[key]
    print(f'Key "{key}" has the value "{lookup_by_key}"')

The first portion checks for the presence of a value in the dictionary, while the second portion looks up the value by key.

Running the above function via main.py gives us the following output:

Is "AMD Ryzen 3" in the dict of processors?: True
Key "13900KS" has the value "Intel Core i9"
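
Indexing with square brackets raises a KeyError when the key is absent. As a small sketch, the dict.get method returns a default instead, and the in operator tests for a key directly. The key "9999X" is a made-up example of a missing entry.

```python
processors = {
    "13900KS": "Intel Core i9",
    "1800X": "AMD Ryzen 7",
}

# `in` on a dict checks keys, not values
print("13900KS" in processors)

# get() returns None (or a supplied default) when the key is missing
missing = processors.get("9999X")
fallback = processors.get("9999X", "unknown processor")
print(missing, fallback)

# Square-bracket lookup raises KeyError for a missing key
try:
    processors["9999X"]
except KeyError as err:
    print(f"KeyError: {err}")
```

Using get is a reasonable choice when a missing key is an expected situation rather than a bug.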

Rust

We define the below function in Rust, where we define a hashmap of processors and their corresponding market names.

use std::collections::HashMap;

fn run8() {
    let mut processors = HashMap::new();
    processors.insert("13900KS", "Intel Core i9");
    processors.insert("13700K", "Intel Core i7");
    processors.insert("13600K", "Intel Core i5");
    processors.insert("1800X", "AMD Ryzen 7");
    processors.insert("1600X", "AMD Ryzen 5");
    processors.insert("1300X", "AMD Ryzen 3");

    // Check for presence of value
    let value = "AMD Ryzen 3";
    let mut values = processors.values();
    println!(
        "Is \"AMD Ryzen 3\" in the hashmap of processors?: {}",
        values.any(|v| v == &value)
    );
    // Lookup by key
    let key = "13900KS";
    let lookup_by_key = processors.get(key);
    println!(
        "Key \"{}\" has the value \"{}\"",
        key,
        lookup_by_key.unwrap()
    );
}

Just like in the Python version, the first portion checks for the presence of a value in the hashmap, while the second portion looks up the value by key.

Running the function via main.rs gives us the same output as in Python:

Is "AMD Ryzen 3" in the hashmap of processors?: true
Key "13900KS" has the value "Intel Core i9"

Takeaways

Python and Rust contain collections that store key-value pairs for fast lookups. A key difference is that a Python dict can mix keys of any hashable type and values of any type, whereas in a Rust HashMap all keys must share one type and all values must share one type (the key type and the value type can differ from each other).

In Python, this dict is perfectly valid:

# You can have a dict with keys of different types
example = {
    "a": 1,
    1: 2
}

In Rust, the compiler enforces that every key has the same type and every value has the same type. When no types are annotated, they are inferred from the first entry.

let mut example = HashMap::new();
example.insert("a", 1);
// This errors because the first entry specified the key as &str
example.insert(1, 2);
// This is valid
example.insert("b", 2);

Sets vs. hashsets

Python's set is an unordered collection of unique items, where duplicate items are not allowed. Rust's HashSet performs the same function.

Python

Consider the following function in which we define a set of processors.

def run9() -> None:
    processors = {
        "Intel Core i9",
        "Intel Core i7",
        "Intel Core i5",
        "AMD Ryzen 7",
        "AMD Ryzen 5",
        "AMD Ryzen 3",
    }
    # Duplicate values are ignored
    processors.add("Intel Core i7")
    processors.add("AMD Ryzen 5")
    # Check for presence of value
    is_item_in_set = "AMD Ryzen 3" in processors
    print(f'Is "AMD Ryzen 3" in the set of processors?: {is_item_in_set}')

The purpose of the above function is to check for the presence of a value in the set of processors. When we add duplicate values to the set, they are ignored.

Running the above function via main.py gives us the following output:

Is "AMD Ryzen 3" in the set of processors?: True
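
To see that duplicates really are ignored, one option is to compare the size of the set before and after the add calls. The sketch below uses a shortened set of processors and also builds a set from a list, a common way to deduplicate values.

```python
processors = {"Intel Core i9", "Intel Core i7", "AMD Ryzen 5"}
size_before = len(processors)

# Adding values that are already present leaves the set unchanged
processors.add("Intel Core i7")
processors.add("AMD Ryzen 5")
size_after = len(processors)
print(size_before, size_after)

# Building a set from a list drops repeated items
unique_brands = set(["Intel", "AMD", "Intel", "AMD"])
print(len(unique_brands))
```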

Rust

We define the below function in Rust, where we define a hashset of processors.

use std::collections::HashSet;

fn run9() {
    let mut processors = HashSet::new();
    processors.insert("Intel Core i9");
    processors.insert("Intel Core i7");
    processors.insert("Intel Core i5");
    processors.insert("AMD Ryzen 7");
    processors.insert("AMD Ryzen 5");
    processors.insert("AMD Ryzen 3");
    // Duplicate values are ignored
    processors.insert("Intel Core i7");
    processors.insert("AMD Ryzen 5");
    // Check for presence of value
    let value = "AMD Ryzen 3";
    println!(
        "Is \"AMD Ryzen 3\" in the hashset of processors?: {}",
        processors.contains(&value)
    );
}

The purpose of the above function is to check for the presence of a value in the hashset of processors. When we add duplicate values to the hashset, they are ignored.

Running the function via main.rs gives us the same output as in Python:

Is "AMD Ryzen 3" in the hashset of processors?: true

Takeaways

Python and Rust contain collections that allow for the storage of unique items. A key difference is that Python's set can contain items of any hashable type, while Rust's HashSet can only contain items of a single type, which is either specified at the time of initialization or inferred by the compiler.

In Python, the following set containing multiple types is valid, as they are all hashable.

example = {1, "hello", 3.14}
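
The hashability requirement can be seen by trying to add a mutable item. In this sketch, inserting a list raises a TypeError, while a tuple with the same contents is accepted.

```python
example = {1, "hello", 3.14}

# Lists are mutable and therefore unhashable, so they cannot be set items
try:
    example.add([1, 2])
    added_list = True
except TypeError:
    added_list = False

# Tuples of hashable items are hashable, so this works
example.add((1, 2))
print(added_list, (1, 2) in example)
```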

In Rust, the compiler enforces that all items in the set are of the same type specified at the time of initialization, or by inferring the first value's type.

let mut example = HashSet::new();
example.insert(1);
// This errors because the first value fixed the item type as an integer (i32 by default)
example.insert("hello");
// This is valid
example.insert(3);

Contributors

Authors

Rust in Pieces is co-authored by Prashanth Rao and Paul Sanders.

About the authors

  • Prashanth Rao is an A.I. engineer with a background in scientific computing, machine learning, NLP and database systems. He's passionate about making complex topics accessible to a larger audience. In his spare time, Prashanth actively experiments with open source tools, frameworks and databases, and writes about them on his blog.

  • Paul Sanders is a software engineer and consultant having spent decades doing data management and application development in healthcare, pharmaceuticals and biologic drug development. In his spare time, Paul loves contributing to open source software and is actively maintaining several OSS projects.

Additional contributors

Contributions and improvements from the community are welcome! Please see the contributing guidelines.