Rust in Pieces
A top-down approach to learning Rust coming from Python (and vice-versa).
Introduction
This book's goal is to bring the Python and Rust developer communities closer together, and to help more developers from one language learn how to leverage the benefits of the other.
The book and accompanying code are organized into a collection of small projects, termed pieces. Each piece is a self-contained task with Python and Rust implementations that each perform the same task. The aim is to help Python developers gain familiarity with Rust, and vice-versa, by comparing and contrasting the two languages in a top-down manner.
Importantly, the pieces in this book build towards unifying Python and Rust code bases via PyO3, a highly popular open source library that allows you to call Rust bindings from Python or the Python interpreter from Rust. Using one language does not preclude using the other!
As you go through the pieces, you'll find yourself becoming proficient in writing clean, tested, production-worthy code using engineering best practices in either language, moving between them at will. Over time, you can make a more informed choice regarding when to use one or the other language for parts of a larger project.
We believe that Rust 🦀 is among the most approachable lower-level programming languages for Python developers, and that Python is one of the most valuable high-level languages for Rust developers who are looking to build tooling for the burgeoning data, AI and ML ecosystems. The arrival of tools like PyO3 has made it highly feasible for a developer to straddle both worlds, combining their best parts, thus helping build more efficient and scalable software.
What's covered in this book?
Rust's learning curve is considerably steeper than Python's, so the table below is provided to show a mapping between each piece and its corresponding concept in Rust. As can be seen, structs, serialization, deserialization, vectors and traits are ubiquitous concepts in Rust.
Piece | Category | Key Rust concepts |
---|---|---|
Hello world | Intro | macros |
Data structures & constructs | Intro | crates, structs, traits, implementations |
Simple CSV parsing | Files | serde, vec |
Regex JSON | Files | match, regex |
Mock data generation | Files | RNG, sampling |
Age grouping | Files | enums |
Datetime parsing | Files | chrono, lifetimes |
Extract pronouns from text | Files | rayon, parallelism |
Postgres | Databases | async, sqlx, tokio |
Meilisearch | CLIs | async, async-std, clap |
REST API to Postgres | APIs | axum, async, tokio |
PyO3 mock data generation | Unification | PyO3, Maturin |
PyO3 parallel computation | Unification | PyO3, Maturin |
Why use Rust with Python?
Python is a dynamically typed, interpreted programming language that's known for its flexibility, ease of use and low barrier to entry. It's by far the most popular language for AI, ML and data science, and has been the go-to language for researchers and innovators in these fields for quite a while now.
It's possible to write relatively high-performance code in Python these days by leveraging its rich library ecosystem (which are typically wrappers around C/C++/Cython runtimes). However, performance and concurrency are not Python's strong suits, and this requires performance-critical code to be implemented in lower-level languages. For many Python developers, using languages like C, C++ and Cython is a daunting prospect.
Rust is a statically typed, compiled programming language that's known for its relatively steep learning curve. Its design philosophy is centered around three core principles: performance, safety, and fearless concurrency. It offers a modern, high-level syntax and a rich type system that makes it possible to write code that runs really fast without the need for manual memory management, eliminating entire classes of bugs.
Although it's possible to write all sorts of complex tools and applications in Rust, it's not the best option for every situation. In cases like research and prototyping, where speed of iteration is important, Rust's strict compiler can slow down development, and Python is still the better choice.
We believe that Python 🐍 and Rust 🦀 form a near-perfect pair to address either side of the so-called "two-world problem", explained below.
The two-world problem
The programming world often finds itself divided in two: those who prefer high-level, dynamically typed languages, and those who prefer low-level, statically typed languages.
Many high-level languages are interpreted (i.e., they execute each line as it's read, sequentially). These languages are generally easier to learn because they abstract away the details of memory management, allowing for rapid prototyping and development.
Lower-level languages, on the other hand, tend to be ahead-of-time (AOT) compiled. They offer the programmer more control over memory management, resulting in much more performant code at the cost of a steeper learning curve.
It's for these reasons that scientists, researchers, data scientists, data analysts, quants, etc. have traditionally preferred high-level languages like Python, R and Julia. On the other hand, systems programmers, OS developers, embedded systems engineers, game developers and software engineers tend to prefer lower-level languages like C, C++ and Rust.
The image above is a figurative representation of two distributions of people, typically disparate individuals from either background (with the languages listed in no specific order).
Has the two-world problem been solved before?
A lot of readers will have heard of Julia, a dynamically typed, just-in-time (JIT) compiled alternative to Python that is often touted as a "high-level language with the performance of C". While Julia is no doubt a great language, its popularity is largely limited to the scientific community, and its library ecosystem and user community haven't yet matured to the extent that Python's have. As such, the "two-language problem" that Julia attempts to solve is still largely unsolved.
Other languages like Mojo explain in their vision how they aim to solve the two-world problem by providing a single unified language (acting like a superset of Python) that can be compiled to run on any hardware. However, Mojo is still very much in its infancy as a language and hasn't gained widespread adoption, and its user community is non-existent.
Rust and PyO3
The most interesting aspect of PyO3 in combination with Rust is that they offer a new way to think about the two-world problem. Rather than trying to solve the problem by creating a new language that offers the best of many worlds, Rust and PyO3 embrace the problem by allowing a developer to move between the worlds and choose the best tool for parts of a larger task.
Rust's design philosophy and features make it an ideal candidate to bring people from these worlds (high-level and low-level languages) closer together. Rust's strict compiler, rich type system and ownership principles eliminate the need to manually manage memory without requiring a garbage collector, making it possible for a larger community of analytical and scientifically-minded developers to write high-performance code without sacrificing safety.
The image above shows a distribution of the same potential set of developers who can straddle both worlds. Those who are already proficient in Python and require fast iteration for prototyping can choose to write only very specific, performance-critical parts of their code in Rust. Conversely, those who are already proficient in Rust and require high-performance, safe code for their workflows can choose to interface with Python for only very specific parts that need access to the Python ecosystem.
In our view, the interface that PyO3 provides is fundamentally different from earlier approaches to interoperability with Python (such as pybind11, SWIG or Cython), because unlike the earlier tools, PyO3 and Rust are far more accessible to Python developers. We hope this becomes clearer and clearer as you progress through the book.
How to approach learning a new language
Typically, the first steps in learning a programming language involve understanding its syntax, data structures and control flow expressions before tackling a specific problem. This is termed bottom-up learning, and it's essential for understanding the terminology of the language and its ecosystem.
Bottom-up learning resources typically include:
- Books
- Documentation
- Tutorials
- Interactive exercises
Although necessary to master a topic, bottom-up learning has the issue that learners often get stuck in tutorial hell, where they're constantly being exposed to new concepts without actually building anything end-to-end. In addition, bottom-up learning can leave learners frustrated because they're not able to see the bigger picture and how the different parts come together as a whole.
In contrast, top-down learning follows a more pragmatic approach that's grounded in the real world. For this, a more abstract way of thinking is required. The learner identifies a problem statement in their domain and learns just enough of the language to solve the problem at hand, going deeper only as needed.
Top-down learning resources typically include learning by example, using any of the following resources:
- Existing codebases
- Blogs
- Podcasts
- Videos
Top-down learning is not a replacement for bottom-up learning. The best way to become proficient and productive in a language is to combine both approaches and do them together.
Don't reinvent the wheel
When learning a new language, it's tempting to start from scratch and build everything that's required to solve a problem from the ground up. This is a slow, sometimes painful process if you're just starting off, and can result in inefficient, unidiomatic code.
Both Rust and Python have rich package ecosystems, and this book leverages them to the fullest extent possible. After all, most great software is built on the shoulders of giants. Reading through existing codebases for libraries, tools and frameworks is a great way to learn how to write idiomatic code in a language, and to understand the performance implications of your code.
Prior reading
It's recommended to have a basic understanding of Python and Rust before reading this book. If you're new to either language, consider reading the following resources. It's okay if you don't absorb everything in the first pass -- the goal is to get a high-level understanding of either language and their ecosystems.
- Python Crash Course for hands-on Python concepts
- The Rust Book and Rust by Example for an introduction to terminology and concepts in Rust
- Rustlings for a gentle introduction to compiling and running Rust code
Once you have a handle on the terminology, you can start by getting your hands dirty with the pieces provided in this book. Or better yet, use this framework to create your own pieces for your domain of interest and start writing some code!
How to read this book
The pieces are not meant to be read in a specific order. However, they are roughly organized in order of increasing complexity, and each piece, by and large, utilizes concepts that may have been introduced in an earlier piece.
One of the challenges with learning (and teaching) Rust is that concepts such as ownership, borrowing, traits and lifetimes can be hard for a new learner to grasp, yet they are so ubiquitous in the language that they tend to appear all at once. Because the learning approach provided here is top-down, the best way to get familiar with these concepts is to try and apply them to your own projects, as done in each piece.
As such, we've tried to introduce these concepts in a way that's as gradual as possible, though it's still possible that you may find yourself having to refer to the Rust book or other resources to understand certain concepts from the bottom-up as you go along.
Setup & installation
This section provides an opinionated guide to setting up your development environment for working with Rust and Python. If you're an experienced developer in either language, feel free to skip this section.
Python
For macOS/Linux users, it's recommended to manage Python versions using pyenv. pyenv lets you easily switch between multiple versions of Python. It's simple, unobtrusive, and follows the UNIX tradition of single-purpose tools that do one thing well.
Follow the instructions from the installation steps section of the README to install pyenv on your system.
Windows users can use pyenv-win, a fork of pyenv that allows you to install and manage Windows-native Python versions.
Python version
This book uses Python 3.11.x, though the code should also run on Python 3.8+ without issues. You can install a recent 3.11 patch release using pyenv:
pyenv install 3.11.7
Virtual environments
It's recommended to use virtual environments to manage your Python dependencies. This allows you to create isolated environments for each project, and avoid dependency conflicts between projects.
The venv module is included in the Python standard library, so you don't need to install anything extra to use it.
To create a virtual environment on Unix systems, run the following command:
# Setup a new environment for the first time
python -m venv venv
# Activate the environment
source venv/bin/activate
On Windows, it's more or less the same:
py -m venv .venv
.venv\Scripts\activate
You can deactivate the environment by running deactivate in your shell.
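As a quick sanity check after activating an environment, a short script along the following lines can confirm which interpreter is in use. This is a minimal sketch and not part of the book's code: the function names in_virtualenv and meets_minimum_version are hypothetical, and the 3.8 floor is taken from the version note above.

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def meets_minimum_version(minimum: tuple = (3, 8)) -> bool:
    # sys.version_info compares element-wise against a plain tuple.
    return sys.version_info >= minimum


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    print(f"Virtual environment active: {in_virtualenv()}")
    print(f"Meets minimum version: {meets_minimum_version()}")
```

Running it once inside and once outside the activated environment should show the "Virtual environment active" line flip between True and False.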
Rust
For macOS/Linux users, rustup is the recommended tool to manage Rust versions. Using this tool, you can easily switch between multiple versions of Rust, and it also installs the cargo package manager.
See the Rust Book for instructions on how to install rustup on Windows.
Rust version
This book uses Rust 1.75.x. You can install this version using rustup:
rustup install 1.75.0
You can start a new Rust project in your local directory by running cargo new <project-name>, and you're ready to go!
Pieces
What is the purpose of each piece?
A piece is a self-contained project with Python and Rust implementations that each perform the same tasks. The purpose of each piece is to help Python developers gain familiarity with Rust, and vice-versa, by comparing and contrasting the two languages in a top-down manner.
A piece's directory structure is organized as follows:
pieces
├── intro
│   ├── python
│   │   ├── main.py
│   │   └── test_main.py
│   └── rust
│       ├── Cargo.toml
│       └── src
│           └── main.rs
├── hello_world
├── simple_csv
└── ...
Each piece comes with Python and Rust source code, and their associated tests. In Rust, unit tests are conventionally placed in the same file as the code they test, inside a module marked with the #[cfg(test)] attribute. In Python, tests are placed in a separate file whose name starts with the test_ prefix.
For Python, pip is the package manager of choice, and pytest is the test runner used throughout the code. For Rust, cargo is the package manager of choice, and Rust's built-in test runner, invoked by cargo test, is used throughout.
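To make the Python testing convention concrete, here is a minimal sketch of what a source file and its test file could look like. The greeting function and both test names are hypothetical and only serve as an illustration; in a real piece the two halves would live in main.py and test_main.py respectively.

```python
# main.py (hypothetical contents, for illustration only)
def greeting(name: str = "world") -> str:
    """Return the greeting that the program would print."""
    return f"Hello, {name}!"


# test_main.py (in a real project this file would start with
# `from main import greeting`)
def test_greeting_default() -> None:
    # pytest collects functions whose names start with `test_`
    assert greeting() == "Hello, world!"


def test_greeting_custom_name() -> None:
    assert greeting("Rust") == "Hello, Rust!"
```

With the files split as described, running pytest from the piece's python directory would discover and run both test functions.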
Hello world!
This is the conventional first program that you write when learning a new programming language. It's a simple program that prints the text Hello, world! to the console.
Navigate to the pieces/hello_world directory in the repo to get started.
Python
The file main.py has just one line of code:
print("Hello, world!")
The program is run as follows:
python main.py
Rust
The file main.rs has just three lines of code:
fn main() {
    println!("Hello, world!");
}
The program is run via cargo:
cargo run
Output
Hello, world!
Takeaways
Rust's println! is similar to Python's print function, but it's a macro, not a function. It prints its formatted arguments to standard output, followed by a newline character.
Macros are a powerful Rust feature that allow you to write code that writes other code. We'll see more examples of macros in later pieces, but for now, it's enough to know that in Rust, macros are invoked with an exclamation mark ! at the end of their name.
Introduction
This piece is meant to be a quick introduction to simple constructs that are more or less similar between Python and Rust.
The following constructs are covered:
Python | Rust |
---|---|
Protocols/special methods | Traits |
Enumerate | Enumerate |
Zip | Zip |
Tuple | Tuples |
Lambdas | Closures |
List comprehensions | Map/filter |
Dictionary | HashMap |
Set | HashSet |
The code is available in the src/intro directory of the repo.
Rust's traits don't have a direct equivalent in Python, but they are similar enough to protocols or special methods in that they allow us to define a set of methods that a type must implement, allowing us to customize the behavior of the type.
Rust embraces functional programming more than Python does, so it has a number of functional constructs that are commonly used. Where Python prefers list comprehensions, Rust prefers map/filter. Rust's closures are, on the surface, similar to Python's lambda functions, but they are also a lot more capable and can be viewed as a superset of anonymous functions.
Hopefully, as you read through the examples, you'll see that Rust and Python are not as different as they may seem at first glance!
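As a quick preview of the Python side of the table above, the sketch below shows a list comprehension next to its map/filter equivalent, along with a dictionary and a set. The variable names and values are made up for illustration and are not taken from the book's code.

```python
ages = [18, 41, 65, 12]

# List comprehension: the idiomatic Python form
adults_doubled = [age * 2 for age in ages if age >= 18]

# The same pipeline written with filter/map and lambdas,
# which is closer to how it reads in Rust
adults_doubled_functional = list(
    map(lambda age: age * 2, filter(lambda age: age >= 18, ages))
)

# Dictionary and set, whose Rust counterparts are HashMap and HashSet
age_by_name = {"Megan": 28, "James": 33}
unique_ages = set([33, 28, 33])  # duplicates are dropped

print(adults_doubled)
print(adults_doubled_functional)
print(age_by_name["Megan"], sorted(unique_ages))
```

Both pipelines produce the same list; the comprehension is usually preferred in Python because the filter and the transformation read in one pass.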
Protocols vs. Traits
Python has a concept called protocols, sometimes referred to as special methods or "dunder methods", which are implemented on built-in types in the standard library.
For example, the __str__ method is used to implement the str() function, which returns the string representation of an object.
The __repr__ method is used to implement the repr() function, which returns a string containing a printable representation of an object.
Python: Protocols
In Python, we start by defining a simple Person class that has a name and an age attribute.
To make the output of the print function more interesting, we implement the following __str__ and __repr__ methods, which are called by the str() and repr() functions respectively.
class Person:
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        # Check the type before comparing, so that non-integers also raise ValueError
        if isinstance(age, int) and age > 0:
            self.age = age
        else:
            raise ValueError("Age must be a positive integer")

    def __str__(self) -> str:
        return f"{self.name} is {self.age} years old"

    def __repr__(self) -> str:
        return f"Person: {self.name}, {self.age}"
One limitation of Python's type system that's worth noting is that it has a single int type for all integers, with no separate unsigned type. In this case, the age of a person should be a positive integer, so we need to check for this by using an if statement in the class constructor defined in the __init__ method. Rust's type system, as we'll see, is more powerful, while also being stricter than Python's.
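The effect of the runtime check can be seen with a small, self-contained sketch. The Person class here is a trimmed-down stand-in for the one in this piece, and try_create is a hypothetical helper added only to show which inputs are accepted.

```python
class Person:
    def __init__(self, name: str, age: int) -> None:
        # Check the type first so a non-integer never reaches the comparison
        if not isinstance(age, int) or age <= 0:
            raise ValueError("Age must be a positive integer")
        self.name = name
        self.age = age


def try_create(name: str, age) -> str:
    # Hypothetical helper: reports whether construction succeeded
    try:
        person = Person(name, age)
    except ValueError as err:
        return f"rejected: {err}"
    return f"created: {person.name}, {person.age}"


print(try_create("Megan", 28))
print(try_create("Megan", -5))
print(try_create("Megan", "28"))
```

Only the first call succeeds. The point of the comparison with Rust is that these failures surface at runtime in Python, whereas a typed field lets the compiler rule out some of them before the program runs.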
We can now create a Person object via a function and print it to the console by running the code via main.py.
def run1() -> None:
    person = Person("Megan", 28)
    print(person)
    print(repr(person))
    """
    Megan is 28 years old
    Person: Megan, 28
    """
When we print the person object, the __str__ method is called, and when we print repr(person), the __repr__ method is called, thus producing slightly different outputs depending on what we want to display.
Generally, repr() is used for debugging and logging, and str() is used for displaying something to the user.
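The same split between str() and repr() shows up on built-in and standard library types. The sketch below uses datetime.date purely as an example; the variable names are illustrative.

```python
from datetime import date

d = date(2024, 1, 15)

user_facing = str(d)        # readable form, which is what print() uses
developer_facing = repr(d)  # unambiguous form, which is what the REPL shows

# The !r conversion in an f-string calls repr() instead of str()
message = f"{d} vs {d!r}"

# Containers use repr() for their elements, even when passed to str()
in_list = str([d])

print(user_facing)       # 2024-01-15
print(developer_facing)  # datetime.date(2024, 1, 15)
print(message)
print(in_list)
```

The last line is worth remembering: printing a list of objects shows each element's __repr__, not its __str__.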
Rust: Traits
In Rust, we start by defining a Person struct with a name and an age attribute, in a similar way to the Python example.
struct Person {
    name: String,
    age: u8,
}
Unlike a Python class, which always provides __init__, Rust doesn't provide constructors on structs, so we need to define an implementation block (shown below) for the Person struct via the impl keyword.
As noted earlier, Rust allows us to declare the age field as an unsigned integer (u8), which is more appropriate for this use case: negative ages cannot be represented at all, eliminating the need for a runtime check in the constructor. Note that u8 still allows 0 and is capped at 255.
This makes the code more concise and easier to read in this case.
impl Person {
    fn new(name: &str, age: u8) -> Self {
        Self {
            name: name.to_string(),
            age,
        }
    }
}
Two things stand out in the impl block defined above. The name argument has the type &str, which represents a string slice, and we use the to_string() method to convert the string slice to a String type.
Because Rust is a statically typed language, it needs to know the type and size of every variable at compile time. A person's name can be of any length, so the struct stores it as a String, which is an owned, growable string allocated on the heap. The constructor accepts a borrowed string slice (&str) instead, so that callers can pass a string literal or borrow from an existing string without handing over ownership. The compiler keeps track of all this, so if you forget to call the to_string() method, you'll get a nice compiler error!
Rust has its own versions of Python's __str__ and __repr__ methods, but they're called the Display and Debug traits.
A trait is similar to an interface in other languages, and vaguely similar to a protocol in Python, because it describes an object's behavior.
use std::fmt;

impl fmt::Display for Person {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} is {} years old", self.name, self.age)
    }
}

impl fmt::Debug for Person {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Person: {}, {}", self.name, self.age)
    }
}
The Display and Debug traits are not implemented automatically for user-defined types in Rust. Display must always be implemented by hand, because the compiler can't know how a type should be presented to users. Debug can either be implemented by hand, as done here, or derived with #[derive(Debug)].
With these bits in place, we can now create a Person object via a function and print it to the console by running the code via main.rs.
fn run1() {
    let p = Person::new("Megan", 28);
    println!("{}", p);
    println!("{:?}", p);
    /*
    Megan is 28 years old
    Person: Megan, 28
    */
}
Note that in Rust, for printing Debug traits, we use the {:?} format specifier, whereas for Display traits, we can just use {}.
The above output is identical to the Python output!
Takeaways
- Rust's type system is stricter and more powerful than Python's, allowing us to define unsigned integers and other types that are not available in Python's standard library.
- Python is object-oriented, so it uses classes in many cases to keep related data and methods together.
- Rust isn't an object-oriented language and doesn't use classes, but it does have the concept of traits and implementations.
- In Rust, constructors aren't defined on custom structs, so we need to define the struct's constructor via an implementation block using the impl keyword.
Enumerate
In both Python and Rust, enumerate exists to iterate over a list while keeping track of the index of the current item.
Python
Recall from the first example that we defined a Person class with a name and an age attribute.
We can instantiate a list of Person objects and iterate over them using enumerate.
def run2() -> None:
    persons = [Person("James", 33), Person("Salima", 31)]
    for i, person in enumerate(persons):
        print(f"Person {i}: {str(person)}")
Running the above function via main.py gives us the same output as in Rust:
Person 0: James is 33 years old
Person 1: Salima is 31 years old
The enumerate function returns an iterator that yields an (index, item) tuple for each item in the list, allowing us to access the index of the current item as we iterate over the list in a for loop.
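A small sketch makes the shape of these tuples visible. It uses plain strings rather than Person objects to stay self-contained, and also shows the optional start argument of enumerate.

```python
names = ["James", "Salima"]

# enumerate is lazy; wrapping it in list() materializes the (index, item) tuples
pairs = list(enumerate(names))

# The optional start argument changes the first index
numbered = [f"{i}. {name}" for i, name in enumerate(names, start=1)]

print(pairs)     # [(0, 'James'), (1, 'Salima')]
print(numbered)  # ['1. James', '2. Salima']
```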
Rust
Recall from the first example that we defined a Person struct with a name and an age attribute, in a similar way to the Python example.
We can instantiate a vector of Person objects and iterate over them using enumerate.
What is a vector? Like many other languages, Rust provides arrays, but arrays in Rust are fixed-size and allocated on the stack. Vectors are dynamic arrays that are allocated on the heap, and can grow and shrink as needed, similar to Python lists.
For most purposes, vectors in Rust perform the same function as Python lists.
Unlike a Python list, a vector in Rust can only contain objects of the same type, in this case, Person.
fn run2() {
    let persons = vec![Person::new("James", 33), Person::new("Salima", 31)];
    for (i, p) in persons.iter().enumerate() {
        println!("Person {}: {}", i, p)
    }
}
The vec! macro creates a new vector that contains the listed elements, in this case two Person objects. It's shorthand for calling Vec::new() and then pushing each element.
Additionally, the iter method returns an iterator over references to the vector's items, which is required before we can call the enumerate method on it.
Running the above function via main.rs gives us the same output as in Python:
Person 0: James is 33 years old
Person 1: Salima is 31 years old
Takeaways
- Both Python and Rust contain a convenience method called enumerate to iterate over a list while keeping track of the index of the current item.
- Python lists are dynamic arrays that can contain objects of any type.
- Rust vectors are heap-allocated dynamic arrays that can only contain objects of the same type.
Zip
In both Python and Rust, the zip function exists to construct an iterator over two or more iterables.
Python
Recall from the first example that we defined a Person class with a name and an age attribute.
If we have two lists, one containing names and one containing ages, zip conveniently allows us to iterate over both lists.
def run3() -> None:
    names = ["Alice", "Charlie"]
    ages = [24, 45]
    persons = []
    for name, age in zip(names, ages):
        person = Person(name, age)
        persons.append(person)
    print(f"{repr(persons)}")
The append method is used to add a new item to the end of the list, similar to push in Rust.
Running the above function via main.py gives us the following output:
[Person: Alice, 24, Person: Charlie, 45]
Note that the zip function returns an iterator of tuples that stops as soon as the shortest iterable passed to it is exhausted.
So, if we'd passed one list with 3 items and one list with 2 items, the resulting iterator would have 2 items.
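The truncation behavior is easy to verify with a short sketch. The extra name "Dana" is made up for illustration, and itertools.zip_longest is shown as one standard library alternative for cases where dropping items silently is not what you want.

```python
from itertools import zip_longest

names = ["Alice", "Charlie", "Dana"]
ages = [24, 45]

# zip stops at the shortest input, so "Dana" is silently dropped
truncated = list(zip(names, ages))

# zip_longest keeps going and pads the shorter input with fillvalue
padded = list(zip_longest(names, ages, fillvalue=None))

print(truncated)  # [('Alice', 24), ('Charlie', 45)]
print(padded)     # [('Alice', 24), ('Charlie', 45), ('Dana', None)]
```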
Rust
Recall from the first example that we defined a Person struct with a name and an age attribute, in a similar way to the Python example.
Consider that we have two arrays, one containing names and one containing ages. zip conveniently allows us to iterate over both arrays.
fn run3() {
    let names = ["Alice", "Charlie"];
    let ages = [24, 45];
    let mut persons = vec![];
    for (name, age) in names.iter().zip(ages.iter()) {
        persons.push(Person::new(name, *age));
    }
    println!("{:?}", persons);
}
- The zip method is defined on iterators, so we need to call iter on names before we can call zip. Calling iter on ages as well means that both arrays are iterated by reference, which is why age is dereferenced with *age.
- The push method is used to add a new item to the end of the vector, just like append in Python.
Again, there's no need to "remember" any of this: the Rust compiler is super helpful in calling you out on common mistakes, while offering a helpful solution!
Running the function via main.rs gives us the same output as in Python:
[Person: Alice, 24, Person: Charlie, 45]
Takeaways
The functionality of zip is largely the same in both Python and Rust.
There really aren't too many differences, but it's worth noting that Rust's zip is held to account by the strict type system, so it's typically only available on iterators (unless you implement your own traits or macros). Python's zip function, on the other hand, can be called on any iterable (lists, tuples, dictionaries, and so on) because of its dynamic typing.
Tuple unpacking
Both Python and Rust support tuple unpacking in similar ways.
Python
Consider the following function in which we unpack the youngest and oldest age from a sorted tuple of ages:
def run4() -> None:
    sorted_ages = (18, 41, 65)
    youngest, _, oldest = sorted_ages
    print(f"Youngest age: {youngest}, oldest age: {oldest}")
    print(f"Middle age: {sorted_ages[1]}")
By convention, the name _ indicates that we don't care about the value. It's still an ordinary variable that stays bound to the value until it's reassigned or goes out of scope. We can still access the middle age via the index operator for tuples, sorted_ages[1].
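The sketch below shows what _ holds after unpacking, along with starred unpacking, which is handy when there is more than one value to skip. The four-element tuple is made up for illustration.

```python
sorted_ages = (18, 41, 65, 70)

# `_` is an ordinary name: after unpacking, it still refers to a value
youngest, _, third, oldest = sorted_ages
ignored_value = _

# Starred unpacking collects everything in between into a list
first, *middle, last = sorted_ages

print(youngest, oldest)  # 18 70
print(ignored_value)     # 41
print(middle)            # [41, 65]
```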
Running the above function via main.py gives us the following output:
Youngest age: 18, oldest age: 65
Middle age: 41
Rust
We can write the following function in which we unpack the youngest and oldest age from a sorted tuple of ages:
fn run4() {
    let sorted_ages: (u8, u8, u8) = (18, 41, 65);
    let (youngest, _, oldest) = sorted_ages;
    println!("Youngest age: {}, oldest age: {}", youngest, oldest);
    println!("Middle age: {}", sorted_ages.1);
}
Just like in Python, the _ indicates that we don't care about the middle value. The difference is that in Rust, _ is a wildcard pattern that doesn't bind the value to a variable at all. The tuple itself stays valid until the end of the function, where it goes out of scope; unlike in Python, no garbage collector (or reference counter) is involved. Also, we annotate the type of each age element as an unsigned 8-bit integer to match the age field of Person; without the annotation, the compiler would infer i32 for the integer literals.
Tuple elements in Rust are accessed with a dot followed by the index, so we can access the middle age via sorted_ages.1.
Running the function via main.rs gives us the same output as in Python:
Youngest age: 18, oldest age: 65
Middle age: 41
Takeaways
- Tuple unpacking is largely the same in Python and Rust.
- There are some minor differences between Python and Rust tuples:
  - In Rust, elements of a tuple can be modified if the tuple is declared with let mut, while in Python, they are always immutable (lists are mutable in Python).
  - In Rust, tuple elements are accessed with ., while in Python, it's [].
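The immutability of Python tuples can be demonstrated with a short sketch. The helper name try_to_modify is hypothetical and exists only to capture the error as a value.

```python
ages = (18, 41, 65)


def try_to_modify(t: tuple) -> str:
    # Item assignment on a tuple raises TypeError at runtime
    try:
        t[0] = 21
    except TypeError:
        return "tuples are immutable"
    return "modified"


result = try_to_modify(ages)

# To "change" an element, build a new tuple instead
updated = (21,) + ages[1:]

print(result)   # tuples are immutable
print(updated)  # (21, 41, 65)
```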
Lambdas vs. closures
Anonymous functions are functions that are not bound to a name. In Python, they are called lambdas. In Rust, they are called closures. Both are useful for short, one-off functions that are not used anywhere else.
Python
Recall from the first example that we defined a Person class with a name and an age attribute.
In the following example, we use the sorted function to sort a list of Person objects by their age.
def run5() -> None:
    persons = [Person("Aiko", 41), Person("Rohan", 18)]
    sorted_by_age = sorted(persons, key=lambda person: person.age)
    youngest_person = sorted_by_age[0]
    print(f"{youngest_person.name} is the youngest person at {youngest_person.age} years old")
The sorted function takes an optional key argument, which is a function that is called on each item in the list to determine the value to sort by. In this case, we use a lambda to return the age attribute of each Person object.
Rohan is the youngest person at 18 years old
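The key argument accepts any callable, not just a lambda. The sketch below uses a simplified dataclass as a stand-in for the Person class (an assumption made to keep the example self-contained, with a made-up third person) and compares a lambda with operator.attrgetter from the standard library.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass
class Person:
    # Simplified stand-in for the Person class used in this piece
    name: str
    age: int


persons = [Person("Aiko", 41), Person("Rohan", 18), Person("Lena", 29)]

# A lambda and attrgetter("age") are interchangeable as sort keys
by_age_lambda = sorted(persons, key=lambda p: p.age)
by_age_getter = sorted(persons, key=attrgetter("age"))

# reverse=True flips the order without changing the key
oldest_first = sorted(persons, key=attrgetter("age"), reverse=True)

# If only the smallest item is needed, min() avoids sorting the whole list
youngest = min(persons, key=lambda p: p.age)

print([p.name for p in by_age_lambda])  # ['Rohan', 'Lena', 'Aiko']
print(oldest_first[0].name)             # Aiko
print(youngest.name)                    # Rohan
```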
Rust
Recall from the first example that we defined a Person struct with a name and an age attribute, in a similar way to the Python example.
In the following example, we use the sort_by_key method to sort a vector of Person objects by their age.
fn run5() {
    let mut persons = vec![Person::new("Aiko", 41), Person::new("Rohan", 18)];
    // Sort by age
    persons.sort_by_key(|p| p.age);
    let youngest_person = persons.first().unwrap();
    println!(
        "{} is the youngest person at {} years old",
        youngest_person.name, youngest_person.age
    );
}
The sort_by_key method takes a closure that is called on each item in the vector to determine the value to sort by. In this case, we use the closure |p| p.age, in which the parameter is written between the vertical bars, to return the age attribute of each Person object.
Rohan is the youngest person at 18 years old
Takeaways
- Lambdas and closures are anonymous functions that are not bound to a name, or are passed as arguments to other functions.
- Lambdas and closures are useful for short, one-off functions that are not used anywhere else.
- Rust closures are more powerful than Python lambdas: they can contain multiple statements, and they can capture their environment by reference, by mutable reference or by value. This is out of scope for this book, but you can read more about it here.
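Capturing the environment can be illustrated on the Python side as well. In the sketch below, a lambda reads a variable from its enclosing scope, while anything that needs more than a single expression (such as updating captured state) requires a nested def. The function names make_age_filter and make_counter are hypothetical.

```python
from typing import Callable


def make_age_filter(min_age: int) -> Callable[[int], bool]:
    # The lambda captures min_age from the enclosing function's scope
    return lambda age: age >= min_age


is_adult = make_age_filter(18)
adult_ages = [age for age in [12, 18, 41] if is_adult(age)]


def make_counter() -> Callable[[], int]:
    count = 0

    def increment() -> int:
        # Rebinding a captured variable needs `nonlocal`,
        # which a single-expression lambda cannot express
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
first_call = counter()
second_call = counter()

print(adult_ages)               # [18, 41]
print(first_call, second_call)  # 1 2
```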
Single line if-else
Both Python and Rust support single line if-else statements. This is especially useful when performing simple operations on a value, allowing for more concise code.
Python
Consider the following function in which we print a message depending on whether a person is born in a leap year or not.
To do this, we first define a function approx_year_of_birth that returns the approximate year.
from datetime import datetime

def approx_year_of_birth(person: Person) -> int:
    birth_year_approx = datetime.now().year - person.age
    return birth_year_approx
The leap year logic used below is simplistic and does not account for edge cases. It's used here purely for the purposes of illustration.
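For reference, the edge cases that the divisible-by-4 check misses are century years: in the Gregorian calendar, a year divisible by 100 is a leap year only if it is also divisible by 400. A minimal sketch of the full rule is shown below, with is_leap_year being a hypothetical name; the standard library's calendar.isleap implements the same rule and is used here as a cross-check.

```python
import calendar


def is_leap_year(year: int) -> bool:
    # Divisible by 4, except century years, unless divisible by 400
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


for year in (1900, 2000, 2023, 2024):
    # Both columns should agree for every year
    print(year, is_leap_year(year), calendar.isleap(year))
```

The year 1900 is the kind of case where a plain year % 4 == 0 check gives the wrong answer.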
We can use this function after initializing a list of Person objects.
def run6() -> None:
    persons = [Person("Josephine", 20), Person("Wesley", 31)]
    for person in persons:
        # Check if person is born in a leap year using simplistic leap year logic
        birth_year = approx_year_of_birth(person)
        person_is_born_in_leap_year = True if birth_year % 4 == 0 else False
        print(f"{person}. Born in a leap year?: {person_is_born_in_leap_year}")
Running the above function via main.py gives us the following output:
Josephine is 20 years old. Born in a leap year?: True
Wesley is 31 years old. Born in a leap year?: False
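For readers curious about the edge cases that the simplistic check misses, the sketch below implements the full Gregorian rule: a year is a leap year if it is divisible by 4, except for century years, which must also be divisible by 400. The function name is_leap_year is hypothetical and not part of the book's code; Python's standard library also provides calendar.isleap, which is used here only as a cross-check.

```python
import calendar


def is_leap_year(year: int) -> bool:
    # Divisible by 4, and either not a century year or divisible by 400
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Compare the simplistic check with the full rule on a few sample years
for year in (1900, 2000, 2023, 2024):
    simplistic = True if year % 4 == 0 else False
    print(f"{year}: simplistic={simplistic}, full={is_leap_year(year)}")

# The hand-written rule agrees with the standard library's implementation
assert all(is_leap_year(y) == calendar.isleap(y) for y in range(1, 3000))
```

The two checks only disagree on century years that are not divisible by 400, such as 1900, which is why the simplistic version is adequate for illustration.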
Rust
We can define the below function in Rust, where we print a message depending on whether a person is born in a leap year or not.
use chrono::prelude::*;

fn approx_year_of_birth(person: &Person) -> u16 {
    let now = chrono::Utc::now();
    // year() returns an i32, so cast the age before subtracting
    let year = now.year() - (person.age as i32);
    year as u16
}
Note that in Rust, we need to use the chrono crate to handle datetimes, unlike in Python where the datetime module comes with the standard library.
We then use this function after initializing a vector of Person objects.
fn run6() {
    let persons = vec![Person::new("Josephine", 20), Person::new("Wesley", 31)];
    for person in persons {
        // check if person is born in a leap year using simplistic leap year logic
        let birth_year = approx_year_of_birth(&person);
        let person_is_born_in_leap_year = birth_year % 4 == 0;
        println!(
            "{}. Born in a leap year?: {}",
            person, person_is_born_in_leap_year
        );
    }
}
Running the function via main.rs gives us the same output as in Python:
Josephine is 20 years old. Born in a leap year?: true
Wesley is 31 years old. Born in a leap year?: false
Takeaways
- Single line if-else statements are useful for performing simple operations on a value while remaining concise.
- In some cases, Rust requires an external crate (such as chrono for datetimes) for functionality that comes with the standard library in Python.
List comprehensions vs map/filter
One of Python's most popular features is its list comprehensions, which are a concise way to create lists from existing iterables. Rust leans more heavily on a functional style, so it offers similar capabilities through the map and filter iterator methods. Although map and filter functions are available in Python, they are not as commonly used as list comprehensions.
Python
Consider the following function in which we print a message depending on which persons from a list of Person objects are born after the year 1995, based on their current age.
def run7() -> None:
    """
    1. List comprehensions
    """
    persons = [Person("Issa", 39), Person("Ibrahim", 26)]
    persons_born_after_1995 = [
        (person.name, person.age) for person in persons if approx_year_of_birth(person) > 1995
    ]
    print(f"Persons born after 1995: {persons_born_after_1995}")
The list comprehension in the above function essentially does the following:
- Iterate over the list of Person objects
- For each person, check if their approximate year of birth is greater than 1995
- If the above condition is true, create a tuple of their name and age
Running the above function via main.py gives us the following output:
Persons born after 1995: [('Ibrahim', 26)]
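For comparison, the same result can be obtained in Python with the built-in map and filter functions, which is closer in spirit to the Rust version that follows. This is a sketch under a few assumptions: the Person dataclass is a minimal stand-in for the class defined earlier, born_after is a hypothetical helper, and the current_year parameter is added here only so that the result does not depend on when the code is run.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Person:
    # Minimal stand-in for the Person class used throughout this piece
    name: str
    age: int


def approx_year_of_birth(person: Person, current_year: Optional[int] = None) -> int:
    # current_year is a hypothetical extra parameter used to pin the result
    year = datetime.now().year if current_year is None else current_year
    return year - person.age


def born_after(
    persons: List[Person], cutoff: int, current_year: Optional[int] = None
) -> List[Tuple[str, int]]:
    # filter keeps the matching persons; map turns each one into a (name, age) tuple
    matches = filter(lambda p: approx_year_of_birth(p, current_year) > cutoff, persons)
    return list(map(lambda p: (p.name, p.age), matches))


persons = [Person("Issa", 39), Person("Ibrahim", 26)]
print(f"Persons born after 1995: {born_after(persons, 1995, current_year=2023)}")
```

Both map and filter return lazy iterators in Python 3, so the final list call plays a role similar to collect in Rust.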
Rust
We can define the below function in Rust, where we print a message depending on which persons from a vector of Person objects are born after the year 1995, based on their current age.
fn run7() {
    let persons = vec![Person::new("Issa", 39), Person::new("Ibrahim", 26)];
    let result = persons
        .into_iter()
        .filter(|p| approx_year_of_birth(p) > 1995)
        .map(|p| (p.name, p.age))
        .collect::<Vec<(String, u8)>>();
    println!("Persons born after 1995: {:?}", result);
}
The filter and map methods in the above function essentially do the following:
- Turn the persons vector into an iterator and iterate over the Person objects
- For each person, check if their approximate year of birth is greater than 1995
- If the above condition is true, then create a tuple of their name and age
- Collect all the tuples into a vector of (String, u8) tuples
Running the function via main.rs gives us the same output as in Python:
Persons born after 1995: [("Ibrahim", 26)]
The Rust version is a little more verbose than the Python version, but it's still quite readable.
Takeaways
- Both Python and Rust have convenient ways to create iterables without having to use explicit loops.
- Python's list comprehensions are more concise than Rust's map and filter methods in most cases.
- Rust's map and filter methods show that Rust is more functional than Python in its syntax.
Dicts vs. hashmaps
Python's dict is essentially a hash table, which is a data structure that maps keys to values. Rust's HashMap performs the same function. Both are collections of key-value pairs where the keys must be unique, but the values can be duplicated. The purpose of dicts and hashmaps is to allow for fast lookup of values by key.
Python
Consider the below function in Python, where we define a dictionary of processors and their corresponding market names.
def run8() -> None:
    processors = {
        "13900KS": "Intel Core i9",
        "13700K": "Intel Core i7",
        "13600K": "Intel Core i5",
        "1800X": "AMD Ryzen 7",
        "1600X": "AMD Ryzen 5",
        "1300X": "AMD Ryzen 3",
    }
    # Check for presence of value
    is_item_in_dict = "AMD Ryzen 3" in processors.values()
    print(f'Is "AMD Ryzen 3" in the dict of processors?: {is_item_in_dict}')
    # Lookup by key
    key = "13900KS"
    lookup_by_key = processors[key]
    print(f'Key "{key}" has the value "{lookup_by_key}"')
The first portion checks for the presence of a value in the dictionary, while the second portion looks up the value by key.
Running the above function via main.py gives us the following output:
Is "AMD Ryzen 3" in the dict of processors?: True
Key "13900KS" has the value "Intel Core i9"
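The lookup above assumes the key exists; indexing a dict with a missing key raises a KeyError. A minimal sketch of a safer lookup uses dict.get, which returns None when the key is absent. This is loosely comparable to Rust's HashMap::get, which returns an Option rather than the value itself. The describe helper and the key "9999X" are hypothetical and used only for illustration.

```python
processors = {
    "13900KS": "Intel Core i9",
    "1300X": "AMD Ryzen 3",
}


def describe(key: str) -> str:
    # dict.get returns None for a missing key instead of raising KeyError
    name = processors.get(key)
    if name is None:
        return f'Key "{key}" was not found'
    return f'Key "{key}" has the value "{name}"'


print(describe("13900KS"))
print(describe("9999X"))

# Plain indexing raises KeyError for the same missing key
try:
    processors["9999X"]
    raised = False
except KeyError:
    raised = True
```

Whether to index directly or use get depends on whether a missing key is an error or an expected case in your program.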
Rust
We define the below function in Rust, where we define a hashmap of processors and their corresponding market names.
use std::collections::HashMap;

fn run8() {
    let mut processors = HashMap::new();
    processors.insert("13900KS", "Intel Core i9");
    processors.insert("13700K", "Intel Core i7");
    processors.insert("13600K", "Intel Core i5");
    processors.insert("1800X", "AMD Ryzen 7");
    processors.insert("1600X", "AMD Ryzen 5");
    processors.insert("1300X", "AMD Ryzen 3");
    // Check for presence of value
    let value = "AMD Ryzen 3";
    let mut values = processors.values();
    println!(
        "Is \"AMD Ryzen 3\" in the hashmap of processors?: {}",
        values.any(|v| v == &value)
    );
    // Lookup by key
    let key = "13900KS";
    let lookup_by_key = processors.get(key);
    println!(
        "Key \"{}\" has the value \"{}\"",
        key,
        lookup_by_key.unwrap()
    );
}
Just like in the Python version, the first portion checks for the presence of a value in the hashmap, while the second portion looks up the value by key.
Running the function via main.rs gives us the same output as in Python:
Is "AMD Ryzen 3" in the hashmap of processors?: true
Key "13900KS" has the value "Intel Core i9"
Takeaways
Python and Rust contain collections that store key-value pairs for fast lookups. A key difference is that within a single Python dict, keys can be of any hashable type and values can be of any type, whereas in a Rust HashMap, all keys must share one type and all values must share one type (the key type and the value type may differ from each other).
In Python, this dict is perfectly valid:
# You can have a dict with keys of different types
example = {
    "a": 1,
    1: 2,
}
In Rust, the compiler enforces that all keys share one type and all values share one type. When no types are annotated, they are inferred from usage, which here means from the first insert.
let mut example = HashMap::new();
example.insert("a", 1);
// This errors because the first entry specified the key as &str
example.insert(1, 2);
// This is valid
example.insert("b", 2);
Sets vs. hashsets
Python's set is an unordered collection of unique items, where duplicate items are not allowed. Rust's HashSet performs the same function.
Python
Consider the following function in which we define a set of processors.
def run9() -> None:
    processors = {
        "Intel Core i9",
        "Intel Core i7",
        "Intel Core i5",
        "AMD Ryzen 7",
        "AMD Ryzen 5",
        "AMD Ryzen 3",
    }
    # Duplicate values are ignored
    processors.add("Intel Core i7")
    processors.add("AMD Ryzen 5")
    # Check for presence of value
    is_item_in_set = "AMD Ryzen 3" in processors
    print(f'Is "AMD Ryzen 3" in the set of processors?: {is_item_in_set}')
The purpose of the above function is to check for the presence of a value in the set of processors. When we add duplicate values to the set, they are ignored.
Running the above function via main.py gives us the following output:
Is "AMD Ryzen 3" in the set of processors?: True
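The function above does not make it visible that duplicates are ignored. A small sketch that checks the size of the set before and after adding duplicates shows this directly. The variable names size_before, size_after and unique are hypothetical and used only for illustration.

```python
processors = {"Intel Core i7", "AMD Ryzen 5", "AMD Ryzen 3"}
size_before = len(processors)

# Adding values that are already present leaves the set unchanged
processors.add("Intel Core i7")
processors.add("AMD Ryzen 5")
size_after = len(processors)

print(f"Size before: {size_before}, size after: {size_after}")

# Building a set from a list with repeats also keeps only the unique items
unique = set(["AMD Ryzen 5", "AMD Ryzen 5", "AMD Ryzen 3"])
print(f"Number of unique items: {len(unique)}")
```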
Rust
We define the below function in Rust, where we define a hashset of processors.
use std::collections::HashSet;

fn run9() {
    let mut processors = HashSet::new();
    processors.insert("Intel Core i9");
    processors.insert("Intel Core i7");
    processors.insert("Intel Core i5");
    processors.insert("AMD Ryzen 7");
    processors.insert("AMD Ryzen 5");
    processors.insert("AMD Ryzen 3");
    // Duplicate values are ignored
    processors.insert("Intel Core i7");
    processors.insert("AMD Ryzen 5");
    // Check for presence of value
    let value = "AMD Ryzen 3";
    println!(
        "Is \"AMD Ryzen 3\" in the hashset of processors?: {}",
        processors.contains(&value)
    );
}
The purpose of the above function is to check for the presence of a value in the hashset of processors. When we add duplicate values to the hashset, they are ignored.
Running the function via main.rs gives us the same output as in Python:
Is "AMD Ryzen 3" in the hashset of processors?: true
Takeaways
Python and Rust contain collections that allow for the storage of unique items. A key difference is that Python's set can contain items of any hashable type, while Rust's HashSet can only contain items of a single type, which is either specified at initialization or inferred by the compiler.
In Python, the following set containing multiple types is valid, as they are all hashable.
example = {1, "hello", 3.14}
In Rust, the compiler enforces that all items in the set are of the same type specified at the time of initialization, or by inferring the first value's type.
// The set must be declared mutable in order to insert into it
let mut example = HashSet::new();
example.insert(1);
// This errors because the first value fixed the item type as an integer (i32 by default)
example.insert("hello");
// This is valid
example.insert(3);
Contributors
Authors
Rust in Pieces is co-authored by Prashanth Rao and Paul Sanders.
About the authors
- Prashanth Rao is an A.I. engineer with a background in scientific computing, machine learning, NLP and database systems. He's passionate about making complex topics accessible to a larger audience. In his spare time, Prashanth actively experiments with open source tools, frameworks and databases, and writes about them on his blog.
- Paul Sanders is a software engineer and consultant having spent decades doing data management and application development in healthcare, pharmaceuticals and biologic drug development. In his spare time, Paul loves contributing to open source software and is actively maintaining several OSS projects.
Additional contributors
Contributions and improvements from the community are welcome! Please see the contributing guidelines.