Overcomplicating things - with Rust!
Introduction
This post details one of the rare occasions where changing your code with the sole priority of “just making it work” creates an end result that's arguably superior to the “proper” code you had initially set out to write.
Contents
- Context
- What I wanted to do
- Why I wanted to do it
- How I started to implement this approach
- Where I got totally blocked
- What I did instead
- TL;DR
(If you want to skip the journey and go straight to the before and after, then pop straight down to What I did instead and the TL;DR.)
Context
I was implementing this user story for my CLI maze maker (repo here):
As a user
So I can choose the algorithm used to make my maze
I want to be able to select algorithms via the CLI
Things went without a hitch in terms of getting arguments from the CLI and using them to pick the maze making algorithm.
The trouble (and main focus of this post) came when I tried to add some simple validation and error handling on the user input.
What I wanted to do
- I wanted to sanitise the user’s input without matching against bare strings, e.g. avoiding selected_algo == "binary_tree"
- So I wanted to assign the string to a variable, specifically a constant
I didn’t go straight to a standalone constant for each string. Instead, my plan was to wrap all the name constants in a container constant, so I would access the relevant string with something like ALGORITHM_NAMES.BINARY_TREE
Why I wanted to do it
Why would I start with this structure vs a standalone constant?
- I thought just BINARY_TREE on its own was ambiguous in comparison to e.g. ALGORITHM_NAMES.BINARY_TREE, especially as I was already referring to an imported function called binary_tree nearby [1]
- The organiser in me likes the encapsulation that namespacing provides.
- I love dot notation, on aesthetic and ergonomic grounds (so much easier and more pleasant to type something.else than something["else"]) [2]
I was also very much influenced by habit: in previous Python projects we’d had something like the below code:
# constants.py
LOCAL = "local"
DEV = "development"
STAGING = "staging"
PROD = "production"

class Envs:
    # Namespace wrapper so callers can write Envs.LOCAL etc.
    LOCAL = LOCAL
    DEV = DEV
    STAGING = STAGING
    PROD = PROD

# buckets.py
import os
from constants import Envs

current_env = os.environ.get('env')
if current_env == Envs.LOCAL:
    print("Hey we're doing local dev, set up some LocalStack stuff")
I found this easy to read and easy to use.
All of the above were enough for me to think it would be reasonable and worthwhile to recreate the approach in Rust for this use-case.
How I started to implement this approach
Initially, it was looking promising.
(Advance warning: this post's code snippets are very much “Rust newbie just trying to make things work”)
First, I created the container structure (which took far more work than I anticipated):
// constants.rs
const BINARY_TREE: &str = "binary_tree";
const SIDEWINDER: &str = "sidewinder";

#[derive(Debug)]
pub struct Algorithms<'a> {
    pub binary_tree: &'a str,
    pub sidewinder: &'a str
}

pub const ALGORITHMS: Algorithms = Algorithms {
    binary_tree: BINARY_TREE,
    sidewinder: SIDEWINDER
};
Then, I imported the struct and checked I could access the values:
// maze_display.rs
use constants::*; // remember readers, just trying to make it work!
println!("{:?}", ALGORITHMS.binary_tree);
That was successful, so the next step was to use it to set a default algorithm in case the user didn’t pass one in themselves:
// maze_display.rs
use std::env;

// fetch user input from the command line
let cli_args: Vec<String> = env::args().collect();
// default to binary tree if relevant arg not present
// yes, this is still very loosey-goosey input sanitisation
let algorithm: &str = if cli_args.len() > 1 { &cli_args[1] } else { ALGORITHMS.binary_tree };
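The defaulting logic above could also be pulled into a small function, which makes it easier to exercise without a real command line. This is only a sketch: pick_algorithm is a hypothetical helper name, not something from the repo, and the BINARY_TREE constant here stands in for ALGORITHMS.binary_tree.

```rust
use std::env;

// Stand-in for ALGORITHMS.binary_tree from constants.rs (assumption).
const BINARY_TREE: &str = "binary_tree";

// Hypothetical helper: returns the first real CLI argument,
// or the default algorithm name when none was passed.
fn pick_algorithm(cli_args: &[String]) -> &str {
    cli_args.get(1).map(String::as_str).unwrap_or(BINARY_TREE)
}

fn main() {
    // Index 0 is the program name, so the user's choice sits at index 1.
    let cli_args: Vec<String> = env::args().collect();
    println!("Using algorithm: {}", pick_algorithm(&cli_args));
}
```

Using get(1) rather than indexing avoids the explicit length check while behaving the same way when the argument is missing.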
Where I got totally blocked
All of the above code was up and working, so I moved onto creating a match statement to do the following:
- Check user input matches a valid algorithm
- If a match is found, return the relevant function that implements that algorithm
- If no match found, tell them they’ve given an invalid option
In the same file as the above blocks, I added the below code:
// maze_display.rs
use maze_makers::{binary_tree, sidewinder}; // functions to make mazes

match algorithm { // get string representing user's algorithm selection
    // if this matches a known algo, return the appropriate function
    ALGORITHMS.binary_tree => binary_tree,
    ALGORITHMS.sidewinder => sidewinder,
    _ => panic!("Unrecognised algorithm"),
}
And this is where I ran into trouble, with the below error:
error: expected one of `=>`, `@`, `if`, or `|`, found `.`
    ALGORITHMS.binary_tree => btree,
              ^ expected one of `=>`, `@`, `if`, or `|`
This was a surprise to me. If I could access ALGORITHMS.binary_tree in the below block:
let algorithm: &str = if cli_args.len() > 1 { &cli_args[1] } else { ALGORITHMS.binary_tree };
Why could I not access it the same way in the match statement?
Beyond “Because the syntax within match statements has a different structure”, I never fully understood why this didn’t work. I went down rabbit hole after rabbit hole [3] on how one should access struct values in a match statement, but I never got anything sensible up and running.
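In hindsight, the most likely explanation is that the left-hand side of a match arm is a pattern, not an expression. A field access such as ALGORITHMS.binary_tree is an expression, so the parser gives up as soon as it reaches the dot. One way to keep the struct would be a match guard, because whatever follows the if in a guard is an ordinary expression. The sketch below assumes the struct from constants.rs; the describe function and the strings it returns are hypothetical stand-ins for returning the real maze-making functions.

```rust
pub struct Algorithms<'a> {
    pub binary_tree: &'a str,
    pub sidewinder: &'a str,
}

pub const ALGORITHMS: Algorithms<'static> = Algorithms {
    binary_tree: "binary_tree",
    sidewinder: "sidewinder",
};

// Hypothetical stand-in for picking a maze-making function.
fn describe(algorithm: &str) -> Option<&'static str> {
    match algorithm {
        // `name` is a pattern that binds the input; the guard after `if`
        // is a normal expression, so field access is allowed there.
        name if name == ALGORITHMS.binary_tree => Some("binary tree maker"),
        name if name == ALGORITHMS.sidewinder => Some("sidewinder maker"),
        _ => None,
    }
}

fn main() {
    println!("{:?}", describe("sidewinder"));
    println!("{:?}", describe("prim"));
}
```

At this point the match is really a chain of equality checks, so a plain if / else if would arguably read just as well.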
What I did instead
Eventually, I decided to change tactic completely and just use simple string constants.
Before:
// constants.rs
const BINARY_TREE: &str = "binary_tree";
const SIDEWINDER: &str = "sidewinder";

#[derive(Debug)]
pub struct Algorithms<'a> {
    pub binary_tree: &'a str,
    pub sidewinder: &'a str
}

pub const ALGORITHMS: Algorithms = Algorithms {
    binary_tree: BINARY_TREE,
    sidewinder: SIDEWINDER
};
After:
// constants.rs
pub const BINARY_TREE: &str = "binary_tree";
pub const SIDEWINDER: &str = "sidewinder";
On doing this, and comparing the two, I realised just how unnecessarily complex my initial approach had been. I don’t know if it’s possible to do what I wanted in Rust, but I was ultimately glad that I couldn’t.
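That said, something close to the original goal does look achievable, because a path to a constant is a valid pattern in a match arm. The sketch below is one possible arrangement, not what the repo does: the module name algorithm_names and the is_known helper are made up for illustration. It keeps the namespacing, but through the :: path separator rather than dot notation.

```rust
// Hypothetical module acting as the "container" for the names.
mod algorithm_names {
    pub const BINARY_TREE: &str = "binary_tree";
    pub const SIDEWINDER: &str = "sidewinder";
}

// Hypothetical helper: true when the input names a known algorithm.
fn is_known(algorithm: &str) -> bool {
    match algorithm {
        // A path to a constant is allowed in pattern position,
        // unlike a field access on a struct value.
        algorithm_names::BINARY_TREE | algorithm_names::SIDEWINDER => true,
        _ => false,
    }
}

fn main() {
    for name in ["binary_tree", "sidewinder", "prim"] {
        println!("{}: {}", name, is_known(name));
    }
}
```

Associated constants declared in an impl block should behave similarly in patterns, though whether the extra layer earns its keep for two strings is another question.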
A lot is said about Rust’s design being made to push developers into the pit of success, and I wonder if this is an example of that.
TL;DR
Rust string validation takeaway:
- If you want to group related strings in a data structure and read them in a match statement, think carefully about whether that data structure is truly necessary.
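One alternative worth weighing, offered as a sketch rather than as what the maze maker actually does: turn the validated string into an enum once, at the edge of the program, via the standard FromStr trait. The Algorithm enum, its variant names, and the String error type are all hypothetical choices made for this example.

```rust
use std::str::FromStr;

const BINARY_TREE: &str = "binary_tree";
const SIDEWINDER: &str = "sidewinder";

// Hypothetical enum representing a validated algorithm choice.
#[derive(Debug, PartialEq)]
enum Algorithm {
    BinaryTree,
    Sidewinder,
}

impl FromStr for Algorithm {
    type Err = String;

    // Accepts only the known names; anything else becomes an error
    // instead of a panic, so the caller decides how to report it.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            BINARY_TREE => Ok(Algorithm::BinaryTree),
            SIDEWINDER => Ok(Algorithm::Sidewinder),
            other => Err(format!("Unrecognised algorithm: {}", other)),
        }
    }
}

fn main() {
    let parsed: Result<Algorithm, String> = "sidewinder".parse();
    println!("{:?}", parsed);
}
```

After parsing, the rest of the code can match on Algorithm variants and let the compiler check that every case is handled.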
More transferable takeaways:
- When totally blocked, try changing your approach completely - you may be pleasantly surprised by the results
- Sometimes over-engineering things is fun, sometimes it’s an awful waste of time
- It is dangerously easy to forget the below principles when writing things for yourself:
  - premature optimisation (or organisation) is the root of all evil
  - YAGNI
  - KISS
Hopefully this was useful for anyone (maybe particularly people from a Python background?) rolling their own simple string validation in Rust.
Next up I'll be writing about how I moved the maze display from a shell-based co-ordinate system (not a real term) to a 2D Cartesian co-ordinate system. Watch this space! ... hehe
[1] Arguably this is overkill when you have a strongly typed language like Rust - you can just hover over the variable and see that it’s a string rather than an executable. But I’m a big believer in favouring explicit over implicit, so this wasn’t quite a strong enough argument.
[2] Though I recently read this SO answer which has dampened my enthusiasm for using it in some scenarios. The poster made the very good point that it’s potentially misleading to use dot notation for things that aren’t actually proper classes or objects - just like with the above example. So I’m a bit less in love with it than I was before reading that.
[3] For example, trying out most of the relevant approaches listed here: All the Pattern Syntax - The Rust Programming Language