Writing your own Rust linter

Lints help improve codebases and can even detect potential logic issues at compile time. In Rust, we have the incredible clippy, which is a must-have for any Rust project. However, because it is used everywhere, it cannot cover all use cases. If you need lints specific to your project(s), you are better off writing your own linter (to be used alongside clippy of course).

This blog post will explain how to do it. Please note that it will not explain how rustc APIs work or how to use them. For that, I recommend taking a look at the rustc dev guide.

rustc-tools

We will write our own linter using the rustc-tools crate. Since it relies on rustc's API, it requires some setup before we can start.

Setup

First, let's write the Cargo.toml file:

[package]
name = "cargo-tools-inner"
version = "0.0.1"
authors = []
license = ""
edition = "2021"

[dependencies]
# A noop dependency that changes in the Rust repository, it's a bit of a hack.
# See the `src/tools/rustc-workspace-hack/README.md` file in `rust-lang/rust`
# for more information.
rustc-workspace-hack = "1.0.0"
rustc-tools = "0.3"

So everything is as usual here, except for this:

# A noop dependency that changes in the Rust repository, it's a bit of a hack.
# See the `src/tools/rustc-workspace-hack/README.md` file in `rust-lang/rust`
# for more information.
rustc-workspace-hack = "1.0.0"

I deliberately left the comment in as it provides interesting information. Now why is it needed? Because our linter will be a Rust compiler extension. It will use rustc's API and link to its internal crates. Therefore, cargo needs to handle it a bit differently.

The second file that we will need to add (at the same level as Cargo.toml) is rust-toolchain:

[toolchain]
channel = "nightly-2023-10-06"
components = ["rustc-dev", "rustfmt", "llvm-tools-preview"]

It needs to match the toolchain used by your rustc-tools version. You can find it here. With this file in place, running any cargo command installs the nightly-2023-10-06 toolchain and the listed components if they are missing. So nothing else needs to be done, except creating this file.

All done for the setup, we can now start actually writing our linter!

Basic code

Let's start by creating a src/main.rs file:

// This feature must be added so we can use compiler APIs.
#![feature(rustc_private)]

// We need to import them like this otherwise it doesn't work.
extern crate rustc_lint;

use rustc_lint::LintStore;
use rustc_tools::with_lints;

fn main() {
    // We skip the first argument because it's the current binary.
    let args = std::env::args().skip(1).collect::<Vec<_>>();
    with_lints(&args, vec![], |store: &mut LintStore| {
        // This is where we will register the lints.
    }).expect("with_lints failed");
}

With this, the last bits of "magic" have been added. Since we want to use rustc's API, we need to enable the rustc_private feature. Also, rustc crates need to be imported explicitly in the code (like we used to before the 2018 Rust edition) with extern crate.

Let's explain the arguments of with_lints:

The first argument is a &[String]: these are the arguments that will be passed to the rustc_driver (kinda the same as if calling rustc).

The second argument (a Vec<String>) is the list of files that would trigger a re-compilation if they were modified in any way since the last compilation.

The third and last argument is a callback that receives a mutable reference to the LintStore. We will use it to register our lints.

Now that everything is ready, we can write our first lint!

Writing a lint

First, we need to pick which kind of lint we want: an early or late one. In short, early lints don't have access to type information whereas late ones do.

Let's write a lint which warns every time it sees a call to .unwrap(). We don't want to only check that a method unwrap is called but also that it's called on a Result or Option type. For that we need to have access to type information, so it will be a late lint.
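To make the early/late difference concrete, here is a minimal toy model. None of these types come from rustc: `Ty`, `MethodCall`, `early_check` and `late_check` are hypothetical names made up for illustration. The "early" check can only match on the method name, while the "late" one can also look at the receiver type.

```rust
/// Toy stand-in for type information; this is not a rustc type.
enum Ty {
    Option,
    Result,
    Other,
}

/// A method call such as `x.unwrap()`.
struct MethodCall {
    method: &'static str,
    /// Type of the receiver; only known once type checking has run.
    receiver_ty: Ty,
}

/// "Early" check: only syntax is available, so only the name can be matched.
fn early_check(call: &MethodCall) -> bool {
    call.method == "unwrap"
}

/// "Late" check: the receiver type is available too.
fn late_check(call: &MethodCall) -> bool {
    call.method == "unwrap" && matches!(call.receiver_ty, Ty::Option | Ty::Result)
}

fn main() {
    let on_option = MethodCall { method: "unwrap", receiver_ty: Ty::Option };
    let on_result = MethodCall { method: "unwrap", receiver_ty: Ty::Result };
    // A user-defined type that happens to have an `unwrap` method.
    let on_custom = MethodCall { method: "unwrap", receiver_ty: Ty::Other };

    let calls = [
        ("Option", &on_option),
        ("Result", &on_result),
        ("custom type", &on_custom),
    ];
    for (label, call) in calls {
        println!("{label}: early = {}, late = {}", early_check(call), late_check(call));
    }
}
```

The name-only check also fires on the custom type, which is exactly the false positive we want to avoid.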

Let's create a src/unwrap_call.rs file:

use rustc_lint::{LateLintPass, LintContext};
use rustc_session::{declare_lint_pass, declare_tool_lint};

declare_tool_lint! {
    // `tools` is the name of the "binary" here. It's required when creating a lint.
    pub tools::UNWRAP_CALL,
    Warn,
    "warns if `Option::unwrap` or `Result::unwrap` is called"
}
declare_lint_pass!(UnwrapCall => [UNWRAP_CALL]);

// `UnwrapCall` was declared in `declare_lint_pass`.
impl LateLintPass<'_> for UnwrapCall {}

The declare_lint_pass macro declares our (empty) UnwrapCall type and implements the LintPass trait on it.

declare_tool_lint is more interesting as it declares the lint itself, gives it a description and a level (Warn in our case).

With this, your linter users will be able to write #[allow(tools::unwrap_call)] in their code to ignore our lint (sadness...).

For now, the LateLintPass implementation is empty; we will come back to it later.

Since it's declared, we can now register it like this:

// ...

// We need to create a `src/unwrap_call.rs` file.
mod unwrap_call;

// ...

with_lints(&args, vec![], |store: &mut LintStore| {
    store.register_late_pass(|_| Box::new(unwrap_call::UnwrapCall));
}).expect("with_lints failed");

Let's explain a bit more how passes work: for each registered pass, the compiler runs a visitor that goes through the whole crate (the AST for an EarlyLintPass, the HIR for a LateLintPass). The only thing we need to do is implement the methods we're interested in.

For example, if we only want to check identifiers, we implement EarlyLintPass::check_ident and do our check in it. The visitor will call it for us on every identifier it encounters in the crate it's being run on.
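The "implement only the hooks you care about" idea can be sketched without any compiler API. Everything below (`Node`, `Pass`, `walk`, `ShortIdents`) is a hypothetical toy, not rustc's actual visitor: each hook has an empty default body, and the walker calls the hooks while traversing the tree.

```rust
/// Toy syntax tree; a stand-in for the compiler's AST/HIR.
enum Node {
    Ident(&'static str),
    Call { callee: Box<Node>, args: Vec<Node> },
}

/// Toy pass trait: every hook defaults to doing nothing, so a lint only
/// overrides the hooks it needs.
trait Pass {
    fn check_ident(&mut self, _name: &str) {}
    fn check_call(&mut self, _arg_count: usize) {}
}

/// The "visitor": walks the whole tree and calls the hooks for us.
fn walk(node: &Node, pass: &mut dyn Pass) {
    match node {
        Node::Ident(name) => pass.check_ident(name),
        Node::Call { callee, args } => {
            pass.check_call(args.len());
            walk(callee, pass);
            for arg in args {
                walk(arg, pass);
            }
        }
    }
}

/// A lint that only looks at identifiers and records single-letter ones.
#[derive(Default)]
struct ShortIdents {
    found: Vec<String>,
}

impl Pass for ShortIdents {
    fn check_ident(&mut self, name: &str) {
        if name.len() == 1 {
            self.found.push(name.to_string());
        }
    }
}

/// Models the expression `compute(x, total)`.
fn sample_tree() -> Node {
    Node::Call {
        callee: Box::new(Node::Ident("compute")),
        args: vec![Node::Ident("x"), Node::Ident("total")],
    }
}

fn main() {
    let mut pass = ShortIdents::default();
    walk(&sample_tree(), &mut pass);
    println!("single-letter identifiers: {:?}", pass.found);
}
```

`ShortIdents` never mentions calls at all; the default `check_call` is still invoked by the walker and simply does nothing.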

So in our case, to detect all calls to Option::unwrap and Result::unwrap, we will need to implement the check_expr method and check for each method call if it's one of the two:

use rustc_lint::{LateContext, LateLintPass, LintContext};
use rustc_hir::{Expr, ExprKind};
use rustc_middle::ty;
use rustc_span::sym;

impl<'tcx> LateLintPass<'tcx> for UnwrapCall {
    fn check_expr(&mut self, cx: &LateContext<'tcx>, expr: &'tcx Expr<'tcx>) {
        match expr.kind {
            // If this is a method...
            ExprKind::MethodCall(name, recv, _args, span) => {
                // And the method is named "unwrap"...
                if name.ident.as_str() != "unwrap" {
                    return;
                }

                // In here, `recv` is the caller of `unwrap`. So in
                // `x.unwrap()`, `recv` will be `x`.
                //
                // We compute the actual type of `x`. Quite useful if you have
                // `Self::unwrap()` for example because at this point, we need
                // to infer what `Self` (or any generic) actually is. Same goes
                // if you have `x.as_ref().unwrap()`. We need to infer which
                // type `as_ref` is returning.
                let caller_ty = cx.typeck_results().expr_ty(recv);
                let is_option_or_result = match caller_ty.kind() {
                    // `Result` and `Option` are enums so we only look at Adt
                    // (Algebraic data types).
                    ty::Adt(adt, _) => {
                        // The compiler is providing us a nice function to check
                        // if the `DefId` (which we get from `adt.did()`) is
                        // actually what we're looking for.
                        cx.tcx.is_diagnostic_item(sym::Option, adt.did())
                            || cx.tcx.is_diagnostic_item(sym::Result, adt.did())
                    }
                    _ => false,
                };
                if !is_option_or_result {
                    return;
                }
                // This is what we're looking for! We emit the lint with a
                // message.
                cx.struct_span_lint(
                    UNWRAP_CALL,
                    span,
                    "avoid using `unwrap` if possible",
                    |diag| diag,
                );
            }
            // We ignore everything else.
            _ => {}
        }
    }
}

Don't forget to import the crates we're using with extern crate in main.rs:

extern crate rustc_hir;
extern crate rustc_middle;
extern crate rustc_session;
extern crate rustc_span;

And we're done! Let's write a small piece of code to test it out (in foo.rs for example):

fn main() {
    let x = Ok::<&str, ()>("a").unwrap();
    let y = Some(12);
    let y = y.unwrap();
}

We run our linter:

cargo run --bin cargo-tools-inner foo.rs

which outputs:

warning: avoid using `unwrap` if possible
 --> foo.rs:2:33
  |
2 |     let x = Ok::<&str, ()>("a").unwrap();
  |                                 ^^^^^^^^
  |
  = note: `#[warn(tools::unwrap_call)]` on by default

warning: avoid using `unwrap` if possible
 --> foo.rs:4:15
  |
4 |     let y = y.unwrap();
  |               ^^^^^^^^

Victory!

So our linter can now run on simple code. But what happens if this code has dependencies? Suddenly, things become much more complicated! Luckily for us, there is cargo which we can use to handle the "boring" part. And luckily for you, rustc-tools has a helper for cargo integration as well.

cargo integration

To write a cargo command, you need to create a binary with a name starting with cargo-. And that's pretty much it. Luckily for us, if you copied the Cargo.toml file at the beginning, it's already called cargo-tools-inner (I will explain the -inner part later). So if we install it (with cargo), then we can run it with cargo tools-inner. However, doing this won't solve the dependency issue. To fix it, we will need to update our main function:

fn main() {
    // Different `skip`.
    let cargo_args = std::env::args().skip(2).collect::<Vec<_>>();
    // New call.
    rustc_tools::cargo_integration(&cargo_args, |args| {
        // `args` is all we need to give to `with_lints` as it includes
        // arguments to build with dependencies and all `rustc_driver` needs.
        with_lints(args, vec![], |store: &mut LintStore| {
            store.register_late_pass(|_| Box::new(unwrap_call::UnwrapCall));
        }).expect("with_lints failed");
    }).expect("cargo_integration failed");
}

There are two big changes here. First, we now skip not only the first but also the second command-line argument: the first is the path to our binary and the second is tools-inner, the subcommand name that cargo passes along.

The second big change is the call to cargo_integration. Internally, it will run cargo check (with the extra arguments provided by the user if any), retrieve the dependencies and all needed options and generate args that will be passed to the closure. As the documentation mentions: you must not pass subcommands to it (like build, doc, etc). Any other argument is fine.
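To make the two `skip` values concrete, here is a small standalone sketch. The helper names `direct_args` and `subcommand_args` and the paths are made up for illustration; the only thing it relies on is that cargo passes the subcommand name right after the binary path when it runs a custom subcommand.

```rust
/// Arguments for a direct invocation (for example through `cargo run`):
/// only the binary path has to be dropped.
fn direct_args(argv: &[String]) -> Vec<String> {
    argv.iter().skip(1).cloned().collect()
}

/// Arguments when cargo runs the binary as a subcommand: the subcommand
/// name comes right after the binary path, so two entries are dropped.
fn subcommand_args(argv: &[String]) -> Vec<String> {
    argv.iter().skip(2).cloned().collect()
}

fn main() {
    // Hypothetical argument lists, for illustration only.
    let direct: Vec<String> = ["target/debug/cargo-tools-inner", "foo.rs"]
        .map(String::from)
        .to_vec();
    let via_cargo: Vec<String> =
        ["/home/me/.cargo/bin/cargo-tools-inner", "tools-inner", "--release"]
            .map(String::from)
            .to_vec();

    println!("direct:    {:?}", direct_args(&direct));
    println!("via cargo: {:?}", subcommand_args(&via_cargo));
}
```

With `skip(1)` in the second case, `tools-inner` itself would be forwarded to cargo_integration, which is the kind of subcommand-like argument it must not receive.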

Let's install the binary:

cd [cargo-tools-inner location]
cargo install --path .

And now let's test it on a project:

$ cargo tools-inner
.cargo/bin/cargo-tools-inner: error while loading shared libraries:
librustc_driver-d069aac035ada932.so: cannot open shared object file: No such
file or directory

Oh no. What happened? Well, here comes the not-so-fun part: to run it, it needs to use exactly the same rustc version as the one it was compiled with. So this will work:

$ cargo +nightly-2023-10-06 tools-inner

However, it's really bad UX-wise. So now comes the final part: wrapping our cargo command.

Let's create another project named cargo-tools with this Cargo.toml:

[package]
name = "cargo-tools"
version = "0.0.1"
authors = []
license = "MIT"
edition = "2021"

As you can see, it's much simpler than the one for cargo-tools-inner, for one simple reason: it does almost nothing. Let's now create a src/main.rs file:

use std::process::{Command, exit};

fn main() {
    let cargo_args = std::env::args().skip(2).collect::<Vec<_>>();
    let toolchain = "+nightly-2023-10-06";

    if let Err(err) = Command::new("cargo")
        .arg(toolchain)
        .arg("tools-inner")
        .args(&cargo_args)
        .spawn()
        .and_then(|mut child| child.wait())
    {
        eprintln!(
            "Command `cargo {toolchain} tools-inner {}` failed: {err:?}",
            cargo_args.join(" "),
        );
        exit(1);
    }
}

And that's it! Let's install it:

cd [cargo-tools location]
cargo install --path .

You can now run:

cargo tools

It all works! So for your linter, you will need to distribute a wrapper (cargo-tools) and the actual linter (cargo-tools-inner). Not optimal, but until librustc_driver can be linked statically, there is not much that can be done for the moment.
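One possible refinement of the wrapper, sketched under assumptions: the version shown earlier only reports a failure to spawn, so a failing lint run would still make cargo tools exit with success. The helpers below (`inner_command`, `exit_code`) are hypothetical names; they build the same command and map the child's exit code to the wrapper's. The sketch deliberately does not spawn anything, so it has no side effects.

```rust
use std::process::Command;

/// Builds `cargo +<toolchain> tools-inner <args...>` without running it.
fn inner_command(toolchain: &str, args: &[String]) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg(format!("+{toolchain}")).arg("tools-inner").args(args);
    cmd
}

/// Maps the child's exit code (as returned by `ExitStatus::code`) to the code
/// the wrapper should exit with. A child killed by a signal has no code;
/// `1` is an arbitrary fallback chosen for this sketch.
fn exit_code(child_code: Option<i32>) -> i32 {
    child_code.unwrap_or(1)
}

fn main() {
    let args: Vec<String> = std::env::args().skip(2).collect();
    let cmd = inner_command("nightly-2023-10-06", &args);

    // A real wrapper would now run the command and forward its status:
    //     let status = cmd.status()?;
    //     std::process::exit(exit_code(status.code()));
    // Here we only print what would be run.
    println!("would run: {cmd:?}");
    println!("child code 101 -> wrapper exits with {}", exit_code(Some(101)));
}
```

Forwarding the exit code matters mostly for CI, where a lint set to deny should fail the job.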

Important to note: another project called dylint took another approach to solve this issue. Don't hesitate to compare both and take whichever seems better for you.

Potential improvements

In our unwrap_call lint, we don't check if the unwrap() call was generated from a macro (which can be checked with Span::from_expansion). We also don't handle this case:

let x = Some("a");
let x = Option::unwrap(x); // Should warn!

For that, we need to handle ExprKind::Call.
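The shape of both improvements can be sketched with a toy model. `Kind`, `Expr` and `should_warn` are hypothetical stand-ins, not rustc types, and the receiver type check is left out for brevity. In the real lint, the path of a call would be resolved to a definition rather than compared as text, and the boolean would come from Span::from_expansion.

```rust
/// Toy expression kinds mirroring the two call shapes.
enum Kind {
    /// `x.unwrap()`
    MethodCall { method: &'static str },
    /// `Option::unwrap(x)`: a plain call whose callee is a path.
    Call { path: &'static str },
}

struct Expr {
    kind: Kind,
    /// Stand-in for what `Span::from_expansion` would report.
    from_expansion: bool,
}

fn should_warn(expr: &Expr) -> bool {
    // Code generated by a macro is not something the user wrote directly.
    if expr.from_expansion {
        return false;
    }
    match expr.kind {
        Kind::MethodCall { method } => method == "unwrap",
        // Text comparison is a simplification made for this sketch.
        Kind::Call { path } => matches!(path, "Option::unwrap" | "Result::unwrap"),
    }
}

fn main() {
    let exprs = [
        ("x.unwrap()", Expr { kind: Kind::MethodCall { method: "unwrap" }, from_expansion: false }),
        ("Option::unwrap(x)", Expr { kind: Kind::Call { path: "Option::unwrap" }, from_expansion: false }),
        ("x.unwrap() inside a macro", Expr { kind: Kind::MethodCall { method: "unwrap" }, from_expansion: true }),
    ];
    for (label, expr) in &exprs {
        println!("{label}: warn = {}", should_warn(expr));
    }
}
```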

You can find the source code for both cargo-tools and cargo-tools-inner here. It also contains other lints if you want more examples.

I think with this you have all the basics to start writing your own linter. Have fun!

Words of the end

All this was written with the help(?) of my cat (who loves ice cream a bit too much):

Posted on the 18/01/2024 at 11:00 by @GuillaumeGomez
