New doc comment handling in rustdoc

Today, I'll talk about a recently merged rustdoc pull request (you can see it here) which introduced two great changes:

Unification of different doc comment item kinds.
Removal of dark magic around indent handling.

Unification of different doc comment kinds

Some of you may have needed to make a part of a doc comment "optional" or depend on a given build setting. For example:

/// ```
/// let example = Example::new()
///     .first("hello")
#[cfg_attr(not(feature = "one"), doc = "    .second(\"hello\")\n")]
///     .third("hello")
///     .build();
/// ```
pub struct Example;

With this, you would expect to get the following (when the "one" feature is not enabled):

let example = Example::new()
    .first("hello")
    .second("hello")
    .third("hello")
    .build();

But instead we got:

let example = Example::new()
    .first("hello")
.second("hello")

.third("hello")
    .build();

If we did enable the "one" feature though, we'd get:

let example = Example::new()
    .first("hello")
    .third("hello")
    .build();

So what is going on there? As you may have guessed from the title, .second("hello") comes from a #[doc = "..."] attribute, which is what we call the "raw doc" kind, whereas the /// lines are of the "sugared doc" kind. But then, why this strange behavior?

Before the change, we grouped doc fragments by kind and order. This means that as long as consecutive fragments had the same kind, we treated them together, but as soon as we encountered a new kind, we processed the previous group and then started a new one.
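To make the old behavior concrete, here is a minimal sketch of "grouping by kind and order". It is not rustdoc's actual code: the DocKind enum and the group_by_kind function are hypothetical names made up for this illustration, and each fragment is reduced to a kind plus its text.

```rust
/// Hypothetical model of the two doc fragment kinds discussed in this post.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DocKind {
    /// `/// text`
    Sugared,
    /// `#[doc = "text"]`
    Raw,
}

/// Groups consecutive fragments that share the same kind, preserving order.
/// A new group starts every time the kind changes.
fn group_by_kind<'a>(fragments: &[(DocKind, &'a str)]) -> Vec<(DocKind, Vec<&'a str>)> {
    let mut groups: Vec<(DocKind, Vec<&'a str>)> = Vec::new();
    for &(kind, text) in fragments {
        // Does the current fragment continue the last group?
        let same_kind = groups
            .last()
            .map_or(false, |(last_kind, _)| *last_kind == kind);
        if same_kind {
            if let Some((_, texts)) = groups.last_mut() {
                texts.push(text);
            }
        } else {
            groups.push((kind, vec![text]));
        }
    }
    groups
}
```

With the example above, the kinds are sugared, sugared, sugared, raw, sugared, sugared, sugared, so this gives three groups. Each group was then processed on its own, which is why .second("hello") did not end up aligned with its neighbors.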

It may sound illogical as is, but you'll understand why when we talk about the indent issues.

Just note that since this pull request, we now group all fragments as one, whatever their kinds. The interesting part starts just now. :)

A story about indents

At this stage, the big question remains: why did we handle the different doc fragment kinds like this? Well, let's take an example:

/// I am a sugared doc comment
///
#[doc = "I am a raw doc comment"]

Nothing weird, I guess? However, there is already one problem: the sugared doc comment starts with a whitespace whereas the raw one doesn't. This would render as:

 I am a sugared doc comment
I am a raw doc comment

You can see an unwanted leading whitespace in front of "I am a sugared doc comment". Not exactly what we want, right? So we have two solutions:

Add a whitespace at the beginning of all raw doc comments.
Remove the leading whitespace from all sugared doc comments.

Because yes, in case you didn't know, this is perfectly valid:

///I am a sugared doc comment

Just not great looking. :)
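The difference is easy to see if you model the text each form carries. The sketch below is only an illustration, assuming a doc comment is given as a single source line; sugared_text is a hypothetical helper, not a rustdoc function.

```rust
/// Returns the documentation text carried by a `///` line, that is,
/// everything after the three slashes, leading whitespace included.
/// Returns `None` if the line is not a `///` doc comment.
fn sugared_text(source_line: &str) -> Option<&str> {
    let line = source_line.trim_start();
    // Four or more slashes make a regular comment, not a doc comment.
    if line.starts_with("////") {
        return None;
    }
    line.strip_prefix("///")
}
```

Here "/// I am a sugared doc comment" yields " I am a sugared doc comment" (with the leading space), "///I am a sugared doc comment" yields the text without it, and a raw #[doc = "I am a raw doc comment"] carries exactly what is between the quotes.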

Another issue that led us to handle things this way was the following case:

#[doc = "Some raw doc comment
         but we want to keep the alignment!"]

Before this pull request, it would simply render as follows:

Some raw doc comment but we want to keep the alignment!

Because it was decided that it was more beautiful to handle indentation this way.

So with all this, it's at best very complicated to handle both kinds with their own weirdness, right? Now I think you got the full story behind it.

No more dark magic

Based on all this, I thought the whole thing was way too obscure and decided it was time to simplify things again. The only remaining issue was the common handling of the leading whitespace in sugared doc comments. We decided that when a doc comment mixes different fragment kinds, we would "favor" the sugared doc one. A bit more explanation is required here, I think.

Back to the first example:

/// ```
/// let example = Example::new()
///     .first("hello")
#[cfg_attr(not(feature = "one"), doc = "    .second(\"hello\")\n")]
///     .third("hello")
///     .build();
/// ```

The important thing to find first is the minimum indent value (which is 1 here, coming from let example = Example::new()). So by default, the minimum indent is always 1, unless you have:

///no_whitespace()

Then it'll be 0.

Therefore, we still handle sugared doc comments a bit differently in that regard.
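Here is a simplified sketch of that idea, to show how "favoring" the sugared kind plays out. It is not the code from the pull request: DocKind, DocFragment, leading_indent and unindent are hypothetical names, and the sketch assumes that a raw fragment in a mixed block counts as if it had one extra leading whitespace.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DocKind {
    Sugared, // `/// text`
    Raw,     // `#[doc = "text"]`
}

struct DocFragment {
    kind: DocKind,
    text: String,
}

/// Number of leading spaces or tabs on a line.
fn leading_indent(line: &str) -> usize {
    line.chars().take_while(|c| *c == ' ' || *c == '\t').count()
}

/// Unindents all fragments as one block and returns the resulting lines.
fn unindent(fragments: &[DocFragment]) -> Vec<String> {
    let has_sugared = fragments.iter().any(|f| f.kind == DocKind::Sugared);
    let has_raw = fragments.iter().any(|f| f.kind == DocKind::Raw);
    // When kinds are mixed, the sugared one is favored: raw lines count as if
    // they had one more leading whitespace, to match the one after `///`.
    let bonus = if has_sugared && has_raw { 1 } else { 0 };

    // Minimum indent over every non-blank line of every fragment.
    let min_indent = fragments
        .iter()
        .flat_map(|f| {
            let extra = if f.kind == DocKind::Raw { bonus } else { 0 };
            f.text
                .lines()
                .filter(|line| !line.trim().is_empty())
                .map(move |line| leading_indent(line) + extra)
        })
        .min()
        .unwrap_or(0);

    let mut out = Vec::new();
    for f in fragments {
        // An empty fragment (a bare `///`) is kept as a blank line.
        if f.text.is_empty() {
            out.push(String::new());
            continue;
        }
        // Raw fragments never had the extra whitespace, so strip one less.
        let strip = if f.kind == DocKind::Raw {
            min_indent.saturating_sub(bonus)
        } else {
            min_indent
        };
        for line in f.text.lines() {
            if line.trim().is_empty() {
                out.push(String::new());
            } else {
                out.push(line[strip..].to_string());
            }
        }
    }
    out
}
```

Feeding it the fragments of the example above gives a minimum indent of 1: each sugared line loses one whitespace, the raw line loses none, and .second("hello") ends up aligned with .first("hello") and .third("hello"). With only ///no_whitespace(), the minimum indent is 0 and nothing is removed.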

Now, you might wonder why we went through all this trouble about indentation. The answer is because of this:

/// blablabla
///
///     let code = 12;
///     println!("{}", code);

Here, if we didn't "unindent", we'd have a code block with 4 whitespaces at the start of each of its lines. As simple as that!

Final words

With this pull request, we simplified the handling of doc comments by unifying how we handle them, while making it easier for users to get the expected results. I think that overall, it's quite a big improvement and I'm very happy that this issue has finally been fixed.

Posted on the 11/11/2020 at 17:00 by @GuillaumeGomez
