Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser: use memchr to speed-up skip_till() #90

Merged
merged 1 commit into from
Jul 28, 2024

Conversation

Kijewski
Copy link
Collaborator

@Kijewski Kijewski commented Jul 25, 2024

skip_till() is used in the parser to find the next block {%, comment {# or expression {{. At every character position, it is tested if one of these three substrings follows. Using memchr3(), we could at least skip to the next candidate, a {. The syntax for blocks, comments and expressions can be modified by the user, but that does not matter much; we can simply supply memchr3() with the first byte in each of these strings.

Benchmark results (cd rinja_parser; cargo bench)
librustdoc/all          time:   [366.54 µs 366.78 µs 367.02 µs]
                        thrpt:  [38.475 MiB/s 38.500 MiB/s 38.525 MiB/s]
                 change:
                        time:   [-17.358% -17.065% -16.820%] (p = 0.00 < 0.05)
                        thrpt:  [+20.221% +20.576% +21.004%]
                        Performance has improved.

librustdoc/item_info    time:   [6.3315 µs 6.3400 µs 6.3495 µs]
                        thrpt:  [24.783 MiB/s 24.820 MiB/s 24.853 MiB/s]
                 change:
                        time:   [-6.5547% -6.4090% -6.2633%] (p = 0.00 < 0.05)
                        thrpt:  [+6.6818% +6.8479% +7.0144%]
                        Performance has improved.

librustdoc/item_union   time:   [39.377 µs 39.551 µs 39.720 µs]
                        thrpt:  [24.850 MiB/s 24.957 MiB/s 25.067 MiB/s]
                 change:
                        time:   [-6.9834% -6.2455% -5.5849%] (p = 0.00 < 0.05)
                        thrpt:  [+5.9153% +6.6616% +7.5077%]
                        Performance has improved.

librustdoc/page         time:   [170.83 µs 170.99 µs 171.23 µs]
                        thrpt:  [36.164 MiB/s 36.213 MiB/s 36.248 MiB/s]
                 change:
                        time:   [-12.413% -12.183% -11.968%] (p = 0.00 < 0.05)
                        thrpt:  [+13.595% +13.873% +14.173%]
                        Performance has improved.

librustdoc/print_item   time:   [21.163 µs 21.234 µs 21.322 µs]
                        thrpt:  [44.280 MiB/s 44.463 MiB/s 44.612 MiB/s]
                 change:
                        time:   [-19.848% -18.613% -17.491%] (p = 0.00 < 0.05)
                        thrpt:  [+21.198% +22.870% +24.763%]
                        Performance has improved.

librustdoc/short_item_info
                        time:   [19.781 µs 19.813 µs 19.846 µs]
                        thrpt:  [45.652 MiB/s 45.727 MiB/s 45.801 MiB/s]
                 change:
                        time:   [-18.027% -17.806% -17.574%] (p = 0.00 < 0.05)
                        thrpt:  [+21.321% +21.663% +21.991%]
                        Performance has improved.

librustdoc/sidebar      time:   [40.694 µs 40.806 µs 40.957 µs]
                        thrpt:  [30.131 MiB/s 30.242 MiB/s 30.325 MiB/s]
                 change:
                        time:   [-14.698% -14.069% -13.456%] (p = 0.00 < 0.05)
                        thrpt:  [+15.548% +16.372% +17.231%]
                        Performance has improved.

librustdoc/source       time:   [15.249 µs 15.264 µs 15.278 µs]
                        thrpt:  [48.251 MiB/s 48.295 MiB/s 48.343 MiB/s]
                 change:
                        time:   [-25.832% -25.678% -25.532%] (p = 0.00 < 0.05)
                        thrpt:  [+34.285% +34.550% +34.829%]
                        Performance has improved.

librustdoc/type_layout_size
                        time:   [9.0168 µs 9.0446 µs 9.0789 µs]
                        thrpt:  [29.832 MiB/s 29.945 MiB/s 30.038 MiB/s]
                 change:
                        time:   [-11.100% -10.437% -9.4426%] (p = 0.00 < 0.05)
                        thrpt:  [+10.427% +11.653% +12.486%]
                        Performance has improved.

librustdoc/type_layout  time:   [34.088 µs 34.114 µs 34.139 µs]
                        thrpt:  [78.860 MiB/s 78.919 MiB/s 78.979 MiB/s]
                 change:
                        time:   [-37.865% -37.723% -37.585%] (p = 0.00 < 0.05)
                        thrpt:  [+60.217% +60.573% +60.941%]
                        Performance has improved.

Resolves #88.

@Kijewski Kijewski force-pushed the pr-faster-skip-till branch from cfda7aa to d1e37af Compare July 25, 2024 21:40
@GuillaumeGomez
Copy link
Collaborator

PR looks good. Just a few things I need to confirm: you mentioned that memchr only works on ASCII characters. However, we can overload syntax (like comment start/end for example). Nothing is preventing users to use non-ASCII characters for that. Will it still work in this case? If yes, please add tests to ensure it.

@Kijewski
Copy link
Collaborator Author

you mentioned that memchr only works on ASCII characters

memchr works on any &[u8]. It does not know or care what the input it.

But we have to care, because we must not split a &str. The first byte of a UTF-8 encoded code point is either:

  • 0b0xxx_xxxx (ASCII, 1 byte),
  • 0b110x_xxxx (BMP, 2 bytes),
  • 0b1110_xxxx (3 bytes) or
  • 0b1111_0xxx (4 bytes),
  • but never 0b10xx_xxxx.

Knowing that, we know that the result of memchr will never point into a UTF-8 sequence, only at the start of one, because there is no way that memchr could have searched for a byte inside a UTF-8 sequence.

@Kijewski Kijewski force-pushed the pr-faster-skip-till branch 5 times, most recently from 9323920 to ce48db1 Compare July 28, 2024 14:36
@Kijewski
Copy link
Collaborator Author

Kijewski commented Jul 28, 2024

Rebased on the current HEAD. Some other PR did make the parsing faster in the meantime, too, so I re-did the benchmark. Updated the initial comment. Still a good speed-up.

@Kijewski Kijewski requested a review from GuillaumeGomez July 28, 2024 14:38
@@ -0,0 +1,97 @@
pub(crate) trait Splitter: Copy {
/// The any of the needles was found in the haystack, then split the haystack at the first hit.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something went wrong with this explanation. XD

Missing a word maybe between The and any at the beginning?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The" should have been "if". Common typo, the "the" key is right next to the "if" key. 🙈

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously 😛

Ast::from_str("{= strvar|e =}", None, &syntax).unwrap();

let syntax = Syntax {
expr_start: "🖎", // U+1F58E == b"\xf0\x9f\x96\x8e"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

Please just add a comment on this one explaining it ensures that unicode characters can be used as expression markers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that a speaking name is enough?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the function name makes it obvious for me at least.

@Kijewski Kijewski force-pushed the pr-faster-skip-till branch from ce48db1 to bf6fb32 Compare July 28, 2024 16:12
`skip_till()` is used in the parser to find the next block `{%`,
comment `{#` or expression `{{`. At every character position, it is
tested if one of these three substrings follows. Using [`memchr3()`],
we could at least skip to the next candidate, a `{`. The syntax for
blocks, comments and expressions can be modified by the user, but that
does not matter much; we can simply supply `memchr3()` with the first
byte in each of these strings.

[`memchr3()`]: <https://docs.rs/memchr/2.7.4/memchr/fn.memrchr3.html>

```text
librustdoc/all          time:   [366.54 µs 366.78 µs 367.02 µs]
                        thrpt:  [38.475 MiB/s 38.500 MiB/s 38.525 MiB/s]
                 change:
                        time:   [-17.358% -17.065% -16.820%] (p = 0.00 < 0.05)
                        thrpt:  [+20.221% +20.576% +21.004%]
                        Performance has improved.

librustdoc/item_info    time:   [6.3315 µs 6.3400 µs 6.3495 µs]
                        thrpt:  [24.783 MiB/s 24.820 MiB/s 24.853 MiB/s]
                 change:
                        time:   [-6.5547% -6.4090% -6.2633%] (p = 0.00 < 0.05)
                        thrpt:  [+6.6818% +6.8479% +7.0144%]
                        Performance has improved.

librustdoc/item_union   time:   [39.377 µs 39.551 µs 39.720 µs]
                        thrpt:  [24.850 MiB/s 24.957 MiB/s 25.067 MiB/s]
                 change:
                        time:   [-6.9834% -6.2455% -5.5849%] (p = 0.00 < 0.05)
                        thrpt:  [+5.9153% +6.6616% +7.5077%]
                        Performance has improved.

librustdoc/page         time:   [170.83 µs 170.99 µs 171.23 µs]
                        thrpt:  [36.164 MiB/s 36.213 MiB/s 36.248 MiB/s]
                 change:
                        time:   [-12.413% -12.183% -11.968%] (p = 0.00 < 0.05)
                        thrpt:  [+13.595% +13.873% +14.173%]
                        Performance has improved.

librustdoc/print_item   time:   [21.163 µs 21.234 µs 21.322 µs]
                        thrpt:  [44.280 MiB/s 44.463 MiB/s 44.612 MiB/s]
                 change:
                        time:   [-19.848% -18.613% -17.491%] (p = 0.00 < 0.05)
                        thrpt:  [+21.198% +22.870% +24.763%]
                        Performance has improved.

librustdoc/short_item_info
                        time:   [19.781 µs 19.813 µs 19.846 µs]
                        thrpt:  [45.652 MiB/s 45.727 MiB/s 45.801 MiB/s]
                 change:
                        time:   [-18.027% -17.806% -17.574%] (p = 0.00 < 0.05)
                        thrpt:  [+21.321% +21.663% +21.991%]
                        Performance has improved.

librustdoc/sidebar      time:   [40.694 µs 40.806 µs 40.957 µs]
                        thrpt:  [30.131 MiB/s 30.242 MiB/s 30.325 MiB/s]
                 change:
                        time:   [-14.698% -14.069% -13.456%] (p = 0.00 < 0.05)
                        thrpt:  [+15.548% +16.372% +17.231%]
                        Performance has improved.

librustdoc/source       time:   [15.249 µs 15.264 µs 15.278 µs]
                        thrpt:  [48.251 MiB/s 48.295 MiB/s 48.343 MiB/s]
                 change:
                        time:   [-25.832% -25.678% -25.532%] (p = 0.00 < 0.05)
                        thrpt:  [+34.285% +34.550% +34.829%]
                        Performance has improved.

librustdoc/type_layout_size
                        time:   [9.0168 µs 9.0446 µs 9.0789 µs]
                        thrpt:  [29.832 MiB/s 29.945 MiB/s 30.038 MiB/s]
                 change:
                        time:   [-11.100% -10.437% -9.4426%] (p = 0.00 < 0.05)
                        thrpt:  [+10.427% +11.653% +12.486%]
                        Performance has improved.

librustdoc/type_layout  time:   [34.088 µs 34.114 µs 34.139 µs]
                        thrpt:  [78.860 MiB/s 78.919 MiB/s 78.979 MiB/s]
                 change:
                        time:   [-37.865% -37.723% -37.585%] (p = 0.00 < 0.05)
                        thrpt:  [+60.217% +60.573% +60.941%]
                        Performance has improved.
```
@Kijewski Kijewski force-pushed the pr-faster-skip-till branch from bf6fb32 to 7b99783 Compare July 28, 2024 16:15
@GuillaumeGomez
Copy link
Collaborator

Thanks!

@GuillaumeGomez GuillaumeGomez merged commit bf03e44 into askama-rs:master Jul 28, 2024
17 checks passed
@Kijewski Kijewski deleted the pr-faster-skip-till branch July 28, 2024 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

parser: use memchr to speed-up skip_till()
2 participants