-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parser: use memchr
to speed-up skip_till()
#90
Conversation
cfda7aa
to
d1e37af
Compare
PR looks good. Just a few things I need to confirm: you mentioned that |
But we have to care, because we must not split a
Knowing that, we know that the result of |
9323920
to
ce48db1
Compare
Rebased on the current HEAD. Some other PR did make the parsing faster in the meantime, too, so I re-did the benchmark. Updated the initial comment. Still a good speed-up. |
rinja_parser/src/memchr_splitter.rs
Outdated
@@ -0,0 +1,97 @@ | |||
pub(crate) trait Splitter: Copy { | |||
/// The any of the needles was found in the haystack, then split the haystack at the first hit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something went wrong with this explanation. XD
Missing a word maybe between The
and any
at the beginning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The" should have been "if". Common typo, the "the" key is right next to the "if" key. 🙈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously 😛
Ast::from_str("{= strvar|e =}", None, &syntax).unwrap(); | ||
|
||
let syntax = Syntax { | ||
expr_start: "🖎", // U+1F58E == b"\xf0\x9f\x96\x8e" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍
Please just add a comment on this one explaining it ensures that unicode characters can be used as expression markers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think that a speaking name is enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, the function name makes it obvious for me at least.
ce48db1
to
bf6fb32
Compare
`skip_till()` is used in the parser to find the next block `{%`, comment `{#` or expression `{{`. At every character position, it is tested if one of these three substrings follows. Using [`memchr3()`], we could at least skip to the next candidate, a `{`. The syntax for blocks, comments and expressions can be modified by the user, but that does not matter much; we can simply supply `memchr3()` with the first byte in each of these strings. [`memchr3()`]: <https://docs.rs/memchr/2.7.4/memchr/fn.memrchr3.html> ```text librustdoc/all time: [366.54 µs 366.78 µs 367.02 µs] thrpt: [38.475 MiB/s 38.500 MiB/s 38.525 MiB/s] change: time: [-17.358% -17.065% -16.820%] (p = 0.00 < 0.05) thrpt: [+20.221% +20.576% +21.004%] Performance has improved. librustdoc/item_info time: [6.3315 µs 6.3400 µs 6.3495 µs] thrpt: [24.783 MiB/s 24.820 MiB/s 24.853 MiB/s] change: time: [-6.5547% -6.4090% -6.2633%] (p = 0.00 < 0.05) thrpt: [+6.6818% +6.8479% +7.0144%] Performance has improved. librustdoc/item_union time: [39.377 µs 39.551 µs 39.720 µs] thrpt: [24.850 MiB/s 24.957 MiB/s 25.067 MiB/s] change: time: [-6.9834% -6.2455% -5.5849%] (p = 0.00 < 0.05) thrpt: [+5.9153% +6.6616% +7.5077%] Performance has improved. librustdoc/page time: [170.83 µs 170.99 µs 171.23 µs] thrpt: [36.164 MiB/s 36.213 MiB/s 36.248 MiB/s] change: time: [-12.413% -12.183% -11.968%] (p = 0.00 < 0.05) thrpt: [+13.595% +13.873% +14.173%] Performance has improved. librustdoc/print_item time: [21.163 µs 21.234 µs 21.322 µs] thrpt: [44.280 MiB/s 44.463 MiB/s 44.612 MiB/s] change: time: [-19.848% -18.613% -17.491%] (p = 0.00 < 0.05) thrpt: [+21.198% +22.870% +24.763%] Performance has improved. librustdoc/short_item_info time: [19.781 µs 19.813 µs 19.846 µs] thrpt: [45.652 MiB/s 45.727 MiB/s 45.801 MiB/s] change: time: [-18.027% -17.806% -17.574%] (p = 0.00 < 0.05) thrpt: [+21.321% +21.663% +21.991%] Performance has improved. librustdoc/sidebar time: [40.694 µs 40.806 µs 40.957 µs] thrpt: [30.131 MiB/s 30.242 MiB/s 30.325 MiB/s] change: time: [-14.698% -14.069% -13.456%] (p = 0.00 < 0.05) thrpt: [+15.548% +16.372% +17.231%] Performance has improved. librustdoc/source time: [15.249 µs 15.264 µs 15.278 µs] thrpt: [48.251 MiB/s 48.295 MiB/s 48.343 MiB/s] change: time: [-25.832% -25.678% -25.532%] (p = 0.00 < 0.05) thrpt: [+34.285% +34.550% +34.829%] Performance has improved. librustdoc/type_layout_size time: [9.0168 µs 9.0446 µs 9.0789 µs] thrpt: [29.832 MiB/s 29.945 MiB/s 30.038 MiB/s] change: time: [-11.100% -10.437% -9.4426%] (p = 0.00 < 0.05) thrpt: [+10.427% +11.653% +12.486%] Performance has improved. librustdoc/type_layout time: [34.088 µs 34.114 µs 34.139 µs] thrpt: [78.860 MiB/s 78.919 MiB/s 78.979 MiB/s] change: time: [-37.865% -37.723% -37.585%] (p = 0.00 < 0.05) thrpt: [+60.217% +60.573% +60.941%] Performance has improved. ```
bf6fb32
to
7b99783
Compare
Thanks! |
skip_till()
is used in the parser to find the next block{%
, comment{#
or expression{{
. At every character position, it is tested if one of these three substrings follows. Usingmemchr3()
, we could at least skip to the next candidate, a{
. The syntax for blocks, comments and expressions can be modified by the user, but that does not matter much; we can simply supplymemchr3()
with the first byte in each of these strings.Benchmark results (
cd rinja_parser; cargo bench
)Resolves #88.