Commit 0375954

regex_macros: delete it
The regex_macros crate hasn't been maintained in quite some time, and has been broken. Nobody has complained. Given the fact that there are no immediate plans to improve the situation, and the fact that it is slower than the runtime engine, we simply remove it.
1 parent b8f56f1 commit 0375954
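
For call sites that still use `regex!(...)`, one low-effort migration is a runtime shim in the style of the `macro_rules! regex` definition that this commit keeps in `bench/src/bench.rs`. The sketch below is illustrative only. Its `Regex` struct is a hypothetical stand-in (a balanced-parentheses check plus substring search) so that the example compiles without dependencies; in real code it would presumably be replaced by `use regex::Regex;`. Unlike the plugin, an invalid pattern here panics at runtime rather than failing the build.

```rust
// Hypothetical stand-in for `regex::Regex`, present only so this sketch
// compiles without external crates. It "validates" a pattern by checking
// that parentheses balance and "matches" by plain substring search.
#[derive(Debug)]
pub struct Regex {
    pattern: String,
}

impl Regex {
    /// Mirrors the fallible shape of `regex::Regex::new`.
    pub fn new(pattern: &str) -> Result<Regex, String> {
        let mut depth: i32 = 0;
        for c in pattern.chars() {
            match c {
                '(' => depth += 1,
                ')' => depth -= 1,
                _ => {}
            }
            if depth < 0 {
                return Err(format!("unopened group in {:?}", pattern));
            }
        }
        if depth != 0 {
            return Err(format!("unclosed group in {:?}", pattern));
        }
        Ok(Regex { pattern: pattern.to_owned() })
    }

    /// Toy matcher: literal substring search, not real regex semantics.
    pub fn is_match(&self, text: &str) -> bool {
        text.contains(self.pattern.as_str())
    }
}

// Runtime replacement for the removed plugin macro, following the shape of
// the shim in bench/src/bench.rs: build the regex, panic on a bad pattern.
macro_rules! regex {
    ($re:expr) => {
        Regex::new(&$re.to_owned()).unwrap()
    };
}
```

With the shim in scope, `regex!("needle").is_match("hay needle hay")` evaluates to `true`, while `Regex::new("(")` returns an `Err` instead of stopping compilation.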

15 files changed, +73 -1081 lines changed

HACKING.md

+30 -37
@@ -185,37 +185,36 @@ A regular expression program is essentially a sequence of opcodes produced by
 the compiler plus various facts about the regular expression (such as whether
 it is anchored, its capture names, etc.).

-### The regex! macro (or why `regex::internal` exists)
-
-The `regex!` macro is defined in the `regex_macros` crate as a compiler plugin,
-which is maintained in this repository. The `regex!` macro compiles a regular
-expression at compile time into specialized Rust code.
-
-The `regex!` macro was written when this library was first conceived and
-unfortunately hasn't changed much since then. In particular, it encodes the
-entire Pike VM into stack allocated space (no heap allocation is done). When
-`regex!` was first written, this provided a substantial speed boost over
-so-called "dynamic" regexes compiled at runtime, and in particular had much
-lower overhead per match. This was because the only matching engine at the
-time was the Pike VM. The addition of other matching engines has inverted
-the relationship; the `regex!` macro is almost never faster than the dynamic
-variant. (In fact, it is typically substantially slower.)
-
-In order to build the `regex!` macro this way, it must have access to some
-internals of the regex library, which is in a distinct crate. (Compiler plugins
-must be part of a distinct crate.) Namely, it must be able to compile a regular
-expression and access its opcodes. The necessary internals are exported as part
-of the top-level `internal` module in the regex library, but is hidden from
-public documentation. In order to present a uniform API between programs build
-by the `regex!` macro and their dynamic analoges, the `Regex` type is an enum
-whose variants are hidden from public documentation.
-
-In the future, the `regex!` macro should probably work more like Ragel, but
-it's not clear how hard this is. In particular, the `regex!` macro should be
-able to support all the features of dynamic regexes, which may be hard to do
-with a Ragel-style implementation approach. (Which somewhat suggests that the
-`regex!` macro may also need to grow conditional execution logic like the
-dynamic variants, which seems rather grotesque.)
+### The regex! macro
+
+The `regex!` macro no longer exists. It was developed in a bygone era as a
+compiler plugin during the infancy of the regex crate. Back then, then only
+matching engine in the crate was the Pike VM. The `regex!` macro was, itself,
+also a Pike VM. The only advantages it offered over the dynamic Pike VM that
+was built at runtime were the following:
+
+1. Syntax checking was done at compile time. Your Rust program wouldn't
+compile if your regex didn't compile.
+2. Reduction of overhead that was proportional to the size of the regex.
+For the most part, this overhead consisted of heap allocation, which
+was nearly eliminated in the compiler plugin.
+
+The main takeaway here is that the compiler plugin was a marginally faster
+version of a slow regex engine. As the regex crate evolved, it grew other regex
+engines (DFA, bounded backtracker) and sophisticated literal optimizations.
+The regex macro didn't keep pace, and it therefore became (dramatically) slower
+than the dynamic engines. The only reason left to use it was for the compile
+time guarantee that your regex is correct. Fortunately, Clippy (the Rust lint
+tool) has a lint that checks your regular expression validity, which mostly
+replaces that use case.
+
+Additionally, the regex compiler plugin stopped receiving maintenance. Nobody
+complained. At that point, it seemed prudent to just remove it.
+
+Will a compiler plugin be brought back? The future is murky, but there is
+definitely an opportunity there to build something that is faster than the
+dynamic engines in some cases. But it will be challenging! As of now, there
+are no plans to work on this.


 ## Testing
@@ -236,7 +235,6 @@ the AT&T test suite) and code generate tests for each matching engine. The
 approach we use in this library is to create a Cargo.toml entry point for each
 matching engine we want to test. The entry points are:

-* `tests/test_plugin.rs` - tests the `regex!` macro
 * `tests/test_default.rs` - tests `Regex::new`
 * `tests/test_default_bytes.rs` - tests `bytes::Regex::new`
 * `tests/test_nfa.rs` - tests `Regex::new`, forced to use the NFA
@@ -261,10 +259,6 @@ entry points, it can take a while to compile everything. To reduce compile
 times slightly, try using `cargo test --test default`, which will only use the
 `tests/test_default.rs` entry point.

-N.B. To run tests for the `regex!` macro, use:
-
-    cargo test --manifest-path regex_macros/Cargo.toml
-

 ## Benchmarking

@@ -284,7 +278,6 @@ separately from the main regex crate.
 Benchmarking follows a similarly wonky setup as tests. There are multiple entry
 points:

-* `bench_rust_plugin.rs` - benchmarks the `regex!` macro
 * `bench_rust.rs` - benchmarks `Regex::new`
 * `bench_rust_bytes.rs` benchmarks `bytes::Regex::new`
 * `bench_pcre.rs` - benchmarks PCRE

README.md

-31
@@ -188,37 +188,6 @@ assert!(!matches.matched(5));
 assert!(matches.matched(6));
 ```

-### Usage: `regex!` compiler plugin
-
-**WARNING**: The `regex!` compiler plugin is orders of magnitude slower than
-the normal `Regex::new(...)` usage. You should not use the compiler plugin
-unless you have a very special reason for doing so. The performance difference
-may be the temporary, but the path forward at this point isn't clear.
-
-The `regex!` compiler plugin will compile your regexes at compile time. **This
-only works with a nightly compiler.**
-
-Here is a small example:
-
-```rust
-#![feature(plugin)]
-
-#![plugin(regex_macros)]
-extern crate regex;
-
-fn main() {
-    let re = regex!(r"(\d{4})-(\d{2})-(\d{2})");
-    let caps = re.captures("2010-03-14").unwrap();
-
-    assert_eq!("2010", caps[1]);
-    assert_eq!("03", caps[2]);
-    assert_eq!("14", caps[3]);
-}
-```
-
-Notice that we never `unwrap` the result of `regex!`. This is because your
-*program* won't compile if the regex doesn't compile. (Try `regex!("(")`.)
-

 ### Usage: a regular expression parser

bench/Cargo.toml

-1
@@ -49,7 +49,6 @@ re-onig = ["onig"]
 re-re2 = []
 re-rust = []
 re-rust-bytes = []
-re-rust-plugin = ["regex_macros"]
 re-tcl = []

 [[bench]]

bench/run

+1 -4
@@ -1,7 +1,7 @@
 #!/bin/bash

 usage() {
-    echo "Usage: $(basename $0) [rust | rust-bytes | rust-plugin | pcre1 | pcre2 | re2 | onig | tcl ]" >&2
+    echo "Usage: $(basename $0) [rust | rust-bytes | pcre1 | pcre2 | re2 | onig | tcl ]" >&2
     exit 1
 }

@@ -22,9 +22,6 @@ case $which in
     rust-bytes)
         exec cargo bench --bench bench --features re-rust-bytes "$@"
         ;;
-    rust-plugin)
-        exec cargo bench --bench bench --features re-rust-plugin "$@"
-        ;;
     re2)
         exec cargo bench --bench bench --features re-re2 "$@"
         ;;

bench/src/bench.rs

+2 -13
@@ -11,11 +11,6 @@
 // Enable the benchmarking harness.
 #![feature(test)]

-// If we're benchmarking the Rust regex plugin, then pull that in.
-// This will bring a `regex!` macro into scope.
-#![cfg_attr(feature = "re-rust-plugin", feature(plugin))]
-#![cfg_attr(feature = "re-rust-plugin", plugin(regex_macros))]
-
 #[macro_use]
 extern crate lazy_static;
 #[cfg(not(any(feature = "re-rust", feature = "re-rust-bytes")))]
@@ -27,7 +22,6 @@ extern crate onig;
 #[cfg(any(
     feature = "re-rust",
     feature = "re-rust-bytes",
-    feature = "re-rust-plugin",
 ))]
 extern crate regex;
 #[cfg(feature = "re-rust")]
@@ -43,7 +37,7 @@ pub use ffi::pcre1::Regex;
 pub use ffi::pcre2::Regex;
 #[cfg(feature = "re-re2")]
 pub use ffi::re2::Regex;
-#[cfg(any(feature = "re-rust", feature = "re-rust-plugin"))]
+#[cfg(feature = "re-rust")]
 pub use regex::Regex;
 #[cfg(feature = "re-rust-bytes")]
 pub use regex::bytes::Regex;
@@ -52,14 +46,11 @@ pub use ffi::tcl::Regex;

 // Usage: regex!(pattern)
 //
-// Builds a ::Regex from a borrowed string. This is used in every regex
-// engine except for the Rust plugin, because the plugin itself defines the
-// same macro.
+// Builds a ::Regex from a borrowed string.
 //
 // Due to macro scoping rules, this definition only applies for the modules
 // defined below. Effectively, it allows us to use the same tests for both
 // native and dynamic regexes.
-#[cfg(not(feature = "re-rust-plugin"))]
 macro_rules! regex {
     ($re:expr) => { ::Regex::new(&$re.to_owned()).unwrap() }
 }
@@ -99,7 +90,6 @@ macro_rules! text {
     feature = "re-pcre2",
     feature = "re-re2",
     feature = "re-rust",
-    feature = "re-rust-plugin",
 ))]
 macro_rules! text {
     ($text:expr) => { $text }
@@ -116,7 +106,6 @@ type Text = Vec<u8>;
     feature = "re-pcre2",
     feature = "re-re2",
     feature = "re-rust",
-    feature = "re-rust-plugin",
 ))]
 type Text = String;

bench/src/main.rs

-1
@@ -18,7 +18,6 @@ extern crate onig;
 #[cfg(any(
     feature = "re-rust",
     feature = "re-rust-bytes",
-    feature = "re-rust-plugin",
 ))]
 extern crate regex;
 #[cfg(feature = "re-rust")]

bench/src/misc.rs

-1
@@ -19,7 +19,6 @@ use {Regex, Text};
 #[cfg(not(feature = "re-onig"))]
 #[cfg(not(feature = "re-pcre1"))]
 #[cfg(not(feature = "re-pcre2"))]
-#[cfg(not(feature = "re-rust-plugin"))]
 bench_match!(no_exponential, {
     format!(
         "{}{}",

ci/run-kcov

-14
@@ -14,15 +14,10 @@ tests=(
     regex
 )
 tmpdir=$(mktemp -d)
-with_plugin=
 coveralls_id=

 while true; do
     case "$1" in
-        --with-plugin)
-            with_plugin=yes
-            shift
-            ;;
         --coveralls-id)
             coveralls_id="$2"
             shift 2
@@ -33,15 +28,6 @@ while true; do
     esac
 done

-if [ -n "$with_plugin" ]; then
-    cargo test --manifest-path regex_macros/Cargo.toml --no-run --verbose
-    kcov \
-        --verify \
-        --include-pattern '/regex/src/' \
-        "$tmpdir/plugin" \
-        $(ls -t ./regex_macros/target/debug/plugin-* | head -n1)
-fi
-
 cargo test --no-run --verbose --jobs 4
 for t in ${tests[@]}; do
     kcov \

regex_macros/Cargo.toml

-35
This file was deleted.
