Commit a71b71e

New cosmetic filter parser using CSSTree library
The new parser no longer uses the browser DOM to validate cosmetic filters; validation is now done through a JS library, CSSTree. This means filter list authors will have to be more careful to ensure that a cosmetic filter is really valid, as there is no longer any guarantee that a cosmetic filter which works on a given browser/version will still work properly on another browser, or on a different version of the same browser.

This change has become necessary for many reasons, one of them being the flakiness of the previous parser, as exposed by several recent issues:

- uBlockOrigin/uBlock-issues#2262
- uBlockOrigin/uBlock-issues#2228

The new parser introduces breaking changes; there was no way to do otherwise. Some current procedural cosmetic filters will be shown as invalid with this change. This occurs because the CSSTree library gets confused by some syntax which the previous, more permissive parser allowed.

The issue is mainly with the arguments passed to some procedural cosmetic filters, and it can be solved as follows:

- Use quotes around the argument. You can use either single or double quotes, whichever is most convenient. If your argument contains a single quote, use double quotes, and vice versa.
- Additionally, try to escape a quote inside an argument using a backslash. This may work, but if not, use quotes around the argument.

When the parser encounters quotes around an argument, it will discard them before trying to process the argument; the same goes for escaped quotes inside the argument.

Examples:

Breakage: ...##^script:has-text(toscr')
Fix: ...##^script:has-text(toscr\')

Breakage: ...##:xpath(//*[contains(text(),"VPN")]):upward(2)
Fix: ...##:xpath('//*[contains(text(),"VPN")]'):upward(2)

There are not many filters which break in the default set of filter lists, so this should be workable for default lists.

Unfortunately those fixes will break the filter for previous versions of uBO, since these do not deal with quoted arguments. In such cases, it may be necessary to keep the previous filter, which will be discarded as broken on newer versions of uBO.

This was a necessary change, as the old parser was becoming more and more flaky after being constantly patched for new cases arising. The new parser should be far more robust, and should stay robust as the procedural cosmetic filter syntax expands.

Additionally, in the MV3 version, filters are pre-compiled using a Node.js script, i.e. outside the browser, so validating cosmetic filters using a live DOM no longer made sense.

This new parser will have to be tested thoroughly before stable release.
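The quoting rules described above can be illustrated with a small sketch. This is not the parser's actual code: the commit message only says that surrounding quotes and escaped quotes are discarded before the argument is processed. `unquoteArgument` is a hypothetical helper written for illustration.

```javascript
// Hypothetical helper, not part of uBO: strips one pair of matching
// surrounding quotes, then unescapes backslash-escaped quotes.
function unquoteArgument(arg) {
    const first = arg.charAt(0);
    const last = arg.charAt(arg.length - 1);
    let out = arg;
    if ( arg.length >= 2 && first === last && (first === '"' || first === "'") ) {
        out = arg.slice(1, -1);
    }
    // Turn \' into ' and \" into "; other backslashes are left untouched.
    return out.replace(/\\(['"])/g, '$1');
}

// The two "Fix" arguments from the commit message:
console.log(unquoteArgument("toscr\\'"));
console.log(unquoteArgument(`'//*[contains(text(),"VPN")]'`));
```

Under these assumptions, both fixed filters end up with the same argument text that the old, unquoted form intended.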
1 parent fe21ce5 commit a71b71e

14 files changed: +556 -550 lines changed

platform/common/vapi-common.js

+5
@@ -22,6 +22,8 @@

 // For background page or non-background pages

+/* global browser */
+
 'use strict';

 /******************************************************************************/
@@ -89,6 +91,9 @@ vAPI.webextFlavor = {
             soup.add('chromium')
                 .add('user_stylesheet');
             flavor.major = parseInt(match[1], 10) || 0;
+            if ( flavor.major >= 105 ) {
+                soup.add('native_css_has');
+            }
         }

         // Don't starve potential listeners
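The second hunk adds a `native_css_has` token to the flavor "soup" when the Chromium major version is 105 or higher, presumably because that is when Chromium gained native `:has()` support. A standalone sketch of that gating follows; `flavorFromUserAgent` and its regex are hypothetical simplifications, since the real `match` expression lies outside the hunk.

```javascript
// Hypothetical, simplified stand-in for the detection in vapi-common.js.
function flavorFromUserAgent(ua) {
    const flavor = { major: 0, soup: new Set() };
    // Assumption: a simplified regex; the real one is not shown in the diff.
    const match = /\bChrom(?:e|ium)\/(\d+)/.exec(ua);
    if ( match !== null ) {
        flavor.soup.add('chromium').add('user_stylesheet');
        flavor.major = parseInt(match[1], 10) || 0;
        // Same threshold as in the hunk above.
        if ( flavor.major >= 105 ) {
            flavor.soup.add('native_css_has');
        }
    }
    return flavor;
}

const flavor = flavorFromUserAgent('Mozilla/5.0 Chrome/105.0.0.0 Safari/537.36');
console.log(flavor.soup.has('native_css_has'));
```

Callers elsewhere in the commit then read the token with `vAPI.webextFlavor.env.includes('native_css_has')` and pass the result to the parser as `nativeCssHas`.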

platform/mv3/make-rulesets.js

+1 -1
@@ -55,7 +55,7 @@ const outputDir = commandLineArgs.get('output') || '.';
 const cacheDir = `${outputDir}/../mv3-data`;
 const rulesetDir = `${outputDir}/rulesets`;
 const scriptletDir = `${rulesetDir}/js`;
-const env = [ 'chromium', 'ubol' ];
+const env = [ 'chromium', 'ubol', 'native_css_has' ];

 /******************************************************************************/

src/about.html

+1
@@ -37,6 +37,7 @@
         <div class="li"><span><a href="https://github.com/foo123/RegexAnalyzer" target="_blank">Regular Expression Analyzer</a> by <a href="https://github.com/foo123">Nikos M.</a></span></div>
         <div class="li"><span><a href="https://github.com/hsluv/hsluv" target="_blank">HSLuv - Human-friendly HSL</a> by <a href="https://github.com/boronine">Alexei Boronine</a></span></div>
         <div class="li"><span><a href="https://searchfox.org/mozilla-central/rev/d317e93d9a59c9e4c06ada85fbff9f6a1ceaaad1/browser/extensions/webcompat/shims/google-ima.js" target="_blank">google-ima.js</a> by <a href="https://www.mozilla.org/">Mozilla</a></span></div>
+        <div class="li"><span><a href="https://github.com/csstree/csstree" target="_blank">CSSTree</a> by <a href="https://github.com/lahmatiy">Roman Dvornov</a></span></div>
     </div>
     <div class="li"><span data-i18n="aboutCDNs"></span></div>
     <div class="liul">

src/js/contentscript-extra.js

+35 -32
@@ -95,19 +95,6 @@ class PSelectorMatchesCSSTask extends PSelectorTask {
         }
     }
 }
-class PSelectorMatchesCSSAfterTask extends PSelectorMatchesCSSTask {
-    constructor(task) {
-        super(task);
-        this.pseudo = '::after';
-    }
-}
-
-class PSelectorMatchesCSSBeforeTask extends PSelectorMatchesCSSTask {
-    constructor(task) {
-        super(task);
-        this.pseudo = '::before';
-    }
-}

 class PSelectorMatchesMediaTask extends PSelectorTask {
     constructor(task) {
@@ -247,6 +234,20 @@ class PSelectorSpathTask extends PSelectorTask {
             output.push(node);
         }
     }
+    // Helper method for other operators.
+    static qsa(node, selector) {
+        const parent = node.parentElement;
+        if ( parent === null ) { return []; }
+        let pos = 1;
+        for (;;) {
+            node = node.previousElementSibling;
+            if ( node === null ) { break; }
+            pos += 1;
+        }
+        return parent.querySelectorAll(
+            `:scope > :nth-child(${pos})${selector}`
+        );
+    }
 }

 class PSelectorUpwardTask extends PSelectorTask {
@@ -339,23 +340,20 @@ class PSelector {
     constructor(o) {
         if ( PSelector.prototype.operatorToTaskMap === undefined ) {
             PSelector.prototype.operatorToTaskMap = new Map([
-                [ ':has', PSelectorIfTask ],
-                [ ':has-text', PSelectorHasTextTask ],
-                [ ':if', PSelectorIfTask ],
-                [ ':if-not', PSelectorIfNotTask ],
-                [ ':matches-css', PSelectorMatchesCSSTask ],
-                [ ':matches-css-after', PSelectorMatchesCSSAfterTask ],
-                [ ':matches-css-before', PSelectorMatchesCSSBeforeTask ],
-                [ ':matches-media', PSelectorMatchesMediaTask ],
-                [ ':matches-path', PSelectorMatchesPathTask ],
-                [ ':min-text-length', PSelectorMinTextLengthTask ],
-                [ ':not', PSelectorIfNotTask ],
-                [ ':nth-ancestor', PSelectorUpwardTask ],
-                [ ':others', PSelectorOthersTask ],
-                [ ':spath', PSelectorSpathTask ],
-                [ ':upward', PSelectorUpwardTask ],
-                [ ':watch-attr', PSelectorWatchAttrs ],
-                [ ':xpath', PSelectorXpathTask ],
+                [ 'has', PSelectorIfTask ],
+                [ 'has-text', PSelectorHasTextTask ],
+                [ 'if', PSelectorIfTask ],
+                [ 'if-not', PSelectorIfNotTask ],
+                [ 'matches-css', PSelectorMatchesCSSTask ],
+                [ 'matches-media', PSelectorMatchesMediaTask ],
+                [ 'matches-path', PSelectorMatchesPathTask ],
+                [ 'min-text-length', PSelectorMinTextLengthTask ],
+                [ 'not', PSelectorIfNotTask ],
+                [ 'others', PSelectorOthersTask ],
+                [ 'spath', PSelectorSpathTask ],
+                [ 'upward', PSelectorUpwardTask ],
+                [ 'watch-attr', PSelectorWatchAttrs ],
+                [ 'xpath', PSelectorXpathTask ],
             ]);
         }
         this.raw = o.raw;
@@ -374,7 +372,12 @@ class PSelector {
     prime(input) {
         const root = input || document;
         if ( this.selector === '' ) { return [ root ]; }
-        return Array.from(root.querySelectorAll(this.selector));
+        let selector = this.selector;
+        if ( input !== document && /^ [>+~]/.test(this.selector) ) {
+            return Array.from(PSelectorSpathTask.qsa(input, this.selector));
+        }
+        const elems = root.querySelectorAll(selector);
+        return Array.from(elems);
     }
     exec(input) {
         let nodes = this.prime(input);
@@ -453,7 +456,7 @@ class ProceduralFilterer {
             let style, styleToken;
             if ( selector.action === undefined ) {
                 style = vAPI.hideStyle;
-            } else if ( selector.action[0] === ':style' ) {
+            } else if ( selector.action[0] === 'style' ) {
                 style = selector.action[1];
             }
             if ( style !== undefined ) {
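In `prime()`, a selector that starts with a combinator (matched by `/^ [>+~]/`) is now routed to the new `PSelectorSpathTask.qsa()` helper. Presumably this is because `querySelectorAll()` rejects a selector that begins with a combinator, so the helper anchors the query at the parent and pins the input node by its position among its siblings. The sketch below reproduces only the position counting and the selector string; `scopedSelectorFor` is a hypothetical name, and plain objects stand in for DOM elements so that it runs without a browser.

```javascript
// Hypothetical helper mirroring the position count in qsa(). It only needs
// objects exposing `previousElementSibling`, so no DOM is required.
function scopedSelectorFor(node, selector) {
    let pos = 1;
    let prev = node.previousElementSibling;
    while ( prev !== null ) {
        pos += 1;
        prev = prev.previousElementSibling;
    }
    return `:scope > :nth-child(${pos})${selector}`;
}

// Three mock siblings standing in for DOM elements.
const first = { previousElementSibling: null };
const second = { previousElementSibling: first };
const third = { previousElementSibling: second };

console.log(scopedSelectorFor(third, ' + p')); // → ":scope > :nth-child(3) + p"
```

In the real helper, the resulting string is passed to `parent.querySelectorAll()`, so ` + p` is evaluated relative to the third child of the parent rather than to the whole document.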

src/js/cosmetic-filtering.js

+3 -3
@@ -429,7 +429,7 @@ FilterContainer.prototype.compileGenericHideSelector = function(
     // https://github.com/uBlockOrigin/uBlock-issues/issues/131
     // Support generic procedural filters as per advanced settings.
     // TODO: prevent double compilation.
-    if ( compiled !== raw ) {
+    if ( compiled.charCodeAt(0) === 0x7B /* '{' */ ) {
         if ( µb.hiddenSettings.allowGenericProceduralFilters === true ) {
             return this.compileSpecificSelector(parser, '', false, writer);
         }
@@ -830,12 +830,12 @@ FilterContainer.prototype.cssRuleFromProcedural = function(json) {
     let mq;
     if ( tasks !== undefined ) {
         if ( tasks.length > 1 ) { return; }
-        if ( tasks[0][0] !== ':matches-media' ) { return; }
+        if ( tasks[0][0] !== 'matches-media' ) { return; }
         mq = tasks[0][1];
     }
     let style;
     if ( Array.isArray(action) ) {
-        if ( action[0] !== ':style' ) { return; }
+        if ( action[0] !== 'style' ) { return; }
         style = action[1];
     }
     if ( mq === undefined && style === undefined ) { return; }
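The first hunk replaces the `compiled !== raw` comparison with a check on the first character of the compiled selector. A minimal sketch of that check follows, assuming (as the `cssRuleFromProcedural(json)` signature suggests, though the diff does not say so outright) that a compiled procedural filter is a JSON object string while a plain CSS selector is kept as-is. `looksProcedural` is a hypothetical name.

```javascript
// Hypothetical helper: true when a compiled selector starts with '{',
// which this sketch assumes marks a JSON-encoded procedural filter.
function looksProcedural(compiled) {
    return typeof compiled === 'string' &&
        compiled.charCodeAt(0) === 0x7B /* '{' */;
}

console.log(looksProcedural('.ad-banner'));
console.log(looksProcedural('{"selector":"div","tasks":[["has-text","ad"]]}'));
```

The `typeof` guard mirrors the one added to `addExtendedToDNR` in `static-dnr-filtering.js`, where a non-string `compiled` value is skipped before `startsWith('{')` is called.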

src/js/epicker-ui.js

+4 -1
@@ -834,7 +834,10 @@ const startPicker = function() {
     $id('candidateFilters').addEventListener('click', onCandidateClicked);
     $stor('#resultsetDepth input').addEventListener('input', onDepthChanged);
     $stor('#resultsetSpecificity input').addEventListener('input', onSpecificityChanged);
-    staticFilteringParser = new StaticFilteringParser({ interactive: true });
+    staticFilteringParser = new StaticFilteringParser({
+        interactive: true,
+        nativeCssHas: vAPI.webextFlavor.env.includes('native_css_has'),
+    });
 };

 /******************************************************************************/

src/js/messaging.js

+3 -1
@@ -1721,7 +1721,9 @@ const getURLFilteringData = function(details) {
 };

 const compileTemporaryException = function(filter) {
-    const parser = new StaticFilteringParser();
+    const parser = new StaticFilteringParser({
+        nativeCssHas: vAPI.webextFlavor.env.includes('native_css_has'),
+    });
     parser.analyze(filter);
     if ( parser.shouldDiscard() ) { return; }
     return staticExtFilteringEngine.compileTemporary(parser);

src/js/reverselookup.js

+3 -1
@@ -134,7 +134,9 @@ const fromNetFilter = async function(rawFilter) {
     if ( typeof rawFilter !== 'string' || rawFilter === '' ) { return; }

     const writer = new CompiledListWriter();
-    const parser = new StaticFilteringParser();
+    const parser = new StaticFilteringParser({
+        nativeCssHas: vAPI.webextFlavor.env.includes('native_css_has'),
+    });
     parser.setMaxTokenLength(staticNetFilteringEngine.MAX_TOKEN_LENGTH);
     parser.analyze(rawFilter);

src/js/static-dnr-filtering.js

+8 -8
@@ -97,6 +97,7 @@ function addExtendedToDNR(context, parser) {
         if ( bad ) { continue; }
         if ( hn.endsWith('.*') ) { continue; }
         const { compiled, exception } = parser.result;
+        if ( typeof compiled !== 'string' ) { continue; }
         if ( compiled.startsWith('{') ) { continue; }
         if ( exception ) { continue; }
         let details = context.cosmeticFilters.get(compiled);
@@ -126,14 +127,14 @@ function addExtendedToDNR(context, parser) {
 /******************************************************************************/

 function addToDNR(context, list) {
+    const env = context.env || [];
     const writer = new CompiledListWriter();
     const lineIter = new LineIterator(
-        StaticFilteringParser.utils.preparser.prune(
-            list.text,
-            context.env || []
-        )
+        StaticFilteringParser.utils.preparser.prune(list.text, env)
     );
-    const parser = new StaticFilteringParser();
+    const parser = new StaticFilteringParser({
+        nativeCssHas: env.includes('native_css_has'),
+    });
     const compiler = staticNetFilteringEngine.createCompiler(parser);

     writer.properties.set('name', list.name);
@@ -180,10 +181,9 @@ function addToDNR(context, list) {
 /******************************************************************************/

 async function dnrRulesetFromRawLists(lists, options = {}) {
-    const context = {};
+    const context = Object.assign({}, options);
     staticNetFilteringEngine.dnrFromCompiled('begin', context);
-    context.extensionPaths = new Map(options.extensionPaths || []);
-    context.env = options.env;
+    context.extensionPaths = new Map(context.extensionPaths || []);
     const toLoad = [];
     const toDNR = (context, list) => addToDNR(context, list);
     for ( const list of lists ) {
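The last hunk builds `context` as a shallow copy of `options` instead of copying fields one by one, so any option (including `env`) reaches `addToDNR` without an explicit assignment. A minimal sketch of that pattern follows; `makeContext` is a hypothetical stand-in for the first lines of `dnrRulesetFromRawLists`, and the call to `staticNetFilteringEngine.dnrFromCompiled('begin', context)` is omitted.

```javascript
// Hypothetical stand-in for the start of dnrRulesetFromRawLists().
function makeContext(options = {}) {
    // Shallow copy: every own property of `options` is carried over.
    const context = Object.assign({}, options);
    // Normalize extensionPaths to a Map, whether or not it was provided.
    context.extensionPaths = new Map(context.extensionPaths || []);
    return context;
}

// Illustrative values only; the extensionPaths entry is made up.
const options = {
    env: [ 'chromium', 'ubol', 'native_css_has' ],
    extensionPaths: [ [ 'example', '/example/path' ] ],
};
const context = makeContext(options);
const env = context.env || [];
console.log(env.includes('native_css_has'));
```

Because the copy is shallow, `context.env` is the same array object as `options.env`, while `options.extensionPaths` is left untouched and only the copy is replaced by a `Map`.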
