Exclude terms from Taggers?

leifbrown · May 31, 2025, 1:02am

I’m using four notes to collect and present the terms highlighted by the inbuilt taggers, and finding the taggers to be a little more enthusiastic and inclusive than I’d prefer.

Is there any way to exclude terms from the taggers?

eastgate · May 31, 2025, 2:49pm

I don’t think there is.

Can you give us an example or two of excess tagger enthusiasm?

leifbrown · June 1, 2025, 4:20pm

My current working document is tagging (among others) “Alzheimer (technically correct, but not contextually relevant, and never used without the 's), Artifact, Balm, Boy, Cobble, Gates…”

There’s also a smattering of “names” like “John copy” and “Someone truly”.

My capture notes are often incomplete sentences with the first letter capitalized, and that may be a factor. NLNames seems to be capturing ~2000 words as possible names.

NLOrganizations has given me my new favorite band name: the “Constant doppelgangers”, plus “GPS, Gravity, Hearts, Idea of unification…”

This isn’t impacting performance, but I’ve been experimenting with using tagger functionality to forward new tags that are worth tracking. I can reasonably scroll through a few dozen tags, but a few thousand gets prohibitive.

eastgate · June 1, 2025, 9:52pm

Ah. There’s no way to exclude things from NLNames and NLOrganizations. Both use an Apple-trained neural net which was superb for (say) 2020 but sadly out-of-date today.

I expect this will improve shortly; we’re sure to hear something about that at WWDC next week.

In the interim, you could define $ScreenedNames and $ScreenedOrganizations, and set these using a rule or edict to remove unwanted hits. For example:

var:list exclude="[Alzheimer;John copy;Yours truly;]";
$ScreenedNames=$NLNames-exclude;

In a large document, you might want to keep the exclusion list in a configuration note to make it easier to add elements.
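
A minimal sketch of that arrangement, assuming a configuration note named "Tagger Exclusions" (a hypothetical name) whose $MyList holds the unwanted terms. The rule or edict on each screened note then reads the list from that note instead of hard-coding it:

```
var:list exclude=$MyList("Tagger Exclusions");
$ScreenedNames=$NLNames-exclude;
```

Adding a term to the configuration note's $MyList should then be enough to drop it from every note's $ScreenedNames the next time the rule or edict runs. The same pattern would apply to $NLOrganizations and $ScreenedOrganizations.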

leifbrown · June 2, 2025, 2:11am

I was already drafting that approach.

Thanks!

leifbrown · June 2, 2025, 6:45pm

Thinking this through, is anything likely to break if I just declare a tagger line like
`excluded_names:Name1;Name2;NameN;`

?

eastgate · June 2, 2025, 7:49pm

You could do that, but it’s probably better to remove the excluded names from $NLNames — that’s just list manipulation — than to scan all the text. But for most purposes, whatever is easiest is probably best.