Let me clarify what I meant by “expensive”.
Let’s suppose I have a lot of notes, and that each note has a lot of text.
First, we look for the word “libero”. We read through the text of note 1: nope, not there! We read through the text of note 2: not there, either! It does appear in note 3, and once we see it, we can skip reading the rest of note 3 and proceed to note 4. I think it’s clear that we’ll have to look at every note, and might need to look at all the text of every note if we never find “libero”.
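To make that concrete, here is a minimal Python sketch of the scan. It is an illustration only, not how Tinderbox actually searches; the `notes_containing` function and the sample notes are made up for the example.

```python
def notes_containing(notes, term):
    """Return the names of the notes whose text contains `term`.

    `notes` maps a note name to its text. This is a hypothetical
    stand-in for a document, not Tinderbox's real data model.
    """
    hits = []
    for name, text in notes.items():
        # `in` stops reading this note at the first match, but a note
        # without the term has to be read all the way to the end.
        if term in text:
            hits.append(name)
    return hits


notes = {
    "note 1": "lorem ipsum dolor sit amet",
    "note 2": "consectetur adipiscing elit",
    "note 3": "vestibulum libero fermentum nulla",
    "note 4": "sed do eiusmod tempor",
}

print(notes_containing(notes, "libero"))            # ['note 3']
print(notes_containing(notes, "libero fermentum"))  # ['note 3']
```

Every note gets visited either way; the only saving is skipping the tail of a note once the term turns up.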
Next, we look for the phrase “libero fermentum”. This is about the same amount of work as looking for “libero” alone.
If we do a lot of this, we find that looking for “zebra” is a little easier. Z’s are rare, so we can just scan for them while hardly paying attention. On the other hand, searching for “Grant said that he expected Lincoln to agree” might be a pain in the neck in our notes about 1864: “Grant” shows up a lot, which means we have to pay attention. “Grant said” is pretty common, too. So we’re doing a little more work.
Now, a pattern like “Grant NEAR Lincoln” is even harder, because every time we meet “Grant”, we must pay close attention. We scan the next n words, looking for “Lincoln”. If we reach the end of those n words without finding it, we need to rewind all the way back to the word immediately following “Grant” and try again. So, we spend time ping-ponging back and forth through the text. That can get expensive.
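Here is a rough Python sketch of that look-ahead-and-rewind pattern. It is only an illustration under stated assumptions: the `near` function is hypothetical, it looks forward only, it splits words on whitespace, and it is not a description of how Tinderbox implements NEAR.

```python
def near(text, first, second, n):
    """Return True if `second` occurs within the n words after `first`.

    Assumptions for this sketch: words are separated by whitespace,
    matching is exact (so "Grant," does not match "Grant"), and only
    the words *after* `first` are examined.
    """
    words = text.split()
    for i, word in enumerate(words):
        if word == first:
            # Pay close attention: scan the next n words for `second`.
            if second in words[i + 1 : i + 1 + n]:
                return True
            # No luck. The loop resumes at the word right after `first`,
            # so the words just scanned get read again.
    return False


print(near("Grant wrote that Lincoln agreed", "Grant", "Lincoln", 3))    # True
print(near("Grant wrote that Lincoln agreed", "Grant", "Lincoln", 2))    # False
print(near("Grant said Grant met with Lincoln", "Grant", "Lincoln", 3))  # True
```

In the last example the first “Grant” fails, and the scan restarts from “said”, which is the ping-ponging described above. In the worst case each occurrence of the first word costs up to n extra comparisons.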
It’s expensive in another way, too, though I don’t know whether this second effect is observable or meaningful. At a very low level, modern processors anticipate coming instructions and automatically prepare for them. For example, if the processor knows it’s going to need the next character of the text shortly, it can ask the memory now to start sending that character so that it will be ready when needed. The more complex logic needed for the backup is less likely to be anticipated than the simpler search.
Now, as a matter of principle, I recommend that you ignore all this in Tinderbox and assume that whatever you want will be fast enough. Sometimes, maybe it won’t be fast enough, and then we can sit down and work it out.
In this case, I presume we’re using NEAR to disambiguate common terms: “strike NEAR batter” is likely baseball and “strike NEAR union” is likely labor. But this isn’t infallible, and it may be that linguistics (“Washington the place, not the name”) or a custom neural net will be better.