Tinderbox Forum

Finding a partial match in $Name

So, my adventures with regex continue. According to a little program called Expressions, this:

^(.{1,3}\s.)

Should find the element ⌥⇧⌘ L in ⌥⇧⌘ L Link to x (in $Name).

If I construct a query:

descendedFrom("Content")&$Name.icontains("^(.{1,3}\s.)")

I get about a quarter of the hits I should. If I eliminate \s I go from about 110 hits to 400.

I have also tried descendedFrom("Content")&$Name.icontains("^.{1,3}\s.") but that does not change the result.

Looking at the text in BBEdit suggests that there is indeed a space and not something else in the problem position.

Any ideas?

Before testing lots of things it isn’t (regex are very precise), could you post a small TBX test file so we can work against a common reference. I couldn’t recreate your result but without a common reference this can be hard to resolve.

Incidentally, I assume that here:

$Name.icontains("^(.{1,3}\s.)")

…the inner parentheses are because you want to create a back-reference? If you are trying to match literal parentheses, these must be escaped, e.g. \( and \). The above match means “from the start of the string, open a back-reference, match any character between 1 and 3 times (inclusively), match a single whitespace character, match any character once, close the back-reference”.
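As a rough cross-check of that reading, here is a small Python sketch. Python's re module is not the engine Tinderbox uses, so this only shows what the pattern means in a Unicode-aware engine. The helper name first_group and the sample string are assumptions for illustration.

```python
import re

# Pattern from the post: start of string, then a captured group of
# 1-3 characters, one whitespace character, and one more character.
PATTERN = re.compile(r"^(.{1,3}\s.)")

def first_group(name):
    """Return back-reference 1 if the pattern matches, else None."""
    m = PATTERN.search(name)
    return m.group(1) if m else None

# U+2325 (option), U+21E7 (shift), U+2318 (command), written as escapes
# so there is no doubt about which code points are being tested.
sample = "\u2325\u21e7\u2318 L Link to x"
print(first_group(sample))
```

In this engine the back-reference holds the three modifier symbols, the space and the letter L, which is what the Expressions app reported.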

I note we are using symbol characters in the match, which might be a factor, but again in cases like this we need to ensure we’re all testing exactly the same thing and not what we each assume is meant by the text examples above.

Although I can’t find it documented, I’ve a vague recollection that when using back-references the pattern must match the whole string. IOW:

$Name.icontains("^(.{1,3}\s.).*")

…which extends the above logic so that, after closing the back-reference, it matches zero or more characters (i.e. to the end of the string).

There’s no requirement for the pattern to match the whole string. Anchoring it with the initial ^ does improve performance in longer texts, though, because, in the example above, we could look at the $Name

ESTHER: Now it came to pass in the days of Ahasuerus, (this is Ahasuerus which reigned from India even unto Ethiopia, over an hundred and seven and twenty provinces:) that in those days…

and reject it immediately because it does not have whitespace \s where the pattern expects it. But if we did not have the initial ^, we’d have to scan through the whole Megillah.
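The difference the initial ^ makes can be sketched in Python. This is an illustration of the matching behaviour only; whether a particular engine actually stops early is an implementation detail, and the shortened sample string is an assumption.

```python
import re

ANCHORED = re.compile(r"^(.{1,3}\s.)")
UNANCHORED = re.compile(r"(.{1,3}\s.)")

# A shortened stand-in for the long $Name quoted above.
long_name = "ESTHER: Now it came to pass"

# Anchored: only position 0 is a candidate, and it fails there
# because the fourth character is not whitespace.
anchored_hit = ANCHORED.search(long_name)

# Unanchored: the engine keeps trying later start positions
# until one of them works.
unanchored_hit = UNANCHORED.search(long_name)

print(anchored_hit)
print(unanchored_hit.group(1), unanchored_hit.start())
```

Here the anchored search finds nothing, while the unanchored one slides along and matches "ER: N" starting at offset 4, which is almost certainly not what the query intended.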

Thanks for looking at this. I’m a little puzzled.

Test_regex.tbx (99.7 KB)

I also took a couple of screenshots.

Removing the white space makes a difference to the result that I didn’t expect.

Cheers!
M.

To extract N different accelerator keys, a space and a letter or number, I used this:

^[^\s]{1,4}\s\w

…otherwise the ‘⌘ L’ item is detected but not for the right reasons. This works in Expressions and in BBEdit grep. No idea why it’s not working the same in Tinderbox.
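To show what “detected but not for the right reasons” looks like, here is a Python comparison of the two patterns on a single-modifier name. Python's engine stands in for Expressions/BBEdit here and is not Tinderbox's; the constant names are assumptions.

```python
import re

LOOSE = re.compile(r"^(.{1,3}\s.)")      # original pattern
STRICT = re.compile(r"^[^\s]{1,4}\s\w")  # revised pattern

# "⌘ L Link to x" with the command symbol written as an escape.
name = "\u2318 L Link to x"

loose_match = LOOSE.search(name).group(0)
strict_match = STRICT.search(name).group(0)

print(repr(loose_match))
print(repr(strict_match))
```

The loose pattern lets `.` swallow the first space, so it matches "⌘ L L" (symbol, space, letter, then the second space and the L of "Link"). The strict pattern cannot cross whitespace in its first part, so it stops at "⌘ L".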

Thank you! Remarkable sleuthing. I would never have arrived at that.

Onwards! :wink:

Hmm! Unfortunately, it doesn’t seem to work here.

Not to worry. This is probably something to be done in another app, like Nisus, then put in a spreadsheet. I’ll look at some options. But thanks for trying.

I think it’s a Unicode/symbol related thing. Using the same query as your last example I added some tests without symbols and they work as expected:

This doesn’t therefore look like a simple user input error, or the new examples should fail. I think it probably needs looking at under the hood by support. I’ve sent an email.

Small correction - not that it makes things work:

^[⌃⇧⌥⌘]{1,4}\s\w

The first character is the caret (Shift+6) and is the regex start of string character. The second instance is a different, symbol character Unicode #2303, the “UP ARROWHEAD”.
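The two look-alike characters can be told apart by their code points. A minimal Python sketch, using only the standard unicodedata module (the helper name describe is an assumption):

```python
import re
import unicodedata

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

caret = "^"           # the regex start-of-string anchor
arrowhead = "\u2303"  # the Control-key symbol

print(describe(caret))
print(describe(arrowhead))

# The corrected pattern, with every symbol written as an escape.
MODIFIERS = re.compile("^[\u2303\u21e7\u2325\u2318]{1,4}\\s\\w")
print(MODIFIERS.search("\u2303\u2318 Q Quit").group(0))
```

Because the first character inside the square brackets is U+2303 and not a real caret, the character class is not negated; it simply lists the four modifier symbols.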

I’d not try to fix this for now until we have a steer as to how Tinderbox’s regex engine is handling these Unicode symbol characters.
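Nothing in this thread confirms how Tinderbox handles these characters, so the following is only an illustration of one way a regex engine can treat symbols differently from plain letters, not a diagnosis. The Python sketch compares matching on characters with matching on raw UTF-8 bytes, where each of these modifier symbols occupies three bytes. The function names are assumptions.

```python
import re

PATTERN = r"^(.{1,3}\s.)"

def matches_as_text(name):
    """Match against Unicode characters: each symbol counts as one '.'."""
    return re.search(PATTERN, name) is not None

def matches_as_bytes(name):
    """Match against UTF-8 bytes: each symbol counts as three '.'s."""
    return re.search(PATTERN.encode("ascii"), name.encode("utf-8")) is not None

one_symbol = "\u2318 L Link to x"                # ⌘ L ...
three_symbols = "\u2325\u21e7\u2318 L Link to x"  # ⌥⇧⌘ L ...

for name in (one_symbol, three_symbols):
    print(matches_as_text(name), matches_as_bytes(name))
```

In the byte-oriented case only the single-symbol name matches, because `.{1,3}` is used up by the first symbol alone. If Tinderbox behaved like this it would produce fewer hits than expected, but that is speculation until support reports back.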

Many thanks. I usually assume I must have done something wrong – which is usually true – so it would be a novelty for it not to be my fault :slight_smile:

Cheers!

If it is under the hood, a fix might take a bit, so I’ve just tested a workaround that seems to work. First, put square brackets around your shortcut. So ⌥⌘ N Menu text becomes [⌥⌘ N] Menu text, etc. Now use the query:

descendedFrom("Content")&$Name.icontains("^\[.+\]")

This gets round the undiagnosed problem of matching symbol characters (higher order Unicode codes). Instead, we match: from string start, a single left [, one or more of any character, until a single right ].

Thus from “[⌥⌘ N] Menu text”, the agent matches “[⌥⌘ N]”. If you want, you could put the closing ] before the first space, but the letter is also part of the shortcut. So if we use:

descendedFrom("Content")&$Name.icontains("^\[(.+)\]")

We get a back-reference $1 which is the contents of the matched square brackets. Thus from “[⌥⌘ N] Menu text”, the agent matches “[⌥⌘ N]” and the back-reference is “⌥⌘ N”.
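The same bracket pattern can be tried outside Tinderbox. This Python sketch mirrors the query's regex; the function name shortcut_in_brackets is an assumption, and Python's engine may differ from Tinderbox's in edge cases.

```python
import re

BRACKETED = re.compile(r"^\[(.+)\]")

def shortcut_in_brackets(name):
    """Return (whole match, back-reference 1), or None if no match."""
    m = BRACKETED.search(name)
    return (m.group(0), m.group(1)) if m else None

# "[⌥⌘ N] Menu text" with the symbols written as escapes.
print(shortcut_in_brackets("[\u2325\u2318 N] Menu text"))

# Greedy .+ runs to the LAST closing bracket in the name.
print(shortcut_in_brackets("[\u2325\u2318 N] Menu [beta]"))
```

One thing to watch: `.+` is greedy, so if the menu text itself contains a `]`, the match extends to that later bracket. A narrower class such as `[^\]]+` in place of `.+` would stop at the first closing bracket; that variant is my suggestion and was not tested in Tinderbox.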

In this example, the agent sets $Text to the back-reference: Test_regex.tbx (99.5 KB)

Slightly different from the starting question, but methinks close enough. There’s so often a different path to, or a close solution for, a problem.

Thank you again for the continued efforts to find a solution! I think I will be playing with a copy of the file in BBEdit or Nisus, as I don’t fancy changing four hundred instances of modifier keys plus character by hand. (That number illustrates to me why shortcuts have become a problem rather than a solution - that is only about three programs I’ve documented thus far.) I’ve already been considering how I might split up the data before putting it into Tinderbox, which might be another way of achieving what I was trying to do (documenting shortcut conflicts, among other things).

Why not use a regex in Tinderbox? I made this stamp:

$Name = "[" + $Name.replace("^([^ ]+ \w)(.*)$",$1+"]"+$2);

It works fine on your test strings, such that “⇧⌥⌘ F Menu text” becomes “[⇧⌥⌘ F] Menu text”.

Regex ASSUMPTION: the names of notes to be processed take the form of one or more control characters followed by a space followed by a single letter or number followed by a single space and then the rest of the title, e.g. “⌘ L Menu text” or “⇧⌥⌘ F Menu text”.

Regex’s logic: from string start, open back-reference #1, match one or more non-space characters, a space, a single word character, close back-reference #1, open back-reference #2, match to end of string and close back-reference #2.

A “word character” (\w) is generally held to mean a letter or number, i.e. A–Z, a–z, 0–9 (most regex engines also count the underscore).

I used a stamp for control. If you used an agent action you’d need to add extra code to stop the action occurring a second time on already corrected notes. Either include a query term that ignores notes whose $Name starts with a “[”, such as:

$Name.beginsWith("[")==false

Or put a conditional block in the agent action:

if($Name.beginsWith("[")==false){
   $Name = "[" + $Name.replace("^([^ ]+ \w)(.*)$",$1+"]"+$2);
}
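For anyone who wants to try the rename on a list of names outside Tinderbox first, here is a Python sketch of the same logic, including the guard against processing a note twice. It is an analogue and not the stamp itself: the function name bracket_shortcut is an assumption, and unlike the stamp it leaves a name completely untouched when the pattern does not match.

```python
import re

# Same regex as the stamp: shortcut (non-spaces, space, one word
# character) in group 1, the rest of the name in group 2.
SHORTCUT = re.compile(r"^([^ ]+ \w)(.*)$")

def bracket_shortcut(name):
    """Wrap the leading shortcut in [ ], skipping already-fixed names."""
    if name.startswith("["):
        return name  # already bracketed; do nothing
    return SHORTCUT.sub(r"[\1]\2", name, count=1)

once = bracket_shortcut("\u21e7\u2325\u2318 F Menu text")
twice = bracket_shortcut(once)
print(once)
print(twice)
```

Running it a second time on its own output changes nothing, which is the property the query term or the `if` block provides for an agent.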

Thanks, Mark. I will have to go away and see if I can understand that.

I think at this point this is taking up too much of your valuable time. I will have to think about what I am trying to achieve at a strategic level, and consider whether the “tactics” are getting me closer to the goal, or just providing a distraction when I should be thinking about other ways of achieving what I want.

I think this may be a case of me trying to use Tinderbox in ways that are beyond my capabilities, and rather than digging myself deeper, it may be best to try something simpler.

But this has been highly educational, though I fear that one of the learning points is that I ought not to try to do this kind of thing! Thank you for all the suggestions and insight.

Hi there–I briefly skimmed through this thread and thought of an approach that may be useful or may be nothing like what you want, so feel free to ignore this if not helpful!

What if you made an attribute for each key modifier and one for “other”? Make the key modifier attributes boolean and the “myOther” a string.

  • myOpt
  • myCmd
  • myCtrl
  • myShift
  • myOther (e.g., space, alphabetic, etc.)

You could then add these to the DisplayName.

Then, when searching, you could have some flexibility with the booleans and probably get out of regex.
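To make the idea concrete, here is a rough Python sketch of that data shape: four booleans plus a string, filled in by scanning the shortcut text. The field names follow the attribute list above without the “my” prefix; the class name, the parser and the symbol-to-field mapping are assumptions for illustration, not anything Tinderbox provides.

```python
from dataclasses import dataclass

# Modifier symbol -> field name (an assumed mapping).
MODIFIERS = {
    "\u2325": "opt",    # option
    "\u2318": "cmd",    # command
    "\u2303": "ctrl",   # control
    "\u21e7": "shift",  # shift
}

@dataclass(frozen=True)
class Shortcut:
    opt: bool = False
    cmd: bool = False
    ctrl: bool = False
    shift: bool = False
    other: str = ""

def parse_shortcut(text):
    """Split e.g. '⌥⇧⌘ L' into modifier flags plus the remaining key."""
    flags = {}
    rest = []
    for ch in text:
        if ch in MODIFIERS:
            flags[MODIFIERS[ch]] = True
        elif not ch.isspace():
            rest.append(ch)
    return Shortcut(other="".join(rest), **flags)

parsed = parse_shortcut("\u2325\u21e7\u2318 L")
print(parsed)
```

Once shortcuts are stored this way, two entries conflict when their parsed values compare equal, and a search such as “everything using Cmd and Shift” becomes a test on two booleans with no regex involved.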

Yes, I have been thinking along exactly those lines. The problem I have been floundering to solve is how to extract the information without doing it by hand. I have pdfs which list all the shortcuts in a program, but I haven’t yet worked out how to split it up to make it more useable. I’ve been toying with the idea of a macro in Nisus Writer Pro, but once again, that is not something I know much about, and I doubt I could make one that would parse the text correctly.

I really wish software companies would publish tables of their shortcuts and menu items. MS Word used to have one (Word 5, about 25 years ago), which was useful. DEVONthink has somewhere around 150 shortcuts or menu items, which is beyond my memory. Particularly when the same shortcuts do different things in different programs. But perhaps trying to document all this is a fool’s errand. Hence the need to go away and think about whether this is worth it.

Ha! I know the feeling. I basically have gotten to the point where my main keyboard shortcut is CMD-SHIFT-? to get the menu search bar fired up and just go from there.

You are most kind. It’s only a bother servicing the importunate, one of which you are definitely not. I simply read your “…as I don’t fancy changing four hundred instances of modifier keys plus character by hand” and thought “I’m not sure it needs to be done manually”. The regex didn’t take that long. That’s not boastful: it gets easier with practice and things either tend to work quickly (as here) or take forever (thankfully not, this time). Lucky us. :slight_smile:

I think you are right to re-assess at this point. If you see the current path beginning to throw up lots of fiddly work it’s a good prod that there is perhaps a better way forward to the same aim (I see there’s already been one such suggestion).

Separately, there are two intersecting problems here:

  • what shortcuts each program wants to use.
  • what shortcuts each program can use.

Broadly, the first app open since boot (and still open) to use the shortcut wins out. A later app’s use of the same shortcut won’t work, but fails fairly silently. So far, so good. Figuring out which app currently ‘holds’ a shortcut is seemingly an unmet need (actually a quite common occurrence in software once you go off the marked trails). ShortcutDetective seems to be the most approachable tool out there for triaging this issue.

Lest we users feel the app dev should inform us if a shortcut is already in use with another app, I think the problem is they can’t, as the app can’t reliably find that out; it scans for the shortcut input but nothing arrives (OK, you can tell I’m not a programmer, doubtless I’ll be corrected). So, I guess that’s a macOS feature request for Apple?

As you had given me so much help, I thought you might like to know that it was not in vain. Because I am obsessive, I could not give up, and I managed to make some progress, making use of the booleans mentioned above. There is still a lot to do, but at least I have a decent starting point for documenting this stuff. I attach the file in case it might be of use, though it is pretty rough still.

Computing_shortcuts.tbx (604.9 KB)

In fact, I managed to crash the program, as somehow an illegal character managed to worm its way in. Kudos to the other Mark for sorting that out in a matter of minutes.

Thank you to all.
