Tinderbox Meetup Video- Saturday, July 22, 2023: A Review of Bookends--Citation and Reference Management--with Jonathan Ashwell

Thanks, I’ll add that to my list of ‘cleanliness’ checks. Much appreciated. The main source of ‘bad’ BibTeX for me is IEEE. I say ‘bad’, but not in a right/wrong sense, as BibTeX unhelpfully defines two differing formats for the same field with no flag to signal which is being used.

If only someone had the common sense, given the plain-text nature of the format, to do something like allow appending a character such as a # to the end of the data to signal the name style being used (‘last-comma-first’ vs. ‘first last’). Still, like date-month vs. month-day, it’s a mess due to parochialism beating common sense.

@MWA could you help me read this RegEx, what is it doing?


authors means test the Bookends database field called ‘authors’.

REGEX says to use a regex, whose pattern then follows enclosed in single quotes. So the pattern is:

(?m)^(?:(?!,).)*$

The opening (?...) parentheses set a mode; here m is for multi-line matching (N.B. these mode letter codes aren’t covered in the Bookends manual). Setting that aside leaves a more recognisable pattern:

^(?:(?!,).)*$

It matches from explicit start ^ to explicit end $. In multi-line mode these anchor to the start and end of each line rather than of the whole field.

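The (?:regex) construct is a non-capturing group: the parentheses group part of the regex so that regex operators can be applied to it, but nothing is captured. Nested inside is another non-capturing test, this time a negative lookahead for a comma, (?!,), i.e. if the next character is a comma, don’t match. This is followed by a period, which matches a single character of any value. The trailing * repeats the group, so every character on the line is checked in turn and the line matches only if it contains no comma at all.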

The result is like so:

Here, our test string (bottom left) has three names in the wrong format and one in the correct format. Note how the 3 ‘bad’ ones are detected (coloured highlight in the Patterns app UI’s test)…
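For anyone without a regex tester to hand, the same check can be tried in a few lines of Python. This is only a sketch: it assumes Python's `re` engine treats this pattern the same way as the engine Bookends uses, and the sample names and the helper `names_without_comma` are made up for illustration.

```python
import re

# The pattern from the post: (?m) turns on multi-line mode, so ^ and $
# anchor to each line; (?:(?!,).)* accepts any run of non-comma characters.
PATTERN = re.compile(r"(?m)^(?:(?!,).)*$")


def names_without_comma(authors_field):
    """Return the lines of an 'authors' field that contain no comma.

    Hypothetical helper for illustration only. Blank lines also satisfy
    the pattern (zero characters, no comma), so they are filtered out here.
    """
    return [m.group(0) for m in PATTERN.finditer(authors_field) if m.group(0)]


# Made-up sample data: one author per line, as in a Bookends 'authors' field.
sample = "Doe, Jane\nJohn Smith\nRoe, Richard\nMary Major"
flagged = names_without_comma(sample)
print(flagged)
```

With this sample, only the two 'first last' lines are flagged; the 'last, first' lines contain a comma and so never match.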

In Bookends, using a Smart (SQL) group, the regex forms the SQL statement such that any record where a ‘bad’ name is found is returned by the SQL query. Putting this into my Bookends database immediately found 11 records with author names needing attention. These are otherwise hard to spot among thousands of records.
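To see how a regex test behaves as a SQL filter, here is a rough stand-in using Python's built-in `sqlite3` module. It is not Bookends' own SQL dialect: SQLite spells the operator `REGEXP` and needs a user-supplied function behind it, and the table name `refs`, its columns, and the sample records are all assumptions for the sake of the example.

```python
import re
import sqlite3

PATTERN = r"(?m)^(?:(?!,).)*$"


def regexp(pattern, value):
    """Back SQLite's `X REGEXP Y` operator, which calls regexp(Y, X)."""
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)

# Hypothetical table standing in for a reference database.
conn.execute("CREATE TABLE refs (id INTEGER PRIMARY KEY, authors TEXT)")
conn.executemany(
    "INSERT INTO refs (id, authors) VALUES (?, ?)",
    [
        (1, "Doe, Jane\nRoe, Richard"),  # every name is 'last, first'
        (2, "Doe, Jane\nJohn Smith"),    # second name lacks a comma
        (3, "Mary Major"),               # single name, no comma
    ],
)

# Return every record with at least one comma-less author line.
bad_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM refs WHERE authors REGEXP ? ORDER BY id", (PATTERN,)
    )
]
print(bad_ids)
```

One caveat when trying this on real data: an empty line in the field (for example, from a trailing line break) also satisfies the pattern, so such a record would be returned too.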

†. In Bookends, but not necessarily all Reference Managers, individual authors are entered one per line (i.e. a line break after each name) in the ‘authors’ field.

‡. Parentheses whose first contained character is a ? are non-capturing groups, i.e. they match but do not populate a back reference.
