How can I use Tinderbox to generate a list of abbreviations?

Hello,

Is there an action code that can generate a list of abbreviations, for example, in academic writing? That is an action code that would pull up all the abbreviations appearing in an article and place them as a list in one note.

Thank you

The way I usually do this (in any document) is to use a regex search to pull out occurrences of two or more consecutive capital letters (or three or more, depending on your discipline), e.g. [A-Z]{2,}
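For anyone who wants to test the idea outside Tinderbox first, here is a minimal Python sketch of that search. The helper name `find_capital_runs` and the sample sentence are made up for illustration, and Python's regex engine may not behave identically to the one Tinderbox uses.

```python
import re

def find_capital_runs(text, min_len=2):
    # Hypothetical helper: return every run of at least
    # min_len consecutive capital letters found in text.
    pattern = r"[A-Z]{%d,}" % min_len
    return re.findall(pattern, text)

# Made-up sample sentence, not taken from the thread.
sample = "The WHO and UNESCO published a report with the EU."

two_plus = find_capital_runs(sample)               # runs of 2 or more capitals
three_plus = find_capital_runs(sample, min_len=3)  # runs of 3 or more capitals
print(two_plus)
print(three_plus)
```

Raising the minimum length to three drops two-letter abbreviations such as EU, which is the trade-off mentioned above.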

Thanks, Rob. I don’t have much experience with coding. Are you able to share an example I can modify? I have gleaned some knowledge on regex from watching Michael Becker’s and the meetup videos, so I should be able to modify the code you will share to accomplish my goal.

Hi Stephen,

It depends on how your abbreviations are built. For all abbreviations that start with a capital letter and contain at least one more capital letter anywhere in the word, the regex is:

\b(?:[A-Z][a-z]*){2,}
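As a rough check of what this pattern does and does not catch, here is a small Python sketch. The sample sentence is invented, and Python's `re` module is only a stand-in for whatever regex engine Tinderbox uses, so treat the result as indicative.

```python
import re

# The pattern from the post: a word boundary, then two or more groups
# that each consist of one capital followed by any number of lowercase letters.
PATTERN = r"\b(?:[A-Z][a-z]*){2,}"

# Made-up sample sentence for illustration.
sample = "She holds a PhD and studies mRNA and DNA at NATO."

matches = re.findall(PATTERN, sample)
print(matches)
```

In this sketch "She" is skipped because it has only one capital, and "mRNA" is skipped because the word boundary forces the match to begin at the lowercase "m".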

I think I’ve found a way to do this – I’m sure there will be better ways!

The problem is in two parts:

  1. Identify abbreviations within notes, which is easy.

  2. Pull the abbreviations out of those notes into the text of another note, which is harder…

Step 1

Assume we’ve got 3 notes inside the container “Texts”, 2 of which have abbreviations and 1 doesn’t.

We can identify which notes have abbreviations with an agent (here called ‘Texts with abbreviations’), whose $Query is:

inside(Texts) & $Text.contains("[A-Z]{3,}")

The contains clause simply searches for a sequence of three or more ({3,}) capital letters ([A-Z]). This will find Text A and B, which both have abbreviations. (See @webline’s post for a better pattern.)
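The filtering the agent does can be imitated in Python as a sanity check of the pattern. This is only a sketch: the dictionary stands in for the “Texts” container, and the note contents are invented rather than taken from the test file.

```python
import re

# Stand-in for the "Texts" container: note name -> note text.
# The contents are invented for illustration.
notes = {
    "Text A": "The WHO issued new guidance.",
    "Text B": "NASA launched a probe.",
    "Text C": "Nothing to see here.",
}

# Keep the notes whose text contains three or more consecutive capitals.
with_abbreviations = [
    name for name, text in notes.items()
    if re.search(r"[A-Z]{3,}", text)
]
print(with_abbreviations)
```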

That’s the easy part… (Note, we don’t actually have to use the Agent to get the final result, but I’ve included it to show more easily how the regular expression works…)

Step 2: extracting the abbreviations

To pull the abbreviations out takes a bit more work.

First, I created a dedicated set attribute $Abbreviations. It’s a set attribute, rather than a list, because I’m assuming you don’t want duplicates.

Secondly, we need to go through all the words in each note’s text. (I’ve done this in the note’s Rule, but you could equally use a Stamp).

The easiest way to do this, AFAIK, is to feed the text of the note into a set attribute one word at a time.

$MySet = $Text.split("\W+"); 

(.split breaks the text at every run of non-word characters (\W+) and places the resulting words in $MySet; because $MySet is a set, duplicates are removed.)

Next we want to loop through $MySet and compare each word to our abbreviation pattern ([A-Z]{3,}). If it is a match, we add it to the note’s $Abbreviations set attribute.

$MySet.each(x){
    if x.contains("[A-Z]{3,}") 
       {$Abbreviations = $Abbreviations + x}
}
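If it helps to see the split-then-filter logic outside Tinderbox, here is a minimal Python sketch of the same two steps. The sample text is invented, and a Python `set` is used as an approximation of a Tinderbox set attribute.

```python
import re

# Invented sample text; "WHO" appears twice on purpose.
text = "The WHO and the IMF met; the WHO reported."

# Step 1: split at runs of non-word characters and keep unique words.
# The split can leave an empty string at the end, which is harmless here.
words = set(re.split(r"\W+", text))

# Step 2: keep only the words that contain three or more consecutive capitals.
abbreviations = {w for w in words if re.search(r"[A-Z]{3,}", w)}
print(sorted(abbreviations))
```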

You combine the two steps into a single rule, which you can see in the screenshot below.

Finally, we collect all the $Abbreviations sets for each of the notes in ‘Texts’ and place them one to a line in the $Text of ‘List of Abbreviations’. To do this, we give ‘List of Abbreviations’ the following Rule:

$Text = collect(children(Texts),$Abbreviations).sort.replace(";","\n")

(replace(";","\n") simply turns a set/list separated by ‘;’ into a vertical list.)
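The collect, sort and join steps look roughly like this in Python. It is a sketch with invented per-note values, and it is not a statement about how Tinderbox implements collect internally.

```python
# Invented per-note abbreviation lists, one list per child note.
per_note = [
    ["WHO", "IMF"],
    ["NASA", "WHO"],
    [],
]

# Gather every value from every note into one flat list.
collected = [abbr for note in per_note for abbr in note]

# Sort, then put one abbreviation on each line.
text = "\n".join(sorted(collected))
print(text)
```

In this sketch a value that occurs in two notes appears twice in the output, because the per-note values are simply concatenated.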

That should be it: you’ve now got all the abbreviations into the text of a single note. Obviously, there are lots of possibilities (and I’m sure the more experienced will be able to point out better ways), but I think this shows the general principle. E.g., you may want to have a more sophisticated regular expression match as @webline suggests above.

Here’s the test file for you to play with, if that helps.
Abbreviation Test.tbx (102.7 KB)

HTH!

David.

You might want to consider using simple text expansion software to help you achieve this.

Even OSX supports them via Keyboard → Shortcuts.

Hello,

Thank you. I have used your regex expression to modify David’s code as follows:

$MySet = $Text.split("\W+");
$MySet.each(x){ if x.contains("(?:[A-Z][a-z]*){2,}"){$Abbreviations = $Abbreviations + x} }

It worked.
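One thing worth knowing about this combination: when the pattern is tested against each word without the leading \b, it can match in the middle of a word. A small Python sketch shows the effect, assuming .contains behaves like a regex search anywhere in the word; the word list is invented.

```python
import re

# The pattern as used in the rule above (no leading \b).
PATTERN = r"(?:[A-Z][a-z]*){2,}"

# Invented sample words.
words = ["PhD", "mRNA", "Hello", "McDonald", "NATO"]

# Keep each word in which the pattern matches anywhere.
kept = [w for w in words if re.search(PATTERN, w)]
print(kept)
```

Here "mRNA" is kept because "RNA" matches inside it, and a name such as "McDonald" is kept as well, so some non-abbreviations are to be expected in the result.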

Hello.

Thank you, I have followed your second step with Detlef’s regex code:

$MySet = $Text.split("\W+"); $MySet.each(x){ if x.contains("(?:[A-Z][a-z]*){2,}"){$Abbreviations = $Abbreviations + x} }

And your third step with my note’s name (MyOutput) inserted.

$Text = collect(children(MyOutput),$Abbreviations).sort.replace(";","\n")

It is a great start. It pulled out the list below which I can then go through, remove duplicates and non-abbreviations, and define for my list of abbreviations.

Pic of part:

There may be a way to refine the output but it is a great start.

For the first part, your agent query uses the name Text, which happens to be the first word in the names of all your notes. All my child notes have different names. How can I modify it to run a similar query if I need to run such an agent for this or other assignments?

Thank you.

I’m pleased it worked! I wasn’t sure how it could be done at first, and I had fun finding out…

Just one refinement, which will stop you having to look for duplicates manually. If you change this:

$Text = collect(children(MyOutput),$Abbreviations).sort.replace(";","\n")

to this:

$Text = collect(children(MyOutput),$Abbreviations).unique.sort.replace(";","\n")

(i.e. add .unique immediately before .sort) then the duplicates should be removed from the final list. (I should have put this in the original, sorry…)
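The effect of de-duplicating before sorting can be sketched in Python as follows. The list is invented, and `set` is used here as a rough equivalent of .unique.

```python
# Invented collected list with a repeated value.
collected = ["WHO", "IMF", "NASA", "WHO"]

# Remove duplicates first, then sort and put one item on each line.
deduplicated = "\n".join(sorted(set(collected)))
print(deduplicated)
```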

Also, when you post code on this forum, enclose the code in single backticks (e.g. `code`) for inline code, and in triple backticks for lines on their own:

```

Your code goes here

```

HtH!

Hi,

I use TextExpander for text expansion, and I am unsure how it could generate a list of acronyms or abbreviations. Do you mean using it to launch a script? Kindly share an example.

The agent query is actually

inside("Texts") & $Text.contains...

This means "look for any note that is inside the container “Texts” AND whose $Text value contains our regular expression". It doesn’t actually look at the names of the notes at all, so it shouldn’t affect how you name your individual notes.

So, if the container was called “Reports”, the query would be inside("Reports"), and so on.

Of course, you can replace inside("Texts") with a very wide range of other conditions, for example

  • $Prototype == "pDocument" – matches any document with that prototype
  • $Tag.contains("Draft") – any document with a tag name ‘Draft’ etc.

You use this sort of condition to narrow down the number of results, so you’re not searching inside every note in the file. (Note that you must use == not = when you’re testing the value of an attribute; a single = assigns a value.)

The ‘Getting Started with Tinderbox’ and ‘Actions and Dashboards’ PDFs (on the Help menu) have some worked examples if you’d like an introduction to agent queries, and there’s some more detailed help in the Help file.

Hope this helps!

Yes, that removed the duplicates

Thanks, too, for the instruction on how to share code on the forum.

Thanks. I have understood, tried it with the correct parent note name, and it worked.

BTW you could leverage code from my note creation video: Tinderbox Training Video 62- Dynamically create notes from attributes with functions.

You can have the abbreviations note create unique notes for each acronym, which you can then define.

your information is very interesting and good question but i am not idea.