Annotated co-occurrence example code


(Mark Anderson) #1

In a thread on value co-occurrence, I posted a demo (see post) showing a proof of concept. There was a question as to how the complex-looking action code worked. What follows is a cleaner version of the code in the example in my earlier file.

Edit:

The original solution used a rule, but using a CPU-intensive rule to set the $Text of the current item (which was likely itself an agent) seemed to cause a few problems. A better solution is to pass the output of the action to a separate note. This also has the advantage that the action runs only once (though the stamp can be manually re-applied as needed).

Assumptions (used in making the code):

  • All attributes starting ‘$My’ simply indicate the attribute data type to use. Where more than one of that type is needed, I suffix a number (e.g. $MyList2). Feel free to use other attribute names as you see fit in your own use.
  • The values to analyse are from a multi-value (Set or List type) attribute, here called $OpenCode.
  • This code is a stamp to be applied to an agent which finds only the desired notes. Regardless of other scoping arguments, the query should include & $OpenCode.size > 1 as only notes where the attribute has 2 or more values will contribute co-occurring pairs for analysis.
  • The path to the note used to receive the result of the action is stored in a String-type attribute $TargetNote.
  • $TargetNote is set as a key attribute in the note/agent whose children are being analysed.
  • Where possible, Sandbox group system attributes such as $MyString are used. An extra List, $MyList2, and an extra Set, $MySet2, need to be added to the document.
  • All pairs are assumed to have no ordering, i.e. AB is also considered an instance of BA, so AB==BA.
  • The loop variable names X, Y and Z have no specific meaning. You can choose others if you prefer.
  • If you want to review or alter the code, I suggest doing so in a code note or a text editor before pasting into the Rule Inspector as the code can’t all fit in the visible box.
  • This code has not been tested with very large value sets nor with funky values. The presence of (regex) control characters, e.g. [ \ ^ $ . | ? * + ( ), in values may cause issues. I’d recommend constraining individual values to A-Z, a-z, 0-9, underscore and space to avoid any excitement.
  • The code checks only pairs of values and not triples, quartets, etc., of co-occurrence.
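
The caution above about regex control characters can be illustrated outside Tinderbox. The Python sketch below is only an analogy: it assumes a value is handed to a regular-expression matcher unescaped, and it does not reproduce Tinderbox's own parser. The helper names `contains_as_regex` and `contains_literal` are hypothetical.

```python
import re

def contains_as_regex(text, value):
    """Treat `value` as a regex pattern (analogy for an unescaped match)."""
    return re.search(value, text) is not None

def contains_literal(text, value):
    """Escape `value` first so every character is matched literally."""
    return re.search(re.escape(value), text) is not None

# A dot in a value acts as a wildcard, giving a false positive:
print(contains_as_regex("axb", "a.b"))   # True, although "a.b" is not in "axb"
print(contains_literal("axb", "a.b"))    # False

# An unbalanced parenthesis is not even a valid pattern:
try:
    contains_as_regex("x", "(")
except re.error as err:
    print("invalid pattern:", err)
```

Values limited to letters, digits, underscore and space avoid both kinds of surprise, which is the reason for the constraint suggested above.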

The process uses 2 stamps, both of which are applied to the target agent. The first simply resets the target note’s $Text (erasing it). This is not required, but doing so changes the (Outline view) note icon to show an empty note. When the main action is run, the icon changes to show it containing text, thus indicating the action has completed. This makes more sense in a real-world document where there may be a lot of data and other rule/agent processes running at the same time. The first stamp’s code:

$Text($TargetNote)=;

The main action is this:

$Text($TargetNote)=;
$MyString =;
$MyNumber =;
$MySet = collect(children,$OpenCode);
$MyList = $MySet.isort;
$MyList2 = $MyList;
$MySet2 = collect(children,$Path(original));
$MyList.each(X){
   $MyList2.each(Y){
      $MySet2.each(Z){
         if(X != Y & ($OpenCode(Z).contains(X) & $OpenCode(Z).contains(Y))){
            $MyNumber = $MyNumber + 1;
         };
      };
      if(X != Y & $MyNumber > 0){
         $MyString = $MyString + X + " + " + Y + ": " + $MyNumber + "\n";
      };
      $MyNumber =;
   };
   $MyList2 = $MyList2 - X;
};
$Text($TargetNote)=$MyString;

The resulting target note’s $Text looks like this:

blue + green: 10
blue + orange: 6

I’ll add another post annotating the big action to explain what’s going on (as this post is long enough!).

The demo is here: http://www.acrobatfaq.com/tbdemos/OpenCode_Example_with_co-occurrence.zip


(Mark Anderson) #2

This is an annotated version of the stamp listed above. Do not use the code below as shown (use the un-commented version in the post above). The lines starting ‘##’ are comments for the reader only; Tinderbox’s action parser will not understand them as comments.

## Initialise target note by deleting its text
$Text($TargetNote)=;
## Initialise variables (ensure no values carry over from previous loop)
$MyString =;
$MyNumber =;
## Collect all in-scope unique values (Set de-dupes a list)
$MySet = collect(children,$OpenCode);
## Sort list case-insensitively
$MyList = $MySet.isort;
## Clone the list to use as the listing for the 2nd pair item
$MyList2 = $MyList;
## Get the paths of the agent's child alias' originals
$MySet2 = collect(children,$Path(original));
## Loop through list 1, the first pair item, storing the loop value as X
$MyList.each(X){
   ## Loop through list 2, the second pair item, storing the loop value as Y
   $MyList2.each(Y){
      ## Loop through the set of paths, storing the loop value as Z
      $MySet2.each(Z){
         ## Ignore if X and Y are the same, i.e. not a discrete pair: otherwise, test $OpenCode of the note at path Z for X and Y
         ## Test the X!=Y first so the regex-based .contains() tests are run only when needed.
         if(X != Y & ($OpenCode(Z).contains(X) & $OpenCode(Z).contains(Y))){
            ## Found both X and Y? Add 1 to a counter attribute
            $MyNumber = $MyNumber + 1;
         };
      };
      ## after all paths looped for current X and Y, test if counter got incremented
      if(X != Y & $MyNumber > 0){
         ## if so, record the X and Y values and the count found
         $MyString = $MyString + X + " + " + Y + ": " + $MyNumber + "\n";
      };
      ## reset the inner counter ready for the next set of X/Y tests
      $MyNumber =;
   };
   ## as pair AB==BA, remove the used list 1 value from list 2 as we've checked that pair already
   $MyList2 = $MyList2 - X;
};
## All loops done, now pass the resulting text ($MyString) to the $Text of the agent's $TargetNote
$Text($TargetNote)=$MyString;
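
For readers who want to check the logic outside Tinderbox, here is a minimal Python sketch of the same pair-counting idea. It is not the stamp itself. The function name `co_occurrence` and the sample data are hypothetical, and each note's $OpenCode values are assumed to arrive as a plain Python set.

```python
from itertools import combinations

def co_occurrence(notes):
    """Count how many notes contain each unordered pair of values.

    `notes` is a list of sets, one set of values per note
    (standing in for each note's $OpenCode).
    """
    # Unique values across all notes, sorted case-insensitively
    # (mirrors collecting into a Set and then applying .isort).
    values = sorted({v for note in notes for v in note}, key=str.lower)
    lines = []
    # combinations() yields each unordered pair once, so AB is never
    # re-tested as BA (the job done by shrinking $MyList2 in the stamp).
    for x, y in combinations(values, 2):
        count = sum(1 for note in notes if x in note and y in note)
        if count > 0:
            lines.append(f"{x} + {y}: {count}")
    return "\n".join(lines)

sample = [
    {"blue", "green"},
    {"blue", "green", "orange"},
    {"orange"},
    {"blue", "orange"},
]
print(co_occurrence(sample))
```

With this sample data the output lists blue + green (2), blue + orange (2) and green + orange (1), in the same "X + Y: n" layout the stamp writes to the target note.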