Reducing load of agents

Steve_Scott · December 7, 2016, 1:00am

I’m trying to remember the technique for reducing the load of agents. Could someone remind me how to create an agent that searches within the results of another agent?

I have a prototype that is finding all of the vocabulary for the course. Now I want to create an agent that will search within that found set for all Hindu vocabulary.

I know that I can create a new agent:

$Religion="Hinduism"

However, I remember reading that it is more efficient to search within existing agents when possible.

mwra · December 7, 2016, 10:16am

OK, so you already have the agent ‘Vocabulary’, which contains aliases to a subset of notes in the document, i.e. those matching the agent’s query. If you know all the ‘Hinduism’ notes you now seek are also matches to the Vocabulary agent, you can do a more efficient search like so:

inside("Vocabulary") & $Religion=="Hinduism"

Note: use ‘==’ in agent queries for ‘is equal to’, rather than ‘=’. The latter works for legacy support reasons but is not recommended for new code.

The second part of the query now only interrogates the subset of notes found in ‘Vocabulary’.

In simple terms, imagine the doc has 1000 notes and ‘Vocabulary’ finds 50 of those, plus the doc contains 10 ‘Hinduism’ notes. If you ran just the last part of the above query, Tinderbox would have to check all 1000 notes to find the 10 matches, but as written it only has to check 50. Scale up the numbers and the effort saved is clearer. This becomes more pertinent if doing a regex-based test, for instance if you were testing $Text.contains("Hinduism").
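The arithmetic above can be sketched in Python. This is only a toy model of the numbers in the post, not how Tinderbox works internally; the note dictionaries and the `count_religion_checks` helper are hypothetical names made up for the illustration, and it assumes (as the post does) that every ‘Hinduism’ note is also a Vocabulary match.

```python
# Toy model: 1000 notes, 50 matched by 'Vocabulary', 10 of those are 'Hinduism'.
# Hypothetical data layout; Tinderbox itself does not store notes like this.
notes = [
    {
        "name": f"note{i}",
        "is_vocab": i < 50,                         # found by the Vocabulary agent
        "religion": "Hinduism" if i < 10 else "",   # the 10 notes being sought
    }
    for i in range(1000)
]


def count_religion_checks(candidates, religion):
    """Test $Religion on each candidate; return (matching names, checks made)."""
    checks = 0
    matches = []
    for note in candidates:
        checks += 1
        if note["religion"] == religion:
            matches.append(note["name"])
    return matches, checks


# Second query term alone: every note in the document is tested.
whole_doc_matches, whole_doc_checks = count_religion_checks(notes, "Hinduism")

# Scoped to the Vocabulary agent's contents first: only its 50 notes are tested.
vocabulary = [n for n in notes if n["is_vocab"]]
scoped_matches, scoped_checks = count_religion_checks(vocabulary, "Hinduism")

print(whole_doc_checks, scoped_checks, scoped_matches == whole_doc_matches)
```

Both approaches find the same 10 notes; only the number of $Religion tests differs.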

Do be aware that if an agent B is testing the results of another agent A, and you make changes affecting A, you may need to wait for the agent update cycle to complete a couple of times so that B is seeing A’s updated results. That’s hard to see in a test with 20 notes, as it all happens so quickly as to be irrelevant, but with thousands of notes it is something to bear in mind.

Tip: the agent cycle processes agents in order of their $OutlineOrder position. So in the above, it makes sense that, if possible, B has a higher $OutlineOrder value than A, so that when B runs A has always just been updated. If both are on the same map, the z-order (stacking) represents $OutlineOrder, so by overlapping A and B you can tell which would run first in any given cycle.
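To see why the order matters, here is a small Python simulation of two agents run in a fixed order each cycle. It is a sketch under stated assumptions: the `run_cycle` and `cycles_until_b_current` functions, the note named "karma", and the "vocab" flag are all hypothetical, and the real agent cycle is certainly more involved than this loop.

```python
def run_cycle(order, notes, results):
    """Run each agent once, in the given order (standing in for $OutlineOrder)."""
    for agent in order:
        if agent == "A":
            # Agent A: collect every note currently flagged as vocabulary.
            results["A"] = {name for name, attrs in notes.items() if attrs["vocab"]}
        else:
            # Agent B: poll only what A holds right now, then test $Religion.
            results["B"] = {
                name for name in results["A"]
                if notes[name]["religion"] == "Hinduism"
            }


def cycles_until_b_current(order):
    """Count cycles needed, after a change affecting A, for B to show it."""
    notes = {"karma": {"vocab": False, "religion": "Hinduism"}}
    results = {"A": set(), "B": set()}
    run_cycle(order, notes, results)      # let both agents settle first
    notes["karma"]["vocab"] = True        # a change that affects agent A
    cycles = 0
    while "karma" not in results["B"]:
        run_cycle(order, notes, results)
        cycles += 1
    return cycles


print(cycles_until_b_current(["A", "B"]))  # A runs before B
print(cycles_until_b_current(["B", "A"]))  # B runs before A
```

In this toy model, B catches up in one cycle when it runs after A, and needs a second cycle when it runs before A.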

If you wanted to do your task, of listing vocabulary notes for a given $Religion, but didn't want a permanent agent per religion you could have one agent with this query:

    inside("Vocabulary") & $Religion==$MyString(agent)

Now you set the agent's key attributes to include $MyString. Changing the latter, e.g. from "Hinduism" to "Judaism", changes the query offering quick re-use of the agent. 

> Note, although you could use $Religion(agent) in the query, meaning the agent's $Religion attribute would have a pop-up list of existing $Religion values for easy value selection, you do then need to consider if setting a $Religion value in your agent would make queries mis-match.

Steve_Scott · December 7, 2016, 3:48pm

That is exactly what I needed. It works perfectly. Thanks for the clear write up!

bmscmoreira · December 8, 2016, 3:10pm

Great advice, thanks

jmm · December 22, 2017, 5:25pm

I am trying to base an agent on another one, as explained in this thread. However, the following query doesn’t work. I’m puzzled because its three parts do work separately. My guess is that descendedFrom has to be qualified with original, but I don’t know how to do it.

inside("QuotationsAgent") & descendedFrom("/Notes/Academic/Klemperer 2003") & $Tags.contains("miedo")

Incidentally, I’m using this agent to filter notes for a timeline view. It would be more convenient to use and modify it under /Notes/Academic/Klemperer 2003, but I wonder if it is better practice to place all agents under /Agents, so that they are not scattered and forgotten all over the document.

I hope somebody can advise on these issues.

jmm · December 22, 2017, 5:51pm

I’ve solved the code issue. The problem was that the two agent queries specified contradictory paths.
I still wonder where to place agents, but this is not so important to solve right away.

mwra · December 22, 2017, 8:09pm

I believe agents don’t ‘need’ to live anywhere. The issues surrounding agents are, in brief:

Polling too many notes. Think about scope. A query is parsed left-to-right, so start with terms that limit the scope before getting into the finer detail. Operators like inside(), descendedFrom(), or $Prototype=="SomePrototype" will help restrict scope.
Overlooking other agents. inside(), especially, will match to aliases in scope when the original isn’t. Plus, if an agent polls other agents, consider the possible presence of aliases of notes otherwise assumed to be out of scope.
Overall agent/alias count. A rule runs in the context not only of the original, but also every alias of that note. If you’ve complex rules, pay attention to this aspect. Reducing the overall alias count, e.g. by polling inside another agent rather than requerying all the originals, can aid performance.
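The first point, putting scope-limiting terms first, can be illustrated with Python's own left-to-right, short-circuiting `and`. Whether Tinderbox skips later terms in exactly this way is an assumption here, and the `in_scope` and `text_mentions` helpers and the note data are hypothetical stand-ins for a cheap scope test and a costly regex test.

```python
import re

regex_calls = {"count": 0}


def in_scope(note):
    """Cheap test standing in for a scope term such as inside("Vocabulary")."""
    return note["container"] == "Vocabulary"


def text_mentions(note, pattern):
    """Costly test standing in for a regex match on $Text; counts its own calls."""
    regex_calls["count"] += 1
    return re.search(pattern, note["text"]) is not None


# Hypothetical document: 1000 notes, 50 in scope, 25 of those mention the term.
notes = [
    {
        "container": "Vocabulary" if i % 20 == 0 else "Other",
        "text": "Hinduism" if i % 40 == 0 else "misc",
    }
    for i in range(1000)
]

# Scope term first: the regex only runs on notes that passed the cheap test.
regex_calls["count"] = 0
scope_first = [n for n in notes if in_scope(n) and text_mentions(n, "Hinduism")]
scope_first_calls = regex_calls["count"]

# Detail term first: the regex runs on every note in the document.
regex_calls["count"] = 0
detail_first = [n for n in notes if text_mentions(n, "Hinduism") and in_scope(n)]
detail_first_calls = regex_calls["count"]

print(scope_first_calls, detail_first_calls, len(scope_first) == len(detail_first))
```

The two orderings return the same notes, but the scope-first version runs the regex on the 50 in-scope notes rather than on all 1000.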

Edit: added missing “don’t” to first paragraph, so it should now make better sense.

eastgate · December 22, 2017, 8:33pm

I can’t resist an irrelevant question.

In JMM’s query does

descendedFrom("/Notes/Academic/Klemperer 2003")

refer to William Klemperer, the molecular beam chemist? I don’t imagine you’d have academic notes on Otto or Werner, and Victor died in 1960…

jmm · December 23, 2017, 7:04am

Your question is welcome. It is Victor, the Romance language scholar, cousin of Otto, the conductor, who was the father of Werner, the actor. I had not heard of William, which is not surprising because I know next to nothing about chemistry. I assume you knew of him. In case he was related to the rest, talent was scattered often enough in the Klemperer family. Among the 7 siblings of Victor, Georg was a physician who consulted Vladimir Lenin.

Unlike Otto and Georg, who swiftly reached the USA in the early days of Nazism, Victor endured it as a Jew and secretly wrote about the newspeak of Nazism in a book and two volumes of diaries. The latter were posthumously published in 1995 and I have a 2003 edition. I’ve been wanting to read them for some time now, and this is where Tinderbox comes into the picture.

It occurred to me that the timeline view would be a good tool to see the evolution of his thoughts on specific issues over a 12-year span. Therefore I’ve taken the opportunity to read them. I’ve written my annotations in Tinderbox. I have also extracted and tagged over a hundred quotations. Instead of waiting for a new release in which watch folders manage to transfer tags and don’t delete them from previously dragged notes, I dragged the quotations in and recorded a $StartDate for each from the day stamped in the diaries. My first look into the evolution of his thoughts on topics under the heavy influence of the times is indeed interesting. And here comes my Tinderbox dilemma:

On the one hand, DEVONthink is the tool for sources but is not suited to giving the insight a Tinderbox timeline permits on this kind of source. On the other hand, Tinderbox is better left for my own writing because of the maximum number of text notes it can handle without lagging. The number of notes that fit on a screen in map view is not an issue because I can isolate my annotations in their own container under each source.

This case makes clear that TB can play a role in the selection of sources to be included in my writing. Let me make clear that I have no intention to include a whole DT database of sources in TB, nor to include files of kinds other than text. Nevertheless, the number of notes in such a document would quickly run into the thousands. By the way, what is the approximate number of text notes a tbx document can comfortably handle?

A solution to this dilemma would be to analyse selected sources and do my own writing in two separate TB documents, not just because of TB’s technical limitations but for the sanity of not being under the impression that one has written the large amounts of text one is handling. Notes in both documents can be referenced through their URLs rather than with links. Footnotes are a different issue, and I understand they must be in the tbx document dedicated to writing. This course of action seems reasonable to me, but it is blocked by TB’s current pseudo-URL scheme: the path part of the URL makes it untrustworthy in the long run. Are there any plans to change it?

Any practical insight from seasoned users into how to approach this task division will be welcome. For these tasks I would like to keep my work between DT and TB. I don’t want to integrate the otherwise good Scapple, iThoughts or Aeon. I am curious about what The Brain and Panorama can provide. But in general I would prefer not to add any other app into the picture, unless it provides clear advantages and few additional integration problems.

To sum it up, I think I am advancing nicely with Tinderbox on three fronts: the actual work practice with real text (@JFallows posts have been clearly helpful), the coding side (atbref is a great resource), and an interest in the theoretical part.

On the last front, I am halfway through @eastgate’s Getting started with hypertext narrative. I find it interesting, and dense when trying to put some of it into practice. It made me laugh in agreement when I read that books are neither natural objects nor God-sent, but technological objects in evolution throughout history. Most defenders of their paper form miss the point that the evolution of books hasn’t been limited to materials and their impact on wider access to them.

Innovations in book technology have impacted thought processes. Reading went from collective vocal recitation to an individual silent activity. Book ownership brought the possibility of re-reading. Division into chapters contributed to logical writing and reading, as opposed to a uniform text narrated from beginning to end. The spacing of text made alphabetic indexing possible. The innovation of subject-based indexing is not that far from today’s tags. Neither is the technique of medieval commentary from digital annotations. Actually, I agree that the question is whether consumer software has so far done more than mechanise already existing techniques, which previously required experts and were very time consuming, rather than having already achieved innovations of great impact. According to Ivan Illich’s In the vineyard of the text, Hugh’s Didascalicon on the art of reading is a work that had a huge impact in the early 12th century, when a series of important text innovations appeared, and it feels familiar given our current concerns.

As for difficulty, I happen to own a copy of Tinderbox and a manuscript inscribed on palm leaves. I have no doubt that Tinderbox is an easier tool to master for the art of writing.

PaulWalters · December 23, 2017, 12:07pm

I would focus research / note taking & annotation, and writing drafts, in the same document. Mainly because you have the benefit of your raw source information side-by-side with your own creation. And you don’t need to be bothered with inter-document links.

If you have a lot of source material that is relevant but not essential to the core of your writing efforts, then keeping it in DEVONthink or Finder and watching those DEVONthink groups and/or Finder folders is a good idea. You can delete a watched group when it starts cluttering your Tinderbox document, and add it back when you need the material at hand for a while.**

There’s always a temptation to branch out into other applications that you mention – TheBrain, Scapple, Aeon, and others. Avoid it. Once you commit to one technology for a project, stick to it. If my goals are to do my research, collect my sources, take my notes, develop my thesis and insights, and write my piece then every minute I stray off to play with some other technology and try to integrate it is time off task.

Not to say these other softwares are not wonderful. They are. Use them for the next project, not this one.

** Not much different from the analog world, where you take an old notebook off the shelf and “watch” it for some important notes for a while, while writing, then put it back on the shelf until it is needed for a later chapter.

eastgate · December 23, 2017, 4:10pm

Tinderbox can easily handle thousands of notes.

If you’re looking at hundreds of thousands of notes, then (a) Tinderbox would really sweat, and (b) lots of the things that give Tinderbox leverage (lexical agents, maps, treemaps, and other visualizations) lose traction.

If you’re looking at millions of notes, a database is the only reasonable solution.

jmm · December 26, 2017, 7:24pm

I’ve taken my time to browse atbref, in order to better comprehend your once again interesting and canonical reply. I’ve found out that setting $Searchable to false would let me place agents anywhere avoiding the performance hit on other searches.

As it is an intrinsic attribute, I understand that using it as an OnAdd action in the agent will affect only the aliases found, not the original notes. If I need to make the search results available to another agent, searchable again, I understand that I would have to delete the OnAdd action in the agent, and apply a stamp with $Searchable==false to the search results.

Incidentally, I understand that in the case of prototypes $Searchable will not affect the notes the prototype is applied to.

eastgate · December 26, 2017, 7:42pm

I’m having a very hard time following this discussion.

> I believe agents don’t “need” to live anywhere.

I can’t see any reason, offhand, why the location of an agent would have any impact at all on its workload.

> setting $Searchable to false would let me place agents anywhere

I can’t see, offhand, why this would be true. Setting $Searchable to false prevents a note from matching an agent’s query; it’s typically used to prevent prototypes and other infrastructure from being flagged by agents.

> If I need to make the search results available to another agent, searchable again, I understand that I would have to delete the OnAdd action in the agent, and apply a stamp with $Searchable==false to the search results.

I think you mean, “$Searchable=true”. But I wouldn’t use a stamp for this; since you already have an OnAdd action, you could simply change to

$Searchable=true;

or

$Searchable=;

jmm · December 26, 2017, 7:53pm

I was thinking of setting $Searchable=false on an agent’s search results, so that other agents won’t have to process those aliases; hence the impact on the workload. But I probably haven’t understood $Searchable well.

Yes, sorry about it.

jmm · December 26, 2017, 7:54pm

If I am not wrong, your approach assumes that raw sources and annotations are to be discarded after each project. I was leaning towards the opposite approach, more in line with what I understand @JFallows has explained in this forum: a database of notes that I can refer to in successive writing projects.

I would have further processed some of the notes imported in the watched folder, so I am not sure I would want to lose their links, etc.

I was meaning to have a big file with all annotations on my sources to be looked at whilst writing. At the end of the project, I would duplicate the tbx file. In one of the tbx, I would delete my writing but all sources and annotations will be available for the next project. In the other tbx, I would leave my writing and only the sources and annotations that have been used in that project.

Since you are a more seasoned user of Tinderbox than myself, I now wonder if I am following the wrong course in the design of my workflow.

PaulWalters · December 26, 2017, 9:18pm

No, I neither assume nor suggest discarding anything. I never throw away notes. I merely suggest doing notetaking and writing in the same document – because that’s how I work, and not because of any disagreement with other suggestions.

Not what I’d do. But it’s not important what I’d do. What’s important is what you feel comfortable doing. As has been said repeatedly in this forum ad nauseam – there is no right way to use Tinderbox. Experiment. See what’s comfortable for your way of thinking.

jjvornov · December 27, 2017, 7:24pm

I find the discussion of these particular use cases both interesting and useful. For myself, I’m dealing with large numbers of PDFs and Word documents. These are articles from the medical and neuroscience literature plus documents like clinical trial protocols and investigator brochures describing what’s known about experimental drugs. The scientific literature is brought in through Mendeley but indexed in DEVONthink. It mimics the old paper copies and folder structure I started with almost 40 years ago when I began my PhD research. Except it’s instantly accessible, searchable, linkable, and all lives on the SSD of a 2.5 pound MacBook Pro.

I don’t get along with PDF annotations, just like I never write in books or on printed papers. My notebook is in a Tinderbox file for the project. My drafts are always in some other program depending on where they are headed. Mostly because I try to keep a flow from note to finished writing going one way. Notes are for me, writing is for others.

jmm · December 28, 2017, 5:12am

I find both of your use cases interesting, thank you. I store them for future reference when I find myself at the crossroads of making choices again on the design of my workflow.