Retrieving all notes within N links away

mdavidson · October 8, 2020, 8:09pm

FollowTheLinks.tbx (104.9 KB)

I find linking in TB very useful for connecting notes/concepts across the note hierarchy. TB also offers tools (action codes) to retrieve notes connected 1 link away from the note of interest. I’m wondering how to extend this idea to N links. In my work N is usually quite small, e.g. 2 or 3, but the idea can be generalised and is often used in graph theory.

Here is an illustrative example close to my work at the space agency. You see three areas identified by Adornments: Application, Information that can be retrieved through Earth Observation Satellites, and two basic satellite types (radar and optical).

You can see that the application is linked to information needs (to be provided by a satellite), which in turn are linked to a satellite type. For instance, detecting and measuring deforestation requires information about land cover and vegetation height. Land cover can be provided by both Optical and Radar satellites. Vegetation height, on the other hand, requires Radar.

What I would like to do is retrieve all applications supported by a given satellite (say SAR or optical) by retrieving all notes two links away (the application notes). Is this possible?

eastgate · October 8, 2020, 9:28pm

This should be possible. I see one implementation that uses rules, and another based on agents.

Let’s start with $Distance, a numeric attribute, which is initially zero for all notes.

Then, set $Distance(/theSatellite) to 1.

Now, every note has a rule that gets the list of its inbound links. Examine the distance of each inbound link’s source: if 1+$Distance(theSource) is less than our distance, or if our distance is zero, then set our own $Distance to 1+$Distance(theSource). Eventually, every note reachable from the Satellite will have its $Distance. (This is called “spreading activation”.)
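
A minimal sketch of the rule described above, written in Python rather than Tinderbox action code so it can be run outside the app. The dictionary of inbound links and the note names are hypothetical. One detail the description leaves implicit is assumed here: a source whose own distance is still zero has not been reached yet, so it is skipped.

```python
def spread_distances(inbound, source):
    """Relax $Distance-style values until nothing changes.

    inbound: dict mapping each note to the list of notes that link to it.
    Returns a dict of distances; 0 means "not reached", the source is 1.
    """
    notes = set(inbound) | {s for srcs in inbound.values() for s in srcs} | {source}
    distance = {note: 0 for note in notes}
    distance[source] = 1

    changed = True
    while changed:  # each pass plays the role of one round of rule updates
        changed = False
        for note in notes:
            for src in inbound.get(note, []):
                if distance[src] == 0:
                    continue  # assumption: an unreached source has nothing to pass on
                candidate = distance[src] + 1
                if distance[note] == 0 or candidate < distance[note]:
                    distance[note] = candidate
                    changed = True
    return distance


# Hypothetical notes: a -> b -> c -> d, plus a direct link a -> d; e is unlinked.
inbound = {"b": ["a"], "c": ["b"], "d": ["c", "a"], "e": []}
distances = spread_distances(inbound, "a")
```

With the source at 1, a note N links away ends up with a value of N+1, and any note left at 0 is unreachable from the source.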

Another approach would be to have a set attribute $Satellites. When there’s a link to this from any satellite, $Satellites(this) gets the name (or path) of the satellite added to its set. Then, any note can look at its inbound links and see which sources have a non-empty $Satellites. This is easier, but limited to the specific case.
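
The set-based variant can be sketched the same way, again in Python as a stand-in for action code. The function name and the link table are hypothetical, and the links are assumed to run from satellite to information to application, as the description above implies; if the links in a given file run the other way, outbound links would take the place of inbound ones.

```python
def satellites_two_links_away(inbound, satellites):
    """For each note, collect the satellites that reach it in exactly two links."""
    # Step 1: each note records the satellites that link to it directly.
    direct = {note: {s for s in sources if s in satellites}
              for note, sources in inbound.items()}
    # Step 2: each note gathers the sets recorded by the sources of its inbound links.
    return {note: set().union(*(direct.get(s, set()) for s in sources))
            for note, sources in inbound.items()}


# Note names taken from the deforestation example earlier in the thread.
inbound = {
    "Land cover": ["Optical", "SAR"],
    "Vegetation height": ["SAR"],
    "Deforestation": ["Land cover", "Vegetation height"],
}
supported = satellites_two_links_away(inbound, {"Optical", "SAR"})
sar_applications = [note for note, sats in supported.items() if "SAR" in sats]
```

As noted above, this only covers the fixed two-link case; it does not generalise to N links without repeating the second step.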

You could do this with an agent for each satellite, too.

This would be easy to add under the hood, perhaps as graphDistance(start, end,[linkType]).

mdavidson · October 9, 2020, 8:11am

Thanks for the elegant solution @eastgate. I can see how the more general first solution works through successive updates to the status of each note as it checks its distance from the source. I’ll definitely implement a solution along these lines this weekend. For more interactive exploration of notes based on a neighbourhood defined by links, a graphDistance function, along the lines you’ve suggested, would be much faster and more responsive.

I recall that there was a discussion some years ago in this forum on the potential of using note links as an additional filter or method for retrieving and visiting notes (in a similar way to graphs but with notes as the nodes). What I’ve tried to do here is provide an example of a use case close to my work. There is plenty of scope for other fields and note analyses.

PaulWalters · October 9, 2020, 11:42am

I like this idea, though it’s a slippery slope leading to lots of other graph-based enhancements.

Maybe “nodeDistance()” would be more accurate?

I assume in a case like below, the result of graphDistance(“/a”,“/d”,“type”) would be 1 rather than the alternative answer of 3?

mdavidson · October 9, 2020, 2:06pm

I like the illustration, and I hope to avoid any slippery slopes. I identify two different use cases:

Use Case 1 is the one I outlined above. It answers the request: retrieve all notes within N links (2 in my example) of the note of interest. In your example, this would mean retrieving the note “/c”, as this is the only note 2 links away from “/a” (assuming we move along the direction of the link as a starting point, although in this case it doesn’t matter).
Use Case 2 answers the question: what is the shortest path, in terms of links, between “/a” and “/d”? In this case the answer from graphDistance() is 1, as you’ve pointed out.

Both use cases have their applications. I’m not sure how I would use the graphDistance() function for Use Case 1, as you still need a list of notes to test. As far as I can tell, there are algorithms and code readily available for both cases, so hopefully the gain is high and the pain is low re. implementation.
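
Both use cases can be served by one breadth-first search. The sketch below is Python, not Tinderbox action code, and the link table is only a guess at the illustration discussed above (a chain from “/a” to “/d” plus a direct link from “/a” to “/d”). The function names are hypothetical, and filtering by link type is left out.

```python
from collections import deque


def link_distances(outbound, start):
    """Breadth-first search: fewest links from start to every reachable note."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        note = queue.popleft()
        for target in outbound.get(note, []):
            if target not in dist:  # first visit is always the shortest route
                dist[target] = dist[note] + 1
                queue.append(target)
    return dist


def notes_at(outbound, start, n):
    """Use Case 1: notes whose shortest route from start is exactly n links."""
    return {note for note, d in link_distances(outbound, start).items() if d == n}


def graph_distance(outbound, start, end):
    """Use Case 2: length of the shortest route, or None if end is unreachable."""
    return link_distances(outbound, start).get(end)


# Assumed layout: /a -> /b -> /c -> /d and /a -> /d.
outbound = {"/a": ["/b", "/d"], "/b": ["/c"], "/c": ["/d"]}
two_away = notes_at(outbound, "/a", 2)
shortest = graph_distance(outbound, "/a", "/d")
```

Because the search already yields a distance for every reachable note, Use Case 1 needs no separate list of candidate notes to test.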

mwra · October 9, 2020, 3:05pm

This renews a very old feature request, which was to be able to scope links(), with its existing filters (e.g. direction, link type), to more than one ‘hop’. IOW, “give me a list of paths for all notes linked by an inbound path length of 4 links, or fewer, of type ‘example’”.

I do have a plea that we think about terminology early (as we sometimes forget in our enthusiasm). We already have confused people thinking ‘ziplinks’ are a type of Tinderbox link, when in fact they are text links that are created using the ‘ziplinks’ link creation method. So you can’t, for instance, filter for ‘zip links’. It might be nuance to old hands, but lazy terminology just adds to the ‘hardness’ perceived by users; it is an unforced design error.

In that vein, I’d note we already have the action operators distance() and distanceTo(), which confusingly [sic] are quite different in purpose, though you wouldn’t know without digging into the documentation (and I can’t find distance() in the app Help).

If some of the above go ahead, I’d suggest the existing operators are replaced by/mapped to:
distance() → mapDistance(), distanceTo() → geogDistanceTo(). This then gives us coherence for internal distance measures, e.g. mapDistance() vs. graphDistance() … and perhaps other types of intra-doc distance measures.

I’d suggest that if possible use case #12 above be done by extending links(), e.g. with an optional fifth param (‘hop’) that defaults to 1 if omitted, i.e. the status quo. Currently, links() must specify a direction (only inbound or outbound), and to get both you need to make the call twice and pass the resulting path lists to a set. Perhaps links() might also resolve this by adding a ‘both’ direction option, noting that when using this you don’t know in which direction the link path flows (as Tinderbox links are always directional even if some folk would prefer they weren’t). A ‘both’ option might allow those who don’t need to worry about direction to work with links(). An edge case though: extending links() would not necessarily allow for a path N links long reached by links not all of the same directionality.
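
To make the suggestion concrete, here is a rough Python sketch of how a hop count and a ‘both’ direction might behave. It is not how links() works today; the function name, the parameters and the link list are hypothetical, and link-type filtering is omitted.

```python
def linked_notes(links, note, direction="outbound", hops=1):
    """Notes reachable from note in at most `hops` links.

    links: list of (source, target) pairs, one per directional link.
    direction: "outbound", "inbound" or "both".
    """
    if direction not in ("outbound", "inbound", "both"):
        raise ValueError(f"unknown direction: {direction}")

    def neighbours(n):
        found = set()
        for source, target in links:
            if direction in ("outbound", "both") and source == n:
                found.add(target)
            if direction in ("inbound", "both") and target == n:
                found.add(source)
        return found

    seen = {note}
    frontier = {note}
    for _ in range(hops):  # hops=1 reproduces the single-link status quo
        frontier = {m for n in frontier for m in neighbours(n)} - seen
        seen |= frontier
    return seen - {note}


# Hypothetical links: a -> b, b -> c, d -> b.
links = [("a", "b"), ("b", "c"), ("d", "b")]
one_hop = linked_notes(links, "a")
mixed = linked_notes(links, "a", direction="both", hops=2)
```

In this sketch, ‘both’ with two hops reaches “d” from “a” by going out along one link and back along another. That is the mixed-direction path mentioned in the edge case above, which a purely inbound or purely outbound search would miss.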

Anyway, I really like what’s emerging here, but just ask a <pretty please> that we give some forethought to using terminology that is not totally confusing to the less experienced user. There is no need to do this.

eastgate · October 9, 2020, 3:44pm

I agree, but in some cases (as you know) there is no ideal solution.

mwra · October 9, 2020, 9:35pm

Amen. Suggesting things is the easy part. I well understand the implementation is less so, and often fraught with unseen and hard-to-explain(-to-the-passer-by) issues.

I’m all for “less haste more speed”. Time in consideration invariably pays off in the long term.

mdavidson · October 12, 2020, 7:31am

In these situations, I find it useful to try and list existing and potential measures of distance between notes. To my knowledge we have

the distance between 2 notes in Map units (already implemented via distance). This tells you how far apart the two notes lie within a Map view. Although the details of this distance are documented, I assume this is based on the Euclidean distance between the (x,y) centre positions within the Map along the lines of this link. Small editorial note: a bracket is missing in the expression at the bottom of the distance article in the aTBRef documentation.
the geographic distance between 2 notes in kilometres (already implemented via distanceTo). This is based on the geographic attributes $Latitude and $Longitude defined for each note.
the textual similarity between the content of two notes based on text, note length, name, user attributes and other aspects such as which prototype they have (this is implemented via SimilarTo). The similarity is based on a calculated numeric value. The numeric value can be considered as the distance between notes in terms of their content. The action code happens to return the N notes with the smallest distance to the note being compared to.
the link or graph distance between two notes (not yet implemented). This calculates how many links (with the option of specifying type) lie between two notes. Probably the most useful would be to calculate the shortest path between the two notes of interest. Other measures could include average distance or a full list of the paths.
the hierarchical distance between two notes (not yet implemented or even proposed to my knowledge). This tells you how many steps through the note hierarchy, e.g. up and down the different branches, are required to bring Note1 next to Note2. This distance is zero if Note1 is the sibling of Note2, 1 if Note1 is the child of Note2, 2 if Note2 is the grandparent of Note1, etc.

What is common to the above is that the action code has the form function(note1, note2,...). While there is no perfect solution to naming conventions, it might be desirable to bundle the functionality into a single distance(note1, note2, type) function and let the user specify which type of distance they would like to compute, e.g. distance(note1,note2,"map") for Map distance, distance(note1,note2,"link") for link distance and so on. This has the advantage of one single action code and flexibility with respect to future developments, e.g. new functionality can be addressed by adding a new type parameter while maintaining the same action code.
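
The single-function idea can be sketched as a lookup table of measures keyed by type. This is Python for illustration only, not action code. The dictionary keys ("x", "y", "lat", "lon") and the function names are hypothetical, and the haversine formula with a mean Earth radius of 6371 km is an assumption, since the thread does not say how distanceTo computes its result.

```python
import math


def map_distance(n1, n2):
    """Euclidean distance between two note centres, in map units."""
    return math.hypot(n1["x"] - n2["x"], n1["y"] - n2["y"])


def geo_distance(n1, n2):
    """Great-circle distance in km (haversine; assumed, not confirmed for Tinderbox)."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (n1["lat"], n1["lon"], n2["lat"], n2["lon"]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


# A new kind of distance is added by registering one more entry here.
DISTANCE_TYPES = {"map": map_distance, "geo": geo_distance}


def distance(n1, n2, kind):
    """One entry point; `kind` selects the measure."""
    try:
        measure = DISTANCE_TYPES[kind]
    except KeyError:
        raise ValueError(f"unknown distance type: {kind}") from None
    return measure(n1, n2)


note1 = {"x": 0.0, "y": 0.0, "lat": 0.0, "lon": 0.0}
note2 = {"x": 3.0, "y": 4.0, "lat": 0.0, "lon": 90.0}
in_map_units = distance(note1, note2, "map")
in_km = distance(note1, note2, "geo")
```

The calling code stays the same as measures are added, which is the flexibility argued for above; the cost is that one name now covers quantities with different units.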

mwra · October 12, 2020, 9:16am

Note on distance() is now corrected. As it states, the distance figure (a Number) returned by the function is the distance in TB v8.8.0b map units between the centres of the two notes: i.e. at an X of $Xpos+($Width/2) and Y of $Ypos-($Height/2) for the two notes.

Geographic distance strikes me as an outlier here as it is measuring something external to a Tinderbox view. For an internal map, distance is measured by a different feature, the above distance() feature. So, this measurement fits in the grouping only due to the term ‘distance’.

The textual similarity does, I believe, resolve to an under-the-hood number, but that’s mainly an internal for figuring list orders in the UI rather than something for the user. similarTo() is now supplemented by wordsRelatedTo(). The manner of the similarity test is under a process of evolution, which is why wordsRelatedTo() only works when running in OS 10.15+. Doubtless this will continue to evolve. A challenge to the user is the opacity of these functions, due not to the app’s implementation but to the innate black-box form of such algorithms.

In terms of link distance there are several contingent tasks, e.g. “What are the things within N links of me?”, “Am I linked to note X by a path of N links (or fewer and/or limited by link type)?”, “What is the shortest path (by link type)?”, etc. I’m not sure the latter is the most useful for the sorts of task I see reflected by TB users. In fact, I suspect some of these are most useful as a boolean query “Is note X accessible within 4 links from here?” … which is useful if you expect it to be true and it isn’t, or vice-versa. I think this aspect of distance, i.e. link-based, will become more pertinent if hyperbolic view gets developed more. At present there’s a chicken-and-egg issue. You can’t do too much with hyperbolic view other than display the currently connected network (I’m not suggesting that alone is not useful). The interesting part comes if you can focus (or hide) different link-type paths (giving link types usage more weight than just labels), plus be able to use some of these link-network based questions, e.g. “Draw me the hyperbolic network to a maximum of N links distance based on the current selection”, and so forth.

The ‘hierarchical’ distance can be calculated by the user (i.e. not via an action operator) as all notes (and aliases) have a 1-based $OutlineDepth and a $Path. It’s a bit of a chore, but could be done that way now if needed, though I’d agree this doesn’t seem to be a pressing need (then again these things never seem that way in advance of an actual use case!).
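
As a rough illustration of the $Path route, the Python sketch below (not action code) computes one possible reading of the hierarchical distance proposed earlier in the thread: the number of steps between the two notes’ parent containers. That reading is an assumption, chosen because it reproduces the three stated examples (sibling 0, child 1, grandparent 2). The function name and sample paths are hypothetical.

```python
def hierarchical_distance(path1, path2):
    """Steps up and down the outline between the parents of two notes."""
    # Drop the note's own name; what remains is the chain of its containers.
    parents1 = path1.strip("/").split("/")[:-1]
    parents2 = path2.strip("/").split("/")[:-1]

    # Count the leading containers the two notes share.
    common = 0
    for a, b in zip(parents1, parents2):
        if a != b:
            break
        common += 1

    # Climb from one parent to the shared ancestor, then descend to the other.
    return (len(parents1) - common) + (len(parents2) - common)


siblings = hierarchical_distance("/Projects/Alpha", "/Projects/Beta")
child = hierarchical_distance("/Projects/Alpha/Task", "/Projects/Alpha")
grandchild = hierarchical_distance("/Projects/Alpha/Task", "/Projects")
```

Other definitions are equally defensible, for example counting the steps between the notes themselves, which would put siblings at 2 rather than 0.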

mdavidson · October 12, 2020, 9:39am

Under link distance, I would certainly agree on the value of such functions for focusing on and hiding different link paths and types. I see this not only for the Hyperbolic view but also for the standard Map view when there are complex networks of links displayed.

If you consider my original example at the beginning of this thread, filtering out all the links NOT related to the satellite of interest would help visually identify those applications supported by the satellite of interest.

mdavidson · October 22, 2020, 8:59pm

I’ve finally got some time to try out your suggestions. The following code works fine.

$MyList=links.outbound..$Path;
$MyList.each(x){if($DistanceToSource(x)>$DistanceToSource & $DistanceToSource == 0){$DistanceToSource=$DistanceToSource(x)+1;};};

Here I use the attribute $DistanceToSource to store the distance from the source note. I also initialise the source note by setting $DistanceToSource=1. For the moment I’ve used Stamps. I can see how an Edict can work as well based on this code.

For Agents I believe the implementation is more complicated. The main challenge is that there is no option for retrieving or using the links() action code with the aliases retrieved using the Agent.

eastgate · October 22, 2020, 10:11pm

How about this?

$MyList=links(original).outbound..$Path

I may not understand what behavior you’d like for the aliases inside agents.

mdavidson · October 23, 2020, 7:50am

You guessed right! I’m looking to access the links of the original notes through their aliases as they are retrieved and processed by the agent. If original can be used to access the links of the original note, the problem is solved.

I may have misunderstood the links entry in aTBRef. The general form of the action code is clear enough

links[(item|group)].kind.[linkType].$Attribute

However, I was confused by the statement some way underneath, in the discussion of (item|group), which states:

Aliases can never be the referenced object(s) of item|group, even by use of paths.

mwra · October 23, 2020, 10:11am

What I read/wrote this to mean is that an alias can’t be the object for which the app determines links using this operator. I believe this is still true. The original of an alias is not an alias, so this holds true. The article also states:

When using links() in the context of an agent’s action, remember that aliases can have different links to their originals. Therefore, it is likely you will want to use ‘original’ as the note item for the call.

I’ll move the latter text to the former to make this more explicit. Done! See here. It now states:

Aliases can never be the referenced object(s) of item|group, even by use of paths. When using links() in the context of an agent’s action, remember that aliases can have different links to their originals. Therefore, it is likely you will want to use ‘original’ as the note item for the call if ‘this’ is an alias.

That should explain this issue better. But, if unfamiliar with an action operator and looking at these articles, I still recommend reading the whole article, as recent changes are often added at the end of the article.

mdavidson · October 23, 2020, 1:36pm

All clear now thanks.