Tinderbox Meetup - Saturday, May 13, 2023

As explained, these are user (URL-type) attributes; the point being that URLs can live in any of:

  • weblinks in $Text
  • system URL-type attributes
  • user URL-type attributes

any of which might hold duplicates. The code covers all three. You can leave out the tests for user attributes you don't have; the code won't fail if they aren't there.

The example is just that, rather than a precise fix to a bounded problem, as we likely don't all use the same set of attributes. Doing version 1 made me realise that a simpler way is to add all the possible URLs (link- or attribute-derived) to a Set, as that automatically de-dupes, so the number of items in the set is the number of discrete URL attributes/links in the note.

Essentially, version #2 makes a Set-type variable, then adds to it the URLs of all weblinks in the note's $Text (weblinks are always from $Text, or else they need to be URL-type attributes). Then add each URL-type attribute you (think you) will use, giving a de-duped list whose item count you can then record.
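The pattern described above can be sketched in Python for illustration (the actual Tinderbox action code uses a Set-type variable the same way; the URLs here are invented for the example):

```python
# URLs harvested from weblinks in the note's $Text (hypothetical data)
weblink_urls = [
    "https://example.com/spec",
    "https://example.com/home",
    "https://example.com/spec",   # duplicate, hidden behind anchor text
]

# Values of the note's URL-type attributes (hypothetical data)
attribute_urls = [
    "https://example.com/home",   # duplicates a weblink above
    "https://example.com/api.json",
]

# A set discards duplicates automatically as items are added,
# so its final size is the number of discrete URLs in the note.
unique_urls = set()
unique_urls.update(weblink_urls)
unique_urls.update(attribute_urls)

print(len(unique_urls))  # number of discrete URLs
```

The point is only the de-dupe-by-insertion trick: no comparison logic is needed, because the set does the work.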

Thanks, I get it now. For example, I have a $URLEmail, i.e., a URL to trigger a mailto. Can you share an example of a $SpecURL, $SourceURL, $XSDSpecURL and $JSONSpecURL, and how they may look or act differently from other kinds of URLs?

The only example is my current research TBX, which I can't share as it's designed for me and would need lots of description for others to use. But to answer the other part of the question: they don't look any different, or rather, they look like any other URL-type attribute (system or user). The attribute names simply remind me of the purpose of the various URLs, plus I need to store more than one URL per note. I'm using $SourceURL for the resource's main page, $SpecURL for the specification or API for that resource, and $JSONSpecURL for a JSON-based resource. So these are just URLs, albeit with different nuance and in some cases tied to particular formats (XSD, JSON). The formats don't matter to anyone but the maker, so it's best not to be too literal in interpretation.

But recall the starting premise. An unexpected side-effect of how the data for >6 notes was generated resulted in a lot of duplicate URLs, some hard to spot by eye as they were weblinks using anchor text (IOW you need to open Browse Links to see the actual URL). The code shows how I was quickly able to ascertain I had a 20% duplication rate; the exercise was worth it for that alone. Beyond that the code has no particular purpose, but it does offer a generalisable example for those who have a similar problem, albeit with different attributes, data, etc.
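For illustration, a duplication rate like the 20% figure falls out of comparing the raw occurrence count with the de-duped Set count (again a Python sketch with invented data, not the actual code or data):

```python
# Every URL occurrence in the note, duplicates included (hypothetical data)
all_urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/a",   # duplicate hidden behind anchor text
    "https://example.com/c",
    "https://example.com/d",
]

total = len(all_urls)        # raw count of URL occurrences
unique = len(set(all_urls))  # de-duped count via a set
duplication_rate = (total - unique) / total

print(f"{duplication_rate:.0%}")  # prints "20%" for this sample
```

Only the two counts are needed; everything else is the set doing the de-duping.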

Making it a formal demo would confuse and get people into process-fixation, i.e. trying to follow the process without understanding it, rather than seeing the pattern and using that understanding to improve processes in other contexts.
