Thinking through an idea - am I modelling this the wrong way, mentally?

satikusala · January 2, 2025, 12:39pm

Interesting, I’ve not seen anyone put a function in an $Edict like this before. Clearly it works, but that is not what I expected. I expected to see the function as a note in the library and the $Edict calling the function.

Also, @DaveM, I’m curious, would it not be helpful to have each found candidate be its own note rather than an entry in a log? You could have the path as an attribute value or link, which could help with automatic navigation. If this path interests you, this is something we could explore at Saturday’s meetup.

mwra · January 2, 2025, 1:06pm

@satikusala it is allowed for users to place a function in any action, though less usual practice. I read the code above as a test, so if the document didn’t already have a library (and /Hints, etc.) then this seems a sensible quick way to insert a user function with minimal effort.

As @DaveM’s code example stopped before the full loop completed, we can’t tell what came next. In my test I ‘printed’ the matching paths to $Text just as a proof that the .each() loop worked as expected.

The /Hints/Library/ container does make it easier to store all functions in a central place, but requires the ‘built-in’ ‘Hints’ folder to be added to the TBX. N.B. in my test, I added the Hints before realising it wasn’t actually needed for solving this puzzle.

Re storing functions: Storing function code.

DaveM · January 2, 2025, 3:45pm

Thanks @mwra and @satikusala! I’m refactoring and taking your expertise onboard.

I stuck the function in the edict as I just needed a quick way to check a few things were happening the way I expected, and couldn’t remember the syntax for calling functions defined elsewhere. While I was on a roll, I didn’t want to break my flow - I have enough structural and syntactical issues to keep me covered there. It’s a filthy bodge, but it did what it needed to do at the time, and imposed no dependencies or the need to flip around other parts of the document.

And Mark, I didn’t put the whole function up because I’d open a regex can of worms that could easily overwhelm this first issue.

What I’m hoping to do is maybe set up a prototype, that allows instances to specify a prefix to collate these notes from. So one might collect any strings with ‘{q:…’, another collects references to notes with ‘{xyz:…}’, which I use as a general follow-up marker.

Michael, I think the answer to why I’m putting things in a log note would make sense if I’d put the whole thing up. The log was just sanity checking of data; there’s a whole extra chunk of code pulling the relevant sections from each discovered note, then creating new notes that link to them.

Hopefully I can get something more generalisable together to put up here, but not before - I suspect - I’ll be asking about how to make regexes non-greedy, and being told by @mwra that I should probably be using the new stream handling methods anyway

DaveM · January 2, 2025, 7:35pm

Okay, I made the mistake of trying stream processing.

There’s a disconnect between how my head thinks it works, and how it thinks it works.

So say I have the following text string:

this is {q:first item} simply a test {q:or is it testing} case

(I can’t assume a paragraph won’t have multiple matches in it!)

Let’s assume this string is stored in a variable, candidate.

I have tried the following:


		/* find Lingering Questions! */

		var lingerers:list;
		var toExtract = $Text(candidate);
		//var catch:string;
		var catchFail = false;

		while(catchFail == false) {
			slog("- hunting through "+toExtract);
			toExtract.skipTo('{').expect('q:').captureTo('}', "catch" )
			catchFail = toExtract.failed;
			if (!toExtract.failed) {
				slog(" - !found something - "+catch);
				lingerers += catch;
				toExtract.captureRest("toExtract");
			}
			catchFail = true; // a showstopper stoppper
		}

now, I’m kinda stumped as the documentation suggests .captureTo()'s destination (the 2nd parameter) is a string matching an attribute. But I want it in a variable!

I’m also kinda stumped as to how to catch failures. It looks like it should be some kind of try/catch or promise structure, but then I get all tied up between the action code and javascript, and can’t find any good examples.

I’ve attached the return-butchered tbx file - my ugly edict is in ‘q: collector’, where you can see abandoned attempts to use regexes.

new-model-thesis-enbutchered.tbx (163.3 KB)

If you’re running this, make sure the while loop doesn’t run continuously, as macos choked on the giant memoryball I’d ended up inducing.

mwra · January 2, 2025, 10:51pm

I can see an immediate code error on the line toExtract.skipTo('{').e...: it lacks the required line terminator (a semicolon), which is needed because further discrete expressions (i.e. discrete lines of code) follow this one.

I’ll take a look. At first sight, there is no need for the while() loop. You simply consume the text stream capturing the pattern desired into an attribute or variable. You might want to revisit the articles in the Stream Processing and parsing section of aTbRef (I don’t think Help has as much detail).

Note that String.captureTo(matchStr[, targetAttributeStr]) allows the ‘targetAttributeStr’ parameter to be an attribute or a variable—see the linked article.

But, stream parsing won’t work if one line (aka paragraph) of $Text has between 1+ and N ‘q:’ markers. Consider:

$Text has zero markers. Ignore the note.
$Text has one marker. Use .contains().
$Text has multiple markers but no more than one in any single paragraph. Iterate $Text.paragraphList() and use .contains() on each in turn.
$Text has multiple markers within a single line/paragraph. Use String.extractAll(regexStr[, caseInsensitiveBln]) to get a list of all the markers (the operator’s argument is a regex).
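
To make the four cases concrete, here is a rough sketch in Python rather than action code, so it can be run outside Tinderbox. It is only an illustration: the function name `classify` and the returned labels are made up for this example, and splitting on newlines is assumed to approximate what `$Text.paragraphList()` yields.

```python
def classify(text: str, marker: str = "{q:") -> str:
    """Return a label for which of the four cases a note's text falls into.

    The labels are hypothetical names invented for this sketch.
    """
    # One entry per paragraph: how many markers that paragraph contains.
    counts = [paragraph.count(marker) for paragraph in text.split("\n")]
    total = sum(counts)
    if total == 0:
        return "ignore"
    if total == 1:
        return "single"
    if max(counts) == 1:
        return "one-per-paragraph"
    return "multiple-in-paragraph"


example = "this is {q:first item} simply a test {q:or is it testing} case"
print(classify(example))  # multiple-in-paragraph
```

The last case is the one the example string in this thread falls into, which is why simple per-paragraph detection is not enough for it.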

Stream parsing consumes the source text, be it all or just a single paragraph/line. But within the source, a capture can’t auto-repeat. Rather, if $Text contained 3 ‘q:’ markers you would need to formally code 3 x detection of the marker as discrete captures.

Another small snit: Tinderbox comments are single-line (whole or end of line) started using //. This is not JavaScript, which is the assumption that may be setting you wrong here. See Comments in Action code.

The challenge of iterative fixing is we (helping) don’t know the real purpose. For instance, it matters that there may be more than one ‘q:’ marker in the text. Unclear is what your intended end state is. It looks like you are trying to find the value of all ‘q:’ markers (i.e. from after the colon to the first } character) and make a note for each. Unclear is whether you need to know the source of the question. I suspect you do. Knowing the overall task helps as there may be quicker/better approaches and/or we may need to make/store additional info to achieve the aim.

Meanwhile, with the need to support one or more ‘q:’ markers in the same line/paragraph I think the best detection approach is along these lines:

$Text(/log) = $Text.extractAll("q:[^\}]+");

With your example above of:

this is {q:first item} simply a test {q:or is it testing} case

The $Text of the log file is:

q:first item;q:or is it testing

i.e. a list of two items. But, do you need to know where within a given note’s $Text the matches occur? This is another aspect that might require a change of tack.
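
For anyone wanting to check the pattern outside Tinderbox, the same idea can be sketched in Python. This is a stand-in, not action code: Python’s `re.findall` is assumed to play the role of `.extractAll()`, and the variable names are only for illustration. The second half uses `re.finditer` to show one way of also recording where each match starts, in case position matters.

```python
import re

# Example string from the thread.
sample = "this is {q:first item} simply a test {q:or is it testing} case"

# Same pattern as above: 'q:' followed by one or more characters
# that are not a closing brace.
markers = re.findall(r"q:[^}]+", sample)

# Tinderbox displays a list as semicolon-delimited text.
log_text = ";".join(markers)
print(log_text)  # q:first item;q:or is it testing

# If the position of each match matters, keep the start offset too.
positions = [(m.start(), m.group()) for m in re.finditer(r"q:[^}]+", sample)]
```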

P.S. thanks for the test file!

eastgate · January 3, 2025, 2:05am

I think it will! I’m cooking as I type, so I cannot linger, but the idea is not that bad:

Grab what you want to parse: var:string theString=$Text;
while(theString.length>0){....} so the loop terminates when we’re out of text.
In the loop, theString=theString.skipTo('{').captureTo("MyString"). That is, we proceed through the data, capturing the next {…} expression to $MyString.
Call a function to process $MyString appropriately.

mwra · January 3, 2025, 10:44am

Oh, that’s nice. So the while() loop does have a purpose! In this case, my hunch is that doing .extractAll() is simpler, but is it more or less efficient? Both methods use regex-based operator arguments.

Noting that you were otherwise focussed, I think the latter—to capture the value of a {q:...} mark should likely be:

theString=theString.skipTo("{q:").captureTo("}","MyString")

We skip the stream cursor to after the first occurrence of {q:, then capture everything up to but not including the next }. The captured sub-string is stored in $MyString, whilst the remainder of the source string is passed back to the left side of the expression, overwriting the starting string with the unprocessed remainder of the string.

As we’re going to loop this process, we want to pass the value of $MyString (or whatever attribute/variable we choose) to a List-type object before completing this iteration of the loop so that the detection/extraction of the next marker doesn’t destroy the only saved value of the last extraction. So, (not tested as also busy!) something like:

var:string theString=$Text;
var:string vTheMarker =;
var:list vMarkers =;
while(theString.length>0){
   theString=theString.skipTo("{q:").captureTo("}","vTheMarker");
   vMarkers += vTheMarker;
   vTheMarker =; // just in case of accidental value re-use
}

Now the List-type variable ‘vMarkers’ holds a list of every ‘{q:…}’ marker’s value, listed in the order of detection. This can be run inside an outer loop that is processing multiple notes one at a time. Thus ‘vMarkers’, declared outside the outer loop, could hold a sequence of marker values ordered by note and by sequence within each note’s $Text.

The challenge here is we’re iteratively solving an emergent process, so the intermediate answers may be right but sub-par overall. We’re collecting tags, for sure, but to what end? I’ve been round this loop in my own past work. I figured all the extraction etc. but because I didn’t think what I’d then do with the outcome I had to go back and re-do the process so I had needed contextual info.
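
As a rough illustration of keeping that contextual info, here is a Python sketch (not action code) that records the source note alongside each marker value, and takes the prefix as a parameter along the lines of the prototype idea mentioned earlier. Everything named here is hypothetical: the `notes` dictionary stands in for a set of Tinderbox notes and their $Text, and `collect_with_source` is an invented helper.

```python
import re


def collect_with_source(notes: dict, prefix: str = "q") -> list:
    """Return (note name, marker value) pairs, in note order then text order."""
    # Matches '{prefix:...}' and captures the part between the colon and '}'.
    pattern = re.compile(r"\{" + re.escape(prefix) + r":([^}]+)\}")
    found = []
    for name, text in notes.items():
        for match in pattern.finditer(text):
            found.append((name, match.group(1)))
    return found


# Hypothetical stand-in for two notes and their text.
notes = {
    "Note A": "this is {q:first item} simply a test {q:or is it testing} case",
    "Note B": "nothing here but a follow-up {xyz:check later}",
}
print(collect_with_source(notes))
# [('Note A', 'first item'), ('Note A', 'or is it testing')]
```

Calling it with `prefix="xyz"` would instead pick up the follow-up marker in the second note, which is the kind of per-instance configuration a prototype could expose.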

Regardless, I’ve made a note on the spike to address the while(){} loop approach. For those reading who are still wondering, the ‘secret’ is that the result of any stream processing is to return all of the source string after the last processed point. So if we detect and capture marker #1 and start a new stream process with the stream output, the first detection will be of marker #2 as #1 is no longer in the (current) source string. Essentially we are recursing through ever-smaller parts of the original string (e.g. a note’s $Text). This helps make sense of why operators like .captureTo() store their captures in a nominated attribute or variable while the expression as a whole returns the non-processed part of the string.
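
The consume-and-return-the-remainder idea can be sketched in Python to show the shape of the loop. This is an analogy only, not action code: `next_marker` and `collect_markers` are invented names, and `str.partition` is assumed to stand in for the skip and capture steps. The explicit exit when nothing is found is an addition for safety in this sketch (the earlier attempt in this thread tested `.failed` for the same reason), so the loop cannot spin on a remainder that holds no more markers.

```python
def next_marker(stream: str, opener: str = "{q:", closer: str = "}"):
    """Return (captured, remainder); captured is None when nothing matches."""
    # Skip to just after the opener.
    _, found_open, after_open = stream.partition(opener)
    if not found_open:
        return None, ""
    # Capture up to, but not including, the closer.
    captured, found_close, remainder = after_open.partition(closer)
    if not found_close:
        return None, ""
    return captured, remainder


def collect_markers(text: str) -> list:
    markers = []
    remaining = text
    while remaining:
        captured, remaining = next_marker(remaining)
        if captured is None:
            break  # no further marker: stop rather than loop forever
        markers.append(captured)
    return markers


sample = "this is {q:first item} simply a test {q:or is it testing} case"
print(collect_markers(sample))  # ['first item', 'or is it testing']
```

Each pass works on a shorter string than the last, so the second pass finds marker #2 first because marker #1 is no longer in the remaining text.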

DaveM · January 14, 2025, 11:32am

Mostly for the sake of closure, I have an updated version, which I’m fairly comfortable with. It’s working in this example document, and I’ve turned it into a prototype (‘Collector’) which should make re-use a little easier. The prototype is loosely documented, and exposes attributes that make configuration a little cleaner.

Subnote Collector.tbx (185.1 KB)

This example puts matching ‘subnotes’ under ‘/Collect Q’.

Thanks to @mwra, @eastgate @satikusala @MartinBoycott-Brown for suggestions, mostly corrections! Hope it may be of use beyond just me!

andreas · February 16, 2025, 9:58pm

Has @DaveM ever finished shaving and consequently attended the meetup? If so, which meetup was it? @satikusala

satikusala · February 17, 2025, 12:03am

I’ve not seen anything.

andreas · February 17, 2025, 7:21am

Are you, @DaveM, going to attend an upcoming meetup, as envisioned last summer by @satikusala?

DaveM · February 20, 2025, 12:01pm

Hi hi hi, sorry I’ve been quiet, I’ve been horribly sick, amongst an array of interrelated tribulations.

Uh I should possibly get something properly organised. And I still need to hunker down for your kind offer of tinderbox-thesis trauma, @satikusala.

(huffs more industrial-grade decongestant)