My first loop function attempt...close but a no go

TomD · March 9, 2023, 3:40pm

I am trying to create my first loop function…and after several attempts, it still fails.

Here is the $Text I am trying to clean (leftover HTML code from Devon)

— Input $Text via a stamp
1

2

—Output I want

1

2

–
basically I want to remove the entire image tag from here “<img” … to here “/>”
and replace it with nothing.

Here is the function I made which does not work.

function HTMLCleaner(x){
$Text.eachLine(x) {
if(x.beginsWith(“<img “)) {
x.paragraphs.replace(paragraphs,””)
}
}
}

Here is the sample file. Any thoughts?

loopEx.tbx (167.7 KB)

Signed
A stumped tinderboxer

eastgate · March 9, 2023, 4:15pm

Your problem isn’t with the loop! The problem is, what do you want to do inside the loop?

Let’s begin with a description in plain language. It might be something like this:

Remove all lines that begin with <img , and leave the rest.

That’s reasonable, but it rings a little alarm bell: we’re changing something ($Text) at the same time we’re walking through it. That can be OK, but you have to think carefully. When you are sitting on a branch with a saw, the relative positions of yourself, the saw, and the tree trunk are suddenly important.

Another way to say this might be:

Make a copy of $Text, omitting lines that start with <img, and save the result in $Text.

function HTMLCleaner(){
  var:string result;
  $Text.eachLine(x) {
    if(!x.beginsWith("<img ")) {
      result = result+x+'\n';
    }
  }

  $Text=result;
}
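
For anyone who finds it easier to read the logic in a general-purpose language, here is a rough Python sketch of the same copy-and-filter idea. It is only an illustration, not Tinderbox action code; the function name `strip_img_lines` and the sample text are made up for the example.

```python
def strip_img_lines(text: str) -> str:
    """Return a copy of text without the lines that start with '<img '."""
    kept = []
    for line in text.splitlines():
        # Keep every line except those beginning with the image tag.
        if not line.startswith("<img "):
            kept.append(line)
    # As in the function above, each kept line is followed by a newline.
    return "".join(line + "\n" for line in kept)


# Hypothetical sample input: two text lines with an image tag between them.
sample = '1\n<img src="a.png" />\n2\n'
cleaned = strip_img_lines(sample)
print(repr(cleaned))
```

Because the result is built in a separate variable and only assigned at the end, the text being walked through is never modified mid-loop, which sidesteps the saw-and-branch problem.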

loopEx revised.tbx (218.1 KB)

mwra · March 9, 2023, 5:55pm

I’ve a simpler, if less elegant, alternative that works without a loop. Add a note at root called “log”. We’ll write to that just to avoid the saw/branch issue during testing.

Here’s your stamp:

var:string vText;
vText = $Text.replace("<[^>]+>","");
$Text(/log) = vText;

Regex logic:

match a left angle bracket <
match one or more characters that are anything except a right angle bracket >
match a right angle bracket >
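
The three steps above can be tried out in Python's `re` module. This is a sketch to illustrate the pattern only: Tinderbox's regex engine may differ from Python's in details, and the sample text is invented for the example.

```python
import re

# "<", then one or more characters that are not ">", then ">".
TAG = re.compile(r"<[^>]+>")

# Hypothetical sample input: two text lines with an image tag between them.
sample = '1\n<img src="a.png" />\n2\n'

# Replace every match with nothing.
without_tags = TAG.sub("", sample)
print(repr(without_tags))
```

In this sketch the tag itself disappears but the line break that followed it stays, so an empty line is left where the tag used to be.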

Assumptions:

There is only one HTML tag per target line
We know (via HTML documentation) that a tag cannot contain a literal angle bracket, i.e. < or >.

If you want to delete the whole line (i.e. including the line break after the tag ending) the stamp code is this:

var:string vText;
vText = $Text.replace("<[^>]+>\n","");
$Text(/log) = vText;

The point of the regex recap and assumptions is to show how precise such an approach has to be. For instance, this would fail if a target line had more than one tag, e.g. <div><img/></div>.
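
To make that failure case concrete, here is a small Python sketch of the whole-line pattern applied to a single-tag line and to a line with nested tags. It assumes Python's `re` semantics, which may not match Tinderbox's engine exactly, and both sample strings are invented for the example.

```python
import re

# A tag immediately followed by a line break.
WHOLE_LINE = re.compile(r"<[^>]+>\n")

# Hypothetical inputs: one tag per line, and several tags on one line.
single = '1\n<img src="a.png" />\n2\n'
nested = '1\n<div><img src="a.png" /></div>\n2\n'

single_result = WHOLE_LINE.sub("", single)
nested_result = WHOLE_LINE.sub("", nested)

print(repr(single_result))
print(repr(nested_result))
```

Only a tag that sits directly before the line break can match, so in the nested case just the closing `</div>` and its line break are removed while `<div><img ... />` is left behind.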

That’s why I think the loop solution is more elegant. I thought it too had a literality trap, because the .beginsWith() operator only takes a literal input, so an image tag line that started with a space might fail. It doesn’t! Happily, it looks like .beginsWith() (and, I assume, .endsWith()) silently trims the tested string before testing.

In light of this, if you wanted to use the regex approach and were concerned about leading/trailing whitespace (including tabs and non-breaking spaces), you could go with this stamp:

var:string vText;
vText = $Text.replace("\s*<[^>]+>\s*","\n");
$Text(/log) = vText;
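
A quick way to see what this whitespace-tolerant pattern does is to run it in Python. Treat this as a sketch under the assumption that `\s` behaves as it does in Python's `re` (where it also matches line breaks); Tinderbox may differ, and the sample strings are invented.

```python
import re

# Optional whitespace, a tag, optional whitespace.
PADDED_TAG = re.compile(r"\s*<[^>]+>\s*")

# Hypothetical inputs: a tag padded with spaces, and a tag surrounded by blank lines.
padded = '1\n  <img src="a.png" />  \n2\n'
blank_lines = '1\n\n<img src="a.png" />\n\n2\n'

padded_result = PADDED_TAG.sub("\n", padded)
blank_result = PADDED_TAG.sub("\n", blank_lines)

print(repr(padded_result))
print(repr(blank_result))
```

Under that assumption the padded tag is removed cleanly, but blank lines next to a tag are swallowed too, because the whitespace match runs across line breaks.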

I still think the loop solution is better as it avoids regex, which, as we know, can go unexpectedly wrong.

TomD · March 9, 2023, 6:52pm

Wow…why does it seem so obvious after you and MarkB demonstrate the code answer and explain what you did! Still not automatic for me yet. I am trying to write in pseudocode first…will also try flowcharting to see which I like.

I am going to work through more examples on my own. Thank you both for the wonderful explanations and different techniques!

One last question:
One thing I am just beginning to learn is to write my code line by line to see if each line works. Although Tinderbox does not have a “breakpoint” marker like some code editors (not a criticism, just an observation), I notice you and MarkB are using $Text(/log).

Is there a way I can use the create operator with $Text(/log) to make a pseudo-breakpoint and test line by line?

Thanks again
Tom