Code to Parse Text with a prefix + a delimiter "," into separate notes

I am stuck here…and need a bit of help to code a function

I have a note with a line I would like to parse. I use the "ppl " prefix for a group of people, with "," as the delimiter.

like so:

blah blah blah
ppl tom, steve, frank

I would like 2 options with the parsing:

  1. Create 3 “Person” notes
  2. Add each of them individually to a List attribute named MyPeople on the note containing the people

I have included a sample tbx below.

Thanks in advance
Tom

Parsing Separate People.tbx (172.0 KB)

Late here, so no time to test but I do note this:

In the example, the second line begins with spaces before the first text. Computers don’t think like humans. That’s not a show stopper, but you need to consider what you’d allow to occur in the line before ‘ppl’ without the test giving a false positive. I’m sorry if that sounds pedantic (that’s not my intent here), but not being precise tends to lead to unexpected failures when we don’t test the condition we specified.

Take a look at the Stream Processing and parsing tools.

OK, I added a container /People for any new person(s) created (TBX is below). The stamp “Parse for people” has this code:

var:string vPeople;
$Text.skipTo("ppl").captureLine(vPeople);
vPeople = vPeople.trim().replace(" *, *",";");
vPeople.each(aPerson){
	var:string vPath;
	vPath = create("/People",aPerson.capitalize);
	$Prototype(vPath) = "Person";
	$MyPeople+=aPerson.capitalize;
};

The issue of possible leading spaces before the ‘ppl’ became moot, as the Stream.skipTo() operator only takes a literal string value. As used, it searches the text of the stream for the first match of the literal string ‘ppl’, so you want to avoid earlier text like “Topic: but apples”, as that would contain the first instance of ‘ppl’. The match is case-sensitive, so the code above won’t match ‘Ppl’. This is a case where you may want to revise your target marker (here ‘ppl’) to something less likely to give a false match in the preceding text. Here, I’ve stayed with the original marker; renaming it is an exercise left for the reader.
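To make the false-match risk concrete, here is a minimal sketch in Python (not Tinderbox action code). It assumes that a first-occurrence substring search is a fair stand-in for the literal match described above. Both function names and the line-anchored alternative are hypothetical illustrations, not Tinderbox operators.

```python
import re

def capture_after_first_match(text, marker):
    """Rest of the line after the FIRST literal occurrence of marker."""
    pos = text.find(marker)  # case-sensitive literal search
    if pos == -1:
        return ""
    rest = text[pos + len(marker):]
    return rest.split("\n", 1)[0]

def capture_after_line_marker(text, marker):
    """Rest of the first line that STARTS with marker (leading spaces allowed)."""
    pattern = r"^[ \t]*" + re.escape(marker) + r"[ \t]+(.*)$"
    m = re.search(pattern, text, re.MULTILINE)
    return m.group(1) if m else ""

note = "Topic: but apples\nppl tom, steve, frank"

# The literal search stops inside "apples" and captures only "es".
print(repr(capture_after_first_match(note, "ppl")))

# Anchoring the marker to the start of a line skips the false match.
print(repr(capture_after_line_marker(note, "ppl")))
```

The second function shows one way a stricter test could behave; in Tinderbox itself, the simpler fix suggested above is to pick a marker that is unlikely to occur in ordinary text.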

Here is the adjusted demo file. I changed the new people so they didn’t overwrite the originals, plus I moved all ‘Person’ notes to a ‘People’ container. Note that when using create, a new note is made as the next sibling of the current note or as the last child of a target container. That is why it made sense to add a ‘People’ container. You can easily adjust the stamp code if you don’t want to use a container.

Here is the TBX used in my test: Parsing Separate People-ed1.tbx (228.5 KB)


Thank you Mark for working through the code with me. At my current stage in coding, it’s like someone learning a new language. I have studied the nouns and verbs and other parts, but I still need work with my logic, building programming sentences. One day at a time.
Question: Instead of making vPeople a string variable, since there were multiple items, I was heading in the direction of making vPeople a list. If so, I would have used collect.each(aPerson) instead. Would this have been an alternate way of looking at the problem as well?
Thanks as always for all your help.
Tom

Yes, you are absolutely correct: vPeople could have been a List-type variable, but actually using both a String and a List makes things clearer (see below). The original code works because Tinderbox is clever enough that if a string (vPeople) containing semicolons is chained to .each(), it parses the string as a list.

Therein lies the difference between rushing through to get a solution and writing the solution as an aid for others. So, better would be:

var:string vPeopleStr;
var:list vPeopleList;

$Text.skipTo("ppl").captureLine(vPeopleStr);
vPeopleList = vPeopleStr.trim().replace(" *, *",";");
vPeopleList.each(aPerson){
	var:string vPath;
	vPath = create("/People",aPerson.capitalize);
	$Prototype(vPath) = "Person";
	$MyPeople+=aPerson.capitalize;
};

N.B. In the TBX below, $MyPeople in the last example above is updated to $MyPeopleSet, for reasons which will become apparent below.

The process could be made more succinct, but I think this version makes each step clear. Thus, we:

  • scan $Text for a marker, capturing the rest of that line to a variable
  • trim and tidy the string into Tinderbox list form
  • iterate the list
  • in-loop we:
    • make a new per-person note and apply the ‘Person’ prototype
    • add the person to the stamped note’s $MyPeople
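The steps above can be sketched outside Tinderbox as follows. This is a minimal Python illustration of the logic only, not action code: parse_people is a hypothetical helper, the dictionary is a made-up stand-in for notes and their attributes, and Python's capitalize() is only an approximation of Tinderbox's .capitalize.

```python
import re

def parse_people(text, marker="ppl"):
    """Find the marker, take the rest of its line, return tidied names."""
    pos = text.find(marker)
    if pos == -1:
        return []
    line = text[pos + len(marker):].split("\n", 1)[0]
    # Same idea as .trim().replace(" *, *", ";"): normalize the delimiters.
    normalized = re.sub(r" *, *", ";", line.strip())
    return [name.capitalize() for name in normalized.split(";") if name]

note_text = "blah blah blah\nppl tom, steve, frank"

notes = {}       # hypothetical stand-in for notes created under /People
my_people = []   # hypothetical stand-in for the stamped note's $MyPeople

for person in parse_people(note_text):
    notes["/People/" + person] = {"Prototype": "Person"}
    my_people.append(person)

print(my_people)      # ['Tom', 'Steve', 'Frank']
print(sorted(notes))  # ['/People/Frank', '/People/Steve', '/People/Tom']
```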

Describing that made me notice one more thing: $MyPeople is better as a Set-type attribute, so you don’t inadvertently get duplicates. The only sensible reason to use a List here is if the order of the list values matters. In that case, you might want to check whether the person is already in the list before adding them, like so:

var:string vPeopleStr;
var:list vPeopleList;

$Text.skipTo("ppl").captureLine(vPeopleStr);
vPeopleList = vPeopleStr.trim().replace(" *, *",";");
vPeopleList.each(aPerson){
	var:string vPath;
	var:string vPerson = aPerson.capitalize;
	vPath = create("/People",vPerson);
	$Prototype(vPath) = "Person";
	if(!$MyPeopleList.contains(vPerson)){
		$MyPeopleList+=vPerson;
	}
};
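As a rough illustration of the trade-off between the two approaches, here is a small Python sketch (not action code). It assumes Python's built-in set and list are reasonable analogues of a Set-type and a List-type attribute; add_unique is a hypothetical helper that mirrors the contains-check in the loop above.

```python
def add_unique(people_list, person):
    """Append person only if absent, keeping the original insertion order."""
    if person not in people_list:
        people_list.append(person)
    return people_list

my_people_list = []    # analogue of a List with an explicit duplicate check
my_people_set = set()  # analogue of a Set, which drops duplicates by itself

for person in ["Tom", "Steve", "Tom", "Frank"]:
    add_unique(my_people_list, person)
    my_people_set.add(person)

print(my_people_list)         # ['Tom', 'Steve', 'Frank']
print(sorted(my_people_set))  # ['Frank', 'Steve', 'Tom']
```

The list keeps the order in which people were first seen, at the cost of a manual check; the set needs no check but does not promise that order.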

I’ve updated the doc and replaced $MyPeople with a Set, $MyPeopleSet, and a List, $MyPeopleList. There are now 2 stamps: one for the Set-based method and one for the List-based method (avoiding dupes):

Updated TBX: Parsing Separate People-ed2.tbx (242.6 KB)
