Tinderbox Forum

Help with Regex and template code?

Related to my Zettelkasten file export project, I’m trying to see if I can use regex to convert lists of outbound and inbound links to wiki formatting and send the result to an attribute for export. I’m mostly only good at adapting existing regexes, though; I don’t know enough to write them from scratch at this point.

Here’s some sample data:

* <a href="201906061128 social practices and mediation.html">social practices and mediation</a>

* <a href="201906061114 appropriating mediational means.html">appropriating mediational means</a>

What I want to achieve is this:

* [[201906061128]] social practices and mediation
* [[201906061114]] appropriating mediational means

I have the first bit of code here:

$inLinksTrans=$inLinks.replace('<a href="',"");

However, I’m not sure how to construct the next two regexes. Basically, I want to do the following:

  • remove .html and everything after it on each line
  • find each 12-digit number and wrap it in [[ ]]

Would love some guidance from an expert if anyone has the time. Thanks!

$MyString=$Text.replace('<a href="',"").replace('\.html.*',"").replace("(\d{12})","[[$1]]");
$Text = $MyString;

For some reason, back-references don’t work if you act on $Text (specifically) as opposed to a string.

Amazing! Thanks Mark.

However, I’m not needing to mess with $Text it seems. The following worked for me:

$inLinksTrans=$inLinks.replace('<a href="',"").replace("(\d{12})","[[$1]]").replace('\.html.*',"");

But don’t forget the .replace('\.html.*',"") part to trim off the excess.
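For anyone who wants to check the three patterns outside Tinderbox, here is a minimal sketch of the same chain in Python. It is an approximation, not Tinderbox action code: the function name to_wiki_links is made up for illustration, Python writes the back-reference as \1 rather than $1, and it assumes Tinderbox's regex engine treats these patterns the same way Python's re module does.

```python
import re


def to_wiki_links(html_list: str) -> str:
    """Hypothetical helper mirroring the .replace() chain from the thread."""
    # Step 1: drop the literal opening of each anchor tag.
    text = html_list.replace('<a href="', "")
    # Step 2: wrap every 12-digit ID in [[ ]] (Python uses \1 where Tinderbox uses $1).
    text = re.sub(r"(\d{12})", r"[[\1]]", text)
    # Step 3: remove ".html" and the rest of the line. "." does not match a
    # newline by default, so each list item is trimmed separately.
    text = re.sub(r"\.html.*", "", text)
    return text


sample = (
    '* <a href="201906061128 social practices and mediation.html">'
    "social practices and mediation</a>\n"
    '* <a href="201906061114 appropriating mediational means.html">'
    "appropriating mediational means</a>"
)

result = to_wiki_links(sample)
print(result)
```

With the sample data from the first post, this prints the two wiki-style lines asked for. The last two steps can be swapped without changing the result, because the 12-digit ID always sits before the .html suffix.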

Ok, another complication. I’m trying to use an agent to assign this action to notes that have inbound links. The agent is in the root container, while the original notes are in another container. What I’m getting now is some extra path information in the replaced text:

* ../Container-Name/[[201906061128]] - social practices and mediation

Is there a way to programmatically figure out what the container name is and strip it out with the regex?

Or, I suppose the easier way around is just to chain another .replace to the end to excise everything between a period and a forward slash …

Ok, I got it. Not sure this will work in all cases going forward, but it works now:

.replace("(\.).*(\/)","") (stuck onto the end of the chain posted earlier)
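On the "not sure this will work in all cases" point, here is a rough Python sketch of how that pattern behaves. It is only an illustration under the same assumption as before (that Tinderbox's regex engine matches these patterns the way Python's re module does), and both function names are hypothetical. Because .* is greedy, the pattern removes everything from the first period on the line to the last forward slash, so a note title containing a slash would lose its ID as well. The second function is one possible narrower pattern; it relies on a lookahead, which I have not confirmed Tinderbox supports.

```python
import re


def strip_path_greedy(line: str) -> str:
    """Same pattern as in the thread: first "." through the last "/" on the line."""
    return re.sub(r"(\.).*(\/)", "", line)


def strip_path_narrow(line: str) -> str:
    """Hypothetical alternative: remove only a "../segment/" prefix that sits
    directly in front of the [[ID]] marker."""
    return re.sub(r"\.\./(?:[^/\n]*/)*(?=\[\[)", "", line)


plain = "* ../Container-Name/[[201906061128]] - social practices and mediation"
slashed = "* ../Container-Name/[[201906061128]] - input/output mediation"  # made-up title

print(strip_path_greedy(plain))
print(strip_path_narrow(plain))
print(strip_path_greedy(slashed))
print(strip_path_narrow(slashed))
```

Both versions clean up the line from the agent output above. On the made-up title containing a slash, the greedy version also swallows the ID and the start of the title, while the narrower one leaves them alone.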