URL scraping from a DEVONthink note (HTML file) into Tinderbox?

I have a DEVONthink HTML file with several embedded hyperlinks. See the sample file included here, which I dragged and dropped into Tinderbox. Simple.

CreateNewNoteWithURL.tbx (158.1 KB)

What I want to do:

  1. Create a new note with each title of the hyperlink as the $Name
  2. Save the URL hyperlink to $URL

Any thoughts on how to get started?
I think this is a very practical use case: parsing and scraping URLs from a DEVONthink note into a Tinderbox note… at least for me.

Thanks
tom

A quick sketch of one approach:

  1. eachLink(x){...} iterates through each link of the note, binding in turn information about each link to the dictionary x.
  2. The link anchor of each link is x[anchor].
  3. The destination URL of each link is x[url].
  4. You can then create the note using create(), and set the $URL of the new note.

Hmm, the links in the $Text are not actually linked. I’m not sure this would work. I tried:

$Text.eachLink(x){

create(x);

};

It did not work.
The first step is to pull the URLs from the text into a list and then iterate over them. I’ve got AppleScript somewhere to do this, or maybe we could use a stream operator. I don’t have time to work on this right now.

x is a dictionary, such as {source: /path/to/a; anchor: "fruity goodness"; …}

I tested the following:

$Text.eachLink(x){
   var:string dname=x["anchor"];
   var:string c="/Data/testing/";
   var:string path=create(c,dname);
   $URL(path)=x["url"];

};

Using the file above, I created a stamp with the function Mark provided.

It did not work for me either. Here is the file I am using.

CreateNewNoteWithURL.tbx (178.0 KB)

Tom

You had a few syntax errors. The function body is enclosed in braces:

function ParseDevonLinks() {
$Text.eachLink(x){
   var:string dname=x["anchor"];
   var:string c="/Data/test";
   var:string path=create(c,dname);
   $URL(path)=x["url"];
   }
}

The stamp definition needs to actually invoke this function:

ParseDevonLinks();

Tested using a recent backstage build, b614.

CreateNewNoteWithURL-2.tbx (256.5 KB)

Works perfectly now. Thanks MarkB. Headliner: that is awesome…

Rookie coding mistakes… :frowning: and frustrating.

Tom

One question:

var:string c="/Data/test";

How would I change this to designate the $Path under the note being stamped?

Would it be:

var:string c=$Path(child);

Thanks
Tom

If you want to use this note as the container, just pass a single argument to create(), the name of the note you want to create.
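
As a minimal sketch of that suggestion: this is an untested adaptation of the ParseDevonLinks() function above, with the container variable dropped. It assumes that single-argument create() returns the new note's path, as the two-argument form does in the tested version.

```
function ParseDevonLinks() {
   $Text.eachLink(x){
      // the link's anchor text becomes the new note's $Name
      var:string dname=x["anchor"];
      // one argument only: per the reply above, the stamped note is the container
      // assumption: create() still returns the new note's path here
      var:string path=create(dname);
      $URL(path)=x["url"];
   }
}
```

The stamp itself stays the same: it still just calls ParseDevonLinks();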

OMG!!! This is so very cool. :slight_smile:

In this case, the note is $ReadOnly (as part of the DEVONthink watch folder process), but here are a couple of side notes for the later reader wanting to use this with other notes.

eachLink() lists all links, so potentially there may be 3 outbound link types (basic, text, web) and 2 inbound link types (basic, text). How then to tell these apart?

In vs Out. Outbound links will have the same $ID as the note whose links are being read (i.e. this, unless using the optional scope argument to read from a different note).

Basic. Has no anchor text.

Text. Has anchor text but no URL.

Web link. Has a URL (and anchor text, but the URL is the signifier).

Conversely, when checking a single link in the list iterated by eachLink(), you can tell all 3 apart like this:

eachLink(aLink){
   if(aLink["url"]!=""){
      // this is a web link
   }else{
      if(aLink["anchor"]!=""){
         // this is a text link (has anchor but no url)
      }else{
         // this is a basic link (has neither anchor nor url)
      }
   }
};
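
Tying this back to the original task, here is a possible sketch (untested) that creates notes only for web links, so that any basic or text links on an ordinary note are skipped. It uses only calls already shown in this thread (eachLink(), create(), $URL). The assumption, flagged in the comments, is that single-argument create() returns the new note's path, as the two-argument form does in the earlier tested example.

```
eachLink(aLink){
   if(aLink["url"]!=""){
      // web link: make a child note named after the anchor text
      // assumption: create() returns the new note's path
      var:string path=create(aLink["anchor"]);
      $URL(path)=aLink["url"];
   }
};
```

Links with no URL fall through without creating anything, which is the point of the check.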

Side note (as discovered testing links today for a separate issue): eachLink() is currently (v9.5.2) unable to report whether a basic or text link has target anchor text (the anchor value applies to the anchor text at the source). This data is also omitted from the Browse Links dialog. Indeed, at present you need to look at the doc’s XML to find which links have target text anchors. There’s no fault here, but there is a slight chicken-and-egg problem: to-text links are allowed but not seen much as they are hard to make†/check, so this may be a reason they are not much used.

†. This got harder in the v6+ UI. To make a link to a text selection in the target note, drag a link (basic or text) to the link well on the tab bar. Now select the target note, and then select the desired text within $Text. Leaving the selection in place, drag from the link well onto the selection, release, and set the rest of the link settings as normal. You now have a link that points to text within the target note's $Text. When the link is followed, Tinderbox will ensure the target text anchor is scrolled into view. So, this might be useful if linking into notes with much $Text, albeit noting that such links are difficult to review (at least in terms of the target anchor text string).

See: missing-link-info.tbx (198.8 KB)

Select note ‘aa’, then run either of the stamps and see the results in note ‘log’. For each of the stamped note's links, the stamp ‘sort of link’ returns one of web/text/basic, and the stamp ‘direction of link’ returns either outbound or inbound.

I’ve also updated eachLink(loopVar[,scope]){actions}.