Tinderbox Forum

$AutoFetchCommand to simplify fetched $Text?

I want to $AutoFetch the contents of a web page at $URL, and I’ve assumed I could use $AutoFetchCommand to post-process the fetched data and replace the fetched $Text using some sort of command-line driven text simplification utility. (Maybe Pandoc? Not sure.)

I’m not clear on the operation of $AutoFetchCommand. The examples at aTbRef are confusing. E.g., one example has the command in $AutoFetchCommand replace $Text with the list of files in ~/Documents. Why would someone fetch text from a website and then have Tinderbox replace that text with a file list? I want to operate on the fetched $Text and replace it with the output of some simplification utility. I don’t want to discard the fetched $Text, as the examples seem to.

Anyway, I need some pointers to find a command-line or web-based “web site simplification” utility, and then how to make that work to clean up a page at $URL fetched when $AutoFetch is true. It all looks possible, but the exact sequence of events eludes me.

(This is going to be baked into a prototype for notes created by dragging .weblocs to a Tinderbox document.)

Sorry for the confusing notes. It was probably about 10 years back that $AutoFetchCommand was last discussed. @eastgate can give a better idea of the original concept behind $AutoFetch & $AutoFetchCommand.

My hunch is you might do better with runCommand() to either pass a script or better pass arguments to a shell script using curl (or wget?) with the pandoc manipulation. Lest I imply otherwise, I am expert in none of these. :frowning:

$AutoFetchCommand is a Tinderbox action that is run on this note immediately after a network AutoFetch has been performed.

Yes, I know that. As I mentioned, I want to know how to use $AutoFetchCommand to replace the text fetched from a website into this note: take the contents of $Text, run it through some utility (??? what) that “simplifies” the text by essentially converting it to markdown or plain text, and set $Text to that simplified version of the original $Text from this note.

This might be the wrong forum to be asking about text conversion. You can delete this thread.

I don’t think this is the wrong forum. Quite what the target script is, I’m less sure, and indeed there might be better places to source that (please don’t read that as a rebuke). But, once we have the script, we can help tune the action code call.

Sorry, the examples in aTbRef’s $AutoFetchCommand page are confusing. If nothing else, they use the old (deprecated?) back-tick instead of runCommand() (q.v.), so I need to update those. The code used is, I agree, confusing. When it was written, years ago, I knew less about the command line. If there are better actual examples to use, I’d be happy to use those instead. If nothing else, this has smoked out an out-of-date aTbRef page and it’s now on my to-do list.

Per Mark’s confirmation above, $AutoFetchCommand simply stores the action to run. So although it is a String-type attribute, ISTM it could be thought of as if it were an Action-type one. (Indeed, should it become one?)

You will likely use runCommand() to call your script. The action will need to identify the (external) script/app to run and pass parameters to it, one of which will be the newly $AutoFetched text.
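As a rough illustration of the kind of external script such an action could call, here is a minimal sketch in Python using only the standard library. It is not a Tinderbox feature and not a substitute for a real reader-mode tool: the script name, the tag lists and the `simplify_html` function are all hypothetical, and it assumes the fetched HTML can be handed to the script (here on stdin, by passing `-` as the argument).

```python
#!/usr/bin/env python3
"""Reduce HTML to plain text. Hypothetical helper, not part of Tinderbox."""
import sys
from html.parser import HTMLParser

# Tags whose contents are never readable text (assumed list, adjust as needed).
SKIP_TAGS = {"script", "style", "head", "noscript"}
# Tags that should force a line break in the plain-text output.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping SKIP_TAGS and breaking lines at BLOCK_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def simplify_html(html):
    """Return the readable text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def main(argv):
    """Read HTML from the file named in argv[1], or from stdin if it is '-'."""
    if argv[1] == "-":
        html = sys.stdin.read()
    else:
        with open(argv[1], encoding="utf-8") as handle:
            html = handle.read()
    sys.stdout.write(simplify_html(html) + "\n")


# Only act as a command-line filter when an argument is supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

If the script were saved somewhere like ~/bin/simplify_html.py (a made-up path), the action in $AutoFetchCommand might look something like $Text = runCommand("python3 ~/bin/simplify_html.py -", $Text); but treat that as an untested assumption: check the runCommand() page for the exact arguments, and check whether $Text still holds raw HTML at the point the action runs.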

Unless the target HTML pages are expected to change and you want the Tinderbox note to reflect that, I’m not sure $AutoFetch is necessarily needed as otherwise it will be a background process running that you don’t need. But I accept that this partly stems from a start point of using a dragged webloc.

My $AutoFetch article is perhaps unclear (again old) as the bullet points are referring to $URL, i.e. $URL must be set (a webloc drag may do this?).

You might want to drag the webloc to set $URL, then set $AutoFetch & $AutoFetchCommand (via a stamp?). Then, if this is a once-only import/parse, use another stamp to reset $AutoFetch & $AutoFetchCommand. This leaves you with the parsed web content and no superfluous update processing. It also occurs to me that, if doing this for multiple notes, another angle is to store the script in a code note and call it from there, or via a stamp. Either way it might make it easier to see the whole code without needing to use an external text editor.

Bit busy, but I’ll keep a look out for an HTML->Markdown script.

I can’t get AutoFetchCommand to return expected results, though it is doing something.

When doing conversions, if Data > Convert in DEVONthink doesn’t do the trick, then textutil can often do a decent job.

https://eclecticlight.co/2018/03/28/free-conversion-of-text-files-with-textutil/

More at textutil man.
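For anyone wanting to try textutil from a script, a minimal Python sketch follows. It assumes macOS (textutil is not available elsewhere) and uses the -convert, -format, -inputencoding, -encoding, -stdin and -stdout options as described in the textutil man page; the function names are hypothetical, and the result is whatever textutil makes of the page, not a Safari Reader-style cleanup.

```python
import shutil
import subprocess


def textutil_command(target_format="txt"):
    """Build a textutil call that reads HTML on stdin and writes to stdout."""
    return [
        "textutil",
        "-convert", target_format,   # output format, e.g. txt or rtf
        "-format", "html",           # treat the input as HTML
        "-inputencoding", "UTF-8",   # assumption: the input is UTF-8
        "-encoding", "UTF-8",        # ask for UTF-8 plain-text output
        "-stdin",
        "-stdout",
    ]


def html_to_text_with_textutil(html):
    """Return textutil's plain-text rendering, or None if textutil is missing."""
    if shutil.which("textutil") is None:
        return None  # not on macOS, or textutil is not on the PATH
    result = subprocess.run(
        textutil_command(),
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

The same command line could in principle be passed to runCommand() directly, without Python in between; the wrapper is only here to show the options in one place.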

And the new external scripting support may be easier than going through runCommand(). For example:


@sumnerg: send me the test file, please

I’m going to skinny down my request: forget the rest of the thread above.

All I want is to watch a website and have the content that is put into $Text for that note “simplified”, meaning the same thing that Safari Reader mode does. I don’t want to have to fiddle with formatting or anything.

Well, let’s try this:

URL: http://www.eastgate.com/
AutoFetchCommand: $MyString=$Text; $Text=$MyString

This will take the system’s best guess at the formatted text, discard the styles, and put the unstyled text back into $Text.