Export individual HTML files from agent results

Why not export the whole document, and then grab the directory you want? Export is fast, and it’s now roughly twice as fast as it was.

For several reasons:

  1. it takes 90+ seconds to export my file and crashes half the time when I do it (I will run this operation a number of times)
  2. I’m pulling notes from agents and have the agent adjust the file before the export, on occasion
  3. I don’t need all 3,300 notes sitting on my hard drive, I only want the 11~75
  4. I have the agent cherry-picking notes from sub-directories based on priority, so going through individual files to pull the ones I want won’t be practical at all, especially given that agent results don’t appear to be saved as a directory in the export.

Given the above, unless I can find a script that saves each result from an agent as an individual HTML file, I’ll need to save each one individually with ‘export selected note’.

The answer, today at least, is judicious use of $HTMLDontExport & $HTMLExportChildren. It feels like we’ve been round this before, but with the exception of things like the /Prototypes container, the default/design intent is exporting the whole file when using HTML export. When both attributes are set to their non-default values on a note, that note and all its descendants will not export, so a full export can then export only a selected range of notes. But messing around with this method for changing ad hoc selections is a recipe for a misconfigured export. In a large, complex doc, ‘seeing’ what exports and what doesn’t is difficult, especially if affordances like note colour or badges are already used for some other indicator.

The speed of exporting the whole doc is an irrelevance, it just shifts the basic problem of getting the desired items to a different context. Scale is a non-trivial factor here. The UX of finding 10 discrete files out of an overall 50 files is significantly different to 10 out of 1,000.

I recall the genesis—in part or whole—of the current ‘export selected note’ feature was the problem I reported of trying to fix a single article in aTbRef and ending up having to export the whole file. It is still easier to use ‘export selected’ 20-30 times than exporting the whole doc and still having to find the bits I actually need. As said, speed of export isn’t a factor here, it is the task overall.

Meanwhile, I recall mention of a feature request (not sure if anyone followed through) for an option to export the current container and its descendants. If that were adopted it would be a reasonable trade-off, as the user could either export an agent container and use a query so that the desired note(s) were exported by being (aliased) children, or simply export only part of the doc if all the needed pages were in a narrow context.

Love your write-up.

I understand this approach, $HTMLDontExport & $HTMLExportChildren, but in my context with thousands of notes, this will become nearly impossible to manage. It will be too easy to miss something important.

The alternative of using individual export selected note will work, but it is not very efficient, especially for an operation I need to do multiple times a day, or at least multiple times in any given week. I run a variety of agents that modify my output so that I can send the content to different departments, sending them only what they need.

Yes, for me at least, the selected “export container,” in my case an agent, would be really useful. In layman’s terms, we could add a new export menu item, “Export selected notes as children.” This would give assurance that I’m getting exactly what I want, when I want it, with maximum efficiency and effectiveness.

I could see the need of constraining this feature to only agent export, that way you’d not have to deal with the case of exporting descendants as individual notes. Or, if you do let this work on containers other than agents I would suggest this feature gets constrained to only children export, the descendants are more than likely part of the child’s makeup, and if not you could simply go down a layer and export the child containers. These seem like reasonable tradeoffs.

If you have a reproducible crash, send the document and instructions for reproducing it. (You’re doing some very tricky HTML export work, including stuff that’s never been attempted before, just as we’re updating HTML export — in part — to accommodate it. Not surprising that there may be problems.)

The current backstage export is about twice as fast as 8.9.2, by the way.

Try this:

-- Have an export folder ready. Select Tinderbox notes and run. 
-- NB. Will overwrite existing files of same name.

set theFolder to (choose folder with prompt "Choose a folder to receive the exported HTML files")

tell front document of application "Tinderbox 8"
	repeat with aNote in selections
		tell aNote
			set theFilePath to POSIX path of theFolder & (value of attribute "Name") & ".html" -- name file after note
			set theHTML to evaluate with "exportedString(this,$HTMLExportTemplate)"
			do shell script "touch " & quoted form of theFilePath -- create file if doesn't already exist
			do shell script "echo " & quoted form of theHTML & "> " & quoted form of theFilePath -- write to file
		end tell
	end repeat
end tell

This is a simple adaptation of the Markdown export script in this thread. It assumes an HTML export template is configured. Have an export folder ready on your system. Select the aliases gathered by the agent, or the notes in another container (as opposed to the agent or other container itself; if you select that you’ll get one file), and run. If all goes well the result will be individual .html files in the chosen folder.

Thanks @sumnerg this works great!!! It will suit my needs perfectly.

Now, on to the next step/ideas to refine the process even more. I don’t know how to write AppleScript, so I could use some guidance.

  1. is there a way to have the AppleScript remove spaces in the name of the file that is being created, or replace the spaces with an underscore, _?
  2. once this is accomplished can we have the AppleScript feed the newly created .html file to Pandoc with the following command line, e.g. “pandoc -s OB#6PGIDapp.html -o OB#6PGIDapp.pdf”. This will have Pandoc create a PDF file from the newly created HTML file. The reason for removing the spaces is that Pandoc does not like filenames with spaces.

Thoughts?

PS: thanks again. The apple script will save me a lot of time.

Thank you for this great script, works great! One question: do you (or anyone else) know of any reliable way to batch-convert HTML files into Markdown other than Pandoc? I’m having difficulties running Pandoc because I’m on an M1 Mac, and after trying to install it a few times with no results I gave up.

To the best of my knowledge the plan is to have an updated markdown rendering engine in TBX 9. You’ll be able to create markdown files natively in TBX and export them as markdown as well, if all goes as expected. So keep an eye out for the 9.0 release.

To be clear, between now and then you can write markdown in TBX 8.9 and export it as markdown, but in TBX 8.9 and below the preview won’t be as nice as what is expected in TBX 9.0. All you need to do is create a markdown template. Let me know if you need help with this.

Thank you for your quick response!
Good to know that the new version will get better markdown support! :heart: Most of my notes are written with some HTML code inside them, so it would require some additional work to rewrite them into markdown. From what I know after watching the last few Tinderbox Meetups, TBX 9 is just around the corner and should land in a few weeks’ time (if everything goes according to plan and those plans don’t change, of course), so I will just patiently wait for it as my case is not so urgent.
Thank you for offering to help, I really appreciate it!

Me too! I’d love to have my cake and eat it too, i.e. have both HTML and markdown in my notes. I’m still testing, but I think it may work.

Yes, you can do something like this to have underscores instead of blanks:

-- Have an export folder ready. Select Tinderbox notes and run. 
-- NB. Will overwrite existing files of same name.

set text item delimiters to "_"

set theFolder to (choose folder with prompt "Choose a folder to receive the exported HTML files")

tell front document of application "Tinderbox 8"
	repeat with aNote in selections
		tell aNote
			set theFilePath to POSIX path of theFolder & (words of (value of attribute "Name" as text) as text) & ".html" -- name file after note
			set theHTML to evaluate with "exportedString(this,$HTMLExportTemplate)"
			do shell script "touch " & quoted form of theFilePath -- create file if doesn't already exist
			do shell script "echo " & quoted form of theHTML & "> " & quoted form of theFilePath -- write to file
		end tell
	end repeat
end tell

If you don’t want underscores and instead want to remove the spaces between words then just remove the first line in the script. Or keep that line and remove the underscore between the quotes.
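If the files have already been exported with spaces in their names, the same clean-up can be done afterwards from the command line. This is only a sketch, assuming a POSIX-style shell; the function names `underscore_name` and `underscore_folder` are made up for illustration and are not part of Tinderbox or Pandoc.

```shell
# Replace every space in a name with an underscore.
underscore_name() {
  printf '%s\n' "$1" | tr ' ' '_'
}

# Rename each .html file in the given folder so its name has no spaces.
# (Hypothetical helper; will overwrite a like-named file, as the script above does.)
underscore_folder() {
  local dir="$1" f base new
  for f in "$dir"/*.html; do
    [ -e "$f" ] || continue                  # the glob matched nothing
    base="$(basename "$f")"
    new="$(underscore_name "$base")"
    [ "$base" = "$new" ] || mv -- "$f" "$dir/$new"
  done
}
```

For example, `underscore_folder ~/Desktop/export` would rename every .html file in that (assumed) folder in place. Note the quotes around every variable: they are what keeps a name with spaces in one piece.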

Thanks for heads up on difficulty with Pandoc on M1 Macs. Presume you have already seen this and you couldn’t get it to work.

@satikusala You’ll be able to create markdown files natively in TBX and export them as markdown as well

Write in Markdown and export to Markdown? Or also write in rich text and export from that to both html and Markdown “natively” (i.e. without messing about trying to install Pandoc and get it to work)?

I’ve given it another chance and successfully installed Pandoc! Thanks for the help! I’m still new to macOS (the M1 is my first Mac ever), so there are still many things that I don’t quite understand. Converting HTML files one by one through the terminal works great, but with a few hundred notes it will take some time, so I tried to apply the Markdown export script you wrote in this thread (I’ve only changed the Pandoc directory), but with almost no results, i.e. after running the script I get empty markdown files.
I think I will just wait for the TBX 9 :slightly_smiling_face:
Thanks again!

From everything I’ve read an M1 Mac is a great place to start!

If you’ve got Pandoc working via the command line then it shouldn’t be that hard to get it working via script. What is the complete command you are using from the command line that works?

It looks like this:

Mac-mini-Arek:pandoc shijianhui$ pwd
/Users/arkadiuszszlaga/pandoc
Mac-mini-Arek:pandoc shijianhui$ pandoc --version
pandoc 2.13
Compiled with pandoc-types 1.22, texmath 0.12.2, skylighting 0.10.5,
citeproc 0.3.0.9, ipynb 0.1.0.1
User data directory: /Users/shijianhui/.local/share/pandoc
Copyright (C) 2006-2021 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.
Mac-mini-Arek:pandoc shijianhui$ ls
test.html
Mac-mini-Arek:pandoc shijianhui$ pandoc test.html -f html -t markdown -s -o test.md
Mac-mini-Arek:pandoc shijianhui$ ls
test.html test.md

Thanks again for help!

And what Pandoc command did you use in the script that didn’t work (including path)?

I’ve tried twice with different paths:

-- Have an export folder ready. Select Tinderbox notes and run. 
-- Assumes Pandoc is installed at /usr/local/bin/. See https://pandoc.org/installing.html
-- NB: overwrites any like-named .md files in that folder

set prefix to "@" -- character(s) used to distinguish internal links from external ones

set pandocCmd to "/Users/shijianhui/.local/share/pandoc -f html -t markdown_mmd" -- html to MultiMarkdown
set sedCmd to "sed -E 's/" & prefix & "(\\[.+\\]).+\\)/[\\1]/g;t'" -- grab anchor, surround by [[ ]]
set cmdStr to pandocCmd & " | " & sedCmd -- assemble the "pipe"

set theFolder to (choose folder with prompt "Choose a folder to receive the exported MD files")

tell front document of application "Tinderbox 8"
	repeat with aNote in selections
		tell aNote
			set theFilePath to POSIX path of theFolder & (value of attribute "Name") & ".md" -- name file after note
			set theHTML to evaluate with "exportedString(this,$HTMLExportTemplate)"
			set theMMD to do shell script "echo " & quoted form of theHTML & " | " & cmdStr
			do shell script "touch " & quoted form of theFilePath -- create file if doesn't exist
			do shell script "echo " & quoted form of theMMD & "> " & quoted form of theFilePath -- write to file
		end tell
	end repeat
end tell

and

-- Have an export folder ready. Select Tinderbox notes and run. 
-- Assumes Pandoc is installed at /usr/local/bin/. See https://pandoc.org/installing.html
-- NB: overwrites any like-named .md files in that folder

set prefix to "@" -- character(s) used to distinguish internal links from external ones

set pandocCmd to "/Users/shijianhui/pandoc -f html -t markdown_mmd" -- html to MultiMarkdown
set sedCmd to "sed -E 's/" & prefix & "(\\[.+\\]).+\\)/[\\1]/g;t'" -- grab anchor, surround by [[ ]]
set cmdStr to pandocCmd & " | " & sedCmd -- assemble the "pipe"

set theFolder to (choose folder with prompt "Choose a folder to receive the exported MD files")

tell front document of application "Tinderbox 8"
	repeat with aNote in selections
		tell aNote
			set theFilePath to POSIX path of theFolder & (value of attribute "Name") & ".md" -- name file after note
			set theHTML to evaluate with "exportedString(this,$HTMLExportTemplate)"
			set theMMD to do shell script "echo " & quoted form of theHTML & " | " & cmdStr
			do shell script "touch " & quoted form of theFilePath -- create file if doesn't exist
			do shell script "echo " & quoted form of theMMD & "> " & quoted form of theFilePath -- write to file
		end tell
	end repeat
end tell

I’m still learning AppleScript so there is an extremely high probability that I have made a silly mistake and don’t see it.

The trick is to have the path point to where Pandoc is installed on your machine, which may vary depending on the method of installation.

Pre-M1, if Pandoc is installed via Homebrew, then the path used in the original AppleScript should work.

set pandocCmd to "/usr/local/bin/pandoc -f html -t markdown_mmd" -- html to MultiMarkdown

Sometimes the $PATH variable needs editing. When you enter
echo "$PATH"
(with the quotes around $PATH) in the command line what do you see? Does it include /usr/local/bin ?
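A quick way to find the full path to put in pandocCmd is to ask the shell where it finds Pandoc, and then fall back to the usual Homebrew locations. The sketch below assumes those two locations (/opt/homebrew/bin on Apple silicon, /usr/local/bin on Intel); `first_executable` is a made-up helper name. The full path matters because AppleScript’s do shell script runs with a minimal PATH, so a bare `pandoc` that works in Terminal may not be found from a script.

```shell
# Print the first argument that is an executable file; return 1 if none is.
first_executable() {
  local candidate
  for candidate in "$@"; do
    if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage (run in Terminal; the two fixed paths are assumptions about a Homebrew install):
# first_executable "$(command -v pandoc)" /opt/homebrew/bin/pandoc /usr/local/bin/pandoc
```

Whatever path this prints is the one to use in place of /usr/local/bin/pandoc in the script.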

If you’re still stuck then maybe @Bernard-0 or someone else who knows more about the command line can help. It shouldn’t be too hard, and is probably worth it, because Pandoc does some useful things.

Pandoc doesn’t mind filenames with spaces at all :wink: When you have a path with a blank space in it on the command line you need to escape the space or quote the path (keeping the ~ outside the quotes, otherwise it is not expanded):

cd ~/Dropbox/Michael\ Becker/
cd ~/"Dropbox/Michael Becker"

Otherwise, it would read the empty space as the end of the path.
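The same quoting carries over to converting a whole folder at once instead of one file at a time. A minimal sketch, assuming `pandoc` is on the PATH of the shell you run it in and reusing the flags from the command-line session earlier in the thread; `md_name_for` and `convert_folder` are illustrative names only.

```shell
# Derive the output name: "My Note.html" -> "My Note.md"
md_name_for() {
  printf '%s\n' "${1%.html}.md"
}

# Convert every .html file in the given folder to Markdown.
# Every path is double-quoted, so spaces in names are harmless.
convert_folder() {
  local dir="$1" f
  for f in "$dir"/*.html; do
    [ -e "$f" ] || continue                  # the glob matched nothing
    pandoc "$f" -f html -t markdown -s -o "$(md_name_for "$f")"
  done
}
```

Calling `convert_folder "$HOME/Dropbox/Michael Becker"` would then write a .md file next to each .html file in that folder, overwriting any like-named .md files.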

It’s the same in the M1 Mac :wink:
