Tinderbox Forum

HTML export templates generate unpredictable results

I have been trying to master HTML export templates. One of the problems was converting a Cyrillic $Name into a correct file name. I put this code in the note's text field:

^action($HTMLExportFileName= $Name)^
^action($HTMLExportExtension= ".html")^
Это заголовок:^value($Name)^
А это текст:
^text^
The export works irregularly: about 50% of the time it works OK, and the rest of the time the result is an unpredictable mess.
What could be the problem?

Welcome back to the forum. I’m sorry you are having a problem. As the text engine supports Unicode, using Cyrillic text should not be a problem.

If I put your code (above) into the note I am trying to export, I get an error as the ^text^ code is trying to call itself. I suspect you are trying to make a note be both a template and what it exports. Regardless, I think a reference document might help, so I’ve made one: cyrillic-test.tbx (117.9 KB)

Export seems to work for me in that TBX document. Let me know if it helps. As I don’t understand Cyrillic, I’ve used a ‘lorem ipsum’ generator to make Cyrillic ‘filler’ text (i.e. meaningless), as I wanted to ensure the test note’s title and text used the relevant characters.

Take a look at the file and use it as a reference for any questions you may have.

The default for HTML export file extensions ($HTMLExportExtension) is ‘.html’ so you should not need to set that.

You do need to set $HTMLExportFileName explicitly, as you have done. Tinderbox’s method for parsing non-HTML-safe characters out of $Name is getting confused: it treats all non-Roman characters as ‘unsafe’ and replaces each run of such characters with its substitution character, an underscore ($HTMLExportFileNameSpacer). As the whole name is one continuous run of ‘bad’ (to the parser’s eye) characters, the default export filename is ‘_.html’, which I’d agree is unhelpful. Pinging @eastgate as I suspect this isn’t intentional and is a hangover from days when the Web was less friendly to non-Roman alphabets.
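To make the ‘_.html’ result easier to picture, here is a rough sketch in Python (not Tinderbox code). It only models the behaviour described above, on the assumption that every run of characters outside ASCII letters and digits collapses to one spacer. The function name and the exact character rule are hypothetical, not Tinderbox’s actual implementation.

```python
import re


def default_export_name(name: str, spacer: str = "_", ext: str = ".html") -> str:
    """Hypothetical model of the default filename rule described above.

    Assumption: each run of characters that are not ASCII letters or digits
    is replaced by a single spacer character.
    """
    return re.sub(r"[^A-Za-z0-9]+", spacer, name) + ext


# An all-Cyrillic title is one continuous run of 'unsafe' characters,
# so the whole name collapses to a single spacer.
print(default_export_name("Это заголовок"))  # _.html
print(default_export_name("My note"))        # My_note.html
```

Under that assumption every Cyrillic-only title would export as the same ‘_.html’ file, which would explain why setting $HTMLExportFileName explicitly is needed.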

Anyway, I hope that helps, and do please ask any additional questions.

The default for $HTMLExportFileNameSpacer is currently recorded incorrectly in aTbRef as being nothing (i.e. no character) when it is in fact an underscore. I will fix this, sorry!

I realised after posting that using $Name as the HTML filename might not be ideal, because spaces in the note title are kept in the filename. So, I updated the test doc (attached below) to set the filename to:

$HTMLExportFileName=$Name.replace(" ","_");

Now, all the spaces in the $Name become underscores in the HTML filename. You still need to consider punctuation marks (full stop, comma, exclamation mark, colon, etc.) and parentheses/brackets, as these might also best be removed from the exported HTML filename.
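As an illustration of the extra clean-up, here is a minimal Python sketch (again, not Tinderbox action code) that removes punctuation and brackets, then turns spaces into underscores, while leaving Cyrillic letters untouched. The function name and the choice of which characters to drop are my assumptions; the same steps would need to be rewritten as action code to use them in a stamp.

```python
import re


def safe_filename(name: str) -> str:
    """Hypothetical clean-up: drop punctuation/brackets, spaces -> underscores."""
    # \w is Unicode-aware for str patterns in Python 3, so Cyrillic letters
    # are kept; anything that is not a word character, whitespace or a
    # hyphen is removed.
    cleaned = re.sub(r"[^\w\s-]", "", name)
    # Collapse each run of whitespace into a single underscore.
    return re.sub(r"\s+", "_", cleaned.strip())


print(safe_filename("Это заголовок: тест!"))  # Это_заголовок_тест
print(safe_filename("Hello, world (v2)."))    # Hello_world_v2
```

Removing punctuation before replacing spaces avoids stray underscores next to where a colon or bracket used to be.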

In the updated file, the first test note uses template ‘test 3’, which incorporates the above change. I’ve also discovered that setting these HTML-related attributes in the template head doesn’t work, or not until the page has been exported several times (I was using the File menu’s ‘Export Selected Note’ option). I’d assumed that, as the code runs as part of the template, the newly set $HTMLExportFileName would be used, but it isn’t (I’m now unsure as to what I should expect: @eastgate?).

As a further workaround, I’ve made a stamp ‘Set HTML filename’ using the code removed from the head of template ‘test 4’. The idea is that you stamp the note first and then export it (using template ‘test 4’). This is shown in the second test note (again, nonsense text) that I just added to the test file.

The updated test file is here: cyrillic-test2.tbx (129.9 KB)

Mark, thank you very much! I applied the stamp to all the notes in my file, then exported to HTML, and it worked perfectly. Now I have another question: what would be the best way to create a helpdesk site using Tinderbox and notes with Cyrillic titles? Should I transcribe the Russian names into Latin characters?

Much depends on your schedule. I believe that Tinderbox 9 will resolve the current difficulty with Cyrillic file names; when this part of Tinderbox was first designed, Cyrillic file names were not yet sanctioned.