Exporting Non-basic Characters

bcrane · May 28, 2017, 1:35am

I’m curious if anyone has tips or tricks for dealing with non-basic characters. I’ve looked at the pages that seemed relevant in the TbRef (e.g. HTMLEntities) but when I export to HTML I have pages that read curly quotes as:

and non-English characters in a word like “l’oeil” exporting as:

I can imagine ways of replacing these characters with the character codes by agents before export (and then changing it all back), but if anyone has alternatives for how to make an HTML export readable after export as-is I’d be interested to know.

Update: I just realized that in both cases I’ve shown a single curly quote. It seems like it’s mostly curly single and double quotes that aren’t converting to readable form in HTML export.

mwra · May 28, 2017, 9:28am

I can’t replicate this with a new default v7 TBX to which I’ve added the built-in ‘HTML’ template. Certainly since v6, Tinderbox has exported UTF-8 text. In what context are you seeing the corrupted characters - in Tinderbox’s preview or some other setting? If the latter, is the corruption in the rendered view only or in the source (text) as well?

bcrane · May 28, 2017, 11:45am

Actually, just knowing that you can’t replicate it helped me sort out the problem. It was a <head> <meta> issue. I’ve now specified the charset as UTF-8 in the template and the exported pages display correctly. Thanks.
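
For anyone who hits the same symptom, here is a minimal sketch of what was probably going on. It assumes the browser fell back to Windows-1252 because the page declared no charset; the thread doesn't confirm which fallback was in play, and the helper name `misread_as_cp1252` is made up for illustration. The exported file itself is fine UTF-8; only the decoding on display goes wrong, which is why a declaration such as `<meta charset="utf-8">` in the template's <head> is enough to fix it.

```python
# Sketch only: assumes the viewer decoded UTF-8 bytes as Windows-1252
# (a common fallback when a page declares no charset).

def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

original = "l\u2019oeil"  # curly apostrophe, U+2019
garbled = misread_as_cp1252(original)
print(garbled)  # the one curly quote shows up as three junk characters

# Reversing the mistake recovers the text, so the bytes were never damaged.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

The curly quote is three bytes in UTF-8, and each byte gets shown as its own Windows-1252 character, which matches the pattern of plain letters surviving while curly quotes turn to junk.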