HTML Export - accents


(Dave Rogers) #1

Quick silly question. When I write in TB “Au revoir, gophér.” The accent over the “e” (an affectation, just go with it), appears as it should. Yay!

But when I export it as HTML I get this: “Au revoir, gophÃ©r.” Boo!

What am I missing? Some line in my CSS style sheet, export template or something? Have I picked the wrong typeface? I know this has something to do with Unicode. Apparently none of that stuff is in the “just works” category yet, and still requires fiddling by the operator, despite the incredible amounts of compute-power dedicated to recording and publishing to the world the transient, ephemeral emissions of my clearly troubled mind.

Any insight would be most appreciated. Thanks.

Dave Rogers
Struggling with HTML since 1999


(Mark Anderson) #2

I can’t replicate this in v7.3.1. To test, I:

  • made a new TBX
  • added the built-in ‘HTML page’ template (which adds necessary prototypes)
  • made a new note ‘Text note’ and added “Au revoir, gophér.” as the $Text.
  • turned on (Window menu) the text pane selectors.
  • applied the template to the note (actually it’s done for you, by default).

The result? Neither the preview pane:

[Screenshot: export-encode2]

… nor the HTML pane:

[Screenshot: export-encode1]

… show the character encoding error you describe (gophÃ©r).

So, whilst I understand what you report and I’m sure you’re describing what you see, I don’t know what you’re doing (even if unintentionally) to create that error. Here’s the test file used for the above - I’ve also added a second test using ^value($Text)^ instead of the normal ^text^ text export method: Export encoding.tbx (61.2 KB)

Hopefully, you see the same as illustrated above when you open and test the file. If that works but your files don’t, please upload a similar small example showing the error so we can find the cause, which is likely something misconfigured by mistake or while experimenting.

“…since 1999”? Pfft, we had CSS by then. I remember the joys of <font> tags in HTML3 and “best viewed in Internet Explorer”**. :wink:

** ‘best’ being rather over-optimistic.

Sidenote: Tinderbox has been cranking out HTML since 2001 and lived through a lot of change to the way we use it!


(Dave Rogers) #3

Well, that was interesting. It seems it’s a Safari thing? All the previews looked fine, as you found. When I used quicklook in Finder, everything looks okay. I go to the web, and I get a sharp stick in the eye!

So I went to the “View” menu in Safari, thinking I’d see “View Source,” or something. (That’s “Develop,” I guess.) But down there near the bottom is Text Encoding. Hmmmm… What does that do?

Mine was set for “Default.” That sounds reasonable, right? But just to see what happens, I selected Unicode (UTF-8), and voila! All is right with the world.

Thanks for checking, Mark. I looked at all my templates and my stylesheets and I didn’t see anything that looked like it might be problematic. For sentimental reasons, I use Trebuchet MS and it seemed to render accents okay in TextEdit.

So, it’s Safari. Why doesn’t that “just work?” (Rhetorical question.)

Thanks again,
Dave

P.S. It’s weirder. Because of course it is.

So I downloaded your test file, everything looks fine - as it does in my TB file.
I export as HTML, as I always do. Everything looks fine in the quicklook view.
I hit “Open in Safari,” and… everything looks fine.
So I check the “View” menu, expecting to see Unicode (UTF-8) selected, but no, it’s “Default.” And it looks fine.
I thought, well perhaps that’s my new “default.” Maybe I switched it somehow long, long ago and it’s just been “wrong” for however long.
I went back to my page, and looked at View and saw Unicode (UTF-8) was still selected. Switched to Default, and got the sharp stick in the eye again!

For whatever it’s worth, here’s the offending page - http://nice-marmot.net/May_2018.html#note_335

It’s just another ill-tempered morning rant I don’t wish to inflict on you, but is there a clue in there what is going on? Do you see the sharp stick in the eye there?

PPS - the time stamp in the permalink is also different depending on whether view is set for Default or Unicode (UTF-8). Another clue?


(eastgate) #4

Your HTML export template, in the header, should tell the browser what character set you’re using.

 <meta http-equiv="content-type" content="text/html; charset=utf-8">

(Dave Rogers) #5

That fixed it! Thanks! Timestamp looks different, but it’s a non-issue.

Dave

HTML. The struggle is real.


(Mark Anderson) #6

The ‘fix’ would imply you’ve likely got some old documents/templates in the mix and looking at http://nice-marmot.net/April_2018.html, that seems to be the case (if the latter is output from Tinderbox). Just make sure all the doc’s various templates that export whole HTML pages have the same fix. FWIW, Tinderbox outputs plain text as UTF-8 with Unix-style line-ends (\n).
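
One way to check that every exported page got the fix is a small script run over the export folder. The following is only a sketch in Python: the function name `pages_missing_charset` and the folder layout are hypothetical, and it simply looks for a UTF-8 charset declaration in a `<meta>` tag near the top of each `.html` file.

```python
import re
from pathlib import Path

# Matches both <meta http-equiv="content-type" content="text/html; charset=utf-8">
# and the shorter <meta charset="utf-8"> form, case-insensitively.
CHARSET_RE = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?\s*utf-8', re.IGNORECASE)


def pages_missing_charset(export_dir):
    """Return the names of .html files under export_dir with no UTF-8 declaration.

    Hypothetical helper for illustration; only the first 1024 bytes are
    inspected, since the declaration belongs near the top of the page.
    """
    missing = []
    for page in sorted(Path(export_dir).rglob("*.html")):
        head = page.read_bytes()[:1024]
        if not CHARSET_RE.search(head):
            missing.append(page.name)
    return missing
```

Calling `pages_missing_charset` on the export folder returns a list of file names; an empty list suggests every page template carries the declaration. Note that this checks the files on disk, not the copies already uploaded to the server.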

Aside, and prompted by your 6 Apr blog, I’m just finishing up reading Bardini’s Bootstrapping along with Hiltzik’s Dealers of Lightning, Ornstein’s Computing in the Middle Ages, and Waldrop’s The Dream Machine.


(Dave Rogers) #7

Thanks, Mark. I’ll go through and check again, but I think the page template should have gotten them all. But I’ll check! (Update: Ah, yes! I see what you were referring to. Tinderbox exports the whole blog every time. But I only upload the changed files. April had an eye-stabbing em-dash, and yes, that’s fixed - but I needed to upload the new file. Now I suppose I should just re-upload the whole damn thing! Thanks!)

I’m still working on Bootstrapping. Went to Ireland and got distracted by the Lusitania and some Irish history. I think Engelbart would have loved Tinderbox. I’m also saddened by what has become of computing. But that’s a blog post for another day!