Tinderbox Forum

Capitalize Russian sentences with .capitalize

I tried to use .lowercase with Russian sentences. It worked, but I could not then capitalize them into normal sentence case. Is there a workaround?

I can replicate this with a note titled “ångstrom”. Apply this stamp:

$Name = $Name.uppercase;

I get “ÅNGSTROM”. Use stamp:

$Name = $Name.lowercase;

I get “ångstrom”. Apply this code:

$Name = $Name.capitalize;

I get “ångstrom”. Wrong! If I change the title to “angstrom” and reapply that code, I get “Angstrom”.

I’ve a hunch that the process applies a fixed offset of 32 to the character code, which only works for the English letters a–z, where each capital sits 32 positions below its lowercase letter.

By comparison, å is extended ASCII (code page 437) character #134 and Å is character #143, which is not a difference of 32. I don’t think there is a simple user workaround for this.
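
To make the hunch concrete, here is a minimal Python sketch of a capitalize routine that only knows the ASCII a–z range. This is not Tinderbox's actual code, and the name ascii_capitalize is hypothetical. It only illustrates how such a routine would leave å and Cyrillic letters untouched.

```python
def ascii_capitalize(text: str) -> str:
    """Capitalize the first character only if it is ASCII a-z (hypothetical helper)."""
    if text and "a" <= text[0] <= "z":
        # In ASCII, each lowercase letter is exactly 32 above its capital.
        return chr(ord(text[0]) - 32) + text[1:]
    # Anything outside a-z (such as å or Cyrillic letters) is returned unchanged.
    return text


print(ascii_capitalize("angstrom"))
print(ascii_capitalize("\u00e5ngstrom"))  # title starting with å
```

With this sketch, “angstrom” gains a capital while a title starting with å comes back unchanged, which matches the behaviour reported above.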

It’s not that simple, alas. We’ll investigate.

Mark, thank you for the investigation. What surprises me is that .lowercase works perfectly with Russian characters while the reverse command .capitalize doesn’t.

Indeed, but as a non-programmer, I’ve no idea why!

Looking at the code, .capitalize() uses an old library that knows nothing of Unicode and operates only on Latin characters. The .uppercase and .lowercase operators use more modern libraries that understand Unicode and locales. I expect that all of these will be modernized in the next release.
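
For contrast, here is a rough Python sketch of what a Unicode-aware version might look like. It assumes “capitalize” means upper-casing only the first character of the string, and the name unicode_capitalize is hypothetical. It relies on Python's built-in str.upper, which follows Unicode case mappings, and says nothing about how Tinderbox will actually implement the fix.

```python
def unicode_capitalize(text: str) -> str:
    """Upper-case the first character using Unicode case mapping (hypothetical helper)."""
    # Slicing with [:1] keeps this safe for an empty string.
    return text[:1].upper() + text[1:]


print(unicode_capitalize("\u00e5ngstrom"))  # title starting with å
print(unicode_capitalize("\u043f\u0440\u0438\u0432\u0435\u0442"))  # a Russian word
```

Because the case mapping comes from Unicode rather than a fixed offset, the same call handles Latin, accented, and Cyrillic first letters alike.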