Sunday, February 19, 2006

BBC orchestra's problem with accents ...

It is wonderful news that the Czech maestro Jiri Belohlavek (right) has been appointed Chief Conductor of the BBC Symphony Orchestra from July 2006. But there is one small problem that the internet-driven BBC may not have thought of.

I wrote recently about the problem that news feeds, such as Topix.net's very useful Symphony News, have in handling accented characters. Well, as is obvious, Jiří Bělohlávek has a lot of accents in his name if you write it correctly. Now I always try to use accents, as they are an integral part of language. But I have just uploaded an article about Bělohlávek, and the headline Bělohlávek's orchestral rhapsody in Norwich was parsed by Topix.net as BA - which is at least appropriate!

The obvious answer is to leave out the accents. This is what Bělohlávek's agents, IMG Artists, do on their web site, but puzzlingly the BBC retain the accents with the exception of ě. To avoid the problem I have compromised and republished the article without accents in the headline but retaining them in the body text, and have also omitted them from the first paragraph above. My György Kurtág article, which will be uploaded in a few hours, will also be accentless for the same reason. It just seems a shame that the diversity and richness of language, and the aesthetic appeal of the words, should be dumbed down because of this. Shouldn't we instead be trying to dumb-up our internet software to handle accents?
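For anyone who would rather automate that compromise than retype each headline, here is a minimal sketch in Python. It assumes the text is already proper Unicode, and the function name strip_accents is purely illustrative. It says nothing about how Topix.net, IMG Artists or the BBC process their text. It simply splits each accented letter into a base letter plus a combining mark and then throws the marks away.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining diacritical marks removed (illustrative helper)."""
    # NFD decomposition splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except the combining marks (Unicode category "Mn").
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


headline = "Bělohlávek's orchestral rhapsody in Norwich"
print(strip_accents(headline))
```

The body text can keep its accents and only the headline need be passed through the function. Letters that are not a base letter plus a mark, such as the Polish ł, are left untouched by this approach.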

Photo of Jiří Bělohlávek from Edinburgh Festival

If you enjoyed this post take An Overgrown Path to Bělohlávek's orchestral rhapsody in Norwich

3 comments:

Colin Hartnett said...

I think all web browsers made today handle diacritics well. It's the authoring that's the problem. People just don't know how to enter the characters... they're not on the keyboard, so they don't exist, basically.

Pliable said...

Colin, thanks for that. But I'm not sure in this case that the problem is with the browser.

You will see from the final paragraphs of my article (where I have deliberately left the accents in) that I have keyed the accents, and the browser reproduces them.

The problem lies in intermediary processing by the news aggregator www.topix.net which is taking the article and parsing it without accents before it is viewed in the reader's browser. It is a particular problem in subject areas such as classical music where accents are common.

A tip for fellow bloggers who struggle to find accents on their keyboard. The edit tools on Wikipedia include a complete range of accented characters. Compose your text as a Wiki draft, then cut and paste it into your blog - with apologies to Wikipedia for suggesting using their bandwidth.

Trouble is when you've created the lovely text the news aggregator will strip it out!

Tim Rutherford-Johnson said...

You could also refer to references such as this: http://www.starr.net/is/type/htmlcodes.html

which list HTML entity names and numeric character codes (often loosely called ASCII codes) for most diacriticals. (For the Eastern European stuff, such as ě and ř, you have to use the numeric codes.)
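The numeric codes on reference pages like that one have the form &#NNN;, where NNN is the character's decimal Unicode code point. They can also be generated rather than looked up. The following is a minimal sketch in Python using only the standard library. The function name to_numeric_references is made up for illustration, and whether any particular news aggregator passes such references through intact is an assumption that would need testing.

```python
import html


def to_numeric_references(text: str) -> str:
    """Replace every non-ASCII character with an HTML numeric reference (illustrative helper)."""
    # The 'xmlcharrefreplace' error handler writes each character that
    # cannot be encoded as ASCII in the form &#NNN; (decimal code point).
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


encoded = to_numeric_references("Jiří Bělohlávek")
print(encoded)                 # Ji&#345;&#237; B&#283;lohl&#225;vek
print(html.unescape(encoded))  # Jiří Bělohlávek
```

The result is plain ASCII, so it survives any system that can only handle unaccented text, and a browser turns it back into the accented name when it displays the page.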