Language support in phpBB 2.2
Posted on phpBB 2.2 support forum 05-Sep-03

Hi,

This is my first post here, so first of all I'd like to thank all phpBB developers and supporters for their great work. I especially like the new pricing scheme you've announced...

I use phpBB 2.0.4 on a web site that serves an Israeli hiking group that I also manage. It's a small message board by any standard, but it serves as a crucial information center for the group.

I set up the board to work in Hebrew, and it works nicely, but the task wasn't as smooth and easy as it could have been, partly because of missing features in the code and partly because of inadequate support files such as templates and translations.

My intention isn't to criticize - on the contrary, I wish to describe a few problems and how they can be fixed in 2.2, for the benefit of all Right-to-Left (RTL) languages, not just Hebrew but also Arabic and maybe others. For some of the issues I may be able to contribute knowledge and actual work. This is a long (but IMHO interesting) post, so take a deep breath or just skip on to something else...

1. Multiple languages (with different directionality) in the same board.

By this I mean not a user interface in different languages, but forums and topics that use different languages. Note that the only practical way to mix several languages in one web page is to use UTF-8, because there is only one charset declaration for the entire HTML page.

What I wish for is a language attribute for the entire board, for each forum and maybe even for each topic. Forums default to the board's language, and topics default to the forum's language. The default UI language (e.g. for guests and new users) is of course the board's language, but users can change it at will.

When a user enters a forum, the forum is displayed using the forum's language, including proper layout (LTR or RTL).  The UI layout and language don't change, only the layout of the table that contains the forum's data.

When a user posts a new topic, the default language of the topic is the forum's language, but the user may change the language (if permitted by a configuration setting). When a user replies to an existing topic, the default language is the topic's language, but again, if permitted for the forum, the user may change the language.

When a topic is displayed, the layout (LTR or RTL) for the topic is determined by the language of the first message. The contents of messages in other languages are displayed with the correct directionality for each language, but within the layout determined by the first message.
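The fallback rules described above (topic defaults to forum, forum defaults to board, direction follows language) can be sketched roughly as follows. This is only an illustration in Python, not phpBB code: the class names, the `RTL_LANGUAGES` set and the language codes are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical list of RTL language codes; a real board would configure this.
RTL_LANGUAGES = {"he", "ar"}


@dataclass
class Board:
    language: str


@dataclass
class Forum:
    board: Board
    language: Optional[str] = None  # None means "inherit from the board"


@dataclass
class Topic:
    forum: Forum
    language: Optional[str] = None  # None means "inherit from the forum"


def forum_language(forum: Forum) -> str:
    """A forum uses its own language, or falls back to the board's."""
    return forum.language or forum.board.language


def topic_language(topic: Topic) -> str:
    """A topic uses its own language, or falls back to the forum's."""
    return topic.language or forum_language(topic.forum)


def direction(language: str) -> str:
    """Map a language code to the layout direction used for rendering."""
    return "rtl" if language in RTL_LANGUAGES else "ltr"


board = Board(language="he")
hebrew_forum = Forum(board)                  # inherits Hebrew from the board
english_forum = Forum(board, language="en")  # overrides the board language
topic = Topic(english_forum)                 # inherits English from the forum
print(direction(forum_language(hebrew_forum)), direction(topic_language(topic)))
```

The point of the sketch is that only the overrides need to be stored; everything else is resolved at display time.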

I implemented something that more or less works along these lines, using a smart but not very elegant trick. I add a special character combination to a forum's name which determines whether the forum is LTR or RTL. Same for topic titles. Then I hooked in a piece of code that checks for this signature and modifies the rendering of the contents.

You may wish to look at this on my site http://hug-elad.org/forum - note that most of the forums are in Hebrew, which will probably look like gibberish to you. There are two (almost empty) English forums at the bottom. Look at a message in a Hebrew forum and then at a message in an English forum, and see how the layout is switched. Also note the forum names in the jumpbox contain ~R~ or ~L~ that signify either an RTL or an LTR forum (it works at the topic level too).
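For readers who want to picture the marker trick, here is a minimal sketch of the idea. It is written in Python purely for illustration and is not the code running on my board; the function name and the choice to strip the marker from the displayed name are assumptions.

```python
import re

# Assumed marker format: "~R~" or "~L~" somewhere in the forum or topic name.
_MARKER = re.compile(r"~([RL])~")


def split_direction_marker(name: str, default: str = "ltr"):
    """Return (direction, name without the marker).

    If no marker is present, the caller-supplied default direction is used
    and the name is returned unchanged.
    """
    match = _MARKER.search(name)
    if match is None:
        return default, name
    direction = "rtl" if match.group(1) == "R" else "ltr"
    clean = (name[:match.start()] + name[match.end():]).strip()
    return direction, clean


print(split_direction_marker("~R~ Trips"))
print(split_direction_marker("English forum ~L~", default="rtl"))
```

A proper per-forum language attribute in the database would make this kind of string parsing unnecessary, which is exactly why I am asking for it.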

What's needed to implement this:

2. Improve templates

The current templates do not fully account for directionality. Even though directionality itself is included, right/left alignment is in many cases hard-coded, making the output mixed up. Not much to say except:

3. Take into account UTF-8 string lengths!

I converted the Hebrew language files to UTF-8, to make phpBB compatible with the rest of my site, and also to be ready in case I need to support other languages (e.g. Arabic, Russian).

It works, but there are problems with string length limits. Most language encodings use 1 byte per character, so it's easy to match the input size limit to the column size in the database.

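However, UTF-8 uses a variable number of bytes per character: 1 for ASCII, 2 for scripts such as Hebrew, Arabic and Cyrillic, 3 for most Chinese/Japanese/Korean characters, and up to 4 for the rest (the original UTF-8 definition allowed sequences of up to 6 bytes).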

Currently the phpBB code and DB schemas don't take this into account. In one case I typed a very long topic title, which was within the character limit, but in UTF-8 it exceeded the byte limit. This corrupted the topic, and I had a hard time fixing it with phpMyAdmin.
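To make the character/byte mismatch concrete, here is a small sketch (in Python, just for illustration; the function names are made up and the limit of 5 bytes is arbitrary). It measures the UTF-8 byte length of a string and truncates it to a byte limit without cutting a character in half.

```python
def utf8_length(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text so that its UTF-8 encoding fits into max_bytes.

    Slicing the encoded bytes may leave an incomplete multi-byte sequence
    at the end; decoding with errors="ignore" drops that partial character
    instead of producing corrupt data.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


title = "שלום"  # 4 Hebrew characters, 2 bytes each in UTF-8
print(len(title), utf8_length(title))
print(truncate_utf8(title, 5))
```

The same check done on input, against the real column size in bytes, would have rejected or shortened my long title instead of corrupting the topic.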

What's needed to correct this:

4. Enforce English / LTR layout in some places

The previous points were all about flexibility of using different languages and layouts, but in some places (mostly admin stuff), non-English languages and non-LTR layout may cause serious problems. Why?

Most boards are installed on hosted web sites, where the board admin has no control at all over the underlying locale settings, filesystem character support, etc. So admins must be very careful not to use, for example, file and directory names that contain exotic characters.

It's perfectly possible to type English text (e.g. a file name) into an input field even when the page is in Hebrew. However, due to the BiDi algorithm at work on the browser side, the text doesn't appear as it should. This is crucial for paths and file names, but other stuff may get confused as well. For example:

In LTR (normal) layout: forum/mydir/
In RTL layout it will show: /forum/mydir

In LTR (normal) layout: 1+2=3
In RTL layout it will show: 3=1+2

Note how the slash at the end is shown as if it were at the beginning of the path string, while it's actually still at the end. Even for me, an experienced and knowledgeable user, this is confusing and causes mistyping.
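One way to keep such strings readable inside an RTL page is to mark them explicitly as LTR, either with the HTML dir attribute or with Unicode directional control characters. The sketch below (Python, for illustration only; both helper names are made up) shows the two variants. Whether phpBB should do this in the templates or in the code is an open question.

```python
import html

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def ltr_span(text: str) -> str:
    """Wrap text in an HTML span that forces left-to-right display."""
    return f'<span dir="ltr">{html.escape(text)}</span>'


def ltr_embed(text: str) -> str:
    """Wrap plain text in Unicode control characters that force LTR display.

    Useful where markup is not available, e.g. inside a plain-text value.
    """
    return f"{LRE}{text}{PDF}"


print(ltr_span("forum/mydir/"))
print(repr(ltr_embed("1+2=3")))
```

With either variant the trailing slash stays visually at the end of the path, because the neutral characters no longer take their direction from the surrounding RTL layout.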

By the way, as it is today the admin CP is language-aware but almost completely layout-UNaware, which causes even more serious confusion. For example, radio selections are inverted (it looks like Yes is selected while actually it's No, etc.). Imagine what this can do to settings such as "Board Active"...

Possible approaches:

That's it, at least for now.

Thank you for taking the time to read all this.

I hope there will be a fruitful discussion following this message, and I hope that whoever is in charge will explain how to submit these enhancement requests to the developers.

As I said I'm willing to help - mostly I can help with translation, fixing templates, testing, etc.

Regards,

E.Z