Wiki source for LocalizationProposal


====Localization proposal using gettext====
<<This document describes a proposal to implement gettext for all translation strings in WikkaWiki, starting with Wikka 1.3. Note that this is only a proposal, so this document can and probably will be modified several times. Should the dev team decide to adopt gettext as the localization standard for Wikka, this page will be renamed to indicate this.

SVN checkout: svn co https://wush.net/svn/wikka/branches/1.3_gettext
<<::c::

===References===
The following references were used to develop the initial gettext implementation using the Wikka 1.3 development branch:

[[http://codex.wordpress.org/Translating_WordPress | "Translating WordPress"]]
[[http://danilo.segan.org/blog/ihatephp | PHP-gettext dev blog (Danilo Segan)]]
[[https://launchpad.net/php-gettext/ | PHP-gettext repository]]
[[http://www.gnu.org/software/gettext/manual/gettext.html#xgettext-Invocation | GNU gettext manual]]
[[http://mel.melaxis.com/devblog/2005/08/06/localizing-php-web-sites-using-gettext/ | Some gettext notes from Pablo Hoch's blog]]

===Implementation Notes===
The [[https://launchpad.net/php-gettext/ | PHP-gettext]] standalone library is used to implement gettext in Wikka. This eliminates the need for a Wikka administrator to ensure their version of PHP has gettext support compiled in. PHP-gettext requires no external libraries and only a minimal amount of configuration. It is licensed under GPLv2.

In Wikka 1.3, the PHP-gettext version 1.09 libraries are located in ##3rdparty/core/php-gettext##. No modifications are necessary when installing from the PHP-gettext version 1.09 release package.

Testing was conducted on a Windows 7 laptop running the excellent [[http://www.wampserver.com/en/ | WampServer 2.0]] package (Apache 2.2.11, PHP 5.2.11, and MySQL 5.1.36) using [[http://www.gnu.org/software/gettext/ | GNU gettext 0.17]] tools under [[http://www.cygwin.com/ | Cygwin]].

Defines that were used as translation strings in ##lang/en/en.inc.php## and related language files were replaced in source files with their English equivalents using a Perl script (jump to the end of this article for the script). The gettext macro used for all Wikka translation strings is ##T_## (the reason for this is that ##_## is already used by the installer). For instance, the following define:

##if(!defined('FOOTER_PAGE_EDIT_LINK_DESC')) define('FOOTER_PAGE_EDIT_LINK_DESC', 'Edit page');##

was replaced in the ##header.php## source code file with the following:

##T_('Edit page')##

A file called ##localization.php## is used in the Wikka top-level directory to configure PHP-gettext. This file should normally not require modification by the end-user. The file itself is invoked from within ##wikka.php## via the ##include_once## directive.

To use another locale, add a ##default_locale## entry to ##wikka.config.php##:

%%(php)
'default_locale' => 'fr_CH',
%%

===Locale directory structure===
The ##locale## directory is structured as follows:

%%
locale/
locale/po <--contains the generic template file; must be copied to lang-specific directories for translation
locale/po/messages.pot <--generic template file
locale/en_US <--locale-specific
locale/en_US/LC_MESSAGES <--holds lang-specific translations
locale/en_US/LC_MESSAGES/en_US.po <--lang-specific template file, usually created by msginit
locale/en_US/LC_MESSAGES/en_US.mo <--compiled translation file, usually created by msgfmt
locale/de_DE
locale/de_DE/LC_MESSAGES
etc...
%%
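As a quick sanity check on this layout, the shell sketch below walks a locale directory and reports, for each locale, whether the ##.po## source and the compiled ##.mo## file are present. The function name ##locale_status## is hypothetical and not part of Wikka. It assumes the files are named after the locale, as in the tree above.

```shell
# locale_status [DIR]: for every DIR/<locale>/LC_MESSAGES directory, report
# whether <locale>.po and <locale>.mo exist. DIR defaults to "locale".
# Hypothetical helper, not shipped with Wikka.
locale_status() {
    root=${1:-locale}
    for dir in "$root"/*/LC_MESSAGES; do
        [ -d "$dir" ] || continue        # glob matched nothing
        loc=$(basename "$(dirname "$dir")")
        po=missing
        mo=missing
        [ -f "$dir/$loc.po" ] && po=present
        [ -f "$dir/$loc.mo" ] && mo=present
        echo "$loc: po=$po mo=$mo"
    done
}
```

Run from the Wikka top-level directory as ##locale_status locale##. The ##locale/po## template directory is skipped because it has no ##LC_MESSAGES## subdirectory.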

===Generating the gettext template (.pot) file===
Any time new translation macros (of the form ##T_(...)##) are added to the source code, a new gettext template file must be generated. Several gettext utilities can generate this file. This document uses the GNU gettext command-line utilities, so the template is generated with the ##xgettext## command, run from the Wikka top-level directory:

##find ./ -name '*.php' | xargs xgettext -L PHP --force-po -kT_ -o locale/po/messages.pot##
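One caveat with the pipeline above: when the file list is very long, ##xargs## may split it across several ##xgettext## runs, and each run overwrites the output file. The sketch below is one possible way around that, writing the file list to a temporary file and passing it with ##--files-from##. The function name ##extract_strings## is hypothetical, and ##--from-code=UTF-8## assumes the PHP sources are UTF-8 (or plain ASCII).

```shell
# extract_strings [SRC_DIR] [POT_FILE]: collect all T_() strings from the
# .php files under SRC_DIR into a single template file.
# Hypothetical helper; assumes GNU gettext's xgettext is on the PATH.
extract_strings() {
    src=${1:-.}
    pot=${2:-locale/po/messages.pot}
    list=$(mktemp) || return 1
    mkdir -p "$(dirname "$pot")"
    # One file name per line, in a stable order
    find "$src" -name '*.php' | LC_ALL=C sort > "$list"
    # A single xgettext run reads every file named in the list
    xgettext --files-from="$list" -L PHP --from-code=UTF-8 \
        --force-po -kT_ -o "$pot"
    status=$?
    rm -f "$list"
    return $status
}
```

Called without arguments from the Wikka top-level directory, this writes ##locale/po/messages.pot##, the same target as the command above. File names containing newlines are not handled.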

===Creating language-specific template (.po) files===
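If one does not already exist, create a new directory structure under ##locale/## named after the locale. Names are based on [[http://www.rfc-editor.org/rfc/bcp/bcp47.txt | BCP-47]] language tags ([[http://schneegans.de/lv/ | validator]]), but written in the POSIX form with an underscore (##fr_CH##) rather than a hyphen (##fr-CH##). For instance: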

##mkdir -p locale/fr_CH/LC_MESSAGES##

The GNU gettext command ##msginit## can then be invoked to copy the ##messages.pot## template file for use with the language to be translated:

##msginit --locale fr_CH --input locale/po/messages.pot --output-file locale/fr_CH/LC_MESSAGES/fr_CH.po##

===Creating translations===
Several utilities exist that can be used to modify .po files. Some of the available utilities are listed [[http://www.gnu.org/software/gettext/manual/gettext.html#Editing | here]]. The file can also be modified manually in a text editor.

[More info needed? I really don't want this to become a translation how-to!]

===Compiling translations (.po->.mo files)===
Once the translations in the .po file are complete, they must be compiled into a binary format for use by the PHP-gettext libraries. The GNU gettext ##msgfmt## command can be used here:

##msgfmt -o locale/fr_CH/LC_MESSAGES/fr_CH.mo locale/fr_CH/LC_MESSAGES/fr_CH.po##
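When several locales are maintained, the same command can be applied to each one in a loop. The following is a minimal sketch under the directory layout described earlier. The function name ##compile_locales## is hypothetical, and the ##--statistics## flag is added here only to print translated and untranslated counts for each file.

```shell
# compile_locales [DIR]: compile every DIR/<locale>/LC_MESSAGES/*.po file
# into a .mo file next to it. DIR defaults to "locale".
# Hypothetical helper; assumes GNU gettext's msgfmt is on the PATH.
compile_locales() {
    root=${1:-locale}
    for po in "$root"/*/LC_MESSAGES/*.po; do
        [ -f "$po" ] || continue         # glob matched nothing
        # ${po%.po}.mo swaps the extension: fr_CH.po -> fr_CH.mo
        msgfmt --statistics -o "${po%.po}.mo" "$po" || return 1
    done
}
```

The loop stops at the first file that ##msgfmt## rejects, so a broken .po file is not silently skipped. ##msgfmt## also accepts ##--check## for stricter validation of the header and format strings.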

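If ##msgfmt## reports multibyte errors when running this command, the charset declared in the .po header most likely does not match the file's actual encoding. Edit the .po file manually so that the ##Content-Type## header line declares the correct charset, for example: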

## "Content-Type: text/plain; charset=UTF-8\n" ##
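The header can also be changed from the command line. The sketch below rewrites only the charset declaration in the ##Content-Type## header line. The function name ##set_po_charset## is hypothetical. It changes the declaration only, so it assumes the file contents are already encoded in the charset being declared.

```shell
# set_po_charset FILE [CHARSET]: rewrite the charset declared in the
# Content-Type header of a .po file. CHARSET defaults to UTF-8.
# Hypothetical helper; it does not convert the file's encoding.
set_po_charset() {
    po=$1
    charset=${2:-UTF-8}
    # Keep the "Content-Type: text/plain; charset=" prefix, swap the value
    sed "s/\(Content-Type: text\/plain; charset=\)[A-Za-z0-9_.-]*/\1$charset/" \
        "$po" > "$po.tmp" && mv "$po.tmp" "$po"
}
```

For example, ##set_po_charset locale/fr_CH/LC_MESSAGES/fr_CH.po## replaces a declaration such as ##charset=ASCII## with ##charset=UTF-8##.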

===Merging translations===
[TBD]
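Until this section is written, one approach with the GNU tools is ##msgmerge##, which updates an existing .po file against a regenerated template while keeping the translations already made. The sketch below is only an assumption about how merging might be done for Wikka, not a decided workflow. The function name ##merge_locales## is hypothetical.

```shell
# merge_locales [DIR] [POT_FILE]: bring every DIR/<locale>/LC_MESSAGES/*.po
# file up to date with the template. Defaults: "locale" and
# DIR/po/messages.pot.
# Hypothetical helper; assumes GNU gettext's msgmerge is on the PATH.
merge_locales() {
    root=${1:-locale}
    pot=${2:-$root/po/messages.pot}
    for po in "$root"/*/LC_MESSAGES/*.po; do
        [ -f "$po" ] || continue         # glob matched nothing
        # --update rewrites the .po file in place: new msgids are added,
        # existing translations are kept
        msgmerge --update "$po" "$pot" || return 1
    done
}
```

By default ##msgmerge --update## leaves a backup copy of each changed file with a ##~## suffix. The merged .po files still have to be recompiled with ##msgfmt## afterwards.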

===expandDefines.pl===
%%
#! /usr/bin/perl -w
#
# expandDefines.pl: Expand defines, mark with T_() tag for gettext
# processing
#
# Usage: expandDefines.pl <lang>
#
# Author: Brian Koontz <brian@wikkawiki.org> Copyright 2010

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
#####################################################################
use strict;

# Create dictionary of defines appearing in lang file
if(!$ARGV[0])
{
    die "Usage: $0 <lang>\n";
}
my $lang = $ARGV[0];
my $langfile = "/cygdrive/c/wamp/www/wikka-gettext/lang/$lang/$lang.inc.php";
my %dict = ();
my $lineno = 0;
open(IN, "<$langfile") or die "Can't open $langfile for reading!";
while(<IN>)
{
    chomp;
    # Only lines containing a define(...) call are of interest
    if(!/^.*\s+define\s*\((.*?)\)/)
    {
        next;
    }
    my @fields = split(/\s*,\s*/, $1);
    my $const = $fields[0];
    # Restore other commas
    my $val = join(', ', @fields[1..$#fields]);
    # Strip the quotes surrounding the constant name and its value
    $const =~ s/['"](.*?)['"]/$1/;
    $val =~ s/['](.*?)[']/$1/;
    $dict{$const} = $val;
    $lineno++;
}
print "Number of language constants: $lineno\n";
close IN;

# Parse each .php file, replacing constants with expansion: T_("...").
my @filelist = ();
# Exclude these files from search
my @exclude = ($langfile, '3rdparty', 'wikka.config.php');
# Exclude these strings from search
my @excludestrings = ('DIRECTORY_SEPARATOR', 'defined', 'define');
# Exclude these strings from gettext-wrapping
my @excludefromtranslation = ('class=', 'id=', 'name=');
use File::Find;
sub getFile
{
    # Match only names ending in .php; exclusions are literal substrings
    if($File::Find::name =~ /\.php$/ &&
       !grep($File::Find::name =~ /\Q$_\E/, @exclude))
    {
        push(@filelist, $File::Find::name);
    }
}
# Return the replacement text for a single upper-case identifier
sub expandConstant
{
    my ($const) = @_;
    # Leave unknown identifiers and numeric constants untouched
    if(!exists($dict{$const}) || $dict{$const} =~ /^[0-9]+$/)
    {
        return $const;
    }
    # Values that look like markup attributes are expanded but not wrapped
    if(grep($dict{$const} =~ /$_/, @excludefromtranslation))
    {
        return "'$dict{$const}'";
    }
    return "T_(\"$dict{$const}\")";
}
# Get list of files
find (\&getFile, ".");
$lineno = 0;
foreach my $file(@filelist)
{
    open(IN, "<$file") or die "Can't open $file for reading!";
    open(OUT, ">$file.new") or die "Can't open $file.new for writing!";
    my $search = "([A-Z0-9]+(_[A-Z0-9]+)+)";
    while(<IN>)
    {
        $lineno++;
        my $line = $_;
        if(!grep($line=~/$_/, @excludestrings))
        {
            # Search for two or more consecutive upper-case letter groupings
            # separated by _, and expand each match on its own so that one
            # identifier is never replaced with another's value
            $line =~ s/$search/expandConstant($1)/ge;
        }
        print OUT $line;
    }
    close IN;
    close OUT;
    # Keep a backup of the original, then put the rewritten file in place
    system("cp $file $file.orig");
    system("cp $file.new $file");
    $lineno = 0;
}
%%