This document describes a proposal to implement gettext for all translation strings in WikkaWiki, starting with Wikka 1.3. Note that this is only a proposal, so this document can and probably will be modified several times. Should the dev team decide to adopt gettext as the localization standard for Wikka, this page will be renamed to indicate this.
SVN checkout: svn co https://wush.net/svn/wikka/branches/1.3_gettext
The following references were used to develop the initial gettext implementation using the Wikka 1.3 development branch:
"Translating WordPress"
PHP-gettext dev blog (Danilo Segan)
PHP-gettext repository
GNU gettext manual
Some gettext notes from Pablo Hoch's blog
The PHP-gettext standalone library is used to implement gettext in Wikka. This eliminates the need for a Wikka administrator to ensure their version of PHP has gettext support compiled in. PHP-gettext requires no external libraries and only a minimal amount of configuration. It is licensed under GPLv2.
In Wikka 1.3, the PHP-gettext version 1.09 libraries are located in 3rdparty/core/php-gettext. No modifications are necessary when installing from the PHP-gettext version 1.09 release package.
Testing was conducted on a Windows 7 laptop running the excellent WampServer 2.0 package (Apache 2.2.11, PHP 5.2.11, and MySQL 5.1.36) using GNU gettext 0.17 tools under Cygwin.
Defines that were used as translation strings in lang/en/en.inc.php and related language files were replaced in source files with their English equivalents using a Perl script (jump to the end of this article for the script). The gettext macro used for all Wikka translation strings is T_ (the reason for this is that _ is already used by the installer). For instance, the following define:
if(!defined('FOOTER_PAGE_EDIT_LINK_DESC')) define('FOOTER_PAGE_EDIT_LINK_DESC', 'Edit page');
was replaced in the header.php source code file with the following:
T_('Edit page')
A file called localization.php is used in the Wikka top-level directory to configure PHP-gettext. This file should normally not require modification by the end-user. The file itself is invoked from within wikka.php via the include_once directive.
To use another locale, add an entry for the default_locale parameter in wikka.config.php:
'default_locale' => 'fr_CH',
The locale directory is structured as follows:
locale/
locale/po <--contains the generic template file; must be copied to lang-specific directories for translation
locale/po/messages.pot <--generic template file
locale/en_US <--locale-specific
locale/en_US/LC_MESSAGES <--holds lang-specific translations
locale/en_US/LC_MESSAGES/en_US.po <--lang-specific template file, usually created by msginit
locale/en_US/LC_MESSAGES/en_US.mo <--compiled translation file, usually created by msgfmt
locale/de_DE
locale/de_DE/LC_MESSAGES
etc...
Any time new translation macros (of the form T_(...)) are added to the source code, a new gettext template file must be generated. Several gettext utilities can generate this file. This document uses the GNU gettext command-line utilities, so the template is generated with the xgettext command, run from the Wikka top-level directory:
find ./ -name '*.php' | xargs xgettext -L PHP --force-po -kT_ -o locale/po/messages.pot
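If one does not already exist, create a new directory structure under locale/ named after the locale. The names follow BCP-47 language tags (validator), written here with an underscore instead of the hyphen that BCP-47 itself uses. For instance: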
mkdir -p locale/fr_CH/LC_MESSAGES
The GNU gettext command msginit can then be invoked to copy the messages.pot template file for use with the language to be translated:
msginit --locale fr_CH --input locale/po/messages.pot --output-file locale/fr_CH/LC_MESSAGES/fr_CH.po
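The two steps above can be wrapped in a small shell helper. This is only a sketch under a few assumptions: the function name new_locale is made up for illustration and is not part of Wikka, the helper is expected to be run from the Wikka top-level directory, and the --no-translator option (which keeps msginit from asking for a translator e-mail address) is an addition that the command above does not use.

```shell
# new_locale: hypothetical helper, not part of Wikka.
# Creates locale/<tag>/LC_MESSAGES and seeds <tag>.po from the template.
new_locale() {
    tag=$1
    dir="locale/$tag/LC_MESSAGES"
    if [ ! -f locale/po/messages.pot ]; then
        echo "locale/po/messages.pot not found; generate the template first" >&2
        return 1
    fi
    mkdir -p "$dir" || return 1
    # Never overwrite a translation file that is already being worked on
    if [ -f "$dir/$tag.po" ]; then
        echo "$dir/$tag.po already exists; leaving it alone" >&2
        return 0
    fi
    msginit --no-translator --locale "$tag" \
        --input locale/po/messages.pot \
        --output-file "$dir/$tag.po"
}
```

Calling new_locale fr_CH should then leave a fresh locale/fr_CH/LC_MESSAGES/fr_CH.po ready for translation.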
Several utilities exist that can be used to modify .po files. Some of the available utilities are listed here. The file can also be modified manually in a text editor.
[More info needed? I really don't want this to become a translation how-to!]
Once the translations in the .po file are complete, they must be compiled into a binary format for use by the PHP-gettext libraries. The GNU gettext msgfmt command can be used here:
msgfmt -o locale/fr_CH/LC_MESSAGES/fr_CH.mo locale/fr_CH/LC_MESSAGES/fr_CH.po
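When several locales are maintained, the same msgfmt call can be repeated for each of them. The loop below is a sketch of that idea, assuming the directory layout described earlier; the function name compile_locales and its optional root-directory argument are hypothetical and not part of Wikka.

```shell
# compile_locales: hypothetical helper, not part of Wikka.
# Compiles every locale/<tag>/LC_MESSAGES/*.po under the given root
# (default: current directory) into a .mo file next to it.
compile_locales() {
    root=${1:-.}
    status=0
    for po in "$root"/locale/*/LC_MESSAGES/*.po; do
        # An unmatched glob is left as a literal pattern; skip it
        [ -f "$po" ] || continue
        mo="${po%.po}.mo"
        if msgfmt -o "$mo" "$po"; then
            echo "compiled $mo"
        else
            echo "failed: $po" >&2
            status=1
        fi
    done
    return $status
}
```

Run from the Wikka top-level directory, compile_locales with no argument walks every locale and returns a non-zero status if any file fails to compile.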
If you are receiving multibyte errors when running this command, you will most likely have to manually edit the .po file, specifically the charset declared in the following header line:
"Content-Type: text/plain; charset=UTF-8\n"
[TBD]
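For the multibyte errors mentioned above, it can help to see which charset a .po file actually declares before editing it. The following is a minimal sketch using only sed and head; the function name po_charset is hypothetical and not part of Wikka or GNU gettext.

```shell
# po_charset: hypothetical helper, not part of Wikka or GNU gettext.
# Prints the charset declared in the Content-Type header of a .po file.
po_charset() {
    # Capture everything after "charset=" up to the escaped newline
    # or the closing quote of the header line
    sed -n 's/^"Content-Type:.*charset=\([^\\"]*\).*$/\1/p' "$1" | head -n 1
}
```

If po_charset locale/fr_CH/LC_MESSAGES/fr_CH.po prints a placeholder such as CHARSET, or anything other than the encoding the file is really saved in, that header line is the one to correct.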
#! /usr/bin/perl -w
#
# expandDefines.pl: Expand defines, mark with T_() tag for gettext
# processing
#
# Usage: expandDefines.pl <lang>
#
# Author: Brian Koontz <brian@wikkawiki.org> Copyright 2010
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
#####################################################################
use strict;
# Create dictionary of defines appearing in lang file
if(!$ARGV[0])
{
    die "Usage: $0 <lang>\n";
}
my $lang = $ARGV[0];
# Note: this path is specific to the Cygwin/WampServer test setup
my $langfile = "/cygdrive/c/wamp/www/wikka-gettext/lang/$lang/$lang.inc.php";
my %dict = ();
my $lineno = 0;
open(my $in, '<', $langfile) or die "Can't open $langfile for reading: $!";
while(<$in>)
{
    chomp;
    # Capture the argument list of define(...). The non-greedy match
    # stops at the first ')', so a value containing ')' is truncated.
    if(!/^.*\s+define\s*\((.*?)\)/)
    {
        next;
    }
    my @fields = split(/\s*,\s*/, $1);
    my $const = $fields[0];
    # Restore other commas
    my $val = join(', ', @fields[1..$#fields]);
    # Strip the surrounding quotes from the constant name and its value
    $const =~ s/['"](.*?)['"]/$1/;
    $val =~ s/['](.*?)[']/$1/;
    $dict{$const} = $val;
    $lineno++;
}
print "Number of language constants: $lineno\n";
close $in;
# Parse each .php file, replacing constants with expansion: T_("...").
my @filelist = ();
# Exclude these files from search
my @exclude = ($langfile, '3rdparty', 'wikka.config.php');
# Exclude these strings from search
my @excludestrings = ('DIRECTORY_SEPARATOR', 'defined', 'define');
# Exclude these strings from gettext-wrapping
my @excludefromtranslation = ('class=', 'id=', 'name=');
use File::Find;
sub getFile
{
    my $name = $File::Find::name;
    # Keep files ending in .php whose path contains none of the
    # excluded names (\Q...\E matches the names literally)
    if($name =~ /\.php$/ &&
       !grep { $name =~ /\Q$_\E/ } @exclude)
    {
        push(@filelist, $name);
    }
}
# Get list of files
find(\&getFile, ".");
# Expand one candidate token. Known language constants become
# T_("...") or a plain quoted string; anything else is returned as-is.
sub expand
{
    my ($const) = @_;
    return $const unless exists($dict{$const});
    my $val = $dict{$const};
    # Don't do anything with defines for numeric constants
    return $const if $val =~ /^[0-9]+$/;
    # Strings holding markup attributes are expanded but not wrapped
    return "'$val'" if grep { $val =~ /$_/ } @excludefromtranslation;
    return "T_(\"$val\")";
}
# Two or more consecutive upper-case letter groupings separated by _
my $search = qr/([A-Z0-9]+(?:_[A-Z0-9]+)+)/;
foreach my $file (@filelist)
{
    open(my $src, '<', $file) or die "Can't open $file for reading: $!";
    open(my $out, '>', "$file.new") or die "Can't open $file.new for writing: $!";
    while(my $line = <$src>)
    {
        if(!grep { $line =~ /$_/ } @excludestrings)
        {
            # Replace each token with its own expansion, so that a token
            # missing from the dictionary never affects its neighbours
            $line =~ s/$search/expand($1)/ge;
        }
        print $out $line;
    }
    close $src;
    close $out;
    # Keep a backup of the original, then move the new file into place
    system('cp', $file, "$file.orig");
    system('cp', "$file.new", $file);
}