International Components for Unicode

Claudio Zizza

Developer Developer Developer
(Currently PHP)

PHP Snippets: php.budgegeria.de

Twitter: @SenseException

Mars Climate Orbiter

Date representation

es_ES: 21/4/16
en_US: 4/21/16
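
A minimal sketch of how the two representations above could be produced with the intl extension. The fixed date and the `$formatted` array are assumptions made for this illustration, and the exact strings depend on the ICU data installed.

```php
<?php

// Hypothetical fixed date (21 April 2016), pinned to UTC so the day never shifts.
$date = new DateTime('2016-04-21 12:00:00', new DateTimeZone('UTC'));

$formatted = array();
foreach (array('es_ES', 'en_US') as $locale) {
    // SHORT date style, no time part.
    $formatter = new IntlDateFormatter(
        $locale,
        IntlDateFormatter::SHORT,
        IntlDateFormatter::NONE,
        'UTC'
    );

    $formatted[$locale] = $formatter->format($date);
    echo $locale . ': ' . $formatted[$locale] . PHP_EOL;
}
```

The same value is rendered with a different day and month order per locale, which is the kind of mismatch the Mars Climate Orbiter slide hints at.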

ICU - International Components for Unicode

icu-project.org

ICU

  • Open Source Project
  • Unicode and Globalization support (C/C++/Java)
  • Released in 1999
  • Sponsored by IBM and others
  • Current version: ICU 57.1

Intl-Extension Classes

NumberFormatter

Format


    <?php

    $numberFormatter = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
    echo $numberFormatter->format(1000.45) . PHP_EOL;

    $numberFormatter = new NumberFormatter('de_CH', NumberFormatter::CURRENCY);
    echo $numberFormatter->format(1000.45) . PHP_EOL;

    echo $numberFormatter->getSymbol(NumberFormatter::CURRENCY_SYMBOL);
            
1'000.45
CHF 1'000.45
CHF

Parser


    <?php

    $numberFormatter = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
    echo $numberFormatter->parse('1\'000.45');
            
1000.45

IntlDateFormatter

Date & Time Formatter


    <?php

    $dateFormatter = new IntlDateFormatter('it_IT', IntlDateFormatter::LONG,
        IntlDateFormatter::SHORT);

    $date = new DateTime();

    echo $dateFormatter->format($date) . PHP_EOL;
    echo $dateFormatter->format(1390252923);
            
18 aprile 2016 21:51
20 gennaio 2014 22:22

Date & Time Parser


    <?php

    $dateFormatter = new IntlDateFormatter('it_IT', IntlDateFormatter::LONG,
        IntlDateFormatter::SHORT);

    echo $dateFormatter->parse('20 gennaio 2014 22:22');
            
1390252920

MessageFormatter

Formatting whole messages


    <?php

    $text = 'Am {dateval,date,full} waren es {visitor,number,integer} Besucher.';

    $msgDe = new MessageFormatter('de_DE', $text);
    $msgCh = new MessageFormatter('de_CH', $text);

    $args = array(
        'visitor' => 1240000,
        'dateval' => new DateTime(),
    );

    echo $msgDe->format($args) . PHP_EOL;
    echo $msgCh->format($args);
            
Am Sonntag, 17. April 2016 waren es 1.240.000 Besucher.
Am Sonntag, 17. April 2016 waren es 1'240'000 Besucher.

MessageFormatter static


    <?php

    $text = 'Am {dateval,date,full} waren es {visitor,number,integer} Besucher.';

    $args = array(
        'visitor' => 1240000,
        'dateval' => new DateTime(),
    );

    echo MessageFormatter::formatMessage('de_DE', $text, $args);
            
Am Sonntag, 17. April 2016 waren es 1.240.000 Besucher.

MessageFormatter

Type        Style
number      (none), integer, currency, percent, (styletext)
date        (none), short, medium, long, full, (styletext)
time        (none), short, medium, long, full, (styletext)
spellout
ordinal
duration
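
A short sketch of some of the types from the table in one pattern. The pattern text, the argument names (`rank`, `share`, `count`) and the `en_US` locale are made up for this illustration.

```php
<?php

// Hypothetical pattern combining the ordinal, number/percent and spellout types.
$text = '{rank,ordinal} place with {share,number,percent}, '
      . 'spelled out: {count,spellout}.';

$args = array(
    'rank'  => 3,
    'share' => 0.25,
    'count' => 42,
);

// formatMessage() returns false if the pattern or the arguments are invalid.
$message = MessageFormatter::formatMessage('en_US', $text, $args);

echo $message . PHP_EOL;
```

Swapping the locale for `de_DE` or `it_IT` should change the wording of the spelled-out number and the ordinal without touching the pattern.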

IntlCalendar

Calendar information


    <?php

    $calendar = IntlCalendar::createInstance('Europe/Berlin', 'de_DE');

    var_dump(
        $calendar->getTime(),
        $calendar->getType(),
        $calendar->isWeekend(),
        $calendar->get(IntlCalendar::FIELD_YEAR),
        $calendar->inDaylightTime()
    );
            
1461093920108
gregorian
false
2016
true

Build a Calendar

April
Mo. Di. Mi. Do. Fr. Sa. So.
28  29  30  31   1   2   3
 4   5   6   7   8   9  10
11  12  13  14  15  16  17
18  19  20  21  22  23  24
25  26  27  28  29  30   1

April
Sun Mon Tue Wed Thu Fri Sat
27  28  29  30  31   1   2
 3   4   5   6   7   8   9
10  11  12  13  14  15  16
17  18  19  20  21  22  23
24  25  26  27  28  29  30
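
One possible way to compute such a month grid with IntlCalendar, as a sketch. The helper name `buildMonthGrid()` and its parameters are hypothetical and not part of the intl extension. It relies on `getFirstDayOfWeek()` to decide whether a week starts on Monday or Sunday.

```php
<?php

/**
 * Hypothetical helper: returns the weeks of a month as rows of seven
 * day-of-month numbers, each row starting on the locale's first weekday.
 */
function buildMonthGrid($year, $month, $locale, $timezone = 'UTC')
{
    $calendar = IntlCalendar::createInstance($timezone, $locale);
    $calendar->clear();
    $calendar->set(IntlCalendar::FIELD_YEAR, $year);
    // IntlCalendar months are zero-based (0 = January).
    $calendar->set(IntlCalendar::FIELD_MONTH, $month - 1);
    $calendar->set(IntlCalendar::FIELD_DAY_OF_MONTH, 1);

    $lastDay = $calendar->getActualMaximum(IntlCalendar::FIELD_DAY_OF_MONTH);

    // Number of days to step back from the 1st to reach the start of its week.
    $weekday = $calendar->get(IntlCalendar::FIELD_DAY_OF_WEEK);
    $offset = ($weekday - $calendar->getFirstDayOfWeek() + 7) % 7;
    $calendar->add(IntlCalendar::FIELD_DAY_OF_MONTH, -$offset);

    // Fill complete weeks until the last day of the month is covered.
    $cellCount = (int) ceil(($offset + $lastDay) / 7) * 7;
    $weeks = array();
    for ($i = 0; $i < $cellCount; $i++) {
        $weeks[(int) floor($i / 7)][] = $calendar->get(IntlCalendar::FIELD_DAY_OF_MONTH);
        $calendar->add(IntlCalendar::FIELD_DAY_OF_MONTH, 1);
    }

    return $weeks;
}

foreach (array('de_DE', 'en_US') as $locale) {
    foreach (buildMonthGrid(2016, 4, $locale) as $week) {
        $cells = array();
        foreach ($week as $day) {
            $cells[] = str_pad((string) $day, 2, ' ', STR_PAD_LEFT);
        }
        echo implode('  ', $cells) . PHP_EOL;
    }
    echo PHP_EOL;
}
```

The weekday headers are left out here. They could be rendered with an IntlDateFormatter pattern such as `EEE`.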

IntlTimeZone

TimeZone


    <?php

    $timezone = IntlTimeZone::createTimeZone('Europe/Berlin');
    $timezone2 = IntlTimeZone::createTimeZone('Europe/Paris');

    var_dump(
        $timezone->getDisplayName(),
        $timezone->hasSameRules($timezone2),
        $timezone->useDaylightTime(),
        $timezone->getRawOffset()
    );
            
Central European Standard Time
false
true
3600000

A good talk about time zones by Andreas Heigl: time is an illusion

Locale

Locale


    <?php

    var_dump(Locale::getDefault());

    Locale::setDefault('de_DE');

    var_dump(Locale::getDefault());
            
en_US_POSIX
de_DE

Locale's Region


    <?php

    var_dump(
        Locale::getDisplayRegion('de_DE', 'de'),
        Locale::getDisplayRegion('de_DE', 'it'),
        Locale::getDisplayRegion('de_DE', 'en')
    );
            
Deutschland
Germania
Germany

Locale's Language


    <?php

    var_dump(
        Locale::getDisplayLanguage('de_DE', 'de'),
        Locale::getDisplayLanguage('de_DE', 'it'),
        Locale::getDisplayLanguage('de_DE', 'en')
    );
            
Deutsch
tedesco
German

Spoofchecker

Can I misread it?


    <?php

    $spoof = new Spoofchecker();

    // are strings visually confusable?
    var_dump(
        $spoof->areConfusable("Körner", "Körner\0"),
        $spoof->areConfusable("Körner", "Korner"),
        $spoof->areConfusable('lol', '1o1'),
        $spoof->areConfusable('lol', 'IoI')
    );
            
false
false
true
true

UConverter

Character Encoding


    <?php

    // "\xF3" is the single Latin-1 byte for "ó"
    $uconv = new UConverter('UTF-8', 'latin-1');
    echo $uconv->convert("coraz\xF3n") . PHP_EOL;

    echo UConverter::transcode("coraz\xF3n", 'UTF-8', 'latin-1');
            
corazón
corazón

"Hochdörfer" won't be a problem anymore

IntlBreakIterator / IntlIterator

IntlBreakIterator


    <?php

    $text = "Si contano i danni. A Pescara, ".
    "1.500 sfollati per l'esondazione del Fosso Vallelunga. ".
    "Dall'inizio dell'anno l'agricoltura ha subito un miliardo ".
    "di euro di danni.";

    $i = IntlBreakIterator::createSentenceInstance('it_IT');
    $i->setText($text);

    foreach($i->getPartsIterator() as $sentence) {
        echo $sentence . PHP_EOL . '----- next -----' .  PHP_EOL;
    }
            
Si contano i danni.
----- next -----
A Pescara, 1.500 sfollati per l'esondazione del Fosso Vallelunga.
----- next -----
Dall'inizio dell'anno l'agricoltura ha subito un miliardo di euro di danni.
----- next -----

Collator

Sorting


    <?php

    $array = array('a', 'g', 'A', 'ß', 'ä', 'j', 'z');

    sort($array);
            
sort = A,a,g,j,z,ß,ä

Sorting with Collator


    <?php

    $array = array('a', 'g', 'A', 'ß', 'ä', 'j', 'z');

    $collator = new Collator('de_DE');
    $collator->setAttribute(Collator::CASE_FIRST, Collator::LOWER_FIRST);
    $collator->sort($array);
            
Collator::sort = a,A,ä,g,j,ß,z

Transliterator

How to spell it


    <?php

    $trans = Transliterator::create('Any-Latin');
    echo $trans->transliterate('こんにちは');
            
kon'nichiha

Transliterator


    <?php

    var_dump(Transliterator::listIDs());
            
array(286) { [0]=> string(11) "ASCII-Latin" [1]=> string(11) "Latin-Arabic" ...
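
IDs from that list can also be chained with a semicolon. A small sketch, assuming the installed ICU build ships both `Any-Latin` and `Latin-ASCII`. The sample string is made up for this illustration.

```php
<?php

// Compound ID: first convert any script to Latin, then strip it down to ASCII.
$trans = Transliterator::create('Any-Latin; Latin-ASCII');

// create() returns null when the ID is unknown to the installed ICU build.
$ascii = $trans !== null
    ? $trans->transliterate('Čeština, Ελληνικά, Hochdörfer')
    : false;

if ($ascii === false) {
    echo intl_get_error_message() . PHP_EOL;
} else {
    echo $ascii . PHP_EOL;
}
```

This can be handy for building ASCII-only slugs or search keys from user input.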

ResourceBundle

Custom Resources


    <?php

    // returns null on error on PHP < 7.0
    $curr = new ResourceBundle('it', __DIR__ . '/resources');

    // get the old German currency (Deutsche Mark)
    $demCurrency = $curr->get('Currencies')
        ->get('DEM');

    echo $demCurrency->get(0) . PHP_EOL;
    echo $demCurrency->get(1) . PHP_EOL;
            
DEM
Marco Tedesco

Custom Resources - it.txt

ICU Data
it{
    Currencies{
        ADP{
            "ADP",
            "Peseta Andorrana",
        }
        ....
        DEM{
            "DEM",
            "Marco Tedesco",
        }
            

Convert Resources for ResourceBundle


    genrb -d /path/to/resources/currency/ it.txt
            
Created resource: it.res

genrb

http://linux.die.net/man/1/genrb

IntlChar

Characters


    <?php

    echo IntlChar::toupper('ä') . PHP_EOL;
    echo IntlChar::tolower('Č') . PHP_EOL;
    echo IntlChar::chr(9730);
            
Ä
č
☂

Characters


    <?php

    var_dump(IntlChar::isUUppercase('A'));
    var_dump(IntlChar::isULowercase('A'));
    var_dump(IntlChar::isdigit(3));
    var_dump(IntlChar::isdigit('3'));
    var_dump(IntlChar::isspace(' '));
            
true
false
false
true
true

Character name


    <?php

    echo IntlChar::charName('Ü');
            
LATIN CAPITAL LETTER U WITH DIAERESIS

Exception on errors

IntlException instead of errors


    <?php

    // Available since PHP 5.5, can also be set in php.ini
    ini_set('intl.use_exceptions', true);

    try {
        $curr = new ResourceBundle('IDontExist', __DIR__);
    } catch (IntlException $e) {
        echo $e->getMessage();
    }
            
resourcebundle_ctor: Cannot load libICU resource bundle

and Intl

Thank you

Claudio Zizza
php.budgegeria.de
@SenseException