International Components for Unicode

Claudio Zizza

Developer Developer Developer
(Currently PHP)

Co-Organizer PHP Usergroup Karlsruhe

Part of Doctrine-Team: doctrine-project.org
PHP Snippets: php.budgegeria.de

Twitter: @SenseException

Mars Climate Orbiter
Importance of Intl for developers

Date representation

What?
es_ES: 21/4/16
en_US: 4/21/16
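
A minimal sketch of where such strings can come from, assuming the date on the slide is 21 April 2016 and that the short date style is used. The exact output depends on the installed ICU version.

```php
<?php

// One fixed date, formatted with the SHORT date style for two locales.
$date = new DateTime('2016-04-21 12:00:00', new DateTimeZone('UTC'));

$results = [];
foreach (['es_ES', 'en_US'] as $locale) {
    $formatter = new IntlDateFormatter(
        $locale,
        IntlDateFormatter::SHORT, // date style
        IntlDateFormatter::NONE,  // no time part
        'UTC'
    );
    $results[$locale] = $formatter->format($date);
    echo $locale . ': ' . $results[$locale] . PHP_EOL;
}
```

Only the locale changes between the two runs; the order of day and month follows from it.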

ICU - International Components for Unicode

icu-project.org

ICU

  • Open Source Project
  • Unicode and Globalization support (C/C++/Java)
  • Released in 1999
  • Sponsored by IBM and others
  • Current version: ICU 66

Intl-Extension Classes

NumberFormatter

1.000,45

1'000.45

1,000.45

1 000,45
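
The four notations above can be reproduced by formatting the same number for different locales. The locale list below is an assumption (the slide does not name them), and the grouping characters for de_CH and fr_FR vary between ICU versions.

```php
<?php

// Assumed locales; the slide only shows the resulting notations.
$locales = ['de_DE', 'de_CH', 'en_US', 'fr_FR'];

$formatted = [];
foreach ($locales as $locale) {
    $numberFormatter = new NumberFormatter($locale, NumberFormatter::DECIMAL);
    $formatted[$locale] = $numberFormatter->format(1000.45);
    echo $locale . ': ' . $formatted[$locale] . PHP_EOL;
}
```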

Format


    <?php

    $numberFormatter = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
    echo $numberFormatter->format(1000.45) . PHP_EOL;

    $numberFormatter = new NumberFormatter('de_CH', NumberFormatter::CURRENCY);
    echo $numberFormatter->format(1000.45) . PHP_EOL;

    echo $numberFormatter->getSymbol(NumberFormatter::CURRENCY_SYMBOL);
            
1'000.45
CHF 1'000.45
CHF

Parser


    <?php

    $numberFormatter = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
    echo $numberFormatter->parse('1\'000.45');
            
1000.45

IntlDateFormatter

August 18, 2017 at 7:00 PM

8/18/17

Aug 18, 2017, 7:00:23 PM GMT

7:00:23 PM



18 августа 2017 г., 19:01

Date & Time Formatter


    <?php

    $dateFormatter = new IntlDateFormatter('it_IT', IntlDateFormatter::LONG,
        IntlDateFormatter::SHORT);

    $date = new DateTime();

    echo $dateFormatter->format($date) . PHP_EOL;
    echo $dateFormatter->format(1390252923);
            
18 aprile 2016 21:51
20 gennaio 2014 22:22

Date & Time Parser


    <?php

    $dateFormatter = new IntlDateFormatter('it_IT', IntlDateFormatter::LONG,
        IntlDateFormatter::SHORT);

    echo $dateFormatter->parse('20 gennaio 2014 22:22');
            
1390252920

MessageFormatter

Formats whole messages


    <?php

    $text = 'Am {dateval,date,full} waren es {visitor,number,integer} Besucher.';

    $msgDe = new MessageFormatter('de_DE', $text);
    $msgCh = new MessageFormatter('de_CH', $text);

    $args = array(
        'visitor' => 1240000,
        'dateval' => new DateTime(),
    );

    echo $msgDe->format($args) . PHP_EOL;
    echo $msgCh->format($args);
            
Am Sonntag, 17. April 2016 waren es 1.240.000 Besucher.
Am Sonntag, 17. April 2016 waren es 1'240'000 Besucher.

MessageFormatter static


    <?php

    $text = 'Am {dateval,date,full} waren es {visitor,number,integer} Besucher.';

    $args = array(
        'visitor' => 1240000,
        'dateval' => new DateTime(),
    );

    echo MessageFormatter::formatMessage('de_DE', $text, $args);
            
Am Sonntag, 17. April 2016 waren es 1.240.000 Besucher.

MessageFormatter

Type        Style
number      (none), integer, currency, percent, (styletext)
date        (none), short, medium, long, full, (styletext)
time        (none), short, medium, long, full, (styletext)
spellout
ordinal
duration
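
A short sketch of some of the less common types in a message pattern. The patterns and values are made up for illustration, and positional arguments ({0}) are used instead of named ones.

```php
<?php

// Illustrative patterns, one per argument type.
$patterns = [
    'spellout' => '{0,spellout}',
    'ordinal'  => '{0,ordinal}',
    'percent'  => '{0,number,percent}',
    'duration' => '{0,duration}',
];

// Illustrative values for each pattern.
$values = [
    'spellout' => 42,
    'ordinal'  => 42,
    'percent'  => 0.25,
    'duration' => 3661, // seconds
];

$out = [];
foreach ($patterns as $type => $pattern) {
    $out[$type] = MessageFormatter::formatMessage('en_US', $pattern, [$values[$type]]);
    echo $type . ': ' . $out[$type] . PHP_EOL;
}
```

With en_US, the spellout line should read "forty-two" and the ordinal line "42nd".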

IntlCalendar

Calendar information


    <?php

    $calendar = IntlCalendar::createInstance('Europe/Berlin', 'de_DE');

    var_dump(
        $calendar->getTime(),
        $calendar->getType(),
        $calendar->isWeekend(),
        $calendar->get(IntlCalendar::FIELD_YEAR),
        $calendar->inDaylightTime()
    );
            
1461093920108
gregorian
false
2016
true

    <?php

    $calendar = IntlCalendar::createInstance('Europe/Berlin', 'de_DE');

    while (false === $calendar->isWeekend()) {
        echo 'All work and no play makes me a dull dev. ';

        $calendar->setTime(IntlCalendar::getNow());
    }

    echo 'Yay.';
            

Build a Calendar

April 2016
Mo.  Di.  Mi.  Do.  Fr.  Sa.  So.
 28   29   30   31    1    2    3
  4    5    6    7    8    9   10
 11   12   13   14   15   16   17
 18   19   20   21   22   23   24
 25   26   27   28   29   30    1

April 2016
Sun  Mon  Tue  Wed  Thu  Fri  Sat
 27   28   29   30   31    1    2
  3    4    5    6    7    8    9
 10   11   12   13   14   15   16
 17   18   19   20   21   22   23
 24   25   26   27   28   29   30
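
One possible way to build such a grid with IntlCalendar. This is a sketch, not the code behind the slide: renderMonth() is a hypothetical helper, and the tab-separated layout is a simplification. The first day of the week comes from the locale.

```php
<?php

// Hypothetical helper (not from the slides): renders one month as a text grid.
function renderMonth(string $locale, int $year, int $month): string
{
    $format = static function (IntlCalendar $when, string $pattern) use ($locale): string {
        $formatter = new IntlDateFormatter(
            $locale, IntlDateFormatter::NONE, IntlDateFormatter::NONE, 'UTC', null, $pattern
        );
        return $formatter->format($when);
    };

    $cal = IntlCalendar::createInstance('UTC', $locale);
    $cal->clear();
    $cal->set(IntlCalendar::FIELD_YEAR, $year);
    $cal->set(IntlCalendar::FIELD_MONTH, $month - 1); // ICU months are zero-based
    $cal->set(IntlCalendar::FIELD_DAY_OF_MONTH, 1);

    $lines = [$format($cal, 'LLLL y')];

    // Step back to the locale's first day of the week (Monday for de_DE, Sunday for en_US).
    $offset = ($cal->get(IntlCalendar::FIELD_DAY_OF_WEEK) - $cal->getFirstDayOfWeek() + 7) % 7;
    $cal->add(IntlCalendar::FIELD_DAY_OF_MONTH, -$offset);

    // Header row: localized weekday abbreviations for one full week.
    $day = clone $cal;
    $header = [];
    for ($i = 0; $i < 7; $i++) {
        $header[] = $format($day, 'EEE');
        $day->add(IntlCalendar::FIELD_DAY_OF_MONTH, 1);
    }
    $lines[] = implode("\t", $header);

    // One row per week, until the calendar has left the requested month.
    do {
        $week = [];
        for ($i = 0; $i < 7; $i++) {
            $week[] = $cal->get(IntlCalendar::FIELD_DAY_OF_MONTH);
            $cal->add(IntlCalendar::FIELD_DAY_OF_MONTH, 1);
        }
        $lines[] = implode("\t", $week);
    } while ($cal->get(IntlCalendar::FIELD_MONTH) === $month - 1);

    return implode(PHP_EOL, $lines) . PHP_EOL;
}

echo renderMonth('de_DE', 2016, 4) . PHP_EOL;
echo renderMonth('en_US', 2016, 4);
```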

IntlTimeZone

TimeZone


    <?php

    $timezone = IntlTimeZone::createTimeZone('Europe/Berlin');
    $timezone2 = IntlTimeZone::createTimeZone('Europe/Paris');

    var_dump(
        $timezone->getDisplayName(),
        $timezone->hasSameRules($timezone2),
        $timezone->useDaylightTime(),
        $timezone->getRawOffset()
    );
            
Central European Standard Time
false
true
3600000
Good talk about timezones by Andreas Heigl: time is an illusion

Locale

Locale


    <?php

    var_dump(Locale::getDefault());

    Locale::setDefault('de_DE');

    var_dump(Locale::getDefault());
            
en_US_POSIX
de_DE

Locale's Region


    <?php

    var_dump(
        Locale::getDisplayRegion('de_DE', 'de'),
        Locale::getDisplayRegion('de_DE', 'it'),
        Locale::getDisplayRegion('de_DE', 'en')
    );
            
Deutschland
Germania
Germany

Locale's Language


    <?php

    var_dump(
        Locale::getDisplayLanguage('de_DE', 'de'),
        Locale::getDisplayLanguage('de_DE', 'it'),
        Locale::getDisplayLanguage('de_DE', 'en')
    );
            
Deutsch
tedesco
German
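
Besides display names, Locale can also take an identifier apart. A small sketch; the identifiers 'de_CH' and 'sr_Latn_RS' are just sample input.

```php
<?php

$locale = 'de_CH';

// Individual subtags of a locale identifier.
$language = Locale::getPrimaryLanguage($locale);
$region   = Locale::getRegion($locale);

// All subtags at once, as an associative array.
$parts = Locale::parseLocale('sr_Latn_RS');

var_dump($language, $region, $parts);
```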

Spoofchecker

'lol' === 'IoI'

// false

Can I misread it?


    <?php

    $spoof = new Spoofchecker();

    // are strings visually confusable?
    var_dump(
        $spoof->areConfusable("Körner", "Körner\0"),
        $spoof->areConfusable("Körner", "Korner"),
        $spoof->areConfusable('lol', '1o1'),
        $spoof->areConfusable('lol', 'IoI')
    );
            
false
false
true
true

UConverter

Character Encoding


    <?php

    $uconv = new UConverter('UTF-8', 'latin-1');
    // "coraz\xF3n" is "corazón" encoded as Latin-1 (0xF3 = ó)
    echo $uconv->convert("coraz\xF3n") . PHP_EOL;

    echo UConverter::transcode("coraz\xF3n", 'UTF-8', 'latin-1');
            
corazón
corazón

IntlBreakIterator / IntlIterator

IntlBreakIterator


    <?php

    $text = "Si contano i danni. A Pescara, ".
    "1.500 sfollati per l'esondazione del Fosso Vallelunga. ".
    "Dall'inizio dell'anno l'agricoltura ha subito un miliardo ".
    "di euro di danni.";

    $i = IntlBreakIterator::createSentenceInstance('it_IT');
    $i->setText($text);

    foreach($i->getPartsIterator() as $sentence) {
        echo $sentence . PHP_EOL . '----- next -----' .  PHP_EOL;
    }
            
Si contano i danni.
----- next -----
A Pescara, 1.500 sfollati per l'esondazione del Fosso Vallelunga.
----- next -----
Dall'inizio dell'anno l'agricoltura ha subito un miliardo di euro di danni.
----- next -----

Collator

Sorting


    <?php

    $array = array('a', 'g', 'A', 'ß', 'ä', 'j', 'z');

    sort($array);
            
sort = A,a,g,j,z,ß,ä

Sorting with Collator


    <?php

    $array = array('a', 'g', 'A', 'ß', 'ä', 'j', 'z');

    $collator = new Collator('de_DE');
    $collator->setAttribute(Collator::CASE_FIRST, Collator::LOWER_FIRST);
    $collator->sort($array);
            
Collator::sort = a,A,ä,g,j,ß,z

Transliterator

How to spell it


    <?php

    $trans = Transliterator::create('Any-Latin');
    echo $trans->transliterate('こんにちは');
            
kon'nichiha

Transliterator


    <?php

    var_dump(Transliterator::listIDs());
            
array(286) { [0]=> string(11) "ASCII-Latin" [1]=> string(11) "Latin-Arabic" ...
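
IDs can be chained with semicolons. A common use is stripping accents, for example for URL slugs. This is a sketch; the compound ID below is one possible combination, not taken from the slides.

```php
<?php

// Chain: anything to Latin script, then Latin to plain ASCII, then lowercase.
$trans = Transliterator::create('Any-Latin; Latin-ASCII; Lower()');

$slug = $trans->transliterate('Körner');
echo $slug;
```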

ResourceBundle

Custom Resources


    <?php

    // returns null on error on PHP < 7.0
    $curr = new ResourceBundle('it', __DIR__ . '/resources');

    // get the former German currency (Deutsche Mark)
    $demCurrency = $curr->get('Currencies')
        ->get('DEM');

    echo $demCurrency->get(0) . PHP_EOL;
    echo $demCurrency->get(1) . PHP_EOL;
            
DEM
Marco Tedesco

Custom Resources - it.txt

it{
    Currencies{
        ADP{
            "ADP",
            "Peseta Andorrana",
        }
        ....
        DEM{
            "DEM",
            "Marco Tedesco",
        }
            

Convert Resources for ResourceBundle


    genrb -d /path/to/resources/currency/ it.txt
            
Created resource: it.res

genrb

http://linux.die.net/man/1/genrb

IntlChar

Characters


    <?php

    echo IntlChar::toupper('ä') . PHP_EOL;
    echo IntlChar::tolower('Č') . PHP_EOL;
    echo IntlChar::chr(9730); // code point U+2602
            
Ä
č
☂

Characters


    <?php

    var_dump(IntlChar::isUUppercase('A'));
    var_dump(IntlChar::isULowercase('A'));
    var_dump(IntlChar::isdigit(3));
    var_dump(IntlChar::isdigit('3'));
    var_dump(IntlChar::isspace(' '));
            
true
false
false
true
true

Character name


    <?php

    echo IntlChar::charName('Ü');
            
LATIN CAPITAL LETTER U WITH DIAERESIS

Exception on errors

IntlException instead of errors


    <?php

    // Setting available since PHP 5.5; can also be set in php.ini
    ini_set('intl.use_exceptions', true);

    try {
        $curr = new ResourceBundle('IDontExist', __DIR__);
    } catch (IntlException $e) {
        echo $e->getMessage();
    }
            
resourcebundle_ctor: Cannot load libICU resource bundle
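
For comparison, a sketch of the same kind of check without exceptions. The failing input is made up, and intl.use_exceptions is switched off explicitly so the example does not depend on php.ini.

```php
<?php

ini_set('intl.use_exceptions', '0');

$formatter = new NumberFormatter('en_US', NumberFormatter::DECIMAL);

// parse() returns false on failure; details are kept on the formatter object.
$result = $formatter->parse('not a number');

if ($result === false) {
    echo intl_error_name($formatter->getErrorCode()) . ': '
        . $formatter->getErrorMessage() . PHP_EOL;
}
```

Every return value has to be checked by hand here, which is what the exception setting avoids.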

So much code...

Still awake?

Intl-Format

https://github.com/SenseException/intl-format

Similar to sprintf


    <?php

    $intlFormat = (new Budgegeria\IntlFormat\Factory())->createIntlFormat('en_US');

    $date = new DateTime();
    $number = 1002.25;

    echo $intlFormat->format('At %time_short the value was %number', $date, $number);
            
At 5:30 AM the value was 1,002.25

Available formats in Intl-Format

  • Numbers
  • Date and time
  • Timezone
  • Locale
  • Custom types by you

and Intl

Thank you

Claudio Zizza
@SenseException