[Architecture] [Accessforall] Codes for languages in ISO 24751 and the registry

Andy Heath andyheath at axelrod.plus.com
Thu Oct 4 09:15:09 EDT 2012


Just a slight modification ..

I'm led to believe the solution of choice is to use either 639-2 or 
639-3 as appropriate. 639-3 seems to be a slight improvement on 639-2 
(unless one needs bibliographic codes) in that, as I understand it, 
where there is a group language in part 2 (such as Arabic) that has no 
specific versions, it's included in part 3 as a specific language rather 
than a group language, but where there are specific versions the general 
language is omitted.  I am led to believe there are more languages 
included in part 3 than in part 2, but I don't know how important the 
extra ones are.  I think that for all practical purposes at this point 
there won't be any differences between 639-2 and 639-3. It's something 
that would be easy to change later, but it's something to watch for.

There is also the question of using codes that aren't registered at all 
(maybe that's "yet" or maybe it's not).  There is another IETF guideline 
which provides some best practices on this, on extended codes and so on 
(a good bedtime read for the geeks out there):

http://tools.ietf.org/html/bcp47
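
As a rough illustration of the kind of tag that guideline describes, 
here is a minimal Python sketch that splits a tag into a language subtag 
and an optional region subtag. The function name split_tag and the 
simplified pattern are assumptions made for illustration; the real 
BCP 47 grammar also covers scripts, variants, extensions and private-use 
subtags, none of which are handled here.

```python
import re

# Simplified shape only: a 2- or 3-letter language subtag, optionally
# followed by a hyphen and a 2-letter region subtag.
# This is an assumption for illustration, NOT the full BCP 47 grammar.
_SIMPLE_TAG = re.compile(r"^([A-Za-z]{2,3})(?:-([A-Za-z]{2}))?$")


def split_tag(tag):
    """Split e.g. 'nl-BE' into ('nl', 'BE'); the region may be None."""
    match = _SIMPLE_TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"not a simple language[-region] tag: {tag!r}")
    language = match.group(1).lower()
    region = match.group(2)
    if region is not None:
        region = region.upper()
    return language, region
```

With this, "nl-BE" splits into ("nl", "BE") and "eng-CA" into 
("eng", "CA"). The sketch only checks the shape of the tag, not whether 
the subtags are actually registered.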

My point is that this is a slightly moving target that may evolve a 
little, but 639-2, augmented with 639-3 if needed, would do the job for 
now, though possibly not forever.
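
To make "639-2 augmented with 639-3 if needed" a bit more concrete, here 
is a minimal sketch, assuming we normalise everything to the 
three-letter 639-2/T form and accept a 639-3 code only when 639-2 has 
nothing for that language. The tables are a tiny illustrative subset, 
and the names (to_three_letter and the four tables) are hypothetical, 
not anything that exists in the GPII code.

```python
# Tiny illustrative subsets; the real ISO 639 tables are much larger.
ISO_639_1_TO_2T = {"en": "eng", "nl": "nld", "fr": "fra"}  # two-letter -> 639-2/T
ISO_639_2B_TO_2T = {"dut": "nld", "fre": "fra"}            # bibliographic -> terminology
ISO_639_2T = {"eng", "nld", "fra"}                          # accepted as-is
ISO_639_3_EXTRA = {"yue"}                                   # in 639-3 but not in 639-2


def to_three_letter(code):
    """Normalise a language code to 639-2/T, falling back to 639-3."""
    c = code.strip().lower()
    if c in ISO_639_1_TO_2T:
        return ISO_639_1_TO_2T[c]
    if c in ISO_639_2B_TO_2T:
        return ISO_639_2B_TO_2T[c]
    if c in ISO_639_2T:
        return c
    if c in ISO_639_3_EXTRA:
        # Fallback: only reached when 639-2 has no code for the language.
        return c
    raise ValueError(f"unknown language code: {code!r}")
```

Keeping the 639-3 codes in a separate table means the fallback is easy 
to spot, and easy to change later if the choice of standard moves.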

andy
> OK
>
> Does anyone want to SPEAK AGAINST doing as Colin outlined, which seems
> to be in line with everyone else's comments?
>
> If so, please post any counter thoughts in the next few days. We have
> everyone, I think, on the two lists attached, so we can make a
> decision if there are no counter proposals to consider.
>
> thanks
>
> /Gregg/
> --------------------------------------------------------
> Gregg Vanderheiden Ph.D.
> Director Trace R&D Center
> Professor Industrial & Systems Engineering
> and Biomedical Engineering
> University of Wisconsin-Madison
>
> Technical Director - Cloud4all Project - http://Cloud4all.info
> Co-Director, Raising the Floor - International
> and the Global Public Inclusive Infrastructure Project
> http://Raisingthefloor.org   --- http://GPII.net
>
> On Oct 3, 2012, at 10:44 PM, Colin Clark <colinbdclark at gmail.com> wrote:
>
>> Hi all,
>>
>> We should be using ISO 639-2 language codes throughout the system. If
>> not, it's a bug.
>>
>> If I remember correctly, this was probably introduced by the UI
>> Options team who were integrating at very short notice with the GPII
>> framework. I believe UI Options can support both two- and
>> three-character language codes (as is often the case).
>>
>> As a speaker of "eng-CA", I don't see any reason not to simply use ISO
>> 639-2 from the start and to also support country codes, as Christophe
>> suggests. I also think it's probably worth supporting the
>> two-character subset for interoperability if possible.
>>
>> Colin
>>
>> On 2012-10-03, at 1:18 PM, Gregg Vanderheiden wrote:
>>
>>> I think that having language and country codes is a great idea.
>>>
>>> We DO need to decide which codes to use.  I think the square brackets
>>> were because an official decision was not made yet.
>>>
>>> But I think using the ISO codes for both would be the right thing to
>>> do.  I added the arch list to see if someone knows  why two letter
>>> codes are currently used.  (W3C?)
>>>
>>> We also should say something like  "if no country is specified then
>>> ...."
>>> (is there a default country for all languages specified somewhere?)
>>> we might say the country of origin -- but I'm not sure all languages
>>> have an (existing) country of origin anymore.
>>>
>>> Good catch Christophe.
>>> Let's get a decision and then record it in the Glossary.
>>>
>>> I wonder if we should have a decision registry somewhere since we
>>> have so many people involved.
>>>
>>>
>>> Gregg
>>>
>>> On Oct 3, 2012, at 11:43 AM, Christophe Strobbe
>>> <christophestrobbe at yahoo.co.uk> wrote:
>>>
>>>> Hi,
>>>>
>>>> While creating a preference set for one of the personas in the
>>>> Cloud4all smarthouse simulation
>>>> <http://wiki.gpii.net/index.php/SmartHouses_Preference_Sets>, I
>>>> looked into language codes and found the following:
>>>> (1) ISO/IEC 24751:2008 (all subparts) refer to ISO 639-2:1998 for
>>>> language codes. In the registry, the value space for "language" is
>>>> [ISO 639-2/T] (I don't know the reason for the square brackets).
>>>> According to <https://en.wikipedia.org/wiki/List_of_ISO_639-2_codes>
>>>> and <http://www.loc.gov/standards/iso639-2/php/code_list.php>, the
>>>> ISO 639-2 codes are three-letter codes (e.g. "eng" for English,
>>>> "dut" or "nld" for Dutch, "fre" or "fra" for French, etc). However,
>>>> the JSON preference sets I've seen so far (I mean those by the
>>>> GPII/Cloud4all Architecture team) use two-letter codes (see Carla's,
>>>> Nisha's and Timothy's preference sets). Am I misreading the
>>>> information I found about ISO 639-2?
>>>> (2) Related to this is the absence of country information, i.e.
>>>> combining a language code with a country code from ISO 3166 (see
>>>> <http://www.loc.gov/standards/iso639-2/faq.html#22>). This is
>>>> relevant to text-to-speech engines and Braille. For example for
>>>> Dutch, not many people in Flanders are keen on TTS that uses
>>>> pronunciation rules from the Netherlands. Braille conventions also
>>>> vary between countries that use the same official language (well,
>>>> they even vary between Braille centres, but let's not go into that).
>>>> (3) Note that IETF RFC 4646 <http://tools.ietf.org/html/rfc4646>
>>>> gives preference to the shortest ISO 639 code (two or three letters)
>>>> that is available for a language (check the ABNF syntax under
>>>> <http://tools.ietf.org/html/rfc4646#section-2.1>). This base code
>>>> can then be combined with an ISO 3166 country code, to create tags
>>>> like en-US (American English) and en-GB (British English). However,
>>>> IETF RFC 4646 is referenced neither by ISO 24751 nor by the registry.
>>>>
>>>> Best regards,
>>>>
>>>> Christophe Strobbe
>>>>
>>>> _______________________________________________
>>>> Accessforall mailing list
>>>> Accessforall at fluidproject.org
>>>> http://lists.idrc.ocad.ca/cgi-bin/mailman/listinfo/accessforall
>>>
>>
>> ---
>> Colin Clark
>> Technical Lead, Fluid Project
>> http://fluidproject.org
>>
>



Cheers

andy
-- 
__________________
Andy Heath
http://axelafa.com


