Re: [Ltru] Punjabi

"Mark Davis" <mark.davis@icu-project.org> Wed, 14 March 2007 16:19 UTC

Message-ID: <30b660a20703140919n5332348ha9beb1ccf1b02ba8@mail.gmail.com>
Date: Wed, 14 Mar 2007 09:19:18 -0700
From: Mark Davis <mark.davis@icu-project.org>
To: "Sarmad Hussain, Dr." <sarmad.hussain@nu.edu.pk>
Subject: Re: [Ltru] Punjabi
In-Reply-To: <3B848F4FAFB98A43A09D301DAA62A77809F67254@host210-2-148-28.lhr.dancom.net.pk>
MIME-Version: 1.0
References: <3B848F4FAFB98A43A09D301DAA62A77809F67254@host210-2-148-28.lhr.dancom.net.pk>
Cc: LTRU Working Group <ltru@ietf.org>, Nayyara Karamat <Nayyara.Karamat@nu.edu.pk>, iso639-2@loc.gov, sukhjinder_sidhu@hotmail.com, rick@unicode.org, ISO639-3@sil.org, Abbas Malik <abbas.malik@gmail.com>

Thanks for your responses. The key question we are faced with (in the Unicode
CLDR project) is what language tags (aka locale codes) to use for what is
called "Panjabi" in various regions. It appears that according to the
meaning of the codes as defined by ISO 639 and thus BCP 47, we should be
using:

   - pa-IN for "Panjabi as generally used in India" -- and pa would
   customarily use the Guru (Gurmukhi) script, so that would be suppressed in
   the tag
   - lah-PK for "Panjabi as generally used in Pakistan" -- and lah would
   customarily use the Arab (Arabic) script, so that would be suppressed in the
   tag. (We may need further subdivisions of lah, which we'll be able to
   express once BCP 47 incorporates ISO 639-3 this year.)
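
As a minimal sketch of the suppression rule in the two bullets above: the
script subtag is left out of the tag when it is the script the language
customarily uses, and kept otherwise. The DEFAULT_SCRIPT table and the
make_tag helper are hypothetical names for illustration only; the table holds
just the two cases from this message and is not registry data.

```python
# Hypothetical table of customary scripts, taken from the two cases in this
# message; it is not the IANA registry's Suppress-Script data.
DEFAULT_SCRIPT = {"pa": "Guru", "lah": "Arab"}


def make_tag(language, script=None, region=None):
    """Build a language tag, omitting the script when it is the customary one."""
    parts = [language.lower()]
    if script:
        script = script.title()  # script subtags are conventionally title-cased
        if DEFAULT_SCRIPT.get(parts[0]) != script:
            parts.append(script)
    if region:
        parts.append(region.upper())  # region subtags are conventionally upper-cased
    return "-".join(parts)


print(make_tag("pa", "Guru", "IN"))   # pa-IN
print(make_tag("lah", "Arab", "PK"))  # lah-PK
print(make_tag("pa", "Arab", "PK"))   # pa-Arab-PK
```

The third call shows the non-default case: 'pa' written in Arabic script keeps
its script subtag, so it stays distinguishable from plain pa-PK.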

This doesn't mean that there aren't pa-PK users or lah-IN users -- we are
just focusing above on the largest groupings, initially. We have been using
various combinations of 'pa' with 'Arab', 'Guru', 'PK', and 'IN', but want
to make any fixes we need now, since we are starting data submission for the
next version of CLDR: see http://unicode.org/press/pr-cldr1.5s.html.

We don't have a particular axe to grind here; our concern is that we are
using the right codes to denote the right entities, according to the
relevant standards. See also the ISO 639-3 Registration Authority's website:

   - http://www.sil.org/iso639-3/
   - http://www.sil.org/iso639-3/documentation.asp?id=lah
   - http://www.sil.org/iso639-3/documentation.asp?id=pan

See also http://www.w3.org/International/articles/language-tags/ for
information about BCP 47.

Mark

On 3/14/07, Abbas Malik <abbas.malik@gmail.com> wrote:
>
>  Dear All,
>
>
>
> Firstly, if we are going to categorize a language according to the writing
> systems that it has, then Punjabi can be categorized in two ways.
> One is Gurmukhi (the script used for writing Punjabi in India, a derivation
> of old Indian scripts like SHARDA, LANDHA and TAKRI) and the other is
> Shahmukhi (a derivation of the Perso-Arabic script, used for writing Punjabi
> in Pakistan). Gurmukhi is coded as PAN. How should Shahmukhi be coded? For
> more details on the differences between these scripts of Punjabi, please look
> at the paper "Punjabi Machine Transliteration"
> http://www.puran.info/pub/mgam06-01.pdf. For a standard code page for
> Shahmukhi script, please look at the paper "Towards a Unicode Compatible
> Punjabi Character Set" http://www.puran.info/pub/mgam05-01.pdf. Punjabi in
> India is also written in Devanagari (the Hindi script). Nowadays, on the
> internet, Punjabi has also started to be written in Roman script. Thus,
> this is not a good way to categorize a language that is written in two or
> more different ways. How should this phenomenon be handled
> in standards like ISO 639-3?
>
>
>
> Now to the question of dialects: one can find different Punjabi
> dialects in Indian Punjab too. For example, the language spoken in Amritsar
> (a city of Indian Punjab on the border with Pakistan) is a little different
> from the language spoken in Jhalandhr (a city of Indian Punjab). Spoken
> Punjabi in Amritsar is very close to the spoken Punjabi in Lahore. If we look
> at Pakistani Punjab, the dialects are so far apart that sometimes a person
> who can understand one dialect may not be able to understand another dialect
> 100%. The major dialects of Punjabi are Majhi (spoken in central Punjab, main
> city Lahore), Potwari (spoken in northern Punjab, main city Jhehlem), and
> Seraiki (spoken in southern Punjab, main city Multan). Interestingly, all of
> these main dialects are represented with different codes in the ISO 639-3
> standard: Majhi with PNB (Punjabi, western), Potwari with PMU (Punjabi,
> Mirpuri) and Seraiki with SKR (Seraiki). PAN, PNB, PMU and SKR all represent
> the language Punjabi. I am not an expert on the coding, but I think that this
> does not look good. There should be some way to represent subcategories
> within one language.
>
>
>
> I have tried to make things clear. Please do not hesitate to ask more questions.
> I would be very happy to answer.
>
>
>
> Best regards,
>
>
>
> ---
>
> *M. G. Abbas MALIK*
>
> Doctorant à l'ED MSTII,
>
> Univ. Joseph Fourier
>
> GETALP - LIG, IMAG - campus,
>
> BP53675 385,  rue de la Bibliothèque
>
> 38041 Grenoble Cedex 9, France
>
> Tel:         +33 (0) 4 76 51 48 17
>
> Fax:         +33 (0) 4 76 44 66 75
>
> Mob:      +33 (0) 6 74 50 46 01
>
> Mel:        Abbas.Malik at imag.fr, abbas.malik at gmail.com
>
> Url:         www.puran.info
>

On 3/13/07, Sarmad Hussain, Dr. <sarmad.hussain@nu.edu.pk> wrote:
>
>  There are many more dialects of Punjabi, depending on the region within
> Punjab in Pakistan.  What is spoken in Sargodha is much different from what
> is spoken in Lahore, etc.  However, most agree that they are speaking
> Punjabi.  There is some difference in vocabulary, but the real difference is in
> the pronunciation.  If a locale is to be sensitive to these dimensions of a
> language, then multiple codes need to be put in.  However, if a locale is just
> identifying the language, not the dialect (sub-language? as in some cases the
> dialects may not be mutually intelligible), then a single locale would
> do.  I am not sure what level a locale is designed to serve.  Could anybody
> else further elaborate on this?  I am cc:ing a couple of other people, in
> case they want to comment.
>
>
>
> We had looked at the written version of Punjabi in Pakistan (also called
> Shahmukhi) for standardization purposes, and there seems to be less variety
> at this level.
>
>
>
> Regards,
> Sarmad
>
>
>  ------------------------------
>
> *From:* Don Osborn [mailto:dzo@bisharat.net]
> *Sent:* Wednesday, March 14, 2007 12:55 AM
> *To:* 'Mark Davis'; 'LTRU Working Group'; ISO639-3@sil.org;
> iso639-2@loc.gov
> *Cc:* Sarmad Hussain, Dr.
> *Subject:* RE: [Ltru] Punjabi
>
>
>
> Hi Mark. An addendum to your question would be what they write. Might
> there be a pa-PK written standard? I don't know, just asking.
>
>
>
> I will take the liberty of cc'ing the question to Dr. Sarmad Hussain of
> the National University of Computer and Emerging Sciences in Lahore, who
> also heads the PAN L10n project in Asia ( http://www.panl10n.net ), in
> case he has any thoughts.
>
>
>
> Don
>
>
>
>
>
> *From:* Mark Davis [mailto:mark.davis@icu-project.org]
> *Sent:* Tuesday, March 13, 2007 3:12 PM
> *To:* LTRU Working Group; ISO639-3@sil.org; iso639-2@loc.gov
> *Subject:* [Ltru] Punjabi
>
>
>
> I have a question about Punjabi. ISO 639-2 gives "pan" as Punjabi. ISO
> 639-3 divides Punjabi into three separate codes:
>
> pmu    Mirpur Panjabi
> pnb    Western Panjabi
> pan    Panjabi // called Eastern Panjabi in the Ethnologue.
>
> It looks from this that according to ISO 639-3, there is no macrolanguage
> for Panjabi; Pakistanis don't speak "pan" (= "pa"), even as a macrolanguage;
> they speak something else. So a language pa-PK (or locale pa_PK) is probably
> a mistake. Is this a fair statement?
>
> --
> Mark
>



-- 
Mark