RE: [Ltru] Punjabi

Sukhjinder Sidhu <sukhjinder_sidhu@hotmail.com> Wed, 14 March 2007 22:04 UTC

From: Sukhjinder Sidhu <sukhjinder_sidhu@hotmail.com>
To: Mark Davis <mark.davis@icu-project.org>, "Sarmad Hussain, Dr." <sarmad.hussain@nu.edu.pk>
Subject: RE: [Ltru] Punjabi
Date: Wed, 14 Mar 2007 16:51:08 +0000
Cc: LTRU Working Group <ltru@ietf.org>, Nayyara Karamat <nayyara.karamat@nu.edu.pk>, iso639-2@loc.gov, rick@unicode.org, iso639-3@sil.org, Abbas Malik <abbas.malik@gmail.com>

Mark, I think I'm going to have to disagree here.  The typical, standard dialect of Punjabi (well, as 'standard' as it has ever gotten) is Maajhi.  This is spoken in both Lahore and Amritsar.  Lahnda as a description for the language tends to drift more towards Seraiki.  However, if we look at Ethnologue:
 
http://www.ethnologue.com/show_language.asp?code=pan
http://www.ethnologue.com/show_language.asp?code=pnb

We see that Maajhi is mentioned under both Eastern (Gurmukhi) and Western (Shahmukhi) Punjabi.  "Standard" Punjabi is the same language on both sides of the border.  Dogri (Pahari), Hindko, Mirpuri, and Seraiki (Multani) have never been considered standard Punjabi dialects, and some of these (Dogri and Seraiki, for example) are already recognised as independent languages.
 
I personally think we should keep both pa-IN (pa-Guru) and pa-PK (pa-Arab) for standard Punjabi, while allowing for the Lahnda languages (Hindko, Seraiki, etc.) to be defined later with their own language codes.
 
Regards,
Sukhjinder Sidhu


Date: Wed, 14 Mar 2007 09:19:18 -0700
From: mark.davis@icu-project.org
To: sarmad.hussain@nu.edu.pk
Subject: Re: [Ltru] Punjabi
CC: dzo@bisharat.net; ltru@ietf.org; ISO639-3@sil.org; iso639-2@loc.gov; abbas.malik@gmail.com; Nayyara.Karamat@nu.edu.pk; sukhjinder_sidhu@hotmail.com; rick@unicode.org

Thanks for your responses.

The key question we are faced with (in the Unicode CLDR project) is what language tags (aka locale codes) to use for what is called "Panjabi" in various regions. It appears that according to the meaning of the codes as defined by ISO 639 and thus BCP 47, we should be using:

pa-IN for "Panjabi as generally used in India" -- and pa would customarily use the Guru (Gurmukhi) script, so that would be suppressed in the tag
lah-PK for "Panjabi as generally used in Pakistan" -- and lah would customarily use the Arab (Arabic) script, so that would be suppressed in the tag. (We may need further subdivisions of lah, which we'll be able to express once BCP 47 incorporates ISO 639-3 this year.)

This doesn't mean that there aren't pa-PK users or lah-IN users -- we are just focusing above on the largest groupings, initially. We have been using various combinations of 'pa' with 'Arab', 'Guru', 'PK', and 'IN', but want to make any fixes we need now, since we are starting data submission for the next version of CLDR: see http://unicode.org/press/pr-cldr1.5s.html.

We don't have a particular axe to grind here; our concern is that we are using the right codes to denote the right entities, according to the relevant standards. See also the ISO 639-3 Registration Authority's website:

http://www.sil.org/iso639-3/
http://www.sil.org/iso639-3/documentation.asp?id=lah
http://www.sil.org/iso639-3/documentation.asp?id=pan

See also http://www.w3.org/International/articles/language-tags/ for information about BCP 47.

Mark
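
As a rough illustration of the script suppression described in the message above, the following Python sketch builds a language-script-region tag and drops the script subtag when it matches the language's customary script. The names `DEFAULT_SCRIPT` and `make_tag` are hypothetical, and the default scripts are simply the ones assumed in the message (Guru for pa, Arab for lah); the authoritative values would come from the language subtag registry, not from this table.

```python
# Customary scripts as assumed in the message above (illustrative only;
# not taken from the actual language subtag registry).
DEFAULT_SCRIPT = {"pa": "Guru", "lah": "Arab"}


def make_tag(language, script=None, region=None):
    """Build a language[-script][-region] tag, omitting a default script."""
    language = language.lower()
    parts = [language]
    if script:
        script = script.title()  # script subtags are conventionally title case
        # Only keep the script if it is not the customary one for the language.
        if DEFAULT_SCRIPT.get(language) != script:
            parts.append(script)
    if region:
        parts.append(region.upper())  # region subtags are conventionally upper case
    return "-".join(parts)


print(make_tag("pa", "Guru", "IN"))   # Gurmukhi is the assumed default for pa
print(make_tag("pa", "Arab", "PK"))   # non-default script is kept
print(make_tag("lah", "Arab", "PK"))  # Arabic is the assumed default for lah
```

Under these assumptions, Punjabi in Gurmukhi for India comes out as pa-IN, while Punjabi in the Arabic script for Pakistan keeps its script subtag as pa-Arab-PK.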
On 3/14/07, Abbas Malik <abbas.malik@gmail.com> wrote: 



Dear All,
 
Firstly, if we are going to categorize a language according to its writing systems, then Punjabi can be categorized in two ways. One is Gurmukhi (the script used for writing Punjabi in India, derived from old Indian scripts such as SHARDA, LANDHA and TAKRI) and the other is Shahmukhi (a derivation of the Perso-Arabic script, used for writing Punjabi in Pakistan). Gurmukhi is coded as PAN. How should Shahmukhi be coded? For more details on the differences between these scripts, please look at the paper "Punjabi Machine Transliteration" http://www.puran.info/pub/mgam06-01.pdf. For a standard code page for the Shahmukhi script, please look at the paper "Towards a Unicode Compatible Punjabi Character Set" http://www.puran.info/pub/mgam05-01.pdf. Punjabi in India is also written in Devanagari (the Hindi script), and nowadays Punjabi has also started to be written in Roman script on the internet. Thus, categorizing by script is not a good approach for a language that is written in two or more different ways. How should this phenomenon be handled in standards like ISO 639-3?
 
Now, coming to the question of dialects: one can find different Punjabi dialects in Indian Punjab as well. For example, the language spoken in Amritsar (a city of Indian Punjab on the border with Pakistan) is a little different from the language spoken in Jhalandhr (another city of Indian Punjab). Spoken Punjabi in Amritsar is very close to the spoken Punjabi in Lahore. In Pakistani Punjab, the dialects are so far apart that sometimes a person who understands one dialect may not be able to fully understand another. The major dialects of Punjabi are Majhi (spoken in central Punjab, main city Lahore), Potwari (spoken in northern Punjab, main city Jhehlem), and Seraiki (spoken in southern Punjab, main city Multan). Interestingly, all of these main dialects are represented by different codes in the ISO 639-3 standard: Majhi by PNB (Punjabi, Western), Potwari by PMU (Punjabi, Mirpuri) and Seraiki by SKR (Seraiki). PAN, PNB, PMU and SKR all represent the language Punjabi. I am not an expert on the coding, but I think that this does not look good. There should be some way to define subcategories within one language.
 
I have tried to make things clear. Please do not hesitate to ask more questions. I would be very happy to answer.
 
Best regards,
 
---
M. G. Abbas MALIK
Doctorant à l'ED MSTII,
Univ. Joseph Fourier
GETALP - LIG, IMAG - campus,
BP53675 385,  rue de la Bibliothèque
38041 Grenoble Cedex 9, France
Tel:         +33 (0) 4 76 51 48 17
Fax:         +33 (0) 4 76 44 66 75
Mob:      +33 (0) 6 74 50 46 01
Mel:        Abbas.Malik at imag.fr , abbas.malik at gmail.com
Url:         www.puran.info 
On 3/13/07, Sarmad Hussain, Dr. <sarmad.hussain@nu.edu.pk> wrote: 



There are many more dialects of Punjabi, depending on the region within Punjab in Pakistan.  What is spoken in Sargodha is quite different from what is spoken in Lahore, etc.  However, most agree that they are speaking Punjabi.  There is some difference in vocabulary, but the real difference is in the pronunciation.  If a locale is to be sensitive to these dimensions of a language, then multiple codes need to be put in.  However, if a locale just identifies the language and not the dialect (sub-language? as in some cases the dialects may not be mutually understandable), then a single locale would do.  I am not sure what level a locale is designed to serve.  Could anybody else elaborate further on this?  I am cc:ing a couple of other people, in case they want to comment.
 
We had looked at the written version of Punjabi in Pakistan (also called Shahmukhi) for standardization purposes, and there seems to be less variety at this level.  
 
Regards,
Sarmad
 




From: Don Osborn [mailto:dzo@bisharat.net]
Sent: Wednesday, March 14, 2007 12:55 AM
To: 'Mark Davis'; 'LTRU Working Group'; ISO639-3@sil.org; iso639-2@loc.gov
Cc: Sarmad Hussain, Dr.
Subject: RE: [Ltru] Punjabi
 
Hi Mark. An addendum to your question would be what they write. Might there be a pa-PK written standard? I don't know, just asking. 
 
I will take the liberty of cc'ing the question to Dr. Sarmad Hussain of the National University of Computer and Emerging Sciences in Lahore, who also heads the PAN L10n project in Asia ( http://www.panl10n.net ), in case he has any thoughts. 
 
Don
 
 



From: Mark Davis [mailto:mark.davis@icu-project.org]
Sent: Tuesday, March 13, 2007 3:12 PM
To: LTRU Working Group; ISO639-3@sil.org; iso639-2@loc.gov
Subject: [Ltru] Punjabi
 
I have a question about Punjabi. ISO 639-2 gives "pan" as Punjabi. ISO 639-3 divides Punjabi into three separate codes:

pmu    Mirpur Panjabi
pnb    Western Panjabi
pan    Panjabi // called Eastern Panjabi in the Ethnologue.

It looks from this that according to ISO 639-3, there is no macro language for Panjabi; Pakistanis don't speak "pan" (= "pa"), even as a macro language -- they speak something else. So a language pa-PK (or locale pa_PK) is probably a mistake. Is this a fair statement?

-- Mark
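
To make the question concrete, here is a small Python sketch that splits a tag or locale such as pa-PK or pa_PK into its subtags and looks up the ISO 639-3 name for the language part. The function names (`split_tag`, `describe`) and both tables are hypothetical and contain only the codes listed in this message, plus the "pa" = "pan" equivalence it mentions; this is not a full BCP 47 parser.

```python
# Only the three ISO 639-3 codes named in the message; not a complete table.
ISO_639_3_NAMES = {
    "pmu": "Mirpur Panjabi",
    "pnb": "Western Panjabi",
    "pan": "Panjabi",
}

# Two-letter to three-letter equivalence mentioned in the message.
TWO_TO_THREE = {"pa": "pan"}


def split_tag(tag):
    """Split a simple language[-script][-region] tag; '_' is accepted as a separator."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    script = region = None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()   # four letters: script subtag
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()   # two letters or three digits: region subtag
    return language, script, region


def describe(tag):
    """Return the three-letter code and name for the language part of a tag."""
    language, _script, _region = split_tag(tag)
    code = TWO_TO_THREE.get(language, language)
    return code, ISO_639_3_NAMES.get(code)


print(describe("pa_PK"))
print(describe("pnb-PK"))
```

With these tables, pa_PK resolves to the code pan, whereas a tag built on pnb resolves to Western Panjabi, which is the distinction the question turns on.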
_______________________________________________
Ltru mailing list
Ltru@ietf.org
https://www1.ietf.org/mailman/listinfo/ltru