Re: [weirds] Internationalization Issues

Kentaro Mori <kentaro@jprs.co.jp> Wed, 24 October 2012 10:01 UTC

Message-ID: <5087BC68.5040908@jprs.co.jp>
Date: Wed, 24 Oct 2012 19:01:12 +0900
From: Kentaro Mori <kentaro@jprs.co.jp>
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Byron Ellacott <bje@apnic.net>
References: <CCA6B194.DE13%andy@arin.net> <6C6109C6-0FA3-40A0-9562-A8F55F178003@apnic.net> <50875975.8090908@jprs.co.jp> <50877817.9060805@nic.ad.jp> <1045CB70-C1E9-48F9-9E3C-6F9032581650@apnic.net>
In-Reply-To: <1045CB70-C1E9-48F9-9E3C-6F9032581650@apnic.net>
X-Enigmail-Version: 1.4.5
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Cc: weirds@ietf.org
Subject: Re: [weirds] Internationalization Issues

Byron-san;

(2012/10/24 14:12), Byron Ellacott wrote:
> Thank you both for your input.  If you are intending to implement a weirds service down the line, what would be your expectations on language support in the protocol?

My current expectations for language support are as follows.
(Frankly, I have not fully caught up on the context and concept of the
protocol, so I apologize if these do not fit your argument. I may also
change my mind as my understanding develops.)

- default language (or character code) specification feature
When a client has no way to specify which language it prefers, it would
be better for the server to return data in a default language than to
return data in every language it supports. I believe the vast majority
of whois.jprs.jp users do not expect to receive data in 'both Japanese
and English' or in 'English only' (so you would be a very rare user :)
I wonder whether this is a protocol matter or not.

- language (or character code) tag for each element
Some elements may have data in multiple languages (such as Japanese and
English) while other elements have data in a single language (such as
Japanese).  If a language tag applies to a whole set of elements, the
client has no clue as to which element in one language corresponds to
which element in the other language.

- character code specification feature
In Japan, client software sometimes fails to detect the character
encoding automatically, for example between ISO-2022-JP and Shift-JIS,
so an explicit indication of the encoding by the server would be welcome.
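
To make the three points above concrete, here is a minimal sketch in
Python. It is only an illustration under my own assumptions, not a
proposal for the protocol: the record layout, the element names, the
sample values, the DEFAULT_LANG setting and the helper names (select,
encode_body, decode_body) are all hypothetical and come from no draft.

```python
from typing import Dict, Optional, Tuple

# Hypothetical sample record: each element carries its own language tags.
# "address" deliberately has no English value.
RECORD: Dict[str, Dict[str, str]] = {
    "name": {"ja": "日本語の名前", "en": "English name"},
    "address": {"ja": "日本語の住所"},
}

DEFAULT_LANG = "ja"  # assumed server-side default language


def select(
    record: Dict[str, Dict[str, str]],
    preferred: Optional[str] = None,
    default: str = DEFAULT_LANG,
) -> Dict[str, Tuple[str, str]]:
    """Pick one (lang, value) per element: preferred, then default, then any."""
    chosen = {}
    for element, by_lang in record.items():
        if not by_lang:
            continue  # nothing registered for this element
        for lang in (preferred, default):
            if lang in by_lang:
                break
        else:
            # Neither the preferred nor the default language is available.
            lang = sorted(by_lang)[0]
        chosen[element] = (lang, by_lang[lang])
    return chosen


def encode_body(text: str, charset: str) -> Tuple[str, bytes]:
    """Return the charset label together with the bytes it describes."""
    return charset, text.encode(charset)


def decode_body(charset: str, payload: bytes) -> str:
    """Decode using the label the server declared, with no guessing."""
    return payload.decode(charset)


if __name__ == "__main__":
    print(select(RECORD))        # no preference given: server default
    print(select(RECORD, "en"))  # English where available, else default
    label, payload = encode_body(RECORD["name"]["ja"], "iso2022_jp")
    print(label, decode_body(label, payload))
```

Because the language stays attached to each value, a client that asked
for English can still see that "address" fell back to Japanese, and
because the charset label travels with the bytes, it does not have to
guess between ISO-2022-JP and Shift-JIS.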

Thanks for asking,

Kentaro

> 
>   Byron
> 
> On 24/10/2012, at 3:09 PM, MasaYUKI Okada <okadams@nic.ad.jp> wrote:
> 
>> Byron-san,
>>
>> I'm Masayuki Okada at JPNIC.
>>
>> Our WHOIS system (whois.nic.ad.jp:43) will respond in English (ASCII)
>> if '/e' is appended to the end of the query.
>>
>> By default, it answers in Japanese (ISO-2022-JP).
>>
>> For main registration records, JPNIC collects both Japanese
>> and English data as mandatory.
>>
>> just for reference:
>>  - How to use JPNIC WHOIS - 
>>   http://www.nic.ad.jp/en/db/whois/
>>
>> --
>> Masayuki Okada
>> JPNIC
>>
>> (2012/10/24 11:59), Kentaro Mori wrote:
>>> Byron-san and folks,
>>>
>>> I'm Kentaro Mori from JPRS.
>>> (sorry for late reply)
>>>
>>> For Whois, JPRS collects English data as well as native (Japanese) data
>>> when a registrant applies for a .JP domain name registration.
>>> However, the English data does not cover all of the Japanese data;
>>> i.e., it is partial.
>>> ISO-2022-JP was the normal choice of character set when JPRS (more
>>> correctly, JPNIC at that time) started the Whois service, though we
>>> may have alternatives such as UTF-8 now.
>>>
>>> --Kentaro
>>>
>>> (2012/10/22 9:56), Byron Ellacott wrote:
>>>>
>>>> On 19/10/2012, at 9:33 PM, Andy Newton <andy@arin.net> wrote:
>>>>
>>>>> On 10/18/12 8:40 PM, "Byron Ellacott" <bje@apnic.net> wrote:
>>>>>
>>>>>> Indicate to the end user that it's not a native language?
>>>>>> Auto-translate?
>>>>
>>>> Murray has the right sense of what I meant, for both of these.
>>>>
>>>>>> Negotiate for native language with Accept-Language, if indicated as
>>>>>> possible via a Vary header?
>>>>>
>>>>> That's HTTP layer stuff. We're talking about embedding multiple language
>>>>> tags in the response.
>>>>
>>>> Are we?  I thought draft-sheng-weirds-icann-rws-dnrd-01 sect. 7.3
>>>> suggested a single language tag for the entire response, with "possible
>>>> considerations" of multiple language tags.
>>>>
>>>> But, with this point, what I'm suggesting is that the user of a particular
>>>> client likely has one or a few preferred languages, which they could
>>>> potentially indicate to the server, in the event that the server has multiple
>>>> translations.  This would be applicable for mixed language responses as
>>>> well as single language responses, since it only indicates a client preference,
>>>> not a strict requirement.
>>>>
>>>> My primary perspective on this entire subject is that whatever mechanisms
>>>> or systems indicate language or language preference need to be optional,
>>>> and should support reasonable use cases for current or likely future operators.
>>>> I think there's a use case for language preference indication, as per below,
>>>> and I think Ning is suggesting a use case for tagging the language of an
>>>> entire response, inline in the response.  What are your (collective "your")
>>>> thoughts on how reasonable these use cases are?
>>>>
>>>>>> Some RDAP services will not support multiple languages meaningfully, but
>>>>>> there are existing whois services that provide (non-standard, varying)
>>>>>> ways to indicate a preferred language on query, with multiple language
>>>>>> options available for many response fields.
>>>>>
>>>>> Can you provide an example of one of these services so we can query it?
>>>>> That would go a long way in helping shape this need, I would think. Are
>>>>> there registries that collect contact data in multiple languages?
>>>>
>>>> $ whois -h whois.nic.ad.jp -- 'NET 113.32.19.157'
>>>> $ whois -h whois.nic.ad.jp -- 'NET 113.32.19.157 /e'
>>>>
>>>> $ whois -h whois.jprs.jp -- 'jprs.jp'
>>>> $ whois -h whois.jprs.jp -- 'jprs.jp /e'
>>>>
>>>> The data labels are sometimes translated, sometimes not.  In the native
>>>> language responses, there's often an English translation.  JPRS includes
>>>> an English help/info block even for the native language response.  I don't
>>>> know for sure if they collect the information in multiple languages, though
>>>> I think they do - any JPRS or JPNIC operators on the list to confirm?
>>>> Character set is ISO-2022-JP.
>>>>
>>>> I don't know if there are other services with such a switch mechanism,
>>>> either - we're all aware of how hard it is to find out what's actually done
>>>> out there on port 43 :-) - but for another comparison whois.kisa.kr returns
>>>> both native and English output, at least for "kisa.kr".  Character set is
>>>> EUC-KR.
>>>>
>>>>  Byron
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> weirds mailing list
>>>> weirds@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/weirds
>>>>