Re: [iola-conversion-tool] Multi-part names not displaying properly on http://datatracker.ietf.org/wg/
Henrik Levkowetz <henrik@levkowetz.com> Thu, 01 March 2012 12:09 UTC
Message-ID: <4F4F6709.7010603@levkowetz.com>
Date: Thu, 01 Mar 2012 13:09:45 +0100
From: Henrik Levkowetz <henrik@levkowetz.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Cindy Morgan <cmorgan@amsl.com>
References: <24BD5CE3-41A6-4964-A609-6C86D667662E@amsl.com>
In-Reply-To: <24BD5CE3-41A6-4964-A609-6C86D667662E@amsl.com>
X-Enigmail-Version: 1.3.5
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
Cc: iola-conversion-tool@ietf.org
Subject: Re: [iola-conversion-tool] Multi-part names not displaying properly on http://datatracker.ietf.org/wg/

Hi,

I've entered this into the tracker as ticket #783.

On 2012-02-29 18:00 Cindy Morgan said:
> We have several WG chairs who have multi-part names. While it looks like their entire names are being displayed on the WG charter pages, the master WG list at http://datatracker.ietf.org/wg/ is cutting out the middle parts of those names.
>
> Some examples:
>
> "Francois Faucheur" on http://datatracker.ietf.org/wg/
> "Francois Le Faucheur" on http://datatracker.ietf.org/wg/cdni/charter/
>
> "Jamal Salim" on http://datatracker.ietf.org/wg/
> "Jamal Hadi Salim" on http://datatracker.ietf.org/wg/forces/charter/
>
> "Gunter Velde" on http://datatracker.ietf.org/wg/
> "Gunter Van de Velde" on http://datatracker.ietf.org/wg/opsec/charter/
>
> (There may be--and probably are--more, but these are the ones that jump out at me on first glance. Please let me know if you need a more thorough audit.)

Thanks for noticing this. (The following has also been added to the ticket:)

A first fix has been applied, but refinement of the names in the database will be needed. The situation is this:

The old model treated names as if they all fit an Anglo-Saxon name pattern, which lets you split a name into a 'first', a 'middle' and a 'last' part. This works well for some names, but not so well for others.

As an example, take Spanish names (Mexican names are slightly different, again). For a comprehensive description, check http://en.wikipedia.org/wiki/Spanish_names . Here's a simplified take:

Spanish names have given names and a surname. The first given name is sometimes composed of two words ('Juan Pablo'); that is not a first and a middle name, but the first name. The surname has two parts, composed from the father's first surname and the mother's first surname.

If the name is shortened, for daily work or when addressed by surname alone, for instance, the _first_ surname is used -- not the last: "A man named José Antonio Gómez Iglesias would normally be addressed as Señor Gómez instead of Señor Iglesias." (from the article). Many people use only the patronymic surname in daily (email) correspondence (something I became very aware of when working with our Yaco developer, "Emilio A. Sánchez López", who uses almost, but not quite, consistently "Emilio A. Sánchez" for his emails).

If we try to force this into the legacy fields, it comes out wrong one way or another -- either the double surname will always be used, if both parts are put into the surname field, or only one will ever be recognized, if only the patronymic surname is entered.

The new database starts out by not assuming that it knows best how a name should be split. Instead it has one utf-8 field and one ascii field for the preferred presentation name, another ascii field for a shortened name, and any number of aliases for alternative forms of the name, maybe containing titles, honorifics, or other variations, like both surnames for someone with a Spanish name whose preferred presentation uses only the patronymic surname. It puts name splitting, for where it may be needed, into code, where it can be updated and refined.

Now, there are a lot of variations in preferences here, and the conversion from the old database was maybe a bit too simplified, with the outcome that in a number of cases the names will have to be adjusted, so that the preferred name is indicated, and alternative forms are entered as aliases.

There are actually quite few places in the datatracker where we *need* to split out the friendly name, the formal address surname, etc., but there is code to do that, which clearly also needs refinement. But as long as our usage doesn't normally need that, we should be OK with the preferred name in utf-8 and ascii, with programmatic extraction of parts.
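
To make the model described above a bit more concrete, here is a rough Python sketch. It is not the datatracker's actual code: the class name `Person`, the field names (`name`, `ascii`, `ascii_short`, `aliases`), the helpers `legacy_list_display` and `formal_surname`, and the `hint` parameter are all assumptions made for illustration, and the splitting rule is deliberately naive.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical record: names are stored whole, not pre-split."""
    name: str                      # preferred presentation name (UTF-8)
    ascii: str                     # preferred presentation name (ASCII)
    ascii_short: str = ""          # optional shortened ASCII form
    aliases: List[str] = field(default_factory=list)  # alternative forms

    def display(self) -> str:
        # Pages that only show a name use the preferred form unchanged.
        return self.name


def legacy_list_display(first: str, middle: str, last: str) -> str:
    # Mimics the reported symptom: the middle part is dropped.
    return f"{first} {last}"


def formal_surname(full_name: str, hint: Optional[str] = None) -> str:
    """Naive surname extraction, kept in code so it can be refined.

    Assumption: with hint 'Spanish', full_name carries both surnames,
    and the first of the two (the patronymic) is the one to use.
    Without a hint, the last word is used, which is wrong for many names.
    """
    parts = full_name.split()
    if len(parts) < 2:
        return full_name
    if hint == "Spanish" and len(parts) >= 3:
        return parts[-2]
    return parts[-1]


if __name__ == "__main__":
    print(legacy_list_display("Francois", "Le", "Faucheur"))
    print(formal_surname("José Antonio Gómez Iglesias", hint="Spanish"))
```

The point of the sketch is the division of labour: display uses the stored preferred name as-is, while any splitting is a function that can be corrected later without touching the data.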

Ole and I have previously discussed what I mention above, and have also touched on the possibility that the name splitting code may need a hints field (e.g., 'Spanish', 'Arabic', etc.) -- that is a refinement we can add if it turns out to be needed to resolve name splitting properly.

Currently (after my first fix) the code which produces the page (http://datatracker.ietf.org/wg/) combines first, middle and last, but it should transition to using the preferred name as soon as names with prefix and suffix parts which should not normally be displayed have been modified to have the forms with prefix/suffix as aliases, and the preferred display form has been adjusted to work in this context.

Best regards,

Henrik