Re: [TOOLS-DEVELOPMENT] Narrowing the slowdown down...

Robert Sparks <rjsparks@nostrum.com> Mon, 27 June 2011 15:24 UTC

Hi Magnus -

Thanks for helping brainstorm. In this case, I'm pretty confident that the history of the document has no impact on the bug.

RjS

On Jun 27, 2011, at 10:19 AM, Magnus Westerlund wrote:

> Hi Glen,
> 
> This might help you.
> 
> This document is from AVT, which was concluded and then split into 4 WGs,
> and its documents were re-assigned to the new WGs. I have earlier filed
> several tickets on the WG tracker regarding documents that are not
> correctly accessible despite showing up as WG documents. Apparently
> there is more than one field that records which WG a document belongs
> to. Henrik should know more about this issue.
> 
> So it is not surprising if there are issues around AVT's old documents, as
> they have been re-assigned.
> 
> Cheers
> 
> Magnus
> 
> 
> On 2011-06-27 16:44, Glen wrote:
>> All -
>> 
>> I have sent detailed data to the tools team and Henrik, but I wanted to alert
>> everyone to a pattern I've seen during my analysis:
>> 
>> This request:
>> 
>> POST /doc/draft-ietf-payload-rfc3016bis/edit/position/ HTTP/1.1" 302 - 
>> https://datatracker.ietf.org/doc/draft-ietf-payload-rfc3016bis/edit/position/
>> 
>> was seen in the logs at the start of both slowdowns, and I now suspect that
>> there may be database corruption and/or some problem with the code related
>> either to ballot positions generally, or this draft specifically.
>> 
>> It comes to my mind that, while I was gone, a request came in to clear the
>> ballot positions for a draft, which the secretariat did.  This may have been
>> the draft that was cleared - and clearing it may have caused some type of
>> problem for the datatracker.
>> 
>> Of course, the datatracker should not loop or fail even if data is bad, but
>> not all possibilities can be foreseen.
>> 
>> It is my hope that we will both be able to correct a potential database
>> problem, and find and harden a potential datatracker bug, quickly.
>> 
>> In the meantime, until we hear from the tools team, it might be best to
>> at least refrain from voting on the above draft, if not all drafts.
>> 
>> If you do vote on a draft, and get a response, don't get too excited either
>> way.  The server actually survives for an hour or more once the bug starts
>> using resources (I'm actually proud of this - it's a HUGE server with lots
>> of resources - the old servers would have died much more quickly. ;-) so
>> things can appear okay for a while.
>> 
>> Now that we know what to look for, we can catch it earlier, but I'm still
>> hopeful for a quick fix and repair today.
>> 
>> Thanks,
>> Glen
>> 
> 
> 
> -- 
> 
> Magnus Westerlund
> 
> ----------------------------------------------------------------------
> Multimedia Technologies, Ericsson Research EAB/TVM
> ----------------------------------------------------------------------
> Ericsson AB                | Phone  +46 10 7148287
> Färögatan 6                | Mobile +46 73 0949079
> SE-164 80 Stockholm, Sweden| mailto: magnus.westerlund@ericsson.com
> ----------------------------------------------------------------------
>