Re: Narrowing the slowdown down...

Magnus Westerlund <magnus.westerlund@ericsson.com> Mon, 27 June 2011 15:19 UTC

Message-ID: <4E089F70.8090709@ericsson.com>
Date: Mon, 27 Jun 2011 17:19:12 +0200
From: Magnus Westerlund <magnus.westerlund@ericsson.com>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11
MIME-Version: 1.0
To: Glen <glen@amsl.com>
Subject: Re: Narrowing the slowdown down...
References: <20110627144404.GA29259@amsl.com>
In-Reply-To: <20110627144404.GA29259@amsl.com>
X-Enigmail-Version: 1.1.1
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Brightmail-Tracker: AAAAAA==
Cc: "wgchairs@ietf.org" <wgchairs@ietf.org>, "iab@ietf.org" <iab@ietf.org>, "tools-development@ietf.org" <tools-development@ietf.org>, "iaoc@ietf.org" <iaoc@ietf.org>, "Romascanu, Dan (Dan)" <dromasca@avaya.com>, Pete Resnick <presnick@qualcomm.com>, "henrik@levkowetz.com" <henrik@levkowetz.com>

Hi Glen,

This might help you.

This document comes from AVT, which was concluded and split into 4 WGs,
with its documents re-assigned to the new WGs. I have previously filed
several tickets on the WG tracker about documents that were not
correctly accessible despite showing up as WG documents. Apparently
more than one field records which WG a document belongs to. Henrik
should know more about this issue.

So it is not surprising if there are issues around AVT's old documents,
as they have been re-assigned.
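
To illustrate the kind of inconsistency described above, here is a
minimal sketch of a check for documents whose two WG fields disagree.
The record layout and the field names "group" and "wg_name" are
assumptions made up for this example; they are not taken from the
datatracker's actual schema, and the sample drafts are invented.

```python
# Hypothetical records: the field names "group" and "wg_name" are
# illustrative only and do not reflect the real datatracker schema.
def find_wg_mismatches(documents):
    """Return the names of documents whose two WG fields disagree."""
    return [
        doc["name"]
        for doc in documents
        if doc["group"] != doc["wg_name"]
    ]


# Invented sample data: one consistent record, one left pointing at AVT.
documents = [
    {"name": "draft-example-a", "group": "payload", "wg_name": "payload"},
    {"name": "draft-example-b", "group": "payload", "wg_name": "avt"},
]

print(find_wg_mismatches(documents))
```

A report like this would only list candidates for follow-up; which field
is authoritative is a question for the tools team.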

Cheers

Magnus


On 2011-06-27 16:44, Glen wrote:
> All -
> 
> I have sent detailed data to the tools team and Henrik, but I wanted to alert
> everyone to a pattern I've seen during my analysis:
> 
> This request:
> 
> POST /doc/draft-ietf-payload-rfc3016bis/edit/position/ HTTP/1.1" 302 - 
> https://datatracker.ietf.org/doc/draft-ietf-payload-rfc3016bis/edit/position/
> 
> was seen in the logs at the start of both slowdowns, and I now suspect that
> there may be database corruption and/or some problem with the code related
> either to ballot positions generally, or this draft specifically.
> 
> It comes to my mind that, while I was gone, a request came in to clear the
> ballot positions for a draft, which the secretariat did.  This may have been
> the draft that was cleared - and clearing it may have caused some type of
> problem for the datatracker.
> 
> Of course, the datatracker should not loop or fail even if data is bad, but
> not all possibilities can be foreseen.
> 
> It is my hope that we will both be able to correct a potential database
> problem, and find and harden a potential datatracker bug, quickly.
> 
> In the meantime, until we hear from the tools team, it might be best to
> at least refrain from voting on the above draft, if not all drafts.
> 
> If you do vote on a draft, and get a response, don't get too excited either
> way.  The server actually survives for an hour or more once the bug starts
> using resources (I'm actually proud of this - it's a HUGE server with lots
> of resources - the old servers would have died much more quickly. ;-) so
> things can appear okay for a while.
> 
> Now that we know what to look for, we can catch it earlier, but I'm still
> hopeful for a quick fix and repair today.
> 
> Thanks,
> Glen
> 
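
The request pattern Glen quotes could be searched for in an access log
with something like the sketch below. It assumes only that each log line
contains the request text in the form shown in his message; the function
name, the regular expression, and the first sample line are made up for
illustration and are not part of any existing tool.

```python
import re

# Matches a ballot-position POST in the form quoted in the message.
# Any other fields on the log line are ignored.
POSITION_POST = re.compile(
    r'POST /doc/(?P<draft>[^/\s]+)/edit/position/ HTTP/1\.[01]"? (?P<status>\d{3})'
)


def find_position_posts(log_lines):
    """Return (line_number, draft, status) for each ballot-position POST."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = POSITION_POST.search(line)
        if match:
            hits.append(
                (number, match.group("draft"), int(match.group("status")))
            )
    return hits


# The first line is an invented non-matching request; the second is the
# request quoted in the message.
sample = [
    'GET /doc/draft-ietf-payload-rfc3016bis/ HTTP/1.1" 200 -',
    'POST /doc/draft-ietf-payload-rfc3016bis/edit/position/ HTTP/1.1" 302 -',
]

print(find_position_posts(sample))
```

Comparing the line numbers of the hits with the start of each slowdown
would show whether the pattern holds beyond the two cases already seen.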


-- 

Magnus Westerlund

----------------------------------------------------------------------
Multimedia Technologies, Ericsson Research EAB/TVM
----------------------------------------------------------------------
Ericsson AB                | Phone  +46 10 7148287
Färögatan 6                | Mobile +46 73 0949079
SE-164 80 Stockholm, Sweden| mailto: magnus.westerlund@ericsson.com
----------------------------------------------------------------------