Re: [Anima] ANIMA when there is a system-wide issue

Brian E Carpenter <brian.e.carpenter@gmail.com> Thu, 17 December 2020 01:46 UTC

And here's what happens when the control plane itself falls over:

https://status.cloud.google.com/incident/zall/20011#20011006

It seems pretty clear that Cloud needs ANIMA.

Regards
   Brian

On 01-Dec-20 11:02, Brian E Carpenter wrote:
> "AWS reveals it broke itself by exceeding OS thread limits"
> 
> https://www.theregister.com/2020/11/30/aws_outage_explanation/
> 
> Especially:
> "The TIFU-like post also outlines why Amazon's dashboards offered only scanty info about the incident – because they, too, depend on a service that depends on Kinesis."
> 
> Perhaps there is something we should specify in ANIMA to prevent the ANIMA infrastructure from falling into this sort of trap, where a system-wide issue (such as hitting an O/S resource limit everywhere at the same time) also prevents the autonomic mechanisms from working.
>
> Regards
>    Brian Carpenter
>