Re: [Anima] ANIMA when there is a system-wide issue

Brian E Carpenter <> Thu, 17 December 2020 01:46 UTC

To: Anima WG <>
From: Brian E Carpenter <>
Organization: University of Auckland
Date: Thu, 17 Dec 2020 14:46:46 +1300
Subject: Re: [Anima] ANIMA when there is a system-wide issue

And here's what happens when the control plane itself falls over:

It seems pretty clear that Cloud needs ANIMA.


On 01-Dec-20 11:02, Brian E Carpenter wrote:
> "AWS reveals it broke itself by exceeding OS thread limits"
> Especially:
> "The TIFU-like post also outlines why Amazon's dashboards offered only scanty info about the incident – because they, too, depend on a service that depends on Kinesis."
> Perhaps there is something we should specify in ANIMA to prevent the ANIMA infrastructure falling into this sort of trap: when there is a system-wide issue (such as hitting an O/S resource limit everywhere at the same time) it also prevents the autonomic mechanisms from working.
> Regards
>    Brian Carpenter