Re: [Bpf] Review of draft-thaler-bpf-isa-01

Alexei Starovoitov <alexei.starovoitov@gmail.com> Sat, 29 July 2023 00:35 UTC

From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Fri, 28 Jul 2023 17:35:05 -0700
Message-ID: <CAADnVQKOiwm1UB58=8QcowDyfPQct-wuMD19citS7w5PmadZ6g@mail.gmail.com>
To: Will Hawkins <hawkinsw@obs.cr>
Cc: Watson Ladd <watsonbladd@gmail.com>, Dave Thaler <dthaler@microsoft.com>, "bpf@ietf.org" <bpf@ietf.org>, bpf <bpf@vger.kernel.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bpf/vqFwKvGYY--Drd4nZ0xTp6JYUIQ>
Subject: Re: [Bpf] Review of draft-thaler-bpf-isa-01

On Fri, Jul 28, 2023 at 5:19 PM Will Hawkins <hawkinsw@obs.cr> wrote:
>
> On Fri, Jul 28, 2023 at 8:05 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Jul 28, 2023 at 4:32 PM Will Hawkins <hawkinsw@obs.cr> wrote:
> > >
> > > On Thu, Jul 27, 2023 at 9:05 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Wed, Jul 26, 2023 at 12:16 PM Will Hawkins <hawkinsw@obs.cr> wrote:
> > > > >
> > > > > On Tue, Jul 25, 2023 at 2:37 PM Watson Ladd <watsonbladd@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Jul 25, 2023 at 9:15 AM Alexei Starovoitov
> > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 25, 2023 at 7:03 AM Dave Thaler <dthaler@microsoft.com> wrote:
> > > > > > > >
> > > > > > > > I am forwarding the email below (after converting HTML to plain text)
> > > > > > > > > to the bpf@vger.kernel.org list so replies can go to both lists.
> > > > > > > >
> > > > > > > > Please use this one for any replies.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dave
> > > > > > > >
> > > > > > > > > From: Bpf <bpf-bounces@ietf.org> On Behalf Of Watson Ladd
> > > > > > > > > Sent: Monday, July 24, 2023 10:05 PM
> > > > > > > > > To: bpf@ietf.org
> > > > > > > > > Subject: [Bpf] Review of draft-thaler-bpf-isa-01
> > > > > > > > >
> > > > > > > > > Dear BPF wg,
> > > > > > > > >
> > > > > > > > > I took a look at the draft and think it has some issues, unsurprisingly at this stage. One is
> > > > > > > > > the specification seems to use an underspecified C pseudo code for operations vs
> > > > > > > > > defining them mathematically.
> > > > > > >
> > > > > > > Hi Watson,
> > > > > > >
> > > > > > > This is not "underspecified C" pseudo code.
> > > > > > > This is assembly syntax parsed and emitted by GCC, LLVM, gas, Linux Kernel, etc.
> > > > > >
> > > > > > I don't see a reference to any description of that in section 4.1.
> > > > > > It's possible I've overlooked this, and if people think this style of
> > > > > > definition is good enough that works for me. But I found table 4
> > > > > > pretty scanty on what exactly happens.
> > > > >
> > > > > Hello! Based on Watson's post, I have done some research and would
> > > > > potentially like to offer a path forward. There are several different
> > > > > ways that ISAs specify the semantics of their operations:
> > > > >
> > > > > 1. Intel has a section in their manual that describes the pseudocode
> > > > > they use to specify their ISA: Section 3.1.1.9 of The Intel® 64 and
> > > > > IA-32 Architectures Software Developer’s Manual at
> > > > > https://cdrdv2.intel.com/v1/dl/getContent/671199
> > > > > 2. ARM has an equivalent for their variety of pseudocode: Chapter J1
> > > > > of Arm Architecture Reference Manual for A-profile architecture at
> > > > > https://developer.arm.com/documentation/ddi0487/latest/
> > > > > 3. Sail "is a language for describing the instruction-set architecture
> > > > > (ISA) semantics of processors."
> > > > > (https://www.cl.cam.ac.uk/~pes20/sail/)
> > > > >
> > > > > Given the commercial nature of (1) and (2), perhaps Sail is a way to
> > > > > proceed. If people are interested, I would be happy to lead an effort
> > > > > to encode the eBPF ISA semantics in Sail (or find someone who already
> > > > > has) and incorporate them in the draft.
> > > >
> > > > imo Sail is too researchy to have practical use.
> > > > Looking at arm64 or x86 Sail description I really don't see how
> > > > it would map to an IETF standard.
> > > > It's done in a "sail" language that people need to learn first to be
> > > > able to read it.
> > > > Say we had bpf.sail somewhere on github. What value does it bring to
> > > > BPF ISA standard? I don't see an immediate benefit to standardization.
> > > > There could be other use cases, no doubt, but standardization is our goal.
> > > >
> > > > As far as 1 and 2. Intel and Arm use their own pseudocode, so they had
> > > > to add a paragraph to describe it. We are using C to describe BPF ISA
> > >
> > >
> > > I cannot find a reference in the current version that specifies what
> > > we are using to describe the operations. I'd like to add that, but
> > > want to make sure that I clarify two statements that seem to be at
> > > odds.
> > >
> > > Immediately above you say that we are using "C to describe the BPF
> > > ISA" and further above you say "This is assembly syntax parsed and
> > > emitted by GCC, LLVM, gas, Linux Kernel, etc."
> > >
> > > My own reading is that it is the former, and not the latter. But, I
> > > want to double check before adding the appropriate statements to the
> > > Convention section.
> >
> > It's both. I'm not sure where you see a contradiction.
> > It's a normal C syntax and it's emitted by the kernel verifier,
> > parsed by clang/gcc assemblers and emitted by compilers.
>
>
> Okay. I apologize. I am sincerely confused. For instance,
>
> if (u32)dst >= (u32)src goto +offset
>
> Looks like nothing that I have ever seen in "normal C syntax".

I thought we were talking about table 4 and the ALU ops.
The above is not pure C, but it's obvious enough without explanation, no?
Also, I don't see the above anywhere in the doc.
We describe conditionals like:
BPF_JGE   0x3    any  PC += offset if dst >= src
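
To make the comparison concrete, here is a rough C sketch of how I read "PC += offset if dst >= src" next to the (u32) form quoted above. The function names are made up for illustration, and the sketch takes "PC += offset" literally; how the doc counts PC relative to the current instruction is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: unsigned 64-bit comparison of the register values. */
bool jge64(uint64_t dst, uint64_t src)
{
    return dst >= src;
}

/* Hypothetical helper: the (u32) form compares only the low 32 bits,
 * still as unsigned values. */
bool jge32(uint64_t dst, uint64_t src)
{
    return (uint32_t)dst >= (uint32_t)src;
}

/* Hypothetical helper: apply "PC += offset" only when the branch is taken.
 * offset is a signed 16-bit value, so backward jumps are negative. */
int64_t next_pc(int64_t pc, int16_t offset, bool taken)
{
    return taken ? pc + offset : pc;
}
```

With dst = 0x100000000 and src = 1 the 64-bit compare is true, while the 32-bit compare is false because the low 32 bits of dst are zero.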

> There also appear to be a few other places where things might be a bit wonky:
>
> 1. Address arithmetic in the description of the load/store
> instructions will depend on the type of the target: E.g.,
>
> *(u64 *)(dst + offset) = imm
>
> The address to which the store is done will be offset*sizeof(X) bytes
> from dst where X is the type of the target of dst. If we are assuming
> that dst (or its equivalent in similar instructions) is being treated
> simply as an unsigned integer, I believe that we will have to say that
> explicitly, especially given that we describe offset as "signed
> integer offset used with pointer arithmetic" in the Instruction
> encoding section.

No, the offset is not scaled by the size of the target type. It's not:
*((u64 *)(dst) + offset) = imm

The doc doesn't say that 'dst' has the pointer type 'u64 *'.
Instead it says:
--
The 'code' field encodes the operation as below, where 'src' and 'dst' refer
to the values of the source and destination registers, respectively.
--

So dst + offset is a plain addition of two values, and the type cast is then applied to the result.
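
A small C sketch of the two readings, in case it helps. The names (store_u64, byte_address, scaled_address) are invented for this example and are not from the doc; the registers are modelled as plain integers, which is the assumption being illustrated.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t u64;

/* Intended reading: add the two values first, in bytes, then cast. */
uintptr_t byte_address(uintptr_t dst, int16_t offset)
{
    return dst + (uintptr_t)(intptr_t)offset;
}

/* The other reading, *((u64 *)(dst) + offset): C pointer arithmetic
 * would scale offset by sizeof(u64). Shown only for contrast. */
uintptr_t scaled_address(uintptr_t dst, int16_t offset)
{
    return dst + (uintptr_t)((intptr_t)offset * (intptr_t)sizeof(u64));
}

/* Sketch of *(u64 *)(dst + offset) = imm under the intended reading.
 * memcpy is used so the store is valid even if the address is unaligned. */
void store_u64(uintptr_t dst, int16_t offset, u64 imm)
{
    memcpy((void *)byte_address(dst, offset), &imm, sizeof(imm));
}
```

For offset = 8 the first form writes at dst + 8, while the scaled form would land at dst + 64.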

>
> 2. hto[bl]eN functions are not specified by standard C and, while
> "obvious" what they do, are not defined in the document anywhere.

Yeah, we can add a short sentence about hto[bl]eN.
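
For reference, the kind of definition such a sentence would have to pin down could be sketched in C roughly as below. The sketch_ names are hypothetical, chosen to avoid clashing with the non-standard <endian.h> functions, and this is only my reading of the "obvious" behaviour, not text from the doc: the result, when stored in host memory, holds the bytes of the input in big-endian or little-endian order.

```c
#include <stdint.h>

/* Write the low n bytes of v into out, most significant byte first when
 * big_endian is nonzero, least significant byte first otherwise. */
static void put_bytes(unsigned char *out, uint64_t v, unsigned n, int big_endian)
{
    for (unsigned i = 0; i < n; i++) {
        unsigned shift = 8 * (big_endian ? n - 1 - i : i);
        out[i] = (unsigned char)(v >> shift);
    }
}

/* Hypothetical stand-ins for htobe16/htobe32/htobe64. */
uint16_t sketch_htobe16(uint16_t v) { uint16_t r; put_bytes((unsigned char *)&r, v, sizeof r, 1); return r; }
uint32_t sketch_htobe32(uint32_t v) { uint32_t r; put_bytes((unsigned char *)&r, v, sizeof r, 1); return r; }
uint64_t sketch_htobe64(uint64_t v) { uint64_t r; put_bytes((unsigned char *)&r, v, sizeof r, 1); return r; }

/* Hypothetical stand-ins for htole16/htole32/htole64. */
uint16_t sketch_htole16(uint16_t v) { uint16_t r; put_bytes((unsigned char *)&r, v, sizeof r, 0); return r; }
uint32_t sketch_htole32(uint32_t v) { uint32_t r; put_bytes((unsigned char *)&r, v, sizeof r, 0); return r; }
uint64_t sketch_htole64(uint64_t v) { uint64_t r; put_bytes((unsigned char *)&r, v, sizeof r, 0); return r; }
```

Written this way the functions do not need to know the host byte order: on a little-endian host the "le" variants return their input unchanged and the "be" variants swap bytes, and the reverse holds on a big-endian host.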