Re: [Bpf] Review of draft-thaler-bpf-isa-01

Will Hawkins <hawkinsw@obs.cr> Sat, 29 July 2023 00:46 UTC

From: Will Hawkins <hawkinsw@obs.cr>
Date: Fri, 28 Jul 2023 20:46:19 -0400
Message-ID: <CADx9qWjYChRf2qBr=Pt5D-RLCb665YFKmjDYX8WOQfqMx1-bag@mail.gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Watson Ladd <watsonbladd@gmail.com>, Dave Thaler <dthaler@microsoft.com>, "bpf@ietf.org" <bpf@ietf.org>, bpf <bpf@vger.kernel.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bpf/YhP6FsCB8oB4QCoUJh3eamsl3hI>
Subject: Re: [Bpf] Review of draft-thaler-bpf-isa-01

On Fri, Jul 28, 2023 at 8:35 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Jul 28, 2023 at 5:19 PM Will Hawkins <hawkinsw@obs.cr> wrote:
> >
> > On Fri, Jul 28, 2023 at 8:05 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Fri, Jul 28, 2023 at 4:32 PM Will Hawkins <hawkinsw@obs.cr> wrote:
> > > >
> > > > On Thu, Jul 27, 2023 at 9:05 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Wed, Jul 26, 2023 at 12:16 PM Will Hawkins <hawkinsw@obs.cr> wrote:
> > > > > >
> > > > > > On Tue, Jul 25, 2023 at 2:37 PM Watson Ladd <watsonbladd@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 25, 2023 at 9:15 AM Alexei Starovoitov
> > > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jul 25, 2023 at 7:03 AM Dave Thaler <dthaler@microsoft.com> wrote:
> > > > > > > > >
> > > > > > > > > I am forwarding the email below (after converting HTML to plain text)
> > > > > > > > > to the mailto:bpf@vger.kernel.org list so replies can go to both lists.
> > > > > > > > >
> > > > > > > > > Please use this one for any replies.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Dave
> > > > > > > > >
> > > > > > > > > > From: Bpf <bpf-bounces@ietf.org> On Behalf Of Watson Ladd
> > > > > > > > > > Sent: Monday, July 24, 2023 10:05 PM
> > > > > > > > > > To: bpf@ietf.org
> > > > > > > > > > Subject: [Bpf] Review of draft-thaler-bpf-isa-01
> > > > > > > > > >
> > > > > > > > > > Dear BPF wg,
> > > > > > > > > >
> > > > > > > > > > > I took a look at the draft and think it has some issues, unsurprisingly at this stage. One is
> > > > > > > > > > > that the specification seems to use underspecified C pseudocode for operations rather than
> > > > > > > > > > > defining them mathematically.
> > > > > > > >
> > > > > > > > Hi Watson,
> > > > > > > >
> > > > > > > > This is not "underspecified C" pseudo code.
> > > > > > > > This is assembly syntax parsed and emitted by GCC, LLVM, gas, Linux Kernel, etc.
> > > > > > >
> > > > > > > I don't see a reference to any description of that in section 4.1.
> > > > > > > It's possible I've overlooked this, and if people think this style of
> > > > > > > definition is good enough that works for me. But I found table 4
> > > > > > > pretty scanty on what exactly happens.
> > > > > >
> > > > > > Hello! Based on Watson's post, I have done some research and would
> > > > > > potentially like to offer a path forward. There are several different
> > > > > > ways that ISAs specify the semantics of their operations:
> > > > > >
> > > > > > 1. Intel has a section in their manual that describes the pseudocode
> > > > > > they use to specify their ISA: Section 3.1.1.9 of The Intel® 64 and
> > > > > > IA-32 Architectures Software Developer’s Manual at
> > > > > > https://cdrdv2.intel.com/v1/dl/getContent/671199
> > > > > > 2. ARM has an equivalent for their variety of pseudocode: Chapter J1
> > > > > > of Arm Architecture Reference Manual for A-profile architecture at
> > > > > > https://developer.arm.com/documentation/ddi0487/latest/
> > > > > > 3. Sail "is a language for describing the instruction-set architecture
> > > > > > (ISA) semantics of processors."
> > > > > > (https://www.cl.cam.ac.uk/~pes20/sail/)
> > > > > >
> > > > > > Given the commercial nature of (1) and (2), perhaps Sail is a way to
> > > > > > proceed. If people are interested, I would be happy to lead an effort
> > > > > > to encode the eBPF ISA semantics in Sail (or find someone who already
> > > > > > has) and incorporate them in the draft.
> > > > >
> > > > > imo Sail is too researchy to have practical use.
> > > > > Looking at arm64 or x86 Sail description I really don't see how
> > > > > it would map to an IETF standard.
> > > > > It's done in a "sail" language that people need to learn first to be
> > > > > able to read it.
> > > > > Say we had bpf.sail somewhere on github. What value does it bring to
> > > > > BPF ISA standard? I don't see an immediate benefit to standardization.
> > > > > There could be other use cases, no doubt, but standardization is our goal.
> > > > >
> > > > > As far as 1 and 2. Intel and Arm use their own pseudocode, so they had
> > > > > to add a paragraph to describe it. We are using C to describe BPF ISA
> > > >
> > > >
> > > > I cannot find a reference in the current version that specifies what
> > > > we are using to describe the operations. I'd like to add that, but
> > > > want to make sure that I clarify two statements that seem to be at
> > > > odds.
> > > >
> > > > Immediately above you say that we are using "C to describe the BPF
> > > > ISA" and further above you say "This is assembly syntax parsed and
> > > > emitted by GCC, LLVM, gas, Linux Kernel, etc."
> > > >
> > > > My own reading is that it is the former, and not the latter. But, I
> > > > want to double check before adding the appropriate statements to the
> > > > Convention section.
> > >
> > > It's both. I'm not sure where you see a contradiction.
> > > It's a normal C syntax and it's emitted by the kernel verifier,
> > > parsed by clang/gcc assemblers and emitted by compilers.
> >
> >
> > Okay. I apologize. I am sincerely confused. For instance,
> >
> > if (u32)dst >= (u32)src goto +offset
> >
> > Looks like nothing that I have ever seen in "normal C syntax".
>
> I thought we're talking about table 4 and ALU ops.
> The above is not pure C, but it's obvious enough without explanation, no?

To "us", yes. Although I am not an expert, it seems like being
explicit is important when it comes to writing a spec. I suppose we
should leave that to Dave and the chairs.
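
For what it's worth, here is a minimal sketch in C of how I read the
conditional quoted above. It only illustrates what the (u32) casts would
have to mean if the text were spelled out. The function names are
hypothetical and not taken from the draft.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit unsigned "greater or equal" test,
 * comparing the full register values. */
static inline bool jge64_taken(uint64_t dst, uint64_t src)
{
    return dst >= src;
}

/* Hypothetical helper: the 32-bit form, i.e. my reading of
 * "if (u32)dst >= (u32)src goto +offset". Only the low 32 bits
 * of each register take part in the comparison. */
static inline bool jge32_taken(uint64_t dst, uint64_t src)
{
    return (uint32_t)dst >= (uint32_t)src;
}
```

With dst = 0x100000000 and src = 1, the 64-bit test is true and the
32-bit test is false, which is the kind of detail a one-line convention
in the document could pin down.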

> Also I don't see above anywhere in the doc.

That is from the Appendix. It is currently in Dave's tree and gets
amalgamated with other files to build the final draft.

https://datatracker.ietf.org/doc/draft-thaler-bpf-isa/

> We describe conditionals like:
> BPF_JGE   0x3    any  PC += offset if dst >= src
>
> > There also appear to be a few other places where things might be a bit wonky:
> >
> > 1. Address arithmetic in the description of the load/store
> > instructions will depend on the type of the target: E.g.,
> >
> > *(u64 *)(dst + offset) = imm
> >
> > The address to which the store is done will be offset*sizeof(X) bytes
> > from dst where X is the type of the target of dst. If we are assuming
> > that dst (or its equivalent in similar instructions) is being treated
> > simply as an unsigned integer, I believe that we will have to say that
> > explicitly, especially given that we describe offset as "signed
> > integer offset used with pointer arithmetic" in the Instruction
> > encoding section.
>
> It's not:
> *((u64 *)(dst) + offset) = imm
>
> The doc doesn't say that 'dst' is a pointer 'u64 *dst' type.
> Instead it says:
> --
> The 'code' field encodes the operation as below, where 'src' and 'dst' refer
> to the values of the source and destination registers, respectively.
> --
>
> so dst + offset is a plain addition of two values and then type cast.

Again, I of course understand, and "we" know what that means. However,
it seems to me that the earlier description of offset as "signed
integer offset used with pointer arithmetic" might signal something
else to an unfamiliar reader.
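
To make the two readings concrete, here is a small sketch in C that
computes the target address both ways. It assumes registers hold 64-bit
values and that offset is a signed 16-bit field. The function names are
hypothetical and only serve the comparison.

```c
#include <stdint.h>

/* Reading A: dst and offset are plain integer values, so
 * *(u64 *)(dst + offset) addresses the byte at dst + offset. */
static inline uint64_t addr_plain(uint64_t dst, int16_t offset)
{
    return dst + (uint64_t)(int64_t)offset;
}

/* Reading B: dst is treated as a u64 pointer before the addition,
 * as in *((u64 *)(dst) + offset), so C pointer arithmetic scales
 * offset by sizeof(u64). */
static inline uint64_t addr_scaled(uint64_t dst, int16_t offset)
{
    return dst + (uint64_t)((int64_t)offset * (int64_t)sizeof(uint64_t));
}
```

For dst = 16 and offset = 2, reading A gives 18 and reading B gives 32.
A sentence stating that the addition is done on plain integer values
would rule out reading B.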

Will

>
> >
> > 2. hto[bl]eN functions are not specified by standard C and, while it is
> > "obvious" what they do, they are not defined anywhere in the document.
>
> yeah. we can add a short sentence about htoln.
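
As a starting point for that sentence, here is a sketch in portable C of
what I take the 16-bit and 32-bit conversions to mean: the result, when
stored to memory on the host, has its bytes in the named order. The
sketch_ prefix is there to avoid clashing with the macros some platforms
provide in <endian.h>; these names and the first_byte helpers are
hypothetical and not from the draft.

```c
#include <stdint.h>
#include <string.h>

/* Host to little-endian: least significant byte at the lowest address. */
static inline uint16_t sketch_htole16(uint16_t x)
{
    const uint8_t bytes[2] = { (uint8_t)(x & 0xff), (uint8_t)(x >> 8) };
    uint16_t out;
    memcpy(&out, bytes, sizeof out);
    return out;
}

/* Host to big-endian: most significant byte at the lowest address. */
static inline uint16_t sketch_htobe16(uint16_t x)
{
    const uint8_t bytes[2] = { (uint8_t)(x >> 8), (uint8_t)(x & 0xff) };
    uint16_t out;
    memcpy(&out, bytes, sizeof out);
    return out;
}

static inline uint32_t sketch_htole32(uint32_t x)
{
    const uint8_t bytes[4] = {
        (uint8_t)(x & 0xff), (uint8_t)((x >> 8) & 0xff),
        (uint8_t)((x >> 16) & 0xff), (uint8_t)(x >> 24)
    };
    uint32_t out;
    memcpy(&out, bytes, sizeof out);
    return out;
}

static inline uint32_t sketch_htobe32(uint32_t x)
{
    const uint8_t bytes[4] = {
        (uint8_t)(x >> 24), (uint8_t)((x >> 16) & 0xff),
        (uint8_t)((x >> 8) & 0xff), (uint8_t)(x & 0xff)
    };
    uint32_t out;
    memcpy(&out, bytes, sizeof out);
    return out;
}

/* Inspection helpers: the byte stored at the lowest address of v. */
static inline uint8_t first_byte16(uint16_t v)
{
    uint8_t b;
    memcpy(&b, &v, 1);
    return b;
}

static inline uint8_t first_byte32(uint32_t v)
{
    uint8_t b;
    memcpy(&b, &v, 1);
    return b;
}
```

Because the bytes are written out explicitly, the result does not depend
on the byte order of the machine running the code. The 64-bit variants
would follow the same pattern.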