Re: JPEG-XL as Content-Encoding?

Alex Deymo <deymo@google.com> Fri, 21 August 2020 12:05 UTC

References: <CACj=BEjdwH1OtS=uQXsgPN3XVJvVEUeisjeF5_iro1vg0omqWQ@mail.gmail.com> <20200820151401.GB21689@1wt.eu> <20200820183008.GA8086@lubuntu> <18159.1597960275@critter.freebsd.dk> <CADR0UcWkxb4ZtMgjqDeAv=m6G3ks2P75L-5pvt-ctz8WyJF29g@mail.gmail.com>
In-Reply-To: <CADR0UcWkxb4ZtMgjqDeAv=m6G3ks2P75L-5pvt-ctz8WyJF29g@mail.gmail.com>
From: Alex Deymo <deymo@google.com>
Date: Fri, 21 Aug 2020 14:00:18 +0200
Message-ID: <CAGd9gwhR5zTjsCugrZeSr7Yt_N6wxv7k5evrLBGW=dkKt257ZA@mail.gmail.com>
To: HTTP Working Group <ietf-http-wg@w3.org>
Subject: Re: JPEG-XL as Content-Encoding?
Archived-At: <https://www.w3.org/mid/CAGd9gwhR5zTjsCugrZeSr7Yt_N6wxv7k5evrLBGW=dkKt257ZA@mail.gmail.com>

Hi,
I'm Alex and I work on the JPEG XL project. The transcoding being
discussed here is indeed JXL's lossless JPEG recompression feature.

As a codec, JXL offers much more than lossless JPEG recompression. You
get much better compression density for similar visual quality if you
don't need to reconstruct the exact original JPEG file (or if your
original was not a JPEG to start with but RAW data from a camera), so it
makes a lot of sense to add jxl as an image codec in its own right.

However, on top of that, lossless recompression of JPEG files gets you
this ~20% gain for existing files. When you deploy a new lossy codec
there is the question of what to do with existing images. Say you have a
website with photos and want to convert your already lossy JPEG files to
a new codec to save storage and bandwidth. If you decode them to pixels
and encode them again in the new format, you end up with either more
artifacts or worse compression density, because the new codec has to
accurately represent the JPEG artifacts, whatever the codec is. This
kind of lossy transcoding is impractical at large scale for existing
images, since each application would need to evaluate whether it is
acceptable for its images. The story is different if you start with a
large, high-quality image (like a JPEG from a camera) and want to encode
it in a smaller form for the web, since there you already have a
high-quality source.

Instead, using jxl lossless recompression of JPEG files as a
content-encoding gets you this ~20% benefit for existing files at large
scale, since you only need browser support plus webserver or intermediary
support. Applications don't need to evaluate whether the new codec fits
their needs or whether its new kind of artifacts suits their use case.
Think about the JS developer who uses the hash of the file as a key
somewhere and would have to update their app to a new codec while still
dealing with browsers that don't support it. A CDN could decide to
deliver JPEG files encoded with the lossless recompression feature to
supporting browsers and take advantage of these gains today, without
modifying the web application, even if the codec is not widely supported
at the beginning, as is the case with many new codecs.
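
To make the hash-as-key point concrete, here is a minimal sketch. It
uses zlib purely as a stand-in for any lossless content-coding (it is
not JXL), and the helper names content_key and roundtrip are made up for
illustration. The point is only that a byte-exact round trip leaves a
content hash unchanged, which a decode-to-pixels transcode would not.

```python
import hashlib
import zlib


def content_key(body: bytes) -> str:
    """Key derived from the exact bytes of a file, as an app might use."""
    return hashlib.sha256(body).hexdigest()


def roundtrip(body: bytes, encode=zlib.compress, decode=zlib.decompress) -> bytes:
    """Encode and decode again; zlib here is only a stand-in lossless coding."""
    return decode(encode(body))


# Placeholder bytes standing in for a JPEG file (not a real image).
original = bytes(range(256)) * 16
restored = roundtrip(original)

# The application-visible bytes, and therefore the key, are unchanged.
same_key = content_key(restored) == content_key(original)
```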

Regarding compatibility with browsers that don't support the
content-encoding and how to decide when to use the encoding, I personally
think a reasonable approach would be this: for static content, store the
file already encoded with JXL to save that ~20% on storage and decode it
at serving time for browsers that don't support it; for dynamic content,
encode on the fly. However, this would depend on the cost model of the
server/CDN.
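
As a rough sketch of the static-content case described above, and not a
definitive implementation: the coding token "jxl" is a hypothetical
placeholder (no such content-coding is assumed to be registered), and
decode_to_jpeg is a caller-supplied function standing in for whatever
reconstructs the original JPEG bytes. The Accept-Encoding parsing is
deliberately simplified and ignores the "*" wildcard.

```python
JXL_CODING = "jxl"  # hypothetical content-coding token, for illustration only


def accepts_coding(accept_encoding: str, coding: str) -> bool:
    """True if the Accept-Encoding value lists `coding` with q > 0."""
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != coding:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        return q > 0
    return False


def choose_response(accept_encoding: str, stored_jxl: bytes, decode_to_jpeg):
    """Pick body and headers for a JPEG that is stored recompressed."""
    headers = {"Content-Type": "image/jpeg", "Vary": "Accept-Encoding"}
    if accepts_coding(accept_encoding, JXL_CODING):
        # Supporting client: send the stored bytes as-is.
        headers["Content-Encoding"] = JXL_CODING
        return stored_jxl, headers
    # Non-supporting client: reconstruct the original JPEG at serving time.
    return decode_to_jpeg(stored_jxl), headers
```

Note that the Content-Type stays image/jpeg in both branches, since the
resource is still the original JPEG; only the coding on the wire differs.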

I think the only shocking thing about a content-encoding for JPEGs is
that it can't encode arbitrary files, only JPEGs. But "general purpose"
compressors like Brotli can't compress every file to a smaller one
either: many binary files that are already compressed, like .zip or even
JPEG files (unless they have a large ICC profile), won't get smaller, so
you just don't apply the encoding, even though Brotli can still encode
them to a file of ~similar size.
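
The "just don't do it" rule can be sketched as follows. This is an
illustration under stated assumptions: zlib stands in for a general
purpose compressor (Brotli is not in the Python standard library), random
bytes stand in for already-compressed data, and encode_if_smaller is a
made-up helper name.

```python
import os
import zlib


def encode_if_smaller(body: bytes, compress=zlib.compress):
    """Return (payload, applied): apply the coding only when it shrinks the body."""
    encoded = compress(body)
    if len(encoded) < len(body):
        return encoded, True
    return body, False


# Repetitive text compresses well, so the coding is applied.
text = b"hello world " * 1000
text_payload, text_applied = encode_if_smaller(text)

# Random bytes behave like already-compressed data: the output is not
# smaller, so the body is sent unencoded.
noise = os.urandom(4096)
noise_payload, noise_applied = encode_if_smaller(noise)
```

A JPEG-only coding fits the same pattern: for any input where it does not
help (including anything that is not a JPEG), the server simply skips it.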

Let us know if you have any questions. We are happy to discuss ways to
improve.