Re: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.

Roger Pantos <rpantos@apple.com> Sat, 13 March 2021 00:29 UTC

From: Roger Pantos <rpantos@apple.com>
Message-id: <79D4AC5D-CE0A-4C48-A2C3-08D674DDCBBA@apple.com>
Date: Fri, 12 Mar 2021 18:29:36 -0600
In-reply-to: <CANtXGXHL0bJF1gtXqZW-j-aVtXj+9KtTMXgcWOQzG18aRGR_PQ@mail.gmail.com>
Cc: "hls-interest@ietf.org" <hls-interest@ietf.org>
To: Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org>
References: <21cc4a753f0c46189ada6f8e3e177516@EX13D02EUB003.ant.amazon.com> <863DA51C-6758-4B01-BB81-1DCE078120FD@apple.com> <CANtXGXG0QPSGU0HWJcyb_U16RzwDpQK7BxXkYU7w2t=tK5Uayg@mail.gmail.com> <EBAC993D-4F95-4FF3-9C02-B1542E6128BC@apple.com> <675EA9BB-2291-4604-9B8B-9C050ABF8490@akamai.com> <4E250CAF-AE87-4610-8B34-4491CBD7F688@apple.com> <CANtXGXHL0bJF1gtXqZW-j-aVtXj+9KtTMXgcWOQzG18aRGR_PQ@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/hls-interest/g7qu7iQGwn3JzA2S71xMqTNL0GI>
Subject: Re: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.


> On Mar 12, 2021, at 3:07 PM, Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org> wrote:
> 
> Would it be simpler to ask the origin to produce URLs of the form vid720_segment_1521?start=14040&length=26048, such that either a direct fetch of that URL returned the range, or a cache could fetch all (or as much as currently exists) of vid720_segment_1521 and satisfy the request from that subrange?
> 
> Can an origin know the size (length) of a part before it is complete? EXT-X-PRELOAD-HINT is described as being written to the manifest before the content is available to egress, so the player can make the request, which the origin will hold open until it can deliver the part at line rate ASAP. If it's describing the part in the manifest before all of the part data is available, then the length cannot be known at describe time.

Ah, you’re right. I had forgotten about the requirement to pre-publish the upcoming part URL in the HINT tag.

> Or are you suggesting reworking named parts entirely to function more like the byterange style like:
> 
> #EXTM3U
> #EXT-X-TARGETDURATION:2
> #EXT-X-MAP:URI="init.m4v"
> #EXTINF:2,
> vid720_segment_1518.m4v
> #EXTINF:2,
> vid720_segment_1519.m4v
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=0&length=100000",INDEPENDENT=YES
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=100001&length=80000"
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=260001&length=80000"
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=340001&length=80000"
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=420001&length=100000",INDEPENDENT=YES
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=520001&length=80000"
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=600001&length=80000"
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=680001&length=80000"
> #EXTINF:2,
> vid720_segment_1520.m4v
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1521.m4v?start=0&length=100000",INDEPENDENT=YES
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1521.m4v?start=100001&length=80000"
> #EXT-X-PART:DURATION=0.25,URI="vid720_segment_1521.m4v?start=260001"

No, that wouldn’t work. It messes up downstream caches to switch the name of the part on the fly.

What if the URL just contained the start (vid720_segment_1521?start=14040)? The cache could do a regular blocking partial-segment request for it, then recognize that it represents a range of the base resource vid720_segment_1521, write the bytes into that entry, and serve it as a range request from there instead of creating a separate cache entry for it?
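
A rough sketch of that edge behavior, just to make the idea concrete (the cache object and origin fetch below are hypothetical stand-ins rather than any particular CDN's API, and parts are assumed to arrive in order):

from urllib.parse import urlparse, parse_qs

def handle_part_request(url, cache, origin_fetch):
    # Map "vid720_segment_1521?start=N" onto a subrange of the base segment
    # instead of creating a separate cache entry for each part.
    parsed = urlparse(url)
    base_key = parsed.path                              # "vid720_segment_1521"
    start = int(parse_qs(parsed.query)["start"][0])
    entry = cache.setdefault(base_key, bytearray())     # one entry per segment
    if start >= len(entry):
        # Not cached yet: make the normal blocking part request upstream
        # (it returns once the part is complete), then append the bytes
        # to the single base-segment entry.
        part = origin_fetch(url)
        entry[start:start + len(part)] = part
        return part
    # Range already present: answer out of the one cached segment object.
    return bytes(entry[start:])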


Roger.

> 
> This could be a solution that exposes byte-range-style functionality to clients and origins that don't explicitly support byte ranges. In this scenario, the bleeding-edge request would be "vid720_segment_1521.m4v?start=260001" - would an origin be expected to deliver a single part and terminate, or follow the byte-range style and deliver parts at line rate as they become available? Perhaps clients could add "&parts=1" to signal they only want one part; otherwise the origin continues to deliver the segment as parts become available.
> 
> Happy Friday!
> 
> Regards,
> -Andrew
> 
> On Fri, Mar 12, 2021 at 12:35 PM Roger Pantos <rpantos=40apple.com@dmarc.ietf.org <mailto:40apple.com@dmarc.ietf.org>> wrote:
> 
> 
>> On Mar 11, 2021, at 4:21 PM, Law, Will <wilaw=40akamai.com@dmarc.ietf.org <mailto:wilaw=40akamai.com@dmarc.ietf.org>> wrote:
>> 
>> Hi Roger
>>  
>> I did a bit of research internally. While the idea of evicting objects from cache the moment their 24s max-age expires is good in theory, in practice it is far from that. Speaking only for Akamai, our cache store entry table is highly optimized for one-way reads. There may be a million or more objects in cache, and the store is set up to quickly return whether an object, identified by its cache key, exists. The store table is very expensive to search horizontally. As a result, the shortest interval at which we can currently evict from the cache is approximately 2 hours. We have multiple projects afoot to reduce that number, but it is a practical limit today.
> 
> That… certainly sounds like an area that’s ripe for optimization. 
> 
>> This compounds the dual-cache problem. Assuming 2.5 MB/s, we would have a block of data 18 GB in size at peak (2.5 MB/s over the roughly two-hour eviction window) before the redundant & stale objects get evicted. Given that edge machines typically have several hundred GB of cache space, having a single stream set consume 18 GB is material. I am simplifying the caching behavior significantly with these statements. We have cache-age eviction multipliers, max idle lifetimes, memory-only caching, metro caching, etc., which affect these numbers and which I have not mentioned. In general, multi-tenant, highly scalable caching systems may not be as efficient as we would want in evicting ‘low durability’ objects such as LL-HLS parts. I am interested in whether other CDNs have similar challenges.
> 
> From chatting with Jan, it seems like the CDNs that use (or are based around) ATS are in much the same boat - they will cache all of the duplicate stream until it ages out in fetch order. (Where he and you may differ is on the tradeoff between the increased cache use and the expense of maintaining a smart edge, but I’ll let him speak to that.)
> 
>> @Andrew – I do like your current proposal of a naming convention over the header approach, simply because headers can become detached as objects move through distribution tiers, whereas filenames persist. However, in order to stitch segments at the edge, we would need to perform a directory-like search against the cache store, similar to “vid720_segment_1521.*”, in order to find all the constituent components. As mentioned above, this is expensive, as it’s an open search against the entirety of the table. Additionally, the edge needs some idea of how many parts should be stitched. The current proposal of listing the offset does not give you that information – are there 4, 5, or 6 parts per segment? A way to solve this problem would be to name the parts according to their order within the segment, for example 
>> vid720_segment_1521.part1of4.m4v
>> vid720_segment_1521.part2of4.m4v etc.
>> This would tell the stitcher when it was complete. The stitcher could then discover the byte offsets by reading the objects, which it would need to do anyway in order to stitch the parts.
>>  
> 
> Would it be simpler to ask the origin to produce URLs of the form vid720_segment_1521?start=14040&length=26048, such that either a direct fetch of that URL returned the range, or a cache could fetch all (or as much as currently exists) of vid720_segment_1521 and satisfy the request from that subrange?
> 
> 
> Roger.
> 
>> If stitching is to be done, then we’d prefer it be through the reserved naming of parts rather than headers. However, given the cost and complexity of edge stitching in general, our preference is still to promote the use of the byte-range addressing mode. It solves the redundant cache content problem, reduces the client request count, and has already been a part of the spec for the past year.
>>  
>> Cheers
>> Will
>>  
>>  
>>  
>>  
>> From: Roger Pantos <rpantos=40apple.com@dmarc.ietf.org <mailto:rpantos=40apple.com@dmarc.ietf.org>>
>> Date: Wednesday, March 10, 2021 at 11:01 AM
>> To: Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org <mailto:acrowe=40llnw.com@dmarc.ietf.org>>
>> Cc: "Law, Will" <wilaw=40akamai.com@dmarc.ietf.org <mailto:wilaw=40akamai.com@dmarc.ietf.org>>, Jan Van Doorn <jvd=40apple.com@dmarc.ietf.org <mailto:jvd=40apple.com@dmarc.ietf.org>>, "Weil, Nicolas" <nicoweil@elemental.com <mailto:nicoweil@elemental.com>>, "hls-interest@ietf.org <mailto:hls-interest@ietf.org>" <hls-interest@ietf.org <mailto:hls-interest@ietf.org>>
>> Subject: Re: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.
>>  
>> Hello Andrew,
>> 
>> 
>>> On Mar 9, 2021, at 6:26 AM, Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org <mailto:acrowe=40llnw.com@dmarc.ietf.org>> wrote:
>>>  
>>> Folks,
>>> 
>>> Thanks for your replies and patience. While my primary concerns are about origin hit rate/health and latency of delivery with mixed mode, Will provided great input about the caching layer that further highlights a lot of the difficulties of the named-part format of LL-HLS. I had initially written out thoughtful replies to each comment, but I feel that we can skip to the end and consider alternatives that may improve interoperability of named parts with range-based delivery, rather than immediately moving away from named parts altogether. 
>>> 
>>> Clearly, response headers will not be viable, as CDNs prefer/require information about the object at request time. We could gain some value from that concept by including some of that information in the part-file names, changing from an iterative sequence identifier to an offset identifier. So vid720_segment_1521.m4v's parts would be presented as:
>>> vid720_segment_1521.part1.m4v -> vid720_segment_1521.offset-0.m4v
>>> vid720_segment_1521.part2.m4v -> vid720_segment_1521.offset-623751.m4v 
>>> 
>>> and so on. This keeps the named part file concept for the broad scope of players, but also allows the caching layer to understand where in the root file (vid720_segment_1521.m4v) a part is located _at request time_.
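
A minimal sketch of the request-time mapping that this naming would allow (assuming exactly the offset-N convention proposed above, which is not part of any spec):

import re

# "vid720_segment_1521.offset-623751.m4v" -> ("vid720_segment_1521.m4v", 623751)
PART_NAME = re.compile(r"^(?P<root>.+)\.offset-(?P<offset>\d+)\.m4v$")

def part_to_root_and_offset(name):
    m = PART_NAME.match(name)
    if m is None:
        return None                        # not an offset-named part
    return m.group("root") + ".m4v", int(m.group("offset"))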
>>> 
>>> However, this does not achieve 100% interop, as the CDN cannot equate an offset-named part with an exact range, nor can it assume a fixed number of parts per segment, since the segment may be truncated for an interstitial content break. 
>>> 
>>> In summary: is there any desire from the community to iterate on the named-part format of LL-HLS with a goal of better interoperability/origin shielding, or should we designate range-based as the preferred method of LL-HLS delivery and push to deprecate/ostracize named parts due to their faults (small objects, high RPS, low interop, etc.)?
>>  
>> Before proceeding too much further with this I’d like to get a sense of the potential win here. Named-parts (as you call them) provide excellent compatibility with packagers, origins and proxy caches without additional changes, at some cost in duplicate cache and transmission. I’d like to understand the implication of that cost.
>>  
>> Let’s do some math. Say we have a low-latency live stream with 4s segments, as Will suggested, and a combined bit rate of 20 Mbps for all tiers. (This is generously assuming that some clients are pulling on every tier.) So at 2.5 MB per second, 24s of cache duplication is 60 MB.
>>  
>> Question for the (CDN) audience in general: how many active low-latency live streams (where I define active as “more than two clients watching on a particular edge”) would you need to see on the typical edge you’d use for distributing such a stream, before a 60 MB per-stream contribution showed up on your dashboards as a noticeable amount of cache occupancy?
>>  
>>  
>> thanks,
>>  
>> Roger.
>> 
>> 
>>> 
>>> Many thanks,
>>> -Andrew
>>>  
>>> On Mon, Feb 15, 2021 at 11:34 AM Jan Van Doorn <jvd=40apple.com@dmarc.ietf.org <mailto:40apple.com@dmarc.ietf.org>> wrote:
>>>> One of the original LL-HLS design goals was to scale with regular CDNs and not require anything other than HTTP/2 on the caches. I think CDNs work best when they just implement the HTTP spec. I think that is a strong reason to not implement edge stitching as well. 
>>>>  
>>>> Rgds,
>>>> JvD
>>>>  
>>>> 
>>>> 
>>>>> On Feb 12, 2021, at 6:11 PM, Weil, Nicolas <nicoweil@elemental.com <mailto:nicoweil@elemental.com>> wrote:
>>>>>  
>>>>> From the origin perspective, I would add that we’d have a strong reason not to implement it: edge stitching dilutes responsibility and makes it impossible to debug efficiently when things start to go wrong.
>>>>>  
>>>>> I would rather propose that origins add a specific header to the last part of a full segment, carrying that segment’s name, so that CDNs can prefetch the full segment and provide the best delivery performance possible – something like hls-fullsegment-prefetch: vid720_segment_1521.m4v
>>>>>  
>>>>> This would work well with ad insertion discontinuities and preserve a clear split of the responsibilities between the origin and the CDN.
>>>>>  
>>>>> Thanks,
>>>>> Nicolas
>>>>> ----------------
>>>>> Nicolas Weil | Senior Product Manager – Media Services
>>>>> AWS Elemental
>>>>>  
>>>>> From: Hls-interest <hls-interest-bounces@ietf.org <mailto:hls-interest-bounces@ietf.org>> On Behalf Of Law, Will
>>>>> Sent: Friday, February 12, 2021 2:28 PM
>>>>> To: hls-interest@ietf.org <mailto:hls-interest@ietf.org>
>>>>> Subject: RE: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.
>>>>>  
>>>>> Hi Roger
>>>>>  
>>>>> From an Akamai perspective, we acknowledge the issue raised by Andrew, in that duplicate parts and segments reduce cache efficiency at the edge. This is something the HLS community should strive to reduce over time. I have four main responses to this proposal.
>>>>>  
>>>>> Firstly, we feel that the existing HLS spec <https://tools.ietf.org/html/draft-pantos-hls-rfc8216bis-08> already provides a solution to this problem, namely the use of byte-range addressing for parts. Under this addressing mode, the only objects the origin produces are segments. These are the single objects that are cached at the edge. The use of ranges to retrieve parts affords clients the ability to lower their latency below the segment duration and to start and switch quickly. This approach has the following advantages over edge-stitching:
>>>>>  
>>>>> Segment caching and byte-range delivery are natively supported by any CDN which supports HTTP/2 delivery. It does not need to be taught new behaviors (RFC 8673 edge case aside). One of the secrets to HLS's success has been the simplicity of scaling it out. Any HTTP server has sufficed in the past. In introducing edge stitching, we are moving to more of a Smooth Streaming model, which places a dependency on logic at the edge. This logic has to be implemented consistently across multiple CDNs and introduces critical-path complexity which has to be managed.
>>>>> Edge stitching has an overhead cost, mainly in directory searches to discover aggregate parts. Searches are always more expensive than lookups since you must span the whole directory tree. This compute cost would have to be absorbed by the CDN. Edge stitching is basically trading cache efficiency for compute. Byte-range does not force you to make this trade-off.
>>>>>  
>>>>> Secondly, the period of time during which we have duplicate cache content is not actually that long. Per the HLS spec, blocking media object requests (such as parts) should be cached for 6 target durations (basically a 100% safety factor, since they are only described for the last 3 target durations of the stream). For 4s segments, this means we can evict our stored parts after 24s. The media segments, which can be requested by standard-latency clients as well as low-latency clients scrubbing behind live, need to be cached longer. 
>>>>>  
>>>>> Consider the case of a live stream with 4s segments and 1s part durations.
>>>>>  
>>>>> After 40s of streaming, we have
>>>>> 10 media segments holding 40s of data in the cache
>>>>> 24s of duplicate part data
>>>>> The overall cache duplication is 24/40  = 60%
>>>>>  
>>>>> After 5mins (300s) of streaming, we have
>>>>> 300s of media segments in the cache
>>>>> 24s of duplicate part data
>>>>> The overall cache duplication is 24/300  = 8%
>>>>>  
>>>>> After 30mins (1800s) of streaming, we have
>>>>> 1800s of media segments in the cache
>>>>> 24s of duplicate part data
>>>>> The overall cache duplication is 24/1800  = 1.3%
>>>>>  
>>>>> So streams with realistic durations in the minutes actually have quite a low percentage of duplicate data, as long as the CDN is aggressive about cache eviction and the origin does a good job in setting cache-control headers.
>>>>>  
>>>>> Thirdly, at Akamai we would have a hard time implementing the edge stitching as proposed, and the same may be true for other CDNs. The reason is that while the origin header information gets written into our cache store entry table, the store tables are architected to very efficiently tell you whether an object exists and, if so, to return it. They are not databases optimized for horizontal searching. We cannot search across the cache, for example asking for all objects whose X-HLS-Part-Root-Segment header matches a certain string. It would be very difficult to implement the edge stitch proposed here. We would need to externalize the header information into some sort of parallel database which we could query. While we have such structures (via EdgeWorkers and EdgeKV), their use would raise the cost and complexity of delivering LL-HLS. At that point we would probably choose to suffer the low duplication rates and instead focus on efficiently evicting parts.
>>>>>  
>>>>> Fourthly, if the community opinion is to still proceed with this edge-stitch plan, then I would offer the following suggestions:
>>>>> 1.       To avoid header bloat, the sequence and offset headers could be collapsed into a single header, for example HLS-Part-Info:<current-part>,<total-part-count>,<byte-offset>. This would look like HLS-Part-Info:2,8,623751. Due to HPACK and QPACK header compression, we would not want to place the root-segment in the same bundle, as it will be invariant over the parts from the same segment and hence can be compressed more efficiently if it is separate.
>>>>> 2.       The IETF strongly discourages the use of X- in header prefixes <https://tools.ietf.org/html/rfc6648>. A simple header name such as ‘HLS-part-info’ would be preferable.
>>>>> 3.       Do you need both byte offset and sequence? Once you know the sequence, you can read the byte-lengths from the individually stored parts.
>>>>> 4.       Segments get truncated without warning, often for ad insertion discontinuities and always at the end when the encoder is turned off.  Say you are making 8 parts per 4s segment and have sent off the first two parts to the CDN before reaching a sudden discontinuity. You have labelled these as 1/8 and 2/8 respectively. Since 3-8 are never produced, the edge routine would waste some time and resources looking for 3-8, before giving up and going to the origin to fetch the segment. Performance – especially TTFB and apparent throughput – would suffer.
>>>>> 5.       You may have an edge server which is only serving legacy clients pulling segments and no low-latency clients seeding the edge with part requests. In this case, the edge would waste time searching for constituent parts before giving up and going to the origin to fetch the segment. Performance again would suffer. 
>>>>>  
>>>>> I appreciate Limelight raising these issues and look forward to debating a mutually efficient solution which benefits content distributors, CDNs and players.
>>>>>  
>>>>> Have a good long weekend!
>>>>>  
>>>>> Cheers
>>>>> Will
>>>>>  
>>>>>  
>>>>> --------------------------------------------------------
>>>>> Chief Architect – Edge Technology Group
>>>>> Akamai Technologies
>>>>> San Francisco
>>>>> Cell: +1.415.420.0881
>>>>>  
>>>>>  
>>>>>  
>>>>>  
>>>>>  
>>>>>  
>>>>>  
>>>>> From: Roger Pantos <rpantos=40apple.com@dmarc.ietf.org <mailto:rpantos=40apple.com@dmarc.ietf.org>>
>>>>> Date: Friday, February 12, 2021 at 10:40 AM
>>>>> To: Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org <mailto:acrowe=40llnw.com@dmarc.ietf.org>>
>>>>> Cc: "hls-interest@ietf.org <mailto:hls-interest@ietf.org>" <hls-interest@ietf.org <mailto:hls-interest@ietf.org>>
>>>>> Subject: Re: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.
>>>>>  
>>>>>  
>>>>>  
>>>>> 
>>>>>> On Feb 9, 2021, at 7:54 AM, Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org <mailto:acrowe=40llnw.com@dmarc.ietf.org>> wrote:
>>>>>>  
>>>>>> Hello,
>>>>>> 
>>>>>> CMAF content packaged and delivered using LL-DASH and range-based LL-HLS is easily managed as duplicate content by CDNs, because both modes specify only segment files. In fact, once the segment is complete, it can then be served out of CDN cache for players that are not low-latency capable - effectively reducing latency for them as well. Part-based LL-HLS introduces individually named part files that then collapse into the separately named segment file upon final part completion. This means that on first request for the whole collapsed segment file, the CDN will have to go back to the origin to request bytes that it likely already has in the individually named part files. CDNs can improve cache efficiency, origin hit rate, and whole-segment delivery times with a little bit of additional information from the origin.
>>>>>> 
>>>>>> 
>>>>>> On request for a named part file an origin may provide a set of response headers:
>>>>>> 
>>>>>> *X-HLS-Part-Sequence*
>>>>>> A multi-value header that represents the current part sequence (starting at index 1) and the total number of parts for the segment. The values are separated by a forward slash ("/"). For example, a 2-second segment with 8 parts per segment will respond to the 2nd part request (vid720_segment_1521.part2.m4v) like
>>>>>> X-HLS-Part-Sequence: 2/8
>>>>>> 
>>>>>> 
>>>>>> *X-HLS-Part-Offset*
>>>>>> A single-value header that represents the byte offset of the part within the segment. The first part of a segment will always be 0, while, for example, the second 0.25s part of a 2 Mbps stream (vid720_segment_1521.part2.m4v) may have a value like 623751.
>>>>>> 
>>>>>> 
>>>>>> *X-HLS-Part-Root-Segment*
>>>>>> A single-value header that provides the name of the root segment for the current part. This lets the CDN/proxy know which root file to concatenate the parts into. vid720_segment_1521.part2.m4v would have a value of vid720_segment_1521.m4v.
>>>>>> 
>>>>>> 
>>>>>> With the information from these three headers the CDN can recognize the individually named part files as ranges of a larger file, store them effectively and deliver a better experience to viewers across all formats. 
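
For illustration, a toy model of how a cache could index a part response using the three proposed headers (the cache structure here is hypothetical, not any CDN's actual store):

def index_part_response(cache, headers, body):
    root = headers["X-HLS-Part-Root-Segment"]              # "vid720_segment_1521.m4v"
    offset = int(headers["X-HLS-Part-Offset"])             # e.g. 623751
    current, total = map(int, headers["X-HLS-Part-Sequence"].split("/"))  # "2/8"
    entry = cache.setdefault(root, {"data": bytearray(), "seen": set()})
    entry["data"][offset:offset + len(body)] = body        # write the part into the root entry
    entry["seen"].add(current)
    # Once every part has been seen, a request for the root segment can be
    # answered from cache instead of going back to the origin.
    return len(entry["seen"]) == total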
>>>>>  
>>>>> Hello Andrew. I’m interested in this proposal, but I’d also like to hear some feedback from others in the CDN and packager spaces. Specifically, I’d like to know if other folks:
>>>>>  
>>>>> - Agree that it’s a good way to solve the problem
>>>>>  
>>>>> - Can spot any problems or limitations in this proposal that might make it difficult to produce (or consume) these headers
>>>>>  
>>>>> - Can see themselves implementing it
>>>>>  
>>>>>  
>>>>> thanks,
>>>>>  
>>>>> Roger Pantos
>>>>> Apple Inc.
>>>>>  
>>>>> 
>>>>>> 
>>>>>> Regards,
>>>>>> -Andrew
>>>>>> -- 
>>>>>> Andrew Crowe | Architect, Limelight Networks
>>>>>> +1 859 583 3301
>>>>>> www.limelight.com <https://www.limelight.com/>
>>>>>>  
>>>>>  
>>>>> 
>>>>  
>>> 
>>>  
>>> -- 
>>> Andrew Crowe | Architect, Limelight Networks
>>> +1 859 583 3301
>>> www.limelight.com <https://www.limelight.com/>
>>>  
>> 
>> 
> 
> 
> -- 
> Andrew Crowe | Architect, Limelight Networks
> +1 859 583 3301
> www.limelight.com <https://www.limelight.com/>
> -- 
> Hls-interest mailing list
> Hls-interest@ietf.org
> https://www.ietf.org/mailman/listinfo/hls-interest