Re: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.

"Law, Will" <wilaw@akamai.com> Sat, 13 March 2021 00:22 UTC

From: "Law, Will" <wilaw@akamai.com>
To: Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org>, Roger Pantos <rpantos@apple.com>
CC: "hls-interest@ietf.org" <hls-interest@ietf.org>
Date: Sat, 13 Mar 2021 00:22:13 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/hls-interest/ujchAW2OQuSlUNeFG8omobVIkVc>

I like this named approach. It harmonizes the two addressing schemes (discrete versus range-based) and gives one way to create an LL-HLS playlist, which always leads to better support and interop across origins, networks and players. Players can choose whether to use aggregating responses or discrete requests when playing back the content.

One suggestion - since the query args of “start” and “length” are reserved and carry special meaning for the origin, they should follow the established convention and be prefixed by ‘_HLS_’. Additionally, there is already a syntax established within HLS for describing a range and offset, so this should be continued. So a part definition might look like:

#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?_HLS_partrange=100000@0",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?_HLS_partrange=80000@100001"
…
It is also important that CDNs MUST incorporate the _HLS_ query arg in their cache key. If not, then the vid720_segment_1520.m4v object will keep getting overwritten at the edge by the content of the last part. The reality is that they already have to do this to support the blocking playlist updates, so it is not asking anything new regarding LL-HLS delivery.
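
To make the cache-key point concrete, here is a minimal Python sketch, not a definitive implementation. It assumes the part URI carries `_HLS_partrange=<length>@<offset>` as suggested above; the function names `parse_partrange` and `cache_key` are hypothetical and only illustrate that the query arg has to stay in the key.

```python
from urllib.parse import urlsplit, parse_qs


def parse_partrange(uri):
    """Return (path, length, offset) for a URI carrying _HLS_partrange=<length>@<offset>.

    Hypothetical helper: length is None when it is omitted (for example "@260001"),
    and both values are None when the query arg is absent.
    """
    parts = urlsplit(uri)
    values = parse_qs(parts.query, keep_blank_values=True).get("_HLS_partrange")
    if not values:
        return parts.path, None, None
    length_text, _, offset_text = values[0].partition("@")
    length = int(length_text) if length_text else None
    return parts.path, length, int(offset_text)


def cache_key(uri):
    """Hypothetical cache key that keeps the query string, so a part never
    overwrites the full segment stored under the bare path."""
    parts = urlsplit(uri)
    return parts.path + ("?" + parts.query if parts.query else "")


part = "vid720_segment_1520.m4v?_HLS_partrange=100000@0"
print(parse_partrange(part))
print(cache_key(part) != cache_key("vid720_segment_1520.m4v"))
```

A key built from the path alone would collapse every part and the full segment into one cache entry, which is the overwrite problem described above.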

The key issue, as you describe, is what to do with the PRELOAD-HINT. It is different in two ways from all the other parts:

  1.  Its length is not known.
  2.  A discrete-mode operating client wants only that part returned, while a byte-range operating client would like that part plus the remainder of the segment.

One solution could be to describe it with the length component missing, to indicate that it is unknown:

#EXT-X-PRELOAD-HINT:TYPE=PART,URI="vid720_segment_1521.m4v?_HLS_partrange=@260001"

The origin behavior rule would then be quite simple:

IF a request is received with a _HLS_partrange query arg in which the length is undefined
THEN
    IF the Range request header is present
    THEN serve an aggregating response to the end of the segment with a 206 response code
    ELSE block until the part is complete and return a 200 response.
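
The rule above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not an origin implementation: `plan_response` is a hypothetical name, the returned labels are descriptive strings rather than real server actions, and the 400 response for malformed values and the handling of a known length are assumptions that the rule itself does not cover.

```python
def plan_response(partrange, has_range_header):
    """Decide how an origin might answer a request carrying _HLS_partrange.

    partrange is the raw query value: "<length>@<offset>" or "@<offset>".
    Returns (status_code, behaviour_label).
    """
    length_text, separator, offset_text = partrange.partition("@")
    if not separator or not offset_text.isdigit():
        return 400, "malformed"  # assumption: not specified by the rule
    if length_text and not length_text.isdigit():
        return 400, "malformed"  # assumption: not specified by the rule
    if length_text:
        # Length is known, so the part is complete (assumed ordinary response).
        return 200, "serve-complete-part"
    if has_range_header:
        # Byte-range style client: stream to the end of the segment.
        return 206, "aggregate-to-end-of-segment"
    # Discrete-mode client: hold the request open until the part is finished.
    return 200, "block-until-part-complete"


print(plan_response("@260001", has_range_header=True))
print(plan_response("@260001", has_range_header=False))
```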

Smart CDNs can also do two things for efficiency:

  1.  If they receive a request for vid720_segment_1520.m4v (i.e., a standard latency client), they could scan their cache for all vid720_segment_1520.m4v*_HLS_partrange* objects and then construct the response by concatenating those objects. They could also cache the full segment so the concatenation would only need to be performed once. This workflow assumes that the segment request would always follow the last part request. It should, according to the rules of timing; however, errant clients asking ahead of time would receive a partial build of the segment.
  2.  Assuming no edge stitching, a CDN can purge its cache of any asset with a “_HLS_partrange” query arg that is older than 24s. This would help minimize the redundant content problem.
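
As a rough sketch of those two behaviours, the Python below models the edge cache as a plain dict. That is a deliberate simplification: the dict layout (key mapped to a `(stored_at_seconds, body)` tuple) and the names `stitch_segment` and `purge_stale_parts` are hypothetical, and a real cache store may not support this kind of scan cheaply.

```python
import re

# Matches cache keys of the form "<path>?_HLS_partrange=<length>@<offset>".
PARTRANGE_KEY = re.compile(
    r"^(?P<path>[^?]+)\?_HLS_partrange=(?P<length>\d+)@(?P<offset>\d+)$"
)


def stitch_segment(cache, segment_path):
    """Concatenate the cached parts of segment_path in offset order.

    Returns None when no parts are cached. No completeness check is made,
    so an early request yields a partial build of the segment.
    """
    found = []
    for key, (_stored_at, body) in cache.items():
        match = PARTRANGE_KEY.match(key)
        if match and match.group("path") == segment_path:
            found.append((int(match.group("offset")), body))
    if not found:
        return None
    found.sort(key=lambda item: item[0])
    return b"".join(body for _offset, body in found)


def purge_stale_parts(cache, now, max_age=24.0):
    """Remove _HLS_partrange entries older than max_age seconds; return the count."""
    stale = [
        key
        for key, (stored_at, _body) in cache.items()
        if "_HLS_partrange=" in key and now - stored_at > max_age
    ]
    for key in stale:
        del cache[key]
    return len(stale)
```

Both functions walk every key in the cache, which is the kind of cache-wide search described elsewhere in this thread as expensive on some CDNs.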

Cheers
Will


From: Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org>
Date: Friday, March 12, 2021 at 1:07 PM
To: Roger Pantos <rpantos@apple.com>
Cc: "Law, Will" <wilaw@akamai.com>, "hls-interest@ietf.org" <hls-interest@ietf.org>
Subject: Re: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.

Would it be simpler to ask the origin to produce urls of the form vid720_segment_1521?start=14040&length=26048 such that either a direct fetch of that URL returned the range, or a cache could fetch all (or as much that currently exists) of vid720_segment_1521 and satisfy the request out of the subrange?

Can an origin know the size (length) of a part before it is complete? EXT-X-PRELOAD-HINT is described as being written to the manifest before the content is available to egress, so the player can make the request which the origin will hold open until it can deliver the part at line rate ASAP. If it's describing the part in the manifest before all of the part data is available, then the length cannot be known at describe-time. Or are you suggesting reworking named parts entirely to function more like the byte-range style, like:

#EXTM3U
#EXT-X-TARGETDURATION:2
#EXT-X-MAP:URI="init.m4v"
#EXTINF:2,
vid720_segment_1518.m4v
#EXTINF:2,
vid720_segment_1519.m4v
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=0&length=100000",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=100001&length=80000"
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=260001&length=80000"
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=340001&length=80000"
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=420001&length=100000",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=520001&length=80000"
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=600001&length=80000"
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1520.m4v?start=680001&length=80000"
#EXTINF:2,
vid720_segment_1520.m4v
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1521.m4v?start=0&length=100000",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1521.m4v?start=100001&length=80000"
#EXT-X-PART:DURATION=0.25,URI="vid720_segment_1521.m4v?start=260001"

This could be a solution that exposes the byte-range type functionality to clients and origins that don't explicitly support byte-ranges. In this scenario, the bleeding-edge request would be "vid720_segment_1521.m4v?start=260001" - would an origin be expected to deliver a single part and terminate, or follow byte-range style and deliver parts at line rate as they become available? Perhaps clients could add "&parts=1" to signal they only want one part, otherwise the origin continues to deliver the segment as parts are available?

Happy Friday!

Regards,
-Andrew

On Fri, Mar 12, 2021 at 12:35 PM Roger Pantos <rpantos=40apple.com@dmarc.ietf.org> wrote:



On Mar 11, 2021, at 4:21 PM, Law, Will <wilaw=40akamai.com@dmarc.ietf.org> wrote:

Hi Roger

I did a bit of research internally. While the idea of evicting objects from cache the moment their 24s max-age expires is good in theory, in practice it is far from that. Speaking only for Akamai, our cache store entry table is highly optimized for one-way reads. There may be a million or more objects in cache and the store is set up to quickly return whether an object, identified by its cache key, exists. The store table is very expensive to search horizontally. As a result, the shortest interval at which we can currently evict objects from the cache is approximately 2 hours. We have multiple projects afoot to reduce that number, but it is a practical limit today.

That… certainly sounds like an area that’s ripe for optimization.


This compounds the dual cache problem. Assuming 2.5Mbytes/s, we would have a block of data 18GB in size at peak before the redundant & stale objects get evicted. Given that edge machines typically have several hundred GB of cache space, having a single stream set consume 18GB is material. I am simplifying the caching behavior significantly with these statements. We have cache age eviction multipliers, max idle lifetimes, memory-only caching, metro caching etc which affect these numbers and which I have not mentioned. In general, multi-tenant highly scalable caching systems may not be as efficient as we would want in evicting ‘low durability’ objects such as LL-HLS parts. I am interested as to whether other CDNs have similar challenges.
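
The 18GB figure follows from the 2.5 Mbytes/s rate and the roughly 2 hour eviction interval quoted above. A small Python check of that arithmetic, using decimal units (1 GB = 10^9 bytes) as an assumption and a hypothetical function name:

```python
def stale_cache_bytes(rate_bytes_per_second, eviction_interval_seconds):
    """Bytes of redundant part data that accumulate before the first eviction."""
    return rate_bytes_per_second * eviction_interval_seconds


two_hours = 2 * 60 * 60  # 7200 seconds
peak_bytes = stale_cache_bytes(2.5e6, two_hours)
print(peak_bytes / 1e9)  # 18.0 (GB)
```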

Chatting with Jan, it seems like the CDNs that use (or are based around) ATS are in much the same boat - they will cache all of the duplicate stream until it ages out in fetch order. (Where he and you may differ is on the tradeoff between the increased cache use and the expense of maintaining a smart edge — but I’ll let him speak to that.)


@Andrew – I do like your current proposal of a naming convention over the header approach, simply because headers can become detached as objects move through distribution tiers, whereas filenames persist. However, in order to stitch segments at the edge, we would need to perform a directory-like search against the cache store similar to “vid720_segment_1521.*” in order to find all the constituent components. As mentioned above, this is expensive, as it’s an open search against the entirety of the table. Additionally, the edge needs some idea of how many parts should be stitched. The current proposal of listing the offset does not give you that information – are there 4, or 5 or 6 parts per segment? A way to solve this problem would be to name the segments according to their part order, for example:
vid720_segment_1521.part1of4.m4v
vid720_segment_1521.part2of4.m4v etc.
This would tell the stitcher when it was complete. The stitcher could then discover the byte-offsets by reading the objects, which it would need to do anyway in order to stitch the parts.
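
To illustrate how a stitcher might use such names, here is a minimal Python sketch. It assumes part names follow exactly the `<stem>.part<N>of<M>.<ext>` pattern from the example above; `stitch_status` is a hypothetical helper and the input is just a list of cached object names.

```python
import re

# Assumed naming pattern, e.g. "vid720_segment_1521.part2of4.m4v".
PART_NAME = re.compile(
    r"^(?P<stem>.+)\.part(?P<index>\d+)of(?P<total>\d+)\.(?P<ext>\w+)$"
)


def stitch_status(cached_names, segment_name):
    """Return (is_complete, ordered_part_names) for a segment such as
    "vid720_segment_1521.m4v", based only on the names present in cache."""
    stem, _, ext = segment_name.rpartition(".")
    seen = {}
    total = None
    for name in cached_names:
        match = PART_NAME.match(name)
        if not match or match.group("stem") != stem or match.group("ext") != ext:
            continue
        total = int(match.group("total"))
        seen[int(match.group("index"))] = name
    if total is None:
        return False, []
    ordered = [seen[index] for index in sorted(seen)]
    complete = all(index in seen for index in range(1, total + 1))
    return complete, ordered
```

The total encoded in the name is what tells the stitcher whether it can stop looking; the byte offsets are not needed for that decision.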


Would it be simpler to ask the origin to produce urls of the form vid720_segment_1521?start=14040&length=26048 such that either a direct fetch of that URL returned the range, or a cache could fetch all (or as much that currently exists) of vid720_segment_1521 and satisfy the request out of the subrange?


Roger.


If stitching is to be done, then we’d prefer it be through the reserved naming of parts versus headers. However, given the cost and complexity of edge stitching in general, our preference is still to promote the use of the byte-range addressing mode. It solves the redundant cache content problem, reduces the client request count and has been a part of the spec already for the past year.

Cheers
Will




From: Roger Pantos <rpantos=40apple.com@dmarc.ietf.org>
Date: Wednesday, March 10, 2021 at 11:01 AM
To: Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org>
Cc: "Law, Will" <wilaw=40akamai.com@dmarc.ietf.org>, Jan Van Doorn <jvd=40apple.com@dmarc.ietf.org>, "Weil, Nicolas" <nicoweil@elemental.com>, "hls-interest@ietf.org" <hls-interest@ietf.org>
Subject: Re: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.

Hello Andrew,

On Mar 9, 2021, at 6:26 AM, Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org> wrote:

Folks,

Thanks for your replies and patience. While my primary concerns are about origin hit-rate/health and latency of delivery with mixed mode, Will provided great input about the caching layer that further highlights a lot of the difficulties of the named-part format of LL-HLS. I had initially written out thoughtful replies to each comment, but I feel that we can skip to the end and consider alternatives that may improve interoperability of named-parts with range-based vs immediately moving away from named-parts altogether.

Clearly, response headers will not be viable as CDNs prefer/require information about the object at request-time. We could gain some value out of that concept by including some of that information in the part file requests by changing from an iterative sequence identifier to an offset identifier. So vid720_segment_1521.m4v's parts would be presented as:
vid720_segment_1521.part1.m4v -> vid720_segment_1521.offset-0.m4v
vid720_segment_1521.part2.m4v -> vid720_segment_1521.offset-623751.m4v

and so on. This keeps the named part file concept for the broad scope of players, but also allows the caching layer to understand where in the root file (vid720_segment_1521.m4v) a part is located _at request time_.

However, this does not achieve 100% interop as the CDN cannot equate an offset named part with an exact range, nor can it assume any number of parts per segment as the segment may be truncated for an interstitial content break.

In summary - is there any desire from the community to iterate on the named part format of LL-HLS with a goal of better interoperability/origin shielding or should we designate range-based as the preferable method of LL-HLS delivery and push to deprecate/ostracize named parts due to their faults (small object, high RPS, low interop, etc...)?

Before proceeding too much further with this I’d like to get a sense of the potential win here. Named-parts (as you call them) provide excellent compatibility with packagers, origins and proxy caches without additional changes, at some cost in duplicate cache and transmission. I’d like to understand the implication of that cost.

Let’s do some math. Say we have a low-latency live stream with 4s segments, as Will suggested, and a combined bit rate of 20Mbps for all tiers. (This is generously assuming that some clients are pulling on every tier.) So at 2.5MB per second, 24s of cache duplication is 60MB.
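
The same arithmetic as a short Python check, assuming decimal units (1 MB = 10^6 bytes, 8 bits per byte); the function name is hypothetical:

```python
def duplicate_cache_megabytes(combined_bitrate_mbps, duplicate_seconds):
    """Megabytes of duplicated part data held for duplicate_seconds of stream."""
    megabytes_per_second = combined_bitrate_mbps / 8  # 20 Mbps -> 2.5 MB/s
    return megabytes_per_second * duplicate_seconds


print(duplicate_cache_megabytes(20, 24))  # 60.0
```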

Question for the (CDN) audience in general: how many active low-latency live streams (where I define active as “more than two clients watching on a particular edge”) would you need to see on the typical edge you’d use for distributing such a stream, before a 60MB per stream contribution showed up on your dashboards as a noticeable amount of cache occupancy?


thanks,

Roger.


Many thanks,
-Andrew

On Mon, Feb 15, 2021 at 11:34 AM Jan Van Doorn <jvd=40apple.com@dmarc.ietf.org> wrote:
One of the original LL-HLS design goals was to scale with regular CDNs and not require anything other than HTTP/2 on the caches. I think CDNs work best when they just implement the HTTP spec. I think that is a strong reason to not implement edge stitching as well.

Rgds,
JvD


On Feb 12, 2021, at 6:11 PM, Weil, Nicolas <nicoweil@elemental.com> wrote:

From the origin perspective, I would add that we’d have a strong reason not to implement it: the edge stitching dilutes responsibility and makes it impossible to debug efficiently when things start to go wrong.

I would rather propose that origins add a specific header to the last part of a full segment, with the name of it, so that CDNs can prefetch the full segment and provide the best delivery performance possible – something like hls-fullsegment-prefetch: vid720_segment_1521.m4v

This would work well with ad insertion discontinuities and preserve a clear split of the responsibilities between the origin and the CDN.

Thanks,
Nicolas
----------------
Nicolas Weil | Senior Product Manager – Media Services
AWS Elemental

From: Hls-interest <hls-interest-bounces@ietf.org> On Behalf Of Law, Will
Sent: Friday, February 12, 2021 2:28 PM
To: hls-interest@ietf.org
Subject: RE: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.

Hi Roger

From an Akamai perspective, we acknowledge the issue raised by Andrew, in that duplicate parts and segments reduce cache efficiency at the edge. This is something the HLS community should strive to reduce over time. I have four main responses to this proposal.

Firstly, we feel that the existing HLS spec <https://tools.ietf.org/html/draft-pantos-hls-rfc8216bis-08> already provides a solution to this problem, namely the use of byte-range addressing for parts. Under this addressing mode, the only objects the origin produces are segments. These are the single objects that are cached at the edge. The use of ranges to retrieve parts affords the clients the ability to lower their latency below the segment duration and to start and switch quickly. This approach has the following advantages over edge-stitching:


  1.  Segment caching and byte-range delivery is natively supported by any CDN which supports HTTP/2 delivery. It does not need to be taught new behaviors (RFC8673 edge case aside). One of the secrets to HLS success has been the simplicity of scaling it out. Any HTTP server has sufficed in the past. In introducing edge stitching, we are moving to more of a Smooth Streaming model, which places a dependency on logic at the edge. This logic has to be implemented consistently across multiple CDNs and introduces critical path complexity which has to be managed.
  2.  Edge stitching has an overhead cost, mainly in directory searches to discover aggregate parts. Searches are always more expensive than lookups since you must span the whole directory tree. This compute cost would have to be absorbed by the CDN. Edge stitching is basically trading cache efficiency for compute. Byte-range does not force you to make this trade-off.

Secondly, the period of time in which we have duplicate cache content is not actually that long. Per the HLS spec, blocking media object requests (such as parts) should be cached for 6 target durations (basically a 100% safety factor since they are only described for the last 3 target durations of the stream). For 4s segments, this means we can evict our stored parts after 24s. The media segments, which can be requested by standard latency clients as well as low latency clients scrubbing behind live, need to be cached longer.

Consider the case of a live stream with 4s segments and 1s part durations.

After 40s of streaming, we have
10 media segments holding 40s of data in the cache
24s of duplicate part data
The overall cache duplication is 24/40  = 60%

After 5mins (300s) of streaming, we have
300s of media segments in the cache
24s of duplicate part data
The overall cache duplication is 24/300  = 8%

After 30mins (1800s) of streaming, we have
1800s of media segments in the cache
24s of duplicate part data
The overall cache duplication is 24/1800  = 1.3%

So streams with realistic durations in the minutes actually have quite a low percentage of duplicate data, as long as the CDN is aggressive about cache eviction and the origin does a good job in setting cache-control headers.
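
The three cases above can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the text (4s target duration, parts cached for 6 target durations, segments never evicted over the period considered); the `min` clamp for streams shorter than 24s is an added assumption, and the function name is hypothetical.

```python
def duplication_percent(stream_seconds, target_duration=4, cached_target_durations=6):
    """Duplicate part data as a percentage of the segment data held in cache."""
    part_window = target_duration * cached_target_durations  # 24s for 4s segments
    duplicate_seconds = min(stream_seconds, part_window)  # assumption for short streams
    return 100 * duplicate_seconds / stream_seconds


for seconds in (40, 300, 1800):
    print(seconds, round(duplication_percent(seconds), 1))
# 40 60.0
# 300 8.0
# 1800 1.3
```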

Thirdly, at Akamai we would have a complex time in implementing the edge stitching as proposed and the same may be true for other CDNs. The reason is that while the origin header information gets written into our cache store entry table, the store tables are architected to very efficiently tell you if an object exists and if so, to return it. They are not databases optimized for horizontal searching. We cannot search across the cache, for example asking for all objects whose X-HLS-Part-Root-Segment header matches a certain string. It would be very difficult to implement the edge stitch proposed here. We would need to externalize the header information into some sort of parallel database which we could query. While we have such structures (via EdgeWorkers and EdgeKV), their use would raise the cost and complexity of delivering LL-HLS. At that point we would probably choose to suffer the low duplication rates and instead focus on efficiently evicting parts.

Fourthly, if the community opinion is to still proceed with this edge-stitch plan, then I would offer the following suggestions:
1.       To avoid header bloat, the sequence and offset headers could be collapsed into a single header, for example HLS-Part-Info:<current-part>,<total-part-count>,<byte-offset>. This would look like HLS-Part-Info:2,8,623751. Due to HPACK and QPACK header compression, we would not want to place the root-segment in the same bundle, as it will be invariant over the parts from the same segment and hence can be compressed more efficiently if it is separate.
2.       The IETF strongly discourages the use of X- in header prefixes <https://tools.ietf.org/html/rfc6648>. A simple header name such as ‘HLS-Part-Info’ would be preferable.
3.       Do you need both byte offset and sequence? Once you know the sequence, you can read the byte-lengths from the individually stored parts.
4.       Segments get truncated without warning, often for ad insertion discontinuities and always at the end when the encoder is turned off.  Say you are making 8 parts per 4s segment and have sent off the first two parts to the CDN before reaching a sudden discontinuity. You have labelled these as 1/8 and 2/8 respectively. Since 3-8 are never produced, the edge routine would waste some time and resources looking for 3-8, before giving up and going to the origin to fetch the segment. Performance – especially TTFB and apparent throughput – would suffer.
5.       You may have an edge server which is only serving legacy clients pulling segments and no low-latency clients seeding the edge with part requests. In this case, the edge would waste time searching for constituent parts before giving up and going to the origin to fetch the segment. Performance again would suffer.
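
For reference, a combined header value of the form suggested in point 1 could be parsed along these lines. This is a minimal Python sketch; `PartInfo` and `parse_part_info` are hypothetical names, and the validation rules (1-based part index, non-negative offset) are assumptions rather than anything specified in this thread.

```python
from typing import NamedTuple


class PartInfo(NamedTuple):
    current_part: int
    total_parts: int
    byte_offset: int


def parse_part_info(value):
    """Parse "<current-part>,<total-part-count>,<byte-offset>", e.g. "2,8,623751"."""
    fields = [field.strip() for field in value.split(",")]
    if len(fields) != 3:
        raise ValueError("expected three comma-separated fields")
    current, total, offset = (int(field) for field in fields)
    if not 1 <= current <= total or offset < 0:
        raise ValueError("part index or byte offset out of range")
    return PartInfo(current, total, offset)


print(parse_part_info("2,8,623751"))
```

As point 4 notes, the total part count can turn out to be wrong when a segment is truncated, so a consumer would need a fallback to the origin rather than waiting on parts that never arrive.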

I appreciate Limelight raising these issues and look forward to debating a mutually efficient solution which benefits content distributors, CDNs and players.

Have a good long weekend!

Cheers
Will


--------------------------------------------------------
Chief Architect – Edge Technology Group
Akamai Technologies
San Francisco
Cell: +1.415.420.0881







From: Roger Pantos <rpantos=40apple.com@dmarc.ietf.org>
Date: Friday, February 12, 2021 at 10:40 AM
To: Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org>
Cc: "hls-interest@ietf.org" <hls-interest@ietf.org>
Subject: Re: [Hls-interest] LL-HLS Amendment Proposal: Optional part response headers for CDN efficiency.



On Feb 9, 2021, at 7:54 AM, Andrew Crowe <acrowe=40llnw.com@dmarc.ietf.org> wrote:

Hello,

CMAF content packaged and delivered using LL-DASH and range-based LL-HLS are easily managed as duplicate content by CDNs as they specify only segment files. In fact, once the segment is complete, it can then be served out of CDN cache for players that are not Low Latency capable - effectively reducing latency for them as well. Part-based LL-HLS introduces individually named part files that then collapse to the separately named segment file upon final part completion. This then means that on first request for the whole collapsed segment file the CDN will have to go back to origin to request bytes that it likely already has in the individually named part files. CDNs can improve cache efficiency, origin hit rate, and whole segment delivery times with a little bit of additional information from origin.


On request for a named part file an origin may provide a set of response headers:

*X-HLS-Part-Sequence*
A multi value header that represents the current part sequence (index=1) and the total number of parts for the segment. The values will be separated by a forward slash ("/"). For example a 2 second segment with 8 parts per segment will respond to the 2nd part request (vid720_segment_1521.part2.m4v) like
X-HLS-Part-Sequence: 2/8


*X-HLS-Part-Offset*
A single value header that represents the byte offset of the part in the segment. The first part of a segment will always be 0 while, for example, the second .25s part of a 2 Mbps stream (vid720_segment_1521.part2.m4v) may have a value like 623751.


*X-HLS-Part-Root-Segment*
A single value header that provides the name of the root segment of the current part. This lets the CDN/proxy know which root file to concatenate the parts into. vid720_segment_1521.part2.m4v would have a value of vid720_segment_1521.m4v


With the information from these three headers the CDN can recognize the individually named part files as ranges of a larger file, store them effectively and deliver a better experience to viewers across all formats.
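
As an illustration of how an origin might populate these three headers, here is a minimal Python sketch. It assumes the origin already knows every part size for the segment, which will not hold while the segment is still being produced; `part_headers` is a hypothetical helper and the part sizes in the usage line are made up.

```python
def part_headers(root_segment, part_sizes):
    """Build the three proposed response headers for each part of one segment.

    part_sizes lists the byte length of each part in order. Offsets are the
    running sum of the preceding part sizes, so the first part is always 0.
    """
    headers = []
    offset = 0
    total = len(part_sizes)
    for index, size in enumerate(part_sizes, start=1):
        headers.append({
            "X-HLS-Part-Sequence": f"{index}/{total}",
            "X-HLS-Part-Offset": str(offset),
            "X-HLS-Part-Root-Segment": root_segment,
        })
        offset += size
    return headers


# Illustrative sizes only: a first part of 623751 bytes puts part 2 at offset 623751.
for header_set in part_headers("vid720_segment_1521.m4v", [623751, 600000]):
    print(header_set)
```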

Hello Andrew. I’m interested in this proposal, but I’d also like to hear some feedback from others in the CDN and packager spaces. Specifically, I’d like to know if other folks:

- Agree that it’s a good way to solve the problem

- Can spot any problems or limitations in this proposal that might make it difficult to produce (or consume) these headers

- Can see themselves implementing it


thanks,

Roger Pantos
Apple Inc.


Regards,
-Andrew
--
Andrew Crowe Architect
EXPERIENCE FIRST.
+1 859 583 3301
www.limelight.com


--
Hls-interest mailing list
Hls-interest@ietf.org
https://www.ietf.org/mailman/listinfo/hls-interest
