[tsvwg] Re: Question on TSVWG Scope and Feedback on draft-shenzhihong-dacp-00
"xjzhu@cnic.cn" <xjzhu@cnic.cn> Mon, 13 October 2025 06:37 UTC
Date: Mon, 13 Oct 2025 14:36:57 +0800
From: "xjzhu@cnic.cn" <xjzhu@cnic.cn>
To: Sophie Harper <sphpr=40proton.me@dmarc.ietf.org>
References: <ZxyT8wsHlX2yXCHQBeFFUgjBAHs6gsX6-YN8c1a_-n6PDp07lIJQvpnamsT6aN1Q9CjBMOOBn3aiJfIp83K8VRP7nKrNYTtKV1xfHc_6qPY=@proton.me>
Message-ID: <202510131436559939285@cnic.cn>
CC: tsvwg <tsvwg@ietf.org>, bluejoe <bluejoe@cnic.cn>, zjcheng <zjcheng@cnic.cn>
Subject: [tsvwg] Re: Question on TSVWG Scope and Feedback on draft-shenzhihong-dacp-00
List-Id: Transport Area Working Group <tsvwg.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/zMz6dhnB3gDGMMjNHr8SKD2rjMQ>
Hi Sophie,

Thank you for your thorough and insightful feedback on the DACP draft. We're very encouraged by your positive comments and appreciate your support for discussing it within the TSVWG.

Here are our thoughts on your excellent points, all of which will help improve the next version of the draft.

A. Unstructured Data Framing

blob Column: Your understanding is correct. The blob column is a pass-through for data that isn't easily structured (like images), and it relies on lazy loading: the content is only streamed upon explicit request. To your point, making the name mandatory is too restrictive. In our revision, we will make the inclusion of a binary content column OPTIONAL.

FileListSchema: You are absolutely right. We will define a standard, baseline FileListSchema in the next draft to ensure maximum compatibility.

B. RPC Mapping & DoS Concerns

This is an important security consideration. To mitigate the DoS risk from long transformation chains, we will update the "Security Considerations" section to detail required server-side mitigations, such as enforcing a maximum transformation chain length and per-user resource quotas.

C. Provenance and Integrity

This is an excellent suggestion. To strengthen the audit trail, we will specify an OPTIONAL mechanism for integrity, based on your proposal of using cryptographic chaining or signatures.

Thank you again for this high-quality feedback. Your suggestions are invaluable as we work towards the -01 version.

Best regards,
Xiaojie Zhu (on behalf of the authors)
xjzhu@cnic.cn

From: Sophie Harper
Date: 2025-10-11 23:40
To: xjzhu@cnic.cn
CC: tsvwg; bluejoe; zjcheng
Subject: [tsvwg] Re: Question on TSVWG Scope and Feedback on draft-shenzhihong-dacp-00

Dear Authors and TSVWG Chairs/Members,

I am writing as an individual interested in high-performance data protocols to offer some initial feedback on draft-shenzhihong-dacp-00. We highly appreciate the effort in proposing the Data Access and Collaboration Protocol (DACP).
The protocol is an elegant solution to the challenges of modern distributed computing by providing "secure, high-performance, and auditable streaming of data across distributed systems." The choice to build DACP "directly on top of Apache Arrow Flight" is excellent, immediately leveraging columnar data transfer and minimizing "high serialization overhead." The introduction of the Streaming DataFrame (SDF) as a "standardized data unit" with Immutability and Deferred Execution principles is a powerful, high-level abstraction that will greatly benefit the scientific community.

Regarding your primary question on TSVWG scope, while DACP's core functionality is largely Application Layer (defining the SDF model, operations, and provenance), its entire existence is driven by the need for "low-latency data access" and "high-performance stream-framed data transport." Therefore, I believe the TSVWG would be the appropriate group to review the transport behavior and ensure robust performance and interaction with underlying transport protocols.

Here are a few specific points for consideration:

A. Clarity on Unstructured Data Framing

Section 7.1.2 defines the "Framing Files and File Collections," which maps a list of files into an SDF where the file's raw content is in a "special column, typically named blob and of type Binary." To ensure interoperability, it would be valuable to clarify:

1. Is the column name blob RECOMMENDED or REQUIRED?
2. Should the schema for this specific File List SDF (referred to as a FileListSchema) be standardized within the protocol for maximum client/server compatibility?

B. Implications of RPC Mapping

Section 5.3 states that Transformation operations "MUST NOT trigger network requests," while Actions trigger server-side execution via DoGet, DoAction, or GetFlightInfo. This clear separation is crucial for optimization.
Could the authors expand on the security or resource allocation implications if a malicious client constructs an extremely long Transformation Chain before executing an Action? Does the server impose any limit on the complexity of the accumulated actions array to prevent Denial of Service (DoS) attacks on compute resources?

C. Provenance and Integrity

Section 9 mentions that the integrity of the provenance trail relies on the security of each hop. Given that "Each intermediate node MUST append its own entry to the provenance trail," have the authors considered a lightweight mechanism (e.g., cryptographic chaining or signature) to allow the final server or client to verify the non-tampering of the intermediate HopInfo entries by untrusted proxies? This could enhance the auditable nature of the trail.

Thank you again for this excellent draft. I look forward to its progression.

Best regards,
Sophie Harper

On Friday, 10 October 2025 at 4:24 PM, xjzhu@cnic.cn <xjzhu@cnic.cn> wrote:

Dear TSVWG Chairs and Members,

We are writing to propose our new Internet-Draft, DACP (Data Access and Collaboration Protocol), for consideration and potential adoption by the Transport Area Working Group (TSVWG).

The Internet-Draft is available here: https://datatracker.ietf.org/doc/draft-shenzhihong-dacp/

What is DACP?

DACP is a communication protocol designed to support cross-node, cross-process data access, primarily for scientific and distributed computing environments.

Modern data processing, particularly in scientific and distributed computing, requires unified, low-latency data access across diverse nodes and processes. However, the fragmented and heterogeneous nature of scientific data currently hinders effective data sharing and collaboration. To address this, DACP introduces a high-performance solution built upon Apache Arrow Flight, creating a zero-copy, columnar streaming framework that unifies access to both structured and unstructured data.
DACP’s goal is to provide a secure, auditable, and low-latency data channel that finally enables the efficient sharing and collaborative processing of scientific data.

Our primary question for the group is whether the problem that DACP addresses falls within the scope and interest of the TSVWG. We would be very grateful for your expert opinion on this. If you feel this work is not a good fit, any suggestion for a more appropriate working group would be highly appreciated.

Thank you for your time and expertise.

Best regards,
Xiaojie Zhu (on behalf of the authors)
xjzhu@cnic.cn
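
Illustrative note on point C (Provenance and Integrity): the "cryptographic chaining" idea discussed in the thread could look roughly like the sketch below. This is not taken from the draft. The hop fields, the all-zero starting value, the JSON canonicalization, and the function names are all assumptions made for illustration. Each entry's digest covers the hop data plus the previous digest, so altering an earlier entry invalidates every later digest.

```python
import hashlib
import json

# Hypothetical starting value for the first hop; the draft does not define one.
GENESIS = "0" * 64


def entry_digest(prev_digest: str, hop: dict) -> str:
    """Hash a hop entry together with the digest of the entry before it."""
    # Sorted keys and fixed separators give a stable byte encoding (an assumption).
    payload = json.dumps(hop, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_digest + payload).encode("utf-8")).hexdigest()


def append_hop(trail: list, hop: dict) -> list:
    """Return a new trail with this hop appended and chained to the last digest."""
    prev = trail[-1]["digest"] if trail else GENESIS
    return trail + [{"hop": hop, "digest": entry_digest(prev, hop)}]


def verify_trail(trail: list) -> bool:
    """Recompute every digest in order; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in trail:
        if entry["digest"] != entry_digest(prev, entry["hop"]):
            return False
        prev = entry["digest"]
    return True


# Usage: two hops are appended, then the first one is altered in place.
trail = append_hop([], {"node": "proxy-a"})
trail = append_hop(trail, {"node": "proxy-b"})
print(verify_trail(trail))
trail[0]["hop"]["node"] = "someone-else"
print(verify_trail(trail))
```

A plain hash chain like this only detects edits by a party that does not also recompute the later digests. Guarding against an untrusted proxy that rewrites the whole trail would need per-hop signatures or keyed MACs, which is the other option mentioned in the thread.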