I-D Action: draft-perlert-wg-00.txt
internet-drafts@ietf.org Thu, 13 August 2020 18:59 UTC
Return-Path: <internet-drafts@ietf.org>
X-Original-To: i-d-announce@ietf.org
Delivered-To: i-d-announce@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 9A8FB3A1093 for <i-d-announce@ietf.org>; Thu, 13 Aug 2020 11:59:46 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: internet-drafts@ietf.org
To: i-d-announce@ietf.org
Subject: I-D Action: draft-perlert-wg-00.txt
X-Test-IDTracker: no
X-IETF-IDTracker: 7.14.0
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <159734518651.20335.15608200648739085687@ietfa.amsl.com>
Date: Thu, 13 Aug 2020 11:59:46 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/i-d-announce/EeXJ2JHjKvRJiASa95-eFT4pyXA>
X-BeenThere: i-d-announce@ietf.org
X-Mailman-Version: 2.1.29
List-Id: Internet Draft Announcements only <i-d-announce.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/i-d-announce>, <mailto:i-d-announce-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/i-d-announce/>
List-Post: <mailto:i-d-announce@ietf.org>
List-Help: <mailto:i-d-announce-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/i-d-announce>, <mailto:i-d-announce-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Aug 2020 18:59:52 -0000
A New Internet-Draft is available from the on-line Internet-Drafts directories.

        Title           : Protocol for Evaluating Reinforcement Learning Environments in Real Time
        Author          : Ruben Montero
        Filename        : draft-perlert-wg-00.txt
        Pages           : 10
        Date            : 2020-08-13

Abstract:
   This document defines a simple UDP protocol for communication between
   a server simulating a reinforcement learning environment and a client
   that observes it and responds with actions.

   Reinforcement learning problems are usually defined within the scope
   of a Markov Decision Process (MDP), where an agent sends an action
   belonging to an action space to an environment. The environment acts
   as a black box, returning an observation and a reward to the agent,
   whose goal is to maximize the total reward obtained.

   Although the problem statement is easy to understand, there are no
   conventions on how a reinforcement learning simulation should
   communicate with a client agent, either in a local network or over
   the Internet. Settling this can be especially useful for multiagent
   support and analysis.

   The PERLERT protocol defined in this document assumes that server and
   client have shared certain information beforehand through another
   channel, such as a web page served over HTTP. For example, the client
   must know a port number and an instance number before participating
   in a simulation run on a server. Also, although it is often desirable
   to receive the full feedback from the environment, PERLERT focuses on
   real-time interaction, where human agents can interact with AI
   agents, even if that means that information can be lost due to
   network packet loss.
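
The agent/environment exchange described in the abstract can be sketched
roughly as follows. This is only an illustration, not the PERLERT wire
format: the payload layouts (ACTION_FMT, FEEDBACK_FMT), the toy
CounterEnvironment, and the run_episode helper are all assumptions made
for this example, and the datagrams are passed in memory rather than over
a real UDP socket so that packet loss can be simulated deterministically.

```python
import struct

# Hypothetical payload layouts, NOT taken from the PERLERT draft.
ACTION_FMT = "!IIi"     # instance number, step counter, action
FEEDBACK_FMT = "!IIif"  # instance number, step counter, observation, reward


class CounterEnvironment:
    """Toy environment: the state is an integer and the goal is to sit on TARGET."""

    TARGET = 3

    def __init__(self, instance):
        self.instance = instance
        self.state = 0

    def handle(self, datagram):
        """Apply one action datagram and return the feedback datagram."""
        instance, step, action = struct.unpack(ACTION_FMT, datagram)
        if instance != self.instance:
            return None  # datagram addressed to another simulation instance
        self.state += action
        reward = 1.0 if self.state == self.TARGET else 0.0
        return struct.pack(FEEDBACK_FMT, instance, step, self.state, reward)


def run_episode(env, instance, steps, dropped=frozenset()):
    """Drive the agent loop; steps listed in `dropped` simulate lost feedback."""
    total_reward = 0.0
    last_observation = 0
    for step in range(steps):
        # Trivial policy: move toward the target using the last observation seen.
        action = 1 if last_observation < env.TARGET else 0
        reply = env.handle(struct.pack(ACTION_FMT, instance, step, action))
        if reply is None or step in dropped:
            continue  # feedback lost: keep acting on stale information
        _, _, last_observation, reward = struct.unpack(FEEDBACK_FMT, reply)
        total_reward += reward
    return total_reward
```

With these assumptions, run_episode(CounterEnvironment(7), 7, 5) returns
3.0, while the same call with dropped={2} returns 0.0: the agent never
sees that it reached the target, overshoots it, and collects no reward.
That is the trade-off the abstract accepts in exchange for real-time
interaction.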

The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-perlert-wg/

There are also htmlized versions available at:
https://tools.ietf.org/html/draft-perlert-wg-00
https://datatracker.ietf.org/doc/html/draft-perlert-wg-00

Please note that it may take a couple of minutes from the time of submission
until the htmlized version and diff are available at tools.ietf.org.

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/