I-D Action: draft-perlert-wg-00.txt

internet-drafts@ietf.org Thu, 13 August 2020 18:59 UTC

Return-Path: <internet-drafts@ietf.org>
X-Original-To: i-d-announce@ietf.org
Delivered-To: i-d-announce@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 9A8FB3A1093 for <i-d-announce@ietf.org>; Thu, 13 Aug 2020 11:59:46 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: internet-drafts@ietf.org
To: i-d-announce@ietf.org
Subject: I-D Action: draft-perlert-wg-00.txt
X-Test-IDTracker: no
X-IETF-IDTracker: 7.14.0
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <159734518651.20335.15608200648739085687@ietfa.amsl.com>
Date: Thu, 13 Aug 2020 11:59:46 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/i-d-announce/EeXJ2JHjKvRJiASa95-eFT4pyXA>

A New Internet-Draft is available from the on-line Internet-Drafts directories.


        Title           : Protocol for Evaluating Reinforcement Learning Environments in Real Time
        Author          : Ruben Montero
        Filename        : draft-perlert-wg-00.txt
        Pages           : 10
        Date            : 2020-08-13

Abstract:
   This document defines a simple UDP protocol for communication between
   a server simulating a reinforcement learning environment and a client
   that observes it and responds with actions.

   Reinforcement learning problems are usually defined within the scope
   of a Markov Decision Process (MDP), in which an agent sends an action
   belonging to an action space to an environment.  The environment acts
   as a black box, returning an observation and a reward to the agent,
   whose goal is to maximize the total reward obtained.
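
   The agent-environment loop described above can be sketched in a few
   lines of Python.  This is only an illustration: the CoinFlipEnv
   class, its step() method and the run_episode() helper are
   hypothetical names chosen for this example and do not come from the
   draft.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses a coin flip."""

    action_space = (0, 1)

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def step(self, action):
        # The environment is a black box to the agent: it only
        # returns an observation and a reward.
        coin = self.rng.choice(self.action_space)
        reward = 1.0 if action == coin else 0.0
        return coin, reward


def run_episode(env, policy, steps):
    """Run the action -> (observation, reward) loop and sum the rewards."""
    total = 0.0
    observation = None  # nothing has been observed before the first action
    for _ in range(steps):
        action = policy(observation)
        observation, reward = env.step(action)
        total += reward
    return total
```

   A policy here is any function mapping the last observation to an
   action, for example run_episode(CoinFlipEnv(), lambda obs: 0, 20).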

   Although the problem statement is easy to understand, there are no
   conventions for how a reinforcement learning simulation should
   communicate with a client agent, either in a local network or over
   the Internet.  Establishing such a convention can be especially
   useful for multiagent support and analysis.

   The PERLERT protocol defined in this document assumes that the server
   and client have shared certain information beforehand through another
   channel, such as a web page served over HTTP.  For example, the
   client must know a port number and an instance number before
   participating in a simulation run on a server.
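
   As a rough sketch of what a client might do once it knows the port
   and instance number, the Python below sends one action in a single
   UDP datagram.  The datagram layout (a 32-bit instance number followed
   by an 8-bit action) and the function names are assumptions made for
   this example; they are NOT the PERLERT wire format, which is
   specified in the draft itself.

```python
import socket
import struct

# Hypothetical layout, NOT the PERLERT wire format: a network-order
# unsigned 32-bit instance number followed by an unsigned 8-bit action.
ACTION_FORMAT = "!IB"


def encode_action(instance, action):
    """Pack the instance number and action into a 5-byte datagram."""
    return struct.pack(ACTION_FORMAT, instance, action)


def decode_action(datagram):
    """Unpack a datagram into an (instance, action) tuple."""
    return struct.unpack(ACTION_FORMAT, datagram)


def send_action(host, port, instance, action):
    """Send one action; UDP gives no delivery guarantee or reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_action(instance, action), (host, port))
```

   The host, port and instance values passed to send_action() stand in
   for the information the client is assumed to have obtained
   beforehand.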

   Although full feedback from the environment is often desirable,
   PERLERT focuses on real-time interaction in which human agents can
   interact with AI agents, even if that means some information is lost
   to network packet loss.


The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-perlert-wg/

There are also htmlized versions available at:
https://tools.ietf.org/html/draft-perlert-wg-00
https://datatracker.ietf.org/doc/html/draft-perlert-wg-00


Please note that it may take a couple of minutes from the time of submission
until the htmlized version and diff are available at tools.ietf.org.

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/