[AVTCORE] MEXaction2 action detection and localization dataset available
Bernard Merialdo <firstname.lastname@example.org> Thu, 10 September 2015 14:36 UTC
From: Bernard Merialdo <firstname.lastname@example.org>
Date: Thu, 10 Sep 2015 16:29:08 +0200
Subject: [AVTCORE] MEXaction2 action detection and localization dataset available
(On behalf of Michel Crucianu and Jenny Benois-Pineau)

We are happy to make public the MEXaction2 action detection and localization dataset. A detailed description, including video samples, the evaluation procedure and baseline results, is available at http://mexculture.cnam.fr/xwiki/bin/view/Datasets/Mex+action+dataset

A brief description follows.

The aim of the MEXaction2 dataset is to support the development and evaluation of methods for 'spotting' instances of short actions in a relatively large video database. For each action class, such a method should detect instances of this class in the video database and output the temporal boundaries of these detections, with an associated 'confidence' score. This task can also be seen as 'action retrieval': the 'query' is an action class and the results are instances of the class, ordered by decreasing 'confidence' score.

The dataset contains videos from three sources:

1. INA videos. A large collection of 117 videos (for a total of 77 hours) was extracted from the archives of the Institut National de l'Audiovisuel (France). It contains videos produced between 1945 and 2011. The video content in this collection was divided into three parts: training, parameter validation and testing.

2. YouTube clips. From additional videos collected from YouTube, we provide 588 short clips, each containing only one instance of an action to spot (everything within the boundaries of a clip belongs to an action instance). All these instances should be used only for training.

3. UCF101 Horse Riding clips. We add the Horse Riding clips from the UCF101 dataset as training instances for the HorseRiding class. All these instances should be used only for training.

There are two annotated actions (see the above-mentioned website for examples):

1. BullChargeCape: in the context of a bull fight, the bull charges the matador, who dangles a cape to distract the animal.

2. HorseRiding: instances of one or several persons riding horses.

The numbers of annotated examples for the two actions are:

BullChargeCape  1324
HorseRiding      651

Besides the fact that the total amount of annotated video is relatively large compared to other existing datasets, this dataset is also interesting because it raises several difficulties:

1. High imbalance between non-relevant video sequences and relevant ones (instances of an action of interest).

2. High variability in point of view and background movement (and in action duration for HorseRiding).

3. Variability in image quality: old videos have lower resolution and are in black and white, while the newest ones are in HD.

Action detection and localization is evaluated as a retrieval problem: the system must produce a list of detections (temporal boundaries) with positive scores. Sorting these results by decreasing score yields precision/recall curves and allows the Average Precision (AP) to be computed in order to characterize the detection performance.
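As a rough illustration of the retrieval-style evaluation described above, here is a minimal Python sketch that ranks detections by decreasing score and computes the Average Precision for one action class. It is not the official evaluation procedure, which is described on the dataset website. The matching rule (a detection counts as correct if its temporal intersection-over-union with a not-yet-matched annotation reaches `iou_threshold`), the default threshold of 0.5, the function names, and the use of a single shared timeline in seconds are all assumptions made for the example.

```python
from typing import List, Tuple

Segment = Tuple[float, float]            # (start, end) in seconds, assumed single timeline
Detection = Tuple[float, float, float]   # (start, end, confidence score)


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections: List[Detection],
                      ground_truth: List[Segment],
                      iou_threshold: float = 0.5) -> float:
    """AP for one action class (hypothetical matching rule, see lead-in)."""
    if not ground_truth:
        return 0.0
    # Rank detections by decreasing confidence score.
    ranked = sorted(detections, key=lambda d: d[2], reverse=True)
    matched = [False] * len(ground_truth)
    true_positives = 0
    precision_sum = 0.0
    for rank, (start, end, _score) in enumerate(ranked, start=1):
        # Find the best-overlapping annotation that is still unmatched.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if matched[idx]:
                continue
            iou = temporal_iou((start, end), gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched[best_idx] = True
            true_positives += 1
            # Accumulate precision at each rank where a correct detection occurs.
            precision_sum += true_positives / rank
    # Annotations that are never retrieved contribute zero precision.
    return precision_sum / len(ground_truth)


if __name__ == "__main__":
    annotations = [(0.0, 10.0), (20.0, 30.0)]
    results = [(0.0, 10.0, 0.9), (50.0, 60.0, 0.8), (21.0, 30.0, 0.7)]
    print(average_precision(results, annotations))
```

In this toy run the first and third detections match an annotation and the second is a false alarm, so precision is 1 at rank 1 and 2/3 at rank 3. A real evaluation would also need to key segments by video, which this sketch leaves out.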