<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">

<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>

</head>

<body dir="ltr">

<b>Call for Participants - ACRV Robotic Vision Scene Understanding Challenge</b>

<div dir="ltr">

<div><b>==================================================</b></div>

<div><br>

</div>

<div>Dear Researchers,</div>

<div><br>

</div>

<div>This is a call for participants for the latest ACRV robotic vision challenge on scene understanding.</div>

<div><br>

</div>

<div>If you plan to participate, please contact the challenge organizers at contact@roboticvisionchallenge.org.<br>

</div>

<div><br>

</div>

<div><b>EvalAI Challenge Link:</b> <a href="https://evalai.cloudcv.org/web/challenges/challenge-page/625/overview">

https://evalai.cloudcv.org/web/challenges/challenge-page/625/overview</a></div>

<div><br>

</div>

<div><b>Challenge Overview Webpage:</b> <a href="https://nikosuenderhauf.github.io/roboticvisionchallenges/scene-understanding">

https://nikosuenderhauf.github.io/roboticvisionchallenges/scene-understanding</a><br>

</div>


<br>

<div><b>Deadline:</b> October 2nd<br>

</div>

<div><br>

</div>

<div><b>Prizes:</b> One Titan RTX and up to five Jetson Nano GPUs, provided by NVIDIA, for two winning teams, plus $5,000 USD, provided by the ACRV, to be divided among high-performing competitors<br>

</div>

<div><br>

</div>

<div><br>

</div>

<div><b>Challenge Overview</b></div>

<div><b>-----------------------</b></div>

<div>

<p style="margin-top: 0px; margin-bottom: 0px;">The Robotic Vision Scene Understanding Challenge evaluates how well a robotic vision system can understand the semantic and geometric aspects of its environment. The challenge consists of two distinct tasks:

<b>Object-based Semantic SLAM</b> and <b>Scene Change Detection</b>.</p>

<p style="margin-top: 0px; margin-bottom: 0px;">Key features of this challenge include:</p>

<ul>

<li>BenchBot, a complete software stack for running semantic scene understanding algorithms.</li><li>Running algorithms in realistic 3D simulation, and on real robots, with only a few lines of Python code.</li><li>Tiered difficulty levels that allow for ease of entry to robotic vision with embodied agents and enable ablation studies.<br>

</li><li>The BenchBot API, which allows simple interfacing with robots and supports both OpenAI Gym-style approaches and a simple object-oriented Agent approach.</li><li>Easy-to-use scripts for running simulated environments, executing code on a simulated robot, evaluating semantic scene understanding results, and automating code execution across multiple environments.</li><li>Opportunities for the best teams to execute their code on a real robot in our lab, which uses the same API as the simulated robot.</li><li>Use of the NVIDIA Isaac SDK for interfacing with, and simulation of, high-fidelity 3D environments.</li></ul>

<div><b><br>

</b></div>

<div><b>Object-based Semantic SLAM: </b>Participants use a robot to traverse the environment, building up an object-based semantic map from the robot’s RGBD sensor observations and odometry measurements.

<br>

</div>

<div><br>

</div>

<div><b>Scene Change Detection:</b> Participants use a robot to traverse an environment, building up a semantic understanding of the scene. The robot is then moved to a new start position in the same environment, but with different conditions.

 Along with a possible change from day to night, the new scene has a number of objects added and/or removed. Participants must produce an object-based semantic map describing the changes between the two scenes.</div>

<div><br>

</div>

<div><b>Difficulty Levels:</b> We provide three difficulty levels of increasing complexity and similarity to true active robotic vision systems. At the simplest level, the robot moves to pre-defined poses to collect data and provides ground-truth

 poses, removing the need for active exploration and localization. The second level requires active exploration and robot control but still provides ground-truth poses, removing the need for localization. The final level is the same as the second but provides

 only noisy odometry information, so the system must perform localization itself.

<br>

</div>

<div><br>

</div>

<div><b>Information Videos</b></div>

<div><b>-----------------------<br>

</b><a href="https://youtu.be/jQPkV29KFvI">https://youtu.be/jQPkV29KFvI</a><br>


<a href="https://youtu.be/LNEvhpWerJQ">https://youtu.be/LNEvhpWerJQ</a><br>

</div>

<div><br>

</div>

<div><b>Contact Details</b></div>

<div><b>------------------</b></div>

<div><b>E-mail: </b>contact@roboticvisionchallenge.org</div>

<div><b>Webpage:</b> <a href="https://roboticvisionchallenge.org">https://roboticvisionchallenge.org</a><br>

</div>

<div><b>Slack:</b> <a href="https://tinyurl.com/rvcslack">https://tinyurl.com/rvcslack</a></div>

<div><b>Twitter:</b> @robVisChallenge <br>

</div>

</div>

</div>

</body>

</html>