



The 1st workshop on Object Instance Detection

in conjunction with ACCV 2024, Hanoi, Vietnam.

Location: TBD
Time: Dec 9

Overview

Instance Detection (InsDet) is a fundamental problem with important applications in robotics and AR/VR. Unlike Object Detection (ObjDet), which aims to detect all objects belonging to predefined classes, InsDet requires detecting specific object instances defined by a few examples capturing each instance from multiple views. For example, in a daily scenario, when fetching a specific object instance (e.g., my-coffee-mug), a robot must detect it at a distance, distinguishing it from similar objects (e.g., other mugs or cups) in a cluttered scene for subsequent operations. An illustration of Object Detection vs. Instance Detection is shown below. We invite researchers to the Challenge Workshop on Object Instance Detection, where we investigate multiple directions through a competition to address the InsDet problem.

[Figure 1: Object Detection vs. Instance Detection]

Schedule (tentative)

13:45 - 14:00 Opening remarks
14:00 - 14:45 Keynote 1 Shu Kong
14:45 - 15:30 Keynote 2 Nuo Xu
15:30 - 15:45 Coffee break
15:45 - 16:00 Challenge Overview and Results TBD
16:00 - 16:50 Challenge Winner Talks TBD
16:50 - 17:00 Closing remarks

Invited Speakers

Shu Kong

UMacau, Texas A&M

Shu Kong is on the faculty of FST, University of Macau, and CSE, Texas A&M University. He leads the Computer Vision Lab. Before that, he spent two years as a postdoctoral researcher at the Robotics Institute, CMU. He received his PhD from UC Irvine. His research interests lie in computer vision, machine learning, and robotics, with a particular interest in visual perception and learning in an open world. He has published a number of papers addressing open-world problems and applying their solutions to interdisciplinary research. His paper on open-set recognition received an honorable mention for the Best Paper / Marr Prize at ICCV 2021. He was the lead organizer of the workshops on "Open-World Vision" at CVPR 2021-2024, and "Dealing with the Novelty in Open Worlds" at WACV 2022 and 2023.

Nuo Xu

Zhejiang Lab

Nuo Xu is an Assistant Researcher at Zhejiang Lab. Before that, he received his PhD from the National Laboratory of Pattern Recognition (NLPR) at the Institute of Automation, Chinese Academy of Sciences (CASIA). His research interests include active vision, embodied agents, computer vision, and reinforcement learning. He has published papers in leading publications such as CVPR, ICRA, and AAAI.

Challenge

This year, we plan to run a competition on our InsDet dataset, the first instance detection benchmark that is larger in scale and more challenging than existing InsDet datasets. Its major strengths over prior InsDet datasets are (1) high-resolution profile images of object instances together with high-resolution testing images from realistic indoor scenes, simulating real-world indoor robots locating and recognizing object instances at a distance in a cluttered scene, and (2) a realistic unified InsDet protocol to foster InsDet research.
  • A realistic unified InsDet protocol. In real-world indoor robotic applications, we consider the scenario in which assistive robots must locate and recognize instances to fetch them in a cluttered indoor scene. For a given object instance, the robot sees it from only a few views at the training stage, and must then accurately detect it at a distance in any scene at the testing stage.
  • InsDet in the closed world. InsDet has been explored in a closed-world setting, which allows access to profile images during model development. While one can exploit profile images to train models, it is still unknown what testing images will look like when encountered in the open world. Prevalent methods adopt a cut-paste-learn strategy [10], which cuts and pastes profile images onto random background photos (sampled in the open world) to generate synthetic training data, and uses such synthetic data to train a detector.
  • InsDet in the open world. The challenge of InsDet lies in its open-world nature: one has no knowledge of the data distribution at test time, which may include unknown testing scene imagery, unexpected scene clutter, and novel object instances specified only at testing. Prevalent methods exploit the open world by using foundation models and by pretraining InsDet models on diverse data.

We use EvalAI as the submission portal. Teams with the top-performing submissions will be invited to give short talks during the workshop.

Important Dates

    Data Instructions & Helper Scripts: September 10, 2024
    Dev Phase Start: September 10, 2024
    Submission Portal Start: September 10, 2024
    Test Phase Start: October 20, 2024
    Test Phase End: November 30, 2024
    Winner Notification: December 02, 2024

Organizers

Qianqian Shen
Zhejiang University
Haishuai Wang
Zhejiang University
Yanan Li
Zhejiang Lab
Yunhan Zhao
UC Irvine
Nahyun Kwon
Texas A&M
Kelu Yao
Zhejiang Lab