-
Additional Baseline Metrics for the paper "Extended YouTube Faces: a Dataset for ...
In this report, we provide additional and corrected results for the paper "Extended YouTube Faces: a Dataset for Heterogeneous Open-Set Face Identification". After further investigations, we discovered and corrected wrongly labe ...
-
Dense Depth Estimation of a Complex Dynamic Scene without Explicit 3D Motion Est ...
Recent geometric methods need reliable estimates of 3D motion parameters to procure an accurate dense depth map of a complex dynamic scene from monocular images \cite{kumar2017monocular, ranftl2016dense}. Generally, to estimate \te ...
-
Exploring Explicit Domain Supervision for Latent Space Disentanglement in Unpair ...
Image-to-image translation tasks have been widely investigated with Generative Adversarial Networks (GANs). However, existing approaches are mostly designed in an unsupervised manner while little attention has been paid to domai ...
-
Pornographic Image Recognition via Weighted Multiple Instance Learning
In the era of the Internet, recognizing pornographic images is of great significance for protecting children's physical and mental health. However, this task is very challenging as the key pornographic contents (e.g., breast and pri ...
-
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More G ...
Many vision and language models suffer from poor visual grounding - often falling back on easy-to-learn language priors rather than basing their decisions on visual concepts in the image. In this work, we propose a generic appro ...
-
Peeking into the Future: Predicting Future Person Activities and Locations in Vi ...
Deciphering human behaviors in videos to predict their future paths/trajectories and what they would do is important in many applications. Motivated by this idea, this paper studies predicting a pedestrian's future path jointl ...
-
Visual SLAM: Why Bundle Adjust?
Bundle adjustment plays a vital role in feature-based monocular SLAM. In many modern SLAM pipelines, bundle adjustment is performed to estimate the 6DOF camera trajectory and 3D map (3D point cloud) from the input feature tracks ...
-
UcoSLAM: Simultaneous Localization and Mapping by Fusion of KeyPoints and Square ...
This paper proposes a novel approach for Simultaneous Localization and Mapping by fusing natural and artificial landmarks. Most of the SLAM approaches use natural landmarks (such as keypoints). However, they are unstable over ti ...
-
Towards Segmenting Anything That Moves
Detecting and segmenting individual objects, regardless of their category, is crucial for many applications such as action detection or robotic interaction. While this problem has been well-studied under the classic formulation ...
-
A Decoupled 3D Facial Shape Model by Adversarial Training
Data-driven generative 3D face models are used to compactly encode facial shape data into meaningful parametric representations. A desirable property of these models is their ability to effectively decouple natural sources of va ...