VIM


  • Two Papers are accepted by AAAI 2020

    Recently, two papers, “Joint Adversarial Learning for Domain Adaptation in Semantic Segmentation” from Yixin Zhang and “Progressive Boundary Refinement Network for Temporal Action Detection” from Qinying Liu, have been accepted to the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020).

    1. Joint Adversarial Learning for Domain Adaptation in Semantic Segmentation

    Abstract: Unsupervised domain adaptation in semantic segmentation exploits the pixel-level annotated samples in the source domain to aid the segmentation of unlabeled samples in the target domain. For such a task, the key is to learn domain-invariant representations, and adversarial learning is usually used: the discriminator distinguishes which domain the input comes from, and the segmentation model tries to deceive the domain discriminator. In this work, we first propose a novel joint adversarial learning (JAL) scheme to boost the domain discriminator in the output space by introducing information from the domain discriminator on low-level features. Consequently, the training of the high-level decoder is enhanced. We then propose a weight transfer module (WTM) to alleviate the inherent bias of the trained decoder towards the source domain. Specifically, WTM replaces the original decoder with a new decoder, which is learned only under the supervision of the adversarial loss and thus focuses mainly on reducing domain divergence. Extensive experiments on two widely used benchmarks show that our method brings considerable performance improvements over different baseline methods, which demonstrates its effectiveness for output-space adaptation.

    2. Progressive Boundary Refinement Network for Temporal Action Detection

    Abstract: Temporal action detection is a challenging task due to the vagueness of action boundaries. To tackle this issue, we propose an end-to-end progressive boundary refinement network (PBRNet) in this paper. PBRNet belongs to the family of on…
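    The following PyTorch snippet is a minimal sketch, not the authors' code, of the output-space adversarial training described in the first abstract: a domain discriminator classifies whether a softmax segmentation map comes from the source or the target domain, while the segmentation network is trained to fool it. The JAL coupling with low-level features and the weight transfer module are omitted, and all module names, sizes and loss weights are illustrative assumptions.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class TinySegNet(nn.Module):
            """Toy encoder-decoder standing in for the real segmentation model."""
            def __init__(self, num_classes=19):
                super().__init__()
                self.encoder = nn.Conv2d(3, 16, 3, padding=1)
                self.decoder = nn.Conv2d(16, num_classes, 1)

            def forward(self, x):
                return self.decoder(F.relu(self.encoder(x)))

        class OutputDiscriminator(nn.Module):
            """Predicts whether a softmax map comes from source (1) or target (0)."""
            def __init__(self, num_classes=19):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(num_classes, 32, 4, stride=2, padding=1),
                    nn.LeakyReLU(0.2),
                    nn.Conv2d(32, 1, 4, stride=2, padding=1),
                )

            def forward(self, p):
                return self.net(p)

        seg, disc = TinySegNet(), OutputDiscriminator()
        opt_seg = torch.optim.SGD(seg.parameters(), lr=2.5e-4, momentum=0.9)
        opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
        bce = nn.BCEWithLogitsLoss()

        src_img = torch.randn(2, 3, 64, 64)                 # labeled source batch
        src_lbl = torch.randint(0, 19, (2, 64, 64))
        tgt_img = torch.randn(2, 3, 64, 64)                 # unlabeled target batch

        # 1) segmentation loss on source, plus an adversarial loss that makes
        #    target outputs look source-like to the discriminator
        opt_seg.zero_grad()
        src_out, tgt_out = seg(src_img), seg(tgt_img)
        seg_loss = F.cross_entropy(src_out, src_lbl)
        d_tgt = disc(F.softmax(tgt_out, dim=1))
        adv_loss = bce(d_tgt, torch.ones_like(d_tgt))
        (seg_loss + 0.001 * adv_loss).backward()
        opt_seg.step()

        # 2) discriminator update on detached outputs from both domains
        opt_d.zero_grad()
        d_src = disc(F.softmax(src_out.detach(), dim=1))
        d_tgt = disc(F.softmax(tgt_out.detach(), dim=1))
        d_loss = bce(d_src, torch.ones_like(d_src)) + bce(d_tgt, torch.zeros_like(d_tgt))
        d_loss.backward()
        opt_d.step()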

  • Saihui Hou wins two grand awards

    First of all, we send our congratulations to Saihui Hou for winning the Dean’s Excellence Award of the Chinese Academy of Sciences and the Excellent Doctoral Thesis Award of USTC in fierce competition. Saihui Hou received his doctorate, and during his PhD he focused on several aspects of the image classification task. His achievements were rich, and he published papers in top computer vision conferences such as CVPR, ICCV and ECCV. Saihui Hou not only won these awards but also showed us what it takes to be an outstanding student. He has now started his journey at Watrix as a computer vision algorithm engineer. We hope that his work progresses smoothly and that he has a promising future.

  • Two Papers are accepted by CVPR 2019

    Recently, two papers, “Learning a Unified Classifier Incrementally via Rebalancing” from Saihui Hou and “Meta-SR: A Magnification-Arbitrary Network for Super-Resolution” from Xuecai Hu, have been accepted by the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019), one of the top conferences in computer vision.

    1. Learning a Unified Classifier Incrementally via Rebalancing

    Abstract: Conventionally, deep neural networks are trained offline, relying on a large dataset prepared in advance. This paradigm is often challenged in real-world applications, e.g. online services that involve continuous streams of incoming data. Recently, incremental learning has received increasing attention and is considered a promising solution to the practical challenges mentioned above. However, it has been observed that incremental learning is subject to a fundamental difficulty, catastrophic forgetting: adapting a model to new data often results in severe performance degradation on previous tasks or classes. Our study reveals that the imbalance between previous and new data is a crucial cause of this problem. In this work, we develop a new framework for incrementally learning a unified classifier, i.e. a classifier that treats both old and new classes uniformly. Specifically, we incorporate three components, cosine normalization, less-forget constraint, and inter-class separation, to mitigate the adverse effects of the imbalance. Experiments show that the proposed method can effectively rebalance the training process, thus obtaining superior performance compared to the existing methods. On CIFAR-100 and ImageNet, our method reduces the classification errors by more than 6% and 13% respectively, under the incremental setting of 10 phases.

    2. Meta-SR: A Magnification-Arbitrary Network for Super-Resolution

    Abstract: Recent research on super-resolution has achieved great success due to the development of deep convolutional neural networks (DCNNs). However, super-resoluti…
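    As an illustration of the cosine normalization component mentioned in the first abstract, here is a minimal PyTorch sketch of a cosine-normalized classification layer: logits are cosine similarities between L2-normalized features and L2-normalized class weights, scaled by a learnable factor, so old and new classes produce logits of comparable magnitude. This is not the paper's implementation, and the less-forget constraint and inter-class separation terms are omitted; the feature dimension and initial scale are assumptions.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class CosineClassifier(nn.Module):
            def __init__(self, in_features, num_classes, init_sigma=10.0):
                super().__init__()
                self.weight = nn.Parameter(torch.randn(num_classes, in_features) * 0.01)
                self.sigma = nn.Parameter(torch.tensor(init_sigma))  # learnable scale

            def forward(self, features):
                # cosine similarity between normalized features and class weights
                logits = F.linear(F.normalize(features, dim=1),
                                  F.normalize(self.weight, dim=1))
                return self.sigma * logits

        clf = CosineClassifier(in_features=512, num_classes=100)
        feats = torch.randn(4, 512)
        print(clf(feats).shape)   # torch.Size([4, 100])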

  • One Paper is accepted by AAAI 2019

    Recently, a new paper, “Weighted Channel Dropout for Regularization of Deep Convolutional Neural Network” from Saihui Hou, has been accepted to the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019).

    Abstract: In this work, we propose a novel method named Weighted Channel Dropout (WCD) for the regularization of deep Convolutional Neural Networks (CNNs). Different from Dropout, which randomly selects neurons to set to zero in the fully-connected layers, WCD operates on the channels in the stack of convolutional layers. Specifically, WCD consists of two steps, i.e., Rating Channels and Selecting Channels, and three modules, i.e., Global Average Pooling, Weighted Random Selection and Random Number Generator. It filters the channels according to their activation status and can be plugged in between any two consecutive layers, which unifies the original Dropout and Channel-Wise Dropout. WCD is totally parameter-free, is deployed only in the training phase, and adds only a very slight computation cost. The network at test time remains unchanged, so no inference cost is added at all. Besides, when combined with existing networks, it requires no re-pretraining on ImageNet and is thus well suited for application on small datasets. Finally, WCD with VGGNet-16, ResNet-101 and Inception-V3 is experimentally evaluated on multiple datasets. The extensive results demonstrate that WCD can bring consistent improvements over the baselines.
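    A rough PyTorch sketch of the two steps described above, Rating Channels by global average pooling and Selecting Channels by weighted random selection, is given below. It is a simplified approximation for illustration rather than the authors' implementation; the keep ratio is an assumption and the Random Number Generator module is folded into the sampling step.

        import torch
        import torch.nn as nn

        class WeightedChannelDropout(nn.Module):
            """Zeroes channels at training time, keeping channels with probability
            proportional to their global-average-pooled activation."""
            def __init__(self, keep_ratio=0.8):
                super().__init__()
                self.keep_ratio = keep_ratio

            def forward(self, x):                      # x: (N, C, H, W)
                if not self.training:
                    return x                           # identity at test time
                n, c, _, _ = x.shape
                # Rating Channels: one non-negative score per channel via GAP
                scores = x.mean(dim=(2, 3)).clamp(min=1e-6)          # (N, C)
                probs = scores / scores.sum(dim=1, keepdim=True)
                # Selecting Channels: weighted random selection of channels to keep
                num_keep = max(1, int(round(self.keep_ratio * c)))
                keep_idx = torch.multinomial(probs, num_keep, replacement=False)
                mask = torch.zeros(n, c, device=x.device)
                mask.scatter_(1, keep_idx, 1.0)
                return x * mask[:, :, None, None]

        wcd = WeightedChannelDropout(keep_ratio=0.8)
        wcd.train()
        out = wcd(torch.rand(2, 64, 8, 8))   # plug in between two conv layers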

  • Three Papers are accepted by ECCV 2018

    Recently, three papers, “End-to-end View Synthesis for Light Field Imaging with Pseudo 4DCNN” from Yunlong Wang, “Lifelong Learning via Progressive Distillation and Retrospection” from Saihui Hou and “Towards Human-Level License Plate Recognition” from Jiafan Zhuang, have been accepted by the European Conference on Computer Vision (ECCV 2018), one of the top conferences in computer vision.

    1. End-to-end View Synthesis for Light Field Imaging with Pseudo 4DCNN

    Abstract: Limited angular resolution has become the main bottleneck of microlens-based plenoptic cameras for practical vision applications. Existing view synthesis methods mainly break the task into two steps, i.e. depth estimation and view warping, which are usually inefficient and produce artifacts under depth ambiguities. In this paper, an end-to-end deep learning framework is proposed to solve these problems by exploring a Pseudo 4DCNN. Specifically, 2D strided convolutions operated on stacked EPIs and detail-restoration 3D CNNs connected with angular conversion are assembled to build the Pseudo 4DCNN. The key advantage is to efficiently synthesize dense 4D light fields from a sparse set of input views. The learning framework is formulated as an entirely trainable problem, and all the weights can be recursively updated with standard backpropagation. The proposed framework is compared with state-of-the-art approaches on both genuine and synthetic light field databases, achieving significant improvements in both image quality (+2 dB higher) and computational efficiency (over 10X faster). Furthermore, the proposed framework shows good performance in real-world applications such as biometrics and depth estimation.

    2. Lifelong Learning via Progressive Distillation and Retrospection

    Abstract: Lifelong learning aims at adapting a learned model to new tasks while retaining the knowledge gained earlier. A key challenge for lifelong learning is how to strike a balance between the preservation of old tasks and t…
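    For the second paper, the core ingredient of distillation-based lifelong learning is a loss that keeps the new model's predictions close to those of the frozen old model on earlier tasks. The snippet below is a generic knowledge-distillation loss of that kind, not the paper's exact formulation; the temperature value is an assumption.

        import torch
        import torch.nn.functional as F

        def distillation_loss(new_logits, old_logits, temperature=2.0):
            """KL divergence between temperature-softened old and new predictions."""
            t = temperature
            log_p_new = F.log_softmax(new_logits / t, dim=1)
            p_old = F.softmax(old_logits / t, dim=1)
            return F.kl_div(log_p_new, p_old, reduction="batchmean") * (t * t)

        old_logits = torch.randn(8, 10)           # frozen old-model outputs on old-task data
        new_logits = torch.randn(8, 10, requires_grad=True)
        loss = distillation_loss(new_logits, old_logits)
        loss.backward()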

  • Professor Wang and Saihui Hou attended the IEEE Conference on Computer Vision (ICCV2017)

    ICCV is one of the top conferences in computer vision, and it was held in the beautiful city of Venice this year. Professor Wang and Saihui Hou went to Venice and attended the conference from October 22 to 29. Two posters from VIM were presented at the conference: “DualNet: Learn Complementary Features for Image Recognition” and “VegFru: A Domain-Specific Dataset for Fine-grained Visual Categorization”.

  • Mingqi attended the 2017 Chinese Conference on Computer Vision

    On 11th October 2017, Mingqi from the VIM Group gave a poster presentation entitled “Learning the Frame-2-Frame Ego-Motion for Visual Odometry with Convolutional Neural Network” at the 2017 Chinese Conference on Computer Vision (CCCV 2017). His work attracted a lot of interest from the participants. CCCV is a major conference on computer vision in China and provides an excellent opportunity for academic and industrial researchers to present their research. Apart from the poster, Mingqi took the opportunity to attend all the other sessions at CCCV 2017 and had a chance to communicate with many experts in the field. Attending such a conference is an effective way to stay informed about the latest research and broaden one’s horizons.

  • Zhikang attended the 2017 IEEE International Conference on Image Processing

    On 19th September 2017, Zhikang from the VIM Group gave an oral presentation entitled “Improving Human Action Recognition by Temporal Attention” at the 2017 IEEE International Conference on Image Processing (ICIP 2017). His work attracted a lot of interest from both academia and industry. ICIP is a major international conference on image processing and provides an excellent opportunity for academic and industrial researchers to meet and showcase their research. Apart from presenting his work, Zhikang took the opportunity to attend all the other sessions at ICIP 2017 and had a chance to discuss with many experts in the field. Attending the conference was also an excellent way to stay informed about the latest research achievements and trends in the field of computer vision.

  • Two Papers are accepted by ICCV 2017

    Recently, two papers from Saihui Hou, “DualNet: Learn Complementary Features for Image Recognition” and “VegFru: A Domain-Specific Dataset for Fine-grained Visual Categorization”, have been accepted by the IEEE International Conference on Computer Vision (ICCV 2017), one of the top conferences in computer vision.

    1. DualNet: Learn Complementary Features for Image Recognition

    Abstract: In this work we propose a novel framework named DualNet that aims at learning more accurate representations for image recognition. Two parallel neural networks are coordinated to learn complementary features, and thus a wider network is constructed. Specifically, we logically divide an end-to-end deep convolutional neural network into two functional parts, i.e., the feature extractor and the image classifier. The extractors of the two subnetworks are placed side by side, which together form the feature extractor of DualNet. The two-stream features are then aggregated by the final classifier for overall classification, while two auxiliary classifiers are appended behind the feature extractor of each subnetwork to make the separately learned features discriminative on their own. The complementary constraint is imposed by weighting the three classifiers, which is indeed the key of DualNet. The corresponding training strategy, consisting of iterative training and joint finetuning, is also proposed to make the two subnetworks cooperate well with each other. Finally, DualNet based on the well-known CaffeNet, VGGNet, NIN and ResNet is thoroughly investigated and experimentally evaluated on multiple datasets including CIFAR-100, Stanford Dogs and UEC FOOD-100. The results demonstrate that DualNet can indeed help learn more accurate image representations and thus achieve higher recognition accuracy. In particular, the performance on CIFAR-100 is state-of-the-art compared to recent works.

    2. VegFru: A Domain-Specific Dataset for Fine-grained Visual Categorization

    Abstract: In this paper, we propose a novel domain-specific dataset…
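    A minimal PyTorch sketch of the DualNet layout described in the first abstract is shown below: two parallel feature extractors, a fused classifier on the concatenated features, and two auxiliary classifiers whose losses are combined with weights. The module sizes and loss weights are illustrative assumptions, not values from the paper, and the iterative training / joint finetuning schedule is omitted.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class DualNet(nn.Module):
            def __init__(self, feat_dim=128, num_classes=100):
                super().__init__()
                def extractor():
                    # stand-in for one subnetwork's feature extractor
                    return nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                         nn.Linear(32, feat_dim))
                self.branch_a, self.branch_b = extractor(), extractor()
                self.joint_clf = nn.Linear(2 * feat_dim, num_classes)   # fused classifier
                self.aux_a = nn.Linear(feat_dim, num_classes)           # auxiliary heads
                self.aux_b = nn.Linear(feat_dim, num_classes)

            def forward(self, x):
                fa, fb = self.branch_a(x), self.branch_b(x)
                joint = self.joint_clf(torch.cat([fa, fb], dim=1))
                return joint, self.aux_a(fa), self.aux_b(fb)

        model = DualNet()
        x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 100, (4,))
        joint, aux_a, aux_b = model(x)
        # complementary constraint: weight the fused loss against the auxiliaries
        loss = (1.0 * F.cross_entropy(joint, y)
                + 0.3 * F.cross_entropy(aux_a, y)
                + 0.3 * F.cross_entropy(aux_b, y))
        loss.backward()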

  • Saihui Hou wins the National Scholarship

    Congratulations to Saihui Hou on being awarded the National Scholarship, the highest honor for a master’s student. In the past two years, besides completing the required courses, he has done a great deal of research on computer vision and deep learning. He has already completed several papers, one of which was accepted by the CVPR Workshop on Robust Features for Computer Vision (ROF 2016), and his work with Dao Xiang was published in the journal IEEE Transactions on Multimedia. Now he has started his journey as a PhD candidate. We hope he keeps working hard and graduates as an outstanding student.

  • Zhikang attended the 13th Asian Conference on Computer Vision

    The 13th Asian Conference on Computer Vision (ACCV’16) was held in Taipei on Nov 20-24, 2016. ACCV is a leading biennial international conference mainly sponsored by the Asian Federation of Computer Vision, with an acceptance rate of less than 25% this year. Zhikang Liu from the VIM research group attended the conference; his paper “Stacked Overcomplete Independent Component Analysis for Action Recognition” was accepted to the main conference. Zhikang presented his poster in the afternoon of Nov 21st and exchanged ideas with other researchers.

  • Two VIMers attended IEEE Conference on Computer Vision and Pattern Recognition 2016

    The 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR) took place at Caesars Palace in Las Vegas, Nevada, from June 26th to July 1st. As the top conference in computer vision, it drew nearly 3,500 researchers from academia and industry all over the world. Xu Liu and Saihui Hou from the VIM research group attended the conference. Their papers “Highway Vehicle Counting in Compressed Domain” and “Deeply Exploit Depth Information for Object Detection” were accepted to the main conference and to the workshop “ROF: Robust Features for Computer Vision 2016”, respectively. Xu Liu presented his poster in the afternoon of June 28th, and Saihui Hou gave an oral presentation about his work at the ROF workshop in the morning of July 1st. During the presentations, they introduced the motivation and innovations of their papers and exchanged ideas with other researchers.