Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
Published in ICCV, 2019
This paper introduces the Asymmetric Non-local Neural Network (ANN) for semantic segmentation, addressing the computational and memory cost of standard non-local modules. The network comprises two key components: the Asymmetric Pyramid Non-local Block (APNB) and the Asymmetric Fusion Non-local Block (AFNB). Experiments demonstrate the effectiveness and efficiency of ANN, which achieves state-of-the-art performance with an mIoU of 81.3 on the Cityscapes test set while running significantly faster and using much less GPU memory than standard non-local blocks.
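The asymmetry is easiest to see in code: queries keep the full spatial resolution, while keys and values are pyramid-pooled down to a small set of anchor points, shrinking the affinity matrix from N×N to N×S. Below is a minimal PyTorch-style sketch of this idea; the channel sizes, pooling scales, and module structure are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AsymmetricNonLocal(nn.Module):
    """Sketch of APNB-style asymmetric non-local attention.

    Queries cover all N positions; keys and values are subsampled to
    S anchor points via pyramid pooling, so attention costs O(N*S)
    instead of O(N^2). Hyperparameters here are illustrative only.
    """

    def __init__(self, in_channels, key_channels=64, pool_scales=(1, 3, 6, 8)):
        super().__init__()
        self.query = nn.Conv2d(in_channels, key_channels, 1)
        self.key = nn.Conv2d(in_channels, key_channels, 1)
        self.value = nn.Conv2d(in_channels, in_channels, 1)
        self.pool_scales = pool_scales
        self.scale = key_channels ** -0.5

    def _pyramid_sample(self, x):
        # Pool at several scales and flatten: S = sum(s * s) anchor points.
        samples = [F.adaptive_avg_pool2d(x, s).flatten(2) for s in self.pool_scales]
        return torch.cat(samples, dim=2)                     # (B, C, S)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)          # (B, N, Ck)
        k = self._pyramid_sample(self.key(x))                 # (B, Ck, S)
        v = self._pyramid_sample(self.value(x)).transpose(1, 2)  # (B, S, C)
        attn = torch.softmax(q @ k * self.scale, dim=-1)      # (B, N, S)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                        # residual connection
```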
Published in ICCV, 2021
This paper introduces an end-to-end semi-supervised object detection approach that surpasses previous methods by a significant margin on the COCO benchmark at labeling ratios of 1%, 5%, and 10%. When unlabeled data is leveraged alongside all labeled data, the approach improves a strong Faster R-CNN by +3.6 mAP, reaching 44.5 mAP, and improves a state-of-the-art Swin Transformer based object detector by +1.5 mAP, reaching 60.4 mAP. Combined with the Object365 pre-trained model, it achieves a new state-of-the-art detection accuracy of 61.3 mAP and instance segmentation accuracy of 53.0 mAP.
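For intuition, the end-to-end scheme follows a teacher-student pseudo-labeling loop: an EMA teacher labels weakly augmented unlabeled images, and the student trains on strongly augmented views of the same images. The sketch below shows only that skeleton under an assumed torchvision-style detector interface; the paper's distinctive pieces, such as score-weighted classification loss and box jittering for pseudo-box selection, are omitted.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    # Teacher weights track an exponential moving average of the student's.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s, alpha=1.0 - momentum)

def semi_supervised_step(student, teacher, labeled_images, labeled_targets,
                         unlabeled_images, weak_aug, strong_aug,
                         score_thresh=0.9, unsup_weight=1.0):
    """One step of a teacher-student pseudo-labeling loop (sketch).

    Assumes a torchvision-style detector: in train mode it takes
    (images, targets) and returns a dict of losses; in eval mode it
    returns per-image dicts with 'boxes', 'labels', 'scores'.
    `weak_aug` and `strong_aug` are hypothetical augmentation pipelines.
    """
    # 1. Supervised loss on labeled images.
    sup_losses = student(labeled_images, labeled_targets)

    # 2. Teacher pseudo-labels weakly augmented unlabeled images;
    #    low-confidence detections are discarded.
    teacher.eval()
    with torch.no_grad():
        preds = teacher(weak_aug(unlabeled_images))
    pseudo = [{k: v[p["scores"] > score_thresh] for k, v in p.items()}
              for p in preds]

    # 3. The student learns from strongly augmented views of the same images.
    unsup_losses = student(strong_aug(unlabeled_images), pseudo)

    loss = sum(sup_losses.values()) + unsup_weight * sum(unsup_losses.values())
    loss.backward()
    ema_update(teacher, student)
    return loss
```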
Published in NeurIPS, 2021
MixTraining is a training approach for object detection that enhances data augmentation by mixing augmentations of different strengths while excluding augmentations likely to be harmful for a given image. It also addresses localization noise and missing labels in human annotations through the use of pseudo boxes. The method consistently improves the performance of various detectors on the COCO dataset, yielding significant accuracy gains for models such as Faster R-CNN and Cascade R-CNN.
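A rough sketch of the mixing idea follows, under assumed names for the detector and augmentation pipeline: each image is routed to a weak or a strong augmentation branch, and pseudo boxes are merged with the annotations to compensate for missing labels. The paper gates the strong branch by per-image detection difficulty; a random draw stands in for that criterion here.

```python
import random

def mixtraining_batch(detector, images, gt_boxes, pseudo_boxes,
                      weak_aug, strong_aug, p_strong=0.5):
    """Mixed-strength augmentation with pseudo boxes (illustrative sketch).

    `detector`, `weak_aug`, and `strong_aug` are hypothetical placeholders:
    the augmentations transform an image together with its boxes, and the
    detector in training mode returns a loss.
    """
    aug_images, targets = [], []
    for img, gt, pseudo in zip(images, gt_boxes, pseudo_boxes):
        # Merge pseudo boxes with the annotations to recover missing labels.
        boxes = list(gt) + list(pseudo)
        if random.random() < p_strong:
            # Strong branch; the paper skips this for images the detector
            # finds difficult, which the random draw stands in for here.
            img, boxes = strong_aug(img, boxes)
        else:
            img, boxes = weak_aug(img, boxes)
        aug_images.append(img)
        targets.append(boxes)
    return detector(aug_images, targets)  # training-mode loss
```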
Published in ECCV, 2022
This paper introduces a two-stage open-vocabulary semantic segmentation framework that combines class-agnostic mask proposals with an off-the-shelf vision-language model, CLIP, which classifies each proposal against free-form category names. The framework outperforms previous methods on zero-shot semantic segmentation tasks and serves as a strong baseline for future research.
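The second stage is essentially zero-shot CLIP classification applied per mask proposal. The sketch below illustrates that step with the public OpenAI CLIP package; the prompt template and preprocessing are simplifications, and the proposals themselves (stage one) are assumed to come from a separate segmentation model.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

def classify_mask_proposals(image_crops, class_names, device="cuda"):
    """Score mask proposals against free-form class names with CLIP.

    `image_crops` is assumed to be a (M, 3, 224, 224) tensor of CLIP-
    preprocessed crops, one per mask proposal.
    """
    model, _ = clip.load("ViT-B/32", device=device)
    prompts = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)

    with torch.no_grad():
        img_feat = model.encode_image(image_crops.to(device))   # (M, D)
        txt_feat = model.encode_text(prompts)                   # (K, D)
        # Cosine similarity between each crop and each class prompt.
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
        logits = 100.0 * img_feat @ txt_feat.T                  # (M, K)
    return logits.softmax(dim=-1)  # per-proposal class probabilities
```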
Published in CVPR, 2023
The paper introduces the Side Adapter Network (SAN), a framework for open-vocabulary semantic segmentation built on a frozen, pre-trained vision-language model, CLIP. SAN attaches a lightweight side network to CLIP that predicts mask proposals and attention biases, which steer CLIP toward recognizing the class of each mask. The approach achieves high accuracy and fast inference with few additional trainable parameters, outperforming other methods on various benchmarks.
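Conceptually, the side network is a small transformer that consumes frozen CLIP features alongside a set of learnable query tokens, then decodes each query into a mask proposal and a matching attention bias. The following sketch illustrates that structure; all dimensions, the fusion scheme, and the single shared bias map are simplifying assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class SideAdapterSketch(nn.Module):
    """Sketch of the side-network idea: a small transformer runs alongside
    a frozen CLIP backbone and emits (a) mask proposals and (b) attention
    biases that steer CLIP's attention when classifying each mask.
    """

    def __init__(self, clip_dim=768, side_dim=240, num_queries=100):
        super().__init__()
        self.num_queries = num_queries
        self.queries = nn.Parameter(torch.randn(num_queries, side_dim))
        layer = nn.TransformerEncoderLayer(d_model=side_dim, nhead=8,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.fuse = nn.Linear(clip_dim, side_dim)   # inject frozen CLIP features
        self.to_mask = nn.Linear(side_dim, side_dim)
        self.to_bias = nn.Linear(side_dim, side_dim)

    def forward(self, clip_tokens, side_tokens):
        # clip_tokens: (B, L, clip_dim) from the frozen CLIP backbone.
        # side_tokens: (B, L, side_dim) visual tokens of the side network.
        b = clip_tokens.shape[0]
        tokens = side_tokens + self.fuse(clip_tokens)        # feature fusion
        x = torch.cat([self.queries.expand(b, -1, -1), tokens], dim=1)
        x = self.blocks(x)
        q, feat = x[:, :self.num_queries], x[:, self.num_queries:]
        # Mask proposals: inner product of query and token embeddings.
        masks = torch.einsum("bqc,blc->bql", self.to_mask(q), feat)
        # Attention bias per query; SAN predicts one bias per attention
        # head, simplified here to a single shared map per query.
        attn_bias = torch.einsum("bqc,blc->bql", self.to_bias(q), feat)
        return masks, attn_bias
```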
Published:
This is a description of your talk, which is a markdown file that can be formatted with markdown like any other post. Yay markdown!
Published:
This is a description of your conference proceedings talk; note the different value of the type field. You can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.