Related references
Note: Only part of the references are listed.
Article
Computer Science, Artificial Intelligence
Xiaojun Chang et al.
Summary: Scene graph is a structured representation of a scene, expressing objects, attributes, and relationships. With the development of computer vision, people aim for a higher level of understanding and reasoning about visual scenes. Scene graphs have attracted researchers' attention as a powerful tool for scene understanding.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Engineering, Electrical & Electronic
Zhenxun Yuan et al.
Summary: This paper proposes a new transformer model, called Temporal-Channel Transformer (TCTR), for video object detection from Lidar data by modeling the temporal-channel and spatial relationships. The model encodes temporal-channel information using the encoder and decodes spatial-wise information using the decoder. A gate mechanism is deployed to refine the representation of the target frame. Experimental results show that TCTR achieves state-of-the-art performance in grid voxel-based 3D object detection on the nuScenes benchmark.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Yehui Tang et al.
Summary: This paper studies the efficiency problem of visual transformers and proposes a patch slimming approach to reduce redundant calculations. Experimental results demonstrate that the proposed method can significantly reduce computational costs without sacrificing performance.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Lizhe Liu et al.
Summary: This work proposes a novel top-to-down lane detection framework, CondLaneNet, which dynamically predicts lane instances and line shapes, achieving real-time efficiency and excellent detection accuracy.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Kaleel Mahmood et al.
Summary: This study investigates the robustness of Vision Transformers to adversarial examples, finding that these examples do not readily transfer between CNNs and Transformers. The researchers introduce a new attack called the self-attention blended gradient attack and analyze the security of a simple ensemble defense of CNNs and Transformers.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Boyu Chen et al.
Summary: The paper introduces a new Neural Architecture Search (NAS) method to find a better transformer architecture for image recognition. By incorporating a locality module and new search algorithms, the method allows for a trade-off between global and local information, as well as optimizing low-level design choices in each module. Through extensive experiments on the ImageNet dataset, the method demonstrates the ability to find more efficient and discriminative transformer variants compared to existing models like ResNet101 and ViT.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Changlin Li et al.
Summary: The paper introduces an unsupervised NAS method called BossNAS to address inaccurate architecture rating caused by large weight-sharing space and biased supervision in previous methods. In a new hybrid CNN-transformer search space, our searched model BossNet-T achieves high accuracy of 82.5% on ImageNet, surpassing EfficientNet by 2.4% with comparable compute time. Furthermore, our method outperforms state-of-the-art NAS methods in architecture rating accuracy on two different search spaces.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Yanjie Li et al.
Summary: This paper introduces a novel approach for human pose estimation based on Token representation, which can learn constraint relationships and appearance cues simultaneously, achieving comparable performance with existing methods in experiments.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Sen Yang et al.
Summary: The TransPose model introduces Transformer for human pose estimation, efficiently capturing long-range relationships and revealing dependencies of keypoints. The heatmap-based approach provides fine-grained image-specific dependencies, showing evidence of how the model handles special cases such as occlusion.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Minghao Chen et al.
Summary: AutoFormer is a novel one-shot architecture search framework dedicated to vision transformer search. It outperforms recent models like ViT and DeiT, achieving good accuracy on ImageNet by training a supernet and generating comparable subnets.
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Xinpeng Wang et al.
Summary: This study focuses on indoor scene generation using transformers, without relying on appearance information. By using selfattention and cross-attention mechanisms, the model can generate scenes faster and with similar or improved realism compared to existing methods, conditioned on room layout or text descriptions.
2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Jeya Maria Jose Valanarasu et al.
Summary: Deep convolutional neural networks have been widely adopted in medical image segmentation, but lack understanding of long-range dependencies due to inherent biases in convolutional architectures. Transformer-based architectures leverage self-attention mechanism to encode long-range dependencies, motivating the exploration of transformer solutions for medical image segmentation tasks.
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT I
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Tao Jiang et al.
Summary: The paper introduces a novel transformer-based network, Skeletor, that can unsupervisedly learn the distribution of 3D pose and motion to reduce inaccuracies and inconsistencies in skeletal estimation. Skeletor uses strong priors learned from 25 million frames to smooth and correct skeleton sequences, achieving improved performance on 3D human pose estimation.
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Jie Shao et al.
Summary: The paper introduces TCA framework for video representation learning that incorporates long-range temporal information using self-attention mechanism, and proposes a supervised contrastive learning method with memory bank mechanism to improve negative sample capacity. Extensive experiments show significant performance advantages in multiple video retrieval tasks.
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Ruijin Liu et al.
Summary: The study introduces an end-to-end lane detection method that outputs lane shape model parameters using a Transformer network, which improves learning efficiency for global context and lane long and thin structures. It shows state-of-the-art accuracy on the TuSimple benchmark and demonstrates powerful deployment potential in real applications, with the most lightweight model size and fastest speed compared to other methods.
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Lucas Tabelini et al.
Summary: The advancement of autonomous driving technology is greatly influenced by the emergence of deep learning. Lane detection remains a challenging issue in the quest for safer self-driving vehicles. This study introduces a novel lane detection method that competes with existing techniques in efficiency and accuracy, with additional insights on evaluation metrics limitations and reproducibility.
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)
(2021)
Article
Computer Science, Software Engineering
Meng-Hao Guo et al.
Summary: This paper introduces a novel framework named Point Cloud Transformer (PCT) for point cloud learning, based on Transformer and enhanced by farthest point sampling and nearest neighbor search for better capturing local context. Extensive experiments demonstrate that the PCT achieves state-of-the-art performance on shape classification, part segmentation, semantic segmentation, and normal estimation tasks.
COMPUTATIONAL VISUAL MEDIA
(2021)
Article
Computer Science, Information Systems
Nico Engel et al.
Summary: Point Transformer is a deep neural network that operates directly on unordered and unstructured point sets, extracting local and global features and relating them through a local-global attention mechanism. SortNet induces input permutation invariance by selecting points based on a learned score. The output is a sorted and permutation invariant feature list that can be directly incorporated into common computer vision applications, showing competitive results compared to prior work through evaluation on standard benchmarks.
Article
Computer Science, Artificial Intelligence
Wei Emma Zhang et al.
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY
(2020)
Review
Engineering, Multidisciplinary
Qiu XiPeng et al.
SCIENCE CHINA-TECHNOLOGICAL SCIENCES
(2020)
Proceedings Paper
Biochemical Research Methods
Tim Prangemeier et al.
2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE
(2020)
Proceedings Paper
Computer Science, Artificial Intelligence
Mohsen Fayyaz et al.
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)
(2020)
Proceedings Paper
Computer Science, Artificial Intelligence
Hongje Seong et al.
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW)
(2019)
Proceedings Paper
Computer Science, Artificial Intelligence
Suhas Lohit et al.
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019)
(2019)
Article
Computer Science, Artificial Intelligence
Hao Liu et al.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2018)
Article
Computer Science, Artificial Intelligence
Shaoqing Ren et al.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2017)
Article
Computer Science, Information Systems
Jun Zhu et al.
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY
(2014)
Article
Computer Science, Software Engineering
Tianshi Chen et al.
ACM SIGPLAN NOTICES
(2014)
Article
Computer Science, Theory & Methods
Daniel A. Spielman et al.
SIAM JOURNAL ON COMPUTING
(2011)
Article
Multidisciplinary Sciences
F Chung et al.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
(2002)