P2T: Pyramid Pooling Transformer for Scene Understanding


22 June 2021
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Ming-Ming Cheng
    ViT

Papers citing "P2T: Pyramid Pooling Transformer for Scene Understanding"

50 / 73 papers shown
Title
LSNet: See Large, Focus Small
Ao Wang
Hui Chen
Zijia Lin
J. Han
Guiguang Ding
37
0
0
29 Mar 2025
A Probabilistic Neuro-symbolic Layer for Algebraic Constraint Satisfaction
Leander Kurscheidt
Paolo Morettin
Roberto Sebastiani
Andrea Passerini
Antonio Vergari
55
0
0
25 Mar 2025
Patch-Depth Fusion: Dichotomous Image Segmentation via Fine-Grained Patch Strategy and Depth Integrity-Prior
Xianjie Liu
Keren Fu
Qijun Zhao
MDE
52
0
0
08 Mar 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
40
0
0
11 Feb 2025
Towards Open-Vocabulary Video Semantic Segmentation
X. Li
Yun Liu
Guolei Sun
Min Wu
Le Zhang
Ce Zhu
VLM
VOS
85
1
0
12 Dec 2024
Multi-Token Enhancing for Vision Representation Learning
Zhong-Yu Li
Yu-Song Hu
Bo Yin
Ming-Ming Cheng
66
1
0
24 Nov 2024
RAFA-Net: Region Attention Network For Food Items And Agricultural Stress Recognition
Asish Bera
O. Krejcar
D. Bhattacharjee
32
6
0
16 Oct 2024
ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object
Jiwei Chen
Laiyan Ding
Chi Zhang
Feifei Li
Rui Huang
13
0
0
14 Oct 2024
SparX: A Sparse Cross-Layer Connection Mechanism for Hierarchical Vision Mamba and Transformer Networks
Meng Lou
Yunxiang Fu
Yizhou Yu
Mamba
42
5
0
15 Sep 2024
MVTN: A Multiscale Video Transformer Network for Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
ViT
26
1
0
05 Sep 2024
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation
Xi Chen
Haosen Yang
Sheng Jin
Xiatian Zhu
H. Yao
VLM
29
3
0
05 Sep 2024
MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders
Baijiong Lin
Weisen Jiang
Pengguang Chen
Shu Liu
Ying-Cong Chen
Mamba
25
1
0
27 Aug 2024
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
Hyunwoo Yu
Yubin Cho
Beoungwoo Kang
Seunghun Moon
Kyeongbo Kong
Suk-Ju Kang
28
2
0
24 Jul 2024
Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs
Jinmin Li
Kuofeng Gao
Yang Bai
Jingyun Zhang
Shu-Tao Xia
28
4
0
02 Jul 2024
CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos
Trong-Thuan Nguyen
Pha Nguyen
Xin Li
Jackson Cothren
Alper Yilmaz
Khoa Luu
33
3
0
03 Jun 2024
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
ViT
32
2
0
18 May 2024
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies
Lingdong Kong
You-Chen Liu
Lai Xing Ng
Benoit R. Cottereau
Wei Tsang Ooi
VLM
29
12
0
08 May 2024
SRAGAN: Saliency Regularized and Attended Generative Adversarial Network for Chinese Ink-wash Painting Generation
Xiang Gao
Yuqi Zhang
GAN
30
0
0
24 Apr 2024
Multi-view Aggregation Network for Dichotomous Image Segmentation
Qian Yu
Xiaoqi Zhao
Youwei Pang
Lihe Zhang
Huchuan Lu
37
15
0
11 Apr 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
FMM-Attack: A Flow-based Multi-modal Adversarial Attack on Video-based LLMs
Jinmin Li
Kuofeng Gao
Yang Bai
Jingyun Zhang
Shu-Tao Xia
Yisen Wang
AAML
22
7
0
20 Mar 2024
OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation
Haochen Jiang
Yueming Xu
Yihan Zeng
Hang Xu
Wei Zhang
Jianfeng Feng
Li Zhang
27
1
0
18 Mar 2024
LSKNet: A Foundation Lightweight Backbone for Remote Sensing
Yuxuan Li
Xiang Li
Yimian Dai
Qibin Hou
Li Liu
Yongxiang Liu
Ming-Ming Cheng
Jian Yang
29
31
0
18 Mar 2024
Attacking Transformers with Feature Diversity Adversarial Perturbation
Chenxing Gao
Hang Zhou
Junqing Yu
Yuteng Ye
Jiale Cai
Junle Wang
Wei Yang
AAML
24
3
0
10 Mar 2024
SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging
Meng Lou
Hanning Ying
Xiaoqing Liu
Hong-Yu Zhou
Yuqing Zhang
Yizhou Yu
MedIm
37
8
0
27 Feb 2024
Lightweight high-resolution Subject Matting in the Real World
Peng Liu
Fanyi Wang
Jingwen Su
Yanhao Zhang
Guojun Qi
3DH
18
2
0
12 Dec 2023
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Li Yuan
Jiangliu Wang
Yibing Song
Ping Luo
112
16
0
26 Nov 2023
MS-Former: Memory-Supported Transformer for Weakly Supervised Change Detection with Patch-Level Annotations
Zhenglai Li
Chang-Fu Tang
Xinwang Liu
Changdong Li
Xianju Li
Wei Zhang
9
6
0
16 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
31
35
0
30 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
32
3
0
08 Oct 2023
SwinLSTM: Improving Spatiotemporal Prediction Accuracy using Swin Transformer and LSTM
Song Tang
Chuang Li
Pufen Zhang
R. Tang
AI4TS
21
45
0
19 Aug 2023
Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention
Liang Shang
Yanli Liu
Zhengyang Lou
Shuxue Quan
N. Adluru
Bochen Guan
W. Sethares
16
1
0
10 Aug 2023
CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation
Jialun Pei
Tao Jiang
He Tang
Nian Liu
Yueming Jin
Deng-Ping Fan
Pheng-Ann Heng
ISeg
30
8
0
16 Jul 2023
Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments
R. Liu
Jiaming Zhang
Kunyu Peng
Junwei Zheng
Ke Cao
Yufan Chen
Kailun Yang
Rainer Stiefelhagen
19
15
0
15 Jul 2023
Revisiting Computer-Aided Tuberculosis Diagnosis
Yun-Hai Liu
Yu-Huan Wu
Shi-Chen Zhang
Li Liu
Min-Ying Wu
Ming-Ming Cheng
19
14
0
06 Jul 2023
SegViTv2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers
Bowen Zhang
Liyang Liu
Minh Hieu Phan
Zhi Tian
Chunhua Shen
Yifan Liu
ViT
19
28
0
09 Jun 2023
InterFormer: Real-time Interactive Image Segmentation
YouFu Huang
Hao Yang
Ke Sun
Shengchuan Zhang
Liujuan Cao
Guannan Jiang
Rongrong Ji
27
22
0
06 Apr 2023
Towards Efficient Task-Driven Model Reprogramming with Foundation Models
Shoukai Xu
Jiangchao Yao
Ran Luo
Shuhai Zhang
Zihao Lian
Mingkui Tan
Bo Han
Yaowei Wang
19
6
0
05 Apr 2023
Large Selective Kernel Network for Remote Sensing Object Detection
Yuxuan Li
Qibin Hou
Zhaohui Zheng
Ming-Ming Cheng
Jian Yang
Xiang Li
ObjD
21
239
0
16 Mar 2023
Pyramid Pixel Context Adaption Network for Medical Image Classification with Supervised Contrastive Learning
Xiaoqin Zhang
Zunjie Xiao
Xiao Wu
Jiansheng Fang
Junyong Shen
Yan Hu
Jiang-Dong Liu
16
10
0
03 Mar 2023
Delivering Arbitrary-Modal Semantic Segmentation
Jiaming Zhang
R. Liu
Haowen Shi
Kailun Yang
Simon Reiß
Kunyu Peng
Haodong Fu
Kaiwei Wang
Rainer Stiefelhagen
VLM
32
85
0
02 Mar 2023
Most Important Person-guided Dual-branch Cross-Patch Attention for Group Affect Recognition
Hongxia Xie
Ming-Xian Lee
Tzu-Jui Chen
Hung-Jen Chen
Hou-I Liu
Hong-Han Shuai
Wen-Huang Cheng
CVBM
20
8
0
14 Dec 2022
IncepFormer: Efficient Inception Transformer with Pyramid Pooling for Semantic Segmentation
Lihua Fu
Haoyue Tian
Xiang Zhai
Pan Gao
Xiaojiang Peng
ViT
18
9
0
06 Dec 2022
Studying inductive biases in image classification task
N. Arizumi
16
1
0
31 Oct 2022
Face Pyramid Vision Transformer
Khawar Islam
M. Zaheer
Arif Mahmood
ViT
CVBM
17
4
0
21 Oct 2022
SegViT: Semantic Segmentation with Plain Vision Transformers
Bowen Zhang
Zhi Tian
Quan Tang
Xiangxiang Chu
Xiaolin K. Wei
Chunhua Shen
Yifan Liu
ViT
16
133
0
12 Oct 2022
Centralized Feature Pyramid for Object Detection
Yu Quan
Dong Zhang
Liyan Zhang
Jinhui Tang
ObjD
19
143
0
05 Oct 2022
Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes
Yu-Huan Wu
Da Zhang
Le Zhang
Xin Zhan
Dengxin Dai
Yun-Hai Liu
Ming-Ming Cheng
3DPC
16
2
0
18 Aug 2022
Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation
Jiaming Zhang
Kailun Yang
Haowen Shi
Simon Reiß
Kunyu Peng
Chaoxiang Ma
Haodong Fu
Philip H. S. Torr
Kaiwei Wang
Rainer Stiefelhagen
ViT
MDE
24
35
0
25 Jul 2022
Defect Transformer: An Efficient Hybrid Transformer Architecture for Surface Defect Detection
Junpu Wang
Guili Xu
Fuju Yan
Jinjin Wang
Zhengsheng Wang
ViT
MedIm
17
65
0
17 Jul 2022