Medical Transformers

Abstract

Transformers are widely adopted in natural language processing research because of their powerful attention mechanism and parallelization capabilities. In computer vision, however, pure applications of transformers to images often underperformed convolutional neural networks (CNNs) and hybrid models (CNNs combined with transformers) before the vision transformer (ViT). ViT, although a relatively new approach, achieves state-of-the-art results in image classification. In this paper, we apply ViT to the medical domain for the first time. We also propose a strided approach (SViT) that improves on both ViT and the vision transformer trained with distillation tokens (DViT). We provide further analyses and comparisons of different applications of vision transformers.
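The abstract does not define "strided". One plausible reading, assumed here purely for illustration, is that patches are extracted with a stride smaller than the patch size, so neighbouring patches overlap and the transformer sees more tokens per image. The sketch below shows only that patch-extraction step in NumPy. The function name `extract_patches`, the 224x224 input, and the patch size and stride values are hypothetical and are not taken from this repository.

```python
import numpy as np


def extract_patches(image, patch_size, stride):
    """Split an (H, W, C) image into flattened square patches.

    Returns an array of shape (N, patch_size * patch_size * C).
    With stride == patch_size the patches do not overlap (plain ViT-style
    patching); with stride < patch_size neighbouring patches overlap
    (the assumed "strided" variant).
    """
    height, width, _ = image.shape
    row_starts = range(0, height - patch_size + 1, stride)
    col_starts = range(0, width - patch_size + 1, stride)
    patches = [
        image[r:r + patch_size, c:c + patch_size].reshape(-1)
        for r in row_starts
        for c in col_starts
    ]
    return np.stack(patches)


# Hypothetical input size and hyperparameters, chosen only for the example.
image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
plain = extract_patches(image, patch_size=16, stride=16)
strided = extract_patches(image, patch_size=16, stride=8)
print(plain.shape, strided.shape)
```

Under these assumed settings, halving the stride raises the token count from 14 x 14 to 27 x 27, which increases the cost of self-attention accordingly. Whether SViT makes this exact trade-off should be checked against the paper and poster.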

Poster

medical transformer poster