Vision Transformer (ViT)
Transformer-based architecture applied to images by splitting them into patches and processing them as sequences, often outperforming CNNs.
Advanced vision transformer architecture
Full definition
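A Vision Transformer applies the Transformer architecture to images. The image is split into fixed-size patches (for example 16×16 pixels), each patch is flattened and linearly projected into an embedding, position embeddings are added, and the resulting sequence is processed by a standard Transformer encoder using self-attention. When pretrained on large datasets, ViTs often match or outperform CNNs on image classification; with limited training data, CNNs can still be the stronger choice.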
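The patch-splitting step from the definition can be illustrated with a minimal sketch in plain Python. The function name `split_into_patches`, the nested-list image layout, and the toy 4×4 image are assumptions made for illustration; real implementations work on tensors and follow this step with a learned linear projection and position embeddings, which are not shown here.

```python
def split_into_patches(image, patch_size):
    """Split an H x W x C image (nested lists) into a sequence of flattened patches.

    Hypothetical helper for illustration only; patches are returned in
    row-major order (left to right, then top to bottom).
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            # Flatten the patch_size x patch_size x C block into one vector.
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    patch.extend(pixel)
            patches.append(patch)
    return patches


# Toy 4x4 image with 3 channels; each pixel is [row_index, column_index, 0].
image = [[[y, x, 0] for x in range(4)] for y in range(4)]
patches = split_into_patches(image, patch_size=2)

# Sequence length and per-patch vector size.
print(len(patches), len(patches[0]))
```

With a 2×2 patch size, the toy image yields a sequence of 4 patches, each a vector of 2 × 2 × 3 = 12 values. By the same arithmetic, a 224×224 image with 16×16 patches gives (224 / 16)² = 196 patches.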
Example in a business context
Modern image classification models that use self-attention instead of convolutions are a typical application.