GlossarIA
Open AI glossary for companies

Vision Transformer (ViT)

Transformer-based architecture applied to images by splitting them into patches and processing them as sequences, often outperforming CNNs.

Advanced vision transformer architecture

Full definition

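A Transformer-based architecture applied to images. The image is split into fixed-size patches, each patch is flattened and linearly embedded, and the resulting sequence of patch embeddings (with position information added) is processed by a standard Transformer encoder using self-attention, much like a sequence of word tokens. When pretrained on sufficiently large datasets, ViTs can match or outperform convolutional neural networks (CNNs) on image classification.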
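
The patch-splitting step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a ViT implementation: the function name `image_to_patches`, the toy 4x4 grayscale image, and the patch size of 2 are assumptions chosen for readability. In a real model, each flattened patch would then be linearly projected and combined with position information before entering the Transformer encoder.

```python
def image_to_patches(image, patch_size):
    """Split a 2D grayscale image (a list of rows) into flattened square patches.

    Hypothetical helper for illustration only. Patches are returned in
    row-major order (left to right, top to bottom), each one flattened
    into a 1D list of pixel values.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Collect the pixels of one patch, row by row.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Toy 4x4 "image" whose pixel values are 0..15 (an assumption for the demo).
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = image_to_patches(image, patch_size=2)

print(len(patches))  # 4 patches, i.e. a sequence of length 4
print(patches[0])    # [0, 1, 4, 5]
```

The output is an ordered list of patch vectors, which is what lets an image be treated as a sequence in the same way a sentence is treated as a sequence of tokens.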

Example in a business context

Modern image classification models that use self-attention instead of convolutions, for example to sort product photos into categories.