Showing content from https://huggingface.co/papers/2304.07193 below:

DINOv2: Learning Robust Visual Features without Supervision

Published on Apr 14, 2023

Abstract

AI-generated summary: Self-supervised pretraining on large-scale image datasets, using a ViT model and distillation, yields superior all-purpose visual features.

The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could greatly simplify the use of images in any system by producing all-purpose visual features, i.e., features that work across image distributions and tasks without finetuning. This work shows that existing pretraining methods, especially self-supervised methods, can produce such features if trained on enough curated data from diverse sources. We revisit existing approaches and combine different techniques to scale our pretraining in terms of data and model size. Most of the technical contributions aim at accelerating and stabilizing the training at scale. In terms of data, we propose an automatic pipeline to build a dedicated, diverse, and curated image dataset instead of uncurated data, as typically done in the self-supervised literature. In terms of models, we train a ViT model (Dosovitskiy et al., 2020) with 1B parameters and distill it into a series of smaller models that surpass the best available all-purpose features, OpenCLIP (Ilharco et al., 2021), on most of the benchmarks at image and pixel levels.
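To make "features that work across image distributions and tasks without finetuning" concrete, here is a minimal sketch of one common way to use frozen features: a k-nearest-neighbour classifier over a bank of labelled feature vectors. It is an illustration only, not the paper's evaluation code. The function names (`cosine_similarity`, `knn_predict`) are hypothetical, and the three-dimensional vectors are toy stand-ins for the embeddings a frozen backbone would produce.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_predict(query, bank_features, bank_labels, k=3):
    """Label a query feature by majority vote among its k most similar neighbours.

    The backbone that produced the features is never updated: only the
    labelled feature bank is consulted, which is what "without finetuning" means here.
    """
    ranked = sorted(
        zip(bank_features, bank_labels),
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,
    )
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Toy stand-ins for features extracted by a frozen model (assumption: real
# features would have hundreds of dimensions and come from an image encoder).
bank_features = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.0, 0.8, 0.3],
]
bank_labels = ["cat", "cat", "dog", "dog"]

prediction = knn_predict([0.85, 0.15, 0.05], bank_features, bank_labels, k=3)
print(prediction)
```

The same pattern extends to a linear classifier trained on top of the frozen features; in both cases the encoder weights stay fixed and only a lightweight head or lookup is task-specific.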

Models citing this paper: 59

facebook/dinov2-base

Image Feature Extraction • 0.1B • Updated Jan 17, 2024 • 2.57M • 137

facebook/dinov2-large

Image Feature Extraction • 0.3B • Updated Sep 6, 2023 • 693k • 86

facebook/dinov2-giant

Image Feature Extraction • 1B • Updated Sep 6, 2023 • 103k • 46

facebook/dinov2-small

Image Feature Extraction • 0.0B • Updated Sep 6, 2023 • 1.32M • 41
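The checkpoints listed above can be loaded through the Hugging Face `transformers` library. The following is a minimal sketch, assuming `torch`, `transformers`, and `Pillow` are installed and the checkpoint can be downloaded. The helper name `extract_features` and the image path are hypothetical and not part of the page.

```python
MODEL_ID = "facebook/dinov2-base"  # any checkpoint ID from the list above


def extract_features(image_path, model_id=MODEL_ID):
    """Return (image-level feature, patch-level features) from a frozen checkpoint."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModel

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference only; no weights are updated

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # last_hidden_state has shape (batch, 1 + num_patches, hidden_dim):
    # token 0 is the class token, the remaining tokens are image patches.
    cls_feature = outputs.last_hidden_state[:, 0]
    patch_features = outputs.last_hidden_state[:, 1:]
    return cls_feature, patch_features
```

A call such as `cls_feature, patch_features = extract_features("example.jpg")` (with a hypothetical local file) would give one vector per image for image-level tasks and one vector per patch for dense, pixel-level tasks.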

Datasets citing this paper: 0

Spaces citing this paper: 101

Collections including this paper: 5
