Train Better Models, Faster - No Labels Needed
LightlyTrain brings self-supervised pretraining to real-world computer vision pipelines, using your unlabeled data to reduce labeling costs and speed up model deployment. Drawing on state-of-the-art self-supervised learning research, it pretrains your model on your own domain-specific images, significantly reducing the amount of labeling needed to reach high model performance.
This allows you to focus on new features and domains instead of managing your labeling cycles. LightlyTrain is designed for simple integration into existing training pipelines and supports a wide range of model architectures and use cases out of the box.
How It Works
On COCO, YOLOv8-s models pretrained with LightlyTrain achieve high performance across all tested label fractions. These improvements hold for other architectures like YOLOv11, RT-DETR, and Faster R-CNN. See our announcement post for more details.
Install LightlyTrain:
pip install lightly-train
Then start pretraining with:
import lightly_train

if __name__ == "__main__":
    lightly_train.train(
        out="out/my_experiment",  # Output directory
        data="my_data_dir",  # Directory with images
        model="torchvision/resnet50",  # Model to train
    )
This will pretrain a Torchvision ResNet-50 model using unlabeled images from my_data_dir. All training logs, model exports, and checkpoints are saved to the output directory at out/my_experiment. The final model is exported to out/my_experiment/exported_models/exported_last.pt.
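Before handing the export to a fine-tuning script, it can help to confirm that the file is where the paragraph above says it will be. The following is a minimal sketch using only the standard library. The helper names `exported_model_path` and `require_exported_model` are hypothetical and not part of LightlyTrain; only the `exported_models/exported_last.pt` layout is taken from the text above.

```python
from pathlib import Path


def exported_model_path(out_dir):
    """Build the path where the final exported model is expected.

    The layout (<out>/exported_models/exported_last.pt) follows the
    description above; this helper itself is a hypothetical convenience.
    """
    return Path(out_dir) / "exported_models" / "exported_last.pt"


def require_exported_model(out_dir):
    """Return the export path, or raise if pretraining has not produced it."""
    path = exported_model_path(out_dir)
    if not path.is_file():
        raise FileNotFoundError(
            f"No exported model at {path}; has pretraining finished?"
        )
    return path
```

Calling `require_exported_model("out/my_experiment")` at the top of a fine-tuning script fails early with a clear message instead of part-way through model setup.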
Finally, load the pretrained model and fine-tune it using your existing training pipeline:
import torch
from torchvision import models

# Load the pretrained model
model = models.resnet50()
model.load_state_dict(torch.load("out/my_experiment/exported_models/exported_last.pt", weights_only=True))

# Fine-tune the model with your existing training pipeline
...
See also:
Fine-Tune Example: Looking for a full fine-tuning example? Head over to the Quick Start!
🔥 New: Semantic Segmentation Fine-tuning: Want to train a state-of-the-art semantic segmentation model? Head over to the semantic segmentation guide!
Embedding Example: Want to use your pretrained model to generate image embeddings instead? Check out the embed guide!
More Tutorials: Want to get more hands-on with LightlyTrain? Check out our Tutorials for more examples!
For an overview of all supported models and usage instructions, see the full model docs.
Contact us if you need support for additional models or libraries.
Supported Training Methods
See the full methods docs for details.
Who is LightlyTrain for?
LightlyTrain is designed for engineers and teams who want to use their unlabeled data to its full potential. It is ideal if any of the following applies to you:
We recommend a minimum of several thousand unlabeled images for training with LightlyTrain and 100+ labeled images for fine-tuning afterwards.
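A quick way to check a dataset against this recommendation is to count the image files in each directory. The sketch below uses only the standard library. The default of 2000 unlabeled images is an assumed stand-in for "several thousand", the extension list is an assumption about your data, and both function names are hypothetical.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".webp"}


def count_images(root):
    """Count image files under root, including subdirectories."""
    return sum(
        1
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def check_dataset_sizes(unlabeled_dir, labeled_dir,
                        min_unlabeled=2000, min_labeled=100):
    """Compare image counts against the recommended minimums.

    min_unlabeled=2000 is an assumed reading of "several thousand";
    min_labeled=100 follows the "100+" recommendation.
    """
    n_unlabeled = count_images(unlabeled_dir)
    n_labeled = count_images(labeled_dir)
    return {
        "unlabeled": n_unlabeled,
        "labeled": n_labeled,
        "enough_unlabeled": n_unlabeled >= min_unlabeled,
        "enough_labeled": n_labeled >= min_labeled,
    }
```

Smaller datasets may still train, so treat the result as a hint, not a hard gate.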
For best results:
The unlabeled dataset must always be treated like a training split: never include validation images in pretraining, to avoid data leakage.
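One simple guard against this kind of leakage is to compare file contents between the pretraining directory and the validation directory. The sketch below hashes every file with SHA-256 and reports pretraining files whose bytes also appear in the validation set. It is an assumption-laden illustration: the function names are hypothetical, and it only catches byte-identical copies, not resized or re-encoded versions of the same image.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_leaked_files(pretrain_dir, val_dir):
    """List files in pretrain_dir whose contents also occur in val_dir."""
    val_digests = {
        file_digest(path)
        for path in Path(val_dir).rglob("*")
        if path.is_file()
    }
    return sorted(
        path
        for path in Path(pretrain_dir).rglob("*")
        if path.is_file() and file_digest(path) in val_digests
    )
```

An empty result from `find_leaked_files("my_data_dir", "my_val_dir")` (both directory names are placeholders) means no exact duplicates were found. Near-duplicates would need a perceptual or embedding-based comparison.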
What's the difference between LightlyTrain and other self-supervised learning implementations?
LightlyTrain offers several advantages:
LightlyTrain is most beneficial when:
LightlyTrain is complementary to existing pretrained models and can start from either random weights or existing pretrained weights.
Check our complete FAQ for more information.
LightlyTrain offers flexible licensing options to suit your specific needs:
AGPL-3.0 License: Perfect for open-source projects, academic research, and community contributions. Share your innovations with the world while benefiting from community improvements.
Commercial License: Ideal for businesses and organizations that need proprietary development freedom. Enjoy all the benefits of LightlyTrain while keeping your code and models private.
Free Community License: Available for students, researchers, startups in early stages, or anyone exploring or experimenting with LightlyTrain. Empower the next generation of innovators with full access to the world of pretraining.
We're committed to supporting both open-source and commercial users. Contact us to discuss the best licensing option for your project!