Meta unveils DINOv3, a self-supervised computer vision model
Investing.com -- Meta has unveiled DINOv3, a state-of-the-art computer vision model that achieves unprecedented performance across diverse visual tasks without requiring labeled data.
The new model scales self-supervised learning to create universal vision backbones that outperform specialized solutions on multiple tasks, including object detection and semantic segmentation. DINOv3 was trained on 1.7 billion images and scaled to 7 billion parameters, a 7x larger model trained on a 12x larger dataset than its predecessor, DINOv2.
Unlike previous approaches that rely heavily on human-generated metadata such as web captions, DINOv3 learns from the images themselves, with no human supervision. This label-free approach enables applications where annotations are scarce, costly, or impossible to obtain.
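The label-free recipe behind the DINO family is self-distillation: a student network learns to match the output of a slowly updated "teacher" copy of itself on different augmented views of the same image, so no annotations ever enter the loop. A minimal PyTorch sketch of that idea, omitting details such as teacher-output centering; the networks, temperatures, and momentum value here are illustrative, not Meta's actual training configuration:

```python
import torch
import torch.nn.functional as F

def dino_step(student, teacher, view1, view2, optimizer,
              tau_student=0.1, tau_teacher=0.04, momentum=0.996):
    """One self-distillation step: the student matches the teacher across views."""
    # Teacher sees one augmented view; its sharpened output is the target.
    with torch.no_grad():
        targets = F.softmax(teacher(view1) / tau_teacher, dim=-1)

    # Student sees a different view of the same image; no labels involved.
    log_probs = F.log_softmax(student(view2) / tau_student, dim=-1)
    loss = -(targets * log_probs).sum(dim=-1).mean()  # cross-entropy across views

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # The teacher is an exponential moving average of the student,
    # never trained directly.
    with torch.no_grad():
        for p_t, p_s in zip(teacher.parameters(), student.parameters()):
            p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)
    return loss.item()
```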
The model produces high-resolution visual features that make it easy to train lightweight adapters, leading to exceptional performance across image classification, semantic segmentation, and object tracking in video. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple dense prediction tasks.
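Concretely, "frozen backbone plus lightweight adapter" means the pre-trained model is never fine-tuned; only a small task head is trained on top of its features. A minimal linear-probe sketch, assuming a generic PyTorch backbone that maps images to feature vectors (the data loader and dimensions are placeholders):

```python
import torch
import torch.nn as nn

def train_linear_probe(backbone, loader, feat_dim, num_classes, lr=1e-3):
    """Train a small linear head on top of a frozen feature extractor."""
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad = False  # freeze: the backbone's features are reused as-is

    head = nn.Linear(feat_dim, num_classes)  # the only trainable parameters
    opt = torch.optim.AdamW(head.parameters(), lr=lr)

    for images, labels in loader:
        with torch.no_grad():
            feats = backbone(images)  # frozen, pre-trained features
        loss = nn.functional.cross_entropy(head(feats), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return head
```

The same pattern extends to dense tasks such as segmentation or depth estimation by attaching a small decoder to the backbone's patch-level features instead of a single global vector.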
Meta is releasing a comprehensive suite of pre-trained backbones under a commercial license, including smaller models that outperform comparable CLIP-based derivatives, as well as alternative ConvNeXt architectures for resource-constrained use cases. The company is also sharing downstream evaluation heads and sample notebooks to help developers build with DINOv3.
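For developers, usage should look similar to the DINOv2 release. A hedged sketch of loading a backbone and extracting features; the repo path and entrypoint name below are assumptions modeled on Meta's earlier hub API, so check the official DINOv3 README for the exact identifiers and any license gating:

```python
import torch

# Assumed entrypoint, patterned after the DINOv2 hub API
# (torch.hub.load('facebookresearch/dinov2', 'dinov2_vitl14')).
model = torch.hub.load('facebookresearch/dinov3', 'dinov3_vitl16')
model.eval()

with torch.no_grad():
    x = torch.randn(1, 3, 224, 224)  # stand-in for a normalized RGB image
    feats = model(x)                 # global image embedding
print(feats.shape)
```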
Real-world applications are already emerging. The World Resources Institute is using DINOv3 to monitor deforestation and support restoration efforts. Compared to DINOv2, the new model reduces the average error in measuring tree canopy height in a region of Kenya from 4.1 meters to 1.2 meters.
NASA’s Jet Propulsion Laboratory is also leveraging the technology to build exploration robots for Mars, enabling multiple vision tasks with minimal compute requirements.
The release includes the full DINOv3 training code and pre-trained models to drive innovation in computer vision and multimodal applications across industries including healthcare, environmental monitoring, autonomous vehicles, retail, and manufacturing.