Intel Labs presents LDM3D, a new generative AI model capable of creating realistic 3D visual content from simple text prompts. It is the first diffusion model capable of generating depth maps, and it has the potential to revolutionize content creation, the metaverse, and digital experiences.
Intel Labs Introduces Latent Diffusion Model for 3D
The Latent Diffusion Model for 3D (LDM3D) is the latest generative AI model announced by Intel Labs in collaboration with Blockade Labs. It is a new diffusion model that uses generative artificial intelligence to create realistic 3D visual content. It is also the first model to exploit the diffusion process to create 360° 3D images through depth mapping. With these capabilities, LDM3D could revolutionize content creation, metaverse applications, and digital experiences, transforming sectors ranging from entertainment and gaming to architecture and design.
Generative AI aims to augment and enhance human creativity and save time. However, most of today's generative AI models are limited to generating 2D images; only very few can generate 3D images from text prompts. Unlike existing latent stable diffusion models, LDM3D allows users to generate an image and a depth map from a given text prompt using almost the same number of parameters. It provides more accurate relative depth per pixel than standard post-processing methods for depth estimation, and it saves developers significant time when creating scenes.
— Vasudev Lal, AI/ML research scientist, Intel Labs
The potential of LDM3D in generative AI
Intel is committed to democratizing generative AI and broadening access to the benefits of this technology through an open ecosystem. Despite the dramatic advances seen in the industry in recent years, most of today's advanced generative AI models are limited to generating two-dimensional images. LDM3D differs from existing diffusion models, which generate 2D RGB images from text prompts, through its use of depth mapping: from a single text prompt, it can create 3D images, providing more accurate relative depth for each pixel.
Intel Labs' research has the potential to revolutionize the way we interact with digital content, creating previously inconceivable possibilities. With LDM3D, users can transform the text description of a tropical beach paradise, a sci-fi universe, or any other scene into a detailed 360° panorama. This capability enables the creation of realistic, immersive visual content and represents a genuine innovation for a wide range of sectors: from entertainment to gaming, from interior design to architectural rendering, all the way to immersive virtual reality (VR) experiences.
How Intel Labs’ new diffusion model works
To create 360° 3D images, LDM3D was trained on a dataset containing 10,000 samples from the LAION-400M database, which holds over 400 million image-caption pairs. To annotate the training data with depth, the team used the Dense Prediction Transformer (DPT) depth estimation model developed at Intel Labs. The DPT-large model provides highly accurate relative depth for each pixel within an image.
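For context, DPT-large is available through the Hugging Face transformers library. The minimal sketch below (assuming the Intel/dpt-large checkpoint and a hypothetical local image file scene.png) shows how relative per-pixel depth can be estimated, the same kind of annotation used to prepare LDM3D's training data:

```python
import torch
from PIL import Image
from transformers import DPTImageProcessor, DPTForDepthEstimation

# Load the DPT-large depth estimation model from Intel Labs
processor = DPTImageProcessor.from_pretrained("Intel/dpt-large")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")

image = Image.open("scene.png")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    # One relative depth value per pixel (lower resolution than the input)
    predicted_depth = outputs.predicted_depth

# Upsample the depth map back to the original image resolution
depth = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],  # PIL size is (width, height)
    mode="bicubic",
    align_corners=False,
).squeeze()
```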
The LDM3D model was trained on an Intel AI supercomputer powered by Intel Xeon processors and Intel Habana Gaudi AI accelerators. The generated RGB image and depth map are then combined to create 360° visual content.
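The article does not detail how the RGB image and depth map are fused, but one common technique is to back-project an equirectangular RGB-D panorama into a colored 3D point cloud. The sketch below illustrates that general idea with NumPy; the spherical projection is our assumption for illustration, not Intel's published pipeline:

```python
import numpy as np

def panorama_to_point_cloud(rgb: np.ndarray, depth: np.ndarray):
    """Back-project an equirectangular RGB-D panorama into 3D points.

    rgb:   (H, W, 3) color image
    depth: (H, W) per-pixel depth values
    Returns (N, 3) points and (N, 3) colors.
    """
    h, w = depth.shape
    # Longitude spans [-pi, pi), latitude spans [pi/2, -pi/2]
    lon = (np.arange(w) / w) * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(h) / h) * np.pi
    lon, lat = np.meshgrid(lon, lat)

    # Unit direction vector for every pixel on the sphere
    x = np.cos(lat) * np.cos(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.sin(lon)
    dirs = np.stack([x, y, z], axis=-1)  # (H, W, 3)

    # Scale each direction by its depth to obtain a 3D point
    points = dirs * depth[..., None]
    return points.reshape(-1, 3), rgb.reshape(-1, 3)
```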
LDM3D: a look into the future of AI
The LDM3D model paves the way for further advances in multi-view generative AI and computer vision. To promote exploration of artificial intelligence and to help build an open-source ecosystem, Intel has made LDM3D accessible via Hugging Face. AI researchers and developers can thus contribute to improving the system and adapt it to custom applications.
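Since the model is published on Hugging Face, it can be run with the diffusers library. The following minimal sketch assumes the Intel/ldm3d-4c checkpoint and the StableDiffusionLDM3DPipeline class; exact names may differ across releases, so check Intel's Hugging Face page for the current versions:

```python
import torch
from diffusers import StableDiffusionLDM3DPipeline

# Load the LDM3D pipeline (checkpoint name is an assumption;
# see Intel's Hugging Face page for the current release)
pipe = StableDiffusionLDM3DPipeline.from_pretrained("Intel/ldm3d-4c")
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

prompt = "a 360-degree view of a tropical beach at sunset"
output = pipe(prompt)

# LDM3D returns an RGB image and a matching depth map per prompt
rgb_image = output.rgb[0]
depth_image = output.depth[0]
rgb_image.save("beach_rgb.png")
depth_image.save("beach_depth.png")
```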
Intel Labs' research was presented at the IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), held June 18-22.