3D model generators could be the next big breakthrough in AI. OpenAI has released Point-E, an open-source machine learning system that can create a 3D object from a text prompt.
Point-E produces 3D models in a matter of minutes using a single Nvidia V100 GPU, according to a paper published with the code base.
Point-E doesn’t create 3D objects in the traditional sense. Instead, it generates point clouds: discrete sets of data points in space that approximate a 3D shape.
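To make the idea concrete, here is a minimal sketch of what a point cloud is: just a list of (x, y, z) coordinates sampled from a surface. This toy sampler is purely illustrative and has nothing to do with OpenAI's actual data format or model.

```python
import random

def sample_sphere(n_points=1024, radius=1.0):
    """Sample a toy point cloud from the surface of a sphere."""
    cloud = []
    for _ in range(n_points):
        # Draw a random direction and project it onto the sphere,
        # giving one discrete sample of the shape's surface.
        x, y, z = (random.gauss(0, 1) for _ in range(3))
        norm = (x * x + y * y + z * z) ** 0.5
        cloud.append((radius * x / norm, radius * y / norm, radius * z / norm))
    return cloud

cloud = sample_sphere()
print(len(cloud))  # 1024 points, each an (x, y, z) tuple
```

A denser cloud approximates the shape more closely, but between the sampled points there is no surface at all, which is where the limitations discussed below come from.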
The “E” in Point-E stands for “efficiency,” a nod to the fact that it generates point clouds far faster than previous 3D object generation methods.
Point clouds are computationally cheaper to produce, but they can’t capture an object’s fine-grained shape or texture, which is a major limitation of Point-E.
To overcome this limitation, the Point-E team trained an additional AI model that converts Point-E’s point clouds into meshes, the vertex-and-face representation used in 3D modeling and design.
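The difference between the two representations is worth spelling out: a mesh pairs a vertex list with faces that define a connected surface, which a bare point cloud lacks. Point-E's actual converter is a learned model; the toy tetrahedron below only illustrates the mesh data structure itself, not OpenAI's method.

```python
import math

# Four vertices of a unit tetrahedron.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
# Each face is a triangle given as indices into `vertices`; together
# the four triangles close the surface, something no point cloud does.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

def triangle_area(a, b, c):
    # Area of one face via the cross product of two edge vectors.
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))

# Because the faces form a closed surface, properties like total
# surface area are well defined -- impossible with points alone.
area = sum(triangle_area(*(vertices[i] for i in f)) for f in faces)
print(round(area, 4))
```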
They also note that this conversion model can sometimes miss parts of an object, leading to distorted or blocky shapes.
Point-E is composed of two models, one text-to-image and one image-to-3D. Similar to OpenAI’s DALL-E 2 or Stable Diffusion, the text-to-image model was trained using labeled images to learn how to associate words with visual concepts.
The image-to-3D model, by contrast, was trained on pairs of images and 3D objects so it could learn to translate effectively between the two.
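The two-stage flow described above can be sketched as a simple pipeline. The function names, signatures, and stub return values here are assumptions for illustration only; they are not OpenAI's actual API (the real one lives in the released code base).

```python
def text_to_image(prompt: str) -> str:
    # Stage 1 (stub): a diffusion model renders a single synthetic
    # view of the object described by the prompt.
    return f"synthetic-view({prompt})"

def image_to_point_cloud(image: str, n_points: int = 1024) -> list:
    # Stage 2 (stub): a second model, conditioned on that image,
    # produces a point cloud; each point carries xyz plus RGB color.
    return [(0.0, 0.0, 0.0, 255, 255, 255)] * n_points

def point_e(prompt: str) -> list:
    # The text prompt never reaches the 3D stage directly; it flows
    # through the intermediate image, which is why a stage-1 miss
    # propagates into a wrong shape downstream.
    return image_to_point_cloud(text_to_image(prompt))

cloud = point_e("a red traffic cone")
print(len(cloud))
```

Chaining the stages like this also makes the failure mode discussed below easy to see: if the image-to-3D stage misreads the intermediate image, the final shape drifts from the original prompt.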
OpenAI researchers say Point-E can produce colored point clouds because its models were trained on a large dataset of 3D objects and associated metadata.
Point-E’s image-to-3D model can sometimes fail to interpret the image produced by the text-to-image model, yielding a shape that doesn’t match the text prompt. Still, it’s orders of magnitude faster than the earlier state of the art, according to the OpenAI team.