For some three decades, researchers have thought it impossible to create 3D animations from regular 2D images. Describing the geometry of objects to a computer was no easy task back then, and even research computers were much slower than our current home computers.
But times have changed. Researchers at Carnegie Mellon University have announced that they’ve found a way to help computers learn the geometric context of a 2D image automatically. Not only will we be able to generate 3D scenes from a single 2D image, but the technology could also be used in gadgets and robots to help them better comprehend what their “eyes” see.
The transformation is accomplished by teaching the computer to spot visual cues that differentiate horizontal and vertical surfaces in outdoor scenes. For example, the sky is usually blue and cars are likely to be resting on flat ground. Judging from the video below, the technology looks really promising.
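To make the idea of "visual cues" concrete, here is a toy sketch in Python. It is not the researchers' method, which learns its cues from example images. The function name, the thresholds, and the sample regions are all hypothetical, chosen only to show how colour and position in the frame could hint at whether a region is sky, ground, or an upright surface.

```python
# Toy illustration only: hand-written cues with made-up thresholds,
# not the learned classifier the researchers describe.

def label_region(mean_rgb, center_y, image_height):
    """Guess a coarse geometric label for one image region.

    mean_rgb: average (r, g, b) of the region, each 0..255 (hypothetical input)
    center_y: row of the region's centre, where 0 is the top of the image
    """
    r, g, b = mean_rgb
    rel_y = center_y / image_height  # 0.0 = top of image, 1.0 = bottom

    # Cue 1: bluish regions in the upper half are probably sky.
    if rel_y < 0.5 and b > r and b > g:
        return "sky"
    # Cue 2: regions low in the frame are probably ground (horizontal).
    if rel_y > 0.7:
        return "ground"
    # Otherwise assume an upright surface such as a wall or a car body.
    return "vertical"


# Three made-up regions from a 480-pixel-tall outdoor photo.
regions = [
    ((110, 160, 230), 40, 480),   # blue, near the top
    ((120, 120, 120), 250, 480),  # grey, mid-frame
    ((90, 90, 85), 430, 480),     # dark, near the bottom
]
labels = [label_region(rgb, y, h) for rgb, y, h in regions]
print(labels)
```

Once every region has a label, a "pop-up" scene could in principle be built by laying the ground regions flat and standing the vertical regions upright along the line where they meet the ground. Hand-tuned rules like these break easily (a blue car, a grey sky), which is presumably why a learned approach is needed.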
How far do you think we are from an age where robots roam the streets and behave just like every one of us?


June 17th, 2006 at 12:16 am
Beyond insane… That’s a MASSIVE leap in graphics technology. It’s so limitless. Give it ten years and this programming will be worth millions.
June 18th, 2006 at 3:54 pm
Them 3D models are shit compared to some of the jobs done manually. Computers will be very advanced, but still need things to be done manually.
June 18th, 2006 at 4:02 pm
Coming to a video game near you in the not too near future. The capabilities are endless... lovely
June 18th, 2006 at 7:21 pm
This would’ve required some clever matrix transformations to be applied. Yet, I’d still prefer to do the work manually. Some of those images look rather flat.
June 19th, 2006 at 2:37 am
You guys are idiots. This technology isn’t really that important for videogames or 3D animation, but for robot navigation. With this technology a robot helicopter or land vehicle could navigate down a crowded city street using traditional video cameras to see where it’s going.
This is the future of input for robotics.
June 19th, 2006 at 5:21 am
Well, application in video games and other mediums is feasible. While manual work is still required, this could start out as a great blueprint for graphical applications. Imagine, for instance, a stadium in a sports game, starting out with the simple 2-D image, then transformed into 3-D. All the programmers would have to do is add some more texturing on top and clean up some of the imagery, and you could have a rather accurate field to play in. The possibilities are limitless!
June 19th, 2006 at 9:54 am
If you have 2 cameras you can easily do stereographic imaging with photogrammetry. Probably they will use this technology for that?
June 19th, 2006 at 11:32 am
Or a multiplayer game, connected to the user’s webcam that takes a few snapshots of the room you’re playing in… players could visit each other, whooping each other’s asses in their living rooms
June 19th, 2006 at 12:28 pm
Come on guys, firstly this technology would be in an early state of development. Yes, they have a long way to go if it was to be used in 3D gaming and animation and all that. However, for what the technology is designed for, this is one hell of a leap forward. I was rather impressed.
June 21st, 2006 at 9:39 am
As mentioned, this isn’t really anything new. Such technologies have been around for QUITE some time now. Practical usage in games would be almost nil.
Since game development entails OPTIMIZATION of 3D and not just a splattering of a complex mesh, the results attained automatically would be nowhere near what can be accomplished with multiple perspective references and a modeler with a good knack for minimalistic poly-flow. Additionally the textures seem to be procedurally generated on the fly, giving distant objects a very blurred look. I’m not quite sure why they didn’t go with a vertex coloring scheme to speed up the process and more easily interpret depth.
As you might have also noticed, it’s not a FULL 3D image either. Merely what the photograph can show. If you tilt the camera around to the backside of any such images, you’d have an empty shell of an already complex high-triangle mesh.
The usages for this have been stated as well. Primarily robotic input interpretation. The only problem is that such technologies have “usually” (at least in the past) taken extensive time to render. A robotic input “eye” would need to update at a fair rate at the least, and even faster if movement was involved or you needed to do better than 2-4 SPF (seconds per frame).