Yesterday I did a brief presentation on augmented reality (AR) at Barcamp Auckland. Having no internet connection, and no mini DVI connector for a VGA projector, I kinda had to roll with it, talking more and showing fewer visuals apart from a few vids. Here are some notes from my original presentation and some links to videos for more info.
As you can appreciate this is a big topic, but I just wanted to touch on the basics for the 30 minute talk. I am no expert in AR, but I think there is a huge opportunity for businesses or individuals to enter this space today and get ahead of the curve.
Augmented reality is part way along the spectrum towards virtual reality. VR is where you experience a fully digital or synthesised environment, whereas AR is where you take a real environment and overlay digital data and imagery to enhance the scene. The most typical example of this is the heads-up display (HUD) in a fighter jet: airspeed, altitude, pitch, and bad guys are all highlighted in the pilot's line of view.
Here are the three most common types of AR, which I will cover in turn:
- Projection AR
- Windowed AR
- Retinal display
Projection AR

This is where you use a camera and a projector to interact with your environment: the camera to observe the scene, and the projector to display overlay information.
This type of AR is currently used in vehicles to project driving information and directions onto the windscreen, so that they appear in the driver's line of view.
See this video of the BMW HUD (mind the music).
Here is a fantastic presentation by the MIT team about their wearable AR projector.
This style of AR is good for interacting with environments where you have canvases close to you. Obviously it would not work when you want to interact with a mountain, a building across the road, and so on.
Retinal display

This is where eye-wear with a built-in camera and micro-projection device projects additional information directly onto the retina, augmenting the scene. Think of this as your own personal HUD that can give you directions as you walk, alert you to new messages, and identify objects and provide detailed information about them.
The best example of this is as seen in the movie Terminator.
Windowed AR – What you can do today
This style of AR is where you use an intermediary device as your window onto a real scene. Think of it as a looking glass. This could be a mobile phone or a portable video device. The device views the scene as you would and renders it on a screen, then overlays digital information on top.
This is the most exciting form of AR for me at the moment, as it is relatively easy to achieve using the latest generation of mobile phones, like the iPhone 3GS or the Android G1 and G2. Here is what you need to achieve basic AR information overlays on a mobile phone.
- A camera and screen – you will need a decent video camera within the device to view the scene, and a screen to render it.
- GPS – the phone needs to know where it is. Consumer GPS can get fairly accurate, to within about 10 metres.
- A digital compass – to determine which way the phone is pointing.
- A tilt sensor (accelerometer) – to determine what angle the phone is looking at. Is it looking down, up, slightly on an angle?
Combining these things together, you get a pretty accurate fix on what the device is looking at. The accuracy varies quite a bit, so this works best with large outdoor scenes where the objects are large – for example, a tourism application where you might be walking through downtown Auckland or Rome looking at landmarks, or a real estate app covering a suburb and providing property information from the roadside.
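As a rough sketch of how those readings combine (plain Python with invented coordinates, not any particular phone API): given a GPS fix and a compass heading, you can compute the bearing to a known point of interest and decide whether it currently falls inside the camera's field of view.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from true north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0

def in_view(heading_deg, target_bearing, fov_deg=60.0):
    """True if the target's bearing lies within the camera's
    horizontal field of view, centred on the compass heading."""
    diff = (target_bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0

# Approximate, made-up example: a phone near Auckland's waterfront
# looking towards the Sky Tower (coordinates are illustrative only).
phone = (-36.8442, 174.7682)   # GPS fix: (lat, lon)
tower = (-36.8485, 174.7622)
b = bearing_deg(phone[0], phone[1], tower[0], tower[1])
print(b, in_view(heading_deg=240.0, target_bearing=b))
```

With a tilt reading added, the same idea extends to placing the label at the right height on screen; this sketch only handles the horizontal case.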
A powerful extension to AR is the inclusion of image and shape recognition. By adding this to position, direction and angle, you enable the device to recognise “things” in the scene. These could be features of the landscape, people's faces, buildings, or objects in a room. This increases the accuracy of AR considerably, but adds complexity by needing algorithms that process real-time video and identify objects. Current mobile devices are a little underpowered for this, but there are some exciting developments underway, and on the next generation of devices this will be even more of a reality.
The most common examples of image recognition today are where a printed pattern, or fiducial marker, is used to let the device identify a canvas in the scene. The device then overlays onto, or replaces, the marker with some digital information. There are examples of this where a 3D object is superimposed onto the scene and the perspective of the object matches the angle and direction viewed by the device.
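A minimal sketch of the maths behind that marker trick (pure NumPy, with made-up corner coordinates, not any particular AR toolkit): once the four corners of a square marker have been found in the video frame, you can solve for the homography that maps the marker's known square onto those corners, and then warp overlay artwork with the same transform so its perspective matches the marker.

```python
import numpy as np

def solve_homography(src, dst):
    """Direct linear transform: the 3x3 homography mapping four
    src (x, y) points exactly onto four dst (x, y) points."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The homography is the null space of this 8x9 system; take the
    # smallest singular vector via SVD and normalise.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_h(h, pt):
    """Map a point through the homography (homogeneous divide)."""
    x, y, w = h @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# The marker is a unit square in its own coordinates; these "detected"
# corner pixels in the camera frame are invented for the example.
marker = [(0, 0), (1, 0), (1, 1), (0, 1)]
detected = [(210, 120), (330, 140), (345, 260), (195, 235)]
H = solve_homography(marker, detected)
# Any point on the marker, e.g. its centre, now maps into the frame;
# overlay artwork would be warped with the same H.
print(apply_h(H, (0.5, 0.5)))
```

Real toolkits do a lot more (finding the marker in the first place, full 3D pose from camera intrinsics), but the homography is the core of why the overlay's perspective matches the device's view.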
This example is a virtual pet for the iPhone
This is a concept game by NVidia using their new GPU chip-set for mobile devices.
- Tourism – a virtual guide that can identify when you are near a point of interest and provide visual and/or audio information about it. A point-and-identify tool to get information on landmarks.
- Real estate – a roadside guide to property that guides you from house to house and provides detailed information from the curb.
- Engineering – an app to allow field engineers to easily find buried cables, pipes, or even just a power meter.
What else? The exciting thing is there are hundreds of applications yet to be discovered.
- FaLLen SREngine demo: http://www.youtube.com/watch?v=LhujKGuhiK0
- Wikipedia article on virtual retinal displays: http://en.wikipedia.org/wiki/Virtual_retinal_display
- BMW augmented reality for engineering: http://www.youtube.com/watch?v=P9KPJlA5yds
- A good Wikipedia article on AR: http://en.wikipedia.org/wiki/Augmented_reality
- If you want to start developing, here is a handy development toolkit: http://www.hitl.washington.edu/artoolkit/