Sunday, August 14, 2011



Tutorial: How to hack Kinect for new functionality

Posted: 14 Aug 2011 02:00 AM PDT

When you break it down into its basic components, Kinect isn't actually particularly revolutionary. It contains a motorised webcam, a microphone array, a depth sensor and an infrared camera - pretty straightforward enthusiast tinkering kit.

But put all these elements together and you've got something that does a good job of approximating the way we're all going to interact with our computers in the future, if science fiction ideals are to be believed. Forget keyboards and mice - waving your arms around like Tom Cruise is the new thing.

What's great about Kinect is its price. You can easily pick one up for under £100, and there are standard webcams already scraping that price point. OK, its output isn't superb - the camera is a little grainy, and it has a limited resolution - but those combined sensors mean it can do some pretty special things.

It can pull out people from complex backgrounds with only the tiniest bit of calibration. With the right software it can track the bare bones, if you'll pardon the pun, of a human skeleton. It can follow you around the room and see in the dark. Kinect is significant.

Set yourself up


Sadly, at the time of writing, Kinect isn't officially supported on the PC. Microsoft is getting there with its own SDK - but for now it's (technically) meant to be an Xbox peripheral only.

Thankfully, some enterprising hackers, boosted by the promise of cash rewards and worldwide fame, crowbarred their way into the device shortly after its launch late last year. Since then, thanks to a number of open frameworks and existing projects that just happened to be suitable for repurposing, Kinect has grown into a fledgling PC peripheral. It's even nearing usefulness.

We say 'nearing' because it still has some way to go. Full functionality isn't quite there yet - something Microsoft's SDK should help with when it's fully released. This means you shouldn't expect absolute perfection; even in its console incarnation, Kinect is rather finicky about positioning and lighting.

Be prepared to shuffle a few things around. To get the full experience you'll need space in which to move - 10-12ft seems to be the norm - and avoid direct sunlight if you can.

Install the drivers

Since there's no official driver, you'll need to start by installing a suite of third party software that looks after the task of recognising and interacting with the Kinect sensor. There are a few options, but we're going to start by downloading the latest unstable release of the OpenNI framework.

OpenNI, which is short for Open Natural Interaction, is a not-for-profit organisation dedicated to improving support for natural interaction devices like Kinect, applications that use them, and the middleware that goes between the two. The OpenNI framework is the bit that does the real donkey work, interpreting your hand gestures and tracking your body motion.

Most Kinect-compatible apps call upon it at this point. Start by running through the installer to get it on to your system. During the installation, you'll be asked if you want to install a driver from PrimeSense. This is the company that made the Kinect sensor for Microsoft, and its driver works in conjunction with OpenNI, so it's safe to do so.

PrimeSense's driver isn't Kinect specific, though. For proper compatibility you'll need the SensorKinect mod, developed by a studious hacker who goes by the name Avin2. Once the OpenNI installation is complete, grab the latest binary from Avin2's site and install it.

The final node in the Kinect trifecta is NITE, which also comes from OpenNI. It's the middleware component that provides the various handy gesture interaction tools used by most Kinect-compatible apps. Download the latest unstable binary from the OpenNI site. You'll need to enter a free license key before you can use it with Kinect: it's 0KOIk2JeIBYClPWVnMoRKn5cdY4=.

In the interests of avoiding mistakes, the potentially confusing characters are, in this order: zero, upper case 'O', upper case 'I', upper case 'I', lower case 'L', lower case 'o'.

Restart your machine and plug the Kinect unit in, and it should be detected and installed without any issues.
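
If you'd like to verify the whole stack from code rather than just trusting Device Manager, a few lines of OpenNI will do it. This is our own sketch rather than part of any of the packages above, and it assumes the OpenNI 1.x C++ headers (XnCppWrapper.h) installed by the steps we've just run through:

    // Our own sketch: confirms OpenNI can create a depth stream, which
    // only works if the SensorKinect driver is installed correctly.
    #include <cstdio>
    #include <XnCppWrapper.h>

    int main()
    {
        xn::Context context;
        XnStatus rc = context.Init();          // load OpenNI and its modules
        if (rc != XN_STATUS_OK) {
            printf("OpenNI init failed: %s\n", xnGetStatusString(rc));
            return 1;
        }

        xn::DepthGenerator depth;
        rc = depth.Create(context);            // fails without SensorKinect
        if (rc != XN_STATUS_OK) {
            printf("No depth sensor found: %s\n", xnGetStatusString(rc));
            return 1;
        }

        printf("Kinect depth stream is available.\n");
        return 0;
    }

If the depth generator can't be created, it's usually the SensorKinect driver that's missing rather than OpenNI itself.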

An actual app

When it comes to testing Kinect's functionality, Nicolas Burrus' Kinect RGB demo should be your first port of call. It does an excellent job of demonstrating the device's impressive capabilities.

Extract the folder from the archive you've downloaded and check inside; you should see a series of executable files. Forget the ones that are marked 'Calibration' for now (you don't need to calibrate the device in order to use it) and head for 'RGBD-viewer.exe'.

When you first run the app, you'll see a message in a command prompt window saying that the camera has been set to VGA resolution; the Xbox only uses QVGA, so you've already extracted more from your Kinect than console users can.

It can go even further - if you run the viewer from a command prompt with '-highres' appended after it, you can extract a full 1,280 x 1,024 from the camera, albeit at a somewhat lower frame rate.
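
For instance, assuming you extracted the demo to a folder called C:\rgbdemo (the path is up to you), the command would be:

    C:\rgbdemo> RGBD-viewer.exe -highres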

The main screen of the app displays a depth-separated image in the main body, with a view from the webcam in a separate box at the top right. The technicolour look of the main image represents different depths; hover your mouse over a particular pixel to see how far it is from the Kinect sensor.
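
If you're wondering what the viewer does when you hover, there's no magic involved: the depth map is simply an array of 16-bit distances in millimetres, with zero meaning 'no reading'. Here's our own minimal sketch of pulling one pixel out via OpenNI - it isn't Burrus' actual code:

    // Our own sketch of what the viewer does: read one depth frame and
    // report how far the centre pixel is from the sensor, in millimetres.
    #include <cstdio>
    #include <XnCppWrapper.h>

    int main()
    {
        xn::Context context;
        if (context.Init() != XN_STATUS_OK) return 1;
        xn::DepthGenerator depth;
        if (depth.Create(context) != XN_STATUS_OK) return 1;

        context.StartGeneratingAll();
        context.WaitOneUpdateAll(depth);       // block until a fresh frame

        xn::DepthMetaData dmd;
        depth.GetMetaData(dmd);
        XnUInt32 x = dmd.XRes() / 2, y = dmd.YRes() / 2;
        printf("Centre pixel is %u mm from the sensor\n", (unsigned)dmd(x, y));
        return 0;
    }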

That's not the best way to represent Kinect's depth-sensing abilities, though. Use the menu in 'Show | Filters' to switch on edge detection (tick the box marked 'Edges'). Close the window, then click 'Show | 3D view' to bring up a window that mixes the depth and webcam sensors together; click and drag to move the eerie three-dimensional view of what Kinect can see.

By default, this is in the form of a point cloud view - click 'Triangles' to fill the space between the points with polygons for a more solid look. Click 'SaveMesh' to output the 3D view to a PLY file, suitable for use with MeshLab or Blender.
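
That PLY export is easy enough to reproduce yourself. As a rough sketch - ours, not the demo's actual source - you ask OpenNI to convert each valid depth pixel into real-world millimetre coordinates, then write the points out in the simple ASCII PLY format:

    // Our own rough take on 'SaveMesh': grab one depth frame, convert each
    // valid pixel to real-world coordinates and dump an ASCII PLY point cloud.
    #include <cstdio>
    #include <vector>
    #include <XnCppWrapper.h>

    int main()
    {
        xn::Context context;
        if (context.Init() != XN_STATUS_OK) return 1;
        xn::DepthGenerator depth;
        if (depth.Create(context) != XN_STATUS_OK) return 1;
        context.StartGeneratingAll();
        context.WaitOneUpdateAll(depth);

        xn::DepthMetaData dmd;
        depth.GetMetaData(dmd);

        // Gather every pixel that has a depth reading (0 means 'unknown').
        std::vector<XnPoint3D> proj;
        for (XnUInt32 y = 0; y < dmd.YRes(); ++y)
            for (XnUInt32 x = 0; x < dmd.XRes(); ++x)
                if (dmd(x, y) != 0) {
                    XnPoint3D p = { (XnFloat)x, (XnFloat)y, (XnFloat)dmd(x, y) };
                    proj.push_back(p);
                }

        // Let OpenNI undo the projection: pixel coords become millimetres.
        std::vector<XnPoint3D> world(proj.size());
        depth.ConvertProjectiveToRealWorld((XnUInt32)proj.size(),
                                           &proj[0], &world[0]);

        FILE* f = fopen("cloud.ply", "w");
        fprintf(f, "ply\nformat ascii 1.0\nelement vertex %u\n"
                   "property float x\nproperty float y\nproperty float z\n"
                   "end_header\n", (unsigned)world.size());
        for (size_t i = 0; i < world.size(); ++i)
            fprintf(f, "%f %f %f\n", world[i].X, world[i].Y, world[i].Z);
        fclose(f);
        return 0;
    }

Open the resulting cloud.ply in MeshLab and you should see the same scene as the demo's 3D view.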

RGBD is a decent demonstration of Kinect's powers, but it's time to put the sensor to work. To do this we'll use FAAST, the Flexible Action and Articulated Skeleton Toolkit.

Developed by a team at the University of Southern California Institute for Creative Technologies, FAAST came to public attention earlier this year when it was used to turn Google's 'gesture controlled Gmail' April Fool's prank into a reality; by using OpenNI and NITE to read your skeleton, FAAST can make a decent stab at translating your movements into keyboard inputs.

Start, predictably, by downloading the latest version of the FAAST toolkit, unzipping it and running the single executable inside. If you've already installed the latest versions of OpenNI and NITE, it should run without problems. Make sure you have your Kinect sensor plugged in, select 'Upper body' under 'Skeleton mode' (we assume you're sitting down; if not, select 'Full body'), set 'Smoothing' to 0.4, then click 'Connect' to fire it up.

FAAST will start up a network server, then attempt to pick up a body. Wave at the sensor and you should be able to see your outline in the window. Now it's time for a calibration gesture to give FAAST an idea of your skeletal proportions.

In this case you'll need to perform a classic body-building pose - arms bent at the elbows, out to your sides, hands in fists pointing skyward. After a couple of seconds FAAST should work out your frame and display a simple wireframe skeleton over the top.
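
Behind the scenes, FAAST is driving OpenNI and NITE's calibration machinery for you. Here's a condensed sketch of that flow using the same OpenNI 1.x C++ API as before - our own reconstruction with error handling omitted, not FAAST's source:

    // Our own condensed reconstruction of the pose/calibration flow.
    #include <cstdio>
    #include <XnCppWrapper.h>

    xn::UserGenerator g_user;

    // A new body appeared: ask NITE to watch for the arms-up 'Psi' pose.
    void XN_CALLBACK_TYPE onNewUser(xn::UserGenerator& gen, XnUserID id, void*)
    {
        gen.GetPoseDetectionCap().StartPoseDetection("Psi", id);
    }

    // Pose spotted: stop watching and request skeleton calibration.
    void XN_CALLBACK_TYPE onPose(xn::PoseDetectionCapability& cap,
                                 const XnChar*, XnUserID id, void*)
    {
        cap.StopPoseDetection(id);
        g_user.GetSkeletonCap().RequestCalibration(id, TRUE);
    }

    void XN_CALLBACK_TYPE onCalibStart(xn::SkeletonCapability&, XnUserID, void*) {}

    // Calibration finished: on success, the wireframe skeleton goes live.
    void XN_CALLBACK_TYPE onCalibEnd(xn::SkeletonCapability& cap, XnUserID id,
                                     XnBool success, void*)
    {
        if (success) cap.StartTracking(id);
    }

    int main()
    {
        xn::Context context;
        if (context.Init() != XN_STATUS_OK) return 1;
        if (g_user.Create(context) != XN_STATUS_OK) return 1;
        g_user.GetSkeletonCap().SetSkeletonProfile(XN_SKEL_PROFILE_UPPER);

        XnCallbackHandle h1, h2, h3;
        g_user.RegisterUserCallbacks(onNewUser, NULL, NULL, h1);
        g_user.GetPoseDetectionCap().RegisterToPoseCallbacks(onPose, NULL,
                                                             NULL, h2);
        g_user.GetSkeletonCap().RegisterCalibrationCallbacks(onCalibStart,
                                                             onCalibEnd,
                                                             NULL, h3);
        context.StartGeneratingAll();

        for (;;) {                             // Ctrl+C to quit
            context.WaitOneUpdateAll(g_user);
            XnUserID ids[8]; XnUInt16 n = 8;
            g_user.GetUsers(ids, n);
            for (XnUInt16 i = 0; i < n; ++i) {
                if (!g_user.GetSkeletonCap().IsTracking(ids[i])) continue;
                XnSkeletonJointPosition hand;
                g_user.GetSkeletonCap().GetSkeletonJointPosition(ids[i],
                    XN_SKEL_LEFT_HAND, hand);
                printf("left hand at (%.0f, %.0f, %.0f) mm\n",
                       hand.position.X, hand.position.Y, hand.position.Z);
            }
        }
    }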

Movements translated

By default, FAAST is set to translate the leaning of your body into presses of the [W], [A], [S] and [D] keys - click 'Start emulator' to enable this mode and feel free to test it out in your favourite first-person game. You'll probably find it doesn't translate particularly well, but it's a proof of concept.

For better results, try closing and reopening the application, then adjusting the smoothing value on the first tab. It ranges between 0 and 1, so stick to decimal values below 1 - anything greater will cause FAAST to stall. You'll want to experiment with this to find the perfect value for your distance from the camera and for the light level in the room.
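
As far as we can tell, FAAST feeds this value into OpenNI's own skeleton smoothing factor, which accepts the same 0-to-1 range - that's an assumption on our part, but if you're writing your own code it's a one-line addition to the skeleton sketch above, just after g_user.Create(context):

    // Assumption: FAAST's smoothing slider maps onto OpenNI's smoothing factor.
    g_user.GetSkeletonCap().SetSmoothing(0.4f);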

This is impressive, but FAAST can do a lot more if you set it up to do so. You can wave your left hand around and use it as a mouse, for instance. Stop the emulator, go to the 'Mouse' tab and click 'Enable' to set mouse control as active. Your personal settings will probably differ from ours, but we found it most comfortable to use Absolute control centred on the shoulder joint, set each of the bounds to a distance of 8 inches, and the movement threshold to 5.

Start the emulator again and try it. As you move your left hand around, your mouse pointer should follow as if you're using the Force to control it.
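
If you're curious about the arithmetic, our guess at what absolute mode does - and it is a guess, not FAAST's actual source - is to normalise the hand's offset from the shoulder within the bounds you set, then scale that to the desktop. A Windows-only sketch, using joint positions in millimetres like those from the skeleton code earlier:

    // Our guess at FAAST's absolute-mode arithmetic (Windows-only sketch).
    // handX/handY and shoulderX/shoulderY are joint positions in millimetres,
    // e.g. from GetSkeletonJointPosition in the skeleton sketch above.
    #include <windows.h>

    void handToCursor(float handX, float handY,
                      float shoulderX, float shoulderY)
    {
        const float range = 8.0f * 25.4f;      // the 8in bounds, in mm
        // Normalise the hand's offset from the shoulder to 0..1 per axis
        // (screen Y grows downwards, hence the flip).
        float nx = (handX - shoulderX + range) / (2 * range);
        float ny = (shoulderY - handY + range) / (2 * range);
        nx = nx < 0 ? 0 : nx > 1 ? 1 : nx;
        ny = ny < 0 ? 0 : ny > 1 ? 1 : ny;
        // Scale to the desktop and move the pointer.
        SetCursorPos((int)(nx * GetSystemMetrics(SM_CXSCREEN)),
                     (int)(ny * GetSystemMetrics(SM_CYSCREEN)));
    }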

You'll notice you're a bit trapped, though - FAAST has effectively taken over your mouse, and since you haven't defined any mouse buttons, you can't click to stop the emulator. Hit [Space], and hopefully the start/stop toggle button will trip.

Now go to the right-hand tab and enter a new gesture to bind to the left mouse button. We found something like 'left_arm_up 20 mouse_click left_button' worked a treat, although as with many things Kinect, it will increase the flailing of your arms and make you look even more like you're having a fit in front of your PC.

Kinect is rather impractical, it's true, but with the FAAST toolkit you should be able to put it to work in any number of ways.

Maximise Kinect

Do things you never thought possible

1. Learn to juggle

This neat program tracks the movement of your hands and attaches glowing orbs to them whenever they're hidden behind your back. Flick your hands and you'll lob the balls in that direction. If you're skilled enough you might manage a three-ball cascade, but don't count on it.

2. Shoot a fireball

Ever seen Dragon Ball Z, the slightly insane Japanese comic book and cartoon series? Run Kamehameha and you'll recognise its influence straight away: it gives you big, wobbly hair that stands on end, and lets you power up and release a world-destroying fireball if you get your poses right.

3. Render 3D objects

You'll need a Mac with Processing installed to use this hack (though if you're savvy enough, you might be able to fire it up in Windows). It uses your hand-waving to sculpt objects from a kind of 3D putty. Kinect input is unlikely to be as accurate as that of a mouse, but that's not the point, is it?

4. Play with physics

This demo uses the openFrameworks system instead of OpenNI, and lets you interact with gravity-centric boxes on screen. It's notable not because of the output, but because of the fact that the Kinect depth sensor removes the need for any kind of blank background.

5. Read your email

The clever minds behind FAAST developed SLOOW - Software Library Optimizing Obligatory Waving - as a response to Google's 'Gesture Gmail' April Fool's prank. No actual code has been released, but it's something you could develop yourself within the toolkit. It certainly looks like fun.

6. Kung Fu Tetris

Practical it isn't, but if you've cleared a big enough space in your front room you can use FAAST to have a go at a particularly physical form of Tetris. Front kicks rotate, side kicks move the pieces, and jumping straight up drops pieces quickly. Lanny Lin has kindly supplied his config file.


In Depth: The future of 3D internet and computer interfaces

Posted: 14 Aug 2011 12:00 AM PDT

When we talk about a 3D internet, we don't mean HTML web pages designed in 3D - designers are doing that already.

Examples like White Void's portfolio add an eye-catching third dimension to an ordinary menu, the Dasai Creative Engineering website features core navigation options mapped onto a rotatable sphere, and the Swell 3D website is rendered in anaglyphic 3D and requires a set of red/cyan glasses to view properly.

Whether this approach is effective is up for debate. The effects on the White Void and Dasai sites need a hefty dose of Flash to function. A 2D website wouldn't be as pretty, but it would load quicker and be much simpler to navigate.

Augmented reality

Perhaps the future of a 3D internet is augmented reality. AR is currently a novelty. It describes applications that use a device's built-in camera to calculate your location and augment what you see with relevant web data. In other words, it's a glimpse into an internet that overflows into the real world.

AR applications deliver real place data in real time, tapping into existing databases and assets on the web. Wikitude and Cyclopedia, for example, let you see London Bridge through a camera and read the relevant Wikipedia entry onscreen. Star Walk annotates the night sky for you, while Quest Visual's Word Lens visually translates written languages as you watch, in a way that feels suspiciously like magic.

"AR is nothing more than a user interface," says Octavio Good, founder of Quest Visual. "In the case of Word Lens, everything that's being done could be done with a dictionary if you had time, but Word Lens uses AR to make looking up words effortless and fun."

Point a phone running the Acrossair browser at a high street, and it will show you the nearest restaurants and highlight those with the best reviews. It doesn't take much imagination to see where this technology is going. In the future, you might be able to see whether a shop has the product you want, or which pub your friends are in.
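
Under the hood, this sort of AR browser is mostly trigonometry: work out the compass bearing from your GPS fix to each point of interest, compare it with the phone's heading, and draw a label wherever the difference falls within the camera's field of view. A small illustrative sketch - ours, not Acrossair's code:

    // An illustrative sketch (not Acrossair's code) of the core sum an AR
    // browser performs: the compass bearing from you to a point of interest,
    // offset against the phone's heading to decide where the label sits.
    #include <cmath>
    #include <cstdio>

    const double DEG = 3.14159265358979323846 / 180.0;

    // Initial great-circle bearing from (lat1,lon1) to (lat2,lon2), in degrees.
    double bearingTo(double lat1, double lon1, double lat2, double lon2)
    {
        double dLon = (lon2 - lon1) * DEG;
        double y = sin(dLon) * cos(lat2 * DEG);
        double x = cos(lat1 * DEG) * sin(lat2 * DEG) -
                   sin(lat1 * DEG) * cos(lat2 * DEG) * cos(dLon);
        return fmod(atan2(y, x) / DEG + 360.0, 360.0);
    }

    int main()
    {
        // Roughly: from Trafalgar Square, which way is London Bridge?
        double bearing = bearingTo(51.5080, -0.1281, 51.5079, -0.0877);
        double heading = 45.0;                 // the phone's compass reading
        double offset = fmod(bearing - heading + 540.0, 360.0) - 180.0;
        printf("bearing %.1f deg; target is %.1f deg %s of centre\n",
               bearing, fabs(offset), offset < 0 ? "left" : "right");
        return 0;
    }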

Google Goggles

It's no surprise that Google has thrown its weight behind AR with its Goggles and Shopper apps. Snap a photo of a product or object, and Google Goggles will attempt to identify it and return relevant search results. Do the same in Google Shopper for a fast price comparison.

The technology can be hit-and-miss. In fact, the process can often take longer than typing a query into a search box. Google Goggles is certainly handy for products, shops and some places, but it's unlikely to be useful if you want to find information about concepts or ideas. For example, how would you use augmented reality to search for information on augmented reality?

While the technology is often used for visual searches, it's not a replacement for search. Instead, it can change the way we interact with real-world objects and places.

"Augmented reality is a very natural user interface for some tasks," adds Octavio Good. "The next steps for AR will be to more seamlessly integrate the real word with information people care about."

Autonomy's Aurasma project promises more integration of the real and digital worlds. If its YouTube promo is any indication, it will let you point your phone's camera at an advert or still image and see it come to life with an animation or a video.

"The technology needed to make AR apps useful has arrived in the last year, in the form of capable smartphones," explains Octavio Good. "This year, phones will get dual-core CPUs, more powerful GPUs, and more capable sensors. AR apps can use these to make more creative apps and to improve the quality of existing ones."

3D controls


The way we interact with computers hasn't changed for almost 30 years. That's not to say that inventors haven't tried to revolutionise PC interaction. We've seen data gloves, VR helmets, 3D mice, trackballs, squeezable balls and even brainwave-powered headsets. Few devices have genuinely threatened the keyboard and mouse though, both of which are perfectly suited to today's 2D interfaces.

But things are changing. The introduction of the touchscreen on mobile devices has given rise to new UIs that are almost invisible to users. "On a traditional desktop PC you move the mouse to move the pointer and highlight the photo you want to see," explains Gabriel White of specialist device user experience consultancy Small Surfaces. "With a natural user interface (like a touchscreen), the user simply reaches out and touches the photo they want."

We're now seeing laptops and desktop PCs with touch-sensitive displays, and Windows 7 supports touch as standard. Microsoft has expanded the multi-touch idea with Surface, one of the world's most expensive coffee tables.

Beyond touch, we need to look to games consoles to see the ideal controllers for future 3D interfaces. Thanks to the PlayStation EyeToy and Move, Nintendo Wii and Microsoft Kinect, millions of people have been introduced to the concept of spatial or gesture-based control - and they like it.

The advantage of the Wii's system is that it needs little explanation. You just swing an invisible tennis racket or chop with an imaginary sword. There are no complicated combos to learn, and the interface is practically invisible. We say 'practically' because current systems are still fiddly when using menus or entering data.

Remember the slick, gesture-based interface in the film Minority Report? What Tom Cruise and a fat special effects budget faked, MIT scientists actually built with an Xbox 360 and a hacked Kinect camera.

Gesture control seems the ideal replacement for the ageing mouse. As for the keyboard, whether the future is a physical peripheral or a virtual projection, there's life in QWERTY yet.

3D computer interfaces


Two-dimensional interfaces have proved their worth from the earliest punched cards through to the DOS prompt and the Windows desktop. 2D works - it's fast and effective. But that hasn't stopped the development of 3D UIs like MeeGo, SPB Shell 3D for Android and BumpTop, even though many are simply 2D systems with a 3D sheen.

Is the next phase of UI development purely cosmetic? UI specialist Gabriel White doesn't think so. "As 3D interfaces evolve, there will be new paradigms for interaction," he says. "Once depth is represented in a UI, it's possible to do fascinating things: organising UI elements spatially (rather than in categories and lists), and tangible manipulation of objects allow us to continue to make user interface more natural (pushing and pulling objects, not just swiping and zooming)."

Beyond the desktop

While original filesystems used a tree-like organisational structure, we're now locked into the idea of a desktop with files organised visually on top of it. This setup feels familiar; it's a structured environment that we can identify with, because it resembles a real-world desk.

But like any desk, a virtual desktop can get cluttered, which is why BumpTop introduced the idea of stacking files on top of one another to create a 3D desktop. Google bought the company in 2010, and has plans to incorporate parts of the 3D UI into Android 3.0.

Again, do we actually need a 3D interface? If so, what do we need it for? There's an argument that a 3D UI would enable us to do things that aren't possible in 2D environments.

Data visualization

James McRae, an Autodesk researcher at the University of Toronto, points to large-scale data visualisation. "Across many disciplines, you have datasets emerging whose geometry is 3D by nature," he says. "Examples might be the solar system, or the human body."


With increased computer power, rendering a 3D display is easy, but interacting with it is hard. Rumours suggest that Windows 8 could have a 3D element - an optional graphical interface called Wind. It could include Kinect support for gesture tracking, or to enable logins via facial recognition. It might even feature colour-coded info-bubbles, which are scaled according to their importance.

Microsoft's Chief Research and Strategy Officer Craig Mundie showed off such a 3D concept UI earlier this year.

"There have been many failed attempts to bring the third dimension to UI experiences," says Gabriel White. "Think of the 1990s, when VRML was all the rage. What we learned from all this is that making 3D interfaces isn't about simulating the real world inside a PC. Rather it's about leveraging our real-world cognitive abilities to create compelling, natural and direct interactions that rely less on explicit reasoning and more on intuition and spatial memory."
