Opportunities for gestural technologies in retail and commercial environments

What is gestural technology?

As everyone gets to grips with smartphones, touch screens and the various other interactive media that do away with a traditional mouse or keyboard, technology providers are quickly establishing a ubiquitous standard for controlling, navigating and requesting actions with our fingertips or bodies -- the art of gestural technology.

The Kinect

Gestural technology in the gaming arena is no new kid on the block.

First implemented by the likes of Sony and Nintendo on their respective consoles, gestural interfacing now has Microsoft's contender: the Kinect. In a nutshell, the Kinect gives multiple users the ability to interact with the Xbox 360 without the need for a handheld controller.

Using an array of complex cameras and microphones, the Kinect is able to recognise users' faces, and runs a skeletal tracking system capable of accounting for up to six people, two of whom can be active users at once. Each active user has 20 of their joints tracked at a cool 30fps to accurately pinpoint their location in 3D space. The Kinect also boasts an impressive depth of field, tracking users up to 3.5m from the sensor.
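To picture what that tracking data looks like in practice, here is a purely illustrative sketch (not the SDK's actual API -- the joint names and structure are my assumptions) of a per-frame skeleton report, plus the obvious first sanity check against the sensor's working range:

```javascript
// Hypothetical shape of one user's skeletal frame: each joint is an
// (x, y, z) position in metres relative to the sensor, 20 joints in
// total, delivered at 30 frames per second.
const frame = {
  userId: 1,
  joints: {
    head:      { x: 0.02,  y: 0.45, z: 2.10 },
    leftHand:  { x: -0.30, y: 0.10, z: 1.95 },
    rightHand: { x: 0.28,  y: 0.12, z: 1.98 }
    // ...plus the remaining joints (torso, shoulders, elbows, etc.)
  }
};

// The sensor tracks reliably out to roughly 3.5m, so a basic check is
// whether a joint's depth (z) falls inside that working range.
function inRange(joint, maxDepth = 3.5) {
  return joint.z > 0 && joint.z <= maxDepth;
}

console.log(inRange(frame.joints.head)); // true -- 2.10m is in range
```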

According to Microsoft CEO Steve Ballmer, the Kinect has outsold its originally modest projection of five million units by more than three million -- almost double the sales achieved by the PlayStation's counterpart, the 'Move' motion controller.

Hacking The Kinect

With the Kinect showing the potential to dominate the 'controller-less' gaming market, hacking the Kinect has become 'the new black'; so much so that Microsoft has revealed that it will be creating its own binaries and drivers to enable third-party developers to explore the capabilities and create new content for the Kinect on non-Xbox 360 platforms.

According to reghardware.com, the SDKs should be released "in the coming months". Unfortunately, they will only be available on the Windows platform, though we can't really blame Microsoft for being choosy about what they will and won't allow to run on their systems (unlike some turtleneck-wearing people I could mention...) -- and these limitations haven't stopped us Mac developers up to now!

There have been many 'Kinect hacks' released recently.

These applications effectively interpret the raw data that the Kinect is seeing and make it available to a wide range of programming languages through particular middleware.

As an ActionScript 3 developer, I have a few choices of SDK and various types of interface -- OpenKinect, FLKinect and TuioKinect. I have played around with all three and found TuioKinect the easiest to set up and get going with for 'blob tracking' (basic hand tracking). Yet it does have some drawbacks: namely, the user must stand a certain distance from the sensor, and it is fairly limited when it comes to accessing the Kinect's motor features or microphone. This was disappointing, as the concept I'd decided I wanted to demo required a much more stable solution.

Then I discovered OpenNI, which probably offers the most extensive collection of libraries (but be prepared for a long slog installing it on Mac OS X, as there are no binaries available at the moment -- users are assured they're on the way).

Now I was all set to create a proof of concept for commercial viability of the Kinect.

Our Kinect demo

I started with a few experiments using the Kinect's depth-measuring capabilities through OpenNI, then explored whether what I'd created could effectively be used through that transparent stuff we call glass -- particularly double glazing. A few experiments later, I discovered it worked like a charm!

So I began to focus the experimentation on commercially viable uses. On a basic level, these are aimed at promoting simple gestures to iterate through levels of information displayed on a rear-projection screen or large-scale monitor. I used Flex's socket classes to connect to a simple JavaScript socket server running locally on my machine.

As for the middleware, I decided to use a combination of OpenNI and a slightly modified PrimeSense C++ application to produce the example. The modifications were fairly simple, but a little daunting to start with, as C++ isn't my first language (or even my second or third...). Using the provided examples and samples, I scraped together a bridge application that connects to a Node.js socket server. The Kinect was now sending data through to the socket -- essentially x, y and z coordinates in rapid succession -- and this only happens once the user has performed the initiation gesture.

Without a doubt, the most challenging aspect of creating a gesture-based UI for the general public is ease of use. With our little experiment being a relatively (or totally) unique user interface with no instruction manual included, we have to rely on users' knowledge of gestural technology on other devices for them to interact successfully with ours. Standard gestures like 'pinch' to zoom, 'expand' to zoom out or 'swipe' left and right can be used to navigate through content, and should be second nature to the majority of smartphone, iPad or next-generation mice users. In this case, these familiar gestures are simply replicated by moving hands in front of the display rather than on it.
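A horizontal 'swipe', for instance, can be recognised from the stream of hand coordinates with nothing fancier than a displacement check. This is a simplified sketch, not our demo's actual recogniser, and the 0.4m threshold is an illustrative value rather than a tuned one:

```javascript
// Detect a swipe from recent hand x positions (metres, oldest first):
// if the hand has travelled far enough along the x axis across the
// sampled window, report the direction; otherwise report nothing.
function detectSwipe(xs, minDistance = 0.4) {
  if (xs.length < 2) return null;
  const dx = xs[xs.length - 1] - xs[0];
  if (dx >= minDistance) return 'swipe-right';
  if (dx <= -minDistance) return 'swipe-left';
  return null;
}

console.log(detectSwipe([0, 0.2, 0.5])); // 'swipe-right'
```

A real implementation would also bound the time window and debounce repeated triggers, but the displacement test above is the core of it.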

Answering the question...

Businesses that advertise or display information within their own commercial or retail space are already notifying potential customers of their presence in that environment. These businesses could go further and implement bespoke interactive experiences to suit their customers' need for information.

An interactive solution gives consumers the ability to access large amounts of information, night or day, on request, and the basic concept behind it can be rolled out to pretty much any business with a window.

A typical example of a business suited to this technology would be an estate agent. Displaying only a few properties within each price band to entice the widest range of potential buyers simply cannot address all of their customers' needs. With an interactive system such as this, potential buyers can 'gesture' through the entire back catalogue of properties and price ranges, and retrieve detailed information that a paper-based or static digital solution simply cannot provide.

Limitations and further potential developments

The Kinect is a powerful and complex bit of kit, but is rather limited in understanding its environment outside of the visual. However, with the addition of third party input devices, intuitive sales devices could be created; an 'intelligent' salesman who never sleeps, if you like.

A clothing retailer, for example, could tailor its digital advertising strategy based on the time of day or the current weather: start advertising umbrellas when it is raining, or bobble hats and mittens when it drops below zero.
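Rules like that reduce to a few lines of code once a weather feed is available. A sketch, with product names and the temperature threshold taken straight from the example above and everything else assumed:

```javascript
// Pick a promotion from simple environmental conditions. The input
// shape ({ raining, temperatureC }) is a stand-in for whatever a real
// weather feed would supply.
function pickPromotion({ raining, temperatureC }) {
  if (raining) return 'umbrellas';
  if (temperatureC < 0) return 'bobble hats and mittens';
  return 'default campaign';
}

console.log(pickPromotion({ raining: true, temperatureC: 10 })); // 'umbrellas'
```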

These environmental considerations can all be taken into account while a user is interacting with a digital display, but they are not the only things the system could be aware of. The user's physical state could also change the information they are shown. The Kinect can read exactly where you are in 3D space, and so could virtually 'size you up' to tell you whether your size is currently in stock, or make an informed decision about the clothing colour schemes you might be interested in based on what you're currently wearing.

Users and businesses could also take advantage of QR technology. For example, say a user has found something that they want to order or get more detailed information about, but it is outside of opening hours; a QR code could be provided to the user so that they can easily come back to the item later on.

As with any new technology, there are potential drawbacks to implementing a solution like this in a business's advertising or sales strategy. The Kinect can currently only support six users, so problems could arise when multiple people try to use the technology at once. Yet this could be overcome by registering one user at a time -- for example, if a user is directed to stand in a certain spot, the system can detect their presence and react only to their gestures.
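That single-user workaround is easy to express on top of the skeletal data: only the person whose torso falls inside a marked 'control spot' drives the interface. A sketch, where the spot's centre, radius and the torso coordinate shape are all illustrative assumptions:

```javascript
// Given tracked users (each with a torso position in metres), return
// the one standing inside the designated control spot, or null if the
// spot is empty. Distance is measured on the floor plane (x, z).
function activeUser(users, spot = { x: 0, z: 2.0, radius: 0.4 }) {
  return users.find((u) => {
    const dx = u.torso.x - spot.x;
    const dz = u.torso.z - spot.z;
    return Math.hypot(dx, dz) <= spot.radius;
  }) || null;
}
```

Everyone else's gestures are simply ignored until the spot is vacated, which sidesteps the multi-user contention problem entirely.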

Windows would also need to be regularly cleaned or self-cleaning glass used in the installation. There will also be a problem if there is condensation on glass or large amounts of moisture or rain on the glass, as this will block the effective view and range of the cameras. A potential solution -- buy an awning!

Minimum lighting levels were an issue that arose while testing the prototype. These levels will need to be met, or artificial lighting implemented, if the effective 3.5m range is to be achievable in the pitch black.

Even without a legitimate SDK from Microsoft, the possibilities of hacking the Kinect to harness its power for advanced user interactivity are immense, accessible and exciting. It is also intriguing to design and prototype interfaces for this not-too-distant technology. I believe it's only a matter of time before we start seeing this type of installation on our high streets and start living in a 'Minority Report' style world -- minus the creepy metal spider robots and precognitives, of course...

Flash vs. HTML5 – how we did it!