Depth Sensing

Most of the apps we have built up to now use video as input. Video is two-dimensional: it has a width and a height.

Depth sensors are special types of cameras that add a third dimension to the mix; they can see width, height, and depth. Depth in this context means the distance from the camera to the surface captured at each pixel.

It is important to note the difference between full 3D and depth. Just like a conventional 2D camera, a depth sensor can only capture what it “sees”. If an object is obstructing another one behind it, the camera will only be able to measure the depth of the one closest to it. We can only have one depth measurement per pixel.
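The "one depth measurement per pixel" rule can be sketched in a few lines. This is only an illustration, not how any particular sensor works internally: the function name, the sample format, and the use of infinity for "nothing seen" are assumptions made for the example.

```python
import math

def render_depth_map(samples, width, height):
    """Build a depth map from (x, y, z) samples, keeping the nearest z per pixel."""
    # math.inf marks pixels where nothing was seen.
    depth = [[math.inf] * width for _ in range(height)]
    for x, y, z in samples:
        if 0 <= x < width and 0 <= y < height:
            # A nearer surface hides anything behind it at the same pixel.
            depth[y][x] = min(depth[y][x], z)
    return depth

# Two objects line up on pixel (1, 0): a wall at 3.0 m and a chair at 1.2 m.
samples = [(1, 0, 3.0), (1, 0, 1.2), (0, 0, 2.5)]
depth_map = render_depth_map(samples, width=2, height=1)
print(depth_map)  # [[2.5, 1.2]]
```

The wall behind the chair leaves no trace in the result, which is exactly the difference between a depth image and a full 3D model.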

Metal Pin Art Pin Point Impression 3D Frame Toy
MegaFaces: Kinetic Facade Shows Giant 3D 'Selfies' from iart on Vimeo.

Technologies

There are a few different technologies used for depth sensing, each with their own pros and cons depending on the application.

Most depth cameras include both color and depth sensors, providing a pair of images per frame. These can then be combined to form a colored point cloud.
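As a rough sketch of how the two images combine into a colored point cloud, each depth pixel can be pushed back into 3D using a pinhole camera model and paired with the color at the same position. The intrinsics `fx`, `fy`, `cx`, `cy` and the tiny test images below are hypothetical, and the sketch assumes the color image has already been aligned to the depth image, which real SDKs usually handle for us.

```python
def depth_to_point_cloud(depth, color, fx, fy, cx, cy):
    """Deproject a depth image into a list of (x, y, z, (r, g, b)) points.

    depth: rows of distances in meters, 0 meaning "no reading".
    color: rows of (r, g, b) tuples, assumed aligned to the depth image.
    fx, fy, cx, cy: pinhole intrinsics in pixels (hypothetical values below).
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels with no depth reading
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z, color[v][u]))
    return points

# A 2x2 depth image with one missing reading, and a matching color image.
depth = [[1.0, 0.0], [2.0, 2.0]]
color = [[(255, 0, 0), (0, 0, 0)], [(0, 255, 0), (0, 0, 255)]]
cloud = depth_to_point_cloud(depth, color, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(len(cloud))  # 3
```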

Coded (Structured) Light

Coded (or structured) light works by projecting a known pattern onto the world, then analyzing the deformations of the pattern to determine the shapes of the surfaces it is projected on.

Coded light technology and the Intel® RealSense™ Depth Camera SR305
  • Pros:
    • Precise.
    • High resolution, usually as high as the camera resolution it is using.
  • Cons:
    • Can be slow as it needs multiple patterns projected then captured to generate a single frame.
    • Subject to interference from outside light, so usually better for indoor sensing.
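To see why several patterns are needed per frame, consider binary Gray code stripes, one common coding scheme. This is an assumption chosen for illustration; the scheme used by any particular camera may differ. Each projector column gets a unique on/off sequence across the patterns, and a camera pixel recovers which column lit it by decoding the sequence it observed.

```python
def gray_code_patterns(num_columns):
    """Return stripe patterns; patterns[i][col] is 1 (lit) or 0 (dark)."""
    num_bits = max(1, (num_columns - 1).bit_length())
    patterns = []
    for bit in range(num_bits - 1, -1, -1):  # most significant bit first
        patterns.append(
            [((col ^ (col >> 1)) >> bit) & 1 for col in range(num_columns)]
        )
    return patterns

def decode_column(bits):
    """Recover the projector column from the on/off sequence seen at one pixel."""
    gray = 0
    for b in bits:
        gray = (gray << 1) | b
    # Convert the Gray code back to a plain binary index.
    col = 0
    while gray:
        col ^= gray
        gray >>= 1
    return col

patterns = gray_code_patterns(8)           # 3 patterns cover 8 columns
seen_at_pixel = [p[5] for p in patterns]   # what a pixel lit by column 5 observes
print(len(patterns), decode_column(seen_at_pixel))  # 3 5
```

A projector with 1024 columns would need 10 such patterns, which is where the speed penalty comes from.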

Time of Flight

As the name implies, time of flight works by measuring the time it takes for a beam of light to make a round trip from the sensor to an object and back. Since the speed of light is known, this travel time converts directly to a distance: the shorter the travel time, the nearer the object.
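The distance calculation can be sketched as follows, assuming we already have a measured round-trip time. Real sensors often infer that time indirectly, which is not shown here, and the function name is made up for the example.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458

def distance_from_round_trip(seconds):
    """Distance in meters to the object, given the round-trip time of the light."""
    # The light travels to the object and back, so we halve the total path.
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2

round_trip = 20e-9  # 20 nanoseconds
print(f"{distance_from_round_trip(round_trip):.2f} m")  # 3.00 m
```

A 3 m distance corresponds to only about 20 nanoseconds of travel time, which hints at how precise the timing hardware needs to be.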

3D time-of-flight camera operation
  • Pros:
    • Compact. (This is the technology used in many current generation phones)
    • Fast, and ideal for real-time processing.
  • Cons:
    • Mid-level accuracy.
    • Low resolution as it uses a custom sensor.
    • Subject to interference from other devices.

Stereo Vision

Stereo vision works like human depth perception, where two cameras (or eyes) are placed side by side looking in the same direction. Differences between the two captured images, called disparity, are used to determine how far each pixel is from the sensors.
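Once a pixel has been matched between the two images, the standard relation for a rectified stereo pair turns its disparity into a depth: depth = focal length × baseline / disparity. The sketch below assumes that matching has already been done; the focal length and baseline are hypothetical values.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for a rectified stereo pair; None when there is no match."""
    if disparity_px <= 0:
        return None  # zero disparity means infinitely far away, or no match found
    return focal_length_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, cameras 0.1 m apart.
for disparity in (70, 35, 7):
    depth = depth_from_disparity(disparity, focal_length_px=700, baseline_m=0.1)
    print(f"disparity {disparity:>2} px -> {depth:.1f} m")
```

Large disparities mean near objects and small disparities mean far ones, so a one-pixel matching error matters much more at long range than up close.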

Semi-global Matching Method
  • Pros:
    • Tends to be cheap, as any off-the-shelf cameras can be used.
    • Image representation is intuitive to humans.
    • No interference.
  • Cons:
    • Low accuracy.
    • Complex implementation requiring feature extraction and matching.

Microsoft Kinect

Microsoft was the first company to bring depth sensors to the mass market with the Kinect for Xbox 360 in 2010. It was originally designed as a game controller, but gained a lot of interest from robotics engineers and creative coders, who hacked the device to use it in custom desktop applications.

Microsoft was originally against the practice, even threatening legal action against the hackers, but eventually changed course and embraced the community building around it.

ofxKinect 3D draw 001 from Memo Akten on Vimeo.
Interactive Puppet Prototype with Xbox Kinect from Theo Watson on Vimeo.

Kinect for Xbox 360

Even though it was discontinued in 2013, this device is still used because of its long range and ease of use on all major OSes.

Kinect for Xbox 360
  • Technology: Coded light
  • Range: 1.2-3.5m / 3.9-11.5ft
  • Color resolution: 640x480px
  • Depth resolution: 640x480px
  • FOV: 57°x43°
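When planning an installation, the field of view and range can be turned into the size of the area the sensor covers at a given distance. This is a minimal geometric sketch that ignores lens distortion; the function name is an assumption for the example.

```python
import math

def coverage_at_distance(distance_m, fov_h_deg, fov_v_deg):
    """Width and height (in meters) of the area seen at a given distance."""
    width = 2 * distance_m * math.tan(math.radians(fov_h_deg) / 2)
    height = 2 * distance_m * math.tan(math.radians(fov_v_deg) / 2)
    return width, height

# Kinect for Xbox 360 at its maximum listed range of 3.5 m.
w, h = coverage_at_distance(3.5, 57, 43)
print(f"{w:.1f} m x {h:.1f} m")
```

At 3.5 m this works out to roughly 3.8 m wide by 2.8 m high.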

Extra features:

  • Microphone
  • Motorized base to adjust device angle

OF Support

Starfield from Lab212 on Vimeo.

Kinect V2 (aka Kinect for Xbox One)

Released in 2013, the second version of the Kinect is said to have 3x the fidelity of its predecessor and a 60% wider field of view.

While still originally labeled as an Xbox controller, Microsoft shipped a Windows SDK with the device, giving desktop applications access to advanced features.

This device was discontinued in 2017 but is also still widely used for long-term installations.

Kinect V2
  • Technology: Time-of-flight
  • Range: 0.5-4.5m / 1.6-14.8ft
  • Color resolution: 1920x1080px
  • Depth resolution: 512x424px
  • FOV: 70°x60°

Extra features using Microsoft SDK:

  • Body tracking (up to 6 people)
  • Facial expression recognition
  • Hand gesture recognition
  • Heart rate tracking
  • Speech recognition

OF Support

Parade - Dancing Shadow Sculptures from Dpt. on Vimeo.

Kinect for Azure

Released in 2019, the Kinect for Azure is based on technologies from the previous Kinect and the HoloLens. This is the first Kinect device marketed to developers at launch, and the first to have an open-source SDK with official support for non-Windows platforms.

Kinect for Azure
  • Technology: Time-of-flight
  • Range: 0.25-5.5m / 0.8-18.0ft (mode dependent)
  • Color resolution: Up to 3840x2160px
  • Depth resolution: Up to 1024x1024px
  • FOV: Up to 120°x120°

Extra features using Microsoft SDK:

  • Orientation sensors
  • 360° microphone array
  • Body tracking (requires NVIDIA GPU)

OF Support

Intel RealSense

The Intel RealSense started off as a compact depth camera aimed at video conferencing, gesture-based interaction, and 3D scanning. The 200 series included many devices, both standalone and embedded in tablet computers and laptops.

The quality was not up to par with other depth cameras, and the original RealSense was rarely used for interactive installations.

In 2018, Intel released the 400 series devices, which were a major improvement on the previous generation. Low cost, small form factor, and portability make these devices a viable choice for many applications.

Intel RealSense

D415 / D435 / D455 / D457

  • Technology: Stereo IR
  • Range: 0.10-10m / 0.3-32.8ft (depends on conditions)
  • Color resolution: 1920x1080px
  • Depth resolution: 1280x720px
  • FOV: 65°x40° (D415), 87°x58° (D435, D455, D457)

Extra features:

  • USB powered
  • Orientation sensors (on some models)

OF Support

Other Options

Stereolabs ZED

Stereolabs are the newest addition to this list, and are worth mentioning because of their high-quality ZED 2 sensors.

  • Technology: Stereo Color (range is virtually unlimited)
  • Waterproof / dustproof options make them great for outdoor use
  • SDK uses machine learning models to provide a robust depth map and body tracking features
  • Works on Windows and Linux but requires an NVIDIA GPU

Orbbec Astra

Orbbec released the Astra Series as a response to the Microsoft Kinect. The goal was to create an open, cross-platform SDK which included body tracking with OpenNI.

  • Technology: Structured Light

Leap Motion

The Leap Motion Controller is a depth sensor that focuses on hand and finger tracking. It can be used for both Desktop and VR applications.

  • Technology: Stereo IR

USB Connections

We will often find ourselves wanting to connect many sensors to a single computer, or wanting to position our sensors far from the computer. This can be achieved using USB hubs and USB extenders, but one thing to remember is that not all USB hardware is created equal, and like most things in life, you get what you pay for.

In all cases, the most important thing we can do is test our setup with all the hardware connected to make sure everything is working as expected.

Bandwidth

USB bandwidth (amount of data over time) should be planned carefully:

  • Only enable feeds that are necessary to the application (e.g. disable the RGB color stream if we are only interested in the depth data).
  • Use a lower image resolution if it provides enough information (e.g. test a lower resolution stream to see if it provides enough precision).
  • If using multiple devices, connect them to different USB channels when possible (e.g. connect one on the front and the other on the back).
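A back-of-the-envelope estimate helps with this planning. The sketch below computes the raw, uncompressed data rate of a stream; the pixel formats and frame rates are assumptions for illustration, and real devices may compress some streams, so actual numbers should come from testing with the hardware.

```python
def stream_megabits_per_second(width, height, bytes_per_pixel, fps):
    """Raw (uncompressed) data rate of one image stream, in megabits per second."""
    return width * height * bytes_per_pixel * fps * 8 / 1_000_000

# Hypothetical stream setups: 16-bit depth and 24-bit color at 30 fps.
streams = {
    "depth 640x480": stream_megabits_per_second(640, 480, 2, 30),
    "depth 320x240": stream_megabits_per_second(320, 240, 2, 30),
    "color 1920x1080": stream_megabits_per_second(1920, 1080, 3, 30),
}

for name, mbps in streams.items():
    print(f"{name}: {mbps:.1f} Mbit/s")
```

For comparison, USB 2.0 has a nominal signaling rate of 480 Mbit/s and USB 3.0 of 5 Gbit/s, with usable throughput lower in practice. Under these assumptions a single raw 1080p color stream would not fit on USB 2.0, while halving the depth resolution in each dimension cuts its data rate to a quarter.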

Hubs

USB hubs can help, but have limitations:

  • Powered USB hubs tend to be more reliable, as they can supply enough power to every connected device.
  • External hubs that connect to a PC using a USB connection will create a bottleneck (since all the data still needs to go through a single bus).
  • Internal PCIe hubs with dedicated channels per port will work the best.

Recommendation:

Cables

USB signals degrade over distance.

  • Use cables that are close in length to what is needed. Cables that are too long will weaken the signal.
  • Thicker and insulated cables tend to reduce interference.
  • Active (powered) cables can boost the data signal, especially over long distances.

Recommendations: