SMV Content for the 3D Media World
The onset of the virtual reality (VR) era has driven the development of many new 360-degree VR image production technologies. However, all of these technologies require users to wear 3D goggles or head-mounted displays (HMDs), which can cause discomfort to the wearer, making them less appealing to some users. As an alternative, super multi-view (SMV) display technology is being developed to allow 3D images to be viewed without special viewing devices.
Interactive SMV Content Technology Trends
SMV (super multi-view) refers to a multi-view display designed to project two or more views into the viewer's pupil (about 5 mm wide), with adjacent views spaced very closely; the resulting effect is similar to that of a hologram. Research on acquiring and visualizing SMV images is under way at universities and research institutes, and commercial multi-view displays are manufactured in the U.S., Japan, the Netherlands, and Korea. Although commercial multi-view displays allow viewers to watch 3D images without goggles, a market for them has yet to form. The key reason is thought to be the difficulty of content production.
ETRI is currently developing content production technology for SMV displays with dozens of views. It has developed a real-time SMV image generation system, a portable multi-view filming device for ordinary users, and a multi-view production tool. It has also developed technology to render CG scenes with hundreds of views in real time and present them on an SMV display, making it possible to enjoy 3D games and VR without 3D eyewear.
Turning to international interactive SMV content technology trends: the University of Southern California has constructed a 72-view horizontal-diffusion display using pico-projector arrays, along with a rendering system that visualizes 640x480-resolution images for each view. At Google I/O 2015, Google launched a VR camera rig called 'JUMP', which connects 16 GoPro HERO4 Black cameras to film in 360 degrees. Researchers at Nagoya University built an intermediate-view TV system that transmits footage from a 100-camera multi-view system to a server, which generates depth-map-based intermediate-view videos and transmits them back to a mobile device. OmniTouch, developed at Carnegie Mellon University, combines a depth camera that obtains near-field depth data with a wearable system, allowing finger-motion interaction with virtual objects appearing on everyday real surfaces. A Wi-Fi-signal-based gesture recognition solution that requires neither body-worn sensing devices nor line-of-sight was developed at the University of Washington. Foundry and NewSight are companies that provide image construction technology.
Multi-view Video Content Production and Construction Tools
To create SMV videos that can be viewed without discomfort on an SMV display, the content has to be designed for viewing comfort from the beginning. To maximize the 3D effect, the object has to be filmed over a sufficiently wide viewing angle, and the image must transition continuously when the viewer moves while looking at the object on an SMV display. The key to good SMV video production lies in striking a balance between the limited number of views an SMV display provides and preserving the apparent continuity of the object across these discrete views. The physical conditions of the SMV video system are as follows.
Image production in the SMV image construction tool can be divided into five stages: image filming, image correction (position synchronization), image conversion, image compression, and image output. A rig with many cameras arranged at regular intervals and in fixed directions is used to film real-life multi-view footage; both the parallel and the converged arrangement of the cameras can be controlled easily with a single wheel. Because the images captured by the multi-view camera rig may differ in their start times, time synchronization is required. Position synchronization is also required, since it is almost impossible to ensure that all images taken by a multi-view rig come from the intended camera positions. Position synchronization of multi-view images involves image calibration and image alignment: calibration identifies detailed information about the position of each camera and yields the remapping information used to generate the aligned images.
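As a minimal illustration of the time-synchronization step described above, the frame offset between two cameras' recordings can be estimated by cross-correlating a simple per-frame signal such as mean brightness. The function name and the choice of signal are illustrative assumptions, not ETRI's actual implementation.

```python
import numpy as np

def estimate_frame_offset(ref_signal, cam_signal):
    """Estimate the start-time offset (in frames) between two cameras by
    cross-correlating per-frame mean-brightness traces.
    A positive result means cam_signal lags ref_signal."""
    ref = ref_signal - ref_signal.mean()
    cam = cam_signal - cam_signal.mean()
    corr = np.correlate(cam, ref, mode="full")
    # re-center the argmax around the zero-lag position
    return int(np.argmax(corr)) - (len(ref) - 1)

# Example: the second camera starts 3 frames after the reference camera.
rng = np.random.default_rng(0)
ref = rng.random(200)
cam = np.concatenate([np.zeros(3), ref])[:200]  # same scene, delayed 3 frames
print(estimate_frame_offset(ref, cam))
```

In a real pipeline the brightness traces would come from the recorded footage itself (or from an audio track), but the peak-of-correlation principle is the same.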
Disparity information is necessary for obtaining the intermediate-view images required in making SMV images. Methods such as DIBR (depth-image-based rendering) and image-domain warping may be used to generate these intermediate views. In SMV image processing, the larger the number of final views, the larger the amount of image data, including intermediate images, so the SMV image needs to be compressed for storage. The prepared SMV images then have to be converted into a format that SMV displays can output; image adjustment and splitting are also required to create a natural 3D effect.
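The core of DIBR-style intermediate-view generation is shifting each pixel by its disparity, scaled to the position of the virtual viewpoint. The sketch below does this for a single image row; it is a simplified stand-in, since a real implementation warps pixels in depth order to resolve occlusions and uses more careful hole filling.

```python
import numpy as np

def warp_view(ref_row, disparity_row, alpha):
    """Forward-warp one image row to a virtual viewpoint between two
    real cameras. alpha=0 reproduces the reference view, alpha=1 the
    neighboring view. Assumes nonnegative pixel values; disocclusion
    holes are filled from the nearest valid left neighbor."""
    w = len(ref_row)
    out = np.full(w, -1.0)  # -1 marks holes
    for x in range(w):
        tx = int(round(x + alpha * disparity_row[x]))
        if 0 <= tx < w:
            out[tx] = ref_row[x]
    if out[0] < 0:
        out[0] = ref_row[0]
    for x in range(1, w):
        if out[x] < 0:          # hole: copy the last valid value
            out[x] = out[x - 1]
    return out
```

For example, warping a row with a uniform disparity of 2 pixels to the midpoint (alpha=0.5) shifts every pixel right by 1 and fills the single disoccluded pixel at the left edge.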
Content Streaming Technology
SMV content streaming technology allows images captured by cameras at multiple viewpoints to be streamed and viewed on SMV displays as SMV videos that do not cause discomfort to the viewers' eyes. This requires synchronizing the views using hardware and algorithms designed for processing large amounts of data. Streaming SMV content involves time synchronization, color synchronization, video alignment, depth map calculation, and intermediate image generation.
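Of the steps listed above, color synchronization is commonly approached by matching each camera's intensity distribution to a reference view. The single-channel histogram-matching sketch below illustrates the idea; the function name is an assumption, and a production system would work per channel with calibrated color targets.

```python
import numpy as np

def match_colors(src, ref):
    """Color-synchronize one camera's frame to a reference view by
    histogram matching: each source pixel is assigned the reference
    value of equal rank. Single channel here for brevity."""
    src = np.asarray(src, dtype=float).ravel()
    ref = np.asarray(ref, dtype=float).ravel()
    out = np.empty_like(src)
    rank_positions = np.linspace(0, len(ref) - 1, len(src)).astype(int)
    # pixels sorted by brightness receive the reference's sorted values
    out[np.argsort(src)] = np.sort(ref)[rank_positions]
    return out
```

After matching, the output frame has the reference view's value distribution while preserving the source frame's brightness ordering.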
The SMV video content streaming system developed by ETRI obtains images from 18 cameras arranged in a row, transmits them to three node servers, and performs synchronization and foreground/background separation. The node servers transmit the separated images and the original images to an acquisition server, which calculates a depth map, synthesizes CG onto the original and depth map images, and forwards the result to an intermediate-image server. The intermediate-image server creates an appropriate number of intermediate views from the 18 received images and mixes them into a single SMV image, which is sent to the SMV display to be viewed by the audience.
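The final mixing step can be sketched as view interleaving: the N views (original plus intermediate) are woven into one display image. Actual SMV displays map views per subpixel along the slant of the lenticular optics; the cyclic column assignment below is a deliberately simplified stand-in.

```python
import numpy as np

def interleave_views(views):
    """Interleave N same-sized grayscale views into one multi-view
    display image by assigning each pixel column to a view in a
    repeating cycle (a simplified stand-in for per-subpixel lenticular
    mapping)."""
    views = np.asarray(views)          # shape: (N, H, W)
    n, h, w = views.shape
    out = np.empty((h, w), dtype=views.dtype)
    for x in range(w):
        out[:, x] = views[x % n, :, x]
    return out

# Example: three constant views (values 0, 1, 2) interleave into
# a repeating 0,1,2 column pattern.
views = np.stack([np.full((2, 6), i) for i in range(3)])
mixed = interleave_views(views)
```

In the ETRI system the equivalent operation runs over the 18 captured views plus the generated intermediate views before the result is sent to the display.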
A range of user gesture recognition methods for digital content interaction have been researched, based on analyzing video or signal data obtained from cameras (RGB, depth, etc.) or wearable sensors. A unique feature of SMV content is that it shows different 3D images depending on the user's viewing direction. To remove restrictions on the user's position, gesture input devices that can be worn on the body are utilized; wearable gesture recognition allows interaction regardless of where the multi-view content user is located. Ring-type devices recognize gestures from gyroscope and accelerometer data, enabling interaction with content and control of computers and devices, and a wide range of ring-type devices have been developed on the strength of this advantage.
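A toy version of gyroscope-based gesture recognition on a ring device can be sketched as a threshold classifier: integrate the angular velocity over the gesture window and label it by the dominant rotation axis and its sign. The function name, threshold, and labels are illustrative assumptions, not any shipping device's scheme.

```python
import numpy as np

def classify_swipe(gyro_samples):
    """Classify a ring gesture from raw gyroscope samples (N x 3 array
    of angular velocity about x/y/z, in rad/s per sample interval).
    The axis with the largest integrated rotation, and its sign,
    decide the label; small totals are rejected as noise."""
    total = np.sum(np.asarray(gyro_samples, dtype=float), axis=0)
    axis = int(np.argmax(np.abs(total)))
    if np.abs(total[axis]) < 1.0:            # below threshold: no gesture
        return "none"
    labels = [("tilt_down", "tilt_up"),       # rotation about x
              ("swipe_left", "swipe_right"),  # rotation about y
              ("roll_ccw", "roll_cw")]        # rotation about z
    return labels[axis][int(total[axis] > 0)]
```

A practical recognizer would add acceleration features, gesture segmentation, and a trained model, but the sensor-to-label flow is the same.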
Authors: Jaesook Cheong, Gisu Heo, Sangwon Kim, Ilkwon Jeong