The end-to-end broadcasting chain, from image source through programme production and delivery to display, may be illustrated as follows:
The broadcasting chain (indicative)
The implications for the following individual elements of the broadcasting chain should be considered:
7.1 Image source methods
There are three main approaches to sourcing 3D programme material in use today. These are: stereo camera, CGI, and conversion from 2D video.
Most 3D video is presently captured using stereo camera rigs. Some test footage has been captured using stereo cameras coupled with a rangefinder. Rangefinders are usually laser or infrared-based and attempt to provide depth maps for a given scene. These depth maps are prone to errors arising from poor accuracy, specular highlights, translucent or transparent objects, and reflections. Another capture method that has had some testing is the multi-camera rig, in which a large number of cameras provide a number of views. This method works well for capturing several views. However, the complexity of the rigs and the large amounts of data involved currently prohibit widespread use of multi-camera rigs.
Computer-generated content is typically considered the easiest method of stereo generation. The rendering system can render either a single view or several related views, depending on the application. In addition, the Z buffer, which represents the distance of the various objects from the screen, can be exported as a depth map. In either case, computer-generated data can be used for stereoscopic production or for multiview production.
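The export of a Z buffer as a depth map can be sketched as follows. This is a minimal illustration, not a description of any particular rendering system. It assumes a standard perspective projection in which the stored value runs from 0 at the near clipping plane to 1 at the far clipping plane, and an 8-bit depth map in which nearer objects receive larger values. The function names and the 8-bit convention are assumptions made for the example.

```python
def linearize_depth(d, near, far):
    """Convert a normalized Z-buffer value d in [0, 1] to a distance.

    Assumes a standard perspective projection: d = 0 corresponds to the
    near clipping plane and d = 1 to the far clipping plane.
    """
    z_ndc = 2.0 * d - 1.0
    return (2.0 * near * far) / (far + near - z_ndc * (far - near))


def depth_map_8bit(zbuffer, near, far):
    """Quantize a Z buffer (list of rows) to an 8-bit depth map.

    Assumed convention: 255 = near plane, 0 = far plane.
    """
    depth_map = []
    for row in zbuffer:
        out_row = []
        for d in row:
            z = linearize_depth(d, near, far)
            # Map distance linearly onto 0..255, nearest objects brightest.
            out_row.append(round(255.0 * (far - z) / (far - near)))
        depth_map.append(out_row)
    return depth_map
```

Because the stored Z-buffer values are non-linear in distance, the linearization step matters: most of the numeric range of the buffer is spent close to the near plane.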
Finally, 3D video can be created by taking conventional 2D video and adding depth information. The normal process is to deconstruct the 2D image into a series of objects (also known as segmentation), assign a relative depth to each object, and then fill in occluded areas. Human visual perception can also be exploited in the processes for converting from 2D to 3D. The creation of a depth map from a 2D image allows multiple views to be created through a rendering process that incorporates techniques for covering disoccluded regions.
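The rendering of a new view from a 2D image and its depth map can be sketched for a single scanline as follows. This is a deliberately simplified illustration: pixels are shifted horizontally in proportion to their depth, and each disoccluded hole is filled by repeating the neighbouring pixel that lies further back. The function name, the 0 to 255 depth convention (255 = nearest), the shift direction and the hole-filling rule are all assumptions for the example; practical systems use more elaborate interpolation or inpainting.

```python
def render_view(row, depth, max_shift):
    """Synthesize one scanline of a new view from a 2D scanline plus depth.

    row       -- list of pixel values for one scanline
    depth     -- list of depth levels, 0 (far) .. 255 (near), same length
    max_shift -- shift in pixels applied to the nearest objects
    """
    width = len(row)
    out = [None] * width
    out_depth = [-1] * width
    for x in range(width):
        tx = x + round(max_shift * depth[x] / 255.0)
        # When several pixels land on one target, the nearest one wins.
        if 0 <= tx < width and depth[x] > out_depth[tx]:
            out[tx] = row[x]
            out_depth[tx] = depth[x]

    # Fill disoccluded holes from the neighbour that is further away.
    x = 0
    while x < width:
        if out[x] is not None:
            x += 1
            continue
        start = x
        while x < width and out[x] is None:
            x += 1
        left = start - 1 if start > 0 else None
        right = x if x < width else None
        if left is None and right is None:
            break  # nothing was rendered on this scanline
        if left is None:
            src = right
        elif right is None:
            src = left
        else:
            src = left if out_depth[left] <= out_depth[right] else right
        for i in range(start, x):
            out[i] = out[src]
    return out
```

For example, `render_view([10, 10, 50, 50, 10, 10], [0, 0, 255, 255, 0, 0], 1)` moves the foreground object one pixel to the right and fills the uncovered pixel with the background value.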
7.2 Characteristics of signals in the studio
Without coding or compression, the baseband bandwidth required for a two-channel 3DTV system, with HD resolution for each eye, is twice that required for an HDTV system. However, the actual requirements will depend on the format of the signals of the 3DTV system.
– How much information is involved?
– Can signals be handled by existing equipment and interfaces?
– Would new interfaces be required?
The answers to the above questions can be expected to vary according to the form of the 3DTV system.
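The doubling of the uncompressed data rate for a two-channel system can be illustrated with simple arithmetic. The sketch below counts active picture samples only (no blanking or ancillary data). The example format, 1920 x 1080 at 25 frames per second with 4:2:2 sampling at 10 bits, is an assumption chosen for illustration and not a requirement taken from this text.

```python
def active_video_bit_rate(width, height, frame_rate,
                          bits_per_sample, samples_per_pixel):
    """Uncompressed active-picture bit rate in bit/s.

    samples_per_pixel is the average number of samples per pixel:
    2 for 4:2:2 (one luma sample plus, on average, one chroma sample).
    """
    return width * height * frame_rate * bits_per_sample * samples_per_pixel


# Assumed example format: 1920 x 1080, 25 frames/s, 4:2:2, 10 bits.
hd_rate = active_video_bit_rate(1920, 1080, 25, 10, 2)

# Two full-resolution views, one per eye.
stereo_rate = 2 * hd_rate

print(hd_rate / 1e9, "Gbit/s per view")      # 1.0368
print(stereo_rate / 1e9, "Gbit/s for both")  # 2.0736
```

Frame-compatible formats that pack both views into one HD frame avoid this doubling at the cost of resolution per eye, which is why the answer depends on the form of the 3DTV system.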
Some form of metadata is required for first-generation systems to ensure that the left-eye and right-eye views are correctly identified. This may be based on either explicit or implicit information. For example, in the side-by-side format, it has to be known whether the image on the left is the left-eye or the right-eye view, and which sampling structure is used. Synchronization of the left-eye and right-eye views is also needed to ensure that there are no errors in timing, such as with the above-below format. Some of these signals might be handled by existing equipment and interfaces, but others might not.
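The role of this metadata can be illustrated with a minimal side-by-side packing sketch. The field names in the metadata dictionary are hypothetical and do not correspond to any standardized signalling. The sketch also uses plain column decimation without pre-filtering, which a practical system would not do, and assumes an even picture width.

```python
def pack_side_by_side(left, right, left_view_first):
    """Pack two equally sized views (lists of rows) into one frame.

    Each view is reduced to half width by keeping every second column.
    Returns the packed frame and the metadata needed to unpack it.
    """
    first, second = (left, right) if left_view_first else (right, left)
    frame = [row_a[0::2] + row_b[0::2] for row_a, row_b in zip(first, second)]
    metadata = {
        "arrangement": "side_by_side",        # hypothetical field names
        "left_view_first": left_view_first,
        "sampling": "even_columns",
    }
    return frame, metadata


def unpack_side_by_side(frame, metadata):
    """Return (left, right) half-width views using the metadata."""
    half = len(frame[0]) // 2
    a = [row[:half] for row in frame]
    b = [row[half:] for row in frame]
    return (a, b) if metadata["left_view_first"] else (b, a)
```

Without the `left_view_first` flag, a receiver cannot tell the two halves apart and may present the views to the wrong eyes, which inverts the depth.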
Also, control signals are required for active eyewear that has to synchronize its operation with the view that is being displayed on the screen.
7.3 Programme production
Equipment is required to handle recording, editing, effects, and postproduction.
The effect of the introduction of first-generation 3DTV on existing Recommendations that apply in the studio production environment will need to be considered.
Suitable provision will need to be made for monitoring the quality of the 3DTV at the point of origination and at appropriate points in the production chain.
One UK satellite broadcaster’s technical guidelines, the prime objectives of which are to deliver content of both high technical quality and high production values, may be found at http://introducingsky3d.sky.com/a/bskyb-3d-tech-spec/ http://www.sky.com/shop/tv/3d/producing3d.
In the UK, the BBC has issued interim delivery requirements for programmes made in stereoscopic 3D. See: http://www.bbc.co.uk/guidelines/dq/pdf/tv/tv_delivery_to_network_programmes_v1.2-2011.pdf
Further study is required.
3DTV signals may need to be encoded in ways that are appropriate to their transmission within the existing 6/7/8 MHz terrestrial transmission channels, and also by existing broadcast satellite services.
Different techniques are likely to be required that are appropriate to each of these situations, and according to the requirements of the broadcaster as indicated in §§ 4 and 5.
There are known to be three fundamental approaches:
– viewer wears glasses;
– without glasses (auto-stereoscopic);
– head-mounted display.
With a head-mounted display, the left and right eyes are presented directly with the left-eye and right-eye images of a stereo pair. This may be appropriate for video games, but is unlikely to be appropriate for broadcast television: it is an individual viewing experience and is not suitable for collective (e.g. family) viewing.
Within these broad categories, various approaches may be possible. In many cases, 3DTV presentation relies on some form of eyewear or headgear that the viewer must wear in order to discriminate between left-eye and right-eye images:
– Anaglyph: a stereoscopic effect can be obtained when the presentation screen simultaneously displays two differently filtered coloured images (typically red for the right-eye image and cyan for the left-eye image). These are viewed through correspondingly coloured glasses. One difficulty with this solution is that the viewer may feel compelled to remove the coloured glasses when looking away from the presentation screen. In addition, the programme presentation will necessarily provide an inferior colour rendition.
– Polarized glasses: this solution makes use of cross-polarization for the right-eye and the left-eye images of a stereo pair; the images are watched through correspondingly cross‑polarized glasses. One way to display such cross-polarized images uses a “tiled” display of alternating tiles for the first and second image of a stereo pair. The tiled display is covered by an identically tiled polarized mask, with alternating tiles being cross‑polarized. When viewed through cross-polarized glasses, separate views are presented to the left and right eyes of the viewer. One problem with this solution is that the presentation of stereo images at HDTV resolution requires a more expensive display providing at least twice the horizontal resolution of HDTV.
– Shuttered glasses: the two images of a stereo pair are time-interleaved on the screen, and viewed through special glasses in which the left and right eye lenses are shuttered in turn, following the switching cycle of the left and right images on the screen.
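Of the eyewear-based approaches listed above, the anaglyph is the simplest to illustrate in code. The sketch below builds a red/cyan anaglyph by taking the red component from one view and the green and blue components from the other. Which view supplies the red component is left as a parameter, since it must match the glasses in use. The function name and the representation of pixels as (R, G, B) tuples are assumptions for the example.

```python
def make_anaglyph(left, right, red_from_left):
    """Combine two views (lists of rows of (r, g, b) tuples) into one image.

    The red component comes from one view; green and blue (cyan) come
    from the other. red_from_left selects which view supplies the red.
    """
    red_src, cyan_src = (left, right) if red_from_left else (right, left)
    return [
        [(p_red[0], p_cyan[1], p_cyan[2])
         for p_red, p_cyan in zip(row_red, row_cyan)]
        for row_red, row_cyan in zip(red_src, cyan_src)
    ]
```

Because each eye receives only part of the colour information, the sketch also makes plain why colour rendition is necessarily inferior with this method.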
While it seems inevitable that, at least to begin with (and possibly for many years), 3DTV viewing would require viewers to wear glasses, first-generation 3DTV broadcasts could nevertheless continue to be viewed on more advanced forms of autostereoscopic display as the technology progresses.
8 Production grammar
Poor-quality stereoscopic television could “poison the water” for everyone. There is a risk that 3DTV becomes associated with eye strain if stereoscopic content is poorly realized, whether due to inappropriate production grammar or due to inadequate technology for delivery. This has happened before in the cinema, in the 1930s, 1950s and 1980s.
The production grammar of 3D often differs from that of 2D productions. Special care has to be taken in order to achieve a good 3D viewing experience. This can lead to some compromises for the 2D viewer. In some cases, a production might be optimized for 3D, with no intention that the 3DTV version be used for conventional standard- or high-definition television presentation.
It is understood that various recent trial 3DTV productions have provided useful learning experiences, and it is expected that further knowledge will be gained through ongoing trial productions and from services that have recently become available to the public. Live 3DTV production presents particular challenges. Live production nevertheless forms a regular part of the schedule of the recently introduced 3D service from United Kingdom pay television operator BSkyB.
9 The viewing environment
The viewing environment has a fundamental effect on the perception of depth and on the quality of the overall viewing experience. The following situations should be considered:
– the studio environment;
– the home environment.
In particular, in conjunction with viewing distance, picture size and subtended viewing angle play a role in the three-dimensional effect as perceived by the viewer. This might have implications on the way in which 3DTV should be produced and displayed.
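The relationship between picture size, viewing distance and subtended viewing angle follows from simple geometry, sketched below for a viewer centred on the screen. The function name is hypothetical, and the example of a 16:9 picture viewed at three picture heights is an assumption chosen for illustration.

```python
import math


def horizontal_viewing_angle_deg(picture_width, viewing_distance):
    """Horizontal angle (degrees) subtended by the picture at the viewer.

    Both arguments must use the same unit. Assumes the viewer is centred
    on the screen and looking perpendicular to it.
    """
    return math.degrees(2.0 * math.atan(picture_width / (2.0 * viewing_distance)))


# Assumed example: a 16:9 picture of height 1, viewed at 3 picture heights.
angle = horizontal_viewing_angle_deg(16.0 / 9.0, 3.0)
print(round(angle, 1))  # about 33 degrees
```

The same programme therefore subtends a very different angle on a studio monitor, a home display and a cinema screen unless the viewing distance scales with picture size.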
10 Principles for comfortable viewing of stereoscopic three-dimensional images
10.1 Composite factors in perception of stereoscopic 3D images
A stereoscopic 3D system expresses depth by presenting video that has parallax with respect to the left and right eyes of the viewer. Perception depends on programme production techniques, display devices, 3D glasses, viewing conditions, and viewer characteristics. Consequently, visual fatigue associated with the viewing of stereoscopic 3D programmes is affected by the same combination of factors.
NOTE – Section 5 of Annex 4 describes the spatial distortion prediction system for 3DTV that calculates the spatial distortion of a reproduced stereoscopic image and predicts the extent of the puppet-theatre and cardboard effects, excessive binocular parallax, and excessive parallax distribution.
10.2 Measures to enable comfortable viewing of stereoscopic three-dimensional images
All parties concerned with stereoscopic 3DTV systems should take the above characteristics of stereoscopic 3D systems into account when manufacturing equipment, producing programmes, displaying video, and viewing video programmes. Solely regulating the amount of parallax in 3DTV programmes is not reasonable.
Due to the complexity of the end-to-end broadcast chain that involves many organizations and technologies, from capture, through production, mastering, broadcast, and reception to display, no single organization has end-to-end control over this effect.
10.2.1 Programme production
It may be useful to identify measures to help avoid the inadvertent creation of materials for transmission on broadcast television that would likely induce visual fatigue and other possible health hazards. Measures should be proportionate to the risks and should not place undue burdens on broadcasting organizations or programme producers. The impact of measures on broadcasters or programme producers may vary with their programme genres. For example, programme production is often beyond the control of the broadcaster in some live programming, such as news.
Broadcasting organizations should be encouraged to raise awareness among programme producers of the risks of creating stereoscopic television image content that may induce visual fatigue and create other possible health hazards in viewers of stereoscopic television broadcasts.
Producers of 3D programmes need to understand the characteristics of stereoscopic 3D images and the various effects of 3D video techniques.
10.2.2 Viewing environments and display devices
Viewing environments and display devices, which can affect the likelihood of problems, may differ between households, reflecting the style of living. Nevertheless, viewers should be well informed of satisfactory conditions for viewing stereoscopic 3D images. Example notifications given to viewers in Japan are described in Annex 7.
11 Psychophysical aspects of viewing stereoscopic images
11.1 Psychophysical aspects
Before attempting to implement new broadcast schemes, it is necessary to gain a full understanding of the results of psychophysical studies in order to understand the effects to which the viewer is subjected and the performance that is required of the main equipment in these systems. There are a number of issues to be studied before the effects of viewing three-dimensional images on human perception and visual functions can be fully understood.
Section 1 of Annex 4 identifies some key study items on the psychophysical aspects of stereoscopic television systems. It also includes the results of studies related to the naturalness and unnaturalness of stereoscopic video, the evaluation of visual comfort based on an analysis of parallax distributions within certain frames, and the visual fatigue caused by viewing stereoscopic video.
An important problem of current stereoscopic television, common to all the approaches currently implemented, is that stereoscopic images are presented on a single surface (the display screen). This gives rise to a potential conflict between “vergence” (the eye movement that points both eyes to the same point on the screen) and “accommodation” (the action by which the “lens” in the viewer’s eye focuses on that point). It has been documented in medical literature that this conflict can cause viewer discomfort, eye fatigue, headache and possibly other health hazards, notably if the viewing continues for an extended period of time [3, 4].
Such effects have been noticed in some recent trial broadcasts. For example, one of the terrestrial broadcasters in Korea, SBS (Seoul Broadcasting System), broadcast the South Africa World Cup soccer games in 3D from 11 June 2010. In a survey of nearly 100 viewers, 75% expressed satisfaction with the trial 3DTV broadcasting service. However, the survey also showed that 30% of the viewers experienced dizziness, double images and eye fatigue.
Factors that affect 3D viewing comfort also include inter-pupillary distance, intra-scene disparity range, and the speed of depth change of objects in the scene. In addition, rapid cuts between shots of differing depths, and changes of depth with zooms or pans, are known to cause viewer discomfort. Some of these techniques are widely used in 2D production but might cause discomfort when viewed in 3D. Because of these factors, 3D production techniques tend to create 2D video that 2D viewers might consider boring. This is the reason that many 3D productions to date have been different from the 2D productions of the same event or release. It is widely known that current 3D movie releases are editorially different from the 2D releases.
Parallax is affected by the programme production technique, display device, viewing conditions, and viewer characteristics (such as inter-pupillary distance). Accordingly, all parties concerned with stereoscopic 3D systems should take this characteristic into account when manufacturing equipment, producing programmes, displaying video, and viewing video programmes.
11.1.1 Geometrical relationships and naturalness
The reproduction of depth information is essential for people to gain a sense of three-dimensionality from a stereoscopic image. Depth distortion in the stereoscopic image can create an unnatural impression when the image is viewed.
In stereoscopic imaging, the object is imaged using two cameras. The arrangement of these two cameras can be classified into parallel configurations where the optical axes of the two cameras are parallel with each other, and intersecting “toed-in” configurations where the two optical axes are made to intersect. When using a parallel configuration, a stereoscopic image with no spatial distortion is obtained when the gap between the cameras is set equal to the gap between the pupils, the horizontal offset of the left and right images projected on the screen is equal to the gap between the pupils, and the camera’s angle of view is equal to the expected viewing angle of the display. In such cases, the image is said to be viewed under orthostereoscopic (distortion-free) conditions. When actual programme production and viewing conditions are taken into consideration, it is difficult to ensure that distortion-free conditions are always satisfied. If these conditions are not met, depending on the conditions, the spatial distortion of the stereoscopic image can cause unnatural effects such as the “puppet theatre” effect and “cardboard” effect. The puppet theatre effect is a phenomenon wherein the stereoscopic images of foreground objects appear unnaturally small. The cardboard effect is a phenomenon wherein the stereoscopic image of an object appears unnaturally thin.
Section 2 of Annex 4 presents a geometric analysis of reproduced stereoscopic image spaces and discusses the results and their relationship to the distortion of reproduced stereoscopic image spaces. The results of subjective evaluation tests that support these findings are also shown. The discussion relates to how the reproduced stereoscopic image space is affected by parameters such as the camera configuration (parallel or toed-in), display screen size, and viewing distance.
11.1.2 Visual comfort and discomfort in viewing stereoscopic images
Finding a way to make the visual comfort of stereoscopic images a measurable physical factor is arguably one of the key issues in stereoscopic imaging research. Stereoscopic images convey depth information to the viewer by making use of the parallax between the images presented to the left and right eyes. If we could ascertain how the magnitude and distribution characteristics of this parallax relate to the visual comfort of the image, this information would be very useful for the production of stereoscopic images.
How these parameters relate to the subjective visual comfort of stereoscopic images was studied by focusing on the average and the range of the parallax distributions. It was shown that both correlate with the visual comfort of stereoscopic images, and that the range of the parallax distribution in stereoscopic images appraised as visually comfortable was almost 60 pixels in an HDTV image. It was also suggested that stereoscopic images tend to become more visually comfortable when the average value of the parallax distribution approaches zero (i.e. at apparent positions closer to the display screen).
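The two quantities used in that study, the average and the range of the parallax distribution, can be computed from a list of per-pixel parallax values as sketched below. The function names are hypothetical. The 60-pixel figure is taken from the text as the range observed in HDTV images appraised as comfortable; it is used here only as an illustrative comparison value, not as a normative limit.

```python
def parallax_statistics(parallax_px):
    """Return (average, range) of a list of parallax values in pixels."""
    average = sum(parallax_px) / len(parallax_px)
    spread = max(parallax_px) - min(parallax_px)
    return average, spread


def within_reported_comfort_range(parallax_px, max_range_px=60):
    """Compare the parallax range against the roughly 60-pixel figure.

    Illustrative only: the figure describes images that viewers judged
    comfortable in one study and is not a pass/fail criterion.
    """
    _, spread = parallax_statistics(parallax_px)
    return spread <= max_range_px
```

For example, `parallax_statistics([-10, 0, 20, 30])` gives an average of 10 pixels and a range of 40 pixels; an average closer to zero would place the scene nearer the screen plane.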
In stereo 3D systems, a binocular 3D image is formed by presenting separate images to the right and left eyes. If discrepancies arise between these two images due to the systems used for production, storage, transmission or display, they can cause psychophysical stress, and in some cases 3D viewing can fail. For example, when shooting and displaying stereoscopic 3DTV programmes, there can be geometrical distortions between the left and right images, such as size inconsistency, vertical shift, and rotation error. It is desirable that these geometrical distortions be suppressed. Stereoscopic image cross-talk, in which the images “leak” and can be partially seen by the opposite eye, can also result in discomfort for the viewer. The detection and tolerance limits for visual discomfort due to cross-talk were reported to be highly dependent on image content and display contrast, and cross-talk must be reduced on high-contrast displays.
Section 3 of Annex 4 presents some results of subjective evaluation tests with regard to visual comfort in viewing stereoscopic images. The results indicate that a stereoscopic image having an excessive range of parallax distribution can be evaluated as uncomfortable to view. Research results on visual discomfort caused by discrepancies between left and right images are also shown. These indicate the detection and tolerance limits of such discrepancies with regard to visual discomfort in viewing stereoscopic images.
11.1.3 Visual fatigue in viewing stereoscopic images
One of the major factors of visual fatigue caused by viewing stereoscopic images is the difficulty in fusing left and right retinal images with large binocular parallax, which leads to increased viewer fusion effort. Fusion effort is based on two factors: the principle of stereoscopic display (defined by horizontal binocular parallax, inevitable in stereoscopic systems), and issues involving hardware (leading to differences between views of left and right images).
Another important aspect of stereo 3D systems is that a dissociation of vergence and accommodation can be a major factor of visual fatigue in viewing stereoscopic images. This is because of a difference in visual functions between viewing real objects and viewing stereoscopic images. The vergence point is positioned within the depth of field when viewing a real object. On the other hand, the vergence point is sometimes outside the depth of field when binocular parallax is large in viewing stereoscopic images. Temporally discontinuous changes in dissociation can also lead to visual fatigue.
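The dissociation between vergence and accommodation can be estimated from the screen parallax with a simple similar-triangles model, sketched below. The model assumes a viewer centred on the screen, accommodation fixed at the screen distance, all lengths in metres, and positive parallax meaning that the object appears behind the screen. The function names are hypothetical, and the depth-of-field tolerance against which the result would be compared is not specified in this text.

```python
def vergence_distance(parallax, eye_separation, viewing_distance):
    """Distance (m) at which the two lines of sight intersect.

    parallax > 0 places the object behind the screen, parallax < 0 in
    front of it. Parallax equal to or larger than the eye separation
    gives parallel or diverging lines of sight, reported as infinity.
    """
    if parallax >= eye_separation:
        return float("inf")
    return eye_separation * viewing_distance / (eye_separation - parallax)


def vergence_accommodation_mismatch(parallax, eye_separation, viewing_distance):
    """Difference (dioptres) between accommodation and vergence demand.

    Accommodation is assumed to stay at the screen distance.
    """
    z = vergence_distance(parallax, eye_separation, viewing_distance)
    return abs(1.0 / viewing_distance - 1.0 / z)
```

With zero parallax the vergence point lies on the screen and the mismatch is zero. As the magnitude of the parallax grows, the vergence point moves away from the screen and may fall outside the depth of field of the eye.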
Section 4 of Annex 4 presents some experimental results of subjective evaluation with regard to visual fatigue in viewing stereoscopic images. The results indicate that inconsistencies between vergence and accommodation can cause visual fatigue.
11.2 Examples of safety guidelines
In 2010, a liaison statement (see Document 6/316) was sent to the WHO requesting information on the potential impact of 3D viewing on health. The WHO recently replied after a reminder. It could not give any direct information from its own files, as it presently does not have a project on this topic. Guidance has nevertheless been made available by some national bodies.
11.2.1 Korea (Republic of)
A 3DTV Project Group (PG806) has been established in Korea with the aim of developing a 3DTV broadcasting specification and a viewing safety guideline. Working Group WG8062 is focusing on the development of a 3DTV viewing safety guideline covering display, content, viewing conditions, and viewer parameters.
TTA published the “3DTV Broadcasting Safety Guideline” in December 2010; see Annex 5. Its purpose is to present a way to reduce the visual fatigue and conflict associated with viewing stereoscopic content, and to promote related 3D industries by reducing potential risk factors in the viewing of stereoscopic content. The guideline is intended to present adequate circumstances for 3D viewing, notes for viewers, suitable conditions for the use of content, and display guidelines for safe viewing of 3D broadcasting services. It is planned to update the guideline to reflect the results of clinical research on 3DTV viewing.
Recent actions taken by the Italian Health Ministry, related to the use of 3D spectacles by the public attending cinema presentations of 3D movies, are described in Annex 6.
12 Assessment methodology
Although a method for subjective assessment of image quality and depth quality is provided by Recommendation ITU‑R BT.1438 – Subjective assessment of stereoscopic television pictures, the type and visibility of artefacts peculiar to stereoscopic images have yet to be systematically identified and studied. Furthermore, the various methodologies and formats have to be taken into consideration.
The development of an appropriate assessment methodology, in conjunction with a common set of reference source material, is of the utmost importance for evaluating 3DTV systems. It is understood that PSNR results might not be indicative of the effect of artefacts, and that new metrics will need to be considered. Major issues concern the identification of the factors that contribute to viewing discomfort and the development of proper metrics for measuring levels of discomfort. It is especially urgent to seek not only a metric for the measurement of viewing comfort, as this is a major concern for users and providers alike, but also a methodology for testing viewing comfort for both short-term and long-term viewing.
Visual comfort, image quality and depth quality are major perceptual dimensions that both users and programme providers are interested in. However, the value of routine testing of other perceptual dimensions, such as “presence”, “sensation of reality” and “naturalness”, should also be investigated. It is likely that new metrics and methodology are required. Methodology is also required to compare the performance of various approaches to the transmission of 3DTV signals and effects of bandwidth reduction.
13 User requirements
These are currently not fully understood.
At its May 2009 meeting, WP 6C decided to carry out a survey on the aspirations of the ITU Membership on 3DTV broadcasting. The survey was carried out between July 2009 and October 2009. All those who responded considered that there is a need to discuss with standards bodies, such as the IEC, the provision of minimum requirements for 3DTV receivers which match a future 3DTV broadcast system. In addition, all respondents considered it very important or essential for a 3DTV system to have the same format as packaged media (e.g. HDTV capacity discs).
14 Performance requirements
The overall performance requirements need to be identified in sufficient detail in order to orient the choice of the appropriate technologies for a new 3DTV system.
A preliminary list of possible requirements is given in Annex 9. This has been gleaned from contributions made to WP 6C since October 2008.
It is hoped that future contributions will provide clarifications and/or additional factors that should be considered.
15 Organizations with initiatives in 3DTV
A wide range of research bodies, standardization bodies and trade associations is currently active in investigating aspects of 3DTV. A non-exhaustive list is attached in Annex 1.
Without an orderly approach to the standardization of 3DTV broadcasting systems, even for an initial test phase, various de facto standards will become established. There is a risk that subsequent implementation of 3DTV broadcasting could become more difficult.
Furthermore, actions likely to be taken by the gaming and optical media (Blu-ray) industries could have a significant impact on the capabilities of widely deployed consumer equipment.
It is also not known what the consequences might be of decisions on the future 3D-capabilities of interfaces to displays if these are taken in the absence of agreed requirements for 3DTV broadcasting systems.
It is anticipated that guidance will be desirable covering the following:
– quality assessment methods for 3DTV systems;
– reference 3DTV source materials for use in subjective tests;
– requirements for the broadcasting chain;
– requirements for production and production grammar;
– psychophysical aspects related to viewing of stereoscopic images;
– requirements for first-generation 3DTV systems.
In addition, an important issue for further study is an understanding of bit-rate requirements for first-generation 3DTV broadcasting systems, for both the frame-based and compatible 2D approaches.
Referring to the matrix of signal formats described in Fig. 1, the most critical matrix points that might need to be standardized are the first-generation Level 2 and Level 4 points. These should be standardized to the maximum extent possible, and certainly with regard to signalling.
Another critical issue is to try to align the matrix with the formats used for packaged media.
Further contributions to WP 6C are invited on the above and related topics.
Organizations with current initiatives in 3DTV
1 ISO/IEC JTC1/SC29/WG11
In July 2009 it was planned to finalize the specification of carriage of MVC over MPEG‑2 systems, as well as extensions to the file format specifications to accommodate multiview video.
Work is also proposed to begin on a new 3D video (3DV) format that aims to support advanced stereoscopic display processing and auto-stereoscopic displays.
An amendment to ISO/IEC 14496‑10 includes a spatially interleaved frame supplemental enhancement information (SEI) message to signal the type of interleaving in a frame-based scheme.
2 ITU-T Study Group 9
ITU-T SG 9 has recently initiated work items to develop draft new Recommendations on subjective assessment methods for 3D video quality and on display requirements for 3D video quality assessment. In addition, studies are being progressed on scalable view-range representation for free viewpoint television (FTV).
3 ITU-T Study Group 16
The multiview coding extension of Recommendation ITU-T H.264 | ISO/IEC 14496-10 MPEG-4 AVC has proceeded to AAP Consent under the ITU-T Recommendation A.8 approval process.
4 3DTV – Network of Excellence
5 3D4You – Content generation and delivery for 3D television
3D4You was a project funded by the European Union under the information and communication technologies (ICT) Work Programme 2007-2008, a thematic priority for research and development under the specific programme “Cooperation” of the Seventh Framework Programme (2007-2013). Project website: http://www.3d4you.eu/.
The activities of the Society of Motion Picture & Television Engineers (SMPTE) include standardization work related to stereoscopic 3DTV in the production environment. SMPTE’s work is distributed among its various Technology Committees, Working Groups and Ad Hoc Groups.
One such activity has been completed and has resulted in a published standard (SMPTE ST 292‑2:2011 Dual 1.5 Gb/s Serial Digital Interface for Stereoscopic Image Transport) that defines a method of transporting stereoscopic images using two streams of 1.5 Gbit/s in conjunction with the means to identify each stream.
For each current activity related to stereoscopic 3DTV, the table below identifies the responsible SMPTE group, the scope of work, and its status as of September 2011. See also: https://www.smpte.org/.
Overview of SMPTE standardization activities related to stereoscopic 3DTV (September 2011)
Standards Committee Group
Technology Committee 35PM, Working Group on 3D Home Master
Stereoscopic Distribution Master
The Stereoscopic Distribution Master (formerly known as the ‘3D Home Master’) is intended to provide a standardized means for interchange of 3D content amongst mastering facilities, and between a mastering facility and the ingest facility of a distribution system. The Stereoscopic Distribution Master may feed various distribution outlets for 3D content to the home, including (but not limited to): mobile, Blu-ray/DVD, streaming, terrestrial, and cable/satellite broadcast.
The document includes a Glossary, and covers Image Structure, Subtitles, Captions and Graphical Overlays, and Metadata
FCD Final Committee Draft
10E Essence: AHG
ST 2066 Disparity Map Representation for Stereoscopic 3D
Identify requirements for a data representation of disparity maps relevant for production, post-production, and distribution of 3D content.
Work in progress
ST 2068 Frame Compatible
This document categorizes and enumerates the various methods to transport a pair of stereoscopic frames within single video frames. Unique identification codes are assigned to these methods in order to facilitate interchange of 3D Frame Compatible signals. The information contained can be used to confirm that proper encoder/decoder pairs are used. It can also set the H.264/MPEG-4 AVC Frame Packing Arrangement SEI message prior to delivery to end user terminal devices.
One purpose of this document is to carry the frame packing information necessary for the MPEG-4 AVC Frame Packing Arrangement SEI message. A second purpose is to carry information describing the frame packing arrangement that may be used in the production workflow.
The document will not enumerate the filtering used prior to sub-sampling, but optional suffix coding may be provided in some implementations in order to identify unique filtering on the basis of specific implementations.
Work in progress
32NF Networks and Infrastructure
ST 2063 Stereoscopic 3D Full Resolution Contribution Link – MPEG-2 TS
This document specifies how a codec-agnostic stereoscopic 3D video system based on the MPEG-2 Transport Stream (TS) performs coding, multiplexing, and decoding; any codec for which there are defined methods for transport via MPEG-2 TS is permitted. It defines constraints for the input image pair, the bitstream, the multiplexing, timing synchronization, and signalling, as well as for the video coding and decoder behaviour. The input image pair must have the same image structure (horizontal and vertical pixel count, scanning system, colorimetry, and frame rate) and be coincident in time.
Final Committee Draft
Dual 1.5 Gb/s Serial Digital Interface for Stereoscopic Image Transport
This standard defines a means of transporting stereoscopic images (Left eye and Right eye images) using an interface consisting of two links based on the SMPTE ST 292-1 data structure. The Left eye images are carried on one link of the interface and the Right eye images are carried on the other link. The stereoscopic image formats to be transported using this standard are the 4:2:2 10-bit image formats defined by SMPTE ST 274, ST 2048-2 and ST 296, which can be transported by a single SMPTE ST 292-1 serial interface. Audio and other associated ancillary data may also be transported. This standard also defines a payload identifier.
Source Image Format and Ancillary Data Mapping for Stereoscopic Image Formats on a single-link 3 Gb/s Serial Interface
This standard defines a means of transporting a stereoscopic image pair consisting of a Left Eye and Right Eye image (Le and Re) using an interface consisting of a single 3 Gb/s (nominal) link.
The stereoscopic image formats to be transported using this standard are those 4:2:2 10-bit image formats having a sampling frequency of 74.25 MHz or 74.25/1.001 MHz.
Audio and other associated ancillary data may also be transported. This standard also defines a payload identifier.
It is not necessary for implementations to include support for all formats defined in this standard to be compliant. Implementers should indicate supported formats in commercial publications.
Final Committee Draft
Dual 3 Gb/s Serial Digital Interface for Stereoscopic Image Transport
This standard defines a means of transporting stereoscopic images (Left eye and Right eye images) using an interface consisting of two streams based on the SMPTE ST 425-1 data structures. The Left eye images are carried on one stream of the interface and the Right eye images are carried on the other stream.
Work in progress.
Technology Committee 31FS
3D interleaved in MXF OP 1a
File format to standardize the transport of left and right eye images in frame interleaved MXF files for use in TV acquisition, contribution, distribution, station operations, and archives.
Work in progress.
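The nominal link rates named in the serial-interface entries of the table above can be checked with simple arithmetic. The following is a minimal sketch, assuming 4:2:2 sampling at 10 bits per sample and the 74.25 MHz sampling frequency quoted in the table; the function and variable names are illustrative only and do not come from any SMPTE document. Because the sampling clock runs through the blanking intervals, the result is the total interface rate rather than the active-picture payload.

```python
def serial_rate_bps(luma_sample_hz: float, bits_per_sample: int = 10) -> float:
    """Nominal serial bit rate of one 4:2:2 stream.

    In 4:2:2 sampling, Cb and Cr are each sampled at half the luma rate,
    so the total word rate is twice the luma sampling frequency.
    """
    words_per_second = luma_sample_hz * (1.0 + 0.5 + 0.5)
    return words_per_second * bits_per_sample


LUMA_SAMPLE_HZ = 74.25e6  # sampling frequency quoted in the table

one_eye_bps = serial_rate_bps(LUMA_SAMPLE_HZ)
stereo_pair_bps = 2 * one_eye_bps  # left-eye plus right-eye stream

print(f"one eye:     {one_eye_bps / 1e9:.3f} Gbit/s")
print(f"stereo pair: {stereo_pair_bps / 1e9:.3f} Gbit/s")
```

Under these assumptions one eye requires 1.485 Gbit/s and the pair 2.970 Gbit/s, which is why a stereoscopic pair occupies either two nominal 1.5 Gb/s links or a single nominal 3 Gb/s link.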
7 The Digital Video Broadcasting Project
Technical work in the Digital Video Broadcasting Project (DVB) is driven by commercial requirements. Following completion of a study mission to investigate the possible need for 3D activities, further activity led to the publication of “DVB commercial requirements for DVB-3DTV” (DVB Document A151, July 2010). This addresses “frame compatible” 3DTV services over HD broadcast infrastructures. Work has started on considering a second phase of “2D service compatible” commercial requirements.
On 17 February 2011 the DVB Steering Board approved the DVB-3DTV specification, which has been published as BlueBook A154 “Frame Compatible Plano-Stereoscopic 3DTV” (DVB‑3DTV). The specification has been sent to the European Telecommunications Standards Institute (ETSI) for formal standardization.
The specification defines the delivery system for frame compatible plano-stereoscopic 3DTV services, enabling service providers to utilize their existing HDTV infrastructures to deliver 3DTV services that are compatible with 3DTV capable displays already in the market. The system covers two use cases: a set-top box delivering 3DTV services to a 3DTV capable display device via an HDMI connection, and a 3DTV capable display device receiving 3DTV services directly via a built-in tuner and decoder.
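The frame compatible approach described above carries both views of a stereoscopic pair inside one frame of the existing HD format. The following minimal sketch illustrates the idea with side-by-side packing, used here only as one example of a frame compatible arrangement. The function names are hypothetical, frames are modelled as plain lists of pixel rows, and the horizontal sub-sampling is simple column decimation without any pre-filter, which is a simplification.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of pixel values


def pack_side_by_side(left: Frame, right: Frame) -> Frame:
    """Pack two equal-size views into one frame of the same size.

    Each view keeps every second column; the left view fills the left
    half of the packed frame and the right view fills the right half.
    """
    if len(left) != len(right):
        raise ValueError("views must have the same number of rows")
    for l_row, r_row in zip(left, right):
        if len(l_row) != len(r_row) or len(l_row) % 2 != 0:
            raise ValueError("rows must have equal, even width")
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def unpack_side_by_side(packed: Frame) -> Tuple[Frame, Frame]:
    """Split a packed frame back into half-width left and right views."""
    half = len(packed[0]) // 2
    return [row[:half] for row in packed], [row[half:] for row in packed]


left_view = [[1, 2, 3, 4], [5, 6, 7, 8]]
right_view = [[11, 12, 13, 14], [15, 16, 17, 18]]

packed_frame = pack_side_by_side(left_view, right_view)
half_left, half_right = unpack_side_by_side(packed_frame)
```

The packed frame has the same dimensions as either input view, so it can pass through a 2D HD chain unchanged, at the cost of half the horizontal resolution per eye.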
8 The Blu-ray Disc Association (BDA)
10 Consumer Electronics Association
The Consumer Electronics Association (CEA) has established a 3D Task Force. This is considering interfaces between consumer sources, sinks, repeaters, converters, and glasses. It is also considering what is needed for “3D READY” products. It is proposed to develop standards for 3D glasses, including interface, signalling, setup, control and polarization. A project is being considered to update CEA-861 to carry 3D content.
11 The 3D@Home Consortium
This comprises around 40 members, with the aim of speeding the commercialization of 3D video into homes worldwide.
12 Association of Radio Industries and Businesses
The Association of Radio Industries and Businesses (ARIB) established a working group in 2008 to research 3DTV broadcasting.
13 Ultra-realistic communications forum
The Ultra-realistic communications forum (URCF) is a forum established by organizations from industry, government and academia, with the aim of promoting R&D on ultra-realistic communications.
14 3D Consortium
The 3D Consortium was established in 2003 and comprises 47 members from the 3D industry. Its main focus is on stereoscopic 3D.
15 Consortium of 3-D image business promotion
The Consortium of 3-D image business promotion was established in 2003 and comprises 49 members.
16 Japanese Ergonomics National Committee
The Japanese Ergonomics National Committee (JENC) is in charge of the national preparation for ISO TC159.
17 Telecommunications Technology Association
In Korea, the Telecommunications Technology Association (TTA) has established a 3DTV Project Group (PG806) to develop a 3DTV broadcasting specification and a viewing safety guideline. The 3DTV PG consists of two Working Groups: WG8061 for the development of the 3DTV broadcasting specification and WG8062 for the development of the 3DTV viewing safety guideline.
TTA subsequently published “3DTV Broadcasting Safety Guideline” in December 2010 (see Annex 5).
18 European Broadcasting Union (EBU)
The EBU (http://www.ebu.ch) has produced a 3D briefing document for senior management. This is reproduced in full in Annex 8.
In addition, EBU Recommendation R 135 “Production & Exchange Formats for 3DTV Programmes” (August 2011) provides interim recommendations for EBU Members who are required to produce, exchange, archive and distribute 3D programmes using 2D infrastructure and transmission technologies, see: http://tech.ebu.ch/docs/r/r135.pdf.
19 MUSCADE
MUSCADE is a collaborative project co-funded by the European Commission’s Seventh Framework Programme - ICT under the theme “Networked Media and 3D Internet”. It aims to create major innovations in three areas: production equipment and tools; production, transmission and coding formats that allow technology-independent adaptation to any 3D display and the transmission of multiview signals at no more than double the data rate of monoscopic TV; and robust transmission schemes for 3DTV over all existing and future broadcast channels. See: http://www.muscade.eu/index.html.
20 3D VIVANT
The 3D VIVANT project is investigating the generation of a novel true 3D video technology, based on mixed 3D Holoscopic video content capture and associated manipulation, and display technologies. This project is supported by the European Commission through the Information & Communication Technologies programme. See: http://www.3dvivant.eu/.
Historical background on the development of stereoscopic and 3D television systems
Document 6C/92 describes the present state of three-dimensional (3D) TV broadcasting studies in the Russian Federation:
Introduction to free viewpoint television
See Annex 1 to Annex 6 to Document 6C/69:
Psychophysical studies on three dimensional television systems
Before attempting to implement new broadcast schemes, we must gain a full understanding of the results of psychophysical studies in order to understand the effects to which the viewer is subjected and the performance that is required of the main equipment in these systems.
There are a number of issues to be studied before we can fully understand the effects of viewing three dimensional images on human perception and visual functions. For the success of three-dimensional television broadcasting, all parties concerned, including broadcasters, producers, manufacturers, and regulators, should be well informed of the effects.
The psychophysical aspects of viewing stereoscopic images have been extensively studied. This Annex provides some key study items and the study results on the psychophysical aspects of stereoscopic television systems. It also describes the spatial distortion prediction system for 3DTV that calculates the spatial distortion of a reproduced stereoscopic image and predicts unnatural size distortion, excessive binocular parallax, and excessive parallax distribution on the basis of the shooting, display, and viewing conditions.
1 Key items for psychophysical studies
The following sections describe the key items for which further study is encouraged:
1.1 Naturalness and unnaturalness of images
1) Theoretical analysis of spatial reproduction characteristics of images taken by 3D cameras
It is of fundamental importance to understand precisely how a real space is converted into a stereoscopic image space by a camera. In particular, the reproducibility of a stereoscopic image space should be analysed in terms of different settings of the lens axes of 3D cameras.
2) Size distortion
The reproduction magnification ratio of an object at the shooting distance (the perceived size) varies with the imaging and display conditions. The resulting distortion in size may cause an object to be perceived as unnaturally small; this is called the “puppet theatre” effect.
3) Depth distortion
The imaging and display conditions may reduce the reproduction magnification ratio in the depth direction, so that objects are perceived as flat, with no visible thickness. This is called the “cardboard” effect.
Section 2 describes the study results on the naturalness and unnaturalness of stereoscopic images.
1.2 Viewing comfort and discomfort
1) Differences in size, verticality, inclination and brightness, and cross-talk
Viewers may not feel comfortable viewing left and right images that have size, verticality, inclination, and brightness differences. Cross-talk between the left and right images may also have an impact on viewing comfort.
2) Psychological factors and the parallax distribution
The fundamental relationship between psychological effects brought about by 3D images and factors related to fatigue should be studied. In particular, “ease of viewing” and “sense of presence” may be key psychological factors. Attention should be paid to the distribution of parallaxes in the stereoscopic images. From the correlations between psychological factors and the parallax distribution, we can grasp the essential characteristics of stereoscopic images, e.g., the sense of presence they convey and their ease of viewing (visual discomfort).
3) Superimpositions within 3D images
With regard to a superimposition in a two-dimensional image, we only have to think about exactly where to display it on the screen. In the case of a stereoscopic image, however, we also need to pay attention to the depth of the superimposition. If we can find a preferred position for superimpositions in stereoscopic images, we will be able to use it in actual program production.
4) Change in parallax distribution during scene changes
The parallax distribution of stereoscopic images is discontinuous during scene-change frames, where the scene depth and perceived convergence distance change. We need to evaluate how these changes affect the visual discomfort experienced during viewing of stereoscopic images.
Section 3 describes the study results on the viewing comfort and discomfort of stereoscopic images.
1.3 Visual fatigue caused by parallax 3DTV viewing
Visual fatigue caused by viewing stereoscopic motion images is a particular safety concern. Viewers’ repeated adaptation to the discrepancy between eye convergence and accommodation causes a decline in their visual functions and results in visual fatigue.
Section 4 describes the study results on the visual fatigue caused by viewing stereoscopic images.
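The discrepancy between convergence and accommodation mentioned above can be illustrated numerically. The following is a minimal sketch based on similar-triangle geometry, assuming that the eyes accommodate to the screen plane while converging on the perceived position of the object. The function names, the sign convention (positive parallax places the object behind the screen) and the example values are assumptions made for illustration.

```python
def perceived_distance_m(viewing_distance_m: float,
                         eye_separation_m: float,
                         screen_parallax_m: float) -> float:
    """Distance at which the lines of sight of the two eyes intersect.

    screen_parallax_m is the horizontal offset of the right-eye image
    relative to the left-eye image on the screen: positive (uncrossed)
    for objects behind the screen, negative (crossed) for objects in front.
    """
    if screen_parallax_m >= eye_separation_m:
        return float("inf")  # lines of sight are parallel or diverging
    return (viewing_distance_m * eye_separation_m
            / (eye_separation_m - screen_parallax_m))


def mismatch_dioptres(viewing_distance_m: float,
                      eye_separation_m: float,
                      screen_parallax_m: float) -> float:
    """Difference between accommodation demand (screen) and convergence demand."""
    convergence_distance = perceived_distance_m(
        viewing_distance_m, eye_separation_m, screen_parallax_m)
    return abs(1.0 / viewing_distance_m - 1.0 / convergence_distance)


# Example values (assumed): 3 m viewing distance, 65 mm eye separation,
# crossed parallax equal in magnitude to the eye separation.
example_distance = perceived_distance_m(3.0, 0.065, -0.065)
example_mismatch = mismatch_dioptres(3.0, 0.065, -0.065)
```

With zero parallax the object is perceived on the screen and the mismatch is zero; the mismatch grows as the magnitude of the parallax increases.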
1.4 Individual differences in the stereopsis function
Visual functions vary greatly from person to person, so it is essential to understand that there are individual differences before 3D broadcasts begin. For instance, there are limits to the binocular parallax of left and right images which a person can fuse into one image; when the parallax exceeds these limits, a double image is perceived. In this situation, depth perception collapses and viewing becomes extremely uncomfortable. For this reason, it is necessary to know the range of binocular parallax over which two images can be fused into one. However, individual differences are vast and will necessitate a study of the stereopsis function of many people.
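A check of binocular parallax against a fusion limit could be sketched as follows. This is an illustration only: the text gives no numerical limit, so the limit is a caller-supplied parameter and the value used in the example is a placeholder, not a recommendation. The function names, the default eye separation of 65 mm and the definition of parallax angle (convergence angle for the screen minus convergence angle for the displayed point, for a point at the centre of the screen) are likewise assumptions.

```python
import math


def parallax_angle_deg(disparity_px: float,
                       horizontal_px: int,
                       screen_width_m: float,
                       viewing_distance_m: float,
                       eye_separation_m: float = 0.065) -> float:
    """Angular parallax of a centred point, in degrees.

    Positive disparity (right-eye image to the right of the left-eye
    image) gives a positive angle, i.e. a point perceived behind the screen.
    """
    parallax_m = disparity_px * screen_width_m / horizontal_px
    screen_angle = 2.0 * math.atan(
        eye_separation_m / (2.0 * viewing_distance_m))
    object_angle = 2.0 * math.atan(
        (eye_separation_m - parallax_m) / (2.0 * viewing_distance_m))
    return math.degrees(screen_angle - object_angle)


def is_fusible(angle_deg: float, limit_deg: float) -> bool:
    """True if the magnitude of the parallax angle is within the given limit."""
    return abs(angle_deg) <= limit_deg


PLACEHOLDER_LIMIT_DEG = 1.0  # hypothetical value, not taken from the text

small = parallax_angle_deg(20, 1920, 1.0, 3.0)
large = parallax_angle_deg(-200, 1920, 1.0, 3.0)
print(is_fusible(small, PLACEHOLDER_LIMIT_DEG),
      is_fusible(large, PLACEHOLDER_LIMIT_DEG))
```

Because individual differences are large, a practical check would treat the limit as a distribution over viewers rather than a single threshold.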
1.5 Effect on young people
We must bear in mind that young people’s sense of sight changes as they mature. Viewing stereoscopic images may affect their visual functions differently from those of adults. It may be advisable to caution young children against viewing stereoscopic images for extended periods of time.
2 Naturalness and unnaturalness of stereoscopic images − Geometrical analysis of spaces reproduced by stereoscopic images
2.1 Theoretical analysis of reproduced spaces
A basic requirement for the design of stereoscopic systems is an understanding of the transformation from real space (the space in which an actual object exists) to reproduced stereoscopic image space (the representation of this space in a stereoscopic image). In this section, we analyse the distortion of reproduced stereoscopic image space on the basis of image shooting and display system parameters⁵.
2.1.1 Model of shooting/display systems
The configurations of the image shooting and display systems analysed here conform to the parameters shown in Fig. 3. The details of these parameters are shown in Table 1.
[Fig. 3: (a) shooting system; (b) display system]
Shooting and display systems can typically be configured in two different ways depending on how the optical axes are arranged.
Parallel configurations (where the optical axes of the two cameras are parallel to each other) are characterized by the fact that objects at infinity are displayed at infinity when a constant horizontal separation Hc is maintained between the left and right images on display (see Fig. 4). As a special case, when the separation dc between the cameras and the horizontal offset Hc between the left and right images are both equal to the separation de between the viewer’s pupils, and the lens angle of view θb is equal to the angle of view of the display screen θd, the real space is in theory reproduced without distortion in the stereoscopic image space⁶. However, it is not always possible to satisfy this condition in broadcasting, where a wide variety of subjects are liable to be viewed under widely varying conditions.
Another optical axis configuration is the so-called toed-in configuration wherein the optical axes of the two cameras intersect (see Fig. 5). This configuration is characterized such that an object situated at the intersection of the optical axes appears at the depth position of the screen on which its stereoscopic image is displayed. It is also relatively easy to present a sense of depth for objects in the space in front of and behind the object at the intersection of the optical axes. By virtue of these characteristics, this method appears to be used in most stereoscopic programs.
Parameters in shooting and display models
Position of a stereoscopic object
Angles of view of lens
Camera convergence angle
Convergence angle of eye
Horizontal gap between L and R images
Width of screen
Width of virtual screen at the viewing distance in the shooting model
Distance from the centre of the virtual screen at the viewing distance in the shooting model (see Fig. 3)
(distortion-free when dc = de = Hc and θb = θd)
2.1.2 Depth distance in real space and stereoscopic image space
If an object’s depth position in real space (the environment where images are shot) and stereoscopic image space (the reproduced environment) are Lb and Ld, respectively, then the relationship between these values obeys the following formula using the parameters of Table 1 and the geometrical relationship of the system configuration shown in Fig. 3.
In a parallel configuration, we can set Lc→∞ and Hc = de. In a toed-in configuration, we can set Hc = 0.
Table 2 shows the results of using equation (1) to investigate how the depth position Ld in the reproduced image and the actual depth position (the original camera-to-object distance) Lb are expressed in systems with parallel and toed-in configurations.
In Table 2, no consideration is given to the keystone distortion of the image shape that occurs in toed-in configurations. In other words, this table shows the characteristics at the centre of the image where keystone distortion has little effect.
In the parallel configuration, Lb and Ld obey a proportional relationship regardless of the parameter settings. On the other hand, in the toed-in configuration, Lb and Ld are equal only for a certain specific combination of parameters (Lc = a1·a2·Ls), but otherwise have a non-linear relationship. The graph in Table 2 indicates that different characteristics are exhibited depending on the relative sizes of Lc and a1·a2·Ls.
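The behaviour of the two configurations can be explored with a short numerical sketch. It assumes that the depth relationship takes the form 1/Ld = 1/Ls − a1·a2/Lc + a1·a2/Lb − Hc/(Ls·de), with a1 = dc/de and a2 = tan(θd/2)/tan(θb/2). This form and these definitions are assumptions obtained from a small-angle, centre-of-image reading of the geometry; they are not quoted from the text, and the function names are hypothetical.

```python
import math


def reproduced_depth(shooting_distance, viewing_distance, convergence_distance,
                     image_offset, eye_separation, a1, a2):
    """Depth Ld in the reproduced space for an object shot at distance Lb.

    Assumed form (not quoted from the text):
        1/Ld = 1/Ls - a1*a2/Lc + a1*a2/Lb - Hc/(Ls*de)
    """
    inverse_depth = (1.0 / viewing_distance
                     - a1 * a2 / convergence_distance
                     + a1 * a2 / shooting_distance
                     - image_offset / (viewing_distance * eye_separation))
    if inverse_depth <= 0.0:
        return math.inf  # lines of sight are parallel or diverging
    return 1.0 / inverse_depth


def parallel_depth(shooting_distance, viewing_distance, eye_separation, a1, a2):
    """Parallel configuration: Lc tends to infinity and Hc = de."""
    return reproduced_depth(shooting_distance, viewing_distance, math.inf,
                            eye_separation, eye_separation, a1, a2)


def toed_in_depth(shooting_distance, viewing_distance, convergence_distance,
                  eye_separation, a1, a2):
    """Toed-in configuration: Hc = 0."""
    return reproduced_depth(shooting_distance, viewing_distance,
                            convergence_distance, 0.0, eye_separation, a1, a2)


# Example values (assumed): Ls = 3 m, de = 65 mm, a1 = 1, a2 = 2.
ld_parallel = parallel_depth(10.0, 3.0, 0.065, 1.0, 2.0)
ld_toed_in = toed_in_depth(10.0, 3.0, 6.0, 0.065, 1.0, 2.0)  # Lc = a1*a2*Ls
```

Under the assumed form, the parallel case reduces to Ld = Lb/(a1·a2). The toed-in case with Lc = a1·a2·Ls gives the same linear mapping, which becomes Ld = Lb when a1·a2 = 1, and an object at the convergence distance is reproduced at the screen distance Ls when a1·a2 = 1.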
Distances in real space and in reproduced stereoscopic image space