International Telecommunication Union




7 Test Methods and Experimental Design


Measurement of the perceived quality of images requires the use of subjective scaling methods. Such measurements are meaningful only if there is a relation between the physical characteristics of the "stimulus" (in this case, the 3D video sequence presented to the subjects in a test) and the magnitude and nature of the sensation caused by the stimulus. The final choice among the methods described in this section depends on several factors, such as the context, the purpose of the test, and the stage of the development process at which the test is performed.

3D subjective experiments may measure opinions on three different perceptual scales:

visual experience

image quality

visual comfort

These perceptual scales must be rated independently. For example, the subject might watch all videos during one session and rate visual experience; then the subject might watch all videos a second time during a different session and rate visual comfort. Other perceptual scales may be of interest (e.g., perceived amount of depth).
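The session structure described above can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name plan_sessions, the clip names, and the per-session shuffling of the play order are assumptions; the scale names are those listed for the 3D case in 7.4.

```python
import random

# Scale names follow the 3D case described in 7.4; the clip names
# further down are hypothetical placeholders.
PERCEPTUAL_SCALES = ["visual experience", "image quality", "visual comfort"]


def plan_sessions(sequences, scales=PERCEPTUAL_SCALES, seed=0):
    """Return one (scale, play_order) entry per session.

    Every session shows all sequences and collects ratings on exactly
    one perceptual scale. Shuffling the order per session is an
    assumption of this sketch.
    """
    rng = random.Random(seed)
    sessions = []
    for scale in scales:
        order = list(sequences)
        rng.shuffle(order)
        sessions.append((scale, order))
    return sessions


clips = ["clip_A", "clip_B", "clip_C", "clip_D"]
for scale, order in plan_sessions(clips):
    print(scale, order)
```

Keeping one scale per session means a subject never has to switch between rating criteria while watching.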

This section describes the test methods, rating scales and allowable deviations. The method controls how the sequences are presented. The rating scale controls the way that people indicate their opinion of the sequences. A list of appropriate changes to the method follows.



7.1 Absolute Category Rating (ACR) Method


The Absolute Category Rating (ACR) method is a category judgment in which the test sequences are presented one at a time and are rated independently on a category scale. ACR is a Single Stimulus method. The subject observes one sequence, then has time to rate that sequence.

The ACR method uses the following five-level rating scale:

5 Excellent

4 Good


3 Fair

2 Poor


1 Bad

The numbers may optionally be displayed on the scale.
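As an illustration of how ratings on this scale might be summarised, the sketch below averages the ratings given to one sequence. This section does not prescribe an analysis; taking the mean (often called a mean opinion score) is an assumption here, and the function name and sample ratings are hypothetical.

```python
from statistics import mean

# The five-level ACR scale given above.
ACR_LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def mean_opinion_score(ratings):
    """Average the ACR ratings (integers 1..5) given to one sequence."""
    for r in ratings:
        if r not in ACR_LABELS:
            raise ValueError(f"not an ACR rating: {r!r}")
    return mean(ratings)


# Hypothetical ratings from six subjects for one test sequence.
ratings = [4, 5, 3, 4, 4, 2]
print(mean_opinion_score(ratings))
```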




Comments

The ACR method produces a high number of ratings in a brief period of time.

ACR ratings confound the impact of the impairment with the influence of the content upon the subject (e.g., whether the subject likes or dislikes the production quality of the sequence).

7.2 Degradation Category Rating (DCR) Method


The Degradation Category Rating (DCR) method presents sequences in pairs. The first stimulus presented in each pair is always the reference. The second stimulus is that reference sequence after being impaired by the systems under test. DCR is a Double Stimulus method.

In this case the subjects are asked to rate the impairment of the second stimulus in relation to the reference. The following five-level scale for rating the impairment should be used:

5 Imperceptible

4 Perceptible but not annoying

3 Slightly annoying

2 Annoying

1 Very annoying

The numbers may optionally be displayed on the scale.




Comments

The DCR method produces fewer ratings than ACR in the same period of time (e.g., slightly more than one-half as many).

DCR ratings are minimally influenced by the subject’s opinion of the content (e.g., whether the subject likes or dislikes the production quality). Thus, DCR is able to detect color impairments and skipping errors that the ACR method may miss.
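The "slightly more than one-half" figure can be reproduced with simple arithmetic. The timings below (10 s clips, 8 s voting, a 2 s gap between the two clips of a pair) are assumptions chosen for illustration and are not taken from this section; the function name is hypothetical.

```python
def ratings_per_hour(clip_s, vote_s, clips_per_trial=1, gap_s=0.0):
    """Ratings collected per hour when each trial yields one rating."""
    trial_s = clips_per_trial * clip_s + (clips_per_trial - 1) * gap_s + vote_s
    return 3600.0 / trial_s


# Assumed timings (not from this section): 10 s clips, 8 s voting,
# and a 2 s gap between the two clips of a DCR pair.
acr = ratings_per_hour(clip_s=10, vote_s=8)
dcr = ratings_per_hour(clip_s=10, vote_s=8, clips_per_trial=2, gap_s=2)
print(f"ACR {acr:.0f}/h, DCR {dcr:.0f}/h, ratio {dcr / acr:.2f}")
# prints: ACR 200/h, DCR 120/h, ratio 0.60
```

Under these assumed timings a DCR trial lasts 30 s against 18 s for ACR, so DCR yields 60% as many ratings in the same time. The same reasoning applies to any paired presentation.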

DCR ratings may contain a slight bias. This occurs because the reference always appears first, and people know that the first sequence is the reference.


7.3 Comparison Category Rating (CCR) Method


The Comparison Category Rating (CCR) method presents the test sequences in pairs. Two versions of the same stimulus are presented in a randomized order (e.g., the reference is shown first 50% of the time and second 50% of the time). CCR is a Double Stimulus method. CCR may be used to compare source video with impaired video, or to compare two different impairments.
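One way to obtain the 50/50 split mentioned above is to fix the number of "reference first" pairs at one-half and randomize only which pairs receive that order. The sketch below assumes this balanced design; the function name and pair identifiers are hypothetical.

```python
import random


def assign_ccr_orders(pair_ids, seed=0):
    """Decide, per pair, whether the reference is shown first.

    Exactly half of the pairs (rounded down) get the reference first;
    which pairs those are is chosen at random.
    """
    rng = random.Random(seed)
    flags = [i < len(pair_ids) // 2 for i in range(len(pair_ids))]
    rng.shuffle(flags)
    return dict(zip(pair_ids, flags))


# Hypothetical pair identifiers.
orders = assign_ccr_orders(["p1", "p2", "p3", "p4", "p5", "p6"])
for pair, ref_first in orders.items():
    print(pair, "reference first" if ref_first else "reference second")
```

Drawing each order independently with a coin flip would also be random, but with few pairs it can drift far from an even split.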

The subjects are asked to rate the impairment of the second stimulus in relation to the first stimulus. The following seven-level scale for rating the impairment should be used:

-3 Much Worse

-2 Worse

-1 Slightly Worse

0 The Same

1 Slightly Better

2 Better

3 Much Better

The numbers may optionally be displayed on the scale.

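During data analysis, the effect of the randomized order of presentation must be removed, so that every score expresses the same direction of comparison (e.g., impaired video relative to the reference).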
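A minimal sketch of this step, assuming each vote is stored together with a flag recording whether the reference was shown first (the function name and data layout are hypothetical). Because the subject rates the second stimulus against the first, a vote cast with the reference shown second has its sign flipped. When two impairments are compared, one of them plays the role of the reference here.

```python
def normalise_ccr_score(score, reference_first):
    """Express a CCR score as 'impaired relative to reference'.

    The subject rates the second stimulus against the first. When the
    reference came second, the raw score describes the reference
    relative to the impaired clip, so its sign is flipped.
    """
    if score not in range(-3, 4):
        raise ValueError(f"not a CCR score: {score!r}")
    return score if reference_first else -score


# Hypothetical raw votes: (score, reference_first)
votes = [(-2, True), (2, False), (0, True), (1, False)]
print([normalise_ccr_score(s, rf) for s, rf in votes])
# prints: [-2, -2, 0, -1]
```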


Comments

The CCR method produces fewer ratings than ACR in the same period of time (e.g., slightly more than one-half as many).

CCR ratings are minimally influenced by the subject’s opinion of the content (e.g., whether the subject likes or dislikes the production quality).

Test subjects will occasionally swap their ratings by mistake when using the CCR scale (e.g., mark “Much Better” when intending to mark “Much Worse”). This is unavoidable due to human error. These unintentional score-swapping events introduce a type of error into the subjective data that is not present in ACR and DCR data.

The accuracy of CCR is influenced by the randomized presentation of stimuli one and two. For example, when comparing source and degraded video, if the source stimulus is presented first 90% of the time, then CCR will contain the same bias seen in the DCR method.

7.4 Subjective Assessment Methodology for Video Quality (SAMVIQ)

The SAMVIQ method defined in [ITU-R BT.1788] is commonly used to subjectively rate 2D video. The SAMVIQ method is also appropriate for use in measuring 3D subjective quality.

This method provides a global quality score for sequences of short display duration (10 s to 20 s). It is inspired by the DSCQS (Double Stimulus using a Continuous Quality Scale) method. SAMVIQ is a multi-stimulus method: several sequences to evaluate are directly accessible (e.g., played upon request).

SAMVIQ is able to discriminate among low-quality sequences as well as among high-quality sequences. For this purpose, it combines subjective evaluation capabilities with the ability to discriminate between sequences of similar quality, using an implicit comparison process. The subject can compare each sequence under test with the reference (i.e., the 3D reference sequence without any treatment) and with the other versions of the 3D source. The SAMVIQ method includes random access to the sequence files. Viewers can start or stop the evaluation, and give, change or keep the current score of each clip whenever they want. Additionally, they can replay sequences as often as they want.

The SAMVIQ quality evaluation method uses a continuous quality scale to provide a measurement of the intrinsic quality of video sequences. Each viewer moves a slider on a continuous scale graded from 0 to 100 and annotated with 5 linearly spaced quality items (excellent, good, fair, poor, bad). In the 3D case, three different perceptual scales are used: visual experience, image quality and visual comfort.

Each perceptual scale is rated during a different session.
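The scoring behaviour described above (a 0 to 100 slider, scores that may be changed at any time, one perceptual scale per session) can be sketched with a small data structure. The class name, method names and clip identifiers are hypothetical; this is an illustration and not part of [ITU-R BT.1788].

```python
class SamviqScoreSheet:
    """Slider scores (0..100) for one viewer, one scene, one perceptual scale."""

    def __init__(self, scale_name):
        self.scale_name = scale_name
        self._scores = {}

    def set_score(self, sequence_id, value):
        """Record or overwrite the score for a sequence."""
        if not 0 <= value <= 100:
            raise ValueError("SAMVIQ scores lie between 0 and 100")
        self._scores[sequence_id] = float(value)

    def final_scores(self):
        """Return the scores as they stand when the viewer finishes."""
        return dict(self._scores)


sheet = SamviqScoreSheet("visual comfort")
sheet.set_score("reference", 90)
sheet.set_score("codec_A", 55)
sheet.set_score("codec_A", 62)  # viewer replays the clip and revises the score
print(sheet.final_scores())
# prints: {'reference': 90.0, 'codec_A': 62.0}
```

Only the last value entered for each sequence is kept, which mirrors the viewer's freedom to change a score after further comparisons.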



Comments

The main value of the SAMVIQ method for 3D video subjective quality assessment is that it improves rating accuracy for viewers who have little experience viewing 3D content. Moreover, SAMVIQ increases the accuracy of each viewer's results (e.g., fewer judgment errors). This leads to more reliable results in terms of statistical analysis.


