Audio Engineering Society

Convention Paper 8997

Presented at the 135th Convention

2013 October 17–20 New York, NY, USA

An Objective Comparison of Stereo Recording Techniques Through the Use of Subjective Listener Preference Ratings

Lim, Wei

University of Michigan, Ann Arbor, Michigan, 48109, USA

[email protected]

ABSTRACT

Stereo microphone techniques offer audio engineers the ability to capture a soundscape that approximates how one might hear realistically. To illustrate the differences between six common stereo microphone techniques, namely XY, Blumlein, ORTF, NOS, AB and Faulkner, I asked 12 study participants to rate recordings of a Yamaha Disklavier piano. I examined the inter-rating correlation between subjects to find a preferential trend towards near-coincidental techniques. Further evaluation showed that there was a preference for clarity over spatial content in a recording. Subjects did not find that wider microphone placements provided for more spacious-sounding recordings. Using this information, this paper also discusses the need to re-evaluate how microphone techniques are typically categorized by distance between microphones.

1. INTRODUCTION

1.1. Background

Stereo microphone and playback techniques were specifically developed to allow for a faithful representation of the sound field a listener normally perceives. Many commonly employed techniques evolved from the principles of human auditory perception. Recent papers published in major journals have focused on the advancement of encoding and decoding in playback systems. Detailed further research into two-microphone stereo recording techniques, beyond the initial conceptualization of the techniques, is lacking.

Olive coined the term “circle of confusion”, whereby any one element of the entire sound reproduction procedure, namely recording, mixing, duplicating and playback, could be a cause of misrepresentation in reproduced sounds [1]. It is therefore vital that we understand every step of the process, including the very first stage of sound reproduction: the recording.

This Convention paper was selected based on a submitted abstract and 750-word précis that have been peer reviewed by at least two qualified anonymous reviewers. The complete manuscript was not peer reviewed. This convention paper has been reproduced from the author's advance manuscript without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.

1.2. Aim and Hypothesis

A survey of existing literature shows a shortage of direct comparisons of how stereo microphone techniques differ from each other, and of the advantages of using one technique over another. This paper hopes to fill the gap by examining the results of a formal, blinded listening test, conducted in a highly controlled environment, in which participants rated their relative preferences for otherwise identical recordings made with different stereo microphone techniques.

The microphone techniques examined in this experiment were XY, Blumlein, ORTF, NOS, modified Faulkner (at 60 cm apart) and AB (at 60 cm apart), with all recordings made using a pair of AKG C414-ULS microphones. These techniques attempt to simulate human auditory perception in the hope of accurately recreating a combination of sounds that forms an immersive environment, also known as a soundscape [2]. The techniques can be classified into three categories: coincidental (XY and Blumlein), near-coincidental (ORTF and NOS) and spaced configurations (modified Faulkner and AB).

In this study, I made three hypotheses. First, I predicted a preference for the near-coincidental techniques, because they blend the strengths of both the coincidental and spaced techniques. The second hypothesis was that wider microphone placements (greater distance between capsules) do not always yield a higher rating in terms of perceived spaciousness. Lastly, I hypothesized that there is a trade-off in semantic-preferential ratings between clarity and spaciousness, with listeners generally preferring clarity over spatial content in their recordings.

2. METHODS

2.1. Participants

The 12 subjects had between two and nineteen years of experience in critical listening. Previous studies by Toole [3] have shown that trained listeners are more discerning than, but no less representative of, the larger audience. No participants were compensated for their time, and all were told that their participation was entirely voluntary.

2.2. Materials

To accurately determine listeners' preferences for each recording technique on each question, a blinded experimental setup was created in Cycling '74's Max/MSP environment. The Max/MSP patch allowed users to submit their relative preferential ratings of each of the samples played. Six questions were asked about the sample recordings, which were presented two at a time. The questions were phrased to elicit responses specific to the following six attributes: representation of the instrument recorded, wideness of the space perceived, quality of localization, clarity of notes (in terms of attack and articulation), depth of space perceived, and general preference. For each question, subjects used an 11-point semantic rating scale to show their preferences (0 = not at all, 10 = extremely).

The presented music was performed on a Yamaha Disklavier MX100A, recorded in the Rolston Concert Hall at The Banff Centre in Alberta, Canada. The upright piano was sent MIDI information triggered through a digital audio workstation. The mechanical playback ensured a consistent, practically identical performance for all of the recordings. As such, it allowed me to keep the microphones, the microphone pre-amplifier, the cables, the stands, and the position relative to the piano exactly the same, changing only the setup of the stereo technique and thus minimizing confounding factors in this study.

2.3. Procedure

During the study, participants were first briefed on the purpose of the experiment. They were then introduced to the experimental interface and proceeded with the test. The study required them to answer a series of six questions, repeated for each of the three pairs of microphone techniques (one pair per category). The resulting sequence of 18 questions was set up such that no two participants would receive the same order of questions and choice presentation. I randomized the presentation order to avoid possible biases and threats to the internal validity of the experiment due to order effects.
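As an illustration only, the randomized presentation described above could be sketched in Python as follows. The actual Max/MSP patch is not described in detail, so the function name `build_trials`, the attribute labels, the seeding by a participant ID, and the random swap of which sample plays first are all assumptions made for this sketch.

```python
import random

# Hypothetical short labels for the six rated attributes.
ATTRIBUTES = [
    "representation", "wideness", "localization",
    "clarity", "depth", "general preference",
]
# One pair of techniques per category, as listed in the paper.
PAIRS = [("XY", "Blumlein"), ("ORTF", "NOS"), ("Faulkner", "AB")]


def build_trials(participant_id: int) -> list[tuple[str, str, str]]:
    """Return 18 (attribute, first_sample, second_sample) trials in random order."""
    rng = random.Random(participant_id)  # assumption: seed by participant ID
    trials = []
    for attribute in ATTRIBUTES:
        for a, b in PAIRS:
            # Randomly choose which sample is presented first (assumed detail).
            first, second = (a, b) if rng.random() < 0.5 else (b, a)
            trials.append((attribute, first, second))
    rng.shuffle(trials)  # randomize question order to counter order effects
    return trials
```

Seeding per participant makes each sequence reproducible. It makes identical orders across participants extremely unlikely rather than strictly impossible, so a real implementation would need an explicit uniqueness check to guarantee that no two participants share an order.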

After the first set of questions was completed, another set of questions was generated based on the participant's submitted results. These questions contained only the sound samples that had previously been rated the highest within their respective technique categories (coincidental, near-coincidental and spaced) for each attribute (representativeness, wideness of space perceived, localization, definition in attack, depth of space perceived, and general preference).
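The selection step for the second set of questions might look something like the sketch below, which picks the higher-rated technique in each category for a single attribute. The name `category_winners`, the dictionary layout, and the tie-breaking behaviour are assumptions; the paper does not say how ties were resolved.

```python
# Category membership as given in the paper.
CATEGORIES = {
    "coincidental": ("XY", "Blumlein"),
    "near-coincidental": ("ORTF", "NOS"),
    "spaced": ("Faulkner", "AB"),
}


def category_winners(ratings: dict[str, int]) -> dict[str, str]:
    """For one attribute, return the higher-rated technique in each category."""
    winners = {}
    for category, techniques in CATEGORIES.items():
        # max() keeps the first-listed technique on a tie (assumed tie rule).
        winners[category] = max(techniques, key=lambda t: ratings[t])
    return winners
```

Calling this once per attribute with a participant's first-round ratings would yield the three samples to compare across categories in the second round.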

Each subject then answered some demographic questions. These covered the number of years they had had formal training in audio or worked professionally in the audio industry, their age, the gender they identified with, the musical genre that they work in or listen to most, and the listening environment that they were in while participating in this experiment. These questions were designed to find out whether moderating variables were at play.

3. RESULTS

To test the first hypothesis, the general preference ratings of the microphone techniques within each category (coincidental, near-coincidental or spaced) were summed. The summed ratings were then analyzed for differences between group means.

I also measured the correlation between the sense of spaciousness perceived in a recording and the category it belongs to. Spaciousness was measured by adding the ratings for the wideness and depth of space perceived in each recording sample.

The clarity measure was the sum of the ratings for attack and localization. All data were tested for normality, and a Greenhouse-Geisser correction was applied where relevant.
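The two summed measures defined above can be written out as a short sketch. The dictionary keys and the example ratings are invented for illustration and are not data from the study.

```python
def composite_scores(ratings: dict[str, int]) -> dict[str, int]:
    """Combine 0-10 attribute ratings into the two summed measures."""
    return {
        # Spaciousness = wideness + depth of space perceived.
        "spaciousness": ratings["wideness"] + ratings["depth"],
        # Clarity = attack + localization.
        "clarity": ratings["attack"] + ratings["localization"],
    }


# Made-up ratings for one recording sample (not from the study).
example = {"wideness": 6, "depth": 7, "attack": 8, "localization": 5}
print(composite_scores(example))  # {'spaciousness': 13, 'clarity': 13}
```

Since each attribute is rated from 0 to 10, each composite ranges from 0 to 20.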

An ANOVA showed a significant difference across the three categories of microphone techniques: General Preference ratings for Near-Coincidental techniques were higher than those for Coincidental and Spaced techniques, with p = .02 and p = .03 and mean differences of 1.41 and 1.31 respectively.

The second hypothesis is upheld, given that there is no significant relationship (p > .85) between the spaciousness perceived in a recording and the distance between the microphone capsules.

Listeners also generally preferred Clarity over Spaciousness: Clarity ratings correlated significantly with General Preference ratings (Pearson's, two-tailed, p = .01), whereas Spaciousness ratings did not (p = .06). There is, however, no evidence of a trade-off between Clarity and Spaciousness.
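For readers unfamiliar with the statistic, a Pearson correlation coefficient between two sets of ratings can be computed as in the minimal sketch below. This is a generic textbook implementation, not the software used in the study, and it does not compute the p-value, which would additionally require a t distribution with n - 2 degrees of freedom.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

In this setting, `x` would hold the Clarity (or Spaciousness) composite ratings and `y` the corresponding General Preference ratings.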

4. DISCUSSION

The statistical results show that listeners do indeed prefer near-coincidental techniques, as predicted in the first hypothesis. According to Streicher and Dooley [4], such techniques provide good localization. This supports the third hypothesis, namely that listeners lean toward clarity in a recording.

With the second hypothesis upheld, it is worth re-examining the way in which we think and learn about microphone techniques. Stereo microphone techniques have traditionally been categorized by the distance between the microphone capsules, as defined by each technique's placement. The results suggest that other factors, such as microphone polar patterns, could be at play. The Blumlein technique, which is classified as a coincident pair technique, specifies a figure-8 polar pattern for its microphones. This was probably the reason why Blumlein recordings were found to be highly similar to both of the Spaced pair recordings (p < 0.01).

5. LIMITATIONS AND FUTURE DIRECTION

The materials in this experiment were selected arbitrarily. I recognize that the microphone choice may be considered less than ideal for recording classical piano music, as may the upright piano. However, the versatility and availability of the microphone were taken into consideration, and it was thought to be the most fitting choice in this instance. In spite of the arbitrary selection of materials, they were kept constant, and thus there is high internal validity within this experiment. The only variable was the microphone technique. I therefore do not expect any confounding factors in the experiment conducted.

I am also aware that the small sample size in this experiment somewhat reduces the generalizability of the findings. Nevertheless, the results remain meaningful given the high internal validity and the appropriate use of statistical tests.

Future research may be built upon the interface created for testing subjects. The interface allows for random presentation order, blind tests and quick access to results. As such, it could be beneficial to expand beyond the six stereo microphone recording techniques used in this experiment. It would also be interesting to compare binaural recordings against traditional stereo recordings. Data should also be collected in a different performance hall, with a different selection of music and different microphones.

This experiment has shown that there are indeed valid and objective ways of testing for preferences between recording techniques. The systematically controlled constants in the test setup, and the experiment interface that allowed for blind testing, are strengths of this experiment that I think should be adopted in other formal experiments involving recording techniques and auditory perception.

6. ACKNOWLEDGEMENTS

I would like to thank Jason Corey for mentoring me through this entire project, and the wonderful people at The Banff Centre, especially Peter Cook, who assisted me during the formulation of the idea and allowed me full access to their resources to record the necessary source material.

The generous help of all who have participated in my study and provided feedback towards the current interface is also greatly appreciated. The collection of materials would not have been possible without the help of Benjamin Gendron-Smith, Denis Martin and Winfried Lachenmayr.

7. REFERENCES

[1] Olive, S. (1990). The preservation of timbre: Microphones, loudspeakers, sound sources and acoustical spaces. In Proceedings of the AES 8th International Conference, May 3-6. Audio Engineering Society.

[2] Rumsey, F. (2002). Spatial quality evaluation for reproduced sound: Terminology, meaning, and a scene-based paradigm. Journal of the Audio Engineering Society 50(9), 651–666.

[3] Toole, F. E. (1981). Listening tests – turning opinion into fact. Presented at the 68th Audio Engineering Society Convention, Los Angeles. Preprint 1766.

[4] Streicher, R., and Dooley, W. (1985). Basic stereo microphone perspectives: A review. Journal of the Audio Engineering Society 33(7/8), 548–556.