Apple Inc. v. Samsung Electronics Co. Ltd. et al

Filing 999

Administrative Motion to File Under Seal filed by Samsung Electronics America, Inc.(a New York corporation), Samsung Electronics Co. Ltd., Samsung Telecommunications America, LLC(a Delaware limited liability company). (Attachments: #1 Proposed Order Granting Motion to Seal, #2 Samsung's Opposition to Apple's Motion to Exclude Testimony of Samsung's Experts, #3 Declaration of Joby Martin in Support of Samsung's Opposition, #4 Exhibit A to the Martin Declaration, #5 Exhibit B to the Martin Declaration, #6 Exhibit C to the Martin Declaration, #7 Exhibit D to the Martin Declaration, #8 Exhibit E to the Martin Declaration, #9 Exhibit F to the Martin Declaration, #10 Exhibit G to the Martin Declaration, #11 Exhibit H to the Martin Declaration, #12 Exhibit I to the Martin Declaration, #13 Exhibit J to the Martin Declaration, #14 Exhibit K to the Martin Declaration, #15 Exhibit L to the Martin Declaration, #16 Exhibit M to the Martin Declaration, #17 Exhibit N to the Martin Declaration, #18 Exhibit O to the Martin Declaration, #19 Exhibit P to the Martin Declaration, #20 Exhibit Q to the Martin Declaration, #21 Exhibit R to the Martin Declaration, #22 Exhibit S to the Martin Declaration, #23 Proposed Order Denying Apple's Motion to Exclude Testimony of Samsung's Experts)(Maroulis, Victoria) (Filed on 5/31/2012)

EXHIBIT R

Quality & Quantity (2007) 41:601–626, DOI 10.1007/s11135-007-9089-z, © Springer 2007

Ordinal Methodology in the Analysis of Likert Scales

RAINER GÖB (1,∗), CHRISTOPHER McCOLLIN (2) and MARIA FERNANDA RAMALHOTO (3)

1 Institute for Applied Mathematics and Statistics, University of Würzburg, Sanderring 2, D-97070 Würzburg, Germany. E-mail: goeb@mathematik.uni-wuerzburg.de
2 Nottingham Trent University, Burton Street, Nottingham, NG1 4BU, United Kingdom. E-mail: Chris.McCollin@ntu.ac.uk
3 Instituto Superior Técnico, Maths Dept., Av. Rovisco Pais, 1049-001 Lisbon, Portugal

Abstract. Likert scales are widely used in survey studies for attitude measuring. In particular, the questionnaires propagated by the SERVQUAL approach are based on Likert scales. Though the problem of attitude suggests an ordinal interpretation of Likert scales, attitude survey data are often evaluated with techniques designed for cardinal measurements. The present paper discusses the interpretation of scales for attitude measuring and gives a survey of data analysis techniques under the proper ordinal understanding.

Key words: attitude measuring, Likert scales, ordinal scales, cardinal scales, SERVQUAL, statistical analysis.

1. Introduction

Likert scales are widely used for measuring attitudes, e.g., opinions, psychic and mental dispositions, preferences. Questionnaires and surveys based on Likert scales are used in various areas, e.g., in psychometrics for the analysis of subjective well-being, see Diener et al. (1985) or Watson et al. (1988), in social studies and panels, or for purposes of business administration. The use of Likert scales has increased especially in the service sector, with consumer surveys now being commonplace within the hotel, leisure and public utility sectors. In particular, the SERVQUAL approach introduced by Parasuraman et al. (1985, 1988) has received enormous interest.
The ways of collecting survey data vary widely, from telephone questionnaires to web pages designed for automatic on-line input. The statistical analysis of survey data can range from simple dot plots to logistic regression and cluster analysis to determine any hidden structure. However, many studies confine themselves to a descriptive analysis. Clason and Dormody (1994) compare 95 articles analyzing Likert scales from the Journal of Agricultural Education; 51 reported only descriptive statistics. In a recent review of some University Business School dissertations, most students opted for questionnaires and/or interviews for their primary research, and the main statistical analysis was of an exploratory nature, using bar charts, check lists and Pareto plots. It is interesting to note that, in a similar way to Ishikawa’s three levels of tools which provide the differentiation between Six Sigma Green and Black Belts, most students will attempt only Ishikawa’s level 1 tools (the 7 basic tools) to carry out their analysis even though they have been taught level 2 and 3 tools such as ANOVA and regression.

∗ Author for correspondence: E-mail: goeb@mathematik.uni-wuerzburg.de

Unfortunately, guidance on how to analyze data measured in Likert scales is not widely available in textbooks. In fact, there is no common standard accepted by the scientific community for the correct interpretation and analysis of such data. Interpretation and analysis often seem to be mismatched. In methodological considerations it is generally acknowledged that attitude measuring scales should be considered as ordinal. Nevertheless, many studies use cardinal statistics such as sample means, sample variances and t-tests to analyze attitude data. Proper ordinal approaches are in the minority. In particular, the SERVQUAL methodology as usually propagated is completely based on cardinal statistics.
The objective of the present paper is to establish a framework for the analysis of survey data under an explicitly ordinal interpretation of the Likert scale. Sections 2 and 3 review the debate on the impact of scale typologies on statistical methodology. Sections 4 and 5 discuss the interpretation of Likert scales. Sections 6 and 7 suggest the multinomial model for modelling data from attitude surveys. Sections 8 through 12 consider the analysis of survey data from a homogeneous sample of respondents. Ways of detecting and analyzing inhomogeneous samples are discussed in Section 13.

2. Ordinal and Cardinal Scales

We consider one-dimensional scales which can be identified with subsets of the real line. Stevens (1946, 1951) characterizes the scale types nominal, ordinal, interval, ratio in terms of permissible transformations. We use Stevens’ (1946, 1951) ideas to distinguish between ordinal and cardinal scales.

Ordinal measure scales consist of categories ordered by a relation of the type “<” or “≤”, respectively. Any two measure values can be compared in terms of the order relation. The admissibility of strictly increasing scale transformations preserving the order relation is characteristic for ordinal scales. Consequently, differences of scale values are not meaningful. Beyond order, there is no measure for the distance between two scale values. For instance, the ordinal scales 1, 2, 3, 4, 5 and 1, 3, 9, 27, 81 are equivalent. However, in the first scale the magnitude of differences between successive points is identical, whereas it is increasing in the second scale.

Cardinal measure scales express magnitudes. Differences between scale values are meaningful. In Stevens’ terminology, cardinal scales are interval scales. Interval scales are characterized by the admissibility of strictly increasing linear transformations. For instance, the cardinal scales 1, 2, 3, 4, 5 and 0, 2, 4, 6, 8 are equivalent.

3. Scale Interpretation and Statistical Methodology

The rationale behind an axiomatic distinction of scales as described in Section 2 is beyond doubt. However, the role of scale identifications in the methodology of statistical data analysis is controversial. Stevens (1951) and subsequently many other authors, e.g., Luce (1959), Townsend and Ashby (1984) and Luce et al. (1990), postulate the following steps of data analysis:

(S1) Scales for measuring the values of certain attributes are chosen according to criteria provided by measurement theory.
(S2) The measure scale chosen in step (S1) prescribes certain statistics and proscribes others.

In this view, data measured in a specific scale have to be analyzed by statistics which preserve their meaning under the characteristic transformation of the scale. Admissible statistics for ordinal data are frequencies, histograms, order statistics. Methods involving arithmetic or weighted means are appropriate for cardinal data, but they make no sense for the analysis of ordinal data. Andrews et al. (1981) present an elaborate guide to selecting statistical methods in accordance with measure scales.

The above view of the predominant role of measurement theory in data analysis has been criticized by several authors, see Lord (1953), Savage (1957), Tukey (1961), Adams et al. (1965) and Baker et al. (1986) for instance. More references and a detailed survey of the discussion are given by Velleman and Wilkinson (1993). In what follows we consider only one, but substantial, critical argument. The following propositions can be taken for granted:

(i) Data analysis is an autonomous discipline.
(ii) Among other techniques, data analysis uses formal mathematical methods, without being a part of mathematics.
(iii) Any data analysis is motivated by a specific problem, i.e., specific interests and objectives of knowledge discovery, occurs in a specific context, i.e., a specific scientific or pragmatic environment, and reflects methods with respect to their solution potential for the problem in the context.
(iv) The criteria of adequacy of methods of data analysis result from the specific problem, the specific context, and the solution potential.

Under propositions (i) through (iv), the description of data analysis by steps (S1), (S2) requires the following further assumption:

(v) Measurement theory alone is able to reflect the criteria resulting from problem, context, and solution potential, by determining uniquely an adequate measure scale.

However, considering the actual state of measurement theory as a discipline, see Luce et al. (1990) for instance, it will be difficult to defend proposition (v). Customary measurement theory deliberately works without reflecting the potential of the entire corpus of statistical data analysis regarding problem, context, and solution potential. On the contrary, the succession of steps (S1), (S2) claims that methods of analysis can be selected or excluded without reflecting their potential. Measurement theory claims to be a preliminary fundamental discipline for data analysis. However, it is strongly influenced by formal axiomatic reasoning and fails to provide a conceptual framework to structure data analysis according to the basic matters of problem, context, and solution potential.

The succession of steps (S1), (S2) has to be rejected. Scale type identification is reasonable to avoid conceptual confusion. However, scale type identification by measurement theory is not exclusively decisive for the choice of data analysis methods. The choice of appropriate methods is determined by the three interdependent factors listed above. In this vein, Adams et al. (1965): “Nothing is wrong per se in applying any statistical operation to measurements of given scale, but what may be wrong, depending on what is said about the results of these operations, is that the statement about them will not be empirically meaningful or else that it is not scientifically significant.” Examples illustrating the above argument are provided by Lord (1953) and Wright (1997). The subsequent Section 5 discusses the interpretation of Likert scales.

4. The Likert Scale

Rensis Likert (1932) introduced a scale and technique for attitude measurement. An individual is confronted with statements which are essentially value judgements. The value judgements may concern the individual’s reflections of reality or the individual’s psychic dispositions such as feelings, wants, desires, conative dispositions. The individual is invited to define his attitude towards each statement by choosing among r grades (scores, degrees) on the r-grade Likert scale. Most popular are five-grade and seven-grade Likert scales. The grades (scores, degrees) 1, . . . , r are ordered in ascending order of agreement or approval of the individual with respect to the value statement. In case of r = 5, the grades are usually interpreted as strongly disagree, disagree, neutral (undecided), agree, strongly agree.

Likert scales are widely used in different areas for attitude measurement by surveys, e.g., in psychology, sociology, health care, marketing, quality control. Popular applications are in the assessment of customers’ quality perceptions or expectations, and of subjective well-being. Subjective well-being has become an important topic in research and practical fields like health care, see Diener (1984) or Diener et al. (1999).

Many differently structured attitude measuring techniques based on Likert scales are in use. We describe some standard schemes which have been widely used for many years: SERVQUAL, PANAS, SWLS, GSOEP.
4.1. SERVQUAL

Attitude surveys structured according to the SERVQUAL approach introduced by Parasuraman et al. (1985, 1988) are currently among the most popular applications of sample surveys in industry. SERVQUAL surveys intend to inquire into customers’ attitudes towards service quality. Service quality is considered with respect to ν dimensions which are addressed by a questionnaire regarding M performance items. Parasuraman et al. (1985, 1988) suggest M = 22 items grouped into ν = 5 dimensions of service quality: tangibles (environmental factors), reliability, responsiveness, assurance, empathy. This setting has been widely accepted in applications. More subtle investigations use statistical instruments like principal components analysis to confirm or modify the setting, see Asubonteng et al. (1996) for a literature survey.

The questionnaire contains a statement on each of the M performance items. The respondent is invited to qualify his attitude towards each statement in a response scale of Likert type with grades or scores ranging from 1 (“strongly disagree”) to r (“strongly agree”). Most popular in SERVQUAL manuals and case studies are scales with r = 7 or r = 5 grades, see Parasuraman et al. (1985, 1988). Occasionally, other values of r, e.g., r = 10, are also used, see Asubonteng et al. (1996).

SERVQUAL distinguishes between two attitudes: expectations on quality, i.e., what a customer expects from the service, and perceptions of quality, i.e., the customer’s view of what actually happened. SERVQUAL intends to measure the gap between expectations and perceptions. To this end, SERVQUAL questionnaires are doubled: the respondent is invited to qualify his attitude towards each of the M statements once in the sense of expectation, irrespective of what actually happened, and once in the sense of perception of what actually happened.
The SERVQUAL community has adopted some standard quantitative methodology for the evaluation of SERVQUAL surveys which was essentially coined by Parasuraman et al. (1985, 1988, 1991). These methods are based on an implicit cardinal interpretation of the Likert scale. For each respondent, dimension scores and a total SERVQUAL score are calculated as arithmetic or appropriately weighted averages. Survey scores are calculated as arithmetic averages of the respondent scores. Gap scores are calculated as differences of perception score minus expectation score.

4.2. PANAS

The positive and negative affect scale (PANAS) introduced by Watson et al. (1988) is concerned with measuring subjective dispositions in the sense of moods, whether momentary, mid-term, or long-term. PANAS refers to 20 feelings or emotions in two dimensions, positive and negative. The respondent is asked to indicate the degree of experiencing the feeling or emotion on a five-grade Likert scale with values very slightly or not at all, a little, moderately, quite a bit, extremely. The quantitative methodology suggested by Watson et al. (1988) uses an implicit cardinal interpretation of Likert scores.

4.3. SWLS

To measure global and rather persistent judgements on individual life, Diener et al. (1985) suggest the satisfaction with life scale (SWLS). SWLS considers only five statements: “In most ways my life is close to my ideal”, “The conditions of my life are excellent”, “I am satisfied with my life”, “So far I have gotten the important things I want in life”, “If I could live my life over, I would change almost nothing”. The respondent is asked to indicate the degree of approval of each statement on a seven-grade Likert scale ranging from strongly disagree to strongly agree. The quantitative methodology used by Diener et al. (1985) to evaluate the SWLS technique is based on an implicit cardinal interpretation of Likert scores.

4.4. GSOEP

The German Socio-Economic Panel (GSOEP) has been conducted as a longitudinal panel in Germany since 1984. It includes 11 questions concerning satisfaction with work, income, health, housing, leisure, consumption. The answers are recorded on an eleven-grade Likert scale ranging from 0 (totally dissatisfied) to 10 (totally satisfied). Further information about GSOEP can be obtained on the WWW at http://www.diw.de/. GSOEP contains no advice for scale interpretation or methods of analysis.

5. The Character of Likert Scales

Following the conclusion of Section 3, the specific problem, the context of data analysis, and the problem solving potential of methods are crucial for deciding on the scale type and the appropriate methods of analysis. For a Likert scale, the alternative is between an ordinal and a cardinal scale type.

5.1. The Problem behind Likert Scales

The problem behind the use of Likert scales is measuring attitudes. Accordingly, the interpretation of such scales is discussed in psychology, sociology and economics. Attitude measuring has to satisfy the following criteria:

• Longitudinal consistency or retesting reliability: At repeated measuring times under invariant relevant side conditions respondents exhibit the same rating.
• Longitudinal comparability: Responses given by an individual at different times with respect to the same item can be compared on the scale.
• Internal consistency.
• Interpersonal comparability: Responses from different individuals can be compared on the scale.
• Plausibility: The measuring method has to conform to naive assessments of attitudes.

By definition in terms of admissible transformations, an ordinal scale is less restrictive in interpretation than a cardinal scale. Hence it is easier to satisfy the above criteria under an ordinal interpretation than under a cardinal interpretation of the Likert scale.
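The difference between the two interpretations can be made concrete with a small sketch. It takes the two ordinally equivalent scales from Section 2 (1, 2, 3, 4, 5 and 1, 3, 9, 27, 81) and checks how a comparison of two groups behaves under the relabelling. The two response lists are hypothetical data chosen for illustration, and the helper `compare` is not part of the paper.

```python
from statistics import mean, median

# Hypothetical responses of two small groups on a five-grade Likert scale.
group_a = [3, 3, 4]
group_b = [2, 2, 5]

# Strictly increasing relabelling of the grades (second scale of Section 2).
relabel = {1: 1, 2: 3, 3: 9, 4: 27, 5: 81}

def compare(stat, a, b):
    """Return '>', '<' or '=' according to how stat(a) compares with stat(b)."""
    sa, sb = stat(a), stat(b)
    return ">" if sa > sb else "<" if sa < sb else "="

a_relabelled = [relabel[x] for x in group_a]
b_relabelled = [relabel[x] for x in group_b]

# Comparison of means before and after the order-preserving relabelling.
mean_before = compare(mean, group_a, group_b)
mean_after = compare(mean, a_relabelled, b_relabelled)

# Comparison of medians (an order statistic) before and after.
median_before = compare(median, group_a, group_b)
median_after = compare(median, a_relabelled, b_relabelled)

print("means:  ", mean_before, "->", mean_after)
print("medians:", median_before, "->", median_after)
```

With these data the ordering of the means reverses under the relabelling, whereas the ordering of the medians is unchanged, which is the sense in which order statistics are admissible for ordinal data and means are not.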
In particular, ordinal scales facilitate achieving comparability. For instance, consider a seven-grade Likert scale to measure satisfaction and let an individual report grade 2 at time t1 and grade 4 at time t2. Under a cardinal interpretation this amounts to the controversial conclusion that the individual at time t2 experiences twice the satisfaction experienced at time t1. Under an ordinal interpretation it only means that the individual’s satisfaction increased from the second to the fourth position on the scale. Comparability has been discussed extensively in the theory of utility and of social choice, see Georgescu-Roegen (1968), van Praag (1991) or Sen (1999). Many authors agree that under ordinal scaling interpersonal comparability is a justified working hypothesis, see Ferrer-I-Carbonell and van Praag (2003). Cardinal interpretations, however, involve considerable difficulties in guaranteeing comparability.

Naive cardinal interpretations of ordinal scales may violate internal consistency and interpersonal comparability. Hart (1996) reports the results of an experiment suggested by Lodge (1981) for quantifying the grades in a Likert scale by magnitudes. A sample of respondents is invited to assign magnitudes to the grades of a 7-grade Likert scale with the interpretations atrocious, very bad, bad, so-so, good, very good, excellent. The result shows considerable differences in the weights assigned to distances between the grades on the Likert scale. For instance, the step from atrocious to very bad is quantified by 0.6, whereas the step from so-so to good is quantified by 1.9.

5.2. The Context of the Use of Likert Scales

Consider the scientific and pragmatic context of the use of Likert scales in survey techniques. In view of the context, methodology is rated by the following criteria:

• Acceptance by communities in practice or research.
• Standardization.
• Comparability of results.
Section 4 lists four popular standardized attitude measuring techniques based on Likert scales: SERVQUAL, SWLS, PANAS, GSOEP. All are widely accepted, definite, standardized. They differ in advice for scale interpretation and for methods of data analysis.

GSOEP contains no advice for scale interpretation or methods of analysis. Recent studies of GSOEP attitude data include explicitly ordinal scale interpretations, see Ferrer-I-Carbonell and van Praag (2003) or Nolte and McKee (2004), and implicitly cardinal ones, see Lucas et al. (2003) or Ronellenfitsch and Razum (2004).

SERVQUAL, SWLS, and PANAS, in the manner originally conceived by their authors, see Parasuraman et al. (1988), Diener et al. (1985) and Watson et al. (1988), contain advice on data analysis. This advice implies a cardinal interpretation of the Likert scale: empirical sums, means, variances and correlation coefficients of scores are calculated. Such approaches are mainly motivated by pragmatic reasoning since cardinal statistics are widely available in textbooks and software. However, they contradict principles of attitude measuring which suggest ordinal scaling, see Section 5.1 above.

Essentially, two types of misleading conclusions may follow from the conflict of intrinsic ordinality and imposed cardinality.

(1) Complete distortion of results by applying strictly monotone transformations to a scale which bears a cardinal interpretation. Fortunately, this type of misinterpretation is prevented by the pragmatic context, where SERVQUAL, SWLS, and PANAS are strictly linked to unambiguous Likert scales with grades 1, . . . , 7. The idea of subjecting the scale to transformations is purely academic.
(2) Interpretation of attitude grades in terms of magnitudes. This is a serious misinterpretation supported by approaches like SERVQUAL, SWLS, or PANAS.
Often enough, practitioners report results of surveys by statements like “We’ve increased customer satisfaction by 150% in one year.”

5.3. The Analysis of Attitude Surveys

Consider the problem solving potential of methods for the analysis of attitude surveys. The major criteria are:

• Clarity.
• Exactness.
• Informational value.
• Simplicity.
• Availability.

The cardinal scale approach suggested by standard descriptions of SERVQUAL, PANAS, SWLS excels in simplicity and availability. Methods like principal components analysis, factor analysis, correlation analysis, t-testing or ANOVA are from the conventional statistical toolbox, readily available in textbooks or software packages.

Deficiencies of the cardinal scale approach are in clarity and exactness. The basic problem of scale interpretation generally remains unmentioned in the SERVQUAL environment and is discussed by few authors only, see Hart (1996) or Hart (1999). Many of the methods usually recommended are based on normality assumptions. These assumptions mostly remain undiscussed. No attempts are made to substantiate them by asymptotics.

The informational value of methods recommended in SERVQUAL, PANAS or SWLS schemes is undoubted. Summed or averaged scores convey information about respondents’ attitudes. However, cardinal statistics also may hide or distort information. For instance, strong agreements and strong disagreements may be averaged, providing a misleading impression of average agreement.

5.4. Conclusion on the Interpretation of Likert Scales

The problem of attitude measuring clearly suggests an ordinal interpretation of Likert scales. The context of use has established some implicit cardinal interpretation. To some extent, cardinal statistics have successfully been applied in the analysis of attitude surveys.
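The remark in Section 5.3 that averaging can hide strong agreement and strong disagreement can be illustrated with a short sketch. The two samples below are hypothetical five-grade responses invented for illustration, and `proportions` is an illustrative helper, not a function from the paper.

```python
from collections import Counter
from statistics import mean

# Hypothetical samples of 100 respondents each on a five-grade Likert scale.
polarized = [1] * 50 + [5] * 50   # half strongly disagree, half strongly agree
neutral = [3] * 100               # everyone undecided

def proportions(responses, r=5):
    """Share of respondents choosing each grade 1..r."""
    counts = Counter(responses)
    n = len(responses)
    return [counts.get(grade, 0) / n for grade in range(1, r + 1)]

# The cardinal summary does not distinguish the two samples ...
mean_polarized = mean(polarized)
mean_neutral = mean(neutral)

# ... whereas the grade proportions do.
prop_polarized = proportions(polarized)
prop_neutral = proportions(neutral)

print(mean_polarized, mean_neutral)
print(prop_polarized)
print(prop_neutral)
```

Both samples have the same mean score, although one is completely split and the other completely undecided; the vectors of proportions keep that difference visible.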
In summary, ordinal methodology for Likert scale analysis conforms to the problem of attitude measuring, but it differs from widely used and sufficiently successful practice. To be acceptable for practitioners, a proper ordinal approach to Likert scale analysis has to substantiate its problem solving potential according to the criteria listed in Section 5.3, in particular with respect to simplicity and availability. The subsequent sections give an overview of ordinal methodology which is competitive in this sense.

6. Formal Description of Attitude Questionnaires

The discussion of quantitative analysis of attitude surveys requires a formal description of attitude questionnaires. Consider a questionnaire with M statements expressing ν dimensions. In SERVQUAL usually ν = 5, M = 22. Let $1 = m_1 < \cdots < m_{\nu+1} = M + 1$ and let the statements (items) $m_\rho, \dots, m_{\rho+1} - 1$ be associated with dimension ρ, ρ = 1, . . . , ν. Responses are recorded on a Likert scale with r grades represented by the numbers 1, . . . , r. The survey is conducted with n respondents i = 1, . . . , n.

The response of respondent i with respect to statement j is denoted by an r-tuple $X_{ij} = (X_{ij1}, \dots, X_{ijr})$ with entries from {0, 1}, $X_{ij1} + \cdots + X_{ijr} = 1$. $X_{ijl} = 1$ means: with respect to statement j, respondent i exhibits agreement grade 1 ≤ l ≤ r on the Likert scale. Then

$$X_i^{(\rho)} = \sum_{j=m_\rho}^{m_{\rho+1}-1} X_{ij}, \quad \rho = 1, \dots, \nu, \qquad X_i = \sum_{\rho=1}^{\nu} X_i^{(\rho)} = \sum_{j=1}^{M} X_{ij} \tag{1}$$

is the response vector of respondent i in dimension ρ, respectively the total response vector in all M items. The above scheme can be used to describe the response on perception as well as on expectation.

The response of respondent i with respect to statement j in terms of the gap between perception and expectation is denoted by a (2r − 1)-tuple

$$Z_{ij} = (Z_{i,j,-(r-1)}, \dots, Z_{i,j,0}, \dots, Z_{i,j,r-1}) \tag{2}$$

with entries from {0, 1}, $Z_{i,j,-(r-1)} + \cdots + Z_{i,j,r-1} = 1$.
$Z_{ijl} = 1$, l < 0 means: with respect to item j, the perception of respondent i is |l| degrees below the expectation. $Z_{ijl} = 1$, l > 0 means: with respect to item j, the perception of respondent i exceeds the expectation by l degrees. $Z_{ij0} = 1$ means: with respect to item j, the perception of respondent i equals the expectation.

The above interpretation of gaps is consistent with an ordinal interpretation of the Likert scale. The numbers −(r − 1), . . . , 0, . . . , r − 1 indicate distances of expectation and perception in terms of degrees, not of magnitudes. These distances remain invariant under strictly monotone scale transformations. The vectors

$$Z_i^{(\rho)} = \sum_{j=m_\rho}^{m_{\rho+1}-1} Z_{ij}, \quad \rho = 1, \dots, \nu, \qquad Z_i = \sum_{\rho=1}^{\nu} Z_i^{(\rho)} = \sum_{j=1}^{M} Z_{ij} \tag{3}$$

are the gap vectors of respondent i in dimensions ρ = 1, . . . , ν, respectively the total gap vector of respondent i in all M items on the questionnaire.

7. The Multinomial Model for Attitude Responses

The notation of Section 6 can be used to express survey evaluations based on a cardinal view of the Likert scale by forming weighted sums and averages of scores. Under an ordinal interpretation, quantitative analysis is primarily interested in the proportions of respondents choosing a certain grade on the attitude scale. In view of this interest, the multinomial distribution is a natural stochastic model of response behaviour. The multinomial framework in terms of the multinomial logit is often used in explanatory modelling of choices or preferences, see Powers and Xie (1999). However, in the analysis of sample surveys based on Likert scales the multinomial model is not very popular, see Maravelakis et al. (2003) for instance.

An s-dimensional random vector Y has multinomial distribution $MULT(k, p_1, \dots, p_s)$, briefly $Y \sim MULT(k, p_1, \dots, p_s)$, if the probability function is given by

$$P(Y = y) = \frac{k!}{y_1! \cdots y_s!}\, p_1^{y_1} \cdots p_s^{y_s} \quad \text{for } y_1, \dots, y_s \in \mathbb{N}_0,\ y_1 + \cdots + y_s = k, \tag{4}$$

with parameters $k \in \mathbb{N}$ and $p_1, \dots, p_s \geq 0$, $p_1 + \cdots + p_s = 1$. Formula (4) gives the probability of choosing category l exactly $y_l$ times, l = 1, . . . , s, in k experiments where the probability of choosing category l in one experiment is $p_l$. The above interpretation suggests assuming

$$X_{ij} \sim MULT(1, p_{ij1}, \dots, p_{ijr}), \qquad Z_{ij} \sim MULT(1, q_{i,j,-(r-1)}, \dots, q_{i,j,r-1}) \tag{5}$$

for the response vector $X_{ij}$ of respondent i on item j and for the gap vector $Z_{ij}$ between perception and expectation of respondent i on item j. The choice probability $p_{ijl}$, respectively $q_{ijl}$, quantifies respondent i’s average inclination to exhibit attitude l towards statement j. Analogously, we assume multinomial distributions for dimension responses or dimension gaps and for total responses and for total gaps, i.e.,

$$X_i^{(\rho)} \sim MULT(m_{\rho+1} - m_\rho,\ p_{i1}^{(\rho)}, \dots, p_{ir}^{(\rho)}), \quad \rho = 1, \dots, \nu, \tag{6}$$

$$X_i \sim MULT(M,\ p_{i1}, \dots, p_{ir}), \tag{7}$$

$$Z_i^{(\rho)} \sim MULT(m_{\rho+1} - m_\rho,\ q_{i,-(r-1)}^{(\rho)}, \dots, q_{i,r-1}^{(\rho)}), \quad \rho = 1, \dots, \nu, \tag{8}$$

$$Z_i \sim MULT(M,\ q_{i,-(r-1)}, \dots, q_{i,r-1}). \tag{9}$$

Above, choice probabilities are indexed by the respondent i. In most cases, choice probabilities are identical for groups of individuals or for the entire population. We distinguish two assumptions:

(A1) Homogeneous population and sample: The choice probabilities are invariant for all respondents from a given population, and in particular for all respondents 1, . . . , n. The respondent index i in formulae (5) through (9) can be omitted.
(A2) Clustered population and sample: The choice probabilities are invariant in mutually exclusive subgroups (clusters) of the population, and in particular in subgroups $C_1, \dots, C_Q$, $C_1 \cup \cdots \cup C_Q = \{1, \dots, n\}$ of respondents in the sample. Choice probabilities corresponding to different clusters are different.

8. Estimation and Confidence Intervals for Choice Probabilities in a Homogeneous Population

We consider a homogeneous sample according to assumption (A1) where the choice probabilities are identical for all respondents. Responses of different individuals can be assumed to be independent. The respective vectors of choice probabilities, see formulae (5) through (9), are estimated by the vectors

$$\bar{X}_j = \frac{1}{n} \sum_{i=1}^{n} X_{ij}, \qquad \bar{X}^{(\rho)} = \frac{1}{n(m_{\rho+1} - m_\rho)} \sum_{i=1}^{n} X_i^{(\rho)}, \qquad \bar{X} = \frac{1}{nM} \sum_{i=1}^{n} X_i, \tag{10}$$

$$\bar{Z}_j = \frac{1}{n} \sum_{i=1}^{n} Z_{ij}, \qquad \bar{Z}^{(\rho)} = \frac{1}{n(m_{\rho+1} - m_\rho)} \sum_{i=1}^{n} Z_i^{(\rho)}, \qquad \bar{Z} = \frac{1}{nM} \sum_{i=1}^{n} Z_i, \tag{11}$$

of empirical survey averages (empirical proportions). The components of the survey averages are uniformly minimum variance unbiased (UMVU) estimators for the corresponding choice probabilities, see, for instance, Lehmann (1983) for the background in estimation theory.

The accuracy of a parameter estimate is best supported by providing a confidence region at a sufficiently high confidence level γ. In case of a vector parameter, simultaneous confidence intervals are particularly useful since the accuracy of each component estimate can be evaluated separately. For the vector parameter $p = (p_1, \dots, p_s)$ in an s-dimensional multinomial distribution, the s simultaneous confidence intervals $I_1 = (LCL_1; UCL_1), \dots, I_s = (LCL_s; UCL_s)$ at a nominal confidence level γ have to satisfy

$$P_p(p_1 \in I_1, \dots, p_s \in I_s) \overset{!}{\geq} \gamma \quad \text{for all values of } p = (p_1, \dots, p_s). \tag{12}$$

We consider an i.i.d. sample $Y_1, \dots, Y_N$ of size N from a multinomial distribution $MULT(1, p_1, \dots, p_s)$. The confidence intervals for $p_l$ are centered around the UMVU estimator for $p_l$, i.e., the sample average (empirical proportion) $\bar{Y}_{\cdot l} = \frac{1}{N} \sum_{d=1}^{N} Y_{dl}$ with respect to component l. Asymptotic simultaneous confidence intervals for the choice probabilities have been constructed by Quesenberry and Hurst (1964) and Goodman (1965).
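The indicator encoding of Section 6 and the empirical proportions in (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the tiny data set (n = 4 respondents, M = 3 statements, r = 5 grades) are hypothetical and do not come from the paper, and the gap degree is taken as perception grade minus expectation grade, in line with the description of (2).

```python
def one_hot(grade, r):
    """Indicator vector X_ij: 1 at the chosen grade (1..r), 0 elsewhere."""
    return [1 if l == grade else 0 for l in range(1, r + 1)]

def gap_one_hot(perception, expectation, r):
    """Indicator vector Z_ij over the gap degrees -(r-1), ..., r-1."""
    gap = perception - expectation
    return [1 if l == gap else 0 for l in range(-(r - 1), r)]

def vector_sum(vectors):
    """Componentwise sum of equally long vectors, as in formulae (1) and (3)."""
    return [sum(column) for column in zip(*vectors)]

def item_proportions(responses, j, r):
    """First estimator in (10): average of X_ij over the n respondents."""
    n = len(responses)
    totals = vector_sum([one_hot(resp[j], r) for resp in responses])
    return [count / n for count in totals]

def total_proportions(responses, r):
    """Last estimator in (10): average over all n*M answers."""
    n, M = len(responses), len(responses[0])
    totals = vector_sum([one_hot(g, r) for resp in responses for g in resp])
    return [count / (n * M) for count in totals]

# Hypothetical perception grades: one row per respondent, one column per statement.
perceptions = [[4, 5, 3], [4, 2, 3], [5, 4, 3], [4, 4, 1]]

x_total_first = vector_sum([one_hot(g, 5) for g in perceptions[0]])  # X_1 in (1)
p_item0 = item_proportions(perceptions, 0, 5)
p_total = total_proportions(perceptions, 5)
z_example = gap_one_hot(perception=3, expectation=5, r=5)  # two degrees below
```

The estimates are plain relative frequencies, so no grade is ever treated as a magnitude; only the counting of how often each grade or gap degree occurs enters the calculation.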
Both Quesenberry and Hurst (1964) and Goodman (1965) suggest for $p_l$ an interval with lower/upper limit

$$LCL_l / UCL_l = \frac{z + 2N\bar{Y}_{\cdot l} \mp \sqrt{z\,[z + 4N\bar{Y}_{\cdot l}(1 - \bar{Y}_{\cdot l})]}}{2(N + z)}, \tag{13}$$

where $z$ is a suitable $100\beta\%$ quantile of a $\chi^2$-distribution. Quesenberry and Hurst (1964) show that the requirement (12) is satisfied asymptotically for large $N$, i.e., the nominal confidence level is guaranteed asymptotically, by choosing the $100\gamma\%$ quantile $z = z_{s-1}(\gamma)$ of the $\chi^2$-distribution $\chi^2(s - 1)$ with $s - 1$ degrees of freedom. By a theorem of Wilks (1962) based on the Bonferroni inequality, Goodman (1965) shows that (12) can be satisfied asymptotically with narrower intervals by choosing the $100(1 - (1 - \gamma)/s)\%$ quantile $z = z_1(1 - (1 - \gamma)/s)$ of the $\chi^2$-distribution $\chi^2(1)$ with 1 degree of freedom.

Further simultaneous confidence intervals for the choice probabilities are discussed in the literature. Bailey's (1980) approach, based on a normalizing transformation of the estimators, produces shorter intervals than Goodman's for small values of the estimators. The method of Sison and Glaz (1995) is quite involved and cannot be used without software support. Fitzpatrick and Scott (1987) suggest the simple intervals

$$I_l = \left( \bar{Y}_{\cdot l} - \frac{c(\gamma)}{\sqrt{N}} \; ; \; \bar{Y}_{\cdot l} + \frac{c(\gamma)}{\sqrt{N}} \right), \tag{14}$$

where $c(0.90) = 1.00$, $c(0.95) = 1.13$, $c(0.99) = 1.40$.

May and Johnson (1997) compare the approaches of Quesenberry and Hurst (1964), Goodman (1965), Fitzpatrick and Scott (1987), Sison and Glaz (1995) and some more in a simulation study. Fitzpatrick and Scott (1987) intervals are recommended for quick and rough calculations. Quesenberry and Hurst (1964) intervals are wide and conservative, and generally satisfy requirement (12) even for relatively small sample sizes. The narrower Goodman (1965) intervals should be used only if the dimension $s$ of the multinomial distribution is small or if the expected occupancy in each degree $l = 1, \ldots, s$ is at least 10.
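The interval formulae (13) and (14) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not code from the paper: the function names and the counts are hypothetical, the $\chi^2(1)$ quantile needed for Goodman's intervals is obtained from the standard normal quantile (a $\chi^2(1)$ variable is a squared standard normal variable), and the $\chi^2(s-1)$ quantile for the Quesenberry and Hurst intervals is supplied by the caller, here the table value 7.78 for $s = 5$ and $\gamma = 0.90$.

```python
from math import sqrt
from statistics import NormalDist


def chi2_1df_quantile(prob):
    """Quantile of chi-square(1), using chi2(1) = (standard normal)^2."""
    return NormalDist().inv_cdf((1.0 + prob) / 2.0) ** 2


def interval_13(p_hat, n, z):
    """Lower and upper limit of formula (13) for one empirical proportion."""
    half = sqrt(z * (z + 4.0 * n * p_hat * (1.0 - p_hat)))
    centre = z + 2.0 * n * p_hat
    return (centre - half) / (2.0 * (n + z)), (centre + half) / (2.0 * (n + z))


def quesenberry_hurst_intervals(counts, z):
    """Formula (13) with z = gamma quantile of chi2(s - 1), passed in by the caller."""
    n = sum(counts)
    return [interval_13(c / n, n, z) for c in counts]


def goodman_intervals(counts, gamma):
    """Formula (13) with z = (1 - (1 - gamma)/s) quantile of chi2(1)."""
    n, s = sum(counts), len(counts)
    z = chi2_1df_quantile(1.0 - (1.0 - gamma) / s)
    return [interval_13(c / n, n, z) for c in counts]


def fitzpatrick_scott_intervals(counts, c_gamma):
    """Formula (14); c_gamma is 1.00, 1.13 or 1.40 for gamma = 0.90, 0.95, 0.99."""
    n = sum(counts)
    return [(c / n - c_gamma / sqrt(n), c / n + c_gamma / sqrt(n)) for c in counts]


# Hypothetical counts for one item on a 5-degree Likert scale (N = 200).
counts = [12, 30, 58, 70, 30]
qh = quesenberry_hurst_intervals(counts, z=7.78)  # assumed table value, chi2(4) at 0.90
goodman = goodman_intervals(counts, gamma=0.90)
fs = fitzpatrick_scott_intervals(counts, c_gamma=1.00)

for degree, (a, b, c) in enumerate(zip(qh, goodman, fs), start=1):
    print(degree, [round(v, 3) for v in a + b + c])
```

Printing the three interval types side by side for each degree makes the width ordering discussed above visible for a given data set; the Fitzpatrick and Scott limits are not truncated to [0, 1] in this sketch.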
To apply formulae (13) and (14) for constructing simultaneous confidence intervals for choice probabilities in the setting introduced by Section 7, the estimators $\bar{Y}_{\cdot l}$ and the sample size $N$ have to be identified appropriately with parameters from formulae (10) and (11). The identifications can be found in Table I.

Table I. Interpretation of choice probabilities in formulae (5) through (9) for a homogeneous sample and corresponding estimators and sample sizes $N$ to be used in confidence interval formulae (13) and (14)

| Number $s$ of Likert Degrees | Parameter | Estimator | Sample Size |
|---|---|---|---|
| $s = r$ | $p_{jl}$, probability that a respondent chooses degree $l$ with respect to statement $j$ | $\bar{Y}_{\cdot l} = \bar{X}_{\cdot jl} = \frac{1}{n} \sum_{i=1}^{n} X_{ijl}$, sample average number of respondents choosing degree $l$ with respect to statement $j$ | $N = n$ |
| $s = r$ | $p_l^{(\rho)}$, average probability that a respondent chooses degree $l$ with respect to statements in dimension $\rho$ | $\bar{Y}_{\cdot l} = \bar{X}_{\cdot l}^{(\rho)} = \sum_{i=1}^{n} X_{il}^{(\rho)} / \big(n(m_{\rho+1} - m_\rho)\big)$, sample average number of respondents choosing degree $l$ with respect to statements in dimension $\rho$ | $N = n(m_{\rho+1} - m_\rho)$ |
| $s = r$ | $p_l$, average probability that a respondent chooses degree $l$ with respect to statements on the questionnaire | $\bar{Y}_{\cdot l} = \bar{X}_{\cdot l} = \frac{1}{nM} \sum_{i=1}^{n} X_{il}$, sample average number of respondents choosing degree $l$ with respect to statements on the questionnaire | $N = nM$ |
| $s = 2r - 1$ | $q_{jl}$, probability that a respondent chooses the gap degree $l$ with respect to statement $j$ | $\bar{Y}_{\cdot l} = \bar{Z}_{\cdot jl} = \frac{1}{n} \sum_{i=1}^{n} Z_{ijl}$, sample average number of respondents choosing gap degree $l$ with respect to statement $j$ | $N = n$ |
| $s = 2r - 1$ | $q_l^{(\rho)}$, average probability that a respondent chooses the gap degree $l$ with respect to statements in dimension $\rho$ | $\bar{Y}_{\cdot l} = \bar{Z}_{\cdot l}^{(\rho)} = \sum_{i=1}^{n} Z_{il}^{(\rho)} / \big(n(m_{\rho+1} - m_\rho)\big)$, sample average number of respondents choosing gap degree $l$ with respect to statements in dimension $\rho$ | $N = n(m_{\rho+1} - m_\rho)$ |
| $s = 2r - 1$ | $q_l$, average probability that a respondent chooses gap degree $l$ with respect to statements on the questionnaire | $\bar{Y}_{\cdot l} = \bar{Z}_{\cdot l} = \frac{1}{nM} \sum_{i=1}^{n} Z_{il}$, sample average number of respondents choosing the gap degree $l$ with respect to statements on the questionnaire | $N = nM$ |

9. Choice of Sample Size

A reasonable criterion for sample size selection is imposing an upper limit on the width of simultaneous confidence intervals, see Tortora (1978). As in Section 8 above, we consider an i.i.d. sample $\mathbf{Y}_1, \ldots, \mathbf{Y}_N$ of size $N$ from a multinomial distribution $\mathrm{MULT}(1, p_1, \ldots, p_s)$. The length of the intervals by Quesenberry and Hurst (1964) and Goodman (1965), see formula (13), is

$$UCL_l - LCL_l = \frac{\sqrt{z\,[z + 4N\bar{Y}_{\cdot l}(1 - \bar{Y}_{\cdot l})]}}{N + z}, \tag{15}$$

where $z$ is a suitable quantile of a $\chi^2$-distribution as discussed in Section 8. The interval length depends on the estimator $\bar{Y}_{\cdot l}$ for $p_l$ and attains a maximum at $\bar{Y}_{\cdot l} = 0.5$ with asymptotic length $\sqrt{z/N}$. Sample size is determined according to the following criterion: The confidence limits should not differ from the estimate by more than a prescribed amount $\varepsilon$, i.e., the length of the confidence interval for each $p_l$, $l = 1, \ldots, s$, should not exceed $2\varepsilon$. Equating the maximum asymptotic length $\sqrt{z/N}$ with $2\varepsilon$, we obtain

$$N = \frac{z}{4\varepsilon^2}. \tag{16}$$

To identify the parameters $s$, $p_1, \ldots, p_s$ and $N$ in the setting of Section 7, consider Table I. The following Example 9.1 shows that sample size calculations by formula (16) are very sensitive with respect to the type of confidence intervals. For a conservative assessment of confidence, the technique of Quesenberry and Hurst (1964) should be used.

9.1. Example

Consider a SERVQUAL survey based on a 7-grade Likert scale. Interest is in estimating the choice probabilities $q_{j,-6}, \ldots, q_{j,6}$ for gap degrees in each item. Hence $s = 13$, $N = n$. Estimates should be accurate up to $\pm 10\%$ at 90% confidence. Under the more conservative confidence intervals of Quesenberry and Hurst (1964) we have to use the quantile $z = z_{12}(0.90) = 18.55$ of the $\chi^2$-distribution $\chi^2(12)$ with 12 degrees of freedom.
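As a numerical check on formula (16) with the quantile just quoted, the short Python sketch below evaluates $N = z/(4\varepsilon^2)$ before rounding. The function names are hypothetical. For comparison it also evaluates the rule with the Goodman (1965) quantile from Section 8 for $s = 13$, computed here from the standard normal quantile on the assumption that a $\chi^2(1)$ variable is a squared standard normal variable.

```python
from statistics import NormalDist


def sample_size(z, eps):
    """Formula (16): N = z / (4 * eps^2), returned unrounded."""
    return z / (4.0 * eps ** 2)


def goodman_quantile(gamma, s):
    """(1 - (1 - gamma)/s) quantile of chi2(1), via the normal quantile."""
    prob = 1.0 - (1.0 - gamma) / s
    return NormalDist().inv_cdf((1.0 + prob) / 2.0) ** 2


eps = 0.10    # required accuracy of plus or minus 10 percentage points
z_qh = 18.55  # chi2(12) quantile at 0.90, the value quoted in the example

n_qh = sample_size(z_qh, eps)  # about 463.75 before rounding
z_goodman = goodman_quantile(0.90, 13)
n_goodman = sample_size(z_goodman, eps)

print(round(n_qh, 2), round(z_goodman, 2), round(n_goodman, 1))
```

The sketch leaves the rounding convention to the reader, since the required sample size depends on how the quantile itself has been rounded.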
Hence from formula (16) we obtain sample size $N = 464$. Under the looser confidence intervals of Goodman (1965) we have to use the quantile $z = z_1(0.9923) = 7.10$ of the $\chi^2$-distribution $\chi^2(1)$ with 1 degree of freedom. Hence from formula (16) we obtain sample size $N = 177$.

10. Testing the Equality of Choice Probabilities in a Homogeneous Population

As in Section 8 we consider the survey as an independent sample from a homogeneous population of respondents. The estimates together with simultaneous confidence intervals for the choice probabilities appearing in formulae (10) and (11) provide a good insight into the attitudes of individuals in the population. Descriptive statistics like histograms, bar charts, pie charts, Pareto charts should be used for presentation. In the present and the subsequent Section 11 we present methods of statistical inference for the comparison of choice probabilities: Tests of significance for the equality of choice probabilities, see below, and tests of significance for rank orders of choice probabilities, see Section 11.

The tests are based on an i.i.d. sample $\mathbf{Y}_1, \ldots, \mathbf{Y}_n$ of size $n$ from a multinomial distribution $\mathrm{MULT}(M, p_1, \ldots, p_s)$. The sum $\mathbf{Y} = \mathbf{Y}_1 + \cdots + \mathbf{Y}_n$ is a sufficient statistic for the probabilities $p_1, \ldots, p_s$, see Lehmann (1983). Hence testing can be based on $\mathbf{Y}$, which has multinomial distribution $\mathrm{MULT}(nM, p_1, \ldots, p_s)$, see the reproduction theorem for the multinomial distribution in Appendix A.1. The identification of the quantities $s$, $p_1, \ldots, p_s$, $\mathbf{Y}$ of the general scheme with the appropriate quantities in the special settings of Section 7 is given in Table II.

We want to find out whether respondents prefer certain attitudes or whether all among a given number of pairwise different attitudes $i_1, \ldots, i_t$ have the same probability of being chosen. To this end we consider the equality hypothesis

$H: p_{i_1} = \cdots = p_{i_t}.$
(17)

Similar to Fisher's well-known test for comparing binomial probabilities, a reasonable test of (17) compares the results $Y_{i_1}, \ldots, Y_{i_t}$ relative to the total number $y = Y_{i_1} + \cdots + Y_{i_t}$ of observed choices in degrees $i_1, \ldots, i_t$. According to assertion (c) of Theorem A.2.1 in Appendix A.2, the conditional distribution of $Y_{i_1}, \ldots, Y_{i_t}$ under the condition $y = Y_{i_1} + \cdots + Y_{i_t}$ is the multinomial distribution $\mathrm{MULT}(y, \pi_1, \ldots, \pi_t)$ where $\pi_l = p_{i_l}/(p_{i_1} + \cdots + p_{i_t})$. The equality hypothesis $H$ is equivalent to $H: \pi_1 = \cdots = \pi_t = 1/t$. A reasonable test statistic should be a measure of variation of the responses $Y_{i_1}, \ldots, Y_{i_t}$. Light and Margolin (1971) suggest the variation measure

$$V = V(Y_{i_1}, \ldots, Y_{i_t}) = \frac{y}{2} - \frac{1}{2y} \sum_{l=1}^{t} Y_{i_l}^2, \tag{18}$$

which is derived from Gini's (1955) well-known variation measure for categorical data. The equality hypothesis $H: \pi_1 = \cdots = \pi_t = 1/t$ is rejected if the sample variation is too large, i.e., if $V \ge c$. The p-value of this test under sample realizations $Y_{i_1} = y_1, \ldots, Y_{i_t} = y_t$, $y_1 + \cdots + y_t = y$, is given by

$$\frac{1}{t^{y}} \sum_{\substack{x_1, \ldots, x_t \ge 0 \\ x_1 + \cdots + x_t = y \\ V(x_1, \ldots, x_t) \ge V(y_1, \ldots, y_t)}} \frac{y!}{x_1! \cdots x_t!}. \tag{19}$$

Further research on simplifying approximations of expression (19) is necessary.

Table II. Assignment of choice probabilities and statistics from formulae (5) through (9) to the general testing schemes of Sections 10 and 11

| Number $s$ of Likert Degrees | Compared Probabilities | Test Statistic |
|---|---|---|
| $s = r$ | $p_{j1}, \ldots, p_{jr}$, probabilities that a respondent chooses degree $1, \ldots, r$ with respect to statement $j$ | $\mathbf{Y} = \sum_{i=1}^{n} \mathbf{X}_{ij}$, vector of total numbers of respondents choosing degrees $1, \ldots, r$ with respect to statement $j$ |
| $s = r$ | $p_1^{(\rho)}, \ldots, p_r^{(\rho)}$, average probabilities that a respondent chooses degrees $1, \ldots, r$ with respect to statements in dimension $\rho$ | $\mathbf{Y} = \sum_{i=1}^{n} \mathbf{X}_i^{(\rho)}$, vector of total numbers of respondents choosing degree $1, \ldots, r$ with respect to statements in dimension $\rho$ |
| $s = r$ | $p_1, \ldots, p_r$, average probabilities that a respondent chooses degrees $1, \ldots, r$ with respect to statements on the questionnaire | $\mathbf{Y} = \sum_{i=1}^{n} \mathbf{X}_i$, vector of total numbers of respondents choosing degrees $1, \ldots, r$ with respect to statements on the questionnaire |
| $s = 2r - 1$ | $q_{j,-(r-1)}, \ldots, q_{j,r-1}$, probabilities that a respondent chooses gap degree $-(r-1), \ldots, r-1$ with respect to statement $j$ | $\mathbf{Y} = \sum_{i=1}^{n} \mathbf{Z}_{ij}$, vector of total numbers of respondents choosing gap degrees $-(r-1), \ldots, r-1$ with respect to statement $j$ |
| $s = 2r - 1$ | $q_{-(r-1)}^{(\rho)}, \ldots, q_{r-1}^{(\rho)}$, average probabilities that a respondent chooses gap degrees $-(r-1), \ldots, r-1$ with respect to statements in dimension $\rho$ | $\mathbf{Y} = \sum_{i=1}^{n} \mathbf{Z}_i^{(\rho)}$, vector of total numbers of respondents choosing gap degree $-(r-1), \ldots, r-1$ with respect to statements in dimension $\rho$ |
| $s = 2r - 1$ | $q_{-(r-1)}, \ldots, q_{r-1}$, average probabilities that a respondent chooses gap degrees $-(r-1), \ldots, r-1$ with respect to statements on the questionnaire | $\mathbf{Y} = \sum_{i=1}^{n} \mathbf{Z}_i$, vector of total numbers of respondents choosing gap degrees $-(r-1), \ldots, r-1$ with respect to statements on the questionnaire |

11. Ranking of Choice Probabilities in a Homogeneous Population

If choice probabilities are apparently not identical, major interest is in a hypothesis on the rank order of the choice probabilities. Such a hypothesis may be formulated from the rank order of the empirical proportions in the sample as expressed by a Pareto chart. Methods for confirming this hypothesis are required. Simultaneous confidence regions are of no help for this purpose. In the sequel, we develop a method of testing the ranking hypothesis by multiple comparisons as used in comparative treatment analysis, see Hsu (1996) for instance.

Consider the comparison of choice probabilities $p_1, \ldots, p_s$, $\sum_l p_l = 1$.
We wish to confirm the composite rank order hypothesis

$$K: p_{i_1} > p_{j_1}, \ldots, p_{i_t} > p_{j_t} \tag{20}$$

where $i_m \ne j_m$. To this end, we consider the negation $H = \neg K$ as the null hypothesis. $K$ is confirmed by a significance test if $H$ can be rejected. $H$ is the disjunction $H = H_1 \cup \cdots \cup H_t$ of the null hypotheses in the $t$ partial testing problems $H_1: p_{i_1} \le p_{j_1}$ against $K_1: p_{i_1} > p_{j_1}$, $H_2: p_{i_2} \le p_{j_2}$ against $K_2: p_{i_2} > p_{j_2}$, and so on until $H_t: p_{i_t} \le p_{j_t}$ against $K_t: p_{i_t} > p_{j_t}$. We test $H$ against $K$ by successively testing $H_m$ against $K_m$. $H$ is rejected in favour of $K$ if each $H_m$ is rejected in favour of $K_m$. In the following Section 11.1 we describe the design of partial tests under a prescribed level of significance. The subsequent Section 11.2 considers the test of the composite hypothesis $H$ against $K$.

11.1. Design of Partial Tests for Rank Order

Consider the partial problem

$$H_m: p_{i_m} \le p_{j_m} \quad \text{against} \quad K_m: p_{i_m} > p_{j_m}, \tag{21}$$

$m \in \{1, \ldots, t\}$, where $p_{i_m} + p_{j_m} > 0$. An equivalent formulation of problem (21) is

$$H_m: \pi_m \le 0.5 \quad \text{against} \quad K_m: \pi_m > 0.5, \quad \text{where } \pi_m = \frac{p_{i_m}}{p_{i_m} + p_{j_m}}. \tag{22}$$

Similar to Fisher's well-known test for comparing binomial probabilities, a reasonable test of (21) compares the results $Y_{i_m}, Y_{j_m}$ relative to the total number $y = Y_{i_m} + Y_{j_m}$ of observed choices in degrees $i_m$ and $j_m$. $H_m$ is rejected in favour of $K_m$ if $Y_{i_m} > c$, i.e., if $Y_{i_m}$ is too large a share of $y$. We have to determine the critical value $c = c_{\alpha_0} \in \{0, \ldots, y\}$ under a prescribed level of significance $0 < \alpha_0 < 1$.

Clearly in case of $0 = y = Y_{i_m} + Y_{j_m}$, $H_m$ cannot be rejected under any level $\alpha_0$, so formally $c = c_{\alpha_0} = 0 = y$ in case of $y = 0$. Consider the case $0 < y = Y_{i_m} + Y_{j_m}$. According to assertion (d) of Theorem A.2.1 in Appendix A.2, the conditional distribution of $Y_{i_m}$ under the condition $Y_{i_m} + Y_{j_m} = y$ is the binomial distribution $Bi(y, \pi_m)$. Hence (22) can be tested by the well-known test for binomial probabilities.
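A minimal Python sketch of this conditional binomial test, assuming only what is stated above: conditional on $y = Y_{i_m} + Y_{j_m}$, the count $Y_{i_m}$ is $Bi(y, 0.5)$ on the boundary of $H_m$, and $H_m$ is rejected if $Y_{i_m} > c$. The function names and the counts 15 and 5 are hypothetical illustrations, not part of the paper.

```python
from math import comb


def upper_tail(y, k):
    """P(Y >= k) for Y ~ Bi(y, 0.5)."""
    return sum(comb(y, l) for l in range(max(k, 0), y + 1)) / 2 ** y


def critical_value(y, alpha0):
    """Smallest c in {0, ..., y} with P(Y > c) <= alpha0 under Bi(y, 0.5)."""
    for c in range(y + 1):
        if upper_tail(y, c + 1) <= alpha0:
            return c
    return y  # not reached: P(Y > y) = 0 always satisfies the bound


def partial_p_value(y_i, y_j):
    """P(Y >= y_i) for Y ~ Bi(y_i + y_j, 0.5): p-value of the one-sided test."""
    return upper_tail(y_i + y_j, y_i)


# Hypothetical counts: degree i_m chosen 15 times, degree j_m chosen 5 times.
y_i, y_j = 15, 5
c = critical_value(y_i + y_j, alpha0=0.05)
reject = y_i > c
print(c, reject, round(partial_p_value(y_i, y_j), 4))
```

For $y = 0$ the function returns $c = 0$, so the rule $Y_{i_m} > c$ never rejects, in line with the convention stated above.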
The critical value $c = c_{\alpha_0}$ is determined as the minimum integer $c \in \{0, \ldots, y\}$ satisfying the inequalities

$$1 - L_{y,c}(0.5) = 0.5^{y} \sum_{l=c+1}^{y} \binom{y}{l} \overset{!}{\le} \alpha_0 \overset{!}{<} 0.5^{y} \sum_{l=c}^{y} \binom{y}{l} = 1 - L_{y,c-1}(0.5), \tag{23}$$

where $L_{y,c}(0.5)$ is the distribution function of the binomial distribution $Bi(y, 0.5)$ evaluated at $c$. These values are available in tables and are provided by any modern statistical software package.

11.2. Design of the Test for the Composite Rank Order Hypothesis

Let the significance level $0 < \alpha < 1$ be prescribed for a test of the composite hypothesis $H$ against $K$. This level can be guaranteed by prescribing $\alpha_0 = \alpha/t$ for each of the $t$ partial problems $H_m$ against $K_m$. Let $R_m$ denote the event that the $m$-th partial test, $m \in \{1, \ldots, t\}$, rejects $H_m$ in favour of $K_m$, and let $R$ denote the event that $H$ is rejected in favour of $K$. Then $P(R_m \mid H_m) \le \alpha/t$ by the design of the partial test and hence by the well-known Bonferroni inequality

$$P(R \mid H) = P(R_1 \cup \cdots \cup R_t \mid H) \le \sum_{m=1}^{t} P(R_m \mid H) = \sum_{m=1}^{t} P(R_m \mid H_m) \le \alpha. \tag{24}$$

11.3. The p-Value of the Tests for Rank Order

Under small $y = Y_{i_m} + Y_{j_m}$ it may be impossible to satisfy (23) with $\alpha_0 = \alpha/t$, i.e., to guarantee a partial test of prescribed significance level $\alpha_0 = \alpha/t$. Hence it may be more adequate to consider the p-value for rejecting $H = H_1 \cup \cdots \cup H_t$ under sample realizations $Y_1 = y_1, \ldots, Y_s = y_s$. The p-value of the partial test of $H_m$ against $K_m$ with the rejection region of type $Y_{i_m} > c$ as described in Section 11.1 is

$$1 - L_{y_{i_m} + y_{j_m},\, y_{i_m} - 1}(0.5) = 0.5^{\,y_{i_m} + y_{j_m}} \sum_{l = y_{i_m}}^{y_{i_m} + y_{j_m}} \binom{y_{i_m} + y_{j_m}}{l}. \tag{25}$$

By the Bonferroni inequality, an upper bound for the p-value of the test of the composite hypothesis $H$ is

$$\sum_{m=1}^{t} \left[ 1 - L_{y_{i_m} + y_{j_m},\, y_{i_m} - 1}(0.5) \right] = \sum_{m=1}^{t} 0.5^{\,y_{i_m} + y_{j_m}} \sum_{l = y_{i_m}}^{y_{i_m} + y_{j_m}} \binom{y_{i_m} + y_{j_m}}{l}. \tag{26}$$

11.4. Comparison of Choice Probabilities by Simultaneous Confidence Intervals

Choice probabilities $p_{i_m}, p_{j_m}$ may also be compared by simultaneous confidence intervals for the difference $p_{i_m} - p_{j_m}$ or the ratio $p_{i_m}/p_{j_m}$.
Such simultaneous confidence intervals are provided by Goodman (1965).

12. In-Questionnaire Association

An important topic in survey data analysis is in-questionnaire association or dependence, i.e., association or dependence between responses on certain items or dimensions or the entire questionnaire. Do responses tend to be similar or do they diverge? Particularly important is item-to-total association, i.e., the relationship between responses on an individual item and responses on the entire questionnaire. A detailed analysis goes beyond the scope of the present paper. Important ordinal measures of association are Spearman's well-known rank correlation coefficient, the gamma statistic by Goodman and Kruskal (1954), Kendall's (1945) tau-b, and Somers' (1962) d statistic. These measures should be investigated under the multinomial response model to develop efficient estimation and testing procedures.

13. Comparison of Vectors of Choice Probabilities

Sections 10 and 11 compare the choice probabilities in a given multinomial parameter vector. A further important topic is the comparison of entire parameter vectors. Two topics are interesting:

• Comparison of parameter vectors corresponding to questionnaire items or questionnaire dimensions or the entire questionnaire. This topic is related to in-questionnaire association, see Section 12 above.

• Comparison of parameter vectors corresponding to different respondents. Here, we question the assumption of a homogeneous population of respondents made throughout Sections 8–11, see assumption (A1) of Section 7.

Methods for comparing multinomial parameter vectors are available in the literature. A simple approach is to use Pearson's $\chi^2$-test, see Clason and Dormody (1994). Light and Margolin (1971) and Margolin and Light (1974) present an ANOVA scheme which tests the hypothesis of equality of $m$ multinomial probability vectors.
A survey of measures of agreement between respondents is provided by Adejumo et al. (2004).

In practice, populations of respondents are often inhomogeneous, i.e., the hypothesis of equality of the $n$ multinomial probability vectors of $n$ respondents will often be rejected. Groups (clusters) of customers, consumers, patients, social classes, age brackets, genders, may differ substantially in their attitudes. Some distinguishing factors may be quite obvious and known beforehand so that stratified surveys may be conducted. In other cases, however, stratified sampling is impossible. Major obstacles are: (1) Unknown factors. (2) Known factors, but unknown distribution of factors in the population. (3) Practical infeasibility of stratification, e.g., due to economic restrictions. In such cases, groups of substantially different respondents have to be identified from survey data.

Standard model-free clustering algorithms based on distance measures can contribute to clustering in attitude surveys. However, model-based clustering is generally more efficient. By means of assumption (A2), the multinomial model of Section 7 can describe clusters as groups sharing the same vector of choice probabilities. Recently, advances in clustering multinomial samples have been made in genetics, see Medvedovic et al. (2000). This type of probabilistic clustering has the potential to be more efficient than model-free techniques. The description of multinomial clustering goes beyond the scope of the present paper. However, in view of the potential of such methods, business statistics should reflect and adopt such approaches.

14. Conclusion

The problem of attitude measuring suggests an ordinal interpretation of the Likert scale. The above Sections 7–13 show that plenty of proper ordinal methods exist for the analysis of data measured in Likert scales. However, such methods are not as easily available in textbooks and statistical packages as cardinal statistics.
Some new methods were introduced in Sections 8, 10 and 11. Further work should concentrate on developing convenient and customized versions of ordinal statistics and on propagating these among researchers and practitioners.

Appendix A. Properties of Multinomial Random Vectors

A.1. Reproduction Theorem

Sums of independent multinomial random vectors with identical vectors of choice probabilities follow a multinomial distribution:

THEOREM A.1.1. Let $\mathbf{Y}_1, \ldots, \mathbf{Y}_n$ be $s$-dimensional independent random vectors, each with multinomial distribution $\mathrm{MULT}(k_i, p_1, \ldots, p_s)$. Then the sum $\sum_{i=1}^{n} \mathbf{Y}_i$ has multinomial distribution $\mathrm{MULT}\big(\sum_{i=1}^{n} k_i, p_1, \ldots, p_s\big)$.

For a proof of Theorem A.1.1 see Wilks (1962).

A.2. Marginal and Conditional Distributions

The following Theorem A.2.1 gives marginal and conditional distributions in a multinomial vector.

THEOREM A.2.1. Let $\mathbf{Y} = (Y_1, \ldots, Y_r)$ be an $r$-dimensional random vector with multinomial distribution $\mathrm{MULT}(N, p_1, \ldots, p_r)$. Let $1 \le i_1 < \cdots < i_m \le r$. Then we have the following results:

(a) For $y_{i_1}, \ldots, y_{i_m} \ge 0$, $y_{i_1} + \cdots + y_{i_m} = y \le N$ we have

$$P(Y_{i_1} = y_{i_1}, \ldots, Y_{i_m} = y_{i_m}) = \frac{N!}{y_{i_1}! \cdots y_{i_m}! \, (N - y)!} \, p_{i_1}^{y_{i_1}} \cdots p_{i_m}^{y_{i_m}} (1 - p_{i_1} - \cdots - p_{i_m})^{N - y}. \tag{A.1}$$

(b) The sum $Y_{i_1} + \cdots + Y_{i_m}$ follows the binomial distribution $Bi(N, p_{i_1} + \cdots + p_{i_m})$.

(c) Let $1 \le y \le N$. Then the conditional distribution of the $m$-dimensional random vector $(Y_{i_1}, \ldots, Y_{i_m})$ under the condition $Y_{i_1} + \cdots + Y_{i_m} = y$ is the multinomial distribution $\mathrm{MULT}(y, \pi_1, \ldots, \pi_m)$ where

$$\pi_l = \frac{p_{i_l}}{p_{i_1} + \cdots + p_{i_m}} \quad \text{for } l = 1, \ldots, m. \tag{A.2}$$

(d) Let $1 \le y \le N$, $m = 2$. Then the conditional distribution of the univariate random variable $Y_{i_1}$ under the condition $Y_{i_1} + Y_{i_2} = y$ is the binomial distribution $Bi\big(y, p_{i_1}/(p_{i_1} + p_{i_2})\big)$.

The proof of Theorem A.2.1 makes use of the well-known theorem on multinomial expansions:

$$(p_{j_1} + \cdots + p_{j_k})^{z} = \sum_{\substack{y_{j_1}, \ldots, y_{j_k} \ge 0 \\ y_{j_1} + \cdots + y_{j_k} = z}} \frac{z!}{y_{j_1}! \cdots y_{j_k}!} \, p_{j_1}^{y_{j_1}} \cdots p_{j_k}^{y_{j_k}}. \tag{A.3}$$

Hence formula (A.1) in assertion (a) of Theorem A.2.1 is obtained by calculating

$$P(Y_{i_1} = y_{i_1}, \ldots, Y_{i_m} = y_{i_m}) = \frac{N!}{y_{i_1}! \cdots y_{i_m}! \, (N - y)!} \, p_{i_1}^{y_{i_1}} \cdots p_{i_m}^{y_{i_m}} \sum_{\substack{y_{j_1}, \ldots, y_{j_{r-m}} \ge 0 \\ y_{j_1} + \cdots + y_{j_{r-m}} = N - y}} \frac{(N - y)!}{y_{j_1}! \cdots y_{j_{r-m}}!} \, p_{j_1}^{y_{j_1}} \cdots p_{j_{r-m}}^{y_{j_{r-m}}},$$

where $j_1, \ldots, j_{r-m}$ denote the indices in $\{1, \ldots, r\}$ that are not among $i_1, \ldots, i_m$; by (A.3) the sum equals $(1 - p_{i_1} - \cdots - p_{i_m})^{N - y}$.

For the proof of assertion (b) let $0 \le y \le N$. Using formula (A.3) on multinomial expansion we obtain

$$P(Y_{i_1} + \cdots + Y_{i_m} = y) = \sum_{\substack{y_{i_1}, \ldots, y_{i_m} \ge 0 \\ y_{i_1} + \cdots + y_{i_m} = y}} \frac{N!}{y_{i_1}! \cdots y_{i_m}! \, (N - y)!} \, p_{i_1}^{y_{i_1}} \cdots p_{i_m}^{y_{i_m}} (1 - p_{i_1} - \cdots - p_{i_m})^{N - y}$$

$$= \binom{N}{y} (1 - p_{i_1} - \cdots - p_{i_m})^{N - y} \sum_{\substack{y_{i_1}, \ldots, y_{i_m} \ge 0 \\ y_{i_1} + \cdots + y_{i_m} = y}} \frac{y!}{y_{i_1}! \cdots y_{i_m}!} \, p_{i_1}^{y_{i_1}} \cdots p_{i_m}^{y_{i_m}}$$

$$= \binom{N}{y} (p_{i_1} + \cdots + p_{i_m})^{y} (1 - p_{i_1} - \cdots - p_{i_m})^{N - y}.$$

This proves assertion (b).

Investigating the conditional distribution considered in assertion (c) we obtain for $y_{i_1}, \ldots, y_{i_m} \ge 0$, $y_{i_1} + \cdots + y_{i_m} = y$:

$$P(Y_{i_1} = y_{i_1}, \ldots, Y_{i_m} = y_{i_m} \mid Y_{i_1} + \cdots + Y_{i_m} = y) = \frac{P(Y_{i_1} = y_{i_1}, \ldots, Y_{i_m} = y_{i_m})}{P(Y_{i_1} + \cdots + Y_{i_m} = y)}$$

$$= \frac{\dfrac{N!}{y_{i_1}! \cdots y_{i_m}! \, (N - y)!} \, p_{i_1}^{y_{i_1}} \cdots p_{i_m}^{y_{i_m}} (1 - p_{i_1} - \cdots - p_{i_m})^{N - y}}{\dbinom{N}{y} (p_{i_1} + \cdots + p_{i_m})^{y} (1 - p_{i_1} - \cdots - p_{i_m})^{N - y}}$$

$$= \frac{y!}{y_{i_1}! \cdots y_{i_m}!} \left( \frac{p_{i_1}}{p_{i_1} + \cdots + p_{i_m}} \right)^{y_{i_1}} \cdots \left( \frac{p_{i_m}}{p_{i_1} + \cdots + p_{i_m}} \right)^{y_{i_m}}.$$

Acknowledgements

This paper is supported by funding from the "Growth" program of the European Community and was prepared in collaboration by member organizations of the Thematic Network Pro-ENBIS, EC contract number G6RT-CT-2001-05059.

References

Adams, E., Fagot, R. F. & Robinson, R. E. (1965). A theory of appropriate statistics. Psychometrika 30 (2): 99–127.
Adejumo, A. O., Heumann, C. & Toutenburg, H. (2004). A review of agreement measure as a subset of association measure between raters. SFB386-Discussion Paper 385, Ludwig-Maximilians-Universität, München, Germany.
Andrews, F. M., Klem, L., Davidson, T. N., O'Malley, P. M. & Rodgers, W. L. (1981).
A Guide for Selecting Statistical Techniques for Analyzing Social Science Data. Ann Arbor: Institute for Social Research, University of Michigan.
Asubonteng, A., McClearly, K. J. & Swan, J. E. (1996). SERVQUAL revisited: A critical review of service quality. J. Service. Market. 10 (6): 62–81.
Bailey, B. J. R. (1980). Large sample simultaneous confidence intervals for the multinomial probabilities based on transformations of the cell frequencies. Technometrics 22 (4): 583–589.
Baker, B. O., Hardyck, C. D. & Petrinovich, L. F. (1986). Weak measurements versus strong statistics: An empirical critique of S. S. Stevens' proscriptions on statistics. Education. Psychol. Measure. 26: 291–309.
Clason, D. L. & Dormody, T. J. (1994). Analyzing data measured by individual Likert-type items. J. Agric. Education 35 (4): 31–35.
Diener, E. (1984). Subjective well-being. Psychol. Bull. 95: 542–575.
Diener, E., Emmons, R. A., Larsen, R. J. & Griffin, S. (1985). The satisfaction with life scale. J. Personal. Assess. 49 (1): 71–75.
Diener, E., Suh, E. M., Lucas, R. E. & Smith, H. L. (1999). Subjective well-being: Three decades of progress. Psychol. Bull. 125 (2): 276–302.
Ferrer-I-Carbonell, A. & van Praag, B. M. S. (2003). Income satisfaction inequality and its causes. J. Econ. Inequality 1: 107–127.
Fitzpatrick, S. & Scott, A. (1987). Quick simultaneous confidence intervals for multinomial proportions. J. Am. Stat. Assoc. 82: 399.
Gini, C. W. (1955). Variabilità e Concentrazione. Vol. 1: Memorie di metodologia statistica.
Georgescu-Roegen, N. (1968). Utility. In International Encyclopedia of Social Sciences, Vol. 16, New York: MacMillan Co. & The Free Press, pp. 236–267.
Goodman, L. A. (1965). On simultaneous confidence intervals for multinomial proportions. Technometrics 7 (2): 247–254.
Goodman, L. A. & Kruskal, W. H. (1954). Measures of association for cross classifications. J. Am. Stat. Assoc. 49: 732–768.
Hart, M. C.
(1996). Improving the discrimination of SERVQUAL by using magnitude scaling. In G. K. Kanji (ed.), Total Quality Management in Action. London: Chapman & Hall.
Hart, M. C. (1999). The quantification of patient satisfaction. In H. T. O. Davies, M. Tavakoli, M. Malek & A. Neilson (eds.), Managing Quality: Strategic Issues in Health Care Management. Aldershot: Ashgate.
Hsu, J. C. (1996). Multiple Comparisons. Theory and Methods. Boca Raton, London: Chapman & Hall.
Kendall, M. G. (1945). The treatment of ties in ranking problems. Biometrika 33: 239–251.
Lehmann, E. L. (1959). Testing Statistical Hypotheses. New York: John Wiley & Sons.
Lehmann, E. L. (1983). Theory of Point Estimation. New York: John Wiley & Sons.
Light, J. & Margolin, B. H. (1971). An analysis of variance for categorical data. J. Am. Stat. Assoc. 66 (335): 534–544.
Likert, R. (1932). A technique for the measurement of attitudes. J. Social. Psychol. 5: 228–238.
Lodge, M. (1981). Magnitude scaling. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07-025, Beverly Hills and London: Sage Publications.
Lord, F. M. (1953). On the statistical treatment of football numbers. Am. Psychol. 8: 750–751.
Lucas, R. E., Clark, A. E., Georgellis, Y. & Diener, E. (2003). Reexamining adaptation and the set point model of happiness: Reactions to changes in marital status. J. Personal. Social Psychol. 84 (3): 527–539.
Luce, R. D. (1959). On the possible psychophysical laws. Psychol. Rev. 66: 81–95.
Luce, R. D., Krantz, D. H., Suppes, P. & Tversky, A. (1990). Foundations of Measurement. Vol. III. New York: Academic Press.
Maravelakis, P. E., Perakis, M., Psarakis, S. & Panaretos, J. (2003). The use of indices in surveys. Qual. Quant. 37: 1–19.
Margolin, B. H. & Light, J. (1974). An analysis of variance for categorical data, II. J. Am. Stat. Assoc. 69 (347): 755–764.
May, W. L. & Johnson, W. D. (1997). Properties of simultaneous confidence intervals for multinomial proportions. Commun. Stat.
Simulat. Comput. 26 (2): 495–518.
Medvedovic, M., Succop, P., Shukla, R. & Dixon, K. (2000). Clustering mutational spectra via classification likelihood and Markov chain Monte Carlo algorithms. J. Agric. Biol. Environ. Stat. 6 (1): 19–37.
Nolte, E. & McKee, M. (2004). Changing health inequalities in east and west Germany since unification. Social Sci. Med. 58 (1): 119–136.
Parasuraman, A., Berry, L. L. & Zeithaml, V. A. (1991). Refinement and assessment of the SERVQUAL. J. Retail. 67 (4): 420–449.
Parasuraman, A., Zeithaml, V. A. & Berry, L. L. (1985). A conceptual model for service quality and its implication for future research. J. Market. 41–50.
Parasuraman, A., Zeithaml, V. A. & Berry, L. L. (1988). SERVQUAL: A multiple-item scale for measuring consumer perceptions of service quality. J. Retail. 64 (1): 12–40.
Powers, D. & Xie, Y. (1999). Statistical Methods for Categorical Data Analysis. Academic Press, Inc.
Quesenberry, C. P. & Hurst, D. C. (1964). Large sample simultaneous confidence intervals for multinomial proportions. Technometrics 6 (2): 191–195.
Ronellenfitsch, U. & Razum, R. (2004). Deteriorating health satisfaction among immigrants from eastern Europe to Germany. Int. J. Equity Health 3 (1): 4.
Savage, I. R. (1957). Nonparametric statistics. J. Am. Stat. Assoc. 52: 331–334.
Sen, A. (1999). The possibility of social choice. Am. Econ. Rev. 89: 349–378.
Sison, C. P. & Glaz, J. (1995). Simultaneous confidence intervals and sample size determination for multinomial proportions. J. Am. Stat. Assoc. 90 (429): 366–369.
Somers, R. H. (1962). A new asymmetric measure of association of ordinal variables. Am. Sociol. Rev. 27: 799–811.
Stevens, S. S. (1946). On the theory of scales of measurements. Science 103: 677–680.
Stevens, S. S. (1951). Mathematics, measurement and psychophysics. In S. S. Stevens (ed.), Handbook of Experimental Psychology. New York: John Wiley & Sons, pp. 1–49.
Tortora, R. D. (1978).
A note on sample size estimation for multinomial populations. Am. Stat. 32 (3): 100–102.
Townsend, J. T. & Ashby, F. G. (1984). Measurement scales and statistics: The misconception misconceived. Psychol. Bull. 96 (2): 394–401.
Tukey, J. W. (1961). Data analysis and behavioral science or learning to bear the quantitative burden by shunning badmandments. In L. V. Jones (ed.), The Collected Works of John W. Tukey, Vol. III. Belmont: Wadsworth, pp. 391–484.
van Praag, B. M. S. (1999). Ordinal and cardinal utility. J. Economet. 50: 69–89.
Velleman, P. F. & Wilkinson, L. (1993). Nominal, ordinal, interval, and ratio typologies are misleading. Am. Stat. 47 (1): 65–72.
Watson, D., Clark, L. & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. J. Personal. Social Psychol. 54 (6): 1063–1070.
Wilks, S. S. (1962). Mathematical Statistics. New York: John Wiley & Sons.
Wright, D. B. (1997). Football standings and measurement levels. Statistician 46 (1): 105–110.
