Statistics Concise Tutorial
Statistics - Cohen’s kappa coefficient
Cohen’s kappa coefficient is a statistic that measures inter-rater agreement for qualitative (categorical) items. It is generally considered a more robust measure than a simple percent-agreement calculation, because ${k}$ takes into account the agreement that occurs by chance. Cohen’s kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories.
Cohen’s kappa coefficient is defined by the following formula:
Formula
${k} = \frac{{p_0} - {p_e}}{1 - {p_e}} = 1 - \frac{1 - {p_0}}{1 - {p_e}}$
Where:
- ${p_0}$ = relative observed agreement among raters.
- ${p_e}$ = the hypothetical probability of chance agreement.
Both ${p_0}$ and ${p_e}$ are computed from the observed data. For ${p_e}$, each rater’s observed proportions are used as the probability that the rater randomly assigns each category. If the raters are in complete agreement, then ${k}$ = 1. If there is no agreement among the raters other than what would be expected by chance (as given by ${p_e}$), then ${k}$ = 0, and agreement worse than chance gives ${k}$ < 0.
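The boundary cases described above can be checked with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `cohen_kappa` and its argument names are chosen here for illustration, and the function assumes that ${p_0}$ and ${p_e}$ have already been computed as proportions between 0 and 1.

```python
def cohen_kappa(p_o: float, p_e: float) -> float:
    """Return Cohen's kappa from observed agreement p_o and chance agreement p_e."""
    if not (0.0 <= p_o <= 1.0 and 0.0 <= p_e <= 1.0):
        raise ValueError("p_o and p_e must be proportions in [0, 1]")
    if p_e == 1.0:
        # Both raters put every item in the same single category,
        # so the denominator 1 - p_e is zero and kappa is undefined.
        raise ValueError("kappa is undefined when p_e == 1")
    return (p_o - p_e) / (1.0 - p_e)


perfect = cohen_kappa(1.0, 0.5)       # complete agreement
chance_only = cohen_kappa(0.5, 0.5)   # agreement no better than chance
below_chance = cohen_kappa(0.4, 0.5)  # agreement worse than chance
print(perfect, chance_only, below_chance)
```

The three calls correspond to ${k}$ = 1, ${k}$ = 0, and a negative ${k}$ respectively.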
Example
Problem Statement:
Suppose that you are analyzing data on a group of 50 people applying for a grant. Each grant proposal was read by two readers, A and B, and each reader said either "Yes" or "No" to the proposal. Suppose the counts were as follows, where rows are reader A’s decisions and columns are reader B’s decisions. The main diagonal (top left to bottom right) holds the counts of agreements, and the other diagonal holds the counts of disagreements:
| | B: Yes | B: No |
|---|---|---|
| A: Yes | 20 | 5 |
| A: No | 10 | 15 |
Calculate Cohen’s kappa coefficient.
Solution:
Note that 20 proposals were granted by both reader A and reader B, and 15 proposals were rejected by both readers. Thus, the observed proportionate agreement is:
${p_0} = \frac{20 + 15}{50} = 0.70$
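The same proportion can be computed directly from the count table. The snippet below is a small sketch: the variable names are illustrative, and the bottom-right cell (15) is taken from the statement that 15 proposals were rejected by both readers.

```python
# Rows are reader A (Yes, No); columns are reader B (Yes, No).
table = [
    [20, 5],
    [10, 15],
]

total = sum(sum(row) for row in table)                     # all 50 proposals
agreements = sum(table[i][i] for i in range(len(table)))   # main diagonal
p_o = agreements / total
print(f"p_0 = {agreements}/{total} = {p_o:.2f}")  # p_0 = 35/50 = 0.70
```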
To calculate ${p_e}$ (the probability of random agreement), we note that:
- Reader A said "Yes" to 25 applicants and "No" to 25 applicants. Thus reader A said "Yes" 50% of the time.
- Reader B said "Yes" to 30 applicants and "No" to 20 applicants. Thus reader B said "Yes" 60% of the time.
We use the formula P(A and B) = P(A) × P(B), where P is the probability of an event occurring. The formula holds for independent events, which is the assumption made here about the two readers’ random answers.
The probability that both readers would say "Yes" at random is 0.50 × 0.60 = 0.30, and the probability that both would say "No" is 0.50 × 0.40 = 0.20. Thus the overall probability of random agreement is ${p_e}$ = 0.30 + 0.20 = 0.50.
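The chance-agreement arithmetic can be reproduced from the two readers’ totals. This is an illustrative sketch with made-up variable names; it assumes, as the text does, that the readers’ random answers are independent.

```python
n = 50                  # number of proposals
a_yes, a_no = 25, 25    # reader A's "Yes" and "No" totals
b_yes, b_no = 30, 20    # reader B's "Yes" and "No" totals

# Probability that both say "Yes" (or both say "No") purely by chance.
p_yes = (a_yes / n) * (b_yes / n)
p_no = (a_no / n) * (b_no / n)
p_e = p_yes + p_no
print(f"p_e = {p_yes:.2f} + {p_no:.2f} = {p_e:.2f}")  # p_e = 0.30 + 0.20 = 0.50
```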
Applying the formula for Cohen’s kappa, we get:
${k} = \frac{{p_0} - {p_e}}{1 - {p_e}} = \frac{0.70 - 0.50}{1 - 0.50} = 0.40$
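The whole calculation can be wrapped in one function that also works for more than two categories. The following is a sketch under stated assumptions rather than a standard library routine: `kappa_from_table` is a hypothetical helper name, and it expects a square table of counts with rater A in the rows and rater B in the columns.

```python
def kappa_from_table(table):
    """Cohen's kappa for a square table of counts (rows: rater A, columns: rater B)."""
    size = len(table)
    if any(len(row) != size for row in table):
        raise ValueError("table must be square")
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("table has no observations")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]

    # Observed agreement: share of items on the main diagonal.
    p_o = sum(table[i][i] for i in range(size)) / total
    # Chance agreement: sum over categories of P(A picks it) * P(B picks it).
    p_e = sum(row_totals[i] * col_totals[i] for i in range(size)) / total**2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


kappa = kappa_from_table([[20, 5], [10, 15]])
print(f"k = {kappa:.2f}")  # k = 0.40
```

Working from raw counts rather than rounded proportions avoids carrying rounding error from the intermediate steps into the final value.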