順序ロジスティック回帰分析をStata、Mplus、Rで行う[Stata][Mplus][R]

この更新は更新されています。

ides.hatenablog.com

brant検定がうまくいかないデータで走らせたかったので、Richard WilliamsのGologit2で使用されているデータを使用した。解析例は下記のPDF内にあるものと基本的に同じである。

https://www.stata.com/meeting/4nasug/gologit2.pdf

Stata

標準誤差をロバスト推定するオプションを付けた。

use https://www3.nd.edu/~rwilliam/statafiles/ordwarm2.dta, clear
ologit warm yr89 male white age ed prst, robust
brant

Iteration 0:   log pseudolikelihood = -2995.7704  
Iteration 1:   log pseudolikelihood = -2846.4532  
Iteration 2:   log pseudolikelihood = -2844.9142  
Iteration 3:   log pseudolikelihood = -2844.9123  
Iteration 4:   log pseudolikelihood = -2844.9123  

Ordered logistic regression                             Number of obs =  2,293
                                                        Wald chi2(6)  = 281.80
                                                        Prob > chi2   = 0.0000
Log pseudolikelihood = -2844.9123                       Pseudo R2     = 0.0504

------------------------------------------------------------------------------
             |               Robust
        warm | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        yr89 |   .5239025   .0779223     6.72   0.000     .3711775    .6766275
        male |  -.7332997   .0785119    -9.34   0.000    -.8871803   -.5794192
       white |  -.3911595   .1173929    -3.33   0.001    -.6212454   -.1610735
         age |  -.0216655   .0024086    -8.99   0.000    -.0263864   -.0169447
          ed |   .0671728   .0160635     4.18   0.000     .0356889    .0986567
        prst |   .0060727   .0032978     1.84   0.066    -.0003908    .0125363
-------------+----------------------------------------------------------------
       /cut1 |  -2.465362     .23923                     -2.934244    -1.99648
       /cut2 |   -.630904   .2306956                     -1.083059   -.1787489
       /cut3 |   1.261854   .2322644                      .8066241    1.717084
------------------------------------------------------------------------------


Brant test of parallel regression assumption

              |       chi2     p>chi2      df
 -------------+------------------------------
          All |      49.18      0.000      12
 -------------+------------------------------
         yr89 |      13.01      0.001       2
         male |      22.24      0.000       2
        white |       1.27      0.531       2
          age |       7.38      0.025       2
           ed |       4.31      0.116       2
         prst |       4.33      0.115       2

A significant test statistic provides evidence that the parallel
regression assumption has been violated.

brant検定でエラーが出る場合には、findit spost13_adoからspost13_adoをインストールする必要がある。(追記：2023/10/30)

Mplus

RでSataのデータをMplus用に変換

library(haven)
ordwarm2 <- read_dta("ordwarm2.dta")

ordwarm2.mplus <- subset(ordwarm2, select = 
                    c(warm, yr89, male, white, age, ed, prst))

library(MplusAutomation)
variable.names(ordwarm2.mplus)

ordwarm2.mplus$warm<-as.numeric(ordwarm2.mplus$warm)
ordwarm2.mplus$yr89<-as.numeric(ordwarm2.mplus$yr89)
ordwarm2.mplus$male<-as.numeric(ordwarm2.mplus$male)
ordwarm2.mplus$white<-as.numeric(ordwarm2.mplus$white)

prepareMplusData(ordwarm2.mplus, filename="ordwarm2.mplus.dat", overwrite=T)

Mplusのコード

TITLE: Ordinal Logistic Regression
DATA: FILE = "ordwarm2.mplus.dat";
VARIABLE: 
  NAMES = warm yr89 male white age ed prst; 
  USEVARIABLES = warm yr89 male white age ed prst; 
  CATEGORICAL = warm;
  MISSING=.;

ANALYSIS:
  ESTIMATOR = MLR; 
  
MODEL:
   warm on yr89 male white age ed prst
  
plot:
    type = plot3;

結果

MODEL RESULTS

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 WARM       ON
    YR89               0.524      0.078      6.725      0.000
    MALE              -0.733      0.078     -9.342      0.000
    WHITE             -0.391      0.117     -3.333      0.001
    AGE               -0.022      0.002     -8.997      0.000
    ED                 0.067      0.016      4.183      0.000
    PRST               0.006      0.003      1.842      0.065

 Thresholds
    WARM$1            -2.465      0.239    -10.308      0.000
    WARM$2            -0.631      0.231     -2.735      0.006
    WARM$3             1.262      0.232      5.434      0.000


LOGISTIC REGRESSION ODDS RATIO RESULTS

 WARM       ON
    YR89               1.689
    MALE               0.480
    WHITE              0.676
    AGE                0.979
    ED                 1.069
    PRST               1.006


BRANT WALD TEST FOR PROPORTIONAL ODDS

                                   Degrees of
                      Chi-Square     Freedom   P-Value

  WARM
    Overall test          49.181        12      0.000
    YR89                  13.013         2      0.001
    MALE                  22.238         2      0.000
    WHITE                  1.268         2      0.531
    AGE                    7.383         2      0.025
    ED                     4.310         2      0.116
    PRST                   4.332         2      0.115

R Massパッケージ

Massパッケージではロバスト推定はできないので通常の標準誤差の推定を行っている。

library(MASS)

subsetの作成

ordwarm2.R <- subset(ordwarm2, select = 
                   c(warm, yr89, male, white, age, ed, prst))

データ型の変更

ordwarm2.R$warm <- as.factor(ordwarm2.R$warm)
ordwarm2.R$warm <- factor(ordwarm2.R$warm, ordered = T, 
                           levels = c("1", "2", "3", "4"))
ordwarm2.R$male <- as.factor(ordwarm2.R$male)
ordwarm2.R$white <- as.factor(ordwarm2.R$white)

分析

res.R <-polr(formula = 
              warm ~ yr89 + male + white + age + ed + prst, 
              data = ordwarm2.R, Hess = TRUE, 
              method = "logistic")
summary(res.R)

結果

Coefficients:
           Value Std. Error t value
yr89    0.523912   0.079899   6.557
male1  -0.733309   0.078483  -9.344
white1 -0.391140   0.118381  -3.304
age    -0.021666   0.002469  -8.777
ed      0.067176   0.015975   4.205
prst    0.006072   0.003293   1.844

Intercepts:
    Value    Std. Error t value 
1|2  -2.4654   0.2389   -10.3188
2|3  -0.6309   0.2333    -2.7042
3|4   1.2618   0.2340     5.3919

Residual Deviance: 5689.825 
AIC: 5707.825

オッズ比での表示

res.R.odds <- exp(coef(res.R))
res.R.odds

P値の表示

ctable <- coef(summary(res.R))
p <- pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2 
ctable <- cbind(ctable, "p-value" = p)
ctable

              Value  Std. Error    t value      p-value
yr89    0.523912100 0.079898892   6.557189 5.483168e-11
male1  -0.733308680 0.078482778  -9.343562 9.314676e-21
white1 -0.391140326 0.118380964  -3.304081 9.528821e-04
age    -0.021665815 0.002468551  -8.776733 1.682920e-18
ed      0.067175724 0.015975011   4.205050 2.610242e-05
prst    0.006071693 0.003293204   1.843704 6.522628e-02
1|2    -2.465365780 0.238919177 -10.318828 5.792619e-25
2|3    -0.630933708 0.233319484  -2.704162 6.847687e-03
3|4     1.261844284 0.234024013   5.391944 6.969964e-08

brant検定

library(brant) 
brant(res.R)

-------------------------------------------- 
Test for    X2      df  probability 
-------------------------------------------- 
Omnibus     49.3    12  0
yr89        13.01   2   0
male1       22.24   2   0
white1      1.27    2   0.53
age         7.38    2   0.02
ed          4.31    2   0.12
prst        4.33    2   0.11
-------------------------------------------- 

H0: Parallel Regression Assumption holds