井出草平の研究ノート

順序ロジスティック回帰分析をStata、Mplus、Rで行う-更新[Stata][Mplus][R]

順序ロジスティック回帰分析をStata、Mplus、Rで行う

以前のエントリの更新。

ides.hatenablog.com

brant検定が通らない例を使っている。

Stata

brant検定が走らない場合はfindit spost13_adoからspost13_adoをインストールする必要がある。

use https://www3.nd.edu/~rwilliam/statafiles/ordwarm2.dta, clear
ologit warm yr89 male white age ed prst, robust
brant
Iteration 0:   log pseudolikelihood = -2995.7704  
Iteration 1:   log pseudolikelihood = -2846.4532  
Iteration 2:   log pseudolikelihood = -2844.9142  
Iteration 3:   log pseudolikelihood = -2844.9123  
Iteration 4:   log pseudolikelihood = -2844.9123  

Ordered logistic regression                             Number of obs =  2,293
                                                        Wald chi2(6)  = 281.80
                                                        Prob > chi2   = 0.0000
Log pseudolikelihood = -2844.9123                       Pseudo R2     = 0.0504

------------------------------------------------------------------------------
             |               Robust
        warm | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        yr89 |   .5239025   .0779223     6.72   0.000     .3711775    .6766275
        male |  -.7332997   .0785119    -9.34   0.000    -.8871803   -.5794192
       white |  -.3911595   .1173929    -3.33   0.001    -.6212454   -.1610735
         age |  -.0216655   .0024086    -8.99   0.000    -.0263864   -.0169447
          ed |   .0671728   .0160635     4.18   0.000     .0356889    .0986567
        prst |   .0060727   .0032978     1.84   0.066    -.0003908    .0125363
-------------+----------------------------------------------------------------
       /cut1 |  -2.465362     .23923                     -2.934244    -1.99648
       /cut2 |   -.630904   .2306956                     -1.083059   -.1787489
       /cut3 |   1.261854   .2322644                      .8066241    1.717084
------------------------------------------------------------------------------


Brant test of parallel regression assumption

              |       chi2     p>chi2      df
 -------------+------------------------------
          All |      49.18      0.000      12
 -------------+------------------------------
         yr89 |      13.01      0.001       2
         male |      22.24      0.000       2
        white |       1.27      0.531       2
          age |       7.38      0.025       2
           ed |       4.31      0.116       2
         prst |       4.33      0.115       2

A significant test statistic provides evidence that the parallel
regression assumption has been violated.

Mplus

RでSataのデータをMplus用に変換

library(rio)
ordwarm2 <- import("ordwarm2.dta")
ordwarm2.mplus <- subset(ordwarm2, select = 
                    c(warm, yr89, male, white, age, ed, prst))
library(MplusAutomation)
variable.names(ordwarm2.mplus) 
prepareMplusData(ordwarm2.mplus, filename="ordwarm2.mplus.dat", overwrite=T)

Mplus

TITLE: Ordinal Logistic Regression
DATA: FILE = "ordwarm2.mplus.dat";
VARIABLE: 
  NAMES = warm yr89 male white age ed prst; 
  USEVARIABLES = warm yr89 male white age ed prst; 
  CATEGORICAL = warm;
  MISSING=.;

ANALYSIS:
  ESTIMATOR = MLR; 
  
MODEL:
   warm on yr89 male white age ed prst
  
plot:
    type = plot3;

結果

MODEL RESULTS

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 WARM       ON
    YR89               0.524      0.078      6.725      0.000
    MALE              -0.733      0.078     -9.342      0.000
    WHITE             -0.391      0.117     -3.333      0.001
    AGE               -0.022      0.002     -8.997      0.000
    ED                 0.067      0.016      4.183      0.000
    PRST               0.006      0.003      1.842      0.065

 Thresholds
    WARM$1            -2.465      0.239    -10.308      0.000
    WARM$2            -0.631      0.231     -2.735      0.006
    WARM$3             1.262      0.232      5.434      0.000


LOGISTIC REGRESSION ODDS RATIO RESULTS

 WARM       ON
    YR89               1.689
    MALE               0.480
    WHITE              0.676
    AGE                0.979
    ED                 1.069
    PRST               1.006


BRANT WALD TEST FOR PROPORTIONAL ODDS

                                   Degrees of
                      Chi-Square     Freedom   P-Value

  WARM
    Overall test          49.181        12      0.000
    YR89                  13.013         2      0.001
    MALE                  22.238         2      0.000
    WHITE                  1.268         2      0.531
    AGE                    7.383         2      0.025
    ED                     4.310         2      0.116
    PRST                   4.332         2      0.115

R Ordinalパッケージ

subsetの作成

ordwarm2.R <- subset(ordwarm2, select = 
                   c(warm, yr89, male, white, age, ed, prst))

データ型の変更

ordwarm2.R$warm <- as.factor(ordwarm2.R$warm)
ordwarm2.R$warm <- factor(ordwarm2.R$warm, ordered = T, 
                           levels = c("1", "2", "3", "4"))
ordwarm2.R$male <- as.factor(ordwarm2.R$male)
ordwarm2.R$white <- as.factor(ordwarm2.R$white)

分析

library(ordinal)
res.R <-clm(formula = warm ~ yr89 + male + white + age + ed + prst, 
            link = "logit", Hess = TRUE, na.action = na.omit,
            data = ordwarm2.R)
summary(res.R)

結果

 link  threshold nobs logLik   AIC     niter max.grad cond.H 
 logit flexible  2293 -2844.91 5707.82 5(0)  4.94e-10 4.4e+05

Coefficients:
        Estimate Std. Error z value Pr(>|z|)    
yr89    0.523902   0.079899   6.557 5.49e-11 ***
male1  -0.733300   0.078483  -9.343  < 2e-16 ***
white1 -0.391159   0.118381  -3.304 0.000952 ***
age    -0.021666   0.002468  -8.778  < 2e-16 ***
ed      0.067173   0.015975   4.205 2.61e-05 ***
prst    0.006073   0.003293   1.844 0.065157 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2  -2.4654     0.2389 -10.319
2|3  -0.6309     0.2333  -2.704
3|4   1.2619     0.2340   5.392

オッズ比での表示

library(tidymodels)
tidy(res.R, exponentiate = TRUE, conf.int = TRUE) |> 
  select(-c(std.error, statistic))

結果

  term   estimate  p.value conf.low conf.high coef.type
  <chr>     <dbl>    <dbl>    <dbl>     <dbl> <chr>    
1 1|2      0.0850 5.78e-25   NA        NA     intercept
2 2|3      0.532  6.85e- 3   NA        NA     intercept
3 3|4      3.53   6.96e- 8   NA        NA     intercept
4 yr89     1.69   5.49e-11    1.44      1.98  location 
5 male1    0.480  9.32e-21    0.412     0.560 location 
6 white1   0.676  9.52e- 4    0.536     0.853 location 
7 age      0.979  1.67e-18    0.974     0.983 location 
8 ed       1.07   2.61e- 5    1.04      1.10  location 
9 prst     1.01   6.52e- 2    1.00      1.01  location

brant検定

library(gofcat)
brant.test(res.R)

結果

Brant Test:
           chi-sq   df   pr(>chi)    
Omnibus     49.30   12    1.9e-06 ***
yr89        13.01    2     0.0015 ** 
male1       22.24    2    1.5e-05 ***
white1       1.27    2     0.5305    
age          7.38    2     0.0249 *  
ed           4.31    2     0.1159    
prst         4.33    2     0.1146    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

H0: Proportional odds assumption holds