井出草平の研究ノート

LaTeX形式の数式コードを書くパッケージequatiomatic[R]

中澤港さんの日記に書かれていたパッケージ(https://minato.sip21c.org/im3r/20210131.html)。

これを使うと,lmer()で使ったモデルが,そのままTeXのコードになる。LaTeXで論文を書いている人はそのまま取り込めるし,WordやLibreOfficeで論文を書いている人でも,(1)ヘッダをつけてpandocでWordのdocxやWriterのodtに変換するとか,(2)MS Office+IguanaTexを使ってWordに取り込むとか,(3)LibreOffice+TexMaths拡張を使ってLibreOffice WriterやImpressに取り込むなどすれば,LaTeXがわからなくても使えてしまう。forecastパッケージのArima()関数にも使えるようだ。

www.rdocumentation.org

主に恩恵を受けるのは、lme4::lmer Modelsforecast::Arima Modelsを使う人のようだ。

適用できる分析

Model Packages/Functions
linear regression stats::lm
logistic regression stats::glm(family = binomial(link = 'logit'))
probit regression stats::glm(family = binomial(link = 'probit'))
ordinal logistic regression MASS::polr(method = 'logistic'); ordinal::clm(link = 'logit')
ordinal probit regression MASS::polr(method = 'probit'); ordinal::clm(link = 'probit')
auto-regressive integrated moving average forecast::Arima
regression with auto-regressive integrated moving average errors forecast::Arima

回帰分析

mod1 <- lm(mpg ~ cyl + disp, mtcars)
summary(mod1)

結果。

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.66099    2.54700  13.609 4.02e-14 ***
cyl         -1.58728    0.71184  -2.230   0.0337 *  
disp        -0.02058    0.01026  -2.007   0.0542 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.055 on 29 degrees of freedom
Multiple R-squared:  0.7596,    Adjusted R-squared:  0.743 
F-statistic: 45.81 on 2 and 29 DF,  p-value: 1.058e-09

重回帰分析

数式の表示にはextract_eqコマンドを使用する。

library(equatiomatic)
extract_eq(mod1)

$$
\operatorname{mpg} = \alpha + \beta{1}(\operatorname{cyl}) + \beta{2}(\operatorname{disp}) + \epsilon
$$

f:id:iDES:20210211005124p:plain

推定値を入れた式を描く

extract_eq(mod1, use_coefs = TRUE)

$$
\operatorname{\widehat{mpg}} = 34.66 - 1.59(\operatorname{cyl}) - 0.02(\operatorname{disp})
$$

f:id:iDES:20210211005136p:plain

Texコードで一部をアレンジ

extract_eq(mod1, 
           wrap = TRUE, 
           intercept = "\\hat{\\phi}", 
           greek = "\\hat{\\gamma}",
           raw_tex = TRUE)

切片をphi hatにして、βではなくγ hatにした場合。

$$
\begin{aligned}
\operatorname{mpg} &= \hat{\phi} + \hat{\gamma}{1}(\operatorname{cyl}) + \hat{\gamma}{2}(\operatorname{disp}) + \epsilon
\end{aligned}
$$

f:id:iDES:20210211005151p:plain

ロジスティック回帰

model_logit <- glm(sex ~ bill_length_mm + species, 
                   data = penguins, family = binomial(link = "logit"))
extract_eq(model_logit, wrap = TRUE, terms_per_line = 3)

wrap = TRUEで改行した形で表記、terms_per_lineは一行入る数式の数。

$$
\begin{aligned}
\log\left[ \frac { P( \operatorname{sex} = \operatorname{male} ) }{ 1 - P( \operatorname{sex} = \operatorname{male} ) } \right] &= \alpha + \beta{1}(\operatorname{bill_length_mm}) + \beta{2}(\operatorname{species}{\operatorname{Chinstrap}})\ + \ &\quad \beta{3}(\operatorname{species}_{\operatorname{Gentoo}})
\end{aligned}
$$

f:id:iDES:20210211005247p:plain

プロビット回帰

model_probit <- glm(sex ~ bill_length_mm + species, 
                    data = penguins, family = binomial(link = "probit"))
extract_eq(model_probit, wrap = TRUE, terms_per_line = 3)

$$
\begin{aligned}
P( \operatorname{sex} = \operatorname{male} ) &= \Phi[\alpha + \beta{1}(\operatorname{bill_length_mm}) + \beta{2}(\operatorname{species}{\operatorname{Chinstrap}})\ + \
&\qquad\ \beta
{3}(\operatorname{species}_{\operatorname{Gentoo}})]
\end{aligned}
$$

f:id:iDES:20210211005257p:plain

順序回帰分析

library(MASS)
set.seed(1234)
df <- data.frame(outcome = factor(rep(LETTERS[1:3], 100),
                                  levels = LETTERS[1:3],
                                  ordered = TRUE),
                 continuous_1 = rnorm(300, 100, 1),
                 continuous_2 = rnorm(300, 50, 5))

model_ologit <- MASS::polr(outcome ~ continuous_1 + continuous_2, 
                           data = df, Hess = TRUE, method = "logistic")
model_oprobit <- MASS::polr(outcome ~ continuous_1 + continuous_2, 
                            data = df, Hess = TRUE, method = "probit")

extract_eq(model_ologit, wrap = TRUE)

$$
\begin{aligned}
\log\left[ \frac { P( \operatorname{A} \geq \operatorname{B} ) }{ 1 - P( \operatorname{A} \geq \operatorname{B} ) } \right] &= \alpha{1} + \beta{1}(\operatorname{continuous_1}) + \beta{2}(\operatorname{continuous_2}) \
\log\left[ \frac { P( \operatorname{B} \geq \operatorname{C} ) }{ 1 - P( \operatorname{B} \geq \operatorname{C} ) } \right] &= \alpha
{2} + \beta{1}(\operatorname{continuous_1}) + \beta{2}(\operatorname{continuous_2})
\end{aligned}
$$

f:id:iDES:20210211005527p:plain

順序ロジスティック回帰

$$
\begin{aligned}
\log\left[ \frac { P( \operatorname{A} \geq \operatorname{B} ) }{ 1 - P( \operatorname{A} \geq \operatorname{B} ) } \right] &= \alpha{1} + \beta{1}(\operatorname{continuous_1}) + \beta{2}(\operatorname{continuous_2}) \
\log\left[ \frac { P( \operatorname{B} \geq \operatorname{C} ) }{ 1 - P( \operatorname{B} \geq \operatorname{C} ) } \right] &= \alpha
{2} + \beta{1}(\operatorname{continuous_1}) + \beta{2}(\operatorname{continuous_2})
\end{aligned}
$$

f:id:iDES:20210211005308p:plain

順序プロピット回帰

extract_eq(model_oprobit, wrap = TRUE)

$$ \begin{aligned}
P( \operatorname{A} \geq \operatorname{B} ) &= \Phi[\alpha{1} + \beta{1}(\operatorname{continuous_1}) + \beta{2}(\operatorname{continuous_2})] \
P( \operatorname{B} \geq \operatorname{C} ) &= \Phi[\alpha
{2} + \beta{1}(\operatorname{continuous_1}) + \beta{2}(\operatorname{continuous_2})]
\end{aligned}
$$

f:id:iDES:20210211005413p:plain