Title: | Test for Conditional Independence Based on the Generalized Covariance Measure (GCM) |
---|---|
Description: | A statistical hypothesis test for conditional independence. It performs nonlinear regressions on the conditioning variable and then tests for a vanishing covariance between the resulting residuals. It can be applied to both univariate random variables and multivariate random vectors. Details of the method can be found in Rajen D. Shah and Jonas Peters: "The Hardness of Conditional Independence Testing and the Generalised Covariance Measure", Annals of Statistics 48(3), 1514-1538, 2020. |
Authors: | Jonas Peters and Rajen D. Shah |
Maintainer: | Jonas Peters |
License: | GPL-2 |
Version: | 0.2.0 |
Built: | 2024-11-13 03:57:57 UTC |
Source: | https://github.com/cran/GeneralisedCovarianceMeasure |
Computes the residuals of a regression of V on Z. This function is used for the GCM test. Other regression methods can be added.
comp.resids(V, Z, regr.pars, regr.method)
V |
A (nxp)-dimensional matrix (or data frame) with n observations of p variables. |
Z |
A (nxp)-dimensional matrix (or data frame) with n observations of p variables. |
regr.pars |
Some regression methods require a list of additional options. |
regr.method |
A string indicating the regression method that is used. Currently implemented are "gam", "xgboost", "kernel.ridge", "nystrom". |
Vector of residuals.
Please cite the following paper. Rajen D. Shah, Jonas Peters: "The Hardness of Conditional Independence Testing and the Generalised Covariance Measure" https://arxiv.org/abs/1804.07203
set.seed(1)
n <- 250
Z <- 4*rnorm(n)
X <- 2*sin(Z) + rnorm(n)
res <- comp.resids(X, Z, regr.pars = list(), regr.method = "gam")
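The residual computation can also be illustrated without the package. The following is a minimal sketch in base R that assumes a smoothing spline (stats::smooth.spline) as a stand-in for the regression methods listed above. The helper resid_sketch is hypothetical and not part of the package, and its output will differ from that of comp.resids.

```r
# Hypothetical stand-in for comp.resids (univariate V and Z only).
# Assumption: a smoothing spline replaces the package's regression methods.
resid_sketch <- function(V, Z) {
  fit <- stats::smooth.spline(x = Z, y = V)
  # Residual = observed value minus the fitted regression function at Z.
  as.numeric(V - stats::predict(fit, x = Z)$y)
}

set.seed(1)
n <- 250
Z <- 4 * rnorm(n)
X <- 2 * sin(Z) + rnorm(n)
res <- resid_sketch(X, Z)
```

The result is a numeric vector of length n whose variance should be noticeably smaller than that of X, because the sin(Z) signal has been regressed out.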
Test for Conditional Independence Based on the Generalized Covariance Measure (GCM)
gcm.test(X, Y, Z = NULL, alpha = 0.05, regr.method = "xgboost", regr.pars = list(), plot.residuals = FALSE, nsim = 499L, resid.XonZ = NULL, resid.YonZ = NULL)
X |
A (nxp)-dimensional matrix (or data frame) with n observations of p variables. |
Y |
A (nxp)-dimensional matrix (or data frame) with n observations of p variables. |
Z |
A (nxp)-dimensional matrix (or data frame) with n observations of p variables. |
alpha |
Significance level of the test. |
regr.method |
A string indicating the regression method that is used. Currently implemented are "gam", "xgboost", "kernel.ridge". A regression is performed only for residuals that are not supplied directly, that is, when resid.XonZ or resid.YonZ is set to NULL. |
regr.pars |
Some regression methods require a list of additional options. |
plot.residuals |
A Boolean indicating whether some plots should be shown. |
nsim |
An integer indicating the number of bootstrap samples used to approximate the null distribution of the test statistic. |
resid.XonZ |
It is possible to directly provide the residuals instead of performing a regression. If set to NULL, the regression method specified in regr.method is used. |
resid.YonZ |
It is possible to directly provide the residuals instead of performing a regression. If set to NULL, the regression method specified in regr.method is used. |
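To illustrate the role of nsim, the following base R sketch approximates the null distribution of a maximum-type statistic for multivariate residuals with a Gaussian multiplier bootstrap. It assumes that the residual matrices are already available. The helper gcm_multiplier_sketch is hypothetical, and its details are an assumption based on the paper cited in the description rather than a copy of the package code.

```r
# Hypothetical helper, not part of the package.
# rx, ry: residual matrices with n rows (e.g. residuals of X on Z and Y on Z).
gcm_multiplier_sketch <- function(rx, ry, nsim = 499L) {
  rx <- as.matrix(rx)
  ry <- as.matrix(ry)
  n <- nrow(rx)
  # All pairwise products of residual columns: an n x (dX * dY) matrix.
  prods <- do.call(cbind, lapply(seq_len(ncol(ry)), function(k) rx * ry[, k]))
  # Scale each column by its empirical standard deviation.
  sds <- sqrt(colMeans(prods^2) - colMeans(prods)^2)
  prods <- sweep(prods, 2, sds, "/")
  # Observed statistic: largest absolute normalized mean product.
  stat <- sqrt(n) * max(abs(colMeans(prods)))
  # Each simulated statistic reweights the observations with N(0, 1) multipliers.
  mult <- matrix(rnorm(n * nsim), nrow = nsim, ncol = n)
  sim <- apply(abs(mult %*% prods), 1, max) / sqrt(n)
  p.value <- (sum(sim >= stat) + 1) / (nsim + 1)
  list(p.value = p.value, test.statistic = stat)
}

set.seed(1)
n <- 250
rx <- matrix(rnorm(2 * n), n, 2)  # stand-in residuals, 2 columns
ry <- matrix(rnorm(3 * n), n, 3)  # stand-in residuals, 3 columns
out <- gcm_multiplier_sketch(rx, ry, nsim = 499L)
```

A larger nsim gives a finer-grained p-value (the smallest attainable value in this sketch is 1 / (nsim + 1)) at the cost of more computation.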
The function tests whether X is conditionally independent of Y given Z. The output is a list containing
p.value
: P-value of the test.
test.statistic
: Test statistic of the test.
reject
: Boolean that is true iff p.value < alpha.
Please cite the following paper. Rajen D. Shah, Jonas Peters: "The Hardness of Conditional Independence Testing and the Generalised Covariance Measure" https://arxiv.org/abs/1804.07203
set.seed(1)
n <- 250
Z <- 4*rnorm(n)
X <- 2*sin(Z) + rnorm(n)
Y <- 2*sin(Z) + rnorm(n)
Y2 <- 2*sin(Z) + X + rnorm(n)
gcm.test(X, Y, Z, regr.method = "gam")
gcm.test(X, Y2, Z, regr.method = "gam")
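For univariate X and Y, the test statistic described in the cited paper is the normalized mean of the product of the two residual vectors, which is compared against a standard normal distribution. The following base R sketch works under that assumption. The helpers spline_resid and gcm_stat_sketch are hypothetical, and a smoothing spline stands in for the package's regression methods, so the numbers will not match those of gcm.test.

```r
# Hypothetical helper: residuals from a smoothing-spline regression of V on Z.
spline_resid <- function(V, Z) {
  fit <- stats::smooth.spline(x = Z, y = V)
  as.numeric(V - stats::predict(fit, x = Z)$y)
}

# Hypothetical helper: univariate GCM-style statistic from two residual vectors.
gcm_stat_sketch <- function(resid.XonZ, resid.YonZ, alpha = 0.05) {
  n <- length(resid.XonZ)
  R <- resid.XonZ * resid.YonZ
  # Normalized mean of the residual products.
  test.statistic <- sqrt(n) * mean(R) / sqrt(mean(R^2) - mean(R)^2)
  # Two-sided p-value from the standard normal distribution.
  p.value <- 2 * pnorm(abs(test.statistic), lower.tail = FALSE)
  list(p.value = p.value, test.statistic = test.statistic,
       reject = p.value < alpha)
}

set.seed(1)
n <- 250
Z <- 4 * rnorm(n)
X <- 2 * sin(Z) + rnorm(n)
Y <- 2 * sin(Z) + rnorm(n)        # conditionally independent of X given Z
Y2 <- 2 * sin(Z) + X + rnorm(n)   # depends on X even after conditioning on Z

rX <- spline_resid(X, Z)
indep <- gcm_stat_sketch(rX, spline_resid(Y, Z))
dep <- gcm_stat_sketch(rX, spline_resid(Y2, Z))
```

In this setup dep$reject should be TRUE because Y2 contains X directly, while indep corresponds to a case in which the null hypothesis holds.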
Contains the function gcm.test that can be used for performing a conditional independence test based on the GCM.
Jonas Peters, Rajen D. Shah
Please cite the following paper. Rajen D. Shah, Jonas Peters: "The Hardness of Conditional Independence Testing and the Generalised Covariance Measure" https://arxiv.org/abs/1804.07203