Tampere University of Technology

TUTCRIS Research Portal

Large-Scale Simultaneous Inference with Hypothesis Testing: Multiple Testing Procedures in Practice

Research output: Contribution to journal › Article › Scientific › peer-review

Standard

Large-Scale Simultaneous Inference with Hypothesis Testing: Multiple Testing Procedures in Practice. / Emmert-Streib, Frank; Dehmer, Matthias.

In: Machine Learning and Knowledge Extraction, Vol. 1, No. 2, 15.05.2019, p. 653-683.

Author

Emmert-Streib, Frank; Dehmer, Matthias. / Large-Scale Simultaneous Inference with Hypothesis Testing: Multiple Testing Procedures in Practice. In: Machine Learning and Knowledge Extraction. 2019; Vol. 1, No. 2. pp. 653-683.

BibTeX

@article{82ecee7ca03d4d83a1f48a43f427441f,
title = "Large-Scale Simultaneous Inference with Hypothesis Testing: Multiple Testing Procedures in Practice",
abstract = "A statistical hypothesis test is one of the most eminent methods in statistics. Its pivotal role comes from the wide range of practical problems it can be applied to and the sparsity of data requirements. Being an unsupervised method makes it very flexible in adapting to real-world situations. The availability of high-dimensional data makes it necessary to apply such statistical hypothesis tests simultaneously to the test statistics of the underlying covariates. However, if applied without correction this leads to an inevitable increase in Type 1 errors. To counteract this effect, multiple testing procedures have been introduced to control various types of errors, most notably the Type 1 error. In this paper, we review modern multiple testing procedures for controlling either the family-wise error (FWER) or the false-discovery rate (FDR). We emphasize their principal approach allowing categorization of them as (1) single-step vs. stepwise approaches, (2) adaptive vs. non-adaptive approaches, and (3) marginal vs. joint multiple testing procedures. We place a particular focus on procedures that can deal with data with a (strong) correlation structure because real-world data are rarely uncorrelated. Furthermore, we also provide background information making the often technically intricate methods accessible for interdisciplinary data scientists.",
author = "Frank Emmert-Streib and Matthias Dehmer",
year = "2019",
month = "5",
day = "15",
doi = "10.3390/make1020039",
language = "English",
volume = "1",
pages = "653--683",
journal = "Machine Learning and Knowledge Extraction",
issn = "2504-4990",
publisher = "MDPI",
number = "2",
}

RIS (suitable for import to EndNote)

TY - JOUR

T1 - Large-Scale Simultaneous Inference with Hypothesis Testing: Multiple Testing Procedures in Practice

AU - Emmert-Streib, Frank

AU - Dehmer, Matthias

PY - 2019/5/15

Y1 - 2019/5/15

AB - A statistical hypothesis test is one of the most eminent methods in statistics. Its pivotal role comes from the wide range of practical problems it can be applied to and the sparsity of data requirements. Being an unsupervised method makes it very flexible in adapting to real-world situations. The availability of high-dimensional data makes it necessary to apply such statistical hypothesis tests simultaneously to the test statistics of the underlying covariates. However, if applied without correction this leads to an inevitable increase in Type 1 errors. To counteract this effect, multiple testing procedures have been introduced to control various types of errors, most notably the Type 1 error. In this paper, we review modern multiple testing procedures for controlling either the family-wise error (FWER) or the false-discovery rate (FDR). We emphasize their principal approach allowing categorization of them as (1) single-step vs. stepwise approaches, (2) adaptive vs. non-adaptive approaches, and (3) marginal vs. joint multiple testing procedures. We place a particular focus on procedures that can deal with data with a (strong) correlation structure because real-world data are rarely uncorrelated. Furthermore, we also provide background information making the often technically intricate methods accessible for interdisciplinary data scientists.

U2 - 10.3390/make1020039

DO - 10.3390/make1020039

M3 - Article

VL - 1

SP - 653

EP - 683

JO - Machine Learning and Knowledge Extraction

JF - Machine Learning and Knowledge Extraction

SN - 2504-4990

IS - 2

ER -