Browse Results

Statistical Inference in Stochastic Processes

by N.U. Prabhu and I.V. Basawa

Covering both theory and applications, this collection of eleven contributed papers surveys the role of probabilistic models and statistical techniques in image analysis and processing, develops likelihood methods for inference about parameters that determine the drift and the jump mechanism of a di

Statistical Inference on Residual Life (Statistics for Biology and Health)

by Jong-Hyeon Jeong

This is a monograph on the concept of residual life, which is an alternative summary measure of time-to-event data, or survival data. The mean residual life has been used for many years under the name of life expectancy, so it is a natural concept for summarizing survival or reliability data. It is also more interpretable than the popular hazard function, especially for communications between patients and physicians regarding the efficacy of a new drug in the medical field. This book reviews existing statistical methods to infer the residual life distribution. The review and comparison include existing inference methods for mean and median, or quantile, residual life analysis through medical data examples. The concept of the residual life is also extended to competing risks analysis. The targeted audience includes biostatisticians, graduate students, and PhD (bio)statisticians. Knowledge of survival analysis at an introductory graduate level is advisable prior to reading this book.

Statistical Inference Under Mixture Models (ICSA Book Series in Statistics)

by Jiahua Chen

This book puts its weight on theoretical issues related to finite mixture models. It shows that a good applicant is an applicant who understands the issues behind each statistical method. This book is intended for applicants whose interests include some understanding of the procedures they are using, while they do not have to read the technical derivations. At the same time, many researchers will find most theories and techniques necessary for the development of various statistical methods, without chasing after one set of research papers after another. Even though the book emphasizes the theory, it provides accessible numerical tools for data analysis. Readers with strength in developing statistical software may find it useful.

Statistical Inference via Convex Optimization (PDF)

by Anatoli Juditsky Arkadi Nemirovski

This authoritative book draws on the latest research to explore the interplay of high-dimensional statistics with optimization. Through an accessible analysis of fundamental problems of hypothesis testing and signal recovery, Anatoli Juditsky and Arkadi Nemirovski show how convex optimization theory can be used to devise and analyze near-optimal statistical inferences. Statistical Inference via Convex Optimization is an essential resource for optimization specialists who are new to statistics and its applications, and for data scientists who want to improve their optimization methods. Juditsky and Nemirovski provide the first systematic treatment of the statistical techniques that have arisen from advances in the theory of optimization. They focus on four well-known statistical problems (sparse recovery, hypothesis testing, and recovery from indirect observations of both signals and functions of signals), demonstrating how they can be solved more efficiently as convex optimization problems. The emphasis throughout is on achieving the best possible statistical performance. The construction of inference routines and the quantification of their statistical performance are given by efficient computation rather than by analytical derivation typical of more conventional statistical approaches. In addition to being computation-friendly, the methods described in this book enable practitioners to handle numerous situations too difficult for closed analytical form analysis, such as composite hypothesis testing and signal recovery in inverse problems. Statistical Inference via Convex Optimization features exercises with solutions along with extensive appendixes, making it ideal for use as a graduate text.

Statistical Inference via Data Science: A ModernDive into R and the Tidyverse (Chapman & Hall/CRC The R Series)

by Chester Ismay Albert Y. Kim

Statistical Inference via Data Science: A ModernDive into R and the Tidyverse provides a pathway for learning about statistical inference using data science tools widely used in industry, academia, and government. It introduces the tidyverse suite of R packages, including the ggplot2 package for data visualization, and the dplyr package for data wrangling. After equipping readers with just enough of these data science tools to perform effective exploratory data analyses, the book covers traditional introductory statistics topics like confidence intervals, hypothesis testing, and multiple regression modeling, while focusing on visualization throughout.
Features:
● Assumes minimal prerequisites, notably, no prior calculus nor coding experience
● Motivates theory using real-world data, including all domestic flights leaving New York City in 2013, the Gapminder project, and the data journalism website, FiveThirtyEight.com
● Centers on simulation-based approaches to statistical inference rather than mathematical formulas
● Uses the infer package for "tidy" and transparent statistical inference to construct confidence intervals and conduct hypothesis tests via the bootstrap and permutation methods
● Provides all code and output embedded directly in the text; also available in the online version at moderndive.com
This book is intended for individuals who would like to simultaneously start developing their data science toolbox and start learning about the inferential and modeling tools used in much of modern-day research. The book can be used in methods and data science courses and first courses in statistics, at both the undergraduate and graduate levels.

Statistical Information and Likelihood: A Collection of Critical Essays by Dr. D. Basu (Lecture Notes in Statistics #45)

by D. Basu

It is an honor to be asked to write a foreword to this book, for I believe that it and other books to follow will eventually lead to a dramatic change in the current statistics curriculum in our universities. I spent the 1975-76 academic year at Florida State University in Tallahassee. My purpose was to complete a book on Statistical Reliability Theory with Frank Proschan. At the time, I was working on total time on test processes. At the same time, I started attending lectures by Dev Basu on statistical inference. It was Lehmann's hypothesis testing course and Lehmann's book was the text. However, I noticed something strange: Basu never opened the book. He was obviously not following it. Instead, he was giving a very elegant, measure theoretic treatment of the concepts of sufficiency, ancillarity, and invariance. He was interested in the concept of information: what it meant and how it fitted in with contemporary statistics. As he looked at the fundamental ideas, the logic behind their use seemed to evaporate. I was shocked. I didn't like priors. I didn't like Bayesian statistics. But after the smoke had cleared, that was all that was left. Basu loves counterexamples. He is like an art critic in the field of statistical inference. He would find a counterexample to the Bayesian approach if he could. So far, he has failed in this respect.

Statistical Intervals: A Guide for Practitioners and Researchers (Wiley Series in Probability and Statistics #541)

by William Q. Meeker Gerald J. Hahn Luis A. Escobar

Describes statistical intervals to quantify sampling uncertainty, focusing on key application needs and recently developed methodology in an easy-to-apply format.
Statistical intervals provide invaluable tools for quantifying sampling uncertainty. The widely hailed first edition, published in 1991, described the use and construction of the most important statistical intervals. Particular emphasis was given to intervals (such as prediction intervals, tolerance intervals and confidence intervals on distribution quantiles) frequently needed in practice, but often neglected in introductory courses. Vastly improved computer capabilities over the past 25 years have resulted in an explosion of the tools readily available to analysts. This second edition, more than double the size of the first, adds these new methods in an easy-to-apply format. In addition to extensive updating of the original chapters, the second edition includes new chapters on:
● Likelihood-based statistical intervals
● Nonparametric bootstrap intervals
● Parametric bootstrap and other simulation-based intervals
● An introduction to Bayesian intervals
● Bayesian intervals for the popular binomial, Poisson and normal distributions
● Statistical intervals for Bayesian hierarchical models
● Advanced case studies, further illustrating the use of the newly described methods
New technical appendices provide justification of the methods and pathways to extensions and further applications. A webpage directs readers to current readily accessible computer software and other useful information. Statistical Intervals: A Guide for Practitioners and Researchers, Second Edition is an up-to-date working guide and reference for all who analyze data, allowing them to quantify the uncertainty in their results using statistical intervals.

Statistical Inversion of Electromagnetic Logging Data (SpringerBriefs in Petroleum Geoscience & Engineering)

by Qiuyang Shen Jiefu Chen Xuqing Wu Yueqin Huang Zhu Han

This book presents a comprehensive introduction to well logging and the inverse problem. It explores challenges such as conventional data processing methods’ inability to handle local minima issues, and presents the explanations in an easy-to-follow way. The book describes statistical data interpretation by introducing the fundamentals behind the approach, as well as a range of sampling methods. In each chapter, a specific method is comprehensively introduced, together with representative examples. The book begins with basic information on well logging and logging while drilling, as well as a definition of the inverse problem. It then moves on to discuss the fundamentals of statistical inverse methods, Bayesian inference, and a new sampling method that can be used to supplement it, the hybrid Monte Carlo method. The book then addresses a specific problem in the inversion of downhole logging data, and the interpretation of earth model complexity, before concluding with a meta-technique called the tempering method, which serves as a supplement to statistical sampling methods. Given its scope, the book offers a valuable reference guide for drilling engineers, well logging tool physicists, and geoscientists, as well as students in the areas of petroleum engineering and electrical engineering.

Statistical Issues in Drug Development (Statistics in Practice #69)

by Stephen S. Senn

Drug development is the process of finding and producing therapeutically useful pharmaceuticals, turning them into safe and effective medicine, and producing reliable information regarding the appropriate dosage and dosing intervals. With regulatory authorities demanding increasingly higher standards in such developments, statistics has become an intrinsic and critical element in the design and conduct of drug development programmes. Statistical Issues in Drug Development presents an essential and thought-provoking guide to the statistical issues and controversies involved in drug development. This highly readable second edition has been updated to include: Comprehensive coverage of the design and interpretation of clinical trials. Expanded sections on missing data, equivalence, meta-analysis and dose finding. An examination of both Bayesian and frequentist methods. A new chapter on pharmacogenomics and expanded coverage of pharmaco-epidemiology and pharmaco-economics. Coverage of the ICH guidelines, in particular ICH E9, Statistical Principles for Clinical Trials. It is hoped that the book will stimulate dialogue between statisticians and life scientists working within the pharmaceutical industry. The accessible and wide-ranging coverage makes it essential reading for both statisticians and non-statisticians working in the pharmaceutical industry, regulatory bodies and medical research institutes. There is also much to benefit undergraduate and postgraduate students whose courses include a medical statistics component.

Statistical Issues in Drug Development (Statistics in Practice)

by Stephen S. Senn

The revised third edition of Statistical Issues in Drug Development delivers an insightful treatment of the intersection between statistics and the life sciences. The book offers readers new discussions of crucial topics, including cluster randomization, historical controls, responder analysis, studies in children, post-hoc tests, estimands, publication bias, the replication crisis, and many more. This work presents the major statistical issues in drug development in a way that is accessible and comprehensible to life scientists working in the field, and takes pains not to gloss over significant disagreements in the field of statistics, while encouraging communication between the statistical and life sciences disciplines. In addition to new material on topics like invalid inversion, severity, random effects in network meta-analysis, and explained variation, readers will benefit from the inclusion of:
● A thorough introduction to basic topics in drug development and statistics, including the role played by statistics in drug development
● An exploration of the four views of statistics in drug development, including the historical, methodological, technical, and professional
● An examination of debatable and controversial topics in drug development, including the allocation of treatments to patients in clinical trials, baselines and covariate information, and the measurement of treatment effects
Perfect for life scientists and other professionals working in the field of drug development, Statistical Issues in Drug Development is the ideal resource for anyone seeking a one-stop reference to enhance their understanding of the use of statistics during drug development.

Statistical Language and Speech Processing: 9th International Conference, SLSP 2021, Virtual Event, November 22-26, 2021, Proceedings (Lecture Notes in Computer Science #13062)

by Luis Espinosa-Anke Carlos Martín-Vide Irena Spasić

This book constitutes the proceedings of the 9th International Conference on Statistical Language and Speech Processing, SLSP 2021, held in Cardiff, UK, in November 2021. The 9 full papers presented in this volume were carefully reviewed and selected from 21 submissions. The papers present topics of either theoretical or applied interest discussing the employment of statistical models (including machine learning) within language and speech processing.

Statistical Language and Speech Processing: 4th International Conference, SLSP 2016, Pilsen, Czech Republic, October 11-12, 2016, Proceedings (Lecture Notes in Computer Science #9918)

by Pavel Král Carlos Martín-Vide

This book constitutes the refereed proceedings of the 4th International Conference on Statistical Language and Speech Processing, SLSP 2016, held in Pilsen, Czech Republic, in October 2016. The 11 full papers presented together with two invited talks were carefully reviewed and selected from 38 submissions. The papers cover topics such as anaphora and coreference resolution; authorship identification, plagiarism and spam filtering; computer-aided translation; corpora and language resources; data mining and semantic web; information extraction; information retrieval; knowledge representation and ontologies; lexicons and dictionaries; machine translation; multimodal technologies; natural language understanding; neural representation of speech and language; opinion mining and sentiment analysis; parsing; part-of-speech tagging; question and answering systems; semantic role labeling; speaker identification and verification; speech and language generation; speech recognition; speech synthesis; speech transcription; speech correction; spoken dialogue systems; term extraction; text categorization; text summarization; user modeling.

Statistical Language and Speech Processing: 5th International Conference, SLSP 2017, Le Mans, France, October 23–25, 2017, Proceedings (Lecture Notes in Computer Science #10583)

by Nathalie Camelin, Yannick Estève and Carlos Martín-Vide

This book constitutes the refereed proceedings of the 5th International Conference on Statistical Language and Speech Processing, SLSP 2017, held in Le Mans, France, in October 2017. The 21 full papers presented were carefully reviewed and selected from 39 submissions. The papers cover topics such as anaphora and coreference resolution; authorship identification, plagiarism and spam filtering; computer-aided translation; corpora and language resources; data mining and semantic web; information extraction; information retrieval; knowledge representation and ontologies; lexicons and dictionaries; machine translation; multimodal technologies; natural language understanding; neural representation of speech and language; opinion mining and sentiment analysis; parsing; part-of-speech tagging; question and answering systems; semantic role labeling; speaker identification and verification; speech and language generation; speech recognition; speech synthesis; speech transcription; speech correction; spoken dialogue systems; term extraction; text categorization; text summarization; user modeling. They are organized in the following sections: language and information extraction; post-processing and applications of automatic transcriptions; speech paralinguistics and synthesis; speech recognition: modeling and resources.

Statistical Learning and Modeling in Data Analysis: Methods and Applications (Studies in Classification, Data Analysis, and Knowledge Organization)

by Simona Balzano Giovanni C. Porzio Renato Salvatore Domenico Vistocco Maurizio Vichi

The contributions gathered in this book focus on modern methods for statistical learning and modeling in data analysis and present a series of engaging real-world applications. The book covers numerous research topics, ranging from statistical inference and modeling to clustering and factorial methods, from directional data analysis to time series analysis and small area estimation. The applications reflect new analyses in a variety of fields, including medicine, finance, engineering, marketing and cyber risk. The book gathers selected and peer-reviewed contributions presented at the 12th Scientific Meeting of the Classification and Data Analysis Group of the Italian Statistical Society (CLADAG 2019), held in Cassino, Italy, on September 11–13, 2019. CLADAG promotes advanced methodological research in multivariate statistics with a special focus on data analysis and classification, and supports the exchange and dissemination of ideas, methodological concepts, numerical methods, algorithms, and computational and applied results. This book, true to CLADAG’s goals, is intended for researchers and practitioners who are interested in the latest developments and applications in the field of data analysis and classification.

Statistical Learning for Big Dependent Data (Wiley Series in Probability and Statistics)

by Daniel Peña Ruey S. Tsay

Master advanced topics in the analysis of large, dynamically dependent datasets with this insightful resource.
Statistical Learning with Big Dependent Data delivers a comprehensive presentation of the statistical and machine learning methods useful for analyzing and forecasting large and dynamically dependent data sets. The book presents automatic procedures for modelling and forecasting large sets of time series data. Beginning with some visualization tools, the book discusses procedures and methods for finding outliers, clusters, and other types of heterogeneity in big dependent data. It then introduces various dimension reduction methods, including regularization and factor models such as regularized Lasso in the presence of dynamical dependence and dynamic factor models. The book also covers other forecasting procedures, including index models, partial least squares, boosting, and now-casting. It further presents machine-learning methods, including neural networks, deep learning, classification and regression trees and random forests. Finally, procedures for modelling and forecasting spatio-temporal dependent data are also presented. Throughout the book, the advantages and disadvantages of the methods discussed are given. The book uses real-world examples to demonstrate applications, including use of many R packages. Finally, an R package associated with the book is available to assist readers in reproducing the analyses of examples and to facilitate real applications.
Analysis of Big Dependent Data includes a wide variety of topics for modeling and understanding big dependent data, like:
● New ways to plot large sets of time series
● An automatic procedure to build univariate ARMA models for individual components of a large data set
● Powerful outlier detection procedures for large sets of related time series
● New methods for finding the number of clusters of time series and discrimination methods, including support vector machines, for time series
● Broad coverage of dynamic factor models including new representations and estimation methods for generalized dynamic factor models
● Discussion on the usefulness of lasso with time series and an evaluation of several machine learning procedures for forecasting large sets of time series
● Forecasting large sets of time series with exogenous variables, including discussions of index models, partial least squares, and boosting
● Introduction of modern procedures for modeling and forecasting spatio-temporal data
Perfect for PhD students and researchers in business, economics, engineering, and science, Statistical Learning with Big Dependent Data also belongs to the bookshelves of practitioners in these fields who hope to improve their understanding of statistical and machine learning methods for analyzing and forecasting big dependent data.

Statistical Learning from a Regression Perspective (Springer Texts in Statistics)

by Richard A. Berk

This textbook considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. This fully revised new edition includes important developments over the past 8 years. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis derives from sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. As in the first edition, a unifying theme is supervised learning that can be treated as a form of regression analysis. Key concepts and procedures are illustrated with real applications, especially those with practical implications. The material is written for upper undergraduate level and graduate students in the social and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems. The author uses this book in a course on modern regression for the social, behavioral, and biological sciences. All of the analyses included are done in R with code routinely provided.

Statistical Learning from a Regression Perspective (Springer Texts in Statistics)

by Richard A. Berk

This textbook considers statistical learning applications when interest centers on the conditional distribution of a response variable, given a set of predictors, and in the absence of a credible model that can be specified before the data analysis begins. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis depends in an integrated fashion on sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. The unifying theme is that supervised learning properly can be seen as a form of regression analysis. Key concepts and procedures are illustrated with a large number of real applications and their associated code in R, with an eye toward practical implications. The growing integration of computer science and statistics is well represented, including the occasional, but salient, tensions that result. Throughout, there are links to the big picture. The third edition considers significant advances in recent years, among which are: the development of overarching, conceptual frameworks for statistical learning; the impact of “big data” on statistical learning; the nature and consequences of post-model selection statistical inference; deep learning in various forms; the special challenges to statistical inference posed by statistical learning; the fundamental connections between data collection and data analysis; interdisciplinary ethical and political issues surrounding the application of algorithmic methods in a wide variety of fields, each linked to concerns about transparency, fairness, and accuracy. This edition features new sections on accuracy, transparency, and fairness, as well as a new chapter on deep learning. Precursors to deep learning get an expanded treatment. The connections between fitting and forecasting are considered in greater depth. Discussion of the estimation targets for algorithmic methods is revised and expanded throughout to reflect the latest research. Resampling procedures are emphasized. The material is written for upper undergraduate and graduate students in the social, psychological and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems.

Statistical Learning from a Regression Perspective (Springer Series in Statistics)

by Richard A. Berk

Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. Among the statistical learning procedures examined are bagging, random forests, boosting, and support vector machines. Response variables may be quantitative or categorical. Real applications are emphasized, especially those with practical implications. One important theme is the need to explicitly take into account asymmetric costs in the fitting process. For example, in some situations false positives may be far less costly than false negatives. Another important theme is to not automatically cede modeling decisions to a fitting algorithm. In many settings, subject-matter knowledge should trump formal fitting criteria. Yet another important theme is to appreciate the limitations of one’s data and not apply statistical learning procedures that require more than the data can provide. The material is written for graduate students in the social and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems. Intuitive explanations and visual representations are prominent. All of the analyses included are done in R.

Statistical Learning in Genetics: An Introduction Using R (Statistics for Biology and Health)

by Daniel Sorensen

This book provides an introduction to computer-based methods for the analysis of genomic data. Breakthroughs in molecular and computational biology have contributed to the emergence of vast data sets, where millions of genetic markers for each individual are coupled with medical records, generating an unparalleled resource for linking human genetic variation to human biology and disease. Similar developments have taken place in animal and plant breeding, where genetic marker information is combined with production traits. An important task for the statistical geneticist is to adapt, construct and implement models that can extract information from these large-scale data. An initial step is to understand the methodology that underlies the probability models and to learn the modern computer-intensive methods required for fitting these models. The objective of this book, suitable for readers who wish to develop analytic skills to perform genomic research, is to provide guidance to take this first step. This book is addressed to numerate biologists who typically lack the formal mathematical background of the professional statistician. For this reason, considerably more detail in explanations and derivations is offered. It is written in a concise style and examples are used profusely. A large proportion of the examples involve programming with the open-source package R. The R code needed to solve the exercises is provided. The MarkDown interface allows the students to implement the code on their own computer, contributing to a better understanding of the underlying theory. Part I presents methods of inference based on likelihood and Bayesian methods, including computational techniques for fitting likelihood and Bayesian models. Part II discusses prediction for continuous and binary data using both frequentist and Bayesian approaches. Some of the models used for prediction are also used for gene discovery. The challenge is to find promising genes without incurring a large proportion of false positive results. Therefore, Part II includes a detour on False Discovery Rate assuming frequentist and Bayesian perspectives. The last chapter of Part II provides an overview of a selected number of non-parametric methods. Part III consists of exercises and their solutions. Daniel Sorensen holds PhD and DSc degrees from the University of Edinburgh and is an elected Fellow of the American Statistical Association. He was professor of Statistical Genetics at Aarhus University where, at present, he is professor emeritus.

Statistical Learning of Complex Data (Studies in Classification, Data Analysis, and Knowledge Organization)

by Francesca Greselin, Laura Deldossi, Luca Bagnato and Maurizio Vichi

This book of peer-reviewed contributions presents the latest findings in classification, statistical learning, data analysis and related areas, including supervised and unsupervised classification, clustering, statistical analysis of mixed-type data, big data analysis, statistical modeling, graphical models and social networks. It covers both methodological aspects as well as applications to a wide range of fields such as economics, architecture, medicine, data management, consumer behavior and the gender gap. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field of data analysis and classification. It gathers selected and peer-reviewed contributions presented at the 11th Scientific Meeting of the Classification and Data Analysis Group of the Italian Statistical Society (CLADAG 2017), held in Milan, Italy, on September 13–15, 2017.
