Orthopedic Surgery Grading

2017-09-18, 16 pages, doc, 30 KB, 36 reads

is_833902

R in Action
Data analysis and graphics with R
Robert I. Kabacoff

Manning, Shelter Island

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact

Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 261
Shelter Island, NY 11964
Email: orders@manning.com

©2011 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

Manning Publications Co., 20 Baldwin Road, PO Box 261, Shelter Island, NY 11964
Development editor: Sebastian Stirling
Copyeditor: Liz Welch
Typesetter: Composure Graphics
Cover designer: Marija Tudor

ISBN: 9781935182399
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 -- MAL -- 16 15 14 13 12 11

brief contents

Part I Getting started
1 ■ Introduction to R
2 ■ Creating a dataset
3 ■ Getting started with graphs
4 ■ Basic data management
5 ■ Advanced data management

Part II Basic methods
6 ■ Basic graphs
7 ■ Basic statistics

Part III Intermediate methods
8 ■ Regression
9 ■ Analysis of variance
10 ■ Power analysis
11 ■ Intermediate graphs
12 ■ Resampling statistics and bootstrapping

Part IV Advanced methods
13 ■ Generalized linear models
14 ■ Principal components and factor analysis
15 ■ Advanced methods for missing data
16 ■ Advanced graphics

contents

preface
acknowledgments
about this book
about the cover illustration

Part I Getting started

1 Introduction to R
1.1 Why use R?
1.2 Obtaining and installing R
1.3 Working with R
    Getting started ■ Getting help ■ The workspace ■ Input and output
1.4 Packages
    What are packages? ■ Installing a package ■ Loading a package ■ Learning about a package
1.5 Batch processing
1.6 Using output as input: reusing results
1.7 Working with large datasets
1.8 Working through an example
1.9 Summary

2 Creating a dataset
2.1 Understanding datasets
2.2 Data structures
    Vectors ■ Matrices ■ Arrays ■ Data frames ■ Factors ■ Lists
2.3 Data input
    Entering data from the keyboard ■ Importing data from a delimited text file ■ Importing data from Excel ■ Importing data from XML ■ Webscraping ■ Importing data from SPSS ■ Importing data from SAS ■ Importing data from Stata ■ Importing data from netCDF ■ Importing data from HDF5 ■ Accessing database management systems (DBMSs) ■ Importing data via Stat/Transfer
2.4 Annotating datasets
    Variable labels ■ Value labels
2.5 Useful functions for working with data objects
2.6 Summary

3 Getting started with graphs
3.1 Working with graphs
3.2 A simple example
3.3 Graphical parameters
    Symbols and lines ■ Colors ■ Text characteristics ■ Graph and margin dimensions
3.4 Adding text, customized axes, and legends
    Titles ■ Axes ■ Reference lines ■ Legend ■ Text annotations
3.5 Combining graphs
    Creating a figure arrangement with fine control
3.6 Summary

4 Basic data management
4.1 A working example
4.2 Creating new variables
4.3 Recoding variables
4.4 Renaming variables
4.5 Missing values
    Recoding values to missing ■ Excluding missing values from analyses
4.6 Date values
    Converting dates to character variables ■ Going further
4.7 Type conversions
4.8 Sorting data
4.9 Merging datasets
    Adding columns ■ Adding rows
4.10 Subsetting datasets
    Selecting (keeping) variables ■ Excluding (dropping) variables ■ Selecting observations ■ The subset() function ■ Random samples
4.11 Using SQL statements to manipulate data frames
4.12 Summary

5 Advanced data management
5.1 A data management challenge
5.2 Numerical and character functions
    Mathematical functions ■ Statistical functions ■ Probability functions ■ Character functions ■ Other useful functions ■ Applying functions to matrices and data frames
5.3 A solution for our data management challenge
5.4 Control flow
    Repetition and looping ■ Conditional execution
5.5 User-written functions
5.6 Aggregation and restructuring
    Transpose ■ Aggregating data ■ The reshape package
5.7 Summary

Part II Basic methods

6 Basic graphs
6.1 Bar plots
    Simple bar plots ■ Stacked and grouped bar plots ■ Mean bar plots ■ Tweaking bar plots ■ Spinograms
6.2 Pie charts
6.3 Histograms
6.4 Kernel density plots
6.5 Box plots
    Using parallel box plots to compare groups ■ Violin plots
6.6 Dot plots
6.7 Summary

7 Basic statistics
7.1 Descriptive statistics
    A menagerie of methods ■ Descriptive statistics by group ■ Visualizing results
7.2 Frequency and contingency tables
    Generating frequency tables ■ Tests of independence ■ Measures of association ■ Visualizing results ■ Converting tables to flat files
7.3 Correlations
    Types of correlations ■ Testing correlations for significance ■ Visualizing correlations
7.4 t-tests
    Independent t-test ■ Dependent t-test ■ When there are more than two groups
7.5 Nonparametric tests of group differences
    Comparing two groups ■ Comparing more than two groups
7.6 Visualizing group differences
7.7 Summary

Part III Intermediate methods

8 Regression
8.1 The many faces of regression
    Scenarios for using OLS regression ■ What you need to know
8.2 OLS regression
    Fitting regression models with lm() ■ Simple linear regression ■ Polynomial regression ■ Multiple linear regression ■ Multiple linear regression with interactions
8.3 Regression diagnostics
    A typical approach ■ An enhanced approach ■ Global validation of linear model assumption ■ Multicollinearity
8.4 Unusual observations
    Outliers ■ High leverage points ■ Influential observations
8.5 Corrective measures
    Deleting observations ■ Transforming variables ■ Adding or deleting variables ■ Trying a different approach
8.6 Selecting the "best" regression model
    Comparing models ■ Variable selection
8.7 Taking the analysis further
    Cross-validation ■ Relative importance
8.8 Summary

9 Analysis of variance
9.1 A crash course on terminology
9.2 Fitting ANOVA models
    The aov() function ■ The order of formula terms
9.3 One-way ANOVA
    Multiple comparisons ■ Assessing test assumptions
9.4 One-way ANCOVA
    Assessing test assumptions ■ Visualizing the results
9.5 Two-way factorial ANOVA
9.6 Repeated measures ANOVA
9.7 Multivariate analysis of variance (MANOVA)
    Assessing test assumptions ■ Robust MANOVA
9.8 ANOVA as regression
9.9 Summary

10 Power analysis
10.1 A quick review of hypothesis testing
10.2 Implementing power analysis with the pwr package
    t-tests ■ ANOVA ■ Correlations ■ Linear models ■ Tests of proportions ■ Chi-square tests ■ Choosing an appropriate effect size in novel situations
10.3 Creating power analysis plots
10.4 Other packages
10.5 Summary

11 Intermediate graphs
11.1 Scatter plots
    Scatter plot matrices ■ High-density scatter plots ■ 3D scatter plots ■ Bubble plots
11.2 Line charts
11.3 Correlograms
11.4 Mosaic plots
11.5 Summary

12 Resampling statistics and bootstrapping
12.1 Permutation tests
12.2 Permutation test with the coin package
    Independent two-sample and k-sample tests ■ Independence in contingency tables ■ Independence between numeric variables ■ Dependent two-sample and k-sample tests ■ Going further
12.3 Permutation tests with the lmPerm package
    Simple and polynomial regression ■ Multiple regression ■ One-way ANOVA and ANCOVA ■ Two-way ANOVA
12.4 Additional comments on permutation tests
12.5 Bootstrapping
12.6 Bootstrapping with the boot package
    Bootstrapping a single statistic ■ Bootstrapping several statistics
12.7 Summary

Part IV Advanced methods

13 Generalized linear models
13.1 Generalized linear models and the glm() function
    The glm() function ■ Supporting functions ■ Model fit and regression diagnostics
13.2 Logistic regression
    Interpreting the model parameters ■ Assessing the impact of predictors on the probability of an outcome ■ Overdispersion ■ Extensions
13.3 Poisson regression
    Interpreting the model parameters ■ Overdispersion ■ Extensions
13.4 Summary

14 Principal components and factor analysis
14.1 Principal components and factor analysis in R
14.2 Principal components
    Selecting the number of components to extract ■ Extracting principal components ■ Rotating principal components ■ Obtaining principal components scores
14.3 Exploratory factor analysis
    Deciding how many common factors to extract ■ Extracting common factors ■ Rotating factors ■ Factor scores ■ Other EFA-related packages
14.4 Other latent variable models
14.5 Summary

15 Advanced methods for missing data
15.1 Steps in dealing with missing data
15.2 Identifying missing values
15.3 Exploring missing values patterns
    Tabulating missing values ■ Exploring missing data visually ■ Using correlations to explore missing values
15.4 Understanding the sources and impact of missing data
15.5 Rational approaches for dealing with incomplete data
15.6 Complete-case analysis (listwise deletion)
15.7 Multiple imputation
15.8 Other approaches to missing data
    Pairwise deletion ■ Simple (nonstochastic) imputation
15.9 Summary

16 Advanced graphics
16.1 The four graphic systems in R
16.2 The lattice package
    Conditioning variables ■ Panel functions ■ Grouping variables ■ Graphic parameters ■ Page arrangement
16.3 The ggplot2 package
16.4 Interactive graphs
    Interacting with graphs: identifying points ■ playwith ■ latticist ■ Interactive graphics with the iplots package ■ rggobi
16.5 Summary

afterword Into the rabbit hole
appendix A Graphic user interfaces
appendix B Customizing the startup environment
appendix C Exporting data from R
appendix D Creating publication-quality output
appendix E Matrix Algebra in R
appendix F Packages used in this book
appendix G Working with large datasets
appendix H Updating an R installation
index

preface

What is the use of a book, without pictures or conversations?
    Alice, Alice in Wonderland

It's wondrous, with treasures to satiate desires both subtle and gross; but it's not for the timid.
    Q, "Q Who?" Star Trek: The Next Generation

When I began writing this book, I spent quite a bit of time searching for a good quote to start things off. I ended up with two. R is a wonderfully flexible platform and language for exploring, visualizing, and understanding data. I chose the quote from Alice in Wonderland to capture the flavor of statistical analysis today: an interactive process of exploration, visualization, and interpretation.

The second quote reflects the generally held notion that R is difficult to learn. What I hope to show you is that it doesn't have to be.
R is broad and powerful, with so many analytic and graphic functions available (more than 50,000 at last count) that it easily intimidates both novice and experienced users alike. But there is rhyme and reason to the apparent madness. With guidelines and instructions, you can navigate the tremendous resources available, selecting the tools you need to accomplish your work with style, elegance, efficiency, and more than a little coolness.

I first encountered R several years ago, when applying for a new statistical consulting position. The prospective employer asked in the pre-interview material if I was conversant in R. Following the standard advice of recruiters, I immediately said yes, and set off to learn it. I was an experienced statistician and researcher, had 25 years of experience as a SAS and SPSS programmer, and was fluent in a half dozen programming languages. How hard could it be? Famous last words.

As I tried to learn the language (as fast as possible, with an interview looming), I found either tomes on the underlying structure of the language or dense treatises on specific advanced statistical methods, written by and for subject-matter experts. The online help was written in a Spartan style that was more reference than tutorial. Every time I thought I had a handle on the overall organization and capabilities of R, I found something new that made me feel ignorant and small.

To make sense of it all, I approached R as a data scientist.
I thought about what it takes to successfully process, analyze, and understand data, including

■ Accessing the data (getting the data into the application from multiple sources)
■ Cleaning the data (coding missing data, fixing or deleting miscoded data, transforming variables into more useful formats)
■ Annotating the data (in order to remember what each piece represents)
■ Summarizing the data (getting descriptive statistics to help characterize the data)
■ Visualizing the data (because a picture really is worth a thousand words)
■ Modeling the data (uncovering relationships and testing hypotheses)
■ Preparing the results (creating publication-quality tables and graphs)

Then I tried to understand how I could use R to accomplish each of these tasks. Because I learn best by teaching, I eventually created a website (www.statmethods.net) to document what I had learned.

Then, about a year ago, Marjan Bace (the publisher) called and asked if I would like to write a book on R. I had already written 50 journal articles, 4 technical manuals, numerous book chapters, and a book on research methodology, so how hard could it be? At the risk of sounding repetitive: famous last words.

The book you're holding is the one that I wished I had so many years ago. I have tried to provide you with a guide to R that will allow you to quickly access the power of this great open source endeavor, without all the frustration and angst. I hope you enjoy it.

P.S. I was offered the job but didn't take it. However, learning R has taken my career in directions that I could never have anticipated. Life can be funny.

acknowledgments

A number of people worked hard to make this a better book. They include

■ Marjan Bace, Manning publisher, who asked me to write this book in the first place.
■ Sebastian Stirling, development editor, who spent many hours on the phone with me, helping me organize the material, clarify concepts, and generally make the text more interesting.
  He also helped me through the many steps to publication.
■ Karen Tegtmeyer, review editor, who helped obtain reviewers and coordinate the review process.
■ Mary Piergies, who helped shepherd this book through the production process, and her team of Liz Welch, Susan Harkins, and Rachel Schroeder.
■ Pablo Domínguez Vaselli, technical proofreader, who helped uncover areas of confusion and provided an independent and expert eye for testing code.
■ The peer reviewers who spent hours of their own time carefully reading through the material, finding typos and making valuable substantive suggestions: Chris Williams, Charles Malpas, Angela Staples, PhD, Daniel Reis Pereira, Dr. D. H. van Rijn, Dr. Christian Marquardt, Amos Folarin, Stuart Jefferys, Dror Berel, Patrick Breen, Elizabeth Ostrowski, PhD, Atef Ouni, Carles Fenollosa, Ricardo Pietrobon, Samuel McQuillin, Landon Cox, Austin Ziegler, Rick Wagner, Ryan Cox, Sumit Pal, Philipp K. Janert, Deepak Vohra, and Sophie Mormede.
■ The many Manning Early Access Program (MEAP) participants who bought the book before it was finished, asked great questions, pointed out errors, and made helpful suggestions.

Each contributor has made this a better and more comprehensive book.

I would also like to acknowledge the many software authors who have contributed to making R such a powerful data-analytic platform. They include not only the core developers, but also the selfless individuals who have created and maintain contributed packages, extending R's capabilities greatly. Appendix F provides a list of the authors of contributed packages described in this book. In particular, I would like to mention John Fox, Hadley Wickham, Frank E. Harrell, Jr., Deepayan Sarkar, and William Revelle, whose works I greatly admire. I have tried to represent their contributions accurately, and I remain solely responsible for any errors or distortions inadvertently included in this book.
I really should have started this book by thanking my wife and partner, Carol Lynn. Although she has no intrinsic interest in statistics or programming, she read each chapter multiple times and made countless corrections and suggestions. No greater love has any person than to read multivariate statistics for another. Just as important, she suffered the long nights and weekends that I spent writing this book, with grace, support, and affection. There is no logical explanation why I should be this lucky.

There are two other people I would like to thank. One is my father, whose love of science was inspiring and who gave me an appreciation of the value of data. The other is Gary K. Burger, my mentor in graduate school. Gary got me interested in a career in statistics and teaching when I thought I wanted to be a clinician. This is all his fault.

about this book

If you picked up this book, you probably have some data that you need to collect, summarize, transform, explore, model, visualize, or present. If so, then R is for you! R has become the worldwide language for statistics, predictive analytics, and data visualization. It offers the widest range available of methodologies for understanding data, from the most basic to the most complex and bleeding edge.

As an open source project it's freely available for a range of platforms, including Windows, Mac OS X, and Linux. I