
Stata cross-tabulation analysis

2010-10-19, 34 pages, PDF, 447 KB, 169 views

Getting Started in Frequencies, Crosstab, Factor and Regression Analysis
(ver. 2.0 beta, draft)

Oscar Torres-Reyna
Data Consultant
otorres@princeton.edu
http://dss.princeton.edu/training/

Case study: intro

    Search here in the home page for this dataset
    Codebook in two formats
    Datasets, two formats: ASCII and SPSS
    Marginals
    Metadata

NOTE: When data is not available in Stata format, you can download the SPSS portable file (*.por), open it in SPSS (available at the DSS lab) and save it as a Stata file.

Case study: frequencies

Distribution of electoral preferences and gender. According to the codebook, ‘q5’ holds the electoral question and ‘qa’ holds gender.

NOTE: At this point it is strongly recommended to open a log to keep a record of your work and to extract output. Type:

    log using mywork.log

You could also open a do-file by typing doedit and copy your commands there.

. tab q5 /*No weights*/

  Q5. If the Presidential election were |
     held today and the candidates were |
                                 Barack |       Freq.     Percent        Cum.
----------------------------------------+------------------------------------
Barack Obama and Joe Biden, the Democra |         481       45.68       45.68
John McCain and Sarah Palin, the Republ |         464       44.06       89.74
                    (VOL) Other/Neither |          21        1.99       91.74
   (VOL) Undecided/Don't know/no answer |          87        8.26      100.00
----------------------------------------+------------------------------------
                                  Total |       1,053      100.00

. tab q5 [aweight=weight] /*With weights*/

  Q5. If the Presidential election were |
     held today and the candidates were |
                                 Barack |       Freq.     Percent        Cum.
----------------------------------------+------------------------------------
Barack Obama and Joe Biden, the Democra |  504.337749       47.90       47.90
John McCain and Sarah Palin, the Republ |  449.487545       42.69       90.58
                    (VOL) Other/Neither |  20.5570831        1.95       92.53
   (VOL) Undecided/Don't know/no answer | 78.61762284        7.47      100.00
----------------------------------------+------------------------------------
                                  Total |       1,053      100.00

. tab qa /*No weights*/

  A. Gender |
    (DO NOT |
       ASK) |      Freq.     Percent        Cum.
------------+-----------------------------------
       Male |        493       46.82       46.82
     Female |        560       53.18      100.00
------------+-----------------------------------
      Total |      1,053      100.00

. tab qa [aweight=weight] /*With weights*/

  A. Gender |
    (DO NOT |
       ASK) |      Freq.     Percent        Cum.
------------+-----------------------------------
       Male | 500.388396       47.52       47.52
     Female | 552.611604       52.48      100.00
------------+-----------------------------------
      Total |      1,053      100.00
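The logging advice and the four frequency tables above can be collected into one short do-file. This is only a sketch under stated assumptions: the file name poll.dta is hypothetical (the text does not name the Stata file), and the replace option on log using is an addition so the do-file can be rerun without an error about an existing log.

```stata
* Minimal do-file sketch. "poll.dta" is a hypothetical file name; substitute
* the Stata file you saved from the SPSS portable file.
capture log close                 // close any log left open by an earlier run
log using mywork.log, replace     // "replace" is an added assumption: overwrite an old log
use "poll.dta", clear

tab q5                            // electoral preference, no weights
tab q5 [aweight=weight]           // electoral preference, with weights
tab qa                            // gender, no weights
tab qa [aweight=weight]           // gender, with weights

log close
```

Running the commands from a do-file rather than typing them one by one keeps the weighted and unweighted versions side by side in mywork.log, which makes the effect of the weights easy to compare.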
Case study: Electoral preferences by gender

. tab q5 qa [aw=weight], col row /*Electoral preferences by gender*/

+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

Q5. If the             |
Presidential election  |   A. Gender (DO NOT
were held today and    |         ASK)
the candidates were    |
Barack                 |       Male     Female |      Total
-----------------------+-----------------------+-----------
  Barack Obama and Joe |  209.42078  294.91697 |  504.33775
                       |      41.52      58.48 |     100.00
                       |      41.85      53.37 |      47.90
                       |
 John McCain and Sarah |   252.9313  196.55625 | 449.487545
                       |      56.27      43.73 |     100.00
                       |      50.55      35.57 |      42.69
                       |
   (VOL) Other/Neither |  10.055739 10.5013441 |  20.557083
                       |      48.92      51.08 |     100.00
                       |       2.01       1.90 |       1.95
                       |
 (VOL) Undecided/Don't |  27.980574  50.637048 |  78.617623
                       |      35.59      64.41 |     100.00
                       |       5.59       9.16 |       7.47
-----------------------+-----------------------+-----------
                 Total |   500.3884   552.6116 |      1,053
                       |      47.52      52.48 |     100.00
                       |     100.00     100.00 |     100.00

Case study: Electoral preferences by age

. tab q5 f1 [aw=weight], col row /*Electoral preferences by age*/

+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

Q5. If the             |
Presidential election  |
were held today and    |
the candidates were    |                           F1. What is your age?
Barack                 |      18-24      25-29      30-34      35-39      40-44      45-54      55-64
-----------------------+-----------------------------------------------------------------------------
  Barack Obama and Joe |  29.355119  26.435913  40.727272  46.595118  38.610873  129.51971  86.169373
                       |       5.82       5.24       8.08       9.24       7.66      25.68      17.09
                       |      77.57      50.53      40.92      51.51      37.59      51.68      50.32
                       |
 John McCain and Sarah |  6.2229886   22.18839  54.883049  36.825588  51.046351  99.992283  69.037199
                       |       1.38       4.94      12.21       8.19      11.36      22.25      15.36
                       |      16.44      42.42      55.14      40.71      49.69      39.90      40.31
                       |
   (VOL) Other/Neither |          0          0  2.1209543  2.4419715 4.51458561  3.1358789  2.7783459
                       |       0.00       0.00      10.32      11.88      21.96      15.25      13.52
                       |       0.00       0.00       2.13       2.70       4.39       1.25       1.62
                       |
 (VOL) Undecided/Don't |  2.2672181  3.6879373   1.809561  4.5920698  8.5570854  17.952531  13.264407
                       |       2.88       4.69       2.30       5.84      10.88      22.84      16.87
                       |       5.99       7.05       1.82       5.08       8.33       7.16       7.75
-----------------------+-----------------------------------------------------------------------------
                 Total |  37.845325  52.312241  99.540836  90.454747   102.7289 250.600407  171.24932
                       |       3.59       4.97       9.45       8.59       9.76      23.80      16.26
                       |     100.00     100.00     100.00     100.00     100.00     100.00     100.00

Q5. If the             |
Presidential election  |
were held today and    |
the candidates were    | F1. What is your age?
Barack                 |  65 or old   (VOL) No |      Total
-----------------------+-----------------------+-----------
  Barack Obama and Joe |  102.39238  4.5319886 |  504.33775
                       |      20.30       0.90 |     100.00
                       |      43.21      40.00 |      47.90
                       |
 John McCain and Sarah |  104.76215  4.5295414 | 449.487545
                       |      23.31       1.01 |     100.00
                       |      44.21      39.98 |      42.69
                       |
   (VOL) Other/Neither | 5.56534701          0 |  20.557083
                       |      27.07       0.00 |     100.00
                       |       2.35       0.00 |       1.95
                       |
 (VOL) Undecided/Don't |  24.219596  2.2672179 |  78.617623
                       |      30.81       2.88 |     100.00
                       |      10.22      20.01 |       7.47
-----------------------+-----------------------+-----------
                 Total |  236.93948  11.328748 |      1,053
                       |      22.50       1.08 |     100.00
                       |     100.00     100.00 |     100.00

Case study: Electoral preferences by educational attainment

. tab q5 f4 [aw=weight], col row /*Electoral preferences by education*/

+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

Q5. If the             |
Presidential election  |
were held today and    |
the candidates were    |      F4. What is the highest grade of schooling that you've completed?
Barack                 |  8th grade  Some high  High scho  Some coll  College g  Postgradu   (VOL) No |      Total
-----------------------+------------------------------------------------------------------------------+-----------
  Barack Obama and Joe |  2.2991619   3.883265  81.589679  113.53524  169.39657  130.23545  3.3983797 |  504.33775
                       |       0.46       0.77      16.18      22.51      33.59      25.82       0.67 |     100.00
                       |      41.27      33.65      44.32      45.42      44.89      59.52      60.03 |      47.90
                       |
 John McCain and Sarah |  3.2718681  6.1159475 76.7484051  116.69213  170.30303  74.093841  2.2623235 | 449.487545
                       |       0.73       1.36      17.07      25.96      37.89      16.48       0.50 |     100.00
                       |      58.73      53.00      41.69      46.68      45.13      33.86      39.97 |      42.69
                       |
   (VOL) Other/Neither |          0          0  3.7389017   3.382658  9.8911577  3.5443656          0 |  20.557083
                       |       0.00       0.00      18.19      16.45      48.12      17.24       0.00 |     100.00
                       |       0.00       0.00       2.03       1.35       2.62       1.62       0.00 |       1.95
                       |
 (VOL) Undecided/Don't |          0  1.5397725  22.004128  16.367784 27.7818421  10.924096          0 |  78.617623
                       |       0.00       1.96      27.99      20.82      35.34      13.90       0.00 |     100.00
                       |       0.00      13.34      11.95       6.55       7.36       4.99       0.00 |       7.47
-----------------------+------------------------------------------------------------------------------+-----------
                 Total |    5.57103  11.538985  184.08111  249.97781   377.3726  218.79775  5.6607032 |      1,053
                       |       0.53       1.10      17.48      23.74      35.84      20.78       0.54 |     100.00
                       |     100.00     100.00     100.00     100.00     100.00     100.00     100.00 |     100.00

Case study: Electoral preferences by income

. tab q5 f13 [aw=weight], col row /*Electoral preferences by income*/

+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

Q5. If the             |
Presidential election  |
were held today and    |
the candidates were    |  F13. Finally, just for classification purposes, was your total family income bef
Barack                 |  Less than  $20,000 t  $35,000 t  $50,000 t  $75,000 t   $100,000  or $150,0   (VOL) No |      Total
-----------------------+-----------------------------------------------------------------------------------------+-----------
  Barack Obama and Joe |  37.525195   51.14097  72.715849  122.78749  59.632459  69.732723 51.1129092  39.690155 |  504.33775
                       |       7.44      10.14      14.42      24.35      11.82      13.83      10.13       7.87 |     100.00
                       |      60.42      49.40      48.59      57.51      39.05      46.16      44.46      37.63 |      47.90
                       |
 John McCain and Sarah |  18.630762  39.764056 64.4115908  69.827216  86.023642  68.843117  54.640308  47.346852 | 449.487545
                       |       4.14       8.85      14.33      15.53      19.14      15.32      12.16      10.53 |     100.00
                       |      30.00      38.41      43.04      32.71      56.34      45.57      47.53      44.88 |      42.69
                       |
   (VOL) Other/Neither |  1.5060026  .88321203  3.2060684  2.5018142  2.1243815  3.0806277   2.200355  5.0546217 |  20.557083
                       |       7.33       4.30      15.60      12.17      10.33      14.99      10.70      24.59 |     100.00
                       |       2.42       0.85       2.14       1.17       1.39       2.04       1.91       4.79 |       1.95
                       |
 (VOL) Undecided/Don't |  4.4480018  11.739914  9.3136182   18.37691  4.9181423   9.409895 7.01703324 13.3941079 |  78.617623
                       |       5.66      14.93      11.85      23.38       6.26      11.97       8.93      17.04 |     100.00
                       |       7.16      11.34       6.22       8.61       3.22       6.23       6.10      12.70 |       7.47
-----------------------+-----------------------------------------------------------------------------------------+-----------
                 Total |  62.109961  103.52815  149.64713  213.49343  152.69863  151.06636  114.97061  105.48574 |      1,053
                       |       5.90       9.83      14.21      20.27      14.50      14.35      10.92      10.02 |     100.00
                       |     100.00     100.00     100.00     100.00     100.00     100.00     100.00     100.00 |     100.00

Case study: Electoral preferences by employment status

. tab q5 f8 [aw=weight], col row /*Electoral preferences by employment status*/

+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

Q5. If the             |
Presidential election  |
were held today and    |
the candidates were    |                                            f8
Barack                 |   Employed   Employed   Laid off    Retired    Student  Homemaker  Something   (VOL) No |      Total
-----------------------+-----------------------------------------------------------------------------------------+-----------
  Barack Obama and Joe |  263.30095 36.9693237  17.692466  125.50328  15.486465  16.644394 24.6988275  4.0420475 |  504.33775
                       |      52.21       7.33       3.51      24.88       3.07       3.30       4.90       0.80 |     100.00
                       |      47.30      52.01      67.23      46.14      82.02      27.25      60.77      64.12 |      47.90
                       |
 John McCain and Sarah |  252.31686  25.723928  6.1500438   112.5963  1.1268505  37.187532  12.123702  2.2623235 | 449.487545
                       |      56.13       5.72       1.37      25.05       0.25       8.27       2.70       0.50 |     100.00
                       |      45.33      36.19      23.37      41.39       5.97      60.89      29.83      35.88 |      42.69
                       |
   (VOL) Other/Neither |  11.498793  1.6530186          0  6.1126834          0          0  1.2925883          0 |  20.557083
                       |      55.94       8.04       0.00      29.74       0.00       0.00       6.29       0.00 |     100.00
                       |       2.07       2.33       0.00       2.25       0.00       0.00       3.18       0.00 |       1.95
                       |
 (VOL) Undecided/Don't |  29.558151  6.7386098  2.4747578  27.814172  2.2672181  7.2399743    2.52474          0 |  78.617623
                       |      37.60       8.57       3.15      35.38       2.88       9.21       3.21       0.00 |     100.00
                       |       5.31       9.48       9.40      10.22      12.01      11.85       6.21       0.00 |       7.47
-----------------------+-----------------------------------------------------------------------------------------+-----------
                 Total |  556.67476   71.08488 26.3172676  272.02643 18.8805338    61.0719  40.639858   6.304371 |      1,053
                       |      52.87       6.75       2.50      25.83       1.79       5.80       3.86       0.60 |     100.00
                       |     100.00     100.00     100.00     100.00     100.00     100.00     100.00     100.00 |     100.00

Case study: Testing for associations (preparing the data)

Before running any test we need to prepare the data by setting to missing any non-valid response (like “don’t know/no answer/not sure”), unless it is relevant to the question. It is important to ‘clean’ the variables so that the tests are as accurate as possible. For demographics we will remove non-response items. Here is a series of commands, one per variable, to prepare some variables for you to run on your own.
Each step below lists one command per variable, in the order Age, Education, Income, Employment, Gender.

Creating a new variable:
    gen age=f1
    gen educ=f4
    gen income=f13
    gen employ=f8
    gen gender=qa

Exploring the new variable:
    tab age
    tab educ
    tab income
    tab employ
    tab gender

Checking for labels from the original variable:
    labelbook f1
    labelbook f4
    labelbook f13
    labelbook f8
    labelbook qa

Assigning labels to the new variable:
    label value age f1
    label value educ f4
    label value income f13
    label value employ f8
    label value gender qa

Exploring the new variable:
    tab age
    tab educ
    tab income
    tab employ
    tab gender

Setting no response to missing (gender needs no change):
    replace age=. if age>8
    replace educ=. if educ==8
    replace income=. if income==8
    replace employ=. if employ==8

Adding variable labels:
    label variable age "Age"
    label variable educ "Educational attainment"
    label variable income "Family income"
    label variable employ "Employment status"

Exploring the new variable:
    tab age
    tab educ
    tab income
    tab employ

Case study: Testing for associations (preparing the data, cont.)

Here is an easier way to do it, using the command clonevar in Stata.

Creating a new variable:
    clonevar age=f1
    clonevar educ=f4
    clonevar income=f13
    clonevar employ=f8
    clonevar gender=qa

Exploring the new variable:
    tab age
    tab educ
    tab income
    tab employ
    tab gender

Setting no response to missing (gender needs no change):
    replace age=. if age>8
    replace educ=. if educ==8
    replace income=. if income==8
    replace employ=. if employ==8

Exploring the new variable:
    tab age
    tab educ
    tab income
    tab employ

Case study: testing for associations

To find whether there is some association between demographics and electoral preferences we can use chi-square, but first we need to ‘clean’ the electoral variable (q5). Let’s create a new variable ‘elec’ from ‘q5’.
We will use recode for this. The command has the following parts:

    Original variable
    Value 1=1, with label in quotes
    Value 2=2, with label in quotes
    Values 3, 4 & 8 = 3, with label in quotes
    New variable, name in parentheses
    Labels are saved as ‘elec’

Here is the new variable:

                      |      Freq.     Percent        Cum.
----------------------+-----------------------------------
          Obama/Biden |        481
         McCain/Palin |        464       44.06       89.74
Undecided/DK/NA/Other |        108       10.26      100.00
----------------------+-----------------------------------
                Total |      1,053      100.00

We use the ‘nofreq’ option after the comma since we are not interested in the crosstabulations but rather in the tests. We can see that gender, education, income and employment status are associated in some way with electoral preferences. Age does not seem to have any association.
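The recode command and the test commands themselves did not survive in this text, only their annotations and results. The sketch below is a reconstruction consistent with those annotations, not the author's original commands: the value labels are taken from the table of the new variable, and the variables gender, age, educ, income and employ are assumed to be the cleaned versions created in the preparation steps. The tests are run without weights.

```stata
* Reconstruction from the annotations, not the original command.
* The /// line continuation works in a do-file, not at the command prompt.
recode q5 (1=1 "Obama/Biden") (2=2 "McCain/Palin") ///
    (3 4 8=3 "Undecided/DK/NA/Other"), generate(elec) label(elec)
tab elec

* Chi-square test of association between elec and each cleaned demographic.
* nofreq suppresses the cell frequencies so only the test result is shown.
foreach v of varlist gender age educ income employ {
    tab elec `v', chi2 nofreq
}
```

Each pass of the loop prints the Pearson chi-square statistic and its p-value for one demographic variable; a small p-value is read as evidence of an association with electoral preference.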