
Monte Carlo Methods in Statistics

Christian Robert*
Université Paris Dauphine and CREST, INSEE
September 2, 2009

arXiv:0909.0389v1 [stat.CO] 2 Sep 2009

Monte Carlo methods are now an essential part of the statistician's toolbox, to the point of being more familiar to graduate students than the measure theoretic notions upon which they are based! We recall in this note some of the advances made in the design of Monte Carlo techniques towards their use in Statistics, referring to Robert and Casella (2004, 2010) for an in-depth coverage.

* Professor of Statistics, CEREMADE, Université Paris Dauphine, 75785 Paris cedex 16, France. Supported by the Agence Nationale de la Recherche (ANR, 212, rue de Bercy 75012 Paris) through the 2009-2012 project ANR-08-BLAN-0218 Big'MC. Email: xian@ceremade.dauphine.fr. The author is grateful to Jean-Michel Marin for helpful comments.

The basic Monte Carlo principle and its extensions

The most appealing feature of Monte Carlo methods [for a statistician] is that they rely on sampling and on probability notions, which are the bread and butter of our profession. Indeed, the foundation of Monte Carlo approximations is identical to the validation of empirical moment estimators, in that the average

    (1/T) ∑_{t=1}^{T} h(x_t),   x_t ∼ f(x),   (1)

converges to the expectation E_f[h(X)] when T goes to infinity. Furthermore, the precision of this approximation is exactly of the same kind as the precision of a statistical estimate, in that it usually evolves as O(√T). Therefore, once a sample x_1, ..., x_T is produced according to a distribution density f, all standard statistical tools, including the bootstrap, apply to this sample (with the further appeal that more data points can be produced if deemed necessary). As illustrated by Figure 1, the variability due to a single Monte Carlo experiment must be accounted for when drawing conclusions about its output, and evaluations of the overall variability of the sequence of approximations are provided in Kendall et al. (2007). But the ease with which such methods are analysed and the systematic resort to statistical intuition explain in part why Monte Carlo methods are privileged over numerical methods.

[Figure 1: Monte Carlo evaluation (1) of the expectation E[X³/(1 + X² + X⁴)] as a function of the number of simulations when X ∼ N(µ, 1), using (left) one simulation run and (right) 100 independent runs, for (top) µ = 0 and (bottom) µ = 2.5.]

The representation of integrals as expectations E_f[h(X)] is far from unique and there exist therefore many possible approaches to the above approximation. This range of choices corresponds to the importance sampling strategies (Rubinstein 1981) in Monte Carlo, based on the obvious identity

    E_f[h(X)] = E_g[h(X) f(X)/g(X)]

provided the support of the density g includes the support of f. Some choices of g may however lead to appallingly poor performances of the resulting Monte Carlo estimates, in that the variance of the resulting empirical average may be infinite, a danger worth highlighting since it is often neglected while having a major impact on the quality of the approximations. From a statistical perspective, there exist some natural choices for the importance function g, based on Fisher information and on analytical approximations to the likelihood function like the Laplace approximation (Rue et al. 2008), even though it is more robust to replace the normal distribution in the Laplace approximation with a t distribution. The special case of Bayes factors (Robert and Casella 2004)

    B_01(x) = ∫_Θ f(x|θ) π_0(θ) dθ / ∫_Θ f(x|θ) π_1(θ) dθ,

which drive Bayesian testing and model choice, and of their approximation has led to a specific class of importance sampling techniques known as bridge sampling (Chen et al. 2000), where the optimal importance function is made of a mixture of the posterior distributions corresponding to both models (assuming both parameter spaces can be mapped into the same Θ). We want to stress here that an alternative approximation of marginal likelihoods relying on the use of harmonic means (Gelfand and Dey 1994, Newton and Raftery 1994) and of direct simulations from a posterior density has repeatedly been used in the literature, despite often suffering from infinite variance (and thus numerical instability). Another potentially very efficient approximation of Bayes factors is provided by Chib's (1995) representation, based on parametric estimates of the posterior distribution.

MCMC methods

Markov chain Monte Carlo (MCMC) methods were proposed many years (Metropolis et al. 1953) before their impact in Statistics was truly felt. However, once Gelfand and Smith (1990) stressed the ultimate feasibility of producing a Markov chain with a given stationary distribution f, either via a Gibbs sampler that simulates each conditional distribution of f in its turn, or via a Metropolis-Hastings algorithm based on a proposal q(y|x) with acceptance probability [for a move from x to y]

    min{1, f(y) q(x|y) / f(x) q(y|x)},

then the spectrum of manageable models grew immensely and almost instantaneously.

Due to parallel developments at the time on graphical and hierarchical Bayesian models, like generalised linear mixed models (Zeger and Karim 1991), the wealth of multivariate models with available conditional distributions (and hence the potential of implementing the Gibbs sampler) was far from negligible, especially when the availability of latent variables became quasi universal due to the slice sampling representations (Damien et al. 1999, Neal 2003). (Although the adoption of Gibbs samplers has primarily taken place within Bayesian statistics, there is nothing that prevents an artificial augmentation of the data through such techniques.) For instance, if the density f(x) ∝ exp(−x²/2)/(1 + x² + x⁴) is known up to a normalising constant, f is the marginal (in x) of the joint distribution

    g(x, u) ∝ exp(−x²/2) I(u(1 + x² + x⁴) ≤ 1),

when u is restricted to (0, 1). The corresponding slice sampler then consists in simulating

    U | X = x ∼ U(0, 1/(1 + x² + x⁴))

and

    X | U = u ∼ N(0, 1) I(1 + x² + x⁴ ≤ 1/u),

the latter being a truncated normal distribution. As shown by Figure 2, the outcome of the resulting Gibbs sampler perfectly fits the target density, while the convergence of the expectation of X³ under f has a behaviour quite comparable with the iid setting.

[Figure 2: (left) Gibbs sampling approximation to the distribution f(x) ∝ exp(−x²/2)/(1 + x² + x⁴) against the true density; (right) range of convergence of the approximation to E_f[X³] = 0 against the number of iterations, using 100 independent runs of the Gibbs sampler, along with a single Gibbs run.]

While the Gibbs sampler first appears as the natural solution to a simulation problem in complex models, if only because it stems from the true target f, as exhibited by the widespread use of BUGS (Lunn et al. 2000), which mostly focuses on this approach, the infinite variations offered by the Metropolis-Hastings schemes offer much more efficient solutions when the proposal q(y|x) is appropriately chosen. The basic choice of a random walk proposal (q(y|x) being then a normal density centred at x) can be improved by exploiting some features of the target, as in Langevin algorithms (see Robert and Casella 2004, Section 7.8.5) and Hamiltonian or hybrid alternatives (Duane et al. 1987, Neal 1999) that build upon gradients. More recent proposals include particle learning about the target and sequential improvement of the proposal (Douc et al. 2007, Rosenthal 2007, Andrieu et al. 2010). Figure 3 reproduces Figure 2 for a random walk Metropolis-Hastings algorithm whose scale is calibrated towards an acceptance rate of 0.5. The range of the convergence paths is clearly wider than for the Gibbs sampler, but the fact that this is a generic algorithm applying to any target (instead of a specialised version, as for the Gibbs sampler) must be borne in mind.

[Figure 3: (left) Random walk Metropolis-Hastings sampling approximation to the distribution f(x) ∝ exp(−x²/2)/(1 + x² + x⁴) against the true density, for a scale of 1.2 corresponding to an acceptance rate of 0.5; (right) range of convergence of the approximation to E_f[X³] = 0 against the number of iterations, using 100 independent runs of the Metropolis-Hastings sampler, along with a single Metropolis-Hastings run.]

Another major improvement generated by a statistical imperative is the development of variable dimension generators that stemmed from Bayesian model choice requirements, the most important example being the reversible jump algorithm in Green (1995), which had a significant impact on the study of graphical models (Brooks et al. 2003).

Some uses of Monte Carlo in Statistics

The impact of Monte Carlo methods on Statistics was not truly felt until the early 1980's, with the publication of Rubinstein (1981) and Ripley (1987), but Monte Carlo methods have now become invaluable in Statistics because they allow one to address optimisation, integration and exploration problems that would otherwise be unreachable. For instance, the calibration of many tests and the derivation of their acceptance regions can only be achieved by simulation techniques. While integration issues are often linked with the Bayesian approach (since Bayes estimates are posterior expectations like ∫ h(θ) π(θ|x) dθ, and Bayes tests also involve integration, as mentioned earlier with the Bayes factors), and optimisation difficulties with the likelihood perspective, this classification is by no means tight (as for instance when likelihoods involve unmanageable integrals), and all fields of Statistics, from design to econometrics, from genomics to psychometry and environmetrics, now have to rely on Monte Carlo approximations. A whole new range of statistical methodologies have entirely integrated the simulation aspects. Examples include the bootstrap methodology (Efron 1982), where multilevel resampling is not conceivable without a computer; indirect inference (Gouriéroux et al. 1993), which constructs a pseudo-likelihood from simulations; MCEM (Cappé and Moulines 2009), where the E-step of the EM algorithm is replaced with a Monte Carlo approximation; and the more recent approximate Bayesian computation (ABC) used in population genetics (Beaumont et al. 2002), where the likelihood is not manageable but the underlying model can be simulated from.

In the past fifteen years, the collection of real problems that Statistics can [afford to] handle has truly undergone a quantum leap. Monte Carlo methods, and in particular MCMC techniques, have forever changed the emphasis from "closed form" solutions to algorithmic ones, expanded our impact to solving "real" applied problems while convincing scientists from other fields that statistical solutions were indeed available, and led us into a world where "exact" may mean "simulated". The size of the datasets and of the models currently handled thanks to those tools, for example in genomics or in climatology, is something that could not have been conceived 60 years ago, when Ulam and von Neumann invented the Monte Carlo method.

References

Andrieu, C., Doucet, A. and Holenstein, R. (2010). Particle Markov chain Monte Carlo (with discussion). J. Royal Statist. Society Series B, 72. (to appear).

Beaumont, M., Zhang, W. and Balding, D. (2002). Approximate Bayesian computation in population genetics. Genetics, 162 2025-2035.

Brooks, S., Giudici, P. and Roberts, G. (2003). Efficient construction of reversible jump Markov chain Monte Carlo proposal distributions (with discussion). J. Royal Statist. Society Series B, 65 3-55.

Cappé, O. and Moulines, E. (2009). On-line expectation-maximization algorithm for latent data models. J. Royal Statist. Society Series B, 71(3) 593-613.

Chen, M., Shao, Q. and Ibrahim, J. (2000). Monte Carlo Methods in Bayesian Computation. Springer-Verlag, New York.

Chib, S. (1995). Marginal likelihood from the Gibbs output. J. American Statist. Assoc., 90 1313-1321.

Damien, P., Wakefield, J. and Walker, S. (1999). Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. J. Royal Statist. Society Series B, 61 331-344.

Douc, R., Guillin, A., Marin, J.-M. and Robert, C. (2007). Convergence of adaptive mixtures of importance sampling schemes. Ann. Statist., 35(1) 420-448.

Duane, S., Kennedy, A.D., Pendleton, B.J. and Roweth, D. (1987). Hybrid Monte Carlo. Phys. Lett. B, 195 216-222.

Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans, vol. 38. SIAM, Philadelphia.

Gelfand, A. and Dey, D. (1994). Bayesian model choice: asymptotics and exact calculations. J. Royal Statist. Society Series B, 56 501-514.

Gelfand, A. and Smith, A. (1990). Sampling based approaches to calculating marginal densities. J. American Statist. Assoc., 85 398-409.

Gouriéroux, C., Monfort, A. and Renault, E. (1993). Indirect inference. J. Applied Econom., 8 85-118.

Green, P. (1995). Reversible jump MCMC computation and Bayesian model determination. Biometrika, 82 711-732.

Kendall, W., Marin, J.-M. and Robert, C. (2007). Confidence bands for Brownian motion and applications to Monte Carlo simulations. Statistics and Computing, 17 1-10.

Lunn, D., Thomas, A., Best, N. and Spiegelhalter, D. (2000). WinBUGS - a Bayesian modelling framework: concepts, structure, and extensibility. Statistics and Computing, 10 325-337.

Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A. and Teller, E. (1953). Equations of state calculations by fast computing machines. J. Chem. Phys., 21 1087-1092.

Neal, R. (1999). Bayesian Learning for Neural Networks, vol. 118. Springer-Verlag, New York. Lecture Notes.

Neal, R. (2003). Slice sampling (with discussion). Ann. Statist., 31 705-767.

Newton, M. and Raftery, A. (1994). Approximate Bayesian inference by the weighted likelihood bootstrap (with discussion). J. Royal Statist. Society Series B, 56 1-48.

Ripley, B. (1987). Stochastic Simulation. John Wiley, New York.

Robert, C. and Casella, G. (2004). Monte Carlo Statistical Methods. 2nd ed. Springer-Verlag, New York.

Robert, C. and Casella, G. (2010). Introducing Monte Carlo Methods with R. Springer-Verlag, New York.

Rosenthal, J. (2007). AMCMC: An R interface for adaptive MCMC. Comput. Statist. Data Analysis, 51 5467-5470.

Rubinstein, R. (1981). Simulation and the Monte Carlo Method. John Wiley, New York.

Rue, H., Martino, S. and Chopin, N. (2008). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion). J. Royal Statist. Society Series B, 71(2) 319-392.

Zeger, S. and Karim, R. (1991). Generalized linear models with random effects; a Gibbs sampling approach. J. American Statist. Assoc., 86 79-86.
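The basic estimator (1) and the importance sampling identity discussed above can be sketched on the running example E[X³/(1 + X² + X⁴)] with X ∼ N(0, 1). The following is a minimal Python/NumPy sketch, not code from the paper; the function names and the choice of a wider normal N(0, 4) as importance density g (heavier-tailed than f, so the weights f/g have finite variance) are mine:

```python
import numpy as np

rng = np.random.default_rng(0)

def h(x):
    # integrand of the running example: E_f[h(X)] = 0 by symmetry when f = N(0,1)
    return x**3 / (1 + x**2 + x**4)

def norm_pdf(x, sd):
    # density of N(0, sd^2)
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

T = 100_000

# Basic Monte Carlo average (1): draw from f = N(0,1) and average h.
x = rng.normal(0.0, 1.0, T)
mc_estimate = h(x).mean()

# Importance sampling: draw from g = N(0, 2^2) and reweight by f/g.
y = rng.normal(0.0, 2.0, T)
w = norm_pdf(y, 1.0) / norm_pdf(y, 2.0)
is_estimate = np.mean(w * h(y))
```

Both estimators target the same expectation; rerunning with different seeds mimics the replicated runs of Figure 1, whose spread shrinks as T grows.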
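The slice sampler for f(x) ∝ exp(−x²/2)/(1 + x² + x⁴) described in the MCMC section alternates the two conditional simulations U | x and X | u. A minimal stdlib Python sketch (not from the paper; the truncated normal draw via the inverse CDF of `statistics.NormalDist` is my implementation choice):

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal cdf / inverse cdf

def slice_gibbs(n_iter=20_000, seed=1):
    """Slice (Gibbs) sampler for f(x) ∝ exp(-x^2/2)/(1 + x^2 + x^4)."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n_iter):
        # u | x ~ U(0, 1/(1 + x^2 + x^4)); (1 - random()) keeps u strictly > 0
        u = (1.0 - rng.random()) / (1.0 + x**2 + x**4)
        # x | u ~ N(0,1) truncated to {1 + x^2 + x^4 <= 1/u}, i.e. |x| <= b
        c = 1.0 / u - 1.0                                   # x^4 + x^2 <= c
        b = math.sqrt((math.sqrt(1.0 + 4.0 * c) - 1.0) / 2.0)
        lo, hi = STD_NORMAL.cdf(-b), STD_NORMAL.cdf(b)
        x = STD_NORMAL.inv_cdf(lo + (hi - lo) * rng.random())
        chain.append(x)
    return chain

chain = slice_gibbs()
est_x3 = sum(v**3 for v in chain) / len(chain)  # approximates E_f[X^3] = 0
```

The truncation bound b solves 1 + x² + x⁴ = 1/u for x², so the second step is exactly the truncated normal X | U = u of the text; averaging x³ along the chain reproduces the convergence behaviour shown in Figure 2.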
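The generic random walk Metropolis-Hastings alternative for the same target, with the scale of 1.2 that Figure 3 associates with an acceptance rate near 0.5, could look like the following illustrative Python sketch (not code from the paper; working on the log scale is my choice, for numerical stability):

```python
import math
import random

def log_f(x):
    # log of f(x) ∝ exp(-x^2/2)/(1 + x^2 + x^4), up to an additive constant
    return -0.5 * x * x - math.log1p(x * x + x**4)

def rw_metropolis(n_iter=20_000, scale=1.2, seed=2):
    """Random walk Metropolis-Hastings sampler for f."""
    rng = random.Random(seed)
    x, chain, accepts = 0.0, [], 0
    for _ in range(n_iter):
        y = x + rng.gauss(0.0, scale)      # symmetric proposal q(y|x) = N(x, scale^2)
        # q symmetric, so the ratio reduces to min(1, f(y)/f(x))
        if rng.random() < math.exp(min(0.0, log_f(y) - log_f(x))):
            x, accepts = y, accepts + 1
        chain.append(x)
    return chain, accepts / n_iter

chain, rate = rw_metropolis()
est_x3 = sum(v**3 for v in chain) / len(chain)  # approximates E_f[X^3] = 0
```

Because the normalising constant of f cancels in the acceptance ratio, only log f up to a constant is needed, which is exactly why Metropolis-Hastings applies to any target known up to proportionality.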