Lecture Notes in Statistics 192
Edited by P. Bickel, S. Fienberg

David B. Dunson, Editor
Random Effect and Latent Variable Model Selection

Editor
David B. Dunson
National Institute of Environmental Health Sciences
Research Triangle Park, NC, USA
dunson@stat.duke.edu

ISBN: 978-0-387-76720-8
e-ISBN: 978-0-387-76721-5
DOI: 10.1007/978-0-387-76721-5
Library of Congress Control Number: 2008928920

© 2008 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Cover illustration: Follicles of colloid in thyroid
Printed on acid-free paper
springer.com
Preface

Random Effect and Latent Variable Model Selection

In recent years, there has been a dramatic increase in the collection of multivariate and correlated data in a wide variety of fields. For example, it is now standard practice to routinely collect many response variables on each individual in a study. The different variables may correspond to repeated measurements over time, to a battery of surrogates for one or more latent traits, or to multiple types of outcomes having an unknown dependence structure. Hierarchical models that incorporate subject-specific parameters are one of the most widely used tools for analyzing multivariate and correlated data. Such subject-specific parameters are commonly referred to as random effects, latent variables or frailties.

There are two modeling frameworks that have been particularly widely used as hierarchical generalizations of linear regression models. The first is the linear mixed effects model (Laird and Ware, 1982) and the second is the structural equation model (Bollen, 1989). Linear mixed effects (LME) models extend linear regression to incorporate two components, with the first corresponding to fixed effects describing the impact of predictors on the mean and the second to random effects characterizing the impact on the covariance. LMEs have also been increasingly used for function estimation.

In implementing LME analyses, model selection problems are unavoidable. For example, there may be interest in comparing models with and without a predictor in the fixed and/or random effects component. In addition, there is typically uncertainty in the subset of predictors to be included in the model, with the number of candidate predictors large in many applications. To address problems of this type, it is not appropriate to rely on classical methods developed for model selection and inferences in non-hierarchical regression models. For example, the widely used BIC criteria are not valid for random effects models, and likelihood ratio and score tests face difficulties, since the null hypothesis often falls on the boundary of the parameter space.

The objective of the first part of this book is to provide an overview of a variety of promising strategies for addressing model selection problems in LMEs and related modeling frameworks. In the chapter, “Likelihood Ratio Testing for Zero Variance Components in Linear Mixed Models,” Ciprian Crainiceanu provides an applications-motivated overview of recent work on likelihood ratio and restricted likelihood ratio tests for testing whether random effects have zero variance. The approaches he describes represent an important advance over the current standard practice in testing for zero variance components in hierarchical models. That standard practice consists of either ignoring the boundary problem and assuming the likelihood ratio test statistic has a chi-square distribution under the null, or relying on asymptotic results showing that a mixture of chi-squares is more appropriate (Stram and Lee, 1994). Crainiceanu shows that asymptotic approximations may be unreliable in many applications, motivating use of finite sample approaches. He illustrates the ideas through several examples, including applications to nonlinear regression modeling.

Score tests provide a widely used alternative to likelihood ratio tests, and in the chapter, “Variance Component Testing in Generalized Linear Mixed Models for Longitudinal/Clustered Data and Other Related Topics,” of this volume Daowen Zhang and Xihong Lin provide an excellent overview of the recent literature on score test-based approaches. In addition, Zhang and Lin consider a broader class of models, which includes GLMMs and generalized additive mixed models (GAMMs). GAMMs provide an extremely rich framework for semiparametric modeling of longitudinal data, allowing flexible predictor effects through replacing linear terms in a generalized linear model with unknown non-linear functions, while also including random effects to account for within-subject dependence and heterogeneity.

The first part of the volume is completed with two companion chapters describing Bayesian approaches for variable selection in LMEs and GLMMs. The likelihood ratio and score test methods provide an approach for comparing two nested models with the smaller model having a random effect excluded. However, in many applications one is faced with a set of p candidate predictors, with uncertainty in which subsets should be included in the fixed and random effects components of the model. Clearly, the number of candidate models grows extremely rapidly with p, so that it often becomes impossible to fit each model in the list. One possibility is to use a likelihood ratio test within a stepwise selection procedure. However, the final model selected will depend on the order in which candidate predictors are added or deleted, and it is difficult to adjust for uncertainty in subset selection in performing inferences and predictions. In non-hierarchical regression models, Bayesian variable selection implemented with stochastic search algorithms has been very widely used to address this problem. In the chapter, “Bayesian Model Uncertainty in Mixed Effects Models,” Satkartar Kinney and I describe an approach for LMEs, while in the chapter, “Bayesian Variable Selection in Generalized Linear Mixed Models,” Bo Cai and I describe an alternative for GLMMs.

The second part of the book switches gears to focus on structural equation models (SEMs), which have been very widely used in social science applications for assessing relationships among latent variables, such as poverty or violence, that can only be measured indirectly through multiple surrogates. SEMs provide a generalization of factor analysis, which allows for modeling of linear relationships among the latent factors through a linear structural relations (LISREL) model. SEMs are also quite useful outside of traditional application areas for sparse covariance structure modeling of high-dimensional multivariate data. However, one of the main issues in applying SEMs is how to deal with model uncertainty, which commonly arises
in deciding on the number of factors to include in each component and the relationships among these factors. In the chapter, “A Unified Approach to Two-Level Structural Equation Models and Linear Mixed Effects Models,” Peter Bentler and Jiajuan Liang provide a bridge between the first and second parts of the volume in linking LMEs and SEMs, while also considering methods for model selection.

In the chapter, “Bayesian Model Comparison of Structural Equation Models,” Sik-Yum Lee and Xin-Yuan Song provide a general Bayesian approach to comparison of SEMs. Typical Bayesian methods for comparing models rely on Bayes factors. However, Bayes factors have proved quite difficult to estimate accurately in SEMs. Lee and Song propose a useful and clever solution to this problem using path sampling. One well-known issue in model selection using Bayes factors is sensitivity to prior selection. This has motivated a rich literature on default priors. In the chapter, “Bayesian Model Selection in Factor Analytic Models,” Joyee Ghosh and I build on the approach of Lee and Song, proposing a default prior, and an efficient approach for posterior computation relying on parameter expansion. In addition, an importance sampling algorithm is proposed as an alternative to path sampling.

In summary, this volume provides a practically motivated overview of a variety of recently proposed approaches for model selection in random effects and latent variable models. The goal is to make these methods more accessible to practitioners, while also stimulating additional research in this important and under-studied area of statistics. There are a number of topics related to model selection in random effects and latent variable models that are in need of new research, with solutions having the potential for substantial applied impact. The first topic is the development of simple methods to calculate model selection criteria, which modify AIC and BIC to incorporate a penalty for model complexity that is appropriate for a hierarchical model. A second topic is the development of efficient methods for simultaneous model search and posterior computation in SEMs. Often, one has a high-dimensional set of SEMs that are plausible a priori and consistent with current scientific or sociologic theories. It is of substantial interest to identify high posterior probability models and to average across models in making predictions. However, typical tricks used in other model classes, such as zeroing out coefficients, do not work in general for SEMs, and efficient alternatives remain to be developed.

References

Bollen, K.A. (1989). Structural Equation Models with Latent Variables. New York: Wiley
Laird, N. and Ware, J. (1982). Random-effects models for longitudinal data. Biometrics 38, 963–974

David B. Dunson

Contents

Part I Random Effects Models

Likelihood Ratio Testing for Zero Variance Components in Linear Mixed Models
Ciprian M. Crainiceanu

Variance Component Testing in Generalized Linear Mixed Models for Longitudinal/Clustered Data and other Related Topics
Daowen Zhang and Xihong Lin

Bayesian Model Uncertainty in Mixed Effects Models
Satkartar K. Kinney and David B. Dunson

Bayesian Variable Selection in Generalized Linear Mixed Models
Bo Cai and David B. Dunson

Part II Factor Analysis and Structural Equations Models

A Unified Approach to Two-Level Structural Equation Models and Linear Mixed Effects Models
Peter M. Bentler and Jiajuan Liang

Bayesian Model Comparison of Structural Equation Models
Sik-Yum Lee and Xin-Yuan Song

Bayesian Model Selection in Factor Analytic Models
Joyee Ghosh and David B. Dunson

Index

Part I Random Effects Models
Likelihood Ratio Testing for Zero Variance Components in Linear Mixed Models

Ciprian M. Crainiceanu

Mixed models are a powerful inferential tool with a wide range of applications including longitudinal studies, hierarchical modeling, and smoothing. Mixed models have become the state of the art for statistical information exchange and correlation modeling. Their popularity has been augmented by the availability of dedicated software, e.g., the MIXED procedure in SAS, the lme function in R and S+, or the xtmixed function in STATA.

In this paper, we consider the problem of testing the null hypothesis of a zero variance component in a linear mixed model (LMM). We focus on the likelihood ratio test (LRT) and restricted likelihood ratio test (RLRT) statistics for three reasons. First, (R)LRTs are uniformly most powerful for simple null and alternative hypotheses and have been shown to have good power properties in a variety of theoretical and applied frameworks. Second, given their robust properties, (R)LRTs are the benchmark for statistical testing. Third, (R)LRTs can now be used in realistic data sets and applications due to a better understanding of their null distribution and improved computational tools.

The paper is organized as follows. Section 1 describes three applications of testing for a zero variance component. Section 2 contains the model and a description of the testing framework. Section 3 describes standard asymptotic results and provides a short discussion of their applicability. Section 4 presents finite sample and asymptotic results for linear mixed models (LMMs) with one variance component. Section 5 introduces two approximations of the finite sample (R)LRT distribution for testing for zero variance components in LMMs with multiple variance components. Section 6 presents the corresponding testing results for the examples introduced in Sect. 1. Section 7 provides the discussion and practical recommendations.

C.M. Crainiceanu
Department of Biostatistics, Johns Hopkins University
ccrainic@jhsph.edu

D.B. Dunson (ed.) Random Effect and Latent Variable Model Selection, 3
DOI: 10.1007/978-0-387-76721-5, © Springer Science+Business Media, LLC 2008
1 Examples

The three examples in this section illustrate the wide variety of applications of testing for zero variance components in LMMs. This list is far from being exhaustive but provides a foretaste of what is possible and needed in this framework.

1.1 Loa loa Prevalence in West Africa

Figure 1 displays village locations from one of several parasitological survey locations in West Africa. In all these villages parasitological sampling was conducted to assess the prevalence of loiasis. Here we provide a short summary, but a complete description of the problem can be found in Crainiceanu et al. (2007).

Loiasis, or eye worm, is an endemic disease of the wet tropics, caused by Loa loa, a filarial parasite which is transmitted to humans by the bite of an infected Chrysops fly. In Fig. 1 the empirical prevalence rates at location x, p(x), are indicated as dots coded according to their size: small p(x) < 0.18, medium 0.18 ≤ p(x) < 0.20, large 0.20 ≤ p(x) < 0.25, and very large p(x) > 0.30.

Fig. 1 Village sampling locations in one subregion from West Africa (horizontal axis: longitude; vertical axis: latitude). The empirical prevalence rates are indicated as dots coded according to their size: small p(x) < 0.18, medium 0.18 ≤ p(x) < 0.20, large 0.20 ≤ p(x) < 0.25, very large p(x) > 0.30. The estimated mean prevalence based on model (1) is grey-scale coded according to the legend

A complete bivariate binomial analysis of this data set can be found in Crainiceanu et al. (2007). Here, we consider the following simpler univariate model for the logit prevalence at the spatial location x

$$\operatorname{logit}\{p(x)\} = \alpha_0 + \alpha_1 g(x) + \alpha_2 s(x) + \alpha_3 e(x) + \alpha_4 \{e(x) - 800\}_+ + S(x) + \epsilon(x), \tag{1}$$
where g(x) is an annual average measure of greenness, s(x) the standard deviation of greenness, e(x) the elevation in meters, S(x) a spatial component, and $\epsilon(x) \sim \text{Normal}(0, \sigma_\epsilon^2)$ are the independent errors. Here $a_+$ is equal to a if a > 0 and 0 otherwise, so that $\{e(x) - 800\}_+$ represents the elevation at location x truncated below 800 m. If the spatial component S(x) is modeled as a low rank penalized thin plate spline then

$$S(x) = x^T \beta + Z(x) b, \qquad b \sim \text{Normal}(0, \sigma_b^2 I_K), \tag{2}$$

where Z(x) is the low rank specific design vector (for details see Ruppert et al., 2003; Kammann and Wand, 2003), b the thin plate spline coefficients describing the spatial features of S(x), $\sigma_b^2$ the smoothing parameter controlling the amount of smoothing, and $I_K$ is the identity matrix, where K is the number of spatial knots.

In the case of low rank smoothers the set of K knots for the covariates has to be chosen. One possibility is to use equally spaced knots. Another possibility is to select the knots and subknots using the space filling design (Nychka and Saltzman, 1998), which is based on the maximal separation principle. This avoids wasting knots and is likely to lead to better approximations in sparse regions of the data. The cover.design() function from the R package Fields (Fields Development Team, 2006) provides software for space filling knot selection.

If the smoothing parameter is estimated by restricted maximum likelihood (REML), then the model described in (1) and (2) is equivalent to a particular LMM with one variance component. Figure 1 displays the estimated mean prevalence at all locations in the map coded according to the legend. In this context testing whether the nonlinear spatial component of S(x) is necessary to explain the residual variability after fitting the scientifically available covariates is equivalent to testing

$$H_0: \sigma_b^2 = 0 \quad \text{vs.} \quad H_A: \sigma_b^2 > 0.$$

From a scientific perspective testing $H_0$ is equivalent to testing whether simpler models including only covariates could capture the complex stochastic nature of the spatial data and have good predictive power.

1.2 Onion Density in Australia

Figure 2 contains data on yields (grams/plant) of white Spanish onions in two locations: Purnong Landing and Virginia, South Australia (Ratkowsky, 1983). The horizontal axis corresponds to areal density of plants (plants/m²). Detailed analyses of these data are given by Ruppert et al. (2003) and Crainiceanu (2003). Denote by $(y_i, d_i, s_i)$ the yield, density of plants and location for the ith observation. Here, $s_i = 1$ corresponds to Purnong Landing and $s_i = 0$ corresponds to Virginia. The solid lines in Fig. 2 correspond to fitting the linear additive model

$$\log(y_i) = \beta_0 + \beta_1 s_i + \beta_2 d_i + \epsilon_i. \tag{3}$$

Fig. 2 Log yield for the onion data plotted against density (circle Purnong Landing; asterisk Virginia), straight line fit (solid line), binary offset model using a penalized linear spline fit with K = 15 knots and REML estimation of smoothing parameter (dashed line), discrete by continuous interaction model (dotted line)

The dashed lines represent the mean fit using a semiparametric binary offset model (Ruppert et al., 2003)

$$\log(y_i) = \beta_1 s_i + f(d_i) + \epsilon_i, \tag{4}$$

which contains a parametric component, $\beta_1 s_i$, and a nonparametric component, $f(d_i)$. The binary variable s vertically offsets the relationship between E[log(y)] and density according to location. By specifying a linear penalized spline model for $f(d_i)$ the model becomes

$$\log(y_i) = \beta_0 + \beta_1 s_i + \beta_2 d_i + \sum_{k=1}^{K} b_k (d_i - \kappa_k)_+ + \epsilon_i,$$

where $b_k$ are i.i.d. $N(0, \sigma_b^2)$ and $\epsilon_i$ are i.i.d. $N(0, \sigma_\epsilon^2)$. Following Ruppert et al. (2003), we use K = 15 knots chosen at the sample quantiles of density corresponding to frequencies 1/(K+1), ..., K/(K+1).

Testing model (3), corresponding to the solid line fits in Fig. 2, versus model (4), corresponding to the dashed lines in Fig. 2, corresponds to testing $H_0: \sigma_b^2 = 0$ vs. $H_A: \sigma_b^2 > 0$. For these data and hypothesis testing framework, Crainiceanu (2003) calculated RLRT = 35.93 with a corresponding p-value < 0.001. The calculation of the p-value was based on the exact distribution of the RLRT as obtained by Crainiceanu and Ruppert (2004b). This result is not surprising, given the large
discrepancies between the two model fits in Fig. 2. In fact, results would not change even if one used the more conservative (but incorrect in this case) $0.5\chi^2_0 : 0.5\chi^2_1$ approximation to the null RLRT distribution (Self and Liang, 1987).

It is natural, however, to ask whether the binary offset model accurately represents the data. To address this question we nest model (4) into the following discrete by continuous interaction model

$$E\{\log(y_i)\} = \begin{cases} f_{PL}(d_i) & \text{if } s_i = 1; \\ f_{VA}(d_i) & \text{if } s_i = 0, \end{cases}$$

where the subscripts PL and VA denote the Purnong Landing and Virginia locations, respectively. The basic idea is to model the mean response at one of the locations, say Purnong Landing, as a nonparametric spline and the deviations from this function corresponding to the other location, say Virginia, as another nonparametric spline. The discrete by continuous interaction model is

$$\log(y_i) = \beta_0 + \beta_1 d_i + \sum_{k=1}^{K} b_k (d_i - \kappa_k)_+ + \Big\{\gamma_0 + \gamma_1 d_i + \sum_{k=1}^{K} v_k (d_i - \kappa_k)_+\Big\} I(i \in PL) + \epsilon_i \tag{5}$$

for Virginia (s = 0), where $\beta_0$, $\beta_1$, $\gamma_0$, and $\gamma_1$ are fixed unknown parameters, $b_k$ are i.i.d. $N(0, \sigma_b^2)$, $v_k$ are i.i.d. $N(0, \sigma_v^2)$, and $I(i \in PL)$ is 1 if the observation i is from Purnong Landing and 0 otherwise. The model (5) is an LMM with two random effects variance components, $\sigma_b^2$ and $\sigma_v^2$, and the fit to the data is depicted by the two dotted curves in Fig. 2.

Testing for linear versus nonlinear deviations from the smooth regression function corresponding to the Purnong Landing location reduces in this model to testing

$$H_0: \sigma_v^2 = 0 \quad \text{vs.} \quad H_A: \sigma_v^2 > 0,$$

which is equivalent to testing for a zero variance component in an LMM with two variance components. After discussing the state of the art in statistical testing in this framework we will revisit this example in Sect. 6.

1.3 Coronary Sinus Potassium

We consider the coronary sinus potassium concentration data measured on 36 dogs published by Grizzle and Allan (1969) and Wang (1998). The measurements on each dog were taken every 2 min from 1 to 13 min (seven observations per dog). The 36 dogs come from four treatment groups. Figure 3 displays the data for the nine dogs in the first treatment group (dotted lines). If $y_{ij}$ denotes the jth concentration for the ith dog at time $t_{ij} = 1 + 2j$ then a reasonable LMM model for the first treatment group is

$$y_{ij} = \beta_0 + u_i + \beta_1 t_{ij} + \beta_2 t_{ij}^2 + \epsilon_{ij}, \tag{6}$$
where $u_i \sim N(0, \sigma_u^2)$ are independent dog specific intercepts and $\epsilon_{ij} \sim N(0, \sigma_\epsilon^2)$ are independent errors. Figure 3 displays the fit of model (6) as a dashed line.

Fig. 3 Sinus potassium concentration for nine dogs in the first treatment group (dotted lines); horizontal axis: time (minutes), vertical axis: potassium concentration

It is natural to ask the question whether model (6) is enough to capture the complexity of the population mean function. One way to answer this question is by embedding model (6) into the following more general model

$$y_{ij} = \beta_0 + u_i + \beta_1 t_{ij} + \beta_2 t_{ij}^2 + \sum_{k=1}^{K} b_k (t_{ij} - \kappa_k)_+^2 + \epsilon_{ij}, \tag{7}$$

where $b_k \sim N(0, \sigma_b^2)$ are independent truncated spline coefficients, K the number of knots and $\kappa_k$, k = 1, ..., K, are the knots. All the other assumptions are the same as in model (6). Note that model (7) is an LMM with two variance components: one, $\sigma_u^2$, controlling the shrinkage of random intercepts towards their mean and the other one, $\sigma_b^2$, controlling the shrinkage of the population function towards a quadratic polynomial. Figure 3 displays the fit of this model as a solid line together with 95% pointwise confidence intervals (shaded area).

Testing the null hypothesis described by model (6) versus the alternative described by model (7) is equivalent to testing for

$$H_0: \sigma_b^2 = 0 \quad \text{vs.} \quad H_A: \sigma_b^2 > 0.$$
Similarly, testing for dog response homogeneity is equivalent to testing

$$H_0: \sigma_u^2 = 0 \quad \text{vs.} \quad H_A: \sigma_u^2 > 0.$$

Both frameworks correspond to testing for a zero variance component in an LMM with two variance components.

As the last point for this example, note that a naive way to test for $H_0: \sigma_b^2 = 0$ is to check whether the null fit is contained in the shaded area. This may seem like a good idea, but leads to incorrect inferences. Indeed, all the confidence intervals for the mean function based on model (7) contain the fit based on model (6). However, as we show in Sect. 6, the RLRT indicates strong evidence against the null hypothesis of a quadratic population curve.

2 Model and Testing Framework

All examples in Sect. 1, and many others, involve testing for a zero variance component as the methodological answer to important scientific questions. To formalize the framework, let us assume that the outcome vector, Y, is modeled as an LMM

$$\begin{cases} Y = X\beta + Z_1 b_1 + \cdots + Z_S b_S + \epsilon, \\ b_s \sim N(0, \sigma_s^2 I_{K_s}), \quad s = 1, \ldots, S, \\ \epsilon \sim N(0, \sigma_\epsilon^2 I_n). \end{cases} \tag{8}$$

Here the random effects $b_s$, s = 1, ..., S, and the error vector $\epsilon$ are mutually independent, $K_s$ denotes the number of columns in $Z_s$, n the sample size, and $I_\nu$ denotes the identity matrix with $\nu$ columns. This is not the most general form of an LMM, but it is often used in practice and keeps the presentation simple. We are interested in testing

$$H_{0,s}: \sigma_s^2 = 0 \quad \text{vs.} \quad H_{A,s}: \sigma_s^2 > 0, \tag{9}$$

where the hypotheses are indexed by s = 1, ..., S to emphasize that these are distinct and not joint hypotheses for all variance components. Note that because $b_s \sim N(0, \sigma_s^2 I_{K_s})$, the null hypothesis is equivalent to $b_s = 0$, indicating that under the null the component $Z_s b_s$ of model (8) is zero.

Denote by $\theta_{-s}$ all the parameters in model (8) with the exception of $\sigma_s^2$. The RLRT for testing $H_{0,s}$ is then defined as

$$\mathrm{RLRT} = 2 \sup_{\theta_{-s},\, \sigma_s^2} \{\log L(\theta_{-s}, \sigma_s^2)\} - 2 \sup_{\theta_{-s}} \{\log L(\theta_{-s}, 0)\},$$

where $L(\theta_{-s}, \sigma_s^2)$ is the restricted likelihood function for model (8). A similar definition holds for LRT using the likelihood instead of the restricted likelihood function.
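To make the random effects design matrices $Z_s$ in model (8) more concrete for the spline examples of Sect. 1, the following Python sketch builds a truncated power basis of the kind that appears in the onion and coronary sinus potassium models. It is only an illustration under stated assumptions: the function names are hypothetical, the knot rule is the sample-quantile rule quoted for the onion data (frequencies 1/(K+1), ..., K/(K+1)), the interpolation between order statistics is one of several possible quantile conventions, and K = 3 in the usage note is an arbitrary choice.

```python
import math


def quantile_knots(x, num_knots):
    """Knots at sample quantiles with frequencies k/(K+1), k = 1..K.

    Assumption: quantiles are obtained by linear interpolation between
    order statistics; other quantile conventions give slightly different knots.
    """
    xs = sorted(x)
    n = len(xs)
    knots = []
    for k in range(1, num_knots + 1):
        pos = k / (num_knots + 1) * (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        knots.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return knots


def truncated_power_basis(x, knots, degree=1):
    """Rows are observations, columns are (x - kappa_k)_+ ** degree.

    degree=1 gives the linear spline terms of the onion models,
    degree=2 the quadratic terms of the potassium model.
    """
    return [[max(xi - kappa, 0.0) ** degree for kappa in knots] for xi in x]
```

As a usage example, for the seven measurement times 1, 3, ..., 13 of the potassium data, `quantile_knots([1, 3, 5, 7, 9, 11, 13], 3)` places hypothetical knots at 4, 7 and 10, and `truncated_power_basis(times, knots, degree=2)` returns a 7 × 3 matrix whose columns could play the role of a $Z_s$ block in model (8).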
3 Standard Asymptotic Results for LMMs

Testing for zero variance components is not new in mixed models. Using theory originally developed by Chernoff (1954), Moran (1971), and Self and Liang (1987), Stram and Lee (1994) proved that the LRT for testing (9) has an asymptotic $0.5\chi^2_0 : 0.5\chi^2_1$ mixture distribution under the null hypothesis $H_{0,s}$ if data are independent and identically distributed both under the null and alternative hypothesis. For more details on standard asymptotic results, see the chapter by Zhang and Lin (2007) in this book. Thus, it could be surprising that in many applications the null distribution of the LRT obtained by simulation is far from being a $0.5\chi^2_0 : 0.5\chi^2_1$ mixture.

There are several reasons for these inconsistencies. First, the Laird and Ware (1982) model used by Stram and Lee (1994) allows the partition of the outcome vector Y into independent subvectors. This could be revealed by close inspection of this model, which is typically described in terms of the subject-level vector $Y_i$ and not in terms of the data vector Y. The independence assumption is violated, for example, when representing nonparametric smoothing as a particular LMM. Second, even when the outcome vector can be partitioned into independent subvectors, the number of subvectors may not be sufficient to ensure an accurate asymptotic approximation. Third, subvectors may not be identically distributed due to unbalanced designs or missing data. In the case of an LMM with one variance component (S = 1) Crainiceanu and Ruppert (2004b) and Crainiceanu et al. (2005) have derived the finite sample and asymptotic distribution of the LRTs showing that, under general conditions, the null distribution for testing $H_{0,s}$ is typically different from $0.5\chi^2_0 : 0.5\chi^2_1$. In the following section, we provide a summary of these results and discuss the implications for applied statistical inference.

4 Finite Sample and Asymptotic Results for General Design LMMs with One Variance Component

Consider the particular case of model (8) with Gaussian outcome vector and one variance component

$$\begin{cases} Y = X\beta + Z_1 b_1 + \varepsilon, \\ b_1 \sim N(0, \sigma_1^2 I_{K_1}), \\ \varepsilon \sim N(0, \sigma_\varepsilon^2 I_n), \end{cases} \tag{10}$$

where $b_1$ and $\varepsilon$ are mutually independent.

As model (10) has only one variance component, $\sigma_1^2$, the exact null distribution of the RLRT for testing $H_{0,1}: \sigma_1^2 = 0$ versus $H_{A,1}: \sigma_1^2 > 0$ is (Crainiceanu and Ruppert, 2004b)

$$\mathrm{RLRT}_n \stackrel{d}{=} \sup_{\lambda \ge 0} \left[ (n - p) \log\left\{ 1 + \frac{N_n(\lambda)}{D_n(\lambda)} \right\} - \sum_{l=1}^{K_1} \log(1 + \lambda \mu_{l,n}) \right], \tag{11}$$

where "$\stackrel{d}{=}$" denotes equality in distribution, p is the number of columns in X,

$$N_n(\lambda) = \sum_{l=1}^{K_1} \frac{\lambda \mu_{l,n}}{1 + \lambda \mu_{l,n}} w_l^2, \qquad D_n(\lambda) = \sum_{l=1}^{K_1} \frac{w_l^2}{1 + \lambda \mu_{l,n}} + \sum_{l=K_1+1}^{n-p} w_l^2,$$

$w_l$, l = 1, ..., n − p, are independent N(0, 1), and $\mu_{l,n}$, l = 1, ..., $K_1$, are the eigenvalues of the $K_1 \times K_1$ matrix $Z_1^T \{I_n - X(X^T X)^{-1} X^T\} Z_1$. The asymptotic distribution of the RLRT was also derived by Crainiceanu and Ruppert (2004b) and depends essentially on the asymptotic geometry of the eigenvalues $\mu_{l,n}$. This distribution may or may not be equal to the $0.5\chi^2_0 : 0.5\chi^2_1$ mixture, depending on the asymptotic behavior of these eigenvalues. A similar result for LRT can be found in Crainiceanu and Ruppert (2004b).

There are several reasons for preferring the distribution in (11) over the $0.5\chi^2_0 : 0.5\chi^2_1$ of Stram and Lee (1994). First, this is the finite sample distribution of the RLRT. Second, the $0.5\chi^2_0 : 0.5\chi^2_1$ asymptotic distribution can be inaccurate when the number of independent sub-vectors of Y is small to moderate or when designs are unbalanced. Typically, the $0.5\chi^2_0 : 0.5\chi^2_1$ provides a conservative approximation of the finite sample distribution with considerable associated losses in power. Third, calculating the distribution in (11) is very fast. Indeed, the distribution in (11) depends only on the eigenvalues $\mu_{l,n}$ of a $K_1 \times K_1$ matrix, which need to be computed only once. Simulation effectively reduces to simulation of $(K_1 + 1)$ $\chi^2$ variables and a grid search over $\lambda$. This simulation does not depend on the sample size, n, and is fast (5,000 simulations per second with a 2.66 GHz CPU and 1 Mbyte random access memory). Fourth, when assumptions in Stram and Lee (1994) hold the distribution in (11) converges weakly to the asymptotic $0.5\chi^2_0 : 0.5\chi^2_1$.

5 Linear Mixed Models with Multiple Variance Components

The results in Crainiceanu and Ruppert (2004b) have solved the problem for mixed models with Gaussian outcomes and one variance component. However, in many practical applications there are multiple variance components controlling shrinkage. Two such examples are the onion density and the coronary sinus potassium models in Sects. 1.2 and 1.3, respectively.

The methodology developed by Crainiceanu and Ruppert (2004b) could be used to derive the null distribution for the more general case discussed in this paper. While the result is theoretically interesting, this distribution is obtained by maximizing a stochastic process over the variance components of model (8), which makes the implementation computationally equivalent to the parametric bootstrap. For this reason, Crainiceanu (2003) and Crainiceanu and Ruppert (2004a) suggest using the parametric bootstrap in this context. One could debate the elegance of this approach, but the parametric bootstrap is a practical and robust alternative to the $0.5\chi^2_0 : 0.5\chi^2_1$ approximation.

One problem with the parametric bootstrap is that, in many applications, evaluating the likelihood is computationally expensive and it may not be reasonable to perform thousands of simulations. To illustrate this problem, consider the following simple longitudinal model:

$$Y_{ij} = u_i + f(x_{ij}) + \epsilon_{ij}, \tag{12}$$

where $u_i \sim N(0, \sigma_u^2)$ are random independent subject specific intercepts, $\epsilon_{ij} \sim N(0, \sigma_\epsilon^2)$ are independent errors, i = 1, ..., I, j = 1, ..., J, I is the number of subjects and J is the number of observations per subject. Here f(·) is an unspecified population mean function. If the function f(·) is modeled as a linear penalized spline, then testing for linearity of f(·) against a nonparametric alternative is equivalent to testing

$$H_0: \sigma_b^2 = 0 \quad \text{vs.} \quad H_A: \sigma_b^2 > 0, \tag{13}$$

where $\sigma_b^2$ is a variance component controlling the degree of smoothness of f(·). Computation times both for LRT and RLRT were very long even for small sample sizes. For example, for six subjects and 50 observations per subject, computation time for 10,000 simulations was 4.5 h for R and 1 h for SAS on a server (Intel Xeon 3 GHz CPU). Additionally, run time increased steeply with both I and J for R. For R, significant reduction of computation times could be achieved by interfacing it with C or FORTRAN. SAS is faster with its default convergence criterion, but we found numerical imprecisions, especially when estimating the probability mass at zero. These problems were mitigated when the convergence criterion was more stringent, but this was accompanied by an increasing proportion of unsuccessful model fits. For more details see the extensive simulation study in Greven et al. (2008). Needless to say, in more complex models with larger sample sizes the computational burden is even more serious, especially when running several tests or performing simulation studies.

Therefore, for many applications there is a need for fast and accurate approximations of the null finite sample distribution of the RLRT for testing $H_{0,s}$. We describe two such approximations. The first approximation was introduced by Greven et al. (2008), is practically instantaneous, and avoids bootstrap. The second approximation was introduced by Crainiceanu (2003) and Crainiceanu and Ruppert (2004a) and uses a simple parametric approximation that reduces the necessary number of bootstrap samples. In extensive simulation studies, Greven et al. (2008) show that both methods outperform the $0.5\chi^2_0 : 0.5\chi^2_1$ approximation and the parametric bootstrap.

The approximation used by standard software is the $0.5\chi^2_0 : 0.5\chi^2_1$ approximation. The necessary regularity conditions for this approximation to be asymptotically valid are independence under null and alternative hypothesis, large number of subvectors, and balanced designs. When these conditions are met both approximate distributions discussed in the following converge weakly to the $0.5\chi^2_0 : 0.5\chi^2_1$ distribution. However, when conditions are not met, both approximate distributions agree with each other, are different from the $0.5\chi^2_0 : 0.5\chi^2_1$ distribution, and better fit the finite sample distribution of the RLRT.
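As a rough illustration of the comparison drawn above, the following Python sketch contrasts the p-value from the $0.5\chi^2_0 : 0.5\chi^2_1$ mixture with a Monte Carlo p-value from the one-variance-component distribution (11) of Sect. 4, obtained by simulating $K_1 + 1$ chi-square variables and searching over a grid of $\lambda$ values. The function names, the log-spaced grid, the default number of simulations, and the eigenvalues in the usage note are assumptions made for illustration; this is not the implementation used by the authors, and only the Python standard library is used.

```python
import math
import random


def mixture_pvalue(stat):
    """P-value under the 0.5*chi2_0 : 0.5*chi2_1 mixture."""
    if stat <= 0.0:
        return 1.0
    # P(chi2_1 > x) = erfc(sqrt(x / 2)); the point mass at zero
    # contributes nothing to the upper tail when x > 0.
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))


def simulate_rlrt_null(mu, n, p, n_sim=2000, n_grid=100, seed=1):
    """Monte Carlo draws from distribution (11) via a grid search over lambda.

    mu : eigenvalues mu_{l,n}, l = 1..K1 (assumed positive)
    n  : sample size; p : number of columns of X (needs n - p > K1)
    """
    rng = random.Random(seed)
    k1 = len(mu)
    mu_max = max(mu)
    # Hypothetical grid: lambda * mu_max runs from 1e-6 to 1e6 on a log scale.
    grid = [10.0 ** (-6.0 + 12.0 * g / (n_grid - 1)) / mu_max
            for g in range(n_grid)]
    penalty = [sum(math.log1p(lam * m) for m in mu) for lam in grid]
    draws = []
    for _ in range(n_sim):
        # K1 independent chi2_1 variables plus one chi2 with n - p - K1 df;
        # a chi2 with df degrees of freedom is Gamma(shape=df/2, scale=2).
        w2 = [rng.gammavariate(0.5, 2.0) for _ in range(k1)]
        rest = rng.gammavariate((n - p - k1) / 2.0, 2.0)
        best = 0.0  # value of the criterion at lambda = 0
        for lam, pen in zip(grid, penalty):
            num = sum(lam * m * w / (1.0 + lam * m) for m, w in zip(mu, w2))
            den = sum(w / (1.0 + lam * m) for m, w in zip(mu, w2)) + rest
            best = max(best, (n - p) * math.log1p(num / den) - pen)
        draws.append(best)
    return draws


def empirical_pvalue(draws, stat):
    """Proportion of simulated null statistics at least as large as stat."""
    return sum(d >= stat for d in draws) / len(draws)
```

For a statistic of 1.96 the mixture formula returns about 0.081. With hypothetical eigenvalues such as `mu = [50.0, 5.0, 1.0, 0.2]`, `n = 60` and `p = 2`, the call `empirical_pvalue(simulate_rlrt_null(mu, 60, 2), 1.96)` gives the simulated counterpart; its value depends on the eigenvalues, the grid and the seed. Because the simulation involves only the eigenvalues and $n - p$, its cost does not grow with the size of the design matrices.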
5.1 Fast Finite Sample Approximation

The approximation proposed by Greven et al. (2008) is a combination of results in Crainiceanu and Ruppert (2004b) and the pseudo-likelihood estimation idea in Gong and Samaniego (1981). Recall that the pseudo-likelihood function is obtained by plugging in a consistent estimator of the nuisance parameters instead of the nuisance parameters. More precisely, let $L(\theta, \phi)$ be the likelihood for independent and identically distributed (i.i.d.) random variables $X_1, \ldots, X_n$, where the likelihood depends on the parameters of interest $\theta$ and on nuisance parameters $\phi$. Assume that $L(\cdot, \cdot)$ is a complicated function of $\theta$ and $\phi$, but simple as a function of $\theta$ alone when $\phi$ is fixed. In this case, pseudo-likelihood replaces $\phi$ by a consistent estimator $\hat\phi$ and maximizes $L^*(\theta) = L(\theta, \hat\phi)$ over $\theta$ to obtain the pseudo-maximum likelihood estimator $\hat\theta$. The pseudo-LRT for testing $H_0: \theta = \theta_0$ is then defined as $\mathrm{LRT}^* = 2 \log L^*(\hat\theta) - 2 \log L^*(\theta_0)$.

In our framework, $\theta = (\sigma_s^2, \beta, b_s)$ could be viewed as the parameters of interest, and the $b_i$, $i \ne s$, as nuisance parameters. If the $b_i$'s were known, the outcome vector could be redefined as $\tilde{Y} = Y - \sum_{i \ne s} Z_i b_i$ and our model could be reduced accordingly. The idea we transfer from pseudo-likelihood estimation is that, under regularity conditions, the prediction of $\sum_{i \ne s} Z_i b_i$ might be good enough to allow the RLRT null distribution for testing $H_0: \sigma_s^2 = 0$ to be closely approximated by the RLRT distribution when $\sum_{i \ne s} Z_i b_i$ is known. Thus, Greven et al. (2008) use the following reduced model

$$\begin{cases} \tilde{Y} = X\beta + Z_s b_s + \epsilon, \\ b_s \sim N(0, \sigma_s^2 I_{K_s}), \\ \epsilon \sim N(0, \sigma_\epsilon^2 I_n), \end{cases} \tag{14}$$

where notations are similar to notations for model (10). The idea is to calculate the RLRT for testing $H_{0,s}: \sigma_s^2 = 0$ using an LMM of type (8) but use an approximated RLRT null distribution based on testing for zero variance in the LMM (14). The advantage of this approach is that this finite sample distribution can be obtained very easily, as described in Sect. 4.

5.2 Mixture Approximation to the Bootstrap

In some cases, one might still want to use a simple parametric bootstrap to determine the distribution of the (R)LRT. Given the steep computational penalty in many applications, we propose to use a parametric approximation to the (R)LRT distribution. While in the case of i.i.d. data the distribution is asymptotically a $0.5\chi^2_0 : 0.5\chi^2_1$ mixture, Crainiceanu and Ruppert (2004b) showed that for correlated responses and finite sample sizes the distribution can severely deviate from this mixture. We propose to use the following finite sample approximation

$$\text{(R)LRT} \stackrel{d}{\approx} aUD, \tag{15}$$

where $U \sim \text{Bernoulli}(1 - p)$, $D \sim \chi^2_1$, $p = P(U = 0)$ and a are unknown constants, and $\stackrel{d}{\approx}$ denotes approximate equality in distribution. The parameters of the aUD approximation are estimated using a bootstrap sample that is typically much smaller than the one required to estimate small tail probabilities.

Note that the flexible class of distributions in (15) contains, as a particular case, the $0.5\chi^2_0 : 0.5\chi^2_1$ distribution with a = 1 and p = 0.5, and is just as easy to use. As the point mass at zero, p, and the scaling factor, a, are unknown in all other cases, we propose to estimate them from a bootstrap sample. The idea of the parametric approximation is to use the entire bootstrap sample to fit a flexible two parameter family of distributions, thus reducing the necessary number of simulations required for estimating tail quantiles. Greven et al. (2008) show that the approximation (15) generally outperforms the $0.5\chi^2_0 : 0.5\chi^2_1$ approximation. This happens in many applications when the correlation structure imposed by the random effects, $b_i$, cannot be ignored, or when the sample size is small to moderate. Note that both approximations are asymptotically identical to the $0.5\chi^2_0 : 0.5\chi^2_1$ approximation when the assumptions in Self and Liang (1987) and Stram and Lee (1994) hold.

This methodology has been applied by Crainiceanu (2003) and Crainiceanu and Ruppert (2004a). Its behavior has been studied in extensive simulation studies by Greven et al. (2008) in a wide variety of settings indicating excellent agreement with long bootstrap simulations. The main strengths of the method are that it requires few bootstrap samples (100–200), provides a finite sample approximation, and can be applied to data with a moderate number of clusters and unbalanced designs.

6 Revisiting the Applications

In this section, we revisit the applications described in Sect. 1. Table 1 provides the RLRT calculated for each application together with the p-value estimated nonparametrically from 10,000 simulations using the parametric bootstrap. We also report the estimated aUD approximation to the bootstrap.

Table 1 RLRT testing for the three examples introduced in Sect. 1

Example    Test                 Value     p-Value    aUD
Loa loa    $\sigma_b^2 = 0$     127.47    <0.001     $0.66\chi^2_0 + 0.34(0.91\chi^2_1)$
Onion      $\sigma_b^2 = 0$     1.98      0.048      $0.66\chi^2_0 + 0.34(0.91\chi^2_1)$
Dogs       $\sigma_b^2 = 0$     4.89      0.0057     $0.69\chi^2_0 + 0.31(0.88\chi^2_1)$
Dogs       $\sigma_u^2 = 0$     19.03     <0.001     $0.56\chi^2_0 + 0.44(0.96\chi^2_1)$

The first row of Table 1 provides results for testing the null hypothesis of a linear spatial drift against a general alternative in the Loa loa application. This is testing whether there remains sizable spatial correlation after controlling for the effects of available covariates. In this case RLRT = 127.47, suggesting very strong evidence against the linear spatial trend, irrespective of the particular approximation to the null distribution. In this case, because the alternative model is an LMM with one variance component, we actually have the exact null finite sample distribution. The aUD approximation is still displayed because it provides a compact and accurate summary of the exact distribution.

In many testing examples, decisions are not as easy to make as in the Loa loa case. Indeed, the second row presents results for the onion density example. The null hypothesis $\sigma_b^2 = 0$ corresponds to testing for the semiparametric binary offset model (two parallel nonparametric functions) against the discrete by continuous interaction model (two nonparametric functions). The value of RLRT = 1.96 is much closer to the decision boundary. In such cases, it is reasonable to invest computational effort to obtain the null finite sample distribution. The aUD approximation to the bootstrap suggests serious differences from the $0.5\chi^2_0 : 0.5\chi^2_1$ distribution. In fact, the p-value based on the aUD approximation was 0.048 compared to 0.081 based on the $0.5\chi^2_0 : 0.5\chi^2_1$ approximation. It could seem strange that the two aUD approximations for the widely different testing problems are identical. This is due to the fact that both distributions depend essentially on a very large leading eigenvalue. The following two examples have different distributions because their eigenvalue structure is different.

The last two rows in Table 1 are dedicated to results for the coronary sinus potassium data. The third row corresponds to testing for a quadratic population cur
veagainstageneralalternative,whilethelastrowcorrespondstotestingforhomo-geneityofdogresponsesaroundthenonparametricpopulationcurve.Bothtestingproceduressuggeststrongevidenceagainstthecorrespondingnullhypotheses.Resultsinthissectionwereobtainedforafixednumberofknotsandchoiceofknotlocations.Insimulationstudies,Crainiceanu(2003)andGrevenetal.(2008)showedthatthenulldistributionandpowerpropertiesdonotchangesubstantiallybyincreasingthenumberofknotsaslongastheregressiondesignprovidesanalternativethatisflexibleenoughtocapturethepotentialcomplexitiesofthealter-nativehypothesis.ThisresultsareconsistentwiththeresultsinRuppert(2002)whoshowedthat20knotsareenoughtofitmostfunctionsthatdonotexhibitextremechangesincurvature.7DiscussionLMMsareusedinawiderangeofapplicationssuchaslongitudinalstudies,hierar-chicalmodelsorsmoothing.Thelikelihoodratiotestingforzerovariancecompo-nentsinmixedmodelshaslongbeenamethodologicalchallenge.Researchinthelast20yearscombinedwithrecentmethodologicalresultsandsimulationstudieshaveledtoabetterunderstandingoftheframework.Mostimportantly,theapplica-tionoflikelihoodratiotestingformostLMMshasbecomepossible,ifnotroutine. 
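The aUD approximation of Sect. 5.2 and the onion p-values quoted in Sect. 6 can be illustrated with a small sketch. The text does not say how $a$ and $p$ are estimated from the bootstrap sample, so the moment-type estimator below (share of zero draws for $p$, mean of the positive draws for $a$) is an assumption made for illustration only; the function names and the `simulate_aud` helper are likewise hypothetical.

```python
import math
import random


def chi2_1_sf(x):
    """P(chi-square with 1 df >= x), via the standard normal tail."""
    return 1.0 if x <= 0 else math.erfc(math.sqrt(x / 2.0))


def aud_pvalue(stat, a, p):
    """Tail probability P(aUD >= stat) under approximation (15)."""
    if stat <= 0:
        return 1.0
    return (1.0 - p) * chi2_1_sf(stat / a)


def fit_aud(draws, tol=1e-8):
    """Estimate (a, p) from simulated null (R)LRT values.

    Assumption (not from the source): p is the share of draws that are
    numerically zero and a is the mean of the positive draws, which
    matches E[a * chi2_1] = a.
    """
    positive = [d for d in draws if d > tol]
    p_hat = 1.0 - len(positive) / len(draws)
    a_hat = sum(positive) / len(positive) if positive else 1.0
    return a_hat, p_hat


def simulate_aud(n, a, p, seed=0):
    """Hypothetical stand-in for a bootstrap sample: n draws from aUD."""
    rng = random.Random(seed)
    return [a * rng.gauss(0.0, 1.0) ** 2 if rng.random() > p else 0.0
            for _ in range(n)]


# Onion example, RLRT = 1.96 as quoted in the text.
print(round(aud_pvalue(1.96, a=0.91, p=0.66), 3))  # aUD fit from Table 1
print(round(aud_pvalue(1.96, a=1.0, p=0.5), 3))    # 0.5:0.5 mixture
```

With the Table 1 values $a = 0.91$ and $p = 0.66$, the two calls give approximately 0.048 and 0.081, in line with the p-values reported for the onion example.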
While this paper is not aimed at answering all questions, several points should be made clear. First, the $\chi^2_1$ approximation can be applied with the acknowledgement that it may provide an excessively conservative approximation to the null distribution. This is safe when the evidence against the null is overwhelming (see, for example, the Loa loa and the dog response homogeneity examples). Second, the $0.5\chi^2_0 : 0.5\chi^2_1$ approximation can be applied in many situations, especially when testing for homogeneity of a large number of clusters (in the dogs example there are nine dogs). However, this approximation tends to be conservative and to lose power in many applications, although the effects are less serious than those associated with the use of the $\chi^2_1$ approximation. Thus, a nonsignificant effect using the 0.95 quantile of the $0.5\chi^2_0 : 0.5\chi^2_1$ distribution could, in fact, be significant using the correct null distribution at the same level (see the onion example). Third, in the case of LMMs with one variance component the finite sample distribution of the RLRT is available and easy to obtain. Fourth, in the case of LMMs with more than one variance component the fast finite sample approximation introduced in Sect. 5.1 performs consistently well. Fifth, the aUD approximation introduced in Sect. 5.2 may reduce simulation times while preserving accuracy by using much smaller bootstrap samples.

A natural question to ask is "What should I do if I have a testing problem for a zero variance component in a Linear Mixed Model?" Of course, there are many answers to this particular question, mainly because there are multiple ways of approaching the problem. One alternative is to use score tests, which are null-based tests, as described in the chapter by Zhang and Lin (2007) in this book. However, if one decides to use LRTs, the following algorithm-like list can provide guidance:

1. Use the Restricted Likelihood Ratio Test (RLRT) instead of the Likelihood Ratio Test (LRT). This is due to the tendency of ML to strongly underestimate the variance component, which ultimately leads to power losses for the LRT.
2. If testing for a variance component in an LMM with one variance component, use the exact finite sample distribution in Crainiceanu and Ruppert (2004b). If not, then continue to Step 3.
3. If possible, obtain 10,000 parametric bootstraps from the null distribution of the RLRT. Compare this distribution with the $0.5\chi^2_0 : 0.5\chi^2_1$ and aUD approximations. Report results based on the bootstrap, $0.5\chi^2_0 : 0.5\chi^2_1$ and aUD approximations.
4. If obtaining 10,000 bootstraps is computationally prohibitive, obtain at least 100–200 bootstrap samples. Then continue as in Step 3.
5. Obtain the finite sample approximation described in Sect. 5.1 and compare it with the other approximations.

The author's point of view is that the null finite sample distribution is the relevant distribution and not its asymptotic approximation. An asymptotic approximation is relevant when it provides a previously unknown insight, is much easier to use and closely approximates the null finite sample distribution. Thus, the asymptotic distribution is not the right distribution but is one automatic way to approach a testing problem. Fortunately, the variety of applications and problems continues to raise nonstandard problems.

Acknowledgements

Ciprian Crainiceanu's work was supported by NIH Grant AG025553-02 on the Effects of Aging on Sleep Architecture.

References

Chernoff, H. (1954). On the distribution of the likelihood ratio. Annals of Mathematical Statistics 25, 573–578
Crainiceanu, C. M. (2003). Ph.D. Thesis: Nonparametric Likelihood Ratio Testing. Cornell University
Crainiceanu, C. M. and D. Ruppert (2004a). Likelihood ratio tests for goodness-of-fit of a nonlinear regression model. Journal of Multivariate Analysis 91, 35–52
Crainiceanu, C. M. and D. Ruppert (2004b). Likelihood ratio tests in linear mixed models with one variance component. Journal of the Royal Statistical Society, Series B 66(1), 165–185
Crainiceanu, C. M., D. Ruppert, G. Claeskens, and M. P. Wand (2005). Exact likelihood ratio tests for penalised splines. Biometrika 92(1), 91–103
Crainiceanu, C. M., P. J. Diggle, and B. Rowlingson (2007). Bivariate binomial spatial modeling of Loa loa prevalence in tropical Africa, with discussions. Journal of the American Statistical Association 103, 21–43
Fields Development Team (2006). Fields: Tools for Spatial Data. Boulder, CO: National Center for Atmospheric Research
Gong, G. and F. J. Samaniego (1981). Pseudo maximum likelihood estimation: theory and applications. The Annals of Statistics 9(4), 861–869
Greven, S., C. M. Crainiceanu, H. Kuechenhoff, and A. Peters (2008). Likelihood ratio testing for zero variance components in linear mixed models. Journal of Computational and Graphical Statistics
Grizzle, J. E. and D. M. Allan (1969). Analysis of dose and dose response curves. Biometrics 25, 357–381
Kammann, E. and M. P. Wand (2003). Geoadditive models. Applied Statistics 52, 1–18
Laird, N. and J. H. Ware (1982). Random-effects models for longitudinal data. Biometrics 38, 963–974
Moran, P. A. P. (1971). Maximum likelihood estimators in non-standard conditions. Proceedings of the Cambridge Philosophical Society 70, 441–450
Nychka, D. W. and N. Saltzman (1998). Design of air quality monitoring networks. In D. Nychka, L. Cox, and W. Piegorsch (Eds.), Case Studies in Environmental Statistics. Berlin Heidelberg New York: Springer
Ratkowsky, D. A. (1983). Nonlinear Regression Modeling: A Unified Practical Approach. New York: Marcel Dekker
Ruppert, D. (2002). Selecting the number of knots for penalized splines. Journal of Computational and Graphical Statistics 11(4), 735–757
Ruppert, D., M. P. Wand, and R. J. Carroll (2003). Semiparametric Regression. Cambridge: Cambridge University Press
Self, S. G. and K.-Y. Liang (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association 82(398), 605–610
Stram, D. O. and J.-W. Lee (1994). Variance components testing in the longitudinal mixed effects model. Biometrics 50(3), 1171–1177
Wang, Y. (1998). Mixed effects smoothing spline analysis of variance. Journal of Royal Statistical Society, Series B 60, 159–174
Zhang, D. and X. Lin (2007). Variance component testing in generalized linear mixed models for longitudinal/clustered data and other related topics. In D. Dunson (Ed.), Model Selection in Linear Mixed Models. Berlin Heidelberg New York: Springer
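As a complement to the second point of the Discussion above, the following sketch compares level-0.05 critical values under the $\chi^2_1$, $0.5\chi^2_0 : 0.5\chi^2_1$ and aUD null approximations. It relies only on the identity $P(\chi^2_1 \ge c) = 2\{1 - \Phi(\sqrt{c})\}$; the function name `aud_critical_value` is hypothetical, and the onion values $a = 0.91$, $p = 0.66$ are taken from Table 1 of the chapter.

```python
from statistics import NormalDist


def aud_critical_value(alpha, a=1.0, p=0.5):
    """Value c with P(aUD > c) = alpha, where aUD = a * U * chi2_1.

    Solves (1 - p) * P(chi2_1 >= c / a) = alpha for c. If alpha is at
    least 1 - p, rejecting whenever the statistic is positive already
    has size 1 - p <= alpha, so c = 0 is returned.
    """
    if alpha >= 1.0 - p:
        return 0.0
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * (1.0 - p)))
    return a * z * z


chi2_1 = aud_critical_value(0.05, a=1.0, p=0.0)   # plain chi2_1
mixture = aud_critical_value(0.05)                 # 0.5:0.5 mixture
onion = aud_critical_value(0.05, a=0.91, p=0.66)   # aUD fit, onion row
print(round(chi2_1, 2), round(mixture, 2), round(onion, 2))
```

Under these inputs the three cutoffs are roughly 3.84, 2.71 and 1.91, so an RLRT value of 1.96 falls below the mixture cutoff but above the aUD cutoff, which is the situation the onion example describes.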
Variance Component Testing in Generalized Linear Mixed Models for Longitudinal/Clustered Data and other Related Topics

Daowen Zhang and Xihong Lin

D. Zhang, Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA, zhang@stat.ncsu.edu
X. Lin, Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA, xlin@hsph.harvard.edu
D.B. Dunson (ed.) Random Effect and Latent Variable Model Selection, DOI: 10.1007/978-0-387-76721-5, © Springer Science+Business Media, LLC 2008

1 Introduction

Linear mixed models (Laird and Ware, 1982) and generalized linear mixed models (GLMMs) (Breslow and Clayton, 1993) have been widely used in many research areas, especially in the area of biomedical research, to analyze longitudinal and clustered data and multiple outcome data. In a mixed effects model, subject-specific random effects are used to explicitly model between-subject variation in the data and are often assumed to follow a mean zero parametric distribution, e.g., multivariate normal, that depends on some unknown variance components. A large literature was developed in the last two decades for the estimation of regression coefficients and variance components in mixed effects models. See Diggle et al. (2002) and Verbeke and Molenberghs (2000, 2005) for an overview.

In many situations, however, we are interested in testing whether some of the between-subject variations are absent in a mixed effects model. This is equivalent to testing some variance components equal to zero. However, such a null hypothesis places some variance components on the boundary of the parameter space. Hence the commonly used tests, such as the likelihood ratio, Wald and score tests, do not have the traditional chi-squared distribution. In this chapter, we will review the likelihood ratio test and the score test for testing variance components in GLMMs.

A closely related topic is testing whether a covariate effect in a GLMM can be adequately represented by a polynomial of a certain degree. Using a smoothing spline or penalized spline approach, testing for a polynomial covariate effect is equivalent to testing a zero variance component in an induced GLMM. We will review the likelihood ratio test and the score test for testing a parametric polynomial model versus a smoothing spline model for longitudinal data within the generalized additive mixed models framework (Lin and Zhang, 1999).

This chapter is organized as follows. In Sect. 2, we present the model specification of a GLMM and briefly review model estimation and inference procedures. In Sect. 3, we review the likelihood ratio test for variance components in GLMMs and illustrate such tests in several common cases of interest. In Sect. 4, we review the score test for variance components in GLMMs, and compare the performance of the likelihood ratio test with the score test in a simple GLMM. In Sect. 6, we review the likelihood ratio test and the score test for testing a polynomial covariate effect versus a nonparametric smoothing spline model for longitudinal data. We illustrate these tests in Sect. 7 through the application of data from a study of infectious disease in Indonesian children. The chapter ends with a discussion in Sect. 8.

2 Generalized Linear Mixed Models for Longitudinal/Clustered Data

Suppose there are $m$ subjects in the sample. For the $i$th subject, denote by $y_{ij}$ the response measured for the $j$th observation, e.g., the $j$th time point for longitudinal data or the $j$th outcome for multiple outcome data. Similarly, denote by $x_{ij}$ a $p \times 1$ vector of covariates associated with fixed effects and by $z_{ij}$ a $q \times 1$ vector of covariate values associated with random effects. Given subject-specific random effects $b_i$, the responses $y_{ij}$ are assumed to be conditionally independent and to belong to an exponential family with conditional mean $E(y_{ij} \mid b_i) = \mu_{ij}$ and conditional variance $\mathrm{var}(y_{ij} \mid b_i) = V(\mu_{ij}) = \phi\,\omega_{ij}^{-1} v(\mu_{ij})$, where $\phi$ is a positive dispersion parameter, $\omega_{ij}$ is a pre-specified weight such as the binomial denominator when $y_{ij}$ is the proportion of events in binomial sampling, and $v(\cdot)$ is the variance function. A generalized linear mixed model (GLMM) relates the conditional mean $\mu_{ij}$ to the covariates $x_{ij}$ and $z_{ij}$ as follows:

$$g(\mu_{ij}) = x_{ij}^T\beta + z_{ij}^T b_i, \qquad (1)$$

where $g(\cdot)$ is a strictly increasing link function, $\beta$ is a $p \times 1$ vector of fixed effects (regression coefficients) of $x$, and $b_i$ is a $q \times 1$ vector of subject-specific random effects of $z$. The model specification is completed by the usual assumption that $b_i \sim N\{0, D(\psi)\}$, where $\psi$ is a $c \times 1$ vector of variance components.

Model (1) includes many popular models for continuous and discrete data as special cases. For example, if the $y_{ij}$ are continuous outcome measurements assumed to have a normal distribution given random effects $b_i$ and the link function is the identity link $g(\mu) = \mu$, then model (1) reduces to the following linear mixed model (Laird and Ware, 1982)

$$y_{ij} = x_{ij}^T\beta + z_{ij}^T b_i + \epsilon_{ij}, \qquad (2)$$

where $\epsilon_{ij} \overset{iid}{\sim} N(0, \phi)$ are residual errors. When the $y_{ij}$ are binary responses, a common choice of the link function is the logit link $g(\mu) = \log\{\mu/(1-\mu)\}$. In this case, model (1) reduces to the following logistic-normal model

$$\mathrm{logit}\{P(y_{ij} = 1 \mid b_i)\} = x_{ij}^T\beta + z_{ij}^T b_i. \qquad (3)$$

The log-likelihood function $\ell(\beta, \psi; y)$ given outcome $y$ under model (1) is

$$\exp\{\ell(\beta,\psi;y)\} \propto |D(\psi)|^{-m/2} \prod_{i=1}^{m} \int \exp\left\{ \sum_{j=1}^{n_i} \ell_{ij}(\beta,\psi; y_{ij} \mid b_i) - \frac{1}{2} b_i^T D(\psi)^{-1} b_i \right\} db_i, \qquad (4)$$

where

$$\ell_{ij}(\beta,\psi; y_{ij} \mid b_i) = \int_{y_{ij}}^{\mu_{ij}} \frac{\omega_{ij}(y_{ij} - u)}{\phi\, v(u)}\, du$$

is the conditional log-likelihood of $y_{ij}$ given random effects $b_i$.

Estimation and inference in model (1) are often hampered by the intractable integrations involved in evaluation of likelihood (4) and have been well developed in the past two decades. Our main focus in this paper is on variance component testing in a GLMM. We hence list here some representative work as references. Zeger and Karim (1991) used a Gibbs sampling approach for model estimation and inference. Breslow and Clayton (1993) approximated the likelihood (4) using Laplace approximation and conducted model estimation and inference by maximizing a penalized quasi-likelihood (PQL). Breslow and Lin (1995) and Lin and Breslow (1996) studied the bias in PQL estimators and developed bias-correction methods. Booth and Hobert (1999) proposed an automated Monte Carlo EM algorithm to maximize the integrated likelihood (4).

As usual, throughout this chapter, we will use $X$ for the design matrix of $\beta$ and $Z$ for the design matrix of $b$. That is, $X = (X_1^T, X_2^T, \ldots, X_m^T)^T$ where $X_i = (x_{i1}, x_{i2}, \ldots, x_{in_i})^T$, and $Z = \mathrm{diag}\{Z_1, Z_2, \ldots, Z_m\}$ where $Z_i = (z_{i1}, z_{i2}, \ldots, z_{in_i})^T$.

3 The Likelihood Ratio Test for Variance Components in GLMMs

The specification of the subject-specific random effects $b_i$ in model (1) models the source of between-subject variation in the covariate effects of $z$, which also determines the within-subject correlation. The magnitude of this between-subject variation/within-subject correlation is captured by the magnitude of the elements of $D(\psi)$. In practice, investigators may be interested to see if there is no between-subject variation in some covariate effects of $z$. Statistically, it is equivalent to testing some or all of the elements of $D(\psi)$ to be zero.

In a regular hypothesis testing setting, a likelihood ratio test (LRT) is the most commonly used test due to its desirable theoretical properties and the fact that it is easy to construct. Under very general regularity conditions, the LRT statistic asymptotically has a $\chi^2$ null distribution with the degrees of freedom equal to the number of independent parameters being tested under the null hypothesis. However, when the elements of $D(\psi)$ are tested, the null hypothesis usually places some or all of the components of $\psi$ on the boundary of the model parameter space, in which case the LRT statistic does not have the usual $\chi^2$ null distribution.

Denote by $\theta = (\beta^T, \psi^T)^T$ a combined vector of regression and variance–covariance parameters in the model. Self and Liang (1987) formulated the asymptotic null distribution of the LRT statistic $-2\ln\lambda_m$ for testing $H_0: \theta_0 \in \Omega_0$ vs. $H_A: \theta_0 \in \Omega_1 = \Omega \setminus \Omega_0$, when the true value $\theta_0$ of $\theta$ is possibly on the boundary of the model parameter space $\Omega$. Assume that the parameter spaces $\Omega_1$ under $H_A$ and $\Omega_0$ under $H_0$ can be approximated at $\theta_0$ by cones $C_{\Omega_1}$ and $C_{\Omega_0}$, respectively, with vertex $\theta_0$. Self and Liang (1987) showed that under some regularity conditions the LRT statistic $-2\ln\lambda_m$ asymptotically has the same distribution as

$$\inf_{\theta \in C_{\Omega_0} - \theta_0} \{(U-\theta)^T I(\theta_0)(U-\theta)\} - \inf_{\theta \in C_{\Omega} - \theta_0} \{(U-\theta)^T I(\theta_0)(U-\theta)\}, \qquad (5)$$

where $C_\Omega$ is the cone approximating $\Omega$ with vertex at $\theta_0$, $C_\Omega - \theta_0$ and $C_{\Omega_0} - \theta_0$ are translated cones of $C_\Omega$ and $C_{\Omega_0}$ such that their vertices are the origin, $I(\theta_0)$ is the (Fisher) information matrix at $\theta_0$, and $U$ is a random vector distributed as $N\{0, I^{-1}(\theta_0)\}$. Alternatively, Self and Liang (1987) expressed (5) as

$$\inf_{\theta \in \tilde C_0} \|\tilde U - \theta\|^2 - \inf_{\theta \in \tilde C} \|\tilde U - \theta\|^2, \qquad (6)$$

where $\tilde C = \{\tilde\theta : \tilde\theta = \Lambda^{1/2} Q^T \theta \text{ for all } \theta \in C_\Omega - \theta_0\}$, $\tilde C_0 = \{\tilde\theta : \tilde\theta = \Lambda^{1/2} Q^T \theta \text{ for all } \theta \in C_{\Omega_0} - \theta_0\}$, $\tilde U$ is a random vector from $N(0, I)$ and $Q \Lambda Q^T$ is the spectral decomposition of $I(\theta_0)$; that is, $I(\theta_0) = Q \Lambda Q^T$, $QQ^T = I$ and $\Lambda = \mathrm{diag}\{\lambda_i\}$. We can use either (5) or (6) to derive the asymptotic null distribution for the LRT statistic depending on the structure of $I(\theta_0)$.

Stram and Lee (1994) applied the above general results of Self and Liang (1987) to investigate the asymptotic null distribution of the LRT statistic $-2\ln\lambda_m$ for testing components of $D(\psi)$ for linear mixed model (2). Since the results of Self and Liang (1987) are for a general parametric model, they are also applicable to GLMM (1) as long as one can maximize the likelihood (4) under the null and alternative hypotheses of interest. Here, we list some cases one commonly encounters in practice. For reviews on the LRT for variance components in linear mixed models, see the chapter "Likelihood Ratio Testing for Zero Variance Components in Linear Mixed Models" by Crainiceanu.

Case 1. Assume the dimension $q$ of the random effects is equal to one, that is, $D = d_{11}$, and we are testing $H_0: d_{11} = 0$ vs. $H_A: d_{11} > 0$. For example, consider the random intercept model $z_{ij}^T b_i = b_i$ and $b_i \sim N(0, d_{11})$ in model (1).

In this case, $\theta = (\beta^T, d_{11})^T$, $C_{\Omega_0} = R^p \times \{0\}$ and $C_{\Omega_1} = R^p \times (0, \infty)$. Decompose $U$ and $I(\theta_0)$ in (5) as $U = (U_1^T, U_2)^T$ and $I(\theta_0) = \{I_{jk}\}$ corresponding to $\beta$ and $d_{11}$. Some algebra then shows that

$$\inf_{\theta \in C_{\Omega_0} - \theta_0} \{(U-\theta)^T I(\theta_0)(U-\theta)\} = \tilde U_2^2,$$

where $\tilde U_2 = (I_{22} - I_{21} I_{11}^{-1} I_{12})^{1/2} U_2$, and

$$\inf_{\theta \in C_{\Omega} - \theta_0} \{(U-\theta)^T I(\theta_0)(U-\theta)\} = \tilde U_2^2\, I(\tilde U_2 \le 0).$$

Therefore, (5) reduces to $\tilde U_2^2\, I(\tilde U_2 > 0)$. It is easy to see that $\tilde U_2 \sim N(0, 1)$. The asymptotic null distribution of $-2\ln\lambda_m$ (as $m \to \infty$) is then a 50:50 mixture of $\chi^2_0$ and $\chi^2_1$.

Denote the observed LRT statistic by $T_{obs}$. Then, the level $\alpha$ likelihood ratio test will reject $H_0: d_{11} = 0$ if $T_{obs} \ge \chi^2_{2\alpha,1}$, where $\chi^2_{2\alpha,1}$ is the $(1-2\alpha)$th quantile of the $\chi^2$ distribution with one degree of freedom. The corresponding p-value is $P[\chi^2_1 \ge T_{obs}]/2$, half of the p-value if the regular but incorrect $\chi^2_1$ distribution were used.

Case 2. Assume $q = 2$ so that $D = \{d_{ij}\}_{2\times 2}$, and we test $H_0: d_{11} > 0, d_{12} = d_{22} = 0$ vs. $H_A$: $D$ is positive definite. As an example, consider the random intercept and slope model $z_{ij}^T b_i = b_{0i} + b_{1i} t_{ij}$, where $t_{ij}$ is the time and $b_{0i}$ and $b_{1i}$ are the subject-specific random intercept and slope in longitudinal data, assumed to follow $(b_{0i}, b_{1i}) \sim N\{0, D(\psi)\}$. The foregoing hypothesis tests the random intercept model ($H_0$) versus the random intercept and slope model ($H_A$).

In this case, $\theta = (\theta_1^T, \theta_2, \theta_3)^T$ where $\theta_1 = (\beta^T, d_{11})^T$, $\theta_2 = d_{12}$ and $\theta_3 = d_{22}$. Under $H_0: d_{11} > 0$, the translated approximating cone at $\theta_0$ is $C_{\Omega_0} - \theta_0 = R^{p+1} \times \{0\} \times \{0\}$. Under $H_0 \cup H_A$, $d_{11} > 0$ and $D$ is positive semidefinite. This is equivalent to $d_{11} > 0$ and $d_{22} - d_{11}^{-1} d_{12}^2 \ge 0$. Since the boundary defined by $d_{22} - d_{11}^{-1} d_{12}^2 = 0$ for any given $d_{11} > 0$ is a smooth surface, the translated approximating cone at $\theta_0$ under $H_0 \cup H_A$ is $C_\Omega - \theta_0 = R^{p+1} \times R^1 \times [0, \infty)$. Similar to Case 1, decompose $U$ and $I^{-1}(\theta_0)$ in (5) as $U = (U_1^T, U_2, U_3)^T$ and $I^{-1}(\theta_0) = \{I^{jk}\}$ corresponding to $\theta_1$, $\theta_2$ and $\theta_3$. We can then show that

$$\inf_{\theta \in C_{\Omega_0} - \theta_0} \{(U-\theta)^T I(\theta_0)(U-\theta)\} = [U_2, U_3] \begin{pmatrix} I^{22} & I^{23} \\ I^{32} & I^{33} \end{pmatrix}^{-1} \begin{pmatrix} U_2 \\ U_3 \end{pmatrix}, \qquad (7)$$

$$\inf_{\theta \in C_{\Omega} - \theta_0} \{(U-\theta)^T I(\theta_0)(U-\theta)\} = (I^{33})^{-1} U_3^2\, I(U_3 \le 0). \qquad (8)$$

Since $(U_1^T, U_2, U_3)^T \sim N\{0, I^{-1}(\theta_0)\}$, the distribution of the difference between (7) and (8) is a 50:50 mixture of $\chi^2_1$ and $\chi^2_2$.

For a given significance level $\alpha$, the critical value $c_\alpha$ for the LRT can be solved from the following equation using some statistical software:

$$0.5\,P[\chi^2_1 \ge c] + 0.5\,P[\chi^2_2 \ge c] = \alpha.$$

Alternatively, the significance level $\alpha$ can also be compared to the LRT p-value

$$\text{p-value} = 0.5\,P[\chi^2_1 \ge T_{obs}] + 0.5\,P[\chi^2_2 \ge T_{obs}],$$

where $T_{obs}$ is the observed LRT statistic. This p-value is always smaller than the usual but incorrect p-value $P[\chi^2_2 \ge T_{obs}]$ in this setting. The decision based on this classical p-value is hence conservative.

Case 3. Assume $q > 2$ and we test the presence of the $q$th element of the random effects $b_i$ in model (1). Denote

$$D = \begin{pmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{pmatrix},$$

where the dimensions of $D_{11}$, $D_{12}$, and $D_{21}$ are $s \times s$, $s \times 1$, and $1 \times s$, respectively ($s = q - 1$), and $D_{22}$ is a scalar. Then statistically, we test $H_0$: $D_{11}$ is positive definite, $D_{12} = 0$, $D_{22} = 0$ vs. $H_A$: $D$ is positive definite. Denote by $\theta_1$ the combined vector of $\beta$ and the unique elements of $D_{11}$, $\theta_2 = D_{12}$, and $\theta_3 = D_{22}$.

Under $H_0$, the translated approximating cone at $\theta_0$ is $C_{\Omega_0} - \theta_0 = R^{p+s(s+1)/2} \times \{0\}^s \times \{0\}$. Under $H_0 \cup H_A$, $D_{11}$ is positive definite and $D$ is positive semidefinite. This is equivalent to $D_{11}$ being positive definite and $D_{22} - D_{12}^T D_{11}^{-1} D_{12} \ge 0$ (Stram and Lee (1994) mistakenly used $q$ constraints). Again, since the boundary defined by $D_{22} - D_{12}^T D_{11}^{-1} D_{12} = 0$ for any given positive definite matrix $D_{11}$ is a smooth surface, the translated approximating cone at $\theta_0$ under $H_0 \cup H_A$ is $C_\Omega - \theta_0 = R^{p+s(s+1)/2} \times R^s \times [0, \infty)$. This case is similar to Case 2 except that $U_2$ is an $s \times 1$ random vector. Therefore, the asymptotic null distribution of the LRT statistic is a 50:50 mixture of $\chi^2_s$ and $\chi^2_{s+1}$. The p-value of the LRT for a given observed LRT statistic $T_{obs}$ is equal to $0.5\,P[\chi^2_s \ge T_{obs}] + 0.5\,P[\chi^2_{s+1} \ge T_{obs}]$, which will be closer to the usual but incorrect p-value $P[\chi^2_{s+1} \ge T_{obs}]$ as $s$ becomes larger.

Case 4. Suppose the random effects part $z_{ij}^T b_i$ in model (1) can be decomposed as $z_{ij}^T b_i = z_{1ij}^T b_{1i} + z_{2ij}^T b_{2i}$, where $b_{1i} \sim N\{0, D_1(\psi_1)\}$, $b_{2i} \sim N(0, \psi_2 I)$ and we test $H_0$: $\psi_2 = 0$ and $D_1$ is positive definite, versus $H_A$: $\psi_2 > 0$ and $D_1$ is positive definite. Denote by $\theta_1$ the combined vector of $\beta$ and the unique elements of $D_1$, and $\theta_2 = \psi_2$. Since the true values of the nuisance parameters $\theta_1$ are interior points of the corresponding parameter space, we can apply the result of Case 1 to this case. This implies that the asymptotic null distribution of the LRT statistic is a 50:50 mixture of $\chi^2_0$ and $\chi^2_1$.

Case 5. Suppose $D_1(\psi_1)$ in Case 4 takes the form $\psi_1 I$, and we test $H_0: \psi_1 = 0, \psi_2 = 0$ versus $H_A$: either $\psi_1 > 0$ or $\psi_2 > 0$. Denote $\theta = (\beta^T, \psi_1, \psi_2)^T$ with $\theta_1 = \beta$, $\theta_2 = \psi_1$ and $\theta_3 = \psi_2$. Under $H_0$, the translated approximating cone at $\theta_0$ is $C_{\Omega_0} - \theta_0 = R^p \times \{0\} \times \{0\}$. Under $H_0 \cup H_A$, the translated approximating cone at $\theta_0$ is $C_\Omega - \theta_0 = R^p \times [0, \infty) \times [0, \infty)$.

Decompose $U$ and $I(\theta_0)$ in (5) as $(U_1^T, U_2, U_3)^T$ and $I(\theta_0) = \{I_{ij}\}$ corresponding to $\theta_1$, $\theta_2$ and $\theta_3$, and define the matrix $\tilde I$ as follows:

$$\tilde I = \begin{pmatrix} \tilde I_{22} & \tilde I_{23} \\ \tilde I_{32} & \tilde I_{33} \end{pmatrix} = \begin{pmatrix} I_{22} & I_{23} \\ I_{32} & I_{33} \end{pmatrix} - \begin{pmatrix} I_{21} \\ I_{31} \end{pmatrix} I_{11}^{-1} [I_{12}, I_{13}].$$

Then $(U_2, U_3)^T \sim N(0, \tilde I^{-1})$. Given $\theta_2$ and $\theta_3$, it can be easily shown that

$$\inf_{\theta_1 \in R^p} (U-\theta)^T I(\theta_0)(U-\theta) = [U_2 - \theta_2, U_3 - \theta_3]\, \tilde I \begin{pmatrix} U_2 - \theta_2 \\ U_3 - \theta_3 \end{pmatrix} = (\tilde U_2 - \tilde\theta_2)^2 + (\tilde U_3 - \tilde\theta_3)^2,$$

where $(\tilde U_2, \tilde U_3)^T = \tilde\Lambda^{1/2} \tilde Q^T (U_2, U_3)^T$, $(\tilde\theta_2, \tilde\theta_3)^T = \tilde\Lambda^{1/2} \tilde Q^T (\theta_2, \theta_3)^T$, and $\tilde Q \tilde\Lambda \tilde Q^T$ is the spectral decomposition of $\tilde I$. Therefore, under $H_0$, we have

$$\inf_{\theta \in C_{\Omega_0} - \theta_0} (U-\theta)^T I(\theta_0)(U-\theta) = \tilde U_2^2 + \tilde U_3^2.$$

Denote by $\varphi$ the angle in radians formed by the vectors $\tilde\Lambda^{1/2} \tilde Q^T (1, 0)^T$ and $\tilde\Lambda^{1/2} \tilde Q^T (0, 1)^T$, that is, $\varphi = \cos^{-1}\left(\tilde I_{23} / \sqrt{\tilde I_{22} \tilde I_{33}}\right)$ (Self and Liang (1987) accidentally used $I^{jk}$), and set $\xi = \varphi/2\pi$. Then

$$\inf_{\theta \in C_{\Omega} - \theta_0} (U-\theta)^T I(\theta_0)(U-\theta) = \begin{cases} \tilde U_2^2 + \tilde U_3^2 & \text{with probability } \xi \\ \tilde U_2^2 & \text{with probability } 0.25 \\ \tilde U_3^2 & \text{with probability } 0.25 \\ 0 & \text{with probability } 0.5 - \xi. \end{cases}$$

Therefore, the asymptotic null distribution of the LRT statistic is a mixture of $\chi^2_0$, $\chi^2_1$, and $\chi^2_2$ with mixing probabilities $\xi$, 0.5, and $0.5 - \xi$. Note that since $\tilde I$ is a positive definite matrix, the probability $\xi$ satisfies $0 < \xi < 0.5$. In particular, if $\tilde I$ is diagonal, the mixing probabilities are 0.25, 0.5, and 0.25.

The asymptotic null distribution of the LRT statistic is relatively easy to study for the above cases. The structure of the information matrix $I(\theta_0)$ and the approximating cones $C_\Omega - \theta_0$ and $C_{\Omega_0} - \theta_0$ play key roles in deriving the asymptotic null distribution. For more complicated cases of testing variance components, although the asymptotic null distribution of the LRT is generally still a mixture of some chi-squared distributions, it may be too difficult to derive the mixing probabilities. In this case, one may use simulation to calculate the p-value.

4 The Score Test for Variance Components in GLMMs

Conceptually, the LRT for variance components in GLMMs discussed in Sect. 3 is easy to apply. However, the LRT involves fitting GLMM (1) under $H_0$ and $H_0 \cup H_A$. For many situations, it is relatively straightforward to fit model (1) under $H_0$. However, one could often encounter numerical difficulties in fitting the full model (1) under $H_0 \cup H_A$. First, fitting model (1) under $H_0 \cup H_A$ involves higher dimensional integration, thus increasing computational burden. Second, if $H_0$ is true or approximately true, it is often unstable to fit a more complicated model under $H_0 \cup H_A$ as the parameters used to specify $H_0$ are estimated close to the boundary. For example, although the Laplace approximation used by Breslow and Clayton (1993) and others is recommended for a GLMM with complex parameter boundary, such approximation may work poorly in such cases (Hsiao, 1997).

In this section, we discuss score tests for variance components in model (1). One advantage of using score tests is that we only need to fit model (1) under $H_0$, often dramatically reducing computational burden. Another advantage is that, unlike likelihood ratio tests, score tests only require the specification of the first two moments of random effects and are hence robust to mis-specification of the distribution of random effects (Lin, 1997).

We first review the score test for Case 1 discussed in Sect. 3, that is, we assume that there is only one variance component in model (1) for which we would like to conduct hypothesis testing. A one-sided score test is desirable in this case and can be found in Lin (1997) and Jacqmin-Gadda and Commenges (1995). Zhang (1997) discussed a one-sided score test for testing $H_0: \psi_2 = 0$ for Case 5 in Sect. 3 for a generalized additive mixed model, which includes model (1) as a special case. Verbeke and Molenberghs (2003) discussed one-sided score tests for linear mixed model (2). Lin (1997) derived score statistics for testing single or multiple variance components in GLMMs and considered simpler two-sided tests. Parallel to likelihood ratio tests, the one-sided score tests follow a mixture of chi-square distributions whose weights could be difficult to calculate when multiple variance components are set to be zero under $H_0$, as illustrated in Case 5. The two-sided score tests assume the score statistics follow a regular chi-square distribution and hence their p-values can be calculated more easily, especially for multiple variance component tests. The two-sided score test has the correct size under $H_0$, while its power might be lower than that of the one-sided score and likelihood ratio tests. See the simulation results for more details.

In Case 1, $\psi = d_{11}$. Assume at the moment that $\beta$ is known. One can show using L'Hôpital's rule or the Taylor expansion (Lin, 1997) that the score for $\psi$ is

$$U_\psi = \left.\frac{\partial \ell(\beta,\psi;y)}{\partial\psi}\right|_{\psi=0} = \frac{1}{2} \sum_{i=1}^{m} \left[ \left\{ \sum_{j=1}^{n_i} z_{ij} w_{ij} \delta_{ij} (y_{ij} - \mu_{ij}^0) \right\}^2 - \sum_{j=1}^{n_i} z_{ij}^2 \left\{ w_{ij} + e_{ij}(y_{ij} - \mu_{ij}^0) \right\} \right], \qquad (9)$$

where $w_{ij} = [V(\mu_{ij}^0)\{g'(\mu_{ij}^0)\}^2]^{-1}$, $\delta_{ij} = g'(\mu_{ij}^0)$,

$$e_{ij} = \frac{V'(\mu_{ij}^0)\, g'(\mu_{ij}^0) + V(\mu_{ij}^0)\, g''(\mu_{ij}^0)}{V^2(\mu_{ij}^0)\, \{g'(\mu_{ij}^0)\}^3},$$

which is zero for the canonical link function $g(\cdot)$, and $\mu_{ij}^0$ satisfies $g(\mu_{ij}^0) = x_{ij}^T\beta$.

It can be easily shown that the random variable $U_\psi$ defined by (9) has zero mean under $H_0: \psi = 0$. As argued by Verbeke and Molenberghs (2003), the log-likelihood $\ell(\beta,\psi;y)$ for the linear mixed model (2) on average has a positive slope at $\psi = 0$ when in fact $\psi > 0$. The same argument also applies to GLMM (1). This is because under $H_A: \psi > 0$, the MLE $\hat\psi$ of $\psi$ will be close to $\psi$ so that $\hat\psi > 0$ when the sample size $m$ gets large. If the log-likelihood $\ell(\beta,\psi;y)$ as a function of $\psi$ only is smooth and has a unique MLE $\hat\psi$, which is the case for most GLMMs, the slope $U_\psi$ of $\ell(\beta,\psi;y)$ at $\psi = 0$ will be positive. Indeed, $E(U_\psi)$ generally is an increasing function of $\psi$. For example, Fig. 1 plots the expected score $E(U_\psi)$ vs. $\psi$ for the logistic-normal GLMM (3) where $m = 10$, $n_i = 5$, $x_{ij} = 1$, $\beta = 0.25$, and $z_{ij} = 1$. It is confirmed that $E(U_\psi)$ increases as $\psi$ increases.

Fig. 1 Expected score as a function of variance component $\psi$ (horizontal axis: variance component; vertical axis: expected score)

The above argument indicates that a large value of $U_\psi$ provides evidence against $H_0: \psi = 0$ and we should reject $H_0$ only if $U_\psi$ is large. Since $U_\psi$ is a sum of independent random variables, classic results show that it will have an asymptotic normal distribution under $H_0: \psi = 0$ with zero mean and variance equal to $I_{\psi\psi} = E(U_\psi^2)$, where the expectation is taken at $H_0: \psi = 0$.

Denote by $\kappa_{rij}$ the $r$th cumulant of $y_{ij}$ under $H_0$. By the properties of the distributions in an exponential family, $\kappa_{3ij}$ and $\kappa_{4ij}$ are related to $\kappa_{2ij}$ via $\kappa_{(r+1)ij} = \kappa_{2ij}\, \partial\kappa_{rij}/\partial\mu_{ij}$ ($r = 2, 3$), where $\kappa_{2ij} = \phi\,\omega_{ij}^{-1} v(\mu_{ij}^0)$ and $\mu_{ij} = \mu_{ij}^0$. Specifically,

$$\kappa_{3ij} = \phi^2 \omega_{ij}^{-2}\, v'(\mu_{ij})\, v(\mu_{ij}),$$

$$\kappa_{4ij} = \phi^3 \omega_{ij}^{-3} \left[ v''(\mu_{ij})\, v^2(\mu_{ij}) + \{v'(\mu_{ij})\}^2\, v(\mu_{ij}) \right].$$

Then $I_{\psi\psi}$ can be shown to be (Lin, 1997)

$$I_{\psi\psi} = \frac{1}{4} \sum_{i=1}^{m} \sum_{j=1}^{n_i} z_{ij}^2\, r_{ii},$$

where $r_{ii} = w_{ij}^4 \delta_{ij}^4 \kappa_{4ij} + 2 w_{ij}^2 + e_{ij}^2 \kappa_{2ij} - 2 w_{ij}^2 \delta_{ij}^2 e_{ij} \kappa_{3ij}$. Therefore, a level $\alpha$ score test for testing $H_0: \psi = 0$ vs. $H_A: \psi > 0$ will reject $H_0: \psi = 0$ if $U_\psi \ge z_\alpha I_{\psi\psi}^{1/2}$.

In practice, however, $\beta$ in $U_\psi$ and $I_{\psi\psi}$ is unknown and has to be estimated under $H_0$. This is straightforward since under $H_0: \psi = 0$, GLMM (1) reduces to the standard generalized linear model for independent data $g(\mu_{ij}) = x_{ij}^T\beta$ and existing software can be used to easily calculate the MLE $\hat\beta$ of $\beta$ under $H_0: \psi = 0$. In this case, Lin (1997) considered the bias-corrected score statistic to account for the estimation of $\beta$ under $H_0$ as

$$U_\psi^c = \left.\frac{\partial \ell(\beta,\psi;y)}{\partial\psi}\right|_{\psi=0,\,\beta=\hat\beta} = \frac{1}{2} \sum_{i=1}^{m} \left[ \left\{ \sum_{j=1}^{n_i} z_{ij} w_{ij} \delta_{ij} (y_{ij} - \hat\mu_{ij}^0) \right\}^2 - \sum_{j=1}^{n_i} z_{ij}^2 w_{0ij} \right], \qquad (10)$$

where all quantities are obtained by replacing $\beta$ by $\hat\beta$, $w_{0ij} = (1 - h_{ij}) w_{ij} + e_{ij}(y_{ij} - \hat\mu_{ij}^0)$, and $h_{ij}$ is the corresponding diagonal element of the hat matrix $H = W^{1/2} X (X^T W X)^{-1} X^T W^{1/2}$, $W = \mathrm{diag}\{w_{ij}\}$, and showed that $U_\psi^c$ has variance

$$\tilde I_{\psi\psi} = I_{\psi\psi} - I_{\psi\beta}^T I_{\beta\beta}^{-1} I_{\psi\beta}, \qquad (11)$$

where

$$I_{\psi\beta} = \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n_i} c_{ij} z_{ij} x_{ij}, \qquad I_{\beta\beta} = X^T W X = \sum_{i=1}^{m} \sum_{j=1}^{n_i} w_{ij} x_{ij} x_{ij}^T \qquad (12)$$

with $c_{ij} = w_{ij}^3 \delta_{ij}^3 \kappa_{3ij} - w_{ij} \delta_{ij} e_{ij} \kappa_{2ij}$. Then the bias-corrected score test at level $\alpha$ would reject $H_0$ if $T_s = U_\psi^c \ge z_\alpha \tilde I_{\psi\psi}^{1/2}$. The one-sided score test presented above is asymptotically equivalent to the likelihood ratio test (Verbeke and Molenberghs, 2003).

The two-sided score test assumes the score statistic $T_s = \{U_\psi^c\}^2 / \tilde I_{\psi\psi}$ follows a $\chi^2_1$ distribution. Unlike the regular likelihood ratio test, such a two-sided score test has the correct size under $H_0$ but is subject to some loss of power. As shown in our simulation studies for a single variance component, the loss of power is minor to moderate for most alternatives. The highest power loss is about 10% when the magnitude of the variance component is moderate.

When the dimension of $\psi$ is greater than 1, suppose we can partition $\psi = (\psi_1, \psi_2)$ where $\psi_1$ is a $c_1 \times 1$ vector and $\psi_2$ is a $c_2 \times 1$ vector. We are interested in testing $H_0: \psi_1 = 0$ vs. $H_A: \psi_1 \ge 0$. Here the inequality is interpreted element-wise. Lin (1997) considered a simple two-sided score test for this multiple variance component test. Specifically, denote by $(\hat\beta, \hat\psi_2)$ the MLE of $(\beta, \psi_2)$ under $H_0: \psi_1 = 0$. We can similarly derive the (corrected) score $S_{\psi_1} = m^{-1/2}\, \partial\ell(\beta, \psi_1; y)/\partial\psi_1 |_{\psi_1 = 0,\, \beta = \hat\beta,\, \psi_2 = \hat\psi_2}$. See Lin (1997) for the special case where each element of $\psi$ represents a variance of a random effect. Asymptotically, $S_{\psi_1}$ has a normal distribution with zero mean and variance equal to the efficient information matrix $H_{\psi_1\psi_1} = m^{-1} \tilde I_{\psi_1\psi_1}$ under $H_0$, where $\tilde I_{\psi_1\psi_1}$ is defined similarly to (11) except that $I_{\psi\beta}$ and $I_{\beta\beta}$ are replaced by $I_{\psi_1\gamma}$ and $I_{\gamma\gamma}$ and $\gamma = (\psi_2, \beta)$. The simple two-sided score statistic is defined as

$$T_s = S_{\psi_1}^T H_{\psi_1\psi_1}^{-1} S_{\psi_1} \qquad (13)$$

and the p-value is calculated by assuming $T_s$ follows a chi-square distribution with $c_1$ degrees of freedom.

Silvapulle and Silvapulle (1995) proposed a one-sided score test for a general parametric model and showed that the one-sided score test is asymptotically equivalent to the likelihood ratio test. Verbeke and Molenberghs (2003) extended the Silvapulle and Silvapulle (1995) one-sided score test to testing variance components $H_0: \psi_1 = 0$ vs. $H_A: \psi_1 \in C$ for linear mixed model (2) and showed similar asymptotic equivalence between the one-sided score test and the likelihood ratio test. Hall and Praestgaard (2001) derived a one-sided score test for GLMMs. The one-sided score statistic $T_s^*$ is defined as

$$T_s^* = S_{\psi_1}^T H_{\psi_1\psi_1}^{-1} S_{\psi_1} - \inf_{\psi_1 \in C} \left\{ (S_{\psi_1} - \psi_1)^T H_{\psi_1\psi_1}^{-1} (S_{\psi_1} - \psi_1) \right\}. \qquad (14)$$

It is easy to see that $T_s^*$ as defined in (14) has the same asymptotic null distribution as the likelihood ratio test for testing $H_0: \psi_1 = 0$ vs. $H_A: \psi_1 \in C$. Similarly to the case for the likelihood ratio test, it is critical to determine $H_{\psi_1\psi_1}$ and the shape of $C$. $T_s^*$ generally follows a mixture of chi-square distributions, and we usually have to study the distribution of $T_s^*$ case by case.

Both the two-sided test $T_s$ and the one-sided test $T_s^*$ have the correct size under $H_0$. The two-sided test $T_s$ is much easier to calculate, but is subject to some loss of power. Hall and Praestgaard (2001) conducted extensive simulation studies comparing Lin's (1997) two-sided score test and their one-sided score test for GLMMs with two-dimensional random effects and found a power loss similar to the case of a single variance component (Table 4 in Hall and Praestgaard, 2001; the maximum power loss is about 9%).

5 Simulation Study to Compare the Likelihood Ratio Test and the Score Test for Variance Components

We conducted a small simulation study to compare the size and the power of the one-sided and two-sided score tests with the likelihood ratio test. We considered the logistic-normal GLMM (3) by assuming binary responses $y_{ij}$ ($i = 1, 2, \ldots, m = 100$, $j = 1, 2, \ldots, n_i = 5$) were generated from the following logistic-normal GLMM:

$$\mathrm{logit}\,P(y_{ij} = 1 \mid b_i) = \beta + b_i, \qquad (15)$$

where $\beta = 0.25$ and $b_i \sim N(0, \psi)$, with $\psi$ equally spaced in $[0, 1]$ in steps of 0.2. For each value of $\psi$, 500 data sets were generated. The likelihood ratio test described in Sect. 3 and the (corrected) one-sided and two-sided score tests were applied to test $H_0: \psi = 0$. We compare the performance of the regular but conservative LRT, the appropriate LRT, and the one-sided and two-sided score tests for testing $H_0: \psi = 0$. The nominal level of all four tests was set at $\alpha = 0.05$.

Table 1 Size and power comparisons of the likelihood ratio tests and score tests for a single variance component based on 500 simulations under the logistic model (15)

Method                  Size     Power
                        ψ=0      ψ=0.2    ψ=0.4    ψ=0.6    ψ=0.8    ψ=1.0
LRT                     0.034    0.370    0.790    0.922    0.990    1.000
Regular LRT             0.020    0.280    0.672    0.882    0.968    0.992
One-sided score test    0.054    0.416    0.834    0.938    0.996    1.000
Two-sided score test    0.050    0.336    0.736    0.910    0.980    0.998

Table 1 presents the simulation results. The results show that the size of the (correct) likelihood ratio test is a little smaller than the nominal level. This is probably due to the numerical instability caused by difficulties in fitting model (15) when in fact there is no random effect in the model, or to the fact that the sample size (number of clusters $m = 100$) may not be large enough for the asymptotic theory to take effect. As expected, the regular LRT using $\chi^2_1$ is conservative and its size is too small. On the other hand, both one-sided and two-sided score tests have sizes very close to the nominal level. The powers of the likelihood ratio test and the one-sided score test are almost the same, although the one-sided score test is slightly more powerful than the LRT, which may be due to the numerical integration required to fit model (15). The two-sided score test has some loss of power compared to the one-sided score test and the correct LRT. However, the p-value of the two-sided score test is much easier to calculate, especially when testing for multiple variance components.

6 Polynomial Test in Semiparametric Additive Mixed Models

Lin and Zhang (1999) proposed generalized additive mixed models (GAMMs), an extension of GLMMs where each parametric covariate effect in model (1) is replaced by a smooth but arbitrary nonparametric function, and proposed to estimate each function by a smoothing spline. Using a mixed model representation for a smoothing spline, they cast estimation and inference of GAMMs in a unified framework through a working GLMM, where the inverse of a smoothing parameter is treated as a variance component. A special case of GAMMs is the semiparametric additive mixed model considered by Zhang and Lin (2003)

$$g(\mu_{ij}) = f(t_{ij}) + s_{ij}^T\alpha + z_{ij}^T b_i, \qquad (16)$$

where $f(t)$ is an unknown smooth function, i.e., the covariate effect of $t$ is assumed to be nonparametric, $s_{ij}$ is some covariate vector, and $b_i \sim N\{0, D(\psi)\}$. For independent (normal) data with the identity link, model (16) reduces to a partially linear model.

We are interested in developing a score test for testing whether $f(t)$ is a parametric polynomial function versus a smooth nonparametric function. Specifically, we set $H_0$: $f(t)$ is a polynomial function of degree $K - 1$ and $H_1$: $f(t)$ is a smoothing spline.

Following Zhang and Lin (2003), denote by $t^0 = (t_1^0, t_2^0, \ldots, t_r^0)^T$ a vector of ordered distinct $t_{ij}$'s and by $f$ a vector of $f(t)$ evaluated at $t^0$ (without loss of generality, assume 0 [...] 0. It is hence natural to consider using the variance component likelihood ratio test or score test described in the earlier sections to test $H_0: \tau = 0$. However, the data do not have independent cluster structure under the alternative $H_A: \tau > 0$. Therefore, the asymptotic null distribution of the likelihood ratio test statistic for testing $H_0: \tau = 0$ does not follow a 50:50 mixture of $\chi^2_0$ and $\chi^2_1$. In fact, for independent normal data with the identity link, Crainiceanu et al. (2005) showed that, when $f(t)$ is modeled by a penalized spline (similar to a smoothing spline), the LRT statistic asymptotically has approximately 0.95 mass probability at zero. For this special case, Crainiceanu et al. (2005) derived the exact null distribution of the LRT statistic. Their results, however, may not be applicable to testing $H_0: \tau = 0$ under a more general mixed model representation (19). Furthermore, it could be computationally difficult to calculate this LRT statistic by fitting model (19) under the alternative $H_A: \tau > 0$ as it usually requires high-dim
ensionalnumericalintegrations.Duetothespecialstructureofthesmoothingmatrix,thescorestatisticofτevaluatedunderH0:τ=0doesnothaveanormaldistribution.ZhangandLin(2003)showedthatthescorestatisticofτcanusuallybeexpressedasaweightedsumofchi-squaredrandomvariableswithpositivebutrapidlydecayingweights,anditsdistributioncanbeadequatelyapproximatedbythatofascaledchi-squaredrandomvariable.Underthemixedmodelrepresentation(19),themarginallikelihoodfunctionLM(τ,ψ;y)of(τ,ψ)isgivenby⎧⎨mni−m/2τ−r/2expLM(τ,ψ;y)∝|D|ij(β,ψ,bi;yij)⎩i=1j=1$m1T−11T−biDbi−aadadbdβ.22τi=1(20)LetM(τ,ψ;y)=logLM(τ,ψ;y)bethelog-marginallikelihoodfunctionof(τ,ψ).ZhangandLin(2003)showedthatthescoreUτ=∂M(τ,ψ;y)/∂τ|τ=0canbeapproximatedbyU1T−1T−1Tτ≈(Y−Xβ)VNNV(Y−Xβ)−trPNN,(21)2β,ψwhereβistheMLEofβandψtheREMLestimateofψfromthenullGLMM(22),andYistheworkingvectorY=Xβ+Zb+(y−µ)underthenullGLMMg(µ)=Xβ+Zb,(22) VarianceComponentTestinginGeneralizedLinearMixedModels33where=diag{g(µij)},P=V−1−V−1X(XTV−1X)−1XTV−1andV=W−1+ZDZ˜TwithD˜=diag{D,...,D}andWisdefinedsimilarlyasinSect.4exceptµ0isreplacedbyµij.Allthesematricesareevaluatedunderthereducedijmodel(22).WriteUτ=Uτ−˜e,whereUτande˜arethefirstandsecondtermsofUτin(21).ZhangandLin(2003)showedthatthemeanofUτisapproximatelyequaltoe˜underH0:τ=0.SimilartothescoretestderivedinSect.4,themeanofUτincreasesasτincreases.Therefore,wewillrejectH0:τ=0whenUτislarge,implyingaone-sidedtest.ThevarianceofUτunderH0canbeapproximatedbyI˜TI−1Iτψ,(23)ττ=Iττ−Iτψψψwhere%1T21T∂VIττ=trPNN,Iτψ=trPNNP,22∂ψ1∂V∂VIψψ=trPP.(24)2∂ψ∂ψDefineκ=I˜ττ/2e˜andν=2e˜2/I˜ττ.ThenSτ=Uτ/κapproximatelyhasaχν2dis-tribution,andwewillrejectH0:τ=0atthesignificancelevelαifSτ≥χ2.Theα;νsimulationconductedbyZhangandLin(2003)indicatesthatthismodifiedscoretestforpolynomialcovariateeffectinthesemiparametricadditivemixedmodel(16)hasapproximatelytherightsizeandispowerfultodetectalternatives.7ApplicationInthissection,weillustratethelikelihoodratiotestingandthescoretestingforvariancecomponentsinGLMMsdiscussedinSects.3and4,aswellasthescorepolynom
ialcovariateeffecttestinginGAMMsdiscussedinSect.6throughanap-plicationtothedatafromIndonesianchildreninfectiousdiseasestudy(ZegerandKarim,1991).Twohundredandseventy-fiveIndonesianpreschoolchildrenwereexaminedforuptosixquartersforthesignofrespiratoryinfection(0=no,1=yes).Totallythereare2,000observationsinthedataset.Availablecovariatesinclude:ageinyears,Xerophthelmiastatus(signforvitaminAdeficiency),gender,heightforage,andthepresenceofstunningandtheseasonalsineandcosine.TheprimaryinterestofthestudyistoseeifvitaminAdeficiencyhasaneffectontherespiratoryinfectionadjustingforothercovariatesandtakingintoaccountthecorrelationinthedata.ZegerandKarim(1991)usedGibbs’samplingapproachtofitthefollowinglogistic-normalGLMMTβ+blogit(P[yij=1|bi])=xiji,(25) 34D.Zhang,X.Linwhereyijistherespiratoryinfectionindicatorfortheithchildatthejthinterview,xijisthe7×1vectorofthecovariatesdescribedabovewithcorrespondingeffectsβ,bi∼N(0,θ)istherandomeffectmodelingthebetween-childvariation/between-childcorrelation.NostatisticallysignificanteffectofvitaminAdeficiencyonrespi-ratoryinfectionwasfound.Wecanalsoconductalikelihoodinferenceformodel(25)byevaluatingtherequiredintegrationsusingGaussianquadraturetechnique.TheMLEofθisθ=0.58withSE(θ)=0.31,whichindicatesthattheremaybebetween-childvariationintheprobabilityofgettingrespiratoryinfection.AninterestingquestioniswhetherwecanrejectH0:θ=0.Thelikelihoodratiostatisticforthisdatasetis−2lnλm=674.872−669.670=5.2.Theresultingp-value=0.5P[χ2≥5.2]=0.011,1indicatingstrongevidenceagainstH0usingtheLRTprocedure.Alternatively,wemayapplythescoreteststotestH0.The(corrected)scorestatisticforthisdatasetis2.678.Thep-valuefromtheone-sidedscoretestis0.0037,andthetwo-sidedscoretestis0.0074.BoththetestsprovidestrongevidenceagainstH0:θ=0.Motivatedbytheirearlierwork,ZhangandLin(2003)consideredtestingwhetherf(age)inthefollowingsemiparametricadditivemixedmodelcanbeadequatelyrepresentedbyaquadraticfunctionofageTβ+f(age)+blogit(P[yij=1|bi])=sijiji,(26)wheresijaretheremainingcovariate
s.ThescoreteststatisticdescribedinSect.6forK=3isSτ=5.73with1.30degreesoffreedom,indicatingastrongevidenceagainstH0:f(age)isaquadraticfunctionofage(p-value=0.026).Thismayimplythatnonparametricmodelingoff(age)inmodel(26)ispreferred.8DiscussionInthischapter,wehavereviewedthelikelihoodratiotestandthescoretestfortest-ingvariancecomponentsinGLMMs.Thecentralissueisthatthenullhypothesisusuallyplacessomeofthevariancecomponentsontheboundaryofthemodelpa-rameterspace,andthereforethetraditionalnullchi-squareddistributionoftheLRTstatisticnolongerappliesandthep-valuebasedontraditionalLRchi-squaredis-tributionisoftentooconservative.UsingthetheorydevelopedbySelfandLiang(1987),wehavereviewedtheLRTforsomespecialcasesandshowtheLRTgen-erallyfollowsamixtureofchi-squaredistribution.Toderivetherightnulldistri-butionoftheLRTstatistic,oneneedstoknowthe(Fisher)informationmatrixatthetrueparametervalue(underthenullhypothesis)andthetopologicalbehavioroftheneighborhoodofthetrueparametervalue.However,asoursimulationindi-cates,theLRTforthevariancecomponentsinaGLMMmaysufferfromnumericalinstabilitywhenthevariancecomponentissmallandnumericalintegrationishighdimensional. 
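As a numerical illustration of the boundary correction discussed above, the following minimal sketch recomputes the p-values quoted in Sect. 7 from the reported test statistics. It is not the authors' code. The function names are hypothetical, only the Python standard library is used, and the score statistic 2.678 is assumed to be referred to a standard normal distribution, which is consistent with the one-sided and two-sided p-values reported in the text.

```python
import math


def chi2_sf_1df(x):
    """Upper tail P[chi2_1 >= x], using chi2_1 = Z^2 for standard normal Z."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def normal_sf(z):
    """Upper tail P[Z >= z] for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def boundary_lrt_pvalue(lrt_stat):
    """p-value under a 50:50 mixture of chi2_0 (point mass at 0) and chi2_1."""
    if lrt_stat <= 0:
        return 1.0
    # The chi2_0 component contributes nothing to the tail above a positive value.
    return 0.5 * chi2_sf_1df(lrt_stat)


# Likelihood ratio statistic reported for the Indonesian children data.
lrt = 674.872 - 669.670
p_naive = chi2_sf_1df(lrt)           # regular (conservative) chi2_1 reference
p_mixture = boundary_lrt_pvalue(lrt)  # boundary-corrected reference

# Corrected score statistic reported in the text (assumed standard normal).
score = 2.678
p_one_sided = normal_sf(score)
p_two_sided = 2.0 * normal_sf(score)

print(f"naive LRT p = {p_naive:.4f}, mixture LRT p = {p_mixture:.4f}")
print(f"one-sided score p = {p_one_sided:.4f}, two-sided score p = {p_two_sided:.4f}")
```

For a single variance component the corrected p-value is exactly half of the naive one, which is why the regular LRT is conservative in Table 1.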
On the other hand, the score statistic only involves parameter estimates under the null hypothesis and hence can be calculated much more straightforwardly and efficiently. We discussed both the one-sided score test and the much simpler two-sided score test. Both tests have the correct size. The one-sided score test has the same asymptotic distribution as the correct likelihood ratio test. Hence, similar to the LRT, the calculation of the one-sided score test requires knowledge of the information matrix and the topological behavior of the neighborhood of the true parameter value, and also requires computing a mixture of chi-square distributions. The two-sided score test is based on the regular chi-square distribution and has the right size. It is much easier to calculate, especially for testing multiple variance components. The simulation studies presented here and in the statistical literature show that the two-sided score test may suffer from some power loss compared to the (correct) likelihood ratio test and the one-sided score test.

We have also reviewed the likelihood ratio test and the score test for testing whether a nonparametric covariate effect in a GAMM can be adequately modeled by a polynomial of a certain degree compared to a smoothing spline or a penalized spline function. Although the problem can be reduced to testing a variance component equal to zero using the mixed effects representation of the smoothing (penalized) spline, the GLMM results for the likelihood ratio test and the score test for variance components do not apply, because the data under the mixed effects representation of the spline no longer have an independent cluster structure. Since the LRT statistic will be prohibitive to calculate for a GLMM with potentially high dimensional random effects, we have particularly reviewed the score test of Zhang and Lin (2003) for testing the parametric covariate model versus the nonparametric covariate model in the presence of a single nonparametric covariate function. Future research is needed to develop simultaneous tests for multiple covariate effects.

References

Booth, J.G. and Hobert, J.P. (1999). Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. Journal of the Royal Statistical Society, Series B 61, 265–285
Breslow, N.E. and Clayton, D.G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association 88, 9–25
Breslow, N.E. and Lin, X. (1995). Bias correction in generalized linear mixed models with a single component of dispersion. Biometrika 82, 81–91
Crainiceanu, C., Ruppert, D., Claeskens, G. and Wand, M.P. (2005). Exact likelihood ratio tests for penalized splines. Biometrika 92, 91–103
Diggle, P.J., Heagerty, P., Liang, K.Y. and Zeger, S.L. (2002). Analysis of Longitudinal Data. Oxford University Press, Oxford
Hall, D.B. and Praestgaard, J.T. (2001). Order-restricted score tests for homogeneity in generalized linear and nonlinear mixed models. Biometrika 88, 739–751
Hsiao, C.K. (1997). Approximate Bayes factors when a model occurs on the boundary. Journal of the American Statistical Association 92, 656–663
Jacqmin-Gadda, H. and Commenges, D. (1995). Tests of homogeneity for generalized linear models. Journal of the American Statistical Association 90, 1237–1246
Laird, N.M. and Ware, J.H. (1982). Random effects models for longitudinal data. Biometrics 38, 963–974
Lin, X. (1997). Variance component testing in generalized linear models with random effects. Biometrika 84, 309–326
Lin, X. and Breslow, N.E. (1996). Bias correction in generalized linear mixed models with multiple components of dispersion. Journal of the American Statistical Association 91, 1007–1016
Lin, X. and Zhang, D. (1999). Inference in generalized additive mixed models using smoothing splines. Journal of the Royal Statistical Society, Series B 61, 381–400
Self, S.G. and Liang, K.Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association 82, 605–610
Silvapulle, M.J. and Silvapulle, P. (1995). A score test against one-sided alternatives. Journal of the American Statistical Association 90, 342–349
Stram, D.O. and Lee, J.W. (1994). Variance components testing in the longitudinal mixed effects model. Biometrics 50, 1171–1177
Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer, Berlin Heidelberg New York
Verbeke, G. and Molenberghs, G. (2003). The use of score tests for inference on variance components. Biometrics 59, 254–262
Verbeke, G. and Molenberghs, G. (2005). Models for Discrete Longitudinal Data. Springer, Berlin Heidelberg New York
Zeger, S.L. and Karim, M.R. (1991). Generalized linear models with random effects; a Gibbs sampling approach. Journal of the American Statistical Association 86, 79–86
Zhang, D. (1997). Generalized Additive Mixed Model, unpublished Ph.D. dissertation, Department of Biostatistics, University of Michigan
Zhang, D. and Lin, X. (2003). Hypothesis testing in semiparametric additive mixed models. Biostatistics 4, 57–74

Bayesian Model Uncertainty in Mixed Effects Models

Satkartar K. Kinney and David B. Dunson

1 Introduction

1.1 Motivation

Random effects models are widely used in analyzing dependent data, which are collected routinely in a broad variety of application areas. For example, longitudinal studies collect repeated observations for each study subject, while multi-center studies collect data for patients nested within study centers. In such settings, it is natural to suppose that dependence arises due to the impact of important unmeasured predictors that may interact with measured predictors. This viewpoint naturally leads to random effects models in which the regression coefficients vary across the different subjects. In this chapter, we use the term "subject" broadly to refer to the independent experimental units. For example, in longitudinal studies, the subjects are the individuals under study, while in multi-center studies the subjects correspond to the study centers.

In applications of random effects models, one is typically faced with uncertainty in the predictors to be included in the fixed and random effects components of the model. The predictors included in the fixed effects component are correlated with the population-averaged response, while the predictors included in the random effects component have varying coefficients for the different subjects. This variability in the coefficients induces a predictor-dependent correlation structure in the repeated observations upon marginalizing out the random effects. For fixed effects models, there is a rich literature on methods for subset selection and inferences from both frequentist and Bayesian perspectives; however, subset selection for the random effects component
has received limited attention.

S.K. Kinney, National Institute of Statistical Sciences, saki@niss.org
D.B. Dunson, Department of Statistical Science, Duke University, dunson@stat.duke.edu
D.B. Dunson (ed.), Random Effect and Latent Variable Model Selection, DOI: 10.1007/978-0-387-76721-5, © Springer Science+Business Media, LLC 2008

One reason for the limited attention to this problem is the common perspective that the primary focus of inference is the fixed effects component of the model, while the dependence structure is merely a "nuisance." From this viewpoint, a relatively simple model for the random effects component, such as a random intercept model, is thought to be sufficient to account for within-subject dependence. There are a few problems with this paradigm. First, it is seldom the case that scientific interest lies only in the predictor for a hypothetical "typical" subject having average random effect values. In clinical trials, for example, variability among the subjects is also important. If the impact of a drug therapy varies considerably among different individuals, this suggests that efficacy is higher for certain subgroups, a finding with considerable clinical implications. Second, one may obtain invalid inferences on the fixed effect coefficients if the random effects component is misspecified.

The focus of this chapter is on applying Bayesian methods for model uncertainty to the random effects subset selection problem. Our goal is to provide a practically-motivated background on the relevant literature, with a focus on the methodology proposed by Kinney and Dunson (2007). The hope is that this tutorial will motivate increased use of Bayesian methods in this area, while also stimulating new research. We begin with a brief overview of the literature on model selection and inferences on variance components in random effects models.

1.2 Frequentist Literature

One of the difficulties in generalizing methods for subset selection and inferences in linear regression models (see, for example, Mitchell and Beauchamp, 1988) to the random effects setting is that the likelihood cannot, in general, be obtained analytically. This is because the likelihood is specified as an integral of a conditional likelihood given the random effects over the random effects distribution, with this integral typically not available in closed form. Motivated by this problem, there is a rich literature on approximations to the marginal likelihood obtained by integrating out the random effects. Sinharay and Stern (2001) summarize the major approaches, including marginal maximum likelihood, restricted maximum likelihood, and quasi-likelihood. The marginal maximum likelihood approach evaluates the likelihood using quadrature or a Laplace approximation and computes maximum likelihood estimates of model parameters using traditional numeric optimization approaches. This approach tends to underestimate variance parameters; hence, Stiratelli et al. (1984) suggest an approximate E-M algorithm for computing the restricted maximum likelihood estimate of the variance matrix. An alternative from Breslow and Clayton (1993) is the quasi-likelihood method.

After obtaining an accurate approximation to the likelihood, likelihood ratio test statistics can be computed to compare nested random effects models; however, when the nested models differ in the random effects that are included, the typical likelihood ratio test asymptotic theory does not apply since the null hypothesis lies at the boundary of the parameter space. Specifically, the null hypothesis corresponds to
setting one or more of the random effects variances equal to zero, with these parameters restricted to be positive under the alternative. Potentially, to avoid relying on knowledge of the exact or asymptotic distribution of the likelihood ratio test statistic (discussed in the chapter by Ciprian Crainiceanu), one could apply a parametric bootstrap (Sinharay and Stern, 2001). As an alternative to a likelihood ratio test, one could consider a score test, such as that considered by Lin (1997) for testing whether all the variance components in a GLMM are zero (see also Verbeke and Molenberghs, 2003; Hall and Praestgaard, 2001 and the chapter by Daowen Zhang and Xihong Lin).

Even if one can obtain accurate p-values for likelihood ratio tests or score tests comparing nested random effects models, it is not clear how to use such methods to appropriately account for uncertainty in subset selection. Potentially, one can apply a stepwise procedure, but the model selected may be sensitive to the order in which predictors are added and the level of p-value cutoff for inclusion or exclusion. In addition, unless one accounts for uncertainty in the selection process, it is not appropriate to base inferences on the estimates of the coefficients under the final selected model. One can potentially address this concern by selecting the model on a training subset of the data, though it is not clear how to optimally choose the training and test samples.

As an alternative approach, which addresses some of these concerns, Jiang et al. (2008) proposed an innovative "fence" method. The fence method gives a single subset of predictors to include in the random effects component, but does not provide a measure of uncertainty in the subset selection process or allow inferences on whether a given predictor has a random coefficient. In addition, if predictions are of interest, one can obtain more realistic measures of uncertainty and potentially more accurate predictions by allowing for errors in model selection. This is particularly important when there are many predictors, because in such cases any single selected model may not be markedly better than all of the competing models.

1.3 Bayesian Approach

Given the practical difficulties that arise in implementing a frequentist approach to this problem, we focus on Bayesian methods. Advantages of the Bayesian approach include (1) lack of reliance on asymptotic approximations to the marginal likelihood obtained by integrating out the random effects or to the distribution of the likelihood ratio test statistic; (2) ability to fully account for model uncertainty through a probabilistic framework that assigns each model in the list prior and posterior probabilities; and (3) allowance for the incorporation of prior information. Practical disadvantages include difficulty of implementation given the lack of procedures in standard software packages, computational intensity, and sensitivity to prior distributions. These concerns are likely to decrease in the coming years, with new procedures for implementing Bayesian analyses in SAS in a computationally efficient manner and with ongoing research in default priors for model selection in hierarchical models.

We review Bayesian model uncertainty in general in Sect. 2 and in the context of mixed models in Sect. 3. Section 4 describes a Bayesian approach for linear mixed models and discusses prior specification. A modification for binary logistic models is outlined in Sect. 5. Section 6 provides a simulation example and Sect. 7 a data example. Additional extensions are discussed in Sect. 8 and concluding remarks are given in Sect. 9.

2 Bayesian Model Uncertainty

2.1 Subset Selection in Linear Regression

Let us first consider a normal linear model $y = X\beta + \epsilon$, $\epsilon \sim N(0, \sigma^2)$ with no random effects. From the Bayesian perspective, the model parameters are considered random variables with probability distributions. When fitting the model, prior distributions are assigned to each parameter, and posterior distributions are obtained by updating the prior with the information in the likelihood. Unless conjugate priors are used, the posterior distributions are not available in closed analytic form; hence, Markov chain Monte Carlo (MCMC) algorithms are typically used to produce autocorrelated draws from the joint posterior distribution of the parameters. After convergence of the MCMC chain, the draws can be used to estimate posterior summaries. When performing posterior computation for a single model with no model uncertainty, typical posterior summaries include posterior means, standard deviations, and credible intervals.

In the Bayesian paradigm, model uncertainty can be addressed simultaneously with parameter uncertainty by placing priors $p(M_k)$ on each possible model $M_1, \ldots, M_K$ in addition to the priors $p(\beta \mid M_k, \sigma^2)$ and $p(\sigma^2)$ on the model parameters. For example, in normal linear regression analyses, it is common to have uncertainty in the subset of predictors to be included in the regression model. If there are $p$ candidate predictors, then there are $2^p$ possible subsets, with each $M_k$ corresponding to a different subset. In this case, $p(M_k)$ is the prior probability of subset $M_k$, which is commonly chosen to be uniform or based on the size of the model. If one allows predictors to be included independently with 0.5 prior probability, then $p(M_k) = \binom{p}{\#M_k} 0.5^p$, where $\#M_k$ is the number of predictors included in model $M_k$.

For normal linear regression models, one can choose a conjugate normal inverse-gamma prior. For example, g-priors or mixtures of g-priors (Liang et al., 2005) are commonly used. The posterior model probabilities can be calculated using Bayes rule as follows:

$$p(M_k \mid y) = \frac{p(y \mid M_k)\, p(M_k)}{\sum_{l=1}^{K} p(y \mid M_l)\, p(M_l)},$$

where

$$p(y \mid M_k) = \int p(y \mid \beta, \sigma^2, M_k)\, p(\beta \mid \sigma^2, M_k)\, p(\sigma^2)\, d\beta\, d\sigma^2$$

is the marginal likelihood of the data under model $M_k$. This marginal likelihood is available analytically for normal linear regression models when conjugate normal inverse-gamma priors are chosen for $(\beta, \sigma^2)$; however, in generalized linear models and in normal linear models with random effects, the marginal likelihood will not be available analytically. In such cases, it is common to rely on the Laplace approximation, or to use simulation-based approaches to approximate the marginal likelihood and/or posterior model probabilities.

Even in linear regression models in which one can obtain the exact marginal likelihood for any particular model $M_k$, calculation of the exact posterior model probabilities may not be possible when the number of models is very large. For example, in the subset selection problem, the number of models in the list is $2^p$, which grows very rapidly with $p$ so that one cannot calculate the marginal likelihoods for all models in the list even for moderate $p$. This problem has motivated a literature on stochastic search variable selection (SSVS) algorithms, which use MCMC methods to explore the high-dimensional model space in an attempt to rapidly identify high posterior probability models (George and McCulloch, 1993, 1997).

The SSVS algorithm of George and McCulloch (1997) uses a Gibbs sampler (Gelfand and Smith, 1990) to search for models having high posterior probability by embedding all the models in the full model. This is accomplished by choosing a prior for the regression coefficients that is a mixture of a continuous density and a component concentrated at zero. Because predictors having zero or near-zero coefficients effectively drop out of the model, the component concentrated at zero allocates probability to models having one or more predictors excluded. Such component mixture priors are commonly referred to as variable selection or point mass mixture priors. They are very convenient computationally, as they allow one to run a single Gibbs sampler as if doing computation under the full model. By randomly switching from the component concentrated at zero to the more diffuse component, the chain effectively moves between models corresponding to different subsets of the predictors being selected.

After discarding initial burn-in draws, one can estimate the posterior model probabilities using the proportion of MCMC draws spent in each model. In general, not all $2^p$ models will be visited; hence, many or most of the candidate models will be estimated to have zero posterior probability. Although there is no guarantee that the model with the highest posterior probability will be visited, when $p$ is large, SSVS tends to quickly locate good models. Model-averaged estimates may also be obtained for model coefficients by averaging the parameter estimates over all MCMC draws. Marginal inclusion probabilities for each predictor are estimated by the proportion of draws spent in models containing that predictor.

Because posterior probabilities for any specific model tend to be very small in large model spaces, it may be unreliable to base inferences on any one selected model. This problem is not unique to Bayesian model selection, as other model selection criteria (e.g., AIC) may have similar values for many models when the number of candidate models is large. An advantage of the Bayesian approach is that it provides a well calibrated and easily interpretable score for each model. Hence, one can consider a list of the top 10 or 100 models, examining the size of the posterior probabilities allocated to each of these models. Such an exercise provides a much more realistic gauge of uncertainty than approaches that seek to identify a single best model based on some criterion.

Marginal inclusion probabilities provide a measure of the weight of evidence that a particular predictor should be included in the model, adjusting for uncertainty in the other predictors that are included. For example, suppose the first candidate predictor (age) is included with posterior probability of 0.98. Then one has strong evidence that age is an important predictor. However, if the marginal inclusion probability was instead 0.02, one would have evidence that age does not need to be included. Posterior inclusion probabilities that are not close to 0 or 1 are less conclusive.

When the focus is on prediction instead of inferences, one can use Bayesian model averaging, which is performed by weighting model-specific Bayesian predictions by the posterior model probabilities. This approach has advantages over classical methods, which instead rely on a single selected model, ignoring uncertainty in selecting that model. For a detailed review of Bayesian model averaging and selection, refer to Clyde and George (2004).

2.2 Bayes Factors and Default Priors

Bayes factors provide a standard Bayesian weight of evidence in the data in favor of one model over another. The Bayes factor in favor of model $M_1$ over model $M_0$ is defined as the posterior odds of $M_1$ divided by the prior odds of $M_1$

$$BF_{10} = \frac{p(M_1 \mid y)}{p(M_0 \mid y)} \times \frac{p(M_0)}{p(M_1)} = \frac{p(y \mid M_1)}{p(y \mid M_0)},$$

which is simply the ratio of marginal likelihoods under the two different models. Unlike frequentist testing based on p-values, Bayes factors have the advantage of treating the two competing models (say, null and alternative) symmetrically, so that one obtains a measure of support in the data for a model, which is appropriate regardless of whether the models are nested. Hence, we do not obtain a test for whether the large model is "significantly" better, but instead rely on the intrinsic Bayes penalty for model complexity to allow coherent comparisons of non-nested models of different sizes.

A potential drawback (or advantage in certain settings) is that the Bayes factor has a well-known sensitivity to the prior, and improper priors cannot be chosen. This restriction does not hold if one wishes to do inferences under a single model, as long as the posterior is proper. However, in conducting model comparisons, the Bayes factor is only defined up to an arbitrary constant that depends on the variance of the prior. As the prior variance increases, there is an increasing tendency to favor smaller models. Hence, it is important to either choose an informative prior based on subject matter knowledge or to choose a proper default prior, chosen to yield good Bayesian and/or frequentist properties. In subset selection for normal linear regression models, the Zellner–Siow prior (Zellner and Siow, 1980) is a commonly used default, with recent work proposing alternative mixtures of g-priors (Liang et al., 2005).

The popular Bayesian information criterion (BIC) was originally derived by Schwarz (1978) starting with a Laplace approximation to the marginal likelihood, and making some simplifying assumptions, including the use of a unit information prior. For normal linear regression models, the unit information prior corresponds to a special case of the Zellner g-prior in which $g = n$, so that the amount of information in the prior is equivalent to one observation. Model selection via the BIC closely approximates model selection based on Bayes factors under a wide range of problems for a particular type of default prior (Raftery, 1995). However, for hierarchical models and models in which the number of parameters increases with sample size, the BIC is not justified (Pauler et al., 1999; Berger et al., 2003).

3 Bayesian Subset Selection for Mixed Effects Models

In contrast to the rich literature on Bayesian subset selection for fixed effects, there is very little work on selection of random effects. Pauler et al. (1999) compare variance component models using Bayes factors and Sinharay and Stern (2001) consider the problem of comparing two GLMMs using the Bayes factor. Motivated by sensitivity to the choice of prior, Chung and Dey (2002) develop an intrinsic Bayes factor approach for balanced variance component models. Chen and Dunson (2003) developed a stochastic search variable selection (SSVS) (George and McCulloch, 1993; Geweke, 1996) approach for fixed and random effects selection in the linear mixed effects model. Relying on Taylor series approximation to intractable integrals, Cai and Dunson (2006) recently extended this approach to all GLMMs (refer to the chapter by Cai and Dunson for further details).

3.1 Bayes Factor Approximations

The BIC is not appropriate for comparing models with differing numbers of random effects, as the required regularity conditions are not met when the null hypothesis corresponds to a parameter falling on the boundary of the parameter space (Pauler et al., 1999). Several Bayes factor approximations for testing variance components are reviewed in Sinharay and Stern (2001). Most of these involve estimation of $p(y \mid M_1)$ and $p(y \mid M_0)$ to obtain the Bayes factor. A modification to the Laplace approximation which accommodates the boundary case is applied by Pauler et al. (1999). As calculation of $p(y \mid M)$ involves solving an integral that is often not available analytically, one can apply standard approximations such as quadrature and importance sampling.

A practical issue with importance sampling is the selection of the target distribution. Meng and Wong (1996) extend the importance sampling idea and suggest a bridge sampling approach for approximating $p(y \mid M)$. An MCMC algorithm using Gibbs sampling was developed by Chib (1995). A harmonic mean estimator, consistent for simulations though otherwise unstable, is proposed by Newton and Raftery (1994). Lastly, an approach suggested by Green (1995) is described which computes the Bayes factors directly using a reversible-jump MCMC algorithm which can move between models with parameter spaces of differing dimension. This is likely to be computationally intensive, and in Sinharay and Stern (2001) it was the slowest approach, whereas the Laplace approximation was the fastest.

3.2 Stochastic Search Variable Selection

In extending Bayesian model selection procedures for linear models to linear mixed effects models, the two primary considerations are the prior specification and posterior computation. The structure of the random effects covariance matrix needs to be considered, and the model parameterizations and prior structure carefully chosen so that the MCMC algorithm may move between models with both differing fixed effects and random effects. The efficiency of the posterior computation also needs to be considered; algorithms that explore the model space efficiently and quickly locate areas of high posterior probability are needed.

As described in Sect. 2, stochastic search variable selection (SSVS) is a promising approach for Bayesian model uncertainty using Gibbs sampling. The SSVS approach has been applied successfully in a wide variety of regression applications, including challenging gene selection problems. One challenge in developing SSVS approaches for random effects models is the constraint that the random effects covariance matrix $\Omega$ be positive semi-definite. Chen and Dunson (2003) addressed this problem by using a modified Cholesky decomposition of $\Omega$

$$\Omega = \Lambda \Gamma \Gamma' \Lambda, \tag{1}$$

where $\Lambda$ is a positive diagonal matrix with diagonal elements $\lambda = (\lambda_1, \ldots, \lambda_q)'$ proportional to the random effects standard deviations, so that setting $\lambda_l = 0$ is equivalent to dropping the $l$th random effect from the model, and $\Gamma$ is a lower triangular matrix with diagonal elements equal to 1 and free elements that describe the random effects correlations. In the case of independent random effects, $\Gamma$ is simply the identity matrix $I$ and the diagonal elements $\lambda_l$, $l = 1, \ldots, q$, of $\Lambda$ equal the random effects standard deviations.

In the next section, we revisit the SSVS approach of Chen and Dunson (2003) for linear mixed models, with additional consideration given to the prior structure and posterior computation. We will then discuss an extension to logistic models.
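The effect of the decomposition in (1) can be checked numerically. The sketch below is illustrative only and is not taken from Chen and Dunson (2003). The helper names (`matmul`, `transpose`, `random_effects_cov`) and the numerical values are hypothetical, and the free elements of the lower triangular factor are assumed to be filled row by row. It builds the covariance matrix from the diagonal scales and the free elements, and shows that setting one scale to zero removes the corresponding random effect.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def random_effects_cov(lam, gamma_free):
    """Return Omega = Lambda Gamma Gamma' Lambda.

    lam        -- q diagonal elements of Lambda (nonnegative scales)
    gamma_free -- q(q-1)/2 free elements of the unit lower triangular Gamma,
                  filled row by row (an assumed ordering)
    """
    q = len(lam)
    lam_mat = [[lam[i] if i == j else 0.0 for j in range(q)] for i in range(q)]
    gam_mat = [[1.0 if i == j else 0.0 for j in range(q)] for i in range(q)]
    free = iter(gamma_free)
    for i in range(1, q):
        for j in range(i):
            gam_mat[i][j] = next(free)
    # Lambda is diagonal, so (Lambda Gamma)(Lambda Gamma)' = Lambda Gamma Gamma' Lambda.
    factor = matmul(lam_mat, gam_mat)
    return matmul(factor, transpose(factor))


# Three candidate random effects; the second one is switched off via lambda_2 = 0.
omega = random_effects_cov([1.0, 0.0, 2.0], [0.5, -0.3, 0.8])
for row in omega:
    print([round(v, 4) for v in row])

# With Gamma equal to the identity, Omega is diagonal with entries lambda_l squared.
omega_indep = random_effects_cov([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```

Because the covariance is formed as a matrix times its own transpose, it is positive semi-definite for any values of the scales and free elements, and a zero scale zeroes out the whole row and column of that random effect. This is what lets a sampler move between random effects subsets without violating the constraint.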
4 Linear Mixed Models

If we have $n$ subjects under study, each with $n_i$ observations, $i = 1, \ldots, n$, let $y_{ij}$ denote the $j$th response for subject $i$, $X_{ij}$ a $p \times 1$ vector of predictors, and $Z_{ij}$ a $q \times 1$ vector of predictors. Then the linear mixed effects (LME) model is denoted as

$$y_{ij} = X_{ij}' \beta + Z_{ij}' a_i + \epsilon_{ij}, \quad \epsilon_{ij} \sim N(0, \sigma^2), \tag{2}$$

where $a_i \sim N(0, \Omega)$. Here $\beta = (\beta_1, \ldots, \beta_p)'$ are the fixed effects and $a_i = (a_{i1}, \ldots, a_{iq})'$ are the random effects. In practice $Z_{ij}$ is typically chosen to be a subset of the predictors in $X_{ij}$ believed to have random effects, often only the intercept for simplicity. If we let $X_{ij}$ and $Z_{ij}$ include all candidate predictors, then the problem of interest is to locate a subset of these predictors to be included in the model. With the help of the covariance decomposition in (1) we can use SSVS, and write (2) as

$$y_{ij} = X_{ij}' \beta + Z_{ij}' \Lambda \Gamma b_i + \epsilon_{ij}, \quad \epsilon_{ij} \sim N(0, \sigma^2), \tag{3}$$

where $b_i \sim N(0, I)$. Chen and Dunson (2003) show that by rearranging terms, the diagonal elements, $\lambda_l$, $l = 1, \ldots, q$, of $\Lambda$ can be expressed as linear regression coefficients, conditional on $\Gamma$ and $b_i$. Similarly, the free elements $\gamma_k$, $k = 1, \ldots, q(q-1)/2$, of $\Gamma$ can be expressed as linear regression coefficients, conditional on $\Lambda$ and $b_i$. Hence the variance parameters $\lambda$ and $\gamma$ have desirable conditional conjugacy properties for constructing a Gibbs sampling algorithm for sampling the posterior distribution, and we are able to use the SSVS approach.

4.1 Priors

Prior selection is a key step in any Bayesian analysis; however, in this context it is particularly important as problems can arise when default priors are applied without caution. In particular, flat or excessively diffuse priors are not recommended for hierarchical models given the potential for an improper posterior and the difficulty of verifying propriety due to the intractable nature of the density, even when the output from a Gibbs chain seems reasonable (Hobert and Casella, 1996). Proper distributions are also desired for Bayes factors to be well-defined (Pauler et al., 1999). The arbitrary multiplicative constants from improper priors carry over to the marginal likelihood $p(y \mid M)$, resulting in indeterminate model probabilities and Bayes factors (Berger and Pericchi, 2001).

A mixture of a point mass at zero and a normal or heavier-tailed distribution is a common choice of prior for fixed effects coefficients, $\beta_l$, $l = 1, \ldots, p$, in Bayesian model selection problems. Smith and Kohn (1996) introduce a vector $J$ of indicator variables, where $J_l = 1$ indicates that the $l$th variable is in the model, $l = 1, \ldots, p$, and assign a Zellner g-prior (Zellner and Siow, 1980) to $\beta_J$, the vector of coefficients in the current model. As a notational convention, we let $\beta$ denote the $p \times 1$ vector $(\{\beta_l : J_l = 1\} = \beta_J, \{\beta_l : J_l = 0\} = 0)$. Hence, conditional on the model index $J$, the prior for $\beta$ is induced through the prior for $\beta_J$.

Consistency issues can arise when comparing models based on these priors; however, for linear models, placing a conjugate gamma prior on $g$ induces a $t$ prior on the coefficients. In the special case where the $t$ distribution has degrees of freedom equal to 1, the Cauchy distribution is induced, which has been recommended for Bayesian robustness (Clyde and George, 2004). This can be considered a special case of mixtures of g-priors, proposed by Liang et al. (2005) as an attractive computational solution to the consistency and robustness issues with g-priors, and an alternative to the Cauchy prior, which does not yield a closed-form expression for the marginal likelihood. As choosing $g$ can affect model selection, with large values concentrating the prior on small models with a few large coefficients and small values of $g$ concentrating the prior on saturated models with small coefficients, several approaches for handling $g$ have been proposed (Liang et al., 2005). Recommendations include the unit information prior (Kass and Wasserman, 1995), which in the normal regression case corresponds to choosing $g = n$, leading to Bayes factors that behave like the BIC, and the hyper-g prior of Liang et al. (2005). Foster and George (1994) recommend calibrating the prior based on the risk inflation criterion (RIC) and Fernandez et al. (2001) recommend a combination of the unit information prior and RIC approach. Another alternative is a local empirical Bayes approach, which can be viewed as estimating a separate $g$ for each model, or global empirical Bayes, which assumes a common $g$ but borrows strength from all models (Liang et al., 2005).

For standard deviation parameters in hierarchical models, Gelman (2005) recommends a family of folded-t prior distributions over the commonly used inverse gamma family, due to their flexibility and behavior when random effects are very sm
all.Thesepriorsareinducedusingaparameter-expansionapproachwhichhastheaddedbenefitofimprovingcomputationalefficiencybyreducingdependenceamongtheparameters(Liuetal.,1998;LiuandWu,1999).ThisyieldsaGibbssamplerlesspronetoslowmixingwhenthestandarddeviationsarenearzero.TheChenandDunson(2003)approachhadthedisadvantagesof(1)relyingonsub-jectivepriorsthataredifficulttoelicit,and(2)computationalinefficiencyduetoslowmixingoftheGibbssampler;henceweusetheparameter-expandedmodeltoaddressthesetwoproblems.ExtendingtheparameterexpansionapproachproposedbyGelman(2005)forsimplevariancecomponentmodelstotheLMEmodel,wereplace(3)withβ+ZAξ+2yij=Xijijiij,ij∼N(0,σ),(4)whereξi∼N(0,D)andA=diag(α1,...,αq)andD=diag(d1,...,dq)arediagonalmatrices,α1Nl∼N(0,1),l=1,...,q,anddl∼IG(,),l=1,...,q,22IGdenotingtheinversegammadistribution.Notethatthelatentrandomeffectshavebeenmultipliedbyaredundantmultiplicativeparameter.Inthiscase,theim-pliedcovariancedecompositionis=ADA. BayesianModelUncertaintyinMixedEffectsModels47Theparametersαl,l=1,...,q,areproportionaltoλlandthustotherandomeffectsstandarddeviations,sosettingαl=0effectivelydropsouttherandomef-fectsforthelthpredictor.Whenrandomeffectsareassumedtobeuncorrelated,i.e.,=Iandλl,l=1,...,qequaltherandomeffectsstandarddeviations,a√foldedtprioronλl=|αl|dl,l=1,...,qisinduced,asdescribedinGelman(2005).Generalizingtothecaseofcorrelatedrandomeffects,afolded-tpriorisnotinduced;however,improvedcomputationalefficiencyisstillachieved,asillustratedinSect.6.Inourproposedpriorstructure,weuseaZellner-typepriorforthefixedeffectscomponents.Specifically,weletβ0,σ2(XJXJ)−1/g,g∼G(1,N),J∼N22σ2∝1,andJ2l∼Be(p0),l=1,...,p,withBedenotingtheBernoulliσdistributionandG(a,b)denotingtheGammadistributionwithmeana/bandvariancea/b2.Wegiveαl,l=1,...,q,azero-inflatedhalf-normalprior,ZI−N+(0,1,pl0),wherepl0isthepriorprobabilitythatαl=0.Lastly,thefreeelementsofaretreatedasaq(q−1)/2-vectorwithpriorp(γ|α)=N(γ0,Vγ)·1(γ∈Rα)whereRαconstrainselementsofγtobezerowhenthecorrespond-ingrandomeffectsarezero
.Forsimplicity,wedonotallowuncertaintyinwhichrandomeffectsarecorrelated.4.2PosteriorComputationThejointposteriordistributionforθ=(α,β,γ,σ2)isgivenbynniN(y22p(θ|y)∝Np(ξi;0,D)ij;Xijβ+ZijAξi,σ)p(σ)p(β,J,g)p(α,γ)p(D).i=1j=1(5)Thisdistributionhasacomplexform,whichwecannotsamplefromdirectly;instead,weemployaparameter-expandedGibbssampler(Liuetal.,1998;LiuandWu,1999).TheGibbssamplerproceedsbyiterativelysamplingfromthefullcondi-tionaldistributionsofallparametersα,γ,β,σ2,hyperparametersgandJ,andthediagonalelementsdl,l=1,...,qofD.Thefullconditionalposteriordistributionsfollowfrom(5)usingstraightforwardalgebraicroutes.LetψbetheN-vectorsuchthatψij=yij−ZAξi.Thevectoroffixedeffectsijcoefficients,β,andeffectively,X,changedimensionfromiterationtoiteration,de-pendingonthevalueofJ,socareneedstobetakentoensurethatthedimensionsJdenotethesubvectorofXareconsistent.LetXijij,{Xijl:Jl=1},βJthesubvec-torβ,{βl:Jl=1}.Thefullconditionalposteriorp(βJ|J,α,γ,σ2,ξ,y,X,Z)isN(βˆJ,VJ)where⎛⎞⎛⎞−1nniVJnni1βˆ=⎝(yij−ZAξ)XJ⎠·andVJ=⎝XJJ+g⎠.Jijiijσ2ijXijσ2i=1j=1i=1j=1 
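The fixed effects update above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses NumPy, stacks all $N$ observations into arrays, and takes the residual $\psi_{ij} = y_{ij} - Z_{ij}'A\Gamma\xi_i$ as a precomputed input. The function names `beta_full_conditional` and `draw_beta` and the argument `resid` are hypothetical and do not come from the chapter.

```python
import numpy as np


def beta_full_conditional(resid, X, J, g, sigma2):
    """Return (beta_hat_J, V_J) for the predictors with J_l = 1.

    resid  : length-N array holding psi_ij = y_ij - Z_ij' A Gamma xi_i
    X      : N x p design matrix with all candidate predictors
    J      : length-p boolean inclusion indicators
    g      : g-prior hyperparameter
    sigma2 : residual variance
    """
    XJ = X[:, np.asarray(J, dtype=bool)]
    # V_J = ((1 + g) / sigma^2 * X_J' X_J)^{-1}
    V = np.linalg.inv((1.0 + g) / sigma2 * (XJ.T @ XJ))
    # beta_hat_J = V_J X_J' psi / sigma^2
    beta_hat = V @ (XJ.T @ resid) / sigma2
    return beta_hat, V


def draw_beta(resid, X, J, g, sigma2, rng):
    """Draw the full p-vector beta; excluded coefficients stay exactly zero."""
    J = np.asarray(J, dtype=bool)
    beta = np.zeros(X.shape[1])
    if J.any():
        mean, cov = beta_full_conditional(resid, X, J, g, sigma2)
        beta[J] = rng.multivariate_normal(mean, cov)
    return beta
```

Keeping `beta` at full length $p$ with zeros for excluded predictors is one simple way to keep dimensions consistent across iterations as $J$ changes; other bookkeeping schemes would work equally well.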
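To calculate the posterior for $J$, each $J_l$ needs to be updated individually, conditional on $J_{-l}$, the subvector of $J$ excluding $J_l$. We calculate $p(J_l = 1 \mid J_{-l}, \alpha, \gamma, \xi, y, X, Z)$ for $l = 1, \ldots, p$ by integrating out $\beta$ and $\sigma^2$ as in Smith and Kohn (1996), obtaining

\[ p(J_l = 1 \mid J_{-l}, \alpha, \gamma, \xi, y, X, Z) = \frac{1}{1 + h_l}, \]

where $J_{-l} = \{J_i : i \neq l\}$,

\[ h_l = \frac{1 - p_{l0}}{p_{l0}} \cdot \left(1 + \frac{1}{g}\right)^{1/2} \cdot \frac{S(J_l = 0)}{S(J_l = 1)} \quad\text{and}\quad S(J) = \left( \psi'\psi - \hat\beta_J' V_J^{-1} \hat\beta_J \right)^{-N/2}. \]

$S(J_l = 0)$ is equivalent to $S(J)$ but with the element $J_l$ of $J$ set to 0, so $\psi$, $X^J$, $\hat\beta_J$ and $V_J$ may need to be recomputed to correspond to $J_l = 0$. Similarly for $S(J_l = 1)$.

To complete the fixed effects component updating, the posteriors of $g$ and $\sigma^2$ are needed. The gamma and inverse gamma priors used yield conjugate gamma and inverse gamma posteriors. The full conditional posterior for $g$ is given by

\[ G\left( \frac{p_J + 1}{2},\; \frac{\beta_J' X^{J\prime} X^J \beta_J/\sigma^2 + N}{2} \right), \]

where $p_J = \sum_{l=1}^{p} 1(J_l = 1)$. The full conditional posterior for $\sigma^2$ is given by

\[ IG\left( \frac{N + p_J}{2},\; \frac{\psi'\psi + g\,\beta_J' X^{J\prime} X^J \beta_J}{2} \right). \]

For the random effects component, the dimensionality does not change between iterations. For $\gamma$ and $\xi_i$, the normal priors yield conjugate normal posteriors, while the zero-inflated half-normal prior for each $\alpha_l$ yields a zero-inflated half-normal posterior.

Let $\psi$ be the $N$-vector such that $\psi_{ij} = y_{ij} - X_{ij}^{J\prime}\beta_J - Z_{ij}'A\Gamma\xi_i$. The full conditional posterior $p(\gamma \mid \alpha, \beta, \lambda, \xi, \sigma^2, y, X, Z)$ is given by $N(\hat\gamma, \hat V_\gamma) \cdot 1(\gamma \in R_\lambda)$, where

\[ \hat V_\gamma = \left( \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{u_{ij} u_{ij}'}{\sigma^2} + V_\gamma^{-1} \right)^{-1} \quad\text{and}\quad \hat\gamma = \hat V_\gamma \left( \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{(y_{ij} - X_{ij}^{J\prime}\beta_J)\, u_{ij}}{\sigma^2} + V_\gamma^{-1}\gamma_0 \right). \]

The $q(q-1)/2$ vector $u_{ij}$ is defined as $(\xi_{il}\alpha_m Z_{ijm} : l = 1, \ldots, q,\ m = l+1, \ldots, q)'$ so that the random effects term $Z_{ij}'A\Gamma\xi_i$ can be written as $u_{ij}'\gamma$.

Each $\alpha_l$ must be updated individually. The zero-inflated truncated normal prior for $\alpha_l$ yields a conjugate posterior $p(\alpha_l \mid \alpha_{-l}, \beta, \gamma, \xi, \sigma^2, y, X, Z) = ZI\text{-}N^+(\hat\alpha, V_{\alpha_l}, \hat p_l)$, where

\[ \hat\alpha = V_{\alpha_l} \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{t_{ijl} T_{ij}}{\sigma^2}, \qquad V_{\alpha_l} = \left( \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{t_{ijl}^2}{\sigma^2} + 1 \right)^{-1}, \]

\[ \hat p_l = \frac{p_{l0}}{p_{l0} + (1 - p_{l0}) \cdot \dfrac{N(0; 0, 1)}{N(0; \hat\alpha, V_{\alpha_l})} \cdot \dfrac{1 - \Phi(0; \hat\alpha, V_{\alpha_l})}{1 - \Phi(0; 0, 1)}}, \]

where $T_{ij} = y_{ij} - X_{ij}^{J\prime}\beta_J - \sum_{k \neq l} t_{ijk}\alpha_k$, $N(0; m, v)$ denotes the normal density with mean $m$ and variance $v$ evaluated at 0, and $\Phi(0; m, v)$ is the normal cumulative distribution function with mean $m$ and variance $v$ evaluated at 0. The $q$ vector

\[ t_{ij} = \left( Z_{ijl}\Big( \xi_{il} + \sum_{m=1}^{l-1} \xi_{im}\gamma_{ml} \Big) : l = 1, \ldots, q \right)' \]

is defined so that the random effects term $Z_{ij}'A\Gamma\xi_i$ can be written as $t_{ij}'\alpha$.

The latent variables $\xi_i$, $i = 1, \ldots, n$, have posterior $p(\xi_i \mid \beta, \alpha, \gamma, \sigma^2, y, X, Z)$ given by $N(\hat\xi_i, V_\xi)$, where

\[ \hat\xi_i = V_\xi\, \sigma^{-2} \sum_{j=1}^{n_i} \Gamma' A Z_{ij}\,(y_{ij} - X_{ij}^{J\prime}\beta_J) \quad\text{and}\quad V_\xi = \left( \sum_{j=1}^{n_i} \Gamma' A Z_{ij} Z_{ij}' A \Gamma\, \sigma^{-2} + D^{-1} \right)^{-1}. \]

Only the components of $\xi_i$ corresponding to $\alpha_l > 0$ are updated. Lastly, the diagonal elements of $D$ have inverse gamma priors $IG(\tfrac{1}{2}, \tfrac{N}{2})$; hence the posterior is given by

\[ p(d_l \mid \alpha, \beta, \gamma, \xi, \sigma^2, y) = IG\left( \frac{1}{2} + \frac{n}{2},\; \frac{N}{2} + \frac{\sum_{i=1}^{n} \xi_{il}^2}{2} \right). \]

The initial MCMC draws, prior to the convergence of the chain, are discarded, and the remaining draws are used to obtain posterior summaries of model parameters. Models with high posterior probability can be identified as those appearing most often in the output and considered for further evaluation. Marginal inclusion probabilities for a given coefficient may also be calculated using the proportion of draws in which the coefficient is nonzero.

5 Binary Logistic Mixed Models

Logistic mixed models are widely used, flexible models for unbalanced repeated measures data. Our approach for logistic mixed models is to formulate the model in such a way that its coefficients are conditionally linear and the SSVS approach can again be applied. This entails the use of a data augmentation strategy and approximation of the logistic density, with approximation error corrected for using importance weights. The covariance decomposition in (1) and the parameter expansion approach described in Sect. 4.1 are again used.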
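To make the role of the covariance decomposition concrete, the sketch below builds the random effects covariance from its factors. It assumes the Chen and Dunson (2003) form of decomposition (1), $\Omega = \Lambda\Gamma\Gamma'\Lambda$, with $\Lambda$ diagonal and $\Gamma$ lower triangular with ones on the diagonal, and the parameter-expanded form $\Omega = A\Gamma D\Gamma'A$. The function names and the row-wise ordering of the free elements of $\Gamma$ are illustrative assumptions, not the chapter's notation.

```python
import numpy as np


def build_gamma(gamma_free, q):
    """Lower-triangular Gamma with unit diagonal; free elements filled row by row."""
    Gamma = np.eye(q)
    Gamma[np.tril_indices(q, k=-1)] = gamma_free
    return Gamma


def omega_from_lambda(lam, gamma_free):
    """Omega = Lambda Gamma Gamma' Lambda (assumed form of decomposition (1))."""
    lam = np.asarray(lam, dtype=float)
    Gamma = build_gamma(gamma_free, lam.size)
    Lam = np.diag(lam)
    return Lam @ Gamma @ Gamma.T @ Lam


def omega_expanded(alpha, d, gamma_free):
    """Omega = A Gamma D Gamma' A for the parameter-expanded model."""
    alpha = np.asarray(alpha, dtype=float)
    Gamma = build_gamma(gamma_free, alpha.size)
    A = np.diag(alpha)
    return A @ Gamma @ np.diag(d) @ Gamma.T @ A
```

With `lam = [1.0, 0.0, 2.0]`, the second row and column of the resulting matrix are identically zero, which is the sense in which setting a diagonal element to zero removes the corresponding random effect from the model.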
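Defining terms as in (3), the logistic mixed model for a binary response variable $y$ is written as

\[ \mathrm{logit}\, P(y_{ij} = 1 \mid X_{ij}, Z_{ij}, \beta, a_i) = X_{ij}'\beta + Z_{ij}'a_i, \qquad a_i \sim N(0, \Omega). \tag{6} \]

We would like to be able to apply the SSVS approach as in the normal case. If we apply the covariance decomposition in (1) to the logistic mixed model, we have

\[ \mathrm{logit}\, P(y_{ij} = 1 \mid X_{ij}, Z_{ij}, \beta, \lambda, \gamma, b_i) = X_{ij}'\beta + Z_{ij}'\Lambda\Gamma b_i, \qquad b_i \sim N(0, I). \tag{7} \]

In this case, the model is nonlinear and we do not immediately have conditional linearity for the variance parameters $\lambda$ and $\gamma$ as in the normal case. To obtain conditional linearity for the model coefficients, we take advantage of the fact that the logistic distribution can be closely approximated by the $t$ distribution (Albert and Chib, 1993; Holmes and Knorr-Held, 2003; O'Brien and Dunson, 2004), and that the $t$ distribution can be expressed as a scale mixture of normals (West, 1987). First, note that (7) is equivalent to the specification

\[ y_{ij} = \begin{cases} 1 & w_{ij} > 0, \\ 0 & w_{ij} \le 0, \end{cases} \]

where $w_{ij}$ is a logistically distributed random variable with location parameter $X_{ij}'\beta + Z_{ij}'\Lambda\Gamma b_i$ and density function

\[ L(w_{ij} \mid X_{ij}, Z_{ij}, \beta, \lambda, \gamma) = \frac{\exp\{-(w_{ij} - X_{ij}'\beta - Z_{ij}'\Lambda\Gamma b_i)\}}{\{1 + \exp[-(w_{ij} - X_{ij}'\beta - Z_{ij}'\Lambda\Gamma b_i)]\}^2}. \]

Then, as $w_{ij}$ is approximately distributed as a noncentral $t_\nu$ with location parameter $X_{ij}'\beta + Z_{ij}'\Lambda\Gamma b_i$ and scale parameter $\tilde\sigma^2$, we can express it as a scale mixture of normals and write

\[ w_{ij} = X_{ij}'\beta + Z_{ij}'\Lambda\Gamma b_i + \epsilon_{ij}, \qquad \epsilon_{ij} \sim N(0, \tilde\sigma^2/\phi_{ij}), \tag{8} \]

where $\phi_{ij} \sim G(\tfrac{\nu}{2}, \tfrac{\nu}{2})$. Setting $\nu = 7.3$ and $\tilde\sigma^2 = \pi^2(\nu - 2)/(3\nu)$ makes the approximation nearly exact. The approximation error, though negligible except in the extreme tails, may be corrected for by importance weighting when making inferences. Under this model formulation, we have a model in which all coefficients are conditionally normal, and we are able to apply SSVS to the problem. We also are able to take advantage of the improved computational efficiency of a parameter-expanded model as in (4). Applying the parameter expansion to (8) we have

\[ w_{ij} = X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i + \epsilon_{ij}, \qquad \epsilon_{ij} \sim N(0, \tilde\sigma^2/\phi_{ij}), \]

where terms are defined as in (4) and (8). We will use this model formulation to propose a prior structure and compute posterior distributions.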
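The quality of the $t$ approximation can be checked numerically. The following sketch, using only the Python standard library, evaluates the logistic density and the scaled $t_\nu$ density with $\nu = 7.3$ and $\tilde\sigma^2 = \pi^2(\nu - 2)/(3\nu)$ on a grid. The function names and the grid limits are arbitrary illustrative choices, not part of the chapter.

```python
import math

NU = 7.3
SIGMA2_TILDE = math.pi ** 2 * (NU - 2.0) / (3.0 * NU)


def logistic_pdf(w, loc=0.0):
    """Standard logistic density shifted by loc (symmetric, so |.| is safe)."""
    z = math.exp(-abs(w - loc))
    return z / (1.0 + z) ** 2


def scaled_t_pdf(w, loc=0.0, nu=NU, scale2=SIGMA2_TILDE):
    """Density of loc + sqrt(scale2) * T, where T is Student t with nu d.f."""
    s = math.sqrt(scale2)
    x = (w - loc) / s
    log_c = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1.0) / 2.0 * math.log1p(x * x / nu)) / s


def max_density_gap(lo=-10.0, hi=10.0, steps=2001):
    """Largest absolute difference between the two densities on a grid."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(abs(logistic_pdf(w) - scaled_t_pdf(w)) for w in grid)
```

With this choice of $\tilde\sigma^2$ the variance of the scaled $t$, $\tilde\sigma^2 \nu/(\nu - 2)$, equals the logistic variance $\pi^2/3$ exactly, which is one way to see why the two densities track each other closely away from the extreme tails.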
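5.1 Priors and Posterior Computation

We use the same priors for the random effects parameters as in the normal case, and similar priors for the fixed effects parameters. We specify $\beta_J \sim N\big(0, (X^{J\prime} X^J)^{-1}/g\big)$, $g \sim G(\tfrac{1}{2}, \tfrac{N}{2})$, and $J_l \sim Be(p_0)$, $l = 1, \ldots, p$. Using the $t$-distribution to approximate the likelihood as previously described, the joint posterior distribution for $\theta = (\alpha, \beta, \gamma, \phi)$ is given by

\[ p(\theta \mid y) \propto p(\beta, J, g)\, p(\gamma, \alpha)\, p(D) \prod_{i=1}^{n} N_q(\xi_i; 0, D) \prod_{j=1}^{n_i} N\Big(w_{ij}; X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i, \frac{\tilde\sigma^2}{\phi_{ij}}\Big) \times \{1(w_{ij} > 0)\, y_{ij} + 1(w_{ij} \le 0)(1 - y_{ij})\}\, p(\phi_{ij}). \tag{9} \]

Again we have a complex posterior from which we cannot directly sample, and we employ a Gibbs sampler. In introducing a latent variable $w_{ij}$ we have applied a data augmentation strategy related to Albert and Chib (1993) and used for multivariate logistic models by O'Brien and Dunson (2004). This auxiliary variable is updated in the Gibbs sampler, and its full conditional posterior follows immediately from (9) as a normal distribution truncated above or below by 0 depending on $y_{ij}$:

\[ p(w_{ij} \mid \theta, y_{ij}) = \frac{N\Big(w_{ij}; X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i, \dfrac{\tilde\sigma^2}{\phi_{ij}}\Big) \cdot 1\big((-1)^{y_{ij}} w_{ij} < 0\big)}{\Phi\Big(0; X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i, \dfrac{\tilde\sigma^2}{\phi_{ij}}\Big)^{1 - y_{ij}} \Big\{1 - \Phi\Big(0; X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i, \dfrac{\tilde\sigma^2}{\phi_{ij}}\Big)\Big\}^{y_{ij}}}, \tag{10} \]

where $\Phi(\cdot)$ indicates the normal cumulative distribution function.

The Gibbs sampler proceeds by iteratively sampling from the full conditional distributions of $w$ and all parameters $\alpha, \gamma, \beta, \phi$, hyperparameters $g$ and $J$, as well as the latent variables $\xi_i$, $i = 1, \ldots, n$, and the diagonal elements $d_l$, $l = 1, \ldots, q$, of $D$. The remaining full conditional posterior distributions follow from (9) and are similar in form to the normal case. Some differences are that $\sigma^2$ is fixed and the Gibbs sampler additionally updates $w$ and $\phi$.

Let $\psi$ be the $N$-vector such that $\psi_{ij} = w_{ij} - Z_{ij}'A\Gamma\xi_i$. As in the normal case, the vector of fixed effects coefficients, $\beta$, and effectively, $X$, change dimension from iteration to iteration, depending on the value of $J$, so care needs to be taken to ensure that the dimensions are consistent. Let $X_{ij}^J$ denote the subvector of $X_{ij}$, $\{X_{ijl} : J_l = 1\}$, and $\beta_J$ the subvector of $\beta$, $\{\beta_l : J_l = 1\}$. The full conditional posterior $p(\beta_J \mid J, \alpha, \gamma, \phi, \xi, y, X, Z)$ is $N(\hat\beta_J, V_J)$, where

\[ \hat\beta_J = V_J \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{\phi_{ij}}{\tilde\sigma^2}\, \psi_{ij} X_{ij}^J \quad\text{and}\quad V_J = \left( \sum_{i=1}^{n}\sum_{j=1}^{n_i} X_{ij}^J X_{ij}^{J\prime} \Big( \frac{\phi_{ij}}{\tilde\sigma^2} + g \Big) \right)^{-1}. \]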
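The latent-variable update in (10) amounts to drawing from a normal distribution truncated at zero. A minimal sketch using inverse-CDF sampling and only the Python standard library is given below. The function name `draw_latent_w` and its arguments are hypothetical, and the clamping constant is an arbitrary numerical safeguard rather than anything prescribed by the chapter.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def draw_latent_w(y, mean, phi, sigma2_tilde, rng):
    """Draw w ~ N(mean, sigma2_tilde / phi), truncated to w > 0 if y == 1
    and to w <= 0 if y == 0, by inverting the normal CDF.

    mean stands for X_ij' beta + Z_ij' A Gamma xi_i; rng is a random.Random.
    """
    sd = math.sqrt(sigma2_tilde / phi)
    p0 = _STD_NORMAL.cdf((0.0 - mean) / sd)  # mass of the untruncated normal on w <= 0
    u = rng.random()
    # Map u into the CDF range of the allowed half-line.
    p = p0 + (1.0 - p0) * u if y == 1 else p0 * u
    # inv_cdf needs 0 < p < 1. Caveat: when the allowed region has probability
    # below about 1e-12, this clamp can push the draw to the wrong side of zero,
    # so a tail-specific sampler would be preferable in that regime.
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return mean + sd * _STD_NORMAL.inv_cdf(p)
```

In a full sampler this draw would be repeated for every $(i, j)$ at each iteration, with `mean` recomputed from the current values of $\beta$, $\alpha$, $\gamma$ and $\xi_i$.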
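To calculate the posterior for $J$, each $J_l$ needs to be updated individually. We calculate $p(J_l = 1 \mid J_{-l}, \alpha, \gamma, \phi, \xi, y, X, Z)$ for $l = 1, \ldots, p$ by integrating out $\beta$ as in Smith and Kohn (1996), obtaining

\[ p(J_l = 1 \mid J_{-l}, \alpha, \gamma, \phi, \xi, y, X, Z) = \frac{1}{1 + h_l}, \]

where $J_{-l} = \{J_i : i \neq l\}$,

\[ h_l = \frac{1 - p_{l0}}{p_{l0}} \cdot \left(\frac{1}{g}\right)^{1/2} \cdot \frac{S(J_l = 0)}{S(J_l = 1)} \]

and

\[ S(J) = |X^{J\prime} X^J|^{1/2} \cdot |V_J|^{1/2} \exp\left\{ -\frac{1}{2} \left( \sum_{i=1}^{n}\sum_{j=1}^{n_i} \phi_{ij}\psi_{ij}^2 - \hat\beta_J' V_J^{-1} \hat\beta_J \right) \right\}. \]

$S(J_l = 0)$ is equivalent to $S(J)$ but with the element $J_l$ of $J$ set to 0, so $\psi$, $X^J$, $\hat\beta_J$ and $V_J$ may need to be recomputed to correspond to $J_l = 0$. Similarly for $S(J_l = 1)$.

To complete the fixed effects component updating, the posteriors of $g$ and $\phi$ are needed. The gamma priors used yield conjugate gamma posteriors. The full conditional posterior for $g$ is given by

\[ G\left( \frac{p_J + 1}{2},\; \frac{\beta_J' X^{J\prime} X^J \beta_J + N}{2} \right), \]

where $p_J = \sum_{l=1}^{p} 1(J_l = 1)$. The components of $\phi$, $\phi_{ij}$, are not identically distributed. Each $\phi_{ij}$ has a conjugate gamma posterior

\[ G\left( \frac{\nu + 1}{2},\; \frac{(w_{ij} - Z_{ij}'A\Gamma\xi_i - X_{ij}'\beta)^2/\tilde\sigma^2 + \nu}{2} \right). \]

For the random effects component, the dimensionality does not change between iterations. For $\gamma$ and $\xi_i$, the normal priors yield conjugate normal posteriors, while the zero-inflated half-normal prior for each $\alpha_l$ yields a zero-inflated half-normal posterior.

Let $\psi$ be the $N$-vector such that $\psi_{ij} = y_{ij} - X_{ij}^{J\prime}\beta_J - Z_{ij}'A\Gamma\xi_i$. The full conditional posterior $p(\gamma \mid \alpha, \beta, \lambda, \xi, \phi, y, X, Z)$ is given by $N(\hat\gamma, \hat V_\gamma) \cdot 1(\gamma \in R_\lambda)$, where

\[ \hat V_\gamma = \left( \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{\phi_{ij}}{\tilde\sigma^2}\, u_{ij} u_{ij}' + V_\gamma^{-1} \right)^{-1} \quad\text{and}\quad \hat\gamma = \hat V_\gamma \left( \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{\phi_{ij}}{\tilde\sigma^2}\,(w_{ij} - X_{ij}^{J\prime}\beta_J)\, u_{ij} + V_\gamma^{-1}\gamma_0 \right). \]

The $q(q-1)/2$ vector $u_{ij}$ is defined as $(\xi_{il}\alpha_m Z_{ijm} : l = 1, \ldots, q,\ m = l+1, \ldots, q)'$ so that the random effects term $Z_{ij}'A\Gamma\xi_i$ can be written as $u_{ij}'\gamma$.

Each $\alpha_l$ must be updated individually. The zero-inflated truncated normal prior for $\alpha_l$ yields a conjugate posterior $p(\alpha_l \mid \alpha_{-l}, \beta, \gamma, \xi, \phi, y, X, Z) = ZI\text{-}N^+(\hat\alpha, V_{\alpha_l}, \hat p_l)$, where

\[ \hat\alpha = V_{\alpha_l} \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{\phi_{ij}\, t_{ijl} T_{ij}}{\tilde\sigma^2}, \qquad V_{\alpha_l} = \left( \sum_{i=1}^{n}\sum_{j=1}^{n_i} \frac{\phi_{ij}\, t_{ijl}^2}{\tilde\sigma^2} + 1 \right)^{-1}, \]

\[ \hat p_l = \frac{p_{l0}}{p_{l0} + (1 - p_{l0}) \cdot \dfrac{N(0; 0, 1)}{N(0; \hat\alpha, V_{\alpha_l})} \cdot \dfrac{1 - \Phi(0; \hat\alpha, V_{\alpha_l})}{1 - \Phi(0; 0, 1)}}, \]

where $T_{ij} = w_{ij} - X_{ij}^{J\prime}\beta_J - \sum_{k \neq l} t_{ijk}\alpha_k$, $N(0; m, v)$ denotes the normal density with mean $m$ and variance $v$ evaluated at 0, and $\Phi(0; m, v)$ is the normal cumulative distribution function with mean $m$ and variance $v$ evaluated at 0. The $q$ vector

\[ t_{ij} = \left( Z_{ijl}\Big( \xi_{il} + \sum_{m=1}^{l-1} \xi_{im}\gamma_{ml} \Big) : l = 1, \ldots, q \right)' \]

is defined so that the random effects term $Z_{ij}'A\Gamma\xi_i$ can be written as $t_{ij}'\alpha$.

The latent variables $\xi_i$, $i = 1, \ldots, n$, have posterior $p(\xi_i \mid \beta, \alpha, \gamma, \phi, y, X, Z)$ given by $N(\hat\xi_i, V_\xi)$, where

\[ \hat\xi_i = V_\xi\, \tilde\sigma^{-2} \sum_{j=1}^{n_i} \phi_{ij}\, \Gamma' A Z_{ij}\,(w_{ij} - X_{ij}^{J\prime}\beta_J) \quad\text{and}\quad V_\xi = \left( \sum_{j=1}^{n_i} \phi_{ij}\, \Gamma' A Z_{ij} Z_{ij}' A \Gamma\, \tilde\sigma^{-2} + D^{-1} \right)^{-1}. \]

Only the components of $\xi_i$ corresponding to $\alpha_l > 0$ are updated. Lastly, the diagonal elements of $D$ have inverse gamma priors $IG(\tfrac{1}{2}, \tfrac{N}{2})$; hence the posterior is given by

\[ p(d_l \mid \alpha, \beta, \gamma, \xi, \phi, y) = IG\left( \frac{1}{2} + \frac{n}{2},\; \frac{N}{2} + \frac{\sum_{i=1}^{n} \xi_{il}^2}{2} \right). \]

5.2 Importance Weights

This Gibbs sampler generates samples from an approximate posterior, as we have approximated the logistic likelihood in (8). To correct for this, importance weights (Hastings, 1970) may be applied when computing posterior summaries to obtain exact inferences. If we have $M$ iterations of our Gibbs sampler, excluding the burn-in interval, then our importance weights $r^{(t)}$, $t = 1, \ldots, M$, can be computed as

\[ r^{(t)} = \prod_{i=1}^{n}\prod_{j=1}^{n_i} \frac{L(w_{ij}; X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i)}{T_\nu(w_{ij}; X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i, \tilde\sigma^2)}, \]

where $L(\cdot)$ is the logistic density function and $T_\nu(\cdot)$ is the $t$ density function with degrees of freedom $\nu$. Posterior means, probabilities, and other summaries of the model parameters can be estimated from the Gibbs sampler output using an importance-weighted sample average. For example, the posterior probability for a given model $m$ is the sum of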
the weights corresponding to each occurrence of model $m$ in the posterior sample, divided by the sum of all $M$ weights. The approximation is very close and hence the weights are close to one. In our simulation and data examples, we found very little difference between weighted and unweighted results.

In lieu of approximating the logistic distribution with the $t$ distribution, we also considered the slice sampler for sampling from the exact posterior distribution as applied by Gerlach et al. (2002) to variable selection for logistic models. In this approach, the model is considered linear with response variable $v_{ij} = \mathrm{logit}\, p(y_{ij} = 1)$, the vector of log odds, and $v_{ij} = \mathrm{logit}\, p(y_{ij} = 1) = X_{ij}'\beta + Z_{ij}'\Lambda\Gamma b_i + \epsilon_{ij}$, $\epsilon_{ij} \sim N(0, \sigma^2)$. The vector $v_{ij}$ is updated in a data-augmented Gibbs sampler where an auxiliary variable $u_{ij} \sim U\big(0, \tfrac{1}{1 + \exp(v_{ij})}\big)$ is introduced so that the full conditional posterior distribution for $v_{ij}$ is simplified to a truncated normal distribution as follows:

\[ p(v_{ij} \mid y_{ij}, X_{ij}, Z_{ij}, \beta, \alpha, \gamma, \sigma^2) \propto p(y_{ij} \mid v_{ij}) \cdot p(v_{ij} \mid X_{ij}, Z_{ij}, \beta, \alpha, \gamma, \sigma^2) \propto \frac{e^{v_{ij} y_{ij}}}{1 + e^{v_{ij}}} \cdot N(X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i, \sigma^2), \]

\[ p(v_{ij} \mid u_{ij}, X_{ij}, Z_{ij}, \beta, \alpha, \gamma, \sigma^2) \propto p(u_{ij} \mid v_{ij})\, p(v_{ij} \mid X_{ij}, Z_{ij}, \beta, \alpha, \gamma, \sigma^2) \propto N(X_{ij}'\beta + Z_{ij}'A\Gamma\xi_i + \sigma^2 y_{ij}, \sigma^2) \cdot 1\Big(v_{ij} < \log\frac{1 - u_{ij}}{u_{ij}}\Big). \]

[…] $1(v_{ij} > 0)$ and $v_{ij} \sim N(X_{ij}'\beta + Z_{ij}'a_i, 1)$, yielding a conditional posterior distribution for $v_{ij}$ of $N(X_{ij}'\beta + Z_{ij}'a_i, 1) \cdot \{1(v_{ij} > 0)\, y_{ij} + 1(v_{ij} < 0)(1 - y_{ij})\}$. After updating $v_{ij}$, the MCMC algorithm proceeds as in the normal case, except that $\sigma^2 = 1$. In our simulations this algorithm exhibited good mixing and convergence properties. This algorithm could also be adapted for ordinal probit models as described in Sect. 8.1.

9 Discussion

The Bayesian framework for model selection with mixed effects models discussed here is advantageous in that it allows for fixed and random effects to be selected simultaneously. Additionally, it allows for marginal posterior inclusion probabilities to be computed for each predictor along with model-averaged coefficient estimates. Posterior model probabilities can be used to compare models, whereas frequentist testing for variance components is more limited.

In addition to model selection and averaging, the proposed prior structure and computational algorithm should be useful for efficient Gibbs sampling for fitting single mixed effects models. In particular, the prior and computational algorithm represent a useful alternative to approaches that rely on inverse-Wishart priors for variance components (e.g., Gilks et al., 1993). There is an increasing realization that inverse-Wishart priors are a poor choice, particularly when limited prior information is available. Although we have focused on LMEs of the Laird and Ware (1982) type, it is straightforward to adapt our methods for a broader class of linear mixed models, accommodating varying coefficient models, spatially correlated data, and other applications (Zhao et al., 2006).

Gibbs sampling chains from random effects model parameters tend to exhibit slow mixing and convergence. Gelfand et al. (1996) recommend hierarchical centering for improved convergence and posterior surface behavior. Vines et al. (1994) also propose a transformation of random effects to improve mixing. A challenge in implementing the hierarchically centered model is to efficiently update the correlation matrix in the context of random effects selection, where we are interested in separating out the variances. One solution is proposed by Chib and Greenberg (1998); however, it is prohibitively slow for more than a couple of random effects. Further work is needed to develop fast approaches that can be easily implemented and incorporated into software packages.

Acknowledgements

This work was supported in part by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences, and by the National Science Foundation under Agreement No. DMS-0112069 with the Statistical and Mathematical Sciences Institute (SAMSI). The authors are grateful to Merlise Clyde and Abel Rodriguez at Duke University and to the participants of the Model Uncertainty working group of the SAMSI Program on Latent Variable Modeling in the Social Sciences.

References

Agresti, A. (1990). Categorical Data Analysis. New York: Wiley
Albert, J.H. and Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association 88, 669–679
Albert, J.H. and Chib, S. (2001). Sequential ordinal modeling with applications to survival data. Biometrics 57, 829–836
Berger, J.O., Ghosh, J.K. and Mukhopadhyay, N. (2003). Approximations and consistency of Bayes factors as model dimension grows. Journal of Statistical Planning and Inference 112, 241–258
Berger, J.O. and Pericchi, L.R. (2001). Objective Bayesian methods for model selection: Introduction and comparison. In Lahiri, P., editor, Model Selection, volume 38 of IMS Lecture Notes – Monograph Series, pages 135–193. Institute of Mathematical Statistics
Breslow, N. (2003). Whither PQL? UW Biostatistics Working Paper Series, Working Paper 192
Breslow, N. and Clayton, D. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association 88, 9–25
Cai, B. and Dunson, D.B. (2006). Bayesian covariance selection in generalized linear mixed models. Biometrics 62, 446–457
Chen, Z. and Dunson, D.B. (2003). Random effects selection in linear mixed models. Biometrics 59, 762–769
Chib, S. (1995). Marginal likelihood from the Gibbs output. Journal of the American Statistical Association 90, 1313–1321
Chib, S. and Greenberg, E. (1998). Bayesian analysis of multivariate probit models. Biometrika 85, 347–361
Chung, Y. and Dey, D. (2002). Model determination for the variance component model using reference priors. Journal of Statistical Planning and Inference 100, 49–65
Clyde, M. and George, E.I. (2004). Model uncertainty. Statistical Science 19, 81–94
Fernandez, C., Ley, E. and Steel, M.F. (2001). Benchmark priors for Bayesian model averaging. Journal of Econometrics 100, 381–427
Foster, D.P. and George, E.I. (1994). The risk inflation criterion for multiple regression. Annals of Statistics 22, 1947–1975
Gelfand, A.E. and Smith, A. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85, 398–409
Gelfand, A.E., Sahu, S.K. and Carlin, B.P. (1996). Efficient parameterizations for generalized linear mixed models. Bayesian Statistics 5
Gelman, A. (2005). Prior distributions for variance parameters in hierarchical models. Bayesian Analysis 1, 1–19
George, E.I. and McCulloch, R.E. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association 88, 881–889
George, E.I. and McCulloch, R.E. (1997). Approaches for Bayesian variable selection. Statistica Sinica 7, 339–374
Gerlach, R., Bird, R. and Hall, A. (2002). Bayesian variable selection in logistic regression: Predicting company earnings direction. Australian and New Zealand Journal of Statistics 42, 155–168
Geweke, J. (1996). Variable selection and model comparison in regression. In Bayesian Statistics 5 – Proceedings of the Fifth Valencia International Meeting, pages 609–620
Gilks, W., Wang, C., Yvonnet, B. and Coursaget, P. (1993). Random-effects models for longitudinal data using Gibbs sampling. Biometrics 49, 441–453
Green, P.J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711–732
Green, P.J. (1997). Discussion of "The EM algorithm – an old folk song sung to a fast new tune," by Meng and van Dyk. Journal of the Royal Statistical Society, Series B 59, 554–555
Hall, D. and Praestgaard, J. (2001). Order-restricted score tests for homogeneity in generalised linear and nonlinear mixed models. Biometrika 88, 739–751
Hastings, W. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97–109
Hobert, J.P. and Casella, G. (1996). The effect of improper priors on Gibbs sampling in hierarchical linear mixed models. Journal of the American Statistical Association 91, 1461–1473
Holmes, C. and Knorr-Held, L. (2003). Efficient simulation of Bayesian logistic regression models. Technical report, Ludwig Maximilians University Munich
Jang, W. and Lim, J. (2005). Estimation bias in generalized linear mixed models. Technical report, Institute for Statistics and Decision Sciences, Duke University
Jiang, J., Rao, J., Gu, Z. and Nguyen, T. (2008). Fence methods for mixed model selection. Annals of Statistics, to appear
Johnson, V.E. and Albert, J.H. (1999). Ordinal Data Modeling. Berlin Heidelberg New York: Springer
Kass, R.E. and Wasserman, L. (1995). A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. Journal of the American Statistical Association 90, 928–934
Kinney, S.K. and Dunson, D.B. (2007). Fixed and random effects selection in linear and logistic models. Biometrics 63, 690–698
Laird, N. and Ware, J. (1982). Random-effects models for longitudinal data. Biometrics 38, 963–974
Liang, F., Paulo, R., Molina, G., Clyde, M.A. and Berger, J.O. (2005). Mixtures of g-priors for Bayesian variable selection. Technical Report 05-12, ISDS, Duke University
Lin, X. (1997). Variance component testing in generalised linear models with random effects. Biometrika 84, 309–326
Liu, J.S. and Wu, Y.N. (1999). Parameter expansion for data augmentation. Journal of the American Statistical Association 94, 1264–1274
Liu, C., Rubin, D.B. and Wu, Y.N. (1998). Parameter expansion to accelerate EM: the PX-EM algorithm. Biometrika 85, 755–770
Meng, X.L. and Wong, W.H. (1996). Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statistica Sinica 6, 831–860
Mira, A. and Tierney, L. (2002). Efficiency and convergence properties of slice samplers. Scandinavian Journal of Statistics 29, 1–12
Mitchell, T.J. and Beauchamp, J.J. (1988). Bayesian variable selection in linear regression (with discussion). Journal of the American Statistical Association 83, 1023–1036
Neal, R.M. (2000). Slice sampling. Technical report, Department of Statistics, University of Toronto
Newton, M. and Raftery, A.E. (1994). Approximate Bayesian inference by the weighted likelihood bootstrap (with discussion). Journal of the Royal Statistical Society, Series B 56, 3–48
O'Brien, S.M. and Dunson, D.B. (2004). Bayesian multivariate logistic regression. Biometrics 60, 739–746
Pauler, D.K., Wakefield, J.C. and Kass, R.E. (1999). Bayes factors and approximations for variance component models. Journal of the American Statistical Association 94, 1242–1253
Raftery, A.E. (1995). Bayesian model selection in social research. Sociological Methodology 25, 111–163
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics 6, 461–464
Sinharay, S. and Stern, H.S. (2001). Bayes factors for variance component testing in generalized linear mixed models. In Bayesian Methods with Applications to Science, Policy, and Official Statistics, pages 507–516
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. Journal of Econometrics 75, 317–343
Stiratelli, R., Laird, N.M. and Ware, J.H. (1984). Random-effects model for several observations with binary response. Biometrics 40, 961–971
Verbeke, G. and Molenberghs, G. (2003). The use of score tests for inference on variance components. Biometrics 59, 254–262
Vines, S., Gilks, W. and Wild, P. (1994). Fitting Bayesian multiple random effects models. Technical report, Biostatistics Unit, Medical Research Council, Cambridge
West, M. (1987). On scale mixtures of normal distributions. Biometrika 74, 646–648
Zellner, A. and Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics: Proceedings of the First International Meeting held in Valencia (Spain)
Zhao, Y., Staudenmayer, J., Coull, B. and Wand, M. (2006). General design Bayesian generalized linear mixed models. Statistical Science 21, 35–51
Bayesian Variable Selection in Generalized Linear Mixed Models

Bo Cai and David B. Dunson

B. Cai, Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, bocai@gwm.sc.edu
D.B. Dunson, Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, U.S.A.
D.B. Dunson (ed.) Random Effect and Latent Variable Model Selection, DOI: 10.1007/978-0-387-76721-5, © Springer Science+Business Media, LLC 2008

1 Introduction

1.1 Background and Motivation

Repeated measures and longitudinal data are commonly collected for analysis in epidemiology, clinical trials, biology, sociology, and economic sciences. In such studies, a response is measured repeatedly over time for each subject under study, and the number and timing of the observations often vary among subjects. In contrast to cross-sectional studies that collect a single measurement per subject, longitudinal studies have the extra complication of within-subject dependence in the repeated measures. Such dependence can be thought to arise due to the impact of unmeasured predictors. Main effects of unmeasured predictors lead to variation in the average level of response among subjects, while interactions with measured predictors lead to heterogeneity in the regression coefficients. This justification has motivated random effects models, which allow the intercept and slopes in a regression model to be subject-specific. Random effects models are broadly useful for modeling of dependence not only for longitudinal data but also in multicenter studies, meta-analysis and functional data analysis.

The chapter by Kinney and Dunson has motivated and described a Bayesian approach for selecting fixed and random effects in linear and logistic mixed effects models. The goal of the current chapter is to outline a Bayesian methodology for solving the same types of problems in the broader class of generalized linear mixed models (GLMMs). GLMMs provide an extension of generalized linear models (GLMs) to accommodate correlation and allow rich classes of distributions through allowing subject-specific regression coefficients in a GLM (McCulloch and Searle, 2001). Typically these subject-specific coefficients, or random effects, are assumed to have a multivariate Gaussian distribution a priori, as will be the focus in this chapter. For a recently proposed approach that allows the random effects distribution to be unknown, while also allowing fixed and random effects selection, refer to Cai and Dunson (2007).

Note that GLMMs typically assume that the observations are conditionally independent given the random effects. However, in marginalizing out the random effects, a dependence structure is induced in the multiple responses from a subject. In addition, random effects can be incorporated in GLMs to allow richer classes of distributions. For example, by incorporating random effects in Poisson or binomial GLMs, one induces over-dispersion. The resulting marginal distributions are no longer Poisson or binomial, but are instead mixtures of Poisson or binomial distributions. The form of the link function can also be impacted. For example, in marginalizing out random effects in a logistic regression model, one induces a logistic-normal link. Hence, GLMMs are often useful even when there is a single observation per subject, and modeling of dependence is not of interest. In performing inferences and variable selection in GLMMs, it is important to keep in mind the dual role of the random effects in inducing a more flexible class of models for a single outcome and in accommodating dependence in repeated outcomes. Such a duality does not occur in normal linear mixed effect models, since one still obtains a normal linear model in marginalizing out the random effects.

In addition to the complication in interpretation arising from this duality, GLMMs are certainly more complicated to fit than linear mixed models or GLMs without random effects. The challenges in fitting GLMMs arise because the marginal likelihood obtained in integrating out the random effects is not available analytically except in the normal linear model special case. Hence, one cannot obtain a simple iterative solution for maximizing the exact marginal likelihood, and even Bayesian MCMC-based approaches tend to be more difficult to implement efficiently. There is a vast literature on frequentist and Bayesian methods for addressing this problem; for example, refer to Schall (1991), Zeger and Karim (1991), Breslow and Clayton (1993), McGilchrist (1994), and McCulloch (1997).

Such methods allow one to fit a single GLMM and to perform inferences on the fixed effect regression coefficients. Much of the literature has argued that the fixed effects are of primary interest, with the random effects incorporated as nuisance parameters to account for the complication of within-subject dependence. Frequentist methods for fitting of GLMMs tend to provide only a point estimate for the random effects covariance, but if this covariance is not of interest, such an estimate is more than sufficient.

However, it is hard to think of a study in which it is only of interest to assess the effects of the predictors for typical subjects, without having an interest also in how much they vary in the predictor effects. For example, in assessing the efficacy of a drug therapy in a clinical trial, is it really the case that interest focuses only on the average effectiveness of the drug and not on how much this effectiveness varies among patients? Certainly, clinicians and patients may view a drug very
differently under the following two scenarios: (A) the drug has no effect or a mild adverse effect for 50% of the patients and a dramatic beneficial effect for a small subset of the patients; or (B) the drug has an identical effect for all patients. Scenarios (A) and (B) are distinguished by the magnitude of the random effect variance.

In many applications, the primary interest is in assessing whether the random effects variance is equal to zero or not. This is often the case in genetic studies in which one wishes to test whether disease risk varies across families. However, even beyond such specialized studies, we would argue that the typical scenario faced in analysis of data with a GLMM is as follows: The study collects data for a number of different covariates and it is not known with certainty a priori which predictors should be included in the fixed effects component and which should be included in the random effects component. Hence, to appropriately allow for uncertainty in specifying the model, it would be appealing to consider a Bayesian approach for model averaging and selection. In addition, it is typically the case that investigators desire a weight of evidence that a particular predictor is in the fixed and/or random effects component. Our goal is to describe an approach that simultaneously searches a potentially large model space for good subsets of predictors to include in the two components, while also estimating marginal inclusion probabilities and allowing model-averaged predictions.

1.2 Time to Pregnancy Application

As motivation, consider the application to reproductive epidemiology studies of occupational exposures. To assess the impact of a potentially adverse exposure on fecundability, the probability of conception in a noncontracepting menstrual cycle, epidemiologists commonly measure time to pregnancy (TTP). In retrospective studies, TTP is typically defined as the number of menstrual cycles during which the woman was having intercourse without contraception prior to her most recent pregnancy. Because TTP is a discrete event time, one can consider a discrete hazards model of the form

\[ \mathrm{logit}\{\Pr(T_i = t \mid T_i \ge t, x_{it}, z_{it})\} = x_{it}'\beta + z_{it}'\zeta_i, \tag{1} \]

where $T_i$ is the TTP for woman $i$, $x_{it}$ and $z_{it}$ are vectors of predictors that may vary from cycle to cycle, $\beta$
are fixed effects regression coefficients, $\zeta_i \sim N(0, \Omega)$ are random effects for woman $i$, and $\Omega$ is a covariance matrix. Model (1) uses a logistic mixed effects model to characterize the conditional probability of a pregnancy in the $t$th cycle at risk given that the woman did not conceive prior to that cycle. If there was no unexplained heterogeneity in fecundability after accounting for the measured predictors, then the random effects could be excluded. In this case, in the absence of time-varying predictors, TTP is geometrically distributed. Over-dispersion relative to the geometric distribution allows one to identify the random effects variance even with a single TTP measurement from a woman.

Rowland et al. (1992) studied factors related to fecundability in dental assistants. Study participants completed a demographic and exposure history questionnaire, while also providing information on the number of menstrual cycles during which the woman was having noncontracepting sexual intercourse before the most recent pregnancy. Model (1) could be fitted easily in standard software packages (e.g., SAS or WinBUGS) if the predictors to be included in the fixed and random effects components were known. However, it is of course not known a priori which predictors should be included, and we would like to investigate which factors vary in their effects across women. For example, do the effects of aging, recent oral contraceptive use, and smoking vary?

1.3 Background on Model Selection in GLMMs

If the focus were on selecting the single best GLMM from among the possible candidates, one could potentially fit the model for all possible choices of fixed effect predictors, $x_{it}$, and random effects predictors, $z_{it}$. One could then apply a standard criterion, such as Akaike's information criterion (AIC) or the Bayesian information criterion (BIC). However, it is not clear that these criteria are appropriate in mixed effects models, as one may need to estimate an effective degrees of freedom to provide a more appropriate penalty for model complexity. In hierarchical models, such as GLMMs, the number of parameters is arbitrary, as one can write different formulations of the same model that have different numbers of parameters.

An additional issue is that one may require a weight of evidence that a particular predictor is included in the random effects component. This can be addressed by conducting hypothesis tests of whether the variance of the random effects distribution is equal to zero. The chapter by Ciprian Crainiceanu considers likelihood ratio tests in this setting, while the chapter by Daowen Zhang and Xihong Lin considers score tests. These approaches can be used to obtain p-values for testing whether variance components are equal to zero.

In the Bayesian literature, Albert and Chib (1997) proposed an approach for testing whether a random intercept should be included, Sinharay and Stern (2001) developed a more general approach for calculating Bayes factors for variance components testing in GLMMs, and Chen et al. (2003) proposed a class of informative priors for model selection in GLMMs. These methods focus on comparing two models at a time, and do not provide a general approach for searching for promising subsets of candidate predictors.

In the setting of linear mixed models for normal data, Chen and Dunson (2003) proposed a Bayesian approach for random effects selection based on using variable selection priors for the components in a special decomposition of the random effects covariance. Related approaches have been used in graphical (or covariance structure) modeling for multivariate normal data (refer to Wong et al. (2003) and Liechty et al. (2004) for recent references). Bayesian variable selection in conventional GLMs has also received a lot of interest in the literature. Raftery (1996) proposed an approximate
Bayes factor approach, Meyer and Laud (2002) considered predictive variable selection, Nott and Leonte (2004) developed an innovative sampling algorithm and Ntzoufras et al. (2003) developed methods for joint variable and link selection.

Cai and Dunson (2006) described a Bayesian approach to the problem of simultaneous selection of fixed and random effects in GLMMs. In this chapter, we summarize this work and provide more details of the approach. We first choose variable selection-type mixture priors for the fixed effects regression coefficients and the parameters in a special Cholesky decomposition of the random effects covariance proposed by Chen and Dunson (2003). These priors allow fixed effects to drop out of the model by placing probability mass on $\beta_l = 0$. In addition, following a related approach to Albert and Chib (1997) and Chen and Dunson (2003) (refer also to the Kinney and Dunson chapter), we assign positive probability to random effects having zero variance to effectively move between the full model with random effects for every predictor and submodels excluding one or more random effects. This prior specification has convenient computational properties, which is important, given the potentially large number of models under consideration.

Outside of simple models, it is typically the case that Bayesian model selection requires the calculation of normalizing constants, which do not have closed forms. Unfortunately, typical MCMC algorithms bypass calculation of normalizing constants, so are not appropriate. In addition, MCMC-based methods for calculating normalizing constants tend to be highly computationally intensive, even when considering a single model instead of a high-dimensional list of models. For these reasons, many approaches rely on analytic approximations to intractable integrals using Laplace and other approaches based on Taylor series. The Cai and Dunson (2006) approach reviewed in this chapter relies on stochastic search variable selection (SSVS) (George and McCulloch, 1993) implemented with MCMC, combined with limited analytic approximations. Similar ideas were implemented previously by Raftery et al. (1996) in the context of model averaging in survival analysis, and Chipman et al. (2002, 2003) in implementing analyses of treed GLMs.

The remainder of this chapter is organized as follows. Section 2 reviews the specification of a GLMM, and describes a Bayesian formulation of the model selection problem. Section 3 outlines the SSVS algorithm for posterior computation and model search. Section 4 considers simulation examples as a proof of concept and illustration. Section 5 applies the approach to the Rowland et al. (1992) time to pregnancy application, and Sect. 6 contains a discussion.

2 Bayesian Subset Selection in GLMMs

2.1 Generalized Linear Mixed Models

For observation $j$ ($j = 1, \ldots, n_i$) from subject $i$ ($i = 1, \ldots, n$), let $y_{ij}$ denote the response variable, let $x_{ij}$ denote a $p \times 1$ vector of candidate predictors, and let $z_{ij}$ denote a $q \times 1$ vector of candidate predictors. Note that it is important to distinguish candidate predictors from predictors that are included in the model. Here, we will follow the approach of imbedding all of the models under consideration in a full model that contains all of the candidate predictors. Then, by choosing a prior that allows the fixed effect coefficients to have values exactly equal to zero, we allow each of the fixed effect candidate predictors to potentially drop out of the model. In addition, by choosing a prior that allows the random effect variances to be exactly zero, we allow predictors to drop out of the random effect component.

Reviewing the GLMM specification, note that the elements of $y_i = (y_{i1}, \ldots, y_{i,n_i})'$ are modeled as conditionally independent random variables from a simple exponential family

\[
\pi(y_{ij} \mid x_{ij}, z_{ij}, \zeta_i) = \exp\left\{ \frac{y_{ij}\theta_{ij} - b(\theta_{ij})}{a_{ij}(\phi)} + c(y_{ij}, \phi) \right\}, \tag{2}
\]

where $\theta_{ij}$ is the canonical parameter related to the linear predictor $\eta_{ij} = x_{ij}'\beta + z_{ij}'\zeta_i$, with a $p \times 1$ vector of fixed effects regression coefficients $\beta$ and a $q \times 1$ vector of subject-specific random effects $\zeta_i \sim N_q(0, \Omega)$; $\phi$ is a scalar dispersion parameter, and $a_{ij}(\cdot)$, $b(\cdot)$, $c(\cdot)$ are known functions, with $a_{ij}(\phi)$ typically expressed as $\phi / w_{ij}$, where $w_{ij}$ is a known weight.

Note that the conditional-independence assumption implies that the dependence among the different observations from subject $i$ arises entirely from the shared dependence on the random effects. In marginalizing out the random effects, one induces a predictor-dependent correlation structure in the multivariate response vector, $y_i$. For this reason, GLMMs provide a very useful class of models for multivariate non-normal data. In addition, in selecting predictors to include in the random effects component, we are also simultaneously selecting a covariance structure among the multiple outcomes.

Let $y = (y_1', \ldots, y_n')'$, $y_i = (y_{i1}, \ldots, y_{in_i})'$, $X = (X_1', \ldots, X_n')'$, $X_i = (x_{i1}, \ldots, x_{in_i})'$, $Z = \mathrm{diag}(Z_1, \ldots, Z_n)$, $Z_i = (z_{i1}, \ldots, z_{in_i})'$, and $\zeta = (\zeta_1', \ldots, \zeta_n')'$. The joint distribution of responses $y$ and random effects $\zeta$ conditionally on the predictors $X$ and $Z$ is of the form

\[
\pi(y, \zeta \mid \beta, \phi, \Omega, X, Z) = \exp\left[ \frac{y' h(\eta) - b\{h(\eta)\}' 1_N}{a(\phi)} + c(y, \phi)' 1_N \right] \pi(\zeta \mid \Omega), \tag{3}
\]

where $\eta = X\beta + Z\zeta$, $h(\eta)$ is the vector of canonical parameters $\theta_{ij}$ written as a function of the linear predictor, $b(\cdot)$ and $c(\cdot, \phi)$ are applied elementwise, and $1_N$ is an $N \times 1$ vector of ones, where $N = \sum_{i=1}^{n} n_i$.

In practice, one needs to choose a particular exponential family distribution and link function to complete an explicit specification of the model. One aspect of model uncertainty in GLMMs is the choice of the distribution and link function. However, in this chapter, we assume that both these components are known to simplify exposition.

Our focus is on Bayesian approaches for accounting for uncertainty in the elements of $x_{ij}$ and $z_{ij}$ to be included in the model, as well as the covariance structure in the $\zeta_i$'s. As discussed in detail in the chapter by Kinney and Dunson, a Bayesian specification of the model uncertainty problem requires one to choose a prior on the
model space, corresponding to prior probabilities for each of the models in the list, along with priors for the coefficients within each of the models. We let $\mathcal{M}$ denote the list of models corresponding to all possible subsets of $x_{ij}$ and $z_{ij}$. In addition, we let $M \in \mathcal{M}$ be an index for a single model in $\mathcal{M}$.

With this notation in place, we let $x_{ij}^{(M)}$, $z_{ij}^{(M)}$, $\beta^{(M)}$, $\zeta_i^{(M)}$, and $\Omega^{(M)}$ denote the terms for model $M$, which is specified as $\eta_{ij}^{(M)} = x_{ij}^{(M)\prime}\beta^{(M)} + z_{ij}^{(M)\prime}\zeta_i^{(M)}$, with the dispersion parameter, link function, and distributional form assumed common to the different models $M \in \mathcal{M}$. The predictors $x_{ij}^{(M)}$ consist of a $p_M \le p$ subset of $x_{ij}$, while $z_{ij}^{(M)}$ is a $q_M \le q$ subset of $z_{ij}$. In addition, $\zeta_i^{(M)} \sim N(0, \Omega^{(M)})$ is a $q_M \times 1$ vector of random effects, with $q_M \times q_M$ covariance matrix $\Omega^{(M)}$, which can have zero off-diagonal elements corresponding to conditional independence relationships in the random effects included.

The model space, $\mathcal{M}$, includes all possible combinations of subsets of $x_{ij}$ and $z_{ij}$ and zero off-diagonal elements of the random effects covariance matrices corresponding to these subsets. Hence, the total number of models is

\[
2^{p} \sum_{k=0}^{q} \binom{q}{k}\, 2^{\frac{1}{2}(q-k)(q-k-1)}.
\]

Clearly, the number of models under consideration grows extremely fast with $p$ and $q$, so that it would be infeasible to run a separate MCMC analysis for each model in the list even for a modest number of candidate predictors. For example, even with $p = q = 5$, the sum equals $1024 + 320 + 80 + 20 + 5 + 1 = 1450$, so that we have $2^5 \times 1450 = 46{,}400$ models in $\mathcal{M}$.

2.2 Description of Approach

Our goal is to select good models from among the different possibilities in $\mathcal{M}$. To attempt to identify good models quickly from among the potentially enormous number of possible models under consideration, we apply a stochastic search variable selection approach. This algorithm sequentially modifies the variables included in each component through MCMC sampling. For the fixed effects component, we follow the common convention of choosing mixture priors with point mass at zero. For the random effects, a more innovative and involved approach is necessary. In particular, we propose to induce zero variance components and zero correlations between random effects through zeroing coefficients in a carefully chosen decomposition of the random effects covariance.

In Bayesian analyses, the standard prior for a covariance matrix is the Wishart prior. However, it is widely known that the Wishart prior is very inflexible, allowing only a single degrees of freedom parameter for all elements and not allowing zero off-diagonal elements. Because the constraints on a covariance matrix limit the flexibility with which one can consider direct modifications to the Wishart prior, a common trick is to induce a prior for a covariance matrix through priors for parameters in a decomposition. For example, Daniels and Zhao (2003) used a special Cholesky decomposition to model changes in the random effects covariance over time. A related decomposition approach was considered by Daniels and Pourahmadi (2002). Daniels and Kass (1999) instead considered spectral decomposition. Wong et al. (2003) proposed a Bayesian method for estimating an inverse covariance matrix for normal data using a prior that allows the off-diagonal elements of the inverse covariance matrix to be zero.

Chen and Dunson (2003) proposed an alternative decomposition, which has some appealing characteristics. For example, the decomposition results in a conditionally linear regression model, which allows one to choose a conditionally conjugate prior. This conjugacy is important in allowing closed form calculation of conditional probabilities of including a predictor. Such probabilities are required in implementing SSVS algorithms. Unlike Chen and Dunson (2003) and Kinney and Dunson (2007), this chapter will allow zero off-diagonal elements in the random effects covariance matrix. This is accomplished through variable selection mixture priors for parameters in the decomposition that control correlations among the random effects. Effectively, the prior allows movement between models of different dimension, with the covariance matrix of the random effects in each of these models being positive semidefinite. The details are given in Sect. 2.3.

2.3 Reparameterization and Mixture Prior Specification

The random effects covariance matrix $\Omega$ may be factorized as $\Omega = \Lambda\Gamma\Gamma'\Lambda$, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_q)$ is a diagonal matrix, with diagonal elements $\lambda_k \ge 0$ for $k = 1, \ldots, q$, and $\Gamma$ denotes the lower triangular matrix

\[
\Gamma = \begin{pmatrix}
1 & & & \\
\gamma_{21} & 1 & & \\
\vdots & \vdots & \ddots & \\
\gamma_{q,1} & \gamma_{q,2} & \cdots & 1
\end{pmatrix}.
\]

From straightforward algebra, this decomposition of the random effects covariance implies that the $(k, l)$ element of the matrix $\Omega$ has the following expression:

\[
\sigma_{kl} = \sigma_{lk} = \lambda_k \lambda_l \left( \gamma_{r_2, r_1} + \sum_{s=1}^{r_1 - 1} \gamma_{ks}\gamma_{ls} \right), \quad \text{for } k, l = 1, \ldots, q, \tag{4}
\]

where $r_1 = \min(k, l)$, $r_2 = \max(k, l)$, and $\gamma_{kk} = 1$. Hence, the $\lambda$s are row and column-specific multipliers, while the $\gamma$'s control the size of the off-diagonal elements of $\Omega$. For example, in the special case in which all the below-diagonal elements of $\Gamma$ equal zero, $\Omega$ is a diagonal matrix with $\lambda_k^2$, for $k = 1, \ldots, q$, the elements along the diagonal. One obtains a positive semidefinite $\Omega$ when $\lambda_k > 0$ for all $k$.

Because $\lambda_k$ serves as a multiplier on all the elements in the $k$th row and column of $\Omega$, it is clear that one effectively excludes the $k$th random effect from the model when $\lambda_k = 0$, as in this case the random effects variance will equal zero. Note that once all the rows and columns corresponding to null random effects having zero variance are excluded, one obtains a positive semidefinite covariance matrix for those random effects included in the model. In this way, random effects are allowed to effectively drop out of the model.

Recall that, to complete a Bayesian formulation of the model selection problem, we need to choose prior probabilities for each $M \in \mathcal{M}$, along with prior distributions for the coefficients within each model. Assuming that a common prior is used for the parameters that belong to every model in the list, such as the exponential family dispersion parameter, we focus on the choice of prior for $\lambda = (\lambda_1, \ldots, \lambda_q)'$ and $\gamma = (\gamma_{mk} : m = k+1, \ldots, q;\ k = 1, \ldots, q-1)'$. Each model $M \in \mathcal{M}$ is distinguished by the subsets of $\lambda$ and $\gamma$ having zero elements. Hence, by choosing a single mixture prior for $\lambda$ and $\gamma$ that allows for zero elements, we simultaneously specify a prior over the model space $\mathcal{M}$ and for the coefficients within each model.

To drop out the off-diagonal elements in the covariance matrix when a random effect is excluded, the support of the prior for $\gamma$ is defined as $R_\lambda = \{\gamma : \gamma_{mk} = \gamma_{kl} = 0 \text{ if } \lambda_k = 0, \text{ for } k = 1, \ldots, q,\ 1 \le l < k < m \le q\}$. For each $\lambda_k$, the prior is a mixture of a point mass at zero and a normal density truncated to the positive half-line,

\[
\pi(\lambda_k) = \pi_{1,k0}\, 1(\lambda_k = 0) + (1 - \pi_{1,k0})\, \frac{N(\lambda_k; \mu_{1,k0}, \sigma^2_{1,k0})}{F(0; -\mu_{1,k0}, \sigma^2_{1,k0})}\, 1(\lambda_k > 0), \tag{5}
\]

where $\pi_{1,k0}$, $\mu_{1,k0}$ and $\sigma^2_{1,k0}$ are hyperparameters specified by investigators, and $F(\cdot)$ is the normal distribution function. We refer to prior (5) as a zero-inflated positive normal density, ZI-N$^+(\lambda_k; \pi_{1,k0}, \mu_{1,k0}, \sigma^2_{1,k0})$. The prior probability of the $k$th random effect being excluded is $\pi_{1,k0} = \Pr(H_{0k} : \lambda_k = 0)$. The prior probability of the global null hypothesis of homogeneity, under which all random effects are excluded from the model, is $\Pr(H_0 : \lambda_1 = \cdots = \lambda_q = 0) = \prod_{k=1}^{q} \pi_{1,k0}$.

To allow fixed effects predictors to effectively drop out of the model, we also choose a zero-inflated normal density, ZI-N$(\beta_v; \pi_{2,v0}, \mu_{2,v0}, \sigma^2_{2,v0})$, as the prior for $\beta_v$, for $v = 1, \ldots, p$. The prior probability of the $v$th predictor being excluded is then $\pi_{2,v0} = \Pr(\beta_v = 0)$. Similar mixture priors have been widely used in the Bayesian variable selection literature (cf. Geweke, 1996).

We also allow zero off-diagonal elements in the covariance matrix by choosing mixture priors with masses at 0 for the $\gamma$'s. We choose a zero-inflated normal density, ZI-N$(\gamma_{mk}; \pi_{3,mk,0}, \mu_{3,mk,0}, \sigma^2_{3,mk,0})$, with the constraint related to $\lambda$, as the prior for $\gamma_{mk}$, for $m = k+1, \ldots, q$ and $k = 1, \ldots, q-1$. This mixture prior fixes the prior probability of $\gamma_{mk} = 0$ to be $\pi_{3,mk,0}$. In this way, the correlations between the random effects can be zero or nonzero. Explicitly, from (4), the correlation coefficient between the $m$th and the $k$th random effects ($m > k$) is

\[
\rho(\zeta_{im}, \zeta_{ik}; \gamma) = \frac{\gamma_{mk} + \sum_{s=1}^{k-1} \gamma_{ks}\gamma_{ms}}{\left(1 + \sum_{s=1}^{m-1} \gamma_{ms}^2\right)^{1/2} \left(1 + \sum_{s=1}^{k-1} \gamma_{ks}^2\right)^{1/2}}.
\]

So the prior probability that the two random effects are uncorrelated is

\[
\Pr\{\rho(\zeta_{im}, \zeta_{ik}) = 0\} = \Pr(\gamma_{m1}\gamma_{k1} = \cdots = \gamma_{m,k-1}\gamma_{k,k-1} = \gamma_{mk} = 0) = \pi_{3,mk,0} \prod_{s=1}^{k-1} \left\{ \pi_{3,ms,0}(1 - \pi_{3,ks,0}) + \pi_{3,ks,0} \right\}.
\]

Note that the expression for the correlation coefficient, $\rho(\zeta_{im}, \zeta_{ik})$, for any two random effects that have nonzero variance ($\lambda_m > 0$, $\lambda_k > 0$) does not involve $\lambda$.

Even though we use the $\lambda$s and $\gamma$s in specifying the prior and in posterior computation, inferences should be based on the random effect variances and correlations, which are easily calculated from the $\lambda$s and $\gamma$s. To obtain samples from the prior distribution of $\Omega$, one can simply draw from the prior of $\lambda, \gamma$ and then calculate the corresponding values of $\Omega$. Similarly, to obtain samples from the posterior distribution of $\Omega$, one can rely on samples from the posterior of $\lambda, \gamma$.

To choose the values for the hyperparameters, we suggest an informative specification. For example, one could set the point mass probabilities equal to 0.5 to allow equal probability of inclusion or exclusion, while centering the priors for the coefficients on zero, and choosing the prior variance to assign high probability to a wide range of plausible values for the random effects covariance. However, it is important to avoid choosing very high variance priors that are diffuse but proper. Diffuse priors are not recommended for Bayesian model selection, because the higher the prior variance, the more the null model is favored. Default prior selection in GLMMs is an interesting area for future research.

2.4 An Approximation

Our goal is to implement a stochastic search variable selection (SSVS) algorithm for simultaneously exploring the model space, $\mathcal{M}$, while also obtaining draws from the posterior distributions for the parameters. If we had a linear mixed effects model, then the steps in the SSVS algorithm, obtained from the specification in Sects. 2.1–2.3, would all involve sampling from standard distributions. However, in GLMMs more broadly, this is no longer the case, and we cannot apply standard MCMC algorithms for updating the parameters in a single GLMM directly.

The problem occurs in calculating the conditional posterior distributions for a single element of $\lambda$ or $\gamma$. In particular, due to the use of the variable selection mixture prior, the conditional posterior is also a mixture of a point mass at zero and a continuous distribution. Calculation of the conditional posterior probability allocated to the continuous component involves calculating a marginal likelihood, which is not available in closed form. In particular, it is necessary to calculate the marginal likelihood of $y$ conditional on the parameters by integrating out the random effects

\[
L(\beta, \phi, \Omega; y, X, Z) = \int_{\mathbb{R}^{q}} \pi(y \mid \beta, \phi, \zeta, X, Z)\, \pi(\zeta \mid \Omega)\, d\zeta. \tag{6}
\]
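As a concrete check on the reparameterization of Sect. 2.3, through which the marginal likelihood (6) depends on $\lambda$ and $\gamma$, the following Python sketch builds $\Omega = \Lambda\Gamma\Gamma'\Lambda$ by explicit matrix multiplication and compares it with the element formula (4). It is only an illustration: the function names are hypothetical, and it assumes that the vector $\gamma$ fills the lower triangle of $\Gamma$ row by row ($\gamma_{21}, \gamma_{31}, \gamma_{32}, \ldots$). The numerical values of $\lambda$ and $\gamma$ are those of the second simulation case in Sect. 4.1.

```python
def build_gamma(gamma, q):
    """Unit lower-triangular q x q matrix, filled row by row from the vector gamma.

    Assumption of this sketch: gamma is ordered (g21, g31, g32, g41, ...).
    """
    G = [[0.0] * q for _ in range(q)]
    values = iter(gamma)
    for m in range(q):
        for k in range(m):
            G[m][k] = next(values)
        G[m][m] = 1.0
    return G


def omega_from_decomposition(lam, gamma):
    """Omega = Lambda Gamma Gamma' Lambda, computed as A A' with A = Lambda Gamma."""
    q = len(lam)
    G = build_gamma(gamma, q)
    A = [[lam[m] * G[m][k] for k in range(q)] for m in range(q)]
    return [[sum(A[k][s] * A[l][s] for s in range(q)) for l in range(q)]
            for k in range(q)]


def sigma_kl(lam, gamma, k, l):
    """Element (k, l) of Omega from expression (4); k and l are 1-based."""
    G = build_gamma(gamma, len(lam))
    r1, r2 = min(k, l), max(k, l)
    inner = G[r2 - 1][r1 - 1] + sum(G[k - 1][s] * G[l - 1][s] for s in range(r1 - 1))
    return lam[k - 1] * lam[l - 1] * inner


# Second simulation case of Sect. 4.1: the second and fourth lambdas are zero.
lam = [0.2, 0.0, 0.7, 0.0, 0.5]
gamma = [0, 0.4, 0, 0, 0, 0, 0.8, 0, 0.1, 0]
omega = omega_from_decomposition(lam, gamma)

for row in omega:
    print([round(v, 3) for v in row])
```

Under the stated ordering assumption, the rows and columns of the second and fourth random effects are identically zero, and the remaining entries round to the true values reported in Table 1 (for instance $\sigma_{33} = 0.5684$ and $\sigma_{53} = 0.147$), which illustrates how setting $\lambda_k = 0$ removes a random effect from the model.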
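Let $l(\beta, \phi, \Omega; y) = \log L(\beta, \phi, \Omega; y)$, suppressing the conditioning on $X$ and $Z$ as shorthand. By far the most commonly used and successful approach for analytically approximating marginal likelihoods is the Laplace approximation (Solomon and Cox, 1992; Breslow and Clayton, 1993; Lin, 1997; Chipman et al., 2003, among others).

Recall that $\Omega$ depends on $\lambda, \gamma$ through expression (4). In addition, when $\lambda = 0$, we have $\Omega = 0$, which implies that $\zeta \equiv 0$ so that $L(\beta, \phi, \Omega; y, X, Z) = \pi(y \mid \beta, \phi, X)$, which is the likelihood for a GLM with no random effects. In the general case, Cai and Dunson (2006) proposed a second-order Taylor series approximation to (6). In particular, start by approximating the first integrand of (6) by taking a second-order Taylor series expansion at $E(\zeta) = 0$, the mean of the random effects:

\[
\begin{aligned}
L(\beta, \phi, \zeta; y) &\approx L(\beta, \phi, \zeta = 0; y) + \left. \frac{\partial L(\beta, \phi, \zeta; y)}{\partial \zeta'} \right|_{\zeta = 0} \zeta + \frac{1}{2}\, \zeta' \left. \frac{\partial^2 L(\beta, \phi, \zeta; y)}{\partial \zeta\, \partial \zeta'} \right|_{\zeta = 0} \zeta \\
&= L(\beta, \phi, \zeta = 0; y) \left[ 1 + \left. \frac{\partial l(\beta, \phi, \zeta; y)}{\partial \eta'} \right|_{\zeta = 0} Z\zeta \right. \\
&\qquad \left. {} + \frac{1}{2} (Z\zeta)' \left\{ \frac{\partial l(\beta, \phi, \zeta; y)}{\partial \eta} \frac{\partial l(\beta, \phi, \zeta; y)}{\partial \eta'} + DG\!\left( \frac{\partial^2 l(\beta, \phi, \zeta; y)}{\partial \eta\, \partial \eta'} \right) \right\}_{\zeta = 0} Z\zeta \right],
\end{aligned}
\]

where $\eta = X\beta + Z\zeta$, and $DG(A)$ denotes a diagonal matrix with the diagonal entries of $A$. We note that (6) is actually the expectation of $L(\beta, \phi, \zeta; y)$ with respect to $\zeta$. Because $E(\zeta) = 0$, the linear term vanishes on taking this expectation. Thus, the approximation $\tilde{L}(\beta, \phi, \Omega; y)$ can be expressed as

\[
\tilde{L}(\beta, \phi, \Omega; y) = L_0 \left[ 1 + \frac{1}{2}\, \mathrm{tr}\!\left( Z' \left\{ \frac{\partial l(\beta, \phi, \zeta; y)}{\partial \eta} \frac{\partial l(\beta, \phi, \zeta; y)}{\partial \eta'} + DG\!\left( \frac{\partial^2 l(\beta, \phi, \zeta; y)}{\partial \eta\, \partial \eta'} \right) \right\}_{\zeta = 0} Z\, \Omega^{*} \right) \right], \tag{7}
\]

where $L_0 = L(\beta, \phi, \zeta = 0; y)$, which denotes the likelihood for the ordinary GLM, $\mathrm{tr}(A)$ denotes the trace of matrix $A$, and $\Omega^{*} = I_n \otimes \Omega$, the Kronecker product of $I_n$ and $\Omega$. The second term in (7) involves first and second derivative calculations. Thus, the approximation (7) is tractable, since the first and second derivatives of $l(\beta, \phi, \zeta \mid y)$ are easily obtained as follows, with $\psi(\cdot)$ playing the role of the function $b(\cdot)$ in (2) and $a(\phi) = \phi$:

\[
\frac{\partial l(\beta, \phi, \zeta \mid y)}{\partial \eta} = \left\{ y - \frac{\partial \psi(h(\eta))}{\partial h(\eta)} \right\} \frac{1}{\phi} \frac{\partial h(\eta)}{\partial \eta},
\]

\[
\frac{\partial^2 l(\beta, \phi, \zeta \mid y)}{\partial \eta\, \partial \eta'} = \left\{ y - \frac{\partial \psi(h(\eta))}{\partial h(\eta)} \right\} \frac{1}{\phi} \frac{\partial^2 h(\eta)}{\partial \eta\, \partial \eta'} - \frac{1}{\phi} \frac{\partial^2 \psi(h(\eta))}{\partial h(\eta)\, \partial h(\eta)'} \frac{\partial h(\eta)}{\partial \eta} \frac{\partial h(\eta)}{\partial \eta'}.
\]

Then, in general, the approximation $\tilde{L}(\beta, \phi, \Omega; y)$ may be expressed as

\[
L_0 \left[ 1 + \frac{1}{2\phi} \left\{ \sum_{k=1}^{q} \sum_{i=1}^{n} \sigma_{kk} B^{(1)}_{i,k} + 2 \sum_{k=1}^{q-1} \sum_{m=k+1}^{q} \sum_{i=1}^{n} \sigma_{mk} B^{(2)}_{i,m,k} \right\} \right], \tag{8}
\]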
where $B^{(1)}_{i,k}$ and $B^{(2)}_{i,m,k}$ are functions of $\beta$ related to the response variable $y$, the fixed effects predictors $X$, and the random effects predictors $Z$, and vary for particular GLMMs. In detail, substituting (4), the approximation (8) may be written as

\[
L_0 \left[ 1 + \frac{1}{2\phi} \left\{ \sum_{k=1}^{q} \lambda_k^2 \left( 1 + \sum_{s=1}^{k-1} \gamma_{ks}^2 \right) \sum_{i=1}^{n} B^{(1)}_{i,k} + 2 \sum_{k=1}^{q-1} \sum_{m=k+1}^{q} \lambda_k \lambda_m \left( \gamma_{mk} + \sum_{s=1}^{k-1} \gamma_{ks}\gamma_{ms} \right) \sum_{i=1}^{n} B^{(2)}_{i,m,k} \right\} \right]. \tag{9}
\]

This gives a general, analytically tractable form for GLMMs, which simplifies the subsequent computation. The general result can be applied in a straightforward manner to any particular special case (e.g., logistic regression, Poisson, loglinear models, etc.). The detailed marginal distributions for normal linear, logistic regression and Poisson models are provided in the Appendix.

We can obtain a simpler approximation to (9) under the assumption that the elements of the random effects covariance are small enough so that the approximation $\exp(\sigma) \approx 1 + \sigma$ is warranted. In this case, the approximation becomes

\[
L_0 \exp\left[ \frac{1}{2\phi} \left\{ \sum_{k=1}^{q} \sum_{i=1}^{n} \sigma_{kk} B^{(1)}_{i,k} + 2 \sum_{k=1}^{q-1} \sum_{m=k+1}^{q} \sum_{i=1}^{n} \sigma_{mk} B^{(2)}_{i,m,k} \right\} \right]. \tag{10}
\]

This expression is simpler to calculate rapidly, so may have advantages in certain cases.

3 Posterior Computation

3.1 General Strategies

Relying on the approximations proposed in Sect. 2.4 only when needed to approximate marginal likelihoods integrating out random effects, this section describes the steps involved in the Cai and Dunson (2006) SSVS algorithm. For binomial and Poisson likelihoods, the scale or dispersion parameter is $\phi = 1$. For normal linear models, $\phi$ is $\sigma^2$, and we follow common practice in choosing a gamma prior, $G(c_0, d_0)$, for $\sigma^{-2}$.

The SSVS algorithm iteratively samples from the full conditional distributions of each of the parameters. For $\beta$, $\lambda$ and $\gamma$, these posteriors will have a mixture structure consisting of a point mass at 0 and nonconjugate distributions. In calculating the point mass probabilities, we rely on the approximation described in Sect. 2.4. To sample from the nonconjugate distribution, we use adaptive rejection Metropolis sampling (Gilks et al., 1995, 1997).

In general, if a parameter $\theta$ has a mixture prior of the form $\pi(\theta) = \pi_0\, 1(\theta = 0) + (1 - \pi_0)\, 1(\theta \ne 0)\, p(\theta)$, and the likelihood is nonconjugate (e.g., the GLMM with logit link and log link), directly sampling $\theta$ from its full conditional distribution is rather difficult due to the intractable marginal integral for $\theta$ in calculating $\hat{\pi}$. To
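sample $\theta$ more efficiently, one could introduce two latent variables $\delta$ and $\tilde{\theta}$ which are linked to $\theta$ as $\theta = (1 - \delta)\tilde{\theta}$, where $\delta \sim \mathrm{Bernoulli}(\pi_0)$ and $\tilde{\theta} \sim p(\tilde{\theta})$. Thus, one can sample $\theta$ through the following steps:

• Update $\delta$ from its full conditional distribution $\mathrm{Bernoulli}(\tilde{\pi})$, where
\[
\tilde{\pi} = \frac{\pi_0}{\pi_0 + (1 - \pi_0)\, L(\theta = \tilde{\theta}, \Theta) / L(\theta = 0, \Theta)},
\]
with $L(\cdot)$ denoting the likelihood and $\Theta$ the other parameters.

• Update $\tilde{\theta}$ for $\theta$ from its full conditional distribution, which is proportional to $L(\tilde{\theta}, \Theta)\, p(\tilde{\theta})$, if $\delta = 0$, and let $\theta = 0$ otherwise.

Let $\delta_{1,k}$ denote an indicator variable which is one if the $k$th random effect is excluded ($H_{0k}$) and zero if the random effect is included ($H_{1k}$). Then, it is clear that the prior specification in (5) can be induced through letting $\lambda_k = (1 - \delta_{1,k})\tilde{\lambda}_k$, where $\delta_{1,k} \overset{\mathrm{ind}}{\sim} \mathrm{Bernoulli}(\pi_{1,k0})$ and $\tilde{\lambda}_k \overset{\mathrm{ind}}{\sim} N^{+}(\mu_{1,k0}, \sigma^2_{1,k0})$, with $N^{+}(\mu, \sigma^2)$ denoting the $N(\mu, \sigma^2)$ distribution truncated to fall within $\mathbb{R}^{+}$. Similarly, the priors for $\beta_v$ and $\gamma_{mk}$ can be induced through the specifications

\[
\beta_v = (1 - \delta_{2,v})\tilde{\beta}_v, \quad \delta_{2,v} \overset{\mathrm{ind}}{\sim} \mathrm{Bernoulli}(\pi_{2,v0}), \quad \tilde{\beta}_v \overset{\mathrm{ind}}{\sim} N(\mu_{2,v0}, \sigma^2_{2,v0}),
\]

\[
\gamma_{mk} = (1 - \delta_{3,mk})\tilde{\gamma}_{mk}, \quad \delta_{3,mk} \overset{\mathrm{ind}}{\sim} \mathrm{Bernoulli}(\pi_{3,mk,0}), \quad \tilde{\gamma}_{mk} \overset{\mathrm{ind}}{\sim} N(\mu_{3,mk,0}, \sigma^2_{3,mk,0}).
\]

In each of these cases, we simply induce a mixture prior for the coefficient by multiplying one minus a latent indicator that the coefficient equals zero by the latent value the coefficient takes when it is nonzero.

3.2 Updating Parameters

Based on the preceding settings, the SSVS algorithm alternates between steps for updating each of the unknowns as follows:

Step 1: Update $\tilde{\lambda}_k$. We first sample $\delta_{1,k}$ from its full conditional posterior distribution, $\mathrm{Bernoulli}(\tilde{\pi}_{1,k})$, where

\[
\tilde{\pi}_{1,k} = \frac{\pi_{1,k0}}{\pi_{1,k0} + (1 - \pi_{1,k0})\, C_{1,k}}, \qquad
C_{1,k} = \frac{\tilde{L}(\beta, \lambda_k = \tilde{\lambda}_k, \lambda_{(-k)}, \gamma, \phi; y)}{\tilde{L}(\beta, \lambda_k = 0, \lambda_{(-k)}, \gamma, \phi; y)},
\]

and $\lambda_{(-k)} = (\lambda_1, \ldots, \lambda_{k-1}, \lambda_{k+1}, \ldots, \lambda_q)'$. If $\delta_{1,k} = 1$, then let $\lambda_k = 0$ and exclude the $k$th random effect. Otherwise, we sample $\tilde{\lambda}_k$ for $\lambda_k$ from the conditional posterior given inclusion, which is proportional to $1(\tilde{\lambda}_k > 0)\, \tilde{L}(\beta, \tilde{\lambda}_k, \lambda_{(-k)}, \gamma, \phi; y)\, N(\tilde{\lambda}_k; \mu_{1,k0}, \sigma^2_{1,k0})$.

Step 2: Update $\tilde{\beta}_v$. We first sample $\delta_{2,v}$ from its full conditional posterior distribution, $\mathrm{Bernoulli}(\tilde{\pi}_{2,v})$, where

\[
\tilde{\pi}_{2,v} = \frac{\pi_{2,v0}}{\pi_{2,v0} + (1 - \pi_{2,v0})\, C_{2,v}},
\]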
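\[
C_{2,v} = \frac{\tilde{L}(\tilde{\beta}_v, \beta_{(-v)}, \lambda, \gamma, \phi; y)}{\tilde{L}(\tilde{\beta}_v = 0, \beta_{(-v)}, \lambda, \gamma, \phi; y)},
\]

and $\beta_{(-v)} = (\beta_1, \ldots, \beta_{v-1}, \beta_{v+1}, \ldots, \beta_p)'$. If $\delta_{2,v} = 1$, then let $\beta_v = 0$. Otherwise, we sample $\tilde{\beta}_v$ for $\beta_v$ from the conditional posterior given inclusion, which is proportional to $\tilde{L}(\tilde{\beta}_v, \beta_{(-v)}, \lambda, \gamma, \phi; y)\, N(\tilde{\beta}_v; \mu_{2,v0}, \sigma^2_{2,v0})$.

Step 3: Update $\tilde{\gamma}_{mk}$ ($m > k$). We first sample $\delta_{3,mk}$ from its full conditional posterior distribution, $\mathrm{Bernoulli}(\tilde{\pi}_{3,mk})$, where

\[
\tilde{\pi}_{3,mk} = \frac{\pi_{3,mk,0}}{\pi_{3,mk,0} + (1 - \pi_{3,mk,0})\, C_{3,mk}}, \qquad
C_{3,mk} = \frac{\tilde{L}(\beta, \lambda, \tilde{\gamma}_{mk}, \gamma_{(-mk)}, \phi; y)}{\tilde{L}(\beta, \lambda, \tilde{\gamma}_{mk} = 0, \gamma_{(-mk)}, \phi; y)},
\]

and $\gamma_{(-mk)}$ denotes the vector $\gamma$ with the element $\gamma_{mk}$ removed. If $\delta_{3,mk} = 1$, then let $\gamma_{mk} = 0$. Otherwise, we sample $\tilde{\gamma}_{mk}$ for $\gamma_{mk}$ from the conditional posterior given inclusion, which is proportional to $\tilde{L}(\beta, \lambda, \tilde{\gamma}_{mk}, \gamma_{(-mk)}, \phi; y)\, N(\tilde{\gamma}_{mk}; \mu_{3,mk,0}, \sigma^2_{3,mk,0})$. However, if $\lambda_m = 0$ or $\lambda_k = 0$, then $\gamma_{mk} = 0$ according to its constraint related to $\lambda$.

Step 4: Update $\sigma^{-2}$. In the case of the identity link, $\phi = \sigma^2$. We sample $\sigma^{-2}$ from its full conditional distribution, which is proportional to $G(\sigma^{-2}; c_0, d_0)\, \tilde{L}(\beta, \lambda, \gamma, \sigma^{-2}; y)$.

Samples from the joint posterior distribution of the parameters are generated by repeating these steps for a large number of iterations after apparent convergence. In general, there are no explicit forms for the full conditionals of the parameters based on the proposed approximation. However, when (10) holds, more explicit full conditional posterior distributions for $\tilde{\lambda}_k$, $\tilde{\gamma}_{mk}$ and $\sigma^{-2}$ can be derived from the joint posterior distribution as follows:

• Update $\tilde{\lambda}_k$, $k = 1, \ldots, q$, from its full conditional distribution, which is proportional to $1(\tilde{\lambda}_k > 0)\, C_{1,k}\, N(\tilde{\lambda}_k; \mu_{1,k0}, \sigma^2_{1,k0})$, where
\[
C_{1,k} = \exp\left[ \frac{\tilde{\lambda}_k}{2\phi} \sum_{i=1}^{n} \left\{ \tilde{\lambda}_k \left( 1 + \sum_{s=1}^{k-1} \gamma_{ks}^2 \right) B^{(1)}_{i,k} + 2 \sum_{t=1,\, t \ne k}^{q} \lambda_t \left( \gamma_{w(t,k)} + \sum_{s=1}^{r-1} \gamma_{ks}\gamma_{ts} \right) B^{(2)}_{i,w(t,k)} \right\} \right],
\]
with $r = \min(t, k)$, and where $\gamma_{w(t,k)}$ equals $\gamma_{kt}$ if $t < k$ and $\gamma_{tk}$ if $t > k$, with $B^{(2)}_{i,w(t,k)}$ indexed in the same way.

• Update $\tilde{\gamma}_{mk}$, $m > k$, from its full conditional distribution, which is proportional to
\[
\exp\left\{ \frac{\lambda_m \tilde{\gamma}_{mk}}{\phi} \sum_{i=1}^{n} \left( \frac{1}{2} \lambda_m \tilde{\gamma}_{mk} B^{(1)}_{i,m} + \sum_{t=k,\, t \ne m}^{q} \lambda_t \gamma_{tk} B^{(2)}_{i,w(t,m)} \right) \right\} N(\tilde{\gamma}_{mk}; \mu_{3,mk,0}, \sigma^2_{3,mk,0}),
\]
with $\gamma_{kk} = 1$.

• Update $\sigma^{-2}$, if $\phi = \sigma^2$, from the full conditional distribution $G\left(c_0,\ d_0 - \log\{\tilde{L}(\beta, \lambda, \gamma, \sigma^{-2}; y) / L_0\}\right)$.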
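The point-mass step that Steps 1–3 share can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the two log-likelihood values are taken as given inputs (in the chapter they would come from the approximation (9) or (10)). The sketch evaluates $\tilde{\pi} = \pi_0 / \{\pi_0 + (1 - \pi_0) C\}$ on the log scale, which avoids overflow when the likelihood ratio $C$ is very large or very small, and assumes $0 < \pi_0 < 1$.

```python
import math
import random


def exclusion_probability(pi0, loglik_current, loglik_zero):
    """Conditional probability that the coefficient is set to zero.

    pi0            : prior probability of exclusion, strictly between 0 and 1
    loglik_current : log-likelihood with the coefficient at its latent value
    loglik_zero    : log-likelihood with the coefficient set to zero
    Returns pi0 / (pi0 + (1 - pi0) * C), where log C = loglik_current - loglik_zero.
    """
    # Log odds of inclusion versus exclusion.
    log_odds = math.log1p(-pi0) - math.log(pi0) + (loglik_current - loglik_zero)
    # 1 / (1 + exp(log_odds)), written so that exp() never overflows.
    if log_odds > 0:
        e = math.exp(-log_odds)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(log_odds))


def sample_indicator(pi0, loglik_current, loglik_zero, rng):
    """Draw delta ~ Bernoulli(pi_tilde); delta = 1 means the coefficient is excluded."""
    p = exclusion_probability(pi0, loglik_current, loglik_zero)
    return 1 if rng.random() < p else 0


# Hypothetical numbers: a likelihood ratio of C = 3 in favour of inclusion.
rng = random.Random(1)
pi_tilde = exclusion_probability(0.5, math.log(3.0), 0.0)
delta = sample_indicator(0.5, math.log(3.0), 0.0, rng)
theta_tilde = 0.8                      # latent nonzero value, chosen arbitrarily
theta = (1 - delta) * theta_tilde      # theta = (1 - delta) * theta_tilde
print(pi_tilde, delta, theta)
```

With $\pi_0 = 0.5$ and $C = 3$, the exclusion probability is $0.5 / (0.5 + 0.5 \times 3) = 0.25$. When the indicator comes out as zero, the coefficient keeps its latent value and the second half of each step (sampling the latent value from its conditional posterior) would follow.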
By varying the elements of $\lambda$, $\beta$ and $\gamma$ that are assigned 0 values, the algorithm effectively generates samples from the posterior distribution of $M$. As in SSVS algorithms for linear regression, we do not visit all the possible models in $\mathcal{M}$, since this number is typically enormous. Instead, by stochastically making local changes to the model based on (approximated) conditional model probabilities, we tend to visit models with relatively high posterior probability. However, for very large model spaces, there is no guarantee that we will visit the best model in $\mathcal{M}$. In addition, there may be a large number of models which have similar posterior probability. Hence, inferences are often based on marginal posterior probabilities of excluding a particular predictor from the fixed and/or random effects components.

3.3 Calculation of Quantities

Posterior model probabilities can be estimated by averaging indicator variables across iterations collected after apparent convergence. For example, to estimate the posterior probability of the $k$th random effect being excluded, one can simply add up the number of iterations for which $\lambda_k = 0$ and divide by the total number of iterations. An alternative method is to use a Rao–Blackwell estimator

\[
\widehat{\Pr}(\lambda_k = 0 \mid \text{data}) = \frac{1}{S} \sum_{s=1}^{S} \tilde{\pi}^{(s)}_{1,k},
\]

where $\tilde{\pi}^{(s)}_{1,k}$ is the value of $\tilde{\pi}_{1,k}$ at iteration $s$, for $s = 1, \ldots, S$. This estimator is potentially more efficient. The same approach can be used to calculate posterior probabilities of excluding predictors from the fixed effect component.

To estimate the posterior probability that two random effects are uncorrelated given that they are both in the model (i.e., $\sigma_{mk} = 0$), one can use the following estimator:

\[
\widehat{\Pr}(\sigma_{mk} = 0 \mid \lambda_m > 0, \lambda_k > 0, \text{data}) = \frac{\sum_{s:\, \lambda_m^{(s)} > 0,\ \lambda_k^{(s)} > 0} 1\{\rho(\zeta_{im}, \zeta_{ik}; \gamma^{(s)}) = 0\}}{\sum_{s=1}^{S} 1(\lambda_m^{(s)} > 0,\ \lambda_k^{(s)} > 0)},
\]

so that we calculate the proportion of samples for which the random effects are uncorrelated from among the samples for which both random effects are in the model.

Note that, in allowing the elements of $\Omega$ to have values exactly equal to zero, we obtain a shrinkage estimator of the random effects covariance. This estimator should have lower variance than typical estimators for the random effects covariance, particularly when the number of candidate random effects is moderate to large. However, the theoretical properties of shrinkage estimators of this form remain to be established.

It is important to note that typical ideas of MCMC convergence cannot be realistically applied when the MCMC algorithm is required to simultaneously search over a very high-dimensional model space, while also obtaining draws from the posterior for the parameters within each model. Simply and honestly put, when there are hundreds of thousands or even millions of models under consideration, we have no hope whatsoever of obtaining proper convergence, in the sense that the MCMC draws can be interpreted as samples from the target distribution corresponding to the joint posterior distribution. Typical implementations of SSVS will visit a very small fraction of the models in the list if the number of models in the list is very large. Hence, we are attempting to estimate posterior model probabilities for a large number of models that have not even been visited. Nonetheless, SSVS algorithms tend to rapidly identify good models, and marginal inclusion probabilities for the predictors tend to be high for important predictors and robust across multiple chains started from different locations in $\mathcal{M}$. This warning is just to note that the Bayesian approach is very useful in this context, but does not magically provide a perfect solution to the high-dimensional model uncertainty problem.

4 Simulation Examples

4.1 Simulation Setup

As a proof of concept to assess the behavior of the approach, we carried out a simulation study. Because the SSVS approach is computationally expensive even for a single data set, it was not feasible to run a full simulation study to assess frequentist operating characteristics, such as type I error rates, power, bias in parameter estimation, and efficiency. Instead, we ran a small number of simulations under each of a variety of cases involving different random effects covariance structures from the GLMM with identity link, logit link, and log link.

We considered 100 subjects, each with six observations. The numbers of candidate predictors in the two components, $p$ and $q$, are chosen as $p = q = 3$, 5 or 8. The covariates are $x_{ij} = (x_{ij1}, \ldots, x_{ijp})'$, where $x_{ij1} = 1$ and $x_{ijk} \sim \mathrm{Bernoulli}(0.5)$, for $i = 1, \ldots, 100$, $j = 1, \ldots, 6$, $k = 2, \ldots, p$. Let $z_{ij} = x_{ij}$, $\beta_{(-2)} \sim N(0, I)$, $\beta_2 = 0$, and $\zeta_i = (\zeta_{i1}, \ldots, \zeta_{iq})' \sim N(0, \Omega)$, where we designed $\Omega = \Lambda\Gamma\Gamma'\Lambda$ with three different structures:

(1) $\lambda = (1.2, 0.4, 0.6)'$ and $\gamma = (0.4, 0.5, 0.3)'$, implying that all three random effects are included in the model.

(2) $\lambda = (0.2, 0, 0.7, 0, 0.5)'$ and $\gamma = (0, 0.4, 0, 0, 0, 0, 0.8, 0, 0.1, 0)'$, implying that the second and the fourth random effects are excluded from the model.

(3) $\lambda = (0.5, 0.8, 0.9, 0.2, 0.1, 0.1, 0.6, 0)'$ and $\gamma = (0.3, 0.6, 0.5, 0.4, 0.2, 0.1, 0.2, 0.3, 0.4, 0.3, 0.6, 0.1, 0.2, 0.1, 0.8, 0.3, 0.4, 0.8, 0.6, 0.3, 0.2, 0, 0, 0, 0, 0, 0, 0)'$, implying that the last random effect is excluded from the model.

The corresponding covariance matrices for random effects are shown in the first row in Fig. 1. For the GLMM with identity link, $y_{ij} \sim N(x_{ij}'\beta + z_{ij}'\zeta_i, \sigma^{-2})$ with $\sigma^{-2} = 2$. For the GLMM with logit link, $y_{ij} \sim \mathrm{Bernoulli}(\pi_{ij})$ with $\mathrm{logit}(\pi_{ij}) = x_{ij}'\beta + z_{ij}'\zeta_i$. For the GLMM with log link, $y_{ij} \sim \mathrm{Poisson}(\lambda_{ij})$ with $\log(\lambda_{ij}) = x_{ij}'\beta + z_{ij}'\zeta_i$.

We chose the prior distribution for $\lambda_k$ as ZI-N$^+(\lambda_k; \pi_{1,k0}, 0, 10)$. The prior distributions for the elements of $\gamma$ are chosen to be mixture priors, ZI-N$(\gamma_{mk}; \pi_{3,mk,0}, 0, 1)$, with the constraint related to $\lambda$. A mixture prior distribution for $\beta_v$ is chosen as ZI-N$(\beta_v; \pi_{2,v0}, 0, 10)$. A diffuse prior for the parameter $\sigma^{-2}$ is chosen to be $G(0.08, 0.08)$. To study the effects of the prior probabilities of $\lambda_k = 0$, $\beta_v = 0$ and $\gamma_{mk} = 0$ on the estimated posterior probabilities, we consider 0.2, 0.5 and 0.8 for these prior probabilities.

[Fig. 1 Image plots of the true and estimated random effects covariance matrices for simulated data under the identity link, logit link and log link with the number of candidate random effect predictors being 3, 5 and 8. The darker the color appears, the larger the value of the element is, with the white color corresponding to zero. Panels: True, Identity link, Logistic link, Log link; for q = 3, q = 5 and q = 8.]

For each simulated data set and choice of prior, we ran the Gibbs sampling algorithm described in Sect. 3 for 20,000 iterations after a 2,000 iteration burn-in. The diagnostic tests were carried out by using Geweke (1992) and Raftery and Lewis (1992), which showed rapid convergence and efficient mixing. Note that this apparent good performance in terms of convergence and mixing is somewhat counter to our warning at the end of Sect. 3. A sample of size 4,000 was obtained by thinning the MCMC chain by a factor of 5. For each simulated data set, we calculated (a) the posterior probabilities for the possible submodels under each link; and (b) the estimated posterior means and the 95% credible intervals for each of the parameters.

We ran five simulations with different seeds for each case. Sensitivity of the results to the prior specification was assessed by repeating the analyses with the following different hyperparameters: (a) priors with variance/2; (b) priors with variance × 2; and (c) priors with moderately different means. Although we do not show details, inferences for all models are robust to simulated data sets with different seeds and to the prior specification. The ranges in Table 2 illustrate this robustness.

4.2 Results

Figure 1 displays image plots of the true covariance matrices for random effects corresponding to the simulated data and the estimated covariance matrices under each link. In particular, the final column contains the true covariance matrix, with white corresponding to zero or low values of the covariance parameter and darker shades corresponding to high values. The other columns show the estimated posterior means in the simulation examples under log, logistic and identity link functions for Poisson, binomial and normal data, respectively. It is clear from these plots that we obtained an accurate estimate of the covariance matrix in each case. In addition, the estimates from the sensitivity analyses that varied the prior inclusion probabilities were similar, as were results for each of the replicates. These results provide support for our approach as a method for obtaining an accurate shrinkage estimator of the random effects covariance matrix.

Focusing on the results for the logistic mixed effects regression simulations, Table 1 presents posterior summaries of the fixed effect regression parameters and random effects covariance parameters in the second simulation case. It is clear that the posterior mean values are very close to the true values in each case, and that the true values are included in the 95% HPD intervals. Again, we obtained similar results in sensitivity analyses and other simulation cases.

Table 1 Posterior estimates of the parameters for the second simulation with the logit link

Parameter   True value   Mean     SD      95% HPD interval
β1          −0.02         0.010   0.120   (−0.231, 0.273)
β2           0           −0.001   0.025   (−0.002, 0.001)
β3          −0.60        −0.611   0.166   (−0.934, −0.284)
β4           1.48         1.448   0.142   (1.170, 1.725)
β5          −0.81        −0.792   0.167   (−1.120, −0.467)
σ11          0.04         0.037   0.011   (0.013, 0.059)
σ22          0            0.007   0.021   (0, 0)
σ33          0.57         0.578   0.120   (0.348, 0.820)
σ44          0            0.002   0.024   (0, 0.001)
σ55          0.41         0.396   0.097   (0.208, 0.587)
σ21          0            0.006   0.019   (0, 0)
σ31          0.06         0.052   0.016   (0.020, 0.084)
σ32          0            0.002   0.026   (0, 0)
σ41          0            0.001   0.030   (0, 0.001)
σ42          0            0.003   0.022   (0, 0)
σ43          0            0.002   0.027   (0, 0)
σ51          0.08         0.073   0.028   (0.023, 0.122)
σ52          0            0.000   0.024   (0, 0)
σ53          0.15         0.159   0.035   (0.084, 0.223)
σ54          0            0.001   0.033   (0, 0)

Table 2 Estimated model posterior probabilities in simulation studies under the logit link. Submodels with posterior probability less than 0.02 are not displayed. Each entry gives the posterior probability followed by its range in parentheses; [a] marks the true model

Model                                 π1,k0 = 0.2             π1,k0 = 0.5             π1,k0 = 0.8
Simulation 1
x1, x3, z1, z2, z3 [a]                0.833 (0.814, 0.865)    0.796 (0.771, 0.828)    0.748 (0.719, 0.782)
x1, x3, z1, z3                        0.085 (0.054, 0.116)    0.098 (0.083, 0.115)    0.116 (0.098, 0.141)
x1, x3, z1, z2                        0.066 (0.045, 0.092)    0.070 (0.046, 0.095)    0.082 (0.059, 0.106)
Simulation 2
x1, x3, x4, x5, z1, z3, z5 [a]        0.437 (0.421, 0.544)    0.519 (0.483, 0.558)    0.568 (0.543, 0.591)
x3, x4, x5, z1, ..., z5               0.106 (0.075, 0.140)    0.095 (0.068, 0.131)    0.084 (0.048, 0.112)
x1, x3, x4, x5, z3, z5                0.103 (0.076, 0.135)    0.139 (0.110, 0.180)    0.177 (0.138, 0.218)
x3, x4, x5, z1, z3, z4, z5            0.024 (0.013, 0.037)    0.039 (0.026, 0.054)    0.052 (0.037, 0.078)
x2, ..., x5, z1, ..., z4              0.022 (0.010, 0.035)    0.037 (0.020, 0.053)    0.045 (0.024, 0.066)
Simulation 3
x1, x3, ..., x8, z1, ..., z7 [a]      0.547 (0.529, 0.577)    0.581 (0.550, 0.617)    0.633 (0.602, 0.658)
x1, x3, ..., x8, z1, ..., z8          0.090 (0.079, 0.106)    0.075 (0.065, 0.089)    0.064 (0.053, 0.075)
x1, x3, ..., x7, z1, ..., z4, z6, z7  0.051 (0.043, 0.059)    0.074 (0.066, 0.085)    0.078 (0.070, 0.089)
x1, x3, ..., x7, z1, ..., z5, z7      0.032 (0.027, 0.041)    0.034 (0.024, 0.042)    0.037 (0.030, 0.041)
x1, x3, ..., x7, z1, ..., z4, z7      0.025 (0.013, 0.039)    0.027 (0.018, 0.037)    0.032 (0.023, 0.044)
x1, ..., x7, z1, ..., z3, z7          0.023 (0.011, 0.033)    0.025 (0.014, 0.035)    0.028 (0.016, 0.040)

Table 2 shows the estimated posterior model probabilities for the preferred models under the three different simulation cases for a range of values for the prior probability of excluding a predictor. Although the estimated posterior probabilities varied somewhat as the prior exclusion probabilities varied, the rankings of the models were robust. In each case, the true model was the dominant model, having substantially higher estimated posterior probability than the second best model. The ranges shown in parentheses in Table 2 represent the range in the estimated posterior model probabilities across the five different simulation replicates and for different choices of hyperparameters. Figure 2 presents boxplots of the samples of parameters for the second simulation under each link. The true values of all parameters fall in the 95% credible intervals.

[Fig. 2 Boxplots of the samples of parameters for the second simulation under each link. The solid horizontal lines indicate the true values. Each panel compares the identity, logistic and log links.]

When the number of models under consideration is large, the posterior probability assigned to any one model is typically not close to 1. This property is observed in Table 2. Although the highest posterior probability is assigned to the correct model in each case, this probability is never close to one. For example, in simulation case 3, the probability assigned to the true model is only slightly above 0.5. This does not suggest that our approach for estimating posterior probabilities is poor, but is instead a general feature of model selection in large model spaces. For this reason, in performing inferences on whether a given predictor should be included in the fixed and/or random effects component, it is more reliable to rely on marginal inclusion probabilities than on whether that predictor is included in
the highest posterior probability model. Figure 3 shows plots of posterior marginal inclusion probabilities of each of the predictors in terms of fixed and random effects under the logistic model with different prior probabilities (0.2, 0.5, and 0.8) of exclusion of each predictor and different numbers (3, 5, and 8) of candidate predictors. It is clear that although the marginal inclusion probabilities for each of the predictors change slightly according to the different choices of prior probabilities of exclusion, the inclusion of the predictors is consistent with the designed models. For example, the third row presents the marginal inclusion probabilities of eight fixed and random effects predictors with different prior probabilities of exclusion (0.2, 0.5, and 0.8). Obviously, the predictors for the second fixed effect and for the eighth random effect are less likely to be included in the model since the corresponding marginal inclusion probabilities are fairly low. We also calculated marginal inclusion probabilities of predictors for the GLMM with log link and identity link, which show results consistent with the true models.

Fig. 3 Plots of marginal inclusion probabilities for each of the predictors in terms of fixed and random effects under the logistic model with different prior probabilities (0.2, 0.5, and 0.8) of exclusion of each predictor and the different number (3, 5, and 8) of candidate predictors. Panel columns correspond to π(λ = 0) = 0.2, 0.5, and 0.8; panel rows correspond to p = q = 3, p = q = 5, and p = q = 8; the vertical axis is the marginal probability of inclusion. The short horizontal lines denote the posterior marginal inclusion probabilities for each of the predictors. The vertical lines show the ranges of marginal inclusion probabilities on average

4.3 Assessment of Accuracy of the Approximation

Although the proposed approach appears to perform well at model selection and parameter estimation based on the results in the simulation examples, it is important to assess the accuracy of the proposed approximation to the marginal likelihood. If the approximation is not accurate, then it may be the case that our approach is not producing accurate estimates of the posterior model probabilities. Even if we do a good job in estimation and model selection, it does not necessarily imply accuracy in marginal likelihood approximation.

Because it tends to be the case that Taylor series-based approximations to the marginal likelihood are less accurate for binary response logistic mixed effects models than for log-linear and linear cases, we focus on the logistic special case in assessing accuracy. In addition, we focus on the modest sample size of 100 subjects, each of which has six observations, and we let p = q = 3. We choose different covariance structures with variance components proportional to $\lambda^2$ from small to large, which are (1) (0.01, 0.02, 0.005); (2) (1.2, 0.4, 0.6); (3) (2.8, 4.3, 3.5); (4) (27.5, 20.6, 35.1); (5) (50.6, 30.8, 60.3), with $\gamma$ kept fixed at (0.4, 0.5, 0.3).

An essentially exact value for the marginal likelihood can be obtained by brute force numerical integration, so we use that approach as the reference in comparing several approximation approaches. In particular, we estimate the marginal likelihoods of the simulated data using the Laplace approximation, importance sampling, Chib's marginal likelihood method, and our proposed approach. Table 3 shows the comparison of log marginal likelihoods calculated by the different methods. Note that all of the approximations tend to perform better when the random effects variance is small, with the accuracy decreasing for large random effects variances. For small variances, the proposed approach was slightly more accurate than any of the competitors. In addition, the proposed approach was closer to the truth than the Laplace approximation in all the cases.

Note that the approximation to the marginal likelihood is only used in calculating the conditional posterior probabilities that a coefficient is equal to zero. Hence, when there is clear evidence that the predictor should be included, some inaccuracy in the marginal likelihood approximation has no impact on inferences. Therefore, under our approach the performance for small values of the random effect variances is most important. However, given the improvement seen relative to the widely-used Laplace approximation for all values of the random effects variance, the proposed approximation should also be useful in frequentist inferences and other settings.

Table 3 Comparison of approximated log marginal likelihoods for the GLMM with the logit link

λ²                   Chib's     Exact      I. Sampling   Laplace    Proposed
(0.01, 0.02, 0.005)  −92.47     −89.19     −92.09        −91.74     −91.59
(1.2, 0.4, 0.6)      −129.23    −125.35    −128.68       −131.05    −128.15
(2.8, 4.3, 3.5)      −147.37    −140.80    −145.01       −148.39    −148.22
(27.5, 20.6, 35.1)   −94.83     −88.93     −93.62        −97.92     −96.47
(50.6, 30.8, 60.3)   −100.56    −90.97     −98.81        −104.73    −101.86

5 Time-to-Pregnancy Application

5.1 Data and Model Selection Problem

Returning to the time to pregnancy (TTP) application introduced in Sect. 1.2, we analyze the Rowland et al. (1992) study to illustrate the proposed approach. This was a retrospective TTP study, with female dental assistants, aged 19–39, contacted and invited to enroll after being randomly selected from a registry. Study investigators enrolled 427 women, who completed a detailed questionnaire on reproductive history, occupational exposures and other factors that may be related to fecundability. As illustration, we focus on the logistic mixed effects discrete hazard model presented in (1), with the candidate predictors including category indicators for age (19–24, 25–29, >30), intercourse frequency per week (≤1, 1–3, 3–4, >4), cigarettes smoked per day (nonsmoker, 1–5, 6–10, 11–15, >15), and the use of oral contraceptives in the cycle prior to beginning the pregnancy attempt (no, yes). Including all the above indicator variables in the order that they are introduced, with the first category level being the reference, we have 14 candidate predictors. We allow each of these predictors to be potentially included in the fixed and/or random effects component. Hence, we have an enormous list of possible models when also allowing uncertainty in whether the random effects correlations are equal to zero.

5.2 Prior Specification, Implementation and Results

In choosing a prior, our goal was to assign high probability to a wide range of plausible values for the regression coefficients and induced covariance matrix without specifying a very high variance prior. Given that all the predictors are indicator variables within a logistic regression model, a prior variance of 20 (given inclusion in the model) for both $\lambda_k$ and $\beta_v$ seemed reasonable. Hence, the prior distribution for $\lambda_k$ is chosen as ZI-N$^+$($\lambda_k \mid \pi_{k0}, 0, 20$), and the prior distribution for $\beta_v$ is chosen as ZI-N($\beta_v \mid \pi_{v0}, 0, 20$). For the elements of $\gamma$, we choose a more informative prior with a variance of one to favor modest levels of correlation by letting ZI-N($\gamma_{mk}; \pi_{3,u0}, 0, 1$), with the constraint on $\lambda$.

In considering applications with continuous predictors, one should either normalize these predictors prior to analysis, or carefully consider the measurement scale of the predictors in choosing the prior variance. How informative the prior is for a particular prior variance is completely dependent on the scale of the measurements. Note that the tendency is to favor smaller models the larger the prior variance. Our choice of 20 is really in the upper range of reasonable values for the prior variance in this context, and our motivation was to favor parsimony.

We ran the MCMC algorithm for 80,000 iterations after a 10,000 iteration burn-in. This chain was a bit longer than is typical for posterior computation in a single GLMM. In general, when the MCMC algorithm is being used to simultaneously
explore a high-dimensional model space, while also estimating posterior model probabilities and posterior distributions of coefficients, the MCMC chain should be dramatically longer than that used in analysis of a single model. The chains passed the Geweke (1992) and Raftery and Lewis (1992) convergence diagnostic tests. Multiple chains with different initial settings passed the Gelman and Rubin (1992) convergence test. We retained every 20th sample for inferences of interest.

As we discussed in Sect. 4.2, when the number of models in the list is enormous, the posterior probability assigned to any one model tends to be quite small, even if that model is the true model. This behavior is expected and provides strong support for Bayesian methods of model averaging and inferences, which avoid relying on interpreting any single selected model as being supported by the data. Frequentist or Bayesian methods for selecting an optimal model tend to ignore the issue that it is effectively impossible to find clear evidence in favor of the true model in a very large list unless the amount of available data is enormous. Hence, it is much more reasonable to focus on marginal inclusion probabilities in gauging importance of predictors when the list of possible models is huge. In the TTP application, the top models had estimated posterior probabilities close to 0.03, with several candidates having similar values.

Table 4 presents the marginal posterior probabilities of including each predictor in the fixed and random effects components under different choices of $\pi_{1,k0}$ and $\pi_{2,v0}$. The overall posterior probability of including age in the fixed effects component can be calculated as the posterior probability that any of the category indicators for age are included. Such overall posterior probabilities are calculated separately for the fixed and random effects components for each of the factors under consideration, including age, intercourse frequency, cigarettes smoked, and recent pill use. The results are shown in Table 4. The posterior probability of including age in the fixed effects component ranges from 0.95 to 0.97 (average = 0.96) depending on the prior. The corresponding ranges for intercourse frequency, cigarettes smoked, and recent pill use are 0.96–0.97 (average = 0.96), 0.92–0.97 (average = 0.94), and 0.87–0.98 (average = 0.92), respectively. Hence, as expected, there is some evidence that age, intercourse frequency, cigarette smoking, and recent pill use are predictive of fecundability on average, with the most evidence for age and intercourse frequency. The age effect is most apparent in women 30+. In addition, the indicators for the highest categories of intercourse frequency (4+ acts/week) and cigarette smoking (15+/day) had the highest posterior probabilities of inclusion.

For the random effects component, the results were somewhat different. The posterior probability of inclusion for recent pill use ranged between 0.40 and 0.53 (average = 0.47), so there is no evidence of heterogeneity in the effect of recent pill use. However, there was some evidence of heterogeneity among women in the effects of each of the other factors. In particular, the posterior probability of including age in the random effects component ranged between 0.87 and 0.92 (average = 0.90), which is suggestive but not clear evidence. There was slightly more evidence of heterogeneity in the effects of intercourse frequency and cigarette smoking, with the posterior probabilities of inclusion for these two factors ranging between 0.90 and 0.93 (average = 0.92) and between 0.91 and 0.95 (average = 0.93), respectively.
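The marginal and overall inclusion probabilities discussed above can be estimated directly from the retained MCMC draws. The following is a minimal Python sketch of that bookkeeping, not the authors' C program: the function names, the toy indicator draws, and the column layout are all assumptions introduced for illustration. Each draw is represented as a tuple of 0/1 flags saying whether a coefficient was nonzero in that iteration.

```python
from typing import List, Sequence

Draw = Sequence[int]  # one retained draw: 1 if the coefficient is nonzero, else 0


def thin(chain: Sequence, burn_in: int, step: int) -> List:
    """Drop the first `burn_in` draws, then keep every `step`-th draw."""
    return list(chain[burn_in::step])


def marginal_inclusion(draws: Sequence[Draw]) -> List[float]:
    """Share of draws in which each individual indicator is included."""
    n = len(draws)
    n_cols = len(draws[0])
    return [sum(d[j] != 0 for d in draws) / n for j in range(n_cols)]


def overall_inclusion(draws: Sequence[Draw], columns: Sequence[int]) -> float:
    """Share of draws in which at least one of `columns` is included."""
    return sum(any(d[j] != 0 for j in columns) for d in draws) / len(draws)


# Hypothetical toy draws for the two age indicators (25-29, 30+).
toy_draws = [(1, 0), (0, 0), (1, 1), (0, 1)]
age_columns = [0, 1]

per_indicator = marginal_inclusion(toy_draws)
age_overall = overall_inclusion(toy_draws, age_columns)
print(per_indicator, age_overall)
```

Because a draw counts toward the overall probability as soon as any one indicator of the factor is active, the overall value can never fall below the largest per-indicator value, which is the pattern seen in the "Overall" rows of Table 4.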
Table 4 Estimated marginal posterior probabilities of including predictors in the fixed and random effects components under different prior probabilities of being zero in the time-to-pregnancy application. Probabilities over 0.9 are written in bold

Posterior probability of inclusion: fixed effect
Predictor                 0.2                     0.5                   0.8
Intercept                 0.90 (0.89,0.94) (a)    0.87 (0.85,0.90)      0.83 (0.81,0.87)
Age
  25–29                   0.83 (0.80,0.85)        0.75 (0.73,0.78)      0.72 (0.69,0.75)
  30+                     0.93 (0.90,0.96)        0.89 (0.86,0.92)      0.87 (0.84,0.90)
  Overall                 0.97 (0.94,0.99)        0.96 (0.94,0.98)      0.95 (0.93,0.97)
Intercourse frequency
  1–3                     0.63 (0.61,0.66)        0.56 (0.54,0.59)      0.53 (0.49,0.56)
  3–4                     0.76 (0.73,0.79)        0.73 (0.71,0.76)      0.68 (0.65,0.71)
  4+                      0.97 (0.93,0.98)        0.94 (0.91,0.96)      0.88 (0.85,0.91)
  Overall                 0.97 (0.94,0.99)        0.96 (0.94,0.98)      0.96 (0.93,0.98)
Cigarettes smoked
  1–5                     0.70 (0.68,0.73)        0.65 (0.62,0.67)      0.56 (0.52,0.59)
  6–10                    0.85 (0.81,0.88)        0.81 (0.77,0.83)      0.74 (0.72,0.78)
  11–15                   0.86 (0.82,0.88)        0.79 (0.76,0.82)      0.70 (0.67,0.74)
  15+                     0.95 (0.92,0.97)        0.92 (0.89,0.94)      0.88 (0.84,0.91)
  Overall                 0.97 (0.94,0.99)        0.94 (0.91,0.98)      0.92 (0.89,0.95)
Recent pill use           0.98 (0.96,0.99)        0.91 (0.87,0.95)      0.87 (0.85,0.91)

Posterior probability of inclusion: random effect
Predictor                 0.2                     0.5                   0.8
Intercept                 0.94 (0.90,0.96)        0.90 (0.86,0.92)      0.85 (0.82,0.87)
Age
  25–29                   0.55 (0.52,0.58)        0.50 (0.46,0.52)      0.43 (0.39,0.46)
  30+                     0.88 (0.84,0.91)        0.86 (0.83,0.90)      0.81 (0.76,0.84)
  Overall                 0.92 (0.89,0.95)        0.90 (0.87,0.92)      0.87 (0.84,0.91)
Intercourse frequency
  1–3                     0.61 (0.56,0.64)        0.56 (0.52,0.59)      0.52 (0.49,0.56)
  3–4                     0.83 (0.78,0.87)        0.78 (0.75,0.82)      0.74 (0.70,0.78)
  4+                      0.54 (0.49,0.57)        0.50 (0.45,0.53)      0.44 (0.40,0.48)
  Overall                 0.93 (0.90,0.96)        0.92 (0.89,0.94)      0.90 (0.86,0.93)
Cigarettes smoked
  1–5                     0.61 (0.59,0.65)        0.55 (0.52,0.59)      0.51 (0.47,0.53)
  6–10                    0.89 (0.85,0.92)        0.85 (0.82,0.88)      0.82 (0.77,0.85)
  11–15                   0.57 (0.53,0.61)        0.52 (0.49,0.55)      0.49 (0.46,0.54)
  15+                     0.69 (0.65,0.75)        0.66 (0.62,0.70)      0.61 (0.58,0.65)
  Overall                 0.95 (0.93,0.98)        0.92 (0.89,0.95)      0.91 (0.87,0.94)
Recent pill use           0.53 (0.48,0.59)        0.48 (0.43,0.51)      0.40 (0.37,0.45)
(a) Range

Table 5 provides the overall posterior summaries of the regression coefficients from our approach compared with the standard GLM with the logit link fitted to the full model. It is clear that there are no systematic differences between our model-averaged Bayesian point and interval estimates for the regression coefficients and the maximum likelihood estimates. In general, the results can differ substantially, particularly when the focus is on inferences, and the frequentist analyst selects the final model based on typical criteria (e.g., using stepwise selection), while the Bayesian uses model averaging. The results in this case, however, are due to the fact that most of the candidate predictors have moderate to high probabilities of being included in the model. We also note that the coefficient for the oldest level of the age variable is larger than for the younger levels, implying that the probability of getting pregnant goes up as age goes up, which seems counterintuitive. The counterintuitive age effect was also apparent in frequentist and more routine Bayesian analyses. One of the difficulties in time to pregnancy studies is that it is impossible to get a group of women of different ages who are at risk of pregnancy and representative of the general population of reproductive age women, particularly in an occupational epidemiology setting. It may be the case that the older dental assistants are representative of a different demographic group having higher fertility, or that some selection effect is present.

Table 5 Overall posterior means and 95% credible intervals of regression coefficients in the time-to-pregnancy application compared with the results from the GLM with the logit link

Effects                          Proposed approach (a)         Standard GLM (b)
Age
  25–29                          0.173 (−0.043, 0.396)         0.177 (−0.068, 0.422)
  30+                            0.358 (0.003, 0.727)          0.354 (−0.029, 0.719)
Intercourse frequency per week
  1–3                            0.095 (−0.144, 0.335)         0.088 (−0.151, 0.327)
  3–4                            0.258 (−0.120, 0.643)         0.265 (−0.129, 0.659)
  4+                             0.877 (0.402, 1.306)          0.885 (0.408, 1.361)
Cigarettes smoked per day
  1–5                            −0.119 (−0.570, 0.313)        −0.126 (−0.736, 0.482)
  6–10                           −0.278 (−0.694, 0.241)        −0.285 (−0.873, 0.303)
  11–15                          −0.425 (−1.239, 0.337)        −0.433 (−1.413, 0.547)
  15+                            −0.686 (−1.640, −0.164)       −0.681 (−1.622, 0.260)
Use of oral contraceptives       −0.923 (−1.544, −0.268)       −0.931 (−1.619, −0.243)
(a) 95% credible interval  (b) 95% confidence interval

We ran extensive sensitivity analyses to evaluate the robustness of the results to the prior specifications by repeating the analyses with the following different hyperparameters: (a) priors with half variance; (b) priors with double variance; (c) priors with moderately different means within the range of the prior expectation. The ranges in Table 4 show the results for all of the different priors.

6 Discussion

This chapter has proposed a Bayesian approach for accounting for uncertainty in selection of fixed and random effects in GLMMs. The approach is very computationally intensive in relying on MCMC to simultaneously explore a very high-dimensional model space and estimate posterior model probabilities and densities for selected predictors. However, given the great deal of time and expense involved in collecting the data, it seems that spending a bit of time in implementing an improved analysis is well worth the effort. A C program is available from the authors upon request.

On the topic of Bayesian methods for GLMM variable selection, there are several areas of substantial interest for future research. The first is default prior selection. It is appealing to have general software available that produces results with good frequentist and Bayesian properties without need for careful thought in prior choice or sensitivity to subjectively chosen hyperparameters. There has been some work done on default prior selection for fitting of a single GLMM, but default prior selection in model uncertainty contexts is a completely different issue. For linear regression subset selection, mixtures of g-priors provide a useful default, but there is a lack of similar priors for GLMMs. In the absence of carefully-justified default priors, one can use the priors proposed in this chapter after normalizing any continuous predictors.

Another important area is the development of simpler and more efficient computational implementations, particularly for cases involving massive numbers of candidate predictors. For example, it may be the case that the proposed approximation to the marginal likelihood can be expanded to marginalize out not only the random effects, but also all the parameters in a particular GLMM. This will certainly result in much more efficient computation, and may be quite appealing if the proposed approximation can be justified as accurate.

References

Albert, J.H. and Chib, S. (1997). Bayesian Test and Model Diagnostics in Conditionally Independent Hierarchical Models. Journal of the American Statistical Association 92, 916–925
Breslow, N.E. and Clayton, D.G. (1993). Approximate Inference in Generalized Linear Mixed Models. Journal of the American Statistical Association 88, 9–25
Cai, B. and Dunson, D.B. (2006). Bayesian Covariance Selection in Generalized Linear Mixed Models. Biometrics 62, 446–457
Cai, B. and Dunson, D.B. (2007). Bayesian Variable Selection in Nonparametric Random Effects Models. Biometrika, under revision
Chen, Z. and Dunson, D.B. (2003). Random Effects Selection in Linear Mixed Models. Biometrics 59, 762–769
Chen, M., Ibrahim, J.G., Shao, Q., and Weiss, R.E. (2003). Prior Elicitation for Model Selection and Estimation in Generalized Linear Mixed Models. Journal of Statistical Planning and Inference 111, 57–76
Chipman, H., George, E.I., and McCulloch, R.E. (2002). Bayesian Treed Models. Machine Learning 48, 299–320
Chipman, H., George, E.I., and McCulloch, R.E. (2003). Bayesian Treed Generalized Linear Models. Bayesian Statistics 7 (J.M. Bernardo, M.J. Bayarri, J.O. Berger, A.P. Dawid, D. Heckerman, A.F.M. Smith, and M. West, eds), Oxford: Oxford University Press, 323–349
Daniels, M.J. and Kass, R.E. (1999). Nonconjugate Bayesian Estimation of Covariance Matrices and Its Use in Hierarchical Models. Journal of the American Statistical Association 94, 1254–1263
Daniels, M.J. and Pourahmadi, M. (2002). Bayesian Analysis of Covariance Matrices and Dynamic Models for Longitudinal Data. Biometrika 89, 553–566
Daniels, M.J. and Zhao, Y.D. (2003). Modelling the Random Effects Covariance Matrix in Longitudinal Data. Statistics in Medicine 22, 1631–1647
Gelman, A. and Rubin, D.B. (1992). Inference from Iterative Simulation using Multiple Sequences. Statistical Science 7, 457–472
George, E.I. and McCulloch, R.E. (1993). Variable Selection via Gibbs Sampling. Journal of the American Statistical Association 88, 881–889
Geweke, J. (1992). Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments. Bayesian Statistics 4 (J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith, eds), Oxford: Oxford University Press, 169–193
Geweke, J. (1996). Variable Selection and Model Comparison in Regression. Bayesian Statistics 5 (J.O. Berger, J.M. Bernardo, A.P. Dawid, and A.F.M. Smith, eds), Oxford: Oxford University Press, 609–620
Gilks, W.R., Best, N.G., and Tan, K.K.C. (1995). Adaptive Rejection Metropolis Sampling within Gibbs Sampling. Applied Statistics 44, 455–472
Gilks, W.R., Neal, R.M., Best, N.G., and Tan, K.K.C. (1997). Corrigendum: Adaptive Rejection Metropolis Sampling. Applied Statistics 46, 541–542
Kinney, S.K. and Dunson, D.B. (2007). Fixed and Random Effects Selection in Linear and Logistic Models. Biometrics 63, 690–698
Laird, N.M. and Ware, J.H. (1982). Random Effects Models for Longitudinal Data. Biometrics 38, 963–974
Liechty, J.C., Liechty, M.W., and Müller, P. (2004). Bayesian Correlation Estimation. Biometrika 91, 1–14
Lin, X. (1997). Variance Component Testing in Generalised Linear Models with Random Effects. Biometrika 84, 309–326
McCulloch, C.E. (1997). Maximum Likelihood Algorithms for Generalized Linear Mixed Models. Journal of the American Statistical Association 92, 162–170
McCulloch, C.E. and Searle, S. (2001). Generalized Linear and Mixed Models. New York: Wiley
McGilchrist, C.A. (1994). Estimation in Generalized Mixed Models. Journal of the Royal Statistical Society B 56, 61–69
Meyer, M.C. and Laud, P.W. (2002). Predictive Variable Selection in Generalized Linear Models. Journal of the American Statistical Association 97, 859–871
Nott, D.J. and Leonte, D. (2004). Sampling Schemes for Bayesian Variable Selection in Generalized Linear Models. Journal of Computational and Graphical Statistics 13, 362–382
Ntzoufras, I., Dellaportas, P., and Forster, J.J. (2003). Bayesian Variable Selection and Link Determination for Generalised Linear Models. Journal of Statistical Planning and Inference 111, 165–180
Raftery, A. (1996). Approximate Bayes Factors and Accounting for Model Uncertainty in Generalized Linear Models. Biometrika 83, 251–266
Raftery, A.E. and Lewis, S. (1992). How Many Iterations in the Gibbs Sampler? Bayesian Statistics 4 (J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith, eds), Oxford: Oxford University Press, 763–773
Raftery, A.E., Madigan, D., and Volinsky, C.T. (1996). Accounting for Model Uncertainty in Survival Analysis Improves Predictive Performance. Bayesian Statistics 5 (J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith, eds), Oxford: Oxford University Press, 323–349
Rowland, A.S., Baird, D.D., Weinberg, C.R., Shore, D.L., Shy, C.M., and Wilcox, A.J. (1992). Reduced Fertility Among Women Employed as Dental Assistants Exposed to High Levels of Nitrous Oxide. The New England Journal of Medicine 327, 993–997
Schall, R. (1991). Estimation in Generalized Linear Mixed Models with Random Effects. Biometrika 78, 719–727
Sinharay, S. and Stern, H.S. (2001). Bayes Factors for Variance Component Testing in Generalized Linear Mixed Models. In Bayesian Methods with Applications to Science, Policy and Official Statistics (ISBA 2000 Proceedings), 507–516
Solomon, P.J. and Cox, D.R. (1992). Nonlinear Component of Variance Models. Biometrika 79, 1–11
Wong, F., Carter, C.K., and Kohn, R. (2003). Efficient Estimation of Covariance Selection Models. Biometrika 90, 809–830
Zeger, S.L. and Karim, M.R. (1991). Generalized Linear Models with Random Effects: A Gibbs Sampling Approach. Journal of the American Statistical Association 86, 79–86
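Appendix

The normal linear mixed model of Laird and Ware (1982) is a special case of a GLMM having $g(\mu_{ij}) = \mu_{ij} = \eta_{ij} = x_{ij}'\beta + z_{ij}'\zeta_i$, $\phi = \sigma^2$ and $b(\theta_{ij}) = \eta_{ij}^2/2$. In this case,

\[
\frac{\partial l(\beta,\phi,\zeta \mid y)}{\partial \eta}\,\frac{\partial l(\beta,\phi,\zeta \mid y)}{\partial \eta'} = (y-\eta)(y-\eta)'/\sigma^2
\quad\text{and}\quad
\frac{\partial^2 l(\beta,\phi,\zeta \mid y)}{\partial \eta\,\partial \eta'} = -1_N 1_N'/\sigma^2 .
\]

Therefore we have

\[
B^{(1)}_{i,k} = \{(y_i - X_i\beta)' Z_{ik}\}^2 - Z_{ik}' Z_{ik},
\qquad
B^{(2)}_{i,m,k} = (y_i - X_i\beta)' Z_{im} Z_{ik}' (y_i - X_i\beta) - Z_{ik}' Z_{im},
\]

where $Z_{ik}$ denotes the $k$th column of $Z_i$, and

\[
L_0 = \exp\Big\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} \sum_{j=1}^{n_i} (y_{ij} - x_{ij}'\beta)^2 \Big\}.
\]

When $y_{ij}$ are 0–1 random variables, the logistic regression model can be obtained by the canonical link function $g(\pi_{ij}) = \log\frac{\pi_{ij}}{1-\pi_{ij}} = \eta_{ij} = x_{ij}'\beta + z_{ij}'\zeta_i$, $\phi = 1$, $b(\theta_{ij}) = -\log(1-\pi_{ij}) = \log(1 + e^{\eta_{ij}})$, hence

\[
\frac{\partial l(\beta,\phi,\zeta \mid y)}{\partial \eta}\,\frac{\partial l(\beta,\phi,\zeta \mid y)}{\partial \eta'} = (y-\pi)(y-\pi)'
\quad\text{and}\quad
\frac{\partial^2 l(\beta,\phi,\zeta \mid y)}{\partial \eta\,\partial \eta'} = -\pi(1_N - \pi)' .
\]

Then

\[
B^{(1)}_{i,k} = \{(y_i - \pi_i)' Z_{ik}\}^2 - \pi_i'\,\mathrm{DG}(Z_{ik} Z_{ik}')\,(1_{n_i} - \pi_i),
\qquad
B^{(2)}_{i,m,k} = (y_i - \pi_i)' Z_{im} Z_{ik}' (y_i - \pi_i) - \pi_i'\,\mathrm{DG}(Z_{im} Z_{ik}')\,(1_{n_i} - \pi_i),
\]

where $\pi_i = (\pi_{i1}, \ldots, \pi_{in_i})'$ with $\pi_{ij} = \exp(x_{ij}'\beta)/\{1 + \exp(x_{ij}'\beta)\}$, and

\[
L_0 = \exp\Big\{ \sum_{i}\sum_{j} \Big[ y_{ij} \log\frac{\pi_{ij}}{1-\pi_{ij}} + \log(1-\pi_{ij}) \Big] \Big\}.
\]

Similarly, when $y_{ij}$ are counts with mean $\lambda_{ij}$, the Poisson regression model can be obtained by the canonical link function $g(\lambda_{ij}) = \log\lambda_{ij} = \eta_{ij} = x_{ij}'\beta + z_{ij}'\zeta_i$, $\phi = 1$, $b(\theta_{ij}) = e^{\eta_{ij}} = \lambda_{ij}$,

\[
\frac{\partial l(\beta,\phi,\zeta \mid y)}{\partial \eta}\,\frac{\partial l(\beta,\phi,\zeta \mid y)}{\partial \eta'} = (y-\lambda)(y-\lambda)'
\quad\text{and}\quad
\frac{\partial^2 l(\beta,\phi,\zeta \mid y)}{\partial \eta\,\partial \eta'} = -\lambda 1_N' .
\]

Then we obtain that

\[
B^{(1)}_{i,k} = \{(y_i - \lambda_i)' Z_{ik}\}^2 - \lambda_i'\,\mathrm{DG}(Z_{ik} Z_{ik}')\,1_{n_i},
\qquad
B^{(2)}_{i,m,k} = (y_i - \lambda_i)' Z_{im} Z_{ik}' (y_i - \lambda_i) - \lambda_i'\,\mathrm{DG}(Z_{im} Z_{ik}')\,1_{n_i},
\]

where $\lambda_i = (\lambda_{i1}, \ldots, \lambda_{in_i})'$ with $\lambda_{ij} = \exp(x_{ij}'\beta)$, and

\[
L_0 = \exp\Big\{ \sum_{i}\sum_{j} \big[ y_{ij} \log\lambda_{ij} - \lambda_{ij} - \log y_{ij}! \big] \Big\}.
\]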
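The three expressions for L0 in this appendix share one structure: each is the exponential of a sum of per-observation terms evaluated at the fixed-effects linear predictor x_ij'β. The Python sketch below evaluates log L0 for the normal, logistic, and Poisson cases. It is an illustration only, under the assumption that the linear predictors have already been computed and flattened into one list over all (i, j); the function names are hypothetical and do not come from the chapter or the authors' software.

```python
import math
from typing import Sequence


def log_l0_normal(y: Sequence[float], eta: Sequence[float], sigma2: float) -> float:
    """log L0 = -(1 / (2 sigma^2)) * sum (y - eta)^2, with no normalizing constant."""
    return -sum((yi - ei) ** 2 for yi, ei in zip(y, eta)) / (2.0 * sigma2)


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_l0_logistic(y: Sequence[int], eta: Sequence[float]) -> float:
    """log L0 = sum [y * log(pi / (1 - pi)) + log(1 - pi)] = sum [y * eta - log(1 + e^eta)]."""
    return sum(yi * ei - _softplus(ei) for yi, ei in zip(y, eta))


def log_l0_poisson(y: Sequence[int], eta: Sequence[float]) -> float:
    """log L0 = sum [y * log(lambda) - lambda - log(y!)] with lambda = exp(eta)."""
    return sum(yi * ei - math.exp(ei) - math.lgamma(yi + 1.0) for yi, ei in zip(y, eta))


# Hypothetical toy inputs: two observations with linear predictor zero.
normal_value = log_l0_normal([1.0, 2.0], [0.0, 0.0], sigma2=1.0)
logistic_value = log_l0_logistic([1, 0], [0.0, 0.0])
poisson_value = log_l0_poisson([0, 1], [0.0, 0.0])
print(normal_value, logistic_value, poisson_value)
```

Working on the log scale avoids the underflow that exponentiating a large negative sum would cause, which is also why the logistic term is written through a stable form of log(1 + e^η) rather than through π itself.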
Part II Factor Analysis and Structural Equations Models

A Unified Approach to Two-Level Structural Equation Models and Linear Mixed Effects Models

Peter M. Bentler and Jiajuan Liang

P.M. Bentler
University of California, Los Angeles, Departments of Psychology and Statistics, Box 951563, Los Angeles, CA 90095-1563
bentler@ucla.edu

J. Liang
University of New Haven, College of Business, 300 Boston Post Road, West Haven, CT 06516
jliang@newhaven.edu

D.B. Dunson (ed.) Random Effect and Latent Variable Model Selection, DOI: 10.1007/978-0-387-76721-5, © Springer Science+Business Media, LLC 2008

1 Introduction

Two-level structural equation models (two-level SEM for simplicity) are widely used to analyze correlated clustered data (or two-level data) such as data collected from students (level-1 units) nested in different schools (level-2 units), or data collected from siblings (level-1 units) nested in different families (level-2 units). These data are usually collected by two sampling steps: randomly choosing some level-2 units; and then, randomly choosing some level-1 units from each chosen level-2 unit. Data collected in this way can be considered to be affected by two different random sources or random effects, namely, level-1 effects and level-2 effects. The substantive goal with such two-level data is to obtain theoretically meaningful and statistically adequate submodels for both the level-1 and level-2 effects. Realization of this main task consists of three steps: (1) set up an initial model with both level-1 and level-2 effects; (2) estimate the unknown model parameters; and (3) test the goodness-of-fit of the given model.

In the context of latent variable structural equation modeling, Step 1 is based on an initial understanding and substantive knowledge related to possible constructs or factors that may affect the observed data. Depending on the field, this may involve setting up a measurement model with observed indicators of one or more latent variables, as in a confirmatory factor analysis model, or possibly also a multivariate relations model in which some latent variables affect others, as in a typical structural equations model. Such models may be specified at each of the two
levels, where the model at both levels can be highly similar or completely different. Clearly the more substantive knowledge one can have, the better the chance to identify a suitable model for characterizing the data. Step 2 requires some estimation machinery, and in two-level SEM, this is typically based on the asymptotic statistical theory of maximum likelihood (ML) or generalized least squares (GLS), see, for example, Bentler et al. (2005). Step 3, testing goodness-of-fit in two-level SEM, uses the standard testing machinery associated with ML and GLS methodology, and provides evidence on how well the model proposed in Step 1 can represent the observed data.

Approaches to the estimation of parameters and model evaluation in two-level SEM are similar to their counterparts in conventional (one-level) SEM. These methods are well known due to some popular statistical programs such as EQS (Bentler, 2006), LISREL (du Toit and du Toit, 2001) and Mplus (Muthén and Muthén, 2004). The basic approach to model evaluation involves the chi-square statistic which compares a restricted SEM to a more general unrestricted or saturated model. Because a proposed model (the null hypothesis) may fail to be acceptable in large samples, alternative fit indices and other statistics have been proposed for model evaluation, see Yuan and Bentler (2007a,b) for conventional SEM, and Bentler et al. (2005) and Yuan and Bentler (2003) for two-level SEM. As in conventional SEM, acceptance of the null hypothesis (i.e., the proposed model) in two-level SEM by a test statistic does not necessarily imply that the proposed model is the correct model. When a model fails to fit, it may be desirable to improve it using some model modification or selection criteria (see, e.g., Bozdogan, 1987; Cudeck and Browne, 1983; Sörbom, 1989).

An important characteristic of two-level data is that the two different types of factors (i.e., level-1 and level-2 factors) are assumed to be the only factors that affect the data. This characteristic is similar to that of data affected by mixed effects and random effects. By taking the fixed effects as the effects from level-2 factors and the random effects as the effects from level-1 factors, we will describe a unified approach to two-level linear SEM and linear mixed effects models (LMEM for simplicity) using the same model formulation. Some equivalence between multilevel SEM and LMEM has been already studied by a number of researchers such as Rovine and Molenaar (2000), Bauer (2003), Curran (2003), Skrondal and Rabe-Hesketh (2004), and Mehta and Neale (2005) from different points of view. In this chapter, we focus on estimating model parameters in view of the same model formulation.

This chapter is organized as follows. We introduce the general model formulation for two-level SEM and its relation to LMEM in Sect. 2. An EM algorithm for estimating model parameters for both two-level SEM and LMEM, and some asymptotic properties of the parameter estimator are given in Sect. 3. Applications of the EM algorithm are illustrated by examples in Sect. 4. Some discussion and comments are given in the last section.

2 Model Formulation

In modeling two-level data, it is usually assumed that both level-1 and level-2 observations, respectively, have the same dimension (e.g., Lee and Poon, 1998; Bentler and Liang, 2003; Liang and Bentler, 2004). This assumption is violated when additional measurements are taken from factors or latent variables at level-1. A typical example of this situation is that students (level-1 units) nested in different schools (level-2 units) may be given different numbers of scholastic tests to measure their ability (e.g., math ability). Then the data collected from students' scores in the tests will have different dimensions across the schools. We will call this situation one of dimensional heterogeneity. Therefore, it is meaningful to allow both level-1 and level-2 observations to have different dimensions.

Suppose that the observed data are collected from a hierarchical sampling scheme: (1) randomly choose some level-2 units (such as different schools or families); and (2) randomly choose some level-1 units (such as students or family members) from each chosen level-2 unit. Let $\{y_{gi} : p_g \times 1,\ i = 1, \ldots, N_g\}$ denote the observations from level-2 unit $g$. For example, $y_{gi}$ may denote the observation from the $i$th student (level-1 unit) nested in the $g$th school (level-2 unit), and there are $p_g$ tests given to students in the $g$th school, say, $g = 1, \ldots, G$. $G$ is called the level-2 sample size and $\{N_g : g = 1, \ldots, G\}$ are called the level-1 sample sizes. $N_g$ may be different for different $g$ (i.e., an unbalanced sample design). Let $\{z_g : q_g \times 1,\ g = 1, \ldots, G\}$ denote the pure level-2 observations that are only observed from level-2 units. For example, $z_g$ may denote the financial resources for the $g$th school, and there are $q_g$ financial resources for the $g$th school. Then $\{y_{gi} : p_g \times 1,\ i = 1, \ldots, N_g;\ z_g : q_g \times 1,\ g = 1, \ldots, G\}$ constitutes a set of observations from all responses in the population.

The level-1 observations $\{y_{gi} : p_g \times 1,\ i = 1, \ldots, N_g\}$ (for each fixed $g$) are usually not independent because different level-1 units nested in the same level-2 unit are affected by some common level-2 factors. The pure level-2 observation $z_g$ is assumed to have the same effect on all level-1 units nested in the same level-2 unit $g$. For example, the financial resources for the $g$th school can be assumed to have the same effect on all students (level-1 units) nested in the same school (level-2 unit) $g$. Based on this viewpoint, we propose the following general formulation for two-level SEM:

\[
\begin{pmatrix} z_g \\ y_{gi} \end{pmatrix} = \begin{pmatrix} z_g \\ v_g \end{pmatrix} + \begin{pmatrix} 0 \\ v_{gi} \end{pmatrix}, \tag{1}
\]

for ML analysis with the assumptions:

(A1) The level-1 random vectors $\{v_{gi} : p_g \times 1,\ i = 1, \ldots, N_g\}$ are independent for each fixed $g$ and $v_{gi} \sim N_{p_g}(0, \Sigma_{gW})$ for $g = 1, \ldots, G$, $\Sigma_{gW} > 0$ (positive definite).
(A2) The level-2 random vectors $\{v_g : p_g \times 1,\ g = 1, \ldots, G\}$ are independent and $v_g \sim N_{p_g}(\mu_{2g}, \Sigma_{gB})$ with $\Sigma_{gB} > 0$.
(A3) $\{z_g : q_g \times 1,\ g = 1, \ldots, G\}$ are independent level-2 observations and $z_g \sim N_{q_g}(\mu_{1g}, \Sigma_{gzz})$ with $\Sigma_{gzz} > 0$.
(A4) The random vector $(z_g', v_g')'$ of dimension $(p_g + q_g) \times 1$ has a joint multivariate normal distribution $N_{p_g+q_g}(\tilde{\mu}_g, \tilde{\Sigma}_{gB})$ with $\tilde{\Sigma}_{gB} > 0$ and

\[
\tilde{\mu}_g = \begin{pmatrix} \mu_{1g} \\ \mu_{2g} \end{pmatrix},
\qquad
\tilde{\Sigma}_{gB} = \operatorname{cov}\begin{pmatrix} z_g \\ v_g \end{pmatrix} = \begin{pmatrix} \Sigma_{gzz} & \Sigma_{gzy} \\ \Sigma_{gyz} & \Sigma_{gB} \end{pmatrix}, \tag{2}
\]

where $\Sigma_{gzy} = \Sigma_{gyz}' = \operatorname{cov}(z_g, v_g)$.
(A5) $\{z_g, v_g\}$ is uncorrelated with $\{v_{gi} : i = 1, \ldots, N_g\}$ for each fixed $g$.

In formulation (1), within-group (level-1) differences are reflected by a model for $v_{gi}$ for each given $g$ ($i = 1, \ldots, N_g$) and we call such a model a level-1 model, which may contain level-1 latent factors; between-group (level-2) differences are reflected by a model for $v_g$ and a model for $z_g$ ($g = 1, \ldots, G$) and we call such models level-2 models, which may contain level-2 latent factors. All latent factors are assumed to have normal distributions in the following context to derive the unified algorithm for estimating model parameters.

A nontrivial model under formulation (1) is one that restricts the means and covariances in assumptions (A1)–(A4). This implies that the means and covariance matrices in assumptions (A1)–(A4) depend on a common parameter vector $\theta$ (say, $r \times 1$). $\theta$ contains all model parameters from formulation (1) and we can write

\[
\tilde{\mu}_g = \tilde{\mu}_g(\theta), \qquad \Sigma_{gW} = \Sigma_{gW}(\theta), \qquad \tilde{\Sigma}_{gB} = \tilde{\Sigma}_{gB}(\theta). \tag{3}
\]

These matrices may be structured in particular ways as motivated by specific structural models, see Sect. 4. Liang and Bentler (2004) proposed an ML analysis for the model formulation (1) for the case of $p_g \equiv p$ and $q_g \equiv q$ and pointed out that formulation (1) includes the formulations for two-level SEM in McDonald and Goldstein (1989), Muthén (1989), Raudenbush (1995), Lee (1990), and Lee and Poon (1998). An analysis of the model defined by (1) consists of two major tasks:

(a) Estimate the model parameter vector $\theta$ from the available observations $\{y_{gi}, z_g\}$.
(b) Evaluate the goodness-of-fit after the parameter $\theta$ is estimated.

After a model expressed by (1) is set up, it is assumed that the model is identified. That is, the mean and covariance structures in (3) are uniquely determined by $\theta$ and vice versa. This implies that if there are two parameters $\theta_1$ ($r \times 1$) and $\theta_2$ ($r \times 1$) such that $\tilde{\mu}_g(\theta_1) = \tilde{\mu}_g(\theta_2)$, $\Sigma_{gW}(\theta_1) = \Sigma_{gW}(\theta_2)$, $\tilde{\Sigma}_{gB}(\theta_1) = \tilde{\Sigma}_{gB}(\theta_2)$ for $g = 1, \ldots, G$, then $\theta_1 = \theta_2$. The complexity of a model defined by (1) may come from complicated level-1 models for the within variable $v_{gi}$, or from complicated level-2 models for the between variable $v_g$ and the observable variable $z_g$, or from both complicated within and between models. When $v_{gi}$, $v_g$, and $z_g$ are determined by measurement models such as factor analysis models, (1) reduces to a usual SEM with dimensional heterogeneity.

The mean and covariance structures in (3) act as the null hypothesis for the model formulation (1). The common parameter vector $\theta$ in (3) is assumed to contain $r$ distinct individual model parameters. The saturated (or trivial) model associated with (3) is the case that $\theta$ contains the maximum number of distinct individual model parameters $R$ that is given by

\[
R = \sum_{g=1}^{G} \big\{ p_g + q_g + p_g(p_g + 1)/2 + (p_g + q_g)(p_g + q_g + 1)/2 \big\}. \tag{4}
\]

A model-fit statistic for model formulation (1) is a test of the restricted model (3) with $r$ […] $0$) and $e_g \sim N_{n_g}(0, \sigma^2 R_g)$ ($R_g > 0$) are uncorrelated. $y_g$ ($n_g \times 1$) is the response vector from the $g$th group, $X_g$ ($p \times n_g$) is the design matrix from fixed effects, $Z_g$ ($q \times n_g$) contains observations from random effects, $b_g$ ($q \times 1$) contains the random effects, and $e_g$ ($n_g \times 1$) contains the random errors. In the formulation given by (1), let $N_1 = N_2 = \ldots = N_G \equiv 1$, $z_g \equiv 0$ (no level-2 observations), and

\[
v_g = X_g'\beta + Z_g' b_g, \qquad v_{gi} \equiv e_g \quad (i \equiv 1) \tag{6}
\]

then $y_{gi} \equiv y_g$ ($i \equiv 1$) and

\[
y_{gi} = v_g + v_{gi} \quad (i \equiv 1). \tag{7}
\]

It is noted that to fit the LMEM (5) within the SEM framework (1), we have to switch the meaning of dimensions: in formulation (1), for each given $g$, $N_g$ stands for the number of individuals within group $g$; but when taking $N_1 = N_2 = \ldots = N_G \equiv 1$ in fitting the LMEM (5) into (7), each $N_g$ only stands for an imaginary level-1 sample size. Taking $N_1 = N_2 = \ldots = N_G \equiv 1$ in (5) does not mean that the numbers of individuals within groups are all equal to 1. The number of individuals within groups in LMEM (5) is actually $n_g$ ($g = 1, \ldots, G$), which is now switched to the dimension or the number of measured outcomes that is equivalent to $p_g$ in formulation (1), which is the dimension of level-1 observations in a two-level SEM expressed by (1). A similar switch of the meaning of dimension can be also noticed in papers that provide a unifying framework for LMEM and multilevel SEM, see, for example, Muthén (1997), Bauer (2003), Curran (2003), Skrondal and Rabe-Hesketh (2004). Since an LMEM is essentially a single-level model, there are no second-level observations except the individual observations. The term $z_g$ representing a level-2 observation in formulation (1) becomes imaginary in an LMEM. Taking $z_g \equiv 0$ in fitting th
eLMEM(5)intoformulation(1)simplymeansthattherearenolevel-2observationsinLMEMandsoformulation(1)reducesto(7),whichisaspecialcaseofformulation(1)withoutlevel-2observations. 100P.M.Bentler,J.LiangIfweassumeng≤qforthefull-rankdesignmatrixZgin(5),thenvg=Xgβ+Zgbghasanonsingularmultinormaldistributionandtheassumptions(A1)–(A5)willbesatisfiedforthemodeldefinedby(7).IntheLMEM(5),bg∼Nq(0,T)canbeconsideredasabetween(level-2)latentvariableandeg∼Nn(0,σ2Rg)gasawithin(level-1)latentvariable.ygi≡yg(i≡1)istheonlyoneresponse(observation)vectorfromgroup(level-2unit)g.Thereforetheformulationdefinedby(1)coversallLMEMofthetypedefinedby(5)withallng≤q.Letθbethemodelparametervectorin(5)thatiscomposedofallfreemodelparametersin(5):theregressioncoefficientsinβ,thevariancesandnonduplicatedcovariancesincov(bg)=T,andthevariancesandnonduplicatedcovariancesincov(eg)=σ2Rg.Themeanandcovariancestructures(therestrictedmodelornullhypothesis)associatedwiththeLMEM(5)(or(7))aregivenby(θ)=E(y)=Xβ+ZE(b2µggggg)=Xgβ,gW(θ)=cov(eg)=σRg(θ),β+ZbgB(θ)=cov(Xggg)=Zgcov(bg)Zg=ZgT(θ)Zg.(8)Thesaturatedmodel(alternativehypothesis)associatedwiththeLMEM(5)isusu-allyfarmorecomplicatedthanthatassociatedwiththetwo-levelSEMexpressedbyformulation(1)becausetheremaybefartoomanydifferentwithin-groupco-variancematrices.IthasbeennotedthatthereisnoestimablesaturatedmodelformanyLMEM’s(especiallyforhighlyimbalanceddata).Inthefollowingcontext,wewillonlyfocusonaunifiedapproachtoestimatingmodelparametersforbothtwo-levelSEMandLMEMunderthestructuredmodels(thenullhypothesisistrueorthepresentedstructuralrelationshipsareassumedtobecorrect).3TheEMAlgorithm3.1MaximumLikelihoodEstimationBecausethegenerallinearmixedeffectsmodeldefinedby(5)isaspecialcaseofthemodelformulationgivenby(1),inthissectionwewilldevelopanEMalgorithmforcomputingtheMLE(maximumlikelihoodestimate)ofthemodelparametervectorθspecifiedinthemeanandcovariancestructuresin(3).ThentheEMalgorithmcanbeappliedtotheLMEM(5).Letθ(r×1)betheparametervectorcontainingallmodelparametersfromformula
tion(1)associatedwithassumptions(A1)–(A5).TosimplifythederivationoftheEMalgorithm,let⎛⎞⎛⎞yg1yg1⎜.⎟⎜.⎟zgvgyg=⎝..⎠,Yg0=⎝..⎠,ug=y,xg=u,ggyygNggNg(9)Ng1X={x1,...,xG},Z={z1,...,zG},¯yg=ygi.Ngi=1 Two-LevelStructuralEquationModelsandLinearMixedEffectsModels101./Thenug(qg+Ngpg)×1containsallobservationsfromgroup(level-2unit)g.Takingvgasamissingvectorvalue,weconstructthe“completeobservation”vector./xg(qg+pg+Ngpg)×1.Yg0in(9)istheobservationmatrixcomposedofalllevel-1observationsfromgroupg.¯ygisthesamplemeanfromthelevel-1obser-vationsingroupg.XdenotesthesetofallcompleteobservationsandZthesetofalllevel-2observations.Fromtheassumptions(A1)–(A5)on(1),wecanderivethenegativetwicethelogarithmofthelikelihoodfunctionfromallcompleteresponses{xg:g=1,...,G}G∗)=l∗l(X,θg(X,θ),(10)g=1whereθ∗isanarbitrarilyspecifiedvalueoftheparameterθ,and∗)=log|∗|+(v∗∗−1∗lg(X,θgBg−µ2g)gB(vg−µ2g)Ng+∗|+(y−v∗−1log|gWgig)gW(ygi−vg)i=1#+log|∗|+z∗∗∗−1∗gzz.vg−µ1g−gzygB(vg−µ2g)#×∗−1∗−∗∗−1(v∗gzz.vzg−µ1ggzygBg−µ2g),(11)where∗=µ(θ∗),µ∗=µ(θ∗),∗=∗µ1g1g2g2ggWgW(θ),∗=∗∗∗∗∗gBgB(θ),gzz=gzz(θ),gzy=gzy(θ),(12)∗=∗,∗=∗−∗∗−1∗.gyzgzygzz.vgzzgzygBgyzAccordingtotheprincipleoftheEMalgorithm(Dempsteretal.,1977),theE-stepfunctionoftheEMalgorithmforestimatingθistheconditionalexpectationde-finedbyG!"∗|θ)=El∗M(θg(X,θ)|zg,yg,θ,(13)g=1wherebothθ∗andθaretwoarbitrarilyspecifiedvaluesofthesameparameterθ.Fromassumptions(A1)–(A5)on(1),itcanbeprovedthat!"∗)|z∗∗−1∗Elg(X,θg,yg,θ=log|gB|+trgBSgB+N∗∗−1glog|gW|+trgWSgW(14)+log|∗|+tr∗−1S∗,gzz.vgzz.vgz 
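where

$$\begin{aligned}
S^*_{gB} &= E\left\{(v_g-\mu^*_{2g})(v_g-\mu^*_{2g})'\,\middle|\, z_g, y_g, \theta\right\},\\
S_{gW} &= \frac{1}{N_g}\sum_{i=1}^{N_g} E\left\{(y_{gi}-v_g)(y_{gi}-v_g)'\,\middle|\, z_g, y_g, \theta\right\},\\
S^*_{gz} &= E\left\{\left[z_g-\mu^*_{1g}-\Sigma^*_{gzy}\Sigma^{*-1}_{gB}(v_g-\mu^*_{2g})\right]\left[z_g-\mu^*_{1g}-\Sigma^*_{gzy}\Sigma^{*-1}_{gB}(v_g-\mu^*_{2g})\right]'\,\middle|\, z_g, y_g, \theta\right\}.
\end{aligned}\tag{15}$$

Under the assumptions (A1)–(A5) on (1), it can be derived that

$$\begin{aligned}
a_g(\theta) &\stackrel{\text{def}}{=} a_g = E(v_g \mid z_g, y_g, \theta) = \mu_{2g} + \Sigma_{1g}\bar\Sigma_g^{-1}(c_g-\mu_g),\\
C_g(\theta) &\stackrel{\text{def}}{=} C_g = \operatorname{cov}(v_g \mid z_g, y_g, \theta) = \Sigma_{gB} - \Sigma_{1g}\bar\Sigma_g^{-1}\Sigma_{1g}',
\end{aligned}\tag{16}$$

where "$\stackrel{\text{def}}{=}$" means "defined as," $\mu_g$ is defined in (2), and

$$\Sigma_{1g} = (\Sigma_{gyz}, \Sigma_{gB}),\qquad \Sigma_g = \Sigma_{gW} + N_g\Sigma_{gB},\qquad
c_g = \begin{pmatrix} z_g\\ \bar y_g\end{pmatrix},\qquad
\bar\Sigma_g = \begin{pmatrix}\Sigma_{gzz} & \Sigma_{gzy}\\ \Sigma_{gyz} & N_g^{-1}\Sigma_g\end{pmatrix}. \tag{17}$$

Then we have

$$\begin{aligned}
S^*_{gB} ={}& C_g + a_g a_g' - a_g\mu^{*\prime}_{2g} - (a_g\mu^{*\prime}_{2g})' + \mu^*_{2g}\mu^{*\prime}_{2g},\\
S_{gW} ={}& S_{gyy} + C_g + a_g a_g' - a_g\bar y_g' - (a_g\bar y_g')',\\
S^*_{gz} ={}& (z_g-\mu^*_{1g})(z_g-\mu^*_{1g})' - \Sigma^*_{gzy}\Sigma^{*-1}_{gB}(a_g-\mu^*_{2g})(z_g-\mu^*_{1g})'\\
&- \left[\Sigma^*_{gzy}\Sigma^{*-1}_{gB}(a_g-\mu^*_{2g})(z_g-\mu^*_{1g})'\right]' + \Sigma^*_{gzy}\Sigma^{*-1}_{gB}S^*_{gB}\Sigma^{*-1}_{gB}\Sigma^*_{gyz}
\end{aligned}\tag{18}$$

and

$$S_{gyy} = \frac{1}{N_g}\sum_{i=1}^{N_g} y_{gi}y_{gi}' = \frac{1}{N_g}Y_{g0}'Y_{g0}. \tag{19}$$

Employing some properties for block matrices, we can derive

$$M(\theta^*|\theta) = \sum_{g=1}^{G} N_g\left\{\log|\Sigma^*_{gW}| + \operatorname{tr}\left(\Sigma^{*-1}_{gW}S_{gW}\right)\right\} + \sum_{g=1}^{G}\left\{\log|\tilde\Sigma^*_g| + \operatorname{tr}\left(\tilde\Sigma^{*-1}_g\tilde S_g\right)\right\}, \tag{20}$$

where $S_{gW}$ is given in (18), and

$$\begin{gathered}
\tilde\Sigma^*_g = \begin{pmatrix}\tilde\Sigma^*_{gB} + \mu^*_g\mu^{*\prime}_g & \mu^*_g\\ \mu^{*\prime}_g & 1\end{pmatrix},\qquad
\tilde S_g = \begin{pmatrix}\tilde S_{gB} + d_g d_g' & d_g\\ d_g' & 1\end{pmatrix},\qquad
d_g = \begin{pmatrix} z_g\\ a_g\end{pmatrix},\\
\tilde S_{gB} = \begin{pmatrix} 0 & 0\\ 0 & C_g\end{pmatrix},\qquad \mu^*_g = \mu_g(\theta^*),\qquad \tilde\Sigma^*_{gB} = \tilde\Sigma_{gB}(\theta^*),
\end{gathered}\tag{21}$$

with $\mu_g$ and $\tilde\Sigma_{gB}$ defined in (2) and $a_g$ in (16). Here we express the matrices $\tilde\Sigma^*_g$ and $\tilde S_g$ as in (21) for programming convenience, since the expression (20) has exactly the same form as the multiple-group likelihood function in covariance structure analysis. The constant "1" in the matrices $\tilde\Sigma^*_g$ and $\tilde S_g$ in (21) is added to the E-step function (20) without loss of generality because $M_g(\theta^*|\theta)$ and $M_g(\theta^*|\theta) \pm 1$ have the same optimality. The expressions for $\tilde\Sigma^*_g$ and $\tilde S_g$ in (21) with the constant "1" are the result of re-combination of smaller matrices into bigger matrices by using the property of block matrices

$$\begin{aligned}
\tilde\Sigma^{*-1}_g &= \begin{pmatrix}\tilde\Sigma^*_{gB} + \mu^*_g\mu^{*\prime}_g & \mu^*_g\\ \mu^{*\prime}_g & 1\end{pmatrix}^{-1}
= \begin{pmatrix}\tilde\Sigma^{*-1}_{gB} & -\tilde\Sigma^{*-1}_{gB}\mu^*_g\\ -\mu^{*\prime}_g\tilde\Sigma^{*-1}_{gB} & 1 + \mu^{*\prime}_g\tilde\Sigma^{*-1}_{gB}\mu^*_g\end{pmatrix},\\
|\tilde\Sigma^*_g| &= \begin{vmatrix}\tilde\Sigma^*_{gB} + \mu^*_g\mu^{*\prime}_g & \mu^*_g\\ \mu^{*\prime}_g & 1\end{vmatrix} = |\tilde\Sigma^*_{gB}|,\\
\operatorname{tr}\left(\tilde\Sigma^{*-1}_g\tilde S_g\right) &= \operatorname{tr}\left(\tilde\Sigma^{*-1}_{gB}\tilde S_{gB}\right) + d_g'\tilde\Sigma^{*-1}_{gB}d_g - 2\mu^{*\prime}_g\tilde\Sigma^{*-1}_{gB}d_g + \mu^{*\prime}_g\tilde\Sigma^{*-1}_{gB}\mu^*_g + 1.
\end{aligned}$$

The M-step of the EM algorithm is to update $\theta^*$ for every given $\theta$ in the E-step function in (20). For example, in the $i$th step ($i = 0$ corresponds to the initial step), given $\theta = \theta_i$, we need to update $\theta^*$ to $\theta_{i+1}$ such that

$$M(\theta_{i+1}|\theta_i) \le M(\theta_i|\theta_i) \tag{22}$$

according to the general idea of the EM-type algorithm (McLachlan and Krishnan, 1997). We will employ Lange's (1995a,b) EM gradient algorithm to derive the updating formula for $\theta_{i+1}$ in (22) by using the gradient direction. This requires the first-order derivative of the E-step function (20) and a suitable approximation of the Fisher information matrix (McLachlan and Krishnan, 1997). The first-order derivative can be derived as

$$\begin{aligned}
dM(\theta|\theta) &= \left.\frac{\partial M(\theta^*|\theta)}{\partial\theta^*}\right|_{\theta^*=\theta}\\
&= \sum_{g=1}^{G} N_g\Delta_{gW}\left(\Sigma_{gW}^{-1}\otimes\Sigma_{gW}^{-1}\right)\operatorname{vec}\left(\Sigma_{gW} - S_{gW}\right)
+ \sum_{g=1}^{G}\tilde\Delta_g\left(\tilde\Sigma_g^{-1}\otimes\tilde\Sigma_g^{-1}\right)\operatorname{vec}\left(\tilde\Sigma_g - \tilde S_g\right),
\end{aligned}\tag{23}$$

where

$$\Delta_{gW} = \frac{\partial(\operatorname{vec}\Sigma_{gW})'}{\partial\theta},\qquad \tilde\Delta_g = \frac{\partial(\operatorname{vec}\tilde\Sigma_g)'}{\partial\theta},\qquad \tilde\Sigma_g = \left.\tilde\Sigma^*_g\right|_{\theta^*=\theta}, \tag{24}$$

the sign "vec" in (23) and (24) stands for the vectorization of a matrix by stacking its columns successively, and the sign "$\otimes$" for the Kronecker product of matrices. By using a positive definite matrix of first-order derivatives to approximate the Hessian matrix (see, e.g., McLachlan and Krishnan, 1997, pp. 5–7, for the use of a positive definite matrix to approximate the Hessian matrix), we can obtain the following approximation to the Hessian matrix

$$\begin{aligned}
I(\theta) &= E\left\{\left.\frac{\partial^2 M(\theta^*|\theta)}{\partial\theta^*\,\partial\theta^{*\prime}}\right|_{\theta^*=\theta}\right\}\\
&\approx \sum_{g=1}^{G} N_g\Delta_{gW}\left\{2\Sigma_{gW}^{-1}S_{gew}\Sigma_{gW}^{-1}\otimes\Sigma_{gW}^{-1} - \Sigma_{gW}^{-1}\otimes\Sigma_{gW}^{-1}\right\}\Delta_{gW}'\\
&\quad+ \sum_{g=1}^{G}\tilde\Delta_g\left\{2\tilde\Sigma_g^{-1}\tilde S_{geb}\tilde\Sigma_g^{-1}\otimes\tilde\Sigma_g^{-1} - \tilde\Sigma_g^{-1}\otimes\tilde\Sigma_g^{-1}\right\}\tilde\Delta_g'.
\end{aligned}\tag{25}$$

The two matrices $S_{gew}$ and $\tilde S_{geb}$ are derived by the assumptions (A1)–(A5) on (1):

$$\begin{gathered}
S_{gew} = E(S_{gW}) = 2\Sigma_{gB} + \Sigma_{gW} - A_{gw} - A_{gw}',\\
A_{gw} = \Sigma_{2g}\bar\Sigma_g^{-1}\Sigma_{1g}',\qquad \Sigma_{2g} = (\Sigma_{gyz}, N_g^{-1}\Sigma_g),
\end{gathered}\tag{26}$$

where $\Sigma_{1g}$, $\Sigma_g$, and $\bar\Sigma_g$ are defined in (17), and

$$\begin{gathered}
\tilde S_{geb} = E(\tilde S_g) = \begin{pmatrix} S_{geb} + \mu_g\mu_g' & \mu_g\\ \mu_g' & 1\end{pmatrix},\qquad
S_{geb} = \begin{pmatrix}\Sigma_{gzz} & A_{gb}\\ A_{gb}' & \Sigma_{gB}\end{pmatrix},\\
A_{gb} = (\Sigma_{gzz}, \Sigma_{gzy})\bar\Sigma_g^{-1}\Sigma_{1g}'.
\end{gathered}\tag{27}$$

According to the EM gradient algorithm in Lange (1995a,b), the M-step in the EM algorithm for estimating the parameter $\theta$ can be realized by

$$\theta_{i+1} = \theta_i - \alpha I(\theta_i)^{-1}dM(\theta_i|\theta_i), \tag{28}$$

where $\theta_i$ denotes the value of $\theta$ at the $i$th iteration and $0 < \alpha \le 1$ is an adjusting constant for controlling the step length during the iteration. $\alpha$ can be chosen dynamically (it could be different) at each iteration. The Root Mean Square Error (RMSE) (Lee and Poon, 1998) can be used as a stopping criterion for the iteration in (28). That is, the iteration stops when the RMSE between two adjacent steps is small enough:

$$\operatorname{RMSE}(\theta_{i+1},\theta_i) = \left\{\frac{1}{r}\|\theta_{i+1}-\theta_i\|^2\right\}^{1/2} \le \epsilon\quad(\text{e.g., } \epsilon = 10^{-6}), \tag{29}$$

where the sign "$\|\cdot\|$" stands for the usual Euclidean distance, and $r$ is the dimension of $\theta$.

For the linear mixed effects model given by (5)–(7) with mean and covariance structures (8), we can obtain the first-order derivatives

$$\begin{gathered}
\Delta_{g\mu} = \frac{\partial\mu_g'}{\partial\theta} = \frac{\partial\beta'}{\partial\theta}\cdot X_g,\qquad
\Delta_{gW} = \frac{\partial(\operatorname{vec}\Sigma_{gW})'}{\partial\theta} = \frac{\partial\left(\operatorname{vec}\sigma^2R_g\right)'}{\partial\theta},\\
\Delta_{gB} = \frac{\partial(\operatorname{vec}\Sigma_{gB})'}{\partial\theta} = \frac{\partial(\operatorname{vec}T)'}{\partial\theta}\cdot(Z_g\otimes Z_g).
\end{gathered}\tag{30}$$

These derivatives are helpful in computing the term $\tilde\Delta_g$ in (23)–(25). Because $N_g \equiv 1$ and $z_g \equiv 0$ in (1) ($g = 1,\ldots,G$) for the LMEM (5)–(7), we can obtain the simplified formulas for $dM(\theta|\theta)$ (see (23)) and $I(\theta)$ (see (25)) used in the iteration process (28):

$$\begin{aligned}
dM(\theta|\theta) &= \frac{1}{\sigma^4}\sum_{g=1}^{G}\Delta_{gW}\left(R_g^{-1}\otimes R_g^{-1}\right)\operatorname{vec}\left(\sigma^2R_g - S_{gW}\right)
+ \sum_{g=1}^{G}\tilde\Delta_{gm}\left(\tilde\Sigma_{gm}^{-1}\otimes\tilde\Sigma_{gm}^{-1}\right)\operatorname{vec}\left(\tilde\Sigma_{gm} - \tilde S_{gm}\right),\\
I(\theta) &= \sum_{g=1}^{G}\left\{\frac{1}{\sigma^4}\Delta_{gW}\left(R_g^{-1}\otimes R_g^{-1}\right)\Delta_{gW}' + \tilde\Delta_{gm}\left(\tilde\Sigma_{gm}^{-1}\otimes\tilde\Sigma_{gm}^{-1}\right)\tilde\Delta_{gm}'\right\},
\end{aligned}\tag{31}$$

where $\Delta_{gW}$ is given by (30), and $\tilde\Sigma_{gm}$ is the reduced form of $\tilde\Sigma_g$ given in (21) without the covariances related to $z_g$ in (1). $\tilde\Delta_{gm} = \partial(\operatorname{vec}\tilde\Sigma_{gm})'/\partial\theta$. $\tilde S_{gm}$ is the reduced form of $\tilde S_g$ given in (21) without the variate $z_g$. It can be derived that

$$\begin{gathered}
\tilde\Sigma_{gm} = \begin{pmatrix}\Sigma_{gB} + \mu_g\mu_g' & \mu_g\\ \mu_g' & 1\end{pmatrix},\qquad
\tilde S_{gm} = \begin{pmatrix} C_g + a_g a_g' & a_g\\ a_g' & 1\end{pmatrix},\\
C_g = \Sigma_{gB} - \Sigma_{gB}(\Sigma_{gB} + \Sigma_{gW})^{-1}\Sigma_{gB},\\
a_g = \mu_g + \Sigma_{gB}(\Sigma_{gB} + \Sigma_{gW})^{-1}(y_g - \mu_g),\\
S_{gW} = y_g y_g' + C_g + a_g a_g' - a_g y_g' - (a_g y_g')',
\end{gathered}\tag{32}$$

where $\mu_g$, $\Sigma_{gW}$, and $\Sigma_{gB}$ are given in (8). $y_g$ is the observation from the LMEM (5). The derivatives in (30) help to compute the derivative $\tilde\Delta_{gm}$ in (31).

3.2 Asymptotic Properties

Because the estimator $\hat\theta$ for the model parameter $\theta$ in (1) obtained from the EM algorithm in Sect. 3.1 is an MLE, some general properties such as asymptotic normality apply to it. We can apply the general result on MLE given by Hoadley (1971) to the MLE $\hat\theta$. Based on assumptions (A1)–(A5) on (1), the available observations $\{u_g: g = 1,\ldots,G\}$ defined in (9) are independent but not identically distributed. The negative twice of the log-likelihood function from $\{u_g\}$ defined in (9) can be expressed as

$$f(\theta) = \sum_{g=1}^{G}(N_g-1)\left\{\log|\Sigma_{gW}(\theta)| + \operatorname{tr}\left[\Sigma_{gW}^{-1}(\theta)T_{gW}\right]\right\}
+ \sum_{g=1}^{G}\left\{\log|\bar\Sigma_g(\theta)| + \operatorname{tr}\left[\bar\Sigma_g^{-1}(\theta)T_{gB}\right]\right\}, \tag{33}$$

where $\bar\Sigma_g$ is given in (17) and

$$T_{gW} = \frac{1}{N_g-1}Y_{g0}'\left(I_{N_g} - \frac{1}{N_g}J_{N_g}\right)Y_{g0},\qquad T_{gB} = (c_g-\mu_g)(c_g-\mu_g)', \tag{34}$$

where $Y_{g0}$ is given in (9), and $\mu_g$ and $c_g$ are given in (2) and (17), respectively. $I_{N_g}$ is the $N_g\times N_g$ identity matrix and $J_{N_g}$ the $N_g\times N_g$ matrix of ones (all of its elements are "1"). Because the observation vectors $\{u_g\}$ in (9) are independently normally distributed, it can be verified that $\{u_g\}$ satisfy the regularity conditions in Theorem 2 of Hoadley (1971). Then we have the following theorem.

Theorem 1. The MLE $\hat\theta$ for the model parameter $\theta$ from (1) with assumptions (A1)–(A5) is asymptotically normally distributed with

$$G^{1/2}(\hat\theta - \theta)\xrightarrow{D} N\left(0, 2\Gamma^{-1}(\theta)\right),\qquad G\to\infty, \tag{35}$$

where the sign "$\xrightarrow{D}$" means "converge in distribution," and the matrix $\Gamma(\theta)$ is given by

$$\Gamma(\theta) = \frac{1}{G}\left\{\sum_{g=1}^{G}(N_g-1)\Delta_{gW}\left(\Sigma_{gW}^{-1}\otimes\Sigma_{gW}^{-1}\right)\Delta_{gW}'
+ \sum_{g=1}^{G}\left[\Delta_g\left(\bar\Sigma_g^{-1}\otimes\bar\Sigma_g^{-1}\right)\Delta_g' + 2\Delta_{g\mu}\bar\Sigma_g^{-1}\Delta_{g\mu}'\right]\right\}, \tag{36}$$

where $\bar\Sigma_g$ and $\Delta_{gW}$ are given in (17) and (24), respectively, and

$$\Delta_g = \frac{\partial(\operatorname{vec}\bar\Sigma_g)'}{\partial\theta},\qquad \Delta_{g\mu} = \frac{\partial\mu_g'}{\partial\theta}. \tag{37}$$

Theorem 1 is a direct result from Theorem 2 of Hoadley (1971) applied to the independently but not identically normally distributed observation vectors $\{u_g\}$ in (9). By Theorem 1, the asymptotic standard errors of the components of the MLE $\hat\theta$ from the model given by (1) can be approximately computed by the square roots of the corresponding diagonal elements of the asymptotic covariance matrix of $\hat\theta$,

$$\operatorname{cov}(\hat\theta) \approx \frac{2}{G}\Gamma^{-1}(\hat\theta). \tag{38}$$

The chi-square statistic for testing goodness-of-fit of formulation (1) with mean and covariance structures (3) is defined as the difference between the model chi-square at the restricted model ((3) with $r$ […] $> 150$   $> 10$   Decisive

The interpretation given in (1) is a suggestion and need not be regarded as a strict rule; the selection depends on one's preference in the substantive situation. Similarly, in frequentist hypothesis testing, one may take the type I error to be 0.05 or 0.10, and the choice is decided with other factors in the substantive situation. See Garcia-Donato and Chan (2005) for a more technical treatment on calibrating the Bayes factor.

The prior distributions of the parameters are involved in the Bayes factor, see (2) below. As pointed out by Kass and Raftery (1995), noninformative priors should not be used. In most Bayesian analyses of SEMs, proper conjugate type prior distributions that involve prior inputs of the hyper-parameter values have been used. For situations where we have good prior information, for example, from analysis of closely related data or knowledge of experts, subjective hyper-parameter values should be used. In other situations, ideas from data and/or information from various sources can be used. In general, the marginal densities $p(Y|M_k)$, $k = 0, 1$, are obtained by integrating over the parameter space, that is,

$$p(Y|M_k) = \int p(Y|\theta_k, M_k)\,p(\theta_k|M_k)\,d\theta_k, \tag{2}$$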
124 S.-Y. Lee, X.-Y. Song

where $\theta_k$ is the parameter vector in $M_k$, $p(\theta_k|M_k)$ is its prior density, and $p(Y|\theta_k, M_k)$ is the probability density of $Y$ given $\theta_k$. In this chapter, we use $\theta$ to represent the parameter vector in general, and it may stand for $\theta_0$, $\theta_1$, or the parameter vector in the linked model (see the following sections) according to the context. The dimension of the above integral is equal to the dimension of $\theta_k$.

Very often, it is very difficult to obtain $B_{10}$ analytically, and various analytic and numerical approximations have been proposed in the literature. In the next section, a procedure based on the idea of path sampling (Gelman and Meng, 1998) is proposed to compute the Bayes factor. Path sampling, which is a generalization of importance sampling and bridge sampling (Meng and Wong, 1996), has several nice features. Its implementation is simple. In general, as pointed out by Gelman and Meng (1998), we can always construct a continuous path to link two competing models with the same support. Hence, the method can be applied to a wide variety of problems. Unlike some methods for estimating the marginal likelihood via posterior simulation, it does not require estimating the location and/or scale parameters in the posterior. Distinct from most existing approaches, the prior density is not directly involved in the evaluation. Finally, the logarithm scale of the Bayes factor is computed, which is generally more stable than the ratio scale.

2.2 Other Alternatives

A simple approximation of $2\log B_{10}$ is the following Schwarz criterion $S^*$ (Schwarz, 1978):

$$2\log B_{10} \cong 2S^* = 2\left\{\log p(Y|\tilde\theta_1, M_1) - \log p(Y|\tilde\theta_0, M_0)\right\} - (d_1 - d_0)\log n, \tag{3}$$

where $\tilde\theta_1$ and $\tilde\theta_0$ are maximum likelihood (ML) estimates of $\theta_1$ and $\theta_0$ under $M_1$ and $M_0$, respectively; $d_1$ and $d_0$ are the dimensions of $\theta_1$ and $\theta_0$; and $n$ is the sample size. Minus $2S^*$ is the following Bayesian Information Criterion (BIC) for comparing $M_1$ and $M_0$:

$$\mathrm{BIC}_{10} = -2S^* \cong -2\log B_{10} = 2\log B_{01}. \tag{4}$$

The interpretation of $\mathrm{BIC}_{10}$ can be based on (1). For each $M_k$, $k = 0, 1$, we define

$$\mathrm{BIC}_k = -2\log p(Y|\tilde\theta_k, M_k) + d_k\log n. \tag{5}$$

Then $2\log B_{10} \cong \mathrm{BIC}_0 - \mathrm{BIC}_1$. Hence, it follows that the model $M_k$ with the smaller $\mathrm{BIC}_k$ value is selected. As $n$ tends to infinity, it has been shown (Schwarz, 1978) that

$$\frac{S^* - \log B_{10}}{\log B_{10}} \to 0,$$
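Bayesian Model Comparison of Structural Equation Models 125

thus $S^*$ may be viewed as an approximation to $\log B_{10}$. As this approximation is of order $O(1)$, $S^*$ does not give the exact $\log B_{10}$ even for large samples. However, since the interpretation is on the natural logarithm scale, it provides a reasonable indication of evidence. As pointed out by Kass and Raftery (1995), it can be used for scientific reporting as long as the number of degrees of freedom $(d_1 - d_0)$ involved in the comparison is small relative to the sample size $n$.

The BIC is appealing in that it is relatively simple and can be applied even when the priors $p(\theta_k|M_k)$ ($k = 0, 1$) are hard to set precisely. The ML estimates of $\theta_1$ and $\theta_0$ are involved in the computation of BIC. In practice, since the Bayesian estimates and the ML estimates are close to each other, the Bayesian estimates can be used for computing the BIC. The order of approximation is not changed, and the BIC obtained can be interpreted using the criterion given in (1). See Raftery (1993) for an application of BIC to the standard LISREL model that is based on the normal assumption and a linear structural equation. Under this simple case, the computation of the observed data logarithm likelihood $\log p(Y|\tilde\theta_k, M_k)$ is straightforward. However, for some complex SEMs, evaluation of the observed data logarithm likelihood may be difficult.

The Akaike Information Criterion (AIC) associated with a competing model $M_k$ is given by

$$\mathrm{AIC}_k = -2\log p(Y|\tilde\theta_k, M_k) + 2d_k, \tag{6}$$

which does not involve the sample size $n$. The interpretation of $\mathrm{AIC}_k$ is similar to $\mathrm{BIC}_k$. That is, $M_k$ is selected if its $\mathrm{AIC}_k$ is smaller. Comparing (5) with (6), we see that BIC tends to favor simpler models than those selected by AIC.

Another goodness-of-fit or model comparison statistic that takes into account the number of unknown parameters in the model is the Deviance Information Criterion (DIC), see Spiegelhalter et al. (2002). This statistic is intended as a generalization of AIC. Under a competing model $M_k$ with a vector of unknown parameters $\theta_k$ of dimension $d_k$, let $\{\theta_k^{(j)}: j = 1,\ldots,J\}$ be a sample of observations simulated from the posterior distribution. We define

$$\mathrm{DIC}_k = -\frac{2}{J}\sum_{j=1}^{J}\log p\left(Y|\theta_k^{(j)}, M_k\right) + 2d_k. \tag{7}$$

In model comparison, the model with the smaller DIC value is selected.

In analyzing a hypothesized model, WinBUGS (Spiegelhalter et al., 2003) produces a DIC value that can be used for model comparison. However, as pointed out in the WinBUGS User Manual (Spiegelhalter et al., 2003), in practical application of DIC, it is important to note the following: (a) If the difference in DIC is small, for example, less than five, and the models make very different inferences, then just reporting the model with the lowest DIC could be misleading. (b) DIC can be applied to nonnested models. Similar to the Bayes factor, BIC, and AIC, DIC gives a clear conclusion to support the null hypothesis or the alternative hypothesis. (c) DIC assumes the posterior mean to be a good estimate of the parameter. There are circumstances, such as mixture models, in which WinBUGS will not give the DIC values. Because of (c) and because the Bayes factor is the more common statistic for model comparison, we focus on the Bayes factor in this chapter.

3 Computation of Bayes Factor through Path Sampling

In general, let $Y$ be the matrix of observed data, $L$ be the matrix of latent data, and $\Omega$ be the matrix of latent variables in the model. Usually, owing to the complexity of the model, direct application of path sampling (Gelman and Meng, 1998) in evaluating the Bayes factor is difficult. In this chapter, we utilize the idea of data augmentation (Tanner and Wong, 1987) to solve the problem. Below we use reasoning similar to that in Gelman and Meng (1998) to show briefly that path sampling can be applied to compute the Bayes factor by augmenting $Y$ with $\Omega$ and $L$.

From the equality $p(\Omega, L, \theta|Y) = p(Y, \Omega, L, \theta)/p(Y)$, the marginal density $p(Y)$ can be treated as the normalizing constant of $p(\Omega, L, \theta|Y)$, with the complete-data probability density $p(Y, \Omega, L, \theta)$ taken as the unnormalized density. Now, consider the following class of densities which are indexed by a continuous parameter $t$ in $[0, 1]$:

$$p(\Omega, L, \theta|Y, t) = \frac{1}{z(t)}\,p(Y, \Omega, L, \theta|t), \tag{8}$$

where

$$z(t) = p(Y|t) = \int p(Y, \Omega, L, \theta|t)\,d\Omega\,dL\,d\theta = \int p(Y, \Omega, L|\theta, t)\,p(\theta)\,d\Omega\,dL\,d\theta, \tag{9}$$

with $p(\theta)$ being the prior density of $\theta$, which is assumed to be independent of $t$. In computing the Bayes factor, we construct a path using the parameter $t$ in $[0, 1]$ to link two competing models $M_1$ and $M_0$ together, so that $B_{10} = z(1)/z(0)$. Taking the logarithm and then differentiating (9) with respect to $t$, and assuming the legitimacy of interchanging integration with differentiation, we have

$$\begin{aligned}
\frac{d\log z(t)}{dt} &= \int\frac{1}{z(t)}\frac{d}{dt}p(Y, \Omega, L, \theta|t)\,d\Omega\,dL\,d\theta\\
&= \int\frac{d}{dt}\log p(Y, \Omega, L, \theta|t)\cdot p(\Omega, L, \theta|Y, t)\,d\Omega\,dL\,d\theta\\
&= E_{\Omega, L, \theta}\left[\frac{d}{dt}\log p(Y, \Omega, L, \theta|t)\right],
\end{aligned}$$

where $E_{\Omega, L, \theta}$ denotes the expectation with respect to the distribution $p(\Omega, L, \theta|Y, t)$. Let

$$U(Y, \Omega, L, \theta, t) = \frac{d}{dt}\log p(Y, \Omega, L, \theta|t) = \frac{d}{dt}\log p(Y, \Omega, L|\theta, t), \tag{10}$$

which does not involve the prior density $p(\theta)$; we have

$$\log B_{10} = \log\frac{z(1)}{z(0)} = \int_0^1 E_{\Omega, L, \theta}\left[U(Y, \Omega, L, \theta, t)\right]dt.$$

We follow the method in Ogata (1989) to numerically evaluate the integral over $t$. Specifically, we first order the unique values of $S$ fixed grids $\{t_{(s)}\}_{s=0}^{S}$ such that $t_{(0)} = 0$ […] $0$ and an inverse scale parameter $\beta > 0$; $\psi_{\epsilon k}$ and $\psi_{\delta k}$ are the $k$th diagonal elements of $\Psi_\epsilon$ and $\Psi_\delta$, respectively; $W[\cdot,\cdot]$ denotes the Wishart distribution; $\alpha_{0\epsilon k}$, $\beta_{0\epsilon k}$, $\alpha_{0\delta k}$, $\beta_{0\delta k}$, $A_{0k}$, $\Lambda_{0k}$, $\Lambda_{0\omega k}$, $\rho_0$, and the positive definite matrices $H_{0k}$, $H_{0yk}$, $H_{0\omega k}$, and $R_0$ are hyper-parameters, whose values are either subjectively determined if good prior information is available or objectively determined from the data or other sources. The full conditional distributions under these conjugate prior distributions can be obtained from Lee and Song (2003); or they can be obtained as special cases of the full conditional distributions presented in the Appendix.

4.3 A Simulation Study

The objectives of this simulation study are to reveal the performance of the path sampling procedure for computing the Bayes factor and to evaluate the sensitivity with respect to prior inputs. Random observations were simulated from the nonlinear model defined by (13) and (14) with eight manifest variables, which are related to two fixed covariates $\{c_{yi1}, c_{yi2}\}$ and three latent variables $\{\eta_i, \xi_{i1}, \xi_{i2}\}$. The first fixed covariate $c_{yi1}$ is sampled from a multinomial distribution which takes values 1.0, 2.0, and 3.0 with probabilities $\Phi^*(-0.5)$, $\Phi^*(0.5) - \Phi^*(-0.5)$, and $1.0 - \Phi^*(0.5)$, respectively, where $\Phi^*$ is the distribution function of $N[0, 1]$. The second covariate $c_{yi2}$ is sampled from $N[0, 1]$. The true population values in the matrices $A$, $\Lambda$, and $\Psi_\epsilon$ are

$$A^T = \begin{pmatrix} 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0 & 1.0\\ 0.7 & 0.7 & 0.7 & 0.7 & 0.7 & 0.7 & 0.7 & 0.7\end{pmatrix},$$

$$\Lambda^T = \begin{pmatrix} 1.0^* & 1.5 & 1.5 & 0.0^* & 0.0^* & 0.0^* & 0.0^* & 0.0^*\\ 0.0^* & 0.0^* & 0.0^* & 1.0^* & 1.5 & 0.0^* & 0.0^* & 0.0^*\\ 0.0^* & 0.0^* & 0.0^* & 0.0^* & 0.0^* & 1.0^* & 1.5 & 1.5\end{pmatrix},\qquad \Psi_\epsilon = I_8,$$

where $I_8$ is the 8 by 8 identity matrix, and parameters with an asterisk were treated as known. The true variances and covariance of $\xi_{i1}$ and $\xi_{i2}$ are $\phi_{11} = \phi_{22} = 1.0$ and $\phi_{21} = 0.15$. These two latent variables are related to $\eta_i$ by

$$\eta_i = 1.0c_i + 0.5\xi_{i1} + 0.5\xi_{i2} + 1.0\xi_{i2}^2 + \delta_i,$$

where $c_i$ is another fixed covariate sampled from a Bernoulli distribution that takes 1.0 with probability 0.7 and 0.0 with probability 0.3; and $\psi_\delta = 1.0$. On the basis of these specifications, random samples $\{y_i, i = 1,\ldots,n\}$ with $n = 300$ were generated for the simulation study. A total of 100 replications were taken for each case.

Attention is devoted to comparing models with different formulations of the more interesting structural equation with latent variables. Hence, models with the same measurement equation and the following structural equations are involved in the model comparison:

$M_0$: $\eta_i = bc_i + \gamma_1\xi_{i1} + \gamma_2\xi_{i2} + \gamma_{22}\xi_{i2}^2 + \delta_i$,
$M_1$: $\eta_i = bc_i + \gamma_1\xi_{i1} + \gamma_2\xi_{i2} + \delta_i$,
$M_2$: $\eta_i = bc_i + \gamma_1\xi_{i1} + \gamma_2\xi_{i2} + \gamma_{12}\xi_{i1}\xi_{i2} + \delta_i$,
$M_3$: $\eta_i = bc_i + \gamma_1\xi_{i1} + \gamma_2\xi_{i2} + \gamma_{11}\xi_{i1}^2 + \delta_i$,
$M_4$: $\eta_i = bc_i + \gamma_1\xi_{i1} + \gamma_2\xi_{i2} + \gamma_{12}\xi_{i1}\xi_{i2} + \gamma_{11}\xi_{i1}^2 + \delta_i$,
$M_5$: $\eta_i = \gamma_1\xi_{i1} + \gamma_2\xi_{i2} + \gamma_{22}\xi_{i2}^2 + \delta_i$,
$M_6$: $\eta_i = bc_i + \gamma_1\xi_{i1} + \gamma_2\xi_{i2} + \gamma_{12}\xi_{i1}\xi_{i2} + \gamma_{11}\xi_{i1}^2 + \gamma_{22}\xi_{i2}^2 + \delta_i$.

Here, $M_0$ is the true model; $M_1$ is a linear model; $M_2$, $M_3$, and $M_4$ are nonnested in $M_0$; and $M_0$ is nested in the most general model $M_6$. To give a more detailed illustration of applying the procedure to model comparison of nonlinear SEMs, the implementation of path sampling to estimate $\log B_{02}$ in comparing $M_0$ and $M_2$ is given here. Let $\theta = (\tilde\theta, \Lambda_\omega)$ and $\theta_t = (\tilde\theta, \Lambda_{t\omega})$, where $\Lambda_\omega = (b, \gamma_1, \gamma_2, \gamma_{12}, \gamma_{22})$, $\Lambda_{t\omega} = (b, \gamma_1, \gamma_2, (1-t)\gamma_{12}, t\gamma_{22})$, and $\tilde\theta$ includes all unknown parameters in $M_t$ except for $\Lambda_\omega$. The procedure consists of the following steps:

Step 1: Select an $M_t$ to link $M_0$ and $M_2$. Here, $M_t$ is defined with the same measurement model as in $M_0$ and $M_2$, but with the following structural equation:

$M_t$: $\eta_i = bc_i + \gamma_1\xi_{i1} + \gamma_2\xi_{i2} + (1-t)\gamma_{12}\xi_{i1}\xi_{i2} + t\gamma_{22}\xi_{i2}^2 + \delta_i$.

Clearly, when $t = 1$, $M_t = M_0$; when $t = 0$, $M_t = M_2$.

Step 2: At the fixed grid $t = t_{(s)}$, generate observations $(\Omega^{(j)}, \theta^{(j)})$, $j = 1,\ldots,J$, from $p(\Omega, \theta|Y, t_{(s)})$. Specifically, at the $j$th iteration:
1. Generate $\Omega^{(j+1)}$ from $p(\Omega|\theta_t^{(j)}, Y)$.
2. Generate $\tilde\theta^{(j+1)}$ from $p(\tilde\theta|\Omega^{(j+1)}, \Lambda_{t\omega}^{(j)}, Y)$.
3. Generate $\Lambda_\omega^{(j+1)}$ from $p(\Lambda_\omega|\Omega^{(j+1)}, \tilde\theta^{(j+1)}, Y)$.
4. Update $\theta_t^{(j+1)}$ by letting $\theta_t^{(j+1)} = (\tilde\theta^{(j+1)}, \Lambda_{t\omega}^{(j+1)})$, where $\Lambda_{t\omega}^{(j+1)}$ is obtained by multiplying the elements of $\Lambda_\omega^{(j+1)}$ by the corresponding elements of $(1, 1, 1, 1 - t_{(s)}, t_{(s)})^T$.

Step 3: Calculate $U(Y, \Omega^{(j)}, \theta^{(j)}, t_{(s)})$ by substituting $\{(\Omega^{(j)}, \theta^{(j)}); j = 1,\ldots,J\}$ into (15) as follows:

$$U(Y, \Omega, \theta, t_{(s)}) = \sum_{i=1}^{n}\left(\eta_i - bc_i - \gamma_1\xi_{i1} - \gamma_2\xi_{i2} - (1 - t_{(s)})\gamma_{12}\xi_{i1}\xi_{i2} - t_{(s)}\gamma_{22}\xi_{i2}^2\right)\left(\gamma_{22}\xi_{i2}^2 - \gamma_{12}\xi_{i1}\xi_{i2}\right)/\psi_\delta.$$

Step 4: Calculate $\bar U_{(s)}$; see (12).

Step 5: Repeat Step 2 to Step 4 until all $\bar U_{(s)}$, $s = 0,\ldots,S$, are calculated. Then, $\log B_{02}$ can be estimated by (11).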
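The grid-and-average logic of Steps 2 to 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `path_sampling_log_ratio`, `draw_at`, and `u_func` are hypothetical, the toy path is a one-dimensional Gaussian kernel rather than one of the SEMs above (chosen because its answer is known in closed form), exact draws stand in for the Gibbs/MH sampler, and the integral over $t$ is approximated by a trapezoidal rule over 21 equally spaced grid points.

```python
import math
import random


def path_sampling_log_ratio(draw_at, u_func, grid, n_draws, rng):
    """Trapezoidal path-sampling estimate of log{z(1)/z(0)}.

    draw_at(t, n_draws, rng) returns draws from the density indexed by t;
    u_func(x, t) evaluates d/dt of the log unnormalized density at x.
    """
    # U-bar at each grid point: the Monte Carlo average of U over the draws.
    u_bar = [
        sum(u_func(x, t) for x in draw_at(t, n_draws, rng)) / n_draws
        for t in grid
    ]
    # Trapezoidal rule over adjacent grid points.
    return 0.5 * sum(
        (grid[s + 1] - grid[s]) * (u_bar[s + 1] + u_bar[s])
        for s in range(len(grid) - 1)
    )


# Toy path (an assumption for illustration, not one of the SEMs in the text):
# unnormalized density q(x | t) = exp(-(1 + t) * x**2 / 2), so that
# z(t) = sqrt(2 * pi / (1 + t)) and log{z(1)/z(0)} = -0.5 * log(2).
def draw_at(t, n_draws, rng):
    # q(. | t) is a N(0, 1 / (1 + t)) kernel, so exact draws replace MCMC here.
    sd = 1.0 / math.sqrt(1.0 + t)
    return [rng.gauss(0.0, sd) for _ in range(n_draws)]


def u_func(x, t):
    # d/dt log q(x | t) = -x**2 / 2 (it happens not to depend on t).
    return -0.5 * x * x


grid = [s / 20 for s in range(21)]  # 21 equally spaced points on [0, 1]
estimate = path_sampling_log_ratio(draw_at, u_func, grid, 2000, random.Random(1))
exact = -0.5 * math.log(2.0)
print(f"estimate={estimate:.4f}  exact={exact:.4f}")
```

In an actual SEM application, `draw_at` would be replaced by the posterior simulation of Step 2 at each grid point and `u_func` by the model-specific $U$ of Step 3; the averaging and trapezoidal summation would stay the same.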
In the sensitivity analysis concerning the prior inputs, the less important hyper-parameters in the conjugate prior distribution are selected as $H_{0k} = I$, $H_{0yk} = I$, and $H_{0\omega k} = I$. For the more important hyper-parameters, we followed the suggestion of Kass and Raftery (1995) to perturb them as follows. For $\alpha_{0\epsilon k} = \alpha_{0\delta k} = 8$, $\beta_{0\epsilon k} = \beta_{0\delta k} = 10$, and $\rho_0 = 20$, we consider the following three types of prior inputs:

(I) $A_{0k}$, $\Lambda_{0k}$, and $\Lambda_{0\omega k}$ are selected to be the true parameter matrices, and $R_0^{-1} = (\rho_0 - q_2 - 1)\Phi_0$, where the elements in $\Phi_0$ are the true parameter values.
(II) The hyper-parameters specified in (I) are equal to half of the values given in (I).
(III) The hyper-parameters specified in (I) are equal to twice the values given in (I).

For Type (I) prior inputs as given above, we consider the following prior inputs on $\alpha_{0\epsilon k}$, $\alpha_{0\delta k}$, $\beta_{0\epsilon k}$, $\beta_{0\delta k}$, and $\rho_0$:

(IV) $\alpha_{0\epsilon k} = \alpha_{0\delta k} = 3$, $\beta_{0\epsilon k} = \beta_{0\delta k} = 5$, and $\rho_0 = 12$.
(V) $\alpha_{0\epsilon k} = \alpha_{0\delta k} = 12$, $\beta_{0\epsilon k} = \beta_{0\delta k} = 15$, and $\rho_0 = 30$.

For every case, we took 20 grids in $[0, 1]$ and collected $J = 1{,}000$ iterations after discarding 500 burn-in iterations at each grid in the computation of the Bayes factor. $\sigma^2$ was set to 1.0 in the MH algorithm, which gives an approximate acceptance rate of 0.43. Estimates of $\log B_{0k}$, $k = 1,\ldots,6$, under the different priors were computed. The mean and standard deviation of $\log B_{0k}$ were also computed on the basis of 100 replications. Results corresponding to $\log B_{0k}$, $k = 1,\ldots,5$, and $\log B_{60}$ are reported in Table 1. Moreover, for each $k = 1,\ldots,6$, we evaluate

$$D(\text{I–II}) = \max\left|\log B_{0k}(\text{I}) - \log B_{0k}(\text{II})\right|$$

as well as $D(\text{I–III})$ and $D(\text{IV–V})$ similarly, where $\log B_{0k}(\text{I})$ is the estimate under prior (I) and so on, and "max" is the maximum taken over the 100 replications. The results are presented in Table 2; for example, the maximum difference of the estimates of $\log B_{01}$ obtained via priors (I) and (II) is 6.55. From the rows of Table 1, we observe that the means and standard deviations of $\log B_{0k}$ obtained under different prior inputs are close to each other. This indicates that the estimate of $\log B_{0k}$ is not very sensitive to these prior inputs under a sample size of 300. For practical applications, we see from Table 2 that even for the worst situation with the maximum absolute deviation, the estimated logarithms of Bayes factors under different prior inputs give the same conclusion for selecting the model via the criterion given in (1).

Table 1 Mean and standard errors of the estimated log B0k in the simulation study, Mean (std)

          Prior I          Prior II         Prior III        Prior IV         Prior V
log B01   106.28 (25.06)   107.58 (25.15)   102.96 (24.81)   103.87 (22.71)   104.61 (23.92)
log B02   102.16 (24.91)   103.45 (25.02)    99.17 (24.54)    99.98 (22.67)   100.49 (23.47)
log B03   109.51 (25.63)   111.23 (25.74)   105.96 (25.19)   107.20 (23.81)   108.24 (24.59)
log B04   105.23 (25.31)   106.61 (25.47)   101.83 (24.90)   103.16 (23.78)   103.69 (24.12)
log B05    17.50 (5.44)     18.02 (5.56)     16.65 (5.21)     18.02 (5.34)     17.85 (5.30)
log B60     0.71 (0.54)      0.71 (0.51)      0.69 (0.55)      0.78 (0.67)      0.75 (0.65)

Table 2 Maximum absolute differences of log B0k under some different priors

           log B01   log B02   log B03   log B04   log B05   log B60
D(I–II)      6.55      5.47      8.22      5.24      2.18      0.27
D(I–III)     7.84      9.33     10.23     10.17      3.07      0.31
D(IV–V)     14.03     17.86     13.65      4.87      1.91      0.25

From Table 1, it is clear that $M_0$ is much better than the linear model $M_1$ and the nonnested models $M_2$, $M_3$, $M_4$, and $M_5$. Thus, the correct model is selected. For comparison with the encompassing model $M_6$, we found that out of 100 replications under prior (I), 75 of the $\log B_{60}$ are in the interval (0.0, 1.0), 23 of them are in (1.0, 2.0), and only 2 of them are in (2.0, 3.0). Since $M_0$ is simpler than $M_6$, it should be selected if $\log B_{60}$ is in (0.0, 1.0). Thus, the true model is selected in 75 out of the 100 replications. Owing to randomness, the remaining $\log B_{60}$ mildly support the encompassing model. Although the encompassing model is not the true model, it should not be regarded as an incorrect model for fitting the data.

It has been pointed out by Kass and Raftery (1995) that the effect of the priors is small in estimation. To give some idea of the empirical performance of the proposed procedure on estimation, the means of the Bayesian estimates and the root mean squares (RMS) between the Bayesian estimates and the true values of $M_0$ over the 100 replications under some prior inputs are reported in Table 3. It seems that the Bayesian estimates are quite accurate and not very sensitive to the selected prior inputs.

Table 3 Mean and RMS of Bayesian estimates under M0 with different priors

               Prior I          Prior II         Prior III
Para           Mean    RMS      Mean    RMS      Mean    RMS
λ21 = 1.5      1.495   0.047    1.491   0.048    1.498   0.047
λ31 = 1.5      1.493   0.049    1.490   0.049    1.497   0.049
λ52 = 1.5      1.467   0.127    1.467   0.120    1.465   0.119
λ73 = 1.5      1.525   0.094    1.525   0.098    1.536   0.104
λ83 = 1.5      1.534   0.098    1.533   0.101    1.544   0.104
b = 1.0        1.010   0.139    1.017   0.133    1.004   0.131
γ1 = 0.5       0.518   0.091    0.519   0.092    0.516   0.091
γ2 = 0.5       0.505   0.123    0.494   0.126    0.535   0.133
γ22 = 1.0      1.060   0.141    1.064   0.148    1.065   0.146
a11 = 1.0      1.000   0.064    0.992   0.062    1.015   0.065
a21 = 1.0      1.007   0.094    0.997   0.087    1.026   0.095
a31 = 1.0      1.000   0.093    0.990   0.086    1.019   0.091
a41 = 1.0      0.991   0.037    0.989   0.037    0.996   0.036
a51 = 1.0      0.992   0.046    0.989   0.047    0.999   0.045
a61 = 1.0      1.000   0.031    0.997   0.031    1.006   0.032
a71 = 1.0      0.997   0.045    0.993   0.046    1.006   0.047
a81 = 1.0      0.998   0.040    0.994   0.041    1.007   0.041
a12 = 0.7      0.701   0.095    0.685   0.095    0.726   0.096
a22 = 0.7      0.700   0.123    0.676   0.124    0.735   0.125
a32 = 0.7      0.703   0.108    0.680   0.110    0.739   0.113
a42 = 0.7      0.704   0.086    0.695   0.085    0.718   0.088
a52 = 0.7      0.723   0.102    0.710   0.098    0.741   0.107
a62 = 0.7      0.702   0.054    0.698   0.055    0.711   0.056
a72 = 0.7      0.695   0.068    0.690   0.069    0.707   0.068
a82 = 0.7      0.713   0.080    0.708   0.080    0.725   0.083
ψε1 = 1.0      0.842   0.095    0.841   0.094    0.840   0.094
ψε2 = 1.0      0.839   0.098    0.843   0.100    0.852   0.104
ψε3 = 1.0      0.836   0.093    0.839   0.094    0.847   0.096
ψε4 = 1.0      0.811   0.077    0.813   0.077    0.812   0.079
ψε5 = 1.0      0.948   0.179    0.945   0.176    0.949   0.179
ψε6 = 1.0      0.839   0.082    0.839   0.082    0.841   0.083
ψε7 = 1.0      0.852   0.094    0.851   0.094    0.851   0.094
ψε8 = 1.0      0.846   0.086    0.846   0.087    0.846   0.087
ψδ = 1.0       1.021   0.109    1.028   0.111    1.026   0.112
φ11 = 1.0      0.980   0.140    0.980   0.137    0.981   0.136
φ12 = 0.15     0.148   0.076    0.148   0.076    0.147   0.076
φ22 = 1.0      0.959   0.126    0.962   0.129    0.948   0.132

5 Model Comparison of an Integrated SEM

Motivated by the demand for efficient statistical methods for analyzing various kinds of complex real data in substantive research, the recent growth of SEM has been rather rapid. Efficient methods have been developed to handle missing data (see Dolan et al., 2005; Song and Lee, 2006b), dichotomous or ordered categorical data (Song and Lee, 2004, 2005), and hierarchical or multilevel data (Ansari and Jedidi, 2000; Raykov and Marcoulides, 2006; Lee and Song, 2004b). Although model comparison has been separately addressed in some of the above-mentioned articles, it is necessary to compute the Bayes factor in the context of an integrated model for model comparison under some complex situations. To see this point, let us consider, for example, the problem of comparing a nonlinear SEM ($M_1$) with a two-level linear SEM ($M_2$). The computational method that was developed based on $M_1$ for computing the Bayes factor can only be used to compare models under the model framework of the nonlinear SEM. As the method developed under $M_1$ cannot be applied to a different two-level SEM, the model comparison problem cannot be solved. Similarly, even given a separate Bayesian development on the basis of $M_2$, the computational procedure that is developed under a two-level linear SEM cannot handle a nonlinear SEM. A solution to this problem requires the development of an integrated model that subsumes both $M_1$ and $M_2$, so that model comparison can be done under a comprehensive framework.

5.1 The Integrated Model

We consider a two-level nonlinear SEM with missing dichotomous and ordered categorical variables. In a generic sense, the specific relationship between an ordered categorical variable $z$ and its underlying continuous variable $y$ is given by

$$z = k\quad\text{if}\quad \alpha_{k-1} \le y < \alpha_k,\quad\text{for } k = 1,\ldots,m, \tag{17}$$

where $\{-\infty = \alpha_0 < \alpha_1 < \cdots < \alpha_{m-1} < \alpha_m = \infty\}$ is the set of unknown thresholds that defines $m$ categories. The variance and thresholds corresponding to each ordered categorical variable are not identifiable. A common method for achieving identification is to fix the smallest and the largest thresholds, $\alpha_1$ and $\alpha_{m-1}$, of the corresponding ordered categorical variable at preassigned values (see Shi and Lee, 2000), for example, $\alpha_1 = \Phi^{*-1}(f_1^*)$ and $\alpha_{m-1} = \Phi^{*-1}(f_{m-1}^*)$, where $f_k^*$ is the observed cumulative marginal proportion of the categories with $z$ […] $\alpha$, and $d = 0$ if $w \le \alpha$, (18)

where $\alpha$ is an unknown threshold parameter. To identify a dichotomous variable, we use the method suggested by Song and Lee (2005) by fixing the corresponding error measurement's variance at a preassigned value, for example, 1.0.

To formulate the integrated model, we consider a collection of $p$-variate random vectors $u_{gi}$ for $i = 1,\ldots,N_g$, within groups $g = 1,\ldots,G$. As the sample sizes $N_g$ may differ from group to group, the data set is unbalanced. We assume that, conditional on the group mean $v_g$, random observations in each group at the within-groups (first) level have the following structure:

$$u_{gi} = v_g + A_{1g}c_{ugi} + \Lambda_{1g}\omega_{1gi} + \epsilon_{1gi},\qquad g = 1,\ldots,G,\ i = 1,\ldots,N_g, \tag{19}$$

where $c_{ugi}$ is a vector of fixed covariates, $A_{1g}$ is a matrix of coefficients, $\Lambda_{1g}$ is a matrix of factor loadings, $\omega_{1gi}$ is a $q_1\times 1$ vector of latent variables
, and $\epsilon_{1gi}$ is a $p\times 1$ vector of error measurements. It is assumed that $\epsilon_{1gi}$ is independent of $\omega_{1gi}$ and is distributed as $N[0, \Psi_{1g}]$, where $\Psi_{1g}$ is a diagonal matrix. At the between-groups (second) level, we assume that $v_g$ has the structure

$$v_g = A_2c_{vg} + \Lambda_2\omega_{2g} + \epsilon_{2g},\qquad g = 1,\ldots,G, \tag{20}$$

where $c_{vg}$ is a vector of fixed covariates, $A_2$ is a matrix of coefficients, $\Lambda_2$ is a matrix of factor loadings, $\omega_{2g}$ is a $q_2\times 1$ vector of latent variables, and $\epsilon_{2g}$ is a $p\times 1$ vector of error measurements. It is assumed that $\epsilon_{2g}$ is independent of $\omega_{2g}$ and is distributed as $N[0, \Psi_2]$, where $\Psi_2$ is a diagonal matrix. Moreover, the first level latent vectors are assumed to be independent of the second level latent vectors. However, because of the presence of $v_g$, $u_{gi}$ and $u_{gj}$ are correlated, and the usual assumption about independence is violated. Equations (19) and (20) define the measurement equations for the within-groups and between-groups models. To assess the relationships among latent variables at both levels, the following nonlinear structural equations in the between-groups and within-groups models are considered:

$$\eta_{1gi} = B_{1g}c_{1gi} + \Pi_{1g}\eta_{1gi} + \Gamma_{1g}F_1(\xi_{1gi}) + \delta_{1gi},\quad\text{and} \tag{21}$$

$$\eta_{2g} = B_2c_{2g} + \Pi_2\eta_{2g} + \Gamma_2F_2(\xi_{2g}) + \delta_{2g}, \tag{22}$$

where $c_{1gi}$ and $c_{2g}$ are fixed covariates, $B_{1g}$, $B_2$, $\Pi_{1g}$, $\Pi_2$, $\Gamma_{1g}$, and $\Gamma_2$ are matrices of coefficients, and $F_1(\xi_{1gi}) = (f_{11}(\xi_{1gi}),\ldots,f_{1a}(\xi_{1gi}))^T$ and $F_2(\xi_{2g}) = (f_{21}(\xi_{2g}),\ldots,f_{2b}(\xi_{2g}))^T$ are vector-valued functions with nonzero differentiable functions. We assume that $I_1 - \Pi_{1g}$ and $I_2 - \Pi_2$ are nonsingular and their determinants are independent of $\Pi_{1g}$ and $\Pi_2$, respectively; $\xi_{1gi}$ and $\delta_{1gi}$ are independently distributed as $N[0, \Phi_{1g}]$ and $N[0, \Psi_{1\delta g}]$, respectively, where $\Psi_{1\delta g}$ is a diagonal matrix. Similarly, it is assumed that $\xi_{2g}$ and $\delta_{2g}$ are independently distributed as $N[0, \Phi_2]$ and $N[0, \Psi_{2\delta}]$, respectively, where $\Psi_{2\delta}$ is a diagonal matrix. However, owing to the nonlinear functions in $F_1$ and $F_2$, the distribution of $u_{gi}$ is not normal.

To investigate the model with mixed continuous, dichotomous, and ordered categorical variables, we suppose without loss of generality that $u_{gi} = (x_{gi}^T, w_{gi}^T, y_{gi}^T)^T$, where $x_{gi} = (x_{gi1},\ldots,x_{gir})^T$ is an observable continuous random vector, and $w_{gi} = (w_{gi1},\ldots,w_{gis})^T$ and $y_{gi} = (y_{gi1},\ldots,y_{git})^T$ are unobservable continuous random vectors that underlie the observable dichotomous and ordered categorical vectors $d_{gi}$ and $z_{gi}$, respectively. The links of an ordered categorical variable and a dichotomous variable with their underlying continuous variables are given by (17) and (18), respectively. Entries of $u_{gi}$ are allowed to be missing at random. We identify the covariance models by fixing appropriate elements of $\Lambda_{1g}$, $\Lambda_2$, $\Pi_{1g}$, $\Gamma_{1g}$, $\Pi_2$, and $\Gamma_2$ at preassigned values.

5.2 Model Comparison

We first consider the posterior simulation to generate the required observations from the joint posterior distribution for computing the Bayes factor, see (11) and (12). Let $X_{obs}$, $D_{obs}$, and $Z_{obs}$ be the observed data corresponding to the continuous, dichotomous, and ordered categorical variables, and let $X_{mis}$, $D_{mis}$, and $Z_{mis}$ be the missing data corresponding to these types of variables. It is assumed that missing data are missing at random. Let $W_{obs}$ and $W_{mis}$ be the underlying unobservable continuous measurements corresponding to $D_{obs}$ and $D_{mis}$; and let $Y_{obs}$ and $Y_{mis}$ be the underlying unobservable continuous measurements corresponding to $Z_{obs}$ and $Z_{mis}$, respectively. Let $O = (X_{obs}, D_{obs}, Z_{obs})$ be the observed data set. Moreover, let $\Omega_{1g} = (\omega_{1g1},\ldots,\omega_{1gN_g})$ and $\Omega_1 = (\Omega_{11},\ldots,\Omega_{1G})$ be the matrices that contain the within-groups latent vectors and matrices; and let $\Omega_2 = (\omega_{21},\ldots,\omega_{2G})$ and $V = (v_1,\ldots,v_G)$ be the matrices that contain the between-groups latent vectors. Finally, let $\alpha$ and $\theta$ be the vectors that, respectively, contain all unknown
BayesianModelComparisonofStructuralEquationModels137thresholdsandallunknownparametersthatareinvolvedinthemodeldefinedby(19)–(22).Asufficientlylargenumberofobservationswillbesimulatedfromthefollowingjointposteriordistributionp(θ,α,Yobs,Wobs,Umis,1,2,V|O),whereUmis=(Xmis,Ymis,Zmis).ThesimulationisdonebytheGibbssam-pler,whichiterativelydrawssamplesfromthefollowingfullconditionaldis-tributions:p(θ|α,Yobs,Wobs,Umis,1,2,V,O),p(α,Yobs|θ,Wobs,Umis,1,2,V,O),p(Wobs|θ,α,Yobs,Umis,1,2,V,O),p(Umis|θ,α,Yobs,Wobs,1,2,V,O),p(1|θ,α,Yobs,Wobs,Umis,2,V,O),p(2|θ,α,Yobs,Wobs,Umis,1,V,O),andp(V|θ,α,Yobs,Wobs,Umis,1,2,O).Notethatp(θ|α,Yobs,Wobs,Umis,1,2,V,O)=p(θ|∗)isfurtherdecomposedintothefol-lowingcomponents:p(A1|θ−A,∗),p(1|θ−,∗),...,p(2δ|θ−,∗),where112δθ−A,θ−,...,θ−aresubvectorsofθwithoutA1,1,...,2δ,respec-112δtively.TheabovefullconditionaldistributionsrequiredforimplementingtheGibbssampleraregivenintheAppendix.Someoftheconditionaldistributionsarestandarddistributionssuchasnormal,univariatetruncatednormal,Gamma,andinvertedWishart,simulatingobservationsfromthemisstraight-forwardandfast.TheMetropolis–Hastings(MH)(Metropolisetal.,1953;Hastings,1970)algorithmwillbeusedtosimulateobservationsfromthefollowingmorecomplicatedconditionaldistributions,p(1|θ,α,Yobs,Wobs,Umis,2,V,O),p(2|θ,α,Yobs,Wmis,Umis,1,V,O),andp(α,Yobs|θ,Wobs,Umis,1,2,V,O).AstheimplementationoftheMHalgorithmissimilartothatgiveninSongandLee(2004,2005),itisnotpresented.Inthepathsamplingprocedure,weaugmenttheobserveddataOwiththela-tentquantities(Yobs,Wobs,Umis,1,2,V)intheanalysis.Considerthefollowingclassofdensitiesdefinedbyacontinuousparametertin[0,1]:p(θ,α,Yobs,Wobs,Umis,1,2,V,O|t)p(θ,α,Yobs,Wobs,Umis,1,2,V|O,t)=,z(t)wherez(t)=p(O|t).RecallthattisaparametertolinkM0andM1suchthatfora=0,1,z(a)=p(O|t=a)=p(O|Ma).Hence,B10=z(1)/z(0).ItfollowsfromthereasoninginSect.3thatS1logB10=(t(s+1)−t(s))(¯(s+1)+¯(s)),(23)2s=0wheret(0)=0
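
The trapezoidal sum in (23) can be illustrated on a toy problem that is unrelated to the structural equation model itself. The sketch below is only an illustration under stated assumptions: the unnormalized density $q(x \mid t) = \exp\{-(1+t)x^2/2\}$, the function names `u_bar` and `log_ratio_path_sampling`, the equally spaced grid, and the endpoint $t_{(S+1)} = 1$ are all hypothetical choices made here, not taken from the chapter. For this toy density $z(t) = \sqrt{2\pi/(1+t)}$, so $\log\{z(1)/z(0)\} = -\tfrac{1}{2}\log 2$ is known exactly, and draws from $p(x \mid t)$ are made directly rather than by the Gibbs or MH steps described above.

```python
import math
import random


def u_bar(t, n_draws, rng):
    """Monte Carlo average of U(x, t) = d/dt log q(x | t) under p(x | t)."""
    # Toy assumption: q(x | t) = exp(-(1 + t) x^2 / 2), so p(x | t) is N(0, 1 / (1 + t)).
    sd = 1.0 / math.sqrt(1.0 + t)
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(0.0, sd)
        total += -0.5 * x * x  # d/dt of -(1 + t) x^2 / 2
    return total / n_draws


def log_ratio_path_sampling(n_interior=20, n_draws=5000, seed=0):
    """Trapezoidal path sampling estimate of log z(1) - log z(0)."""
    rng = random.Random(seed)
    # Assumed grid: 0 = t_(0) < t_(1) < ... < t_(S) < t_(S+1) = 1, equally spaced, S = n_interior.
    grid = [s / (n_interior + 1) for s in range(n_interior + 2)]
    u = [u_bar(t, n_draws, rng) for t in grid]
    # Same form as the sum in (23): half the sum of grid widths times adjacent averages.
    return 0.5 * sum(
        (grid[s + 1] - grid[s]) * (u[s + 1] + u[s]) for s in range(n_interior + 1)
    )


estimate = log_ratio_path_sampling()
exact = -0.5 * math.log(2.0)
print(f"path sampling estimate: {estimate:.4f}, exact value: {exact:.4f}")
```

In an actual application the direct draws inside `u_bar` would be replaced by MCMC output at each grid point, and the accuracy would depend on both the grid size and the number of retained draws.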
