Journal of Biomedical Informatics 37 (2004) 249–259
www.elsevier.com/locate/yjbin

Cancer classification and prediction using logistic regression with Bayesian gene selection

Xiaobo Zhou (a,b), Kuang-Yu Liu (a,b), Stephen T.C. Wong (a,b,*)

(a) Harvard Center for Neurodegeneration and Repair - Center for Bioinformatics, Harvard Medical School, 220 Longwood Avenue, Boston, MA 02115, USA
(b) Radiology Department, Harvard Medical School and Brigham and Women's Hospital, 77 Francis Street, Boston, MA 02115, USA

Received 13 June 2004
Available online 11 September 2004

Abstract

In microarray-based cancer classification and prediction, gene selection is an important research problem owing to the large number of genes and the small number of experimental conditions. In this paper, we propose a Bayesian approach to gene selection and classification using the logistic regression model. The basic idea of our approach is in conjunction with a logistic regression model to relate the gene expression with the class labels. We use Gibbs sampling and Markov chain Monte Carlo (MCMC) methods to discover important genes. To implement Gibbs Sampler and MCMC search, we derive a posterior distribution of selected genes given the observed data. After the important genes are identified, the same logistic regression model is then used for cancer classification and prediction. Issues for efficient implementation for the proposed method are discussed. The proposed method is evaluated against several large microarray data sets, including hereditary breast cancer, small round blue-cell tumors, and acute leukemia. The results show that the method can effectively identify important genes consistent with the known biological findings while the accuracy of the classification is also high. Finally, the robustness and sensitivity properties of the proposed method are also investigated.

© 2004 Elsevier Inc. All rights reserved.

Keywords: Gene microarray; Logistic regression; Bayesian gene selection; Cancer classification

1. Introduction

Cancer classification and prediction has become one of the most important applications of DNA microarray […]

[…] for model selection [12], and the logistic regression model [3]. The logistic regression model, also known as logit in the literature, is one of the most common models for prediction, regression, and classification of d[…]