Proceedings of the SIAM International Conference on Data Mining (SDM-2004), pp. 333-344, Lake Buena Vista, FL, April 2004

Active Semi-Supervision for Pairwise Constrained Clustering

Sugato Basu, Computer Sciences, Univ. of Texas at Austin, Austin, TX 78712, sugato@cs.utexas.edu
Arindam Banerjee, Electrical and Computer Eng., Univ. of Texas at Austin, Austin, TX 78712, abanerje@ece.utexas.edu
Raymond J. Mooney, Computer Sciences, Univ. of Texas at Austin, Austin, TX 78712, mooney@cs.utexas.edu

Abstract

Semi-supervised clustering uses a small amount of supervised data to aid unsupervised learning. One typical approach specifies a limited number of must-link and cannot-link constraints between pairs of examples. This paper presents a pairwise constrained clustering framework and a new method for actively selecting informative pairwise constraints to get improved clustering performance. The clustering and active learning methods are both easily scalable to large datasets, and can handle very high dimensional data. Experimental and theoretical results confirm that this active querying of pairwise constraints significantly improves the accuracy of clustering when given a relatively small amount of supervision.

1 Introduction

In many data mining and machine learning tasks, there is a la[...]

[...] providing class labels, since true labels may be unknown a priori, while it can be easier to specify whether pairs of points belong to the same cluster or different clusters.

We propose a cost function for pairwise constrained clustering (PCC) that can be shown to be the configuration energy of a Hidden Markov Random Field (HMRF) over the data with a well-defined potential function and noise model. Then, the pairwise-constrained clustering problem becomes equivalent to finding the HMRF configuration with the highest posterior probability, i.e., minimizing its energy. We present an algorithm for solving this problem.

Further, in order to maximize the utility of the limited supervised data available in a semi-supervised setting, supervised training examples should be actively selected as maximally informative ones rather than chosen at random, if possible [27]. In that case, fewer constraints will be required to significantly improve the clustering accuracy. To this end, we present a new method for actively selecting good pair-