Acta Polytechnica Hungarica Vol. 3, No. 1, 2006

Frequent Pattern Mining in Web Log Data

Renáta Iváncsy, István Vajk
Department of Automation and Applied Informatics, and HAS-BUTE Control Research Group
Budapest University of Technology and Economics
Goldmann Gy. tér 3, H-1111 Budapest, Hungary
e-mail: {renata.ivancsy, vajk}@aut.bme.hu

Abstract: Frequent pattern mining is a heavily researched area in the field of data mining with a wide range of applications. One of them is the use of frequent pattern discovery methods on Web log data. Discovering hidden information from Web log data is called Web usage mining. The aim of discovering frequent patterns in Web log data is to obtain information about the navigational behavior of the users. This can be used for advertising purposes, for creating dynamic user profiles, etc. In this paper three pattern mining approaches are investigated from the Web usage mining point of view. The different patterns in Web log mining are page sets, page sequences and page graphs.

Keywords: Pattern mining, Sequence mining, Graph Mining, Web log mining

1 Introduction

The expansion of the World Wide Web (Web for short) has resulted in a large amount of data that is now, in general, freely available for user access. The different types of data have to be managed and organized in such a way that they can be accessed by different users efficiently. Therefore, the application of data mining techniques on the Web is now the focus of an increasing number of researchers. Several data mining methods are used to discover the hidden information in the Web. However, Web mining does not only mean applying data mining techniques to the data stored in the Web. The algorithms have to be modified such that they better suit the demands of the Web. New approaches should be used which better fit the properties of Web data. Furthermore, not only data mining algorithms, but also artificial intelligence, information retrieval and natural language processing techniques can be used efficiently. Thus, Web mining has developed into an autonomous research area.

The focus of this paper is to provide an overview of how to use frequent pattern mining techniques for discovering different types of patterns in a Web log database. The three patterns to be searched are frequent itemsets, sequences a