An Introduction to Information Retrieval

Draft of November 17, 2007

Preliminary draft (c) 2007 Cambridge UP

Christopher D. Manning
Prabhakar Raghavan
Hinrich Schütze

Cambridge University Press
Cambridge, England

DRAFT! DO NOT DISTRIBUTE WITHOUT PRIOR PERMISSION

© 2007 Cambridge University Press
By Christopher D. Manning, Prabhakar Raghavan & Hinrich Schütze
Printed on November 17, 2007
Website: http://www.informationretrieval.org/
Comments, corrections, and other feedback most welcome at: informationretrieval@yahoogroups.com

Brief Contents

1 Boolean retrieval
2 The term vocabulary and postings lists
3 Dictionaries and tolerant retrieval
4 Index construction
5 Index compression
6 Scoring, term weighting and the vector space model
7 Computing scores in a complete search system
8 Evaluation in information retrieval
9 Relevance feedback and query expansion
10 XML retrieval
11 Probabilistic information retrieval
12 Language models for information retrieval
13 Text classification and Naive Bayes
14 Vector space classification
15 Support vector machines and machine learning on documents
16 Flat clustering
17 Hierarchical clustering
18 Matrix decompositions and latent semantic indexing
19 Web search basics
20 Web crawling and indexes
21 Link analysis

Contents

List of Tables
List of Figures
Table of Notation
Preface

1 Boolean retrieval
  1.1 An example information retrieval problem
  1.2 A first take at building an inverted index
  1.3 Processing Boolean queries
  1.4 The extended Boolean model versus ranked retrieval
  1.5 References and further reading
2 The term vocabulary and postings lists
  2.1 Document delineation and character sequence decoding
    2.1.1 Obtaining the character sequence in a document
    2.1.2 Choosing a document unit
  2.2 Determining the vocabulary of terms
    2.2.1 Tokenization
    2.2.2 Dropping common terms: st