ID:34628699
Size: 6.58 MB
Pages: 581
Date: 2019-03-08
An Introduction to Information Retrieval
Draft of April 1, 2009
Online edition (c) 2009 Cambridge UP

Christopher D. Manning
Prabhakar Raghavan
Hinrich Schütze

Cambridge University Press
Cambridge, England

DRAFT! DO NOT DISTRIBUTE WITHOUT PRIOR PERMISSION
© 2009 Cambridge University Press
By Christopher D. Manning, Prabhakar Raghavan & Hinrich Schütze
Printed on April 1, 2009
Website: http://www.informationretrieval.org/
Comments, corrections, and other feedback most welcome at: informationretrieval@yahoogroups.com

DRAFT! © April 1, 2009 Cambridge University Press. Feedback welcome.

Brief Contents

1 Boolean retrieval
2 The term vocabulary and postings lists
3 Dictionaries and tolerant retrieval
4 Index construction
5 Index compression
6 Scoring, term weighting and the vector space model
7 Computing scores in a complete search system
8 Evaluation in information retrieval
9 Relevance feedback and query expansion
10 XML retrieval
11 Probabilistic information retrieval
12 Language models for information retrieval
13 Text classification and Naive Bayes
14 Vector space classification
15 Support vector machines and machine learning on documents
16 Flat clustering
17 Hierarchical clustering
18 Matrix decompositions and latent semantic indexing
19 Web search basics
20 Web crawling and indexes
21 Link analysis

Contents

List of Tables
List of Figures
Table of Notation
Preface

1 Boolean retrieval
1.1 An example information retrieval problem
1.2 A first take at building an inverted index
1.3 Processing Boolean queries
1.4 The extended Boolean model versus ranked retrieval
1.5 References and further reading

2 The term vocabulary and postings lists
2.1 Document delineation and character sequence decoding
2.1.1 Obtaining the character sequence in a document
2.1.2 Choosing a document unit
2.2 Determining the vocabulary of terms
2.2.1 Tokenization
2.2.2 Dropping common terms: stop words
2.2.3 Normalization (equiv