ID: 33410252
Size: 4.36 MB
Pages: 69
Time: 2019-02-25
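Xidian University Master's Thesis
Design and Implementation of a Web-Based Full-Text Retrieval System for Digital Resources
Author: Ma Jing
Degree applied for: Master
Major: Educational Technology
Advisor: Zhan Haisheng
Date: 2010-01-01

Abstract (translated from the Chinese)

The rapid development of information technology and the emergence of the World Wide Web have produced large amounts of data, and data in doc, pdf, and txt formats in particular is growing massively. How to organize and store this heterogeneous data sensibly and improve retrieval efficiency is an important problem facing the field of full-text retrieval. The research task of this project is to design and implement a Chinese full-text retrieval system that provides full-text search over local gazetteer materials.

This project studies the principles of full-text retrieval in depth, including index definition, index content, index creation, and index search. It adopts the open-source Lucene search technology. The thesis analyzes in detail Lucene's structural composition, indexing process, search process, data flow organization, and the structure of its retrieval packages and their respective functions.

The index storage structure in this project is an inverted list. Accessing inverted lists efficiently and responding to searches quickly are crucial in full-text search. The thesis analyzes the index storage structure in depth and designs an index document reordering algorithm that can effectively improve the index compression ratio: the cluster index realign algorithm. The algorithm places highly similar documents next to each other, which reduces the number of bytes needed to encode the differences between document numbers and so improves search efficiency. Tests show that the cluster index realign algorithm clearly reduces index storage space and achieves index compression.

Keywords: full-text retrieval, Lucene, cluster index realign algorithm

Abstract

The rapid development of information technologies and the advent of the World Wide Web have resulted in a tremendous increase in the amount of available information; data in doc, pdf, and txt formats in particular is expanding rapidly. Given the growing prevalence of networked electronic information, the research goal of this thesis is to determine how to arrange the format and content of this information sensibly so that it brings the greatest convenience to Internet users. The research aim is to build a full-text retrieval system for chorography, based on a B/S (browser/server) architecture and accessed via the web. The research target is to design and develop a Chinese full-text retrieval system that provides full-text search over local gazetteers. An improved inverted index algorithm, named the cluster index realign algorithm, is designed and used to store the compressed mass index file, reducing index storage space and improving retrieval efficiency.

The thesis studies the principles of full-text retrieval in detail, including the principle of the full-text retrieval index, index content, indexing, and search. It analyzes Lucene's structure, indexing process, search process, data flow organization, and the structure of its retrieval packages and their respective functions. The index storage structure is analyzed further, and an algorithm that can effectively improve the index compression ratio is designed: the cluster index realign algorithm. This algorithm arranges highly similar documents together through clustering so that the d-gaps (the differences between successive document numbers in an inverted list) become smaller, which would make the index compression more effective ...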
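The effect of document ordering on d-gap size can be illustrated with a small sketch. This is not the thesis's cluster index realign algorithm: the clustering step is replaced here by topic labels that are known in advance, and variable-byte coding is assumed as the gap encoding because the abstract only speaks of the bytes needed to encode differences between document numbers. All names (`vbyte_len`, `build_postings`, `d_gaps`, `index_bytes`, `toy_corpus`) and the toy corpus are hypothetical.

```python
from typing import Dict, Iterable, List


def vbyte_len(n: int) -> int:
    """Bytes needed to store n with 7 payload bits per byte (assumed encoding)."""
    size = 1
    while n >= 128:
        n >>= 7
        size += 1
    return size


def build_postings(docs: Dict[int, Iterable[str]]) -> Dict[str, List[int]]:
    """Build an inverted index: term -> ascending list of document numbers."""
    postings: Dict[str, List[int]] = {}
    for doc_id in sorted(docs):
        for term in set(docs[doc_id]):
            postings.setdefault(term, []).append(doc_id)
    return postings


def d_gaps(doc_ids: List[int]) -> List[int]:
    """First document number, then differences between successive numbers."""
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]


def index_bytes(docs: Dict[int, Iterable[str]]) -> int:
    """Total bytes for all posting lists when every d-gap is variable-byte coded."""
    postings = build_postings(docs)
    return sum(vbyte_len(gap) for ids in postings.values() for gap in d_gaps(ids))


NUM_TOPICS, DOCS_PER_TOPIC = 200, 5


def toy_corpus(clustered: bool) -> Dict[int, List[str]]:
    """Toy corpus: each document holds one topic term (a stand-in for similarity)."""
    docs: Dict[int, List[str]] = {}
    for topic in range(NUM_TOPICS):
        for member in range(DOCS_PER_TOPIC):
            if clustered:
                # Similar documents receive adjacent numbers: gaps of 1.
                doc_id = topic * DOCS_PER_TOPIC + member + 1
            else:
                # Similar documents are spread out: gaps of NUM_TOPICS.
                doc_id = member * NUM_TOPICS + topic + 1
            docs[doc_id] = [f"topic{topic}"]
    return docs


scattered_bytes = index_bytes(toy_corpus(clustered=False))
clustered_bytes = index_bytes(toy_corpus(clustered=True))
print("scattered numbering:", scattered_bytes, "bytes")
print("clustered numbering:", clustered_bytes, "bytes")
```

In this toy setting the clustered numbering needs fewer bytes because most gaps fit in a single byte, while gaps of 200 in the scattered numbering need two. A real corpus would need an actual similarity measure and clustering step to decide the new document order.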