Modeling and Evaluating Summaries Using Complex Networks

Thiago Alexandre Salgueiro Pardo (1), Lucas Antiqueira (1), Maria das Graças Volpe Nunes (1), Osvaldo N. Oliveira Jr. (1,2), Luciano da Fontoura Costa (2)

(1) Núcleo Interinstitucional de Lingüística Computacional (NILC)
CP 668 – ICMC-USP, 13.560-970 São Carlos, SP, Brasil
http://www.nilc.icmc.usp.br

(2) Instituto de Física de São Carlos
CP 369 – IFSC-USP, 13.560-970 São Carlos, SP, Brasil
http://www.ifsc.usp.br/

{taspardo,lantiq}@gmail.com, gracan@icmc.usp.br, {chu,luciano}@if.sc.usp.br

Abstract. This paper presents a summary evaluation method based on a complex network measure. We show how to model summaries as complex networks and establish a possible correlation between summary quality and the measure known as dynamics of the network growth. It is a generic and language-independent method that enables easy and fast comparative evaluation of summaries. We evaluate our approach using manually produced summaries and automatic summaries produced by three automatic text summarizers for the Brazilian Portuguese language. The results agree with human intuition and proved to be statistically significant.

1 Introduction

Automatic text summarization is the task of automatically producing a shorter version of a text (Mani, 2001), which should convey the essential meaning of the source text and meet the reader's goals. Nowadays, due to the increasing amount of available information, mainly on-line, and the need to retrieve such information with high accuracy and to understand it faster than ever, automatic summarization is unquestionably an important task.

Summaries are present in a wide range of our daily activities. When writing scientific papers, we have to write abstracts; when reading these papers, abstracts help us determine whether a paper is relevant to our purposes. In a bookshop, the decision to buy a book is usually based on its cover synopsis. Some internet search engines use summaries to identify the main parts of documents and to help users choose which documents to retrieve.

In spite of the extensive investigation into methods for automatic summarization, it is still hard to determine which method is better. Summary evaluation remains an unresolved issue. Various aspects of summaries require evaluation (Mani, 2001), including amount of information, coherence, cohesion, thematic progression, legibility, grammaticality and textuality. Some are hard to define, while some significantly overlap. Depending on the final use of a summary, be it for humans or computer applications, different criteria need to be met: if humans are the intended readers, coherence and cohesion may be necessary; if the summary is to be used in a computer application, sometimes the information conveyed may be enough on its own. There are several summary evaluation metrics, whose computation may be carried out either by humans or by computers: if humans perform the evaluation, it becomes expensive, time consuming and prone to errors and inconsistencies; if computers perform it, subjective aspects of the evaluation are lost and the evaluation may not be complete.

Given the importance of the task, international conferences have been devoted to the theme, with DUC (Document Understanding Conference) being the most prominent, driving research in this area for the past 7 years.

Concomitantly, recent trends in Natural Language Processing (NLP) show the use of graphs as a powerful technique for modeling and processing texts. Such interest in graphs is due to their generic applicability, often leading to elegant solutions to difficult problems. For text summarization purposes, graphs have been used for both summary production (see, e.g., Erkan and Radev, 2004; Mihalcea, 2005) and evaluation (see, e.g., Santos Jr. et al., 2004). In particular, a special kind of graph, called a complex network, has received great attention over the last few years. Complex networks have proven useful for modeling NLP and Linguistics problems, in addition to many other applications (see, e.g., Barabási, 2003). They have been used, for instance, in modeling lexical resources (Sigman and Cecchi, 2002), human-induced word association (Costa, 2004), language evolution (Dorogovtsev and Mendes, 2002), syntactic relationships between words (Cancho et al., 2005) and text quality measurement (Antiqueira et al., 2005a, 2005b).

This paper presents a first approach to the use of complex networks in summary evaluation. In particular, it builds on the work of Antiqueira et al. (2005a, 2005b), by describing a possible representation of summaries as complex networks and establishing a correlation between summary quality and one of the network properties, namely the dynamics of the network growth. We evaluate our approach using the TeMário corpus (Pardo and Rino, 2003), comprising 100 texts in Brazilian Portuguese and the corresponding human-produced (manual) summaries, and automatic summaries produced by the systems GistSumm (Pardo et al., 2003), SuPor (Módolo, 2003) and GEI (Pardo and Rino, 2004).

In the next section, complex networks are introduced. Section 3 describes how summaries are modeled as complex networks. Experimental results with manual and automatic summaries, verifying the correlation between the dynamics of the network growth and summary quality, are shown in Section 4. Section 5 presents the conclusions and final remarks.

2 Complex networks: an overview

Complex networks are particularly complex types of graphs, i.e., structures that contain nodes and edges connecting them. They have received enormous attention in the last few years, but their study can be traced back to the initial developments in graph theory. However, in contrast to simple graphs, complex networks present connecting structures that tend to depart from being randomly uniform, i.e., their growth is usually not uniformly random (Barabási, 2003). Complex networks have been used to describe several world phenomena, from social networks to internet topology. Such
phenomena present properties that often conform to the complex network characteristics, which has led to complex networks being studied in a wide range of sciences, mainly in statistical mechanics and physics. See Barabási (2003) and Newman (2003) for a comprehensive overview of complex network uses.

Some properties that may be observed in complex networks are worth mentioning. Small-world networks are those in which there is a relatively short path between most nodes. For instance, social networks are usually small worlds. The clustering coefficient indicates the tendency of the network nodes to form groups; in a social network, the friends of a person tend to be friends too. A network is said to be scale free if the probability of a node having k edges connecting it to other nodes follows a power law distribution, i.e., $P(k) \sim k^{-\gamma}$, where $\gamma$ is a constant value dependent on the network properties (topology and connectivity factors, for instance). Scale-free networks contain hubs, which are highly connected nodes. On the internet, for example, hubs are the pages receiving links from many other pages.

These properties are also applicable to NLP-related tasks. Sigman and Cecchi (2002) modeled WordNet (Miller, 1985) as a complex network, where nodes represent the word meanings and edges represent the semantic relations between them. They showed that this network is a small world and contains hubs, mainly because of polysemic words. Motter et al. (2002) modeled a thesaurus as a network, where nodes represent words and edges represent the synonym relations between them, and found that this network was scale free. Antiqueira et al. (2005a, 2005b) modeled texts as complex networks, where nodes represent the words and edges connect adjacent words in a text. Among other things, they suggested that text quality is somewhat related to the clustering coefficient, with quality deteriorating as the coefficient increases.

In the next section, we show how to model summaries as complex networks.

3 Representing summaries as complex networks

Our representation of summaries as complex networks follows the scheme proposed by Antiqueira et al. (2005a, 2005b). First, pre-processing steps are carried out: the summary stopwords are removed and the remaining words are lemmatized. Removing stopwords eliminates irrelevant and very common words; using lemmas instead of surface words makes the processing more intelligent, since it becomes possible to identify words with related meaning.

The pre-processed summary is then represented as a complex network. Each word corresponds to a node in the network and word associations are represented as directed edges. In the representation adopted, each association is determined by a simple adjacency relation: for each pair of adjacent words in the summary there is a directed edge in the network pointing from the node that represents the first word to the node representing the subsequent word in the summary. The edges are weighted with the number of times the adjacent words are found in the summary. Significantly, in this representation, sentence and paragraph boundaries are not taken into consideration. As an example, the sample summary of Figure 1 (in Portuguese) is represented by the network in Figure 2.

Lula e Fernando Henrique Cardoso estão nitidamente à frente nas eleições. Nas sondagens anteriores, o segundo lugar de Fernando Henrique era disputado por mais quatro. Hoje, quem mais o ameaça, mesmo assim sem perigo, é Sarney.

Figure 1. Sample summary

Figure 2. Complex network for the summary in Figure 1

4 Summary evaluation

Antiqueira et al. (2005a, 2005b) showed the existence of a correlation between the dynamics of network growth and the quality of the text represented. The dynamics of a network's growth is a temporal measure of how many connected components there are in the network as word associations are progressively incorporated into it during construction. Initially, at a time t_0, all N different words (nodes of the network) in the text under analysis are the components. At a subsequent time t_1, when an association is found between any two adjacent words w_i and w_j in the text, there are N-1 components, i.e., the component formed by w_i and w_j and the other N-2 words without any edge between them. This procedure is repeated for each new word association added, until only one component representing the whole text is formed.

For each text, Antiqueira et al. plot a graph whose curve indicates the number of components in the network as new word associations are considered (which implies inserting a new edge, if it does not exist, or increasing the edge weight by 1 if it already exists). Considering a straight line in this plot, which would indicate a linear variation of the number of components as new word associations are considered, the authors showed that good-quality texts tend to be associated with a straight line in the dynamics plot. Moreover, text quality decreased as the deviation from the straight line increased. The general deviation from the straight line for a text is quantified by the following formula:

\[ \text{deviation} = \frac{\sum_{M=1}^{A} \left| f(M) - g(M) \right| / N}{A} \]

where f(M) is the function that determines the number of components for M word associations and g(M) is the function that determines the linear variation of components for M word associations; N is the number of different words in the text and A is the total number of word associations found.

Figure 3 shows the plot for a longer version of the summary in Figure 1, which is a manual summary built by a professional abstractor. The straight dotted line is the one that assumes linear variation of the number of components; the other line is the real curve for the summary. According to the above formula, the general deviation for the summary is 0.023. Figure 4 shows the plot for an automatic summary known to be worse, with the same size and for the same source text as the summary of Figure 3. Its general deviation is 0.051. Note the larger deviation in the curve.

Figure 3. Plot for a manual summary

Figure 4. Plot for an automatic summary

Antiqueira et al. performed their experiments with news texts, supposed to be good, and students' essays, supposed to be worse than the news texts. In this paper, we evaluate the possibility of adopting this method in summary evaluation. In order to do so, we first assume, as most works on summary evaluation do, that a summary must display the same properties a text presents in order to be classified as text. Therefore, summaries, as texts, must be coherent and cohesive, legible, grammatical, and present good thematic progression.

In our evaluation, we used a corpus called TeMário (Pardo and Rino, 2003) for Brazilian Portuguese. TeMário consists of 100 news texts from the on-line newspaper Folha de São Paulo (containing texts from Sections Special, World, Opinion, International, and Politics) and their corresponding manual summaries written by a professional abstractor. To our knowledge, TeMário is the only available corpus for summarization purposes for the Brazilian Portuguese language.

We compared the manual summaries to automatic summaries produced by 3 systems, namely, GistSumm (GIST SUMMarizer) (Pardo et al., 2003), SuPor (SUmmarizer for PORtuguese) (Módolo, 2003) and GEI (Gerador de Extratos Ideais) (Pardo and Rino, 2004). We selected these systems for the following reasons: GistSumm is one of the first summarization systems publicly available for Portuguese; according to Rino et al. (2004), SuPor is the best summarization system for Portuguese; GEI was used to produce the automatic summaries that also accompany the TeMário distribution. In what follows, each system is briefly explained. Then, our experiment is described and the results discussed.

4.1. Systems description

The summarizers used in the evaluation are all extractive summarizers, i.e., they build the summary of a source text by juxtaposing complete sentences from the text, without modifying them. The summaries produced in this way are also called extracts.

GistSumm is an automatic summarizer based on a summarization method called the gist-based method. It comprises three main processes: text segmentation, sentence ranking, and summary production. Sentence ranking is based on the keywords method (Luhn, 1958): it scores each sentence of the source text by summing up the frequency of its words, and the gist sentence is chosen as the one with the highest score. Summary production focuses on selecting other sentences from the source text to include in the summary, based on: (a) gist correlation and (b) relevance to the overall content of the source text. Criterion (a) is fulfilled by simply verifying co-occurring words in the candidate sentences and the gist sentence, ensuring lexical cohesion. Criterion (b) is fulfilled by sentences whose score is above a threshold, computed as the average of all the sentence scores, to guarantee that only relevant sentences are chosen. All the selected sentences above the threshold are juxtaposed to compose the summary.

SuPor is a machine-learning-based summarization system and, therefore, has two distinct processes: training and extracting based on a Naïve Bayes method, following Kupiec et al. (1995). It allows combining linguistic and non-linguistic features. In SuPor, relevant features for classification are (a) sentence length (minimum of 5 words); (b) word frequency; (c) signaling phrases; (d) sentence location in the texts; and (e) occurrence of nouns and proper nouns. As a result of training, a probabilistic distribution is produced, which enables summarization in SuPor. In this paper, following Rino et al. (2004), we use the same features. SuPor works in the following way: first, the set of features of each sentence is extracted; second, for each of the sets, the Bayesian classifier provides the probability of the corresponding sentence being included in the summary. The most probable sentences are selected to be in the summary.

Given a manual summary and its source text, GEI produces the corresponding ideal extract, i.e., a summary composed of complete sentences from the source text that correspond to the content of the sentences in the manual summary. This tool is based on the widely known vector space model and the cosine similarity measure (Salton and Buckley, 1988), and works as follows: 1) for each sentence in the manual summary, the most similar sentence in the source text is obtained through the cosine measure (based on word co-occurrence); 2) the most representative sentences are selected, yielding the corresponding ideal extract.

In general, ideal extracts are necessary to calculate automatically the amount of relevant information in automatic summaries produced by extractive methods. The automatic summaries are compared to the ideal extracts and two measures are usually computed: recall and precision. Recall is defined as the number of sentences from the
ideal extract included in the automatic summary over the number of sentences in the ideal extract; precision is defined as the number of sentences from the ideal extract included in the automatic summary over the number of sentences in the automatic summary. A third measure, called f-measure, is a combination of recall and precision, giving a single measure of a summarization system's performance.

As described by Rino et al. (2004), GistSumm and SuPor participated in a comparative evaluation. Recall, precision and f-measure were computed for the TeMário corpus, using the ideal extracts produced by GEI. A 30% compression rate was used in producing the automatic summaries. The compression rate specifies the size of the summary to be produced in relation to the source text in terms of number of words. In this case, the 30% compression rate specifies that the summary must have at most 30% of the number of words in the source text. Recall, precision and f-measure for GistSumm and SuPor are shown in Table 1, which reproduces part of the evaluation that Rino et al. presented.

Table 1. Systems performance (in %)

Systems     Recall   Precision   F-measure
SuPor       40.8     44.9        42.8
GistSumm    25.6     49.9        33.8

As can be noted, GistSumm had the highest precision, but the lowest recall. SuPor presented the best f-measure, being, therefore, the best system. These results will be commented upon in the next subsection, which describes the complex network experiment conducted in this paper.

4.2. Experiment

For running our experiment, we took the manual summaries and the ideal extracts (produced by GEI) that accompany TeMário and the corresponding automatic summaries produced by GistSumm and SuPor. As in Rino et al. (2004), we used a 30% compression rate. Based on our knowledge about the way the summaries were produced and on the evaluation that Rino et al. presented, we assume that the manual summaries are better than the ideal extracts, which are better than the automatic summaries. In terms of complex networks, the deviation from a straight line in the dynamics of network growth should be lowest for the manual summaries, and then increase for the ideal extracts and even more for the automatic summaries.

At this point, it is hard to predict how SuPor and GistSumm summaries will behave in relation to each other. Although SuPor is better than GistSumm in the informativity evaluation (see Table 1), i.e., the amount of relevant information the summaries have, it is unlikely this will be reflected in the way we model summaries as complex networks. In fact, in the text quality experiment, Antiqueira et al. (2005a, 2005b) suggested that what is being captured by the complex network is the flow of new concepts introduced during the text: bad texts would introduce most of the concepts abruptly; good texts, on the other hand, would do it gradually and uniformly during the text development, resulting in a more understandable and readable text.

Table 2 shows the average deviation for each group of summaries and its increase in relation to the manual summaries deviation. For instance, for GistSumm (line 3 in the table), the average of the summaries deviation is 0.03673, which is 20.62% larger than the average deviation for the manual summaries.

Table 2. Experiment results

                    Avg. deviation   Over manual summaries (%)
Manual summaries    0.03045          0
GEI                 0.03538          16.19
GistSumm            0.03673          20.62
SuPor               0.04373          43.61

Using Student's t-test (Casella and Berger, 2001) for comparing the average deviations of our data, with a 99% confidence interval, the p-values are below 0.03, which indicates that the resulting numbers are not due to mere chance. In other words, the results are statistically significant. The only exception was the p-value for the comparison between GistSumm and GEI, which was around 0.60. This happened because of the short distance between the results of the two systems, as Table 2 illustrates.

Figure 5 shows the histograms for the summaries and their respective deviations, where the x-axis represents the deviation and the y-axis the number of texts. As the average deviation grows for each group of summaries, the Gaussian distribution has its peak (which corresponds to the mean) displaced to the right, i.e., there are more texts with higher deviations.

Figure 5. Histograms for summaries and their deviations (panels: Manual summaries, GEI summaries, GistSumm summaries, SuPor summaries)
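
The adjacency network of Section 3 and the deviation measure of Section 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several labeled assumptions: the input is a list of tokens that have already had stopwords removed and been lemmatized; components are counted ignoring edge direction; A is taken as the total number of adjacent word pairs; differences are taken in absolute value; and the reference line g(M) is assumed to run from N components at M = 0 to the observed number of components at M = A. The function name `growth_deviation` is hypothetical.

```python
from collections import Counter


def growth_deviation(tokens):
    """Return (edge_weights, component_curve, deviation) for a token list.

    Assumes `tokens` is already pre-processed (stopwords removed, lemmas).
    """
    nodes = set(tokens)
    n = len(nodes)                      # N: number of different words
    parent = {w: w for w in nodes}      # union-find forest over the nodes

    def find(w):
        # Follow parent links to the representative, halving paths on the way.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    weights = Counter()                 # directed edge (first, second) -> weight
    components = n                      # at time t_0 every word is a component
    curve = []                          # f(M) for M = 1..A
    for first, second in zip(tokens, tokens[1:]):
        weights[(first, second)] += 1   # new edge, or weight increased by 1
        root_a, root_b = find(first), find(second)
        if root_a != root_b:            # the association merges two components
            parent[root_a] = root_b
            components -= 1
        curve.append(components)

    a = len(curve)                      # A: total number of word associations
    if a == 0:
        return weights, curve, 0.0

    final = curve[-1]
    total = 0.0
    for m, f_m in enumerate(curve, start=1):
        # Assumed reference line: from N at M = 0 down to f(A) at M = A.
        g_m = n - (n - final) * m / a
        total += abs(f_m - g_m) / n
    return weights, curve, total / a


if __name__ == "__main__":
    sample = ["lula", "fernando", "henrique", "lula", "fernando", "sarney"]
    edge_weights, curve, deviation = growth_deviation(sample)
    print(dict(edge_weights))
    print(curve, deviation)
```

Under these assumptions, a token sequence that introduces one new word per association gives a component curve that coincides with the reference line and hence a deviation of zero, while repeated associations between words already seen flatten the curve and raise the deviation.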

As expected, the results suggest that manual summaries are better than the ideal extracts, and that these are better than the automatic summaries. This observation positively answers our question about the possibility of using complex networks to evaluate summaries in a comparative fashion. We claim that it must be restricted to a comparative evaluation because it is difficult to judge the validity of a deviation number without any reference.

The results also show that, in contrast to the informativity evaluation, GistSumm outperformed SuPor in this experiment, which was anticipated above as a possible result. We believe the reason for this to be the summarization method used by GistSumm: to produce the summary, it selects sentences that correlate with the gist sentence, resulting in a summary with similar thematic elements across the sentences and, therefore, with a more natural flow of concepts. From the GistSumm and SuPor numbers, it is also possible to conclude that our modeling of summaries as complex networks probably does not capture summary informativity, or that alternative complex network measurements may be necessary.

5 Conclusions

This paper presented an application of the approach described by Antiqueira et al. (2005a, 2005b) to summary evaluation, which is considered a hard problem in NLP. By modeling summaries as complex networks and by exploring a network metric, we showed it to be possible to distinguish summaries according to their quality. The evaluation presented here can be used in association with other automatic evaluations, complementing the results obtained with the traditional informativity metrics, recall and precision, or with newer ones such as ROUGE (Lin and Hovy, 2003). Because it is based on an abstract representation of texts in terms of complex networks, the proposed solution looks elegant, generic and language independent.

In the future, we plan to apply this evaluation to other text genres, in addition to news texts. We also aim at investigating other network properties and their usefulness for characterizing the several aspects of a summary that are worth modeling and evaluating, e.g., coherence and cohesion. Other ways of modeling summaries as complex networks may also be explored.

Acknowledgments

The authors are grateful to CNPq and FAPESP.

References

Antiqueira, L.; Nunes, M.G.V.; Oliveira Jr., O.N.; Costa, L.F. (2005a). Modelando Textos como Redes Complexas. In Anais do III Workshop em Tecnologia da Informação e da Linguagem Humana – TIL. São Leopoldo-RS, Brazil. July 22-26.

Antiqueira, L.; Nunes, M.G.V.; Oliveira Jr., O.N.; Costa, L.F. (2005b). Complex networks in the assessment of text quality. physics/0504033.

Barabási, A.L. (2003). Linked: How Everything Is Connected to Everything Else and What It Means for Business, Science, and Everyday Life. Plume, New York.

Cancho, R.F.; Capocci, A.; Caldarelli, G. (2005). Spectral methods cluster words of the same class in a syntactic dependency network. cond-mat/0504165.

Casella, G. and Berger, R.L. (2001). Statistical Inference. Duxbury, Belmont, California.

Costa, L.F. (2004). What's in a name? International Journal of Modern Physics C, Vol. 15, pp. 371-379.

Dorogovtsev, S.N. and Mendes, J.F.F. (2002). Evolution of networks. Advances in Physics, Vol. 51, N. 4, pp. 1079-1187.

Erkan, G. and Radev, D.R. (2004). Lexrank: Graph-based centrality as salience in text summarization. Journal of Artificial Intelligence Research – JAIR, Vol. 22, pp. 457-479.

Kupiec, J.; Pedersen, J.; Chen, F. (1995). A trainable document summarizer. In the Proceedings of the 18th ACM-SIGIR Conference on Research & Development in Information Retrieval, pp. 68-73.

Lin, C-Y. and Hovy, E.H. (2003). Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics. In the Proceedings of Language Technology Conference – HLT. Edmonton, Canada. May 27-June 1.

Luhn, H. (1958). The automatic creation of literature abstracts. IBM Journal of Research and Development, Vol. 2, pp. 159-165.

Mani, I. (2001). Automatic Summarization. John Benjamins Publishing Company.

Mihalcea, R. (2005). Language Independent Extractive Summarization. In the Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics. Ann Arbor, Michigan.

Miller, G.A. (1985). Wordnet: a dictionary browser. In the Proceedings of the First International Conference on Information in Data. University of Waterloo.

Módolo, M. (2003). SuPor: um Ambiente para a Exploração de Métodos Extrativos para a Sumarização Automática de Textos em Português. Master thesis. Departamento de Computação, UFSCar.

Motter, A.E.; Moura, A.P.S.; Lai, Y.C.; Dasgupta, P. (2002). Topology of the conceptual network of language. Physical Review E, Vol. 65, 065102.

Newman, M.E.J. (2003). The structure and function of complex networks. SIAM Review, Vol. 45, pp. 167-256.

Pardo, T.A.S. and Rino, L.H.M. (2003). TeMário: Um Corpus para Sumarização Automática de Textos. NILC technical report. NILC-TR-03-09. São Carlos-SP, October, 13p.

Pardo, T.A.S. and Rino, L.H.M. (2004). Descrição do GEI - Gerador de Extratos Ideais para o Português do Brasil. NILC technical report. NILC-TR-04-07. São Carlos-SP, August, 10p.

Pardo, T.A.S.; Rino, L.H.M.; Nunes, M.G.V. (2003). GistSumm: A Summarization Tool Based on a New Extractive Method. In N.J. Mamede, J. Baptista, I. Trancoso, M.G.V. Nunes (eds.), 6th Workshop on Computational Processing of the Portuguese Language - Written and Spoken – PROPOR (Lecture Notes in Artificial Intelligence 2721), pp. 210-218. Faro, Portugal. June 26-27.

Rino, L.H.M.; Pardo, T.A.S.; Silla Jr., C.N.; Kaestner, C.A.; Pombo, M. (2004). A Comparison of Automatic Summarization Systems for Brazilian Portuguese Texts. In the Proceedings of the 17th Brazilian Symposium on Artificial Intelligence – SBIA (Lecture Notes in Artificial Intelligence 3171), pp. 235-244. São Luis-MA, Brazil. September, 29-October, 1.

Salton, G. and Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing and Management, Vol. 24, pp. 513-523.

Santos Jr., E.; Mohamed, A.A.; Zhao, Q. (2004). Automatic Evaluation of Summaries Using Document Graphs. In the Proceedings of the Workshop on Text Summarization Branches Out, pp. 66-73.

Sigman, M. and Cecchi, G.A. (2002). Global Organization of the Wordnet Lexicon. In the Proceedings of the National Academy of Sciences, Vol. 99, pp. 1742-1747.
