YouTube-8M: A Large-Scale Video Classification Benchmark

Sami Abu-El-Haija (haija@google.com), Nisarg Kothari (ndk@google.com), Joonseok Lee (joonseok@google.com), Paul Natsev (natsev@google.com), George Toderici (gtoderici@google.com), Balakrishnan Varadarajan (balakrishnanv@google.com), Sudheendra Vijayanarasimhan (svnaras@google.com)
Google Research

ABSTRACT

Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have reduced the barrier of entry for exploring novel approaches at scale. It is possible to train models over millions of examples within a few days. Although large-scale datasets exist for image understanding, such as ImageNet, there are no comparable size video classification datasets.

In this paper, we introduce YouTube-8M, the largest multi-label video classification dataset, composed of 8 million videos (500K hours of video) annotated with a vocabulary of 4800 visual entities. To get the videos and their (multiple) labels, we used a YouTube video annotation system, which labels videos with the main topics in them. While the labels are machine-generated, they have high precision and are derived from a variety of human-based signals including metadata and query click signals, so they represent an excellent target for content-based annotation approaches. We filtered the video labels (Knowledge Graph entities) using both automated and manual curation strategies, including asking human raters if the labels are visually recognizable. Then, we decoded each video at one-frame-per-second, and used a Deep CNN pre-trained on ImageNet to extract the hidden representation immediately prior to the classification layer. Finally, we compressed the frame features and make both the features and video-level labels available for download. The dataset contains frame-level features for over 1.9 billion video frames a

Figure 1: YouTube-8M is a large-scale benchmark for general multi-label video classification. This screenshot of a dataset explorer depicts a subset of videos in the dataset annotated with the entity "Guitar". The dataset explorer allows browsing and searching of the full vocabulary of Knowledge Graph entities, grouped in 24 top-level verticals, along with corresponding videos.
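
As a rough consistency check on the figures quoted in the abstract, the sketch below multiplies the stated 500K hours of video by the stated decoding rate of one frame per second and compares the result with the frame count the abstract reports. The constants are the rounded values from the text, and the variable names are illustrative only, so the output is an order-of-magnitude estimate rather than an exact dataset statistic.

```python
# Back-of-the-envelope check of the dataset figures quoted in the abstract.
# All inputs are the rounded values from the text, so results are approximate.
NUM_VIDEOS = 8_000_000     # "8 million videos"
TOTAL_HOURS = 500_000      # "500K hours of video"
FRAMES_PER_SECOND = 1      # videos are decoded at one frame per second

total_seconds = TOTAL_HOURS * 3600
approx_frames = total_seconds * FRAMES_PER_SECOND
avg_seconds_per_video = total_seconds / NUM_VIDEOS

print(f"approximate frame count: {approx_frames:,}")
print(f"average video length: {avg_seconds_per_video:.0f} s")
```

The estimate of about 1.8 billion frames is in line with the "over 1.9 billion" frames reported, given that both the hour count and the video count are rounded in the abstract.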