[O'Reilly Media] R Cookbook: Data Transformations
CHAPTER 6
Data Transformations

Introduction

This chapter is all about the apply functions: apply, lapply, sapply, tapply, mapply; and their cousins, by and split. These functions let you take data in great gulps and process the whole gulp at once. Where traditional programming languages use loops, R uses vectorized operations and the apply functions to crunch data in batches, greatly streamlining the calculations.

Defining Groups Via a Factor

An important idiom of R is using a factor to define a group. Suppose we have a vector and a factor, both of the same length, that were created as follows:

> v <- c(40,2,83,28,58)
> f <- factor(c("A","C","A","B","C"))

We can visualize the vector elements and factor levels side by side, like this:

Vector  Factor
40      A
2       C
83      A
28      B
58      C

The factor level identifies the group of each vector element: 40 and 83 are in group A; 28 is in group B; and 2 and 58 are in group C.

In this book, I refer to such factors as grouping factors. They effectively slice and dice our data by putting them into groups. This is powerful because processing data in groups occurs often in statistics when comparing group means, comparing group proportions, performing ANOVA, and so forth.

This chapter has recipes that use grouping factors to split vector elements into their respective groups (Recipe 6.1), apply a function to groups within a vector (Recipe 6.5), and apply a function to groups of rows within a data frame (Recipe 6.6). In other chapters, the same idiom is used to test group means (Recipe 9.19), perform one-way ANOVA (Recipe 11.20), and plot data points by groups (Recipe 10.4), among other uses.

6.1 Splitting a Vector into Groups

Problem

You have a vector. Each element belongs to one of several groups, and the groups are identified by a grouping factor. You want to split the elements into the groups.

Solution

Suppose the vector is x and the factor is f. You can use the split function:

> groups <- split(x, f)

Alternatively, you can use the unstack function:

> groups <- unstack(data.frame(x,f))

Both functions return a list of vectors, where each vector contains the elements for one group. The unstack function goes one step further: if all vectors have the same length, it converts the list into a data frame.

Discussion

The Cars93 dataset contains a factor called Origin that has two levels, USA and non-USA. It also contains a column c
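As a minimal sketch of the Solution for Recipe 6.1, the split and unstack calls can be tried on the small vector and grouping factor from the chapter introduction. The third factor label is taken to be "A" here so that the factor agrees with the side-by-side table; the names x2, f2, and balanced are hypothetical and are introduced only to illustrate the equal-length case.

```r
# Vector and grouping factor from the chapter introduction.
# Assumption: the third label is "A", matching the side-by-side table.
v <- c(40, 2, 83, 28, 58)
f <- factor(c("A", "C", "A", "B", "C"))

# split returns a named list with one vector per factor level.
groups <- split(v, f)
print(groups)

# unstack takes a data frame whose first column holds the values and
# whose second column holds the grouping factor. The groups here have
# different lengths (2, 1, and 2), so the result stays a list.
groups2 <- unstack(data.frame(v, f))
print(groups2)

# Hypothetical balanced data: every group has the same length,
# so unstack converts the result into a data frame.
x2 <- c(1, 2, 3, 4)
f2 <- factor(c("a", "b", "a", "b"))
balanced <- unstack(data.frame(x2, f2))
print(balanced)
```

Individual groups can then be pulled out by name, for example groups$A or groups[["C"]].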