Convergence of Database and Internet: Challenge and Opportunities
Aoying Zhou, Weining Qian, Minqi Zhou
East China Normal University
2014-8-30, Shanghai

Outline
• DB and the Internet
• Scalability: the Good, the Bad, and the Ugly
• Consistency in distributed systems
• Consistency in database transactions
• A long way to go

The Way of Internet Enterprises
1. To construct a cyberspace, via providing
  – information services, or
    • Google, Baidu, ...
  – an environment for communication, entertainment, etc.
    • Facebook, Tencent, ...
2. To increase the volume of traffic
3. Traffic ==> cash flow
• Essentially an eyeball economy

Trends
• O2O: Online to Offline
  – Enabled by smartphones, LBS, and mobile computing
• The business in the offline part is potentially much larger than that in the online part
• Databases (DataBases) become more important
  – The key system for data management and transaction processing

Database in/from Internet Enterprises
• BigTable (Google)
  – Web page storage
• Cassandra (Facebook)
  – Message management
• OceanBase (Alibaba)
  – Taobao e-commerce, Alipay, ...
• F1/Spanner (Google)
  – AdWords
• ...

What's new?
• Massive users (sometimes unpredictable)
  – e.g. 11/11 (Festival of Singles)
  – Similar workload peaks can be found in:
    • Ticket booking during Spring Festival
    • Financial product sales
• Requirement: Scalability
  – From scaling up to scaling out
  – Thousands of machines
  – Low management overhead
• We need systems that are scalable and consistent

Hadoop, for example

Before Hadoop
• 2003: GFS paper
  – Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung: The Google file system. SOSP 2003: 29-43
• 2004: MapReduce paper
  – Jeffrey Dean, Sanjay Ghemawat: MapReduce: Simplified Data Processing on Large Clusters. OSDI 2004: 137-150
• 2006: BigTable paper
  – Fay Chang, Jeffrey Dean, Sanjay Ghemawat, et al.: Bigtable: A Distributed Storage System for Structured Data. OSDI 2006: 205-218

Birth of Hadoop
• 2004: Doug Cutting and Michael J. Cafarella build a system based on papers from Google Labs, which is named Hadoop.
• 2005: Hadoop is listed as part of Nutch, which is a sub-project of Lucene of Apache
• 2006.3: Map/Reduce and Nutch Distributed File System (NDFS) are included in Hadoop
• 2006.1 – 2008: Web-scale Hadoop! (@Yahoo!)

The big family of Hadoop
• Hadoop Common
• MapReduce
• HDFS
• Pig
• HBase
• ZooKeeper
• Hive
• Avro
• Mahout
• ...

The success of Hadoop
• Hardware support:
  – Shared-nothing architecture, MPP clusters
  – Fast networking
• Tech. advantages of
  – distributed storage and parallel processing
  – replication and auto-failover
• Wide adoption in Internet companies
• Scalability seems to be achieved for specific types of applications
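
To make the "distributed storage and parallel processing" point concrete, the following is a minimal single-process sketch of MapReduce-style processing. It is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions chosen for illustration, and each "split" merely stands in for a block that a real cluster would store and process on a separate machine.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values of each key; summation is a distributive aggregate."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(splits):
    pairs = []
    for split in splits:  # in a cluster, each split is mapped independently
        pairs.extend(map_phase(split))
    return reduce_phase(shuffle(pairs))


# Hypothetical input: two splits, as if stored on two different nodes.
splits = [
    ["hadoop scales out", "hdfs stores blocks"],
    ["hadoop runs mapreduce"],
]
counts = word_count(splits)
```

Because every map call touches only its own split and every reduce call touches only one key, the work partitions cleanly, which is the kind of application for which scale-out comes comparatively easily.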

Scalability is not free
• No transaction processing
  – Append-only or weak consistency
• Limited paradigm of processing
  – MapReduce-style processing
  – Distributive/algebraic aggregation

CAP Theorem
• A system facing network partitions must choose between availability and strong consistency

CAP != ¬Transactions
• CAP is about linearizability
  – A single-object guarantee with strict real-time ordering
• Serializability
  – Transactions behave as if executed serially against a single database
  – A multi-object guarantee with no real-time ordering
• Most DBMSs do not provide serializability by default
  – Some do not even support it!

Consistency in Distributed Systems
• Clock
  – The precision of the clock is key to performance
• The problem space
• State-of-the-art
  – Paxos
[Figure: the problem space of consensus; legible labels: "Non-failure/", "Network structure", "Synch./Asynch."]

Database Transactions
State A (valid √) ==> Transaction ==> State B (valid √)
• Serial execution is too expensive!
• But transactions may have conflicts.
• So we need concurrency control
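
The conflict mentioned on the slide above can be illustrated with a small deterministic simulation. This is a sketch under stated assumptions, not a description of any particular DBMS: the schedule format, the transaction names T1 and T2, and the version-check scheme (one simple form of optimistic concurrency control) are all chosen here for illustration; the slides do not name a specific technique.

```python
def run(schedule, deposits, initial):
    """Replay a schedule of (txn, op) steps with no concurrency control.

    'read' copies the shared balance into the transaction's workspace;
    'write' stores workspace value + that transaction's deposit.
    """
    balance = initial
    workspace = {}
    for txn, op in schedule:
        if op == "read":
            workspace[txn] = balance
        elif op == "write":
            balance = workspace[txn] + deposits[txn]
    return balance


def run_optimistic(schedule, deposits, initial):
    """Same replay, but each write validates the version it read."""
    balance, version = initial, 0
    snapshot = {}  # txn -> (value read, version read)
    for txn, op in schedule:
        if op == "read":
            snapshot[txn] = (balance, version)
        elif op == "write":
            value, seen = snapshot[txn]
            if seen != version:
                # Another transaction committed since our read:
                # abort and retry by re-reading the current balance.
                value = balance
            balance = value + deposits[txn]
            version += 1
    return balance


deposits = {"T1": 100, "T2": 50}
serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

serial_result = run(serial, deposits, 100)
lost_update_result = run(interleaved, deposits, 100)
protected_result = run_optimistic(interleaved, deposits, 100)
```

Under the interleaved schedule the unprotected replay overwrites T1's deposit (a lost update), while the validated replay reaches the same final state as the serial schedule.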

Database Transactions
• There may be failures
  – Software/hardware/network
• So we need database recovery

Database Transactions
• ACID Properties
  – Atomicity
  – Consistency: Not binary, but a spectrum of models
  – Isolation
  – Durability

Consistency model spectrum
• Isolation levels
  – Serializable
  – Repeatable read
  – Read committed
  – Read uncommitted
• More
  – Cursor stability
  – Snapshot isolation
  – Recency
  – Safe
  – Linearizability
  – Strong one-copy serializability
  – Read your writes
  – ...

Theoretical analysis on distributed transaction processing
• HAT
  – Highly available transactions
    • Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica: Highly Available Transactions: Virtues and Limitations. VLDB 2014
• PACELC
  – If there is a partition (P), how does the system trade off availability and consistency (A and C);
  – Else (E), when the system is running normally in the absence of partitions, how does the system trade off latency (L) and consistency (C)?
    • Daniel Abadi: Consistency Tradeoffs in Modern Distributed Database System Design: CAP is Only Part of the Story. IEEE Computer 45(2): 37-42 (2012)

Scalable Transaction Processing
• Vertical approach
  – To separate read and write:
    • OceanBase: Baseline + Updates
    • Exadata: cache tier
  – Queuing/transaction routing
    • ScaleBase
• Horizontal approach
  – VoltDB: data partitioning based on histories, and runtime optimization

OceanBase

Exadata

ScaleBase

VoltDB
[Figure labels: Partitioning Tables, Serialized Processing, Replicating Tables]
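
The labels on the VoltDB slide (partitioning tables, serialized processing) suggest a horizontal design in which each transaction is routed to one partition and each partition executes its transactions one at a time. The sketch below illustrates that general idea only. It is not VoltDB's API: the class PartitionedStore, its methods, and the deposit procedure are hypothetical names, CRC32 hashing is an assumed routing rule, and table replication is omitted.

```python
import zlib


class PartitionedStore:
    """Toy key-value store: hash-partitioned rows, serial execution per partition."""

    def __init__(self, num_partitions):
        self.partitions = [{} for _ in range(num_partitions)]
        self.queues = [[] for _ in range(num_partitions)]

    def partition_of(self, key):
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        return zlib.crc32(key.encode()) % len(self.partitions)

    def submit(self, key, procedure):
        """Queue a single-partition transaction, called as procedure(rows, key)."""
        self.queues[self.partition_of(key)].append((key, procedure))

    def run_all(self):
        # Each queue is drained serially, so transactions on one partition
        # never interleave and need no locks. Separate partitions share no
        # rows, so a real system could drain them in parallel.
        for rows, queue in zip(self.partitions, self.queues):
            for key, procedure in queue:
                procedure(rows, key)
            queue.clear()


def deposit(amount):
    """Build a transaction that adds `amount` to the row stored under `key`."""
    def procedure(rows, key):
        rows[key] = rows.get(key, 0) + amount
    return procedure


store = PartitionedStore(num_partitions=4)
for account, amount in [("alice", 100), ("bob", 20), ("alice", 50)]:
    store.submit(account, deposit(amount))
store.run_all()

balances = {k: v for rows in store.partitions for k, v in rows.items()}
```

The design trades generality for simplicity: transactions that touch a single partition avoid concurrency control entirely, whereas a transaction spanning several partitions would need coordination that this sketch does not attempt.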

Consistency models
• Unavailable implementations ⇏ unavailable semantics
  – Read Committed (RC)
  – ANSI Repeatable Read (RR)
  – Both can be achieved in distributed databases
• But
  – Cannot make recency guarantees on reads during partitions of unbounded duration
  – Cannot enforce arbitrary database integrity constraints (e.g., uniqueness)

Real-life applications
• Integrity constraints are often guaranteed by
  – Middleware
  – Applications
  – but not by the DBMS, for performance reasons
• Business logic allows a reasonably long duration for processing some distributed transactions
  – Inconsistent window size <