Design and Implementation of a Full-Text Search Engine: Foreign Literature Translation
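Jianghan University Graduation Thesis (Design)
Foreign Literature Translation
Original source: The Hadoop Distributed File System: Architecture and Design
Chinese translation: Hadoop Distributed File System: Architecture and Design
Name: XXXX
Student ID: 200708202137
April 8, 2013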
English Original
The Hadoop Distributed File System: Architecture and Design
Source: http://hadoop.apache.org/docs/r0.18.3/hdfs_design.html
Introduction
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch web search engine project. HDFS is part of the Apache Hadoop Core project. The project URL is http://hadoop.apache.org/core/.
Assumptions and Goals
Hardware Failure
Hardware failure is the norm rather than the exception. An HDFS instance may consist of hundreds or thousands of server machines, each storing part of the file system’s data. The fact that there are a huge number of components and that each component has a non-trivial probability of failure means that some component of HDFS is always non-functional. Therefore, detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.
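The claim that some component is always non-functional can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that components fail independently and that each one is down with the same probability at any given moment. Neither the independence assumption nor the example figures come from the original text, and the function name is hypothetical.

```python
def prob_any_failed(num_components: int, p_down: float) -> float:
    """Probability that at least one of num_components is down.

    Assumes independent failures with an identical per-component
    probability p_down (an illustrative simplification).
    """
    if num_components < 0:
        raise ValueError("num_components must be non-negative")
    if not 0.0 <= p_down <= 1.0:
        raise ValueError("p_down must be a probability in [0, 1]")
    # P(at least one down) = 1 - P(all up) = 1 - (1 - p)^n
    return 1.0 - (1.0 - p_down) ** num_components


if __name__ == "__main__":
    # Hypothetical figure: each machine is down 0.1% of the time.
    for n in (10, 1000, 5000):
        print(n, prob_any_failed(n, 0.001))
```

Under these assumptions the probability grows quickly with cluster size, which is why fault detection and automatic recovery are treated as a design goal rather than an afterthought.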
Streaming Data Access
Applications that run on HDFS need streaming access to their data sets. They are not general purpose applications that typically run on general purpose file systems. HDFS is designed more for batch processing rather than interactive use by users. The emphasis is on high throughput of data access rather than low latency of data access. POSIX imposes many hard requirements that are not needed for applications that are targeted for HDFS. POSIX semantics in a few key areas has been traded to increase data throughput rates.
Large Data Sets
Applications that run on HDFS have large data sets. A typical file in HDFS is gigabytes to terabytes in size. Thus, HDFS is tuned to support large files. It should provide high aggregate data bandwidth and scale to hundreds of nodes in a single cluster. It should support tens of millions of files in a single instance.
Simple Coherency Model
HDFS applications need a write-once-read-many access model for files. A file once created, written, and closed need not be changed. This assumption simplifies data coherency issues and enables high throughput data access. A Map/Reduce application or a web crawler application fits perfectly with this model. There is a plan to support appending-writes to files in the future.
“Moving Computation is Cheaper than Moving Data”
A computation requested by an application is much more efficient if it is executed near the data it operates on. This is especially true when the size of the data set is huge. This minimizes network congestion and increases the overall throughput of the system. The assumption is that it is often better to migrate the computation closer to where the data is located rather than moving the data to where the application is running. HDFS provides interfaces for applications to move themselves closer to where the data is located.
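The idea of moving computation to the data can be sketched as a scheduling preference. The function below is a hypothetical illustration, not an HDFS interface: it assumes the scheduler already knows which hosts store a block and which workers are idle, and it simply prefers an idle worker that holds the data.

```python
def pick_worker(block_hosts, idle_workers):
    """Choose a worker for a task that reads one block.

    block_hosts:  names of the nodes that store a replica of the block.
    idle_workers: names of the nodes currently able to run a task.
    Returns (worker, is_local), where is_local says whether the chosen
    worker already stores the block and can read it without the network.
    """
    if not idle_workers:
        raise ValueError("no idle workers available")
    hosts = set(block_hosts)
    for worker in idle_workers:
        if worker in hosts:
            return worker, True   # run the computation where the data is
    # No data-local worker is idle, so the data has to travel instead.
    return idle_workers[0], False
```

A fuller scheduler would probably also prefer workers in the same rack as a replica before falling back to an arbitrary node; that refinement is left out here to keep the sketch short.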
Portability Across Heterogeneous Hardware and Software Platforms
HDFS has been designed to be easily portable from one platform to another. This facilitates widespread adoption of HDFS as a platform of choice for a large set of applications.
NameNode and DataNodes
HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks and these blocks are stored in a set of DataNodes. The NameNode executes file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.
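The division of labour described above, with metadata on the NameNode and block contents on the DataNodes, can be illustrated with a toy model. This is a minimal sketch, not the real NameNode: the class name, method names, and block identifiers are all hypothetical, and persistence, replication, and error handling are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Holds metadata only: which blocks form a file and where they live."""
    file_blocks: dict = field(default_factory=dict)      # path -> [block_id, ...]
    block_locations: dict = field(default_factory=dict)  # block_id -> [datanode, ...]

    def add_file(self, path, blocks):
        """Record a file; blocks is a list of (block_id, [datanode names])."""
        self.file_blocks[path] = [block_id for block_id, _ in blocks]
        for block_id, nodes in blocks:
            self.block_locations[block_id] = list(nodes)

    def rename(self, old_path, new_path):
        """A namespace operation: only metadata changes, no block is moved."""
        self.file_blocks[new_path] = self.file_blocks.pop(old_path)

    def locate(self, path):
        """Return [(block_id, [datanodes])] so that a client can then
        read the block contents from the DataNodes directly."""
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]
```

In this model a client asks `locate` for block locations and then contacts the listed DataNodes itself, so file contents never pass through the metadata server.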
The NameNode and DataNode are pieces of software designed to run on commodity machines. These machines typically run a GNU/Linux operating system (OS). HDFS is built using the Java language; any machine that supports Java can run the NameNode or the DataNode software. Usage of the highly portable Java language means that HDFS can be deployed on a wide range of machines. A typical deployment has a dedicated machine that runs only the NameNode software. Each of the other machines in the cluster runs one instance of the DataNode software. The architecture does not preclude running multiple DataNodes on the same machine but in a real deployment that is rarely the case.
The existence of a single NameNode in a cluster greatly simplifies the architecture of the system. The NameNode is the arbitrator and repository for all HDFS metadata. The system is designed in such a way that user data never flows through the NameNode.
The File System Namespace
HDFS supports a traditional hierarchical file organization. A user or an application can create directories and store files inside these directories. The file system namespace hierarchy is similar to most other existing file systems; one can create and remove files, move a file from one directory to another, or rename a file. HDFS does not yet implement user quotas or access permissions. HDFS does not support hard links or soft links. However, the HDFS architecture does not preclude implementing these features.
The NameNode maintains the file system namespace. Any change to the file system namespace or its properties is recorded by the NameNode. An application can specify the number of replicas of a file that should be maintained by HDFS. The number of copies of a file is called the replication factor of that file. This information is stored by the NameNode.
Data Replication
HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks; all blocks in a file except the last block are the same size. The blocks of a file are replicated for fault tolerance. The block size and replication factor are configurable per file. An application can specify the number of replicas of a file. The replication factor can be specified at file creation time and can be changed later. Files in HDFS are write-once and have strictly one writer at any time.
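The rule that every block except the last has the same size amounts to integer division with a remainder. The helper below is a small sketch of that arithmetic; the function name is hypothetical, and the sizes in the usage note are chosen only for illustration.

```python
def block_sizes(file_size: int, block_size: int) -> list:
    """Split a file of file_size bytes into a sequence of block lengths.

    Every block is block_size bytes long except possibly the last,
    which holds whatever remains.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the shorter final block
    return sizes
```

For example, a 10-byte file with a 4-byte block size yields blocks of 4, 4, and 2 bytes. With a replication factor of three, each of those blocks would then be stored three times.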
The NameNode makes all decisions regarding replication of blocks. It periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport contains a list of all blocks on a DataNode.
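The bookkeeping implied by Heartbeats and Blockreports can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, and the rule that a DataNode counts as dead after a fixed silence (`timeout_seconds`) is an assumption made for the sketch, since the passage above does not say how missing Heartbeats are handled. Timestamps are passed in explicitly so that the behaviour is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last Heartbeat and latest Blockreport of each DataNode."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds  # assumed liveness window
        self.last_seen = {}      # datanode -> time of last Heartbeat
        self.block_reports = {}  # datanode -> set of block ids it holds

    def heartbeat(self, datanode, now):
        self.last_seen[datanode] = now

    def block_report(self, datanode, block_ids):
        # A Blockreport lists all blocks on the DataNode, so it replaces
        # whatever was recorded before.
        self.block_reports[datanode] = set(block_ids)

    def live_datanodes(self, now):
        """DataNodes whose last Heartbeat is within the timeout window."""
        return sorted(dn for dn, seen in self.last_seen.items()
                      if now - seen <= self.timeout_seconds)

    def replica_count(self, block_id, now):
        """Number of live DataNodes that reported holding block_id."""
        return sum(1 for dn in self.live_datanodes(now)
                   if block_id in self.block_reports.get(dn, set()))
```

Comparing `replica_count` with a file's replication factor is one way such a monitor could decide that a block needs another copy.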
Replica Placement: The First Baby Steps
The placement of replicas is critical to HDFS reliability and performance. Optimizing replica placement distinguishes HDFS from most other distributed file systems. This is a feature that needs lots of tuning and experience. The purpose of a rack-aware replica placement policy is to improve data reliability, availability, and network bandwidth utilization. The current implementation for the replica placement policy is a first effort in this direction. The short-term goals of implementing this policy are to validate it on production systems, learn more about its behavior, and build a foundation to test and research more sophisticated policies.
Large HDFS instances run on a cluster of computers that commonly spread across many racks. Communication between two nodes in different racks has to go through switches. In most cases, network bandwidth between machines in the same rack is greater than network bandwidth between machines in different racks.
The NameNode determines the rack id each DataNode belongs to via the process outlined in Rack Awareness. A simple but non-optimal policy is to place replicas on unique racks. This prevents losing data when an entire rack fails and allows use of bandwidth from multiple racks when reading data. This policy evenly distributes replicas in the cluster which makes it easy to balance load on component failure. However, this policy increases the cost of writes because a write needs to transfer blocks to multiple racks.
For the common case, when the replication factor is three, HDFS’s placement policy is to put one replica on one node in the local rack, another on a different node in the local rack, and the last on a different node in a different rack. This policy cuts the inter-rack write traffic which generally improves write performance. The chance of rack failure is far less than that of node failure; this policy does not impact data reliability and availability guarantees. However, it does reduce the aggregate network bandwidth used when reading data since a block is placed in only two unique racks rather than three. With this policy, the replicas of a file do not evenly distribute across the racks. One third of replicas are on one node, two thirds of replicas are on one rack, and the other third are evenly distributed across the remaining racks. This policy improves write performance without compromising data reliability or read performance.
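The three-replica placement described above can be sketched as a small selection routine. This is a simplified illustration rather than the actual HDFS implementation: the function name and the `topology` mapping are hypothetical, node names are assumed to be unique across racks, and nodes are chosen at random without regard to load or free space.

```python
import random


def place_three_replicas(topology, local_rack, rng=None):
    """Choose three DataNodes for one block.

    topology:   dict mapping rack id -> list of DataNode names.
    local_rack: rack id of the writer.
    Returns [first, second, third]: two different nodes in the local
    rack, then one node in a different rack.
    """
    rng = rng or random.Random()
    local_nodes = topology[local_rack]
    if len(local_nodes) < 2:
        raise ValueError("need at least two nodes in the local rack")
    remote_racks = [rack for rack, nodes in topology.items()
                    if rack != local_rack and nodes]
    if not remote_racks:
        raise ValueError("need at least one other non-empty rack")
    # Two distinct nodes in the local rack keep most write traffic in-rack.
    first, second = rng.sample(local_nodes, 2)
    # One replica in another rack survives the loss of the local rack.
    third = rng.choice(topology[rng.choice(remote_racks)])
    return [first, second, third]
```

Only the third replica crosses a rack boundary, which is the source of the reduced inter-rack write traffic noted above, while the block still occupies two racks.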
The current, default replica placement policy described here is a work in progress.
Replica Selection
To minimize global bandwidt