PearLi's Blog
http://www.shnenglu.com/dawnbreak/category/8919.html
Language: zh-cn. Last updated: Thu, 03 Sep 2009 03:12:46 GMT

Hadoop Study Notes, Part 1: A Brief Introduction
http://www.shnenglu.com/dawnbreak/articles/95178.html
pear_li, Thu, 03 Sep 2009 02:58:00 GMT

    This post gives a rough introduction to Hadoop.
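    Most of the material here comes from the official Hadoop site. One of the documents there is a PDF introducing HDFS, and it covers Hadoop fairly thoroughly. This series of Hadoop study notes follows that material step by step. I also consulted many articles from around the web and summarized the problems I ran into while learning Hadoop.

    To get to the point, let's start with where Hadoop came from. You cannot talk about Hadoop without mentioning Lucene and Nutch. Lucene is not an application. It is a high-performance full-text indexing engine toolkit written in pure Java, and it can easily be embedded in real applications to provide full-text search and indexing. Nutch is an application: a search engine built on top of Lucene. Lucene gives Nutch its text search and indexing API, and Nutch adds data crawling on top of search. Before nutch0.8.0, Hadoop was still part of Nutch. Starting with nutch0.8.0, the NDFS and MapReduce implementations were split out into a new open-source project, and that project is Hadoop. Compared with earlier releases, the architecture of nutch0.8.0 changed fundamentally: it is built entirely on top of Hadoop. Hadoop implements Google's GFS and MapReduce designs, which makes it a distributed computing platform.

    In fact, Hadoop is not merely a distributed file system for storage. It is a framework designed to run distributed applications on large clusters of commodity computing hardware.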
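   Hadoop consists of two parts:

   1. HDFS

      HDFS stands for Hadoop Distributed File System.
      HDFS is highly fault tolerant and can be deployed on inexpensive hardware. It suits applications with large data sets and provides high throughput for reading and writing data. HDFS has a master/slave structure. In a typical deployment, the master runs a single Namenode and each slave runs one Datanode.
      HDFS supports a traditional hierarchical file organization and behaves much like existing file systems: you can create and delete a file, move a file from one directory to another, rename it, and so on. The Namenode manages the whole distributed file system, and every file system operation (such as creating or deleting files and directories) is controlled through the Namenode.
      The structure of HDFS is shown below:
      [Figure: HDFS architecture. The image was not preserved in this copy.]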


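      As the figure above shows, all communication among the Namenode, the Datanodes, and the Client runs over TCP/IP. When the Client performs a write, the request is not sent to the Namenode right away. The Client first caches the data in a temporary folder on the local machine. Once the data in the temporary folder reaches the configured block size (64 MB by default), the Client notifies the Namenode. The Namenode answers the Client's RPC request, inserts the file name into the file system hierarchy, and finds a block on a Datanode to hold the data. It then tells the Client which Datanode and which block to use, and the Client writes the locally cached data to that Datanode.

      HDFS uses replication to improve the reliability and availability of the system. The replica placement policy keeps three replicas: one on the local node, one on another node in the same rack, and one on a node in a different rack. This policy is not yet implemented in the current release, hadoop0.12.0, but work on it is under way and it should be available soon.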
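The three-replica placement rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post, not Hadoop's actual code: `place_replicas`, the `topology` dictionary, and the node and rack names are all hypothetical.

```python
import random


def place_replicas(writer, topology, rng=random):
    """Pick three nodes for one block, following the rule in the text.

    topology maps a rack name to the list of nodes in that rack
    (a hypothetical layout used only for this sketch).
    """
    local_rack = next(r for r, nodes in topology.items() if writer in nodes)
    same_rack = [n for n in topology[local_rack] if n != writer]
    other_racks = [r for r in topology if r != local_rack and topology[r]]
    if not same_rack or not other_racks:
        raise ValueError("need another local node and a second non-empty rack")
    second = rng.choice(same_rack)                         # same rack, other node
    third = rng.choice(topology[rng.choice(other_racks)])  # different rack
    return [writer, second, third]


topology = {
    "rack1": ["node1", "node2", "node3"],
    "rack2": ["node4", "node5"],
}
replicas = place_replicas("node1", topology)
print(replicas)
```

Keeping one copy in a different rack is what lets the data survive the loss of a whole rack, while the two local-rack copies keep most write traffic inside one rack.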
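   2. The MapReduce implementation

      MapReduce is an important Google technology. It is a programming model for computing over large volumes of data. The usual way to handle computation at that scale is parallel computing, yet for many developers, at least for now, parallel computing is still fairly remote. MapReduce is a programming model that simplifies parallel computing, so that developers with little parallel experience can still write parallel applications.

      The name comes from the model's two core operations, Map and Reduce. People familiar with functional programming will find both words reassuring. Put simply, Map transforms one set of data into another set, element by element, with the mapping rule given by a function. For example, mapping [1, 2, 3, 4] with "multiply by 2" yields [2, 4, 6, 8]. Reduce folds a set of data into a result, again with the rule given by a function. For example, reducing [1, 2, 3, 4] by summation gives 10, and reducing it by multiplication gives 24.

      For more on MapReduce, I recommend Meng Yan's article "MapReduce: The Free Lunch Is Not Over!"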
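The two small examples above translate directly into Python's built-in `map` and `functools.reduce`. This is a single-machine sketch of the functional idea only; it says nothing about how Hadoop distributes the work.

```python
from functools import reduce
from operator import add, mul

data = [1, 2, 3, 4]

# Map: apply one function to every element independently.
doubled = list(map(lambda x: x * 2, data))

# Reduce: fold the whole list into one value with a binary function.
total = reduce(add, data)
product = reduce(mul, data)

print(doubled, total, product)
```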

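   That is all for the first post in this series. I have only just started working with Hadoop myself. The next post covers deploying Hadoop and the problems I hit along the way, in the hope of giving others a reference and saving them a few detours.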
 


Posted by pear_li, 2009-09-03 10:58
]]>
http://www.shnenglu.com/dawnbreak/articles/95176.html
pear_li, Thu, 03 Sep 2009 02:43:00 GMT

Map Reduce - the Free Lunch is not over?

by Meng Yan on Nov.15, 2006, under Other

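In early 2005, Herb Sutter, Microsoft's well-known C++ expert, wrote a heavyweight article, "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software", predicting the next major shift in software development after OO: concurrency.

Software development under the reign of Moore's Law shows a very interesting pattern: "Andy giveth, and Bill taketh away." However fast the CPU clock gets, we always find a way to use it up, and we have grown used to programs getting faster simply because the machine was upgraded.

I remember writing a gomoku (five-in-a-row) program in my second year of university. The algorithm predefined a set of stone patterns with priorities, scanned the board, analyzed the position, and picked the move that mattered most for my side. You also have to block the opponent, which means swapping the two sides' patterns and evaluating again. Looking only one move ahead leaves you open to a cunning opponent, so thinking several moves ahead requires recursion and backtracking. On the machines of that time, searching 3 moves deep already took on the order of seconds. When I found the program again while packing up at graduation and tried it, a search of ten or more moves finished with no noticeable delay.

You may have had a similar experience. Without noticing, we have been enjoying this kind of free lunch all along. But with Moore's Law ending ahead of schedule, the free lunch has to be paid back. Hardware designers are still trying. Hyper-Threading CPUs (an extra set of registers, equivalent to one more logical CPU) keep the pipeline as full as possible and let operations from multiple threads run in parallel, which gives multithreaded programs a performance gain of up to about 15%. Larger caches benefit both single-threaded and multithreaded programs. These may help for a while, but the point is that we have to change. The shift is coming. Are you ready?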

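Concurrency Programming != Multi-Thread Programming. Many people will say that anyone can do multithreading. The question is why and how you use threads. I once wrote an ACDSee-like image viewer and editor that I use for my digital photos. I used a lot of threads in it, but mainly to avoid blocking the UI during image processing: the CPU-intensive computation runs on background threads. I never split the operations on the image matrix themselves to run in parallel.

I think the real challenge of concurrent programming is the change in programming model. The programmer needs a clear picture of how the program can be parallelized and, more importantly, of how to implement that parallelization (architecture, fault tolerance, real-time monitoring, and so on), how to debug it, and how to test it.

At Google, huge volumes of data have to be processed within a limited time every day (every Internet company runs into this problem), and every programmer has to write distributed programs, which involves distribution, scheduling, monitoring, fault tolerance, and more. Google's MapReduce abstracts the business logic of a distributed job away from these complicated details, so that programmers with little or no parallel development experience can still build parallel applications.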

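The two most important words in MapReduce are Map and Reduce. Anyone familiar with functional languages will recognize them at once. FP calls this kind of function a "higher-order function" (higher-order functions are considered one of functional programming's sharpest tools). In other words, these functions are written to be combined with other functions, or to be called by them. If you insist on a comparison, think of a callback function in C or a functor in the STL. For example, to search an STL container you specify a functor (a comparator) that compares two elements, and that comparator is called while the container is traversed.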
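The comparator analogy can be made concrete with a short Python sketch. Python is used here in place of the C++ STL mentioned above, and it sorts rather than searches, but the point is the same: `sorted` is a higher-order function that calls a user-supplied comparison while it walks the data. The comparator name and the word list are made up for illustration.

```python
from functools import cmp_to_key


def by_length_then_alpha(a, b):
    """Comparator: negative if a sorts first, positive if b does, 0 if equal."""
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)


words = ["pear", "fig", "banana", "kiwi"]

# sorted() never inspects the ordering rule itself; it only calls the
# comparator we pass in, the same way MapReduce calls user Map/Reduce code.
ordered = sorted(words, key=cmp_to_key(by_length_then_alpha))
print(ordered)
```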

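Take the image processing program mentioned earlier. Most image processing operations are some computation over the image matrix, and those computations usually fall into two kinds: mapping and reduction. Consider two effects. An "old photo" effect typically boosts the G/B values of the picture and then adds a small random offset to every pixel. These operations are independent for each element of the two-dimensional matrix, so they are Map operations. An "engraving" effect has to extract the edges of the image, which requires computation across elements, so it is a kind of Reduce operation. A simpler example: the one-dimensional matrix (array) [0,1,2,3,4] can be mapped to [0,2,4,6,8] (multiply by 2) or to [1,2,3,4,5] (add 1). It can be reduced to 0 (product of the elements) or to 10 (sum of the elements).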

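Faced with a complex problem, the ancients taught us to "divide and rule", and the matching English phrase is "Divide and Conquer". Map/Reduce is really a Divide/Conquer process: divide the problem so that the Map computations on the pieces can run highly in parallel, then Reduce the Map results (grouped by some key) to obtain the final result.

Googlers realized that this is the core of the problem and that everything else is common to all such jobs. So they factored MapReduce out as an abstraction. Google programmers can then care only about application logic: which keys to decompose the problem by, which operations are Map operations, and which are Reduce operations. The other hard parts of parallel computing, such as distribution, job scheduling, fault tolerance, and communication between machines, are handed to the Map/Reduce framework, which greatly simplifies the whole programming model.

Another characteristic of MapReduce is that the inputs and outputs of Map and Reduce are intermediate temporary files (MapReduce uses the Google File System to manage and access them), rather than some other form of communication between processes or machines. To me this is Google's consistent style: reduce the complicated to the simple and return to basics.

Let's set everything else aside and look at the Map/Reduce operations themselves. (The rest, such as fault tolerance and backup tasks, also has classic lessons and implementations, all described in detail in the paper.)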

Definition of Map:

Map, written by the user, takes an input pair and produces a set of intermediate key/value pairs. The MapReduce library groups together all intermediate values associated with the same intermediate key I and passes them to the Reduce function.

Definition of Reduce:

The Reduce function, also written by the user, accepts an intermediate key I and a set of values for that key. It merges together these values to form a possibly smaller set of values. Typically just zero or one output value is produced per Reduce invocation. The intermediate values are supplied to the user’s reduce function via an iterator. This allows us to handle lists of values that are too large to fit in memory.

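The MapReduce paper gives this example: count the number of occurrences of each word in a collection of documents.

The input to the Map operation is each document. Map writes every occurrence of every word in the input document to an intermediate file.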

map(String key, String value):
    // key: document name
    // value: document contents
    for each word w in value:
        EmitIntermediate(w, "1");

Suppose we have two documents with the following contents:

A: "I love programming"

B: "I am a blogger, you are also a blogger".

After the Map operation, the intermediate file produced for document B is:

I,1
am,1
a,1
blogger,1
you,1
are,1
also,1
a,1
blogger,1

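The input to the Reduce operation is a word together with the sequence of its occurrence counts. For the example above, that is ("I", [1, 1]), ("love", [1]), ("programming", [1]), ("am", [1]), ("a", [1, 1]), and so on. Reduce then computes the total number of occurrences for each word.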

reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    int result = 0;
    for each v in values:
        result += ParseInt(v);
    Emit(AsString(result));

The final output is then: ("I", "2"), ("a", "2"), ...
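The word-count example can be run end to end on one machine. The sketch below assumes a few things that are not in the paper's pseudocode: words are extracted with a simple regular expression, the grouping step that the MapReduce library performs is simulated with a dictionary, and the function names `map_fn`, `reduce_fn`, and `word_count` are made up for illustration.

```python
import re
from collections import defaultdict


def map_fn(doc_name, contents):
    """Emit (word, "1") for every word occurrence, as in the pseudocode."""
    for word in re.findall(r"[A-Za-z]+", contents):
        yield word, "1"


def reduce_fn(word, values):
    """Sum the string counts for one word and return the total as a string."""
    return word, str(sum(int(v) for v in values))


def word_count(documents):
    # Stand-in for the library's grouping of values by intermediate key.
    grouped = defaultdict(list)
    for name, contents in documents.items():
        for word, one in map_fn(name, contents):
            grouped[word].append(one)
    return dict(reduce_fn(word, values) for word, values in grouped.items())


documents = {
    "A": "I love programming",
    "B": "I am a blogger, you are also a blogger",
}
counts = word_count(documents)
print(counts)
```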

The actual order of execution is:

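  1. The MapReduce library splits the input into M pieces. The input splitter itself can run in parallel on several machines.
  2. The master assigns the M map tasks to idle workers.
  3. Each worker applies the Map operation to every <key, value> pair in its input and buffers the intermediate results in memory.
  4. Periodically (or depending on memory pressure), the buffered intermediate data is dumped to the local disk, and the file locations are reported back to the master, which forwards them to the reduce workers. The most important point here: when writing to disk, the intermediate data must be partitioned (into R partitions, say). In the example above, if everything went into a single file, the reduce worker would become the bottleneck. We only need to guarantee that identical keys land in the same partition to break the problem apart.
  5. The R reduce workers start working. Each fetches its data from the partitions of the different map workers (read the buffered data from the local disks of the map workers) and sorts it by key, using an external sort if the data does not fit in memory. Clearly, sorting (or grouping) has to happen before the Reduce function runs. The key point is that each reduce worker fetches the intermediate results of partition X (0<X<R) from many map workers, so that all the data belonging to a given key ends up on that one worker.
  6. The reduce worker iterates over the intermediate data and, for each unique key, calls the Reduce function with that key and its list of values.
  7. When everything has finished, the user program is woken up and the results are returned (there should be R outputs in the end, one per reduce worker).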
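The seven steps above can be imitated in a single process. The following is a minimal sketch, not Google's or Hadoop's implementation: workers are plain loops, "local disk" partitions are in-memory lists, and `zlib.crc32` stands in for the hash function, chosen only because it gives the same value on every run. All function names are hypothetical.

```python
import zlib
from collections import defaultdict


def partition(key, r):
    """Assign a key to one of r partitions with a run-stable hash."""
    return zlib.crc32(key.encode("utf-8")) % r


def run_job(splits, map_fn, reduce_fn, r):
    # Steps 2-4: one "map worker" per split writes r local partitions.
    map_outputs = []
    for split in splits:
        parts = [[] for _ in range(r)]
        for key, value in split:
            for k, v in map_fn(key, value):
                parts[partition(k, r)].append((k, v))
        map_outputs.append(parts)

    # Steps 5-7: reduce worker x reads partition x from every map worker,
    # sorts by key, groups the values, and calls reduce once per key.
    outputs = []
    for x in range(r):
        pairs = sorted(p for parts in map_outputs for p in parts[x])
        grouped = defaultdict(list)
        for k, v in pairs:
            grouped[k].append(v)
        outputs.append({k: reduce_fn(k, vs) for k, vs in grouped.items()})
    return outputs


def count_map(name, text):
    for word in text.replace(",", " ").split():
        yield word, 1


def count_reduce(word, values):
    return sum(values)


splits = [
    [("A", "I love programming")],
    [("B", "I am a blogger, you are also a blogger")],
]
outputs = run_job(splits, count_map, count_reduce, r=3)
print(outputs)
```

Here M is 2 (one split per document) and R is 3, so the job returns three output dictionaries, and every word appears in exactly one of them.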

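So the "divide" shows up in two places: splitting the input into M pieces, and splitting the intermediate Map results into R pieces. Splitting the input is usually simple. The intermediate Map results are usually partitioned by "hash(key) mod R", which guarantees that identical keys land in the same partition. Users can also supply their own partition function. For URL keys, for example, if you want all URLs from the same host to land in the same partition, you can use "hash(Hostname(urlkey)) mod R" as the partition function.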
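Both partition functions mentioned above are easy to sketch in Python. The assumptions here: `zlib.crc32` plays the role of `hash` (Python's built-in `hash` for strings changes between runs), `urllib.parse.urlsplit` plays the role of `Hostname`, and the function names and example URLs are invented for illustration.

```python
import zlib
from urllib.parse import urlsplit


def stable_hash(text):
    """A hash that is identical across runs and machines."""
    return zlib.crc32(text.encode("utf-8"))


def default_partition(key, r):
    """hash(key) mod R"""
    return stable_hash(key) % r


def host_partition(url_key, r):
    """hash(Hostname(urlkey)) mod R: all URLs of one host share a partition."""
    host = urlsplit(url_key).hostname or ""
    return stable_hash(host) % r


R = 8
p1 = host_partition("http://example.com/a", R)
p2 = host_partition("http://example.com/b/c?x=1", R)
print(p1, p2, default_partition("blogger", R))
```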

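In the example above, every document may produce thousands of intermediate results like ("the", 1), and such fragmentary intermediate files inevitably waste transfer bandwidth. MapReduce therefore lets the user supply a Combiner function. It usually has the same implementation as the Reduce function. The difference is that the output of Reduce is the final result, while the output of Combiner is written to an intermediate file that becomes one of the inputs to Reduce.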
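A combiner for the word-count job can be sketched as a local pre-aggregation that runs on the map side before anything is sent to the reducers. The function names and the sample sentence are hypothetical, and the counts are plain integers here rather than the strings used in the paper's pseudocode.

```python
from collections import defaultdict


def map_fn(contents):
    """Emit (word, 1) for each word occurrence."""
    return [(word, 1) for word in contents.split()]


def combine(pairs):
    """Sum counts per word locally, using the same logic as the reducer."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return sorted(local.items())


raw = map_fn("the cat and the dog and the bird")
combined = combine(raw)
print(len(raw), "pairs before combining,", len(combined), "after")
print(combined)
```

The reducer still works unchanged on the combined pairs, because summing partial sums gives the same total as summing the individual ones.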

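Tom White gives another very intuitive example from Nutch [2]: distributed grep. I have always felt that many operations in a pipe, such as more, grep, and cat, resemble a Map operation, while sort, uniq, wc, and the like correspond to some kind of Reduce operation.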
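Distributed grep fits the model because each file can be scanned independently. Below is a single-machine sketch of the Map side only, with made-up file names and contents. The Reduce side would simply pass the matches through unchanged, so it is omitted.

```python
import re


def grep_map(file_name, contents, pattern):
    """Emit ((file, line number), line) for every line matching pattern."""
    regex = re.compile(pattern)
    for line_no, line in enumerate(contents.splitlines(), start=1):
        if regex.search(line):
            yield (file_name, line_no), line


# Hypothetical input: each file could be handled by a different map worker.
files = {
    "a.log": "ok\nerror: disk full\nok",
    "b.log": "error: timeout\nok",
}
matches = [
    match
    for name, text in files.items()
    for match in grep_map(name, text, r"error")
]
print(matches)
```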

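Add the BigTable paper that Google published just a couple of days ago, and Google now has its own cluster (Google Cluster), distributed file system (GFS), distributed computing environment (MapReduce), and distributed structured storage (BigTable), plus a Lock Service. I can really sense that, beyond Google's famous free dinner, there is another kind of free dinner for its programmers: the large clusters built from huge numbers of commodity PCs. I think these are where Google's core value truly lies.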

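Heh, as Microsoft veteran Joel Spolsky (you have read his "Joel on Software", haven't you?) once said, the scariest thing for Microsoft [1] is that while Microsoft is still struggling to catch up with Google on basic search features, Google is already deploying the next generation of supercomputer.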

The very fact that Google invented MapReduce, and Microsoft didn’t, says something about why Microsoft is still playing catch up trying to get basic search features to work, while Google has moved on to the next problem: building Skynet^H^H^H^H^H^H the world’s largest massively parallel supercomputer. I don’t think Microsoft completely understands just how far behind they are on that wave.

Note 1: Microsoft does have its own answer, Dryad. The problem is that inside a big company, redeploying infrastructure at such a low level is enormously hard, whether for technical reasons or for political ones.

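Note 2: Another major work from Doug Cutting, the father of Lucene: Project Hadoop, which consists of the Hadoop distributed file system and an implementation of Map/Reduce. The Lucene/Nutch product line is now fairly complete.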



Posted by pear_li, 2009-09-03 10:43
]]>