
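            An open-source project from the startup vertical search engine http://www.kosmix.com/: yet another open-source distributed file system, similar to Google's GFS. It can be used in web applications that need to process large volumes of data, such as image storage, search engines, grid computing, and data mining. It also integrates well with Hadoop, so it can take full advantage of Hadoop's existing functionality. It is written in C++.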

            Introduction

            Applications that process large volumes of data (such as search engines, grid computing applications, and data mining applications) require a backend infrastructure for storing data. This infrastructure must support workloads with the following characteristics:

            • Primarily write-once/read-many workloads
            • A few million large files, each ranging from a few tens of MB to a few tens of GB in size
            • Mostly sequential access

            We have developed the Kosmos Distributed File System (KFS), a high-performance distributed file system, to meet this infrastructure need.

            The system consists of 3 components:

            1. Meta-data server: a single meta-data server that provides a global namespace
            2. Block server: Files are split into blocks, or chunks, and stored on block servers. Block servers are also known as chunk servers. Chunkservers store the chunks as files in the underlying file system (such as XFS on Linux).
            3. Client library: provides the file system API that applications use to interface with KFS. To use KFS, applications must be modified and relinked with the KFS client library.
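The chunking described in item 2 can be illustrated with a minimal sketch of how a byte range of a file maps onto fixed-size chunks. This is not KFS code: the function name `chunk_spans` is hypothetical, and the 64 MB chunk size is an assumption for illustration, since the text above does not state the size KFS uses.

```python
from typing import List, Tuple

# Assumed chunk size for illustration only; the text does not state KFS's value.
CHUNK_SIZE = 64 * 1024 * 1024


def chunk_spans(offset: int, length: int,
                chunk_size: int = CHUNK_SIZE) -> List[Tuple[int, int, int]]:
    """Map a byte range of a file to (chunk_index, offset_in_chunk, nbytes) pieces.

    Hypothetical helper: each returned piece lies entirely inside one chunk,
    so each piece could be served by a single chunkserver.
    """
    if offset < 0 or length < 0 or chunk_size <= 0:
        raise ValueError("offset and length must be >= 0 and chunk_size > 0")
    spans = []
    pos = offset
    end = offset + length
    while pos < end:
        index = pos // chunk_size           # which chunk holds this byte
        within = pos % chunk_size           # position inside that chunk
        nbytes = min(chunk_size - within, end - pos)
        spans.append((index, within, nbytes))
        pos += nbytes
    return spans
```

A read that crosses a chunk boundary is therefore split into one request per chunk, which is why mostly sequential access over large files suits this layout.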

            KFS is implemented in C++. It is built using standard system components such as TCP sockets, aio (for disk I/O), STL, and the boost libraries. It has been tested on 64-bit x86 architectures running Linux FC5.

            While KFS can be accessed natively from C++ applications, support is also provided for Java applications. JNI glue code is included in the release to allow Java applications to access the KFS client library APIs.

            Features
            • Incremental scalability: New chunkserver nodes can be added as storage needs increase; the system automatically adapts to the new nodes.
            • Availability: Replication is used to keep data available when chunk servers fail. Typically, files are replicated 3-way.
            • Per file degree of replication: The degree of replication is configurable on a per-file basis, up to a maximum of 64.
            • Re-replication: Whenever the degree of replication for a file drops below the configured amount (for example, due to an extended chunkserver outage), the metaserver forces the affected blocks to be re-replicated on the remaining chunk servers. Re-replication is done in the background without overwhelming the system.
            • Re-balancing: Periodically, the meta-server may rebalance the chunks amongst chunkservers. This is done to help with balancing disk space utilization amongst nodes.
            • Data integrity: To detect disk corruption, data blocks are checksummed. Checksum verification is done on each read; whenever there is a checksum mismatch, re-replication is used to recover the corrupted chunk.
            • File writes: The system follows the standard model. When an application creates a file, the filename becomes part of the filesystem namespace. For performance, writes are cached at the KFS client library. Periodically, the cache is flushed and data is pushed out to the chunkservers. Also, applications can force data to be flushed to the chunkservers. In either case, once data is flushed to the server, it is available for reading.
            • Leases: The KFS client library uses caching to improve performance. Leases are used to support cache consistency.
            • Chunk versioning: Versioning is used to detect stale chunks.
            • Client side fail-over: The client library is resilient to chunkserver failures. During reads, if the client library determines that the chunkserver it is communicating with is unreachable, it fails over to another chunkserver and continues the read. This fail-over is transparent to the application.
            • Language support: The KFS client library can be accessed from C++, Java, and Python.
            • FUSE support on Linux: Mounting KFS via FUSE allows existing Linux utilities (such as ls) to interface with KFS.
            • Tools: A shell binary is included in the set of tools. It allows users to navigate the filesystem tree using utilities such as cp, ls, mkdir, rmdir, rm, and mv. Tools to monitor the chunk/meta-servers are also provided.
            • Deploy scripts: To simplify launching KFS servers, a set of scripts is provided to (1) install KFS binaries on a set of nodes and (2) start/stop KFS servers on a set of nodes.
            • Job placement support: The KFS client library exports an API to determine the location of a byte range of a file. Job placement systems built on top of KFS can leverage this API to schedule jobs appropriately.
            • Local read optimization: When applications are run on the same nodes as chunkservers, the KFS client library contains an optimization for reading data locally. That is, if the chunk is stored on the same node as the one on which the application is executing, data is read from the local node.
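The data integrity and client side fail-over features above can be combined into one small sketch of a read path: try each replica in turn, skip unreachable servers, and verify a checksum before returning data. This is an illustration under stated assumptions, not the KFS client library: `read_chunk`, `ReplicaUnavailable`, and the `fetch` callback are hypothetical names, and CRC-32 is used only because it is in the Python standard library; the text does not say which checksum KFS uses.

```python
import zlib
from typing import Callable, List, Tuple


class ReplicaUnavailable(Exception):
    """Raised by a fetch function when a chunkserver cannot be reached."""


# Hypothetical callback: given a server name, return (data, stored_checksum).
Fetch = Callable[[str], Tuple[bytes, int]]


def read_chunk(replicas: List[str], fetch: Fetch) -> Tuple[bytes, List[str]]:
    """Read one chunk, failing over between replicas.

    Returns the verified data plus the servers whose copy failed the
    checksum, so a caller could report them as candidates for re-replication.
    """
    corrupt = []
    for server in replicas:
        try:
            data, stored = fetch(server)
        except ReplicaUnavailable:
            continue  # unreachable server: fail over to the next replica
        if zlib.crc32(data) == stored:  # CRC-32 is an assumed stand-in
            return data, corrupt
        corrupt.append(server)  # checksum mismatch: treat copy as corrupted
    raise IOError("no reachable replica with a valid checksum")
```

The caller sees either verified data or a single error, which mirrors the statement above that fail-over is transparent to the application.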
            KFS with Hadoop

            KFS has been integrated with Hadoop using Hadoop’s filesystem interfaces. This allows existing Hadoop applications to use KFS seamlessly. The integration code has been submitted as a patch to Hadoop-JIRA-1963 (this will enable distribution of the integration code with Hadoop). In addition, the code as well as instructions will also be available for download from the KFS project page shortly. As part of the integration, there is job placement support for Hadoop. That is, the Hadoop Map/Reduce job placement system can schedule jobs on the nodes where the chunks are stored.
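To make the job placement idea concrete, here is a minimal sketch of locality-aware scheduling: each task is assigned to a node that already stores its data when that node has a free slot, and otherwise to any node with capacity. It is neither Hadoop's scheduler nor the KFS location API; `place_tasks`, the task names, and the slot model are all assumptions made for illustration.

```python
from typing import Dict, List


def place_tasks(task_locations: Dict[str, List[str]],
                free_slots: Dict[str, int]) -> Dict[str, str]:
    """Greedily assign each task to a node, preferring nodes holding its chunks.

    task_locations maps a task to the nodes storing its input (the kind of
    answer a byte-range location lookup would give); free_slots maps a node
    to the number of tasks it can still accept.
    """
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    placement = {}
    for task, holders in task_locations.items():
        # Nodes that both store the data and still have capacity.
        local = [n for n in holders if slots.get(n, 0) > 0]
        candidates = local or [n for n, s in slots.items() if s > 0]
        if not candidates:
            raise RuntimeError("no free slots left for task " + task)
        # Pick the least loaded candidate (most free slots).
        node = max(candidates, key=lambda n: slots[n])
        slots[node] -= 1
        placement[task] = node
    return placement
```

A greedy pass like this is simple but not optimal: an early task can take the last slot on a node that a later task needed for locality.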

            References:

            • Distributed file systems

            http://lucene.apache.org/hadoop/

            http://www.danga.com/mogilefs/

            http://www.lustre.org/

            http://oss.sgi.com/projects/xfs/

             

            http://www.megite.com/discover/filesystem

            http://swik.net/distributed+cluster

            • Clusters and high availability

            http://www.gluster.org/index.php

            http://www.linux-ha.org/

            http://openssi.org

            http://kerrighed.org/

            http://openmosix.sourceforge.net/

             

            http://www.linux.com/article.pl?sid=06/09/12/1459204

            http://labs.google.com/papers/mapreduce.html

            Posted on 2010-04-01 09:47 by 小王. Category: Distributed systems

            Comments:
            # re: kosmix, yet another open-source distributed file system similar to Google MapReduce 2010-04-01 12:55 | 那誰
            A conceptual error: MapReduce is not a distributed file system. You probably mean GFS.

            # re: kosmix, yet another open-source distributed file system similar to Google MapReduce 2010-04-01 21:51 | 小王
            Thanks to 那誰 for the correction. The title has now been changed.