Last active
September 28, 2019 03:13
An introduction to Hadoop
| The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch web search engine project. HDFS is part of the Apache Hadoop Core project. The project URL is http://hadoop.apache.org | |
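| ## The essence of big data | |
| - Big data storage: a distributed file system (distributed storage), namely HDFS, the Hadoop Distributed File System. | |
| - Big data computation: distributed computing. | |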
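| ## Hadoop features | |
| - Redundancy (replication factor) | |
| - The default HDFS replication factor is 3 | |
| - Efficiency: horizontal replication (replicas are copied from one DataNode to the next) | |
| - Data is transferred in units of blocks; the default block size is 64 MB in Hadoop 1.x and 128 MB in Hadoop 2.x | |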
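The block size and replication factor above translate directly into how many blocks a file occupies and how much raw disk space it consumes across the cluster. The following is a minimal sketch of that arithmetic, not part of any Hadoop API; the function names `block_count` and `raw_storage_bytes` are hypothetical, and the defaults (64 MB, 128 MB, replication 3) are the ones listed above.

```python
MB = 1024 * 1024

# Default block sizes quoted in the notes above.
BLOCK_SIZE_HADOOP_1 = 64 * MB
BLOCK_SIZE_HADOOP_2 = 128 * MB


def block_count(file_size_bytes, block_size_bytes):
    """Number of blocks needed to hold a file (hypothetical helper).

    Uses integer ceiling division; the last block may be only partly full.
    """
    if file_size_bytes < 0 or block_size_bytes <= 0:
        raise ValueError("sizes must be non-negative and block size positive")
    return -(-file_size_bytes // block_size_bytes)


def raw_storage_bytes(file_size_bytes, replication=3):
    """Total bytes stored across the cluster (hypothetical helper).

    A partly filled last block only occupies its actual length,
    so raw usage is simply file size times the replication factor.
    """
    return file_size_bytes * replication
```

For example, a 300 MB file needs 5 blocks with a 64 MB block size but only 3 blocks with a 128 MB block size, and with the default replication factor it consumes 900 MB of raw storage in either case.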
| ## Hadoop installation modes | |
| - Local (standalone) mode: 1 machine | |
| - Pseudo-distributed mode: 1 machine | |
| - Fully distributed mode: 3 machines | |
| ## Hadoop architecture diagram | |
| HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks and these blocks are stored in a set of DataNodes. **The NameNode executes file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.** | |
|  | |
|  |