Hadoop Daemons
In Apache Hadoop, daemons are long-running background processes, each responsible for one part of the cluster's operation. Together they run the Hadoop Distributed File System (HDFS), manage cluster resources through YARN, and execute data processing jobs such as MapReduce. The key Hadoop daemons are:
NameNode:
- The NameNode is the master daemon of HDFS and one of the most critical processes in the cluster.
- It manages the HDFS namespace and metadata, including the directory structure, permissions, and file-to-block mappings.
- The NameNode does not store file data itself; it keeps track of which DataNodes hold each block.
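The two kinds of metadata described above can be pictured with a toy model. This is only an illustrative sketch, not Hadoop's actual implementation: the class `ToyNameNode`, its methods, and the block and node names are all hypothetical.

```python
class ToyNameNode:
    """Toy in-memory model of NameNode metadata (not Hadoop's real classes)."""

    def __init__(self) -> None:
        # Namespace side: file path -> ordered list of block IDs.
        self.file_blocks: dict[str, list[str]] = {}
        # Location side: block ID -> names of DataNodes holding a replica.
        self.block_locations: dict[str, set[str]] = {}

    def add_file(self, path: str, block_ids: list[str]) -> None:
        self.file_blocks[path] = list(block_ids)
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set())

    def report_replica(self, block_id: str, datanode: str) -> None:
        # In this sketch, replica locations arrive from DataNodes one at a time.
        self.block_locations.setdefault(block_id, set()).add(datanode)

    def locate(self, path: str) -> list[tuple[str, list[str]]]:
        """Return (block ID, sorted DataNode names) for each block of a file."""
        return [(b, sorted(self.block_locations[b])) for b in self.file_blocks[path]]


nn = ToyNameNode()
nn.add_file("/logs/app.log", ["blk_1", "blk_2"])
nn.report_replica("blk_1", "dn1")
nn.report_replica("blk_1", "dn2")
nn.report_replica("blk_2", "dn3")
print(nn.locate("/logs/app.log"))
```

Note that the model holds only paths, block IDs, and node names; no file contents pass through it.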
DataNode:
- DataNodes store the actual data blocks in HDFS.
- They periodically send heartbeats and block reports to the NameNode to inform it about the status of the data blocks they store.
- On instruction from the NameNode, DataNodes also copy blocks to other DataNodes (for replication) and move blocks when the cluster is rebalanced.
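A rough sketch of how heartbeats and block reports could be used on the NameNode side is shown below. The `HeartbeatMonitor` class and all names in it are hypothetical, and the 630-second timeout is an assumption chosen to resemble the dead-node interval commonly derived from HDFS default settings; a real cluster's value depends on its configuration.

```python
class HeartbeatMonitor:
    """Toy NameNode-side view of DataNode liveness (illustrative only)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: dict[str, float] = {}   # DataNode -> time of last heartbeat
        self.blocks: dict[str, set[str]] = {}   # DataNode -> block IDs it reported

    def heartbeat(self, datanode: str, now: float) -> None:
        self.last_seen[datanode] = now

    def block_report(self, datanode: str, block_ids: set[str]) -> None:
        # A block report replaces the previous view of that node's blocks.
        self.blocks[datanode] = set(block_ids)

    def live_nodes(self, now: float) -> list[str]:
        return sorted(dn for dn, t in self.last_seen.items()
                      if now - t <= self.timeout_s)

    def under_replicated(self, now: float, target: int) -> list[str]:
        """Blocks with fewer than `target` replicas on live nodes."""
        live = set(self.live_nodes(now))
        counts: dict[str, int] = {}
        for dn, block_ids in self.blocks.items():
            for b in block_ids:
                counts[b] = counts.get(b, 0) + (1 if dn in live else 0)
        return sorted(b for b, n in counts.items() if n < target)


mon = HeartbeatMonitor(timeout_s=630.0)  # assumed timeout, see lead-in
for dn in ("dn1", "dn2", "dn3"):
    mon.heartbeat(dn, now=0.0)
mon.block_report("dn1", {"blk_1", "blk_2"})
mon.block_report("dn2", {"blk_1"})
mon.block_report("dn3", {"blk_2"})
mon.heartbeat("dn1", now=700.0)
mon.heartbeat("dn2", now=700.0)          # dn3 stays silent
print(mon.live_nodes(now=700.0))
print(mon.under_replicated(now=700.0, target=2))
```

In this toy run `dn3` misses the timeout, so any block it held alone or with one other node drops below the target replica count and would need to be copied elsewhere.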
ResourceManager:
- The ResourceManager is the central resource management and scheduling daemon of YARN (Yet Another Resource Negotiator).
- It allocates and manages cluster resources (CPU, memory, etc.) among various applications running in the cluster.
- The ResourceManager communicates with NodeManagers to manage resource allocation and job execution.
NodeManager:
- A NodeManager runs on each worker node and manages that node's resources (CPU, memory).
- It monitors resource usage and reports it to the ResourceManager.
- It launches and manages the containers in which application tasks run.
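The division of labor between the ResourceManager and the NodeManagers can be sketched as follows. This is a deliberately simplified model: the classes `ToyResourceManager` and `ToyNodeManager` are hypothetical, and the first-fit placement used here is an assumption for illustration, not how YARN's real schedulers decide.

```python
from typing import Optional


class ToyNodeManager:
    """Tracks free resources and running containers on one node (toy model)."""

    def __init__(self, name: str, memory_mb: int, vcores: int) -> None:
        self.name = name
        self.free_memory_mb = memory_mb
        self.free_vcores = vcores
        self.containers: list[str] = []

    def can_fit(self, memory_mb: int, vcores: int) -> bool:
        return memory_mb <= self.free_memory_mb and vcores <= self.free_vcores

    def launch(self, container_id: str, memory_mb: int, vcores: int) -> None:
        self.free_memory_mb -= memory_mb
        self.free_vcores -= vcores
        self.containers.append(container_id)


class ToyResourceManager:
    """Places container requests on the first node with enough free resources."""

    def __init__(self, nodes: list[ToyNodeManager]) -> None:
        self.nodes = list(nodes)
        self._next_id = 1

    def allocate(self, app: str, memory_mb: int, vcores: int) -> Optional[str]:
        for node in self.nodes:
            if node.can_fit(memory_mb, vcores):
                container_id = f"{app}_container_{self._next_id}"
                self._next_id += 1
                node.launch(container_id, memory_mb, vcores)
                return node.name
        return None  # no node can currently satisfy the request


nm1 = ToyNodeManager("nm1", memory_mb=4096, vcores=2)
nm2 = ToyNodeManager("nm2", memory_mb=8192, vcores=4)
rm = ToyResourceManager([nm1, nm2])
placements = [
    rm.allocate("app1", memory_mb=3072, vcores=1),
    rm.allocate("app1", memory_mb=2048, vcores=1),
    rm.allocate("app2", memory_mb=8192, vcores=1),
]
print(placements)
```

The third request is refused because no single node has 8192 MB free any more, which mirrors the point that resources are granted per node, not from one pooled total.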
SecondaryNameNode:
- The SecondaryNameNode is not a backup or failover for the NameNode; it assists the NameNode by checkpointing its metadata.
- It periodically downloads the namespace image (fsimage) and the edit log from the NameNode, merges them, and uploads the resulting checkpoint back to the NameNode.
- Because the checkpoint keeps the edit log short, the NameNode has fewer edits to replay at startup, which reduces the time needed to recover the filesystem metadata after a NameNode failure.
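The merge step can be illustrated with a small sketch that replays a list of edits over a namespace snapshot. The data layout and the operation names (`create`, `delete`, `rename`) are hypothetical simplifications chosen for this example; the real fsimage and edit log formats are far richer.

```python
def apply_edits(fsimage: dict[str, list[str]],
                edits: list[tuple]) -> dict[str, list[str]]:
    """Replay edit-log entries over a namespace snapshot; return the new snapshot."""
    # Copy first so the original snapshot is left untouched.
    merged = {path: list(blocks) for path, blocks in fsimage.items()}
    for op, *args in edits:
        if op == "create":
            path, blocks = args
            merged[path] = list(blocks)
        elif op == "delete":
            (path,) = args
            merged.pop(path, None)
        elif op == "rename":
            src, dst = args
            merged[dst] = merged.pop(src)
        else:
            raise ValueError(f"unknown edit op: {op}")
    return merged


fsimage = {"/a.txt": ["blk_1"], "/b.txt": ["blk_2", "blk_3"]}
edits = [
    ("create", "/c.txt", ["blk_4"]),
    ("delete", "/a.txt"),
    ("rename", "/b.txt", "/data/b.txt"),
]
checkpoint = apply_edits(fsimage, edits)
print(checkpoint)
```

Once a checkpoint like this exists, the edits that produced it no longer need to be replayed, which is why regular checkpointing shortens recovery.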
JobTracker (Hadoop 1.x only; replaced in Hadoop 2.x):
- In Hadoop 1.x, the JobTracker was responsible for job scheduling and management in the MapReduce framework.
- It tracked the progress of MapReduce jobs, allocated resources, and monitored task execution.
- In Hadoop 2.x and later, the JobTracker no longer exists. Cluster-wide resource allocation moved to the ResourceManager, and per-job scheduling and monitoring moved to a per-application ApplicationMaster.
TaskTracker (Hadoop 1.x only; replaced in Hadoop 2.x):
- In Hadoop 1.x, the TaskTracker was responsible for executing the Map and Reduce tasks assigned by the JobTracker.
- It reported task progress and status back to the JobTracker.
- In Hadoop 2.x and later, NodeManagers take the place of TaskTrackers as part of the YARN resource management model.
HistoryServer:
- The HistoryServer (or JobHistoryServer) stores historical information about completed MapReduce jobs, such as logs, counters, and other metadata.
- Users can retrieve and view this information to analyze job performance and troubleshoot issues.
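One way to retrieve this information programmatically is the JobHistoryServer's REST interface. The sketch below assumes the default web port 19888 and the `jobs` / `job` response layout used by the MapReduce history REST API; the host name and the sample payload are made up for illustration, and the helper names `history_jobs_url` and `summarize_jobs` are hypothetical. No network call is made here.

```python
import json


def history_jobs_url(host: str, port: int = 19888) -> str:
    """Build the URL that lists finished MapReduce jobs (port is an assumed default)."""
    return f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"


def summarize_jobs(payload: str) -> list[tuple[str, str, float]]:
    """Return (job ID, state, runtime in seconds) for each job in a jobs response."""
    # "jobs" may be missing or null when no jobs match, hence the `or {}`.
    jobs = (json.loads(payload).get("jobs") or {}).get("job", [])
    return [
        (j["id"], j["state"], (j["finishTime"] - j["startTime"]) / 1000.0)
        for j in jobs
    ]


# Hand-written sample response; times are in milliseconds.
sample = json.dumps({
    "jobs": {
        "job": [
            {"id": "job_1_0001", "state": "SUCCEEDED",
             "startTime": 1000, "finishTime": 61000},
            {"id": "job_1_0002", "state": "FAILED",
             "startTime": 2000, "finishTime": 5500},
        ]
    }
})

print(history_jobs_url("historyserver.example.com"))  # hypothetical host
for job_id, state, seconds in summarize_jobs(sample):
    print(job_id, state, seconds)
```

Against a real cluster, the payload would come from an HTTP GET on the printed URL, and the fields available may vary by Hadoop version.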