Hadoop 3.0
Hadoop 3.0 is a major release of the Apache Hadoop framework, which is widely used for distributed storage and processing of large datasets in big data environments. Hadoop 3.0 introduced several significant enhancements, improvements, and new features compared to the previous Hadoop 2.x releases. Some of the notable changes in Hadoop 3.0 include:
Erasure Coding:
- Hadoop 3.0 introduced support for erasure coding in HDFS, an alternative to traditional 3x replication for data durability. Erasure coding helps reduce storage overhead while maintaining data reliability.
YARN Timeline Service v.2:
- The YARN Timeline Service was overhauled and introduced as version 2. It provides improved support for fine-grained resource tracking and application history.
GPU Support:
- Hadoop 3.0 added support for running GPU-accelerated tasks in YARN containers, enabling the use of GPUs for accelerating machine learning and data processing workloads.
Resource Types and GPU Scheduling:
- Resource types were introduced to YARN, allowing better categorization and management of resources. GPU scheduling capabilities were also improved for better utilization.
Shell Script Rewrite:
- The Hadoop shell script (
hadoop
command) was rewritten in Hadoop 3.0, offering better consistency and usability across different environments.
- The Hadoop shell script (
Native Azure Data Lake Support:
- Native support for Microsoft Azure Data Lake Storage (ADLS) was added, allowing Hadoop to interact seamlessly with Azure data services.
Federation and Router-based HDFS:
- Hadoop 3.0 introduced the concept of HDFS Federation, which allows multiple HDFS namespaces to be hosted on a single HDFS cluster. Additionally, HDFS Router-based federation was introduced for improved scalability and management of large HDFS deployments.
Improved Hadoop Common:
- Several improvements and updates were made to the Hadoop Common module, including updates to dependencies, enhancements to security, and better compatibility with various operating systems and platforms.
Enhanced Hadoop Docker Support:
- Hadoop 3.0 improved support for running Hadoop components in Docker containers, making it easier to set up and manage Hadoop clusters using containerization.
Security Enhancements:
- Hadoop 3.0 includes security enhancements, such as support for TLS/SSL encryption for communication, Kerberos improvements, and better integration with security technologies.
Performance Improvements:
- Various performance enhancements and optimizations were made across different components of Hadoop, including HDFS, MapReduce, and YARN.
API and Protocol Updates:
- Hadoop 3.0 introduced updates to various APIs and protocols to accommodate the new features and improvements.
Hadoop Training Demo Day 1 Video:
Conclusion:
Unogeeks is the No.1 IT Training Institute for Hadoop Training. Anyone Disagree? Please drop in a comment
You can check out our other latest blogs on Hadoop Training here – Hadoop Blogs
Please check out our Best In Class Hadoop Training Details here – Hadoop Training
Follow & Connect with us:
———————————-
For Training inquiries:
Call/Whatsapp: +91 73960 33555
Mail us at: info@unogeeks.com
Our Website ➜ https://unogeeks.com
Follow us:
Instagram: https://www.instagram.com/unogeeks
Facebook:https://www.facebook.com/UnogeeksSoftwareTrainingInstitute
Twitter: https://twitter.com/unogeeks