Hadoop 3.0

Hadoop 3.0 is a major release of the Apache Hadoop framework, which is widely used for distributed storage and processing of large datasets. The first generally available 3.0 release shipped in December 2017, and it brought significant new features and improvements over the Hadoop 2.x line. A few capabilities commonly associated with it arrived slightly later in the 3.x line and are flagged as such below. Notable changes include:

  1. Erasure Coding:

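    • Hadoop 3.0 introduced support for erasure coding in HDFS, an alternative to traditional 3x replication for data durability. With a Reed-Solomon layout of 6 data blocks and 3 parity blocks, the storage overhead is 50% instead of the 200% of 3x replication, while still tolerating the loss of up to three blocks. The trade-off is extra CPU and network cost for encoding and for reconstructing lost data.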
  2. YARN Timeline Service v.2:

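    • The YARN Timeline Service was redesigned as version 2 and shipped in Hadoop 3.0 as an alpha feature. It separates the write path (distributed collectors) from the read path to improve scalability and reliability, and it groups related applications into flows with aggregated metrics.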
  3. GPU Support:

    • GPU support is often attributed to Hadoop 3.0, but it arrived in Hadoop 3.1. From that release on, YARN can schedule and isolate GPUs for containers, enabling GPU acceleration for machine learning and data processing workloads.
  4. Resource Types and GPU Scheduling:

    • Hadoop 3.0 generalized YARN's resource model beyond memory and CPU vcores, so administrators can define additional countable resource types. GPU scheduling in later 3.x releases builds on this framework.
  5. Shell Script Rewrite:

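    • The Hadoop shell scripts (the hadoop, hdfs, yarn, and mapred commands and the daemon scripts) were rewritten in Hadoop 3.0 to fix long-standing bugs and behave more consistently. Some of the changes are not backward compatible, so existing installations may need their scripts and environment files reviewed.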
  6. Native Azure Data Lake Support:

    • A filesystem connector for Microsoft Azure Data Lake Storage (ADLS) was added, so Hadoop jobs can read and write ADLS data through the standard Hadoop FileSystem API using adl:// URIs.
  7. Federation and Router-based HDFS:

    • HDFS Federation itself predates Hadoop 3.0: it was introduced in the Hadoop 2.x line and lets multiple independent NameNodes, each with its own namespace, share the DataNodes of a single cluster. What Hadoop 3.0 added is Router-based Federation, a routing layer that presents those namespaces to clients as one unified view, which simplifies the management of large HDFS deployments.
  8. Improved Hadoop Common:

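    • Several changes landed in the Hadoop Common module: the minimum supported Java version moved to Java 8, many bundled dependencies were upgraded, and shaded client artifacts (hadoop-client-api and hadoop-client-runtime) were introduced so that applications no longer inherit Hadoop's transitive dependencies on their classpath.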
  9. Enhanced Hadoop Docker Support:

    • YARN in Hadoop 3.0 can launch application containers as Docker containers, which lets applications package their own software environment. This support was still maturing in 3.0 and was extended in later 3.x releases.
  10. Security Enhancements:

    • Hadoop 3.0 continued incremental security work on capabilities that already existed in 2.x, such as TLS/SSL encryption for communication and Kerberos authentication, along with better integration with other security technologies.
  11. Performance Improvements:

    • Performance optimizations were made across HDFS, MapReduce, and YARN. One example is a native implementation of the MapReduce map output collector, which the release notes credit with speedups of 30% or more for shuffle-intensive jobs.
  12. API and Protocol Updates:

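    • Hadoop 3.0 updated various APIs and protocols to accommodate the new features. A visible example for operators is that the default ports of several services moved out of the Linux ephemeral port range (for instance, the NameNode web UI moved from 50070 to 9870).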
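
To make the storage saving from erasure coding (item 1) concrete, here is a minimal Python sketch of the arithmetic. It is not a Hadoop API: the Scheme class and its fields are hypothetical names chosen for illustration. The model assumes full stripes and ignores small-file effects, so treat the numbers as an idealized comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Idealized redundancy scheme (illustrative model, not a Hadoop class)."""
    name: str
    data_units: int       # units that hold actual data
    redundant_units: int  # extra replicas or parity units

    @property
    def storage_factor(self) -> float:
        # Raw storage consumed per unit of logical data.
        return (self.data_units + self.redundant_units) / self.data_units

    @property
    def tolerated_failures(self) -> int:
        # Each redundant unit covers the loss of one unit in this model.
        return self.redundant_units

    def raw_units(self, logical_units: float) -> float:
        # Raw capacity needed to store the given amount of logical data.
        return logical_units * self.storage_factor


# 3x replication: one data copy plus two extra replicas.
REPLICATION_3X = Scheme("3x replication", data_units=1, redundant_units=2)
# Reed-Solomon with 6 data and 3 parity units per stripe.
RS_6_3 = Scheme("RS-6-3", data_units=6, redundant_units=3)

if __name__ == "__main__":
    for scheme in (REPLICATION_3X, RS_6_3):
        print(
            f"{scheme.name}: {scheme.storage_factor:.1f}x raw storage, "
            f"survives {scheme.tolerated_failures} lost units, "
            f"600 TB logical -> {scheme.raw_units(600):.0f} TB raw"
        )
```

Under these assumptions, 600 TB of logical data needs 1800 TB of raw capacity with 3x replication but 900 TB with the 6+3 layout, while the erasure-coded layout tolerates one more lost unit.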

