Hadoop 2.4 NameNode Source Code Analysis

Published: 2025-02-03  Author: 千家信息网 editors

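This article walks through part of the Hadoop 2.4 NameNode source code. Many people run into questions about it in day-to-day work, so the editor went through various references and put together a simple, practical walkthrough that will hopefully clear those questions up. Let's get started.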

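In Hadoop NameNode (NN) HA, the election of the active and standby nodes is implemented by ActiveStandbyElector. The source code carries a comment explaining the class.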


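My English is limited, so here is a rough translation. The class uses ZooKeeper to elect the active node: each candidate tries to create an ephemeral node on ZooKeeper. The NN whose create succeeds becomes active, and the remaining NNs become standby.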


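Let's look more closely at what ActiveStandbyElector does. Its main job is the election, which works by creating an ephemeral node. A node that creates it successfully can be regarded as holding the LOCK and may become active. A node that fails to create it is treated as a standby. A standby has to keep watching the LOCK node, and whenever an event fires on that node it tries the election again. That is the basic flow.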
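The flow above can be sketched without a ZooKeeper server. What follows is a minimal, single-threaded model and not Hadoop code: `LockNode` and `Candidate` are hypothetical names, and an in-memory field stands in for the ephemeral lock znode.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the ephemeral lock znode (single-threaded demo only). */
class LockNode {
    private String owner;                                 // null means the node does not exist
    private final List<Candidate> watchers = new ArrayList<>();

    /** Mimics creating the ephemeral node: succeeds only if nobody holds it. */
    boolean tryCreate(String who) {
        if (owner != null) return false;
        owner = who;
        return true;
    }

    void watch(Candidate c) { if (!watchers.contains(c)) watchers.add(c); }

    void unwatch(Candidate c) { watchers.remove(c); }

    String owner() { return owner; }

    /** Mimics the node vanishing when its owner's session ends; watchers fire once. */
    void delete() {
        owner = null;
        List<Candidate> toNotify = new ArrayList<>(watchers);
        watchers.clear();
        for (Candidate c : toNotify) c.joinElection();
    }
}

/** Hypothetical candidate; in the article's terms, one NN taking part in the election. */
class Candidate {
    enum State { INIT, ACTIVE, STANDBY }

    final String name;
    private final LockNode lock;
    State state = State.INIT;

    Candidate(String name, LockNode lock) { this.name = name; this.lock = lock; }

    /** Try to take the lock; on failure become standby and watch the lock node. */
    void joinElection() {
        if (lock.tryCreate(name)) {
            state = State.ACTIVE;
        } else {
            state = State.STANDBY;
            lock.watch(this);
        }
    }

    /** Simulates this candidate's session ending, which removes its ephemeral node. */
    void loseSession() {
        boolean wasOwner = name.equals(lock.owner());
        state = State.INIT;
        lock.unwatch(this);
        if (wasOwner) lock.delete();
    }
}
```

With two candidates, the first to call `joinElection()` becomes ACTIVE and the second STANDBY. Calling `loseSession()` on the active one deletes the lock, which triggers the standby to run the election again and take over.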

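Next, let's look at the main methods and flow of ActiveStandbyElector. As anyone familiar with ZooKeeper knows, a ZooKeeper client has to implement the Watcher interface, which is where it puts its own handling logic for the various events.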

ActiveStandbyElector uses an inner class to implement the Watcher interface. Its process method calls processWatchEvent to do the actual work.



Now let's go through the logic of processWatchEvent:

    // Handle ZooKeeper events
    synchronized void processWatchEvent(ZooKeeper zk, WatchedEvent event) {
      Event.EventType eventType = event.getType();
      if (isStaleClient(zk)) return;
      LOG.debug("Watcher event type: " + eventType + " with state:"
          + event.getState() + " for path:" + event.getPath()
          + " connectionState: " + zkConnectionState
          + " for " + this);

      if (eventType == Event.EventType.None) {
        // events about the session itself, such as connect and disconnect
        // the connection state has changed
        switch (event.getState()) {
        case SyncConnected:
          LOG.info("Session connected.");
          // if the listener was asked to move to safe state then it needs to
          // be undone
          ConnectionState prevConnectionState = zkConnectionState;
          zkConnectionState = ConnectionState.CONNECTED;
          if (prevConnectionState == ConnectionState.DISCONNECTED &&
              wantToBeInElection) {
            monitorActiveStatus(); // watch the lock node
          }
          break;
        case Disconnected:
          LOG.info("Session disconnected. Entering neutral mode...");
          // ask the app to move to safe state because zookeeper connection
          // is not active and we dont know our state
          zkConnectionState = ConnectionState.DISCONNECTED;
          enterNeutralMode();
          break;
        case Expired:
          // the connection got terminated because of session timeout
          // call listener to reconnect
          LOG.info("Session expired. Entering neutral mode and rejoining...");
          enterNeutralMode();
          reJoinElection(0); // rejoin the election
          break;
        case SaslAuthenticated:
          LOG.info("Successfully authenticated to ZooKeeper using SASL.");
          break;
        default:
          fatalError("Unexpected Zookeeper watch event state: "
              + event.getState());
          break;
        }
        return;
      }

      // a watch on lock path in zookeeper has fired. so something has changed on
      // the lock. ideally we should check that the path is the same as the lock
      // path but trusting zookeeper for now
      // node events
      String path = event.getPath();
      if (path != null) {
        switch (eventType) {
        case NodeDeleted:
          if (state == State.ACTIVE) {
            enterNeutralMode(); // currently has no real implementation
          }
          joinElectionInternal(); // start the election
          break;
        case NodeDataChanged:
          monitorActiveStatus(); // keep watching the node and keep trying to become active
          break;
        default:
          LOG.debug("Unexpected node event: " + eventType + " for path: " + path);
          monitorActiveStatus();
        }
        return;
      }

      // some unexpected error has occurred
      fatalError("Unexpected watch error from Zookeeper");
    }
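To make the branches of processWatchEvent easier to see at a glance, here is a small sketch that condenses them into a pure function. It is an illustration only: `WatchEventRouter`, `Event`, and `Action` are hypothetical names, and each action merely labels the corresponding call in the method (monitorActiveStatus, enterNeutralMode, reJoinElection, joinElectionInternal). The stale-client check and the fatal-error paths are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical condensation of the processWatchEvent branches; not part of Hadoop. */
class WatchEventRouter {
    enum Event { CONNECTED, DISCONNECTED, EXPIRED, NODE_DELETED, NODE_DATA_CHANGED, OTHER_NODE_EVENT }
    enum Action { MONITOR_LOCK, ENTER_NEUTRAL, REJOIN_ELECTION, JOIN_ELECTION }

    /**
     * @param wasDisconnected    previous connection state was DISCONNECTED
     * @param wantToBeInElection this elector still wants to take part
     * @param isActive           this elector currently believes it is active
     */
    static List<Action> route(Event event, boolean wasDisconnected,
                              boolean wantToBeInElection, boolean isActive) {
        List<Action> actions = new ArrayList<>();
        switch (event) {
            case CONNECTED:
                // resume watching only after a disconnect, and only if still a candidate
                if (wasDisconnected && wantToBeInElection) actions.add(Action.MONITOR_LOCK);
                break;
            case DISCONNECTED:
                actions.add(Action.ENTER_NEUTRAL);
                break;
            case EXPIRED:
                actions.add(Action.ENTER_NEUTRAL);
                actions.add(Action.REJOIN_ELECTION);
                break;
            case NODE_DELETED:
                if (isActive) actions.add(Action.ENTER_NEUTRAL);
                actions.add(Action.JOIN_ELECTION);
                break;
            case NODE_DATA_CHANGED:
            case OTHER_NODE_EVENT:
                actions.add(Action.MONITOR_LOCK);
                break;
        }
        return actions;
    }
}
```

For example, a standby that sees the lock node deleted gets a single JOIN_ELECTION action, while an expired session always yields ENTER_NEUTRAL followed by REJOIN_ELECTION.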




The core election method is joinElectionInternal.


The election is carried out by creating the zkLockFilePath node, using ZooKeeper's asynchronous callback API.
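The asynchronous-callback style can be illustrated with a small stand-in. This is a sketch under stated assumptions and does not use the real ZooKeeper client: `FakeAsyncZk` and `CreateCallback` are hypothetical, the path `/lock` in the usage note is made up, and only the callback's parameter list and the two numeric result codes are taken from ZooKeeper (StringCallback.processResult and KeeperException.Code).

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Same parameter list as ZooKeeper's StringCallback; declared locally so the sketch compiles alone. */
interface CreateCallback {
    void processResult(int rc, String path, Object ctx, String name);
}

/** Hypothetical single-threaded stand-in for an asynchronous ZooKeeper client. */
class FakeAsyncZk {
    static final int OK = 0;              // numeric value of KeeperException.Code.OK
    static final int NODE_EXISTS = -110;  // numeric value of KeeperException.Code.NODEEXISTS

    private final Set<String> nodes = new HashSet<>();
    private final Queue<Runnable> pending = new ArrayDeque<>();

    /** Returns immediately; the outcome is reported later through the callback. */
    void create(String path, CreateCallback cb, Object ctx) {
        pending.add(() -> {
            int rc = nodes.add(path) ? OK : NODE_EXISTS;
            cb.processResult(rc, path, ctx, path);
        });
    }

    /** Delivers queued results in order, much as a client event thread would. */
    void deliverPending() {
        Runnable r;
        while ((r = pending.poll()) != null) r.run();
    }
}
```

If two callers each issue `create("/lock", cb, ctx)`, neither learns the outcome at call time. Once `deliverPending()` runs, the first callback receives `OK` and the second `NODE_EXISTS`, which is the distinction the elector's callback uses to choose between active and standby.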

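As the class declaration shows, ActiveStandbyElector itself implements two ZooKeeper callback interfaces (StatCallback and StringCallback).

The method that StatCallback requires is as follows: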



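The two processResult implementations inside ActiveStandbyElector are almost identical, so only one is reproduced here. If you are interested in the other, read the source.

Here is the implementation, with comments: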


    public synchronized void processResult(int rc, String path, Object ctx,
        String name) {
      if (isStaleClient(ctx)) return;
      LOG.debug("CreateNode result: " + rc + " for path: " + path
          + " connectionState: " + zkConnectionState +
          "  for " + this);

      // map the int result code to the KeeperException.Code enum for convenience
      Code code = Code.get(rc);
      if (isSuccess(code)) { // success: the zkLockFilePath node was created
        // we successfully created the znode. we are the leader. start monitoring
        if (becomeActive()) { // switch the NN on this host to active
          monitorActiveStatus(); // keep monitoring the node state
        } else {
          reJoinElectionAfterFailureToBecomeActive(); // failed: try the election again
        }
        return;
      }

      if (isNodeExists(code)) { // node exists, so there is already an active; just wait
        if (createRetryCount == 0) {
          // znode exists and we did not retry the operation. so a different
          // instance has created it. become standby and monitor lock.
          becomeStandby();
        }
        // if we had retried then the znode could have been created by our first
        // attempt to the server (that we lost) and this node exists response is
        // for the second attempt. verify this case via ephemeral node owner. this
        // will happen on the callback for monitoring the lock.
        monitorActiveStatus(); // but never stop trying to become active
        return;
      }

      String errorMessage = "Received create error from Zookeeper. code:"
          + code.toString() + " for path " + path;
      LOG.debug(errorMessage);

      if (shouldRetry(code)) {
        if (createRetryCount < maxRetryNum) {
          LOG.debug("Retrying createNode createRetryCount: " + createRetryCount);
          ++createRetryCount;
          createLockNodeAsync();
          return;
        }
        errorMessage = errorMessage
            + ". Not retrying further znode create connection errors.";
      } else if (isSessionExpired(code)) {
        // This isn't fatal - the client Watcher will re-join the election
        LOG.warn("Lock acquisition failed because session was lost");
        return;
      }

      fatalError(errorMessage);
    }
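The decision structure of this callback can be summarized as a small pure function. This is a sketch rather than the Hadoop implementation: `CreateResultPolicy` and `Outcome` are hypothetical names, the numeric codes are the values of ZooKeeper's KeeperException.Code, and treating connection loss and operation timeout as the retriable codes is an assumption, because the body of shouldRetry is not shown above.

```java
/** Hypothetical summary of the create-callback branches; not part of Hadoop. */
class CreateResultPolicy {
    enum Outcome { TRY_BECOME_ACTIVE, STANDBY_AND_MONITOR, MONITOR_ONLY, RETRY_CREATE, WAIT_FOR_REJOIN, FATAL }

    // numeric values of KeeperException.Code
    static final int OK = 0;
    static final int CONNECTION_LOSS = -4;
    static final int OPERATION_TIMEOUT = -7;
    static final int NODE_EXISTS = -110;
    static final int SESSION_EXPIRED = -112;

    static Outcome decide(int rc, int createRetryCount, int maxRetryNum) {
        if (rc == OK) {
            return Outcome.TRY_BECOME_ACTIVE;       // we created the lock node
        }
        if (rc == NODE_EXISTS) {
            // without a retry, someone else certainly owns the node; after a retry,
            // ownership is only settled by the later monitoring callback
            return createRetryCount == 0 ? Outcome.STANDBY_AND_MONITOR : Outcome.MONITOR_ONLY;
        }
        // assumption: these two codes are the ones considered retriable
        boolean retriable = rc == CONNECTION_LOSS || rc == OPERATION_TIMEOUT;
        if (retriable) {
            return createRetryCount < maxRetryNum ? Outcome.RETRY_CREATE : Outcome.FATAL;
        }
        if (rc == SESSION_EXPIRED) {
            return Outcome.WAIT_FOR_REJOIN;         // the watcher re-joins the election
        }
        return Outcome.FATAL;
    }
}
```

Note the asymmetry in the node-exists branch: the elector only declares itself standby when it has not retried, because after a retry the existing node may have been created by its own earlier, lost request.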

The actual state changes behind becomeStandby and becomeActive are implemented by ZKFailoverController.

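That concludes this look at the Hadoop 2.4 NameNode source code; hopefully it has answered your questions. Theory sticks best when paired with practice, so go and try it out. For more on related topics, keep following the site, where the editor will continue to publish practical articles.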
