Hadoop: submitting jobs as root fails after enabling LinuxContainerExecutor (fixed by modifying the source)

tech, 2023-08-03

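Problem description

On Hadoop 3.2.1, after configuring cgroups to isolate YARN CPU resources, jobs submitted to YARN as the root user could not be submitted successfully and failed with the following error: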

Running as root is not allowed

Searching the source code for this error message turned up the following:

/**
 * Is the user a real user account?
 * Checks:
 *   1. Not root
 *   2. UID is above the minimum configured.
 *   3. Not in banned user list
 * Returns NULL on failure
 */
struct passwd* check_user(const char *user) {
  if (strcmp(user, "root") == 0) {
    fprintf(LOGFILE, "Running as root is not allowed\n");
    fflush(LOGFILE);
    return NULL;
  }

So the restriction is hard-coded in the source: once LCE (LinuxContainerExecutor) is enabled, the user information is checked and jobs cannot be submitted as root.
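The three checks listed in that comment can be sketched as a small stand-alone function. This is a minimal illustration, not the Hadoop implementation: `is_user_allowed`, its parameters, and the idea of passing the banned list as an array are hypothetical choices made for this example. The real `check_user` reads its limits from the container-executor configuration and looks the account up in the system password database.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Illustrative sketch only (hypothetical helper, not Hadoop source).
 * Returns 1 if the user passes all three checks, 0 otherwise.
 */
int is_user_allowed(const char *user, uid_t uid, uid_t min_uid,
                    const char *const *banned, size_t n_banned) {
  assert(user != NULL);

  /* Check 1: root is rejected outright, before anything else. */
  if (strcmp(user, "root") == 0) {
    return 0;
  }

  /* Check 2: the UID must not be below the configured minimum. */
  if (uid < min_uid) {
    return 0;
  }

  /* Check 3: the user name must not appear in the banned list. */
  for (size_t i = 0; i < n_banned; i++) {
    if (strcmp(user, banned[i]) == 0) {
      return 0;
    }
  }

  return 1;
}
```

With a minimum UID of 1000, a call such as `is_user_allowed("root", 0, 1000, banned, n)` is rejected by the first check regardless of the other arguments, which matches the behaviour seen when submitting as root.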

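Root cause

The user check is performed during the init phase of org.apache.hadoop.mapreduce.job. Container startup proceeds roughly as follows: resource localization -> container launch -> container execution -> resource cleanup.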

Solution

In \hadoop-3.2.1-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\native\container-executor\impl\container-executor.c, find the code snippet shown above and comment out the root check as follows:

/**
 * Is the user a real user account?
 * Checks:
 *   1. Not root
 *   2. UID is above the minimum configured.
 *   3. Not in banned user list
 * Returns NULL on failure
 */
struct passwd* check_user(const char *user) {
  // if (strcmp(user, "root") == 0) {
  //   fprintf(LOGFILE, "Running as root is not allowed\n");
  //   fflush(LOGFILE);
  //   return NULL;
  // }
  char *min_uid_str = get_value(MIN_USERID_KEY);
  int min_uid = DEFAULT_MIN_USERID;
  if (min_uid_str != NULL) {
    char *end_ptr = NULL;
    ......

Then recompile the code so that the rebuilt container-executor binary includes the change.
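One point worth checking after the rebuild: the lines that follow the commented-out block compare the user's UID against a configured minimum, and root's UID is 0. The sketch below illustrates that comparison, assuming that an account below the minimum is rejected unless it has been explicitly allowed. `passes_min_uid` and its parameters are hypothetical names for this example, not functions from container-executor.c. If the patched binary still rejects root, the `min.user.id` and `allowed.system.users` settings in container-executor.cfg are a plausible place to look.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Illustrative sketch only (hypothetical helper, not Hadoop source).
 * explicitly_allowed is 1 if the account is on an allow list, else 0.
 * Returns 1 if the UID passes the minimum-UID check, 0 otherwise.
 */
int passes_min_uid(uid_t uid, uid_t min_uid, int explicitly_allowed) {
  assert(explicitly_allowed == 0 || explicitly_allowed == 1);

  /* A UID below the minimum is only accepted for allow-listed accounts. */
  if (uid < min_uid && !explicitly_allowed) {
    return 0;
  }
  return 1;
}
```

Under these assumptions, `passes_min_uid(0, 1000, 0)` fails while `passes_min_uid(0, 0, 0)` and `passes_min_uid(0, 1000, 1)` succeed, so removing the hard-coded root check may not be sufficient on its own when the minimum UID is left at a value above 0.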
