Failed to report status for 600 seconds. Killing! Reporting progress in Hadoop
Problem Description

I am getting the following error:
Task attempt_201304161625_0028_m_000000_0 failed to report status for 600 seconds. Killing!
for my Map jobs. This question is similar to this, this, and this. However, I do not want to increase the default time before hadoop kills a task that doesn't report progress, i.e.,
Configuration conf = new Configuration();
long milliSeconds = 1000 * 60 * 60; // one hour
conf.setLong("mapred.task.timeout", milliSeconds);
Instead, I want to periodically report progress using either context.progress()
, context.setStatus("Some Message")
or context.getCounter(SOME_ENUM.PROGRESS).increment(1)
or something similar. However, this still causes the job to be killed. Here are the snippets of code where I am attempting to report progress. The mapper:
protected void map(Key key, Value value, Context context) throws IOException, InterruptedException {
    //do some things
    Optimiser optimiser = new Optimiser();
    optimiser.optimiseFurther(<some parameters>, context);
    //more things
    context.write(newKey, newValue);
}
The optimiseFurther method within the Optimiser class:
public void optimiseFurther(<Some parameters>, TaskAttemptContext context) {
    int count = 0;
    while(something is true) {
        //optimise

        //try to report progress
        context.setStatus("Progressing:" + count);
        System.out.println("Optimise Progress:" + context.getStatus());
        context.progress();
        count++;
    }
}
The output from a mapper shows the status is being updated:
Optimise Progress:Progressing:0
Optimise Progress:Progressing:1
Optimise Progress:Progressing:2
...
However, the job is still being killed after the default amount of time. Am I using the context in the wrong way? Is there anything else I need to do in the job setup in order to report the progress successfully?
Recommended Answer
This problem is due to a bug in Hadoop 0.20 whereby calls to context.setStatus() and context.progress() made through the new mapreduce API are not reported to the underlying framework (calls that set various counters do not work either). A patch is available, so updating to a newer version of Hadoop should fix this.
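Until an upgrade is possible, one common interim pattern is to keep the task alive from a background thread that fires a progress callback at a fixed interval, independent of the optimisation loop. The sketch below is a minimal, hypothetical illustration of that pattern, not Hadoop API code: the Runnable passed in stands in for something like context::progress, and the class name and interval are this example's own assumptions.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical heartbeat helper: periodically invokes a progress callback
// (e.g. context::progress) on a background thread while long-running work
// executes on the main thread. Close it when the work finishes.
public class ProgressHeartbeat implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public ProgressHeartbeat(Runnable reportProgress, long intervalMillis) {
        // First tick after one interval, then repeat at a fixed rate.
        scheduler.scheduleAtFixedRate(reportProgress, intervalMillis,
                intervalMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

With something like this, the mapper could wrap the long-running call, e.g. try (ProgressHeartbeat hb = new ProgressHeartbeat(context::progress, 60_000)) { optimiser.optimiseFurther(...); } — assuming, of course, that progress() calls actually reach the framework on your Hadoop version.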