What do job.setOutputKeyClass and job.setOutputValueClass refer to?
Question
I thought they referred to the Reducer, but in my program I have
public static class MyMapper extends Mapper
and
public static class MyReducer extends Reducer<Text, Text, NullWritable, Text>
If I have
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(Text.class);
I get the following exception:
Type mismatch in key from map: expected org.apache.hadoop.io.NullWritable, received org.apache.hadoop.io.Text
But if I have
job.setOutputKeyClass(Text.class);
there is no problem.
Is there something wrong with my code, or does this happen because of NullWritable or something else?
Also, do I have to use job.setInputFormatClass and job.setOutputFormatClass? My program runs correctly without them.
Recommended answer
Calling job.setOutputKeyClass(NullWritable.class); sets the key type expected as output from both the map and reduce phases.
If your Mapper emits different types than the Reducer, you can set the types emitted by the mapper with the JobConf's setMapOutputKeyClass() and setMapOutputValueClass() methods (the Job class of the newer API has methods of the same name). These implicitly set the input types expected by the Reducer.
(Source: Yahoo Developer Tutorial)
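The fallback described above can be sketched in plain Java. This is a simplified model, not Hadoop code: OutputTypeSketch, FakeText and FakeNullWritable are hypothetical stand-ins for the job configuration, Text and NullWritable. It assumes that the map output key class falls back to the job-wide output key class when no map-specific class has been set, and that every key emitted by the mapper is checked against that class.

```java
import java.io.IOException;

// Simplified, hypothetical model of how a job resolves the map output key class.
// Not Hadoop code: the names below are stand-ins used only for illustration.
final class OutputTypeSketch {
    static final class FakeText {}          // stand-in for Text
    static final class FakeNullWritable {}  // stand-in for NullWritable

    private Class<?> outputKeyClass;     // set by setOutputKeyClass
    private Class<?> mapOutputKeyClass;  // set by setMapOutputKeyClass; may stay null

    void setOutputKeyClass(Class<?> c) { outputKeyClass = c; }

    void setMapOutputKeyClass(Class<?> c) { mapOutputKeyClass = c; }

    // Falls back to the job-wide output key class if no map-specific class was set.
    Class<?> getMapOutputKeyClass() {
        return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
    }

    // Models the check applied to each key the mapper emits.
    void checkMapKey(Object key) throws IOException {
        Class<?> expected = getMapOutputKeyClass();
        if (key.getClass() != expected) {
            throw new IOException("Type mismatch in key from map: expected "
                    + expected.getName() + ", received " + key.getClass().getName());
        }
    }

    public static void main(String[] args) throws IOException {
        OutputTypeSketch conf = new OutputTypeSketch();
        conf.setOutputKeyClass(FakeNullWritable.class);

        // The mapper emits a text key, but the only configured type is the null key.
        try {
            conf.checkMapKey(new FakeText());
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }

        // Declaring the map output key class separately resolves the mismatch.
        conf.setMapOutputKeyClass(FakeText.class);
        conf.checkMapKey(new FakeText());
        System.out.println("map key accepted");
    }
}
```

For the job in the question, the corresponding real call would be job.setMapOutputKeyClass(Text.class) (plus job.setMapOutputValueClass(Text.class) for symmetry), while keeping job.setOutputKeyClass(NullWritable.class) for the reducer output.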
Regarding your second question, the default InputFormat is TextInputFormat. It treats each line of each input file as a separate record and performs no parsing. You only need to call job.setInputFormatClass if you want to process your input in a different format. Some examples:
InputFormat             | Description                                      | Key                                      | Value
----------------------------------------------------------------------------------------------------------------------------------------------
TextInputFormat         | Default format; reads lines of text files        | The byte offset of the line              | The line contents
KeyValueTextInputFormat | Parses lines into (key, value) pairs             | Everything up to the first tab character | The remainder of the line
SequenceFileInputFormat | A Hadoop-specific high-performance binary format | User-defined                             | User-defined
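To make the key/value row of the table concrete, here is a minimal plain-Java sketch of the "split at the first tab" rule. KeyValueLineSketch and split are hypothetical names, not part of Hadoop, and the handling of a line without any tab (whole line as key, empty value) is an assumption of this sketch.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical illustration of the "first tab" rule; not Hadoop code.
final class KeyValueLineSketch {
    // Key: everything before the first tab. Value: everything after it.
    // Assumption: a line without a tab becomes the key, with an empty value.
    static Map.Entry<String, String> split(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, tab), line.substring(tab + 1));
    }

    public static void main(String[] args) {
        // Only the first tab separates key from value; later tabs stay in the value.
        Map.Entry<String, String> record = split("user42\tclicked\thome");
        System.out.println("key=" + record.getKey());
        System.out.println("value=" + record.getValue());
    }
}
```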
The default OutputFormat is TextOutputFormat, which writes (key, value) pairs on individual lines of a text file. Some examples:
OutputFormat             | Description
---------------------------------------------------------------------------------------------------
TextOutputFormat         | Default; writes lines in "key \t value" form
SequenceFileOutputFormat | Writes binary files suitable for reading into subsequent MapReduce jobs
NullOutputFormat         | Disregards its inputs
(Source: another Yahoo Developer Tutorial)
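Since the job in the question writes a NullWritable key, it may help to see what a text line looks like in that case. The sketch below is a simplified plain-Java model, not Hadoop code: TextOutputLineSketch and formatLine are hypothetical names, and null stands in for NullWritable. It assumes that an absent key or value is skipped together with the tab separator.

```java
// Hypothetical illustration of how a (key, value) pair becomes one text line.
final class TextOutputLineSketch {
    // Joins key and value with a tab. A null key or value (standing in for
    // NullWritable) is left out, and no separator is written in that case.
    // The trailing newline is omitted; two nulls yield an empty string here.
    static String formatLine(Object key, Object value) {
        StringBuilder sb = new StringBuilder();
        if (key != null) {
            sb.append(key);
        }
        if (key != null && value != null) {
            sb.append('\t');
        }
        if (value != null) {
            sb.append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatLine("word", 3));        // key, tab, value
        System.out.println(formatLine(null, "only value")); // value alone, no leading tab
    }
}
```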