Hadoop: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
My program looks like
public class TopKRecord extends Configured implements Tool {
    public static class MapClass extends Mapper<Text, Text, Text, Text> {
        public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
            // your map code goes here
            String[] fields = value.toString().split(",");
            String year = fields[1];
            String claims = fields[8];
            if (claims.length() > 0 && (!claims.startsWith("\""))) {
                context.write(new Text(year.toString()), new Text(claims.toString()));
            }
        }
    }

    public int run(String args[]) throws Exception {
        Job job = new Job();
        job.setJarByClass(TopKRecord.class);
        job.setMapperClass(MapClass.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setJobName("TopKRecord");
        job.setMapOutputValueClass(Text.class);
        job.setNumReduceTasks(0);
        boolean success = job.waitForCompletion(true);
        return success ? 0 : 1;
    }

    public static void main(String args[]) throws Exception {
        int ret = ToolRunner.run(new TopKRecord(), args);
        System.exit(ret);
    }
}
The data looks like
"PATENT","GYEAR","GDATE","APPYEAR","COUNTRY","POSTATE","ASSIGNEE","ASSCODE","CLAIMS","NCLASS","CAT","SUBCAT","CMADE","CRECEIVE","RATIOCIT","GENERAL","ORIGINAL","FWDAPLAG","BCKGTLAG","SELFCTUB","SELFCTLB","SECDUPBD","SECDLWBD"
3070801,1963,1096,,"BE","",,1,,269,6,69,,1,,0,,,,,,,
3070802,1963,1096,,"US","TX",,1,,2,6,63,,0,,,,,,,,,
3070803,1963,1096,,"US","IL",,1,,2,6,63,,9,,0.3704,,,,,,,
3070804,1963,1096,,"US","OH",,1,,2,6,63,,3,,0.6667,,,,,,,
On running this program I see the following on console
12/08/02 12:43:34 INFO mapred.JobClient: Task Id : attempt_201208021025_0007_m_000000_0, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
at com.hadoop.programs.TopKRecord$MapClass.map(TopKRecord.java:26)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
I believe that the class types on my Mapper are mapped correctly.
Please let me know what I am doing wrong here?
When you read a file with an M/R program using the default input format, the input key of your mapper is the offset of the line in the file, while the input value is the full line.
So what's happening here is that you're trying to receive the line offset as a Text
object, which is wrong; you need a LongWritable
instead so that Hadoop doesn't complain about the types.
Try this instead:
public class TopKRecord extends Configured implements Tool {
    public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // your map code goes here
            String[] fields = value.toString().split(",");
            String year = fields[1];
            String claims = fields[8];
            if (claims.length() > 0 && (!claims.startsWith("\""))) {
                context.write(new Text(year.toString()), new Text(claims.toString()));
            }
        }
    }
    ...
}
Also, one thing in your code you might want to reconsider: you're creating two Text
objects for every record you process. You should create these two objects just once, and then in your mapper set their values with the set
method. This will save you a lot of time if you're processing a decent amount of data.
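A sketch of that object-reuse pattern applied to the corrected mapper above (not runnable on its own — it assumes the Hadoop classes are on the classpath; reuse is safe because Hadoop serializes the key and value at the moment `context.write` is called):

```
public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {
    // Created once per mapper instance and reused across all map() calls,
    // instead of allocating two new Text objects per record.
    private final Text outKey = new Text();
    private final Text outValue = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        String year = fields[1];
        String claims = fields[8];
        if (claims.length() > 0 && !claims.startsWith("\"")) {
            outKey.set(year);       // overwrite the buffer in place
            outValue.set(claims);
            context.write(outKey, outValue);
        }
    }
}
```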