How to fix the "Output directory already exists" error in MapReduce

tech 2024-07-07

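When you run a MapReduce program locally, you have to set both an input path and an output path. The output path is the pickier of the two: it names a directory, and that directory must not already exist when the job is submitted. If it is there, even as an empty folder or as leftover output from a previous run, the job fails immediately with an "Output directory ... already exists" error.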

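import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// WordCountMapper and WordCountReducer are assumed to live in the same package.
public class Driver {
    public static void main(String[] args) {
        // Hard-code the input and output paths for a local run
        args = new String[2];
        args[0] = "D:\\LIB\\hello.txt";
        args[1] = "D:\\LIB\\output";

        // 1: create the job from a configuration
        Configuration configuration = new Configuration();
        try {
            Job job = Job.getInstance(configuration);

            // 2: locate the jar through the driver class
            job.setJarByClass(Driver.class);

            // 3: wire up the custom mapper and reducer
            job.setMapperClass(WordCountMapper.class);
            job.setReducerClass(WordCountReducer.class);

            // 4: set the map output key/value types
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);

            // 5: set the final output key/value types
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            // 6: set the input and output paths
            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            // Submit the job and wait for it to finish
            boolean a = job.waitForCompletion(true);
            System.out.println(a);
        } catch (IOException | InterruptedException | ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}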

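The code above is the unmodified Driver for the MapReduce job. If we leave the output folder in place, the program fails right away. If we delete it by hand, we have to do that before every single run, and a trip to the file manager each time gets tedious fast.

A better option is to add a few lines to the program itself. The idea is simple: take the output directory, check whether it exists, and delete it if it does. That leads to the following code.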

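import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// WordCountMapper and WordCountReducer are assumed to live in the same package.
public class Driver {
    public static void main(String[] args) {
        // Hard-code the input and output paths for a local run
        args = new String[2];
        args[0] = "D:\\LIB\\hello.txt";
        args[1] = "D:\\LIB\\output";

        // 1: create the job from a configuration
        Configuration configuration = new Configuration();
        try {
            Job job = Job.getInstance(configuration);

            // 2: locate the jar through the driver class
            job.setJarByClass(Driver.class);

            // 3: wire up the custom mapper and reducer
            job.setMapperClass(WordCountMapper.class);
            job.setReducerClass(WordCountReducer.class);

            // 4: set the map output key/value types
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);

            // 5: set the final output key/value types
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            // Delete the output directory (recursively) if it already exists
            Path path = new Path(args[1]);
            FileSystem fileSystem = path.getFileSystem(configuration);
            if (fileSystem.exists(path)) {
                fileSystem.delete(path, true);
            }

            // 6: set the input and output paths
            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, path);

            // Submit the job and wait for it to finish
            boolean a = job.waitForCompletion(true);
            System.out.println(a);
        } catch (IOException | InterruptedException | ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}

The new part is these few lines:

Path path = new Path(args[1]);
FileSystem fileSystem = path.getFileSystem(configuration);
if (fileSystem.exists(path)) {
    fileSystem.delete(path, true);
}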

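The first line wraps the output directory in a Path object. The second line asks that path for the FileSystem it belongs to, which is the handle we use to inspect and modify it. The rest is the check: if the path exists, delete it, and the true argument makes the delete recursive so everything inside goes with it. With this in place, any leftover output directory is cleared before the job runs, and nothing happens if there is none. Since the delete does not ask for confirmation, make sure the output path points only at data you can afford to lose.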
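For purely local runs, the same check-and-delete idea can also be sketched with nothing but the JDK. This is not what the Driver above uses, and it only works on the local file system, not on HDFS. The class name LocalOutputCleaner and the method deleteIfExists are hypothetical names made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of Hadoop): clears a local output directory.
public class LocalOutputCleaner {

    // Deletes dir and everything below it. Returns true if something was deleted.
    public static boolean deleteIfExists(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false; // nothing to clean up
        }
        // Collect the whole tree, deepest entries first, so that every
        // directory is already empty by the time it is deleted.
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(dir)) {
            entries = walk.sorted(Comparator.reverseOrder())
                          .collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // A throwaway temp directory stands in for the job's output folder.
        Path output = Files.createTempDirectory("output");
        Files.createFile(output.resolve("part-r-00000"));

        System.out.println(deleteIfExists(output)); // prints true
        System.out.println(Files.exists(output));   // prints false
    }
}
```

Calling deleteIfExists just before the job is configured would play the same role as the FileSystem lines in the Driver. The Hadoop FileSystem version remains the more general choice, because it works for whatever file system the output path resolves to.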
