  1. mapreduce - How to get the input file name in the mapper in a …

    Sep 25, 2013 · If you are using Hadoop Streaming, you can use the JobConf variables in a streaming job's mapper/reducer. As for the input file name of mapper, see the Configured …

  2. hadoop - What is the purpose of shuffling and sorting phase in …

    Mar 3, 2014 · Then, the MapReduce job stops at the map phase, and the map phase does not include any kind of sorting (so even the map phase is faster). Tom White has been an Apache …

  3. hadoop - what are the disadvantages of mapreduce? - Stack …

    Sep 3, 2013 · What are the disadvantages of mapreduce? There are lots of advantages of mapreduce. But I would like to know the disadvantages of mapreduce too.

  4. mapreduce - How to optimize shuffling/sorting phase in a hadoop …

    Dec 10, 2015 · By default, MapReduce uses only one reducer, which is a bottleneck in most scenarios. You can either change the code to set the number of reducers to a higher value …

  5. mapreduce - Hadoop error in shuffle in fetcher#1 - Stack Overflow

    Nov 7, 2013 · I'm running a parsing job in Hadoop. The source is an 11GB map file with about 900,000 binary records, each representing an HTML file; the map extracts links and writes them …

  6. Chaining multiple MapReduce jobs in Hadoop - Stack Overflow

    Mar 23, 2010 · In many real-life situations where you apply MapReduce, the final algorithms end up being several MapReduce steps. i.e. Map1 , Reduce1 , Map2 , Reduce2 , and so on. So …

  7. hadoop - Setting the number of map tasks and reduce tasks

    Jul 31, 2011 · Newer versions of Hadoop provide the much more granular mapreduce.job.running.map.limit and mapreduce.job.running.reduce.limit, which allow you to …

  8. java - Jars for hadoop mapreduce - Stack Overflow

    Aug 6, 2015 · I am following this hadoop mapreduce tutorial given by Apache. The Java code given there uses these Apache-hadoop classes: import …

  9. hadoop - Differences between MapReduce and Yarn - Stack …

    Here are MapReduce 1.0 and MapReduce 2.0 (YARN). MapReduce 1.0: In a typical Hadoop cluster, racks are interconnected via core switches. Core switches should connect to top-of …

  10. mapreduce - Hadoop - get results from output files after reduce ...

    Aug 26, 2013 · In MapReduce you specify an output folder; it will contain only part-r files (each the output of a reduce task) and a _SUCCESS file (which is empty).
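
The output layout described in the last result can be read back with a short script. The following is a minimal sketch, not the method from the linked answer. It assumes the output folder has already been copied to the local filesystem, that the part files are named with a `part-r-` prefix, and that each line is plain text holding a key and a value separated by a tab. The function name `read_reduce_output` is hypothetical.

```python
from pathlib import Path


def read_reduce_output(output_dir):
    """Collect (key, value) pairs from a local copy of a job's output folder.

    Assumptions (not stated in the source snippet): part files are named
    part-r-*, and each line is text in the form "key<TAB>value".
    """
    out = Path(output_dir)

    # The empty _SUCCESS marker signals that the job finished; refuse to
    # read a folder that does not contain it.
    if not (out / "_SUCCESS").exists():
        raise FileNotFoundError(f"no _SUCCESS marker in {out}")

    records = []
    # Sort so that part files are read in a stable order.
    for part in sorted(out.glob("part-r-*")):
        with part.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.rstrip("\n")
                if not line:
                    continue  # skip blank lines
                # Split on the first tab only; the value may contain tabs.
                key, _, value = line.partition("\t")
                records.append((key, value))
    return records
```

Calling `read_reduce_output("output")` on such a folder returns one list that combines the records of every reduce task, in part-file order.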