org.apache.hadoop.mapred
Class ReduceTask.ReduceCopier<K,V>

java.lang.Object
  extended by org.apache.hadoop.mapred.ReduceTask.ReduceCopier<K,V>
All Implemented Interfaces:
ShuffleConsumerPlugin
Enclosing class:
ReduceTask

public static class ReduceTask.ReduceCopier<K,V>
extends java.lang.Object
implements ShuffleConsumerPlugin


Nested Class Summary
static class ReduceTask.ReduceCopier.MapOutput
          Describes the output of a map, which may be either on disk or in memory.
 class ReduceTask.ReduceCopier.MapOutputCopier
          Copies map outputs as they become available.
 class ReduceTask.ReduceCopier.MapOutputLocation
          Abstraction to track a map-output.
 class ReduceTask.ReduceCopier.ShuffleClientMetrics
          Provides the methods used to report shuffle-specific metrics.
 
Nested classes/interfaces inherited from interface org.apache.hadoop.mapred.ShuffleConsumerPlugin
ShuffleConsumerPlugin.Context
 
Field Summary
static long COUNTER_UPDATE_INTERVAL
           
static long DEFAULT_DISK_HEALTH_CHECK_INTERVAL
          How often the TaskTracker checks the health of its disks, if not configured using mapred.disk.healthChecker.interval.
static int FILE_NOT_FOUND
           
static java.lang.String FOR_REDUCE_TASK
          The reduce task number for which this map output is being transferred.
static java.lang.String FROM_MAP_TASK
          The map task from which the map output data is being transferred.
static int HEARTBEAT_INTERVAL_MIN_DEFAULT
           
static java.lang.String MAP_OUTPUT_LENGTH
          The custom HTTP header used for the map output length.
static java.lang.String RAW_MAP_OUTPUT_LENGTH
          The custom HTTP header used for the "raw" map output length.
static int SUCCESS
           
static java.lang.String WORKDIR
           
 
Constructor Summary
ReduceTask.ReduceCopier()
           
 
Method Summary
protected  void checkAndInformJobTracker(int failures, TaskAttemptID mapId, boolean readError)
           
 void close()
          Closes and cleans up any resources associated with this object.
protected  boolean closeMerger()
           
 RawKeyValueIterator createKVIterator()
          Create a RawKeyValueIterator from copied map outputs.
 boolean fetchOutputs()
          Fetches the map outputs.
 java.lang.Throwable getMergeThrowable()
          Returns any exception thrown during the merge.
 void init(ShuffleConsumerPlugin.Context context)
          Initializes the reduce copier plugin.
protected  void initMerger()
           
protected  ReduceTask.ReduceCopier.MapOutput shuffle(ReduceTask.ReduceCopier.MapOutputCopier copier, ReduceTask.ReduceCopier.MapOutputLocation mapOutputLoc, java.net.URLConnection connection, java.io.InputStream input, ReduceTask.ReduceCopier.ShuffleClientMetrics shuffleClientMetrics, org.apache.hadoop.fs.Path filename, long decompressedLength, long compressedLength)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

HEARTBEAT_INTERVAL_MIN_DEFAULT

public static final int HEARTBEAT_INTERVAL_MIN_DEFAULT
See Also:
Constant Field Values

COUNTER_UPDATE_INTERVAL

public static final long COUNTER_UPDATE_INTERVAL
See Also:
Constant Field Values

DEFAULT_DISK_HEALTH_CHECK_INTERVAL

public static final long DEFAULT_DISK_HEALTH_CHECK_INTERVAL
How often the TaskTracker checks the health of its disks, if not configured using mapred.disk.healthChecker.interval.

See Also:
Constant Field Values

SUCCESS

public static final int SUCCESS
See Also:
Constant Field Values

FILE_NOT_FOUND

public static final int FILE_NOT_FOUND
See Also:
Constant Field Values

MAP_OUTPUT_LENGTH

public static final java.lang.String MAP_OUTPUT_LENGTH
The custom HTTP header used for the map output length.

See Also:
Constant Field Values

RAW_MAP_OUTPUT_LENGTH

public static final java.lang.String RAW_MAP_OUTPUT_LENGTH
The custom HTTP header used for the "raw" map output length.

See Also:
Constant Field Values

FROM_MAP_TASK

public static final java.lang.String FROM_MAP_TASK
The map task from which the map output data is being transferred.

See Also:
Constant Field Values

FOR_REDUCE_TASK

public static final java.lang.String FOR_REDUCE_TASK
The reduce task number for which this map output is being transferred.

See Also:
Constant Field Values
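
Example (illustrative only): the four header constants above describe a map-output transfer. The sketch below shows how a client might validate such headers before reading the payload. It is not the Hadoop implementation: the class ShuffleHeaderCheck, its methods, and the literal header names are hypothetical placeholders (the real values are listed under Constant Field Values), and the headers are modelled as a plain Map so the example runs without a network connection.

```java
import java.util.HashMap;
import java.util.Map;

class ShuffleHeaderCheck {
    // Placeholder header names for illustration only; the real string values
    // are listed under "Constant Field Values" and are not reproduced here.
    static final String MAP_OUTPUT_LENGTH = "X-Example-Map-Output-Length";
    static final String RAW_MAP_OUTPUT_LENGTH = "X-Example-Raw-Map-Output-Length";
    static final String FROM_MAP_TASK = "X-Example-From-Map-Task";
    static final String FOR_REDUCE_TASK = "X-Example-For-Reduce-Task";

    /** Parses a required, non-negative length header; throws if it is missing or malformed. */
    static long requiredLength(Map<String, String> headers, String name) {
        String value = headers.get(name);
        if (value == null) {
            throw new IllegalArgumentException("Missing header: " + name);
        }
        // NumberFormatException is a subclass of IllegalArgumentException.
        long parsed = Long.parseLong(value.trim());
        if (parsed < 0) {
            throw new IllegalArgumentException("Negative length in header: " + name);
        }
        return parsed;
    }

    /** Returns true when the headers name the expected map task and reduce partition. */
    static boolean matches(Map<String, String> headers, String expectedMapId, int expectedReduce) {
        String mapId = headers.get(FROM_MAP_TASK);
        String reduce = headers.get(FOR_REDUCE_TASK);
        if (mapId == null || reduce == null) {
            return false;
        }
        try {
            return mapId.equals(expectedMapId) && Integer.parseInt(reduce.trim()) == expectedReduce;
        } catch (NumberFormatException e) {
            return false; // a malformed reduce number never matches
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put(MAP_OUTPUT_LENGTH, "1024");
        headers.put(RAW_MAP_OUTPUT_LENGTH, "4096");
        headers.put(FROM_MAP_TASK, "example-map-attempt");
        headers.put(FOR_REDUCE_TASK, "3");
        System.out.println(matches(headers, "example-map-attempt", 3));
        System.out.println(requiredLength(headers, RAW_MAP_OUTPUT_LENGTH));
    }
}
```

Rejecting a response whose task headers do not match avoids reading a payload intended for a different reduce; whether the real copier performs exactly these checks is not stated on this page.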

WORKDIR

public static final java.lang.String WORKDIR
See Also:
Constant Field Values
Constructor Detail

ReduceTask.ReduceCopier

public ReduceTask.ReduceCopier()
Method Detail

init

public void init(ShuffleConsumerPlugin.Context context)
          throws java.lang.ClassNotFoundException,
                 java.io.IOException
Description copied from interface: ShuffleConsumerPlugin
Initializes the reduce copier plugin.

Specified by:
init in interface ShuffleConsumerPlugin
Parameters:
context - reduce copier context.
Throws:
java.lang.ClassNotFoundException
java.io.IOException

initMerger

protected void initMerger()
                   throws java.io.IOException
Throws:
java.io.IOException

fetchOutputs

public boolean fetchOutputs()
                     throws java.io.IOException
Description copied from interface: ShuffleConsumerPlugin
Fetches the map outputs.

Specified by:
fetchOutputs in interface ShuffleConsumerPlugin
Returns:
true if the fetch was successful; false otherwise.
Throws:
java.io.IOException

closeMerger

protected boolean closeMerger()

shuffle

protected ReduceTask.ReduceCopier.MapOutput shuffle(ReduceTask.ReduceCopier.MapOutputCopier copier,
                                                    ReduceTask.ReduceCopier.MapOutputLocation mapOutputLoc,
                                                    java.net.URLConnection connection,
                                                    java.io.InputStream input,
                                                    ReduceTask.ReduceCopier.ShuffleClientMetrics shuffleClientMetrics,
                                                    org.apache.hadoop.fs.Path filename,
                                                    long decompressedLength,
                                                    long compressedLength)
                                             throws java.io.IOException,
                                                    java.lang.InterruptedException
Throws:
java.io.IOException
java.lang.InterruptedException

checkAndInformJobTracker

protected void checkAndInformJobTracker(int failures,
                                        TaskAttemptID mapId,
                                        boolean readError)

createKVIterator

public RawKeyValueIterator createKVIterator()
                                     throws java.io.IOException
Create a RawKeyValueIterator from copied map outputs. All copying threads have exited, so all of the map outputs are available either in memory or on disk. We also know that no merges are in progress, so synchronization is more lax here. The iterator returned must satisfy the following constraints:
  1. Fewer than io.sort.factor files may be sources.
  2. No more than maxInMemReduce bytes of map outputs may be resident in memory when the reduce begins.
If we must perform an intermediate merge to satisfy (1), then we can keep the excluded outputs from (2) in memory and include them in the first merge pass. If not, then said outputs must be written to disk first.

Specified by:
createKVIterator in interface ShuffleConsumerPlugin
Returns:
an iterator for merged key-value pairs.
Throws:
java.io.IOException
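
Example (illustrative only): the two constraints in the createKVIterator description can be read as a small planning step. The sketch below is a simplified model, not the Hadoop code: the class FinalMergePlan, its fields, and the rule of excluding segments in list order are assumptions made for illustration. Only the names io.sort.factor and maxInMemReduce come from the description.

```java
import java.util.ArrayList;
import java.util.List;

class FinalMergePlan {
    final List<Long> keptInMemory = new ArrayList<>(); // segment sizes that stay resident
    final List<Long> excluded = new ArrayList<>();      // segment sizes pushed out by constraint (2)
    boolean intermediateMergeNeeded;                    // true when constraint (1) is violated
    boolean excludedMustSpillFirst;                     // true when excluded segments go to disk first

    /**
     * Plans the final merge for the given in-memory segment sizes (bytes) and
     * the number of map-output files already on disk.
     */
    static FinalMergePlan plan(long[] inMemSegmentBytes, int onDiskFiles,
                               int ioSortFactor, long maxInMemReduce) {
        FinalMergePlan p = new FinalMergePlan();
        long resident = 0;
        for (long bytes : inMemSegmentBytes) {
            resident += bytes;
        }
        // Constraint (2): exclude segments, in order, until the rest fits in maxInMemReduce.
        for (long bytes : inMemSegmentBytes) {
            if (resident > maxInMemReduce) {
                p.excluded.add(bytes);
                resident -= bytes;
            } else {
                p.keptInMemory.add(bytes);
            }
        }
        // Constraint (1): fewer than ioSortFactor files may be sources.
        p.intermediateMergeNeeded = onDiskFiles >= ioSortFactor;
        // With an intermediate merge, excluded segments can join its first pass
        // from memory; without one, they must be written to disk first.
        p.excludedMustSpillFirst = !p.excluded.isEmpty() && !p.intermediateMergeNeeded;
        return p;
    }

    public static void main(String[] args) {
        FinalMergePlan p = plan(new long[] {40, 30, 30}, 3, 10, 50);
        System.out.println("excluded=" + p.excluded + ", kept=" + p.keptInMemory
                + ", spillFirst=" + p.excludedMustSpillFirst);
    }
}
```

The sketch ignores the extra on-disk file that a spill would create; a real implementation would have to account for it when checking constraint (1).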

getMergeThrowable

public java.lang.Throwable getMergeThrowable()
Description copied from interface: ShuffleConsumerPlugin
Returns any exception thrown during the merge.

Specified by:
getMergeThrowable in interface ShuffleConsumerPlugin

close

public void close()
Description copied from interface: ShuffleConsumerPlugin
Closes and cleans up any resources associated with this object.

Specified by:
close in interface ShuffleConsumerPlugin
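
Example (illustrative only): the public methods documented above suggest a lifecycle of init, fetchOutputs, createKVIterator, and close, with getMergeThrowable consulted when fetching fails. This page does not state the call order, so the sequence below is an assumption. To stay runnable without Hadoop on the classpath, the sketch uses a hypothetical stand-in interface (Plugin) and a recording test double rather than the real ShuffleConsumerPlugin types.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ShuffleLifecycleSketch {
    /** Hypothetical stand-in mirroring the documented method names; not the Hadoop interface. */
    interface Plugin {
        void init(Object context) throws ClassNotFoundException, IOException;
        boolean fetchOutputs() throws IOException;
        Throwable getMergeThrowable();
        Object createKVIterator() throws IOException;
        void close();
    }

    /** Drives one shuffle: init, fetch, build the iterator, hand it to the reducer, then close. */
    static void runShuffle(Plugin plugin, Object context, Consumer<Object> reducer)
            throws ClassNotFoundException, IOException {
        plugin.init(context);
        try {
            if (!plugin.fetchOutputs()) {
                // Surface the merge failure, if any, as the cause.
                throw new IOException("Fetching map outputs failed", plugin.getMergeThrowable());
            }
            reducer.accept(plugin.createKVIterator());
        } finally {
            plugin.close(); // always release resources once init has succeeded
        }
    }

    /** Test double that records the order in which its methods are called. */
    static class RecordingPlugin implements Plugin {
        final List<String> calls = new ArrayList<>();
        private final boolean fetchSucceeds;

        RecordingPlugin(boolean fetchSucceeds) { this.fetchSucceeds = fetchSucceeds; }

        public void init(Object context) { calls.add("init"); }
        public boolean fetchOutputs() { calls.add("fetchOutputs"); return fetchSucceeds; }
        public Throwable getMergeThrowable() { calls.add("getMergeThrowable"); return null; }
        public Object createKVIterator() { calls.add("createKVIterator"); return new Object(); }
        public void close() { calls.add("close"); }
    }

    public static void main(String[] args) throws Exception {
        RecordingPlugin plugin = new RecordingPlugin(true);
        runShuffle(plugin, null, iterator -> { /* the reduce would consume the iterator here */ });
        System.out.println(plugin.calls);
    }
}
```

Closing in a finally block keeps cleanup unconditional; the iterator is consumed inside the try so that close runs only after the reducer has finished with it.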


Copyright © 2009 The Apache Software Foundation