org.apache.hadoop.mapreduce.lib.input
Class KeyValueLineRecordReader

java.lang.Object
  extended by org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
      extended by org.apache.hadoop.mapreduce.lib.input.KeyValueLineRecordReader
All Implemented Interfaces:
java.io.Closeable

public class KeyValueLineRecordReader
extends RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

This class treats each line of the input as a key/value pair separated by a separator character. The separator can be set in the job configuration under the attribute name key.value.separator.in.input.line. The default separator is the tab character ('\t').
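As a rough illustration of the splitting rule described above, the following standalone sketch splits a line on a separator without using any Hadoop classes. The class name KeyValueSplitSketch and its split method are hypothetical. Two behaviors are assumptions rather than statements from this page: that the split happens at the first occurrence of the separator, and that a line with no separator becomes the key with an empty value.

```java
class KeyValueSplitSketch {
    // Hypothetical helper, not part of Hadoop: returns {key, value}.
    static String[] split(String line, char sep) {
        int pos = line.indexOf(sep);
        if (pos < 0) {
            // Assumption: with no separator, the whole line is the key
            // and the value is empty.
            return new String[] { line, "" };
        }
        // Assumption: only the first separator splits the line; any later
        // separators stay inside the value.
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        String[] kv = split("user42\tclicked\thome", '\t');
        System.out.println("key=[" + kv[0] + "] value=[" + kv[1] + "]");
    }
}
```

Under these assumptions the key is everything before the first tab, and the remaining tab is kept as part of the value.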


Constructor Summary
KeyValueLineRecordReader(org.apache.hadoop.conf.Configuration conf)
           
 
Method Summary
 void close()
          Close the record reader.
static int findSeparator(byte[] utf, int start, int length, byte sep)
           
 org.apache.hadoop.io.Text getCurrentKey()
          Get the current key.
 org.apache.hadoop.io.Text getCurrentValue()
          Get the current value.
 java.lang.Class<?> getKeyClass()
           
 float getProgress()
          The current progress of the record reader through its data.
 void initialize(InputSplit genericSplit, TaskAttemptContext context)
          Called once at initialization.
 boolean nextKeyValue()
          Read the next key/value pair from a line of input.
static void setKeyValue(org.apache.hadoop.io.Text key, org.apache.hadoop.io.Text value, byte[] line, int lineLen, int pos)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KeyValueLineRecordReader

public KeyValueLineRecordReader(org.apache.hadoop.conf.Configuration conf)
                         throws java.io.IOException
Throws:
java.io.IOException

Method Detail

getKeyClass

public java.lang.Class<?> getKeyClass()

initialize

public void initialize(InputSplit genericSplit,
                       TaskAttemptContext context)
                throws java.io.IOException
Description copied from class: RecordReader
Called once at initialization.

Specified by:
initialize in class RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Parameters:
genericSplit - the split that defines the range of records to read
context - the information about the task
Throws:
java.io.IOException

findSeparator

public static int findSeparator(byte[] utf,
                                int start,
                                int length,
                                byte sep)
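
This page gives no description for findSeparator. The sketch below shows one plausible reading of the signature and is not taken from the Hadoop source: it assumes the method scans the byte range from start (inclusive) to start + length (exclusive) and returns the absolute index of the first byte equal to sep, or -1 if none is found. The class name FindSeparatorSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;

class FindSeparatorSketch {
    // Assumed behavior: linear scan of utf[start .. start + length).
    // Returns the absolute index of the first match, or -1 if absent.
    static int findSeparator(byte[] utf, int start, int length, byte sep) {
        for (int i = start; i < start + length; i++) {
            if (utf[i] == sep) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] line = "k1\tv1".getBytes(StandardCharsets.UTF_8);
        // The tab is the third byte, so this prints 2.
        System.out.println(findSeparator(line, 0, line.length, (byte) '\t'));
    }
}
```

Comparing single bytes is only safe when the separator encodes to one byte in UTF-8, which matches the byte-typed sep parameter.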

setKeyValue

public static void setKeyValue(org.apache.hadoop.io.Text key,
                               org.apache.hadoop.io.Text value,
                               byte[] line,
                               int lineLen,
                               int pos)
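
This page gives no description for setKeyValue either. The following sketch is an assumption about what the parameters mean, not the library's implementation: line holds the raw bytes, lineLen is the number of valid bytes, and pos is the separator index, with -1 meaning no separator was found. StringBuilder stands in for org.apache.hadoop.io.Text so that the example runs without Hadoop on the classpath, and the class name SetKeyValueSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;

class SetKeyValueSketch {
    // StringBuilder is a stand-in for Text: a reusable, mutable holder.
    static void setKeyValue(StringBuilder key, StringBuilder value,
                            byte[] line, int lineLen, int pos) {
        key.setLength(0);
        value.setLength(0);
        if (pos == -1) {
            // Assumption: no separator, so the whole line is the key.
            key.append(new String(line, 0, lineLen, StandardCharsets.UTF_8));
        } else {
            // Bytes before the separator form the key; bytes after it form
            // the value. The separator itself is dropped.
            key.append(new String(line, 0, pos, StandardCharsets.UTF_8));
            value.append(new String(line, pos + 1, lineLen - pos - 1,
                                    StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) {
        byte[] line = "id7,alice".getBytes(StandardCharsets.UTF_8);
        StringBuilder key = new StringBuilder();
        StringBuilder value = new StringBuilder();
        setKeyValue(key, value, line, line.length, 3); // comma is at index 3
        System.out.println(key + " -> " + value);      // prints id7 -> alice
    }
}
```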

nextKeyValue

public boolean nextKeyValue()
                     throws java.io.IOException
Read the next key/value pair from a line of input.

Specified by:
nextKeyValue in class RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Returns:
true if a key/value pair was read
Throws:
java.io.IOException

getCurrentKey

public org.apache.hadoop.io.Text getCurrentKey()
Description copied from class: RecordReader
Get the current key.

Specified by:
getCurrentKey in class RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Returns:
the current key or null if there is no current key

getCurrentValue

public org.apache.hadoop.io.Text getCurrentValue()
Description copied from class: RecordReader
Get the current value.

Specified by:
getCurrentValue in class RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Returns:
the object that was read

getProgress

public float getProgress()
Description copied from class: RecordReader
The current progress of the record reader through its data.

Specified by:
getProgress in class RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Returns:
a number between 0.0 and 1.0 that is the fraction of the data read
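
One way a reader could compute such a fraction is sketched below. This is an assumption for illustration only: the page does not say how this class derives its progress, and the names ProgressSketch, start, end and pos (byte offsets of the split boundaries and the current read position) are hypothetical.

```java
class ProgressSketch {
    // Hypothetical: start and end bound the split in bytes, and pos is the
    // current read offset. The result is clamped to the range [0.0, 1.0].
    static float progress(long start, long end, long pos) {
        if (end == start) {
            return 0.0f; // empty split: avoid dividing by zero
        }
        float fraction = (pos - start) / (float) (end - start);
        return Math.max(0.0f, Math.min(1.0f, fraction));
    }

    public static void main(String[] args) {
        // 50 of 200 bytes consumed, so this prints 0.25.
        System.out.println(progress(100, 300, 150));
    }
}
```

Clamping matters in a sketch like this because a line-oriented reader may read slightly past the end of its split to finish the last line.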

close

public void close()
           throws java.io.IOException
Description copied from class: RecordReader
Close the record reader.

Specified by:
close in interface java.io.Closeable
Specified by:
close in class RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Throws:
java.io.IOException


Copyright © 2009 The Apache Software Foundation