Class StreamingStep
- java.lang.Object
-
- com.amazonaws.services.elasticmapreduce.util.StreamingStep
-
public class StreamingStep extends Object
Class that makes it easy to define Hadoop Streaming steps.See also: Hadoop Streaming
AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey); AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(credentials); HadoopJarStepConfig config = new StreamingStep() .withInputs("s3://elasticmapreduce/samples/wordcount/input") .withOutput("s3://my-bucket/output/") .withMapper("s3://elasticmapreduce/samples/wordcount/wordSplitter.py") .withReducer("aggregate") .toHadoopJarStepConfig(); StepConfig wordCount = new StepConfig() .withName("Word Count") .withActionOnFailure("TERMINATE_JOB_FLOW") .withHadoopJarStep(config); RunJobFlowRequest request = new RunJobFlowRequest() .withName("Word Count") .withSteps(wordCount) .withLogUri("s3://log-bucket/") .withInstances(new JobFlowInstancesConfig() .withEc2KeyName("keypairt") .withHadoopVersion("0.20") .withInstanceCount(5) .withKeepJobFlowAliveWhenNoSteps(true) .withMasterInstanceType("m1.small") .withSlaveInstanceType("m1.small")); RunJobFlowResult result = emr.runJobFlow(request);
-
-
Constructor Summary
Constructors Constructor Description StreamingStep()
Creates a new default StreamingStep.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description Map<String,String>
getHadoopConfig()
Get the Hadoop config overrides (-D values).List<String>
getInputs()
Get list of step input paths.String
getMapper()
Get the mapper.String
getOutput()
Get output path.String
getReducer()
Get the reducervoid
setHadoopConfig(Map<String,String> hadoopConfig)
Set the Hadoop config overrides (-D values).void
setInputs(Collection<String> inputs)
Set the list of step input paths.void
setMapper(String mapper)
Set the mapper.void
setOutput(String output)
Set the output path for this step.void
setReducer(String reducer)
Set the reducerHadoopJarStepConfig
toHadoopJarStepConfig()
Creates the final HadoopJarStepConfig once you are done configuring the step.StreamingStep
withHadoopConfig(String key, String value)
Add a Hadoop config override (-D value).StreamingStep
withInputs(String... inputs)
Add more input paths to this step.StreamingStep
withMapper(String mapper)
Set the mapperStreamingStep
withOutput(String output)
Set the output path for this step.StreamingStep
withReducer(String reducer)
Set the reducer
-
-
-
Method Detail
-
getInputs
public List<String> getInputs()
Get list of step input paths.- Returns:
- List of step inputs
-
setInputs
public void setInputs(Collection<String> inputs)
Set the list of step input paths.- Parameters:
inputs
- List of step inputs.
-
withInputs
public StreamingStep withInputs(String... inputs)
Add more input paths to this step.- Parameters:
inputs
- A list of inputs to this step.- Returns:
- A reference to this updated object so that method calls can be chained together.
-
getOutput
public String getOutput()
Get output path.- Returns:
- Output path.
-
setOutput
public void setOutput(String output)
Set the output path for this step.- Parameters:
output
- Output path.
-
withOutput
public StreamingStep withOutput(String output)
Set the output path for this step.- Parameters:
output
- Output path- Returns:
- A reference to this updated object so that method calls can be chained together.
-
getMapper
public String getMapper()
Get the mapper.- Returns:
- Mapper.
-
setMapper
public void setMapper(String mapper)
Set the mapper.- Parameters:
mapper
- Mapper
-
withMapper
public StreamingStep withMapper(String mapper)
Set the mapper- Parameters:
mapper
- Mapper- Returns:
- A reference to this updated object so that method calls can be chained together.
-
getReducer
public String getReducer()
Get the reducer- Returns:
- Reducer
-
setReducer
public void setReducer(String reducer)
Set the reducer- Parameters:
reducer
- Reducer
-
withReducer
public StreamingStep withReducer(String reducer)
Set the reducer- Parameters:
reducer
- Reducer- Returns:
- A reference to this updated object so that method calls can be chained together.
-
getHadoopConfig
public Map<String,String> getHadoopConfig()
Get the Hadoop config overrides (-D values).- Returns:
- Hadoop config.
-
setHadoopConfig
public void setHadoopConfig(Map<String,String> hadoopConfig)
Set the Hadoop config overrides (-D values).- Parameters:
hadoopConfig
- Hadoop config.
-
withHadoopConfig
public StreamingStep withHadoopConfig(String key, String value)
Add a Hadoop config override (-D value).- Parameters:
key
- Hadoop configuration key.value
- Configuration value.- Returns:
- A reference to this updated object so that method calls can be chained together.
-
toHadoopJarStepConfig
public HadoopJarStepConfig toHadoopJarStepConfig()
Creates the final HadoopJarStepConfig once you are done configuring the step. You can use this as you would any other HadoopJarStepConfig.- Returns:
- HadoopJarStepConfig representing this streaming step.
-
-