Class StreamingStep
java.lang.Object
com.amazonaws.services.elasticmapreduce.util.StreamingStep
Class that makes it easy to define Hadoop Streaming steps.
See also: Hadoop Streaming
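// Example: define a word-count streaming step and run it on a new job flow.
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
import com.amazonaws.services.elasticmapreduce.model.*;
import com.amazonaws.services.elasticmapreduce.util.StreamingStep;

// accessKey and secretKey are placeholders for your own credentials.
AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(credentials);

// Describe the streaming step, then convert it to a HadoopJarStepConfig.
HadoopJarStepConfig config = new StreamingStep()
    .withInputs("s3://elasticmapreduce/samples/wordcount/input")
    .withOutput("s3://my-bucket/output/")
    .withMapper("s3://elasticmapreduce/samples/wordcount/wordSplitter.py")
    .withReducer("aggregate")
    .toHadoopJarStepConfig();

StepConfig wordCount = new StepConfig()
    .withName("Word Count")
    .withActionOnFailure("TERMINATE_JOB_FLOW")
    .withHadoopJarStep(config);

RunJobFlowRequest request = new RunJobFlowRequest()
    .withName("Word Count")
    .withSteps(wordCount)
    .withLogUri("s3://log-bucket/")
    .withInstances(new JobFlowInstancesConfig()
        .withEc2KeyName("keypair")
        .withHadoopVersion("0.20")
        .withInstanceCount(5)
        .withKeepJobFlowAliveWhenNoSteps(true)
        .withMasterInstanceType("m1.small")
        .withSlaveInstanceType("m1.small"));

RunJobFlowResult result = emr.runJobFlow(request);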
-
Constructor Summary
Constructor | Description
StreamingStep() | Creates a new default StreamingStep.
Method Summary
Modifier and Type | Method | Description
Map<String,String> | getHadoopConfig() | Get the Hadoop config overrides (-D values).
List<String> | getInputs() | Get list of step input paths.
String | getMapper() | Get the mapper.
String | getOutput() | Get output path.
String | getReducer() | Get the reducer.
void | setHadoopConfig(Map<String, String> hadoopConfig) | Set the Hadoop config overrides (-D values).
void | setInputs(Collection<String> inputs) | Set the list of step input paths.
void | setMapper(String mapper) | Set the mapper.
void | setOutput(String output) | Set the output path for this step.
void | setReducer(String reducer) | Set the reducer.
HadoopJarStepConfig | toHadoopJarStepConfig() | Creates the final HadoopJarStepConfig once you are done configuring the step.
StreamingStep | withHadoopConfig(String key, String value) | Add a Hadoop config override (-D value).
StreamingStep | withInputs(String... inputs) | Add more input paths to this step.
StreamingStep | withMapper(String mapper) | Set the mapper.
StreamingStep | withOutput(String output) | Set the output path for this step.
StreamingStep | withReducer(String reducer) | Set the reducer.
-
Constructor Details
-
StreamingStep
public StreamingStep()
Creates a new default StreamingStep.
-
Method Details
-
getInputs
Get list of step input paths.
Returns:
List of step inputs.
-
setInputs
Set the list of step input paths.
Parameters:
inputs - List of step inputs.
-
withInputs
Add more input paths to this step.
Parameters:
inputs - A list of inputs to this step.
Returns:
A reference to this updated object so that method calls can be chained together.
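The difference between setInputs (set the list) and withInputs (add more paths, return the object for chaining) can be sketched with a small stand-in class. InputPathsSketch and the bucket paths below are hypothetical and exist only for illustration; this is not the SDK's StreamingStep, and it assumes only the behavior described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for illustration only; not the SDK's StreamingStep.
final class InputPathsSketch {
    private List<String> inputs = new ArrayList<>();

    // Setter style: replaces whatever paths were configured before.
    void setInputs(Collection<String> inputs) {
        this.inputs = new ArrayList<>(inputs);
    }

    // Fluent style: appends the given paths and returns this for chaining.
    InputPathsSketch withInputs(String... inputs) {
        this.inputs.addAll(Arrays.asList(inputs));
        return this;
    }

    List<String> getInputs() {
        return inputs;
    }

    public static void main(String[] args) {
        // Paths are placeholders; chained calls accumulate three inputs.
        InputPathsSketch step = new InputPathsSketch()
            .withInputs("s3://my-bucket/input/a")
            .withInputs("s3://my-bucket/input/b", "s3://my-bucket/input/c");
        System.out.println(step.getInputs());

        // The setter discards the accumulated paths and keeps only the new list.
        step.setInputs(Arrays.asList("s3://my-bucket/input/d"));
        System.out.println(step.getInputs());
    }
}
```

Because withInputs takes varargs, one call can add several paths, and repeated calls keep appending rather than overwriting.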
-
getOutput
Get output path.
Returns:
Output path.
-
setOutput
Set the output path for this step.
Parameters:
output - Output path.
-
withOutput
Set the output path for this step.
Parameters:
output - Output path.
Returns:
A reference to this updated object so that method calls can be chained together.
-
getMapper
Get the mapper.
Returns:
Mapper.
-
setMapper
Set the mapper.
Parameters:
mapper - Mapper.
-
withMapper
Set the mapper.
Parameters:
mapper - Mapper.
Returns:
A reference to this updated object so that method calls can be chained together.
-
getReducer
Get the reducer.
Returns:
Reducer.
-
setReducer
Set the reducer.
Parameters:
reducer - Reducer.
-
withReducer
Set the reducer.
Parameters:
reducer - Reducer.
Returns:
A reference to this updated object so that method calls can be chained together.
-
getHadoopConfig
Get the Hadoop config overrides (-D values).
Returns:
Hadoop config.
-
setHadoopConfig
Set the Hadoop config overrides (-D values).
Parameters:
hadoopConfig - Hadoop config.
-
withHadoopConfig
Add a Hadoop config override (-D value).
Parameters:
key - Hadoop configuration key.
value - Configuration value.
Returns:
A reference to this updated object so that method calls can be chained together.
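To make "-D values" concrete: on the Hadoop command line, a generic option of the form -D property=value overrides one configuration property. The sketch below collects key/value overrides and renders them in that form. HadoopConfigSketch and toGenericOptions are hypothetical names used only for illustration; how the SDK actually encodes the overrides in the step arguments is not stated in this reference, and the property names are example values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the SDK's StreamingStep.
final class HadoopConfigSketch {
    // LinkedHashMap keeps overrides in insertion order for predictable output.
    private Map<String, String> hadoopConfig = new LinkedHashMap<>();

    // Setter style: replaces the whole override map.
    void setHadoopConfig(Map<String, String> hadoopConfig) {
        this.hadoopConfig = new LinkedHashMap<>(hadoopConfig);
    }

    // Fluent style: adds one override and returns this for chaining.
    HadoopConfigSketch withHadoopConfig(String key, String value) {
        hadoopConfig.put(key, value);
        return this;
    }

    Map<String, String> getHadoopConfig() {
        return hadoopConfig;
    }

    // Assumption: render each override as the generic option pair "-D", "key=value".
    List<String> toGenericOptions() {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : hadoopConfig.entrySet()) {
            args.add("-D");
            args.add(entry.getKey() + "=" + entry.getValue());
        }
        return args;
    }

    public static void main(String[] args) {
        HadoopConfigSketch step = new HadoopConfigSketch()
            .withHadoopConfig("mapred.reduce.tasks", "2")
            .withHadoopConfig("mapred.task.timeout", "600000");
        System.out.println(String.join(" ", step.toGenericOptions()));
    }
}
```

Adding the same key twice keeps only the latest value, since the overrides are held in a map.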
-
toHadoopJarStepConfig
Creates the final HadoopJarStepConfig once you are done configuring the step. You can use this as you would any other HadoopJarStepConfig.
Returns:
HadoopJarStepConfig representing this streaming step.
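As a rough picture of what "creating the final step config" involves, the sketch below turns inputs, output, mapper, and reducer into a Hadoop Streaming argument list using the standard -input, -output, -mapper, and -reducer options. StreamingArgsSketch and buildArgs are hypothetical; the argument order and the up-front validation are assumptions of this sketch, not a description of what the SDK emits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not the SDK's implementation.
final class StreamingArgsSketch {

    // Builds a Hadoop Streaming argument list from the configured values.
    static List<String> buildArgs(List<String> inputs, String output,
                                  String mapper, String reducer) {
        // Assumption: this sketch rejects incomplete steps before building anything.
        if (inputs.isEmpty() || output == null || mapper == null || reducer == null) {
            throw new IllegalStateException(
                "inputs, output, mapper and reducer are all required");
        }
        List<String> args = new ArrayList<>();
        for (String input : inputs) {
            // Each input path gets its own -input option.
            args.add("-input");
            args.add(input);
        }
        args.add("-output");
        args.add(output);
        args.add("-mapper");
        args.add(mapper);
        args.add("-reducer");
        args.add(reducer);
        return args;
    }

    public static void main(String[] args) {
        // Values taken from the word-count example at the top of this page.
        List<String> streamingArgs = buildArgs(
            Arrays.asList("s3://elasticmapreduce/samples/wordcount/input"),
            "s3://my-bucket/output/",
            "s3://elasticmapreduce/samples/wordcount/wordSplitter.py",
            "aggregate");
        System.out.println(String.join(" ", streamingArgs));
    }
}
```

Keeping the conversion in one place means the step can be configured in any order and only checked once, at the point where the final argument list is produced.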
-