Showing content from https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/mapred/JobClient.html below:

JobClient (Hadoop 1.2.1 API)

org.apache.hadoop.mapred
Class JobClient
java.lang.Object
  org.apache.hadoop.conf.Configured
      org.apache.hadoop.mapred.JobClient
All Implemented Interfaces:
Configurable, Tool
public class JobClient
extends Configured
implements Tool

JobClient is the primary interface for the user-job to interact with the JobTracker. JobClient provides facilities to submit jobs, track their progress, access component-tasks' reports and logs, and get the Map-Reduce cluster status information.

The job submission process involves:

  1. Checking the input and output specifications of the job.
  2. Computing the InputSplits for the job.
  3. Setting up the requisite accounting information for the DistributedCache of the job, if necessary.
  4. Copying the job's jar and configuration to the map-reduce system directory on the distributed file-system.
  5. Submitting the job to the JobTracker and optionally monitoring its status.
Normally the user creates the application, describes various facets of the job via JobConf and then uses the JobClient to submit the job and monitor its progress.
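The first step of the submission process can be pictured with a small local-filesystem analogy. This is only a sketch under stated assumptions, not Hadoop's code: it assumes file-based input and output, where a typical check is that the input exists and the output location does not exist yet. The class name SpecCheck and its method are hypothetical, and java.io.File stands in for paths on the distributed file-system.

```java
import java.io.File;

/** Hypothetical helper: a local analogy of checking input/output specifications. */
final class SpecCheck {
    private SpecCheck() {}

    /**
     * Rejects a job whose input is missing or whose output location already exists.
     * Assumption: this mirrors the usual rule for file-based jobs, not every format.
     */
    static void check(File input, File output) {
        if (!input.exists()) {
            throw new IllegalArgumentException("Input path does not exist: " + input);
        }
        if (output.exists()) {
            throw new IllegalStateException("Output path already exists: " + output);
        }
    }
}
```

Failing early like this avoids copying the job's jar and configuration for a job that could never produce output.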

Here is an example of how to use JobClient:

     // Create a new JobConf
     JobConf job = new JobConf(new Configuration(), MyJob.class);
     
     // Specify various job-specific parameters     
     job.setJobName("myjob");
     
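     // Input and output paths are set through the file-based formats
     FileInputFormat.setInputPaths(job, new Path("in"));
     FileOutputFormat.setOutputPath(job, new Path("out"));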
     
     job.setMapperClass(MyJob.MyMapper.class);
     job.setReducerClass(MyJob.MyReducer.class);

     // Submit the job, then poll for progress until the job is complete
     JobClient.runJob(job);
 
Job Control

At times clients chain map-reduce jobs to accomplish complex tasks that cannot be done in a single map-reduce job. This is fairly easy, since the output of a job typically goes to the distributed file-system and can be used as the input for the next job.

However, this also means that the onus of ensuring jobs are complete (success/failure) lies squarely on the clients. In such situations the various job-control options are:

  1. runJob(JobConf) : submits the job and returns only after the job has completed.
  2. submitJob(JobConf) : only submits the job; the client then polls the returned RunningJob handle to query status and make scheduling decisions.
  3. JobConf.setJobEndNotificationURI(String) : sets up a notification on job completion, thus avoiding polling.
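The second option, submitting and then polling the handle, might look roughly like the sketch below. To keep it runnable without Hadoop on the classpath, it uses a hypothetical JobHandle interface as a stand-in for the two RunningJob status calls it needs (isComplete() and isSuccessful()). The JobPoller class, the poll interval, and the poll limit are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for the RunningJob status calls used below. */
interface JobHandle {
    boolean isComplete() throws IOException;
    boolean isSuccessful() throws IOException;
}

/** Hypothetical helper that polls a job handle until the job finishes. */
final class JobPoller {
    private JobPoller() {}

    /**
     * Checks the handle up to maxPolls times, sleeping pollMillis between checks.
     * Returns true if the job completed successfully, false if it completed
     * unsuccessfully, and throws IllegalStateException if it never completed.
     */
    static boolean waitFor(JobHandle job, long pollMillis, int maxPolls) {
        try {
            for (int attempt = 0; attempt < maxPolls; attempt++) {
                if (job.isComplete()) {
                    return job.isSuccessful();
                }
                Thread.sleep(pollMillis);
            }
        } catch (IOException e) {
            // Status queries can fail; surface the failure unchecked.
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while polling", e);
        }
        throw new IllegalStateException("job did not complete after " + maxPolls + " polls");
    }
}
```

When chaining jobs, a client would submit the next job only if waitFor returns true for the previous one. With Hadoop available, the RunningJob returned by submitJob(JobConf) could be adapted to JobHandle by delegating both methods.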
See Also:
JobConf, ClusterStatus, Tool, DistributedCache
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
MAPREDUCE_CLIENT_RETRY_POLICY_ENABLED_KEY
public static final String MAPREDUCE_CLIENT_RETRY_POLICY_ENABLED_KEY
See Also:
Constant Field Values
MAPREDUCE_CLIENT_RETRY_POLICY_ENABLED_DEFAULT
public static final boolean MAPREDUCE_CLIENT_RETRY_POLICY_ENABLED_DEFAULT
See Also:
Constant Field Values
MAPREDUCE_CLIENT_RETRY_POLICY_SPEC_KEY
public static final String MAPREDUCE_CLIENT_RETRY_POLICY_SPEC_KEY
See Also:
Constant Field Values
MAPREDUCE_CLIENT_RETRY_POLICY_SPEC_DEFAULT
public static final String MAPREDUCE_CLIENT_RETRY_POLICY_SPEC_DEFAULT
See Also:
Constant Field Values
HEARTBEAT_INTERVAL_MIN
public static final int HEARTBEAT_INTERVAL_MIN
See Also:
Constant Field Values
COUNTER_UPDATE_INTERVAL
public static final long COUNTER_UPDATE_INTERVAL
See Also:
Constant Field Values
DEFAULT_DISK_HEALTH_CHECK_INTERVAL
public static final long DEFAULT_DISK_HEALTH_CHECK_INTERVAL
How often TaskTracker needs to check the health of its disks, if not configured using mapred.disk.healthChecker.interval
See Also:
Constant Field Values
SUCCESS
public static final int SUCCESS
See Also:
Constant Field Values
FILE_NOT_FOUND
public static final int FILE_NOT_FOUND
See Also:
Constant Field Values
MAP_OUTPUT_LENGTH
public static final String MAP_OUTPUT_LENGTH
The custom http header used for the map output length.
See Also:
Constant Field Values
RAW_MAP_OUTPUT_LENGTH
public static final String RAW_MAP_OUTPUT_LENGTH
The custom http header used for the "raw" map output length.
See Also:
Constant Field Values
FROM_MAP_TASK
public static final String FROM_MAP_TASK
The map task from which the map output data is being transferred
See Also:
Constant Field Values
FOR_REDUCE_TASK
public static final String FOR_REDUCE_TASK
The reduce task number for which this map output is being transferred
See Also:
Constant Field Values
WORKDIR
public static final String WORKDIR
See Also:
Constant Field Values
JobClient
public JobClient()
Create a job client.
JobClient
public JobClient(JobConf conf)
          throws IOException
Build a job client with the given JobConf, and connect to the default JobTracker.
Parameters:
conf - the job configuration.
Throws:
IOException
JobClient
public JobClient(InetSocketAddress jobTrackAddr,
                 Configuration conf)
          throws IOException
Build a job client and connect to the indicated job tracker.
Parameters:
jobTrackAddr - the job tracker to connect to.
conf - configuration.
Throws:
IOException
init
public void init(JobConf conf)
          throws IOException
Connect to the default JobTracker.
Parameters:
conf - the job configuration.
Throws:
IOException
close
public void close()
           throws IOException
Close the JobClient.
Throws:
IOException
getFs
public FileSystem getFs()
                 throws IOException
Get a filesystem handle. We need this to prepare jobs for submission to the MapReduce system.
Returns:
the filesystem handle.
Throws:
IOException
submitJob
public RunningJob submitJob(String jobFile)
                     throws FileNotFoundException,
                            InvalidJobConfException,
                            IOException
Submit a job to the MR system. This returns a handle to the RunningJob which can be used to track the running-job.
Parameters:
jobFile - the job configuration.
Returns:
a handle to the RunningJob which can be used to track the running-job.
Throws:
FileNotFoundException
InvalidJobConfException
IOException
submitJob
public RunningJob submitJob(JobConf job)
                     throws FileNotFoundException,
                            IOException
Submit a job to the MR system. This returns a handle to the RunningJob which can be used to track the running-job.
Parameters:
job - the job configuration.
Returns:
a handle to the RunningJob which can be used to track the running-job.
Throws:
FileNotFoundException
IOException
submitJobInternal
public RunningJob submitJobInternal(JobConf job)
                             throws FileNotFoundException,
                                    ClassNotFoundException,
                                    InterruptedException,
                                    IOException
Internal method for submitting jobs to the system.
Parameters:
job - the configuration to submit
Returns:
a proxy object for the running job
Throws:
FileNotFoundException
ClassNotFoundException
InterruptedException
IOException
isJobDirValid
public static boolean isJobDirValid(Path jobDirPath,
                                    FileSystem fs)
                             throws IOException
Checks if the job directory is clean and has all the required components for (re) starting the job
Throws:
IOException
getJob
public RunningJob getJob(JobID jobid)
                  throws IOException
Get a RunningJob object to track an ongoing job. Returns null if the id does not correspond to any known job.
Parameters:
jobid - the jobid of the job.
Returns:
the RunningJob handle to track the job, null if the jobid doesn't correspond to any known job.
Throws:
IOException
getJob
@Deprecated
public RunningJob getJob(String jobid)
                  throws IOException
Deprecated. Applications should rather use getJob(JobID).
Throws:
IOException
getMapTaskReports
public TaskReport[] getMapTaskReports(JobID jobId)
                               throws IOException
Get the information of the current state of the map tasks of a job.
Parameters:
jobId - the job to query.
Returns:
the list of all of the map tips.
Throws:
IOException
getMapTaskReports
@Deprecated
public TaskReport[] getMapTaskReports(String jobId)
                               throws IOException
Deprecated. Applications should rather use getMapTaskReports(JobID)
Throws:
IOException
getReduceTaskReports
public TaskReport[] getReduceTaskReports(JobID jobId)
                                  throws IOException
Get the information of the current state of the reduce tasks of a job.
Parameters:
jobId - the job to query.
Returns:
the list of all of the reduce tips.
Throws:
IOException
getCleanupTaskReports
public TaskReport[] getCleanupTaskReports(JobID jobId)
                                   throws IOException
Get the information of the current state of the cleanup tasks of a job.
Parameters:
jobId - the job to query.
Returns:
the list of all of the cleanup tips.
Throws:
IOException
getSetupTaskReports
public TaskReport[] getSetupTaskReports(JobID jobId)
                                 throws IOException
Get the information of the current state of the setup tasks of a job.
Parameters:
jobId - the job to query.
Returns:
the list of all of the setup tips.
Throws:
IOException
getReduceTaskReports
@Deprecated
public TaskReport[] getReduceTaskReports(String jobId)
                                  throws IOException
Deprecated. Applications should rather use getReduceTaskReports(JobID)
Throws:
IOException
displayTasks
public void displayTasks(JobID jobId,
                         String type,
                         String state)
                  throws IOException
Display the information about a job's tasks, of a particular type and in a particular state
Parameters:
jobId - the ID of the job
type - the type of the task (map/reduce/setup/cleanup)
state - the state of the task (pending/running/completed/failed/killed)
Throws:
IOException
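Because type and state are plain strings, a caller may want to check them against the documented values before calling displayTasks. The sketch below is one way to do that. The TaskQuery class is hypothetical, and whether displayTasks itself rejects other values is not stated here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical pre-check for the type and state arguments of displayTasks. */
final class TaskQuery {
    // Values taken from the parameter descriptions above.
    static final Set<String> TYPES =
        new HashSet<>(Arrays.asList("map", "reduce", "setup", "cleanup"));
    static final Set<String> STATES =
        new HashSet<>(Arrays.asList("pending", "running", "completed", "failed", "killed"));

    private TaskQuery() {}

    /** Throws IllegalArgumentException if either argument is not a documented value. */
    static void validate(String type, String state) {
        if (!TYPES.contains(type)) {
            throw new IllegalArgumentException("Unknown task type: " + type);
        }
        if (!STATES.contains(state)) {
            throw new IllegalArgumentException("Unknown task state: " + state);
        }
    }
}
```

A caller would run TaskQuery.validate(type, state) first and pass the same strings on to displayTasks only if no exception is thrown.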
getClusterStatus
public ClusterStatus getClusterStatus()
                               throws IOException
Get status information about the Map-Reduce cluster.
Returns:
the status information about the Map-Reduce cluster as an object of ClusterStatus.
Throws:
IOException
getClusterStatus
public ClusterStatus getClusterStatus(boolean detailed)
                               throws IOException
Get status information about the Map-Reduce cluster.
Parameters:
detailed - if true then get a detailed status including the tracker names and memory usage of the JobTracker
Returns:
the status information about the Map-Reduce cluster as an object of ClusterStatus.
Throws:
IOException
getStagingAreaDir
public Path getStagingAreaDir()
                       throws IOException
Grab the jobtracker's view of the staging directory path where job-specific files will be placed.
Returns:
the staging directory where job-specific files are to be placed.
Throws:
IOException
jobsToComplete
public JobStatus[] jobsToComplete()
                           throws IOException
Get the jobs that are not completed and not failed.
Returns:
array of JobStatus for the running/to-be-run jobs.
Throws:
IOException
getAllJobs
public JobStatus[] getAllJobs()
                       throws IOException
Get the jobs that are submitted.
Returns:
array of JobStatus for the submitted jobs.
Throws:
IOException
runJob
public static RunningJob runJob(JobConf job)
                         throws IOException
Utility that submits a job, then polls for progress until the job is complete.
Parameters:
job - the job configuration.
Throws:
IOException - if the job fails
monitorAndPrintJob
public boolean monitorAndPrintJob(JobConf conf,
                                  RunningJob job)
                           throws IOException,
                                  InterruptedException
Monitor a job and print status in real-time as progress is made and tasks fail.
Parameters:
conf - the job's configuration
job - the job to track
Returns:
true if the job succeeded
Throws:
IOException - if communication to the JobTracker fails
InterruptedException
setTaskOutputFilter
@Deprecated
public void setTaskOutputFilter(JobClient.TaskStatusFilter newValue)
Deprecated. 
Sets the output filter for tasks. Only those tasks whose output matches the filter are printed.
Parameters:
newValue - task filter.
getTaskOutputFilter
public static JobClient.TaskStatusFilter getTaskOutputFilter(JobConf job)
Get the task output filter out of the JobConf.
Parameters:
job - the JobConf to examine.
Returns:
the filter level.
setTaskOutputFilter
public static void setTaskOutputFilter(JobConf job,
                                       JobClient.TaskStatusFilter newValue)
Modify the JobConf to set the task output filter.
Parameters:
job - the JobConf to modify.
newValue - the value to set.
getTaskOutputFilter
@Deprecated
public JobClient.TaskStatusFilter getTaskOutputFilter()
Deprecated. 
Returns task output filter.
Returns:
task filter.
run
public int run(String[] argv)
        throws Exception
Description copied from interface: Tool
Execute the command with the given arguments.
Specified by:
run in interface Tool
Parameters:
argv - command specific arguments.
Returns:
exit code.
Throws:
Exception
getDefaultMaps
public int getDefaultMaps()
                   throws IOException
Get status information about the max available Maps in the cluster.
Returns:
the max available Maps in the cluster
Throws:
IOException
getDefaultReduces
public int getDefaultReduces()
                      throws IOException
Get status information about the max available Reduces in the cluster.
Returns:
the max available Reduces in the cluster
Throws:
IOException
getSystemDir
public Path getSystemDir()
Grab the jobtracker system directory path where job-specific files are to be placed.
Returns:
the system directory where job-specific files are to be placed.
getQueues
public JobQueueInfo[] getQueues()
                         throws IOException
Return an array of queue information objects about all the Job Queues configured.
Returns:
Array of JobQueueInfo objects
Throws:
IOException
getJobsFromQueue
public JobStatus[] getJobsFromQueue(String queueName)
                             throws IOException
Gets all the jobs which were added to a particular Job Queue
Parameters:
queueName - name of the Job Queue
Returns:
Array of jobs present in the job queue
Throws:
IOException
getQueueInfo
public JobQueueInfo getQueueInfo(String queueName)
                          throws IOException
Gets the queue information associated with a particular Job Queue
Parameters:
queueName - name of the job queue.
Returns:
Queue information associated with the particular queue.
Throws:
IOException
getQueueAclsForCurrentUser
public QueueAclsInfo[] getQueueAclsForCurrentUser()
                                           throws IOException
Gets the Queue ACLs for the current user
Returns:
array of QueueAclsInfo objects for the current user.
Throws:
IOException
getDelegationToken
public Token<DelegationTokenIdentifier> getDelegationToken(Text renewer)
                                                    throws IOException,
                                                           InterruptedException
Throws:
IOException
InterruptedException
renewDelegationToken
public long renewDelegationToken(Token<DelegationTokenIdentifier> token)
                          throws SecretManager.InvalidToken,
                                 IOException,
                                 InterruptedException
Renew a delegation token
Parameters:
token - the token to renew
Returns:
the new expiration time
Throws:
SecretManager.InvalidToken
IOException
InterruptedException
cancelDelegationToken
public void cancelDelegationToken(Token<DelegationTokenIdentifier> token)
                           throws IOException,
                                  InterruptedException
Cancel a delegation token from the JobTracker
Parameters:
token - the token to cancel
Throws:
IOException
InterruptedException
main
public static void main(String[] argv)
                 throws Exception
Throws:
Exception
Copyright © 2009 The Apache Software Foundation
