In the last blog post, a Hadoop distribution was built to run a YARN job.
$ bin/hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar \
date -u command is executed in the Hadoop cluster by the script above, so we might conclude that there is a dispatcher named
Client in "hadoop-yarn-applications-distributedshell-2.2.0.jar", responsible for deploying a jar to the cluster with parameters, such as the shell command and its args, and for notifying the cluster to execute the shell command.
To see what's in the rabbit hole, let's step into the
Client source code.
This post is full of code snippets. To avoid confusion, all comments added by me begin with
//** instead of
/*. The code can be cloned from the Apache Git repository; the commit id is
The Process Logic
Client is started as a process, so we had better look into the
main method first.
//** org.apache.hadoop.yarn.applications.distributedshell.Client.java L164
There are three steps: first it constructs a
Client instance, then it initializes the instance, and finally it invokes the instance's
run method.
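The three steps can be sketched as follows. This is only a minimal stand-in, not the real Client: the class name ClientSketch, the launch helper, and the exit codes are all assumptions made for illustration, and run merely prints what it would submit.

```java
// Hypothetical stand-in for the distributed shell Client; not the real class.
final class ClientSketch {
    private String shellCommand = "";

    // Step 2: parse the arguments. Returns false when there is nothing to run.
    boolean init(String[] args) {
        if (args.length == 0) {
            return false;
        }
        shellCommand = String.join(" ", args);
        return true;
    }

    // Step 3: stands in for submitting the application to the cluster.
    boolean run() {
        System.out.println("would submit: " + shellCommand);
        return true;
    }

    // Mirrors the order described above: construct, initialize, run.
    // The returned exit codes are assumptions of this sketch.
    static int launch(String[] args) {
        ClientSketch client = new ClientSketch(); // 1. construct
        if (!client.init(args)) {                 // 2. initialize
            return 0;                             // nothing to do
        }
        return client.run() ? 0 : 2;              // 3. run
    }

    public static void main(String[] args) {
        System.exit(launch(args));
    }
}
```

Keeping init separate from the constructor lets main decide whether to run at all after looking at the arguments.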
The Client Instance
There are three constructors in
main method calls the default one with no parameters.
//** org.apache.hadoop.yarn.applications.distributedshell.Client.java L227
The default constructor creates a
YarnConfiguration instance and passes it to another constructor.
//** org.apache.hadoop.yarn.applications.distributedshell.Client.java L194
Then sets "org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster" as
//** org.apache.hadoop.yarn.applications.distributedshell.Client.java L200
YarnClientImpl extends
YarnClient, which extends
AbstractService, which implements
Service, whose main job is to control the service life cycle. Here a
YarnClientImpl is created and initialized.
init method can't be found in
YarnClientImpl or in
YarnClient; the actual
init happens in
//** org.apache.hadoop.service.AbstractService.java L151
It first checks whether the state is already
STATE.INITED. If not, it calls
enterState(STATE.INITED) and then the
serviceInit method. Once the state has successfully transitioned to
notifyListeners() is called.
//** org.apache.hadoop.service.AbstractService.java L415
notifyListeners notifies the service's own listeners and the global listeners of the state change, so that they can react accordingly.
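The init flow described above follows a template-method pattern, which might be sketched like this. It is a simplified illustration under my own assumptions: ServiceSketch, Listener, and CountingService are hypothetical names, and the sketch leaves out the configuration handling, transition validation, and failure handling that a real service base class needs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified base class; not Hadoop's AbstractService.
abstract class ServiceSketch {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    // Callback invoked after a state change (assumed shape).
    interface Listener {
        void stateChanged(ServiceSketch service);
    }

    private State state = State.NOTINITED;
    private final List<Listener> listeners = new ArrayList<>();

    void register(Listener listener) {
        listeners.add(listener);
    }

    State getState() {
        return state;
    }

    // Subclasses put their own initialization here.
    protected abstract void serviceInit();

    // Guard, change state, delegate to the subclass, then notify.
    final synchronized void init() {
        if (state == State.INITED) {
            return; // already initialized: nothing to do
        }
        state = State.INITED;
        serviceInit();
        for (Listener listener : listeners) {
            listener.stateChanged(this);
        }
    }
}

// Tiny subclass that only counts how often serviceInit runs.
final class CountingService extends ServiceSketch {
    int initCalls = 0;

    @Override
    protected void serviceInit() {
        initCalls++;
    }
}
```

Calling init twice on a CountingService runs serviceInit only once, which is why subclasses such as YarnClientImpl only need to supply serviceInit and can inherit init as it is.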
But what is the STATE?
The State Model
Let's go back to the
//** org.apache.hadoop.yarn.client.api.YarnClient.java L55
YarnClientImpl instance is created.
//** org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.java L86
//** org.apache.hadoop.yarn.client.api.YarnClient.java L60
//** org.apache.hadoop.service.AbstractService.java L111
Bingo, that's the state model:
//** org.apache.hadoop.service.ServiceStateModel.java L66
The state model is simply a name-state pair; the name is the class name of the service implementation.
isInState checks the state value.
//** org.apache.hadoop.service.ServiceStateModel.java L84
enterState changes the state value after
//** org.apache.hadoop.service.ServiceStateModel.java L110
The state transition is checked by looking up the statemap with the current state and the proposed state, which returns a boolean indicating whether the transition is valid. If it's invalid, the
//** org.apache.hadoop.service.ServiceStateModel.java L125
Then what's in the statemap?
//** org.apache.hadoop.service.ServiceStateModel.java L35
That's the state model we are looking for. The current state is the row index, the proposed state is the column index, and the value tells whether the current state can be transferred to the proposed state.
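A row-and-column lookup of this kind can be sketched as below. The class StateModelSketch is hypothetical, the table values are written from my recollection of the Hadoop table and should be checked against the source, and IllegalStateException stands in for whatever exception the real code throws on an invalid transition.

```java
// Hypothetical sketch of a table-driven state model.
final class StateModelSketch {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    // Row = current state, column = proposed state.
    // Values are an assumption; verify them against the Hadoop source.
    private static final boolean[][] STATE_MAP = {
        //               NOTINITED INITED STARTED STOPPED
        /* NOTINITED */ { false,   true,  false,  true },
        /* INITED    */ { false,   true,  true,   true },
        /* STARTED   */ { false,   false, true,   true },
        /* STOPPED   */ { false,   false, false,  true },
    };

    private State state = State.NOTINITED;

    // Looks up the table with the current and proposed states.
    static boolean isValidTransition(State current, State proposed) {
        return STATE_MAP[current.ordinal()][proposed.ordinal()];
    }

    boolean isInState(State proposed) {
        return state == proposed;
    }

    // Validates the transition, switches state, and returns the old state.
    synchronized State enterState(State proposed) {
        if (!isValidTransition(state, proposed)) {
            throw new IllegalStateException(
                "Cannot enter " + proposed + " from " + state);
        }
        State old = state;
        state = proposed;
        return old;
    }
}
```

Returning the old state from enterState lets the caller tell a real transition from a repeated one, which matches the way init compares the result with STATE.INITED before doing any work.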
The Command Initialization
Let's go back again, to the point where the
YarnClientImpl is about to be initialized. The
init method of its superclass
AbstractService calls the
serviceInit method and the state is transferred to
YarnClientImpl has implemented the
//** org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.java L96
The address of the ResourceManager,
rmAddress, is assigned from the configuration instance.
Client instance is now created successfully, and its
init method is invoked by
main to initialize the instance.
//** org.apache.hadoop.yarn.applications.distributedshell.Client.java L244
The command-line options are parsed and assigned to instance variables for later use.
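To make the idea concrete, here is a rough sketch of parsing options into instance variables. It uses a hand-rolled "-name value" parser instead of a command-line parsing library, so it is not how the real Client does it. The class OptionsSketch, its field names, and the option names shell_command, shell_args, and num_containers are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a hand-rolled stand-in for a CLI parsing library.
final class OptionsSketch {
    String shellCommand = "";
    String shellArgs = "";
    int numContainers = 1;

    // Collects "-name value" pairs into a map.
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("-") || i + 1 >= args.length) {
                throw new IllegalArgumentException("Bad option: " + args[i]);
            }
            // The next token is always taken as the value, even if it
            // starts with '-', so "-shell_args -u" works.
            opts.put(args[i].substring(1), args[++i]);
        }
        return opts;
    }

    // Copies the parsed options into instance variables for later use.
    boolean init(String[] args) {
        Map<String, String> opts = parse(args);
        if (!opts.containsKey("shell_command")) {
            throw new IllegalArgumentException("No shell command specified");
        }
        shellCommand = opts.get("shell_command");
        shellArgs = opts.getOrDefault("shell_args", "");
        numContainers = Integer.parseInt(opts.getOrDefault("num_containers", "1"));
        return true;
    }
}
```

With the arguments -shell_command date -shell_args -u, this sketch would leave shellCommand as "date" and shellArgs as "-u", ready for a later run step to use.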
Client is ready to