org.griphyn.cPlanner.selector.site.heft
Class Algorithm

java.lang.Object
  extended by org.griphyn.cPlanner.selector.site.heft.Algorithm

public class Algorithm
extends Object

The HEFT-based site selector. The runtime for a job, in seconds, is picked from the pegasus profile key runtime in the transformation catalog entry for the transformation. The data communication cost between jobs scheduled on different sites is assumed to be fixed; if required, the ability to specify this value will later be exposed via properties. The number of processors in a site is picked from the idle-nodes attribute associated with the vanilla jobmanager for the site in the site catalog.

Version:
$Revision: 426 $
Author:
Karan Vahi
See Also:
AVERAGE_BANDWIDTH, RUNTIME_PROFILE_KEY, DEFAULT_NUMBER_OF_FREE_NODES, AVERAGE_DATA_SIZE_BETWEEN_JOBS, org.griphyn.cPlanner.classes.JobManager.IDLE_NODES

Field Summary
static float AVERAGE_BANDWIDTH
          The average bandwidth between the sites.
static float AVERAGE_DATA_SIZE_BETWEEN_JOBS
          The average amount of data transferred between two jobs in the workflow.
static int DEFAULT_NUMBER_OF_FREE_NODES
          The default number of nodes that are associated with a site if not found in the site catalog.
private  float mAverageCommunicationCost
          The average communication cost between nodes.
static long MAXIMUM_FINISH_TIME
          The maximum finish time possible for a job.
private  String mLabel
          The label of the workflow.
private  LogManager mLogger
          The handle to the LogManager.
private  edu.isi.ikcap.workflows.ac.ProcessCatalog mProcessCatalog
          The handle to the Process Catalog.
private  PegasusProperties mProps
          The handle to the properties.
private  String mRequestID
          The request id associated with the DAX.
private  PoolInfoProvider mSiteHandle
          Handle to the site catalog.
private  Map mSiteMap
          Map containing the number of free nodes for each site.
private  List mSites
          The list of sites where the workflow can run.
private  TransformationCatalog mTCHandle
          The handle to the transformation catalog.
protected  Mapper mTCMapper
          Handle to the TCMapper.
private  edu.isi.ikcap.workflows.sr.util.WorkflowGenerationProvenanceCatalog mWGPC
          The handle to the workflow provenance catalog.
private  Graph mWorkflow
          The workflow in the graph format, that needs to be scheduled.
static String PROCESS_CATALOG_IMPL_PROPERTY
          The property that designates which Process catalog impl to pick up.
static String RUNTIME_PROFILE_KEY
          The pegasus profile key that gives us the expected runtime.
 
Constructor Summary
Algorithm(PegasusBag bag)
          The default constructor.
 
Method Summary
protected  float calculateAverageComputeTime(SubInfo job)
          Returns the average compute time in seconds for a job.
protected  long[] calculateEstimatedStartAndFinishTime(GraphNode node, String site)
          Estimates the start and finish time of a job on a site.
protected  float computeDownwardRank(GraphNode node)
          Computes the downward rank of a node.
 String description()
          This method returns a String describing the site selection technique that is being implemented by the implementing class.
protected  long getAvailableTime(String site, long readyTime)
          Returns the available time for a site.
protected  int getExpectedRuntime(SubInfo job, TransformationCatalogEntry entry)
          Return expected runtime.
protected  int getExpectedRuntimeFromAC(SubInfo job, TransformationCatalogEntry entry)
          Return expected runtime from the AC only if the process catalog is initialized.
private  float getFloatValue(Object key)
          A convenience method to get the float value for the object passed.
protected  int getFreeNodesForSite(String site)
          Returns the number of free nodes for a site.
 long getMakespan()
          Returns the makespan of the scheduled workflow.
protected  edu.isi.ikcap.workflows.ac.ProcessCatalog loadProcessCatalog(String type, Properties props)
          Load the process catalog, only if it is determined that the Transformation Catalog description is the windward one.
 String mapJob2ExecPool(SubInfo job, List pools)
          The call out to the site selector to determine on what pool the job should be scheduled.
protected  void populateSiteMap(List sites)
          Populates the number of free nodes for each site, by querying the Site Catalog.
 void schedule(ADag dag, List sites)
          Schedules the workflow using the HEFT algorithm.
 void schedule(Graph workflow, List sites)
          Schedules the workflow according to the HEFT algorithm.
protected  void scheduleJob(String site, long start, long end)
          Schedules a job to a site.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

RUNTIME_PROFILE_KEY

public static final String RUNTIME_PROFILE_KEY
The pegasus profile key that gives us the expected runtime.

See Also:
Constant Field Values

PROCESS_CATALOG_IMPL_PROPERTY

public static final String PROCESS_CATALOG_IMPL_PROPERTY
The property that designates which Process catalog impl to pick up.

See Also:
Constant Field Values

AVERAGE_BANDWIDTH

public static final float AVERAGE_BANDWIDTH
The average bandwidth between the sites, in megabytes per second.

See Also:
Constant Field Values

AVERAGE_DATA_SIZE_BETWEEN_JOBS

public static final float AVERAGE_DATA_SIZE_BETWEEN_JOBS
The average amount of data transferred between two jobs in the workflow, in megabytes.

See Also:
Constant Field Values

DEFAULT_NUMBER_OF_FREE_NODES

public static final int DEFAULT_NUMBER_OF_FREE_NODES
The default number of nodes that are associated with a site if not found in the site catalog.

See Also:
Constant Field Values

MAXIMUM_FINISH_TIME

public static final long MAXIMUM_FINISH_TIME
The maximum finish time possible for a job.

See Also:
Constant Field Values

mAverageCommunicationCost

private float mAverageCommunicationCost
The average communication cost between nodes.
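
One plausible way to obtain a fixed communication cost is to divide an average data size by an average bandwidth. The sketch below illustrates that arithmetic only; it is an assumption, not the confirmed implementation of this class. The class name, the method name, and the numeric values are hypothetical (the real constant values are listed under Constant Field Values).

```java
/** Hypothetical sketch: derive a fixed communication cost from data size and bandwidth. */
final class CommunicationCostSketch {
    // Placeholder values for illustration, NOT the actual constants of Algorithm.
    static final float ASSUMED_BANDWIDTH_MB_PER_SEC = 5.0f;
    static final float ASSUMED_DATA_SIZE_MB = 2.0f;

    /** Returns the seconds needed to move dataSizeMB at bandwidthMBps. */
    static float averageCommunicationCost(float dataSizeMB, float bandwidthMBps) {
        if (bandwidthMBps <= 0f) {
            throw new IllegalArgumentException("bandwidth must be positive");
        }
        return dataSizeMB / bandwidthMBps;
    }
}
```

With the placeholder values above, the cost would be 2.0 MB divided by 5.0 MB/s, that is 0.4 seconds per edge that crosses sites.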


mWorkflow

private Graph mWorkflow
The workflow in the graph format, that needs to be scheduled.


mSiteHandle

private PoolInfoProvider mSiteHandle
Handle to the site catalog.


mSites

private List mSites
The list of sites where the workflow can run.


mSiteMap

private Map mSiteMap
Map containing the number of free nodes for each site. The key is the site name, and value is a Site object.


mTCMapper

protected Mapper mTCMapper
Handle to the TCMapper.


mLogger

private LogManager mLogger
The handle to the LogManager.


mProps

private PegasusProperties mProps
The handle to the properties.


mWGPC

private edu.isi.ikcap.workflows.sr.util.WorkflowGenerationProvenanceCatalog mWGPC
The handle to the workflow provenance catalog.


mRequestID

private String mRequestID
The request id associated with the DAX.


mLabel

private String mLabel
The label of the workflow.


mProcessCatalog

private edu.isi.ikcap.workflows.ac.ProcessCatalog mProcessCatalog
The handle to the Process Catalog.


mTCHandle

private TransformationCatalog mTCHandle
The handle to the transformation catalog.

Constructor Detail

Algorithm

public Algorithm(PegasusBag bag)
The default constructor.

Parameters:
bag - the bag of Pegasus related objects.
Method Detail

schedule

public void schedule(ADag dag,
                     List sites)
Schedules the workflow using the HEFT algorithm.

Parameters:
dag - the ADag object containing the abstract workflow that needs to be mapped.
sites - the list of candidate sites where the workflow can potentially execute.

loadProcessCatalog

protected edu.isi.ikcap.workflows.ac.ProcessCatalog loadProcessCatalog(String type,
                                                                       Properties props)
Load the process catalog, only if it is determined that the Transformation Catalog description is the windward one.

Parameters:
type - the type of process catalog
props - contains all necessary data to establish the link.
Returns:
the handle to the loaded process catalog.

schedule

public void schedule(Graph workflow,
                     List sites)
Schedules the workflow according to the HEFT algorithm.

Parameters:
workflow - the workflow that has to be scheduled.
sites - the list of candidate sites where the workflow can potentially execute.

getMakespan

public long getMakespan()
Returns the makespan of the scheduled workflow. It is maximum of the actual finish times for the leaves of the scheduled workflow.

Returns:
the makespan of the workflow.
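
The makespan rule described above (the maximum of the actual finish times over the leaves) can be sketched as follows. This is a minimal illustration, assuming the finish times are available in a map keyed by job name; the class name, method name, and data layout are hypothetical and do not reflect how Algorithm stores its state.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: makespan as the latest finish time among leaf jobs. */
final class MakespanSketch {
    /**
     * Returns the largest finish time among the given leaves,
     * or 0 if there are no leaves or none has a recorded finish time.
     */
    static long makespan(List<String> leaves, Map<String, Long> finishTimes) {
        long max = 0L;
        for (String leaf : leaves) {
            Long finish = finishTimes.get(leaf);
            if (finish != null && finish > max) {
                max = finish;
            }
        }
        return max;
    }
}
```

Only leaves need to be inspected because every non-leaf job finishes before at least one of its descendants starts.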

calculateEstimatedStartAndFinishTime

protected long[] calculateEstimatedStartAndFinishTime(GraphNode node,
                                                      String site)
Estimates the start and finish time of a job on a site.

Parameters:
node - the node that is being scheduled
site - the site for which the finish time is required.
Returns:
a long array in which index 0 holds the estimated start time and index 1 the estimated finish time.
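
As a rough illustration of how such an estimate is commonly formed in HEFT, the sketch below takes the time at which all parent data has arrived, the time at which the site becomes free, and the job runtime. It assumes a fixed communication cost that applies only when a parent ran on a different site. All type, method, and parameter names are hypothetical; this is not the confirmed logic of this method.

```java
import java.util.List;

/** Hypothetical sketch: estimated start and finish time of a job on a site. */
final class StartFinishSketch {
    /** Minimal stand-in for an already scheduled parent job. */
    static final class ScheduledParent {
        final String site;
        final long finishTime;

        ScheduledParent(String site, long finishTime) {
            this.site = site;
            this.finishTime = finishTime;
        }
    }

    /** Returns {start, finish}, mirroring the long[0] / long[1] convention. */
    static long[] estimate(List<ScheduledParent> parents, String site,
                           long commCost, long runtime, long siteFreeAt) {
        long ready = 0L;
        for (ScheduledParent p : parents) {
            // Data from a parent on the same site is assumed to arrive at no cost.
            long arrival = p.finishTime + (p.site.equals(site) ? 0L : commCost);
            ready = Math.max(ready, arrival);
        }
        // The job cannot start before its data is ready or before the site is free.
        long start = Math.max(ready, siteFreeAt);
        return new long[] { start, start + runtime };
    }
}
```

For example, with parents finishing at 10 on site A and at 12 on site B, a communication cost of 5, a runtime of 20, and site A free at 8, the estimate for site A is a start of 17 and a finish of 37.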

computeDownwardRank

protected float computeDownwardRank(GraphNode node)
Computes the downward rank of a node. The downward rank of node i is

$$ rank_d(n_i) = \max_{j \in pred(i)} \left\{ rank_d(n_j) + \overline{w}_j + \overline{c}_{ji} \right\} $$

Parameters:
node - the GraphNode whose rank needs to be computed.
Returns:
computed rank.
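
The downward rank recurrence (the maximum, over a node's parents, of the parent's rank plus the parent's average compute time plus the average communication cost) can be sketched recursively with memoization. The Node type, the single fixed communication cost, and the convention that a node without parents has rank 0 are assumptions made for this illustration; they are not taken from GraphNode or from this class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: recursive downward rank with memoization. */
final class DownwardRankSketch {
    /** Minimal stand-in for a workflow node. */
    static final class Node {
        final String id;
        final float avgComputeTime;
        final List<Node> parents = new ArrayList<>();

        Node(String id, float avgComputeTime) {
            this.id = id;
            this.avgComputeTime = avgComputeTime;
        }
    }

    /** Returns the downward rank of node, caching results in memo by node id. */
    static float downwardRank(Node node, float avgCommCost, Map<String, Float> memo) {
        Float cached = memo.get(node.id);
        if (cached != null) {
            return cached;
        }
        float rank = 0f; // a node without parents keeps rank 0
        for (Node parent : node.parents) {
            float candidate = downwardRank(parent, avgCommCost, memo)
                    + parent.avgComputeTime + avgCommCost;
            rank = Math.max(rank, candidate);
        }
        memo.put(node.id, rank);
        return rank;
    }
}
```

Memoization keeps the traversal linear in the number of edges, since each node's rank is computed once even when it is reachable through several paths.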

calculateAverageComputeTime

protected float calculateAverageComputeTime(SubInfo job)
Returns the average compute time in seconds for a job.

Parameters:
job - the job whose average compute time is to be computed.
Returns:
the weighted compute time in seconds.
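
A simple reading of "average compute time" is the arithmetic mean of the job's expected runtimes over the candidate sites. The sketch below shows only that interpretation; whether the actual method weights the runtimes differently is not stated here, so treat the class name, method name, and equal weighting as assumptions.

```java
/** Hypothetical sketch: mean expected runtime of a job over candidate sites. */
final class AverageComputeTimeSketch {
    /** Returns the arithmetic mean of the runtimes in seconds, or 0 for an empty array. */
    static float average(int[] runtimesInSeconds) {
        if (runtimesInSeconds.length == 0) {
            return 0f;
        }
        long total = 0L; // long avoids int overflow while summing
        for (int runtime : runtimesInSeconds) {
            total += runtime;
        }
        return (float) total / runtimesInSeconds.length;
    }
}
```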

getExpectedRuntime

protected int getExpectedRuntime(SubInfo job,
                                 TransformationCatalogEntry entry)
Return expected runtime.

Parameters:
job - the job in the workflow.
entry - the TransformationCatalogEntry object.
Returns:
the runtime in seconds.

getExpectedRuntimeFromAC

protected int getExpectedRuntimeFromAC(SubInfo job,
                                       TransformationCatalogEntry entry)
Return expected runtime from the AC only if the process catalog is initialized.

Parameters:
job - the job in the workflow.
entry - the TC entry
Returns:
the runtime in seconds.

populateSiteMap

protected void populateSiteMap(List sites)
Populates the number of free nodes for each site, by querying the Site Catalog.

Parameters:
sites - list of sites.

getFreeNodesForSite

protected int getFreeNodesForSite(String site)
Returns the number of free nodes for a site.

Parameters:
site - the site identifier.
Returns:
number of nodes

scheduleJob

protected void scheduleJob(String site,
                           long start,
                           long end)
Schedules a job to a site.

Parameters:
site - the site at which to schedule
start - the start time for job
end - the end time of job

getAvailableTime

protected long getAvailableTime(String site,
                                long readyTime)
Returns the available time for a site.

Parameters:
site - the site at which you want to schedule the job.
readyTime - the time at which all the data required by the job will arrive at the site.
Returns:
the available time of the site.
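
One way to model the interplay between scheduleJob and getAvailableTime is to track, for each site, the time at which each of its nodes becomes free. The sketch below is an illustration under that assumption; the class, the addSite helper, and the per-node array are hypothetical and are not taken from the Site object used by this class.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: per-site node availability tracking. */
final class SiteAvailabilitySketch {
    // For each site, the time at which each node becomes free.
    private final Map<String, long[]> nodeFreeTimes = new HashMap<>();

    /** Registers a site whose nodes are all free at time 0. */
    void addSite(String site, int freeNodes) {
        if (freeNodes < 1) {
            throw new IllegalArgumentException("a site needs at least one node");
        }
        nodeFreeTimes.put(site, new long[freeNodes]);
    }

    /** Earliest time, not before readyTime, at which some node on the site is free. */
    long getAvailableTime(String site, long readyTime) {
        long earliest = Arrays.stream(nodes(site)).min().getAsLong();
        return Math.max(earliest, readyTime);
    }

    /** Occupies the node that frees up first until the given end time. */
    void scheduleJob(String site, long start, long end) {
        long[] nodes = nodes(site);
        int best = 0;
        for (int i = 1; i < nodes.length; i++) {
            if (nodes[i] < nodes[best]) {
                best = i;
            }
        }
        // start is unused here: only the end of the occupancy is recorded.
        nodes[best] = end;
    }

    private long[] nodes(String site) {
        long[] nodes = nodeFreeTimes.get(site);
        if (nodes == null) {
            throw new IllegalArgumentException("unknown site: " + site);
        }
        return nodes;
    }
}
```

This model does not backfill idle gaps between jobs, which keeps the bookkeeping to one value per node.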

description

public String description()
This method returns a String describing the site selection technique that is being implemented by the implementing class.

Returns:
a String describing the site selection technique.

mapJob2ExecPool

public String mapJob2ExecPool(SubInfo job,
                              List pools)
The call out to the site selector to determine on what pool the job should be scheduled.

Parameters:
job - the SubInfo object corresponding to the job whose execution pool we want to determine.
pools - the list of String objects representing the execution pools that can be used.
Returns:
a string of the form executionpool:jobmanager, where the jobmanager can be null, if a pool is found to which the job can be mapped. If the pool is not found, the poolhandle is set to NONE. Returns null if an error occurred.

getFloatValue

private float getFloatValue(Object key)
A convenience method to get the float value for the object passed.

Parameters:
key - the key to be converted
Returns:
the float value if the object is an integer, else -1
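
The documented contract (the float value when the object is an integer, otherwise -1) can be sketched as below. This is a guess at the behavior based only on the description above; the class name is hypothetical, and the actual private method may handle other types differently.

```java
/** Hypothetical sketch of the documented conversion contract. */
final class FloatValueSketch {
    /** Returns the float value of key if it is an Integer, otherwise -1. */
    static float getFloatValue(Object key) {
        if (key instanceof Integer) {
            return ((Integer) key).floatValue();
        }
        // Covers null and every non-Integer type.
        return -1f;
    }
}
```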


Copyright © 2007 The University of Southern California. All Rights Reserved.