Resource Management

This page describes how resources within the cluster are allocated and charged.

Please refer to this presentation for a brief overview of how to log in to STOKES and submit jobs:
STOKES_HPCC_MPI_Usage_Pres.pdf

A queuing system similar to the one adopted by NERSC will be implemented on the IBM cluster recently acquired by the University of Central Florida. The differences considered when constructing the queuing schedule include a longer Max Wallclock limit, which accommodates jobs that may take several days to finish. The queues initially established for resources are listed below:



This queuing system will be updated as utilization experience is gained and as the planned expansion of resources takes place.

Batch Job Compute Time Charging:

Account charging is customarily defined in terms of a master public policy (MPP) to which users of the HPC must adhere. The policy is defined according to the following items:
  • Nodes – Computers in the cluster with consumable resources such as memory, processors, and storage.
  • Wallclock hours (WC) – Number of real-time hours that elapse while the job runs, as counted by a clock on the wall.
  • CPU Cores (CC) – Number of processor cores scheduled for the job submitted by a user.
  • Machine Charge Factor (MCF) – Factor associated with the processing speed, or amount of work that can be performed, by the core architecture of the node consumed by a user's job. This usually depends on the clock speed of the processor in the node (e.g., 2.4 GHz AMD Opteron vs. 3.0 GHz Intel Xeon cores).
  • Job Priority (JP) – Priority with which a user’s job submission will be scheduled in the queue without terminating jobs that are currently running.
The number of MPP hours consumed by a job as submitted by a user is calculated according to the following formula:

MPP Hours = [WC]*[Nodes]*[CC/node]*[MCF]*[JP]

For example, a job using 32 nodes with 2 CPU cores per node, running for 10 hours on a core architecture with a machine charge factor of 3.6 and requested at normal priority (JP = 1.0), would be charged as follows:
10 hours * 32 nodes * 2 CPU cores/node * 3.6 MCF * 1.0 priority = 2,304 MPP Hours.
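
The charging formula above can be sketched as a small Python helper for estimating a job's cost before submission. This is only an illustration: the function name mpp_hours and its parameter names are hypothetical and are not part of any STOKES tool, and the actual charge is whatever the cluster's accounting system records.

```python
def mpp_hours(wallclock_hours, nodes, cores_per_node,
              machine_charge_factor, job_priority=1.0):
    """Estimate MPP hours as WC * Nodes * (CC/node) * MCF * JP.

    Hypothetical helper for illustration only; names are assumptions.
    """
    factors = {
        "wallclock_hours": wallclock_hours,
        "nodes": nodes,
        "cores_per_node": cores_per_node,
        "machine_charge_factor": machine_charge_factor,
        "job_priority": job_priority,
    }
    # Reject negative inputs, which have no meaning in the formula.
    for name, value in factors.items():
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return (wallclock_hours * nodes * cores_per_node
            * machine_charge_factor * job_priority)


# The worked example from the text: 10 hours, 32 nodes, 2 cores per node,
# MCF of 3.6, normal priority (JP = 1.0).
charge = mpp_hours(10, 32, 2, 3.6, job_priority=1.0)
print(f"{charge:,.0f} MPP Hours")
```

Because every term is a simple multiplier, doubling any one of them (for example, requesting twice as many nodes) doubles the estimated charge.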

Storage Account Charging:

Storage quotas are handled by the standard quota management system available in the Linux kernel. New accounts are created on a per-group basis with a default limit of 300 GB of storage, plus access to 260 GB of scratch storage per user (users are expected to remove scratch data immediately after job completion). Users who require additional storage are expected to submit a special request along with their request for MPP hours in the HPCC.

If you have any questions or problems regarding resource management, please contact:
Sergio Tafur
Phone: 407-882-1350
Email: tafur@physics.ucf.edu