Fairshare

The batch farm uses a "fair share" system to schedule jobs; at JLab this is implemented by the Maui job scheduler.  The system works within fixed scheduling bins or periods (currently set to one day) and attempts to fill the current bin first with the highest priority work (largest fairshare) and then with lower priority work.  If a project or user gets no work done in one bin, they effectively have higher priority in the next bin, so that over multiple bins the assigned shares can be achieved with reasonable precision.

From Maui's point of view there are only accounts and users, each with a fairshare.  Over time, each account will achieve its allocated share (e.g. 35%).  Within an account, users also have a fairshare, and a user with twice the fairshare of another user of the same account will typically get twice as much work through the system.  Similarly, if there are two accounts each with a 50% fair share, where one has only a single user and the other has 5 users, then that single user will get the account's full fair share (50% of the farm) while the other users each get a fifth of their account's share (10% of the farm).  This is the definition of "fair".
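The two-account example above can be checked with a few lines of arithmetic. This is only an illustrative sketch, not Maui's actual algorithm: the function name user_shares and the user names are hypothetical, and it assumes every user is continuously busy so that the long-run targets are reached.

```python
def user_shares(account_share, user_weights):
    """Split an account's share (in percent of the farm) among its users
    in proportion to their fairshare weights (hypothetical helper)."""
    total = sum(user_weights.values())
    return {user: account_share * w / total for user, w in user_weights.items()}

# Account A: 50% of the farm, one user.
solo = user_shares(50.0, {"solo_user": 1})

# Account B: 50% of the farm, five users with equal weight.
group = user_shares(50.0, {f"user{i}": 1 for i in range(1, 6)})

print(solo)   # the single user receives the whole 50%
print(group)  # each of the five users receives 10%
```

Doubling one user's weight in the second account would double that user's portion relative to the others, at the expense of everyone else in the same account.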

From Auger's point of view, the system is a bit more complex (and thus configurable).  At JLab there are major shared unix accounts (users) that do the more demanding work of first pass reconstruction, simulation, etc.  Auger calls these users "projects".
Projects behave like users to Maui, and their weight is configured to be higher than normal user weights (10 to 1).   For example, if one of these project-users is running and 10 regular users are using the same account, then the project will get half of that account's fair share, and the users will each get 5% of it.
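The 10-to-1 weighting can be worked through numerically. The sketch below is an illustration of the arithmetic only; the function account_fractions and all member names are hypothetical, and the only value taken from the text is the 10:1 ratio of project weight to user weight.

```python
PROJECT_WEIGHT = 10  # from the text: project weight is 10 to 1
USER_WEIGHT = 1

def account_fractions(projects, users):
    """Return each member's fraction of the account's fair share,
    proportional to its weight (hypothetical helper)."""
    weights = {name: PROJECT_WEIGHT for name in projects}
    weights.update({name: USER_WEIGHT for name in users})
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# One project-user and 10 regular users active in the same account.
fractions = account_fractions(
    ["reco_project"], [f"user{i}" for i in range(1, 11)]
)

print(fractions["reco_project"])  # 0.5  (half of the account's share)
print(fractions["user1"])         # 0.05 (5% of the account's share)
```

These are fractions of the account's share, not of the whole farm; multiplying by the account's allocation gives the farm-wide figure.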

To use a project, the user specifies the name of the project using the PROJECT parameter, and Auger will then schedule the work to be done using that project account/user (if the user is a member of that project).

Tuning the fairshare system is an ongoing process; there are many parameters in the Maui scheduler.
Currently the farm's system is set to use 1-day bins, and to include 7 bins with exponentially decaying weights so that recent history is more important than older bins (i.e. a crude approximation to a sliding exponential window).  Since single jobs can run for 3 days, the system as configured is hampered in its ability to treat large complex job mixes accurately.  This could penalize small allocations (users) when the system is especially busy.  Maui is designed to allow a user to take a few days off and then catch up, but the current configuration hamstrings the ability to catch up (the "sliding window" is very nearsighted).  All of these optimizations are done collaboratively by Scientific Computing and the Physics division as needed, and once someone understands the system's behavior, their suggestions for tweaks are welcome.
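To make the effect of the decaying window concrete, the sketch below computes a weighted average of usage over 7 one-day bins. It is a simplified illustration, not Maui's implementation: the function effective_usage is hypothetical, and the decay factor of 0.7 is an assumed value chosen for the example, since the farm's actual setting is not stated here.

```python
def effective_usage(daily_usage, decay=0.7, depth=7):
    """Weighted average of per-bin usage, most recent bin first.
    Bin i (i days ago) is weighted by decay**i. The decay value
    is an assumption for illustration, not the farm's setting."""
    bins = daily_usage[:depth]
    weights = [decay ** i for i in range(len(bins))]
    return sum(w * u for w, u in zip(weights, bins)) / sum(weights)

# Same total usage (10 units), placed in different bins.
used_today = effective_usage([10, 0, 0, 0, 0, 0, 0])
used_last_week = effective_usage([0, 0, 0, 0, 0, 0, 10])

print(used_today)      # counts heavily against the user
print(used_last_week)  # mostly forgotten by the scheduler
```

Under these assumptions, usage from six days ago carries about 12% of the weight of today's usage, and anything older than 7 bins is dropped entirely, which is why a short window limits how much "catching up" the scheduler can grant.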

Project Fairshare, Spring 2018

At the highest level, jobs are scheduled to run according to the following account weights. Accounts are mapped from the different PROJECTs and are listed below with their target percentage of resource utilization. These allocations are adjusted periodically by Physics based on beam running, project priorities, and other considerations at the time.

Auger Account    FS percentage
halla            5%
hallb            30%
hallc            5%
halld            60%
other            10%