Deadline Documentation

Power Management


Overview

Power management is a system for controlling how machines start up and shut down automatically based on conditions on the render farm, including job load and temperature. Machines can be configured to shut down after they have been idle for a set amount of time, and to start up depending on how many jobs are queued. An external temperature sensor can also be polled using SNMP, and a number of machines shut down if the temperature rises above a given threshold. If you have problematic machines that need frequent restarts, you can have them reboot automatically at defined intervals. You can also set up a schedule for when slaves should start and stop.

Power management is built into Deadline Pulse, so Pulse must be running to use this feature. The only exception to this rule is Temperature checking. Redundancy for Temperature checking has been built into the Slave application, so if Pulse isn't running, you're still protected if the temperature in your farm room begins to rise. If you wish to use power management without having Pulse act as a proxy between the Deadline applications and the Repository, simply leave the Host Name Or IP Address setting blank in the Pulse Settings in the Repository Options.

To configure and run Deadline Pulse, see the Deadline Pulse documentation.

Configuration

Power management can be configured from the Deadline Monitor while in Super User mode by selecting Tools -> Configure Power Management.

Machine Groups are used by Power Management to organize slave machines on the farm, and each group has five sections that can be configured independently of each other.

Power Management Group Settings:

  • Group Name: The name the Power Management Group will be identified by.
  • Group Mode: Select Disabled or Enabled.
  • Slaves Not In Group: Slaves that are available to be added to the new group.
  • Slaves In Group: Slaves that will be included in the new group.

Idle Shutdown

Idle Shutdown forces slaves to shut down after they have been idle for a set period of time. This can be used to save on energy costs when the render farm is sitting idle. Combining this feature with Wake On Lan will ensure that machines in the render farm are only running when they are needed.

You can split the idle time period between a daytime period and an evening period. This is useful because in most cases you want most of your machines to stay on during the working day, and then shut down during the evening when there are no renders left. You can also specify exceptions to these two periods, so that (for example) the weekend has its own idle periods.

Idle Shutdown Settings:

  • Idle Shutdown Mode: Select Disabled, Enabled, or Debug mode. In Debug mode, all the checks are performed as normal, but no action is actually taken.
  • Slaves Will Be Shutdown After Being Idle For # Minutes: The number of minutes a slave must be idle before it is shut down.
  • Number of Slaves To Leave Running: The minimum number of slaves to leave running.
  • Slave Shutdown Type: The method that will be used to shut down the slave machine.
    • Shutdown: Power off the machine using the normal shutdown method.
    • Stand By: Put the machine into Stand By mode instead of shutting it down. Only works for Windows slaves.
    • Run Command: Have the Slave run a command when it attempts to shut down the machine.
  • Important Processes: If any of these processes are running on a slave, the machine will not be shut down.
  • Overrides: Define overrides for different days and times. Specify the day(s) of the week, the time period, the minimum number of slaves, and the idle shutdown time for each override required. For example, if more machines need to stay running on Friday evening and Saturday afternoon, this can be specified in an override.
  • Override Shutdown Order: Whether or not to define the order in which slaves are shut down. If disabled, slaves are shut down in alphabetical order. If enabled, use the Set Shutdown Order dialog to define the order.
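
The way these settings interact can be sketched in a few lines of Python. This is only an illustration of the rules described above, not Deadline's actual implementation: the SlaveState class, the pick_slaves_to_shut_down function, and the assumption that every listed slave is currently powered on are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SlaveState:
    """Hypothetical snapshot of one powered-on slave in the group."""
    name: str
    idle_minutes: int
    important_process_running: bool = False


def pick_slaves_to_shut_down(
    slaves: List[SlaveState],
    idle_limit_minutes: int,
    leave_running: int,
    shutdown_order: Optional[List[str]] = None,
) -> List[str]:
    """Return the names of slaves to shut down, in shutdown order."""
    # Never shut down so many machines that fewer than
    # `leave_running` remain powered on.
    max_shutdowns = max(0, len(slaves) - leave_running)

    # Only idle slaves without an important process are candidates.
    candidates = [
        s for s in slaves
        if s.idle_minutes >= idle_limit_minutes
        and not s.important_process_running
    ]

    if shutdown_order is not None:
        # Custom order first; unlisted slaves follow alphabetically.
        rank = {name: i for i, name in enumerate(shutdown_order)}
        candidates.sort(key=lambda s: (rank.get(s.name, len(rank)), s.name))
    else:
        # Default behaviour described above: alphabetical order.
        candidates.sort(key=lambda s: s.name)

    return [s.name for s in candidates[:max_shutdowns]]
```

An override for a given day and time would simply call the same function with a different idle_limit_minutes and leave_running.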

Machine Startup

A system that allows powered down machines to be started automatically when new jobs are submitted to the render farm. Combining this feature with Idle Shutdown will ensure that machines in the render farm are only running when they are needed.

If slave machines support it, Wake On Lan (WOL) or IPMI commands can be used to start them up after they shut down. By default, the WOL packet is sent over port 9, but you can change this in the Wake On Lan Settings in the Repository Options. Make sure there isn't a firewall or other security software blocking communication over the selected port(s).
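
When troubleshooting a firewall, it can help to send a WOL packet by hand. The sketch below builds the standard magic packet (six 0xFF bytes followed by the target MAC address repeated 16 times) and broadcasts it over UDP. The function names and the default broadcast address are assumptions made for this example; it does not reflect how Pulse itself sends the packet.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake On Lan magic packet for a MAC like 00:11:22:33:44:55."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain 12 hex digits: %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wake_on_lan(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 matches the default above)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

For example, send_wake_on_lan("00:11:22:33:44:55") broadcasts one packet on the local subnet. If the machine does not wake, check its BIOS and network adapter WOL settings as well as the firewall.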

Note: If machines in the group begin to be shut down due to temperature, this feature may be automatically disabled for the group to prevent machines from starting up and raising the temperature again.

Machine Startup Settings:

  • Machine Startup Mode: Select Disabled, Enabled, or Debug mode. In Debug mode, all the checks are performed as normal, but no action is actually taken.
  • Number Of Slaves To Wake Up Per Interval: The maximum number of machines to start in the given power management check interval. The interval can be configured in the Pulse section of the Repository Options.
  • Use Wake On Lan: Use Wake On Lan to start the machines.
  • Run Command: This is primarily for IPMI support. If enabled, Pulse will run a given command to start slave machines. This command will be run once for each slave that is being woken up. A few tags can be used within the command:
    • {SLAVE_NAME}: Is replaced with the current slave's hostname.
    • {SLAVE_MAC}: Is replaced with the current slave's MAC address.
    • {SLAVE_IP}: Is replaced with the current slave's IP address.
  • Override Startup Order: Whether or not to define the order in which slaves are started up. If disabled, slaves are started up in alphabetical order. If enabled, use the Set Startup Order dialog to define the order.
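
The tag substitution for Run Command can be pictured as a plain string replacement performed once per slave. The sketch below is an illustration only: the expand_startup_command function is hypothetical, and the ipmitool command line (including the "-bmc" host name suffix and the credentials) is an assumed example of what a site might configure, not a command taken from Deadline.

```python
def expand_startup_command(template: str, slave_name: str,
                           slave_mac: str, slave_ip: str) -> str:
    """Replace the documented tags in a startup command template."""
    replacements = {
        "{SLAVE_NAME}": slave_name,
        "{SLAVE_MAC}": slave_mac,
        "{SLAVE_IP}": slave_ip,
    }
    command = template
    for tag, value in replacements.items():
        command = command.replace(tag, value)
    return command


# Assumed example: each machine's management controller is reachable
# at "<hostname>-bmc". Adjust to match your own IPMI setup.
TEMPLATE = "ipmitool -H {SLAVE_NAME}-bmc -U admin -P secret chassis power on"

if __name__ == "__main__":
    print(expand_startup_command(TEMPLATE, "render01",
                                 "00:11:22:33:44:55", "10.0.0.21"))
```

Because the command is run once for each slave being woken up, a template like this produces one power-on request per machine.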

Thermal Shutdown

The thermal shutdown system polls temperature sensors and responds by shutting down machines if the temperature gets too high. The sensors we use are NetTherms, and APC Sensors are also known to be compatible. Note that this feature queries the sensors over SNMP on port 161, so make sure there isn't a firewall or other security software blocking communication over this port.

Thermal Shutdown Settings:

  • Thermal Shutdown Mode: Select Disabled, Enabled, or Debug mode. In Debug mode, all the checks are performed as normal, but no action is actually taken.
  • Thermal Shutdown Temperature Units: The units to display and configure the temperatures in. Note that this is separate from the units that the actual sensors use.
  • Thermal Sensors: The host and OID (object identifier) of the sensor(s) in the zone. To add a sensor, simply click the Add button.
  • Temperature Thresholds: Thresholds can be added for any temperature. When a sensor reports a temperature higher than a particular threshold, the slaves in the zone will respond accordingly. Note that higher temperature thresholds take precedence over lower temperature thresholds.
  • Shut down slaves if sensor(s) cannot be reached: If enabled, the slaves will be shut down after a period of time in which the temperature sensor could not be reached for temperature information.
  • Disable Machine Startup if threshold is reached: If enabled, Machine Startup for the current group will be disabled if a thermal threshold is reached.
  • Re-Enable Machine Startup when temperature returns to temperature: If enabled, Machine Startup for the current group will be re-enabled (if previously disabled by Thermal Shutdown) when the temperature returns to a specified temperature.
  • Override Shutdown Order: Whether or not to define the order in which slaves are shut down. If disabled, slaves are shut down in alphabetical order. If enabled, use the Set Shutdown Order dialog to define the order.
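
The precedence rule for Temperature Thresholds can be sketched as follows. This is a minimal illustration, assuming each threshold maps to a number of slaves to shut down; the active_threshold function and that mapping are hypothetical, and the real response attached to a threshold is whatever you configure in the dialog.

```python
from typing import Dict, Optional, Tuple


def active_threshold(thresholds: Dict[float, int],
                     temperature: float) -> Optional[Tuple[float, int]]:
    """Return the (threshold, slaves_to_shut_down) pair that applies.

    `thresholds` maps a threshold temperature to the number of slaves
    to shut down once a sensor reports a temperature higher than it.
    """
    exceeded = [t for t in thresholds if temperature > t]
    if not exceeded:
        return None  # no threshold exceeded, nothing to do
    # Higher thresholds take precedence over lower ones.
    highest = max(exceeded)
    return highest, thresholds[highest]
```

With thresholds of 30, 35, and 40 degrees, a reading of 36.5 exceeds both 30 and 35, and the 35 degree threshold is the one that applies.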

Sensor Settings:

  • Sensor Hostname Or IP Address: The host of the temperature sensor.
  • Sensor OID: The OID (object identifier) of the temperature sensor. The default OID is for the particular type of sensor we use.
  • Sensor SNMP Community: The SNMP community used when querying the sensor. If testing the sensor fails when Private is selected, try selecting Public.
  • Sensor Reports Temperature As: Select the units that your temperature sensor uses to report the temperature.
  • Sensor Timeout In Milliseconds: The timeout value for contacting the sensor.
  • Test Sensor: Queries the sensor for the temperature and displays it. If the temperature displayed seems incorrect, make sure you are reading in the correct units.
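
Because the units a sensor reports in are configured separately from the units used for display and thresholds, a reading has to be converted before it is compared. The sketch below shows that conversion, assuming only Celsius and Fahrenheit are involved; the convert_temperature function and the "C"/"F" unit codes are hypothetical names chosen for this example.

```python
def convert_temperature(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a temperature between Celsius ("C") and Fahrenheit ("F")."""
    # Normalise the reading to Celsius first.
    if from_unit == "C":
        celsius = value
    elif from_unit == "F":
        celsius = (value - 32.0) * 5.0 / 9.0
    else:
        raise ValueError("unknown unit: %r" % from_unit)

    # Then convert to the requested display unit.
    if to_unit == "C":
        return celsius
    if to_unit == "F":
        return celsius * 9.0 / 5.0 + 32.0
    raise ValueError("unknown unit: %r" % to_unit)
```

A sensor set to the wrong unit shows up quickly with Test Sensor: a room at 30 degrees Celsius read as if it were Fahrenheit would be displayed as roughly -1 degree Celsius.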

Machine Restart

If you have problematic machines that you need to restart on a regular basis, you can configure the Machine Restart feature of power management to restart your slave machines at specified intervals. Note that if the slave on the machine is in the middle of rendering a task, it will finish its current task before the machine is restarted.

Machine Restart Settings:

  • Machine Restart Mode: Select Disabled, Enabled, or Debug mode. In Debug mode, all the checks are performed as normal, but no action is actually taken.
  • Restart machines after Slave has been running for: The interval, in minutes, at which this group of slaves will be restarted.

Slave Scheduling

You can use the Slave Scheduling feature of power management to configure when Slave applications should be launched and shut down. You can also group different machines together and have those groups follow different schedules.

Slave Scheduling Settings:

  • Slave Scheduling Mode: Select Disabled, Enabled, or Debug mode. In Debug mode, all the checks are performed as normal, but no action is actually taken.
  • Day Of The Week: Configure which days of the week you want to set a schedule for.
  • Start Time: The time on the selected day that the Slave application should be launched if it is not already running.
  • Stop Time: The time on the selected day that the Slave application should be closed if it is running.
  • Slaves In Group: The slaves that will follow this schedule.
  • Use Wake On Lan If Machine Is Offline When Starting: If enabled, an offline machine will be booted up if the Slave application is scheduled to run. Note that the machine must support Wake On Lan.
  • Allow Slaves To Finish Their Current Task When Stopping: If enabled, the Slave application won't close until it finishes its current task.
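
The Day Of The Week, Start Time, and Stop Time settings amount to a per-day time window. The sketch below shows one way to evaluate such a window. It is an illustration under stated assumptions, not Deadline's scheduler: the slave_should_run function and the schedule dictionary are hypothetical, each window is assumed to start and stop on the same day, and a day without an entry is treated as "no schedule" rather than "stopped".

```python
from datetime import datetime, time
from typing import Dict, Optional, Tuple

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def slave_should_run(schedule: Dict[str, Tuple[time, time]],
                     now: datetime) -> Optional[bool]:
    """Return True/False inside/outside the window, None if the day is unscheduled."""
    entry = schedule.get(DAYS[now.weekday()])  # weekday(): Monday == 0
    if entry is None:
        return None
    start, stop = entry
    # Running from the start time up to, but not including, the stop time.
    return start <= now.time() < stop
```

For example, with a schedule of {"Monday": (time(19, 0), time(23, 30))}, a check at 20:00 on a Monday returns True, and a check at 08:00 the same day returns False.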