Network Performance Guide
Overview
This guide is intended to help you find and fix bottlenecks in your Deadline render farm. If you notice slow network performance while using Deadline, there are a few things you can do to improve it.
Adjust Slave Settings
There are a few Slave settings in the Repository Options that you can tweak to help improve performance.
The settings to tweak are under Wait Times:
- Number of seconds between Repository queries (RQ): The number of seconds a slave that has not connected to Pulse will wait between queries to the repository when it is idle.
- Number of seconds between queries for new Tasks (TQ): The number of seconds a slave that has not connected to Pulse will wait after it has completed a task before it tries to query for another.
As a general rule of thumb, we suggest setting the RQ to the number of slaves you have divided by 2. For example, if you have 100 slaves, set the RQ to 50. We then suggest setting the TQ to the RQ divided by 10, so in this example, set the TQ to 5. With these values as a starting point, you can adjust them as necessary to improve performance.
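The rule of thumb above can be written as a small calculation. This is only a sketch: the function name is hypothetical, and the integer rounding and the floor of 1 second for small farms are assumptions rather than anything Deadline prescribes.

```python
def suggested_wait_times(slave_count):
    """Return (RQ, TQ) starting values in seconds from the rule of thumb."""
    # RQ: number of slaves divided by 2 (assumed floor of 1 second).
    rq = max(1, slave_count // 2)
    # TQ: RQ divided by 10 (assumed floor of 1 second).
    tq = max(1, rq // 10)
    return rq, tq


print(suggested_wait_times(100))  # (50, 5)
```

Treat the result as a starting point to enter under Wait Times, then tune it against what you observe on your own network.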
Use Deadline Pulse
If you have more than 50 render nodes in your farm, or are still experiencing performance issues after adjusting the Slave settings, we recommend that you run the Deadline Pulse application. Pulse is a server application that acts as a proxy between the Deadline applications and the Repository, which reduces the load on your network and improves Deadline's overall performance. Note that if Pulse goes down for any reason, the Deadline applications will revert to scanning the Repository themselves. Whether or not you decide to run Pulse, you will still benefit from Deadline's robustness. To configure and run Deadline Pulse, see the Deadline Pulse documentation.
Once you have installed and configured Pulse, all you need to do is launch the Pulse application on the appropriate machine. Pulse will start showing how many slaves are connecting to it and how many it is servicing. If the slaves don't start connecting right away, they may not yet have recognized the changes you made in the Repository Options. Restarting the Slave applications makes them pick up the new settings immediately.
If the slaves still have difficulty connecting to Pulse, check whether any firewall or security software on the Pulse machine is interfering with the communication. If so, you may need to add an exception for Pulse and the port it communicates over before the slaves can successfully connect.
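One quick way to tell a blocked port apart from a Slave configuration problem is to test the connection from a slave machine. The sketch below assumes Pulse accepts TCP connections on the port you configured; the function name is hypothetical, and it only checks reachability, not whether Pulse itself is behaving correctly.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Call it with the host name of your Pulse machine and the port set in your Pulse configuration. If it returns False while Pulse is running, a firewall or security exception is the likely next thing to check.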
Enable Throttling
Pulse supports a Throttling feature, which is helpful if you're submitting large files with your jobs. It limits the number of Slaves that can copy over the job files at the same time. The Throttling settings can be found in the Pulse Settings section of the Repository Options.
For example, if you have 100 slaves, and you're submitting 500 MB scene files with your jobs, you may notice a performance hit if all 100 slaves try to copy over the job files at the same time. You could set the Slave Throttle Limit to 10 so that no more than 10 slaves are ever copying those files at once. Note that a Slave only copies over the job files when it starts up a new job. When it goes on to render subsequent tasks for the same job, it is not affected by the throttling feature. As a rule of thumb, we suggest setting the Slave Throttle Wait Interval to the same value as the Job Scan Interval in the General Pulse Settings.
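To see what the limit buys you, compare the worst-case amount of data in flight with and without throttling. This is a back-of-the-envelope sketch with a hypothetical function name; it assumes every slave starts the job at the same moment and ignores network and disk speeds.

```python
def peak_copy_load_mb(slave_count, scene_size_mb, throttle_limit=None):
    """Estimate the most data (in MB) being copied at once when a job starts."""
    if throttle_limit is None:
        # Without a limit, every slave may copy at the same time.
        concurrent = slave_count
    else:
        # With a limit, at most throttle_limit slaves copy at once.
        concurrent = min(slave_count, throttle_limit)
    return concurrent * scene_size_mb


print(peak_copy_load_mb(100, 500))      # 50000
print(peak_copy_load_mb(100, 500, 10))  # 5000
```

In the example above, a Slave Throttle Limit of 10 cuts the worst-case simultaneous transfer from 50,000 MB to 5,000 MB, at the cost of some slaves waiting before they begin.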
Manage Job Auxiliary Files
If you are submitting your scene files with the jobs, overall performance can suffer when the scene files are large. This is because whenever a Slave starts a new job, it copies the job files (including the scene file) locally before rendering. If you have 200 machines starting a job with a 500 MB scene file, and your Repository machine hardware isn't built to handle a large load, your performance will suffer.
If enabling Throttling isn't helping, another option (which can be used in conjunction with Throttling if you wish) is to configure Deadline to store these scene files in an alternate location, such as a separate file server. This can be done by configuring the Job Auxiliary Files settings in the Repository Options.
You can choose a server that's better equipped to handle the load, which will help improve the performance and stability of your Repository machine. In a mixed farm environment, you need to ensure that the paths for each operating system resolve to the same location. Otherwise, a scene file submitted with the job on one operating system will not be visible to a slave running on another.
Evaluate Repository Hardware
Download and read the Scaling Whitepaper from the Miscellaneous Deadline Downloads Page. This is a guide for setting up and configuring your network to get the best performance out of your render farm. It contains a section on hardware considerations to help ensure your Repository machine meets the current and future demands of your render farm.
