Machine Monitoring Graphs

There are several machine monitoring graphs within Designer.

To open the Monitoring Manager, left-click the FPS readout in the Monitoring widget at the lower-right corner of the track.

Once the Monitoring Manager window is open, left-click on Local to see the list of monitoring graphs:

Monitoring Manager

This graph shows the queue length for each output. The queue length is dictated by whether the project is running at normal latency, low latency or ultra low latency.

This graph can be used to see dropped frames. If the queue length drops by one, a frame has been shipped before the next frame has joined the queue. Because frames are queued, this will not be seen on the output as a dropped frame. Running in ultra low latency, where frames are shipped as soon as they have rendered, means that if the next frame is not ready when a frame ships, you will see a frame drop.

If running at normal latency but the graph line drops to zero, frames are not being produced fast enough, which can manifest as dropped frames.
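As a rough illustration of this behaviour, here is a minimal sketch (not Designer’s actual scheduling; the queue depths and render times are invented) showing how a deeper queue absorbs slow frames, while an empty ultra-low-latency queue turns them into visible drops:

```python
from collections import deque

def simulate(queue_depth, render_times_ms, frame_budget_ms=16.7):
    """Count visible frame drops for a given starting queue depth."""
    queue = deque([object()] * queue_depth)   # frames buffered ahead of time
    drops = 0
    for render_ms in render_times_ms:
        if render_ms <= frame_budget_ms:      # next frame joined the queue in time
            queue.append(object())
        if queue:
            queue.popleft()                   # a frame ships at this vsync
        else:
            drops += 1                        # nothing ready: visible drop
    return drops

renders = [10, 12, 25, 11, 30, 12]            # two frames miss the 16.7ms budget
print(simulate(queue_depth=3, render_times_ms=renders))  # 0 - the queue absorbs them
print(simulate(queue_depth=0, render_times_ms=renders))  # 2 - ultra low latency drops
```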

Output latency

This is a cumulative graph that shows the amount of time spent in different tasks, e.g. how much time is spent in the GUI and how much time is spent downloading DMX.

The sequence of events when playing back a video is:

Frames are loaded from video files -> decoded into RAM -> uploaded to the GPU -> rendered on the GPU.

  • Frame: the amount of time taken to render the frame on the GPU. This includes other rendering work, not just that related to video playback. When Frame finishes, the frame is ready to be presented.
  • Upload: the amount of time a frame takes to upload to the GPU, in the sequence of events described above.
  • Compositor: the composite time spent on all the track elements together.
  • GUI: the amount of time spent rendering the GUI.
  • Total: the cumulation of all the components that make up the Compositor and GUI time.
  • DMX: the time spent downloading DMX.

When looking at the graph you can see how much time is spent in each part of the process. If any line goes above the dotted lines, the server is dropping frames. The dotted lines indicate the maximum amount of time that can be spent per frame at either 50fps or 60fps.
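Those dotted lines follow directly from the refresh rate: 1000ms ÷ 50fps = 20ms per frame, and 1000ms ÷ 60fps ≈ 16.7ms. A small sketch of the same check (the sample values are illustrative):

```python
def frame_budget_ms(fps):
    """Maximum time available per frame at a given refresh rate."""
    return 1000.0 / fps

def over_budget(total_times_ms, fps):
    """Indices of frames whose total time breaches the dotted line."""
    budget = frame_budget_ms(fps)
    return [i for i, t in enumerate(total_times_ms) if t > budget]

print(frame_budget_ms(50))                            # 20.0
print(round(frame_budget_ms(60), 1))                  # 16.7
print(over_budget([14.0, 15.2, 18.9, 16.1], fps=60))  # [2]: that frame would drop
```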

GPU Profiler

Graphs, in milliseconds (ms), the journey of a frame from the time it is rendered through to being presented for output, relative to the present time, which is 0.

  • Ready: the time at which the frame was ready (rendered). Whenever a frame is rendered, it is ‘ready’. ‘Ready’ shows how many milliseconds in advance the frame was ready before it was actually presented. This value will vary based on the load on Disguise.
  • Queued: the time the frame was added to the present queue. The present queue is Disguise notifying Windows that this is an upcoming frame. This will most likely be identical to ‘ready’, as the frame is added to the queue as soon as it is rendered. The time may be later if the queue is full and the frame can’t be immediately added to the queue.
  • Estimated: the time at which, given the length of the queue when the frame was added, it is estimated that the frame will be presented.
  • Present: the time at which the frame was actually presented, which is always 0. All other graphed values are relative to this.

If Disguise is running slow, frames will not be ready in time: ‘ready’ will sit too close to 0, or after 0, which means frames are being dropped. You will also see that when running in low or ultra low latency mode there is less time from ‘ready’ to ‘present’, and that in half speed mode the frame is ready twice as early as in full speed mode.
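To make the convention concrete, here is a sketch of a single frame’s timeline with invented values, everything in milliseconds relative to present = 0:

```python
# A healthy frame: ready well before present; the values are made up.
frame = {
    "ready":     -12.0,   # rendered 12ms before it was presented
    "queued":    -12.0,   # added to the present queue as soon as it rendered
    "estimated": -0.5,    # predicted present time, given the queue length
    "present":    0.0,    # actual present: the reference point for everything
}

headroom = frame["present"] - frame["ready"]
print(f"ready {headroom}ms before present")   # ready 12.0ms before present
if frame["ready"] >= 0:                       # at or after 0 means it was late
    print("frame was late: dropped")
```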

Frame latency

Graphs the frames per second (FPS) per head.

Use this graph to see if you are dropping frames on any output head.

fps graph

  • Memory: how much memory Disguise is using.
  • Delta: the difference in memory use from one frame to the next.

You can see the values change as you add and remove resources, i.e. during many of the normal actions of setting up and programming a showfile.

This graph should not usually be needed, but it can be used when experiencing crashes to check that they are not memory related. If there is a memory leak, the delta will continuously spike in an upward trend.
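As a loose illustration of that heuristic (the thresholds and samples are invented), a leak shows up as deltas that keep trending upward:

```python
def looks_like_leak(memory_mb, min_growth_mb=1.0):
    """True if memory grows between most samples and never settles back."""
    deltas = [b - a for a, b in zip(memory_mb, memory_mb[1:])]
    growing = sum(1 for d in deltas if d > 0)
    net_growth = memory_mb[-1] - memory_mb[0]
    return growing > len(deltas) // 2 and net_growth > min_growth_mb

steady  = [4096, 4102, 4098, 4101, 4097]   # settles around a level
leaking = [4096, 4110, 4125, 4142, 4160]   # delta spikes in an upward trend
print(looks_like_leak(steady))             # False
print(looks_like_leak(leaking))            # True
```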

Process Memory

  • Read: the data the CPU is reading from PCI devices.
  • Write: the data the CPU is writing to PCI devices.
  • Other: the number of bytes transferred during operations other than read and write.

It is very unlikely that this graph will be needed in normal operation, but it could be used if a fault with a PCI device or the motherboard is suspected.

PCI/CPU IO 

Shows the read and write disk speed in MB/s.

This could be used to see whether you are bottlenecked somewhere, e.g. the read speed going up and then plateauing.

Disk 

The CPU graph is cumulative.

  • Working time: the amount of time spent on the CPU.
  • Present wait time: the amount of time spent waiting to present. When running a project at normal latency this should be flat, but when running in ultra low latency this series will be ‘spikey’, as there is no queue.
  • Total time: the cumulation of working time and present wait time; it should always sit beneath the ms required at your FPS.

In a normal setup, total time should be flat because Disguise takes the full CPU cycle to present a frame. At the end of every update loop, something is given to the GPU to present. If that loop is finished early, then there has to be a wait before there is a slot available in the queue.

If the CPU working time is high and the present wait time is low, the queue is empty. If frames are dropped in this scenario, it is because of CPU load.
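A small sketch of that trade-off, with illustrative numbers rather than values read from the graph:

```python
def diagnose(working_ms, wait_ms, fps=60):
    """Classify one CPU-graph sample against the frame budget."""
    budget = 1000.0 / fps
    if working_ms + wait_ms > budget:
        return "over budget: dropping frames"
    if wait_ms < 1.0:                 # queue empty: almost no present wait
        return "CPU-bound: any drops here are due to CPU load"
    return "healthy: loop finishes early and waits for a queue slot"

print(diagnose(working_ms=9.0,  wait_ms=7.5))   # healthy ...
print(diagnose(working_ms=16.2, wait_ms=0.3))   # CPU-bound ...
print(diagnose(working_ms=18.0, wait_ms=0.1))   # over budget ...
```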

CPU

This graph shows the amount of memory that the GPU is currently using.

The GPU is responsible for dealing with its own memory allocation, but Disguise tries to be as efficient as possible with the GPU memory it requires.

All textures that are rendering onscreen in the GUI are bitmaps that are uploaded to the GPU. This includes all GUI elements, as well as any media that is playing back.

Disguise tries to reduce the amount of GPU memory in use by evicting textures when they are no longer needed. If textures are not used for a while, they are evicted. Conversely, some textures, e.g. LUTs, are expensive to upload to the GPU, but once they are uploaded they are fast to use. For this reason, Disguise tells the GPU to keep these textures alive.

This graph is not commonly needed, but if a bug such as a memory leak is suspected, a constantly increasing GPU Memory value is useful information for reporting it. GPU Memory will increase as textures are uploaded, and decrease as they are evicted.
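The eviction behaviour can be pictured with a small cache sketch; the class, names and idle threshold below are assumptions for illustration, not Disguise’s implementation:

```python
import time

class TextureCache:
    """Evicts idle textures, but keeps 'keep-alive' ones (e.g. costly LUTs)."""
    def __init__(self, idle_seconds=5.0):
        self.idle_seconds = idle_seconds
        self.textures = {}                    # name -> (last_used, keep_alive)

    def use(self, name, keep_alive=False):
        self.textures[name] = (time.monotonic(), keep_alive)

    def evict_stale(self):
        now = time.monotonic()
        for name, (last_used, keep_alive) in list(self.textures.items()):
            if not keep_alive and now - last_used > self.idle_seconds:
                del self.textures[name]       # frees that texture's GPU memory

cache = TextureCache(idle_seconds=0.1)
cache.use("gui_button")                       # ordinary texture
cache.use("grade_lut", keep_alive=True)       # expensive upload: kept alive
time.sleep(0.2)
cache.evict_stale()
print(sorted(cache.textures))                 # ['grade_lut']
```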

GPU Memory

actualFrameTime is the clock of the next upcoming present, and the delta is the amount of time that passes between consecutive frame times.

The delta should remain constant, at a value dictated by the FPS of the system.
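A quick worked example of that delta at 60fps:

```python
fps = 60
frame_period_ms = 1000.0 / fps                         # ~16.67ms per frame
frame_times = [n * frame_period_ms for n in range(5)]  # successive actualFrameTime values
deltas = [b - a for a, b in zip(frame_times, frame_times[1:])]
print([round(d, 2) for d in deltas])                   # [16.67, 16.67, 16.67, 16.67]
```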

Timer

There are two places where we can find RenderStream graphs:

  1. The Details view of one of the machines in a workload
    • shows engine and stream profiling for a specific render node machine
    • can be accessed via the ‘Details’ button next to a specific machine in the workload widget

Workload widget and the stream profiling graphs that are accessible from it

  2. The ‘RenderStream’ tab of the Monitoring Manager
    • lists all graphs for all streams and workloads
    • graphs are grouped by actors receiving the streams
    • can be accessed via the Monitoring Manager (left-click on FPS bug in Disguise)

Graphs available for RenderStream in the Monitoring Manager

Find RenderStream Graphs

Engine-agnostic profiling data captured by the RenderStream plugin from the render node.

  • Time in queue: the time a frame request spends waiting to be picked up for processing by the workload instance.
  • Engine CPU time: how much time the frame request spends being processed; the time between the frame request being picked up for processing and the call to send the resulting frame back to disguise.
  • Time until frame ready: how long it takes to send the frame back.
  • Discarded: the frames dropped by the render node.
  • Send failed: problems in sending the frames back to disguise.

Normally, Time in queue should be minimal, with most of the time spent generating the frame, as indicated by Engine CPU time (ms). No requests should be discarded or marked as send failed.

A high Time in queue means the engine can’t keep up with the rate of incoming requests from disguise. Eventually, if too many requests are waiting in the queue, the render node will discard requests, which will be reflected in the Discarded line in the graph.

A high Time until frame ready indicates sending/networking issues.
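A compact sketch of how those readings combine (the metric names and thresholds here are illustrative):

```python
def read_profiling(time_in_queue_ms, engine_cpu_ms, frame_budget_ms=16.7):
    """Interpret one profiling sample the way the paragraphs above describe."""
    if time_in_queue_ms > frame_budget_ms:
        return "engine can't keep up: expect the Discarded line to rise"
    if engine_cpu_ms > frame_budget_ms:
        return "frame generation itself is too slow for the target fps"
    return "healthy: queue time minimal, most time in Engine CPU time"

print(read_profiling(time_in_queue_ms=0.4,  engine_cpu_ms=11.0))  # healthy
print(read_profiling(time_in_queue_ms=40.0, engine_cpu_ms=15.0))  # can't keep up
```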

Profiling graph

Engine-specific profiling information for a workload instance. This data is specific to each RenderStream plugin implementation, and may be missing entirely depending on the implementation.

The data we get depends entirely on what engine data a particular plugin implementation is tapping into. In the case pictured above, we have Unreal Engine with granular data about how much time each step of the UE pipeline is taking. Abnormally high values in any of these sections could indicate high CPU usage (blueprints, geometry, physics, etc.), high GPU usage (resolution, post-processing, etc.), or other issues. For example, a high Unreal Idle Time could indicate network slowdowns in nDisplay synchronisation.

Stream graph

Data generated by the receiver of a stream (actor/director) describing the latency of the stream.

  • Render frame delta: the difference in timestamps between an arriving frame and the frame that arrived before it.
  • Latency: the actual latency of a stream; the time elapsed between disguise requesting a frame and then receiving back the frame for that request.
  • Active latency: the current active latency, which is based on the slowest stream in the session.

Render frame delta should be constant (at 60fps, the delta should be 16.6ms). Fluctuations identify either dropped frame requests or an unstable framerate on the RenderStream controller machine.

Latency should not be too far off from the active latency. If there is a stream that is significantly faster than the other streams, its latency will show as much lower than the active latency. If the difference between the two is greater than 4 frames’ worth of time, effort should be made to better balance the workload. Also, if using a manual active latency override, this could be an indication that the override value is not accurate.
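As a rough check of that 4-frame rule of thumb (the names and numbers are invented):

```python
def needs_rebalancing(stream_latency_ms, active_latency_ms, fps=60, max_frames=4):
    """True if a stream runs more than max_frames ahead of the active latency."""
    frame_ms = 1000.0 / fps
    return (active_latency_ms - stream_latency_ms) > max_frames * frame_ms

print(needs_rebalancing(stream_latency_ms=30.0, active_latency_ms=120.0))  # True: rebalance
print(needs_rebalancing(stream_latency_ms=95.0, active_latency_ms=120.0))  # False
```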

Active latency and latency should not fluctuate wildly. That could indicate optimisation is required on the asset side.

Render skew graph

This graph shows how well temporally matched the frames received across the streams of a workload are.

Shows how close the displayed frames of a RenderStream workload are to the active latency prediction. This is the difference between the expected timestamp based on the active latency and the actual timestamp of the closest matching group of frames.

Within the best matching group of frames, these lines indicate the individual stream errors.

Ideally, the active latency error should be zero, with the individual stream errors being less than a frame’s worth of time (a rough check is sketched after this list).

  • If individual streams have an error greater than a frame’s worth of time, this could be perceived as a tear between fragments that belong to the same actor.
  • If the active latency error differs across multiple subscribing actors (across multiple instances of this graph), this could be perceived as a tear on the edge between the outputs of two actors.
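Here is the rough check referenced above, for a single actor’s streams; the stream names and error values are invented:

```python
def tear_risks(stream_errors_ms, fps=60):
    """Streams whose error exceeds one frame's worth of time (tear risk)."""
    frame_ms = 1000.0 / fps
    return [name for name, err in stream_errors_ms.items() if abs(err) > frame_ms]

errors = {"stream_a": 2.1, "stream_b": -1.3, "stream_c": 24.0}
print(tear_risks(errors))   # ['stream_c']: possible tear within this actor
```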

Response skew graph

Shows how close the currently processed RenderStream response is to the active latency prediction. This is the difference between the expected timestamp based on the active latency and the actual timestamp of the closest matching group of frame responses (frame-sent notifications from render nodes).

This can be used as another indication of how well the active latency estimation fits a stream. Ideally, the error in this graph should be less than a frame’s worth of time. More than a frame of error could indicate network issues on the d3net network adapter, as it is the adapter used for sending RenderStream frame requests and receiving the frame-sent notifications from render nodes.