Introducing Depth Mode

Depth mode is a setting in ControlNet that interprets the overall foreground and background of a provided reference image. It seeks to accurately map all of the depth-of-field details in that image.
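
To make the idea of mapping depth concrete, here is a minimal sketch of how a grid of distances could be turned into a grayscale map. It is an illustration only, not how ControlNet computes its map internally. The function name `to_depth_map`, the sample `scene`, and the "nearer is brighter" convention are all assumptions made for this example.

```python
def to_depth_map(distances):
    """Convert a grid of camera distances into 0-255 grayscale values.

    Assumed convention for this sketch: nearer surfaces are brighter,
    farther surfaces are darker.
    """
    flat = [d for row in distances for d in row]
    nearest, farthest = min(flat), max(flat)
    span = farthest - nearest
    if span == 0:
        # Every pixel is at the same distance, so there is no depth to encode.
        return [[0 for _ in row] for row in distances]
    return [
        [round(255 * (farthest - d) / span) for d in row]
        for row in distances
    ]


# Hypothetical 3x3 scene: a near object (distance 1.0) in front of a
# far wall (distance 5.0), with a mid-distance surface (4.0) below it.
scene = [
    [5.0, 5.0, 5.0],
    [5.0, 1.0, 5.0],
    [5.0, 4.0, 5.0],
]
depth_map = to_depth_map(scene)
```

In this toy scene the near object becomes the brightest pixel and the far wall the darkest, which is the kind of foreground and background separation depth mode looks for in a reference image.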

What are good examples of reference images?

     Depth mode relies on information such as distinct foreground and background elements, shadow, light, and shading to create its mode map. These elements provide a blueprint of what depth mode should highlight and what it should keep the same. See some of our examples below:

Examples of good references

     Depth mode is a very flexible tool. However, to get the best results, it is important to include shadow and depth of field information in your reference images. The best results in depth mode tend to come from 3D render screengrabs. That said, photos, shaded illustrations, and other complex images can also provide strong results.




Examples of bad references

     There are very few bad use cases or reference images for depth mode, so long as you are trying to achieve results that align with what it does best. However, depth mode tends to perform most poorly with flat, unshaded illustrations, so we do not recommend using them. A shaded vector image, for example, may work, but it is not ideal.
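
     As a rough way to spot a flat reference before using it, the sketch below counts how many distinct brightness levels a grayscale image occupies. This is a hypothetical heuristic, not part of ControlNet: the function names, the bucket size, and the `min_levels` threshold are all assumptions chosen for illustration. It measures tonal variety only, so a flat illustration with many solid colors could still pass.

```python
def tonal_levels(gray_pixels, bucket_size=16):
    """Count how many brightness buckets a grayscale image occupies.

    gray_pixels is a list of rows, each a list of 0-255 integers.
    """
    buckets = {value // bucket_size for row in gray_pixels for value in row}
    return len(buckets)


def looks_flat(gray_pixels, min_levels=4):
    """Flag images with very few tonal levels as likely flat or unshaded.

    min_levels is an arbitrary illustrative threshold, not a tuned value.
    """
    return tonal_levels(gray_pixels) < min_levels


# Hypothetical samples: a two-tone illustration and a smoothly shaded render.
flat_illustration = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]
shaded_render = [
    [20, 60, 110, 160],
    [40, 90, 140, 230],
]

flat_result = looks_flat(flat_illustration)    # True: only two tonal levels
shaded_result = looks_flat(shaded_render)      # False: eight tonal levels
```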




Feature highlights

     Every mode in ControlNet has different features. We’ve shared the primary feature highlights below.

Nuanced composition

     Out of the three most basic settings (structure, segmentation, and depth), we find that depth mode lands squarely in the middle in its compositional nuance. We can illustrate that by showing an example of one image run through each mode.

Depth mode maintains an even balance between the information it gets from the prompt and model and the information it gets from the original reference image.



In contrast, structure mode pulls as much detail as possible from the original image, which leaves the prompt and model with less impact.



Finally, segmentation mode pulls only the shapes that are defined by the edges in an image, relying more heavily on information provided by the prompt and model.



Attention to shaded details

     As we can see in the golem example, the depth and shading of an image are the aspects most keenly highlighted when using depth mode. This emphasis is unique to depth mode and, to a lesser degree, normal maps. We can see the importance of shading here:



Moderate linework coherence

     Depth mode still retains a decent amount of linework from the original image. As we saw above, this is nowhere near as extreme as structure mode or line art mode. However, it is enough detail to greatly impact the output.



Shadows and highlights

     Most modes are well suited to a multitude of styles. There is a misconception that depth mode is only useful for styles that resemble 3D. However, using the image of the woman’s face from before, we can see that instead of forcing a 3D perspective, depth mode contributes nuanced, expressive shadows and highlights to the final output. It is useful to think about your light source when you are choosing a reference image.
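
     One simple way to think about the light source in a reference is to check which side of the image is brightest. The sketch below is a hypothetical helper, not a ControlNet feature: `brighter_side` and the sample pixel grid are assumptions made for illustration, and comparing average brightness is only a crude stand-in for real light direction.

```python
def brighter_side(gray_pixels):
    """Guess where light falls by comparing average brightness of image halves.

    gray_pixels is a list of rows of 0-255 integers, at least 2x2 in size.
    Returns a (horizontal, vertical) pair such as ("left", "top").
    """
    height = len(gray_pixels)
    width = len(gray_pixels[0])

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    half_w = width // 2
    half_h = height // 2
    left = mean(v for row in gray_pixels for v in row[:half_w])
    right = mean(v for row in gray_pixels for v in row[width - half_w:])
    top = mean(v for row in gray_pixels[:half_h] for v in row)
    bottom = mean(v for row in gray_pixels[height - half_h:] for v in row)

    horizontal = "left" if left >= right else "right"
    vertical = "top" if top >= bottom else "bottom"
    return horizontal, vertical


# Hypothetical 4x4 grayscale crop, brightest toward the upper left.
face = [
    [200, 180, 90, 60],
    [190, 170, 80, 50],
    [150, 130, 60, 40],
    [140, 120, 50, 30],
]
light_guess = brighter_side(face)
```

     For this sample the guess is the upper left, which suggests the highlights in the output will tend to fall there as well.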



     Depth mode is a versatile tool that is useful across a number of styles and use cases. As a reminder, here are a few important points to think about when you are choosing your reference image:

  • Depth of field
  • Details present in your reference image
  • Light source and shading in your reference image

Thanks for reading, and enjoy creating with depth mode!