Creating Food Assets Using Phone Photos

Using reference images is a powerful way to create assets from real world examples. In this article we explore how to leverage simple mobile phone photos of food and transform them into beautiful assets.

Choosing Your Model

For this article, I am going to stick with a general concept model trained in a simple, familiar style. As you can see from my training images below, it does not take much to build a good style model:

For this particular model, I really wanted to lock in the style, so I also included a very wide variety of input images. It is better to avoid too many images of the same subject when you are trying to train a style, as the AI will pick up on anything that is consistent in your dataset.

Here are my training parameters for this dataset training on SD 1.5:

  • Training Class: Art Style > Illustration
  • Training Steps: 600
  • Learning Rate: 2e-6
  • Text Encoder: On
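The parameters above can be captured in a small settings dictionary for reference. This is purely illustrative: the key names below are my own shorthand, not Scenario's actual API or configuration fields.

```python
# Hypothetical settings dictionary mirroring the training parameters above;
# the key names are illustrative only, not Scenario's real config fields.
training_config = {
    "base_model": "SD 1.5",
    "training_class": "Art Style > Illustration",
    "training_steps": 600,
    "learning_rate": 2e-6,
    "text_encoder": True,  # text-encoder training enabled
}
```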

The Reference Images

In this workflow we are going to rely heavily on reference images. As you can see below, simple reference images are all you need. This is ideal, because it is always wise to avoid using reference images you do not have a license for, even if they will be transformed.


It’s helpful to have a mostly blank or solid-color background, as you’ll see in the results. Let’s walk through the process with one of the photos.

Making an Artichoke Asset

First we will start with this image of an artichoke - you don’t need any fancy equipment to reproduce this quality of image. You just need good natural light, a mobile phone, and a clean surface.

Next, pick your model on Scenario and upload this image in the ‘Reference Image’ slot. Go ahead and remove the background. You won’t need to adjust the aspect ratio, as long as you centered the object when you took the photograph.

For the next step, simply write the following prompt:
[name of object], game asset, solid color background
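If you are batch-processing many photos, the template above is easy to fill programmatically. This is a tiny sketch of my own; the helper name is hypothetical and the prompt string is simply the template from this article.

```python
def build_asset_prompt(object_name: str) -> str:
    # Fill in the prompt template from the article:
    # "[name of object], game asset, solid color background"
    return f"{object_name}, game asset, solid color background"

print(build_asset_prompt("artichoke"))
# artichoke, game asset, solid color background
```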

The next steps are easy - simply prompt and adjust the results. You will see some example results and ControlNet settings in the grids below. I recommend Segmentation or Depth mode for objects such as these. Plain Img2Img also works well, which you can activate simply by setting the ControlNet mode to “Disabled”.

Here are some tips:

* More complex objects will likely need a higher Img2Img influence, as you can see below with a few of the items such as Les Calissons de Provence. In that case I had to increase the influence considerably to get the result I wanted.

* If your concept is simple, you can use a lower influence. Setting it too high will retain a lot more realism when the object is easy for the AI to understand.

* Structure mode will stay truer to the input image, whereas Depth mode will carry more of the model’s style.

* ControlNet does not take solid backgrounds into account, so it is recommended that you use our background removal tool during post-processing.

* It is wise to play around with guidance, sampling steps, and other settings. Every model is a little different.
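The exploratory loop in the tips above can be sketched as a simple sweep over influence values. Everything here is an assumption for illustration: `generate()` is a placeholder for whatever generation step you use (UI or API), not a real Scenario function, and the value ranges are rough starting points, not official recommendations.

```python
def influence_candidates(complex_object: bool) -> list[float]:
    # Complex objects tend to need a higher Img2Img influence to hold their
    # shape; simple objects can stay lower so the trained style dominates.
    return [0.7, 0.8, 0.9] if complex_object else [0.3, 0.4, 0.5]

# Try each candidate and keep the best-looking result. generate() is a
# placeholder for your actual generation step, not a real Scenario call.
for influence in influence_candidates(complex_object=True):
    pass  # generate(prompt, reference_image, influence=influence)
```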

Take a look at the examples below. As you can see, different objects benefit from different settings, so there is usually an exploratory phase when converting photos into assets.