Step-by-Step Guide to Custom Concept Art Model Training with SD 1.5

In this tutorial, you will learn the basics of how to train a custom style model on Scenario! We will be showing this process through the web app, which is the recommended interface for training custom models.

When you begin training a custom model, you can train a style, a subject, or a hybrid. Training assets requires some practice and finesse. It's important to understand how the categories break down:

  • Style training refers to an overarching style or design. This is the way that the subject matter looks, and it can connect many different subjects and concepts. Style tends to be the most straightforward with the least room for error.
  • Subject training refers to a specific focus in a designed piece. This is the "what" of an image composition. Subjects can be characters, objects, landscapes, or even abstract shapes. Most importantly, subjects are specific and identify a part of the overall composition.
  • Hybrid training is what most people are doing when they think they are training a subject. A hybrid training contains style and subject elements, and merges the style with a subject. Even within a hybrid training, your aesthetic and design elements should not be considered the same thing as your subject.

     Whenever you train a custom model of any kind, you are training tokens. A token is a word or phrase that the AI associates with styles, shapes, and patterns in its training memory. "Woman" or "dog" are examples of tokens. When you train a custom model, any tokens you use will take on new meanings for the AI.

Start with Style First

     When you begin using Scenario for a long-form development project, it is best to start with a style training on your existing assets. We recommend training your style first because:

  • Starting with a style training will help you establish a consistent look and feel for your project, which will be easier to maintain and build upon as you create more content.
  • Training a style model first will also allow you to experiment with different looks and styles for your project, without committing to a specific subject or theme. This can be especially useful if you are still exploring different creative directions for your project.

     Overall, starting with a style training can be a good way to get a feel for the Scenario platform and how it works, and can also help you lay the foundation for your long-term development project.

What is an Art Style?

     An art style refers to the overall aesthetic and visual characteristics of a piece of art or a body of work. It can include elements such as the use of color, line, composition, and form. An art style can be associated with a specific period, movement, artist, or cultural tradition. It can also be a personal style developed by an individual artist over time.


Take this picture for example:

When you look at this illustration, what is the subject?


Is it the girl?


And if she is the subject, is her horned helmet such an important part of her identity that it can be considered a part of her? Or is the helmet a separate object?

 

Is this a part of a series in which all human characters wear horned helmets?

And in the last case, would that make it a style?

     As you can see, defining the style of your model is an important step in the training process. Without knowing the context of this single piece of work, it is unclear whether the girl in the helmet with pigtails is the subject, or whether we are training the illustration style independent of specific subjects. It is therefore crucial to carefully review all images in your dataset when preparing to train.

Defining Your Style

     Games have a cohesive end-to-end style. Some games have a specific style for characters, interface, and environment. It is important that you have outlined what you are designing before you attempt to train a model.

     Style is contextual. Since style is the aesthetic glue that pulls together many different subjects and story elements in a piece of visual media, you need to have various examples of a style to properly train it.

You can have hybrids:

     This dataset is composed of many different isometric buildings. In the case of these images, the style includes the aesthetic, general shape, structure, and limited color variation. The subject of each image is unique, but fits into a general “building” subject category. These are more of a hybrid dataset, because the token being trained will help the AI remember both a subject and a style when it’s used in a prompt.

On the other hand, this sample of a dataset:

As you can see, these three images are very different. However, their aesthetic has a cohesive quality which allows a viewer to recognize they all belong in the same world.

When it comes to a cohesive experience, understanding how your style plays a role is critical. Most games either have:

  • A single cohesive style.
  • A variety of styles which range between characters, objects, and themes.

     With text encoding, which we will go over in a different tutorial, you can teach the AI how to differentiate between different micro-styles within a general aesthetic. In this tutorial we are going to walk you through how to train a style without text encoding. This is a great approach if you know that your game functionality needs many different, individual models, rather than one large model. However, it will take more training rounds to get each individual style, and you will need to rely on prompt engineering in some cases as you go through the process of creating your generative world.

Examples of Art Style Regularization Settings

     You will rely on regularization images to ensure a well trained model. Here’s a short walk through of the different options:

Character Design

     Character design is almost what one might consider to be a hybrid, but it is just general enough that we can consider it a proper style in this case. Use this if you are just focusing on the design of a stylized character set. Try to include as much variety as possible in your subject, while keeping your style really consistent. We will have an upcoming dataset creation guide for style which you can use to help with this.

                 

Concept Art

     This is a great general default if you aren’t sure what category your style falls under. To get a good concept art range, we recommend a lot of diversity between images. There isn’t an exact ratio, and it’s really dependent on your subject matter. The more balance you can have between the subjects represented in your dataset, the better.

Drawing/Illustration/Painting/Sketch

     These categories refer to types and qualities of aesthetics. While a drawing can be considered more middle of the road, a sketch would be far simpler, and an illustration would be more composed. Of course, a painting would provide reference points for texture and movement not seen in the other categories. If the medium being portrayed is more important than capturing the concept of a world-built experience, these are a good choice.

Pattern

     This setting is a great choice when you are looking at textures, design patterns, and other repetitive design aesthetics. While concept art could be considered a reflection of a world, a pattern represents a small but important repeating detail rule within the world.

Walkthrough - Training the Bubbleverse (Style Training)

     Now it’s time for a brief walkthrough of the process. We’ve created a sample dataset for style training. This style is meant to be general concept art for a potential 3D game. One might use art like this to sketch out ideas, pitch a game concept, develop base designs and create marketing materials.

Step 1: Gather the Dataset

     This dataset is designed to not only teach the AI how to make the subjects within it, but also give it enough context that it understands this style can translate to many different subject prompts which have not been trained into the model. A good general concept art model is able to produce subjects that don’t appear in the dataset in a completely cohesive style. Here are some good starting steps in creating your dataset:

  • Collect examples of the style you want to train from the game you are designing. Make sure to include a variety of different subjects, environments, and other elements.
  • When you have collected your examples, review them to determine which elements of the style are consistent across all of your examples. These are the elements that you will want to focus on when you begin training.
  • When you are training a style, it is important to pay attention to details such as color, composition, line work, and other visual elements. Make sure to provide enough examples of each of these elements in your training dataset.

     We have included a download link for our sample dataset, so you can practice along. As you can see, one of the keys to success is to have a broad range of subjects represented, including characters, landscapes, and objects. Sometimes there is a benefit to cutting out the background of the images in Scenario; however, this isn’t always the case. For concept art training, it is not recommended.
Download a Dataset

     We will be releasing a more in-depth series on how to curate a dataset. The main thing to keep in mind is that when you are training a style, you want a lot of diversity from image to image. If you have any images that are outliers (for example, if all of the images in your dataset are neutral tones, but you have one that is red), you will want to make sure you have enough training images that the red makes up only a small percentage of your dataset.

     For style, we recommend 30+ images. The more images you use, the more diverse you want your dataset to be. For example, if you have fifty images, divide the total number of images by the number of unique subjects you have. Use no more than that average for any individual type of subject. The more subjects you include, the less likely a single image is to overfit, or take over, your dataset.
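     The per-subject rule above is simple arithmetic, and a short Python sketch can make it concrete. This is only an illustration: the function names and the subject labels ("character", "landscape", "object") are hypothetical, and the labels are ones you would assign to your own images by hand. Nothing here is part of Scenario.

```python
from collections import Counter


def per_subject_cap(total_images, unique_subjects):
    """Average number of images per subject: total divided by unique subjects."""
    if unique_subjects < 1:
        raise ValueError("unique_subjects must be at least 1")
    return total_images / unique_subjects


def over_represented(subject_labels):
    """Return the subjects whose image count exceeds the dataset average.

    subject_labels holds one hand-assigned label per image (hypothetical labels).
    """
    if not subject_labels:
        return {}
    counts = Counter(subject_labels)
    cap = per_subject_cap(len(subject_labels), len(counts))
    return {subject: n for subject, n in counts.items() if n > cap}


# Fifty images spread over ten unique subjects gives an average of 5 per subject.
print(per_subject_cap(50, 10))  # 5.0

# A small, made-up dataset where characters dominate.
labels = ["character"] * 6 + ["landscape"] * 2 + ["object"] * 2
print(over_represented(labels))  # {'character': 6}
```

     If a subject shows up in the result, you can either remove some of its images or add more images of the other subjects until the dataset is balanced.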

Step 2: Upload Your Dataset

     Once you've identified your dataset, it's time to upload it. If you have images that are not a 1:1 ratio, you can use Scenario's cropping tool to adjust them in the web app. While removing the backgrounds of certain types of subject images can result in better outputs, this is generally not the case with general style images.
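     If you would rather prepare 1:1 images before uploading, the crop region is easy to compute yourself. The sketch below is not Scenario's cropping tool. It assumes a simple center crop, which may cut off important content near the edges, and the function name is hypothetical. The returned box uses the (left, top, right, bottom) order that image libraries such as Pillow expect for cropping.

```python
def center_square_box(width, height):
    """Return the (left, top, right, bottom) box of the largest centered square."""
    if width < 1 or height < 1:
        raise ValueError("width and height must be positive")
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# A 1920x1080 landscape image keeps its full height and loses 420 pixels per side.
print(center_square_box(1920, 1080))  # (420, 0, 1500, 1080)

# A 512x768 portrait image keeps its full width.
print(center_square_box(512, 768))  # (0, 128, 512, 640)
```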

     From the Models Page, click New Model, and then click Start Training. Name your model and select SD 1.5.

The descriptions of the images are auto-generated and are called captions. Advanced users can hand-caption their dataset images, but this is not recommended for beginners. For more information click here

Step 3: Choose your Regularization Class and Category

     Next, choose a regularization class from the dropdown menu (e.g. Art Style, Concept Art), and select a category.


Step 4: Training Settings

     If this is your first time training a custom model, we recommend that you stick to the default settings. We will cover the various settings in more detail in future guides; however, 100 steps per image is a good rule of thumb.
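     As a quick illustration of the rule of thumb above, the total number of training steps scales with the size of your dataset. The helper below is a hypothetical sketch, not a Scenario setting or API. It simply multiplies the image count by 100 steps per image.

```python
STEPS_PER_IMAGE = 100  # rule of thumb from this guide


def total_training_steps(num_images, steps_per_image=STEPS_PER_IMAGE):
    """Estimate total training steps as images multiplied by steps per image."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    return num_images * steps_per_image


# A 30-image style dataset works out to 3000 steps.
print(total_training_steps(30))  # 3000

# A 50-image dataset works out to 5000 steps.
print(total_training_steps(50))  # 5000
```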



     Now you can go ahead and Start Training. Training takes roughly 20 minutes to 2 hours, depending on the number of images in your dataset. When the training is complete, you will be able to test your model and generate images using prompts.

     Keep in mind that the quality of your output images will depend on the quality and diversity of your dataset, as well as the number of training steps. If you are not satisfied with the results of your training, you may need to adjust your dataset and try training again.

     We recommend that you read our tutorials on dataset curation and prompt engineering for more tips on how to improve the quality of your output images.
