Introduction to Scripthea

Scripthea is a freeware Windows application for text-to-image prompt engineering.

The text-to-image AI generation domain has been exploding for the last year or two, and I find its abilities fascinating. The Scripthea application is my contribution to that dynamic domain, and I have had fun developing it. The software provides a systematic approach to composing the text prompt (aka prompt engineering). Briefly, prompt = cue + modifiers. You will be offered collections of short descriptive texts (cues) and categorized collections of modifiers, such as painter, art style, time period, etc. To unlock the capabilities of Scripthea you need to install Stable Diffusion (ComfyUI or AUTOMATIC1111). After an image is generated, its prompt and generation settings become part of a collection (image depot), which you can browse with a convenient image viewer on the second tab. On top of that, you can also "scan". After you select some cues and some modifiers, Scan will combine them (there are rules). For example, you may want to see how a particular painter would paint different subjects (cues), or how a specific topic would be painted by different painters. Scripthea will generate all the combinations for you (scan), query the active API, and put the resulting images in the working image depot.
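
As a rough illustration of the "prompt = cue + modifiers" idea and of the painter-versus-subject example, here is a minimal Python sketch. It is not Scripthea's own code: the function name compose_prompt, the comma separator, and the sample cues and painters are all assumptions made for the example.

```python
from itertools import product

def compose_prompt(cue, modifiers, separator=", "):
    """Join one cue with its modifiers (the separator is an assumption)."""
    return separator.join([cue, *modifiers])

# Hypothetical sample data, not taken from Scripthea's collections.
cues = ["a lighthouse in a storm", "an old man reading"]
painters = ["by Claude Monet", "by Edvard Munch"]

# Pair every cue with every painter, one painter per prompt.
prompts = [compose_prompt(cue, [painter]) for cue, painter in product(cues, painters)]
for prompt in prompts:
    print(prompt)
```

Each printed line is one prompt that could be sent to the image generator; two cues and two painters give four prompts.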

The software is distributed as freeware and is an open-source project (under the MIT license) hosted in a GitHub repository.

New! Scripthea now includes external prompt collections with tens of thousands of prompts (see here), more to come...

See a short introduction video-clip HERE or on YouTube.
Read some reviews:
  • Scripthea Makes AI Art Creation Effortless. by Ramakanth (here)
  • Scripthea: Unleashing Your Inner Artist with Text-to-Image AI. by Nayedeals (here)
  • A comprehensive prompt composer tool to help with image-based generative AI, providing a methodical approach to composing complex prompts with minimal effort. by Robert Condorache (here)

What is it for?

Traditional approach

Text-to-image generation has become a common tool for anybody trying to visualize their thoughts or tastes, or just playing around out of curiosity. As the domain develops, a number of guides, tutorials and lists of tips accumulate.
Common tips for writing good prompts include: state the main subject of your image; give as many details as possible in clear terms, including lighting; describe the composition/framing, the mood and your preferred style.
The traditional approach is suitable for realistic and arty images.

Within this approach Scripthea can help you a lot, with plenty of cues (proto-prompts) and a more-than-you-will-ever-use number of modifiers. The most powerful Scripthea feature in this regard is its ability to easily set up iterations (scans) over cues and/or modifiers (see Scenario #2). The scans will give you an increasingly good feeling for how the model you are currently using reacts to different cues and combinations of modifiers. There will always be some chance involved in the creation process, but the aim here is to minimize it. Another way to instruct Stable Diffusion to follow your description more strictly is to increase the CFG (classifier-free guidance) scale, relative to the usual value for that model.

Specialized and not-so-specialized websites offer a plethora of images created by people who follow the traditional approach. If you are one of these people, you may try Scripthea out of curiosity, or make use of its well-developed tools for viewing and managing image collections. There is integrated Python scripting for more advanced users.

All this can be very productive but relies on two conditions:
1. you have a very good visual idea of what your intended image is supposed to be
2. you possess enough control over the instrument used to express that idea (the language), in vocabulary and nuance, to describe your intentions as adequately as possible.


Inspiration way

Scripthea offers an alternative approach for those of you who don't meet one or both of the conditions mentioned above. The only prerequisite is: you need to know what you like (or dislike). It seems easy but because the approach is more passive than the traditional one, it takes more time (trials) to navigate towards what you may consider a hit. In this way Scripthea is especially useful for the creative souls who don't speak English as a first language.

How does it work?

At the beginning you have to pick a subject or a theme for your image. Scripthea offers 31 lists of cues (proto-prompts). Each list covers a particular subject (the clue is in the name). The important point here is to select cues that seem figurative, metaphorical or ambiguous in some way. That is a critical part of Scripthea's inspiration approach to composing text-to-image prompts. The aim is not so much to describe the image in detail as to use the cue to "inspire" the model or, one may say, to challenge the model's "creativity". The modifiers give the model a context or guideline in a direction you would like to explore. You may direct Stable Diffusion to relax the literal interpretation of your prompt by decreasing the CFG scale a notch. As you may guess, that makes the inspiration way much more suitable for creating art than realistic images.

Generally you start from a scan set (scan preview tab) with more cues (10-100) and 1 to 3 modifiers, and you generate your first iteration (an image depot). Then you filter the resulting image depot and use the prompts of the selected images as cues in a new scan set with some new modifiers. That way you iterate (2 to 5 times) towards an outcome you really like. After that you can (optionally) go for upscaling, or to an image editor for final touches (see Scenario #3).

Counting on the model's "creativity" will pleasantly surprise you, but it also means counting on some chance/randomness, which is why this approach is expected to be lengthier. Still, I found the process to be a lot of fun, with a good feeling of creative collaboration with the people behind the model.

Here is the prompt composer tab...

On the left, you see the log panel, which will inform you about any ongoing operations. For prompt composing, there are two modes: Single and Scan. In Single mode, you can use one cue with more than one modifier. In Scan mode, you can select any number of cues and any number of modifiers, although each prompt will be a combination of one cue + all Fixed modifiers + a number (the modifiers sample number) of Scannable modifiers. Modifiers are divided into switchable (on/off) categories. If you wonder about any modifier, hover over it; there is a hint for most of them. If you right-click on any modifier, you will be asked to confirm a Google search for that modifier. In the options, you can specify the image depot folder where the images from your scan (or a single query) will go.
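
The Scan rule described above (one cue + all Fixed modifiers + a number of Scannable modifiers) can be sketched in Python as follows. This is only an illustration under stated assumptions, not Scripthea's implementation: the function name scan_prompts, the comma separator and the sample data are hypothetical, and the sketch assumes that a scan enumerates every combination of the given size, which the text does not confirm.

```python
from itertools import combinations

def scan_prompts(cues, fixed, scannable, sample_size, separator=", "):
    """Build one prompt per (cue, combination of `sample_size` scannable modifiers).

    Every prompt contains exactly one cue and all fixed modifiers.
    Enumerating all combinations is an assumption of this sketch.
    """
    prompts = []
    for cue in cues:
        for chosen in combinations(scannable, sample_size):
            prompts.append(separator.join([cue, *fixed, *chosen]))
    return prompts

# Hypothetical sample data.
demo = scan_prompts(
    cues=["a city at dawn", "a forest path"],
    fixed=["oil painting"],
    scannable=["impressionism", "art deco", "baroque"],
    sample_size=2,
)
for prompt in demo:
    print(prompt)
```

With two cues and two modifiers taken from three Scannable ones, the sketch yields 2 x 3 = 6 prompts, which shows how quickly a scan grows as cues and modifiers are added.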

All the options, external and internal sizes and main window position are saved on application closing and retrieved on starting.

...and that is the image depot viewer

The viewer shows a Scripthea image depot (a folder with a bunch of images and a description.idf file). You can select an image depot folder from the directory browser on the left while the rim of the image depot text box is highlighted (in navy). Check the Viewer page for more details about the directory browser. You can choose between table view and thumbnail (grid) view. In the grid view you can adjust the thumbnails from the menu (bottom left button). You can move around with the arrows at the bottom, all self-explanatory (I think). The only operation other than viewing that you can do here is to delete an image. At the very bottom, common to both views, is the find panel, which will find one or more words in the prompts of the active image depot and select the match. The Mark button will highlight the prompts/images that meet the same criterion. The shown image itself can be zoomed in/out (buttons), panned (scroll bars) or fit (the middle button); more tools are coming...

Image Depot Master

Image Depot Master (IDM) is an image depot manager for copying and moving images from one image depot to another (or to an empty folder), as well as for deleting images from an image depot.

It provides an option to validate image depot consistency, that is, to erase entries in description.idf that have no corresponding images. More thorough than validating is synchronizing (three-bar menu), which validates and also deletes all the images in the folder that have no entries.
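
To make the difference between validating and synchronizing concrete, here is a minimal Python sketch of the two operations. It is not the IDM code. It assumes that description.idf holds one JSON dictionary per line and that each dictionary stores the image file name under a key called "filename"; that key name, the function names and the list of image extensions are assumptions. The synchronize function deletes files, so try it only on a copy of a depot.

```python
import json
from pathlib import Path

IMAGE_TYPES = {".png", ".jpg"}  # assumed image extensions

def validate(depot, key="filename"):
    """Drop description.idf entries whose image file is missing; return kept entries."""
    depot = Path(depot)
    idf = depot / "description.idf"
    kept = []
    for line in idf.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # ignore blank lines
        entry = json.loads(line)
        if (depot / entry[key]).is_file():
            kept.append(entry)
    # Rewrite the file with the surviving entries only.
    idf.write_text("".join(json.dumps(e) + "\n" for e in kept), encoding="utf-8")
    return kept

def synchronize(depot, key="filename"):
    """Validate, then delete image files that have no entry in description.idf."""
    kept = validate(depot, key)
    listed = {entry[key] for entry in kept}
    for path in list(Path(depot).iterdir()):
        if path.suffix.lower() in IMAGE_TYPES and path.name not in listed:
            path.unlink()
    return kept
```

Validating only touches description.idf, while synchronizing also removes orphaned image files, which is why it is the more complete (and more destructive) of the two.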

A folder (an image depot or an empty one) is selected the same way as in the image viewer, and the two possible views, list and grid, are arranged similarly to the viewer.

The idea of using two panels to deal with files comes from the old Norton Commander file manager for DOS.

Import/Export utilities

The fourth tab of Scripthea contains an import utility for converting image files from some text-to-image generator (e.g. Stable Diffusion, Craiyon). The import utility will convert these images into a Scripthea image depot. The description.idf file is a text file where each line is a JSON-formatted property dictionary of a generated image, including the prompt. You can edit the file for any reason you like, as long as you keep the JSON structure. The export utility takes an image depot and exports a selected subset to another folder, with control over the exported file names and type (.png or .jpg). Optionally, Scripthea can create a webpage with the exported images for local browsing or for your website.
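
Because description.idf is plain text with one JSON dictionary per line, it is easy to read from a script. The following Python sketch parses such content and searches the prompts for a word. The key names "prompt" and "filename", the function names and the sample lines are assumptions made for the illustration; check a real description.idf for the actual property names.

```python
import json

def parse_idf(text):
    """Parse description.idf content: one JSON dictionary per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

def find_prompts(entries, word, key="prompt"):
    """Return the entries whose prompt contains `word` (case-insensitive)."""
    word = word.lower()
    return [e for e in entries if word in str(e.get(key, "")).lower()]

# Hypothetical file content; real depots may use different keys.
sample = "\n".join([
    '{"prompt": "a lighthouse in a storm, by Claude Monet", "filename": "0001.png"}',
    '{"prompt": "an old man reading, art deco", "filename": "0002.png"}',
])

entries = parse_idf(sample)
hits = find_prompts(entries, "monet")
print([e["filename"] for e in hits])
```

To work on a real depot, read the file with open(path, encoding="utf-8").read() and pass the text to parse_idf. If you edit entries and write them back, write one json.dumps(entry) per line so the JSON structure is kept.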

Contact

Keep in mind that the application is under active development, so I would appreciate any bug reports. Let me know HERE and I'll do my best to fix it ASAP. In the same way, you can communicate any ideas for improvement, experiences with the software or your willingness to help me with the project.
I would especially appreciate more cue collections, preferably organized by subject; see the cues folder for *.cues files.

Legal

The Scripthea software has been written by, and is copyrighted by, Teodor Krastev. The sources are distributed under the MIT open-source license.

The name "Scripthea"

The name Scripthea (or Scrip(t)Thea) comes from script (written text, from Latin scriptum) and Thea, the Greek goddess of sight and vision. In a way, you offer your script to the goddess Thea, or perhaps inspire her, and in return, you receive a vision.