Introduction to Scripthea

Scripthea is a free, open-source Windows application for text-to-image prompt engineering.

Scripthea is designed to streamline the process of crafting prompts for text-to-image AI generators such as Stable Diffusion. It offers a structured environment for building, testing, and refining prompts, which makes it a valuable tool for artists, designers, and AI enthusiasts seeking greater control over their creative output. At its core, Scripthea simplifies prompt engineering by breaking prompts down into two components: cues (a descriptive text or phrase) and modifiers (attributes such as style, lighting, or artist references). This modular approach lets users experiment with various combinations and explore visual styles and themes more systematically.
Scripthea operates in two modes. In Single mode, you use one cue with one or more modifiers. In Scan mode, you can select any number of cues and any number of modifiers, but each prompt combines only one cue with one or more modifiers. Cues and modifiers are organized into categorized lists. After an image is generated, its prompt and generation settings become part of a collection (an image depot), which you can browse with a convenient image viewer on the second tab. For example, you may want to see how a particular painter would paint different subjects (cues), or how a specific topic would be painted by different painters. Scripthea will generate all the combinations for you (a scan), query the active image generator API, and put the results in the working image depot.

New! Scripthea now includes external prompt collections with one and a half million unique prompts (see here).

New! Scripthea can now ask an LLM (via LM Studio) to tailor (fashion) your prompts in a particular way (see here).


See a short introduction video-clip HERE or on YouTube.
Read some reviews:
  • Scripthea Makes AI Art Creation Effortless. by Ramakanth (here)
  • Scripthea: Unleashing Your Inner Artist with Text-to-Image AI. by Nayedeals (here)
  • A comprehensive prompt composer tool to help with image-based generative AI, providing a methodical approach to composing complex prompts with minimal effort. by Robert Condorache (here)

The software is distributed as freeware and is an open-source project (under the MIT license) hosted in a GitHub repository.

What is it for?

Traditional approach

Text-to-image generation has become a common tool for anybody trying to visualize their thoughts or visions, or just playing around out of curiosity. From the very beginning, the common (traditional) approach has been this: you describe your vision in as much detail as possible, and the text-to-image generator translates that into an image. As the domain develops, guides, tutorials, and lists of tips keep accumulating. Common tips for writing a good prompt include: state the main subject of your image; give as many details as possible in clear terms, including the lighting; and describe the composition/framing, the mood, and your preferred style. The traditional approach is suitable for generating both realistic and arty images.

Within this approach, Scripthea can help you by cataloging your images, along with the prompts you write, in an image depot. A more than sufficient number of categorized modifiers is at your disposal. If you are not certain about the effect of some modifiers on the image, you can experiment: the most powerful Scripthea feature in this regard is the ability to easily set up iterations (scans) over a number of modifiers (see Scenario #2). The scans will give you an increasingly good feeling for how the text (cue) you are currently using reacts to different combinations of modifiers. There will always be some chance involved in the text-to-image creation process, but the aim here is to minimize it. Another way to instruct Stable Diffusion to follow your description more strictly is to increase the CFG (classifier-free guidance) scale (relative to the usual value for that model) in the SD parameters.

All this can be very productive, but it relies on two conditions:
1. you have a good enough idea of what the intended image is supposed to look like
2. you possess enough command of the instrument you use to express that idea (the language), in both vocabulary and nuance, to describe your intentions as adequately as possible.


Inspirational way

Scripthea offers an alternative approach for those of you who don't meet one or both of the conditions of the traditional approach. The only prerequisite is that you know what you like (or dislike). This way of using Scripthea is especially suitable for creative souls who don't have English as a first language, or who would like to explore the ambiguity of the English language as a means of expression.

How does it work:

At the beginning, you have to pick a subject or a theme for your image. Scripthea offers 31 lists of its own cues (marked in green in the Pool Map). Each list covers a particular subject (the clue is in the name). The important point here is that all of these cues were chosen to be figurative, metaphorical, or ambiguous in some way. That is a critical part of Scripthea's inspirational approach to composing text-to-image prompts. The aim is not so much to describe the image in detail as to use the cue to "inspire" the model or, one may say, to challenge the model's "creativity", or even to provoke it to hallucinate in some cases. The modifiers give the cue a context, or a guideline in the direction you would like to explore. You may direct Stable Diffusion to relax the literal interpretation of your prompt by decreasing the CFG scale parameter a notch. As you may guess, that makes the inspirational way much more suitable for creating AI art than realistic images.

Generally, you start from a scan set (scan preview tab) with more cues (10-100) and 1 to 3 modifiers, and you generate your first iteration (an image depot). Then you filter the resulting image depot and use the prompts of the selected images as cues in a new scan set with some new modifiers. That way you iterate (2 to 5 times) towards an outcome you really like. After that, you can optionally go for upscaling, or to an image editor for final touches (see Scenario #3).

While counting on the model's "creativity" will pleasantly surprise you, it also means counting on some chance/randomness, which makes this approach lengthier. Still, I found the process to be a lot of fun, with a good feeling of creative collaboration with the people behind the model.

Middle ground (external collections)

Writing the whole prompt yourself gives you a sense of control and satisfaction, but it can require a lot of work if you have an idea about a subject or style rather than a clear (concrete) picture in your mind. You can help yourself by borrowing prompts already written by others. Scripthea gives you access to a vast number (more than 1.5 million) of unique prompts in so-called external collections. While the big number is an advantage in terms of freedom of choice, it is overwhelming from a practical perspective. You need to extract some subset of prompts as cues and work with it. You have a variety of extraction options (see Ext.collections), but most importantly you can describe a subject, character, or style in your own words and run semantic matching (the semantic extension is required). Scripthea will evaluate the semantic similarities between the prompts from the ext. collection and your description, rank these similarities (cosine normalized distances), and extract the best matches into a cue list. Once you have a subset of reasonable size (typically between 30 and 300 cues), you can proceed by selecting different modifiers (fixed and scanned) to fine-tune your resulting images.
To summarize, you start your creative process traditionally, by borrowing prompts from an ext. collection using some idea of yours in the semantic extraction, and you finish or fine-tune it by further exploring and challenging your creative space, more in the inspirational state of mind.
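
The ranking step of the semantic extraction described above can be illustrated with a minimal Python sketch. It assumes that every prompt has already been turned into an embedding vector by some model; the toy three-number vectors and the names cosine_similarity and extract_cues are hypothetical and are not Scripthea's actual implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def extract_cues(query_vec, collection, top_n):
    """Rank (prompt, vector) pairs by similarity to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), prompt)
              for prompt, vec in collection]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [prompt for _, prompt in scored[:top_n]]

# Toy 3-dimensional "embeddings"; real ones come from an embedding model.
collection = [
    ("a lighthouse in a storm", [0.9, 0.1, 0.0]),
    ("portrait of an old sailor", [0.2, 0.9, 0.1]),
    ("waves crashing on rocks", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.0, 0.0]  # stands in for the embedding of your description

cue_list = extract_cues(query, collection, top_n=2)
print(cue_list)
```

With real collections, top_n would be the size of the cue list you want to extract (the text above suggests somewhere between 30 and 300).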

        Here is the prompt composer tab...

On the left, you see the log panel, which will inform you about any ongoing operations. For prompt composing, there are two modes: Single and Scan. In Single mode, you can use a single cue with one or many modifiers. In Scan mode, you can select any number of cues and any number of modifiers, although each prompt will be a combination of one cue + all Fixed modifiers + a number (the modifiers sample number) of Scannable modifiers. Modifiers are divided into switchable (on/off) categories. If you wonder about any modifier, hover over it; there is a hint for most of them. If you right-click on any modifier, you will be asked to confirm a Google search for that modifier. In the options, you can specify the image depot folder where the images from your scan (or a single query) will go.
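
To make the Scan mode composition rule more concrete, here is a small Python sketch of it. It is only an illustration under stated assumptions: the function name scan_prompts, the ", " separator, and the choice to enumerate every group of Scannable modifiers of the given sample size are all hypothetical, and Scripthea itself may pick and join the modifiers differently.

```python
from itertools import combinations

def scan_prompts(cues, fixed, scannable, sample=1, sep=", "):
    """Yield one prompt per (cue, group of `sample` scannable modifiers).

    Every prompt is: one cue + all fixed modifiers + `sample` scannable ones.
    """
    for cue in cues:
        for group in combinations(scannable, sample):
            yield sep.join([cue, *fixed, *group])

cues = ["a city at the edge of the sea", "the last train home"]
fixed = ["oil painting"]
scannable = ["by Claude Monet", "by Edward Hopper", "by Hokusai"]

prompts = list(scan_prompts(cues, fixed, scannable, sample=1))
for p in prompts:
    print(p)
```

Here 2 cues and 3 Scannable modifiers with a sample number of 1 give 6 prompts, which shows how quickly a scan grows as you select more cues and modifiers.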

All the options, the external and internal sizes, and the main window position are saved when the application closes and restored when it starts.

        ...and that is the image depot viewer

The viewer shows a Scripthea image depot (a folder with a bunch of images and a description.idf file). You can select an image depot folder from the directory browser on the left while the rim of the image depot text box is highlighted (in navy). Check the Viewer page for more details about the directory browser. You can choose between table view and thumbnail (grid) view. In the grid view, you can adjust the thumbnails from the menu (bottom-left button). You can move around with the arrows at the bottom, all self-explanatory (I think). The only operation other than viewing that you can do here is deleting an image. At the very bottom, common to both views, is the find panel, which finds a word (or words) in the prompts of the active image depot and selects the match. The Mark button highlights the prompts/images that match the same criterion. The shown image itself can be zoomed in/out (buttons), panned (scroll bars), or fit (the middle button); more tools are coming...

Image Depot Master

Image Depot Master (IDM) is an image depot manager for copying and moving images from one image depot to another, or to an empty one, as well as for deleting images from an image depot.

It provides an option to validate the consistency of an image depot by erasing entries in description.idf that have no corresponding images. A folder (an image depot or an empty one) is selected the same way as in the image viewer, and the same two views are available, list and grid, arranged similarly to the viewer.
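
The idea behind the consistency check can be sketched in a few lines of Python. This is not IDM's own code: it assumes that every non-empty line of description.idf is a JSON dictionary describing one image, and the key "filename" used to find the image file is hypothetical, so check a real description.idf for the actual key before trying anything like this.

```python
import json
from pathlib import Path

def validate_depot(depot_dir, name_key="filename"):
    """Drop description.idf entries whose image file is missing.

    `name_key` is a hypothetical field name for the image file name.
    Returns the number of entries removed.
    """
    depot = Path(depot_dir)
    idf = depot / "description.idf"
    text = idf.read_text(encoding="utf-8")
    lines = [ln for ln in text.splitlines() if ln.strip()]
    kept = []
    for ln in lines:
        entry = json.loads(ln)          # one JSON dictionary per line
        name = entry.get(name_key)
        if name and (depot / name).is_file():
            kept.append(ln)             # the image exists, keep the entry
    idf.write_text("".join(ln + "\n" for ln in kept), encoding="utf-8")
    return len(lines) - len(kept)
```

Since the sketch rewrites description.idf in place, it is safer to try it on a copy of an image depot first.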

Import/Export utilities

The fourth tab of Scripthea contains an import utility for converting image files from some text-to-image generator (e.g. Stable Diffusion, Craiyon). The import utility converts these images into a Scripthea image depot. The description.idf file is a text file in which each line is a JSON-formatted property dictionary of a generated image, including the prompt. You can edit the file for any reason, as long as you keep the JSON structure.
The export utility takes an image depot and exports a selected subset to another folder, with control over the names and type (.png or .jpg) of the exported files. Optionally, Scripthea can create a webpage with the exported images for local browsing independent of Scripthea, or for your website.
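
If you prefer to edit description.idf with a script rather than by hand, parsing and re-serializing each line is one way to make sure the JSON structure stays intact. The Python sketch below works on sample lines only; the keys "prompt" and "filename" and the helper names are assumptions made for illustration, and the real file may use different keys.

```python
import json

def rewrite_idf_lines(lines, edit):
    """Apply `edit` to each record and re-serialize one JSON object per line."""
    out = []
    for ln in lines:
        if not ln.strip():
            continue                    # skip blank lines
        record = json.loads(ln)         # fails loudly on malformed JSON
        out.append(json.dumps(edit(record)))
    return out

# Hypothetical records; the real key names may differ.
sample = [
    '{"prompt": "a quiet harbour,  watercolor", "filename": "img-001.png"}',
    '{"prompt": "the last train home, oil painting", "filename": "img-002.png"}',
]

def tidy_prompt(record):
    # Collapse repeated spaces inside the prompt text.
    record["prompt"] = " ".join(record["prompt"].split())
    return record

for line in rewrite_idf_lines(sample, tidy_prompt):
    print(line)
```

Because every line goes through json.loads and json.dumps, a malformed line raises an error instead of silently ending up in the file.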

Python scripting

Python scripting provides you (if you speak Python) with scripting abilities, with access to key Scripthea features as well as some features specific to the integrated version of Python. It is great for automating any routine task, or anything time-consuming that you can define in advance. It is another way (part of the Custom Scripthea addon) to open Scripthea to any software/library.

 

Contact

Keep in mind that the application is under active development, so I would appreciate any bug reports. Let me know HERE and I'll do my best to fix them ASAP. In the same way, you can communicate any ideas for improvement, experiences with the software, or your willingness to help me with the project.

Legal

The Scripthea software has been written by, and is copyrighted by, Teodor Krastev. The sources are distributed under the MIT open-source license.

 

The name "Scripthea"

The name Scripthea (or Scrip(t)Thea) comes from script (written text, from Latin scriptum) and Thea, the Greek goddess of sight and vision. In a way, you offer your script to the goddess Thea (aka Theia) and, in return, you receive a vision.