Synthography

A synthographic image created with a fine-tuned Stable Diffusion model and outpainting

Synthography[1] is the method of generating digital media synthetically using machine learning. It is distinct from other graphic creation and editing methods in that it uses artificial-intelligence text-to-image models to generate synthetic media, commonly through prompt engineering: supplying text descriptions as input to create or edit a desired image.[2][3]

Text-to-image models, algorithms, and software are the tools of synthography: they are designed to turn human input into the resulting artificial intelligence art. Synthography typically uses text-to-image models to synthesize new images derived from the training, validation, and test data sets on which those models were trained.

Synthography is the method, not the output itself. The output created specifically with generative models (as opposed to the broader category of artificial intelligence art) is referred to as a synthograph,[1] and those who practice synthography are referred to as synthographers.[4] A synthographer harnesses linguistic composition to steer a generative model; in other cases, a model is fine-tuned on a dataset to expand its creative possibilities.
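
As an illustration of this prompt-driven workflow, the sketch below generates an image from a text description using the open-source Hugging Face diffusers library with publicly released Stable Diffusion weights. The model identifier, prompt, sampling parameters, and output filename are illustrative assumptions rather than details drawn from any cited source.

```python
# Minimal text-to-image sketch using the Hugging Face "diffusers" library.
# Model identifier, prompt, parameters, and file name are illustrative only.
import torch
from diffusers import StableDiffusionPipeline

# Load publicly released Stable Diffusion weights (downloaded on first run).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # move the pipeline to a GPU if one is available

# Prompt engineering: the text description steers what the model synthesizes.
prompt = "a watercolor painting of a lighthouse at dawn, soft light, detailed"

# Run the denoising loop and save the resulting synthograph.
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("synthograph.png")
```

Because the same pipeline object accepts arbitrary prompts, iterating on the wording of the prompt is the synthographer's main creative control in this sketch.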

Etymology

The "synth-" element comes from Latin synthesis "collection, composition", from Greek synthesis "composition, a putting together".[5]
"-graphy" is the word-forming element meaning "process of writing or recording" or "a writing, recording, or description" (in modern use especially in forming names of descriptive sciences). From French or German -graphie, from Greek -graphia "description of," used in abstract nouns from graphein earlier "to draw, represent by lines drawn," originally "to scrape, scratch" (on clay tablets with a stylus).[6] The term is still in its infancy, as early adopters are using different terminology for this technique. Other names are prompt engineering, image synthesis and artificial intelligence art.

History

The event generally credited with starting the broad use of text-to-image models is OpenAI's publication of DALL-E in January 2021.[7] While DALL-E itself was not released to the public, CLIP (Contrastive Language-Image Pre-training) was open-sourced, which led to a succession of implementations pairing it with other generators such as generative adversarial networks and diffusion models.[8][9] The next major event, which led to a rise in the technique's popularity, was the release of DALL-E 2 in April 2022; after a gradual rollout as a private beta, it became publicly available in July 2022. In August 2022, Stable Diffusion was open-sourced by Stability AI,[10] which fostered a community-led movement.

Methodology

Since synthography refers to the method of generating AI visuals, the table below lists the mediums, or classes of tool, used in that method.

Synthography Mediums

input \ output    text             image             3D model        video
text              chatbot          text-to-image     text-to-3D      text-to-video
image             image-to-text    image-to-image    image-to-3D     image-to-video
video             -                -                 -               video-to-video
brain             -                brain-to-image    -               -

Legend (indicated by cell shading in the original table):
  white background        does not exist yet
  light green background  currently exists in academia or as a beta
  green background        exists commercially or is widely available

(Note: text-to-speech and speech-to-text are intentionally omitted from the table, since they can be performed by dictation or transcription software and are therefore implied by the 'text' row and column. Note also that the entries are classes of medium, not specific instances of them, e.g. chatbot rather than ChatGPT.)
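
Several cells in the table are served by the same open-source tooling. As a hedged sketch of the image-to-image medium, the example below again assumes the Hugging Face diffusers library and Stable Diffusion weights; the input file name, prompt, and strength value are placeholders chosen for illustration.

```python
# Illustrative image-to-image sketch with the "diffusers" library.
# File names, prompt, and strength are example values, not canonical ones.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# An existing picture serves as the starting point instead of pure noise.
init_image = Image.open("sketch.png").convert("RGB").resize((512, 512))

prompt = "the same scene rendered as a detailed oil painting"

# "strength" controls how far the diffusion process may drift from the input.
result = pipe(prompt=prompt, image=init_image, strength=0.6).images[0]
result.save("image_to_image_result.png")
```

Lower strength values keep the output close to the source picture, while higher values let the text prompt dominate; this trade-off is the main creative dial in the image-to-image medium.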

Difference between Synthography and Artificial Intelligence Art

Venn diagram of the relation between artificial intelligence (artificial intelligence art) and generative models (synthography)

Synthography is the method used to create synthetic media with generative models. Artificial intelligence art (which spans music, cooking, video game level design, and more) is output created using artificial intelligence, an increasingly broad category.

When Elke Reinhuber coined the term synthography in her paper "Synthography – An Invitation to Reconsider the Rapidly Changing Toolkit of Digital Image Creation", she cited a "legitimation crisis" as the motivation for the new term. Before generative models came into use, artificial intelligence algorithms were already embedded in media tools such as graphics editing software (e.g. content-aware fill, application of artistic styles, resolution enhancement) and DSLR and smartphone cameras (e.g. object recognition, in-camera focus stacking, low-light machine learning algorithms), all of which continue to undergo rapid development.[1]

Artificial intelligence is a superset of machine learning; machine learning is a superset of neural networks; and neural networks are a superset of generative models such as GANs (generative adversarial networks) and diffusion models. The relation between these is depicted in the Venn diagram shown here. Synthography specifically uses generative models, as popularized by software such as DALL-E, Midjourney, and Stable Diffusion.

References

  1. ^ a b c Reinhuber, Elke (2 December 2021). "Synthography–An Invitation to Reconsider the Rapidly Changing Toolkit of Digital Image Creation as a New Genre Beyond Photography". Google Scholar. Retrieved 20 December 2022.
  2. ^ Smith, Thomas (26 October 2022). "What is Synthography? An Interview With Mark Milstein - Synthetic Engineers". syntheticengineers.com. Synthetic Engineers. Retrieved 20 December 2022.
  3. ^ Oosthuizen, Megan (20 December 2022). "Artist Shows Us What A Live-Action Movie Could Look Like". fortressofsolitude.co.za. Fortress Entertainment. Retrieved 10 February 2023.
  4. ^ Ango, Stephan (3 July 2022). "A Camera for Ideas". stephanango.com. Retrieved 10 February 2023.
  5. ^ "synthesis". etymonline.com. Online Etymology Dictionary. Retrieved 27 December 2022.
  6. ^ "-graphy". etymonline.com. Online Etymology Dictionary. Retrieved 27 December 2022.
  7. ^ Underwood, Ted (21 October 2021). "Mapping the latent spaces of culture". tedunderwood.com. Retrieved 6 February 2023.
  8. ^ Steinbrück, Alexa (3 August 2021). "VQGAN+CLIP - How does it work?". alexasteinbruck.medium.com. Medium. Retrieved 6 February 2023.
  9. ^ Smith, Ethan. "A Traveler's Guide to the Latent Space". notion.com. Retrieved 6 February 2023.
  10. ^ Roose, Kevin (21 October 2022). "A Coming-Out Party for Generative A.I., Silicon Valley's New Craze". The New York Times. Retrieved 6 February 2023.