As a follow-up to my presentation at the Photopia fair in Hamburg, I was interviewed by “DIE BILDBESCHAFFER”. Here is a rough translation; you can read the original interview in German here:

Boris Eldagsen on AI in photography

Monday, 14.11.2022
Renowned photographic artist and digital marketing strategist Boris Eldagsen has valued an experimental approach to photography for more than 20 years and is particularly interested in digital, forward-looking possibilities in the industry. He is currently delving deep into the topic of photography and artificial intelligence, and takes us image buyers right into the thick of it. The interview for Bildbeschaffer was conducted by Jana Kühle.

Boris, many established photographers react with resistance as soon as the topic of artificial intelligence comes up. Why is it different with you?

I am naturally curious about everything that has to do with creativity, and I look at the topic of AI-generated images from a creative perspective too. We are dealing with a new technology that is in a way as disruptive as the internet and later smartphones were. I think today, as I did then: Wow, an incredible number of doors are opening up! Legally, many things still need to be clarified, but many people are already rushing ahead, true to the motto: let’s just do it! Of course, that will have a lot of consequences …

You are currently testing various AI applications. What do you notice most about them?

The technology is developing so rapidly that everything you read and hear about it is already outdated after a few weeks. Since the summer, I’ve been working with AI almost daily to keep up to date. In the beginning, I tested the programme Midjourney. When it became too illustration-heavy for me, I switched to DALL-E as a beta tester.

These are programmes that create entire pictures with the help of text input. You can’t shake the impression that what used to take three years in technical development now takes three weeks.

That’s the way it is. In the meantime, the competition has developed in such a way that it is no longer just the big players who keep their insider knowledge to themselves. For example, DALL-E was the leader, but only accessible with registration and very restrictive. Then came Stable Diffusion (a deep-learning text-to-image model, editor’s note) at the end of August as open source. What’s been happening since then is just crazy.

What is the difference between the providers, apart from the fact that Stable Diffusion is open source and accessible to everyone?

While DALL-E is smart, simple and has only a few options, Stable Diffusion is the opposite. Learning the ropes feels like starting from scratch with Photoshop, and the installation alone makes you feel you have to be a programmer. It is the opposite of user-friendly. But it offers an incredible number of creative possibilities, and I really enjoy experimenting with it. With DALL-E, I pushed the restrictions to the limit and entered keywords until my account was locked. This led to the work VOMIT. Fortunately, after the block came Stable Diffusion, and that’s where I started experimenting with language.

In what way?

For example, I nested the text to such an extent that subject and object could no longer simply be resolved. In other words, I confused the AI with the sentence structure. Then I invented new words and saw what happened. I deliberately misspelled words. Unbelievable mutations come out of that. At the beginning these were a big problem for Stable Diffusion, but now there are solutions for that too. Within a few weeks it has developed enormously, and that’s just the beginning. It costs me nothing, I can train the programme with my own input, and I have complete artistic freedom to work with it. Very few people yet see it as one element in a workflow. But now, of course, it’s a matter of working out how best to use this software.

Take us into the world of open source software. How do we imagine it?

Gladly. I can use text input (prompts) to describe what I want to see and, in a second step, exclude what I don’t want to see. I can have images generated in the style of a well-known artist. I can determine how many steps of the image generation (sampling steps) are run, and of course the size of the image. There is even an option to restore faces, and a so-called hires fix that cleans up the images afterwards. I can enter how many images are calculated at once. And I can set a so-called seed. If you go to platforms where others share their prompts (the text specifications for the image creation, editor’s note) and results, these images always come with a number: that is the seed. It means I can take someone else’s picture and use it as the source material for my own work.

[Image in the original article: the result of Stable Diffusion for the text prompt “pizza eating bear, 1980s, polaroid, helmut newton”.]

The kitchen that appears in the background could be replaced in seconds by any other background using the negative prompt “kitchen”. With a vertical line as a command, I can also have different aspects (e.g. lighting conditions, perspective, choice of film) calculated as variants. DALL-E, for example, doesn’t have that. Or if the output contains a person whose face I don’t like, I can erase it and have it recalculated while the rest of the image stays the same. Oh, and I just see there’s another new feature that didn’t exist yesterday: “Extensions”.
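The options described here can be sketched as a single parameter set. The key names below are illustrative only — they loosely follow common Stable Diffusion web front ends and are not an official API:

```python
# Illustrative (hypothetical) settings for one Stable Diffusion text-to-image run;
# key names are assumptions modelled on common community front ends.
generation_settings = {
    "prompt": "pizza eating bear, 1980s, polaroid, helmut newton",
    "negative_prompt": "kitchen",   # exclude what should not appear in the image
    "sampling_steps": 30,           # how many steps of image generation are run
    "width": 512,
    "height": 512,                  # output image size
    "restore_faces": True,          # optional face-restoration pass
    "hires_fix": True,              # clean-up / upscaling pass afterwards
    "batch_count": 4,               # how many images are calculated at once
    "seed": 1234,                   # fixed seed makes the result reproducible
}
```

With the same model, settings and seed, the run is deterministic, which is why shared prompts on community platforms always include the seed number.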

It all looks as professional as it is complex.

I watched two videos to even get through it. That’s an incredible number of options that are pretty user-unfriendly to begin with. The average consumer can’t do anything with it. But at some point there will probably be a version that is easier to install and use.

At the end of October, Shutterstock announced that they would be working with DALL-E.

I assume DALL-E will also offer more options in the future, but I find it hard to imagine it keeping up with Stable Diffusion.

Getty, on the other hand, has announced a collaboration with the Israeli start-up Bria. Why even go through agencies like Getty or Shutterstock and pay money when you can do it yourself with Stable Diffusion?

Well, here we are again with the not particularly user-friendly interface. The installation alone, which runs via the Windows command line, is far too complicated and complex; it simply assumes too much of the normal user. I assume that most platforms will look for a partner, so there will be more and more providers of AI-generated images. But what Stable Diffusion is developing as open source is of course radical. Nothing has a copyright; you can do what you want with it. A company can’t offer that in the same form.

Let’s dare to look into the crystal ball. What does all this mean for photographers, for photo editors, for the procurement of images?

Even with the help of DALL-E 2, photorealistic images can already be created just by entering text. The technology is already there. I myself sat down and created a series of pictures for the BFF (Bund Freischaffender Fotografen und Fotodesigner), of which only one was the original by a BFF photographer. The viewers were not always right in their assessment of which one was the original. At the end of October, the Financial Times tested whether people can still recognise AI-generated images and distinguish them from real ones; 80 per cent could not tell the difference. So you can get up to a lot of mischief with it. For me, this is as disruptive as digitalisation was. “AI-isation” doesn’t exist as a word yet, but AI will intervene in all areas; you can have it calculate anything. That even affects professions in the field of pornography: Stable Diffusion also produces a lot of “NSFW” results (not safe for work, i.e. pornographic content), AI images of sexual acts that are not even physically possible for the human body.

How does it change a photographer’s work?

In 2019, I was still handed a picture and told “Do this for me in green”; the budget was 1,500 euros. Now a similar result costs only 3 cents with AI. Agencies are already working with it, both in developing and in implementing ideas. All the jobs that say “do me the same thing in green” no longer need to be photographed. Mood images, like old people on a park bench, can simply be generated; photographers will no longer be needed for that. What remains is documentary or current-affairs work, celebrity weddings, events.

Welcome, then, to the dangerous world of the deep fake.

Definitely. I have also taken a real picture from the war zone in Bucha and simply generated it in different variations. So you can produce an alternative in a short time and spread it via social media without it being recognisable as a deep fake. This level of manipulation is no longer a thing of the future. I assume it will become even harder to pin down the facts.

So would you subscribe to the thesis that trust in visual media is radically lost?

It has to be. If it isn’t, then society will be manipulated.

And that cannot be kept in check by the law?

The law comes too late, all the more so when I have an army of bots doing it for me. What is the law supposed to do? More than ten years ago I was already dealing with manipulation in social media; if you go deeper, you look into an abyss. In the past there were click farms in Asia or Africa; today it’s networked smartphones in China that can game the platforms. The clean-ups by the big social media providers are always just mopping up; they are always a few steps behind. And the ways a bot can cover its tracks open up completely different problems.

After all, readers often don’t even notice models, even when they appear on two posters for different companies at the same bus stop. They take in the message and the mood of the image; the person themselves remains diffuse. These images can then also be generated by AI.

With AI-generated images, I always have a new person who doesn’t exist. So there is no longer the problem of the same person or the same model appearing again and again in different advertising spaces. Stock images in advertising were never truthful anyway: I once assisted an advertising photographer for half a year, which I found very enlightening on the subject of truth. Everything was constructed, and I don’t see the digital version as any different. The crucial thing will be knowing how to use the tools, just as Photoshop or a camera is a tool that you have to know how to use. If you just type in a command and are satisfied with the result, you’re only scratching the surface of artificial intelligence. The programmes are already further along than most of us think.

Thank you very much for the informative interview.

If you would like to experience Boris Eldagsen live on the topic of photography and artificial intelligence, we recommend his lecture on 4 December at the DFA conference in Hamburg’s Deichtorhallen. The lecture will also be streamed live via Facebook and can be accessed afterwards via

Author: jk