We held a competition amongst the former members of the Beam Research Team to investigate generative AI for making images. When discussing the criteria, we agreed that realistic outdoor scenes would be more challenging than indoor scenes, and that achieving spatial consistency between images would be difficult.
The Challenge
Create photo-realistic images of an outdoor scene. You must generate two views of the same scene.
Rules
You may chain generators together and use images in intermediate stages, e.g. generate an image from a text prompt and pass that output image to another generator, as long as you don't input anything other than text. You may also provide an additional text prompt to the next generator.
We held a vote to decide whose images were the best, with the restriction that you could not vote for your own images.
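As a minimal sketch of how a vote with this restriction could be tallied: the function name `tally_votes`, the ballot format (one vote per voter) and the voter names below are all assumptions made for illustration, not a record of how the votes were actually counted or who voted for whom.

```python
from collections import Counter


def tally_votes(ballots):
    """Count votes per entrant, discarding any vote cast for oneself.

    `ballots` maps each voter's name to the entrant they voted for
    (an assumed format; the real ballots may have looked different).
    """
    counts = Counter()
    for voter, choice in ballots.items():
        if voter == choice:
            continue  # voting for your own images is not allowed
        counts[choice] += 1
    return counts


# Hypothetical ballots, for illustration only (not the real results).
ballots = {"alice": "bob", "bob": "carol", "carol": "bob", "dave": "dave"}
result = tally_votes(ballots)
print(result.most_common())
```

Here the self-vote from the hypothetical voter "dave" is dropped rather than raising an error; rejecting the ballot outright would be an equally reasonable choice.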
Prompt: Generate an image of a downhill mountain bike race in the french Alps. The image should be photorealistic, containing one cyclist descending a steep, rocky track. There should be clear foreground/background separation in the image. The lighting should be flat, as if the weather is overcast. There should not be any lens distortion effects in the image.
Generator: Gemini
Entrant: Ben Leslie
Prompt: Can you create a photo realistic stereo pair image of the Taj Mahal, with a long baseline and do not include the camera in the image
Generator: Gemini
Entrant: Lyndon Hill
Prompt:
Generator: Gemini
Entrant: Lyndon Hill
Prompt: A photo realistic scene of an open-water swimming race (like a triathlon). The scene shows 5 swimmers and 2 supporting canoeists paddling alongside the swimmers. The water is quite rough and choppy but you can still see the people clearly. The camera is observing the scene from the top, a bird’s eye view of what you could get with a drone.
Prompt: Please see Eduardo's blog post on GenAI Multiple View Consistency. In summary, use a ChatGPT description of the image on the left, then ask Gemini to generate another viewpoint.
Generator: Gemini
Entrant: Eduardo Ruiz-Libreros
Prompt: Create an image of two photos of the eiffel tower on the same day. One should be from the front (north), one from the side (east). The first image should show the whole tower. The second should show a closer view. The two images should be laid out side-by-side.
Generator: Gemini
Entrant: James Ross
Prompt:
Generator: Gemini
Entrant: Lyndon Hill
Prompt: A stereo photograph of the Taj Mahal; using the left-right image style of the reference image. The left and right views should have a wide baseline. Make any clouds consistent between the two views. + using this reference image
Generator: Firefly
Entrant: Lyndon Hill
Prompt: Create a photo-realistic photograph of a bustling outdoor Mexican flea market ("tianguis") during the Day of the Dead celebrations in late afternoon golden light. Captured with a stereo camera 50mm lenses, rich depth of field. I would like left and right images to be joined. Right camera coordinate system is perfectly horizontally aligned with left one with a separation of 30 cm to the right.
Generator: Gemini
Entrant: Abel Pacheco-Ortega
Prompt: Generate a photo-realistic scene of an oak tree, in the countryside of Mexico. The tree is the only one in a grass field. I need two images from the same scene taken from different perspective exactly at the same time (no that different perspective). Give me an image that contains both perspectives as if they were a stereo pair
Generator: ChatGPT
Entrant: Abel Pacheco-Ortega
Prompt: Create a photo-realistic photograph of a bustling outdoor Mexican flea market ("tianguis") during the Day of the Dead celebrations in late afternoon golden light. Captured with a stereo camera 50mm lenses, rich depth of field. I would like left and right images to be joined. Right camera coordinate system is perfectly horizontally aligned with left one with a separation of 30 cm to the right.
Generator: ChatGPT
Entrant: Abel Pacheco-Ortega
Prompt: Create a photo realistic image of a chess board set up ready to play on a small round red table on the end of a small pier by a lake in Switzerland. The water is calm and the sky has no clouds.
Prompt:
Generator: Gemini
Entrant: Lyndon Hill