How to generate an image and text at the same time by API? Thanks – API


When using web gpt-4v, for example, we can ask gpt-4v to create an image around a topic and ask it to generate text explanation about the topic. Is it possible to use API to do the same thing? The document says that this is not possible, as we can only use Dall E3 to create an image, and then use GPT-3.5 or 4 to interpret the image. Any comments are highly appreciated.

Hi!

dalle and gpt4 are two completely separate models (as far as I know). Chatgpt rewrites your prompt and then calls dalle on those new prompts.

You could maybe accomplish something similar by using the assistants, but you might be well served by running your own chains :thinking:

Hi, thank you. What I meant is to use gpt-4v to generate an image and text at the same time, not meant to use DALL E. What I want to achieve is to create an image and an explanation about the image. With Web app, we can do this by a prompt, e.g., create an image about teacher and interpret the image. I am wondering how we can do it by API? An approach is of course to use DALLE to generate an image and then use GPT-4 to interpret the image, separately. I want to know if it is possible to use GPT-4V to do the two tasks together?

gpt-4-vision-preview can’t generate images, if that’s what you’re really asking :thinking:



Source link

Leave a Comment