Image Analysis (Vision)

Some models can process images and take them into account when generating an answer. These models are multimodal, meaning they can interpret text and visual content together. You can use this to extract text from documents, describe what's in an image, or analyze visual data.

The more context and detail you add to your prompt, the better the response, because the model understands more precisely what you expect. See our Prompt Engineering Guide to learn how to write great prompts.

In addition to text files, you can upload images (JPG, PNG) to the chat and have the model analyze them. This capability is called "vision". Most modern models from OpenAI, Anthropic, and Google support image analysis.
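
If you prepare images with a script before uploading them, it can help to confirm that a file really is a JPG or PNG, since a file extension alone can be misleading. The following is a minimal sketch that checks the leading bytes of a file against the standard PNG and JPEG signatures. The function names `detect_image_format` and `is_supported_image` are hypothetical helpers for illustration and are not part of app.odeus.ai.

```python
from pathlib import Path
from typing import Optional

# Standard file signatures ("magic bytes") for the two formats named above.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_image_format(data: bytes) -> Optional[str]:
    """Return "png", "jpeg", or None based on the leading bytes.

    Hypothetical helper: it only inspects the signature and does not
    validate the rest of the file.
    """
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def is_supported_image(path: str) -> bool:
    """Read only the first few bytes of the file and check the signature."""
    with Path(path).open("rb") as handle:
        header = handle.read(len(PNG_SIGNATURE))
    return detect_image_format(header) is not None
```

For example, `is_supported_image("scan.png")` would return `True` for a genuine PNG file and `False` for a text file that was renamed to `.png`. The file name here is only a placeholder.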

You can check which models support vision in our model picker within app.odeus.ai.