Showing content from https://cloudinary.com/documentation/cloudinary_ai_vision_addon below:

Cloudinary AI Vision Add-on | Documentation

Cloudinary is a cloud-based service that provides solutions for image and video management. These include server or client-side upload, on-the-fly image and video transformations, fast CDN delivery, and a variety of asset management options.

The Cloudinary AI Vision add-on combines LLM (Large Language Model) capabilities, specialized models, advanced algorithms, prompt engineering, and Cloudinary's knowledge to interpret visual content. It answers questions (e.g., "Are there flowers?") and fulfills requests (e.g., "Describe this image") about an image's content. By combining visual and textual data, AI Vision gives a more complete and adaptable understanding of content, so businesses can tailor solutions to their own brand and customer expectations.

AI Vision is designed for a variety of needs across industries. It streamlines content moderation, media classification, and content understanding by automating the analysis, tagging, and moderation of visual content.

AI Vision uses the Analyze API and doesn't require the image to be stored in your Cloudinary account. The AI Vision methods accept either the asset_id of an image in your Cloudinary account, or a valid uri to an image.
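As a rough illustration of the "either asset_id or uri" rule, the following sketch builds a source descriptor and rejects calls that supply both or neither. The helper name `build_source` and the idea of passing the result as a single dictionary are assumptions made for this example; only the `asset_id` and `uri` names come from the text above.

```python
from typing import Optional


def build_source(asset_id: Optional[str] = None, uri: Optional[str] = None) -> dict:
    """Return a source descriptor holding exactly one of asset_id or uri.

    Hypothetical helper: the dictionary shape is an assumption, not a
    confirmed part of the Analyze API.
    """
    # Exactly one of the two identifiers must be supplied.
    if (asset_id is None) == (uri is None):
        raise ValueError("Provide exactly one of asset_id or uri")
    if asset_id is not None:
        return {"asset_id": asset_id}
    return {"uri": uri}


# An image that is not stored in the Cloudinary account is referenced by URI.
external = build_source(uri="https://example.com/photo.jpg")
print(external)
```

Validating the choice up front keeps a malformed request from ever being sent.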

Getting started

Before you can use the Cloudinary AI Vision add-on:

Overview

AI Vision scales to large volumes of media assets and works out of the box: you can integrate it without complex customization or prompt engineering. The add-on supports the following modes:

Tagging mode

The Tagging mode accepts a list of tag names along with their corresponding descriptions. If the image matches a tag's description, which may cover several elements, the response is tagged accordingly. This approach lets customers align with their own brand taxonomy, offering a dynamic, flexible, and open method for image classification.

To return the tags for an image based on provided definitions, call the ai_vision_tagging method with the following parameters:
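The pairing of tag names with descriptions can be sketched as a small payload builder. This is a minimal sketch only: the field names `source`, `tag_definitions`, `name`, and `description`, and the helper `build_tagging_body`, are assumptions for illustration and are not confirmed by the text here.

```python
import json


def build_tagging_body(source: dict, tag_definitions: dict) -> dict:
    """Build a request body for a tagging call.

    tag_definitions maps each tag name to the description the image is
    checked against. All field names below are assumed, not confirmed.
    """
    if not tag_definitions:
        raise ValueError("At least one tag definition is required")
    return {
        "source": source,
        "tag_definitions": [
            {"name": name, "description": description}
            for name, description in tag_definitions.items()
        ],
    }


body = build_tagging_body(
    {"uri": "https://example.com/photo.jpg"},
    {"flowers": "Does the image contain flowers?"},
)
print(json.dumps(body, indent=2))
```

Keeping the definitions in a name-to-description mapping makes it easy to maintain a brand taxonomy in one place and reuse it across images.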

Example Request:

Example Response:

Moderation mode

The Moderation mode accepts multiple questions about an image, to which the response provides concise answers of "yes," "no," or "unknown." This functionality allows for a nuanced evaluation of whether the image adheres to specific content policies, creative specs, or aesthetic criteria.

To evaluate images against specific moderation questions, call the ai_vision_moderation method with the following parameters:
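Because each answer is one of "yes", "no", or "unknown", downstream code can group questions by answer and decide what to do with each group. The sketch below assumes the answers have already been extracted into a plain question-to-answer mapping. The response format is not shown here, so that extraction step is left out, and `summarize_moderation` is a hypothetical helper.

```python
ALLOWED_ANSWERS = ("yes", "no", "unknown")


def summarize_moderation(answers: dict) -> dict:
    """Group moderation questions by their answer.

    answers maps each question to "yes", "no", or "unknown".
    Any other value is treated as an error rather than silently ignored.
    """
    summary = {answer: [] for answer in ALLOWED_ANSWERS}
    for question, answer in answers.items():
        normalized = answer.strip().lower()
        if normalized not in summary:
            raise ValueError(f"Unexpected answer {answer!r} for {question!r}")
        summary[normalized].append(question)
    return summary


result = summarize_moderation({
    "Does the image contain alcohol?": "no",
    "Is the image blurry?": "unknown",
    "Does the image show a person?": "yes",
})
print(result["yes"])      # questions answered "yes"
print(result["unknown"])  # questions that may need human review
```

Treating "unknown" as its own bucket, rather than folding it into "no", leaves room to route uncertain cases to manual review.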

Example Request:

Example Response:

General mode

The General mode serves a wide array of applications by providing detailed answers to diverse questions about an image. Users can inquire about any aspect of an image, such as identifying objects, understanding scenes, or interpreting text within the image.

To ask general questions, call the ai_vision_general method with the following parameters:
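A call of this kind might be prepared as an HTTP POST like the one sketched below, using only the Python standard library. The endpoint URL pattern, the `prompts` and `source` field names, and the use of HTTP Basic authentication with an API key and secret are all assumptions for illustration; check them against the Analyze API reference before relying on them. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_general_request(cloud_name, api_key, api_secret, source, prompts):
    """Prepare (but do not send) a POST request for a general-mode query.

    URL pattern, body fields, and auth scheme are assumed, not confirmed.
    """
    url = (
        "https://api.cloudinary.com/v2/analysis/"
        f"{cloud_name}/analyze/ai_vision_general"
    )
    body = json.dumps({"source": source, "prompts": list(prompts)}).encode("utf-8")
    credentials = f"{api_key}:{api_secret}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_general_request(
    "demo", "key", "secret",
    {"uri": "https://example.com/photo.jpg"},
    ["Describe this image"],
)
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request)  (needs real credentials)
```

Separating request construction from sending makes the payload easy to inspect and test without network access.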

Example Request:

Example Response:

Tokens

Your AI Vision Add-on quota is based on tokens. A token is a unit of measurement, similar to a word, used to quantify the processing required. Tokens can represent both text and images, with pricing based on the number of tokens processed.

Consolidating usage into a single token count gives a clear picture of the total tokens used.

Every response also includes a limits node with the number of tokens used by the operation. For example:
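Reading the token usage out of a response could look like the sketch below. Only the existence of a `limits` node comes from the text; the inner structure (`addons_quota`, `type`, `usage`), the quota type name `ai_vision_tokens`, and the sample numbers are illustrative assumptions, not real output.

```python
def tokens_used(response: dict, quota_type: str = "ai_vision_tokens") -> int:
    """Sum the token usage reported in a response's limits node.

    The inner layout of "limits" is assumed for illustration only.
    Missing keys are treated as zero usage.
    """
    entries = response.get("limits", {}).get("addons_quota", [])
    return sum(
        entry.get("usage", 0)
        for entry in entries
        if entry.get("type") == quota_type
    )


# Made-up sample response, used only to exercise the helper.
sample_response = {
    "limits": {
        "addons_quota": [
            {"type": "ai_vision_tokens", "usage": 702},
        ]
    }
}
print(tokens_used(sample_response))
```

Logging this value per call is one way to track consumption against the add-on quota over time.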

