What is the Visual Understanding Module?

Module Description

This module performs visual language comprehension tasks, such as answering visual questions, understanding scenes, and making advanced deductions.

Visual Language Comprehension
Visual language comprehension is the ability to interpret and understand information conveyed through visual elements such as symbols, images, colors, and layouts. It involves recognizing patterns, decoding cultural or contextual meanings, and connecting visuals with emotions or concepts. Audio information is not taken into account.

How does it work?

Select the Media File: Choose the media file you want to analyze.
Activate the Visual Understanding Module: In the left column, select the "Visual Understanding" module, enter a prompt, and click the yellow "Add Module" button.
Start the Analysis: Either add more modules or begin the analysis immediately by clicking "Start Analysis".

What parameters are available?

Prompt (Free Text)

In addition to the media file, this module needs a prompt to perform the analysis.
You can enter any prompt. The longer the requested result, the longer the analysis takes.

EXAMPLES:
Scene Description:
"Describe the actions happening in this video scene."
"What objects and people are present in this clip?"
Content Summarization:
"Summarize the key events in this 30-second video."
"Provide a high-level overview of this sports match."
Emotion and Tone Analysis:
"What is the emotional tone of this scene?"
"Are the characters in the video happy, sad, or angry?"
Highlights Extraction:
"Identify the most exciting moments in this soccer match."
"Find key scenes with dialogue in this video."
Visual Elements Detection:
"Identify all appearances of company logos and name the companies."
"Which nametags are visible in this video?"
Audience Engagement Insights:
"What visual elements are most frequently focused on?"
"Analyze the facial expressions of viewers in this focus group video."

Prompt Library & Backlog:
In this initial release, a prompt library and the option to save used prompts are not yet available. These features will be introduced in a future update.
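
Until the prompt library is available, prompts can be kept outside the tool. The following is a minimal sketch in Python and is not a feature of the module: the names PROMPTS and get_prompts are hypothetical, and the prompts are taken from the examples listed above.

```python
# Hypothetical local prompt collection; not part of the product.
PROMPTS = {
    "Scene Description": [
        "Describe the actions happening in this video scene.",
        "What objects and people are present in this clip?",
    ],
    "Content Summarization": [
        "Summarize the key events in this 30-second video.",
        "Provide a high-level overview of this sports match.",
    ],
}


def get_prompts(category):
    """Return a copy of the saved prompts for a category, or an empty list."""
    return list(PROMPTS.get(category, []))


if __name__ == "__main__":
    # Print the saved prompts so they can be pasted into the Prompt field.
    for prompt in get_prompts("Scene Description"):
        print(prompt)
```

A saved prompt can then be copied into the Prompt field by hand.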

Displaying the Results:

Module Section:

On the right side of the player, you’ll see a section with detailed results for each module used in the analysis. Clicking on the module name opens a dropdown with specific parameters, useful for troubleshooting or viewing metadata.

Results:

The results are shown in the sidebar, along with the original prompt displayed as a text field.

Because this module does not currently use timestamps, it cannot provide, for example, separate scene descriptions with timecodes. In a later release, we will link it with other AI modules, such as shot boundary detection, so that it can summarize individual sequences.

New Function
This module is a new addition. If you find bugs or have suggestions for improvement, please contact us via the support form.