What is the Lower Thirds Recognition?

Module Description

The Lower Third Recognition reads names from on-screen text inserts (commonly called "lower thirds"). It is an optical character recognition (OCR) function that converts written text in a video or photo into machine-readable data. It mainly serves as a base layer for our composite AI features, Face Dataset Creation and Speaker Dataset Creation. These tools automatically create new training datasets by combining the lower third text with the person’s face or voice detected in the video.

Lower Thirds Recognition vs. Text Recognition
The Lower Thirds Recognition is specifically designed to identify named entities (such as names, functions, or locations) in the designated lower thirds area of the screen, usually at the bottom. It ignores regular text elements and does not search for text in other parts of the image, such as the top or middle.

In contrast, Text Recognition scans the entire image, looking for any characters or words across the whole asset, regardless of their location on the screen. This makes it more suitable for general text extraction throughout the entire image.
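
To make the difference concrete, here is a minimal Python sketch of region-based filtering. It is not the module's actual implementation: the `TextBox` type, the normalized coordinates, and the two-thirds boundary are all assumptions chosen for illustration, since the exact area the module evaluates is not specified here.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """One OCR hit. Coordinates are fractions of image height (0.0 = top, 1.0 = bottom)."""
    text: str
    y_top: float
    y_bottom: float


def in_lower_third(box: TextBox, boundary: float = 2 / 3) -> bool:
    # Assumption: a box counts only if it lies entirely below the boundary.
    return box.y_top >= boundary


def filter_lower_third(boxes: list[TextBox]) -> list[TextBox]:
    # Lower-thirds style: keep only boxes in the bottom region.
    # A general text recognition pass would simply keep every box.
    return [box for box in boxes if in_lower_third(box)]


boxes = [
    TextBox("BREAKING NEWS", 0.05, 0.12),  # top of the frame
    TextBox("Jane Doe", 0.80, 0.86),       # name insert
    TextBox("Reporter", 0.87, 0.92),       # function line below the name
]

print([box.text for box in filter_lower_third(boxes)])
```

With the sample boxes above, only the two inserts near the bottom of the frame survive the filter, while the banner at the top is ignored.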

How does it work?

Select the Media File: Choose the media file you want to analyze.
Activate the Lower Third Recognition Module: In the left column, select the "Lower Third Recognition" module.
Define the Model & Parameters: Choose the model for analysis from the available options, set the parameters, and click the yellow "Add Module" button.
Start the Analysis: You can either add more modules or begin the analysis immediately by clicking "Start Analysis".

What Parameters are available?

Apply generic method (Checkbox):
This option allows the system to use a general approach to recognize names. By checking this box, the system will search for common names without needing a specific list of names. It’s useful when you want to detect names broadly without setting up custom rules.
Single Name Detection (Checkbox):
The Lower Third Recognition is designed to recognize full names, such as "John Smith." If you want it to detect just first names or single names, like "John" by itself, you can enable this option to allow the detection of individual names, not just full ones.
Detect only faces (Checkbox):
When this option is enabled, lower thirds are recognized only if a face is visible at the same time. Uncheck it (set it to false) if you want to detect all name inserts, even when no faces are visible.
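
How two of these checkboxes interact can be sketched as a simple decision function. This is an illustration only: the function `keep_detection` and its argument names are hypothetical and are not the product's API, and the "Apply generic method" option is left out because its internals are not described here.

```python
def keep_detection(
    name: str,
    face_visible: bool,
    *,
    single_name_detection: bool,
    detect_only_faces: bool,
) -> bool:
    """Decide whether a recognized name insert would be kept (illustrative logic)."""
    words = name.split()
    if not words:
        return False
    # Without "Single Name Detection", only full names (two or more words) pass.
    if len(words) < 2 and not single_name_detection:
        return False
    # With "Detect only faces", a face must be visible at the same time.
    if detect_only_faces and not face_visible:
        return False
    return True


# "John" alone is rejected unless single-name detection is enabled.
print(keep_detection("John", True, single_name_detection=False, detect_only_faces=True))
print(keep_detection("John", True, single_name_detection=True, detect_only_faces=True))
# A full name without a visible face passes only if "Detect only faces" is off.
print(keep_detection("John Smith", False, single_name_detection=False, detect_only_faces=False))
```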

Additional Parameters Available in the API

In the API, you have more control over the Lower Third Recognition module with options such as:

Add Custom Name Dictionaries: You can provide a custom dictionary of the names you want to detect. Only names from this list will be detected. Example: ["Angela Merkel", "Olaf Scholz"]. You can also provide secondary dictionaries for additional entities such as political party, function, company, or location.
Face Size and Quality: Set minimum face size and sharpness to ensure the Lower Third Recognition is only triggered when high-quality faces are detected.
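
The effect of a custom name dictionary with secondary dictionaries can be sketched as follows. This is not the API's request format, which is described in the API documentation; the function `match_entities`, the dictionary layout, and the case-insensitive substring matching are assumptions made for illustration.

```python
def match_entities(
    text: str,
    names: list[str],
    secondary: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Return the dictionary entries that occur in the recognized text."""
    lowered = text.casefold()
    # Only names from the custom list can be reported.
    found = {"name": [n for n in names if n.casefold() in lowered]}
    # Secondary dictionaries add further entity types (function, location, ...).
    for category, entries in secondary.items():
        found[category] = [e for e in entries if e.casefold() in lowered]
    return found


names = ["Angela Merkel", "Olaf Scholz"]
secondary = {
    "function": ["Chancellor", "Minister"],
    "location": ["Berlin", "Paris"],
}

print(match_entities("Olaf Scholz, Federal Chancellor, Berlin", names, secondary))
print(match_entities("John Smith, Reporter", names, secondary))
```

In the second call no name is returned, because "John Smith" is not part of the custom list.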

Read more detailed information in our API documentation.

Displaying the Results:

Timeline:

The timeline, located below the player, displays the entire video runtime and the results from each module as gray bars.

Click any of the gray result bars to see details such as:
- Name
- Timecode (TC)
- Exact frame numbers
- Runtime/Duration
Clicking on a result moves the playhead to the beginning of that result.
These results are identical to those provided by the API, but in a more user-friendly, graphical format. If there are multiple results, use your mouse wheel to scroll through the timeline.
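
The relationship between frame numbers, timecode, and duration can be sketched in a few lines of Python. The frame rate of 25 fps, the non-drop-frame HH:MM:SS:FF layout, and the treatment of the end frame as exclusive are assumptions for this example; check your asset's actual frame rate and the API documentation for the exact conventions.

```python
def frames_to_timecode(frame: int, fps: int = 25) -> str:
    """Convert a frame number to a non-drop-frame HH:MM:SS:FF timecode."""
    if frame < 0 or fps <= 0:
        raise ValueError("frame must be >= 0 and fps must be > 0")
    total_seconds, ff = divmod(frame, fps)
    total_minutes, ss = divmod(total_seconds, 60)
    hh, mm = divmod(total_minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def duration_seconds(start_frame: int, end_frame: int, fps: int = 25) -> float:
    """Duration of a result in seconds (end frame treated as exclusive)."""
    return (end_frame - start_frame) / fps


start, end = 1500, 1612
print(frames_to_timecode(start))     # 00:01:00:00
print(frames_to_timecode(end))       # 00:01:04:12
print(duration_seconds(start, end))  # 4.48
```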

Search Field:

Lower Thirds Recognition does not have a search field, as it is purely a technical metadata tool.

Module Section

On the right side of the player, you’ll see a section with detailed results for each module used in the analysis. Clicking on the module name opens a dropdown with specific parameters, useful for troubleshooting or viewing metadata.

Result Cards

Results are displayed as cards in chronological order. Each card provides key information, such as:

- Name of the result: Indicates which named entity was recognized in the asset.
- Start frame of the appearance: Frame number
- End frame of the appearance: Frame number