What is Speaker Identification?

Module Description

The Speaker Identification module works similarly to the Face Recognition module but focuses on identifying speakers by their voices. It detects, identifies, and labels speakers in your media, using pre-trained voice models or custom-trained ones.

Customized Speaker Identification
To create and use a custom speaker identification model, access the training function in the Deep Model Customizer.

How It Works:

- Select the Media File: Choose the media file you want to analyze.
- Activate the Speaker Identification Module: In the left column, select the "Speaker Identification" module.
- Define the Model & Parameters: Choose the model for analysis, adjust the parameters (such as minimum similarity), and click the yellow "Add Module" button.
- Start the Analysis: Either add more modules or start the analysis immediately by clicking "Start Analysis."

What Parameters are available?

Model (Dropdown)
Select from pre-trained models or your own custom-trained speaker identification models.
Currently, only one pre-trained model is available:
- Celebrities
  A range of personalities, including the world's most famous people and the vast majority of German politicians and athletes.

Min. Similarity (Slider)
Adjust the minimum similarity score required to identify a speaker. A lower value returns more results, including weaker matches; a higher value returns fewer results, but the matches are more reliable.
Cluster Unknown Identities (Checkbox)
Group unrecognized speakers together as "unknown" without assigning individual IDs.
Numbering of Labels for Unknown Identities (Checkbox)
Automatically number and label all unknown speakers for easier reference.
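
The effect of the Min. Similarity slider can be pictured with a small Python sketch. This is an illustration only, not the module's actual implementation: the function name, the speaker names, and the 0.0 to 1.0 score scale are all assumptions made for the example.

```python
def accepted_matches(scores, min_similarity):
    """Return the candidate matches whose score reaches the threshold."""
    return {name: s for name, s in scores.items() if s >= min_similarity}


# Hypothetical similarity scores for one voice segment (0.0-1.0 scale assumed).
scores = {"Speaker A": 0.91, "Speaker B": 0.74, "Speaker C": 0.58}

print(accepted_matches(scores, 0.5))  # permissive threshold: all three candidates pass
print(accepted_matches(scores, 0.8))  # strict threshold: only the strongest match passes
```

A lower threshold lets weaker candidates through, which is why it returns more results; a higher threshold keeps only the closest matches.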

Speaker Index:
The Speaker Index offers the easiest way to manage unknown voices. Each speaker is automatically assigned a unique ID, allowing you to rename it instantly. Normally, in the Deep Model Customizer, you would need to upload training material for each person. However, with the Speaker Index, every voice becomes recognizable right away without the need for extra training data.
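
One way to picture why a rename takes effect everywhere is an index in which assets refer to a voice only by its ID. The Python sketch below is a toy model under that assumption; the class, its methods, and the file names are hypothetical and do not describe the product's internal data model.

```python
import uuid


class SpeakerIndex:
    """Toy in-memory index: one entry per voice, referenced by ID from every asset."""

    def __init__(self):
        self.names = {}        # voice ID -> display name
        self.appearances = {}  # voice ID -> set of asset names

    def register(self, asset):
        """Create a new entry for a voice first heard in `asset`."""
        voice_id = uuid.uuid4().hex
        self.names[voice_id] = f"Unknown {len(self.names) + 1}"
        self.appearances[voice_id] = {asset}
        return voice_id

    def add_appearance(self, voice_id, asset):
        """Record that an already indexed voice was recognized in another asset."""
        self.appearances[voice_id].add(asset)

    def rename(self, voice_id, new_name):
        # Assets store only the ID, so one rename is reflected in all of them.
        self.names[voice_id] = new_name

    def labels_for(self, asset):
        """Return the display names of all voices recorded for `asset`."""
        return sorted(
            self.names[v] for v, assets in self.appearances.items() if asset in assets
        )


index = SpeakerIndex()
voice = index.register("interview.mp4")
index.add_appearance(voice, "panel.mp4")
index.rename(voice, "Jane Doe")
print(index.labels_for("interview.mp4"), index.labels_for("panel.mp4"))
```

Because the name is stored once per voice rather than once per asset, no extra training data or per-asset editing is needed in this model.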

Displaying the Results:

Timeline:

Speaker Identification does not currently offer a timeline representation of the results.

Search Field:

Located in the top bar, the search field includes filter settings for refining your results.

Name field: Enter a name to view results that either match or don't match the entered name.
Sorting: Results can be sorted alphabetically, by similarity, or by duration. You can toggle between ascending and descending order.
Similarity: The similarity slider filters results based on confidence levels, displaying only results above a certain similarity threshold.

After adjusting the filters, click "Apply" to activate them. Active filters appear in a black box beneath the search field and can be cleared by clicking the X symbol.
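
The combination of name filter, similarity threshold, and sorting described above can be sketched in Python as follows. This is an illustration of the filter logic only: the `Result` fields, the function name, and the choice to treat the threshold as "at or above" are assumptions, not the application's API.

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    similarity: float  # assumed 0.0-1.0 scale
    duration: float    # seconds


def filter_and_sort(results, name=None, exclude=False, min_similarity=0.0,
                    sort_by="name", descending=False):
    """Apply the name and similarity filters, then sort by the chosen field."""
    selected = []
    for r in results:
        if r.similarity < min_similarity:
            continue
        if name is not None:
            matches = name.lower() in r.name.lower()
            # exclude=False keeps matching names, exclude=True keeps the rest.
            if matches == exclude:
                continue
        selected.append(r)
    keys = {
        "name": lambda r: r.name.lower(),
        "similarity": lambda r: r.similarity,
        "duration": lambda r: r.duration,
    }
    return sorted(selected, key=keys[sort_by], reverse=descending)


results = [
    Result("Jane Doe", 0.92, 12.5),
    Result("Unknown", 0.40, 3.0),
    Result("John Roe", 0.81, 30.0),
]
top = filter_and_sort(results, min_similarity=0.8, sort_by="duration", descending=True)
print([r.name for r in top])  # ['John Roe', 'Jane Doe']
```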

Module Section

On the right side of the player, you’ll see a section with detailed results for each module used in the analysis. Clicking on the module name opens a dropdown with specific parameters, useful for troubleshooting or viewing metadata.

Result Cards

Results are displayed as cards in chronological order. Each card provides key information, such as:

Name of the result
Indicates the detected speaker or "Unknown."
Identicon
A unique, automatically generated placeholder image for the speaker label. Identicons are simple images created from data such as an email address or username. They use patterns of colored squares to give each user a distinct picture and are often used on websites and apps as profile images when no photo is provided.

Similarity score:
Shows how closely the speaker’s voice matches the model.
Gender:
Indicates the gender, if recognized via voice.
Rename:
Click on the three dots (...) and choose "Rename" to relabel the result. The change applies to all assets in which this voice has been recognized, both current and past.
Show in Index Collection:
Click on the three dots (...) and choose "Show in Index Collection" to open the Deep Indexer. Here you can view the indexed voice's ID Card and Unique ID.
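
To make the identicon idea mentioned among the card fields more concrete, the following Python sketch derives a color and a mirrored 5x5 pattern from the hash of a label. It shows the general technique only; the hash function, grid size, and label format are assumptions, and the application may generate its identicons differently.

```python
import hashlib

GRID = 5  # odd grid size, so the middle column is its own mirror image


def identicon(label):
    """Return (rgb, grid): a color and a left-right symmetric grid of booleans."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    color = tuple(digest[:3])   # the first three bytes pick the color
    half = (GRID + 1) // 2      # columns generated before mirroring
    grid = []
    for row in range(GRID):
        left = [digest[3 + row * half + col] % 2 == 0 for col in range(half)]
        # Mirror the left half without repeating the middle column.
        grid.append(left + left[-2::-1])
    return color, grid


def render(grid):
    """Draw the grid as text, '#' for a filled square and '.' for an empty one."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in grid)


color, grid = identicon("speaker-42")  # hypothetical speaker label
print(color)
print(render(grid))
```

The same label always produces the same picture, which is what makes an identicon usable as a stable placeholder for an unnamed voice.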

The Deep Indexer:
With the Deep Indexer, you can find all other assets in which this voice was recognized in the past, relabel the person, and delete the indexed voice from all assets.
Deleting removes the person from the Index, but the media itself remains.