Recent advances in speaker and language recognition and characterization
The goal of this special issue is to highlight the current state of research on speaker and language recognition and characterization. New ideas about features, models, tasks, datasets and benchmarks are emerging, making this a particularly exciting time.
In the last decade, speaker recognition (SR) has gained importance in the field of speech science and technology, with new applications beyond forensics, such as large-scale filtering of telephone calls, automated access through voice profiles, and speaker indexing and diarization. Current challenges include performing verification on increasingly short signals, the need for algorithms that are robust to all kinds of extrinsic variability (such as noise and channel conditions) while tolerating a certain amount of intrinsic variability (due to health issues, stress, etc.), and the development of countermeasures against spoofing and tampering attacks. Language recognition (LR) has also attracted remarkable interest from the community as an auxiliary technology for speech recognition, dialogue systems and multimedia search engines, and especially for large-scale filtering of telephone calls. An active area of research specific to LR is dialect and accent identification. Other issues that must be dealt with in LR tasks (short signals, channel and environment variability, etc.) are essentially the same as for SR.
The features, modeling approaches and algorithms used in SR and LR are closely related, though not equally effective, since these two tasks differ in several ways. In the last couple of years, following the success of Deep Learning in image and speech recognition, the use of Deep Neural Networks as both feature extractors and classifiers/regressors has been opening exciting new research horizons.
Until recently, speaker and language recognition technologies were mostly driven by NIST evaluation campaigns: Speaker Recognition Evaluations (SRE) and Language Recognition Evaluations (LRE), which focused on large-scale verification of telephone speech. In recent years, other initiatives (such as the 2008/2010/2012 Albayzin LRE, the 2013 SRE in Mobile Environment, the RSR2015 database or the 2015 Multi-Genre Broadcast Challenge) have widened the range of applications and the research focus. Authors are encouraged to use these benchmarks to test their ideas.
This special issue aims to cover state-of-the-art work. In addition, to give readers up-to-date background on the topic, we will invite one survey paper, which will undergo peer review. Topics of interest include, but are not limited to:
- Speaker and language recognition, verification, identification
- Speaker and language characterization
- Features for speaker and language recognition
- Speaker and language clustering
- Multispeaker segmentation, detection, and diarization
- Language, dialect, and accent recognition
- Robustness to channel and environment variability
- System calibration and fusion
- Speaker recognition with speech recognition
- Multimodal speaker recognition
- Speaker recognition in multimedia content
- Machine learning for speaker and language recognition
- Confidence estimation for speaker and language recognition
- Corpora and tools for system development and evaluation
- Low-resource (lightly supervised) speaker and language recognition
- Speaker synthesis and transformation
- Human and human-assisted recognition of speaker and language
- Spoofing and tampering attacks: analysis and countermeasures
- Forensic and investigative speaker recognition
- Systems and applications
Note that all papers will go through the same rigorous review process as regular papers, with a minimum of two reviewers per paper.
Guest Editors:
- Eduardo Lleida, University of Zaragoza, Spain
- Luis J. Rodríguez-Fuentes, University of the Basque Country, Spain
Important dates:
- Submission open: May 6, 2016
- Submission deadline (extended): October 9, 2016
- Notification of final decision: March 31, 2017
- Scheduled publication: April 2017