TLK: Features

The current stable version (1.3.1) includes the following main automatic speech recognition (ASR) functions:

  • Diagonal Gaussian mixture and Bernoulli mixture HMM acoustic models.
  • Feature extraction.
  • I/O of acoustic models.
  • Initialisation of acoustic models.
  • Parameter estimation for acoustic models, including the Baum-Welch and Viterbi algorithms.
  • Acoustic model adaptation: MLLR and CMLLR features.
  • Recognition using ARPA language models and self-generated acoustic models.
  • Viterbi alignment.
  • Incremental training of acoustic models.
  • Weighted interpolation of acoustic models.
  • Recognition using hybrid DNN-HMMs.
  • Adaptation using DNNs.
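
As an illustration of the first item in the list, the following is a minimal sketch of how a frame could be scored under a diagonal-covariance Gaussian mixture, the kind of emission density used in HMM acoustic models. It is not TLK code: the function names `log_diag_gaussian` and `log_gmm` and their argument layout are assumptions made for this example only.

```python
import math


def log_diag_gaussian(x, mean, var):
    """Log-density of vector x under a Gaussian with diagonal covariance.

    Hypothetical helper for illustration; not part of TLK.
    """
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, weights, means, variances):
    """Log-likelihood of x under a mixture of diagonal Gaussians.

    Uses the log-sum-exp trick so that very small component
    likelihoods do not underflow to zero.
    """
    terms = [
        math.log(w) + log_diag_gaussian(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


if __name__ == "__main__":
    # Two-component mixture over 2-dimensional feature vectors.
    frame = [0.3, -1.2]
    weights = [0.6, 0.4]
    means = [[0.0, -1.0], [1.0, 0.5]]
    variances = [[1.0, 0.5], [2.0, 1.0]]
    print(log_gmm(frame, weights, means, variances))
```

Working in the log domain is a common design choice for this kind of scoring, since per-frame likelihoods are multiplied over long utterances and would otherwise underflow.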

And these additional usability features:

  • High-level tools to facilitate the preprocessing, training and recognition of standard acoustic systems.
  • A tool to directly transcribe a media file using a pre-installed system.
  • Simple configuration files for training setup.
  • Compressed ZIP file support.
  • Internationalisation of tools and documentation.

TLK is the software behind the transLectures automatic transcription system in UPV’s Polimedia video lecture repository. It is being actively developed as the transLectures (tL) project progresses, and new versions will be released with improved features and usability.

The research leading to this software has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement nº 287755.