Tools

This section lists the automatic speech recognition (ASR) and machine translation (MT) software toolkits created by the transLectures partners. We have developed or improved this software to implement our proposed solutions for producing accurate transcriptions and translations for large video lecture repositories.

Some of these toolkits are free software and can be downloaded from the links we provide.

TLP: The transLectures Platform

The transLectures Platform (TLP) is an open source (Apache License 2.0) set of software tools that includes everything needed to integrate transLectures transcription and translation technologies into a media repository. Its main components are the transLectures Database, Ingest Service, Web Service and Player.

TLK: The transLectures-UPV toolkit

The transLectures-UPV toolkit (TLK) is an open source (Apache License 2.0) set of tools for automatic speech recognition (ASR) developed at Universitat Politècnica de València (UPV). Among other functionalities, it features parameter estimation of hidden Markov models (HMMs) and recognition (speech, text…).
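As a rough illustration of what HMM-based recognition involves, the sketch below decodes the most likely state sequence of a small discrete HMM with the Viterbi algorithm. It is a generic textbook example and not TLK's API: the function name, the two states and all probabilities are invented for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_state_path, probability) for a discrete HMM."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximises the path probability.
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, prob


# Toy model (all numbers are made up): silence vs. speech, observed as
# low or high frame energy.
STATES = ("sil", "speech")
START = {"sil": 0.8, "speech": 0.2}
TRANS = {
    "sil": {"sil": 0.7, "speech": 0.3},
    "speech": {"sil": 0.2, "speech": 0.8},
}
EMIT = {
    "sil": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.3, "high": 0.7},
}

path, prob = viterbi(["low", "high", "high"], STATES, START, TRANS, EMIT)
print(path, prob)
```

A real recogniser works with continuous acoustic features, log-probabilities and much larger state spaces, but the dynamic-programming recursion is the same in spirit.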

TLK is the software behind the implementation of the transLectures automatic transcription system in UPV’s Polimedia video lecture repository. It is being actively developed as the tL project progresses, and new versions will be released with improved features and usability.

TLM: The transLectures Matterhorn Plug-in

The transLectures Matterhorn Plug-in provides a transLectures Matterhorn Service and a transLectures Matterhorn Custom Workflow in order to integrate the transLectures Platform tools into the Opencast Matterhorn platform. It has been developed and tested for Opencast Matterhorn 1.4.0, but it can easily be extended to support other versions.

RASR: The RWTH Aachen University Speech Recognition System

RASR (short for “RWTH ASR”) is a software package containing a speech recognition decoder together with tools for the development of acoustic models, for use in speech recognition systems. It has been developed by the Human Language Technology and Pattern Recognition Group at RWTH Aachen University since 2001. Speech recognition systems developed using this framework have been applied successfully in several international research projects and corresponding evaluations.

rwthlm: The RWTH Aachen University Neural Network Language Modeling Toolkit

rwthlm is a toolkit for neural network language modeling. It supports different kinds of neural network layers (feedforward, standard recurrent, and long short-term memory), arbitrarily deep networks, and arbitrary combinations of these layer types. It has been developed by the Human Language Technology and Pattern Recognition Group at RWTH Aachen University since 2013. The software has been used successfully in international evaluations, giving substantial improvements in both speech recognition and machine translation applications.

Jane: The RWTH Aachen University Statistical Machine Translation Toolkit

Jane is RWTH’s open source statistical machine translation toolkit. Jane supports state-of-the-art techniques for phrase-based and hierarchical phrase-based machine translation. Many advanced features are implemented in the toolkit, such as forced alignment phrase training for the phrase-based model and several syntactic extensions for the hierarchical model. RWTH has been developing Jane over the past years, and it has been used successfully in numerous machine translation evaluations.

EML Transcription Platform

For the massive adaptation of both acoustic models and stochastic language models, European Media Laboratory (EML) uses its own set of tools, which are integrated into the EML Transcription Platform, a web-service-based framework for creating and adapting language components (acoustic model, language model) and deploying them in 24/7 usage scenarios.

The XEROX TunaTon Toolkit

The XEROX TunaTon Toolkit is a facility for training a large number of translation model variations in parallel, in order to compare their performance for a particular application domain.
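The general idea of training many variations in parallel and ranking them can be sketched as follows. This is not TunaTon's interface: the parameter names, the grid values and the `train_and_score` function are hypothetical placeholders, and the score it returns is a toy formula rather than a real translation quality metric.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def train_and_score(config):
    """Hypothetical stand-in for training one model variation and scoring it
    on held-out in-domain data. The formula is a placeholder, not a metric."""
    return -abs(config["max_phrase_len"] - 5) - abs(config["lm_order"] - 4)


def compare_variations(grid, scorer, max_workers=4):
    """Score every combination in `grid` concurrently; return (score, config)
    pairs sorted from best to worst."""
    keys = sorted(grid)
    configs = [
        dict(zip(keys, values))
        for values in product(*(grid[k] for k in keys))
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so scores line up with configs.
        scores = list(pool.map(scorer, configs))
    return sorted(zip(scores, configs), key=lambda pair: pair[0], reverse=True)


# Hypothetical grid of model variations.
GRID = {"max_phrase_len": (3, 5, 7), "lm_order": (3, 4, 5)}

ranked = compare_variations(GRID, train_and_score)
for score, config in ranked[:3]:
    print(score, config)
```

Threads keep the sketch short. Actual model training is compute-bound, so in practice each variation would more plausibly run as a separate process or cluster job.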