linguaXlab

The project develops research-driven linguistic services to enhance the accessibility and (re)usability of language data related to South Tyrol. We specialise in Speech-to-Text and Natural Language Processing (NLP).

Contact: Greta H. Franzini, Luca Ducceschi.

Speech-to-Text

We have trained an AI model, AUGUSTA, capable of transcribing South Tyrolean dialectal speech into Standard German. This tool supports transcription of interviews, meetings, archival recordings, and more. Its performance is continuously refined through partnerships with both public and private sector organisations, including media and film companies, archives, IT firms and publishers.

The model is available for research and commercial applications. We are particularly interested in research collaborations focussed on improving its accuracy.

Collaborators: AlpiLinK project (Università di Verona); Amt für Film und Medien (Autonome Provinz Bozen); Eurac Research (Institute for Regional Development, Institute for Minority Rights, Centre for Advanced Studies).

Natural Language Processing (NLP)

We conduct a wide range of linguistic analyses, from small-scale studies to large-scale computational processing. These include linguistic annotation, word embeddings, topic modelling, sentiment analysis and the extraction of neologisms. Our flagship initiative is Korpus Südtirol, a dedicated platform for exploring language use in South Tyrol.

Collaborators: Landesbibliothek Dr. Friedrich Teßmann.