Information extraction and retrieval: Hiekadi - Heraldabide


This is an applied research project focussing on the digital transformation and smart management of content across four Basque-language mass media outlets. Today's media technologies are mainly geared towards text: video and audio content has seen little processing to date, and recent neural language models have not yet been applied. This line of research therefore seeks to develop an innovative platform capable of overcoming these limitations.

  • Extraction of semantic tags from text and audio materials in both Basque and Spanish.
  • Neural extraction of tags, distinguishing between general subjects, specific subjects and named entities (people, organizations and places).
  • Robust automatic subtitling of audio materials that accounts for spontaneous speech, Basque dialects and ambient noise.

