| Name | Description | Size | Format |
|---|---|---|---|
| | | 963.54 KB | Adobe PDF |
Advisor(s)
Abstract(s)
The detection of highlights in broadcast streams is essential for enhancing User Experience (UX) through automated summaries and efficient content retrieval. This is particularly relevant for live streaming environments common in sports and eSports, where audiences demand near real-time analysis. This paper presents a benchmark of models for highlight detection in broadcast audio, validated on the SoccerNet dataset but applicable to general competitive gaming streams. We propose a novel multi-modal architecture combining high-level semantic audio features (YAMNet) with Natural Language Processing (NLP) of transcribed commentary (analogous to eSports shoutcasting). Results show that fusing audio event detection with semantic text analysis significantly outperforms uni-modal baselines. The proposed framework offers a computationally efficient solution for AI-based broadcasting technologies, enabling scalable automation for content creators and improved viewer experiences.
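As a rough illustration of the fusion idea described in the abstract, the sketch below combines clip-level YAMNet audio embeddings with a text embedding of transcribed commentary and scores each window as highlight or non-highlight. YAMNet is loaded from TensorFlow Hub as in its public documentation; everything else is an assumption for illustration: `embed_text` is a hypothetical stand-in for the paper's commentary NLP branch, and the concatenation-plus-dense fusion head is one plausible design, not necessarily the authors' architecture.

```python
# Minimal sketch of multi-modal highlight scoring, assuming:
# - YAMNet from TensorFlow Hub for high-level audio embeddings (real model),
# - a hypothetical embed_text() standing in for the commentary NLP branch,
# - a simple concatenation-based fusion head (illustrative, untrained).
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# YAMNet expects mono float32 audio at 16 kHz; it returns per-frame
# class scores, 1024-dim embeddings, and a log-mel spectrogram.
yamnet = hub.load("https://tfhub.dev/google/yamnet/1")

def embed_audio(waveform_16k: np.ndarray) -> tf.Tensor:
    """Mean-pool YAMNet frame embeddings into one clip-level vector."""
    _, embeddings, _ = yamnet(waveform_16k)
    return tf.reduce_mean(embeddings, axis=0)  # shape: (1024,)

def embed_text(transcript: str) -> tf.Tensor:
    """Hypothetical commentary branch: in practice, a sentence
    encoder over ASR output of the shoutcasting would go here."""
    return tf.zeros(512)

# Fusion head: concatenate the two modality vectors and emit a
# highlight probability for the window. Weights are untrained here.
fusion_head = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

def highlight_score(waveform_16k: np.ndarray, transcript: str) -> float:
    fused = tf.concat([embed_audio(waveform_16k),
                       embed_text(transcript)], axis=0)
    return float(fusion_head(fused[tf.newaxis, :])[0, 0])

# Example call: one second of silence with an empty transcript.
print(highlight_score(np.zeros(16000, dtype=np.float32), ""))
```

Mean-pooling the frame embeddings is one cheap way to get a fixed-size clip vector, which fits the abstract's emphasis on computational efficiency for near real-time use; attention pooling or temporal models are common alternatives.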
Description
Keywords
AI-based sports technologies; Audio event detection; Broadcast stream automation; Machine learning for real-time analysis; Multi-modal deep learning
Educational Context
Citation
Publisher
Science and Technology Publications, Lda
