MultiMoco NTU | NTU Multimedia Corpus

About

This project aims to pioneer the construction of a large-scale multimodal corpus for languages in Taiwan (MultiMoCo), covering the four official language groups in Taiwan, together with corpus analysis tools enhanced by both human annotation and recent multimodal machine learning techniques. The primary goal of this long-term project is to provide an empirical base for supporting and evaluating theoretical claims in cognitive linguistics.

Statistics

Video

News from Taiwan's public digital television channels
Broadcast footage of proceedings in Taiwan's Legislative Yuan

Dialogue

Text transcribed from the audio by the OpenAI Whisper model

Caption

On-screen text extracted from the video using OCR

Gesture

Speakers' gestures recognized in the video by MediaPipe

223 total clips

5854 total minutes

1485297 total characters

22805 total gestures

【2022 NTU Multimodal Corpus Workshop】

📅 Dates: 11/12 (Sat) - 11/14 (Mon)
🙋 Eligibility: Open to anyone interested in multimodal analysis.
💰 Fee: This event is free of charge.
🔔 Registration period: From now until 11/7 (Mon) 23:59

Lunch break on all three days: 12:00-14:00 (boxed lunch provided)

11/12 (Sat), Venue: Boya 201 (博雅201)
Morning
9:30-9:50 Check-in
10:00-10:50 謝舒凱、曾昱翔: MultiMoCo: A large-scale multimodal corpus for languages in Taiwan
11:00-11:50 謝舒凱: How to conduct a multimodal corpus-based study
Afternoon
14:00-14:50 陳品而: Multimodal annotation with ELAN
(tea time)
15:10-15:50 謝舒凱、張鈺琳: Multimodal semantic annotation
16:00-16:50 [Online] Tiago Torrent: Reframing multimodal datasets: what Frame Semantics can contribute to the analysis of distinct communicative modes

11/13 (Sun), Venue: Boya 201 (博雅201)
Morning
10:00-10:50 徐嘉慧: Language, Gesture, and Entity
11:10-11:50 廖聿鋆: Automatic gesture recognition with MediaPipe
Afternoon
14:00-14:50 黃柏瑄: Automatic lip shape capturing in speech production
(tea time)
15:00-15:50 王麒瑋、陳玠青: MAUS Web Service and Praat Tutorial
16:10-17:00 許芸涵: Phonetic analysis with Praat

11/14 (Mon), Venue: Zonghe 202 (綜合202)
Morning
10:00-10:50 廖元甫: National Language Speech Corpus: Construction, Applications, and Prospects
11:00-12:00 曾昱翔: Multimodal machine learning techniques
Afternoon
13:00-15:00 莊勻、柯逸均: Introduction to EEG data collection and analysis
15:10-15:40 Pre-proposal discussion and tea time
16:00-16:50 [Online closing speech] Asli Özyürek: Multimodality as a design feature of language