MERaLiON released its first series of open-source models in December 2024, designed to address linguistic nuances in local accents across complex, multilingual environments. MERaLiON-AudioLLM overcomes traditional model limitations by using a fusion-based architecture that integrates audio and text cohesively, delivering improved performance in speech understanding.
- MERaLiON-AudioLLM: This model fuses the MERaLiON-Whisper encoder, derived from Whisper-large-v2, with SEA-LION V3, a localised LLM developed by AI Singapore. It is fine-tuned on a carefully curated dataset of 260,000 hours of speech from diverse sources, including the National Speech Corpus (NSC), so that the model can process and integrate audio and text in an end-to-end manner for downstream tasks.
- LLaMA-3-MERaLiON-8B-Instruct: Building on the Llama-3-8B architecture, this model continues pretraining on over 120 billion tokens of data in English, Chinese, and Indonesian. It is further enhanced through rigorous pretraining and model weight merging.
- MERaLiON-SpeechEncoder-v1: A foundational speech model trained on 200,000 hours of predominantly English data, including 10,000 hours of Singapore-based speech, to support a range of speech applications within Singapore and beyond.
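The fusion idea behind MERaLiON-AudioLLM, in which speech-encoder output and text are handled as one sequence by the language model, can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not MERaLiON's actual implementation: the function name `fuse_audio_and_text`, the single linear adapter, and all array sizes are hypothetical and chosen for illustration.

```python
import numpy as np


def fuse_audio_and_text(audio_frames, text_embeddings, adapter_weight):
    """Map audio-encoder frames into the LLM embedding space and prepend them to the text."""
    # Linear adapter (an assumption): (n_frames, audio_dim) @ (audio_dim, llm_dim)
    audio_tokens = audio_frames @ adapter_weight
    # Concatenate along the sequence axis so a decoder would see one combined sequence
    return np.concatenate([audio_tokens, text_embeddings], axis=0)


# Toy sizes for illustration only; these are not the real model dimensions
rng = np.random.default_rng(0)
audio_frames = rng.normal(size=(6, 8))       # stand-in for speech-encoder output
text_embeddings = rng.normal(size=(4, 16))   # stand-in for embedded prompt tokens
adapter_weight = rng.normal(size=(8, 16))    # hypothetical learned projection

fused = fuse_audio_and_text(audio_frames, text_embeddings, adapter_weight)
print(fused.shape)  # (10, 16)
```

In this sketch the six audio frames and four text tokens end up in a single ten-step sequence with a shared embedding width, which is the general sense in which a fusion-based design lets one model attend over both modalities.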
The development and deployment of these models are made possible through NSCC Singapore’s HPC resources, providing the necessary computational power for large-scale model training, fine-tuning, and rapid experimentation. The ASPIRE 2A+ supercomputing system and 300TB of data storage support distributed training, enabling efficient processing of multilingual data across various models.
- Efficient Model Training: HPC resources accelerate the development of MERaLiON’s models by enabling faster training and optimisation using large, diverse datasets.
- Scalable Data Handling: The ability to process massive datasets, scaling to more than 600,000 speech hours, enhances the model’s effectiveness across multiple languages and accents.
- AI Model Performance: The integration of AI models with HPC ensures that MERaLiON can deliver real-time predictions and insights, enabling faster, more accurate language processing and decision-making.
Developed to deepen the understanding of human communication dynamics through multimodal integration, MERaLiON marks a major step forward for AI capabilities in Singapore and the wider Southeast Asia region. It brings benefits to AI and language processing across industries:
- Customer Service Automation: MERaLiON enhances customer interactions by understanding local languages and emotions, improving accessibility and satisfaction. Small and Medium Enterprises (SMEs) can benefit from cost-effective, custom AI-driven support that caters to diverse communities, including non-English speakers and the elderly.
- Discovery of New Insights: With its empathetic reasoning, MERaLiON can detect distress signs in multilingual conversations, enabling early intervention. It also supports policy makers and companies in analysing trends to guide decisions in business strategy and policy formulation.
- Agentic Decision-Making: By processing multilingual and multimodal data, MERaLiON empowers government agencies and organisations to enhance cybersecurity, technology foresight, and strategic decision-making. It plays a key role in supporting policy innovation and identifying emerging opportunities and risks, helping Singapore maintain its competitive edge.
In furthering the delivery of Singapore-specific AI solutions, Microsoft and A*STAR I²R signed a Memorandum of Understanding (MoU) to integrate MERaLiON with Microsoft Azure. This collaboration will incorporate MERaLiON into Microsoft 365 and Copilot, enabling businesses to easily adopt AI assistants tailored to local languages and contexts. This partnership ensures that businesses across Singapore can improve productivity workflows, benefitting from AI-powered solutions specifically designed for the region.
MERaLiON is strategically positioned to empower public agencies and businesses in addressing current challenges while fostering the development of innovative use cases that drive future growth and efficiency.
“MERaLiON is a key step in advancing Singapore’s AI research, especially in understanding language nuances. By leveraging high-performance computing, we are developing models that can tackle real-world challenges across various sectors. The support from NSCC Singapore’s computational resources has been essential in enabling us to accelerate our research and bring us closer to creating AI solutions that are tailored to the region’s needs.”
Ms. Aw Ai Ti
Principal Investigator, Head of the Aural & Language Intelligence (ALI) department
A*STAR Institute for Infocomm Research (A*STAR I²R)