Meta’s Muse Spark takes AI a step closer to personal superintelligence
Meta Superintelligence Labs has introduced Muse Spark, a natively multimodal reasoning model with support for tool use, visual chain of thought, and multi-agent orchestration. The release includes a Contemplating mode, now rolling out gradually, that orchestrates multiple agents reasoning in parallel.
Prompt: Can you turn this into a sudoku game that I can play in the web? (Source: Meta)
Capabilities
Meta positions Muse Spark as part of its push toward personal superintelligence that can understand a user’s world. The model can analyze a user’s immediate environment and support wellness-related use cases through its reasoning capabilities.
The model integrates visual information across domains and tools. It performs well on visual STEM questions, entity recognition, and localization, enabling interactive experiences such as creating minigames or troubleshooting home appliances with dynamic annotations.
The company said it collaborated with more than 1,000 physicians to curate training data aimed at improving health-related responses. The model can generate interactive displays that explain information such as nutritional content or muscle activity during exercise.
Model scaling
Meta studies Muse Spark’s scaling along three axes: pretraining, reinforcement learning, and test-time reasoning.
During pretraining, the system develops multimodal understanding, reasoning, and coding capabilities that serve as a foundation for later stages.
Reinforcement learning is used to amplify capabilities and improve reliability, with gains that generalize to unseen tasks, according to the company.
Test-time reasoning allows the system to “think” before producing responses. Meta said it uses thinking-time penalties to optimize token use and multi-agent orchestration to improve performance while maintaining comparable latency.
“To deliver the most intelligence per token, our RL training maximizes correctness subject to a penalty on thinking time,” the company said.
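Meta has not published the exact form of this objective. As a minimal sketch, assuming a linear per-token penalty, the trade-off can be written as a reward that credits a correct answer and subtracts a cost proportional to the thinking tokens spent. The function name, the penalty coefficient, and the linear form are illustrative assumptions, not Meta's method.

```python
def penalized_reward(correct: bool, thinking_tokens: int,
                     penalty_per_token: float = 0.001) -> float:
    """Hypothetical reward: correctness credit minus a linear thinking-time cost.

    All names and the default coefficient are assumptions for illustration.
    """
    if thinking_tokens < 0:
        raise ValueError("thinking_tokens must be non-negative")
    # 1.0 for a correct final answer, 0.0 otherwise.
    correctness = 1.0 if correct else 0.0
    # Each thinking token costs a fixed amount, so shorter correct
    # reasoning scores higher than longer correct reasoning.
    return correctness - penalty_per_token * thinking_tokens


if __name__ == "__main__":
    short = penalized_reward(True, thinking_tokens=200)
    long = penalized_reward(True, thinking_tokens=800)
    wrong = penalized_reward(False, thinking_tokens=200)
    print(short > long > wrong)
```

Under this toy reward, two correct answers are ranked by how few thinking tokens they used, while an incorrect answer still scores below both as long as the penalty stays small relative to the correctness credit.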
The company added that increasing the number of parallel agents allows more reasoning at inference time without significantly increasing latency.
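The latency claim can be illustrated with a toy model. The sketch below assumes agents run fully concurrently with enough hardware for each, followed by a single aggregation step; the function names and the aggregation cost are hypothetical and do not describe Meta's system.

```python
from typing import Sequence


def sequential_latency(agent_seconds: Sequence[float]) -> float:
    """Wall-clock time if the agents reason one after another."""
    return sum(agent_seconds)


def parallel_latency(agent_seconds: Sequence[float],
                     aggregation_seconds: float = 0.0) -> float:
    """Wall-clock time if all agents run concurrently (idealized).

    The slowest agent sets the pace; a hypothetical aggregation step
    that combines the agents' answers is added on top.
    """
    if not agent_seconds:
        raise ValueError("at least one agent is required")
    return max(agent_seconds) + aggregation_seconds


if __name__ == "__main__":
    times = [2.0, 3.0, 2.5]  # made-up per-agent reasoning times in seconds
    print(sequential_latency(times), parallel_latency(times, 0.5))
```

In this idealized picture, adding agents increases the total reasoning performed (the sum) while wall-clock time tracks only the slowest agent plus aggregation, which is consistent with the company's description of more inference-time reasoning at comparable latency.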
Safety evaluation
Muse Spark was evaluated before deployment using Meta’s Advanced AI Scaling Framework, which defines threat models, evaluation protocols, and deployment thresholds.
According to the company, the model demonstrates refusal behavior in high-risk domains such as biological and chemical threats. Meta also said Muse Spark does not exhibit the autonomous capabilities or hazardous tendencies required to realize such scenarios and remains within safe margins across evaluated risk categories.
