Tech Product Reviews, How To, Best Ofs, deals and Advice
Cleaning up audio usually means scrubbing timelines and tweaking filters, but Meta thinks it should be as easy as describing the sound you want. The company has released a new open-source AI model called SAM Audio that can isolate almost any sound from a complex recording using simple text prompts.
Users can pull out specific noises like voices, instruments, or background sounds without digging through complicated editing software. The model is now available through Meta’s Segment Anything Playground, which houses other prompt-based image and video editing tools. Broadly speaking, SAM Audio is designed to understand what sound you want to work with and separate it cleanly from everything else. Meta says this opens the door to faster audio editing for use cases like music production, podcasting, film and television, accessibility tools, and research. For example, a creator could isolate vocals from a band recording, remove traffic noise from a podcast, or delete a barking dog from an otherwise perfect recording, all by describing what they want the model to target.
How SAM Audio works
SAM Audio is a multimodal model that supports three different types of prompts. Users can describe a sound using text, click on a person or object in a video to visually identify the sound they want to isolate, or mark a time span where the sound first appears. These prompts can be used alone or combined, giving users fine-grained control over what gets separated. Under the hood, the system relies on Meta’s Perception Encoder Audiovisual engine, which gives the model its ability to recognize and understand sounds before separating them from the mix.
To improve the evaluation of audio separation, Meta has also introduced SAM Audio-Bench, a benchmark for measuring how well models handle speech, music, and sound effects. It is accompanied by SAM Audio Judge, which evaluates how natural and accurate the separated audio sounds to human listeners, even without reference tracks to compare against. Meta claims these evaluations show SAM Audio performs best when different prompt types are combined and can process audio faster than real time, even at scale.
That said, the model has clear limitations. It does not support audio-based prompts, cannot perform full separation without any prompting, and struggles with similar overlapping sounds, such as isolating a single voice from a choir. Meta says it plans to improve these areas and is already exploring real-world applications, including accessibility work with hearing-aid makers and organizations supporting people with disabilities.
The launch of SAM Audio ties into Meta’s broader AI push. The company is improving voice clarity on its AI glasses for noisy environments, working toward next-generation mixed reality glasses expected to arrive in 2027, and developing a conversational AI that could rival ChatGPT, signaling a wider focus on AI models that understand sound, context, and interaction.
