Meta Released Llama’s Multimodal Version 3.2 in 11B and 90B Sizes

IBL News | New York

Meta released Llama’s multimodal version, 3.2 11B and 90B, this Wednesday during its Meta Connect 2024 developer conference in Menlo Park. These larger models can interpret charts and graphs, analyze and caption images, and pinpoint objects in pictures given a simple description.

The multimodal Llama models can be downloaded from Hugging Face, Microsoft Azure, Google Cloud, and AWS. Meta is also hosting them on the official Llama site, Llama.com.
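For developers, getting started means pulling the weights from one of those hosts. Below is a minimal sketch of loading the 11B vision model with the Hugging Face transformers library; the model ID and class names follow common Hugging Face conventions and should be checked against the model card, and access to the gated weights requires accepting Meta’s license.

```python
# Minimal sketch: captioning an image with Llama 3.2 11B Vision via Hugging Face.
# Assumes a recent `transformers` release with Llama 3.2 support and that access
# to the gated repo "meta-llama/Llama-3.2-11B-Vision-Instruct" has been granted.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # check the model card for the exact ID
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Any chart or photo works here; this URL is a placeholder.
image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe what this chart shows."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```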

The company uses them to power its AI assistant, Meta AI, across WhatsApp, Instagram, and Facebook.

Meta’s CEO, Mark Zuckerberg, said that Meta AI is one of the most used AI assistants worldwide, with almost 500 million monthly active users, and that it is on track to become the most used assistant in the world.

Llama 3.2 1B and 3B are lightweight, text-only models that run on smartphones and other devices. They can be used for tasks such as summarizing and rewriting paragraphs (e.g., in an email). Optimized for Arm hardware from Qualcomm and MediaTek, the 1B and 3B models can also tap tools such as calendar apps.
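As an illustration of the kind of summarization task Meta describes, here is a hedged sketch using the 1B instruction-tuned model through the transformers text-generation pipeline. The model ID is assumed from Hugging Face naming conventions, and a real on-device deployment would use an optimized Arm runtime rather than this server-side code.

```python
# Sketch: paragraph summarization with the lightweight Llama 3.2 1B Instruct model.
# The model ID is an assumption based on Hugging Face naming; on phones Meta targets
# optimized Arm runtimes, whereas this example runs through standard transformers.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

email = (
    "Hi team, the launch slipped by a week because the vendor shipped the wrong parts. "
    "We now expect hardware on Friday, integration next Monday, and a demo the following Thursday."
)
messages = [
    {"role": "system", "content": "Summarize the user's text in one sentence."},
    {"role": "user", "content": email},
]

result = generator(messages, max_new_tokens=60)
print(result[0]["generated_text"][-1]["content"])
```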

Another announcement at the Connect 2024 developer conference was that Meta’s AI assistant is getting a new voice mode.

Meta AI can now respond to questions out loud across platforms where it’s available: Instagram, Messenger, WhatsApp, and Facebook.

Meta also said it’s piloting an AI translation tool to translate voices in Instagram Reels automatically. The tool dubs a creator’s speech and auto-lip-syncs it, simulating the voice in another language.

For the voice mode, users can choose from several voices, including AI clones of the celebrities Meta hired: Awkwafina, Dame Judi Dench, John Cena, Keegan-Michael Key, and Kristen Bell.

Google released upgraded Gemini models earlier this week, and OpenAI unveiled its o1 model earlier in the month.

Meta’s Connect 2024 Conference