IBL News | New York
In October 2023, researchers from Apple and Columbia University released the code and weights of Ferret, an open-source multimodal LLM (MLLM) licensed for research use only. The model attracted little attention at the time.
Separately, Apple recently announced a breakthrough in running LLMs on-device on the iPhone and iPad, along with new techniques for 3D avatars and more immersive visual experiences.
Ferret's release also includes a curated dataset the researchers call “GRIT, a comprehensive refer-and-ground instruction tuning dataset including 1.1M samples that contain rich hierarchical spatial knowledge, with 95K hard negative data to promote model robustness.”
“The resulting model not only achieves superior performance in classical referring and grounding tasks but also greatly outperforms existing MLLMs in region-based and localization-demanded multimodal chatting,” wrote the creators of Ferret.
Interestingly, the news of Apple's open-source and on-device ML work comes as both Anthropic and OpenAI negotiate massive new funding rounds for their proprietary LLM efforts.
I somehow missed this. @Apple joined the open source AI community in October. Ferret’s introduction is a testament to Apple’s commitment to impactful AI research, solidifying its place as a leader in the multimodal AI space. Way to go @Apple – ps: I’m looking forward to the day… https://t.co/Pi1kQrsVvx
— Bart de Witte (@OpenMedFuture) December 23, 2023