WWDC24: Siri's Evolution with Apple Intelligence

By Sean Chen, June 11, 2024


Apple has just announced Apple Intelligence, and it brings a lot to the table: writing tools, speech-to-text and summarization, smart replies, image and emoji generation, and, most notably, a transformation of Siri into a genuine voice assistant.

At first glance, the initial features seemed like Apple was merely catching up to Android, introducing incremental updates that Android users have had for a while.

But then Apple unveiled the next-gen Siri, and that’s when their design and integration prowess became evident.

Not Simply Using GPT-4o

Contrary to the buzz, Apple Intelligence isn't simply a layer on top of GPT-4o. Judging from the hardware it supports (the A17 Pro and M-series chips), it's clear Apple has embedded smaller language models and purpose-built image-generation models directly on device. On top of that, the new Private Cloud Compute technology lets Apple Intelligence and Siri call larger language models in the cloud while preserving privacy.

Interestingly, this new generative language model appears to be trained specifically for common mobile scenarios. The GPT-4o integration, delivered in partnership with OpenAI, is reserved for more advanced tasks, such as complex logical queries. This division of labor lets Apple's own model stay streamlined and efficient, focused on enhancing the iPhone experience. Mixing services this way eases the pressure on Apple to keep pace in generative AI while directly challenging Microsoft Copilot's integration of chatbot features into the OS.

Apple's strategy for large language models isn't about chasing benchmarks. Instead, the focus is on improving the product itself, ensuring the LLM serves the product rather than the other way around.

However, by the end of 2024, the new AI-powered Siri will reportedly be available only in English. Users on older devices, or those working in other languages, will be stuck with the older Siri, without LLM support. In the large-language-model race, Apple still trails the likes of OpenAI, Microsoft, and Google, and supporting other languages with the same level of accuracy remains a distant goal.

But in terms of user experience, Apple continues to lead.

From "ChatBot" to "ChatBot+" to "Experience"

Currently, other major players in the LLM market, like OpenAI's ChatGPT, remain at the "ChatBot" level, relying on selling APIs for developers to integrate. Google's Gemini (formerly Bard) and Microsoft's Copilot for PCs sit at the "ChatBot+" stage, integrating more external features, but they are only beginning to explore deep LLM integration. Hardware giants like Samsung and ASUS seem stuck at the "function" stage, shipping isolated AI capabilities such as Circle to Search, real-time translation, AI photo editing, and image search.

Apple Intelligence, by contrast, elevates LLM services to the "experience" level, making AI truly relevant. With that relevance, the range of scenarios and functions users can draw on expands dramatically, approaching the ideal of a true AI assistant. In short, within Apple's powerful ecosystem, no matter how strong other language models become, it will be difficult to beat Apple's advantage in integrating an LLM deeply into users' lives. One can only hope other LLM providers follow suit and integrate just as deeply into our daily routines.
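Much of that deep integration rests on Apple's App Intents framework, which is how third-party apps expose actions that the new Siri can discover, invoke, and chain together. As a rough sketch (the `AppIntent` protocol, `@Parameter`, and `IntentResult` are real App Intents APIs, but this particular intent and its parameter are hypothetical, not taken from Apple's documentation), an app might declare a Siri-invocable action like this:

```swift
import AppIntents

// Hypothetical intent: lets Siri create a note in a third-party notes app.
// The intent name, parameter, and dialog text are illustrative assumptions.
struct CreateNoteIntent: AppIntent {
    static var title: LocalizedStringResource = "Create Note"
    static var description = IntentDescription("Creates a note with the given text.")

    @Parameter(title: "Text")
    var text: String

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // In a real app, this would save the note through the app's model layer.
        return .result(dialog: "Saved your note.")
    }
}
```

Because intents like this carry typed parameters and results, an LLM-backed Siri can, in principle, compose them across apps, which is precisely the kind of "experience"-level integration a standalone chatbot cannot offer.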


