What's new: Google is bringing its Gemini AI model to many of its services. Google Photos is getting a boost called "Ask Photos." This feature allows users to use natural language queries to conduct complex and context-aware searches in their photo library.
Artificial intelligence was undoubtedly the star of the show at Google I/O today. The company announced a slew of AI features, including one for Google Photos called "Ask Photos." Ask Photos allows users to search across their photos and ask questions about them using simple natural language input.
The Gemini-powered feature goes far beyond simply asking for photos of your dog. Ask Photos understands context and answers more complex questions. For instance, ask it for a photo of your child treading water, and it could return one or more matching images. Asking it to show your child learning to swim, however, will return the entire process, from learning to tread water to getting a swimming certificate. Gemini understands the context of learning to swim and pulls related photos.
Ask Photos, a new feature coming to @GooglePhotos, makes it easier to search across your photos and videos with the help of Gemini models. It goes beyond simple search to understand context and answer more complex questions. #GoogleIO pic.twitter.com/OsYXZLo5S1
– Google (@Google) May 14, 2024
Another example demonstrated was finding photos of different vacation spots. Users can ask the AI for all the landmarks in a particular city, or for pictures of the Washington Monument, Lincoln Memorial, and White House from a trip to Washington, D.C., and get appropriate results. It can even find pictures with your license plate number (provided you have a photo). Google CEO Sundar Pichai asked the AI, "What's my license plate number again?" The Photos app successfully returned his license plate number. It did this based on location data and other factors, like how often it found instances of the plate number.
While some people will likely find this feature a little creepy, it does highlight how sophisticated Google's Gemini AI model is. It could help many people find things in the hundreds (or thousands) of images they have stored on Google Photos. The focus on natural language input is also vital as AI models accelerate toward "multi-modal" input, processing text, audio, and video. OpenAI demonstrated this to jaw-dropping effect earlier this week with its GPT-4o (Omni) model.
Given the rise of generative AI models, Google's continued emphasis on AI is unsurprising. The search giant has seemingly added AI to everything. OpenAI's unveiling of its new Omni model shows that the AI wars are only getting more heated. Apple intends to join the fray by unveiling its generative AI efforts at its Worldwide Developers Conference next month.