Learn how Google developed the state-of-the-art image classification model powering search in Google Photos. Get a crash course on convolutional neural networks, and then build your own image classifier to distinguish cat photos from dog photos.
Estimated Completion Time: 90–120 minutes

Prerequisites

- Machine Learning Crash Course or equivalent experience with ML fundamentals
- Proficiency in programming basics, and some experience coding in Python
Note: The coding exercises in this practicum use the Keras API. Keras is a high-level deep-learning API for configuring neural networks. It is available both as a standalone library and as a module within TensorFlow.
Prior experience with Keras is not required for the Colab exercises, as code listings are heavily commented and explained step by step. Comprehensive API documentation is also available on the Keras site.
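For orientation, here is a minimal sketch of what configuring a small image classifier with the Keras Sequential API looks like. The layer sizes, input shape, and overall architecture below are illustrative assumptions, not the model built in the Colab exercises:

```python
import tensorflow as tf

# Illustrative model: maps 150x150 RGB images to a single
# cat-vs-dog probability. Layer choices here are placeholders.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(150, 150, 3)),
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # binary output
])

# Binary cross-entropy suits a two-class (cat vs. dog) problem.
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
```

The exercises walk through code like this step by step, so the snippet above is only meant to show the general shape of a Keras workflow: stack layers, compile with a loss and optimizer, then train.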
Introduction

In May 2013, Google released search for personal photos, giving users the ability to retrieve photos in their libraries based on the objects present in the images.
Figure 1. Google Photos search for Siamese cats delivers the goods!
The feature, later incorporated into Google Photos in 2015, was widely perceived as a game-changer, a proof of concept that computer vision software could classify images to human standards, adding value in several ways:

- Users no longer needed to tag photo content by hand, a task that becomes tedious for libraries of hundreds or thousands of images.
- Users could explore their photo collections in new ways, searching for objects they had never thought to label.
Image classification is a supervised learning problem: define a set of target classes (objects to identify in images), and train a model to recognize them using labeled example photos. Early computer vision models relied on raw pixel data as the input to the model. However, as shown in Figure 2, raw pixel data alone doesn't provide a sufficiently stable representation to encompass the myriad variations of an object as captured in an image. The position of the object, the background behind it, ambient lighting, camera angle, and camera focus can all produce fluctuations in raw pixel data; these differences are significant enough that they cannot be corrected for by taking weighted averages of pixel RGB values.
Figure 2. Left: Cats can be captured in a photo in a variety of poses, with different backdrops and lighting conditions. Right: Averaging pixel data to account for this variety does not produce any meaningful information.
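The toy sketch below illustrates the point in Figure 2. It uses NumPy with randomly generated stand-in images (not real photos) to mimic the pixel-level variability across a batch of cat photos; averaging the raw pixel values washes out any structure:

```python
import numpy as np

# Stand-in for a batch of 100 cat photos, 150x150 RGB, values in [0, 255].
# Random noise simulates the variation in pose, background, and lighting
# that real photos would exhibit.
rng = np.random.default_rng(0)
cat_photos = rng.integers(0, 256, size=(100, 150, 150, 3), dtype=np.uint8)

# Average the raw pixel values across the batch.
mean_image = cat_photos.mean(axis=0)

# The per-pixel variation is large relative to the mean, so the averaged
# image is essentially a uniform gray blur: no edges, no shape, no "cat".
print(mean_image.mean(), mean_image.std())
```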
To model objects more flexibly, classic computer vision models added new features derived from pixel data, such as color histograms, textures, and shapes. The downside of this approach was that feature engineering became a real burden, as there were so many inputs to tweak. For a cat classifier, which colors were most relevant? How flexible should the shape definitions be? Because features needed to be tuned so precisely, building robust models was quite challenging, and accuracy suffered.
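To make the feature-engineering burden concrete, here is a hedged sketch of one such hand-built feature: a per-channel color histogram. The function name and parameters are illustrative, not taken from the original practicum; every choice (number of bins, channels, normalization) is a knob an engineer would have had to tune by hand:

```python
import numpy as np

def color_histogram(image, bins=8):
    """Hand-engineered feature: a normalized per-channel color histogram.

    image: H x W x 3 array of uint8 RGB values.
    bins:  histogram buckets per channel; one of the many parameters a
           classic pipeline would need to tune manually.
    Returns a 1-D feature vector of length 3 * bins.
    """
    features = []
    for channel in range(3):  # R, G, B
        counts, _ = np.histogram(image[..., channel], bins=bins, range=(0, 256))
        features.append(counts / counts.sum())  # normalize to frequencies
    return np.concatenate(features)

# Example: a random stand-in image yields a 24-dimensional feature vector.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)
print(color_histogram(image).shape)  # (24,)
```

Features like this would then be fed to a conventional classifier, which is exactly where the tuning burden described above comes from: the model can only be as good as the hand-crafted inputs it is given.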