Photos and images constitute the largest chunk of the Web, and many include recognisable features, such as human faces or barcodes/QR codes. Detecting these features is computationally expensive, but would lead to interesting use cases, e.g. face tagging or web URL redirection. While hardware manufacturers have been supporting these features for a long time, Web Apps do not yet have access to these hardware capabilities, which makes the use of computationally demanding libraries necessary.
Text Detection, despite being an interesting field, is not considered stable enough across either computing platforms or character sets to be standardized in the context of this document. For reference, a sister informative specification is kept in [TEXT-DETECTION-API].
1.1. Shape detection use cases
Please see the Readme/Explainer in the repository.
2. Shape Detection API
Individual browsers MAY provide Detectors indicating the availability of hardware providing accelerated operation.
Detecting features in an image occurs asynchronously, potentially communicating with acceleration hardware independent of the browser. Completion events use the shape detection task source.
2.1. Image sources for detection
This section is inspired by HTML Canvas 2D Context § image-sources-for-2d-rendering-contexts.
ImageBitmapSource allows objects implementing any of a number of interfaces to be used as image sources for the detection process.
When an ImageBitmapSource object represents an HTMLImageElement, the element’s image must be used as the source image. Specifically, when an ImageBitmapSource object represents an animated image in an HTMLImageElement, the user agent must use the default image of the animation (the one that the format defines is to be used when animation is not supported or is disabled), or, if there is no such image, the first frame of the animation.
When an ImageBitmapSource object represents an HTMLVideoElement, then the frame at the current playback position when the method with the argument is invoked must be used as the source image when processing the image, and the source image’s dimensions must be the intrinsic dimensions of the media resource (i.e. after any aspect-ratio correction has been applied).
When an ImageBitmapSource object represents an HTMLCanvasElement, the element’s bitmap must be used as the source image.
When the UA is required to use a given type of ImageBitmapSource as input argument for the detect() method of whichever detector, it MUST run these steps:
If any ImageBitmapSource has an effective script origin (origin) which is not the same as the Document’s effective script origin, then reject the Promise with a new DOMException whose name is SecurityError.
If the ImageBitmapSource is an HTMLImageElement object that is in the Broken (HTML Standard §img-error) state, then reject the Promise with a new DOMException whose name is InvalidStateError, and abort any further steps.
If the ImageBitmapSource is an HTMLImageElement object that is not fully decodable, then reject the Promise with a new DOMException whose name is InvalidStateError, and abort any further steps.
If the ImageBitmapSource is an HTMLVideoElement object whose readyState attribute is either HAVE_NOTHING or HAVE_METADATA, then reject the Promise with a new DOMException whose name is InvalidStateError, and abort any further steps.
If the ImageBitmapSource argument is an HTMLCanvasElement whose bitmap’s origin-clean (HTML Standard §concept-canvas-origin-clean) flag is false, then reject the Promise with a new DOMException whose name is SecurityError, and abort any further steps.
Note that if the ImageBitmapSource is an object with either a horizontal dimension or a vertical dimension equal to zero, then the Promise will be simply resolved with an empty sequence of detected objects.
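The checks above can be sketched in plain JavaScript. This is a minimal illustration, not the user agent's algorithm: the helper name `checkImageSource` and the `source` descriptor with its fields (`kind`, `origin`, `broken`, `decodable`, `readyState`, `originClean`, `width`, `height`) are hypothetical stand-ins for state that a real implementation reads from the element itself.

```javascript
// Numeric values of HTMLMediaElement.HAVE_NOTHING and HAVE_METADATA.
const HAVE_NOTHING = 0;
const HAVE_METADATA = 1;

// Returns the DOMException name that the detect() Promise would be rejected
// with, 'empty' when it would resolve with an empty sequence, or null when
// detection may proceed. `source` is a hypothetical plain-object descriptor.
function checkImageSource(source, documentOrigin) {
  if (source.origin !== documentOrigin) return 'SecurityError';
  if (source.kind === 'image' && source.broken) return 'InvalidStateError';
  if (source.kind === 'image' && !source.decodable) return 'InvalidStateError';
  if (source.kind === 'video' &&
      (source.readyState === HAVE_NOTHING ||
       source.readyState === HAVE_METADATA)) {
    return 'InvalidStateError';
  }
  if (source.kind === 'canvas' && !source.originClean) return 'SecurityError';
  // Zero-sized sources are not an error: the result is simply empty.
  if (source.width === 0 || source.height === 0) return 'empty';
  return null;
}
```

Modelling the outcome as a string keeps the sketch independent of any browser API; in a page, the same conditions surface as a rejected or resolved Promise from detect().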
FaceDetector represents an underlying accelerated platform’s component for detection of human faces in images. It can be created with an optional Dictionary of FaceDetectorOptions. It provides a single detect() operation on an ImageBitmapSource whose result is a Promise. This method MUST reject this Promise in the cases detailed in § 2.1 Image sources for detection; otherwise it MAY queue a task that utilizes the OS/Platform resources to resolve the Promise with a sequence of DetectedFaces, each one essentially consisting of and delimited by a boundingBox.
[Exposed=(Window,Worker), SecureContext]
interface FaceDetector {
  constructor(optional FaceDetectorOptions faceDetectorOptions = {});
  Promise<sequence<DetectedFace>> detect(ImageBitmapSource image);
};
FaceDetector(optional FaceDetectorOptions faceDetectorOptions)
Constructs a new FaceDetector with the optional faceDetectorOptions.
Detectors may potentially allocate and hold significant resources. Where possible, reuse the same FaceDetector for several detections.
detect(ImageBitmapSource image)
Tries to detect human faces in the ImageBitmapSource image. The detected faces, if any, are returned as a sequence of DetectedFaces.
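Because detectors may hold significant resources, a page can create one lazily and hand the same instance to every caller. The following is a minimal sketch, assuming a hypothetical helper named `makeDetectorProvider`; the constructor is passed in as a parameter so that the helper also behaves sensibly where `FaceDetector` is undefined.

```javascript
// Returns a function that constructs the detector on first use and then
// keeps returning that same instance. The returned function yields null
// when no detector constructor is available on this platform.
function makeDetectorProvider(DetectorCtor, options = {}) {
  let instance = null;
  return function getDetector() {
    if (typeof DetectorCtor !== 'function') return null;
    if (instance === null) instance = new DetectorCtor(options);
    return instance;
  };
}

// Possible use in a supporting browser (not executed here):
//   const getFaceDetector =
//       makeDetectorProvider(globalThis.FaceDetector, {fastMode: true});
//   const detector = getFaceDetector();
//   if (detector) detector.detect(theImage).then(faces => { /* ... */ });
```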
FaceDetectorOptions
dictionary FaceDetectorOptions {
  unsigned short maxDetectedFaces;
  boolean fastMode;
};
maxDetectedFaces, of type unsigned short
fastMode, of type boolean
DetectedFace
dictionary DetectedFace {
  required DOMRectReadOnly boundingBox;
  required sequence<Landmark>? landmarks;
};
boundingBox, of type DOMRectReadOnly
landmarks, of type sequence<Landmark>, nullable
dictionary Landmark {
  required sequence<Point2D> locations;
  LandmarkType type;
};
locations, of type sequence<Point2D>
type, of type LandmarkType
enum LandmarkType {
  "mouth",
  "eye",
  "nose"
};
mouth
eye
nose
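As an illustration of how these dictionaries fit together, the sketch below reduces each Landmark of a DetectedFace to the mean of its locations, which is one way to place a marker when drawing an overlay. The helper name `landmarkCenters` is hypothetical, and the input is a plain object shaped like the dictionaries above, with Point2D assumed to carry numeric `x` and `y` members.

```javascript
// Computes one {type, x, y} entry per landmark, where (x, y) is the mean of
// the landmark's locations. Landmarks without locations are skipped.
function landmarkCenters(face) {
  const centers = [];
  // landmarks is nullable, so treat null as "no landmarks".
  for (const landmark of face.landmarks ?? []) {
    const count = landmark.locations.length;
    if (count === 0) continue;
    let sumX = 0;
    let sumY = 0;
    for (const point of landmark.locations) {
      sumX += point.x;
      sumY += point.y;
    }
    centers.push({type: landmark.type, x: sumX / count, y: sumY / count});
  }
  return centers;
}
```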
Consider adding attributes such as, e.g.:
[SameObject] readonly attribute unsigned long id;
to DetectedFace.
BarcodeDetector represents an underlying accelerated platform’s component for detection of linear or two-dimensional barcodes in images. It provides a single detect() operation on an ImageBitmapSource whose result is a Promise. This method MUST reject this Promise in the cases detailed in § 2.1 Image sources for detection; otherwise it MAY queue a task using the OS/Platform resources to resolve the Promise with a sequence of DetectedBarcodes, each one essentially consisting of and delimited by a boundingBox and a series of Point2Ds, and possibly a rawValue decoded DOMString.
[Exposed=(Window,Worker), SecureContext]
interface BarcodeDetector {
  constructor(optional BarcodeDetectorOptions barcodeDetectorOptions = {});
  static Promise<sequence<BarcodeFormat>> getSupportedFormats();
  Promise<sequence<DetectedBarcode>> detect(ImageBitmapSource image);
};
BarcodeDetector(optional BarcodeDetectorOptions barcodeDetectorOptions)
Constructs a new BarcodeDetector with barcodeDetectorOptions.
If barcodeDetectorOptions.formats is present and empty, then throw a new TypeError.
If barcodeDetectorOptions.formats is present and contains unknown, then throw a new TypeError.
Detectors may potentially allocate and hold significant resources. Where possible, reuse the same BarcodeDetector for several detections.
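The two constructor checks can be mirrored in script, for instance to validate user-supplied options before constructing a detector. The following is a minimal sketch, assuming a hypothetical helper named `validateBarcodeDetectorOptions`; a conforming constructor performs these checks itself.

```javascript
// Throws a TypeError under the same two conditions as the constructor:
// `formats` present but empty, or `formats` containing "unknown".
function validateBarcodeDetectorOptions(options = {}) {
  const formats = options.formats;
  if (formats === undefined) return; // Not present: nothing to check.
  if (formats.length === 0) {
    throw new TypeError('formats must not be empty');
  }
  if (formats.includes('unknown')) {
    throw new TypeError("formats must not contain 'unknown'");
  }
}
```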
getSupportedFormats()
Returns a new Promise promise and runs the following steps in parallel:
Let supportedFormats be a new Array.
Enumerate the BarcodeFormats that the UA understands as potentially detectable in images. Add these to supportedFormats.
The UA cannot give a definitive answer as to whether a given barcode format will always be recognized on an image due to e.g. positioning of the symbols or encoding errors. If a given barcode symbology is not in the supportedFormats array, however, it should not be detectable whatsoever.
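A page might consult getSupportedFormats() before offering a scanning feature. The sketch below assumes a hypothetical helper named `isFormatSupported`; the detector constructor is a parameter (defaulting to the global `BarcodeDetector`, if any) so that the function can also be exercised with a stub. A true result only means the format is potentially detectable, as noted above.

```javascript
// Resolves to true only if a detector is available and lists `format`
// among its supported formats; resolves to false otherwise.
async function isFormatSupported(format,
                                 DetectorCtor = globalThis.BarcodeDetector) {
  if (DetectorCtor === undefined) return false;
  const supportedFormats = await DetectorCtor.getSupportedFormats();
  return supportedFormats.includes(format);
}
```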
detect(ImageBitmapSource image)
Tries to detect barcodes in the ImageBitmapSource image.
BarcodeDetectorOptions
dictionary BarcodeDetectorOptions {
  sequence<BarcodeFormat> formats;
};
formats, of type sequence<BarcodeFormat>
The BarcodeFormats to search for in the subsequent detect() calls. If not present then the UA SHOULD search for all supported formats.
Limiting the search to a particular subset of supported formats is likely to provide better performance.
DetectedBarcode
dictionary DetectedBarcode {
  required DOMRectReadOnly boundingBox;
  required DOMString rawValue;
  required BarcodeFormat format;
  required sequence<Point2D> cornerPoints;
};
boundingBox, of type DOMRectReadOnly
rawValue, of type DOMString
format, of type BarcodeFormat
The detected BarcodeFormat.
cornerPoints, of type sequence<Point2D>
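For the URL redirection use case, a page would typically filter results by format and check that rawValue parses as a URL before acting on it. The following is a minimal sketch over plain objects shaped like DetectedBarcode; the helper name `extractHttpUrls` and the restriction to QR codes and http(s) schemes are illustrative choices, not part of this specification.

```javascript
// Collects the rawValue of every QR code that parses as an absolute
// http: or https: URL, returned in normalized form.
function extractHttpUrls(detectedBarcodes) {
  const urls = [];
  for (const barcode of detectedBarcodes) {
    if (barcode.format !== 'qr_code') continue;
    let url;
    try {
      url = new URL(barcode.rawValue);
    } catch {
      continue; // rawValue is not an absolute URL.
    }
    if (url.protocol === 'http:' || url.protocol === 'https:') {
      urls.push(url.href);
    }
  }
  return urls;
}
```

Rejecting other schemes up front avoids navigating to, for example, a `javascript:` value embedded in a barcode.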
BarcodeFormat
enum BarcodeFormat {
  "aztec",
  "code_128",
  "code_39",
  "code_93",
  "codabar",
  "data_matrix",
  "ean_13",
  "ean_8",
  "itf",
  "pdf417",
  "qr_code",
  "unknown",
  "upc_a",
  "upc_e"
};
aztec
code_128
code_39
code_93
codabar
data_matrix
ean_13
ean_8
itf
pdf417
qr_code
unknown
upc_a
upc_e
This section is non-normative.
This interface reveals information about the contents of an image source. It is critical for implementations to ensure that it cannot be used to bypass protections that would otherwise protect an image source from inspection. § 2.1 Image sources for detection describes the algorithm to accomplish this.
By providing high-performance shape detection capabilities, this interface allows developers to run image analysis tasks on the local device. This offers a privacy advantage over offloading computation to a remote system. Developers should consider the results returned by this interface to be as privacy sensitive as the original image from which they were derived.
4. Examples
This section is non-normative.
Slightly modified/extended versions of these examples (and more) can be found in e.g. this codepen collection.
4.1. Platform support for a given detector
The following example can also be found in e.g. this codepen with minimal modifications.
if (window.FaceDetector == undefined) {
  console.error('Face Detection not supported on this platform');
}
if (window.BarcodeDetector == undefined) {
  console.error('Barcode Detection not supported on this platform');
}
4.2. Face Detection
The following example can also be found in e.g. this codepen (or this one, with landmarks overlay).
let faceDetector = new FaceDetector({fastMode: true, maxDetectedFaces: 1});
// Assuming |theImage| is e.g. an <img> element, or a Blob.
faceDetector.detect(theImage)
  .then(detectedFaces => {
    for (const face of detectedFaces) {
      // Template literals need backticks for ${...} to be interpolated.
      console.log(
          ` Face @ (${face.boundingBox.x}, ${face.boundingBox.y}),` +
          ` size ${face.boundingBox.width}x${face.boundingBox.height}`);
    }
  }).catch(() => {
    console.error("Face Detection failed, boo.");
  });
4.3. Barcode Detection
The following example can also be found in e.g. this codepen.
let barcodeDetector = new BarcodeDetector();
// Assuming |theImage| is e.g. an <img> element, or a Blob.
barcodeDetector.detect(theImage)
  .then(detectedCodes => {
    for (const barcode of detectedCodes) {
      // Template literals need backticks for ${...} to be interpolated.
      console.log(` Barcode ${barcode.rawValue}` +
          ` @ (${barcode.boundingBox.x}, ${barcode.boundingBox.y}) with size` +
          ` ${barcode.boundingBox.width}x${barcode.boundingBox.height}`);
    }
  }).catch(() => {
    console.error("Barcode Detection failed, boo.");
  });
Conformance
Document conventions
Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.
All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]
Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:
This is an example of an informative example.
Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:
Note, this is an informative note.