Refers generically to an instance of AudioDecoder, AudioEncoder, VideoDecoder, or VideoEncoder.
An encoded chunk that does not depend on any other frames for decoding. Also commonly referred to as a "key frame".
Codec outputs such as VideoFrames that currently reside in the internal pipeline of the underlying codec implementation. The underlying codec implementation MAY emit new outputs only when new inputs are provided. The underlying codec implementation MUST emit all outputs in response to a flush.
Resources including CPU memory, GPU memory, and exclusive handles to specific decoding/encoding hardware that MAY be allocated by the User Agent as part of codec configuration or generation of AudioData and VideoFrame objects. Such resources MAY be quickly exhausted and SHOULD be released immediately when no longer in use.
A grouping of EncodedVideoChunks whose timestamp cadence produces a particular framerate. See scalabilityMode.
An image that supports decoding to multiple levels of detail, with lower levels becoming available while the encoded data is not yet fully buffered.
A generational identifier for a given Progressive Image decoded output. Each successive generation adds additional detail to the decoded output. The mechanism for computing a frame’s generation is implementer defined.
An image track that is marked by the given image file as being the default track. The mechanism for indicating a primary track is format defined.
A VideoPixelFormat containing red, green, and blue color channels in any order or layout (interleaved or planar), and irrespective of whether an alpha channel is present.
A VideoColorSpace object, initialized as follows:
[[primaries]] is set to bt709,
[[transfer]] is set to iec61966-2-1,
[[matrix]] is set to rgb,
[[full range]] is set to true
A VideoColorSpace object, initialized as follows:
[[primaries]] is set to smpte432,
[[transfer]] is set to iec61966-2-1,
[[matrix]] is set to rgb,
[[full range]] is set to true
A VideoColorSpace object, initialized as follows:
[[primaries]] is set to bt709,
[[transfer]] is set to bt709,
[[matrix]] is set to bt709,
[[full range]] is set to false
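This example is non-normative. The three VideoColorSpace definitions above correspond, in order, to what is commonly called sRGB, Display P3, and Rec. 709. As a minimal sketch, they can be written as plain dictionaries whose member names follow the VideoColorSpaceInit dictionary (primaries, transfer, matrix, fullRange). The constant names are illustrative assumptions and are not defined by this specification.

```javascript
// Plain dictionaries mirroring the internal slot values listed above.
// The constant names are hypothetical and used only for this illustration.
const SRGB_COLOR_SPACE = Object.freeze({
  primaries: "bt709",
  transfer: "iec61966-2-1",
  matrix: "rgb",
  fullRange: true,
});

const DISPLAY_P3_COLOR_SPACE = Object.freeze({
  primaries: "smpte432",
  transfer: "iec61966-2-1",
  matrix: "rgb",
  fullRange: true,
});

const REC709_COLOR_SPACE = Object.freeze({
  primaries: "bt709",
  transfer: "bt709",
  matrix: "bt709",
  fullRange: false,
});

// In a User Agent that implements WebCodecs, each dictionary could be passed
// to the VideoColorSpace constructor, e.g. new VideoColorSpace(SRGB_COLOR_SPACE).
```

Note that the first two dictionaries differ only in their primaries, while the third uses limited (non-full) range.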
The state of an underlying codec implementation where the number of active decoding or encoding requests has reached an implementation specific maximum such that it is temporarily unable to accept more work. The maximum may be any value greater than 1, including infinity (no maximum). While saturated, additional calls to decode() or encode() will be buffered in the control message queue, and will increment the respective decodeQueueSize and encodeQueueSize attributes. The codec implementation will become unsaturated after making sufficient progress on the current workload.
This section is non-normative.
The codec interfaces defined by the specification are designed such that new codec tasks can be scheduled while previous tasks are still pending. For example, web authors can call decode() without waiting for a previous decode() to complete. This is achieved by offloading underlying codec tasks to a separate parallel queue for parallel execution.
This section describes threading behaviors as they are visible from the perspective of web authors. Implementers can choose to use more threads, as long as the externally visible behaviors of blocking and sequencing are maintained as follows.
2.2. Control Messages
A control message defines a sequence of steps corresponding to a method invocation on a codec instance (e.g. encode()).
A control message queue is a queue of control messages. Each codec instance has a control message queue stored in an internal slot named [[control message queue]].
Queuing a control message means enqueuing the message to a codec’s [[control message queue]]. Invoking codec methods will generally queue a control message to schedule work.
Running a control message means performing a sequence of steps specified by the method that enqueued the message.
The steps of a given control message can block processing later messages in the control message queue. Each codec instance has a boolean internal slot named [[message queue blocked]] that is set to true when this occurs. A blocking message will conclude by setting [[message queue blocked]] to false and rerunning the Process the control message queue steps.
All control messages will return either "processed" or "not processed". Returning "processed" indicates the message steps are being (or have been) executed and the message may be removed from the control message queue. "not processed" indicates the message must not be processed at this time and should remain in the control message queue to be retried later.
To Process the control message queue, run these steps:
While [[message queue blocked]] is false and [[control message queue]] is not empty:
Let front message be the first message in [[control message queue]].
Let outcome be the result of running the control message steps described by front message.
If outcome equals "not processed", break.
Otherwise, dequeue front message from the [[control message queue]].
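This example is non-normative. The steps above can be modelled with a small synchronous sketch. CodecModel, its member names, and the saturated flag are hypothetical stand-ins for the internal slots; a real User Agent runs the codec work on a parallel queue, which this sketch does not attempt to reproduce.

```javascript
// Minimal synchronous model of a codec's control message queue.
// CodecModel and its members are hypothetical names used for illustration.
class CodecModel {
  constructor() {
    this.controlMessageQueue = [];    // models [[control message queue]]
    this.messageQueueBlocked = false; // models [[message queue blocked]]
  }

  // A control message is modelled as a function that returns
  // "processed" or "not processed".
  queueControlMessage(steps) {
    this.controlMessageQueue.push(steps);
  }

  processControlMessageQueue() {
    while (!this.messageQueueBlocked && this.controlMessageQueue.length > 0) {
      const frontMessage = this.controlMessageQueue[0];
      const outcome = frontMessage();
      if (outcome === "not processed") break; // stays queued, retried later
      this.controlMessageQueue.shift();
    }
  }
}

const codec = new CodecModel();
const log = [];
let saturated = true; // stand-in for [[codec saturated]]

// A blocking message: it sets the blocked flag until its work completes.
codec.queueControlMessage(() => {
  codec.messageQueueBlocked = true;
  log.push("configure");
  return "processed";
});
// A message that cannot run while the codec is saturated.
codec.queueControlMessage(() => {
  if (saturated) return "not processed";
  log.push("decode");
  return "processed";
});

codec.processControlMessageQueue();   // runs configure, then stops: queue is blocked
const queuedWhileBlocked = codec.controlMessageQueue.length;

codec.messageQueueBlocked = false;    // the blocking message concludes
codec.processControlMessageQueue();   // decode returns "not processed"
const queuedWhileSaturated = codec.controlMessageQueue.length;

saturated = false;
codec.processControlMessageQueue();   // decode now runs and is dequeued
```

The sketch keeps the decode message at the front of the queue for as long as it reports "not processed", which preserves the ordering of later messages.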
Each codec instance has an internal slot named [[codec work queue]] that is a parallel queue.
Each codec instance has an internal slot named [[codec implementation]] that refers to the underlying platform encoder or decoder. Except for the initial assignment, any steps that reference [[codec implementation]] will be enqueued to the [[codec work queue]].
Each codec instance has a unique codec task source. Tasks queued from the [[codec work queue]] to the event loop will use the codec task source.
3. AudioDecoder Interface
[Exposed=(Window,DedicatedWorker), SecureContext]
interface AudioDecoder : EventTarget {
  constructor(AudioDecoderInit init);

  readonly attribute CodecState state;
  readonly attribute unsigned long decodeQueueSize;
  attribute EventHandler ondequeue;

  undefined configure(AudioDecoderConfig config);
  undefined decode(EncodedAudioChunk chunk);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<AudioDecoderSupport> isConfigSupported(AudioDecoderConfig config);
};

dictionary AudioDecoderInit {
  required AudioDataOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback AudioDataOutputCallback = undefined(AudioData output);
3.1. Internal Slots
[[control message queue]]
A queue of control messages to be performed upon this codec instance. See [[control message queue]].
[[message queue blocked]]
A boolean indicating when processing the [[control message queue]] is blocked by a pending control message. See [[message queue blocked]].
[[codec implementation]]
Underlying decoder implementation provided by the User Agent. See [[codec implementation]].
[[codec work queue]]
A parallel queue used for running parallel steps that reference the [[codec implementation]]. See [[codec work queue]].
[[codec saturated]]
A boolean indicating when the [[codec implementation]] is unable to accept additional decoding work.
[[output callback]]
Callback given at construction for decoded outputs.
[[error callback]]
Callback given at construction for decode errors.
[[key chunk required]]
A boolean indicating that the next chunk passed to decode() MUST describe a key chunk as indicated by [[type]].
[[state]]
The current CodecState of this AudioDecoder.
[[decodeQueueSize]]
The number of pending decode requests. This number will decrease as the underlying codec is ready to accept new input.
[[pending flush promises]]
A list of unresolved promises returned by calls to flush().
[[dequeue event scheduled]]
A boolean indicating whether a dequeue event is already scheduled to fire. Used to avoid event spam.
AudioDecoder(init)
Let d be a new AudioDecoder object.
Assign a new queue to [[control message queue]].
Assign false to [[message queue blocked]].
Assign null to [[codec implementation]].
Assign the result of starting a new parallel queue to [[codec work queue]].
Assign false to [[codec saturated]].
Assign init.output to [[output callback]].
Assign init.error to [[error callback]].
Assign true to [[key chunk required]].
Assign "unconfigured" to [[state]].
Assign 0 to [[decodeQueueSize]].
Assign a new list to [[pending flush promises]].
Assign false to [[dequeue event scheduled]].
Return d.
state
, of type CodecState, readonly
Returns the value of [[state]]
.
decodeQueueSize
, of type unsigned long, readonly
Returns the value of [[decodeQueueSize]]
.
ondequeue
, of type EventHandler
An event handler IDL attribute whose event handler event type is dequeue
.
dequeue
Fired at the AudioDecoder
when the decodeQueueSize
has decreased.
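This example is non-normative. One way an author might use decodeQueueSize together with the dequeue event is to submit work only while the queue is short and to resume when the event fires. The sketch below assumes a decoder-like object exposing decodeQueueSize, decode(), and ondequeue; createPump, maxQueueSize, and fakeDecoder are hypothetical names, and fakeDecoder is a stand-in so the sketch runs outside a User Agent.

```javascript
// Feeds chunks to a decoder-like object while keeping its queue short.
// createPump and maxQueueSize are hypothetical; `decoder` is any object
// exposing decodeQueueSize, decode() and ondequeue, as AudioDecoder does.
function createPump(decoder, chunks, maxQueueSize) {
  let next = 0;
  function pump() {
    while (next < chunks.length && decoder.decodeQueueSize < maxQueueSize) {
      decoder.decode(chunks[next++]);
    }
  }
  decoder.ondequeue = pump; // resume whenever the queue shrinks
  return pump;
}

// Stand-in decoder, not the real AudioDecoder.
const fakeDecoder = {
  decodeQueueSize: 0,
  ondequeue: null,
  decoded: [],
  decode(chunk) {
    this.decodeQueueSize++;
    this.decoded.push(chunk);
  },
  // Simulates the codec accepting one queued input and firing dequeue.
  drainOne() {
    if (this.decodeQueueSize > 0) {
      this.decodeQueueSize--;
      if (this.ondequeue) this.ondequeue();
    }
  },
};

const pump = createPump(fakeDecoder, ["c0", "c1", "c2", "c3", "c4"], 2);
pump();                                   // submits c0 and c1, then stops at the limit
const submittedInitially = fakeDecoder.decoded.length;
fakeDecoder.drainOne();                   // queue drops to 1, so the pump submits c2
const submittedAfterDrain = fakeDecoder.decoded.length;
```

The limit of 2 is arbitrary; a suitable value depends on the application and on the codec implementation.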
configure(config)
NOTE: This method will trigger a NotSupportedError if the User Agent does not support config. Authors are encouraged to first check support by calling isConfigSupported() with config. User Agents don’t have to support any particular codec type or configuration.
When invoked, run these steps:
If config is not a valid AudioDecoderConfig, throw a TypeError
.
If [[state]] is "closed", throw an InvalidStateError.
Set [[state]]
to "configured"
.
Set [[key chunk required]]
to true
.
Queue a control message to configure the decoder with config.
Running a control message to configure the decoder means running these steps:
Assign true
to [[message queue blocked]]
.
Enqueue the following steps to [[codec work queue]]
:
Let supported be the result of running the Check Configuration Support algorithm with config.
If supported is false
, queue a task to run the Close AudioDecoder algorithm with NotSupportedError
and abort these steps.
If needed, assign [[codec implementation]]
with an implementation supporting config.
Configure [[codec implementation]]
with config.
Queue a task to run the following steps:
Assign false
to [[message queue blocked]]
.
Return "processed"
.
decode(chunk)
When invoked, run these steps:
If [[state]]
is not "configured"
, throw an InvalidStateError
.
If [[key chunk required]]
is true
:
Implementers SHOULD inspect the chunk’s [[internal data]]
to verify that it is truly a key chunk. If a mismatch is detected, throw a DataError
.
Otherwise, assign false
to [[key chunk required]]
.
Increment [[decodeQueueSize]]
.
Queue a control message to decode the chunk.
Running a control message to decode the chunk means performing these steps:
If [[codec saturated]]
equals true
, return "not processed"
.
If decoding chunk will cause the [[codec implementation]]
to become saturated, assign true
to [[codec saturated]]
.
Decrement [[decodeQueueSize]]
and run the Schedule Dequeue Event algorithm.
Enqueue the following steps to the [[codec work queue]]
:
Attempt to use [[codec implementation]]
to decode the chunk.
If decoding results in an error, queue a task to run the Close AudioDecoder algorithm with EncodingError
and return.
If [[codec saturated]]
equals true
and [[codec implementation]]
is no longer saturated, queue a task to perform the following steps:
Assign false
to [[codec saturated]]
.
Let decoded outputs be a list of decoded audio data outputs emitted by [[codec implementation]]
.
If decoded outputs is not empty, queue a task to run the Output AudioData algorithm with decoded outputs.
Return "processed"
.
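This example is non-normative. The synchronous part of decode() can be sketched as below, assuming that a chunk whose type is not "key" is rejected with a DataError while [[key chunk required]] is true. DecoderModel and codecError are hypothetical names, and plain Error objects with a name property stand in for the DOMException a User Agent would throw. Bitstream inspection and the queued control message are omitted.

```javascript
// Creates an Error whose name mimics the DOMException name used by the steps.
function codecError(name) {
  const error = new Error(name);
  error.name = name;
  return error;
}

// Hypothetical model of the checks decode() performs before queuing work.
class DecoderModel {
  constructor() {
    this.state = "configured";    // models [[state]]
    this.keyChunkRequired = true; // models [[key chunk required]]
    this.decodeQueueSize = 0;     // models [[decodeQueueSize]]
  }

  decode(chunk) {
    if (this.state !== "configured") throw codecError("InvalidStateError");
    if (this.keyChunkRequired) {
      // Assumption: a non-key chunk is a mismatch and raises DataError.
      if (chunk.type !== "key") throw codecError("DataError");
      this.keyChunkRequired = false;
    }
    this.decodeQueueSize++;
    // A real decoder would now queue a control message to decode the chunk.
  }
}

const decoderModel = new DecoderModel();
let firstErrorName = null;
try {
  decoderModel.decode({ type: "delta" }); // rejected: a key chunk is required first
} catch (error) {
  firstErrorName = error.name;
}
decoderModel.decode({ type: "key" });     // accepted, clears the requirement
decoderModel.decode({ type: "delta" });   // accepted
```

Because configure() and flush() set [[key chunk required]] back to true, the same rejection applies to the first chunk submitted after either call.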
flush()
When invoked, run these steps:
If [[state]]
is not "configured"
, return a promise rejected with InvalidStateError
DOMException
.
Set [[key chunk required]]
to true
.
Let promise be a new Promise.
Append promise to [[pending flush promises]]
.
Queue a control message to flush the codec with promise.
Return promise.
Running a control message to flush the codec means performing these steps with promise.
Enqueue the following steps to the [[codec work queue]]
:
Signal [[codec implementation]]
to emit all internal pending outputs.
Let decoded outputs be a list of decoded audio data outputs emitted by [[codec implementation]]
.
Queue a task to perform these steps:
If decoded outputs is not empty, run the Output AudioData algorithm with decoded outputs.
Remove promise from [[pending flush promises]]
.
Resolve promise.
Return "processed"
.
reset()
When invoked, run the Reset AudioDecoder algorithm with an AbortError
DOMException
.
close()
When invoked, run the Close AudioDecoder algorithm with an AbortError
DOMException
.
isConfigSupported(config)
NOTE: The returned AudioDecoderSupport config will contain only the dictionary members that the User Agent recognized. Unrecognized dictionary members will be ignored. Authors can detect unrecognized dictionary members by comparing config to their provided config.
When invoked, run these steps:
If config is not a valid AudioDecoderConfig, return a promise rejected with TypeError
.
Let p be a new Promise.
Let checkSupportQueue be the result of starting a new parallel queue.
Enqueue the following steps to checkSupportQueue:
Let supported be the result of running the Check Configuration Support algorithm with config.
Queue a task to run the following steps:
Let decoderSupport be a newly constructed AudioDecoderSupport
, initialized as follows:
Set config
to the result of running the Clone Configuration algorithm with config.
Set supported
to supported.
Resolve p with decoderSupport.
Return p.
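This example is non-normative. The comparison suggested in the note above might look like the following sketch. findUnrecognizedMembers is a hypothetical helper, futureOption is an invented member used only to show the effect, and the returned dictionary is a hand-written stand-in for the config member of a resolved AudioDecoderSupport. Only top-level members are compared.

```javascript
// Returns the top-level members of `provided` that are absent from `returned`.
// findUnrecognizedMembers is a hypothetical helper, not part of WebCodecs.
function findUnrecognizedMembers(provided, returned) {
  return Object.keys(provided).filter((key) => !(key in returned));
}

const providedConfig = {
  codec: "opus",
  sampleRate: 48000,
  numberOfChannels: 2,
  futureOption: true, // invented member, assumed to be unknown to the User Agent
};

// Stand-in for support.config, where support would come from
// `await AudioDecoder.isConfigSupported(providedConfig)` in a User Agent.
const returnedConfig = {
  codec: "opus",
  sampleRate: 48000,
  numberOfChannels: 2,
};

const unrecognized = findUnrecognizedMembers(providedConfig, returnedConfig);
```

An author could treat a non-empty result as a signal that a requested option will have no effect, even when supported is true.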
If [[dequeue event scheduled]]
equals true
, return.
Assign true
to [[dequeue event scheduled]]
.
Queue a task to run the following steps:
Assign false
to [[dequeue event scheduled]]
.
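This example is non-normative. The effect of [[dequeue event scheduled]] is that several queue-size decreases occurring before the queued task runs produce a single dequeue event. The sketch below models this with an explicit task list; DequeueModel, runTasks, and eventsFired are hypothetical names, and the point at which the event is counted is an assumption about where the dequeue event would be fired.

```javascript
// Models coalescing of dequeue events with an explicit task list.
// DequeueModel and its members are hypothetical names for illustration.
class DequeueModel {
  constructor() {
    this.dequeueEventScheduled = false; // models [[dequeue event scheduled]]
    this.tasks = [];                    // stand-in for the event loop's task queue
    this.eventsFired = 0;
  }

  scheduleDequeueEvent() {
    if (this.dequeueEventScheduled) return; // an event is already pending
    this.dequeueEventScheduled = true;
    this.tasks.push(() => {
      this.eventsFired++;                   // assumption: dequeue is fired here
      this.dequeueEventScheduled = false;
    });
  }

  runTasks() {
    while (this.tasks.length > 0) {
      const task = this.tasks.shift();
      task();
    }
  }
}

const dequeueModel = new DequeueModel();
dequeueModel.scheduleDequeueEvent();
dequeueModel.scheduleDequeueEvent();
dequeueModel.scheduleDequeueEvent();
dequeueModel.runTasks();
const firedAfterBurst = dequeueModel.eventsFired;  // three requests, one event

dequeueModel.scheduleDequeueEvent();
dequeueModel.runTasks();
const firedAfterSecond = dequeueModel.eventsFired; // a later request fires again
```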
For each output in outputs:
Let data be an AudioData
, initialized as follows:
Assign false
to [[Detached]]
.
Let resource be the media resource described by output.
Let resourceReference be a reference to resource.
Assign resourceReference to [[resource reference]]
.
Let timestamp be the [[timestamp]]
of the EncodedAudioChunk
associated with output.
Assign timestamp to [[timestamp]]
.
If output uses a recognized AudioSampleFormat
, assign that format to [[format]]
. Otherwise, assign null
to [[format]]
.
Assign values to [[sample rate]]
, [[number of frames]]
, and [[number of channels]]
as determined by output.
Invoke [[output callback]]
with data.
If [[state]]
is "closed"
, throw an InvalidStateError
.
Set [[state]]
to "unconfigured"
.
Signal [[codec implementation]]
to cease producing output for the previous configuration.
Remove all control messages from the [[control message queue]]
.
If [[decodeQueueSize]]
is greater than zero:
Set [[decodeQueueSize]]
to zero.
Run the Schedule Dequeue Event algorithm.
For each promise in [[pending flush promises]]
:
Reject promise with exception.
Remove promise from [[pending flush promises]]
.
Run the Reset AudioDecoder algorithm with exception.
Set [[state]]
to "closed"
.
Clear [[codec implementation]]
and release associated system resources.
If exception is not an AbortError
DOMException
, invoke the [[error callback]]
with exception.
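This example is non-normative. The reset and close steps above can be sketched as follows. ResetModel, namedError, and the shape of the pending flush records are hypothetical; plain objects with a reject function stand in for promises, and the codec implementation itself is not modelled.

```javascript
// Creates an Error whose name mimics a DOMException name.
function namedError(name) {
  const error = new Error(name);
  error.name = name;
  return error;
}

// Hypothetical model of the reset and close steps.
class ResetModel {
  constructor(errorCallback) {
    this.state = "configured";                      // models [[state]]
    this.controlMessageQueue = [() => "processed"]; // models [[control message queue]]
    this.decodeQueueSize = 3;                       // models [[decodeQueueSize]]
    this.pendingFlushPromises = [];                 // models [[pending flush promises]]
    this.errorCallback = errorCallback;             // models [[error callback]]
    this.dequeueEventsScheduled = 0;
  }

  reset(exception) {
    if (this.state === "closed") throw namedError("InvalidStateError");
    this.state = "unconfigured";
    this.controlMessageQueue.length = 0;
    if (this.decodeQueueSize > 0) {
      this.decodeQueueSize = 0;
      this.dequeueEventsScheduled++; // stands in for Schedule Dequeue Event
    }
    // splice(0) empties the list and returns the removed entries.
    for (const pending of this.pendingFlushPromises.splice(0)) {
      pending.reject(exception);
    }
  }

  close(exception) {
    this.reset(exception);
    this.state = "closed";
    if (exception.name !== "AbortError") this.errorCallback(exception);
  }
}

const rejectedWith = [];
const reportedErrors = [];

const closedByAuthor = new ResetModel((e) => reportedErrors.push(e.name));
closedByAuthor.pendingFlushPromises.push({ reject: (e) => rejectedWith.push(e.name) });
closedByAuthor.close(namedError("AbortError"));    // error callback is not invoked

const closedByFailure = new ResetModel((e) => reportedErrors.push(e.name));
closedByFailure.close(namedError("EncodingError")); // error callback is invoked
```

In both cases pending flush promises are rejected with the exception passed to the algorithm; only a non-AbortError exception reaches the error callback.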
[Exposed=(Window,DedicatedWorker), SecureContext]
interface VideoDecoder : EventTarget {
  constructor(VideoDecoderInit init);

  readonly attribute CodecState state;
  readonly attribute unsigned long decodeQueueSize;
  attribute EventHandler ondequeue;

  undefined configure(VideoDecoderConfig config);
  undefined decode(EncodedVideoChunk chunk);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<VideoDecoderSupport> isConfigSupported(VideoDecoderConfig config);
};

dictionary VideoDecoderInit {
  required VideoFrameOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback VideoFrameOutputCallback = undefined(VideoFrame output);
4.1. Internal Slots
[[control message queue]]
A queue of control messages to be performed upon this codec instance. See [[control message queue]].
[[message queue blocked]]
A boolean indicating when processing the [[control message queue]]
is blocked by a pending control message. See [[message queue blocked]].
[[codec implementation]]
Underlying decoder implementation provided by the User Agent. See [[codec implementation]].
[[codec work queue]]
A parallel queue used for running parallel steps that reference the [[codec implementation]]
. See [[codec work queue]].
[[codec saturated]]
A boolean indicating when the [[codec implementation]]
is unable to accept additional decoding work.
[[output callback]]
Callback given at construction for decoded outputs.
[[error callback]]
Callback given at construction for decode errors.
[[active decoder config]]
The VideoDecoderConfig
that is actively applied.
[[key chunk required]]
A boolean indicating that the next chunk passed to decode()
MUST describe a key chunk as indicated by type
.
[[state]]
The current CodecState
of this VideoDecoder
.
[[decodeQueueSize]]
The number of pending decode requests. This number will decrease as the underlying codec is ready to accept new input.
[[pending flush promises]]
A list of unresolved promises returned by calls to flush()
.
[[dequeue event scheduled]]
A boolean indicating whether a dequeue
event is already scheduled to fire. Used to avoid event spam.
VideoDecoder(init)
Let d be a new VideoDecoder
object.
Assign a new queue to [[control message queue]]
.
Assign false
to [[message queue blocked]]
.
Assign null
to [[codec implementation]]
.
Assign the result of starting a new parallel queue to [[codec work queue]]
.
Assign false
to [[codec saturated]]
.
Assign init.output to [[output callback]]
.
Assign init.error to [[error callback]]
.
Assign null
to [[active decoder config]]
.
Assign true
to [[key chunk required]]
.
Assign "unconfigured" to [[state]].
Assign 0
to [[decodeQueueSize]]
.
Assign a new list to [[pending flush promises]]
.
Assign false
to [[dequeue event scheduled]]
.
Return d.
state
, of type CodecState, readonly
Returns the value of [[state]]
.
decodeQueueSize
, of type unsigned long, readonly
Returns the value of [[decodeQueueSize]]
.
ondequeue
, of type EventHandler
An event handler IDL attribute whose event handler event type is dequeue
.
dequeue
Fired at the VideoDecoder
when the decodeQueueSize
has decreased.
configure(config)
NOTE: This method will trigger a NotSupportedError
if the User Agent does not support config. Authors are encouraged to first check support by calling isConfigSupported()
with config. User Agents don’t have to support any particular codec type or configuration.
When invoked, run these steps:
If config is not a valid VideoDecoderConfig, throw a TypeError
.
If [[state]] is "closed", throw an InvalidStateError.
Set [[state]]
to "configured"
.
Set [[key chunk required]]
to true
.
Queue a control message to configure the decoder with config.
Running a control message to configure the decoder means running these steps:
Assign true
to [[message queue blocked]]
.
Enqueue the following steps to [[codec work queue]]
:
Let supported be the result of running the Check Configuration Support algorithm with config.
If supported is false
, queue a task to run the Close VideoDecoder algorithm with NotSupportedError
and abort these steps.
If needed, assign [[codec implementation]]
with an implementation supporting config.
Configure [[codec implementation]]
with config.
Queue a task to run the following steps:
Assign false
to [[message queue blocked]]
.
Return "processed"
.
decode(chunk)
NOTE: Authors are encouraged to call close() on output VideoFrames immediately when frames are no longer needed. The underlying media resources are owned by the VideoDecoder and failing to release them (or waiting for garbage collection) can cause decoding to stall.
NOTE: VideoDecoder requires that frames are output in the order they are expected to be presented, commonly known as presentation order. When using some [[codec implementation]]s the User Agent will have to reorder outputs into presentation order.
When invoked, run these steps:
If [[state]]
is not "configured"
, throw an InvalidStateError
.
If [[key chunk required]]
is true
:
Implementers SHOULD inspect the chunk’s [[internal data]]
to verify that it is truly a key chunk. If a mismatch is detected, throw a DataError
.
Otherwise, assign false
to [[key chunk required]]
.
Increment [[decodeQueueSize]]
.
Queue a control message to decode the chunk.
Running a control message to decode the chunk means performing these steps:
If [[codec saturated]]
equals true
, return "not processed"
.
If decoding chunk will cause the [[codec implementation]]
to become saturated, assign true
to [[codec saturated]]
.
Decrement [[decodeQueueSize]]
and run the Schedule Dequeue Event algorithm.
Enqueue the following steps to the [[codec work queue]]
:
Attempt to use [[codec implementation]]
to decode the chunk.
If decoding results in an error, queue a task to run the Close VideoDecoder algorithm with EncodingError
and return.
If [[codec saturated]]
equals true
and [[codec implementation]]
is no longer saturated, queue a task to perform the following steps:
Assign false
to [[codec saturated]]
.
Let decoded outputs be a list of decoded video data outputs emitted by [[codec implementation]]
in presentation order.
If decoded outputs is not empty, queue a task to run the Output VideoFrame algorithm with decoded outputs.
Return "processed"
.
flush()
When invoked, run these steps:
If [[state]]
is not "configured"
, return a promise rejected with InvalidStateError
DOMException
.
Set [[key chunk required]]
to true
.
Let promise be a new Promise.
Append promise to [[pending flush promises]]
.
Queue a control message to flush the codec with promise.
Return promise.
Running a control message to flush the codec means performing these steps with promise.
Enqueue the following steps to the [[codec work queue]]
:
Signal [[codec implementation]]
to emit all internal pending outputs.
Let decoded outputs be a list of decoded video data outputs emitted by [[codec implementation]]
.
Queue a task to perform these steps:
If decoded outputs is not empty, run the Output VideoFrame algorithm with decoded outputs.
Remove promise from [[pending flush promises]]
.
Resolve promise.
Return "processed"
.
reset()
When invoked, run the Reset VideoDecoder algorithm with an AbortError
DOMException
.
close()
When invoked, run the Close VideoDecoder algorithm with an AbortError
DOMException
.
isConfigSupported(config)
NOTE: The returned VideoDecoderSupport config will contain only the dictionary members that the User Agent recognized. Unrecognized dictionary members will be ignored. Authors can detect unrecognized dictionary members by comparing config to their provided config.
When invoked, run these steps:
If config is not a valid VideoDecoderConfig, return a promise rejected with TypeError
.
Let p be a new Promise.
Let checkSupportQueue be the result of starting a new parallel queue.
Enqueue the following steps to checkSupportQueue:
Let supported be the result of running the Check Configuration Support algorithm with config.
Queue a task to run the following steps:
Let decoderSupport be a newly constructed VideoDecoderSupport
, initialized as follows:
Set config
to the result of running the Clone Configuration algorithm with config.
Set supported
to supported.
Resolve p with decoderSupport.
Return p.
If [[dequeue event scheduled]]
equals true
, return.
Assign true
to [[dequeue event scheduled]]
.
Queue a task to run the following steps:
Assign false
to [[dequeue event scheduled]]
.
For each output in outputs:
Let timestamp and duration be the timestamp
and duration
from the EncodedVideoChunk
associated with output.
Let displayAspectWidth and displayAspectHeight be undefined.
If displayAspectWidth
and displayAspectHeight
exist in the [[active decoder config]]
, assign their values to displayAspectWidth and displayAspectHeight respectively.
Let colorSpace be the VideoColorSpace
for output as detected by the codec implementation. If no VideoColorSpace
is detected, let colorSpace be undefined
.
NOTE: The codec implementation can detect a VideoColorSpace
by analyzing the bitstream. Detection is made on a best-effort basis. The exact method of detection is implementer defined and codec-specific. Authors can override the detected VideoColorSpace
by providing a colorSpace
in the VideoDecoderConfig
.
If colorSpace
exists in the [[active decoder config]]
, assign its value to colorSpace.
Assign the values of rotation
and flip
to rotation and flip respectively.
Let frame be the result of running the Create a VideoFrame algorithm with output, timestamp, duration, displayAspectWidth, displayAspectHeight, colorSpace, rotation, and flip.
Invoke [[output callback]]
with frame.
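This example is non-normative. The way values from [[active decoder config]] take precedence over values detected by the codec implementation can be sketched as below. resolveFrameParams and the shapes of its output and activeDecoderConfig arguments are hypothetical; rotation, flip, and the Create a VideoFrame algorithm itself are omitted.

```javascript
// Resolves the values handed to "Create a VideoFrame" for one decoded output.
// resolveFrameParams and the argument shapes are hypothetical.
function resolveFrameParams(output, activeDecoderConfig) {
  let displayAspectWidth;
  let displayAspectHeight;
  if (activeDecoderConfig.displayAspectWidth !== undefined &&
      activeDecoderConfig.displayAspectHeight !== undefined) {
    displayAspectWidth = activeDecoderConfig.displayAspectWidth;
    displayAspectHeight = activeDecoderConfig.displayAspectHeight;
  }

  // Start from the color space detected in the bitstream, if any.
  let colorSpace = output.detectedColorSpace;
  // A color space given in the config overrides the detected one.
  if (activeDecoderConfig.colorSpace !== undefined) {
    colorSpace = activeDecoderConfig.colorSpace;
  }

  return {
    timestamp: output.chunk.timestamp,
    duration: output.chunk.duration,
    displayAspectWidth,
    displayAspectHeight,
    colorSpace,
  };
}

const decodedOutput = {
  chunk: { timestamp: 33333, duration: 33333 },
  detectedColorSpace: { primaries: "bt709" },
};

const withOverride = resolveFrameParams(decodedOutput, {
  codec: "vp8",
  displayAspectWidth: 16,
  displayAspectHeight: 9,
  colorSpace: { primaries: "smpte432" },
});

const withoutOverride = resolveFrameParams(decodedOutput, { codec: "vp8" });
```

With no colorSpace in the config, the detected value is kept; if nothing is detected either, colorSpace stays undefined.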
If state
is "closed"
, throw an InvalidStateError
.
Set state
to "unconfigured"
.
Signal [[codec implementation]]
to cease producing output for the previous configuration.
Remove all control messages from the [[control message queue]]
.
If [[decodeQueueSize]]
is greater than zero:
Set [[decodeQueueSize]]
to zero.
Run the Schedule Dequeue Event algorithm.
For each promise in [[pending flush promises]]
:
Reject promise with exception.
Remove promise from [[pending flush promises]]
.
Run the Reset VideoDecoder algorithm with exception.
Set state
to "closed"
.
Clear [[codec implementation]]
and release associated system resources.
If exception is not an AbortError
DOMException
, invoke the [[error callback]]
with exception.
[Exposed=(Window,DedicatedWorker), SecureContext]
interface AudioEncoder : EventTarget {
  constructor(AudioEncoderInit init);

  readonly attribute CodecState state;
  readonly attribute unsigned long encodeQueueSize;
  attribute EventHandler ondequeue;

  undefined configure(AudioEncoderConfig config);
  undefined encode(AudioData data);
  Promise<undefined> flush();
  undefined reset();
  undefined close();

  static Promise<AudioEncoderSupport> isConfigSupported(AudioEncoderConfig config);
};

dictionary AudioEncoderInit {
  required EncodedAudioChunkOutputCallback output;
  required WebCodecsErrorCallback error;
};

callback EncodedAudioChunkOutputCallback = undefined (EncodedAudioChunk output, optional EncodedAudioChunkMetadata metadata = {});
5.1. Internal Slots
[[control message queue]]
A queue of control messages to be performed upon this codec instance. See [[control message queue]].
[[message queue blocked]]
A boolean indicating when processing the [[control message queue]]
is blocked by a pending control message. See [[message queue blocked]].
[[codec implementation]]
Underlying encoder implementation provided by the User Agent. See [[codec implementation]].
[[codec work queue]]
A parallel queue used for running parallel steps that reference the [[codec implementation]]
. See [[codec work queue]].
[[codec saturated]]
A boolean indicating when the [[codec implementation]]
is unable to accept additional encoding work.
[[output callback]]
Callback given at construction for encoded outputs.
[[error callback]]
Callback given at construction for encode errors.
[[active encoder config]]
The AudioEncoderConfig
that is actively applied.
[[active output config]]
The AudioDecoderConfig
that describes how to decode the most recently emitted EncodedAudioChunk
.
[[state]]
The current CodecState
of this AudioEncoder
.
[[encodeQueueSize]]
The number of pending encode requests. This number will decrease as the underlying codec is ready to accept new input.
[[pending flush promises]]
A list of unresolved promises returned by calls to flush()
.
[[dequeue event scheduled]]
A boolean indicating whether a dequeue
event is already scheduled to fire. Used to avoid event spam.
AudioEncoder(init)
Let e be a new AudioEncoder
object.
Assign a new queue to [[control message queue]]
.
Assign false
to [[message queue blocked]]
.
Assign null
to [[codec implementation]]
.
Assign the result of starting a new parallel queue to [[codec work queue]]
.
Assign false
to [[codec saturated]]
.
Assign init.output to [[output callback]]
.
Assign init.error to [[error callback]]
.
Assign null
to [[active encoder config]]
.
Assign null
to [[active output config]]
.
Assign "unconfigured" to [[state]].
Assign 0
to [[encodeQueueSize]]
.
Assign a new list to [[pending flush promises]]
.
Assign false
to [[dequeue event scheduled]]
.
Return e.
state
, of type CodecState, readonly
Returns the value of [[state]]
.
encodeQueueSize
, of type unsigned long, readonly
Returns the value of [[encodeQueueSize]]
.
ondequeue
, of type EventHandler
An event handler IDL attribute whose event handler event type is dequeue
.
dequeue
Fired at the AudioEncoder
when the encodeQueueSize
has decreased.
configure(config)
NOTE: This method will trigger a NotSupportedError
if the User Agent does not support config. Authors are encouraged to first check support by calling isConfigSupported()
with config. User Agents don’t have to support any particular codec type or configuration.
When invoked, run these steps:
If config is not a valid AudioEncoderConfig, throw a TypeError
.
If [[state]]
is "closed"
, throw an InvalidStateError
.
Set [[state]]
to "configured"
.
Queue a control message to configure the encoder using config.
Running a control message to configure the encoder means performing these steps:
Assign true
to [[message queue blocked]]
.
Enqueue the following steps to [[codec work queue]]
:
Let supported be the result of running the Check Configuration Support algorithm with config.
If supported is false
, queue a task to run the Close AudioEncoder algorithm with NotSupportedError
and abort these steps.
If needed, assign [[codec implementation]]
with an implementation supporting config.
Configure [[codec implementation]]
with config.
Queue a task to run the following steps:
Assign false
to [[message queue blocked]]
.
Return "processed"
.
encode(data)
When invoked, run these steps:
If the value of data’s [[Detached]]
internal slot is true
, throw a TypeError
.
If [[state]]
is not "configured"
, throw an InvalidStateError
.
Let dataClone hold the result of running the Clone AudioData algorithm with data.
Increment [[encodeQueueSize]]
.
Queue a control message to encode dataClone.
Running a control message to encode the data means performing these steps:
If [[codec saturated]]
equals true
, return "not processed"
.
If encoding data will cause the [[codec implementation]]
to become saturated, assign true
to [[codec saturated]]
.
Decrement [[encodeQueueSize]]
and run the Schedule Dequeue Event algorithm.
Enqueue the following steps to the [[codec work queue]]
:
Attempt to use [[codec implementation]]
to encode the media resource described by dataClone.
If encoding results in an error, queue a task to run the Close AudioEncoder algorithm with EncodingError
and return.
If [[codec saturated]]
equals true
and [[codec implementation]]
is no longer saturated, queue a task to perform the following steps:
Assign false
to [[codec saturated]]
.
Let encoded outputs be a list of encoded audio data outputs emitted by [[codec implementation]]
.
If encoded outputs is not empty, queue a task to run the Output EncodedAudioChunks algorithm with encoded outputs.
Return "processed"
.
flush()
When invoked, run these steps:
If [[state]]
is not "configured"
, return a promise rejected with InvalidStateError
DOMException
.
Let promise be a new Promise.
Append promise to [[pending flush promises]]
.
Queue a control message to flush the codec with promise.
Return promise.
Running a control message to flush the codec means performing these steps with promise.
Enqueue the following steps to the [[codec work queue]]
:
Signal [[codec implementation]]
to emit all internal pending outputs.
Let encoded outputs be a list of encoded audio data outputs emitted by [[codec implementation]]
.
Queue a task to perform these steps:
If encoded outputs is not empty, run the Output EncodedAudioChunks algorithm with encoded outputs.
Remove promise from [[pending flush promises]]
.
Resolve promise.
Return "processed"
.
reset()
When invoked, run the Reset AudioEncoder algorithm with an AbortError
DOMException
.
close()
When invoked, run the Close AudioEncoder algorithm with an AbortError
DOMException
.
isConfigSupported(config)
NOTE: The returned AudioEncoderSupport config will contain only the dictionary members that the User Agent recognized. Unrecognized dictionary members will be ignored. Authors can detect unrecognized dictionary members by comparing config to their provided config.
When invoked, run these steps:
If config is not a valid AudioEncoderConfig, return a promise rejected with TypeError
.
Let p be a new Promise.
Let checkSupportQueue be the result of starting a new parallel queue.
Enqueue the following steps to checkSupportQueue:
Let supported be the result of running the Check Configuration Support algorithm with config.
Queue a task to run the following steps:
Let encoderSupport be a newly constructed AudioEncoderSupport
, initialized as follows:
Set config
to the result of running the Clone Configuration algorithm with config.
Set supported
to supported.
Resolve p with encoderSupport.
Return p.
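The NOTE on isConfigSupported() says authors can detect unrecognized dictionary members by comparing the returned config with the one they provided. The following non-normative sketch shows one way to do that comparison on plain objects. The helper name droppedMembers and both sample objects are assumptions made for illustration; the returned object is hand-written and is not the output of any User Agent.

```typescript
// Hypothetical helper (not part of WebCodecs): lists the keys that were
// present in the provided config but absent from the config echoed back
// in the support result.
type PlainConfig = Record<string, unknown>;

function droppedMembers(provided: PlainConfig, returned: PlainConfig): string[] {
  return Object.keys(provided).filter((key) => !(key in returned));
}

// Hand-written stand-ins for illustration only.
const providedConfig: PlainConfig = {
  codec: "opus",
  sampleRate: 48000,
  numberOfChannels: 2,
  futureOption: true, // assumed to be unknown to the User Agent
};
const returnedConfig: PlainConfig = {
  codec: "opus",
  sampleRate: 48000,
  numberOfChannels: 2,
};

const dropped = droppedMembers(providedConfig, returnedConfig);
```

Members that appear only in the returned config, such as applied defaults, are deliberately not reported by this helper.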
If [[dequeue event scheduled]]
equals true
, return.
Assign true
to [[dequeue event scheduled]]
.
Queue a task to run the following steps:
Fire a simple event named dequeue
at this.
Assign false
to [[dequeue event scheduled]]
.
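The scheduling steps above coalesce repeated requests into a single queued task. The following non-normative sketch models that behavior with an explicit task list standing in for the event loop. The class DequeueSchedulerModel and its counter are hypothetical, and the sketch assumes the queued task is also where the dequeue event is fired.

```typescript
// Model of the Schedule Dequeue Event steps with an explicit task list.
class DequeueSchedulerModel {
  private dequeueEventScheduled = false;
  private readonly tasks: Array<() => void> = [];
  firedCount = 0; // stands in for "dequeue" events actually fired

  schedule(): void {
    if (this.dequeueEventScheduled) return; // already scheduled: coalesce
    this.dequeueEventScheduled = true;
    this.tasks.push(() => {
      this.firedCount += 1; // assumed: the event is fired in this task
      this.dequeueEventScheduled = false;
    });
  }

  // Stand-in for the event loop draining queued tasks.
  runTasks(): void {
    while (this.tasks.length > 0) {
      const task = this.tasks.shift();
      if (task) task();
    }
  }
}

const dequeueModel = new DequeueSchedulerModel();
dequeueModel.schedule();
dequeueModel.schedule();
dequeueModel.schedule();
dequeueModel.runTasks();
const firedAfterBurst = dequeueModel.firedCount;

dequeueModel.schedule();
dequeueModel.runTasks();
const firedAfterSecondRound = dequeueModel.firedCount;
```

Three requests made before the task runs produce one event; a request made after the task has run schedules a new one.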
For each output in outputs:
Let chunkInit be an EncodedAudioChunkInit
with the following keys:
Let chunk be a new EncodedAudioChunk
constructed with chunkInit.
Let chunkMetadata be a new EncodedAudioChunkMetadata
.
Let encoderConfig be the [[active encoder config]]
.
Let outputConfig be a new AudioDecoderConfig
that describes output. Initialize outputConfig as follows:
Assign encoderConfig.sampleRate
to outputConfig.sampleRate
.
Assign encoderConfig.numberOfChannels
to outputConfig.numberOfChannels
.
Assign outputConfig.description
with a sequence of codec specific bytes as determined by the [[codec implementation]]
. The User Agent MUST ensure that the provided description could be used to correctly decode output.
NOTE: The codec specific requirements for populating the description
are described in the [WEBCODECS-CODEC-REGISTRY].
If outputConfig and [[active output config]]
are not equal dictionaries:
Assign outputConfig to chunkMetadata.decoderConfig
.
Assign outputConfig to [[active output config]]
.
Invoke [[output callback]]
with chunk and chunkMetadata.
If [[state]]
is "closed"
, throw an InvalidStateError
.
Set [[state]]
to "unconfigured"
.
Set [[active encoder config]]
to null
.
Set [[active output config]]
to null
.
Signal [[codec implementation]]
to cease producing output for the previous configuration.
Remove all control messages from the [[control message queue]]
.
If [[encodeQueueSize]]
is greater than zero:
Set [[encodeQueueSize]]
to zero.
Run the Schedule Dequeue Event algorithm.
For each promise in [[pending flush promises]]
:
Reject promise with exception.
Remove promise from [[pending flush promises]]
.
Run the Reset AudioEncoder algorithm with exception.
Set [[state]]
to "closed"
.
Clear [[codec implementation]]
and release associated system resources.
If exception is not an AbortError
DOMException
, invoke the [[error callback]]
with exception.
EncodedAudioChunkOutputCallback
alongside an associated EncodedAudioChunk
.
dictionary EncodedAudioChunkMetadata
{
AudioDecoderConfig decoderConfig;
};
decoderConfig
, of type AudioDecoderConfig
An AudioDecoderConfig
that authors MAY use to decode the associated EncodedAudioChunk
.
[Exposed=(Window,DedicatedWorker), SecureContext]
interface VideoEncoder : EventTarget
{
constructor(VideoEncoderInit init);
readonly attribute CodecState state;
readonly attribute unsigned long encodeQueueSize;
attribute EventHandler ondequeue;
undefined configure(VideoEncoderConfig config);
undefined encode(VideoFrame frame, optional VideoEncoderEncodeOptions options = {});
Promise<undefined> flush();
undefined reset();
undefined close();
static Promise<VideoEncoderSupport> isConfigSupported(VideoEncoderConfig config);
};
dictionary VideoEncoderInit
{
required EncodedVideoChunkOutputCallback output;
required WebCodecsErrorCallback error;
};
callback EncodedVideoChunkOutputCallback = undefined (EncodedVideoChunk chunk, optional EncodedVideoChunkMetadata metadata = {});
6.1. Internal Slots
[[control message queue]]
A queue of control messages to be performed upon this codec instance. See [[control message queue]].
[[message queue blocked]]
A boolean indicating when processing the [[control message queue]]
is blocked by a pending control message. See [[message queue blocked]].
[[codec implementation]]
Underlying encoder implementation provided by the User Agent. See [[codec implementation]].
[[codec work queue]]
A parallel queue used for running parallel steps that reference the [[codec implementation]]
. See [[codec work queue]].
[[codec saturated]]
A boolean indicating when the [[codec implementation]]
is unable to accept additional encoding work.
[[output callback]]
Callback given at construction for encoded outputs.
[[error callback]]
Callback given at construction for encode errors.
[[active encoder config]]
The VideoEncoderConfig
that is actively applied.
[[active output config]]
The VideoDecoderConfig
that describes how to decode the most recently emitted EncodedVideoChunk
.
[[state]]
The current CodecState
of this VideoEncoder
.
[[encodeQueueSize]]
The number of pending encode requests. This number will decrease as the underlying codec is ready to accept new input.
[[pending flush promises]]
A list of unresolved promises returned by calls to flush()
.
[[dequeue event scheduled]]
A boolean indicating whether a dequeue
event is already scheduled to fire. Used to avoid event spam.
[[active orientation]]
An integer and boolean pair indicating the [[rotation]]
and [[flip]]
of the first VideoFrame
given to encode()
after configure()
.
VideoEncoder(init)
Let e be a new VideoEncoder
object.
Assign a new queue to [[control message queue]]
.
Assign false
to [[message queue blocked]]
.
Assign null
to [[codec implementation]]
.
Assign the result of starting a new parallel queue to [[codec work queue]]
.
Assign false
to [[codec saturated]]
.
Assign init.output to [[output callback]]
.
Assign init.error to [[error callback]]
.
Assign null
to [[active encoder config]]
.
Assign null
to [[active output config]]
.
Assign "unconfigured"
to [[state]]
.
Assign 0
to [[encodeQueueSize]]
.
Assign a new list to [[pending flush promises]]
.
Assign false
to [[dequeue event scheduled]]
.
Return e.
state
, of type CodecState, readonly
Returns the value of [[state]]
.
encodeQueueSize
, of type unsigned long, readonly
Returns the value of [[encodeQueueSize]]
.
ondequeue
, of type EventHandler
An event handler IDL attribute whose event handler event type is dequeue
.
dequeue
Fired at the VideoEncoder
when the encodeQueueSize
has decreased.
configure(config)
NOTE: This method will trigger a NotSupportedError
if the User Agent does not support config. Authors are encouraged to first check support by calling isConfigSupported()
with config. User Agents don’t have to support any particular codec type or configuration.
When invoked, run these steps:
If config is not a valid VideoEncoderConfig, throw a TypeError
.
If [[state]]
is "closed"
, throw an InvalidStateError
.
Set [[state]]
to "configured"
.
Set [[active orientation]]
to null
.
Queue a control message to configure the encoder using config.
Running a control message to configure the encoder means performing these steps:
Assign true
to [[message queue blocked]]
.
Enqueue the following steps to [[codec work queue]]
:
Let supported be the result of running the Check Configuration Support algorithm with config.
If supported is false
, queue a task to run the Close VideoEncoder algorithm with NotSupportedError
and abort these steps.
If needed, assign [[codec implementation]]
with an implementation supporting config.
Configure [[codec implementation]]
with config.
Queue a task to run the following steps:
Assign false
to [[message queue blocked]]
.
Return "processed"
.
encode(frame, options)
When invoked, run these steps:
If the value of frame’s [[Detached]]
internal slot is true
, throw a TypeError
.
If [[state]]
is not "configured"
, throw an InvalidStateError
.
If [[active orientation]]
is not null
and does not match frame’s [[rotation]]
and [[flip]]
, throw a DataError
.
If [[active orientation]]
is null
, set it to frame’s [[rotation]]
and [[flip]]
.
Let frameClone hold the result of running the Clone VideoFrame algorithm with frame.
Increment [[encodeQueueSize]]
.
Queue a control message to encode frameClone.
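The orientation steps of encode() can be read as a small guard that latches the first frame's orientation after configure() and rejects later frames that differ. The following non-normative sketch models only that part. The names FrameOrientation and OrientationGuard are hypothetical, and a plain Error stands in for the DataError.

```typescript
interface FrameOrientation {
  rotation: number;
  flip: boolean;
}

class OrientationGuard {
  private active: FrameOrientation | null = null;

  // Mirrors configure(): the active orientation is cleared.
  configure(): void {
    this.active = null;
  }

  // Mirrors the orientation steps of encode(); throws on mismatch.
  accept(frame: FrameOrientation): void {
    if (
      this.active !== null &&
      (this.active.rotation !== frame.rotation || this.active.flip !== frame.flip)
    ) {
      throw new Error("DataError: frame orientation differs from the active orientation");
    }
    if (this.active === null) {
      this.active = { rotation: frame.rotation, flip: frame.flip };
    }
  }
}

const guard = new OrientationGuard();
guard.configure();
guard.accept({ rotation: 90, flip: false }); // latches the orientation
guard.accept({ rotation: 90, flip: false }); // same orientation: accepted

let mismatchRejected = false;
try {
  guard.accept({ rotation: 0, flip: false }); // differs: rejected
} catch (_err) {
  mismatchRejected = true;
}

guard.configure(); // reconfiguring clears the latch
let acceptedAfterReconfigure = true;
try {
  guard.accept({ rotation: 0, flip: false });
} catch (_err) {
  acceptedAfterReconfigure = false;
}
```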
Running a control message to encode the frame means performing these steps:
If [[codec saturated]]
equals true
, return "not processed"
.
If encoding frame will cause the [[codec implementation]]
to become saturated, assign true
to [[codec saturated]]
.
Decrement [[encodeQueueSize]]
and run the Schedule Dequeue Event algorithm.
Enqueue the following steps to the [[codec work queue]]
:
Attempt to use [[codec implementation]]
to encode the frameClone according to options.
If encoding results in an error, queue a task to run the Close VideoEncoder algorithm with EncodingError
and return.
If [[codec saturated]]
equals true
and [[codec implementation]]
is no longer saturated, queue a task to perform the following steps:
Assign false
to [[codec saturated]]
.
Let encoded outputs be a list of encoded video data outputs emitted by [[codec implementation]]
.
If encoded outputs is not empty, queue a task to run the Output EncodedVideoChunks algorithm with encoded outputs.
Return "processed"
.
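The interaction between [[codec saturated]], [[encodeQueueSize]], and the "not processed" result can be hard to follow from the steps alone. The following non-normative sketch models it with a codec that accepts a fixed number of frames at once. The capacity parameter, the completeOne() method, and the class name are assumptions; a real codec implementation reports saturation in its own way.

```typescript
type MessageOutcome = "processed" | "not processed";

class EncodeQueueModel {
  encodeQueueSize = 0;
  codecSaturated = false;
  private inFlight = 0;
  private readonly pending: string[] = [];
  private readonly capacity: number; // hypothetical codec capacity

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Mirrors encode(): increment the queue size and queue a control message.
  encode(frameId: string): void {
    this.encodeQueueSize += 1;
    this.pending.push(frameId);
    this.processQueue();
  }

  // Runs the front control message; it stays queued on "not processed".
  private runEncodeMessage(): MessageOutcome {
    if (this.codecSaturated) return "not processed";
    this.inFlight += 1;
    if (this.inFlight >= this.capacity) this.codecSaturated = true;
    this.encodeQueueSize -= 1;
    return "processed";
  }

  private processQueue(): void {
    while (this.pending.length > 0) {
      if (this.runEncodeMessage() === "not processed") return;
      this.pending.shift();
    }
  }

  // Stand-in for the codec finishing one frame and freeing capacity.
  completeOne(): void {
    if (this.inFlight === 0) return;
    this.inFlight -= 1;
    if (this.codecSaturated && this.inFlight < this.capacity) {
      this.codecSaturated = false;
      this.processQueue();
    }
  }
}

const queueModel = new EncodeQueueModel(2);
["a", "b", "c", "d"].forEach((id) => queueModel.encode(id));
const sizeWhileSaturated = queueModel.encodeQueueSize;
const saturatedAfterFour = queueModel.codecSaturated;

queueModel.completeOne();
const sizeAfterOneCompletes = queueModel.encodeQueueSize;
```

With a capacity of two, the third and fourth requests remain counted in encodeQueueSize until the codec frees a slot.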
flush()
When invoked, run these steps:
If [[state]]
is not "configured"
, return a promise rejected with InvalidStateError
DOMException
.
Let promise be a new Promise.
Append promise to [[pending flush promises]]
.
Queue a control message to flush the codec with promise.
Return promise.
Running a control message to flush the codec means performing these steps with promise:
Enqueue the following steps to the [[codec work queue]]
:
Signal [[codec implementation]]
to emit all internal pending outputs.
Let encoded outputs be a list of encoded video data outputs emitted by [[codec implementation]]
.
Queue a task to perform these steps:
If encoded outputs is not empty, run the Output EncodedVideoChunks algorithm with encoded outputs.
Remove promise from [[pending flush promises]]
.
Resolve promise.
Return "processed"
.
reset()
When invoked, run the Reset VideoEncoder algorithm with an AbortError
DOMException
.
close()
When invoked, run the Close VideoEncoder algorithm with an AbortError
DOMException
.
isConfigSupported(config)
NOTE: The returned VideoEncoderSupport
config
will contain only the dictionary members that the User Agent recognized. Unrecognized dictionary members will be ignored. Authors can detect unrecognized dictionary members by comparing config
to their provided config.
When invoked, run these steps:
If config is not a valid VideoEncoderConfig, return a promise rejected with TypeError
.
Let p be a new Promise.
Let checkSupportQueue be the result of starting a new parallel queue.
Enqueue the following steps to checkSupportQueue:
Let supported be the result of running the Check Configuration Support algorithm with config.
Queue a task to run the following steps:
Let encoderSupport be a newly constructed VideoEncoderSupport
, initialized as follows:
Set config
to the result of running the Clone Configuration algorithm with config.
Set supported
to supported.
Resolve p with encoderSupport.
Return p.
If [[dequeue event scheduled]]
equals true
, return.
Assign true
to [[dequeue event scheduled]]
.
Queue a task to run the following steps:
Fire a simple event named dequeue
at this.
Assign false
to [[dequeue event scheduled]]
.
For each output in outputs:
Let chunkInit be an EncodedVideoChunkInit
with the following keys:
Let data
contain the encoded video data from output.
Let type
be the EncodedVideoChunkType
of output.
Let timestamp
be the [[timestamp]]
from the VideoFrame
associated with output.
Let duration
be the [[duration]]
from the VideoFrame
associated with output.
Let chunk be a new EncodedVideoChunk
constructed with chunkInit.
Let chunkMetadata be a new EncodedVideoChunkMetadata
.
Let encoderConfig be the [[active encoder config]]
.
Let outputConfig be a VideoDecoderConfig
that describes output. Initialize outputConfig as follows:
Assign encoderConfig.codec
to outputConfig.codec
.
Assign encoderConfig.width
to outputConfig.codedWidth
.
Assign encoderConfig.height
to outputConfig.codedHeight
.
Assign encoderConfig.displayWidth
to outputConfig.displayAspectWidth
.
Assign encoderConfig.displayHeight
to outputConfig.displayAspectHeight
.
Assign [[rotation]]
from the VideoFrame
associated with output to outputConfig.rotation
.
Assign [[flip]]
from the VideoFrame
associated with output to outputConfig.flip
.
Assign the remaining keys of outputConfig
as determined by [[codec implementation]]
. The User Agent MUST ensure that the configuration is completely described such that outputConfig could be used to correctly decode output.
NOTE: The codec specific requirements for populating the description
are described in the [WEBCODECS-CODEC-REGISTRY].
If outputConfig and [[active output config]]
are not equal dictionaries:
Assign outputConfig to chunkMetadata.decoderConfig
.
Assign outputConfig to [[active output config]]
.
If encoderConfig.scalabilityMode
describes multiple temporal layers:
Let svc be a new SvcOutputMetadata
instance.
Let temporal_layer_id be the zero-based index describing the temporal layer for output.
Assign temporal_layer_id to svc.temporalLayerId
.
Assign svc to chunkMetadata.svc
.
If encoderConfig.alpha
is set to "keep"
:
Let alphaSideData be the encoded alpha data in output.
Assign alphaSideData to chunkMetadata.alphaSideData
.
Invoke [[output callback]]
with chunk and chunkMetadata.
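A consequence of the steps above is that chunkMetadata.decoderConfig is populated only when the output configuration changes, so most chunks carry no decoder config. The following non-normative sketch models that change detection for a subset of keys. The type and class names are hypothetical, and the sketch omits rotation, flip, description, and any keys chosen by the codec implementation.

```typescript
interface EncoderConfigLike {
  codec: string;
  width: number;
  height: number;
  displayWidth?: number;
  displayHeight?: number;
}

interface DecoderConfigLike {
  codec: string;
  codedWidth: number;
  codedHeight: number;
  displayAspectWidth: number | undefined;
  displayAspectHeight: number | undefined;
}

interface ChunkMetadataLike {
  decoderConfig?: DecoderConfigLike;
}

// Key mapping taken from the steps above (subset only).
function toOutputConfig(enc: EncoderConfigLike): DecoderConfigLike {
  return {
    codec: enc.codec,
    codedWidth: enc.width,
    codedHeight: enc.height,
    displayAspectWidth: enc.displayWidth,
    displayAspectHeight: enc.displayHeight,
  };
}

function sameConfig(a: DecoderConfigLike, b: DecoderConfigLike): boolean {
  return (
    a.codec === b.codec &&
    a.codedWidth === b.codedWidth &&
    a.codedHeight === b.codedHeight &&
    a.displayAspectWidth === b.displayAspectWidth &&
    a.displayAspectHeight === b.displayAspectHeight
  );
}

class OutputConfigTracker {
  private activeOutputConfig: DecoderConfigLike | null = null;

  metadataFor(enc: EncoderConfigLike): ChunkMetadataLike {
    const outputConfig = toOutputConfig(enc);
    const chunkMetadata: ChunkMetadataLike = {};
    if (
      this.activeOutputConfig === null ||
      !sameConfig(outputConfig, this.activeOutputConfig)
    ) {
      chunkMetadata.decoderConfig = outputConfig;
      this.activeOutputConfig = outputConfig;
    }
    return chunkMetadata;
  }
}

const tracker = new OutputConfigTracker();
const hd: EncoderConfigLike = { codec: "vp8", width: 1280, height: 720 };
const firstMeta = tracker.metadataFor(hd);
const secondMeta = tracker.metadataFor(hd);
const thirdMeta = tracker.metadataFor({ ...hd, width: 640, height: 360 });
```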
If [[state]]
is "closed"
, throw an InvalidStateError
.
Set [[state]]
to "unconfigured"
.
Set [[active encoder config]]
to null
.
Set [[active output config]]
to null
.
Signal [[codec implementation]]
to cease producing output for the previous configuration.
Remove all control messages from the [[control message queue]]
.
If [[encodeQueueSize]]
is greater than zero:
Set [[encodeQueueSize]]
to zero.
Run the Schedule Dequeue Event algorithm.
For each promise in [[pending flush promises]]
:
Reject promise with exception.
Remove promise from [[pending flush promises]]
.
Run the Reset VideoEncoder algorithm with exception.
Set [[state]]
to "closed"
.
Clear [[codec implementation]]
and release associated system resources.
If exception is not an AbortError
DOMException
, invoke the [[error callback]]
with exception.
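The reset and close steps above differ mainly in the final state and in whether the error callback runs. The following non-normative sketch models both as a small state machine. The class EncoderLifecycleModel is hypothetical, exceptions are represented by their names as strings, and the codec implementation and control message queue are left out.

```typescript
type CodecStateName = "unconfigured" | "configured" | "closed";

class EncoderLifecycleModel {
  state: CodecStateName = "configured";
  encodeQueueSize = 0;
  readonly rejectedWith: string[] = []; // one entry per rejected flush promise
  readonly reportedErrors: string[] = []; // error callback invocations
  private pendingFlushes = 0;

  addPendingFlush(): void {
    this.pendingFlushes += 1;
  }

  reset(exceptionName: string): void {
    if (this.state === "closed") throw new Error("InvalidStateError");
    this.state = "unconfigured";
    this.encodeQueueSize = 0;
    while (this.pendingFlushes > 0) {
      this.rejectedWith.push(exceptionName);
      this.pendingFlushes -= 1;
    }
  }

  close(exceptionName: string): void {
    this.reset(exceptionName);
    this.state = "closed";
    // AbortError signals an author-initiated close, so no error is reported.
    if (exceptionName !== "AbortError") this.reportedErrors.push(exceptionName);
  }
}

// Author-initiated close(): flush promises are rejected, no error callback.
const abortedModel = new EncoderLifecycleModel();
abortedModel.addPendingFlush();
abortedModel.addPendingFlush();
abortedModel.close("AbortError");

// Close caused by an encoding failure: the error callback is invoked.
const failedModel = new EncoderLifecycleModel();
failedModel.close("EncodingError");

let resetAfterCloseThrows = false;
try {
  failedModel.reset("AbortError");
} catch (_err) {
  resetAfterCloseThrows = true;
}
```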
EncodedVideoChunkOutputCallback
alongside an associated EncodedVideoChunk
.
dictionary EncodedVideoChunkMetadata
{
VideoDecoderConfig decoderConfig;
SvcOutputMetadata svc;
BufferSource alphaSideData;
};
dictionary SvcOutputMetadata
{
unsigned long temporalLayerId;
};
decoderConfig
, of type VideoDecoderConfig
A VideoDecoderConfig
that authors MAY use to decode the associated EncodedVideoChunk
.
svc
, of type SvcOutputMetadata
A collection of metadata describing this EncodedVideoChunk
with respect to the configured scalabilityMode
.
alphaSideData
, of type BufferSource
A BufferSource
that contains the EncodedVideoChunk
’s extra alpha channel data.
temporalLayerId
, of type unsigned long
A number that identifies the temporal layer for the associated EncodedVideoChunk
.
If the codec string in config.codec is not a valid codec string or is otherwise unrecognized by the User Agent, return false
.
If config is an AudioDecoderConfig
or VideoDecoderConfig
and the User Agent can’t provide a codec that can decode the exact profile (where present), level (where present), and constraint bits (where present) indicated by the codec string in config.codec, return false
.
If config is an AudioEncoderConfig
or VideoEncoderConfig
:
If the codec string in config.codec contains a profile and the User Agent can’t provide a codec that can encode the exact profile indicated by config.codec, return false
.
If the codec string in config.codec contains a level and the User Agent can’t provide a codec that can encode to a level less than or equal to the level indicated by config.codec, return false
.
If the codec string in config.codec contains constraint bits and the User Agent can’t provide a codec that can produce an encoded bitstream at least as constrained as indicated by config.codec, return false
.
If the User Agent can provide a codec to support all entries of the config, including applicable default values for keys that are not included, return true
.
NOTE: The types AudioDecoderConfig
, VideoDecoderConfig
, AudioEncoderConfig
, and VideoEncoderConfig
each define their respective configuration entries and defaults.
NOTE: Support for a given configuration can change dynamically if the hardware is altered (e.g. external GPU unplugged) or if essential hardware resources are exhausted. User Agents describe support on a best-effort basis given the resources that are available at the time of the query.
Otherwise, return false.
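The encoder rules above require an exact profile match but allow the encoder to target a level at or below the requested one. The following non-normative sketch illustrates only that pair of rules. The EncoderTarget shape, the numeric level encoding, and the producible list are assumptions; actual codec strings and capability discovery are codec specific.

```typescript
// Hypothetical description of one profile/level an encoder can produce.
interface EncoderTarget {
  profile: string;
  level: number; // assumed numeric encoding, e.g. 31 for "level 3.1"
}

// True when some producible target has the exact requested profile and a
// level less than or equal to the requested level.
function encoderCanSatisfy(
  requested: EncoderTarget,
  producible: EncoderTarget[],
): boolean {
  return producible.some(
    (target) =>
      target.profile === requested.profile && target.level <= requested.level,
  );
}

const producibleTargets: EncoderTarget[] = [{ profile: "main", level: 31 }];

const higherLevelRequested = encoderCanSatisfy({ profile: "main", level: 40 }, producibleTargets);
const lowerLevelRequested = encoderCanSatisfy({ profile: "main", level: 30 }, producibleTargets);
const otherProfileRequested = encoderCanSatisfy({ profile: "high", level: 40 }, producibleTargets);
```

A request for level 40 is satisfiable by an encoder that produces level 31, while a request for level 30 is not, because the output would exceed the requested level.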
NOTE: This algorithm will copy only the dictionary members that the User Agent recognizes as part of the dictionary type.
Run these steps:
Let dictType be the type of dictionary config.
Let clone be a new empty instance of dictType.
For each dictionary member m defined on dictType:
If config[m]
is a nested dictionary, set clone[m]
to the result of recursively running the Clone Configuration algorithm with config[m]
.
Otherwise, assign a copy of config[m]
to clone[m]
.
NOTE: This implements a deep copy. These configuration objects are frequently used as the input of asynchronous operations. Copying means that modifying the original object while the operation is in flight won’t change the operation’s outcome.
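The cloning steps above can be sketched as a recursive copy driven by the dictionary type. In this non-normative sketch the dictionary type is represented by a hypothetical DictSchema object that names the recognized members; members missing from the schema are dropped, and nested dictionaries are cloned recursively. Copying of buffers is reduced to a Uint8Array case for brevity.

```typescript
// Hypothetical schema: member name mapped to "value" or a nested schema.
interface DictSchema {
  [member: string]: "value" | DictSchema;
}
type Dict = { [member: string]: unknown };

function copyValue(value: unknown): unknown {
  // Byte arrays are copied; primitives are returned as-is.
  return value instanceof Uint8Array ? value.slice() : value;
}

function cloneConfiguration(config: Dict, dictType: DictSchema): Dict {
  const clone: Dict = {};
  for (const member of Object.keys(dictType)) {
    const kind = dictType[member];
    if (kind === undefined || !(member in config)) continue;
    const value = config[member];
    if (kind !== "value" && typeof value === "object" && value !== null) {
      clone[member] = cloneConfiguration(value as Dict, kind); // nested dictionary
    } else {
      clone[member] = copyValue(value);
    }
  }
  return clone;
}

const decoderSchema: DictSchema = {
  codec: "value",
  description: "value",
  colorSpace: { primaries: "value", fullRange: "value" },
};

const originalColorSpace = { primaries: "bt709", fullRange: true };
const originalConfig: Dict = {
  codec: "vp8",
  colorSpace: originalColorSpace,
  bogus: 1, // not in the schema, so it is not copied
};

const clonedConfig = cloneConfiguration(originalConfig, decoderSchema);
originalColorSpace.primaries = "smpte432"; // later mutation of the original
const clonedPrimaries = (clonedConfig["colorSpace"] as Dict)["primaries"];
```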
7.3. Signalling Configuration Support
7.3.1. AudioDecoderSupport
dictionary AudioDecoderSupport
{
boolean supported;
AudioDecoderConfig config;
};
supported
, of type boolean
config
is supported by the User Agent.
config
, of type AudioDecoderConfig
AudioDecoderConfig
used by the User Agent in determining the value of supported
.
dictionary VideoDecoderSupport
{
boolean supported;
VideoDecoderConfig config;
};
supported
, of type boolean
config
is supported by the User Agent.
config
, of type VideoDecoderConfig
VideoDecoderConfig
used by the User Agent in determining the value of supported
.
dictionary AudioEncoderSupport
{
boolean supported;
AudioEncoderConfig config;
};
supported
, of type boolean
config
is supported by the User Agent.
config
, of type AudioEncoderConfig
AudioEncoderConfig
used by the User Agent in determining the value of supported
.
dictionary VideoEncoderSupport
{
boolean supported;
VideoEncoderConfig config;
};
supported
, of type boolean
config
is supported by the User Agent.
config
, of type VideoEncoderConfig
VideoEncoderConfig
used by the User Agent in determining the value of supported
.
A valid codec string MUST meet the following conditions.
It is valid per the relevant codec specification (see examples below).
It describes a single codec.
It is unambiguous about codec profile, level, and constraint bits for codecs that define these concepts.
NOTE: In other media specifications, codec strings historically accompanied a MIME type as the "codecs=" parameter (isTypeSupported()
, canPlayType()
) [RFC6381]. In this specification, encoded media is not containerized; hence, only the value of the codecs parameter is accepted.
NOTE: Encoders for codecs that define level and constraint bits have flexibility around these parameters, but won’t produce bitstreams that have a higher level or are less constrained than requested.
The format and semantics for codec strings are defined by codec registrations listed in the [WEBCODECS-CODEC-REGISTRY]. A compliant implementation MAY support any combination of codec registrations or none at all.
7.5. AudioDecoderConfig
dictionary AudioDecoderConfig
{
required DOMString codec;
[EnforceRange] required unsigned long sampleRate;
[EnforceRange] required unsigned long numberOfChannels;
AllowSharedBufferSource description;
};
To check if an AudioDecoderConfig
is a valid AudioDecoderConfig, run these steps:
If codec
is empty after stripping leading and trailing ASCII whitespace, return false
.
If description
is [detached], return false.
Return true
.
codec
, of type DOMString
sampleRate
, of type unsigned long
numberOfChannels
, of type unsigned long
description
, of type AllowSharedBufferSource
NOTE: The registrations in the [WEBCODECS-CODEC-REGISTRY] describe whether/how to populate this sequence, corresponding to the provided codec
.
dictionary VideoDecoderConfig
{
required DOMString codec;
AllowSharedBufferSource description;
[EnforceRange] unsigned long codedWidth;
[EnforceRange] unsigned long codedHeight;
[EnforceRange] unsigned long displayAspectWidth;
[EnforceRange] unsigned long displayAspectHeight;
VideoColorSpaceInit colorSpace;
HardwareAcceleration hardwareAcceleration = "no-preference";
boolean optimizeForLatency;
double rotation = 0;
boolean flip = false;
};
To check if a VideoDecoderConfig
is a valid VideoDecoderConfig, run these steps:
If codec
is empty after stripping leading and trailing ASCII whitespace, return false
.
If one of codedWidth
or codedHeight
is provided but the other isn’t, return false
.
If codedWidth
= 0 or codedHeight
= 0, return false
.
If one of displayAspectWidth
or displayAspectHeight
is provided but the other isn’t, return false
.
If displayAspectWidth
= 0 or displayAspectHeight
= 0, return false
.
If description
is [detached], return false.
Return true
.
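The validity steps above translate directly into a predicate. The following non-normative sketch covers the codec, coded size, and display aspect checks on a hypothetical VideoDecoderConfigLike shape. It omits the detached description check, and it decides only validity, not whether a User Agent supports the configuration.

```typescript
interface VideoDecoderConfigLike {
  codec: string;
  codedWidth?: number;
  codedHeight?: number;
  displayAspectWidth?: number;
  displayAspectHeight?: number;
}

// Strips leading and trailing ASCII whitespace (tab, LF, FF, CR, space).
function stripAsciiWhitespace(text: string): string {
  return text.replace(/^[\t\n\f\r ]+|[\t\n\f\r ]+$/g, "");
}

function isValidVideoDecoderConfigLike(config: VideoDecoderConfigLike): boolean {
  if (stripAsciiWhitespace(config.codec) === "") return false;

  // codedWidth and codedHeight must be provided together and be non-zero.
  if ((config.codedWidth === undefined) !== (config.codedHeight === undefined)) return false;
  if (config.codedWidth === 0 || config.codedHeight === 0) return false;

  // The same pairing rule applies to the display aspect members.
  if (
    (config.displayAspectWidth === undefined) !==
    (config.displayAspectHeight === undefined)
  ) {
    return false;
  }
  if (config.displayAspectWidth === 0 || config.displayAspectHeight === 0) return false;

  return true;
}

const codecOnlyValid = isValidVideoDecoderConfigLike({ codec: "vp8" });
const blankCodecValid = isValidVideoDecoderConfigLike({ codec: " \t" });
const halfSizeValid = isValidVideoDecoderConfigLike({ codec: "vp8", codedWidth: 640 });
const zeroWidthValid = isValidVideoDecoderConfigLike({ codec: "vp8", codedWidth: 0, codedHeight: 480 });
const fullConfigValid = isValidVideoDecoderConfigLike({
  codec: "vp8",
  codedWidth: 640,
  codedHeight: 480,
  displayAspectWidth: 4,
  displayAspectHeight: 3,
});
```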
codec
, of type DOMString
description
, of type AllowSharedBufferSource
NOTE: The registrations in the [WEBCODECS-CODEC-REGISTRY] describe whether/how to populate this sequence, corresponding to the provided codec
.
codedWidth
, of type unsigned long
codedHeight
, of type unsigned long
NOTE: codedWidth
and codedHeight
are used when selecting a [[codec implementation]]
.
displayAspectWidth
, of type unsigned long
displayAspectHeight
, of type unsigned long
NOTE: displayWidth
and displayHeight
can both be different from displayAspectWidth
and displayAspectHeight
, but have identical ratios, after scaling is applied when creating the video frame.
colorSpace
, of type VideoColorSpaceInit
VideoFrame
.colorSpace
for VideoFrame
s associated with this VideoDecoderConfig
. If colorSpace
exists, the provided values will override any in-band values from the bitstream.
hardwareAcceleration
, of type HardwareAcceleration, defaulting to "no-preference"
HardwareAcceleration
.
optimizeForLatency
, of type boolean
EncodedVideoChunk
s that have to be decoded before a VideoFrame
is output.
NOTE: In addition to User Agent and hardware limitations, some codec bitstreams require a minimum number of inputs before any output can be produced.
rotation
, of type double, defaulting to 0
rotation
attribute on decoded frames.
flip
, of type boolean, defaulting to false
flip
attribute on decoded frames.
dictionary AudioEncoderConfig
{
required DOMString codec;
[EnforceRange] required unsigned long sampleRate;
[EnforceRange] required unsigned long numberOfChannels;
[EnforceRange] unsigned long long bitrate;
BitrateMode bitrateMode = "variable";
};
NOTE: Codec-specific extensions to AudioEncoderConfig
are described in their registrations in the [WEBCODECS-CODEC-REGISTRY].
To check if an AudioEncoderConfig
is a valid AudioEncoderConfig, run these steps:
If codec
is empty after stripping leading and trailing ASCII whitespace, return false
.
If the AudioEncoderConfig
has a codec-specific extension and the corresponding registration in the [WEBCODECS-CODEC-REGISTRY] defines steps to check whether the extension is a valid extension, return the result of running those steps.
If sampleRate
or numberOfChannels
are equal to zero, return false
.
Return true
.
codec
, of type DOMString
sampleRate
, of type unsigned long
numberOfChannels
, of type unsigned long
bitrate
, of type unsigned long long
bitrateMode
, of type BitrateMode, defaulting to "variable"
constant
or variable
bitrate as defined by [MEDIASTREAM-RECORDING].
NOTE: Not all audio codecs support specific BitrateMode
s. Authors are encouraged to check by calling isConfigSupported()
with config.
dictionary VideoEncoderConfig
{
required DOMString codec;
[EnforceRange] required unsigned long width;
[EnforceRange] required unsigned long height;
[EnforceRange] unsigned long displayWidth;
[EnforceRange] unsigned long displayHeight;
[EnforceRange] unsigned long long bitrate;
double framerate;
HardwareAcceleration hardwareAcceleration = "no-preference";
AlphaOption alpha = "discard";
DOMString scalabilityMode;
VideoEncoderBitrateMode bitrateMode = "variable";
LatencyMode latencyMode = "quality";
DOMString contentHint;
};
NOTE: Codec-specific extensions to VideoEncoderConfig
are described in their registrations in the [WEBCODECS-CODEC-REGISTRY].
To check if a VideoEncoderConfig
is a valid VideoEncoderConfig, run these steps:
If codec
is empty after stripping leading and trailing ASCII whitespace, return false
.
If displayWidth
= 0 or displayHeight
= 0, return false
.
Return true
.
codec
, of type DOMString
width
, of type unsigned long
EncodedVideoChunk
s in pixels, prior to any display aspect ratio adjustments.
The encoder MUST scale any VideoFrame
whose [[visible width]]
differs from this value.
height
, of type unsigned long
EncodedVideoChunk
s in pixels, prior to any display aspect ratio adjustments.
The encoder MUST scale any VideoFrame
whose [[visible height]]
differs from this value.
displayWidth
, of type unsigned long
EncodedVideoChunk
s in pixels. Defaults to width
if not present.
displayHeight
, of type unsigned long
EncodedVideoChunk
s in pixels. Defaults to height
if not present.
NOTE: Providing a
displayWidth
or
displayHeight
that differs from
width
and
height
signals that chunks are to be scaled after decoding to arrive at the final display aspect ratio.
For many codecs this is merely pass-through information, but some codecs can sometimes include display sizing in the bitstream.
bitrate
, of type unsigned long long
NOTE: Authors are encouraged to additionally provide a framerate
to inform rate control.
framerate
, of type double
timestamp
, SHOULD be used by the video encoder to calculate the optimal byte length for each encoded frame. Additionally, the value SHOULD be considered a target deadline for outputting encoded chunks when latencyMode
is set to realtime
.
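As a rough illustration of how bitrate and framerate relate to the byte length of an encoded frame, the following non-normative sketch computes an average per-frame budget. The even split across frames is an assumption made for the arithmetic only; actual rate control is implementation defined and typically allocates more bytes to key frames. The function name is hypothetical.

```typescript
// Average bytes available per frame if the bitrate were spread evenly.
// bitrate is in bits per second, framerate in frames per second.
function averageBytesPerFrame(bitrate: number, framerate: number): number {
  if (framerate <= 0) throw new RangeError("framerate must be positive");
  return bitrate / framerate / 8;
}

// 2,000,000 bits/s at 25 frames/s is 80,000 bits, or 10,000 bytes, per frame.
const averageFrameBudget = averageBytesPerFrame(2000000, 25);
```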
hardwareAcceleration
, of type HardwareAcceleration, defaulting to "no-preference"
HardwareAcceleration
.
alpha
, of type AlphaOption, defaulting to "discard"
VideoFrame
inputs SHOULD be kept or discarded prior to encoding. If alpha
is equal to discard
, alpha data is always discarded, regardless of a VideoFrame
’s [[format]]
.
scalabilityMode
, of type DOMString
bitrateMode
, of type VideoEncoderBitrateMode, defaulting to "variable"
VideoEncoderBitrateMode
.
NOTE: The precise degree of bitrate fluctuation in either mode is implementation defined.
latencyMode
, of type LatencyMode, defaulting to "quality"
LatencyMode
.
contentHint
, of type DOMString
The User Agent MAY use this hint to set expectations about incoming VideoFrame
s and to improve encoding quality. If using this hint:
The User Agent MUST respect other explicitly set encoding options when configuring the encoder, whether they are codec-specific encoding options or not.
The User Agent SHOULD make a best-effort attempt to use additional configuration options to improve encoding quality, according to the goals defined by the corresponding video content hint.
NOTE: Some encoder options are implementation specific, and mappings between contentHint
and those options cannot be prescribed.
The User Agent MUST NOT refuse the configuration if it doesn’t support this content hint. See isConfigSupported()
.
enum HardwareAcceleration
{
"no-preference",
"prefer-hardware",
"prefer-software",
};
When supported, hardware acceleration offloads encoding or decoding to specialized hardware. prefer-hardware
and prefer-software
are hints. While User Agents SHOULD respect these values when possible, User Agents may ignore these values in some or all circumstances for any reason.
To prevent fingerprinting, if a User Agent implements [media-capabilities], the User Agent MUST ensure rejection or acceptance of a given HardwareAcceleration
preference reveals no additional information on top of what is inherent to the User Agent and revealed by [media-capabilities]. If a User Agent does not implement [media-capabilities] for reasons of fingerprinting, it SHOULD ignore the HardwareAcceleration
preference.
NOTE: Good examples of when a User Agent can ignore
prefer-hardware
or
prefer-software
are for reasons of user privacy or circumstances where the User Agent determines an alternative setting would better serve the end user.
Most authors will be best served by using the default of no-preference
. This gives the User Agent flexibility to optimize based on its knowledge of the system and configuration. A common strategy will be to prioritize hardware acceleration at higher resolutions with a fallback to software codecs if hardware acceleration fails.
Authors are encouraged to carefully weigh the tradeoffs when setting a hardware acceleration preference. The precise tradeoffs will be device-specific, but authors can generally expect the following:
Setting a value of prefer-hardware
or prefer-software
can significantly restrict what configurations are supported. It can occur that the user’s device does not offer acceleration for any codec, or only for the most common profiles of older codecs. It can also occur that a given User Agent lacks a software based codec implementation.
Hardware acceleration does not simply imply faster encoding / decoding. Hardware acceleration often has higher startup latency but more consistent throughput performance. Acceleration will generally reduce CPU load.
For decoding, hardware acceleration is often less robust to inputs that are mislabeled or violate the relevant codec specification.
Hardware acceleration will often be more power efficient than purely software based codecs.
For lower resolution content, the overhead added by hardware acceleration can yield decreased performance and power efficiency compared to purely software based codecs.
Given these tradeoffs, a good example of using "prefer-hardware" would be if an author intends to provide their own software based fallback via WebAssembly.
Alternatively, a good example of using "prefer-software" would be if an author is especially sensitive to the higher startup latency or decreased robustness generally associated with hardware acceleration.
no-preference
prefer-software
NOTE: This can cause the configuration to be unsupported on platforms where an unaccelerated codec is unavailable or is incompatible with other aspects of the codec configuration.
prefer-hardware
NOTE: This can cause the configuration to be unsupported on platforms where an accelerated codec is unavailable or is incompatible with other aspects of the codec configuration.
enum AlphaOption
{
"keep",
"discard",
};
Describes how the User Agent SHOULD behave when dealing with alpha channels, for a variety of different operations.
keep
VideoFrame
s, if it is present.
discard
VideoFrame
’s alpha channel data.
enum LatencyMode
{
"quality",
"realtime"
};
quality
Indicates that the User Agent SHOULD optimize for encoding quality. In this mode:
realtime
Indicates that the User Agent SHOULD optimize for low latency. In this mode:
dictionary VideoEncoderEncodeOptions
{
boolean keyFrame = false;
};
NOTE: Codec-specific extensions to VideoEncoderEncodeOptions
are described in their registrations in the [WEBCODECS-CODEC-REGISTRY].
keyFrame
, of type boolean, defaulting to false
true
indicates that the given frame MUST be encoded as a key frame. A value of false
indicates that the User Agent has flexibility to decide whether the frame will be encoded as a key frame.
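Authors commonly set keyFrame to true at a fixed cadence and leave it false otherwise, so the User Agent keeps its flexibility between forced key frames. The following non-normative sketch builds the options object for a given frame index. The helper name and the interval parameter are hypothetical; the returned object has the shape of VideoEncoderEncodeOptions as defined above.

```typescript
interface EncodeOptionsLike {
  keyFrame: boolean;
}

// Requests a key frame on every keyFrameInterval-th frame, starting at 0.
function encodeOptionsFor(
  frameIndex: number,
  keyFrameInterval: number,
): EncodeOptionsLike {
  return { keyFrame: frameIndex % keyFrameInterval === 0 };
}

const requestedKeyFrames = [0, 1, 2, 3, 4, 5, 6].map(
  (index) => encodeOptionsFor(index, 3).keyFrame,
);
```

With an interval of three, frames 0, 3, and 6 are requested as key frames; the User Agent remains free to emit additional key frames elsewhere.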
enum VideoEncoderBitrateMode
{
"constant",
"variable",
"quantizer"
};
constant
bitrate
.
variable
bitrate
.
quantizer
VideoEncoderEncodeOptions
.
enum CodecState
{
"unconfigured",
"configured",
"closed"
};
unconfigured
configured
closed
callback WebCodecsErrorCallback = undefined(DOMException error);
These interfaces represent chunks of encoded media.
8.1. EncodedAudioChunk Interface
[Exposed=(Window,DedicatedWorker), Serializable]
interface EncodedAudioChunk
{
constructor(EncodedAudioChunkInit init);
readonly attribute EncodedAudioChunkType type;
readonly attribute long long timestamp; // microseconds
readonly attribute unsigned long long? duration; // microseconds
readonly attribute unsigned long byteLength;
undefined copyTo(AllowSharedBufferSource destination);
};
dictionary EncodedAudioChunkInit
{
required EncodedAudioChunkType type;
[EnforceRange] required long long timestamp; // microseconds
[EnforceRange] unsigned long long duration; // microseconds
required AllowSharedBufferSource data;
sequence<ArrayBuffer> transfer = [];
};
enum EncodedAudioChunkType
{
"key",
"delta",
};
8.1.1. Internal Slots
[[internal data]]
An array of bytes representing the encoded chunk data.
[[type]]
Describes whether the chunk is a key chunk.
[[timestamp]]
The presentation timestamp, given in microseconds.
[[duration]]
The presentation duration, given in microseconds.
[[byte length]]
The byte length of [[internal data]]
.
EncodedAudioChunk(init)
If init.transfer
contains more than one reference to the same ArrayBuffer
, then throw a DataCloneError
DOMException
.
For each transferable in init.transfer
:
If [[Detached]]
internal slot is true
, then throw a DataCloneError
DOMException
.
Let chunk be a new EncodedAudioChunk
object, initialized as follows:
Assign init.type
to [[type]]
.
Assign init.timestamp
to [[timestamp]]
.
If init.duration
exists, assign it to [[duration]]
, or assign null
otherwise.
Assign init.data.byteLength
to [[byte length]]
.
If init.transfer
contains an ArrayBuffer
referenced by init.data
the User Agent MAY choose to:
Let resource be a new media resource referencing sample data in init.data
.
Otherwise:
Assign a copy of init.data
to [[internal data]]
.
For each transferable in init.transfer
:
Perform DetachArrayBuffer on transferable.
Return chunk.
type
, of type EncodedAudioChunkType, readonly
Returns the value of [[type]]
.
timestamp
, of type long long, readonly
Returns the value of [[timestamp]]
.
duration
, of type unsigned long long, readonly, nullable
Returns the value of [[duration]]
.
byteLength
, of type unsigned long, readonly
Returns the value of [[byte length]]
.
copyTo(destination)
When invoked, run these steps:
If the [[byte length]]
of this EncodedAudioChunk
is greater than the byte length of destination, throw a TypeError
.
Copy the [[internal data]]
into destination.
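This example is non-normative. It sketches the copyTo(destination) steps as a plain function. The function name copyChunkTo is hypothetical, and internalData is a Uint8Array standing in for the [[internal data]] slot.

```javascript
// Sketch of the copyTo(destination) steps for an encoded chunk.
// `internalData` (a Uint8Array) stands in for [[internal data]].
function copyChunkTo(internalData, destination) {
  // Accept either an ArrayBuffer or any view onto one.
  const dest = ArrayBuffer.isView(destination)
    ? new Uint8Array(destination.buffer, destination.byteOffset, destination.byteLength)
    : new Uint8Array(destination);
  // Step 1: the destination must be able to hold the whole chunk.
  if (internalData.byteLength > dest.byteLength) {
    throw new TypeError("destination is too small for the chunk data");
  }
  // Step 2: copy the chunk bytes into the destination.
  dest.set(internalData);
}
```

With a real chunk, an author would typically size the destination from byteLength, for example by allocating new Uint8Array(chunk.byteLength) before calling chunk.copyTo().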
EncodedAudioChunk
serialization steps (with value, serialized, and forStorage) are:
If forStorage is true
, throw a DataCloneError
.
For each EncodedAudioChunk
internal slot in value, assign the value of each internal slot to a field in serialized with the same name as the internal slot.
EncodedAudioChunk
deserialization steps (with serialized and value) are:
For all named fields in serialized, assign the value of each named field to the EncodedAudioChunk
internal slot in value with the same name as the named field.
NOTE: Since EncodedAudioChunk
s are immutable, User Agents can choose to implement serialization using a reference counting model similar to § 9.2.6 Transfer and Serialization.
[Exposed=(Window,DedicatedWorker), Serializable]
interface EncodedVideoChunk
{
constructor(EncodedVideoChunkInit init);
readonly attribute EncodedVideoChunkType type;
readonly attribute long long timestamp; // microseconds
readonly attribute unsigned long long? duration; // microseconds
readonly attribute unsigned long byteLength;
undefined copyTo(AllowSharedBufferSource destination);
};
dictionary EncodedVideoChunkInit
{
required EncodedVideoChunkType type;
[EnforceRange] required long long timestamp; // microseconds
[EnforceRange] unsigned long long duration; // microseconds
required AllowSharedBufferSource data;
sequence<ArrayBuffer> transfer = [];
};
enum EncodedVideoChunkType
{
"key",
"delta",
};
8.2.1. Internal Slots
[[internal data]]
An array of bytes representing the encoded chunk data.
[[type]]
The EncodedVideoChunkType
of this EncodedVideoChunk
.
[[timestamp]]
The presentation timestamp, given in microseconds.
[[duration]]
The presentation duration, given in microseconds.
[[byte length]]
The byte length of [[internal data]]
.
EncodedVideoChunk(init)
If init.transfer
contains more than one reference to the same ArrayBuffer
, then throw a DataCloneError
DOMException
.
For each transferable in init.transfer
:
If [[Detached]]
internal slot is true
, then throw a DataCloneError
DOMException
.
Let chunk be a new EncodedVideoChunk
object, initialized as follows:
Assign init.type
to [[type]]
.
Assign init.timestamp
to [[timestamp]]
.
If duration is present in init, assign init.duration
to [[duration]]
. Otherwise, assign null
to [[duration]]
.
Assign init.data.byteLength
to [[byte length]]
.
If init.transfer
contains an ArrayBuffer
referenced by init.data
the User Agent MAY choose to:
Let resource be a new media resource referencing sample data in init.data
.
Otherwise:
Assign a copy of init.data
to [[internal data]]
.
For each transferable in init.transfer
:
Perform DetachArrayBuffer on transferable.
Return chunk.
type
, of type EncodedVideoChunkType, readonly
Returns the value of [[type]]
.
timestamp
, of type long long, readonly
Returns the value of [[timestamp]]
.
duration
, of type unsigned long long, readonly, nullable
Returns the value of [[duration]]
.
byteLength
, of type unsigned long, readonly
Returns the value of [[byte length]]
.
copyTo(destination)
When invoked, run these steps:
If [[byte length]]
is greater than the byte length
of destination, throw a TypeError
.
Copy the [[internal data]]
into destination.
EncodedVideoChunk
serialization steps (with value, serialized, and forStorage) are:
If forStorage is true
, throw a DataCloneError
.
For each EncodedVideoChunk
internal slot in value, assign the value of each internal slot to a field in serialized with the same name as the internal slot.
EncodedVideoChunk
deserialization steps (with serialized and value) are:
For all named fields in serialized, assign the value of each named field to the EncodedVideoChunk
internal slot in value with the same name as the named field.
NOTE: Since EncodedVideoChunk
s are immutable, User Agents can choose to implement serialization using a reference counting model similar to § 9.4.7 Transfer and Serialization.
This section is non-normative.
Decoded media data MAY occupy a large amount of system memory. To minimize the need for expensive copies, this specification defines a scheme for reference counting (clone()
and close()
).
NOTE: Authors are encouraged to call close()
immediately when frames are no longer needed.
A media resource is storage for the actual pixel data or the audio sample data described by a VideoFrame
or AudioData
.
The AudioData
[[resource reference]]
and VideoFrame
[[resource reference]]
internal slots hold a reference to a media resource.
VideoFrame
.clone()
and AudioData
.clone()
return new objects whose [[resource reference]]
points to the same media resource as the original object.
VideoFrame
.close()
and AudioData
.close()
will clear their [[resource reference]]
slot, releasing the reference to their media resource.
A media resource MUST remain alive at least as long as it continues to be referenced by a [[resource reference]]
.
NOTE: When a media resource is no longer referenced by a [[resource reference]]
, the resource can be destroyed. User Agents are encouraged to destroy such resources quickly to reduce memory pressure and facilitate resource reuse.
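This example is non-normative. It models the reference counting scheme described above with two hypothetical classes: MediaResource stands in for a media resource, and FrameLike stands in for a VideoFrame or AudioData. Real User Agents are free to implement this differently.

```javascript
// Illustrative model only; these classes are not part of the API.
class MediaResource {
  constructor() {
    this.refCount = 0;
    this.destroyed = false;
  }
  addRef() {
    this.refCount += 1;
    return this;
  }
  release() {
    this.refCount -= 1;
    // Once nothing references the resource, it can be destroyed.
    if (this.refCount === 0) this.destroyed = true;
  }
}

class FrameLike {
  constructor(resource) {
    // Models the [[resource reference]] internal slot.
    this.resourceReference = resource.addRef();
  }
  clone() {
    // The real API throws an InvalidStateError DOMException here.
    if (this.resourceReference === null) throw new Error("frame is closed");
    // The clone points to the same resource through a new reference.
    return new FrameLike(this.resourceReference);
  }
  close() {
    if (this.resourceReference === null) return;
    this.resourceReference.release();
    this.resourceReference = null;
  }
}
```

In this model the resource stays alive until every clone has been closed, which is why calling close() promptly matters.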
This section is non-normative.
AudioData
and VideoFrame
are both transferable and serializable objects. Their transfer and serialization steps are defined in § 9.2.6 Transfer and Serialization and § 9.4.7 Transfer and Serialization respectively.
Transferring an AudioData
or VideoFrame
moves its [[resource reference]]
to the destination object and closes (as in close()
) the source object. Authors MAY use this facility to move an AudioData
or VideoFrame
between realms without copying the underlying media resource.
Serializing an AudioData
or VideoFrame
effectively clones (as in clone()
) the source object, resulting in two objects that reference the same media resource. Authors MAY use this facility to clone an AudioData
or VideoFrame
to another realm without copying the underlying media resource.
[Exposed=(Window,DedicatedWorker), Serializable, Transferable]
interface AudioData
{
constructor(AudioDataInit init);
readonly attribute AudioSampleFormat? format;
readonly attribute float sampleRate;
readonly attribute unsigned long numberOfFrames;
readonly attribute unsigned long numberOfChannels;
readonly attribute unsigned long long duration; // microseconds
readonly attribute long long timestamp; // microseconds
unsigned long allocationSize(AudioDataCopyToOptions options);
undefined copyTo(AllowSharedBufferSource destination, AudioDataCopyToOptions options);
AudioData clone();
undefined close();
};
dictionary AudioDataInit
{
required AudioSampleFormat format;
required float sampleRate;
[EnforceRange] required unsigned long numberOfFrames;
[EnforceRange] required unsigned long numberOfChannels;
[EnforceRange] required long long timestamp; // microseconds
required BufferSource data;
sequence<ArrayBuffer> transfer = [];
};
9.2.1. Internal Slots
[[resource reference]]
A reference to a media resource that stores the audio sample data for this AudioData
.
[[format]]
The AudioSampleFormat
used by this AudioData
. Will be null
whenever the underlying format does not map to an AudioSampleFormat
or when [[Detached]]
is true
.
[[sample rate]]
The sample-rate, in Hz, for this AudioData
.
[[number of frames]]
The number of frames for this AudioData
.
[[number of channels]]
The number of audio channels for this AudioData
.
[[timestamp]]
The presentation timestamp, in microseconds, for this AudioData
.
AudioData(init)
If init is not a valid AudioDataInit, throw a TypeError
.
If init.transfer
contains more than one reference to the same ArrayBuffer
, then throw a DataCloneError
DOMException
.
For each transferable in init.transfer
:
If [[Detached]]
internal slot is true
, then throw a DataCloneError
DOMException
.
Let frame be a new AudioData
object, initialized as follows:
Assign false
to [[Detached]]
.
Assign init.format
to [[format]]
.
Assign init.sampleRate
to [[sample rate]]
.
Assign init.numberOfFrames
to [[number of frames]]
.
Assign init.numberOfChannels
to [[number of channels]]
.
Assign init.timestamp
to [[timestamp]]
.
If init.transfer
contains an ArrayBuffer
referenced by init.data
the User Agent MAY choose to:
Let resource be a new media resource referencing sample data in init.data.
Otherwise:
Let resource be a media resource containing a copy of init.data
.
Let resourceReference be a reference to resource.
Assign resourceReference to [[resource reference]]
.
For each transferable in init.transfer
:
Perform DetachArrayBuffer on transferable.
Return frame.
format
, of type AudioSampleFormat, readonly, nullable
The AudioSampleFormat
used by this AudioData
. Will be null
whenever the underlying format does not map to an AudioSampleFormat
or when [[Detached]]
is true
.
The format
getter steps are to return [[format]]
.
sampleRate
, of type float, readonly
The sample-rate, in Hz, for this AudioData
.
The sampleRate
getter steps are to return [[sample rate]]
.
numberOfFrames
, of type unsigned long, readonly
The number of frames for this AudioData
.
The numberOfFrames
getter steps are to return [[number of frames]]
.
numberOfChannels
, of type unsigned long, readonly
The number of audio channels for this AudioData
.
The numberOfChannels
getter steps are to return [[number of channels]]
.
timestamp
, of type long long, readonly
The presentation timestamp, in microseconds, for this AudioData
.
The timestamp
getter steps are to return [[timestamp]]
.
duration
, of type unsigned long long, readonly
The duration, in microseconds, for this AudioData
.
The duration
getter steps are to:
Let microsecondsPerSecond be 1,000,000
.
Let durationInSeconds be the result of dividing [[number of frames]]
by [[sample rate]]
.
Return the product of durationInSeconds and microsecondsPerSecond.
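This example is non-normative. It sketches the duration getter steps as a plain function. The function name durationMicroseconds is hypothetical, and truncating the result to a whole number of microseconds is an assumption that mirrors the integer type of the attribute.

```javascript
// Sketch of the duration getter steps.
function durationMicroseconds(numberOfFrames, sampleRate) {
  const microsecondsPerSecond = 1000000;
  const durationInSeconds = numberOfFrames / sampleRate;
  // Assumption: fractional microseconds are dropped, as for an
  // unsigned long long attribute value.
  return Math.trunc(durationInSeconds * microsecondsPerSecond);
}
```

For instance, 48000 frames at a sample rate of 48000 Hz last one second, so the function returns 1000000.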
allocationSize(options)
Returns the number of bytes required to hold the samples as described by options.
When invoked, run these steps:
If [[Detached]]
is true
, throw an InvalidStateError
DOMException
.
Let copyElementCount be the result of running the Compute Copy Element Count algorithm with options.
Let destFormat be the value of [[format]]
.
If options.format
exists, assign options.format
to destFormat.
Let bytesPerSample be the number of bytes per sample, as defined by the destFormat.
Return the product of multiplying bytesPerSample by copyElementCount.
copyTo(destination, options)
Copies the samples from the specified plane of the AudioData
to the destination buffer.
When invoked, run these steps:
If [[Detached]]
is true
, throw an InvalidStateError
DOMException
.
Let copyElementCount be the result of running the Compute Copy Element Count algorithm with options.
Let destFormat be the value of [[format]]
.
If options.format
exists, assign options.format
to destFormat.
Let bytesPerSample be the number of bytes per sample, as defined by the destFormat.
If the product of multiplying bytesPerSample by copyElementCount is greater than destination.byteLength
, throw a RangeError
.
Let resource be the media resource referenced by [[resource reference]]
.
Let planeFrames be the region of resource corresponding to options.planeIndex
.
Copy elements of planeFrames into destination, starting with the frame positioned at options.frameOffset
and stopping after copyElementCount samples have been copied. If destFormat does not equal [[format]]
, convert elements to the destFormat AudioSampleFormat
while making the copy.
clone()
Creates a new AudioData with a reference to the same media resource.
When invoked, run these steps:
If [[Detached]]
is true
, throw an InvalidStateError
DOMException
.
Return the result of running the Clone AudioData algorithm with this.
close()
Clears all state and releases the reference to the media resource. Close is final.
When invoked, run the Close AudioData algorithm with this.
Compute Copy Element Count (with options)
Run these steps:
Let destFormat be the value of [[format]]
.
If options.format
exists, assign options.format
to destFormat.
If destFormat describes an interleaved AudioSampleFormat
and options.planeIndex
is greater than 0
, throw a RangeError
.
Otherwise, if destFormat describes a planar AudioSampleFormat
and if options.planeIndex
is greater than or equal to [[number of channels]]
, throw a RangeError
.
If [[format]]
does not equal destFormat and the User Agent does not support the requested AudioSampleFormat
conversion, throw a NotSupportedError
DOMException
. Conversion to f32-planar
MUST always be supported.
Let frameCount be the number of frames in the plane identified by options.planeIndex
.
If options.frameOffset
is greater than or equal to frameCount, throw a RangeError
.
Let copyFrameCount be the difference of subtracting options.frameOffset
from frameCount.
If options.frameCount
exists:
If options.frameCount
is greater than copyFrameCount, throw a RangeError
.
Otherwise, assign options.frameCount
to copyFrameCount.
Let elementCount be copyFrameCount.
If destFormat describes an interleaved AudioSampleFormat
, multiply elementCount by [[number of channels]].
Return elementCount.
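This example is non-normative. It sketches the Compute Copy Element Count steps and the allocationSize() steps that build on them. The names describeFormat, computeCopyElementCount, and allocationSizeFor are hypothetical, and data is a plain object standing in for the internal slots of an AudioData. The format conversion support check is omitted; the sketch assumes every conversion is supported.

```javascript
// "s16-planar" -> { sampleType: "s16", planar: true }
function describeFormat(format) {
  const planar = format.endsWith("-planar");
  const sampleType = planar ? format.slice(0, -"-planar".length) : format;
  return { sampleType, planar };
}

// `data` models the slots: { format, numberOfFrames, numberOfChannels }.
function computeCopyElementCount(data, options) {
  const destFormat = options.format ?? data.format;
  const { planar } = describeFormat(destFormat);
  if (!planar && options.planeIndex > 0) {
    throw new RangeError("interleaved formats have a single plane");
  }
  if (planar && options.planeIndex >= data.numberOfChannels) {
    throw new RangeError("planeIndex exceeds the number of channels");
  }
  const frameCount = data.numberOfFrames;
  const frameOffset = options.frameOffset ?? 0;
  if (frameOffset >= frameCount) {
    throw new RangeError("frameOffset is out of range");
  }
  let copyFrameCount = frameCount - frameOffset;
  if (options.frameCount !== undefined) {
    if (options.frameCount > copyFrameCount) {
      throw new RangeError("frameCount is out of range");
    }
    copyFrameCount = options.frameCount;
  }
  // An interleaved plane holds one element per channel for every frame.
  return planar ? copyFrameCount : copyFrameCount * data.numberOfChannels;
}

function allocationSizeFor(data, options) {
  const bytesPerSampleByType = { u8: 1, s16: 2, s32: 4, f32: 4 };
  const destFormat = options.format ?? data.format;
  const bytesPerSample = bytesPerSampleByType[describeFormat(destFormat).sampleType];
  return bytesPerSample * computeCopyElementCount(data, options);
}
```

For stereo s16 data with 100 frames, copying plane 0 without conversion needs 100 * 2 elements of 2 bytes each, while copying one channel as f32-planar needs 100 elements of 4 bytes each.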
Clone AudioData (with data)
Run these steps:
Let clone be a new AudioData
initialized as follows:
Let resource be the media resource referenced by data’s [[resource reference]]
.
Let reference be a new reference to resource.
Assign reference to [[resource reference]]
.
Assign the values of data’s [[Detached]]
, [[format]]
, [[sample rate]]
, [[number of frames]]
, [[number of channels]]
, and [[timestamp]]
slots to the corresponding slots in clone.
Return clone.
Close AudioData (with data)
Run these steps:
Assign true
to data’s [[Detached]]
internal slot.
Assign null
to data’s [[resource reference]]
.
Assign 0
to data’s [[sample rate]]
.
Assign 0
to data’s [[number of frames]]
.
Assign 0
to data’s [[number of channels]]
.
Assign null
to data’s [[format]]
.
To check if an AudioDataInit
is a valid AudioDataInit, run these steps:
If sampleRate
is less than or equal to 0
, return false
.
If numberOfFrames
= 0
, return false
.
If numberOfChannels
= 0
, return false
.
Verify data
has enough data by running the following steps:
Let totalSamples be the product of multiplying numberOfFrames
by numberOfChannels
.
Let bytesPerSample be the number of bytes per sample, as defined by the format
.
Let totalSize be the product of multiplying bytesPerSample with totalSamples.
Let dataSize be the size in bytes of data
.
If dataSize is less than totalSize, return false.
Return true
.
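This example is non-normative. It sketches the valid AudioDataInit check as a plain function over an object with the same member names as AudioDataInit. The function name isValidAudioDataInit is hypothetical.

```javascript
// Sketch of the "valid AudioDataInit" steps.
function isValidAudioDataInit(init) {
  const bytesPerSampleByType = { u8: 1, s16: 2, s32: 4, f32: 4 };
  if (init.sampleRate <= 0) return false;
  if (init.numberOfFrames === 0) return false;
  if (init.numberOfChannels === 0) return false;
  // Verify that data holds enough bytes for every sample.
  const totalSamples = init.numberOfFrames * init.numberOfChannels;
  const sampleType = init.format.replace("-planar", "");
  const bytesPerSample = bytesPerSampleByType[sampleType];
  const totalSize = bytesPerSample * totalSamples;
  const dataSize = init.data.byteLength;
  return dataSize >= totalSize;
}
```

For example, two frames of stereo f32 audio need at least 2 * 2 * 4 = 16 bytes of data.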
AudioData
transfer steps (with value and dataHolder) are:
If value’s [[Detached]]
is true
, throw a DataCloneError
DOMException
.
For all AudioData
internal slots in value, assign the value of each internal slot to a field in dataHolder with the same name as the internal slot.
Run the Close AudioData algorithm with value.
AudioData
transfer-receiving steps (with dataHolder and value) are:
For all named fields in dataHolder, assign the value of each named field to the AudioData
internal slot in value with the same name as the named field.
AudioData
serialization steps (with value, serialized, and forStorage) are:
If value’s [[Detached]]
is true
, throw a DataCloneError
DOMException
.
If forStorage is true
, throw a DataCloneError
.
Let resource be the media resource referenced by value’s [[resource reference]]
.
Let newReference be a new reference to resource.
Assign newReference to serialized.resource reference.
For all remaining AudioData
internal slots (excluding [[resource reference]]
) in value, assign the value of each internal slot to a field in serialized with the same name as the internal slot.
AudioData
deserialization steps (with serialized and value) are:
For all named fields in serialized, assign the value of each named field to the AudioData
internal slot in value with the same name as the named field.
dictionary AudioDataCopyToOptions
{
[EnforceRange] required unsigned long planeIndex;
[EnforceRange] unsigned long frameOffset = 0;
[EnforceRange] unsigned long frameCount;
AudioSampleFormat format;
};
planeIndex
, of type unsigned long
The index identifying the plane to copy from.
frameOffset
, of type unsigned long, defaulting to 0
An offset into the source plane data indicating which frame to begin copying from. Defaults to 0
.
frameCount
, of type unsigned long
The number of frames to copy. If not provided, the copy will include all frames in the plane beginning with frameOffset
.
format
, of type AudioSampleFormat
The output AudioSampleFormat
for the destination data. If not provided, the resulting copy will use this AudioData’s [[format]]
. Invoking copyTo()
will throw a NotSupportedError
if conversion to the requested format is not supported. Conversion from any AudioSampleFormat
to f32-planar
MUST always be supported.
NOTE: Authors seeking to integrate with [WEBAUDIO] can request f32-planar
and use the resulting copy to create an AudioBuffer
or render via AudioWorklet
.
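This example is non-normative. It sketches how an author might copy every channel of an AudioData into its own Float32Array by requesting f32-planar, using only allocationSize() and copyTo() as described above. The function name copyAllPlanesAsF32 is hypothetical.

```javascript
// Copy each channel of `audioData` into a separate Float32Array.
function copyAllPlanesAsF32(audioData) {
  const planes = [];
  for (let planeIndex = 0; planeIndex < audioData.numberOfChannels; planeIndex++) {
    // With a planar destination format, each plane holds one channel.
    const options = { planeIndex, format: "f32-planar" };
    const byteSize = audioData.allocationSize(options);
    const plane = new Float32Array(byteSize / Float32Array.BYTES_PER_ELEMENT);
    audioData.copyTo(plane, options);
    planes.push(plane);
  }
  return planes;
}
```

Each resulting plane could then be handed to Web Audio, for example via AudioBuffer.copyToChannel().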
An audio sample format describes the numeric type used to represent a single sample (e.g. 32-bit floating point) and the arrangement of samples from different channels as either interleaved or planar. The audio sample type refers solely to the numeric type and interval used to store the data; it is one of u8
, s16
, s32
, or f32
, denoting respectively unsigned 8-bit integers, signed 16-bit integers, signed 32-bit integers, and 32-bit floating point numbers. The audio buffer arrangement refers solely to the way the samples are laid out in memory (planar or interleaved).
A sample refers to a single value that is the magnitude of a signal at a particular point in time in a particular channel.
A frame (or sample-frame) refers to a set of values of all channels of a multi-channel signal that happen at the exact same time.
NOTE: Consequently, if an audio signal is mono (has only one channel), a frame and a sample refer to the same thing.
All audio samples in this specification use linear pulse-code modulation (Linear PCM): quantization levels are uniform between values.
NOTE: The Web Audio API, which is expected to be used with this specification, also uses Linear PCM.
enum AudioSampleFormat
{
"u8",
"s16",
"s32",
"f32",
"u8-planar",
"s16-planar",
"s32-planar",
"f32-planar",
};
u8
8-bit unsigned integer samples with interleaved channel arrangement.
s16
16-bit signed integer samples with interleaved channel arrangement.
s32
32-bit signed integer samples with interleaved channel arrangement.
f32
32-bit float samples with interleaved channel arrangement.
u8-planar
8-bit unsigned integer samples with planar channel arrangement.
s16-planar
16-bit signed integer samples with planar channel arrangement.
s32-planar
32-bit signed integer samples with planar channel arrangement.
f32-planar
32-bit float samples with planar channel arrangement.
When an AudioData
has an AudioSampleFormat
that is interleaved, the audio samples from different channels are laid out consecutively in the same buffer, in the order described in the section § 9.3.3 Audio channel ordering. The AudioData
has a single plane that contains a number of elements equal to [[number of frames]]
* [[number of channels]]
.
When an AudioData
has an AudioSampleFormat
that is planar, the audio samples from different channels are laid out in different buffers, themselves arranged in an order described in the section § 9.3.3 Audio channel ordering. The AudioData
has a number of planes equal to the AudioData
’s [[number of channels]]
. Each plane contains [[number of frames]]
elements.
NOTE: The Web Audio API currently uses f32-planar
exclusively.
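This example is non-normative. It illustrates the two arrangements by converting one interleaved plane into per-channel planes. The function name deinterleave is hypothetical.

```javascript
// Split one interleaved plane into one plane per channel.
// `interleaved` is a typed array holding frames * channels elements.
function deinterleave(interleaved, numberOfChannels) {
  const numberOfFrames = interleaved.length / numberOfChannels;
  const planes = [];
  for (let channel = 0; channel < numberOfChannels; channel++) {
    // Allocate a plane of the same typed array type as the input.
    const plane = new interleaved.constructor(numberOfFrames);
    for (let frame = 0; frame < numberOfFrames; frame++) {
      // In interleaved data, the samples of one frame sit next to each other.
      plane[frame] = interleaved[frame * numberOfChannels + channel];
    }
    planes.push(plane);
  }
  return planes;
}
```

For stereo input laid out as L0, R0, L1, R1, the result is one plane with L0, L1 and another with R0, R1.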
The minimum value and maximum value of an audio sample, for a particular audio sample type, are the values below which (respectively above which) audio clipping might occur. They are otherwise regular types, that can hold values outside this interval during intermediate processing.
The bias value for an audio sample type is the value that often corresponds to the middle of the range (but often the range is not symmetrical). An audio buffer comprised only of values equal to the bias value is silent.
NOTE: There is no data type that can hold 24 bits of information conveniently, but audio content using 24-bit samples is common, so 32-bit integers are commonly used to hold 24-bit content.
AudioData
containing 24-bit samples SHOULD store those samples in s32
or f32
. When samples are stored in s32
, each sample MUST be left-shifted by 8
bits. By virtue of this process, samples outside of the valid 24-bit range ([-8388608, +8388607]) will be clipped. To avoid clipping and ensure lossless transport, samples MAY be converted to f32
.
NOTE: While clipping is unavoidable in u8
, s16
, and s32
samples due to their storage types, implementations SHOULD take care not to clip internally when handling f32
samples.
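This example is non-normative. It sketches the two storage options for 24-bit samples described above. The function names s24ToS32 and s24ToF32 are hypothetical, and dividing by 2^23 in the f32 conversion is one common scaling convention assumed here, not something this specification mandates.

```javascript
// Store a signed 24-bit sample in s32 by left-shifting it 8 bits.
// Input outside the valid 24-bit range is clipped first.
function s24ToS32(sample) {
  const clipped = Math.max(-8388608, Math.min(8388607, sample));
  return clipped << 8;
}

// Alternative without clipping: scale into f32.
// Assumption: full scale is 2^23, so -8388608 maps to -1.
function s24ToF32(sample) {
  return Math.fround(sample / 8388608);
}
```

The s32 path keeps integer precision but clips out-of-range input, whereas the f32 path can carry values beyond the nominal range.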
When decoding, the ordering of the audio channels in the resulting AudioData
MUST be the same as what is present in the EncodedAudioChunk
.
When encoding, the ordering of the audio channels in the resulting EncodedAudioChunk
MUST be the same as what is present in the given AudioData
.
In other terms, no channel reordering is performed when encoding and decoding.
NOTE: The container either implies or specifies the channel mapping: the channel attributed to a particular channel index.
9.4. VideoFrame Interface
NOTE: VideoFrame
is a CanvasImageSource
. A VideoFrame
can be passed to any method accepting a CanvasImageSource
, including CanvasDrawImage
’s drawImage()
.
[Exposed=(Window,DedicatedWorker), Serializable, Transferable]
interface VideoFrame
{
constructor(CanvasImageSource image, optional VideoFrameInit init = {});
constructor(AllowSharedBufferSource data, VideoFrameBufferInit init);
readonly attribute VideoPixelFormat? format;
readonly attribute unsigned long codedWidth;
readonly attribute unsigned long codedHeight;
readonly attribute DOMRectReadOnly? codedRect;
readonly attribute DOMRectReadOnly? visibleRect;
readonly attribute double rotation;
readonly attribute boolean flip;
readonly attribute unsigned long displayWidth;
readonly attribute unsigned long displayHeight;
readonly attribute unsigned long long? duration; // microseconds
readonly attribute long long timestamp; // microseconds
readonly attribute VideoColorSpace colorSpace;
VideoFrameMetadata metadata();
unsigned long allocationSize(optional VideoFrameCopyToOptions options = {});
Promise<sequence<PlaneLayout>> copyTo(AllowSharedBufferSource destination, optional VideoFrameCopyToOptions options = {});
VideoFrame clone();
undefined close();
};
dictionary VideoFrameInit
{
unsigned long long duration; // microseconds
long long timestamp; // microseconds
AlphaOption alpha = "keep";
// Default matches image. May be used to efficiently crop. Will trigger
// new computation of displayWidth and displayHeight using image's pixel
// aspect ratio unless an explicit displayWidth and displayHeight are given.
DOMRectInit visibleRect;
double rotation = 0;
boolean flip = false;
// Default matches image unless visibleRect is provided.
[EnforceRange] unsigned long displayWidth;
[EnforceRange] unsigned long displayHeight;
VideoFrameMetadata metadata;
};
dictionary VideoFrameBufferInit
{
required VideoPixelFormat format;
required [EnforceRange] unsigned long codedWidth;
required [EnforceRange] unsigned long codedHeight;
required [EnforceRange] long long timestamp; // microseconds
[EnforceRange] unsigned long long duration; // microseconds
// Default layout is tightly-packed.
sequence<PlaneLayout> layout;
// Default visible rect is coded size positioned at (0,0)
DOMRectInit visibleRect;
double rotation = 0;
boolean flip = false;
// Default display dimensions match visibleRect.
[EnforceRange] unsigned long displayWidth;
[EnforceRange] unsigned long displayHeight;
VideoColorSpaceInit colorSpace;
sequence<ArrayBuffer> transfer = [];
VideoFrameMetadata metadata;
};
dictionary VideoFrameMetadata
{
// Possible members are recorded in the VideoFrame Metadata Registry.
};
9.4.1. Internal Slots
[[resource reference]]
A reference to the media resource that stores the pixel data for this frame.
[[format]]
A VideoPixelFormat
describing the pixel format of the VideoFrame
. Will be null
whenever the underlying format does not map to a VideoPixelFormat
or when [[Detached]]
is true
.
[[coded width]]
Width of the VideoFrame
in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
[[coded height]]
Height of the VideoFrame
in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
[[visible left]]
The number of pixels defining the left offset of the visible rectangle.
[[visible top]]
The number of pixels defining the top offset of the visible rectangle.
[[visible width]]
The width of pixels to include in visible rectangle, starting from [[visible left]]
.
[[visible height]]
The height of pixels to include in visible rectangle, starting from [[visible top]]
.
[[rotation]]
The rotation applied to the VideoFrame
when rendered, in degrees clockwise. Rotation applies before flip.
[[flip]]
Whether a horizontal flip is applied to the VideoFrame
when rendered. Flip is applied after rotation.
[[display width]]
Width of the VideoFrame
when displayed after applying aspect ratio adjustments.
[[display height]]
Height of the VideoFrame
when displayed after applying aspect ratio adjustments.
[[duration]]
The presentation duration, given in microseconds. The duration is copied from the EncodedVideoChunk
corresponding to this VideoFrame
.
[[timestamp]]
The presentation timestamp, given in microseconds. The timestamp is copied from the EncodedVideoChunk
corresponding to this VideoFrame
.
[[color space]]
The VideoColorSpace
associated with this frame.
[[metadata]]
The VideoFrameMetadata
associated with this frame. Possible members are recorded in [webcodecs-video-frame-metadata-registry]. By design, all VideoFrameMetadata
properties are serializable.
VideoFrame(image, init)
Check the usability of the image argument. If this throws an exception or returns bad, then throw an InvalidStateError
DOMException
.
If image is not origin-clean, then throw a SecurityError
DOMException
.
Let frame be a new VideoFrame
.
Switch on image:
NOTE: Authors are encouraged to provide a meaningful timestamp unless it is implicitly provided by the CanvasImageSource
at construction. Interfaces that consume VideoFrame
s can rely on this value for timing decisions. For example, VideoEncoder
can use timestamp
values to guide rate control (see framerate
).
If image’s media data has no natural dimensions (e.g., it’s a vector graphic with no specified content size), then throw an InvalidStateError
DOMException
.
Let resource be a new media resource containing a copy of image’s media data. If this is an animated image, image’s bitmap data MUST only be taken from the default image of the animation (the one that the format defines is to be used when animation is not supported or is disabled), or, if there is no such image, the first frame of the animation.
Let codedWidth and codedHeight be the width and height of resource.
Let baseRotation and baseFlip describe the rotation and flip of image relative to resource.
Let defaultDisplayWidth and defaultDisplayHeight be the natural width and natural height of image.
Run the Initialize Frame With Resource algorithm with init, frame, resource, codedWidth, codedHeight, baseRotation, baseFlip, defaultDisplayWidth, and defaultDisplayHeight.
If image’s networkState
attribute is NETWORK_EMPTY
, then throw an InvalidStateError
DOMException
.
Let currentPlaybackFrame be the VideoFrame
at the current playback position.
If metadata
does not exist in init, assign currentPlaybackFrame.[[metadata]]
to it.
Run the Initialize Frame From Other Frame algorithm with init, frame, and currentPlaybackFrame.
Let resource be a new media resource containing a copy of image’s bitmap data.
NOTE: Implementers are encouraged to avoid a deep copy by using reference counting where feasible.
Let width be image.width
and height be image.height
.
Run the Initialize Frame With Resource algorithm with init, frame, resource, width, height, 0
, false
, width, and height.
Run the Initialize Frame From Other Frame algorithm with init, frame, and image.
Return frame.
VideoFrame(data, init)
If init is not a valid VideoFrameBufferInit, throw a TypeError
.
Let defaultRect be «[ "x" → 0
, "y" → 0
, "width" → init.codedWidth
, "height" → init.codedHeight
]».
Let overrideRect be undefined
.
If init.visibleRect
exists, assign its value to overrideRect.
Let parsedRect be the result of running the Parse Visible Rect algorithm with defaultRect, overrideRect, init.codedWidth
, init.codedHeight
, and init.format
.
If parsedRect is an exception, return parsedRect.
Let optLayout be undefined
.
If init.layout
exists, assign its value to optLayout.
Let combinedLayout be the result of running the Compute Layout and Allocation Size algorithm with parsedRect, init.format
, and optLayout.
If combinedLayout is an exception, throw combinedLayout.
If data.byteLength
is less than combinedLayout’s allocationSize, throw a TypeError
.
If init.transfer
contains more than one reference to the same ArrayBuffer
, then throw a DataCloneError
DOMException
.
For each transferable in init.transfer
:
If [[Detached]]
internal slot is true
, then throw a DataCloneError
DOMException
.
If init.transfer
contains an ArrayBuffer
referenced by data the User Agent MAY choose to:
Let resource be a new media resource referencing pixel data in data.
Otherwise:
Let resource be a new media resource containing a copy of data. Use visibleRect
and layout
to determine where in data the pixels for each plane reside.
The User Agent MAY choose to allocate resource with a larger coded size and plane strides to improve memory alignment. Increases will be reflected by codedWidth
and codedHeight
. Additionally, the User Agent MAY use visibleRect
to copy only the visible rectangle. It MAY also reposition the visible rectangle within resource. The final position will be reflected by visibleRect
.
For each transferable in init.transfer
:
Perform DetachArrayBuffer on transferable.
Let resourceCodedWidth be the coded width of resource.
Let resourceCodedHeight be the coded height of resource.
Let resourceVisibleLeft be the left offset for the visible rectangle of resource.
Let resourceVisibleTop be the top offset for the visible rectangle of resource.
The spec SHOULD provide definitions (and possibly diagrams) for coded size, visible rectangle, and display size. See #166.
Let frame be a new VideoFrame
object initialized as follows:
Assign resourceCodedWidth, resourceCodedHeight, resourceVisibleLeft, and resourceVisibleTop to [[coded width]]
, [[coded height]]
, [[visible left]]
, and [[visible top]]
respectively.
If init.visibleRect
exists:
Let truncatedVisibleWidth be the value of visibleRect
.width
after truncating.
Assign truncatedVisibleWidth to [[visible width]]
.
Let truncatedVisibleHeight be the value of visibleRect
.height
after truncating.
Assign truncatedVisibleHeight to [[visible height]]
.
Otherwise:
Assign [[coded width]]
to [[visible width]]
.
Assign [[coded height]]
to [[visible height]]
.
Assign the result of running the Parse Rotation algorithm, with init.rotation
, to [[rotation]]
.
If displayWidth
and displayHeight
exist in init, assign them to [[display width]]
and [[display height]]
respectively.
Otherwise:
If [[rotation]]
is equal to 0
or 180
:
Assign [[visible width]]
to [[display width]]
.
Assign [[visible height]]
to [[display height]]
.
Otherwise:
Assign [[visible height]]
to [[display width]]
.
Assign [[visible width]]
to [[display height]]
.
Assign init’s timestamp
and duration
to [[timestamp]]
and [[duration]]
respectively.
Let colorSpace be undefined
.
If init.colorSpace
exists, assign its value to colorSpace.
Assign init’s format
to [[format]]
.
Assign the result of running the Pick Color Space algorithm, with colorSpace and [[format]]
, to [[color space]]
.
Assign the result of calling Copy VideoFrame metadata with init’s metadata
to frame.[[metadata]]
.
Return frame.
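The default display size steps above can be sketched as a small pure function. This is a minimal illustration only: defaultDisplaySize is a hypothetical helper name, not part of the API, and it assumes rotation has already been normalized to 0, 90, 180, or 270 by the Parse Rotation algorithm.

```typescript
// Hypothetical helper (not part of the API): mirrors the fallback used when
// init supplies no displayWidth/displayHeight.
function defaultDisplaySize(
  visibleWidth: number,
  visibleHeight: number,
  rotation: number, // assumed already normalized to 0, 90, 180, or 270
): { displayWidth: number; displayHeight: number } {
  if (rotation === 0 || rotation === 180) {
    // Upright or upside down: display size matches the visible size.
    return { displayWidth: visibleWidth, displayHeight: visibleHeight };
  }
  // 90 or 270: the frame is rendered on its side, so the axes swap.
  return { displayWidth: visibleHeight, displayHeight: visibleWidth };
}
```

For a 640x480 visible rectangle, a rotation of 90 would yield a 480x640 display size under this sketch.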
format
, of type VideoPixelFormat, readonly, nullable
Describes the arrangement of bytes in each plane as well as the number and order of the planes. Will be null
whenever the underlying format does not map to a VideoPixelFormat
or when [[Detached]]
is true
.
The format
getter steps are to return [[format]]
.
codedWidth
, of type unsigned long, readonly
Width of the VideoFrame
in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
The codedWidth
getter steps are to return [[coded width]]
.
codedHeight
, of type unsigned long, readonly
Height of the VideoFrame
in pixels, potentially including non-visible padding, and prior to considering potential ratio adjustments.
The codedHeight
getter steps are to return [[coded height]]
.
codedRect
, of type DOMRectReadOnly, readonly, nullable
A DOMRectReadOnly
with width
and height
matching codedWidth
and codedHeight
and x
and y
at (0,0)
. Offered for convenience for use with allocationSize()
and copyTo()
.
The codedRect
getter steps are:
If [[Detached]]
is true
, return null
.
Let rect be a new DOMRectReadOnly
, initialized as follows:
Assign [[coded width]]
and [[coded height]]
to width
and height
respectively.
Return rect.
visibleRect
, of type DOMRectReadOnly, readonly, nullable
A DOMRectReadOnly
describing the visible rectangle of pixels for this VideoFrame
.
The visibleRect
getter steps are:
If [[Detached]]
is true
, return null
.
Let rect be a new DOMRectReadOnly
, initialized as follows:
Assign [[visible left]]
, [[visible top]]
, [[visible width]]
, and [[visible height]]
to x
, y
, width
, and height
respectively.
Return rect.
rotation
, of type double, readonly
The rotation applied to the VideoFrame when rendered, in degrees clockwise. Rotation applies before flip.
The rotation
getter steps are to return [[rotation]]
.
flip
, of type boolean, readonly
Whether a horizontal flip is applied to the VideoFrame
when rendered. Flip applies after rotation.
displayWidth
, of type unsigned long, readonly
Width of the VideoFrame when displayed after applying rotation and aspect ratio adjustments.
The displayWidth
getter steps are to return [[display width]]
.
displayHeight
, of type unsigned long, readonly
Height of the VideoFrame when displayed after applying rotation and aspect ratio adjustments.
The displayHeight
getter steps are to return [[display height]]
.
timestamp
, of type long long, readonly
The presentation timestamp, given in microseconds. For decode, timestamp is copied from the EncodedVideoChunk
corresponding to this VideoFrame
. For encode, timestamp is copied to the EncodedVideoChunk
s corresponding to this VideoFrame
.
The timestamp
getter steps are to return [[timestamp]]
.
duration
, of type unsigned long long, readonly, nullable
The presentation duration, given in microseconds. The duration is copied from the EncodedVideoChunk
corresponding to this VideoFrame.
The duration
getter steps are to return [[duration]]
.
colorSpace
, of type VideoColorSpace, readonly
The VideoColorSpace
associated with this frame.
The colorSpace
getter steps are to return [[color space]]
.
A combined buffer layout is a struct that consists of:
An allocationSize (an unsigned long
)
A computedLayouts (a list of computed plane layout structs).
A computed plane layout is a struct that consists of:
A destinationOffset (an unsigned long
)
A destinationStride (an unsigned long
)
A sourceTop (an unsigned long
)
A sourceHeight (an unsigned long
)
A sourceLeftBytes (an unsigned long
)
A sourceWidthBytes (an unsigned long
)
allocationSize(options)
Returns the minimum byte length for a valid destination BufferSource
to be used with copyTo()
with the given options.
When invoked, run these steps:
If [[Detached]]
is true
, throw an InvalidStateError
DOMException
.
If [[format]]
is null
, throw a NotSupportedError
DOMException
.
Let combinedLayout be the result of running the Parse VideoFrameCopyToOptions algorithm with options.
If combinedLayout is an exception, throw combinedLayout.
Return combinedLayout’s allocationSize.
copyTo(destination, options)
Asynchronously copies the planes of this frame into destination according to options. The format of the data is options.format
if it exists, or this VideoFrame
’s format
otherwise.
NOTE: Promises that are returned by several calls to copyTo()
are not guaranteed to resolve in the order they were returned.
When invoked, run these steps:
If [[Detached]]
is true
, return a promise rejected with an InvalidStateError
DOMException
.
If [[format]]
is null
, return a promise rejected with a NotSupportedError
DOMException
.
Let combinedLayout be the result of running the Parse VideoFrameCopyToOptions algorithm with options.
If combinedLayout is an exception, return a promise rejected with combinedLayout.
If destination.byteLength
is less than combinedLayout’s allocationSize, return a promise rejected with a TypeError
.
If options.format
is equal to one of RGBA
, RGBX
, BGRA
, BGRX
then:
Let newOptions be the result of running the Clone Configuration algorithm with options.
Assign undefined
to newOptions.format
.
Let rgbFrame be the result of running the Convert to RGB frame algorithm with this, options.format
, and options.colorSpace
.
Return the result of calling copyTo()
on rgbFrame with destination and newOptions.
Let p be a new Promise
.
Let copyStepsQueue be the result of starting a new parallel queue.
Let planeLayouts be a new list.
Enqueue the following steps to copyStepsQueue:
Let resource be the media resource referenced by [[resource reference]]
.
Let numPlanes be the number of planes as defined by [[format]]
.
Let planeIndex be 0
.
While planeIndex is less than numPlanes:
Let sourceStride be the stride of the plane in resource as identified by planeIndex.
Let computedLayout be the computed plane layout in combinedLayout’s computedLayouts at the position of planeIndex.
Let sourceOffset be the product of multiplying computedLayout’s sourceTop by sourceStride.
Add computedLayout’s sourceLeftBytes to sourceOffset.
Let destinationOffset be computedLayout’s destinationOffset.
Let rowBytes be computedLayout’s sourceWidthBytes.
Let layout be a new PlaneLayout
, with offset
set to destinationOffset and stride
set to rowBytes.
Let row be 0
.
While row is less than computedLayout’s sourceHeight:
Copy rowBytes bytes from resource starting at sourceOffset to destination starting at destinationOffset.
Increment sourceOffset by sourceStride.
Increment destinationOffset by computedLayout’s destinationStride.
Increment row by 1
.
Increment planeIndex by 1
.
Append layout to planeLayouts.
Queue a task to resolve p with planeLayouts.
Return p.
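The per-plane copy loop in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the User Agent's implementation: copyPlane is a hypothetical name, the media resource plane is modeled as a plain Uint8Array, and the copy runs synchronously rather than on a parallel queue.

```typescript
interface ComputedPlaneLayout {
  destinationOffset: number;
  destinationStride: number;
  sourceTop: number;
  sourceHeight: number;
  sourceLeftBytes: number;
  sourceWidthBytes: number;
}

interface PlaneLayout {
  offset: number;
  stride: number;
}

// Hypothetical helper: copies one plane row by row and returns the
// PlaneLayout that would be appended to planeLayouts.
function copyPlane(
  source: Uint8Array, // assumed stand-in for one plane of the media resource
  sourceStride: number,
  destination: Uint8Array,
  computed: ComputedPlaneLayout,
): PlaneLayout {
  let sourceOffset = computed.sourceTop * sourceStride + computed.sourceLeftBytes;
  let destinationOffset = computed.destinationOffset;
  const rowBytes = computed.sourceWidthBytes;
  // As in the steps above, stride is set to rowBytes.
  const layout: PlaneLayout = { offset: destinationOffset, stride: rowBytes };
  for (let row = 0; row < computed.sourceHeight; row++) {
    destination.set(source.subarray(sourceOffset, sourceOffset + rowBytes), destinationOffset);
    sourceOffset += sourceStride;
    destinationOffset += computed.destinationStride;
  }
  return layout;
}
```

Because the source and destination advance by different strides, the same loop handles both cropping (sourceStride larger than rowBytes) and padded destinations (destinationStride larger than rowBytes).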
clone()
Creates a new VideoFrame
with a reference to the same media resource.
When invoked, run these steps:
If the value of frame’s [[Detached]]
internal slot is true
, throw an InvalidStateError
DOMException
.
Return the result of running the Clone VideoFrame algorithm with this.
close()
Clears all state and releases the reference to the media resource. Close is final.
When invoked, run the Close VideoFrame algorithm with this.
metadata()
Gets the VideoFrameMetadata
associated with this frame.
When invoked, run these steps:
If [[Detached]]
is true
, throw an InvalidStateError
DOMException
.
Return the result of calling Copy VideoFrame metadata with [[metadata]]
.
Let frame be a new VideoFrame
, constructed as follows:
Assign false
to [[Detached]]
.
Let resource be the media resource described by output.
Let resourceReference be a reference to resource.
Assign resourceReference to [[resource reference]]
.
If output uses a recognized VideoPixelFormat
, assign that format to [[format]]
. Otherwise, assign null
to [[format]]
.
Let codedWidth and codedHeight be the coded width and height of the output in pixels.
Let visibleLeft, visibleTop, visibleWidth, and visibleHeight be the left, top, width and height for the visible rectangle of output.
Let displayWidth and displayHeight be the display size of output in pixels.
If displayAspectWidth and displayAspectHeight are provided, increase displayWidth or displayHeight until the ratio of displayWidth to displayHeight matches the ratio of displayAspectWidth to displayAspectHeight.
Assign codedWidth, codedHeight, visibleLeft, visibleTop, visibleWidth, visibleHeight, displayWidth, and displayHeight to [[coded width]]
, [[coded height]]
, [[visible left]]
, [[visible top]]
, [[visible width]]
, [[visible height]]
, [[display width]]
, and [[display height]]
respectively.
Assign duration and timestamp to [[duration]]
and [[timestamp]]
respectively.
Assign [[color space]]
with the result of running the Pick Color Space algorithm, with colorSpace and [[format]]
.
Return frame.
If overrideColorSpace is provided, return a new VideoColorSpace
constructed with overrideColorSpace.
User Agents MAY replace null
members of the provided overrideColorSpace with guessed values as determined by implementer defined heuristics.
Otherwise, if [[format]]
is an RGB format, return a new instance of the sRGB Color Space.
Otherwise, return a new instance of the REC709 Color Space.
If visibleRect
exists:
Let validAlignment be the result of running the Verify Rect Offset Alignment algorithm with format and visibleRect.
If validAlignment is false
, return false
.
If any attribute of visibleRect
is negative or not finite, return false
.
If visibleRect
.width
== 0
or visibleRect
.height
== 0
return false
.
If visibleRect
.y
+ visibleRect
.height
> codedHeight, return false
.
If visibleRect
.x
+ visibleRect
.width
> codedWidth, return false
.
If codedWidth = 0 or codedHeight = 0, return false
.
If only one of displayWidth
or displayHeight
exists, return false
.
If displayWidth
== 0
or displayHeight
== 0
, return false
.
Return true
.
To check if a VideoFrameBufferInit
is a valid VideoFrameBufferInit, run these steps:
If codedWidth
= 0 or codedHeight
= 0, return false
.
If any attribute of visibleRect
is negative or not finite, return false
.
If visibleRect
.y
+ visibleRect
.height
> codedHeight
, return false
.
If visibleRect
.x
+ visibleRect
.width
> codedWidth
, return false
.
If only one of displayWidth
or displayHeight
exists, return false
.
If displayWidth
= 0 or displayHeight
= 0, return false
.
Return true
.
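The validity checks above can be sketched as a predicate over a plain object. This is a minimal sketch: BufferInitLike and isValidBufferInit are hypothetical names covering only the members the steps mention, and the visibleRect checks are skipped when visibleRect is absent, which is an assumption about how the steps apply to a missing member.

```typescript
interface RectLike {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical subset of VideoFrameBufferInit used by the checks.
interface BufferInitLike {
  codedWidth: number;
  codedHeight: number;
  visibleRect?: RectLike;
  displayWidth?: number;
  displayHeight?: number;
}

function isValidBufferInit(init: BufferInitLike): boolean {
  if (init.codedWidth === 0 || init.codedHeight === 0) return false;
  const rect = init.visibleRect;
  if (rect !== undefined) {
    // Every attribute must be finite and non-negative.
    const values = [rect.x, rect.y, rect.width, rect.height];
    if (values.some((v) => !isFinite(v) || v < 0)) return false;
    // The visible rectangle must lie within the coded size.
    if (rect.y + rect.height > init.codedHeight) return false;
    if (rect.x + rect.width > init.codedWidth) return false;
  }
  // displayWidth and displayHeight must be given together.
  if ((init.displayWidth === undefined) !== (init.displayHeight === undefined)) return false;
  if (init.displayWidth === 0 || init.displayHeight === 0) return false;
  return true;
}
```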
Let format be otherFrame.format
.
If init.alpha
is discard
, assign otherFrame.format
’s equivalent opaque format to format.
Let validInit be the result of running the Validate VideoFrameInit algorithm with format and otherFrame’s [[coded width]]
and [[coded height]]
.
If validInit is false
, throw a TypeError
.
Let resource be the media resource referenced by otherFrame’s [[resource reference]]
.
Assign a new reference for resource to frame’s [[resource reference]]
.
Assign the following attributes from otherFrame to frame: codedWidth
, codedHeight
, colorSpace
.
Let defaultVisibleRect be the result of performing the getter steps for visibleRect
on otherFrame.
Let baseRotation and baseFlip be otherFrame’s [[rotation]]
and [[flip]]
, respectively.
Let defaultDisplayWidth and defaultDisplayHeight be otherFrame’s [[display width]]
and [[display height]]
, respectively.
Run the Initialize Visible Rect, Orientation, and Display Size algorithm with init, frame, defaultVisibleRect, baseRotation, baseFlip, defaultDisplayWidth, and defaultDisplayHeight.
If duration
exists in init, assign it to frame’s [[duration]]
. Otherwise, assign otherFrame.duration
to frame’s [[duration]]
.
If timestamp
exists in init, assign it to frame’s [[timestamp]]
. Otherwise, assign otherFrame’s timestamp
to frame’s [[timestamp]]
.
Assign format to frame.[[format]]
.
Assign the result of calling Copy VideoFrame metadata with init’s metadata
to frame.[[metadata]]
.
Let format be null
.
If resource uses a recognized VideoPixelFormat
, assign the VideoPixelFormat
of resource to format.
Let validInit be the result of running the Validate VideoFrameInit algorithm with format, width and height.
If validInit is false
, throw a TypeError
.
Assign a new reference for resource to frame’s [[resource reference]]
.
If init.alpha
is discard
, assign format’s equivalent opaque format to format.
Assign format to [[format]].
Assign codedWidth and codedHeight to frame’s [[coded width]]
and [[coded height]]
respectively.
Let defaultVisibleRect be a new DOMRect
constructed with «[ "x" → 0
, "y" → 0
, "width" → codedWidth, "height" → codedHeight ]»
Run the Initialize Visible Rect, Orientation, and Display Size algorithm with init, frame, defaultVisibleRect, baseRotation, baseFlip, defaultDisplayWidth, and defaultDisplayHeight.
Assign init
.duration
to frame’s [[duration]]
.
Assign init
.timestamp
to frame’s [[timestamp]]
.
If resource has a known VideoColorSpace
, assign its value to [[color space]]
.
Otherwise, assign a new VideoColorSpace
, constructed with an empty VideoColorSpaceInit
, to [[color space]]
.
Let visibleRect be defaultVisibleRect.
If init.visibleRect
exists, assign it to visibleRect.
Assign visibleRect’s x
, y
, width
, and height
, to frame’s [[visible left]]
, [[visible top]]
, [[visible width]]
, and [[visible height]]
respectively.
Let rotation be the result of running the Parse Rotation algorithm, with init.rotation
.
Assign the result of running the Add Rotations algorithm, with baseRotation, baseFlip, and rotation, to frame’s [[rotation]]
.
If baseFlip is equal to init.flip
, assign false
to frame’s [[flip]]
. Otherwise, assign true
to frame’s [[flip]]
.
If displayWidth
and displayHeight
exist in init, assign them to [[display width]]
and [[display height]]
respectively.
Otherwise:
If baseRotation is equal to 0
or 180
:
Otherwise:
Let displayWidth be frame’s [[visible width]] * widthScale
, rounded to the nearest integer.
Let displayHeight be frame’s [[visible height]] * heightScale
, rounded to the nearest integer.
If rotation is equal to 0
or 180
:
Assign displayWidth to frame’s [[display width]]
.
Assign displayHeight to frame’s [[display height]]
.
Otherwise:
Assign displayHeight to frame’s [[display width]]
.
Assign displayWidth to frame’s [[display height]]
.
Let clone be a new VideoFrame
initialized as follows:
Let resource be the media resource referenced by frame’s [[resource reference]]
.
Let newReference be a new reference to resource.
Assign newReference to clone’s [[resource reference]]
.
Assign all remaining internal slots of frame (excluding [[resource reference]]
) to those of the same name in clone.
Return clone.
Assign null
to frame’s [[resource reference]]
.
Assign true
to frame’s [[Detached]]
.
Assign null
to frame’s [[format]]
.
Assign 0
to frame’s [[coded width]]
, [[coded height]]
, [[visible left]]
, [[visible top]]
, [[visible width]]
, [[visible height]]
, [[rotation]]
, [[display width]]
, and [[display height]]
.
Assign false
to frame’s [[flip]]
.
Assign a new VideoFrameMetadata
to frame.[[metadata]]
.
Let alignedRotation be the nearest multiple of 90
to rotation, rounding ties towards positive infinity.
Let fullTurns be the greatest multiple of 360
less than or equal to alignedRotation.
Return alignedRotation - fullTurns
.
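The rotation normalization above can be sketched in a few lines. parseRotation is a hypothetical function name; Math.floor(x + 0.5) is used here to round ties towards positive infinity, as the steps require.

```typescript
// Hypothetical helper: snaps an arbitrary rotation in degrees to one of
// 0, 90, 180, or 270.
function parseRotation(rotation: number): number {
  // Nearest multiple of 90, with ties rounded towards positive infinity.
  const alignedRotation = Math.floor(rotation / 90 + 0.5) * 90;
  // Greatest multiple of 360 less than or equal to alignedRotation.
  const fullTurns = Math.floor(alignedRotation / 360) * 360;
  return alignedRotation - fullTurns;
}
```

Under this sketch, 45 maps to 90, -45 maps to 0 (the tie rounds up), -90 maps to 270, and 450 maps to 90.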
If baseFlip is false
, let combinedRotation be baseRotation + rotation
. Otherwise, let combinedRotation be baseRotation - rotation
.
Let fullTurns be the greatest multiple of 360
less than or equal to combinedRotation.
Return combinedRotation - fullTurns
.
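A minimal sketch of combining a base orientation with an additional rotation, following the steps above. addRotations is a hypothetical function name, and both rotations are assumed to be multiples of 90 already.

```typescript
// Hypothetical helper: when the base frame is flipped, the additional
// rotation acts in the opposite direction, so it is subtracted.
function addRotations(baseRotation: number, baseFlip: boolean, rotation: number): number {
  const combinedRotation = baseFlip ? baseRotation - rotation : baseRotation + rotation;
  // Wrap the result into the range [0, 360).
  const fullTurns = Math.floor(combinedRotation / 360) * 360;
  return combinedRotation - fullTurns;
}
```

For example, adding 270 to an unflipped base rotation of 90 wraps to 0, while adding 90 to a flipped base rotation of 0 gives 270.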
Let defaultRect be the result of performing the getter steps for visibleRect
.
Let overrideRect be undefined
.
If options.rect
exists, assign the value of options.rect
to overrideRect.
Let parsedRect be the result of running the Parse Visible Rect algorithm with defaultRect, overrideRect, [[coded width]]
, [[coded height]]
, and [[format]]
.
If parsedRect is an exception, return parsedRect.
Let optLayout be undefined
.
If options.layout
exists, assign its value to optLayout.
Let format be undefined
.
If options.format
does not exist, assign [[format]]
to format.
Otherwise, if options.format
is equal to one of RGBA
, RGBX
, BGRA
, BGRX
, then assign options.format
to format; otherwise, return a NotSupportedError
DOMException
.
Let combinedLayout be the result of running the Compute Layout and Allocation Size algorithm with parsedRect, format, and optLayout.
Return combinedLayout.
If format is null
, return true
.
Let planeIndex be 0
.
Let numPlanes be the number of planes as defined by format.
While planeIndex is less than numPlanes:
Let plane be the Plane identified by planeIndex as defined by format.
Let sampleWidth be the horizontal sub-sampling factor of each subsample for plane.
Let sampleHeight be the vertical sub-sampling factor of each subsample for plane.
If rect.x
is not a multiple of sampleWidth, return false
.
If rect.y
is not a multiple of sampleHeight, return false
.
Increment planeIndex by 1
.
Return true
.
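The alignment check above can be sketched with a small per-format table of sub-sampling factors. The table and the names PLANE_SUBSAMPLING and verifyRectOffsetAlignment are assumptions made for illustration, and only a handful of formats are listed.

```typescript
interface Subsampling {
  sampleWidth: number;  // horizontal sub-sampling factor
  sampleHeight: number; // vertical sub-sampling factor
}

// Assumed table, one entry per plane, for a few formats only.
const PLANE_SUBSAMPLING = new Map<string, Subsampling[]>([
  ["I420", [{ sampleWidth: 1, sampleHeight: 1 }, { sampleWidth: 2, sampleHeight: 2 }, { sampleWidth: 2, sampleHeight: 2 }]],
  ["I422", [{ sampleWidth: 1, sampleHeight: 1 }, { sampleWidth: 2, sampleHeight: 1 }, { sampleWidth: 2, sampleHeight: 1 }]],
  ["I444", [{ sampleWidth: 1, sampleHeight: 1 }, { sampleWidth: 1, sampleHeight: 1 }, { sampleWidth: 1, sampleHeight: 1 }]],
  ["NV12", [{ sampleWidth: 1, sampleHeight: 1 }, { sampleWidth: 2, sampleHeight: 2 }]],
  ["RGBA", [{ sampleWidth: 1, sampleHeight: 1 }]],
]);

function verifyRectOffsetAlignment(format: string | null, rect: { x: number; y: number }): boolean {
  // An unrecognized (null) format imposes no alignment requirement.
  if (format === null) return true;
  const planes = PLANE_SUBSAMPLING.get(format);
  if (planes === undefined) throw new Error("format not covered by this sketch: " + format);
  // The offset must be sample-aligned in every plane.
  return planes.every(
    (plane) => rect.x % plane.sampleWidth === 0 && rect.y % plane.sampleHeight === 0,
  );
}
```

With this table, an I420 rectangle starting at (2, 2) is aligned, whereas one starting at (1, 0) is not; I422 only constrains the horizontal offset.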
Let sourceRect be defaultRect.
If overrideRect is not undefined
:
If either of overrideRect.width
or height
is 0
, return a TypeError
.
If the sum of overrideRect.x
and overrideRect.width
is greater than codedWidth, return a TypeError
.
If the sum of overrideRect.y
and overrideRect.height
is greater than codedHeight, return a TypeError
.
Assign overrideRect to sourceRect.
Let validAlignment be the result of running the Verify Rect Offset Alignment algorithm with format and sourceRect.
If validAlignment is false
, return a TypeError
.
Return sourceRect.
Let numPlanes be the number of planes as defined by format.
If layout is not undefined
and its length does not equal numPlanes, return a TypeError
.
Let minAllocationSize be 0
.
Let computedLayouts be a new list.
Let endOffsets be a new list.
Let planeIndex be 0
.
While planeIndex < numPlanes:
Let plane be the Plane identified by planeIndex as defined by format.
Let sampleBytes be the number of bytes per sample for plane.
Let sampleWidth be the horizontal sub-sampling factor of each subsample for plane.
Let sampleHeight be the vertical sub-sampling factor of each subsample for plane.
Let computedLayout be a new computed plane layout.
Set computedLayout’s sourceTop to the result of the division of truncated parsedRect.y
by sampleHeight, rounded up to the nearest integer.
Set computedLayout’s sourceHeight to the result of the division of truncated parsedRect.height
by sampleHeight, rounded up to the nearest integer.
Set computedLayout’s sourceLeftBytes to the result of the integer division of truncated parsedRect.x
by sampleWidth, multiplied by sampleBytes.
Set computedLayout’s sourceWidthBytes to the result of the integer division of truncated parsedRect.width
by sampleWidth, multiplied by sampleBytes.
If layout is not undefined
:
Let planeLayout be the PlaneLayout
in layout at position planeIndex.
If planeLayout.stride
is less than computedLayout’s sourceWidthBytes, return a TypeError
.
Assign planeLayout.offset
to computedLayout’s destinationOffset.
Assign planeLayout.stride
to computedLayout’s destinationStride.
Otherwise:
NOTE: If an explicit layout was not provided, the following steps default to tight packing.
Assign minAllocationSize to computedLayout’s destinationOffset.
Assign computedLayout’s sourceWidthBytes to computedLayout’s destinationStride.
Let planeSize be the product of multiplying computedLayout’s destinationStride and sourceHeight.
Let planeEnd be the sum of planeSize and computedLayout’s destinationOffset.
If planeSize or planeEnd is greater than the maximum value of unsigned long
, return a TypeError
.
Append planeEnd to endOffsets.
Assign the maximum of minAllocationSize and planeEnd to minAllocationSize.
NOTE: The above step uses a maximum to allow for the possibility that user specified plane offsets reorder planes.
Let earlierPlaneIndex be 0
.
While earlierPlaneIndex is less than planeIndex:
Let earlierLayout be computedLayouts[earlierPlaneIndex]
.
If endOffsets[planeIndex]
is less than or equal to earlierLayout’s destinationOffset or if endOffsets[earlierPlaneIndex]
is less than or equal to computedLayout’s destinationOffset, continue.
NOTE: If plane A ends before plane B starts, they do not overlap.
Otherwise, return a TypeError
.
Increment earlierPlaneIndex by 1
.
Append computedLayout to computedLayouts.
Increment planeIndex by 1
.
Let combinedLayout be a new combined buffer layout, initialized as follows:
Assign computedLayouts to computedLayouts.
Assign minAllocationSize to allocationSize.
Return combinedLayout.
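The layout computation above can be sketched as follows. This is an illustration under stated assumptions: the function takes a per-plane table (planes) instead of a format, I420_PLANES is an assumed table for I420, all names are hypothetical, and error conditions are thrown rather than returned as exception values.

```typescript
interface PlaneInfo { sampleBytes: number; sampleWidth: number; sampleHeight: number }
interface PlaneLayout { offset: number; stride: number }
interface ComputedPlaneLayout {
  destinationOffset: number; destinationStride: number;
  sourceTop: number; sourceHeight: number;
  sourceLeftBytes: number; sourceWidthBytes: number;
}
interface CombinedBufferLayout { allocationSize: number; computedLayouts: ComputedPlaneLayout[] }

const UNSIGNED_LONG_MAX = 0xffffffff;

// Assumed plane table for I420: full-resolution Y, half-resolution U and V.
const I420_PLANES: PlaneInfo[] = [
  { sampleBytes: 1, sampleWidth: 1, sampleHeight: 1 },
  { sampleBytes: 1, sampleWidth: 2, sampleHeight: 2 },
  { sampleBytes: 1, sampleWidth: 2, sampleHeight: 2 },
];

function computeLayoutAndAllocationSize(
  parsedRect: { x: number; y: number; width: number; height: number },
  planes: PlaneInfo[],
  layout?: PlaneLayout[],
): CombinedBufferLayout {
  if (layout !== undefined && layout.length !== planes.length) {
    throw new TypeError("layout must have one entry per plane");
  }
  let minAllocationSize = 0;
  const computedLayouts: ComputedPlaneLayout[] = [];
  const occupied: Array<{ start: number; end: number }> = [];
  planes.forEach((plane, planeIndex) => {
    const sourceWidthBytes =
      Math.floor(Math.trunc(parsedRect.width) / plane.sampleWidth) * plane.sampleBytes;
    const explicit = layout === undefined ? undefined : layout[planeIndex];
    if (explicit !== undefined && explicit.stride < sourceWidthBytes) {
      throw new TypeError("stride is smaller than one row of the plane");
    }
    const computed: ComputedPlaneLayout = {
      sourceTop: Math.ceil(Math.trunc(parsedRect.y) / plane.sampleHeight),
      sourceHeight: Math.ceil(Math.trunc(parsedRect.height) / plane.sampleHeight),
      sourceLeftBytes: Math.floor(Math.trunc(parsedRect.x) / plane.sampleWidth) * plane.sampleBytes,
      sourceWidthBytes,
      // Without an explicit layout, planes are tightly packed back to back.
      destinationOffset: explicit === undefined ? minAllocationSize : explicit.offset,
      destinationStride: explicit === undefined ? sourceWidthBytes : explicit.stride,
    };
    const planeSize = computed.destinationStride * computed.sourceHeight;
    const planeEnd = planeSize + computed.destinationOffset;
    if (planeSize > UNSIGNED_LONG_MAX || planeEnd > UNSIGNED_LONG_MAX) {
      throw new TypeError("plane does not fit in an unsigned long range");
    }
    // Two planes are disjoint only if one ends before the other starts.
    for (const earlier of occupied) {
      if (!(planeEnd <= earlier.start || earlier.end <= computed.destinationOffset)) {
        throw new TypeError("planes overlap");
      }
    }
    occupied.push({ start: computed.destinationOffset, end: planeEnd });
    // A maximum is used because explicit offsets may reorder planes.
    minAllocationSize = Math.max(minAllocationSize, planeEnd);
    computedLayouts.push(computed);
  });
  return { allocationSize: minAllocationSize, computedLayouts };
}
```

For a tightly packed 1280x720 I420 frame this yields plane offsets 0, 921600, and 1152000, and an allocation size of 1382400 bytes.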
Assert: colorSpace is equal to one of srgb
or display-p3
.
If colorSpace is equal to srgb
, return a new instance of the sRGB Color Space.
If colorSpace is equal to display-p3
, return a new instance of the Display P3 Color Space.
This algorithm MUST be called only if format is equal to one of RGBA
, RGBX
, BGRA
, BGRX
.
Let convertedFrame be a new VideoFrame
, constructed as follows:
Assign false
to [[Detached]]
.
Assign format to [[format]]
.
Let width be frame’s [[visible width]]
.
Let height be frame’s [[visible height]]
.
Assign width, height, 0, 0, width, height, width, and height to [[coded width]]
, [[coded height]]
, [[visible left]]
, [[visible top]]
, [[visible width]]
, [[visible height]]
, [[display width]]
, and [[display height]]
respectively.
Assign frame’s [[duration]]
and frame’s [[timestamp]]
to [[duration]]
and [[timestamp]]
respectively.
Assign the result of running the Convert PredefinedColorSpace to VideoColorSpace algorithm with colorSpace to [[color space]]
.
Let resource be a new media resource containing the result of conversion of media resource referenced by frame’s [[resource reference]]
into a color space and pixel format specified by [[color space]]
and [[format]]
respectively.
Assign the reference to resource to [[resource reference]].
Return convertedFrame.
Let metadataCopySerialized be StructuredSerialize(metadata).
Let metadataCopy be StructuredDeserialize(metadataCopySerialized, the current Realm).
Return metadataCopy.
The goal of this algorithm is to ensure that metadata owned by a VideoFrame
is immutable.
VideoFrame
transfer steps (with value and dataHolder) are:
If value’s [[Detached]]
is true
, throw a DataCloneError
DOMException
.
For all VideoFrame
internal slots in value, assign the value of each internal slot to a field in dataHolder with the same name as the internal slot.
Run the Close VideoFrame algorithm with value.
VideoFrame
transfer-receiving steps (with dataHolder and value) are:
For all named fields in dataHolder, assign the value of each named field to the VideoFrame
internal slot in value with the same name as the named field.
VideoFrame
serialization steps (with value, serialized, and forStorage) are:
If value’s [[Detached]]
is true
, throw a DataCloneError
DOMException
.
If forStorage is true
, throw a DataCloneError
.
Let resource be the media resource referenced by value’s [[resource reference]]
.
Let newReference be a new reference to resource.
Assign newReference to serialized.resource reference.
For all remaining VideoFrame
internal slots (excluding [[resource reference]]
) in value, assign the value of each internal slot to a field in serialized with the same name as the internal slot.
VideoFrame
deserialization steps (with serialized and value) are:
For all named fields in serialized, assign the value of each named field to the VideoFrame
internal slot in value with the same name as the named field.
When rendered, for example by CanvasDrawImage
drawImage()
, a VideoFrame
MUST be converted to a color space compatible with the rendering target, unless color conversion is explicitly disabled.
Color space conversion during ImageBitmap
construction is controlled by ImageBitmapOptions
colorSpaceConversion
. Setting this value to "none"
disables color space conversion.
The rendering of a VideoFrame
is produced from the media resource by applying any necessary color space conversion, cropping to the visibleRect
, rotating clockwise by rotation
degrees, and flipping horizontally if flip
is true
.
dictionary VideoFrameCopyToOptions
{
DOMRectInit rect;
sequence<PlaneLayout> layout;
VideoPixelFormat format;
PredefinedColorSpace colorSpace;
};
rect
, of type DOMRectInit
A DOMRectInit
describing the rectangle of pixels to copy from the VideoFrame
. If unspecified, the visibleRect
will be used.
NOTE: The coded rectangle can be specified by passing VideoFrame
’s codedRect
.
NOTE: The default rect
does not necessarily meet the sample-alignment requirement and can result in copyTo()
or allocationSize()
rejecting.
layout
, of type sequence<PlaneLayout>
The PlaneLayout
for each plane in VideoFrame
, affording the option to specify an offset and stride for each plane in the destination BufferSource
. If unspecified, the planes will be tightly packed. It is invalid to specify planes that overlap.
format
, of type VideoPixelFormat
A VideoPixelFormat
for the pixel data in the destination BufferSource
. Potential values are: RGBA
, RGBX
, BGRA
, BGRX
. If it does not exist, the destination BufferSource
will be in the same format as format
.
colorSpace
, of type PredefinedColorSpace
A PredefinedColorSpace
that MUST be used as a target color space for the pixel data in the destination BufferSource
, but only if format
is one of RGBA
, RGBX
, BGRA
, BGRX
, otherwise it is ignored. If it does not exist, srgb
is used.
The VideoFrame
interface uses DOMRect
s to specify the position and dimensions for a rectangle of pixels. DOMRectInit
is used with copyTo()
and allocationSize()
to describe the dimensions of the source rectangle. VideoFrame
defines codedRect
and visibleRect
for convenient copying of the coded size and visible region respectively.
NOTE: VideoFrame pixels are only addressable by integer numbers. All floating point values provided to DOMRectInit
will be truncated.
PlaneLayout
is a dictionary specifying the offset and stride of a VideoFrame
plane once copied to a BufferSource
. A sequence of PlaneLayout
s MAY be provided to VideoFrame
’s copyTo()
to specify how the plane is laid out in the destination BufferSource
. Alternatively, callers can inspect copyTo()
’s returned sequence of PlaneLayout
s to learn the offset and stride for planes as decided by the User Agent.
dictionary PlaneLayout
{
[EnforceRange] required unsigned long offset;
[EnforceRange] required unsigned long stride;
};
offset
, of type unsigned long
The offset in bytes where the given plane begins within a BufferSource
.
stride
, of type unsigned long
The number of bytes, including padding, used by each row of the plane within a BufferSource
.
enum VideoPixelFormat
{
// 4:2:0 Y, U, V
"I420",
"I420P10",
"I420P12",
// 4:2:0 Y, U, V, A
"I420A",
"I420AP10",
"I420AP12",
// 4:2:2 Y, U, V
"I422",
"I422P10",
"I422P12",
// 4:2:2 Y, U, V, A
"I422A",
"I422AP10",
"I422AP12",
// 4:4:4 Y, U, V
"I444",
"I444P10",
"I444P12",
// 4:4:4 Y, U, V, A
"I444A",
"I444AP10",
"I444AP12",
// 4:2:0 Y, UV
"NV12",
// 4:4:4 RGBA
"RGBA",
// 4:4:4 RGBX (opaque)
"RGBX",
// 4:4:4 BGRA
"BGRA",
// 4:4:4 BGRX (opaque)
"BGRX",
};
Sub-sampling is a technique where a single sample contains information for multiple pixels in the final image. Sub-sampling can be horizontal, vertical, or both, and has a factor, which is the number of final pixels in the image that are derived from a sub-sampled sample.
If a
VideoFrame
is in
I420
format, then the very first component of the second plane (the U plane) corresponds to four pixels, namely the pixels in the top-left corner of the image. Consequently, the first component of the second row corresponds to the four pixels below those initial four top-left pixels. The
sub-sampling factor is 2 in both the horizontal and vertical direction.
If a VideoPixelFormat
has an alpha component, the format’s equivalent opaque format is the same VideoPixelFormat
, without an alpha component. If a VideoPixelFormat
does not have an alpha component, it is its own equivalent opaque format.
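The equivalent opaque format can be sketched as a lookup over the VideoPixelFormat names. The mapping below is an assumption derived from the enum's naming (the I4xxA variants drop the A; RGBA and BGRA map to the formats annotated as opaque), and equivalentOpaqueFormat is a hypothetical function name.

```typescript
// Assumed mapping from alpha-carrying formats to their opaque counterparts.
const OPAQUE_EQUIVALENT = new Map<string, string>([
  ["I420A", "I420"], ["I420AP10", "I420P10"], ["I420AP12", "I420P12"],
  ["I422A", "I422"], ["I422AP10", "I422P10"], ["I422AP12", "I422P12"],
  ["I444A", "I444"], ["I444AP10", "I444P10"], ["I444AP12", "I444P12"],
  ["RGBA", "RGBX"], ["BGRA", "BGRX"],
]);

function equivalentOpaqueFormat(format: string): string {
  const opaque = OPAQUE_EQUIVALENT.get(format);
  // A format without an alpha component is its own equivalent opaque format.
  return opaque === undefined ? format : opaque;
}
```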
Integer values are unsigned unless otherwise specified.
I420
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y plane.
Each sample in this format is 8 bits.
There are codedWidth
* codedHeight
samples (and therefore bytes) in the Y plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight
by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect
.x
and visibleRect
.y
) MUST be even.
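As a worked illustration of the I420 plane dimensions described above, the following sketch computes the sample counts for a given coded size. i420PlaneSizes is a hypothetical helper, and it assumes tightly packed rows with no stride padding.

```typescript
// Hypothetical helper: sample counts for each I420 plane. Samples are 8 bits,
// so sample counts equal byte counts.
function i420PlaneSizes(codedWidth: number, codedHeight: number) {
  const ySamples = codedWidth * codedHeight;
  // Chroma dimensions are halved and rounded up, so odd sizes keep a
  // sample for the last column and row.
  const chromaRowSamples = Math.ceil(codedWidth / 2);
  const chromaRows = Math.ceil(codedHeight / 2);
  const chromaSamples = chromaRowSamples * chromaRows;
  return {
    ySamples,
    chromaRowSamples,
    chromaRows,
    totalBytes: ySamples + 2 * chromaSamples,
  };
}
```

A 5x3 frame has 15 Y samples and two chroma planes of 2 rows by 3 samples each, 27 bytes in total.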
I420P10
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y plane.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in the Y plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight
by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect
.x
and visibleRect
.y
) MUST be even.
I420P12
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y plane.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in the Y plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight
by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect
.x
and visibleRect
.y
) MUST be even.
I420A
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y and Alpha planes.
Each sample in this format is 8 bits.
There are codedWidth
* codedHeight
samples (and therefore bytes) in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight
by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect
.x
and visibleRect
.y
) MUST be even.
I420AP10
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y and Alpha planes.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight
by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect
.x
and visibleRect
.y
) MUST be even.
I420AP12
The U and V planes are sub-sampled horizontally and vertically by a factor of 2 compared to the Y and Alpha planes.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have a number of rows equal to the result of the division of codedHeight
by 2, rounded up to the nearest integer. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect
.x
and visibleRect
.y
) MUST be even.
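The plane dimensions described for the 4:2:0 formats above can be worked out with a small helper. The following is a minimal sketch and not part of this specification: the name `planeSizes420` and its return shape are hypothetical, and it assumes tightly packed planes with no row padding.

```typescript
// Hypothetical helper: rows, samples per row, and byte size of each plane
// for the I420 family, assuming tightly packed planes (no stride padding).
interface PlaneSize {
  rows: number;
  samplesPerRow: number;
  bytes: number;
}

function planeSizes420(
  codedWidth: number,
  codedHeight: number,
  bitDepth: 8 | 10 | 12,
  hasAlpha: boolean,
): PlaneSize[] {
  // 10-bit and 12-bit samples are stored as 16-bit little-endian integers.
  const bytesPerSample = bitDepth === 8 ? 1 : 2;
  const plane = (rows: number, samplesPerRow: number): PlaneSize => ({
    rows,
    samplesPerRow,
    bytes: rows * samplesPerRow * bytesPerSample,
  });
  const luma = plane(codedHeight, codedWidth);
  // Chroma dimensions are the coded dimensions divided by 2, rounded up.
  const chroma = plane(Math.ceil(codedHeight / 2), Math.ceil(codedWidth / 2));
  const planes: PlaneSize[] = [luma, chroma, { ...chroma }]; // Y, U, V
  if (hasAlpha) planes.push({ ...luma }); // Alpha has the same size as Y
  return planes;
}
```

For a 15 by 9 image in I420AP10, this yields Y and Alpha planes of 270 bytes each and U and V planes of 5 rows of 8 samples (80 bytes each).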
I422
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y plane, and not sub-sampled vertically.
Each sample in this format is 8 bits.
There are codedWidth
* codedHeight
samples (and therefore bytes) in the Y plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have codedHeight
rows. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect
.x
) MUST be even.
I422P10
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y plane, and not sub-sampled vertically.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in the Y plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have codedHeight
rows. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect
.x
) MUST be even.
I422P12
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y plane, and not sub-sampled vertically.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in the Y plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have codedHeight
rows. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect
.x
) MUST be even.
I422A
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y and Alpha planes, and not sub-sampled vertically.
Each sample in this format is 8 bits.
There are codedWidth
* codedHeight
samples (and therefore bytes) in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have codedHeight
rows. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect
.x
) MUST be even.
I422AP10
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y and Alpha planes, and not sub-sampled vertically.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have codedHeight
rows. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect
.x
) MUST be even.
I422AP12
The U and V planes are sub-sampled horizontally by a factor of 2 compared to the Y and Alpha planes, and not sub-sampled vertically.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in the Y and Alpha planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The U and V planes have codedHeight
rows. Each row has a number of samples equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Samples are arranged starting at the top left of the image.
The visible rectangle horizontal offset (visibleRect
.x
) MUST be even.
I444
This format does not use sub-sampling.
Each sample in this format is 8 bits.
There are codedWidth
* codedHeight
samples (and therefore bytes) in all three planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
I444P10
This format does not use sub-sampling.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in all three planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
I444P12
This format does not use sub-sampling.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in all three planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
I444A
This format does not use sub-sampling.
Each sample in this format is 8 bits.
There are codedWidth
* codedHeight
samples (and therefore bytes) in all four planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
I444AP10
This format does not use sub-sampling.
Each sample in this format is 10 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in all four planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
I444AP12
This format does not use sub-sampling.
Each sample in this format is 12 bits, encoded as a 16-bit integer in little-endian byte order.
There are codedWidth
* codedHeight
samples in all four planes, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
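The visible rectangle offset rules for the planar YUV formats above follow directly from their sub-sampling. A minimal sketch, assuming the format is identified by its VideoPixelFormat name; the function name `isVisibleOffsetAligned` is hypothetical, and formats outside the I420, I422, and I444 families are deliberately not covered.

```typescript
// Hypothetical check of the visibleRect offset rules for the planar YUV
// formats described above.
function isVisibleOffsetAligned(format: string, x: number, y: number): boolean {
  if (format.startsWith("I420")) {
    // 4:2:0: chroma is halved in both directions, so both offsets MUST be even.
    return x % 2 === 0 && y % 2 === 0;
  }
  if (format.startsWith("I422")) {
    // 4:2:2: chroma is halved horizontally only, so only x MUST be even.
    return x % 2 === 0;
  }
  if (format.startsWith("I444")) {
    // 4:4:4: no sub-sampling, and no offset rule is stated for these formats.
    return true;
  }
  throw new RangeError(`format not covered by this sketch: ${format}`);
}
```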
NV12
The U and V components are sub-sampled horizontally and vertically by a factor of 2 compared to the components in the Y plane.
Each sample in this format is 8 bits.
There are codedWidth
* codedHeight
samples (and therefore bytes) in the Y plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
The UV plane is composed of interleaved U and V values, in a number of rows equal to the result of the division of codedHeight
by 2, rounded up to the nearest integer. Each row has a number of elements equal to the result of the division of codedWidth
by 2, rounded up to the nearest integer. Each element is composed of two Chroma samples, the U and V samples, in that order. Samples are arranged starting at the top left of the image.
The visible rectangle offset (visibleRect
.x
and visibleRect
.y
) MUST be even.
An image in the NV12 pixel format that is 16 pixels wide and 10 pixels tall will be arranged like so in memory:
YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY YYYYYYYYYYYYYYYY UVUVUVUVUVUVUVUV UVUVUVUVUVUVUVUV UVUVUVUVUVUVUVUV UVUVUVUVUVUVUVUV UVUVUVUVUVUVUVUV
All samples are linear in memory.
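The NV12 arrangement can be turned into byte offsets for a given pixel. This is a minimal sketch under stated assumptions: the buffer is tightly packed (the UV plane directly follows the Y plane, with no row padding), and the name `nv12Offsets` is hypothetical.

```typescript
// Hypothetical helper: byte offsets of the Y, U, and V samples that apply to
// pixel (x, y) in a tightly packed NV12 buffer.
function nv12Offsets(codedWidth: number, codedHeight: number, x: number, y: number) {
  const ySize = codedWidth * codedHeight;
  // Each UV row holds ceil(codedWidth / 2) elements of two bytes (U, then V).
  const uvRowBytes = Math.ceil(codedWidth / 2) * 2;
  const uvRows = Math.ceil(codedHeight / 2);
  // One UV element is shared by a 2x2 block of pixels.
  const uvElement = ySize + Math.floor(y / 2) * uvRowBytes + Math.floor(x / 2) * 2;
  return {
    y: y * codedWidth + x,
    u: uvElement,
    v: uvElement + 1,
    totalBytes: ySize + uvRows * uvRowBytes,
  };
}
```

For the 16 by 10 image shown above, the total is 240 bytes: 10 rows of 16 Y bytes followed by 5 rows of 16 interleaved UV bytes.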
RGBA
Each sample in this format is 8 bits, and each pixel is therefore 32 bits.
There are codedWidth
* codedHeight
* 4 samples (and therefore bytes) in the single plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
RGBA
’s equivalent opaque format is RGBX
.
RGBX
Each sample in this format is 8 bits. The fourth element in each pixel is to be ignored; the image is always fully opaque.
There are codedWidth
* codedHeight
* 4 samples (and therefore bytes) in the single plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
BGRA
Each sample in this format is 8 bits.
There are codedWidth
* codedHeight
* 4 samples (and therefore bytes) in the single plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
BGRA
’s equivalent opaque format is BGRX
.
BGRX
Each sample in this format is 8 bits. The fourth element in each pixel is to be ignored; the image is always fully opaque.
There are codedWidth
* codedHeight
* 4 samples (and therefore bytes) in the single plane, arranged starting at the top left of the image, in codedHeight
rows of codedWidth
samples.
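The four single-plane RGB formats differ only in channel order and in whether the fourth sample carries alpha. A minimal sketch of reading one pixel, assuming a tightly packed plane with no row padding; the names `RgbFormat` and `readPixel` are hypothetical.

```typescript
type RgbFormat = "RGBA" | "RGBX" | "BGRA" | "BGRX";

// Hypothetical helper: read one pixel from a tightly packed single-plane buffer.
function readPixel(
  data: Uint8Array,
  format: RgbFormat,
  codedWidth: number,
  x: number,
  y: number,
): { r: number; g: number; b: number; a: number } {
  const i = (y * codedWidth + x) * 4; // four one-byte samples per pixel
  const blueFirst = format === "BGRA" || format === "BGRX";
  const opaque = format === "RGBX" || format === "BGRX";
  return {
    r: data[blueFirst ? i + 2 : i],
    g: data[i + 1],
    b: data[blueFirst ? i : i + 2],
    // The fourth sample of the X formats is ignored: always fully opaque.
    a: opaque ? 255 : data[i + 3],
  };
}
```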
[Exposed=(Window,DedicatedWorker)]
interface VideoColorSpace {
constructor(optional VideoColorSpaceInit init = {});
readonly attribute VideoColorPrimaries? primaries;
readonly attribute VideoTransferCharacteristics? transfer;
readonly attribute VideoMatrixCoefficients? matrix;
readonly attribute boolean? fullRange;
[Default] VideoColorSpaceInit toJSON();
};
dictionary VideoColorSpaceInit {
VideoColorPrimaries? primaries = null;
VideoTransferCharacteristics? transfer = null;
VideoMatrixCoefficients? matrix = null;
boolean? fullRange = null;
};
9.9.1. Internal Slots
[[primaries]]
The color primaries.
[[transfer]]
The transfer characteristics.
[[matrix]]
The matrix coefficients.
[[full range]]
Indicates whether full-range color values are used.
VideoColorSpace(init)
Let c be a new VideoColorSpace
object, initialized as follows:
Assign init.primaries
to [[primaries]]
.
Assign init.transfer
to [[transfer]]
.
Assign init.matrix
to [[matrix]]
.
Assign init.fullRange
to [[full range]]
.
Return c.
primaries
, of type VideoColorPrimaries, readonly, nullable
The primaries
getter steps are to return the value of [[primaries]]
.
transfer
, of type VideoTransferCharacteristics, readonly, nullable
The transfer
getter steps are to return the value of [[transfer]]
.
matrix
, of type VideoMatrixCoefficients, readonly, nullable
The matrix
getter steps are to return the value of [[matrix]]
.
fullRange
, of type boolean, readonly, nullable
The fullRange
getter steps are to return the value of [[full range]]
.
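The constructor and getter steps above amount to copying four nullable members into read-only slots. The following plain-TypeScript model illustrates that behavior; it is not the User Agent's VideoColorSpace, the names `ColorSpaceModel` and `ColorSpaceInitModel` are hypothetical, and plain strings stand in for the enums defined by this specification.

```typescript
// Hypothetical model of VideoColorSpaceInit: every member may be absent or null.
interface ColorSpaceInitModel {
  primaries?: string | null;
  transfer?: string | null;
  matrix?: string | null;
  fullRange?: boolean | null;
}

// Hypothetical model of the internal slots and their getters.
class ColorSpaceModel {
  readonly primaries: string | null;
  readonly transfer: string | null;
  readonly matrix: string | null;
  readonly fullRange: boolean | null;

  constructor(init: ColorSpaceInitModel = {}) {
    // Each dictionary member defaults to null when it is not provided.
    this.primaries = init.primaries ?? null;
    this.transfer = init.transfer ?? null;
    this.matrix = init.matrix ?? null;
    this.fullRange = init.fullRange ?? null;
  }

  toJSON(): ColorSpaceInitModel {
    return {
      primaries: this.primaries,
      transfer: this.transfer,
      matrix: this.matrix,
      fullRange: this.fullRange,
    };
  }
}

// The sRGB values: bt709 primaries, iec61966-2-1 transfer, rgb matrix, full range.
const srgb = new ColorSpaceModel({
  primaries: "bt709",
  transfer: "iec61966-2-1",
  matrix: "rgb",
  fullRange: true,
});
```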
enum VideoColorPrimaries
{
"bt709",
"bt470bg",
"smpte170m",
"bt2020",
"smpte432",
};
bt709
bt470bg
smpte170m
bt2020
smpte432
enum VideoTransferCharacteristics
{
"bt709",
"smpte170m",
"iec61966-2-1",
"linear",
"pq",
"hlg",
};
bt709
smpte170m
iec61966-2-1
linear
pq
hlg
enum VideoMatrixCoefficients
{
"rgb",
"bt709",
"bt470bg",
"smpte170m",
"bt2020-ncl",
};
rgb
bt709
bt470bg
smpte170m
bt2020-ncl
This section is non-normative.
Image codec definitions are typically accompanied by a definition for a corresponding file format. Hence image decoders often perform both duties of unpacking (demuxing) as well as decoding the encoded image data. The WebCodecs ImageDecoder
follows this pattern, which motivates an interface design that is notably different from that of VideoDecoder
and AudioDecoder
.
In spite of these differences, ImageDecoder
uses the same codec processing model as the other codec interfaces. Additionally, ImageDecoder
uses the VideoFrame
interface to describe decoded outputs.
[Exposed=(Window,DedicatedWorker), SecureContext]
interface ImageDecoder {
constructor(ImageDecoderInit init);
readonly attribute DOMString type;
readonly attribute boolean complete;
readonly attribute Promise<undefined> completed;
readonly attribute ImageTrackList tracks;
Promise<ImageDecodeResult> decode(optional ImageDecodeOptions options = {});
undefined reset();
undefined close();
static Promise<boolean> isTypeSupported(DOMString type);
};
10.2.1. Internal Slots
[[control message queue]]
A queue of control messages to be performed upon this codec instance. See [[control message queue]].
[[message queue blocked]]
A boolean indicating when processing the [[control message queue]]
is blocked by a pending control message. See [[message queue blocked]].
[[codec work queue]]
A parallel queue used for running parallel steps that reference the [[codec implementation]]
. See [[codec work queue]].
[[ImageTrackList]]
An ImageTrackList
describing the tracks found in [[encoded data]].
[[type]]
A string reflecting the value of the MIME type
given at construction.
[[complete]]
A boolean indicating whether [[encoded data]]
is completely buffered.
[[completed promise]]
The promise used to signal when [[complete]]
becomes true
.
[[codec implementation]]
An underlying image decoder implementation provided by the User Agent. See [[codec implementation]].
[[encoded data]]
A byte sequence containing the encoded image data to be decoded.
[[prefer animation]]
A boolean reflecting the value of preferAnimation
given at construction.
[[pending decode promises]]
A list of unresolved promises returned by calls to decode().
[[internal selected track index]]
Identifies the image track within [[encoded data]]
that is used by decoding algorithms.
[[tracks established]]
A boolean indicating whether the track list has been established in [[ImageTrackList]]
.
[[closed]]
A boolean indicating that the ImageDecoder
is in a permanent closed state and can no longer be used.
[[progressive frame generations]]
A mapping of frame indices to Progressive Image Frame Generations. The values represent the Progressive Image Frame Generation for the VideoFrame
which was most recently output by a call to decode()
with the given frame index.
ImageDecoder(init)
NOTE: Calling decode()
on the constructed ImageDecoder
will trigger a NotSupportedError
if the User Agent does not support type. Authors are encouraged to first check support by calling isTypeSupported()
with type. User Agents don’t have to support any particular type.
When invoked, run these steps:
If init is not a valid ImageDecoderInit, throw a TypeError
.
If init.transfer
contains more than one reference to the same ArrayBuffer
, then throw a DataCloneError
DOMException
.
For each transferable in init.transfer
:
If [[Detached]]
internal slot is true
, then throw a DataCloneError
DOMException
.
Let d be a new ImageDecoder
object. In the steps below, all mentions of ImageDecoder
members apply to d unless stated otherwise.
Assign a new queue to [[control message queue]]
.
Assign false
to [[message queue blocked]]
.
Assign the result of starting a new parallel queue to [[codec work queue]]
.
Assign [[ImageTrackList]]
a new ImageTrackList
initialized as follows:
Assign a new list to [[track list]]
.
Assign -1
to [[selected index]]
.
Assign null
to [[codec implementation]]
.
If init.preferAnimation
exists, assign init.preferAnimation
to the [[prefer animation]]
internal slot. Otherwise, assign null to the [[prefer animation]]
internal slot.
Assign a new list to [[pending decode promises]]
.
Assign -1
to [[internal selected track index]]
.
Assign false
to [[tracks established]]
.
Assign false
to [[closed]]
.
Assign a new map to [[progressive frame generations]]
.
If init’s data
member is of type ReadableStream
:
Assign a new list to [[encoded data]]
.
Assign false
to [[complete]].
Queue a control message to configure the image decoder with init.
Let reader be the result of getting a reader for data
.
In parallel, perform the Fetch Stream Data Loop on d with reader.
Otherwise:
Assert that init.data
is of type BufferSource
.
If init.transfer
contains an ArrayBuffer
referenced by init.data
the User Agent MAY choose to:
Let [[encoded data]]
reference bytes in data representing an encoded image.
Otherwise:
Assign a copy of init.data
to [[encoded data]]
.
Assign true
to [[complete]]
.
Resolve [[completed promise]]
.
Queue a control message to configure the image decoder with init.
Queue a control message to decode track metadata.
For each transferable in init.transfer
:
Perform DetachArrayBuffer on transferable.
Return d.
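The constructor throws a DataCloneError when the same ArrayBuffer appears more than once in init.transfer. An author-side pre-check for that condition could look like the sketch below; the function name `hasDuplicateBuffers` is hypothetical, and the check does not cover the separate rule about detached buffers.

```typescript
// Hypothetical pre-check mirroring the transfer-list rule in the constructor:
// the same ArrayBuffer must not be listed more than once.
function hasDuplicateBuffers(transfer: ArrayBuffer[]): boolean {
  // A Set keeps one entry per distinct object reference.
  return new Set(transfer).size !== transfer.length;
}
```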
Running a control message to configure the image decoder means running these steps:
Let supported be the result of running the Check Type Support algorithm with init.type
.
If supported is false
, run the Close ImageDecoder algorithm with a NotSupportedError
DOMException
and return "processed"
.
Otherwise, assign the [[codec implementation]]
internal slot with an implementation supporting init.type.
Assign true
to [[message queue blocked]]
.
Enqueue the following steps to the [[codec work queue]]
:
Configure [[codec implementation]]
in accordance with the values given for colorSpaceConversion
, desiredWidth
, and desiredHeight
.
Assign false
to [[message queue blocked]]
.
Return "processed"
.
Running a control message to decode track metadata means running these steps:
Enqueue the following steps to the [[codec work queue]]
:
Run the Establish Tracks algorithm.
type
, of type DOMString, readonly
A string reflecting the value of the MIME type
given at construction.
complete
, of type boolean, readonly
Indicates whether [[encoded data]]
is completely buffered.
The complete
getter steps are to return [[complete]]
.
completed
, of type Promise<undefined>, readonly
The promise used to signal when complete
becomes true
.
The completed
getter steps are to return [[completed promise]]
.
tracks
, of type ImageTrackList, readonly
Returns a live ImageTrackList
, which provides metadata for the available tracks and a mechanism for selecting a track to decode.
The tracks
getter steps are to return [[ImageTrackList]]
.
decode(options)
Enqueues a control message to decode the frame according to options.
When invoked, run these steps:
If [[closed]]
is true
, return a Promise
rejected with an InvalidStateError
DOMException
.
If [[ImageTrackList]]
’s [[selected index]]
is '-1', return a Promise
rejected with an InvalidStateError
DOMException
.
If options is undefined
, assign a new ImageDecodeOptions
to options.
Let promise be a new Promise
.
Append promise to [[pending decode promises]]
.
Queue a control message to decode the image with options, and promise.
Return promise.
Running a control message to decode the image means running these steps:
Enqueue the following steps to the [[codec work queue]]
:
Wait for [[tracks established]]
to become true
.
If options.completeFramesOnly
is false
and the image is a Progressive Image for which the User Agent supports progressive decoding, run the Decode Progressive Frame algorithm with options.frameIndex
and promise.
Otherwise, run the Decode Complete Frame algorithm with options.frameIndex
and promise.
reset()
Immediately aborts all pending work.
When invoked, run the Reset ImageDecoder algorithm with an AbortError
DOMException
.
close()
Immediately aborts all pending work and releases system resources. Close is final.
When invoked, run the Close ImageDecoder algorithm with an AbortError
DOMException
.
isTypeSupported(type)
Returns a promise indicating whether the provided type is supported by the User Agent.
When invoked, run these steps:
If type is not a valid image MIME type, return a Promise
rejected with TypeError
.
Let p be a new Promise
.
In parallel, resolve p with the result of running the Check Type Support algorithm with type.
Return p.
Fetch Stream Data Loop (with reader)
Run these steps:
Let readRequest be the following read request.
chunk steps, given chunk
If [[closed]] is true, abort these steps.
If chunk is not a Uint8Array object, queue a task to run the Close ImageDecoder algorithm with a DataError DOMException and abort these steps.
Let bytes be the byte sequence represented by the Uint8Array object.
Append bytes to the [[encoded data]] internal slot.
If [[tracks established]] is false, run the Establish Tracks algorithm.
Otherwise, run the Update Tracks algorithm.
Run the Fetch Stream Data Loop algorithm with reader.
close steps
Assign true to [[complete]].
Resolve [[completed promise]].
error steps
Queue a task to run the Close ImageDecoder algorithm with a NotReadableError DOMException.
Read a chunk from reader given readRequest.
Establish Tracks
Run these steps:
Assert [[tracks established]]
is false
.
If [[encoded data]]
does not contain enough data to determine the number of tracks:
If complete
is true
, queue a task to run the Close ImageDecoder algorithm with an InvalidStateError
DOMException
.
Abort these steps.
If the number of tracks is found to be 0
, queue a task to run the Close ImageDecoder algorithm and abort these steps.
Let newTrackList be a new list.
For each image track found in [[encoded data]]
:
Let newTrack be a new ImageTrack
, initialized as follows:
Assign this to [[ImageDecoder]]
.
Assign tracks
to [[ImageTrackList]]
.
If image track is found to be animated, assign true
to newTrack’s [[animated]]
internal slot. Otherwise, assign false
.
If image track is found to describe a frame count, assign that count to newTrack’s [[frame count]]
internal slot. Otherwise, assign 0
.
NOTE: If this was constructed with data
as a ReadableStream
, the frameCount
can change as additional bytes are appended to [[encoded data]]
. See the Update Tracks algorithm.
If image track is found to describe a repetition count, assign that count to [[repetition count]]
internal slot. Otherwise, assign 0
.
NOTE: A value of Infinity
indicates infinite repetitions.
Assign false
to newTrack’s [[selected]]
internal slot.
Append newTrack to newTrackList.
Let selectedTrackIndex be the result of running the Get Default Selected Track Index algorithm with newTrackList.
Let selectedTrack be the track at position selectedTrackIndex within newTrackList.
Assign true
to selectedTrack’s [[selected]]
internal slot.
Assign selectedTrackIndex to [[internal selected track index]]
.
Assign true
to [[tracks established]]
.
Queue a task to perform the following steps:
Assign newTrackList to the tracks
[[track list]]
internal slot.
Assign selectedTrackIndex to tracks
[[selected index]]
.
Resolve [[ready promise]]
.
Get Default Selected Track Index (with trackList)
Run these steps:
If [[encoded data]]
identifies a Primary Image Track:
Let primaryTrack be the ImageTrack
from trackList that describes the Primary Image Track.
Let primaryTrackIndex be the position of primaryTrack within trackList.
If [[prefer animation]]
is null
, return primaryTrackIndex.
If primaryTrack.animated
equals [[prefer animation]]
, return primaryTrackIndex.
If any ImageTrack
s in trackList have animated
equal to [[prefer animation]]
, return the position of the earliest such track in trackList.
Return 0
.
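The track selection steps above can be expressed as a small pure function. This is a sketch rather than a normative restatement: `TrackModel` and `defaultSelectedTrackIndex` are hypothetical names, and `primaryTrackIndex` is passed as null when the encoded data does not identify a Primary Image Track.

```typescript
// Hypothetical stand-in for the part of ImageTrack that the steps consult.
interface TrackModel {
  animated: boolean;
}

function defaultSelectedTrackIndex(
  trackList: TrackModel[],
  primaryTrackIndex: number | null,
  preferAnimation: boolean | null,
): number {
  if (primaryTrackIndex !== null) {
    // With no stated preference, the primary track wins outright.
    if (preferAnimation === null) return primaryTrackIndex;
    // The primary track also wins when it matches the preference.
    if (trackList[primaryTrackIndex].animated === preferAnimation) return primaryTrackIndex;
  }
  // Otherwise pick the earliest track matching the preference, if any.
  const match = trackList.findIndex((track) => track.animated === preferAnimation);
  return match !== -1 ? match : 0;
}
```

With a still primary track at index 0 and an animated track at index 1, a preference of true selects index 1, while null or false selects index 0.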
A track update struct is a struct that consists of a track index (unsigned long
) and a frame count (unsigned long
).
Update Tracks
Run these steps:
Assert [[tracks established]]
is true
.
Let trackChanges be a new list.
Let trackList be a copy of tracks
' [[track list]]
.
For each track in trackList:
Let trackIndex be the position of track in trackList.
Let latestFrameCount be the frame count as indicated by [[encoded data]]
for the track corresponding to track.
Assert that latestFrameCount is greater than or equal to track.frameCount
.
If latestFrameCount is greater than track.frameCount
:
Let change be a track update struct whose track index is trackIndex and frame count is latestFrameCount.
Append change to trackChanges.
If trackChanges is empty, abort these steps.
Queue a task to perform the following steps:
For each update in trackChanges:
Let updateTrack be the ImageTrack
at position update.trackIndex
within tracks
' [[track list]]
.
Assign update.frameCount
to updateTrack’s [[frame count]]
.
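The comparison at the heart of these steps can be sketched as a pure function that builds the list of track update structs. The names `TrackUpdate` and `collectTrackChanges` are hypothetical, and frame counts are passed as plain arrays indexed by track position.

```typescript
// Hypothetical model of the track update struct defined above.
interface TrackUpdate {
  trackIndex: number;
  frameCount: number;
}

// Compare the known frame counts against the latest counts indicated by the
// encoded data, and collect one update per track whose count has grown.
function collectTrackChanges(knownCounts: number[], latestCounts: number[]): TrackUpdate[] {
  const trackChanges: TrackUpdate[] = [];
  knownCounts.forEach((known, trackIndex) => {
    const latest = latestCounts[trackIndex];
    // Mirrors the assertion that the frame count never decreases.
    if (latest < known) throw new Error("frame count must not decrease");
    if (latest > known) trackChanges.push({ trackIndex, frameCount: latest });
  });
  return trackChanges;
}
```

An empty result corresponds to the case where the steps abort without queuing a task.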
Decode Complete Frame (with frameIndex and promise)
Assert that [[tracks established]]
is true
.
Assert that [[internal selected track index]]
is not -1
.
Let encodedFrame be the encoded frame identified by frameIndex and [[internal selected track index]]
.
Wait for any of the following conditions to be true (whichever happens first):
[[encoded data]]
contains enough bytes to completely decode encodedFrame.
[[encoded data]]
is found to be malformed.
complete
is true
.
[[closed]]
is true
.
If [[encoded data]]
is found to be malformed, run the Fatally Reject Bad Data algorithm and abort these steps.
If [[encoded data]]
does not contain enough bytes to completely decode encodedFrame, run the Reject Infeasible Decode algorithm with promise and abort these steps.
Attempt to use [[codec implementation]]
to decode encodedFrame.
If decoding produces an error, run the Fatally Reject Bad Data algorithm and abort these steps.
If [[progressive frame generations]]
contains an entry keyed by frameIndex, remove the entry from the map.
Let output be the decoded image data emitted by [[codec implementation]]
corresponding to encodedFrame.
Let decodeResult be a new ImageDecodeResult
initialized as follows:
Assign 'true' to complete
.
Let duration be the presentation duration for output as described by encodedFrame. If encodedFrame does not have a duration, assign null
to duration.
Let timestamp be the presentation timestamp for output as described by encodedFrame. If encodedFrame does not have a timestamp:
If encodedFrame is a still image assign 0
to timestamp.
If encodedFrame is a constant rate animated image and duration is not null
, assign |frameIndex| * |duration|
to timestamp.
If a timestamp can otherwise be trivially generated from metadata without further decoding, assign that to timestamp.
Otherwise, assign 0
to timestamp.
If [[encoded data]]
contains orientation metadata, describe it as rotation and flip; otherwise, set rotation to 0 and flip to false.
Assign image
with the result of running the Create a VideoFrame algorithm with output, timestamp, duration, rotation, and flip.
Run the Resolve Decode algorithm with promise and decodeResult.
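The timestamp fallback in the steps above can be illustrated as follows. This sketch reads the sub-steps as a first-match cascade, which is an interpretation rather than something the text states; all names are hypothetical, and `metadataTimestamp` models a timestamp that can be trivially generated from metadata.

```typescript
// Hypothetical inputs describing what is known about the encoded frame.
interface TimestampInputs {
  encodedTimestamp: number | null; // timestamp described by the encoded frame, if any
  isStillImage: boolean;
  isConstantRate: boolean; // constant rate animated image
  frameIndex: number;
  duration: number | null;
  metadataTimestamp?: number; // trivially derivable from metadata, if available
}

function resolveTimestamp(inputs: TimestampInputs): number {
  if (inputs.encodedTimestamp !== null) return inputs.encodedTimestamp;
  if (inputs.isStillImage) return 0;
  if (inputs.isConstantRate && inputs.duration !== null) {
    return inputs.frameIndex * inputs.duration;
  }
  if (inputs.metadataTimestamp !== undefined) return inputs.metadataTimestamp;
  return 0;
}
```

For example, frame index 3 of a constant rate animation with a duration of 40000 and no explicit timestamp resolves to 120000.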
Decode Progressive Frame (with frameIndex and promise)
Assert that [[tracks established]]
is true
.
Assert that [[internal selected track index]]
is not -1
.
Let encodedFrame be the encoded frame identified by frameIndex and [[internal selected track index]]
.
Let lastFrameGeneration be null
.
If [[progressive frame generations]]
contains a map entry with the key frameIndex, assign the value of the map entry to lastFrameGeneration.
Wait for any of the following conditions to be true (whichever happens first):
[[encoded data]]
contains enough bytes to decode encodedFrame to produce an output whose Progressive Image Frame Generation exceeds lastFrameGeneration.
[[encoded data]]
is found to be malformed.
complete
is true
.
[[closed]]
is true
.
If [[encoded data]]
is found to be malformed, run the Fatally Reject Bad Data algorithm and abort these steps.
Otherwise, if [[encoded data]]
does not contain enough bytes to decode encodedFrame to produce an output whose Progressive Image Frame Generation exceeds lastFrameGeneration, run the Reject Infeasible Decode algorithm with promise and abort these steps.
Attempt to use [[codec implementation]]
to decode encodedFrame.
If decoding produces an error, run the Fatally Reject Bad Data algorithm and abort these steps.
Let output be the decoded image data emitted by [[codec implementation]]
corresponding to encodedFrame.
Let decodeResult be a new ImageDecodeResult
.
If output is the final full-detail progressive output corresponding to encodedFrame:
Assign true
to decodeResult’s complete
.
If [[progressive frame generations]]
contains an entry keyed by frameIndex, remove the entry from the map.
Otherwise:
Assign false
to decodeResult’s complete
.
Let frameGeneration be the Progressive Image Frame Generation for output.
Add a new entry to [[progressive frame generations]]
with key frameIndex and value frameGeneration.
Let duration be the presentation duration for output as described by encodedFrame. If encodedFrame does not describe a duration, assign null
to duration.
Let timestamp be the presentation timestamp for output as described by encodedFrame. If encodedFrame does not have a timestamp:
If encodedFrame is a still image assign 0
to timestamp.
If encodedFrame is a constant rate animated image and duration is not null
, assign |frameIndex| * |duration|
to timestamp.
If a timestamp can otherwise be trivially generated from metadata without further decoding, assign that to timestamp.
Otherwise, assign 0
to timestamp.
If [[encoded data]]
contains orientation metadata, describe it as rotation and flip; otherwise, set rotation to 0 and flip to false.
Assign image
with the result of running the Create a VideoFrame algorithm with output, timestamp, duration, rotation, and flip.
Remove promise from [[pending decode promises]]
.
Resolve promise with decodeResult.
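The bookkeeping of [[progressive frame generations]] in the steps above can be sketched with a Map. Generations are modeled here as numbers, which is an assumption: this specification leaves the computation of a frame's generation to the implementer. The function name `recordProgressiveOutput` is hypothetical.

```typescript
// Hypothetical model: update the frame-index-to-generation map after an
// output is produced, and report the value assigned to `complete`.
function recordProgressiveOutput(
  generations: Map<number, number>,
  frameIndex: number,
  outputGeneration: number,
  isFinalOutput: boolean,
): { complete: boolean } {
  if (isFinalOutput) {
    // Full detail reached: remove the entry, so lastFrameGeneration is null
    // for any later decode of this frame.
    generations.delete(frameIndex);
    return { complete: true };
  }
  // Remember the generation so the next decode waits for a higher one.
  generations.set(frameIndex, outputGeneration);
  return { complete: false };
}
```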
Resolve Decode (with promise and result)
Queue a task to perform these steps:
If [[closed]]
is true, abort these steps.
Assert that promise is an element of [[pending decode promises]]
.
Remove promise from [[pending decode promises]]
.
Resolve promise with result.
Reject Infeasible Decode (with promise)
Assert that complete
is true
or [[closed]]
is true
.
If complete
is true
, let exception be a RangeError
. Otherwise, let exception be an InvalidStateError
DOMException
.
Queue a task to perform these steps:
If [[closed]]
is true, abort these steps.
Assert that promise is an element of [[pending decode promises]]
.
Remove promise from [[pending decode promises]]
.
Reject promise with exception.
Fatally Reject Bad Data
Queue a task to perform these steps:
If [[closed]]
is true, abort these steps.
Run the Close ImageDecoder algorithm with an EncodingError
DOMException
.
Check Type Support (with type)
If the User Agent can provide a codec to support decoding type, return true
.
Otherwise, return false
.
Reset ImageDecoder (with exception)
Signal [[codec implementation]]
to abort any active decoding operation.
For each decodePromise in [[pending decode promises]]
:
Reject decodePromise with exception.
Remove decodePromise from [[pending decode promises]]
.
Close ImageDecoder (with exception)
Run the Reset ImageDecoder algorithm with exception.
Assign true
to [[closed]]
.
Clear [[codec implementation]]
and release associated system resources.
If [[ImageTrackList]]
is empty, reject [[ready promise]]
with exception. Otherwise, perform these steps:
Remove all entries from [[ImageTrackList]]
.
Assign -1
to [[ImageTrackList]]
’s [[selected index]]
.
If [[complete]]
is false resolve [[completed promise]]
with exception.
typedef (AllowSharedBufferSource or ReadableStream) ImageBufferSource;
dictionary ImageDecoderInit {
required DOMString type;
required ImageBufferSource data;
ColorSpaceConversion colorSpaceConversion = "default";
[EnforceRange] unsigned long desiredWidth;
[EnforceRange] unsigned long desiredHeight;
boolean preferAnimation;
sequence<ArrayBuffer> transfer = [];
};
To determine if an ImageDecoderInit
is a valid ImageDecoderInit, run these steps:
If type is not a valid image MIME type, return false
.
If data is of type ReadableStream
and the ReadableStream is disturbed or locked, return false
.
If data is of type BufferSource
:
If desiredWidth
exists and desiredHeight
does not exist, return false
.
If desiredHeight
exists and desiredWidth
does not exist, return false
.
Return true
.
A valid image MIME type is a string that is a valid MIME type string and for which the type
, per Section 8.3.1 of [RFC9110], is image
.
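The validity steps above can be approximated in a few lines. This is a sketch under stated simplifications: the regular expression is a simplified stand-in for a valid image MIME type (it rejects parameters such as `;charset=...`), a stream is modeled only by its `locked` flag because whether a stream is disturbed is not observable from script, and the BufferSource sub-steps are not reproduced in the text above, so they are not modeled. All names are hypothetical.

```typescript
// Hypothetical, reduced model of ImageDecoderInit.
interface InitModel {
  type: string;
  data: ArrayBuffer | ArrayBufferView | { locked: boolean };
  desiredWidth?: number;
  desiredHeight?: number;
}

// Simplified check: type "image", a slash, and a non-empty subtype.
const IMAGE_MIME = /^image\/[A-Za-z0-9!#$&^_.+-]+$/i;

function isStreamLike(data: unknown): data is { locked: boolean } {
  return typeof data === "object" && data !== null && "locked" in data;
}

function isValidInitModel(init: InitModel): boolean {
  if (!IMAGE_MIME.test(init.type)) return false;
  // A locked stream cannot be read by the decoder.
  if (isStreamLike(init.data) && init.data.locked) return false;
  // desiredWidth and desiredHeight must be given together or not at all.
  const hasWidth = init.desiredWidth !== undefined;
  const hasHeight = init.desiredHeight !== undefined;
  if (hasWidth !== hasHeight) return false;
  return true;
}
```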
type
, of type DOMString
String containing the MIME type of the image file to be decoded.
data
, of type ImageBufferSource
BufferSource
or ReadableStream
of bytes representing an encoded image file as described by type
.
colorSpaceConversion
, of type ColorSpaceConversion, defaulting to "default"
Controls whether decoded outputs' color space is converted or ignored, as defined by colorSpaceConversion
in ImageBitmapOptions
.
desiredWidth
, of type unsigned long
Indicates a desired width for decoded outputs. Implementation is best effort; decoding to a desired width MAY not be supported by all formats/decoders.
desiredHeight
, of type unsigned long
Indicates a desired height for decoded outputs. Implementation is best effort; decoding to a desired height MAY not be supported by all formats/decoders.
preferAnimation
, of type boolean
For images with multiple tracks, this indicates whether the initial track selection SHOULD prefer an animated track.
NOTE: See the Get Default Selected Track Index algorithm.
dictionary ImageDecodeOptions
{
[EnforceRange] unsigned long frameIndex = 0;
boolean completeFramesOnly = true;
};
frameIndex
, of type unsigned long, defaulting to 0
The index of the frame to decode.
completeFramesOnly
, of type boolean, defaulting to true
For Progressive Images, a value of false
indicates that the decoder MAY output an image
with reduced detail. Each subsequent call to decode()
for the same frameIndex
will resolve to produce an image with a higher Progressive Image Frame Generation (more image detail) than the previous call, until finally the full-detail image is produced.
If completeFramesOnly
is assigned true
, or if the image is not a Progressive Image, or if the User Agent does not support progressive decoding for the given image type, calls to decode()
will only resolve once the full detail image is decoded.
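As a rough illustration of how an author might consume progressive output, the sketch below calls decode() repeatedly for one frameIndex with completeFramesOnly set to false until a result reports that it is complete. `DecoderLike`, `DecodeResultLike`, and `decodeProgressively` are hypothetical names introduced here; the interface only assumes a decode() method that returns a promise for an object with image and complete members, which an ImageDecoder would be expected to satisfy.

```typescript
// Hypothetical minimal shapes, assumed for illustration only.
interface DecodeResultLike<T> {
  image: T;
  complete: boolean;
}

interface DecoderLike<T> {
  decode(options?: {
    frameIndex?: number;
    completeFramesOnly?: boolean;
  }): Promise<DecodeResultLike<T>>;
}

// Repeatedly decode the same frame, reporting each refinement, until the
// decoder signals that the full-detail image has been produced.
async function decodeProgressively<T>(
  decoder: DecoderLike<T>,
  frameIndex: number,
  onImage: (image: T, complete: boolean) => void,
): Promise<T> {
  while (true) {
    const { image, complete } = await decoder.decode({
      frameIndex,
      completeFramesOnly: false,
    });
    onImage(image, complete);
    if (complete) return image;
  }
}
```

When the images are VideoFrame objects, the callback would normally close each intermediate frame once it has been drawn, since decoded frames hold system resources.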
dictionary ImageDecodeResult
{
required VideoFrame image;
required boolean complete;
};
image
, of type VideoFrame
The decoded image.
complete
, of type boolean
Indicates whether image
contains the final full-detail output.
NOTE: complete
is always true
when decode()
is invoked with completeFramesOnly
set to true
.
[Exposed=(Window,DedicatedWorker)]
interface ImageTrackList
{
getter ImageTrack (unsigned long index);
readonly attribute Promise<undefined> ready;
readonly attribute unsigned long length;
readonly attribute long selectedIndex;
readonly attribute ImageTrack? selectedTrack;
};
10.6.1. Internal Slots
[[ready promise]]
The promise used to signal when the ImageTrackList
has been populated with ImageTrack
s.
NOTE: ImageTrack
frameCount
can receive subsequent updates until complete
is true
.
[[track list]]
The list of ImageTrack
s described by this ImageTrackList
.
[[selected index]]
The index of the selected track in [[track list]]
. A value of -1
indicates that no track is selected. The initial value is -1
.
ready
, of type Promise<undefined>, readonly
The ready
getter steps are to return the [[ready promise]]
.
length
, of type unsigned long, readonly
The length
getter steps are to return the length of [[track list]]
.
selectedIndex
, of type long, readonly
The selectedIndex
getter steps are to return [[selected index]]
.
selectedTrack
, of type ImageTrack, readonly, nullable
The selectedTrack
getter steps are:
If [[selected index]]
is -1
, return null
.
Otherwise, return the ImageTrack from [[track list]]
at the position indicated by [[selected index]]
.
[Exposed=(Window,DedicatedWorker)]
interface ImageTrack
{
readonly attribute boolean animated;
readonly attribute unsigned long frameCount;
readonly attribute unrestricted float repetitionCount;
attribute boolean selected;
};
10.7.1. Internal Slots
[[ImageDecoder]]
The ImageDecoder
instance that constructed this ImageTrack
.
[[ImageTrackList]]
The ImageTrackList
instance that lists this ImageTrack
.
[[animated]]
Indicates whether this track contains an animated image with multiple frames.
[[frame count]]
The number of frames in this track.
[[repetition count]]
The number of times the animation is intended to repeat.
[[selected]]
Indicates whether this track is selected for decoding.
animated
, of type boolean, readonly
The animated
getter steps are to return the value of [[animated]]
.
NOTE: This attribute provides an early indication that frameCount
will ultimately exceed 0 for images where the frameCount
starts at 0
and later increments as new chunks of the ReadableStream
data
arrive.
frameCount
, of type unsigned long, readonly
The frameCount
getter steps are to return the value of [[frame count]]
.
repetitionCount
, of type unrestricted float, readonly
The repetitionCount
getter steps are to return the value of [[repetition count]]
.
selected
, of type boolean
The selected
getter steps are to return the value of [[selected]]
.
The selected
setter steps are:
If [[ImageDecoder]]
’s [[closed]]
slot is true
, abort these steps.
Let newValue be the given value.
If newValue equals [[selected]]
, abort these steps.
Assign newValue to [[selected]]
.
Let parentTrackList be [[ImageTrackList]].
Let oldSelectedIndex be the value of parentTrackList [[selected index]]
.
If oldSelectedIndex is not -1
:
Let oldSelectedTrack be the ImageTrack
in parentTrackList [[track list]]
at the position of oldSelectedIndex.
Assign false
to oldSelectedTrack [[selected]].
If newValue is true
, let selectedIndex be the index of this ImageTrack
within parentTrackList’s [[track list]]
. Otherwise, let selectedIndex be -1
.
Assign selectedIndex to parentTrackList [[selected index]]
.
Run the Reset ImageDecoder algorithm on [[ImageDecoder]]
.
Queue a control message to [[ImageDecoder]]
’s control message queue to update the internal selected track index with selectedIndex.
Process the control message queue belonging to [[ImageDecoder]]
.
Running a control message to update the internal selected track index means running these steps:
Enqueue the following steps to [[ImageDecoder]]
’s [[codec work queue]]
:
Assign selectedIndex to [[internal selected track index]]
.
Remove all entries from [[progressive frame generations]]
.
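The bookkeeping performed by the selected setter and the selectedTrack getter can be modelled with plain objects, as in the sketch below. It is a simplified illustration under stated assumptions: `TrackModel`, `TrackListModel`, `DecoderModel`, `setSelected`, and `selectedTrack` are hypothetical names, and the Reset ImageDecoder step and the control message that updates the internal selected track index are deliberately left out.

```typescript
// Hypothetical plain-object models of the internal slots described above.
interface TrackModel {
  selected: boolean;
}

interface TrackListModel {
  tracks: TrackModel[];
  selectedIndex: number; // -1 means no track is selected
}

interface DecoderModel {
  closed: boolean;
  trackList: TrackListModel;
}

// Mirrors the selectedTrack getter: null when nothing is selected.
function selectedTrack(list: TrackListModel): TrackModel | null {
  if (list.selectedIndex === -1) return null;
  return list.tracks[list.selectedIndex] ?? null;
}

// Mirrors the synchronous part of the selected setter steps.
function setSelected(
  decoder: DecoderModel,
  track: TrackModel,
  newValue: boolean,
): void {
  if (decoder.closed) return;
  if (newValue === track.selected) return;
  track.selected = newValue;

  const list = decoder.trackList;
  const oldIndex = list.selectedIndex;
  if (oldIndex !== -1) {
    // Deselect whichever track was selected before this call.
    const oldTrack = list.tracks[oldIndex];
    if (oldTrack) oldTrack.selected = false;
  }
  list.selectedIndex = newValue ? list.tracks.indexOf(track) : -1;
  // Omitted: Reset ImageDecoder and the queued control message.
}
```

In this model at most one track is selected at a time, and deselecting the current track leaves the list with a selected index of -1.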
When resources are constrained, a User Agent MAY proactively reclaim codecs. This is particularly true in the case where hardware codecs are limited, and shared across web pages or platform apps.
To reclaim a codec, a User Agent MUST run the appropriate close algorithm (amongst Close AudioDecoder, Close AudioEncoder, Close VideoDecoder and Close VideoEncoder) with a QuotaExceededError
DOMException
.
The rules governing when a codec may be reclaimed depend on whether the codec is an active or inactive codec and/or a background codec.
An active codec is a codec that has made progress on the [[codec work queue]] in the past 10 seconds
.
NOTE: A reliable sign of the work queue’s progress is a call to the output()
callback.
An inactive codec is any codec that does not meet the definition of an active codec.
A background codec is a codec whose ownerDocument
(or owner set’s Document
, for codecs in workers) has a hidden
attribute equal to true
.
A User Agent MUST only reclaim a codec that is either an inactive codec, a background codec, or both. A User Agent MUST NOT reclaim a codec that is both active and in the foreground, i.e. not a background codec.
Additionally, User Agents MUST NOT reclaim an active background codec if it is:
An encoder, e.g. an AudioEncoder
or VideoEncoder
.
NOTE: This prevents long running encode tasks from being interrupted.
An AudioDecoder
or VideoDecoder
, when there is, respectively, an active AudioEncoder
or VideoEncoder
in the same global object.
NOTE: This prevents breaking long running transcoding tasks.
An AudioDecoder
, when its tab is audibly playing audio.
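The reclamation rules above can be read as a single eligibility predicate. The following sketch is one possible encoding, not User Agent code: `CodecState`, `isActive`, and `mayReclaim` are hypothetical names, and each field is an assumed summary of state that a User Agent would track internally.

```typescript
// Hypothetical summary of the state the rules above depend on.
interface CodecState {
  kind: "AudioDecoder" | "AudioEncoder" | "VideoDecoder" | "VideoEncoder";
  msSinceLastProgress: number; // time since the codec work queue last progressed
  documentHidden: boolean; // true for a background codec
  // For decoders: an active encoder of the same media type (audio or video)
  // exists in the same global object.
  matchingActiveEncoderInGlobal: boolean;
  tabAudiblyPlaying: boolean;
}

const ACTIVE_WINDOW_MS = 10000; // "in the past 10 seconds"

function isActive(codec: CodecState): boolean {
  return codec.msSinceLastProgress <= ACTIVE_WINDOW_MS;
}

// Returns true when the rules permit (but do not require) reclaiming the codec.
function mayReclaim(codec: CodecState): boolean {
  if (!isActive(codec)) return true; // inactive codecs are always eligible
  if (!codec.documentHidden) return false; // active and in the foreground

  // Active background codec: apply the additional protections.
  const isEncoder =
    codec.kind === "AudioEncoder" || codec.kind === "VideoEncoder";
  if (isEncoder) return false;
  if (codec.matchingActiveEncoderInGlobal) return false;
  if (codec.kind === "AudioDecoder" && codec.tabAudiblyPlaying) return false;
  return true;
}
```

Note that the encoder, transcoding, and audible-playback protections only apply to active background codecs; an inactive codec remains eligible regardless of its kind.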
This section is non-normative.
The primary security impact is that features of this API make it easier for an attacker to exploit vulnerabilities in the underlying platform codecs. Additionally, new abilities to configure and control the codecs can allow for new exploits that rely on a specific configuration and/or sequence of control operations.
Platform codecs are historically an internal detail of APIs like HTMLMediaElement
, [WEBAUDIO], and [WebRTC]. In this way, it has always been possible to attack the underlying codecs by using malformed media files/streams and invoking the various API control methods.
For example, you can send any stream to a decoder by first wrapping that stream in a media container (e.g. mp4) and setting that as the src
of an HTMLMediaElement
. You can then cause the underlying video decoder to be reset()
by setting a new value for <video>.currentTime
.
WebCodecs makes such attacks easier by exposing low level control when inputs are provided and direct access to invoke the codec control methods. This also affords attackers the ability to invoke sequences of control methods that were not previously possible via the higher level APIs.
The Working Group expects User Agents to mitigate this risk by extensively fuzzing their implementation with random inputs and control method invocations. Additionally, User Agents are encouraged to isolate their underlying codecs in processes with restricted privileges (sandbox) as a barrier against successful exploits being able to read user data.
An additional concern is exposing the underlying codecs to input mutation race conditions, such as allowing a site to mutate a codec input or output while the underlying codec is still operating on that data. This concern is mitigated by ensuring that input and output interfaces are immutable.
13. Privacy Considerations
This section is non-normative.
The primary privacy impact is an increased ability to fingerprint users by querying for different codec capabilities to establish a codec feature profile. Much of this profile is already exposed by existing APIs. Such profiles are very unlikely to be uniquely identifying, but can be used with other metrics to create a fingerprint.
An attacker can accumulate a codec feature profile by calling IsConfigSupported()
methods with a number of different configuration dictionaries. Similarly, an attacker can attempt to configure()
a codec with different configuration dictionaries and observe which configurations are accepted.
Attackers can also use existing APIs to establish much of the codec feature profile. For example, the [media-capabilities] decodingInfo()
API describes what types of decoders are supported and its powerEfficient
attribute can signal when a decoder uses hardware acceleration. Similarly, the [WebRTC] getCapabilities()
API can be used to determine what types of encoders are supported and the getStats()
API can be used to determine when an encoder uses hardware acceleration. WebCodecs will expose some additional information in the form of low level codec features.
A codec feature profile alone is unlikely to be uniquely identifying. Underlying codecs are often implemented entirely in software (be it part of the User Agent binary or part of the operating system), such that all users who run that software will have a common set of capabilities. Additionally, underlying codecs are often implemented with hardware acceleration, but such hardware is mass produced and devices of a particular class and manufacture date (e.g. flagship phones manufactured in 2020) will often have common capabilities. There will be outliers (some users can be running outdated versions of software codecs or use a rare mix of custom assembled hardware), but most of the time a given codec feature profile is shared by a large group of users.
Segmenting groups of users by codec feature profile still amounts to a bit of entropy that can be combined with other metrics to uniquely identify a user. User Agents MAY partially mitigate this by returning an error whenever a site attempts to exhaustively probe for codec capabilities. Additionally, User Agents MAY implement a "privacy budget", which depletes as authors use WebCodecs and other identifying APIs. Upon exhaustion of the privacy budget, codec capabilities could be reduced to a common baseline or prompt for user approval.
14. Best Practices for Authors Using WebCodecs
This section is non-normative.
While WebCodecs internally operates on background threads, authors working with realtime media or in contended main thread environments are encouraged to ensure their media pipelines operate in worker contexts entirely independent of the main thread where possible. For example, realtime media processing of VideoFrame
s is generally to be done in a worker context.
The main thread has significant potential for high contention and jank that can go unnoticed in development, yet degrade inconsistently across devices and User Agents in the field -- potentially dramatically impacting the end user experience. Ensuring the media pipeline is decoupled from the main thread helps provide a smooth experience for end users.
Authors using the main thread for their media pipeline ought to be sure of their target frame rates, main thread workload, how their application will be embedded, and the class of devices their users will be using.
15. Acknowledgements
The editors would like to thank Alex Russell, Chris Needham, Dale Curtis, Dan Sanders, Eugene Zemtsov, Francois Daoust, Guido Urdaneta, Harald Alvestrand, Jan-Ivar Bruaroey, Jer Noble, Mark Foltz, Peter Thatcher, Steve Anton, Matt Wolenetz, Rijubrata Bhaumik, Thomas Guilbert, Tuukka Toivonen, and Youenn Fablet for their contributions to this specification. Thank you also to the many others who contributed to the specification, including through their participation on the mailing list and in the issues.
The Working Group dedicates this specification to our colleague Bernard Aboba.