Showing content from https://www.w3.org/TR/2012/WD-FileAPI-20121025 below:

File API

Abstract

This specification provides an API for representing file objects in web applications, as well as programmatically selecting them and accessing their data. This includes:

Additionally, this specification defines objects to be used within threaded web applications for the synchronous reading of files.

The section on Requirements and Use Cases [REQ] covers the motivation behind this specification.

This API is designed to be used in conjunction with other APIs and elements on the web platform, notably: XMLHttpRequest (e.g. with an overloaded send() method for File or Blob objects), postMessage, DataTransfer (part of the drag and drop API defined in [HTML]) and Web Workers. Additionally, it should be possible to programmatically obtain a list of files from the input element when it is in the File Upload state [HTML]. These kinds of behaviors are defined in the appropriate affiliated specifications.

1. Introduction

This section is informative.

Web applications should have the ability to manipulate as wide a range of user input as possible, including files that a user may wish to upload to a remote server or manipulate inside a rich web application. This specification defines the basic representations for files, lists of files, errors raised by access to files, and programmatic ways to read files. It also defines an interface that represents "raw data" which can be asynchronously processed on the main thread of conforming user agents. The interfaces and API defined in this specification can be used with other interfaces and APIs exposed to the web platform.

The File interface represents file data typically obtained from the underlying file system, and the Blob interface ("Binary Large Object", a name originally introduced to web APIs in Google Gears) represents immutable raw data. File or Blob reads should happen asynchronously on the main thread, with an optional synchronous API used within threaded web applications. An asynchronous API for reading files prevents blocking and UI "freezing" on a user agent's main thread. This specification defines an asynchronous API based on an event model to read and access a File or Blob's data. A FileReader object provides asynchronous read methods to access that file's data through event handler attributes and the firing of events. The use of events and event handlers lets separate code blocks monitor the progress of the read (which is particularly useful for remote drives or mounted drives, where file access performance may vary from local drives) and handle error conditions that may arise while reading a file.

In the example below, different code blocks handle progress, error, and success conditions.

ECMAScript


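function startRead() {
  // Obtain the input element through the DOM
  var file = document.getElementById('file').files[0];
  if (file) {
    getAsText(file);
  }
}

function getAsText(readFile) {
  var reader = new FileReader();

  // Attach handlers for progress, success, and errors
  // before starting the read
  reader.onprogress = updateProgress;
  reader.onload = loaded;
  reader.onerror = errorHandler;

  // Read the file into memory as UTF-16 text
  reader.readAsText(readFile, "UTF-16");
}

function updateProgress(evt) {
  if (evt.lengthComputable) {
    // evt.loaded and evt.total are ProgressEvent properties
    var fraction = (evt.loaded / evt.total);
    if (fraction < 1) {
      // Increase the length of a progress bar here
    }
  }
}

function loaded(evt) {
  // Obtain the read file data
  var fileString = evt.target.result;

  // utils.regexp.isChinese is an application-defined helper,
  // not part of this API
  if (utils.regexp.isChinese(fileString)) {
    // Handle text containing Chinese characters here
  }
  else {
    // Handle other text here
  }
}

function errorHandler(evt) {
  if (evt.target.error.name == "NotReadableError") {
    // The file could not be read
  }
}
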
2. Conformance

Everything in this specification is normative except for examples and sections marked as being informative.

The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “RECOMMENDED”, “MAY” and “OPTIONAL” in this document are to be interpreted as described in Key words for use in RFCs to Indicate Requirement Levels [RFC2119].

The following conformance classes are defined by this specification:

conforming user agent

A user agent is considered to be a conforming user agent if it satisfies all of the MUST-, REQUIRED- and SHALL-level criteria in this specification that apply to implementations. This specification uses both the terms "conforming user agent" and "user agent" to refer to this product class.

User agents may implement algorithms in this specification in any way desired, so long as the end result is indistinguishable from the result that would be obtained from the specification's algorithms.

User agents that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL] as this specification uses that specification and terminology.

3. Dependencies

This specification relies on underlying specifications.

DOM

A conforming user agent must support at least the subset of the functionality defined in DOM4 that this specification relies upon; in particular, it must support EventTarget. [DOM4]

Progress Events

A conforming user agent must support the Progress Events specification. Data access on read operations is enabled via Progress Events.[ProgressEvents]

HTML

A conforming user agent must support at least the subset of the functionality defined in HTML that this specification relies upon; in particular, it must support event loops and event handler attributes. [HTML]

Web IDL

A conforming user agent must also be a conforming implementation of the IDL fragments in this specification, as described in the Web IDL specification. [WebIDL]

Typed Arrays

A conforming user agent must support the Typed Arrays specification [TypedArrays].

Parts of this specification rely on the Web Workers specification; for those parts of this specification, the Web Workers specification is a normative dependency. [Workers]

4. Terminology and Algorithms

The terms and algorithms <fragment>, <scheme>, document, unloading document cleanup steps, event handler attributes, event handler event type, origin, same origin, event loops, task, task source, URL, provide a stable state, queue a task, neuter, UTF-8, UTF-16, structured clone and collect a sequence of characters are as defined by the HTML specification [HTML].

When this specification says to terminate an algorithm the user agent must terminate the algorithm after finishing the step it is on. Asynchronous read methods defined in this specification may return before the algorithm in question is terminated, and can be terminated by an abort() call.

The term throw in this specification, as it pertains to exceptions, is used as defined in the DOM4 specification [DOM4].

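The algorithms and steps in this specification use the following mathematical operations: max(a, b), which returns the larger of a and b, and min(a, b), which returns the smaller of a and b.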

5. The FileList Interface

This interface is a list of File objects.

IDL


    interface FileList {
      getter File? item(unsigned long index);
      readonly attribute unsigned long length;
    };
    

Sample usage typically involves DOM access to the <input type="file"> element within a form, and then accessing selected files.

ECMAScript


    
    
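    // uploadData is a form element
    // fileChooser is an input element of type 'file'
    var file = document.forms['uploadData']['fileChooser'].files[0];

    // An equivalent alternative uses the item() method:
    // var file = document.forms['uploadData']['fileChooser'].files.item(0);

    if(file)
    {
      // Perform file operations here
    }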
    
5.1. Attributes
length

must return the number of files in the FileList object. If there are no files, this attribute must return 0.

5.2. Methods and Parameters
item(index)

must return the indexth File object in the FileList. If there is no indexth File object in the FileList, then this method must return null.

index must be treated by user agents as value for the position of a File object in the FileList, with 0 representing the first file. Supported property indices [WebIDL] are the numbers in the range zero to one less than the number of File objects represented by the FileList object. If there are no such File objects, then there are no supported property indices [WebIDL].

The HTMLInputElement interface [HTML] has a readonly attribute of type FileList, which is what is being accessed in the above example. Other interfaces with a readonly attribute of type FileList include the DataTransfer interface [HTML].
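
The item(index) and length requirements from sections 5.1 and 5.2 can be illustrated with a small model. This is only a sketch: FileListSketch is a hypothetical name, it wraps a plain array rather than real File objects, and it is not how a user agent implements FileList.

```javascript
// Hypothetical model of the FileList contract; not a real FileList.
function FileListSketch(files) {
  // Copy the array so later changes to the caller's array are not visible.
  this._files = files.slice();
  // length is the number of entries, 0 when there are none.
  this.length = this._files.length;
}

// item(index) returns the entry at that position, or null when there is
// no entry at that position.
FileListSketch.prototype.item = function (index) {
  if (index >= 0 && index < this._files.length) {
    return this._files[index];
  }
  return null;
};

// Plain objects stand in for File objects in this sketch.
var list = new FileListSketch([{ name: "a.txt" }, { name: "b.txt" }]);
var first = list.item(0);   // the first entry
var missing = list.item(2); // null, because only indices 0 and 1 exist
```

Checking the return value against null, as the sample usage above does with if(file), covers the case where the user selected no file.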

6. Line Endings

Web applications will need to manipulate data across platforms with different character representations for line endings, also known as newline, line break or end-of-line (EOL) markers signifying the end of a line of text. Unix platforms use LF (e.g. "\n" or U+000A in Unicode) whereas other platforms like Windows use CR+LF (e.g. "\r\n" or U+000D followed by U+000A in Unicode). An EOL Marker in this specification is either U+000A or U+000D followed by the character U+000A.

While particularly useful for DOMString types and text files, a consideration of line endings is also important when creating binary data within the web application that can be consumed by the underlying platform, which could be Unix-based or Windows-based. This interface provides an API to always convert line endings to the appropriate type, depending on the underlying platform.

IDL


  [NoInterfaceObject] interface LineEndings {

      DOMString toNativeLineEndings(DOMString string);
  };

  Window implements LineEndings;
  WorkerGlobalScope implements LineEndings;

  
6.1. Methods and Parameters
toNativeLineEndings

Returns a new DOMString based on the string input parameter, with EOL Markers expected by the underlying platform, and must behave as below:

  1. Let input be set to string.

  2. Let position be a pointer into input, initially pointing to the start of the string.

  3. Let result be the empty string.

  4. Let native line ending be the character U+000A, or the character U+000D followed by the character U+000A, as determined by the underlying platform's conventions. While position is not past the end of input, run the following sub-steps:

    1. Collect a sequence of characters that are not U+000A or U+000D.

    2. Add the string collected in the previous step to result. If position is past the end of input return result and exit this algorithm.

    3. If the character at position is U+000D, and the character at position+1 is U+000A, advance position by two characters, and add native line ending to result.

    4. If the character at position is U+000A, add native line ending to result and advance position by one character.

These examples assume that the toNativeLineEndings function is available on the global object.

ECMAScript



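// Build a string that ends with a Unix-style line ending
var s = "This is a sentence.";
s += "\n";

// Convert the line ending to the one the underlying platform expects
var sl = toNativeLineEndings(s);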
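
The conversion steps in section 6.1 can be sketched in script as follows. This is an illustration under stated assumptions, not the user agent's implementation: toNativeLineEndingsSketch is a hypothetical name, and the native line ending is passed in as a second argument instead of being detected from the platform. The steps above do not say what happens to a U+000D that is not followed by U+000A; the sketch assumes it is copied through unchanged.

```javascript
// Hypothetical sketch of the conversion steps; the native line ending is
// an explicit argument here so the result does not depend on the platform.
function toNativeLineEndingsSketch(string, nativeLineEnding) {
  var input = string;
  var position = 0;
  var result = "";
  while (position < input.length) {
    // Sub-step 1: collect characters that are not U+000A or U+000D.
    var start = position;
    while (position < input.length &&
           input[position] !== "\n" && input[position] !== "\r") {
      position++;
    }
    // Sub-step 2: append the collected run; stop at the end of input.
    result += input.slice(start, position);
    if (position >= input.length) {
      return result;
    }
    if (input[position] === "\r" && input[position + 1] === "\n") {
      // Sub-step 3: CR+LF becomes one native line ending.
      result += nativeLineEnding;
      position += 2;
    } else if (input[position] === "\n") {
      // Sub-step 4: a bare LF becomes one native line ending.
      result += nativeLineEnding;
      position += 1;
    } else {
      // Assumption: a lone CR is not an EOL Marker, so copy it unchanged.
      result += input[position];
      position += 1;
    }
  }
  return result;
}

var mixed = "first\r\nsecond\nthird";
var asWindows = toNativeLineEndingsSketch(mixed, "\r\n");
var asUnix = toNativeLineEndingsSketch(mixed, "\n");
```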


7. The Blob Interface

This interface represents immutable raw data. It provides a method to slice data objects between ranges of bytes into further chunks of raw data. It also provides an attribute representing the size of the chunk of data. The File interface inherits from this interface.

IDL


    [Constructor, 
     Constructor(sequence<(ArrayBuffer or ArrayBufferView or Blob or DOMString)> blobParts, optional BlobPropertyBag options)] 
    interface Blob {
      
      readonly attribute unsigned long long size;
      readonly attribute DOMString type;
      
      
      
      Blob slice(optional long long start,
                 optional long long end,
                 optional DOMString contentType);
      void close(); 
    
    };

    dictionary BlobPropertyBag {
		
      DOMString type = "";
	
    };
    
7.1. Constructors

The Blob() constructor can be invoked with zero, one or two parameters. When the Blob() constructor is invoked, user agents must run the following steps:

  1. If invoked with zero parameters, return a new Blob object consisting of 0 bytes, with size set to 0, and with type set to the empty string.

  2. Otherwise, the constructor is invoked with a blobParts sequence. Let a be that sequence.

  3. Let bytes be an empty sequence of bytes.

  4. Let length be a's length. For 0 ≤ i < length, repeat the following steps:

    1. Let element be the ith element of a.

    2. If element is a DOMString, run the following substeps:

      1. Let s be the result of converting element to a sequence of Unicode characters [Unicode] using the algorithm for doing so in WebIDL [WebIDL].

      2. Encode s as UTF-8 and append the resulting bytes to bytes.

      The algorithm from WebIDL [WebIDL] replaces unmatched surrogates in an invalid UTF-16 string with U+FFFD replacement characters. Scenarios exist when the Blob constructor may result in some data loss due to lost or scrambled character sequences.

    3. If element is an ArrayBufferView [TypedArrays], convert it to a sequence of byteLength bytes from the underlying ArrayBuffer, starting at the byteOffset of the ArrayBufferView [TypedArrays], and append those bytes to bytes.

    4. If element is an ArrayBuffer [TypedArrays], convert it to a sequence of byteLength bytes, and append those bytes to bytes.

    5. If element is a Blob, append the bytes it represents to bytes. The type of the Blob array element is ignored.

  5. Return a Blob object consisting of bytes, with its size set to the length of bytes, and its type set to the type member of the options argument, if used.
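
The byte-gathering part of these steps can be sketched as a plain function. This is a hedged illustration, not the constructor itself: concatBlobParts is a hypothetical name, it returns a Uint8Array rather than a Blob, and it leaves out Blob elements because their bytes cannot be read synchronously from script. It assumes that TextEncoder, which belongs to a separate encoding API and not to this specification, is available to perform the UTF-8 encoding.

```javascript
// Hypothetical sketch of steps 3 and 4: gather the bytes of each part.
function concatBlobParts(blobParts) {
  var encoder = new TextEncoder(); // TextEncoder always produces UTF-8
  var chunks = [];
  var total = 0;
  for (var i = 0; i < blobParts.length; i++) {
    var element = blobParts[i];
    var chunk;
    if (typeof element === "string") {
      // Strings are encoded as UTF-8; unmatched surrogates become U+FFFD.
      chunk = encoder.encode(element);
    } else if (ArrayBuffer.isView(element)) {
      // Views contribute byteLength bytes starting at their byteOffset.
      chunk = new Uint8Array(element.buffer, element.byteOffset, element.byteLength);
    } else if (element instanceof ArrayBuffer) {
      // Buffers contribute all of their bytes.
      chunk = new Uint8Array(element);
    } else {
      throw new TypeError("this sketch does not handle this kind of part");
    }
    chunks.push(chunk);
    total += chunk.length;
  }
  // Copy every chunk, in order, into one contiguous byte sequence.
  var bytes = new Uint8Array(total);
  var offset = 0;
  for (var j = 0; j < chunks.length; j++) {
    bytes.set(chunks[j], offset);
    offset += chunks[j].length;
  }
  return bytes;
}

var buffer = new ArrayBuffer(8);
new Uint8Array(buffer).set([1, 2, 3, 4, 5, 6, 7, 8]);
var view = new Uint16Array(buffer, 2, 2); // covers bytes 3, 4, 5, 6
var combined = concatBlobParts(["hi", view, buffer]);
```

Note that the view contributes only the four bytes it covers, while the buffer contributes all eight, which is the distinction steps 4.3 and 4.4 draw.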

7.1.1. Constructor Parameters

The Blob() constructor can be invoked with the parameters below:

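A blobParts sequence
which takes any number of the following types of elements, and in any order: ArrayBuffer [TypedArrays] elements, ArrayBufferView [TypedArrays] elements, Blob elements, and DOMString [WebIDL] elements.
An optional BlobPropertyBag
which takes one member: type, the ASCII-encoded string in lower case representing the media type of the Blob.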

Examples of constructor usage follow.

ECMAScript




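// Create a new, empty Blob object
var a = new Blob();

// Create a 1024-byte ArrayBuffer
// (buffer could also come from reading a File)
var buffer = new ArrayBuffer(1024);

// Create ArrayBufferView objects based on buffer:
// shorts covers bytes 512 to 767, bytes covers 768 to 1023
var shorts = new Uint16Array(buffer, 512, 128);
var bytes = new Uint8Array(buffer, shorts.byteOffset + shorts.byteLength);

// A text Blob with platform line endings and an explicit media type
var b = new Blob([toNativeLineEndings("foobarbazetcetc" + "birdiebirdieboo")], {type: "text/plain;charset=UTF-8"});

// Blobs built from a mix of Blob and ArrayBufferView parts
var c = new Blob([b, shorts]);

var e = new Blob([b, c, bytes]);

// A Blob that also includes the whole ArrayBuffer
var d = new Blob([buffer, b, c, bytes]);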

7.2. Snapshot State

Each Blob must have a snapshot state, which must be initially set to the state of the underlying storage, if any such underlying storage exists. The snapshot state must be preserved through structured clone. If, at the time of processing any read method on the Blob, the state of the underlying storage containing the Blob is not equal to snapshot state, the read must fail with a NotReadableError.

Snapshot state is a conceptual marker most useful for File objects backed by on-disk resources.

7.3. Attributes
size

Returns the size of the Blob object in bytes. On getting, conforming user agents must return the total number of bytes that can be read by a FileReader or FileReaderSync object, or 0 if the Blob has no bytes to be read. If the Blob has been neutered with close called on it, then size must return 0.

type

The ASCII-encoded string in lower case representing the media type of the Blob, expressed as an RFC2046 MIME type [RFC2046]. On getting, conforming user agents must return the MIME type of the Blob, if it is known. If conforming user agents cannot determine the media type of the Blob, they must return the empty string. A string is a valid MIME type if it matches the media-type token defined in section 3.7 "Media Types" of RFC 2616 [RFC2616]. If not the empty string, user agents must treat it as an RFC2616 media-type [RFC2616], and as an opaque string that can be ignored if it is an invalid media-type. This value must be used as the Content-Type header when dereferencing a Blob URI.

7.4. Methods and Parameters

7.4.1. The slice method

The slice method returns a new Blob object with bytes ranging from the optional start parameter up to but not including the optional end parameter, and with a type attribute that is the value of the optional contentType parameter. It must act as follows:

  1. Let O be the Blob object on which the slice method is being called.

  2. The optional start parameter is a value for the start point of a slice call, and must be treated as a byte-order position, with the zeroth position representing the first byte. User agents must process slice with start normalized according to the following:

    1. If the optional start parameter is not used as a parameter when making this call, let relativeStart be 0.

    2. If start is negative, let relativeStart be max((size + start), 0).

    3. Else, let relativeStart be min(start, size).

    4. This defines the normalization of the start parameter. When processing the slice call, user agents must normalize start to relativeStart.

  3. The optional end parameter is a value for the end point of a slice call. The following requirements are normative for this parameter, and user agents must process slice with end normalized according to the following:

    1. If the optional end parameter is not used as a parameter when making this call, let relativeEnd be size.

    2. If end is negative, let relativeEnd be max((size + end), 0).

    3. Else, let relativeEnd be min(end, size).

    4. This defines the normalization of the end parameter. When processing the slice call, user agents must normalize end to relativeEnd.

  4. The optional contentType parameter is used to set a value identical to one that is set with the HTTP/1.1 Content-Type header [RFC2616] on the Blob returned by the slice call. The following requirements are normative for this parameter, and user agents must process the slice with contentType normalized according to the following:

    1. If the contentType parameter is not provided, let relativeContentType be set to the empty string.

    2. Else let relativeContentType be set to contentType. User agents must treat it as an RFC2616 media-type [RFC2616], and as an opaque string that can be ignored if it is an invalid media-type. This value must be used as the Content-Type header when dereferencing a Blob URI.

    3. This defines the normalization of the contentType parameter. When processing the slice call, user agents must normalize contentType to relativeContentType.

  5. Let span be max((relativeEnd - relativeStart), 0).

  6. Return a new Blob object S with the following characteristics:

    1. S consists of span consecutive bytes from O, beginning with the byte at byte-order position relativeStart.

    2. S.size = span.

    3. S.type = relativeContentType.

    4. Let the snapshot state of S be set to the snapshot state of O.
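
The normalization of start and end in steps 2, 3 and 5 is plain arithmetic and can be sketched without a Blob. normalizeSlice is a hypothetical helper that takes the size of O and the two optional positions and returns the values named in the steps; it is an illustration, not part of the API.

```javascript
// Hypothetical helper mirroring steps 2, 3 and 5 of the slice algorithm.
function normalizeSlice(size, start, end) {
  var relativeStart;
  if (start === undefined) {
    relativeStart = 0;                          // start not supplied
  } else if (start < 0) {
    relativeStart = Math.max(size + start, 0);  // counted from the end
  } else {
    relativeStart = Math.min(start, size);      // clamped to size
  }

  var relativeEnd;
  if (end === undefined) {
    relativeEnd = size;                         // end not supplied
  } else if (end < 0) {
    relativeEnd = Math.max(size + end, 0);      // counted from the end
  } else {
    relativeEnd = Math.min(end, size);          // clamped to size
  }

  // span is never negative, even when relativeEnd precedes relativeStart.
  var span = Math.max(relativeEnd - relativeStart, 0);
  return { relativeStart: relativeStart, relativeEnd: relativeEnd, span: span };
}

// For a 100-byte Blob:
var whole = normalizeSlice(100);         // start 0, end 100, span 100
var lastHalf = normalizeSlice(100, -50); // start 50, end 100, span 50
var tooShort = normalizeSlice(100, 0, -150); // end clamps to 0, span 0
```

The last call shows why a slice such as slice(0, -150) yields an empty Blob when the source holds fewer than 150 bytes.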

The examples below illustrate the different types of slice calls possible. Since the File interface inherits from the Blob interface, examples are based on the use of the File interface.

ECMAScript


    
    
    var file = document.getElementById('file').files[0];
    if(file)
    {
      
      
      
      var fileClone = file.slice(); 
      var fileClone2 = file.slice(0, file.size);
      
      
      
      
      var fileChunkFromEnd = file.slice(-(Math.round(file.size/2)));
      
      
      
      var fileChunkFromStart = file.slice(0, Math.round(file.size/2));
      
      
      
      var fileNoMetadata = file.slice(0, -150, "application/experimental");      
    }
    
7.4.2. The close method

Calling the close method must permanently neuter the original Blob object. This is an irreversible and non-idempotent operation; once a Blob has been neutered, it cannot be used again; dereferencing a Blob URI bound to a Blob object on which close has been called results in a 500 Error. A neutered Blob must have a size of 0.

Calling close must not affect an ongoing read operation via any asynchronous read methods. Calling close must not affect any Blob objects created by a slice call on the Blob object on which close has been called. While Blob objects can be neutered via a call to close, they are not Transferable [HTML]. They are immutable, and thus invalidating them on the sending side is not useful; implementations can share Blob data between two threads without needing invalidation.

8. The File Interface

This interface describes a single file in a FileList and exposes its name. It inherits from Blob.

8.1. Attributes
name

The name of the file; on getting, this must return the name of the file as a string. There are numerous file name variations on different systems; this is merely the name of the file, without path information. On getting, if user agents cannot make this information available, they must return the empty string.

lastModifiedDate

The last modified date of the file. On getting, if user agents can make this information available, this must return a new Date [HTML] object initialized to the last modified date of the file. If the last modification date and time are not known, the attribute must return the current date and time as a Date object.

The File interface is available on objects that expose an attribute of type FileList; these objects are defined in HTML [HTML]. The File interface, which inherits from Blob, is immutable, and thus represents file data that can be read into memory at the time a read operation is initiated. User agents must process reads on files that no longer exist at the time of read as errors, throwing a NotFoundError exception if using a FileReaderSync on a Web Worker [Workers] or firing an error event with the error attribute returning a NotFoundError DOMError.

9. The FileReader Interface

This interface provides methods to read File objects or Blob objects into memory, and to access the data from those Files or Blobs using progress events and event handler attributes; it inherits from EventTarget [DOM4]. It is desirable to read data from file systems asynchronously in the main thread of user agents. This interface provides such an asynchronous API, and is specified to be used with the global object (Window [HTML]).

9.2. Constructors

When the FileReader() constructor is invoked, the user agent must return a new FileReader object.

In environments where the global object is represented by a Window or a WorkerGlobalScope object, the FileReader constructor must be available.

9.4. FileReader States

The FileReader object can be in one of 3 states. The readyState attribute, on getting, must return the current state, which must be one of the following values:

EMPTY (numeric value 0)

The FileReader object has been constructed, and there are no pending reads. None of the read methods have been called. This is the default state of a newly minted FileReader object, until one of the read methods has been called on it.

LOADING (numeric value 1)

A File or Blob is being read. One of the read methods is being processed, and no error has occurred during the read.

DONE (numeric value 2)

The entire File or Blob has been read into memory, OR a file error occurred during read, OR the read was aborted using abort(). The FileReader is no longer reading a File or Blob. If readyState is set to DONE it means at least one of the read methods has been called on this FileReader.
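
The three states and the transitions between them can be modelled with a few lines of script. This is a simplified sketch, not a FileReader: ReaderStateSketch, startRead and finish are hypothetical names, and the model only tracks readyState. It assumes, as the read methods in this specification state, that starting a read while one is in progress raises an InvalidStateError, and that a reader in the DONE state may start a new read.

```javascript
// Numeric values as given for the three states.
var EMPTY = 0, LOADING = 1, DONE = 2;

// Hypothetical model that tracks only the readyState transitions.
function ReaderStateSketch() {
  this.readyState = EMPTY; // state of a newly constructed reader
}

// Called when a read method is invoked.
ReaderStateSketch.prototype.startRead = function () {
  if (this.readyState === LOADING) {
    var err = new Error("a read is already in progress");
    err.name = "InvalidStateError";
    throw err;
  }
  this.readyState = LOADING;
};

// Called when the read completes, fails, or is aborted.
ReaderStateSketch.prototype.finish = function () {
  this.readyState = DONE;
};

var reader = new ReaderStateSketch(); // readyState is EMPTY
reader.startRead();                   // readyState is LOADING
reader.finish();                      // readyState is DONE
```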

9.5. Reading a File or Blob

9.5.2. The result attribute

On getting, the result attribute returns a Blob's data as a DOMString, or as an ArrayBuffer [TypedArrays], or null, depending on the read method that has been called on the FileReader, and any errors that may have occurred. It can also return partial Blob data. Partial Blob data is the part of the File or Blob that has been read into memory so far; when processing the read method readAsText, partial Blob data is a DOMString that grows as more bytes are loaded (a portion of the total) [ProgressEvents], and when processing readAsArrayBuffer partial Blob data is an ArrayBuffer [TypedArrays] object consisting of the bytes loaded so far (a portion of the total) [ProgressEvents]. The list below is normative for the result attribute and defines the conformance criteria for this attribute:

If a read is successful, the result attribute must return a non-null value only after a progress event (see also [ProgressEvents]) has fired, since all the read methods access Blob data asynchronously. Tasks are queued to update the result attribute as Blob data is made available.

9.5.4. The readAsText(blob, encoding) method

When the readAsText(blob, encoding) method is called (the encoding argument is optional), the user agent must run the steps below (unless otherwise indicated).

  1. If readyState = LOADING throw an InvalidStateError [DOM4] and terminate these steps.

    Note: The readAsText() method returns due to the algorithm being terminated.

  2. If the blob has been neutered through the close method, throw an InvalidStateError exception [DOM4] and terminate this algorithm.

    Note: The readAsText() method returns due to the algorithm being terminated.

  3. If an error occurs during reading the blob parameter, set readyState to DONE and set result to null. Proceed to the error steps below.

    1. Fire a progress event called error. Set the error attribute; on getting, the error attribute must be a DOMError object that indicates the kind of file error that has occurred.

    2. Fire a progress event called loadend.

    3. Terminate this algorithm.

      Note: The readAsText() method returns due to the algorithm being terminated.

  4. If no error has occurred, set readyState to LOADING.

  5. Fire a progress event called loadstart.

  6. Return the readAsText() method, but continue to process the steps in this algorithm.

  7. Make progress notifications.

  8. While processing the read, as data from the blob becomes available, user agents should queue tasks to update the result with partial Blob data represented as a string in a format determined by the encoding determination until the read is complete, to accompany the firing of progress events. On getting, the result attribute returns partial Blob data representing the number of bytes currently loaded (as a fraction of the total) [ProgressEvents], decoded into memory according to the encoding determination; user agents must return at least one such result while processing this read method. The last returned value is at completion of the read.

    Partial Blob data must be returned such that where possible, the bytes read thus far should be valid code points within the encoding; in particular, when executing the encoding determination for Partial Blob data, user agents must NOT return the U+FFFD character for bytes that are invalid within an encoding till the entire codepoint has been read. The following encoding caveat must be followed:

    Suppose a file resource is to be read in UTF-8, and in hexadecimal the bytes in this file are E3 83 91 E3 83 91, which is effectively 0x30D1 0x30D1. Suppose the first 5 bytes have been read. The result returned here must be 0x30D1 and have result.length == 1, and NOT be 0x30D1 0xFFFD with result.length == 2. Even though the trailing E3 83 is not a valid code point in UTF-8 at the fifth byte, user agents must NOT return a result with such invalid code points replaced with U+FFFD until it can be determined definitively that the code point is invalid.

  9. When the blob has been read into memory fully, set readyState to DONE.

  10. Terminate this algorithm.
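
The encoding caveat in step 8 can be observed with a streaming decoder. The sketch below uses TextDecoder with the stream option, which is an assumption: TextDecoder belongs to a separate encoding API and is not defined or required by this specification. It decodes the first five bytes of the example above and holds back the incomplete trailing sequence instead of emitting U+FFFD.

```javascript
// Assumption: TextDecoder (from a separate encoding API) is available.
var decoder = new TextDecoder("utf-8");

// The first five of the six bytes E3 83 91 E3 83 91.
var firstFiveBytes = new Uint8Array([0xE3, 0x83, 0x91, 0xE3, 0x83]);

// With stream: true the decoder buffers the incomplete trailing sequence
// (E3 83) rather than replacing it with U+FFFD.
var partialResult = decoder.decode(firstFiveBytes, { stream: true });

// The final byte completes the second character; calling decode without
// the stream option marks the end of the input.
var remainder = decoder.decode(new Uint8Array([0x91]));

var fullResult = partialResult + remainder;
```

Here partialResult holds the single character U+30D1, matching the requirement that the partial result have a length of 1.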

9.5.5. The readAsArrayBuffer(blob) method

When the readAsArrayBuffer(blob) method is called, the user agent must run the steps below (unless otherwise indicated).

  1. If readyState = LOADING throw an InvalidStateError exception [DOM4] and terminate these steps.

    Note: The readAsArrayBuffer() method returns due to the algorithm being terminated.

  2. If the blob has been neutered through the close method, throw an InvalidStateError exception [DOM4] and terminate this algorithm.

    Note: The readAsArrayBuffer() method returns due to the algorithm being terminated.

  3. If an error occurs during reading the blob parameter, set readyState to DONE and set result to null. Proceed to the error steps below.

    1. Fire a progress event called error. Set the error attribute; on getting, the error attribute must be a DOMError that indicates the kind of file error that has occurred.

    2. Fire a progress event called loadend.

    3. Terminate this algorithm.

      Note: The readAsArrayBuffer() method returns due to the algorithm being terminated.

  4. If no error has occurred, set readyState to LOADING.

  5. Fire a progress event called loadstart.

  6. Return the readAsArrayBuffer() method, but continue to process the steps in this algorithm.

  7. Make progress notifications.

  8. While processing the read, as data from the blob becomes available, user agents should queue tasks to update the result with partial Blob data as an ArrayBuffer [TypedArrays] object containing the bytes read until the read is complete, to accompany the firing of progress events. On getting, the result attribute returns partial Blob data representing the number of bytes currently loaded (as a fraction of the total) [ProgressEvents], as an ArrayBuffer [TypedArrays] object; user agents must return at least one such ArrayBuffer [TypedArrays] while processing this read method. The last returned value is at completion of the read.

  9. When the blob has been read into memory fully, set readyState to DONE.

  10. Terminate this algorithm.

9.5.8. Determining Encoding

When reading blob objects using the readAsText() read method, the following encoding determination steps must be followed:

  1. Let charset be null

  2. If the encoding argument is present, and is the name or alias of a character set used on the Internet [IANACHARSET], let charset be set to the value of the encoding parameter.

  3. If charset is null, then for each of the rows in the following table, starting with the first one and going down, if the first bytes of blob match the bytes given in the first column, then let charset be the encoding given in the cell in the second column of that row. If there is no match then charset remains null.

    Bytes in Hexadecimal    Description
    FE FF                   UTF-16BE BOM
    FF FE                   UTF-16LE BOM
    EF BB BF                UTF-8 BOM

  4. If charset is null, and the blob argument's type attribute is present, and its Charset Parameter [RFC2046] is the name or alias of a character set used on the Internet [IANACHARSET], let charset be set to its Charset Parameter.

    If blob has a type attribute of text/plain;charset=UTF-8 then charset is UTF-8.

  5. If charset is null let charset be UTF-8.

  6. Return the result of decoding the blob using charset; on getting, the result attribute of the FileReader object returns a string in charset format. The synchronous readAsText method of the FileReaderSync object returns a string in charset format. Replace bytes or sequences of bytes that are not valid according to the charset with a single U+FFFD character [Unicode]. When processing Partial Blob Data, use the encoding caveat, if applicable.
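
Steps 1 to 5 of the encoding determination can be sketched as a function that returns the chosen charset name. This is an illustration under assumptions: determineCharset, isKnownCharset and KNOWN_CHARSETS are hypothetical, and the short list of names stands in for a real lookup against the registry of character sets [IANACHARSET]. The decoding in step 6 is left out.

```javascript
// Hypothetical stand-in for a lookup in the character set registry.
var KNOWN_CHARSETS = ["utf-8", "utf-16", "utf-16le", "utf-16be", "iso-8859-1", "us-ascii"];

function isKnownCharset(name) {
  return typeof name === "string" &&
         KNOWN_CHARSETS.indexOf(name.toLowerCase()) !== -1;
}

// True when bytes begins with every value in prefix.
function startsWithBytes(bytes, prefix) {
  if (bytes.length < prefix.length) return false;
  for (var i = 0; i < prefix.length; i++) {
    if (bytes[i] !== prefix[i]) return false;
  }
  return true;
}

function determineCharset(bytes, encoding, type) {
  var charset = null;                                   // step 1
  if (isKnownCharset(encoding)) charset = encoding;     // step 2
  if (charset === null) {                               // step 3: BOM table
    if (startsWithBytes(bytes, [0xFE, 0xFF])) charset = "UTF-16BE";
    else if (startsWithBytes(bytes, [0xFF, 0xFE])) charset = "UTF-16LE";
    else if (startsWithBytes(bytes, [0xEF, 0xBB, 0xBF])) charset = "UTF-8";
  }
  if (charset === null && type) {                       // step 4: type's charset
    var match = /;\s*charset=("?)([^";\s]+)\1/i.exec(type);
    if (match && isKnownCharset(match[2])) charset = match[2];
  }
  if (charset === null) charset = "UTF-8";              // step 5: default
  return charset;
}

var fromBom = determineCharset(new Uint8Array([0xFE, 0xFF, 0x00, 0x41]));
var fromType = determineCharset(new Uint8Array([0x41]), undefined, "text/plain;charset=ISO-8859-1");
var fallback = determineCharset(new Uint8Array([0x41]));
```

The order of the checks matters: an explicit encoding argument wins over a byte order mark, and a byte order mark wins over the charset parameter of the type attribute.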

9.5.9. Events

When this specification says to make progress notifications for a read method, the following steps must be followed:

  1. While the read method is processing, queue a task to fire a progress event called progress at the FileReader object about every 50ms or for every byte read into memory, whichever is least frequent. At least one event called progress must fire before load is fired, and at 100% completion of the read operation; if 100% of blob can be read into memory in less than 50ms, user agents must fire a progress event called progress at completion.

    If a given implementation uses buffers of size 65536 bytes to read files, and thus limits reading operations to that size, and a read method is called on a file that is 65537 bytes, then that implementation must fire one progress event for the 65536 first bytes, one progress event for the 65537th byte (which is at completion of read), one load event and one loadend event.

  2. When the data from the blob has been completely read into memory, queue a task to fire a progress event called load at the FileReader object.

  3. When the data from the blob has been completely read into memory, queue a task to fire a progress event called loadend at the FileReader object.
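
The 65537-byte example in step 1 can be reproduced with a small model of the event sequence. This is a sketch under a simplifying assumption: progressEventsFor is a hypothetical function that emits one progress event per filled buffer and ignores the 50ms timing rule, so it shows only the buffer-limited case described in that example.

```javascript
// Hypothetical model: one progress event per buffer read, then load and
// loadend. The 50ms timing rule is deliberately not modelled.
function progressEventsFor(totalBytes, bufferSize) {
  var events = [];
  var loaded = 0;
  while (loaded < totalBytes) {
    loaded = Math.min(loaded + bufferSize, totalBytes);
    events.push({ type: "progress", loaded: loaded, total: totalBytes });
  }
  if (events.length === 0) {
    // At least one progress event must fire before load.
    events.push({ type: "progress", loaded: 0, total: totalBytes });
  }
  events.push({ type: "load", loaded: totalBytes, total: totalBytes });
  events.push({ type: "loadend", loaded: totalBytes, total: totalBytes });
  return events;
}

// A 65537-byte file read through a 65536-byte buffer.
var sequence = progressEventsFor(65537, 65536);
var types = sequence.map(function (evt) { return evt.type; }).join(",");
```

For this input the model yields two progress events (at 65536 and 65537 bytes loaded), followed by load and loadend.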

When this specification says to fire a progress event called e (for some ProgressEvent e at a FileReader reader), the following are normative:

9.5.9.1. Event Summary

The following are the events that are fired at FileReader objects; firing events is defined in DOM Core [DOM4].

Event name   Interface       Fired when…
loadstart    ProgressEvent   When the read starts.
progress     ProgressEvent   While reading (and decoding) blob, and reporting partial Blob data (progress.loaded/progress.total).
abort        ProgressEvent   When the read has been aborted. For instance, by invoking the abort() method.
error        ProgressEvent   When the read has failed (see errors).
load         ProgressEvent   When the read has successfully completed.
loadend      ProgressEvent   When the request has completed (either in success or failure).

10. Reading on Threads

Web Workers allow for the use of synchronous File or Blob read APIs, since such reads on threads do not block the main thread. This section defines a synchronous API, which can be used within Workers [Web Workers]. Workers can avail of both the asynchronous API (the FileReader object) and the synchronous API (the FileReaderSync object).

10.1. The FileReaderSync Interface

This interface provides methods to synchronously read File or Blob objects into memory.

10.1.1. Constructors

When the FileReaderSync() constructor is invoked, the user agent must return a new FileReaderSync object.

In environments where the global object is represented by a WorkerGlobalScope object, the FileReaderSync constructor must be available.

10.1.2. The readAsText method

When the readAsText(blob, encoding) method is called (the encoding argument is optional), the following steps must be followed:

  1. If an error occurs during reading of the blob parameter, throw the appropriate exception. Terminate these overall steps.

  2. If no error has occurred, read blob into memory. Return the data contents of blob using the encoding determination algorithm.
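Because the synchronous method throws rather than firing an error event, callers typically wrap it in try/catch. The wrapper below is a sketch only: readTextOrReport is a hypothetical name, and the reader is passed in as an argument so that the same code can be exercised outside a worker. Inside a worker, reader would be a new FileReaderSync().

```javascript
// Hypothetical wrapper. In a worker, `reader` would be `new FileReaderSync()`.
function readTextOrReport(reader, blob, encoding) {
  try {
    return { ok: true, text: reader.readAsText(blob, encoding) };
  } catch (e) {
    // Synchronous reads throw; e.name is expected to be a name such as
    // "NotFoundError" or "NotReadableError".
    return { ok: false, errorName: e && e.name ? e.name : "UnknownError" };
  }
}
```

The returned object lets calling code branch on ok without a try/catch of its own; "UnknownError" is a placeholder used by this sketch, not an error type defined here.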

10.1.3. The readAsDataURL method

When the readAsDataURL(blob) method is called, the following steps must be followed:

  1. If an error occurs during reading of the blob parameter, throw the appropriate exception. Terminate these overall steps.

  2. If no error has occurred, read blob into memory. Return the data contents of blob as a Data URL [DataURL].
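To show the general shape of the returned value, the sketch below builds a Base64 Data URL from raw bytes and a media type. The helper toDataURL is hypothetical; in particular, emitting an empty media type when type is the empty string is an assumption of this sketch rather than a requirement stated here.

```javascript
// Hypothetical helper: build a Base64 Data URL from raw bytes and a media type.
function toDataURL(bytes, type) {
  var binary = "";
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  // btoa is available in browsers, workers, and recent Node.js releases.
  return "data:" + (type || "") + ";base64," + btoa(binary);
}
```

For the two bytes of "hi" and a type of text/plain, this yields data:text/plain;base64,aGk=.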

11. Errors and Exceptions

Error conditions can occur when reading files from the underlying filesystem. The list below of potential error conditions is informative.

11.1. Throwing an Exception or Returning an Error

This section is normative. Error conditions can arise when reading a file.

Synchronous read methods throw exceptions of the type in the table below if there has been an error with reading.

The error attribute of the FileReader object must return a DOMError object [DOM4] of the most appropriate type from the table below if there has been an error, and otherwise returns null.

NotFoundError
If the File or Blob resource could not be found at the time the read was processed, then for asynchronous read methods the error attribute must return a "NotFoundError" DOMError and synchronous read methods must throw a NotFoundError exception.

SecurityError
If:

then for asynchronous read methods the error attribute may return a "SecurityError" DOMError and synchronous read methods may throw a SecurityError exception.

This is a security error to be used in situations not covered by any other exception type.

NotReadableError
If the snapshot state of a File or a Blob does not match the state of the underlying storage, then for asynchronous read methods the error attribute must return a "NotReadableError" DOMError and synchronous read methods must throw a NotReadableError exception.

If the File or Blob cannot be read, typically due to permission problems that occur after a snapshot state has been established (e.g. concurrent lock on the underlying storage with another application), then for asynchronous read methods the error attribute must return a "NotReadableError" DOMError and synchronous read methods must throw a NotReadableError exception.

EncodingError
If the File or Blob cannot be encoded as Base64 for a Data URL [DataURL] owing to URL length limitations in implementations, then for asynchronous read methods the error attribute must return an "EncodingError" DOMError, and synchronous read methods must throw an EncodingError exception.

12. A URI for Blob and File reference

This section defines a scheme for a URI used to refer to Blob objects (and File objects).

12.1. Requirements for a New Scheme

This specification defines a scheme with URIs of the sort: blob:550e8400-e29b-41d4-a716-446655440000#aboutABBA. This section provides some requirements and is an informative discussion.

12.2. Discussion of Existing Schemes

This section is an informative discussion of existing schemes that may have been repurposed or reused for the use cases for URIs above, and justification for why a new scheme is considered preferable. These schemes include HTTP [RFC2616], file [RFC1630][RFC1738], and a scheme such as urn:uuid [RFC4122]. One broad consideration in determining what scheme to use is providing something with intuitive appeal to web developers.

12.3. Definition of blob URI Scheme

This section defines a blob: URI scheme using a formal grammar. A blob: URI consists of the blob: scheme and an opaque string, along with an optional fragment identifier. In this specification an opaque string is a unique string which can be heuristically generated upon demand such that the probability that two are alike is small, and which is hard to guess (e.g. the Universally Unique IDentifier (UUID) as defined in [RFC4122] is an opaque string). A fragment identifier is optional, and if used, has a distinct interpretation depending on the media type of the Blob or File resource in question [RFC2046].

This section uses the Augmented Backus-Naur Form (ABNF), defined in [RFC5234]. All blob: URLs must follow this ABNF.

ABNF


	blob = scheme ":" opaqueString [fragIdentifier]

	scheme = "blob"

	; scheme is always "blob"

	; opaqueString tokens must be globally unique
	; opaqueString could be a UUID in its canonical form

	
12.3.1. The Opaque String

Opaque strings must NOT include any reserved characters from [RFC3986] without percent-encoding them. Opaque strings must be globally unique. Such strings should only use characters in the ranges U+002A to U+002B, U+002D to U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005E to U+007E [Unicode], and should be at least 36 characters long. UUID is one potential option available to user agents for use with Blob URIs as opaque strings, and their use is strongly encouraged. UUIDs are defined in [RFC4122]. For an ABNF of UUID, see Appendix A.
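The recommended character ranges and minimum length can be checked mechanically. The sketch below is illustrative only: isRecommendedOpaqueString is a hypothetical name, and the check covers the shape of the string, not global uniqueness or how hard it is to guess.

```javascript
// Characters the paragraph above recommends for opaque strings:
// U+002A-002B, U+002D-002E, U+0030-0039, U+0041-005A, U+005E-007E,
// with a minimum length of 36 characters.
var RECOMMENDED_OPAQUE =
  /^[\u002A\u002B\u002D\u002E\u0030-\u0039\u0041-\u005A\u005E-\u007E]{36,}$/;

// Hypothetical helper: true if s fits the recommended shape.
function isRecommendedOpaqueString(s) {
  return typeof s === "string" && RECOMMENDED_OPAQUE.test(s);
}
```

A UUID in canonical form, such as 550e8400-e29b-41d4-a716-446655440000, passes this check, while a string containing "/" or ":" does not.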

12.4. Discussion of Fragment Identifier

The fragment's format, resolution and processing directives depend on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the blob: URI is dereferenced. For example, in an HTML file [HTML] the fragment identifier could be used to refer to an anchor within the file. If the user agent does not recognize the media type of the resource, OR if a fragment identifier is not meaningful within the resource, it must ignore the fragment identifier. Additionally, user agents must honor additional fragment processing directives given in the relevant media format specifications; in particular, this includes any modifications to the fragment production given in HTML [HTML]. The following section is normative for fragment identifiers in general, though it should be noted that affiliated specifications may extend this definition.

ABNF


	fragIdentifier = "#" fragment

	; Fragment Identifiers depend on the media type of the Blob
	; fragment is defined in [RFC3986]
	; fragment processing for HTML is defined in [HTML]

	fragment    = *( pchar / "/" / "?" )

	pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"

	unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"

	pct-encoded   = "%" HEXDIG HEXDIG

	sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
	                 / "*" / "+" / "," / ";" / "="
	

A valid Blob URI reference could look like: blob:550e8400-e29b-41d4-a716-446655440000#aboutABBA where "#aboutABBA" might be an HTML fragment identifier referring to an element with an id attribute of "aboutABBA".
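A Blob URI of this form can be split into its two parts with a short sketch. The helper splitBlobUri is hypothetical; it assumes a lowercase scheme and does not validate the fragment against the grammar above.

```javascript
// Hypothetical helper: split a blob: URI into its opaque string and fragment.
// Returns null if the input does not start with "blob:" followed by an
// opaque string.
function splitBlobUri(uri) {
  var match = /^blob:([^#]+)(?:#(.*))?$/.exec(uri);
  if (!match) {
    return null;
  }
  return {
    opaqueString: match[1],
    fragment: match[2] === undefined ? null : match[2]
  };
}
```

Applied to the reference above, the opaque string is 550e8400-e29b-41d4-a716-446655440000 and the fragment is aboutABBA.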

12.7. Dereferencing Model for Blob URIs

User agents must only support requests with GET [RFC2616]. If the Blob has a type attribute, or if the Blob has been created with a slice call which uses a contentType argument, responses to dereferencing the Blob URI must include the Content-Type header from HTTP [RFC2616] with the value of the type attribute or contentType argument. Responses to dereferencing the Blob URI must include the Content-Length header from HTTP [RFC2616] with the value of the size attribute. Specifically, responses must only support a subset of responses that are equivalent to the following from HTTP [RFC2616]:

12.7.2. 500 Error Condition

This response [RFC2616] must be used if:

This response may be accompanied by additional messages in the response indicating why the Blob resource could not be served. See blob: protocol examples.

The 500 Error Condition provides a response code, but not a fixed status. User agents MAY leave it as simply "500 Error Condition" or supply additional status information (e.g. "500 Origin Violation"). Implementers are strongly encouraged to provide messages to developers along with the response code.

12.7.3. Request and Response Headers

This section provides sample exchanges between web applications and user agents using the blob: protocol. A request can be triggered using HTML markup of the sort <img src="blob:550e8400-e29b-41d4-a716-446655440000">, after a web application calls URL.createObjectURL on a given blob, which returns blob:550e8400-e29b-41d4-a716-446655440000 to dereference that blob. These examples merely illustrate the protocol; web developers are not likely to interact with all the headers, but the getAllResponseHeaders() method of XMLHttpRequest, if used, will show relevant response headers [XHR2].

Requests could look like this:

If the blob has an affiliated media type [RFC2046] represented by its type attribute, then the response message should include the Content-Type header from RFC2616 [RFC2616]. See processing media types.

If there is a file error or any other kind of error associated with the blob, then a user agent can respond with a 500 Error Condition as the response message. This should also be used if any method other than GET is used to make the request.
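The header rules in this section can be summarized in a small model. This is a sketch under stated assumptions: blobResponse and the shape of entry are hypothetical, a missing entry stands in for any file error, and the 200 status used for the success case is an assumption of the sketch.

```javascript
// Hypothetical model of a dereference. `entry` is { type, size } for a
// blob that can be served, or null when it cannot.
function blobResponse(method, entry) {
  if (method !== "GET" || !entry) {
    // Non-GET requests and error cases map to the 500 Error Condition.
    return { status: 500, headers: {} };
  }
  var headers = { "Content-Length": String(entry.size) };
  if (entry.type) {
    // Content-Type is only included when the blob has a media type.
    headers["Content-Type"] = entry.type;
  }
  return { status: 200, headers: headers };
}
```

For example, a GET for an entry of { type: "image/png", size: 10 } carries both headers, while a POST for the same entry yields status 500.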

12.8. Creating and Revoking a Blob URI

Blob URIs are created and revoked using methods exposed on the URL object, supported by global objects Window [HTML] and WorkerGlobalScope [Web Workers]. Revocation of a Blob URI decouples the Blob URI from the resource it refers to, and if it is dereferenced after it is revoked, user agents must return a 500 response. This section describes a supplemental interface to the URL specification [URL API] and presents methods for Blob URI creation and revocation.

ECMAScript user agents of this specification must ensure that they do not expose a prototype property on the URL interface object unless the user agent also implements the URL [URL API] specification. In other words, URL.prototype must evaluate to true if the user agent implements the URL [URL API] specification, and must NOT evaluate to true otherwise.

12.8.1. Methods and Parameters
The createObjectURL static method

Returns a unique Blob URI each time it is called on a valid blob argument, which is a non-null Blob in scope of the global object's URL property from which this static method is called.

  1. If this method is called with a Blob argument that is NOT valid, then user agents must return null.

  2. If this method is called with a valid Blob argument, user agents must run the following steps:

    1. The optional objectURLOptions dictionary argument called options has a boolean member autoRevoke that defaults to true; if called without using the optional dictionary, or if called with autoRevoke set to true, execute the following sub-steps:
      1. Return a unique Blob URI that can be used to dereference the blob argument, and run the rest of this algorithm asynchronously.
      2. Provide a stable state.

        Note: Bug 19544 will determine this algorithmic step. What is marked here is provisional and therefore unreliable; final autoRevoke behavior will be determined by the solution to that bug.

      3. Revoke the Blob URI, which is equivalent to calling revokeObjectURL on it, and terminate this algorithm.
    2. If this method is called with the autoRevoke dictionary member set to false, return a unique Blob URI that can be used to dereference the blob argument.
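The algorithm above can be modelled in memory to make the autoRevoke behavior easier to follow. Everything in this sketch is hypothetical: createBlobUrlStore, reachStableState and isLive are illustrative names, the point at which a stable state is reached is left to the caller because that step is still provisional, and the identifier generator only imitates the shape of a UUID.

```javascript
// Hypothetical in-memory model; not how any user agent is implemented.
function createBlobUrlStore() {
  var live = {};               // Blob URI -> blob
  var pendingAutoRevoke = [];  // URIs to revoke at the next stable state

  // UUID-shaped identifier. Math.random is NOT hard to guess, so this only
  // imitates the shape of an opaque string.
  function makeOpaqueString() {
    return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, function (c) {
      var r = Math.floor(Math.random() * 16);
      return (c === "x" ? r : (r & 0x3) | 0x8).toString(16);
    });
  }

  return {
    createObjectURL: function (blob, options) {
      if (blob === null || blob === undefined) {
        return null;                      // step 1: argument is not valid
      }
      var url = "blob:" + makeOpaqueString();
      live[url] = blob;
      var autoRevoke = !options || options.autoRevoke !== false;
      if (autoRevoke) {
        pendingAutoRevoke.push(url);      // revoked at the next stable state
      }
      return url;
    },
    revokeObjectURL: function (url) {
      delete live[url];
    },
    // Stand-in for "provide a stable state"; the real trigger is unsettled.
    reachStableState: function () {
      pendingAutoRevoke.splice(0).forEach(function (url) {
        delete live[url];
      });
    },
    isLive: function (url) {
      return Object.prototype.hasOwnProperty.call(live, url);
    }
  };
}
```

In this model, a URI created with the default options stops being live once reachStableState() is called, whereas one created with {autoRevoke: false} stays live until revokeObjectURL is called on it.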

In the example below, after obtaining a reference to a Blob object (in this case, a user-selected File from the underlying file system), the static method URL.createObjectURL() is called on that Blob object.

ECMAScript


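	// A user-selected File, taken from a file input with id "file"
	var file = document.getElementById('file').files[0];
	if (file) {
	  // Mint a Blob URI for the File; myimg is assumed to reference an img element
	  var blobURLref = window.URL.createObjectURL(file);
	  myimg.src = blobURLref;
	}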
	
The revokeObjectURL static method

Revokes the Blob URI provided in the string url argument.

  1. If the url refers to a Blob that is both valid and in the same origin of the global object's URL property on which this static method was called, user agents must return a 500 response code when the url is dereferenced.
  2. If the url refers to a Blob that is NOT valid OR if the value provided for the url argument is not a Blob URI OR if the url argument refers to a Blob that is NOT in the same origin as the global object's URL property, this method call does nothing. User agents may display a message on the error console.

The url argument to the revokeObjectURL method is a Blob URI string.

In the example below, window1 and window2 are separate, but in the same origin; window2 could be an iframe [HTML] inside window1.

ECMAScript


	var myurl = window1.URL.createObjectURL(myblob);
	window2.URL.revokeObjectURL(myurl);
	

Since window1 and window2 are in the same origin, the URL.revokeObjectURL call ensures that subsequent dereferencing of myurl results in a 500 Error Condition response.

12.9. Examples of Blob URI Creation and Revocation

Blob URIs are strings that dereference Blob objects, and can persist for as long as the document from which they were minted using URL.createObjectURL() (see Lifetime of Blob URIs).

This section gives sample usage of creation and revocation of Blob URIs with explanations.

In the example below, two img elements [HTML] refer to the same Blob URI, which was minted for one-time use only using the boolean key in the optional dictionary argument:

ECMAScript


<script>url = URL.createObjectURL(blob); </script><script> img2.src=url;</script>

In the example above, the assignment in the second script element fails, since autoRevoke is true by default.

In the example below, URL.revokeObjectURL() is explicitly called.

ECMAScript


var blobURLref = URL.createObjectURL(file, {autoRevoke:false});
var img1 = new Image();
var img2 = new Image();

// Both assignments below work as expected

img1.src = blobURLref;
img2.src = blobURLref;

// ... Following body load
// Check if both images have loaded

if (img1.complete && img2.complete) {
	// Ensure that subsequent refs return a 500
	URL.revokeObjectURL(blobURLref);
}
else {
	msg("Images cannot be previewed!");

	// revoke the string-based reference
	URL.revokeObjectURL(blobURLref);
}

The example above allows multiple references to a single Blob URI, but revokes the Blob URI string after both image objects have been loaded. While not restricting the number of uses of the Blob URI offers more flexibility, it creates the potential for strings that linger after they are useful, especially in web applications where the document may persist for a while. Developers must explicitly set the autoRevoke dictionary member to false in order to enable this usage.

13. Security Considerations

This section is informative.

This specification allows web content to read files from the underlying file system, as well as provides a means for files to be accessed by unique identifiers, and as such is subject to some security considerations. This specification also assumes that the primary user interaction is with the <input type="file"/> element of HTML forms [HTML], and that all files that are being read by FileReader objects have first been selected by the user. Important security considerations include preventing malicious file selection attacks (selection looping), preventing access to system-sensitive files, and guarding against modifications of files on disk after a selection has taken place.

This section is provisional; more security data may supplement this in subsequent drafts.

14. Requirements and Use Cases

This section covers what the requirements are for this API, as well as illustrates some use cases. This version of the API does not satisfy all use cases; subsequent versions may elect to address these.

15. Appendix A

This section is informative and not normative.

15.1. An ABNF for UUID

The following is an informative ABNF [ABNF] for UUID, which is a strongly encouraged choice for the opaqueString production of Blob URIs.

ABNF


	UUID                   = time-low "-" time-mid "-"
	                         time-high-and-version "-"
	                         clock-seq-and-reserved
	                         clock-seq-low "-" node
	time-low               = 4hexOctet
	time-mid               = 2hexOctet
	time-high-and-version  = 2hexOctet
	clock-seq-and-reserved = hexOctet
	clock-seq-low          = hexOctet
	node                   = 6hexOctet
	hexOctet               = hexDigit hexDigit
	hexDigit =
	         "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
	         "a" / "b" / "c" / "d" / "e" / "f" /
	         "A" / "B" / "C" / "D" / "E" / "F"

	
16. Acknowledgements

This specification was originally developed by the SVG Working Group. Many thanks to Mark Baker and Anne van Kesteren for their feedback.

Thanks to Robin Berjon for editing the original specification.

Special thanks to Olli Pettay, Nikunj Mehta, Garrett Smith, Aaron Boodman, Michael Nordman, Jian Li, Dmitry Titov, Ian Hickson, Darin Fisher, Sam Weinig, Adrian Bateman and Julian Reschke.

Thanks to the W3C WebApps WG, and to participants on the public-webapps@w3.org listserv.

17. References

17.1. Normative references
RFC2119
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. IETF.
HTML
HTML5: A vocabulary and associated APIs for HTML and XHTML (work in progress), I. Hickson. W3C.
ProgressEvents
Progress Events, A. van Kesteren. W3C.
RFC2397
The "data" URL Scheme, L. Masinter. IETF.
Web Workers
Web Workers (work in progress), I. Hickson. W3C.
DOM4
DOM4 (work in progress), A. Gregor, A. van Kesteren, Ms2ger. W3C.
Unicode
The Unicode Standard, Version 5.2.0., J. D. Allen, D. Anderson, et al. Unicode Consortium.
RFC2616
Hypertext Transfer Protocol -- HTTP/1.1, R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee. IETF.
RFC2046
Multipurpose Internet Mail Extensions (MIME) Part Two: Media Extensions, N. Freed, N. Borenstein. IETF.
IANA Charsets
Official Names for Character Sets on the Internet, K. Simonsen, W.Choi, et al. IANA.
Typed Arrays
Typed Arrays (work in progress), V. Vukicevic, K. Russell. Khronos Group.
RFC5234
Augmented BNF for Syntax Specifications: ABNF, D. Crocker, P. Overell. IETF.
URL API Specification
URL API (work in progress), A. Barth. W3C.
WebIDL Specification
WebIDL (work in progress), C. McCormack.
ECMAScript
ECMAScript 5th Edition, A. Wirfs-Brock, P. Lakshman et al.
MIME Sniffing
MIME Sniffing (work in progress), A. Barth and I. Hickson.
17.2. Informative References
XMLHttpRequest
XMLHttpRequest Level 2 (work in progress), A. van Kesteren. W3C.
Google Gears Blob API
Google Gears Blob API (deprecated)
RFC4122
A Universally Unique IDentifier (UUID) URN Namespace, P. Leach, M. Mealling, R. Salz. IETF.
RFC3986
Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter. IETF.
RFC1630
Universal Resource Identifiers in WWW, T. Berners-Lee. IETF.
RFC1738
Uniform Resource Locators (URL), T. Berners-Lee, L. Masinter, M. McCahill. IETF.
WebRTC 1.0
WebRTC 1.0, A. Bergkvist, D. Burnett, C. Jennings, A. Narayanan. W3C.
