
Showing content from http://www.egenix.com/www2002/python/eGenix-mx-Extensions-v2.x.html/mxURL.html below:

eGenix.com: Website 2002: Python: mxURL

To simplify and speed up handling URLs the package provides a new type to work with them in an object oriented way.

URL objects can be added to each other, and a URL object can also be the right-hand operand when added to a string (string + URL). Both cases yield a joined URL object. The join semantics depend on the URL schemes and their attributes.
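Where mx.URL is not installed, the general idea of joining can be tried with the Python 3 standard library. The sketch below uses `urllib.parse.urljoin` as a stand-in; it is not the mxURL `+` operator, its rules may differ in detail, and the example URLs are made up.

```python
from urllib.parse import urljoin

base = "http://example.com/docs/index.html"

# A relative reference replaces the last path segment of the base.
sibling = urljoin(base, "intro.html")

# ".." climbs one directory before the new segment is appended.
parent = urljoin(base, "../about.html")

print(sibling)
print(parent)
```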

Predefined URL schemes

These schemes are predefined by the module. The function register_scheme() (see above) allows adding new ones or changing the behaviour for predefined ones.

The uses_* fields are integers, 0 or 1, describing what each scheme supports. When a feature is set to 0, the corresponding field is left out while parsing the URL, and characters that would normally act as separators for that field are not treated as separators.

uses_relative is important when joining URLs. Only URLs whose scheme has uses_relative set to 1 will have their paths joined according to the common rules.

Note that the URL object constructors will raise a ValueError exception for unknown schemes they find in the construction string.
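To make the effect of the feature flags concrete, here is a minimal sketch of a scheme table driving a parser. Everything in it is hypothetical: the `SCHEMES` table, its flag values, the `split_url()` helper, and the field names `uses_query` and `uses_fragment` (only `uses_relative` is named in the text above). It handles only URLs that carry an explicit scheme and is not the mxURL parser.

```python
from typing import NamedTuple, Tuple


class SchemeFeatures(NamedTuple):
    uses_query: int      # assumed field name
    uses_fragment: int   # assumed field name
    uses_relative: int


# Hypothetical feature table; the flag values are illustrative only.
SCHEMES = {
    "http": SchemeFeatures(1, 1, 1),
    "mailto": SchemeFeatures(0, 0, 0),
}


def split_url(url: str) -> Tuple[str, str, str, str]:
    """Split url into (scheme, rest, query, fragment) using the table."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme not in SCHEMES:
        # Mirrors the documented behaviour of rejecting unknown schemes.
        raise ValueError("unknown scheme in %r" % url)
    features = SCHEMES[scheme]
    query = fragment = ""
    # A separator only counts if the scheme enables the matching field.
    if features.uses_fragment and "#" in rest:
        rest, fragment = rest.split("#", 1)
    if features.uses_query and "?" in rest:
        rest, query = rest.split("?", 1)
    return scheme, rest, query, fragment


print(split_url("http://example.com/p?q=1#top"))
print(split_url("mailto:someone?subject#x"))
```

With both flags at 0, the `?` and `#` in the second URL stay part of the remaining text instead of starting a query or fragment.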

URL Constructors

These constructors are available in the package:

URL(url): Create a new URL object from url. Takes either a string or another URL as argument. The url is stored normalized.
RawURL(url): Create a new URL object from url. Takes either a string or another URL as argument. The url is not normalized but stored as-is.
BuildURL(scheme='', netloc='', path='', params='', query='', fragment=''): Create a new URL object from the given parameters. The url is stored normalized. This constructor can handle keywords.

Normalizing means that unnecessary relative components and slashes are removed from the URL prior to storing it. The stored URL will always be equivalent to the one given to the constructor.
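As a rough illustration of what normalizing does to a path, the following sketch collapses relative components and duplicate slashes with the Python 3 standard library. The `normalize()` helper is hypothetical and only approximates the idea; for instance, `posixpath.normpath` also drops a trailing slash, which mxURL may or may not do.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Return url with '.', '..' and repeated slashes removed from its path."""
    parts = urlsplit(url)
    # normpath("") would return ".", so an empty path is left untouched.
    path = posixpath.normpath(parts.path) if parts.path else ""
    return urlunsplit(parts._replace(path=path))


print(normalize("http://example.com/a/../b//c/./d.html?x=1"))
```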

Note: The URL type uses a scheme feature dictionary to figure out how to parse different schemes. Use add_scheme() to access this dictionary.

URL Instance Methods

A URL instance url defines these methods:

depth()

Return the depth of the URL. Depth is only defined if the URL is normalized and absolute. If the URL is not absolute or contains relative components (e.g. /a/../b/), an error is raised. The top level has depth 0.

normalized()

Return a new URL object pointing to the same URL but normalized.

parsed()

Return a tuple (scheme, netloc, path, params, query, fragment) just as urlparse.urlparse() does.
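For reference, this is the six-field tuple produced by the standard library function named above. In Python 3 it lives in `urllib.parse` rather than in the `urlparse` module; the example URL is made up.

```python
from urllib.parse import urlparse

# The result unpacks like a plain 6-tuple.
scheme, netloc, path, params, query, fragment = urlparse(
    "http://www.example.com/a/b.html;p?q=1#top"
)

print(scheme, netloc, path, params, query, fragment)
```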

basic()

Return a new URL object pointing to the same base URL, but without the parts params, query and fragment.

If the URL already fulfills this requirement, a new reference to it is returned.
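The effect of stripping params, query and fragment can be sketched with the Python 3 standard library. The `basic()` function below is a hypothetical string-based stand-in, not the mxURL method, and it always builds a new string rather than returning a reference to the original.

```python
from urllib.parse import urlparse, urlunparse


def basic(url: str) -> str:
    """Return url without its params, query and fragment parts."""
    parts = urlparse(url)
    return urlunparse(parts._replace(params="", query="", fragment=""))


print(basic("http://example.com/a/b.html;p?q=1#top"))
```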

relative(baseURL)

Return a new URL object that when joined with baseURL results in the same URL as the object itself.

URL and baseURL must both be absolute URLs for this to work. An exception is raised otherwise. The base URL should also provide a scheme and netloc; otherwise joining may lose the scheme information. If only the URL provides a scheme, then the returned relative URL will also include that scheme.

Parameters, fragment and query of the URL object are preserved; only the path is made relative and the netloc removed (relative paths and netlocs don't go together).

In case both URLs provide schemes and/or netlocs that point to different resources, the method simply returns a new reference to the object.
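The round trip described here (make relative, then join again) can be sketched as follows. The `relative()` helper is a hypothetical approximation built on `posixpath.relpath` and `urllib.parse`; it covers only the case where both URLs carry a scheme and netloc, and the exception type is an assumption.

```python
import posixpath
from urllib.parse import urljoin, urlsplit, urlunsplit


def relative(url: str, base: str) -> str:
    """Return a reference that, joined with base, gives url again."""
    u, b = urlsplit(url), urlsplit(base)
    if not (u.path.startswith("/") and b.path.startswith("/")):
        raise ValueError("both URLs must have absolute paths")
    if (u.scheme, u.netloc) != (b.scheme, b.netloc):
        # Different resources: hand back the URL unchanged.
        return url
    # Relative references resolve against the directory of the base path.
    base_dir = posixpath.dirname(b.path)
    rel_path = posixpath.relpath(u.path, base_dir)
    # Query and fragment are kept; scheme and netloc are dropped.
    return urlunsplit(("", "", rel_path, u.query, u.fragment))


base = "http://example.com/a/x/index.html"
target = "http://example.com/a/b/c.html?q=1#top"
rel = relative(target, base)
print(rel)
print(urljoin(base, rel) == target)
```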

rebuild(scheme='', netloc='', path='', params='', query='', fragment='')

Return a new URL object created from the given parameters and the URL object. This method can handle keywords.

Arguments not given are taken unchanged from the URL object.
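Keyword-based rebuilding can be imitated on plain strings as shown below. The `rebuild()` function here is a hypothetical stand-in written against `urllib.parse`, not the mxURL method; parts that are not passed as keywords are carried over unchanged.

```python
from urllib.parse import urlparse, urlunparse

FIELDS = {"scheme", "netloc", "path", "params", "query", "fragment"}


def rebuild(url: str, **changes: str) -> str:
    """Return url with the given parts replaced; other parts are kept."""
    unknown = set(changes) - FIELDS
    if unknown:
        raise TypeError("unexpected keyword(s): %s" % ", ".join(sorted(unknown)))
    return urlunparse(urlparse(url)._replace(**changes))


print(rebuild("http://example.com/a/b.html?q=1", scheme="https", path="/c.html"))
```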

pathentry(index)

Return the path entry at position index.

index may be negative to indicate an entry counted from the right (with -1 being the rightmost entry). An IndexError is raised in case the index lies out of range. Leading and trailing slashes are not counted.

pathlen()

Return the path length as defined by the .pathentry() method.

Leading and trailing slashes are not counted.

pathtuple()

Return the path as a tuple of strings.

Leading and trailing slashes are ignored and the slashes are not included.
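The three path methods fit together as in the sketch below. The functions are hypothetical string-based equivalents written with the Python 3 standard library; they follow the rules stated above (leading and trailing slashes are not counted, negative indexes count from the right) but are not the mxURL implementations.

```python
from typing import Tuple
from urllib.parse import urlsplit


def pathtuple(url: str) -> Tuple[str, ...]:
    """Return the path entries of url, ignoring leading/trailing slashes."""
    trimmed = urlsplit(url).path.strip("/")
    return tuple(trimmed.split("/")) if trimmed else ()


def pathlen(url: str) -> int:
    """Return the number of path entries."""
    return len(pathtuple(url))


def pathentry(url: str, index: int) -> str:
    """Return one path entry; raises IndexError when out of range."""
    return pathtuple(url)[index]


url = "http://example.com/a/b/c.html"
print(pathtuple(url), pathlen(url), pathentry(url, -1))
```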

URL Instance Variables

A URL instance url provides access to these (read-only) variables:

absolute: Is the URL absolute? Returns 1 or 0.
base: Base part of the URL's path: everything excluding a possibly given file with extension.
ext: Extension (without dot) of the file given in the URL converted to lowercase letters.
file: File name of the file pointed to by the URL. Directories are not included.
fragment: Fragment part of the URL without the '#'.
netloc: Network location as given in the URL without the leading '//' and possibly terminating '/'. Username and password are included if given (//user:passwd@host:port/).
host: Hostname included in the network location part of the URL (//user:passwd@host:port/) or ''.
user: Username included in the network location part of the URL (//user:passwd@host:port/) or ''.
passwd: Password included in the network location part of the URL (//user:passwd@host:port/) or ''.
port: Port included in the network location part of the URL (//user:passwd@host:port/) as integer.
params: Parameter section of the URL without the ';'.
path: Path as given in the URL. If the URL contains a netloc part this will always start with a '/'.
scheme: Access scheme without the terminating ':'.
string, url: The complete URL as string.
absolute: 1 iff the object's path is absolute; 0 otherwise.
normal: 1 iff the object's path has been normalized; 0 otherwise.
mimetype: The MIME type as string ("major/minor") or "*/*" if it cannot be determined. The package uses the types_map dictionary of the Python standard library's mimetypes module as the basis for finding out the MIME type. You can add entries to that dictionary at runtime to adapt the mechanism to your needs.
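Several of these attributes have close counterparts in the Python 3 standard library, which makes it easy to see what each one holds. The `describe()` helper below is hypothetical and only approximates the list above: how `base` is cut and what `port` holds when no port is given are assumptions, and the example URL is made up.

```python
import mimetypes
import posixpath
from urllib.parse import urlparse


def describe(url: str) -> dict:
    """Collect mxURL-like attribute values for url in a dictionary."""
    parts = urlparse(url)
    file = posixpath.basename(parts.path)
    # Extension without the dot, lowercased.
    ext = posixpath.splitext(file)[1].lstrip(".").lower()
    return {
        "scheme": parts.scheme,
        "netloc": parts.netloc,
        "user": parts.username or "",
        "passwd": parts.password or "",
        "host": parts.hostname or "",
        "port": parts.port,  # int, or None when no port is given
        "path": parts.path,
        "params": parts.params,
        "fragment": parts.fragment,
        "file": file,
        "ext": ext,
        # Assumption: base is the path with the file name cut off.
        "base": parts.path[: len(parts.path) - len(file)],
        # Direct lookup in types_map, falling back to "*/*".
        "mimetype": mimetypes.types_map.get("." + ext, "*/*"),
    }


info = describe("http://user:secret@www.example.com:8080/docs/Index.HTML;v=1#top")
for name, value in info.items():
    print(name, "=", repr(value))
```

Because the lookup goes through `mimetypes.types_map`, adding an entry such as `mimetypes.types_map[".xyz"] = "application/x-xyz"` at runtime changes the result for that extension in this sketch as well.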
