URI is a module providing classes to handle Uniform Resource Identifiers (RFC2396).

Uniform way of handling URIs.
Flexibility to introduce custom URI schemes.
Flexibility to have an alternate URI::Parser (or just different patterns and regexps).
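  require 'uri'

  uri = URI("http://foo.com/posts?id=30&limit=5#time=1305298413")
  uri.scheme    #=> "http"
  uri.host      #=> "foo.com"
  uri.path      #=> "/posts"
  uri.query     #=> "id=30&limit=5"
  uri.fragment  #=> "time=1305298413"
  uri.to_s      #=> "http://foo.com/posts?id=30&limit=5#time=1305298413"

Adding custom URIs

  module URI
    class RSYNC < Generic
      DEFAULT_PORT = 873
    end
    register_scheme 'RSYNC', RSYNC
  end

  URI.scheme_list  # now includes "RSYNC" => URI::RSYNC
  uri = URI("rsync://rsync.foo.com")
  #=> #<URI::RSYNC rsync://rsync.foo.com>

RFC References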
A good place to view an RFC spec is www.ietf.org/rfc.html.
Here is a list of all related RFCs:

Class tree
- URI::Generic (in uri/generic.rb)
  - URI::File - (in uri/file.rb)
  - URI::FTP - (in uri/ftp.rb)
  - URI::HTTP - (in uri/http.rb)
    - URI::HTTPS - (in uri/https.rb)
  - URI::LDAP - (in uri/ldap.rb)
    - URI::LDAPS - (in uri/ldaps.rb)
  - URI::MailTo - (in uri/mailto.rb)
- URI::Parser - (in uri/common.rb)
- URI::REGEXP - (in uri/common.rb)
  - URI::REGEXP::PATTERN - (in uri/common.rb)
- URI::Util - (in uri/common.rb)
- URI::Error - (in uri/common.rb)
  - URI::InvalidURIError - (in uri/common.rb)
  - URI::InvalidComponentError - (in uri/common.rb)
  - URI::BadURIError - (in uri/common.rb)
Akira Yamada <akira@ruby-lang.org>
Akira Yamada <akira@ruby-lang.org> Dmitry V. Sabanin <sdmitry@lrn.ru> Vincent Batts <vbatts@hashbangbash.com>
Copyright © 2001 akira yamada <akira@ruby-lang.org> You can redistribute it and/or modify it under the same term as Ruby.
decode_uri_component(str, enc=Encoding::UTF_8)

Decodes given str of URL-encoded data.

This does not decode + to SP.

  def self.decode_uri_component(str, enc=Encoding::UTF_8)
    _decode_uri_component(/%\h\h/, str, enc)
  end
decode_www_form(str, enc=Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false)

Decodes URL-encoded form data from given str.

This decodes application/x-www-form-urlencoded data and returns an array of key-value arrays.

This refers to url.spec.whatwg.org/#concept-urlencoded-parser, so it supports only the &-separator and doesn't support the ;-separator.

  ary = URI.decode_www_form("a=1&a=2&b=3")
  ary                    #=> [["a", "1"], ["a", "2"], ["b", "3"]]
  ary.assoc('a').last    #=> "1"
  ary.assoc('b').last    #=> "3"
  ary.rassoc('2').first  #=> "a" (rassoc looks up by value, not by key)
  Hash[ary]              #=> {"a"=>"2", "b"=>"3"}

See URI.decode_www_form_component, URI.encode_www_form.

  def self.decode_www_form(str, enc=Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false)
    raise ArgumentError, "the input of #{self.name}.#{__method__} must be ASCII only string" unless str.ascii_only?
    ary = []
    return ary if str.empty?
    enc = Encoding.find(enc)
    str.b.each_line(separator) do |string|
      string.chomp!(separator)
      key, sep, val = string.partition('=')
      if isindex
        if sep.empty?
          val = key
          key = +''
        end
        isindex = false
      end

      if use__charset_ and key == '_charset_' and e = get_encoding(val)
        enc = e
        use__charset_ = false
      end

      key.gsub!(/\+|%\h\h/, TBLDECWWWCOMP_)
      if val
        val.gsub!(/\+|%\h\h/, TBLDECWWWCOMP_)
      else
        val = +''
      end

      ary << [key, val]
    end
    ary.each do |k, v|
      k.force_encoding(enc)
      k.scrub!
      v.force_encoding(enc)
      v.scrub!
    end
    ary
  end
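As a rough illustration of the decoding described above, the sketch below parses a query string with a repeated key. The query string and the variable names are made up for the example; only URI.decode_www_form comes from this page.

```ruby
require 'uri'

# Hypothetical query string; note the repeated key "tag" and the empty value.
query = "q=ruby+uri&tag=stdlib&tag=net&empty="

# An array of [key, value] arrays; "+" is decoded to a space.
pairs = URI.decode_www_form(query)

# Converting to a Hash keeps only the last value of a repeated key.
last_wins = pairs.to_h

# Grouping by key keeps every value of a repeated key.
grouped = pairs.group_by(&:first).transform_values { |kv| kv.map(&:last) }

p pairs
p last_wins
p grouped
```

Whether to collapse repeated keys or group them depends on the form being parsed, which is presumably why the method returns an array of pairs rather than a Hash.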
decode_www_form_component(str, enc=Encoding::UTF_8)

encode_uri_component(str, enc=nil)

Encodes str using URL encoding.

This encodes SP to %20 instead of +.

  def self.encode_uri_component(str, enc=nil)
    _encode_uri_component(/[^*\-.0-9A-Z_a-z]/, TBLENCURICOMP_, str, enc)
  end
encode_www_form(enum, enc=nil)

Generates URL-encoded form data from given enum.

This generates application/x-www-form-urlencoded data defined in HTML5 from a given Enumerable object.

This internally uses URI.encode_www_form_component(str).

This method doesn't convert the encoding of given items, so convert them before calling this method if you want to send data in an encoding other than the original one, or as mixed-encoding data. (Strings which are encoded in an HTML5 ASCII incompatible encoding are converted to UTF-8.)

This method doesn't handle files. When you send a file, use multipart/form-data.

This refers to url.spec.whatwg.org/#concept-urlencoded-serializer.

  URI.encode_www_form([["q", "ruby"], ["lang", "en"]])
  #=> "q=ruby&lang=en"
  URI.encode_www_form("q" => "ruby", "lang" => "en")
  #=> "q=ruby&lang=en"
  URI.encode_www_form("q" => ["ruby", "perl"], "lang" => "en")
  #=> "q=ruby&q=perl&lang=en"
  URI.encode_www_form([["q", "ruby"], ["q", "perl"], ["lang", "en"]])
  #=> "q=ruby&q=perl&lang=en"

See URI.encode_www_form_component, URI.decode_www_form.

  def self.encode_www_form(enum, enc=nil)
    enum.map do |k,v|
      if v.nil?
        encode_www_form_component(k, enc)
      elsif v.respond_to?(:to_ary)
        v.to_ary.map do |w|
          str = encode_www_form_component(k, enc)
          unless w.nil?
            str << '='
            str << encode_www_form_component(w, enc)
          end
        end.join('&')
      else
        str = encode_www_form_component(k, enc)
        str << '='
        str << encode_www_form_component(v, enc)
      end
    end.join('&')
  end
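The three branches in the source above (nil value, array value, plain value) can be seen in a small sketch. The parameter names and values below are invented for illustration.

```ruby
require 'uri'

# Hypothetical form parameters covering each branch of encode_www_form.
params = {
  "q"    => ["ruby", "perl"],  # array value: the key is repeated per element
  "lang" => "en",              # plain value: key=value
  "flag" => nil,               # nil value: the key is emitted without "="
  "note" => "a b&c"            # space becomes "+", "&" becomes "%26"
}

encoded = URI.encode_www_form(params)
decoded = URI.decode_www_form(encoded)

puts encoded
p decoded
```

Note that the round trip is not lossless for nil: decoding yields an empty string for "flag", since the wire format has no way to represent a missing value.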
encode_www_form_component(str, enc=nil)

Encodes given str to URL-encoded form data.

This method doesn't convert *, -, ., 0-9, A-Z, _, a-z, but does convert SP (ASCII space) to + and converts others to %XX.

If enc is given, convert str to the encoding before percent encoding.

This is an implementation of www.w3.org/TR/2013/CR-html5-20130806/forms.html#url-encoded-form-data.

See URI.decode_www_form_component, URI.encode_www_form.

  def self.encode_www_form_component(str, enc=nil)
    _encode_uri_component(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_, str, enc)
  end
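A short sketch of the character rules stated above, plus the effect of the enc argument. The sample strings are arbitrary, and ISO-8859-1 is just one encoding chosen for the demonstration.

```ruby
require 'uri'

# Unreserved characters (* - . _ and alphanumerics) pass through unchanged,
# the space becomes "+", and everything else becomes %XX.
raw = "a b*c-d.e_f/g?h=i&j"
encoded = URI.encode_www_form_component(raw)
round_trip = URI.decode_www_form_component(encoded)

# With enc given, the string is transcoded before percent encoding,
# so the same character can produce different bytes.
e_acute = "\u00E9"
as_utf8   = URI.encode_www_form_component(e_acute)
as_latin1 = URI.encode_www_form_component(e_acute, "ISO-8859-1")

puts encoded
puts as_utf8
puts as_latin1
```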
for(scheme, *arguments, default: Generic)

Constructs a URI instance, using the scheme to detect the appropriate class from URI.scheme_list.

  def self.for(scheme, *arguments, default: Generic)
    const_name = scheme.to_s.upcase

    uri_class = INITIAL_SCHEMES[const_name]
    uri_class ||= if /\A[A-Z]\w*\z/.match?(const_name) && Schemes.const_defined?(const_name, false)
      Schemes.const_get(const_name, false)
    end
    uri_class ||= default

    return uri_class.new(scheme, *arguments)
  end
join(*str)

Synopsis

  URI::join(str[, str, ...])

Args

str
  String(s) to work with, will be converted to RFC3986 URIs before merging.

Joins URIs.

Usage

  require 'uri'

  URI.join("http://example.com/","main.rbx")
  # => #<URI::HTTP http://example.com/main.rbx>
  URI.join('http://example.com', 'foo')
  # => #<URI::HTTP http://example.com/foo>
  URI.join('http://example.com', '/foo', '/bar')
  # => #<URI::HTTP http://example.com/bar>
  URI.join('http://example.com', '/foo', 'bar')
  # => #<URI::HTTP http://example.com/bar>
  URI.join('http://example.com', '/foo/', 'bar')
  # => #<URI::HTTP http://example.com/foo/bar>

  def self.join(*str)
    RFC3986_PARSER.join(*str)
  end
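The usage lines above differ only in leading and trailing slashes, which is where most surprises with join come from. The sketch below isolates those three cases; the variable names are illustrative.

```ruby
require 'uri'

base = 'http://example.com'

# A leading slash makes the reference absolute: it replaces the whole path.
absolute = URI.join(base, '/foo', '/bar')

# Without a trailing slash on "/foo", "bar" replaces the last segment "foo".
sibling = URI.join(base, '/foo', 'bar')

# With a trailing slash, "foo/" is kept and "bar" is appended beneath it.
nested = URI.join(base, '/foo/', 'bar')

puts absolute, sibling, nested
```

If a base path is meant to act as a directory, ending it with "/" before joining is usually the safer choice.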
open(name, *rest, &block)

Allows the opening of various resources including URIs.

If the first argument responds to the 'open' method, 'open' is called on it with the rest of the arguments.

If the first argument is a string that begins with (protocol)://, it is parsed by URI.parse. If the parsed object responds to the 'open' method, 'open' is called on it with the rest of the arguments.

Otherwise, Kernel#open is called.

OpenURI::OpenRead#open provides URI::HTTP#open, URI::HTTPS#open and URI::FTP#open, Kernel#open.

We can accept URIs and strings that begin with http://, https:// and ftp://. In these cases, the opened file object is extended by OpenURI::Meta.

Calls superclass method

  def self.open(name, *rest, &block)
    if name.respond_to?(:open)
      name.open(*rest, &block)
    elsif name.respond_to?(:to_str) &&
          %r{\A[A-Za-z][A-Za-z0-9+\-\.]*://} =~ name &&
          (uri = URI.parse(name)).respond_to?(:open)
      uri.open(*rest, &block)
    else
      super
    end
  end
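The dispatch order described above can be exercised without network access. This is a minimal sketch: FakeResource is a hypothetical class invented for the example, and the temporary file stands in for any local path.

```ruby
require 'open-uri'
require 'tempfile'

# Hypothetical object that responds to #open; URI.open delegates to it
# and passes along the remaining arguments.
class FakeResource
  def open(*args)
    "opened with #{args.inspect}"
  end
end

delegated = URI.open(FakeResource.new, "r")

# A plain file path does not begin with "(protocol)://", so the call
# falls through to Kernel#open.
file_contents = Tempfile.create("uri-open-demo") do |tmp|
  tmp.write("hello")
  tmp.flush
  URI.open(tmp.path) { |f| f.read }
end

puts delegated
puts file_contents
```

Because of the fallback to Kernel#open, passing untrusted strings to URI.open deserves the same care as passing them to Kernel#open.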
parse(uri)

Synopsis

  URI::parse(uri_str)

Args

Description

Creates an instance of one of URI's subclasses from the string.

Raises

URI::InvalidURIError
  Raised if the URI given is not a correct one.

  require 'uri'

  uri = URI.parse("http://www.ruby-lang.org/")
  uri.scheme  #=> "http"
  uri.host    #=> "www.ruby-lang.org"

It's recommended to first ::escape the provided uri_str if there are any invalid URI characters.

  def self.parse(uri)
    RFC3986_PARSER.parse(uri)
  end
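Since parse raises URI::InvalidURIError on malformed input, callers that handle user-supplied strings often wrap it. The helper name safe_parse below is hypothetical and not part of the URI module.

```ruby
require 'uri'

# Hypothetical helper: returns a URI object, or nil when the string
# is not a valid URI.
def safe_parse(str)
  URI.parse(str)
rescue URI::InvalidURIError
  nil
end

good = safe_parse("http://www.ruby-lang.org/")
bad  = safe_parse("http://exa mple.com/")  # the space makes this invalid

p good.class
p good.host
p bad
```

The returned class depends on the scheme, so an "http" string yields a URI::HTTP instance with its default port filled in.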
regexp(schemes = nil)

Synopsis

  URI::regexp([match_schemes])

Args

match_schemes
  Array of schemes. If given, the resulting regexp matches URIs whose scheme is one of the match_schemes.

Returns a Regexp object which matches URI-like strings. The Regexp object returned by this method includes an arbitrary number of capture groups (parentheses). Never rely on their number.

  require 'uri'

  # extract the first URI from html_string
  html_string.slice(URI.regexp)

  # remove the first ftp URI
  html_string.sub(URI.regexp(['ftp']), '')

  # do not rely on the number of capture groups; use $& instead
  html_string.scan(URI.regexp) do |*matches|
    p $&
  end

  def self.regexp(schemes = nil)
    warn "URI.regexp is obsolete", uplevel: 1 if $VERBOSE
    DEFAULT_PARSER.make_regexp(schemes)
  end
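As a sketch of the match_schemes argument, the following collects whole matches with $& rather than capture groups, as the text above advises. The sample text is invented, and as the source shows, the method warns that it is obsolete when $VERBOSE is set.

```ruby
require 'uri'

# Hypothetical text containing one ftp URI and one http URI.
text = "Mirror at ftp://ftp.example.org/pub or docs at http://example.com/docs today"

# Restrict matching to ftp URIs; $& holds the whole match.
ftp_only = []
text.scan(URI.regexp(['ftp'])) { ftp_only << $& }

# Restrict matching to http and https URIs.
web_only = []
text.scan(URI.regexp(%w[http https])) { web_only << $& }

p ftp_only
p web_only
```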
register_scheme(scheme, klass)

Registers the given klass to be instantiated when parsing URLs with the given scheme. Note that currently only schemes which after .upcase are valid constant names can be registered (no -/+/. allowed).

  def self.register_scheme(scheme, klass)
    Schemes.const_set(scheme.to_s.upcase, klass)
  end
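A minimal sketch of registering a scheme and observing its effect, reusing the RSYNC example from the top of this page. It assumes a version of the uri library that provides register_scheme, as documented here.

```ruby
require 'uri'

module URI
  # Scheme class following the RSYNC example on this page.
  class RSYNC < Generic
    DEFAULT_PORT = 873
  end
  register_scheme 'RSYNC', RSYNC
end

uri = URI("rsync://rsync.foo.com")

p uri.class                    # the registered class is instantiated
p uri.port                     # taken from DEFAULT_PORT when the URI has none
p URI.scheme_list.key?("RSYNC")
```

Because the scheme name becomes a constant name, a scheme containing "-", "+" or "." cannot be registered this way, as noted above.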
scheme_list()

Returns a Hash of the defined schemes.

  def self.scheme_list
    Schemes.constants.map { |name|
      [name.to_s.upcase, Schemes.const_get(name)]
    }.to_h
  end
split(uri)

Synopsis

  URI::split(uri)

Args

Description

Splits the string into the following parts and returns an array with the result:
Scheme
Userinfo
Host
Port
Registry
Path
Opaque
Query
Fragment
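Because the nine parts listed above are returned positionally, destructuring the array is a convenient way to name them. The URL below is a made-up example chosen so that most parts are present.

```ruby
require 'uri'

# Destructure the nine-element array in the order listed above.
scheme, userinfo, host, port, registry, path, opaque, query, fragment =
  URI.split("http://user@example.com:8080/docs?q=1#top")

p scheme: scheme, userinfo: userinfo, host: host, port: port
p registry: registry, path: path, opaque: opaque
p query: query, fragment: fragment
```

Parts that are absent from the string come back as nil, so registry and opaque are nil for an ordinary hierarchical URL like this one.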
  require 'uri'

  URI.split("http://www.ruby-lang.org/")
  # => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]

  def self.split(uri)
    RFC3986_PARSER.split(uri)
  end

Private Class Methods
_decode_uri_component(regexp, str, enc)

  def self._decode_uri_component(regexp, str, enc)
    raise ArgumentError, "invalid %-encoding (#{str})" if /%(?!\h\h)/.match?(str)
    str.b.gsub(regexp, TBLDECWWWCOMP_).force_encoding(enc)
  end

_encode_uri_component(regexp, table, str, enc)

  def self._encode_uri_component(regexp, table, str, enc)
    str = str.to_s.dup
    if str.encoding != Encoding::ASCII_8BIT
      if enc && enc != Encoding::ASCII_8BIT
        str.encode!(Encoding::UTF_8, invalid: :replace, undef: :replace)
        str.encode!(enc, fallback: ->(x){"&##{x.ord};"})
      end
      str.force_encoding(Encoding::ASCII_8BIT)
    end
    str.gsub!(regexp, table)
    str.force_encoding(Encoding::US_ASCII)
  end