Wednesday, 13 October 2010

URL Encoding

Some characters cannot be part of a URL (for example, the space), and others have a special meaning in a URL: for example, the character # can be used to specify a subsection (or fragment) of a document, and the character = is used to separate a name from a value. URL encoding is the conversion a query string may need in order to satisfy these constraints.

In particular, encoding the query string uses the following rules:

* Letters (A-Z and a-z), digits (0-9) and the characters '.', '-', '~' and '_' are left as-is
* SPACE is encoded as '+'
* All other characters are encoded as %HH, where HH is the two-digit hexadecimal value of the byte, with any non-ASCII characters first encoded as UTF-8 (or other specified encoding)
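The three rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name form_encode and the sample string are made up for this example, and the standard library function urllib.parse.quote_plus is used purely as a cross-check.

```python
from urllib.parse import quote_plus

# Characters that the rules above leave as-is.
AS_IS = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789.-~_"
)


def form_encode(text: str, encoding: str = "utf-8") -> str:
    """Hypothetical helper: apply the three query string rules by hand."""
    out = []
    for ch in text:
        if ch in AS_IS:
            out.append(ch)      # rule 1: left as-is
        elif ch == " ":
            out.append("+")     # rule 2: SPACE becomes '+'
        else:
            # rule 3: every byte of the encoded character becomes %HH
            out.extend(f"%{byte:02X}" for byte in ch.encode(encoding))
    return "".join(out)


sample = "name=Jos\u00e9 & co#1"    # made-up input containing '=', '&', '#' and a non-ASCII letter
print(form_encode(sample))
print(form_encode(sample) == quote_plus(sample))
```

In practice the library function is the safer choice; the hand-written loop is here only to make each rule visible.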

The octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.

The encoding of SPACE as '+' and the selection of "as-is" characters distinguish this encoding from RFC 1738.
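A small Python sketch may help to see both points: the tilde equivalence and the different treatment of the space. It assumes Python's urllib.parse functions as a stand-in for "generic" percent-encoding (quote) and for the query string encoding described here (quote_plus); the sample strings are made up.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# "%7E" and "~" decode to the same character.
print(unquote("/%7Euser/") == unquote("/~user/"))

# Generic percent-encoding writes a space as %20,
# the query string encoding writes it as '+'.
generic = quote("a b")
form_style = quote_plus("a b")
print(generic, form_style)

# The decoder has to know which convention was used:
# only the "plus" variant turns '+' back into a space.
print(unquote("a+b"), "|", unquote_plus("a+b"))
```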

Technically, the form content is only encoded as a query string when the form submission method is GET. The same encoding is used by default when the submission method is POST, but the result is not sent as a query string; that is, it is not appended to the action URL of the form. Instead, the string is sent as the body of the request.
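The difference between the two submission methods can be illustrated by building (but not sending) two requests in Python. This is a sketch only: the action URL and the field names are hypothetical, and nothing here touches the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

ACTION_URL = "http://example.com/search"        # hypothetical form action
fields = {"q": "url encoding", "lang": "en"}    # hypothetical form fields

encoded = urlencode(fields)     # the same encoded string is used in both cases

# GET: the encoded string is appended to the action URL as the query string.
get_request = Request(ACTION_URL + "?" + encoded, method="GET")

# POST: the action URL stays untouched and the encoded string is the body.
post_request = Request(
    ACTION_URL,
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```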

http://en.wikipedia.org/wiki/Query_string

http://en.wikipedia.org/wiki/URL_encoding

There are two built-in methods in ASP.NET that can be used to encode a string or URL: Server.URLEncode() and Server.URLPathEncode().

Server.URLPathEncode method
URL-encodes the path portion of a URL string and returns the encoded string. It leaves the query string, if present, unchanged.
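The idea of "encode the path, leave the query string alone" can be sketched in Python. This is a rough analogue and not the ASP.NET implementation: the helper name encode_path_only is hypothetical, and the exact set of characters the real method escapes may differ.

```python
from urllib.parse import quote


def encode_path_only(url: str) -> str:
    """Hypothetical helper: percent-encode everything before the first '?'."""
    path, separator, query = url.partition("?")
    # '/' and ':' are kept so that path separators and a scheme survive;
    # the query string is passed through exactly as it was given.
    return quote(path, safe="/:") + separator + query


print(encode_path_only("/my docs/report 1.html?title=a b"))
```

Note that the space inside the query string is still there afterwards, so a string treated this way may need its query values encoded separately.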

Server.URLEncode method
The URLEncode method applies URL encoding rules, including escape characters, to a specified string.

URLEncode converts characters as follows:
* Spaces ( ) are converted to plus signs (+).
* Non-alphanumeric characters are escaped to their hexadecimal representation.

Browser URL encoding and website request validation
