The url module provides utilities for URL resolution and parsing. It can be accessed using:
const url = require('url');
A URL string is a structured string containing multiple meaningful components. When parsed, a URL object is returned containing properties for each of these components.
The following details each of the components of a parsed URL. The example 'http://user:pass@host.com:8080/p/a/t/h?query=string#hash' is used to illustrate each.
┌─────────────────────────────────────────────────────────────────────────────┐
│                                    href                                     │
├──────────┬┬───────────┬─────────────────┬───────────────────────────┬───────┤
│ protocol ││   auth    │      host       │           path            │ hash  │
│          ││           ├──────────┬──────┼──────────┬────────────────┤       │
│          ││           │ hostname │ port │ pathname │     search     │       │
│          ││           │          │      │          ├─┬──────────────┤       │
│          ││           │          │      │          │ │    query     │       │
"  http:   // user:pass @ host.com : 8080   /p/a/t/h  ?  query=string   #hash "
│          ││           │          │      │          │ │              │       │
└──────────┴┴───────────┴──────────┴──────┴──────────┴─┴──────────────┴───────┘
(all spaces in the "" line should be ignored -- they are purely for formatting)
The auth property is the username and password portion of the URL, also referred to as "userinfo". This string subset follows the protocol and double slashes (if present) and precedes the host component, delimited by an ASCII "at sign" (@). The format of the string is {username}[:{password}], with the [:{password}] portion being optional.
For example: 'user:pass'
The hash property consists of the "fragment" portion of the URL including the leading ASCII hash (#) character.
For example: '#hash'
The host property is the full lower-cased host portion of the URL, including the port if specified.
For example: 'host.com:8080'
The hostname property is the lower-cased host name portion of the host component without the port included.
For example: 'host.com'
The href property is the full URL string that was parsed with both the protocol and host components converted to lower-case.
For example: 'http://user:pass@host.com:8080/p/a/t/h?query=string#hash'
The path property is a concatenation of the pathname and search components.
For example: '/p/a/t/h?query=string'
No decoding of the path is performed.
The pathname property consists of the entire path section of the URL. This is everything following the host (including the port) and before the start of the query or hash components, delimited by either the ASCII question mark (?) or hash (#) characters.
For example: '/p/a/t/h'
No decoding of the path string is performed.
The port property is the numeric port portion of the host component.
For example: '8080'
The protocol property identifies the URL's lower-cased protocol scheme.
For example: 'http:'
The query property is either the query string without the leading ASCII question mark (?), or an object returned by the querystring module's parse() method. Whether the query property is a string or object is determined by the parseQueryString argument passed to url.parse().
For example: 'query=string' or {'query': 'string'}
If returned as a string, no decoding of the query string is performed. If returned as an object, both keys and values are decoded.
The search property consists of the entire "query string" portion of the URL, including the leading ASCII question mark (?) character.
For example: '?query=string'
No decoding of the query string is performed.
The slashes property is a boolean with a value of true if two ASCII forward-slash characters (/) are required following the colon in the protocol.
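The component properties described above can be inspected directly on the object returned by url.parse(). The following is a minimal sketch using the same example URL; the components variable is only a hypothetical helper for display.

```javascript
const url = require('url');

// Parse the example URL used throughout this section.
const parsedUrl = url.parse('http://user:pass@host.com:8080/p/a/t/h?query=string#hash');

// Collect the documented properties in one object (illustrative helper only).
const components = {
  protocol: parsedUrl.protocol, // 'http:'
  slashes: parsedUrl.slashes,   // true
  auth: parsedUrl.auth,         // 'user:pass'
  host: parsedUrl.host,         // 'host.com:8080'
  hostname: parsedUrl.hostname, // 'host.com'
  port: parsedUrl.port,         // '8080' (a string, not a number)
  pathname: parsedUrl.pathname, // '/p/a/t/h'
  search: parsedUrl.search,     // '?query=string'
  query: parsedUrl.query,       // 'query=string'
  path: parsedUrl.path,         // '/p/a/t/h?query=string'
  hash: parsedUrl.hash          // '#hash'
};
console.log(components);
```

Note that port is reported as a string, so it should be converted explicitly before any numeric comparison.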
urlObject <Object> | <string> A URL object (as returned by url.parse() or constructed otherwise). If a string, it is converted to an object by passing it to url.parse().
The url.format() method returns a formatted URL string derived from urlObject.
If urlObject is not an object or a string, url.format() will throw a TypeError.
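As an illustration, the sketch below passes a hand-built object to url.format(). The component values are hypothetical and were chosen so that the missing colon, leading slash, and leading hash character have to be filled in during formatting.

```javascript
const url = require('url');

// Hypothetical component values chosen for illustration.
const formatted = url.format({
  protocol: 'https',                  // no trailing ':'; one is appended
  hostname: 'example.com',            // used because `host` is undefined
  pathname: 'some/path',              // no leading '/'; one is prepended
  query: { page: 1, format: 'json' }, // serialized because `search` is undefined
  hash: 'top'                         // no leading '#'; one is prepended
});
console.log(formatted);
  // Prints https://example.com/some/path?page=1&format=json#top

// A value that is neither an object nor a string is rejected.
let threwTypeError = false;
try {
  url.format(42);
} catch (err) {
  threwTypeError = err instanceof TypeError;
}
console.log(threwTypeError);
  // Prints true
```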
The formatting process operates as follows:
- A new empty string result is created.
- If urlObject.protocol is a string, it is appended as-is to result.
- Otherwise, if urlObject.protocol is not undefined and is not a string, an Error is thrown.
- For all string values of urlObject.protocol that do not end with an ASCII colon (:) character, the literal string : will be appended to result.
- If either of the following conditions is true, then the literal string // will be appended to result:
  - urlObject.slashes property is true;
  - urlObject.protocol begins with http, https, ftp, gopher, or file;
- If the value of the urlObject.auth property is truthy, and either urlObject.host or urlObject.hostname are not undefined, the value of urlObject.auth will be coerced into a string and appended to result followed by the literal string @.
- If the urlObject.host property is undefined then:
  - If urlObject.hostname is a string, it is appended to result.
  - Otherwise, if urlObject.hostname is not undefined and is not a string, an Error is thrown.
  - If the urlObject.port property value is truthy, and urlObject.hostname is not undefined:
    - The literal string : is appended to result, and
    - The value of urlObject.port is coerced to a string and appended to result.
- Otherwise, if the urlObject.host property value is truthy, the value of urlObject.host is coerced to a string and appended to result.
- If the urlObject.pathname property is a string that is not an empty string:
  - If urlObject.pathname does not start with an ASCII forward slash (/), then the literal string '/' is appended to result.
  - The value of urlObject.pathname is appended to result.
- Otherwise, if urlObject.pathname is not undefined and is not a string, an Error is thrown.
- If the urlObject.search property is undefined and if the urlObject.query property is an Object, the literal string ? is appended to result followed by the output of calling the querystring module's stringify() method passing the value of urlObject.query.
- Otherwise, if urlObject.search is a string:
  - If the value of urlObject.search does not start with the ASCII question mark (?) character, the literal string ? is appended to result.
  - The value of urlObject.search is appended to result.
- Otherwise, if urlObject.search is not undefined and is not a string, an Error is thrown.
- If the urlObject.hash property is a string:
  - If the value of urlObject.hash does not start with the ASCII hash (#) character, the literal string # is appended to result.
  - The value of urlObject.hash is appended to result.
- Otherwise, if the urlObject.hash property is not undefined and is not a string, an Error is thrown.
- result is returned.
URL <URL> A WHATWG URL object
options <Object>
  auth <boolean> true if the serialized URL string should include the username and password, false otherwise. Defaults to true.
  fragment <boolean> true if the serialized URL string should include the fragment, false otherwise. Defaults to true.
  search <boolean> true if the serialized URL string should include the search query, false otherwise. Defaults to true.
  unicode <boolean> true if Unicode characters appearing in the host component of the URL string should be encoded directly as opposed to being Punycode encoded. Defaults to false.
Returns a customizable serialization of a URL string representation of a WHATWG URL object.
The URL object has both a toString() method and href property that return string serializations of the URL. These are not, however, customizable in any way. The url.format(URL[, options]) method allows for basic customization of the output.
For example:
const myURL = new URL('https://a:b@你好你好?abc#foo');

console.log(myURL.href);
  // Prints https://a:b@xn--6qqa088eba/?abc#foo

console.log(myURL.toString());
  // Prints https://a:b@xn--6qqa088eba/?abc#foo

console.log(url.format(myURL, {fragment: false, unicode: true, auth: false}));
  // Prints 'https://你好你好/?abc'
Note: This variation of the url.format() method is currently considered to be experimental.
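One plausible use of these options is stripping credentials before a URL is written to a log. The sketch below assumes a hypothetical URL containing a username, password, query, and fragment; the variable names are illustrative only.

```javascript
const url = require('url');
const { URL } = url;

// Hypothetical URL with credentials, a query, and a fragment.
const secretURL = new URL('https://user:secret@example.org/path?x=1#frag');

// Drop the credentials and the fragment, e.g. before logging the URL.
const safeForLogs = url.format(secretURL, { auth: false, fragment: false });
console.log(safeForLogs);
  // Prints https://example.org/path?x=1

// Drop only the query string; every other option keeps its default.
const withoutSearch = url.format(secretURL, { search: false });
console.log(withoutSearch);
  // Prints https://user:secret@example.org/path#frag
```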
urlString <string> The URL string to parse.
parseQueryString <boolean> If true, the query property will always be set to an object returned by the querystring module's parse() method. If false, the query property on the returned URL object will be an unparsed, undecoded string. Defaults to false.
slashesDenoteHost <boolean> If true, the first token after the literal string // and preceding the next / will be interpreted as the host. For instance, given //foo/bar, the result would be {host: 'foo', pathname: '/bar'} rather than {pathname: '//foo/bar'}. Defaults to false.
The url.parse() method takes a URL string, parses it, and returns a URL object.
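The effect of the two optional arguments can be seen side by side in a short sketch. The URLs and variable names below are hypothetical and exist only for illustration.

```javascript
const url = require('url');

// Default: `query` stays an unparsed, undecoded string.
const asString = url.parse('http://example.com/search?q=a%20b&lang=en');
console.log(asString.query);
  // Prints q=a%20b&lang=en

// parseQueryString = true: `query` becomes an object with decoded values.
const asObject = url.parse('http://example.com/search?q=a%20b&lang=en', true);
console.log(asObject.query.q);
  // Prints a b

// slashesDenoteHost = true: the token after '//' is treated as the host.
const withHost = url.parse('//foo/bar', false, true);
console.log(withHost.host, withHost.pathname);
  // Prints foo /bar

// Default: the whole string is treated as a path.
const withoutHost = url.parse('//foo/bar');
console.log(withoutHost.host, withoutHost.pathname);
  // Prints null //foo/bar
```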
The url.resolve() method resolves a target URL relative to a base URL in a manner similar to that of a Web browser resolving an anchor tag HREF.
For example:
url.resolve('/one/two/three', 'four')         // '/one/two/four'
url.resolve('http://example.com/', '/one')    // 'http://example.com/one'
url.resolve('http://example.com/one', '/two') // 'http://example.com/two'
URLs are only permitted to contain a certain range of characters. Spaces (' ') and the following characters will be automatically escaped in the properties of URL objects:
< > " ` \r \n \t { } | \ ^ '
For example, the ASCII space character (' ') is encoded as %20. The ASCII less-than sign (<) character is encoded as %3C.
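A small sketch of this escaping, assuming a hypothetical path that contains a space and a less-than sign:

```javascript
const url = require('url');

// The space and '<' in this hypothetical path are escaped during parsing.
const escaped = url.parse('http://example.com/a b<c');
console.log(escaped.pathname);
  // Prints /a%20b%3Cc
console.log(escaped.href);
  // Prints http://example.com/a%20b%3Cc
```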
The url module provides an experimental implementation of the WHATWG URL Standard as an alternative to the existing url.parse() API.
const URL = require('url').URL;
const myURL = new URL('https://example.org/foo');

console.log(myURL.href);     // https://example.org/foo
console.log(myURL.protocol); // https:
console.log(myURL.hostname); // example.org
console.log(myURL.pathname); // /foo
Note: Using the delete keyword (e.g. delete myURL.protocol, delete myURL.pathname, etc.) has no effect but will still return true.
A comparison between this API and url.parse() is given below. Above the URL 'http://user:pass@host.com:8080/p/a/t/h?query=string#hash', properties of an object returned by url.parse() are shown. Below it are properties of a WHATWG URL object.
Note: WHATWG URL's origin property includes protocol and host, but not username or password.
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                                          href                                           │
├──────────┬──┬─────────────────────┬─────────────────┬───────────────────────────┬───────┤
│ protocol │  │        auth         │      host       │           path            │ hash  │
│          │  │                     ├──────────┬──────┼──────────┬────────────────┤       │
│          │  │                     │ hostname │ port │ pathname │     search     │       │
│          │  │                     │          │      │          ├─┬──────────────┤       │
│          │  │                     │          │      │          │ │    query     │       │
"  http:    //    user   :   pass   @ host.com : 8080   /p/a/t/h  ?  query=string   #hash "
│          │  │          │          │ hostname │ port │          │                │       │
│          │  │          │          ├──────────┴──────┤          │                │       │
│ protocol │  │ username │ password │      host       │          │                │       │
├──────────┴──┼──────────┴──────────┼─────────────────┤          │                │       │
│   origin    │                     │     origin      │ pathname │     search     │ hash  │
├─────────────┴─────────────────────┴─────────────────┴──────────┴────────────────┴───────┤
│                                          href                                           │
└─────────────────────────────────────────────────────────────────────────────────────────┘
(all spaces in the "" line should be ignored -- they are purely for formatting)
Creates a new URL object by parsing the input relative to the base. If base is passed as a string, it will be parsed equivalently to new URL(base).
const myURL = new URL('/foo', 'https://example.org/');
  // https://example.org/foo
A TypeError will be thrown if the input or base are not valid URLs. Note that an effort will be made to coerce the given values into strings. For instance:
const myURL = new URL({toString: () => 'https://example.org/'});
  // https://example.org/
Unicode characters appearing within the hostname of input will be automatically converted to ASCII using the Punycode algorithm.
const myURL = new URL('https://你好你好');
  // https://xn--6qqa088eba/
Additional examples of parsed URLs may be found in the WHATWG URL Standard.
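The following sketch shows how input is resolved against base, and what happens when no usable base is available. The URLs and variable names are hypothetical.

```javascript
const { URL } = require('url');

// An input starting with '/' replaces the whole path of the base.
const absolutePath = new URL('/foo', 'https://example.org/a/b');
console.log(absolutePath.href);
  // Prints https://example.org/foo

// A relative input is resolved against the directory of the base path.
const relativePath = new URL('../c', 'https://example.org/a/b/d');
console.log(relativePath.href);
  // Prints https://example.org/a/c

// A relative input without a base is not a valid URL.
let constructorError = null;
try {
  new URL('/foo');
} catch (err) {
  constructorError = err;
}
console.log(constructorError instanceof TypeError);
  // Prints true
```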
Gets and sets the fragment portion of the URL.
const myURL = new URL('https://example.org/foo#bar');
console.log(myURL.hash);
  // Prints #bar

myURL.hash = 'baz';
console.log(myURL.href);
  // Prints https://example.org/foo#baz
Invalid URL characters included in the value assigned to the hash property are percent-encoded. Note that the selection of which characters to percent-encode may vary somewhat from what the url.parse() and url.format() methods would produce.
Gets and sets the host portion of the URL.
const myURL = new URL('https://example.org:81/foo');
console.log(myURL.host);
  // Prints example.org:81

myURL.host = 'example.com:82';
console.log(myURL.href);
  // Prints https://example.com:82/foo
Invalid host values assigned to the host property are ignored.
Gets and sets the hostname portion of the URL. The key difference between url.host and url.hostname is that url.hostname does not include the port.
const myURL = new URL('https://example.org:81/foo');
console.log(myURL.hostname);
  // Prints example.org

myURL.hostname = 'example.com:82';
console.log(myURL.href);
  // Prints https://example.com:81/foo
Invalid hostname values assigned to the hostname property are ignored.
Gets and sets the serialized URL.
const myURL = new URL('https://example.org/foo');
console.log(myURL.href);
  // Prints https://example.org/foo

myURL.href = 'https://example.com/bar';
console.log(myURL.href);
  // Prints https://example.com/bar
Getting the value of the href property is equivalent to calling url.toString().
Setting the value of this property to a new value is equivalent to creating a new URL object using new URL(value). Each of the URL object's properties will be modified.
If the value assigned to the href property is not a valid URL, a TypeError will be thrown.
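A brief sketch of both behaviours of the href setter follows. The URLs are hypothetical, and the observation that the object is left unchanged after a failed assignment reflects current Node.js releases rather than anything stated above.

```javascript
const { URL } = require('url');

const hrefURL = new URL('https://example.org/foo?a=1#top');

// Assigning a valid URL replaces every component at once.
hrefURL.href = 'http://example.com:8080/bar';
console.log(hrefURL.host, hrefURL.pathname);
  // Prints example.com:8080 /bar
console.log(hrefURL.search === '' && hrefURL.hash === '');
  // Prints true

// Assigning an invalid URL throws a TypeError.
let hrefError = null;
try {
  hrefURL.href = 'not a url';
} catch (err) {
  hrefError = err;
}
console.log(hrefError instanceof TypeError);
  // Prints true
console.log(hrefURL.href);
  // Prints http://example.com:8080/bar
```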
Gets the read-only serialization of the URL's origin. Unicode characters that may be contained within the hostname will be encoded as-is without Punycode encoding.
const myURL = new URL('https://example.org/foo/bar?baz');
console.log(myURL.origin);
  // Prints https://example.org

const idnURL = new URL('https://你好你好');
console.log(idnURL.origin);
  // Prints https://你好你好

console.log(idnURL.hostname);
  // Prints xn--6qqa088eba
Gets and sets the password portion of the URL.
const myURL = new URL('https://abc:xyz@example.com');
console.log(myURL.password);
  // Prints xyz

myURL.password = '123';
console.log(myURL.href);
  // Prints https://abc:123@example.com/
Invalid URL characters included in the value assigned to the password property are percent-encoded. Note that the selection of which characters to percent-encode may vary somewhat from what the url.parse() and url.format() methods would produce.
Gets and sets the path portion of the URL.
const myURL = new URL('https://example.org/abc/xyz?123');
console.log(myURL.pathname);
  // Prints /abc/xyz

myURL.pathname = '/abcdef';
console.log(myURL.href);
  // Prints https://example.org/abcdef?123
Invalid URL characters included in the value assigned to the pathname property are percent-encoded. Note that the selection of which characters to percent-encode may vary somewhat from what the url.parse() and url.format() methods would produce.
Gets and sets the port portion of the URL.
const myURL = new URL('https://example.org:8888');
console.log(myURL.port);
  // Prints 8888

// Default ports are automatically transformed to the empty string
// (HTTPS protocol's default port is 443)
myURL.port = '443';
console.log(myURL.port);
  // Prints the empty string
console.log(myURL.href);
  // Prints https://example.org/

myURL.port = 1234;
console.log(myURL.port);
  // Prints 1234
console.log(myURL.href);
  // Prints https://example.org:1234/

// Completely invalid port strings are ignored
myURL.port = 'abcd';
console.log(myURL.port);
  // Prints 1234

// Leading numbers are treated as a port number
myURL.port = '5678abcd';
console.log(myURL.port);
  // Prints 5678

// Non-integers are truncated
myURL.port = 1234.5678;
console.log(myURL.port);
  // Prints 1234

// Out-of-range numbers are ignored
myURL.port = 1e10;
console.log(myURL.port);
  // Prints 1234
The port value may be set as either a number or as a string containing a number in the range 0 to 65535 (inclusive). Setting the value to the default port of the URL object's given protocol will result in the port value becoming the empty string ('').
If an invalid string is assigned to the port property, but it begins with a number, the leading number is assigned to port. Otherwise, or if the number lies outside the range denoted above, it is ignored.
Gets and sets the protocol portion of the URL.
const myURL = new URL('https://example.org');
console.log(myURL.protocol);
  // Prints https:

myURL.protocol = 'ftp';
console.log(myURL.href);
  // Prints ftp://example.org/
Invalid URL protocol values assigned to the protocol property are ignored.
Gets and sets the serialized query portion of the URL.
const myURL = new URL('https://example.org/abc?123');
console.log(myURL.search);
  // Prints ?123

myURL.search = 'abc=xyz';
console.log(myURL.href);
  // Prints https://example.org/abc?abc=xyz
Any invalid URL characters appearing in the value assigned to the search property will be percent-encoded. Note that the selection of which characters to percent-encode may vary somewhat from what the url.parse() and url.format() methods would produce.
Gets the URLSearchParams object representing the query parameters of the URL. This property is read-only; to replace the entirety of query parameters of the URL, use the url.search setter. See URLSearchParams documentation for details.
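The relationship between the search setter and the searchParams object can be sketched as follows. The URL and parameter names are hypothetical.

```javascript
const { URL } = require('url');

const pageURL = new URL('https://example.org/abc?a=1');

// Changes made through searchParams are reflected in the URL.
pageURL.searchParams.set('b', '2');
console.log(pageURL.href);
  // Prints https://example.org/abc?a=1&b=2

// Replacing the whole query goes through the search setter, and the
// existing searchParams object reflects the new query.
pageURL.search = '?x=1';
console.log(pageURL.searchParams.get('x'));
  // Prints 1
console.log(pageURL.searchParams.has('a'));
  // Prints false
```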
Gets and sets the username portion of the URL.
const myURL = new URL('https://abc:xyz@example.com');
console.log(myURL.username);
  // Prints abc

myURL.username = '123';
console.log(myURL.href);
  // Prints https://123:xyz@example.com/
Any invalid URL characters appearing in the value assigned to the username property will be percent-encoded. Note that the selection of which characters to percent-encode may vary somewhat from what the url.parse() and url.format() methods would produce.
The toString() method on the URL object returns the serialized URL. The value returned is equivalent to that of url.href and url.toJSON().
Because of the need for standard compliance, this method does not allow users to customize the serialization process of the URL. For more flexibility, the require('url').format() method might be of interest.
The toJSON() method on the URL object returns the serialized URL. The value returned is equivalent to that of url.href and url.toString().
This method is automatically called when a URL object is serialized with JSON.stringify().
const myURLs = [
  new URL('https://www.example.com'),
  new URL('https://test.example.org')
];
console.log(JSON.stringify(myURLs));
  // Prints ["https://www.example.com/","https://test.example.org/"]
The URLSearchParams API provides read and write access to the query of a URL.
The WHATWG URLSearchParams interface and the querystring module have a similar purpose, but the querystring module is more general, as it allows the customization of delimiter characters (& and =). On the other hand, this API is designed purely for URL query strings.
const URL = require('url').URL;
const myURL = new URL('https://example.org/?abc=123');
console.log(myURL.searchParams.get('abc'));
  // Prints 123

myURL.searchParams.append('abc', 'xyz');
console.log(myURL.href);
  // Prints https://example.org/?abc=123&abc=xyz

myURL.searchParams.delete('abc');
myURL.searchParams.set('a', 'b');
console.log(myURL.href);
  // Prints https://example.org/?a=b
init <string> The URL query
Append a new name-value pair to the query string.
name <string>
Remove all name-value pairs whose name is name.
Returns an ES6 Iterator over each of the name-value pairs in the query. Each item of the iterator is a JavaScript Array. The first item of the Array is the name, the second item of the Array is the value.
Alias for urlSearchParams[@@iterator]().
fn <Function> Function invoked for each name-value pair in the query.
thisArg <Object> Object to be used as the this value when fn is called.
Iterates over each name-value pair in the query and invokes the given function.
const URL = require('url').URL;
const myURL = new URL('https://example.org/?a=b&c=d');
myURL.searchParams.forEach((value, name, searchParams) => {
  console.log(name, value, myURL.searchParams === searchParams);
});
  // Prints:
  // a b true
  // c d true
Returns the value of the first name-value pair whose name is name. If there are no such pairs, null is returned.
Returns the values of all name-value pairs whose name is name. If there are no such pairs, an empty array is returned.
Returns true if there is at least one name-value pair whose name is name.
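The three lookup methods described above differ mainly in what they return for repeated and missing names. A minimal sketch, using a hypothetical query string:

```javascript
const { URLSearchParams } = require('url');

const lookupParams = new URLSearchParams('foo=bar&foo=baz&abc=def');

// A repeated name: get() returns the first value, getAll() returns all.
console.log(lookupParams.get('foo'));
  // Prints bar
console.log(lookupParams.getAll('foo'));
  // Prints [ 'bar', 'baz' ]
console.log(lookupParams.has('abc'));
  // Prints true

// A missing name: null, an empty array, and false respectively.
console.log(lookupParams.get('missing'));
  // Prints null
console.log(lookupParams.getAll('missing'));
  // Prints []
console.log(lookupParams.has('missing'));
  // Prints false
```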
Returns an ES6 Iterator over the names of each name-value pair.
const { URLSearchParams } = require('url');
const params = new URLSearchParams('foo=bar&foo=baz');
for (const name of params.keys()) {
  console.log(name);
}
  // Prints:
  // foo
  // foo
Sets the value in the URLSearchParams object associated with name to value. If there are any pre-existing name-value pairs whose names are name, set the first such pair's value to value and remove all others. If not, append the name-value pair to the query string.
const { URLSearchParams } = require('url');

const params = new URLSearchParams();
params.append('foo', 'bar');
params.append('foo', 'baz');
params.append('abc', 'def');
console.log(params.toString());
  // Prints foo=bar&foo=baz&abc=def

params.set('foo', 'def');
params.set('xyz', 'opq');
console.log(params.toString());
  // Prints foo=def&abc=def&xyz=opq
Sort all existing name-value pairs in-place by their names. Sorting is done with a stable sorting algorithm, so relative order between name-value pairs with the same name is preserved.
This method can be used, in particular, to increase cache hits.
const params = new URLSearchParams('query[]=abc&type=search&query[]=123');
params.sort();
console.log(params.toString());
  // Prints query%5B%5D=abc&query%5B%5D=123&type=search
Returns the search parameters serialized as a string, with characters percent-encoded where necessary.
Returns an ES6 Iterator over the values of each name-value pair.
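To illustrate serialization and value iteration together, the sketch below appends a hypothetical value containing a space and both delimiter characters. The encoded form shown is what current Node.js releases produce for query-string serialization.

```javascript
const { URLSearchParams } = require('url');

const encodedParams = new URLSearchParams();
encodedParams.append('q', 'a b&c=d'); // contains a space, '&' and '='
encodedParams.append('lang', 'es');

// toString() encodes the space as '+' and escapes the delimiters.
console.log(encodedParams.toString());
  // Prints q=a+b%26c%3Dd&lang=es

// values() yields the original, undecoded-by-hand values in order.
const allValues = Array.from(encodedParams.values());
console.log(allValues);
  // Prints [ 'a b&c=d', 'es' ]
```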
Returns an ES6 Iterator over each of the name-value pairs in the query string. Each item of the iterator is a JavaScript Array. The first item of the Array is the name, the second item of the Array is the value.
Alias for urlSearchParams.entries().
const { URLSearchParams } = require('url');
const params = new URLSearchParams('foo=bar&xyz=baz');
for (const [name, value] of params) {
  console.log(name, value);
}
  // Prints:
  // foo bar
  // xyz baz
Returns the Punycode ASCII serialization of the domain. If domain is an invalid domain, the empty string is returned.
It performs the inverse operation to require('url').domainToUnicode().
const url = require('url');
console.log(url.domainToASCII('español.com'));
  // Prints xn--espaol-zwa.com
console.log(url.domainToASCII('中文.com'));
  // Prints xn--fiq228c.com
console.log(url.domainToASCII('xn--iñvalid.com'));
  // Prints an empty string
Note: The require('url').domainToASCII() method is introduced as part of the new URL implementation but is not part of the WHATWG URL standard.
Returns the Unicode serialization of the domain. If domain is an invalid domain, the empty string is returned.
It performs the inverse operation to require('url').domainToASCII().
const url = require('url');
console.log(url.domainToUnicode('xn--espaol-zwa.com'));
  // Prints español.com
console.log(url.domainToUnicode('xn--fiq228c.com'));
  // Prints 中文.com
console.log(url.domainToUnicode('xn--iñvalid.com'));
  // Prints an empty string
Note: The require('url').domainToUnicode() API is introduced as part of the new URL implementation but is not part of the WHATWG URL standard.
URLs are permitted to only contain a certain range of characters. Any character falling outside of that range must be encoded. How such characters are encoded, and which characters to encode, depends entirely on where the character is located within the structure of the URL. The WHATWG URL Standard uses a more selective and fine-grained approach to selecting encoded characters than that used by the older url.parse() and url.format() methods.
The WHATWG algorithm defines three "encoding sets" that describe ranges of characters that must be percent-encoded:
The simple encode set includes code points in range U+0000 to U+001F (inclusive) and all code points greater than U+007E.
The default encode set includes the simple encode set and code points U+0020, U+0022, U+0023, U+003C, U+003E, U+003F, U+0060, U+007B, and U+007D.
The userinfo encode set includes the default encode set and code points U+002F, U+003A, U+003B, U+003D, U+0040, U+005B, U+005C, U+005D, U+005E, and U+007C.
The simple encode set is used primarily for URL fragments and certain specific conditions for the path. The userinfo encode set is used specifically for usernames and passwords encoded within the URL. The default encode set is used for all other cases.
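The position-dependent nature of percent-encoding can be observed by assigning similar strings to different components. This is only a sketch: the values are hypothetical, and the results shown are those of current Node.js releases, whose encode sets may not match the three sets listed above exactly.

```javascript
const { URL } = require('url');

const encodedURL = new URL('https://example.org/');

// ':' and '@' are percent-encoded in the username...
encodedURL.username = 'a:b@c';
console.log(encodedURL.username);
  // Prints a%3Ab%40c

// ...but are left as-is in the path, where '?' is encoded instead.
encodedURL.pathname = '/a:b@c?d';
console.log(encodedURL.pathname);
  // Prints /a:b@c%3Fd

// In the query, '?' is left as-is.
encodedURL.search = 'x=a?b';
console.log(encodedURL.search);
  // Prints ?x=a?b
```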
When non-ASCII characters appear within a hostname, the hostname is encoded using the Punycode algorithm. Note, however, that a hostname may contain both Punycode encoded and percent-encoded characters. For example:
const URL = require('url').URL;
const myURL = new URL('https://%CF%80.com/foo');
console.log(myURL.href);
  // Prints https://xn--1xa.com/foo
console.log(myURL.origin);
  // Prints https://π.com
© Joyent, Inc. and other Node contributors
Licensed under the MIT License.
Node.js is a trademark of Joyent, Inc. and is used with its permission.
We are not endorsed by or affiliated with Joyent.
https://nodejs.org/dist/latest-v7.x/docs/api/url.html