While browsing the web, you may have come across URLs that contain special characters such as spaces, punctuation marks, and non-ASCII characters. These characters can break links and can also hurt SEO. Fortunately, JavaScript offers several ways to deal with them.
In this article, we will discuss “How to remove special characters from the URL” using several JavaScript methods.
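What are special characters in URLs?
Special characters in a URL are characters other than letters, digits, and a few unreserved punctuation marks (-, _, . and ~). They include spaces, percent signs, question marks, and non-ASCII characters. Some of them, such as ? and &, have a reserved meaning inside a URL, while others, such as spaces, must be percent-encoded before they can appear in one.
Here is an example of a URL in which the spaces have been percent-encoded as %20: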
https://www.example.com/my%20first%20post.html
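To see which characters the escape sequences stand for, the URL can be decoded again. This is only a small illustration using the built-in decodeURI() function; the variable names are made up for the example.

```javascript
// Decode the percent-encoded example URL back to its readable form.
const encodedExample = "https://www.example.com/my%20first%20post.html";
const readableExample = decodeURI(encodedExample);

console.log(readableExample); // "https://www.example.com/my first post.html"
```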
How to remove special characters from a URL?
There are several ways to remove special characters from a URL in JavaScript. Here we will discuss three commonly used methods:
- Using Regular Expression
- Using encodeURI() function
- Using encodeURIComponent() function
Using Regular Expressions to remove Special Characters.
Regular expressions are a powerful tool for manipulating strings. We can use them to search a string for a pattern and to replace or remove whatever matches.
We can also use a regular expression (regex) to remove special characters from a URL string. Here is an example in JavaScript.
let url = "https://www.example.com/my page.html";
url = url.replace(/%20| /g, "");
console.log(url); // "https://www.example.com/mypage.html"
In the above code, we used the replace() method with a regular expression to remove every %20 and every space character from the URL.
The | character tells the regular expression to match either %20 or a space, and the g (global) flag makes replace() act on all occurrences instead of only the first one.
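The same idea extends to other characters. The following is a minimal sketch, assuming we only want letters, digits, and hyphens in the last part of the URL; cleanSlug is a hypothetical helper name, not a built-in function.

```javascript
// Hypothetical helper: keeps only letters, digits, and hyphens in a slug.
function cleanSlug(slug) {
  return slug
    .replace(/%20|\s+/g, "-")      // turn encoded or literal whitespace into hyphens
    .replace(/[^A-Za-z0-9-]/g, "") // drop everything except letters, digits, hyphens
    .replace(/-+/g, "-")           // collapse runs of hyphens into one
    .replace(/^-|-$/g, "");        // trim a leading or trailing hyphen
}

console.log(cleanSlug("my first post!")); // "my-first-post"
console.log(cleanSlug("hello%20world?")); // "hello-world"
```

Replacing whitespace with a hyphen rather than deleting it keeps the words readable. Note that this approach also drops non-ASCII letters, which may or may not be what you want.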
Using the encodeURI() function to remove Special Characters.
encodeURI() is a built-in JavaScript function that encodes special characters in a URL. Strictly speaking, it does not delete them: it replaces each one with a percent-encoded escape sequence (a % followed by the hexadecimal value of each UTF-8 byte), so the raw character no longer appears in the URL.
Characters that have a structural meaning in a complete URL, such as :, /, ?, &, = and #, are left unchanged.
Example:
let url = "https://www.example.com/my first post.html";
let newUrl = encodeURI(url);
console.log(newUrl); // "https://www.example.com/my%20first%20post.html"
In the above code, we used the encodeURI() function to encode the special characters in the given URL string. Here, each space in the URL is replaced by its percent-encoded escape sequence, %20.
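A slightly longer example can show what encodeURI() changes and what it leaves alone. The URL below is invented for illustration; it contains a non-ASCII letter (written as \u00e9), spaces, a query string, and a fragment.

```javascript
// An illustrative URL with a non-ASCII character, spaces, a query, and a fragment.
const rawUrl = "https://www.example.com/caf\u00e9 menu.html?q=a b&lang=en#top";
const encodedUrl = encodeURI(rawUrl);

// The accented letter becomes its two UTF-8 bytes and each space becomes %20,
// while ":", "/", "?", "=", "&" and "#" are left untouched.
console.log(encodedUrl);
// "https://www.example.com/caf%C3%A9%20menu.html?q=a%20b&lang=en#top"
```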
Using encodeURIComponent() function
The encodeURIComponent() function is similar to encodeURI(), but it also encodes the reserved characters that separate the parts of a URL, such as :, /, ?, & and =. Only letters, digits, and the characters - _ . ! ~ * ' ( ) are left unchanged.
Example:
let url = "https://www.example.com/my page.html?query=foo&bar=baz";
url = encodeURIComponent(url);
console.log(url); //"https%3A%2F%2Fwww.example.com%2Fmy%20page.html%3Fquery%3Dfoo%26bar%3Dbaz"
As you can see, encodeURIComponent() does not only encode the special characters in the path: it also encodes the :, /, ?, = and & characters that give the URL its structure, so the result is no longer a usable URL.
So, when using it, it is better to separate the origin (the scheme and domain name) from the path after the forward slash, and then apply encodeURIComponent() only to the path, which replaces its special characters with their percent-encoded values.
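let url = "https://www.example.com/my first post.html";
// Take everything after the origin and its trailing slash
let postURL = url.substring("https://www.example.com/".length);
// Encode only the path; it contains no "/" here, which would become %2F
let newUrl = "https://www.example.com/" + encodeURIComponent(postURL);
console.log(newUrl);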
Output:
https://www.example.com/my%20first%20post.html
Here, we used the substring() method to separate the path of the URL from the main domain.
Next, we passed the path to encodeURIComponent(), which replaced each special character in it with its percent-encoded value.
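If the path has more than one segment, passing it to encodeURIComponent() in one go would also turn the / separators into %2F. One way around that, sketched here under the assumption that the base URL and the raw path are already known separately, is to encode each segment on its own. encodePath is a hypothetical helper name.

```javascript
// Hypothetical helper: encodes each path segment but keeps the "/" separators.
function encodePath(base, rawPath) {
  const encodedPath = rawPath
    .split("/")                                    // one entry per path segment
    .map((segment) => encodeURIComponent(segment)) // encode each segment alone
    .join("/");                                    // restore the separators
  return base + encodedPath;
}

console.log(encodePath("https://www.example.com/", "blog/my first post.html"));
// "https://www.example.com/blog/my%20first%20post.html"
```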
Conclusion:
In this article, we learned how to handle special characters in a URL using JavaScript. A regular expression removes them outright, while the encodeURI() and encodeURIComponent() functions replace them with percent-encoded escape sequences.