Regular expressions (regex) are powerful patterns used to match, search, and manipulate text. They are widely used across programming languages and tools for tasks such as data validation, pattern matching, and text processing. They are also hard to master, which is why I am writing a breakdown that anyone can follow.
The regex /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/ is a pattern that matches a URL. It is built from anchors, quantifiers, grouping constructs, bracket expressions, character classes, and character escapes. The OR operator (|) and flags are common regex components as well, but this particular pattern uses neither. This guide explains each component that appears in the pattern.
- Anchors
- Quantifiers
- Grouping Constructs
- Bracket Expressions
- Character Classes
- The OR Operator
- Flags
- Character Escapes
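Before going through the components one at a time, it can help to see the whole pattern run against a few inputs. The following is a minimal sketch, assuming JavaScript (the /.../ literal syntax suggests it, but any engine with similar syntax should behave much the same). The sample strings and the names urlPattern, samples, and results are made up for illustration.

```javascript
// The URL pattern from this guide, written as a JavaScript regex literal.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/;

// Hypothetical sample inputs, chosen only for illustration.
const samples = [
  "https://www.example.com/docs/index.html", // scheme, host, and path
  "example.com",                             // no scheme, still accepted
  "http://example",                          // no dot plus ending such as .com
  "ftp://example.com",                       // scheme is neither http nor https
];

// test() returns true when the whole string fits the pattern.
const results = samples.map((text) => urlPattern.test(text));

samples.forEach((text, i) => {
  console.log(`${text} -> ${results[i]}`);
});
// results is [true, true, false, false]
```

The first two strings pass and the last two fail, which already hints at what the optional scheme group and the required dot in the middle of the pattern are doing.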
Anchors are special characters that match a position within a string rather than a specific character. The ^ and $ symbols used in this regex are anchors: ^ matches the start of the string, while $ matches the end of the string. Together they force the pattern to match the entire input rather than just a piece of it.
Ex: /^(https?:\/\/)? Here the ^ requires the match to begin at the start of the string.
Ex: ([\/\w\s\.-]*)*\/?$ Here the $ requires the match to finish at the end of the string.
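To see what the anchors change, here is a small sketch in JavaScript. The patterns are simplified stand-ins rather than the full URL regex, and the names (unanchoredScheme, anchoredScheme, endsWithSlash) and sample strings are hypothetical.

```javascript
// Without ^ the scheme may appear anywhere in the string.
const unanchoredScheme = /https?:\/\//;
// With ^ the scheme must sit at the very start.
const anchoredScheme = /^https?:\/\//;

const sentence = "see https://example.com";
console.log(unanchoredScheme.test(sentence)); // true
console.log(anchoredScheme.test(sentence));   // false
console.log(anchoredScheme.test("https://example.com")); // true

// With $ the slash must be the last character of the string.
const endsWithSlash = /\/$/;
console.log(endsWithSlash.test("example.com/"));      // true
console.log(endsWithSlash.test("example.com/index")); // false
```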
Quantifiers specify how many times the preceding element may repeat. In this regex, ? matches zero or one occurrence (making an element optional), * matches zero or more occurrences, + matches one or more occurrences, and {2,6} matches between 2 and 6 occurrences.
Ex: /^(https? Here the ? makes the s optional, so both http and https are accepted in a given URL.
Ex: ([\/\w\s\.-]*)* Here the inner * matches zero or more characters, each of which may be a "/", a word character (alphanumeric or underscore), a whitespace character, a ".", or a "-", in any order. The outer * then repeats that whole group zero or more times.
Ex: ([\da-z\.-]+) Here the + matches one or more characters, each of which may be a digit (0-9), a lowercase letter (a-z), a ".", or a "-".
Ex: ([a-z\.]{2,6}) Here the {2,6} requires between 2 and 6 of the bracketed characters, so this part cannot be shorter than 2 or longer than 6 characters.
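Each quantifier can be tried on its own. The sketch below, assuming JavaScript, wraps each fragment of the URL regex in ^ and $ so that only the quantifier decides the result. The variable names and test strings are hypothetical.

```javascript
// ? : the s may appear zero or one time.
const optionalS = /^https?$/;
console.log(optionalS.test("http"));   // true
console.log(optionalS.test("https"));  // true
console.log(optionalS.test("httpss")); // false

// * : zero or more characters from the set, so the empty string passes.
const zeroOrMore = /^[\/\w\s\.-]*$/;
console.log(zeroOrMore.test(""));                  // true
console.log(zeroOrMore.test("/docs/my file.txt")); // true

// + : at least one character from the set is required.
const oneOrMore = /^[\da-z\.-]+$/;
console.log(oneOrMore.test(""));          // false
console.log(oneOrMore.test("my-site.1")); // true

// {2,6} : between 2 and 6 characters from the set.
const twoToSix = /^[a-z\.]{2,6}$/;
console.log(twoToSix.test("c"));       // false (too short)
console.log(twoToSix.test("com"));     // true
console.log(twoToSix.test("co.uk"));   // true (5 characters, the dot counts)
console.log(twoToSix.test("company")); // false (7 characters)
```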
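Grouping constructs treat several characters or sub-patterns as a single unit, so that a quantifier can apply to the whole unit. The ( ) symbols in this regex are grouping constructs, and each pair also captures the text it matched.
Ex: (https?:\/\/) Here the parentheses group http or https together with the "://", so the ? that follows the group in the full pattern makes the entire scheme optional.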
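Because each ( ) pair captures what it matched, the groups can be read back after a match. This is a rough sketch assuming JavaScript's exec method. The sample URLs and variable names are made up for illustration.

```javascript
const groupedUrlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/;

// exec returns the full match at index 0, followed by one entry per ( ) group.
const withScheme = groupedUrlPattern.exec("https://www.example.com/docs");
console.log(withScheme[1]); // "https://"    (first group: the scheme)
console.log(withScheme[2]); // "www.example" (second group: text before the last dot)
console.log(withScheme[3]); // "com"         (third group: the ending)

// The first group is optional, so it is undefined when the scheme is missing.
const withoutScheme = groupedUrlPattern.exec("example.com");
console.log(withoutScheme[1]); // undefined
console.log(withoutScheme[2]); // "example"
console.log(withoutScheme[3]); // "com"
```

Note that the second group stops before the last dot: the + first grabs as much as it can, then gives characters back until the rest of the pattern can still match.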
Bracket expressions match any single character from the set listed between [ and ]. A set can contain ranges such as a-z as well as individual characters. In this regex, bracket expressions such as [\da-z\.-] match one digit, lowercase letter, dot, or hyphen.
Ex: [a-z\.] Here the [ ] matches exactly one character that is either a lowercase letter a-z (the match is case sensitive) or a period.
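A quick way to confirm that a bracket expression matches one character from a set, not a sequence, is to test it alone. The sketch below assumes JavaScript, and the names and inputs are hypothetical.

```javascript
// Exactly one character: a lowercase letter or a period.
const letterOrDot = /^[a-z\.]$/;
console.log(letterOrDot.test("a"));  // true
console.log(letterOrDot.test("."));  // true
console.log(letterOrDot.test("a.")); // false (two characters, only one allowed)
console.log(letterOrDot.test("A"));  // false (uppercase is not in a-z)

// Adding + repeats the choice, and each character is picked from the set independently.
const hostChars = /^[\da-z\.-]+$/;
console.log(hostChars.test("my-site.123")); // true
console.log(hostChars.test("My_Site"));     // false (uppercase and underscore are not in the set)
```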
Character classes are shorthand for common sets of characters. In this regex, \d matches any digit, \w matches any word character (alphanumeric or underscore), and \s matches any whitespace character.
Ex: ([\/\w\s\.-]*)*\/?$/ Here \w matches any word character (alphanumeric or underscore).
Ex: ([\/\w\s\.-]*)*\/?$/ Here \s is used in place of a literal space. It matches any whitespace character, which covers a plain space as well as tabs and line breaks.
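The three shorthand classes in the pattern can be checked separately. This is a small sketch assuming JavaScript, with made-up names and inputs.

```javascript
// \d : digits only.
const digitsOnly = /^\d+$/;
console.log(digitsOnly.test("2024")); // true
console.log(digitsOnly.test("20a4")); // false

// \w : letters, digits, and underscore, but not the hyphen.
const wordCharsOnly = /^\w+$/;
console.log(wordCharsOnly.test("user_name42")); // true
console.log(wordCharsOnly.test("user-name"));   // false

// \s : a single whitespace character such as a space or a tab.
const oneWhitespace = /^\s$/;
console.log(oneWhitespace.test(" "));  // true
console.log(oneWhitespace.test("\t")); // true
console.log(oneWhitespace.test("a"));  // false
```

The hyphen failing \w is why the URL regex lists "-" separately inside its bracket expressions.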
Character escapes make a character that has special meaning in a regex match literally. In this regex, \/ matches a literal forward slash, which would otherwise end the slash-delimited pattern, and \. matches a literal period, which outside a bracket expression would otherwise match almost any character.
Ex: (https?:\/\/) Here each \ precedes the character being escaped, which in this example is the "/".
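The effect of escaping is easiest to see by comparing a pattern with and without the backslash. The sketch below assumes JavaScript. The patterns are simplified stand-ins and all names and inputs are hypothetical.

```javascript
// An unescaped dot matches almost any character, so "X" is accepted.
const unescapedDot = /^example.com$/;
// An escaped dot matches only a literal period.
const escapedDot = /^example\.com$/;

console.log(unescapedDot.test("exampleXcom")); // true
console.log(escapedDot.test("exampleXcom"));   // false
console.log(escapedDot.test("example.com"));   // true

// In a /.../ literal each slash must be escaped as \/ so it does not end the pattern.
const schemeLiteral = /^https?:\/\/$/;
console.log(schemeLiteral.test("https://")); // true

// Built from a string with the RegExp constructor, the slashes need no escape.
const schemeFromString = new RegExp("^https?://$");
console.log(schemeFromString.test("http://")); // true
```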
Travontaz Lowry