@CQlove
Last active August 29, 2023 03:44

JC's Regex Tutorial

Regex is short for regular expression; you may also see it abbreviated as RE. A regular expression is a pattern, written in a compact notation, that describes a set of strings. Regular expressions are often used to match, retrieve, and replace text that fits a pattern.
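To make the three uses concrete, here is a minimal JavaScript sketch of matching, retrieving, and replacing. The sample text is hypothetical and chosen only for illustration; `test`, `match`, and `replace` are standard JavaScript methods.

```javascript
// Hypothetical sample text, used only for illustration.
const text = "Order 66 shipped, order 7 pending";

// Match: does the text contain at least one digit?
const hasDigit = /[0-9]/.test(text);

// Retrieve: collect every run of digits (the g flag finds all matches).
const numbers = text.match(/[0-9]+/g);

// Replace: swap every run of digits for a placeholder.
const masked = text.replace(/[0-9]+/g, "#");

console.log(hasDigit); // true
console.log(numbers);  // [ '66', '7' ]
console.log(masked);   // Order # shipped, order # pending
```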

Summary

We will walk through the basic Regex components next. Hopefully you will understand /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/ after reading this tutorial. Just a reminder: to write a Regex in JavaScript, put a / directly before the pattern and another / directly after it, with no spaces. The slashes tell JS that the text between them is a regular expression literal rather than an ordinary string.
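As a small sketch of what the slashes do: in JavaScript they delimit a regular expression literal, which produces a RegExp object. The same pattern can also be built from a string with the RegExp constructor. The pattern `Regex` and the sample title below are only illustrative.

```javascript
// A regex literal: the pattern sits between two slashes, with no quotes.
const literal = /Regex/;

// The equivalent RegExp constructor call takes the pattern as a string.
const built = new RegExp("Regex");

// With quotes around it, the same text is just a string, not a regex.
const notARegex = "/Regex/";

console.log(literal instanceof RegExp);            // true
console.log(typeof notARegex);                     // string
console.log(literal.test("JC's Regex Tutorial"));  // true
console.log(built.test("JC's Regex Tutorial"));    // true
```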

Regex Components

Anchors

^ and $ are the most important anchors. ^ matches the start of the string and $ matches the end. Let's say you want to find strings that begin with the initials JC; you can use ^JC to match any string that starts with JC (uppercase JC is used here, so strings that start with lowercase jc will not match). Likewise, JC$ matches any string that ends with uppercase JC. When both anchors are used, the pattern has to cover the whole string: ^[a-z][0-9]$ matches only a two-character string made of one lowercase letter followed by one digit, such as a1.
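The anchor patterns above can be tried out with a short sketch like the following. The variable names and sample strings are made up for illustration.

```javascript
const startsWithJC = /^JC/;
const endsWithJC = /JC$/;
const letterThenDigit = /^[a-z][0-9]$/;

console.log(startsWithJC.test("JC wrote this")); // true
console.log(startsWithJC.test("jc wrote this")); // false, lowercase jc does not match
console.log(endsWithJC.test("written by JC"));   // true
console.log(endsWithJC.test("JC wrote this"));   // false, JC is not at the end

// With both anchors the whole string must fit the pattern.
console.log(letterThenDigit.test("a1"));         // true
console.log(letterThenDigit.test("ab1"));        // false, three characters is too long
```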

Quantifiers

Several symbols control precisely how many times the preceding expression repeats. None of the examples below use anchors, so each pattern only has to match somewhere inside the string:

  • + : The previous expression appears 1 or more times (at least once):
    • [a-z][0]+: A lowercase letter followed by one or more zeros. Strings such as cjx0 or abcd000 contain a match (x0 and d000).
  • ? : The previous expression appears 0 or 1 times (it is optional, and once is the maximum):
    • [a-z][0]?: A lowercase letter optionally followed by a single zero. Strings such as cjx0 or abc contain a match.
  • * : The previous expression appears 0 or more times (any number of times):
    • [a-z][0]*: A lowercase letter followed by no zeros or by any number of zeros. Strings such as abcd000, cjx0 or abc contain a match.
  • {} : A more precise way to limit how often an expression repeats:
    • {x} : The expression repeats exactly x times, where x is any non-negative integer:
      • [a-z]{3}: Exactly 3 lowercase letters in a row, like cat or dog.
    • {x,} : The expression repeats at least x times:
      • {1,}: This is the same as +.
      • {0,}: This is the same as *.
      • [a-z][0-9]{3,}: A lowercase letter followed by 3 or more digits, as in cjx666, abc1234 or aaaa11111.
    • {x,y}: The expression repeats at least x times but no more than y times:
      • {0,1}: This is the same as ?.
      • [a-z][0-9]{3,5}: A lowercase letter followed by at least 3 and at most 5 digits, as in cjx666, abc1234 or abba12345.
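The quantifiers above can be compared side by side with a sketch like this one. It uses `match` to show exactly which part of the string each pattern picks up. The anchored patterns in the second half are an addition for illustration, since anchors make the upper and lower limits visible.

```javascript
// + needs at least one zero after the letter.
const plusMatch = "abcd000".match(/[a-z][0]+/)[0];
const plusOnNoZero = /[a-z][0]+/.test("abc");

// ? takes at most one zero, * takes as many as it can.
const optionalMatch = "x000".match(/[a-z][0]?/)[0];
const starMatch = "x000".match(/[a-z][0]*/)[0];

console.log(plusMatch);     // d000
console.log(plusOnNoZero);  // false
console.log(optionalMatch); // x0
console.log(starMatch);     // x000

// Braces with anchors: the whole string has to respect the limits.
const exactlyThree = /^[a-z]{3}$/;
const threeToFiveDigits = /^[a-z]+[0-9]{3,5}$/;

console.log(exactlyThree.test("cat"));             // true
console.log(exactlyThree.test("cats"));            // false
console.log(threeToFiveDigits.test("abba12345"));  // true
console.log(threeToFiveDigits.test("abba123456")); // false, six digits is too many
console.log(/[a-z][0-9]{3,}/.test("cjx66"));       // false, only two digits
```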

Grouping Constructs

Grouping lets you treat part of an expression as a single unit. Each pair of parentheses () forms one group, so /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/ has three groups. The first group, ([a-z0-9_\.-]+), is the username. The second group, ([\da-z\.-]+), is the server or domain name, such as gmail, hotmail, or outlook. The last group, ([a-z\.]{2,6}), is the domain extension, such as com.
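A minimal sketch of reading the three groups in JavaScript: `match` returns an array whose index 0 is the full match and whose indexes 1 to 3 hold the groups in order. The address below is hypothetical and uses the reserved example.com domain.

```javascript
const emailPattern = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// Hypothetical address, used only for illustration.
const result = "jc_dev@example.com".match(emailPattern);

console.log(result[0]); // jc_dev@example.com (the whole match)
console.log(result[1]); // jc_dev  (first group: username)
console.log(result[2]); // example (second group: domain name)
console.log(result[3]); // com     (third group: domain extension)
```

Note that the period between the second and third groups sits outside the parentheses, so it is matched but not captured.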

Bracket Expressions

You can think of [] as a set of allowed characters. A bracket expression matches exactly one character, and that character must be one of those listed:

  • [0-9] means any single digit from 0 to 9
  • [a-z] means any single lowercase letter
  • [A-Z] means any single uppercase letter
  • [a-zA-Z] means any single letter, uppercase or lowercase
  • [0-9a-zA-Z] means any single letter or digit
  • [_-] means an underscore or a hyphen.
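The following sketch checks a few single characters against the bracket expressions listed above. Anchors are added so that each test string must be exactly one character from the set; the helper name `check` is made up for this example.

```javascript
// Hypothetical helper: does the whole string consist of one character from the set?
const check = (set, ch) => new RegExp("^" + set + "$").test(ch);

console.log(check("[0-9]", "7"));       // true
console.log(check("[a-z]", "q"));       // true
console.log(check("[a-z]", "Q"));       // false, uppercase is not in the range
console.log(check("[a-zA-Z]", "Q"));    // true
console.log(check("[0-9a-zA-Z]", "_")); // false, underscore is neither letter nor digit
console.log(check("[_-]", "-"));        // true
console.log(check("[a-z]", "ab"));      // false, a bracket expression matches one character
```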

Character Classes

  • \d is the same as [0-9]
  • \w is the same as [0-9a-zA-Z_] (letters, digits, and the underscore)
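A quick sketch to check these shorthands in JavaScript. One detail worth seeing in action is that \w also accepts the underscore, which a plain [0-9a-zA-Z] set does not.

```javascript
const digit = /^\d$/;
const wordChar = /^\w$/;
const lettersAndDigits = /^[0-9a-zA-Z]$/;

console.log(digit.test("5"));            // true
console.log(digit.test("a"));            // false
console.log(wordChar.test("a"));         // true
console.log(wordChar.test("_"));         // true, \w includes the underscore
console.log(lettersAndDigits.test("_")); // false
console.log(wordChar.test("-"));         // false, the hyphen is not a word character
```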

Character Escapes

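The backslash \ works like the \ in JavaScript strings: it turns a special character into a plain, literal character. An unescaped . matches any character, while \. matches only a period. So \. in the first group ([a-z0-9_\.-]+) means the username may contain a period. Inside a bracket expression the period is already literal, so the backslash there is optional, but the \. that sits between the second and third groups is required.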
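The difference between an escaped and an unescaped period can be seen in a short sketch like this one. The patterns a.c and a\.c are made up for illustration.

```javascript
const anyChar = /^a.c$/;      // unescaped: the dot matches any single character
const literalDot = /^a\.c$/;  // escaped: the dot must be a real period

console.log(anyChar.test("abc"));    // true
console.log(anyChar.test("a.c"));    // true
console.log(literalDot.test("abc")); // false
console.log(literalDot.test("a.c")); // true

// Inside brackets the dot is literal with or without the backslash.
console.log(/^[.]$/.test("x"));      // false
console.log(/^[.]$/.test("."));      // true
console.log(/^[\.]$/.test("."));     // true
```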

Summarize

Now you can read /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/ piece by piece. The string starts with one or more lowercase letters, digits, underscores, periods, or hyphens (this is the username, or first group). An @ follows the first group. After that comes the server or domain name as the second group, which contains one or more digits, lowercase letters, periods, or hyphens (no underscore in this group). A period follows the second group. Finally, the string ends with the third group, the domain extension, which contains at least 2 and at most 6 lowercase letters or periods. All in all, this Regex describes a simple email address of the form username@domain.extension.
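To see the whole pattern at work, the sketch below runs it against a few hypothetical addresses (all on reserved example domains) and records which ones pass. It is only a check of this tutorial's pattern, not a complete email validator.

```javascript
const emailPattern = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// Hypothetical sample addresses, used only for illustration.
const samples = [
  "jc@example.com",              // fits all three groups
  "first.last@mail.example.org", // periods are allowed in groups 1 and 2
  "JC@example.com",              // uppercase letters are not in the first group
  "jc@example",                  // no period and extension after the domain
  "jc@example.c",                // extension is shorter than 2 characters
];

const results = samples.map((address) => emailPattern.test(address));

samples.forEach((address, i) => console.log(address, results[i]));
// jc@example.com true
// first.last@mail.example.org true
// JC@example.com false
// jc@example false
// jc@example.c false
```

Adding the i flag (/.../i) would be one way to accept uppercase letters as well, since the pattern as written only lists lowercase ranges.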

Author

My name is Jianxiong Chen. You can find more of my work on my GitHub profile: https://github.com/CQlove
