Skip to content

Instantly share code, notes, and snippets.

@Revivedaniel
Last active February 21, 2023 02:34
Show Gist options
  • Select an option

  • Save Revivedaniel/b116edaf4ee7669c58f55f9f95261fd2 to your computer and use it in GitHub Desktop.

Select an option

Save Revivedaniel/b116edaf4ee7669c58f55f9f95261fd2 to your computer and use it in GitHub Desktop.

Longest Distinct Substring Algorithm using the Sliding Window method

Given I have a string, I want to know what the length of the longest substring containing distinct characters.
In other words, I want to know the length of the largest set of characters in a row where the characters are not duplicates.

longestDistinctSubstring(str)

  • The string passed into the function will only ever contain lowercase letters with no spaces.
  • The string might be empty.
  • In the case where the string is empty, return 0 as there are no characters.
  • In the case where all the letters are the same in the string, return 1 as there is only 1 unique letter.

Examples

longestDistinctSubstring('')                 // Return: 0 because the string is empty
longestDistinctSubstring('rithmschool')      // Return: 7
longestDistinctSubstring('thisisawesome')    // Return: 6
longestDistinctSubstring('thecatinthehat')   // Return: 7
longestDistinctSubstring('bbbbbb')           // Return: 1 because there is only 1 unique character in the string
longestDistinctSubstring('longestsubstring') // Return: 8
longestDistinctSubstring('thisishowwedoit')  // Return: 6

Let's break it down!

Let's use the following case as an example.

longestDistinctSubstring('thisishowwedoit');

We will start the window at the beginning of the string and use a lookup object to keep track of the characters in the window.
We will check if the first character in the string is in the lookup, aka the window.

'thisishowwedoit'
 ^

lookup = {};
startIndex = 0;
endIndex = 0;
longestSubstring = 0;

Since this is the first character, the character will not be in the lookup.
We can safely say that this character is unique to the window and we should add it to the lookup.
We will also increase the size of the window and update the longestSubstring since our window is 1 element wide.

'thisishowwedoit'
 ^^

lookup = {
  't': 1
};
startIndex = 0;
endIndex = 1;
longestSubstring = 1;

We will continue this logic until we hit a duplicate character.
Increasing the longestSubstring as we increase the window size.

'thisishowwedoit'
 ^-^

lookup = {
  't': 1,
  'h': 1
};
startIndex = 0;
endIndex = 2;
longestSubstring = 2;
'thisishowwedoit'
 ^--^

lookup = {
  't': 1,
  'h': 1,
  'i': 1
};
startIndex = 0;
endIndex = 3;
longestSubstring = 3;
'thisishowwedoit'
 ^---^

lookup = {
  't': 1,
  'h': 1,
  'i': 1,
  's': 1
};
startIndex = 0;
endIndex = 4;
longestSubstring = 4;

Now that we have reached a duplicate character 'i', we now know that this is the end of a unique substring.
To make sure that we find the largest substring, we need to shrink the window until there are no duplicate characters.
We do this by moving the startIndex of the window forwards until there are no duplicates in the window again.

'thisishowwedoit'
  ^--^

lookup = {
  't': 0,
  'h': 1,
  'i': 1,
  's': 1
};
startIndex = 1;
endIndex = 4;
longestSubstring = 4;

There is still an 'i' within the window so let's continue until we remove the 'i'.
Notice that the longestSubstring stays the same since we are only tracking it when the window is wider than the longestSubstring value.

'thisishowwedoit'
   ^-^

lookup = {
  't': 0,
  'h': 0,
  'i': 1,
  's': 1
};
startIndex = 2;
endIndex = 4;
longestSubstring = 4;
'thisishowwedoit'
    ^^

lookup = {
  't': 0,
  'h': 0,
  'i': 0,
  's': 1
};
startIndex = 3;
endIndex = 4;
longestSubstring = 4;

Now that there is not an 'i' in the window, we can continue forward to evaluate the entire string for a longer substring.
We repeat the these two steps until we reach the end of our string.

'thisishowwedoit'
    ^-^

lookup = {
  't': 0,
  'h': 0,
  'i': 1,
  's': 1
};
startIndex = 3;
endIndex = 5;
longestSubstring = 4;
'thisishowwedoit'
     ^^

lookup = {
  't': 0,
  'h': 0,
  'i': 1,
  's': 0
};
startIndex = 4;
endIndex = 5;
longestSubstring = 4;
'thisishowwedoit'
     ^-^

lookup = {
  't': 0,
  'h': 0,
  'i': 1,
  's': 1
};
startIndex = 4;
endIndex = 6;
longestSubstring = 4;
'thisishowwedoit'
     ^--^

lookup = {
  't': 0,
  'h': 1,
  'i': 1,
  's': 1
};
startIndex = 4;
endIndex = 7;
longestSubstring = 4;
'thisishowwedoit'
     ^---^

lookup = {
  't': 0,
  'h': 1,
  'i': 1,
  's': 1,
  'o': 1
};
startIndex = 4;
endIndex = 8;
longestSubstring = 4;
'thisishowwedoit'
     ^----^

lookup = {
  't': 0,
  'h': 1,
  'i': 1,
  's': 1,
  'o': 1,
  'w': 1
};
startIndex = 4;
endIndex = 9;
longestSubstring = 5;

We've increased our longestSubstring and ran into another duplicate scenario.
Let's continue with our logic steps.

'thisishowwedoit'
      ^---^

lookup = {
  't': 0,
  'h': 1,
  'i': 0,
  's': 1,
  'o': 1,
  'w': 1
};
startIndex = 5;
endIndex = 9;
longestSubstring = 5;
'thisishowwedoit'
       ^--^

lookup = {
  't': 0,
  'h': 1,
  'i': 0,
  's': 0,
  'o': 1,
  'w': 1
};
startIndex = 6;
endIndex = 9;
longestSubstring = 5;
'thisishowwedoit'
        ^-^

lookup = {
  't': 0,
  'h': 0,
  'i': 0,
  's': 0,
  'o': 1,
  'w': 1
};
startIndex = 7;
endIndex = 9;
longestSubstring = 5;
'thisishowwedoit'
         ^^

lookup = {
  't': 0,
  'h': 0,
  'i': 0,
  's': 0,
  'o': 0,
  'w': 1
};
startIndex = 8;
endIndex = 9;
longestSubstring = 5;
'thisishowwedoit'
          ^

lookup = {
  't': 0,
  'h': 0,
  'i': 0,
  's': 0,
  'o': 0,
  'w': 0
};
startIndex = 9;
endIndex = 9;
longestSubstring = 5;

We've removed every character in the lookup because there were 2 of the same characters in a row.
Let's finish evaluating the string.

'thisishowwedoit'
          ^^

lookup = {
  't': 0,
  'h': 0,
  'i': 0,
  's': 0,
  'o': 0,
  'w': 1
};
startIndex = 9;
endIndex = 10;
longestSubstring = 5;
'thisishowwedoit'
          ^-^

lookup = {
  't': 0,
  'h': 0,
  'i': 0,
  's': 0,
  'o': 0,
  'w': 1,
  'e': 1
};
startIndex = 9;
endIndex = 11;
longestSubstring = 5;
'thisishowwedoit'
          ^--^

lookup = {
  't': 0,
  'h': 0,
  'i': 0,
  's': 0,
  'o': 0,
  'w': 1,
  'e': 1,
  'd': 1
};
startIndex = 9;
endIndex = 12;
longestSubstring = 5;
'thisishowwedoit'
          ^---^

lookup = {
  't': 0,
  'h': 0,
  'i': 0,
  's': 0,
  'o': 1,
  'w': 1,
  'e': 1,
  'd': 1
};
startIndex = 9;
endIndex = 13;
longestSubstring = 5;
'thisishowwedoit'
          ^----^

lookup = {
  't': 0,
  'h': 0,
  'i': 1,
  's': 0,
  'o': 1,
  'w': 1,
  'e': 1,
  'd': 1
};
startIndex = 9;
endIndex = 14;
longestSubstring = 5;
'thisishowwedoit'
          ^-----^

lookup = {
  't': 1,
  'h': 0,
  'i': 1,
  's': 0,
  'o': 1,
  'w': 1,
  'e': 1,
  'd': 1
};
startIndex = 9;
endIndex = 15; // outside the array length now
longestSubstring = 6;

We've now evaluated the entire string for the longest subarray of unique characters and found it to be 6 characters long!

Pseudo Code

Let's psuedo code this!
First we can deal with our edge cases where the string is blank or there is only 1 character.

+// if the string length is 0, return 0
+// if the string length is 1, return 1

Next, we can define the variables to keep track of the window start, end, and characters and the longestSubstring.

// if the string length is 0, return 0
// if the string length is 1, return 1
+// define lookup variable for the characters in the window
+// define a  start index for the start position of the window
+// define an end index for the end position of the window
+// define a longestSubstring variabe to track the largest substring length

Now, let's create a while loop with a condition being that the startIndex is less than the string.length
This is the continue looping until we tell it to or the window has reached the end of the string.

// if the string length is 0, return 0
// if the string length is 1, return 1
// define lookup variable for the characters in the window
// define a  start index for the start position of the window
// define an end index for the end position of the window
// define a longestSubstring variabe to track the largest substring length

+// define a while loop with the condition being startIndex < string.length

The first condition we should check within the while loop is if the character at the endIndex position is present within the lookup.
If it is not we should add it to the lookup, expand the window, and update the longestSubstring value if the window size is larger than the longestSubstring value.

// if the string length is 0, return 0
// if the string length is 1, return 1
// define lookup variable for the characters in the window
// define a  start index for the start position of the window
// define an end index for the end position of the window
// define a longestSubstring variabe to track the largest substring length

// define a while loop with the condition being startIndex < string.length
+  // if the character at string[endIndex] is not in the lookup variable and the endIndex is less than the str.length
+    // add it to the variable
+    // move the end index forwards
+    // set longestSubstring to either longestSubstring or endIndex - startIndex, whichever is larger. 

Now, if the character at the endIndex position is present within the lookup we should shrink the window until there are no duplicate values.
To do this we can remove the value at the start of the window and move the start of the window forward 1 and repeat the logic checks.

// if the string length is 0, return 0
// if the string length is 1, return 1
// define lookup variable for the characters in the window
// define a  start index for the start position of the window
// define an end index for the end position of the window
// define a longestSubstring variabe to track the largest substring length

// define a while loop with the condition being startIndex < string.length
  // if the character at string[endIndex] is not in the lookup variable and the endIndex is less than the str.length
    // add it to the variable
    // move the end index forwards
    // set longestSubstring to either longestSubstring or endIndex - startIndex, whichever is larger. 
+  // else if the character is in the lookup
+    // shrink the window by subtracting 1 from the character at string[startIndex] from the lookup variable
+    // add 1 to startIndex

Finally, if those two conditions are not true, then we must have reached the end of the string and we need to break the loop.
We then return the longestSubstring variable we were keeping track of.

// if the string length is 0, return 0
// if the string length is 1, return 1
// define lookup variable for the characters in the window
// define a  start index for the start position of the window
// define an end index for the end position of the window
// define a longestSubstring variabe to track the largest substring length

// define a while loop with the condition being startIndex < string.length
  // if the character at string[endIndex] is not in the lookup variable and the endIndex is less than the str.length
    // add it to the variable
    // move the end index forwards
    // set longestSubstring to either longestSubstring or endIndex - startIndex, whichever is larger. 
  // else if the character is in the lookup
    // shrink the window by subtracting 1 from the character at string[startIndex] from the lookup variable
    // add 1 to startIndex
+  // else break the loop
return longestSubstring; 

Let's code!

First we can deal with our edge cases where the string is blank or there is only 1 character.

longestDistinctSubstring(str) {
  // if the string length is 0, return 0
  // if the string length is 1, return 1
  if (str.length === 0 || str.length === 1) return str.length; // str.length will either bo 0 or 1 here
}

Next we will declare our variables.

longestDistinctSubstring(str) {
  if (str.length === 0 || str.length === 1) return str.length;
  
  // define lookup variable for the characters in the window
  // define a  start index for the start position of the window
  // define an end index for the end position of the window
  // define a longestSubstring variabe to track the largest substring length
  let lookup = {};
  let startIndex = 0;
  let endIndex = 0;
  let longestSubstring = 0;
}

Then, we will define our while loop.

longestDistinctSubstring(str) {
  if (str.length === 0 || str.length === 1) return str.length;
  
  let lookup = {};
  let startIndex = 0;
  let endIndex = 0;
  let longestSubstring = 0;
  
  // define a while loop with the condition being startIndex < string.length
  while (startIndex < str.length) {}
}

Now we can check if the character at the endIndex position is present within the lookup.
If it is not we should add it to the lookup, expand the window, and update the longestSubstring value if the window size is larger than the longestSubstring value.

longestDistinctSubstring(str) {
  if (str.length === 0 || str.length === 1) return str.length;
  
  let lookup = {};
  let startIndex = 0;
  let endIndex = 0;
  let longestSubstring = 0;
  
  while (startIndex < str.length) {
    // if the letter at string[endIndex] is not in the lookup variable and the endIndex is less than the str.length
    if (!lookup[str[endIndex]] && endIndex < str.length) { // Also checking that the endIndex is within the str still
      // add it to the variable
      // move the end index forwards
      // set longestSubstring to either longestSubstring or endIndex - startIndex, whichever is larger.  
      lookup[str[endIndex]] = 1;
      endIndex++;
      longestSubstring = Math.max(longestSubstring, endIndex - startIndex);
    }
  }
}

And, if the character at the endIndex position is present within the lookup we should shrink the window until there are no duplicate values.

longestDistinctSubstring(str) {
  if (str.length === 0 || str.length === 1) return str.length;
  
  let lookup = {};
  let startIndex = 0;
  let endIndex = 0;
  let longestSubstring = 0;
  
  while (startIndex < str.length) {
    if (!lookup[str[endIndex]] && endIndex < str.length) { 
      lookup[str[endIndex]] = 1;
      endIndex++;
      longestSubstring = Math.max(longestSubstring, endIndex - startIndex);
    } else if (lookup[str[endIndex]]) {
      // else if the character is in the lookup
      // shrink the window by subtracting 1 from the character at string[startIndex] from the lookup variable
      // add 1 to startIndex
      lookup[str[startIndex]]--;
      startIndex++
    }
  }
}

Finally, we will break the loop and return the longestSubarray.

longestDistinctSubstring(str) {
  if (str.length === 0 || str.length === 1) return str.length;
  
  let lookup = {};
  let startIndex = 0;
  let endIndex = 0;
  let longestSubstring = 0;
  
  while (startIndex < str.length) {
    if (!lookup[str[endIndex]] && endIndex < str.length) { 
      lookup[str[endIndex]] = 1;
      endIndex++;
      longestSubstring = Math.max(longestSubstring, endIndex - startIndex);
    } else if (lookup[str[endIndex]]) {
      lookup[str[startIndex]]--;
      startIndex++
    } else {
      // else break the loop
      break;
    }
  }
  
  return longestSubstring;
}

Tada! We now have an algorithm that will determine the longest substring with all unique characters. Time Complexity - O(N)

Space Complexity - O(1)

longestDistinctSubstring(str) {
  if (str.length === 0 || str.length === 1) return str.length;
  
  let lookup = {};
  let startIndex = 0;
  let endIndex = 0;
  let longestSubstring = 0;
  
  while (startIndex < str.length) {
    if (!lookup[str[endIndex]] && endIndex < str.length) { 
      lookup[str[endIndex]] = 1;
      endIndex++;
      longestSubstring = Math.max(longestSubstring, endIndex - startIndex);
    } else if (lookup[str[endIndex]]) {
      lookup[str[startIndex]]--;
      startIndex++
    } else {
      break;
    }
  }
  
  return longestSubstring;
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment