Skip to content

Instantly share code, notes, and snippets.

@Technoash
Created October 14, 2024 22:32
Show Gist options
  • Select an option

  • Save Technoash/6b20648f64bbad2786c8a8a9c83f62b6 to your computer and use it in GitHub Desktop.

Select an option

Save Technoash/6b20648f64bbad2786c8a8a9c83f62b6 to your computer and use it in GitHub Desktop.
Explain left joins done in javascript

Recap of the leftJoin Function

First, here's the leftJoin function from the script:

function leftJoin(
  left: ReturnAssessment[],
  right: UCMReturn[],
  leftKey: keyof ReturnAssessment,
  rightKey: keyof UCMReturn
) {
  const rightLookup = right.reduce<Record<string, UCMReturn[]>>((acc, item) => {
    const key = item[rightKey as string];
    if (!acc[key]) {
      acc[key] = [];
    }
    acc[key].push(item);
    return acc;
  }, {});

  const joined: Array<ReturnAssessment & Partial<UCMReturn>> = [];

  for (const leftItem of left) {
    const key = leftItem[leftKey as string];
    const matches = rightLookup[key];

    if (matches) {
      for (const rightItem of matches) {
        joined.push({ ...leftItem, ...rightItem });
      }
    } else {
      joined.push({ ...leftItem });
    }
  }

  return joined;
}

Breaking Down the Function

Let's go through this function line by line, explaining every part, especially focusing on what the map is, why we're using it, and how it helps us perform a left join.


1. Function Definition

function leftJoin(
  left: ReturnAssessment[],
  right: UCMReturn[],
  leftKey: keyof ReturnAssessment,
  rightKey: keyof UCMReturn
) { /* function body */ }
  • Purpose: This function performs a left join between two datasets (arrays of objects).
  • Parameters:
    • left: The left dataset (an array of ReturnAssessment objects).
    • right: The right dataset (an array of UCMReturn objects).
    • leftKey: The key in the left dataset to join on (e.g., 'ASSESSMENT_ID').
    • rightKey: The key in the right dataset to join on (e.g., 'ASSESSMENT_ID').

2. Creating the Map (rightLookup)

This is where we create the map:

const rightLookup = right.reduce<Record<string, UCMReturn[]>>((acc, item) => {
  const key = item[rightKey as string];
  if (!acc[key]) {
    acc[key] = [];
  }
  acc[key].push(item);
  return acc;
}, {});

Breaking it down:

  • Objective: To create a map (also known as a dictionary or object) that organizes the right dataset (right) in a way that we can quickly find all items that match a specific key.

  • Terminology:

    • Map (rightLookup): An object where each key is an ASSESSMENT_ID from the right dataset, and each value is an array of items from the right dataset that have that ASSESSMENT_ID.

Let's understand each part:

  • right.reduce<...>((acc, item) => { /*...*/ }, {});
    • right: The right dataset (array of UCMReturn objects).
    • .reduce<...>: A method that processes each item in the array and builds up an accumulated result (acc).
    • acc: The accumulator, which in this case is the map we're building.
    • item: The current item from the right dataset being processed.
    • {}: The initial value of the accumulator, an empty object.

Inside the reduce function:

  1. Extract the Key:

    const key = item[rightKey as string];
    • Purpose: Get the value of the ASSESSMENT_ID from the current item.
    • Example: If item is { ASSESSMENT_ID: 'A', RETURN_ID: '9218' }, then key will be 'A'.
  2. Initialize the Array for the Key if It Doesn't Exist:

    if (!acc[key]) {
      acc[key] = [];
    }
    • Purpose: If there isn't already an array for this ASSESSMENT_ID in the map, create one.
    • Explanation: This ensures that acc[key] is always an array.
  3. Add the Current Item to the Array for This Key:

    acc[key].push(item);
    • Purpose: Add the current item to the array corresponding to its ASSESSMENT_ID.
  4. Return the Accumulator for the Next Iteration:

    return acc;
    • Purpose: The reduce function requires returning the accumulator for the next iteration.

Resulting Data Structure (rightLookup):

  • Type: An object (map) where:
    • Keys: ASSESSMENT_IDs from the right dataset.
    • Values: Arrays of UCMReturn objects that have that ASSESSMENT_ID.

Example:

If right (the right dataset) is:

[
  { ASSESSMENT_ID: 'A', RETURN_ID: '9218' },
  { ASSESSMENT_ID: 'A', RETURN_ID: '8420' },
  { ASSESSMENT_ID: 'B', RETURN_ID: '1516' },
  { ASSESSMENT_ID: 'F', RETURN_ID: '1517' },
]

Then rightLookup will be:

{
  'A': [
    { ASSESSMENT_ID: 'A', RETURN_ID: '9218' },
    { ASSESSMENT_ID: 'A', RETURN_ID: '8420' },
  ],
  'B': [
    { ASSESSMENT_ID: 'B', RETURN_ID: '1516' },
  ],
  'F': [
    { ASSESSMENT_ID: 'F', RETURN_ID: '1517' },
  ],
}

Why Are We Using a Map?

  • Quick Access: A map allows us to find all matching items for a given ASSESSMENT_ID instantly.
  • Efficiency: Searching through an array each time would be slow, especially with large datasets. A map reduces the time complexity from linear (O(n)) to constant (O(1)).

3. Initializing the Joined Array

const joined: Array<ReturnAssessment & Partial<UCMReturn>> = [];
  • Purpose: This is an array where we will store the results of the left join.
  • Type Explanation:
    • ReturnAssessment & Partial<UCMReturn>: Each item in joined will have all the properties of a ReturnAssessment, plus any matching properties from UCMReturn.
    • Partial<UCMReturn>: Indicates that the UCMReturn part might be incomplete (some properties may be missing if there was no match).

4. Performing the Left Join

for (const leftItem of left) {
  const key = leftItem[leftKey as string];
  const matches = rightLookup[key];

  if (matches) {
    for (const rightItem of matches) {
      joined.push({ ...leftItem, ...rightItem });
    }
  } else {
    joined.push({ ...leftItem });
  }
}

Step-by-Step Explanation:

a. Loop Over Each Item in the Left Dataset

for (const leftItem of left) { /*...*/ }
  • Purpose: Go through every item in the left dataset (returnAssessments).

b. Get the Key from the Left Item

const key = leftItem[leftKey as string];
  • Purpose: Extract the ASSESSMENT_ID from the current leftItem.
  • Example: If leftItem is { ASSESSMENT_ID: 'A', ASSPER_REG_CLIENT_ID: '122199', ... }, then key will be 'A'.

c. Find Matches in the Right Map

const matches = rightLookup[key];
  • Purpose: Look up the ASSESSMENT_ID in rightLookup to find any matching items from the right dataset.
  • Result:
    • If there are matches, matches will be an array of UCMReturn objects.
    • If there are no matches, matches will be undefined.

d. Check If Matches Were Found

if (matches) { /*...*/ } else { /*...*/ }
  • Purpose: Decide what to do based on whether matches were found.

e. If Matches Were Found

for (const rightItem of matches) {
  joined.push({ ...leftItem, ...rightItem });
}
  • Explanation:
    • Loop Over Each Matching Right Item: There might be multiple matches (e.g., multiple RETURN_IDs for the same ASSESSMENT_ID).
    • Combine Left and Right Items:
      • { ...leftItem, ...rightItem }:
        • The { ...leftItem } syntax copies all properties from leftItem.
        • The { ...leftItem, ...rightItem } syntax copies all properties from both leftItem and rightItem. If they share any properties, the values from rightItem overwrite those from leftItem.
      • Add the Combined Item to joined:
        • Each combination is added to the joined array.

f. If No Matches Were Found

joined.push({ ...leftItem });
  • Explanation:
    • Include the Left Item As Is:
      • Since there are no matches in the right dataset, we include the left item alone.
      • This ensures that all items from the left dataset are represented in the final result, even if they have no matching right data.

5. Returning the Joined Array

return joined;
  • Purpose: After processing all items, we return the joined array, which now contains the result of the left join.

Visualizing the Process with Examples

Let's use our sample data to see how this works in practice.

Left Dataset (returnAssessments):

[
  { ASSESSMENT_ID: 'A', ASSPER_REG_CLIENT_ID: '122199', ASSPER_REG_REVENUE_TYPE: 'F', REVERSAL_IND: 'N' },
  { ASSESSMENT_ID: 'B', ASSPER_REG_CLIENT_ID: '122180', ASSPER_REG_REVENUE_TYPE: 'F', REVERSAL_IND: 'N' },
  { ASSESSMENT_ID: 'C', ASSPER_REG_CLIENT_ID: '122170', ASSPER_REG_REVENUE_TYPE: 'F', REVERSAL_IND: 'Y' },
]

Right Dataset (ucmReturns):

[
  { ASSESSMENT_ID: 'A', RETURN_ID: '9218' },
  { ASSESSMENT_ID: 'A', RETURN_ID: '8420' },
  { ASSESSMENT_ID: 'B', RETURN_ID: '1516' },
  { ASSESSMENT_ID: 'F', RETURN_ID: '1517' },
]

Step 1: Creating the rightLookup Map

  • Resulting rightLookup:
{
  'A': [
    { ASSESSMENT_ID: 'A', RETURN_ID: '9218' },
    { ASSESSMENT_ID: 'A', RETURN_ID: '8420' },
  ],
  'B': [
    { ASSESSMENT_ID: 'B', RETURN_ID: '1516' },
  ],
  'F': [
    { ASSESSMENT_ID: 'F', RETURN_ID: '1517' },
  ],
}

Step 2: Performing the Left Join

Processing Each Left Item:

  1. Left Item with ASSESSMENT_ID: 'A':

    • key = 'A'
    • matches = rightLookup['A']
      • Matches found: Two items.
    • For Each Match:
      • First Match:
        • rightItem = { ASSESSMENT_ID: 'A', RETURN_ID: '9218' }
        • Combined Item:
          {
            ASSESSMENT_ID: 'A',
            ASSPER_REG_CLIENT_ID: '122199',
            ASSPER_REG_REVENUE_TYPE: 'F',
            REVERSAL_IND: 'N',
            RETURN_ID: '9218',
          }
      • Second Match:
        • rightItem = { ASSESSMENT_ID: 'A', RETURN_ID: '8420' }
        • Combined Item:
          {
            ASSESSMENT_ID: 'A',
            ASSPER_REG_CLIENT_ID: '122199',
            ASSPER_REG_REVENUE_TYPE: 'F',
            REVERSAL_IND: 'N',
            RETURN_ID: '8420',
          }
    • Both combined items are added to joined.
  2. Left Item with ASSESSMENT_ID: 'B':

    • key = 'B'
    • matches = rightLookup['B']
      • Matches found: One item.
    • For the Match:
      • rightItem = { ASSESSMENT_ID: 'B', RETURN_ID: '1516' }
      • Combined Item:
        {
          ASSESSMENT_ID: 'B',
          ASSPER_REG_CLIENT_ID: '122180',
          ASSPER_REG_REVENUE_TYPE: 'F',
          REVERSAL_IND: 'N',
          RETURN_ID: '1516',
        }
    • The combined item is added to joined.
  3. Left Item with ASSESSMENT_ID: 'C':

    • key = 'C'
    • matches = rightLookup['C']
      • No matches found (matches is undefined).
    • Include the Left Item As Is:
      {
        ASSESSMENT_ID: 'C',
        ASSPER_REG_CLIENT_ID: '122170',
        ASSPER_REG_REVENUE_TYPE: 'F',
        REVERSAL_IND: 'Y',
        // No RETURN_ID
      }
    • This item is added to joined.

Final joined Array:

[
  {
    ASSESSMENT_ID: 'A',
    ASSPER_REG_CLIENT_ID: '122199',
    ASSPER_REG_REVENUE_TYPE: 'F',
    REVERSAL_IND: 'N',
    RETURN_ID: '9218',
  },
  {
    ASSESSMENT_ID: 'A',
    ASSPER_REG_CLIENT_ID: '122199',
    ASSPER_REG_REVENUE_TYPE: 'F',
    REVERSAL_IND: 'N',
    RETURN_ID: '8420',
  },
  {
    ASSESSMENT_ID: 'B',
    ASSPER_REG_CLIENT_ID: '122180',
    ASSPER_REG_REVENUE_TYPE: 'F',
    REVERSAL_IND: 'N',
    RETURN_ID: '1516',
  },
  {
    ASSESSMENT_ID: 'C',
    ASSPER_REG_CLIENT_ID: '122170',
    ASSPER_REG_REVENUE_TYPE: 'F',
    REVERSAL_IND: 'Y',
    // No RETURN_ID
  },
]

Summarizing the Key Points

  • The Map (rightLookup):

    • Definition: An object that groups items from the right dataset (ucmReturns) by their ASSESSMENT_ID.
    • Purpose: To allow quick and efficient access to all items from the right dataset that match a given ASSESSMENT_ID.
    • Usage in Code: Created using the reduce function to process the right dataset once and organize it for fast lookups.
  • Performing the Left Join:

    • Process:
      • For each item in the left dataset, use the map to find matching items in the right dataset.
      • If matches are found, combine each one with the left item.
      • If no matches, include the left item alone.
    • Outcome: All items from the left dataset are included in the result, with additional data from the right dataset when available.
  • Why This Approach Is Efficient:

    • Using a Map:
      • Accessing matches is fast because it's a direct lookup (O(1) time complexity).
      • Without a map, we'd have to search through the right dataset for each left item, which would be much slower (O(n) time complexity for each search).
  • Understanding for a Newbie:

    • Think of the Map as a Phone Directory:
      • Keys (Names): ASSESSMENT_IDs.
      • Values (Phone Numbers): Lists of RETURN_IDs associated with each ASSESSMENT_ID.
      • When you have a name (an ASSESSMENT_ID from the left dataset), you can quickly look up their phone numbers (matching RETURN_IDs) without flipping through every page.

Conclusion

  • Map (rightLookup): A critical part of the leftJoin function that allows us to efficiently perform the join operation.
  • Left Join Mechanism:
    • Ensures all data from the left dataset is preserved.
    • Adds matching data from the right dataset when available.
  • Efficiency and Clarity:
    • Using a map makes the code efficient and easier to understand once you grasp how maps work.
  • Key Takeaway for a Newbie:
    • Maps/Dictionaries: Fundamental data structures in programming that allow fast data retrieval based on keys.
    • Left Join in Code: By organizing data efficiently (with a map), we can mimic complex database operations in our programs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment