@ricardocuellar
Created December 9, 2025 22:45
Python + n8n - Section 4 - Analyze words (black list)
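import re
from typing import List

# Common Spanish function words to exclude from the analysis (the "black list").
# Accented and unaccented spellings are both listed so either form is filtered.
SPANISH_STOPWORDS = {
    "de", "la", "que", "el", "en", "y", "a", "los", "del", "se", "las",
    "por", "un", "para", "con", "no", "una", "su", "al", "lo", "como",
    "más", "mas", "o", "pero", "sus", "le", "ya", "si", "porque", "cuando",
    "muy", "sin", "sobre", "también", "tambien", "me", "hasta", "hay",
    "donde", "han", "quien", "entre", "está", "esta", "ser", "son",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and return its runs of Spanish letters as tokens."""
    text = text.lower()
    return re.findall(r"[a-záéíóúüñ]+", text, flags=re.IGNORECASE)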
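A minimal sketch of how the stopword set and `tokenize` might be combined to analyze words: tokenize the text, drop the blacklisted tokens, and count what remains. The names `SAMPLE_STOPWORDS` and `top_words` are hypothetical and not part of the original gist; `SAMPLE_STOPWORDS` is a trimmed copy of `SPANISH_STOPWORDS`, and `tokenize` is repeated only so the block runs on its own.

```python
import re
from collections import Counter
from typing import List, Set, Tuple

# Hypothetical, trimmed copy of SPANISH_STOPWORDS, just enough for this demo.
SAMPLE_STOPWORDS: Set[str] = {"de", "la", "el", "en", "y", "no", "muy"}


def tokenize(text: str) -> List[str]:
    """Same tokenizer as above, repeated so this sketch is self-contained."""
    text = text.lower()
    return re.findall(r"[a-záéíóúüñ]+", text, flags=re.IGNORECASE)


def top_words(text: str, stopwords: Set[str], n: int = 3) -> List[Tuple[str, int]]:
    """Hypothetical helper: count non-stopword tokens, return the n most frequent."""
    tokens = [t for t in tokenize(text) if t not in stopwords]
    return Counter(tokens).most_common(n)


if __name__ == "__main__":
    sample = "El gato y la gata: el gato no duerme."
    print(top_words(sample, SAMPLE_STOPWORDS, n=1))
```

In the full script, `SPANISH_STOPWORDS` would presumably be passed in place of the sample set.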