I took the JSON from the following gist: original rsdb gist - local copy. It contains a set of records on racial and ethnic slurs.
I wrote a Python script that does the following things:
- creates a JSON file where all slurs are grouped by the name of the group that is the subject of the slur: slurs sorted by group name
- creates another file of the same structure, where the groups are sorted by the number of slurs
- creates another file that counts the slurs for each group and sorts the groups by that count
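The three outputs above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the field names `group` and `slur`, the placeholder records, and the descending sort order for the counts are all assumptions, since the real layout of the gist's JSON is not shown here.

```python
import json
from collections import defaultdict

# Placeholder records. The field names "slur" and "group" are assumptions;
# the real JSON from the gist may use different keys.
records = [
    {"slur": "term-a", "group": "Group X"},
    {"slur": "term-b", "group": "Group Y"},
    {"slur": "term-c", "group": "Group X"},
]


def group_by(records, key="group", value="slur"):
    """Collect the `value` field of each record under its `key` field."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[key]].append(rec[value])
    return grouped


grouped = group_by(records)

# 1. Groups sorted alphabetically by group name.
by_name = {name: grouped[name] for name in sorted(grouped)}

# 2. Same structure, groups sorted by number of entries (largest first, assumed).
by_count = dict(sorted(grouped.items(), key=lambda kv: len(kv[1]), reverse=True))

# 3. Only the counts per group, in the same order as by_count.
counts = {name: len(entries) for name, entries in by_count.items()}

print(json.dumps(counts, indent=2))
```

Python dictionaries keep insertion order, so writing each of these with `json.dump` preserves the sorting in the output files.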
Apart from being a fun little exercise, it illustrates a very old data-processing / data-science trick: by grouping all the entries by their index value and looking at some common metadata (such as the name of the index key), you can easily spot irregularities. For example, there are some strange or duplicated group names, and they are much easier to spot when you look at just the group names.
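One way to make the duplicated group names stand out even more is to normalize them and see which spellings collapse to the same key. The sketch below is only an assumption about what such a check could look like; the `normalize` helper and the sample names are hypothetical and not part of the original script.

```python
from collections import defaultdict

# Hypothetical group names, as they might appear as keys after grouping.
group_names = ["Group X", "group x", "Group  X ", "Group Y"]


def normalize(name):
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


# Map each normalized name to the set of original spellings seen for it.
variants = defaultdict(set)
for name in group_names:
    variants[normalize(name)].add(name)

# More than one spelling under the same normalized key is a likely duplicate.
suspects = {key: sorted(names) for key, names in variants.items() if len(names) > 1}
print(suspects)
```

This only catches differences in case and whitespace; misspellings or alternative names for the same group would still need a manual look at the list.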
- So much creative energy is/was being wasted on these slurs. It seems that each group defines itself in terms of the slurs that it launches at the 'others'. Each group claims to be the opposite of those slurs ('we are not like them'), but they don't get any prettier when they start persecuting the minority groups living beside them.
- For me as a Jew: it is fascinating to realize how some relatively small groups get a disproportionately large amount of 'attention'. It's the twenty-first century, but most of Europe seems to be getting back to its antisemitic roots. Isn't that perplexing?
- Actually, this one provides an explanation of the meaning of the slurs, so it can be seen as educational material...
- Also interesting: will I get banned for this one? :-)