@sunziping2016
Created April 9, 2023 10:00
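#!/usr/bin/env python3
"""Count Chinese (Han) characters in each data/chap*.tex file and in total."""
import regex as re  # third-party `regex` module; supports \p{...} Unicode properties
from glob import iglob


def main():
    # Match runs of one or more Han-script characters.
    pattern = re.compile(r'([\p{IsHan}]+)', re.UNICODE)
    total = 0
    for filename in sorted(iglob('data/chap*.tex')):
        with open(filename, 'r', encoding='utf-8') as f:
            text = f.read()
        items = pattern.findall(text)
        # Each item is a run of Han characters, so its length is a character count.
        num = sum(len(i) for i in items)
        total += num
        print(filename, num)
    print('total:', total)


if __name__ == '__main__':
    main()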