I'm no longer using this exact setup: it is based on an old version of libfprint, and further development is moving at a slow pace.
This document will still be available for further reference, but I don't plan to update it.
import re
import glob

def main():
    # Name of the TSV file the extracted data is meant for
    outputFile = 'output.tsv'
    # Collect every HTML file in the current directory
    files = glob.glob('*.html')
    print("Extracting from %s files" % len(files))
    data = []