I hereby claim:
- I am ryanwitt on github.
- I am onecreativenerd (https://keybase.io/onecreativenerd) on keybase.
- I have a public key ASB_TkeaXoaqMw5ii1FGzwCwYblooenmt-s59k24W87OZAo
To claim this, I am signing this object:
import redis


class RedisTools:
    '''
    A set of utility tools for interacting with a redis cache
    '''
    def __init__(self):
        self._queues = ["default", "high", "low", "failed"]
        self.get_redis_connection()

    def get_redis_connection(self):
        # Assumed completion: the original preview is truncated here.
        self._connection = redis.StrictRedis(host='localhost', port=6379)
        return self._connection
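A possible use of the class, assuming the queue names follow the RQ convention of one redis list per queue (the rq:queue: key prefix is an assumption, not something the snippet shows):

tools = RedisTools()
for q in tools._queues:
    print('%s: %d' % (q, tools._connection.llen('rq:queue:' + q)))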
#!/bin/sh
# Install node v0.12.2 into /usr/local from the official tarball.
VERSION=0.12.2
PLATFORM=linux
ARCH=x64
PREFIX=/usr/local
# -f: fail on HTTP errors instead of piping an error page into tar; -L: follow redirects
mkdir -p "$PREFIX" && \
curl -fL http://nodejs.org/dist/v$VERSION/node-v$VERSION-$PLATFORM-$ARCH.tar.gz \
  | tar xzvf - --strip-components=1 -C "$PREFIX"
def collect_ranges(s):
    """
    Returns a generator of tuples of consecutive numbers found in the input.

    >>> list(collect_ranges([]))
    []
    >>> list(collect_ranges([1]))
    [(1, 1)]
    >>> list(collect_ranges([1,2,3]))
    [(1, 3)]
    """
    # Assumed completion: the preview cuts off before the body.
    nums = list(s)
    start = 0
    for i in range(1, len(nums) + 1):
        if i == len(nums) or nums[i] != nums[i - 1] + 1:
            yield (nums[start], nums[i - 1])
            start = i
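The doctests above can be exercised directly; a quick harness, assuming the function is saved as a module:

if __name__ == '__main__':
    import doctest
    doctest.testmod(verbose=True)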
//
// cpuse.js - simple continuous cpu monitor for node
//
// Intended for programs wanting to monitor and take action on overall CPU load.
//
// The monitor starts as soon as you require the module, then you can query it at
// any later time for the average cpu:
//
// > var cpuse = require('cpuse');
// > cpuse.averages();
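For reference, the underlying technique (sample the kernel's cumulative CPU counters twice and compare deltas) looks roughly like this in Python. This is a sketch of the idea, not cpuse's actual implementation, and it assumes Linux's /proc/stat:

import time

def cpu_times():
    # first line of /proc/stat: 'cpu  user nice system idle iowait ...'
    with open('/proc/stat') as f:
        return [int(v) for v in f.readline().split()[1:]]

def cpu_percent(interval=1.0):
    a = cpu_times()
    time.sleep(interval)
    b = cpu_times()
    idle = b[3] - a[3]            # field 3 is idle jiffies
    total = sum(b) - sum(a)
    return 100.0 * (total - idle) / total

print('%.1f%% cpu' % cpu_percent())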
// Check mongodb working set size (Mongo 2.4+).
// Paste this into the mongo shell; it returns the size in GB.
db.runCommand({
    serverStatus: 1, workingSet: 1, metrics: 0, locks: 0
}).workingSet.pagesInMemory * 4096 / (Math.pow(2, 30));
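The same check scripted with pymongo, as a sketch; it assumes a local mongod from the 2.4/2.6 era, where the workingSet estimator still exists:

from bson.son import SON
from pymongo import MongoClient

client = MongoClient('localhost', 27017)
status = client.admin.command(SON([('serverStatus', 1), ('workingSet', 1)]))
pages = status['workingSet']['pagesInMemory']  # 4 KB pages resident in RAM
print('working set: %.2f GB' % (pages * 4096 / 2.0 ** 30))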
You need 7zip installed to grab the NPI database (on OS X: brew install p7zip).
To create the index, run the init_* scripts. You'll need the doctor referral graph data to use *_refer.*, but the NPI database will be downloaded for you automatically. Indexing runs on all cores and takes less than 10 minutes on my 8-core machine.
To grab lines matching a search term, run python search_npi.py term.
Note: index performance is good if you have a lot of memory, since the index file blocks stay hot in the OS cache. But the index is reloaded every time the program runs, which is super inefficient. A better design would be an on-disk hashtable whose slot offsets can be computed directly, so nothing needs loading up front (see the sketch below).
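A minimal sketch of that on-disk hashtable idea in Python, assuming fixed-width records keyed by NPI; the record layout, table size, and lookup() helper are all hypothetical, not what the current scripts produce:

import mmap
import struct

RECORD = 16          # hypothetical layout: 8-byte NPI key + 8-byte file offset
NBUCKETS = 1 << 24   # fixed table size, so a slot's byte position is arithmetic

def lookup(path, npi):
    """Return the data-file offset for an NPI without preloading the index."""
    with open(path, 'rb') as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        slot = npi % NBUCKETS
        while True:  # linear probing from the hashed slot
            off = slot * RECORD
            key, offset = struct.unpack('<QQ', mm[off:off + RECORD])
            if key == 0:
                return None       # empty slot: NPI not in the index
            if key == npi:
                return offset     # byte offset into the NPI data file
            slot = (slot + 1) % NBUCKETS

Only the blocks touched by the probe are ever read, so the OS page cache does all the work and startup cost drops to nearly zero.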
import traceback

# Count how many referral edges each doctor appears in, on either side.
froms = {}
tos = {}
for i, line in enumerate(open('refer.2011.csv')):
    try:
        fr, to, count = line.strip().split(',')
        froms[fr] = froms.get(fr, 0) + 1
        tos[to] = tos.get(to, 0) + 1
    except Exception:
        traceback.print_exc()
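As a side note, the same tally reads a little tighter with collections.Counter (a sketch with the per-line error handling dropped for brevity):

from collections import Counter

froms, tos = Counter(), Counter()
for line in open('refer.2011.csv'):
    fr, to, count = line.strip().split(',')
    froms[fr] += 1
    tos[to] += 1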
import random
import matplotlib.pyplot as plt

# Reservoir sampling: keep a uniformly random sample of k items from a stream.
k = 1000
array = []
for n, x in enumerate([random.randrange(k) for x in range(100000)]):
    if n < k:
        array.append(x)  # fill the reservoir with the first k items
    else:
        # Assumed completion (the preview is truncated): keep item n with
        # probability k/(n+1), since n is 0-indexed, replacing a random slot.
        if random.random() < k / float(n + 1):
            array[random.randrange(k)] = x

plt.hist(array, bins=50)  # the sample should look uniform over 0..k-1
plt.show()