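Many companies use Python heavily, and much of what they build around it (development conventions, code guidelines, common components, base frameworks) could be shared.
Having every company build its own set wastes a lot of effort, so I am opening this thread to discuss how to put together this kind of Python infrastructure.
The guiding principle is to avoid reinventing the wheel wherever possible. Still, with so many open-source components available, choosing among them and combining them takes work, and that is what I would like to discuss here.
On the other hand, some open-source components are powerful, but we cannot fully master them, or we only use a small part of them. In those cases we can consider implementing a simple wheel of our own in Python, which gives us more control; ideally it should stay under 300 lines of code.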
| // 4 spaces to 2 spaces | |
| :%s;^\(\s\+\);\=repeat(' ', len(submatch(0))/2);g | |
| // Tab to 2 spaces | |
| :%s/\t/  /g |
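The two Vim substitutions above can also be expressed in Python, which may be handy when the conversion has to run in a script rather than an editor. This is only a sketch: the function names `halve_leading_whitespace` and `tabs_to_spaces` are made up for illustration, and the integer division mirrors what `len(submatch(0))/2` does in Vim.

```python
import re


def halve_leading_whitespace(text):
    """Halve the leading whitespace of every line (e.g. 4 spaces -> 2 spaces)."""
    # [ \t]+ rather than \s+ so the match never runs across a newline.
    return re.sub(
        r"^[ \t]+",
        lambda m: " " * (len(m.group(0)) // 2),
        text,
        flags=re.MULTILINE,
    )


def tabs_to_spaces(text, width=2):
    """Replace every tab character with `width` spaces."""
    return text.replace("\t", " " * width)
```

The two functions are kept separate on purpose, as in the Vim snippet: running both on the same text would first turn a tab into two spaces and then halve it to one.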
| import spark.SparkContext | |
| import SparkContext._ | |
| /** | |
| * A port of [[http://blog.echen.me/2012/02/09/movie-recommendations-and-more-via-mapreduce-and-scalding/]] | |
| * to Spark. | |
| * Uses movie ratings data from MovieLens 100k dataset found at [[http://www.grouplens.org/node/73]] | |
| */ | |
| object MovieSimilarities { |
| class UnhandledExceptionError(Exception): | |
|     """The exception could not be handled by the supervisor.""" | |
| class SupervisorKilledError(Exception): | |
|     """The supervisor was killed.""" | |
| class SupervisorAbortedError(Exception): | |
|     """The supervisor gave up after maximum number of failures.""" |
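The exception classes above suggest a supervisor that restarts a failing task and gives up after a maximum number of failures. The original supervisor code is not shown here, so the following is only a minimal sketch of that idea: the `supervise` function, its `max_failures` and `handled` parameters, and the choice of `IOError` as the retryable error are all assumptions made for illustration. `SupervisorKilledError` is left out because the snippet does not show how a supervisor gets killed.

```python
class UnhandledExceptionError(Exception):
    """The exception could not be handled by the supervisor."""


class SupervisorAbortedError(Exception):
    """The supervisor gave up after maximum number of failures."""


def supervise(task, max_failures=3, handled=(IOError,)):
    """Hypothetical helper: call `task` until it succeeds.

    Exceptions listed in `handled` count as failures and trigger a retry;
    any other exception is reported as UnhandledExceptionError.
    """
    failures = 0
    while True:
        try:
            return task()
        except handled:
            failures += 1
            if failures >= max_failures:
                raise SupervisorAbortedError(
                    "gave up after %d failures" % failures)
        except Exception as exc:
            raise UnhandledExceptionError(repr(exc))
```

A caller would wrap a flaky operation, for example `supervise(fetch_page, max_failures=5)` where `fetch_page` is any zero-argument callable, and handle `SupervisorAbortedError` at the top level.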
| class JsonSerializableMixin(object): | |
|     """ | |
|     Converts all the properties of the object into a dict for use in json. | |
|     You can define the following as your class properties. | |
|     _json_eager_load : | |
|         list of which child classes need to be eagerly loaded. This applies | |
|         to one-to-many relationships defined in SQLAlchemy classes. | |
|     _base_blacklist : |
| #! /usr/bin/python | |
| # coding: utf-8 | |
| """Copy table from mysql to sqlite. | |
| Require: | |
| * SQLAlchemy | |
| * MySQLdb or PyMySQL | |
| Usage: |
| class Backend(object): | |
|     def __init__(self): | |
|         engine = create_engine( | |
|             "mysql://{0}:{1}@{2}/{3}".format( | |
|                 options.mysql_user, options.mysql_pass, | |
|                 options.mysql_host, options.mysql_db), | |
|             pool_size=options.mysql_poolsize, | |
|             pool_recycle=3600, | |
|             echo=options.debug, | |
|             echo_pool=options.debug) | |
|         self._session = sessionmaker(bind=engine) | |
| @classmethod |
| from gevent import monkey; monkey.patch_all() | |
| import gevent | |
| import gevent.greenlet | |
| from functools import partial | |
| from random import random | |
| import urllib | |
| import urllib2 | |
| def on_exception(fun, greenlet): |