I hereby claim:
- I am corey on github.
- I am coreyr (https://keybase.io/coreyr) on keybase.
- I have a public key ASB8VlTKBAJeOdUzWFC-FhjDQgnl_FsisFq2vyVQeSzwewo
To claim this, I am signing this object:
A friend asked me for a few pointers to interesting, mostly recent papers on data warehousing and "big data" database systems, with an eye towards real-world deployments. I figured I'd share the list. It's biased and rather incomplete, but maybe of interest to someone. While many are obvious choices (I've omitted several, like MapReduce), I think there are a few underappreciated gems.
### Dataflow Engines:
Dryad--general-purpose distributed parallel dataflow engine
http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf
Spark--in-memory dataflow
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf

    # config/initializers/extensions/active_record.rb
    module ActiveRecord
      class Base
        class << self
          delegate :pluck, to: :scoped
        end
      end
      class CollectionProxy
        delegate :pluck, to: :scoped

    # Have you ever had to sleep() in Capybara-WebKit to wait for AJAX and/or CSS animations?
    describe 'Modal' do
      should 'display login errors' do
        visit root_path
        click_link 'My HomeMarks'
        within '#login_area' do
          fill_in 'email', with: '[email protected]'
          fill_in 'password', with: 'test'
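
One common alternative to a fixed `sleep()` is to poll for the condition you actually care about and give up after a deadline. The snippet above is cut off, so the following is not the author's solution, just a minimal sketch in plain Ruby. The names `wait_until` and `WaitTimeout` are made up for illustration, and the final lines simulate a slow page with a timer instead of a real browser.

```ruby
# Hypothetical error raised when the condition never becomes true.
class WaitTimeout < StandardError; end

# Hypothetical helper: call the block repeatedly until it returns a truthy
# value, sleeping `interval` seconds between attempts. Raises WaitTimeout
# once `timeout` seconds have passed without success.
def wait_until(timeout: 2.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout}s"
    end
    sleep interval
  end
end

# Simulated "AJAX" that finishes roughly 0.2 seconds from now.
ready_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 0.2
wait_until(timeout: 1.0) { Process.clock_gettime(Process::CLOCK_MONOTONIC) >= ready_at }
```

In a Capybara spec the block would typically wrap a query such as `page.has_css?(...)`, so the test continues as soon as the element shows up rather than after a worst-case delay.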

    # activerecord/lib/active_record/associations/builder/belongs_to.rb automagically creates
    # the private methods for you if you include counter_cache in your belongs_to association.
    # We simply override the basic behavior of these with our own conditions.
    #
    # For more information, check out:
    # https://github.com/rails/rails/blob/733bfa63f5d8d3b963202b6d3e9f00b4db070b91/activerecord/lib/active_record/associations/builder/belongs_to.rb
    # Lines 23 - 44
    class Inventory < ActiveRecord::Base
      belongs_to :user, counter_cache: true

    (ns gist.globhfs
      (:import [cascading.tap GlobHfs]))
    ;; ### Bucket to Cluster
    ;;
    ;; To get tuples back out of our directory structure on S3, we employ
    ;; Cascading's [GlobHFS](http://goo.gl/1Vwdo) tap, along with an
    ;; interface tailored for datasets stored in the MODIS sinusoidal
    ;; projection. For details on the globbing syntax, see
    ;; [here](http://goo.gl/uIEzu).

    # unicorn_rails -c /data/github/current/config/unicorn.rb -E production -D
    rails_env = ENV['RAILS_ENV'] || 'production'
    # 16 workers and 1 master
    worker_processes (rails_env == 'production' ? 16 : 4)
    # Load rails+github.git into the master before forking workers
    # for super-fast worker spawn times
    preload_app true

    # If your workers are inactive for a long period of time, they'll lose
    # their MySQL connection.
    #
    # This hack ensures we re-connect whenever a connection is
    # lost. Because, really, why not?
    #
    # Stick this in RAILS_ROOT/config/initializers/connection_fix.rb (or somewhere similar)
    #
    # From:
    # http://coderrr.wordpress.com/2009/01/08/activerecord-threading-issues-and-resolutions/
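
The body of that hack is not shown here, so as a rough illustration of the general idea (rescue the "connection lost" error, reconnect, run the query again), here is a minimal sketch in plain Ruby. `ConnectionLost`, `FakeConnection`, and `with_reconnect` are all made-up names so the example runs without a database. In a real Rails app the connection object would be an ActiveRecord adapter and the rescued error would be whatever the MySQL driver raises.

```ruby
# Hypothetical stand-in for the driver's "server has gone away" error.
class ConnectionLost < StandardError; end

# Fake connection, used only so the sketch runs without a database.
class FakeConnection
  attr_reader :reconnects

  def initialize
    @alive = false # simulate a connection that went stale while idle
    @reconnects = 0
  end

  def reconnect!
    @alive = true
    @reconnects += 1
  end

  def execute(sql)
    raise ConnectionLost, "server has gone away" unless @alive
    "ok: #{sql}"
  end
end

# Hypothetical wrapper: run the block, and if the connection turns out to be
# dead, reconnect and try again, at most `retries` times.
def with_reconnect(conn, retries: 1)
  attempts = 0
  begin
    yield conn
  rescue ConnectionLost
    attempts += 1
    raise if attempts > retries
    conn.reconnect!
    retry
  end
end

conn = FakeConnection.new
puts with_reconnect(conn) { |c| c.execute("SELECT 1") }
```

Capping the retries matters: if the database is really down, the error should surface instead of looping forever.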

    # Author: Pieter Noordhuis
    # Description: Simple demo to showcase Redis PubSub with EventMachine
    #
    # Requirements:
    # - rubygems: eventmachine, thin, cramp, sinatra, yajl-ruby
    # - a browser with WebSocket support
    #
    # Usage:
    # ruby redis_pubsub_demo.rb
    #