Quick and dirty database dump to S3 via Node.js

We are implementing database dumps, which are straightforward but can be tedious to set up. Here's our setup:

  1. Create AWS user for db backups (e.g. db-backups-{{app}})
    • Save credentials in a secure location
    • If adding db scrubbing, use a separate user (e.g. db-scrubs-{{app}})
  2. Create bucket for S3 access logging (e.g. s3-access-log-{{app}})
  3. Create consistently named bucket for db dumps (e.g. db-backups-{{app}})
    • Enable logging to s3-access-log-{{app}} with prefix of db-backups-{{app}}
  4. Add IAM policy for bucket access (see the example policy after this list)
  5. Upload a dump to S3 via our script
    • node backup-local-db.js
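
For step 4, something along these lines is usually enough — a minimal sketch, assuming the dump bucket is named db-backups-{{app}} as above and that the backup user only needs to write objects (add s3:GetObject/s3:ListBucket if the same user also lists or restores dumps):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::db-backups-{{app}}/*"
    }
  ]
}

The script from step 5 (backup-local-db.js):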
// Based on: https://gist.github.com/twolfson/f5d8adead6def0b55663
// Load in our dependencies
var assert = require('assert');
var fs = require('fs');
var AWS = require('aws-sdk');
var spawn = require('child_process').spawn;
// Define our constants upfront (replace these placeholder values with your own)
var dbName = 'dbName';
var S3_BUCKET = 'S3_BUCKET';
var s3AccessKeyId = 'S3_ACCESS_KEY_ID';
var s3SecretAccessKey = 'S3_SECRET_ACCESS_KEY';
// Determine our filename
// 20170312.011924.307000000.sql.gz
var timestamp = (new Date()).toISOString()
  .replace(/^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})\.(\d{3})Z$/, '$1$2$3.$4$5$6.$7000000');
var filepath = timestamp + '.sql.gz';
// Configure AWS credentials
// http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/loading-node-credentials-environment.html
// DEV: There's likely a better non-environment way to do this but it's not well documented
process.env.AWS_ACCESS_KEY_ID = s3AccessKeyId;
process.env.AWS_SECRET_ACCESS_KEY = s3SecretAccessKey;
// Define our S3 connection
// https://aws.amazon.com/sdk-for-node-js/
// http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html
var s3 = new AWS.S3();
// Dump our database to a file so we can collect its length
// DEV: We output `stderr` to `process.stderr`
// DEV: We write to disk so S3 client can calculate `Content-Length` of final result before uploading
console.log('Dumping `pg_dump` into `gzip`');
var pgDumpChild = spawn('pg_dump', [dbName], {stdio: ['ignore', 'pipe', 'inherit']});
pgDumpChild.on('exit', function (code) {
  if (code !== 0) {
    throw new Error('pg_dump: Bad exit code (' + code + ')');
  }
});
var gzipChild = spawn('gzip', {stdio: ['pipe', 'pipe', 'inherit']});
gzipChild.on('exit', function (code) {
  if (code !== 0) {
    throw new Error('gzip: Bad exit code (' + code + ')');
  }
});
var writeStream = fs.createWriteStream(filepath);
pgDumpChild.stdout.pipe(gzipChild.stdin);
gzipChild.stdout.pipe(writeStream);
// When our write stream is completed
writeStream.on('finish', function handleFinish () {
  // Upload our gzip stream into S3
  // http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
  console.log('Uploading "' + filepath + '" to S3');
  s3.putObject({
    Bucket: S3_BUCKET,
    Key: filepath,
    ACL: 'private',
    ContentType: 'text/plain',
    ContentEncoding: 'gzip',
    Body: fs.createReadStream(filepath)
  }, function handlePutObject (err, data) {
    // If there was an error, throw it
    if (err) {
      throw err;
    // Otherwise, log success
    } else {
      console.log('Successfully uploaded "' + filepath + '"');
    }
  });
});
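
A backup is only useful if it restores, so a quick spot-check is to stream a dump back out of S3 into psql. The sketch below is not part of the gist's script — it assumes the same placeholder constants as above, a local gunzip and psql on the PATH, the aws-sdk v2 getObject().createReadStream() streaming API, and credentials provided the same way the dump script does (or via environment variables). Run it against a scratch database, not the live one.

// restore-check.js (hypothetical helper, not part of the original script)
// Load in our dependencies
var AWS = require('aws-sdk');
var spawn = require('child_process').spawn;
// Reuse the same placeholder constants as backup-local-db.js
var dbName = 'dbName';
var S3_BUCKET = 'S3_BUCKET';
var dumpKey = process.argv[2]; // e.g. 20170312.011924.307000000.sql.gz
// Define our S3 connection (assumes AWS credentials are already configured, e.g. via env vars)
var s3 = new AWS.S3();
// Stream the dump from S3 through `gunzip` into `psql`
var gunzipChild = spawn('gunzip', {stdio: ['pipe', 'pipe', 'inherit']});
var psqlChild = spawn('psql', [dbName], {stdio: ['pipe', 'inherit', 'inherit']});
s3.getObject({Bucket: S3_BUCKET, Key: dumpKey}).createReadStream().pipe(gunzipChild.stdin);
gunzipChild.stdout.pipe(psqlChild.stdin);

Usage would be something like node restore-check.js 20170312.011924.307000000.sql.gz, with dbName pointed at an empty database.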