Marcus Breese mbreese

Many modern websites are dynamically generated or loaded. This means that if you want to download or scrape them, or use them with an LLM/MCP server, you need to load them in a full web browser. You can't just curl the page and use that HTML, because most of the content is filled in by JavaScript after the initial response and you'll miss it.

So, to do this, you can use this Dockerfile and script to download any website and convert the page to markdown.
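The gist's own script is not shown in this preview, so the following is only a minimal sketch of the conversion step, assuming the rendered HTML has already been captured from a browser. The names `MarkdownSketch` and `html_to_markdown` are hypothetical, and only headings, paragraphs and links are handled, using the standard library's `html.parser`.

```python
import re
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical, very small HTML-to-Markdown converter (not the gist's code)."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    SKIP = {"script", "style"}  # content that should never reach the output

    def __init__(self):
        super().__init__()
        self.out = []
        self.href = None
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.HEADINGS:
            self.out.append("\n\n" + self.HEADINGS[tag])
        elif tag == "p":
            self.out.append("\n\n")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "a":
            self.out.append("](%s)" % (self.href or ""))
            self.href = None

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Collapse runs of whitespace so source line breaks do not leak through.
            self.out.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html):
    """Convert already-rendered HTML to a rough Markdown string."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.out).splitlines()]
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()


if __name__ == "__main__":
    page = '<h1>Title</h1>\n<p>Hello <a href="https://example.com">world</a>!</p>'
    print(html_to_markdown(page))
```

A real converter would also need lists, emphasis, code blocks and tables; the point here is only that the conversion works on the browser-rendered DOM, not on the raw response.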

db.sh

This script manages a user-run Postgres database. The main idea is to run Postgres on a system where you don't have root access (and don't want it). If you cannot install Postgres through the system package manager (or Docker), this is one way to still use the database, even though you can't start it as a service.
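The body of `db.sh` is not shown here, so what follows is only a sketch of how a user-level Postgres instance is typically driven with the standard PostgreSQL command-line tools (`initdb`, `pg_ctl`, `createuser`, `createdb`). The function name `postgres_commands` and the `server.log` file name are assumptions made for illustration. The function only builds the command lines; nothing is executed.

```python
import os


def postgres_commands(data_dir, dbname, username):
    """Build argv lists for running Postgres from a user-owned data directory.

    Hypothetical helper: it returns the commands instead of running them, so
    they can be inspected or passed to subprocess.run(cmd, check=True).
    """
    log_file = os.path.join(data_dir, "server.log")  # assumed log location
    return {
        # Create a new database cluster owned by the current user.
        "init": ["initdb", "-D", data_dir],
        # Start and stop the server without any system service manager.
        "start": ["pg_ctl", "-D", data_dir, "-l", log_file, "start"],
        "stop": ["pg_ctl", "-D", data_dir, "stop"],
        # Create a login role (prompting for its password) and a database it owns.
        "create_user": ["createuser", "--pwprompt", username],
        "create_db": ["createdb", "-O", username, dbname],
    }


if __name__ == "__main__":
    for name, cmd in postgres_commands("/tmp/pgdata", "dbname", "username").items():
        print(name + ":", " ".join(cmd))
```

Because every command takes the data directory explicitly, no root-owned paths or service files are involved.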

Create a database and initialize it with a user login

	./db.sh initdb dbname username userpass

Start/stop the database

	#!/usr/bin/env python3
	#
	# Converts SIFT format files to VCF format for easier annotation lookup
	#
	# Sift input gz files are tab-delimited gzip compressed files. Coordinates are 1-based, so
	# this is a pretty easy conversion.
	#

	import os
	import sys

	#!/home/mbreese/.local/src/bcrypt-venv/bin/python3
	import sys
	import bcrypt

	def usage():
	    sys.stderr.write('''\
	Encrypting text

	Usage: bcrypt {-r} [text]
	You can also pipe raw plaintext through stdin (will be stripped unless -r is set!)

	#!/usr/bin/env python3
	import os
	import sys

	if len(sys.argv) == 1:
	    print(os.path.abspath('.'))
	else:
	    for arg in sys.argv[1:]:
	        print(os.path.abspath(arg))

	package io.compgen.ngsutils.support;

	import sun.misc.Signal;
	import sun.misc.SignalHandler;

	abstract public class CustomSignalHandler implements SignalHandler {

	private SignalHandler old = null;

	private void setOld(SignalHandler old) {

	#!/bin/bash
	COLS="$(tput cols)"

	# Truncate each input line to the terminal width
	while IFS= read -r LINE; do
	    echo "$LINE" | head -c "$COLS"
	done

	#!/bin/bash
	#
	# This will by default take snapshots of all zfs filesystems,
	# but could be adapted to only pull specific ones. This script
	# will also rotate snapshots, removing those that are older
	# than 2 weeks.
	#

	CURDATE=$(date +%Y%m%d)
	TWOWEEKSAGO=$(date -d'2 weeks ago' +%Y%m%d)
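The rotation rule described in the comment above (drop snapshots older than two weeks) can be sketched as follows. This is not the script's own logic, which is cut off in this preview. It assumes snapshots are named `filesystem@YYYYMMDD`, matching the `CURDATE` format; the function name `expired_snapshots` is hypothetical.

```python
from datetime import date, datetime, timedelta


def expired_snapshots(names, today, keep_days=14):
    """Return the snapshot names whose date suffix is older than the cutoff.

    Assumes names look like 'pool/fs@YYYYMMDD'; anything else is left alone.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        _, sep, suffix = name.rpartition("@")
        if not sep:
            continue  # not a snapshot name
        try:
            taken = datetime.strptime(suffix, "%Y%m%d").date()
        except ValueError:
            continue  # manually named snapshot: never rotate it
        if taken < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    snaps = ["tank/home@20240101", "tank/home@20240110", "tank/home@manual"]
    print(expired_snapshots(snaps, today=date(2024, 1, 20)))
```

Skipping names that do not parse as a date keeps hand-made snapshots out of the rotation, which seems the safer default for a destructive operation.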

	#!/usr/bin/env python

	import sys
	import datetime

	rate = 100000

	if len(sys.argv) > 1:
	    rate = int(sys.argv[1])