@inutano
Last active March 15, 2026 08:17
BioHackrXiv: GitHub Actions-based Paper Submission Plan

Current State

| Component | Repo | Status |
| --- | --- | --- |
| PDF engine | bhxiv-gen-pdf | Docker image at ghcr.io/biohackrxiv/bhxiv-gen-pdf:master |
| Preview web app | preview.biohackrxiv.org | Sinatra app that clones repos / accepts ZIPs, calls gen-pdf |
| Paper template | publication-template | GitHub template repo with a basic CI workflow |
| Index site | bhxiv-index | Jekyll site at index.biohackrxiv.org, authoritative event list |
| Metadata/RDF | bhxiv-metadata | SPARQL pipeline, partially broken, stale event list |

Event Management

  • Authoritative event list: bhxiv-index/_data/meetings.yml (~37 events, actively maintained)
  • Stale event list: bhxiv-metadata/etc/events.yaml (~11 events, not synchronized)
  • Event registration: Issue template exists in bhxiv-metadata but should move to bhxiv-index
  • SPARQL pipeline: Partially non-functional (queries commented out in preview site)

Issues with the Current System

  1. PDF only available as artifact — users must navigate to Actions > Run > Artifacts to download
  2. No pull request preview — only triggers on push to main, not on PRs
  3. No GitHub Pages deployment — PDF isn't served at a stable URL
  4. Fragile metadata validation — uses raw sed+grep instead of proper YAML parsing; unhelpful error messages
  5. Docker image uses pinned old versions — Ruby 3.0.2, Pandoc 2.16.1, Debian Bullseye-era TexLive
  6. No PR comment with PDF link — no feedback loop for authors
  7. Two divergent event lists: the lists in bhxiv-metadata and bhxiv-index are not synchronized
  8. No documentation for organizers or submitters — contributors can't find how to register events

In Progress

Documentation PRs

Pending Review

Phase 1: Enhance the publication-template workflow

Goal: Make the GitHub Actions workflow the primary way to generate PDFs.

Changes to publication-template/.github/workflows/gen_pdf.yaml:

  1. Trigger on PRs too — add pull_request trigger so authors get feedback before merging
  2. Deploy PDF to GitHub Pages — after successful build on main, publish paper.pdf to GitHub Pages so each repo has a stable preview URL (https://<user>.github.io/<repo>/paper.pdf)
  3. Post PR comment with PDF link — on PR builds, upload artifact and comment with a link to download it
  4. Improve metadata validation — replace the fragile sed | grep checks with a proper validation step (use yq to parse YAML front matter and give clear error messages for each missing field)
  5. Add status badge — include a build-status badge in the template README.md
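
The first four changes could be combined into a single workflow along the following lines. This is a minimal sketch, not the template's actual workflow: the paths `paper/paper.md` and `paper/paper.pdf`, the required field names (`title`, `authors`, `event`), and the artifact name `paper-pdf` are assumptions for illustration, and the validation step assumes the front matter is delimited by `---` lines and that the `yq` on the runner is the Go (mikefarah) implementation.

```yaml
# Sketch of .github/workflows/gen_pdf.yaml. Paths and field names are assumptions.
name: Generate PDF

on:
  push:
    branches: [main]
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write   # needed to post the PR comment
    steps:
      - uses: actions/checkout@v4

      # Assumed field list; replace with the fields the template really requires.
      - name: Validate metadata
        run: |
          missing=0
          for field in title authors event; do
            value=$(yq --front-matter=extract ".${field}" paper/paper.md)
            if [ -z "$value" ] || [ "$value" = "null" ]; then
              echo "::error file=paper/paper.md::Missing required metadata field: ${field}"
              missing=1
            fi
          done
          exit "$missing"

      - name: Build PDF
        run: |
          docker run --rm -v "$(pwd):/work" \
            ghcr.io/biohackrxiv/bhxiv-gen-pdf:master gen-pdf /work/paper

      - name: Upload PDF as workflow artifact
        uses: actions/upload-artifact@v4
        with:
          name: paper-pdf
          path: paper/paper.pdf   # assumed output location

      - name: Comment on PR with a link to the run
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const runUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`;
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `PDF built. Download the paper-pdf artifact from ${runUrl}`,
            });

      - name: Stage PDF for GitHub Pages
        if: github.event_name == 'push'
        run: mkdir -p _site && cp paper/paper.pdf _site/paper.pdf

      - name: Upload Pages artifact
        if: github.event_name == 'push'
        uses: actions/upload-pages-artifact@v3
        with:
          path: _site

  deploy:
    if: github.event_name == 'push'
    needs: build
    runs-on: ubuntu-latest
    permissions:
      pages: write
      id-token: write
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4
```

Two caveats apply to this sketch. The Pages deployment only works if the repository's Pages source is set to "GitHub Actions", and the PR comment step will fail on pull requests from forks, where the default token is read-only.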

Phase 2: Modernize the PDF engine (bhxiv-gen-pdf)

  1. Update base image — Ruby 3.0.2 is EOL; bump to 3.2+ or 3.3
  2. Update Pandoc — 2.16.1 → 3.x (check compatibility with LaTeX template and Lua filters)
  3. Pin TexLive by year — use a more recent TexLive version from a Debian Bookworm base
  4. Add multi-arch build — support linux/arm64 for faster CI on newer GitHub runners
  5. Improve local usage documentation — make it easy for users to run gen-pdf locally via Docker (see Phase 5)
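
For the multi-arch item, one possible shape is a Buildx-based workflow in bhxiv-gen-pdf. The sketch below assumes the Dockerfile sits at the repository root and that images are built from a `master` branch (inferred only from the `:master` tag); neither is confirmed by this plan.

```yaml
# Sketch of an image build workflow. Branch name and Dockerfile location are assumptions.
name: Build and push image

on:
  push:
    branches: [master]

jobs:
  image:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # needed to push to ghcr.io
    steps:
      - uses: actions/checkout@v4

      # QEMU lets the amd64 runner emulate arm64 during the build.
      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3

      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - uses: docker/build-push-action@v6
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ghcr.io/biohackrxiv/bhxiv-gen-pdf:master
```

Emulated arm64 builds of a TeX-heavy image can be slow, so it may be worth measuring build time before enabling this on every push.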

Phase 3: Create a reusable GitHub Action (recommended)

  1. Package as a composite or Docker GitHub Action in bhxiv-gen-pdf — so instead of using the third-party docker-run-action, paper repos can simply do:
- uses: biohackrxiv/bhxiv-gen-pdf@v1
  with:
    paper_dir: paper

This is cleaner, versionable, and gives you control over the interface.
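
A composite action is probably the smallest way to get there. The sketch below shows what an `action.yml` at the root of bhxiv-gen-pdf might look like; the action name, description, and default value are placeholders, and it simply wraps the existing Docker image rather than reimplementing anything.

```yaml
# Sketch of action.yml for bhxiv-gen-pdf. Names and defaults are placeholders.
name: BioHackrXiv gen-pdf
description: Build a BioHackrXiv paper PDF from a paper directory

inputs:
  paper_dir:
    description: Directory containing the paper source, relative to the repo root
    required: false
    default: paper

runs:
  using: composite
  steps:
    - name: Run gen-pdf in the published image
      shell: bash
      env:
        # Passing the input through env avoids interpolating it into the script.
        PAPER_DIR: ${{ inputs.paper_dir }}
      run: |
        docker run --rm -v "${GITHUB_WORKSPACE}:/work" \
          ghcr.io/biohackrxiv/bhxiv-gen-pdf:master gen-pdf "/work/${PAPER_DIR}"
```

With a `v1` tag pushed to the repository, the `uses: biohackrxiv/bhxiv-gen-pdf@v1` snippet above would resolve to this file.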

Phase 4: Consolidate event management

  1. Move issue template from bhxiv-metadata to bhxiv-index
  2. Automate event approval — issue triggers a GitHub Actions workflow that generates a PR with meetings.yml entry + tag page scaffold
  3. Validate paper event fields against the approved event list during PDF generation
  4. Deprecate bhxiv-metadata/etc/events.yaml in favor of bhxiv-index/_data/meetings.yml
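
The approval automation in item 2 might look roughly like the workflow below. Everything project-specific here is hypothetical: the `event-approved` label used as the maintainer's approval signal, the `scripts/add_event.rb` helper that would turn the issue body into a meetings.yml entry and tag page, and the branch naming scheme.

```yaml
# Sketch of an event approval workflow for bhxiv-index.
# The label name and the add_event.rb script are hypothetical.
name: Event approval

on:
  issues:
    types: [labeled]

jobs:
  open-pr:
    if: github.event.label.name == 'event-approved'
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4

      - name: Generate meetings.yml entry and tag page scaffold
        env:
          ISSUE_BODY: ${{ github.event.issue.body }}
        run: ruby scripts/add_event.rb   # hypothetical script, reads ISSUE_BODY

      - name: Open pull request
        env:
          GH_TOKEN: ${{ github.token }}
          ISSUE_NUMBER: ${{ github.event.issue.number }}
        run: |
          branch="event/issue-${ISSUE_NUMBER}"
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git checkout -b "$branch"
          git add -A
          git commit -m "Add event from issue #${ISSUE_NUMBER}"
          git push origin "$branch"
          gh pr create --head "$branch" \
            --title "Add event from issue #${ISSUE_NUMBER}" \
            --body "Closes #${ISSUE_NUMBER}"
```

Gating on a label keeps a maintainer in the loop before any file changes are proposed. The repository also needs to allow GitHub Actions to create pull requests in its settings for the last step to succeed.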

Phase 5: Convert preview.biohackrxiv.org to a static documentation site

Goal: Replace the Sinatra/Docker PDF generation app with a static page that serves as the central documentation and instruction hub for BioHackrXiv users.

The preview.biohackrxiv.org domain is well-known and already linked from many places, so it should remain active as a landing page.

Content for the static site

The page should cover:

  1. Overview — what BioHackrXiv is and how the submission process works
  2. For paper submitters (or link to index site guide)
    • How to create a repo from the publication-template
    • Required YAML metadata fields with examples
    • How to find your event's metadata on the index site
  3. For event organizers (or link to index site guide)
    • How to register an event via issue on bhxiv-index
  4. How to generate a PDF — three options:
    • Option A: GitHub Actions (recommended) — push to your repo, PDF is auto-generated and deployed to GitHub Pages
    • Option B: Run locally with Docker, e.g. docker run --rm -v $(pwd):/work ghcr.io/biohackrxiv/bhxiv-gen-pdf:master gen-pdf /work/paper
    • Option C: Run locally without Docker — install Ruby, Pandoc, TexLive, clone bhxiv-gen-pdf, run gen-pdf
  5. Links — publication index, meetings list, published papers on OSF, GitHub org, contact

Implementation

  • Replace the Sinatra app with a static HTML page (or a minimal Jekyll/Hugo site)
  • Host via GitHub Pages from the preview.biohackrxiv.org repo
  • Keep the preview.biohackrxiv.org CNAME pointing to GitHub Pages
  • Remove Docker/Sinatra/Nginx infrastructure

Before retiring the dynamic server

  • Phase 1 complete (GitHub Actions workflow is the primary PDF generation method)
  • Guide pages merged and live on index.biohackrxiv.org
  • All references to the old preview form updated across org repos
  • Static site deployed and tested on preview.biohackrxiv.org
  • Wait one hackathon cycle to confirm the new workflow works in practice

Recommended Implementation Order

| Step | What | Where | Effort | Status |
| --- | --- | --- | --- | --- |
| 1 | Add organizer/submitter guide pages | bhxiv-index, preview | Small | PR #8, PR #33 |
| 2 | Add PR trigger + PR comment with artifact link | publication-template | Small | TODO |
| 3 | Add GitHub Pages deployment of PDF | publication-template | Small | TODO |
| 4 | Improve metadata validation | publication-template | Small | TODO |
| 5 | Move issue template to bhxiv-index | bhxiv-index | Small | TODO |
| 6 | Package as reusable GitHub Action | bhxiv-gen-pdf | Medium | TODO |
| 7 | Modernize Docker image (Ruby, Pandoc, TexLive) | bhxiv-gen-pdf | Medium | TODO |
| 8 | Automate event approval workflow | bhxiv-index | Medium | TODO |
| 9 | Convert preview site to static docs | preview | Medium | TODO |

Architecture Diagram (Current → Target)

Current

Author writes paper.md
        |
        v
[preview.biohackrxiv.org]  ← Sinatra web app (manual step)
        |
        v
[bhxiv-gen-pdf CLI + Pandoc + TexLive]
        |
        v
    paper.pdf (ephemeral, /tmp based)

Target

Author writes paper.md in GitHub repo (from template)
        |
        v
[GitHub Actions on push/PR]
  uses: biohackrxiv/bhxiv-gen-pdf@v1
        |
        ├─ PR → artifact + comment with link
        └─ main → GitHub Pages (stable URL)
        |
        v
    paper.pdf (persistent, versioned)

[preview.biohackrxiv.org]  ← Static documentation site
  - Getting started guide
  - Links to template, index, meetings
  - Instructions for GitHub Actions / local Docker / local CLI