@arubis
Last active January 27, 2026 16:49
Contractor Onboarding Checklist - Nebula Aurora (Updated Jan 2026)

Contractor Onboarding Checklist

Nebula Aurora

Welcome to Nebula Aurora! This checklist will get you fully set up to start contributing.

flowchart LR
    subgraph "Week 1"
        A[Admin Setup] --> B[Communication]
        B --> C[Platform Access]
        C --> D[Environment Setup]
    end
    D --> E[Getting Started]
    E --> F[Task Workflow]

1. Administrative Setup

Complete these as emails arrive (usually within 1-2 days of starting):

  • Sign NDA (email notification)
  • Sign statement of work contract via Remote.com or platform.a.team* (if applicable)
  • Install Insightful for time tracking*

*May not apply to all team members—confirm with your manager before proceeding.


2. Communication

  • Join Discord (invite will be sent to you)
  • Attend onboarding call: Daily at 10:00–10:30 PM IST (Google Meet link)
  • Note: Office hours are mandatory and run on Tuesdays and Fridays at 10:30 PM IST
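
Because both meeting times are given in IST, contractors in other time zones may want to convert them once and add them to a calendar. The following is a minimal sketch, assuming IST is the fixed UTC+05:30 offset with no daylight saving; the helper name `ist_to_utc` is illustrative and not part of any Nebula tooling.

```python
from datetime import datetime, time, timedelta, timezone

# IST is a fixed offset of UTC+05:30 with no daylight saving time.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def ist_to_utc(day: datetime, at: time) -> datetime:
    """Return the UTC datetime for a wall-clock time in IST on the given date."""
    local = datetime.combine(day.date(), at, tzinfo=IST)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    today = datetime.now(IST)
    onboarding = ist_to_utc(today, time(22, 0))     # daily onboarding call
    office_hours = ist_to_utc(today, time(22, 30))  # Tue/Fri office hours
    print("Onboarding call (UTC):", onboarding.strftime("%H:%M"))
    print("Office hours (UTC):", office_hours.strftime("%H:%M"))
    # astimezone() with no argument converts to the system's local zone.
    print("Onboarding call (local):", onboarding.astimezone().strftime("%H:%M"))
```

Note that a 10:00 PM IST call can fall on a different calendar day in zones far from India, so check the date as well as the time.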

3. Platform Access

These require an admin to grant access first. Reach out on Discord to Kartik, Chai, Vaibhav, or Nikos if blocked.

Apex Platform

  • Review the Apex Setup Guide for reference
  • Log into Apex UI with your Google account
  • Retrieve your API key from the Apex UI dashboard

apex-arena CLI

Install the CLI tool using the install script in the Nebula repo:

cd Nebula
bash apex-arena-install.sh

You'll need an API key from Apex UI to complete the installation.

Authenticate with the same Google account you used for Apex UI:

gcloud auth login

Configure Docker to pull from Google Artifact Registry:

gcloud auth configure-docker us-central1-docker.pkg.dev

Key Commands

| Command | Purpose |
| --- | --- |
| apex-arena init <task_name> | Create a new task from template |
| apex-arena check-anatomy <folder> | Validate task folder structure |
| apex-arena check-quality <task> | AI-powered quality review (also runs during push/update) |
| apex-arena validate-grader <grader.py> | Check grader for issues (also runs during push/update) |
| apex-arena test-solution <task_id> | Run solution.sh and verify score |
| apex-arena tasks list | List your tasks |
| apex-arena tasks push <dir> | Push a new task to Apex (returns a UUID) |
| apex-arena tasks update <uuid> <dir> | Update an existing task by UUID |
| apex-arena tasks download <id> | Download task(s) by ID |
| apex-arena eval --tasks <ids> | Run evaluations locally (multiple tasks) |
| apex-arena evaluations run <task_id> | Run evaluations locally (single/remote tasks) |
| apex-arena grade | Grade a specific problem |
| apex-arena version | Show current apex-arena version |
| apex-arena update | Update apex-arena to latest version |

Run apex-arena --help or apex-arena <command> --help for the full list of commands and options.

GitHub


4. Environment Setup

Important: Local development is only supported on Linux or in a Linux VM. Running the Nebula container directly on macOS is not supported due to container/k3s compatibility issues.

Pull the Nebula Docker Image

The Nebula image uses immutable versioned tags — there is no :latest tag. Always use the fully qualified image name with a version tag. Check the CHANGELOG and releases for the current version.

docker pull us-central1-docker.pkg.dev/bespokelabs/nebula-devops-registry/nebula-devops:1.0.1

Tag it locally for convenience:

docker tag us-central1-docker.pkg.dev/bespokelabs/nebula-devops-registry/nebula-devops:1.0.1 nebula-devops:latest
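
Since the registry has no :latest tag, it can help to generate the pull and tag commands from a single version string so the two never drift apart. This is a minimal sketch; the function name `image_commands` is hypothetical, and the assumption that version tags always look like `1.0.1` (three dot-separated numbers) comes only from the example in this document.

```python
import re

REGISTRY_IMAGE = (
    "us-central1-docker.pkg.dev/bespokelabs/nebula-devops-registry/nebula-devops"
)

# Assumption: version tags are three dot-separated numbers, as in 1.0.1.
VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")


def image_commands(
    version: str, local_alias: str = "nebula-devops:latest"
) -> list[list[str]]:
    """Build the docker pull and docker tag commands for one Nebula version."""
    if not VERSION_RE.match(version):
        raise ValueError(f"expected a version like 1.0.1, got {version!r}")
    image = f"{REGISTRY_IMAGE}:{version}"
    return [
        ["docker", "pull", image],
        ["docker", "tag", image, local_alias],
    ]


if __name__ == "__main__":
    for cmd in image_commands("1.0.1"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to execute it; printing first makes it easy to confirm the version before pulling.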

Test that it runs:

docker run -d \
  --name nebula-test-container \
  --privileged \
  --cgroupns=private \
  nebula-devops:latest

Verify the environment:

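# Open a shell inside the container as the ubuntu user
docker exec -it -u ubuntu nebula-test-container bash
# Then, from that shell, watch the pods come up
watch -n 1 kubectl get pods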
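
Watching the pod list is interactive; for a scripted check, the same information is available from `kubectl get pods -o json`. The sketch below parses that JSON and lists pods that are not yet in the Running or Succeeded phase. The function name `pods_not_ready` and the sample pod names are made up for illustration, and the check looks only at the pod phase, so a Running pod may still have containers that are not ready.

```python
import json


def pods_not_ready(pod_list_json: str) -> list[str]:
    """Return names of pods whose phase is neither Running nor Succeeded."""
    pods = json.loads(pod_list_json).get("items", [])
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod.get("status", {}).get("phase") not in ("Running", "Succeeded")
    ]


# Hypothetical sample shaped like the output of `kubectl get pods -o json`.
SAMPLE = json.dumps(
    {
        "items": [
            {"metadata": {"name": "web-0"}, "status": {"phase": "Running"}},
            {"metadata": {"name": "db-0"}, "status": {"phase": "Pending"}},
        ]
    }
)

if __name__ == "__main__":
    print(pods_not_ready(SAMPLE))  # prints ['db-0']
```

Inside the container, the input could come from `subprocess.run(["kubectl", "get", "pods", "-o", "json"], capture_output=True, text=True, check=True).stdout`.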

Workspace VM

If there is any chance you will not be working exclusively on local Linux or in hosted environments, request a VM in the #vm-request Discord channel.

Once your VM is being provisioned:

  • Add your SSH public key to the pubkey spreadsheet
  • Wait for VM connection details from admin
  • SSH into VM and verify access

5. Getting Started

Required Reading

Note: The Nebula Aurora Real Scenarios spreadsheet is deprecated. Task tracking has moved to the GitHub Project Board.

Get Your Assignment

  • Receive your task category: SRE / DevOps / Platform Engineering / CloudOps
  • Check the Task Tracking Board for available tasks and assignments

Submit Your First Task

Create a simple task to verify your setup:

flowchart LR
    A[init] --> B[edit]
    B --> C[check-anatomy]
    C --> D[check-quality]
    D --> E[test-solution]
    E --> F[push]
    D -.->|issues| B
    E -.->|fails| B
  1. Initialize a new task:

    apex-arena init my-first-task
  2. Edit the generated files in tasks/my-first-task/

  3. Validate your task:

    apex-arena check-anatomy tasks/my-first-task
    apex-arena check-quality tasks/my-first-task
  4. Test your solution:

    apex-arena test-solution my-first-task
  5. Push to Apex using the spec ID for your category. The first push returns a UUID for your task:

    apex-arena tasks push tasks/my-first-task --spec <spec-id>

    You can view your task in the Apex UI at https://apex-ui-v2-319533213591.us-central1.run.app/tasks/<uuid>.

    For all subsequent updates, use update with that UUID:

    apex-arena tasks update <task-uuid> tasks/my-first-task
    | Category | Spec ID |
    | --- | --- |
    | DevOps | b407a435-9dc1-4cc3-950c-3194a8f08fde |
    | SRE | 46394e31-2a74-47c1-8359-51e1b678146d |
    | Platform Engineering | 9e4d158e-96ff-4435-ab39-4d1e389f4b47 |
    | CloudOps | 450f2e9c-ba04-429c-bf80-e22be0065313 |
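
The push and update steps above differ only in their arguments, so a small helper can pick the spec ID for a category and assemble the command. This is a sketch under stated assumptions: the spec IDs and the `--spec` flag are copied from this checklist, while the function names `push_command` and `update_command` are hypothetical.

```python
# Spec IDs as listed in this checklist, keyed by task category.
SPEC_IDS = {
    "DevOps": "b407a435-9dc1-4cc3-950c-3194a8f08fde",
    "SRE": "46394e31-2a74-47c1-8359-51e1b678146d",
    "Platform Engineering": "9e4d158e-96ff-4435-ab39-4d1e389f4b47",
    "CloudOps": "450f2e9c-ba04-429c-bf80-e22be0065313",
}


def push_command(task_dir: str, category: str) -> list[str]:
    """Build the first-push command for a task in the given category."""
    if category not in SPEC_IDS:
        raise ValueError(
            f"unknown category {category!r}; expected one of {sorted(SPEC_IDS)}"
        )
    return ["apex-arena", "tasks", "push", task_dir, "--spec", SPEC_IDS[category]]


def update_command(task_uuid: str, task_dir: str) -> list[str]:
    """Build the command for every later update, using the UUID from the push."""
    return ["apex-arena", "tasks", "update", task_uuid, task_dir]


if __name__ == "__main__":
    print(" ".join(push_command("tasks/my-first-task", "SRE")))
    print(" ".join(update_command("<task-uuid>", "tasks/my-first-task")))
```

Keeping the UUID returned by the first push (for example in a note next to the task folder) avoids accidentally pushing the same task twice.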

6. Task Workflow

This section covers the ongoing process for creating, reviewing, and evaluating tasks. You are expected to complete at least 1 approved task per week, including all reviews and iterations.

Review Process

All tasks go through bot review, then two layers of human review before acceptance:

flowchart TD
    A[Approved Task Ideas] --> B[In Progress]
    B --> B1[Bot Review]
    B1 --> C[Ready for Primary Review]
    B1 -.->|address feedback| B
    C --> D[In Primary Review]
    D --> D1[Implementing Primary Feedback]
    D1 --> D
    D --> E[Ready for Secondary Review]
    E --> F[In Secondary Review]
    F --> F1[Implementing Secondary Feedback]
    F1 --> F
    F --> G[Approved]
    G --> H[Done]
  1. Bot Review — Before requesting human review, run @nebula-reviewer <apex-task-uuid> in your task-feedback thread. Address all valid points raised by the bot. You can also use @nebula-reviewer improve <task_id> for suggestions on increasing difficulty. The bot supports version and model selection:
    @nebula-reviewer abc123...                    # latest version, biggie-nebula (default)
    @nebula-reviewer abc123... 2                  # version 2, biggie-nebula
    @nebula-reviewer abc123... smalli-nebula      # latest version, smalli-nebula
    @nebula-reviewer abc123... 2 smalli-nebula    # version 2, smalli-nebula
    
    Limit bot usage to once per version of your task — don't spam it.
  2. Primary Review — A team member reviews your task for correctness and clarity
  3. Secondary Review — A category lead performs final approval
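
The bot invocation forms listed in step 1 follow one pattern: the UUID, then an optional version, then an optional model. A minimal sketch of that pattern follows; the helper name `reviewer_command` is hypothetical, and the two model names are the only ones this checklist mentions.

```python
from typing import Optional

# Model names mentioned in this checklist; biggie-nebula is the bot's default.
MODELS = ("biggie-nebula", "smalli-nebula")


def reviewer_command(
    task_uuid: str, version: Optional[int] = None, model: Optional[str] = None
) -> str:
    """Format a @nebula-reviewer message: UUID, optional version, optional model."""
    parts = ["@nebula-reviewer", task_uuid]
    if version is not None:
        parts.append(str(version))
    if model is not None:
        if model not in MODELS:
            raise ValueError(f"unknown model {model!r}; expected one of {MODELS}")
        parts.append(model)
    return " ".join(parts)


if __name__ == "__main__":
    print(reviewer_command("<apex-task-uuid>"))
    print(reviewer_command("<apex-task-uuid>", version=2, model="smalli-nebula"))
```

Omitting both optional arguments leaves the choice of version and model to the bot's defaults (latest version, biggie-nebula).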

Note: If a reviewer requests changes, move the task to the corresponding "Implementing Feedback" column while you address it, then back to the review column when ready. Don't move it all the way back to In Progress. See this Discord thread for context.

Use the GitHub project board to track your task through these stages.

Evaluation Requirements

  • Use the biggie-nebula model for all evaluations
  • Run at least 8 rollouts per evaluation
  • Target a score of < 0.7 across rollouts
  • Evaluations can be run locally via apex-arena eval or apex-arena evaluations run, or hosted through the Apex UI web interface
  • Review rollout transcripts in Apex UI — ensure failures are due to task difficulty, not grader bugs or environment instability
  • If you see inconsistent pass/fail across rollouts with similar agent behavior, add a short sleep (e.g., 60s) before the first grader check to let the environment stabilize
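
The two numeric requirements above can be checked mechanically once you have the per-rollout scores. The sketch below assumes that "score across rollouts" means the mean of the rollout scores, which this checklist does not state explicitly; the function name `evaluation_problems` and the way scores are collected are also hypothetical.

```python
from statistics import mean

MIN_ROLLOUTS = 8     # "Run at least 8 rollouts per evaluation"
TARGET_SCORE = 0.7   # "Target a score of < 0.7 across rollouts"


def evaluation_problems(scores: list[float]) -> list[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    problems = []
    if len(scores) < MIN_ROLLOUTS:
        problems.append(
            f"only {len(scores)} rollouts, need at least {MIN_ROLLOUTS}"
        )
    # Assumption: the target applies to the mean score over all rollouts.
    if scores and mean(scores) >= TARGET_SCORE:
        problems.append(
            f"mean score {mean(scores):.2f} is not below {TARGET_SCORE}"
        )
    return problems


if __name__ == "__main__":
    rollouts = [0.33, 0.67, 0.33, 0.0, 1.0, 0.33, 0.67, 0.33]
    print(evaluation_problems(rollouts) or "evaluation meets both requirements")
```

A passing check only covers the numbers; the transcripts still need a manual read to confirm that low scores come from task difficulty rather than grader or environment problems.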

Important: Your task Dockerfiles must reference the fully qualified versioned image for hosted evals to work:

FROM us-central1-docker.pkg.dev/bespokelabs/nebula-devops-registry/nebula-devops:1.0.1

Do not use FROM nebula-devops:latest — it will not resolve in hosted environments.
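
A quick pre-push check can catch a local alias left in a Dockerfile. The following sketch scans the FROM lines and reports any that reference the Nebula image without the full registry path and a version tag. The function name `unresolvable_from_lines` is hypothetical, and base images other than nebula-devops are deliberately ignored.

```python
REGISTRY_IMAGE = (
    "us-central1-docker.pkg.dev/bespokelabs/nebula-devops-registry/nebula-devops"
)


def unresolvable_from_lines(dockerfile_text: str) -> list[str]:
    """Return FROM lines that use nebula-devops without the full versioned name."""
    bad = []
    for line in dockerfile_text.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or tokens[0].upper() != "FROM":
            continue
        # Skip optional flags such as --platform=... that precede the image.
        image = next((t for t in tokens[1:] if not t.startswith("--")), "")
        name, _, tag = image.partition(":")
        if name.rsplit("/", 1)[-1] != "nebula-devops":
            continue  # a different base image; out of scope for this check
        if name != REGISTRY_IMAGE or tag in ("", "latest"):
            bad.append(line.strip())
    return bad


if __name__ == "__main__":
    sample = "FROM nebula-devops:latest\nRUN echo hello\n"
    for line in unresolvable_from_lines(sample):
        print("will not resolve in hosted evals:", line)
```

Reading the real file is a one-liner, for example `unresolvable_from_lines(open("Dockerfile").read())`, where the path to your task's Dockerfile is up to you.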

Grading Guidelines

Partial grading is preferred. Break your task into functional subscores, each representing a real milestone toward the goal.

Rules for subscores:

  • Incremental — represents real progress toward the goal
  • Objective — deterministic and measurable
  • Not gameable — can't earn the reward without actual work
  • Equal weights — all subscores should have equal weight

Quick self-test for each subscore:

  1. "If the agent ONLY gets this subscore, did they make real progress?" → should be yes
  2. "Can the agent get this reward without working toward the actual goal?" → should be no

Bad subscores: file_exists, no_syntax_errors, config_valid, pods_ready

Good subscores: database_running, app_responds, backup_restores_data, canary_routes_traffic

Example for a task "deploy flask app with postgres database":

# Bad - agent can earn every subscore without anything actually working
subscores = {"requirements_exists": 1, "dockerfile_exists": 1, "no_syntax_errors": 1, "config_file_valid": 1}

# Good - each subscore means something actually works
subscores = {"database_running": 1, "app_responds": 1, "db_connection_works": 1}
# Equal weights that sum to exactly 1 (0.33 each would only sum to 0.99)
weights = {name: 1 / len(subscores) for name in subscores}
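
To make the equal-weight rule concrete, here is a minimal sketch of how a final score could be derived from pass/fail subscores. The grader's actual interface is not shown in this checklist, so the function name `final_score` and the dictionary of booleans are assumptions for illustration only.

```python
def final_score(results: dict[str, bool]) -> float:
    """Equal-weight score: the fraction of subscores that passed."""
    if not results:
        raise ValueError("at least one subscore is required")
    # True counts as 1 and False as 0, so the sum is the number passed.
    return sum(results.values()) / len(results)


if __name__ == "__main__":
    # The database and app are up, but the app cannot reach the database yet.
    partial = {
        "database_running": True,
        "app_responds": True,
        "db_connection_works": False,
    }
    print(round(final_score(partial), 2))  # prints 0.67
```

With functional subscores like these, a partial score reflects real progress: two of three milestones actually work, which is the property the self-test questions above are checking for.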

Quick Reference

| Resource | Link |
| --- | --- |
| Apex UI | https://apex-ui-v2-319533213591.us-central1.run.app/ |
| Apex Setup Guide | Google Slides |
| GitHub Repo | https://github.com/NebulaAuroras/Nebula |
| Task Tracking Board | GitHub Project |
| Nebula CHANGELOG | CHANGELOG.md |
| Instructions Doc | Google Doc |
| Daily Onboarding Call | Google Meet, 10:00 PM IST |
| Office Hours (Tue/Fri) | Google Meet, 10:30 PM IST |

Need Help?

Post in Discord or attend office hours. Kartik, Vaibhav, and Nikos can help with access and operational issues. Shahryar can assist with technical and task-feedback support. For infrastructure issues, ask Greg or Dylan in #nebula-infra.


This is an unofficial onboarding document maintained by Dylan. Bespoke is working on improving their official docs, which may supersede this one over time.
