Contractor Onboarding Checklist - Nebula Aurora (Updated Jan 2026)
Welcome to Nebula Aurora! This checklist will get you fully set up to start contributing.
```mermaid
flowchart LR
    subgraph "Week 1"
        A[Admin Setup] --> B[Communication]
        B --> C[Platform Access]
        C --> D[Environment Setup]
    end
    D --> E[Getting Started]
    E --> F[Task Workflow]
```
Complete these as emails arrive (usually within 1-2 days of starting):
- Sign NDA (email notification)
- Sign statement of work contract via Remote.com or platform.a.team* (if applicable)
- Install Insightful for time tracking*
  - You'll receive an invite from support@insightful.io (check spam)
  - If no invite, your account may already be active—download from app.insightful.io and log in
  - Setup guide: Insightful onboarding instructions

*May not apply to all team members—confirm with your manager before proceeding.
- Join Discord (invite will be sent to you)
- Attend onboarding call: Daily at 10:00–10:30 PM IST — Google Meet link
- Note: Office hours (Tuesdays and Fridays at 10:30 PM IST) are mandatory
These require an admin to grant access first. Reach out on Discord to Kartik, Chai, Vaibhav, or Nikos if blocked.
- Review the Apex Setup Guide for reference
- Log into Apex UI with your Google account
- Retrieve your API key from the Apex UI dashboard
Install the CLI tool using the install script in the Nebula repo:

```shell
cd Nebula
bash apex-arena-install.sh
```

You'll need an API key from Apex UI to complete the installation.
Authenticate with the same Google account you used for Apex UI:

```shell
gcloud auth login
```

Configure Docker to pull from Google Artifact Registry:

```shell
gcloud auth configure-docker us-central1-docker.pkg.dev
```

| Command | Purpose |
|---|---|
| `apex-arena init <task_name>` | Create a new task from template |
| `apex-arena check-anatomy <folder>` | Validate task folder structure |
| `apex-arena check-quality <task>` | AI-powered quality review (also runs during push/update) |
| `apex-arena validate-grader <grader.py>` | Check grader for issues (also runs during push/update) |
| `apex-arena test-solution <task_id>` | Run solution.sh and verify score |
| `apex-arena tasks list` | List your tasks |
| `apex-arena tasks push <dir>` | Push a new task to Apex (returns a UUID) |
| `apex-arena tasks update <uuid> <dir>` | Update an existing task by UUID |
| `apex-arena tasks download <id>` | Download task(s) by ID |
| `apex-arena eval --tasks <ids>` | Run evaluations locally (multiple tasks) |
| `apex-arena evaluations run <task_id>` | Run evaluations locally (single/remote tasks) |
| `apex-arena grade` | Grade a specific problem |
| `apex-arena version` | Show current apex-arena version |
| `apex-arena update` | Update apex-arena to latest version |
Run `apex-arena --help` or `apex-arena <command> --help` for the full list of commands and options.
- Confirm you can access NebulaAuroras/Nebula
- Confirm you can access the Task Tracking Board
Important: Local development is only supported on Linux or in a Linux VM. Running the Nebula container directly on macOS is not supported due to container/k3s compatibility issues.
The Nebula image uses immutable versioned tags — there is no :latest tag. Always use the fully qualified image name with a version tag. Check the CHANGELOG and releases for the current version.
```shell
docker pull us-central1-docker.pkg.dev/bespokelabs/nebula-devops-registry/nebula-devops:1.0.1
```

Tag it locally for convenience:

```shell
docker tag us-central1-docker.pkg.dev/bespokelabs/nebula-devops-registry/nebula-devops:1.0.1 nebula-devops:latest
```

Test that it runs:

```shell
docker run -d \
  --name nebula-test-container \
  --privileged \
  --cgroupns=private \
  nebula-devops
```

Verify the environment:

```shell
docker exec -it -u ubuntu nebula-test-container bash
watch -n 1 kubectl get pods
```

If you're not certain you'll be running exclusively on local Linux or hosted environments, request a VM in the #vm-request Discord channel.
Once your VM is being provisioned:
- Add your SSH public key to the pubkey spreadsheet
- Wait for VM connection details from admin
- SSH into VM and verify access
- Start Here / Single Source of Truth — check the "Last Updated" date; some sections may be outdated
- Nebula Aurora Instructions — task creation workflow, environment details, debugging
- Apex Arena Documentation — CLI reference and task format
Note: The Nebula Aurora Real Scenarios spreadsheet is deprecated. Task tracking has moved to the GitHub Project Board.
- Receive your task category: SRE / DevOps / Platform Engineering / CloudOps
- Check the Task Tracking Board for available tasks and assignments
Create a simple task to verify your setup:
```mermaid
flowchart LR
    A[init] --> B[edit]
    B --> C[check-anatomy]
    C --> D[check-quality]
    D --> E[test-solution]
    E --> F[push]
    D -.->|issues| B
    E -.->|fails| B
```
1. Initialize a new task:

   ```shell
   apex-arena init my-first-task
   ```

2. Edit the generated files in `tasks/my-first-task/`

3. Validate your task:

   ```shell
   apex-arena check-anatomy tasks/my-first-task
   apex-arena check-quality tasks/my-first-task
   ```

4. Test your solution:

   ```shell
   apex-arena test-solution my-first-task
   ```

5. Push to Apex using the spec ID for your category. The first push returns a UUID for your task:

   ```shell
   apex-arena tasks push tasks/my-first-task --spec <spec-id>
   ```

   You can view your task in the Apex UI at https://apex-ui-v2-319533213591.us-central1.run.app/tasks/<uuid>. For all subsequent updates, use `update` with that UUID:

   ```shell
   apex-arena tasks update <task-uuid> tasks/my-first-task
   ```
| Category | Spec ID |
|---|---|
| DevOps | b407a435-9dc1-4cc3-950c-3194a8f08fde |
| SRE | 46394e31-2a74-47c1-8359-51e1b678146d |
| Platform Engineering | 9e4d158e-96ff-4435-ab39-4d1e389f4b47 |
| CloudOps | 450f2e9c-ba04-429c-bf80-e22be0065313 |
This section covers the ongoing process for creating, reviewing, and evaluating tasks. You are expected to complete at least 1 approved task per week, including all reviews and iterations.
All tasks go through bot review, then two layers of human review before acceptance:
```mermaid
flowchart TD
    A[Approved Task Ideas] --> B[In Progress]
    B --> B1[Bot Review]
    B1 --> C[Ready for Primary Review]
    B1 -.->|address feedback| B
    C --> D[In Primary Review]
    D --> D1[Implementing Primary Feedback]
    D1 --> D
    D --> E[Ready for Secondary Review]
    E --> F[In Secondary Review]
    F --> F1[Implementing Secondary Feedback]
    F1 --> F
    F --> G[Approved]
    G --> H[Done]
```
- Bot Review — Before requesting human review, run `@nebula-reviewer <apex-task-uuid>` in your task-feedback thread. Address all valid points raised by the bot. You can also use `@nebula-reviewer improve <task_id>` for suggestions on increasing difficulty. The bot supports version and model selection:

  ```
  @nebula-reviewer abc123...                 # latest version, biggie-nebula (default)
  @nebula-reviewer abc123... 2               # version 2, biggie-nebula
  @nebula-reviewer abc123... smalli-nebula   # latest version, smalli-nebula
  @nebula-reviewer abc123... 2 smalli-nebula # version 2, smalli-nebula
  ```

  Limit bot usage to once per version of your task — don't spam it.
- Primary Review — A team member reviews your task for correctness and clarity
- Secondary Review — A category lead performs final approval
Note: If a reviewer requests changes, move the task to the corresponding "Implementing Feedback" column while you address it, then back to the review column when ready. Don't move it all the way back to In Progress. See this Discord thread for context.
Use the GitHub project board to track your task through these stages.
- Use the `biggie-nebula` model for all evaluations
- Run at least 8 rollouts per evaluation
- Target a score of < 0.7 across rollouts
- Evaluations can be run locally via `apex-arena eval` or `apex-arena evaluations run`, or hosted through the Apex UI web interface
- Review rollout transcripts in Apex UI — ensure failures are due to task difficulty, not grader bugs or environment instability
- If you see inconsistent pass/fail across rollouts with similar agent behavior, add a short sleep (e.g., 60s) before the first grader check to let the environment stabilize
Important: Your task Dockerfiles must reference the fully qualified versioned image for hosted evals to work:

```dockerfile
FROM us-central1-docker.pkg.dev/bespokelabs/nebula-devops-registry/nebula-devops:1.0.1
```

Do not use `FROM nebula-devops:latest` — it will not resolve in hosted environments.
Partial grading is preferred. Break your task into functional subscores, each representing a real milestone toward the goal.
Rules for subscores:
- Incremental — represents real progress toward the goal
- Objective — deterministic and measurable
- Not gameable — can't earn the reward without actual work
- Equal weights — all subscores should have equal weight
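Because all subscores carry equal weight, combining them reduces to a simple mean. A minimal sketch of that arithmetic (the `final_score` helper is illustrative, not part of the apex-arena grader API):

```python
def final_score(subscores: dict[str, float]) -> float:
    """Combine equally weighted subscores into a single score in [0, 1]."""
    if not subscores:
        return 0.0
    return sum(subscores.values()) / len(subscores)

# Two of three milestones reached -> score of ~0.67
print(final_score({"database_running": 1, "app_responds": 1, "db_connection_works": 0}))
```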
Quick self-test for each subscore:
- "If the agent ONLY gets this subscore, did they make real progress?" → should be yes
- "Can the agent get this reward without working toward the actual goal?" → should be no
Bad subscores: file_exists, no_syntax_errors, config_valid, pods_ready
Good subscores: database_running, app_responds, backup_restores_data, canary_routes_traffic
Example for a task "deploy flask app with postgres database":

```python
# Bad - agent can score well without anything working
subscores = {"requirements_exists": 1, "dockerfile_exists": 1, "no_syntax_errors": 1, "config_file_valid": 1}

# Good - each subscore means something actually works
subscores = {"database_running": 1, "app_responds": 1, "db_connection_works": 1}
weights = {"database_running": 0.33, "app_responds": 0.33, "db_connection_works": 0.33}
```

| Resource | Link |
|---|---|
| Apex UI | https://apex-ui-v2-319533213591.us-central1.run.app/ |
| Apex Setup Guide | Google Slides |
| GitHub Repo | https://github.com/NebulaAuroras/Nebula |
| Task Tracking Board | GitHub Project |
| Nebula CHANGELOG | CHANGELOG.md |
| Instructions Doc | Google Doc |
| Daily Onboarding Call | Google Meet — 10:00 PM IST |
| Office Hours (Tue/Fri) | Google Meet — 10:30 PM IST |
Post in Discord or attend office hours. Kartik, Vaibhav, and Nikos can help with access and operational issues. Shahryar can assist with technical and task-feedback support. For infrastructure issues, ask Greg or Dylan in #nebula-infra.
This is an unofficial onboarding document maintained by Dylan. Bespoke is working on improving their official docs, which may supersede this one over time.