@iftheshoefritz
Created November 28, 2025 16:11
GEMP-LOTR Bot Neural Network System Documentation - ML-based gameplay bots using SMILE library

Bot Neural Network System Documentation

Overview

The GEMP-LOTR bot system uses machine learning to make gameplay decisions. Although the models are informally called "neural nets", the implementation uses logistic regression classifiers from the SMILE (Statistical Machine Intelligence and Learning Engine) library, not deep neural networks.

Architecture

Technology Stack

  • ML Library: SMILE 2.6.0 (com.github.haifengl:smile-core)
  • Model Type: Logistic Regression classifiers (smile.classification.LogisticRegression)
  • Training Method: Self-play in a reinforcement-learning style; the collected decisions are then used to retrain each classifier with supervised learning
  • Language: Java 21

Key Components

Main Bot Implementation

  • FotrStarterBot.java - Primary bot that uses trained models to make decisions
    • Location: gemp-lotr/gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/rl/fotrstarters/FotrStarterBot.java
    • Handles all game decision types (card selection, assignments, multiple choice, etc.)

Model Management

  • ModelRegistry.java - Stores and retrieves trained models at runtime
  • ModelIO.java - Handles saving/loading models to/from disk
    • Models saved as serialized Java objects (.model files)
    • Default location: bot-models/ directory in project root
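
The "serialized Java objects" format mentioned above can be illustrated with a small round trip through standard Java serialization. This is only a sketch: ToyModel and ModelSerializationSketch are hypothetical names, not the project's ModelIO, and the real models are SMILE classifier objects, not a bare weight array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a trained classifier.
final class ToyModel implements Serializable {
    private static final long serialVersionUID = 1L;
    final double[] weights;

    ToyModel(double[] weights) {
        this.weights = weights;
    }
}

// Hypothetical helper showing the save/load round trip in memory.
final class ModelSerializationSketch {
    // Serialize a model into the bytes a .model file would hold.
    static byte[] toBytes(Serializable model) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(model);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild a model from bytes and cast it to the expected type.
    static <T> T fromBytes(byte[] bytes, Class<T> type) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return type.cast(in.readObject());
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not read model", e);
        }
    }
}
```

Writing the same bytes to a file under bot-models/ would give the on-disk form. Because Java serialization is tied to class definitions, a saved model is generally only readable with a compatible version of the classes that produced it.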

Training System

  • BotService.java - Coordinates model training and bot initialization
  • AbstractTrainer.java - Base class for all specialized trainers
  • FotrStartersLearningStepsPersistence.java - Saves/loads training data

Model Storage

Model Files Location

Directory: /gemp-lotr/bot-models/

Format: Serialized Java objects (.model extension)

Size: ~500-1600 bytes per model (very lightweight)

Current Trained Models (32 total)

Card Selection Models:

  • PlayFromHandTrainer.model
  • AttachItemTrainer.model
  • ExertTrainer.model
  • HealTrainer.model
  • DiscardFromHandTrainer.model
  • DiscardFromPlayTrainer.model
  • ReconcileTrainer.model
  • SanctuaryTrainer.model
  • ArcheryWoundTrainer.model
  • SkirmishOrderTrainer.model
  • FallBackCardSelectionTrainer.model

Multiple Choice Models:

  • MulliganTrainer.model
  • GoFirstTrainer.model
  • AnotherMoveTrainer.model

Integer Choice Models:

  • BurdenTrainer.model

Assignment Models:

  • FpAssignmentTrainer.model
  • ShadowAssignmentTrainer.model

Card Action Models (by phase):

  • Fellowship: FellowshipPlayCardTrainer.model, FellowshipUseCardTrainer.model, FellowshipHealTrainer.model, FellowshipTransferTrainer.model
  • Shadow: ShadowPlayCardTrainer.model, ShadowUseCardTrainer.model
  • Maneuver: ManeuverPlayCardTrainer.model, ManeuverUseCardTrainer.model
  • Skirmish: SkirmishFpPlayCardTrainer.model, SkirmishFpUseCardTrainer.model, SkirmishShadowPlayCardTrainer.model, SkirmishShadowUseCardTrainer.model
  • Regroup: RegroupPlayCardTrainer.model

Other Models:

  • OptionalResponsesCardActionTrainer.model
  • CardFromDiscardTrainer.model
  • StartingFellowshipTrainer.model

Training Process

Configuration Flags (in BotService.java)

START_SIMULATIONS_AT_STARTUP = false  // Set to true to enable training
LOAD_MODELS_FROM_FILES = true         // Set to false to retrain from data

Two Operating Modes

Mode 1: Production (Default)

  • LOAD_MODELS_FROM_FILES = true
  • Loads pre-trained models from bot-models/*.model files at server startup
  • No training occurs
  • Fast startup
  • Bot ready to play immediately

Mode 2: Training Mode

  • START_SIMULATIONS_AT_STARTUP = true
  • Triggers self-play training loop

Self-Play Training Loop

Process (defined in BotService.runSelfPlayTrainingLoop()):

  1. Generation 0 (Bootstrap):

    • Two random bots play 1,000 games against each other
    • Generates initial training data from random play
  2. Subsequent Generations:

    • Training bot vs Random bot: 20% of games
    • Training bot vs Training bot: 80% of games
    • Configurable games per generation (default: 10,000)
  3. Data Collection:

    • Each game decision stored as a LearningStep containing:
      • State vector (extracted game state features)
      • Semantic action taken
      • Reward value
      • Decision context
    • Stored in ReplayBuffer (capacity: 100,000 steps)
  4. Persistence:

    • Training steps saved to .jsonl files (JSON Lines format)
    • One file per trainer type
    • Naming pattern: fotr-starters-{trainer-name}.jsonl
    • Files accumulate across generations (append mode)
  5. Retraining:

    • After each generation, all models retrained on complete historical data
    • Process per trainer:
      List<LearningStep> steps = persistence.load(trainer);
      SoftClassifier<double[]> model = trainer.train(steps);
      modelRegistry.registerModel(trainer.getClass(), model);
      ModelIO.saveModel(trainer.getClass(), model);
  6. Model Update:

    • New models saved to bot-models/*.model
    • Used in next generation of play
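
The generation schedule described above can be sketched as a small planning function. Everything here is illustrative: SelfPlayScheduleSketch and GenerationPlan are hypothetical names, the 20% share is applied as a fixed split (the real loop might instead choose opponents per game), and the sketch assumes the generations argument counts only the generations after the bootstrap.

```java
import java.util.ArrayList;
import java.util.List;

final class SelfPlayScheduleSketch {
    // One generation's workload: how many games of each pairing to play.
    record GenerationPlan(int generation, int randomVsRandom, int trainedVsRandom, int trainedVsTrained) {}

    static final int BOOTSTRAP_GAMES = 1_000;          // generation 0: random vs random
    static final double RANDOM_OPPONENT_SHARE = 0.20;  // later generations: share of games vs a random bot

    // Build the plan for the bootstrap generation plus `generations` training generations.
    static List<GenerationPlan> plan(int generations, int gamesPerGeneration) {
        List<GenerationPlan> plans = new ArrayList<>();
        plans.add(new GenerationPlan(0, BOOTSTRAP_GAMES, 0, 0));
        for (int g = 1; g <= generations; g++) {
            int vsRandom = (int) Math.round(gamesPerGeneration * RANDOM_OPPONENT_SHARE);
            plans.add(new GenerationPlan(g, 0, vsRandom, gamesPerGeneration - vsRandom));
        }
        return plans;
    }
}
```

With 10,000 games per generation this split yields 2,000 games against a random bot and 8,000 mirror games. Keeping some games against a random opponent is a common way to stop a self-play learner from only ever seeing its own style of play.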

Training Algorithm

Core Training Code (from AbstractTrainer.java):

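// Convert labeled points into the arrays SMILE expects, then fit a classifier.
public SoftClassifier<double[]> trainWithPoints(List<LabeledPoint> points) {
    // One feature vector per row.
    double[][] x = points.stream().map(LabeledPoint::x).toArray(double[][]::new);
    // One integer class label per row.
    int[] y = points.stream().mapToInt(LabeledPoint::y).toArray();
    // Default hyperparameters; the result is usable as a SoftClassifier<double[]>.
    return LogisticRegression.fit(x, y);
}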
  • Uses SMILE's logistic regression with default parameters
  • Supervised learning on labeled gameplay decisions
  • Each trainer extracts relevant training points from LearningStep history
  • Binary or multi-class classification depending on decision type
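
For readers unfamiliar with what a logistic regression fit produces, the binary case can be sketched in plain Java. This is not SMILE's implementation: it uses simple batch gradient descent with no regularization, purely to show that the trained "model" is a weight vector plus a bias. The class name, the epochs parameter, and the learningRate parameter are assumptions made for illustration.

```java
final class LogisticRegressionSketch {
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Probability of class 1; the last weight is the bias term.
    static double probability(double[] w, double[] features) {
        double z = w[w.length - 1];
        for (int j = 0; j < features.length; j++) {
            z += w[j] * features[j];
        }
        return sigmoid(z);
    }

    // Fit weights by batch gradient descent on the logistic loss.
    static double[] fit(double[][] x, int[] y, int epochs, double learningRate) {
        int d = x[0].length;
        double[] w = new double[d + 1];
        for (int epoch = 0; epoch < epochs; epoch++) {
            double[] grad = new double[d + 1];
            for (int i = 0; i < x.length; i++) {
                double error = probability(w, x[i]) - y[i];
                for (int j = 0; j < d; j++) {
                    grad[j] += error * x[i][j];
                }
                grad[d] += error; // gradient for the bias
            }
            for (int j = 0; j <= d; j++) {
                w[j] -= learningRate * grad[j] / x.length;
            }
        }
        return w;
    }

    // Hard decision: class 1 when the probability is at least 0.5.
    static int predict(double[] w, double[] features) {
        return probability(w, features) >= 0.5 ? 1 : 0;
    }
}
```

A model of this shape stores one number per feature plus a bias, which is consistent with the very small .model files reported in this document.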

How Bots Make Decisions

Decision Flow (FotrStarterBot.java)

  1. Game State → Features:

    • RLGameStateFeatures.extractFeatures() converts game state to feature vector
  2. Decision Type Routing:

    • INTEGER → Integer choice trainers
    • MULTIPLE_CHOICE → Multiple choice trainers
    • CARD_SELECTION → Card selection trainers
    • CARD_ACTION_CHOICE → Card action trainers (by phase)
    • ASSIGN_MINIONS → Assignment trainers
    • ARBITRARY_CARDS → Arbitrary card trainers
    • ACTION_CHOICE → Action choice (currently random)
  3. Model Inference:

    • Appropriate trainer selected based on decision context
    • Trainer uses its trained model to predict best action
    • Falls back to random decision if no trainer applies
  4. Action Recording:

    • State, action, and context stored as LearningStep
    • Episode reward assigned when game ends
    • Steps added to replay buffer for future training
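
The routing and fallback steps above can be sketched as a lookup table keyed by decision type. This is a simplified illustration, not the code in FotrStarterBot.java: DecisionRoutingSketch is a hypothetical class, a handler is modelled as a function from the feature vector to an optional answer index, and the real bot chooses among trainers using richer decision context.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;
import java.util.Random;
import java.util.function.Function;

final class DecisionRoutingSketch {
    // Decision types as listed in the decision flow.
    enum DecisionType {
        INTEGER, MULTIPLE_CHOICE, CARD_SELECTION, CARD_ACTION_CHOICE,
        ASSIGN_MINIONS, ARBITRARY_CARDS, ACTION_CHOICE
    }

    // A handler may return an answer index, or decline with Optional.empty().
    private final Map<DecisionType, Function<double[], Optional<Integer>>> handlers =
            new EnumMap<>(DecisionType.class);
    private final Random rng;

    DecisionRoutingSketch(Random rng) {
        this.rng = rng;
    }

    void register(DecisionType type, Function<double[], Optional<Integer>> handler) {
        handlers.put(type, handler);
    }

    // Ask the registered handler first; fall back to a random option otherwise.
    int choose(DecisionType type, double[] features, int optionCount) {
        Function<double[], Optional<Integer>> handler = handlers.get(type);
        if (handler != null) {
            Optional<Integer> answer = handler.apply(features);
            if (answer.isPresent()) {
                return answer.get();
            }
        }
        return rng.nextInt(optionCount);
    }
}
```

Leaving a type such as ACTION_CHOICE unregistered reproduces the "currently random" behaviour noted in the list: the lookup finds no handler and the random fallback answers.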

Specialized Trainers

Each trainer is responsible for one specific type of decision:

Example - MulliganTrainer:

  • Triggers on decision text containing "mulligan"
  • Binary classification: "Yes" or "No"
  • Learns when to keep vs mulligan starting hand

Example - FpAssignmentTrainer:

  • Handles Free Peoples player assigning characters to minions
  • Multi-class classification over possible assignment combinations
  • Learns optimal skirmish assignments
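
The idea that each trainer claims one kind of decision can be sketched with a minimal applicability check. The interface below is hypothetical and much smaller than the project's trainer classes; the case-insensitive substring match and the sample decision texts in the usage note are assumptions.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class TrainerSelectionSketch {
    // Hypothetical minimal contract: a trainer says whether it handles a decision.
    interface Trainer {
        String name();
        boolean appliesTo(String decisionText);
    }

    // A trainer that triggers when the decision text contains a keyword.
    record KeywordTrainer(String name, String keyword) implements Trainer {
        @Override
        public boolean appliesTo(String decisionText) {
            return decisionText.toLowerCase(Locale.ROOT).contains(keyword);
        }
    }

    // Return the first trainer that claims the decision, if any.
    static Optional<Trainer> select(List<Trainer> trainers, String decisionText) {
        return trainers.stream().filter(t -> t.appliesTo(decisionText)).findFirst();
    }
}
```

With a KeywordTrainer named "MulliganTrainer" and keyword "mulligan", a prompt such as "Do you wish to mulligan?" is claimed by it, while an unrelated prompt returns an empty result and would go to the random fallback.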

Training Data Format

LearningStep Structure (JSON Lines)

Each line in .jsonl files represents one decision:

{
  "stateVector": [0.2, 0.5, 1.0, ...],  // Extracted game state features
  "action": {...},                       // Semantic action object
  "reward": 1.0,                         // Game outcome reward
  "isCurrentPlayer": true,               // Whether bot was active player
  "decision": {...}                      // Decision context
}

Reward Structure

  • Win: +1.0
  • Loss: -1.0 (or 0.0, depending on implementation)
  • Applied to all decisions in the episode
  • Credit assignment is global: every decision in a game receives the same end-of-game reward, with no temporal discounting
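
The reward scheme above amounts to stamping one value onto every step of a finished game. A minimal sketch follows, with a hypothetical trimmed-down Step class (the real LearningStep also carries the action and decision context) and the -1.0 loss value chosen only for illustration.

```java
import java.util.List;

final class EpisodeRewardSketch {
    // Hypothetical, trimmed-down step: just the state and the reward slot.
    static final class Step {
        final double[] stateVector;
        double reward; // filled in once the game ends

        Step(double[] stateVector) {
            this.stateVector = stateVector;
        }
    }

    static final double WIN_REWARD = 1.0;
    static final double LOSS_REWARD = -1.0; // assumption; the text notes it may be 0.0 instead

    // Give every step of the finished game the same final reward (no discounting).
    static void assignEpisodeReward(List<Step> episode, boolean won) {
        double reward = won ? WIN_REWARD : LOSS_REWARD;
        for (Step step : episode) {
            step.reward = reward;
        }
    }
}
```

Because early and late decisions are rewarded identically, a good early move in a lost game is penalised as much as the mistake that lost it. That is the limitation the TD-learning and discount-factor ideas listed under future enhancements would address.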

Bot Instances

Registered Bots

General Bot: ~bot

  • Used for any deck/format
  • Uses FotrStarterBot with trained models

Format-Specific Bots (FotR Block only):

  • ~AragornBot - Plays Aragorn Starter deck
  • ~GandalfBot - Plays Gandalf Starter deck

All bots share the same trained models but may play different decks.

Development Notes

To Enable Training Mode

  1. Edit BotService.java:

    START_SIMULATIONS_AT_STARTUP = true;
  2. Optionally configure:

    runSelfPlayTrainingLoop(
        5,      // generations
        10000   // games per generation
    );
  3. Rebuild and restart server:

    mvn install
    docker-compose restart
  4. Training will run at startup (may take hours depending on configuration)

To Retrain from Existing Data

  1. Edit BotService.java:

    LOAD_MODELS_FROM_FILES = false;
  2. Ensure .jsonl training data files exist in project root

  3. Restart server - models will be retrained and saved

Performance Characteristics

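  • Model Size: Very lightweight (~0.5-1.6 KB each)
  • Inference Speed: Fast; a logistic regression prediction is a weighted sum of the features per class followed by a sigmoid or softmax
  • Training Speed: Grows with the amount of accumulated training data, since each model is refit on the full history after every generation
  • Memory: ReplayBuffer holds up to 100,000 steps in memory during training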
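
The memory bound in the last bullet can be pictured as a fixed-capacity buffer. The sketch below is not the project's ReplayBuffer: the class name is hypothetical, and evicting the oldest step when full is an assumption, since the eviction policy is not stated in this document.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class BoundedReplayBufferSketch<T> {
    private final int capacity;
    private final Deque<T> steps = new ArrayDeque<>();

    BoundedReplayBufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Append a step; when the buffer is full, drop the oldest step first.
    void add(T step) {
        if (steps.size() == capacity) {
            steps.removeFirst();
        }
        steps.addLast(step);
    }

    int size() {
        return steps.size();
    }

    // Copy of the current contents, oldest first.
    List<T> snapshot() {
        return new ArrayList<>(steps);
    }
}
```

Constructed with a capacity of 100,000, such a buffer keeps memory use bounded no matter how many games are simulated, while the .jsonl files retain the full history on disk.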

Future Enhancement Possibilities

  • Implement temporal difference learning (TD-learning)
  • Add discount factor for credit assignment
  • Experiment with different feature engineering
  • Try ensemble methods
  • Add opponent modeling
  • Implement deep neural networks (would require DL4J or similar)
  • Expand to other formats beyond FotR starters

Related Files

Core Bot Code

  • gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/
    • BotService.java - Main coordination
    • BotPlayer.java - Bot interface
    • BotGameStateListener.java - Observes game state

RL Framework

  • gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/rl/
    • LearningBotPlayer.java - Learning bot interface
    • LearningStep.java - Training data structure
    • ReplayBuffer.java - Experience replay
    • DecisionAnswerer.java - Trainer interface
    • RLGameStateFeatures.java - Feature extraction interface
    • semanticaction/ - Semantic action representations

FotR Starters Implementation

  • gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/rl/fotrstarters/
    • FotrStarterBot.java - Main bot implementation
    • FotrStartersRLGameStateFeatures.java - Feature extraction
    • CardFeatures.java - Card-specific features
    • models/ - All 32 trainer implementations

Simulation

  • gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/simulation/
    • FotrStartersSimulation.java - Game simulation setup
    • SimpleBatchSimulationRunner.java - Batch game runner
    • SimulationStats.java - Statistics tracking

Contact

For questions about the bot system, contact ketura in the #gemp-dev channel of the PC Discord.
