Bot Neural Network System Documentation

Overview

The GEMP-LOTR bot system uses machine learning to make gameplay decisions. Although the models are often called "neural nets" in conversation, they are Logistic Regression classifiers from the SMILE (Statistical Machine Intelligence and Learning Engine) library, not deep neural networks.

Architecture

Technology Stack

ML Library: SMILE 2.6.0 (com.github.haifengl:smile-core)
Model Type: Logistic Regression classifiers (smile.classification.LogisticRegression)
Training Method: Reinforcement-style self-play (models are refit with supervised learning on reward-labeled decisions collected from self-play games)
Language: Java 21

Key Components

Main Bot Implementation

FotrStarterBot.java - Primary bot that uses trained models to make decisions
- Location: gemp-lotr/gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/rl/fotrstarters/FotrStarterBot.java
- Handles all game decision types (card selection, assignments, multiple choice, etc.)

Model Management

ModelRegistry.java - Stores and retrieves trained models at runtime
ModelIO.java - Handles saving/loading models to/from disk
- Models saved as serialized Java objects (.model files)
- Default location: bot-models/ directory in project root
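
The internals of ModelIO.java are not reproduced in this document. As a rough illustration of what "serialized Java objects" means in practice, the sketch below round-trips an object through standard Java serialization. The class name ModelFileSketch and its methods are hypothetical, and a plain double[] stands in for a trained model; the real ModelIO API may differ.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; the real ModelIO API may differ. */
final class ModelFileSketch {
    private ModelFileSketch() {}

    /** Serializes the model to <dir>/<trainerName>.model and returns the file path. */
    static Path save(Path dir, String trainerName, Serializable model) {
        try {
            Files.createDirectories(dir);
            Path file = dir.resolve(trainerName + ".model");
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(model);
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deserializes whatever object the file holds; the caller casts it to the model type. */
    static Object load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Model class missing from classpath", e);
        }
    }
}
```

Because Java serialization stores class names, a saved file can only be loaded when the same model class is available on the classpath.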

Training System

BotService.java - Coordinates model training and bot initialization
AbstractTrainer.java - Base class for all specialized trainers
FotrStartersLearningStepsPersistence.java - Saves/loads training data

Model Storage

Model Files Location

Directory: /gemp-lotr/bot-models/

Format: Serialized Java objects (.model extension)

Size: ~500-1600 bytes per model (very lightweight)

Current Trained Models (32 total)

Card Selection Models:

PlayFromHandTrainer.model
AttachItemTrainer.model
ExertTrainer.model
HealTrainer.model
DiscardFromHandTrainer.model
DiscardFromPlayTrainer.model
ReconcileTrainer.model
SanctuaryTrainer.model
ArcheryWoundTrainer.model
SkirmishOrderTrainer.model
FallBackCardSelectionTrainer.model

Multiple Choice Models:

MulliganTrainer.model
GoFirstTrainer.model
AnotherMoveTrainer.model

Integer Choice Models:

BurdenTrainer.model

Assignment Models:

FpAssignmentTrainer.model
ShadowAssignmentTrainer.model

Card Action Models (by phase):

Fellowship: FellowshipPlayCardTrainer.model, FellowshipUseCardTrainer.model, FellowshipHealTrainer.model, FellowshipTransferTrainer.model
Shadow: ShadowPlayCardTrainer.model, ShadowUseCardTrainer.model
Maneuver: ManeuverPlayCardTrainer.model, ManeuverUseCardTrainer.model
Skirmish: SkirmishFpPlayCardTrainer.model, SkirmishFpUseCardTrainer.model, SkirmishShadowPlayCardTrainer.model, SkirmishShadowUseCardTrainer.model
Regroup: RegroupPlayCardTrainer.model

Other Models:

OptionalResponsesCardActionTrainer.model
CardFromDiscardTrainer.model
StartingFellowshipTrainer.model

Training Process

Configuration Flags (in BotService.java)

START_SIMULATIONS_AT_STARTUP = false  // Set to true to enable training
LOAD_MODELS_FROM_FILES = true         // Set to false to retrain from data

Two Operating Modes

Mode 1: Production (Default)

LOAD_MODELS_FROM_FILES = true
Loads pre-trained models from bot-models/*.model files at server startup
No training occurs
Fast startup
Bot ready to play immediately

Mode 2: Training Mode

START_SIMULATIONS_AT_STARTUP = true
Triggers self-play training loop

Self-Play Training Loop

Process (defined in BotService.runSelfPlayTrainingLoop()):

Generation 0 (Bootstrap):
- Two random bots play 1,000 games against each other
- Generates initial training data from random play
Subsequent Generations:
- Training bot vs Random bot: 20% of games
- Training bot vs Training bot: 80% of games
- Configurable games per generation (default: 10,000)
Data Collection:
- Each game decision stored as a LearningStep containing:
  - State vector (extracted game state features)
  - Semantic action taken
  - Reward value
  - Decision context
- Stored in ReplayBuffer (capacity: 100,000 steps)
Persistence:
- Training steps saved to .jsonl files (JSON Lines format)
- One file per trainer type
- Naming pattern: fotr-starters-{trainer-name}.jsonl
- Files accumulate across generations (append mode)
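
The append-mode persistence described above can be sketched with the JDK alone. This is a minimal illustration, not the code in FotrStartersLearningStepsPersistence.java: the class JsonlAppendSketch and its methods are hypothetical, and each step is assumed to have already been serialized to a single-line JSON string.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical sketch of append-only JSON Lines storage, one file per trainer. */
final class JsonlAppendSketch {
    private JsonlAppendSketch() {}

    /** Builds the file name from the documented pattern fotr-starters-{trainer-name}.jsonl. */
    static Path fileFor(Path dir, String trainerName) {
        return dir.resolve("fotr-starters-" + trainerName + ".jsonl");
    }

    /** Appends one line per JSON object, creating the file on first use. */
    static void append(Path file, List<String> jsonObjects) {
        try {
            Files.write(file, jsonObjects, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads every stored line back, oldest first. */
    static List<String> readAll(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since nothing is ever rewritten, data from earlier generations stays in the file and is picked up again at the next retraining pass.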

Retraining:

After each generation, all models are retrained on the complete historical data.

Process per trainer:

List<LearningStep> steps = persistence.load(trainer);
SoftClassifier<double[]> model = trainer.train(steps);
modelRegistry.registerModel(trainer.getClass(), model);
ModelIO.saveModel(trainer.getClass(), model);

Model Update:
- New models saved to bot-models/*.model
- Used in next generation of play
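
The generation schedule (random-only bootstrap, then a 20/80 split between random and trained opponents) can be sketched as follows. The names here are hypothetical, and whether BotService interleaves the two matchup types or runs them in blocks is not stated in this document; the sketch simply runs the games against the random bot first.

```java
/** Hypothetical sketch of the per-generation matchup schedule. */
final class SelfPlayScheduleSketch {
    enum Matchup { RANDOM_VS_RANDOM, TRAINED_VS_RANDOM, TRAINED_VS_TRAINED }

    private SelfPlayScheduleSketch() {}

    /** Picks the matchup for one game; generation 0 is the random bootstrap. */
    static Matchup matchupFor(int generation, int gameIndex, int gamesInGeneration) {
        if (generation == 0) {
            return Matchup.RANDOM_VS_RANDOM;
        }
        int gamesVsRandom = gamesInGeneration / 5; // 20% of the generation
        return gameIndex < gamesVsRandom ? Matchup.TRAINED_VS_RANDOM : Matchup.TRAINED_VS_TRAINED;
    }

    /** Counts how many games in a generation use the given matchup. */
    static int count(int generation, int gamesInGeneration, Matchup wanted) {
        int total = 0;
        for (int game = 0; game < gamesInGeneration; game++) {
            if (matchupFor(generation, game, gamesInGeneration) == wanted) {
                total++;
            }
        }
        return total;
    }
}
```

Keeping a share of games against the random bot presumably guards against the trained bot overfitting to its own style, though the document does not state the motivation.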

Training Algorithm

Core Training Code (from AbstractTrainer.java):

public SoftClassifier<double[]> trainWithPoints(List<LabeledPoint> points) {
    double[][] x = points.stream().map(LabeledPoint::x).toArray(double[][]::new);
    int[] y = points.stream().mapToInt(LabeledPoint::y).toArray();
    return LogisticRegression.fit(x, y);
}

Uses SMILE's logistic regression with default parameters
Supervised learning on labeled gameplay decisions
Each trainer extracts relevant training points from LearningStep history
Binary or multi-class classification depending on decision type
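
For intuition about what a fitted binary model computes, logistic regression scores a feature vector as sigmoid(w·x + b). The sketch below hand-rolls that formula; it is not SMILE's implementation, and the class name, weights, and bias are made up for illustration.

```java
/** Hand-rolled binary logistic regression scoring, for intuition only. */
final class LogisticScoreSketch {
    private LogisticScoreSketch() {}

    /** Maps any real number into the open interval (0, 1). */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Returns the estimated probability of the positive class for feature vector x. */
    static double probability(double[] weights, double bias, double[] x) {
        if (weights.length != x.length) {
            throw new IllegalArgumentException("weights and features must have the same length");
        }
        double z = bias;
        for (int i = 0; i < x.length; i++) {
            z += weights[i] * x[i];
        }
        return sigmoid(z);
    }
}
```

A model of this shape stores only one weight per feature plus a bias, which is consistent with the small .model file sizes reported above.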

How Bots Make Decisions

Decision Flow (FotrStarterBot.java)

Game State → Features:
- RLGameStateFeatures.extractFeatures() converts game state to feature vector
Decision Type Routing:
- INTEGER → Integer choice trainers
- MULTIPLE_CHOICE → Multiple choice trainers
- CARD_SELECTION → Card selection trainers
- CARD_ACTION_CHOICE → Card action trainers (by phase)
- ASSIGN_MINIONS → Assignment trainers
- ARBITRARY_CARDS → Arbitrary card trainers
- ACTION_CHOICE → Action choice (currently random)
Model Inference:
- Appropriate trainer selected based on decision context
- Trainer uses its trained model to predict best action
- Falls back to random decision if no trainer applies
Action Recording:
- State, action, and context stored as LearningStep
- Episode reward assigned when game ends
- Steps added to replay buffer for future training
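
The routing and fallback steps above can be sketched as a first-match dispatch. The Answerer interface and the decide method are hypothetical stand-ins, not the signatures of DecisionAnswerer.java or FotrStarterBot.java; the sketch only shows the "first applicable trainer, else random" idea.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of trainer selection with a random fallback. */
final class DecisionRoutingSketch {
    /** Stand-in for a trainer that can answer some kinds of decisions. */
    interface Answerer {
        boolean appliesTo(String decisionType, String decisionText);

        /** Returns the index of the chosen option, in the range [0, optionCount). */
        int chooseIndex(double[] stateVector, int optionCount);
    }

    private DecisionRoutingSketch() {}

    /** Uses the first applicable answerer; otherwise picks an option uniformly at random. */
    static int decide(List<Answerer> answerers, String decisionType, String decisionText,
                      double[] stateVector, int optionCount, Random rng) {
        for (Answerer answerer : answerers) {
            if (answerer.appliesTo(decisionType, decisionText)) {
                return answerer.chooseIndex(stateVector, optionCount);
            }
        }
        return rng.nextInt(optionCount);
    }
}
```

With first-match dispatch the order of the answerer list matters, so a catch-all such as a fallback card selection trainer would need to come last.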

Specialized Trainers

Each trainer is responsible for one specific type of decision:

Example - MulliganTrainer:

Triggers on decision text containing "mulligan"
Binary classification: "Yes" or "No"
Learns when to keep vs mulligan starting hand
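
A trainer of this kind needs two small pieces: a test for whether it applies, and a mapping from the model's output to an answer. The sketch below assumes a case-insensitive text match and a 0.5 threshold on the probability of "Yes"; both are assumptions, and the class name is hypothetical.

```java
import java.util.Locale;

/** Hypothetical sketch of the applicability check and answer mapping for a Yes/No decision. */
final class MulliganDecisionSketch {
    private MulliganDecisionSketch() {}

    /** True when the decision text mentions a mulligan (case-insensitive by assumption). */
    static boolean appliesTo(String decisionText) {
        return decisionText != null
                && decisionText.toLowerCase(Locale.ROOT).contains("mulligan");
    }

    /** Converts the model's estimated probability of "Yes" into one of the two answers. */
    static String answer(double probabilityOfYes) {
        return probabilityOfYes >= 0.5 ? "Yes" : "No";
    }
}
```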

Example - FpAssignmentTrainer:

Handles Free Peoples player assigning characters to minions
Multi-class classification over possible assignment combinations
Learns optimal skirmish assignments

Training Data Format

LearningStep Structure (JSON Lines)

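Each line in the .jsonl files is one JSON object representing one decision. The example below is pretty-printed and annotated for readability; JSON Lines itself keeps each object on a single line and does not allow comments: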

{
  "stateVector": [0.2, 0.5, 1.0, ...],  // Extracted game state features
  "action": {...},                       // Semantic action object
  "reward": 1.0,                         // Game outcome reward
  "isCurrentPlayer": true,               // Whether bot was active player
  "decision": {...}                      // Decision context
}

Reward Structure

Win: +1.0
Loss: -1.0 (or 0.0, depending on implementation)
Applied to all decisions in the episode
Credit assignment is global (no temporal discount)
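
Global credit assignment can be sketched as stamping the final game outcome onto every step of the episode. The Step record and method below are hypothetical simplifications of LearningStep; the loss reward is passed in as a parameter because this document leaves open whether it is -1.0 or 0.0.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of undiscounted, episode-wide reward assignment. */
final class EpisodeRewardSketch {
    /** Simplified stand-in for a recorded decision. */
    record Step(double[] stateVector, String action, double reward) {}

    private EpisodeRewardSketch() {}

    /** Returns copies of the steps that all carry the same final reward. */
    static List<Step> assignFinalReward(List<Step> episode, boolean won, double lossReward) {
        double reward = won ? 1.0 : lossReward;
        List<Step> labeled = new ArrayList<>(episode.size());
        for (Step step : episode) {
            labeled.add(new Step(step.stateVector(), step.action(), reward));
        }
        return labeled;
    }
}
```

Because early and late decisions receive identical credit, a good move in a lost game is labeled the same as the mistake that lost it; the discounting idea listed under future enhancements would address this.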

Bot Instances

Registered Bots

General Bot: ~bot

Used for any deck/format
Uses FotrStarterBot with trained models

Format-Specific Bots (FotR Block only):

~AragornBot - Plays Aragorn Starter deck
~GandalfBot - Plays Gandalf Starter deck

All bots share the same trained models but may play different decks.

Development Notes

To Enable Training Mode

Edit BotService.java:
```
START_SIMULATIONS_AT_STARTUP = true;
```

Optionally configure:

runSelfPlayTrainingLoop(
    5,      // generations
    10000   // games per generation
);

Rebuild and restart server:
```
mvn install
docker-compose restart
```
Training will run at startup (may take hours depending on configuration)

To Retrain from Existing Data

Edit BotService.java:
```
LOAD_MODELS_FROM_FILES = false;
```
Ensure the .jsonl training data files exist in the project root
Rebuild and restart the server; models will be retrained and saved

Performance Characteristics

Model Size: Very lightweight (roughly 0.5-1.6 KB each)
Inference Speed: Fast; a logistic regression prediction is a weighted sum of the features followed by a sigmoid or softmax
Training Speed: Depends on data size, but relatively fast
Memory: ReplayBuffer holds 100K steps in memory during training
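
A fixed-capacity buffer keeps memory bounded regardless of how many games are played. The sketch below is a minimal version that evicts the oldest entry when full; the eviction policy and the class name are assumptions, since the behavior of ReplayBuffer.java at capacity is not described here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical fixed-capacity buffer that drops the oldest item when full. */
final class BoundedBufferSketch<T> {
    private final int capacity;
    private final Deque<T> items = new ArrayDeque<>();

    BoundedBufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds an item, evicting the oldest one first if the buffer is already full. */
    void add(T item) {
        if (items.size() == capacity) {
            items.removeFirst();
        }
        items.addLast(item);
    }

    int size() {
        return items.size();
    }

    /** Returns the current contents, oldest first, as an independent copy. */
    List<T> snapshot() {
        return new ArrayList<>(items);
    }
}
```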

Future Enhancement Possibilities

Implement temporal difference learning (TD-learning)
Add discount factor for credit assignment
Experiment with different feature engineering
Try ensemble methods
Add opponent modeling
Implement deep neural networks (would require DL4J or similar)
Expand to other formats beyond FotR starters

Related Files

Core Bot Code

gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/
- BotService.java - Main coordination
- BotPlayer.java - Bot interface
- BotGameStateListener.java - Observes game state

RL Framework

gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/rl/
- LearningBotPlayer.java - Learning bot interface
- LearningStep.java - Training data structure
- ReplayBuffer.java - Experience replay
- DecisionAnswerer.java - Trainer interface
- RLGameStateFeatures.java - Feature extraction interface
- semanticaction/ - Semantic action representations

FotR Starters Implementation

gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/rl/fotrstarters/
- FotrStarterBot.java - Main bot implementation
- FotrStartersRLGameStateFeatures.java - Feature extraction
- CardFeatures.java - Card-specific features
- models/ - All 32 trainer implementations

Simulation

gemp-lotr-server/src/main/java/com/gempukku/lotro/bots/simulation/
- FotrStartersSimulation.java - Game simulation setup
- SimpleBatchSimulationRunner.java - Batch game runner
- SimulationStats.java - Statistics tracking

Contact

For questions about the bot system, contact ketura in the #gemp-dev channel of the PC Discord.

iftheshoefritz/BOT_NEURAL_NETWORK_SYSTEM.md
