Service Region Test Investigation - Detailed Findings

Date: 2025-10-03 Test File: apps/admin/test/admin_web/integration/service_region_priority_test.exs:134 Test Name: "fetches only providers meeting criteria"

Executive Summary

Critical Finding: The insert_polygons function is COMPLETELY UNNECESSARY for this test. The test runs in 2.7 seconds, and the actual function being tested (available_providers) only takes 6.75ms. The remaining 2.6+ seconds is spent on WKT decoding and database inserts that serve no purpose.

Performance Breakdown (Instrumented)

Current Timing (with `insert_polygons` commented out)

Total Test Time: 2.7 seconds

Breakdown:
- Insert provider 'Orion Towing': 24.47ms
- Create 4 provider nodes for 'Orion Towing': 1,291.39ms  (48% of test time)
- Insert provider 'Battery City': 0.46ms
- Create 3 provider nodes for 'Battery City': 1,273.22ms  (47% of test time)
- Call @subject.available_providers: 6.75ms              (0.25% of test time!)

Key Insight: 95% of test time is spent creating provider nodes with WKT-decoded coordinates, but the actual function being tested only takes 6.75ms!

What Takes Time Creating Provider Nodes

Each provider node creation involves:

Reading from CSV stream (geodata_stream)
Filtering postcodes
WKT decoding - Geo.WKT.decode!(geo_shape["shape_geom"]) (~25-30ms per postcode)
Extracting random coordinates from decoded polygon
Database insert

Total for 7 provider nodes: ~2,564ms (2.6 seconds)

Investigation: Does `available_providers` Need PostGIS?

The Function Implementation

File: apps/admin/lib/admin/service_regions.ex:115

def available_providers(region_id, service_type) do
  case Ecto.UUID.cast(region_id) do
    {:ok, service_region_id} ->
      query =
        from(p in Provider,
          distinct: p.id,
          join: pn in ProviderNode,
          on: pn.provider_id == p.id,
          join: pcode in Postcode,
          on: pcode.postcode == pn.postcode,  # ← String match only!
          join: sp in ServiceRegion,
          on: sp.id == pcode.service_region_id,
          where:
            sp.id == ^service_region_id and
              p.service_type == ^service_type and
              p.status == ^"Active"
        )

      Repo.all(query)

    error ->
      Logger.error("#{__MODULE__}.available_providers/2 UUID CAST ERROR...")
      []
  end
end

Analysis

CRITICAL: This function does ZERO geographic/spatial queries!

The join logic:

Provider → ProviderNode (provider_id match)
ProviderNode → Postcode (postcode STRING match: pcode.postcode == pn.postcode)
Postcode → ServiceRegion (service_region_id match)

No PostGIS operations:

❌ No distance calculations
❌ No ST_Within, ST_Contains, ST_Intersects
❌ No geographic buffers or radius queries
❌ No coordinate-based spatial queries
✅ Only string-based postcode matching

What the Test Actually Needs

Current test setup:

# Setup creates:
1. Service region with postcodes: ["2224", "2225", "2226", "2227", "2228", "2229"]
2. Provider nodes with postcodes matching the region
3. Coordinates extracted from WKT geometries (longitude, latitude)

# What available_providers actually uses:
- Only the postcode STRING from ProviderNode
- NEVER uses longitude or latitude
- NEVER queries GeoPostcode table (with PostGIS geometries)

Proof: The Postcode table (joined in the query) is NOT the GeoPostcode table. They're different tables!

GeoPostcode - Contains PostGIS geometries, created by insert_polygons - NEVER QUERIED
Postcode - Simple table with postcode strings and service_region_id - ACTUALLY USED

Why WKT Decoding is Expensive and Unnecessary

WKT Geometry Example

"POLYGON((151.05 -33.90, 151.10 -33.90, 151.10 -33.95, 151.05 -33.95, 151.05 -33.90))"

Decoding process (~25-30ms per postcode):

Parse WKT string
Validate polygon structure
Extract all coordinate pairs
Flatten nested lists
Select random coordinate

What the test uses from this:

A longitude value (e.g., 151.07)
A latitude value (e.g., -33.92)

What available_providers uses:

NOTHING - these coordinates are stored in provider_node but never queried!

The Smoking Gun: Database Tables

Looking at the query joins:

join: pcode in Postcode,  # ← This is Members.Schemas.Operations.Postcode
on: pcode.postcode == pn.postcode

But insert_polygons inserts into:

GeoPostcode.changeset(i)  # ← This is Members.Schemas.Operations.GeoPostcode
|> OperationsRepo.insert!(on_conflict: :nothing)

These are DIFFERENT TABLES:

Postcode - Simple lookup table (postcode string → service_region_id)
GeoPostcode - PostGIS geometry table with complex polygons

The test creates Postcode records via:

OperationsFactory.insert(:service_region_postcode,
  service_region: svc_region,
  postcode: pcode)

This creates records in the simple Postcode table, NOT GeoPostcode.

Conclusion

Facts

available_providers NEVER queries GeoPostcode table
- It queries the simple Postcode table (string lookups)
- No PostGIS/spatial operations
insert_polygons is completely irrelevant
- Inserts into GeoPostcode (with complex geometries)
- This table is never touched by the test logic
WKT decoding wastes 2.5+ seconds
- Extracts coordinates that are never used by the query
- available_providers doesn't look at longitude/latitude
- Only matches on postcode strings
Test works WITHOUT insert_polygons
- Currently runs in 2.7s with it commented out
- Would still work with simpler coordinates

Recommended Optimizations

Quick Win: Use Fixed Coordinates (saves ~2.5 seconds)

Replace WKT decoding with simple pre-defined coordinates:

# Instead of:
{long, lat} = Geo.WKT.decode!(geo_shape["shape_geom"]).coordinates
              |> List.flatten()
              |> Enum.random()

# Use:
coords = %{
  "2224" => {151.15, -33.95},
  "2225" => {151.25, -33.95},
  "2226" => {151.20, -34.00},
  # ... etc
}
{long, lat} = coords[postcode]

Expected improvement: 2.7s → ~0.2s (92% faster!)

Alternative: Remove Coordinates Entirely (saves ~2.5 seconds)

Since available_providers doesn't use coordinates:

# Don't even set longitude/latitude - just use postcodes
OperationsFactory.insert(:provider_node,
  provider: other,
  locality: "Newcastle",
  postcode: "2225"
  # No longitude/latitude needed!
)

Expected improvement: 2.7s → ~0.2s (92% faster!)

Risk: If provider_node factory requires coordinates, use defaults:

longitude: 0.0,
latitude: 0.0

Action Items

Remove WKT decoding - Use pre-defined coordinate map or defaults
Verify insert_polygons is unused - Confirm GeoPostcode table is never queried
Update test - Simplify to only create data that's actually used
Document pattern - Share this finding for similar tests

Files to Update

apps/admin/test/admin_web/integration/service_region_priority_test.exs:134
- Replace WKT decoding with coordinate map
- Remove dependency on CSV files
- Expected: 2.7s → 0.2s

Questions Answered

Q: Does insert_polygons play a role in available_providers? A: NO. The function only does string-based postcode matching. It never queries the GeoPostcode table with PostGIS geometries.

Q: Why does the test take 2.7 seconds? A: Because it's doing expensive WKT geometry decoding (~25-30ms × 7 postcodes = ~200ms) plus stream processing overhead and database inserts (~2.3s total). The actual function being tested takes 6.75ms.

Q: Can we optimize this without changing behavior? A: YES. Replace WKT decoding with a simple coordinate lookup map. The test logic doesn't depend on actual geographic data - just postcode strings.

wiserfirst/service_region_test_investigation.md

Select an option

No results found

Select an option

No results found

Service Region Test Investigation - Detailed Findings

Executive Summary

Performance Breakdown (Instrumented)

Current Timing (with `insert_polygons` commented out)

What Takes Time Creating Provider Nodes

Investigation: Does `available_providers` Need PostGIS?

The Function Implementation

Analysis

What the Test Actually Needs

Why WKT Decoding is Expensive and Unnecessary

WKT Geometry Example

The Smoking Gun: Database Tables

Conclusion

Facts

Recommended Optimizations

Quick Win: Use Fixed Coordinates (saves ~2.5 seconds)

Alternative: Remove Coordinates Entirely (saves ~2.5 seconds)

Action Items

Files to Update

Questions Answered

wiserfirst/service_region_test_investigation.md

Service Region Test Investigation - Detailed Findings

Executive Summary

Performance Breakdown (Instrumented)

Current Timing (with insert_polygons commented out)

What Takes Time Creating Provider Nodes

Investigation: Does available_providers Need PostGIS?

The Function Implementation

Analysis

What the Test Actually Needs

Why WKT Decoding is Expensive and Unnecessary

WKT Geometry Example

The Smoking Gun: Database Tables

Conclusion

Facts

Recommended Optimizations

Quick Win: Use Fixed Coordinates (saves ~2.5 seconds)

Alternative: Remove Coordinates Entirely (saves ~2.5 seconds)

Action Items

Files to Update

Questions Answered

Current Timing (with `insert_polygons` commented out)

Investigation: Does `available_providers` Need PostGIS?