Date: January 20, 2026
Investigated by: Scott Walker
Company: 634ecb9e56cf240017ee64cc (Onin Staffing)
Period: September 2025
Two analytics pages show different user counts for the same company and month:
/analytics/bonus_totals/summary(Legacy Analytics)/admin/reports/participation(Participation Report)
For company 634ecb9e56cf240017ee64cc in September 2025:
- Bonus Totals: 673 users
- Participation Report: 723 users
- Difference: 50 users
Backfill jobs are creating HistoryTracker records with created_at dates from before users existed. These invalid historical records propagate through the analytics pipeline:
- Backfill jobs create HistoryTracker records with fake historical timestamps
InitializeTimeSeriesUserPropertiesreads those records and creates TSUP records with the same timestampsGetGivingAndReceivingEligibilityDatacounts users based on TSUP records, including months before users existed
The User's created_at field is still correct. The bug doesn't change when the user was actually created. What happens is that queries using TimeSeriesUserProperties are fooled into thinking the user was "active" in earlier months.
| Data Source | What it shows for jperri | Correct? |
|---|---|---|
User.created_at |
October 2, 2025 | ✅ Correct |
TimeSeriesUserProperties |
Records claiming 2022, Aug 2025, etc. | ❌ Misleading |
Queries that are fooled (use TSUP without checking User.created_at):
GetGivingAndReceivingEligibilityData→ Participation Report
Queries that work correctly (check User.created_at):
GetGroupsPointsActivity→ Bonus Totals Summary
Fix options:
- Fix the backfill jobs (preferred) - stop creating invalid historical records at the source
- Fix the queries - add a
$lookupto User collection and filter bycreated_at - Both - belt and suspenders approach
The backfill fix is preferred because it solves the problem at the source and prevents other TSUP-based queries from being affected in the future.
File: app/lib/jobs/backfills/backfill_total_monthly_allowance_history_for_user.rb
Bug: Creates synthetic HistoryTracker records for company budget changes that occurred before the user was created.
# Line 267 - BUG: No check for user.created_at!
budget_history.select { |r| r.created_at >= LOOKBACK_START_DATE }.each do |record|Example: Company budget changed in 2022. User created in October 2025. Backfill creates HistoryTracker records dated 2022 for this user.
Evidence (user [email protected]):
Written: 2025-11-07 | Claims: 2022-10-18 | reason: Synthetic total_monthly_allowance backfill
Written: 2025-11-07 | Claims: 2022-12-15 | reason: Synthetic total_monthly_allowance backfill
Written: 2025-11-07 | Claims: 2022-12-19 | reason: Synthetic total_monthly_allowance backfill
Written: 2025-11-07 | Claims: 2025-08-31 | reason: Synthetic total_monthly_allowance backfill
Fix:
# Filter to only events after user was created
budget_history.select { |r| r.created_at >= LOOKBACK_START_DATE && r.created_at >= user.created_at }.each do |record|File: app/lib/jobs/backfills/backfill_manager_history.rb
Bug: Uses end of previous month as timestamp without checking if user existed then.
# Line 56 - Gets end of previous month
end_of_last_month = Time.current.end_of_month.prev_month.end_of_day
# Line 81 - Uses it as created_at
created_at: end_of_last_month,Scenario: User created November 15, 2025. Job runs same day. Creates record dated October 31, 2025 (before user existed).
Fix:
# Use the later of: end of last month OR user's created_at
timestamp = [end_of_last_month, user.created_at].maxRan comparison query to identify user count differences:
- Bonus Totals: 673 users (from User collection with
created_atfilter) - Participation: 723 users (from TimeSeriesUserProperties)
- 54 users in Participation but not Bonus Totals
- All 54 were created AFTER September 2025
User [email protected] (ID: 68de5a0183d0e33fca76efc8):
User.created_at: 2025-10-02 (October)- TSUP records claimed timestamps from 2022 (before user existed)
Used BSON ObjectId generation_time to find when records were actually written:
TSUP _id creation time: 2025-11-08 (when actually written)
TSUP timestamp field: 2022-10-18 (what it claims)
Found synthetic HistoryTracker records with manipulated created_at:
HistoryTracker written: 2025-11-07 | created_at field: 2022-10-18
reason: "Synthetic total_monthly_allowance backfill"
The BackfillTotalMonthlyAllowanceHistoryForUser job creates these records without checking user.created_at.
BackfillTotalMonthlyAllowanceHistoryForUser
↓ Creates HistoryTracker with created_at = 2022-10-18
↓ (for user created 2025-10-02)
InitializeTimeSeriesUserProperties
↓ Reads HistoryTracker, uses created_at as timestamp
↓ Creates TSUP record with timestamp = 2022-10-18
GetGivingAndReceivingEligibilityData
↓ Finds TSUP record for September 2025 (uses "last known state" logic)
↓ Counts user in September even though they didn't exist
Participation Report shows inflated user count
| File | Role |
|---|---|
app/lib/jobs/backfills/backfill_total_monthly_allowance_history_for_user.rb |
Bug source - creates invalid historical records |
app/lib/jobs/backfills/backfill_manager_history.rb |
Secondary bug - similar issue |
app/lib/analytics/jobs/materialized_views/initialize_time_series_user_properties.rb |
Propagates invalid records to TSUP |
app/lib/analytics/queries/get_giving_and_receiving_eligibility_data.rb |
Uses TSUP for participation counts |
app/lib/analytics/queries/get_groups_points_activity.rb |
Correctly filters by user.created_at |
In build_event_list method, filter events by user creation date:
# User history events - add user.created_at filter
user_history.select { |r| r.created_at >= LOOKBACK_START_DATE && r.created_at >= user.created_at }.each do |record|
# Budget history events - add user.created_at filter
budget_history.select { |r| r.created_at >= LOOKBACK_START_DATE && r.created_at >= user.created_at }.each do |record|In create_synthetic_history method:
def create_synthetic_history(user)
end_of_last_month = Time.current.end_of_month.prev_month.end_of_day
# Don't create records dated before user existed
timestamp = [end_of_last_month, user.created_at].max
# ... rest of method using timestamp instead of end_of_last_month
endDelete invalid HistoryTracker and TSUP records where timestamp < user.created_at:
# Find and remove invalid synthetic records
# (Script to be written after fixes are deployed)| Query | Uses TSUP For | Checks User.created_at? | Status |
|---|---|---|---|
GetGivingAndReceivingEligibilityData |
User counts for participation | ❌ No | AFFECTED - inflates user counts |
CalculateRecognitionGivingRate |
Eligible giver counts | ❌ No | AFFECTED - inflates denominator in giving rate KVMs |
GetHistoricalCustomProperties |
User property lookups | ❌ No | |
GetUserPointsActivity |
Property lookups only | ✅ Yes (line 204) | SAFE - starts from User collection |
GetGroupsPointsActivity |
Property lookups only | ✅ Yes | SAFE - starts from User collection |
GetUsersLastRecognizedByManager |
Manager ID verification | N/A | SAFE - validation only, not counting |
| Materialized View | Uses TSUP For | Lookup Pattern | Impact |
|---|---|---|---|
DailyUserRecognitionStatistics |
Custom properties at recognition time | timestamp <= end_of_day, sort DESC, limit 1 |
✅ Safe |
MonthlyUserConnections |
Giver/receiver properties | timestamp <= bonus_created_at, sort DESC, limit 1 |
✅ Safe |
MonthlyUserStatistics |
Historical property lookups | Starts from User collection | ✅ Safe |
HumanGivenRecognitionWithUserProperties |
Giver/receiver properties | timestamp <= bonus_created_at, sort DESC, limit 1 |
✅ Safe |
MonthlyTeamParticipation |
Manager ID lookups | timestamp <= month_end, sort DESC, limit 1 |
✅ Safe |
MonthlyGroupCheckInStatistics |
Historical properties | timestamp <= month_end, sort DESC, limit 1 |
✅ Safe |
These views use the pattern:
{"$match": {"$expr": {"$lte": ["$timestamp", "$$target_date"]}}},
{"$sort": {timestamp: -1}},
{"$limit": 1}This gets the most recent TSUP record before the target date. For a user created Oct 2, 2025 with backdated TSUP:
- Looking up properties for a bonus on Oct 15, 2025
- Finds TSUP records: 2022-10-18, 2025-08-31, 2025-10-02
- Sorts DESC and takes limit 1 → 2025-10-02 (correct!)
The backdated records are only problematic when:
- Counting users based on TSUP existence (not property lookups)
- Looking up properties for dates before real activity (but users can't have activity before creation)
High Impact (user counting):
GetGivingAndReceivingEligibilityData→ Participation Report - AFFECTEDCalculateRecognitionGivingRate→ Recognition Giving Rate KVMs - AFFECTED
Not Impacted:
- All materialized views that do property lookups based on activity dates
- Queries that start from User collection and filter by
created_at
company_id = BSON::ObjectId("634ecb9e56cf240017ee64cc")
start_time = Time.zone.parse("2025-09-01").beginning_of_day
end_time = Time.zone.parse("2025-09-30").end_of_day
# Bonus Totals approach
bt_users = User.where(
company_id: company_id,
user_mode: {"$in" => User::HUMAN_MODES},
created_at: {"$lte" => end_time},
"$or" => [{deleted_at: false}, {deleted_at: {"$gt" => start_time}}]
).pluck(:id, :email, :user_mode)user_id = BSON::ObjectId('68de5a0183d0e33fca76efc8')
Analytics::MaterializedViews::TimeSeriesUserProperties
.where(user_id: user_id)
.order_by(timestamp: :asc)
.each do |tsup|
puts "Written: #{tsup.id.generation_time} | Claims: #{tsup.timestamp} | v#{tsup.version}"
end[7, 8, 9, 10].each do |v|
ht = HistoryTracker.where("association_chain.id" => user_id, version: v).first
puts "Version #{v}: written=#{ht.id.generation_time} | created_at=#{ht.created_at} | reason=#{ht.reason}"
end# Find users with manager backfill records dated before they existed
synthetic_manager_records = HistoryTracker.where(reason: "Synthetic manager history backfill")
.only(:id, :created_at, "association_chain.id")
affected_count = 0
synthetic_manager_records.each do |ht|
user_id = ht.association_chain.first["id"]
user = User.where(id: user_id).only(:id, :created_at).first
next unless user
affected_count += 1 if ht.created_at < user.created_at
end
puts "Total synthetic manager records: #{synthetic_manager_records.count}"
puts "Affected (HT dated before user existed): #{affected_count}"# Find users with TMA backfill records dated before they existed
synthetic_tma_records = HistoryTracker.where(reason: "Synthetic total_monthly_allowance backfill")
.only(:id, :created_at, "association_chain.id")
affected_count = 0
affected_users = []
synthetic_tma_records.each do |ht|
user_id = ht.association_chain.first["id"]
user = User.where(id: user_id).only(:id, :email, :created_at).first
next unless user
if ht.created_at < user.created_at
affected_count += 1
affected_users << {user_id: user.id, email: user.email, ht_date: ht.created_at, user_created: user.created_at}
end
end
puts "Total synthetic TMA records: #{synthetic_tma_records.count}"
puts "Affected (HT dated before user existed): #{affected_count}"
puts "\nSample affected users:"
affected_users.first(10).each { |u| puts " #{u[:email]}: HT=#{u[:ht_date]}, User=#{u[:user_created]}" }script/compare_user_counts.rb- Compares user counts between the two approaches (can be deleted after investigation)