@vals
Created November 13, 2025 06:39
analysis_date: 2025-11-12
h5ad_file: GSE166504.h5ad
species: Mouse (Mus musculus)
total_cells: 82192
design_type: Unbalanced factorial design with crossed factors
edviz_grammar: (CellFraction(2) × DietTimepoint(4)) > Sample(38) : CellType(13)
factors: cell_fraction, diet_timepoint, sample, cell_type
tool_version: 0.1.0

Experimental Design Card

Dataset Information

File: GSE166504.h5ad
Analysis Date: 2025-11-12
Species: Mouse (Mus musculus)
Total Cells: 82,192

Experimental Context

Experiment Type: Liver cell profiling under different diet conditions

Research Question: How do high-fat high-sugar (HFHS) diet and duration affect liver cell populations in hepatocytes vs non-parenchymal cells?

Factor Descriptions:

  • Cell Fraction: Two major liver cell fractions: Hepatocytes (parenchymal cells performing metabolic functions) vs NPC (non-parenchymal cells including immune, endothelial, and stellate cells)
  • Diet Timepoint: Diet condition and duration: 15weeks, 30weeks, and 34weeks on HFHS diet, plus Chow (control diet). Note: the Hepatocyte fraction lacks the 34weeks timepoint, which makes the design unbalanced
  • Sample: Individual samples representing technical replicates (Captures) from biological replicates (Animals) across cell fraction and diet/time combinations. Total of 38 samples: 16 Hepatocyte samples (across 3 timepoints) and 22 NPC samples (across 4 timepoints)
  • Cell Type: Cell populations identified by marker expression: immune cells (B, T, DCs, pDCs, NK, Neutrophils), liver-specific cells (Hepatocytes, Kupffer cells, Stellate cells, Hepatic progenitor cells), stromal cells (Endothelial cells, Myofibroblasts), and monocyte/macrophages

Design Structure

Identified Factors

Factor          Levels  Type
Cell Fraction   2       Treatment
Diet Timepoint  4       Treatment
Sample          38      Replicate
Cell Type       13      Observation

Design Classification

This dataset exhibits an unbalanced factorial design: Cell Fraction and Diet Timepoint are crossed treatment factors, and Samples are nested within their combinations. Cell Types are observed across all samples, creating a crossed relationship with the nested structure.

Design Diagram

┌──────────────────── Design Structure ────────────────────┐
│                                                          │
│ CellFraction(2)  ────×──── DietTimepoint(4)              │
│    ↓                                                     │
│                                                          │
│ Sample(38)                                               │
│    :                                                     │
│                                                          │
│ CellType(13)                                             │
│                                                          │
└──────────────────────────────────────────────────────────┘

Grammar Notation

(CellFraction(2) × DietTimepoint(4)) > Sample(38) : CellType(13)
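
One way to read this notation is that × marks crossed factors, > marks nesting, and : marks what is observed within each sample. The sketch below checks the first two properties on sample-level metadata. It is only an illustration: the `samples` rows are toy values (not taken from GSE166504), and `check_design` is a hypothetical helper, not part of any tool named in this card.

```python
from collections import defaultdict
from itertools import product

# Hypothetical per-sample metadata: (sample, cell_fraction, diet_timepoint).
# Toy values for illustration only, not taken from GSE166504.
samples = [
    ("S1", "Hepatocytes", "Chow"),
    ("S2", "Hepatocytes", "15weeks"),
    ("S3", "Hepatocytes", "30weeks"),
    ("S4", "NPC", "Chow"),
    ("S5", "NPC", "15weeks"),
    ("S6", "NPC", "30weeks"),
    ("S7", "NPC", "34weeks"),
    ("S8", "NPC", "34weeks"),
]


def check_design(rows):
    """Return (nested_ok, empty_combinations) for a list of sample rows."""
    combos_per_sample = defaultdict(set)
    samples_per_combo = defaultdict(set)
    for sample, fraction, timepoint in rows:
        combos_per_sample[sample].add((fraction, timepoint))
        samples_per_combo[(fraction, timepoint)].add(sample)

    # Nesting (>): every sample belongs to exactly one factor combination.
    nested_ok = all(len(c) == 1 for c in combos_per_sample.values())

    # Crossing (×): list combinations of observed levels that have no samples.
    fractions = sorted({f for _, f, _ in rows})
    timepoints = sorted({t for _, _, t in rows})
    empty = [c for c in product(fractions, timepoints)
             if c not in samples_per_combo]
    return nested_ok, empty


nested_ok, empty = check_design(samples)
print(nested_ok, empty)  # True [('Hepatocytes', '34weeks')]
```

An empty combination, as in the toy data, is what makes a crossed design unbalanced.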

Distribution Summary

Samples per Cell Fraction: 16 - 22 (mean: 19)
Samples per Diet Timepoint: 2 - 12 (mean: 9.5)
Cells per Cell Type: 99 - 25,848 (mean: 6,322)
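
Range-and-mean summaries like these could be derived from per-cell metadata by counting distinct samples per factor level. The following is a minimal sketch under that assumption; `level_summary` and the toy `pairs` are hypothetical and do not reproduce the numbers above.

```python
from collections import defaultdict


def level_summary(pairs):
    """Return (min, max, mean) of distinct-sample counts per factor level.

    `pairs` holds one (sample, level) tuple per cell, so a sample can repeat.
    """
    samples_by_level = defaultdict(set)
    for sample, level in pairs:
        samples_by_level[level].add(sample)
    counts = [len(s) for s in samples_by_level.values()]
    return min(counts), max(counts), sum(counts) / len(counts)


# Toy per-cell rows: (sample, cell_fraction). Illustrative only.
pairs = [
    ("S1", "Hepatocytes"), ("S1", "Hepatocytes"), ("S2", "Hepatocytes"),
    ("S3", "NPC"), ("S4", "NPC"), ("S5", "NPC"), ("S5", "NPC"),
]

low, high, mean_count = level_summary(pairs)
print(low, high, mean_count)  # 2 3 2.5
```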

Analysis Considerations

This design structure has implications for statistical analysis:

Random Effects Modeling: Because each sample is nested within a single cell_fraction (and diet_timepoint) level, sample-specific variation should be modeled as a random effect. When testing for cell_fraction effects on cell-level data, use a mixed-effects model with a random intercept for sample (e.g., ~ cell_fraction + (1|sample) in lme4 notation).
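
Written out, the formula ~ cell_fraction + (1|sample) corresponds to a random-intercept model. The form below assumes a Gaussian response; the symbols are introduced here for illustration and are not part of the original card. Index i runs over cells within sample j, x_j is an indicator for the cell fraction of sample j, and u_j is the sample-specific intercept.

```latex
\[
y_{ij} = \beta_0 + \beta_1 x_j + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
```

Since x_j varies only between samples, the information about \beta_1 comes from the number of samples rather than the number of cells.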

Aggregation Strategy: For differential expression testing, pseudobulking to the sample level preserves the experimental unit structure. Aggregate cells to sample-by-cell_type pseudobulk profiles before applying standard DE methods, treating samples as biological replicates.
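
The aggregation step can be sketched as summing raw counts over all cells that share a sample and cell type. This is a minimal illustration in plain Python, assuming each cell carries a short vector of gene counts; `pseudobulk` and the toy `cells` are hypothetical, and a real analysis would operate on the count matrix stored in the h5ad file.

```python
def pseudobulk(cells):
    """Sum per-cell count vectors within each (sample, cell_type) group."""
    totals = {}
    for sample, cell_type, counts in cells:
        key = (sample, cell_type)
        if key not in totals:
            totals[key] = list(counts)  # copy so the input is not mutated
        else:
            totals[key] = [a + b for a, b in zip(totals[key], counts)]
    return totals


# Toy cells: (sample, cell_type, counts for three genes). Illustrative only.
cells = [
    ("S1", "Kupffer cells", [1, 0, 3]),
    ("S1", "Kupffer cells", [2, 1, 0]),
    ("S1", "Endothelial cells", [0, 4, 1]),
    ("S2", "Kupffer cells", [5, 0, 0]),
]

profiles = pseudobulk(cells)
for key, summed in sorted(profiles.items()):
    print(key, summed)
```

Each resulting profile is one observation per sample and cell type, so downstream DE methods see samples, not cells, as the replicates.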

Contrast Specification: When comparing cell_fractions, ensure contrasts are computed at the sample level, not the cell level, to avoid pseudoreplication and inflated Type I error rates.
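
To illustrate the difference between a sample-level and a cell-level contrast, the sketch below first averages a per-cell value within each sample and then compares group means across samples. The function name, the toy values, and the use of a simple difference of means are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def sample_level_contrast(cells, group_a, group_b):
    """Return (mean difference, n_a, n_b), with n counted in samples."""
    values_by_sample = defaultdict(list)
    fraction_of = {}
    for sample, fraction, value in cells:
        values_by_sample[sample].append(value)
        fraction_of[sample] = fraction

    # Collapse cells to one value per sample before comparing groups.
    sample_means = {s: mean(v) for s, v in values_by_sample.items()}
    a = [m for s, m in sample_means.items() if fraction_of[s] == group_a]
    b = [m for s, m in sample_means.items() if fraction_of[s] == group_b]
    return mean(a) - mean(b), len(a), len(b)


# Toy per-cell rows: (sample, cell_fraction, expression). Illustrative only.
cells = [
    ("S1", "Hepatocytes", 1.0), ("S1", "Hepatocytes", 3.0),
    ("S2", "Hepatocytes", 4.0),
    ("S3", "NPC", 5.0), ("S3", "NPC", 7.0), ("S3", "NPC", 9.0),
    ("S4", "NPC", 5.0),
]

diff, n_npc, n_hep = sample_level_contrast(cells, "NPC", "Hepatocytes")
print(diff, n_npc, n_hep)  # 3.0 2 2
```

Here the effective sample size is 2 per group, not the 4 and 3 cells a cell-level comparison would count, and samples contributing many cells no longer dominate the estimate.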

Design Notes

The design is unbalanced: Hepatocyte samples lack the 34weeks timepoint (only present in NPC samples)

Samples include both biological replicates (Animals) and technical replicates (Captures), with varying numbers of captures per animal
