Outputs
This document provides a comprehensive guide to all outputs generated by the Main pipeline. Understanding these outputs is essential for downstream analysis and quality assessment.
Output Directory Structure
The Main pipeline generates outputs in a structured format:
experiment_output_directory/
├── cell_profiling/ # Single-cell analysis results
│ ├── patch-0-cell_by_marker.csv # Per-cell marker expression
│ ├── patch-0-cell_metadata.csv # Cell morphology and statistics
│ ├── patch-1-cell_by_marker.csv
│ └── patch-1-cell_metadata.csv
├── visualization/ # Optional visualization outputs
│ ├── whole_sample_rgb.png # Full image overview (generated during preprocess)
│ ├── patches_rgb/ # Individual patch visualizations
│ └── segmentation_overlays/ # Segmentation mask overlays
├── channel_stats.csv # Channel-level statistics
├── extracted_channel_patches.npy.zst # Processed image patches
├── original_seg_res_batch.pickle # Raw segmentation results
├── matched_seg_res_batch.pickle # Processed segmentation results
├── patches_metadata.csv # Patch-level metadata and QC
├── copied_config.yaml # Configuration record
├── processing_log.txt # Execution log and timing
└── seg_evaluation_metrics.pkl.gz # Segmentation evaluation results
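A quick completeness check can catch a truncated run before downstream analysis starts. The sketch below assumes the file names shown in the tree above. `missing_outputs` and `EXPECTED_FILES` are hypothetical names, not part of the pipeline, and the optional visualization/ directory is deliberately not checked.

```python
from pathlib import Path

# File names copied from the directory tree above; adjust if your run differs.
EXPECTED_FILES = (
    "channel_stats.csv",
    "extracted_channel_patches.npy.zst",
    "original_seg_res_batch.pickle",
    "matched_seg_res_batch.pickle",
    "patches_metadata.csv",
    "copied_config.yaml",
    "processing_log.txt",
    "seg_evaluation_metrics.pkl.gz",
)


def missing_outputs(output_dir):
    """Return the expected top-level outputs that are absent from output_dir."""
    root = Path(output_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # The per-patch CSVs live in cell_profiling/, so only the directory is checked.
    if not (root / "cell_profiling").is_dir():
        missing.append("cell_profiling/")
    return missing
```

An empty list means every core file in the tree is present; it says nothing about whether the files are complete or valid.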
Core Output Files
Single-Cell Profiling Data
The cell_profiling/ directory contains the primary analytical outputs for single-cell analysis.
patch-{i}-cell_by_marker.csv
Purpose: Per-cell marker expression quantification for downstream analysis
Structure: Each row represents a single cell, with columns for cell identifier and marker intensities.
| Column | Type | Description | Example Values |
|---|---|---|---|
| cell_id | integer | Unique cell identifier within patch | 1, 2, 3, ... |
| DAPI | float | Mean nuclear marker intensity | 145.23, 289.67 |
| Pan-Cytokeratin | float | Mean membrane marker intensity | 78.45, 156.89 |
| CD3 | float | Mean T-cell marker intensity | 12.34, 234.56 |
| CD8 | float | Mean cytotoxic T-cell marker intensity | 5.67, 189.23 |
| ... | float | Additional marker intensities | ... |
File Size: Typically 50KB - 5MB per patch, depending on cell count and number of markers
Usage: Primary input for clustering, phenotyping, and statistical analysis
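Because cell_id is unique only within a patch, combining patches requires keeping the patch index alongside each row. The following is a minimal sketch using only the standard library. `load_cell_by_marker` is a hypothetical helper, and it assumes that cell_id is written as a plain integer and that every other column holds a numeric marker intensity.

```python
import csv
import re
from pathlib import Path

# Matches the documented naming scheme patch-{i}-cell_by_marker.csv.
_PATCH_FILE = re.compile(r"patch-(\d+)-cell_by_marker\.csv$")


def load_cell_by_marker(profiling_dir):
    """Read every patch's marker table into one list of row dicts.

    Each row carries the patch index parsed from the file name, so the
    pair (patch_id, cell_id) identifies a cell across the whole sample.
    """
    files = []
    for path in Path(profiling_dir).glob("patch-*-cell_by_marker.csv"):
        match = _PATCH_FILE.search(path.name)
        if match is not None:
            files.append((int(match.group(1)), path))

    rows = []
    # Sort numerically so patch-2 comes before patch-10.
    for patch_id, path in sorted(files):
        with path.open(newline="") as handle:
            for record in csv.DictReader(handle):
                row = {"patch_id": patch_id, "cell_id": int(record.pop("cell_id"))}
                # Every remaining column is treated as a marker intensity.
                row.update({marker: float(value) for marker, value in record.items()})
                rows.append(row)
    return rows
```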
patch-{i}-cell_metadata.csv
Purpose: Comprehensive cell morphology and quality metrics for each cell
Structure: Each row represents a single cell with extensive morphological and statistical features.
| Column Category | Columns | Type | Description |
|---|---|---|---|
| Identity | cell_id | integer | Unique cell identifier matching marker data |
| Basic Morphology | area | float | Cell area in pixels |
| | centroid_x, centroid_y | float | Cell center coordinates |
| | perimeter | float | Cell boundary length in pixels |
| | convex_area | float | Area of convex hull around cell |
| | axis_major_length, axis_minor_length | float | Major/minor axis lengths |
| | eccentricity | float | Shape eccentricity (0=circle, 1=line) |
| Intensity Quality | {marker}_cov | float | Coefficient of variation per marker |
| | {marker}_laplacian_var | float | Local intensity variation measure |
| Overall Quality | cell_entropy | float | Shannon entropy of marker distribution |
File Size: Typically 100KB - 10MB per patch
Usage: Quality filtering, morphological analysis, spatial analysis
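Since cell_id in the metadata file matches the marker file for the same patch, the two tables can be joined per patch. The sketch below shows one way to do that with the standard library. `read_by_cell_id` and `join_patch_tables` are hypothetical helpers, and values are left as the strings read from the CSV rather than converted.

```python
import csv


def read_by_cell_id(csv_path):
    """Index a per-cell CSV by its integer cell_id column."""
    with open(csv_path, newline="") as handle:
        return {int(row["cell_id"]): row for row in csv.DictReader(handle)}


def join_patch_tables(marker_csv, metadata_csv):
    """Inner-join one patch's marker and metadata tables on cell_id."""
    markers = read_by_cell_id(marker_csv)
    metadata = read_by_cell_id(metadata_csv)
    joined = {}
    # Keep only cells present in both files.
    for cell_id in sorted(markers.keys() & metadata.keys()):
        merged = dict(metadata[cell_id])
        merged.update(markers[cell_id])
        joined[cell_id] = merged
    return joined
```

Both arguments should point at files for the same patch index; joining across patches would pair unrelated cells that happen to share a cell_id.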
Image and Processing Data
channel_stats.csv
Purpose: Summary statistics for intensity values across all image channels
| Column | Type | Description | Example |
|---|---|---|---|
| Channel | string | Channel/marker name | "DAPI", "CD3" |
| Min | float | Minimum intensity value | 0.0 |
| Median | float | Median intensity value | 45.67 |
| Max | float | Maximum intensity value | 4095.0 |
| 95% | float | 95th percentile intensity | 234.56 |
| Mean | float | Mean intensity value | 78.45 |
| Std Dev | float | Standard deviation | 123.45 |
File Size: Small (< 10KB)
Usage: Quality control, intensity normalization, channel selection
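As an illustration of the intensity normalization use case, the sketch below derives a per-channel scale factor from the 95th percentile column. This is one possible normalization choice, not necessarily what the pipeline does. `percentile_scale_factors` is a hypothetical helper, and it assumes the CSV headers are spelled exactly as in the table above ("Channel", "95%").

```python
import csv


def percentile_scale_factors(channel_stats_csv):
    """Map each channel to 1 / (95th percentile), skipping flat channels."""
    factors = {}
    with open(channel_stats_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            p95 = float(row["95%"])
            # A non-positive percentile means the channel carries no usable signal.
            if p95 > 0:
                factors[row["Channel"]] = 1.0 / p95
    return factors
```

Multiplying a channel's intensities by its factor maps the 95th percentile to 1.0; channels missing from the result are candidates for exclusion.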
extracted_channel_patches.npy.zst
Purpose: Processed image patches ready for segmentation
Format: NumPy array with dimensions (num_patches, patch_height, patch_width, num_channels), stored with Zstandard compression as the .npy.zst extension indicates
Channels: Typically [nucleus, wholecell] channels
Data Type: float32 or uint16
File Size: Large (100MB - 10GB depending on image size and patch count)
Usage: Input for segmentation algorithms, visualization
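The array dimensions and data type determine how much memory the patches need once loaded. The sketch below computes that uncompressed size; the compressed file on disk will usually be smaller. `uncompressed_patch_bytes` is a hypothetical helper, and the element sizes are the standard ones for float32 (4 bytes) and uint16 (2 bytes).

```python
# Bytes per element for the two data types listed above.
BYTES_PER_ELEMENT = {"float32": 4, "uint16": 2}


def uncompressed_patch_bytes(num_patches, patch_height, patch_width,
                             num_channels, dtype="float32"):
    """Size of the raw array data in bytes, ignoring the small .npy header."""
    elements = num_patches * patch_height * patch_width * num_channels
    return elements * BYTES_PER_ELEMENT[dtype]
```

For example, 4 patches of 1000 x 1000 pixels with 2 channels in float32 need 4 * 1000 * 1000 * 2 * 4 = 32,000,000 bytes, or about 32 MB, in memory.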
patches_metadata.csv
Purpose: Quality control and metadata for each image patch
| Column | Type | Description | Typical Range |
|---|---|---|---|
| patch_id | integer | Unique patch identifier | 0, 1, 2, ... |
| height, width | integer | Patch dimensions in pixels | 1000, 2000 |
| nucleus_mean | float | Mean nuclear channel intensity | 10-500 |
| nucleus_std | float | Nuclear channel standard deviation | 5-200 |
| nucleus_non_zero_perc | float | Fraction of non-zero nuclear pixels | 0.0-1.0 |
| wholecell_mean | float | Mean wholecell channel intensity | 5-300 |
| wholecell_std | float | Wholecell channel standard deviation | 3-150 |
| wholecell_non_zero_perc | float | Fraction of non-zero wholecell pixels | 0.0-1.0 |
| is_empty | boolean | Patch marked as empty tissue | true/false |
| is_noisy | boolean | Patch marked as too noisy | true/false |
| is_bad_patch | boolean | Overall patch quality flag | true/false |
| is_informative | boolean | Patch suitable for analysis | true/false |
File Size: Small to medium (10KB - 1MB)
Usage: Patch filtering, quality assessment, debugging
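The quality flags can drive patch filtering before any per-cell analysis. The sketch below selects patches to keep. `parse_flag` and `informative_patch_ids` are hypothetical helpers. Requiring is_informative to be set and is_bad_patch to be unset is a conservative assumption, since how the flags relate to each other is not stated here, and the boolean parsing accepts several spellings because the exact text written to the CSV is also an assumption.

```python
import csv

# Spellings treated as true; anything else is treated as false.
_TRUE_STRINGS = {"true", "1", "yes"}


def parse_flag(value):
    """Interpret a CSV boolean cell such as 'true', 'True', or '1'."""
    return str(value).strip().lower() in _TRUE_STRINGS


def informative_patch_ids(patches_metadata_csv):
    """Return patch_id values flagged informative and not flagged bad."""
    keep = []
    with open(patches_metadata_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            if parse_flag(row["is_informative"]) and not parse_flag(row["is_bad_patch"]):
                keep.append(int(row["patch_id"]))
    return keep
```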
Segmentation Results
original_seg_res_batch.pickle
Purpose: Raw segmentation masks before post-processing
Format: Pickled list of dictionaries, one per patch
Contents per patch:
- cell: Cell segmentation mask (uint16, shape (height, width))
- nucleus: Nucleus segmentation mask (uint16, shape (height, width))
File Size: Medium to large (10MB - 1GB)
Usage: Debugging, alternative post-processing, method comparison
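Loading the batch and checking it against the documented layout can be sketched as follows. `load_seg_batch` is a hypothetical helper. It assumes the dictionary keys are spelled "cell" and "nucleus" as listed above, and that the masks are NumPy arrays, in which case NumPy must be installed for unpickling to succeed.

```python
import pickle


def load_seg_batch(pickle_path, required_keys=("cell", "nucleus")):
    """Unpickle a segmentation batch and check the documented layout.

    Only unpickle files you trust: pickle can execute arbitrary code on load.
    """
    with open(pickle_path, "rb") as handle:
        batch = pickle.load(handle)
    if not isinstance(batch, list):
        raise TypeError(f"expected a list of per-patch dicts, got {type(batch).__name__}")
    for index, patch in enumerate(batch):
        missing = [key for key in required_keys if key not in patch]
        if missing:
            raise KeyError(f"patch {index} is missing keys: {missing}")
    return batch
```

The same helper can validate matched_seg_res_batch.pickle by passing that file's key names as required_keys.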
matched_seg_res_batch.pickle
Purpose: Processed segmentation masks with cell-nucleus matching
Format: Pickled list of dictionaries, one per patch
Contents per patch:
- cell_matched_mask: Cells successfully matched to nuclei (uint16)
- nucleus_matched_mask: Nuclei successfully matched to cells (uint16)
- cell_outside_nucleus_mask: Cell regions excluding nucleus (uint16)
- matched_fraction: Fraction of cells successfully matched (float, 0-1)
File Size: Medium to large (10MB - 1GB)
Usage: Final analysis, quality assessment, downstream processing
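The matched_fraction value gives a quick per-patch quality signal. The sketch below flags patches where few cells were matched to nuclei. `low_match_patches` is a hypothetical helper, the 0.5 threshold is an arbitrary illustration rather than a recommended cutoff, and the sketch assumes the list position corresponds to the patch index.

```python
def low_match_patches(matched_batch, threshold=0.5):
    """Return (list_index, matched_fraction) pairs below the threshold."""
    flagged = []
    for index, patch in enumerate(matched_batch):
        fraction = float(patch["matched_fraction"])
        if fraction < threshold:
            flagged.append((index, fraction))
    return flagged
```

The argument is the already-unpickled list of per-patch dictionaries from matched_seg_res_batch.pickle.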
Configuration and Logs
copied_config.yaml
Purpose: Complete configuration record for reproducibility
Content: Exact copy of configuration used for processing
File Size: Small (< 10KB)
Usage: Reproducibility, debugging, parameter tracking
processing_log.txt
Purpose: Detailed execution log with timing information
Content: Timestamped processing steps, warnings, performance metrics
File Size: Small to medium (10KB - 10MB)
Usage: Debugging, performance analysis, quality control
Optional Visualization Outputs
When visualization features are enabled, additional outputs are generated:
visualization/whole_sample_rgb.png
- Status: Generated by the preprocess overview module; the main pipeline skips this step.
- Usage: RGB overview of the entire sample when preprocess visualization is enabled.
visualization/patches_rgb/patch-{i}.png
- Purpose: Individual patch visualizations
- Content: RGB composite of each patch
- File Size: 1-10MB per patch
- Configuration: visualization.visualize_patches: true
visualization/segmentation_overlays/patch-{i}_overlay.png
- Purpose: Segmentation quality assessment
- Content: Original image with segmentation masks overlaid
- File Size: 5-50MB per patch
- Configuration: visualization.visualize_segmentation: true
File Size Estimates (Placeholder numbers)
| Output Type | Small Image (<10K x 10K) | Medium Image (10K-30K per side) | Large Image (>30K x 30K) |
|---|---|---|---|
| Cell Profiling | 1-10MB | 10-100MB | 100MB-1GB |
| Image Patches | 100MB-1GB | 1-5GB | 5-20GB |
| Segmentation | 10-100MB | 100MB-1GB | 1-5GB |
| Visualization | 10-100MB | 100MB-1GB | 1-10GB |
| Total | 200MB-2GB | 2-10GB | 10-50GB |
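Because the figures above are placeholders, measuring a real run is more reliable than relying on the table. The sketch below sums file sizes under each top-level entry of an output directory. `output_sizes_by_entry` is a hypothetical helper built only on the standard library.

```python
from collections import defaultdict
from pathlib import Path


def output_sizes_by_entry(output_dir):
    """Sum file sizes in bytes under each top-level entry of output_dir."""
    root = Path(output_dir)
    totals = defaultdict(int)
    for path in root.rglob("*"):
        if path.is_file():
            # Group by the first path component, e.g. "cell_profiling".
            top_level = path.relative_to(root).parts[0]
            totals[top_level] += path.stat().st_size
    return dict(totals)
```

Directories such as cell_profiling/ and visualization/ are reported as a single total each, while top-level files appear under their own names.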