.. _Examples overview:

=================
Examples Overview
=================

This page summarizes all example datasets in MPRAsnakeflow and links to the full step-by-step instructions for each workflow.

Core Workflow Examples
======================

* :doc:`assignment_example1`
  Basic assignment-only workflow on 5'/5' WT MPRA data in HepG2 from `Klein et al. (2019) `_.
  Use this example to learn barcode-to-oligo assignment from raw reads.

* :doc:`count_example1`
  Basic experiment/count workflow on 5'/5' WT MPRA data in HepG2 from `Klein et al. (2019) `_, using a precomputed assignment file.

* :doc:`combined_example1`
  Combined assignment + experiment workflow on the same HepG2 WT MPRA dataset from `Klein et al. (2019) `_, useful as an end-to-end minimal example.

Published Dataset Examples
==========================

* :doc:`plasmid_example`
  ENCODE plasmid-based MPRA in A549 from the Tewhey lab, published in `Gosai et al. (Nature 2024) `_.
  Demonstrates assignment preprocessing when barcodes are attached to forward reads, and experiment setup with shared DNA input across replicates.

* :doc:`complex_readstructure_example`
  Complex-read-structure MPRA from `Abell et al. (Science 2022) `_, GEO `GSE174534 `_.
  Focuses on handling non-trivial read layouts, trimming/adapters, strand-sensitive assignment, and reverse-complement design handling.

* :doc:`GSE306816_example`
  Deep perturbation STR MPRA from `Zhang et al. (bioRxiv 2025) `_, GEO `GSE306816 `_.
  Shows assignment and counting for repeat-rich constructs and comparison against published assignment files.

* :doc:`GSE293036_example`
  Multiple sclerosis variant MPRA from `Granitto et al. (G3 2025) `_, GEO `GSE293036 `_.
  Demonstrates conversion from supplementary design tables, assignment generation, and condition-specific experiment counting.

* :doc:`GSE316891_example`
  L1a1 MPRA dataset from Yan et al., GEO `GSE316891 `_.
  A peer-reviewed paper or preprint link is not currently available in the example source data.
  Includes assignment generation, experiment counting, and direct comparison between workflow-generated and GEO-provided assignment files.

* :doc:`GSE284330_example`
  Processed HepG2 sub1 MPRA data from `Zaratiana et al. `_, GEO `GSE284330 `_.
  This is a STARR-seq-like assay and is therefore not an optimal fit for the workflow.
  However, it uses designed oligonucleotides, and we demonstrate experiment-only processing using the oligonucleotides as barcodes and shared DNA input across RNA replicates.

* :doc:`GSE271608_example`
  Synthetic promoter MPRA from `Zahm et al. (Nat Commun. 2024) `_, GEO `GSE271608 `_.
  Demonstrates experiment-only processing using an externally supplied barcode dictionary when the raw assignment-building reads are not available.

* :doc:`GSE307247_example`
  Splicing-focused reporter assay from `Koplik et al. (bioRxiv 2025) `_, GEO `GSE307247 `_.
  Shows how to use MPRAsnakeflow for count quantification in a non-standard MPRA setting with supplied barcode references and UMI-aware counting.

* :doc:`GSE325670_example`
  Variation/saturation mutagenesis MPRA from `Hauser et al. (2026) `_, GEO umbrella `GSE325670 `_ (example uses GSE325256).
  Highlights challenging library assignment settings, including bbmap/bwa-additional-filtering, strand-sensitive assignment, and downstream comparison in experiment counting.