11 changes: 11 additions & 0 deletions workflows/comparative_genomics/hyphy/CHANGELOG.md
@@ -1,5 +1,16 @@
# Changelog

## [0.2] - 2026-07-02

### Changed
- Replaced single `reference cds` input with `reference GTF` + `reference Fasta` inputs across all HyPhy workflows
- Enables automated CDS extraction from annotated reference genomes
- Aligns workflow parameters with BRC Analytics `ASSEMBLY_FASTA_URL` and `GENE_MODEL_URL` variables
- Updated CAPHEINE, HyPhy Core, HyPhy Compare, and HyPhy Preprocessing to version 0.2
- Replaced `denv1_ref_cds.fasta` with `denv1_genome.fasta` (NC_001477.1 full genome) as the reference FASTA test input
- Added `denv1_ref.gtf` with coordinates for two DENV1 CDS regions (capsid protein C and prM)
- Updated test parameter files to use genome FASTA + GTF inputs

## [0.1] - 2026-02-26

### Added
11 changes: 7 additions & 4 deletions workflows/comparative_genomics/hyphy/README.md
@@ -11,7 +11,8 @@ This directory contains Galaxy workflows for running HyPhy (Hypothesis Testing u
The main workflow that orchestrates the complete HyPhy pipeline, including codon-aware preprocessing and optional branch-comparison analyses. Inspired by the [veg/capheine](https://github.com/veg/capheine) Nextflow implementation, version 1.1.0.

**Inputs:**
- **Reference CDS FASTA** (required): Multi-gene CDS reference file (e.g., from NCBI)
- **Reference GTF** (required): GTF annotation for the reference genome
- **Reference Fasta** (required): Genome FASTA for the reference assembly
- **Unaligned sequences** (required): List collection of FASTA files, one per sample
- **Foreground regexp** (optional): Regular expression to match foreground sequence names for branch labeling
- **Foreground list** (optional): Dataset with cleaned sequence identifiers for foreground branches
@@ -52,14 +53,16 @@ Subworkflow for sequence cleanup and codon-aware alignment.
## Test Data

The `test-data/` directory contains:
- `denv1_ref_cds.fasta`: Reference coding sequences from Dengue virus 1
- `foreground_seqs_list.tabular`: Example foreground sequence identifiers
- `denv1_genome.fasta`: Reference genome FASTA for Dengue virus 1 (NC_001477.1)
- `denv1_ref.gtf`: GTF annotation for two DENV1 CDS regions (capsid protein C and prM; coords 95–394 and 437–934)
- `denv1_ref_cds.fasta`: Pre-extracted CDS sequences (retained for reference; not used as a workflow input in v0.2+)
- `foreground_seqs_list.txt`: Example foreground sequence identifiers
- `unaligned_seqs/`: Directory with 39 unaligned FASTA files for testing
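
As a rough illustration of what extracting CDS from a genome FASTA plus a GTF involves, here is a minimal Python sketch. It is not the workflow's actual extraction step, which lives in the Galaxy tools and is not shown in this diff. The helper names `read_fasta` and `extract_cds`, the `gene_id` attribute lookup, and the toy sequence are all assumptions made for illustration. GTF coordinates are 1-based and inclusive, so the two test intervals (95–394 and 437–934) span 300 and 498 nucleotides, both multiples of three.

```python
import re

# Translation table used to complement a nucleotide string.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def read_fasta(text):
    """Parse FASTA text into {sequence_id: sequence}."""
    chunks, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the sequence ID.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def extract_cds(gtf_text, genome):
    """Return {gene_id: sequence} for each CDS row in the GTF text.

    Simplification (assumption): one CDS row per gene, so multi-exon
    genes are not concatenated here.
    """
    result = {}
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        seqname, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        match = re.search(r'gene_id "([^"]+)"', cols[8])
        gene = match.group(1) if match else f"{seqname}:{start}-{end}"
        # GTF is 1-based inclusive; Python slices are 0-based half-open.
        seq = genome[seqname][start - 1:end]
        if strand == "-":
            seq = seq.translate(COMPLEMENT)[::-1]  # reverse complement
        result[gene] = seq
    return result


# Toy data only; this is not the DENV1 genome or the real denv1_ref.gtf.
toy_fasta = ">toy\nAAACCCGGGTTT\n"
toy_gtf = 'toy\tsketch\tCDS\t4\t9\t.\t+\t0\tgene_id "geneA";\n'
print(extract_cds(toy_gtf, read_fasta(toy_fasta)))  # {'geneA': 'CCCGGG'}
```

Keying the result on `gene_id` is one plausible way to arrive at short element names such as `capsid_protein_C`, but how the workflow actually names its collection elements is not visible in this diff.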

## Running Tests

Tests are defined in `capheine-core-and-compare-tests.yml` with four scenarios:
1. Core only (reference CDS + unaligned sequences)
1. Core only (reference GTF + reference Fasta + unaligned sequences)
2. Core + Compare with regex (no foreground list)
3. Core + Compare with foreground list (no regex)
4. Core + Compare with all inputs (regex takes precedence)
@@ -1,8 +1,12 @@
- doc: Test CAPHEINE with reference CDS and unaligned sequences only (Core workflow only)
job:
reference cds:
reference GTF:
class: File
path: test-data/denv1_ref_cds.fasta
path: test-data/denv1_ref.gtf
filetype: gtf
reference Fasta:
class: File
path: test-data/denv1_genome.fasta
filetype: fasta
unaligned sequences:
class: Collection
@@ -46,9 +50,13 @@

- doc: Test CAPHEINE with reference CDS, unaligned sequences, and regex (no foreground list)
job:
reference cds:
reference GTF:
class: File
path: test-data/denv1_ref.gtf
filetype: gtf
reference Fasta:
class: File
path: test-data/denv1_ref_cds.fasta
path: test-data/denv1_genome.fasta
filetype: fasta
unaligned sequences:
class: Collection
@@ -105,9 +113,13 @@

- doc: Test CAPHEINE with reference CDS, unaligned sequences, and foreground list (no regex)
job:
reference cds:
reference GTF:
class: File
path: test-data/denv1_ref_cds.fasta
path: test-data/denv1_ref.gtf
filetype: gtf
reference Fasta:
class: File
path: test-data/denv1_genome.fasta
filetype: fasta
unaligned sequences:
class: Collection
@@ -168,9 +180,13 @@

- doc: Test CAPHEINE with all inputs (reference CDS, unaligned sequences, regex, and foreground list)
job:
reference cds:
reference GTF:
class: File
path: test-data/denv1_ref.gtf
filetype: gtf
reference Fasta:
class: File
path: test-data/denv1_ref_cds.fasta
path: test-data/denv1_genome.fasta
filetype: fasta
unaligned sequences:
class: Collection
732 changes: 435 additions & 297 deletions workflows/comparative_genomics/hyphy/capheine-core-and-compare.ga

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion workflows/comparative_genomics/hyphy/hyphy-compare.ga
@@ -381,5 +381,5 @@
"tags": [],
"uuid": "9357601d-c7cf-4341-bf87-a5b0fca7e57b",
"version": 1,
"release": "0.1"
"release": "0.2"
}
24 changes: 14 additions & 10 deletions workflows/comparative_genomics/hyphy/hyphy-core-tests.yml
@@ -1,8 +1,12 @@
- doc: Test HyPhy Core produces HyPhy JSON collections
job:
reference cds:
reference GTF:
class: File
path: test-data/denv1_ref_cds.fasta
path: test-data/denv1_ref.gtf
filetype: gtf
reference Fasta:
class: File
path: test-data/denv1_genome.fasta
filetype: fasta
unaligned sequences:
class: Collection
@@ -31,41 +35,41 @@
outputs:
meme_output:
element_tests:
"NC_001477.1|capsid_protein_C|95-394_DENV1":
"capsid_protein_C":
asserts:
has_text:
text: "{"
"NC_001477.1|membrane_glycoprotein":
"membrane_glycoprotein_precursor_prM":
asserts:
has_text:
text: "{"
prime_output:
element_tests:
"NC_001477.1|capsid_protein_C|95-394_DENV1":
"capsid_protein_C":
asserts:
has_text:
text: "{"
"NC_001477.1|membrane_glycoprotein":
"membrane_glycoprotein_precursor_prM":
asserts:
has_text:
text: "{"
busted_output:
element_tests:
"NC_001477.1|capsid_protein_C|95-394_DENV1":
"capsid_protein_C":
asserts:
has_text:
text: "{"
"NC_001477.1|membrane_glycoprotein":
"membrane_glycoprotein_precursor_prM":
asserts:
has_text:
text: "{"
fel_output:
element_tests:
"NC_001477.1|capsid_protein_C|95-394_DENV1":
"capsid_protein_C":
asserts:
has_text:
text: "{"
"NC_001477.1|membrane_glycoprotein":
"membrane_glycoprotein_precursor_prM":
asserts:
has_text:
text: "{"