Showing content from https://github.com/ncbench/ncbench-workflow below:

GitHub - ncbench/ncbench-workflow

NCBench continuous small variants benchmarking workflow.

A Snakemake workflow for benchmarking callsets of small genomic variants, using popular benchmark datasets like Genome in a Bottle or CHM-eval. A detailed description of the workflow, including the underlying insights and design decisions, can be found at https://doi.org/10.12688/f1000research.140344.1.

Download raw data:

Germline:
Somatic:

Run your pipeline on it.
Upload the results to Zenodo as a .vcf.gz or .bcf file. To compress a plain VCF file, you can run bgzip <your vcf file>.vcf.

Create a pull request that adds your results to the config file, under variant-calls. The entry should follow this structure:

my-callset: # choose a descriptive name for your callset
 labels:
   site: # name of your institute, group, department etc.
   pipeline: # name of the pipeline
   trimming: # tool used to trim reads
   read-mapping: # used read mapper
   base-quality-recalibration: # base recalibration method (remove if unused)
   realignment: # realignment method (remove if unused)
   variant-detection: # variant callers (provide comma-separated list if multiple ones are used)
   genotyping: # genotyper/event-typer used
   url: # URL of used pipeline
   # add any additional relevant attributes (they will appear in the false positive and false negative tables of the online report)
 subcategory: # category of callsets to include this one (see other entries in the config file and align with them if possible)
 zenodo:
   deposition: # zenodo record id (e.g. 7734975)
   filename: # name of bcf/vcf.gz file in the zenodo record
 benchmark: # benchmark to use (one of giab-NA12878-agilent-200M, giab-NA12878-agilent-75M, giab-NA12878-twist, and more, see https://github.com/snakemake-workflows/dna-seq-benchmark/blob/main/workflow/resources/presets.yaml)
 rename-contigs: resources/rename-contigs/ucsc-to-ensembl.txt # rename contigs from UCSC (prefixed with chr) to Ensembl style (remove if your contigs are already in Ensembl style)
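
Before opening the pull request, it can help to check a new entry against the structure above. The following is a minimal sketch, not part of the ncbench workflow: the function name `check_callset_entry` is hypothetical, and treating `labels`, `subcategory`, `zenodo` and `benchmark` as mandatory is an assumption drawn from the template rather than from the workflow's own validation. The entry is passed in as a plain Python dict, for example as obtained from any YAML parser.

```python
# Hypothetical pre-submission check; not part of the ncbench workflow.
# Assumption: these keys are mandatory for a Zenodo-backed entry.
REQUIRED_TOP_LEVEL = ("labels", "subcategory", "zenodo", "benchmark")
REQUIRED_ZENODO = ("deposition", "filename")
ALLOWED_SUFFIXES = (".vcf.gz", ".bcf")


def check_callset_entry(entry):
    """Return a list of human-readable problems found in one callset entry."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if not entry.get(key):
            problems.append(f"missing or empty key: {key}")

    zenodo = entry.get("zenodo")
    if not isinstance(zenodo, dict):
        zenodo = {}
    for key in REQUIRED_ZENODO:
        if not zenodo.get(key):
            problems.append(f"missing or empty key: zenodo.{key}")

    # The README asks for vcf.gz or .bcf files in the Zenodo record.
    filename = str(zenodo.get("filename") or "")
    if filename and not filename.endswith(ALLOWED_SUFFIXES):
        problems.append(
            f"zenodo.filename should end in .vcf.gz or .bcf: {filename}"
        )
    return problems


# Illustrative entry with placeholder values; the uncompressed filename is flagged.
example = {
    "labels": {"site": "example-institute", "pipeline": "example-pipeline"},
    "subcategory": "example-subcategory",
    "zenodo": {"deposition": 7734975, "filename": "calls.vcf"},
    "benchmark": "giab-NA12878-twist",
}
for problem in check_callset_entry(example):
    print(problem)
```

An empty result list means the entry has the expected shape; it does not confirm that the Zenodo record or the benchmark name actually exist.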

The ncbench workflow runs automatically on the pull request, and you can download the resulting report, with the assessment of your callset, as an artifact from the GitHub Actions CI interface.
Once the pull request has been reviewed and merged, your results will appear in the online report at https://ncbench.github.io.
If your callset receives an update, update your Zenodo record and create a new pull request that updates the Zenodo record ID in your config entry.
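
Before uploading a .vcf.gz file to Zenodo, it may be worth confirming that it was compressed with bgzip rather than plain gzip, since bgzip writes the blocked BGZF variant of gzip that tools such as tabix and bcftools expect. Whether the ncbench workflow itself rejects plain gzip is an assumption not stated here. The sketch below, with the hypothetical helper name `looks_like_bgzf`, inspects only the first 18 bytes of the file for the BGZF block header.

```python
import struct


def looks_like_bgzf(path):
    """Return True if the file starts with a BGZF block header (as written by bgzip)."""
    with open(path, "rb") as handle:
        header = handle.read(18)
    if len(header) < 18:
        return False

    # gzip magic bytes (0x1f 0x8b), deflate method (8), FEXTRA flag (0x04) set
    if header[0:3] != b"\x1f\x8b\x08" or not header[3] & 0x04:
        return False

    # Length of the gzip extra field, little-endian, at byte offset 10
    xlen = struct.unpack("<H", header[10:12])[0]
    # Assumption: the 'BC' subfield comes first, as bgzip writes it,
    # and carries a 2-byte payload (the block size).
    subfield_len = struct.unpack("<H", header[14:16])[0]
    return xlen >= 6 and header[12:14] == b"BC" and subfield_len == 2
```

A call such as `looks_like_bgzf("calls.vcf.gz")` returning False suggests recompressing the plain VCF with bgzip. BCF files written by bcftools in compressed form use the same block format, so the check applies to them as well.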

The latest results for all contributed callsets are shown at https://ncbench.github.io.

For running ncbench locally, the following steps are required:

Install Mamba and Snakemake.
Clone this git repository.

Adapt the configuration to your needs (e.g. add your own callset, and optionally remove all other callsets if you are only interested in your own). When adding your own callset, you can either refer to a Zenodo record or, which is probably more useful in the local case, to a local path. The following is a minimal entry for evaluating a local callset, to be added to the variant-calls section in the file config/config.yaml of your local clone:

my-callset: # choose a descriptive name for your callset
 path: # path to vcf/bcf/vcf.gz file containing your variant calls (both SNVs and indels, sorted by coordinate)
 benchmark: # benchmark to use (one of giab-NA12878-agilent-200M, giab-NA12878-agilent-75M, giab-NA12878-twist, and more, see https://github.com/snakemake-workflows/dna-seq-benchmark/blob/main/workflow/resources/presets.yaml)
 rename-contigs: resources/rename-contigs/ucsc-to-ensembl.txt # rename contigs from UCSC (prefixed with chr) to Ensembl style (remove if your contigs are already in Ensembl style)
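
Whether the rename-contigs line is needed depends on how the contigs in your callset are named. As a rough aid, the sketch below classifies the `##contig` lines of a VCF header by whether their IDs carry the UCSC-style chr prefix. The function name `contig_style` and its return labels are hypothetical, and it works on header lines of a text VCF that have already been read into a list; it does not parse BCF.

```python
import re

# Matches VCF header lines such as: ##contig=<ID=chr1,length=...>
CONTIG_ID = re.compile(r"^##contig=<ID=([^,>]+)")


def contig_style(header_lines):
    """Classify header contigs as 'ucsc', 'unprefixed', 'mixed' or 'unknown'."""
    names = []
    for line in header_lines:
        match = CONTIG_ID.match(line)
        if match:
            names.append(match.group(1))
    if not names:
        return "unknown"  # no ##contig lines found in the header

    prefixed = sum(name.startswith("chr") for name in names)
    if prefixed == len(names):
        return "ucsc"  # keep the rename-contigs line
    if prefixed == 0:
        return "unprefixed"  # rename-contigs can likely be removed
    return "mixed"


header = [
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1>",
    "##contig=<ID=chr2>",
]
print(contig_style(header))
```

A result of 'mixed' or 'unknown' is a sign to inspect the header by hand before running the workflow.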

Run the workflow, first as a dry run with snakemake -n --sdm conda, and then for real with snakemake --sdm conda --cores N, where N is the desired number of cores. The workflow can also run on cluster or cloud middleware; see the Snakemake documentation for details.
