Showing content from https://github.com/vgteam/vg/wiki/Extracting-a-FASTA-from-a-Graph below:

Extracting a FASTA from a Graph · vgteam/vg Wiki · GitHub

Graph references often contain linear references within them. You might want a copy of one of these in order to, for example, call variants with a linear-reference-based caller like Google's DeepVariant.

If you don't already have a FASTA file for an assembly that is included in a graph, you can use vg to extract the assembly FASTA directly from the graph, like this:

vg paths --extract-fasta -x test/graphs/rgfa_with_reference.rgfa --paths-by GRCh38

Here, the argument to -x should be the graph file, in rGFA, GFA, .vg, .gbz, or any other graph file format that vg can read (see File Formats). The argument to --paths-by should be the prefix of the set of paths you would like to extract; generally you can use a sample or assembly name here. You can use vg paths --list -x <the graph> to get a list of all paths available.
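As a rough sketch of how the listing and extraction steps fit together, the shell function below checks that at least one path starts with the requested prefix before extracting. The function name `extract_by_prefix` and its arguments are illustrative and not part of vg; it uses only the `vg paths` options described above and assumes `vg paths --list` prints one path name per line.

```shell
# Hypothetical helper (not part of vg): confirm that some path name starts
# with the prefix, then write the matching paths as FASTA to standard output.
# Usage: extract_by_prefix GRAPH PREFIX
extract_by_prefix() {
    # index($0, p) == 1 is a literal "starts with" test, so characters such
    # as '#' or '.' in the prefix are not treated as a pattern.
    if ! vg paths --list -x "$1" |
            awk -v p="$2" 'index($0, p) == 1 { found = 1 } END { exit !found }'; then
        echo "no paths starting with '$2' in $1" >&2
        return 1
    fi
    vg paths --extract-fasta -x "$1" --paths-by "$2"
}
```

With the example graph above, this would be called as `extract_by_prefix test/graphs/rgfa_with_reference.rgfa GRCh38`.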

This will write the extracted sequences in FASTA format to standard output.

In most cases, the sequence names in the FASTA will be in PanSN format (see Path Metadata Model); these will match the names used by vg surject, and so a FASTA extracted like this is easy to use with a BAM file produced by vg surject.
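To check which sequence names ended up in an extracted FASTA, one option is to list the header lines and split them on the PanSN delimiter. This is only a sketch: it assumes the common `sample#haplotype#contig` layout with `#` as the delimiter, and the function name `pansn_fields` is made up for illustration. Names that do not have exactly three fields are printed whole in the last column.

```shell
# Hypothetical helper: print sample, haplotype, and contig for each FASTA
# record, tab-separated. Assumes '#' is the PanSN delimiter.
# Usage: pansn_fields FASTA
pansn_fields() {
    # Keep header lines, drop the leading '>' and any description after
    # the first whitespace, then split the remaining name on '#'.
    grep '^>' "$1" | sed -e 's/^>//' -e 's/[[:space:]].*$//' |
        awk -F'#' 'NF == 3 { print $1 "\t" $2 "\t" $3; next }
                   { print "-\t-\t" $0 }'
}
```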

To save the FASTA to a file, redirect standard output with >.
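For instance, a minimal sketch that wraps the extraction and the redirect in one function and reports how many sequences were written. The function name `save_fasta` and the output file name used below are illustrative only.

```shell
# Hypothetical wrapper (not part of vg): extract the paths with a given
# prefix into a FASTA file and report the number of records on stderr.
# Usage: save_fasta GRAPH PREFIX OUTPUT
save_fasta() {
    vg paths --extract-fasta -x "$1" --paths-by "$2" > "$3" || return 1
    # Each FASTA record starts with a '>' header line.
    echo "wrote $(grep -c '^>' "$3") sequences to $3" >&2
}
```

With the example graph from above, this could be run as `save_fasta test/graphs/rgfa_with_reference.rgfa GRCh38 GRCh38.fa`.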

If you are interested in extracting haplotype paths from a .gbwt file, you can pass the .gbwt file with the -g option to vg paths, and the corresponding .gg file or any matching graph with -x.
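A sketch of what that invocation might look like, combining `-g` and `-x` with the `--extract-fasta` option used earlier. The function name `gbwt_to_fasta` and the file names in the usage line are placeholders, and this assumes no further path-selection options are needed; `vg paths --help` for your vg version is the authority on that.

```shell
# Hypothetical wrapper (not part of vg): write haplotype paths stored in a
# GBWT index to a FASTA file.
# Usage: gbwt_to_fasta HAPLOTYPES.gbwt GRAPH OUTPUT
gbwt_to_fasta() {
    # -g takes the .gbwt file; -x takes the .gg file or any matching graph.
    vg paths --extract-fasta -g "$1" -x "$2" > "$3"
}
```

For example: `gbwt_to_fasta haplotypes.gbwt graph.gg haplotypes.fa`.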

