eGenerateMasksWithDuster
const string Trigger() const { return trigger; }
string Th() const { return th; }
r << sformat << smem;
bool UseBA() const { return use_ba; }
EAppType type = eAny,
bool determine_input = true );
: resource( newResource ) {}
{ if( resource && resource != &NcbiCin ) delete resource; }
operator bool() const { return resource != NULL; }
static void FillIdList( const string & file_name,
Virtual base class for all input readers.
A base class for winmasker output writers.
Winmasker configuration errors.
virtual const char * GetErrCodeString() const override
Get the description of an error.
NCBI_EXCEPTION_DEFAULT(CWinMaskConfigException, CException)
@ eInconsistentOptions
Option validation failure.
@ eInputOpenFail
Cannot open input file.
@ eReaderAllocFail
Memory allocation for input reader object failed.
const CNcbiIstream * operator->() const
CNcbiIstream * operator->()
CIstreamProxy(CNcbiIstream *newResource=NULL)
const CNcbiIstream & operator*() const
CNcbiIstream & operator*()
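The ownership rule of CIstreamProxy (delete the held stream on destruction unless it is the shared standard input) can be sketched in isolation. The following is a minimal illustration, not the toolkit's code: IstreamProxy and DemoFirstToken are hypothetical names, std::istream and std::cin stand in for CNcbiIstream and NcbiCin, and the conversion operator is marked explicit here although the original declares a plain operator bool().

```cpp
#include <cassert>
#include <iostream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for CIstreamProxy: owns a heap-allocated stream,
// but never deletes the shared standard input stream.
class IstreamProxy {
public:
    explicit IstreamProxy(std::istream* newResource = nullptr)
        : resource(newResource) {}

    ~IstreamProxy() {
        if (resource && resource != &std::cin)
            delete resource;
    }

    // Non-copyable: two proxies must not delete the same stream.
    IstreamProxy(const IstreamProxy&) = delete;
    IstreamProxy& operator=(const IstreamProxy&) = delete;

    // True when a stream is attached.
    explicit operator bool() const { return resource != nullptr; }

    std::istream& operator*() {
        assert(resource != nullptr);  // dereferencing an empty proxy is a bug
        return *resource;
    }
    std::istream* operator->() { return resource; }

private:
    std::istream* resource;
};

// Usage helper: wrap a string stream in a proxy and read its first token.
std::string DemoFirstToken(const std::string& text) {
    IstreamProxy in(new std::istringstream(text));  // proxy takes ownership
    std::string token;
    if (in)
        *in >> token;
    return token;  // the string stream is deleted when `in` goes out of scope
}
```

Wrapping &std::cin in the same class is safe because the destructor skips it, which is the point of the special case in the original destructor.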
Objects of this class contain winmasker configuration data.
string trigger
type of the event that triggers masking
Uint4 merge_cutoff_score
average unit score triggering interval merging
string iformatstr
input format
Uint4 dust_level
level value for dusting
string Th() const
Percentage thresholds.
bool merge_pass
perform extra interval merging passes or not
double MinScorePct() const
double t_low_pct
minimum allowed unit score as percentage of units with lower count
Uint4 MeanMergeCutoffDist() const
Distance at which intervals are considered candidates for merging.
const CIdSet * ExcludeIds() const
The set of query ids to exclude from processing.
CMaskWriter * writer
output writer object
CMaskReader * reader
input reader object
Uint4 smem
memory (in megabytes) available for masking stage
Uint1 TMin_Count() const
Number of units to count.
bool MergePass() const
Flag to run the interval merging passes.
bool FaList() const
Use a list of fasta files.
CWinMaskUtil::CIdSet CIdSet
Uint4 pattern
base pattern to use for discontiguous units
Uint4 min_score
minimum allowed unit score
double MaxScorePct() const
const CIdSet * Ids() const
The set of query ids to process.
Uint1 merge_unit_step
unit step to use when merging intervals
bool CheckDup() const
Check for possibly duplicate sequences in the input.
bool use_ba
use bit array based optimization
Uint1 UnitStep() const
Unit step.
Uint4 SetMinScore() const
Get the alternative score for low scoring units.
bool UseBA() const
Whether to use bit array optimization for optimized binary counts format.
CWinMaskConfig(const CWinMaskConfig &rhs)
Prohibit copy constructor.
string Input() const
Value of the -input parameter.
const string InFmt() const
Input file format.
bool fa_list
indicates whether input is a list of fasta file names
double t_extend_pct
minimum score for interval extension as percentage of units with lower count
Uint1 tmin_count
number of units to count for min trigger
bool discontig
true, if using discontiguous units
Uint1 UnitSize() const
n-mer size used for n-mer frequency counting.
Uint4 DustLinker() const
Dust linker (in bps).
double ExtendScorePct() const
const string Trigger() const
Type of the event triggering the masking.
Uint4 mem
memory available for unit counts generator
Uint8 genome_size
total size of the genome in bases
EAppType AppType() const
Type of application to run.
CWinMaskConfig & operator=(const CWinMaskConfig &rhs)
Prohibit assignment operator.
CIstreamProxy is
input file resource manager
Uint4 WindowStep() const
Window step.
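How a window size and a window step interact can be shown with a generic sliding-window enumeration. This is only a sketch under assumptions: WindowStarts is a hypothetical helper, positions are 0-based, and only windows that fit entirely inside the sequence are reported. The toolkit's own window iteration may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: start positions of all full windows of length
// window_size, advancing by window_step, in a sequence of seq_len bases.
std::vector<std::size_t> WindowStarts(std::size_t seq_len,
                                      std::size_t window_size,
                                      std::size_t window_step) {
    assert(window_size > 0 && window_step > 0);  // both must be positive
    std::vector<std::size_t> starts;
    for (std::size_t pos = 0; pos + window_size <= seq_len; pos += window_step)
        starts.push_back(pos);
    return starts;
}
```

For a sequence of 10 bases, a window size of 4 and a step of 3, this yields the start positions 0, 3 and 6.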
Uint4 mean_merge_cutoff_dist
distance at which intervals are considered for merging
string Output() const
Value of the -output parameter.
string output
output file name (may be empty to indicate stdout)
CIdSet * exclude_ids
set of ids to exclude from processing
string sformat
unit counts format for counts generator
CWinMaskUtil::CIdSet_TextMatch CIdSet_TextMatch
Uint4 cutoff_score
window score that triggers masking
bool MatchId() const
Use CSeq_id objects to match/print sequence ids.
Uint4 window_step
window step
EAppType app_type
type of application to run
Uint4 DustWindow() const
Dust window.
const string LStatName() const
Get the name of the length statistics file.
string const GetMetaData() const
Get metadata string to be added to the counts file.
Uint4 set_max_score
score to use for high scoring units
const string SFormat() const
Format in which the unit counts generator should generate its output.
Uint4 textend
t_extend value for extension of masked intervals
Uint4 Mem() const
Memory available for n-mer frequency counting.
Uint4 Textend() const
Get the t_extend value.
string metadata
metadata associated with counts file
Uint4 max_score
maximum allowed unit score
string input
input file name
bool Discontig() const
Whether discontiguous units are used.
double t_thres_pct
threshold score for starting masking as percentage of units with lower count
Uint1 unit_size
unit size (used in unit counts generator)
Uint4 DustLevel() const
Dust level.
CIdSet * ids
set of ids to process
Uint4 MaxScore() const
Get the maximum unit score.
Uint4 SetMaxScore() const
Get the alternative score for high scoring units.
Uint4 Pattern() const
Pattern to form discontiguous units.
Uint4 dust_linker
number of bases to use for linking
Uint8 GenomeSize() const
Total genome length.
Uint4 MergeCutoffScore() const
Average unit score triggering the interval merging.
Uint1 unit_step
unit step
Uint4 dust_window
window size for dusting
Uint4 abs_merge_cutoff_dist
distance triggering unconditional interval merging
bool text_match
identify seq ids by string matching
Uint4 set_min_score
score to use for low scoring units
string lstat_name
name of the file containing unit length statistics
Uint4 MinScore() const
Get the minimum unit score.
double ThresScorePct() const
Uint4 AbsMergeCutoffDist() const
Distance at which intervals are merged unconditionally.
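Unconditional merging by distance can be illustrated with a small interval pass. The sketch below assumes inclusive [start, end] intervals sorted by start, and assumes the distance is the number of unmasked bases between two neighbours; neither detail is stated in this listing. Interval and MergeUnconditionally are hypothetical names, and the score-based merging governed by MergeCutoffScore and MeanMergeCutoffDist is not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Interval = std::pair<std::uint32_t, std::uint32_t>;  // [start, end], inclusive

// Hypothetical helper: merge neighbouring intervals whose gap is at most
// abs_dist bases. Input must be sorted by start position.
std::vector<Interval> MergeUnconditionally(const std::vector<Interval>& in,
                                           std::uint32_t abs_dist) {
    std::vector<Interval> out;
    for (const Interval& cur : in) {
        assert(cur.first <= cur.second);
        // The overlap test comes first so the unsigned subtraction cannot wrap.
        if (!out.empty() &&
            (cur.first <= out.back().second ||
             cur.first - out.back().second - 1 <= abs_dist)) {
            out.back().second = std::max(out.back().second, cur.second);
        } else {
            out.push_back(cur);
        }
    }
    return out;
}
```

With a cutoff of 2, the intervals [0, 9] and [12, 20] are joined into [0, 20] because two bases separate them, while [40, 50] stays separate.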
string th
percentages to compute winmask thresholds
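Since th is stored as a string, it has to be split into numeric percentages before use. The sketch below assumes a comma-separated list of floating-point values; that format, the sample value in the usage note, and the name ParsePercentages are assumptions for illustration, not taken from this listing.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a comma-separated string into percentages.
// std::stod throws std::invalid_argument on a non-numeric field.
std::vector<double> ParsePercentages(const std::string& th) {
    std::vector<double> result;
    std::istringstream in(th);
    std::string field;
    while (std::getline(in, field, ',')) {
        if (field.empty())
            continue;  // tolerate stray commas
        const double pct = std::stod(field);
        assert(pct >= 0.0 && pct <= 100.0);  // a percentage must lie in [0, 100]
        result.push_back(pct);
    }
    return result;
}
```

For example, ParsePercentages("90,99,99.5,99.8") would return four values that could then be mapped onto the low, extend, threshold and high percentages described above.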
CWinMaskUtil::CIdSet_SeqId CIdSet_SeqId
Uint4 CutoffScore() const
Get the average unit score threshold.
Uint1 window_size
length of a window in base pairs
CMaskWriter & Writer()
Get the output writer object.
Uint1 MergeUnitStep() const
Unit step to use for interval merging.
bool checkdup
check for duplicate contigs
Uint1 WindowSize() const
Get the window size.
double t_high_pct
highest allowed unit score as percentage of units with lower count
Implementation of CIdSet that compares CSeq_id handles.
Implementation of CIdSet that does substring matching.
Base class for sets of seq_id representations used with -ids and -exclude-ids options.
The NCBI C++ standard methods for dealing with std::string.
uint8_t Uint1
1-byte (8-bit) unsigned integer
uint32_t Uint4
4-byte (32-bit) unsigned integer
uint64_t Uint8
8-byte (64-bit) unsigned integer
#define END_NCBI_SCOPE
End previously defined NCBI scope.
#define BEGIN_NCBI_SCOPE
Define ncbi namespace.
IO_PREFIX::istream CNcbiIstream
Portable alias for istream.
#define NCBI_XALGOWINMASK_EXPORT
Defines command line argument related classes.
NCBI C++ stream class wrappers for triggering between "new" and "old" C++ stream libraries.