Defines exception class and several constants for SeqDB.
USING_SCOPE (objects)
Include definitions from the objects namespace.
Defines classes: CSeqDBException
Implemented for: UNIX, MS-Windows
Definition in file seqdbcommon.hpp.
◆ TOid
◆ TPig
◆ TSeqDBAliasFileInstance
Set of values found in one instance of one alias file.
Definition at line 1838 of file seqdbcommon.hpp.
◆ TSeqDBAliasFileValues
Contents of all alias files are returned in this type of container.
Definition at line 1844 of file seqdbcommon.hpp.
◆ TSeqDBAliasFileVersions
Contents of all instances of a particular alias file pathname.
Definition at line 1841 of file seqdbcommon.hpp.
◆ TTi
◆ EBlastDbVersion
BLAST database version.
Enumerator: eBDB_Version4, eBDB_Version5
Definition at line 51 of file seqdbcommon.hpp.
◆ EOidMaskType
◆ ESeqDBAllocType
Certain methods have an "Alloc" version.
When these methods are used, the following constants can be specified to indicate which library allocates the returned data, so that the matching call (delete[] vs. free()) can be used to release it.
Enumerator: eAtlas, eMalloc, eNew
Definition at line 121 of file seqdbcommon.hpp.
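The pairing of allocator and release call can be illustrated with a minimal sketch. AllocKind, CopyBuffer and ReleaseBuffer are hypothetical names made up for this example and are not part of SeqDB; the eAtlas case is left out because the text above does not describe how such data is released.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the eMalloc / eNew choices of ESeqDBAllocType.
enum class AllocKind { Malloc, New };

// Copy n bytes into a buffer obtained from the requested allocator.
char* CopyBuffer(const char* src, std::size_t n, AllocKind kind)
{
    assert(src != nullptr);
    char* dst = (kind == AllocKind::Malloc)
        ? static_cast<char*>(std::malloc(n))
        : new char[n];
    if (dst != nullptr) {
        std::memcpy(dst, src, n);
    }
    return dst;
}

// Release the buffer with the call that matches how it was allocated.
void ReleaseBuffer(char* buf, AllocKind kind)
{
    if (kind == AllocKind::Malloc) {
        std::free(buf);      // pairs with malloc()
    } else {
        delete[] buf;        // pairs with new[]
    }
}
```

The caller has to remember which constant was passed when the data was requested; mixing free() with new[] (or delete[] with malloc()) is undefined behavior.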
◆ ESeqDBIdType
Various identifier formats used in Id lookup.
Enumerator:
eGiId: Genomic ID is a relatively stable numeric identifier for sequences.
eTiId: Trace ID is a numeric identifier for Trace sequences.
ePigId: Each PIG identifier refers to exactly one protein sequence.
eStringId: Some sequence sources use string identifiers.
eHashId: Lookup from sequence hash values to OIDs.
eOID: The ordinal id indicates the order of the data in the volume's index file.
Definition at line 1963 of file seqdbcommon.hpp.
◆ GetBlastSeqIdString()
◆ IsStringId()
◆ SeqDB_CombineAndQuote()
Combine and quote a list of database names.
SeqDB permits multiple databases to be opened by a single CSeqDB instance, by passing the database names as a space-delimited list to the CSeqDB constructor. To support paths and filenames with embedded spaces, surround any space-containing names with double quotes ('"'). Filenames not containing spaces may be quoted safely with no effect. (This solution prevents the use of names containing embedded double quotes.)
This method combines a list of database names into a string encoded in this way.
Definition at line 1717 of file seqdbcommon.cpp.
References dbname(), i, int, and ncbi::grid::netcache::search::fields::size.
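The encoding described for SeqDB_CombineAndQuote() can be sketched in a few lines. This is an illustrative stand-in under the rules stated above, not the toolkit's implementation; the function name CombineAndQuote and its return-by-value signature are assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of the encoding: join names with single spaces and
// wrap any name that contains a space in double quotes.
std::string CombineAndQuote(const std::vector<std::string>& names)
{
    std::string out;
    for (const std::string& name : names) {
        // The scheme cannot represent names with embedded double quotes.
        assert(name.find('"') == std::string::npos);
        if (!out.empty()) {
            out += ' ';
        }
        if (name.find(' ') != std::string::npos) {
            out += '"';
            out += name;
            out += '"';
        } else {
            out += name;
        }
    }
    return out;
}
```

For the names nr, my db and nt, this sketch produces the string nr "my db" nt.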
◆ SeqDB_CompareVolume()
◆ SeqDB_GetFileExtensions()
◆ SeqDB_GetLMDBFileExtensions()
void SeqDB_GetLMDBFileExtensions ( bool db_is_protein, vector< string > & extn )
◆ SeqDB_GetMetadataFileExtension()
void SeqDB_GetMetadataFileExtension ( bool db_is_protein, string & extn )
◆ SeqDB_GetOidMaskFileExt()
◆ SeqDB_IsBinaryGiList()
Read a text or binary SeqId list from a file.
The SeqIds in a file are read into the provided vector<string>. If the in_order parameter is not null, the function will test the SeqIds for orderedness. It will set the bool to which in_order points to true if so, false if not.
void SeqDB_ReadSeqIdList(const string & fname, vector<string> & sis, bool * in_order = 0);
Returns true if the file name passed contains a binary GI list.
Definition at line 1400 of file seqdbcommon.cpp.
References CSeqDBFileGiList::eGiList, and s_ContainsBinaryNumericIdList().
Referenced by BOOST_AUTO_TEST_CASE(), and CBlastDBAliasApp::CreateAliasFile().
◆ SeqDB_IsBinaryTiList()
◆ SeqDB_ReadBinaryGiList()
void SeqDB_ReadBinaryGiList ( const string & name, vector< TGi > & gis )
◆ SeqDB_ReadGiList() [1/2]
◆ SeqDB_ReadGiList() [2/2]
Read a text or binary GI list from a file.
The GIs in a file are read into the provided vector<int>. If the in_order parameter is not null, the function will test the GIs for orderedness. It will set the bool to which in_order points to true if so, false if not.
Definition at line 1462 of file seqdbcommon.cpp.
References ITERATE, and SeqDB_ReadGiList().
◆ SeqDB_ReadMemoryGiList()
Read a text or binary GI list from an area of memory.
The GIs in a memory region are read into the provided SGiOid vector. The GI half of each element of the vector is assigned, but the OID half will be left as -1. If the in_order parameter is not null, the function will test the GIs for orderedness. It will set the bool to which in_order points to true if so, false if not.
Definition at line 925 of file seqdbcommon.cpp.
References _ASSERT, GI_FROM, NCBI_THROW, s_ReadDigit(), s_SeqDB_IsBinaryNumericList(), SeqDB_GetStdOrd(), and ZERO_GI.
Referenced by CSeqDBNodeFileIdList::CSeqDBNodeFileIdList(), and SeqDB_ReadGiList().
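The behavior described for the text case (GI assigned, OID left as -1, optional order check) can be sketched as follows. GiOid and ReadTextGiList are hypothetical names for this example; the sketch handles only decimal text, treats every non-digit byte as a separator, and counts a non-decreasing sequence as ordered, all of which are simplifying assumptions rather than documented SeqDB behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an element of the SGiOid vector.
struct GiOid {
    std::int64_t gi;
    int          oid;
};

// Parse decimal GIs from [begin, end); append them with oid == -1.
// If in_order is not null, report whether the GIs were non-decreasing.
void ReadTextGiList(const char* begin, const char* end,
                    std::vector<GiOid>& gis, bool* in_order = nullptr)
{
    assert(begin <= end);
    bool sorted = true;
    bool have_digits = false;
    std::int64_t value = 0;

    for (const char* p = begin; ; ++p) {
        const bool at_end = (p == end);
        if (!at_end && *p >= '0' && *p <= '9') {
            value = value * 10 + (*p - '0');
            have_digits = true;
            continue;
        }
        if (have_digits) {          // a separator or the end closes a number
            if (!gis.empty() && gis.back().gi > value) {
                sorted = false;
            }
            gis.push_back(GiOid{value, -1});
            value = 0;
            have_digits = false;
        }
        if (at_end) {
            break;
        }
    }
    if (in_order != nullptr) {
        *in_order = sorted;
    }
}
```

Passing a null in_order skips the report, mirroring the optional parameter described above.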
◆ SeqDB_ReadMemoryMixList()
Read an ID list (mixed type) from an area of memory.
The Seq ids in a memory region are read into the provided SSeqIdOid vector. The gi, ti or seqid half of each element of the vector is assigned, but the OID half will be left as -1. If the in_order parameter is not null, the function will test the SeqIds for orderedness. It will set the bool to which in_order points to true if so, false if not.
Definition at line 1324 of file seqdbcommon.cpp.
References eGiId, eStringId, eTiId, GI_FROM, head, SeqDB_SimplifyAccession(), and NStr::ToLower().
Referenced by SeqDB_ReadMixList().
◆ SeqDB_ReadMemoryPigList()
◆ SeqDB_ReadMemorySiList()
Read a text SeqID list from an area of memory.
The Seqids in a memory region are read into the provided SSeqIdOid vector. The SeqId half of each element of the vector is assigned, but the OID half will be left as -1. If the in_order parameter is not null, the function will test the SeqIds for orderedness. It will set the bool to which in_order points to true if so, false if not.
Definition at line 1284 of file seqdbcommon.cpp.
References NStr::eTrunc_Both, head, and NStr::TruncateSpaces().
Referenced by CSeqDBNodeFileIdList::CSeqDBNodeFileIdList(), and SeqDB_ReadSiList().
◆ SeqDB_ReadMemoryTiList()
Read a text or binary TI list from an area of memory.
The TIs in a memory region are read into the provided STiOid vector. The TI half of each element of the vector is assigned, but the OID half will be left as -1. If the in_order parameter is not null, the function will test the TIs for orderedness. It will set the bool to which in_order points to true if so, false if not.
Definition at line 1149 of file seqdbcommon.cpp.
References int, NCBI_THROW, s_ReadDigit(), s_SeqDB_IsBinaryNumericList(), and SeqDB_GetStdOrd().
Referenced by CSeqDBNodeFileIdList::CSeqDBNodeFileIdList(), and SeqDB_ReadTiList().
◆ SeqDB_ReadMixList()
Read a text SeqId list from a file.
The Seqids in a file are read into the provided SSeqIdOid vector. The Gi/Ti/Si half of each element of the vector is assigned, but the OID half will be left as -1. If the in_order parameter is not null, the function will test the SeqIds for orderedness. It will set the bool to which in_order points to true if so, false if not.
Definition at line 1428 of file seqdbcommon.cpp.
References CMemoryFile::GetPtr(), CMemoryFile::GetSize(), SeqDB_MakeOSPath(), and SeqDB_ReadMemoryMixList().
Referenced by CSeqDBFileGiList::CSeqDBFileGiList().
◆ SeqDB_ReadPigList()
◆ SeqDB_ReadSiList()
◆ SeqDB_ReadTiList()
Read a text or binary TI list from a file.
The TIs in a file are read into the provided STiOid vector. The TI half of each element of the vector is assigned, but the OID half will be left as -1. If the in_order parameter is not null, the function will test the TIs for orderedness. It will set the bool to which in_order points to true if so, false if not.
Definition at line 1417 of file seqdbcommon.cpp.
References CMemoryFile::GetPtr(), CMemoryFile::GetSize(), SeqDB_MakeOSPath(), and SeqDB_ReadMemoryTiList().
Referenced by CSeqDBFileGiList::CSeqDBFileGiList().
◆ SeqDB_ResolveDbPath()
Resolve a file path using SeqDB's path algorithms.
This finds a file using the same algorithm used by SeqDB to find blast database filenames. The filename must include the extension if any. Paths which start with '/', '\', or a drive letter (depending on operating system) will be treated as absolute paths. If the file is not found an empty string will be returned.
Definition at line 453 of file seqdbcommon.cpp.
References s_SeqDB_FindBlastDBPath().
Referenced by CIndexedDb_New::AddIndexInfo(), BOOST_AUTO_TEST_CASE(), CIgAnnotationInfo::CIgAnnotationInfo(), CIndexedDb_Old::CIndexedDb_Old(), CTaxDBFileInfo::CTaxDBFileInfo(), CTaxonomy4BlastSQLite::CTaxonomy4BlastSQLite(), CBlastDatabaseArgs::ExtractAlgorithmOptions(), CIgBlastArgs::ExtractAlgorithmOptions(), CBlastSeqidlistFile::GetSeqidlistInfo(), s_GetTaxIDList(), CBlastTabularInfo::x_CheckTaxDB(), CCmdLineBlastXML2ReportData::x_InitCommon(), CSeqDBGiMask::x_Open(), CVDBAliasNode::x_ResolveDBList(), and CVDBAliasNode::x_ResolveVDBList().
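The absolute-path rule mentioned for SeqDB_ResolveDbPath() can be sketched as a small predicate. LooksAbsolute is a hypothetical helper written for this example; unlike the real function, which applies the rule per operating system, this sketch accepts all three prefixes on any platform.

```cpp
#include <cctype>
#include <string>

// Hypothetical check: does the path start with '/', '\', or a drive letter
// followed by ':'? Platform-specific distinctions are deliberately ignored.
bool LooksAbsolute(const std::string& path)
{
    if (path.empty()) {
        return false;
    }
    if (path[0] == '/' || path[0] == '\\') {
        return true;
    }
    return path.size() >= 2
        && std::isalpha(static_cast<unsigned char>(path[0])) != 0
        && path[1] == ':';
}
```

A path that fails this check would be looked up through the search path instead of being used as given.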
◆ SeqDB_ResolveDbPathForLinkoutDB()
Resolve a file path using SeqDB's path algorithms.
Identical to SeqDB_ResolveDbPathNoExtension with the exception that this function searches for ISAM or SQLite files, specifically those storing numeric and string data (for LinkoutDB; i.e.: '.sqlite3'). This is intended to check whether the files used in LinkoutDB exist or not.
Definition at line 472 of file seqdbcommon.cpp.
References CSeqDBAtlas::GenerateSearchPath(), and s_SeqDB_TryPaths().
◆ SeqDB_ResolveDbPathNoExtension()
Resolve a file path using SeqDB's path algorithms.
Identical to SeqDB_ResolveDbPath with the exception that this function does not require the extension to be provided. This is intended to check whether a BLAST DB exists or not.
Definition at line 464 of file seqdbcommon.cpp.
References s_SeqDB_FindBlastDBPath().
Referenced by CBlastDBCmdApp::Run(), s_DoesBlastDbExist(), and CVDBAliasNode::x_ResolveDBList().
◆ SeqDB_SequenceHash() [1/2]
◆ SeqDB_SequenceHash() [2/2]
unsigned SeqDB_SequenceHash ( const char * sequence, int length )
Sequence Hashing
This computes a hash of a sequence. The sequence is expected to be in either ncbistdaa format (for protein) or ncbi8na format (for nucleotide). These formats are produced by CSeqDB::GetAmbigSeq() if the kSeqDBNuclNcbiNA8 encoding is selected.
Definition at line 146 of file seqdbobj.cpp.
References SeqDB_ComputeSequenceHash().
Referenced by BOOST_AUTO_TEST_CASE(), CSeqDBImpl::GetSequenceHash(), and CWriteDB_Impl::x_ComputeHash().
◆ SeqDB_SimplifyAccession() [1/2]
String id simplification.
This simpler version converts a string id to the standard ISAM form and returns "" if the conversion fails.
Definition at line 2610 of file seqdbcommon.cpp.
References eStringId, result, and SeqDB_SimplifyAccession().
◆ SeqDB_SimplifyAccession() [2/2]
String id simplification.
This routine tries to produce a numerical type from a string identifier. SeqDB can use faster lookup mechanisms if a PIG, GI, or OID type can be recognized in the string, for example. Even when the output is a string, it may be better formed for the purpose of lookup in the string ISAM file.
Definition at line 2535 of file seqdbcommon.cpp.
References CSeq_id::BestRank(), NStr::EqualNocase(), eStringId, NStr::fConvErr_NoThrow, FindBestChoice(), NULL, CSeq_id::ParseFastaIds(), result, s_SeqDB_ParseSeqIDs(), SeqDB_SimplifySeqid(), NStr::SplitInTwo(), and NStr::ToLower().
Referenced by CSeqDBVol::AccessionToOids(), CInputGiList::AppendSi(), BOOST_AUTO_TEST_CASE(), CSeqDBGiList::PreprocessIdsForISAMSiLookup(), CSeqDBNegativeList::PreprocessIdsForISAMSiLookup(), SeqDB_ReadMemoryMixList(), and SeqDB_SimplifyAccession().
◆ SeqDB_SimplifySeqid()
Seq-id simplification.
Given a Seq-id, this routine devolves it to a GI or PIG if possible. If not, it formats the Seq-id into a canonical form for lookup in the string ISAM files. If the Seq-id was parsed from an accession, it can be provided in the "acc" parameter, and it will be used if the Seq-id is not in a form this code can recognize. In the case that new Seq-id types are added, support for which has not been added to this code, this mechanism will try to use the original string.
Definition at line 2264 of file seqdbcommon.cpp.
References CSeq_id::AsFastaString(), CTextseq_id_Base::CanGetAccession(), CDbtag_Base::CanGetDb(), CTextseq_id_Base::CanGetName(), CDbtag_Base::CanGetTag(), CTextseq_id_Base::CanGetVersion(), NStr::CompareNocase(), CSeq_id_Base::e_Ddbj, CSeq_id_Base::e_Embl, CSeq_id_Base::e_Genbank, CSeq_id_Base::e_General, CSeq_id_Base::e_Gi, CSeq_id_Base::e_Gibbsq, CSeq_id_Base::e_Gpipe, CSeq_id_Base::e_Local, CSeq_id_Base::e_Other, CSeq_id_Base::e_Pir, CSeq_id_Base::e_Prf, CSeq_id_Base::e_Swissprot, CSeq_id_Base::e_Tpd, CSeq_id_Base::e_Tpe, CSeq_id_Base::e_Tpg, CSeq_id::eFasta, eGiId, eOID, ePigId, eStringId, eTiId, CSeq_id::fLabel_GeneralDbIsContent, CSeq_id::fLabel_Version, CTextseq_id_Base::GetAccession(), CDbtag_Base::GetDb(), CSeq_id_Base::GetGeneral(), CSeq_id_Base::GetGi(), CSeq_id_Base::GetGibbsq(), CObject_id_Base::GetId(), CSeq_id::GetLabel(), CSeq_id_Base::GetLocal(), CTextseq_id_Base::GetName(), CObject_id_Base::GetStr(), CDbtag_Base::GetTag(), CSeq_id::GetTextseq_Id(), CTextseq_id_Base::GetVersion(), GI_TO, NStr::IntToString(), CObject_id_Base::IsStr(), result, NStr::StringToInt8(), NStr::ToLower(), NStr::UIntToString(), and CSeq_id_Base::Which().
Referenced by CSeqDBGiList::FindId(), CSeqDBNegativeList::FindId(), SeqDB_SimplifyAccession(), CSeqDBVol::SeqidToOids(), and CBlastDB_BioseqFormatter::Write().
◆ SeqDB_SplitQuoted()
Split a (possibly) quoted list of database names into pieces.
SeqDB permits multiple databases to be opened by a single CSeqDB instance, by passing the database names as a space-delimited list to the CSeqDB constructor. To support paths and filenames with embedded spaces, surround any space-containing names with double quotes ('"'). Filenames not containing spaces may be quoted safely with no effect. (This solution prevents the use of names containing embedded double quotes.)
This method splits a string encoded in this way into individual database names. Note that the resulting vector's objects are CTempString "slice" objects, and are only valid while the original (encoded) string is unchanged.
Definition at line 1744 of file seqdbcommon.cpp.
References dbname(), ITERATE, and tmp.
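The splitting rule and the lifetime caveat for the returned slices can be sketched with std::string_view (C++17) standing in for CTempString. SplitQuoted below is a hypothetical example function, not the toolkit's code, and an unterminated quote is simply treated as running to the end of the string, which is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical sketch: split a space-delimited list in which names that
// contain spaces are wrapped in double quotes. The returned views point
// into 'encoded' and are valid only while that string is unchanged.
std::vector<std::string_view> SplitQuoted(const std::string& encoded)
{
    const std::string_view all(encoded);
    std::vector<std::string_view> parts;
    std::size_t i = 0;

    while (i < all.size()) {
        if (all[i] == ' ') {            // skip separators
            ++i;
            continue;
        }
        const bool quoted = (all[i] == '"');
        const std::size_t start = quoted ? i + 1 : i;
        std::size_t end = all.find(quoted ? '"' : ' ', start);
        if (end == std::string_view::npos) {
            end = all.size();           // last name, or unterminated quote
        }
        parts.push_back(all.substr(start, end - start));
        // Step past the closing quote when there is one.
        i = (quoted && end < all.size()) ? end + 1 : end;
    }
    return parts;
}
```

Because the slices do not own their characters, the encoded string must outlive the vector; passing a temporary string to this function would leave the views dangling.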
◆ USING_SCOPE()
Include definitions from the objects namespace.
◆ kSeqDBEntryDuplicate
◆ kSeqDBEntryNotFound
◆ kSeqDBGroupAliasFileName
◆ kSeqDBNuclBlastNA8
◆ kSeqDBNuclNcbiNA8