Showing content from http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCSeqDBIsam.html below:

NCBI C++ ToolKit: CSeqDBIsam Class Reference

#include <objtools/blast/seqdb_reader/impl/seqdbisam.hpp>

    CSeqDBIsam (CSeqDBAtlas &atlas, const string &dbname, char prot_nucl, char file_ext_char, ESeqDBIdType ident_type)   Constructor.
    ~CSeqDBIsam ()   Destructor.
  bool  PigToOid (TPig pig, TOid &oid)   PIG translation.
  bool  IdToOid (Int8 id, TOid &oid)   GI or TI translation.
  void  IdsToOids (int vol_start, int vol_end, CSeqDBGiList &ids)   Translate Gis and Tis to Oids for the given ID list.
  void  IdsToOids (int vol_start, int vol_end, CSeqDBNegativeList &ids)   Compute list of included OIDs based on a negative ID list.
  void  StringToOids (const string &acc, vector< TOid > &oids, bool adjusted, bool &version_check)   String translation.
  bool  SeqidToOid (const string &acc, TOid &oid)   Seq-id translation.
  void  HashToOids (unsigned hash, vector< TOid > &oids)   Sequence hash lookup.
  void  UnLease ()   Return any memory held by this object to the atlas.
  void  GetIdBounds (Int8 &low_id, Int8 &high_id, int &count)   Get Numeric Bounds.
  void  GetIdBounds (string &low_id, string &high_id, int &count)   Get String Bounds.
    CObject (void)   Constructor.
    CObject (const CObject &src)   Copy constructor.
  virtual  ~CObject (void)   Destructor.
  CObject &  operator= (const CObject &src) THROWS_NONE   Assignment operator.
  bool  CanBeDeleted (void) const THROWS_NONE   Check if object can be deleted.
  bool  IsAllocatedInPool (void) const THROWS_NONE   Check if object is allocated in memory pool (not system heap).
  bool  Referenced (void) const THROWS_NONE   Check if object is referenced.
  bool  ReferencedOnlyOnce (void) const THROWS_NONE   Check if object is referenced only once.
  void  AddReference (void) const   Add reference to object.
  void  RemoveReference (void) const   Remove reference to object.
  void  ReleaseReference (void) const   Remove reference without deleting object.
  virtual void  DoNotDeleteThisObject (void)   Mark this object as not allocated in heap – do not delete this object.
  virtual void  DoDeleteThisObject (void)   Mark this object as allocated in heap – object can be deleted.
  void *  operator new (size_t size)   Define new operator for memory allocation.
  void *  operator new[] (size_t size)   Define new[] operator for 'array' memory allocation.
  void  operator delete (void *ptr)   Define delete operator for memory deallocation.
  void  operator delete[] (void *ptr)   Define delete[] operator for memory deallocation.
  void *  operator new (size_t size, void *place)   Define new operator.
  void  operator delete (void *ptr, void *place)   Define delete operator.
  void *  operator new (size_t size, CObjectMemoryPool *place)   Define new operator using memory pool.
  void  operator delete (void *ptr, CObjectMemoryPool *place)   Define delete operator.
  virtual void  DebugDump (CDebugDumpContext ddc, unsigned int depth) const   Define method for dumping debug information.
    CDebugDumpable (void)
  virtual  ~CDebugDumpable (void)
  void  DebugDumpText (ostream &out, const string &bundle, unsigned int depth) const
  void  DebugDumpFormat (CDebugDumpFormatter &ddf, const string &bundle, unsigned int depth) const
  void  DumpToConsole (void) const
  template<class T > void  x_LoadIndex (CSeqDBFileMemMap &lease, vector< T > &keys, vector< TIndx > &offs)   Load and extract all index samples into array at once.
  template<class T > void  x_LoadData (CSeqDBFileMemMap &lease, vector< T > &keys, vector< int > &vals, int num_keys, TIndx begin)   Load and extract a data page into array at once.
  template<class T > void  x_TranslateGiList (int vol_start, CSeqDBGiList &gis)   GiList Translation.
  bool  x_IdentToOid (Int8 id, TOid &oid)   Numeric identifier lookup.
  EErrorCode  x_SearchIndexNumeric (Int8 Number, int *Data, Uint4 *Index, Int4 &SampleNum, bool &done)   Index file search.
  void  x_SearchNegativeMulti (int vol_start, int vol_end, CSeqDBNegativeList &gis, bool use_tis)   Negative ID List Translation.
  void  x_SearchNegativeMultiSeq (int vol_start, int vol_end, CSeqDBNegativeList &gis)
  EErrorCode  x_SearchDataNumeric (Int8 Number, int *Data, Uint4 *Index, Int4 SampleNum)   Data file search.
  EErrorCode  x_NumericSearch (Int8 Number, int *Data, Uint4 *Index)   Numeric identifier lookup.
  EErrorCode  x_StringSearch (const string &term_in, vector< string > &term_out, vector< string > &value_out, vector< TIndx > &index_out)   String identifier lookup.
  EErrorCode  x_InitSearch (void)   Initialize the search object.
  int  x_GetPageNumElements (Int4 SampleNum, Int4 *Start)   Determine the number of elements in the data page.
  bool  x_SparseStringToOids (const string &acc, vector< int > &oids, bool adjusted)   Lookup a string in a sparse table.
  int  x_DiffCharLease (const string &term_in, CSeqDBFileMemMap &lease, const string &file_name, TIndx file_length, Uint4 at_least, TIndx KeyOffset, bool ignore_case)   Find the first character to differ in two strings.
  int  x_DiffChar (const string &term_in, const char *begin, const char *end, bool ignore_case)   Find the first character to differ in two strings.
  void  x_ExtractData (const char *key_start, const char *entry_end, vector< string > &key_out, vector< string > &data_out)   Extract the data from a key-value pair in memory.
  TIndx  x_GetIndexKeyOffset (TIndx sample_offset, Uint4 sample_num)   Get the offset of the specified sample.
  void  x_GetIndexString (TIndx key_offset, int length, string &prefix, bool trim_to_null)   Read a string from the index file.
  int  x_DiffSample (const string &term_in, Uint4 SampleNum, TIndx &KeyOffset)   Find the first character to differ in two strings.
  void  x_ExtractAllData (const string &term_in, TIndx sample_index, vector< TIndx > &indices_out, vector< string > &keys_out, vector< string > &data_out)   Find matches in the given page of a string ISAM file.
  void  x_ExtractPageData (const string &term_in, TIndx page_index, const char *beginp, const char *endp, vector< TIndx > &indices_out, vector< string > &keys_out, vector< string > &data_out)   Find matches in the given memory area of a string ISAM file.
  void  x_LoadPage (TIndx SampleNum1, TIndx SampleNum2, const char **beginp, const char **endp)   Map a page into memory.
  int  x_TestNumericSample (CSeqDBFileMemMap &index_lease, int index, Int8 key_in, Int8 &key_out, int &data_out)   Test a sample key value from a numeric index.
  void  x_GetNumericSample (CSeqDBFileMemMap &index_lease, int index, Int8 &key_out, int &data_out)   Get a sample key value from a numeric index.
  bool  x_FindInNegativeList (CSeqDBNegativeList &ids, int &index, Int8 key, bool use_tis)   Find ID in the negative GI list using PBS.
  bool  x_FindInNegativeList (CSeqDBNegativeList &ids, int &index, string key)
  void  x_MapDataPage (int sample_index, int &start, int &num_elements, const void **data_page_begin)   Map a data page.
  void  x_GetDataElement (const void *dpage, int index, Int8 &key, int &data)   Get a particular data element from a data page.
  void  x_GetDataElement (const void *dpage, int index, string &key, int &data)
  void  x_FindIndexBounds ()   Find the least and greatest keys in this ISAM file.
  bool  x_OutOfBounds (Int8 key)   Check whether a numeric key is within this volume's bounds.
  bool  x_OutOfBounds (string key)   Check whether a string key is within this volume's bounds.
  Uint8  x_GetNumericKey (const void *p)
  int  x_GetNumericData (const void *p)
  void  x_LoadStringData (const char *begin, string &key, int &data)
  template<> void  x_LoadIndex (CSeqDBFileMemMap &lease, vector< TGi > &keys, vector< TIndx > &offs)   Load and extract all index samples into array at once.
  template<> void  x_LoadData (CSeqDBFileMemMap &lease, vector< TGi > &keys, vector< int > &vals, int num_keys, TIndx begin)   Load and extract a data page into array at once.
  template<> void  x_LoadIndex (CSeqDBFileMemMap &lease, vector< string > &keys, vector< TIndx > &offs)
  template<> void  x_LoadData (CSeqDBFileMemMap &lease, vector< string > &keys, vector< int > &vals, int num_keys, TIndx begin)

CSeqDBIsam.

Manages one ISAM file, which will translate either PIGs, GIs, or Accessions to OIDs. Translation in the other direction is done in the CSeqDBVol code. Files managed by this class include those with the extensions pni, pnd, ppi, ppd, psi, psd, nsi, nsd, nni, and nnd. Each instance of this object will manage one pair of these files, including one whose name ends in 'i' and one whose name ends in 'd'.
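The extension list above suggests a simple naming pattern: the sequence type character, then the identifier type character, then 'i' or 'd'. The following is a minimal sketch of that pattern only; `MakeIsamFilenames` is a hypothetical helper, not the toolkit's `x_MakeFilenames()`, and the validation it performs is an assumption.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: builds the index and data file names for one ISAM pair.
// Assumes the extension is prot_nucl + file_ext_char + 'i' (index) or 'd' (data).
std::pair<std::string, std::string>
MakeIsamFilenames(const std::string& dbname, char prot_nucl, char file_ext_char)
{
    if (dbname.empty() || (prot_nucl != 'p' && prot_nucl != 'n')) {
        throw std::invalid_argument("bad database name or sequence type");
    }
    const std::string base = dbname + '.' + prot_nucl + file_ext_char;
    return { base + 'i', base + 'd' };
}
```

Under these assumptions, a protein volume named `nr` with a numeric GI index maps to `nr.pni` and `nr.pnd`.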

Definition at line 127 of file seqdbisam.hpp.

◆ TGiOid

Import the type representing one GI, OID association.

Definition at line 130 of file seqdbisam.hpp.

◆ TId

Type large enough to hold any numerical ID.

Definition at line 158 of file seqdbisam.hpp.

◆ TIndx

Type which is large enough to span the bytes of an ISAM file.

Definition at line 143 of file seqdbisam.hpp.

◆ TOid

This class works with OIDs relative to a specific volume.

Definition at line 146 of file seqdbisam.hpp.

◆ TTi

Identifier type for trace databases.

The same documentation block also describes the sibling identifier types: PIG identifiers for numeric indices over protein volumes, and genomic IDs (GIs), the most common numerical identifier.

Definition at line 155 of file seqdbisam.hpp.

◆ EErrorCode

Exit conditions occurring in this code.

Enumerator
eNotFound  The key was not found.
eNoError  Lookup was successful.
eBadVersion  The format version of the ISAM file is unsupported.
eBadType  The requested ISAM type did not match the file.
eWrongFile  The file was not found, or was the wrong length.
eInitFailed

Definition at line 489 of file seqdbisam.hpp.

◆ EIsamDbType

Types of database this class can access.

Enumerator
eNumeric  Numeric database with Key/Value pairs in the index file.
eNumericNoData  This type is not supported.
eString  String database type used here.
eStringDatabase  This type is not supported.
eStringBin  This type is not supported.
eNumericLongId

Definition at line 133 of file seqdbisam.hpp.

◆ CSeqDBIsam()

Constructor.

An ISAM file object corresponds to an index file and a data file, and converts identifiers (string, GI, or PIG) into OIDs relative to a particular database volume.

Parameters
atlas  The memory management object. [in]
dbname  The name of the volume's files (minus the extension). [in]
prot_nucl  Whether the sequences are protein or nucleotide. [in]
file_ext_char  This is 's', 'n', or 'p', for string, GI, or PIG, respectively. [in]
ident_type  The type of identifiers this database translates. [in]

Definition at line 1102 of file seqdbisam.cpp.

References CSeqDBFileMemMap::Clear(), dbname(), DEFAULT_NISAM_SIZE, DEFAULT_SISAM_SIZE, eGiId, eHashId, eNoError, eNumeric, ePigId, eString, eStringId, eTiId, CSeqDBFileMemMap::Init(), m_DataFname, m_DataLease, m_IndexFname, m_IndexLease, m_Initialized, m_PageSize, m_Type, msg(), NCBI_THROW, x_FindIndexBounds(), x_InitSearch(), and x_MakeFilenames().

◆ ~CSeqDBIsam() CSeqDBIsam::~CSeqDBIsam ( )

Destructor.

Releases all resources associated with this object.

Definition at line 1211 of file seqdbisam.cpp.

References UnLease().

◆ GetIdBounds() [1/2] void CSeqDBIsam::GetIdBounds ( Int8 & low_id, Int8 & high_id, int & count )

Get Numeric Bounds.

Fetch the lowest, highest, and total number of numeric keys in the database index. If the operation fails, zero will be returned for count.

Parameters
low_id  Lowest numeric id value in database. [out]
high_id  Highest numeric id value in database. [out]
count  Number of numeric id values in database. [out]
locked  Lock holder object for this thread. [in]

Definition at line 1625 of file seqdbisam.cpp.

References count, CSeqDBIsam::SIsamKey::GetNumeric(), CSeqDBIsam::SIsamKey::IsSet(), m_FirstKey, m_Initialized, m_LastKey, and m_NumTerms.

Referenced by CSeqDBVol::GetGiBounds(), CSeqDBVol::GetPigBounds(), and CSeqDBVol::GetStringBounds().

◆ GetIdBounds() [2/2] void CSeqDBIsam::GetIdBounds ( string & low_id, string & high_id, int & count )

◆ HashToOids() void CSeqDBIsam::HashToOids ( unsigned hash, vector< TOid > & oids )

Sequence hash lookup.

This method tries to find sequences associated with a given sequence hash value. The provided value is numeric, but the ISAM file uses a string format because string searches can return multiple results per key, and there may be multiple OIDs for a given hash value due to identical sequences and collisions.

Parameters
hash  The sequence hash value to look up. [in]
oids  The returned oids. [out]
locked  The lock holder object for this thread. [in|out]
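
As a rough illustration of why a string-keyed table suits hash lookups, the sketch below formats the numeric hash as a decimal string and collects every OID stored under that key. The `std::multimap` stands in for the string ISAM file, and `HashToOidsSketch` is a hypothetical name; neither is part of the toolkit.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for a string ISAM file: one string key may map to several OIDs.
using StringIndex = std::multimap<std::string, int>;

// Hypothetical sketch: format the numeric hash as a decimal string and
// collect every OID stored under that key.
std::vector<int> HashToOidsSketch(const StringIndex& index, unsigned hash)
{
    std::vector<int> oids;
    auto range = index.equal_range(std::to_string(hash));
    for (auto it = range.first; it != range.second; ++it) {
        oids.push_back(it->second);
    }
    return oids;
}
```

Identical sequences and hash collisions both show up as several entries under one key, which a one-value-per-key numeric table could not represent.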

Definition at line 1667 of file seqdbisam.cpp.

References _ASSERT, eHashId, eNoError, eNotFound, ITERATE, ncbi::grid::netcache::search::fields::key, m_IdentType, m_Initialized, NStr::UIntToString(), and x_StringSearch().

Referenced by CSeqDBVol::HashToOids().

◆ IdsToOids() [1/2]

Translate Gis and Tis to Oids for the given ID list.

This method iterates over a vector of Gi/OID and/or Ti/OID pairs. For each pair where the OID is -1, the GI or TI will be looked up in the ISAM file, and (if found) the correct OID will be stored (otherwise the -1 will remain). This method will normally be called once for each volume.

Parameters
vol_start  The starting OID of this volume. [in]
vol_end  The first OID past the end of this volume. [in]
ids  The set of GI-OID or TI-OID pairs. [in|out]
locked  The lock holder object for this thread. [in|out]
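
The per-volume pass described above can be sketched as follows. This is an illustration under assumptions, not the toolkit code: `IdOidPair`, `TranslateIds`, and the `std::map` standing in for the ISAM file are hypothetical, and adding `vol_start` assumes the list stores database-wide OIDs while the volume index stores volume-relative ones.

```cpp
#include <cstdint>
#include <map>
#include <vector>

struct IdOidPair {
    std::int64_t id;   // GI or TI
    int          oid;  // -1 means "not translated yet"
};

// Hypothetical sketch: volume_index maps IDs to volume-relative OIDs.
void TranslateIds(const std::map<std::int64_t, int>& volume_index,
                  int vol_start, int vol_end, std::vector<IdOidPair>& ids)
{
    for (IdOidPair& p : ids) {
        if (p.oid != -1) continue;                // already resolved elsewhere
        auto it = volume_index.find(p.id);
        if (it == volume_index.end()) continue;   // not in this volume; keep -1
        const int global_oid = it->second + vol_start;
        if (global_oid < vol_end) p.oid = global_oid;
    }
}
```

Calling the function once per volume leaves -1 only in pairs whose ID is found in no volume.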

Definition at line 1388 of file seqdbisam.cpp.

References eGiId, ePigId, eStringId, eTiId, m_IdentType, and NCBI_THROW.

Referenced by CSeqDBVol::IdsToOids().

◆ IdsToOids() [2/2]

Compute list of included OIDs based on a negative ID list.

This method iterates over a vector of Gis or Tis, along with the corresponding ISAM file for this volume. Each OID found in the ISAM file is marked in the negative ID list. For those for which the GI or TI is not mentioned in the negative ID list, the OID will be marked as an 'included' OID in the ID list (that OID will be searched). The OIDs for IDs that are not found in the ID list will be marked as 'visible' OIDs. When this process is done for all volumes, the SeqDB object will use all OIDs that are either marked as 'included' or NOT marked as 'visible'. The 'visible' list is needed because otherwise iteration would skip IDs that do not have GIs or TIs (whichever is being iterated).

To use this method, this volume must have an ISAM file matching the negative ID list's identifier type, or an exception will be thrown.

Parameters
vol_start  The starting OID of this volume. [in]
vol_end  The first OID past the end of this volume. [in]
ids  The set of GI-OID pairs. [in|out]
locked  The lock holder object for this thread. [in|out]

Definition at line 1421 of file seqdbisam.cpp.

References _ASSERT, eGiId, eStringId, eTiId, CSeqDBNegativeList::GetNumGis(), CSeqDBNegativeList::GetNumSis(), CSeqDBNegativeList::GetNumTis(), CSeqDBNegativeList::InsureOrder(), m_IdentType, x_SearchNegativeMulti(), and x_SearchNegativeMultiSeq().

◆ IdToOid()

◆ IndexExists() bool CSeqDBIsam::IndexExists ( const string & dbname, char prot_nucl, char file_ext_char ) static

◆ PigToOid()

PIG translation.

A PIG identifier is translated to an OID. PIG identifiers are used exclusively for protein sequences. One PIG corresponds to exactly one sequence of amino acids, and vice versa. They are also stable; the sequence a PIG points to will never be changed.

Parameters
pig  The PIG to look up. [in]
oid  The returned oid. [out]
locked  The lock holder object for this thread. [in|out]
Returns
true if the PIG was found

Definition at line 203 of file seqdbisam.hpp.

References _ASSERT, ePigId, m_IdentType, and x_IdentToOid().

Referenced by CSeqDBVol::PigToOid(), and CSeqDBVol::x_StringToOids().

◆ SeqidToOid()

Seq-id translation.

A Seq-id identifier (serialized to a string) is translated into an OID. This routine will attempt to simplify the seqid so as to use the faster numeric lookup techniques whenever possible.

Parameters
acc  A string containing the Seq-id. [in]
oid  The returned oid. [out]
locked  The lock holder object for this thread. [in|out]

◆ StringToOids()

String translation.

A string id is translated to one or more OIDs. String ids are used by some groups which produce sequence data. In some cases, the string may correspond to more than one OID. For this reason, the OIDs are returned in a vector.

The string provided is looked up in several ways. If it contains a pipe character ("|") the data will be interpreted as a SeqID. This routine can use faster lookup mechanisms if the simplification routines were able to recognize the sequence as one of several types that have numerical indices.

The version_check flag is needed to support sparse indexing. If version_check is true, and the string has a version, and the lookup fails, this method will try to remove the version and search again. On return from this method version_check will be set to true if and only if the first search failed and the versionless search succeeded. CSeqDBVol::x_CheckVersions() can then be called to verify the OIDs; see that method for more information about this scenario.

Parameters
acc  The string to look up. [in]
oids  The returned oids. [out]
adjusted  Whether the simplification adjusted the string. [in|out]
version_check  If the version can be stripped [in] and if it was [out].
locked  The lock holder object for this thread. [in|out]
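
The in/out behaviour of version_check is easy to misread, so here is a small sketch of the retry logic as described. `Lookup` and `LookupWithVersionRetry` are hypothetical names, the callback stands in for the string ISAM search, and treating the text after the last '.' as the version is an assumption.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical lookup callback standing in for the string ISAM search.
using Lookup = std::function<std::vector<int>(const std::string&)>;

// version_check is [in] "the version may be stripped" and
// [out] "the first search failed and the versionless search succeeded".
std::vector<int> LookupWithVersionRetry(const Lookup& lookup,
                                        const std::string& acc,
                                        bool& version_check)
{
    const bool may_strip = version_check;
    version_check = false;

    std::vector<int> oids = lookup(acc);
    if (!oids.empty() || !may_strip) return oids;

    // Assumption: the version is the suffix after the last '.', e.g. "ABC123.2".
    const std::string::size_type dot = acc.rfind('.');
    if (dot == std::string::npos || dot == 0) return oids;

    oids = lookup(acc.substr(0, dot));
    version_check = !oids.empty();
    return oids;
}
```

A caller that sees version_check come back true knows the OIDs were matched without the version and still need to be verified.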

Definition at line 1236 of file seqdbisam.cpp.

References _ASSERT, CSeq_id::AsFastaString(), eNoError, eNotFound, eStringId, CSeq_id::fParse_AnyLocal, CSeq_id::fParse_RawText, isdigit(), ITERATE, m_IdentType, m_Initialized, ncbi::grid::netcache::search::fields::size, and x_StringSearch().

Referenced by CSeqDBVol::x_StringToOids().

◆ UnLease() void CSeqDBIsam::UnLease ( )

◆ x_DiffChar()

Find the first character to differ in two strings.

This finds the index of the first character to differ in a meaningful way between two strings. One of the strings is a term that is passed in; the other is a range of memory represented by two pointers.

Parameters
term_in  The key string to compare against.
begin  A pointer to the start of the second string.
end  A pointer to the end of the second string.
ignore_case  Whether to treat the search as case-insensitive.
Returns
The position of the first difference.
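
A minimal sketch of such a comparison follows. `FirstDifference` is a hypothetical function, not the toolkit's `x_DiffChar()`: it does not model ISAM key terminators or end-of-line handling, and returning -1 when no difference is found is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical sketch: compares term against the bytes in [begin, end).
// Returns the index of the first differing character, or -1 if none
// (the -1 convention is an assumption of this sketch).
int FirstDifference(const std::string& term, const char* begin,
                    const char* end, bool ignore_case)
{
    std::size_t i = 0;
    const char* p = begin;
    for (; i < term.size() && p != end; ++i, ++p) {
        unsigned char a = static_cast<unsigned char>(term[i]);
        unsigned char b = static_cast<unsigned char>(*p);
        if (ignore_case) {
            a = static_cast<unsigned char>(std::toupper(a));
            b = static_cast<unsigned char>(std::toupper(b));
        }
        if (a != b) return static_cast<int>(i);
    }
    // One string is a proper prefix of the other: they differ at index i.
    if (i < term.size() || p != end) return static_cast<int>(i);
    return -1;
}
```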

Definition at line 589 of file seqdbisam.cpp.

References ch1, ch2, ENDS_ISAM_KEY(), i, int, result, s_SeqDBIsam_NullifyEOLs(), and toupper().

Referenced by x_DiffCharLease(), x_ExtractAllData(), and x_ExtractPageData().

◆ x_DiffCharLease()

Find the first character to differ in two strings.

This finds the index of the first character to differ in a meaningful way between two strings. One of the strings is a term that is passed in; the other is assumed to be located in the ISAM table, a lease to which is passed to this function.

Parameters
term_in  The key string to compare against.
lease  A lease to hold the data in the ISAM table file.
file_name  The name of the ISAM file to work with.
file_length  The length of the file named by file_name.
at_least  Try to get at least this many bytes.
KeyOffset  The location of the key in the leased file.
ignore_case  Whether to treat the search as case-insensitive.
locked  The lock holder object for this thread.
Returns
The position of the first difference.

Definition at line 516 of file seqdbisam.cpp.

References file_name, CSeqDBFileMemMap::GetFileDataPtr(), int, result, and x_DiffChar().

Referenced by x_DiffSample().

◆ x_DiffSample()

Find the first character to differ in two strings.

This finds the index of the first character to differ between two strings. The first string is provided, the second is one of the sample strings, indicated by the index of that sample value.

Parameters
term_in  The key string to compare against.
SampleNum  Selects which sample to compare with.
KeyOffset  The returned offset of the key that was used.
locked  This thread's lock holder object.

Definition at line 863 of file seqdbisam.cpp.

References CSeqDBFileMemMap::GetFileDataPtr(), m_IndexFileLength, m_IndexFname, m_IndexLease, m_KeySampleOffset, m_MaxLineSize, m_NumSamples, m_PageSize, MEMORY_ONLY_PAGE_SIZE, SeqDB_GetStdOrd(), and x_DiffCharLease().

Referenced by x_StringSearch().

◆ x_ExtractAllData() void CSeqDBIsam::x_ExtractAllData ( const string & term_in, TIndx sample_index, vector< TIndx > & indices_out, vector< string > & keys_out, vector< string > & data_out ) private

Find matches in the given page of a string ISAM file.

This searches the area around a specific page of the data file to find all matches to term_in. The results are returned in vectors. This method may search multiple pages.

Parameters
term_in  The key string to compare against.
sample_index  Selects which page to search.
indices_out  The index of each match.
keys_out  The key of each match.
data_out  The value of each match.
locked  This thread's lock holder object.

Definition at line 688 of file seqdbisam.cpp.

References m_NumSamples, m_PageSize, s_SeqDBIsam_NullifyEOLs(), x_DiffChar(), x_ExtractPageData(), and x_LoadPage().

Referenced by x_StringSearch().

◆ x_ExtractData() void CSeqDBIsam::x_ExtractData ( const char *  key_start, const char *  entry_end, vector< string > &  key_out, vector< string > &  data_out  ) private

Extract the data from a key-value pair in memory.

Given pointers to a location in mapped memory, and the end of the mapped data, this finds the key and data values for the object at that location.

Parameters
key_start  A pointer to the beginning of the key-value pair in memory.
entry_end  A pointer to the end of the mapped area of memory.
key_out  A string holding the ISAM entry's key.
data_out  A string holding the ISAM entry's value.
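
To make the idea concrete, the sketch below splits one entry into key and value. The record layout is an assumption made for illustration (key bytes, one separator byte, value bytes, then an end-of-line byte), as are the separator value `0x02` and the name `ExtractKeyValue`; the reference text only names `ISAM_DATA_CHAR` without giving its value.

```cpp
#include <algorithm>
#include <string>
#include <utility>

// Assumed separator between key and value; hypothetical for this sketch.
constexpr char kDataSeparator = '\x02';

// Hypothetical sketch: parse the entry that starts at key_start and ends at
// the first end-of-line byte, or at entry_end if none is found.
std::pair<std::string, std::string>
ExtractKeyValue(const char* key_start, const char* entry_end)
{
    auto is_eol = [](char c) { return c == '\n' || c == '\r' || c == '\0'; };
    const char* line_end = std::find_if(key_start, entry_end, is_eol);
    const char* sep = std::find(key_start, line_end, kDataSeparator);

    std::string key(key_start, sep);
    std::string value =
        (sep == line_end) ? std::string() : std::string(sep + 1, line_end);
    return { key, value };
}
```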

Definition at line 793 of file seqdbisam.cpp.

References ISAM_DATA_CHAR, and s_SeqDBIsam_NullifyEOLs().

Referenced by x_ExtractPageData(), and x_FindIndexBounds().

◆ x_ExtractPageData()

Find matches in the given memory area of a string ISAM file.

This searches the specified section of memory to find all matches to term_in. The results are returned in vectors.

Parameters
term_in  The key string to compare against.
page_index  Selects which page to search.
beginp  Pointer to the start of the memory area.
endp  Pointer to the end of the memory area.
indices_out  The index of each match.
keys_out  The key of each match.
data_out  The value of each match.

Definition at line 634 of file seqdbisam.cpp.

References s_SeqDBIsam_NullifyEOLs(), x_DiffChar(), and x_ExtractData().

Referenced by x_ExtractAllData(), and x_StringSearch().

◆ x_FindIndexBounds() void CSeqDBIsam::x_FindIndexBounds ( ) private

Find the least and greatest keys in this ISAM file.

Definition at line 1461 of file seqdbisam.cpp.

References _ASSERT, eNumeric, m_FirstKey, m_LastKey, m_NumSamples, m_Type, s_SeqDBIsam_NullifyEOLs(), CSeqDBIsam::SIsamKey::SetNumeric(), CSeqDBIsam::SIsamKey::SetString(), x_ExtractData(), x_GetDataElement(), x_LoadPage(), x_Lower(), and x_MapDataPage().

Referenced by CSeqDBIsam().

◆ x_FindInNegativeList() [1/2]

Find ID in the negative GI list using PBS.

Use parabolic binary search to find the specified ID in the negative ID list. The 'index' value is the index to start the search at (this must refer to an index at or before the target data if the search is to succeed). Whether the search was successful or not, the index will be moved forward past any elements with values less than 'key'.

Parameters
ids  Negative ID list. [in|out]
index  Index into negative ID list. [in|out]
key  Key for which to search. [in]
use_tis  If true, search for a TI, else for a GI. [in]
Returns
True if the search found the ID.
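
The reference text does not define "parabolic binary search". The sketch below assumes it behaves like a galloping (exponential) search: probe ahead with doubling jumps from the current index, then binary-search the bracketed range. `FindForward` is a hypothetical name and works on a plain sorted vector rather than a CSeqDBNegativeList.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a forward galloping search over a sorted list.
// Precondition: index <= ids.size() and every element before index is < key.
// On return, index is the first position whose element is >= key
// (or ids.size()); the result says whether that element equals key.
bool FindForward(const std::vector<std::int64_t>& ids, std::size_t& index,
                 std::int64_t key)
{
    const std::size_t n = ids.size();

    // Gallop: double the jump while the probed element is still below key.
    std::size_t jump = 1;
    while (index + jump < n && ids[index + jump] < key) {
        index += jump;
        jump *= 2;
    }

    // The answer now lies in [index, index + jump]; finish with binary search.
    const std::size_t hi = std::min(index + jump + 1, n);
    auto first = ids.begin() + static_cast<std::ptrdiff_t>(index);
    auto last  = ids.begin() + static_cast<std::ptrdiff_t>(hi);
    index = static_cast<std::size_t>(std::lower_bound(first, last, key) - ids.begin());

    return index < n && ids[index] == key;
}
```

Because the index only moves forward, a caller that feeds keys in increasing order walks the list once overall, which matches the requirement that the starting index be at or before the target.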

Definition at line 1428 of file seqdbisam.hpp.

References ncbi::grid::netcache::search::fields::key, CSeqDBNegativeList::ListSize(), and x_GetId().

Referenced by x_SearchNegativeMulti(), and x_SearchNegativeMultiSeq().

◆ x_FindInNegativeList() [2/2]

◆ x_GetDataElement() [1/2] void CSeqDBIsam::x_GetDataElement ( const void * dpage, int index, Int8 & key, int & data ) inline private

◆ x_GetDataElement() [2/2] void CSeqDBIsam::x_GetDataElement ( const void * dpage, int index, string & key, int & data ) inline private

◆ x_GetId() [1/2]

◆ x_GetId() [2/2]

◆ x_GetIndexKeyOffset()

Get the offset of the specified sample.

For string ISAM indices, the index file contains a table of offsets of the index file samples. This function gets the offset of the specified sample in the index file's table.

Parameters
sample_offset  The offset into the file of the set of samples.
sample_num  The index of the sample to get.
locked  This thread's lock holder object.
Returns
The offset of the sample in the index file.
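
Reading one entry from such an offset table can be sketched as below. The entry width (4 bytes) and byte order (big-endian) are assumptions for illustration, and `ReadBigEndian32` and `GetIndexKeyOffset` are hypothetical free functions operating on an in-memory copy of the file.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Reads a 4-byte big-endian integer (assumed here to be the on-disk order).
std::uint32_t ReadBigEndian32(const unsigned char* p)
{
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// Hypothetical sketch: the offset table starts at sample_offset and holds
// one 4-byte entry per sample.
std::uint32_t GetIndexKeyOffset(const std::vector<unsigned char>& file,
                                std::size_t sample_offset,
                                std::size_t sample_num)
{
    const std::size_t pos = sample_offset + sample_num * 4;
    if (pos + 4 > file.size()) {
        throw std::out_of_range("sample entry lies outside the index file");
    }
    return ReadBigEndian32(file.data() + pos);
}
```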

Definition at line 823 of file seqdbisam.cpp.

References CSeqDBFileMemMap::GetFileDataPtr(), m_IndexLease, and SeqDB_GetStdOrd().

Referenced by x_StringSearch().

◆ x_GetIndexString() void CSeqDBIsam::x_GetIndexString ( TIndx key_offset, int length, string & prefix, bool trim_to_null ) private

Read a string from the index file.

Given an offset into the index file, and a maximum length, this function returns the bytes in a string object.

Parameters
key_offset  The offset into the file of the first byte.
length  The maximum number of bytes to get.
prefix  The string in which to return the data.
trim_to_null  Whether to search for a null and return only that much data.
locked  This thread's lock holder object.
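
A minimal sketch of this read, using an in-memory byte vector in place of the leased index file. `GetIndexString` here is a hypothetical free function; clamping the read to the end of the buffer is an assumption, not documented behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: copy up to 'length' bytes starting at key_offset,
// optionally stopping at the first null byte.
std::string GetIndexString(const std::vector<char>& file,
                           std::size_t key_offset, std::size_t length,
                           bool trim_to_null)
{
    if (key_offset >= file.size()) return std::string();

    const std::size_t avail = std::min(length, file.size() - key_offset);
    const char* first = file.data() + key_offset;
    const char* last  = first + avail;
    if (trim_to_null) last = std::find(first, last, '\0');
    return std::string(first, last);
}
```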

Definition at line 836 of file seqdbisam.cpp.

References CSeqDBFileMemMap::GetFileDataPtr(), i, m_IndexLease, and str().

Referenced by x_StringSearch().

◆ x_GetNumericData() int CSeqDBIsam::x_GetNumericData ( const void * p ) inline private

◆ x_GetNumericKey() Uint8 CSeqDBIsam::x_GetNumericKey ( const void * p ) inline private

◆ x_GetNumericSample()

Get a sample key value from a numeric index.

Given the index of a sample value, this code will get the key. If data values are stored in the index file, the corresponding data value will also be returned. The offset of the data block is computed and returned as well.

Parameters
index_lease  The memory lease to use with the index file.
index  The index of the sample to get.
key_out  The key found will be returned here.
data_out  If an exact match, the data found will be returned here.

Definition at line 1315 of file seqdbisam.hpp.

References CSeqDBFileMemMap::GetFileDataPtr(), m_KeySampleOffset, m_TermSize, x_GetNumericData(), and x_GetNumericKey().

◆ x_GetPageNumElements() Int4 CSeqDBIsam::x_GetPageNumElements ( Int4 SampleNum, Int4 * Start ) private

Determine the number of elements in the data page.

The number of elements is determined based on whether this is the last page and the configured page size.

Parameters
SampleNum  Which data page will be searched.
Start  The returned index of the start of the page.
Returns
The number of elements in this data page.
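
The rule can be written down in a few lines. The sketch below is a hypothetical free function that takes the counts as arguments (the class keeps them in m_NumSamples, m_NumTerms, and m_PageSize); it assumes every page is full except possibly the last.

```cpp
#include <cassert>

// Hypothetical sketch: returns the element count of page sample_num and
// stores the index of its first element in start.
int GetPageNumElements(int sample_num, int num_samples, int num_terms,
                       int page_size, int& start)
{
    assert(sample_num >= 0 && sample_num < num_samples);
    start = sample_num * page_size;
    // Every page is full except possibly the last one.
    return (sample_num + 1 == num_samples) ? num_terms - start : page_size;
}
```

With 10 terms and a page size of 4 there are 3 pages; the last one starts at element 8 and holds 2 elements.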

Definition at line 123 of file seqdbisam.cpp.

References m_NumSamples, m_NumTerms, and m_PageSize.

Referenced by x_MapDataPage(), and x_SearchDataNumeric().

◆ x_IdentToOid()

Numeric identifier lookup.

Given a numeric identifier, this routine finds the OID.

Parameters
id  The GI or PIG identifier to look up.
oid  The returned oid.
locked  The lock holder object for this thread.
Returns
true if the identifier was found.

Definition at line 1222 of file seqdbisam.cpp.

References eNoError, and x_NumericSearch().

Referenced by IdToOid(), and PigToOid().

◆ x_InitSearch()

Initialize the search object.

The first identifier search sets up the object by calling this function, which reads the metadata from the index file and sets all the fields needed for ISAM lookups.

Parameters
locked The lock holder object for this thread.
Returns
A non-zero error on failure, or eNoError on success.

Definition at line 59 of file seqdbisam.cpp.

References eBadType, eBadVersion, eNoError, eNumeric, eNumericLongId, eWrongFile, CSeqDBFileMemMap::GetFileDataPtr(), CSeqDBAtlas::GetFileSizeL(), ISAM_VERSION, m_Atlas, m_DataFileLength, m_DataFname, m_IdxOption, m_IndexFileLength, m_IndexFname, m_IndexLease, m_Initialized, m_KeySampleOffset, m_LongId, m_MaxLineSize, m_NumSamples, m_NumTerms, m_PageSize, m_TermSize, m_Type, MEMORY_ONLY_PAGE_SIZE, and SeqDB_GetStdOrd().

Referenced by CSeqDBIsam().

◆ x_LoadData() [1/3]

◆ x_LoadData() [2/3]

◆ x_LoadData() [3/3]

◆ x_LoadIndex() [1/3]

◆ x_LoadIndex() [2/3]

◆ x_LoadIndex() [3/3]

◆ x_LoadPage() void CSeqDBIsam::x_LoadPage ( TIndx SampleNum1, TIndx SampleNum2, const char ** beginp, const char ** endp ) private

Map a page into memory.

Given two indices, this method maps into memory the area starting at the beginning of the first index and extending to the end of the other. (If the indices are equal, only one page would be mapped.)

Parameters
SampleNum1  The first page index.
SampleNum2  The second page index.
beginp  The returned starting offset of the mapped area.
endp  The returned ending offset of the mapped area.
locked  This thread's lock holder object.

Definition at line 899 of file seqdbisam.cpp.

References _ASSERT, CSeqDBFileMemMap::GetFileDataPtr(), m_DataFname, m_DataLease, m_IndexLease, m_KeySampleOffset, and SeqDB_GetStdOrd().

Referenced by x_ExtractAllData(), x_FindIndexBounds(), and x_StringSearch().

◆ x_LoadStringData() void CSeqDBIsam::x_LoadStringData ( const char * begin, string & key, int & data ) inline private

◆ x_Lower()

◆ x_MakeFilenames() void CSeqDBIsam::x_MakeFilenames ( const string & dbname, char prot_nucl, char file_ext_char, string & index_name, string & data_name ) static private

Make filenames for ISAM file.

Parameters
dbname  Base name of the database volume. [in]
prot_nucl  'p' or 'n' for protein or nucleotide. [in]
file_ext_char  Identifier symbol; 's' for string, etc. [in]
index_name  Filename of ISAM index file. [out]
data_name  Filename of ISAM data file. [out]

Definition at line 1173 of file seqdbisam.cpp.

References dbname(), isalpha(), and NCBI_THROW.

Referenced by CSeqDBIsam(), and IndexExists().

◆ x_MapDataPage() void CSeqDBIsam::x_MapDataPage ( int sample_index, int & start, int & num_elements, const void ** data_page_begin ) inline private

◆ x_NumericSearch()

Numeric identifier lookup.

Given a numeric identifier, this routine finds the OID.

Parameters
Number  The GI or PIG identifier to look up.
Data  The returned OID.
Index  The returned location in the ISAM table, or NULL.
locked  The lock holder object for this thread.
Returns
A non-zero error on failure, or eNoError on success.
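
The reference list for this method shows that it combines an index search with a data search. The sketch below models that two-level ISAM lookup in memory: sampled keys select a page, then the page itself is searched. `Entry` and `NumericSearch` are hypothetical, the samples are rebuilt on each call only to keep the example self-contained, and a `bool` replaces the EErrorCode result.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Entry { std::int64_t key; int oid; };

// Hypothetical in-memory model: 'data' is sorted by key; the sample index
// holds the first key of every page of page_size entries.
bool NumericSearch(const std::vector<Entry>& data, std::size_t page_size,
                   std::int64_t key, int& oid)
{
    if (data.empty() || page_size == 0) return false;

    // A real index file stores these samples; they are rebuilt here for brevity.
    std::vector<std::int64_t> samples;
    for (std::size_t i = 0; i < data.size(); i += page_size) {
        samples.push_back(data[i].key);
    }

    // Step 1 (index search): find the last sample that is <= key.
    auto it = std::upper_bound(samples.begin(), samples.end(), key);
    if (it == samples.begin()) return false;  // key is below the smallest key
    const std::size_t page = static_cast<std::size_t>(it - samples.begin()) - 1;

    // Step 2 (data search): binary search inside that page only.
    const std::size_t lo = page * page_size;
    const std::size_t hi = std::min(data.size(), lo + page_size);
    auto first = data.begin() + static_cast<std::ptrdiff_t>(lo);
    auto last  = data.begin() + static_cast<std::ptrdiff_t>(hi);
    auto hit = std::lower_bound(first, last, key,
        [](const Entry& e, std::int64_t k) { return e.key < k; });
    if (hit == last || hit->key != key) return false;

    oid = hit->oid;
    return true;
}
```

The point of the split is that only the small sample table and one data page need to be touched per lookup.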

Definition at line 498 of file seqdbisam.cpp.

References done, x_SearchDataNumeric(), and x_SearchIndexNumeric().

Referenced by x_IdentToOid().

◆ x_OutOfBounds() [1/2] bool CSeqDBIsam::x_OutOfBounds ( Int8 key ) private

◆ x_OutOfBounds() [2/2]

◆ x_SearchDataNumeric()

Data file search.

Given a numeric identifier, this routine finds the OID in the data file.

Parameters
Number  The GI or PIG identifier to look up.
Data  The returned OID.
Index  The returned location in the ISAM table, or NULL.
SampleNum  The location of the page in the data file to search.
locked  The lock holder object for this thread.
Returns
A non-zero error on failure, or eNoError on success.

Definition at line 421 of file seqdbisam.cpp.

References _ASSERT, eNoError, eNotFound, eNumericNoData, first(), CSeqDBFileMemMap::GetFileDataPtr(), last(), m_DataFname, m_DataLease, m_TermSize, m_Type, NULL, x_GetNumericData(), x_GetNumericKey(), and x_GetPageNumElements().

Referenced by x_NumericSearch().

◆ x_SearchIndexNumeric()

Index file search.

Given a numeric identifier, this routine finds the OID or the page in the data file where the OID can be found.

Parameters
Number The GI or PIG identifier to look up.
Data The returned OID.
Index The returned location in the ISAM table, or NULL.
SampleNum The returned location in the data file if not done.
done true if the OID was found.
locked
Returns
A non-zero error on failure, or eNoError on success.

Definition at line 140 of file seqdbisam.cpp.

References _ASSERT, done, eInitFailed, eNoError, eNotFound, eNumericNoData, CSeqDBFileMemMap::GetFileDataPtr(), m_IndexFname, m_IndexLease, m_Initialized, m_KeySampleOffset, m_NumSamples, m_PageSize, m_TermSize, m_Type, NULL, x_GetNumericData(), x_GetNumericKey(), and x_OutOfBounds().

Referenced by x_NumericSearch().
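
The two-stage lookup described for x_NumericSearch(), x_SearchDataNumeric(), and x_SearchIndexNumeric() can be sketched with an in-memory model. This is an illustration only: `ToyNumericIsam`, its members, and the choice of sampling the first entry of each page are assumptions, and the real class reads memory-mapped files and returns error codes such as eNoError and eNotFound rather than a bool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Entry = std::pair<std::int64_t, int>;   // (numeric ID, OID)

// Toy in-memory model of a numeric ISAM file pair (not the on-disk format).
class ToyNumericIsam {
public:
    ToyNumericIsam(std::vector<Entry> sorted_entries, std::size_t page_size)
        : m_Data(std::move(sorted_entries)), m_PageSize(page_size)
    {
        assert(m_PageSize > 0);
        // "Index file": the first entry of every data page.
        for (std::size_t i = 0; i < m_Data.size(); i += m_PageSize) {
            m_Samples.push_back(m_Data[i]);
        }
    }

    // Returns true and sets oid if key is present.
    bool Find(std::int64_t key, int& oid) const
    {
        // Bounds check, in the spirit of x_OutOfBounds().
        if (m_Data.empty() ||
            key < m_Data.front().first || key > m_Data.back().first) {
            return false;
        }

        // Stage 1 (index search): last sample whose key is <= the search key.
        auto after = std::upper_bound(
            m_Samples.begin(), m_Samples.end(), key,
            [](std::int64_t k, const Entry& e) { return k < e.first; });
        const std::size_t page =
            static_cast<std::size_t>(after - m_Samples.begin()) - 1;
        if (m_Samples[page].first == key) {
            oid = m_Samples[page].second;    // found in the index: done
            return true;
        }

        // Stage 2 (data search): binary search inside the selected page.
        const auto first = m_Data.begin() +
            static_cast<std::ptrdiff_t>(page * m_PageSize);
        const auto last = m_Data.begin() + static_cast<std::ptrdiff_t>(
            std::min((page + 1) * m_PageSize, m_Data.size()));
        auto hit = std::lower_bound(
            first, last, key,
            [](const Entry& e, std::int64_t k) { return e.first < k; });
        if (hit != last && hit->first == key) {
            oid = hit->second;
            return true;
        }
        return false;
    }

private:
    std::vector<Entry> m_Data;      // all entries, sorted by ID
    std::vector<Entry> m_Samples;   // one entry per page
    std::size_t        m_PageSize;
};
```

The point of the split is that only the small sample table has to be searched before a single data page is touched, which mirrors the "done" and "SampleNum" outputs documented for the index search.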

◆ x_SearchNegativeMulti()

Negative ID List Translation.

Given a Negative ID list, this routine turns on the bits for the OIDs found in the volume but not in the negated ID list.

Parameters
vol_start The starting OID for this ISAM file's database volume.
vol_end The ending OID for this ISAM file's database volume.
gis The Negative ID list to translate.
use_tis Iterate over TIs if true (GIs otherwise).
locked The lock holder object for this thread.

Definition at line 219 of file seqdbisam.cpp.

References _ASSERT, CSeqDBNegativeList::AddIncludedOid(), CSeqDBNegativeList::AddVisibleOid(), eNumericNoData, CSeqDBNegativeList::GetNumGis(), CSeqDBNegativeList::GetNumTis(), i, m_Initialized, m_NumSamples, m_Type, NCBI_THROW, x_FindInNegativeList(), x_GetDataElement(), and x_MapDataPage().

Referenced by IdsToOids().
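
As a rough illustration of negative-list translation, the sketch below marks an OID as included when at least one of its IDs is absent from the negative list. The function name `IncludedOids`, the plain vectors, and the assumption that the ISAM data stores volume-local OIDs that are offset by vol_start are all hypothetical; the real method works page by page and records results through CSeqDBNegativeList::AddIncludedOid() and AddVisibleOid().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns the global OIDs in [vol_start, vol_end) that have at least one ID
// absent from negative_ids. isam_entries holds (ID, volume-local OID) pairs.
std::vector<int> IncludedOids(
    int vol_start, int vol_end,
    const std::vector<std::pair<std::int64_t, int>>& isam_entries,
    std::vector<std::int64_t> negative_ids)
{
    assert(vol_start <= vol_end);
    std::sort(negative_ids.begin(), negative_ids.end());

    const int num_oids = vol_end - vol_start;
    std::vector<bool> included(static_cast<std::size_t>(num_oids), false);

    for (const auto& entry : isam_entries) {
        const int local_oid = entry.second;
        if (local_oid < 0 || local_oid >= num_oids) {
            continue;   // ignore entries outside this volume's range
        }
        // Turn the bit on only when the ID is NOT in the negative list.
        if (!std::binary_search(negative_ids.begin(), negative_ids.end(),
                                entry.first)) {
            included[static_cast<std::size_t>(local_oid)] = true;
        }
    }

    std::vector<int> result;
    for (int i = 0; i < num_oids; ++i) {
        if (included[static_cast<std::size_t>(i)]) {
            result.push_back(vol_start + i);
        }
    }
    return result;
}
```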

◆ x_SearchNegativeMultiSeq()

Definition at line 333 of file seqdbisam.cpp.

References CSeqDBNegativeList::AddIncludedOid(), CSeqDBNegativeList::AddVisibleOid(), i, CSeqDBNegativeList::ListSize(), m_DataLease, m_IndexLease, m_Initialized, m_NumSamples, m_NumTerms, m_PageSize, NCBI_THROW, s_IsSameAccession(), x_FindInNegativeList(), x_LoadData(), and x_LoadIndex().

Referenced by IdsToOids().

◆ x_SparseStringToOids()

Lookup a string in a sparse table.

This is intended to perform string lookup in a sparse string table. The lookup is not implemented, because there are currently no examples of this kind of table to test against.

Parameters
acc The string to look up.
oids The returned oids found by the search.
adjusted Whether the key was changed by the identifier simplification logic.
locked The lock holder object for this thread.
Returns
true if results were found

Definition at line 1378 of file seqdbisam.cpp.

References _TROUBLE.

◆ x_StringSearch()

String identifier lookup.

Given a string identifier, this routine finds the OID(s).

Parameters
term_in The string identifier to look up.
term_out The returned keys (as strings).
value_out The returned oids (as strings).
index_out The locations where the matches were found.
locked The lock holder object for this thread.
Returns
A non-zero error on failure, or eNoError on success.

Definition at line 934 of file seqdbisam.cpp.

References NStr::CompareNocase(), eInitFailed, eNoError, eNotFound, CSeqDBFileMemMap::GetFileDataPtr(), int, m_IndexFileLength, m_IndexLease, m_Initialized, m_KeySampleOffset, m_MaxLineSize, m_NumSamples, m_PageSize, MEMORY_ONLY_PAGE_SIZE, tolower(), x_DiffSample(), x_ExtractAllData(), x_ExtractPageData(), x_GetIndexKeyOffset(), x_GetIndexString(), x_LoadPage(), and x_OutOfBounds().

Referenced by HashToOids(), and StringToOids().

◆ x_TestNumericSample()

Test a sample key value from a numeric index.

This method reads the key value of an index file sample element from a numeric index file. The calling code should ensure that the data is mapped in and that the file type is correct. The key value found is compared to the search key. This method returns 0 for an exact match, -1 if the search key is less than the sample key, or 1 if the search key is greater. If the match is exact, it also returns the data in data_out.

Parameters
index_lease The memory lease to use with the index file.
index The index of the sample to get.
key_in The key for which the user is searching.
key_out The key found will be returned here.
data_out If an exact match, the data found will be returned here.
Returns
-1, 0, or 1 when key_in is less than, equal to, or greater than key_out, respectively.

Definition at line 1284 of file seqdbisam.hpp.

References CSeqDBFileMemMap::GetFileDataPtr(), m_KeySampleOffset, m_TermSize, x_GetNumericData(), and x_GetNumericKey().
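
The three-way comparison contract described above can be sketched as follows. The free function `TestNumericSample` and the in-memory `Sample` vector are assumptions standing in for the memory-mapped index file; the real method decodes the key and data from the file via x_GetNumericKey() and x_GetNumericData().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Sample = std::pair<std::int64_t, int>;   // (sample key, sample data)

// Compares key_in with the key of samples[index].
// Returns -1, 0, or 1; data_out is written only on an exact match.
int TestNumericSample(const std::vector<Sample>& samples,
                      std::size_t                index,
                      std::int64_t               key_in,
                      std::int64_t&              key_out,
                      int&                       data_out)
{
    assert(index < samples.size());
    key_out = samples[index].first;

    if (key_in < key_out) {
        return -1;
    }
    if (key_in > key_out) {
        return 1;
    }
    data_out = samples[index].second;
    return 0;
}
```

A binary search over the samples can use the return value directly to decide which half to keep, and can stop early when it sees 0.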

◆ x_TranslateGiList()

GiList Translation.

Given a GI list, this routine finds the OID for each ID in the list not already having a translation.

Parameters
vol_start The starting OID for this ISAM file's database volume.
gis The GI list to translate.
locked The lock holder object for this thread.

Definition at line 549 of file seqdbisam.hpp.

References CSeqDBGiList::eGi, CSeqDBGiList::GetKey(), CSeqDBGiList::GetSize(), CSeqDBGiList::InsureOrder(), m_DataLease, m_IndexLease, m_Initialized, m_NumSamples, m_NumTerms, m_PageSize, NCBI_THROW, T, x_LoadData(), and x_LoadIndex().
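
One way to picture GI list translation is a merge of two sorted sequences: the GI list (sorted first, as CSeqDBGiList::InsureOrder() suggests) and the ISAM entries. The sketch below is a plain linear merge under stated assumptions, not the Toolkit algorithm: `GiOid`, `TranslateGiList`, the -1 marker for "not yet translated", and the vol_start offset applied to volume-local OIDs are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct GiOid {
    std::int64_t gi;
    int          oid;   // -1 means "not translated yet" (assumed marker)
};

using IsamEntry = std::pair<std::int64_t, int>;   // (GI, volume-local OID)

// Fills in the OID of every untranslated GI that occurs in isam_entries.
void TranslateGiList(int                           vol_start,
                     const std::vector<IsamEntry>& isam_entries,
                     std::vector<GiOid>&           gis)
{
    assert(std::is_sorted(
        isam_entries.begin(), isam_entries.end(),
        [](const IsamEntry& a, const IsamEntry& b) {
            return a.first < b.first;
        }));

    // Sort the list by GI so both sequences can be walked in one pass.
    std::sort(gis.begin(), gis.end(),
              [](const GiOid& a, const GiOid& b) { return a.gi < b.gi; });

    std::size_t i = 0;   // position in the GI list
    std::size_t j = 0;   // position in the ISAM entries
    while (i < gis.size() && j < isam_entries.size()) {
        if (gis[i].gi < isam_entries[j].first) {
            ++i;                       // GI not present in this volume
        } else if (isam_entries[j].first < gis[i].gi) {
            ++j;                       // ISAM entry not requested
        } else {
            if (gis[i].oid == -1) {    // keep existing translations
                gis[i].oid = vol_start + isam_entries[j].second;
            }
            ++i;                       // keep j: the list may repeat a GI
        }
    }
}
```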

◆ m_Atlas

◆ m_DataFileLength

TIndx CSeqDBIsam::m_DataFileLength private

◆ m_DataFname

string CSeqDBIsam::m_DataFname private

◆ m_DataLease

◆ m_FileStart

char* CSeqDBIsam::m_FileStart private

◆ m_FirstKey

◆ m_FirstOffset

Int4 CSeqDBIsam::m_FirstOffset private

◆ m_IdentType

◆ m_IdxOption

Int4 CSeqDBIsam::m_IdxOption private

◆ m_IndexFileLength

TIndx CSeqDBIsam::m_IndexFileLength private

◆ m_IndexFname

string CSeqDBIsam::m_IndexFname private

◆ m_IndexLease

A persistent lease on the ISAM index file.

Definition at line 1186 of file seqdbisam.hpp.

Referenced by CSeqDBIsam(), UnLease(), x_DiffSample(), x_GetIndexKeyOffset(), x_GetIndexString(), x_InitSearch(), x_LoadPage(), x_SearchIndexNumeric(), x_SearchNegativeMultiSeq(), x_StringSearch(), and x_TranslateGiList().

◆ m_Initialized

bool CSeqDBIsam::m_Initialized private

◆ m_KeySampleOffset

TIndx CSeqDBIsam::m_KeySampleOffset private

◆ m_LastKey

◆ m_LastOffset

Int4 CSeqDBIsam::m_LastOffset private

◆ m_LongId

bool CSeqDBIsam::m_LongId private

◆ m_MaxLineSize

Int4 CSeqDBIsam::m_MaxLineSize private

◆ m_NumSamples

Int4 CSeqDBIsam::m_NumSamples private

Number of terms in ISAM index.

Definition at line 1212 of file seqdbisam.hpp.

Referenced by x_DiffSample(), x_ExtractAllData(), x_FindIndexBounds(), x_GetPageNumElements(), x_InitSearch(), x_LoadIndex(), x_SearchIndexNumeric(), x_SearchNegativeMulti(), x_SearchNegativeMultiSeq(), x_StringSearch(), and x_TranslateGiList().

◆ m_NumTerms

Int4 CSeqDBIsam::m_NumTerms private

◆ m_PageSize

Int4 CSeqDBIsam::m_PageSize private

◆ m_TermSize

int CSeqDBIsam::m_TermSize private

◆ m_TestNonUnique

bool CSeqDBIsam::m_TestNonUnique private

◆ m_Type

The documentation for this class was generated from the following files:

