X-Hacker.org - SIx Driver RDD v3.00 - Reference Guide - HiPer-SEEK Index Creation

  HiPer-SEEK Index Creation:

  hs_Create() creates the index file header and establishes certain
  features of the index file including the file path/name, the size of a
  memory buffer, the size of the index records, case sensitivity and the
  selection of a filter. It does not make any entries. Index records must
  be added using hs_Add().  HiPer-SEEK index files may be given any legal
  name but are usually given the extension .HSX.

  The .HSX creation process normally consists of three steps: creating the
  file, adding index records and then closing it. Closing and then
  reopening the .HSX file allows the memory buffer to be resized and the
  open mode to be set to shareable.  Creation is always done in exclusive
  mode.

  A memory buffer is used for hs_Create(), hs_Filter(), and hs_Open().
  These three functions are the only ones requiring memory.  With
  hs_Create() and hs_Open(), the allocated memory is released by
  hs_Close().  The hs_Filter() function releases the allocated memory
  automatically as soon as it returns. In general, the more space allocated,
  the better the performance.  This is especially true if the entire .HSX
  index file can fit into the memory buffer.  The buffer size can be
  changed each time hs_Create(), hs_Filter(), or hs_Open() is called.
  Buffer sizes from 5K to 20K are usually the most reasonable.  If the .HSX
  file is opened shareable, the memory buffer is automatically set to 1K.
  Both hs_Create() and hs_Open() return a handle that is used to identify
  the particular index file.

  HiPer-SEEK Index Key Sizes:

  The size of the index keys is also established by hs_Create(). The
  choices are 16, 32 or 64 bytes. This size factor is the amount of space
  used by each index record within the .HSX index file. The key size is a
  critical factor in determining HiPer-SEEK performance. A ratio of text
  record size to index record size greater than 50 to 1 will usually result
  in excessive aliases.  The trade-off is between file size and alias rate.

  Choosing the next larger size index key will double the size of the .HSX
  index file. Because the index file must be read to complete a search,
  larger keys increase the time taken by the best-case search. Excessive
  aliases, on the other hand, are very costly because each alias takes
  processing time to verify.

  With a key size of 64 bytes, the 50 to 1 rule generally limits text
  record size to 3200 characters. This is not an absolute. The nature of
  the text itself and the length of search strings also play very important
  roles in overall performance. It is usually worthwhile to experiment and
  test different index sizes with real data.
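
  The sizing rules above can be worked through as plain arithmetic. The
  following is a minimal sketch in Python, not part of the SIx Driver API.
  The function names are hypothetical, and the file size estimate ignores
  the index file header, whose size is not stated here.

```python
# Illustrative arithmetic only; these names are hypothetical and are not
# part of the SIx Driver API.
KEY_SIZES = (16, 32, 64)   # key sizes accepted by hs_Create(), in bytes
MAX_RATIO = 50             # rule-of-thumb limit on text size : key size


def max_text_size(key_size):
    """Largest text record size the 50-to-1 rule suggests for a key size."""
    return key_size * MAX_RATIO


def smallest_key_size(text_size):
    """Smallest key size that keeps the ratio at or below 50 to 1.

    Returns None when even the 64-byte key exceeds the guideline.
    """
    for key_size in KEY_SIZES:
        if text_size <= max_text_size(key_size):
            return key_size
    return None


def index_data_bytes(record_count, key_size):
    """Approximate space taken by the index records (header not included)."""
    return record_count * key_size


if __name__ == "__main__":
    print(max_text_size(64))              # 3200, matching the text above
    print(smallest_key_size(1000))        # 32
    print(index_data_bytes(10000, 32))    # 320000
    print(index_data_bytes(10000, 64))    # 640000, double the 32-byte case
```

  The guideline is only a starting point. As noted above, the text itself
  and the length of the search strings also matter, so the result should
  be checked against real data.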

  Case Sensitivity:

  Case sensitivity is determined when the index is created. When upper and
  lower case are distinguished within the index file, the number of
  signatures is effectively doubled. It is recommended that, unless
  absolutely necessary, upper and lower case be treated the same. It is
  still possible to add case sensitivity to the verification process and
  allow the user to specify whether a particular search is to be case
  specific.

  The use of a filter is also specified at create time.  This filter is a
  translation of characters from the source text records to the index.
  Most applications will use the filter to mask off the high-order bit and
  treat all non-printing characters -- such as line-feeds and tabs -- as
  spaces.  Some non-English character sets which use the high-order bit
  require that the filter not be used.
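
  The effect of the filter and of case folding can be modelled in a few
  lines. The sketch below is written in Python and is an assumption about
  the behaviour described here, not the library's actual translation
  table. The function name normalize() is hypothetical, and folding to
  upper case is only one way to treat the two cases the same.

```python
def normalize(text, case_sensitive=False, use_filter=True):
    """Rough model of the character translation described in the text.

    `text` is a bytes object. This mirrors only the behaviours the guide
    mentions; the real filter is internal to the library.
    """
    out = bytearray()
    for byte in text:
        if use_filter:
            byte &= 0x7F                     # mask off the high-order bit
            if byte < 0x20 or byte == 0x7F:  # tab, line feed, other controls
                byte = 0x20                  # treated as a space
        out.append(byte)
    result = bytes(out)
    # Folding to one case is an illustrative choice for "treated the same".
    return result if case_sensitive else result.upper()


if __name__ == "__main__":
    print(normalize(b"Hi\tthere\n"))                  # b'HI THERE '
    print(normalize(b"\xc1bc"))                       # b'ABC'
    print(normalize(b"\xc1bc", use_filter=False))     # b'\xc1BC'
```

  The second call shows why some character sets cannot use the filter:
  the byte 0xC1 loses its high-order bit and becomes indistinguishable
  from 0x41 ("A").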

  Adding Keys:

  As noted above, hs_Create() does not add records to the .HSX index. It
  only creates the structure and sets the various options of the index
  file. hs_Add() is called to add each index record to the file. It
  receives a handle and a character string and returns a number
  representing the index record number. The first addition to an index
  file returns 1, the second returns 2, and so on. It is up to the
  application to know which blocks of text these index record numbers
  represent. Using the examples given earlier, they can represent physical
  record numbers of a structured file, line numbers or offsets within a
  file, or even entire files.

  Additions are always made to the end of the index file. If a source text
  record is edited, hs_Replace() is used to update the appropriate index
  record. Both hs_Add() and hs_Replace() perform a single index record
  operation so that the index file can be quickly and easily maintained.
  There is usually no need to rebuild the entire index when changes to the
  source text are made.
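
  The numbering and maintenance rules can be illustrated with a small
  in-memory model. The ToyIndex class below is a hypothetical Python
  stand-in, not the library. It stores the text itself rather than
  signatures and only mimics the append-only, 1-based numbering and the
  single-record replace described above.

```python
class ToyIndex:
    """In-memory stand-in for an .HSX file (hypothetical, illustration only)."""

    def __init__(self):
        self._records = []

    def add(self, text):
        """Append a record and return its 1-based number, as hs_Add() does."""
        self._records.append(text)
        return len(self._records)

    def replace(self, record_no, text):
        """Overwrite one existing record in place, in the role of hs_Replace()."""
        if not 1 <= record_no <= len(self._records):
            raise IndexError("no such index record: %d" % record_no)
        self._records[record_no - 1] = text

    def key_count(self):
        """Number of index records, in the role of hs_KeyCount()."""
        return len(self._records)

    def get(self, record_no):
        """Return the stored text for a record number (toy helper)."""
        return self._records[record_no - 1]


if __name__ == "__main__":
    lines = ["first line", "second line", "third line"]
    index = ToyIndex()
    # Indexing line by line makes index record n correspond to line n.
    numbers = [index.add(line) for line in lines]
    print(numbers)                 # [1, 2, 3]
    # Editing line 2 touches one index record; nothing is rebuilt.
    index.replace(2, "second line, edited")
    print(index.key_count())       # 3
```

  Because the record number is the only link back to the source text, the
  application has to add blocks in a known order and keep that order
  stable.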

  There are some occasions when the .HSX index file should be rebuilt.  The
  hs_KeyCount() function can be used to compare the number of index records
  with the number of source text blocks, to test whether the one-to-one
  relationship between text records and index records has been corrupted.
  If it has, the .HSX file should be rebuilt.
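
  The check described above amounts to comparing two counts and rebuilding
  on a mismatch. The following Python sketch shows that logic with a plain
  list standing in for the index file. All names are hypothetical; in an
  application the first count would come from hs_KeyCount().

```python
def needs_rebuild(index_key_count, source_block_count):
    """True when the one-to-one relationship is visibly broken."""
    return index_key_count != source_block_count


def rebuild_index(source_blocks):
    """Build a fresh toy index by re-adding every block in order."""
    index = []
    for block in source_blocks:
        index.append(block)   # record number is position + 1
    return index


def check_and_repair(index, source_blocks):
    """Return (index, rebuilt), rebuilding only when the counts differ."""
    if needs_rebuild(len(index), len(source_blocks)):
        return rebuild_index(source_blocks), True
    return index, False


if __name__ == "__main__":
    blocks = ["alpha", "beta", "gamma"]
    stale = ["alpha", "beta"]              # one index record is missing
    index, rebuilt = check_and_repair(stale, blocks)
    print(rebuilt, len(index))             # True 3
```

  Matching counts do not prove that every index record is current. They
  only rule out the case where records were added to one side and not the
  other.
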
Online resources provided by: http://www.X-Hacker.org --- NG 2 HTML conversion by Dave Pearson