3.3. Utils
3.3.1. GSGW
-
class pymzml.utils.GSGW.GSGW(file=None, max_idx=10000, max_idx_len=8, max_offset_len=8, output_path='./test.dat.igzip', comp_str=-1)
Generalized Gzip writer class with random access to indexed offsets.
Keyword Arguments: - file (string) – Filename for the resulting file
- max_idx (int) – max number of indices which can be saved in this file
- max_idx_len (int) – maximal length of the index in bytes, must be between 1 and 255
- max_offset_len (int) – maximal length of the offset in bytes
- output_path (str) – path to the output file
-
_allocate_index_bytes()
Allocate ‘self.max_index_num’ bytes of length ‘self.max_idx_len’ in the header for inserting the index later on.
-
_write_gen_header(Index=False, FLAGS=None)
Write a valid gzip header with creation time, user-defined flag fields and allocated index.
Keyword Arguments: - Index (bool) – whether or not to write an index into this header.
- FLAGS (list, optional) – list of flags (FTEXT, FHCRC, FEXTRA, FNAME) to set for this header.
Returns: byte offset of the file pointer
Return type: offset (int)
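The header layout described above can be illustrated with the standard library alone. The following is a minimal sketch of a gzip member header as defined by RFC 1952, with the optional comment field (FCOMMENT) used as the place to store extra bytes such as an index. The function names `gzip_header` and `gzip_member` are hypothetical and are not part of pymzML; this is not the GSGW implementation.

```python
import gzip
import struct
import time
import zlib

# Flag bits of the FLG byte as defined in RFC 1952.
FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT = 1, 2, 4, 8, 16


def gzip_header(comment=b"", mtime=None):
    """Build a gzip header; hypothetical helper, not pymzML API."""
    flags = FCOMMENT if comment else 0
    mtime = int(time.time()) if mtime is None else mtime
    # ID1, ID2, CM (8 = deflate), FLG, MTIME (little endian), XFL, OS (255 = unknown)
    header = struct.pack("<BBBBIBB", 0x1F, 0x8B, 8, flags, mtime, 0, 255)
    if comment:
        # The comment field is zero-terminated and must not contain NUL bytes.
        header += comment + b"\x00"
    return header


def gzip_member(data, comment=b""):
    """Wrap raw deflate output in a header and the CRC32/size trailer."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits: raw deflate stream
    body = compressor.compress(data) + compressor.flush()
    trailer = struct.pack("<II", zlib.crc32(data) & 0xFFFFFFFF, len(data) & 0xFFFFFFFF)
    return gzip_header(comment) + body + trailer
```

A member built this way remains readable by any standard gzip reader, which skips the comment field; this is why reserved comment bytes are a convenient place for an index.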
-
_write_identifier(identifier)
Convert and write the identifier into the output file.
Parameters: identifier (str or int) – identifier to write into index
-
_write_offset(offset)
Convert and write the offset to the output file.
Parameters: offset (int) – offset which will be formatted and written into file index
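To make the role of `max_idx_len` and `max_offset_len` concrete, here is a minimal sketch of a fixed-width index entry. The padding scheme (spaces for identifiers, zeros for offsets) and the helper names `encode_entry` and `decode_entry` are assumptions chosen for illustration; the actual byte format used by GSGW may differ.

```python
def encode_entry(identifier, offset, max_idx_len=8, max_offset_len=8):
    """Pack one index entry into max_idx_len + max_offset_len bytes."""
    ident = str(identifier).encode("ascii")
    off = str(offset).encode("ascii")
    if len(ident) > max_idx_len:
        raise ValueError("identifier does not fit into max_idx_len bytes")
    if len(off) > max_offset_len:
        raise ValueError("offset does not fit into max_offset_len bytes")
    # Assumed padding scheme: spaces for identifiers, zeros for offsets.
    return ident.rjust(max_idx_len, b" ") + off.rjust(max_offset_len, b"0")


def decode_entry(entry, max_idx_len=8):
    """Split a fixed-width entry back into (identifier, offset)."""
    ident = entry[:max_idx_len].lstrip(b" ").decode("ascii")
    offset = int(entry[max_idx_len:].decode("ascii"))
    return ident, offset
```

Because every entry has the same width, the space for `max_idx` entries can be reserved up front as `max_idx * (max_idx_len + max_offset_len)` bytes and filled in once all offsets are known.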
-
add_data(data, identifier)
Create a new gzip member with compressed ‘data’, indexed with ‘identifier’.
Parameters: - data (str) – uncompressed data to write to file
- identifier (str or int) – unique identifier for the data
-
encoding
Returns the encoding used for this file
-
file_out
Output file handler
-
write_index()
Only called after all the data is written, i.e. after all calls to add_data() have been made. Seeks back to the beginning of the file and writes the index into the allocated comment bytes (see _write_gen_header(Index=True)).
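The general technique behind an indexed gzip file can be sketched with the standard library: each record is written as its own gzip member, the byte offset of every member is recorded, and a reader later seeks to an offset and decompresses only that member. The helpers `write_members` and `read_member` below are hypothetical and keep the index in a Python dict rather than in the file header, so this is an illustration of the idea and not the GSGW implementation.

```python
import gzip
import io
import zlib


def write_members(stream, records):
    """Write each (identifier, text) pair as its own gzip member.

    Returns a dict mapping identifier to the member's byte offset.
    """
    index = {}
    for identifier, data in records:
        index[identifier] = stream.tell()
        stream.write(gzip.compress(data.encode("utf-8")))
    return index


def read_member(stream, offset):
    """Seek to offset and decompress exactly one gzip member."""
    stream.seek(offset)
    decompressor = zlib.decompressobj(wbits=31)  # 31: expect gzip header and trailer
    chunks = []
    while not decompressor.eof:
        block = stream.read(4096)
        if not block:
            break
        chunks.append(decompressor.decompress(block))
    return b"".join(chunks).decode("utf-8")


# Usage: three records, random access to the second one.
buffer = io.BytesIO()
offsets = write_members(buffer, [(1, "first"), (2, "second"), (3, "third")])
second = read_member(buffer, offsets[2])
```

Concatenated members still form a valid gzip stream, so the whole file can also be decompressed sequentially by ordinary tools.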
3.3.2. GSGR
-
class pymzml.utils.GSGR.GSGR(file=None)
Generalized Gzip reader class which enables random access in files written with the GSGW class.
Keyword Arguments: file (str) – path to file to read
-
_read_basic_header()
Read and save the compression method, bit flags, modification time, compression speed and OS.
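The fields listed above correspond to the fixed 10-byte part of a gzip header as defined by RFC 1952. A minimal sketch of parsing them with `struct` follows; the function name `read_basic_header` and the dictionary keys are chosen for illustration and do not reflect how GSGR stores these values.

```python
import gzip
import struct


def read_basic_header(raw):
    """Parse the fixed 10-byte gzip header from the start of raw bytes."""
    id1, id2, method, flags, mtime, xfl, os_id = struct.unpack("<BBBBIBB", raw[:10])
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip stream")
    return {
        "method": method,        # 8 means deflate
        "flags": flags,          # bit field: FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT
        "mtime": mtime,          # modification time as a Unix timestamp
        "extra_flags": xfl,      # hints at the compression level/speed used
        "os": os_id,             # operating system identifier
    }


# Usage: inspect the header of a freshly compressed payload.
header = read_basic_header(gzip.compress(b"payload", mtime=12345))
```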
-
read(size=-1)
Read the content of the input file in binary mode.
Keyword Arguments: size (int, optional) – number of bytes to read, -1 for everything
Returns: parsed bytes from input file
Return type: data (bytes)
-
Example:
    class SQLiteDatabase(object):
        """
        Example implementation of a database Connector,
        which can be used to make run accept paths to
        sqlite db files.
        """
Example:
    def _open(self, path):
        if path.endswith('.gz'):
            if self._indexed_gzip(path):
                self.file_handler = indexedGzip.IndexedGzip(path, self.encoding)
            else:
                self.file_handler = standardGzip.StandardGzip(path, self.encoding)
        # Insert a new condition to enable your new fileclass
        elif path.endswith('.db'):
            self.file_handler = utils.SQLiteConnector.SQLiteDatabase(path, self.encoding)
        else:
            self.file_handler = standardMzml.StandardMzml(path, self.encoding)
        return self.file_handler