Small helper functions that don’t fit anywhere else.
Return a new name that has suffix attached; replaces other extensions.
Changed in version 0.9.0: Also permits NamedStream to pass through.
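The suffix replacement described above can be illustrated with the standard library alone. This is a minimal sketch, not the library's implementation; the name replace_suffix and its signature are hypothetical, and NamedStream pass-through is not modeled.

```python
import os.path

def replace_suffix(name, ext):
    """Return name with its last extension replaced by ext (hypothetical helper)."""
    root, _old_ext = os.path.splitext(name)
    # Accept the new extension with or without a leading dot.
    return root + "." + ext.lstrip(".")

print(replace_suffix("protein.pdb", "xtc"))   # protein.xtc
print(replace_suffix("run1", ".gro"))         # run1.gro
```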
Context manager to open a compressed (bzip2, gzip) or plain file (uses anyopen()).
Open datasource (gzipped, bzipped, uncompressed) and return a stream.
datasource can be a filename or a stream (see isstream()). By default, a stream is reset to its start if possible (via seek() or reset()).
If possible, the attribute stream.name is set to the filename or “<stream>” if no filename could be associated with the datasource.
Returns: stream, which is a file-like object
Changed in version 0.9.0: Only returns the stream and tries to set stream.name = filename instead of the previous behavior to return a tuple (stream, filename).
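One way to picture "open whatever compression the file uses" is to try each decompressor in turn and fall back to a plain open. The sketch below uses only the Python 3 standard library; open_any_sketch is a hypothetical name, and the real function may detect formats differently and also accepts streams, which is not modeled here.

```python
import bz2
import gzip
import os
import tempfile

def open_any_sketch(path):
    """Try bzip2, then gzip, then plain text; return an open text stream."""
    for opener in (bz2.open, gzip.open):
        stream = opener(path, "rt")
        try:
            stream.read(1)      # reading fails if the format does not match
            stream.seek(0)      # rewind so the caller starts at the beginning
            return stream
        except (OSError, EOFError):
            stream.close()
    return open(path, "r")

# Usage: write a small gzip file and read it back without naming the format.
tmpdir = tempfile.mkdtemp()
gz_path = os.path.join(tmpdir, "ion.pdb.gz")
with gzip.open(gz_path, "wt") as out:
    out.write("TITLE Lonely Ion\n")

with open_any_sketch(gz_path) as stream:
    print(stream.readline().strip())   # TITLE Lonely Ion
```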
Split extension in path p at the left-most separator.
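The difference from os.path.splitext(), which splits at the right-most dot, is easiest to see on a name with several extensions. The following is a sketch under assumptions: split_at_first_dot is a hypothetical name, and keeping the directory part in the returned root is a choice made here, not necessarily what the library does.

```python
import os.path

def split_at_first_dot(p):
    """Split p into (root, extension) at the first dot of the file name."""
    head, tail = os.path.split(p)
    # Start searching at index 1 so hidden files such as ".bashrc" keep their name.
    dot = tail.find(".", 1)
    if dot == -1:
        return p, ""
    return os.path.join(head, tail[:dot]), tail[dot:]

print(split_at_first_dot("traj.tar.gz"))    # ('traj', '.tar.gz')
print(os.path.splitext("traj.tar.gz"))      # ('traj.tar', '.gz')
```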
Determine full path of executable program on PATH.
(Jay at http://stackoverflow.com/questions/377017/test-if-executable-exists-in-python)
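The PATH lookup amounts to checking each directory for an executable file of the given name. The sketch below assumes a hypothetical which_sketch function with an optional path argument added for testing; in Python 3.3 and later, shutil.which() in the standard library covers the same need.

```python
import os
import stat
import tempfile

def which_sketch(program, path=None):
    """Return the full path of program found on path, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

# Usage: create a throwaway executable and search only its directory.
bindir = tempfile.mkdtemp()
tool = os.path.join(bindir, "mytool")
with open(tool, "w") as f:
    f.write("#!/bin/sh\n")
os.chmod(tool, stat.S_IRWXU)
print(which_sketch("mytool", path=bindir) == tool)   # True
```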
Many of the readers are not restricted to just reading files. They can also use gzip-compressed or bzip2-compressed files (through the internal use of openany()). It is also possible to provide more general streams as inputs, such as cStringIO.StringIO() instances (essentially, memory buffers), by wrapping these instances in a NamedStream. This NamedStream can then be used in place of an ordinary file name (typically with a MDAnalysis.core.AtomGroup.Universe, but it is also possible to write to such a stream using MDAnalysis.Writer()).
In the following example, we use a PDB stored as a string pdb_s:
import MDAnalysis
from MDAnalysis.core.util import NamedStream
import cStringIO
pdb_s = "TITLE Lonely Ion\nATOM 1 NA NA+ 1 81.260 64.982 10.926 1.00 0.00\n"
u = MDAnalysis.Universe(NamedStream(cStringIO.StringIO(pdb_s), "ion.pdb"))
print(u)
# <Universe with 1 atoms>
print(u.atoms.positions)
# [[ 81.26000214 64.98200226 10.92599964]]
It is important to provide a proper pseudo file name with the correct extension (".pdb") to NamedStream because the file type recognition uses the extension of the file name to determine the file format. Alternatively, provide the format="pdb" keyword argument to the Universe.
The use of streams becomes more interesting when MDAnalysis is used as glue between different analysis packages and when one can arrange things so that intermediate frames (typically in the PDB format) are not written to disk but remain in memory via e.g. cStringIO buffers.
Note
A remote connection created by urllib2.urlopen() is not seekable and therefore will often not work as an input. But try it...
Stream that also provides a (fake) name.
By wrapping a stream in this class, it can be passed to code that inspects the filename to make decisions. For instance, os.path.split() will work correctly on a NamedStream.
The class can be used as a context manager.
NamedStream is derived from io.IOBase (to indicate that it is a stream) and basestring (so that one can use iterable() in the same way as for strings).
Example
Wrap a cStringIO.StringIO() instance to write to:
import cStringIO
import os.path
stream = cStringIO.StringIO()
f = NamedStream(stream, "output.pdb")
print(os.path.splitext(f))
Wrap a file instance to read from:
stream = open("input.pdb")
f = NamedStream(stream, stream.name)
Use as a context manager (closes stream automatically when the with block is left):
with NamedStream(open("input.pdb"), "input.pdb") as f:
    # use f
    print(f.closed)  # --> False
    # ...
print(f.closed)  # --> True
Note
This class uses its own __getitem__() method so if stream implements stream.__getitem__() then that will be masked and this class should not be used.
Warning
By default, NamedStream.close() will not close the stream but instead reset() it to the beginning. [1] Provide the force=True keyword to NamedStream.close() to always close the stream.
Initialize the NamedStream from a stream and give it a name.
The constructor attempts to rewind the stream to the beginning unless the keyword reset is set to False. If rewinding fails, a MDAnalysis.StreamWarning is issued.
Note
By default, this stream will not be closed by with and close() (see there) unless the close keyword is set to True.
New in version 0.9.0.
Reset or close the stream.
If NamedStream.close_stream is set to False (the default) then this method will not close the stream and only reset() it.
If the force=True keyword is provided, the stream will be closed.
Note
This close() method is non-standard. del NamedStream always closes the underlying stream.
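The rewind-instead-of-close behavior can be mimicked in a few lines. This is an illustrative sketch only: RewindOnClose is a hypothetical class, it uses the Python 3 io.StringIO in place of cStringIO, and it leaves out everything else NamedStream does (string behavior, warnings, context-manager support).

```python
import io

class RewindOnClose:
    """Hypothetical wrapper: close() rewinds unless closing is forced."""

    def __init__(self, stream, name, close=False):
        self.stream = stream
        self.name = name
        self.close_stream = close

    def close(self, force=False):
        if self.close_stream or force:
            self.stream.close()
        else:
            self.stream.seek(0)   # rewind so the "file" can be read again

buf = RewindOnClose(io.StringIO("TITLE Lonely Ion\n"), "ion.pdb")
first = buf.stream.readline()
buf.close()                       # default: rewind only
second = buf.stream.readline()    # reads the same line again
print(first == second, buf.stream.closed)   # True False
buf.close(force=True)
print(buf.stream.closed)          # True
```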
Return the underlying file descriptor (an integer) of the stream if it exists.
An IOError is raised if the IO object does not use a file descriptor.
Flush the write buffers of the stream if applicable.
This does nothing for read-only and non-blocking streams. For file objects one also needs to call os.fsync() to write contents to disk.
Return True if the stream can be read from.
If False, read() will raise IOError.
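Change the stream position to the given byte offset.
offset is interpreted relative to the position indicated by whence. Values for whence are:
io.SEEK_SET or 0: start of the stream (the default); offset should be zero or positive
io.SEEK_CUR or 1: current stream position; offset may be negative
io.SEEK_END or 2: end of the stream; offset is usually negative
Returns: the new absolute position.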
Return True if the stream supports random access.
If False, seek(), tell() and truncate() will raise IOError.
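Since these methods follow the standard io interface, their behavior can be tried out on an in-memory io.BytesIO buffer. The example uses only the standard library and does not involve NamedStream itself.

```python
import io

stream = io.BytesIO(b"ATOM  HETATM")
print(stream.seekable())              # True
print(stream.seek(6))                 # 6: absolute offset from the start
print(stream.seek(-2, io.SEEK_CUR))   # 4: two bytes back from position 6
print(stream.seek(-6, io.SEEK_END))   # 6: six bytes before the end
print(stream.read())                  # b'HETATM'
```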
Detect if obj is a stream.
We consider anything a stream that has the methods
and either set of the following
Returns: True if obj is a stream, False otherwise
New in version 0.9.0.
Returns True if obj can be iterated over and is not a string.
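A check of this kind is usually written as "try iter(), but rule out strings first". The sketch below is one possible reading; iterable_sketch is a hypothetical name, and treating bytes like a string is an assumption made here.

```python
def iterable_sketch(obj):
    """Return True if obj can be iterated over and is not a string."""
    if isinstance(obj, (str, bytes)):
        return False
    try:
        iter(obj)
    except TypeError:
        return False
    return True

print(iterable_sketch([1, 2, 3]))   # True
print(iterable_sketch("abc"))       # False
print(iterable_sketch(42))          # False
```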
FORTRANReader provides a method to parse FORTRAN formatted lines in a file.
Usage:
atomformat = FORTRANReader('2I10,2X,A8,2X,A8,3F20.10,2X,A8,2X,A8,F20.10')
for line in open('coordinates.crd'):
    serial,TotRes,resName,name,x,y,z,chainID,resSeq,tempFactor = atomformat.read(line)
Fortran format edit descriptors; see Fortran Formats for the syntax.
Only simple one-character specifiers are supported here: I F E A X (see FORTRAN_format_regex).
Strings are stripped of leading and trailing white space.
Set up the reader with the FORTRAN format string.
The string fmt should look like ‘2I10,2X,A8,2X,A8,3F20.10,2X,A8,2X,A8,F20.10’.
Return how many format entries could be populated with legal values.
Parse the descriptor.
parse_FORTRAN_format(edit_descriptor) --> dict
Returns: dict with totallength (in chars), repeat, length, format, decimals
Raises: ValueError if the edit_descriptor is not recognized and cannot be parsed
Note
Specifiers: L ES EN T TL TR / r S SP SS BN BZ are not supported, and neither are the scientific notation Ew.dEe forms.
Parse line according to the format string and return list of values.
Values are converted to Python types according to the format specifier.
Returns: list of entries with appropriate types
Raises: ValueError if any of the conversions cannot be made (e.g. space for an int)
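Parsing a fixed-width line comes down to slicing columns and converting each slice. The sketch below is a simplified stand-in, not the reader's code: read_fixed_width and its (kind, width) field list are hypothetical, and only the I, F, A and X kinds are handled.

```python
def read_fixed_width(line, fields):
    """Slice line into fixed-width fields and convert each one."""
    converters = {"I": int, "F": float, "A": str.strip}
    values = []
    start = 0
    for kind, width in fields:
        text = line[start:start + width]
        start += width
        if kind == "X":          # X only skips columns
            continue
        values.append(converters[kind](text))
    return values

# Fields roughly equivalent to the format "I5,2X,A4,F8.3".
fields = [("I", 5), ("X", 2), ("A", 4), ("F", 8)]
line = "%5d  %-4s%8.3f" % (12, "NA+", 81.26)
print(read_fixed_width(line, fields))   # [12, 'NA+', 81.26]
```

As in the description above, a blank integer column raises ValueError here because int() cannot convert a string of spaces.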
Regular expression (see re) to parse a simple FORTRAN edit descriptor. (?P<repeat>\d?)(?P<format>[IFELAX])(?P<numfmt>(?P<length>\d+)(\.(?P<decimals>\d+))?)?
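To see what the named groups capture, the pattern quoted above can be compiled and applied to a few descriptors. The variable name FORTRAN_FORMAT is chosen for this example only.

```python
import re

FORTRAN_FORMAT = re.compile(
    r"(?P<repeat>\d?)(?P<format>[IFELAX])"
    r"(?P<numfmt>(?P<length>\d+)(\.(?P<decimals>\d+))?)?"
)

for descriptor in "2I10,2X,A8,3F20.10".split(","):
    m = FORTRAN_FORMAT.match(descriptor)
    print(descriptor, m.group("repeat", "format", "length", "decimals"))
# 2I10 ('2', 'I', '10', None)
# 2X ('2', 'X', None, None)
# A8 ('', 'A', '8', None)
# 3F20.10 ('3', 'F', '20', '10')
```

Note that the repeat group matches at most one digit, so an omitted repeat count shows up as an empty string rather than None.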
Converts between 3-letter and 1-letter amino acid codes.
See also
Data are defined in amino_acid_codes and inverse_aa_codes.
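The conversion is a lookup in one of two dictionaries, chosen by the length of the input. The sketch below uses a four-entry excerpt of the standard codes; the names THREE_TO_ONE, ONE_TO_THREE and convert_aa_sketch are hypothetical, and raising ValueError for unknown codes is an assumption.

```python
THREE_TO_ONE = {"ALA": "A", "GLY": "G", "LYS": "K", "TRP": "W"}   # excerpt only
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}

def convert_aa_sketch(code):
    """Translate a 1-letter code to 3 letters, or a 3-letter code to 1 letter."""
    code = code.upper()
    table = ONE_TO_THREE if len(code) == 1 else THREE_TO_ONE
    try:
        return table[code]
    except KeyError:
        raise ValueError("unknown amino acid code: %r" % code)

print(convert_aa_sketch("K"))     # LYS
print(convert_aa_sketch("gly"))   # G
```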
Process residue string.
Argument: The residue must contain a 1-letter, 3-letter, or 4-letter residue string, a number (the resid), and optionally an atom identifier, which must be separated from the residue by a colon (":"). White space is allowed in between.
Returns: (3-letter aa string, resid, atomname); known 1-letter aa codes are converted to 3-letter codes
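A residue string of this shape can be picked apart with a single regular expression. The following is a sketch under assumptions, not the library's parser: the pattern, the four-entry code table and the name parse_residue_sketch are all hypothetical.

```python
import re

ONE_TO_THREE = {"A": "ALA", "G": "GLY", "K": "LYS", "W": "TRP"}   # excerpt only

RESIDUE = re.compile(r"""
    (?P<aa>[A-Za-z]{3,4}|[A-Za-z])   # 3- or 4-letter name, or a 1-letter code
    \s*(?P<resid>\d+)                # residue number
    \s*(?::\s*(?P<atom>\w+))?        # optional colon and atom name
    \s*$
""", re.VERBOSE)

def parse_residue_sketch(text):
    m = RESIDUE.match(text.strip())
    if m is None:
        raise ValueError("cannot parse residue string: %r" % text)
    aa = m.group("aa").upper()
    aa = ONE_TO_THREE.get(aa, aa)    # expand known 1-letter codes
    return aa, int(m.group("resid")), m.group("atom")

print(parse_residue_sketch("LYS300:HZ1"))   # ('LYS', 300, 'HZ1')
print(parse_residue_sketch("K300"))         # ('LYS', 300, None)
```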
Returns the unit vector normal to two vectors.
If the two vectors are collinear, the vector \(\mathbf{0}\) is returned.
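The normal is the normalized cross product, with the collinear case handled separately because the cross product then has zero length. The sketch below is written in plain Python so that it runs without NumPy; normal_sketch is a hypothetical name, and the exact comparison with zero is a simplification (a small tolerance may be preferable for floating-point input).

```python
import math

def normal_sketch(a, b):
    """Unit vector perpendicular to 3-vectors a and b; zeros if collinear."""
    n = (a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0])
    length = math.sqrt(sum(x * x for x in n))
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(x / length for x in n)

print(normal_sketch((1, 0, 0), (0, 1, 0)))   # (0.0, 0.0, 1.0)
print(normal_sketch((1, 2, 3), (2, 4, 6)))   # (0.0, 0.0, 0.0)
```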
Returns the length of a vector, sqrt(v.v).
Faster than numpy.linalg.norm() because no frills.
Cache a property within a class
Requires the Class to have a cache dict called “_cache”
Usage:
@property
@cached('keyname')
def size(self):
    # This code runs only if the lookup of keyname fails.
    # After this code has run once, the result is stored in
    # _cache with the key: 'keyname'
    size = 10.0
    return size
New in version 0.9.0.
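A decorator with this behavior can be written as a closure over the cache key. The following is a minimal sketch of the idea, assuming only that instances carry a _cache dict as stated above; cached_sketch and the Box class are hypothetical and may differ from the library's code.

```python
import functools

def cached_sketch(key):
    """Store the wrapped method's result in self._cache[key]."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            try:
                return self._cache[key]
            except KeyError:
                self._cache[key] = result = func(self, *args, **kwargs)
                return result
        return wrapper
    return decorator

class Box:
    def __init__(self):
        self._cache = {}
        self.calls = 0

    @property
    @cached_sketch("size")
    def size(self):
        self.calls += 1          # counts how often the body really runs
        return 10.0

box = Box()
print(box.size, box.size, box.calls)   # 10.0 10.0 1
```

Clearing the entry (for example with box._cache.pop("size")) forces the next access to recompute the value.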
Footnotes
[1] The reason why NamedStream.close() does not close a stream by default (but just rewinds it to the beginning) is so that one can use the class NamedStream as a drop-in replacement for file names, which are often re-opened (e.g. when the same file is used as a topology and coordinate file or when repeatedly iterating through a trajectory in some implementations). The close=True keyword can be supplied in order to make NamedStream.close() actually close the underlying stream and NamedStream.close(force=True) will also close it.