3.7. Regex patterns
This file is the central collection point for all regex patterns used in pymzML.
Note
Each regex still needs comment lines describing what it catches, ideally with examples.
Collection of regular expressions to catch spectrum XML-tags.
-
pymzml.regex_patterns.
CHROMATOGRAM_AND_SPECTRUM_PATTERN_WITH_ID
= re.compile('<\\s*(chromatogram|spectrum)\\s*(id=(\\".*?\\")|index=\\".*?\\")\\s(id=(\\".*?\\"))*\\s*.*\\sdefaultArrayLength=\\"[0-9]+\\">')
Regex to catch the opening tag of either a chromatogram or a spectrum, including its id or index attribute and its defaultArrayLength attribute
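As a rough illustration of what this combined pattern catches, the sketch below recompiles it locally (so it runs without pymzML installed) and applies it to two opening tags. The tags are made up for this example and are not taken from a real mzML file; the group numbers follow from the parentheses in the pattern as listed.

```python
import re

# Pattern copied from the listing above, written as a raw string.
CHROMATOGRAM_AND_SPECTRUM_PATTERN_WITH_ID = re.compile(
    r'<\s*(chromatogram|spectrum)\s*(id=(\".*?\")|index=\".*?\")\s'
    r'(id=(\".*?\"))*\s*.*\sdefaultArrayLength=\"[0-9]+\">'
)

# Hypothetical opening tags, invented for illustration only.
tags = [
    '<spectrum index="0" id="scan=1" defaultArrayLength="3">',
    '<chromatogram index="0" id="TIC" defaultArrayLength="10">',
]

found = []
for tag in tags:
    match = CHROMATOGRAM_AND_SPECTRUM_PATTERN_WITH_ID.search(tag)
    # group(1) is the element name, group(5) the id value (quotes included)
    found.append((match.group(1), match.group(5)))

print(found)
```

Note that group(5) keeps the surrounding double quotes, so a caller would probably strip them before using the id.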
-
pymzml.regex_patterns.
CHROMATOGRAM_CLOSE_PATTERN
= re.compile(b'</chromatogram>')
Regex to catch chromatogram XML close tags
-
pymzml.regex_patterns.
CHROMATOGRAM_ID_PATTERN
= re.compile('<chromatogram.*id="(.*?)".*?>')
Regex to catch the chromatogram id
-
pymzml.regex_patterns.
CHROMATOGRAM_PATTERN
= re.compile('<chromatogram.*id="(.*?)".*?>')
Regex to catch the chromatogram id. As listed here, this pattern is identical to CHROMATOGRAM_ID_PATTERN.
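A minimal sketch of how the two chromatogram patterns behave, assuming the expressions exactly as printed in this listing. Both are recompiled locally rather than imported, and the tag is a made-up example.

```python
import re

# Both patterns copied from the listing above.
CHROMATOGRAM_ID_PATTERN = re.compile('<chromatogram.*id="(.*?)".*?>')
CHROMATOGRAM_PATTERN = re.compile('<chromatogram.*id="(.*?)".*?>')

# Hypothetical chromatogram opening tag.
tag = '<chromatogram index="0" id="TIC" defaultArrayLength="10">'

# group(1) holds the value of the id attribute without quotes
chromatogram_id = CHROMATOGRAM_ID_PATTERN.search(tag).group(1)

# The two compiled objects carry the same pattern string
same_pattern = CHROMATOGRAM_ID_PATTERN.pattern == CHROMATOGRAM_PATTERN.pattern

print(chromatogram_id, same_pattern)
```

Because the leading `.*` is greedy, the pattern picks the last `id="` in the tag if more than one occurs.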
-
pymzml.regex_patterns.
FILE_ENCODING_PATTERN
= re.compile(b'encoding="(?P<encoding>[A-Za-z0-9-]*)"')
Regex to catch the XML file encoding
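For illustration, the following sketch applies the encoding pattern to an XML declaration. The pattern is recompiled locally and the declaration line is an invented example. Since this is a bytes pattern, the input must be bytes as well.

```python
import re

# Pattern copied from the listing above (bytes pattern).
FILE_ENCODING_PATTERN = re.compile(b'encoding="(?P<encoding>[A-Za-z0-9-]*)"')

# Hypothetical first line of an XML file, read in binary mode.
first_line = b'<?xml version="1.0" encoding="ISO-8859-1"?>'

match = FILE_ENCODING_PATTERN.search(first_line)
# The named group returns bytes; decode it to get a str codec name
encoding = match.group('encoding').decode('ascii')

print(encoding)
```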
-
pymzml.regex_patterns.
MOBY_DICK_CHAPTER_PATTERN
= re.compile('CHAPTER ([0-9]+).*')
Regex to catch the Moby Dick chapter number, used in the index gzip writer example.
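A small sketch of what the chapter pattern catches, using a few lines typed in by hand for this example. How the gzip writer example itself consumes the match is not shown here.

```python
import re

# Pattern copied from the listing above.
MOBY_DICK_CHAPTER_PATTERN = re.compile('CHAPTER ([0-9]+).*')

# Hand-written sample lines, for illustration only.
lines = [
    'CHAPTER 1. Loomings.',
    'Call me Ishmael.',
    'CHAPTER 2. The Carpet-Bag.',
]

chapters = []
for line in lines:
    match = MOBY_DICK_CHAPTER_PATTERN.match(line)
    if match is not None:
        # group(1) is the chapter number as a string
        chapters.append(int(match.group(1)))

print(chapters)
```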
-
pymzml.regex_patterns.
SIM_INDEX_PATTERN
= re.compile(b'(?P<type>idRef=")(?P<nativeID>.*)">(?P<offset>[0-9]*)</offset>')
Regex pattern for SIM index entries
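The named groups of the SIM index pattern can be sketched as follows. The offset line is a made-up index entry, and the pattern is recompiled locally instead of being imported from pymzML.

```python
import re

# Pattern copied from the listing above (bytes pattern).
SIM_INDEX_PATTERN = re.compile(
    b'(?P<type>idRef=")(?P<nativeID>.*)">(?P<offset>[0-9]*)</offset>'
)

# Hypothetical index entry, invented for illustration.
line = b'<offset idRef="SIM SIC 651.5">330223452</offset>'

match = SIM_INDEX_PATTERN.search(line)
native_id = match.group('nativeID')
# The offset is captured as bytes; convert it to an int byte position
offset = int(match.group('offset'))

print(native_id, offset)
```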
-
pymzml.regex_patterns.
SPECTRUM_CLOSE_PATTERN
= re.compile(b'</spectrum>')
Regex to catch spectrum XML close tags
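One plausible use of a close-tag pattern is to find where an element ends inside a chunk of raw bytes. The sketch below shows this under the assumption that the chunk was read in binary mode; the chunk content is invented, and this is not necessarily how pymzML itself applies the pattern.

```python
import re

# Pattern copied from the listing above (bytes pattern).
SPECTRUM_CLOSE_PATTERN = re.compile(b'</spectrum>')

# Hypothetical chunk of bytes containing one complete spectrum element
# followed by the start of the next one.
chunk = (
    b'<spectrum index="0" id="scan=1" defaultArrayLength="0"></spectrum>'
    b'<spectrum index="1"'
)

match = SPECTRUM_CLOSE_PATTERN.search(chunk)
# match.end() points just past the close tag, so the slice is one full element
first_spectrum = chunk[:match.end()]

print(first_spectrum)
```

A bytes pattern can only be applied to bytes; passing a str raises a TypeError.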
-
pymzml.regex_patterns.
SPECTRUM_ID_PATTERN
= re.compile('="{0,1}([0-9]*)"{0,1}>{0,1}$')
Simplified spectrum id regex. Greedily catches ints at the end of a line
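To show what "ints at the end of a line" means in practice, the sketch below runs the pattern over three hand-written strings. Only the digits directly before the end of the string (optionally followed by a quote and `>`) are captured.

```python
import re

# Pattern copied from the listing above.
SPECTRUM_ID_PATTERN = re.compile('="{0,1}([0-9]*)"{0,1}>{0,1}$')

# Hypothetical inputs, invented for illustration.
samples = [
    'scan=42',
    'controllerType=0 controllerNumber=1 scan=42',
    '<spectrum index="3" id="17">',
]

ids = []
for text in samples:
    match = SPECTRUM_ID_PATTERN.search(text)
    # group(1) is the run of digits anchored at the end of the string
    ids.append(match.group(1))

print(ids)
```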
-
pymzml.regex_patterns.
SPECTRUM_INDEX_PATTERN
= re.compile(b'(?P<type>(scan=|nativeID="))(?P<nativeID>[0-9]*)">"(?P<offset>[0-9]*)</offset>')
Regex pattern for the spectrum index; works for obo format 1.1.0 until <last version checked>
- Catches:
- demo 1
- demo 2
-
pymzml.regex_patterns.
SPECTRUM_OPEN_PATTERN
= re.compile(b'<*spectrum[^>]*index="(?P<index>[0-9]+)" id="(?P<id>[^"]+)" defaultArrayLength="[0-9]+">')
Regex to catch the spectrum open XML tag with encoded array length
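A minimal sketch of the two named groups in the spectrum open pattern, using an invented opening tag. As printed, the pattern expects the attributes in the order index, id, defaultArrayLength, separated by single spaces.

```python
import re

# Pattern copied from the listing above (bytes pattern).
SPECTRUM_OPEN_PATTERN = re.compile(
    b'<*spectrum[^>]*index="(?P<index>[0-9]+)" id="(?P<id>[^"]+)" '
    b'defaultArrayLength="[0-9]+">'
)

# Hypothetical spectrum opening tag, invented for illustration.
tag = (
    b'<spectrum index="0" '
    b'id="controllerType=0 controllerNumber=1 scan=1" '
    b'defaultArrayLength="3">'
)

match = SPECTRUM_OPEN_PATTERN.search(tag)
index = int(match.group('index'))        # position of the spectrum in the list
spectrum_id = match.group('id')          # full native id, still as bytes

print(index, spectrum_id)
```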
-
pymzml.regex_patterns.
SPECTRUM_TAG_PATTERN
= re.compile('<spectrum.*?id="(?P<index>[^"]+)".*?>')
Regex to catch the spectrum tag
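The sketch below applies the spectrum tag pattern to a made-up opening tag. One detail worth noting from the pattern as listed: the named group is called `index`, but it captures the value of the `id` attribute.

```python
import re

# Pattern copied from the listing above.
SPECTRUM_TAG_PATTERN = re.compile('<spectrum.*?id="(?P<index>[^"]+)".*?>')

# Hypothetical spectrum opening tag, invented for illustration.
tag = '<spectrum index="5" id="scan=6" defaultArrayLength="3">'

match = SPECTRUM_TAG_PATTERN.search(tag)
# Despite its name, the group holds the id attribute, not the index attribute
captured = match.group('index')

print(captured)
```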