Tutorials

How much Un-structure can you afford to handle?

Whereas most parsers require well-defined delimiters, tags, or fixed-size fields to locate data fields and transform them, the parser presented here can extract data from text that only loosely resembles a structure. Its mapping power is that of regular expressions, which are the state of the art in pattern matching. Data too unstructured for pattern matching would call for concepts such as ontologies and semantic nets, which are out of scope here.

There is a significant gap between a regular expression library, which provides only the raw capability to match a pattern, however sophisticated, and the final production of an XML document. Bridging that gap is the added value of the reverseXSL software: regular expressions are organized to perform four tasks (identify, cut, extract, and validate) in turn, and recursively, until reaching the atoms of data that must be output into your XML document.
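To make the four tasks concrete, here is a minimal Python sketch of the general idea. It is not the reverseXSL software or its rule syntax: the `Rule` class, the `build` and `to_xml` functions, the sample message layout, and every pattern below are assumptions invented for illustration. The sketch also assumes that the text matched by a rule's identify pattern (a leading tag such as `FLT/`) is consumed before cutting, which is a design choice of this example only.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass

MONTHS = "JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC"


@dataclass
class Rule:
    """Hypothetical rule: one XML element driven by four regular expressions."""
    name: str                # element name to output
    identify: str = ""       # must match a piece of text for the rule to apply
    cut: str = ""            # splits the text into child pieces (non-leaf rules)
    extract: str = r"(.*)"   # group 1 is the value to output (leaf rules)
    validate: str = r".*"    # the extracted value must fully match (leaf rules)
    children: tuple = ()     # sub-rules; empty for an atom of data


def build(rule, text):
    """Recursively turn text into an XML element according to rule."""
    node = ET.Element(rule.name)
    if not rule.children:
        # Leaf: extract the value, then validate it.
        match = re.search(rule.extract, text)
        value = match.group(1) if match else ""
        if not re.fullmatch(rule.validate, value):
            raise ValueError(f"{rule.name}: invalid value {value!r}")
        node.text = value
        return node
    # Non-leaf: drop the identifying tag, cut, and identify each piece.
    body = re.sub(rule.identify, "", text, count=1)
    for piece in re.split(rule.cut, body):
        if not piece.strip():
            continue  # ignore empty pieces such as a trailing newline
        child = next(
            (c for c in rule.children if re.search(c.identify, piece)), None
        )
        if child is None:
            raise ValueError(f"{rule.name}: unidentified piece {piece!r}")
        node.append(build(child, piece))
    return node


# Rule tree for an invented two-segment message format.
MESSAGE = Rule("Message", cut=r"\r?\n", children=(
    Rule("Flight", identify=r"^FLT/", cut=r"/", children=(
        Rule("Number", identify=r"^[A-Z]{2}\d",
             extract=r"([A-Z]{2}\d+)", validate=r"[A-Z]{2}\d{1,4}"),
        Rule("Date", identify=r"^\d{2}[A-Z]{3}",
             extract=r"(\d{2}[A-Z]{3})",
             validate=rf"(0[1-9]|[12]\d|3[01])({MONTHS})"),
    )),
    Rule("Passenger", identify=r"^PAX/", cut=r"/", children=(
        Rule("Name", identify=r"^[A-Z ]+$",
             extract=r"^\s*(.*?)\s*$", validate=r"[A-Z]+( [A-Z]+)*"),
        Rule("Seat", identify=r"^\d{1,2}[A-K]$",
             extract=r"(\d{1,2}[A-K])", validate=r"\d{1,2}[A-K]"),
    )),
))

SAMPLE = "FLT/BA123/12MAR\nPAX/SMITH JOHN MR/23A\n"


def to_xml(text):
    """Convert a whole message to an XML string."""
    return ET.tostring(build(MESSAGE, text), encoding="unicode")


print(to_xml(SAMPLE))
```

Running the sketch on `SAMPLE` yields a `Message` element containing a `Flight` (with `Number` and `Date`) and a `Passenger` (with `Name` and `Seat`). A value that is identified but fails validation, such as the date `99XXX`, raises a `ValueError` instead of reaching the output.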

Text to XML: converting loosely-structured text data

EDI to XML: converting a legacy EDI Message to XML