Reverse Template is a handy Python module for extracting data from HTML pages. Several times in the past I needed to process HTML pages and get data from them into a program. I used various techniques, such as regular expressions and HTML parsers, which always required some page-specific programming to extract the data. Recently I decided to create a small framework that makes this quick, effective and easy, with minimal programming.
An approach that looked valuable to me was to use a template that identifies which data need to be read from the page. The template should be close to the page's HTML code, so that it is easy to create.
So steps would be:
This approach is roughly the opposite of how templates work for dynamic pages, which is why it is called Reverse Template.
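To make the idea concrete, here is a minimal sketch of how a template can be run "in reverse". It is not the module's actual template syntax or API: the `{{name}}` placeholder notation and the `compile_template` and `extract` helpers are assumptions made up for this illustration, and the sketch simply turns the template into a regular expression rather than parsing the HTML.

```python
import re

# Hypothetical placeholder syntax for this sketch: {{name}}
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def compile_template(template):
    """Turn a template with {{name}} placeholders into a compiled regex."""
    # Splitting on a pattern with one capture group yields alternating
    # pieces: literal text, placeholder name, literal text, ...
    parts = PLACEHOLDER.split(template)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            # Literal HTML from the template must match exactly.
            regex += re.escape(part)
        else:
            # A placeholder becomes a non-greedy named capture group.
            regex += "(?P<%s>.*?)" % part
    return re.compile(regex, re.DOTALL)


def extract(template, html):
    """Return one dict of captured values per place the template matches."""
    return [m.groupdict() for m in compile_template(template).finditer(html)]


template = '<li class="item"><b>{{title}}</b> costs {{price}} EUR</li>'
html = """
<ul>
<li class="item"><b>Tea</b> costs 3 EUR</li>
<li class="item"><b>Coffee</b> costs 4 EUR</li>
</ul>
"""

rows = extract(template, html)
for row in rows:
    print(row["title"], row["price"])
```

A dynamic page fills placeholders with data to produce HTML; here the same kind of template takes HTML and gives the data back. A regex-based sketch like this is sensitive to whitespace and attribute order, which is one reason a real tool would more likely compare the template against the parsed page.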
An extract from the Python docstring is available here.
Sample Reverse Templates
Again, it is available under the GPL license.
The current version is 0.1.1 (beta quality).
Source code is available here.