http-stream-xml
Parse XML from an HTTP response on the fly, chunk by chunk.
This is essential when you need only the beginning of a huge document.
For example, suppose you work with the NCBI PubMed corpus of biomedical articles through the Entrez API. The Entrez API tends to return very large documents (megabytes), and even if you need only a few header fields, you would normally have to download the whole document before you can parse it.
The http-stream-xml library lets you download a response partially and parse only the part you have received.
It does not matter whether the server uses HTTP chunked transfer encoding.
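The idea of stopping early can be illustrated with the Python standard library alone. The sketch below is not the http-stream-xml API: the function `first_tag_text` and the generator `fake_response` are hypothetical names invented for this illustration, and the generator stands in for the body of a real HTTP response. It feeds byte chunks to `xml.etree.ElementTree.XMLPullParser` and returns as soon as the wanted element is complete, so the rest of the document is never read.

```python
from xml.etree.ElementTree import XMLPullParser


def first_tag_text(chunks, tag):
    """Return (text, chunks_consumed) for the first complete <tag> element.

    `chunks` is any iterable of bytes. Iteration stops as soon as the
    element is closed, so later chunks are never requested.
    """
    parser = XMLPullParser(events=("end",))
    consumed = 0
    for chunk in chunks:
        consumed += 1
        parser.feed(chunk)
        # read_events() yields only the elements completed so far.
        for _event, element in parser.read_events():
            if element.tag == tag:
                return element.text, consumed
    return None, consumed


def fake_response():
    """Hypothetical stand-in for an HTTP body delivered in pieces."""
    yield b"<doc><title>Gene "          # the element is split across chunks
    yield b"summary</title><body>"
    for _ in range(1000):               # the large tail we want to avoid
        yield b"<p>filler</p>"
    yield b"</body></doc>"


text, chunks_read = first_tag_text(fake_response(), "title")
print(text)         # text of the <title> element
print(chunks_read)  # number of chunks actually consumed
```

With a real HTTP client the chunk iterable would come from the response body, and closing the connection after the early return is what saves the bandwidth.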
Installation
pip install http-stream-xml --upgrade
Usage sample
The sample retrieves data from the NCBI PubMed corpus of biomedical articles through the Entrez API.
The code downloads only a small part of the Entrez response, just enough to extract some summary data, so you do not have to download the whole Entrez answer to get a basic gene description.
python -m http_stream_xml.entrez