Memory usage when processing large responses with http_client

Jan 9, 2015 at 1:18 PM
Hello,

We are processing large, streaming HTTP responses and have observed that if the client application cannot consume the response fast enough, the producer_consumer_buffer keeps buffering incoming data and memory usage can grow without bound.

We came to the conclusion that we probably ought to write our own version of producer_consumer_buffer that (for example) restricts the total amount of outstanding data available for reading. Can you please advise whether this is a reasonable approach?

Regards,

David.
Jan 9, 2015 at 5:41 PM
Hi David,

Yes, our http_client will keep receiving response data from the underlying HTTP API as more arrives. I think creating a 'bounded' producer_consumer_buffer would work for throttling the response data.

If I recall correctly you are running on the Windows Desktop platform, so we are using WinHttp. In one big asynchronous loop we ask WinHttp for the next piece of the response body and then write it into the response stream, which is a producer_consumer_buffer if a different stream hasn't been specified. Writing into the buffer is handled in two ways. First we try to use the alloc/commit APIs, which let us have WinHttp write directly into space preallocated in the buffer, thereby saving a copy. If alloc/commit isn't supported by the buffer, then we use putn to write into it.
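
Roughly, the two paths look like this from the stream buffer's point of view (just an illustrative sketch of the pattern, not our actual internal code):

```cpp
#include <cpprest/astreambuf.h>
#include <cstring>

// Simplified sketch of the two write paths described above; this is an
// approximation for illustration, not the SDK's actual WinHttp code.
pplx::task<size_t> write_chunk(concurrency::streams::streambuf<uint8_t> target,
                               const uint8_t* data, size_t count)
{
    // Fast path: alloc/commit lets the producer fill the buffer's own memory
    // directly (in the real loop WinHttp writes straight into 'dest').
    uint8_t* dest = target.alloc(count); // nullptr if alloc/commit isn't supported
    if (dest != nullptr)
    {
        std::memcpy(dest, data, count);
        target.commit(count);
        return pplx::task_from_result(count);
    }

    // Fallback path: hand the data to putn and let the buffer copy it.
    return target.putn(data, count);
}
```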

One way to create a bounded producer consumer buffer would be to not support alloc/commit and have the putn implementation only complete the write once the total amount of data in the buffer is under some limit. I don't think this would be too hard to do by copying and modifying our producer_consumer_buffer.
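
As a rough illustration of the gating idea (this is not the copy-and-modify approach itself, and bounded_pc_buffer is just a made-up name), a standalone wrapper could hold back write completion while the inner buffer has too much unread data:

```cpp
#include <cpprest/producerconsumerstream.h>
#include <cstdint>
#include <deque>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical sketch, not a drop-in stream buffer: it wraps an ordinary
// producer_consumer_buffer and holds back write completion while more than
// 'limit' bytes sit unread. A real bounded buffer would instead copy
// producer_consumer_buffer, apply the same check inside its putn override,
// and report alloc/commit as unsupported.
class bounded_pc_buffer
{
public:
    explicit bounded_pc_buffer(size_t limit) : m_limit(limit) {}

    // Producer side: completes immediately while under the limit, otherwise
    // returns a task that only finishes once the reader has caught up.
    pplx::task<size_t> write(const uint8_t* data, size_t count)
    {
        // Copy the chunk so the caller's buffer doesn't have to outlive the write.
        auto chunk = std::make_shared<std::vector<uint8_t>>(data, data + count);
        {
            std::lock_guard<std::mutex> lock(m_mtx);
            if (m_inner.in_avail() >= m_limit)
            {
                pending p;
                p.chunk = chunk;
                m_pending.push_back(p);
                return pplx::create_task(p.tce);
            }
        }
        return m_inner.putn(chunk->data(), chunk->size())
                      .then([chunk](size_t n) { return n; }); // keeps the copy alive
    }

    // Consumer side: read, then flush any writes that were held back.
    pplx::task<size_t> read(uint8_t* dest, size_t count)
    {
        return m_inner.getn(dest, count).then([this](size_t n)
        {
            release_pending();
            return n;
        });
    }

private:
    struct pending
    {
        std::shared_ptr<std::vector<uint8_t>> chunk;
        pplx::task_completion_event<size_t> tce;
    };

    void release_pending()
    {
        std::deque<pending> ready;
        {
            std::lock_guard<std::mutex> lock(m_mtx);
            while (!m_pending.empty() && m_inner.in_avail() < m_limit)
            {
                ready.push_back(m_pending.front());
                m_pending.pop_front();
            }
        }
        for (auto &p : ready)
        {
            m_inner.putn(p.chunk->data(), p.chunk->size())
                   .then([p](size_t n) { p.tce.set(n); });
        }
    }

    concurrency::streams::producer_consumer_buffer<uint8_t> m_inner;
    size_t m_limit;
    std::mutex m_mtx;
    std::deque<pending> m_pending;
};
```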

Just curious, how large are the response bodies you are processing?

Steve
Jan 10, 2015 at 12:07 PM
Thanks for the reply. Yes, we are using the Windows Desktop platform with the VS2013 compiler.

We've already experimented with http_request::set_response_stream, initially just using producer_consumer_buffer to prove it can work. Our first obstacle was observing that when set_response_stream is used, the stream is not closed, which broke our reading loop. I can see why this is the case. We found a solution by setting up a continuation task on http_response::content_ready() that seems to work correctly.
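
For reference, a minimal sketch of that kind of continuation, assuming the fix is to close the write end of the buffer once content_ready() completes (the URL is a placeholder):

```cpp
#include <cpprest/http_client.h>
#include <cpprest/producerconsumerstream.h>

using namespace web::http;
using namespace web::http::client;
using namespace concurrency::streams;

int main()
{
    producer_consumer_buffer<uint8_t> buf;

    http_request req(methods::GET);
    req.set_response_stream(buf.create_ostream());

    http_client client(U("http://example.com/"));
    auto done = client.request(req).then([](http_response resp)
    {
        // content_ready() completes once the whole body has been written
        // into the response stream.
        return resp.content_ready();
    }).then([buf](http_response) mutable
    {
        // The library leaves a user-supplied stream open, so close the write
        // end ourselves; a loop reading from buf then sees end-of-stream.
        return buf.close(std::ios_base::out);
    });

    // Meanwhile another task can drain buf.create_istream() as framed data
    // arrives; that reading loop is omitted here for brevity.
    done.wait();
    return 0;
}
```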

Our first attempt at writing a modified implementation of the buffer failed, so thanks for the tips; I'm sure they will come in handy when we try again next week.

It depends on the type of call, but our response bodies range from a few hundred bytes to around 1 GB; the upper limit is dictated by user data. Typically, the large responses consist of framed data that can be processed without receiving the whole body.

Thanks.

David.
Jan 12, 2015 at 8:50 PM
Hi David,

Yes, we don't close the response stream if it was set by the user through set_response_stream, because there are scenarios where you might want to write multiple response bodies to the same stream or file. It might be valuable to contribute the bounded producer consumer buffer back to the library once you have it complete.
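
For example (hypothetical URIs and file name), writing two response bodies back to back into one file stream only works because the library leaves the user-supplied stream open and lets the caller decide when to close it:

```cpp
#include <cpprest/filestream.h>
#include <cpprest/http_client.h>

using namespace web::http;
using namespace web::http::client;
using namespace concurrency::streams;

// Hypothetical illustration of the multi-body scenario: two response bodies
// written one after the other into the same file stream.
void save_two_bodies()
{
    auto file = fstream::open_ostream(U("responses.bin")).get();
    http_client client(U("http://example.com/"));

    for (auto path : { U("/first"), U("/second") })
    {
        http_request req(methods::GET);
        req.set_request_uri(path);
        req.set_response_stream(file);

        // Wait for the full body before issuing the next request so the
        // two bodies don't interleave in the file.
        client.request(req).then([](http_response resp)
        {
            return resp.content_ready();
        }).wait();
    }

    // The library leaves the stream open; the caller decides when to close.
    file.close().wait();
}
```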

Steve