How to escape the URI correctly? (Ampersand in the query string...)

Jul 8, 2014 at 9:47 AM
(This post is loosely related to my earlier post.)

I am (almost) successfully consuming the geolocation service (address to GPS coordinates) using the code below. However, I have run into a problem when the address contains an ampersand. The reason is that builder.to_string() URI-escapes the special characters but does not escape the ampersand (since it is used in a URI to separate the arguments). See below:
    wstring str;

    // Create the client, build the query, and start the request.
    http_client client(L"http://api4.mapy.cz/");
    uri_builder builder(L"/geocode");
    builder.append_query(L"query", address);
    wstring request_string{ builder.to_string() };

    client.request(methods::GET, request_string)
        .then([](http_response response)
    {
        // Headers arrived.
        return response.extract_string();
    }).then([&str](utility::string_t responseBody)
    {
        // Body arrived.
        str = responseBody;
    }).wait();
The address contains 3V & H, s.r.o., ... (the name of a company, which can be part of the geolocation query in this case). The request_string contains /geocode?query=3V%20&%20H,%20s.r.o.,%20... which seems almost fine. However, notice the unescaped ampersand just after query=3V%20. I would like to get /geocode?query=3V%20%26%20H,%20s.r.o.,%20.... If I escape the address myself before appending it to the query, it is escaped twice, and the result looks like query=3V%2520%26%2520H,%2520s.r.o.,%2520... which is wrong.

How should I solve the situation?

Thanks,
Petr
Jul 10, 2014 at 3:44 AM
Hi Petr,

uri_builder::append_query takes a third parameter that indicates whether or not to perform encoding; it is on by default. The & is not being encoded because RFC 3986 allows it in the query component. Here is a code snippet to illustrate:
    const std::wstring address(L"a bc&123");

    uri_builder builder2(L"/geocache");
    builder2.append_query(L"query", address); // Default query component encoding performed
    wprintf(L"URI2:%s\n", builder2.to_string().c_str());

    uri_builder builder3(L"/geocache");
    builder3.append_query(L"query", uri::encode_data_string(address), false); // Full encoding of everything except unreserved characters
    wprintf(L"URI3:%s\n", builder3.to_string().c_str());
Here is the output of running:

URI2:/geocache?query=a%20bc&123
URI3:/geocache?query=a%20bc%26123

If you want to encode a data string so that every character except the RFC 3986 unreserved ones is percent-encoded, you can use the function uri::encode_data_string. This option is what you want here.
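
For readers who want to see what "encode everything except unreserved characters" means in practice, here is a minimal standalone sketch. It does not use the C++ REST SDK; percent_encode_data is a hypothetical helper written only for illustration, and it assumes a narrow std::string holding ASCII or UTF-8 bytes rather than the wide strings used above.

```cpp
#include <string>

// Hypothetical helper (not part of the C++ REST SDK): percent-encodes every
// byte except the RFC 3986 unreserved set: ALPHA, DIGIT, '-', '.', '_', '~'.
// Assumption: the input is ASCII or UTF-8.
std::string percent_encode_data(const std::string& input)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(input.size() * 3); // worst case: every byte becomes %XX

    for (unsigned char c : input)
    {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';

        if (unreserved)
        {
            out += static_cast<char>(c); // safe to copy as-is
        }
        else
        {
            out += '%';
            out += hex[c >> 4];   // high nibble
            out += hex[c & 0x0F]; // low nibble
        }
    }
    return out;
}
```

With this sketch, "a bc&123" becomes "a%20bc%26123", so the ampersand can no longer be mistaken for an argument separator. Running the helper on an already encoded string turns each % into %25 (for example "a%20b" becomes "a%2520b"), which is the double escaping described in the question; that is why the value should be encoded exactly once and then appended with encoding switched off.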

Steve
Jul 10, 2014 at 7:38 AM
Hi Steve,

Is the presented uri_builder::append_query() with three arguments part of any development branch? In the master branch, the following template has only two arguments:
    template<typename T>
    uri_builder &append_query(utility::string_t name, const T &value)
    {
        utility::ostringstream_t ss;
        ss << name << _XPLATSTR("=") << value;
        return append_query(ss.str(), true);
    }
The question is whether it should take a third argument. Anyway, I used
builder.append_query(L"query="+uri::encode_data_string(address), false);
and it works fine now.

Thanks for the help,

Petr
Jul 10, 2014 at 5:54 PM
Hi Petr,

Yes, my mistake. I thought this had been merged in for the last release, but it isn't in the master branch yet. It is in our development branch if you would like to use it, and it will be included in our next release.

As you mentioned, you can use the other version of append_query for now.

Thanks,
Steve