I'm currently trying to use HttpComponents to send HttpRequests and retrieve the responses. For most URLs this works without a problem, but when I try to fetch a phpBB forum, namely http://www.forum.animenokami.com, the client takes much longer and the response entity contains some passages more than once, producing a broken HTML file.

For example, the meta tags are contained six times. Since so many other URLs work, I can't figure out what I'm doing wrong. The page displays correctly in common browsers, so it's not a problem on their side.

This is the code I use to send the request and read the response:

    URI uri1 = new URI("http://www.forum.animenokami.com");
    HttpGet get = new HttpGet(uri1);
    get.setHeader(new BasicHeader("User-Agent", "Mozilla/5.0 (Windows NT 5.1; rv:6.0) Gecko/20100101 Firefox/6.0"));
    HttpClient httpClient = new DefaultHttpClient();
    HttpResponse response = httpClient.execute(get);
    HttpEntity ent = response.getEntity();
    InputStream is = ent.getContent();
    BufferedInputStream bis = new BufferedInputStream(is);
    byte[] tmp = new byte[2048];
    int l;
    String ret = "";
    while ((l = bis.read(tmp)) != -1){
        ret += new String(tmp);
    }

I hope you can help me. If you need any more information, I'll post it as soon as possible.

This code is completely broken:

    String ret = "";
    while ((l = bis.read(tmp)) != -1){
        ret += new String(tmp);
    }

Three things:

  • It converts the whole buffer into a string on each iteration, regardless of how much data has actually been read. (I suspect this is what's really going wrong in your case; see the sketch after this list.)
  • It uses the default platform encoding, which is rarely a good idea.
  • It uses string concatenation in a loop, which leads to poor performance.
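For reference, here is a minimal sketch (my own illustration, not part of the original answer) of how a manual read loop could address all three points: buffer the raw bytes, honouring how many were actually read on each call, and decode them once with an explicit charset. ISO-8859-1 is an assumption here; ideally the charset would come from the response's Content-Type header.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;

    // Read an entity stream by hand: accumulate the raw bytes, respecting the
    // number of bytes each read() actually returned, then decode them once.
    static String readAll(InputStream is) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] tmp = new byte[2048];
        int l;
        while ((l = is.read(tmp)) != -1) {
            buffer.write(tmp, 0, l); // only the l bytes that were actually read
        }
        // Explicit charset instead of the platform default; ISO-8859-1 is an
        // assumption and should really be taken from the Content-Type header.
        return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
    }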

Fortunately you can avoid all of this very easily using EntityUtils:

    String text = EntityUtils.toString(ent);

That will use the appropriate character encoding specified in the response, if any, or ISO-8859-1 otherwise. (There's another overload which allows you to specify which character encoding to use if none is specified.)
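In case it helps, here is a small self-contained sketch (my own illustration, not code from the answer) of how that overload can be used with the setup from the question; the UTF-8 fallback is an arbitrary choice for the example:

    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.util.EntityUtils;

    public class FetchForum {
        public static void main(String[] args) throws Exception {
            DefaultHttpClient httpClient = new DefaultHttpClient();
            HttpGet get = new HttpGet("http://www.forum.animenokami.com");
            get.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 5.1; rv:6.0) Gecko/20100101 Firefox/6.0");
            HttpResponse response = httpClient.execute(get);
            HttpEntity ent = response.getEntity();

            // Uses the charset declared in the Content-Type header if there is
            // one, and falls back to UTF-8 (an arbitrary choice for this
            // sketch) only when the response doesn't declare a charset.
            String text = EntityUtils.toString(ent, "UTF-8");
            System.out.println(text.substring(0, Math.min(200, text.length())));
        }
    }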

It's worth understanding what's wrong with your original code rather than just replacing it with the better code, though, so that you don't make the same mistakes in other situations.

It works fine, but what I don't understand is why I see the same text multiple times only on this URL.

That will be because your client is seeing more incomplete buffers when it reads the socket. That might be:

  • because there's a network bandwidth bottleneck on the route from the remote site to your client,
  • because the remote site is doing some unnecessary flushes, or
  • some other reason.

The point is that your client must pay careful attention to the number of bytes read into the buffer by each read call, otherwise it will end up inserting junk. Network streams in particular are prone to not filling the buffer.
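To make that concrete, the smallest possible fix to the original loop (a sketch only; EntityUtils.toString is still the better option) is to convert just the l bytes that each read actually returned, with an explicit charset:

    // Requires: import java.nio.charset.StandardCharsets;
    // 'bis' is the BufferedInputStream from the question's code.
    StringBuilder ret = new StringBuilder();
    byte[] tmp = new byte[2048];
    int l;
    while ((l = bis.read(tmp)) != -1) {
        // Decode only the bytes actually read. Chunk-by-chunk decoding can
        // still split multi-byte characters, so this is only safe for
        // single-byte charsets such as ISO-8859-1.
        ret.append(new String(tmp, 0, l, StandardCharsets.ISO_8859_1));
    }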