I'm currently working on a project that uses Apache HttpClient 4.1.2 to retrieve some data from a website.

What the application does: it goes to a web page and then visits the other pages it finds until it reaches the end (e.g., visit the first page -> find 20 more pages -> visit each of those 20 pages). However, it gets stuck while fetching some random page and the crawl does not continue.

Here's some code:

DefaultHttpClient mainHttp;
HttpPost post;
HttpResponse response;
HttpEntity entity;
String s;
int curPage = 1;
int index = 0;
boolean ok = true;


while (ok) { 
  response = mainHttp.execute(post);
  entity = response.getEntity();
  if (entity != null) {
    System.out.println("Enter " + curPage);
    s = EntityUtils.toString(entity);
    System.out.println("Exit " + curPage);
    index = s.indexOf("[" + curPage + "]");
    if (index > 0) {
      // marker for the current page found: keep crawling
    } else {
      ok = false;
    }
  }
}

In the debug window it shows something like this:

Enter 1
Exit 1
Enter n

I'm also using an HTTP request analyzer, and I saw that on the page where it gets stuck, the data isn't retrieved completely (the response never reaches </html> or the end of the page).

So what can I do to skip or retry downloading the data in such cases? Can anybody help me?



The actual settings were:

mainHttp.setHttpRequestRetryHandler(new DefaultHttpRequestRetryHandler(1, true));
mainHttp.getParams().setParameter("http.connection-manager.timeout", 15000);
mainHttp.getParams().setParameter("http.socket.timeout", 15000);
mainHttp.getParams().setParameter("http.connection.timeout", 15000);

where 15000 is the timeout in milliseconds.

Thanks for your help.

DefaultMethodRetryHandler retryhandler = new DefaultMethodRetryHandler(1, true);
mainHttp.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, retryhandler);   

Source: http://hc.apache.org/httpclient-3.x/tutorial.html (Method recovery)

But this only applies when an exception is actually thrown. Try catching IOException every time you execute a request.
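
To make the "catch IOException and retry or skip" idea concrete, here is a minimal sketch that uses only the JDK. The names RetryingFetcher, PageFetcher and fetchWithRetry are made up for illustration and are not part of HttpClient. PageFetcher stands in for whatever downloads one page, which in the question would be the mainHttp.execute(post) call plus EntityUtils.toString(entity). The download in main is simulated so the example runs on its own.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

public class RetryingFetcher {

    /** Hypothetical stand-in for one page download (not an HttpClient type). */
    public interface PageFetcher {
        String fetch() throws IOException;
    }

    /**
     * Calls the fetcher up to maxAttempts times.
     * Returns the page body, or null if every attempt threw an IOException,
     * so the caller can skip that page and keep crawling.
     */
    public static String fetchWithRetry(PageFetcher fetcher, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetcher.fetch();
            } catch (IOException e) {
                // SocketTimeoutException is an IOException, so read timeouts land here.
                System.err.println("Attempt " + attempt + " failed: " + e);
            }
        }
        return null; // give up on this page
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated download that times out twice before succeeding.
        String body = fetchWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new SocketTimeoutException("Read timed out");
            }
            return "<html>[2]</html>";
        }, 3);
        System.out.println(body != null
                ? "Fetched after " + calls[0] + " attempts"
                : "Skipped page");
    }
}
```

In the crawl loop from the question, a null result would mean "skip this page and move on" instead of blocking forever. With a real request it is probably also worth calling post.abort() in the catch block before retrying, so the half-read connection is not reused. The setHttpRequestRetryHandler call shown in the question is the 4.x counterpart of the 3.x snippet above, and it likewise only kicks in when an exception is thrown.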