Error Testing Job: sporadic but frequent #4180

SameOldSong ·

We regularly see certain builds terminate with "Error testing job".
What they have in common is that they are long-running tests (several hours).
For some reason, the connection between agent and server is lost after about 3-6 hours of the build running.

However, when I check the agent some time after "Error testing job" occurs, it is very often still running, shows no errors in the console, and is displayed in the list of Active nodes in the Grid. So the connection seems to be re-established automatically later.

Agent log typically contains:

ERROR com.pmease.quickbuild.Quickbuild - Error connecting server.
com.pmease.quickbuild.RemotingException: 500: java.net.SocketException: Connection reset
at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:165)
at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:300)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)

Questions:

  1. Is there any way to narrow down the root cause of the problem?

  2. Is there a setting to increase the timeout before the server or the agent decides that the connection is lost, and/or a setting to increase the number of reconnection attempts?

Thank you

robinshen ADMIN ·

If you are on QB8 or later, the step has an option "network disconnect tolerance" in its advanced settings that lets it handle temporary network losses.
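
To illustrate the general idea behind tolerating a temporary network loss, here is a minimal Java sketch. It is not QuickBuild's implementation, and the class name `DisconnectTolerance`, the method `callWithTolerance`, and its parameters are hypothetical names chosen for this example. It simply retries a remote call that fails with an `IOException` (such as the `SocketException: Connection reset` in the log above) until a tolerance window, measured from the first attempt, has elapsed.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only; not part of QuickBuild.
public class DisconnectTolerance {

    /**
     * Runs op, retrying on IOException until it succeeds or until
     * toleranceMillis have passed since the first attempt started.
     * Other exception types are not retried and propagate immediately.
     */
    public static <T> T callWithTolerance(Callable<T> op,
                                          long toleranceMillis,
                                          long retryIntervalMillis) throws Exception {
        long deadline = System.nanoTime() + toleranceMillis * 1_000_000L;
        while (true) {
            try {
                return op.call();
            } catch (IOException e) {
                // Tolerance window used up: give up and report the last failure.
                if (System.nanoTime() >= deadline) {
                    throw e;
                }
                // Wait a little before trying again.
                Thread.sleep(retryIntervalMillis);
            }
        }
    }
}
```

With a wrapper like this, a short outage only fails the call if it lasts longer than the configured tolerance, which is roughly the behaviour the option name suggests.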

SameOldSong ·

Thank you. One more reason to upgrade 🙂