Build fails with 'Error testing job' #4405

MFalkner ·

QuickBuild version 12 (had same issue with 11).
I've got a build that runs once a week and fails every time with a lost connection after about 2.5 hours.
There is no entry in the server log at that time, and no logs are available on any of the steps (all are empty).
The only information I've got is the following:

102810 Failed 2022-03-06 01:30:52 2 hours 29 minutes 10 seconds Scheduler 0 builds 0 builds
Error testing job.
Build is already stopped.

04:00:03,668 ERROR - Build is failed.
java.lang.RuntimeException: Error executing step execution job.
at com.pmease.quickbuild.stepsupport.StepExecutionTask.reduce(StepExecutionTask.java:29)
at com.pmease.quickbuild.stepsupport.StepExecutionTask.reduce(StepExecutionTask.java:19)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:168)
at com.pmease.quickbuild.DefaultBuildEngine.run(DefaultBuildEngine.java:639)
at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:476)
at com.pmease.quickbuild.DefaultBuildEngine.access$000(DefaultBuildEngine.java:152)
at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:1289)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: com.pmease.quickbuild.QuickbuildException: Error testing job.
at com.pmease.quickbuild.grid.GridTaskFuture.testJobs(GridTaskFuture.java:111)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:150)
... 7 more
Caused by: com.caucho.hessian.client.HessianRuntimeException: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://192.168.193.41:8811/service/node'
at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
at com.sun.proxy.$Proxy96.testGridJob(Unknown Source)
at com.pmease.quickbuild.grid.GridTaskFuture.testJobs(GridTaskFuture.java:89)
... 8 more
Caused by: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://192.168.193.41:8811/service/node'
at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:101)
at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:283)
... 11 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.<init>(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(Unknown Source)
at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:99)
... 12 more

Why would the connection always be lost after 2.5 hours?
It cannot be a timeout, since the build would then be marked as timed out, correct?

What could be the reason?
Martin

robinshen ADMIN ·

Does the machine reboot during the build? When this issue happens, please run the command below from the QB server to see if it works:
telnet 192.168.193.41 8811
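
Since the failure happens hours into an unattended weekly build, a manual telnet check may be hard to time. As a rough sketch (not a QuickBuild tool; the function names `can_connect` and `probe` and the interval and round defaults are made up for illustration), the same TCP check could be repeated from the QB server and logged with timestamps:

```python
import socket
import time
from datetime import datetime


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


def probe(host: str, port: int, interval: float = 60.0, rounds: int = 180) -> list:
    """Try to connect every `interval` seconds; return (timestamp, ok) pairs."""
    results = []
    for i in range(rounds):
        ok = can_connect(host, port)
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {'reachable' if ok else 'UNREACHABLE'}")
        results.append((stamp, ok))
        if i < rounds - 1:
            time.sleep(interval)
    return results
```

Calling `probe("192.168.193.41", 8811)` with these defaults checks once a minute for roughly three hours, which would cover the 2.5 hour mark and show whether the port is refused only briefly or for a longer period.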

MFalkner ·

No, the server does not reboot; I've checked this.
I suspected some maintenance tasks interfering, but after scheduling the job 2 hours earlier, the build broke again after 2.5 hours.

robinshen ADMIN ·

QB tests network connectivity periodically while running a step and cancels the step upon network failure. You may increase the network disconnect tolerance in the advanced settings of the parent step of the failed step to see if it solves the issue.
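
To illustrate the general idea of a disconnect tolerance, here is a minimal sketch. It is an assumption about how such a setting typically behaves, not QuickBuild's actual implementation; the class name `DisconnectTolerance` and its fields are hypothetical. A step is cancelled only once connectivity checks have been failing continuously for at least the tolerated duration, and a single successful check resets the clock:

```python
from typing import Optional


class DisconnectTolerance:
    """Tracks connectivity checks and reports when an outage exceeds the tolerance."""

    def __init__(self, tolerance_seconds: float) -> None:
        self.tolerance_seconds = tolerance_seconds
        # Time of the first failed check in the current outage, if any.
        self.first_failure_at: Optional[float] = None

    def record(self, now: float, reachable: bool) -> bool:
        """Record one check at time `now`; return True if the step should be cancelled."""
        if reachable:
            # Connectivity is back, so the outage is over.
            self.first_failure_at = None
            return False
        if self.first_failure_at is None:
            self.first_failure_at = now
        return (now - self.first_failure_at) >= self.tolerance_seconds
```

With a tolerance of 0 the first failed check cancels the step immediately, which would match a build that breaks on a single refused connection; a larger value lets the step ride out a short outage.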