Your browser was unable to load all of the resources. They may have been blocked by your firewall, proxy or browser configuration.
Press Ctrl+F5 or Ctrl+Shift+R to have your browser try again.

Curious Error On Build Agent, Build Happened but Never Logged #4568

tomz ·

@robinshen

Two overnight build jobs completely lost their build logs. The errors around the issue look as follows:
AGENT LOG

2024-04-29 04:50:49,414 [qtp885338582-34] ERROR com.pmease.quickbuild.grid.NodeServiceImpl - Unable to find job '191035ff-1ef7-4bfd-837e-3f6c20c69b18' on node 'buildhostaddress' (Job is ever started: false).

BUILD LOG

04:50:49,285 ERROR - Build is failed.
    java.lang.RuntimeException: Error executing step execution job.
    	at com.pmease.quickbuild.stepsupport.StepExecutionTask.reduce(StepExecutionTask.java:29)
    	at com.pmease.quickbuild.stepsupport.StepExecutionTask.reduce(StepExecutionTask.java:19)
    	at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:168)
    	at com.pmease.quickbuild.DefaultBuildEngine.run(DefaultBuildEngine.java:645)
    	at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:482)
    	at com.pmease.quickbuild.DefaultBuildEngine.access$000(DefaultBuildEngine.java:156)
    	at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:1333)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:750)
    Caused by: com.pmease.quickbuild.QuickbuildException: Unable to find job '191035ff-1ef7-4bfd-837e-3f6c20c69b18' on node 'buildhostaddress'.

However, the build ran to completion just fine. Any idea what might cause this?

  • replies 7
  • views 62
  • stars 0
robinshen ADMIN ·

Please check console log of agent "buildhostaddress" to see if there are any errors printed around that time.

tomz ·

The only error log on the agent around the time of the problem is the one already printed in the first post. However, more information in the log does look important. It appears to be in the middle of an unplanned agent restart. More log here:

2024-04-29 04:50:15,621 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.Bootstrap - Cleaning cached plugin libraries...
2024-04-29 04:50:16,012 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.Bootstrap - Cleaning temp directory...
2024-04-29 04:50:16,012 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.Bootstrap - Benchmarking...
2024-04-29 04:50:16,684 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.Bootstrap - Connecting to server 'https://bwe-joqbmaster.bioware.ad.ea.com:1443'...
2024-04-29 04:50:17,356 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.AgentUpdater - Updating framework...
2024-04-29 04:50:17,403 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.AgentUpdater - Updating plugins...
2024-04-29 04:50:17,559 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.AgentUpdater - Updating plugins/com.pmease.quickbuild.libs...
2024-04-29 04:50:17,574 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.AgentUpdater - Updating plugins/site...
2024-04-29 04:50:17,590 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.AgentUpdater - Updating asset...
2024-04-29 04:50:17,621 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.Bootstrap - Booting QuickBuild from 'C:\QBAgent'...
2024-04-29 04:50:17,621 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.bootstrap.Bootstrap - Starting framework...
2024-04-29 04:50:18,696 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.equinoxadapter.EquinoxAdapter - Caching plugin libraries...
2024-04-29 04:50:29,330 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.Quickbuild - Starting QuickBuild agent...
2024-04-29 04:50:29,330 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.Quickbuild - Initializing velocity...
2024-04-29 04:50:29,439 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.Quickbuild - Checking node launch data...
2024-04-29 04:50:49,414 [qtp885338582-34] ERROR com.pmease.quickbuild.grid.NodeServiceImpl - Unable to find job '191035ff-1ef7-4bfd-837e-3f6c20c69b18' on node 'buildhostaddress' (Job is ever started: false).
2024-04-29 04:50:49,537 [WrapperStartStopAppMain] INFO  com.pmease.quickbuild.Quickbuild - Starting plugins...

It was pushing an update to the agent by the looks of it - but it was doing so when a job was activating. This looks like a race condition that should be avoided.

robinshen ADMIN ·

I see. Agent is only updated upon server upgrade, or when an old agent is started after server upgrade. Is this the case at your side?

tomz ·

There was a server upgrade but it was a very long time ago (9 days before this error). I find it surprising that it took this long for the agent to do the upgrade. We normally have the build schedules paused for a few hours after the upgrade which I thought was sufficient to let all the agents take the update at that time.

robinshen ADMIN ·

How many agents to you have? And is it possible that this agent is started recently? (hence it does not have chance to upgrade previously).

tomz ·

Roughly 425 agents. The last time they were restarted on purpose was the same date mentioned, 9 days before update. Maybe it took so long due to the amount of agents?

robinshen ADMIN ·

It should not take so long. Please watch it for some time to see if this happens again.