It seems QB tries to delete the workspace before Pre-Execute Action runs #4062

waldemar ·

We have a case where QB fails with 'Failed to delete file'. It appears QB is trying to delete the workspace before the script in 'Pre-Execute Action' runs. The failure happens because the file is owned by root. The 'Pre-Execute Action' script would do the right thing (sudo rm -rf), but unfortunately QB does not give it a chance.

In our setup, we run a 'Snapshot Taking' script to set up SSH keys for the git repository. It seems that QB tries to delete the workspace as part of snapshot taking. Here is the trace; the snapshot taking script is listed in it as well.

2019-03-04 16:09:56,759 [pool-1-thread-4577] ERROR com.pmease.quickbuild.DefaultBuildEngine - Error processing build request.
java.lang.RuntimeException: Error executing check condition job.
at com.pmease.quickbuild.CheckConditionTask.reduce(CheckConditionTask.java:39)
at com.pmease.quickbuild.CheckConditionTask.reduce(CheckConditionTask.java:16)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:155)
at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:395)
at com.pmease.quickbuild.DefaultBuildEngine.access$000(DefaultBuildEngine.java:143)
at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:1233)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Failed to evaluate below expression in configuration 'gaikai/ebuilder/brd/brd release':
groovy:
cmd='install_github_ssh_keys'
logger.info('snapshot taking on: ' + node.getHostName() + ' exec: ' + cmd)
assert util.execute(cmd) == 0, "Could not install keys"
for (repo in configuration.getReferencedRepositories()) {
repo.takeSnapshot();
}
at com.pmease.quickbuild.util.ExceptionUtils.wrapException(ExceptionUtils.java:87)
at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:321)
at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:74)
at com.pmease.quickbuild.setting.configuration.snapshot.ScriptSnapshotTaking.takeSnapshot(ScriptSnapshotTaking.java:39)
at com.pmease.quickbuild.model.Configuration.takeSnapshot(Configuration.java:1988)
at com.pmease.quickbuild.CheckConditionJob.execute(CheckConditionJob.java:35)
at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:129)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
Caused by: com.pmease.quickbuild.QuickbuildException: Failed to delete file '/opt/qb/buildagent/workspace/2803/release/ms7-1.0.1/breview/app/node_modules/p-limit/readme.md'.
at com.pmease.quickbuild.plugin.basis.BasisPlugin$32.evaluate(BasisPlugin.java:397)
at com.pmease.quickbuild.DefaultScriptEngine.evaluate(DefaultScriptEngine.java:305)

We know that some of our builds leave files owned by root behind. We have a script in the Pre-Execute Action to clean this up, but it looks like this script does not run early enough. This is with QB 8.0.37.
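
To confirm which leftover files are the problem, something like the following could be run on the agent. This is only a sketch: the function name list_foreign_files is made up for illustration, and the workspace path has to be supplied by the caller. It prints every entry under the given directory that is not owned by the user running the script.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not part of QB):
# print every entry under the given directory that the current
# user does not own, e.g. files left behind by a root build step.
list_foreign_files() {
    ws="$1"
    # id -u prints the effective numeric user ID; find's -user
    # primary accepts a numeric ID as well as a user name.
    find "$ws" ! -user "$(id -u)" -print
}
```

An empty result means the build user owns everything in the workspace, so a plain delete should not fail on ownership grounds.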

robinshen ADMIN ·

QB does not have the logic to try deleting any file while taking snapshot. Is it possible that "install_github_ssh_keys" includes some file deletion logic?

waldemar ·

The script copies keys from vault to a tmpfs file, then updates .ssh accordingly. It does rm -rf $HOME/.ssh but it does not operate on workspace files. It is a shell script. If it failed, I would expect to see the assertion fail. Instead, we see a Java exception.
Perhaps util.execute() checks the exit code and raises an exception before we have a chance to assert, but how would it know what failed in the script? This looks like there is some Java code, not a shell script, that tries to delete the workspace.

I am also a bit confused about what happens where. The above exception is from the server log. The agent log does not have that exception but has a message produced by the logger.info(). I would think the script runs on the agent but the log appears on the server.

robinshen ADMIN ·

My apologies. QB does try to re-create (clean and clone again) the workspace when taking the snapshot; this happens when QB detects that some setting of the git repository definition has changed. Please put the permission-changing logic at the start of the snapshot taking script so that QB can go ahead and do what is necessary.
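
A minimal sketch of such a permission fix, written as a shell helper that the snapshot taking script could invoke through util.execute() in the same way it invokes install_github_ssh_keys. The function name reclaim_workspace and the SUDO override variable are assumptions made for illustration; the workspace path must be passed in by the caller.

```shell
#!/bin/sh
# Hypothetical helper (not part of QB): hand ownership of the
# workspace back to the current user, escalating only if needed.
# SUDO can be overridden for dry runs; it defaults to "sudo".
reclaim_workspace() {
    ws="$1"
    owner="$(id -u):$(id -g)"
    # Look for at least one entry not owned by the current user.
    foreign=$(find "$ws" ! -user "$(id -u)" -print 2>/dev/null | head -n 1)
    if [ -n "$foreign" ]; then
        # Something is foreign-owned, so a privileged chown is required.
        ${SUDO:-sudo} chown -R "$owner" "$ws"
    fi
}
```

Because the privileged command only runs when a foreign-owned entry is found, the helper is a no-op on workspaces that are already clean.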

waldemar ·

At the time snapshot taking runs, we are already committed to a new build, so running 'sudo chown -R build.build' does not make a lot of sense for us. We want to clean the workspace instead. I understand QB is trying to re-use existing repositories to avoid re-cloning. But the build does not know about that and clears the workspace at the start of the build anyway, which results in re-cloning.

I think maybe QB should just do 'sudo rm -rf' when it has to. At least it would do it only if necessary so some savings would still result.
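
As an illustration of the "only when it has to" idea at the script level (this is not how QB behaves today), a cleanup helper could try an unprivileged delete first and fall back to sudo only if something survives. The name remove_workspace and the SUDO override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: delete a workspace, using sudo only as a fallback.
# SUDO can be overridden for dry runs; it defaults to "sudo".
remove_workspace() {
    ws="$1"
    # Refuse obviously dangerous arguments.
    [ -n "$ws" ] && [ "$ws" != "/" ] || return 2
    # First attempt without privileges; errors are expected if
    # root-owned files are present, so they are silenced here.
    rm -rf -- "$ws" 2>/dev/null
    if [ -e "$ws" ]; then
        # Something could not be removed, so escalate.
        ${SUDO:-sudo} rm -rf -- "$ws"
    fi
}
```

With this ordering the privileged call is skipped entirely for workspaces the build user can already delete.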

I also think maybe QB should keep its own workspace for evaluating repos; that way it would have full control over them and there would be no interference from builds.

In the meantime, we're going to try to refactor the build to allow a non-root build.