We just upgraded from 5.0.7 to 5.1.36, and I'm seeing strange behavior when stopping a running build.
Some steps show their status as 'cancelled', some seem to continue running (and pass), others show 'failed', and others are 'skipped'.
Previously, when we stopped a build, the step that was running at that moment was marked 'failed' and the rest were skipped.
The build did stop after a few minutes, but I can't tell what actually ran and what didn't. (Clearly, I don't want anything to run once I hit stop...)
Is this a bug? Something that needs to be changed in step setup?
In the new QB version, cancelling does not stop the build immediately; it only sends a cancellation signal (SIGTERM) to the running steps. In some cases a step might not respond to the signal promptly, for instance if the step forks a command that does not respond to SIGTERM, or if the step itself runs a Groovy script with a loop. Previous versions of QB cancelled the build forcibly, which could result in unexpected behavior: the build would be marked as cancelled, but some of the processes forked by its steps might still be running and cause issues when another build runs.
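To illustrate why a looping script may not stop promptly, here is a minimal sketch of cooperative cancellation. It is not QuickBuild code: Python is used purely for illustration (a real script step would be Groovy), and `run_step`, `cancel` and the `cooperative` flag are hypothetical names. The point is only that a loop stops early if it checks for the cancellation request itself.

```python
import threading


def run_step(cancel: threading.Event, iterations: int, cooperative: bool) -> int:
    """Run a loop and return how many iterations completed.

    `cancel` is a hypothetical stand-in for the cancellation request
    that the build server delivers to a running step.
    """
    done = 0
    for _ in range(iterations):
        # A cooperative step polls the cancellation flag on every pass.
        # A non-cooperative step never looks at it and runs to the end.
        if cooperative and cancel.is_set():
            break
        done += 1
    return done


cancel = threading.Event()
cancel.set()  # cancellation has already been requested

print(run_step(cancel, 5, cooperative=True))   # 0
print(run_step(cancel, 5, cooperative=False))  # 5
```

In the same spirit, a script with a long loop would have to check for the cancellation request on each pass in order to stop soon after "Stop Build" is pressed.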
Thanks for the response.
I understand that the running step may take time to stop and that the actions / scripts / etc behind it will complete before the build stops. (I believe that this was always the case.)
My issue is regarding subsequent steps.
If step5 was running when we issued the stop (and it shows up in the step status as cancelled) then shouldn't all steps after step5 be skipped?
We're seeing that - for example - step6 is 'successful', step7 and step8 are 'skipped', step9 is 'failed', step10 and step11 are 'skipped', etc.
Thanks for the clarification. This seems abnormal. I tested on my side, and all steps following the cancelled step are marked as skipped. Can you please let me know the detailed steps to reproduce this issue?
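For anyone comparing their own step lists against the expected behavior, the check can be sketched as below. This is only an illustration under stated assumptions: `unexpected_after_cancel` is a hypothetical helper, not part of QuickBuild, and the status list is typed in by hand from the example reported above rather than read from any QuickBuild API.

```python
def unexpected_after_cancel(steps):
    """Return names of steps after the first 'cancelled' step whose
    status is anything other than 'skipped'.

    `steps` is a list of (name, status) pairs in execution order.
    """
    seen_cancel = False
    unexpected = []
    for name, status in steps:
        # Once a step has been cancelled, every later step is expected
        # to be skipped; anything else is reported.
        if seen_cancel and status != "skipped":
            unexpected.append(name)
        if status == "cancelled":
            seen_cancel = True
    return unexpected


# Statuses as reported in the example above (entered by hand).
observed = [
    ("step5", "cancelled"),
    ("step6", "successful"),
    ("step7", "skipped"),
    ("step8", "skipped"),
    ("step9", "failed"),
    ("step10", "skipped"),
    ("step11", "skipped"),
]

print(unexpected_after_cancel(observed))  # ['step6', 'step9']
```

Note that this cannot tell a step skipped because of the stop from one skipped because of its own execute condition.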
It's not a specific type of step. It's happened on several builds, and on multiple nodes in each build. The behavior isn't really consistent.
Here's an example from one build...
The build was cancelled during a Shell/Batch Command step. Subsequent steps went as follows:
- Several Checkout steps were run successfully
- An Execute Script step ran successfully
- Three Ant steps are marked as cancelled
- An Execute Script step was successful
- An Artifacts step was successful
(this was the end of one "group" of steps in the build, and then the next group continued)
- Ant step - cancelled
- Publish step successful
- Shell/Batch step - cancelled
- Execute Script - skipped (probably because of build step condition)
- 2 Shell/Batch steps - failed
There are more steps after that, but I think that gives an idea of what we're seeing.
The build continues to run - some steps show up as cancelled, others fail (probably because of previous steps that didn't run correctly.)
Looking at a few stopped builds, it seems that 'skipped' steps are skipped because of their execute conditions and not because the build is stopping.
So overall, it could just be that the build is continuing to run and the stop itself isn't working.
I thought that maybe it was caused by issuing the Stop command during a certain type of step (e.g. Shell/Batch Command) but I see cases where the first step marked as 'cancelled' is an Ant step.
Some other notes that may be of interest:
- We're running with the 'Legacy Command Mode' setting set to true (because otherwise these steps were failing)
- Our builds run on multiple nodes and often have steps that do things like copying files from one node to another.
- There are groups of steps, such as "Checkout and Build", followed by "Install and Test"
- The builds are generally running fine. It's just that the "Stop Build" isn't working correctly.
What other information can I provide that would be of help?
Thanks for the info. I was able to reproduce it. All the steps that keep running are steps that do not respond to thread interruption (the execute script step and the artifact publish step do not respond to it). I filed an issue as below:
Please watch it to get notified when fixed.
The link that you said to watch - viewtopic.php?f=1&t=3386 - links me back to this thread.
Was that the intention, or is there a separate link to track the open issue?
Sorry, my mistake. The correct link is here: