QB2B7: Smart Dependencies #615

scastria · 2 decades ago

The QB2 documentation contains information on build dependencies, but it only covers using the QB trigger step to let one configuration trigger another. I currently have 60 configurations that I build every night in a nightly configuration with 60 trigger steps, one for each of the 60 configurations. These 60 configurations have a complex dependency graph that I manage with IVY. To get an efficient build workflow, my nightly configuration hardcodes the 60 trigger steps as a series of parallel and serial steps, building as much in parallel as possible while still building the lower-level dependencies before the modules that need them. This is OK, but it does not produce THE most efficient build workflow possible. Because the build time of each module varies, a more dynamic approach is needed for a perfectly efficient solution. I propose the following enhancement, which I do not believe is available in any other automated build system that I know of (I could be wrong):

1. Enhance QB so that the web GUI lets me teach QB2 the dependency information for all 60 of my configurations. (It would be nice to have QB2 get this information from IVY, but that is not required for this solution and could be added later.) The web GUI would add a "dependencies" property to the configuration object, to be stored in the database. The dependencies property is simply a list to which I can add the other configurations that this one depends on. Of course, I would need scripting support so that this list of configuration names can contain variables.

2. Add a new composite step type called "dependency" or "dynamic" or "graph", something like that. This new composite step type would work just like the parallel step type in terms of web GUI where you just drop in one or more steps (should be restricted to trigger step types) where order does NOT matter. The ordering will be dynamic.

3. This new "dependency" composite step type would work as follows:
a. The list of trigger steps contained within is scanned to find those that have 0 dependencies using the dependency information I provided in enhancement #1 above
b. All 0 dependency trigger step configurations are then triggered all at once
c. Then every time any configuration completes, the process is repeated again searching for any new trigger steps (that have not been triggered yet) whose dependencies have finished building now that 1 or more configurations have just completed
d. These new trigger steps with completed dependencies are then triggered all at once
e. Repeat the process until no trigger steps are left to be built
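
As a rough illustration of steps a through e, here is a minimal Python simulation. It is only a sketch of the proposed behaviour, not QuickBuild code: the function name `schedule`, the `deps` and `durations` dictionaries, and the assumption of unlimited workers are all hypothetical.

```python
import heapq


def schedule(deps, durations):
    """Simulate the proposed "dependency" composite step.

    deps maps each configuration to the set of configurations it depends on.
    durations maps each configuration to a (hypothetical) build time.
    Returns {configuration: (start_time, finish_time)}.
    """
    remaining = {name: set(d) for name, d in deps.items()}
    times = {}
    running = []  # min-heap of (finish_time, name)
    now = 0

    def trigger_ready():
        # Steps a-d: trigger every untriggered configuration whose
        # dependencies have all finished.
        ready = sorted(n for n, d in remaining.items() if not d)
        for name in ready:
            del remaining[name]
            finish = now + durations[name]
            times[name] = (now, finish)
            heapq.heappush(running, (finish, name))

    trigger_ready()
    while running:
        # Step c: a configuration completes, so re-scan for newly ready ones.
        now, done = heapq.heappop(running)
        for d in remaining.values():
            d.discard(done)
        trigger_ready()

    if remaining:
        # Nothing is running but some configurations were never triggered.
        raise ValueError("circular or unknown dependencies: %s" % sorted(remaining))
    return times
```

With a slow independent module and a short chain, each configuration starts the moment its last dependency finishes, which is the property the hardcoded serial/parallel layout cannot guarantee when build times drift.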

This guarantees that every configuration in my collection of 60 is triggered as soon as its dependencies have finished. It results in slightly different build workflows every time, since the timing of each module can fluctuate for various reasons, such as a developer adding more unit tests that significantly slow down the build. With my current approach of hardcoding serial and parallel steps, I will be constantly tweaking the workflow as things change over time. With the solution described above, all I would have to do is maintain the dependency information in the QB2 database, which probably won't change frequently AND which could later be handled automatically with QB2+IVY integration.

At first glance, this may sound like a huge enhancement, but perhaps not. #1 would probably only take 1 day or so to add the extra property. Then all the work left is to code the new "dependency" composite step process which is probably centralized to a small place in the QB2 code base.

robinshen ADMIN · 2 decades ago

2.0 beta8 has just been released to address your scenario, together with other fixes. Agents need to be re-installed when upgrading to this version. Refer to the release notes for details:
http://wiki.pmease.com/display/qb2/2.0+beta8

Project dependencies are implemented through the QuickBuild repository, as introduced in the documentation. It is designed not only to trigger other configurations, but also to retrieve artifacts from those configurations. For situations where artifacts are resolved and retrieved outside of QuickBuild (for example, using Maven or Ivy), the trigger step can simply be used to declare the dependency. For example, if project1 depends on project2, you may define a step that triggers a build of project2 with the "waiting for finish" option checked, and add this step to the step execution graph of project1. This trigger step should be arranged to execute before the step retrieving Ivy artifacts. This simple approach also applies to complex dependency graphs. Consider the dependency graph below:
1. project1 depends on project2 and project3
2. project2 depends on project3

To reflect this dependency graph in QuickBuild, just do the following:
1. Add a trigger step to trigger project2 and project3 in project1 (note that a single trigger step is able to trigger multiple configurations simultaneously)
2. Add a trigger step to trigger project3 in project2

With this defined, project3 will be triggered and finished first, then project2, and finally project1. Within a single trigger chain, only one build will be generated per project.

scastria · 2 decades ago

This sounds very promising!!

Two questions come up:

1. In your last complex scenario, project 3 would be triggered by both 1 and 2 at about the same time. What do you do to detect this and only build project 3 once?
2. One of my previous posts mentioned the ability to be on the overview page of an active build that shows me the colored graph as it is building, and to click on a trigger step to go to the overview page of the configuration that the parent build is waiting on. With or without this, how would I get an overview of all 60 modules as they build in dependency order via the QB2 project dependencies mechanism? I would only see the direct dependency trigger steps of the top-level module in its overview page, with no easy way of seeing the build progress for the low-level dependencies. For example, if we add project 4 to your complex example and make it dependent on project 1, then I would want to look at the overview progress page for project 4 and see the status of ALL descendants. The way QB2 is now, I would only see the trigger step for project 1 and not be able to tell whether project 1 is actually building or is really waiting on its dependent trigger steps. Being able to click on trigger steps, as my previous post discussed, would make this better since it would ease navigation down through the dependencies as they build.

robinshen ADMIN · 2 decades ago

QuickBuild uses the concept of a trigger chain to keep track of builds directly or indirectly triggered by a user (or scheduler). When a configuration gets a triggering request from a trigger step, it first checks whether a build in this configuration is already associated with the current trigger chain. If yes, the associated build is returned; if not, a new build is generated and associated with the current trigger chain. In this way, each configuration is guaranteed to generate only a single build even if it is referenced by multiple trigger steps.
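
To make the lookup concrete, here is a minimal Python sketch of the idea. It is not QuickBuild's implementation: the `TriggerChain` class, its fields, and the `deps` mapping are hypothetical, and dependencies are triggered one after another rather than simultaneously to keep the example short.

```python
class TriggerChain:
    """Toy model: at most one build per configuration within a chain."""

    def __init__(self):
        self.builds = {}  # configuration name -> build record
        self.order = []   # order in which builds finished

    def trigger(self, config, deps):
        # If this chain already has a build for the configuration, reuse it.
        if config in self.builds:
            return self.builds[config]
        # Otherwise create a new build and associate it with the chain.
        build = {"config": config, "id": len(self.builds) + 1}
        self.builds[config] = build
        # Trigger dependencies and wait for each to finish before continuing.
        for dep in deps.get(config, ()):
            self.trigger(dep, deps)
        self.order.append(config)
        return build


# The example graph from earlier in the thread.
deps = {"project1": ["project2", "project3"], "project2": ["project3"]}
chain = TriggerChain()
chain.trigger("project1", deps)
```

Here project3 is requested twice (by project1 and by project2), but the second request finds the existing build in `chain.builds` and returns it instead of creating another.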

As to your second concern, QuickBuild currently lacks the ability to present the whole dependency graph. The ability to click the trigger step in the step status graph to navigate to the triggered configuration helps to some extent, and we've created a JIRA issue to track this. However, we may not implement it in version 2.0 since this feature is superseded by another, bigger feature: the ability to display step contents in the step status graph. A workaround is to view configurations in the dashboard, which, when expanded, displays all configurations that are currently being built.

PS: Please re-download beta8 since the original version includes license check and refuses to work if #configurations exceed 16.

scastria · 2 decades ago

I am attempting your suggestion, but it seems your explanation is mixing two things. First you say to use a QB repository, and then you mention a trigger step that can trigger more than one configuration. A trigger step does not want a QB repository, but simply a comma-delimited list of configurations. So it seems I don't need to create a QB repository unless I want to retrieve artifacts? What if I wanted artifacts from multiple QB repos? The checkout step doesn't allow more than one QB repo to be specified. That makes the trigger step inconsistent with the checkout QB step.

I just added a trigger step to trigger dependencies before anything else in my master step definition. I am getting this exception:

22:52:19,8 [before-dependencies@cmlinuxbuild2:8811] ERROR - Step 'before-dependencies' is failed.
com.pmease.quickbuild.RemotingException: The collection was unreferenced
at org.hibernate.impl.SessionImpl.getFilterQueryPlan(SessionImpl.java:1437)
at org.hibernate.impl.SessionImpl.createFilter(SessionImpl.java:1247)
at com.pmease.quickbuild.persistence.AgentQueryCarrier.assembleQuery(AgentQueryCarrier.java:77)
at com.pmease.quickbuild.ServerServlet.queryUniqueResult(ServerServlet.java:76)
at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.caucho.hessian.server.HessianSkeleton.invoke(HessianSkeleton.java:192)
at com.caucho.hessian.server.HessianSkeleton.invoke(HessianSkeleton.java:110)
at com.caucho.hessian.server.HessianServlet.service(HessianServlet.java:416)
at com.pmease.quickbuild.GridRemotingServlet.service(GridRemotingServlet.java:31)
at org.eclipse.equinox.http.servlet.internal.ServletRegistration.handleRequest(ServletRegistration.java:90)
at org.eclipse.equinox.http.servlet.internal.ProxyServlet.processAlias(ProxyServlet.java:111)
at org.eclipse.equinox.http.servlet.internal.ProxyServlet.service(ProxyServlet.java:59)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
at org.eclipse.equinox.http.jetty.internal.HttpServerManager$InternalHttpServiceServlet.service(HttpServerManager.java:318)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:380)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:535)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:880)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:835)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:213)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:520)

scastria · 2 decades ago

It looks like if I force the trigger step to run on the server node, then it works. Are trigger steps limited to be run on server node only?

scastria · 2 decades ago

In addition to the previous question, I think I found a Queue bug/issue with triggering builds:

I only have 1 queue and it allows for 10 workers. If I start one build that then triggers 9 dependent builds and if each one of those 9 dependent builds has 1 dependent build, then nothing happens. The lowest level builds that CAN actually do something since they have no dependencies are in the WAITING state. The 10 workers are used up by all builds waiting for dependencies to be built which will never happen because there are no more workers to use. So it is in deadlock. Perhaps, a build that triggers another build with "wait until it finishes" turned on should give up its spot in the queue for its dependencies to execute? Or maybe come up with another build state like Pending? I am not sure what the best solution is here. The only way for my build to do anything is if I make my workers HUGE to get enough empty spots for the lowest level dependencies to run first.
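
The deadlock can be reproduced with a toy Python model. This is only a sketch of the scenario described above, not of QuickBuild's queue: the function `run_with_queue`, the build names, and the rule that a build holds its worker until all of its dependencies finish are assumptions made for illustration.

```python
def run_with_queue(deps, root, workers):
    """Return builds in finish order, or raise if no build can make progress."""
    holding = []      # builds currently holding a worker
    waiting = [root]  # builds queued for a worker (FIFO)
    finished = []
    requested = {root}
    while holding or waiting:
        progressed = False
        # Hand free workers to queued builds; each started build
        # immediately triggers its dependencies.
        while waiting and len(holding) < workers:
            build = waiting.pop(0)
            holding.append(build)
            for dep in deps.get(build, ()):
                if dep not in requested:
                    requested.add(dep)
                    waiting.append(dep)
            progressed = True
        # A build finishes (and frees its worker) once all its deps are done.
        for build in list(holding):
            if all(d in finished for d in deps.get(build, ())):
                holding.remove(build)
                finished.append(build)
                progressed = True
        if not progressed:
            raise RuntimeError("deadlock: %d builds hold workers, %d wait"
                               % (len(holding), len(waiting)))
    return finished


# One top-level build, 9 dependent builds, each with 1 dependency of its own.
deps = {"top": ["mid%d" % i for i in range(1, 10)]}
for i in range(1, 10):
    deps["mid%d" % i] = ["leaf%d" % i]
```

In this model, `run_with_queue(deps, "top", 10)` raises because the top build and the 9 middle builds occupy all 10 workers while every leaf build sits in the queue; with 11 workers the same graph completes.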

robinshen ADMIN · 2 decades ago

Yes, QuickBuild repository is used to retrieve artifacts besides triggering dependencies while trigger step is simply used to trigger dependencies. To checkout from multiple QuickBuild repositories, just define multiple checkout steps.

Currently it is a bug that the trigger step can only be executed from the server node. We will fix this bug in the next version. For now, please just arrange for this step to be executed on the server node. It does not put any load on the server, since this step just triggers the specified configurations and waits for them to finish.

The queue problem will also be addressed in next version and we will assign a worker to whole trigger chain instead of single build to avoid the deadlock issue.

Thanks

scastria · 2 decades ago

Can you describe this queue bug fix you made a little more? If I set my queue to 1 worker and I trigger a 60-module trigger chain, I would like nothing to get deadlocked and to have 1 module build at a time. Is that the behavior I should expect?

#10

robinshen ADMIN · 2 decades ago

Consider the case where you manually trigger build A, and build A automatically triggers build B (either through the trigger build step or through the QuickBuild repository). Build A will be queued, and if there is a free worker, it will run. However, when build B is fired as a result of build A, QuickBuild knows that it is a dependent build and will execute it right away, even if there are no additional workers, to avoid deadlock, since build A might be waiting for build B to finish.

Hope this clarifies things.

#11

scastria · 2 decades ago

I understand your simple case, but I am referring to a more complicated case. Let's say A triggers B, C, D, E, and F in parallel. Build A will continue only after all 5 dependency builds finish. Let's say only 1 worker is available. A will start, grabbing the free worker. B, C, D, E, and F all want to trigger at the same time. My question is: will all 5 dependencies build simultaneously even though only 1 worker is available? OR will B run using the worker that A originally grabbed, with C, D, E, and F waiting for B to finish, and so on? I think the latter is the correct solution. If QB2 implements the former, then there is no purpose in configuring the number of worker threads in a trigger chain scenario, as it is being incorrectly ignored.

I have a case where I only have 1 node available to perform javadocs. I don't want that node to get overloaded so I set all the javadocs builds to have their own queue called "docs". I set the docs queue to 1 worker thread so that the 1 node that does javadocs only does 1 thing at a time. If I trigger my complex 60 module trigger chain from a single top level module, I do NOT want all 60 modules to build simultaneously on that one node. That will kill the 1 node.

When a dependency build runs from a trigger of another build, I think the dependency build should take the place of the parent build in the worker thread. If no worker threads are available, then that build should wait. Assuming there are no circular dependencies, this should NOT cause a deadlock since there will always be a module at the bottom of the trigger chain with no dependencies. That will use up any available worker threads.

My testing on QB2.0.2 shows the number of worker threads is ignored in a trigger chain scenario. I think this is incorrect.

#12

robinshen ADMIN · 2 decades ago

As you've observed, the queue does not apply to dependency builds. In your example, build A cannot release its worker since it might itself still be doing heavy work while waiting for other builds to finish: consider the case where an additional build step is added to the parallel composite step that triggers the dependency builds.
So under QB's architecture, the original build cannot release its worker since QB has no idea what it is doing (it is possible that it is waiting for a dependency build in one step and running tests in another step at the same time).
The queue is designed to control original builds and has no effect on dependency builds. In 2.1, we are going to generalize the queue into a resource which can be applied at step level (instead of at build level, as the queue currently is), and your specific issue here can be addressed then.

#13

robinshen ADMIN · 2 decades ago

For now, you may utilize the "pre-execute script" and "post-execute script" of the "docs" step to prevent the docs node from being overloaded:
1. pre-execute script detects if a lock file exists. If yes, it waits until the file is deleted; otherwise, it creates the lock file and continues.
2. post-execute script deletes the lock file.
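
The locking logic itself can be sketched in Python as follows. This only illustrates the technique and is not a QuickBuild script: the function names, the polling interval, and the lock path are hypothetical. One deliberate difference from the two-part check described above is that the lock file is created atomically with `O_CREAT | O_EXCL`, so two builds cannot both see "no lock file" and then both create it.

```python
import os
import time


def acquire_lock(path, poll_seconds=1.0, timeout=None):
    """Pre-execute: wait until the lock file can be created, then create it.

    Returns True once the lock is held, or False if `timeout` seconds pass.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the file already exists,
            # so checking and creating happen in one atomic step.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if deadline is not None and time.monotonic() >= deadline:
                return False
            time.sleep(poll_seconds)
        else:
            os.close(fd)
            return True


def release_lock(path):
    """Post-execute: delete the lock file (ignore it if already gone)."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

A stale lock file left behind by a build that crashed before its post-execute script ran would block later builds indefinitely, so a real setup would likely need a timeout or manual cleanup.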

#14

scastria · 2 decades ago

This lock file solution sounds a little kludgy. I think I have another workaround which should work until 2.1 comes out.

I will add a trigger-docs step in my standard module build workflow. The trigger step will be set to NOT wait until finished. Then when I trigger my large dependency trigger chain from the top level module, a bunch of docs builds will be triggered as a side effect. Since all of these docs builds will be normal standalone builds, they will honor the queue setting and go one at a time.

#15

scastria · 2 decades ago

This didn't work. QB still ignored the queue worker thread setting on these separate triggered builds. I don't think I can wait until 2.1 for a solution to this problem.

I understand the way you implemented QB2 with respect to trigger chains and preventing deadlocks. I am happy to wait until 2.1 to get this resolved in a more flexible way for trigger chain builds. However, I think your current solution should only apply to triggered builds with "Wait Until Finish"=true, since only those could cause deadlocks. Since my suggested workaround involves triggering external builds with "Wait Until Finish"=false, I think those should be considered standalone builds that are NOT part of any trigger chain AND honor the queue worker thread setting.

If you agree with my proposal, could you possibly make this change for a new 2.0.4 build to keep me going until 2.1 so that triggered builds with "Wait Until Finish"=false are considered independent and honor the queue worker thread setting?

#16

robinshen ADMIN · 2 decades ago

We cannot do this even if "wait for finish" is set to "false", since other builds in the same chain might still be waiting for the build. For example:
1. A triggers B and waits for it.
2. A triggers C and does not wait for it.
3. B triggers C and waits for it.

Since QB makes sure that only one build is fired if it is requested multiple times in the same trigger chain (in this case C is requested by both A and B), and since QB does not have a global picture of the whole dependency graph, it simply cannot assume that there are no other builds waiting for the current build.

For now, the best workaround is to utilize step pre-execute and post-execute scripts. With groovy, it should be rather easy to get it implemented.

Sorry for the inconvenience.

#17

robinshen ADMIN · 2 decades ago

Another approach I am thinking of is to make the "docs" step run in a single configuration and set the run mode of that configuration to "sequentially". Other builds just trigger that configuration instead of running the "docs" step directly.

#18

scastria · 2 decades ago

I have a docs child configuration for each of my 60 modules. There isn't just one docs configuration so this "sequential" suggestion won't work.

I guess another workaround is for me not to use my main trigger chain to kick off the javadocs builds but to set each of the separate docs configurations to be run continuously with their own schedule. Since each of these will be standalone builds, they will honor the queue worker thread settings.