
QuickBuild Interface is slow #2397

amn ·
We have been running into issues with slow performance of the QuickBuild interface. Do you have any recommendations for this? The responsiveness fluctuates, but the interface can become almost unusable at times that seem to correspond to GC Time climbing to as much as 4 minutes and the number of GC Runs closing in on 600. The thread count also approaches 600 at these times.

QuickBuild Server:
Users: 269
Configurations: 12561
Build Agents: 801 (around 215 are active currently with the others being started when they are going to be used and then shut back down)
User Agents: 2

Machine resources:
CPU: 2.6 GHz AMD / 4 processors
Memory: 8GB

Settings in the wrapper.conf for memory:
wrapper.java.additional.3=-XX:MaxPermSize=1024m

# Initial Java Heap Size (in MB)
wrapper.java.initmemory=3072

# Maximum Java Heap Size (in MB)
wrapper.java.maxmemory=3072
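
For anyone who wants to cross-check the GC Time, GC Runs and thread count figures mentioned above, here is a minimal sketch using the standard `java.lang.management` API. The class name `GcSnapshot` is hypothetical, and the code reports on the JVM it runs in; reading the QuickBuild server's own values this way would require a JMX connection to that process, which is an assumption about the setup and not something QuickBuild is known to provide out of the box.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of QuickBuild: reads GC and thread
// counters for the JVM this code is running in.
final class GcSnapshot {

    /** Total collections across all collectors; undefined (-1) values are skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds; undefined (-1) values are skipped. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Current number of live threads, daemon and non-daemon. */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        System.out.printf("GC runs: %d, GC time: %d ms, threads: %d%n",
                totalGcCount(), totalGcTimeMillis(), liveThreadCount());
    }
}
```

Sampling these values periodically and logging them next to the slow periods would show whether the slowdowns line up with long collections or with thread growth.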
robinshen ADMIN ·
Please try the following to see if the situation improves:
1. Make sure you are running the latest QB version.
2. Edit "conf/hibernate.properties" to increase the number of database connections by raising the property "hibernate.c3p0.max_size" to, for example, 100.
3. Make sure all your build jobs run on build agents, to reserve the QB server for web access and build job orchestration.
4. Designate a particular build agent to manage artifact publishing/downloading for the server.
amn ·
1. We are running version 5.0.13 and there has been nothing in the change log relating to any performance changes.
2. We have already increased the number of database connections to 25 and have not seen the number of busy connections go above 13.
3. All of our jobs are running on build agents. Nothing runs on the QB server. It is only for web access and build job orchestration as you suggested.
4. We do not publish any artifacts to the server with our builds which also means nobody is downloading them from the server.
robinshen ADMIN ·
While the interface is slow, can you please check whether many builds are running by examining the queue page? Please also run jstack to get a stack trace of the QB process when this happens.
amn ·
There were 46 builds in the queue. I ran jstack, but it is too large to post in a message so I emailed it to you with the same subject as this thread.
robinshen ADMIN ·
From the stack trace, I saw only one busy UI thread, and it is reading a response from the database.

java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
...
at com.pmease.quickbuild.web.page.configuration.ConfigurationOverview$26$1.load(ConfigurationOverview.java:530)

This thread is serving a request for the configuration overview page. However, since the thread is in the RUNNABLE state, it is unlikely to be blocked for a long time. Can you please take another stack trace while waiting for the response to an obviously slow web request?

Also, I just noticed that you have 12561 configurations. With this many configurations, I'd suggest raising the maxmemory used by the QB JVM process to 8G.
amn ·
Do you have a suggestion for the PermGen and minmemory settings?

I emailed you a new stack trace where trying to go to the dashboard page was taking a while to load.
robinshen ADMIN ·
PermGen should be set to 256M, and minmemory does not matter. Also, please turn off measurements to see if it helps. To do so, edit the settings of the plugin "grid measurements plugin" on the "administration / plugin management" page, uncheck the option "collect metrics", then save the settings.
amn ·
One thing that has become clear is that it seems the interface is much slower for users that are not admins.

This makes me wonder if the database requests are taking up a lot of time trying to find out what configurations to display to a user and what permissions they have.
robinshen ADMIN ·
Please send your database backup to [robin AT pmease DOT com] and I will check what might be wrong.
amn ·
After restoring from our backup, it looks like QuickBuild performance is going to be a problem with our configuration and permission structure. Currently we have things set up like this:

Project A
Continuous Integration
Create_Prodfix_Branch
install
activate
dev
test
prod
deploy
dev
test
prod
deploy and activate
dev
test
rollback
dev
test
prod
operations
bounce
dev
test
prod
start
dev
test
prod
stop
dev
test
prod
prodfix_branch
build
release
snapshot
merge_changes_to_trunk
trunk
build
maven site
release
snapshot
test


Then each project has two permissions groups associated with it. One is for the developers of it and the other is for operators who run production installs and monitor the production environment.

Is anyone else doing something similar or is there a suggestion on how this could be done differently for QuickBuild to be able to handle without significant performance impacts?
robinshen ADMIN ·
As explained in the email I sent you, this permission model causes too many groups to be created, and considering you have more than 13000 configurations, QB takes considerable time to check permissions for a user with hundreds of groups. This is a burden both on QB itself and on your maintenance work.

If you assign a group to a parent configuration, adding child configurations under that parent does not require you to add each one to the group, since a group automatically gains access to child configurations once it is granted access to the parent. So organizing the configuration hierarchy appropriately will also reduce your work and improve performance. As a last resort, you may write a script to assign permissions to groups programmatically in a batch.
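
To illustrate the inheritance idea only, here is a minimal sketch of a check that walks up a configuration path until it finds an explicit grant. This is not QB's actual implementation or API: the class `InheritedPermissions`, its methods, and the `/`-separated path format are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of "a grant on a parent covers its children".
final class InheritedPermissions {
    // group name -> configuration paths granted explicitly
    private final Map<String, Set<String>> grants = new HashMap<>();

    void grant(String group, String configPath) {
        grants.computeIfAbsent(group, g -> new HashSet<>()).add(configPath);
    }

    /** True if the group was granted the path itself or any of its ancestors. */
    boolean canAccess(String group, String configPath) {
        Set<String> granted = grants.get(group);
        if (granted == null) {
            return false;
        }
        String path = configPath;
        while (true) {
            if (granted.contains(path)) {
                return true;
            }
            int slash = path.lastIndexOf('/');
            if (slash < 0) {
                return false; // reached the root without finding a grant
            }
            path = path.substring(0, slash); // step up to the parent
        }
    }

    public static void main(String[] args) {
        InheritedPermissions perms = new InheritedPermissions();
        perms.grant("project-a-developers", "Project A/install");
        // Covered by the grant on the parent "Project A/install":
        System.out.println(perms.canAccess("project-a-developers", "Project A/install/deploy/dev"));
        // Not covered, no grant on this branch or its ancestors:
        System.out.println(perms.canAccess("project-a-developers", "Project A/operations/bounce/prod"));
    }
}
```

With one grant per subtree, the number of permission lines per group depends on how many subtrees need distinct rules, not on the total number of configurations.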
amn ·
The permissions cannot simply be defined at some higher level and inherited down. Users are allowed to do different things in each configuration. For example, developers can only run builds and install/operate the dev and test environments. They are denied access to prod installs and operations. The opposite is then true for the operators groups.

Also, no edit permissions are allowed below the actions level configurations so that they are uniform across environments.

You say that this is a burden on our maintenance task, but this truly simplifies our maintenance task. It makes it easy to see what users have access to what projects and also was our best option with the limited actions available within permission groups. Simply a button to add a new permission or remove it and then having to type in the configuration and what access is needed for that configuration is poor to say the least. We created a template permission group for a project so all we have to do is copy it and do a replace all on template with the project name and add the appropriate users. For the operators this is a predefined list in the template group so we don't even have to worry about adding users, we just copy it and replace the word template with the project name in each permission line.

Is there some way to concatenate a bunch of permission groups and still be intuitive to see what a user has access to without having to go through hundreds of permission lines in a giant permission group?
robinshen ADMIN ·
We profiled with your database, and it turns out there were still some places we could improve. The result is 5.0.31, which runs much faster even with hundreds of groups associated with a particular user.
amn ·
This new version has definitely provided UI improvement overall.

We still have performance issues at times, and looking at the metrics in QuickBuild, the GC Time can be over 4 minutes, which brings the application to a crawl. Increasing memory typically just spreads out these garbage collections, but does not shorten the collections themselves. Are there any suggestions for configuring the JVM, or ideas about what might be causing this?

You still have our database that is up-to-date basically with the only change being we upgraded to version 5.0.31.
robinshen ADMIN ·
Based on our support experience, if GC takes too long, the most effective approach is to split the instance, so that a single QB server does not have to manage builds for tens of thousands of configurations.
amn ·
And there is no way to scale the application horizontally by adding multiple instances behind a farm address using the same database? (I think I had seen some mention of this in previous threads.)

Also, that didn't really answer the question about memory-specific tuning. Given that you know the number of configurations, users, permissions, etc. that we are using, are there suggested settings for the JVM heap size, garbage collection type, and memory and CPU on the server?

Previously you said that MaxPermGen should be 256M, but when we do that the server runs out of memory until we increase it to 1024M. As users of this application and not its developers, we can't realistically know what these settings should be.

Another bit of information: the GC Time is never great, but it doesn't seem to get above three minutes until the server has been up for a week, at which point the application basically becomes unusable and needs to be restarted.
robinshen ADMIN ·
QB does not support multiple instances sharing the same database. To support this, the QB server would have to coordinate its core functionalities (such as scheduling, resource allocation, and caching) across all of these instances, which is quite complicated.

We met the GC pause issue at another customer site. We dumped QB memory and found that the majority of it had already been released but was waiting to be garbage collected. Most of the objects in that memory were XML strings, which is not surprising since QB stores nearly everything as XML in the database to make migration easier. Every time a build is set to run, QB loads the relevant steps, repositories, and other objects from the build's configuration and its parent configurations, and copies them so that they are not subject to external changes during the build. This can consume considerable memory if, for instance, a parent configuration defines a lot of objects (steps, repositories, etc.), and it is difficult to improve given QB's hierarchical structure.

As for PermGen space, it is mainly used for class data and string constants, and should never exceed 256M even if you have tens of thousands of configurations. Several customers have reported that QB runs out of PermGen space, and in every case it turned out that they had forgotten to follow step 5 of the server installation guide to raise MaxPermGen to 256M. Once it was raised, they never encountered PermGen issues again. So it is quite odd that your site needs 1G of PermGen space to run.

Also, it is uncommon for GC to start crawling only a week after QB starts, unless many builds are set to run after a week. So please take a memory dump with jmap and upload it to our FTP site for analysis. (I believe I sent the FTP site/user/password previously when we discussed the PermGen space issue. If not, drop me an email and I will reply with the FTP info.)
amn ·
With various JVM and GC tuning, we have improved the performance of the interface, at least to the point that it no longer degrades over time. We still see more Full GC runs than we would like, and we are trying to tune the parameters further to eliminate those.

In the meantime we are still seeing the issue where the PermGen memory just continues to go up until it runs out and the instance reports an Out of PermGen error and needs to be restarted. Currently we have the MaxPermGen set to 2GB which can last about 20 days or so depending. Obviously we would love to not have to worry about this issue.

Has anything been done between 5.1.20 and 6.0.4 as far as memory is concerned? Is there anything we could provide that would help investigate why we seem to be the only site experiencing this issue with the PermGen memory?
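
One thing that could be captured is a log of PermGen usage over time. Below is a minimal sketch using the standard `MemoryPoolMXBean` API. The class name `PermGenWatch` is hypothetical, and the pool-name matching is an assumption: HotSpot pools are typically named something like "PS Perm Gen" or "CMS Perm Gen" on older JVMs and "Metaspace" on Java 8 and later. As written it inspects the JVM it runs in, so watching the QB process itself would need a JMX connection to it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of QuickBuild.
final class PermGenWatch {

    /** One formatted line per class-metadata pool (PermGen or Metaspace) in this JVM. */
    static List<String> classMetadataPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: pool names contain "Perm" or "Metaspace" on HotSpot JVMs.
            if (!name.contains("Perm") && !name.contains("Metaspace")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            long maxKb = usage.getMax() < 0 ? -1 : usage.getMax() / 1024; // -1 = no defined limit
            lines.add(String.format("%s: used=%d KB, max=%d KB",
                    name, usage.getUsed() / 1024, maxKb));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : classMetadataPools()) {
            System.out.println(line);
        }
    }
}
```

Logging this once an hour over the roughly 20 days it takes to exhaust PermGen would show whether usage grows steadily or jumps with particular activity.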
robinshen ADMIN ·
There are no memory-related changes between QB5 and QB6. The reason you are getting frequent PermGen space issues might be that your site has many configurations (up to 20000, if I remember correctly), and that many configurations can introduce a great many string operations when QB runs builds, since configurations are stored as XML for migration purposes. Alternatively, you could try a JVM that does not use PermGen space, such as Oracle JRockit, to see if it works.