
Unable to create directory #4296

JShelton ·

We're on 10.0.21, having recently upgraded our build grid from 6.0.28. We're seeing an issue (unsure if it's related to the upgrade, since it didn't crop up until many weeks afterwards) where a build fails almost instantly and we get an error similar to:

Unable to create directory: /home/quickbuild/qbartifacts/builds/b658/2658

Most commonly, this occurs in configurations that have been triggered by "Trigger Other Builds" step in other configurations (but we also saw it occur once on a configuration that was manually run via the "run" button). The error always shows itself in the master step and there is no actual log for it. Strangely, we can generally kick off the build again and most of the time it will run again just fine. Another time, it occurred repeatedly on one configuration before clearing itself up and running normally without an issue since.

Our global storage directory (/home/quickbuild/qbartifacts) is writable by the user that is running QB server, though that shouldn't be an issue because it doesn't happen 100% of the time. Server is running on CentOS 7. We have so far been unable to reproduce the error on demand as it seems completely random.

Any ideas on how to address this? Has this sort of error been encountered before in past QB versions?

robinshen ADMIN ·
JShelton ·

Unfortunately, upgrade to 10.0.30 has NOT fixed the issue.

I've noticed a pattern here. I've set up a test config that runs every 20 seconds; all it does is trigger another config that does nothing but execute a groovy script which logs some text. The failures very consistently happen every 2 or 3 hours and, when they start, at most 2 or 3 builds fail. Essentially, every 2 or 3 hours this failure is triggered for no more than one minute.

We are using OpenJDK 1.8.0_232.

robinshen ADMIN ·

Can you please show me the full stack trace from the log?

JShelton ·

This is the most I can get from the server log:

2021-01-19 13:22:41,896 [pool-2-thread-5586] ERROR com.pmease.quickbuild.DefaultBuildEngine - Build is failed.
    com.pmease.quickbuild.QuickbuildException: Unable to create directory: /home/quickbuild/qbartifacts/builds/b659/298659
        at com.pmease.quickbuild.util.FileUtils.createDir(FileUtils.java:174)
        at com.pmease.quickbuild.DefaultBuildEngine.run(DefaultBuildEngine.java:581)
        at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:470)
        at com.pmease.quickbuild.DefaultBuildEngine.access$000(DefaultBuildEngine.java:148)
        at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:1275)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

No logging at all is left behind on the build experiencing the error. The error (as referenced in the original post above) can only be seen in the build itself via the 'Build Overview' tab.
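One way to look for a pattern in these failures is to pull the failing paths out of the server log and count them by parent directory. The following is only a rough sketch: the function name and the regular expression are assumptions, matched against nothing more than the message format shown in the stack trace above, and it is not part of QuickBuild.

```python
import posixpath
import re
from collections import Counter

# Assumed message format, taken from the stack trace above.
FAILED_DIR = re.compile(r"Unable to create directory: (\S+)")

def count_failing_parents(log_text):
    """Count 'Unable to create directory' errors by parent directory name."""
    counts = Counter()
    for match in FAILED_DIR.finditer(log_text):
        path = match.group(1).rstrip("/")
        # For .../builds/b659/298659 the parent directory name is "b659".
        parent = posixpath.basename(posixpath.dirname(path))
        counts[parent] += 1
    return counts
```

Running this over the server log text and looking at `most_common()` on the result would show whether the same few parent directories keep coming up.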

robinshen ADMIN ·

This is odd. With all directory creation logic being synchronized, this should not happen... Is it on a local drive or a network mounted drive? Do you have any step command writing data to the publish directory manually?

JShelton ·

This is all on a local drive. I'm fairly sure we do not write to that directory in any way. In a cursory look at that directory, the only things existing are a bunch of auto generated directories and subdirectories that only contain build.log.

I have not been able to reproduce this on a Windows host, but will shortly attempt to reproduce it in a different test environment on Linux.

robinshen ADMIN ·

Are you able to reproduce this with a separate QB server only running these test builds? If yes, please send me a backup of your sample database. It will help a lot.

Thanks

JShelton ·

Oddly, I am unable to reproduce this in any other environment, even one on Linux that is nearly identical to the one experiencing the issue (including configuration layout, structure and regularly scheduled jobs). There shouldn't be any scripts at all pointing to the "qbartifacts" directory, and the searching I have done backs that up.

I will continue trying to reproduce the error, perhaps with an environment that matches even more closely if I can. Any other ideas you have to narrow this down would be appreciated.

robinshen ADMIN ·

As long as the directory creation operation is synchronized (as changed in 10.0.30), I have never seen this issue at any other site. Do you have any virus scan software running on the machine? Also, if possible, you may try with Oracle JDK to see if there is any difference.

JShelton ·

Interesting. Fairly sure we've solved the problem.

I missed a critical detail. The failing directories were always b657, b658, and b659. I didn't notice they were always the same. When I checked the qbartifacts directory, there were build dirs going up to b999 and looking in a sampling of them didn't show anything out of the ordinary. Those 3 directories were owned by root rather than the user that runs QuickBuild. The directories were last created/modified in October, before we added the user to the RUN_AS_USER property within server.sh, so the QB server must have been started using a sudo command. The fix then, clearly, was to give ownership back to the QB user.
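For anyone wanting to check for the same situation, a small script along these lines can list the directories under "builds" that are not owned by the expected user. It is a sketch only: the function name is hypothetical and it checks just the top-level directories, not their contents.

```python
import os

def find_foreign_owned(builds_root, expected_uid):
    """Return names of direct subdirectories of builds_root not owned by expected_uid."""
    foreign = []
    with os.scandir(builds_root) as entries:
        for entry in entries:
            # Only look at real directories; symlinks are not followed.
            if entry.is_dir(follow_symlinks=False):
                if entry.stat(follow_symlinks=False).st_uid != expected_uid:
                    foreign.append(entry.name)
    return sorted(foreign)
```

On the server this could be called with the builds path from the error message and the uid of the account that runs QuickBuild, for example `pwd.getpwnam("quickbuild").pw_uid`, where the account name "quickbuild" is only a placeholder. In a case like the one described above it would be expected to return the handful of root-owned directories.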

This tells me that subdirectories used for builds in the build artifacts directory aren't always just going up sequentially, nor are they seemingly dedicated to each configuration that can run. It seems to me that these top level b### directories are rotated through for configurations to use for new builds. Can you explain a little of how that works for my own knowledge?

robinshen ADMIN ·

Thanks for the info. The various bxxx directories are created so that we do not put too many build directories inside a single "builds" directory. The digits in bxxx are taken from the last three digits of the build id, so builds from different configurations might be put into the same bxxx directory.
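As an illustration of that mapping (a sketch based only on the description above and the paths seen in this thread, not QuickBuild's actual code; the function names are hypothetical and the zero-padding for ids below 100 is an assumption):

```python
import posixpath

def bucket_for(build_id):
    """Return the bxxx directory name built from the last three digits of the build id."""
    # Zero-padding for small ids is assumed; the thread only shows larger ids.
    return "b%03d" % (build_id % 1000)

def build_dir(storage_root, build_id):
    """Return the assumed publish directory for a build under the global storage directory."""
    return posixpath.join(storage_root, "builds", bucket_for(build_id), str(build_id))
```

With this scheme, build ids 2658 and 298658 both map to b658 regardless of which configuration they belong to, so a single bxxx directory with wrong ownership affects roughly one build in every thousand.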