Hi,
I have agents running as non-root users on many Unix and Linux flavors. The one exception is 64-bit Solaris 10 on x86, running as a Solaris zone: there the agent starts perfectly as root, but not as a non-root user.
I have switched on DEBUG logging, nothing there. It is stuck while triggering the wrapper.jar. Last messages:
jvm 1 | WrapperManager Debug: Implementation Version: 3.5.12
jvm 1 | WrapperManager Debug: Is Sealed?: False
jvm 1 | WrapperManager Debug: org.tanukisoftware.wrapper.WrapperManager protection domain:
jvm 1 | WrapperManager Debug: Location: file:/opt/build/buildagent.1/plugins/com.pmease.quickbuild.bootstrap/lib/wrapper.jar
And here it will wait for eternity.
I have listed the jar (logged in as the desired user), nothing special as far as I can tell.
Anybody?
Thanks in advance,
Boaz Keynan
Can you please make sure the user has full permission over all files and directories recursively?
Thanks for the reply.
I know it smells of a permission problem. I have changed the owner and mode recursively for the agent directory more than once between trials, to no avail. More than that, it happened on two separate Solaris 10 x86 machines (but as I said, it works well on more than ten other systems, including AIX, HP-UX, various Linuxes and Solaris on SPARC).
Is there a way to get more detailed debug information? Could the cause be that it's a zone and not a standalone machine?
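For anyone who wants to double-check the recursive ownership and mode before ruling permissions out, here is a minimal sketch. The function names, and the directory and user passed to them, are hypothetical and not part of QuickBuild. It assumes a POSIX `find`.

```shell
#!/bin/sh
# Hypothetical helpers: list anything under a directory that the
# agent user does not own, or owns but cannot write to.

list_foreign_files() {
    # $1 = directory to scan, $2 = user name or numeric UID expected to own everything
    find "$1" ! -user "$2" -print
}

list_unwritable() {
    # $1 = directory to scan, $2 = owner; prints entries lacking the owner write bit
    find "$1" -user "$2" ! -perm -200 -print
}

# Example call (path and user are placeholders):
#   list_foreign_files /path/to/buildagent builduser
#   list_unwritable    /path/to/buildagent builduser
```

If both calls print nothing, ownership and the owner write bit are consistent across the tree, which points away from a plain file permission problem.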
Please edit "conf/wrapper.conf" to point property "wrapper.java.command" to desired version of JDK, for instance, /path/to/yourjdk/bin/java, then start again to see if it works.
This was done to begin with. As I mentioned earlier, the agent's configuration works well as long as the user is root. Also, I have set "wrapper.java.command.loglevel=INFO", but it does not reveal anything interesting. Can you suggest any setup change to the wrapper or JDK to get more information?
Please uncomment the line below in wrapper.conf to get more information:
# wrapper.debug=TRUE
Also please let me know the QB server version.
My server version is 4.0.41. I had switched on wrapper.debug before posting this issue, please see above. Anyhow, I'll try to work around this issue for the time being by setting my UID from within the application script.
BTW, there's a (less critical) bug with refreshing the "Latest build" column on the dashboard.
Thanks,
Boaz
This is odd. If you are able to schedule a WebEx or GotoMeeting session, I'd like to check this issue with you online.
After playing around, I can define the situation better.
1) I have to be root when starting the agent (which is obviously the case when starting at boot time).
2) If RUN_AS_USER is non-root, the agent is detected by the server only in rare cases, which I can't reproduce. In most cases it is not detected. This means that RUN_AS_USER must be commented out, so I am still root. As I stated above, this is true only for Solaris x86 (a Solaris zone in my case).
3) My build works via Buildr and Ant, and is triggered by a shell script. I setuid at the beginning of my shell script, and it seems to work OK so far.
Thanks,
Boaz
Thanks for the updated info. Does it work if you start QB agent directly through "./agent.sh console" as non-root instead of starting it as service?
Sun changed the way process privileges are handled in Solaris 10 (including the privilege to change UID). Is that covered in the QB 4 agent and the included Tanuki wrapper? My build on Solaris 9 (SPARC) works well, and the one that fails to suid is on Solaris 10 (x86). So maybe my problem is the version of Solaris and not the machine architecture.
The "RUN_AS_USER" setting should only take effect if you run QB as system service. If run in console mode, all processes will be executed using the same user launching command "agent.sh console". I will contact JSW support to see if they have some idea.
OK, please let me know if you have any news.
Meanwhile, I worked around this problem by doing my own "RUNNING_AS" via su. This is an ugly workaround (another level of envelope around the build), so I shall be happy to replace it with a proper solution if and when you have one.
On the way to this workaround, I hit another unpleasant "feature". I intended to trigger my build script via su -c '...' straight from the QB configuration. Alas, my build command is long (many parameters) and the quoted su command broke into multiple lines in the configuration box. QuickBuild does not know how to handle a quoted string across multiple lines (at least in this kind of box). If I am wrong, please make me wise.
The result was triggering an envelope script which runs as root (no need for quoted string now) and this one triggers the su command. Very ugly.
Thanks,
Boaz
I have not received a response from JSW yet. How about putting the long command in a shell script file, and calling that script file from QB?
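As a rough illustration of that suggestion, the sketch below builds the single quoted string that `su -c` expects from an ordinary argument list, so the long command never has to be typed as one quoted line in the QB configuration box. The function names, the build script path and the user name are all hypothetical. It assumes a POSIX shell; on Solaris 10 the legacy /bin/sh does not support `$(...)`, so /usr/xpg4/bin/sh or ksh would be needed there.

```shell
#!/bin/sh
# Hypothetical envelope helpers for running a long build command
# as another user through su -c.

quote_arg() {
    # Wrap one argument in single quotes, escaping embedded single quotes
    printf "'%s'" "$(printf '%s\n' "$1" | sed "s/'/'\\\\''/g")"
}

build_command() {
    # $1 = script to run, remaining arguments = its parameters
    script=$1
    shift
    printf '%s' "$(quote_arg "$script")"
    for arg in "$@"; do
        printf ' %s' "$(quote_arg "$arg")"
    done
}

# Example call as root (user and paths are placeholders):
#   su - builduser -c "$(build_command /path/to/build.sh clean "target with spaces")"
```

QB would then only need to call the envelope script with its parameters, and the quoting is handled in one place.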
Please see response from JSW support:
I have read through the thread in your forum. I have never heard of such a problem before, and I haven't seen anything similar on our Solaris test machines.
It seems more likely that something in his JVM is not working correctly. The JVM starts and then for some reason it blocks(?). When it blocks, is he able to trigger a thread dump? He can do so by pressing [Ctrl] + [\] when he runs the application in console mode.
Also, the place where he said it starts to hang is not called unless debug mode is enabled. Anyway, the next thing the Wrapper should print after this is the MD5 checksum of the wrapper.jar file.
What JVM version is he using?
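When the agent runs as a service there is no console to press keys in. A HotSpot JVM also prints a thread dump when it receives SIGQUIT, so the same dump can be requested with `kill`. A minimal sketch follows; the function name is hypothetical, and finding the PID of the hung java process (for example with `ps`) is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper: ask a running JVM for a thread dump via SIGQUIT.
# The dump goes to the JVM's standard output, which the Wrapper logs.

request_thread_dump() {
    # $1 = PID of the java process (not the wrapper process)
    pid=$1
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process (or no permission to signal it): $pid" >&2
        return 1
    fi
    kill -QUIT "$pid"
}

# Example call (PID is a placeholder):
#   request_thread_dump 12345
```

The signal has to be sent by root or by the user that owns the JVM process.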
Thanks, that is what I called a higher level of envelope: I call a script that does su with a -c parameter, which is the quoted build call. It works, but I would prefer a more elegant solution.
1) The wrapper was obviously in debug mode when I copied the log messages for you, but it behaves exactly the same in normal production mode.
2) I run Quickbuild's agent with JDK 1.5.0.22, but I tried it also with 1.6.0.11, same behavior.
I shall run it again as non-root in console mode and try to get a Java stack trace for JSW.
Thanks, Boaz
I tried to reproduce the problem to get the stack trace, and it worked perfectly. Service as root, console as root, service as the user configured in agent.sh: all worked perfectly. Nothing was changed in the system, not even a reboot. No configuration changes in QB.
This implies an intermittent problem. In the past, it worked once or twice and then started to fail consistently. Now it is back in good health. Since I have my workaround in place, I am closing this thread for the time being.
I may have one problem with the workaround. It appears to cause loss of output line on AIX 5.3 agents (only on those). At least, I noticed this behavior only after implementing the workaround; maybe it happened in the past too. Any ideas on this?
Thanks,
Boaz
So do you mean that it works now as non-root user even without your workaround? Does the output line still get lost without the "su" workaround?