Your browser was unable to load all of the resources. They may have been blocked by your firewall, proxy or browser configuration.
Press Ctrl+F5 or Ctrl+Shift+R to have your browser try again.

Unable to find job #3543

goydex ·
We are hitting a re-occurring issue now with a particular configuration. From the log can you see why this error is being frequently displayed:

We are running 6.1.17.


2016-06-06 12:53:59,275 [pool-1-thread-9] INFO com.pmease.quickbuild.DefaultBuildEngine - Processing build request (configuration:<removed>/CI/<removed>, request id:821b276a-6638-4f2e-8be9-7a0a872d09f2)
2016-06-06 12:53:59,280 [pool-1-thread-9] INFO com.pmease.quickbuild.DefaultBuildEngine - Checking build condition on node (address: <removed>-AGENT06:8811, ip: 10.64.105.142)...
2016-06-06 12:54:10,382 [pool-1-thread-42] ERROR com.pmease.quickbuild.grid.GridTaskFuture - Unable to find job (job class: com.pmease.quickbuild.CheckConditionJob, job id: 02dd2911-0fbe-4c18-9662-81783b25dd7c, node: <removed>-AGENT02:8811)
2016-06-06 12:54:10,487 [pool-1-thread-42] ERROR com.pmease.quickbuild.DefaultBuildEngine - Error processing build request.
java.lang.RuntimeException: Error executing check condition job.
at com.pmease.quickbuild.CheckConditionTask.reduce(CheckConditionTask.java:39)
at com.pmease.quickbuild.CheckConditionTask.reduce(CheckConditionTask.java:16)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:102)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:106)
at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:332)
at com.pmease.quickbuild.DefaultBuildEngine.access$000(DefaultBuildEngine.java:139)
at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:1143)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.pmease.quickbuild.QuickbuildException: Unable to find job '02dd2911-0fbe-4c18-9662-81783b25dd7c' on node '<removed>-AGENT02:8811'.
at com.pmease.quickbuild.grid.GridTaskFuture.testJobs(GridTaskFuture.java:42)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:84)
... 7 more
2016-06-06 12:54:10,491 [pool-1-thread-42] INFO com.pmease.quickbuild.DefaultBuildEngine - Processing build request (configuration:<removed>/CI/<removed>, request id:02e486b3-63fe-4584-88b0-35544f1a46b7)
  • replies 17
  • views 7533
  • stars 0
goydex ·
From the agent console log:

jvm 1    | 2016-06-06 12:39:13,961 ERROR - Unable to find job '4d62b02f-d4b1-4d0d-976b-785e77279f53' on node 'COREVAL-AGENT02:8811' (Job is ever started: true).
jvm 1 | 2016-06-06 12:39:33,441 INFO - Taking repository snapshots...
jvm 1 | 2016-06-06 12:39:33,448 INFO - Taking snapshot of repository 'SCMPerforce'...
jvm 1 | 2016-06-06 12:39:33,448 INFO - Determining head revision for repository: SCMPerforce
jvm 1 | 2016-06-06 12:39:33,449 INFO - Updating client spec of 'qb-COREVAL-AGENT02-CI-TOPFW'...
jvm 1 | 2016-06-06 12:39:33,911 INFO - Evaluating build condition...
jvm 1 | 2016-06-06 12:39:33,929 INFO - Getting changes of 'SCMPerforce' since build 'CI-TOPFramework_1.0.128'...
jvm 1 | 2016-06-06 12:39:36,548 WARN - There are too many changes found in repository 'SCMPerforce', truncating to save resource...
jvm 1 | 2016-06-06 12:39:36,748 ERROR - Error notifying task node of job finishing (job class: 4c08dd46-397a-46a6-bfe6-9ff3976ad8c7, job id: com.pmease.quickbuild.CheckConditionJob, task node: coreval-live2:8810)
jvm 1 | com.caucho.hessian.client.HessianConnectionException: 413: java.io.IOException: Server returned HTTP response code: 413 for URL: https://coreval-build.<removed>.com/service/node
jvm 1 | at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:165)
jvm 1 | at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:300)
jvm 1 | at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
jvm 1 | at com.sun.proxy.$Proxy49.gridJobFinished(Unknown Source)
jvm 1 | at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:126)
jvm 1 | at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
jvm 1 | at java.util.concurrent.FutureTask.run(Unknown Source)
jvm 1 | at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
jvm 1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
jvm 1 | at java.lang.Thread.run(Unknown Source)
jvm 1 | Caused by: java.io.IOException: Server returned HTTP response code: 413 for URL: https://coreval-build.<removed>.com/service/node
jvm 1 | at sun.reflect.GeneratedConstructorAccessor11.newInstance(Unknown Source)
jvm 1 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
jvm 1 | at java.lang.reflect.Constructor.newInstance(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection$10.run(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection$10.run(Unknown Source)
jvm 1 | at java.security.AccessController.doPrivileged(Native Method)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getChainedException(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
jvm 1 | at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
jvm 1 | at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:145)
jvm 1 | ... 9 more
jvm 1 | Caused by: java.io.IOException: Server returned HTTP response code: 413 for URL: https://coreval-build.<removed>.com/service/node
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
jvm 1 | at java.net.HttpURLConnection.getResponseCode(Unknown Source)
jvm 1 | at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(Unknown Source)
jvm 1 | at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:132)
jvm 1 | ... 9 more
robinshen ADMIN ·
This might happen if some agent running this configuration is changing its ip address during the build. Please check if this is your case.
goydex ·
[quote="robinshen"]This might happen if some agent running this configuration is changing its ip address during the build. Please check if this is your case.[/quote]

We have checked the console and server log, the IP address of the machines are not changing. The machine IP's are guaranteed to remain the same by our IT folks. If it helps, all nodes are VM's.
robinshen ADMIN ·
Or is that agent is being restarted during build?
goydex ·
[quote="robinshen"]Or is that agent is being restarted during build?[/quote]

The agents are not being rebooted. They are VM's so they might be being migrated, but that should not result in any QoS issues.

What additional logs can we generate to help solve this issue?
goydex ·
Why would we be getting HTTP 413 errors from the build nodes?
robinshen ADMIN ·
Sorry I missed that error code. This means that the http request is too large to be handled. Do you have build variables holding too long data in the failing configuration, for instance, reading a file content into some build variable? If so, please avoid doing that as build variables are meant to hold small piece of information controlling the build process.
goydex ·
[quote="robinshen"]Sorry I missed that error code. This means that the http request is too large to be handled. Do you have build variables holding too long data in the failing configuration, for instance, reading a file content into some build variable? If so, please avoid doing that as build variables are meant to hold small piece of information controlling the build process.[/quote]

The longest variables is the workspace client spec -- we didn't see any very much larger.

In the mean time, I copied the configs to a new config and they have not shown the (!) or the same error over night. Several builds have occurred. What debug data does this provide for you?
goydex ·
New data!! I believe this shows the cause but I need your help in understanding it.

From the server log:

2016-06-09 10:04:22,024 [pool-1-thread-1107] ERROR com.pmease.quickbuild.grid.GridTaskFuture - Unable to find job (job class: com.pmease.quickbuild.CheckConditionJob, job id: 67edc8bf-153e-4581-aa41-023388135670, node: COREVAL-AGENT06:8811)
2016-06-09 10:04:22,128 [pool-1-thread-1107] ERROR com.pmease.quickbuild.DefaultBuildEngine - Error processing build request.
java.lang.RuntimeException: Error executing check condition job.
at com.pmease.quickbuild.CheckConditionTask.reduce(CheckConditionTask.java:39)
at com.pmease.quickbuild.CheckConditionTask.reduce(CheckConditionTask.java:16)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:102)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:106)
at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:332)
at com.pmease.quickbuild.DefaultBuildEngine.access$000(DefaultBuildEngine.java:139)
at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:1143)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.pmease.quickbuild.QuickbuildException: Unable to find job '67edc8bf-153e-4581-aa41-023388135670' on node 'COREVAL-AGENT06:8811'.
at com.pmease.quickbuild.grid.GridTaskFuture.testJobs(GridTaskFuture.java:42)
at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:84)
... 7 more


We look on the agent, and the corresponding snippet in the console log is:

jvm 1    | 2016-06-09 10:03:52,705 INFO  - Taking repository snapshots...
jvm 1 | 2016-06-09 10:03:52,708 INFO - Taking snapshot of repository 'SCMPerforce'...
jvm 1 | 2016-06-09 10:03:52,708 INFO - Determining head revision for repository: SCMPerforce
jvm 1 | 2016-06-09 10:03:52,709 INFO - Updating client spec of 'qb-COREVAL-AGENT06-CI-TOPFW'...
jvm 1 | 2016-06-09 10:03:53,196 INFO - Evaluating build condition...
jvm 1 | 2016-06-09 10:03:53,214 INFO - Getting changes of 'SCMPerforce' since build 'CI-TOPFramework_1.0.128'...
jvm 1 | 2016-06-09 10:03:56,174 WARN - There are too many changes found in repository 'SCMPerforce', truncating to save resource...
jvm 1 | 2016-06-09 10:03:56,224 ERROR - Error notifying task node of job finishing (job class: 67edc8bf-153e-4581-aa41-023388135670, job id: com.pmease.quickbuild.CheckConditionJob, task node: coreval-live2:8810)
jvm 1 | com.caucho.hessian.client.HessianConnectionException: 413: java.io.IOException: Server returned HTTP response code: 413 for URL: http://coreval-build.<removed>.com/service/node
jvm 1 | at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:165)
jvm 1 | at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:300)
jvm 1 | at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
jvm 1 | at com.sun.proxy.$Proxy50.gridJobFinished(Unknown Source)
jvm 1 | at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:126)
jvm 1 | at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
jvm 1 | at java.util.concurrent.FutureTask.run(Unknown Source)
jvm 1 | at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
jvm 1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
jvm 1 | at java.lang.Thread.run(Unknown Source)
jvm 1 | Caused by: java.io.IOException: Server returned HTTP response code: 413 for URL: http://coreval-build.<removed>.com/service/node
jvm 1 | at sun.reflect.GeneratedConstructorAccessor132.newInstance(Unknown Source)
jvm 1 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
jvm 1 | at java.lang.reflect.Constructor.newInstance(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection$10.run(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection$10.run(Unknown Source)
jvm 1 | at java.security.AccessController.doPrivileged(Native Method)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getChainedException(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
jvm 1 | at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:145)
jvm 1 | ... 9 more
jvm 1 | Caused by: java.io.IOException: Server returned HTTP response code: 413 for URL: http://coreval-build.<removed>.com/service/node
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
jvm 1 | at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
jvm 1 | at java.net.HttpURLConnection.getResponseCode(Unknown Source)
jvm 1 | at com.caucho.hessian.client.HessianURLConnection.sendRequest(HessianURLConnection.java:132)
jvm 1 | ... 9 more
jvm 1 | 2016-06-09 10:04:22,777 ERROR - Unable to find job '67edc8bf-153e-4581-aa41-023388135670' on node 'COREVAL-AGENT06:8811' (Job is ever started: true).


So it seems job id: 67edc8bf-153e-4581-aa41-023388135670 is getting "lost" as it is unable to complete a POST request to the server, crashes internally, then the server can no longer refer to that job.

I looked at the nginx access logs for the HTTP ERR 413 events see the following troubling entries:

2016/06/06 10:21:47 [error] 22435#0: *32760160 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:21:58 [error] 22437#0: *32760471 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:21:58 [error] 22437#0: *32760157 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:22:28 [error] 22437#0: *32760308 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:22:28 [error] 22435#0: *32760775 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:22:47 [error] 22435#0: *32760629 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:22:59 [error] 22435#0: *32760620 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:22:59 [error] 22435#0: *32759501 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.104.146, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:23:28 [error] 22435#0: *32760783 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:23:28 [error] 22438#0: *32761135 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:23:47 [error] 22437#0: *32760985 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:23:58 [error] 22435#0: *32760910 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:23:59 [error] 22438#0: *32761272 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:24:29 [error] 22438#0: *32761144 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:24:29 [error] 22437#0: *32761486 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:24:47 [error] 22438#0: *32761334 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:24:59 [error] 22438#0: *32761331 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:24:59 [error] 22438#0: *32761578 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:25:29 [error] 22437#0: *32761488 client intended to send too large body: 1537286 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:25:29 [error] 22435#0: *32761852 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:25:47 [error] 22437#0: *32760978 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.104.146, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:25:59 [error] 22437#0: *32762080 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:26:00 [error] 22437#0: *32761712 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:26:29 [error] 22437#0: *32761708 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:26:30 [error] 22435#0: *32761856 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:26:47 [error] 22435#0: *32762106 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:27:00 [error] 22437#0: *32762395 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:27:00 [error] 22437#0: *32762027 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.104.146, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:27:30 [error] 22437#0: *32762265 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:27:30 [error] 22437#0: *32762257 client intended to send too large body: 1537286 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:27:47 [error] 22436#0: *32762483 client intended to send too large body: 1534555 bytes, client: <removed>.<removed>.104.146, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:28:00 [error] 22436#0: *32762481 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:28:00 [error] 22437#0: *32762640 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:28:30 [error] 22437#0: *32762632 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:28:30 [error] 22437#0: *32762871 client intended to send too large body: 1537286 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:28:47 [error] 22437#0: *32762875 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:29:00 [error] 22436#0: *32763032 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:29:01 [error] 22436#0: *32763158 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:29:31 [error] 22438#0: *32763232 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:29:31 [error] 22436#0: *32763028 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:29:47 [error] 22438#0: *32763237 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:30:01 [error] 22437#0: *32763388 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:30:01 [error] 22438#0: *32763509 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:30:31 [error] 22437#0: *32763384 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:30:31 [error] 22438#0: *32763588 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:30:47 [error] 22438#0: *32763592 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:31:01 [error] 22435#0: *32763926 client intended to send too large body: 1537286 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:31:01 [error] 22436#0: *32763744 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:31:31 [error] 22435#0: *32764048 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:31:32 [error] 22437#0: *32764017 client intended to send too large body: 1537286 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:31:48 [error] 22437#0: *32764021 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:32:01 [error] 22437#0: *32762772 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.146, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:32:02 [error] 22437#0: *32764319 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:32:32 [error] 22435#0: *32764181 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:32:32 [error] 22438#0: *32764566 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:32:47 [error] 22435#0: *32764411 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:33:02 [error] 22435#0: *32764176 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:33:02 [error] 22435#0: *32764409 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.104.146, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:33:32 [error] 22436#0: *32764574 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:33:33 [error] 22438#0: *32764939 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:33:50 [error] 22436#0: *32764639 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:34:03 [error] 22438#0: *32765089 client intended to send too large body: 1520095 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:34:03 [error] 22436#0: *32764780 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.104.43, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:34:32 [error] 22438#0: *32765297 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:34:33 [error] 22438#0: *32764944 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:34:47 [error] 22437#0: *32764786 client intended to send too large body: 1534564 bytes, client: <removed>.<removed>.104.146, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:35:03 [error] 22437#0: *32765498 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:35:03 [error] 22438#0: *32765172 client intended to send too large body: 1537295 bytes, client: <removed>.<removed>.105.142, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"
2016/06/06 10:35:33 [error] 22438#0: *32765324 client intended to send too large body: 1520104 bytes, client: <removed>.<removed>.105.175, server: coreval-build.<removed>.com, request: "POST /service/node HTTP/1.1", host: "coreval-build.<removed>.com"


The agents are trying to push between 1.5 - 2.7 MB of data over HTTP.

As a stop-gap I have increased the max body size in nginx to 200MB.

I need help understanding this issues --

1. Is this HTTP data size expected?
2. How can we dig deeper into the HTTP request data being sent?
3. Is the line about "There are too many changes found in repository 'SCMPerforce', truncating to save resource..." an issue? Is it trying to download/transfer the entire p4 changelist information since the dawn of man on each access?
robinshen ADMIN ·
There is no easy way to show the request data unless you run it through some network package sniffer. The SCM changes data will also be sent via request, and a too much SCM changes may cause the issue. Do you mean that the configuration works if you increase nginx request data size?
goydex ·
[quote="robinshen"]There is no easy way to show the request data unless you run it through some network package sniffer. The SCM changes data will also be sent via request, and a too much SCM changes may cause the issue. Do you mean that the configuration works if you increase nginx request data size?[/quote]

Yes the configuration works if we increase the client_max_body_size option to something large, and the HTTP 413 error stop in the log.

Furthermore, we notice that the scm_report.h2.db file is 5MB large in the failing configs, but in new configs the file is 148KB initially. Would this large size have some bearing on the problem?
robinshen ADMIN ·
Yes definitely as QB has to pass around the SCM changes between different stages of the build via http protocol.
goydex ·
[quote="robinshen"]Yes definitely as QB has to pass around the SCM changes between different stages of the build via http protocol.[/quote]

Then is the large size of the database file an issue? Or is there a configuration change we can make to reduce the amount of SCM change data being passed around during a build?
robinshen ADMIN ·
QB limits the change sets to 20000, but even that size can exceed the http header to send. In this case, just configure client_max_body_size to be a large size as that seems to be able to solve the issue, or do you have any concern with this approach?
goydex ·
[quote="robinshen"]QB limits the change sets to 20000, but even that size can exceed the http header to send. In this case, just configure client_max_body_size to be a large size as that seems to be able to solve the issue, or do you have any concern with this approach?[/quote]

I can do this config. I've done it and the issue is resolved. My only concern would be this problem reoccurring. i.e. this HTTP data growing over time. If there a way to measure this performance over time to keep my finger on it? Some analytics plugin?
robinshen ADMIN ·
Sorry we do not have that plugin. Since the problem is caused by too many SCM changes, and we already set a limit on that. It should not increase over time.
goydex ·
[quote="robinshen"]Sorry we do not have that plugin. Since the problem is caused by too many SCM changes, and we already set a limit on that. It should not increase over time.[/quote]

Thanks. I suppose an issue is partially caused by internal process of putting much more verbose information into the changelist description.