
should run a deployment to completion and then scale to zero [Conformance] #10002

Closed
0xmichalis opened this issue Jul 23, 2016 · 8 comments

Labels: area/tests, component/apps, kind/test-flake, priority/P1

Comments

0xmichalis (Contributor) commented Jul 23, 2016

Because of the latest changes in deployment logs:

STEP: verifying the scale is updated on the deployment config
STEP: deploying a few more times
Jul 22 22:07:25.246: INFO: Running 'oc deploy --namespace=extended-test-cli-deployment-f3mzp-9tfjf --config=/tmp/openshift/extended-test-cli-deployment-f3mzp-9tfjf-user.kubeconfig --latest --follow deployment-test'
Jul 22 22:07:45.721: INFO: Error running &{/data/src/github.com/openshift/origin/_output/local/bin/linux/amd64/oc [oc deploy --namespace=extended-test-cli-deployment-f3mzp-9tfjf --config=/tmp/openshift/extended-test-cli-deployment-f3mzp-9tfjf-user.kubeconfig --latest --follow deployment-test] []   Started deployment #2
Error from server: The get operation against ReplicationController could not be completed at this time, please try again.
 Started deployment #2
Error from server: The get operation against ReplicationController could not be completed at this time, please try again.
 [] <nil> 0xc820d49f20 exit status 1 <nil> true [0xc82002f208 0xc82002f230 0xc82002f230] [0xc82002f208 0xc82002f230] [0xc82002f210 0xc82002f228] [0xa7ed60 0xa7eec0] 0xc8202c1080}:
Started deployment #2
Error from server: The get operation against ReplicationController could not be completed at this time, please try again.
• Failure [63.861 seconds]
deploymentconfigs
/data/src/github.com/openshift/origin/test/extended/deployments/deployments.go:598
  with test deployments
  /data/src/github.com/openshift/origin/test/extended/deployments/deployments.go:252
    should run a deployment to completion and then scale to zero [Conformance] [It]
    /data/src/github.com/openshift/origin/test/extended/deployments/deployments.go:251

    Expected error:
        <*exec.ExitError | 0xc820d49f80>: {
            ProcessState: {
                pid: 4354,
                status: 256,
                rusage: {
                    Utime: {Sec: 0, Usec: 120189},
                    Stime: {Sec: 0, Usec: 20722},
                    Maxrss: 28188,
                    Ixrss: 0,
                    Idrss: 0,
                    Isrss: 0,
                    Minflt: 6714,
                    Majflt: 0,
                    Nswap: 0,
                    Inblock: 0,
                    Oublock: 0,
                    Msgsnd: 0,
                    Msgrcv: 0,
                    Nsignals: 0,
                    Nvcsw: 2160,
                    Nivcsw: 15,
                },
            },
            Stderr: nil,
        }
        exit status 1
    not to have occurred

    /data/src/github.com/openshift/origin/test/extended/deployments/deployments.go:240
------------------------------
smarterclayton (Contributor) commented

What does "The get operation could not be completed at this time" mean? Is this the max rate limiter on the server? Do we have something hot looping?

liggitt (Contributor) commented Aug 3, 2016

no, I think it's https://github.com/openshift/origin/blob/master/pkg/deploy/registry/deploylog/rest.go#L154-L159

latest, ok, err := registry.WaitForRunningDeployment(r.rn, target, r.timeout)
if err != nil {
    return nil, errors.NewBadRequest(fmt.Sprintf("unable to wait for deployment %s to run: %v", deployutil.LabelForDeployment(target), err))
}
if !ok {
    return nil, errors.NewServerTimeout(kapi.Resource("ReplicationController"), "get", 2)
}
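For context, the `NewServerTimeout` call above is what produces the exact message seen in the test log. A minimal sketch (a hypothetical reconstruction of how the message is rendered, not the actual apimachinery source; `serverTimeoutMessage` is an illustrative helper):

```go
package main

import "fmt"

// serverTimeoutMessage approximates how a ServerTimeout status error's
// message is rendered for a resource and operation. Hypothetical helper;
// the output matches the log line quoted earlier in this issue.
func serverTimeoutMessage(resource, operation string) string {
	return fmt.Sprintf(
		"The %s operation against %s could not be completed at this time, please try again.",
		operation, resource)
}

func main() {
	// Arguments mirror the snippet above:
	// kapi.Resource("ReplicationController") and "get"
	fmt.Println(serverTimeoutMessage("ReplicationController", "get"))
}
```

Running this prints the same "The get operation against ReplicationController could not be completed at this time, please try again." string that `oc deploy --follow` surfaced in the failure output.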

0xmichalis (Contributor, Author) commented

The client will retry on ServerTimeout.
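The kind of client-side behavior being described can be sketched as a retry loop that backs off when the error is a server timeout and gives up after a bounded number of attempts. This is an illustrative sketch only (the sentinel error and `isServerTimeout`/`getWithRetry` helpers are hypothetical stand-ins, not the actual client code, which inspects the status reason on the returned error):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errServerTimeout stands in for the ServerTimeout status error discussed
// above (hypothetical sentinel; the real client checks the error's reason).
var errServerTimeout = errors.New(
	"The get operation against ReplicationController could not be completed at this time, please try again.")

// isServerTimeout is a stand-in for the real status-reason check.
func isServerTimeout(err error) bool {
	return errors.Is(err, errServerTimeout)
}

// getWithRetry calls fn, retrying up to maxRetries extra times when it
// fails with a server-timeout error, sleeping briefly between attempts.
func getWithRetry(fn func() error, maxRetries int) error {
	var err error
	for i := 0; i <= maxRetries; i++ {
		if err = fn(); err == nil || !isServerTimeout(err) {
			return err
		}
		// A real client would honor a server-suggested retry delay here.
		time.Sleep(10 * time.Millisecond)
	}
	return err
}

func main() {
	attempts := 0
	err := getWithRetry(func() error {
		attempts++
		if attempts < 3 {
			return errServerTimeout // fail twice, then succeed
		}
		return nil
	}, 5)
	fmt.Println(attempts, err)
}
```

The point of contention in the thread is exactly this loop: if the retry happens, the intermediate timeout message should not leak into the command's output the way it did in the log above.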

smarterclayton (Contributor) commented

Well, the message shouldn't be written to output, so this looks like a minor bug there. If this is truly "timeout forever" then this flake is now "container did not start"

smarterclayton (Contributor) commented

Raising priority because this is now showing up fairly frequently in PRs.

smarterclayton (Contributor) commented

#10176

0xmichalis (Contributor, Author) commented

> Well, the message shouldn't be written to output, so this looks like a minor bug there. If this is truly "timeout forever" then this flake is now "container did not start"

Is anybody familiar with the place in the client where server timeouts are retried? I couldn't find it. This is most probably a "deployer container did not start" error, but I am not sure at which point we should intercept the message. All old test logs are dead, so it would be really helpful to post the failure trap output the next time this occurs.

smarterclayton (Contributor) commented Aug 15, 2016

request.go

