
Port forwarding not working when pods are recreated #2408

Closed
cheung-chifung opened this issue Jul 4, 2019 · 32 comments
Assignees
Labels
area/portforward kind/bug Something isn't working priority/p1 High impact feature/bug.

Comments

@cheung-chifung commented Jul 4, 2019

I'm making a simple Go app and trying to use Skaffold for local development with skaffold dev --port-forward.

However, it only works the first time, then fails to open the ports again when files change and pods are recreated. I even tried v0.33.0 and the latest version 2f2a6f4; neither works.

I checked issues #1594 and #1815, tried v0.29.0, and it works as I expected.

I noticed that with v0.33.0, every time I change files, skaffold starts a new pod and then exits (deletes?) the old one, whereas with v0.29.0 only the new one starts and there is no exit log. So my guess is that the old pod somehow blocks port-forward in versions after v0.29.0.

Expected behavior

Recreate port forwarding correctly just like v0.29.0 did.

Actual behavior

Port forwarding is only successful the first time.

Information

  • Skaffold version: v0.33.0
  • Operating system: MacOSX 10.14.5
  • Contents of skaffold.yaml:
apiVersion: skaffold/v1beta10
kind: Config
build:
  artifacts:
    - image: api
      context: ./api/
deploy:
  kubectl:
    manifests:
      - ./manifests/*
profiles:
  - name: minikube
    activation:
      - kubeContext: minikube
        command: dev

Steps to reproduce the behavior

  1. Use minikube and v0.33.0; check out a minimal app with a service and deployment only.
  2. Run skaffold dev --port-forward; port-forward works the first time.
  3. Change the source code to trigger a pod recreate.
  4. From the logs, you can see skaffold start a new pod with a different name from the old one; the old pod is deleted after the new one starts listening on the port.
  5. Stop skaffold and rerun the same command skaffold dev --port-forward; it works again.
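For anyone trying to reproduce this, a minimal manifest pair along these lines should do. The names, labels, and port are hypothetical (chosen to match the image: api artifact in the skaffold.yaml above), not taken from the reporter's actual manifests:

```yaml
# manifests/api.yaml — a sketch of a "service and deployment only" app
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: api          # rewritten by skaffold to the freshly built tag
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
    - port: 8080
```

With skaffold dev --port-forward, the service above is what gets forwarded; each file change replaces the pod behind it, which is where the forward breaks.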
@dgageot added the area/portforward and kind/bug labels Jul 5, 2019
@cmoad commented Jul 8, 2019

I'm seeing this exact issue after upgrading from 0.32.0 to 0.33.0.

Skaffold command:

skaffold dev --cleanup=false --no-prune --port-forward --toot

@priyawadhwa (Contributor)

Hey @cmoad -- so I merged #2215 recently which changed the functionality of port-forward for skaffold dev. Skaffold should now only automatically port forward services upon skaffold dev (previously it was pods). If you need to port forward pods as well, you can define them specifically in the skaffold config, as described here. Does that solve your issue?
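For reference, the pod-level forwarding described above is declared with a portForward stanza in skaffold.yaml. A sketch, with made-up resource name and ports (the real field names are resourceType, resourceName, port, and localPort):

```yaml
portForward:
  - resourceType: pod
    resourceName: api      # hypothetical pod name
    port: 8080             # container port
    localPort: 8080        # port opened on the local machine
```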

@japsen commented Jul 9, 2019

For me it looks like the port-forwarding is initiated while the pods linked to the service are still "terminating". Once the new pod comes up, the port-forward command that is already running is not updated to point to the new pod.

Maybe this could be solved by re-initiating the port-forward when kubectl port-forward gives "No such container" or "Timeout occurred" errors.
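That retry idea can be sketched as a small wrapper that reruns the forward command whenever it exits. This is a hypothetical helper (forward_with_retry is not part of skaffold or kubectl), just to illustrate the suggestion:

```shell
#!/usr/bin/env bash
# Sketch: rerun a port-forward command whenever it exits
# (e.g. on "No such container" / "Timeout occurred"), giving up
# after a fixed number of attempts.
forward_with_retry() {
  local cmd=$1 max=${2:-10} tries=0
  until $cmd; do
    tries=$((tries + 1))
    if [ "$tries" -ge "$max" ]; then
      echo "giving up after $max attempts" >&2
      return 1
    fi
    sleep 1
  done
}

# Example (service name taken from a comment below in this thread):
# forward_with_retry "kubectl port-forward service/productname 7000:7000" 10
```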

@priyawadhwa (Contributor)

Hey @japsen so we currently poll kubectl port-forward for one minute, waiting for it to connect successfully. How long does it typically take for your service to connect to a new pod?

@priyawadhwa self-assigned this Jul 9, 2019
@japsen commented Jul 10, 2019

@priyawadhwa In my case I have a simple k8s setup consisting of 4 deployments (redis, elastic, postgres, our product), all with 1 replica, each with a Service (ClusterIP) that exposes the ports.
I'm using a local useDockerCli build; after pushing to the repo, it almost instantly shows the following.
It looks like "Waiting for deployments to stabilize" doesn't do anything.

Starting deploy...
kubectl client version: 1.12+
deployment.extensions/elasticsearch created
service/elasticsearch created
deployment.extensions/productname created
service/productname created
deployment.extensions/postgres created
service/postgres created
deployment.extensions/redis created
service/redis created
Deploy complete in 1.461292015s
Waiting for deployments to stabilize
Watching for changes every 1s...
Port Forwarding service/elasticsearch 9200 -> 9200
etc..

At that moment, the pods are still on "ContainerCreating". On the initial deployment on an empty cluster it takes about 110 seconds for all pods to be 1/1 Running.

If I trigger a rebuild by altering a file outside of the "sync" folders (file sync works OK!), then after pushing to the repo the product pod stays at 1/1 Terminating for about 30 seconds, then 0/1 Terminating, and after about 10 seconds the new pod reaches 1/1 Running. In the meantime the port-forward to service/productname was already established during the 1/1 Terminating phase. It doesn't seem to get updated or reset once the pod reaches its 1/1 Running phase.

If I manually trigger the following bash script at that moment, then the port forwarding works OK.

#!/usr/bin/env bash

ps -ef | grep 'service/productname' | grep -v grep | awk '{print $2}' | xargs kill -9

kubectl port-forward service/productname 7000:7000 &
kubectl port-forward service/productname 8000:8000 &
etc...

I'm running skaffold v0.33.0 on mac.

@cmoad commented Jul 10, 2019

@priyawadhwa On first run I can see the services getting successfully forwarded:

Port Forwarding service/abbott-v2 3001 -> 3001
Port Forwarding service/zeppo-v2 5000 -> 5000

However, if the underlying pods change during a deployment update the service port forward becomes unresponsive.

# first run of skaffold dev
% curl http://localhost:3001/health
ok

# after deployment update from a file change
% curl http://localhost:3001/health
curl: (52) Empty reply from server

It looks like the service is still forwarding to the pods correctly inside the minikube VM:

# minikube ssh, then:
$ curl http://10.96.144.148:3001/health
ok

One more test to confirm the tunnel is dropped:

# first run
% sudo netstat -ln | grep 3001
Password:
tcp6       0      0  ::1.52273                                     ::1.3001                                      TIME_WAIT

# after update
% sudo netstat -ln | grep 3001
(no output)

@priyawadhwa (Contributor)

@japsen -- thanks for your detailed explanation, I'll take a look at it more closely and let you know what next steps would be.

@cmoad -- this is currently expected behavior. skaffold only establishes a connection after deploy and doesn't continuously check whether the connection is valid. So if the pod underneath the service being port-forwarded changes, skaffold won't pick up that change, and the connection will stop working.

@priyawadhwa (Contributor)

@japsen I think this issue should be fixed with #2359 -- the healthcheck will wait until deployments have stabilized, and then port forwarding will start, so the pods should be ready by the time port forwarding starts.

cc @tejal29

@priyawadhwa (Contributor)

Hey @japsen -- #2359 was just merged! Could you check and see if this solves your issue?

To download the latest bleeding edge binary, see instructions here

@japsen commented Jul 12, 2019

hi @priyawadhwa thanks for being on this!

I just tested the bleeding edge bin.
On an initial run on an empty cluster, after pushing, it works as expected: it waits and creates the forwards once the pods are ready.

Two breaking problems occur:

  • If I change a file, then file sync seems to be stuck in a loop. I changed a README.md file and it keeps syncing.
Syncing 1 files for eu.gcr.io......
Syncing 1 files for eu.gcr.io......
Syncing 1 files for eu.gcr.io......
Syncing 1 files for eu.gcr.io......
etc..... it doesn't stop.
  • If I change my skaffold.yaml, I would expect it to start a rebuild. However it prints out the following and doesn't continue. This does break the faulty file sync loop though.
Watching for changes every 1s...
time="2019-07-12T11:06:17+02:00" level=error msg="error running dev loop: configuration changed"

my skaffold version is: 9981c37

@priyawadhwa (Contributor)

@japsen thanks for testing this out! Since both of these bugs seem unrelated to port forwarding, would you mind opening new issues for them? And if the port forwarding bug is fixed, we can go ahead and close this issue.

@demisx (Contributor) commented Jul 12, 2019

Same issue here. Had to roll back to v0.31 for it to work.

@demisx (Contributor) commented Jul 12, 2019

Just tried it with the latest version 4c144df and it didn't make any difference. Once skaffold rebuilds the image and restarts the container, the previously working port times out.

@priyawadhwa (Contributor)

@demisx could you please provide logs for the bug you're seeing?

@demisx (Contributor) commented Jul 12, 2019

@priyawadhwa Hmm... There is really nothing different in the logs I am seeing. It's just, as others mentioned, the app no longer works on its port after skaffold updates the docker image and launches a new pod. The port is still open and listening, but any request to it simply times out. Again, this happens after the first update that requires skaffold to build a new image and launch it in a new pod. Works fine in v0.31.

@cmoad commented Jul 15, 2019

@priyawadhwa It seems like port forwarding the deployment has the same effect in v0.33.0. Is there any way to produce a reliable port forward as seen in v0.32.0? Thanks.

@priyawadhwa (Contributor)

Hey @cmoad @demisx I found some bugs in port forwarding and just opened #2477, once it's merged can you let me know if it fixes your issue?

@prokher commented Jul 25, 2019

@priyawadhwa I am using v0.34.0, which seems to include #2477, but the issue is still there. I run skaffold dev --port-forward, and when the pod gets restarted (after changing a source file) the browser is no longer able to connect to the forwarded port. (Actually, the connection times out.)

@prokher commented Jul 25, 2019

Could there be some workaround for this issue until the fix is released?

@demisx (Contributor) commented Jul 25, 2019

My workaround so far has been to roll back to v0.31, the last version that just works for me.

@priyawadhwa (Contributor)

@prokher @demisx we have a PR open to fix some more bugs in port forwarding (#2543) -- once that's in I'll comment here and we can see if it fixes your issues.

@priyawadhwa (Contributor)

Hey @prokher @demisx the PR was just merged -- could you try again with skaffold @ master? Instructions for downloading the latest bleeding edge binary are here

@prokher commented Jul 26, 2019

@priyawadhwa Tried the latest bleeding edge binary, it seems to work. Thank you very much!

@demisx (Contributor) commented Jul 26, 2019

@priyawadhwa I've been using the latest bleeding edge binary a1efe8c and it's been working pretty well so far. Some issues I've noticed that may not be related to this one:

  1. The first time I started it, one of the ports for the downloaded image wasn't forwarded, even though I have port forwarding set up in skaffold.yaml (see below). However, subsequent runs seem to be OK so far. I'll keep an eye on it and report separately if I notice a pattern.

    portForward:
      - resourceType: service
        resourceName: postgres
        port: 5432
      - resourceType: service
        resourceName: elasticsearch
        port: 9200
  2. When I change any file that matches any of the sync patterns, skaffold also tries to sync this file to a completed init-db job container (log: INFO[1182] Skipping sync with pod init-db-7klrh because it's not running), even though I have no sync set up for this resource.

  3. Hitting [Ctrl+C] does not clean up after itself anymore. It stops port forwarding, but all resources remain running in the cluster. I have to run skaffold delete -p dev manually to clean things up.

@GavinOsborn commented Jul 29, 2019

@priyawadhwa I too have this issue.
Environment

  • OS: Windows
  • Environment: Minikube (v1.2.0 with virtual box)
  • K8s Versions Tried: 1.12.8 1.15.0
    (with corresponding kubectl versions)
  • Skaffold Versions: 0.34.0 and 0.34.1
    (installed using Chocolatey)

Observed Behaviour

  • skaffold dev --port-forward successfully opens ports for the initial build.
  • When I make a change, the hot-reload is successful.
  • A connection to the new pod is never established
    • despite the logs often reporting success
    • occasionally the logs report a new port # assigned (and still no actual connectivity)

Steps to Reproduce
I have recreated this using several of the sample applications; most recently I modified the nodejs sample. I made two changes:

  1. I swapped out nodemon src/index.js for node src/index.js
  2. I used this skaffold.yaml
apiVersion: skaffold/v1beta13
kind: Config
build:
  artifacts:
  - image: gcr.io/k8s-skaffold/node-example
    context: backend
deploy:
  kubectl:
    manifests:
    - k8s/*.yaml
  3. Ran skaffold dev --port-forward -v info, which successfully opened port 3000 locally
  4. I tested http://localhost:3000 in my browser several times and it worked as expected
  5. I modified the contents of src/index.js, which prompted a reload
  6. The reload was successful, but the port was assigned to 3001
  7. I attempted to browse http://localhost:3001 and got nothing
  8. I attempted to browse http://localhost:3000 again and got nothing
  9. Looking at the console output I noticed the following errors, which happened exactly when I tried steps 7 & 8:

time="2019-07-29T18:50:20+10:00" level=info msg="retrying kubectl port-forward due to error: E0729 18:50:18.953052 13220 portforward.go:400] an error occurred forwarding 3001 -> 3000: error forwarding port 3000 to pod 7a27efad03e8f0807a1c40dde824e7809a65f6e089068df5719eb3bb9eac4e7e, uid : container not running (7a27efad03e8f0807a1c40dde824e7809a65f6e089068df5719eb3bb9eac4e7e)\n"
Port forwarded service/node from remote port 3000 to local port 3001
time="2019-07-29T18:50:46+10:00" level=info msg="retrying kubectl port-forward due to error: E0729 18:50:44.778788 11392 portforward.go:400] an error occurred forwarding 3000 -> 3000: error forwarding port 3000 to pod 7a27efad03e8f0807a1c40dde824e7809a65f6e089068df5719eb3bb9eac4e7e, uid : Error: No such container: 7a27efad03e8f0807a1c40dde824e7809a65f6e089068df5719eb3bb9eac4e7e\n"
Port forwarded service/node from remote port 3000 to local port 3000

  10. Trying the requests again did not bear any fruit

You can find my full console output below; let me know if I can provide anything else that might help:

PS C:\dev\src\github\skaffold\integration\examples\nodejs> skaffold dev --port-forward -v info
time="2019-07-29T18:46:35+10:00" level=info msg="starting gRPC server on port 50051"
time="2019-07-29T18:46:35+10:00" level=info msg="starting gRPC HTTP server on port 50052"
time="2019-07-29T18:46:35+10:00" level=info msg="Skaffold &{Version:v0.34.1 ConfigVersion:skaffold/v1beta13 GitVersion: GitCommit:a1efe8cc46e7584ad71c2f140cbfb94c1b4d82ff GitTreeState:clean BuildDate:2019-07-25T22:35:35Z GoVersion:go1.12 Compiler:gc Platform:windows/amd64}"
time="2019-07-29T18:46:35+10:00" level=info msg="no config entry found for kube-context minikube"
time="2019-07-29T18:46:35+10:00" level=info msg="Using kubectl context: minikube"
time="2019-07-29T18:46:35+10:00" level=info msg="no config entry found for kube-context minikube"
time="2019-07-29T18:46:35+10:00" level=info msg="no config entry found for kube-context minikube"
time="2019-07-29T18:46:37+10:00" level=info msg="no config entry found for kube-context minikube"
Generating tags...

  • gcr.io/k8s-skaffold/node-example -> gcr.io/k8s-skaffold/node-example:v0.34.1-9-gc1ff25a6-dirty
    Tags generated in 258.6111ms
    Starting build...
    Found [minikube] context, using local docker daemon.
    Building [gcr.io/k8s-skaffold/node-example]...
    Sending build context to Docker daemon 101.4kB
    Step 1/7 : FROM node:10.15.3-alpine
    10.15.3-alpine: Pulling from library/node
    e7c96db7181b: Pulling fs layer
    df9eac31dfef: Pulling fs layer
    0a20433d95a4: Pulling fs layer
    0a20433d95a4: Verifying Checksum
    0a20433d95a4: Download complete
    e7c96db7181b: Download complete
    e7c96db7181b: Pull complete
    df9eac31dfef: Download complete
    df9eac31dfef: Pull complete
    0a20433d95a4: Pull complete
    Digest: sha256:aa28f3b6b4087b3f289bebaca8d3fb82b93137ae739aa67df3a04892d521958e
    Status: Downloaded newer image for node:10.15.3-alpine
    ---> 56bc3a1ed035
    Step 2/7 : WORKDIR /app
    ---> Running in a622f22a0d29
    Removing intermediate container a622f22a0d29
    ---> 9da07585ad13
    Step 3/7 : EXPOSE 3000
    ---> Running in bf1bedd6afab
    Removing intermediate container bf1bedd6afab
    ---> fb8606f40237
    Step 4/7 : CMD ["npm", "run", "dev"]
    ---> Running in 0466be1bbd7e
    Removing intermediate container 0466be1bbd7e
    ---> af409cbc06b9
    Step 5/7 : COPY package* ./
    ---> 8a927883b9dc
    Step 6/7 : RUN npm install
    ---> Running in 17de82aeef89

nodemon@1.18.7 postinstall /app/node_modules/nodemon
node bin/postinstall || exit 0

Love nodemon? You can now support the project via the open collective:

https://opencollective.com/nodemon/donate

npm WARN backend@1.0.0 No description
npm WARN backend@1.0.0 No repository field.
npm WARN backend@1.0.0 No license field.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.4 (node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for fsevents@1.2.4: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})

added 265 packages from 161 contributors and audited 2359 packages in 14.22s
found 34 high severity vulnerabilities
run npm audit fix to fix them, or npm audit for details
Removing intermediate container 17de82aeef89
---> 9027af0f5d5e
Step 7/7 : COPY . .
---> 48a60d092e62
Successfully built 48a60d092e62
Successfully tagged gcr.io/k8s-skaffold/node-example:v0.34.1-9-gc1ff25a6-dirty
Build complete in 59.2691226s
Starting test...
Test complete in 3.7613ms
Starting deploy...
kubectl client version: 1.15
service/node created
deployment.apps/node created
Deploy complete in 4.3388013s
Watching for changes...
time="2019-07-29T18:47:48+10:00" level=info msg="Stream logs from pod: node-5d46645698-dnfqs container: node"
[node-5d46645698-dnfqs node]
[node-5d46645698-dnfqs node] > backend@1.0.0 dev /app
[node-5d46645698-dnfqs node] > node src/index.js
[node-5d46645698-dnfqs node]
[node-5d46645698-dnfqs node] Example app listening on port 3000!
Port forwarded service/node from remote port 3000 to local port 3000
time="2019-07-29T18:49:29+10:00" level=info msg="files modified: [backend\src\index.js]"
Generating tags...

  • gcr.io/k8s-skaffold/node-example -> gcr.io/k8s-skaffold/node-example:v0.34.1-9-gc1ff25a6-dirty
    Tags generated in 413.9461ms
    Starting build...
    Found [minikube] context, using local docker daemon.
    Building [gcr.io/k8s-skaffold/node-example]...
    Sending build context to Docker daemon 101.4kB
    Step 1/7 : FROM node:10.15.3-alpine
    ---> 56bc3a1ed035
    Step 2/7 : WORKDIR /app
    ---> Using cache
    ---> 9da07585ad13
    Step 3/7 : EXPOSE 3000
    ---> Using cache
    ---> fb8606f40237
    Step 4/7 : CMD ["npm", "run", "dev"]
    ---> Using cache
    ---> af409cbc06b9
    Step 5/7 : COPY package* ./
    ---> Using cache
    ---> 8a927883b9dc
    Step 6/7 : RUN npm install
    ---> Using cache
    ---> 9027af0f5d5e
    Step 7/7 : COPY . .
    ---> 5bdac18ecf60
    Successfully built 5bdac18ecf60
    Successfully tagged gcr.io/k8s-skaffold/node-example:v0.34.1-9-gc1ff25a6-dirty
    Build complete in 594.7379ms
    Starting test...
    Test complete in 2.9599ms
    Starting deploy...
    kubectl client version: 1.15
    deployment.apps/node configured
    Deploy complete in 2.9984198s
    Watching for changes...
    Port forwarded service/node from remote port 3000 to local port 3001
    time="2019-07-29T18:49:36+10:00" level=info msg="Stream logs from pod: node-68c49d6464-b2pbj container: node"
    [node-68c49d6464-b2pbj node]
    [node-68c49d6464-b2pbj node] > backend@1.0.0 dev /app
    [node-68c49d6464-b2pbj node] > node src/index.js
    [node-68c49d6464-b2pbj node]
    [node-68c49d6464-b2pbj node] Example app listening on port 3000!
    [node-5d46645698-dnfqs node]
    time="2019-07-29T18:50:20+10:00" level=info msg="retrying kubectl port-forward due to error: E0729 18:50:18.953052 13220 portforward.go:400] an error occurred forwarding 3001 -> 3000: error forwarding port 3000 to pod 7a27efad03e8f0807a1c40dde824e7809a65f6e089068df5719eb3bb9eac4e7e, uid : container not running (7a27efad03e8f0807a1c40dde824e7809a65f6e089068df5719eb3bb9eac4e7e)\n"
    Port forwarded service/node from remote port 3000 to local port 3001
    time="2019-07-29T18:50:46+10:00" level=info msg="retrying kubectl port-forward due to error: E0729 18:50:44.778788 11392 portforward.go:400] an error occurred forwarding 3000 -> 3000: error forwarding port 3000 to pod 7a27efad03e8f0807a1c40dde824e7809a65f6e089068df5719eb3bb9eac4e7e, uid : Error: No such container: 7a27efad03e8f0807a1c40dde824e7809a65f6e089068df5719eb3bb9eac4e7e\n"
    Port forwarded service/node from remote port 3000 to local port 3000
    Cleaning up...
    PS C:\dev\src\github\skaffold\integration\examples\nodejs>

@demisx (Contributor) commented Jul 29, 2019

I've been testing with the latest bleeding edge version 4ba3d06 today and all issues reported earlier still exist. The port forwarding seems to have gotten worse, though. I can't start skaffold dev with all of the ports forwarded properly; something is missing each time and I have to forward it manually.

@balopat (Contributor) commented Jul 30, 2019

@demisx - I'm sorry to see this, and thank you for persisting in testing with us; we are focused on getting this stabilized asap. The kubectl port-forward issues were not accounted for in the redesign of our port forwarding; we might have to move to yet another design to get around them.

Can you share a bit more info? Are you on minikube? On Linux/Mac/Windows?
It is surprising to hear that port-forwarding doesn't work the first time either. Can you share logs with -v trace?

@balopat (Contributor) commented Jul 30, 2019

@demisx - master contains one more port-forward fix; it should be all good now. After a couple of refreshes (while the new pod is coming up, and potentially kubectl port-forward being restarted after a failed connection), it should work.

We noticed today a bug with the latest minikube - if you are on minikube 1.2.0 - make sure to wait until #2577 is merged.

@demisx (Contributor) commented Jul 31, 2019

@balopat I am happy to help where I can, as Skaffold is one cool project that makes our development more efficient daily. I've been using the latest cdffdad bleeding edge version today and, indeed, the port forwarding issue seems to be resolved. I haven't had any problems yet. The other issues I reported earlier still exist, but they are probably out of scope here.

Here is my environment:

  • OS: 10.14.5 Mojave
  • Local K8S: Docker for Desktop 2.0.0.3 (K8S Server: 1.10.11)
  • K8S Client: 1.15.1

Please let me know if you need any other info.

@nkubala (Contributor) commented Jul 31, 2019

@demisx thanks for the info, and glad to hear the issue is fixed with the latest commits! This has obviously been a pain for us recently, so any progress is good progress :) Since this particular issue seems to be resolved I'm going to close it, but if you are still seeing the other problems you reported here, please open them as separate issues so we can track them 👍

@nkubala closed this as completed Jul 31, 2019
@demisx (Contributor) commented Jul 31, 2019

@nkubala Got it. So, is it OK to create new GitHub issues for problems I am seeing in the latest master or should I report against the releases only?

@nkubala (Contributor) commented Jul 31, 2019

Yes, that's fine! The sooner we catch the issues the better :) Just be sure to include the commit if you're seeing them at HEAD.
