Skip to content

Commit

Permalink
ipvs: avoid drop first packet to reuse conntrack
Browse files Browse the repository at this point in the history
Since commit f719e37 ("ipvs: drop first packet
to redirect conntrack"), when a new TCP connection
meet the conditions that need reschedule, the first
syn packet is dropped, this cause one second latency
for the new connection, more discussion about this
problem can easy seach from google, such as:

1)One second connection delay in masque
https://marc.info/?t=151683118100004&r=1&w=2

2)IPVS low throughput #70747
kubernetes/kubernetes#70747

3)Apache Bench can fill up ipvs service proxy in seconds torvalds#544
cloudnativelabs/kube-router#544

4)Additional 1s latency in `host -> service IP -> pod`
kubernetes/kubernetes#90854

The root cause is when the old session is expired, the
conntrack related to the session is dropped by
ip_vs_conn_drop_conntrack. The code is as follows:
```
static void ip_vs_conn_expire(struct timer_list *t)
{
...

                if ((cp->flags & IP_VS_CONN_F_NFCT) &&
                    !(cp->flags & IP_VS_CONN_F_ONE_PACKET)) {
                        /* Do not access conntracks during subsys cleanup
                         * because nf_conntrack_find_get can not be used after
                         * conntrack cleanup for the net.
                         */
                        smp_rmb();
                        if (ipvs->enable)
                                ip_vs_conn_drop_conntrack(cp);
                }
...
}
```
As the code show, only if the condition  (cp->flags & IP_VS_CONN_F_NFCT)
is true, ip_vs_conn_drop_conntrack will be called.
So we solve this bug by following steps:
1) erase the IP_VS_CONN_F_NFCT flag (it is safely because no packets will
   use the old session)
2) call ip_vs_conn_expire_now to release the old session, then the related
   conntrack will not be dropped
3) then ipvs unnecessary to drop the first syn packet,
   it just continue to pass the syn packet to the next process,
   create a new ipvs session, and the new session will related to
   the old conntrack(which is reopened by conntrack as a new one),
   the next whole things is just as normal as that the old session
   isn't used to exist.

This patch has been verified on our thousands of kubernets node servers on Tencent Inc.
Signed-off-by: YangYuxi <yx.atom1@gmail.com>
  • Loading branch information
yyx authored and intel-lab-lkp committed Jun 11, 2020
1 parent c382928 commit b6e32ff
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions net/netfilter/ipvs/ip_vs_core.c
Original file line number Diff line number Diff line change
Expand Up @@ -2086,11 +2086,11 @@ ip_vs_in(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, int
}

if (resched) {
if (uses_ct)
cp->flags &= ~IP_VS_CONN_F_NFCT;
if (!atomic_read(&cp->n_control))
ip_vs_conn_expire_now(cp);
__ip_vs_conn_put(cp);
if (uses_ct)
return NF_DROP;
cp = NULL;
}
}
Expand Down

0 comments on commit b6e32ff

Please sign in to comment.