-
Notifications
You must be signed in to change notification settings - Fork 363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add EnableLogging
and LogLabel
supports for Node NetworkPolicy
#6626
base: main
Are you sure you want to change the base?
Conversation
16431bb
to
cb09cbc
Compare
82bc18c
to
1589c2a
Compare
a829204
to
43efbe4
Compare
// used as iptables log prefix. According to https://ipset.netfilter.org/iptables-extensions.man.html, the prefix is | ||
// up to 29 letters long. | ||
if len(logLabel) > 28 { | ||
klog.InfoS("The log label exceeds 28 characters and will be truncated", "logLabel", logLabel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we log this in the validation webhook (instead or in addition)? cc @tnqn for his opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with validating with a webhook.
One more thing, we need to add some basic information about the rule to the logLabel which will be used as iptables log prefix. The worst case is like the following:
| Antrea |:|Out|:|Reject|:| [user-provided label] |:|
| 6 |1|3 |1|6 |1| n |1|
As a result, the user-provided log label (n
) is up to 10 characters long. Will that be sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that having a fixed-limit, regardless of the rule direction or the rule action, may be the right approach.
Maybe we could replace "Out" with "O" and "In" with "I". I would give us 2 extra characters (making the limit 12 for the user-provided label), without really compromising readability too much. I was thinking of doing the same for the action, but I think it would hurt readability too much?
12 characters is not much, but we are heavily constrained by the implementation.
I was initially thinking of failing in the validation webhook if the limit is exceeded, but I now feel like it is better to just log a message warning the user. I'd log it once in the webhook (controller) and once in the agent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd log it once in the webhook (controller) and once in the agent.
Do you mean that something should be shown when a Node NetworkPolicy with a log label over 12 characters?
I made some changes to the validator. If I understood correctly, what I expected is that: when I use kubectl
to apply a Node NetworkPolicy with a log label over 12 characters, a warning message will be shown in the terminal. It seems that there is something wrong, after applying such Node NetworkPolicy, nothing is shown in the terminal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you will need to use the Warnings
field of the AdmissionResponse
: https://pkg.go.dev/k8s.io/api/admission/v1#AdmissionResponse
I am not sure we have done it in the past. Is this what you tried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! That's what I need!
43efbe4
to
0244a6d
Compare
// used as iptables log prefix. According to https://ipset.netfilter.org/iptables-extensions.man.html, the prefix is | ||
// up to 29 letters long. | ||
if len(logLabel) > 28 { | ||
klog.InfoS("The log label exceeds 28 characters and will be truncated", "logLabel", logLabel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This log is a little misleading, probably better to include the logLabel with prefix and the original one in Node NP.
klog.InfoS("The log label exceeds 28 characters and will be truncated", "logLabel", logLabel) | |
klog.InfoS("The log label exceeds 28 characters and will be truncated", "logLabel", logLabel, "rule.LogLabel", rule.LogLabel) |
I am wondering will this kind of truncation has any side-affect? this label seems likely to be duplicated in different Node NPs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this label seems likely to be duplicated in different Node NPs.
Do you mean that different Node NPs that have that same logLabels?
3dfb4ee
to
5df6efe
Compare
503637b
to
fa40171
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good on my side, only minor comments
EnableLogging: true, | ||
LogLabel: "ingress1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just want to make sure that we still have a policy rule for which EnableLogging
is false, for the sake of testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You just reminded me. I left one policy with EnableLogging
set to false
and another policy with EnableLogging
set to true
but with LogLabe
l left empty, for the sake of testing.
446c1da
to
4fef712
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on my side
@luolanzone do you want to do another review?
Probably no time this week, I will check with hongliang offline for the question I posted. @antoninbas you can go ahead. Thanks. |
/test-all |
@hongliangl I just realized, shouldn't we have an e2e test for this? Without it we only have unit tests and no guarantee that it actually works |
Make sense, I'll add. |
4fef712
to
eb62bbb
Compare
test/e2e/nodenetworkpolicy_test.go
Outdated
// Before generating some traffic, there should be no corresponding log on hostNetwork Pod x/a. | ||
stdout, _, err := data.RunCommandFromPod(getNS("x"), podXaName, "c80", []string{"dmesg"}) | ||
require.NoError(t, err) | ||
require.NotContains(t, stdout, expectedLogPrefix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would fail if I wanted to run the test locally multiple times on the same Kind cluster (e.g., because I am troubleshooting something related to Node NPs).
It should be possible to run the test multiple times on the same cluster. One acceptable approach could be to have a random label, and include a comment in the code explaining why the label is random. There may be better solutions though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a random label.
Besides, this test cannot be run on github Kind cluster. In a Kind cluster, each Kubernetes Node runs as a Docker container. The iptables LOG target doesn't generate kernel logs inside containers unless /proc/sys/net/netfilter/nf_log_all_netns is set to 1. However, this setting cannot be modified in GitHub Actions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to skip that specific test on Kind
ed07d42
to
8eebffc
Compare
8eebffc
to
72574d2
Compare
/test-all-features-conformance |
// In a Kind cluster, each Kubernetes Node runs as a Docker container. The iptables LOG target doesn't generate | ||
// kernel logs inside containers unless /proc/sys/net/netfilter/nf_log_all_netns is set to 1. However, this setting | ||
// cannot be modified in GitHub Actions. | ||
skipIfProviderIs(t, "kind", "The iptables LOG action doesn't generate log in container") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think we should add a note about this in the documentation, in case someone is trying to use the feature on a Kind cluster and cannot figure out why it's not working?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. We should mention it in the documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
test/e2e/nodenetworkpolicy_test.go
Outdated
require.Contains(t, stdout, expectedLogPrefix) | ||
require.NotContains(t, stdout, toBeTruncated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: assert
is more appropriate than require
here IMO
IMO we should do something better here. We should locate the actual log entry (maybe with a regex) and validate independent fields.
BTW, does the log always show up immediately or should we use Poll
to avoid flakes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated this part:
- Add a regex to match source/destination IP, random log label
- Use
assert.EventuallyWithT
to avoid flakes.
|
||
// Generate some traffic that will be dropped by acnp-drop-x-to-y-egress. | ||
_, err = k8sUtils.Probe(getNS("x"), "a", getNS("y"), "a", p80, ProtocolTCP, nil, nil) | ||
require.NoError(t, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure when to use failOnError
or require.NoError
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure either. I saw that failOnError
will delete the test namespace. In this test, I don't think it is necessary to delete the test namespace when getting an error. Maybe require.NoError
is sufficient.
dbe52ad
to
c2b86b9
Compare
This commit introduces limited support for traffic logging in Node NetworkPolicy. The limitations are: - Traffic logs are written only to the system log (not managed by Antrea). Users can filter logs using syslog filters. - The `LogLabel` for Node NetworkPolicy is restricted to a maximum of 12 characters. Node NetworkPolicy's data path is implemented via iptables. An iptables "non-terminating target" `LOG` is added before the final matching rule to log packets to the system kernel log. The logs provide packet match details, such as: ``` Sep 2 10:31:07 k8s-node-control-plane kernel: [6657320.789675] Antrea:I:Allow:allow-http:IN=ens224 OUT= MAC=00:50:56:a7:fb:18:00:50:56:a7:23:47:08:00 SRC=10.10.0.10 DST=192.168.240.200 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=52813 DF PROTO=TCP SPT=57658 DPT=80 WINDOW=64240 RES=0x00 SYN URGP=0 Sep 2 10:31:11 k8s-node-control-plane kernel: [6657324.899219] Antrea:I:Drop:default-drop:IN=ens224 OUT= MAC=00:50:56:a7:fb:18:00:50:56:a7:23:47:08:00 SRC=192.168.240.201 DST=192.168.240.200 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=27486 DF PROTO=TCP SPT=33152 DPT=80 WINDOW=64240 RES=0x00 SYN URGP=0 ``` The log prefix (e.g., `Antrea:I:Allow:allow-http:`) is up to 29 characters long and includes a user-provided log label (up to 12 characters). The log prefix format: ``` |---1--| |2| |---3--| |----------4--------| |Antrea|:|I|:|Reject|:|user-provided label|:| |6 |1|1|1|4-6 |1|1-12 |1| ``` - Part 1: Fixed, "Antrea" - Part 2: Direction, "I" (In) or "O" (Out) - Part 3: Action, "Allow", "Drop", or "Reject" - Part 4: User-provided log label, up to 12 characters Signed-off-by: Hongliang Liu <hongliang.liu@broadcom.com>
c2b86b9
to
ca3a41c
Compare
For #6525
This commit introduces limited support for traffic logging in Node
NetworkPolicy. The limitations are:
Users can filter logs using syslog filters.
LogLabel
for Node NetworkPolicy is restricted to a maximum of12 characters.
Node NetworkPolicy's data path is implemented via iptables. An iptables
"non-terminating target"
LOG
is added before the final matching rule tolog packets to the system kernel log. The logs provide packet match details,
such as:
The log prefix (e.g.,
Antrea:I:Allow:allow-http:
) is up to 29 characters longand includes a user-provided log label (up to 12 characters). The log prefix format: