-
Notifications
You must be signed in to change notification settings - Fork 18.8k
Allow direct routing to container ports from trusted interfaces #49832
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
44ca6dc
to
c1d50bc
Compare
c1d50bc
to
29b4c52
Compare
29b4c52
to
eec99ec
Compare
eec99ec
to
79c88a3
Compare
// TrustedHostInterfaces can be used to supply a list of host interfaces that are | ||
// allowed direct access to published ports on a container's address. | ||
TrustedHostInterfaces = "com.docker.network.bridge.trusted_host_interfaces" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not strictly related to this PR, but I think we should consider adding CLI flags that would set these options and make them more discoverable.
Alternatively, at least list them in the description of --opt
in the network create --help
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed - they are terrible ... the worst part is we can't currently error when a -o
option isn't matched because it contains a typo - it'll still appear in inspect output, but just silently won't do what was expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Should we consider removing |
Not yet unfortunately ... it's still needed to stop the daemon from trying to create iptables "raw" rules on hosts that don't have kernel support for it (the original purpose of that env var). This new flag only stops it from creating the raw rule that blocks remote access to container ports, not the one that blocks access remote access to ports published the host's IPv4 loopback address. (And it doesn't stop the daemon from trying to delete from the raw table, which would also be an error without the kernel module.) |
@@ -137,6 +138,7 @@ func (ncfg *networkConfiguration) MarshalJSON() ([]byte, error) { | |||
nMap["GwModeIPv4"] = ncfg.GwModeIPv4 | |||
nMap["GwModeIPv6"] = ncfg.GwModeIPv6 | |||
nMap["EnableICC"] = ncfg.EnableICC | |||
nMap["TrustedHostInterfaces"] = strings.Join(ncfg.TrustedHostInterfaces, " ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not simply marshal the slice of strings as a JSON array of strings? The persistence of bridge network configurations is so far removed from parsing the netlabels that I am unconvinced that there would be any maintenance or code readability advantage to using the same serialization scheme in the persistence layer. Marshaling to a JSON array has the advantage of being compatible with the default un/marshalling of a Go []string
struct field. With a stringly-typed list we would have to define a custom type that implements json.Marshaler
and json.Unmarshaler
in order to un/marshal directly to/from a struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a stringly-typed list we would have to define a custom type that implements json.Marshaler and json.Unmarshaler in order to un/marshal directly to/from a struct.
Yes, exactly. Although, it'd marshal without needing a custom type so the other way to be to unmarshal by type asserting the value to []interface{}
, then iterate over that and type assert the elements to strings.
So, "simply marshal the slice of strings as a JSON array of strings" isn't as simple as simply storing a string!
(I've added a commit with a mashal/unmarshal unit test for networkConfiguration
as there didn't seem to be one.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the other way to be to unmarshal by type asserting the value to
[]interface{}
, then iterate over that and type assert the elements to strings.
Fair point. Though just for completeness' sake, we do have a "./internal/sliceutil".Map
function which makes that operation not much more verbose than a strings.FieldsFunc
call.
sliceutil.Map(v, func(a any) string { s, _ := a.(string); return s })
2170160
to
5075be7
Compare
@@ -337,6 +340,8 @@ func (c *networkConfiguration) fromLabels(labels map[string]string) error { | |||
if c.EnableICC, err = strconv.ParseBool(value); err != nil { | |||
return parseErr(label, value, err.Error()) | |||
} | |||
case TrustedHostInterfaces: | |||
c.TrustedHostInterfaces = strings.FieldsFunc(value, func(r rune) bool { return r == ':' }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FieldsFunc splits the string s at each run of Unicode code points...
(emphasis mine.)
value
is a bag of bytes, not a Unicode string. While I can't come up with an pathological input which would split in an unexpected way, I still worry about the risk of surprising behaviours from treating the value as a UTF-8 string.
c.TrustedHostInterfaces = strings.FieldsFunc(value, func(r rune) bool { return r == ':' }) | |
c.TrustedHostInterfaces = strings.Split(value, ":") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite that, because strings.Split("", ":")
is [""] ... so the bridge driver sees an empty interface name and generates a broken iptables rule. But, ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's annoying. I overlooked that difference in behaviour in the godoc. Since my worry is more theoretical than anything at this point, I'm fine with reverting to strings.FieldsFunc
until I can prove the existence of a pathological input that would round-trip incorrectly. Such an input may not exist. Now excuse me while I whip up a quick fuzz test...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fuzzing found nothing. My fears were unfounded. strings.FieldsFunc
is fit for purpose, as far as I can tell, to split a bag-of-bytes string on a UTF-8 codepoint. And it makes sense once I think about it: UTF-8 is self-synchronizing so there is no confusing an ASCII char for part of a multibyte sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks Cory - I've put it back to FieldsFunc
, and rebased just because it had got a bit behind. I'll let the tests run, then merge.
@@ -215,6 +217,10 @@ func (ncfg *networkConfiguration) UnmarshalJSON(b []byte) error { | |||
ncfg.GwModeIPv6, _ = newGwMode(v.(string)) | |||
} | |||
ncfg.EnableICC = nMap["EnableICC"].(bool) | |||
if v, ok := nMap["TrustedHostInterfaces"]; ok { | |||
s, _ := v.(string) | |||
ncfg.TrustedHostInterfaces = strings.FieldsFunc(s, func(r rune) bool { return r == ':' }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ncfg.TrustedHostInterfaces = strings.FieldsFunc(s, func(r rune) bool { return r == ':' }) | |
ncfg.TrustedHostInterfaces = strings.Split(s, ":") |
GwModeIPv4 gwMode | ||
GwModeIPv6 gwMode | ||
EnableICC bool | ||
TrustedHostInterfaces []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should write down the preconditions that other code depends upon so the next person knows not to break the contract.
TrustedHostInterfaces []string | |
TrustedHostInterfaces []string // Interface names must not contain ':' characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, just in case Linux starts allowing ':' chars!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm referring to the bridge_store code which would not faithfully round-trip names that contain colon characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sure - I added the comment. (But, it's still a list of interface names, and it's a Linux rule that interface names can't contain ':' chars.)
5075be7
to
a62fe56
Compare
trusted_host_interface have access to published ports on container addresses - enabling direct routing to the container via those interfaces. Signed-off-by: Rob Murray <rob.murray@docker.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
Per-network option com.docker.network.bridge.trusted-host-interfaces accepts a list of interfaces that are allowed to route directly to a container's published ports in a bridge network with nat enabled. This daemon level option disables direct access filtering, enabling direct access to published ports on container addresses in all bridge networks, via all host interfaces. It overlaps with short-term env-var workaround: DOCKER_INSECURE_NO_IPTABLES_RAW=1 - it does not allow packets sent from outside the host to reach ports published only to 127.0.0.1 - it will outlive iptables (the workaround was initially intended for hosts that do not have kernel support for the "raw" iptables table). Signed-off-by: Rob Murray <rob.murray@docker.com>
a62fe56
to
44a3453
Compare
…es" (#22601) ## Description ~**This update is for moby 28.2.0 - do not merge until it ships.**~ - 28.2 has shipped now. Add description for daemon option `--allow-direct-routing` and network option `com.docker.network.bridge.trusted_host_interfaces`. ## Related issues or tickets moby/moby#49832 ## Reviews <!-- Notes for reviewers here --> <!-- List applicable reviews (optionally @tag reviewers) --> - [ ] Technical review - [ ] Editorial review - [ ] Product review Signed-off-by: Rob Murray <rob.murray@docker.com>
- What I did
Since 28.0.0, docker's iptables rules will block packets addressed to a container's address, unless they originate from the host.
The issue linked above describes a case where this is undesirable (Flannel VXLAN provides an overlay network, and packets arriving on its interface should be treated as belonging to the bridge network). The existing workarounds are:
DOCKER_INSECURE_NO_IPTABLES_RAW=1
nat-unprotected
Follow-ups needed:
dockerd.md
- How I did it
bridge: add option com.docker.network.bridge.trusted_host_interfaces
Interfaces listed have access to published ports on the container's address - enabling direct routing to the container via those interfaces.
It accepts a colon-separated list (comma is allowed in an interface name, colon isn't).
For example:
(Like other options for user-defined networks, it's not possible to supply this option for the default bridge network.)
Add daemon option --allow-direct-routing
This daemon level option disables direct access filtering, enabling direct access to published ports on container addresses in all bridge networks, via all host interfaces.
For example, in
/etc/docker/daemon.json
:It overlaps with short-term env-var workaround
DOCKER_INSECURE_NO_IPTABLES_RAW=1
, but:127.0.0.1
.- How to verify it
Updated integration tests.
- Human readable description for the release notes