Proposal: Container annotations (metadata) #9013

squaremo · 2014-11-06T23:08:34Z

This PR describes "Container annotations", that is, metadata for containers.

Motivation

The motivation for annotations is to be able to "hang" values on a container so that clients -- e.g., plugins -- can examine them and react accordingly. For example, a weave plugin might attach a container to the weave network based on a weaveIPs annotation.

Annotations can be set while the container is running, and are visible to consumers of the remote API and to command-line tooling, but not in general from within the container.

Synopsis

Command line:

# Run a container with an annotation
INSTANCE=$(docker run -dti --annotate example.com/animal=Dog ubuntu)
# Annotate a container (whether it's running or not)
docker annotate example.com/animal Fox $INSTANCE
# Get an annotation
docker inspect --format='{{ index .Annotations "example.com" "animal" }}' $INSTANCE

NB: annotation values are JSON literals, so some values will require quoting; I'll discuss that below.

Dockerfile:

ANNOTATE example.com/animal Goat

Design choices

Keys

Keys are namespaced labels <domain>/<id>, where the <id> has no structure (i.e., no dots). This is to line up with e.g., Kubernetes' idea of annotation and label keys.

Although there's no structure to keys when setting annotations, the values may be maps and arrays, and you can reach down into them with the format argument.

docker annotate example.com/animals '["deer", "otter", "seal"]' $INSTANCE
docker inspect --format='{{ index .Annotations "example.com" "animals" 0 }}' $INSTANCE

JSON values

Values are JSON literals. This will often necessitate quoting, which can be awkward. The upside is that you can have structured values, like lists of labels or IP addresses or animals.

Anything that can't be parsed as a JSON literal is interpreted as a string -- but care needs to be taken here, since null, numbers and booleans look like strings without the quotes. Anything that's not a hard-coded value, e.g., variables in scripts, should always be quoted if it's intended as a string.
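
To make the fallback rule concrete, here is a minimal Python sketch of how a value given on the command line might be interpreted. The function name `parse_annotation_value` is hypothetical and not part of the proposal; it only illustrates "try JSON first, otherwise treat the input as a string".

```python
import json


def parse_annotation_value(raw: str):
    """Interpret raw as a JSON literal, falling back to a plain string.

    Hypothetical helper, for illustration only.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON (e.g. the bare word Fox): keep it as a string.
        return raw


# Bare words are kept as strings; JSON literals keep their JSON type.
for raw in ["Fox", "42", '"42"', "true", "null", '["deer", "otter", "seal"]']:
    value = parse_annotation_value(raw)
    print(repr(raw), "->", type(value).__name__, repr(value))
```

The pair `42` and `"42"` shows why values taken from variables should be quoted when a string is intended: the first becomes a number, the second a string. Note that Python's `json` module also accepts `NaN` and `Infinity`, which strict JSON does not, so a real implementation might need a stricter parser.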

Update semantics

The given semantics eschew transactions and fine-grained operations; you can inspect an annotation then set it, or you can just stomp on whatever value it already has; but there's no operation for, say, atomically adding an entry to an array-valued annotation. This is mainly to keep things simple to implement and script.

null for delete

The literal null is used to remove an annotation. This is consistent with what you get if you ask docker inspect for a field that doesn't exist -- i.e., null means missing.
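
As a rough sketch of these update semantics, the Python below models the annotation store as a nested dictionary keyed by domain and then id, which is an assumption based on the `index .Annotations "example.com" "animal"` form used in the synopsis. The function `set_annotation` is hypothetical; it overwrites whole values and treats the JSON literal null as a delete.

```python
import json


def set_annotation(annotations: dict, key: str, raw_value: str) -> None:
    """Set or delete one annotation in place (hypothetical helper).

    key has the form <domain>/<id>; raw_value is a JSON literal or,
    failing that, a plain string. The literal null removes the key.
    """
    domain, sep, ident = key.partition("/")
    if not sep or not domain or not ident:
        raise ValueError(f"key must look like <domain>/<id>: {key!r}")

    try:
        value = json.loads(raw_value)
    except json.JSONDecodeError:
        value = raw_value

    if value is None:
        # null means "missing": drop the entry, and the domain if now empty.
        bucket = annotations.get(domain, {})
        bucket.pop(ident, None)
        if not bucket:
            annotations.pop(domain, None)
        return

    # No merging: the new value replaces whatever was there before.
    annotations.setdefault(domain, {})[ident] = value


store = {}
set_annotation(store, "example.com/animal", "Dog")
set_annotation(store, "example.com/animal", "Fox")   # overwrites Dog
print(store)
set_annotation(store, "example.com/animal", "null")  # deletes the key
print(store)
```

Under this reading, storing the four-letter string "null" would require passing the quoted JSON string `"null"` rather than the bare literal.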

Events

There's an event "update" that gets fired whenever container annotations are changed. This is so that clients can react to changes in whatever annotations they were interested in.

It doesn't carry any additional information: clients are expected to go and look for a new value and react accordingly. Possibly the key could be included in the event, so clients don't end up spamming docker inspect.
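
The client side of this could look roughly like the Python sketch below. Everything in it is hypothetical: `AnnotationWatcher`, the `fetch` callable (standing in for whatever a client uses to read an annotation, such as a call to docker inspect), and the callback signature are assumptions for illustration. Because the event carries no payload, the client re-reads the one key it cares about and reacts only if the value actually changed.

```python
from typing import Any, Callable, Dict

_UNSEEN = object()  # marks "no value observed yet", distinct from None (missing)


class AnnotationWatcher:
    """Re-reads a single annotation whenever an "update" event arrives."""

    def __init__(
        self,
        fetch: Callable[[str, str], Any],       # (container_id, key) -> value or None
        key: str,
        on_change: Callable[[str, Any], None],  # (container_id, new_value)
    ) -> None:
        self._fetch = fetch
        self._key = key
        self._on_change = on_change
        self._last: Dict[str, Any] = {}

    def handle_update_event(self, container_id: str) -> None:
        # The event says only "something changed", so look the value up again.
        new_value = self._fetch(container_id, self._key)
        if new_value != self._last.get(container_id, _UNSEEN):
            self._last[container_id] = new_value
            self._on_change(container_id, new_value)


# Usage with an in-memory stand-in for the daemon.
fake_daemon = {"c1": "Dog"}
watcher = AnnotationWatcher(
    fetch=lambda cid, key: fake_daemon.get(cid),
    key="example.com/animal",
    on_change=lambda cid, value: print(cid, "is now", value),
)
watcher.handle_update_event("c1")  # first sighting: callback fires
watcher.handle_update_event("c1")  # unrelated update: value unchanged, no callback
```

Every update event costs one lookup per interested client, which is the "spamming" concern; including the key in the event would let clients skip lookups for keys they do not watch.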

crosbymichael · 2014-11-06T23:11:08Z

Have you considered the term label instead of attributes?

squaremo · 2014-11-06T23:13:14Z

Have you considered the term label instead of attributes?

Yes; but I didn't think "label" does justice to structured values. I could be talked around ..

rade · 2014-11-07T10:57:53Z

docs/sources/reference/commandline/cli.md

+
+## attr
+
+    Usage: docker attr KEY VALUE


Surely there should be a container-id somewhere in there. Same mistake is in the PR text.

bfirsh · 2014-11-07T12:36:44Z

+1 labels, partially because it is already an established term from Kubernetes.

@brendandburns @bgrant0607 You have expressed interest in getting Kubernetes-style labels upstream. Is this proposal something that could work for Kubernetes?

rade · 2014-11-07T12:43:21Z

@bfirsh

+1 labels, partially because it is already an established term from Kubernetes.

Only if the functionality provided here essentially matches Kubernetes. In particular, the Kubernetes docs are completely silent as to what form label values take, and the only examples given are strings.

squaremo · 2014-11-07T14:52:45Z

+1 labels, partially because it is already an established term from Kubernetes.

Having had a look at the Kubernetes labels design doc, I think using "labels" here would be a mistake. It should be possible (as in it could be a design goal) to implement Kubernetes' idea of labels using attributes, but there are disconnects:

Kubernetes wants labels to apply to pods (and other Kubernetes API objects), not containers. Actually it's a bit difficult to tell what's included in "other objects", but I think containers are not included.
Labels are identifying, that is, intended to be used to select objects; attributes as suggested here don't have that property (though some future docker API might include it for convenience).

So calling them the same thing would quite possibly induce confusion. If anything, attributes are similar to Kubernetes' idea of annotations, but as @rade suggests, I'd want to avoid conflating them unless they were really the same thing.

squaremo · 2014-11-07T15:00:15Z

So calling [attributes] [labels] would quite possibly induce confusion.

If it turns out that what people want really is exactly Kubernetes' labels, that's a different situation. I don't think labels would necessarily align with my motivation as given above (although, perhaps selection by label would be useful ..).

bgrant0607 · 2014-11-07T15:07:34Z

Kubernetes has 2 mechanisms, for different purposes:

Labels are intended to be used for identifying attributes. There are lots of examples in the doc. They are used for filtering/selecting/matching, both in mechanisms such as our replication controllers and services, as well as operations, such as GET, DELETE, etc. We plan to index them (and reverse-index them) for fast lookup.

Annotations are intended to be used to attach arbitrary, potentially structured data. Again, there are examples in the document.

For both, keys are restricted to DNS labels. Validation code.

We haven't decided how to restrict label values yet but, yes, we do expect them to be relatively simple strings.

For annotations, we'd be happy with something like JSON literals for values.

For applications to communicate data upwards, we plan to introduce different mechanisms.

squaremo · 2014-11-07T15:17:20Z

Thanks @bgrant0607. Do you think what's proposed here is congruent with Kubernetes' annotations (I think so)? And hence or otherwise, ought they have the same name?

bgrant0607 · 2014-11-07T15:34:44Z

They are congruent in purpose and we could set the same restriction on the values.

How attached are you to Go identifiers for keys?

Are Docker container names still restricted to [a-zA-Z_-.] ?

@thockin @smarterclayton

squaremo · 2014-11-07T15:45:52Z

They are congruent in purpose and we could set the same restriction on the values.

That it's JSON values (and null means missing)? This would line things up nicely, if it works for your purposes. I'm not sure if there will be a technical need to line up exactly (perhaps you have an opinion on that) but it won't hurt in any case.

How attached are you to Go identifiers for keys?

Not very. The motivation for this restriction is so they work nicely with Go templates in docker inspect; but, for keys with dashes, {{index ...}} can be used, so it's not a big deal really.

I'll change "attributes" to "annotations", and relax the key restriction, then we can see how we like it.

squaremo · 2014-11-07T16:16:40Z

I'll [...] relax the key restriction

The labels design says "Valid labels follow a slightly modified RFC952 format: 24 characters or less, all lowercase, begins with alpha, dashes (-) are allowed, and ends with alphanumeric." but it's not said whether this is talking about the keys or the values, or which production in the grammar given in RFC952 is meant. I presume it's keys, and this one:

<name>  ::= <let>[*[<let-or-digit-or-hyphen>]<let-or-digit>]

I'd prefer to refer to something more precisely, or just give a regexp: [a-z]([a-z0-9-]*[a-z0-9])?
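
For reference, a minimal Python sketch of validating a name against that regexp, combined with the 24-character limit from the quoted labels design. The names `NAME_RE`, `MAX_NAME_LENGTH` and `is_valid_name` are hypothetical, and applying the limit to keys rather than values is the presumption stated above, not something the labels design confirms.

```python
import re

# The regexp suggested above: starts with a lowercase letter, may contain
# lowercase letters, digits and dashes, and must not end with a dash.
NAME_RE = re.compile(r"[a-z]([a-z0-9-]*[a-z0-9])?")

# Length limit taken from the quoted "24 characters or less".
MAX_NAME_LENGTH = 24


def is_valid_name(name: str) -> bool:
    """Return True if name fits the regexp and the length limit."""
    return len(name) <= MAX_NAME_LENGTH and NAME_RE.fullmatch(name) is not None


for candidate in ["animal", "weave-ips", "weaveIPs", "9lives", "trailing-", ""]:
    print(repr(candidate), is_valid_name(candidate))
```

Note that `fullmatch` is used so the whole name must match; `re.match` alone would accept a valid prefix of an invalid name.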

By the way, encouraging (as I do) people to prefix keys with a namespace and requiring lowercase means most keys will include dashes, and therefore not be amenable to `docker inspect --format='{{ .Annotations.key.field }}'`, which is a bit of a pain. What is the motivation for using the particular formulation above?

thockin · 2014-11-07T16:36:07Z

The motivation for using DNS names for input validation is primarily so that we don't have a proliferation of validation formats. Names can be X, but labels can be Y, and containers can be Z.

Everything we name is somewhere on the DNS spectrum from RFC952 to RFC1123. We don't have to stay there, I suppose, but my next point makes more trouble. I believe we need to add explicit namespacing of labels such that different pieces of the system can attach labels and not collide. The proposal that we have on the table is /. E.g. docker.io/name or kubernetes.io/controller.

Again, we don't HAVE to use that format, but it is somewhat well established.

Also, docker container names allow dashes anyway, so you're proposing something that is similar to but not quite the same - which is a recipe for user confusion.

As for values we are less opinionated - they are just strings. JSON literals means that I would have to say things like docker attr foo "bar" ? Puke. Can't we do better and infer quotes?

If we can find common ground on the spec, we can extract all of this into a neutral library that we both use - I think that would be a win all around.

squaremo · 2014-11-07T17:04:55Z

As for values we are less opinionated - they are just strings. JSON literals means that I would have to say things like docker attr foo "bar" ? Puke. Can't we do better and infer quotes?

Yes, quoting is a pain. Inferring quotes in the case of contiguous characters probably isn't a big deal, though there's ambiguity around numbers of course, so care then has to be taken by users to quote when using values from elsewhere, etc. Anyway.

However, I would like to do better than "just strings" for values; if they are just strings, then there's an extra parsing step needed for tools to get anything useful out of structured values.

I believe we need to add explicit namespacing of labels such that different pieces of the system can attach labels and not collide. The proposal that we have on the table is /.

Namespacing is good, and that scheme is fine. As above, my concern is that this makes docker's use of templates in --format rather awkward for tooling that wants to use annotations*. If someone has a suggestion for how to reconcile these things, I'll happily consider it.

*It's my understanding that this would in general require e.g., {{index .Annotations "docker.io/foobar"}}, and nested values would need something like {{ (index .Annotations "docker.io/foobar").bar.baz }}. I'm prepared to be taught something new though ...

thockin · 2014-11-07T17:14:15Z

I agree that Go's format language falls down here. Could we define the values to be YAML instead of JSON? YAML has rules for inferring strings vs numbers, and yet JSON syntax is valid.

foo=bar # string
foo='[ bar, qux ]' # list of strings

squaremo · 2014-11-07T17:26:45Z

Could we define the values to be YAML instead of JSON?

Much of docker's tooling thinks in JSON terms, so perhaps best not.

my concern is that this makes docker's use of templates in --format rather awkward

Maybe there's some way I can work namespaces in explicitly, so the "non Go identifier" keys are confined to a distinct argument, and don't appear in templates. Probably this would require a docker annotation ... command, which isn't ideal -- it might even be better to have the awkward templates as above. I'll have a think about it.

errordeveloper · 2014-11-07T20:21:23Z

I think this is an important enabling piece that the API must have!

errordeveloper · 2014-11-08T00:48:53Z

One simple example where you'd want more than labels is descriptions or comments of some sort.

Anyway, I don't see the point of restricting the metadata format to anything narrower than free-form JSON. If a metadata field is to be there in the first place, why not let it map to anything JSON can express?

Regarding the format debate, YAML is a newline-sensitive format and JSON values are not newline-friendly. The Docker API is currently JSON-only, so it would be a major pain to deal with YAML.

errordeveloper · 2014-11-08T11:02:24Z

I certainly agree that this could be called annotations instead of attributes; that's pretty much the same thing, if it helps to synchronise naming of stuff between Kubernetes and Docker. Although, as we all know, naming is the hardest part and it's hard to agree. As for labels, the only reason I'd see them as a separate feature is that you might want to show them in docker ps and that they are generally more likely to be present than annotations... However, why wouldn't you just keep Docker less opinionated about metadata and implement labels and annotations under a more generic place you could call attributes, user_properties, or even just meta?

thaJeztah · 2014-11-08T11:38:34Z

I have a feeling this discussion is repeating the same path that led to #8955 (a continuation of #7470), although those were for images, not containers.

Perhaps those should be taken into account, so that we can get a consistent implementation / naming?

bfirsh · 2015-01-14T14:56:55Z

@squaremo FYI, we've got a pull request going here: #9882

dreamcat4 · 2015-01-14T17:57:03Z

+1 for this.

My views on this feature are:

Am guessing that Go templates are good for best performance, being language-native, which is why we have them. So perhaps if you want to offer users an alternative and simpler way of parsing, then the most obvious alternative would be to include a JSON library in the binary and add to docker inspect a --json option to be used in conjunction with the existing --format flag. Then we could parse JSON format strings just like those accepted by the program jq.
I want this feature to work for stopped data-only containers as well as running ones, since data-only containers should always be in stopped mode and I definitely need to have metadata on those kinds of containers too.
What the heck is wrong with calling it "metadata", rather than anything else? Is that not what this stuff actually is? Arbitrary user-specified metadata? Why not call a stick a stick?
This feature should be designed with a mind to being exposable via a future introspection feature. But I also need a very similar feature called intraspection. This design is already good because it treats all of the new user-set metadata just like a container's other existing metadata. ++1.

thaJeztah · 2015-02-28T23:16:59Z

@jfrazelle should this be closed, now that a design approval was given on #9882 ?

jessfraz · 2015-03-02T02:22:18Z

I think we can close after #9882 is merged

rade · 2015-03-02T07:53:44Z

@jfrazelle

I think we can close after #9882 is merged

No, because of #9882 (comment)

thaJeztah · 2015-03-02T10:27:28Z

@rade ah, thanks. Yes that is different;

To reiterate, a more important difference is that annotations per #9013 can be changed after a container is started.

I'm thinking about what's best here; #9882 allows setting the labels, but those are immutable. Also different is that this proposal stores structured (JSON) data, whereas #9882 only supports strings. Because of the overlap and differences between the proposals, I don't think this proposal will fully make it.

Perhaps a new proposal should be created as a "follow up" to #9882; the new proposal would only include the ability to change (add/remove/update) labels on running containers, e.g. "Proposal: Allow modifying labels (meta-data) on running containers".

Would that be reasonable?

squaremo · 2015-03-02T12:54:17Z

Would that be reasonable?

Not from my point of view. As people have mentioned repeatedly above, this is a feature quite distinct from labels: #9013 (comment) identifies the different uses, and #9013 (comment) describes the various purposes to which labels, annotations, etc., can be put.

I am all for avoiding proliferation of features. If "runtime labels" can serve both purposes, without compromising either, then we can conflate them with annotations as described here. But being superficially similar isn't a good enough reason. After all, labels are pretty similar to environment entries on the face of it, aren't they?

In any case, I don't see why submitting another PR with the same use cases and so on -- rather than figuring out how to adapt this one if that's necessary -- would advance things, unless politically.

Is there any 1-design-review feedback?

crosbymichael · 2015-03-10T21:17:44Z

@squaremo from what I understand your argument for this proposal over the other label proposals is that this is mutable annotations and the others are immutable?

rade · 2015-03-10T21:35:02Z

I believe there are two major differences to #9882:

Annotations can be set on containers in any state, including stopped and running containers. Labels can only be set prior to or during container creation.
Annotations are mutable, and there is a corresponding event stream resulting from that mutation. Labels are immutable.

rhatdan · 2015-03-11T16:57:13Z

Yes, so let's get immutable labels merged. :^)

squaremo · 2015-03-11T23:49:31Z

@squaremo from what I understand your argument for this proposal over the other label proposals is that this is mutable annotations and the others are immutable?

I'm not arguing for this proposal over another, only that they should not be conflated (and this proposal be closed as a duplicate as a consequence), since annotations as described here are useful in ways that labels in #9882 and elsewhere are not.

The main difference is that annotations can be changed after container creation, yep. Other differences are less important to their purpose.

rade · 2015-03-13T22:16:58Z

Is there anything in #9882 that would prevent its 'labels' from getting the functionality described here, at some future point?

I would not like to see a future docker having three ways of annotating containers - env vars, labels and annotations. We could end up there by accident, e.g. what if some users employ labels in a way that requires them to be immutable?

icecrime · 2015-03-17T05:16:41Z

~~Closed by #9882~~ Oh apparently it's not the same thing oO

LK4D4 · 2015-03-31T17:26:09Z

@squaremo Can we open a new proposal in light of the merged labels? Because I'm not sure that labels+annotations+whatever is a cool way to go.

icecrime · 2015-04-02T16:57:35Z

+1, it really needs to start fresh knowing that #9882 is now merged.

crosbymichael added the Proposal label Nov 6, 2014

rade mentioned this pull request Nov 6, 2014

Expose weave network to docker inspect weaveworks/weave#117

Closed

squaremo changed the title ~~Container attribute docs~~ Proposal: Container attributes (metadata) Nov 6, 2014

rade reviewed Nov 7, 2014

squaremo force-pushed the attributes_proposal branch from 89830cd to 05bb1fa Compare November 7, 2014 11:12

Container attribute docs

a983514

Signed-off-by: Michael Bridgen <mikeb@squaremobius.net>

squaremo force-pushed the attributes_proposal branch from 05bb1fa to a983514 Compare November 7, 2014 11:16

This was referenced Nov 7, 2014

Adding arbitrary metadata to containers #6839

Closed

Add ability for container to publish metadata #2336

Closed

Add label when running a container #6997

Closed

jamtur01 added the /project/doc label Nov 10, 2014

bfirsh mentioned this pull request Dec 23, 2014

Use --label name=* to update Name in docker info #9571

Closed

thaJeztah mentioned this pull request Dec 29, 2014

Include Dockerfile source in the image #9841

Closed

bfirsh mentioned this pull request Dec 31, 2014

API: Allow for container IDs to be forced through the remote API. #9854

Closed

ibuildthecloud mentioned this pull request Jan 3, 2015

Proposal: One Meta Data to Rule Them All => Labels #9882

Merged

jessfraz added the status/1-design-review label Feb 10, 2015

rade mentioned this pull request Mar 4, 2015

[dns] idiomatic support for containers with more than one name weaveworks/weave#364

Closed

jessfraz assigned crosbymichael Mar 10, 2015

icecrime closed this Mar 17, 2015

vieux removed the status/1-design-review label Mar 17, 2015

icecrime reopened this Mar 17, 2015

icecrime removed the /project/doc label Mar 31, 2015

icecrime closed this Apr 2, 2015

rade mentioned this pull request May 7, 2015

[dns] retain extra records on restart weaveworks/weave#635

Open

thaJeztah mentioned this pull request Aug 15, 2015

Add labels to running Docker containers #15496

Closed

bboreham mentioned this pull request Feb 25, 2016

Proposal: Add label support to update command #18958

Closed
