**************************
Merging User-Data Sections
**************************

Overview
========

Users have frequently asked for a way to control how cloud-config YAML
"dictionaries" provided as user-data are merged when several YAML files must
be combined (for example, when performing an ``#include``).

The previous merging algorithm was very simple: it only overwrote values and
could not append to lists, strings, and so on. A new, customizable way of
merging dictionaries (and the objects they contain) was therefore created, so
that users who provide cloud-config user-data can determine exactly how their
objects are merged.

For example:

.. code-block:: yaml

   #cloud-config (1)
   runcmd:
     - bash1
     - bash2

   #cloud-config (2)
   runcmd:
     - bash3
     - bash4

The previous way of merging the two objects above would result in a final
cloud-config object that contains the following:

.. code-block:: yaml

   #cloud-config (merged)
   runcmd:
     - bash3
     - bash4

Typically this is not what users want; instead they would likely prefer:

.. code-block:: yaml

   #cloud-config (merged)
   runcmd:
     - bash1
     - bash2
     - bash3
     - bash4

Merging this way makes it easier to combine your cloud-config objects into a
more useful list, and removes the duplication that the previous method
required to achieve the same result.
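
The difference between the two results can be sketched in plain Python. This
is only an illustration of the two behaviours using ordinary dictionaries;
the function names ``merge_overwrite`` and ``merge_append`` are hypothetical
and are not part of cloud-init.

.. code-block:: python

   def merge_overwrite(first, second):
       """Previous behaviour: a later key replaces the earlier one."""
       merged = dict(first)
       merged.update(second)
       return merged


   def merge_append(first, second):
       """Preferred behaviour: list values under the same key are joined."""
       merged = dict(first)
       for key, value in second.items():
           if isinstance(merged.get(key), list) and isinstance(value, list):
               merged[key] = merged[key] + value
           else:
               merged[key] = value
       return merged


   config_1 = {"runcmd": ["bash1", "bash2"]}
   config_2 = {"runcmd": ["bash3", "bash4"]}

   overwritten = merge_overwrite(config_1, config_2)
   appended = merge_append(config_1, config_2)
   print(overwritten)  # {'runcmd': ['bash3', 'bash4']}
   print(appended)     # {'runcmd': ['bash1', 'bash2', 'bash3', 'bash4']}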


Built-in Mergers
================

Cloud-init provides merging for the following built-in types:

- Dict
- List
- String

The ``Dict`` merger has the following options, which control what is done
with values contained within the config:

- ``allow_delete``: Existing values not present in the new value can be
  deleted. Defaults to False.
- ``no_replace``: Do not replace an existing value if one is already present.
  Enabled by default.
- ``replace``: Overwrite existing values with new ones.

The ``List`` merger has the following options, which control what is done
with the values contained within the config:

- ``append``: Add new values to the end of the list. Defaults to False.
- ``prepend``: Add new values to the start of the list. Defaults to False.
- ``no_replace``: Do not replace an existing value if one is already present.
  Enabled by default.
- ``replace``: Overwrite existing values with new ones.

The ``Str`` merger has the following option, which controls what is done with
the values contained within the config:

- ``append``: Add the new value to the end of the string. Defaults to False.

The following options are common to all merge types and control how recursive
merging is done on other types:

- ``recurse_dict``: If True, merge the new values of the dictionary. Defaults
  to True.
- ``recurse_list``: If True, merge the new values of the list. Defaults to
  False.
- ``recurse_array``: Alias for ``recurse_list``.
- ``recurse_str``: If True, merge the new values of the string. Defaults to
  False.
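
As a rough illustration of how the ``List`` options differ, the sketch below
applies each of them to two plain Python lists. The function ``merge_list``
and the precedence it gives to the options are assumptions made for this
example; cloud-init's own list merger may differ in detail.

.. code-block:: python

   def merge_list(existing, new, opts=()):
       """Toy list merger (hypothetical, not cloud-init's implementation)."""
       opts = set(opts)
       if "append" in opts:
           return list(existing) + list(new)
       if "prepend" in opts:
           return list(new) + list(existing)
       if "replace" in opts:
           return list(new)
       # Assumed default, mirroring 'no_replace': keep the existing value.
       return list(existing)


   old = ["bash1", "bash2"]
   new = ["bash3"]
   results = {
       name: merge_list(old, new, [name])
       for name in ("append", "prepend", "replace", "no_replace")
   }
   for name, merged in results.items():
       print(name, merged)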


Customizability
===============

Because the above merging algorithm may not always be desired (just as the
previous merging algorithm was not always the preferred one), the concept of
customized merging was introduced through 'merge classes'.

A merge class is a class definition which provides functions that can be used
to merge a given type with another given type.

An example of one of these merging classes is the following:

.. code-block:: python

   class Merger(object):
       def __init__(self, merger, opts):
           self._merger = merger
           self._overwrite = 'overwrite' in opts

       # This merging algorithm will attempt to merge with
       # another dictionary. On encountering any other type of object
       # it will not merge with said object, but will instead return
       # the original value.
       #
       # On encountering a dictionary, it will create a new dictionary
       # composed of the original and the one to merge with. If 'overwrite'
       # is enabled then keys that exist in the original will be overwritten
       # by keys in the one to merge with (and associated values). Otherwise,
       # if not in overwrite mode, the values of the conflicting keys will
       # themselves be merged.
       def _on_dict(self, value, merge_with):
           if not isinstance(merge_with, dict):
               return value
           merged = dict(value)
           for k, v in merge_with.items():
               if k in merged:
                   if not self._overwrite:
                       merged[k] = self._merger.merge(merged[k], v)
                   else:
                       merged[k] = v
               else:
                   merged[k] = v
           return merged

As you can see, there is an ``_on_dict`` method here that is given a source
value and a value to merge with; the result is the merged object. This code
is itself called by another merging class, which 'directs' the merging by
analyzing the types of the objects to merge and attempting to find a known
object that will merge that type. That class is not reproduced here, but it
can be found in the ``mergers/__init__.py`` file (see ``LookupMerger`` and
``UnknownMerger``).
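
To give a feel for what 'directing' means, here is a minimal sketch of a
dispatcher that looks for an ``_on_<type>`` method on a list of merge
objects. The ``Director`` and ``ListAppender`` classes are hypothetical
stand-ins written for this example; they are not the code found in
``mergers/__init__.py``, and the fallback of keeping the original value is an
assumption.

.. code-block:: python

   class ListAppender:
       """Hypothetical merger that joins two lists."""

       def _on_list(self, value, merge_with):
           if not isinstance(merge_with, list):
               return value
           return value + merge_with


   class Director:
       """Hypothetical directing class: picks a handler by value type."""

       def __init__(self, mergers):
           self._mergers = mergers

       def merge(self, value, merge_with):
           # A list is routed to '_on_list', a dict to '_on_dict', etc.
           method_name = "_on_" + type(value).__name__
           for merger in self._mergers:
               handler = getattr(merger, method_name, None)
               if handler is not None:
                   return handler(value, merge_with)
           # Assumption: with no known merger, keep the original value.
           return value


   director = Director([ListAppender()])
   print(director.merge(["a"], ["b"]))  # ['a', 'b']
   print(director.merge(1, 2))          # 1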

Following the typical cloud-init way of allowing source code to be downloaded
and used dynamically, users can inject their own merging files to handle
specific types of merging as they choose (the basic ones included handle
lists, dicts, and strings). Note that each merger can have options associated
with it which affect how the merging is performed. For example, a dictionary
merger can be told to overwrite instead of attempting to merge, or a string
merger can be told to append strings instead of discarding other strings to
merge with.

How to activate
===============

There are a few ways to activate the merging algorithms, and to customize them
for your own usage.

1. The first way involves the usage of MIME messages in cloud-init to specify
   multipart documents (this is one way in which multiple cloud-configs are
   joined together into a single cloud-config). Two new headers are looked
   for, both of which can define the way merging is done (the first header to
   exist wins). These new headers (in lookup order) are ``Merge-Type`` and
   ``X-Merge-Type``. The value should be a string which satisfies the merging
   format definition (see below for this format).

2. The second way is to specify the merge type in the body of the
   cloud-config dictionary. There are two ways to specify this, either as a
   string or as a dictionary (see format below). The keys that are looked up
   for this definition are the following (in order): ``merge_how``,
   ``merge_type``.
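
For the first approach, a multipart document carrying these headers could be
assembled with Python's standard ``email`` package, roughly as sketched
below. The merge string used here is only an example value built from the
option names documented above; whether it is the right choice depends on your
configs.

.. code-block:: python

   from email.mime.multipart import MIMEMultipart
   from email.mime.text import MIMEText

   # Example value only; any string in the merging format could be used.
   merge_type = "list(append)+dict(recurse_list)+str()"

   first = MIMEText(
       "#cloud-config\nruncmd:\n  - bash1\n  - bash2\n", "cloud-config"
   )
   first["Merge-Type"] = merge_type

   second = MIMEText(
       "#cloud-config\nruncmd:\n  - bash3\n  - bash4\n", "cloud-config"
   )
   # The alternative header name, which is looked up second.
   second["X-Merge-Type"] = merge_type

   message = MIMEMultipart()
   message.attach(first)
   message.attach(second)

   user_data = message.as_string()
   print(user_data)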

String format
-------------

The string format that is expected is the following.

::

   classname1(option1,option2)+classname2(option3,option4)....

Each class name is used to look up the class that will perform the merge, and
the options provided are given to that class when it is constructed.

For example, the default string that is used when none is provided is the
following:

::

   list()+dict()+str()
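
The string format is simple enough to take apart with a few lines of Python.
The parser below is a sketch written for this document, assuming only the
``name(options)`` entries joined by ``+`` shown above; the function name
``parse_merge_string`` is hypothetical and this is not cloud-init's own
parser.

.. code-block:: python

   import re


   def parse_merge_string(spec):
       """Split 'name(opt1,opt2)+name2()' into [(name, [opts]), ...]."""
       parsed = []
       for chunk in spec.split("+"):
           chunk = chunk.strip()
           match = re.fullmatch(r"(\w+)\((.*)\)", chunk)
           if match is None:
               raise ValueError("malformed merger entry: %r" % chunk)
           name, raw_opts = match.groups()
           opts = [o.strip() for o in raw_opts.split(",") if o.strip()]
           parsed.append((name, opts))
       return parsed


   default_mergers = parse_merge_string("list()+dict()+str()")
   custom_mergers = parse_merge_string(
       "list(append)+dict(no_replace,recurse_list)+str(append)"
   )
   print(default_mergers)
   print(custom_mergers)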

Dictionary format
-----------------

A dictionary can be used when it specifies the same information as the
string format (i.e. the second option above), for example:

.. code-block:: python

   {'merge_how': [{'name': 'list', 'settings': ['append']},
                  {'name': 'dict', 'settings': ['no_replace', 'recurse_list']},
                  {'name': 'str', 'settings': ['append']}]}

This carries the same information as a string-format definition, expressed in
dictionary form; written as a string, the example above would be
``list(append)+dict(no_replace,recurse_list)+str(append)``.
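
The correspondence between the two formats can be made concrete with a small
conversion helper. This is a sketch under the assumption that every entry has
a ``name`` and an optional ``settings`` list, as in the example above; the
helper ``mergers_to_string`` is hypothetical.

.. code-block:: python

   def mergers_to_string(merge_how):
       """Render a list of {'name', 'settings'} entries in string format."""
       return "+".join(
           "{}({})".format(entry["name"], ",".join(entry.get("settings", [])))
           for entry in merge_how
       )


   merge_how = [
       {"name": "list", "settings": ["append"]},
       {"name": "dict", "settings": ["no_replace", "recurse_list"]},
       {"name": "str", "settings": ["append"]},
   ]
   as_string = mergers_to_string(merge_how)
   print(as_string)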

Specifying multiple types and its effect
========================================

You may be wondering what happens if you specify a merge-type header or
dictionary for every cloud-config that you provide.

The answer is that when merging, a stack of 'merging classes' is kept. The
first entry on that stack is the default set of merging classes, which is
used when the first cloud-config is merged with the initial empty
cloud-config dictionary. If the cloud-config that was just merged provided a
set of merging classes (via the above formats), those merging classes are
pushed onto the stack. If there is a second cloud-config to be merged, the
merging classes provided by the first cloud-config are used (not the
default), and so on. This way a cloud-config can decide how it will merge
with a cloud-config dictionary coming after it.
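
The stack behaviour described above can be simulated in a few lines. The
sketch below only tracks *which* merge definition would be in effect for each
cloud-config; it performs no merging. The function ``plan_merges`` is
hypothetical, and the configs are assumed to be plain dictionaries that carry
their definition under ``merge_how`` or ``merge_type``.

.. code-block:: python

   DEFAULT_MERGERS = "list()+dict()+str()"


   def plan_merges(configs):
       """Return the merge definition in effect for each config, in order."""
       stack = [DEFAULT_MERGERS]
       used = []
       for config in configs:
           # The definition on top of the stack merges this config in.
           used.append(stack[-1])
           # Keys are looked up in order: 'merge_how', then 'merge_type'.
           spec = config.get("merge_how") or config.get("merge_type")
           if spec is not None:
               stack.append(spec)
       return used


   configs = [
       {"merge_how": "list(append)+dict()+str()", "runcmd": ["bash1"]},
       {"runcmd": ["bash2"]},
       {"merge_type": "list(prepend)+dict()+str()", "runcmd": ["bash3"]},
   ]
   plan = plan_merges(configs)
   for index, spec in enumerate(plan, start=1):
       print(index, spec)

In this run the first config is merged with the default definition, while the
second and third are merged with the definition supplied by the first.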

Other uses
==========

In addition to being used for merging user-data sections, the default merging
algorithm for merging 'conf.d' yaml files (which form an initial yaml config
for cloud-init) was also changed to use this mechanism, so its full benefits
(and customization) can be used there as well. Other places that used the
previous merging are now similarly extensible (metadata merging, for
example).

Note, however, that merge algorithms are not used *across* types of
configuration. As was the case before merging was implemented, user-data
will overwrite conf.d configuration without merging.

Example cloud-config
====================

A common request is to include multiple ``runcmd`` directives in different
files and merge all of the commands together.  To achieve this, we must modify
the default merging to allow for dictionaries to join list values.


The first config

.. code-block:: yaml

   #cloud-config
   merge_how:
    - name: list
      settings: [append]
    - name: dict
      settings: [no_replace, recurse_list]

   runcmd:
     - bash1
     - bash2

The second config

.. code-block:: yaml

   #cloud-config
   merge_how:
    - name: list
      settings: [append]
    - name: dict
      settings: [no_replace, recurse_list]

   runcmd:
     - bash3
     - bash4


.. vi: textwidth=78