- Boot loader contextualization
- Mixing user data formats
- Cluster contextualization
A context is a small (up to 16kB), usually human-readable snippet that is used to apply a role to a virtual machine. A context allows to have a singe virtual machine image that can back many different virtual machine instances in so far as the instance can adapt to various cloud infrastructures and use cases depending on the context. In the process of contextualization, the cloud infrastructure makes the context available to the virtual machine and the virtual machine interprets the context. On contextualization, the virtual machine can, for instance, start certain services, create users, or set configuration parameters.
For contextualization, we distinguish between so called meta-data and user-data. The meta-data is provided by the cloud infrastructure and is not modifiable by the user. For example, meta-data includes the instance ID as issued by the cloud infrastructure and the virtual machine's assigned network parameters. The user-data is provided by the user on creation of the virtual machine.
Meta-data and user-data are typically accessible through an HTTP server on a private IP address such as 169.254.169.254. The cloud infrastructure shields user-data from different VMs so that it can be used to deliver credentials to a virtual machine.
In CernVM, there are three entities that interpret the user-data. Each of them typically read "what it understands" while ignoring the rest. The µCernVM bootloader interprets a very limited set of key-value pairs that are used to initialize the virtual hardware and to select the CernVM-FS operating system repository. In a later boot phase, amiconfig and cloud-init are used to contextualize the virtual machine. The amiconfig system was initially provided by rPath but it is now maintained by us. It provides very simple, key-value based contextualization that is processed by a number of amiconfig plugins. The cloud-init system is maintained by Redhat and Ubuntu and provides a more sophisticated but also slightly more complex contextualization mechanism.
The µCernVM bootloader can process EC2, Openstack, and vSphere user data. Within the user data everything is ignored expect a block of the form
[ucernvm-begin] key1=value1 key2=value2 ... [ucernvm-end]
The following key-value pairs are recognized:
- Can be on or off. When turned on, use all of the hard disk as root partition instead of the first 20G
- HTTP proxy in CernVM-FS notation
- WPAD proxy autoconfig URLs separated by ';'
- List of Stratum 1 servers, e.g. cvmfs-stratum-one.cern.ch,another.com
- The snapshot name, useful for the long-term data preservation
- Base64 encoded .tar.gz ball, which gets extracted in the root tree
- Can be on or off, defaults to on. When turned off, glideinWMS auto detection gets disabled
The amiconfig contextualization executes on boot time, parses user data and looks for python style configuration blocks. If a match is found the corresponding plugin will process the options and execute configuration steps if needed. By default, enabled rootsshkeys is the only enabled plugins (others can be enabled in the configuration file).
rootshkeys - allow injection of root ssh keys
amildap - setup LDAP connection cernvm - configure various CernVM options condor - setup Condor batch system disablesshpasswdauth - if activated, it will disable ssh authentication with password dnsupdate - update DNS server with current host IO ganglia - configure gmond (ganglia monitoring) hostname - set hostname noip - register IP address with NOIP dynamic DNS service nss - /etc/nsswithch.conf configuration puppet - set parameters for puppet configuration management squid - configure squid for use with CernVM-FS
Common amiconfig options:
[amiconfig] plugins = <list of plugins to enable> disabed_plugins = <list of plugins to disable>
Specific plugin options:
[cernvm] # list of ',' seperated organisations/experiments (lowercase) organisations = <list> # list of ',' seperated repositories (lowercase) repositories = <list> # list of ',' separated user accounts to create <user:group:[password]> users = <list> # CernVM user shell </bin/bash|/bin/tcsh> shell = <shell> # CVMFS HTTP proxy proxy = http://<host>:<port>;DIRECT ---------------------------------------------------------- # url from where to retrieve initial CernVM configuration config_url = <url> # list of ',' separated scripts to be executed as given user: <user>:/path/to/script.sh contextualization_command = <list> # list of ',' seperated services to start services = <list> # extra environment variables to define environment = VAR1=<value>,VAR2=<value>
[condor] # host name hostname = <FQDN> # master host name condor_master = <FQDN> # shared secret key condor_secret = <string> #------------------------ # collector name collector_name = <string> # condor user condor_user = <string> # condor group condor_group = <string> # condor directory condor_dir = <path> # condor admin condor_admin = <path> highport = 9700 lowport = 9600 uid_domain = filesystem_domain = allow_write = *.$uid_domain extra_vars = use_ips =
If the user data string starts with a line starting with #!, it will be interpreted as a bash script and executed. The same user data string may as well contain amiconfig contextualization options but they must be placed after the configuration script which must end with an exit statement. The interpreter can be /bin/sh or /bin/sh.before or /bin/sh.after depending on when the script is to be executed, before or after amiconfig contextualization. A script for the /bin/sh interpreter is executed after amiconfig contextualization.
As an alternative to amiconfig, CernVM 3 supports cloud-init contextualization. Some of the contextualization tasks done by amiconfig can be done by cloud-init as well due to the native cloud-init modules for cvmfs, ganglia, and condor.
The user-data for cloud-init and for amiconfig can be mixed. The cloud-init syntax supports user data divided into multiple MIME parts. One of these MIME parts can contain amiconfig or µCernVM formatted user-data. All contextualization agents (µCernVM, amiconfig, cloud-init) parse the user data and each one interprets what it understands.
The following example illustrates how to mix amiconfig and cloud-init. We have an amiconfig context amiconfig-user-data that starts a catalog server for use with Makeflow:
[amiconfig] plugins = workqueue [workqueue]
We also have a cloud-init context cloud-init-user-data that creates an interactive user "cloudy" with the password "password"
users: - name: cloudy lock-passwd: false passwd: $6$XYWYJCb.$OYPPN5AohCixcG3IqcmXK7.yJ/wr.TwEu23gaVqZZpfdgtFo8X/Z3u0NbBkXa4tuwu3OhCxBD/XtcSUbcvXBn1
The following helper script creates our combined user data with multiple MIME parts:
amiconfig-mime cloud-init-user-data:cloud-config amiconfig-user-data:amiconfig-user-data > mixed-user-data
In the same way, the µCernVM contextualization block can be another MIME part in a mixed context with MIME type ucernvm.
By default, CernVM will automatically detect user data from glideinWMS and, if detected, activate the glideinWMS VM agent. CernVM recognizes user data that consists of no more than two lines and that contains the pattern
...#### -cluster 0123 -subcluster 4567####... as glideinWMS user data. It will automatically extract the CernVM-FS proxy configuration (proxy and pac URLs) from the user data. In order to disable autodetection, set
useglideinWMS=false in the µCernVM contextualization.
In addition to the normal user data, we have experimental support for "extra user data", which might be a last resort where the normal user data is occupied by the infrastructure. For instance, glideinWMS seems to exclusively specify user data, making it necessary to modify the image for additional contextualization. Extra user data are injected in the image under /cernvm/extra-user-data and they are internally appended to the normal user data. This does not yet work with cloud-init though; only with amiconfig and the µCernVM bootloader.
When you are creating a cluster, CernVM offers support for automatic master IP distribution to worker machines in the pre-contextualization phase. You generate a PIN with Cluster Pairing Service and put it into appropriate sections of the context files. For further info and usage example, please refer to the documentation.
Create a Condor cluster on CERN OpenStack
CernVM 4 supports the Cluster Contextualization via a centrally managed service. Currently it allows you to distribute a Master node IP address to the Worker nodes. This example shows cluster setup on CERN OpenStack, but the approach is similar with other cloud providers.
First you need to upload a CernVM 4 image to your personal OpenStack image store. Instructions can be found in the Running on CERN OpenStack tutorial.
After you uploaded the CernVM 4 image, you need to go to the Cluster Pairing Service and generate a new cluster PIN. Login is required in order to complete this action (you can log in with your CERN account).
Then you need to create two context files. One for your master node and one for your worker nodes. Remember to replace the cluster PIN field your own PIN. Notice the '###MASTER_IP_PLACEHOLDER###' placeholder, which is automatically replaced during the initial boot on the worker nodes.
Master context file
From nobody Fri Oct 14 13:55:48 2016 Content-Type: multipart/mixed; boundary="===============2227770197833174079==" MIME-Version: 1.0 --===============2227770197833174079== MIME-Version: 1.0 Content-Type: text/cloud-config; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="master.context.cloudinit" users: - name: condor-submit --===============2227770197833174079== MIME-Version: 1.0 Content-Type: text/amiconfig-user-data; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="master.context.amiconfig" [amiconfig] plugins=cernvm condor [condor] use_ips=true lowport=41000 highport=42000 condor_user=condor condor_group=condor condor_secret=secret uid_domain=* [ucernvm-begin] cvm_cluster_master=true cvm_cluster_pin=<Your Generated Cluster Pin> cvmfs_branch=cernvm-sl7.cern.ch cvmfs_server=hepvm.cern.ch cvmfs_path=cvm4 [ucernvm-end] --===============2227770197833174079==--
Slave context file
From nobody Fri Oct 14 13:55:48 2016 Content-Type: multipart/mixed; boundary="===============2227770197833174079==" MIME-Version: 1.0 --===============2227770197833174079== MIME-Version: 1.0 Content-Type: text/amiconfig-user-data; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="master.context.amiconfig" [amiconfig] plugins=cernvm condor [condor] condor_master=###MASTER_IP_PLACEHOLDER### use_ips=true lowport=41000 highport=42000 condor_user=condor condor_group=condor condor_secret=secret uid_domain=* [ucernvm-begin] cvm_cluster_pin=<Your Generated Cluster Pin> cvmfs_branch=cernvm-sl7.cern.ch cvmfs_server=hepvm.cern.ch cvmfs_path=cvm4 [ucernvm-end] --===============2227770197833174079==--
Login to lxplus (if you don't have installed and properly configured Nova tools on your machine) and save these context files as master.context and slave.context. Next we are going to create our machines. You need to have a generated key pair in the CERN OpenStack (check in the Access&Security tab in the OpenStack web interface). The given image name has to be the same as the one you uploaded earlier. Again, don't forget to replace the key name and node name.
Create a Master node
nova boot <my-cluster-master-name> --image 'CernVM_4' --flavor m2.small \ --key-name <my_cloud_key> --user-data master.context
Create Worker nodes
nova boot <my-cluster-slave-name-1> --image 'CernVM_4' --flavor m2.small \ --key-name <my_cloud_key> --user-data slave.context nova boot <my-cluster-slave-name-2> --image 'CernVM_4' --flavor m2.small \ --key-name <my_cloud_key> --user-data slave.context
You need to wait approximately 15 minutes before the DNS info is propagated through the CERN network. If you do not want to setup the DNS records, you can pass the '--meta cern-services=false' argument to the nova command. In that case, you can log into the machines straight away by using their IP addresses.
After your master machine is up and running you can log in and check the status of the condor cluster.
ssh -i <my_cloud_key> root@<master-machine> condor_status
After the worker nodes completed their boot, you should see the machines listed here. You can create and execute a test job to make sure the condor cluster works.
Cluster test job file: hello_world.job
Universe = vanilla Executable = /bin/hostname Output = hello_world.stdout Error = hello_world.stderr Log = hello_world.log Queue
Running the test job
# Switch to the test user su - condor-submit # Submit the job condor_submit hello_world.job
After a moment you should see the job output in the hello_world.stdout file, which should contain a hostname of the worker node that completed the job.
Congratulations! Your cluster is now set up.
Create a Makeflow cluster
CernVM 3 supports the Makeflow workflow engine. Makeflow provides an easy way to define and run distributed computing workflows. The contextualization is similar to condor. There are three parameters:
catalog_server=hostname or ip address workqueue_project=project name, defaults to "cernvm" (similar to the shared secret in condor) workqueue_user=user name, defaults to workqueue (the user is created on the fly if necessary)
In order to contextualize the master node, include an empty workqueue section, like
[amiconfig] plugins=workqueue [workqueue]
In order to start the work queue workers, specify the location of the catalog server, like
[amiconfig] plugins=workqueue [workqueue] catalog_server=184.108.40.206 workqueue_project=foobar
The plugin will start one worker for every available CPU. Once the ensemble is up and running, makeflow can make use of the workqueue resources like so
makeflow -T wq -a -N foobar -d all -C 220.127.116.11:9097 makeflow.example
Note that your cloud infrastructure needs to provide access to UDP and TCP ports 9097 on your virtual machines.