CernVM contextualization

CernVM contextualization on Amazon EC2 (or API compatible) Cloud

Prerequisites

  • Amazon EC2 account/credentials
  • CernVM 2.1.1 or higher
  • Amazon command-line tools, a Firefox add-on, or a binding for one of the supported languages

Description

In its standard form, a CernVM image comes with the built-in rAA (rPath Appliance Agent), which provides a Web interface for configuring the image for a specific experiment, creating user accounts and enabling some extra options. While the same interface can be invoked remotely over XML-RPC, this approach may become impractical when deploying a large number of instances, in particular on public clouds such as Amazon EC2.

Because the EC2 API lets users pass arbitrary user data (as a base64-encoded string limited to 16 KB) to the running instance, we added a contextualization mechanism to CernVM that parses this string and applies several contextualization methods.

  1. The user data string is first checked for a gzip signature and decompressed if necessary.
  2. Contextualization based on the rPath amiconfig package, extended with a CernVM plugin, is applied. This tool runs at boot time (before network services are available), parses the user data and looks for Python-style configuration blocks. If a match is found, the corresponding plugin processes the options and executes configuration steps if needed. By default, rootsshkeys and cernvm are the only enabled plugins (others can be enabled in the configuration file).
  3. Default plugins:

    rootsshkeys           - allow injection of root ssh keys
    cernvm                - configure various CernVM options
    

    Available plugins:

    disablesshpasswdauth  - if activated, it will disable ssh authentication with password     
    rapadminpassword      - define rAA admin password 
    conaryproxy           - configure conary proxy  
    storage               - configure ephemeral EC2 storage
    hostname              - set hostname      
    dnsupdate             - update DNS server with current host IP
    noip                  - register IP address with NOIP dynamic DNS service      
    openvpn               - setup VPN      
    condor                - setup Condor batch system
    ldap                  - setup LDAP connection
    nss                   - /etc/nsswitch.conf configuration
    squid                 - configure squid for use with CernVM
    ganglia               - configure gmond (ganglia monitoring)
    

    Common amiconfig options:

    [amiconfig]
    plugins = <list of plugins to enable>
    disabled_plugins = <list of plugins to disable>
    

    Specific plugin options:

    [cernvm]
    # list of ',' separated organisations/experiments (lowercase)
    organisations = <list>
    # list of ',' separated repositories (lowercase)
    repositories = <list>
    # list of ',' separated user accounts to create <user:group:[password]>
    users = <list>
    # CernVM user shell </bin/bash|/bin/tcsh>
    shell = <shell>
    # CVMFS HTTP proxy
    proxy = http://<host>:<port>;DIRECT
    ----------------------------------------------------------
    # url from where to retrieve initial CernVM configuration
    config_url = <url>
    # install extra conary group
    group_profile = group-<org>[-desktop]
    # list of ',' separated scripts to be executed as given user: <user>:/path/to/script.sh
    contextualization_command = <list>
    # list of ',' separated services to start
    services = <list>
    # extra environment variables to define
    environment = VAR1=<value>,VAR2=<value>
    
    [rpath]
    rap-password = <password>
    conaryproxy = <url>
    
    [storage]
    # disable the spacedaemon
    daemon = False
    # size in GB
    pre-allocated-space = 20
    # list of ':' separated dirs
    relocate-paths = /srv/rmake-builddir:/srv/mysql
    
    [hostname]
    hostname = <hostname>
    
    [dnsupdate]
    tsighost
    tsigkey
    server
    host
    hostname
    # or, derive hostname from template
    
    [noip]
    # publish ip at https://%(username)s:%(password)s@dynupdate.no-ip.com
    username   
    password
    hostname
    # or, derive hostname from template
    
    [openvpn]
    nameserver = 192.168.1.1
    search = foo.example.com bar.example.com
    server = myvpn.example.com
    port = 1194
    proto = tcp
    ca = <compressed ca cert>
    cert = <compressed cert>
    key = <compressed key>
    
    [condor]
    # host name
    hostname = <FQDN>
    # master host name
    condor_master = <FQDN>
    # shared secret key
    condor_secret = <string>
    #------------------------
    # collector name
    collector_name = <string>
    # condor user
    condor_user = <string>
    # condor group
    condor_group = <string>
    # condor directory
    condor_dir = <path>
    # condor admin
    condor_admin = <path>
    
    [ldap]
    # base DN used to bind to LDAP
    base = <dn>
    # LDAP server URL
    url  = <url>
    
    [nss]
    password = files
    group = files
    shadow = files
    hosts = files
    bootparams = nisplus [NOTFOUND=return] files
    ethers = files
    netmasks = files
    networks = files
    protocols = files
    rpc = files
    services = files
    netgroup = nisplus
    publickey =  nisplus
    automount = files nisplus
    aliases = files nisplus
    
    [squid]
    cvmfs_server = cernvm-webfs.cern.ch
    cache_mem = 4096 MB
    maximum_object_size_in_memory =  32 KB
    cache_dir = /var/spool/squid
    cache_dir_size = 50000
    
    [ganglia]
    name = CernVM
    owner = unknown
    latlong = unknown
    url = unknown
    location = unknown
    
    [puppet]
    # The puppetmaster server for puppet client. If not specified, puppet server will be started
    puppet_server=puppet
    #----------------------------------------------
    # If you wish to specify the port to connect to, do so here
    puppet_port=8140
    # Where to log to. Specify syslog to send log messages to the system log.
    puppet_log=/var/log/puppet/puppet.log
    # You may specify other parameters to the puppet client here
    puppet_extra_opts=--waitforcert=500
    # Location of the main manifest
    #puppetmaster_manifest=/etc/puppet/manifests/site.pp
    # Where to log general messages to.
    # Specify syslog to send log messages to the system log.
    puppetmaster_log=syslog
    # You may specify an alternate port or an array of ports on which
    # puppetmaster should listen. Default is: 8140
    # If you specify more than one port, the puppetmaster is automatically
    # started with the servertype set to mongrel. This might be interesting
    # if you'd like to run your puppetmaster in a loadbalanced cluster.
    # Please note: this won't setup nor start any loadbalancer.
    # If you'd like to run puppetmaster with mongrel as servertype but only
    # on one (specified) port, you have to add --servertype=mongrel to
    # PUPPETMASTER_EXTRA_OPTS.
    # Default: Empty (Puppetmaster isn't started with mongrel, nor on a
    # specific port)
    puppetmaster_ports=""
    # Puppetmaster on a different port, run with standard webrick servertype
    #puppetmaster_ports="8141"
    # Example with multiple ports which will start puppetmaster with mongrel
    # as a servertype
    #puppetmaster_ports=( 18140 18141 18142 18143 )
    # You may specify other parameters to the puppetmaster here
    puppetmaster_extra_opts=--no-ca
    
    [tcpbuffers]
    # increase TCP max buffer size settable using setsockopt()
    # 16 MB with a few parallel streams is recommended for most 10G paths
    # 32 MB might be needed for some very long end-to-end 10G or 40G paths
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    # increase Linux autotuning TCP buffer limits
    # min, default, and max number of bytes to use
    # (only change the 3rd value, and make it 16 MB or more)
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    # recommended to increase this for 10G NICS
    net.core.netdev_max_backlog = 30000
    # these should be the default, but just to be sure
    net.ipv4.tcp_timestamps = 1
    net.ipv4.tcp_sack = 1
    
  4. If the user data string begins with a line starting with #!, it is interpreted as a script and executed. Any scripting language can be used, and the path to the interpreter must follow the #! marker. The same user data string may also contain amiconfig contextualization options, but they must be placed after the script, which must end with an exit statement. If an amiconfig block is found, it is interpreted first, and the script part is executed after the amiconfig plugins have completed their job.
  5. Finally, if an ISO image has been attached to the instance, the SiteContextualization is applied.

NOTE: The option to pass an arbitrary script to a virtual machine instance and run it as root is currently restricted to Amazon EC2 instances.
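
The gzip and size constraints described above can be checked before a launch. The following is a minimal sketch, not part of CernVM: the function name pack_user_data and the variable MAX_USER_DATA_BYTES are illustrative, base64 --wrap=0 assumes GNU coreutils, and the 16 KB check is applied to the encoded string, which is the stricter reading of the limit. It is meant for clients that expect the caller to supply base64 text.

```shell
#!/bin/sh
# Hypothetical helper: gzip and base64-encode a user data file, then
# refuse the result if it exceeds the 16 KB limit mentioned above.
MAX_USER_DATA_BYTES=16384

pack_user_data() {
    # $1: path to the plain-text user data file
    packed=$(gzip -c "$1" | base64 --wrap=0) || return 1
    # base64 output is plain ASCII, so character count equals byte count
    size=${#packed}
    if [ "$size" -gt "$MAX_USER_DATA_BYTES" ]; then
        echo "user data too large: $size bytes" >&2
        return 1
    fi
    printf '%s\n' "$packed"
}
```

For example, pack_user_data user_data.txt > user_data.b64 writes the encoded string to a file, or fails with a message on stderr if the string is too long.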

Examples

There are many implementations of the Amazon EC2 API and command-line tools. Here are some examples:

Perl script to start EC2 instance

use Net::Amazon::EC2;
use MIME::Base64;

my $ec2 = Net::Amazon::EC2->new(
          AWSAccessKeyId  => 'XXXXXXXXXXXXXXXXXXX', 
          SecretAccessKey => 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
);

# This script will 
#   - start 1 new instance of AMI: ami-XXXXXXXX with kernel aki-XXXXXXX in security group 'cernvm' and with root ssh key 'ami' 
#   - configure it to support cms VO and attach cms software repository to /opt/cms. 
#   - create testcms user in group cms with password 12345678 and shell /bin/bash
#   - define environment variables CMS_SITECONFIG=EC2 CMS_ROOT=/opt/cms
#   - run the "echo Done > /tmp/done" command (as root)
#   - mark 'ntpd' as service to be started 

my $image = 'ami-42e8022b';

# No newline after 'qq{': the '#!' line must be the first line of the user data.
my $user_data = qq{#!/bin/sh
echo Done > /tmp/done
exit

[cernvm]
organisations = cms
repositories = cms
users = testcms:cms:12345678
shell = /bin/bash
services = ntpd
environment = CMS_SITECONFIG=EC2,CMS_ROOT=/opt/cms
};

my $instance = $ec2->run_instances(ImageId  => $image, 
                                   InstanceType => 'm1.large',
                                   MinCount => 1, 
                                   MaxCount => 1, 
                                   KernelId => 'aki-9800e5f1',
                                   SecurityGroup => 'cernvm',
                                   KeyName => 'ami', 
                                   UserData => encode_base64($user_data)
                                   );
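
The same launch can be sketched with other EC2 clients. The example below assumes the AWS CLI (aws ec2 run-instances), which the original text does not mention and which base64-encodes the user data itself; the function names write_user_data and launch_command are illustrative. The script only prints the command, so nothing is started until the command is run by hand.

```shell
#!/bin/sh
# Write the same user data as the Perl example. The '#!' line must be
# the first line of the file, and the script part ends with 'exit'.
write_user_data() {
    # $1: output file
    cat > "$1" <<'EOF'
#!/bin/sh
echo Done > /tmp/done
exit

[cernvm]
organisations = cms
repositories = cms
users = testcms:cms:12345678
shell = /bin/bash
services = ntpd
environment = CMS_SITECONFIG=EC2,CMS_ROOT=/opt/cms
EOF
}

# Print (not run) an AWS CLI command equivalent to run_instances above.
# Image and kernel IDs are the ones used in the Perl example.
launch_command() {
    # $1: user data file
    echo "aws ec2 run-instances" \
         "--image-id ami-42e8022b --kernel-id aki-9800e5f1" \
         "--instance-type m1.large --count 1" \
         "--security-groups cernvm --key-name ami" \
         "--user-data file://$1"
}
```

Calling write_user_data user_data.txt followed by launch_command user_data.txt prints a command line that can be reviewed before it is executed.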

CernVM contextualization with CDROM image on KVM

Description

Site administrators (or end users, if they are allowed to start virtual machine images) can customize a CernVM instance at boot time by adding specific tools and services such as:

  • monitoring
  • accounting
  • network configuration
  • batch system configuration
  • site specific storage access configuration

The HEPIX startup scripts (/etc/init.d/vmcontext.epilogue and /etc/init.d/vmcontext.epilogue) attempt to mount the CDROM device (normally attached as an ISO image) in a temporary directory.

If this succeeds, the scripts check for the existence of prologue.sh and epilogue.sh files, source them if they exist, and apply the start/stop procedure.
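
The check-and-source step can be pictured roughly as follows. This is an illustrative sketch, not the actual HEPIX script: the function name run_context_hook and the way the mount point is passed in are assumptions; only the file names prologue.sh and epilogue.sh come from the text above.

```shell
#!/bin/sh
# Source <dir>/<stage>.sh if it exists; do nothing otherwise.
run_context_hook() {
    # $1: directory where the CDROM is mounted
    # $2: stage name, "prologue" or "epilogue"
    hook="$1/$2.sh"
    if [ -f "$hook" ]; then
        . "$hook"
    fi
}
```

A startup script could call run_context_hook /mnt/context prologue when it starts and run_context_hook /mnt/context epilogue when it finishes, where /mnt/context is a hypothetical mount point.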

CDROM image generation

CernVM batch images have a preinstalled set of scripts that allow the virtual machine to be contextualized at boot time using a CDROM image. Below we provide an example script which generates such an image. This CDROM image instructs the contextualization scripts to set up the ATLAS software environment, to install the SSH key (to allow login as 'root') and the user's proxy certificate inside the virtual machine, and to start the PanDA pilot. For all available contextualization options ([cernvm] section in the script), please refer to CernVM contextualization above.

#!/bin/sh

tmpdir=`mktemp -dp /tmp`

if [ -f "$X509_USER_PROXY" ]
then
  x509_cert=`cat "$X509_USER_PROXY"`
fi

if [ -f "$ROOT_PUBKEY" ]
then
  cp "$ROOT_PUBKEY" "$tmpdir/root.pub"
fi

# The EOF delimiter is left unquoted so that $x509_cert is expanded below.
cat<<EOF>$tmpdir.user_data
[cernvm]
organisations = atlas
repositories  = atlas,grid,atlas-condb,sft
users = panda:atlas:
contextualization_command = panda:'/cvmfs/sft.cern.ch/lcg/external/experimental/panda-pilot/runPanda -m /tmp/eos/atlas/vm/copilot -t 1'
x509-cert = $x509_cert
EOF

user_data64=`base64 --wrap=0 $tmpdir.user_data`

cat<<EOF>$tmpdir/context.sh
# Context variables used by amiconfig
ROOT_PUBKEY=root.pub
EC2_USER_DATA="$user_data64"
ONE_CONTEXT_PATH="/var/lib/amiconfig"
EOF

touch $tmpdir/prolog.sh 

mkisofs -o context.iso $tmpdir

The values of X509_USER_PROXY and ROOT_PUBKEY should be set before running the script:

$ env X509_USER_PROXY=/tmp/x509_u0 ROOT_PUBKEY=/root/.ssh/id_rsa.pub ./create_context_iso.sh
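
Before attaching the image it can be useful to confirm what context.sh actually carries. The following is a minimal sketch, assuming the context.sh layout produced by the script above (an EC2_USER_DATA line holding base64 text) and GNU base64; the function name show_user_data is illustrative.

```shell
#!/bin/sh
# Print the decoded EC2_USER_DATA value stored in a context.sh file.
show_user_data() {
    # $1: path to context.sh
    case $1 in
        */*) ctx=$1 ;;
        *)   ctx=./$1 ;;   # keep '.' from searching PATH
    esac
    # Source the file in a subshell so its variables do not leak out.
    (
        . "$ctx"
        printf '%s' "$EC2_USER_DATA" | base64 -d
    )
}
```

For example, show_user_data /mnt/context.sh prints the [cernvm] section as the VM will see it, where /mnt is a hypothetical directory on which context.iso has been mounted.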

Attaching the image to the virtual machine

To attach the generated image to the VM, the following section has to be added to the VM definition:

<disk type='file' device='cdrom'>
  <source file='context.iso'/>
  <target dev='hdb'/>
  <readonly/>
  <driver name='qemu' type='raw'/>
</disk>

The full VM definition file should look like this:

<domain type='kvm'>
  <name>CernVM-2.2.0-x86_64</name>

  <memory>524288</memory>

  <os>
    <type arch='x86_64'>hvm</type>
    <boot dev='hd'/>
  </os>

  <devices>
    <disk type='file' device='disk'>
      <source file='/data/test/cernvm-batch-node-2.4.0-1.1-1-x86_64.hdd' />
      <target dev='hda'/>
    </disk>

    <disk type='file' device='cdrom'>
      <source file='/data/test/context.iso'/>
      <target dev='hdb'/>
      <readonly/>
      <driver name='qemu' type='raw'/>
    </disk>

    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
    </interface>        

    <graphics type='vnc' listen='0.0.0.0' port='6019'/>
  </devices>
</domain>

Once the VM is started with the CDROM image attached, the contextualization mounts the image and configures the VM accordingly. After that, the user who started the VM should be able to log in as root (the IP address of the newly created machine should be visible in the /var/lib/libvirt/dnsmasq/default.leases file).
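
Looking up the address can be scripted. The sketch below assumes the usual dnsmasq lease layout (expiry time, MAC address, IP address, host name and client ID, separated by spaces) and that the MAC address of the VM is known, for example from virsh dumpxml; the function name lease_ip is illustrative.

```shell
#!/bin/sh
LEASES=/var/lib/libvirt/dnsmasq/default.leases

# Print the IP address leased to the given MAC address, if any.
lease_ip() {
    # $1: MAC address, e.g. 52:54:00:12:34:56
    # $2: lease file (optional, defaults to $LEASES)
    awk -v mac="$1" 'tolower($2) == tolower(mac) { print $3 }' "${2:-$LEASES}"
}
```

With the address in hand, ssh root@"$(lease_ip 52:54:00:12:34:56)" opens the root session, where the MAC address shown is a placeholder.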