CERN School of Computing Exercises

Exercise 1

This exercise covers the creation and configuration of a CERN Virtual Machine (CernVM) instance. You will first create a CERN Virtual Machine using the CernVM Installer tool and configure it via the web interface. You will then log in to your CernVM instance and run the event displays of the ALICE and CMS experiments.

CernVM Installation

CernVM Installer has already been deployed on the exercise machines. It can be started from the shell by typing cernvm-installer:


$ cernvm-installer

Once you start the installer, select the latest version (2.4.0 as of 16 August 2011) of CernVM Desktop (x86_64 architecture) and click the 'Deploy' button.



Increase the Memory size of the Virtual Machine to 1GB and click proceed.



To avoid waiting for the CernVM image to download, we have preloaded the exercise machines with the latest version of the CernVM Desktop (x86_64) image. The installer will notice that the image is available locally and will ask whether it should download it again and overwrite the local copy. Make sure you choose 'No'.



Next, the installer will uncompress the CernVM image, configure a Virtual Machine instance with the VirtualBox hypervisor, and start the Virtual Machine. Please be patient: the first boot of the virtual machine takes a bit longer, because during the first boot the CernVM initialization scripts compile paravirtualized drivers for the guest operating system.

While you are waiting for the Virtual Machine to boot, look at the VirtualBox configuration options and try to deduce what kind of virtualization technique it uses (software, hardware or paravirtualization).
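
One hint for this question is whether the host CPU advertises hardware virtualization extensions at all. The sketch below is not part of the original exercise. It assumes a Linux host and defines a hypothetical helper, hw_virt_flag, which looks for the vmx (Intel VT-x) or svm (AMD-V) flag in /proc/cpuinfo.

```shell
# Hypothetical helper (not part of CernVM or VirtualBox): report which
# hardware virtualization flag a cpuinfo-style file advertises.
# Prints "vmx" (Intel VT-x), "svm" (AMD-V) or "none".
hw_virt_flag() {
    cpuinfo=${1:-/proc/cpuinfo}
    if grep -Eqs '^flags[[:space:]]*:.*[[:space:]]vmx([[:space:]]|$)' "$cpuinfo"; then
        echo vmx
    elif grep -Eqs '^flags[[:space:]]*:.*[[:space:]]svm([[:space:]]|$)' "$cpuinfo"; then
        echo svm
    else
        echo none
    fi
}

# Check the machine this is run on.
hw_virt_flag
```

A result of vmx or svm only tells you that the CPU offers hardware virtualization. It does not by itself tell you that your Virtual Machine uses it, so compare it with what the VirtualBox settings show.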



Once the VM has booted, you will see the CernVM login prompt.



CernVM configuration via the web interface

Open the URL shown on the Virtual Machine console screen in a Web browser:

http://ip_address:8004 (no ssl)
https://ip_address:8003 (ssl)

Log in using 'admin' as User Name and 'password' as Password.



Next you will be prompted to change the password of the virtual machine web interface. Note: this is the password of the web interface 'administrator' account.



After setting the new password you will be prompted to enter a username, choose the desired group (make sure you select csc2011 as the group), set the login shell and define the password for the user.

Note: this is the username and password which you will use to log in to the Virtual Machine.



The next screen will let you choose your CernVM edition ('Basic' or 'Desktop') as well as configure your primary organization ('csc2011').

Choose 'Desktop' as the edition, set the desired Virtual Machine screen resolution and tick the 'Start X at boot' checkbox. Make sure you press the "Save" button on the right before proceeding further.
From the dropdown menu below, set CSC2011 as the 'Appliance primary group'. Make sure you press the "Save" button on the right before proceeding further.

Navigate to the bottom of the page and hit "Apply changes and reboot".



At this point you should log in to your Virtual Machine using the username and password you have just defined, and reboot it by doing:

$ sudo reboot

After CernVM reboots you will see the graphical login prompt.



Using CernVM

Log in to the Window Manager using the username and password which you specified during the configuration step. Start the System Monitor application and open a terminal window (both from the task bar at the bottom).

Use case of ALICE experiment
List possible ALICE user environments using
/cvmfs/alice.cern.ch/x86_64-unknown-linux-gnu/v2-18/api/bin/alienv list
Set up the demo environment by typing
/cvmfs/alice.cern.ch/x86_64-unknown-linux-gnu/v2-18/api/bin/alienv use AliRoot/v4-19-Rev-02-CSC
cd $ALICE_ROOT/demo

Run ALICE event display:
aliceShow

Pay attention to the network traffic while the application is starting.

Inspect the experiment software area in /cvmfs/alice.cern.ch. Exit the application and start it again.

Pay attention to the network traffic again. Note that the second time there is none.
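
If you prefer numbers to the System Monitor graph, the received byte counters in /proc/net/dev can be compared before and after starting the application. The following is a minimal sketch, assuming a Linux guest; rx_bytes is a hypothetical helper and not part of CernVM.

```shell
# Hypothetical helper: sum the bytes received on all interfaces except the
# loopback, reading a file in /proc/net/dev format (Linux only).
rx_bytes() {
    awk 'NR > 2 {
             sub(/:/, " ")              # some kernels print "eth0:1234" without a space
             if ($1 != "lo") sum += $2  # field 2 is the receive byte counter
         }
         END { printf "%.0f\n", sum }' "${1:-/proc/net/dev}"
}

# Intended use inside the VM (commented out because it needs the ALICE setup):
#   before=$(rx_bytes); aliceShow; after=$(rx_bytes)
#   echo "received $((after - before)) bytes while aliceShow was running"
```

Running the commented lines once for the first start and once for the second start gives two figures you can compare directly.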

CMS use case

To set up the CMS environment, do
$ source /cvmfs/cms.cern.ch/cmsset_default.sh
List available releases
$ scram list CMSSW

Change to home directory and select CMS release

$ cd
$ cmsrel CMSSW_3_7_0

Enter release environment

$ cd CMSSW_3_7_0/src
$ cmsenv

Copy this file with events from Web server


$ wget http://cern.ch/matevz/links/afs-fireworks//web/dijet_mass_events_from_V8...

Run event display and select above input file when prompted
$ cmsShow

Suggested additional steps:

  • Stop the event displays, and start them again. Do you see network traffic?
  • Suspend and resume the virtual machine, to make sure that you find everything in the same state.
  • For die-hard students: stop the event display(s), disconnect the VM from the network and try to run the event display again.

Shut down the virtual machine, but do not delete it after you finish the exercise. You will need it during the second part of the exercise.

Exercise 2

The aim of the second part of the exercises is to create an ad-hoc Classroom Computing Cloud on the exercise machines using CernVM and GridFactory, and to use it to carry out some computations. The participant who first finishes creating a histogram with 500,000 events gets a free bottle of beer (or a glass of wine, or milk) from us.

We will first create a virtual machine which will become a part of the cloud and will be used as a worker node. After that we will use CernVM Desktop to submit jobs to the cloud and get the results back.

Creating a worker node

Navigate to the directory containing the CernVM Batch image
$ cd /scratch/cernvm/

Download the script which creates a CD-ROM image for CernVM contextualization
$ wget http://cernvm.cern.ch/portal/d/create_context_iso.sh

Look into the script and try to understand what it does.

Run the script
$ chmod +x create_context_iso.sh && ./create_context_iso.sh

As a result, a file called context.iso should be created.
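
A quick sanity check of the result is sketched below. It assumes that the script produces a standard ISO 9660 CD-ROM image, which this handout does not state explicitly; is_iso9660 is a hypothetical helper that looks for the "CD001" signature such images carry 32769 bytes into the file.

```shell
# Hypothetical helper: succeed if the file carries the ISO 9660 signature
# "CD001", which sits 32769 bytes into a valid image.
is_iso9660() {
    magic=$(dd if="$1" bs=1 skip=32769 count=5 2>/dev/null)
    [ "$magic" = "CD001" ]
}

if is_iso9660 context.iso; then
    echo "context.iso looks like an ISO 9660 image"
else
    echo "context.iso is missing or is not an ISO 9660 image"
fi
```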

Download the script which configures a virtual machine, attaches your newly created image to it, and starts it:

$ wget http://cernvm.cern.ch/portal/d/start_worker.sh
$ chmod +x start_worker.sh

This script requires the path to the CernVM Batch image as input. The VM image has already been deployed on the exercise machines in the directory /scratch/cernvm. It is important to run the script from the same directory in which the context.iso file was generated. Run the script as follows:

$ ./start_worker.sh /scratch/cernvm/cernvm-batch-node-2.4.0-2-3-x86_64.vdi

As instructed by your contextualization script, the virtual machine will start a GridFactory worker once it has booted. The list of online GridFactory workers can be seen here:

https://csc2011.dyndns.org/db/nodes/?format=xml

Identify your worker using the MAC address of your Virtual Machine. You can obtain the MAC address by issuing the following command on your host machine:


$ hn=$(hostname -s)
$ cat /data/cernvm/vmnetwork/$hn.mac

Please do not proceed until the GridFactory worker you started appears in the list.
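
Instead of searching the list by eye, you can save it to a file and search it for your MAC address. The sketch below uses a hypothetical helper, mac_listed, and assumes that the list contains the MAC address in the same notation as the .mac file, apart from letter case.

```shell
# Hypothetical helper: succeed if the MAC address (any letter case) occurs
# in a saved copy of the node list.
mac_listed() {
    grep -qiF -- "$1" "$2"
}

# Intended use on the host (commented out because it needs network access):
#   mac=$(cat /data/cernvm/vmnetwork/$(hostname -s).mac)
#   curl -s 'https://csc2011.dyndns.org/db/nodes/?format=xml' > nodes.xml
#   mac_listed "$mac" nodes.xml && echo "worker is online"
```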

Submitting jobs

The rest of the exercise has to be done from within CernVM. Log in to the CernVM instance that you created during the first part of the exercise. Alternatively, quickly create and configure a new Desktop instance and log in to it.

Set up the environment variables to run GCC and ROOT from the CernVM File System


source /cvmfs/sft.cern.ch/lcg/external/gcc/4.3.2/x86_64-slc5-gcc43-opt/setup.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/x86_64-slc5-gcc43-opt/root/lib
export PATH=$PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/x86_64-slc5-gcc43-opt/root/bin
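
To see which gcc and root the shell will actually run after these settings, you can ask it where each command resolves. This is a minimal sketch; which_tool is a hypothetical helper built on the standard command -v.

```shell
# Hypothetical helper: print where a command would be run from, or report
# that it is not on PATH.
which_tool() {
    path=$(command -v "$1") || path="not found"
    echo "$1 -> $path"
}

for tool in gcc root; do
    which_tool "$tool"
done
```

Note that the ROOT directory is appended to PATH rather than prepended, so if the path printed for root does not start with /cvmfs/sft.cern.ch, another copy earlier on PATH is taking precedence.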

Get and unpack the exercise tarball

$ wget http://www.gridfactory.org/files/2011/07/TTBarExercise.tar.gz
$ tar zxvf TTBarExercise.tar.gz

Compile the Monte Carlo generator (PYTHIA) and the analysis code

$ cd TTBarExercise/siscone-2.0.0
$ ./configure --prefix=$PWD/../siscone
$ make clean
$ make
$ make install
$ cd ..
$ rm -f *.o
$ make
$ g77 -o pythia pythia_exercise.f pythia-6.4.25.f

Make sure that everything is working:

Run the event generator
$ ./pythia

Analyze the PYTHIA output and save the result to a ROOT histogram:
$ ./AsciiReader Eventsgen.ascii

Now the fun starts.

Create batch jobs...

mkdir jobs
for n in {1..25}; do
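# The quoted 'EOF' keeps the text literal: PATH and LD_LIBRARY_PATH are then
# expanded on the worker, and the sed line below fills in the job number.
cat > jobs/job$n.sh <<'EOF'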
#!/bin/bash
source /cvmfs/sft.cern.ch/lcg/external/gcc/4.3.2/x86_64-slc5-gcc43-opt/setup.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/x86_64-slc5-gcc43-opt/root/lib
export PATH=$PATH:\
/cvmfs/sft.cern.ch/lcg/external/ROOT/5.28.00/x86_64-slc5-gcc43-opt/root/bin
./pythia $((19780503 + $n)) 20000
./AsciiReader Eventsgen.ascii
mv AsciiReader.root ttbar_analysis_$n.root
EOF
sed -i "s|\$n|$n|" jobs/job$n.sh
done

$
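
The loop above writes each job file through a here-document. Whether $PATH, $LD_LIBRARY_PATH and $n are expanded while the file is being written or only when the job runs on the worker depends on whether the EOF delimiter is quoted. A minimal illustration of the difference, using made-up variable names that are not part of the exercise:

```shell
n=7

# Unquoted delimiter: expansions happen now, while the text is written.
unquoted=$(cat <<EOF
seed=$((100 + n)) home=$HOME
EOF
)

# Quoted delimiter: the text is kept literally, so expansions would only
# happen later, when the generated script itself runs.
quoted=$(cat <<'EOF'
seed=$((100 + n)) home=$HOME
EOF
)

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Comparing the two printed lines with the contents of jobs/job1.sh shows which of the two behaviours your job files were generated with.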

...set up the environment for GridFactory...

$ export PATH=$PATH:\
/cvmfs/sft.cern.ch/lcg/external/experimental/gridfactory.org/gridfactory_ui:\
/cvmfs/sft.cern.ch/lcg/external/experimental/gridfactory.org/gridworker:\
/cvmfs/sft.cern.ch/lcg/external/Java/JDK/1.6.0/ia32/bin
$

...make sure that GridFactory is set up correctly...

$ psub -h

... and submit jobs to your classroom cloud for execution


$ rm -f jobs.txt
$ for n in {1..25}; do
psub -b csc2011.dyndns.org jobs/job$n.sh -i pythia -i AsciiReader -e pythia -e AsciiReader -o ttbar_analysis_$n.root | grep -v submitted >> jobs.txt
echo "submitted job $n"
done
$

NOTE! You are sending an executable compiled on one machine to another machine for execution. How do you know that the executable will work once it is delivered to the worker node? How do you know that the environment of the worker machine contains everything necessary (e.g. the right versions of all the runtime libraries on which it depends) to run your executable?
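
One way to start investigating is to list the shared libraries an executable needs and check whether the dynamic loader can find them. The sketch below assumes a Linux machine with the standard ldd tool; missing_libs is a hypothetical helper, and /bin/sh only stands in for your own binary.

```shell
# Hypothetical helper: list shared libraries that the dynamic loader cannot
# find for a binary on the current machine. Prints nothing if all resolve.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}

# On the machine where you compiled:  ldd ./pythia   (shows every dependency)
# On any machine:                     missing_libs ./pythia
missing_libs /bin/sh || echo "/bin/sh: no unresolved libraries"
```

Think about where the libraries listed for ./pythia and ./AsciiReader come from, and whether the worker node can see the same locations.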

The list of submitted jobs can be seen here:
https://csc2011.dyndns.org/db/jobs/?format=text
Do you see only your jobs there?

Hint: if you have problems with submitting many jobs, try running a single job:

psub -b csc2011.dyndns.org jobs/job1.sh -i pythia -i AsciiReader -e pythia -e AsciiReader -o ttbar_analysis_1.root

You can monitor your jobs with

pstat `cat jobs.txt`

And now the important part: get the results of your jobs back and merge the output histograms


$ mkdir results
$ for n in `cat jobs.txt`; do
dir=`echo $n | awk -F / '{print $NF}'`
mkdir results/$dir
pget -o results/$dir $n
done
$
$ hadd ttbar_analysis.root results/*/*.root

Open the final histogram with ROOT

$ root ttbar_analysis.root
.
root [1] TBrowser b;

Find out how many events you managed to generate. How can you generate more? The student who first manages to generate a histogram with 500,000 events will get a free drink!
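
A rough upper bound on the event count can be obtained by counting the retrieved output files and multiplying by the number of events requested per job (20000 in the job scripts above). This is only a sketch: events_retrieved is a hypothetical helper, and it assumes that every retrieved file comes from a job that ran to completion. The histogram itself remains the authoritative answer.

```shell
# Hypothetical helper: estimate the number of generated events as
# (number of retrieved .root files) x (events per job).
events_retrieved() {
    dir=$1
    per_job=$2
    count=0
    for f in "$dir"/*/*.root; do
        # The test skips the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo $((count * per_job))
}

events_retrieved results 20000
```

With all 25 jobs retrieved, 25 x 20000 gives 500000.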

We would like to express our special gratitude to Dr. Frederik Orellana from the Niels Bohr Institute, who kindly volunteered to design and prepare the second virtualization exercise for CSC 2011.