On 1 September 2010 I started working at SARA as a systems programmer in the e-science group. My current work involves the Dutch Life Science Grid, a major part of BigGrid. The topics of the BigGrid user meeting matched this very well, so I attended the meeting.
After the official opening of the meeting by Jaap van den Herik and Arjen van Rijn, the presentations started.
There is a big gap between using a grid infrastructure and doing life science research. As Timo explains, the life sciences are not one single domain but hundreds of different disciplines, and it takes a big effort to find common ground. Just imagine how difficult it is to offer Grid services to such a group. The eBioScience group, nowadays called e-BioGrid, exists to try to bridge that gap. A structure was created in which a so-called e-core is surrounded by the different domains of bio/life science.
Willem presents the successful Fly-safe project, including a live demo. The project uses data that the Dutch rain radar (buienradar.nl) does not need and therefore discards, and uses it to predict the presence of birds in military flight zones. The aim is to avoid spending two days preparing for a flight, only to find out on the day itself that there are too many birds present to fly safely. Using the radar images, flocks of birds can be followed and studied, but it is impossible to study the behavior of individual birds. To study individual birds, they are equipped with GPS loggers, and the data can be retrieved wirelessly. Again in a live demo, Willem showed the data of a stork's flight home after it had been walking around on a freshly ploughed field. The flight of a gull from Texel to Amsterdam was even more impressive to watch. It became clear once again that good data visualization is key to the success of data-intensive e-science projects.
To give life science applications easy access to the Grid, Alexandre's group created a portal for each of those applications. With a uniform layout but a specialized back-end per application, the portals became a big success. The project made the Italian VO one of the biggest users of BigGrid. One of the advantages is that users do not need to apply for a Grid certificate and a number of VOs, which can be cumbersome in Italy because of the location of the Certificate Authorities. Originally called eNMR (eInfrastructure Nuclear Magnetic Resonance), the project now continues, because of its success, as weNMR (Worldwide e-Infrastructure for NMR) and, as the name suggests, moves from a European scope to a worldwide scope. The success of this project shows that Grid users need easy, low-threshold access to the Grid.
Frank explains the size of the storage needed by the LHC ATLAS project very clearly: one event generates 1.6 MB of data, which is not that much, but ATLAS measures 40,000,000 events per second, resulting in roughly 65 TB/s. That is impossible to store, so the data is preprocessed and only a selection is stored. A tiered storage setup with many participants, including the Netherlands, provides the storage needed. Part of the compute power for further processing of the data is also located in the Netherlands, and BigGrid clusters are used for it.
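The arithmetic behind that number is easy to check. The sketch below is only a back-of-the-envelope calculation with the two figures from the talk; it assumes decimal units (1 TB = 1,000,000 MB), and the function name is made up for this example. With exactly 1.6 MB per event it comes out at 64 TB/s, in line with the rounded figure quoted in the talk.

```python
# Back-of-the-envelope check of the raw ATLAS data rate.
# The two input figures come from the talk; decimal units are an assumption.
EVENT_SIZE_MB = 1.6             # size of one event in megabytes
EVENTS_PER_SECOND = 40_000_000  # events measured per second


def raw_rate_tb_per_s(event_size_mb: float, events_per_second: int) -> float:
    """Return the raw data rate in terabytes per second (1 TB = 1,000,000 MB)."""
    mb_per_second = event_size_mb * events_per_second
    return mb_per_second / 1_000_000


rate = raw_rate_tb_per_s(EVENT_SIZE_MB, EVENTS_PER_SECOND)
print(f"Raw rate: {rate:.0f} TB/s")  # prints "Raw rate: 64 TB/s"
```

At that rate even a single second of raw data would fill tens of thousands of ordinary disks, which makes it obvious why the data has to be reduced before it is stored.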
After a description of the technology behind the antenna and the setup of the largest radio telescope in the world, it becomes clear that a telescope of this size is mainly software driven. Besides the computational needs, a large amount of data storage is needed. BigGrid is used for both.
The following five speakers presented their view of the Grid and showed how they use it in short sessions. Afterwards the speakers were invited for a short panel discussion.
Delft University has a Dutch Life Science Grid cluster on location and can use the whole Dutch Life Science Grid. Besides the Grid, distributed computing and Cloud computing are also offered as services. Because of the easy access to the local cluster, the number of Grid users is growing.
Like Delft, Leiden University has access to the Life Science Grid and has a local cluster that is part of that Grid. For the chemists the Grid alone does not suffice, so other services are offered as well. A portal gives access to those different services. One of the issues with the Grid middleware is the number of jobs that can be submitted. Some experiments need to submit a lot of jobs, but the Grid middleware does not handle this well. Submitting a so-called pilot job solves this issue. When the pilot job arrives at a node, it downloads the real job(s) and the data it needs, so that a large number of jobs can be run without submitting each of them through the middleware.
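The pilot job idea can be illustrated with a small sketch. This is not the Leiden implementation or any real Grid middleware API: the in-process `queue.Queue` stands in for a central task server, and `pilot_job`, `square` and `max_tasks` are hypothetical names chosen for this example. The point is only that three pilot submissions are enough to work through a thousand real tasks.

```python
import queue


def pilot_job(task_queue, max_tasks=None):
    """Simulate a pilot job on a worker node.

    The pilot keeps fetching real tasks from the central queue and runs
    them until the queue is empty or max_tasks tasks have been handled.
    """
    results = []
    while max_tasks is None or len(results) < max_tasks:
        try:
            task_id, func, args = task_queue.get_nowait()
        except queue.Empty:
            break  # no work left, the pilot can exit
        results.append((task_id, func(*args)))
    return results


def square(x):
    """Stand-in for a real computation."""
    return x * x


# Central task queue with 1000 small tasks (assumption: in a real setup
# this would be a server that the worker nodes can reach).
task_queue = queue.Queue()
for i in range(1000):
    task_queue.put((i, square, (i,)))

# Only three pilots go through the "middleware"; here they run one after
# another, on the Grid they would run on different nodes at the same time.
pilots = [pilot_job(task_queue, max_tasks=400) for _ in range(3)]
print([len(p) for p in pilots])  # prints [400, 400, 200]
```

The middleware only ever sees the three pilots, while the bookkeeping of the individual tasks moves to the queue, which is where the load is easier to handle.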
Using MRI (brain scan) data from part of the population of Rotterdam, a data crunching experiment was created to run on a cluster. To run the experiment on the Grid, the huge data blocks had to be split into smaller parts. One of the issues Henri has with the Grid is the lack of proper job monitoring by the Grid middleware: it takes a long time before one can determine the status or failure of a job, for example when it is waiting in a queue.
To offer Grid services to its users, Amsterdam Medical Center created a portal. The way AMC offers the service allows users to take local data and create Grid jobs with it. The portal back-end takes care of getting the data to the nodes that execute the jobs.
As an extra service, BigGrid (via SARA) offers an HPC Cloud infrastructure that can be used to create a custom High Performance Cluster. Han is very pleased with the full control (root access) he had over his "virtual" cluster. After some initial problems, which were solved by the HPC Cloud support, a real experiment was executed successfully in the Cloud.
To finish the BigGrid user meeting, Steven gave a lecture about the plans and goals of the EGI(-InSPIRE) organization for the coming years. One of the ideas to investigate: an AppStore for Grid applications.
All in all it was a very interesting day of lectures and presentations, which gave me insight into the (Big)Grid and its users. I think part (or all, as a matter of fact) of the success of the Grid lies in the success of the users when they use the Grid to do e-science. The enthusiasm of the users at this BigGrid user meeting shows that the Grid is on its way to becoming a success. But the ease of use of the Grid middleware has to improve to make it a big success. For now, users are trying to find ways around limitations that should not have been there in the first place.