Optimizing Data Distribution in Volunteer Computing Systems using Resources of Participants


Abdelhamid Elwaer
Ian Taylor
Omer Rana


Many scientific projects use the BOINC middleware to build volunteer computing projects. BOINC distributes data to its users through centralized data servers, and some projects attract thousands of participants. Such large numbers of users, coupled with large datasets, can turn the centralized BOINC data servers into a bottleneck, which limits job throughput and so degrades the performance of the project as a whole. Alternative methods have been proposed, such as the Attic file system, which decentralizes data distribution to BOINC participants. This approach has been shown to scale, but it does not attempt to optimize the use of the various distributed data centers involved. We describe performance techniques based on trust algorithms that, when layered on the Attic file system, can significantly improve data availability and access time through intelligent selection of the data center for each user, based on the optimization of three parameters: trust, current connection speed, and availability.
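
As an illustration only, the selection of a data center from the three parameters named above can be sketched as a weighted score. This is not the trust algorithm described in the paper: the DataCenter record, the normalization of connection speed, the linear combination, and the equal default weights are all assumptions introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    """Hypothetical record for one candidate data center."""
    name: str
    trust: float         # assumed to lie in [0, 1]
    speed_mbps: float    # assumed: most recently measured connection speed
    availability: float  # assumed: fraction of time reachable, in [0, 1]


def score(dc, max_speed, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine the three parameters into one number (assumed linear form)."""
    w_trust, w_speed, w_avail = weights
    # Normalize speed against the fastest candidate so all terms share a scale.
    norm_speed = dc.speed_mbps / max_speed if max_speed > 0 else 0.0
    return w_trust * dc.trust + w_speed * norm_speed + w_avail * dc.availability


def select_data_center(centers, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return the candidate with the highest combined score."""
    if not centers:
        raise ValueError("no candidate data centers")
    max_speed = max(dc.speed_mbps for dc in centers)
    return max(centers, key=lambda dc: score(dc, max_speed, weights))


# Illustrative values only; they are not measurements from the paper.
candidates = [
    DataCenter("a", trust=0.9, speed_mbps=10.0, availability=0.9),
    DataCenter("b", trust=0.2, speed_mbps=100.0, availability=0.5),
    DataCenter("c", trust=0.8, speed_mbps=80.0, availability=0.95),
]
best = select_data_center(candidates)
```

With equal weights, a center that is reasonably strong on all three parameters is preferred over one that is fastest but poorly trusted; changing the weights shifts that trade-off.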


Special Issue