Performance Analysis of Data Replication in Grid Delivery Networks
Abstract
In this paper, we examine the data replication problem in a particular Grid Delivery Network (GDN). In this system, data are divided into fixed-size blocks, which must be replicated on hosts to decrease the total download time. We propose a probabilistic model to optimize the average download time of requests based on host availability and the document size distribution. The optimization problem induced by this model is a non-linear integer program. Its relaxation to real values can be solved by Lagrangian optimization. We prove that, in a particular case, this problem can be reduced to a knapsack problem. We propose approximation algorithms and validate them using simulations with varying characteristics.