Deconvolution is part of the image reconstruction process in radio interferometry, which is used in radio astronomy to compute images of the sky. Image reconstruction consists of several steps and transforms irregularly sampled visibilities, the amplitudes and phases of the electromagnetic waves detected by an interferometer, into the intensity of the radio emission at each location on the sky, called the sky brightness. The first step in image reconstruction is gridding, where the detected visibilities are resampled onto a regular grid, which allows us to use the Fast Fourier Transform (FFT) instead of the computationally expensive direct Fourier transform. The true sky brightness is then reconstructed by an iterative process in which the brightest sources are deconvolved back to the positions of the irregularly sampled visibilities and subtracted from the measured visibilities, producing a cleaner image. This is repeated until the technique converges. In this work, we have investigated the performance of a part of the 3D deconvolution process which calculates irregularly sampled visibilities from precomputed subgrids on the GPU. The number of visibilities that need to be calculated can vary by several orders of magnitude depending on the position in the image and the configuration of the telescope, which may cause load-balancing issues. We have implemented this step in CUDA for NVIDIA GPUs and achieved a performance of 540 million deconvolved visibilities per second.
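The step evaluated in this work, computing irregularly sampled visibilities from a regular grid (often called degridding), can be sketched in scalar form as below. This is a minimal illustration only: the function name, the triangular interpolation kernel, the wrap-around indexing, and the support size are assumptions for the sketch, not the kernels or data layout of the CUDA implementation described here.

```python
import numpy as np

def degrid(grid, uv, kernel, support):
    """Sketch of degridding: interpolate a visibility at each irregular
    (u, v) sample point from a regular grid of gridded visibilities,
    weighting nearby grid cells with a separable convolution kernel."""
    n = grid.shape[0]
    vis = np.zeros(len(uv), dtype=complex)
    for i, (u, v) in enumerate(uv):
        # nearest grid cell to the sample point
        iu, iv = int(round(u)), int(round(v))
        acc = 0j
        # accumulate kernel-weighted contributions from neighbouring cells
        for du in range(-support, support + 1):
            for dv in range(-support, support + 1):
                w = kernel(u - (iu + du)) * kernel(v - (iv + dv))
                acc += w * grid[(iv + dv) % n, (iu + du) % n]
        vis[i] = acc
    return vis
```

On a GPU, the per-visibility work in the inner loops is what varies with telescope configuration and image position, which is the source of the load-balancing issue mentioned above: each sample touches a (2·support+1)² patch of the grid, and the number of samples per subgrid is highly non-uniform.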