| Convolutional neural networks (CNNs) have been widely applied to super-resolution tasks. However, existing CNN-based super-resolution algorithms need pre-processing steps that are computationally expensive. In this paper, we propose an architecture that learns both the spatial and the temporal information of low-resolution video. Consecutive frames are used as input to our video super-resolution CNN. Unlike existing approaches, a single convolutional layer, named the correlational layer, replaces the motion compensation step. Experimental results show that the proposed correlational layer learns the motion information and outperforms the network with motion-compensated frames by 0.19 dB in PSNR. On average, our method outperforms VSRnet by 0.39 dB in PSNR for videos upscaled by a factor of 3. |
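
The gains above are reported in PSNR, so a short sketch of how that metric is commonly computed may help when reading them. It uses the standard definition, PSNR = 10 log10(peak^2 / MSE), and assumes 8-bit frames with a peak value of 255. The function name `psnr` and the list-of-rows frame layout are illustrative assumptions, and the exact evaluation protocol (for example, which channel is scored or whether borders are cropped) is not stated in this abstract.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Return the PSNR in dB between two equally sized 2D frames.

    Frames are given as lists of rows of pixel intensities.
    peak=255.0 is an assumption for 8-bit images.
    """
    if not reference or len(reference) != len(estimate):
        raise ValueError("frames must be non-empty and have the same shape")
    if any(len(r) != len(e) for r, e in zip(reference, estimate)):
        raise ValueError("frames must be non-empty and have the same shape")

    # Mean squared error over all pixels.
    n_pixels = sum(len(row) for row in reference)
    if n_pixels == 0:
        raise ValueError("frames must contain at least one pixel")
    squared_error = sum(
        (a - b) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for a, b in zip(ref_row, est_row)
    )
    mse = squared_error / n_pixels

    # Identical frames have zero error, i.e. unbounded PSNR.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 2x2 frames, purely illustrative values.
ground_truth = [[52, 55], [61, 59]]
upscaled = [[54, 55], [60, 58]]
print(f"{psnr(ground_truth, upscaled):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.19 dB at a fixed peak value corresponds to roughly 4% lower MSE, and a gain of 0.39 dB to roughly 9% lower MSE.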