Geoinformatics Unit



Current Position

Naoto Yokoya is the unit leader of the Geoinformatics Unit at the RIKEN Center for Advanced Intelligence Project (AIP), Japan. His research focuses on image processing and data fusion in remote sensing.
He is a member of the IEEE (2009), IEEE Geoscience and Remote Sensing Society (GRSS), and IEEE GRSS Image Analysis and Data Fusion Technical Committee (IADF TC).
He is a co-chair of the IEEE GRSS IADF TC and the secretary of the IEEE GRSS All Japan Joint Chapter.
He is also the Technology Advisor of Root Inc.


2018 Jan - Present    Unit Leader, RIKEN AIP, Japan
2015 Dec - 2017 Nov    Alexander von Humboldt Research Fellow, DLR & TUM, Germany
2013 Jul - 2017 Dec    Assistant Professor, The University of Tokyo, Japan
2010 Oct - 2013 Mar    Ph.D. in Aerospace Engineering, The University of Tokyo, Japan

Journal Papers

  1. W. He and N. Yokoya, "Multi-temporal Sentinel-1 and -2 data fusion for optical image simulation," ISPRS International Journal of Geo-Information, vol. 7, no. 10, pp. 389, 2018.

    Abstract: In this paper, we present optical image simulation from synthetic aperture radar (SAR) data using deep-learning-based methods. Two models, i.e., optical image simulation directly from the SAR data and from multi-temporal SAR-optical data, are proposed to test its feasibility. The deep-learning-based methods that we chose to implement the models are a convolutional neural network (CNN) with a residual architecture and a conditional generative adversarial network (cGAN). We validate our models using the Sentinel-1 and -2 datasets. The experiments demonstrate that the model with multi-temporal SAR-optical data can successfully simulate the optical image, whereas the model with SAR data alone as input fails. The optical image simulation results indicate the possibility of SAR-optical information blending for subsequent applications such as large-scale cloud removal and temporal super-resolution of optical data. We also investigate the sensitivity of the proposed models to the training samples and outline possible future directions.

  2. L. Guanter, M. Brell, J. C.-W. Chan, C. Giardino, J. Gomez-Dans, C. Mielke, F. Morsdorf, K. Segl, and N. Yokoya, "Synergies of spaceborne imaging spectroscopy with other remote sensing approaches," Surveys in Geophysics, pp. 1-31, 2018.

    Abstract: Imaging spectroscopy (IS), also commonly known as hyperspectral remote sensing, is a powerful remote sensing technique for monitoring the Earth’s surface and atmosphere. Pixels in optical hyperspectral images consist of continuous reflectance spectra formed by hundreds of narrow spectral channels, allowing an accurate representation of the surface composition through spectroscopic techniques. However, technical constraints in the definition of imaging spectrometers usually mean that spectral coverage and resolution are traded off against spatial resolution and swath width, as opposed to optical multispectral (MS) systems, which are typically designed to maximize spatial and/or temporal resolution. This complementarity suggests that a synergistic exploitation of spaceborne IS and MS data would be an optimal way to serve remote sensing applications that require not only high spatial and temporal resolution data, but also rich spectral information. In addition, IS has been shown to have strong synergistic potential with non-optical remote sensing methods, such as thermal infrared (TIR) and light detection and ranging (LiDAR). In this contribution, we review theoretical and methodological aspects of potential synergies between optical IS and other remote sensing techniques. The focus is on the evaluation of synergies between spaceborne optical IS and MS systems because of the expected availability of the two types of data in the coming years. Short reviews of potential synergies of IS with TIR and LiDAR measurements are also provided.

  3. J. Xia, N. Yokoya, and A. Iwasaki, "Fusion of hyperspectral and LiDAR data with a novel ensemble classifier," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 6, pp. 957-961, 2018.

    Abstract: Due to the development of sensors and data acquisition technology, the fusion of features from multiple sensors is a very hot topic. In this letter, the use of morphological features to fuse a hyperspectral (HS) image and a light detection and ranging (LiDAR)-derived digital surface model (DSM) is exploited via an ensemble classifier. In each iteration, we first apply morphological openings and closings with partial reconstruction on the first few principal components (PCs) of the HS and LiDAR datasets to produce morphological features that model spatial and elevation information for the HS and LiDAR datasets. Second, three groups of features (i.e., spectral features and morphological features of the HS and LiDAR data) are split into several disjoint subsets. Third, a data transformation is applied to each subset, and the features extracted in each subset are stacked as the input of a random forest (RF) classifier. Three data transformation methods, including principal component analysis (PCA), locality preserving projection (LPP), and unsupervised graph fusion (UGF), are introduced into the ensemble classification process. Finally, we integrate the classification results achieved at each step by a majority vote. Experimental results on co-registered HS and LiDAR-derived DSM data demonstrate the effectiveness and potential of the proposed ensemble classifier.

  4. P. Ghamisi and N. Yokoya, "IMG2DSM: Height simulation from single imagery using conditional generative adversarial nets," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 5, pp. 794-798, 2018.

    Abstract: This paper proposes a groundbreaking approach in the remote sensing community for simulating a digital surface model (DSM) from a single optical image. This novel technique uses conditional generative adversarial nets whose architecture is based on an encoder-decoder network with skip connections (generator) and a network that penalizes structures at the scale of image patches (discriminator). The network is trained on scenes where both DSM and optical data are available to establish an image-to-DSM translation rule. The trained network is then used to simulate elevation information on target scenes where no corresponding elevation information exists. The capability of the approach is evaluated both visually (in terms of photo interpretation) and quantitatively (in terms of reconstruction errors and classification accuracies) on sub-decimeter spatial resolution datasets captured over Vaihingen, Potsdam, and Stockholm. The results confirm the promising performance of the proposed framework.

  5. N. Yokoya, P. Ghamisi, J. Xia, S. Sukhanov, R. Heremans, I. Tankoyeu, B. Bechtel, B. Le Saux, G. Moser, and D. Tuia, "Open data for global multimodal land use classification: Outcome of the 2017 IEEE GRSS Data Fusion Contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 5, pp. 1363-1377, 2018.

    Abstract: In this paper, we present the scientific outcomes of the 2017 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society. The 2017 Contest addressed the problem of local climate zone classification based on a multitemporal and multimodal dataset, including image data (Landsat 8 and Sentinel-2) and vector data (from OpenStreetMap). The competition, based on separate geographical locations for the training and testing of the proposed solutions, aimed at models that were accurate (assessed by accuracy metrics on an undisclosed reference for the test cities), general (assessed by spreading the test cities across the globe), and computationally feasible (assessed by having a test phase of limited time). The techniques proposed by the participants in the Contest spanned a rather broad range of topics, mixing ideas and methodologies that derive from computer vision and machine learning but are also deeply rooted in the specificities of remote sensing. In particular, rigorous atmospheric correction, the use of multidate images, and the use of ensemble methods fusing results obtained from different data sources/time instants made the difference.

  6. B. Le Saux, N. Yokoya, R. Hansch, and S. Prasad, "2018 IEEE GRSS Data Fusion Contest: Multimodal land use classification," IEEE Geoscience and Remote Sensing Magazine, vol. 6, no. 1, pp. 52-54, 2018.

Conference Papers

  1. V. Ferraris, N. Yokoya, N. Dobigeon, and M. Chabert, "A comparative study of fusion-based change detection methods for multi-band images with different spectral and spatial resolutions," IEEE International Geoscience and Remote Sensing Symposium, 2018.
  2. J. Xia, N. Yokoya, and A. Iwasaki, "Boosting for domain adaptation extreme learning machines for hyperspectral image classification," IEEE International Geoscience and Remote Sensing Symposium, 2018.
  3. D. Hong, N. Yokoya, J. Xu, and X. X. Zhu, "Joint & progressive learning from high-dimensional data for multi-label classification," European Conference on Computer Vision, 2018.