Geoinformatics Unit

Junshi XIA


Current Position

Junshi Xia is a Research Scientist in the Geoinformatics Unit at the RIKEN Center for Advanced Intelligence Project (AIP), Japan. His research interests include multiple classifier systems in remote sensing, hyperspectral remote sensing image processing, and urban remote sensing.
He is a member of the IEEE (2014) and the IEEE Geoscience and Remote Sensing Society (GRSS).


2018 May - Present    Research Scientist, RIKEN AIP, Japan
2016 May - 2018 Apr    JSPS Research Fellow, The University of Tokyo, Japan
2015 Nov - 2016 Apr    Visiting Scientist, Nanjing University, China
2015 May - 2016 Apr    Postdoctoral Researcher, University of Bordeaux, France
2011 Oct - 2014 Oct    Ph.D. in Image and Signal Processing, Université Grenoble Alpes, France
2008 Sep - 2013 May    M.Sc. and Ph.D. in Photogrammetry and Remote Sensing, China University of Mining and Technology, China

Journal Papers

  1. P. Du, E. Li, J. Xia, A. Samat, and X. Bai, "Feature and model level fusion of pre-trained CNN for remote sensing scene classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. (accepted for publication), 2018.

    Abstract: Convolutional neural networks (CNNs) have attracted tremendous attention in the remote sensing community due to their excellent performance in different domains. Especially for remote sensing scene classification, CNN-based methods have brought a great breakthrough. However, it is not feasible to fully design and train a new CNN model for remote sensing scene classification, as this usually requires a large number of training samples and high computational costs. To alleviate these limitations of fully training a new model, some work attempts to use pre-trained CNN models as feature extractors to build feature representations of scene images for classification and has achieved impressive results. In this scheme, how to construct the feature representation of a scene image via the pre-trained CNN model becomes the key process. Existing studies have paid little attention to building more discriminative feature representations by exploring the potential benefits of multi-layer features from a single CNN model and different feature representations from multiple CNN models. To this end, this paper presents a fusion strategy to build the feature representation of scene images by integrating multi-layer features of a single pre-trained CNN model, and extends it to a framework of multiple CNN models. For these purposes, a multiscale improved Fisher kernel (MIFK) coding method is used to build the feature representation of scene images on convolutional layers, and a feature fusion approach based on two feature subspace learning methods (PCA/SRKDA and PCA/SRKLPP) is proposed to construct the final fused features for scene classification. For validation and comparison purposes, the proposed approaches are evaluated on two challenging high-resolution remote sensing datasets and show competitive performance compared with existing state-of-the-art baselines such as fully trained CNN models, fine-tuned CNN models, and other related works.

  2. P. Du, L. Gan, J. Xia, and D. Wang, "Multikernel adaptive collaborative representation for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 8, pp. 4664-4677, 2018.

    Abstract: To adequately represent the nonlinearities in the high-dimensional feature space for hyperspectral images (HSIs), we propose a multiple kernel collaborative representation-based classifier (CRC) in this paper. Extended morphological profiles are first extracted from the original HSIs, because they can efficiently capture the spatial and spectral information. In the proposed method, a novel multiple kernel learning (MKL) model is embedded into CRC. Multiple kernel patterns, e.g., Naive, Multimetric, and Multiscale, are adopted for the optimal set of basic kernels, which help to capture useful information from different pixel distributions, kernel metric spaces, and kernel scales. To learn an optimal linear combination of the predefined basic kernels, we add an extra training stage to the typical CRC, where kernel weights are jointly learned with the representation coefficients from the training samples by minimizing the representation error. Moreover, by considering the different contributions of dictionary atoms, the adaptive representation strategy is applied to the MKL framework via a dissimilarity-weighted regularizer to obtain a more robust representation of test pixels in the fused kernel space. Experimental results on three real HSIs confirm that the proposed classifiers outperform the other state-of-the-art representation-based classifiers.

  3. J. Chen, P. Du, C. Wu, J. Xia, and J. Chanussot, "Mapping urban land cover of a large area using multiple sensors multiple features," Remote Sensing, vol. 10, no. 6, p. 872, 2018.

    Abstract: Given the strengths and limitations of multispectral and airborne LiDAR data, the fusion of such datasets can compensate for the weaknesses of each. This work investigates the integration of multispectral and airborne LiDAR data for the land cover mapping of a large urban area. Different LiDAR-derived features are involved, including height, intensity, and multiple-return features. However, there is limited knowledge relating to the integration of multispectral and LiDAR data including three feature types for the classification task. Furthermore, little attention has been devoted to the relative importance of input features and their impact on classification uncertainty when using multispectral and LiDAR data. The key goal of this study is to explore the potential improvement from using both multispectral and LiDAR data and to evaluate the importance and uncertainty of input features. Experimental results revealed that using the LiDAR-derived height features produced the lowest classification accuracy (83.17%). The addition of intensity information increased the map accuracy by 3.92 percentage points. The accuracy was further improved to 87.69% with the addition of multiple-return features. A SPOT-5 image produced an overall classification accuracy of 86.51%. Combining spectral and spatial features increased the map accuracy by 6.03 percentage points. The best result (94.59%) was obtained by the combination of SPOT-5 and LiDAR data using all available input variables. Analysis of feature relevance demonstrated that the normalized digital surface model (nDSM) was the most beneficial feature in the classification of land cover. LiDAR-derived height features were more conducive to the classification of urban areas than LiDAR-derived intensity and multiple-return features. Selecting only the 10 most important features can result in higher overall classification accuracy than every scenario of input variables except the one using all available input features. Variable importance varied to a very large extent across land cover classes. Results of classification uncertainty suggested that feature combination tends to decrease classification uncertainty for different land cover classes, but there is no "one-feature-combination-fits-all" solution. The values of classification uncertainty exhibited significant differences between the land cover classes, and extremely low uncertainties were revealed for the water class. However, it should be noted that using all input variables resulted in relatively lower classification uncertainty values for most of the classes when compared to other input feature scenarios.

  4. J. Xia, N. Yokoya, and A. Iwasaki, "Fusion of hyperspectral and LiDAR data with a novel ensemble classifier," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 6, pp. 957-961, 2018.

    Abstract: Due to the development of sensors and data acquisition technology, the fusion of features from multiple sensors is a very active research topic. In this letter, the use of morphological features to fuse a hyperspectral (HS) image and a light detection and ranging (LiDAR)-derived digital surface model (DSM) is exploited via an ensemble classifier. In each iteration, we first apply morphological openings and closings with partial reconstruction on the first few principal components (PCs) of the HS and LiDAR datasets to produce morphological features that model spatial and elevation information for the HS and LiDAR datasets. Second, three groups of features (i.e., spectral features and morphological features of the HS and LiDAR data) are split into several disjoint subsets. Third, data transformation is applied to each subset, and the features extracted in each subset are stacked as the input of a random forest (RF) classifier. Three data transformation methods, including principal component analysis (PCA), locality preserving projection (LPP), and unsupervised graph fusion (UGF), are introduced into the ensemble classification process. Finally, we integrate the classification results achieved at each step by a majority vote. Experimental results on co-registered HS and LiDAR-derived DSM data demonstrate the effectiveness and potential of the proposed ensemble classifier.

  5. N. Yokoya, P. Ghamisi, J. Xia, S. Sukhanov, R. Heremans, I. Tankoyeu, B. Bechtel, B. Le Saux, G. Moser, and D. Tuia, "Open data for global multimodal land use classification: Outcome of the 2017 IEEE GRSS Data Fusion Contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 5, pp. 1363-1377, 2018.

    Abstract: In this paper, we present the scientific outcomes of the 2017 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society. The 2017 Contest was aimed at addressing the problem of local climate zone classification based on a multitemporal and multimodal dataset, including image data (Landsat 8 and Sentinel-2) and vector data (from OpenStreetMap). The competition, based on separate geographical locations for the training and testing of the proposed solutions, aimed at models that were accurate (assessed by accuracy metrics on an undisclosed reference for the test cities), general (assessed by spreading the test cities across the globe), and computationally feasible (assessed by having a test phase of limited time). The techniques proposed by the participants in the Contest spanned a rather broad range of topics, mixing ideas and methodologies deriving from computer vision and machine learning but also deeply rooted in the specificities of remote sensing. In particular, rigorous atmospheric correction, the use of multidate images, and the use of ensemble methods fusing results obtained from different data sources/time instants made the difference.

Conference Papers

  1. J. Xia, N. Yokoya, and A. Iwasaki, "Boosting for domain adaptation extreme learning machines for hyperspectral image classification," IEEE International Geoscience and Remote Sensing Symposium, 2018.