Abstract: Hyperspectral image (HSI) denoising is a fundamental problem in remote sensing and image processing. Recently, non-local low-rank tensor approximation based denoising methods have attracted much attention, due to the advantage of fully exploiting the non-local self-similarity and global spectral correlation. Existing non-local low-rank tensor approximation methods were mainly based on two common Tucker or CP decomposition and achieved the state-of-the-art results, but they suffer some troubles and are not the best approximation for a tensor. For example, the number of parameters of Tucker decomposition increases exponentially follow its dimension, and CP decomposition cannot better preserve the intrinsic correlation of HSI. In this paper, we propose a non-local tensor ring (TR) approximation for HSI denoising by utilizing TR decomposition to simultaneously explore non-local self-similarity and global spectral low-rank characteristic. TR decomposition approximates a high-order tensor as a sequence of cyclically contracted three-order tensors, which has a strong ability to explore these two intrinsic priors and improve the HSI denoising result. Moreover, we develop an efficient proximal alternating minimization algorithm to efficiently optimize the proposed TR decomposition model. Extensive experiments on three simulated datasets under several noise levels and two real datasets testify that the proposed TR model performs better HSI denoising results than several state-of-the-art methods in term of quantitative and visual performance evaluations.
Abstract: Hyperspectral dimensionality reduction (HDR), an important preprocessing step prior to high-level data analysis, has been garnering growing attention in the remote sensing community. Although a variety of methods, both unsupervised and supervised models, have been proposed for this task, yet the discriminative ability in feature representation still remains limited due to the lack of a powerful tool that effectively exploits the labeled and unlabeled data in the HDR process. A semi-supervised HDR approach, called iterative multitask regression (IMR), is proposed in this paper to address this need. IMR aims at learning a low-dimensional subspace by jointly considering the labeled and unlabeled data, and also bridging the learned subspace with two regression tasks: labels and pseudo-labels initialized by a given classifier. More significantly, IMR dynamically propagates the labels on a learnable graph and progressively refines pseudo-labels, yielding a well-conditioned feedback system. Experiments conducted on three widely-used hyperspectral image datasets demonstrate that the dimension-reduced features learned by the proposed IMR framework with respect to classification or recognition accuracy are superior to those of related state-of-the-art HDR approaches.
Abstract: Due to the ever-growing diversity of the data source, multi-modality feature learning has attracted more and more attention. However, most of these methods are designed by jointly learning feature representation from multi-modalities that exist in both training and test sets, yet they are less investigated in absence of certain modality in the test phase. To this end, in this letter, we propose to learn a shared feature space across multi-modalities in the training process. By this way, the out-of-sample from any of multi-modalities can be directly projected onto the learned space for a more effective cross-modality representation. More significantly, the shared space is regarded as a latent subspace in our proposed method, which connects the original multi-modal samples with label information to further improve the feature discrimination. Experiments are conducted on the multispectral-Lidar and hyperspectral dataset provided by the 2018 IEEE GRSS Data Fusion Contest to demonstrate the effectiveness and superiority of the proposed method in comparison with several popular baselines.
Abstract: Cloud and cloud shadow (cloud/shadow) removal from multitemporal satellite images is a challenging task and has elicited much attention for subsequent information extraction. Regarding cloud/shadow areas as missing information, low-rank matrix/tensor completion based methods are popular to recover information undergoing cloud/shadow degradation. However, existing methods required to determine the cloud/shadow locations in advance and failed to completely use the latent information in cloud/shadow areas. In this study, we propose a blind cloud/shadow removal method for time-series remote sensing images by unifying cloud/shadow detection and removal together. First, we decompose the degraded image into low-rank clean image (surface-reflected) component and sparse (cloud/shadow) component, which can simultaneously and completely use the underlying characteristics of these two components. Meanwhile, the spatial-spectral total variation regularization is introduced to promote the spatial-spectral continuity of the cloud/shadow component. Second, the cloud/shadow locations are detected from the sparse component using a threshold method. Finally, we adopt the cloud/shadow detection results to guide the information compensation from the original observed images to better preserve the information in cloud/shadow-free locations. The problem of the proposed model is efficiently addressed using the alternating direction method of multipliers. Both simulated and real datasets are performed to demonstrate the effectiveness of our method for cloud/shadow detection and removal when compared with other state-of-the-art methods.
Abstract: To facilitate the advances in Sentinel-2A products for land cover from Moderate Resolution Imaging Spectroradiometer (MODIS) and Landsat imagery, Sentinel-2A MultiSpectral Instrument Level-1C (MSIL1C) images are investigated for large-scale vegetation mapping in an arid land environment that is located in the Ili River delta, Kazakhstan. For accurate classification purposes, multi-resolution segmentation (MRS) based extended object-guided morphological profiles (EOMPs) are proposed and then compared with conventional morphological profiles (MPs), MPs with partial reconstruction (MPPR), object-guided MPs (OMPs), OMPs with mean values (OMPsM), and object-oriented (OO)-based image classification techniques. Popular classifiers, such as C4.5, an extremely randomized decision tree (ERDT), random forest (RaF), rotation forest (RoF), classification via random forest regression (CVRFR), ExtraTrees, and radial basis function (RBF) kernel-based support vector machines (SVMs) are adopted to answer the question of whether nested dichotomies (ND) and ensembles of ND (END) are truly superior to direct and error-correcting output code (ECOC) multiclass classification frameworks. Finally, based on the results, the following conclusions are drawn: 1) the superior performance of OO-based techniques over MPs, MPPR, OMPs, and OMPsM is clear for Sentinel-2A MSIL1C image classification, while the best results are achieved by the proposed EOMPs; 2) the superior performance of ND, ND with class balancing (NDCB), ND with data balancing (NDDB), ND with random-pair selection (NDRPS), and ND with further centroid (NDFC) over direct and ECOC frameworks is not confirmed, especially in the cases of using weak classifiers for low-dimensional datasets; 3) from computationally efficient, high accuracy, redundant to data dimensionality and easy of implementations points of view, END, ENDCB, ENDDB, and ENDRPS are alternative choices to direct and ECOC frameworks; 4) surprisingly, because in the ensemble learning (EL) theorem, “weaker” classifiers (ERDT here) always have a better chance of reaching the trade-off between diversity and accuracy than “stronger” classifies (RaF, ExtraTrees, and SVM here), END with ERDT (END-ERDT) achieves the best performance with less than a 0.5% difference in the overall accuracy (OA) values, but is 100 to 10000 times faster than END with RaF and ExtraTrees, and ECOC with SVM while using different datasets with various dimensions; and, 5) Sentinel-2A MSIL1C is better choice than the land cover products from MODIS and Landsat imagery for vegetation species mapping in an arid land environment, where the vegetation species are critically important, but sparsely distributed.
Abstract: Mixed noise (such as Gaussian, impulse, stripe, and deadline noises) contamination is a common phenomenon in hyperspectral imagery (HSI), greatly degrading visual quality and affecting subsequent processing accuracy. By encoding sparse prior to the spatial or spectral difference images, total variation (TV) regularization is an efficient tool for removing the noises. However, the previous TV term cannot maintain the shared group sparsity pattern of the spatial difference images of different spectral bands. To address this issue, this study proposes a group sparsity regularization of the spatial difference images for HSI restoration. Instead of using L1 or L2-norm (sparsity) on the difference image itself, we introduce a weighted L2,1-norm to constrain the spatial difference image cube, efficiently exploring the shared group sparse pattern. Moreover, we employ the well-known low-rank Tucker decomposition to capture the global spatial-spectral correlation from three HSI dimensions. To summarize, a weighted group sparsity regularized low-rank tensor decomposition (LRTDGS) method is presented for HSI restoration. An efficient augmented Lagrange multiplier algorithm is employed to solve the LRTDGS model. The superiority of this method for HSI restoration is demonstrated by a series of experimental results from both simulated and real data, as compared to other state-of-the-art TV regularized low-rank matrix/tensor decomposition methods.
Abstract: With a large amount of open satellite multispectral (MS) imagery (e.g., Sentinel-2 and Landsat-8), considerable attention has been paid to global MS land cover classification. However, its limited spectral information hinders further improving the classification performance. Hyperspectral imaging enables discrimination between spectrally similar classes but its swath width from space is narrow compared to MS ones. To achieve accurate land cover classification over a large coverage, we propose a cross-modality feature learning framework, called common subspace learning (CoSpace), by jointly considering subspace learning and supervised classification. By locally aligning the manifold structure of the two modalities, CoSpace linearly learns a shared latent subspace from hyperspectral-MS (HS-MS) correspondences. The MS out-of-samples can be then projected into the subspace, which are expected to take advantages of rich spectral information of the corresponding hyperspectral data used for learning, and thus leads to a better classification. Extensive experiments on two simulated HS-MS data sets (University of Houston and Chikusei), where HS-MS data sets have tradeoffs between coverage and spectral resolution, are performed to demonstrate the superiority and effectiveness of the proposed method in comparison with previous state-of-the-art methods.
Abstract: Time-series remote sensing (RS) images are often corrupted by various types of missing information such as dead pixels, clouds, and cloud shadows that significantly influence the subsequent applications. In this paper, we introduce a new low-rank tensor decomposition model, termed tensor ring (TR) decomposition, to the analysis of RS datasets and propose a TR completion method for the missing information reconstruction. The proposed TR completion model has the ability to utilize the low-rank property of time-series RS images from different dimensions. To furtherly explore the smoothness of the RS image spatial information, total-variation regularization is also incorporated into the TR completion model. The proposed model is efficiently solved using two algorithms, the augmented Lagrange multiplier (ALM) and the alternating least square (ALS) methods. The simulated and real data experiments show superior performance compared to other state-of-the-art low-rank related algorithms.
Abstract: This paper presents the scientific outcomes of the 2018 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society. The 2018 Contest addressed the problem of urban observation and monitoring with advanced multi-source optical remote sensing (multispectral LiDAR, hyperspectral imaging, and very high-resolution imagery). The competition was based on urban land use and land cover classification, aiming to distinguish between very diverse and detailed classes of urban objects, materials, and vegetation. Besides data fusion, it also quantified the respective assets of the novel sensors used to collect the data. Participants proposed elaborate approaches rooted in remote-sensing, and also in machine learning and computer vision, to make the most of the available data. Winning approaches combine convolutional neural networks with subtle earth-observation data scientist expertise.
Abstract: Land-cover map is the basis of research and application related to urban planning, environmental management and ecological protection. Land-cover updating is an essential task especially in a rapidly urbanizing region, where fast development makes it necessary to monitor land-cover change in a timely manner. However, conventional approaches always have the limitations of large amounts of sample collection and exploitation of relational knowledge between multi-modality remote sensing datasets. With some global land-cover products being available, it is important to produce new land-cover maps based on the existing land-cover products and time series images. To this end, a novel transfer learning based automatic approach was proposed for updating land cover maps of rapidly urbanizing regions. In detail, the proposed method is composed of the following three steps. The first is to design a strategy to extract reliable land-cover information from the historical land-cover map for one of the images (source domain). Then, a novel relational knowledge transfer technique is applied to transfer label information. Finally, classifiers are trained on the transferred samples with spatio-spectral features. The experimental results show that aforementioned steps can select sufficient effective samples for target images, and for the main land-cover classes in a rapidly urbanizing region; the results of an updated map show good performance in both precision and vision. Therefore, the proposed approach provides an automatic solution for urban land-cover mapping with a high degree of accuracy.
Abstract: This work presents a detailed analysis of building damage recognition, employing multi-source data fusion and ensemble learning algorithms for rapid damage mapping tasks. A damage classification framework is introduced and tested to categorize the building damage following the recent 2018 Sulawesi earthquake and tsunami. Three robust ensemble learning classifiers were investigated for recognizing building damage from SAR and optical remote sensing datasets and their derived features. The contribution of each feature dataset was also explored, considering different combinations of sensors as well as their temporal information. SAR scenes acquired by the ALOS-2 PALSAR-2 and Sentinel-1 sensors were used. The optical Sentinel-2 and PlanetScope sensors were also included in this study. A non-local filter in the preprocessing phase was used to enhance the SAR features. Our results demonstrated that the canonical correlation forests classifier performs better in comparison to the other classifiers. In the data fusion analysis, DEM- and SAR-derived features contributed the most in the overall damage classification. Our proposed mapping framework successfully classifies four levels of building damage (with overall accuracy > 90%, average accuracy > 67%). The proposed framework learned the damage patterns from a limited available human-interpreted building damage annotation and expands this information to map a larger affected area. This process including pre- and post-processing phases were completed in about 3 hours after acquiring all raw datasets.
Abstract: The mangrove ecosystem plays a vital role in the global carbon cycle, by reducing greenhouse gas emissions and mitigating the impacts of climate change. However, mangroves have been lost worldwide, resulting in substantial carbon stock losses. Additionally, some aspects of the mangrove ecosystem remain poorly characterized compared to other forest ecosystems due to practical difficulties in measuring and monitoring mangrove biomass and their carbon stocks. Without a quantitative method for effectively monitoring biophysical parameters and carbon stocks in mangroves, robust policies and actions for sustainably conserving mangroves in the context of climate change mitigation and adaptation are more difficult. In this context, remote sensing provides an important tool for monitoring mangroves and identifying attributes such as species, biomass, and carbon stocks. A wide range of studies is based on optical imagery (aerial photography, multispectral, and hyperspectral) and synthetic aperture radar (SAR) data. Remote sensing approaches have been proven effective for mapping mangrove species, estimating their biomass, and assessing changes in their extent. This review provides an overview of the techniques that are currently being used to map various attributes of mangroves, summarizes the studies that have been undertaken since 2010 on a variety of remote sensing applications for monitoring mangroves, and addresses the limitations of these studies. We see several key future directions for the potential use of remote sensing techniques combined with machine learning techniques for mapping mangrove areas and species, and evaluating their biomass and carbon stocks.
Abstract: Blue carbon (BC) ecosystems are an important coastal resource, as they provide a range of goods and services to the environment. They play a vital role in the global carbon cycle by reducing greenhouse gas emissions and mitigating the impacts of climate change. However, there has been a large reduction in the global BC ecosystems due to their conversion to agriculture and aquaculture, overexploitation, and removal for human settlements. Effectively monitoring BC ecosystems at large scales remains a challenge owing to practical difficulties in monitoring and the time-consuming field measurement approaches used. As a result, sensible policies and actions for the sustainability and conservation of BC ecosystems can be hard to implement. In this context, remote sensing provides a useful tool for mapping and monitoring BC ecosystems faster and at larger scales. Numerous studies have been carried out on various sensors based on optical imagery, synthetic aperture radar (SAR), light detection and ranging (LiDAR), aerial photographs (APs), and multispectral data. Remote sensing-based approaches have been proven effective for mapping and monitoring BC ecosystems by a large number of studies. However, to the best of our knowledge, this is the first comprehensive review on the applications of remote sensing techniques for mapping and monitoring BC ecosystems. The main goal of this review is to provide an overview and summary of the key studies undertaken from 2010 onwards on remote sensing applications for mapping and monitoring BC ecosystems. Our review showed that optical imagery, such as multispectral and hyper-spectral data, is the most common for mapping BC ecosystems, while the Landsat time-series are the most widely-used data for monitoring their changes on larger scales. We investigate the limitations of current studies and suggest several key aspects for future applications of remote sensing combined with state-of-the-art machine learning techniques for mapping coastal vegetation and monitoring their extents and changes.
Abstract: The fine resolution of synthetic aperture radar (SAR) images enables the rapid detection of severely damaged areas in the case of natural disasters. Developing an optimal model for detecting damage in multitemporal SAR intensity images has been a focus of research. Recent studies have shown that computing changes over a moving window that clusters neighboring pixels is effective in identifying damaged buildings. Unfortunately, classifying tsunami-induced building damage into detailed damage classes remains a challenge. The purpose of this paper is to present a novel multiclass classification model that considers a high-dimensional feature space derived from several sizes of pixel windows and to provide guidance on how to define a multiclass classification scheme for detecting tsunami-induced damage. The proposed model uses a support vector machine (SVM) to determine the parameters of the discriminant function. The generalization ability of the model was tested on the field survey of the 2011 Great East Japan Earthquake and Tsunami and on a pair of TerraSAR-X images. The results show that the combination of different sizes of pixel windows has better performance for multiclass classification using SAR images. In addition, we discuss the limitations and potential use of multiclass building damage classification based on performance and various classification schemes. Notably, our findings suggest that the detectable classes for tsunami damage appear to differ from the detectable classes for earthquake damage. For earthquake damage, it is well known that a lower damage grade can rarely be distinguished in SAR images. However, such a damage grade is apparently easy to identify from tsunami-induced damage grades in SAR images. Taking this characteristic into consideration, we have successfully defined a detectable three-class classification scheme.
Abstract: We performed interferometric synthetic aperture radar (InSAR) analyses to observe ground displacements and assess damage after the M 6.6 Hokkaido Eastern Iburi earthquake in northern Japan on 6 September 2018. A multitemporal SAR coherence map is extracted from 3-m resolution ascending (track 116) and descending (track 18) ALOS-2 Stripmap datasets to cover the entire affected area. To distinguish damaged buildings associated with liquefaction, three influential parameters from the space-based InSAR results, ground-based LiquickMap (from seismic intensities in Japanese networks) and topographic slope of the study area are considered together in a weighted overlay (WO) analysis, according to prior knowledge of the study area. The WO analysis results in liquefaction potential values that agree with our field survey results. To investigate further, we conducted microtremor measurements at 14 points in Hobetsu, in which the predominant frequency showed a negative correlation with the WO values, especially when drastic coherence decay occurred.
Abstract: Hyperspectral imagery collected from airborne or satellite sources inevitably suffers from spectral variability, making it difficult for spectral unmixing to accurately estimate abundance maps. The classical unmixing model, the linear mixing model (LMM), generally fails to handle this sticky issue effectively. To this end, we propose a novel spectral mixture model, called the augmented linear mixing model (ALMM), to address spectral variability by applying a data-driven learning strategy in inverse problems of hyperspectral unmixing. The proposed approach models the main spectral variability (i.e., scaling factors) generated by variations in illumination or typography separately by means of the endmember dictionary. It then models other spectral variabilities caused by environmental conditions (e.g., local temperature and humidity, atmospheric effects) and instrumental configurations (e.g., sensor noise), as well as material nonlinear mixing effects, by introducing a spectral variability dictionary. To effectively run the data-driven learning strategy, we also propose a reasonable prior knowledge for the spectral variability dictionary, whose atoms are assumed to be low-coherent with spectral signatures of endmembers, which leads to a well-known low-coherence dictionary learning problem. Thus, a dictionary learning technique is embedded in the framework of spectral unmixing so that the algorithm can learn the spectral variability dictionary and estimate the abundance maps simultaneously. Extensive experiments on synthetic and real datasets are performed to demonstrate the superiority and effectiveness of the proposed method in comparison with previous state-of-the-art methods.
Abstract: In this paper, we aim at tackling a general but interesting cross-modality feature learning question in remote sensing community - can a limited amount of highly-discriminative (e.g., hyperspectral) training data improve the performance of a classification task using a large amount of poorly-discriminative (e.g., multispectral) data? Traditional semi-supervised manifold alignment methods do not perform sufficiently well for such problems, since the hyperspectral data is very expensive to be largely collected in a trade-off between time and efficiency, compared to the multispectral data. To this end, we propose a novel semi-supervised cross-modality learning framework, called learnable manifold alignment (LeMA). LeMA learns a joint graph structure directly from the data instead of using a given fixed graph defined by a Gaussian kernel function. With the learned graph, we can further capture the data distribution by graph-based label propagation, which enables finding a more accurate decision boundary. Additionally, an optimization strategy based on the alternating direction method of multipliers (ADMM) is designed to solve the proposed model. Extensive experiments on two hyperspectral-multispectral datasets demonstrate the superiority and effectiveness of the proposed method in comparison with several state-of-the-art methods.
Abstract: Convolutional neural networks (CNN) have attracted tremendous attention in the remote sensing community due to its excellent performance in different domains. Especially for remote sensing scene classification, the CNN based methods have brought a great breakthrough. However, it is not feasible to fully design and train a new CNN model for remote sensing scene classification, as this usually requires a large number of training samples and high computational costs. To alleviate these limitations of fully training a new model, some work attempts to use the pre-trained CNN models as feature extractors to build feature representation of scene images for classification and has achieved impressive results. In this scheme, how to construct feature representation of scene image via the pre-trained CNN model becomes the key process. Existing studies paid little attention to build more discriminative feature representation by exploring the potential benefits of multi-layer features from a single CNN model and different feature representations from multiple CNN models. To this end, this paper presents a fusion strategy to build feature representation of the scene images by integrating multi-layer features of a single pre-trained CNN model, and extends it to a framework of multiple CNN models. For these purposes, a multiscale improved Fisher kernel (MIFK) coding method is used to build feature representation of the scene images on convolutional layers, and a feature fusion approach based on two feature subspace learning methods (PCA/SRKDA and PCA/SRKLPP) is proposed to construct final fused features for scene classification. For validation and comparison purposes, the proposed approaches are evaluated with two challenging high-resolution remote sensing datasets and shows the competitive performance compared with existing state-of-the-art baselines such as fully trained CNN models, fine tuning CNN models and other related works.
Abstract: In this paper, we present the optical image simulation from synthetic aperture radar (SAR) data using deep learning based methods. Two models, i.e., optical image simulation directly from the SAR data and from multi-temporal SAR-optical data, are proposed to testify the possibilities. The deep learning based methods that we chose to achieve the models are a convolutional neural network (CNN) with a residual architecture and a conditional generative adversarial network (cGAN). We validate our models using the Sentinel-1 and -2 datasets. The experiments demonstrate that the model with multi-temporal SAR-optical data can successfully simulate the optical image, meanwhile, the model with simple SAR data as input failed. The optical image simulation results indicate the possibility of SAR-optical information blending for the subsequent applications such as large-scale cloud removal, and optical data temporal super-resolution. We also investigate the sensitivity of the proposed models against the training samples, and reveal possible future directions.
Abstract: To adequately represent the nonlinearities in the high-dimensional feature space for hyperspectral images (HSIs), we propose a multiple kernel collaborative representation-based classifier (CRC) in this paper. Extended morphological profiles are first extracted from the original HSIs, because they can efficiently capture the spatial and spectral information. In the proposed method, a novel multiple kernel learning (MKL) model is embedded into CRC. Multiple kernel patterns, e.g., Naive, Multimetric, and Multiscale are adopted for the optimal set of basic kernels, which are helpful to capture the useful information from different pixel distributions, kernel metric spaces, and kernel scales. To learn an optimal linear combination of the predefined basic kernels, we add an extra training stage to the typical CRC where kernel weights are jointly learned with the representation coefficients from the training samples by minimizing the representation error. Moreover, by considering different contributions of dictionary atoms, the adaptive representation strategy is applied to the MKL framework via a dissimilarity-weighted regularizer to obtain a more robust representation of test pixels in the fused kernel space. Experimental results on three real HSIs confirm that the proposed classifiers outperform the other state-of-the-art representation-based classifiers.
Abstract: In this study, we used fifty-six synthetic aperture radar (SAR) images acquired from the Sentinel-1 C-band satellite with a regular period of 12 days (except for one image) to produce sequential phase correlation (sequential coherence) maps for the town of Sarpole-Zahab in western Iran, which experienced a magnitude 7.3 earthquake on 12 November 2017. The preseismic condition of the buildings in the town was assessed based on a long sequential SAR coherence (LSSC) method, in which we considered 55 of the 56 images to produce a coherence decay model with climatic and temporal parameters. The coseismic condition of the buildings was assessed with 3 later images and normalized RGB visualization using the short sequential SAR coherence (SSSC) method. Discriminant analysis between the completely collapsed and uncollapsed buildings was also performed for approximately 700 randomly selected buildings (for each category) by considering the heights of the buildings and the SSSC results. Finally, the area and volume of debris were calculated based on a fusion of a discriminant map and a 3D vector map of the town.
Abstract: Imaging spectroscopy (IS), also commonly known as hyperspectral remote sensing, is a powerful remote sensing technique for the monitoring of the Earth’s surface and atmosphere. Pixels in optical hyperspectral images consist of continuous reflectance spectra formed by hundreds of narrow spectral channels, allowing an accurate representation of the surface composition through spectroscopic techniques. However, technical constraints in the definition of imaging spectrometers make spectral coverage and resolution to be usually traded by spatial resolution and swath width, as opposed to optical multispectral (MS) systems typically designed to maximize spatial and/or temporal resolution. This complementarity suggests that a synergistic exploitation of spaceborne IS and MS data would be an optimal way to fulfill those remote sensing applications requiring not only high spatial and temporal resolution data, but also rich spectral information. On the other hand, IS has been shown to yield a strong synergistic potential with non-optical remote sensing methods, such as thermal infrared (TIR) and light detection and ranging (LiDAR). In this contribution we review theoretical and methodological aspects of potential synergies between optical IS and other remote sensing techniques. The focus is put on the evaluation of synergies between spaceborne optical IS and MS systems because of the expected availability of the two types of data in the next years. Short reviews of potential synergies of IS with TIR and LiDAR measurements are also provided.
Abstract: Concerning the strengths and limitations of multispectral and airborne LiDAR data, the fusion of such datasets can compensate for the weakness of each other. This work have investigated the integration of multispectral and airborne LiDAR data for the land cover mapping of large urban area. Different LiDAR-derived features are involoved, including height, intensity, and multiple-return features. However, there is limited knowledge relating to the integration of multispectral and LiDAR data including three feature types for the classification task. Furthermore, a little contribution has been devoted to the relative importance of input features and the impact on the classification uncertainty by using multispectral and LiDAR. The key goal of this study is to explore the potenial improvement by using both multispectral and LiDAR data and to evaluate the importance and uncertainty of input features. Experimental results revealed that using the LiDAR-derived height features produced the lowest classification accuracy (83.17%). The addition of intensity information increased the map accuracy by 3.92 percentage points. The accuracy was further improved to 87.69% with the addition multiple-return features. A SPOT-5 image produced an overall classification accuracy of 86.51%. Combining spectral and spatial features increased the map accuracy by 6.03 percentage points. The best result (94.59%) was obtained by the combination of SPOT-5 and LiDAR data using all available input variables. Analysis of feature relevance demonstrated that the normalized digital surface model (nDSM) was the most beneficial feature in the classification of land cover. LiDAR-derived height features were more conducive to the classification of urban area as compared to LiDAR-derived intensity and multiple-return features. Selecting only 10 most important features can result in higher overall classification accuracy than all scenarios of input variables except the feature of entry scenario using all available input features. The variable importance varied a very large extent in the light of feature importance per land cover class. Results of classification uncertainty suggested that feature combination can tend to decrease classification uncertainty for different land cover classes, but there is no “one-feature-combination-fits-all” solution. The values of classification uncertainty exhibited significant differences between the land cover classes, and extremely low uncertainties were revealed for the water class. However, it should be noted that using all input variables resulted in relatively lower classification uncertainty values for most of the classes when compared to other input features scenarios.
Abstract: Due to the development of sensors and data acquisition technology, the fusion of features from multiple sensors is a very hot topic. In this letter, the use of morphological features to fuse an HS image and a light detection and ranging (LiDAR)-derived digital surface model (DSM) is exploited via an ensemble classifier. In each iteration, we first apply morphological openings and closings with partial reconstruction on the first few principal components (PCs) of the HS and LiDAR datasets to produce morphological features to model spatial and elevation information for HS and LiDAR datasets. Second, three groups of features (i.e., spectral, morphological features of HS and LiDAR data) are split into several disjoint subsets. Third, data transformation is applied to each subset and the features extracted in each subset are stacked as the input of a random forest (RF) classifier. Three data transformation methods, including principal component analysis (PCA), linearity preserving projection (LPP), and unsupervised graph fusion (UGF) are introduced into the ensemble classification process. Finally, we integrate the classification results achieved at each step by a majority vote. Experimental results on co-registered HS and LiDAR-derived DSM demonstrate the effectiveness and potentialities of the proposed ensemble classifier.
Abstract: This paper proposes a groundbreaking approach in the remote sensing community to simulating digital surface model (DSM) from a single optical image. This novel technique uses conditional generative adversarial nets whose architecture is based on an encoder-decoder network with skip connections (generator) and penalizing structures at the scale of image patches (discriminator). The network is trained on scenes where both DSM and optical data are available to establish an image-to-DSM translation rule. The trained network is then utilized to simulate elevation information on target scenes where no corresponding elevation information exists. The capability of the approach is evaluated both visually (in terms of photo interpretation) and quantitatively (in terms of reconstruction errors and classification accuracies) on sub-decimeter spatial resolution datasets captured over Vaihingen, Potsdam, and Stockholm. The results confirm the promising performance of the proposed framework.
Abstract: In this paper, we present the scientific outcomes of the 2017 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society. The 2017 Contest was aimed at addressing the problem of local climate zones classification based on a multitemporal and multimodal dataset, including image (Landsat 8 and Sentinel-2) and vector data (from OpenStreetMap). The competition, based on separate geographical locations for the training and testing of the proposed solution, aimed at models that were accurate (assessed by accuracy metrics on an undisclosed reference for the test cities), general (assessed by spreading the test cities across the globe), and computationally feasible (assessed by having a test phase of limited time). The techniques proposed by the participants to the Contest spanned across a rather broad range of topics, and of mixed ideas and methodologies deriving from computer vision and machine learning but also deeply rooted in the specificities of remote sensing. In particular, rigorous atmospheric correction, the use of multidate images, and the use of ensemble methods fusing results obtained from different data sources/time instants made the difference.