Technologies and Developments for Earth Observations Data Analysis and Visualisation Volumes of Data Pattern? Mining Information Uttam Kumar Centre for Ecological Sciences, Indian Institute of Science, Bangalore- 560012. Email: uttam@ces.iisc.ernet.in
10th January, 2013
Agenda
1.) Some new image classification techniques for handling coarse resolution data for LCLU applications: the mixed pixel problem; Hybrid Bayesian Classifier
2.) Development of Free and Open Source Software: GRDSS for geospatial applications
3.) Web based application for geovisualisation
4.) LCLUC initiatives for major Indian cities
Some New Image Classification Techniques for handling coarse resolution data for LCLU applications. The mixed pixel problem Hybrid Bayesian Classifier
Some Major RS Satellites for LCLU Applications

| Time scale | Satellite / Source | Sensor | Spectral bands | Spatial resolution | Temporal resolution |
|---|---|---|---|---|---|
| 1972 to 1999 | Landsat-1, 5, and 7 | MSS, TM, ETM+ | PAN, VIS, NIR, MIR, TIR | 15 m to 120 m (moderate spatial resolution) | 16-18 days (free) |
| 1988 to 2010 | IRS-1C/1D, P6 | PAN, LISS-III | PAN, VIS-2, NIR-1 (low spectral resolution) | 5.8 m to 23.5 m (high to moderate spatial resolution) | 24 days (medium cost, moderate temporal resolution) |
| 1999 to date | IKONOS | OSA | PAN, VIS-3, NIR-1 | 1 m (PAN), 4 m (others) (high spatial resolution) | 1-3 days (costly) |
| : | : | : | : | : | : |
| 1999 to date | MODIS (Terra, Aqua) | | VIS, NIR, MIR, TIR: 36 bands (high spectral resolution) | 250 m to 1 km (low spatial resolution) | 1-2 days (free and high temporal resolution) |
| 2002 | SRTM (Shuttle Radar Topography Mission) | Radar | DEM-1 | 90 m | 1 time (free) |
| 2002 | Hydro 1K Asia | --- | Precipitation, Slope, Aspect-1 | 1 km | 1 time (free) |
How do we handle coarse resolution data?
Develop techniques for deriving information from coarse spatial resolution data (such as MODIS).
Research Questions
1.) What are the techniques to obtain class proportions from mixed pixels?
2.) What are the ways of identifying/extracting endmembers from the bands?
3.) How can mixed pixels be addressed when the objects' reflectances are non-linear mixtures in nature?
4.) How can we address the intra-class spectral variation or endmember variability?
5.) How can we predict the spatial distribution of class abundances at sub-pixel resolution within a particular pixel, using abundances obtained from linear/non-linear mixture models?
Linear unmixing

$$y(x, y) = \sum_{n=1}^{N} e_n \, \alpha_n(x, y) + \eta, \qquad y = E\alpha + \eta$$

In the illustrated pixel, n = a, b, c. $\alpha_n(x, y)$ is a scalar value representing the fractional coverage of endmember vector $e_n$ at pixel $y(x, y)$.

Constraints:
1.) $\alpha_n \ge 0, \ \forall n: 1 \le n \le N$ (abundance nonnegativity constraint)
2.) $\sum_{n=1}^{N} \alpha_n = 1$ (abundance sum-to-one constraint)

This can be solved in two ways:
1. Ordinary least squares
2. Orthogonal subspace projection
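Ordinary Least Squares
The conventional approach to extract the abundance values is to minimise $\|y - E\alpha\|^2$. The Unconstrained Least Squares (ULS) estimate of the abundance is

$$\hat{\alpha}_{ULS} = (E^T E)^{-1} E^T y$$

Imposing the unity constraint $\mathbf{1}^T \alpha = 1$ on the abundance values while minimising $\|y - E\alpha\|^2$ gives the Constrained Least Squares (CLS) estimate of the abundance as

$$\hat{\alpha}_{CLS} = (E^T E)^{-1}\left(E^T y - \frac{\lambda}{2}\mathbf{1}\right), \qquad \lambda = \frac{2\left(\mathbf{1}^T (E^T E)^{-1} E^T y - 1\right)}{\mathbf{1}^T (E^T E)^{-1} \mathbf{1}}$$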
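The ULS and CLS estimates above can be sketched in NumPy as follows. This is a minimal illustration, assuming a synthetic endmember matrix and abundance vector; the function names `unmix_uls` and `unmix_cls` and all numeric values are hypothetical and not taken from the study. Only the sum-to-one constraint is imposed, so individual abundances may still be negative.

```python
import numpy as np

def unmix_uls(E, y):
    """Unconstrained least squares: alpha = (E^T E)^-1 E^T y."""
    return np.linalg.solve(E.T @ E, E.T @ y)

def unmix_cls(E, y):
    """Least squares with the sum-to-one constraint (Lagrange multiplier)."""
    G_inv = np.linalg.inv(E.T @ E)
    ones = np.ones(E.shape[1])
    alpha_uls = G_inv @ E.T @ y
    # lambda = 2 (1^T alpha_uls - 1) / (1^T (E^T E)^-1 1)
    lam = 2.0 * (ones @ alpha_uls - 1.0) / (ones @ G_inv @ ones)
    return alpha_uls - 0.5 * lam * (G_inv @ ones)

# Synthetic example (assumed values): 6 bands, 3 endmembers
rng = np.random.default_rng(0)
E = rng.uniform(0.05, 0.9, size=(6, 3))        # columns are endmember spectra
alpha_true = np.array([0.5, 0.3, 0.2])
y = E @ alpha_true + rng.normal(0.0, 0.01, 6)  # mixed pixel with noise

alpha_uls = unmix_uls(E, y)
alpha_cls = unmix_cls(E, y)
```

The CLS result always sums to one by construction, whereas the ULS result only does so approximately when the noise is small.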
Orthogonal Subspace Projection
The technique involves (i) finding an operator which eliminates undesired spectral signatures, and then (ii) choosing a vector operator which maximises the SNR of the residual spectral signature.

General linear unmixing equation: $r = M\alpha + n$
r = column vector of digital numbers
M = matrix representing the target spectral signatures
α = abundance fraction
n = model error

Separating the desired signature d (with abundance $\alpha_p$) from the undesired signatures U (with abundances γ) gives the (d, U) model:

$$r = d\alpha_p + U\gamma + n$$

To annihilate U, we apply the operator

$$P = I - UU^{\#}$$

to the (d, U) model, which results in a new signal detection model, where $U^{\#} = (U^T U)^{-1} U^T$ is the pseudo-inverse of U.
On applying P to $r = d\alpha_p + U\gamma + n$ we get

$$Pr = Pd\alpha_p + PU\gamma + Pn$$

P operating on $U\gamma$ reduces the contribution of U to about zero, so we get

$$Pr = Pd\alpha_p + Pn$$

On using a linear filter specified by a weight vector $x^T$ on the OSP model, the filter output is given by

$$x^T Pr = x^T Pd\alpha_p + x^T Pn$$

Now, we need to maximise the signal to noise ratio (SNR) of the filter output:

$$SNR(x) = \frac{x^T P d \, \alpha_p^2 \, d^T P^T x}{x^T P \, E\{nn^T\} \, P^T x} = \frac{\alpha_p^2}{\sigma^2} \cdot \frac{x^T P d d^T P^T x}{x^T P P^T x}$$

Maximisation of this is a generalized eigenvalue-eigenvector problem

$$P d d^T P^T x = \tilde{\lambda} \, P P^T x, \qquad \tilde{\lambda} = \lambda(\sigma^2 / \alpha_p^2)$$

with $P^2 = P$ and $P^T = P$. The eigenvector which has the maximum λ is the solution of the problem and it turns out to be d.
One of the eigenvalues is $d^T P d$ and it turns out that the value of $x^T$ (filter) which maximises the SNR is $x^T = k d^T$.

Applying $d^T P$ to $Pr = Pd\alpha_p + Pn$ (obtained by applying P on $r = d\alpha_p + U\gamma + n$) gives

$$d^T P P r = d^T P P d \, \alpha_p + d^T P P n$$

so that

$$\alpha_p = \frac{d^T P r}{d^T P d}$$

is the abundance estimate of the pth target material.
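The OSP abundance estimate can be sketched in a few lines of NumPy. This is an illustrative sketch under assumed synthetic signatures; the function name `osp_abundance` and the data are hypothetical. `np.linalg.pinv` is used for the pseudo-inverse of U.

```python
import numpy as np

def osp_abundance(r, d, U):
    """Estimate alpha_p = d^T P r / (d^T P d), with P = I - U U^#."""
    P = np.eye(U.shape[0]) - U @ np.linalg.pinv(U)  # annihilates the columns of U
    return float(d @ P @ r) / float(d @ P @ d)

# Synthetic example (assumed values): 6 bands, 3 signatures
rng = np.random.default_rng(1)
M = rng.uniform(0.05, 0.9, size=(6, 3))
d, U = M[:, 0], M[:, 1:]            # desired signature and undesired signatures
alpha = np.array([0.6, 0.3, 0.1])   # alpha_p = 0.6 for the desired target
r = M @ alpha                       # noise-free mixed pixel

alpha_p = osp_abundance(r, d, U)
```

In the noise-free case the estimate recovers the abundance of the desired target exactly, because P removes every contribution of U before the projection onto d.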
FCC of the study area from (a) IKONOS (PAN and MS fused), (b) IKONOS MS, (c) Landsat ETM+ and (d) MODIS.
Remote sensing data sets used for validating CLS and OSP algorithms

| Data | Spectral bands | Spatial resolution | Dimension | 2 classes | 3 classes | 4 classes |
|---|---|---|---|---|---|---|
| IKONOS PAN and MS fused | 4 | 1 m | 8000 x 8000 | vegetation, nonvegetation | urban, vegetation, water | urban, vegetation, water, open area |
| IKONOS | 4 | 4 m | 2000 x 2000 | vegetation, nonvegetation | urban, vegetation, water | --- |
| Landsat | 6 | 30 m resampled to 25 m | 320 x 320 | vegetation, nonvegetation | urban, vegetation, water | urban, vegetation, water, open area |
| MODIS | 7 | 250 m | 32 x 32 | vegetation, nonvegetation | urban, vegetation, water | urban, vegetation, water, open area |
Unmixed outputs from CLS and OSP for 2 classes (vegetation, non-vegetation), and 3 classes (urban, vegetation and water) from IKONOS MS data.
Results
Correlation and RMSE for IKONOS, Landsat ETM+ and MODIS images for 2, 3 and 4 classes. Abundances obtained from OSP are better than those obtained from CLS.
Endmember Selection
Proportion based endmember estimation (PBEE)
The rationale behind the new method is that given the spectral reflectance $y^m_{i,j}$ of the mixed pixel, if the proportions of all the endmembers ($\alpha^n_{i,j}$; n = 1 to N) in that pixel are known, then the spectral reflectance of each endmember that constitutes the mixed pixel can be approximated by inverting the LMM

$$y = E\alpha + \eta$$

The endmember estimate for each band turns out to be

$$E = [\alpha^T \alpha]^{-1} (\alpha^T y)$$
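For band 1, the LMM is written for every pixel from (1,1) to (r,c):

$$\alpha^1_{1,1} e^1_1 + \alpha^2_{1,1} e^2_1 + \dots + \alpha^n_{1,1} e^n_1 + \dots + \alpha^N_{1,1} e^N_1 + \eta^1_{1,1} = y^1_{1,1}$$
$$\alpha^1_{1,2} e^1_1 + \alpha^2_{1,2} e^2_1 + \dots + \alpha^n_{1,2} e^n_1 + \dots + \alpha^N_{1,2} e^N_1 + \eta^1_{1,2} = y^1_{1,2}$$
$$\vdots$$
$$\alpha^1_{i,j} e^1_1 + \alpha^2_{i,j} e^2_1 + \dots + \alpha^n_{i,j} e^n_1 + \dots + \alpha^N_{i,j} e^N_1 + \eta^1_{i,j} = y^1_{i,j}$$
$$\vdots$$
$$\alpha^1_{r,c} e^1_1 + \alpha^2_{r,c} e^2_1 + \dots + \alpha^n_{r,c} e^n_1 + \dots + \alpha^N_{r,c} e^N_1 + \eta^1_{r,c} = y^1_{r,c}$$

This is done for each band separately. For band m,

$$y^m_{i,j} = \sum_{n=1}^{N} \left( \alpha^n_{i,j} \, e^n_m \right) + \eta^m_{i,j}$$

In matrix form:

$$\begin{bmatrix} \alpha^1_{1,1} & \alpha^2_{1,1} & \cdots & \alpha^N_{1,1} \\ \alpha^1_{1,2} & \alpha^2_{1,2} & \cdots & \alpha^N_{1,2} \\ \vdots & \vdots & & \vdots \\ \alpha^1_{r,c} & \alpha^2_{r,c} & \cdots & \alpha^N_{r,c} \end{bmatrix} \begin{bmatrix} e^1_m \\ e^2_m \\ \vdots \\ e^N_m \end{bmatrix} = \begin{bmatrix} y^m_{1,1} \\ y^m_{1,2} \\ \vdots \\ y^m_{r,c} \end{bmatrix}$$

$$E = [\alpha^T \alpha]^{-1} (\alpha^T y)$$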
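The inversion of the LMM for endmembers, band by band, amounts to a least squares solve with the abundance matrix as the design matrix. The sketch below assumes synthetic abundances and spectra; the function name `pbee` and the array shapes are illustrative choices, not the authors' implementation. `np.linalg.lstsq` solves all bands at once, which is equivalent to applying $[\alpha^T \alpha]^{-1} \alpha^T y$ to each band separately.

```python
import numpy as np

def pbee(alpha, Y):
    """Estimate endmember spectra from known per-pixel proportions.

    alpha: (pixels, N) abundances, Y: (pixels, bands) reflectances.
    Returns E_hat with shape (N, bands).
    """
    E_hat, *_ = np.linalg.lstsq(alpha, Y, rcond=None)
    return E_hat

# Synthetic example (assumed values): 100 pixels, 3 endmembers, 7 bands
rng = np.random.default_rng(3)
alpha = rng.dirichlet(np.ones(3), size=100)     # rows are nonnegative and sum to one
E_true = rng.uniform(0.05, 0.9, size=(3, 7))
Y = alpha @ E_true                              # noise-free mixed pixels

E_hat = pbee(alpha, Y)
```

With noise-free data and more pixels than endmembers, the estimated spectra match the ones used to generate the mixtures.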
To compare the performance of PBEE, three endmember identification methods were used:
1. a fully automatic endmember extraction technique: N-FINDR,
2. a supervised interactive technique: a combination of N-Dimensional Visualisation and Scatter Plot, and
3. an unsupervised, semi-automatic technique: interpreting cluster means as endmembers.
Scatter plots for various band combinations. 5-Dimensional Visualisation of the 6 classes.
Abundance maps for the 6 classes: row 1, original abundances obtained from the LISS-III classified map resampled to the MODIS image size; row 2, PBEE; row 3, N-FINDR; row 4, N-Dimensional Visualisation; row 5, K-Means clustering.
Endmember behaviour for the 6 classes (a) to (f) in 7 bands for various techniques (legend includes PBEE and N-FINDR).
Findings
From CC and RMSE, it is concluded that inversion of the LMM can provide a better estimate than the other automatic, supervised interactive and semi-automatic methods.
Shortcoming: abundances must be available per class from a high resolution classified image of the same time frame as the low spatial resolution image, with detailed ground information.
Non-linear Mixture Model Kumar, U., S. Kumar Raja, Mukhopadhyay, C., and Ramachandra T. V., (2011), A Multi-layer Perceptron based Non-linear Mixture Model to estimate class abundance from mixed pixels, Proceedings of the 2011 IEEE Students Technology Symposium, Indian Institute of Technology, Kharagpur, India, 14-16 January, 2011, Abstract page Number 31, Track 4 Image and Multi-dimensional Signal Processing.
NLMM accounts for interactions among the ground cover materials (multiple reflections among the materials on the surface). Also accounts for topographic features (slope) of the ground surface. The Sun-atmosphere-ground paths (tree represented by an ellipse).
Non-linear Mixture Model y = f ( E, α) + η where, f is an unknown non-linear function that defines the interaction between E and α.
The activation rule used here for the hidden and output layer nodes is defined by the logistic function

$$f(x) = \frac{1}{1 + e^{-x}}$$

Architecture of the MLP model. Structural diagram of the MLP.
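A minimal sketch of the logistic activation inside a forward pass of an MLP with one hidden layer. The layer sizes, the random (untrained) weights and the function names are assumptions for illustration only; they do not reproduce the architecture or training used in the study.

```python
import numpy as np

def logistic(x):
    """Logistic activation f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(y, W1, b1, W2, b2):
    """Forward pass with logistic activation on hidden and output nodes."""
    hidden = logistic(W1 @ y + b1)
    return logistic(W2 @ hidden + b2)

# Hypothetical sizes: 7 input bands, 5 hidden nodes, 6 output classes
rng = np.random.default_rng(2)
n_bands, n_hidden, n_classes = 7, 5, 6
W1 = rng.normal(size=(n_hidden, n_bands))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_classes, n_hidden))
b2 = np.zeros(n_classes)

pixel = rng.uniform(0.0, 1.0, size=n_bands)   # one mixed-pixel spectrum
outputs = mlp_forward(pixel, W1, b1, W2, b2)  # one value in (0, 1) per class
```

Because the output nodes also use the logistic function, each output lies strictly between 0 and 1, which suits abundance-like targets; the outputs are not forced to sum to one in this sketch.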
Simulated Data Set A 200 band hyperspectral images generated from spectral libraries of four different minerals - (a) band 1 (b) band 100 (c) band 200. 4 y(x,y) sign sn(x,y) n=1 sig n is the signature corresponding to n th mineral, (x,y) log(1 α (x,y)) where s n n is the contribution of endmember e n and α n (x,y)is the fractional abundance of e n in the pixel at (x,y). Mineral classified map.
BDFs of simulated test data obtained from LMM and NLMM.
Abundance details of four minerals obtained from the LMM and NLMM.
NLMM on MODIS Data Set
Abundances of six categories from NLMM. (a) LISS-3 classified map resampled to 100 x 100 pixels, (b) agriculture, (c) builtup / settlement, (d) forest, (e) plantation / orchard, (f) waste / barren land, (g) water bodies.
BDFs of MODIS test data from NLMM. (a) agriculture, (b) builtup / settlement, (c) forest, (d) plantation / orchard, (e) waste / barren land, (f) water bodies.

Correlation and RMSE between actual and predicted proportions (p < 2.2e-16 for all correlations).

| Classes | Correlation (r), LMM | Correlation (r), NLMM | RMSE, LMM | RMSE, NLMM |
|---|---|---|---|---|
| Agriculture | 0.6730 | 0.9110 | 0.0518 | 0.0271 |
| Builtup / Settlement | 0.6390 | 0.9345 | 1.0519 | 0.0083 |
| Forest | 0.7310 | 0.9411 | 0.0257 | 0.0062 |
| Plantation / Orchard | 0.6990 | 0.9447 | 0.0280 | 0.0061 |
| Waste / Barren land | 0.6599 | 0.9342 | 0.0431 | 0.0073 |
| Water bodies | 0.7799 | 0.9855 | 0.0061 | 0.0016 |
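The two validation measures reported here, the correlation coefficient r and the RMSE between actual and predicted proportions, can be computed as in the following sketch. The function name and the sample values are hypothetical; the arrays stand in for a reference abundance map and an unmixed abundance map of the same class.

```python
import numpy as np

def correlation_and_rmse(actual, predicted):
    """Pearson correlation and root mean square error between two maps."""
    a = np.asarray(actual, dtype=float).ravel()
    p = np.asarray(predicted, dtype=float).ravel()
    r = np.corrcoef(a, p)[0, 1]
    rmse = np.sqrt(np.mean((a - p) ** 2))
    return r, rmse

# Hypothetical proportions for one class over four pixels
actual = np.array([0.1, 0.2, 0.3, 0.4])
predicted = actual + 0.1   # a constant bias: perfect correlation, non-zero RMSE

r, rmse = correlation_and_rmse(actual, predicted)
```

The example shows why both measures are reported together: a biased estimate can correlate perfectly with the reference and still have a sizeable RMSE.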
Error distribution of MODIS abundance obtained from NLMM (X and Y axes are the two dimensions in feature space and Z axis is the absolute difference between real and estimated class proportion) for the six classes.
Findings
Computer simulated data: overall RMSE was 0.0089±0.00215 with the LMM and 0.0030±0.0001 with the NLMM when compared to actual class proportions.
Unmixed MODIS images: overall RMSE of the NLMM was 0.0191±0.022 as compared to 0.2005±0.41 for the LMM, indicating that the individual class abundances obtained from the NLMM are very close to what is present on the ground and observed in the high resolution classified image.
On which side of the pixel is the class situated? Unmixed abundance map of builtup.
Pixel Swapping Algorithm Kumar, U., Mukhopadhyay, C., Kumar Raja S., and Ramachandra T. V., (2008), Soft classification based Sub-pixel allocation model, International Conference on Operations Research for a growing nation in conjunction with the 41st Annual Convention of Operational Research Society of India, Tirupati, AP, India, 15-17 December, 2008.
The pixel swapping algorithm can increase the resolution of the OSP output from 136 x 140 to 1360 x 1400.
The swapping algorithm:
1. Requires some spatial correlation between pixels.
2. Maximises the autocorrelation between the pixels of the image.
3. Takes the abundance output and transforms it into a hard LC class map defined at the sub-pixel scale.
Limitation: it only allows the mapping of hard binary LC (target, non-target) classes.
Atkinson, P. M., 2005, Sub-pixel target mapping from soft-classified, remotely sensed imagery. Photogrammetric Engineering & Remote Sensing, 71(7), pp. 839-846.
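A simplified sketch of a pixel swapping procedure in the spirit of the steps listed above: sub-pixels are first allocated at random according to each coarse pixel's abundance, and then, within each coarse pixel, the least attractive target sub-pixel is swapped with the most attractive non-target sub-pixel whenever that raises spatial autocorrelation. The exponential distance weighting, the parameter names (`radius`, `a`, `iterations`) and the test abundances are assumptions made for this illustration.

```python
import numpy as np

def attractiveness(z, radius, a):
    """Distance-weighted sum of neighbouring sub-pixel values, weight exp(-h / a)."""
    padded = np.pad(z.astype(float), radius)
    out = np.zeros(z.shape)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            h = np.hypot(dy, dx)
            if h == 0 or h > radius:
                continue
            window = padded[radius + dy: radius + dy + z.shape[0],
                            radius + dx: radius + dx + z.shape[1]]
            out += np.exp(-h / a) * window
    return out

def pixel_swap(abundance, scale, radius=3, a=3.0, iterations=50, seed=0):
    """Turn a coarse abundance map into a binary map that is `scale` times finer."""
    rng = np.random.default_rng(seed)
    H, W = abundance.shape
    z = np.zeros((H * scale, W * scale), dtype=int)
    # Random initial allocation that honours each coarse pixel's proportion
    for i in range(H):
        for j in range(W):
            k = int(round(abundance[i, j] * scale * scale))
            block = np.zeros(scale * scale, dtype=int)
            block[rng.choice(scale * scale, size=k, replace=False)] = 1
            z[i * scale:(i + 1) * scale, j * scale:(j + 1) * scale] = block.reshape(scale, scale)
    for _ in range(iterations):
        A = attractiveness(z, radius, a)
        swapped = False
        for i in range(H):
            for j in range(W):
                sl = (slice(i * scale, (i + 1) * scale), slice(j * scale, (j + 1) * scale))
                zb, Ab = z[sl], A[sl]          # views into the fine grid
                if zb.all() or not zb.any():
                    continue                   # pure pixel: nothing to swap
                worst_one = tuple(np.argwhere(zb == 1)[np.argmin(Ab[zb == 1])])
                best_zero = tuple(np.argwhere(zb == 0)[np.argmax(Ab[zb == 0])])
                if Ab[best_zero] > Ab[worst_one]:
                    zb[worst_one], zb[best_zero] = 0, 1
                    swapped = True
        if not swapped:
            break                              # converged
    return z

# Hypothetical 3 x 3 abundance map: target on the left, mixed pixels in the middle
abundance = np.array([[1.0, 0.4, 0.0]] * 3)
fine_map = pixel_swap(abundance, scale=5)
```

Swaps happen only inside a coarse pixel, so the number of target sub-pixels per coarse pixel, and therefore the original abundance, is preserved throughout.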
Sub-pixel mapping of a linear feature and a circle: (A) Test image (line), (B) Abundance, (C) Random allocation, (D) After convergence, (E) Test image (circle), (F) Abundance, (G) Random allocation, (H) Converged map. The number of nearest neighbours was set to 3, and the non-linear parameter α of the exponential model was also set to 3. The overall accuracy is 99.97 for the line and 99.94 for the circle.
PS on MODIS image
Panels: LISS-III classified (25 m), LISS-III (25 m), MODIS abundance (250 m), PS MODIS (25 m).
(A) Builtup pixels shown in black and non-builtup shown in white, (B) Sub-pixel map of builtup, (C) Converged map of builtup after applying the PS algorithm.
Accuracy
Sensitivity: 0.6 (proportion of actual positives which are correctly identified)
Specificity: 0.69 (proportion of actual negatives which are correctly identified)
PPV: 0.6 (proportion of predicted positives which are correct)
NPV: 0.69 (proportion of predicted negatives which are correct)
With the ground truth, the accuracy was 76.6%.
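The four measures follow directly from the counts of a binary confusion matrix, as the sketch below shows. The counts used here are hypothetical and are not the confusion matrix of the study; they are only chosen to illustrate the formulas.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy measures for a binary (target, non-target) map."""
    return {
        "sensitivity": tp / (tp + fn),   # actual positives correctly identified
        "specificity": tn / (tn + fp),   # actual negatives correctly identified
        "ppv": tp / (tp + fp),           # predicted positives that are correct
        "npv": tn / (tn + fn),           # predicted negatives that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts of sub-pixels (not taken from the study)
metrics = binary_metrics(tp=60, fp=40, tn=90, fn=40)
```

Sensitivity and specificity are computed along the rows of actual classes, while PPV and NPV are computed along the predicted classes, which is why the pairs can differ when the classes are unbalanced.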
Hybrid Bayesian Classifier Kumar, U., Kumar Raja S., Mukhopadhyay, C., and Ramachandra T. V., (2011), Hybrid Bayesian Classifier for Improved Classification Accuracy. IEEE Geoscience and Remote Sensing Letters, vol. 8, no. 3, pp. 473-476.
In HBC, the class prior probabilities are determined by unmixing supplementary low spatial, high spectral resolution multi-spectral (MS) data. These priors are then assigned to every pixel of the high spatial, low spectral resolution MS data in Bayesian classification. Hybrid Bayesian Classifier.
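The idea of using unmixed abundances as per-pixel priors can be sketched as follows. This illustration assumes Gaussian class-conditional likelihoods, which the slide does not specify, and assumes the abundances have already been resampled so that every high resolution pixel carries the prior vector of the coarse pixel it falls in. All names and values are hypothetical.

```python
import numpy as np

def gaussian_log_likelihood(X, mean, cov):
    """Log multivariate normal density for each row of X (pixels x bands)."""
    diff = X - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
    return -0.5 * (maha + logdet + X.shape[1] * np.log(2.0 * np.pi))

def hybrid_bayes_classify(X, means, covs, priors):
    """Maximum a posteriori class per pixel, with per-pixel priors.

    X: (pixels, bands); priors: (pixels, classes), e.g. unmixed abundances.
    """
    log_like = np.stack(
        [gaussian_log_likelihood(X, m, c) for m, c in zip(means, covs)], axis=1
    )
    log_post = log_like + np.log(np.clip(priors, 1e-12, None))
    return np.argmax(log_post, axis=1)

# Hypothetical two-class, two-band example
means = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
covs = [np.eye(2), np.eye(2)]
X = np.array([[2.0, 2.0], [2.0, 2.0], [0.0, 0.0]])
priors = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])

labels = hybrid_bayes_classify(X, means, covs, priors)
```

The first two pixels have identical spectra midway between the class means, so the likelihoods tie and the abundance-derived prior decides the label; this is the situation in which a spatially varying prior helps most.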
Results

Accuracy assessment for LISS-III data (IRS LISS-III MS and MODIS)

| Class | Bayesian classifier PA* | Bayesian classifier UA* | HBC PA* | HBC UA* |
|---|---|---|---|---|
| Agriculture | 87.54 | 87.47 | 90.15 | 95.56 |
| Builtup | 85.11 | 81.68 | 89.39 | 98.33 |
| Forest | 85.71 | 88.73 | 92.61 | 96.36 |
| Plantation | 84.44 | 91.73 | 95.95 | 91.03 |
| Waste land | 88.03 | 90.37 | 98.67 | 89.66 |
| Water bodies | 90.91 | 88.89 | 88.18 | 97.00 |
| Average | 86.96 | 88.15 | 92.49 | 94.66 |

Accuracy assessment for IKONOS data (IKONOS MS and Landsat MS)

| Class | Bayesian classifier PA | Bayesian classifier UA | HBC PA | HBC UA |
|---|---|---|---|---|
| Concrete roofs | 69.99 | 84.01 | 76.49 | 93.89 |
| Asbestos roofs | 84.77 | 87.77 | 91.89 | 94.46 |
| Vegetation | 94.21 | 87.55 | 87.24 | 89.13 |
| Blue plastic roof | 84.33 | 81.17 | 97.00 | 85.60 |
| Open area | 51.49 | 69.49 | 95.00 | 74.22 |
| Average | 76.96 | 81.99 | 89.52 | 87.46 |
Findings
Overall accuracy increased by 6% with IRS LISS-III MS and MODIS, and by 9% with IKONOS MS and Landsat MS, as compared to the conventional Bayesian classifier.
Free and Open Source Tools for Geoinformatics http://wgbis.ces.iisc.ernet.in/foss
GRASS GIS
GRASS (Geographic Resources Analysis Support System) is free GIS software used for geospatial data management and analysis, image processing, graphics/maps production, spatial modelling, and visualization. It is one of the world's biggest open source projects and an official project of the Open Source Geospatial (OSGeo) Foundation.
GRASS GIS First GRASS Mirror Site (Tier 1) in India at IISc http://wgbis.ces.iisc.ernet.in/grass
GRASS Wiki: http://grass.osgeo.org/wiki/main_page
GRASS Users Worldwide
GRDSS design and conceptual diagram.
GRDSS data flow diagram.
Functionalities of GRDSS A Quick Look
Applications; User Interface; Platforms: Linux, Handheld; Raster map operations; Vector map operations; Image Processing; LiDAR; Cartography; Web services
FOSS Kiosk http://wgbis.ces.iisc.ernet.in/foss
Web based Resource Information
GIS Layers and Visualisation
Layers: Elevation, LULC, Place names, Roads, Energy, Communication facilities, Anganwadi centres, Educational facilities, Medical facilities, General facilities, Watershed boundaries, Water flow structures, Sacred groves, Canals, rivers, ponds, Streams, Admin boundary.
Front end: Ka-Map from maptools.org (works on Apache, UMN Mapserver, PHP).
Current ongoing project: LCLUC studies of major metropolitan cities of India: A glimpse
Urbanization in 10 Major Indian Cities
Bangalore City Third largest metropolis in India.
We are waiting for the city to come to us
LCLUC in Bangalore 6
2010
Greater Bangalore: 1973, 1992, 1999, 2006, 2010
Urban growth map (A) 1973 to 1992, (B) 1992 to 2000, (C) 2000 to 2006, (D) 2006 to 2010. Diffusive growth Types of urban outlying growth highlighted in box (A) isolated growth, (B) linear branching (road/corridor), (C) clustered growth.
Analysis of Land Surface Temperature 2010 Decreasing Lakes and Parks Urbanising Bangalore
Dividing Bangalore into directional zones (NW, N, NE, W, E, SW, S, SE). Plots of area (ha) against year (1970 to 2010) for each direction: N, NE, E, SE, S, W, SW, NW.
Directional Analysis of Land Surface Temperature

| Direction | Mean LST ± SD |
|---|---|
| N | 21.30 ± 2.39 |
| NE | 22.15 ± 2.22 |
| E | 21.01 ± 2.47 |
| SE | 21.34 ± 2.30 |
| S | 21.71 ± 2.07 |
| SW | 22.19 ± 1.92 |
| W | 22.97 ± 1.72 |
| NW | 22.07 ± 2.25 |
Use of spatial metrics to quantify the structure of the landscape and to quantify the spatial pattern and composition of features
Results of Spatial Metrics
Plots by direction (N, NE, E, SE, S, SW, W, NW) for 1973, 1992, 2000, 2006 and 2010: Built up (total land area), Largest Patch, Number of Patches, Clumpiness, Compactness index of the largest patch, Ratio of Open space, Aggregation index.
Urban growth is more prominent in N and E in 2010, with medium urban development in W, SW and S.
Largest patch in west, southwest and south direction.
Separate clusters of huge urban patches have come up in north (Bengaluru International Airport) and east (International Tech Park Limited).
More compact and moving towards a single big patch in 2010.
Open space decreased and urban density increased.
Modelling urban growth Urban dynamics through Cellular Automata (CA) based growth models
Growth model: CA CA is based on pixels, states, neighbourhood and transition rules. Pixel (Initial State) Transition Pixel (Final State) External factors
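The four ingredients named above (pixels, states, neighbourhood, transition rules) can be illustrated with a toy cellular automaton. The transition rule used here, a non-urban cell becomes urban when at least `min_neighbours` of its 8 neighbours are urban and an external suitability mask allows it, is a hypothetical rule chosen for illustration and is not the growth model of the study.

```python
import numpy as np

def ca_step(urban, suitability, min_neighbours=3):
    """One transition of a toy urban growth CA on a boolean grid.

    urban: boolean state grid; suitability: boolean mask standing in for
    external factors. Both arrays have the same shape.
    """
    H, W = urban.shape
    padded = np.pad(urban.astype(int), 1)
    # Count urban cells in the 8-cell (Moore) neighbourhood
    neighbours = sum(
        padded[1 + dy: 1 + dy + H, 1 + dx: 1 + dx + W]
        for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)
    )
    grow = (~urban) & (neighbours >= min_neighbours) & suitability
    return urban | grow

# Hypothetical initial state: a short urban strip in a 5 x 5 grid
urban = np.zeros((5, 5), dtype=bool)
urban[2, 1:4] = True
suitable = np.ones((5, 5), dtype=bool)

grown = ca_step(urban, suitable)
```

Repeated calls to `ca_step` simulate successive time steps; masking cells in `suitability` (for example water bodies or protected areas) is one simple way to represent external factors.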
Conclusions
584% increase in urban areas during 37 years (1973 to 2010).
Increase of ~2-4 ºC in local LST.
Decline of 74% in vegetation cover and 66% in water bodies.
Figure panels: Percent impervious surface, NDVI, Temperature.
Spatial Thinking LCLUC: The current scenario
Indian Institute of Science, Bangalore Thank you