A Rule Based Approach For The Mapping Of Tropical Forest Canopy from Airborne Hyperspectral Data Sets

Affendi Suhaili
Forest Operations Branch,
Forest Department Sarawak,
Wisma Sumber Alam,
Petra Jaya, 93660 Kuching,
Sarawak, Malaysia.
Email: [email protected]

Ainuddin, N.A.
Faculty of Forestry,
Universiti Putra Malaysia (UPM), 43400 Serdang,
Selangor, Malaysia.

Shafri, H.Z.M.
Geomatics Unit,
Department of Civil Engineering,
Universiti Putra Malaysia (UPM), 43400 Serdang,
Selangor, Malaysia.

Abstract
In forestry operations, the increasing dimensionality of current remote sensing data sets has provided both opportunities and challenges to users. The use of a rule based or decision tree (DT) classifier offers an alternative approach by focusing on fewer classes and obtaining different features and decision rules at each stage. In this study, the performance of DT models derived from classification rules using four different hyperspectral data sets was evaluated for mapping the species composition of a tropical forest canopy. Our results showed that the amount of useful information contained in such data sets improved the performance of the model (up to 69% classification accuracy) and produced a less complex tree structure. In addition, the decision rules were able to highlight the relative importance of the spectral feature variables in the hyperspectral data.

1.0 INTRODUCTION
The increasing dimensionality of current remote sensing data sets (such as hyperspectral images) has provided the potential for discriminating subtle spectral features between target classes. In forestry operations, it has created an opportunity to apply these technological advancements to the mapping of forest resources at the taxonomic and, to a certain extent, the species level (Affendi et al., 2006a). Such an increase in dimensionality, however, presents new challenges to users of the data sets. Among them is the need to find alternative classification methods, because the parametric classification algorithms commonly applied to multispectral data sets are affected by the Hughes phenomenon (Hughes, 1968), a situation where an increase in the dimensionality of the data set that is not met with adequate training samples results in poor classifier performance. In tropical forest environments this could be a major concern, especially with the use of single stage classification algorithms such as the maximum likelihood classifier, because training samples are relatively scarce and hard to locate.

Decision tree classifiers (Friedl and Brodley, 1997; Lawrence and Wright, 2001; Pal and Mather, 2003) offer a solution to circumvent these problems by focusing on fewer classes and obtaining different features and decision rules at each stage. By performing feature selection and classification simultaneously, the decision tree automatically selects features (spectral bands, indices, other ancillary variables) that carry the maximum information and rejects the remaining features thereby increasing computational efficiency and overcoming the curse of dimensionality. In this study, we evaluated the potential of a binary decision tree classifier (CART) for mapping the individual tree crowns over a tropical forest using airborne hyperspectral data sets.

2.0 MATERIALS AND METHODS

2.1 Study Site and Hyperspectral Data Set
The study was conducted on the 2 ha Bukit Bujang old growth forest plot in FRIM, Kepong. The plot was established by conducting a 100% mapping of all standing trees that are visible in the hyperspectral image. A total of 106 trees were enumerated, falling into 16 species types; however, due to the limited number of training and testing samples for some of the tree species, only 8 of the main species types (Table 1) were selected for this study. Due to the phenological differences that were observed among some of the species, they were further divided into 11 species classes.

Hyperspectral data (radiometrically corrected using CALIGEO, a plug-in to ENVI 4.2) were obtained from an airborne AISA sensor overflight of the FRIM area. Four types of hyperspectral data sets were then derived (reflectance, first spectral derivative, continuum removed and a combined data set) for evaluating the performance of the decision tree classifier. Training data (202 pixels per species class) were extracted from the respective data sets to represent samples derived from multiple crowns.
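The derivation of the spectrally enhanced data sets can be illustrated with a minimal Python sketch. The paper does not state how the first derivative or the continuum removed spectra were computed, so this sketch assumes a simple finite difference between adjacent bands and the common convex hull quotient method; the function names, wavelengths and reflectance values are hypothetical.

```python
def first_derivative(wavelengths, reflectance):
    """Finite-difference first derivative between adjacent bands (assumed method)."""
    return [
        (reflectance[i + 1] - reflectance[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(reflectance) - 1)
    ]


def _cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def continuum_removed(wavelengths, reflectance):
    """Divide each reflectance by the upper convex hull (the continuum).

    Assumes wavelengths are strictly increasing and reflectance is positive.
    """
    points = list(zip(wavelengths, reflectance))
    hull = []
    for p in points:
        # Drop the last hull point while it lies on or below the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    # Linearly interpolate the hull at every band position.
    continuum = []
    j = 0
    for x, _ in points:
        while hull[j + 1][0] < x:
            j += 1
        (x0, y0), (x1, y1) = hull[j], hull[j + 1]
        continuum.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return [r / c for r, c in zip(reflectance, continuum)]


# Hypothetical five-band spectrum (wavelengths in nm).
bands = [500, 550, 600, 650, 700]
spectrum = [0.10, 0.20, 0.10, 0.20, 0.40]
derivative = first_derivative(bands, spectrum)
removed = continuum_removed(bands, spectrum)
combined = spectrum + derivative + removed  # stacked feature vector for one pixel
```

Stacking the three feature lists per pixel mirrors the idea of the combined data set, although the band subsets actually retained in the study are not reproduced here.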

Table 1. Tree species utilized for training and test samples within the 2 ha Mixed Dipterocarp regenerated old growth forest plot

2.2 Classification and Regression Tree (CART)
There are a large number of decision tree algorithms reported in the literature. Decision tree classification algorithms differ in the ways in which the decision boundaries at each node of the tree are defined and in how they evaluate splits into subsets in the tree growing stage. In this study the classification and regression tree (CART), one of the most widely used algorithms, was utilized. CART is a recursive (greedy search) algorithm that splits the entire data set from the root node into smaller subsets to reduce the deviance and correct for the total sum of squares. For each split, each input variable is evaluated to find the best cut point based on the reduction in impurity of the target variable, Δi(h,s), which according to Breiman et al. (1984) can be defined as:

$$\Delta i(h, s) = i(S) - P_R \, i(S_R) - P_L \, i(S_L)$$

where S is the set of samples at the node being split, P_R and P_L are the proportions of S sent to the right and left child nodes, and i(S_R) and i(S_L) are the Gini indices for the right and left child nodes respectively, computed as:

$$i(S) = \sum_{i \neq j} p(\omega_i \mid S) \, p(\omega_j \mid S)$$

where ω_i and ω_j are categories of the target variable, and p(ω_i | S) and p(ω_j | S) are the probabilities of a random sample X belonging to class ω_i and ω_j respectively, given the distribution of data in set S. The candidate feature variables (predictors) are then compared, and the feature giving the greatest improvement is selected for the split. CART can be used to determine the best set of bands (predictor variables) for predicting target variables such as forest cover characteristics, and it does not require data reduction, tests for normality or data transformations. For these reasons, CART is being used increasingly for mapping from remotely sensed imagery.
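The split search described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the software used in the study: the Gini index is written in its equivalent form of one minus the sum of squared class proportions, candidate cut points are taken as midpoints between sorted feature values, and the sample values and class labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index: sum over i != j of p_i * p_j, i.e. 1 - sum of p_i squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_reduction(labels, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)


def best_split(samples, labels):
    """Greedy search over every feature and midpoint cut for the largest reduction.

    Returns (feature_index, threshold, reduction); samples go left if value <= threshold.
    """
    best = (None, None, 0.0)
    for f in range(len(samples[0])):
        values = sorted(set(row[f] for row in samples))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(samples, labels) if row[f] <= threshold]
            right = [y for row, y in zip(samples, labels) if row[f] > threshold]
            gain = impurity_reduction(labels, left, right)
            if gain > best[2]:
                best = (f, threshold, gain)
    return best


# Hypothetical pixels with two band variables each, and hypothetical class labels.
pixels = [[0.30, 0.10], [0.32, 0.40], [0.31, 0.12], [0.29, 0.45]]
classes = ["JEL1", "MPA", "JEL1", "MPA"]
feature, cut, gain = best_split(pixels, classes)
print(feature, cut, gain)
```

Because the search only keeps the feature with the largest impurity reduction at each node, band variables that never win a split are effectively discarded, which is the feature selection behaviour noted in the introduction.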

2.3 Decision Tree Model Development
A desirable decision tree is one having a relatively small number of branches, a small number of intermediate nodes from which these branches diverge, and high predictive power, in which entities are correctly classified at the terminal nodes. In this study, decision tree models were evaluated based on the effect of the dimensionality of the data set (reflectance, spectral derivative, continuum removed and combined features) on their accuracy (risk estimates and standard error) and structure (number of nodes and depth of layers). Classification rules were also derived from these models and then incorporated into the ENVI 4.2 (RSI, 2005) binary tree classifier. The band variables or features used in the decision rules were then paired with the respective image files (noise filtered and spectrally enhanced image data sets derived from the previous image processing stage). The final tree was then executed and the classification results were compared to the ground truth data to determine their accuracy. Post pruning was applied to the tree structure if any of the terminal nodes was observed not to contribute much to the image classification accuracy.

3.0 RESULTS AND DISCUSSION
Results from the experiments that evaluated the performance of the CART algorithm (model accuracy and tree structure) based on the varying data set dimensionality (type) are presented in Table 2. The latter part of this section then discusses the accuracy of the 2 tree models that were selected for classifying the hyperspectral image over the 2 ha test plot.

Table 2. Effect of data dimensionality on model accuracy and tree structure of the CART algorithm

Note: RE – risk estimate which is the proportion of misclassified cases
SE – standard error of the model
N – number of nodes
L – depth of layers
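As a rough illustration of the two accuracy measures defined in the note, the sketch below computes a risk estimate as the proportion of misclassified cases. The paper does not state how the standard error was calculated, so the binomial approximation used here is an assumption, and the labels are hypothetical.

```python
import math


def risk_estimate(true_labels, predicted_labels):
    """Proportion of cases whose predicted class differs from the true class."""
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)


def risk_standard_error(risk, n_cases):
    """Standard error of a proportion (binomial approximation, assumed here)."""
    return math.sqrt(risk * (1 - risk) / n_cases)


# Hypothetical labels for four training pixels.
truth = ["JEL1", "MPA", "JEL1", "MPA"]
predicted = ["JEL1", "MPA", "MPA", "MPA"]
re = risk_estimate(truth, predicted)
se = risk_standard_error(re, len(truth))
```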

3.1 Model Accuracy
When comparing the different data set types (reflectance, first derivative, continuum removed and combined), it seems that the reflectance and the first derivative data sets (both with 19 band variables) give a lower risk for the tree model. The highest model accuracy was derived using the combined data set (45 band variables), which could be explained by the increase in information available to the classifier for discriminating between the species classes. The number of features or band variables in the data set is, however, not the only variable that influences the optimality of the model: the spectrally enhanced first derivative data set, which had the same number of features as the reflectance data set, performed better, and its risk estimate was only slightly higher than that of the combined data set (0.296 versus 0.269). A possible reason is the reduction in the complexity and intercorrelation among the features in the spectrally enhanced data sets, which facilitates splitting of the data by the tree algorithm into more homogeneous classes. The better performance of the CART classifier when using the spectrally enhanced derivative data set also lends support to our earlier findings (Affendi et al., 2006b), which emphasized improving the spectral characteristics of the data set in order to discriminate effectively among the subtle differences in the spectra of the different tree species.

3.2 Tree Structure and Complexity
Results similar to those for model accuracy were observed when relating the data set type to tree structure and complexity. From the results in Table 2, it can be observed that the original reflectance data set has a greater number of nodes (39) and a greater tree depth (9) than the other data sets. The least complex model was derived from the continuum removed data set; however, this simple tree structure could also be influenced by the smaller number of band variables (9 bands) that were utilized for growing the tree. In addition, the similar structural complexity of the combined data set and the first derivative data set further suggests the ability of the decision tree algorithm to select the important band variables to be used in splitting the respective species classes.

3.3 Classification Rule
Classification rules were extracted from each of these models and then incorporated into the ENVI 4.2 decision tree classifier to create a tree structure (Figure 1) for classifying the hyperspectral image. From the tree model that was derived using the CART algorithm on the training data, 19 sets of rules were extracted to classify the 11 tropical tree species classes. For brevity, only the decision rules that were used to develop the final tree structure using the combined data set are discussed further.

From the tree structure in Figure 1, every path from the root to a leaf (terminal) node was converted to a decision rule by regarding all the test conditions (decision nodes) appearing in the path as the conjunctive rule antecedents and the class label held by the leaf node as the rule consequent. The canopy gap class representing the first rule was added to the root node in the tree model to mask out non-vegetated regions and shadows that occur between the tree crowns. The feature selected to optimally separate this class was the first peak (spectral derivative band 12) of the double peak feature in the derivative data set. For labeling the Dyera costulata species, 5 sets of rules were generated (3 to assign them to JEL1 and 2 to assign them to JEL2). The longer path and higher number of nodes required by the model to label the JEL1 class showed that the spectral similarity of this class with the other species classes was high, as more branches were needed to separate them effectively. This was also seen for Shorea bracteolata (MPA) and Ixonanthes reticulata (IBU2) in the other tree models using different feature variables. The distinctively labeled classes were the Shorea (S. maxwelliana, S. singkawang and S. leprosula) and Hopea classes, as each required only a single rule for class labeling.
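The path-to-rule conversion described above can be sketched as a short recursive traversal in Python. The tree below is a hypothetical three-leaf toy, not the tree in Figure 1: the feature names, thresholds and nested dictionary layout are all assumptions made for illustration.

```python
def extract_rules(node, conditions=()):
    """Return one (antecedents, class_label) pair for every root-to-leaf path."""
    if "label" in node:  # leaf node: the accumulated tests form the antecedent
        return [(list(conditions), node["label"])]
    feature, threshold = node["feature"], node["threshold"]
    left = extract_rules(node["left"], conditions + ((feature, "<=", threshold),))
    right = extract_rules(node["right"], conditions + ((feature, ">", threshold),))
    return left + right


def format_rule(conditions, label):
    """Render a rule as a conjunction of tests followed by its consequent."""
    antecedent = " AND ".join(f"{f} {op} {t}" for f, op, t in conditions)
    return f"IF {antecedent} THEN {label}"


# Hypothetical toy tree: feature names and thresholds are illustrative only.
toy_tree = {
    "feature": "deriv_b12", "threshold": 0.001,
    "left": {"label": "GAP"},
    "right": {
        "feature": "refl_b7", "threshold": 0.35,
        "left": {"label": "JEL1"},
        "right": {"label": "JEL2"},
    },
}

rules = extract_rules(toy_tree)
for conditions, label in rules:
    print(format_rule(conditions, label))
```

Each leaf yields exactly one rule, so a class that occupies several leaves (as JEL1 and JEL2 do in the study) ends up with several rules, and deeper leaves carry longer antecedents.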

The tree structural plot also showed how the available features were applied and partitioned at each level of the tree to separate the data into the 11 species classes. With the combined data set (reflectance, first derivative and continuum removed band variables), the derivative variables were the most prominent features used in the classification process, as 10 out of the 16 decision nodes were based on these variables, mostly confined to the red-edge region. Based on the training image output when the model was executed, the values used for splitting at each decision node were adjusted so as to give more weight to terminal nodes at certain branches of the tree structure where the respective classes were labeled. In addition, post pruning was applied in the study to remove child nodes. This resulted in a more generalized tree model and led to improvements during image classification. Breiman et al. (1984), however, cautioned that pruning be done with great care, as error estimates calculated on the basis of the training set do not reflect the actual error values.

Figure 1. The final tree structure (ENVI 4.2) developed based on the classification rule using the combined data set.

3.4 Classification Accuracy Assessment
The final tree models were later applied to unseen data over a test site consisting of the 2 ha Mixed Dipterocarp regenerated old growth forest plot in Bukit Bujang, to further evaluate their effectiveness in classifying the 11 tropical tree species classes. As the first derivative and combined data sets showed promising results during model development, only these models were evaluated in terms of classification accuracy. The classified image in Figure 2 showed some occurrences of individual tree crowns being classified as mixed class pixels. This was expected in such an operating environment, especially when considering the high within-species variation due to the phenological changes in some of the tree species (Dyera costulata, Ixonanthes reticulata and Hopea odorata) that were observed during the image overflight.

Figure 2. Classified image of Bukit Bujang study plot using binary tree classifier

Table 3 shows the classification accuracies and Kappa values derived from the classified image. When comparing the effect of dimensionality (type) on the image classification accuracy with the misclassification risk of the tree model in Table 2, it can be seen that the combined data set showed a higher classification accuracy (69.3%, Kappa 0.6503) with a lower risk estimate (0.269) than the first derivative data set, which had a lower accuracy (59.1%, Kappa 0.561) and a higher risk (0.296) during model development. These results indicate that the tree models developed using the CART algorithm on both types of data sets were not overfitted, and thus support the reliability of the classifiers.
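For readers unfamiliar with the two measures reported here, overall accuracy and the Kappa coefficient can both be computed from a confusion matrix, as in the following Python sketch. The two-class matrix is hypothetical and is not taken from Table 3; rows are assumed to hold the reference classes and columns the classified output.

```python
def overall_accuracy(matrix):
    """Correctly classified pixels (the diagonal) divided by all pixels."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's Kappa: agreement beyond what chance alone would produce."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix (rows: reference, columns: classified).
confusion = [
    [40, 10],
    [5, 45],
]
accuracy = overall_accuracy(confusion)
agreement = kappa(confusion)
```

Kappa is lower than overall accuracy whenever some agreement would be expected by chance, which is why both figures are usually reported together.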

Table 3. Effect of data set dimensionality on classification accuracy (CART)

4.0 CONCLUSION
The use of a decision tree classifier in this study to map the regenerated old growth forest plot from airborne hyperspectral data sets has shown potential for extension to the mapping of natural forests. During model development, it was found that the dimensionality of the data set was not a major factor in determining the performance of the tree model; rather, the amount of useful information contained in such data sets (for example, the spectrally enhanced first derivative) improved the performance of the model and produced a less complex tree structure. This study has shown that the binary tree model developed from the classification rules using the combined data set gave the highest classification accuracy (69.3%, Kappa 0.6503) compared to the other tree models. It has also indicated the simplicity and flexibility of the decision tree approach for classifying hyperspectral images derived from the airborne AISA sensor. In addition, the decision trees developed in the study yielded interpretable rules showing how the data sets were filtered into the respective species classes, and could also highlight the relative importance of the spectral feature variables used for the analyses.

REFERENCES

Affendi, S., Faridah-Hanum, I., Ainuddin, N.A., Awang Noor, A.G., Shafri, H.Z.M. and Rahayu, S., 2006a. Spectral separability of tropical tree taxa based on leaf morphological and anatomical characteristics. The Malaysian Forester 69(1): 75-84.
Affendi, S., Shafri, H.Z.M., Ainuddin, N.A., Awang Noor, A.G. and Faridah-Hanum, I., 2006b. Improving species spectral discrimination using derivatives spectra for mapping of tropical forest from airborne hyperspectral imagery. In proceedings of Map Malaysia 2006 Conference, May 3-4, Palace of The Golden Horses, Kuala Lumpur.
Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. 1984. Classification And Regression Trees. Belmont California: Wadsworth International Group.
Friedl, M.A. and Brodley, C.E. 1997. Decision tree classification of land cover from remotely sensed data. Remote Sensing of Environment 61(3): 399-409.
Hughes, G.F. 1968. On the mean accuracy of statistical pattern recognizers. IEEE Transactions on Information Theory 14(1): 55-63.
Lawrence, R.L. and Wright, A. 2001. Rule-based classification systems using classification and regression tree (CART) analysis. Photogrammetric Engineering and Remote Sensing 67(10): 1137-1142.
Pal, M. and Mather, P.M. 2003. An assessment of effectiveness of decision tree methods for land cover classification. Remote Sensing of Environment 86: 554-565.
RSI, 2005. ENVI Version 4.2. Boulder, Colorado, USA.
