Next
Matching of IKONOS Stereo and Multitemporal GEO Images for DSM Generation
Li Zhang, Maria Pateraki, Emmanuel Baltsavias
Institute of Geodesy and Photogrammetry,
ETH-Hoenggerberg, CH-8093, Zurich, Switzerland
Email : [email protected],
[email protected],
[email protected]
1. Introduction
Although IKONOS imagery has been commercially available since early 2000, the use of this imagery for DSM generation has been restricted due to various reasons. Firstly, up until beginning of 2002 stereo imagery was available only for governments and national mapping organisations. Secondly, Space Imaging follows a closed policy and does not release the sensor model but provides rational polynomial coefficients instead. These are provided for stereo images and special versions of Geo imagery (so called OrthoKit) but at significantly higher prices. To these reasons, one should add the general problems of IKONOS imagery like cost, availability, lack of user control on imaging parameters and conditions etc. Since summer 2001, some commercial systems (Erdas Imagine, LHS Socet Set, Z/I Imaging Image Station, PCI Geomatics OrthoEngine) support to one or the other extent IKONOS imagery for import, stereo viewing and processing, orthoimage and DSM generation by using the rational polynomial coefficients (RPCs) provided by SI, estimating RPCs from ground control or using alternative (bundle adjustment, DLT) and partly proprietary mathematical models. While estimating RPCs from GCPs is both expensive (high number of GCPs required) and sub-optimal, use of bundle adjustment due to the undisclosed IKONOS sensor model is impossible. DLT, as shown in Fraser et al. (2002) could be used with GCPs, while proprietary sensor models (PCI) are of unknown quality, while raw IKONOS data which would be most appropriate for use with a strict sensor model are not available. As shown in Baltsavias et al. (2001) and Fraser et al. (2002), a very simple alternative geometric model, needing at least 3 GCPs, can achieve better positioning accuracy than the RPCs and at sub-meter level.
Investigations on DSM generation from high-resolution imagery include the following. Ridley et al. (1997) evaluated the potential of generating a national mapping database of maximum building heights for buildings at least 5 x 10 m in planimetry by using DSMs extracted by matching of 1m aerial imagery. They report that matching has a potential to provide the requested information if the DSM had a spacing of 1-3 m but with lower accuracy (1.5 – 3 m RMS) and completeness compared to manual measurements. Muller et al. (2001) use simulated 1m-resolution and IKONOS data for DSM generation and landuse determination to estimate effective aerodynamic roughness for air pollution modelling and determine position of trees close to buildings that may cause soil subsidence for insurance risk assessment. The most extensive results on DSM generation from IKONOS up to now have been reported by Toutin et al. (2001). Dial (2000) also reports the expected mapping accuracy of IKONOS. There are no published investigations on the other two operational high-resolution spaceborne systems (EROS-A1 and Quickbird-2). As far as the authors know, this is the first investigation deriving DSMs from multitemporal IKONOS images. This data are much more available than stereo data, they do not include epipolar resampling which possibly degrades geometric accuracy and are cheaper than stereo IKONOS. Naturally, multitemporal differences make matching more difficult, but as it is shown in Baltsavias and Stallmann (1992) it is possible with intelligent techniques to achieve good results even in case of large image differences.
Next
PreviousNext
Matching of IKONOS Stereo and Multitemporal GEO Images for DSM Generation
2. Test Data
2.1 Melbourne Dataset
The first dataset comprised a stereopair of epipolar-resampled Geo PAN images of Melbourne. As indicated in Table 1, the sensor and sun elevation angles for the stereopair (imaged in winter) were less than optimal. The azimuths of sensor and sun of the left image differed considerably, leading to strong shadows in non-occluded areas. The base/height ratio of the stereopair was 1.2. GCPS measured with an accuracy of 1-2 dm in both image and object space existed and could be used for our IKONOS sensor model and aerial image orientation. More details on the dataset and the GCPs can be found in Fraser et al. (2002). An existing stereopair of 1:15,000 scale wide-angle colour aerial photography was used to measure reference data.
2.2 Nisyros Dataset
This dataset consisted of two Pan-Sharpened Geo images of a Greek island with little man-made objects and sufficient vegetation. It is an non-active volcano with elevation range 0-700 m. The parameters of the images are listed in Table 2. As it can be seen, they were similar except of the sensor azimuth. The different viewing angle let to illumination differences between the two images. The fact that shadows were mostly in occluded areas was an advantage. The major multitemporal differences were due to different clouds and their shadows. 38 GCPs measured with GPS were available with an object accuracy of less than 1 dm. About half of them were not well defined in the image. The reference DTM had a grid spacing of 2 m and was interpolated from 1:5,000 map contours and additionally measured break lines. Its accuracy, estimated using the GCPs, was ca. 3.3 m RMS. More details on the data can be found in Vassilopoulou et al. (2002).
Table 1 Acquisition parameters of the Melbourne dataset
Left Stereo
Right Stereo
Date, Time (local)
16/7/2000, 09:53
16/7/2000, 09:53
Sensor azimuth (deg)
136.7
71.9
Sensor elevation (deg)
61.4
60.7
Sun azimuth (deg)
38.2
38.3
Sun elevation (deg)
21.1
21.0
Table 2 Acquisition parameters of the Nisyros dataset
Image 1
Image 2
Date, Time (local)
8/4/2000, 10:35
28/3/2000, 10:34
Sensor azimuth (deg)
134.9
72.1
Sensor elevation (deg)
73.5
72.7
Sun azimuth (deg)
136.6
138.7
Sun elevation (deg)
53.4
49.3
3. Own Matching Method and Test Results with Nisyros Dataset
3.1 Quasi-Epipolar Image Generation
Unlike frame-based imagery, where all pixels in the image are exposed simultaneously, each scan line of the IKONOS image is collected in a pushbroom fashion at different instant of time. Thus, epipolar lines with linear CCDs become curves. Using Orun and Natarajan’ sensor model (Orun and Natarajan, 1994), which models the position of the sensor and its yaw angle variation as second-order polynomials of scan line (or time), and assuming the pitch and roll angles to be constant, the derived epipolar curve for a certain point shows a hyperbola-like shape. It can be approximated by a straight line for a small length but not for the entire image. An epipolar curve for the entire image can be approximated only by piecewise linear segments. In our approach, the epipolar curve for point (xl, yl) in the left IKONOS image is approximated by a quadratic polynomial and has the following form:
yl=
(A1x1+A2y1+A3)+
(A4x1+A5y1+A6)xl+
(A7x1+A8y1+A9)xl2 (1)
Where, (xr, yr) are the pixel coordinates in the right IKONOS image and A1-9 are unknown parameters. Using Eq. (1), the quasi-epipolar image pair can be generated by re-arranging of the original IKONOS image pair. However, the computation of the unknown parameters of the epipolar geometry needs a certain number of well distributed conjugate points. These conjugate points are extracted by the following feature-point based matching:
- The Moravec interest operator is used to select well defined feature points that are suitable for image matching. For this, the left IKONOS image is divided into small image windows of 21 ´ 21 pixels and then only one feature point, which has the highest interest value is extracted in each image window.
- Conjugate points are generated using the maximum of the normalized correlation coefficient. The positioning of the search areas is determined by using the already known control points in the neighborhood (image pyramids and a matching strategy based on region growing, which takes the already manually measured control points as seed points are used to get these approximate points). For reliability, the threshold of acceptable normalized correlation coefficients is 0.9.
- Least squares matching is finally used to refine the image coordinates of these points in order to achieve sub-pixel accuracy.
This procedure results in several hundreds of conjugate points. These points can be used to recover the epipolar geometry and interpolate the approximate values for the following grid point matching procedure.
PreviousNext
PreviousNext
Matching of IKONOS Stereo and Multitemporal GEO Images for DSM Generation
3.2 Derivation of Approximations by Using Grid Point Matching based on Relaxation
The matching procedure can be treated as a labeling problem and solved by the relaxation technique. That is, we regard the template image (in our case, the left quasi-epipolar image) as the model and another image as the scene, and then use the feature points in the model as a set of labels to label the feature points extracted from the scene. Relaxation is one of the efficient methods to solve the labeling problem. The work of Hancock and Kittler (1990) that uses Bayesian probability theory has provided a theoretical framework and rigorous basis for the relaxation method.
The important aspect of the relaxation matching algorithm that distinguishes it from the single point matching is its compatible coefficient function and its smoothness constraint satisfaction scheme (Baltsavias, 1991; Zhang et al., 1992). This is most important for areas with homogeneous or only little texture. In such areas, the single point matching is unable to match images because of the lack of information. With the smoothness constraint, such areas can be bridged over, assuming that the terrain surface varies smoothly over the area.
Firstly, the match points are selected and distributed in the form of a regular grid in the template image. Given a grid point in the template image, a search window in the search image (the right quasi-epipolar image, in our case) can be determined. The correct match of this point should lie in this search window. However, due to repetitive texture or poor texture information, there could be several candidate matches appearing in the search window. These candidate matches are located along the epipolar curve. They can be derived by traditional cross-correlation technique, and the candidate matches are selected if their correlation coefficient lies above a certain user-defined threshold (we choose this threshold value as 0.7). The approximate centers of the search windows can be derived from the matching results of the previous pyramid level.
Let Ii be one of the grid points on the template image and Ij (j=1,…,m) its candidate matches in the search image. P(i,j) is the probability of match Ik