
Thomas Oommen


toommen@mtu.edu

Journal articles

2011
T Oommen, L G Baise, R M Vogel (2011)  Sampling bias and class imbalance in maximum likelihood logistic regression.   Mathematical Geosciences 43: 1. 99-120  
Abstract: Logistic regression is a widely used statistical method to relate a binary response variable to a set of explanatory variables and maximum likelihood is the most commonly used method for parameter estimation. A maximum-likelihood logistic regression (MLLR) model predicts the probability of the event from binary data defining the event. Currently, MLLR models are used in a myriad of fields including geosciences, natural hazard evaluation, medical diagnosis, homeland security, finance, and many others. In such applications, the empirical sample data often exhibit class imbalance, where one class is represented by a large number of events while the other is represented by only a few. In addition, the data also exhibit sampling bias, which occurs when there is a difference between the class distribution in the sample compared to the actual class distribution in the population. Previous studies have evaluated how class imbalance and sampling bias affect the predictive capability of asymptotic classification algorithms such as MLLR, yet no definitive conclusions have been reached. We hypothesize that the predictive capability of the model is related to the sampling bias associated with the data so that the MLLR model has perfect predictability when the data have no sampling bias. We test our hypotheses using two simulated datasets with class distributions that are 50:50 and 80:20, respectively. We construct a suite of controlled experiments by extracting multiple samples with varying class imbalance and sampling bias from the two simulated datasets and fitting MLLR models to each of these samples. The experiments suggest that it is important to develop a sample that has the same class distribution as the original population rather than ensuring that the classes are balanced. Furthermore, when sampling bias is reduced either by using over-sampling or under-sampling, both sampling techniques can improve the predictive capability of an MLLR model.
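The recommendation in this abstract, matching the sample's class distribution to that of the population, can be illustrated with a minimal under-sampling sketch. This is not the authors' experimental code; the function name `resample_to_ratio` and its parameters are hypothetical, and it assumes binary labels coded 0/1 and a known population fraction of the positive class.

```python
import random


def resample_to_ratio(labels, target_pos_fraction, seed=0):
    """Under-sample so the positive-class fraction approximates the target.

    labels: sequence of 0/1 class labels (assumed encoding).
    target_pos_fraction: assumed known population fraction of class 1,
        strictly between 0 and 1.
    Returns the sorted indices of the retained observations.
    """
    if not 0.0 < target_pos_fraction < 1.0:
        raise ValueError("target_pos_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Largest sample size reachable by discarding observations only.
    n = min(len(pos) / target_pos_fraction,
            len(neg) / (1.0 - target_pos_fraction))
    n_pos = min(len(pos), round(n * target_pos_fraction))
    n_neg = min(len(neg), round(n * (1.0 - target_pos_fraction)))
    return sorted(rng.sample(pos, n_pos) + rng.sample(neg, n_neg))


# Example: a balanced 50:50 sample drawn from a population that is 20% positive.
sample_labels = [1] * 50 + [0] * 50
kept = resample_to_ratio(sample_labels, target_pos_fraction=0.2)
kept_fraction = sum(sample_labels[i] for i in kept) / len(kept)
```

Over-sampling the under-represented class would be the complementary option; the abstract reports that either approach can help when it reduces sampling bias.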
2010
T Oommen, L G Baise, R M Vogel (2010)  Validation and application of empirical liquefaction models   ASCE Journal of Geotechnical and Geoenvironmental Engineering 136: 12. 1618-1633  
Abstract: Empirical liquefaction models (ELMs) are the standard approach for predicting the occurrence of soil liquefaction. These models are typically based on in situ index tests, such as the standard penetration test (SPT) and cone penetration test (CPT), and are broadly classified as deterministic and probabilistic models. No objective and quantitative comparison of these models has been published. Similarly, no rigorous procedure has been published for choosing the threshold required for probabilistic models. This paper provides (1) a quantitative comparison of the predictive performance of ELMs; (2) a reproducible method for choosing the threshold that is needed to apply the probabilistic ELMs; and (3) an alternative deterministic and probabilistic ELM based on the machine learning algorithm known as support vector machine (SVM). Deterministic and probabilistic ELMs have been developed for SPT and CPT data. For deterministic ELMs, we compare the “simplified procedure,” the Bayesian updating method, and the SVM models for both SPT and CPT data. For probabilistic ELMs, we compare the Bayesian updating method with the SVM models. We compare these different approaches within a quantitative validation framework. This framework includes validation metrics developed within the statistics and artificial intelligence fields that are not common in the geotechnical literature. We incorporate estimated costs associated with risk as well as with risk mitigation. We conclude that (1) the best performing ELM depends on the associated costs; (2) the unique costs associated with an individual project directly determine the optimal threshold for the probabilistic ELMs; and (3) the more recent ELMs only marginally improve prediction accuracy; thus, efforts should focus on improving data collection.
T Oommen, L G Baise (2010)  Model development and validation for intelligent data collection for lateral spread displacements.   ASCE Journal of Computing in Civil Engineering 24: 6. 467-477  
Abstract: The geotechnical earthquake engineering community often adopts empirically derived models. Unfortunately, the community has not embraced the value of model validation, leaving practitioners with little information on the uncertainties present in a given model and the model’s predictive capability. In this study, we present a machine learning technique known as support vector regression (SVR) together with rigorous validation for modeling lateral spread displacements and outline how this information can be used for identifying gaps in the data set. We demonstrate the approach using the free face lateral displacement data. The results illustrate that the SVR has better predictive capability than the commonly used empirical relationship derived using multilinear regression. Moreover, the analysis of the SVR model and its support vectors helps in identifying gaps in the data and defining the scope for future data collection.
2009
D Misra, T Oommen, A Agarwal, S K Mishra, A Thompson (2009)  Application and analysis of support vector machine based simulation for runoff and sediment yield   Biosystems Engineering 103: 4. 527-535  
Abstract: The objective of the study was to use Support Vector Machines (SVM) to simulate runoff and sediment yield from watersheds. Recently, pattern-recognition algorithms such as artificial neural networks (ANN) have gained popularity in simulating rainfall-runoff-sediment yield processes producing comparable accuracy to physics-based models. We have simulated daily, weekly, and monthly runoff and sediment yield from an Indian watershed, with monsoon period data, using SVM, a relatively new pattern-recognition algorithm. Model performance was evaluated using correlation coefficient for evaluating variability, coefficient of efficiency for evaluating efficiency, and the difference of slope of a best-fit line from observed-estimated scatter plots to 1:1 line for evaluating predictability. Time-series data were split into training, calibration and validation sets. The results of SVM were compared to those of ANN. An alternate method, the Multiple Regressive Pattern Recognition Technique (MRPRT), was used for runoff estimation only. The MRPRT did not improve the results significantly compared to SVM, hence, it was not used to simulate sediment yield. We concluded that SVM provided significant improvement in training, calibration and validation as compared to ANN. SVM could be an efficient alternative to ANN, a computationally intensive method, for runoff and sediment yield predictions providing at least comparable accuracy.
2008
T Oommen, D Misra, N K C Twarakavi, A Prakash, B Sahoo, S Bandopadhyay (2008)  An objective analysis of support vector machine based classification for remote sensing   Mathematical Geosciences 40: 4. 409-424  
Abstract: Accurate thematic classification is one of the most commonly desired outputs from remote sensing images. Recent research efforts to improve the reliability and accuracy of image classification have led to the introduction of the Support Vector Classification (SVC) scheme. SVC is a new generation of supervised learning method based on the principle of statistical learning theory, which is designed to decrease uncertainty in the model structure and the fitness of data. We have presented a comparative analysis of SVC with the Maximum Likelihood Classification (MLC) method, which is the most popular conventional supervised classification technique. SVC is an optimization technique in which the classification accuracy heavily relies on identifying the optimal parameters. Using a case study, we verify a method to obtain these optimal parameters such that SVC can be applied efficiently. We use multispectral and hyperspectral images to develop thematic classes of known lithologic units in order to compare the classification accuracy of both the methods. We have varied the training to testing data proportions to assess the relative robustness and the optimal training sample requirement of both the methods to achieve comparable levels of accuracy. The results of our study illustrated that SVC improved the classification accuracy, was robust and did not suffer from dimensionality issues such as the Hughes Effect.
T Oommen, A Prakash, D Misra, J J Kelley, S Naidu, S Bandopadhyay (2008)  GIS based marine platinum exploration, Goodnews Bay, Southwest Alaska.   Marine Georesources & Geotechnology 26: 1. 1-18  
Abstract: Goodnews Bay, southwest Alaska, is known for a platinum (Pt) reserve that extends offshore in the Bering Sea. To assess the nearshore placer potential we first collated marine Pt concentrations available since 1960 in a geographic information system (GIS) database. Subsequently, in 2005, we collected 23 pipe dredge sediment samples and 26 vibracores from unexplored sites and analyzed them for Pt. This sampling was supplemented by magnetic (Sea Spy) and seismic (side scan, geoacoustic and datasonic bubble pulser) surveys. Integrating the results of geospatial analysis of Pt concentrations with geophysical analysis using GIS techniques allowed us to delineate four locations that are encouraging for further Pt exploration. Of these, two locations fall close to paleochannels and a drowned ultramafic source, while the other two coincide with high energy environments in Goodnews Bay and close to Carter Bay.
2007
B Sahoo, T Oommen, D Misra, G Newby (2007)  S-Transform as a discrimination tool in classification of hyperspectral images   Canadian Journal of Remote Sensing 33: 6. 551-560  
Abstract: A standard part of processing remote sensing data is image classification, in which we assume each pixel belongs to a class or theme with a unique spectral signature. Discrimination may be defined as the phenomenon where multiple themes exhibit very similar spectral patterns within a wavelength range of interest and is a common challenge in remote sensing. As a result, researchers may not achieve the desired classification accuracy. A robust discrimination technique must be capable of detecting very minor spectral differences between classes with similar spectral signatures. Using the one-dimensional S-transform, a spectral localization technique to discriminate similar lithologic classes on a hyperspectral satellite image, we investigated the S-amplitude spectra efficiency in enhancing the spectral information of each pixel of a known class. We compared the overall accuracy of classified themes using a support vector classification (SVC) scheme, with and without using the enhanced spectral information. We found that SVC aided by spectral enhancement from the S-transform provided better classification accuracy. Thus, this method may prove very useful in scenarios where pixels of a known class are sparse and not easily separable.

Conference papers

2011
H Wonsook, G H Prasanna, T Oommen, T H Marek, D O Porter, T H Howell (2011)  Spatial Interpolation of Daily Reference Evapotranspiration in the Texas High Plains   In: EWRI World Environmental & Water Resources Congress, Palm Springs, CA  
Abstract: The Texas High Plains Evapotranspiration (ET) Network collects meteorological data from 18 grass reference weather stations at hourly intervals and estimates hourly and daily reference ET using the American Society of Civil Engineers (ASCE) Standardized Reference ET equation. Producers in the Texas High Plains can obtain daily reference ET for any weather station of their interest by subscribing to the fax/email service. However, one concern in using these data is to determine which reference weather station best represents climatic conditions similar to those in their irrigated fields. Availability of accurate daily reference ET maps for the Texas High Plains is expected not only to relieve producers from this concern but also to assist in attracting more producers to adopt reference ET-based irrigation scheduling. Therefore, the main objective of this study was to evaluate two spatial interpolation methods, inverse distance weighting (IDW) and ordinary kriging, for mapping reference ET in the Texas High Plains. Daily grass reference ET maps were developed for the Texas High Plains using ordinary kriging and IDW methods for the period of 1999-2005, and assessed for mapping accuracy using Root Mean Square Error (RMSE). Comparison of RMSE values between IDW and ordinary kriging methods indicated that IDW outperforms ordinary kriging for approximately 68% of 2179 days. However, both methods performed equally poorly during summer growing seasons, when accurate grass reference ET maps are needed for irrigation scheduling.
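For readers unfamiliar with the IDW method and the RMSE score named in this abstract, the following is a minimal sketch. It is not the study's implementation: the function names, the distance exponent `power=2.0`, the planar (x, y) coordinates, and the leave-one-out scoring scheme are all assumptions made for illustration, since the abstract does not state how the maps were validated.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    stations: list of (x_s, y_s, value) tuples in planar coordinates.
    power: distance exponent (2.0 is a common default, assumed here).
    """
    num = 0.0
    den = 0.0
    for xs, ys, value in stations:
        d = math.hypot(x - xs, y - ys)
        if d == 0.0:
            return value  # query point coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def leave_one_out_rmse(stations, power=2.0):
    """Score IDW by predicting each station from all of the others."""
    predictions = [
        idw(stations[:i] + stations[i + 1:], xs, ys, power)
        for i, (xs, ys, _) in enumerate(stations)
    ]
    return rmse([value for _, _, value in stations], predictions)


# Two hypothetical stations with daily reference ET values of 4 and 8 (mm).
demo_stations = [(0.0, 0.0, 4.0), (2.0, 0.0, 8.0)]
midpoint_et = idw(demo_stations, 1.0, 0.0)
```

Ordinary kriging additionally requires fitting a variogram model and is better served by a dedicated geostatistics library than by a short sketch.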
2010
T Oommen, L G Baise, R Gens, A Prakash, R P Gupta (2010)  Applying Satellite Remote Sensing to Document Liquefaction Failures.   In: Seismological Research Letters, 81(2): 253-283.  
Abstract: Historically, earthquake induced liquefaction is known to have caused extensive structural and lifeline damage around the world. Documenting these instances of liquefaction is extremely important to help earthquake professionals better evaluate design procedures and enhance their understanding of liquefaction processes. Currently, after an earthquake event, field-based mapping of liquefaction remains sporadic due to inaccessibility and difficulties in identifying and mapping large areal extents. Researchers have used change detection using remotely sensed pre- and post-event satellite images to assist field reconnaissance. However, general change detection is only a first step in developing effective field reconnaissance strategies for liquefaction, because the approach inherently assumes that all the change observed between the two dates is induced by the liquefaction. We hypothesize that as liquefaction occurs in saturated granular soils due to an increase in pore pressure, the liquefaction related terrain changes should have an associated increase in soil moisture with respect to the surrounding non-liquefied regions. Mapping the increase in soil moisture using pre- and post-event images that are sensitive to soil moisture is suitable for identifying areas that have undergone liquefaction. However, often only coarse resolution pre- and post-event images are available after an earthquake event, making detailed mapping difficult. Therefore, synergistic use of multisensor and multispectral remote sensing images will improve post-liquefaction reconnaissance mapping. We verify this by supervised change detection on fine resolution post-event images with Support Vector Machine (SVM) using training instances derived from a coarser image. The results indicate that satellite remote sensing can be an integral part in strategizing post-earthquake response and reconnaissance as well as for regionally documenting liquefaction failures.
T Oommen, L G Baise (2010)  A Practical Approach for Implementing the Probability of Liquefaction in Performance Based Design.   In: Fifth International Conference on Recent Advances in Geotechnical Earthquake Engineering and Soil Dynamics  
Abstract: Empirical Liquefaction Models (ELMs) are the usual approach for predicting the occurrence of soil liquefaction. These ELMs are typically based on in situ index tests, such as the Standard Penetration Test (SPT) and Cone Penetration Test (CPT), and are broadly classified as deterministic and probabilistic models. No objective and quantitative comparison of these models has been published. The deterministic model provides a “yes/no” response to the question of whether or not a site will liquefy. However, Performance-Based Earthquake Engineering (PBEE) requires an estimate of the probability of liquefaction (PL), which is a quantitative and continuous measure of the severity of liquefaction rather than a deterministic (yes/no) estimate. But probabilistic models are still not consistently used in routine engineering applications. This is primarily due to the limited guidance regarding which model to use, and the difficulty in interpreting the resulting probabilities. The practical implementation of a probabilistic model requires a threshold of liquefaction (THL). The need for a THL arises because engineering decisions require the site to be classified as either liquefiable or non-liquefiable. Thus, a site where PL < THL is classified as non-liquefiable and a site where PL > THL is classified as liquefiable. The researchers who have used probabilistic methods have either come up with a subjective THL or have used the established deterministic curves to develop the THL. However, the importance of the probabilistic approach warrants more objective guidelines for the determination of THL. In this study, we compare the predictive performance of various deterministic and probabilistic ELMs within a quantitative validation framework. This framework includes validation metrics developed within the statistics and artificial intelligence fields that are uncommon in the geotechnical literature. 
We also provide a thorough and reproducible approach to interpret PL using precision and recall and to compute the optimal THL that incorporates estimated costs associated with risk as well as with risk mitigation, using a new metric that we developed called the Precision-Recall (P-R) cost curve. We present the P-R cost curves for the popular probabilistic models developed using Bayesian updating for SPT and CPT data by Cetin et al. (2004) and Moss et al. (2006), respectively. These curves should be immediately useful to a geotechnical engineer who needs to choose among different ELMs and implement one for design purposes.
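The thresholding rule stated in this abstract (a site with PL > THL is classified as liquefiable) and the idea of choosing THL from project costs can be sketched as follows. This is an illustrative assumption, not the authors' P-R cost curve: the cost function here is a simple weighted count of false positives and false negatives, and all function names, cost weights, and data values are hypothetical.

```python
def confusion_counts(pl_values, observed, thl):
    """Count TP, FP, FN, TN when sites with PL > THL are called liquefiable.

    pl_values: predicted probabilities of liquefaction.
    observed: matching booleans, True where liquefaction was observed.
    """
    tp = fp = fn = tn = 0
    for pl, liquefied in zip(pl_values, observed):
        predicted = pl > thl
        if predicted and liquefied:
            tp += 1
        elif predicted and not liquefied:
            fp += 1
        elif liquefied:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision and recall, defined as 0.0 when the denominator is empty."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def best_threshold(pl_values, observed, cost_fp, cost_fn, candidates):
    """Pick the candidate THL with the lowest total misclassification cost.

    cost_fp: assumed cost of unnecessary mitigation (false positive).
    cost_fn: assumed cost of an unmitigated failure (false negative).
    """
    def total_cost(thl):
        _, fp, fn, _ = confusion_counts(pl_values, observed, thl)
        return cost_fp * fp + cost_fn * fn

    return min(candidates, key=total_cost)


# Hypothetical case histories: predicted PL and whether the site liquefied.
demo_pl = [0.1, 0.4, 0.6, 0.9]
demo_observed = [False, True, False, True]
demo_candidates = [0.3, 0.5, 0.7]
thl_when_failure_costly = best_threshold(demo_pl, demo_observed, 1, 5, demo_candidates)
thl_when_mitigation_costly = best_threshold(demo_pl, demo_observed, 5, 1, demo_candidates)
```

In this toy data set the preferred threshold moves from 0.3 to 0.7 as the relative cost of false positives rises, which mirrors the abstract's point that the optimal THL depends on project-specific costs.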
T Oommen, L G Baise, R Gens, A Prakash (2010)  Post-earthquake Health Monitoring of Critical Infrastructure at Haiti to Assist Rapid Relief Efforts   In: Seismological Research Letters, 81(2): 253-283.  
Abstract: On January 12, 2010, a magnitude Mw 7.0 earthquake struck the Port-au-Prince region of Haiti. With recent technological advances, this earthquake has become a test case for using technology in post-earthquake health monitoring and rapid relief efforts. Within days of the earthquake, several sets of satellite and aerial imagery were taken and made available for emergency response efforts. The high resolution imagery provided up to 15 cm resolution, allowing the user to “see” damage on the ground. In another example of technology use in relief efforts, students at the Tufts Fletcher School of Diplomacy together with USHAHIDI provided a real-time mapping portal for reporting crises (i.e. crisis mapping). This portal gave people in Haiti the ability to text message a request for aid; these requests were mapped to create a crisis map and sent to the appropriate relief agencies in real time. In this work, we link the human based data acquired by USHAHIDI with the physical data acquired through remote imagery and engineering field reconnaissance to analyze the health of critical infrastructure and improve technology based rapid relief efforts. This integration of broad spatially available remote data with sparse detailed engineering reports and pervasive personal reports through text messaging provides a richer picture of damage and relief needs than any one data set can provide alone. We use Differential Synthetic Aperture Radar Interferometry (DInSAR) to map the subsidence and displacements that have occurred to the transportation network in Port-au-Prince. This information is then used to characterize the health of the road network. Geographic Information System (GIS) network analysis tools are used to find optimal routes. Finally, we use the crisis mapping data paired with field reconnaissance reports to validate the satellite imagery analysis.
2009
T Oommen, L G Baise, R Gens, A Prakash, R P Gupta (2009)  Documenting Liquefaction Failures Using Satellite Remote Sensing and Artificial Intelligence Algorithms   In: Eos Trans. AGU, Fall Meet. Suppl.  
Abstract: Historically, earthquake induced liquefaction is known to have caused extensive damage around the world. Therefore, there is a compelling need to characterize and map liquefaction after a seismic event. Currently, after an earthquake event, field-based mapping of liquefaction is sporadic and limited due to inaccessibility, short life of the failures, difficulties in mapping large areal extents, and lack of resources. We hypothesize that as liquefaction occurs in saturated granular soils due to an increase in pore pressure, the liquefaction related terrain changes should have an associated increase in soil moisture with respect to the surrounding non-liquefied regions. The increase in soil moisture affects the thermal emittance and, hence, change detection using pre- and post-event thermal infrared (TIR) imagery is suitable for identifying areas that have undergone post-earthquake liquefaction. Though change detection using TIR images gives the first indication of areas of liquefaction, the spatial resolution of TIR images is typically coarser than the resolution of corresponding visible, near-infrared (NIR), and shortwave infrared (SWIR) images. We hypothesize that liquefaction induced changes in the soil and associated surface effects cause textural and spectral changes in images acquired in the visible, NIR, and SWIR. Although these changes can arise from various factors, a synergistic approach taking advantage of the thermal signature variation due to changing soil moisture conditions, together with the spectral information from high resolution visible, NIR, and SWIR bands, can help to narrow down the locations of post-event liquefaction for regional documentation. In this study, we analyze the applicability of combining various spectral bands from different satellites (Landsat, Terra-MISR, IRS-1C, and IRS-1D) for documenting liquefaction failures associated with the magnitude 7.6 earthquake that occurred in Bhuj, India, in 2001. 
We combine the various spectral bands by neighborhood correlation image analysis using an artificial intelligence algorithm called support vector machine to remotely identify and document liquefaction failures across a region; and assess the reliability and accuracy of the thermal remote sensing approach in documenting regional liquefaction failures. Finally, we present the applicability of the satellite data analyzed and appropriateness of a multisensor and multispectral approach for documenting liquefaction related failures.
D Misra, T Oommen, T Radatz, A Thompson (2009)  Using Support Vector Machines to Characterize Runoff-triggering in Small Watersheds.   In: Eos Trans. AGU Fall Meet. Suppl.  
Abstract: Runoff is one of the most complex hydrological phenomena to comprehend due to the tremendous spatial variability of catchment characteristics and precipitation patterns. However, the determination of runoff is critical for flood protection works, effective water storage and release, and protection of agricultural lands. The quantity of runoff depends on parameters such as rainfall intensity, duration, initial soil moisture, land use, and catchment geomorphology or relief. One common approach to estimate runoff is to develop physical models validated with measured data that relate the variables (input - output relationship) in the system. Conversely, this extraction of knowledge from the data requires large datasets, sophisticated modeling techniques as well as human intuition and experience. Additionally, the exact conditions that trigger runoff are difficult to predict because of their dependency on a combination of rainfall intensity, antecedent soil moisture conditions, and physical soil properties. Currently, pattern-learning algorithms based on artificial intelligence have shown promise in developing non-parametric models involving complex processes using few input parameters due to their ability to learn and recognize trends in the data. In this study, we explore the applicability of a sparse pattern-learning algorithm called Support Vector Machines (SVM) for modeling runoff from small watersheds. Results indicate that these methods can be an effective alternative to physical models for identifying runoff generation characteristics. Once identified, characteristics that trigger runoff from catchments, such as rainfall intensity and antecedent soil moisture, may be successfully used for large scale monitoring of watersheds using remote methods such as satellite sensors.
2008
T Oommen, L G Baise, R Gens, A Prakash, R P Gupta (2008)  Multisensor and Multispectral Approach in Documenting and Analyzing Liquefaction Hazard Using Remote Sensing.   In: Eos Trans. AGU, Fall Meet. Suppl.  
Abstract: Seismic liquefaction is the loss of strength of soil due to shaking that leads to various ground failures such as lateral spreading, settlements, tilting, and sand boils. It is important to document these failures after earthquakes to advance our study of when and where liquefaction occurs. The current approach of mapping these failures by field investigation teams suffers from the inaccessibility of some of the sites immediately after the event, the short life of some of these failures, difficulties in mapping the areal extent of the failure, incomplete coverage, etc. After the 2001 Bhuj earthquake (India), researchers, using the Indian remote sensing satellite, illustrated that satellite remote sensing can provide a synoptic view of the terrain and offer unbiased estimates of liquefaction failures. However, a multisensor (data from different sensors onboard the same or different satellites) and multispectral (data collected in different spectral regions) approach is needed to efficiently document liquefaction incidences and/or its potential of occurrence, because a particular satellite may be located inappropriately to image an area shortly after an earthquake. The use of SAR satellite imagery ensures the acquisition of data in all weather conditions at day and night as well as information complementary to the optical data sets. In this study, we analyze the applicability of the various satellites (Landsat, RADARSAT, Terra-MISR, IRS-1C, IRS-1D) in mapping liquefaction failures after the 2001 Bhuj earthquake using Support Vector Data Description (SVDD). The SVDD is a kernel based nonparametric outlier detection algorithm inspired by Support Vector Machines (SVMs), a new generation learning algorithm based on statistical learning theory. We present the applicability of SVDD for unsupervised change-detection studies (i.e. to identify post-earthquake liquefaction failures). 
The liquefaction occurrences identified from the different satellites using SVDD have been compared to the ground truth in terms of documented liquefaction failures by other researchers. We present the applicability and appropriateness of the various satellites and spectral regions for documenting liquefaction related failures. Results illustrate that the SVDD is a promising unsupervised change-detection algorithm, which can help in automating the documentation of earthquake induced liquefaction failures.
T Oommen, T Radatz, A Thompson, D Misra (2008)  Developing Runoff-Triggering Characteristics in Small Watersheds Using Artificial Intelligence Models   In: Annual Meeting, Providence, RI, USA: American Society of Agricultural and Biological Engineers  
Abstract: Runoff is one of the most complex hydrological phenomena to comprehend due to the tremendous spatial variability of catchment characteristics and precipitation patterns. However, the determination of runoff is critical for flood protection works, effective water storage and release, and protection of agricultural lands. The quantity of runoff depends on parameters such as rainfall intensity, duration, initial soil moisture, land use, and catchment geomorphology or relief. One common approach to estimate runoff is to develop physical models validated with measured data that relate the variables (input – output relationship) in the system. Conversely, this extraction of knowledge from the data requires large datasets, sophisticated modeling techniques as well as human intuition and experience. Additionally, the exact conditions that trigger runoff are difficult to predict because of their dependency on a combination of rainfall intensity, antecedent soil moisture conditions, and physical soil properties. Currently, pattern-learning algorithms based on artificial intelligence have shown promise in developing non-parametric models involving complex processes using few input parameters due to their ability to learn and recognize trends in the data. In this study, we explore the applicability of two sparse pattern-learning algorithms, namely Support Vector Machines (SVM) and Relevance Vector Machines (RVM), for modeling runoff from small watersheds. Results show that these methods can be an effective alternative to physical models for identifying runoff generation characteristics. Once identified, characteristics that trigger runoff from catchments, such as rainfall intensity and antecedent soil moisture, may be successfully used for large scale monitoring of watersheds using remote methods such as satellite sensors.
T Oommen, L G Baise (2008)  A New Approach to Liquefaction Potential Mapping Using Remote Sensing and Machine Learning.   In: IEEE International Geoscience & Remote Sensing Symposium 1(III): 51-54.  
Abstract: Earthquake induced ground shaking in areas with saturated sandy soils poses a major threat to communities due to soil liquefaction. Currently liquefaction potential is assessed on two scales: regionally based on surficial geologic unit or locally based on geotechnical sample data. However, the regional maps fail to capture the variability, whereas the collection of geotechnical data on the local scale is costly. Remote sensing products from air and space borne sensors allow us to explore the land surface parameters at different spatial scales. We explore the use of satellite based remote sensing data (Landsat 7 ETM+), together with other satellite derived products and a geologic map, at a test site in California. A supervised classification using Support Vector Machine (SVM) yielded an overall classification accuracy of 84% on test data, indicating that the approach is promising for liquefaction potential mapping.
T Oommen, L G Baise (2008)  Critical Evaluation of Lateral Spread Displacement Using Support Vector Regression and Stack Generalization   In: Seismological Research Letters, 79(2): 277.  
Abstract: Lateral spreading is the most persistent type of liquefaction induced ground failure. During lateral spreading, integral masses of surficial soil displace downslope or towards a free face along a shear zone that has formed within the liquefied soil layer. The amount of displacement due to lateral spreading depends on several factors such as physical and mechanical characteristics of the soil layers at the site, the water table depth, magnitude of the earthquake, distance from the site to the energy source, ground slope conditions, thickness of the critical layer and the attenuation properties of the in situ soil. Since liquefaction induced lateral spreading is governed by several factors, determination of its displacement is a complex geotechnical engineering problem. Currently, machine learning algorithms have become popular in different fields due to their ability to make appropriate predictions that involve a high degree of complexity and nonlinearity. In this study, we evaluate the predictive capability of a new generation machine learning algorithm known as Support Vector Regression (SVR) and the concept of stack generalization for lateral spread displacement. We also critically evaluate the model performance using K-fold cross validation (K = 5) and quantify it using goodness of fit measures such as coefficient of correlation (r) and coefficient of efficiency (E). Finally, we compare the machine learning approach to the widely used multilinear regression (MLR) equations by Youd et al. (2002). The values of r and E for the stack generalization, SVR and the MLR equations are r = 88.7 and E = 78.5, r = 86.2 and E = 74.0, and r = 83.2 and E = 67.3, respectively. Results show that the machine learning algorithms have an improved predictive capability and can be more reliable and robust for the prediction of lateral spread displacement than the more traditional multilinear regression approach.
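The validation measures named in this abstract can be sketched in a few lines. The abstract does not give formulas, so this sketch assumes that r is the Pearson correlation coefficient and that E is the Nash-Sutcliffe form of the coefficient of efficiency; the fold assignment scheme and all function names are likewise hypothetical rather than taken from the study.

```python
import math


def correlation(observed, predicted):
    """Pearson correlation coefficient r (assumed definition)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def efficiency(observed, predicted):
    """Coefficient of efficiency E, assumed here to be the Nash-Sutcliffe form.

    E = 1 - sum((o - p)^2) / sum((o - mean(o))^2); 1.0 is a perfect fit.
    """
    mean_o = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / spread


def kfold_indices(n, k=5):
    """Yield (train, test) index lists for K-fold cross validation.

    Records are dealt to folds round-robin; shuffle beforehand if the
    data have a meaningful order.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# A prediction with a constant bias keeps r at 1 but lowers E.
demo_obs = [1.0, 2.0, 3.0, 4.0]
demo_pred = [2.0, 3.0, 4.0, 5.0]
demo_r = correlation(demo_obs, demo_pred)
demo_e = efficiency(demo_obs, demo_pred)
demo_splits = list(kfold_indices(10, k=5))
```

The example illustrates why the two measures are reported together: r is insensitive to systematic bias in the predictions, whereas E penalizes it.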
Notes:
T Oommen, T Radatz, A Thompson, D Misra (2008)  Probability of Triggering Runoff in Small Watersheds Using Support Vector Machines.   In: 51st Annual Meeting, Association of Environmental & Engineering Geologists  
Abstract: Runoff is one of the most complex hydrological phenomena to comprehend due to the tremendous spatial variability of catchment characteristics and precipitation patterns. However, the determination of runoff is critical for flood protection works, effective water storage and release, and protection of land. The quantity of runoff depends on parameters such as rainfall intensity, duration, initial soil moisture, land use, and catchment geomorphology or relief. One common approach to estimate runoff is to develop physical models validated with measured data that relate the variables (input-output relationship) in the system. This extraction of knowledge from the data, however, requires large datasets, sophisticated modeling techniques as well as human intuition and experience. Additionally, the exact conditions that trigger runoff are difficult to predict because of their dependency on a combination of rainfall intensity, antecedent soil moisture conditions, and physical soil properties. Currently, pattern-learning algorithms based on artificial intelligence have shown promise in developing non-parametric models involving complex processes using few input parameters due to their ability to learn and recognize trends in the data. In this study, we explore the applicability of a sparse pattern-learning algorithm called Support Vector Machines (SVM) for modeling runoff from small watersheds. Results show that these methods can be an effective alternative to physical models for identifying runoff generation characteristics. Once identified, characteristics that trigger runoff from catchments, such as rainfall intensity and antecedent soil moisture, may be successfully used for large scale monitoring of watersheds using remote methods such as satellite sensors.
Notes:
D Misra, T Oommen, H G Prasanna, S G Bajwa, T A Howell (2008)  Estimation of Leaf Area Index from Landsat Imagery for Texas High Plains using Support Vector Machines   In: Eos Trans. AGU, Fall Meet. Suppl.  
Abstract: Leaf Area Index (LAI) is an important hydrologic parameter used for the estimation of evapotranspiration (ET) amongst other processes. Remote sensing provides an inexpensive and non-destructive tool for collecting LAI information on various spatial and temporal scales. In this study, we have developed models based on support vector machines (SVM) to estimate LAI using Landsat 5 Thematic Mapper (TM) data. We have used the Normalized Difference Vegetation Index (NDVI) and the Structure Independent Pigment Index (SIPI), indices developed using combinations of reflectance of different bands of the TM data, as input for the LAI estimation. The model was calibrated and validated using 47 randomly selected synchronized field measurements from the Landsat 5 overpass data over Moore and Ochiltree Counties located in the Texas High Plains. The initial model, developed using a training data set of 35 data points and tested with the remaining 12 data points, provided results comparable to those estimated using artificial neural network (ANN) approaches. Models such as least-squares SVM and relevance vector machines (RVM) are being developed along with a K-fold cross validation using simple Near Infrared and Red band data besides NDVI and SIPI to improve the LAI estimates. The SVM method has been proven to be robust, especially with sparse data availability. We expect to apply the model with improved LAI estimates in the neighborhood of the study area to develop LAI maps that may be used for mapping ET and crop yield at a regional scale.
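As an illustration of the NDVI input mentioned in this abstract, the index is the normalized difference of near-infrared and red reflectance, which for Landsat 5 TM correspond to bands 4 and 3. The sketch below assumes the inputs are already reflectance values; the function name and the handling of a zero denominator are hypothetical choices, and SIPI is left out because the abstract does not say which TM bands were used for it.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red) for a single pixel.

    `nir` and `red` are assumed to be reflectance values (Landsat 5 TM
    bands 4 and 3). Returns NaN when both are zero, an illustrative choice.
    """
    denominator = nir + red
    if denominator == 0:
        return float("nan")
    return (nir - red) / denominator


def ndvi_image(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

For example, a pixel with NIR reflectance 0.5 and red reflectance 0.1 gives (0.5 - 0.1) / (0.5 + 0.1), about 0.67, a value typical of dense green vegetation.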
Notes:
2007
T Oommen, D Misra, A Agarwal, S K Mishra (2007)  Analysis and Application of Support Vector Machine Based Simulation for Runoff and Sediment Yield.   In: Annual Meeting, Minneapolis, MN, USA: American Society of Agricultural and Biological Engineers  
Abstract: Physics-based models for simulation of runoff and sediment yield from watersheds are quite complex and involved due to tremendous spatial variability of watershed characteristics and precipitation patterns. Recently, pattern-learning algorithms such as artificial neural networks (ANN) have gained popularity in simulating the rainfall-runoff-sediment yield processes, producing comparable accuracy. We have simulated daily, weekly, and monthly runoff and sediment yield from an Indian watershed (area = 7820 sq. km), with data from the monsoon period, using support vector machines (SVM), a statistical learning theory based pattern-learning algorithm. The performance of the model was evaluated using correlation coefficient (r) and coefficient of efficiency (E). The time series data was split into a training set for the learning process and a prediction set for comparison of the model’s forecasting ability. The results of SVM were compared to those of ANN. We concluded that SVM provided significant improvement in both training and prediction abilities as compared to those of ANN. Because ANN is a computationally intensive method, SVM could be used as an efficient alternative for runoff and sediment yield predictions, providing at least comparable accuracy.
Notes:
T Oommen, L G Baise (2007)  A New Approach to Liquefaction Potential Mapping Using Remote Sensing and Machine Learning   In: Eos Trans. AGU, Fall Meet. Suppl.  
Abstract: In order to help communities better plan and mitigate the effects of seismic hazards, it is important to use innovations in science and technology to improve our techniques for mapping the spatial extents of seismic hazards. Earthquake-induced ground shaking in areas with saturated sandy soils poses a major threat to communities as a result of soil liquefaction. Liquefaction is the process of changing a saturated cohesionless soil from a solid to a liquid state due to increased pore pressure. Many major earthquakes, especially those in coastal regions, result in liquefaction-related ground failures that can lead to infrastructure damage or slope stability issues. Currently, liquefaction potential is assessed on two scales: regionally based on surficial geologic unit or locally based on geotechnical sample data. Regional liquefaction potential maps fail to capture the variability of liquefaction potential on the local scale. On the other hand, collection of geotechnical data on the local scale is costly and only done for specific engineering projects, and the data are therefore not generally available for regional mapping. Today, the advent of advanced remote sensing products from air and space borne sensors allows us to explore the land surface parameters (geology, moisture content, temperature) at different spatial scales (remote sensor footprint). In this study, we explore the use of satellite based remote sensing data (Landsat 7 ETM+), together with digital elevation model, ground water table, land cover classification, geology, water index and normalized difference vegetation index (NDVI) to characterize the liquefaction potential of northern Monterey and southern Santa Cruz counties in California. A supervised classification of the data into seven classes based on the liquefaction potential map developed by Dupre and Tinsley (1980) was done using Support Vector Machine (SVM). 
SVM is a machine learning/artificial intelligence algorithm that has the ability to simulate the learning capabilities of a human brain and make appropriate predictions that involve intuitive judgments and a high degree of nonlinearity. The accuracy of the developed liquefaction potential map was tested using independent testing data that was not used for the model development. The results show that the developed liquefaction potential map has an overall classification accuracy of 84%, indicating that the combination of remote sensing data and other relevant spatial data together with machine learning can be a promising approach for liquefaction potential mapping.
Notes:
T Oommen, L G Baise (2007)  A New Approach to Liquefaction Potential Mapping Using Remote Sensing and Machine Learning.   In: Northeast Geotechnical Graduate Research Symposium UMass Amherst, MA, USA:  
Abstract: In order to help communities better plan and mitigate the effects of seismic hazards, it is important to use innovations in science and technology to improve our techniques for mapping the spatial extents of seismic hazards. Earthquake-induced ground shaking in areas with saturated sandy soils poses a major threat to communities as a result of soil liquefaction. Liquefaction is the process of changing a saturated cohesionless soil from a solid to a liquid state due to increased pore pressure. Many major earthquakes, especially those in coastal regions, result in liquefaction-related ground failures that can lead to infrastructure damage or slope stability issues. Currently, liquefaction potential is assessed on two scales: regionally based on surficial geologic unit or locally based on geotechnical sample data. Regional liquefaction potential maps fail to capture the variability of liquefaction potential on the local scale. On the other hand, collection of geotechnical data on the local scale is costly and only done for specific engineering projects, and the data are therefore not generally available for regional mapping. Today, the advent of advanced remote sensing products from air and space borne sensors allows us to explore the land surface parameters (geology, moisture content, temperature) at different spatial scales (remote sensor footprint). In this study, we explore the use of satellite based remote sensing data (Landsat 7 ETM+), together with digital elevation model, ground water table, land cover classification, geology, water index and normalized difference vegetation index (NDVI) to characterize the liquefaction potential of northern Monterey and southern Santa Cruz counties in California. A supervised classification of the data into seven classes based on the liquefaction potential map developed by Dupre and Tinsley (1980) was done using Support Vector Machine (SVM). 
SVM is a machine learning/artificial intelligence algorithm that has the ability to simulate the learning capabilities of a human brain and make appropriate predictions that involve intuitive judgments and a high degree of nonlinearity. Figure 1 shows a comparison of the developed liquefaction potential map using SVM to the map of Dupre and Tinsley (1980). It is observed that the spatial variability in liquefaction potential is well captured by the developed map. The accuracy of the developed liquefaction potential map was tested using independent testing data that was not used for the model development. The results show that the developed liquefaction potential map has an overall classification accuracy of 84%, indicating that the combination of remote sensing data and other relevant spatial data together with machine learning can be a promising approach for liquefaction potential mapping. Further, machine learning will be used to help understand the relative importance of the various parameters in identifying liquefaction hazard and to optimize future data collection efforts.
Notes:
2006
D Misra, T Oommen, A Agarwal, S K Mishra (2006)  Simulation of Runoff and Sediment Yield Using Support Vector Machines: A Preliminary Analysis.   In: Eos Trans. AGU, Fall Meet. Suppl.  
Abstract: Physics-based models for simulation of runoff and sediment yield from watersheds are quite complex and involved due to tremendous spatial variability of watershed characteristics and precipitation patterns. Recently, pattern-learning algorithms such as artificial neural networks (ANN) have gained popularity in simulating the rainfall-runoff-sediment yield processes, producing comparable accuracy. We have simulated daily, weekly, and monthly runoff and sediment yield from an Indian watershed (area = 7820 sq. km), with data from the monsoon period, using support vector machines (SVM), a statistical learning theory based pattern-learning algorithm. The performance of the model was evaluated using root mean square error (RMSE), correlation coefficient (CC) and coefficient of efficiency (CE). The time series data was split into a training set for the learning process and a prediction set for comparison of the model's forecasting ability. The results of SVM were compared to those of ANN. We concluded that SVM provided significant improvement in both training and prediction abilities as compared to batch processing schemes of ANN. However, compared with the online processing schemes of ANN, SVM provided comparable or slightly improved results. Because ANN is a computationally intensive method, SVM could be used as an efficient alternative for runoff and sediment yield predictions with comparable prediction accuracy.
Notes:
B Sahoo, T Oommen, D Misra, G Newby (2006)  Application of 1D S-Transform in Discrimination Problems in Remote Sensing   In: Eos Trans. AGU, Fall Meet. Suppl.  
Abstract: Image classification is a standard part of processing remote sensing data and is based on the assumption that each pixel belongs to a class or theme with a unique spectral signature. However, a common challenge in remote sensing is image discrimination, also known as "collinearity". It may be defined as the phenomenon where multiple themes exhibit very similar spectral patterns within a wavelength range of interest. As a result, the desired classification accuracy might not be achieved. A robust discrimination technique must have the capability to detect very minor spectral differences between classes with similar spectral signatures. The one-dimensional S-Transform, a spectral localization technique, was used to discriminate similar lithologic classes on hyper-spectral satellite images. We investigated the efficiency of the S-amplitude spectra in enhancing the spectral information of each pixel of a known class. We compared the overall accuracy of classified themes using a Support Vector Classification (SVC) scheme, with and without the enhanced spectral information. We found that SVC aided by spectral enhancement from the S-Transform provided better classification accuracy. Thus, this method may prove very useful in scenarios where pixels of a known class are sparse and not easily separable.
Notes:
T Oommen, D Misra, A Prakash, S Bandopadhyay, S Naidu, J J Kelley (2006)  Marine Geodatabase and Multiple Regressive Pattern Recognition Technique: A New Approach to Marine Placer Resource Assessment.   In: Eos Trans. AGU Fall Meet. Suppl.  
Abstract: The ultramafic rocks of the Red Mountain in the Goodnews Bay area of southwest Alaska have been the commercial source of onshore placer Pt since 1926. The proximity of the Red Mountain to the Bering Sea, our geophysical survey revealing the possibility of drowned ultramafic and paleo-drainage channels offshore, and the platinum samples collected by various agencies suggest the availability of a significant quantity of marine Pt accumulations in this region. We have created a comprehensive geodatabase for future Pt prospecting and possible exploration in the offshore regions of Goodnews Bay. Offshore exploration needs a preliminary assessment of the marine Pt resource. We have used several regression techniques such as inverse distance weighting, kriging, radial basis function, support vector machines (SVM) and relevance vector machines for our assessment. None of these techniques individually was able to capture the entire Pt data variability obtained from the sampled data. This could be due simply to the limitations of the methods used or to the complexity of the governing processes that influence the accumulation of marine Pt, such as glaciations, littoral currents, bathymetry, sea-level transgression, or paleo-drainage processes, which are difficult to include quantitatively in the assessment. To obtain improved accuracy of assessment, we propose a new method called the Multiple Regressive Pattern Recognition Technique (MRPRT). We hypothesize that by using the outputs of the different individual regression techniques as the input for a pattern recognition technique, such as the SVM, we will be able to overcome the shortcomings of these regression methods discussed above. The performance of MRPRT was evaluated using the coefficient of correlation (CC) and the coefficient of efficiency (CE). With MRPRT, the CC of our prediction improved from 0.57 to 0.77 and the CE from 0.28 to 0.43. 
Post comparative analysis of the predicted marine Pt resource with the different governing
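The two-stage idea behind MRPRT, feeding the outputs of individual regression techniques into a second-stage learner, might be sketched as below. This is an illustration under stated assumptions, not the authors' method: the abstract names SVM as the second stage, whereas this sketch substitutes a one-parameter least-squares blend to stay self-contained, uses only two simple base estimators (inverse distance weighting and nearest neighbor), and builds the second-stage inputs by leave-one-out. All function names are hypothetical.

```python
import math


def idw_predict(points, values, query, power=2.0):
    """Inverse distance weighting estimate at `query` from sampled points."""
    numerator = denominator = 0.0
    for (x, y), value in zip(points, values):
        distance = math.hypot(x - query[0], y - query[1])
        if distance == 0.0:
            return value  # query coincides with a sample
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def nearest_predict(points, values, query):
    """Value of the sample closest to `query` (first one wins on ties)."""
    nearest = min(
        range(len(points)),
        key=lambda i: math.hypot(points[i][0] - query[0], points[i][1] - query[1]),
    )
    return values[nearest]


def leave_one_out(predict, points, values):
    """Predict each sample from all the others, giving out-of-sample outputs."""
    return [
        predict(points[:i] + points[i + 1:], values[:i] + values[i + 1:], points[i])
        for i in range(len(points))
    ]


def fit_blend_weight(a, b, y):
    """Least-squares w for y ~ w*a + (1 - w)*b; stands in for the SVM stage."""
    numerator = sum((yi - bi) * (ai - bi) for ai, bi, yi in zip(a, b, y))
    denominator = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return numerator / denominator if denominator else 0.5


def stacked_predict(points, values, query):
    """Blend the two base estimates with a weight learned from their outputs."""
    a = leave_one_out(idw_predict, points, values)
    b = leave_one_out(nearest_predict, points, values)
    w = fit_blend_weight(a, b, values)
    return w * idw_predict(points, values, query) + (1 - w) * nearest_predict(
        points, values, query
    )
```

Here `points` is a list of (x, y) sample locations and `values` the measured quantity at each; the leave-one-out step keeps the second stage from being trained on base predictions that have already seen the target sample.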
Notes:
2005
T Oommen, J J Kelley, S Naidu, A Prakash, D Misra, S Bandopadhyay (2005)  A Preliminary GIS Analysis of Marine Geophysical Signatures to Decipher Distribution Patterns of Platinum Placer in Offshore Goodnews Bay Region, Alaska   In: Eos Trans. AGU, Fall Meet. Suppl.  
Abstract: The Goodnews Bay region, located in southwest Alaska, was the only primary platinum-producing region in the U.S. until 1980, with about 20 metric tons of platinum recovered. The U.S. Geological Survey has estimated the coastal platinum placer potential of the Goodnews Bay region to be 155 metric tons. In order to estimate and locate the offshore platinum placer distribution in this region, a GIS database was developed with the information derived from data collected over more than 50 years by industries and government agencies. These data include comprehensive information on grain mineralogy, modes of offshore placer transport, bathymetry, location of paleo-channels, geology, glacial history, extent of drowned ultramafic rocks, wave-current direction, geophysical surveys and knowledge of paleo-shorelines. In this study, we have used geophysical data sets from earlier research and have compared those to the recent data acquisition from our "Platinum Cruise 2005" to Kuskokwim Bay. GIS analysis has been used to decipher the influence of various subsurface marine features, as obtained from the geophysical signatures, on the depositional pattern of placer platinum in this region.
Notes:

Technical reports

2008

Masters theses

2006
T Oommen (2006)  Geodatabase Development and GIS Based Analysis for Resource Assessment of Placer Platinum in the Offshore Region of Goodnews Bay, Alaska   University of Alaska Fairbanks  
Abstract: Goodnews Bay, southwest Alaska, is known for extensive Pt reserves that have their source in the neighboring Red Mountain. The reserves potentially extend offshore into the Bering Sea. This study aims at developing a geodatabase to integrate all offshore platinum related data collected by researchers and agencies in the past, with the intent to identify data gaps. Based on these data gaps, 49 new areas were sampled for Pt, and geophysical data were collected in summer 2005. A spatial distribution map for offshore Pt was created using a new Multiple Regression Pattern Recognition Technique (MRPRT) that gave an R² = 0.76, a significant improvement over standard GIS based geospatial techniques. Four potential Pt exploration areas were delineated, including one area where drowned ultramafics and buried alluvial channels co-occur. Coastal currents influenced the surficial platinum accumulations, and no clear relation between Pt distribution and sand bars in the far offshore could be established.
Notes:

PhD theses

2010