Jinwei Liu

Cyber Intelligence Lab

Education:

Ph.D. in Computer Engineering/Clemson University

M.S. in Computer Science/Clemson University, University of Science and Technology of China

Contact:
Phone: (407) 882-1323
E-mail: jliu@ist.ucf.edu

Jinwei Liu received the M.S. degree in Computer Science from Clemson University, SC, USA, and the University of Science and Technology of China. He received his Ph.D. degree in Computer Engineering from Clemson University, SC, USA, and is currently a Postdoctoral Associate at the University of Central Florida. He is a member of the Cyber Intelligence Lab (CI Lab) and contributes to multi-year, multi-institution team research projects to advance the understanding and practice of online social simulation. Before joining the University of Central Florida, he was a Research Associate at the University of Virginia. His major research interests include machine learning and data mining, cloud computing and datacenters, network science (e.g., social networks, wireless networks, SDN, NFV), big data, IoT, cybersecurity, and HPC. He has served on the program committees of several international conferences. He is a member of the IEEE and the ACM.

Projects/Honors/Publications:

Optical Cloud Pixel Recovery via Machine Learning

The remote-sensing-derived Normalized Difference Vegetation Index (NDVI) is widely used to monitor vegetation and land use change. NDVI can be retrieved from publicly available data repositories of optical sensors such as Landsat, the Moderate Resolution Imaging Spectroradiometer (MODIS), and several commercial satellites. Studies that depend heavily on optical sensors are subject to data loss due to cloud coverage. Specifically, cloud contamination hinders long-term environmental assessment when using satellite imagery retrieved from the visible and infrared spectral ranges. Landsat has an ongoing high-resolution NDVI record starting in 1984. Unfortunately, this long NDVI time series suffers from cloud contamination. Although both simple and complex computational methods for data interpolation have been applied to recover cloudy data, all of these techniques have limitations. In this paper, a novel Optical Cloud Pixel Recovery (OCPR) method is proposed to repair cloudy pixels from the time-space-spectrum continuum using a Random Forest (RF) trained and tested with multi-parameter hydrologic data. The RF-based OCPR model is compared with a linear regression model to demonstrate the capability of OCPR. A case study in Apalachicola Bay is presented to evaluate the performance of OCPR in repairing cloudy NDVI reflectance. The RF-based OCPR method achieves a root mean squared error of 0.016 between predicted and observed NDVI reflectance values, whereas the linear regression model achieves a root mean squared error of 0.126. Our findings suggest that the RF-based OCPR method is effective at repairing cloudy pixels and provides continuous and quantitatively reliable imagery for long-term environmental analysis.
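
For readers unfamiliar with the two quantities this abstract relies on, the following is a minimal sketch of the standard NDVI formula and of the root mean squared error used to compare predicted and observed values. The sample numbers are hypothetical and are not taken from the Apalachicola Bay case study; the sketch does not implement OCPR itself.

```python
import math

def ndvi(nir, red):
    """Standard NDVI definition: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)

def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical NDVI values for three clear-sky pixels (illustration only).
observed = [0.50, 0.60, 0.70]
predicted = [0.52, 0.58, 0.70]

example_ndvi = ndvi(nir=0.5, red=0.1)
example_rmse = rmse(predicted, observed)
```

A lower RMSE means the recovered pixel values sit closer to the observed ones, which is the basis for the comparison between the two models reported above.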

Valley and channel networks extraction based on local topographic curvature and k-means clustering of contours

A method for automatic extraction of valley and channel networks from high‐resolution digital elevation models (DEMs) is presented. This method utilizes both positive (i.e., convergent topography) and negative (i.e., divergent topography) curvature to delineate the valley network. The valley and ridge skeletons are extracted using the pixels' curvature and the local terrain conditions. The valley network is generated by checking the terrain for the existence of at least one ridge between two intersecting valleys. The transition from unchannelized to channelized sections (i.e., channel head) in each first‐order valley tributary is identified independently by categorizing the corresponding contours using an unsupervised approach based on k‐means clustering. The method does not require a spatially constant channel initiation threshold (e.g., curvature or contributing area). Moreover, instead of a point attribute (e.g., curvature), the proposed clustering method utilizes the shape of contours, which reflects the entire cross‐sectional profile including possible banks. The method was applied to three catchments: Indian Creek and Mid Bailey Run in Ohio and Feather River in California. The accuracy of channel head extraction from the proposed method is comparable to state‐of‐the‐art channel extraction methods.
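
The clustering step can be illustrated with a small sketch. It assumes that each contour along a first-order valley has already been reduced to a single hypothetical shape descriptor (the abstract does not specify the features used), and it applies a plain two-cluster k-means in one dimension. The first contour whose cluster differs from the upstream cluster is taken as the candidate channel head. This is an illustration of the idea, not the published algorithm.

```python
def kmeans_two_clusters(values, iterations=20):
    """Two-cluster k-means (Lloyd's algorithm) on 1-D data.

    Centroids start at the minimum and maximum value, so the result
    is deterministic for a given input.
    """
    centroids = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to the nearest centroid.
        labels = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return labels, centroids

# Hypothetical contour shape descriptors, ordered upstream to downstream.
descriptors = [0.10, 0.12, 0.15, 0.55, 0.60, 0.58]
labels, centroids = kmeans_two_clusters(descriptors)

# Index of the first contour that leaves the upstream cluster.
channel_head_index = next(
    i for i, lab in enumerate(labels) if lab != labels[0]
)
```

Because the split comes from the data in each valley, no single initiation threshold has to be fixed for the whole catchment, which matches the motivation given in the abstract.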

A Random Forest Model Based on Lidar and Field Measurements for Parameterizing Surface Roughness in Coastal Modeling

A novel technique for parameterizing surface roughness in coastal inundation models using airborne laser scanning (lidar) data is presented. Two parameters important to coastal overland flow dynamics, Manning's n (bottom friction) and effective aerodynamic roughness length (wind speed reduction), are computed based on a random forest (RF) regression model trained using field measurements from 24 sites in Florida fused with georegistered lidar point cloud data. The lidar point cloud for each test site is separated into ground and nonground classes, and the z-dimensional (height or elevation) variance from the least squares regression plane is computed, along with the height of the nonground regression plane. These statistics serve as the predictor variables in the parameterization model. The model is then tested using a bootstrap subsampling procedure that removes one record without replacement, trains the model on the surviving records, and predicts the surface roughness parameter of the removed record. When compared with the industry standard technique of assigning surface roughness parameters based on published land use/land cover type, the RF regression models reduce the parameterization error by 93% (from 0.086 to 0.006) and 53% (from 1.299 m to 0.610 m) for Manning's n and effective aerodynamic roughness length, respectively. These error reductions will improve water level and velocity predictions in coastal models.
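
The testing procedure described above (hold out one record, train on the rest, predict the held-out record) can be sketched as follows. The `fit_mean` and `predict_mean` helpers are hypothetical stand-ins that simply predict the training mean; they are not the random forest model from the paper, and the sample records are invented. The `relative_reduction` helper reproduces the percentage arithmetic for the error values quoted in the abstract.

```python
def leave_one_out_errors(records, fit, predict):
    """Hold out each record in turn, train on the rest, predict the held-out one.

    records: list of (features, target) pairs.
    Returns the signed prediction error for every held-out record.
    """
    errors = []
    for i, (features, target) in enumerate(records):
        training = records[:i] + records[i + 1:]
        model = fit(training)
        errors.append(predict(model, features) - target)
    return errors

def fit_mean(training):
    """Hypothetical stand-in model: remember the mean training target."""
    return sum(target for _, target in training) / len(training)

def predict_mean(model, features):
    """Hypothetical stand-in prediction: ignore features, return the mean."""
    return model

def relative_reduction(baseline, improved):
    """Fractional reduction of an error relative to a baseline error."""
    return (baseline - improved) / baseline

# Invented records: (lidar-derived predictors, measured roughness value).
records = [((0.1,), 0.02), ((0.4,), 0.04), ((0.9,), 0.06)]
errors = leave_one_out_errors(records, fit_mean, predict_mean)

# Percentages implied by the error values quoted in the abstract.
mannings_reduction = relative_reduction(0.086, 0.006)
roughness_length_reduction = relative_reduction(1.299, 0.610)
```

Any regressor with a fit/predict pair could be passed in place of the stand-in helpers; the holdout loop itself does not depend on the model.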

Adjusting lidar-derived digital terrain models in coastal marshes based on estimated above ground biomass density

Digital elevation models (DEMs) derived from airborne lidar are traditionally unreliable in coastal salt marshes because the laser cannot penetrate the dense grasses and reach the underlying soil. To address this, we present a novel processing methodology that uses ASTER Band 2 (visible red), an interferometric SAR (IfSAR) digital surface model, and lidar-derived canopy height to classify biomass density using both a three-class scheme (high, medium and low) and a two-class scheme (high and low). Elevation adjustments associated with these classes, derived using both median and quartile approaches, were applied to bring lidar-derived elevation values closer to the true bare earth elevation. The performance of the method was tested on 229 elevation points in the lower Apalachicola River Marsh. The two-class quartile-based adjusted DEM produced the best results, reducing the RMS error in elevation from 0.65 m to 0.40 m, a 38% improvement. The raw mean errors for the lidar DEM and the adjusted DEM were 0.61 ± 0.24 m and 0.32 ± 0.24 m, respectively, thereby reducing the high bias by approximately 49%.
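
A class-based elevation adjustment of this kind might look like the sketch below. It assumes, as an illustration only, that the adjustment for each biomass class is the median of the lidar-minus-survey error at calibration points in that class; the abstract does not give the exact median or quartile rules, and the sample points are invented.

```python
from statistics import median

def class_adjustments(samples):
    """Median lidar-minus-survey error for each biomass class.

    samples: list of (biomass_class, lidar_elevation_m, surveyed_elevation_m).
    """
    errors_by_class = {}
    for biomass_class, lidar, surveyed in samples:
        errors_by_class.setdefault(biomass_class, []).append(lidar - surveyed)
    return {cls: median(errs) for cls, errs in errors_by_class.items()}

def adjust_elevation(lidar_elevation, biomass_class, adjustments):
    """Lower a lidar elevation by the adjustment for its biomass class."""
    return lidar_elevation - adjustments[biomass_class]

# Invented calibration points (two-class scheme: high and low biomass).
samples = [
    ("high", 1.0, 0.3),
    ("high", 1.2, 0.6),
    ("high", 0.9, 0.4),
    ("low", 0.5, 0.3),
    ("low", 0.6, 0.5),
]
adjustments = class_adjustments(samples)
adjusted = adjust_elevation(1.1, "high", adjustments)

# RMS error improvement implied by the values quoted in the abstract.
rms_improvement = (0.65 - 0.40) / 0.65
```

Denser vegetation gets a larger downward correction in this sketch, reflecting the high bias that the abstract attributes to the laser not reaching the soil.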

Review of wetting and drying algorithms for numerical tidal flow models

A review of wetting and drying (WD) algorithms used by contemporary numerical models based on the shallow water equations is presented. The numerical models reviewed employ WD algorithms that fall into four general frameworks: (1) specifying a thin film of fluid over the entire domain; (2) checking if an element or node is wet, dry or potentially one of the two, and subsequently adding or removing elements from the computational domain; (3) linearly extrapolating the fluid depth onto a dry element and its nodes from nearby wet elements and computing the velocities; and (4) allowing the water surface to extend below the topographic ground surface. This review presents the benefits and drawbacks of each framework in terms of accuracy, robustness, computational efficiency, and conservation properties. The WD algorithms also tend to be highly tailored to the numerical model they serve and are therefore difficult to generalize. Furthermore, the lack of temporally and spatially defined validation data has hampered comparisons of the models in terms of their ability to simulate WD over real domains. A short discussion of this topic is included in the conclusion.
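
As a rough illustration of the second framework (checking whether elements are wet or dry and adding or removing them from the computational domain), the sketch below classifies an element from the water depths at its nodes. The minimum depth `h_min` of 0.05 m, the three-way labels, and the choice to keep partially wet elements active are all assumptions made for this example; actual models differ in each of these details.

```python
def classify_element(node_depths, h_min=0.05):
    """Label an element from the water depths (m) at its nodes.

    h_min is a hypothetical minimum depth below which a node counts as dry.
    """
    wet_nodes = [depth > h_min for depth in node_depths]
    if all(wet_nodes):
        return "wet"
    if not any(wet_nodes):
        return "dry"
    return "partial"

def active_elements(elements, h_min=0.05):
    """Indices of elements kept in the computational domain.

    Assumption for this sketch: only fully dry elements are removed.
    """
    return [
        index
        for index, depths in enumerate(elements)
        if classify_element(depths, h_min) != "dry"
    ]

# Invented triangular elements, each given by three nodal depths in meters.
elements = [
    [0.20, 0.30, 0.10],  # all nodes above h_min
    [0.00, 0.01, 0.02],  # all nodes at or below h_min
    [0.20, 0.00, 0.30],  # mixed
]
states = [classify_element(depths) for depths in elements]
active = active_elements(elements)
```

In a time-stepping model this check would be repeated every step, so elements can rejoin the domain as the tide rises and leave it as the tide falls.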