Computer Science and Engineering
Recent Submissions
Item
Detecting and Classifying Malware in Electrical Power Grids Via Cyberdeception (2024-01-01)
Omar, Tallal Mohamed; Zohdy, Mohamed A; Edwards, William; Caushaj, Eralda; Sutton, Sara
Artificial intelligence (AI) has become an essential instrument for enterprises aiming to protect their digital assets in an increasingly hostile cyber landscape. As companies and individuals grow more dependent on digital technologies, cyberattacks are advancing in both complexity and magnitude. AI and the proliferation of technology have raised significant security concerns, mostly due to the escalating prevalence of malware on industrial computers, which can cause physical harm to computer systems and the people who operate them. Malware is malicious programming code that aims to inflict harm on computer systems, programs, or online applications. These applications cannot, by themselves, differentiate between legitimate system calls and those intended to cause harm; computer systems and online applications must therefore be constructed so that malicious activity can be identified and distinguished from legitimate application activity. The utilization of AI in cybersecurity is revolutionizing digital protection, and a variety of techniques leveraging AI, machine learning, and deep learning can be used to identify malicious activity. The present study proposes AI approaches to identify and mitigate malware activity in computer memory, with the aim of safeguarding against unauthorized access to and manipulation of physical data within the system. This research combines the traditional K-means algorithm with other methods and functionalities to perform data aggregation tasks on a physical dataset.
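As a rough illustration of how K-means clustering can flag the anomalies that would trigger a decoy thread, here is a minimal sketch; the toy sensor readings, the hand-rolled Lloyd's loop, and the distance threshold are illustrative assumptions, not the dissertation's implementation:

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Minimal Lloyd's algorithm; returns the k cluster centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            clusters[nearest].append(p)
        centroids = [tuple(sum(vals) / len(c) for vals in zip(*c)) if c
                     else centroids[i] for i, c in enumerate(clusters)]
    return centroids

def anomalies(points, centroids, threshold):
    """Flag points farther than `threshold` from every centroid."""
    return [p for p in points
            if min(dist2(p, c) for c in centroids) > threshold ** 2]

# Baseline of normal readings forming two tight clusters.
baseline = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9),
            (5.0, 5.1), (4.9, 5.0), (5.1, 4.9)]
centroids = kmeans(baseline, k=2)

# New readings: one normal, one far from both clusters.
new_readings = [(1.05, 1.0), (20.0, 20.0)]
flagged = anomalies(new_readings, centroids, threshold=3.0)
print(flagged)  # → [(20.0, 20.0)] — the distant reading would trigger the decoy
```

In a pipeline like the one described, a flagged reading would be the event that spawns the decoy thread for further analysis.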
The primary objective is to identify anomalies in the dataset using clustering techniques. These anomalies will serve as triggers for creating a replica of the main process as a decoy thread equipped with decoy sensors and actuators. The analysis will be conducted on the decoy thread rather than the main process, allowing for intrusive observation. Memory-resident malware will be provided with the same host environment, enabling it to continue operating within the main operating system process, while the analysis uses the replicated instance of the malware residing within the decoy thread.

Item
Resilient Suppliers Selection System Using Machine Learning Algorithms and Risk Assessment Methodology (2022-01-01)
Albadrani, Abdullah Meteb; Zohdy, Mohamed A.; Edwards, William; Ruegg, Erica
Demand, supply, pricing, and lead time are all unpredictable in the manufacturing industry, and the manufacturer must function in this environment. Because of the enormous amount of data available and the introduction of new technologies such as the Internet of Things (IoT), machine learning (ML), and blockchain, administrators and government officials are better able to deal with uncertainty by applying intelligent decision-making principles to their situations. All supply chains must make use of new technology and analyze historical data to forecast and improve the success of future operations. Today we rely on the supply chain and its facilities for healthcare when we receive a vaccine, for food when we go grocery shopping, and for transportation when we drive our cars. Supplier selection depends most heavily on three factors: quality, delivery, and performance history, which are typically evaluated separately. The use of data analytic capabilities in the selection of robust supplier portfolios has not been thoroughly investigated.
Manufacturers typically have three to four resilient suppliers for the same item, but occasionally one or two of them fail, causing a ripple effect throughout the entire supply chain. This is a frequent problem that the supply chain must deal with on a daily basis. Existing work on supply chain resilience, however, ignores or is incompatible with the risk profiles of suppliers' performance.

Item
Application of Several Machine Learning Algorithms for Multiple Stage Inference Data (2024-01-01)
Amen, Khalid A; Zohdy, Mohamed A; Rrushi, Julian; McDonald, Gary; Mahmoud, Mohammed
Historically, machine learning techniques have relied on data from two distinct stages to predict and identify particular occurrences. The outcome of such a study is either valid or inaccurate, represented by the binary values one and zero; in other words, it is a prediction of one of two possible results. This approach suffers from several issues that can yield inaccurate outcomes, including data imbalance, overfitting, and error propagation. This study employs a multiple-stage outcome approach to enhance accuracy and optimize the performance of outcomes. In this step of our research, we implement the multiclass one-vs.-all methodology to analyze the data collected from the various stages of the experiment. In the subsequent phase, a diverse range of candidate supervised models is trained using machine learning algorithms, and the model exhibiting the highest accuracy is selected.
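The one-vs.-all scheme can be sketched as training one binary model per class and predicting via the highest score; the toy perceptron base learner and the synthetic three-class data below are stand-ins for illustration, not the study's actual models:

```python
# Minimal one-vs.-all (one-vs-rest) multiclass scheme on a toy perceptron.

def train_perceptron(X, y, epochs=100):
    """Binary perceptron, labels in {0, 1}; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            err = yi - pred
            w = [wj + err * xj for wj, xj in zip(w, xi)]
            b += err
    return w, b

def score(model, x):
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_ovr(X, y, classes):
    """One binary model per class: class k vs. all the rest."""
    return {k: train_perceptron(X, [1 if yi == k else 0 for yi in y])
            for k in classes}

def predict_ovr(models, x):
    """Pick the class whose 'k vs. rest' model scores highest."""
    return max(models, key=lambda k: score(models[k], x))

# Three linearly separable classes (think: three outcome stages).
X = [(0.0, 0.0), (0.2, 0.1), (5.0, 0.0), (5.2, 0.1), (0.0, 5.0), (0.1, 5.2)]
y = [0, 0, 1, 1, 2, 2]
models = train_ovr(X, y, classes=(0, 1, 2))
print([predict_ovr(models, x) for x in X])  # → [0, 0, 1, 1, 2, 2]
```

The same wrapper shape applies regardless of the base learner, which is why the study can swap in SVM, logistic regression, or tree ensembles and compare accuracy.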
In our study, we employ and evaluate five distinct machine learning algorithms: Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), Gradient Tree Boosting (GTB), and Extremely Randomized Trees (ERF). These algorithms are used within our machine learning framework to analyze multi-stage data and ascertain which technique predicts outcome stages most accurately. A multi-stage conclusion effectively narrows down the problem at hand, reduces the potential for errors, and enhances the ability to accurately predict and diagnose medical diseases or cybersecurity threats. A Python-based model was developed to execute the proposed methodology. The underlying notion employs a binary format, substantiated by empirical evidence, that offers two potential outcomes. Our research determined that the Logistic Regression and Support Vector Machine algorithms outperformed the other algorithms when a multiple-stage outcome was employed. The results were assessed in terms of accuracy, precision, recall, and the F-measure.

Item
Semantic and Temporal Graph Neural Networks for Supply Chain Risk Quantification (2023-01-01)
Matovski, Svetle; Nezamoddini, Nasim; Sengupta, Sankar; Lipták, László; Fu, Huirong
Calculating supply chain risk management values requires a granular set of parameters. Failure risk at each supply chain entity is a dependent value influenced by the other entities within the supply chain. Predicting future risk values from historical data relies on the trends and patterns therein, yet on the surface, historical data does not show how the data interlink and connect. To understand the problem, this research starts by examining the underlying data structure that a supply chain uses to store data. Mining the data requires understanding its relationship to other data within the structure.
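One way to make those data relationships explicit is to store the supply chain as a directed graph; here is a minimal sketch with illustrative node names, not the dissertation's generated dataset:

```python
# A supply chain stored as a directed graph: each edge points from a
# supplier to the entity it feeds. Node names are illustrative only.
edges = [
    ("supplier_a", "manufacturer"), ("supplier_b", "manufacturer"),
    ("manufacturer", "warehouse"), ("warehouse", "customer_1"),
    ("warehouse", "customer_2"),
]

# Adjacency lists make the links between entities explicit and queryable.
downstream, upstream = {}, {}
for src, dst in edges:
    downstream.setdefault(src, []).append(dst)
    upstream.setdefault(dst, []).append(src)

def reachable(start, adj):
    """All entities a disruption at `start` can propagate to."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# A failure at supplier_a can ripple all the way to both customers.
print(sorted(reachable("supplier_a", downstream)))
# → ['customer_1', 'customer_2', 'manufacturer', 'warehouse']
```

Once the chain is a graph, node importance (e.g. the centrality measures used in this work) and failure propagation become standard graph computations.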
This understanding allows the system or user to make better decisions. The research brings together four different pieces to compute a risk value at each node within a supply chain network. First, this research generates a dataset from seed data and trend data to build a supply chain network. It then predicts future risk values at the node level using ARIMA and a graph neural network. Node centrality values help show the importance of nodes to a supply chain; the centrality methods this research uses are betweenness, degree, eigenvector, and Katz centrality, each giving an importance value based on a different view of centrality. This research then examines the influence of a node on a customer, calculating an influence value using Bayesian networks. The last value calculated is the profit loss at a customer node. All these pieces are brought together in the risk calculation equation to give a final node-per-node risk value. The risk calculations align well with the structure of a graph, and this research shows that a graph neural network and a graph database are useful for supply chain problems.

Item
Knowledge Net: An Automated Clinical Knowledge Graph Generation Framework for Evidence Based Medicine (2023-01-01)
Alam, Fakhare; Malik, Khalid Mahmood; Siadat, Mohammad-Reza; Ma, Tianle; Homayouni, Ramin; Corliss, David
To practice evidence-based medicine, clinicians need to find the research most suitable for clinical decision making. The use of knowledge graphs (KGs) and Neuro-Symbolic methods to integrate and analyze complex and heterogeneous healthcare data is critical to enabling evidence-based treatment in clinical decision support systems (CDSS). Healthcare generates a vast amount of data, including electronic health records (EHRs), medical images, genetic information, research papers, and clinical guidelines.
Neuro-Symbolic AI can leverage its neural network component to process unstructured data while using symbolic reasoning to interpret the data and make logical inferences. It also enables a deeper understanding of patient data, leading to more accurate diagnoses, personalized treatment plans, and improved patient outcomes. By incorporating symbolic reasoning, Neuro-Symbolic AI systems can provide explanations for their outputs, making them more transparent and interpretable. To enable Neuro-Symbolic AI in healthcare, large-scale KGs play a pivotal role, as they can integrate heterogeneous big healthcare data including medical ontologies, clinical guidelines, drug databases, patient records, and research literature. Existing KG construction frameworks are not fully automated and are predominantly carried out using manual or semi-automated approaches, requiring substantial effort and expertise. The challenges encompass identifying knowledge sources, disambiguating concepts in context, enriching semantics, determining relationships, and conducting inferential reasoning. Automating the extraction of coherent knowledge and constructing KGs from diverse data forms remains a longstanding goal in AI research. Moreover, current KG construction frameworks fail to generate KGs that provide relevant information for evidence-based practitioners, because the organization of the constructed subgraphs is neither topic-specific nor friendly to evidence-based PICO (Participants/Problem, Intervention, Comparison, Outcome) queries. KGs built through manual or semi-automated processes cannot adapt to new domains or incorporate constantly changing information into their knowledge base; consequently, they gradually lose relevance over time and miss important evidence.
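For a concrete picture, a KG can be reduced to subject-predicate-object triples that support simple pattern queries; the clinical facts below are illustrative placeholders only, not content from the dissertation's corpus:

```python
# A tiny knowledge graph as (subject, predicate, object) triples.
# The clinical facts below are made-up placeholders for illustration.
triples = {
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
    ("coiling", "treats", "cerebral_aneurysm"),
    ("clipping", "treats", "cerebral_aneurysm"),
}

def query(s=None, p=None, o=None):
    """Return the triples matching the given pattern (None = wildcard)."""
    return {(ts, tp, to) for ts, tp, to in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)}

# Which interventions treat cerebral aneurysm? (the 'I' of a PICO query)
interventions = {s for s, _, _ in query(p="treats", o="cerebral_aneurysm")}
print(sorted(interventions))  # → ['clipping', 'coiling']
```

A PICO-friendly KG is essentially one whose subgraphs are organized so that patterns like the query above retrieve evidence for a specific clinical question.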
Ignoring temporal information and failing to incorporate the dynamic nature of entities and relations can thus lead to erroneous information extraction and suboptimal decision-making. This dissertation proposes a fully automated knowledge graph curation framework that curates information and creates KGs for different clinical domains by employing concept extraction, semantic enrichment, optimized clustering using a Neuro-Symbolic approach, and state-of-the-art Recurrent Neural Networks (RNNs) with BioBERT-based encoded representations to categorize PICO elements and predict relationships between concepts, using a large corpus of publicly available literature on COVID-19 and cerebral aneurysms. The evaluation shows that the proposed framework achieves significant improvement over baseline models, with 93% and 82% accuracy for PICO classification on the aneurysm and COVID-19 datasets respectively. The Neuro-Symbolic clustering approach outperforms traditional baseline models by 43% and achieves an average precision of 88% across all identified clusters. The relationship extraction module has an accuracy of 96%, with precision and recall of 92% and 90% respectively. Incorporating domain-specific language models has proven to enhance the performance of machine learning models, particularly for Neuro-Symbolic clustering, PICO classification, and relation extraction. The integration of deep learning and symbolic reasoning techniques has demonstrated significant improvements in clustering performance, especially in biomedical research domains. The BioBERT embedding layer and LSTM model notably boosted the accuracy of PICO classification by 11% on both the COVID-19 and cerebral aneurysm datasets. Furthermore, combining BioBERT with Bi-LSTM and CNN substantially enhances the performance of the relation extraction model.
Future work will focus on parallelizing the data processing pipeline to enhance the efficiency and scalability of the knowledge graph framework, and on developing an interactive user interface for visualization. Additionally, efforts will be dedicated to extending the framework's application across diverse domains such as the food supply chain, dietary recommendations, agriculture, and fisheries, addressing their unique challenges and expanding its impact across multiple industries.

Item
CLCD-I: Cross Language Clone Detection with InferCode (2023-01-01)
Yahya, Mohammad A A; Kim, Dae-Kyoo; Lu, Lunjin; Ming, Hua; Caushaj, Eralda
Source code clones are common in software development as part of reuse practice. However, they are also often a source of errors that compromise software maintainability. Existing work on code clone detection mainly focuses on clones within a single programming language, yet software is increasingly developed on multi-language platforms on which code is reused across different programming languages. Detecting code clones on such platforms is challenging and has not been studied much. This paper presents CLCD-I, a deep neural network-based approach for detecting cross-language code clones using InferCode, an embedding technique for source code. The design of the model is twofold: (a) it takes as input InferCode embeddings of source code in two different programming languages and (b) it forwards them to a Siamese architecture for comparative processing. The performance of CLCD-I is compared with LSTM autoencoders and the existing approaches to cross-language code clone detection.
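The Siamese comparison step can be sketched as scoring the similarity of two fixed-size code embeddings; the 4-dimensional vectors and the decision threshold below are made up for illustration (InferCode would supply real embeddings, and CLCD-I's comparison is learned rather than a fixed cosine):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def is_clone(emb_a, emb_b, threshold=0.8):
    """Siamese-style decision: both inputs pass through the same
    similarity function; a high score means a cross-language clone."""
    return cosine_similarity(emb_a, emb_b) >= threshold

# Made-up 4-d embeddings of routines in two languages.
bubble_sort_java = [0.9, 0.1, 0.4, 0.2]
bubble_sort_py = [0.85, 0.15, 0.38, 0.22]
quicksort_py = [0.1, 0.9, 0.2, 0.7]

print(is_clone(bubble_sort_java, bubble_sort_py))  # similar embeddings → True
print(is_clone(bubble_sort_java, quicksort_py))    # dissimilar → False
```

The key property of the Siamese arrangement is that both code fragments are mapped through the same function, so the comparison is symmetric regardless of source language.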
The evaluation shows that CLCD-I outperforms LSTM autoencoders by 30% on average and the existing approaches by 15% on average.

Item
Intelligent Performance, Architecture Analysis, Functional Safety Metrics of Automated Steering Systems for Autonomous Vehicles (2022-03-15)
Salih, Saif Yoseif; Olawoyin, Richard O; Cooley, Christopher; Debnath, Debatosh; ElSayed, Suzan
The increasing complexity and functionality of the electrical and/or electronic (E/E) systems in present-day automobiles make it challenging for original equipment manufacturers (OEMs) and suppliers to ensure a high level of safety in safety-critical automotive systems. The steering system is a standard functionality on every vehicle, controlling the vehicle's direction laterally and providing stability for vehicle motion. Highly automated vehicles require intelligent steering systems in which Advanced Driver Assistance System (ADAS) applications such as cameras, radars, lidars, and the global positioning system (GPS) are linked together. These integrated systems and applications are required for environmental perception, communications, data fusion, planning, prediction, decision making, and actuation, all in real time. Hardware (HW) and software (SW) solutions are therefore developed and implemented in compliance with ISO 26262, Road Vehicles – Functional Safety. Given the scarcity of published information on steering systems and the crucial role steering plays amid complex functional challenges, this dissertation provides a case study of how the steering systems of different automated driving levels can comply with ISO 26262, considering the emerging challenges posed by the increasing curb weight of electric vehicles expected in the near future. The analysis focuses on the safety lifecycle of the E/E components of the steering system to ensure high availability of the steering system and avoid any sudden loss of assistance (SLOA).
Various safety mechanisms were evaluated and analyzed to improve the functional safety of the steering system architecture and logic control paths. Based on the controllability metrics proposed in this dissertation, it was found that a hazard or malfunction of the steering system shifts from Automotive Safety Integrity Level (ASIL) B to ASIL C, the second most critical safety level. To comply with ISO 26262 and to mitigate the residual risks of E/E system failure, several solutions were proposed in the concept phase, such as redundant HW or SW in the controller path. The controllability classes of highly automated vehicles, based on the vehicle's global position relative to the lane marker lines, were investigated and redefined to accommodate the machine or system in the loop controlling the dynamic driving task (DDT) during autonomous vehicle maneuvering, and a new wheel offset marker concept was introduced for when the vehicle is approaching the lane marker lines. It was also found that there are human factors challenges in SAE Levels 4 and 5, and that the interaction between the driver and the vehicle's automated control systems requires human-machine interface (HMI) modalities. Driver engagement with the automated control system in the steering system is one of the crucial complex control scenarios, adding uncertainty and potential risk when steering control is handed over between the driver and the automated control system within the allotted time. This study highlights the need to define driver intervention in highly automated vehicles of SAE Levels 4 and 5 in order to sustain traffic safety and keep the vehicle on its intended trajectory or path. This can be addressed by deploying HMI and implementing human factors in ISO 26262 to standardize the driver-machine relation with the DDT in a real-time, interactive environment.
Both manual and automated driving modes demand a functional safety implementation of the steering system to mitigate any system malfunction or failure. An artificial neural network (ANN) model was developed to predict steering torque commands and steering wheel angle (SWA) from the steering system dataset and the vehicle's parameters. The ANN model was built with MATLAB's Neural Network Training tool (nntraintool) to evaluate intelligent steering system performance, and the trained model delivered a regression value of ~98.5% against the measured SWA. The results showed that the ANN was effective in predicting steering wheel angle patterns from the input dataset, despite the non-linearity and complexity of steering system control. This finding helps improve the functional safety of autonomous vehicles and introduces the concept of intelligent steering systems for path and trajectory planning. ANN models should therefore be implemented as an abstraction layer in the control module and deployed in the control and actuation processes to support sensor data fusion, prediction, and pattern recognition.

Item
Robust and Adaptive Lateral Controller for Autonomous Vehicles (2022-03-25)
Khasawneh, Lubna S.; Das, Manohar; Ka, Cheok C; Shilor, Meir; Guangzhi, Qu
This thesis addresses the problem of controlling the lateral motion of an autonomous vehicle in the presence of parametric uncertainties, disturbances, and hard nonlinearities in the steering system, such as backlash in gears, stiction, hysteresis, and dead zones. The lateral motion of an autonomous vehicle is governed by two cascaded controllers: a trajectory tracking controller and a steering angle controller. This thesis develops both controllers using robust and adaptive control techniques. Two control strategies are developed to control the electric power steering angle: sliding mode control and adaptive backstepping control.
The limitation of sliding mode control, the chattering phenomenon, is addressed first, and a methodology is proposed to solve it using variable-gain sliding mode control. The self-aligning moment acts as a disturbance on the steering system that the controller must compensate for. A model-based approach to estimating it is developed first and its limitation, dependence on tire parameters, is addressed; two further approaches are then developed to overcome this limitation, the first a sliding mode observer and the second part of a backstepping controller. Two approaches are developed to control the vehicle's lateral trajectory: non-adaptive backstepping and adaptive backstepping. The extended matching design procedure is used in the adaptive backstepping controller to avoid the overestimation problem. Road curvature must be known accurately by the controller to follow the planned trajectory; it is usually measured by a camera, but the quality of the measurement is affected by environmental factors, so an adaptive law is developed to estimate the road curvature online as part of the adaptive backstepping controller. Two feedforward approaches are presented to compensate for road curvature: one derived from steady-state vehicle lateral dynamics, and another based on estimating the transfer function dynamics from road curvature to steering angle. Road bank angle is a significant disturbance in vehicle lateral control systems. A vehicle lateral state and disturbance observer is developed, using an extended Kalman filter, to estimate the road bank angle and the vehicle side slip angle, which are expensive to measure in current road vehicles.
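A toy simulation can illustrate the variable-gain, boundary-layer idea for taming sliding mode chattering; the first-order plant, the gain schedule, and the constant disturbance below are generic textbook choices, not the thesis's steering model or its actual variable-gain law:

```python
# Toy sliding mode control of a first-order plant x' = u + d.
# The gain (k0 + k1*|s|) grows with distance from the sliding surface,
# and a saturation boundary layer replaces the discontinuous sign term
# to suppress chattering. All values are illustrative assumptions.

def sat(v):
    """Saturation: linear inside [-1, 1], clipped outside."""
    return max(-1.0, min(1.0, v))

def simulate(x0, x_ref, steps=2000, dt=0.01):
    k0, k1, phi = 2.0, 1.0, 0.05   # base gain, variable gain, boundary layer
    x = x0
    for _ in range(steps):
        d = 0.5                     # bounded disturbance (e.g. self-aligning moment)
        s = x - x_ref               # sliding surface
        u = -(k0 + k1 * abs(s)) * sat(s / phi)
        x += (u + d) * dt
    return x

final = simulate(x0=1.0, x_ref=0.0)
print(abs(final) < 0.05)  # → True: settles near the reference despite d
```

Because `sat` is continuous inside the boundary layer, the control signal no longer switches sign at high frequency near `s = 0`, which is exactly the chattering that a pure sign-function controller exhibits.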
The observer combines a dynamical vehicle model with two measurements from inexpensive sensors.

Item
A Meta-Heuristic Algorithm Based on Modified Global Firefly Optimization: In Supply Chain Networks with Demand Uncertainty (2022-03-15)
Altherwi, Abdulhadi; Zohdy, Mohamed; Malik, Ali; Edwards, William; Cho, Seong-Yeon; Alwerfalli, Daw
Many challenges affect today's global supply chain networks, including disruptions, delays, and failures during product shipment. These challenges also incur penalty costs due to unmet customer demand and failures in supply. In this dissertation, a multi-objective supply chain network model was developed under two risk factors, failure in supply and unmet demand, across three different scenarios. The objective of Scenario I was to minimize the total expected transportation costs between stages for each supply chain and the penalty costs associated with product shortages; a supply chain with no supply failure collaborates with a failing supply chain to deliver its product to the final customer. For Scenario II, the objective was to maximize the profits of a supply chain facing excess inventory; the supply chain with surplus products collaborates with supply chains facing shortages to prevent the undesirable costs associated with excess inventory. The objective of Scenario III was to develop a multi-objective function that maximizes profit and minimizes the total costs associated with production, holding, and penalties due to supplier failure of raw materials; once a supply chain faces a failure in the supply of raw materials, other supply chains with no supply failure collaborate to prevent the associated costs. This research investigates the applicability of the Modified Firefly Algorithm to a multi-stage supply chain network consisting of suppliers, manufacturers, storage facilities, and markets under risk of failure.
Commercial software could not obtain optimal results for the problems considered in this research, so we applied a Modified Firefly Algorithm to solve them. Two case studies, for a pipe and a steel manufacturing integrated supply chain, demonstrated the efficiency of the model and of the solutions obtained by the Firefly Algorithm. We used four optimization algorithms in ModeFRONTIER and MATLAB to test the efficiency of the proposed algorithm. The results revealed that, compared with the other four optimization algorithms, the Firefly Algorithm can help maximize profits and minimize the total expected costs of supply chain networks.
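For reference, the standard (unmodified) firefly algorithm can be sketched on a toy objective; the sphere function and every parameter below are illustrative stand-ins, not the dissertation's modified variant or its supply chain cost model:

```python
# Minimal standard firefly algorithm minimizing a toy sphere function.
# Each firefly moves toward every brighter (lower-cost) firefly, with
# attractiveness decaying over distance plus a shrinking random walk.
import math
import random

def firefly_minimize(f, dim=2, n=20, iters=100, seed=1):
    rng = random.Random(seed)
    beta0, gamma, alpha = 1.0, 0.01, 0.2
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    for _ in range(iters):
        bright = [f(x) for x in pop]           # lower cost = brighter
        for i in range(n):
            for j in range(n):
                if bright[j] < bright[i]:      # move i toward brighter j
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    pop[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                              for a, b in zip(pop[i], pop[j])]
                    bright[i] = f(pop[i])
        alpha *= 0.97                          # cool down the random walk
    return min(pop, key=f)

sphere = lambda x: sum(v * v for v in x)       # stand-in for a cost objective
best = firefly_minimize(sphere)
print(sphere(best))                            # cost of best firefly, near 0
```

A "modified" firefly algorithm such as the one in this dissertation typically alters the attractiveness or randomization schedule; the swarm structure above stays the same.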