Computer Science and Engineering
Browsing Computer Science and Engineering by Subject "Deep learning"
Item: CLCD-I: Cross Language Clone Detection with Infercode (2023-01-01)
Yahya, Mohammad A A; Kim, Dae-Kyoo; Lu, Lunjin; Ming, Hua; Caushaj, Eralda
Source code clones are common in software development as part of reuse practice. However, they are also often a source of errors that compromise software maintainability. Existing work on code clone detection mainly focuses on clones within a single programming language. Software, however, is increasingly developed on multilanguage platforms on which code is reused across different programming languages. Detecting code clones on such platforms is challenging and has not been studied much. In this paper, we present CLCD-I, a deep neural network-based approach for detecting cross-language code clones using InferCode, an embedding technique for source code. The design of our model is twofold: (a) taking as input InferCode embeddings of source code in two different programming languages and (b) forwarding them to a Siamese architecture for comparative processing. We compare the performance of CLCD-I with LSTM autoencoders and with existing approaches to cross-language code clone detection. The evaluation shows that CLCD-I outperforms LSTM autoencoders by 30% on average and the existing approaches by 15% on average.

Item: Knowledge Net: An Automated Clinical Knowledge Graph Generation Framework for Evidence Based Medicine (2023-01-01)
Alam, Fakhare; Malik, Khalid Mahmood; Siadat, Mohammad-Reza; Ma, Tianle; Homayouni, Ramin; Corliss, David
To practice evidence-based medicine, clinicians need to find the most suitable research for clinical decision making. The use of knowledge graphs (KGs) and Neuro-Symbolic methods to integrate and analyze complex and heterogeneous healthcare data is critical to enabling evidence-based treatment in clinical decision support systems (CDSS).
Healthcare generates a vast amount of data, including electronic health records (EHRs), medical images, genetic information, research papers, and clinical guidelines. Neuro-Symbolic AI can leverage its neural network component to process unstructured data, while using symbolic reasoning to interpret the data and make logical inferences. It also enables a deeper understanding of patient data, leading to more accurate diagnoses, personalized treatment plans, and improved patient outcomes. By incorporating symbolic reasoning, Neuro-Symbolic AI systems can provide explanations for their outputs, making them more transparent and interpretable. To enable Neuro-Symbolic AI in healthcare, large-scale KGs play a pivotal role, as they can integrate heterogeneous, large-scale healthcare data including medical ontologies, clinical guidelines, drug databases, patient records, and research literature.
Existing KG construction frameworks are not fully automated: construction is predominantly manual or semi-automated and requires substantial effort and expertise. The challenges encompass identifying knowledge sources, disambiguating concepts in context, enriching semantics, determining relationships, and conducting inferential reasoning. Automating the extraction of coherent knowledge and constructing KGs from diverse data forms remains a longstanding goal in AI research. Current frameworks also fail to generate KGs that provide relevant information for evidence-based practitioners, because the constructed subgraphs are organized neither by topic nor in a way that supports evidence-based PICO (Participants/Problem (P), Intervention (I), Comparison (C), Outcome (O)) queries. KGs built through manual or semi-automated processes cannot adapt to new domains or incorporate constantly changing information into their knowledge base. Consequently, they gradually lose relevance over time and miss important evidence.
Thus, ignoring temporal information and failing to account for the dynamic nature of entities and relations can lead to erroneous information extraction and suboptimal decision-making. This dissertation proposes a fully automated knowledge graph curation framework that curates information and creates KGs for different clinical domains. It employs concept extraction, semantic enrichment, optimized clustering using a Neuro-Symbolic approach, and state-of-the-art Recurrent Neural Networks (RNNs) with BioBERT-based encoded representations to categorize PICO elements and predict relationships between concepts, using a large corpus of publicly available literature on COVID-19 and cerebral aneurysms. The evaluation shows that the proposed framework achieves significant improvement over baseline models, with 93% and 82% accuracy on the aneurysm and COVID-19 datasets respectively for PICO classification. The Neuro-Symbolic clustering approach outperforms traditional baseline models by 43% and achieves an average precision of 88% across all identified clusters. The relationship extraction (RE) module has an accuracy of 96%, with precision and recall of 92% and 90% respectively. The incorporation of domain-specific and language models has proven to enhance the performance of machine learning models, particularly in the context of Neuro-Symbolic clustering, PICO classification, and relation extraction. The integration of deep learning and symbolic reasoning techniques has demonstrated significant improvements in clustering performance, especially in biomedical research domains. The use of a BioBERT embedding layer with an LSTM model boosted the accuracy of PICO classification by 11% for both the COVID-19 dataset and the cerebral aneurysm dataset. Furthermore, when BioBERT is combined with Bi-LSTM and CNN, the performance of the RE model also improves substantially.
Future work will focus on parallelizing the data processing pipeline to enhance the efficiency and scalability of the knowledge graph framework, while also developing an interactive user interface for visualization. Additionally, efforts will be dedicated to extending the framework's application to diverse domains such as the food supply chain, dietary recommendations, agriculture, and fisheries, addressing the unique challenges of each and expanding its impact.

Item: Semantic and Temporal Graph Neural Networks for Supply Chain Risk Quantification (2023-01-01)
Matovski, Svetle; Nezamoddini, Nasim; Sengupta, Sankar; Lipták, László; Fu, Huirong
Calculating supply chain risk management values requires a granular set of parameters. Failure risk at each supply chain entity is a dependent value influenced by other entities within the supply chain. Predicting future risk values from historical data relies on the trends and patterns in that data. On the surface, historical data does not show how the data interlink and connect. To understand the problem, this research starts by examining the underlying data structure that a supply chain uses to store data. Mining the data requires understanding its relationship to other data within the structure. This understanding allows the system or user to make better decisions. The research brings together four different pieces to show a risk value at each node within a supply chain network. This research generates a dataset using seed data and trend data to build a supply chain network. It then predicts future risk values at the node level using ARIMA and a graph neural network. Node centrality values help show the importance of nodes to a supply chain. The centrality methods this research uses are betweenness, degree, eigenvector, and Katz centrality. Each one gives an importance value based on a different view of centrality.
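To illustrate how different centrality measures can rank the same nodes differently, the following is a minimal sketch of two of the four measures named above (degree and Katz centrality) in plain Python. It is not the dissertation's implementation: the toy network, the node names, and the `alpha` and `beta` parameters are hypothetical choices made for illustration, and betweenness and eigenvector centrality are omitted for brevity.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def katz_centrality(adj, alpha=0.1, beta=1.0, tol=1e-9, max_iter=1000):
    """Iterate x_i = alpha * sum_j A_ij * x_j + beta until the values settle.

    alpha must be smaller than 1 / (largest eigenvalue of the adjacency
    matrix) for the iteration to converge; 0.1 is an assumed value that
    is safe for the small example below.
    """
    x = {node: 0.0 for node in adj}
    for _ in range(max_iter):
        new_x = {
            node: alpha * sum(x[nbr] for nbr in adj[node]) + beta
            for node in adj
        }
        if sum(abs(new_x[node] - x[node]) for node in adj) < tol:
            return new_x
        x = new_x
    raise RuntimeError("Katz iteration did not converge; try a smaller alpha")


# Hypothetical toy supply chain: two suppliers feed one factory, which
# ships through a warehouse to a customer.
edges = [
    ("supplier_a", "factory"),
    ("supplier_b", "factory"),
    ("factory", "warehouse"),
    ("warehouse", "customer"),
]
adj = build_adjacency(edges)
degree = degree_centrality(adj)
katz = katz_centrality(adj)

for node in sorted(adj):
    print(f"{node:12s} degree={degree[node]:.2f} katz={katz[node]:.3f}")
```

Degree centrality only counts direct links, whereas Katz centrality also credits a node for being connected to well-connected neighbors, with the contribution of longer paths damped by `alpha`.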
This research then looks for the influence of a node on a customer, calculating an influence value using Bayesian networks. The last value calculated is the profit loss at a customer node. All these pieces are brought together in the risk calculation equation to give a final per-node risk value. The risk calculations align well with the structure of a graph. This research shows that a graph neural network and a graph database are useful for supply chain problems.