CLCD-I: Cross Language Clone Detection with Infercode

dc.contributor.advisor: Kim, Dae-Kyoo
dc.contributor.author: Yahya, Mohammad A A
dc.contributor.other: Lu, Lunjin
dc.contributor.other: Ming, Hua
dc.contributor.other: Caushaj, Eralda
dc.date.accessioned: 2024-09-25T21:19:49Z
dc.date.available: 2024-09-25T21:19:49Z
dc.date.issued: 2023-01-01
dc.description.abstract: Source code clones are common in software development as part of reuse practice. However, they are also often a source of errors that compromise software maintainability. Existing work on code clone detection mainly focuses on clones in a single programming language. However, software is now increasingly developed on multilanguage platforms on which code is reused across different programming languages. Detecting code clones on such a platform is challenging and has not been studied much. In this paper, we present CLCD-I, a deep neural network-based approach for detecting cross-language code clones by using InferCode, an embedding technique for source code. The design of our model is twofold: (a) taking as input InferCode embeddings of source code in two different programming languages and (b) forwarding them to a Siamese architecture for comparative processing. We compare the performance of CLCD-I with LSTM autoencoders and the existing approaches to cross-language code clone detection. The evaluation shows that CLCD-I outperforms LSTM autoencoders by 30% on average and the existing approaches by 15% on average.
dc.identifier.uri: https://hdl.handle.net/10323/18167
dc.relation.department: Computer Science and Engineering
dc.subject: Clone detection
dc.subject: Deep learning
dc.subject: Machine learning
dc.title: CLCD-I: Cross Language Clone Detection with Infercode
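
Illustrative note: the abstract describes a two-part design in which embeddings of two code fragments are passed through a Siamese architecture for comparison. The sketch below shows only the general idea of a Siamese comparison (one shared encoder applied to both inputs, followed by a similarity score). It is not the CLCD-I model: the single tanh layer, the cosine score, the vector sizes, and every name in it (make_encoder, siamese_similarity, java_emb, python_emb) are assumptions made for illustration, and the input vectors are random placeholders standing in for real InferCode embeddings.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Build one small dense layer with tanh activation.

    The returned function is used for BOTH inputs, which is what makes the
    comparison "Siamese": the two branches share the same weights.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(vec):
        return [math.tanh(sum(w * x for w, x in zip(row, vec)))
                for row in weights]

    return encode


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def siamese_similarity(encode, emb_a, emb_b):
    """Encode both embeddings with the shared encoder, then compare them."""
    return cosine(encode(emb_a), encode(emb_b))


# Placeholder vectors (assumption): stand-ins for embeddings of a Java
# fragment, a near-identical Python fragment, and an unrelated fragment.
rng = random.Random(1)
java_emb = [rng.gauss(0, 1) for _ in range(8)]
python_emb = [x + rng.gauss(0, 0.05) for x in java_emb]
unrelated_emb = [rng.gauss(0, 1) for _ in range(8)]

encode = make_encoder(in_dim=8, out_dim=4)
score_clone = siamese_similarity(encode, java_emb, python_emb)
score_other = siamese_similarity(encode, java_emb, unrelated_emb)
```

In a trained model the shared weights would be learned from labeled clone and non-clone pairs, and a threshold on the score would decide whether a pair is reported as a clone; here the weights are random, so the scores carry no meaning beyond showing the data flow.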

Files

Original bundle
Name: Yahya_oakland_0446E_10359.pdf
Size: 6.07 MB
Format: Adobe Portable Document Format