Enhancing code summarization with graph embedding and pre-trained model

L Li, J Li, Y Xu, H Zhu, X Zhang

International Journal of Software Engineering and Knowledge Engineering, 2023•World Scientific

Code summarization is the task of automatically producing descriptions of source code. Many deep-learning-based approaches have recently been proposed to generate accurate code summaries, among which pre-trained models (PTMs) for programming languages have achieved promising results. Source code written in programming languages is highly structured and unambiguous. Although previous work pre-trained models with well-designed tasks to learn universal representations from large-scale data, it did not consider structure information during the fine-tuning stage. To make full use of both the pre-trained programming language model and the structure information of source code, we use the Flow-Augmented Abstract Syntax Tree (FA-AST) of source code for structure information and propose GraphPLBART (Graph-augmented Programming Language and Bi-directional Auto-Regressive Transformer), which effectively introduces structure information into a well-trained PTM through a cross-attention layer. Compared with the best-performing baselines, GraphPLBART improves by 3.2%, 7.1%, and 1.2% in terms of BLEU, METEOR, and ROUGE-L, respectively, on the Java dataset, and by 4.0%, 6.3%, and 2.1% on the Python dataset. Further experiments show that the structure information from FA-AST significantly benefits the performance of GraphPLBART. In addition, our manual evaluation further supports the superiority of the proposed approach, indicating the quality of its generated summaries and its promise as a solution for code summarization.
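The abstract says structure information reaches the PTM through a cross-attention layer. As a rough illustration of that mechanism only, the sketch below lets each token state attend over a set of graph-node embeddings using scaled dot-product attention with a residual connection. It is not the GraphPLBART architecture: the function names, the absence of learned projections, and the toy vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(token_states, node_embeddings):
    """Fuse graph-node information into token states (hypothetical helper).

    Queries come from the token states; keys and values are the node
    embeddings. Learned projection matrices are omitted for brevity.
    """
    dim = len(node_embeddings[0])
    fused = []
    for query in token_states:
        # Scaled dot-product score between this token and every graph node.
        scores = [
            sum(q * k for q, k in zip(query, node)) / math.sqrt(dim)
            for node in node_embeddings
        ]
        weights = softmax(scores)
        # Attention-weighted average of the node embeddings.
        context = [
            sum(w * node[j] for w, node in zip(weights, node_embeddings))
            for j in range(dim)
        ]
        # Residual connection: keep the token state and add the context.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Toy inputs (assumed values): two token states, three graph nodes.
token_states = [[1.0, 0.0], [0.0, 1.0]]
node_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(token_states, node_embeddings)
```

In a real model the token states would come from the pre-trained encoder and the node embeddings from a graph encoder over the FA-AST, with learned query, key, and value projections and multiple attention heads.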
