Abstract:
The objective of the text-to-SQL task is to convert
natural language queries into SQL queries. However, the presence
of extensive text-to-SQL datasets across multiple domains, such
as Spider, introduces the challenge of effectively generalizing to
unseen data. Existing semantic parsing models have struggled
to achieve notable performance improvements on these cross-domain
datasets. As a result, recent advancements have focused
on leveraging pre-trained language models to address this issue
and enhance performance in text-to-SQL tasks. These approaches
represent the latest and most promising attempts to tackle
the challenges associated with generalization and performance
improvement in this field. This paper proposes and evaluates an
approach in which the encoder of a Seq2Seq model receives the
most pertinent schema items as input, and the decoder generates
accurate and valid cross-domain SQL queries by understanding
the skeleton of the target SQL query. The
proposed approach is evaluated on Spider, a well-known dataset
for the text-to-SQL task, and achieves promising results: Exact
Match accuracy and Execution accuracy are boosted to 72.7% and
80.2%, respectively, compared to the best related approaches.
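
To make the two ideas in the abstract concrete, the following is a minimal sketch of (a) keeping only the most pertinent schema items for the encoder input and (b) pairing the target SQL with its skeleton for the decoder. The abstract does not specify how relevance is scored, how the input is serialized, or what the skeleton looks like, so everything here is an assumption: the lexical-overlap score is a simple stand-in for a learned ranker, and the names `filter_schema`, `build_encoder_input`, `sql_skeleton`, and `build_decoder_target` are hypothetical.

```python
import re

# Small, non-exhaustive keyword list used only for this illustration.
SQL_KEYWORDS = {
    "select", "from", "where", "group", "by", "order", "having", "limit",
    "join", "on", "and", "or", "not", "in", "like", "as", "distinct",
    "count", "avg", "sum", "min", "max", "asc", "desc", "between",
    "union", "intersect", "except",
}


def _tokens(text):
    """Lowercased alphanumeric tokens; 'singer_id' -> {'singer', 'id'}."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def filter_schema(question, schema, top_tables=2, top_columns=3):
    """Keep the highest-scoring tables and columns.

    ASSUMPTION: relevance is plain word overlap with the question,
    standing in for whatever ranking model is actually used.
    """
    q = _tokens(question)

    def score(name):
        return len(_tokens(name) & q)

    def table_score(table):
        return max([score(table)] + [score(c) for c in schema[table]])

    ranked = sorted(schema, key=lambda t: -table_score(t))
    kept = {}
    for table in ranked[:top_tables]:
        columns = sorted(schema[table], key=lambda c: -score(c))
        kept[table] = columns[:top_columns]
    return kept


def build_encoder_input(question, kept_schema):
    """Serialize the question and filtered schema (format is an assumption)."""
    parts = [f"{t} : {' , '.join(cols)}" for t, cols in kept_schema.items()]
    return f"{question} | " + " | ".join(parts)


def sql_skeleton(sql):
    """Keep keywords and operators; mask identifiers and values as '_'."""
    pattern = r"'[^']*'|[A-Za-z_][\w.]*|\d+(?:\.\d+)?|[^\s\w]"
    out = []
    for tok in re.findall(pattern, sql):
        is_word_or_literal = re.match(r"['\w]", tok) is not None
        if tok.lower() in SQL_KEYWORDS or not is_word_or_literal:
            piece = tok.lower()
        else:
            piece = "_"
        if piece == "_" and out and out[-1] == "_":
            continue  # collapse consecutive placeholders
        out.append(piece)
    return " ".join(out)


def build_decoder_target(sql):
    """Skeleton first, then the full query (ordering is an assumption)."""
    return f"{sql_skeleton(sql)} | {sql}"


if __name__ == "__main__":
    schema = {
        "concert": ["concert_id", "venue", "year"],
        "singer": ["singer_id", "name", "age", "country"],
    }
    question = "What is the name and age of each singer?"
    kept = filter_schema(question, schema, top_tables=1)
    print(build_encoder_input(question, kept))
    print(build_decoder_target("SELECT name, age FROM singer"))
```

In this sketch, filtering shortens the encoder input and removes schema items unrelated to the question, while the skeleton gives the decoder a structural outline of the query alongside the concrete table and column names.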
Citation:
M. R. Aadhil Rushdy and U. Thayasivam, "Application of Noise Filter Mechanism for T5-Based Text-to-SQL Generation," 2023 Moratuwa Engineering Research Conference (MERCon), Moratuwa, Sri Lanka, 2023, pp. 95-100, doi: 10.1109/MERCon60487.2023.10355492.