Semantic document indexing with generative AI
DOI:
https://doi.org/10.31263/voebm.v78i1.9251
Keywords:
Documents, semantic indexing, generative artificial intelligence (AI)
Abstract
This paper presents new methods for semantic indexing of reference information using generative artificial intelligence. A GPT language model was used to automatically extract descriptors and relationships between them from architectural history documents. A semantic network was created from the extracted descriptors and relationships. A prototype was then developed that made it possible to find relevant documents using the semantic network. Finally, it is shown how the quality of semantic networks can be improved with the help of a swarm of virtual experts. The use of generative artificial intelligence can reduce the workload and costs of semantic indexing. Semantic indexing and the semantic network can contribute to more effective use and dissemination of scientific information by enabling semantic search, easy navigation, and user-friendliness.
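The pipeline outlined in the abstract (extracted descriptors and relationships, a semantic network built from them, and document retrieval over that network) can be illustrated with a minimal Python sketch. It is not the authors' prototype. The triples are invented for illustration and stand in for the output of the GPT extraction step, whose prompts and model settings are not given here. The names `TRIPLES`, `build_network`, and `search`, and the simple 2/1 scoring scheme, are assumptions.

```python
from collections import defaultdict

# Hypothetical output of the extraction step:
# (document id, descriptor, relation, descriptor).
# Invented for illustration; not data from the paper.
TRIPLES = [
    ("doc1", "Gothic architecture", "uses", "flying buttress"),
    ("doc1", "Gothic architecture", "uses", "pointed arch"),
    ("doc2", "flying buttress", "supports", "vault"),
    ("doc3", "Baroque architecture", "uses", "dome"),
]


def build_network(triples):
    """Return (undirected adjacency map, descriptor -> set of document ids)."""
    neighbours = defaultdict(set)
    postings = defaultdict(set)
    for doc_id, source, _relation, target in triples:
        # Relations are treated as undirected links between descriptors.
        neighbours[source].add(target)
        neighbours[target].add(source)
        # Each descriptor points back to the documents it was extracted from.
        postings[source].add(doc_id)
        postings[target].add(doc_id)
    return neighbours, postings


def search(descriptor, neighbours, postings):
    """Rank documents: 2 points for the query descriptor itself,
    1 point for each directly linked descriptor (assumed weighting)."""
    scores = defaultdict(int)
    for doc_id in postings.get(descriptor, ()):
        scores[doc_id] += 2
    for related in neighbours.get(descriptor, ()):
        for doc_id in postings.get(related, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


neighbours, postings = build_network(TRIPLES)
print(search("vault", neighbours, postings))  # ['doc2', 'doc1']
```

In this sketch, a query for "vault" also returns doc1, which never mentions that descriptor, because doc1 is reachable through the linked descriptor "flying buttress". This is the kind of navigation a semantic network adds over plain keyword matching.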
References
Ferreira, Diogo R. (2017). A Primer on Process Mining: Practical Skills with Python and Graphviz. Cham: Springer.
Lande, Dmytro; Strashnoy, Leonard (2024). Swarm of Virtual Experts in the Implementation of Semantic Networking. ResearchGate Preprint. https://doi.org/10.13140/RG.2.2.16686.11845
Lande, Dmytro; Strashnoy, Leonard (2023a). Concept Networking Methods Based on ChatGPT & Gephi. SSRN Preprint, 28 April. https://doi.org/10.2139/ssrn.4420452
Lande, Dmytro; Strashnoy, Leonard (2023b). GPT Semantic Networking: A Dream of the Semantic Web – The Time is Now. Kyiv: Engineering.
Sharma, Atri (2020). Practical Apache Lucene 8: Uncover the Search Capabilities of Your Application. New York, NY: Apress.
Wolfram, Stephen (2023). What Is ChatGPT Doing ... and Why Does It Work? Champaign, IL: Wolfram Media.
License
Copyright (c) 2025 Dimitri Busch, Dmytro Lande

This work is licensed under a Creative Commons Attribution 4.0 International License.
All content of this journal, excluding individual logos and figures, is licensed under CC BY 4.0.
