Volume 14 Issue 5
Jul.  2023
Qizhi Chen, Hong Yao, Shengwen Li, Xinchuan Li, Xiaojun Kang, Wenwen Lai, Jian Kuang. Fact-condition statements and super relation extraction for geothermic knowledge graphs construction[J]. Geoscience Frontiers, 2023, 14(5): 101412. doi: 10.1016/j.gsf.2022.101412

Fact-condition statements and super relation extraction for geothermic knowledge graphs construction

doi: 10.1016/j.gsf.2022.101412
More Information
  • Corresponding author: School of Computer Science, China University of Geosciences, Wuhan 430074, China. E-mail address: yaohong@cug.edu.cn (H. Yao)
  • Received Date: 2021-12-02
  • Accepted Date: 2022-06-07
  • Revision Received Date: 2022-05-05
  • Available Online: 2022-06-09
  • Publish Date: 2023-09-01
  • Researchers use information from the geoscience literature to deduce regional or global geological evolution. Traditionally, this process has relied on manual effort by researchers, and as the number of papers continues to grow, acquiring domain-specific knowledge has become a heavy burden. The Knowledge Graph (KG) has been proposed as a new knowledge representation technology to address this situation. However, previous KGs do not consider the super relation, which bridges a geological phenomenon (fact) and its precondition (condition). For instance, in the statement “the late Archean was a crucial transition period in the history of global geodynamics”, the condition statement (“crucial transition for global geodynamics”) complements the fact statement (“the late Archean was a crucial transition period”) and accurately defines the scale of the crucial transition in the late Archean. In this study, fact-condition statement extraction is introduced to construct a geological knowledge graph. A rule-based multi-input multi-output model (R-MIMO) is proposed for information extraction. In R-MIMO, fact-condition statements and their super relation are considered and extracted for the first time. To verify its performance, the GeothCF dataset, with 1455 fact tuples and 789 condition tuples, is constructed. In experiments, R-MIMO performs best with BERT as the encoder and LSTM-d as the decoder, achieving an F1 of 80.24% in tuple extraction and 70.03% in the tag prediction task. Furthermore, a geothermic KG with super relations is automatically constructed for the first time by the trained R-MIMO, which can provide structured data for further geothermic research.
  • Declaration of Competing Interest
    The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
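The fact-condition structure and the F1 metric mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the R-MIMO model or the GeothCF annotation scheme: the class names `StatementTuple` and `SuperRelation`, the `f1_score` helper, and the way the example sentence is split into subject, relation, and object are all assumptions made for illustration. It only shows how a fact tuple, a condition tuple, and the super relation linking them might be represented, and how an exact-match F1 over extracted tuples could be computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatementTuple:
    """A (subject, relation, object) tuple; hypothetical representation."""
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class SuperRelation:
    """Links a fact tuple to the condition tuple that qualifies it."""
    fact: StatementTuple
    condition: StatementTuple


def f1_score(predicted: set, gold: set) -> float:
    """Exact-match F1 between predicted and gold tuple sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Example sentence from the abstract; the tuple boundaries are an
# illustrative guess, not the paper's annotation.
fact = StatementTuple("the late Archean", "was", "a crucial transition period")
condition = StatementTuple("crucial transition", "for", "global geodynamics")
link = SuperRelation(fact=fact, condition=condition)

# A made-up prediction that recovers the fact tuple but not the condition.
gold = {fact, condition}
predicted = {fact, StatementTuple("the late Archean", "was", "a period")}
score = f1_score(predicted, gold)
```

Here one of two predicted tuples is correct and one of two gold tuples is recovered, so precision and recall are both 0.5. Frozen dataclasses are used so that tuples are hashable and can be compared as sets.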
