Skip to content

How to extend Docs2KG

For the package side

If you want to add a new approach to extract relations or entities, it is very simple.

The only one principle you will need to follow is to make sure you will update the layout.json file with proper entities and relations.

For example:

{
  "filename": "gsdRec_2024_08",
  "data": [
    {
      "id": "p_8ffe9562-7be9-4e00-8e09-980972612298",
      "text": "RECORD 2024/8",
      "label": "P",
      "entities": [
        {
          "text": "2024/8",
          "label": "Project Code",
          "confidence": 1.0,
          "start": 7,
          "end": 13,
          "id": "ner-llm--6189338238195147291",
          "method": "NERLLMExtractor"
        }
      ],
      "relations": []
    }
  ]
}

This is one of the generated layout.json files, you just need to update this file with your generated ner or relations. And extend current entity and relation list.

To achieve that you can extend our SemanticKGConstructionBase class

from Docs2KG.kg_construction.semantic_kg.base import SemanticKGConstructionBase
from typing import List, Dict, Any


class YourNewApproach(SemanticKGConstructionBase):
    def __init__(self, project_id: str):
        super().__init__(project_id)

    def construct_kg(self, data: List[Dict[str, Any]]) -> None:
        # your code here
        pass

For the interface side

If you want a new feature in the interface, all you need to do is to email us, describe clearly what you want.