Metadata kg
MetadataKGConstruction
Bases: KGConstructionBase
The input should be a CSV file with the following columns:

- name: the name of the document
- other columns: metadata fields

For each metadata column, all unique values are extracted and a node is created for each value. A relationship is then created between the document and each of its metadata values. Continuous columns are not turned into nodes; their values are stored as properties of the document node instead.
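The procedure described above can be sketched in a few lines. This is a minimal, standard-library illustration and not the Docs2KG implementation: the function name `build_metadata_kg`, the node and relationship field names (`id`, `type`, `properties`, `source`, `target`), the `HAS_<COLUMN>` relationship naming, and the rule "a column is continuous when every value parses as a number" are all assumptions made for the example.

```python
import csv
import io
from typing import Any, Dict, List


def _is_number(value: Any) -> bool:
    """Return True when the value can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def build_metadata_kg(
    csv_text: str, document_id_column: str = "name"
) -> Dict[str, List[Dict[str, Any]]]:
    """Hypothetical helper: turn CSV metadata into nodes and relationships."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = [c for c in rows[0] if c != document_id_column] if rows else []

    # Assumption: a column counts as continuous if every value is numeric.
    continuous = {c for c in columns if all(_is_number(r[c]) for r in rows)}

    nodes: List[Dict[str, Any]] = []
    relationships: List[Dict[str, Any]] = []
    seen_values = set()

    for row in rows:
        doc_id = row[document_id_column]
        # Continuous columns become properties of the document node.
        properties = {c: float(row[c]) for c in continuous}
        nodes.append({"id": doc_id, "type": "Document", "properties": properties})

        for column in columns:
            if column in continuous:
                continue
            value = row[column]
            value_id = f"{column}:{value}"
            # One node per unique (column, value) pair.
            if (column, value) not in seen_values:
                seen_values.add((column, value))
                nodes.append(
                    {"id": value_id, "type": column, "properties": {"value": value}}
                )
            # Link the document to its metadata value.
            relationships.append(
                {"source": doc_id, "target": value_id, "type": f"HAS_{column.upper()}"}
            )

    return {"nodes": nodes, "relationships": relationships}


sample = "name,author,year\nreport_a.pdf,Alice,2021\nreport_b.pdf,Alice,2023\n"
kg = build_metadata_kg(sample)
```

In this sample, `author` is categorical, so both documents link to a single shared `author:Alice` node, while `year` is numeric and ends up as a property on each document node.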
Source code in Docs2KG/kg_construction/metadata_kg/metadata_kg.py
construct(docs, document_id_column='name')
Construct knowledge graph from document metadata
Parameters:
Name | Type | Description | Default |
---|---|---|---|
docs | Union[str, DataFrame] | Either path to CSV file or pandas DataFrame containing document metadata | required |
document_id_column | str | Name of the column containing document IDs | 'name' |
Returns:
Type | Description |
---|---|
Dict[str, List[Dict[str, Any]]] | Dictionary containing nodes and relationships for the knowledge graph |
Source code in Docs2KG/kg_construction/metadata_kg/metadata_kg.py
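As an illustration of working with a return value of type `Dict[str, List[Dict[str, Any]]]`, the sketch below groups relationship targets by document. The key names `nodes` and `relationships`, the per-relationship fields `source` and `target`, and the helper `metadata_by_document` are assumptions for this example; check the actual output of `construct` before relying on them.

```python
from collections import defaultdict
from typing import Any, Dict, List


def metadata_by_document(
    kg: Dict[str, List[Dict[str, Any]]]
) -> Dict[str, List[str]]:
    """Hypothetical helper: map each document ID to its linked metadata node IDs."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for rel in kg.get("relationships", []):
        grouped[rel["source"]].append(rel["target"])
    return dict(grouped)


# Hand-written example result; the field names are assumed, not taken from Docs2KG.
example_result = {
    "nodes": [
        {"id": "report_a.pdf", "type": "Document"},
        {"id": "author:Alice", "type": "author"},
        {"id": "topic:geology", "type": "topic"},
    ],
    "relationships": [
        {"source": "report_a.pdf", "target": "author:Alice", "type": "HAS_AUTHOR"},
        {"source": "report_a.pdf", "target": "topic:geology", "type": "HAS_TOPIC"},
    ],
}

linked = metadata_by_document(example_result)
```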