Bases: BaseData, FeatureStore, GraphStore
A data object describing a heterogeneous graph, holding multiple node and/or edge types in disjoint storage objects. Storage objects can hold node-level, link-level, or graph-level attributes. In general, HeteroData tries to mimic the behavior of a regular nested Python dictionary. In addition, it provides useful functionality for analyzing graph structures, and provides basic PyTorch tensor functionalities.
    import torch
    from torch_geometric.data import HeteroData

    data = HeteroData()

    # Create two node types "paper" and "author" holding a feature matrix:
    data['paper'].x = torch.randn(num_papers, num_paper_features)
    data['author'].x = torch.randn(num_authors, num_authors_features)

    # Create an edge type "(author, writes, paper)" and build the
    # graph connectivity:
    data['author', 'writes', 'paper'].edge_index = ...  # [2, num_edges]

    data['paper'].num_nodes
    >>> 23

    data['author', 'writes', 'paper'].num_edges
    >>> 52

    # PyTorch tensor functionality:
    data = data.pin_memory()
    data = data.to('cuda:0', non_blocking=True)
Note that there exist multiple ways to create heterogeneous graph data, e.g.:
To initialize a node of type "paper" holding a node feature matrix x_paper named x:
    from torch_geometric.data import HeteroData

    # (1) Assign attributes after initialization,
    data = HeteroData()
    data['paper'].x = x_paper

    # or (2) pass them as keyword arguments during initialization,
    data = HeteroData(paper={ 'x': x_paper })

    # or (3) pass them as dictionaries during initialization,
    data = HeteroData({'paper': { 'x': x_paper }})
To initialize an edge from source node type "author" to destination node type "paper" with relation type "writes" holding a graph connectivity matrix edge_index_author_paper named edge_index:
    # (1) Assign attributes after initialization,
    data = HeteroData()
    data['author', 'writes', 'paper'].edge_index = edge_index_author_paper

    # or (2) pass them as keyword arguments during initialization,
    data = HeteroData(author__writes__paper={
        'edge_index': edge_index_author_paper
    })

    # or (3) pass them as dictionaries during initialization,
    data = HeteroData({
        ('author', 'writes', 'paper'):
        { 'edge_index': edge_index_author_paper }
    })
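The keyword form in option (2) spells the edge-type triplet as a single identifier joined by double underscores. The plain-Python sketch below illustrates that naming convention only. The helpers `edge_type_to_kwarg` and `kwarg_to_edge_type` are hypothetical names introduced for illustration and are not part of torch_geometric. The sketch also assumes that no type name itself contains a double underscore.

```python
from typing import Tuple

EdgeType = Tuple[str, str, str]


def edge_type_to_kwarg(edge_type: EdgeType) -> str:
    """Join (src, rel, dst) into one keyword-safe name (hypothetical helper)."""
    return "__".join(edge_type)


def kwarg_to_edge_type(name: str) -> EdgeType:
    """Split a 'src__rel__dst' keyword back into its triplet (hypothetical helper)."""
    src, rel, dst = name.split("__")
    return (src, rel, dst)


key = edge_type_to_kwarg(("author", "writes", "paper"))
print(key)                      # author__writes__paper
print(kwarg_to_edge_type(key))  # ('author', 'writes', 'paper')
```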
Creates a HeteroData object from a dictionary.
Self
Returns a list of all storages of the graph.
List[BaseStorage]
Returns a list of all node types of the graph.
Returns a list of all node storages of the graph.
List[NodeStorage]
Returns a list of all edge types of the graph.
Returns a list of all edge storages of the graph.
List[EdgeStorage]
Returns a list of node type and node storage pairs.
Returns a list of edge type and edge storage pairs.
Returns the seed/input node/edge type of the graph in case it refers to a sampled subgraph, e.g., obtained via NeighborLoader or LinkNeighborLoader.
Returns a dictionary of stored key/value pairs.
Returns a NamedTuple of stored key/value pairs.
Sets the attribute named key for every node/edge type present in the dictionary value_dict, using the corresponding value from that dictionary.
    data = HeteroData()

    data.set_value_dict('x', {
        'paper': torch.randn(4, 16),
        'author': torch.randn(8, 32),
    })

    print(data['paper'].x)
Self
Updates the data object with the elements from another data object. Added elements will override existing ones (in case of duplicates).
Self
Returns the dimension for which the value value of the attribute key will get concatenated when creating mini-batches using torch_geometric.loader.DataLoader.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
Returns the incremental count to cumulatively increase the value value of the attribute key when creating mini-batches using torch_geometric.loader.DataLoader.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
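To illustrate what these two hooks control, the plain-Python sketch below batches two small graphs by hand. It assumes the common convention that `edge_index` is concatenated along its last dimension and that node indices are shifted by the number of nodes in the preceding graphs. The helper `batch_edge_index` is a hypothetical name; this is a sketch of the idea, not the library's batching code.

```python
from typing import List, Tuple

EdgeIndex = List[List[int]]  # [2, num_edges] stored as two lists


def batch_edge_index(graphs: List[Tuple[int, EdgeIndex]]) -> EdgeIndex:
    """Concatenate edge indices column-wise, shifting node IDs per graph."""
    rows: List[int] = []
    cols: List[int] = []
    offset = 0  # running increment, grows by each graph's node count
    for num_nodes, (row, col) in graphs:
        rows.extend(r + offset for r in row)
        cols.extend(c + offset for c in col)
        offset += num_nodes
    return [rows, cols]


g1 = (3, [[0, 1], [1, 2]])  # 3 nodes, edges 0->1 and 1->2
g2 = (2, [[0], [1]])        # 2 nodes, edge 0->1
batched = batch_edge_index([g1, g2])
print(batched)  # [[0, 1, 3], [1, 2, 4]]
```

In a heterogeneous graph, the source and destination rows of an edge type would presumably be shifted by the node counts of their respective node types rather than by one shared offset.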
Returns the number of nodes in the graph.
Returns the number of features per node type in the graph.
Returns the number of features per node type in the graph. Alias for num_node_features.
Returns the number of features per edge type in the graph.
Returns True if the graph contains isolated nodes.
Validates the correctness of the data.
Returns the heterogeneous meta-data, i.e. its node and edge types.
    data = HeteroData()
    data['paper'].x = ...
    data['author'].x = ...
    data['author', 'writes', 'paper'].edge_index = ...

    print(data.metadata())
    >>> (['paper', 'author'], [('author', 'writes', 'paper')])
Collects the attribute key from all node and edge types.
    data = HeteroData()
    data['paper'].x = ...
    data['author'].x = ...

    print(data.collect('x'))
    >>> { 'paper': ..., 'author': ...}
Note
This is equivalent to writing data.x_dict.
Gets the NodeStorage object of a particular node type key. If the storage is not present yet, a new torch_geometric.data.storage.NodeStorage object is created for the given node type.
    data = HeteroData()
    node_storage = data.get_node_store('paper')
NodeStorage
Gets the EdgeStorage object of a particular edge type given by the tuple (src, rel, dst). If the storage is not present yet, a new torch_geometric.data.storage.EdgeStorage object is created for the given edge type.
    data = HeteroData()
    edge_storage = data.get_edge_store('author', 'writes', 'paper')
EdgeStorage
Renames the node type name to new_name in-place.
Self
Returns the induced subgraph containing the node types and corresponding nodes in subset_dict.
If a node type is not a key in subset_dict then all nodes of that type remain in the graph.
    data = HeteroData()
    data['paper'].x = ...
    data['author'].x = ...
    data['conference'].x = ...
    data['paper', 'cites', 'paper'].edge_index = ...
    data['author', 'paper'].edge_index = ...
    data['paper', 'conference'].edge_index = ...
    print(data)
    >>> HeteroData(
        paper={ x=[10, 16] },
        author={ x=[5, 32] },
        conference={ x=[5, 8] },
        (paper, cites, paper)={ edge_index=[2, 50] },
        (author, to, paper)={ edge_index=[2, 30] },
        (paper, to, conference)={ edge_index=[2, 25] }
    )

    subset_dict = {
        'paper': torch.tensor([3, 4, 5, 6]),
        'author': torch.tensor([0, 2]),
    }

    print(data.subgraph(subset_dict))
    >>> HeteroData(
        paper={ x=[4, 16] },
        author={ x=[2, 32] },
        conference={ x=[5, 8] },
        (paper, cites, paper)={ edge_index=[2, 24] },
        (author, to, paper)={ edge_index=[2, 5] },
        (paper, to, conference)={ edge_index=[2, 10] }
    )
subset_dict (Dict[str, LongTensor or BoolTensor]) – A dictionary holding the nodes to keep for each node type.
Self
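The effect of a node-induced subgraph on a single edge type can be sketched in plain Python. The sketch assumes that an edge survives only if both of its endpoints are kept, and that surviving nodes are relabeled to consecutive indices in the order they appear in the subset. The helper `induced_edges` is a hypothetical name and does not reflect the library's internal implementation.

```python
from typing import Dict, List


def induced_edges(
    edge_index: List[List[int]],
    src_keep: List[int],
    dst_keep: List[int],
) -> List[List[int]]:
    """Keep edges whose endpoints both survive, then relabel them."""
    # Map old node index -> new consecutive index, per node type.
    src_map: Dict[int, int] = {old: new for new, old in enumerate(src_keep)}
    dst_map: Dict[int, int] = {old: new for new, old in enumerate(dst_keep)}
    rows: List[int] = []
    cols: List[int] = []
    for s, d in zip(edge_index[0], edge_index[1]):
        if s in src_map and d in dst_map:
            rows.append(src_map[s])
            cols.append(dst_map[d])
    return [rows, cols]


# Edges of a hypothetical (author, writes, paper) edge type.
edge_index = [[0, 0, 1, 2, 2], [3, 7, 4, 5, 9]]
sub = induced_edges(edge_index, src_keep=[0, 2], dst_keep=[3, 4, 5, 6])
print(sub)  # [[0, 1], [0, 2]]
```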
Returns the induced subgraph given by the edge indices in subset_dict for certain edge types. Will currently preserve all the nodes in the graph, even if they are isolated after subgraph computation.
Returns the subgraph induced by the given node_types, i.e. the returned HeteroData object only contains the node types which are included in node_types, and only contains the edge types where both end points are included in node_types.
Self
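The rule for which edge types survive a node-type subgraph can be sketched on the metadata alone. The helper `filter_edge_types` below is a hypothetical name written in plain Python. It only mirrors the "both end points included" condition described above and does not touch any tensors.

```python
from typing import List, Tuple

EdgeType = Tuple[str, str, str]


def filter_edge_types(
    edge_types: List[EdgeType],
    node_types: List[str],
) -> List[EdgeType]:
    """Keep edge types whose source and destination types are both kept."""
    keep = set(node_types)
    return [(s, r, d) for (s, r, d) in edge_types if s in keep and d in keep]


edge_types = [
    ("paper", "cites", "paper"),
    ("author", "to", "paper"),
    ("paper", "to", "conference"),
]
kept = filter_edge_types(edge_types, ["paper", "author"])
print(kept)  # [('paper', 'cites', 'paper'), ('author', 'to', 'paper')]
```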
Returns the subgraph induced by the given edge_types, i.e. the returned HeteroData object only contains the edge types which are included in edge_types, and only contains the node types that appear as end points of those edge types.
Self
Converts a HeteroData object to a homogeneous Data object. By default, all features with same feature dimensionality across different types will be merged into a single representation, unless otherwise specified via the node_attrs and edge_attrs arguments. Furthermore, attributes named node_type and edge_type will be added to the returned Data object, denoting node-level and edge-level vectors holding the node and edge type as integers, respectively.
node_attrs (List[str], optional) – The node features to combine across all node types. These node features need to be of the same feature dimensionality. If set to None, will automatically determine which node features to combine. (default: None)
edge_attrs (List[str], optional) – The edge features to combine across all edge types. These edge features need to be of the same feature dimensionality. If set to None, will automatically determine which edge features to combine. (default: None)
add_node_type (bool, optional) – If set to False, will not add the node-level vector node_type to the returned Data object. (default: True)
add_edge_type (bool, optional) – If set to False, will not add the edge-level vector edge_type to the returned Data object. (default: True)
dummy_values (bool, optional) – If set to True, will fill attributes of remaining types with dummy values. Dummy values are NaN for floating point attributes, False for booleans, and -1 for integers. (default: True)
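The index bookkeeping behind a heterogeneous-to-homogeneous conversion can be sketched in plain Python. The sketch assumes node types are laid out one after another in a single index space, with integer type IDs assigned in insertion order. The helper `to_homogeneous_indices` is a hypothetical name; feature merging and dummy values are left out.

```python
from typing import Dict, List, Tuple

EdgeType = Tuple[str, str, str]


def to_homogeneous_indices(
    num_nodes: Dict[str, int],
    edges: Dict[EdgeType, List[List[int]]],
) -> Tuple[List[List[int]], List[int], List[int]]:
    """Merge per-type indices into one index space with type vectors."""
    offsets: Dict[str, int] = {}
    node_type: List[int] = []
    total = 0
    for type_id, (name, count) in enumerate(num_nodes.items()):
        offsets[name] = total  # first global index of this node type
        node_type.extend([type_id] * count)
        total += count

    rows: List[int] = []
    cols: List[int] = []
    edge_type: List[int] = []
    for type_id, ((src, _, dst), (row, col)) in enumerate(edges.items()):
        rows.extend(r + offsets[src] for r in row)
        cols.extend(c + offsets[dst] for c in col)
        edge_type.extend([type_id] * len(row))
    return [rows, cols], node_type, edge_type


num_nodes = {"paper": 3, "author": 2}
edges = {
    ("author", "writes", "paper"): [[0, 1], [2, 0]],
    ("paper", "cites", "paper"): [[0], [1]],
}
edge_index, node_type, edge_type = to_homogeneous_indices(num_nodes, edges)
print(edge_index)  # [[3, 4, 0], [2, 0, 1]]
print(node_type)   # [0, 0, 0, 1, 1]
print(edge_type)   # [0, 0, 1]
```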
Returns all registered tensor attributes.
Returns all registered edge attributes.
Applies the function func, either to all attributes or only the ones given in *args.
Applies the in-place function func, either to all attributes or only the ones given in *args.
Performs cloning of tensors, either for all attributes or only the ones given in *args.
Sorts and removes duplicated entries from edge indices edge_index.
Self
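Sorting and de-duplicating an edge index can be sketched in plain Python. The sketch assumes edges are ordered by source index first and destination index second. The helper `coalesce_pairs` is a hypothetical name and ignores any edge features attached to the duplicates.

```python
from typing import List


def coalesce_pairs(edge_index: List[List[int]]) -> List[List[int]]:
    """Sort edges by (source, destination) and drop duplicate entries."""
    pairs = sorted(set(zip(edge_index[0], edge_index[1])))
    rows = [r for r, _ in pairs]
    cols = [c for _, c in pairs]
    return [rows, cols]


edge_index = [[1, 0, 1, 0], [0, 2, 0, 1]]
coalesced = coalesce_pairs(edge_index)
print(coalesced)  # [[0, 0, 1], [1, 2, 0]]
```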
Concatenates self with another data object. All values need to have matching shapes at non-concat dimensions.
Self
Ensures a contiguous memory layout, either for all attributes or only the ones given in *args.
Returns the edge indices in the GraphStore in COO format.
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Copies attributes to CPU memory, either for all attributes or only the ones given in *args.
Returns the edge indices in the GraphStore in CSC format.
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Returns the edge indices in the GraphStore in CSR format.
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
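The relationship between the COO and CSR layouts can be sketched for a single edge type in plain Python. The sketch assumes that the three returned dictionaries hold, per edge type, a compressed row pointer, the column indices, and the permutation that reorders the original edges. That reading of the return value is an assumption, not something this page states. The helper `coo_to_csr` is a hypothetical name.

```python
from typing import List, Tuple


def coo_to_csr(
    row: List[int],
    col: List[int],
    num_rows: int,
) -> Tuple[List[int], List[int], List[int]]:
    """Convert COO (row, col) into CSR (rowptr, col, perm)."""
    # Stable sort of edge positions by source index.
    perm = sorted(range(len(row)), key=lambda i: row[i])
    # Count edges per row, then take the running sum to get the pointers.
    rowptr = [0] * (num_rows + 1)
    for r in row:
        rowptr[r + 1] += 1
    for i in range(num_rows):
        rowptr[i + 1] += rowptr[i]
    col_sorted = [col[i] for i in perm]
    return rowptr, col_sorted, perm


row = [2, 0, 1, 0]
col = [1, 2, 0, 1]
rowptr, col_csr, perm = coo_to_csr(row, col, num_rows=3)
print(rowptr)   # [0, 2, 3, 4]
print(col_csr)  # [2, 1, 0, 1]
print(perm)     # [1, 3, 2, 0]
```

The neighbors of row i are then `col_csr[rowptr[i]:rowptr[i + 1]]`. A CSC layout follows the same pattern with the roles of rows and columns swapped.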
Copies attributes to CUDA memory, either for all attributes or only the ones given in *args.
Detaches attributes from the computation graph by creating a new tensor, either for all attributes or only the ones given in *args.
Detaches attributes from the computation graph, either for all attributes or only the ones given in *args.
Generates and sets n_id and e_id attributes to assign each node and edge to a continuously ascending and unique ID.
Synchronously obtains an edge_index tuple from the GraphStore.
Synchronously obtains a tensor from the FeatureStore.
*args – Arguments passed to TensorAttr.
convert_type (bool, optional) – Whether to convert the type of the output tensor to the type of the attribute index. (default: False)
**kwargs – Keyword arguments passed to TensorAttr.
ValueError – If the input TensorAttr is not fully specified.
Obtains the size of a tensor given its TensorAttr, or None if the tensor does not exist.
Returns True if edge indices edge_index are sorted and do not contain duplicate entries.
Returns True if any torch.Tensor attribute is stored on the GPU, False otherwise.
Returns True if edge indices edge_index are sorted.
Synchronously obtains a list of tensors from the FeatureStore for each tensor associated with the attributes in attrs.
Note
The default implementation simply iterates over all calls to get_tensor(). Implementor classes that can provide additional, more performant functionality are recommended to override this method.
attrs (List[TensorAttr]) – A list of input TensorAttr objects that identify the tensors to obtain.
convert_type (bool, optional) – Whether to convert the type of the output tensor to the type of the attribute index. (default: False)
ValueError – If any input TensorAttr is not fully specified.
Returns the number of edges in the graph. For undirected graphs, this will return the number of bi-directional edges, which is double the amount of unique edges.
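The doubling for undirected graphs follows from storing each undirected edge as two directed entries. A minimal plain-Python sketch, assuming no self-loops and using the hypothetical helper `add_reverse_edges`:

```python
from typing import List, Tuple


def add_reverse_edges(pairs: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Store every undirected edge as both (u, v) and (v, u)."""
    directed = set()
    for u, v in pairs:
        directed.add((u, v))
        directed.add((v, u))
    return sorted(directed)


unique_edges = [(0, 1), (1, 2), (0, 2)]
directed_edges = add_reverse_edges(unique_edges)
print(len(unique_edges))    # 3
print(len(directed_edges))  # 6
```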
Copies attributes to pinned memory, either for all attributes or only the ones given in *args.
Synchronously adds an edge_index tuple to the GraphStore. Returns whether insertion was successful.
edge_index (Tuple[torch.Tensor, torch.Tensor]) – The edge_index tuple in a format specified in EdgeAttr.
*args – Arguments passed to EdgeAttr.
**kwargs – Keyword arguments passed to EdgeAttr.
Synchronously adds a tensor to the FeatureStore. Returns whether insertion was successful.
tensor (torch.Tensor or np.ndarray) – The feature tensor to be added.
*args – Arguments passed to TensorAttr.
**kwargs – Keyword arguments passed to TensorAttr.
ValueError – If the input TensorAttr is not fully specified.
Ensures that the tensor memory is not reused for another tensor until all current work queued on stream has been completed, either for all attributes or only the ones given in *args.
Synchronously deletes an edge_index tuple from the GraphStore. Returns whether deletion was successful.
Removes a tensor from the FeatureStore. Returns whether deletion was successful.
*args – Arguments passed to TensorAttr.
**kwargs – Keyword arguments passed to TensorAttr.
ValueError – If the input TensorAttr is not fully specified.
Tracks gradient computation, either for all attributes or only the ones given in *args.
Moves attributes to shared memory, either for all attributes or only the ones given in *args.
Returns the size of the adjacency matrix induced by the graph.
Returns a snapshot of data that only holds events that occurred in the period [start_time, end_time].
Self
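The idea of a temporal snapshot can be sketched in plain Python as a filter over per-edge timestamps. The sketch assumes one timestamp per edge and a closed interval, and the helper `snapshot_edges` is a hypothetical name. How node-level timestamps are handled is not covered here.

```python
from typing import List, Tuple


def snapshot_edges(
    edge_index: List[List[int]],
    time: List[int],
    start_time: int,
    end_time: int,
) -> Tuple[List[List[int]], List[int]]:
    """Keep only edges whose timestamp lies in [start_time, end_time]."""
    keep = [i for i, t in enumerate(time) if start_time <= t <= end_time]
    rows = [edge_index[0][i] for i in keep]
    cols = [edge_index[1][i] for i in keep]
    return [rows, cols], [time[i] for i in keep]


edge_index = [[0, 1, 2, 3], [1, 2, 3, 0]]
time = [5, 10, 15, 20]
sub_index, sub_time = snapshot_edges(edge_index, time, 10, 15)
print(sub_index)  # [[1, 2], [2, 3]]
print(sub_time)   # [10, 15]
```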
Sorts edge indices edge_index and their corresponding edge features.
Performs tensor device conversion, either for all attributes or only the ones given in *args.
Returns a snapshot of data that only holds events that occurred up to end_time (inclusive of edge_time).
Self
Updates a tensor in the FeatureStore with a new value. Returns whether the update was successful.
Note
Implementor classes can choose to define more efficient update methods; the default performs a removal and insertion.
tensor (torch.Tensor or np.ndarray) – The feature tensor to be updated.
*args – Arguments passed to TensorAttr.
**kwargs – Keyword arguments passed to TensorAttr.
Returns a view of the FeatureStore given a not yet fully-specified TensorAttr.
AttrView