GenAI integrations
Neo4j’s Vector indexes and Vector functions allow you to calculate the similarity between node and relationship properties in a graph. A prerequisite for using these features is that vector embeddings have been set as properties of these entities. The GenAI plugin enables the creation of such embeddings using GenAI providers.
To use the GenAI plugin you need an account and API credentials from any of the following GenAI providers: Vertex AI, OpenAI, Azure OpenAI, and Amazon Bedrock.
To learn more about using embeddings in Neo4j, see Vector indexes → Vectors and embeddings in Neo4j.
For a hands-on guide on how to use the GenAI plugin, see GenAI documentation - Embeddings & Vector Indexes Tutorial → Create embeddings with cloud AI providers.
Installation
The GenAI plugin is enabled by default in Neo4j Aura.
The plugin needs to be installed on self-managed instances. This is done by moving the neo4j-genai.jar file from /products to /plugins in the Neo4j home directory, or, if you are using Docker, by starting the Docker container with the extra parameter --env NEO4J_PLUGINS='["genai"]'.
For more information, see Operations Manual → Configure plugins.
Example graph
The examples on this page use the Neo4j movie recommendations dataset, focusing on the plot and title properties of Movie nodes.
The graph contains 28863 nodes and 332522 relationships.
There are 9083 Movie nodes with a plot and a title property.
To recreate the graph:
- For Aura or Enterprise Edition, import this dump file into an empty database.
- For self-hosted Community Edition, import this dump file into an empty database.
To learn how to import dumps, see Aura → Importing an existing database or Operations manual → Restore a database dump.
The embeddings on this page are generated using OpenAI (model text-embedding-ada-002), producing 1536-dimensional vectors.
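To check the import, a query along these lines counts the Movie nodes that carry both properties:

```cypher
// Count the Movie nodes that have both a plot and a title property.
MATCH (m:Movie)
WHERE m.plot IS NOT NULL AND m.title IS NOT NULL
RETURN count(m) AS moviesWithPlotAndTitle  // expected: 9083
```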
Generate a single embedding and store it
Use the genai.vector.encode() function to generate a vector embedding for a single value.
genai.vector.encode()
Function: genai.vector.encode(resource :: STRING, provider :: STRING, configuration :: MAP = {}) :: LIST<FLOAT>
- The resource (a STRING) is the object to transform into an embedding, such as a chunk of text or a node/relationship property.
- The provider (a STRING) is the case-insensitive identifier of the provider to use. See the identifiers under GenAI providers for supported options.
- The configuration (a MAP) contains provider-specific settings, such as which model to invoke, as well as any required API credentials. See GenAI providers for details of each supported provider. Note that because this argument may contain sensitive data, it is obfuscated in the query.log. However, if the function call is misspelled or the query is otherwise malformed, it may be logged without being obfuscated.
This function sends one API request every time it is called, which can add significant overhead in terms of both network traffic and latency. To generate many embeddings at once, see Generate a batch of embeddings and store them.
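As an illustration, the following query embeds a single movie's plot and stores the resulting vector on the node. This is a minimal sketch that assumes the OpenAI provider; $token is a placeholder parameter for your API key, and the vector is written with the db.create.setNodeVectorProperty procedure.

```cypher
// Encode one movie's title and plot, then store the vector as a node property.
MATCH (m:Movie)
WHERE m.plot IS NOT NULL AND m.title IS NOT NULL
WITH m LIMIT 1
WITH m, genai.vector.encode(m.title + ': ' + m.plot, 'OpenAI', { token: $token }) AS vector
CALL db.create.setNodeVectorProperty(m, 'embedding', vector)
RETURN m.title AS title, size(vector) AS dimensions
```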
Generate a batch of embeddings and store them
Use the genai.vector.encodeBatch() procedure to generate many vector embeddings with a single API request.
This procedure takes a list of resources as input and returns one result row per input resource, rather than a single row.
This procedure attempts to generate embeddings for all supplied resources in a single API request. Consult the respective provider’s documentation for details such as the maximum number of embeddings that can be generated per request.
genai.vector.encodeBatch()
Procedure: genai.vector.encodeBatch(resources :: LIST<STRING>, provider :: STRING, configuration :: MAP = {}) :: (index :: INTEGER, resource :: STRING, vector :: LIST<FLOAT>)
- The resources (a LIST<STRING>) parameter is the list of objects to transform into embeddings, such as chunks of text.
- The provider (a STRING) is the case-insensitive identifier of the provider to use. See GenAI providers for supported options.
- The configuration (a MAP) specifies provider-specific settings, such as which model to invoke, as well as any required API credentials. See GenAI providers for details of each supported provider. Note that because this argument may contain sensitive data, it is obfuscated in the query.log. However, if the procedure call is misspelled or the query is otherwise malformed, it may be logged without being obfuscated.
Each returned row contains the following columns:
- The index (an INTEGER) is the index of the corresponding element in the input list, to aid in correlating results back to inputs.
- The resource (a STRING) is the name of the input resource.
- The vector (a LIST<FLOAT>) is the generated vector embedding for this resource.
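Putting the pieces together, a query along the following lines embeds a batch of movie plots in one API request and stores the vectors. It is a minimal sketch that assumes the OpenAI provider; $token is a placeholder parameter for your API key, the batch is capped at 100 resources purely for illustration, and db.create.setNodeVectorProperty writes each vector back to its node.

```cypher
// Collect a small batch of movies, encode all of their plots in one request,
// and use the returned index to match each vector back to its node.
MATCH (m:Movie)
WHERE m.plot IS NOT NULL AND m.title IS NOT NULL
WITH collect(m)[0..100] AS movies
CALL genai.vector.encodeBatch(
  [m IN movies | m.title + ': ' + m.plot],  // one resource string per movie
  'OpenAI',
  { token: $token }
) YIELD index, vector
CALL db.create.setNodeVectorProperty(movies[index], 'embedding', vector)
RETURN count(*) AS embeddedMovies
```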
GenAI providers
The following GenAI providers are supported for generating vector embeddings.
Each provider has its own configuration map that can be passed to genai.vector.encode() or genai.vector.encodeBatch().
Vertex AI
- Identifier (provider argument): "VertexAI"
Key | Type | Description | Default |
---|---|---|---|
token | STRING | API access token. | Required |
projectId | STRING | GCP project ID. | Required |
model | STRING | The name of the model you want to invoke. | textembedding-gecko@001 |
region | STRING | GCP region where to send the API requests. | us-central1 |
taskType | STRING | The intended downstream application (see provider documentation). The specified taskType applies to all resources in the batch. | |
title | STRING | The title of the document that is being encoded (see provider documentation). The specified title applies to all resources in the batch. | |
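A minimal call sketch, assuming the configuration keys listed above; $token and $project are placeholder parameters for your GCP access token and project ID:

```cypher
// Generate one embedding with Vertex AI using the default model and region.
RETURN genai.vector.encode(
  'Embed this sentence.',
  'VertexAI',
  { token: $token, projectId: $project }
) AS vector
```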
OpenAI
- Identifier (provider argument): "OpenAI"
Key | Type | Description | Default |
---|---|---|---|
token | STRING | API access token. | Required |
model | STRING | The name of the model you want to invoke. | text-embedding-ada-002 |
dimensions | INTEGER | The number of dimensions you want to reduce the vector to. Only supported for certain models. | Model-dependent. |
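A minimal call sketch, assuming the configuration keys listed above; $token is a placeholder parameter for your OpenAI API key:

```cypher
// Generate one embedding with OpenAI using the default model.
RETURN genai.vector.encode('Embed this sentence.', 'OpenAI', { token: $token }) AS vector
```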
Azure OpenAI
- Identifier (provider argument): "AzureOpenAI"
Unlike the other providers, the model is configured when creating the deployment on Azure, and is thus not part of the configuration map.
Key | Type | Description | Default |
---|---|---|---|
token | STRING | API access token. | Required |
resource | STRING | The name of the resource to which the model has been deployed. | Required |
deployment | STRING | The name of the model deployment. | Required |
dimensions | INTEGER | The number of dimensions you want to reduce the vector to. Only supported for certain models. | Model-dependent. |
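A minimal call sketch, assuming the configuration keys listed above; $token, $resource, and $deployment are placeholder parameters for your Azure OpenAI credentials, resource name, and deployment name:

```cypher
// Generate one embedding with Azure OpenAI; the model is fixed by the deployment.
RETURN genai.vector.encode(
  'Embed this sentence.',
  'AzureOpenAI',
  { token: $token, resource: $resource, deployment: $deployment }
) AS vector
```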
Amazon Bedrock
- Identifier (provider argument): "Bedrock"
Key | Type | Description | Default |
---|---|---|---|
accessKeyId | STRING | AWS access key ID. | Required |
secretAccessKey | STRING | AWS secret key. | Required |
model | STRING | The name of the model you want to invoke. | amazon.titan-embed-text-v1 |
region | STRING | AWS region where to send the API requests. | us-east-1 |
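A minimal call sketch, assuming the configuration keys listed above; $accessKeyId and $secretAccessKey are placeholder parameters for your AWS credentials:

```cypher
// Generate one embedding with Amazon Bedrock using the default model and region.
RETURN genai.vector.encode(
  'Embed this sentence.',
  'Bedrock',
  { accessKeyId: $accessKeyId, secretAccessKey: $secretAccessKey }
) AS vector
```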