The goal of neo4jshell is to provide rapid querying of âNeo4Jâ graph databases by offering a programmatic interface with âcypher-shellâ. A wide variety of other functions are offered that allow importing and management of data files for local and remote servers, as well as simple administration of local servers for development purposes.
Pre-installation notesThis package requires the ssh
package for interacting with remote âNeo4Jâ databases, which requires libssh
to be installed. See the vignettes for the ssh
package here for more details.
This package also requires the âcypher-shellâ executable to be available locally. This is installed as standard in âNeo4Jâ installations and can usually be found in the bin
directory of that installation. It can also be installed standalone using Homebrew or is available here: https://github.com/neo4j/cypher-shell.
It is recommended, for ease of use, that the path to the âcypher-shellâ executable is added to your PATH
environment variable. If not, you should record its location for use in some of the functions within this package.
You can install the released version of neo4jshell from CRAN with:
install.packages("neo4jshell")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("keithmcnulty/neo4jshell")
Functionality Query
neo4j_query()
sends queries to the specified âNeo4Jâ graph database and, where appropriate, retrieves the results in a dataframe.
In this example, the movies dataset has been started locally in the âNeo4Jâ browser, with a user created that has the credentials indicated. cypher-shell
is in the local system path.
library(neo4jshell)
library(dplyr)
library(tibble)
# set credentials (no port required in bolt address)
neo_movies <- list(address = "bolt://localhost", uid = "neo4j", pwd = "password")
# find directors of movies with Kevin Bacon as actor
CQL <- 'MATCH (p1:Person {name: "Kevin Bacon"})-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(p2:Person)
RETURN p2.name, m.title;'
# run query
neo4j_query(con = neo_movies, qry = CQL)
#> p2.name m.title
#> 1 Ron Howard Frost/Nixon
#> 2 Rob Reiner A Few Good Men
#> 3 Ron Howard Apollo 13
Older versions of âNeo4Jâ and âcypher-shellâ (<4.0) will require the encryption
argument to be explicitly 'true'
or 'false'
. For newer versions, which have multi-tenancy, you can use the database
argument to specify the database to query.
neo4j_import()
imports a csv, zip or tar.gz file from a local source into the specified âNeo4Jâ import directory, uncompresses compressed files and removes the original compressed file as clean up.neo4j_rmfiles()
removes specified files from specified âNeo4Jâ import directoryneo4j_rmdir()
removes entire specified subdirectories from specified âNeo4Jâ import directoryIn this general example, we can see how these functions can be used for smooth ETL to a remote âNeo4Jâ server. This example assumes that the URL of the server that hosts the âNeo4Jâ database is the same as the bolt URL for the âNeo4Jâ database. If not, a different set of credentials will be needed for using neo4j_import()
.
# credentials (note no port required in server address)
neo_server <- list(address = "bolt://neo.server.address", uid = "neo4j", pwd = "password")
# csv data file to be loaded onto 'Neo4J' server (path relative to current working directory)
datafile <- "data.csv"
# CQL query to write data from datafile to 'Neo4J'
loadcsv_CQL <- "LOAD CSV FROM 'file:///data.csv' etc etc;"
# path to import directory on remote 'Neo4J' server (should be relative to user home directory on remote server)
impdir <- "./import"
# import data
neo4jshell::neo4j_import(con = neo_server, source = datafile, import_dir = impdir)
# write data to 'Neo4J' (assumes cypher-shell is in system PATH variable)
neo4jshell:neo4j_query(con = neo_server, qry = loadcsv_CQL)
# remove data file as clean-up
neo4jshell::neo4j_rmfiles(con = neo_server, files = datafile, import_dir = impdir)
In Windows, the âcypher-shellâ executable may need to be specified with the file extension, for example shell_path = "cypher-shell.bat"
.
If you are working with the âNeo4Jâ server locally, below will help you get started.
First, the code below is relative to user and is using âNeo4J 4.0.4 Communityâ installed at my userâs root. The directory containing the âcypher-shellâ and âneo4jâ executables are in my systemâs PATH environment variables.
## start the local server
neo4j_start()
#> Directories in use:
#> home: /Users/keithmcnulty/neo4j-community-4.0.4
#> config: /Users/keithmcnulty/neo4j-community-4.0.4/conf
#> logs: /Users/keithmcnulty/neo4j-community-4.0.4/logs
#> plugins: /Users/keithmcnulty/neo4j-community-4.0.4/plugins
#> import: /Users/keithmcnulty/neo4j-community-4.0.4/import
#> data: /Users/keithmcnulty/neo4j-community-4.0.4/data
#> certificates: /Users/keithmcnulty/neo4j-community-4.0.4/certificates
#> run: /Users/keithmcnulty/neo4j-community-4.0.4/run
#> Neo4j is already running (pid 2103).
#> [1] 0
## setup connection credentials and import directory location
neo_con <- list(address = "bolt://localhost:7687", uid = "neo4j", pwd = "password")
import_loc <- path.expand("~/neo4j-community-4.0.4/import/")
First we save mtcars
to a .csv
file, and we compress that file. This package supports a number of delivery formats, but we use a .zip
file as an example.
mtcars <- mtcars %>%
tibble::rownames_to_column(var = "model")
write.csv(mtcars, "mtcars.csv", row.names = FALSE)
zip("mtcars.zip", "mtcars.csv")
Now we use neo4j_import()
to place a copy of this file within the import directory you defined in import_loc
above.
neo4j_import(local = TRUE, graph, source = "mtcars.zip", import_dir = import_loc)
#> Import and unzip successful! Zip file has been removed!
We now write a CQL query to write some information from mtcars.csv
to the graph, and execute that query.
CQL <- "LOAD CSV WITH HEADERS FROM 'file:///mtcars.csv' AS row
WITH row WHERE row.model IS NOT NULL
MERGE (c:Car {name: row.model});"
neo4j_query(neo_con, CQL)
#> Query succeeded with a zero length response from Neo4J
Now, letâs remove the mtcars.csv
file from the import directory of our local server as cleanup. If you want to use a sub-directory to help manage your files during an ETL into âNeo4Jâ, you can remove that local sub-directory when your process has completed using neo4j_rmdir()
.
## remove the file
neo4j_rmfiles(local = TRUE, graph, files="mtcars.csv", import_dir = import_loc)
#> Files removed successfully!
Now letâs run a query to check the data was loaded to the graph.
CQL <- "MATCH (c:Car) RETURN c.name as name LIMIT 5;"
neo4j_query(neo_con, CQL)
#> name
#> 1 Mazda RX4
#> 2 Mazda RX4 Wag
#> 3 Datsun 710
#> 4 Hornet 4 Drive
#> 5 Hornet Sportabout
Local server administration and control
neo4j_start()
starts a local âNeo4Jâ instanceneo4j_stop()
stops a local âNeo4Jâ instanceneo4j_restart()
restarts a local âNeo4Jâ instanceneo4j_status()
returns the status of a local âNeo4Jâ instanceneo4j_wipe()
wipes an entire graph from a local âNeo4Jâ instanceFor example:
# my server was already running, confirm
neo4j_status()
#> Neo4j is running at pid 2103
#> [1] 0
# stop the server
neo4j_stop()
#> Stopping Neo4j....... stopped
#> [1] 0
# restart
neo4j_start()
#> Directories in use:
#> home: /Users/keithmcnulty/neo4j-community-4.0.4
#> config: /Users/keithmcnulty/neo4j-community-4.0.4/conf
#> logs: /Users/keithmcnulty/neo4j-community-4.0.4/logs
#> plugins: /Users/keithmcnulty/neo4j-community-4.0.4/plugins
#> import: /Users/keithmcnulty/neo4j-community-4.0.4/import
#> data: /Users/keithmcnulty/neo4j-community-4.0.4/data
#> certificates: /Users/keithmcnulty/neo4j-community-4.0.4/certificates
#> run: /Users/keithmcnulty/neo4j-community-4.0.4/run
#> Starting Neo4j.
#> Started neo4j (pid 12838). It is available at http://localhost:7474/
#> There may be a short delay until the server is ready.
#> See /Users/keithmcnulty/neo4j-community-4.0.4/logs/neo4j.log for current status.
#> [1] 0
# give it a few seconds to fire up
Sys.sleep(10)
# query again
neo4j_query(neo_con, qry="MATCH (c:Car) RETURN c.name as name LIMIT 5;")
#> name
#> 1 Mazda RX4
#> 2 Mazda RX4 Wag
#> 3 Datsun 710
#> 4 Hornet 4 Drive
#> 5 Hornet Sportabout
If you are using an admin account and you are using âNeo4J 4+â you can check what databases are available by querying the system database.
neo4j_query(neo_con, qry="SHOW DATABASES;", database = "system")
#> name address role requestedStatus currentStatus error default
#> 1 neo4j localhost:7687 standalone online online <NA> TRUE
#> 2 system localhost:7687 standalone online online <NA> FALSE
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4