Speeding up data import to Neo4j v5 and CSV format data #57

@nickzren

I ran into problems loading Hetionet data into Neo4j 5.13 on my updated MacBook. The existing Neo4j dumps are no longer compatible with this version, and importing the data directly from JSON was too slow, taking an estimated 10+ hours.

To address this, I've written a script that converts the JSON data to CSV without losing any node, edge, or property values. The JSON-to-CSV conversion takes approximately 30 seconds, and loading the CSV files into Neo4j takes around 40 seconds.
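
For readers who want a feel for the conversion step, here is a minimal sketch in Python. It is not the script from the linked branch. It assumes each node is a dict with `kind`, `identifier`, `name`, and a nested `data` dict of properties; the function name `nodes_to_csv_by_kind` and the sample records are hypothetical and made up for illustration.

```python
import csv
import io
from collections import defaultdict


def nodes_to_csv_by_kind(nodes):
    """Group node dicts by 'kind' and render one CSV string per kind.

    Assumed (hypothetical) node shape:
    {"kind": ..., "identifier": ..., "name": ..., "data": {...}}
    """
    grouped = defaultdict(list)
    for node in nodes:
        row = {"identifier": node["identifier"], "name": node.get("name", "")}
        # Flatten the nested property dict into top-level columns.
        row.update(node.get("data", {}))
        grouped[node["kind"]].append(row)

    csv_by_kind = {}
    for kind, rows in grouped.items():
        # Use the union of all keys so no property column is dropped.
        fieldnames = []
        for row in rows:
            for key in row:
                if key not in fieldnames:
                    fieldnames.append(key)
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        csv_by_kind[kind] = buffer.getvalue()
    return csv_by_kind


# Made-up sample records, not real Hetionet data.
sample_nodes = [
    {"kind": "Gene", "identifier": 1, "name": "A1BG", "data": {"chromosome": "19"}},
    {"kind": "Gene", "identifier": 2, "name": "A2M", "data": {}},
    {"kind": "Disease", "identifier": "DOID:1", "name": "example disease", "data": {}},
]
result = nodes_to_csv_by_kind(sample_nodes)
```

Here `result` maps each node kind to the text of its CSV file. Nodes that lack a given property get an empty cell rather than losing the column, which is one way to avoid dropping property values during conversion.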

Each node and edge type has its own CSV file and an accompanying Cypher script. I believe this will make the data easier to understand and work with.
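
To illustrate what a per-type Cypher script might look like, the sketch below builds a `LOAD CSV` statement for one node type. This is an assumption about the general approach, not the contents of the scripts in the linked branch. The helper name `cypher_load_nodes`, the `identifier` key column, and the file name are hypothetical.

```python
def cypher_load_nodes(kind, csv_filename):
    """Build a Cypher LOAD CSV statement for one node type.

    Hypothetical helper: assumes the CSV has an 'identifier' column
    and that the file sits in Neo4j's import directory.
    """
    # Backtick-quote the label so kinds containing spaces stay valid.
    label = "`" + kind.replace("`", "``") + "`"
    return (
        f"LOAD CSV WITH HEADERS FROM 'file:///{csv_filename}' AS row\n"
        f"MERGE (n:{label} {{identifier: row.identifier}})\n"
        f"SET n += row;"
    )


statement = cypher_load_nodes("Gene", "Gene.csv")
print(statement)
```

Note that `LOAD CSV` reads every value as a string, so a real script would likely need explicit conversions such as `toInteger(...)` for numeric properties.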

If this sounds useful, I'd be open to integrating these changes into the main branch. Let me know your thoughts.

You can find the revised code at:
https://github.com/nickzren/hetionet/tree/csv
