new

AuraDB Professional

AuraDB Free

Data Importer

Import

Data Importer - Introducing file filtering

This release of Data Importer introduces a new way to load more data sources without the need for pre-processing. By allowing you to apply simple filters to files, we're enabling loads in more scenarios, including:
  • Keeping only the relevant rows of a file while skipping the others
  • Loading data from aggregate node lists and relationship lists, where information on all nodes and all relationships is encapsulated in just two files.
We're going to take a quick look at how file filtering can help you with the latter example.
Consider the following subset of the Northwind data model showing Orders, the Products they contain and the Shippers they are shipped by.
[Image: subset of the Northwind data model showing Orders, Products and Shippers]
In the classic Northwind dataset, these are represented by different tables like Shippers.csv, Orders.csv, Products.csv and Order-Details.csv. When extracting data from more graph-like sources, however, it is not uncommon to be provided with wide node lists and relationship lists that contain all the nodes and all the relationships in just two files.
Here's what an example export from our very own Neo4j Bloom looks like (the same idea applies equally to any other graph-like export):
[Image: bloom-nodes-export.csv]
[Image: bloom-relationships-export.csv]
In the nodes file, you'll notice the node types are identified by the ~labels column, and in the relationships file the relationship types are identified by the ~relationship_type column.
Prior to the file filtering feature, you needed to split the nodes file into three files representing the three node types, and the relationships file into two files representing the two relationship types.
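To make that pre-processing step concrete, here's a rough sketch of the kind of script you might have written to split a combined nodes file by label. It's an illustration only: the function name and the sample rows are made up, and only the ~labels column name comes from the export shown above.

```python
import csv
import io
from collections import defaultdict


def split_rows_by_column(csv_text, column):
    """Group the rows of a CSV document by the value found in `column`."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[column]].append(row)
    return dict(groups)


# Made-up sample data; the "name" column and its values are hypothetical.
sample_nodes = (
    "~labels,name\n"
    "Product,Chai\n"
    "Shipper,Speedy Express\n"
    "Order,10248\n"
    "Product,Chang\n"
)

groups = split_rows_by_column(sample_nodes, "~labels")
for label, rows in groups.items():
    # Each group would previously have been written out as its own file.
    print(label, len(rows))
```

Each group corresponds to one of the separate files you would have had to create by hand before loading.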
With file filtering, now optionally available under the File dropdown in the Mapping Panel, you can apply include filters to keep only the rows relevant to the Nodes or Relationships in your model. Here's an example of applying the file filter to the bloom-relationships-export.csv file so that it keeps only the rows where the ~relationship_type column has the value ORDERS.
[Image: file filter applied to bloom-relationships-export.csv, keeping rows where ~relationship_type is ORDERS]
You'll notice that the filter, when applied, gives you feedback as you type on how many matches were found. For performance reasons, only the first 10,000 rows of any file are scanned, so even if you don't see any matches in this feedback, there may still be matches further down your file.
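If the scan limit seems abstract, the following sketch mimics the idea of a preview count that only looks at the first rows of a file. It is not Data Importer's actual implementation: the function name is hypothetical, the OTHER relationship type is a placeholder, and the demo uses a limit of 3 instead of 10,000 to keep the sample small.

```python
import csv
import io
from itertools import islice

PREVIEW_ROW_LIMIT = 10_000  # scan limit quoted in the text above


def count_preview_matches(csv_text, column, value, limit=PREVIEW_ROW_LIMIT):
    """Count exact matches of `value` in `column` within the first `limit` rows.

    Passing limit=None scans the whole file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(1 for row in islice(reader, limit) if row[column] == value)


# Made-up sample: the only ORDERS row sits after three placeholder rows.
sample_relationships = (
    "~relationship_type\n"
    "OTHER\n"
    "OTHER\n"
    "OTHER\n"
    "ORDERS\n"
)

preview_count = count_preview_matches(
    sample_relationships, "~relationship_type", "ORDERS", limit=3
)
full_count = count_preview_matches(
    sample_relationships, "~relationship_type", "ORDERS", limit=None
)
print(preview_count, full_count)
```

The preview count comes back as zero even though the full file contains a match, which is exactly the situation described above.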
The same filtering principle applies to the nodes file. In this example, we map to the Product node and keep only the rows in the bloom-nodes-export.csv file where the value in the ~labels column equals Product.
[Image: file filter applied to bloom-nodes-export.csv, keeping rows where ~labels equals Product]
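As a mental model for the include filter, here's a minimal sketch of exact-match row filtering. The function name and sample rows are hypothetical, and we're assuming here that "exact" means the comparison is case-sensitive with no trimming, which is an assumption of this sketch rather than documented behaviour.

```python
import csv
import io


def include_rows(csv_text, column, value):
    """Yield only the rows whose `column` is exactly equal to `value`."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[column] == value:
            yield row


# Made-up sample data; the "name" column and its values are hypothetical.
sample_nodes = (
    "~labels,name\n"
    "Product,Chai\n"
    "Shipper,Speedy Express\n"
    "product,lowercase label\n"
    "Product,Chang\n"
)

product_rows = list(include_rows(sample_nodes, "~labels", "Product"))
print([row["name"] for row in product_rows])
```

Under that assumption, the row labelled product in lowercase is skipped, so it's worth checking that the values in your file match the filter text character for character.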
For now, file filtering supports exact string matches only, but we'd love to hear your feedback on the utility of the filtering functionality and on other things you'd like to see. As always, please head over to https://feedback.neo4j.com/data-importer to leave us your feedback.
That's all for now, thank you for reading!