Data Importer - Introducing file filtering

This release of Data Importer introduces a new way to load more data sources without the need for pre-processing. By allowing you to apply simple filters to files we're enabling loads in more scenarios, including:
Generally keeping data relevant from only certain rows in a file while skipping the others
Loading data from aggregate node lists
 and relationship lists
 where information on all nodes and all relationships is encapsulated in just two files.
We're going to take a quick look at how file filtering can help you with the latter example.
Consider the following subset of the Northwind data model showing Orders, the Products they contain and the Shippers they are shipped by.
In the classic Northwind dataset, these are represented by different tables like Shippers.csv
, Orders.csv
, Products.csv
 and Order-Details.csv
. When extracting data from more graph-like sources however, it is not uncommon to be provided with wide node lists and relationships lists that contain all the nodes and all the relationships in just two files.
Here's what an example exported from our very own Neo4j Bloom looks like (but could equally apply to any other graph-like export):
bloom-nodes-export.csv
bloom-relationships-export.csv
In the nodes file you'll notice the node types are identified by the ~labels
 column and relationship types in the relationships file by the ~relationship_type
 column.
Prior to the file filtering feature, you needed to separate the nodes files into three files representing the three node types and the relationships file into two files representing the two relationship types.
With file filtering, now optionally available under the File dropdown in the Mapping Panel, you can apply include filters to keep rows only relevant to the Nodes or Relationships in your model. Here's an example of applying the file filter to the bloom-relationships-export.csv
 file to ensure it only keeps the rows where the ~relationship_type
 column has ORDERS
 values.
You'll notice the filter when applied, gives you feedback as you type on how many matches were found. For performance reasons only the first 10,000 rows of any file are scanned, so even if you don't see matched in this feedback, there may still be matches further down your file.
The same filtering principle applies for the nodes file, in this example mapping to the Product node and only keeping the rows in the bloom-nodes-export.csv
 file where values in the column ~labels
 equal Product
.
For now file filtering supports exact string matches, but we'd love to hear your feedback on the utility of the filtering functionality and other things you'd like to see. As always, please head over to https://feedback.neo4j.com/data-importer to leave us your feedback.
That's all for now, thank you for reading!