Benchmarking MongoSluice: Streaming Yelp’s User Data to MySQL

Dec 17, 2018 | Case Studies

The Goal

MongoSluice’s key feature is that it accurately converts MongoDB data (BSON) into tables of rows and columns without any manual labor. To generate a perfect representation of the data, every single document within a collection needs to be checked, and MongoSluice does that. Once the data is moved over, what comes next is up to you!
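
To illustrate why every document has to be checked, here is a minimal Python sketch of schema inference over schemaless documents. It is not MongoSluice’s implementation; the function name infer_schema, the dotted column naming, and the sample documents are all assumptions made for this example. It walks each document, flattens nested fields, and records every value type seen per field.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set


def infer_schema(documents: Iterable[Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Union the fields and value types seen across every document."""
    columns: Dict[str, Set[str]] = defaultdict(set)

    def visit(prefix: str, value: Any) -> None:
        if isinstance(value, dict):
            # Flatten nested documents into dotted column names.
            for key, child in value.items():
                visit(f"{prefix}.{key}" if prefix else key, child)
        else:
            # Record the Python type name observed for this column.
            columns[prefix].add(type(value).__name__)

    for doc in documents:
        visit("", doc)
    return dict(columns)


# Hypothetical sample documents (not taken from the Yelp dataset).
sample = [
    {"user_id": "a1", "review_count": 3},
    {"user_id": "b2", "review_count": "n/a", "compliments": {"hot": 1}},
]
schema = infer_schema(sample)
for column, types in sorted(schema.items()):
    print(column, sorted(types))
```

In this sample, the nested field compliments.hot and the string-typed review_count appear only in the second document, so a scan that stopped after the first document would miss both.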

The Data

To benchmark MongoSluice, we used a 2 GB JSON dataset provided by Yelp, yelp_dataset_users.json, which consists of 1,518,169 documents.

The Hardware and Software

Here are the specs of our hardware, with each component running on a separate DigitalOcean Droplet:

  • The MongoDB Droplet: Ubuntu 4.0.2; 16 GB Memory; 6 vCPUs; and 320 GB of disk space
  • The MySQL Droplet: Ubuntu 4.0.2; 4 GB Memory; 2 vCPUs; and 80 GB of disk space
  • The MongoSluice Droplet: Ubuntu 4.0.2; 4 GB Memory; 2 vCPUs; and 80 GB of disk space

Processing Time

Here is the time that MongoSluice took to process the dataset, moving it from MongoDB to MySQL:

  • Total Time: 158 minutes
    • Generating schema: 85 minutes
    • Streaming data from MongoDB to MySQL: 73 minutes
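
As a rough way to read these timings, the short Python sketch below turns the reported durations and the document count into average throughput per phase. The helper name docs_per_second is an assumption for this example, and the result is a simple average that assumes each phase touched all 1,518,169 documents once.

```python
# Figures reported in this benchmark.
DOCUMENTS = 1_518_169
SCHEMA_MINUTES = 85
STREAM_MINUTES = 73


def docs_per_second(documents: int, minutes: float) -> float:
    """Average number of documents handled per second over a phase."""
    return documents / (minutes * 60)


total_minutes = SCHEMA_MINUTES + STREAM_MINUTES
schema_rate = docs_per_second(DOCUMENTS, SCHEMA_MINUTES)
stream_rate = docs_per_second(DOCUMENTS, STREAM_MINUTES)
overall_rate = docs_per_second(DOCUMENTS, total_minutes)

print(f"total time:   {total_minutes} minutes")
print(f"schema scan:  {schema_rate:.0f} docs/s")
print(f"streaming:    {stream_rate:.0f} docs/s")
print(f"end to end:   {overall_rate:.0f} docs/s")
```

With the numbers above, this works out to roughly 298 documents per second for schema generation, 347 for streaming, and 160 end to end.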

End Result

Here is a look at the schema in MySQL Workbench:

 

About MongoSluice

MongoSluice is the most complete solution for leveraging your MongoDB data in BI applications and relational database systems.

Guarantee

We guarantee satisfaction.
Zero hassles.