Generating Demo Databases Quickly With Faker and Dataset
I’ve been throwing the odd spare hour at a side project that needs a backend database to test against. (Interested in using or writing an asset tracking app for circuit boards? Check out trackrack on GitHub.) Before this, I had just been using some junk data, but I decided that randomly generated data would be a better, more realistic test of the app. It also gave me the chance to try out two software modules I’ve been wanting to mess around with for a while.
At a coding meetup, I had seen Ruby’s faker module held up as a nice tool for generating structured random data, but I had never had a reason to use it until now. It’s a great one-stop shop for generating email addresses, names, timestamps, and MAC addresses, which was just what I needed. Plus, if I ever want to go back and replace all the generated names with ones from Game of Thrones, it’s already got that capability built in.
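For a feel of what “structured random data” means here, below is a rough Python stand-in. To be clear, this is not my Ruby script and it is not Faker: it only uses the standard library, and the name lists, the field names (name, email, mac_address, created_at), and the 2017 reference date are all made up for illustration.

```python
import json
import random
from datetime import datetime, timedelta

# Tiny hypothetical name pools; Faker ships far larger ones.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"]
LAST_NAMES = ["Lovelace", "Hopper", "Torvalds", "Hamilton"]


def fake_mac(rng):
    """Return six random bytes formatted as a colon-separated MAC address."""
    return ":".join("{:02x}".format(rng.randrange(256)) for _ in range(6))


def fake_record(rng):
    """Build one flat record with the kinds of fields mentioned above."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    # Random timestamp within the 365 days before an arbitrary reference date.
    offset = timedelta(seconds=rng.randrange(365 * 24 * 3600))
    created = datetime(2017, 1, 1) - offset
    return {
        "name": first + " " + last,
        "email": "{}.{}@example.com".format(first.lower(), last.lower()),
        "mac_address": fake_mac(rng),
        "created_at": created.isoformat(),
    }


def generate(count, seed=None):
    """Generate `count` records; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [fake_record(rng) for _ in range(count)]


records = generate(5, seed=42)
demo_json = json.dumps(records, indent=2)  # what would be written to the JSON file
print(demo_json)
```

Seeding the generator is handy for test data: the same seed gives the same records on every run with a given Python version, so a failing test can be reproduced.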
Instead of a database, my little Ruby script just generates a JSON file as output. This was by design; the app, as written now, relies on a big JSON file in lieu of a database. Additionally, the second module I wanted to use is a Python package designed to ease database creation and use. Converting the flat file into a database seemed like the ideal dry run for the dataset Python module.
Note that none of the code written with dataset looks like a SQL statement. As far as I’m concerned, that’s OK! I know next to nothing about SQL. Which makes dataset’s hand-holdy approach doubly convenient: it hides just about all of the SQL-y stuff away under a nice Python abstraction and leaves me with a tidy .db file to drop into the application. (…provided that I ever figure out enough about dataset to need it, anyway.)
I’d have liked to write this as an end-to-end Python application, but I had issues installing the Python port of Faker. Something to do with importing an ipaddress module…? I also found a few GitHub issues saying it wasn’t backwards compatible with Python 2. Yeah, yeah, I get it, I should just migrate to 3 already… In the meantime, however, it was nothing that forty-odd lines of Ruby couldn’t handle.
Now I just have to figure out enough about dataset to actually use this database in the app backend… More to come.