Backing Up Your Shopify Data

Shopify has a great ecommerce platform, but they are still a single point of failure for a possible catastrophic data loss that wipes out all record of what you’ve built in your store and business. (Or worse, arbitrary censorship or other obscure reasons for kicking you off their platform 🙈.) In the event the unthinkable happens, you will want to be able to reconstruct your business history and not lose all of the valuable data and learnings.

Other reasons you may want to back up your data include recreating your store data for analysis, or possibly migrating to another platform later on.

In this post I will guide you on how to back up all of the important pieces of your Shopify in a more scalable way than just CSV exports.

The Unscalable Way

CSV Exports

You may have already been making a habit of exporting all of your orders, products, and customers directly from the Shopify dashboard. This is a fine initial solution — but you may soon be bothered by the .csv files piling up, growing file sizes, and repetitive effort to download. For a fast and ad hoc backup however you can’t beat the CSV exports.

Shopify Orders
Shopify Products
Shopify Customers

The Scalable Way

For the the actual real guide to backing up your Shopify data, you’re going to need the following coding stack (or refer to your resident developer). What we’re going to end up with is a mini relational database that represents your entire Shopify business, which will then allow you to reconstruct your business history (if needed), and even do cool things like run SQL queries against it for your own analyses.

Stack

  • MacOS
  • Python 3
  • SQLite

Setup

Create a new directory for your Shopify backup and set up a Python virtual environment:

mkdir shopify-backup
cd shopify-backup/
python3 -m venv venv

Activate your virtual environment and install the necessary packages:

. venv/bin/activate
pip install requests sqlalchemy

Configure Shopify API credentials

Create a .env environment configuration file to hold your Shopify API credentials. You can get an API key by going to your Shopify admin dashboard and navigating to Apps -> Private apps -> Create private app. Call your app simply “Shopify backup” with all of the default settings, and you should get an API key and and Password. Set these in your .env file.

touch config.env

and in the file:

SHOPIFY_URL=your-store.myshopify.com
SHOPIFY_API_KEY=yourapikeylettersandnumbers
SHOPIFY_API_PASSWORD=yourapipasswordlettersandnumbers


Screenshots

Write backup script

Create a new Python file called backup.py and plunk the following code into it. We won’t go through the specifics here, but the end results will be a lightweight SQLite database that holds all of your Shopify data.

#! /usr/bin/env python3

import os, math
import requests
import sqlalchemy
from sqlalchemy import MetaData, create_engine, Column, BigInteger, DateTime, String, ForeignKey, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker


# Environment variables
if os.path.exists('config.env'):
    for line in open('config.env'):
        var = line.strip().split('=')
        if len(var) == 2:
            os.environ[var[0]] = var[1].replace("\"", "")

# Metadata settings
convention = {
    "ix": 'ix_%(column_0_label)s',
    "uq": "uq_%(table_name)s_%(column_0_name)s",
    "ck": "ck_%(table_name)s_%(constraint_name)s",
    "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
    "pk": "pk_%(table_name)s"
}
Base = declarative_base()
Base.metadata = MetaData(naming_convention=convention)

# Models
class Product(Base):
    __tablename__ = 'products'
    id = Column(BigInteger, primary_key=True)
    title = Column(String)

class Customer(Base):
    __tablename__ = 'customers'
    id = Column(BigInteger, primary_key=True)
    first_name = Column(String)
    last_name = Column(String)
    email = Column(String)
    orders = relationship('Order', back_populates='customer')

class Order(Base):
    __tablename__ = 'orders'
    id = Column(BigInteger, primary_key=True)
    customer_id = Column(BigInteger, ForeignKey('customers.id', ondelete='cascade'))
    currency = Column(String)
    total_price = Column(String)
    customer = relationship('Customer', back_populates='orders')

# Create tables
basedir = os.path.abspath(os.path.dirname(__file__))
SQLALCHEMY_DATABASE_URI = 'sqlite:///' + os.path.join(basedir, 'db.sqlite')
engine = create_engine(SQLALCHEMY_DATABASE_URI)
session = sessionmaker()
session.configure(bind=engine)
db = session()
Base.metadata.create_all(engine)

# Ingest data
s = requests.Session()
s.auth = (os.getenv('SHOPIFY_API_KEY'), os.getenv('SHOPIFY_API_PASSWORD'))
url = 'https://' + os.getenv('SHOPIFY_URL') + '/admin/'
params = {'limit': 250}

## Products
Model = Product
model = 'products'
field_values = ['title']
count = s.get(url + model + '/count.json').json().get('count')
pages = math.ceil(count/250)
print("Starting import for {}...".format(model))
num = 0
for page in range(1, pages+1):
    r = s.get(url + model + '.json', params={'page': page, **params})
    objs = [i for i in r.json().get(model)]
    for i in objs:
        fields = {k: i.get(k) for k in field_values}
        obj = db.query(Model).filter_by(id=i['id'])
        if obj.first() is not None:
            obj.update(fields)
        else:
            obj = Model(id=i['id'], **fields)
            db.add(obj)
        num += 1
print("Imported {} {}.".format(num, model))

## Customers
Model = Customer
model = 'customers'
field_values = ['first_name', 'last_name', 'email']
count = s.get(url + model + '/count.json').json().get('count')
pages = math.ceil(count/250) # max 250 results per page
print("Starting import for {}...".format(model))
num = 0
for page in range(1, pages+1):
    r = s.get(url + model + '.json', params={'page': page, **params})
    objs = [i for i in r.json().get(model)]
    for i in objs:
        fields = {k: i.get(k) for k in field_values}
        obj = db.query(Model).filter_by(id=i['id'])
        if obj.first() is not None:
            obj.update(fields)
        else:
            obj = Model(id=i['id'], **fields)
            db.add(obj)
        num += 1
print("Imported {} {}.".format(num, model))

## Store products and customers for orders later
db.commit()

## Orders
Model = Order
model = 'orders'
field_values = ['currency', 'total_price']
count = s.get(url + model + '/count.json', params={'status': 'any'}).json().get('count')
pages = math.ceil(count/250)
print("Starting import for {}...".format(model))
num = 0
for page in range(1, pages+1):
    r = s.get(url + model + '.json', params={'page': page, 'status': 'any', **params})
    objs = [i for i in r.json().get(model)]
    for i in objs:
        customer = db.query(Customer).get(i['customer']['id'])
        if customer is None:
            continue
        fields = {k: i.get(k) for k in field_values}
        obj = db.query(Model).filter_by(id=i['id'])
        if obj.first() is not None:
            obj.update(fields)
        else:
            obj = Model(id=i['id'], customer_id=customer.id, **fields)
            customer.orders.append(obj)
            db.add(obj)
        num += 1
print("Imported {} {}.".format(num, model))

## Store orders
db.commit()

You can get the full source code from our Github repo. One thing to note is that we simplified which fields we pull in; there are lots of other possible fields you can back up, so check out the Shopify API docs to get the full list.

Run the script

chmod +x backup.py
./backup

Depending on how much data you have, the script may take a while; there are a few helpful command line messages, but you can add your own to better track the progress.

After the script completes you should have a db.sqlite3 file holding all of your data. As you continue to get more data and grow your store, you can simply go back and re-run the script to update all of your existing data and pull in new entries:

./backup

Using Your Backup

All of the different ways you can use your backup data is beyond the scope of this guide, but to get started you can fire up your SQLite interactive command line and check out what you have:

sqlite3 db.sqlite

And run some queries:

select * from customers;
select * from products;
select * from orders;

Conclusion

We went through a simple guide on how to back up all of your Shopify data for various reasons including catastrophe data loss insurance, censorship, custom analysis, or migration. Once you’ve backed up your data, the sky(net)’s the limit for what you can do with it!



We’re the founders behind Segments, a data science & advanced analytics app for your Shopify store. Our mission is to democratize data science access to the millions of online stores and give ecommerce back to the people. We'd love for you to check us out to see how we can help you optimize your store and grow sales!