On May 25th, 2018, an EU data privacy law known as GDPR came into effect. It affects the way companies collect and handle user data. In this article we will show you how to handle personal user data when creating database dumps in order to avoid potential GDPR penalties.
What is GDPR?
In short, General Data Protection Regulation or GDPR is a set of rules that regulate how EU citizen data must be managed, empowering EU citizens with more control over their personal data. Organizations have to make sure that personal data is legally gathered, strictly managed and respected. Only the data that is needed should be collected and processed.
You can find out what Inchoo did to prepare for GDPR in our blog post by Toni Anicic.
Personal data in a Magento 2 project
Often when developing a Magento 2 project, data from a live Magento 2 website has to be used. Usually this means creating a copy of the website's database structure and data, also known as a database dump. This database dump might include tables with personal user data such as names, addresses, emails, orders, invoices, etc. Having access to personal user data when it's not needed is considered bad practice, as the data might get lost, stolen or become available to people it's not intended for. Handling user data this way can lead to significant GDPR penalties. Luckily, there is a way to avoid this by using a CLI tool called n98-magerun2.
Netz98 magerun CLI tools
N98-magerun2 provides some handy tools to work with Magento 2 from the command line. Among the tools that n98-magerun2 provides is the database dump tool.
Installation
There are multiple ways to install n98-magerun2 in a Magento 2 project. We can download and install the phar file and make it executable or we can install the tool using Composer.
Install the phar file
To download the latest stable n98-magerun2.phar file, run this command in your Magento 2 project:
wget https://files.magerun.net/n98-magerun2.phar
Or if you prefer to use curl:
curl -O https://files.magerun.net/n98-magerun2.phar
Verify the download by comparing the SHA256 checksum with the one on the website:
shasum -a256 n98-magerun2.phar
We can make the phar file executable:
chmod +x ./n98-magerun2.phar
It can now be run directly as ./n98-magerun2.phar {command}, or through the PHP CLI interpreter:
php n98-magerun2.phar {command}
Install with Composer
To install n98-magerun2 using Composer, inside a Magento 2 project run:
composer require n98/magerun2
If there is an error, try:
composer require --no-update n98/magerun2
composer update
N98-magerun2 commands are executed from the vendor/bin/ folder. To verify the installation, run:
./vendor/bin/n98-magerun2 --version
Database dump
The db:dump command is used to dump the project database. It uses mysqldump.
./vendor/bin/n98-magerun2 db:dump
This command will create a file containing the database structure and all of the data.
Stripped database dump
The db:dump command has a --strip option that dumps only the structure of the listed tables and leaves their data out of the dump.
./vendor/bin/n98-magerun2 db:dump [--strip]
Tables whose data we want to strip are listed in the --strip option, separated by spaces. The wildcards * and ? can be used in table names to match multiple tables.
./vendor/bin/n98-magerun2 db:dump --strip="customer_address* sales_invoice_*"
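To preview which tables a pattern list would cover before running the dump, the patterns can be tried against table names with ordinary shell glob matching. This only approximates how n98-magerun2 resolves wildcards, and the function name matches_strip_pattern is hypothetical:

```shell
# Hypothetical helper: succeed if a table name matches any pattern in a
# space-separated strip list (shell glob semantics, an approximation only).
matches_strip_pattern() {
  table="$1"
  strip_list="$2"
  set -f                      # keep * and ? literal while splitting the list
  for pattern in $strip_list; do
    case "$table" in
      $pattern) set +f; return 0 ;;
    esac
  done
  set +f
  return 1
}

# Try the pattern list from the example against a few Magento 2 table names.
for t in customer_address_entity sales_invoice_item sales_invoice catalog_product_entity; do
  if matches_strip_pattern "$t" "customer_address* sales_invoice_*"; then
    echo "$t: stripped"
  else
    echo "$t: kept"
  fi
done
```

Under these glob rules sales_invoice_* matches sales_invoice_item but not sales_invoice itself, which is why the predefined groups list both sales_invoice and sales_invoice_*.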
Table groups
Predefined table groups, whose names start with @, can also be used in the --strip option. Each group stands for a list of tables whose data is left out of the dump when the group is named in --strip.
./vendor/bin/n98-magerun2 db:dump --strip="@stripped"
The table groups are predefined in the config.yaml file either in the vendor/n98/magerun2/ folder or in the n98-magerun2.phar package.
Available table groups:
@customers
  customer_address*
  customer_entity*
  customer_grid_flat
  customer_log
  customer_visitor
  newsletter_subscriber
  product_alert*
  vault_payment_token*
  wishlist*
@search
  catalogsearch_*
@sessions
  core_session
@log
  log_url
  log_url_info
  log_visitor
  log_visitor_info
  log_visitor_online
  report_event
  report_compared_product_index
  report_viewed_*
@quotes
  quote
  quote_*
@sales
  sales_order
  sales_order_address
  sales_order_aggregated_created
  sales_order_aggregated_updated
  sales_order_grid
  sales_order_item
  sales_order_payment
  sales_order_status_history
  sales_order_tax
  sales_order_tax_item
  sales_invoice
  sales_invoice_*
  sales_invoiced_*
  sales_shipment
  sales_shipment_*
  sales_shipping_*
  sales_creditmemo
  sales_creditmemo_*
  sales_recurring_*
  sales_refunded_*
  sales_payment_*
  enterprise_sales_*
  enterprise_customer_sales_*
  sales_bestsellers_*
  paypal_billing_agreement*
  paypal_payment_transaction
  paypal_settlement_report*
@admin
  admin*
  authorization*
@trade
  @customers
  @sales
  @quotes
@stripped
  @log
  @sessions
@development
  @admin
  @trade
  @stripped
  @search
The @development table group includes other predefined table groups that contain sensitive user data such as logs, sessions, trade data, admin users, orders, invoices, credit memos, quotes, etc. This table group should be used when personal user data is not needed.
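The nesting of the groups can be traced with a small sketch. The mapping below is copied by hand from the list above, not read from config.yaml, and the function names are hypothetical:

```shell
# Hypothetical lookup: groups that are made up of other groups.
group_members() {
  case "$1" in
    @trade)       echo "@customers @sales @quotes" ;;
    @stripped)    echo "@log @sessions" ;;
    @development) echo "@admin @trade @stripped @search" ;;
    *)            echo "" ;;   # groups that list tables directly
  esac
}

# Recursively print the table-level groups that a group resolves to.
expand_group() {
  members=$(group_members "$1")
  if [ -z "$members" ]; then
    printf '%s\n' "$1"
    return
  fi
  for m in $members; do
    expand_group "$m"
  done
}

expand_group @development
# prints, one per line:
# @admin @customers @sales @quotes @log @sessions @search
```

In other words, @development covers every table-level group in the list.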
The @development table group should take care of all customer data for a default Magento 2 project, but in many cases, a Magento 2 project will contain modules that create their own data tables. For customer data contained in tables not defined by the default Magento 2 project installation, custom table groups should be defined.
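The examples so far use @stripped. A dump for development would pass @development instead, optionally followed by extra table patterns. The sketch below assumes that group names and table patterns can be mixed in one --strip list. The wrapper strip_dump, its DRY_RUN switch and the table pattern vendor_loyalty_* are all hypothetical:

```shell
# Path to the tool; adjust if the phar file is used instead of Composer.
MAGERUN="${MAGERUN:-./vendor/bin/n98-magerun2}"

# Hypothetical wrapper: dump the database, stripping the given groups/tables.
# With DRY_RUN=1 the command is only printed, not executed.
strip_dump() {
  strip_list="$*"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s db:dump --strip="%s"\n' "$MAGERUN" "$strip_list"
  else
    "$MAGERUN" db:dump --strip="$strip_list"
  fi
}

# vendor_loyalty_* is a made-up pattern standing in for a third-party module's tables.
DRY_RUN=1 strip_dump "@development" "vendor_loyalty_*"
```

Printing the command first makes it easy to check the strip list before the real dump is created.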
Custom table groups
In addition to the predefined table groups, custom table groups can also be defined. To define a custom table group, create an n98-magerun2.yaml file inside the Magento 2 project's app/etc/ folder. The file should contain the following lines:
# app/etc/n98-magerun2.yaml
# ...
commands:
  N98\Magento\Command\Database\DumpCommand:
    table-groups:
      - id: table_group_name
        description: table group description
        tables: space separated list of tables
# ...
This creates a new table group, @table_group_name, that can be used in the --strip option to leave the data of the tables listed in this group out of the dump.
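As a concrete illustration, the file can be generated from the shell. This is a sketch only: the helper write_table_group, the group id and the table pattern in the usage note are hypothetical, and the helper refuses to overwrite an existing configuration file:

```shell
# Hypothetical helper: write a custom table group into
# <project>/app/etc/n98-magerun2.yaml. Refuses to overwrite an existing file.
# Usage: write_table_group <project-dir> <group-id> "<space separated tables>"
write_table_group() {
  project_dir="$1"
  group_id="$2"
  tables="$3"
  config="$project_dir/app/etc/n98-magerun2.yaml"
  if [ -e "$config" ]; then
    echo "$config already exists, edit it by hand" >&2
    return 1
  fi
  mkdir -p "$project_dir/app/etc"
  # Doubled backslashes become single ones in an unquoted here-document.
  cat > "$config" <<EOF
commands:
  N98\\Magento\\Command\\Database\\DumpCommand:
    table-groups:
      - id: $group_id
        description: Custom tables holding personal data
        tables: $tables
EOF
}
```

For example, write_table_group . vendor_loyalty "vendor_loyalty_*" would define a group that can then be passed as @vendor_loyalty in the --strip option.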
This way we can strip all of the personal user data that we do not need, which helps keep the database dump GDPR compliant and keeps personal user data out of environments where it is not needed.