Migration
Since Superfiliate was born as an app for Shopify stores, our back-end initially had a ShopifyProduct model instead of a generic Product one. This worked fine until now, but it restricted us to living only inside the Shopify world: representing a generic product with the existing ShopifyProduct entity would not be ideal. So we needed to restructure one of our main models to support generic products. This article explains how we managed to do so in a few weeks and with zero downtime.
Issues & architecture
The main feature we needed to implement, as a squad of 2 engineers in a 3-week sprint, was what we called “Fake product”: a product that the merchant creates in our app, instead of one that comes from Shopify. We soon realized that there would be no clean way of doing so without a restructuring. Our current model, ShopifyProduct, is an entity that stores the data exactly as it is on Shopify; it is a kind of mirror. We also allow our merchants to edit some attributes of a Shopify product in our app, like the title, but we store this customized data in another entity called Personalization. It is also worth mentioning that a ShopifyProduct can have multiple ProductVariants (for example, if you have a T-shirt with multiple sizes, each size would be a variant) and some other 1-to-n relations. A simplified diagram of it would be like this:
The Personalization stores the personalized data as a jsonb inside the column partial_data.
We then started to brainstorm how we could restructure our current architecture to support generic products given the short deadline of 3 weeks. The first step was to think about what the ideal architecture for this new kind of product would look like. There were some options:
- Creating products under the ShopifyProduct model: the easiest way, but also the dirtiest. If we followed this approach, “fake” products would carry many useless columns that only make sense for real Shopify products. In the long term, this can cause a mess in our code, like having many checks of the form if is_shopify_product *do this* else *do that*
- Keeping the ShopifyProduct structure as it is, with its personalization, and creating a Product entity: this would still be a little dirty, since we would have a ShopifyProduct that isn't a Product. It would also be complex whenever we wanted to access fields that exist on both kinds; for example, to get the title we would need shopify_product.personalization.title if it is a ShopifyProduct or product.title if it is a “fake” product
- Creating a FakeProduct entity: easy, but we would have the same complexity when accessing “common” fields
- Creating a new Product entity, with a 1-1 relation to the ShopifyProduct: the chosen approach, since it makes sense to think of ShopifyProducts as products. Following this approach, we could adjust both our back-end and front-end to manipulate just products. For example, it seems unnecessary to access the title of a ShopifyProduct through another entity called Personalization instead of looking at the Product itself, which is the entity that would store the title for “fake” products
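To make the chosen option more concrete, here is a minimal plain-Ruby sketch of a Product with an optional 1-1 link to its ShopifyProduct mirror. It is not our actual model code: the classes are stand-ins, and every attribute other than title (such as shopify_id and the fake? helper) is a hypothetical name chosen for illustration.

```ruby
# Hypothetical stand-in for the mirror entity; shopify_id is an assumed field.
ShopifyProduct = Struct.new(:id, :shopify_id, :title, keyword_init: true)

# Hypothetical stand-in for the new generic entity.
class Product
  attr_reader :id, :title, :shopify_product

  # shopify_product stays nil for a "fake" product created by the merchant.
  def initialize(id:, title:, shopify_product: nil)
    @id = id
    @title = title
    @shopify_product = shopify_product
  end

  def fake?
    shopify_product.nil?
  end
end

mirror = ShopifyProduct.new(id: 1, shopify_id: "gid-123", title: "T-shirt")
synced = Product.new(id: 1, title: "Custom T-shirt", shopify_product: mirror)
fake   = Product.new(id: 2, title: "Gift card")

# Callers read the title the same way for both kinds of product,
# with no branching on where the product came from.
titles = [synced, fake].map(&:title)
```

The point of the design is visible in the last line: the title lives on the Product, so the Shopify mirror only matters to code that syncs with Shopify.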
We then designed the chosen approach like this:
However, migrating the ProductVariants and all other relations from ShopifyProduct would take more time and, since they weren’t necessary for the “fake” product to work properly, we decided not to migrate them for now, which led us to this final diagram:
Notice that in this new architecture, the Personalization doesn’t exist. We have replaced it with the Product entity. This way, we can let the ShopifyProduct really be just a mirror and, whenever we want to send the product to the front-end, we use the Product entity for reading, editing, creating, and deleting (the last 2 are only available for “fake” products). All of our front-end was using the shopify_product directly in its queries, and replacing those usages was probably the biggest effort in terms of time. We also had the issue that these GraphQL queries had names like products, instead of shopify_products.
Our front-end is deployed separately from the back-end, so any change in our API is very costly, because we need to:
- Add the new field in back-end
- Use the new field in front-end
- Delete the old field from back-end
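The three steps above can be sketched as a small simulation. This is only an illustration under assumptions: the phase names and the old_title / new_title fields are hypothetical, and the real API is GraphQL rather than plain hashes. It checks that, at every phase, the field the deployed front-end reads is actually served by the back-end.

```ruby
# Hypothetical deploy phases, in order.
PHASES = %i[initial new_field_added front_end_switched old_field_removed].freeze

# What the back-end serves at each phase (old_title / new_title are assumed names).
def api_payload(product, phase)
  step = PHASES.index(phase)
  raise ArgumentError, "unknown phase: #{phase}" if step.nil?

  payload = {}
  payload[:old_title] = product[:title] if step < 3  # removed only in the last deploy
  payload[:new_title] = product[:title] if step >= 1 # added in the first deploy
  payload
end

# Which field the deployed front-end reads at each phase.
def front_end_reads(phase)
  PHASES.index(phase) >= 2 ? :new_title : :old_title
end

product = { title: "T-shirt" }
always_served = PHASES.all? do |phase|
  api_payload(product, phase).key?(front_end_reads(phase))
end
```

Skipping the middle state, where both fields are served, is what would break a separately deployed front-end, which is why each API change costs at least three deploys.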
The last thing to mention is that we also have some other entities that are not directly related to the ShopifyProduct (they don’t have a DB foreign key), but they store some ids from ShopifyProducts inside JSONBs, something like this:
We need these entities to store the ids of Products from now on, since the Product is the entity the front-end will know about.
The plan
Considering all the issues mentioned above, we have broken down the migration into the following steps:
This seemed to be the easiest way to do this migration. An alternative was to do it in reverse:
- Start creating “fake” products under the ShopifyProduct model, with the shopify-related fields as nil
- Rename the shopify_products table to be only products
- Create a new shopify_products table, migrating the pertinent data from products to this new one
- Remove shopify-related fields from the products table
This way we would have the advantage of launching the “fake” product feature from the very beginning of our implementation. However, it would come with two major risks: the complexity of renaming a big table like this, which could lead to some downtime, and the risk of reaching the end of the 3 weeks with the migration still in progress, which could force us to postpone paying off some technical debt.
Given our chosen approach, by initially creating the products with the same id as their shopify_product, we ensured that we would neither run into problems nor need to migrate the other models that reference shopify_products. We just needed to switch the id to be set automatically right before we started implementing the “fake” product feature.
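As an illustration of why reusing the id avoids touching the other models, here is a rough in-memory sketch of such a backfill. It is not the migration we ran: the rows are made up, the column names other than partial_data are assumptions, and falling back to the mirrored title when no personalization exists is also an assumption.

```ruby
# Hypothetical rows; only partial_data is a column name taken from the article.
shopify_products = [
  { id: 10, title: "T-shirt" },
  { id: 11, title: "Mug" }
]
personalizations = {
  10 => { partial_data: { "title" => "Limited T-shirt" } }
}

products = shopify_products.map do |sp|
  custom = personalizations.dig(sp[:id], :partial_data) || {}
  {
    id: sp[:id],                              # reuse the ShopifyProduct id
    shopify_product_id: sp[:id],              # assumed foreign key name
    title: custom.fetch("title", sp[:title])  # personalized value wins, if any
  }
end

# An entity that stored ShopifyProduct ids inside a JSONB keeps resolving
# to the right Product without any change to the stored ids.
stored_ids = [10, 11]
resolved = stored_ids.map { |id| products.find { |p| p[:id] == id }[:title] }
```

Because every Product shares its id with its ShopifyProduct, the ids already stored in those JSONBs point at the right Product from day one.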
As stated before, doing the replacements on the front-end was the most time-consuming step, since a big part of our application relies on the products endpoints and there were lots of things to be tested (we don’t yet have automated tests on the front-end).
But the riskiest steps are those with the dashed square around them. They are marked like this because they needed to be done one right after another, outside peak hours. To be able to create products without manually defining the id, we first needed to reset the counter that Postgres keeps for the id. But, at the same time, we couldn’t stop manually defining the id for products before we reset it, otherwise we could face errors like id 1 has already been taken. To avoid any issue between the moment we reset the counter and the deploy that stopped this manual definition, we reset the counter to a number 50k higher than the id of the last Product we had created. This number was chosen based on how many products are created in a day (we have never created more than 30k products in a day, so 50k is safe).
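The arithmetic behind the 50k margin can be sketched as follows. The sequence name products_id_seq is an assumption (it is the name Postgres gives by default to the sequence of a serial id column on a products table), the max_id value is made up, and the worst case assumes ids advance by one per product created during the window.

```ruby
SAFETY_MARGIN = 50_000 # from the article: comfortably above the 30k daily peak
DAILY_PEAK    = 30_000 # from the article: most products ever created in a day

# Builds the statement that moves the sequence forward. With setval(name, n),
# the next nextval call returns n + 1.
def reset_sequence_sql(max_id, margin: SAFETY_MARGIN)
  "SELECT setval('products_id_seq', #{Integer(max_id) + margin});"
end

max_id = 1_200_000 # hypothetical id of the last Product created
next_auto_id = max_id + SAFETY_MARGIN + 1

# Worst case: a full day's peak of products is still created with manually
# assigned ids before the code that stops doing so is deployed.
highest_manual_id = max_id + DAILY_PEAK
collision = highest_manual_id >= next_auto_id

sql = reset_sequence_sql(max_id)
```

With these numbers, the manually assigned ids stay 20k below the first automatically assigned one, so the two ranges cannot overlap during the window.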
After these steps, we were finally able to create custom products normally.
Learnings
The biggest takeaway from this project is that implementing big changes with a detailed execution plan, and without rushing, really pays off. We had enough time to test every single dangerous pull request before actually deploying it, and this is essential for the success of any big refactor.
Next steps
We already have the necessary back-end for creating a basic custom product. However, it would be very nice if these products could also have the relations the ShopifyProduct has, like the ProductVariant. The sad news is that this would also be too costly (taking a whole 3-week cycle), for a few reasons:
- There are also lots of references for ProductVariant in our front-end
- Every change in our API requires at least 3 deploys
- We will need to also migrate the records from ProductVariant (or any other entity) to the new table ShopifyProductVariant
However, doing this first migration for the core entity has already shown us that such a migration is neither impossible nor endless!