Designing and Scaling High-Throughput Shopify Middleware

Nicolò Rebughini

8 May 2024 Shopify, Software Development, System Integration

So, you’ve reached the hard conclusion that you need to build some middleware for your Shopify store—perhaps you need to integrate with an exotic third-party system that doesn't offer a native app, or you have a complex subscription flow that cannot be accommodated by an off-the-shelf solution.

But once you've mapped out all the business requirements, you're still left with a million decisions to make: where should you host your middleware? What's the best architecture, a monolith or microservices? Should your code run on a fixed cadence or respond to Shopify's webhooks?

In this article, we're publishing some broad technical considerations we've collected over the years while working on middleware implementations for our clients. We suggest treating these as general guidelines: your mileage may vary depending on your requirements, team composition, and growth stage.

Let's dive in!

Reactive vs. Scheduled Middleware

The first crucial architectural decision you'll make is whether your middleware should be reactive or scheduled.

Reactive middleware responds immediately to events triggered by external actions, such as customer purchases or inventory updates. It's the preferred choice when dealing with time-sensitive business logic, such as sending email notifications or updating order information.

In reactive middleware, a queue is essential for processing incoming event payloads asynchronously. By decoupling the reception of webhooks from the actual processing logic, a queue creates a buffer that lets your middleware absorb workload spikes. It also mitigates the risk of data loss during processing disruptions: as long as the queue is durable, an event that fails mid-processing can be retried rather than lost.

Queues are de facto mandatory when building middleware for Shopify: Shopify expects a prompt response to every webhook delivery, retries deliveries that fail or time out, and can remove your webhook subscription if the failures continue.
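To make the decoupling concrete, here is a minimal sketch of a webhook receiver that only verifies the signature, enqueues the raw payload, and acknowledges. It assumes the delivery is signed with a base64-encoded HMAC-SHA256 of the raw body (the value Shopify sends in the X-Shopify-Hmac-Sha256 header). The secret, the function names, and the in-memory queue are illustrative assumptions; a production setup would use a durable queue and a real HTTP framework.

```python
import base64
import hashlib
import hmac
import queue
import threading

# Hypothetical secret for illustration; a real app would load its own secret
# from configuration.
SHARED_SECRET = b"example-secret"

# In-memory stand-in for a durable queue (assumption: production code would
# use a message broker or a database-backed job table instead).
events: "queue.Queue[bytes]" = queue.Queue()


def sign(body: bytes, secret: bytes = SHARED_SECRET) -> str:
    """Return the base64-encoded HMAC-SHA256 of the raw request body."""
    digest = hmac.new(secret, body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def receive_webhook(body: bytes, hmac_header: str) -> int:
    """Verify the signature, enqueue the payload, and return an HTTP status.

    No business logic runs here, so the response can go out quickly.
    """
    expected = sign(body).encode()
    if not hmac.compare_digest(expected, hmac_header.encode()):
        return 401
    events.put(body)
    return 200


def worker(handle, stop: threading.Event) -> None:
    """Drain the queue in the background, calling `handle` on each payload."""
    while not stop.is_set():
        try:
            body = events.get(timeout=0.1)
        except queue.Empty:
            continue
        try:
            handle(body)
        finally:
            # Mark the item as done even if the handler raised, so that
            # events.join() does not block forever in this sketch.
            events.task_done()
```

Because the receiver does nothing beyond a signature check and a queue write, its response time stays flat even when the processing logic behind it is slow or temporarily down.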

Additionally, you should design reactive middleware to be idempotent and resilient against race conditions, since event notifications can arrive duplicated or out of order. Handling both cases explicitly mitigates the risk of data corruption.
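One way to approach this, sketched below, is to deduplicate on a per-delivery identifier and to ignore payloads older than the state already stored. The sketch assumes that `webhook_id` comes from a delivery identifier such as the X-Shopify-Webhook-Id header and that the payload carries an ISO 8601 `updated_at` field. The in-memory set and dictionary are stand-ins for a database, and the function name is hypothetical.

```python
import threading
from datetime import datetime

# In-memory stand-ins for persistent storage (assumption: a real deployment
# would rely on a database with unique constraints or atomic upserts).
processed_ids: set = set()
orders: dict = {}

# A single lock keeps the check-then-write sequence atomic across threads.
_lock = threading.Lock()


def handle_order_update(webhook_id: str, payload: dict) -> str:
    """Apply an order update at most once and never roll state backwards."""
    with _lock:
        # Duplicate delivery: this exact notification was already handled.
        if webhook_id in processed_ids:
            return "duplicate"

        incoming = datetime.fromisoformat(payload["updated_at"])
        current = orders.get(payload["id"])

        # Out-of-order delivery: stored state is at least as recent.
        if current is not None:
            stored = datetime.fromisoformat(current["updated_at"])
            if incoming <= stored:
                processed_ids.add(webhook_id)
                return "stale"

        orders[payload["id"]] = payload
        processed_ids.add(webhook_id)
        return "applied"
```

Calling the handler twice with the same `webhook_id` leaves the stored order untouched the second time, and an older payload arriving late is recorded as seen but does not overwrite newer data.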

Scheduled middleware, on the other hand, operates on a predefined cadence, executing tasks at regular intervals. It's the preferred choice when dealing with bulk data operations, such as subscription processing or (in some cases) integrations with third-party systems.

Because scheduled middleware might process a substantial amount of data during each execution, optimizing performance and resource utilization is essential. This can be achieved by grouping data into manageable chunks and processing it in batches, minimizing resource contention and maximizing throughput. When integrating with third-party systems, it's also essential to be mindful of rate limits, maximum payload sizes, and the opportunity to leverage batch APIs whenever available.
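As a rough illustration of chunking combined with a simple rate limit, the sketch below splits a dataset into fixed-size batches and spaces out the calls to a third-party system. The batch size, the minimum interval, and the `send_batch` callback are assumptions for the example; real values depend on the limits documented by the system you are integrating with.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def process_in_batches(
    items: Iterable[T],
    send_batch: Callable[[List[T]], None],
    batch_size: int = 100,
    min_interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Send items in batches, starting calls at least `min_interval` seconds apart.

    `sleep` and `clock` are injectable so the pacing can be tested without
    actually waiting.
    """
    sent = 0
    last_call = None
    for batch in chunked(items, batch_size):
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        send_batch(batch)
        sent += len(batch)
    return sent
```

Since `chunked` consumes its input lazily, the same function works with a database cursor or a paginated API iterator without loading the full dataset into memory.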

With these differences in mind, you can choose the right model for each workflow, using reactive middleware for time-sensitive logic and scheduled middleware for bulk operations, and keep your Shopify operations efficient and scalable.

Monoliths vs. Service-Oriented Architectures

Especially in larger projects, it's tempting to implement your middleware logic as a constellation of microservices, making up what's typically called a Service-Oriented Architecture, or SOA for short. Technically, these services may run as traditional long-lived web applications, but we often see them deployed in serverless environments such as AWS Lambda or Cloudflare Workers.

The theory is that this design increases technical freedom and reduces maintenance overhead: for instance, the subscription processing engine may be implemented using a technology different from the WMS integration. Plus, if they're deployed as serverless functions, there's no need to worry about scaling the services to match transaction volume because the infrastructure provider will handle that responsibility for us.

In our experience, however, most SOA setups suffer from premature optimization. The additional freedom and scalability come at the expense of a more complicated development and infrastructure setup, with multiple codebases to maintain and constraints on the deployment process. Code reuse is also more complicated, and rather than going through the hassle of building and maintaining an internal SDK, engineering teams often end up duplicating the shared code. In other words, the additional effort and complexity often outweigh the gains.

Because of this, when faced with complex middleware architectures, we always consider whether there's a chance to simplify things by moving to a Majestic Monolith deployed as a traditional, long-lived web application. While they're not as fancy as a serverless SOA, monoliths have several advantages:

  • Engineers only need to work in one codebase to make cross-cutting changes, which makes it easier to ship those changes atomically.
  • Code sharing across API endpoints and scheduled jobs is trivial since everything's in the same application.
  • We only need to write and maintain a single deployment pipeline, which makes deployments simpler and more robust.
  • We eliminate the need to deal with network failures, data model inconsistencies, and data synchronization issues between our own services.
  • It's still easier and cheaper to find engineers for a monolith and onboard them than it is for an SOA.
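The code-sharing point is easy to picture with a small sketch: in a monolith, a webhook handler and a scheduled job can call the same function directly, with no internal SDK or duplicated mapping code. All names below are hypothetical, and the target format is an invented stand-in for whatever a WMS integration expects; only the `id`, `line_items`, `sku`, and `quantity` fields mirror the shape of a Shopify order payload.

```python
from typing import Callable, Iterable


def normalize_order(order: dict) -> dict:
    """Shared mapping from an order payload to a hypothetical WMS format."""
    return {
        "reference": str(order["id"]),
        "lines": [
            {"sku": line["sku"], "qty": line["quantity"]}
            for line in order.get("line_items", [])
        ],
    }


def on_order_created(payload: dict, push: Callable[[dict], None]) -> None:
    """Reactive entry point: called for each incoming order webhook."""
    push(normalize_order(payload))


def nightly_resync(orders: Iterable[dict], push: Callable[[dict], None]) -> int:
    """Scheduled entry point: re-sends a batch of orders on a fixed cadence."""
    count = 0
    for order in orders:
        push(normalize_order(order))
        count += 1
    return count
```

A change to `normalize_order` reaches both entry points in a single commit and a single deployment, which is the kind of atomic change that is harder to coordinate across separate services.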

This is not to say that microservices and serverless architectures are always bad. There are situations where they can help with scalability, e.g., for trivial middleware that only needs to respond to a specific webhook (in which case a serverless infrastructure is acceptable) or when a particular piece of business logic can be optimized by rewriting it in a more suitable technology. However, these benefits always need to be weighed against the additional complexity introduced into the system.

Keep Your Options Open

As we mentioned at the beginning of this article, your mileage may and will vary, especially as time passes and your middleware starts taking on more responsibilities. Our suggestion is to regularly re-evaluate your design and architectural choices to ensure that your current setup is still the ideal fit for your needs.

Furthermore, remember that the best middleware is no middleware. Whenever you see an opportunity to simplify your architecture by removing custom logic in favor of a third-party solution, you should seriously evaluate it. E-commerce brands are not tech startups, and every dollar you save in development costs is a dollar you'll be able to invest in other areas of the business.

If you are struggling with fine-tuning your custom integrations or business logic, don’t hesitate to reach out: we can help you navigate and streamline the complexity, finding the ideal setup for your brand.
