Building a Business System Integration and Automation Platform at Shopify — Development (2022)



Companies organize and automate their internal processes with a multitude of business systems: Shopify, Salesforce, HubSpot, Airtable, NetSuite, Google Sheets, and many more. Since companies function as a whole, these systems need to be able to talk to one another. At Shopify, we took advantage of Ruby, Rails, and our scale with these technologies to build a business system integration solution.

In step with software design’s progression from monolithic to modular architecture, business systems have proliferated over the past 20 years, becoming smaller and more focused. Software hasn’t only targeted the different business domains like sales, marketing, support, finance, legal, and human resources, but the niches within or across these domains, like tax, travel, training, documentation, procurement, and shipment tracking. Targeted applications can provide the best experience by enabling rapid development within a small, well defined space.

The Gap

The transition from monolithic to modular architecture doesn’t remove the need for interaction between modules. Maintaining well-defined, versioned interfaces and integrating with other modules is one of the biggest costs of modularization. In the business systems space, however, it doesn’t always make sense for vendors to take responsibility for integration, or do it in the same way.

Business systems are built on different tech stacks with different levels of competition and different customer requirements. This landscape leads to business systems with asymmetric interfaces (from SOAP to FTP to GraphQL) and integration capabilities (from complete integration platforms to nothing). Businesses are left with a gap between their systems and no clear, easy way to fill it.

Organic Integration

Connecting these systems on an as-needed basis leads to a hacky hodgepodge of:

  • ad hoc code (often running on individuals’ laptops)
  • integration platforms like Zapier
  • users downloading and uploading CSVs
  • third-party integration add-ons from app stores
  • out-of-the-box integrations
  • custom integrations built on capable business systems.

Frequently, data doesn’t go directly from the source system to the target system, but makes multiple layovers in whatever systems it could be integrated with along the way. The only determining factors are the skill sets and creativity of the people involved in building the integration.

When a company is small this can work, but as companies scale and the number of integrations grows, it becomes unmanageable. Data flows become convoluted, raising security questions and making business-critical automation fragile. Just like with monolithic architecture, it can become too terrifying and complex to change anything, paralyzing the business systems and preventing them from adapting and scaling to support the company.

The solution, as validated by the existence of numerous Integration Platform as a Service (iPaaS) solutions like Mulesoft, Dell Boomi, and Zapier, is yet another piece of software that’s responsible for integrating business systems. The consistency provided by using one application for all integration can solve the issues of visibility, fragility, reliability, and scalability.

Mulesoft

At Shopify, we ran into this problem, created a small team of business system integration developers and put them to work building on Mulesoft. This was an improvement, but, because Shopify is a software company, it wasn’t perfect.

Isolation from Shopify Development

Shopify employs thousands of developers. We have infrastructure, training, and security teams. We maintain a multitude of packages and have tons of Slack channels for getting help, discussing ideas, and learning about best practices. Shopify is intentionally narrow in the technologies it uses (Ruby, React, and Go) to benefit from this scale.

Mulesoft is a proprietary platform leveraging XML configuration for the Java virtual machine. This isn’t part of Shopify’s tech stack, so we missed out on many of the advantages of developing at Shopify.

Issues with Integrating Internal Applications

Mulesoft’s cloud runtime takes care of infrastructure for its users, a huge advantage of using the platform. However, Shopify has a number of internal services, like shipment tracking, as well as infrastructure, like Kafka, that for security reasons can only be used from within Shopify’s cloud. This meant that we would need to build infrastructure skills on our team to host Mulesoft on our own cloud.

Although using Mulesoft initially seemed to lower the costs of connecting business systems, due to our unique situation, it had more drawbacks than developing on Shopify’s tech stack.

Unless performance is paramount, in which case we use Go, Ruby is Shopify’s choice for backend development. Generally Shopify uses the Rails framework, so if we’re going to start building business system integrations on Shopify’s tech stack, Ruby on Rails is our choice. The logic for choosing Ruby on Rails within the context of development at Shopify is straightforward, but how do we use it for business system integration?

The Design Priorities

When the platform is complete, we want to build reliable integrations quickly. To turn that idea into a design, we need to look at the technical aspects of business system integration that differentiate it from the standard application development Rails is designed around.

Minimal

Generally, applications are structured around a domain and get to determine the requirements: the data they will and won’t accept. An integration, however, isn’t the source of truth for anything. Any validation we introduce in an integration will be, at best, a duplication of logic in the target application. At worst, our logic will raise spurious errors.

I did this the other day with a Sorbet Struct. I was using it to organize data before posting it. Unfortunately a field was required in the struct that wasn’t required in the target system. This resulted in records failing in transit when the target system would have accepted them.
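A minimal sketch of that failure mode, using a plain Ruby Struct as a stand-in for the Sorbet struct. The field names and both helper methods are hypothetical and only illustrate the idea; they are not taken from the actual integration.

```ruby
# Hypothetical record from a source system; "phone" is absent.
record = { "name" => "Ada Lovelace", "email" => "ada@example.com" }

# Over-strict: the integration insists on a field that the target
# system (by assumption) treats as optional.
StrictContact = Struct.new(:name, :email, :phone, keyword_init: true) do
  def initialize(**attrs)
    super(**attrs)
    raise ArgumentError, "phone is required" if phone.nil?
  end
end

def strict_payload(record)
  StrictContact.new(
    name: record["name"],
    email: record["email"],
    phone: record["phone"]
  ).to_h
end

# Minimal: reshape the data and let the target system decide what is valid.
def minimal_payload(record)
  {
    "full_name" => record["name"],
    "email" => record["email"],
    "phone" => record["phone"]
  }.compact
end

begin
  strict_payload(record)
rescue ArgumentError => e
  puts "Record dropped in transit: #{e.message}"
end

puts minimal_payload(record).inspect
```

The strict version fails inside the integration, even though the target would have accepted the record. The minimal version forwards what it has and leaves validation to the system that owns the rules.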

Transparent

Many business systems are highly configurable. Changes in their configuration can lead to changes in their APIs, affecting integrations.

Airtable, for example, uses the column names as the JSON keys in their API, so changing a column name in the user interface can break an integration. We need to provide visibility into exactly what integrations are doing to help system admins avoid creating errors and quickly resolve them when they arise.

Flexible

Business systems are diverse, created at different times by different developers using different technologies and design patterns. For integration work, the most important consequence is a wide variety of interfaces and formats, like FTP, REST, SOAP, JSON, XML, and GraphQL. If we want a centralized, standardized place to build integrations, it needs to support whatever requirements are thrown at it.

Secure

Integrations deal with sensitive information: personally identifiable information (PII), compensation, and anything else that needs to move between business systems. We need to make sure that we aren’t exposing this data.

Reusable

Small, point-to-point integrations are the most reliable and maintainable. However, this design has the potential to create a lot of duplicate code and infrastructure. If we want to build integrations quickly, we need to reuse as much as possible.

Implementation

Those are some nice high-level design priorities. How did we implement them?

Documentation

From the beginning of the project, documentation has been a priority. We document:

  • decisions that we’re making, so they can be understood and challenged in the future as needs change
  • the integrations living on our platform
  • the clients we’ve implemented for connecting to different systems and how to use them
  • how to build on the platform as a whole.

Initially we were using GitHub’s built-in wiki, but being able to version control our documentation and commit updates alongside the code made it easier to trace changes and ensure documentation was being kept up to date. Fortunately Shopify’s infrastructure makes it very easy to add a static site to a git repository.

Design priorities covered: transparent, reusable

Language Features

Ruby is a mature, feature-rich language. Beyond being Turing complete, over the years it’s added a plethora of features to make programming simpler and more concise. It also has an extensive package ecosystem thanks to Ruby’s wide usage, long life, and generous community. In addition to reusing our own code, we’re able to leverage other developers’ and organizations’ code. Many business systems have great, well-maintained gems, so integrating with them is as simple as adding the gem and credentials.

Design priorities covered: reusable

Rails Engines

We reused Shopify Core’s architecture, designing our application as a modular monolith made up of Rails Engines. Initially the application didn’t take advantage of Rails Engines and simply used namespaces within the app directory. It quickly became apparent that this model made tracking down an individual integration’s code difficult. You had to go through every one of the app directories (controllers, helpers, and more) to see if an integration’s namespace was present.

After a lot of research and a few conversations with my Shopify engineering mentor, I began to understand Rails Engines. Rails Engines are a great fit for our platform because integrations have relatively obvious boundaries, so it’s easy and advantageous to modularize them.

This design enabled us to reuse the same infrastructure for all our integrations. It also enabled us to share code across integrations by creating a common Rails Engine, without the overhead of packaging it up into rubygems or duplicating it. This reduces both development and maintenance costs.

In addition, this architecture benefitted transparency by keeping all of the code in one place and modularizing it. It’s easy to know what integrations exist and what code belongs to them.

Design priorities covered: reusable, transparent

Eliminating Data Storage

Our business system integration platform won’t be the source of truth for any business data. The business data comes from other business systems and passes through our application.

If we start storing data in our application it can become stale, out of sync with the source of truth. We could end up sending stale data to other systems and triggering the wrong processes. Tracking this all down requires digging through databases, logs, and timestamps in multiple systems, some without good visibility.

Data storage adds complexity, hurts transparency, and introduces security and compliance concerns.

Design priorities covered: transparent, minimal, secure

Actions

Business system integration consists almost entirely of business logic. In Rails, there are multiple places this could live, but they generally involve abstractions designed around building standalone applications, not integrations. Using one of these abstractions would add complexity and obfuscate the logic.

Actions were floating around Shopify as a potential home for business logic. They have the same structure as Active Jobs, one public method, perform, and don’t reference any other Actions. The Action concept provides consistency, making all integration logic easy to find. It also provides transparency by putting all business logic in one place, so it’s only necessary to look at one Action to understand a data flow.

One of the side effects of Actions is code duplication. This was a trade-off we accepted. Given that integrations should act independently, we would rather duplicate some code than tightly couple integrations.
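As a rough illustration of the shape described above, here is a plain Ruby sketch of an Action with a single public method, perform, and no references to other Actions. The class names, client interfaces, and field names are all hypothetical; the real Actions at Shopify may look quite different.

```ruby
# Hypothetical Action: the whole data flow is readable in one place.
class SyncNewHireToHelpdesk
  def initialize(source:, target:)
    @source = source
    @target = target
  end

  # The only public method. It reads from one system, reshapes the
  # data, and writes to another, without calling any other Action.
  def perform(employee_id)
    employee = @source.fetch_employee(employee_id)

    @target.create_user(
      "email" => employee["work_email"],
      "name" => "#{employee["first_name"]} #{employee["last_name"]}"
    )
  end
end

# Stand-in clients so the sketch runs without any network access.
class FakeHrClient
  def fetch_employee(id)
    {
      "id" => id,
      "first_name" => "Ada",
      "last_name" => "Lovelace",
      "work_email" => "ada@example.com"
    }
  end
end

class FakeHelpdeskClient
  attr_reader :created

  def initialize
    @created = []
  end

  def create_user(payload)
    @created << payload
    payload
  end
end

helpdesk = FakeHelpdeskClient.new
SyncNewHireToHelpdesk.new(source: FakeHrClient.new, target: helpdesk).perform(42)
puts helpdesk.created.inspect
```

A second integration that needs a similar mapping would repeat it in its own Action rather than call this one, which is the duplication trade-off described above.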

Design priorities covered: transparent, minimal

Embracing Hashes

Dataflows are the purpose of our application. In every integration we are dealing with at least two API abstractions of complex systems. Introducing our own abstractions on top of these abstractions can quickly compound complexity. If we want the application to be transparent, it needs to be obvious what data is flowing through it and how the data is being modified.

Most of the data we’re working with is JSON. In Ruby, a parsed JSON object is represented as a hash, so working with hashes directly often provides the best transparency with the least room for introducing errors.

I know, I know. We all hate to see strings in our code, but hear me out. You receive a JSON payload. You need to transform it and send out another JSON payload with different keys. You could map the original payload to an object, map that object to another object, and map the final object back to JSON. If you want to track that transformation, though, you need to track it through three transformations. On the other hand, you could use a hash and a transform function and have the mapping clearly displayed.

Using hashes leads to more transparency than abstracting them away, but it can also lead to typos and therefore errors, so it’s important to be careful. If you’re using a string key multiple times, turn it into a constant.
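The hash-and-transform approach might look something like the sketch below. The key names and the FIELD_MAP constant are made up for illustration; the point is that the whole mapping is visible in one place.

```ruby
require "json"

# Source key => target key. Reading this hash tells you exactly
# which fields flow through and what they are renamed to.
FIELD_MAP = {
  "Full Name" => "name",
  "Email" => "email",
  "Start Date" => "start_date"
}.freeze

# Copy only the mapped keys that are present in the source payload.
def transform(source)
  FIELD_MAP.each_with_object({}) do |(from, to), target|
    target[to] = source[from] if source.key?(from)
  end
end

incoming = JSON.parse(
  '{"Full Name": "Ada Lovelace", "Email": "ada@example.com", "Team": "Analytics"}'
)

puts JSON.generate(transform(incoming))
# prints {"name":"Ada Lovelace","email":"ada@example.com"}
```

Keys that aren’t in the map (like "Team" here) are dropped, and keys missing from the source are simply not sent, so the integration adds no validation of its own.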

Design priorities covered: transparent, minimal

Low-level Mocking

At Shopify, we generally use Mocha for mocking, but for our use case we default to WebMock. WebMock mocks at the request level, so you see the URL (including query parameters), headers, and request body explicitly in tests. This makes it easy to work directly with business systems’ API documentation, because this is the level the APIs are documented at, and it allows us to understand exactly what our integrations are doing.

There are some cases, though, where we use Mocha, for example with SOAP. Reading a giant XML text string doesn’t provide useful visibility into what data is being sent. WebMock tests also become complex when many requests are involved in the integration. We’re working on improving the testing experience for complex integrations with common factories and prebuilt WebMocks.

Design priorities covered: transparent

Shopify

Perhaps most importantly, we’ve been able to tap into development at Shopify by leveraging our:

  • infrastructure, so all we have to do to stand up an application or add a component is run dev runtime
  • training team to help onboard our developers
  • developer pipeline for hiring
  • observability through established logging, metrics and tracing setups
  • internal shipment tracking service
  • security team standards and best practices

The list could go on forever.

Design priorities covered: reusable, secure

It’s been a year since work on our Rails integration platform began. Now, we have 18 integrations running, have migrated all our Mulesoft apps to the new platform, have doubled the number of developers from one to two, and have other teams building integrations on the platform. The current setup enables us to build simple integrations, the majority of our use cases, quickly and securely with minimal maintenance. We’re continuing to work on ways to minimize and simplify the development process, while supporting increased complexity, without harming transparency. We’re currently focused on improving test mock management and the onboarding process and, of course, building new integrations.

Will is a Senior Developer on the Solutions Engineering Team. He likes building systems that free people to focus on creative, iterative, connective work by taking advantage of computers’ scalability and consistency.


Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Intrigued? Visit our Engineering career page to find out about our open positions and learn about Digital by Design.


