Application extensibility in the Google Cloud ecosystem
This article examines four tools that Senior Engineer Łukasz Szymik uses for application extensibility in the Google Cloud ecosystem.
We software developers know the importance of building extensible and composable software. In addition, we understand that each customer is different, with unique requirements and workflows to support in software. You have likely built several extension points within your software to accommodate these varied needs, allowing customers to develop and plug in custom functionality quickly.
Extension points come in many forms, from simple hooks that allow customers to adjust existing behaviour to full-blown APIs that open up your software’s internal workings, allowing for more complex integrations. As a result, there’s a need to carefully design these extension points so they are as flexible and intuitive as possible while maintaining high security and stability.
In addition to designing and implementing extension points, you will work closely with clients to help them develop custom plugins, connectors, and other add-ons.
Extensions may be helpful to clients in several circumstances. Common use cases include:
- Connecting a commerce platform to several payment processing systems.
- Swapping the promotion engine in a composable commerce system for a custom one.
- Customising stock-level management based on past customer satisfaction data and intelligent supplier selection algorithms.
- Providing multi-channel notifications, e.g. email, text messages, WhatsApp.
Extension points can serve different purposes and have different characteristics.
|Extension point|Description|Example|
|---|---|---|
|Inbound webhook|Lightweight HTTP pattern for inserting data into your system.|Send a message to an MS Teams channel using an HTTP endpoint.|
|Outbound webhook|Lightweight HTTP pattern for sending data from your system to an external one.|Send stock consumption information to an external fulfilment system.|
|Inbound events|Integration of Pub/Sub messages to deliver events to your system.|The system reacts to an external event that triggers processing.|
|Outbound events|Integration of Pub/Sub messages to deliver events from your system.|A third-party system subscribes to events exposed by your system.|
|Software plugins|Eclipse plugins (based on OSGi) add new functionality to the application.|Add a new menu item and related implementation to an image processing application.|
|Micro frontend component|Web page component that extends a web page's functionality.|Add a new metric view component to a monitoring dashboard.|
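To make the table concrete, here is a minimal sketch of the outbound webhook pattern in Python: just an HTTP POST with a JSON body. The endpoint URL and payload fields below are hypothetical.

```python
import json
import urllib.request

def build_webhook_request(url: str, event: dict) -> urllib.request.Request:
    """Build an HTTP POST request carrying the event as a JSON body."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example: notify an external fulfilment system about stock consumption.
req = build_webhook_request(
    "https://fulfilment.example.com/hooks/stock",  # hypothetical endpoint
    {"sku": "ABC-123", "consumed": 5},
)
# urllib.request.urlopen(req)  # uncomment to actually deliver the webhook
```

The receiving side of an inbound webhook is the mirror image: an HTTP endpoint that parses the same kind of JSON body.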
I’ve developed many extension points for established businesses and startups during my career. Both have similar requirements, such as the quick deployment of custom code. Additionally, fast validation of the extension’s concept is needed.
The ability to deliver a solution or extension quickly can determine a startup's destiny, so there is no room for lengthy discussions. On the other hand, a solution must have enough structure to be as close to the final product as possible. The luxury of throwing everything out and starting over is not always attainable.
In this article, I describe some Google Cloud Platform components that are very useful when working on extension points. I have used them successfully in several projects without a hassle, and they have all found a place in my standard toolbelt.
Commonly implemented patterns
A typical use case is as follows: We have a system that implements the core business logic, which is deployed into the Kubernetes cluster and maintained by the company’s core team(s). The system offers extension points based on eventing, hooks, or simple access to the exposed public API.
The exact technique used for integration has little impact on our use case. Note that, in this article, we assume that we have an extension point ready for our integration, and our primary focus is on running the extension functionality.
We need to develop and run the extension point (usually a micro-service) that will connect to the core system and provide additional functionality tailored to a customer's needs. We need to ensure a suitable level of visibility into the execution, with metrics to ensure operability. Because such extensions are often co-developed with the customer, we want to avoid deploying them into the Kubernetes cluster where the core system runs; we want a separation barrier.
Usually, the architecture I am extending consists of a core system running inside a Kubernetes cluster, with the extension hosted outside it.
Although we can separate the core system and the extension in many ways, such as dedicated namespaces, I always try to externalise my deployment to avoid slowdowns caused by the need to discuss it with the core team, organise access for myself, and sometimes simply by the core team's concerns.
Hosting your extension
My first choice for running extensions is Google Cloud Run. Cloud Run is a serverless platform that runs containerized applications developed in any language.
If you already have a Docker image with your application, then the deployment is just a single command:
gcloud run deploy extension-app --image=eu.gcr.io/prj-123/extension-app:latest
The command will take a Docker image from Google Container Registry and run it in the Google Cloud Run environment. It is trivial: no hassle with configuration or deployment scripts, nothing but pure hacking.
There is one requirement: the service must listen on the HTTP port defined by the `PORT` environment variable, typically `8080`.
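A minimal service satisfying this contract can be sketched with the Python standard library; the handler body and response text are placeholders:

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

def get_port(default: int = 8080) -> int:
    """Cloud Run tells the container which port to listen on via PORT."""
    return int(os.environ.get("PORT", default))

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Placeholder for the extension's actual business logic.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"extension-app: ok\n")

def serve() -> None:
    HTTPServer(("", get_port()), Handler).serve_forever()

# serve()  # called from the container entrypoint; commented out here
```

In practice, you would use whatever web framework your service is already built on; the only contract is honouring `PORT`.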
The API is exposed through the service ingress, e.g., https://extension-app-gfluduqybb-ew.a.run.app. The service can be exposed publicly or made accessible only internally, e.g. from an internal load balancer or a VPC. Unauthenticated invocation is possible, or we can use Cloud IAM for authentication.
The deployment and exposure of the service can be handled in a matter of minutes. We can completely concentrate on building the business logic now that the complexity has been removed.
Cloud Run is serverless and is based on the Knative API. When our container receives HTTP traffic, it starts running, and we pay for the CPU usage. This implies a cold start, so the faster the container starts, the better. However, I have several installations where my Cloud Run applications run 24/7, and the costs are still manageable.
Besides plain HTTP traffic, Cloud Run can be triggered through Google Pub/Sub, which opens up possibilities for new integrations. A good example is data import: a new file is uploaded to a Cloud Storage bucket, which causes an event to be sent to Cloud Pub/Sub; a dedicated Cloud Run instance is woken up, and the file is processed.
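In a push integration, Pub/Sub delivers messages to the Cloud Run service as HTTP POSTs, with the payload base64-encoded inside a JSON envelope under `message.data`. A minimal decoder sketch, using a made-up Cloud Storage notification as the example payload:

```python
import base64
import json

def decode_push_envelope(body: bytes) -> dict:
    """Decode a Pub/Sub push delivery: the payload sits base64-encoded
    under envelope["message"]["data"]."""
    envelope = json.loads(body)
    data = base64.b64decode(envelope["message"]["data"])
    return json.loads(data)

# Simulated Cloud Storage notification (bucket and file names are made up).
notification = {"bucket": "imports", "name": "products-2024.csv"}
envelope = json.dumps({
    "message": {
        "data": base64.b64encode(json.dumps(notification).encode()).decode()
    }
}).encode()

event = decode_push_envelope(envelope)
# event["name"] now identifies the uploaded file the service should process
```

This is the request body the service's HTTP handler would receive from a Pub/Sub push subscription.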
An extension point utilising Google Cloud Run has immediate benefits for us:
- Scalability (we can set the minimum and maximum number of instances).
- CPU can be allocated always or only while handling requests.
- Monitoring and logging are in place; logs are browsable using Logs Explorer.
- Traffic can be secured with authorisation.
We must also deal with limitations when running software on Google Cloud Run; for some use cases, it is not suitable. I learned this while trying to develop a micro-service that required a client-server certificate exchange: unfortunately, there is no way to forward a client certificate to a service running on Cloud Run.
Logging and alerting
We all know how valuable logs can be. They are irreplaceable when we need to find anomalies in the execution of our program, and sometimes we must also provide detailed records to prove that the program works as expected. A common tendency is to build a custom solution for logging and monitoring; a team typically argues that cloud operations usage will be costly and that building and hosting its own solution is better. I understand that approach, but I would recommend first calculating the total costs.
Several times in my career, I've been dissatisfied with solutions that lost logs, forced users to access them via strange hacks like port forwarding, or failed entirely. If you are working on something new, like a fresh project or a small startup, use what you have available before considering optimisations and improvements; there will be time for them later. Maintaining your own logging pipeline is not free, even if all of its components are open-source.
There’s a similar story in the alerting space. Nowadays, almost all systems are under close monitoring 24/7. I always integrate the alerting capabilities as soon as possible while developing the first proof of concept.
To move quickly, I plug in Google Error Reporting near the beginning. Error Reporting analyses all logs and can react to them in real time. In a project's early phase, Slack notifications are sufficient. The tool processes stack traces from popular languages.
And if more is needed, an API is available; I have used it in one of my programs to compare capabilities, and both approaches work the same. The error grouping feature tames the initial noise after deployment, and the ability to quickly detect a new error and navigate to it and its logs speeds up development.
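As a sketch, reporting can be wired in via the `google-cloud-error-reporting` client library; the client call below is commented out because it only works with Google Cloud credentials available, while the stack-trace formatting it relies on is plain standard library:

```python
import traceback

def format_current_exception() -> str:
    """Return the current exception's stack trace as text; Error
    Reporting groups errors by stack traces like this one."""
    return traceback.format_exc()

def report_current_exception() -> None:
    """Send the current exception to Google Error Reporting
    (requires google-cloud-error-reporting and GCP credentials)."""
    from google.cloud import error_reporting  # lazy import, third-party
    error_reporting.Client().report_exception()

try:
    1 / 0  # stand-in for real business logic that fails
except ZeroDivisionError:
    trace = format_current_exception()
    # report_current_exception()  # uncomment when running on Google Cloud
```

With that in place, each new error type surfaces as a distinct group in the Error Reporting console, from which a Slack notification channel can be configured.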
Recently, I started using an additional component in a slightly different context. A recurring requirement for my extension points is to ensure insights into business process execution. Business units must have detailed proof that execution was successful, often with additional information like the number of processed resources. That allows them to detect issues faster.
An example is importing product descriptions into a commerce system. To provide business execution logs quickly, I use Google BigQuery. BigQuery is a robust data warehouse, and I use only a small subset of its capabilities: each execution of the business process records a new row in a BigQuery table.
The table structure is easy to tailor to our needs; it usually contains a timestamp, a resource id, and error information if an issue arises. The integration within the Google Cloud Platform is trivial, which makes this low-hanging fruit in many projects. Results are easy to export to a spreadsheet and share with stakeholders.
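A minimal sketch of such business logging, assuming a table with `timestamp`, `resource_id`, and `error` columns (the table id is hypothetical); the BigQuery call is kept behind a function so the row-building logic runs anywhere:

```python
import datetime
from typing import Optional

def build_execution_row(resource_id: str, error: Optional[str] = None) -> dict:
    """One business-log row: when it ran, for which resource, and
    error details if an issue arose."""
    return {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "resource_id": resource_id,
        "error": error,
    }

def log_execution(table_id: str, row: dict) -> None:
    """Stream the row into BigQuery (requires google-cloud-bigquery
    and GCP credentials)."""
    from google.cloud import bigquery  # lazy import, third-party
    errors = bigquery.Client().insert_rows_json(table_id, [row])
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")

row = build_execution_row("product-42")
# log_execution("prj-123.extension_logs.executions", row)  # hypothetical table id
```

Streaming one small row per processed resource keeps the overhead negligible while giving the business unit a queryable audit trail.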
Moreover, the foundation provided by BigQuery analytics can evolve with the system and become a potent tool supporting the whole organisation. BigQuery uses SQL to manipulate data, and basic SQL knowledge is enough to browse the data and extract the information needed.
I typically prepare an initial set of SQL queries showcasing possible usage. The queries are stored within the Google Cloud project, and a customer can browse and execute them. Storage capacity does not limit us: the gathered information can stay with us for a long time and be used in the future.
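A starter query of the kind I hand over might look like this, assuming the timestamp/resource id/error schema described earlier; the table id is hypothetical, and the BigQuery call is again kept lazy:

```python
def failed_imports_query(table_id: str, day: str) -> str:
    """Build a query listing failed executions for one day, newest first."""
    return f"""
        SELECT timestamp, resource_id, error
        FROM `{table_id}`
        WHERE error IS NOT NULL
          AND DATE(timestamp) = '{day}'
        ORDER BY timestamp DESC
    """

def run_query(sql: str) -> list:
    """Execute the query in BigQuery (requires google-cloud-bigquery
    and GCP credentials)."""
    from google.cloud import bigquery  # lazy import, third-party
    return list(bigquery.Client().query(sql).result())

sql = failed_imports_query("prj-123.extension_logs.executions", "2024-05-01")
# rows = run_query(sql)  # uncomment when running with access to the project
```

Saved in the project, such a query lets a non-developer answer "what failed yesterday, and for which resources?" without touching any code.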
Even if we do not want to perform any analytics, keeping the business and application logs separated is a good practice. It allows more tailored access rights and keeps the information noise away from our customers.
We could replace Google BigQuery with any SQL database, but BigQuery's advantages are its easy access, the hidden complexity of the database details, and its integration with the rest of the Google Cloud Platform. We don't need to worry about connection strings or other technical details.
When working on application extensibility in the Google Cloud ecosystem, I frequently use these four components. They power my integrations, projects, and solutions. They help me with quick wins while prototyping and showcasing solutions. I also use other cloud components daily. In this article, I described only part of my toolbox for prototyping and getting things done quickly. Get familiar with them and apply them to your solutions where they fit.