


A Close Call with Real-Time: How Rethinking Pub-Sub Saved the Day

Nikola Buhiniček

Backend Tech Lead at Productive. Pretty passionate about working on one product and taking part in decision-making. When I’m not coding, you’ll probably find me chilling with my friends and family.

January 17, 2024

Sometimes the easy way is not the most obvious way. And this was one of those times.

All of this started while I was working on our new feature – Automations 🤖. In a nutshell, Automations allow customers to set up actions that are triggered under specific conditions. Some of the currently implemented actions include sending Slack messages, creating and updating tasks, and posting comments on objects.

I wanted actions that modify tasks and comments to send a message over our real-time system, so that our frontend clients (browser, mobile, desktop app) could pick it up and show those changes as they occur.

Currently, our app’s real-time updates are tied to POST/PATCH/DELETE requests. We have a controller extension, Extensions::Broadcastable, that hooks into the save_form and destroy_resource methods and sends a real-time event if the action was successful. However, this approach wasn’t suitable for my automation actions, as they don’t go through controllers.
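
Conceptually, the extension works along these lines – a hedged sketch in which the success semantics of save_form, the resource helper, and Broadcaster’s signature are my assumptions, not our actual code:

```ruby
# Conceptual sketch only – hook into the shared controller actions
# and broadcast only when they succeed.
module Extensions
  module Broadcastable
    def save_form
      super.tap do |success|
        Broadcaster.publish(resource) if success
      end
    end

    def destroy_resource
      super.tap do |success|
        Broadcaster.publish(resource, destroyed: true) if success
      end
    end
  end
end
```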

As I was digging into this topic, I somehow expanded the scope of my task from making the automation actions “live” to revamping our whole broadcasting architecture. I wanted to move that logic out of controllers to a place where I could catch all the changes – which would also cover the automation actions.

Exploring Different Approaches

1. Callbacks

Yeah… although we all know callbacks are a hassle, I couldn’t NOT think about them, just for a moment…

2. Forms instead of Controllers

Both the controller actions and my automation actions use the same forms to handle our data. So why wouldn’t I hook into forms and send my real-time messages from there?

This approach was bugging me a bit, as we don’t use Form objects in all the places where we actually change our data. It wouldn’t make the whole app feel “live”, but it would cover more places than we do currently. That sounded promising to me.

I wanted to pitch my thoughts to the rest of the Core team, and that discussion turned on a lightbulb above my head.

3. Embracing Pub/Sub

That was exactly what I was looking for.

Publishing happens for every change – except, of course, in the places where we use methods that skip callbacks (update_column, update_all, …), since Pub/Sub publishing is itself hooked into callbacks. But that’s a story for another day.
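
Conceptually, the publishing side is a callback-driven concern along these lines – again a sketch, with the PubSub interface and the event naming assumed:

```ruby
module Publishable
  extend ActiveSupport::Concern

  included do
    # Changes made via update_column / update_all bypass callbacks,
    # so they never reach this hook and are never published.
    after_commit :publish_change, on: [:create, :update, :destroy]
  end

  private

  def publish_change
    PubSub.publish("#{self.class.name.underscore}.changed", self)
  end
end
```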

Making a PoC

As things go with every big change in our codebase, and generally in our product, I was putting this code behind a Feature Flag (FF).

Put simply, for organizations with the pubSubBroadcasting FF enabled, I would skip sending the real-time events from the controller actions and handle the published events instead. Without the FF, nothing changed.

I made a few Subscribers that listen for all task- and comment-related changes and handle them by sending a real-time event.
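
A subscriber then boils down to something like this (the event object and the subscribe call are assumed for illustration):

```ruby
module Broadcasting
  class TaskSubscriber
    def call(event)
      # Forward the published change to connected clients
      # as a real-time (socket) message.
      Broadcaster.publish(event.record)
    end
  end
end

PubSub.subscribe("task.changed", Broadcasting::TaskSubscriber.new)
```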

And of course, I added RSpec tests and set up some widgets in New Relic to track the difference in the number of events we were sending – we knew we would be sending more events now.

Basically, that was it. The next step was to slowly roll this FF out across organizations, check our New Relic metrics, and see if anything broke. Once the change reached our entire user base, we would cover the remaining objects and make subscribers for their Pub/Sub events too.

Showing it off

As I was pretty proud of this solution, I kind of talked about it a lot, and naturally it popped up in a 1-on-1 meeting with my Engineering Manager. He wanted to discuss it a bit more, so I told him the same thing I wrote here: “Isn’t that great? Our APP will be LIVE, all the data will be in sync.”

His response was “But do we really want that?”.

That wasn’t the response that I was looking for…

What I didn’t know was that when our frontend client gets socket messages, depending on the screen the user is on, it has to make additional API calls to fetch all the required data again. So a great deal of my real-time events would actually end up generating additional requests to our server – in a way, we would just be generating a lot more traffic (self-DDoSing?). Since I didn’t know about this, I wasn’t even paying attention to those metrics along the way.

One step forward, two steps back

We decided to gather better data so that we could make a better call in this situation.

I took a one-week time frame from our logs and compared the number of POST/PATCH/DELETE requests to the number of dispatched publish events. These two counts roughly correspond to the number of events we would send over the socket the current way versus the new way.

I wasn’t really aware of how badly this could end. For tasks and comments – the objects in my PoC – the difference wasn’t that big: around 25% more events, which was okay; I knew it would be more.

But look at our deals endpoint, for example – 55 times more events would be sent. That would add to the traffic we send over sockets and to our infrastructure bill for those services, and I don’t even want to imagine the number of API calls generated as a result of this – by ourselves…

“Deals have that much traffic because a lot of other objects update financial data on deals – that was understandable too, and it occurred to us immediately.”

Back to the drawing board

1. Pub/Sub shouldn’t be a bad call for this

As this is the part of our code that sees every change to our data, it should be a good call when you want to make a frontend client as live as it gets. The solution would be to not send all the changes over sockets, but to filter them into relevant and irrelevant. That way, we would surely see a drop in those insane company and deal numbers.

2. Send all the needed data to the frontend?

As mentioned before, each real-time message already contains the object that was changed. The issue is that there are a lot of screens in our client, and we can’t know all the contexts our users are in, nor what additional data should be sent – which is exactly why our client makes API calls when receiving some socket messages.

3. Why didn’t I just resolve the problem I had?

To make those actions “live” in our frontend clients (browser, mobile, desktop app), I wanted to plug them into our real-time system. So why didn’t I just add a few explicit calls to the code of my automation actions, calling the class that sends the real-time message? But no, I wanted to act smart and fix a problem that wasn’t really there in the first place – to make everything more live when no one was asking for it.

The Aftermath

So yeah, Pub/Sub usage in the real-time part of our app is on hold until we level things up.

I went with the third solution and added five lines of code – a call to the Broadcaster class is one line, and I had five events across four classes to send…
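
Roughly, each of those lines looks like this – the action class and form here are simplified assumptions; only the Broadcaster call is the point:

```ruby
module Automations
  module Actions
    class PostComment
      def call(attributes)
        # Assumed form-object shape: returns the comment on success.
        comment = CommentForm.new(attributes).save

        # One of the five explicit lines – send the real-time event.
        Broadcaster.publish(comment) if comment

        comment
      end
    end
  end
end
```
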
This was a nice learning opportunity for me and I would say that:

  • When you have concrete problems, stick to handling them and resolve them first
  • I had the data in New Relic the whole time; I should’ve prepared better
  • It’s not bad to be explicit in code – not everything should be an abstraction, a generalization, or metaprogramming… I hope to write a bit more on this point soon

The good thing is that we didn’t actually do any damage with this and we didn’t lose a lot of time during this “learning opportunity”.

Anyone faced similar problems? If yes, how did you deal with them? Feel free to reach out to me!


Custom Fields: Give Your Customers the Fields They Need

Nikola Buhiniček

Backend Tech Lead at Productive. Pretty passionate about working on one product and taking part in decision-making. When I’m not coding, you’ll probably find me chilling with my friends and family.

November 14, 2022

Here at Productive, we’re building an operating system for digital agencies.

But, because each agency is different (think type, size, services they offer, the way they’re set up as an organization…), they need customization options for their workflows. So it’s pretty hard to model all those needs and use cases through a unified data model.

If only there were a way to let them shape those models to their own needs.

Let’s say that one of our customers, the ACME digital agency, wants to keep track of their employees’ nicknames and be able to search employees by that field. They would also like to keep track of employees’ birthdays and be able to sort and group them by that date.

To me, as a developer, this sounds as simple as it gets—add two new columns to the people table, make those attributes editable over the API, and send them back in the response.
But should we do that? Should we add all kinds of fields to our models even if those fields are going to be used by only a handful of our customers?

Let me show you how we tackled this type of feature request and made a pretty generic system around it.

What Did Our Customers Want?

It was pretty clear to us what our customers wanted, and that was:

  • to be able to add additional fields to some of our models (People, Projects, Tasks, …)
  • to have various data types on those fields (text, number, or date)
  • to be able to search, sort, or even group by those fields

Our Approach

The Custom Field Model

As we’re building a RESTful API formatted according to the JSON:API specification and storing our data in a MySQL 8 relational database, a few things were pretty straightforward – we needed a new model, and we named it Custom Field (naming wasn’t an issue here 🥲).

The main attributes of that model should be:
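
Roughly: a name, the data type of its values (text, number, or date), and the model the field attaches to. In migration form, that might look like the sketch below (column names are illustrative, not our exact schema):

```ruby
class CreateCustomFields < ActiveRecord::Migration[7.0]
  def change
    create_table :custom_fields do |t|
      t.string  :name,              null: false # e.g. "Nickname"
      t.integer :data_type,         null: false # enum: text / number / date
      t.string  :customizable_type, null: false # "Person", "Project", "Task", …
      t.timestamps
    end
  end
end
```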

How To Store the Field Values?

OK, so now that we know how to define custom fields, how can we know which value someone assigned to a custom field for some object? And where to store that information?

Three possible solutions came to mind:

1. Add a limited number of custom_field columns to our models

We could add a few custom_field columns to our models, and that would work for some of our customers, but there would always be others who need a few extra fields. Adding numerous columns to our models surely isn’t the best solution – we can do better than this 😅


2. Add a join table

As mentioned before, since we rely on a relational database, a join table sounds like the go-to approach. It would be a simple join table between the custom field and a polymorphic target (yay, Rails 🥳). Other than those foreign keys, it would have a column to store the value.


3. Add a single JSON column to our models

This sounded as flexible as it gets. It would be a simple map where the key is the custom field ID and the value is whatever was assigned to that custom field.
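
For example, on a person record the column could hold something like this (field IDs made up):

```ruby
# Field 42 is "Nickname" (text); field 43 is "Birthday" (date).
person.custom_fields
# => { "42" => "Niko", "43" => "1990-05-01" }
```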

Why We Ended Up Choosing JSON

The first solution was just too limited so we discarded that one immediately and focused on the remaining two solutions.

On one hand, the better design would be to represent custom field values with a model; on the other hand, we wouldn’t actually do much with that data. It would just be data our users set on our objects – data that isn’t important to our business logic. So a simple JSON column didn’t sound bad either.

The searching and sorting aspect of this feature request was probably the most important one for us. It had to be as fast as possible, without becoming a burden on our performance.

That’s why we implemented both solutions, tested a lot of searching/sorting/grouping scenarios (we’ll go through that in more detail soon), and then checked the metrics.

The faster solution was the one with the JSON column, and that made sense to us. It doesn’t use JOIN clauses in SQL, since the values are written directly in the searched table and can be queried in the WHERE clause. Luckily for us, MySQL 8 supports a bunch of great functions for working with JSON columns (JSON_EXTRACT, JSON_UNQUOTE, JSON_CONTAINS, and others).

Great! Now that we know how to store the custom field values too, let’s dig into the coding.

From a development point of view, we did the following:

  • Added a new model, Custom Field, and implemented CRUD operations that can be called over the API
  • Wrote schema migrations that added a JSON column – custom_fields – to some of our models (people, projects, tasks, …)
  • Opened the custom_fields attribute so it can be edited over the API
  • Wrote a generic validation that checks that all the values in the custom_fields hash have the appropriate data type (sketched below)
  • Added the custom_fields attribute to the API response of the appropriate models
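
The generic validation deserves a closer look. A minimal sketch, assuming string-keyed values and ISO 8601 dates – the validator name and error messages are illustrative:

```ruby
class CustomFieldsValidator < ActiveModel::Validator
  def validate(record)
    (record.custom_fields || {}).each do |field_id, value|
      field = CustomField.find_by(id: field_id)

      if field.nil?
        record.errors.add(:custom_fields, "references unknown field #{field_id}")
      elsif !valid_value?(field, value)
        record.errors.add(:custom_fields, "has an invalid #{field.data_type} value for #{field.name}")
      end
    end
  end

  private

  def valid_value?(field, value)
    case field.data_type
    when "text"   then value.is_a?(String)
    when "number" then value.is_a?(Numeric)
    when "date"
      begin
        Date.iso8601(value.to_s)
        true
      rescue ArgumentError, TypeError
        false
      end
    end
  end
end
```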

That was most of the work we needed to do to be able to manage custom fields in our models.

But…what about the searching and sorting aspect of custom fields?

Searching Through Custom Field Values

We already had a generic solution written for searching over the API.

We have a format for search query params: filter[attribute][operation]=value. For searching through custom fields, we wanted to keep the same format, so we ended up with quite a similar one – filter[custom_fields][custom_field_id][operation]=value.

We had to add an if-else statement that handles custom field filtering differently from filtering by other attributes, as the format contains one additional argument—custom_field_id.

What’s different in the filtering logic is that we have to load the custom field being filtered by and check the data type of its values. That’s needed to cast the values into numbers or dates—text values need no casting.

So the query params and their SQL query counterparts, based on the custom field type, would look like this:
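
As a sketch – the eq operation only, with illustrative variable names and DECIMAL precision:

```ruby
# field is the loaded CustomField; raw_value comes from the query param.
json_path = %('$."#{field.id}"')
value     = ActiveRecord::Base.connection.quote(raw_value)

condition =
  case field.data_type
  when "text"
    "JSON_UNQUOTE(JSON_EXTRACT(custom_fields, #{json_path})) = #{value}"
  when "number"
    "CAST(JSON_EXTRACT(custom_fields, #{json_path}) AS DECIMAL(20, 6)) = #{value}"
  when "date"
    "CAST(JSON_UNQUOTE(JSON_EXTRACT(custom_fields, #{json_path})) AS DATE) = #{value}"
  end

scope.where(condition)
```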

Sorting by Custom Field Values

The concept of sorting by attributes is also something we had already tackled by abstracting the logic.

The only thing that changes when sorting by custom fields is that we first need to cast the values and then sort by them.

Once again, there’s a small change in the format for custom field sorters (sort=custom_fields[custom_field_id]) compared to sorting by a standard attribute (sort=attribute). We need to handle the custom_fields sorters separately because we have to load the desired custom_field and check its type.

Then the ORDER BY statement, based on custom field types, looks like this:
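
A sketch of the cast-then-sort idea, with the same illustrative names as above:

```ruby
json_path = %('$."#{field.id}"')

sortable =
  case field.data_type
  when "text"   then "JSON_UNQUOTE(JSON_EXTRACT(custom_fields, #{json_path}))"
  when "number" then "CAST(JSON_EXTRACT(custom_fields, #{json_path}) AS DECIMAL(20, 6))"
  when "date"   then "CAST(JSON_UNQUOTE(JSON_EXTRACT(custom_fields, #{json_path})) AS DATE)"
  end

scope.order(Arel.sql("#{sortable} #{direction}")) # direction: "ASC" / "DESC"
```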

Grouping by Custom Field Values

This was a fun one. The main point here is that you have to include the custom fields as columns in the SELECT statement so that you can later use those columns in the GROUP BY statement.

To get a custom field into the SELECT statement, you have to create a virtual column for it. All we needed to do was extract the values of the grouped custom field and give that virtual column an alias so that we could reference it in the GROUP BY statement. For the column alias, we went with the format custom_fields_{custom_field_id}.

For a custom field with id=x, this is done as follows:
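
In ActiveRecord terms, something like this (assuming we’re grouping tasks):

```ruby
scope = Task.select(
  "tasks.*",
  %{JSON_UNQUOTE(JSON_EXTRACT(custom_fields, '$."x"')) AS custom_fields_x}
)
```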

Once we have the virtual column defined, the grouping part is simple: add a GROUP BY statement with the earlier-mentioned alias.

So in the end, you get a SQL query like:
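
Roughly like this – assuming tasks and a simple per-group aggregate; what actually gets selected alongside the grouping column depends on the screen:

```ruby
sql = <<~SQL
  SELECT JSON_UNQUOTE(JSON_EXTRACT(custom_fields, '$."x"')) AS custom_fields_x,
         COUNT(*) AS tasks_count
  FROM tasks
  GROUP BY custom_fields_x
SQL

ActiveRecord::Base.connection.select_all(sql)
```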

What Our Customers Got

A simple way to define Custom Fields, and a place to assign values to those fields.

Summa Summarum

We made it possible for our customers to define custom fields in our data models. Also, we made it possible to search, sort and group by those fields.

It wasn’t long before we had even more requests building on our custom fields architecture. The fields we supported at first were okay, but now our customers wanted more field types. They wanted:

  • dropdown custom fields
  • relational custom fields – fields whose values are objects from one of our existing data models

But before we dig into that, let’s give these basics some time to sink in. I’ll be back soon with another blog post covering how we solved that new set of feature requests.
