Recap

In the previous blog, we discussed how we would stabilize multiple user connections and handle large incoming traffic on our tool. So, following up on that, let's discuss it further in this blog.

You can check out the previous blog here.

Working on the Problem

Okay, so as of now we have a working solution, which sends the payload data to the respective socket id (unique for every tab instance). But we have to introduce a new feature called “Pending Works”, where we store the user's pending works. Within it, one user can have multiple tabs open (editing multiple videos at once) and work across them, so we would want them to see live progress of the other videos they are editing apart from the work they are doing in the current tab.

For this, we would need to store the user's active socket connections, send the payload data to all of them, and segregate the data based on the video id.

So yeah, as discussed previously, we went for the same architecture, and implemented it in the following way:

Server

// key : userId
// value : array of socketIds
let activeSockets = {};

// socket events
socket.on("join", (userId) => {
  socket.userId = userId; // remember which user this socket belongs to
  if (!activeSockets[userId]) activeSockets[userId] = [];
  activeSockets[userId].push(socket.id);
});

// disconnect doesn't receive the userId, so we read it back off the socket
socket.on("disconnect", () => {
  const userId = socket.userId;
  if (!activeSockets[userId]) return;
  activeSockets[userId] = activeSockets[userId].filter((id) => id !== socket.id);
});

Note: The above is just pseudocode of the implementation.
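With that bookkeeping in place, broadcasting becomes a simple loop over the user's socket ids. Here's a minimal sketch, assuming io is the Socket.IO server instance (sendProgress is just a hypothetical helper name, not our actual code):

// fan a progress update out to every active tab of a user
// (hypothetical helper; io is the Socket.IO server instance)
const sendProgress = (userId, videoId, progress) => {
  (activeSockets[userId] || []).forEach((socketId) => {
    io.to(socketId).emit("progress", { videoId, progress });
  });
};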

We chose to store the above data in in-memory storage, as we don't need to persist it, and it is a lot faster to access from memory than to query the database for the same. So we introduced a Redis service on our tool here.
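For reference, here's a rough sketch of the same bookkeeping backed by Redis sets instead of a plain object, assuming an ioredis client (the sockets:<userId> key name is just an assumption for illustration):

const Redis = require("ioredis");
const redis = new Redis(); // assumes Redis running on localhost:6379

socket.on("join", async (userId) => {
  socket.userId = userId;
  // add this tab's socket id to the user's set
  await redis.sadd(`sockets:${userId}`, socket.id);
});

socket.on("disconnect", async () => {
  // remove this tab's socket id from the user's set
  await redis.srem(`sockets:${socket.userId}`, socket.id);
});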

To be fair, we didn’t introduce it for just this sole purpose. We also had to work on a faster queue system for handling large traffic, and so we went for BullMQ (more on this later), which uses a Redis server.

Client

const { currentVideoId } = getCurrentVideoDetails();
// socket events
socket.on("progress", (data) => {
  // update progress bar
  const { videoId, progress } = data;

  if (videoId === currentVideoId) {
    // update main page progress bar
  }
  // update pending works progress bar
});

Ah, getting the videoId on the client was a pain in itself. Currently, we were getting the video id as a response along with the payload data, but we needed the video id beforehand, so that we could segregate the data based on it and update the progress bar(s) accordingly.

So we made a few more changes, like registering a video on the server as soon as we upload it from the client, getting back the videoId and other details in the API response. From then on, any further action taken on the video sends only the videoId to the server.
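As a rough sketch of what that flow could look like on the client (the endpoint names here are assumptions for illustration, not our actual routes):

// hypothetical endpoints, for illustration only
const res = await fetch("/api/video/register", {
  method: "POST",
  body: uploadFormData, // the video file being uploaded
});
const { videoId } = await res.json();

// every later action only carries the id
await fetch("/api/video/process", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ videoId }),
});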

PS: The client-side work for this was done by Punith, so you can learn more about it on his blog.

Tremendous Traffic

Apart from working on that feature, we thought of stabilizing our tool against large traffic, and so we went for a fast and robust queue system, which would help us handle heavy load and also scale the tool in the future.

The Process

Well, this was also not an easy task to just start with. We had to do a lot of research. We looked into Celery, Kue, Bull, and many more; some were either deprecated or no longer maintained. We then came across BullMQ, which is a Redis-based queue system and is actively maintained.

The Implementation

Implementing BullMQ required a lot of research work in itself. We had to learn about Redis and how it works, and then how we could integrate it with BullMQ and NodeJS.

Current Scenario

Up to this point, we were just spawning a child process worker for every video we edit or process. So in a case of 1000 users editing 1000 videos at once, we would have 1000 child processes running on our server, which would put a lot of load on it and also be a lot slower.

Here’s a snippet of the current implementation:

// processVideo.js
const { Worker } = require("worker_threads");

const worker = new Worker("./worker.js");

// worker.js
const processVideo = async (videoId) => {
  // process the video (this is where the child process gets spawned)
};

Note: processVideo.js runs when we call the /process route (i.e. on clicking the process video button).

The Solution

Since we don’t want to spawn a large number of child processes, we set the number of workers to be spawned to 10 (this can be changed later on). This limits the number of workers, so basically we are allowing only 10 videos to be processed at once.

What happens when more than 10 videos are processed at once?

This is where BullMQ comes in handy. We push the videoId to the queue, and a worker picks it up from the queue and processes it.

We don’t need to worry about how this works internally, as BullMQ handles it for us. When all the workers are busy, the job simply waits on the queue, and once a worker is free, it picks up the next videoId from the queue and processes it.

Here’s a snippet of the implementation:

// index.js
const { Queue, Worker } = require("bullmq");

const connection = {
  host: "<redis host>",
  port: "<redis port>",
};

const queue = new Queue("video-processing", { connection });

let availableWorkers = [];
for (let i = 0; i < 10; i++) {
  const worker = new Worker(
    "video-processing",
    async (job) => {
      // process the video for job.data.videoId
    },
    { connection }
  );
  availableWorkers.push(worker);
}

// processVideo.js
await queue.add("process-video", { videoId });

As seen above, we add a job with the videoId to the queue, and a worker picks it up and processes it. When a worker is done processing a video, it checks the queue again for pending jobs, and if there are any, it picks up the next one and processes it. This repeats until the queue is empty.
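Tying the two halves together: BullMQ's QueueEvents lets us listen for job progress and forward it to the user's sockets. Here's a sketch under the assumption that we store both userId and videoId in the job data, reusing the hypothetical sendProgress helper from earlier:

const { QueueEvents, Job } = require("bullmq");

const queueEvents = new QueueEvents("video-processing", { connection });

// inside the worker's processor we'd call: await job.updateProgress(percent);

queueEvents.on("progress", async ({ jobId, data }) => {
  const job = await Job.fromId(queue, jobId);
  const { userId, videoId } = job.data; // assumes both were added with the job
  sendProgress(userId, videoId, data); // hypothetical helper from earlier
});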

Until Next Time

With this, we come to the end of this blog. I hope you enjoyed reading it and learned something new from it. I'll be back with another blog super soon.

Next on the list

Since only about two weeks are left until the final evaluation, it's time to wrap up the project and make it production ready (since we've been following the Agile methodology). So we will be making some final changes.

Note: We know that some of the error fixes/new functionalities are not yet reflected on prod, so here's the beta link for the tool: https://beta-videocuttool.wmcloud.org/

Stay tuned for the last and final blog of this series.