Amazon SQS and 6 tips to help you architect your platform

The current COVID pandemic is not causing me any significant disruption; it’s business as usual. I’ve been working on a project involving Amazon SQS, so in this post I’m going to share some observations on it.

I have a process that generates data from a set of parameters and inserts it into an Amazon SQS queue. Multiple secondary processes then take objects from the queue and process them.

Here are 6 pointers that should help you design and architect your SQS solution, even if it is not similar to mine above.

An object in a queue is just a string

At their simplest, objects in a queue are nothing more than bits of text. JSON is nothing more than text, so anything you can serialise to JSON can go on a queue.
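As a minimal sketch of what that means in practice, here is a round trip from object to text and back using Python's standard json module. The job fields are made up for illustration; they are not from my project.

```python
import json

# Hypothetical job description; the field names are invented for this example.
job = {"job_id": 42, "action": "resize", "width": 800}

# Producer side: turn the object into text before it goes on the queue.
message_body = json.dumps(job)

# Consumer side: turn the text back into an object after it comes off the queue.
received = json.loads(message_body)
```

The string in message_body is what would be stored as the body of the SQS message.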

Use First In First Out (FIFO) queues for order-dependent processing

First In First Out queues ensure that entries added first are removed and processed first. If you need order-based handling, this is the fit for you.
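Sending to a FIFO queue needs a little more than sending to a standard one: the queue name ends in .fifo and every message carries a MessageGroupId, with ordering preserved inside each group. The helper below is a sketch that only builds the arguments; fifo_send_params is a name I have made up, and the queue URL in the usage note is a placeholder.

```python
def fifo_send_params(queue_url, body, group_id, dedup_id=None):
    """Build keyword arguments for an SQS send_message call on a FIFO queue."""
    if not queue_url.endswith(".fifo"):
        raise ValueError("FIFO queue names end in .fifo")
    params = {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # Messages that share a group ID are delivered in the order they were sent.
        "MessageGroupId": group_id,
    }
    if dedup_id is not None:
        # Only needed when content-based deduplication is switched off.
        params["MessageDeduplicationId"] = dedup_id
    return params
```

With boto3 this could be used as sqs.send_message(**fifo_send_params(url, body, "customer-1")), where sqs is boto3.client("sqs") and url points at your own queue.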

Be aware of availability zones (and multi-threaded operations)

AWS is a fault-tolerant platform, and availability zones are part of what makes it so. Some services let you pick a zone: EC2 has options to choose zones (useful for low-latency work), and S3 provides the ‘One Zone-IA’ storage class (useful for cost). SQS does not give you a zone option.

Because of this multi-zone approach, objects in your queue can be delivered more than once.

Say you request to dequeue an object, end up in zone A, and process that object. Straight after this, you send a request to dequeue a second object, which puts you in zone B. If the removal of the first object has not yet reached SQS in zone B, you’re likely to get the same item again.
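One way to live with repeat deliveries is to make the consumer idempotent. The sketch below remembers the MessageId of everything it has handled and skips repeats. It assumes messages shaped like the dictionaries boto3 returns from receive_message, and it keeps the IDs in an in-memory set; with several consumers you would need a shared store instead. handle_once is a hypothetical name.

```python
processed_ids = set()  # assumption: a single consumer; use a shared store otherwise


def handle_once(message, handler):
    """Run handler only the first time a given MessageId is seen."""
    message_id = message["MessageId"]
    if message_id in processed_ids:
        return False  # duplicate delivery, nothing to do
    handler(message["Body"])
    # Record the ID only after the handler succeeds, so a failure can be retried.
    processed_ids.add(message_id)
    return True
```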

Consider costs: use your 64 KB request size and batch

An SQS object has a maximum size. You can’t insert a 2 MB JPG into a queue for resizing, but you can insert a JSON document containing the path to the file on S3.

Amazon charge per 1 million API calls, where an API call is one request to SQS with 64 KB of data or less.

For example, imagine you’ve got ten objects to process and they’re 3 KB each. It doesn’t make sense to use 10 API calls when you can group them into one batch of 30 KB, which fits within a single billable request.

As a side note, the limit on message size is 256 KB, but you are still charged for every 64 KB: a 256 KB message counts as four requests. You’ll find more on the SQS Pricing Page.
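The arithmetic can be sketched in a few lines. The constants below are the figures quoted in this post (64 KB per billable request, 256 KB per message or batch, and ten entries per batch call); check the current SQS limits before relying on them. The function names are my own.

```python
import math

CHUNK_BYTES = 64 * 1024        # one billable request covers up to 64 KB
MAX_BATCH_BYTES = 256 * 1024   # size limit quoted in this post
MAX_BATCH_ENTRIES = 10         # entries allowed in one batch call


def billable_requests(payload_bytes):
    """Number of requests a payload of this size is billed as."""
    return max(1, math.ceil(payload_bytes / CHUNK_BYTES))


def make_batches(bodies):
    """Group message bodies into batches that respect both batch limits."""
    batches, current, current_bytes = [], [], 0
    for body in bodies:
        size = len(body.encode("utf-8"))
        full = len(current) == MAX_BATCH_ENTRIES or current_bytes + size > MAX_BATCH_BYTES
        if current and full:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(body)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

For ten 3 KB bodies, make_batches returns a single batch and billable_requests(30 * 1024) is 1, against 10 requests if each body is sent on its own. billable_requests(256 * 1024) is 4, matching the side note above.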

Visibility timeout and retention period

When an item gets claimed from an Amazon SQS queue, it’s just hidden, not deleted; this is the visibility timeout in play. The visibility timeout hides a message for a set period, which gives the consumer a chance to process it. After that, one of three things happens: the consumer deletes the message, the message reappears in the queue, or it reaches the retention limit and is dropped.

The visibility timeout helps prevent multiple processes from taking and processing the same item, and gives you a buffer in which to handle it. While the message is invisible, a second processor takes the next available item.

If a message reaches the end of its visibility timeout without being deleted, and it is younger than the retention period, it becomes available on the queue again.

If your infrastructure guarantees that a message is viewed only once, you can set the retention period to the same value as the visibility timeout. The message is then removed from the queue automatically, which makes calls to delete messages redundant.

Consider the problems that can arise with this approach, though: if your processes lose the ability to access the queue, you’re going to lose those messages.
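The usual receive, process, delete loop built on this behaviour might look like the sketch below. It assumes sqs is a boto3 SQS client (boto3.client("sqs")) and that queue_url and handler are supplied by you; process_batch is a hypothetical name and the 30 second timeout is only an example value.

```python
def process_batch(sqs, queue_url, handler, visibility_timeout=30):
    """Receive messages, process them, and delete the ones that succeed."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        VisibilityTimeout=visibility_timeout,  # seconds the messages stay hidden
        WaitTimeSeconds=10,                    # long polling
    )
    done = 0
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # No delete: the message reappears once the visibility timeout expires.
            continue
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        done += 1
    return done
```

Pick a visibility timeout comfortably longer than your slowest handler run; otherwise a message can reappear while it is still being processed.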

Deduplication of incoming messages

If you use a FIFO queue, it can perform content-based deduplication. Content-based deduplication ensures your queue only contains the same object once (it doesn’t prevent a single message from being delivered multiple times).

If it’s possible to have duplicates, this is an excellent way to remove them. However, there are conditions.

Imagine you have 2 processors that can generate the same output. They might be performing a series of API requests and returning a calculation result. Having two copies in the queue might distort a process further down the line, so you turn on content-based deduplication.

Content-based deduplication prevents an already-seen message from appearing again for 5 minutes; this is a hard limit and cannot be changed. SQS accepts the duplicate message, but it does not appear in the queue, and SQS does not tell you it was a duplicate.

If you need to send the same message more often than once every 5 minutes, handle duplicate messages yourself.
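To get a feel for the 5-minute window, here is a simplified in-memory model of it, not SQS itself. As I understand the SQS documentation, content-based deduplication works from a SHA-256 hash of the message body, so the model does the same. DedupWindow is a hypothetical class, and the model assumes the window runs from the first accepted copy.

```python
import hashlib

DEDUP_WINDOW_SECONDS = 5 * 60  # the fixed 5-minute interval


class DedupWindow:
    """Toy model of content-based deduplication on a FIFO queue."""

    def __init__(self):
        self._accepted_at = {}  # body digest -> time the body was last let through

    def accept(self, body, now):
        """Return True if the body would appear in the queue at time `now` (seconds)."""
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        first = self._accepted_at.get(digest)
        if first is not None and now - first < DEDUP_WINDOW_SECONDS:
            return False  # silently dropped, as described above
        self._accepted_at[digest] = now
        return True
```

If you want an identical body to get through inside the window, one option is to supply your own MessageDeduplicationId on each send instead of relying on the content hash.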

If you’re looking for some help with Amazon SQS, get in touch.