- External reference: https://www.byronruth.com/grokking-nats-consumers-part-1/
- External reference: https://docs.nats.io/nats-concepts/jetstream/consumers
Consumers can be conceived as ‘views’ into a stream, with their own ‘cursor’. Consumers iterate or consume over all or a subset of the messages stored in the stream, according to their ‘subject filter’ and ‘replay policy’, and can be used by one or multiple client applications.
AckFloor is the highest acked message that is contiguous. So if a consumer starts at, say, seq 10, and you ack 10, 11 and then 12, the floor is 12. If you then ack 14 but skip 13 (or it has not come in yet), the floor remains at 12 and the system remembers that 14 has been acked. If 13 then gets acked, the floor moves to 14.
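The floor-advancement rule above can be sketched as a small simulation; this is plain Go illustrating the described behavior, not the NATS server's actual implementation:

```go
package main

import "fmt"

// ackFloor tracks the highest contiguously acked sequence,
// mirroring the AckFloor behavior described above.
type ackFloor struct {
	floor   uint64          // highest contiguous acked sequence
	pending map[uint64]bool // acked sequences above the floor
}

func newAckFloor(start uint64) *ackFloor {
	return &ackFloor{floor: start - 1, pending: map[uint64]bool{}}
}

func (a *ackFloor) ack(seq uint64) {
	a.pending[seq] = true
	// Advance the floor while the next sequence has been acked.
	for a.pending[a.floor+1] {
		delete(a.pending, a.floor+1)
		a.floor++
	}
}

func main() {
	f := newAckFloor(10)
	f.ack(10)
	f.ack(11)
	f.ack(12)
	fmt.Println(f.floor) // 12
	f.ack(14) // 13 skipped: floor stays at 12
	fmt.Println(f.floor) // 12
	f.ack(13) // gap filled: floor jumps to 14
	fmt.Println(f.floor) // 14
}
```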
AckPolicy: the policy choices include:
- AckExplicit: the default policy. Each individual message must be acknowledged. This mode is recommended, as it provides the most reliability and functionality.
- AckNone: you do not have to ack any messages; the server will assume ack on delivery.
- AckAll: if you receive a series of messages, you only have to ack the last one you received; all the previous messages are automatically acknowledged at the same time.
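The AckAll semantics can be illustrated in a few lines: acking one sequence implicitly acknowledges everything delivered at or before it. A plain-Go sketch, not server code:

```go
package main

import "fmt"

// unackedAfterAckAll returns which of the delivered sequences are
// still unacknowledged after an AckAll-style ack of lastAcked:
// everything at or below lastAcked counts as acked.
func unackedAfterAckAll(delivered []uint64, lastAcked uint64) []uint64 {
	var remaining []uint64
	for _, seq := range delivered {
		if seq > lastAcked {
			remaining = append(remaining, seq)
		}
	}
	return remaining
}

func main() {
	delivered := []uint64{1, 2, 3, 4, 5}
	// Acking only seq 3 under AckAll acks 1, 2 and 3 in one go.
	fmt.Println(unackedAfterAckAll(delivered, 3)) // [4 5]
}
```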
External reference: https://docs.nats.io/nats-concepts/jetstream/consumers#maxackpending
For push consumers, MaxAckPending is the only form of flow control.
However, for pull consumers, because the delivery of messages to the client application is client-driven (hence the 'pull') rather than server-initiated (hence the 'push'), there is an implicit one-to-one flow control with the subscribers (the maximum batch size of the Fetch calls). Therefore you should remember to set MaxAckPending to an appropriately high value (e.g. the default of 1000), as it can otherwise limit the horizontal scalability of processing the stream in high-throughput situations.
config.max_ack_pending indicates how many messages can be in flight at one time without an acknowledgement. If acks are required, having many messages in flight means that redelivery could result in out-of-order messages, even assuming a single subscriber. If this value is set to 1, it forces serial processing and acking of messages.
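The in-flight limit behaves like a semaphore: delivery blocks once MaxAckPending messages are unacknowledged, and each ack frees a slot. A plain-Go sketch of that idea (not the server implementation):

```go
package main

import "fmt"

// inFlightLimiter models MaxAckPending: deliveries acquire a slot,
// acks release one. With capacity 1 this forces serial processing.
type inFlightLimiter struct{ slots chan struct{} }

func newLimiter(maxAckPending int) *inFlightLimiter {
	return &inFlightLimiter{slots: make(chan struct{}, maxAckPending)}
}

// tryDeliver reports whether another message may be sent to the
// client right now without exceeding the pending limit.
func (l *inFlightLimiter) tryDeliver() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false // limit reached; must wait for an ack
	}
}

func (l *inFlightLimiter) ack() { <-l.slots }

func main() {
	lim := newLimiter(2)
	fmt.Println(lim.tryDeliver()) // true
	fmt.Println(lim.tryDeliver()) // true
	fmt.Println(lim.tryDeliver()) // false: 2 messages in flight
	lim.ack()
	fmt.Println(lim.tryDeliver()) // true again after an ack
}
```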
how to get ordered messages
With a MaxAckPending value greater than 1, any message not acknowledged within the AckWait time is redelivered to the client, and redeliveries can arrive out of order relative to newer messages.
If the policy is ReplayOriginal, the messages in the stream will be pushed to the client at the same rate that they were originally received, simulating the original timing of messages.
If the policy is ReplayInstant (the default), the messages will be pushed to the client as fast as possible while adhering to the Ack Policy, Max Ack Pending and the client’s ability to consume those messages.
ephemeral or durable
A consumer can also be ephemeral or durable. A consumer is considered durable when an explicit name is set on the Durable field when creating the consumer; otherwise it is considered ephemeral. Durable and ephemeral consumers behave exactly the same, except that an ephemeral consumer will be automatically cleaned up (deleted) after a period of inactivity, specifically when there are no subscriptions bound to the consumer. By default, durables remain even through periods of inactivity (unless InactiveThreshold is set explicitly).
In practice, ephemeral push consumers can be a lightweight and useful way to do one-off consumption of a subset of messages in a stream. However, if you have a durable use case, it is recommended to reach for pull consumers first, which provide more control and implicit support for scaling out consumption.
In general, an ordered consumer simply means that MaxAckPending is set to 1, but the client implementations are much more specific than that.
OrderedConsumer will create a FIFO, direct/ephemeral consumer for in-order delivery of messages. There are no redeliveries and no acks; flow control and heartbeats will be added but are taken care of without additional client code.
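The "no redeliveries, no acks" guarantee works because the client tracks the expected next sequence and recreates the underlying ephemeral consumer whenever it sees a gap. A simplified plain-Go sketch of that gap check; the real logic lives inside the client library:

```go
package main

import "fmt"

// orderedTracker mimics the sequence bookkeeping an ordered
// consumer does client-side: any gap means the underlying
// ephemeral consumer must be recreated from the last good point.
type orderedTracker struct{ nextExpected uint64 }

// observe returns true if seq is the one we expected; false
// signals a gap, i.e. "recreate the consumer at nextExpected".
func (t *orderedTracker) observe(seq uint64) bool {
	if seq != t.nextExpected {
		return false
	}
	t.nextExpected++
	return true
}

func main() {
	t := &orderedTracker{nextExpected: 1}
	fmt.Println(t.observe(1))   // true
	fmt.Println(t.observe(2))   // true
	fmt.Println(t.observe(4))   // false: seq 3 was missed
	fmt.Println(t.nextExpected) // 3: the restart point
}
```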
push or pull
In pull mode, the subscriber asks for messages batch by batch and acknowledges them. The subscriber controls the flow.
In push mode, the subscriber initially gets the replay history it wants, but afterwards the messages are sent to a subject. If the subscriber does not keep up with the message rate and does not ack in time, messages may be redelivered, and the subscriber then needs to recognize the ones it has already dealt with. So the speed gained from messages being pushed continuously comes at the cost of additional complexity to get things right.
Consumers can be push-based, where messages are delivered to a specified subject, or pull-based, which allows clients to request batches of messages on demand.
if you want to horizontally scale the processing of all the messages stored in a stream and/or process a high-throughput stream of messages in real-time using batching, then use a shared pull consumer (as they scale well horizontally and batching is in practice key to achieving high throughput)
if the access pattern is more like individual application instances needing their own individual replay of the messages in a stream on demand: then an ‘ordered push consumer’ is best. Consider the use of a durable push consumer with a queue-group for the clients if you want a scalable low latency real time processing of the messages inserted into a stream.
External reference: https://www.synadia.com/newsletter/nats-weekly-36 NATS Weekly #36 - Synadia
In general, default to pull consumers except when the following two conditions are true:
- Ordered consumption of messages is required
- The subscriber can handle the push rate sufficiently
When stream data is to be ingested into a database, a common approach is to use an OrderedConsumer (which is a convenience option for an ephemeral consumer plus a few configuration options). The ingestion process would start up and determine the last sequence that was consumed (defaulting to zero). On each transactional write into the database, the sequence of the last message consumed would be written to the database in the same transaction. This way, if the ingest process crashes, it can be recreated with an offset of the last sequence present in the database.
An ordered consumer is particularly optimized for this purpose, but a standard durable consumer would work as well.
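The checkpointing pattern above can be sketched without any NATS or database dependencies: the key idea is that a message's payload and its stream sequence are committed in the same transaction, so a restart just reads the stored sequence. A minimal plain-Go illustration; the `db` type is a stand-in, not a real database API:

```go
package main

import "fmt"

// db is a hypothetical transactional store: rows plus the last
// consumed stream sequence, committed together.
type db struct {
	rows    []string
	lastSeq uint64
}

// ingest commits the message payload and its stream sequence
// atomically (here simply by updating both fields together).
func (d *db) ingest(seq uint64, payload string) {
	d.rows = append(d.rows, payload)
	d.lastSeq = seq
}

// resumeFrom is the stream sequence a restarted ingester would
// recreate its (ordered) consumer after.
func (d *db) resumeFrom() uint64 { return d.lastSeq }

func main() {
	d := &db{}
	d.ingest(1, "a")
	d.ingest(2, "b")
	// Suppose the process crashes here; on restart the consumer
	// is recreated to start after resumeFrom().
	fmt.Println(d.resumeFrom()) // 2
}
```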
External reference: https://www.byronruth.com/grokking-nats-consumers-part-3/ Grokking NATS Consumers: Pull-based
Although the messages in a given batch will remain ordered, the processing of messages across batches has no ordering guarantees, since subscribers operate independently.
Having used pull consumers more recently: the API is easier to use, the control over when messages are requested is more predictable (and comforting), and the implicit ability to scale is wonderful to have.
A push consumer used as a queue group with explicit ack is functionally the same as a pull consumer; however, the control flow is different.
For most use cases where control flow matters or subscriptions are expected to be long-lived, pull consumers should be used.
So when are push consumers useful? For short-lived consumers, reading short segments of a stream, or subscribers with loose ack requirements, push consumers are a good fit.
As an anecdote, I am currently implementing an event store on top of NATS, leveraging streams for storage and the KV layer for snapshots. The standard replay operation is to get the latest snapshot of some state given an ID, and fetch the events that occurred after the snapshot was taken (which would occur when there are multiple concurrent actors). Typically this could be a handful of events, depending on how active the entity is and the snapshot frequency. In this case, an ephemeral, ordered push consumer is perfect for reading these short sequences of events. In all other cases, I would reach for a pull consumer by default.
Default to using pull consumers as they are easier to use and understand. Choose push consumers for very specific, short-lived or less-control-flow-tolerant use cases.
External reference: https://natsbyexample.com/examples/jetstream/pull-consumer/go/
A pull consumer allows the application to fetch one or more messages on demand using a subscription bound to the consumer.
The client can control the flow of incoming messages so it can process and ack them in an appropriate amount of time.
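The fetch-then-ack loop can be shown against a minimal interface so the pattern is runnable without a NATS server. The `Fetch` shape loosely mirrors the client library's pull API, but every name here is a stand-in:

```go
package main

import "fmt"

// msg is a stand-in for a JetStream message; only the loop
// shape matters for this sketch.
type msg struct{ data string }

func (m msg) ack() {} // no-op ack for the sketch

// fetcher is a stand-in for a pull subscription.
type fetcher interface {
	// Fetch returns up to batch messages, or an empty slice
	// when the stream is (currently) drained.
	Fetch(batch int) []msg
}

// fakeStream serves a fixed backlog in batches.
type fakeStream struct{ backlog []msg }

func (f *fakeStream) Fetch(batch int) []msg {
	if batch > len(f.backlog) {
		batch = len(f.backlog)
	}
	out := f.backlog[:batch]
	f.backlog = f.backlog[batch:]
	return out
}

// consume drains the fetcher batch by batch, acking each message
// after processing: this is the client-driven flow control.
func consume(f fetcher, batch int) int {
	processed := 0
	for {
		msgs := f.Fetch(batch)
		if len(msgs) == 0 {
			return processed
		}
		for _, m := range msgs {
			processed++ // "process" the message
			m.ack()     // then acknowledge it
		}
	}
}

func main() {
	s := &fakeStream{backlog: []msg{{"a"}, {"b"}, {"c"}, {"d"}, {"e"}}}
	fmt.Println(consume(s, 2)) // 5: drained in batches of 2, 2, 1
}
```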
Ephemeral consumers are useful for one-off needs and are a bit cheaper in terms of resources and management. However, ephemerals do not support multiple subscribers, nor do they (of course) persist after the primary subscriber unsubscribes. The server will automatically clean up (delete) the consumer after a period of time.
External reference: https://natsbyexample.com/examples/jetstream/push-consumer/go
The server will proactively push as many messages as possible to the active subscription, up to the consumer's max ack pending limit.
Where push consumers can get unwieldy and confusing is when the subscriber cannot keep up.
Messages start getting redelivered and interleaved with new messages pushed from the stream.