What’s new with io_uring in 6.11 and 6.12 · axboe/liburing Wiki · GitHub


Speedup of MSG_RING requests

MSG_RING requests can be used to send messages from one ring to another – either data
of some sort, or to pass direct/fixed file descriptors between rings. 6.11 includes a new
way to handle remote posting, especially on rings setup with IORING_SETUP_DEFER_TASKRUN,
more efficiently. No changes are required on the application side to take advantage of this
feature.

Main commit

Add support for bind/listen

Support has been added to natively support bind and listen operations in io_uring. This
is particularly useful if a connection has been instantiated with any of the direct variants
of IORING_OP_ACCEPT, as those directly instantiate a direct/fixed io_uring file
descriptor, and hence no normal file descriptor exists for these connections. This means the
normal bind(2) and listen(2) system calls cannot be used. Added in 6.11. See the
liburing io_uring_prep_bind(3) and io_uring_prep_listen(3) man pages for details.

bind
listen

Support for coalescing huge page segments

When registering huge pages as IO buffers, rather than breaking them up into hardware sized smaller
pages, bigger segments can be used directly. This allows faster iteration of buffers, and
also a smaller memory footprint, as a single huge page would previously have used many indexes
to be stored by io_uring. Added in 6.12. This feature is transparent to applications, it’ll
just make registered huge page buffers more efficient.

Main commit

Support for async discard requests

Linux has a block ioctl to issue discard requests to a device, but like other ioctls, it is fully synchronous.
This means that it’ll block the calling thread, and to achieve any kind of parallelism for discard operations,
many threads must be used. Needless to say, this is inefficient. 6.12 adds support for discard operations
through io_uring, in a fully async manner. Performance details on a basic NVMe device are provided in the below linked
merge commit. Since then I did some testing on another device, and not only do async discards use a fraction of
the CPU compared to an equivalent number of threads to keep the same number of inflight IOs, it was also 5-6x
faster at the same work.

Merge

Support for minimum timeout waits

Normally when waiting on events with io_uring, a certain number of events to wait for is specified
by the caller. The caller may also supply a timeout for the wait operation. The wait stops when
either condition has been met – either the desired number of events are available, or the wait
timed out. In case of a timeout, some events may be available to process by the application. Applications
tend to define a timeout based on the latency they can accept. As it’s not unusual for applications
to have varying periods of how busy they are, defining a generic timeout can be difficult. This is
where min timeout comes in – if set, an application may wait based on the following joint conditions,
where n is the number of events being waited for, t is the minimum timeout, and T
is the overall timeout.

  1. Wait for t for n events to become available.
  2. If n events are available, waiting is done and success is returned.
  3. If t time has elapsed and 1 or more events are available, waiting is done and success is returned
  4. If t time has elapsed and 0 events are available, continue waiting until T time has passed
  5. If any event becomes available after t time has elapsed, waiting is done and success is returned
  6. If T time has expired and no events are available, -ETIME is returned

This allows applications to set a low timeout, t, to define the latency accepted for a request,
while still allowing a much longer T to expire if no events are available. This helps avoid
excessive context switches during periods of less activity. It’s worth mentioning that transitioning
between the minimum and overall timeout does not need any context switches of the application. Added
in 6.12.

Main commit

Support for absolute timeouts and other clock sources

Waiting on events with a timeout has previously only been supported as relative timeouts. Some use cases would
really like absolute timeouts as well, mostly from an efficiency point of view, as they would otherwise
need to do extra time retrieval calls in the application. And contrary to what seems to be popular
belief, retrieving the current time is not necessarily a super cheap (or free) operation. Now io_uring
supports specifying absolute timeouts as well as relative timeouts, and specifying either
CLOCK_MONOTONIC or CLOCK_BOOTTIME as the clock source. Available in 6.12.

Absolute timeouts
Selectable clock source

Incremental provided buffer consumption

Provided buffers are a way for applications to provide buffers for, typically, reading from sockets
upfront. This allows io_uring to pick the next available buffer to receive into, when data becomes
available from the socket. The alternative to provided buffers is assigning a buffer to a receive
operation when it’s submitted to io_uring. While this works fine, it can tie up a lot of memory
in cases where it’s uncertain when data will become available. The most efficient type of provided
buffers are ring provided buffers (see io_uring_setup_buf_ring(3) and related man pages). Normally
provided buffers are wholly consumed when picked. This means that if the provided buffers in a given
buffer group ID are 4K in size, then a receive operation that only receives 1K of data will still consume
the entire buffer. If applications have a mix of smaller and bigger (eg streaming) receives, then
appropriately sizing buffers may be difficult.

In 6.12, support has been added for incremental consumption. This allows the application to provide
much larger buffers, and only have individual receives consume exactly the amount out of that buffer
that they need.

This means that both the application and the kernel need to keep track of what the current receive point
is. Each recv will still pass back a buffer ID and the size consumed, the only difference is that before,
the next receive would always be the next buffer in the ring. Now the same buffer ID may return multiple
receives, each at an offset into that buffer from where the previous receive left off. Example:

The application registers a provided buffer ring, and adds two 32K buffers
to the ring.

Buffer1 address: 0x1000000 (buffer ID 0)
Buffer2 address: 0x2000000 (buffer ID 1)

A recv completion is received with the following values:

cqe->res 0x1000 (4k bytes received) cqe->flags 0x11 (CQE_F_BUFFER|CQE_F_BUF_MORE set, buffer ID 0)

and the application now knows that 4096b of data is available at
0x1000000, the start of that buffer, and that more data from this buffer
will be coming. Now the next receive comes in:

cqe->res 0x2000 (8k bytes received) cqe->flags 0x11 (CQE_F_BUFFER|CQE_F_BUF_MORE set, buffer ID 0)

which informs the application that 8k is available where the last
completion left off, at 0x1001000. Next completion is:

cqe->res 0x5000 (20k bytes received) cqe->flags 0x1 (CQE_F_BUFFER set, buffer ID 0)

and the application now knows that 20k of data is available at 0x1003000, which is where the
previous receive ended. CQE_F_BUF_MORE isn’t set, as no more data is available in this buffer
ID. The next completion is then:

cqe->res 0x1000 (4k bytes received) cqe->flags 0x10011 (CQE_F_BUFFER|CQE_F_BUF_MORE set, buffer ID 1)

which informs the application that buffer ID 1 is now the current one, hence there’s 4k of valid data
at 0x2000000. 0x2001000 will be the next receive point for this buffer ID.

When a buffer will be reused by future CQE completions, IORING_CQE_F_BUF_MORE will be set in
cqe->flags. This informs the application that the kernel isn’t done with the buffer yet, and that
it should expect more completions for this buffer ID. This will only be set by provided buffer rings setup
with IOU_PBUF_RING_INC, as that’s the only type of buffer that will see multiple consecutive
completions for the same buffer ID. For any other provided buffer type, any completion that passes back
a buffer to the application is final.

Once a buffer has been fully consumed, the buffer ring head is incremented and the next receive will
indicate the next buffer ID in the cqe flags.

On the send side, the application can control how much data is sent from an existing buffer by setting
sqe->len to the desired send length.

An application can request incremental consumption by setting IOU_PBUF_RING_INC in the provided
buffer ring registration. Outside of that, any provided buffer ring setup and buffer additions are done like
before, no changes there. The only change is in how an application may see multiple completions for the same
buffer ID, hence needing to know where the next receive will happen.

Note that like existing provided buffer rings, this should not be used with IOSQE_ASYNC, as both
require the ring to remain locked over the duration of the buffer selection and the operation completion. It
will consume a buffer otherwise, regardless of the size of the IO done.

To setup a provided buffer ring with incremental consumption, the IOU_PBUF_RING_INC
flag must be given to io_uring_setup_buf_ring(3) or io_uring_register_buf_ring(3). Available in 6.12.

Main commit

Registered buffer cloning support

An application may register IO buffers with io_uring, for more efficient storage IO with O_DIRECT.
If the application has multiple threads, it’s not unusual to register the same set of buffers with one
or more rings on each thread. Normally the buffer registration is fast enough that this doesn’t pose a
problem, but for registering really large amounts of memory (hundreds of gigabytes), it does still take
some time. On my local test system, registering 900GB of memory (think caching system) took about 1 second
to complete. A user reported that registering 700GB for his application took more than 2 seconds. While this
isn’t a big issue for application startup, if threads are more ephemeral in nature, then registration times
of that nature are not acceptable.

Buffer cloning allows cloning a registration from an existing ring (the source) into a new ring (the
destination). For the above 900GB case, rather than spending around 1 second to perform the registration, it can
now be done in 17 microseconds on the same system. This puts it into the realm of something that can be done
dynamically rather than only at startup.

See io_uring_clone_buffers(3) for more details. Available in 6.12.

Main commit

Source link

