There is a recent language comparison repo which has been getting shared a lot. In it, CRuby was the third slowest option, only beating out R and Python.
The repo author, @BenjDicken, made a fun visualization of each language’s performance. Here’s one of the visualizations, which shows Ruby as the third slowest language benchmarked:
The code for this visualization is from https://benjdd.com/languages/, with permission from @BenjDicken
The repository describes itself as:
A repo for collaboratively building small benchmarks to compare languages.
It contains two different benchmarks:
- “Loops”, which “Emphasizes loop, conditional, and simple math performance”
- “Fibonacci”, which “Emphasizes function call overhead and recursion.”
The loop example iterates 1 billion times, using a nested loop:
u = ARGV[0].to_i
r = rand(10_000)
a = Array.new(10_000, 0)
(0...10_000).each do |i|
  (0...100_000).each do |j|
    a[i] += j % u
  end
  a[i] += r
end
puts a[r]
The Fibonacci example is a simple “naive” Fibonacci implementation:
def fibonacci(n)
  return 0 if n == 0
  return 1 if n == 1
  fibonacci(n - 1) + fibonacci(n - 2)
end

u = ARGV[0].to_i
r = 0
(1...u).each do |i|
  r += fibonacci(i)
end
puts r
Run on @BenjDicken’s M3 MacBook Pro, Ruby 3.3.6 takes 28 seconds to run the loop iteration example, and 12 seconds to run the Fibonacci example. For comparison, node.js takes a little over a second for both examples – it’s not a great showing for Ruby.
| | Fibonacci | Loops |
|---|---|---|
| Ruby | 12.17s | 28.80s |
| node.js | 1.11s | 1.03s |
From this point on, I’ll use benchmark numbers from my own computer. Running the same benchmarks on my M2 MacBook Air, I get 33.43 seconds for the loops and 16.33 seconds for fibonacci – even worse 🥺. Node runs a little over 1 second for fibonacci and 2 seconds for the loop example.
| | Fibonacci | Loops |
|---|---|---|
| Ruby | 16.33s | 33.43s |
| node.js | 1.36s | 2.07s |
Who cares?
In most ways, these types of benchmarks are meaningless. Python was the slowest language in the benchmark, and yet at the same time it’s the most used language on GitHub as of October 2024. Ruby runs some of the largest web apps in the world. I ran a benchmark recently of websocket performance between the Ruby Falcon web server and node.js, and the Ruby results were close to the node.js results. Are you doing a billion loop iterations or using web sockets?
A programming language should be reasonably efficient – after that the usefulness of the language, the type of tasks you work on, and language productivity outweigh the speed at which you can run a billion iterations of a loop, or complete an intentionally inefficient implementation of a Fibonacci method.
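For a sense of just how intentionally inefficient the naive version is, here is a memoized variant (a quick sketch of my own, not part of the benchmark) that turns the exponential recursion into linear work:
def fibonacci(n, memo = {})
  return n if n < 2
  memo[n] ||= fibonacci(n - 1, memo) + fibonacci(n - 2, memo)
end
puts fibonacci(40) # returns near-instantly, unlike the naive version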
That said:
- The programming world loves microbenchmarks 🤷♂️
- Having a fast benchmark may not be valuable in practice, but it has meaning for people’s interest in a language. Some would claim it means you’ll have an easier time scaling performance, but that’s arguable
- It’s disappointing if your language of choice doesn’t perform well. It’s nice to be able to say “I use and enjoy this language, and it runs fast in all benchmarks!”
In the case of this Ruby benchmark, I had a feeling that YJIT wasn’t being applied in the Ruby code, so I checked the repo. Lo and behold, the command was as follows:
ruby ./code.rb 40
We know my results from earlier (33 seconds and 16 seconds). What do we get with YJIT applied?
ruby --yjit ./code.rb 40
| | Fibonacci | Loops |
|---|---|---|
| Ruby | 2.06s | 25.57s |
Nice! With YJIT, Fibonacci gets a massive boost – going from 16.33 seconds down to 2.06 seconds. It’s close to the speed of node.js at that point!
YJIT makes a more modest difference for the looping example – going from 33.43 seconds down to 25.57 seconds. Why is that?
A team effort
I wasn’t alone in trying out these code samples with YJIT. On twitter, @bsilva96 had asked the same questions:
@k0kubun came through with insights into why things were slow and ways of improving the performance:
Let’s unpack his response. There are three parts to it:
- Range#each is still written in C as of Ruby 3.4
- Integer#times was changed from C to Ruby in Ruby 3.3
- Array#each was changed from C to Ruby in Ruby 3.4
1. Range#each is still written in C, which YJIT can’t optimize
Looking back at our Ruby code:
(0...10_000).each do |i|
  (0...100_000).each do |j|
    a[i] += j % u
  end
  a[i] += r
end
It’s written as a range, and range has its own each implementation, which is apparently written in C. The CRuby codebase is pretty easy to navigate – let’s find that implementation 🕵️♂️.
Most core classes in Ruby have top-level C files named after them – in this case we’ve got range.c at the root of the project. CRuby has a pretty readable interface for exposing C functions as classes and methods – there is an Init function, usually at the bottom of the file. Inside that Init, our classes, modules and methods are exposed from C to Ruby. Here are the relevant pieces of Init_Range:
void
Init_Range(void)
{
    //...
    rb_cRange = rb_struct_define_without_accessor(
        "Range", rb_cObject, range_alloc,
        "begin", "end", "excl", NULL);
    rb_include_module(rb_cRange, rb_mEnumerable);
    // ...
    rb_define_method(rb_cRange, "each", range_each, 0);
First, we define our Range class using rb_struct_define.... We name it “Range”, with a super class of Object (rb_cObject), and some initialization parameters (“begin”, “end” and whether to exclude the last value, ie the .. vs ... range syntax).
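As a quick refresher on what that “excl” flag controls (a small example of my own, not from the CRuby source):
(1..3).to_a            # => [1, 2, 3]  two dots include the last value
(1...3).to_a           # => [1, 2]     three dots exclude it
(1...3).exclude_end?   # => true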
Second, we include Enumerable using rb_include_module. That gives us all the great Ruby enumeration methods like map, select, include? and a bajillion others. All you have to do is supply an each implementation and it handles the rest.
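To make that concrete, here’s a tiny sketch of my own (not from the CRuby source): define each, include Enumerable, and the rest of the methods come along for free.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # The only method Enumerable asks us to supply
  def each
    i = @from
    while i > 0
      yield i
      i -= 1
    end
  end
end

Countdown.new(3).map { |n| n * 10 } # => [30, 20, 10]
Countdown.new(3).include?(2)        # => true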
Third, we define our “each” method. It’s implemented by the range_each function in C, and takes zero explicit arguments (blocks are not considered in this count).
range_each is big. It’s almost 100 lines long, and specializes into several versions of itself. I’ll highlight a few, collapsed all together:
static VALUE
range_each(VALUE range)
{
    //...
    range_each_fixnum_endless(beg);
    range_each_fixnum_loop(beg, end, range);
    range_each_bignum_endless(beg);
    rb_str_upto_endless_each(beg, sym_each_i, 0);
    // and even more...
These C functions handle all the variations of ranges you might use in your own code:
(0...).each
(0...100).each
("a"..."z").each
# and on...
Why does it matter that Range#each is written in C? It means YJIT can’t see inside it – optimizations stop at the function call and resume when the function call returns. C functions are fast, but YJIT can take things further by creating specializations for hot paths of code. There is a great article from Aaron Patterson called Ruby Outperforms C where you can learn more about some of those specialized optimizations.
2. Optimizing our loop: Integer#times was changed from C to Ruby in Ruby 3.3
The hot path (where most of our CPU time is spent) is Range#each, which is a C function. YJIT can’t optimize C functions – they’re a black box. So what can we do?
We changed Integer#times to Ruby in 3.3
Interesting! In Ruby 3.3, Integer#times was changed from a C function to a Ruby method! Here’s the 3.3+ version – it’s pretty simple:
def times
  #... a little C interop code
  i = 0
  while i < self
    yield i
    i = i.succ
  end
  self
end
Very simple. It’s just a basic while loop. Most importantly, it’s all Ruby code, which means YJIT should be able to introspect and optimize it!
An aside on Integer#succ
The slightly odd part of that code is i.succ. I’d never heard of Integer#succ, which apparently gives you the “successor” to an integer.
I’ve never seen this show, and yet it’s the first thing I thought of when I learned about this method. Thanks, advertising.
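In case you haven’t used it either, here’s what it does (succ also exists on String, where the “successor” naming feels more natural to me):
41.succ    # => 42
-1.succ    # => 0
"a".succ   # => "b"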
There was a PR to improve the performance of Integer#succ in early 2024, which helped me understand why anyone would ever use it:
We use Integer#succ when we rewrite loop methods in Ruby (e.g. Integer#times and Array#each) because opt_succ (i = i.succ) is faster to dispatch on the interpreter than putobject 1; opt_plus (i += 1).
Integer#succ is like a virtual machine cheat code. It takes a common operation (adding 1 to an integer) and turns it from two virtual machine operations into one. We can call disasm on the times method to see that in action:
puts RubyVM::InstructionSequence.disasm(1.method(:times))
The Integer#times method gets broken down into a lot of Ruby VM bytecode, but we only care about a few lines:
...
0025 getlocal_WC_0 i@0
0027 opt_succ [CcCr]
0029 setlocal_WC_0 i@0
...
- getlocal_WC_0 gets our i variable from the current scope. That’s the i in i.succ
- opt_succ performs the succ call in our i.succ. It will either call the actual Integer#succ method, or an optimized C function for small numbers
  - In Ruby 3.4 with YJIT enabled, small numbers get optimized even further into machine code (just a note, not shown in the VM bytecode)
- setlocal_WC_0 sets the result of opt_succ to our local variable i
If we change from i = i.succ to i += 1, we now have two VM operations taking the place of opt_succ:
...
0025 getlocal_WC_0 i@0
0027 putobject_INT2FIX_1_
0028 opt_plus
0029 setlocal_WC_0 i@0
...
Everything is essentially the same as before, except now we have two steps to go through instead of one:
- putobject_INT2FIX_1_ pushes the integer 1 onto the virtual machine stack
- opt_plus is the + in our += 1, and calls either the Ruby + method or an optimized C function for small numbers
  - There is probably a YJIT optimization for opt_plus as well
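You can see the difference for yourself by compiling the two forms directly – the same instructions we just walked through show up (output abbreviated, exact offsets will vary):
puts RubyVM::InstructionSequence.compile("i = 0; i = i.succ").disasm
# ... opt_succ ...
puts RubyVM::InstructionSequence.compile("i = 0; i += 1").disasm
# ... putobject_INT2FIX_1_ followed by opt_plus ...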
If there is nothing else to learn from this code, it’s this: the kinds of optimizations you do at the VM and JIT level are profound. When writing everyday Ruby programs we typically don’t and shouldn’t consider the impact of one versus two virtual machine instructions. But at the JIT level, on the scale of millions and billions of operations, it matters!
Back to Integer#times
Let’s try running our benchmark code again, using times! Instead of iterating over ranges, we simply iterate 10_000 and 100_000 times:
u = ARGV[0].to_i
r = rand(10_000)
a = Array.new(10_000, 0)
10_000.times do |i|
  100_000.times do |j|
    a[i] += j % u
  end
  a[i] += r
end
puts a[r]
| | Loops |
|---|---|
| Range#each | 25.57s |
| Integer#times | 13.66s |
Nice! YJIT makes a much larger impact using Integer#times. That cuts things down significantly, taking it down to 13.66 seconds on my machine. On @k0kubun’s machine it actually goes down to 9 seconds (and 8 seconds on Ruby 3.4).
It’s probably Ruby 3.5’s job to make it faster than 8s though.
We might look forward to even faster performance in Ruby 3.5. We’ll see!
3. Array#each was changed from C to Ruby in Ruby 3.4
CRuby continues to see C code rewritten in Ruby, and in Ruby 3.4 Array#each was one of those changes. Here is an example of the first attempt at implementing it:
def each
  unless block_given?
    return to_enum(:each) { self.length }
  end
  i = 0
  while i < self.length
    yield self[i]
    i = i.succ
  end
  self
end
Super simple and readable! And YJIT optimizable!
Unfortunately, due to something related to CRuby internals, it contained race conditions. A later implementation landed in Ruby 3.4.
def each
  Primitive.attr! :inline_block, :c_trace
  unless defined?(yield)
    return Primitive.cexpr! 'SIZED_ENUMERATOR(self, 0, 0, ary_enum_length)'
  end
  _i = 0
  value = nil
  while Primitive.cexpr!(%q{ ary_fetch_next(self, LOCAL_PTR(_i), LOCAL_PTR(value)) })
    yield value
  end
  self
end
Unlike the first implementation, and unlike Integer#times, things are a bit more cryptic this time. This is definitely not pure Ruby code that anyone could be expected to write. Somehow, the Primitive module seems to enable evaluating C code from Ruby, and in doing so avoids the race conditions present in the pure Ruby solution.
By fetching indexes and values using C code, I think it results in a more atomic operation. I have no idea why the Primitive.cexpr! is used to return the enumerator, or what value Primitive.attr! :inline_block provides. Please comment if you have insights there!
I was a little loose with my earlier Integer#times source code as well. That actually had a bit of this Primitive syntax too. The core of the method is what we looked at, and it’s all Ruby, but the start of the method contains the same Primitive calls for :inline_block and returning the enumerator:
def times
  Primitive.attr! :inline_block
  unless defined?(yield)
    return Primitive.cexpr! 'SIZED_ENUMERATOR(self, 0, 0, int_dotimes_size)'
  end
  #...
Ok - it’s more cryptic than Integer#times was, but Array#each is mostly Ruby (on Ruby 3.4+). Let’s give it a try using arrays instead of ranges or times:
u = ARGV[0].to_i
r = rand(10_000)
a = Array.new(10_000, 0)
outer = (0...10_000).to_a.freeze
inner = (0...100_000).to_a.freeze
outer.each do |i|
  inner.each do |j|
    a[i] += j % u
  end
  a[i] += r
end
puts a[r]
Despite the embedded C code, YJIT still seems capable of making some big performance optimizations. It’s within the same range as Integer#times!
| | Loops |
|---|---|
| Range#each | 25.57s |
| Integer#times | 13.66s |
| Array#each | 13.96s |
Microbenchmarking Ruby performance
I’ve forked the original language comparison repo, and created my own repository called “Ruby Microbench”. It contains all of the examples discussed, as well as several other ways of doing the iteration in Ruby: https://github.com/jpcamara/ruby_microbench
Here is the output of just running those using Ruby 3.4 with and without YJIT:
| | fibonacci | array#each | range#each | times | for | while | loop do |
|---|---|---|---|---|---|---|---|
| Ruby 3.4 YJIT | 2.19s | 14.02s | 26.61s | 13.12s | 27.38s | 37.10s | 13.95s |
| Ruby 3.4 | 16.49s | 34.29s | 33.88s | 33.18s | 36.32s | 37.14s | 50.65s |
I have no idea why the for and while loop examples I wrote (roughly sketched below) seem to be so slow. I’d expect them to run much faster. Maybe there’s an issue with how I wrote them - feel free to open an issue or PR if you see something wrong with my implementation. The loop do (taken from @timtilberg’s example) runs around the same speed as Integer#times - although its performance is terrible with YJIT turned off.
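For reference, the while variant is roughly this shape (an illustrative sketch - see the repo for the exact code):
u = ARGV[0].to_i
r = rand(10_000)
a = Array.new(10_000, 0)
i = 0
while i < 10_000
  j = 0
  while j < 100_000
    a[i] += j % u
    j += 1
  end
  a[i] += r
  i += 1
end
puts a[r]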
In addition to running Ruby 3.4, for fun I have it using rbenv to run:
- Ruby 3.3
- Ruby 3.3 YJIT
- Ruby 3.2
- Ruby 3.2 YJIT
- TruffleRuby 24.1
- Ruby Artichoke
- MRuby
A few of the test runs are listed here:
| | fibonacci | array#each | range#each | times | for | while | loop do |
|---|---|---|---|---|---|---|---|
| Ruby 3.4 YJIT | 2.19s | 14.02s | 26.61s | 13.12s | 27.38s | 37.10s | 13.95s |
| Ruby 3.4 | 16.49s | 34.29s | 33.88s | 33.18s | 36.32s | 37.14s | 50.65s |
| TruffleRuby 24.1 | 0.92s | 0.97s | 0.92s | 2.39s | 2.06s | 3.90s | 0.77s |
| MRuby 3.3 | 28.83s | 144.65s | 126.40s | 128.22s | 133.58s | 91.55s | 144.93s |
| Artichoke | 19.71s | 236.10s | 214.55s | 214.51s | 215.95s | 174.70s | 264.67s |
Based on that, I’ve taken the original visualization and made a Ruby-specific one here just for the fibonacci run:
Speeding up range#each
Can we, the non @k0kubun’s of the world, make range#each faster? If I monkey patch the Range class with a pure-Ruby implementation, things do get much faster! Here’s my implementation:
class Range
  def each
    beginning = self.begin
    ending = self.end
    i = beginning
    loop do
      break if i == ending
      yield i
      i = i.succ
    end
  end
end
And here is the change in performance - about 3 seconds slower than times - not bad!
| | Time spent |
|---|---|
| Range#each in C | 25.57s |
| Range#each in Ruby | 16.64s |
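If you want to reproduce the comparison, a minimal harness (my own sketch - load the monkey patch above for the second run) could look like this:
require "benchmark"

u = 40
a = Array.new(10_000, 0)
time = Benchmark.realtime do
  (0...10_000).each do |i|
    (0...100_000).each do |j|
      a[i] += j % u
    end
  end
end
puts "#{time.round(2)}s"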
This is obviously over-simplified. I don’t handle all of the different cases of Range, and there may be nuances I am missing. Also, most of the rewritten Ruby methods I’ve seen call upon a Primitive class for certain operations. I’d love to learn more about when and why it’s needed.
But! It goes to show the power of moving things out of C and letting YJIT optimize our code. It can improve performance in ways that would be difficult or impossible to replicate in normal C code.
YJIT standard library
Last year Aaron Patterson wrote an article called Ruby Outperforms C, in which he rewrote a C extension in Ruby for some GraphQL parsing. The Ruby code outperformed C thanks to YJIT optimizations.
This got me thinking that it would be interesting to see a kind of “YJIT standard library” emerge, where core Ruby functionality run in C could be swapped out for Ruby implementations for use by people running YJIT.
As it turns out, this is almost exactly what the core YJIT team has been doing. In many cases they’ve completely removed C code, but more recently they’ve created a with_yjit block. The code will only take effect if YJIT is enabled, and otherwise the C code will run. For example, this is how Array#each is implemented:
with_yjit do
  if Primitive.rb_builtin_basic_definition_p(:each)
    undef :each
    def each # :nodoc:
      # ... we examined this code earlier ...
    end
  end
end
As of Ruby 3.3, YJIT can be lazily initialized. Thankfully the with_yjit code handles this - the appropriate with_yjit versions of methods will be run once YJIT is enabled:
# Uses C-builtin
[1, 2, 3].each do |i|
  puts i
end

RubyVM::YJIT.enable

# Uses Ruby version, which can be YJIT optimized
[1, 2, 3].each do |i|
  puts i
end
This is because with_yjit is a YJIT “hook”, which is called the moment YJIT is enabled. After being called, it is removed from the runtime using undef :with_yjit.
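One way to watch the swap happen yourself (a rough check of my own: source_location returns nil for C-defined methods and a file/line pair for Ruby-defined ones, so on Ruby 3.4 I’d expect something like the comments below):
p Array.instance_method(:each).source_location
# => nil (still the C builtin)

RubyVM::YJIT.enable

p Array.instance_method(:each).source_location
# => something like ["<internal:array>", ...] once the Ruby version takes over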
Investigating YJIT optimizations
We’ve looked at Ruby code. We’ve looked at C code. We’ve looked at Ruby VM bytecode. Why not take it one step deeper and look at some machine code? And maybe some Rust code? Hey - where are you going! Don’t walk away while I’m talking to you!
If you haven’t walked away, or skipped to the next section, let’s take a look at a small sliver of YJIT while we’re here!
We can see the machine code YJIT generates 😱. It’s possible by building CRuby from source with YJIT debug flags. If you’re on a Mac you can see my MacOS setup for hacking on CRuby or my docker setup for hacking on CRuby for more detailed instructions on building Ruby. But the simplified version is that when you go to ./configure Ruby, you pass in an option of --enable-yjit=dev:
./configure --enable-yjit=dev
make install
Let’s use our Integer#times example from earlier as our example Ruby code:
u = ARGV[0].to_i
r = rand(10_000)
a = Array.new(10_000, 0)
10_000.times do |i|
  100_000.times do |j|
    a[i] += j % u
  end
  a[i] += r
end
puts a[r]
Because you’ve built Ruby with YJIT in dev mode, you can pass in the --yjit-dump-disasm flag when running your ruby program:
./ruby --yjit --yjit-dump-disasm test.rb 40
Using this, we can see the machine code generated. We’ll just zoom in on one tiny part - the machine code equivalent of the Ruby VM bytecode we read earlier. Here is the original VM bytecode for opt_succ, which is generated when you call i.succ, the Integer#succ method:
...
0027 opt_succ [CcCr]
...
And here is the machine code YJIT generates in this scenario, on my Mac M2 arm64 architecture:
# Block: times@:259
# reg_mapping: [Some(Stack(0)), None, None, None, None]
# Insn: 0027 opt_succ (stack_size: 1)
# call to Integer#succ
# guard object is fixnum
0x1096808c4: tst x1, #1
0x1096808c8: b.eq #0x109683014
0x1096808cc: nop
0x1096808d0: nop
0x1096808d4: nop
0x1096808d8: nop
0x1096808dc: nop
# Integer#succ
0x1096808e0: adds x11, x1, #2
0x1096808e4: b.vs #0x109683048
0x1096808e8: mov x1, x11
To be honest, I about 25% understand this, and 75% am combining my own logic and AI to learn it 🤫. Feel free to yell at me if I get it a little wrong, I’d love to learn more. But here’s how I break this down.
# Block: times@:259
👆🏼This roughly corresponds to the line i = i.succ in the Integer#times method in numeric.rb. I say roughly because in my current code I see that on line 258, but maybe it shows the end of the block it’s run in since YJIT compiles “blocks” of code:
256: while i < self
257:   yield i
258:   i = i.succ
259: end
# reg_mapping: [Some(Stack(0)), None, None, None, None]
# Insn: 0027 opt_succ (stack_size: 1)
# call to Integer#succ
👆🏼I have no idea what reg_mapping means - probably mapping how it uses a CPU register? Insn: 0027 opt_succ looks very familiar! That’s our VM bytecode! call to Integer#succ is just a helpful comment added in. YJIT is capable of adding comments to the machine code. We still haven’t even left the safety of the comments 😅.
# guard object is fixnum
👆🏼This is interesting. I can find a corresponding bit of Rust code that maps directly to this. Let’s take a look at it:
fn jit_rb_int_succ(
    //...
    asm: &mut Assembler,
    //...
) -> bool {
    // Guard the receiver is fixnum
    let recv_type = asm.ctx.get_opnd_type(StackOpnd(0));
    let recv = asm.stack_pop(1);
    if recv_type != Type::Fixnum {
        asm_comment!(asm, "guard object is fixnum");
        asm.test(recv, Opnd::Imm(RUBY_FIXNUM_FLAG as i64));
        asm.jz(Target::side_exit(Counter::opt_succ_not_fixnum));
    }
    asm_comment!(asm, "Integer#succ");
    let out_val = asm.add(recv, Opnd::Imm(2)); // 2 is untagged Fixnum 1
    asm.jo(Target::side_exit(Counter::opt_succ_overflow));
    // Push the output onto the stack
    let dst = asm.stack_push(Type::Fixnum);
    asm.mov(dst, out_val);
    true
}
Oh nice! This is the actual YJIT Rust implementation of the opt_succ call. This is the optimization @k0kubun made to further improve opt_succ performance beyond the bytecode C function calls. We’re in the section that is checking if what we’re operating on is a Fixnum, which is the way small integers are stored internally in CRuby:
if recv_type != Type::Fixnum {
    asm_comment!(asm, "guard object is fixnum");
    asm.test(recv, Opnd::Imm(RUBY_FIXNUM_FLAG as i64));
    asm.jz(Target::side_exit(Counter::opt_succ_not_fixnum));
}
That becomes this machine code:
# guard object is fixnum
0x1096808c4: tst x1, #1
0x1096808c8: b.eq #0x109683014
asm.test generates tst x1, #1, which according to an AI bot I asked is checking the least significant bit – the Fixnum “tag” that indicates this is a Fixnum. If it is a Fixnum, the result is 1, b.eq is not taken and we fall through. If it’s not a Fixnum, the result is 0, b.eq is taken and we jump away from this code.
0x1096808cc: nop
0x1096808d0: nop
0x1096808d4: nop
0x1096808d8: nop
0x1096808dc: nop
🤖 “NOPs for alignment/padding”. Thanks AI. I don’t know why it is needed, but at least I know what it probably is.
Finally, we actually add 1 to the number.
asm_comment!(asm, "Integer#succ");
let out_val = asm.add(recv, Opnd::Imm(2)); // 2 is untagged Fixnum 1
asm.jo(Target::side_exit(Counter::opt_succ_overflow));
// Push the output onto the stack
let dst = asm.stack_push(Type::Fixnum);
asm.mov(dst, out_val);
The Rust code generates our Integer#succ comment. Then, to add 1, the “Fixnum tag” data embedded within our integer actually means we have to add 2, using adds x11, x1, #2 😵💫. If we overflow the space available, it exits to a different code path - b.vs is a branch on overflow. Otherwise, it stores the result with mov x1, x11!
# Integer#succ
0x1096808e0: adds x11, x1, #2
0x1096808e4: b.vs #0x109683048
0x1096808e8: mov x1, x11
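As an aside, you can glimpse that tagging from plain Ruby: in CRuby a fixnum n is stored (and reported by object_id) as the odd value 2n + 1, which is why adding 1 at the Ruby level means adding 2 to the tagged representation.
1.object_id # => 3
2.object_id # => 5
3.object_id # => 7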
😮💨. That was a lot. And it seems like a lot of work is being done, but because it’s such low level machine code it’s presumably super fast. We examined a teensy tiny portion of what YJIT is capable of generating - JITs are complex!
Thanks to @k0kubun for providing me with the commands and pointing me at the YJIT docs, which contain tons of additional options as well.
The future of CRuby optimizations
The irony of language implementation is that you often work less in the language you’re implementing than you do in something lower-level - in Ruby’s case, that’s mostly C and some Rust.
With a layer like YJIT, it potentially opens up a future where more of the language becomes plain Ruby, and Ruby developer contribution is easier. Many languages have a smaller low-level core, and the majority of the language is written in itself (like Java, for instance). Maybe that’s a future for CRuby, someday! Until then, keep the YJIT optimizations coming, YJIT team!