JSI is already fast, but small architectural choices can make the same code 2–10x slower or faster.
If you’re building a module with pure JSI, it helps to know which patterns make your code as efficient as possible.
This matters most when the module runs in a hot path.
Before diving in, keep one thing in mind: JSI is fast, but it is not free. Even if your native implementation is very fast, the total cost may still be dominated by:
- JS ↔ C++ boundary crossing
- property access resolution
- virtual dispatch
- string conversion and comparison
- heap allocation
So in practice, the biggest wins usually come from choosing the right architecture first, and only then from micro-optimizations.
HostFunction vs HostObject: Avoiding Hidden Property Access Costs
When writing a module, there are several ways to achieve the same result.
For example, imagine we want to create a JSI function that executes some logic. One possible way is to create a HostObject, and then use it from JS roughly like this: MyModule.process(...)
class MyModuleHostObject : public HostObject {
public:
  Value get(Runtime &rt, const PropNameID &prop_name) override {
    const std::string name = prop_name.utf8(rt);
    if (name == "process") {
      return Function::createFromHostFunction(..., [](...) -> Value {
        // Do something
      });
    }
    ...
  }
};
But this approach is not the most performant because of several sources of overhead:
- virtual dispatch through HostObject::get
- conversion from PropNameID to std::string
- string comparison for the property name
- and, in many implementations, creation of a new HostFunction during property access
One detail that is easy to miss:
HostObject::get() is executed on every property access, not only once during initialization.
In reality, each call looks more like this:
JS -> property access -> HostObject::get()
-> virtual call
-> PropNameID -> utf8 string
-> string comparison
-> create HostFunction
We can expose the same API with far less overhead:
Function process = Function::createFromHostFunction(..., [](...) -> Value {
// Do something
});
rt.global().setProperty(rt, "process", std::move(process));
From JavaScript, both approaches look nearly identical: they do the same useful work, and they return the same result:
global.MyModule.process(...)
global.process(...)
Internally, however, they behave very differently, and the direct HostFunction version avoids the property-resolution overhead on every call.
If we benchmark these two approaches by measuring a simple call in a loop, where the function does nothing and just returns some number, we can see the difference.
Benchmark (1,000,000 iterations)
HostObject: 181.09ms
HostFunction: 26.87ms
The result is clear: a simple HostFunction call is about 7x faster than the same function exposed through HostObject, and in hot paths this can bring very noticeable gains.
Another important nuance is that if you do stay with HostObject, property dispatch itself can still be optimized.
For stable property sets, jsi::PropNameIDs can be cached and re-used instead of repeatedly converting property names to UTF-8 strings and comparing raw strings on every access.
This does not remove all HostObject overhead, but it can reduce part of the lookup cost.
NitroModules also helps here out of the box: it caches jsi::PropNameIDs for repeated comparisons, which is especially useful when many native objects share the same shape.
Faster Property Dispatch with Hashed Lookup
If the property set is fixed, another possible optimization is to replace long chains of string comparisons with generated or hashed dispatch. That can reduce branching and repeated string work during lookup.
Nitro Modules also applies this kind of optimization out of the box through generated dispatch logic, so users can benefit from it without writing the lookup code manually.
Example from Nitro Modules
switch (hashString(unionValue.c_str(), unionValue.size())) {
case hashString("electric"): return margelo::nitro::test::Powertrain::ELECTRIC;
case hashString("gas"): return margelo::nitro::test::Powertrain::GAS;
case hashString("hybrid"): return margelo::nitro::test::Powertrain::HYBRID;
default: [[unlikely]]
throw std::invalid_argument("Cannot convert \"" + unionValue + "\" to enum Powertrain - invalid value!");
}
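The hashString helper in the snippet above is generated by Nitro and is not shown here. A minimal constexpr sketch of the same idea might look like the following; the FNV-1a algorithm, the toPowertrain wrapper, and the simplified Powertrain enum are illustrative assumptions, not Nitro's actual implementation:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Sketch of a constexpr string hash (FNV-1a), usable in case labels.
constexpr std::uint64_t hashString(const char* str, std::size_t length) {
  std::uint64_t hash = 14695981039346656037ull; // FNV offset basis
  for (std::size_t i = 0; i < length; ++i) {
    hash = (hash ^ static_cast<std::uint8_t>(str[i])) * 1099511628211ull; // FNV prime
  }
  return hash;
}

// Convenience overload so string literals can be used directly in case labels.
constexpr std::uint64_t hashString(const char* str) {
  std::size_t length = 0;
  while (str[length] != '\0') ++length;
  return hashString(str, length);
}

enum class Powertrain { ELECTRIC, GAS, HYBRID };

// One hash computation plus integer comparisons, instead of a chain of
// string comparisons per lookup.
Powertrain toPowertrain(const std::string& value) {
  switch (hashString(value.c_str(), value.size())) {
    case hashString("electric"): return Powertrain::ELECTRIC;
    case hashString("gas"):      return Powertrain::GAS;
    case hashString("hybrid"):   return Powertrain::HYBRID;
    default:
      throw std::invalid_argument("invalid Powertrain value: " + value);
  }
}
```

One caveat of this pattern: the compiler cannot verify that the case-label hashes are collision-free, so generated code (as in Nitro) is a safer place for it than hand-written lookup tables.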
NativeState vs HostObject: Faster Stateful Native Objects
What if we need native state between calls?
In that case, we can use jsi::HostObject, but there is another strong option: jsi::NativeState.
Let's consider a simple example: we have an object that mutates its native state between calls (a counter) and returns the new value back to JS.
If we use HostObject, the code might look like this:
class HObject : public HostObject {
public:
  int counter = 10;
  Value get(Runtime &rt, const PropNameID &prop_name) override {
    const std::string name = prop_name.utf8(rt);
    if (name == "increment") {
      // note: the lambda must capture `this` to reach `counter`
      return Function::createFromHostFunction(..., [this](...) -> Value {
        counter++;
        return counter;
      });
    }
    ...
  }
};
...
auto hostObj = Object::createFromHostObject(rt, std::make_shared<HObject>());
rt.global().setProperty(rt, "MyHostObject", std::move(hostObj));
We can achieve the same result by using jsi::NativeState, and this approach performs significantly better:
class MyNativeState : public NativeState {
public:
  int counter = 10;
};
...
Object jsObject(rt);
jsObject.setNativeState(rt, std::make_shared<MyNativeState>());
jsObject.setProperty(rt, "increment", Function::createFromHostFunction(
    rt, PropNameID::forAscii(rt, "increment"), 0,
    [](Runtime &rt, const Value &thisVal, const Value *args, size_t count) -> Value {
      auto nativeState = thisVal.asObject(rt).getNativeState<MyNativeState>(rt);
      nativeState->counter++;
      return nativeState->counter;
    }));
rt.global().setProperty(rt, "MyObjectNativeState", std::move(jsObject));
In this example, we create a plain JS object, attach NativeState to it, add one property to that JS object which is just a simple HostFunction, and inside that function we access the NativeState from thisVal.
Then, just like before, we put that object into the runtime under the name we want.
On the JS side, we use it like this:
MyHostObject.increment()
MyObjectNativeState.increment()
From JavaScript, the API looks the same. Let’s now look at the benchmark.
Benchmark (1,000,000 iterations)
HostObject: 188.20ms
NativeState: 38.19ms
The benchmark shows NativeState is again roughly 5x faster.
Nitro note:
If you are using Nitro Modules, this optimization is already built into the default object model.
Nitro HybridObjects are backed by jsi::NativeState rather than jsi::HostObject, so this class of performance improvement comes largely out of the box.
Why NativeState Performs Better
There is no magic here. The main reason is simple: NativeState removes an entire layer of dynamic property-resolution overhead:
- no virtual dispatch through HostObject::get
- no property-name lookup via PropNameID -> std::string
- no string comparison
- direct access to the native pointer through thisVal.asObject(rt).getNativeState(...)
So if you need a JSI object with internal native state and you want maximum performance, it is worth considering Object + NativeState instead of HostObject.
When NativeState Is the Better Choice
This pattern is especially attractive when:
- the set of methods is known upfront
- method names are stable
- state lives naturally on the native side
- you want an object-like JS API, but without paying HostObject costs on every access
In other words, if you do not need dynamic property interception semantics, NativeState is often a much better fit.
Stack Allocation vs Heap Allocation
If the size of the string is known, or at least its maximum size is bounded (for example, under 512 bytes), and you need to create that string during execution, prefer working on the stack.
Let’s look at a simple example.
Imagine we need to build a string from input data. It could be anything, but for simplicity we will just fill it with the character 'A'.
std::string buf;
buf.resize(256);
for (int i = 0; i < 256; ++i) {
  buf[i] = 'A';
}
return String::createFromAscii(rt, buf);
Next, let’s speed this code up by replacing std::string with a char array:
char buf[256];
for (int i = 0; i < 256; ++i) {
  buf[i] = 'A';
}
return String::createFromAscii(rt, buf, 256);
The logic is the same, but the performance characteristics are different.
Benchmark (1,000,000 iterations)
std::string: 146.70ms
char buf[256]: 46.64ms
The result shows that char buf[256] is about 3x faster.
Why Stack Allocation Wins
The main reasons are usually:
- no heap allocation
- no allocator overhead
- better locality for short-lived data
- fewer indirections
A stack buffer is just local memory with a fixed size known at compile time. In contrast, once std::string grows beyond its small internal buffer, it typically requires heap allocation.
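A standalone harness for this kind of comparison might look like the sketch below. The JSI String creation step is left out so it compiles outside React Native, and useHeap, useStack, and measureMs are illustrative names; absolute numbers will vary by platform and optimization level.

```cpp
#include <chrono>
#include <string>

// Prevents the compiler from optimizing the fill loops away entirely.
volatile char sink;

// Heap-backed variant: std::string of 256 bytes (beyond SSO).
inline void useHeap() {
  std::string buf;
  buf.resize(256);
  for (int i = 0; i < 256; ++i) buf[i] = 'A';
  sink = buf[0];
}

// Stack-backed variant: fixed-size local buffer.
inline void useStack() {
  char buf[256];
  for (int i = 0; i < 256; ++i) buf[i] = 'A';
  sink = buf[0];
}

// Runs f `iterations` times and returns the elapsed wall time in ms.
template <typename F>
double measureMs(F f, int iterations) {
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) f();
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Calling measureMs(useHeap, 1'000'000) and measureMs(useStack, 1'000'000) with optimizations enabled gives a rough picture of the gap on your own hardware.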
Important note about std::string
One nuance is worth mentioning:
std::string is not always heap-allocated. Most standard library implementations use SSO (Small String Optimization), which means short strings can be stored directly inside the std::string object itself.
That optimization only helps for small strings, commonly around 15–23 bytes depending on the implementation and platform. In this example we use 256 bytes, so this case is well beyond SSO and will typically require heap allocation.
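A quick way to observe SSO on your own standard library is to inspect capacity(). The helpers below are a sketch with hypothetical names (inlineCapacity, likelyHeapAllocated), based on the fact that SSO strings report a small, constant capacity:

```cpp
#include <cstddef>
#include <string>

// Capacity of a default-constructed string reveals the inline (SSO) buffer
// size: typically around 15-22 bytes on mainstream standard libraries.
inline std::size_t inlineCapacity() {
  return std::string().capacity();
}

// Heuristic: a string whose capacity exceeds the inline buffer must have
// allocated its storage on the heap.
inline bool likelyHeapAllocated(const std::string& s) {
  return s.capacity() > inlineCapacity();
}
```

With these helpers, likelyHeapAllocated(std::string(256, 'A')) comes out true on mainstream implementations, while a five-character string stays inline.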
Use createFromAscii(...) When Data Is ASCII
If the string contains only ASCII characters, jsi::String::createFromAscii(...) is usually a better fit than createFromUtf8(...).
ASCII is a strict subset of UTF-8, but the ASCII-specific constructor communicates a stronger invariant and may avoid some UTF-8 handling work. So if your native code already knows the data is ASCII-only, it is worth making that explicit.
In this example:
return String::createFromAscii(rt, buf, 256);
the buffer does not need to be null-terminated, because the length is provided explicitly. That is an important detail: when the length is known, passing it directly avoids the need to scan for '\0'.
So if the size is known and the resulting string is ASCII-only, this pattern is usually preferable.
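If the native code cannot guarantee ASCII-only content, a cheap guard can decide which constructor to call. The isAscii helper below is a hypothetical sketch, not part of JSI:

```cpp
#include <algorithm>
#include <cstddef>

// Returns true if every byte is 7-bit ASCII. In JSI code, this could gate
// the choice between String::createFromAscii(...) (when true) and
// String::createFromUtf8(...) (when false).
inline bool isAscii(const char* data, std::size_t length) {
  return std::all_of(data, data + length, [](char c) {
    return static_cast<unsigned char>(c) < 0x80;
  });
}
```

The scan is a single linear pass over the bytes, so for hot paths it only pays off when the ASCII fast path is taken most of the time; if the encoding is known statically, skip the check entirely.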
Use createFromUtf16(...) When Data Is Already UTF-16
If your native code already holds the string in UTF-16 form, String::createFromUtf16(...) can also be a better choice.
In that case, you may avoid converting the data to UTF-8 first and then making the JS runtime decode it again. So in practice, the best constructor is often the one that matches the representation you already have in memory:
- ASCII-only data -> createFromAscii(...)
- UTF-8 data -> createFromUtf8(...)
- UTF-16 data -> createFromUtf16(...)
A Smaller but Useful Optimization
Let us look at one more small optimization example.
Suppose we have a function that performs some work and creates a new string:
std::string hexdigest() {
  static const char hex[] = "0123456789abcdef";
  char buf[33];
  for (int i = 0; i < 16; i++) {
    buf[i * 2] = hex[(i >> 4) & 0xF];
    buf[i * 2 + 1] = hex[i & 0xF];
  }
  buf[32] = 0;
  return std::string(buf);
}
...
std::string res = hexdigest();
return String::createFromAscii(rt, res);
We can try a small optimization and then look at the result:
void hexdigest_char(char* buf) {
  static const char hex[] = "0123456789abcdef";
  for (int i = 0; i < 16; i++) {
    buf[i * 2] = hex[(i >> 4) & 0xF];
    buf[i * 2 + 1] = hex[i & 0xF];
  }
  buf[32] = 0;
}
...
char buf[33];
hexdigest_char(buf);
return String::createFromAscii(rt, buf, 32);
Benchmark (1,000,000 iterations)
hexdigest: 60.36ms
hexdigest_char: 45.95ms
There is still a measurable improvement here: about 1.3x.
What Changed Under the Hood
The optimized version avoids:
- creating a temporary std::string
- copying data from the local buffer into that string
- extra heap-related work once the returned string leaves the local scope
This is not a huge architectural optimization like HostFunction vs HostObject, but in hot code it can still be worth it.
Reduce JS ↔ C++ Boundary Crossings
Even when the native implementation is already optimized, too many crossings between JS and C++ can still dominate the total runtime.
Every call usually includes:
- entering the native runtime
- reading and validating arguments
- converting values between JS and C++ representations
- wrapping the return value back to JS
So in practice, many tiny calls are often worse than fewer larger calls, even if each individual call is fast.
That means after applying low-level optimizations, the next thing worth looking at is often API shape:
- can several calls be merged into one?
- can work be batched?
- can intermediate JS-visible objects be avoided?
This is not a JSI-specific trick - it is just one of the main cost drivers in real systems.
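The batching idea can be sketched in plain C++. processOne, processBatch, and sumViaManyCalls are hypothetical names; in a real module, each processOne call would additionally pay the JS -> C++ crossing cost described above, which the batched shape pays only once:

```cpp
#include <vector>

// Fine-grained API shape: one small piece of work per call.
int processOne(int x) {
  return x * 2;
}

// Batched API shape: the same work, but crossing the boundary once.
int processBatch(const std::vector<int>& xs) {
  int total = 0;
  for (int x : xs) total += x * 2;
  return total;
}

// Equivalent result via many small calls; in JSI, every iteration here
// would be a separate JS -> C++ round trip.
int sumViaManyCalls(const std::vector<int>& xs) {
  int total = 0;
  for (int x : xs) total += processOne(x);
  return total;
}
```

Both shapes compute the same result; the difference is how many times the fixed per-call overhead is paid, which is exactly what dominates when the per-item work is tiny.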
Final Thoughts: Optimize the Right Things
JSI is already fast, but maximum performance comes from good architecture and smart memory choices.
The biggest wins usually come from simple decisions:
- prefer HostFunction over HostObject when state is not needed
- prefer Object + NativeState over HostObject when native state is needed
- prefer stack buffers when size is known
- avoid unnecessary temporary allocations
- reduce JS ↔ C++ crossings in hot paths
If your code runs occasionally, these optimizations may not matter.
But if your code sits on a hot path, small decisions like these can make a very noticeable difference.
If you want to avoid dealing with many of these low-level JSI optimizations manually, Nitro Modules can give you many of them out of the box.
