Choosing the wrong data representation in a JSI module can make your code 30x slower — with no algorithmic mistake, just the wrong shape.
In the first part we covered HostFunction vs HostObject, NativeState, and stack allocation. This post goes one level deeper: how data crosses from C++ to JS, how the API contract between layers is designed, and how strings are built on the native side. Each section has benchmarks.
All benchmarks were run in a release build with Hermes. Each test executes the function 100,000 times. Results are averaged across 5 runs.
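As a reference, the timing loop behind numbers like these can be sketched with std::chrono. This is a minimal harness, not the exact one used for the measurements, and benchmarkMs is a hypothetical helper name:

```cpp
#include <chrono>

// Run `fn` `iterations` times and return the elapsed wall-clock time
// in milliseconds, using a monotonic clock.
template <typename Fn>
double benchmarkMs(Fn&& fn, int iterations = 100000) {
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    fn();
  }
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}
```

In a real setup, you would run the whole harness several times and average the results, as described above.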
Data shape: from objects to raw memory
Native code often needs to return data back to JavaScript. There are several ways to do that, and the difference between a slow representation and a fast one can be significant.
Imagine a function that runs for every camera frame, detects something in the image, and returns a list of points. For example, these could be face landmarks that we later render on the JS side.
struct Point {
int32_t x, y;
};
Since there can be many points, we will represent them as a vector:
std::vector<Point> points;
Let's write a small helper that generates test points:
constexpr size_t pointsCount = 50;
auto createPoints = []() -> std::vector<Point> {
  std::vector<Point> points;
  points.reserve(pointsCount);
  for (size_t i = 0; i < pointsCount; ++i) {
    int32_t v = static_cast<int32_t>(i) * 2;
    points.push_back({v, v + 1});
  }
  return points;
};
Now we need to return this data to JS. The first solution that usually comes to mind is an array of objects, so JS receives Array<{x:number,y:number}>.
Array of Objects
Function process = Function::createFromHostFunction(..., [](...) {
  std::vector<Point> points = createPoints();
  auto size = points.size();
  jsi::Array jsPoints(rt, size);
  int index = 0;
  for (auto& [x, y] : points) {
    jsi::Object obj(rt);
    obj.setProperty(rt, "x", x);
    obj.setProperty(rt, "y", y);
    jsPoints.setValueAtIndex(rt, index, std::move(obj));
    index++;
  }
  return jsPoints;
});
This looks nice. On the JS side, the data is very easy to work with:
points.forEach(({x, y}) => {
  // Do something
})
Let's benchmark this code.
Benchmark (100,000 calls)
Array of Objects: 1036.50ms
This is not optimal. Every jsi::Object is a separate allocation on the JS heap, and every setProperty call crosses into the JSI layer. With 50 points, that means 100 property writes plus 50 object allocations that the GC has to track. The main cost is not the loop itself, but these repeated boundary crossings and small allocations. Let's remove that overhead.
Flat Array
Instead of creating objects, we can flatten the array.
The data will look like this:
[x, y, x, y, x, y, ...]
To produce this shape, we only need to change the JSI function a little:
std::vector<Point> points = createPoints();
auto size = points.size();
jsi::Array jsPoints(rt, size * 2);
int index = 0;
for (auto& [x, y] : points) {
  jsPoints.setValueAtIndex(rt, index, x);
  jsPoints.setValueAtIndex(rt, index + 1, y);
  index += 2;
}
return jsPoints;
On the JS side, access becomes slightly less convenient:
for (let i = 0; i < points.length / 2; i++) {
  const index = i * 2;
  const x = points[index];
  const y = points[index + 1];
}
Let's see how much faster this is:
Benchmark (100,000 calls)
Flat Array: 311.78ms // 3.3x Faster
That is already a strong result: the function became 3.3x faster. Since we are only returning numbers, we can push this further.
ArrayBuffer
We can avoid creating a JS array and JS objects altogether. Instead of building a JavaScript structure element by element from C++, we hand JavaScript a direct pointer to our native memory. No per-element JSI calls. No GC pressure from individual values. No second allocation. Just one contiguous block of bytes that both runtimes share.
JSI's MutableBuffer is a C++ interface that lets native code own the underlying memory. The resulting ArrayBuffer on the JS side is just a view into that same allocation — the data never gets copied at all.
The MutableBuffer contract
jsi::MutableBuffer is a pure virtual base class with a virtual destructor and two methods:
class MutableBuffer {
 public:
  virtual ~MutableBuffer();
  virtual size_t size() const = 0;
  virtual uint8_t* data() = 0;
};
You subclass it, back it with any storage you want (a std::vector, a malloc'd block, a memory-mapped file), and JSI exposes that memory as an ArrayBuffer in JavaScript with zero intermediate copies.
Here is our implementation:
class PointsBuffer : public jsi::MutableBuffer {
  std::vector<Point> points_;

 public:
  explicit PointsBuffer(std::vector<Point>&& points)
      : points_(std::move(points)) {}

  size_t size() const override {
    return points_.size() * sizeof(Point);
  }

  uint8_t* data() override {
    return reinterpret_cast<uint8_t*>(points_.data());
  }
};
The key detail: points_ is moved in, not copied. The vector allocates once during createPoints(). data() returns that same pointer — no second allocation ever happens.
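We can check the no-copy claim directly: moving a std::vector transfers ownership of its heap buffer, so the data pointer stays the same before and after the move. A small standalone sketch (moveKeepsBuffer is an illustrative name):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

struct Point { int32_t x, y; };

// Returns true if moving the vector kept the original heap buffer
// alive at the same address, i.e. no second allocation happened.
bool moveKeepsBuffer() {
  std::vector<Point> points;
  points.reserve(50);
  for (int32_t i = 0; i < 50; ++i) {
    points.push_back({i, i + 1});
  }
  const Point* before = points.data();
  std::vector<Point> moved = std::move(points); // transfers the buffer
  return moved.data() == before;
}
```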
The JSI function itself becomes trivial:
auto process = Function::createFromHostFunction(..., [](...) {
  std::vector<Point> points = createPoints();
  auto buffer = std::make_shared<PointsBuffer>(std::move(points));
  return jsi::ArrayBuffer(rt, std::move(buffer));
});
Reading it on the JS side
Our Point struct is two int32_t fields — 8 bytes per point, packed tightly. We read it with Int32Array, which is a typed view into the raw bytes with no additional boxing or copying:
const buffer = process(); // returns ArrayBuffer
const view = new Int32Array(buffer);

for (let i = 0; i < view.length / 2; i++) {
  const x = view[i * 2];
  const y = view[i * 2 + 1];
}
The typed array type must match the C++ layout exactly. int32_t → Int32Array, float → Float32Array, double → Float64Array. If your struct has padding or mixed types, account for it in the byte offsets or use a DataView for precision.
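To see why layout matters, compare the tightly packed struct with one that contains compiler-inserted padding. The static_asserts below hold on typical 64-bit ABIs (x86-64, ARM64), where double is 8-byte aligned; Sample is a hypothetical struct for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// Tightly packed: two int32_t fields, 8 bytes total,
// readable from JS with a plain Int32Array.
struct Point { int32_t x, y; };
static_assert(sizeof(Point) == 8, "Point must be tightly packed");

// Mixed types introduce padding: the compiler inserts 4 bytes after
// `id` so that `value` is 8-byte aligned. A flat Int32Array or
// Float64Array view would misread this layout; use DataView with
// explicit byte offsets instead.
struct Sample {
  int32_t id;   // offset 0
  // 4 bytes of padding here on typical ABIs
  double value; // offset 8, not 4
};
static_assert(sizeof(Sample) == 16, "padding changes the byte layout");
static_assert(offsetof(Sample, value) == 8, "value is 8-byte aligned");
```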
Benchmark (100,000 calls)
ArrayBuffer: 34.81ms // 29x faster than Array of Objects
That is a huge improvement: ~30x faster than Array<Object> and ~9x faster than Flat Array.
In workloads like camera processing, this kind of representation change can be the difference between staying within the frame budget and dropping frames.
Data shape summary
The difference between approaches is not incremental, it is an order of magnitude: over 100,000 calls, Array of Objects took 1036.50ms, Flat Array 311.78ms, and ArrayBuffer 34.81ms.
API shape: strings vs numbers
Now let's talk about passing parameters into a native function and branching based on those parameters.
Suppose we have a function that accepts a type parameter representing a data format.
I have seen code shaped like this in a real production repository. The example below is simplified and slightly changed for clarity, but the idea is the same:
enum class NativeType {
  NONE = 0,
  ST_1 = 101,
  ST_2 = 151,
  ST_3 = 180,
  ST_4 = 190,
  ST_5 = 221,
  ST_6 = 230,
  ST_7 = 252
};
...
const std::string type = args[0].asString(rt).utf8(rt);
auto code = NativeType::NONE;
if (type == "uint8") {
  code = NativeType::ST_1;
} else if (type == "uint16") {
  code = NativeType::ST_2;
} else if (type == "int8") {
  code = NativeType::ST_3;
} else if (type == "int16") {
  code = NativeType::ST_4;
} else if (type == "int32") {
  code = NativeType::ST_5;
} else if (type == "float32") {
  code = NativeType::ST_6;
} else if (type == "float64") {
  code = NativeType::ST_7;
}
...
This function receives type as a JS string and maps it to a native numeric type on the JSI side. This approach has two sources of overhead: asString(rt).utf8(rt) allocates on the heap and copies data through the JSI boundary, and the following string comparisons add more work on every call.
Let's benchmark it:
Benchmark (100,000 calls)
Branch string: 12.88ms
The absolute cost here is small, but the pattern matters on paths called thousands of times per second, for example when processing a stream of sensor events or a high-frequency callback from a native module.
Looking at the code, we can pass a numeric value from JS to JSI and keep the same behavior.
JS:
enum Types {
  uint8 = 0,
  uint16 = 1,
  int8 = 2,
  int16 = 3,
  int32 = 4,
  float32 = 5,
  float64 = 6,
}
JSI:
...
const int type = static_cast<int>(args[0].asNumber());
auto code = NativeType::NONE;
switch (type) {
  case 0:
    code = NativeType::ST_1;
    break;
  case 1:
    code = NativeType::ST_2;
    break;
  case 2:
    code = NativeType::ST_3;
    break;
  case 3:
    code = NativeType::ST_4;
    break;
  case 4:
    code = NativeType::ST_5;
    break;
  case 5:
    code = NativeType::ST_6;
    break;
  case 6:
    code = NativeType::ST_7;
    break;
  default:
    break;
}
...
The code is similar, but it is already more efficient: we pass a number instead of a string, and there are no string comparisons.
Benchmark (100,000 calls)
switch int: 9.13ms
The result is a little better. But if we look at the switch, it is effectively an indexed lookup. We can make the code cleaner and remove the branch entirely:
static constexpr std::array<NativeType, 7> codes{
    NativeType::ST_1,
    NativeType::ST_2,
    NativeType::ST_3,
    NativeType::ST_4,
    NativeType::ST_5,
    NativeType::ST_6,
    NativeType::ST_7};
...
const int type = static_cast<int>(args[0].asNumber());
const NativeType code = codes[type];
In this example, we removed the if-else chain and the switch, created a std::array, and read the value by index.
This assumes that type is always a valid index. In production code, validate the input before indexing into the array. assert(type >= 0 && type < codes.size()) is useful in debug builds, but release builds should still guard against invalid JS input if it can happen.
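A bounds-checked version of the lookup might look like the sketch below; codeForType is a hypothetical helper name, and invalid indices fall back to NONE instead of reading out of bounds:

```cpp
#include <array>
#include <cstddef>

enum class NativeType {
  NONE = 0, ST_1 = 101, ST_2 = 151, ST_3 = 180,
  ST_4 = 190, ST_5 = 221, ST_6 = 230, ST_7 = 252
};

constexpr std::array<NativeType, 7> codes{
    NativeType::ST_1, NativeType::ST_2, NativeType::ST_3,
    NativeType::ST_4, NativeType::ST_5, NativeType::ST_6,
    NativeType::ST_7};

// Guarded lookup: reject indices outside [0, codes.size()).
NativeType codeForType(int type) {
  if (type < 0 || static_cast<size_t>(type) >= codes.size()) {
    return NativeType::NONE;
  }
  return codes[static_cast<size_t>(type)];
}
```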
You could pass the final native value directly from JS and avoid this mapping.
In practice, the native side often needs to initialize multiple internal values based on the input (enums, modes, flags, etc). Using an index + lookup table keeps the JS API minimal while resolving everything in one place on the native side.
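For illustration, one index can resolve several native properties at once through a table of structs. The field names and values below are hypothetical, not from a real API:

```cpp
#include <array>
#include <cstdint>

// One JS-side index maps to multiple native values in a single lookup.
struct TypeInfo {
  int code;         // internal native enum value
  uint8_t byteSize; // element size in bytes for this format
};

constexpr std::array<TypeInfo, 7> typeTable{{
    {101, 1}, // uint8
    {151, 2}, // uint16
    {180, 1}, // int8
    {190, 2}, // int16
    {221, 4}, // int32
    {230, 4}, // float32
    {252, 8}, // float64
}};
```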
Benchmark (100,000 calls)
lookup table: 8.67ms
This version is the fastest and the cleanest.
If your API is called frequently, even small costs like string conversion (asString(rt).utf8(rt)) become visible. Prefer numeric contracts between JS and native.
API shape summary
If your branches only initialize values based on a type passed from JS, consider using switch + int or a lookup table instead of strings.
String building: convenience vs performance
Suppose we are writing a function that requests data from the GitHub REST API. We need to fetch the latest release tag for a repository, so we need two parameters: USER_NAME and REPO_NAME.
The URL looks like this: /repos/{USER_NAME}/{REPO_NAME}/releases/latest
Now suppose we store many USER_NAME/REPO_NAME pairs and make requests for several repositories. We need to build the URL dynamically based on the input.
Let's look at a few options. The first and most convenient one is std::format.
std::format
auto url = std::format("/repos/{}/{}/releases/latest", user_name, repo_name);
The benchmark shows this result:
Benchmark (100,000 calls)
std::format: 21.35ms
This version is the most convenient, but it is not ideal on a hot path. std::format has to process the format string, perform type dispatch, and build the result through formatting machinery that is heavier than simple concatenation. For code that runs very frequently, a more explicit approach can be faster.
std::string concatenation
std::string url;
url.reserve(7 + user_name.size() + 1 + repo_name.size() + 16);
url += "/repos/";
url += user_name;
url += '/';
url += repo_name;
url += "/releases/latest";
This code is a little more verbose. Let's look at the benchmark:
Benchmark (100,000 calls)
std::string concatenation: 12.38ms
That is better: this version is 1.7x faster.
There is one more implementation worth looking at.
Char buffer
...
// Buffer must be large enough to hold the full URL + null terminator.
// Validate total length in production to avoid overflow.
char buffer[128];
char* ptr = buffer;
auto user_name_size = user_name.size();
auto repo_name_size = repo_name.size();
std::memcpy(ptr, "/repos/", 7);
ptr += 7;
std::memcpy(ptr, user_name.data(), user_name_size);
ptr += user_name_size;
std::memcpy(ptr, "/", 1);
ptr += 1;
std::memcpy(ptr, repo_name.data(), repo_name_size);
ptr += repo_name_size;
std::memcpy(ptr, "/releases/latest", 16);
ptr += 16;
*ptr = '\0';
...
In this version, we allocate a fixed-size buffer on the stack, copy each part of the URL into it, move the pointer after each copy, and finish with \0 to mark the end of the string.
This approach is only safe when the maximum input size is known and validated. If user_name or repo_name can be arbitrary, check the total length before copying or use a bounded API. The benchmark is meant to show the cost of string construction when the size is controlled.
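As a sketch of what that validation could look like, here is a bounded variant of the snippet above. buildUrl is a hypothetical helper; the 128-byte size mirrors the buffer used earlier:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Bounded variant: reject inputs that would not fit into the stack
// buffer instead of overflowing it. Returns false on overflow.
bool buildUrl(const std::string& user_name, const std::string& repo_name,
              char (&buffer)[128]) {
  constexpr size_t kPrefix = 7;  // "/repos/"
  constexpr size_t kSuffix = 16; // "/releases/latest"
  const size_t total =
      kPrefix + user_name.size() + 1 + repo_name.size() + kSuffix;
  if (total + 1 > sizeof(buffer)) { // +1 for the null terminator
    return false;
  }
  char* ptr = buffer;
  std::memcpy(ptr, "/repos/", kPrefix);
  ptr += kPrefix;
  std::memcpy(ptr, user_name.data(), user_name.size());
  ptr += user_name.size();
  *ptr++ = '/';
  std::memcpy(ptr, repo_name.data(), repo_name.size());
  ptr += repo_name.size();
  std::memcpy(ptr, "/releases/latest", kSuffix);
  ptr += kSuffix;
  *ptr = '\0';
  return true;
}
```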
Let's benchmark this implementation:
Benchmark (100,000 calls)
char buffer: 7.78ms
This version is 1.6x faster than std::string concatenation and 2.7x faster than std::format.
char buffers are fastest, but they trade safety for performance. In most code paths, std::string with reserve is a better balance.
String building summary
Number to string conversion
There is one more small detail that is easy to miss when building strings on a hot path: converting numbers to text.
A common approach is to use std::to_string:
const int value = static_cast<int>(args[0].asNumber());
std::string res;
res.reserve(64);
res.append(std::to_string(value));
res.append(":");
res.append(std::to_string(value));
This code is simple and readable, but std::to_string returns a new std::string for every conversion. Depending on the standard library implementation and the number size, this can allocate and create temporary objects that are then copied or moved into the final string. In this example, we create two temporary strings and then append them into the final result.
Benchmark (100,000 calls)
std::to_string: 16.27ms
For most code, this is perfectly fine. But if this runs very frequently, those temporary allocations can become visible.
A lower-level alternative is std::to_chars (C++17):
const int value = static_cast<int>(args[0].asNumber());
char buffer[64];
char* ptr = buffer;
char* const end = buffer + sizeof(buffer);
auto [p, ec] = std::to_chars(ptr, end, value);
ptr = p;
*ptr++ = ':';
auto [p2, ec2] = std::to_chars(ptr, end, value);
ptr = p2;
*ptr = '\0';
std::to_chars writes directly into the provided buffer. It does not allocate, does not create temporary strings, and does not depend on locale formatting. This makes it a good fit for hot paths where the output format is simple and the maximum size is known.
Typical use cases include IDs, counters, timestamps, offsets, sizes, and compact protocol strings.
When using std::to_chars, always check the returned error code in production code. If the buffer is too small, ec will be set to std::errc::value_too_large, and the output should not be used.
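A minimal error-checked wrapper might look like this sketch; formatValue is an illustrative helper name that returns an empty view when the buffer is too small:

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Convert `value` into [first, last). On success, return a view over
// the written characters; on std::errc::value_too_large, return empty.
std::string_view formatValue(int value, char* first, char* last) {
  auto [ptr, ec] = std::to_chars(first, last, value);
  if (ec != std::errc{}) {
    return {};
  }
  return {first, static_cast<size_t>(ptr - first)};
}
```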
Benchmark (100,000 calls)
std::to_chars: 9.75ms // 1.7x Faster
That is 1.7x faster with no changes to the output format.
The trade-off is the same as with manual char buffers: the code is faster, but also more explicit and easier to get wrong. For regular application code, std::to_string is usually the better default. For tight native loops, std::to_chars can avoid unnecessary allocations.
Number conversion summary
Final thoughts
The topics in this post (data shape, API shape, string building, and number conversion) can have a large practical impact once the broader architectural decisions from the first part are in place.
The main takeaways:
- if you need to pass numeric arrays from C++ to JS, use ArrayBuffer or MutableBuffer instead of an array of objects
- if a function accepts a type parameter, pass a numeric index instead of a string and consider using a lookup table
- if you build URLs or similar strings on a hot path, a stack char buffer can be much faster than std::format, as long as input size is controlled
- if you convert numbers to strings in a hot path, consider std::to_chars to avoid temporary string creation
These optimizations do not require changing the whole module architecture. They mostly come from choosing the right data representation.
The biggest performance wins in JSI often do not come from clever algorithms, but from choosing the right data representation and minimizing boundary crossings. In many cases, the difference between a naive and a well-shaped API is not 10–20%, but an order of magnitude.
