5.2 Strings
On Aleo there is no native string type available when writing programs. In this chapter we'll look at a few ways to build a custom representation of strings.
Naive approach
A naive approach is to represent each character as a u8 and each string as an array of those u8 characters.
[u8; 32] // 32 characters string
Here's how you could then write a transition that checks whether two 32-character strings are equal:
program strings.aleo {
  transition equals(str1: [u8; 32], str2: [u8; 32]) -> bool {
    let out: bool = true;
    for i: u16 in 0u16..32u16 {
      out &&= (str1[i] == str2[i]);
    }
    return out;
  }
}
Let's try it now:
leo run equals "[1u8, 2u8, 3u8, 4u8, 5u8, 6u8, 7u8, 8u8, 9u8, 10u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 1u8, 2u8, 3u8, 4u8, 5u8, 6u8, 7u8, 8u8, 9u8, 10u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8]" "[1u8, 2u8, 9u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8, 1u8, 2u8, 3u8, 4u8, 5u8, 6u8, 7u8, 8u8, 9u8, 10u8, 0u8, 0u8, 0u8, 0u8, 0u8, 0u8]"
As you can see, this program has 95 constraints:
• 'strings.aleo/equals' - 95 constraints (called 1 time)
Another problem is that arrays on Aleo are limited to 32 elements, so strings defined this way are limited to 32 characters.
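Typing 32 array elements by hand is error-prone, so it can help to generate the input for the command above. The following is a minimal sketch, not part of the original chapter: `stringToU8ArrayLiteral` is a hypothetical helper name, and it assumes unused slots are padded with `0u8`, as in the example inputs.

```typescript
// Hypothetical helper: turns a string into a zero-padded Leo array literal
// of `len` u8 values, one element per UTF-8 byte.
function stringToU8ArrayLiteral(input: string, len: number = 32): string {
  const bytes = new TextEncoder().encode(input);
  if (bytes.length > len) {
    throw new Error(`String needs ${bytes.length} bytes, but only ${len} fit.`);
  }
  // A new Uint8Array is zero-initialised, so unused slots stay 0.
  const padded = new Uint8Array(len);
  padded.set(bytes);
  return "[" + Array.from(padded, (b) => `${b}u8`).join(", ") + "]";
}

console.log(stringToU8ArrayLiteral("hi", 4)); // [104u8, 105u8, 0u8, 0u8]
```

The length check makes the 32-element limit explicit: anything longer is rejected before it reaches `leo run`.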
Less (but still) naive approach
Let's now represent our strings using arrays of u128 elements. Because each character needs only 8 bits, each u128 can hold 128/8 = 16 characters.
[u128; 2] // 32 characters string
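To make the packing concrete, here is a small sketch of the arithmetic. `packBytes` is a hypothetical name introduced for illustration; it places the first byte in the most significant position, which is the same byte order the encoding helpers in this chapter use.

```typescript
// Hypothetical helper: packs up to 16 bytes into a single integer that fits
// in a u128. Each step shifts the running value left by 8 bits and appends
// the next byte, so the first byte ends up most significant.
function packBytes(bytes: number[]): bigint {
  if (bytes.length > 16) {
    throw new Error("A u128 holds at most 16 bytes.");
  }
  let value = BigInt(0);
  for (const b of bytes) {
    value = (value << BigInt(8)) | BigInt(b);
  }
  return value;
}

console.log(packBytes([4, 3, 2, 1])); // 67305985n, i.e. 0x04030201
console.log(packBytes([2, 1]));       // 513n, i.e. 0x0201
```

These are the values used in the `leo run` example of this section: `67305985u128` carries the bytes 4, 3, 2, 1 and `513u128` carries the bytes 2, 1.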
The previous function is now:
program strings.aleo {
  transition equals(str1: [u128; 2], str2: [u128; 2]) -> bool {
    let out: bool = true;
    for i: u16 in 0u16..2u16 {
      out &&= (str1[i] == str2[i]);
    }
    return out;
  }
}
And when we run this:
leo run equals "[67305985u128, 513u128]" "[513u128, 513u128]"
Here the number of constraints is much smaller than before:
• 'strings.aleo/equals' - 5 constraints (called 1 time)
Here are functions you can use in TypeScript to encode and decode such strings to and from u128 arrays:
// Packs the UTF-8 bytes of a string into one integer (first byte most significant).
function stringToBigInt(input: string): bigint {
  const encoder = new TextEncoder();
  const encodedBytes = encoder.encode(input);
  encodedBytes.reverse();
  let bigIntValue = BigInt(0);
  for (let i = 0; i < encodedBytes.length; i++) {
    const byteValue = BigInt(encodedBytes[i]);
    const shiftedValue = byteValue << BigInt(8 * i);
    bigIntValue = bigIntValue | shiftedValue;
  }
  return bigIntValue;
}

// Reverses stringToBigInt: extracts bytes from least to most significant, then decodes them.
function bigIntToString(bigIntValue: bigint): string {
  const bytes: number[] = [];
  let tempBigInt = bigIntValue;
  while (tempBigInt > BigInt(0)) {
    const byteValue = Number(tempBigInt & BigInt(255));
    bytes.push(byteValue);
    tempBigInt = tempBigInt >> BigInt(8);
  }
  bytes.reverse();
  const decoder = new TextDecoder();
  const asciiString = decoder.decode(Uint8Array.from(bytes));
  return asciiString;
}

// Splits a string into 16-character chunks, one per u128.
// Assumes single-byte (ASCII) characters: a chunk containing multi-byte
// characters can exceed 16 bytes and overflow a u128.
function splitStringToBigInts(input: string): bigint[] {
  const chunkSize = 16; // Chunk size to split the string
  const numChunks = Math.ceil(input.length / chunkSize);
  const bigInts: bigint[] = [];
  for (let i = 0; i < numChunks; i++) {
    // slice is used instead of the deprecated substr
    const chunk = input.slice(i * chunkSize, (i + 1) * chunkSize);
    const bigIntValue = stringToBigInt(chunk);
    bigInts.push(bigIntValue);
  }
  return bigInts;
}

// Decodes each u128 chunk and concatenates the results.
function joinBigIntsToString(bigInts: bigint[]): string {
  let result = '';
  for (let i = 0; i < bigInts.length; i++) {
    const chunkString = bigIntToString(bigInts[i]);
    result += chunkString;
  }
  return result;
}
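The helpers above chunk by character count, which matches the byte count only for ASCII text. As an alternative sketch (not the original author's code), the variant below chunks the UTF-8 bytes instead, pads the result to a fixed array length, and formats it as a Leo literal. The names `encodeU128Chunks`, `decodeU128Chunks` and `toU128Literal` are hypothetical, and the sketch assumes the string contains no NUL characters, since leading zero bytes of a chunk are not recoverable.

```typescript
const CHUNK_BYTES = 16; // one u128 holds 16 bytes

// Splits the UTF-8 bytes into 16-byte chunks and packs each chunk
// with its first byte most significant.
function encodeU128Chunks(input: string, numChunks: number): bigint[] {
  const bytes = new TextEncoder().encode(input);
  if (bytes.length > numChunks * CHUNK_BYTES) {
    throw new Error("String does not fit in the requested number of u128 values.");
  }
  const chunks: bigint[] = [];
  for (let c = 0; c < numChunks; c++) {
    let value = BigInt(0);
    const end = Math.min((c + 1) * CHUNK_BYTES, bytes.length);
    for (let i = c * CHUNK_BYTES; i < end; i++) {
      value = (value << BigInt(8)) | BigInt(bytes[i]);
    }
    chunks.push(value); // chunks past the end of the string stay 0
  }
  return chunks;
}

// Unpacks every chunk back into bytes, then decodes all bytes in one go,
// so a multi-byte character split across two chunks is restored correctly.
function decodeU128Chunks(chunks: bigint[]): string {
  const bytes: number[] = [];
  for (const chunk of chunks) {
    const chunkBytes: number[] = [];
    for (let v = chunk; v > BigInt(0); v >>= BigInt(8)) {
      chunkBytes.unshift(Number(v & BigInt(255)));
    }
    bytes.push(...chunkBytes);
  }
  return new TextDecoder().decode(Uint8Array.from(bytes));
}

// Formats the chunks as a Leo array literal for the command line.
function toU128Literal(chunks: bigint[]): string {
  return "[" + chunks.map((c) => `${c}u128`).join(", ") + "]";
}

console.log(toU128Literal(encodeU128Chunks("ab", 2))); // [24930u128, 0u128]
```

For ASCII input this produces the same values as `splitStringToBigInts`. The difference is that over-long input is rejected up front and the array always has the length the transition expects.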
Efficient approach
The best way to fit the maximum amount of information with the fewest constraints is to use arrays of field elements instead of u128.
program strings.aleo {
  transition equals(str1: [field; 2], str2: [field; 2]) -> bool {
    let out: bool = true;
    for i: u16 in 0u16..2u16 {
      out &&= (str1[i] == str2[i]);
    }
    return out;
  }
}
And when we run that transition:
leo run equals "[67305985field, 513field]" "[513field, 513field]"
As you can see, the program still has just 5 constraints:
• 'strings.aleo/equals' - 5 constraints (called 1 time)
The difference is capacity. A field element can take slightly fewer than 2^253 distinct values, so each element of a field array holds a little over 252 bits. This means roughly 31.5 characters are available for each field in the array, compared with 16 for each u128.
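The capacity figure can be checked with a few lines of arithmetic. This is a rough sketch under one assumption: the string is packed into a single integer and split into digits in base `FIELD_BASE`, the same value this chapter's helpers use as `FIELD_MODULUS`. The names `FIELD_BASE` and `maxBytes` are hypothetical.

```typescript
// Same value as the FIELD_MODULUS constant used by the helpers in this chapter.
const FIELD_BASE = BigInt(
  "8444461749428370424248824938781546531375899335154063827935233455917409239040"
);

// Returns the largest k such that every k-byte value fits into `numFields`
// base-FIELD_BASE digits, i.e. 256^k <= FIELD_BASE^numFields.
function maxBytes(numFields: number): number {
  let capacity = BigInt(1);
  for (let i = 0; i < numFields; i++) {
    capacity *= FIELD_BASE;
  }
  let k = 0;
  let span = BigInt(256); // number of distinct (k + 1)-byte values
  while (span <= capacity) {
    k++;
    span *= BigInt(256);
  }
  return k;
}

console.log(FIELD_BASE.toString(2).length); // 253 (bit length of the base)
console.log(maxBytes(1)); // 31
console.log(maxBytes(2)); // 63
console.log(maxBytes(4)); // 126
```

Because the whole string is treated as one large number, the fractional capacity of each field adds up: two fields hold 63 bytes rather than 2 × 31 = 62.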
Here are functions you can use in TypeScript to encode and decode such strings to and from field arrays:
// Base used to split the packed integer into field-sized digits. This value is the
// largest valid field element (the prime modulus minus 1), so every digit produced
// below is a valid field. Encoder and decoder must use the same value.
const FIELD_MODULUS = 8444461749428370424248824938781546531375899335154063827935233455917409239040n;

// Packs the UTF-8 bytes of a string into one integer (first byte most significant).
function stringToBigInt(input: string): bigint {
  const encoder = new TextEncoder();
  const encodedBytes = encoder.encode(input);
  encodedBytes.reverse();
  let bigIntValue = BigInt(0);
  for (let i = 0; i < encodedBytes.length; i++) {
    const byteValue = BigInt(encodedBytes[i]);
    const shiftedValue = byteValue << BigInt(8 * i);
    bigIntValue = bigIntValue | shiftedValue;
  }
  return bigIntValue;
}

// Reverses stringToBigInt: extracts bytes from least to most significant, then decodes them.
function bigIntToString(bigIntValue: bigint): string {
  const bytes: number[] = [];
  let tempBigInt = bigIntValue;
  while (tempBigInt > BigInt(0)) {
    const byteValue = Number(tempBigInt & BigInt(255));
    bytes.push(byteValue);
    tempBigInt = tempBigInt >> BigInt(8);
  }
  bytes.reverse();
  const decoder = new TextDecoder();
  const asciiString = decoder.decode(Uint8Array.from(bytes));
  return asciiString;
}

// Writes the packed integer as numFieldElements digits in base FIELD_MODULUS,
// least significant digit first. Unused trailing elements are 0.
function stringToFields(input: string, numFieldElements = 4): bigint[] {
  const bigIntValue = stringToBigInt(input);
  const fieldElements: bigint[] = [];
  let remainingValue = bigIntValue;
  for (let i = 0; i < numFieldElements; i++) {
    const fieldElement = remainingValue % FIELD_MODULUS;
    fieldElements.push(fieldElement);
    remainingValue = remainingValue / FIELD_MODULUS;
  }
  if (remainingValue !== 0n) {
    throw new Error("String is too big to be encoded.");
  }
  return fieldElements;
}

// Recombines the digits into the packed integer and decodes it.
function fieldsToString(fields: bigint[]): string {
  let bigIntValue = BigInt(0);
  let multiplier = BigInt(1);
  for (const fieldElement of fields) {
    bigIntValue += fieldElement * multiplier;
    multiplier *= FIELD_MODULUS;
  }
  return bigIntToString(bigIntValue);
}
This last representation is the most efficient and should be preferred for most use cases.