KrkString Struct Reference

Immutable sequence of Unicode codepoints.

#include <object.h>

Inheritance diagram for KrkString:

Public Member Functions

KrkString * krk_takeString (char *chars, size_t length)
 Yield ownership of a C string to the GC and obtain a string object.

KrkString * krk_takeStringVetted (char *chars, size_t length, size_t codesLength, KrkStringType type, uint32_t hash)
 Like krk_takeString but for when the caller has already calculated code lengths, hash, and string type.

KrkString * krk_copyString (const char *chars, size_t length)
 Obtain a string object representation of the given C string.

void * krk_unicodeString (KrkString *string)
 Ensure that a codepoint representation of a string is available.

uint32_t krk_unicodeCodepoint (KrkString *string, size_t index)
 Obtain the codepoint at a given index in a string.

size_t krk_codepointToBytes (krk_integer_type value, unsigned char *out)
 Convert an integer codepoint to a UTF-8 byte representation.
 

Data Fields

size_t length
 String length in bytes.
 
size_t codesLength
 String length in Unicode codepoints.
 
char * chars
 Canonical UTF-8 data.
 
void * codes
 Codepoint data.
 
- Data Fields inherited from KrkObj
uint16_t type
 Tag indicating core type.
 
uint16_t flags
 General object flags, mostly related to garbage collection.
 
uint32_t hash
 Cached hash value for table keys.
 
struct KrkObj * next
 Invasive linked list of all objects in the VM.
 

Protected Attributes

KrkObj obj
 Base.
 

Detailed Description

Immutable sequence of Unicode codepoints.

Definition at line 93 of file object.h.

Member Function Documentation

◆ krk_codepointToBytes()

size_t krk_codepointToBytes ( krk_integer_type  value,
unsigned char *  out 
)

Convert an integer codepoint to a UTF-8 byte representation.

Converts a single codepoint to a sequence of bytes containing the UTF-8 representation. 'out' must be allocated by the caller.

Parameters
value    Codepoint to encode.
out      Array to write UTF-8 sequence into.
Returns
The length of the UTF-8 sequence, in bytes.

Definition at line 38 of file object.c.
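
To make the encoding step concrete, here is a minimal, self-contained sketch of a UTF-8 encoder. It is not the code from object.c: the name sketch_codepoint_to_utf8 is hypothetical, it takes a uint32_t rather than krk_integer_type, and it assumes the caller supplies a buffer of at least 4 bytes (the longest UTF-8 sequence). Returning 0 for values above U+10FFFF is a choice made for this sketch only, and surrogate values are not rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for krk_codepointToBytes: writes the UTF-8 form
 * of one codepoint into out and returns the number of bytes written,
 * or 0 if the value is above U+10FFFF (sketch-only convention). */
static size_t sketch_codepoint_to_utf8(uint32_t value, unsigned char *out) {
    assert(out != NULL); /* the caller must provide the buffer */
    if (value < 0x80) {
        out[0] = (unsigned char)value;
        return 1;
    }
    if (value < 0x800) {
        out[0] = (unsigned char)(0xC0 | (value >> 6));
        out[1] = (unsigned char)(0x80 | (value & 0x3F));
        return 2;
    }
    if (value < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (value >> 12));
        out[1] = (unsigned char)(0x80 | ((value >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (value & 0x3F));
        return 3;
    }
    if (value < 0x110000) {
        out[0] = (unsigned char)(0xF0 | (value >> 18));
        out[1] = (unsigned char)(0x80 | ((value >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((value >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (value & 0x3F));
        return 4;
    }
    return 0;
}
```

With a 4-byte buffer, U+20AC (the euro sign) encodes to the three bytes E2 82 AC, and the return value tells the caller how many bytes of the buffer are meaningful.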

◆ krk_copyString()

KrkString * krk_copyString ( const char *  chars,
size_t  length 
)

Obtain a string object representation of the given C string.

Converts the C string 'chars' into a string object by checking the string table for it. If the string table does not have an equivalent string, a new one will be created by copying 'chars'.

'chars' must be a NUL-terminated C string representing a UTF-8 character sequence.

Parameters
chars    C string to convert to a string object.
length   Length of the C string.
Returns
A string object.

Definition at line 224 of file object.c.
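
The lookup-then-copy behaviour described above can be illustrated with a small, self-contained sketch. Everything here is hypothetical: SketchString, sketch_table and sketch_copy_string are illustration names, and a linked list stands in for the string table purely for brevity; the real table's layout is not described on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature string object; not Kuroko's KrkString. */
typedef struct SketchString {
    size_t length;             /* length in bytes */
    char *chars;               /* private NUL-terminated copy */
    struct SketchString *next; /* next entry in the sketch table */
} SketchString;

/* Hypothetical string table: a plain linked list of interned strings. */
static SketchString *sketch_table = NULL;

/* Return the interned string equal to chars[0..length). The bytes are
 * copied into a new entry only when no equal entry exists yet. */
static SketchString *sketch_copy_string(const char *chars, size_t length) {
    assert(chars != NULL);
    for (SketchString *s = sketch_table; s != NULL; s = s->next) {
        if (s->length == length && memcmp(s->chars, chars, length) == 0) {
            return s; /* already in the table: reuse it */
        }
    }
    SketchString *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->chars = malloc(length + 1);
    if (s->chars == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->chars, chars, length); /* the caller keeps its own buffer */
    s->chars[length] = '\0';
    s->length = length;
    s->next = sketch_table;
    sketch_table = s;
    return s;
}
```

Two calls with equal contents return the same pointer, and because the bytes are copied, the caller's buffer can be reused or freed afterwards.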

◆ krk_takeString()

KrkString * krk_takeString ( char *  chars,
size_t  length 
)

Yield ownership of a C string to the GC and obtain a string object.

Creates a string object represented by the characters in 'chars' and of length 'length'. The source string must be NUL-terminated and must remain valid for the lifetime of the object: its ownership is yielded to the GC, so the caller must not free it afterwards. Useful for strings which were allocated on the heap by other mechanisms.

'chars' must be a NUL-terminated C string representing a UTF-8 character sequence.

Parameters
chars    C string to take ownership of.
length   Length of the C string.
Returns
A string object.

Definition at line 208 of file object.c.

◆ krk_takeStringVetted()

KrkString * krk_takeStringVetted ( char *  chars,
size_t  length,
size_t  codesLength,
KrkStringType  type,
uint32_t  hash 
)

Like krk_takeString but for when the caller has already calculated code lengths, hash, and string type.

Creates a new string object in cases where the caller has already calculated codepoint length, expanded string type, and hash. Useful for functions that create strings from other KrkStrings, where it's easier to know these things without having to start from scratch.

Parameters
chars        C string to take ownership of.
length       Length of the C string.
codesLength  Length of the expected resulting KrkString in codepoints.
type         Compact type of the string, e.g. UCS1, UCS2, UCS4...
hash         Precalculated string hash.
See also
KrkStringType

Definition at line 241 of file object.c.
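
As a rough illustration of what "vetting" involves, the sketch below scans a UTF-8 buffer once and derives two of the values a caller would pass in: the codepoint count and the narrowest fixed width able to hold every codepoint. The names SketchVetting and sketch_vet_utf8 are hypothetical, the width field only approximates the role of KrkStringType (assuming 1, 2 and 4 bytes correspond to UCS1, UCS2 and UCS4), the input is assumed to be well-formed UTF-8, and the hash is left out because this page does not specify the hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    size_t codesLength; /* number of codepoints in the buffer */
    int width;          /* bytes needed per codepoint: 1, 2 or 4 */
} SketchVetting;

/* Scan well-formed UTF-8 once, counting codepoints and tracking the
 * widest codepoint seen. */
static SketchVetting sketch_vet_utf8(const char *chars, size_t length) {
    SketchVetting v = { 0, 1 };
    size_t i = 0;
    while (i < length) {
        unsigned char lead = (unsigned char)chars[i];
        uint32_t cp;
        size_t n;
        if (lead < 0x80)      { cp = lead;         n = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1Fu; n = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0Fu; n = 3; }
        else                  { cp = lead & 0x07u; n = 4; }
        assert(i + n <= length); /* input is assumed well-formed */
        for (size_t k = 1; k < n; k++) {
            cp = (cp << 6) | ((unsigned char)chars[i + k] & 0x3Fu);
        }
        if (cp > 0xFFFF) {
            v.width = 4;
        } else if (cp > 0xFF && v.width < 2) {
            v.width = 2;
        }
        v.codesLength++;
        i += n;
    }
    return v;
}
```

A function that concatenates or slices existing strings often knows these values already from its inputs, which is the situation the vetted variant is documented to serve.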

◆ krk_unicodeCodepoint()

uint32_t krk_unicodeCodepoint ( KrkString *  string,
size_t  index 
)

Obtain the codepoint at a given index in a string.

This is a convenience function which ensures that a Unicode codepoint representation has been generated and returns the codepoint value at the requested index. If you need to find multiple codepoints, it is recommended that you use the KRK_STRING_FAST macro after calling krk_unicodeString instead.

Note
This function does not perform any bounds checking.
Parameters
string   String to index into.
index    Offset of the codepoint to obtain.
Returns
Integer representation of the codepoint at the requested index.

Definition at line 162 of file object.c.

◆ krk_unicodeString()

void * krk_unicodeString ( KrkString *  string )

Ensure that a codepoint representation of a string is available.

Obtain an untyped pointer to the codepoint representation of a string. If the string does not have a codepoint representation allocated, it will be generated by this function and remain with the string for the duration of its lifetime.

Parameters
string   String to obtain the codepoint representation of.
Returns
A pointer to the bytes of the codepoint representation.

Definition at line 153 of file object.c.
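
The generate-once-then-reuse pattern described for krk_unicodeString and krk_unicodeCodepoint can be sketched as follows. This is a simplified, hypothetical model rather than the library's code: SketchLazyString, sketch_unicode_string and sketch_codepoint_at are illustration names, the codepoint array always uses 4-byte elements (the documented codes field is untyped), and the input is assumed to be well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified string with a lazily built codepoint array. */
typedef struct {
    size_t length;      /* length in bytes */
    size_t codesLength; /* length in codepoints */
    const char *chars;  /* UTF-8 data */
    uint32_t *codes;    /* NULL until first requested */
} SketchLazyString;

/* Build the codepoint array on first use and cache it on the string. */
static uint32_t *sketch_unicode_string(SketchLazyString *s) {
    if (s->codes != NULL) return s->codes; /* already generated */
    size_t count = s->codesLength ? s->codesLength : 1;
    uint32_t *codes = malloc(count * sizeof *codes);
    if (codes == NULL) return NULL;
    size_t i = 0, out = 0;
    while (i < s->length && out < s->codesLength) {
        unsigned char lead = (unsigned char)s->chars[i];
        uint32_t cp;
        size_t n;
        if (lead < 0x80)      { cp = lead;         n = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1Fu; n = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0Fu; n = 3; }
        else                  { cp = lead & 0x07u; n = 4; }
        assert(i + n <= s->length); /* input is assumed well-formed */
        for (size_t k = 1; k < n; k++) {
            cp = (cp << 6) | ((unsigned char)s->chars[i + k] & 0x3Fu);
        }
        codes[out++] = cp;
        i += n;
    }
    s->codes = codes;
    return codes;
}

/* Convenience lookup; like the documented function, it performs no
 * bounds checking on index. */
static uint32_t sketch_codepoint_at(SketchLazyString *s, size_t index) {
    uint32_t *codes = sketch_unicode_string(s);
    assert(codes != NULL);
    return codes[index];
}
```

When many codepoints are needed, calling the generator once and indexing the returned array directly avoids repeating the "is it generated yet" check on every lookup, which matches the recommendation to use krk_unicodeString followed by KRK_STRING_FAST.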


The documentation for this struct was generated from the following file: