===== Strings =====

Strings are enclosed in double quotes (" ") and are UTF-8 encoded. As with the stock 50g ROM, the ''→STR'' and ''STR→'' commands convert other objects to strings and vice versa. Strings can be concatenated with the plus (+) operator: ''"Hello " "World!" +'' yields the string ''"Hello World!"''.

Strings in newRPL are always Unicode NFC normalized for maximum compatibility with other devices. Strings imported from other devices should be NFC normalized for proper operation in newRPL. When the source device doesn't guarantee text in a normalized form, normalization can be done with the ''→NFC'' command.

----

==== Length of strings ====

A Unicode Code Point is an indivisible unit of text. A character is usually a single Code Point, but not necessarily: a character may be formed by a group of Unicode Code Points consisting of a starter character, optionally followed by a series of overlapping characters or modifiers.

There are three separate commands to measure the length of a string. The most useful is ''STRLEN'', which returns the number of characters in a string. The command ''STRLENCP'' returns the number of Code Points in the string. Finally, the command ''SIZE'' returns the size of the string in bytes.

==== Commands for strings ====

The following table summarizes the commands that can be applied to strings.

| Command | Purpose | Example |
| ''→STR'' | Convert object to string | ''45,569 →STR'' yields ''"45,569"'' |
| ''STR→'' | Compile a string to RPL objects((In this case, ''OBJ→'' can also be used.)) | ''"45,569." STR→'' yields ''45.569.'' |
| ''→NFC'' | Normalize a string to Unicode NFC | ''"Hello World" →NFC'' yields ''"Hello World"'' |
| ''UTF8→'' | Convert string to list of Unicode Code Points | ''"abcd" UTF8→'' yields ''{ #61h #62h #63h #64h }'' |
| ''→UTF8'' | Convert list of Unicode Code Points to a UTF-8 string | ''{ #61h #62h #63h #64h } →UTF8'' yields ''"abcd"'' |
| ''SIZE'' | Return the number of bytes used by a UTF-8 string | ''"Hello World" SIZE'' yields ''11'' |
| ''STRLEN'' | Return the length of the string, in characters | ''"Hello World" STRLEN'' yields ''11'' |
| ''STRLENCP'' | Return the number of Unicode Code Points in a string | ''"Hello World" STRLENCP'' yields ''11'' |
| ''POS'' | Return the position of a substring within a string (0 if not found) | ''"Hello World" "Wor" POS'' yields ''7'' |
| ''POSREV'' | Return the position of the substring, counting from the end of the string (0 if not found) | ''"Hello World" "Wor" POSREV'' yields ''3'' |
| ''NPOS'' | Same as ''POS'', but starting the search from position N | |
| ''NPOSREV'' | Same as ''POSREV'', but starting the search from position N | |
| ''SREV'' | Reverse the string | ''"Hello World" SREV'' yields ''"dlroW olleH"'' |
| ''REPL'' | Replace part of a string with another at the specified position | ''"Hello World" 7 "Universe" REPL'' yields ''"Hello Universe"'' |
| ''SREPL'' | Search and replace a string (return 1 if successful, 0 if not) | ''"Hello World" "World" "Universe" SREPL'' yields ''"Hello Universe" 1'' |
| ''SUB'' | | |
| ''HEAD'' | | |
| ''TAIL'' | | |
| ''TRIM'' | | |
| ''RTRIM'' | | |
| ''NTOKENS'' | | |
| ''NTHTOKEN'' | | |
| ''NTHTOKENPOS'' | | |

----
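
The difference between the three length measures (bytes, Code Points, characters) is easiest to see on a string that contains combining marks. The following is a minimal off-device sketch in Python, not newRPL code. The helper names ''size_bytes'', ''length_code_points'', ''length_characters'' and ''to_nfc'' are hypothetical and only mirror what ''SIZE'', ''STRLENCP'', ''STRLEN'' and ''→NFC'' are described as doing. In particular, ''length_characters'' assumes that a character is one starter (canonical combining class 0) plus any following combining marks; the rules newRPL actually applies on the calculator may differ.

```python
import unicodedata


def size_bytes(s: str) -> int:
    """Bytes used by the UTF-8 encoding (what SIZE is described as returning)."""
    return len(s.encode("utf-8"))


def length_code_points(s: str) -> int:
    """Number of Unicode code points (what STRLENCP is described as returning)."""
    return len(s)  # a Python 3 str is a sequence of code points


def length_characters(s: str) -> int:
    """Approximate character count (in the spirit of STRLEN).

    ASSUMPTION: a character is a starter (combining class 0) followed by
    zero or more combining marks, so only starters are counted.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


def to_nfc(s: str) -> str:
    """Normalize to Unicode NFC (the role of the NFC command on the calculator)."""
    return unicodedata.normalize("NFC", s)


samples = [
    "Hello World",     # plain ASCII: all three measures agree
    "e\u0301",         # 'e' + combining acute accent (decomposed form)
    "q\u0307\u0323",   # 'q' + dot above + dot below (no precomposed form exists)
]

for raw in samples:
    s = to_nfc(raw)
    print(ascii(s), size_bytes(s), length_code_points(s), length_characters(s))
```

After normalization, ''"Hello World"'' measures 11 bytes, 11 Code Points and 11 characters, matching the table above. The decomposed ''e'' plus accent collapses into a single Code Point stored in 2 bytes. The ''q'' with two dots stays at 3 Code Points and 5 bytes even in NFC form, yet still counts as 1 character, which is why a character count is usually the measure you want when indexing text.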