Julia - Strings
  • Date: 2024-12-22


A string is a finite sequence of zero or more characters, usually enclosed in double quotes, for example "This is Julia programming language". The following points about strings are important −

    Strings are immutable, i.e., we cannot change them once they are created.

    Two characters need special care: the double quote (") and the dollar sign ($). To include a double quote in a string, precede it with a backslash; otherwise the quote ends the string and the rest is interpreted as Julia code. To include a literal dollar sign, precede it with a backslash as well, because the dollar sign is used for string interpolation.

    In Julia, the built-in concrete type used for strings as well as string literals is String, which supports the full range of Unicode characters via the UTF-8 encoding.

    All string types in Julia are subtypes of the abstract type AbstractString. If you want a function to accept any string type, declare the argument type as AbstractString.

    Julia has a first-class type for representing a single character. It is called Char, and it is a subtype of the abstract type AbstractChar.
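
As a minimal sketch of the points above, the snippet below declares an argument as AbstractString so that both String and SubString values are accepted, and shows that writing to an index of a string fails. The helper name shout and the sample text are assumptions made for illustration only.

```julia
# `shout` is a hypothetical helper. Annotating the argument as
# AbstractString lets it accept String, SubString, and other string types.
shout(s::AbstractString) = uppercase(s) * "!"

greeting = "hello julia"
word = SubString(greeting, 1, 5)   # a SubString{String}, not a String

loud_full = shout(greeting)
loud_word = shout(word)

# Strings are immutable: assigning to an index throws a MethodError,
# because no setindex! method is defined for String.
mutation_failed = try
    greeting[1] = 'H'
    false
catch err
    err isa MethodError
end

println(loud_full)
println(loud_word)
println(mutation_failed)
```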

Characters

A single character is represented by a Char value. Char is a 32-bit primitive type that can be converted to a numeric value (its Unicode code point).


julia> 'a'
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

julia> typeof(ans)
Char

We can convert a Char to its integer value as follows −


julia> Int('a')
97

julia> typeof(ans)
Int64

We can also convert an integer value back to a Char as follows −


julia> Char(97)
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

With Char values, we can do some arithmetic as well as comparisons. This can be understood with the help of the following example −


julia> 'X' < 'x'
true

julia> 'X' <= 'x' <= 'Y'
false

julia> 'X' <= 'a' <= 'Y'
false

julia> 'a' <= 'x' <= 'Y'
false

julia> 'A' <= 'X' <= 'Y'
true

julia> 'x' - 'b'
22

julia> 'x' + 1
'y': ASCII/Unicode U+0079 (category Ll: Letter, lowercase)
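
To illustrate how Char comparison and arithmetic combine, here is a small sketch that rotates lowercase ASCII letters by a fixed number of positions. The function name shift_letter and the shift of 3 are assumptions chosen for the example.

```julia
# Hypothetical helper: rotate lowercase ASCII letters by `k` positions.
function shift_letter(c::Char, k::Integer)
    'a' <= c <= 'z' || return c          # leave other characters unchanged
    offset = mod((c - 'a') + k, 26)      # Char - Char gives an Int
    return 'a' + offset                  # Char + Int gives a Char
end

# map over a String returns a String when the function returns Chars.
shifted = map(c -> shift_letter(c, 3), "xyz abc")
println(shifted)
```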

Delimited by double quotes or triple double quotes

Strings in Julia are delimited by double quotes or by triple double quotes. Triple double quotes are convenient when the string itself contains quotation marks, since those do not need to be escaped, as shown below −


julia> str = "This is Julia Programming Language.\n"
"This is Julia Programming Language.\n"

julia> """See the "quote" characters"""
"See the \"quote\" characters"

Performing arithmetic and other operations with end

Just like a normal value, end (which stands for the last index of the string) can be used in arithmetic and other operations. Check the example below −


julia> str[end-1]
'.': ASCII/Unicode U+002E (category Po: Punctuation, other)

julia> str[end÷2]
'g': ASCII/Unicode U+0067 (category Ll: Letter, lowercase)

Extracting substring by using range indexing

We can extract a substring from a string by using range indexing. Check the example below −


julia> str[6:9]
"is J"

Using SubString

In the method above, range indexing makes a copy of the selected part of the original string. Alternatively, we can use SubString to create a view into a string, as shown in the example below −


julia> substr = SubString(str, 1, 4)
"This"

julia> typeof(substr)
SubString{String}
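
The difference between the two approaches can be sketched as follows. The variable names and the sample text are assumptions; the point is only that range indexing yields a String while SubString yields a view type that compares equal to it.

```julia
text = "This is Julia Programming Language."

copy_part = text[9:13]               # range indexing allocates a new String
view_part = SubString(text, 9, 13)   # a view that refers back to `text`

println(typeof(copy_part))           # String
println(typeof(view_part))           # SubString{String}
println(copy_part == view_part)      # equal contents, different types

# Convert the view to an independent String when one is required.
owned = String(view_part)
println(owned)
```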

Unicode and UTF-8

Unicode characters and strings are fully supported by the Julia programming language. In character and string literals, the Unicode \u and \U escape sequences, as well as all the standard C escape sequences, can be used to represent Unicode code points, as shown in the example below −


julia> s = "\u2200 x \u2203 y"
"∀ x ∃ y"

String literals are encoded using UTF-8, a variable-width encoding. Variable-width means that not all characters are encoded in the same number of bytes (code units). For example, in UTF-8 −

    ASCII characters (with code points less than 0x80 (128)) are encoded using a single byte, as they are in ASCII.

    Code points 0x80 (128) and above are encoded using multiple bytes (up to four per character).

String indices in Julia refer to the code units (bytes for UTF-8) mentioned above. They are the fixed-width building blocks used to encode arbitrary characters. As a consequence, not every index into a String is a valid index for a character. You can check out the example below −


julia> s[1]
'∀': Unicode U+2200 (category Sm: Symbol, math)

julia> s[2]
ERROR: StringIndexError("∀ x ∃ y", 2)
Stacktrace:
 [1] string_index_err(::String, ::Int64) at .\strings\string.jl:12
 [2] getindex_continued(::String, ::Int64, ::UInt32) at .\strings\string.jl:220
 [3] getindex(::String, ::Int64) at .\strings\string.jl:213
 [4] top-level scope at REPL[106]:1
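
One way to avoid such errors is to ask Julia for the valid indices instead of computing them by hand. The sketch below uses the Base functions length, ncodeunits, eachindex, nextind and isvalid on the same string; the variable names are illustrative.

```julia
s = "\u2200 x \u2203 y"              # "∀ x ∃ y"

n_chars = length(s)                  # number of characters
n_bytes = ncodeunits(s)              # number of UTF-8 code units (bytes)
valid = collect(eachindex(s))        # only the indices where a character starts
second = nextind(s, 1)               # index of the character after s[1]

println(n_chars, " characters in ", n_bytes, " bytes")
println(valid)
println(isvalid(s, 2))               # false: byte 2 is inside the encoding of ∀

# Iterating over the string visits characters, never partial bytes.
for c in s
    print(c, '|')
end
println()
```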

String Concatenation

Concatenation is one of the most useful string operations. Following is an example of concatenation −


julia> A = "Hello"
"Hello"
julia> B = "Julia Programming Language"
"Julia Programming Language"
julia> string(A, ", ", B, ".\n")
"Hello, Julia Programming Language.\n"

We can also concatenate strings in Julia with the * operator. An example is given below −


julia> A = "Hello"
"Hello"
julia> B = "Julia Programming Language"
"Julia Programming Language"
julia> A * ", " * B * ".\n"
"Hello, Julia Programming Language.\n"

Interpolation

Building strings with concatenation can become cumbersome. Julia therefore allows interpolation into string literals, which reduces the need for verbose calls to string or repeated use of *. Interpolation is done with the dollar sign ($). For example −


julia> A = "Hello"
"Hello"
julia> B = "Julia Programming Language"
"Julia Programming Language"
julia> "$A, $B.\n"
"Hello, Julia Programming Language.\n"

Julia takes the shortest complete expression after $ as the expression whose value is to be interpolated into the string. That is why we can interpolate any expression into a string using parentheses. For example −


julia> "100 + 10 = $(100 + 10)"
"100 + 10 = 110"

Now if you want to use a literal $ in a string then you need to escape it with a backslash as follows −


julia> print("His salary is \$5000 per month.\n")
His salary is $5000 per month.
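
The interpolation rules can be put together in one short sketch. The variable names and values below are assumptions used only for illustration.

```julia
name = "Julia"
items = [1, 2, 3]

msg1 = "Hello, $name."                          # plain variable
msg2 = "Sum of $items is $(sum(items))"         # any expression inside $( )
msg3 = "Price: \$5000"                          # escaped, so no interpolation
msg4 = "$(name)Lang"                            # parentheses end the variable name

println(msg1)
println(msg2)
println(msg3)
println(msg4)
```

Parentheses are also useful when the text following the variable could otherwise be read as part of its name, as in the last line.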

Triple-quoted strings

We know that we can create strings with triple quotes, as shown in the example below −


julia> """See the "quote" characters"""
"See the \"quote\" characters"

Besides allowing unescaped quotation marks, triple-quoted strings have the following advantage −

Triple-quoted strings are dedented to the level of the least-indented line, which is very useful for defining strings inside code that is itself indented. Following is an example −


julia> str = """
                  This is,
                  Julia Programming Language.
               """
"   This is,\n   Julia Programming Language.\n"

The dedentation level is the longest common starting sequence of spaces or tabs in all lines, excluding the following −

    The line following the opening """

    Lines containing only spaces or tabs

The line containing the closing """ is always included.


julia> """       This
             is
               Julia Programming Language"""
"       This\nis\n  Julia Programming Language"
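
A typical use of dedentation is a multi-line message defined inside an indented function body. The sketch below assumes a hypothetical function usage_text; the content lines and the closing delimiter share an indentation of eight spaces, so that amount is stripped from every line.

```julia
# Hypothetical helper returning a multi-line help message.
function usage_text()
    return """
        Usage:
          tool [options]
        """
end

text = usage_text()
print(text)
println(repr(text))   # shows the string with its \n escapes
```

The newline immediately after the opening delimiter is dropped, and the two extra spaces before "tool" survive because they exceed the common indentation.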

Common String Operations

Using the string operators and functions provided by Julia, we can compare two strings, check whether a string contains a given substring, and join or concatenate strings.

Standard Comparison operators

By using the following standard comparison operators, we can lexicographically compare the strings −


julia> "abababab" < "Tutorialspoint"
false

julia> "abababab" > "Tutorialspoint"
true

julia> "abababab" == "Tutorialspoint"
false

julia> "abababab" != "Tutorialspoint"
true

julia> "100 + 10 = 110" == "100 + 10 = $(100 + 10)"
true

Search operators

Julia provides the findfirst and findlast functions to search for the index of a particular character in a string. You can check the example of both functions below −


julia> findfirst(isequal('o'), "Tutorialspoint")
4

julia> findlast(isequal('o'), "Tutorialspoint")
11

Julia also provides the findnext and findprev functions to start the search for a character at a given offset. Check the example of both functions below −


julia> findnext(isequal('o'), "Tutorialspoint", 1)
4
julia> findnext(isequal('o'), "Tutorialspoint", 5)
11
julia> findprev(isequal('o'), "Tutorialspoint", 5)
4

It is also possible to check whether a substring occurs within a string by using the occursin function. The example is given below −


julia> occursin("Julia", "This is, Julia Programming.")
true

julia> occursin("T", "Tutorialspoint")
true

julia> occursin("Z", "Tutorialspoint")
false
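
The search functions also accept a substring instead of a character predicate, in which case findfirst returns a range of indices, or nothing when there is no match. The sketch below shows one way to handle both outcomes; the helper name describe is an assumption.

```julia
text = "Tutorialspoint"

r = findfirst("point", text)         # a range of indices, or `nothing`
found = text[r]
no_hit = findfirst("xyz", text)

# Guard against `nothing` before using the result as an index.
describe(hit) = hit === nothing ? "not found" : "found at $(first(hit))"

println(r)
println(found)
println(describe(r))
println(describe(no_hit))
println(startswith(text, "Tut"), " ", endswith(text, "int"))
```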

The repeat() and join() functions

For strings in Julia, repeat and join are two useful functions. The example below shows their use −


julia> repeat("Tutorialspoint.com ", 5)
"Tutorialspoint.com Tutorialspoint.com Tutorialspoint.com Tutorialspoint.com Tutorialspoint.com "

julia> join(["TutorialsPoint","com"], " . ")
"TutorialsPoint . com"
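
A few related forms may be worth knowing. The sketch below, with illustrative values, shows the ^ operator as a shorthand for repeat, the optional third argument of join that sets the final separator, and split as the inverse of join.

```julia
rule = repeat("-", 10)                           # ten dashes
tripled = "ab" ^ 3                               # same as repeat("ab", 3)

names = ["a", "b", "c"]
csv = join(names, ", ")                          # one delimiter throughout
sentence = join(names, ", ", " and ")            # different last delimiter

parts = split(csv, ", ")                         # back to the pieces

println(rule)
println(tripled)
println(csv)
println(sentence)
println(parts)
```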

Non-standard String Literals

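A literal is a notation in source code that represents a fixed value. A non-standard string literal looks like an ordinary double-quoted string with an identifier placed immediately before the opening quote, and it does not behave like a normal string literal.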

Raw String Literals

Raw string literals are another useful non-standard string literal. They are written in the form raw"…" and create ordinary String objects whose contents are exactly as entered, without interpolation or unescaping. The one exception is that a quotation mark must still be escaped with a backslash.

Example


julia> println(raw"\ \"")
\ "

Byte Array Literals

Byte array literals are among the most useful non-standard string literals. They are written in the form b"…" and follow these rules −

    ASCII characters and ASCII escapes produce a single byte.

    \x and octal escape sequences produce the byte corresponding to the escape value.

    Unicode escape sequences produce a sequence of bytes encoding that code point in UTF-8.

There is some overlap between these three rules.

Example


julia> b"DATA\xff\u2200"
8-element Base.CodeUnits{UInt8,String}:
 0x44
 0x41
 0x54
 0x41
 0xff
 0xe2
 0x88
 0x80

The resulting byte array is not a valid UTF-8 string, as you can see below −


julia> isvalid("DATA\xff\u2200")
false
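
To see which part of the data breaks validity, the pieces can be checked separately. This is a small sketch with illustrative variable names; it relies only on codeunits, isvalid and the String constructor for byte vectors.

```julia
raw_bytes = b"DATA\xff\u2200"
n = length(raw_bytes)                            # 4 + 1 + 3 bytes

# The Unicode escape alone expands to three valid UTF-8 bytes.
forall_bytes = collect(codeunits("\u2200"))
roundtrip = String(UInt8[0xe2, 0x88, 0x80])      # bytes back to a String

println(n)
println(forall_bytes)
println(roundtrip)
println(isvalid("DATA\u2200"))                   # valid without the 0xff byte
println(isvalid("DATA\xff"))                     # the lone 0xff byte is invalid
```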

Version Number Literals

Version number literals are another useful non-standard string literal. They take the form v"…" and create VersionNumber objects, which follow the specification of semantic versioning.

Example

We can define version-specific behavior by using the following statement −


julia> if v"1.0" <= VERSION < v"1.1-"
            # you need to do something specific to 1.0 release series
         end
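
The comparison semantics behind such a check can be sketched as follows. The function name in_series and the version values are assumptions for illustration; the trailing - in v"1.1-" marks a version lower than any 1.1 release, including its pre-releases.

```julia
v = v"1.6.2"
println((v.major, v.minor, v.patch))

# Components compare numerically, not as text.
println(v"1.9" < v"1.10")

# Pre-releases sort before the release, and "1.1-" sorts before both.
println(v"1.1-" < v"1.1.0-rc1" < v"1.1.0")

# Hypothetical helper: is `ver` part of the 1.0 release series?
in_series(ver::VersionNumber) = v"1.0" <= ver < v"1.1-"

println(in_series(v"1.0.5"))
println(in_series(v"1.1.0-rc1"))
```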

Regular Expressions

Julia has Perl-compatible regular expressions (regexes), which are related to strings in the following ways −

    Regexes are used to find regular patterns in strings.

    Regexes are themselves input as strings, which are parsed into a state machine that can then be used to search efficiently for patterns in strings.

Example


julia> r"^\s*(?:#|$)"
r"^\s*(?:#|$)"

julia> typeof(ans)
Regex

We can use occursin as follows to check if a regex matches a string or not −


julia> occursin(r"^\s*(?:#|$)", "not a comment")
false

julia> occursin(r"^\s*(?:#|$)", "# a comment")
true
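
Beyond a yes/no test, a regex can also extract parts of a string. The sketch below uses the Base functions match and replace; the pattern and the input text are assumptions chosen for illustration.

```julia
pattern = r"^(\w+)\s*=\s*(\d+)$"     # captures a name and a number

m = match(pattern, "width = 42")
key = m.captures[1]                  # first capture group
value = parse(Int, m.captures[2])    # second capture group as an integer

# match returns `nothing` when the pattern does not match.
no_match = match(pattern, "no assignment here")

# replace accepts a regex => replacement pair.
squeezed = replace("a  b   c", r"\s+" => " ")

println(key, " -> ", value)
println(no_match === nothing)
println(squeezed)
```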