Julia Tutorial

Julia - Dictionaries and Sets

Many of the functions we have seen so far work on arrays and tuples. Arrays are just one type of collection; Julia has other kinds of collections too. One such collection is the dictionary (Dict), which associates keys with values. That is why it is called an ‘associative collection’.

To understand it better, we can compare a dictionary with a simple look-up table: the data is organized so that a single piece of information, such as a number, string or symbol, called the key, leads us to the corresponding data value.

Creating Dictionaries

The syntax for creating a simple dictionary is as follows −

Dict("key1" => value1, "key2" => value2, …, "keyn" => valuen)

In the above syntax, key1, key2…keyn are the keys and value1, value2,…valuen are the corresponding values. The operator => constructs a Pair, so "key1" => value1 is the same as Pair("key1", value1). We cannot have two entries with the same key because keys are always unique in a dictionary.

Example

julia> first_dict = Dict("X" => 100, "Y" => 110, "Z" => 220)
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
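The two points above (that => builds a Pair, and that keys are unique) can be checked with a small sketch. The dictionary d and its repeated key "X" are made up for illustration and are not part of the tutorial's data:

```julia
# `=>` builds a Pair; the two spellings below are equivalent.
p = "X" => 100
q = Pair("X", 100)
println(p == q)                    # compares both fields
println(p.first, " ", p.second)    # the key and the value of the pair

# Hypothetical input in which the key "X" is listed twice.
# Only one entry for "X" survives, holding the value given last.
d = Dict("X" => 100, "Y" => 110, "X" => 999)
println(length(d))
println(d["X"])
```

Because the later pair silently replaces the earlier one, a repeated key in a literal is usually a mistake worth checking for.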

We can also create dictionaries with the help of comprehension syntax. The example is given below −

Example

julia> first_dict = Dict(string(x) => sind(x) for x = 0:5:360)
Dict{String,Float64} with 73 entries:
  "320" => -0.642788
  "65"  => 0.906308
  "155" => 0.422618
  "335" => -0.422618
  "75"  => 0.965926
  "50"  => 0.766044
  ⋮     => ⋮
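A smaller comprehension may be easier to follow than the 73-entry one. The sketch below uses made-up names (squares, lengths) and builds a dictionary from a range and another from an array of strings:

```julia
# Map each integer from 1 to 5 to its square.
squares = Dict(x => x^2 for x in 1:5)
println(squares[3])

# Map each word to its length; the word list is illustrative only.
lengths = Dict(w => length(w) for w in ["red", "green", "blue"])
println(lengths["green"])
```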

Keys

As discussed earlier, dictionaries have unique keys. It means that if we assign a value to a key that already exists, we will not be creating a new entry but replacing the value stored under the existing key. Following are some operations on dictionaries regarding keys −

Searching for a key

We can use haskey() function to check whether the dictionary contains a key or not −

julia> first_dict = Dict("X" => 100, "Y" => 110, "Z" => 220)
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100

julia> haskey(first_dict, "Z")
true

julia> haskey(first_dict, "A")
false
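Checking with haskey() matters because indexing a missing key raises an error. The sketch below, on a made-up dictionary d, contrasts plain indexing with get(), which returns a fallback value instead:

```julia
d = Dict("X" => 100, "Y" => 110)

# get() returns the stored value, or the given default if the key is absent.
present = get(d, "X", 0)
absent  = get(d, "A", 0)
println(present, " ", absent)

# Plain indexing with a missing key throws a KeyError.
err = try
    d["A"]
catch e
    e
end
println(err isa KeyError)
```

Which style to use is a design choice: get() suits cases where a default makes sense, while haskey() is clearer when the absence itself needs handling.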

Searching for a key/value pair

We can use in() function to check whether the dictionary contains a key/value pair or not −

julia> in(("X" => 100), first_dict)
true

julia> in(("X" => 220), first_dict)
false

Add a new key-value

We can add a new key-value in the existing dictionary as follows −

julia> first_dict["R"] = 400
400

julia> first_dict
Dict{String,Int64} with 4 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
  "R" => 400
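The same assignment syntax also updates an entry that already exists. A minimal sketch, using a made-up dictionary named scores:

```julia
scores = Dict("X" => 100, "Y" => 110)

scores["Z"] = 220    # "Z" is new, so an entry is added
scores["X"] = 150    # "X" exists, so its value is replaced

println(length(scores))
println(scores["X"])
```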

Delete a key

We can use delete!() function to delete a key from an existing dictionary −

julia> delete!(first_dict, "R")
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
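When the removed value is still needed, pop!() can be used instead of delete!(). The sketch below uses a made-up dictionary d; the key "Q" is deliberately absent:

```julia
d = Dict("X" => 100, "R" => 400)

# pop! removes the entry and returns its value.
removed = pop!(d, "R")
println(removed)

# With a third argument, pop! returns that default for a missing key.
fallback = pop!(d, "Q", nothing)
println(fallback === nothing)

# delete! on a missing key leaves the dictionary unchanged and does not throw.
delete!(d, "Q")
println(length(d))
```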

Getting all the keys

We can use keys() function to get all the keys from an existing dictionary −

julia> keys(first_dict)
Base.KeySet for a Dict{String,Int64} with 3 entries. Keys:
  "Y"
  "Z"
  "X"

Values

Every key in dictionary has a corresponding value. Following are some operations on dictionaries regarding values −

Retrieving all the values

We can use values() function to get all the values from an existing dictionary −

julia> values(first_dict)
Base.ValueIterator for a Dict{String,Int64} with 3 entries. Values:
  110
  220
  100

Dictionaries as iterable objects

Dictionaries are iterable objects, so we can loop over one and process each key/value pair in turn −

julia> for kv in first_dict
           println(kv)
       end
"Y" => 110
"Z" => 220
"X" => 100

Here kv is a Pair that holds one key/value pair on each iteration.
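Each element produced by the loop is a Pair, so it can be unpacked into a key and a value directly in the loop header. A minimal sketch, with a hypothetical helper sum_values that is not part of the tutorial:

```julia
# Hypothetical helper: adds up all values of a dictionary.
function sum_values(d)
    total = 0
    for (key, value) in d    # unpack each Pair into key and value
        total += value
    end
    return total
end

d = Dict("X" => 100, "Y" => 110, "Z" => 220)
println(first(d) isa Pair)   # iteration yields Pair objects
println(sum_values(d))
```

The loop is wrapped in a function so that the accumulator is an ordinary local variable.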

Sorting a dictionary

Dictionaries do not store their keys in any particular order, so the entries are not displayed in sorted order. To obtain the items in order, we can collect the keys, sort them, and look up each value −

Example

julia> first_dict = Dict("R" => 100, "S" => 220, "T" => 350, "U" => 400, "V" => 575, "W" => 670)
Dict{String,Int64} with 6 entries:
  "S" => 220
  "U" => 400
  "T" => 350
  "W" => 670
  "V" => 575
  "R" => 100

julia> for key in sort(collect(keys(first_dict)))
           println("$key => $(first_dict[key])")
       end
R => 100
S => 220
T => 350
U => 400
V => 575
W => 670

We can also use the SortedDict data type from the DataStructures.jl Julia package to make sure that the dictionary remains sorted at all times. You can check the example below −
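An alternative to sorting the keys and looking each value up is to collect the pairs themselves and sort those. The sketch below, on a made-up three-entry dictionary, sorts once by key and once by value:

```julia
d = Dict("R" => 100, "T" => 350, "S" => 220)

# collect(d) gives an array of Pairs; `first` picks the key, `last` the value.
by_key        = sort(collect(d), by = first)
by_value_desc = sort(collect(d), by = last, rev = true)

println(by_key)
println(by_value_desc)
```

The results are plain arrays, so the dictionary itself stays unordered.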

Example

julia> import DataStructures

julia> first_dict = DataStructures.SortedDict("S" => 220, "T" => 350, "U" => 400, "V" => 575, "W" => 670)
DataStructures.SortedDict{String,Int64,Base.Order.ForwardOrdering} with 5 entries:
  "S" => 220
  "T" => 350
  "U" => 400
  "V" => 575
  "W" => 670

julia> first_dict["R"] = 100
100

julia> first_dict
DataStructures.SortedDict{String,Int64,Base.Order.ForwardOrdering} with 6 entries:
  "R" => 100
  "S" => 220
  "T" => 350
  "U" => 400
  "V" => 575
  "W" => 670

Word Counting Example

One of the simple applications of dictionaries is to count how many times each word appears in a text. The idea is that each word is a key, and the value stored under that key is the number of times that word appears in the text.

In the following example, we will be counting the words in a file named NLP.txt (saved on the desktop) −

julia> f = open("C://Users//Leekha//Desktop//NLP.txt")
IOStream()

julia> wordlist = String[]
String[]

julia> for line in eachline(f)
           words = split(line, r"\W")
           map(w -> push!(wordlist, lowercase(w)), words)
       end

julia> filter!(!isempty, wordlist)
984-element Array{String,1}:
 "natural"
 "language"
 "processing"
 "semantic"
 "analysis"
 "introduction"
 "to"
 "semantic"
 "analysis"
 "the"
 "purpose"
 ⋮

julia> close(f)

We can see from the above output that wordlist is now an array of 984 elements.

We can create a dictionary to store the words and word count −

julia> wordcounts = Dict{String,Int64}()
Dict{String,Int64}()

julia> for word in wordlist
           wordcounts[word] = get(wordcounts, word, 0) + 1
       end
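The counting pattern can be tried without a file on disk. The sketch below applies the same steps (lowercase, split on non-word characters, drop empty strings, count with get()) to an in-memory string. The helper name count_words and the sample sentence are assumptions made for illustration:

```julia
# Hypothetical helper: counts word occurrences in a string.
function count_words(text)
    # Split on non-word characters and drop the empty pieces that
    # appear next to punctuation.
    pieces = filter(!isempty, split(lowercase(text), r"\W"))
    counts = Dict{String,Int}()
    for w in String.(pieces)              # convert SubString to String
        # get() returns 0 the first time a word is seen.
        counts[w] = get(counts, w, 0) + 1
    end
    return counts
end

counts = count_words("The cat saw the other cat. The end!")
println(counts["the"])
println(counts["cat"])
```

Using get() with a default of 0 avoids a separate haskey() check for words seen for the first time.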

To find out how many times the words appear, we can look up the words in the dictionary as follows −

julia> wordcounts["natural"]
1

julia> wordcounts["processing"]
1

julia> wordcounts["and"]
14

We can also list the words in alphabetical order by sorting the keys as follows −

julia> for i in sort(collect(keys(wordcounts)))
           println("$i, $(wordcounts[i])")
       end
1, 2
2, 2
3, 2
4, 2
5, 1
a, 28
about, 3
above, 2
act, 1
affixes, 3
all, 2
also, 5
an, 5
analysis, 15
analyze, 1
analyzed, 1
analyzer, 2
and, 14
answer, 5
antonymies, 1
antonymy, 1
application, 3
are, 11
⋮

To find the most common words we can use collect() to convert the dictionary to an array of key/value pairs and then sort the array by count, in descending order −

julia> sort(collect(wordcounts), by = tuple -> last(tuple), rev=true)
276-element Array{Pair{String,Int64},1}:
 "the" => 76
 "of" => 47
 "is" => 39
 "a" => 28
 "words" => 23
 "meaning" => 23
 "semantic" => 22
 "lexical" => 21
 "analysis" => 15
 "and" => 14
 "in" => 14
 "be" => 13
 "it" => 13
 "example" => 13
 "or" => 12
 "word" => 12
 "for" => 11
 "are" => 11
 "between" => 11
 "as" => 11
 ⋮
 "each" => 1
 "river" => 1
 "homonym" => 1
 "classification" => 1
 "analyze" => 1
 "nocturnal" => 1
 "axis" => 1
 "concept" => 1
 "deals" => 1
 "larger" => 1
 "destiny" => 1
 "what" => 1
 "reservation" => 1
 "characterization" => 1
 "second" => 1
 "certitude" => 1
 "into" => 1
 "compound" => 1
 "introduction" => 1

We can check the first 10 words as follows −

julia> sort(collect(wordcounts), by = tuple -> last(tuple), rev=true)[1:10]
10-element Array{Pair{String,Int64},1}:
 "the" => 76
 "of" => 47
 "is" => 39
 "a" => 28
 "words" => 23
 "meaning" => 23
 "semantic" => 22
 "lexical" => 21
 "analysis" => 15
 "and" => 14

We can use filter() function to find all the words that start with a particular letter (say ’n’) and appear fewer than four times.
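Slicing with [1:10] fails if the dictionary holds fewer than ten words, and words with equal counts come out in no guaranteed order. The sketch below is one way to handle both; the helper name top_words and the sample counts are hypothetical:

```julia
# Hypothetical helper: the n most frequent words, ties broken alphabetically.
function top_words(counts, n)
    # Sort by descending count, then by word, so the result is deterministic.
    ranked = sort(collect(counts), by = p -> (-p.second, p.first))
    # Never ask for more entries than exist.
    return ranked[1:min(n, length(ranked))]
end

sample = Dict("a" => 3, "b" => 5, "c" => 3)
println(top_words(sample, 2))
println(length(top_words(sample, 10)))
```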

julia> filter(tuple -> startswith(first(tuple), "n") && last(tuple) < 4, collect(wordcounts))
6-element Array{Pair{String,Int64},1}:
 "none" => 2
 "not" => 3
 "namely" => 1
 "name" => 1
 "natural" => 1
 "nocturnal" => 1
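The call above filters an array of pairs. Assuming Julia 1.0 or later, filter() can also be applied to the dictionary directly, in which case the predicate receives each key/value Pair and the result is again a dictionary. The sample data below is made up:

```julia
counts = Dict("none" => 2, "not" => 3, "natural" => 1, "the" => 76, "name" => 1)

# Keep words that start with "n" and occur fewer than three times.
rare_n = filter(p -> startswith(p.first, "n") && p.second < 3, counts)

println(rare_n isa Dict)
println(sort(collect(keys(rare_n))))
```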

Sets

Like an array or a dictionary, a set is a collection, but one that holds only unique elements. Following are the differences between sets and other kinds of collections −

In a set, we can have only one of each element.

The order of elements is not important in a set.

Creating a Set

With the help of Set constructor function, we can create a set as follows −

julia> var_color = Set()
Set{Any}()

We can also specify the types of set as follows −

julia> num_primes = Set{Int64}()
Set{Int64}()

We can also create and fill the set as follows −

julia> var_color = Set{String}(["red","green","blue"])
Set{String} with 3 elements:
  "blue"
  "green"
  "red"

Alternatively, we can use the push!() function, just as with arrays, to add elements to a set as follows −

julia> push!(var_color, "black")
Set{String} with 4 elements:
  "blue"
  "green"
  "black"
  "red"

We can use in() function to check whether an element is in the set −

julia> in("red", var_color)
true

julia> in("yellow", var_color)
false
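The uniqueness rule can be seen by building a set from data that contains repeats. A minimal sketch with made-up values:

```julia
# The input array has five items but only three distinct ones.
letters = Set(["a", "b", "a", "c", "b"])
println(length(letters))

# Pushing an element that is already present leaves the set unchanged.
push!(letters, "a")
println(length(letters))

# `in` can also be written as an infix operator.
println("c" in letters)
```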

Standard operations

Union, intersection, and difference are some standard operations we can do with sets. The corresponding functions for these operations are union(), intersect() and setdiff().

Union

In general, the union operation returns a set containing every element that appears in at least one of the given sets.

Example

julia> color_rainbow = Set(["red","orange","yellow","green","blue","indigo","violet"])
Set{String} with 7 elements:
  "indigo"
  "yellow"
  "orange"
  "blue"
  "violet"
  "green"
  "red"

julia> union(var_color, color_rainbow)
Set{String} with 8 elements:
  "indigo"
  "yellow"
  "orange"
  "blue"
  "violet"
  "green"
  "black"
  "red"

Intersection

In general, the intersection operation takes two or more sets as inputs and returns the elements that are common to all of them.

Example

julia> intersect(var_color, color_rainbow)
Set{String} with 3 elements:
  "blue"
  "green"
  "red"

Difference

In general, the difference operation takes two or more sets as input. It returns the elements of the first set that do not appear in any of the other sets.

Example

julia> setdiff(var_color, color_rainbow)
Set{String} with 1 element:
  "black"
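The three operations are easy to compare side by side on small integer sets. In the sketch below the sets a and b are made up; note that setdiff() depends on the order of its arguments, while union() and intersect() do not:

```julia
a = Set([1, 2, 3, 4])
b = Set([3, 4, 5])

println(union(a, b) == Set([1, 2, 3, 4, 5]))
println(intersect(a, b) == Set([3, 4]))

# Elements of the first set that are not in the second.
println(setdiff(a, b) == Set([1, 2]))
println(setdiff(b, a) == Set([5]))

# issubset checks whether every element of one set is in another.
println(issubset(Set([3, 4]), a))
```

None of these calls modify a or b; each returns a new set.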

Some Functions on Dictionary

In the example below, you will see that functions that work on arrays and sets also work on collections like dictionaries −

julia> dict1 = Dict(100=>"X", 220 => "Y")
Dict{Int64,String} with 2 entries:
  100 => "X"
  220 => "Y"

julia> dict2 = Dict(220 => "Y", 300 => "Z", 450 => "W")
Dict{Int64,String} with 3 entries:
  450 => "W"
  220 => "Y"
  300 => "Z"

Union

julia> union(dict1, dict2)
4-element Array{Pair{Int64,String},1}:
 100 => "X"
 220 => "Y"
 450 => "W"
 300 => "Z"

Intersect

julia> intersect(dict1, dict2)
1-element Array{Pair{Int64,String},1}:
 220 => "Y"

Difference

julia> setdiff(dict1, dict2)
1-element Array{Pair{Int64,String},1}:
 100 => "X"

Merging two dictionaries

julia> merge(dict1, dict2)
Dict{Int64,String} with 4 entries:
  100 => "X"
  450 => "W"
  220 => "Y"
  300 => "Z"
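In the example above the shared key 220 has the same value in both dictionaries, so no conflict is visible. The sketch below uses made-up stock counts to show what may happen when the values differ: plain merge() keeps the value from the last dictionary, while passing a combining function as the first argument merges the two values.

```julia
stock_a = Dict("apple" => 3, "pear" => 1)
stock_b = Dict("apple" => 2, "fig" => 7)

# On a shared key, the value from the later argument wins.
merged = merge(stock_a, stock_b)
println(merged["apple"])

# With a combining function, values under shared keys are combined.
summed = merge(+, stock_a, stock_b)
println(summed["apple"])

# merge returns a new dictionary; the inputs are left untouched.
println(stock_a["apple"])
```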

Finding the smallest element

julia> dict1
Dict{Int64,String} with 2 entries:
  100 => "X"
  220 => "Y"

julia> findmin(dict1)
("X", 100)
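As the output above suggests, findmin() on a dictionary compares the values and returns the smallest value together with its key, in that order. The sketch below shows this on a made-up price table, along with findmax() and a minimum over the keys:

```julia
prices = Dict("tea" => 3, "coffee" => 5, "milk" => 2)

# The result is (value, key), not (key, value).
lowest, cheapest = findmin(prices)
println(lowest, " ", cheapest)

highest, priciest = findmax(prices)
println(highest, " ", priciest)

# To compare keys instead of values, work on keys() explicitly.
println(minimum(keys(prices)))
```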
