Julia - Dictionaries and Sets
Many of the functions we have seen so far work on arrays and tuples. Arrays are just one type of collection; Julia has other kinds of collections too. One of them is the dictionary (Dict), which associates keys with values. That is why it is called an ‘associative collection’.
To understand it better, we can compare a dictionary with a simple look-up table: we supply a single piece of information, such as a number, string or symbol, called the key, and the table gives us back the corresponding data value.
Creating Dictionaries
The syntax for creating a simple dictionary is as follows −
Dict("key1" => value1, "key2" => value2, …, "keyn" => valuen)
In the above syntax, key1, key2, …, keyn are the keys and value1, value2, …, valuen are the corresponding values. The => operator constructs a Pair, so "key1" => value1 is the same as Pair("key1", value1). We cannot have two entries with the same key because keys are always unique in a dictionary.
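As a small sketch of how `=>` and key uniqueness behave in practice (the variable names `p` and `d` are illustrative only and do not appear elsewhere in this tutorial):

```julia
# `=>` builds a Pair; a Dict is constructed from a sequence of pairs.
p = "key1" => 1
println(p.first)     # the key part of the pair
println(p.second)    # the value part of the pair

# Repeating a key in the constructor does not create two entries;
# the later value replaces the earlier one.
d = Dict("a" => 1, "a" => 2)
println(length(d))   # 1
println(d["a"])      # 2
```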
Example
julia> first_dict = Dict("X" => 100, "Y" => 110, "Z" => 220)
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
We can also create dictionaries with the help of comprehension syntax. The example is given below −
Example
julia> first_dict = Dict(string(x) => sind(x) for x = 0:5:360)
Dict{String,Float64} with 73 entries:
  "320" => -0.642788
  "65"  => 0.906308
  "155" => 0.422618
  "335" => -0.422618
  "75"  => 0.965926
  "50"  => 0.766044
  ⋮     => ⋮
Keys
As discussed earlier, dictionaries have unique keys. This means that if we assign a value to a key that already exists, we do not create a new entry; we replace the value stored under the existing key. Following are some operations on dictionaries regarding keys −
Searching for a key
We can use haskey() function to check whether the dictionary contains a key or not −
julia> first_dict = Dict("X" => 100, "Y" => 110, "Z" => 220)
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100

julia> haskey(first_dict, "Z")
true

julia> haskey(first_dict, "A")
false
Searching for a key/value pair
We can use in() function to check whether the dictionary contains a key/value pair or not −
julia> in(("X" => 100), first_dict)
true

julia> in(("X" => 220), first_dict)
false
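Indexing a dictionary with a key it does not contain throws a KeyError, so a lookup is often combined with a fallback value. The following is a minimal sketch using `get` and `get!` from Base; the dictionary name `scores` and the default values are made up for illustration:

```julia
scores = Dict("X" => 100, "Y" => 110, "Z" => 220)

# get returns the stored value, or the supplied default when the key is absent.
println(get(scores, "Z", 0))   # 220
println(get(scores, "A", 0))   # 0

# get! behaves like get, but also stores the default under the missing key.
get!(scores, "A", -1)
println(haskey(scores, "A"))   # true
println(scores["A"])           # -1
```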
Add a new key-value
We can add a new key-value in the existing dictionary as follows −
julia> first_dict["R"] = 400
400

julia> first_dict
Dict{String,Int64} with 4 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
  "R" => 400
Delete a key
We can use delete!() function to delete a key from an existing dictionary −
julia> delete!(first_dict, "R")
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
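When the removed value is still needed, `pop!` from Base can be used in place of `delete!`. A short sketch of the difference, using a throwaway dictionary `d` that is not part of the running example:

```julia
d = Dict("X" => 100, "R" => 400)

# pop! removes the key and returns the value that was stored under it.
removed = pop!(d, "R")
println(removed)               # 400
println(haskey(d, "R"))        # false

# delete! on a key that is not present leaves the dictionary unchanged.
delete!(d, "missing")
println(length(d))             # 1

# pop! with a third argument returns that default instead of throwing.
fallback = pop!(d, "missing", 0)
println(fallback)              # 0
```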
Getting all the keys
We can use keys() function to get all the keys from an existing dictionary −
julia> keys(first_dict)
Base.KeySet for a Dict{String,Int64} with 3 entries. Keys:
  "Y"
  "Z"
  "X"
Values
Every key in a dictionary has a corresponding value. Following are some operations on dictionaries regarding values −
Retrieving all the values
We can use values() function to get all the values from an existing dictionary −
julia> values(first_dict)
Base.ValueIterator for a Dict{String,Int64} with 3 entries. Values:
  110
  220
  100
Dictionaries as iterable objects
Dictionaries are iterable objects, so we can loop over one and process each key/value pair in turn −
julia> for kv in first_dict
           println(kv)
       end
"Y" => 110
"Z" => 220
"X" => 100
Here kv is a Pair that holds one key/value entry of the dictionary on each iteration.
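The key and the value can also be unpacked directly in the loop header. This is a minimal sketch; the names `d`, `key`, `value` and `total` are chosen only for illustration:

```julia
d = Dict("X" => 100, "Y" => 110, "Z" => 220)

# Destructure each entry into its key and value while iterating.
total = 0
for (key, value) in d
    println(key, " maps to ", value)
    global total += value      # `global` is needed when this runs at top level in a script
end
println(total)                 # 430
```

The iteration order is not guaranteed, so the three lines may be printed in any order, but the sum is the same either way.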
Sorting a dictionary
Dictionaries do not store their keys in any particular order, so the entries are not displayed in sorted order. To obtain the items in order, we can sort the keys and then look up each value −
Example
julia> first_dict = Dict("R" => 100, "S" => 220, "T" => 350, "U" => 400, "V" => 575, "W" => 670)
Dict{String,Int64} with 6 entries:
  "S" => 220
  "U" => 400
  "T" => 350
  "W" => 670
  "V" => 575
  "R" => 100

julia> for key in sort(collect(keys(first_dict)))
           println("$key => $(first_dict[key])")
       end
R => 100
S => 220
T => 350
U => 400
V => 575
W => 670
We can also use the SortedDict data type from the DataStructures.jl Julia package to make sure that the dictionary remains sorted at all times. You can check the example below −
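A possible variation on the loop above is to collect the key/value pairs themselves and sort them by key, which avoids the second lookup. This is only a sketch, and `by_key` is a name introduced here for illustration:

```julia
first_dict = Dict("S" => 220, "U" => 400, "T" => 350, "R" => 100)

# collect turns the dictionary into a Vector of Pairs;
# `by = first` tells sort to compare the key of each pair.
by_key = sort(collect(first_dict), by = first)

for (key, value) in by_key
    println("$key => $value")
end
```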
Example
julia> import DataStructures

julia> first_dict = DataStructures.SortedDict("S" => 220, "T" => 350, "U" => 400, "V" => 575, "W" => 670)
DataStructures.SortedDict{String,Int64,Base.Order.ForwardOrdering} with 5 entries:
  "S" => 220
  "T" => 350
  "U" => 400
  "V" => 575
  "W" => 670

julia> first_dict["R"] = 100
100

julia> first_dict
DataStructures.SortedDict{String,Int64,Base.Order.ForwardOrdering} with 6 entries:
  "R" => 100
  "S" => 220
  "T" => 350
  "U" => 400
  "V" => 575
  "W" => 670
Word Counting Example
One of the simple applications of dictionaries is counting how many times each word appears in a text. The idea is that each word is a key, and the value stored under that key is the number of times the word appears in the text.
In the following example, we will count the words in a file named NLP.txt (saved on the desktop) −
julia> f = open("C://Users//Leekha//Desktop//NLP.txt")
IOStream()

julia> wordlist = String[]
String[]

julia> for line in eachline(f)
           # split on non-word characters; this can produce empty strings
           words = split(line, r"\W")
           map(w -> push!(wordlist, lowercase(w)), words)
       end

julia> filter!(!isempty, wordlist)
984-element Array{String,1}:
 "natural"
 "language"
 "processing"
 "semantic"
 "analysis"
 "introduction"
 "to"
 "semantic"
 "analysis"
 "the"
 "purpose"
 ⋮

julia> close(f)
We can see from the above output that wordlist is now an array of 984 elements.
We can create a dictionary to store the words and word count −
julia> wordcounts = Dict{String,Int64}()
Dict{String,Int64}()

julia> for word in wordlist
           wordcounts[word] = get(wordcounts, word, 0) + 1
       end
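The same counting pattern can be tried without a file. The sketch below assumes a short made-up sentence stored in `text` instead of NLP.txt, and uses the names `counts` and `word` for illustration:

```julia
text = "The cat and the hat and the bat"

counts = Dict{String,Int}()

# Lowercase the text, split on runs of non-word characters, and drop empty pieces.
for word in split(lowercase(text), r"\W+"; keepempty = false)
    # get returns 0 the first time a word is seen, so every word starts at 1.
    counts[word] = get(counts, word, 0) + 1
end

println(counts["the"])   # 3
println(counts["and"])   # 2
println(counts["cat"])   # 1
```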
To find out how many times the words appear, we can look up the words in the dictionary as follows −
julia> wordcounts["natural"]
1

julia> wordcounts["processing"]
1

julia> wordcounts["and"]
14
We can also list the dictionary in sorted key order as follows −
julia> for i in sort(collect(keys(wordcounts)))
           println("$i, $(wordcounts[i])")
       end
1, 2
2, 2
3, 2
4, 2
5, 1
a, 28
about, 3
above, 2
act, 1
affixes, 3
all, 2
also, 5
an, 5
analysis, 15
analyze, 1
analyzed, 1
analyzer, 2
and, 14
answer, 5
antonymies, 1
antonymy, 1
application, 3
are, 11
⋮
To find the most common words, we can use collect() to convert the dictionary to an array of key/value pairs and then sort the array by the second element of each pair, in descending order −
julia> sort(collect(wordcounts), by = tuple -> last(tuple), rev=true)
276-element Array{Pair{String,Int64},1}:
 "the" => 76
 "of" => 47
 "is" => 39
 "a" => 28
 "words" => 23
 "meaning" => 23
 "semantic" => 22
 "lexical" => 21
 "analysis" => 15
 "and" => 14
 "in" => 14
 "be" => 13
 "it" => 13
 "example" => 13
 "or" => 12
 "word" => 12
 "for" => 11
 "are" => 11
 "between" => 11
 "as" => 11
 ⋮
 "each" => 1
 "river" => 1
 "homonym" => 1
 "classification" => 1
 "analyze" => 1
 "nocturnal" => 1
 "axis" => 1
 "concept" => 1
 "deals" => 1
 "larger" => 1
 "destiny" => 1
 "what" => 1
 "reservation" => 1
 "characterization" => 1
 "second" => 1
 "certitude" => 1
 "into" => 1
 "compound" => 1
 "introduction" => 1
We can check the first 10 words as follows −
julia> sort(collect(wordcounts), by = tuple -> last(tuple), rev=true)[1:10]
10-element Array{Pair{String,Int64},1}:
 "the" => 76
 "of" => 47
 "is" => 39
 "a" => 28
 "words" => 23
 "meaning" => 23
 "semantic" => 22
 "lexical" => 21
 "analysis" => 15
 "and" => 14
We can use the filter() function to find all the words that start with a particular letter (say 'n') and, as in this example, appear fewer than 4 times.
julia> filter(tuple -> startswith(first(tuple), "n") && last(tuple) < 4, collect(wordcounts))
6-element Array{Pair{String,Int64},1}:
 "none" => 2
 "not" => 3
 "namely" => 1
 "name" => 1
 "natural" => 1
 "nocturnal" => 1
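filter() can also be applied to the dictionary itself, without calling collect() first. In that case the predicate receives each entry as a Pair and the result is again a dictionary. A small sketch with a hand-written `counts` dictionary standing in for the real word counts:

```julia
counts = Dict("natural" => 1, "not" => 3, "name" => 1, "the" => 76, "no" => 9)

# Keep entries whose key starts with "n" and whose count is below 4.
n_words = filter(p -> startswith(p.first, "n") && p.second < 4, counts)

println(length(n_words))         # 3
println(haskey(n_words, "the"))  # false
println(haskey(n_words, "no"))   # false, its count is not below 4
```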
Sets
A set is a collection of unique elements. Sets differ from other kinds of collections, such as arrays and dictionaries, in the following ways −
In a set, we can have only one of each element.
The order of the elements is not important in a set.
Creating a Set
With the help of the Set constructor function, we can create an empty set as follows −
julia> var_color = Set()
Set{Any}()
We can also specify the element type of the set as follows −
julia> num_primes = Set{Int64}()
Set{Int64}()
We can also create and fill the set as follows −
julia> var_color = Set{String}(["red","green","blue"])
Set{String} with 3 elements:
  "blue"
  "green"
  "red"
Just as with arrays, we can also use the push!() function to add elements to a set −
julia> push!(var_color, "black")
Set{String} with 4 elements:
  "blue"
  "green"
  "black"
  "red"
We can use the in() function to check whether an element is in the set −
julia> in("red", var_color)
true

julia> in("yellow", var_color)
false
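The uniqueness rule is easy to observe directly: duplicates in the input are dropped, and pushing an element that is already present changes nothing. The following sketch uses a throwaway set named `colors`:

```julia
# The input array contains "red" twice, but the set keeps it once.
colors = Set(["red", "green", "red"])
println(length(colors))        # 2

# Pushing an element that is already present leaves the set unchanged.
push!(colors, "green")
println(length(colors))        # 2

# A new element grows the set as expected.
push!(colors, "blue")
println(length(colors))        # 3
println("blue" in colors)      # true
```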
Standard operations
Union, intersection, and difference are some standard operations we can do with sets. The corresponding functions for these operations are union(), intersect() and setdiff().
Union
The union of two sets is the set of all elements that appear in at least one of them.
Example
julia> color_rainbow = Set(["red","orange","yellow","green","blue","indigo","violet"])
Set{String} with 7 elements:
  "indigo"
  "yellow"
  "orange"
  "blue"
  "violet"
  "green"
  "red"

julia> union(var_color, color_rainbow)
Set{String} with 8 elements:
  "indigo"
  "yellow"
  "orange"
  "blue"
  "violet"
  "green"
  "black"
  "red"
Intersection
The intersection operation takes two or more sets as input and returns the elements that are present in all of them.
Example
julia> intersect(var_color, color_rainbow)
Set{String} with 3 elements:
  "blue"
  "green"
  "red"
Difference
The difference operation takes two or more sets as input. It returns the elements of the first set that do not appear in any of the other sets.
Example
julia> setdiff(var_color, color_rainbow)
Set{String} with 1 element:
  "black"
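Base also provides the infix forms `∪` and `∩` for union and intersection, plus `issubset` and `symdiff`. The sketch below uses two small integer sets, `a` and `b`, chosen only to keep the results easy to check by hand:

```julia
a = Set([1, 2, 3])
b = Set([3, 4])

# Infix operators are equivalent to the function forms.
println((a ∪ b) == union(a, b))        # true
println((a ∩ b) == intersect(a, b))    # true

# setdiff is not symmetric: the argument order matters.
println(setdiff(a, b) == Set([1, 2]))  # true
println(setdiff(b, a) == Set([4]))     # true

# symdiff keeps the elements that belong to exactly one of the two sets.
println(symdiff(a, b) == Set([1, 2, 4]))   # true

# issubset checks whether every element of the first set is in the second.
println(issubset(Set([1, 2]), a))      # true
```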
Some Functions on Dictionary
In the example below, you will see that functions that work on arrays and sets also work on other collections like dictionaries −
julia> dict1 = Dict(100=>"X", 220 => "Y")
Dict{Int64,String} with 2 entries:
  100 => "X"
  220 => "Y"

julia> dict2 = Dict(220 => "Y", 300 => "Z", 450 => "W")
Dict{Int64,String} with 3 entries:
  450 => "W"
  220 => "Y"
  300 => "Z"
Union
julia> union(dict1, dict2)
4-element Array{Pair{Int64,String},1}:
 100 => "X"
 220 => "Y"
 450 => "W"
 300 => "Z"
Intersect
julia> intersect(dict1, dict2)
1-element Array{Pair{Int64,String},1}:
 220 => "Y"
Difference
julia> setdiff(dict1, dict2)
1-element Array{Pair{Int64,String},1}:
 100 => "X"
Merging two dictionaries
julia> merge(dict1, dict2)
Dict{Int64,String} with 4 entries:
  100 => "X"
  450 => "W"
  220 => "Y"
  300 => "Z"
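In the example above, both dictionaries store the same value under the shared key 220, so the merge has no conflict to resolve. When the values differ, merge keeps the value from the last dictionary, and passing a combining function as the first argument changes that. A minimal sketch, with the hypothetical dictionaries `stock_a` and `stock_b`:

```julia
stock_a = Dict("apples" => 3, "pears" => 1)
stock_b = Dict("apples" => 2, "plums" => 5)

# Plain merge: for a shared key, the value from the last dictionary wins.
latest = merge(stock_a, stock_b)
println(latest["apples"])      # 2

# merge with a combining function: values under shared keys are combined.
summed = merge(+, stock_a, stock_b)
println(summed["apples"])      # 5

# Neither input dictionary is modified.
println(stock_a["apples"])     # 3
```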
Finding the smallest element
julia> dict1
Dict{Int64,String} with 2 entries:
  100 => "X"
  220 => "Y"

julia> findmin(dict1)
("X", 100)
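As the output of findmin(dict1) suggests, the comparison is made on the values, and the result is the smallest value together with its key. The sketch below makes this easier to see by using numeric values; the dictionary `prices` is invented for illustration:

```julia
prices = Dict("tea" => 3, "coffee" => 5, "water" => 1)

# findmin compares the values and returns (smallest value, its key).
lowest, item = findmin(prices)
println(lowest)                # 1
println(item)                  # water

# To find the smallest key instead, take the minimum over keys().
println(minimum(keys(prices))) # coffee
```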