Assignment Chef: Assignment catalog

33,401 assignments available

[SOLVED] CSCI251 Assignment 1: crowdsourcing system simulation (procedural programming)

Overview: This assignment is to be implemented using procedural programming. These are some general rules about what you should and should not do. This program is used to simulate the process of a crowdsourcing system, and we simplify some steps here. The Tasks.txt and Workers.txt files store the information of individual tasks and workers, respectively.

Each line in Tasks.txt represents a task that needs to be assigned to a worker to finish. The crowdsourcing system attempts to assign the task through the list of workers given on the same line in Tasks.txt. All workers in the list try the task in order, and each worker has a certain number of trials; whether a worker is successful depends on the average performance over these trials. The next worker can try only when the current worker fails. If the current worker y succeeds, you should output something like “Assignment of Task x to worker y succeeds”, where x is the taskID and y is the workerID. Otherwise, output something like “Assignment of Task x to worker y fails”.

Now, let us define the inputs in Tasks.txt and Workers.txt, the ‘worker performance’ at a time, the ‘worker average performance’, and the ‘successful condition’ of a worker.

Inputs in Tasks.txt and Workers.txt:
1. Tasks.txt
Format: taskId,description;uncertainty$difficulty%priorityLabel&workers:a list of worker IDs
Example: 123,image labelling;5$10%1&workers:0,1
Note: the ‘priorityLabel’ shows whether this task has a high priority or not (1 denotes high priority and 0 denotes low priority).
2. Workers.txt
Format: workerId,name%variability$ability;experienceLabel
Example: 0,Michael%-2$50;1
Note: the ‘experienceLabel’ shows whether this worker is senior or not (1 denotes a senior worker and 0 denotes an ordinary worker).

Worker performance and her/his average performance: The performance score is a sample drawn from a normal distribution in which mean = worker ability - task difficulty and standard deviation = task uncertainty + worker variability. E.g., for the task and the worker in the examples above, mean = 50 - 10 = 40 and standard deviation = 5 + (-2) = 3. The average performance is the average score of a certain number of independent draws from this distribution, plus a conditional value of 0 or 6. The number of draws is 5 if the task has a low priority and 10 if it is a high-priority task. The conditional value is 0 if the worker is an ordinary worker and 6 if the worker is senior. To summarize, the average performance of a worker is the average of {5 or 10} independent performance scores drawn from the normal distribution, plus {0 or 6}, where the values in the brackets {} depend on the type of the task and the worker.

Successful condition of a worker: We say the worker is successful if and only if the average performance score is greater than 50. Otherwise (i.e., the score is 50 or below), the assignment of the task to that worker fails.
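To make the rules above concrete, here is a minimal Python sketch of parsing one task record and one worker record and computing a worker's average performance. The function names and the use of the random module are illustrative only; they are not a required interface for the assignment.

import random

def parse_task(line):
    # "taskId,description;uncertainty$difficulty%priorityLabel&workers:id1,id2,..."
    task_id, rest = line.strip().split(',', 1)
    description, rest = rest.split(';', 1)
    uncertainty, rest = rest.split('$', 1)
    difficulty, rest = rest.split('%', 1)
    priority, workers = rest.split('&workers:', 1)
    return {'id': task_id, 'uncertainty': float(uncertainty),
            'difficulty': float(difficulty), 'priority': int(priority),
            'workers': workers.split(',')}

def parse_worker(line):
    # "workerId,name%variability$ability;experienceLabel"
    worker_id, rest = line.strip().split(',', 1)
    name, rest = rest.split('%', 1)
    variability, rest = rest.split('$', 1)
    ability, experience = rest.split(';', 1)
    return {'id': worker_id, 'name': name, 'variability': float(variability),
            'ability': float(ability), 'senior': int(experience) == 1}

def average_performance(task, worker, rng):
    mean = worker['ability'] - task['difficulty']
    std_dev = task['uncertainty'] + worker['variability']
    draws = 10 if task['priority'] == 1 else 5      # high-priority task: 10 trials, else 5
    bonus = 6 if worker['senior'] else 0            # senior worker: +6, ordinary worker: +0
    scores = [rng.gauss(mean, std_dev) for _ in range(draws)]
    return sum(scores) / draws + bonus

rng = random.Random(0)
task = parse_task('123,image labelling;5$10%1&workers:0,1')
worker = parse_worker('0,Michael%-2$50;1')
print(average_performance(task, worker, rng) > 50)  # prints whether this assignment would succeed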

$25.00

[SOLVED] CSCI218 Assessed Lab 2: Solving Problems by Searching

Searching is an important approach used by intelligent agents to solve numerous problems, including the travelling salesman problem, the 8-puzzle problem, the N-queens problem, and any practical problem that can be formulated as a search task in a state space. In this subject, we have studied two types of search methods and their specific algorithms, listed as follows.
• Uninformed search: Breadth-first search, Depth-first search, and Uniform-cost search.
• Informed search: Best-first search, Greedy best-first search, and A* search.
In this lab, you will apply them to solve the problem of travelling according to a simplified road map of Romania. You are provided with Python code that implements all the search functions and the simplified road map.

What you need to complete this lab:
1. The code (assessed_lab_2.py and utils.py) taken from the main textbook’s website [1].
2. The lecture content and recordings in Week 8.
3. A Python 3 programming environment with the required libraries, packages, and modules.

Objectives
• Understand the concept of Solving Problems by Searching and its implementation.
• Learn to choose classic searching methods and algorithms.
• Learn to conduct searching on a typical searching task.
• Learn to use libraries, packages and modules related to searching problems.

Questions (5 marks)
The sample code ‘assessed_lab_2.py’ includes seven functions for search algorithms (e.g., breadth-first tree search, depth-first tree search, and A*). Assume that the initial state is ‘Fagaras’ and the goal state is ‘Zerind’. Please complete the main function to call any five of the search functions (a sketch of such a main function is given after this description).
• Run your code and report the results by completing a table with the columns ID, Algorithm, Explored states, Solution, Path cost, and Time (seconds), with one row for each of the five algorithms (IDs 1 to 5).
• Your report must clearly describe the five selected search algorithms and provide a detailed analysis of the results, including how the LIFO stack and FIFO queue work for each expansion.
• Each correctly implemented search algorithm and its analysis is worth 1 mark.

Submission
• Submit a single PDF file which contains your answers to the questions of all tasks. All questions are to be answered. A clear and complete explanation needs to be provided with each answer.
• The PDF must contain typed text of your answers (do not submit a scan of a handwritten document; any handwritten document will be ignored). The document can include computer-generated graphics and illustrations (hand-drawn graphics and illustrations will be ignored).
• The PDF document of your answers should be no more than 5 pages including all graphs and illustrations. An appendix is allowed and will not be counted towards the page limit.
• The size limit for this PDF report is 20MB.
• Submit your Python code for the lab task separately. Note that your code must be in Python format (i.e., a PY file) and well documented. Other file formats (e.g., PDF and DOCX) will not be evaluated.
• ZIP all the files into a single zip file and submit it via the submission link on Moodle.
• Late submission will not be accepted without academic consideration being granted.

References
[1] https://github.com/aimacode/aima-python/blob/master/search.py
** END **
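A minimal sketch of such a main function, written under the assumption that the GraphProblem class, the romania_map graph, and the usual graph-search functions from the aima-python search module [1] are exposed by assessed_lab_2.py (check the provided file; these names are assumptions). Any five of the seven provided search functions can be substituted.

import time
from assessed_lab_2 import (GraphProblem, romania_map,
                            breadth_first_graph_search, depth_first_graph_search,
                            uniform_cost_search, greedy_best_first_graph_search,
                            astar_search)

def main():
    problem = GraphProblem('Fagaras', 'Zerind', romania_map)
    algorithms = [
        ('Breadth-first graph search', breadth_first_graph_search),
        ('Depth-first graph search', depth_first_graph_search),
        ('Uniform-cost search', uniform_cost_search),
        ('Greedy best-first search', lambda p: greedy_best_first_graph_search(p, p.h)),
        ('A* search', astar_search),
    ]
    for name, search in algorithms:
        start = time.time()
        goal_node = search(problem)              # each search returns the goal Node
        elapsed = time.time() - start
        print(name)
        print('  Solution path:', goal_node.solution())
        print('  Path cost:', goal_node.path_cost)
        print('  Time (seconds): %.4f' % elapsed)

if __name__ == '__main__':
    main()

The explored-states column of the table can be filled by instrumenting the problem (for example, counting calls to its successor function), which is left to the lab submission.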

$25.00

[SOLVED] CSCI218 Assessed Lab 1: Text Classification with Naïve Bayes Classifiers

Text classification is an important task in natural language processing. It aims to categorise a given document, paragraph, or sentence into a set of predefined classes. Text classification has applications in business, social media, electronic health records, and education, to name just a few. In this assignment, you will conduct text classification by using a Naïve Bayes classifier on a benchmark dataset. You are provided with the 20 newsgroups dataset and the code explained in Week 6’s tutorial. They can be obtained from Moodle by downloading the file “20-newsgroups_textclassification-master.zip” [1] under “Week 6 Lab.”

What you need to complete this lab:
• The 20 newsgroups dataset and the code (“Multinomial Naive Bayes- BOW with TF.ipynb”).
• The lecture & tutorial content and recordings in Week 6.
• A Python 3 programming environment with the required libraries, packages, and modules.

Objectives
• Understand Bayes’ formula and the Naïve Bayes classifier.
• Understand text classification and the pre-processing procedure in natural language processing.
• Learn to conduct text classification on a benchmark dataset.
• Learn to use libraries, packages and modules related to text classification.

Questions (5 marks)
Run the code “Multinomial Naive Bayes- BOW with TF.ipynb” (further modifications in the code may be required) and answer the following questions (an outline of the workflow is sketched after this description):
1. Describe the key steps for data preparation and feature extraction (1 mark).
2. Report the overall classification results, including precision, recall, and f1-score. Explain the meaning of these criteria (1 mark).
3. Plot the confusion matrix for your classification result. Find the pair of classes that confuses the classifier most. Is this result consistent with your expectation? (1 mark).
4. Based on the confusion matrix, report the individual accuracy scores for each class (1 mark).
5. Train a Complement Naive Bayes classifier and compare its classification results with those of Multinomial Naive Bayes (1 mark).

Submission
• Submit a single PDF file which contains your answers to the questions of all tasks. All questions are to be answered. A clear and complete explanation needs to be provided with each answer.
• You must show your name and student number on the first page of the PDF report.
• Your PDF report should begin with a short introduction to the lab and the dataset.
• Submit the PDF file via the submission link on Moodle.
• The PDF file must contain typed text of your answers (do not submit a scan of a handwritten document; any handwritten document will be ignored). The document can include computer-generated graphics and illustrations (hand-drawn graphics and illustrations will be ignored).
• The PDF document of your answers should be no more than 4 pages including all graphs and illustrations. An appendix is allowed and will not be counted towards the 4-page limit.
• The size limit for this PDF report is 20MB.
• Late submission will not be accepted without academic consideration being granted.

References
[1] https://github.com/gokriznastic/20-newsgroups_text-classification
** END **
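For orientation, the notebook's workflow can be reproduced roughly as follows with scikit-learn. This is an outline under assumptions, not the marked solution: it uses scikit-learn's bundled 20 newsgroups loader rather than the zip from Moodle, so the exact scores will differ from the provided notebook.

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB, ComplementNB
from sklearn.metrics import classification_report, confusion_matrix

train = fetch_20newsgroups(subset='train', remove=('headers', 'footers', 'quotes'))
test = fetch_20newsgroups(subset='test', remove=('headers', 'footers', 'quotes'))

# Bag-of-words with term frequencies (data preparation and feature extraction).
vectorizer = CountVectorizer(stop_words='english')
X_train = vectorizer.fit_transform(train.data)
X_test = vectorizer.transform(test.data)

for clf in (MultinomialNB(), ComplementNB()):
    clf.fit(X_train, train.target)
    pred = clf.predict(X_test)
    print(type(clf).__name__)
    # Overall precision, recall, and f1-score per class.
    print(classification_report(test.target, pred, target_names=test.target_names))
    cm = confusion_matrix(test.target, pred)
    # Per-class accuracy: diagonal of the confusion matrix divided by the row sums.
    print('Per-class accuracy:', cm.diagonal() / cm.sum(axis=1))

The confusion matrix returned above can be plotted (e.g. with matplotlib's imshow) to find the pair of classes the classifier confuses most.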

$25.00

[SOLVED] CSCI203 Assignment Two: simulate the queuing and service of bank customers (Python, Java, or C++)

Task: Write a program in your selected programming language (Python, Java, or C++) to simulate the queuing and service of bank customers. The simulation should be discrete event driven.

Your program should:
1. Read in the number of servers/tellers for the simulation,
2. Read in the file name from standard input and then open the file,
3. Read data record by record from the named file,
4. Perform the simulation according to the input records, and
5. Output the statistics of the services as specified below.

The input text file (a2-sample.txt) contains the following data:
1. A set of customers, one customer per line, each line consisting of a customer’s arrival time, the corresponding service time, and the customer’s priority (1 for low, 2 for normal, 3 for high).
2. The set of customers is sorted in ascending order of their arrival times.
3. The set is terminated by a dummy record with the arrival time and the service time both equal to 0.

The simulation models a bank branch with multiple tellers (i.e., servers) and a single queue, and should be run as follows:
1. Set up the total number of tellers;
2. Read the text file to start the simulation;
3. Allocate a newly arrived customer to the next idle teller (if applicable);
4. If all tellers are busy, add the newly arrived customer to the queue;
5. The queue shall be ordered by two criteria: the customers’ priorities and their arrival times. For customers with the same priority level, those who arrive earlier are placed in front of those who arrive later; however, a customer with a higher priority is placed in front of customers with a lower priority regardless of arrival times. For example, given three customers c1 (2.8, 4.8, 2), c2 (1.5, 3.1, 2), and c3 (3, 5.6, 3), their order in the queue should be ‘c3, c2, c1’, which means the service order for these three customers is c3, c2, c1. (A small sketch of this ordering rule appears after this description.)
6. When a busy teller finishes a service request, it serves the first customer in the queue (that customer is removed from the queue). If the queue is empty, the teller becomes idle;
7. The simulation runs until the queue is empty and the last customer has left the system.

The simulation will be run multiple times to gather statistics on the efficiency of the service with different numbers of tellers on duty. You are required to run the simulation three times, with the number of servers set to 1 teller, 2 tellers and 4 tellers, respectively. Each simulation runs on the sample input file and outputs the statistics specified below. Output for each run will consist of the following data:
• Number of customers served by each teller;
• Total time of the simulation, i.e., the time difference between when the simulation starts and when it ends;
• Average service time per customer (this excludes the time spent in the queue);
• Average waiting time per customer (this excludes the time spent in service);
• The maximum length of the queue;
• The average length of the queue (this can be estimated as the ratio between the total queuing time and the total number of customers);
• The idle rate of each teller (this can be calculated as the ratio between the total idle time of each teller and the total time of the simulation).

Implementation requirements:
1. Part of the purpose of this subject is to gain an in-depth understanding of data structures and algorithms. As such, all programming tasks in this subject require you to choose appropriate data structures and/or algorithms and implement them yourself.
2. You may use any data structures and/or algorithms that have been presented in class. If you use other data structures or algorithms, appropriate references must be provided.
3. Pseudocode should be provided first to explain the principle of your program.
4. Programs must compile and run. Programs which do not compile and run will lose marks.
5. Programs should be appropriately explained with comments.
6. All coding must be your own work.
7. You are only allowed to include or import input, output, and file streams at the beginning of your program. That is, you should only use built-in data structures and functions to implement your program. For instance, if you want to use a heap and heap-related functions in your program, you should implement the heap and the relevant functions yourself using a built-in array instead of calling a heap from an existing library. Specifically, you should start your program like this:
7.1 If you use C++, the header should be: #include <iostream> #include <fstream> using namespace std;
7.2 For Java: import java.io.File; import java.io.FileNotFoundException; import java.util.Scanner;
7.3 If you use Python: import sys and import numpy
8. Use of the STL, the Java Collections Framework, or any third-party libraries of data structures and algorithms in your chosen language is NOT allowed.
9. If you use any references other than the lecture notes, ensure you cite them; otherwise, you may lose marks. A clear comment in your code is sufficient.
10. Code sourced from textbooks, the internet, etc. may also not be used; otherwise, you may lose marks.
11. If needed, you may assume the queue will never have more than 100 customers.

Report: A PDF file describing your solution and program output should be produced. This file should contain:
1. A high-level description (in pseudocode) of the overall solution strategy.
2. A complexity analysis of your solution with big-O notation and sufficient justification.
3. A list of all of the data structures used, and the reasons for using them.
4. A snapshot of the compilation and the execution of your program on the provided “a2-sample.txt” file.
5. The outputs (three runs in total) produced by your program on the provided “a2-sample.txt” file.
6. A discussion, based on the statistics, of how the number of tellers affects the efficiency of the service.
7. The report PDF file should be called <your login>-a2.pdf.

Submission via Moodle:
• Please submit your source code and the PDF report as a zip file (named <your login>-a2.zip) to the CSCI203 Moodle site (Assignment 2 submission folder) before the deadline.
• Please note that email submission will NOT be accepted.
• If an extension (maximally 1 week) is required, please submit an academic consideration via SOLS before the deadline.
• Late submission will receive a 25% penalty of the assessment weight per day and a zero mark after 3 days.

Marking guide:
• Programs submitted must work (can be compiled and executed)! A program which fails to compile or run will lose marks.
• If your program produces different output from what is reported in the PDF file, a mark of zero will be graded.
• A program which produces the correct output, no matter how inefficient the code, will receive a minimum of 50% of the program component of the mark.
• Additional marks beyond this will be awarded for the appropriateness, i.e., efficiency for this problem, of the algorithms and data structures you use.
• Programs which lack clarity, both in code and comments, will lose marks. The total mark will be determined based on both your code and the accompanying design PDF document.
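As referenced in step 5 above, a minimal Python sketch of the required queue ordering. It is illustrative only: the assignment requires you to implement the queue (or heap) yourself rather than rely on library containers.

def queue_key(customer):
    arrival_time, service_time, priority = customer
    # Higher priority is served first; within the same priority, earlier arrivals come first.
    return (-priority, arrival_time)

# The example from the specification: c1 (2.8, 4.8, 2), c2 (1.5, 3.1, 2), c3 (3, 5.6, 3).
c1, c2, c3 = (2.8, 4.8, 2), (1.5, 3.1, 2), (3.0, 5.6, 3)
print(sorted([c1, c2, c3], key=queue_key))
# -> [(3.0, 5.6, 3), (1.5, 3.1, 2), (2.8, 4.8, 2)], i.e. service order c3, c2, c1

In the event-driven simulation itself, the two event types are customer arrivals and service completions; the queue above is consulted whenever a teller becomes free.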

$25.00

[SOLVED] CSCI203 Assignment One: word-frequency statistics for a text file (Python, Java, or C++)

Task: Propose an algorithm and implement it in your selected programming language (Python, Java, or C++). The program shall read and process a text file and generate a statistical report on its content.

Your program should:
1. Read the name of a text file from the console.
2. Read in the text file, not all at once. (This can be line by line, word by word, or character by character.)
3. Convert the textual content to a sequence of words, discarding punctuation.
4. Convert all letters into lower case.
5. Store a count (the number of occurrences) of each different word.
6. Sort the words in decreasing order of their counts. If there are multiple words with the same count, sort them alphabetically. (This ordering may be achieved as the words are read in, partially as the words are read, or at the end of all input processing.)
7. Output the first ten words in the sorted list, along with their counts.
8. Output the last ten words in the sorted list, along with their counts.
9. Output all ‘unique words’ (words whose count is 1).
(A behavioural sketch of these steps appears after this description.)

Implementation: You must choose appropriate data structures and algorithms to accomplish this task. Note that:
1) In the context of this assignment, appropriate choices will be efficient and will not use excessive instructions or data.
2) Where a punctuation mark appears between two letters, the sequence is to be treated as a single word. Thus, ‘it’s’ will become ‘its’, ‘you’ll’ will become ‘youll’ and ‘loop-hole’ will become ‘loophole’.
3) You can assume that the input file contains no more than 50,000 different words.
4) Two sample input files, “sample-short.txt” and “sample-long.txt”, are provided for you to test your program and produce the program report.
5) You may use any data structures or algorithms that have been presented in class up to the end of week 4. If you use other data structures or algorithms, appropriate references must be provided.
6) Programs must compile and execute. Otherwise, a zero mark will be applied.
7) Programs should be appropriately documented with comments.
8) All coding must be your own work.
9) Standard libraries of data structures and algorithms such as the STL should NOT be used.
10) The String class should NOT be used; you must define your own string pool.
11) Code from textbooks, the internet, etc. may also not be used. Otherwise, you will receive a zero mark.

Report: A PDF file describing your solution and program output should be produced. This file should contain:
1. A high-level description (in pseudocode) of the overall solution strategy.
2. A complexity analysis of your solution with big-O notation and sufficient justification.
3. A list of all of the data structures used, and the reasons for using them.
4. A snapshot of the compilation and the execution of your program on the provided “sample-long.txt” file.
5. The output produced by your program on the provided “sample-long.txt” file.
6. The report PDF file should be called <your login>-a1.pdf.

Submission:
• Please submit your source code and the PDF report as a zip file (named <your login>-a1.zip) to the CSCI203 Moodle site (Assignment 1 submission folder) before the deadline.
• Please note that email submission will NOT be accepted.
• If an extension (maximally 1 week) is required, please submit an academic consideration via SOLS before the deadline.
• Late submission will receive a 25% penalty of the assessment weight per day and a zero mark after 3 days.

Marking guide:
• Programs submitted must work (can be compiled and executed)! A program which fails to compile or run will receive a zero mark.
• If your program produces different output from what is reported in the PDF file, a mark of zero will also be graded.
• A program which produces the correct output, no matter how inefficient the code, will receive a minimum of 50% of the program component of the mark.
• Additional marks beyond this will be awarded for the appropriateness, i.e., efficiency for this problem, of the algorithms and data structures you use.
• Programs which lack clarity, both in code and comments, will lose marks. The total mark will be determined based on both your code and the accompanying design PDF document.
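A behavioural sketch in Python of the counting and ordering rules above. It is illustrative only: the assignment bans standard data-structure libraries and the String class, so your submission must implement the dictionary-like structure, the string pool, and the sorting yourself.

def word_counts(filename):
    counts = {}
    with open(filename) as f:
        for line in f:                           # read line by line, not all at once
            for token in line.split():
                # Drop punctuation entirely so the letters on either side are joined:
                # "it's" -> "its", "loop-hole" -> "loophole"; letters are lower-cased.
                word = ''.join(ch.lower() for ch in token if ch.isalpha())
                if word:
                    counts[word] = counts.get(word, 0) + 1
    return counts

if __name__ == '__main__':
    filename = input('Enter the name of the text file: ')   # file name from the console
    counts = word_counts(filename)
    # Decreasing count first; ties broken alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    print('First ten:', ordered[:10])
    print('Last ten:', ordered[-10:])
    print('Unique words:', [w for w, c in ordered if c == 1])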

$25.00

[SOLVED] CS6601 Assignment 6: Hidden Markov Models and Viterbi decoding of Morse code

Part 1: International Morse Code (55 pts)

Given a model of a system’s transition and emission probabilities and a series of evidence observations, what is the most likely sequence of states that generated the observed evidence? International Morse Code represents each letter by a unique sequence of dots and dashes. In theory, the duration of a dash is three times the duration of a dot. Each dot or dash is followed by a short silence, equal to the dot duration. The letters of a word are separated by a space equal to three dots (one dash), and the words are separated by a space equal to seven dots. The dot duration is the basic unit of time measurement in code transmission. (The standard International Morse Code chart appears here in the original notebook.)

1a) The HMM for letter A has 3 states, as shown on the graph in the original notebook. The transition, emission, and prior probabilities are below.

In [ ]:
A_states = ('A1', 'A2', 'A3', 'Aend')
A_transition_probs = {
    'A1': {'A1': 0, 'A2': 1, 'A3': 0, 'Aend': 0},
    'A2': {'A1': 0, 'A2': 0, 'A3': 1, 'Aend': 0},
    'A3': {'A1': 0, 'A2': 0, 'A3': 0.667, 'Aend': 0.333},
    'Aend': {'A1': 0, 'A2': 0, 'A3': 0, 'Aend': 1}
}
A_emission_probs = {
    'A1': [0, 1],
    'A2': [1, 0],
    'A3': [0, 1],
    'Aend': [0, 0]
}
A_prior = {'A1': 1, 'A2': 0, 'A3': 0, 'Aend': 0}
B_states = ('B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'Bend')
C_states = ('C1', 'C2', 'C3', 'C4', 'C5', 'C6', 'C7', 'Cend')

Following A’s example, please provide the transition, prior and emission probabilities for letters B and C (accurate to 3 decimal places).

In [ ]:
def part_1_a():  # (10 pts)
    # TODO: Fill below!
    B_transition_probs = { }
    B_emission_probs = { }
    B_prior = { }
    C_transition_probs = { }
    C_emission_probs = { }
    C_prior = { }
    return B_prior, B_transition_probs, B_emission_probs, C_prior, C_transition_probs, C_emission_probs

1b) Now the evidence sequence [1,1,0,1,0,1,0,1] is received. Implement the Viterbi algorithm to get the most likely state sequences for A, B and C, and output the probabilities associated with the most likely state sequences. Hint: in order to reconstruct your most-likely path after running Viterbi, you’ll need to keep track of a back-pointer at each state, which points you to that state’s most likely predecessor. In the autograder, we will also test your code against other evidence_vectors.

In [ ]:
import numpy as np

def viterbi(evidence_vector, prior, states, transition_probs, emission_probs):
    sequence = []
    probability = 0
    """ 45 points
    Input:
    evidence_vector: A list of integers (0,1)
    prior: A dictionary corresponding to the prior distribution over states
    states: A list of all possible system states
    transition_probs: A dictionary mapping states onto dictionaries mapping states onto probabilities
    emission_probs: A dictionary mapping states onto their emission probabilities
    Output:
    sequence: A list of states that is the most likely sequence of states explaining the evidence,
    like ['A1', 'A2', 'A3', 'A3', 'A3']
    probability: float
    """
    return sequence, probability

In [ ]:
evidence_sequence = [1,1,0,1,0,1,0,1]
B_prior, B_transition_probs, B_emission_probs, C_prior, C_transition_probs, C_emission_probs = part_1_a()
A_state_sequence, probability = viterbi(evidence_sequence, A_prior, A_states, A_transition_probs, A_emission_probs)
B_state_sequence, probability = viterbi(evidence_sequence, B_prior, B_states, B_transition_probs, B_emission_probs)
C_state_sequence, probability = viterbi(evidence_sequence, C_prior, C_states, C_transition_probs, C_emission_probs)

Part 2: Let’s add some noise!!! (45 pts)

Now the noise comes in.
For the emission probability, instead of having a 0/1 pair, we will have the 0.2/0.8 pair. And for transition probability, for those having 0/1 as self-transitioning/transitioning-to-another-state pair, we change it to 0.2/0.8. All other probabilities remain the same. Letter A is shown here as an example. And the changed transition, emission probabilities for A, B and C along with the HMM for space_between_two_words and space_between_two_letters are given below. In [ ]: A_transition_probs_noise = { 'A1': {'A1': 0.2, 'A2': 0.8, 'A3':0,'Aend':0}, 'A2': {'A1': 0, 'A2': 0.2, 'A3':0.8,'Aend':0}, 'A3': {'A1': 0, 'A2': 0, 'A3':0.667,'Aend':0.333}, 'Aend': {'A1': 0, 'A2': 0, 'A3':0,'Aend':1} } A_emission_probs_noise = { 'A1' : [0.2,0.8], 'A2' : [0.8,0.2], 'A3' : [0.2,0.8], 'Aend' : [0,0] } B_transition_probs_noise = { 'B1': {'B1': 0.667, 'B2': 0.333, 'B3':0,'B4': 0, 'B5': 0, 'B6':0,'B7': 0,'Bend':0}, 'B2': {'B1': 0, 'B2': 0.2, 'B3':0.8,'B4': 0, 'B5': 0, 'B6':0,'B7': 0,'Bend':0}, 'B3': {'B1': 0, 'B2': 0, 'B3':0.2,'B4': 0.8, 'B5': 0, 'B6':0,'B7': 0,'Bend':0}, 'B4': {'B1': 0, 'B2': 0, 'B3':0,'B4': 0.2, 'B5': 0.8, 'B6':0,'B7': 0,'Bend':0}, 'B5': {'B1': 0, 'B2': 0, 'B3':0,'B4': 0, 'B5': 0.2, 'B6':0.8,'B7': 0,'Bend':0}, 'B6': {'B1': 0, 'B2': 0, 'B3':0,'B4': 0, 'B5': 0, 'B6':0.2,'B7': 0.8,'Bend':0}, 'B7': {'B1': 0, 'B2': 0, 'B3':0,'B4': 0, 'B5': 0, 'B6':0,'B7': 0.2,'Bend':0.8}, 'Bend': {'B1': 0, 'B2': 0, 'B3':0,'B4': 0, 'B5': 0, 'B6':0,'B7': 0,'Bend':1} } B_emission_probs_noise = { 'B1' : [0.2,0.8], 'B3' : [0.2,0.8], 'B5' : [0.2,0.8], 'B7' : [0.2,0.8], 'B2' : [0.8,0.2], 'B4' : [0.8,0.2], 'B6' : [0.8,0.2], 'Bend': [0,0] } C_transition_probs_noise = { 'C1': {'C1': 0.667, 'C2': 0.333, 'C3':0,'C4': 0, 'C5': 0, 'C6':0,'C7': 0,'Cend':0}, 'C2': {'C1': 0, 'C2': 0.2, 'C3':0.8,'C4': 0, 'C5': 0, 'C6':0,'C7': 0,'Cend':0}, 'C3': {'C1': 0, 'C2': 0, 'C3':0.2,'C4': 0.8, 'C5': 0, 'C6':0,'C7': 0,'Cend':0}, 'C4': {'C1': 0, 'C2': 0, 'C3':0,'C4': 0.2, 'C5': 0.8, 'C6':0,'C7': 0,'Cend':0}, 'C5': {'C1': 0, 'C2': 0, 'C3':0,'C4': 0, 'C5': 0.667, 'C6':0.333,'C7': 0,'Cend':0}, 'C6': {'C1': 0, 'C2': 0, 'C3':0,'C4': 0, 'C5': 0, 'C6':0.2,'C7': 0.8,'Cend':0}, 'C7': {'C1': 0, 'C2': 0, 'C3':0,'C4': 0, 'C5': 0, 'C6':0,'C7': 0.2,'Cend':0.8}, 'Cend': {'C1': 0, 'C2': 0, 'C3':0,'C4': 0, 'C5': 0, 'C6':0,'C7': 0,'Cend':1} } C_emission_probs_noise = { 'C1' : [0.2,0.8], 'C3' : [0.2,0.8], 'C5' : [0.2,0.8], 'C7' : [0.2,0.8], 'C2' : [0.8,0.2], 'C4' : [0.8,0.2], 'C6' : [0.8,0.2], 'Cend': [0,0] } space_between_two_letters_states = ('L1','Lend') space_between_two_words_states = ('W1','Wend') space_between_two_letters_transition_probs = { 'L1': {'L1': 0.667,'Lend':0.333}, 'Lend': {'L1': 0,'Lend':1} } space_between_two_letters_emission_probs = { 'L1' : [0.8,0.2], 'Lend': [0,0] } space_between_two_letters_prior = { 'L1': 1, 'Lend': 0 } space_between_two_words_transition_probs = { 'W1': {'W1': 0.857,'Wend':0.143}, 'Wend': {'W1': 0,'Wend':1} } space_between_two_words_emission_probs = { 'W1' : [0.8,0.2], 'Wend': [0,0] } space_between_two_words_prior = { 'W1': 1, 'Wend': 0 } 2a) Suppose evidence sequence [1,1,1,0,1,1,1,0,1,0,0,0,1,0,0,0,1,1,1,1,1,0,1,1,1,0,1,0,0,0,0,0,1,1,1] is received. Only A, B, C and the 2 types of spaces are involved. A sequence will start with a letter and end with a letter. The same letter may be repeated. The sequence can be made up of one or more words. And a word can be made up of one or more letters. The beginning of the sequence can be any letter with equal probability. After each letter there is an end or a space. 
It can be an inner letter space or inner word space and equal probability for end, inner letter space and inner word space. A space can transition to any letter with equal probability. When it reaches the end state, it stays at the end state. In our experiments above we assumed that each letter would be decoded in isolation. Each evidence string decoded to one letter. Now the evidence string can decode to a string of letters and spaces. The state sequence for “A” can end with Aend or can transition to L1 (space between two letters) or W1 (space between two words). Redefine the states, prior, transition and emission probabilities given above so that your Viterbi algorithm will be able to decode strings of letters. (With noise, accurate to 3 decimal places) In [ ]: def part_2_a(): #TO DO: fill in below states=( ) prior_probs = { } transition_probs={ } emission_probs={ } return states, prior_probs, transition_probs, emission_probs Instead of checking all probabilities, we will check only five of them. Finish the quick_check below In [ ]: def quick_check(): #TO DO: fill the probabilities, 5 points #prior probability for C1 prior_C1= #transition probability from A3 to L1 A3_L1= #transition probability from B4 to B5 B4_B5= #transition probability from W1 to B1 W1_B1= #transition probability from L1 to L1 L1_L1= return prior_C1,A3_L1,B4_B5,W1_B1,L1_L1 2b) Use your output in 2a and evidence_sequence to generate the most probable decoded letter sequence from the evidence sequence and its probability. Your code will also be tested against other evidence_vectors. In [ ]: states, prior_probs, transition_probs, emission_probs=part_2_a() In [ ]: def part_2_b(evidence_vector, prior, states, transition_probs, emission_probs): sequence='' probability=0 ''' TO DO: fill this (40 points) Output: sequence: a string of most likely decoded letter sequence (like 'A B A CAC', using uppercase) probability: float ''' return sequence, probability In [ ]: evidence_sequence = [1,1,0,0,1,0,1,0,1,0,0,0,0,0,1,0,1,1,1,0,1,0,1,1,1,0,1,0,0,1,1,0,1,1,1] evidence_sequence=[1,1,0,0,1,0,1,0,1] print part_2_b(evidence_sequence,prior_probs,states,transition_probs,emission_probs) Extra Credit Here is a piece of code that listens for a first left mouse click and starts producing a 1 or 0 every 100 milliseconds depending on whether or not the left mouse is depressed. When it senses a “return” key, the code finishes writing the binary sequence and you can close the window. The time of pressing and depressing is rounded to the nearest 100 milliseconds. In [ ]: from Tkinter import * import sys import time message='' key_start=-1 space_start=-1 def press(event): global message,key_start,space_start times=0 if space_start>0: times=int(round((time.time()-space_start)*10)) key_start=time.time() for i in range(times): message+='0,' msg.insert(END,'0,') def depress(event): global message,key_start,space_start times=0 if key_start>0: times=int(round((time.time()-key_start)*10)) space_start=time.time() for i in range(times): message+='1,' msg.insert(END,'1,') def end(event): global message print message root=Tk() msg=Text(root) msg.pack() msg.config(font=('times',20)) button=Button(root,text="press me") button.pack() button.bind('',press) button.bind('',depress) root.bind('',end) mainloop() Use the program (and your skill as a Morse code keyer) to generate a few strings of 0s and 1s that represent words in Morse code. Please note the temporal variability of human input. 
For example, SOS’s Morse code is ... --- ..., but the user’s input might be 11100110011100011111100111111100111111001101110011. Finish creating HMMs for the rest of the letters of the Morse alphabet. Show that your Viterbi decoder for Morse can successfully decode the examples you made above (or at least get close). You may have to tune the transition probabilities to get reliable results. Include your binary strings and your decodings of them. Below are two webpages that you might find useful.
http://thelivingpearl.com/2013/01/08/morse-code-and-dictionaries-in-python-with-sound/
http://www.prooffreader.com/2014/09/how-often-does-given-letter-follow.html
Besides the decoder, you also need to submit a string (letters and spaces only) and its Morse code equivalent as performed by a human (in 0s and 1s). Your code will be tested against our string and strings submitted by other students. The decoder that gets the best results wins the competition. We will have no noise in the system. Temporal variability means that dots can be different lengths (“1” vs. “11” vs. “111”) and dashes can be different lengths (“11111” vs. “111111” vs. “1111111”), but the system should still decode properly. For our test sentences, we will make sure that the dots and dashes have some variability in them but still maintain that a dot is approximately 1/3 the length of a dash.

In [ ]:
def decoder(evidence_vector):
    # you can define your prior, emission, transition probabilities in your own format
    return sequence

In [ ]:
def extra_credit():
    # string: like "A BC"
    # morse_code_equivalent: like [1,0,1]
    return string, morse_code_equivalent
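For reference, the Viterbi function of Part 1b can be structured as follows. This is a minimal sketch of the standard Viterbi recurrence written for the dictionary-based HMM format used above (prior and transition dictionaries keyed by state name, emission lists indexed by the observed symbol 0 or 1); it is not the graded reference solution, and it works with raw probabilities rather than log probabilities.

def viterbi(evidence_vector, prior, states, transition_probs, emission_probs):
    if not evidence_vector:
        return [], 0.0

    # delta[s] holds the probability of the most likely path that ends in state s.
    delta = {s: prior[s] * emission_probs[s][evidence_vector[0]] for s in states}
    back_pointers = []

    for observation in evidence_vector[1:]:
        new_delta, pointers = {}, {}
        for s in states:
            # Pick the best predecessor state for s at this time step.
            best_prev = max(states, key=lambda r: delta[r] * transition_probs[r].get(s, 0))
            pointers[s] = best_prev
            new_delta[s] = (delta[best_prev] * transition_probs[best_prev].get(s, 0)
                            * emission_probs[s][observation])
        delta = new_delta
        back_pointers.append(pointers)

    # Follow the back-pointers from the most probable final state.
    last_state = max(delta, key=delta.get)
    probability = delta[last_state]
    sequence = [last_state]
    for pointers in reversed(back_pointers):
        sequence.append(pointers[sequence[-1]])
    sequence.reverse()
    return sequence, probability

Called with the A, B or C priors, states, transition and emission dictionaries from Part 1a, it returns the most likely state path and its probability for a given evidence vector.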

$25.00

[SOLVED] CS6601 Assignment 5: Gaussian Mixture Models

The Challenge
Automatic image processing is a key component of many AI systems, including facial recognition and video compression. One basic method for processing is segmentation, by which we divide an image into a fixed number of components in order to simplify its representation. For example, we can train a mixture of Gaussians to represent an image and segment it according to the simplified representation, as shown in the example images in the original notebook. In this assignment, you will learn to perform image segmentation. To this end, you will implement Gaussian mixture models and iteratively improve their performance. You will perform this segmentation on the “Bird” and “Party Spock” images included with the assignment.

About the Assignment
The tests for the assignment are provided in the notebook, so Bonnie is only for submission purposes. The tests on Bonnie will be similar to the ones provided here, but the images being tested against and the values for the calculations will be different. Thus, you will be allowed only 5 submissions on Bonnie. Make sure you test everything before submitting; the score for the last submission counts. The code will be allowed to run for no more than 2 hours per submission. In order for the code to run quickly, make sure to vectorize the code (more on this below).

Your Mission (should you choose to accept it)
Your assignment is to implement several methods of image segmentation, with increasing complexity:
• Implement k-means clustering to segment a color image.
• Build a Gaussian mixture model to be trained with expectation-maximization.
• Experiment with varying the details of the Gaussian mixture model’s implementation.
• Implement and test a new metric called the Bayesian information criterion, which guarantees a more robust image segmentation.

Grading
The grade you receive for the assignment will be distributed as follows: k-Means Clustering (20 points), Gaussian Mixture Model (40 points), Model Performance Improvements (20 points), Bayesian Information Criterion (20 points), plus a Bonus.

Resources
The em.pdf chapter in the assignment folder gives a good explanation of implementing and training mixture models, particularly pages 424-427 (k-means) and 435-439 (mixture models and EM). See also the book Elements of Statistical Learning, pages 291-295.

Background
A Gaussian mixture model is a generative model for representing the underlying probability distribution of a complex collection of data, such as the collection of pixels in a grayscale photograph. In the context of this problem, a Gaussian mixture model defines the joint probability f(x) as

f(x) = Σ_{i=1}^{k} m_i N_i(x | μ_i, σ_i²)

where x is a grayscale value in [0,1], f(x) is the joint probability of that grayscale value, m_i is the mixing coefficient on component i, and N_i is the i-th Gaussian distribution underlying the value x with mean μ_i and variance σ_i². We will be using this model to segment photographs into different grayscale regions. The idea of segmentation is to assign a component i to each pixel x using the maximum posterior probability:

component(x) = argmax_i ( m_i N_i(x | μ_i, σ_i²) )

Then we replace each pixel in the image with its corresponding μ_i to produce a segmented result (in the original notebook, the original image is shown above a version segmented with three components).
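To make the two formulas above concrete, here is a hedged NumPy sketch of evaluating the component densities for a flattened grayscale image and assigning each pixel to its maximum-posterior component. The variable names are illustrative only, and a full implementation should work with log probabilities, as discussed later in the assignment.

import numpy as np

def segment_grayscale(pixels, means, variances, mixing_coefficients):
    """pixels: 1-D array of grayscale values in [0, 1];
    means, variances, mixing_coefficients: 1-D arrays with one entry per component."""
    x = pixels[:, np.newaxis]                               # shape (N, 1) for broadcasting
    # Gaussian density N_i(x | mu_i, sigma_i^2) for every pixel and every component.
    densities = (np.exp(-(x - means) ** 2 / (2 * variances))
                 / np.sqrt(2 * np.pi * variances))          # shape (N, k)
    posterior = mixing_coefficients * densities             # proportional to m_i * N_i(x)
    components = np.argmax(posterior, axis=1)               # component(x) = argmax_i m_i N_i(x)
    return means[components]                                # replace each pixel with its mu_i

# Example: three components applied to a few pixel values.
pixels = np.array([0.05, 0.4, 0.95])
print(segment_grayscale(pixels,
                        np.array([0.1, 0.5, 0.9]),
                        np.array([0.01, 0.01, 0.01]),
                        np.array([1 / 3, 1 / 3, 1 / 3])))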
In [ ]: from __future__ import division import warnings warnings.simplefilter(action = "ignore", category = FutureWarning) import numpy as np import scipy as sp from matplotlib import image import matplotlib.pyplot as plt import matplotlib.cm as cm In [ ]: """Helper image-processing code. These have been added in a separate python file and added in to the repo. The functions below have all been imported in to your submission file""" def image_to_matrix(image_file, grays=False): """ Convert .png image to matrix of values. params: image_file = str grays = Boolean returns: img = (color) np.ndarray[np.ndarray[np.ndarray[float]]] or (grayscale) np.ndarray[np.ndarray[float]] """ img = image.imread(image_file) # in case of transparency values if(len(img.shape) == 3 and img.shape[2] > 3): height, width, depth = img.shape new_img = np.zeros([height, width, 3]) for r in range(height): for c in range(width): new_img[r,c,:] = img[r,c,0:3] img = np.copy(new_img) if(grays and len(img.shape) == 3): height, width = img.shape[0:2] new_img = np.zeros([height, width]) for r in range(height): for c in range(width): new_img[r,c] = img[r,c,0] img = new_img return img def matrix_to_image(image_matrix, image_file): """ Convert matrix of color/grayscale values to .png image and save to file. params: image_matrix = (color) numpy.ndarray[numpy.ndarray[numpy.ndarray[float]]] or (grayscale) numpy.ndarray[numpy.ndarray[float]] image_file = str """ # provide cmap to grayscale images cMap = None if(len(image_matrix.shape) 1 and image_matrix.shape[-1] != 1): depth = image_matrix.shape[-1] return image_matrix.reshape(height, width, depth) else: return image_matrix.reshape(height, width) def image_difference(image_values_1, image_values_2): """ Calculate the total difference in values between two images. Assumes that both images have same shape. params: image_values_1 = (color) numpy.ndarray[numpy.ndarray[numpy.ndarray[float]]] or (grayscale) numpy.ndarray[numpy.ndarray[float]] image_values_2 = (color) numpy.ndarray[numpy.ndarray[numpy.ndarray[float]]] or (grayscale) numpy.ndarray[numpy.ndarray[float]] returns: dist = int """ flat_vals_1 = flatten_image_matrix(image_values_1) flat_vals_2 = flatten_image_matrix(image_values_2) N, depth = flat_vals_1.shape dist = 0. point_thresh = 0.005 for i in range(N): if(depth > 1): new_dist = sum(abs(flat_vals_1[i] - flat_vals_2[i])) if(new_dist > depth * point_thresh): dist += new_dist else: new_dist = abs(flat_vals_1[i] - flat_vals_2[i]) if(new_dist > point_thresh): dist += new_dist return dist In [ ]: image_dir = 'images/' image_file = 'party_spock.png' values = image_to_matrix(image_dir + image_file) print(values) Part 0: Note on Vectorization The concept of Vectorization was introduced in the last section of Assignment 4. For this assignment, please vectorize your code wherever possible using numpy arrays, instead of running for-loops over the images being processed. For an example of how this might be useful, consider the following array: A = [12 34 1234 764 …(has a million values)… 91, 78] Now you need to calculate another array B, which has the same dimensions as A above. Say each value in B is calculated as follows: (each value in B) = square_root_of(some constants pi log(k) * (each value in A))/7 You might wish to use a for-loop to compute this. However, it will take really long to run on an array of this magnitude. Alternatively, you may choose to use numpy and perform this calculation in a single line. 
You can pass A as a numpy array and the entire calculation will be done in a line, resulting in B being populated with the corresponding values that come out of this formula. Part 1: K-means clustering 20 pts One easy method for image segmentation is to simply cluster all similar data points together and then replace their values with the mean value. Thus, we’ll warm up using k-means clustering. This will also provide a baseline to compare with your segmentation. Please note that clustering will come in handy later. Fill out k_means_cluster() to convert the original image values matrix to its clustered counterpart. Your convergence test should be whether the assigned clusters stop changing. Note that this convergence test is rather slow. When no initial cluster means are provided, k_means_cluster() should choose kk random points from the data (without replacement) to use as initial cluster means. For this part of the assignment, since clustering is best used on multidimensional data, we will be using the color image bird_color_24.png. You can test your implementation of k-means using our reference images in k_means_test(). Try to vectorize the code for it to run faster. Without vectorization it takes 25-30 minutes for the code to run. In [ ]: from random import randint from functools import reduce def k_means_cluster(image_values, k=3, initial_means=None): """ Separate the provided RGB values into k separate clusters using the k-means algorithm, then return an updated version of the image with the original values replaced with the corresponding cluster values. params: image_values = numpy.ndarray[numpy.ndarray[numpy.ndarray[float]]] k = int initial_means = numpy.ndarray[numpy.ndarray[float]] or None returns: updated_image_values = numpy.ndarray[numpy.ndarray[numpy.ndarray[float]]] """ # TODO: finish this function raise NotImplementedError() return updated_image_values In [ ]: def k_means_test(): """ Testing your implementation of k-means on the segmented bird_color_24 reference images. 
""" k_min = 2 k_max = 6 image_dir = 'images/' image_name = 'bird_color_24.png' image_values = image_to_matrix(image_dir + image_name) # initial mean for each k value initial_means = [ np.array([[0.90980393,0.8392157,0.65098041],[0.83137256,0.80784315,0.69411767]]), np.array([[0.90980393,0.8392157,0.65098041],[0.83137256,0.80784315,0.69411767],[0.67450982,0.52941179,0.25490198]]), np.array([[0.90980393,0.8392157,0.65098041],[0.83137256,0.80784315,0.69411767],[0.67450982,0.52941179,0.25490198],[0.86666667,0.8392157,0.70588237]]), np.array([[0.90980393,0.8392157,0.65098041],[0.83137256,0.80784315,0.69411767],[0.67450982,0.52941179,0.25490198],[0.86666667,0.8392157,0.70588237],[0,0,0]]), np.array([[0.90980393,0.8392157,0.65098041],[0.83137256,0.80784315,0.69411767],[0.67450982,0.52941179,0.25490198],[0.86666667,0.8392157,0.70588237],[0,0,0],[0.8392157,0.80392158,0.63921571]]), ] # test different k values to find best for k in range(k_min, k_max+1): updated_values = k_means_cluster(image_values, k, initial_means[k-k_min]) ref_image = image_dir + 'k%d_%s'%(k, image_name) ref_values = image_to_matrix(ref_image) dist = image_difference(updated_values, ref_values) print('Image distance = %.2f'%(dist)) if(int(dist) == 0): print('Clustering for %d clusters produced a realistic image segmentation.'%(k)) else: print('Clustering for %d clusters didn't produce a realistic image segmentation.'%(k)) In [ ]: k_means_test() Part 2: Implementing a Gaussian mixture model 40 pts Next, we will step beyond clustering and implement a complete Gaussian mixture model. Complete the below implementation of GaussianMixtureModel so that it can perform the following: Calculate the joint log probability of a given greyscale value. (5 points) Use expectation-maximization (EM) to train the model to represent the image as a mixture of Gaussians. (20 points) To initialize EM, set each component’s mean to the grayscale value of randomly chosen pixel and variance to 1, and the mixing coefficients to a uniform distribution. Note: there are packages that can run EM automagically, but please implement your own version of EM without using these extra packages. We’ve set the convergence condition for you in GaussianMixtureModel.default_convergence(): if the new likelihood is within 10% of the previous likelihood for 10 consecutive iterations, the model has converged. Calculate the log likelihood of the trained model. (5 points) Segment the image according to the trained model. (5 points) Determine the best segmentation by iterating over model training and scoring, since EM isn’t guaranteed to converge to the global maximum. (5 points) We have provided the necessary tests for this part. When multiplying lots of probabilities in sequence, you can end up with a probability of zero due to underflow. To avoid this, you should calculate the log probabilities for the entire assignment. The log form of the Gaussian probability of scalar value xx is: ln(N(x|μ,σ))=−0.5ln(2πσ2)−(x−μ)22σ2ln(N(x|μ,σ))=−0.5ln(2πσ2)−(x−μ)22σ2 where μμ is the mean and σσ is standard deviation. You can calculate the sum of log probabilities by using scipy.misc.logsumexp(). For example, logsumexp([-2,-3]) will return the same result as numpy.log(numpy.exp(-2)+numpy.exp(-3)). In other words, logsumexp(a, b) = log(e^a + e^b). Rather than using lists of lists, you will find it much easier to store your data in numpy.array arrays. 
You can instantiate them using the command matrix = numpy.zeros([rows, columns]) where rows is the number of rows and columns is the number of columns in your matrix. numpy.zeros() generates a matrix of the specified size containing 0s at each row/column cell. You can access cells with the syntax matrix[2,3] which will return the value in row 2 and column 3. Warning: You may lose all marks for this part if your code runs for too long. You will need to vectorize your code in this part. Specifically, the method train_model() needs to perform operations using numpy arrays, as does likelihood(), which calculates the log likelihood. These are time-sensitive operations and will be called over and over as you proceed with this assignment. For the assignment, focus on vectorizing the following: The calculations for the Expectation step, where you calculate joint probabilities The calculations where you update your means, variances and mixing coefficients in the Maximization step. Remember, these are fundamental operations and will be called a lot in the remainder of the assignment. So it is crucial you optimize these. For the synthetic data test which we provide to check if your training is working, the set is too small and it won’t make a difference. But with the actual image that we use ahead, for-loops won’t do good. Vectorized code would take under 30 seconds to converge which would typically involve about 15-20 iterations with the convergence function we have here. Inefficient code that uses loops or iterates over each pixel value sequentially, will take hours to run. You don’t want to do that because: You won’t have that much time to test to your code. You won’t be getting marks. We will be capping the run time and kill anything that takes over 2 minutes for each iteration. You will want to have your image pixel values as a one-dimensional array to perform these operations. We have provided a method to flatten the image for this purpose. In [ ]: def default_convergence(prev_likelihood, new_likelihood, conv_ctr, conv_ctr_cap=10): """ Default condition for increasing convergence counter: new likelihood deviates less than 10% from previous likelihood. params: prev_likelihood = float new_likelihood = float conv_ctr = int conv_ctr_cap = int returns: conv_ctr = int converged = boolean """ increase_convergence_ctr = (abs(prev_likelihood) * 0.9 = likelihood_thresh): print('Congrats! Your model's log likelihood improved by at least %d.'%(likelihood_thresh)) print( 'Synthetic example with 4 means:') num_components = 4 actual_means = [2,4,6,8] actual_variances = [1]*num_components actual_mixing = [.25]*num_components dataset_1 = generate_test_mixture(data_range, actual_means, actual_variances, actual_mixing) gmm = GaussianMixtureModel(dataset_1, num_components) gmm.initialize_training() # start off with faulty means gmm.means = [1,3,5,9] initial_likelihood = gmm.likelihood() gmm.train_model() final_likelihood = gmm.likelihood() # compare likelihoods likelihood_difference = final_likelihood - initial_likelihood likelihood_thresh = 200 if(likelihood_difference >= likelihood_thresh): print('Congrats! Your model's log likelihood improved by at least %d.'%(likelihood_thresh)) return gmm In [ ]: gmm_train_test() In [ ]: def gmm_segment_test(): """ Apply the trained GMM to unsegmented image and generate a segmented image. 
returns: segmented_matrix = numpy.ndarray[numpy.ndarray[float]] """ image_file = 'images/party_spock.png' image_matrix = image_to_matrix(image_file) num_components = 3 gmm = GaussianMixtureModel(image_matrix, num_components) gmm.initialize_training() gmm.train_model() segment = gmm.segment() segment_num_components = len(np.unique(segment)) if(segment_num_components == num_components): print('Congrats! Your segmentation produced an image '+ 'with the correct number of components.') return segment In [ ]: def gmm_best_segment_test(): """ Calculate the best segment generated by the GMM and compare the subsequent likelihood of a reference segmentation. Note: this test will take a while to run. returns: best_seg = np.ndarray[np.ndarray[float]] """ image_file = 'images/party_spock.png' image_matrix = image_to_matrix(image_file) image_matrix_flat = flatten_image_matrix(image_matrix) num_components = 3 gmm = GaussianMixtureModel(image_matrix, num_components) gmm.initialize_training() iters = 10 # generate best segment from 10 iterations # and extract its likelihood best_seg = gmm.best_segment(iters) matrix_to_image(best_seg, 'images/best_segment_spock.png') best_likelihood = gmm.likelihood() # extract likelihood from reference image ref_image_file = 'images/party_spock%d_baseline.png'%(num_components) ref_image = image_to_matrix(ref_image_file, grays=True) gmm_ref = GaussianMixtureModel(ref_image, num_components) ref_vals = ref_image.flatten() ref_means = list(set(ref_vals)) ref_variances = [0]*num_components ref_mixing = [0]*num_components for i in range(num_components): relevant_vals = ref_vals[ref_vals==ref_means[i]] ref_mixing[i] = float(len(relevant_vals)) / float(len(ref_vals)) ref_variances[i] = np.mean((image_matrix_flat[ref_vals==ref_means[i]] - ref_means[i])**2) gmm_ref.means = ref_means gmm_ref.variances = ref_variances gmm_ref.mixing_coefficients = ref_mixing ref_likelihood = gmm_ref.likelihood() # compare best likelihood and reference likelihood likelihood_diff = best_likelihood - ref_likelihood likelihood_thresh = 1e4 if(likelihood_diff >= likelihood_thresh): print('Congrats! Your image segmentation is an improvement over ' + 'the baseline by at least %.2f.'%(likelihood_thresh)) return best_seg In [ ]: best_segment = gmm_best_segment_test() matrix_to_image(best_segment, 'best_segment.png') Part 3: Model experimentation 20 points We’ll now experiment with a few methods for improving GMM performance. 3a: Improved initialization 12.5 points To run EM in our baseline Gaussian mixture model, we use random initialization to determine the initial values for our component means. We can do better than this! Fill in the below GaussianMixtureModelImproved.initialize_training() with an improvement in component initialization. Please don’t use any external packages for anything other than basic calculations (e.g. scipy.misc.logsumexp). Note that your improvement might significantly slow down runtime, although we don’t expect you to spend more than 10 minutes on initialization. Hint: you’ll probably want an unsupervised learning method to initialize your component means. Clustering is one useful example of unsupervised learning, and you may want to look at 1-dimensional methods such as Jenks natural breaks optimization. 
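One simple 1-D option, sketched below under the assumption that quantile-based seeding counts as an acceptable unsupervised initialization (it is not necessarily the intended method), is to spread the initial component means over the empirical distribution of pixel values:

import numpy as np

def quantile_initial_means(pixel_values, num_components):
    # Place the k initial means at evenly spaced percentiles of the flattened
    # grayscale values, so each component starts in a well-populated region.
    percentiles = 100.0 * (np.arange(num_components) + 0.5) / num_components
    return np.percentile(pixel_values, percentiles)

# Example: 3 components for a synthetic set of pixel values.
pixels = np.concatenate([np.full(100, 0.1), np.full(100, 0.5), np.full(100, 0.9)])
print(quantile_initial_means(pixels, 3))   # roughly [0.1, 0.5, 0.9]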
In [ ]: class GaussianMixtureModelImproved(GaussianMixtureModel): """A Gaussian mixture model for a provided grayscale image, with improved training performance.""" def initialize_training(self): """ Initialize the training process by setting each component mean using some algorithm that you think might give better means to start with, each component variance to 1, and each component mixing coefficient to a uniform value (e.g. 4 components -> [0.25,0.25,0.25,0.25]). [You can feel free to modify the variance and mixing coefficient initializations too if that works well.] """ # TODO: finish this raise NotImplementedError() In [ ]: def gmm_improvement_test(): """ Tests whether the new mixture model is actually an improvement over the previous one: if the new model has a higher likelihood than the previous model for the provided initial means. returns: original_segment = numpy.ndarray[numpy.ndarray[float]] improved_segment = numpy.ndarray[numpy.ndarray[float]] """ image_file = 'images/party_spock.png' image_matrix = image_to_matrix(image_file) num_components = 3 initial_means = [0.4627451, 0.20392157, 0.36078432] # first train original model with fixed means gmm = GaussianMixtureModel(image_matrix, num_components) gmm.initialize_training() gmm.means = np.copy(initial_means) gmm.train_model() original_segment = gmm.segment() original_likelihood = gmm.likelihood() # then train improved model gmm_improved = GaussianMixtureModelImproved(image_matrix, num_components) gmm_improved.initialize_training() gmm_improved.train_model() improved_segment = gmm_improved.segment() improved_likelihood = gmm_improved.likelihood() # then calculate likelihood difference diff_thresh = 1e3 likelihood_diff = improved_likelihood - original_likelihood if(likelihood_diff >= diff_thresh): print('Congrats! Improved model scores a likelihood that was at ' + 'least %d higher than the original model.'%(diff_thresh)) return original_segment, improved_segment In [ ]: best_segment, best_segment_improved = gmm_improvement_test() matrix_to_image(best_segment, 'best_segment_original.png') matrix_to_image(best_segment_improved, 'best_segment_improved.png') 3b: Convergence condition 7.5 points You might be skeptical of the convergence criterion we’ve provided in default_convergence(). To test out another convergence condition, implement new_convergence_condition() to return true if all the new model parameters (means, variances, and mixing coefficients) are within 10% of the previous variables for 10 consecutive iterations. This will mean re-implementing train_model(), which you will also do below in GaussianMixtureModelConvergence. You can compare the two convergence functions in convergence_condition_test(). In [ ]: def new_convergence_function(previous_variables, new_variables, conv_ctr, conv_ctr_cap=10): """ Convergence function based on parameters: when all variables vary by less than 10% from the previous iteration's variables, increase the convergence counter. params: previous_variables = [numpy.ndarray[float]] containing [means, variances, mixing_coefficients] new_variables = [numpy.ndarray[float]] containing [means, variances, mixing_coefficients] conv_ctr = int conv_ctr_cap = int return: conv_ctr = int converged = boolean """ # TODO: finish this function raise NotImplementedError() return conv_ctr, converged In [ ]: class GaussianMixtureModelConvergence(GaussianMixtureModel): """ Class to test the new convergence function in the same GMM model as before. 
""" def train_model(self, convergence_function=new_convergence_function): # TODO: finish this function raise NotImplementedError() In [ ]: def convergence_condition_test(): """ Compare the performance of the default convergence function with the new convergence function. return: default_convergence_likelihood = float new_convergence_likelihood = float """ image_file = 'images/party_spock.png' image_matrix = image_to_matrix(image_file) num_components = 3 initial_means = [0.4627451, 0.10196079, 0.027450981] # first test original model gmm = GaussianMixtureModel(image_matrix, num_components) gmm.initialize_training() gmm.means = np.copy(initial_means) gmm.train_model() default_convergence_likelihood = gmm.likelihood() # now test new convergence model gmm_new = GaussianMixtureModelConvergence(image_matrix, num_components) gmm_new.initialize_training() gmm_new.means = np.copy(initial_means) gmm_new.train_model() new_convergence_likelihood = gmm_new.likelihood() # test convergence difference convergence_diff = new_convergence_likelihood - default_convergence_likelihood convergence_thresh = 200 if(convergence_diff >= convergence_thresh): print('Congrats! The likelihood difference between the original ' + 'and the new convergence models should be at least %.2f'%(convergence_thresh)) return default_convergence_likelihood, new_convergence_likelihood Part 4: Bayesian information criterion 20 points In our previous solutions, our only criterion for choosing a model was whether it maximizes the posterior likelihood regardless of how many parameters this requires. As a result, the “best” model may simply be the model with the most parameters, which would be overfit to the training data. To avoid overfitting, we can use the Bayesian information criterion (a.k.a. BIC) which penalizes models based on the number of parameters they use. In the case of the Gaussian mixture model, this is equal to the number of components times the number of variables per component (mean, variance and mixing coefficient) = 3*components. 4a: Implement BIC 5 points Implement bayes_info_criterion() to calculate the BIC of a trained GaussianMixtureModel. In [ ]: def bayes_info_criterion(gmm): # TODO: finish this function raise NotImplementedError() return BIC In [ ]: def bayes_info_test(): """ Test for your implementation of BIC on fixed GMM values. Should be about 727045. returns: BIC = float """ image_file = 'images/party_spock.png' image_matrix = image_to_matrix(image_file) num_components = 3 initial_means = [0.4627451, 0.10196079, 0.027450981] gmm = GaussianMixtureModel(image_matrix, num_components) gmm.initialize_training() gmm.means = np.copy(initial_means) BIC = bayes_info_criterion(gmm) return BIC 4b: Test BIC 15 points Now implement BIC_model_test(), in which you will use the BIC and likelihood to determine the optimal number of components in the Party Spock image. Use the original GaussianMixtureModel for your models. Iterate from k=2 to k=7 and use the provided means to train a model that minimizes its BIC and a model that maximizes its likelihood. Then, fill out BIC_likelihood_question() to return the number of components in both the min-BIC and the max-likelihood model. In [ ]: def BIC_likelihood_model_test(): """Test to compare the models with the lowest BIC and the highest likelihood. 
returns: min_BIC_model = GaussianMixtureModel max_likelihood_model = GaussianMixtureModel """ # TODO: finish this method raise NotImplementedError() comp_means = [ [0.023529412, 0.1254902], [0.023529412, 0.1254902, 0.20392157], [0.023529412, 0.1254902, 0.20392157, 0.36078432], [0.023529412, 0.1254902, 0.20392157, 0.36078432, 0.59215689], [0.023529412, 0.1254902, 0.20392157, 0.36078432, 0.59215689, 0.71372563], [0.023529412, 0.1254902, 0.20392157, 0.36078432, 0.59215689, 0.71372563, 0.964706] ] return min_BIC_model, max_likelihood_model In [ ]: def BIC_likelihood_question(): """ Choose the best number of components for each metric (min BIC and maximum likelihood). returns: pairs = dict """ # TODO: fill in bic and likelihood raise NotImplementedError() bic = 0 likelihood = 0 pairs = { 'BIC' : bic, 'likelihood' : likelihood } return pairs Bonus 2 points A crucial part of machine learning is working with very large datasets. As stated before, using for loops over these datasets will result in the code taking many hours, or even several days, to run. Even vectorization can take time if not done properly, and as such there are certain tricks you can perform to get your code to run as fast as physically possible. For this part of the assignment, you will need to implement part of a k-Means algorithm. You are given two arrays – points_array with X n-dimensional points, and means_array with Y n-dimensional points. You will need to return an X x Y array containing the distances from each point in points_array to each point in means_array. Your code will be tested using two very large arrays, against our reference implementation, which was designed by Murtaza Dhuliawala. Thus, you’ll be competing against the Head TA! If your implementation returns the correct answer in time comparable to Murtaza’s implementation, you will receive 2 bonus points. For reference, the data used is in the order of thousands of points and hundreds of means, and Bonnie automatically kills a grading script that takes more than 500MB. So please test accordingly locally before submitting, as you may lose a submission for an inefficient solution. It is very likely that you could run out of memory if your implementation is inefficient. In [ ]: def bonus(points_array, means_array): """ Return the distance from every point in points_array to every point in means_array. 
returns: dists = numpy array of float """ # TODO: fill in the bonus function # REMOVE THE LINE BELOW IF ATTEMPTING BONUS raise NotImplementedError() return dists In [ ]: def bonus_test(): points = np.array([[ 0.9059608,0.67550357,0.13525533],[ 0.23656114,0.63624466,0.3606615 ],[ 0.91163215,0.24431103,0.33318504],[ 0.25209736,0.24600123,0.42392935],[ 0.62799146,0.04520208,0.55232494],[ 0.5588561, 0.06397713,0.53465371],[ 0.82530045,0.62811624,0.79672349],[ 0.50048147,0.13215356,0.54517893],[ 0.84725662,0.71085917,0.61111105],[ 0.25236734,0.25951904,0.70239158]]) means = np.array([[ 0.39874413,0.47440682,0.86140829],[ 0.05671347,0.26599323,0.33577454],[ 0.7969679, 0.44920099,0.37978416],[ 0.45428452,0.51414022,0.21209852],[ 0.7112214, 0.94906158,0.25496493]]) expected_answer = np.array([[ 0.90829883,0.9639127, 0.35055193,0.48575144,0.35649377],[ 0.55067427,0.41237201,0.59110637,0.29048911,0.57821151],[ 0.77137409,0.8551975, 0.23937264,0.54464354,0.73685561],[ 0.51484192,0.21528078,0.58320052,0.39705222,0.85652654],[ 0.57645778,0.64961631,0.47067874,0.60483973,0.95515036],[ 0.54850426,0.57663736,0.47862222,0.56358129,0.94064631],[ 0.45799673,0.966609,0.45458971,0.70173336,0.63993928],[ 0.47695785,0.50861901,0.46451987,0.50891112,0.89217387],[ 0.56543953,0.94798437,0.35285421,0.59357932,0.4495398 ],[ 0.30477736,0.41560848,0.66079087,0.58820896,0.94138546]]) if np.allclose(expected_answer,bonus(points,means),1e-7): print 'You returned the correct distances.' else: print 'Your distance calculation is incorrect.' bonus_test() You’re done with the requirements! Hope you have completed the functions in the mixture_models.py file and tested everything!
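As a loose illustration of the kind of fully vectorized computation the bonus above is asking for — a sketch only, not the reference implementation, and with an invented helper name — NumPy broadcasting can produce the X x Y distance matrix without any Python loops:

import numpy as np

def pairwise_distances_sketch(points_array, means_array):
    """Return an (X, Y) array of Euclidean distances from every point
    in points_array (shape (X, n)) to every point in means_array (shape (Y, n))."""
    # broadcast (X, 1, n) against (1, Y, n) to get all pairwise differences at once
    diffs = points_array[:, np.newaxis, :] - means_array[np.newaxis, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2))

If the (X, Y, n) intermediate is too large for the memory limit mentioned above, the same matrix can instead be built from the expansion ||a - b||^2 = ||a||^2 - 2 a·b + ||b||^2, which only ever materializes an (X, Y) array.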


[SOLVED] Cs 6601 assignment 4: decision tree learning

5/5 - (1 vote) Machine learning offers a number of methods for classifying data into discrete categories, such as k-means clustering. Decision trees provide a structure for such categorization, based on a series of decisions that led to separate distinct outcomes. In this assignment, you will work with decision trees to perform binary classification according to some decision boundary. Your challenge is to build and to train decision trees capable of solving useful classification problems. You will learn first how to build decision trees, then how to effectively train them and finally how to test their performance. This assignment is due on T-Square as decision_trees.py on March 19th midnight AOE (anywhere on Earth). Abstract You will build, train and test decision tree models to perform basic classification tasks. Motivation Classification is used widely in machine learning to figure out how to sort new data that comes through. Objectives Students should understand how decision trees and random forests work. Students should develop and intuition for how and why accuracy differs for training and testing data based on different parameters. Evaluation Evaluation is using the last submission on Bonnie. Datasets We have provided you with two datasets: part23_data.csv: 4 features, 1372 data points, binary classification (last column) challenge_train.csv: 30 features, 6636 datapoints, binary classification (first column) challenge_test.csv: not provided, similar to challenge_train but with 40% of the dataset Imports We are only allowing three imports: numpy, collections.Counter and time. We will be checking to see if any other libraries are used. You are not allowed to use any outside libraries especially for part 4 (challenge). Introduction For this assignment we’re going to need an explicit way to make structured decisions. The following is DecisionNode, representing a decision node as some atomic choice in a binary decision graph. It can represent a class label (i.e. a final decision) or a binary decision to guide the us through a flow-chart to arrive at a decision. Note that in this representation ‘True’ values for a decision take us to the left. This is arbitrary but matters for what comes next. In [ ]: import numpy as np from collections import Counter import time class DecisionNode(): """Class to represent a single node in a decision tree.""" def __init__(self, left, right, decision_function,class_label=None): """Create a node with a left child, right child, decision function and optional class label for leaf nodes.""" self.left = left self.right = right self.decision_function = decision_function self.class_label = class_label def decide(self, feature): """Return on a label if node is leaf, or pass the decision down to the node's left/right child (depending on decision function).""" if self.class_label is not None: return self.class_label elif self.decision_function(feature): return self.left.decide(feature) else: return self.right.decide(feature) Part 1a: Building a Binary Tree by Hand 5 pts. In build_decision_tree(), construct a tree of decision nodes by hand in order to classify the data below, i.e. map each datum xx to a label yy. Select tests to be as small as possible (in terms of attributes), breaking ties among tests with the same number of attributes by selecting the one that classifies the greatest number of examples correctly. 
If multiple tests have the same number of attributes and classify the same number of examples, then break the tie using attributes with lower index numbers (e.g. select A1A1 over A2A2) Datum A1A1 A2A2 A3A3 A4A4 y x1x1 1 0 0 0 1 x2x2 1 0 1 1 1 x3x3 0 1 0 0 1 x4x4 0 1 1 0 0 x5x5 1 1 0 1 1 x6x6 0 1 0 1 0 x7x7 0 0 1 1 1 x8x8 0 0 1 0 0 Hints: To get started, it might help to draw out the tree by hand with each attribute representing a node. To create the decision function to pass to your DecisionNode, you can create a lambda expression as follows: func = lambda feature : feature[2] == 0 in which we would choose the left node if the third attribute is 0. For example, if your tree looked like this: if A1=0 then class = 1; else class = 0 A1 / 1 0 You would write your code like this: decision_tree_root= DecisionNode(None, None, lambda a1: a1 == 0) decision_tree_root.left = DecisionNode(None, None, None, 1) decision_tree_root.right = DecisionNode(None, None, None, 0) return decision_tree_root Requirements: The tree nodes should be less than 10 nodes including the leaf (not the number of instances, but the actual nodes in the tree). In [ ]: examples = [[1,0,0,0], [1,0,1,1], [0,1,0,0], [0,1,1,0], [1,1,0,1], [0,1,0,1], [0,0,1,1], [0,0,1,0]] classes = [1,1,1,0,1,0,1,0] In [ ]: def build_decision_tree(): """Create decision tree capable of handling the provided data.""" # TODO: build full tree from root decision_tree_root = None return decision_tree_root In [ ]: decision_tree_root = build_decision_tree() Part 1b: Precision, Recall, Accuracy and Confusion Matrix 12 pts. Now that we have a decision tree, we’re going to need some way to evaluate its performance. In most cases we’d reserve a portion of the training data for evaluation, or use cross-validation. For now let’s just see how your tree does on the provided examples. In the stubbed out code below, fill out the methods to compute the confusion matrix, accuracy, precision and recall for your classifier output. classifier_output is just the list of labels that your classifier outputs, corresponding to the same examples as true_labels. You can refer to Wikipedia for calculating the true/false positive/negative. You should get 1.0 (float) for precision, recall and accuracy. You can create a simple example for the confusion matrix for testing purposes and count by hand. In [ ]: def confusion_matrix(classifier_output, true_labels): #TODO output should be [[true_positive, false_negative], [false_positive, true_negative]] #TODO output is a list raise NotImplemented() def precision(classifier_output, true_labels): #TODO precision is measured as: true_positive/ (true_positive + false_positive) raise NotImplemented() def recall(classifier_output, true_labels): #TODO: recall is measured as: true_positive/ (true_positive + false_negative) raise NotImplemented() def accuracy(classifier_output, true_labels): #TODO accuracy is measured as: correct_classifications / total_number_examples raise NotImplemented() classifier_output = [decision_tree_root.decide(example) for example in examples] p1_confusion_matrix = confusion_matrix(classifier_output, classes) p1_accuracy = accuracy( classifier_output, classes ) p1_precision = precision(classifier_output, classes) p1_recall = recall(classifier_output, classes) print p1_confusion_matrix, p1_accuracy, p1_precision, p1_recall Part 2a: Decision Tree Learning 6 pts. You will need to implement entropy() and information_gain() in order to do so (hints here) and here). Test cases have been provided. 
In [ ]: def entropy(class_vector): """Compute the entropy for a list of classes (given as either 0 or 1).""" # TODO: finish this raise NotImplemented() def information_gain(previous_classes, current_classes ): """Compute the information gain between the previous and current classes (a list of lists where each list has 0 and 1 values).""" # TODO: finish this raise NotImplemented() def test_information_gain(): """ Assumes information_gain() accepts (classes, [list of subclasses]) Feel free to edit / enhance this note with more tests """ restaurants = [0]*6 + [1]*6 split_patrons = [[0,0], [1,1,1,1], [1,1,0,0,0,0]] split_food_type = [[0,1],[0,1],[0,0,1,1],[0,0,1,1]] # If you're using numpy indexing add the following before calling information_gain() # split_patrons = [np.array(i) for i in split_patrons] #convert to np array # split_food_type = [np.array(i) for i in split_food_type] gain_patrons = information_gain(restaurants, split_patrons) gain_type = information_gain(restaurants, split_food_type) assert round(gain_patrons,3) == 0.541, "Information Gain on patrons should be 0.541" assert gain_type == 0.0, "Information gain on type should be 0.0" print "Information Gain calculations correct..." assert (information_gain([1,1,1,0,0,0],[[1,1,1],[0,0,0]])==1),"TEST FAILED" assert (round(information_gain([1,1,1,0,0,0],[[1,1,0],[1,0,0]]),2)==0.08),"TEST FAILED" def test_entropy(): assert (entropy([1,1,1,0,0,0])==1),"TEST FAILED" assert (entropy([1,1,1,1,1,1])==0),"TEST FAILED" assert (int(entropy([1,1,0,0,0,0])*100)==91),"TEST FAILED" test_information_gain() test_entropy() Part 2b: Decision Tree Learning 20 pts. File to use: part23_data.csv Grading: average test accuracy over 10 rounds should be >= 70% As the size of our training set grows, it rapidly becomes impractical to build these trees by hand. We need a procedure to automagically construct these trees. For starters, let’s consider the following algorithm (a variation of C4.5) for the construction of a decision tree from a given set of examples: 1) Check for base cases: a) If all elements of a list are of the same class, return a leaf node with the appropriate class label. b) If a specified depth limit is reached, return a leaf labeled with the most frequent class. 2) For each attribute alpha: evaluate the normalized information gain gained by splitting on attribute $alpha$ 3) Let alpha_best be the attribute with the highest normalized information gain 4) Create a decision node that splits on alpha_best 5) Recur on the sublists obtained by splitting on alpha_best, and add those nodes as children of node First, in the DecisionTree.__build_tree__() method implement the above algorithm. Next, in DecisionTree.classify() below, write a function to produce classifications for a list of features once your decision tree has been built. Some other helpful notes: 1) Your features and classify should be in numpy arrays where if the dataset was (m x n) then the features is (m x n-1) and classify is (m x 1). 2) These features are continuous features and you will need to split based on a threshold. How grading works: 1) We load part23_data.csv and create our cross-validation training and test set with a k=10 folds. 2) We classify the training data onto the three then fit the testing data onto the tree. 3) We check the accuracy of your results versus the true results and we return the average of this over 10 iterations. 
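Before the tree-building class that follows, here is a minimal sketch of the entropy and information-gain quantities described in Part 2a (hedged: the _sketch suffix marks these as illustrations rather than the graded functions, and they assume binary 0/1 class labels as stated above):

import numpy as np

def entropy_sketch(class_vector):
    """Shannon entropy (base 2) of a list of 0/1 class labels."""
    labels = np.asarray(class_vector)
    if labels.size == 0:
        return 0.0
    p = labels.mean()                      # fraction of positive labels
    if p == 0.0 or p == 1.0:
        return 0.0                         # a pure node has zero entropy
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def information_gain_sketch(previous_classes, current_classes):
    """Parent entropy minus the size-weighted entropy of the child splits."""
    total = float(len(previous_classes))
    remainder = sum(len(split) / total * entropy_sketch(split)
                    for split in current_classes)
    return entropy_sketch(previous_classes) - remainder

On the restaurant example in the tests above, this weighting gives roughly 0.541 for the patrons split and 0.0 for the food-type split.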
In [ ]: class DecisionTree(): """Class for automatic tree-building and classification.""" def __init__(self, depth_limit=float('inf')): """Create a decision tree with an empty root and the specified depth limit.""" self.root = None self.depth_limit = depth_limit def fit(self, features, classes): """Build the tree from root using __build_tree__().""" self.root = self.__build_tree__(features, classes) def __build_tree__(self, features, classes, depth=0): """Implement the above algorithm to build the decision tree using the given features and classes to build the decision functions.""" #TODO: finish this raise NotImplemented() def classify(self, features): """Use the fitted tree to classify a list of examples. Return a list of class labels.""" class_labels = [] #TODO: finish this raise NotImplemented() return class_labels Part 2c: Validation 10 pts. File to use: part23_data.csv Grading: average test accuracy over 10 rounds should be >= 70% In general, reserving part of your data as a test set can lead to unpredictable performance- a serendipitous choice of your train or test split could give you a very inaccurate idea of how your classifier performs. That’s where k-fold cross validation comes in. In generate_k_folds(), we’ll split the dataset at random into k equal subsections. Then iterating on each of our k samples, we’ll reserve that sample for testing and use the other k-1 for training. Averaging the results of each fold should give us a more consistent idea of how the classifier is doing across the data as a whole. How grading works: The same as 2b however, we use your generate_k_folds instead of ours. In [ ]: def load_csv(data_file_path, class_index): handle = open(data_file_path, 'r') contents = handle.read() handle.close() rows = contents.split(' ') out = np.array([[float(i) for i in r.split(',')] for r in rows if r]) if(class_index == -1): # this is used for part23_data classes= map(int, out[:,class_index]) features = out[:,:class_index] return features, classes elif(class_index == 0): # this is used for challenge_train classes= map(int, out[:, class_index]) features = out[:, 1:] return features, classes else: # this is used for vectorize return out def generate_k_folds(dataset, k): #TODO this method should return a list of folds, # where each fold is a tuple like (training_set, test_set) # where each set is a tuple like (examples, classes) raise NotImplemented() dataset = load_csv('part23_data.csv', -1) ten_folds = generate_k_folds(dataset, 10) accuracies = [] precisions = [] recalls = [] confusion = [] for fold in ten_folds: train, test = fold train_features, train_classes = train test_features, test_classes = test tree = DecisionTree( ) tree.fit( train_features, train_classes) output = tree.classify(test_features) accuracies.append( accuracy(output, test_classes)) precisions.append( precision(output, test_classes)) recalls.append( recall(output, test_classes)) confusion.append( confusion_matrix(output, test_classes)) Part 3: Random Forests 30 pts. File to use: part23_data.csv Grading: average test accuracy over 10 rounds should be >= 75% The decision boundaries drawn by decision trees are very sharp, and fitting a decision tree of unbounded depth to a list of training examples almost inevitably leads to overfitting. In an attempt to decrease the variance of our classifier we’re going to use a technique called ‘Bootstrap Aggregating’ (often abbreviated ‘bagging’). 
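Stepping back for a moment to Part 2c before the Random Forest material continues: one possible shape for the k-fold split it describes is sketched below. This is only a sketch; it assumes the dataset is the (features, classes) pair returned by load_csv, and it quietly leaves any remainder examples (when k does not divide the dataset size) on the training side.

import numpy as np

def generate_k_folds_sketch(dataset, k):
    """Shuffle the data, cut it into k equal parts, and return a list of
    (training_set, test_set) tuples, each set being (examples, classes)."""
    features, classes = dataset
    features = np.asarray(features)
    classes = np.asarray(classes)
    indices = np.random.permutation(len(classes))
    fold_size = len(classes) // k
    folds = []
    for i in range(k):
        test_idx = indices[i * fold_size:(i + 1) * fold_size]
        train_idx = np.concatenate((indices[:i * fold_size],
                                    indices[(i + 1) * fold_size:]))
        folds.append(((features[train_idx], classes[train_idx]),
                      (features[test_idx], classes[test_idx])))
    return folds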
A Random Forest is a collection of decision trees, build as follows: 1) For every tree we're going to build: a) Subsample the examples provided us (with replacement) in accordance with a provided example subsampling rate. b) From the sample in a), choose attributes at random to learn on (in accordance with a provided attribute subsampling rate) c) Fit a decision tree to the subsample of data we've chosen (to a certain depth) Classification for a random forest is then done by taking a majority vote of the classifications yielded by each tree in the forest after it classifies an example. Fill in RandomForest.fit() to fit the decision tree as we describe above, and fill in RandomForest.classify() to classify a given list of examples. Your features and classify should be in numpy arrays where if the dataset was (m x n) then the features is (m x n-1) and classify is (n x 1). To test, we will be using a forest with 5 trees, with a depth limit of 5, example subsample rate of 0.5 and attribute subsample rate of 0.5 How grading works: similar to 2b but with the call to Random Forest. In [ ]: class RandomForest(): """Class for random forest classification.""" def __init__(self, num_trees, depth_limit, example_subsample_rate, attr_subsample_rate): """Create a random forest with a fixed number of trees, depth limit, example sub-sample rate and attribute sub-sample rate.""" self.trees = [] self.num_trees = num_trees self.depth_limit = depth_limit self.example_subsample_rate = example_subsample_rate self.attr_subsample_rate = attr_subsample_rate def fit(self, features, classes): """Build a random forest of decision trees.""" # TODO implement the above algorithm raise NotImplemented() def classify(self, features): """Classify a list of features based on the trained random forest.""" # TODO implement classification for a random forest. raise NotImplemented() Part 4: Challenge Classifier 10 points. File to use: challenge_train.csv Grading: average training accuracy over 10 runs should be >= 80% and average ru accuracy over 10 runs should be >= 70% You should be implementing some sort of a tree structure, students in the past have been able to call their RandomForest with different parameters. We also encourage things like boosting. You’ve been provided with a sample of data from a research dataset in ‘challenge_train.csv’ while we have reserved a part of the dataset for testing called challenge_test.csv (which you do not have access to). To get full points for this part of the assignment, you’ll need to get at least an average accuracy of 80% on the training data you have (challenge_train.csv), and at least an average accuracy of 70% on the holdout/test set (challenge_test.csv). We do provide how long it takes for your training and testing to run. In [ ]: class ChallengeClassifier(): def __init__(self): # initialize whatever parameters you may need here- # this method will be called without parameters # so if you add any to make parameter sweeps easier, provide defaults raise NotImplemented() def fit(self, features, classes): # fit your model to the provided features raise NotImplemented() def classify(self, features): # classify each feature in features as either 0 or 1. raise NotImplemented() Part 5: Vectorization! 7 points. File to use: vectorize.csv Last semester, students struggled a lot with assignment 5 not because of the assignment but of the vectorization requirement so that it can run under the time limit we have. 
As a result, we are adding a small section to this assignment that will hopefully introduce you to vectorization and some of the cool tricks you can use in python. We encourage you to use any numpy function out there (on good faith) to do the following functions. For the three functions that we have, we are testing your code based on how fast it runs. It will need to beat the non-vectorized code to get full points. As a reminder, please don’t ask the TA’s for help regarding this section, we will not be able to assist you in any way. This section was created to help get you ready for assignment_5; feel free to ask other students on Piazza or use the Internet. How grading works: we run the non-vectorized code and your vectorized code 500 times, as long as the average time of your vectorized code is less than the average time of the non-vectorized code, you will get the points (given that your answer is correct). In [ ]: import time import resource import numpy as np class Vectorization(): def load_csv(self,data_file_path, class_index): handle = open(data_file_path, 'r') contents = handle.read() handle.close() rows = contents.split(' ') out = np.array([[float(i) for i in r.split(',')] for r in rows if r]) if(class_index == -1): classes= map(int, out[:,class_index]) features = out[:,:class_index] return features, classes elif(class_index == 0): classes= map(int, out[:, class_index]) features = out[:, 1:] return features, classes else: return out # Vectorization #1: Loops! # This function takes one matrix, multiplies by itself and then adds to itself. # Output: return a numpy array # 1 point def non_vectorized_loops(self, data): non_vectorized = np.zeros(data.shape) for row in range(data.shape[0]): for col in range(data.shape[1]): non_vectorized[row][col] = data[row][col] * data[row][col] + data[row][col] return non_vectorized def vectorized_loops(self, data): # TODO vectorize the code from above # Bonnie time to beat: 0.09 seconds raise NotImplemented() def vectorize_1(self): data = self.load_csv('vectorize.csv', 1) start_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 real_answer = self.non_vectorized_loops(data) end_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 print 'Non-vectorized code took %s seconds' % str(end_time-start_time) start_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 my_answer = self.vectorized_loops(data) end_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 print 'Vectorized code took %s seconds' % str(end_time-start_time) assert np.array_equal(real_answer, my_answer), "TEST FAILED" # Vectorization #2: Slicing and summation # This function searches through the first 100 rows, looking for the row with the max sum # (ie, add all the values in that row together) # Output: return the max sum as well as the row number for the max sum # 3 points def non_vectorized_slice(self, data): max_sum = 0 max_sum_index = 0 for row in range(100): temp_sum = 0 for col in range(data.shape[1]): temp_sum += data[row][col] if (temp_sum > max_sum): max_sum = temp_sum max_sum_index = row return max_sum, max_sum_index def vectorized_slice(self, data): # TODO vectorize the code from above # Bonnie time to beat: 0.07 seconds raise NotImplemented() def vectorize_2(self): data = self.load_csv('vectorize.csv', 1) start_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 real_sum, real_sum_index = self.non_vectorized_slice(data) end_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 print 'Non-vectorized code took %s 
seconds' % str(end_time-start_time) start_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 my_sum, my_sum_index = self.vectorized_slice(data) end_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 print 'Vectorized code took %s seconds' % str(end_time-start_time) assert (real_sum==my_sum),"TEST FAILED" assert (real_sum_index==my_sum_index), "TEST FAILED" # Vectorization #3: Flattening and dictionaries # This function flattens down data into a 1d array, creates a dictionary of how often a # positive number appears in the data and displays that value # Output: list of tuples [(1203,3)] = 1203 appeared 3 times in data # 3 points def non_vectorized_flatten(self, data): unique_dict = {} flattened = np.hstack(data) for item in range(len(flattened)): if flattened[item] > 0: if flattened[item] in unique_dict: unique_dict[flattened[item]] += 1 else: unique_dict[flattened[item]] = 1 return unique_dict.items() def vectorized_flatten(self, data): # TODO vectorize the code from above # Bonnie time to beat: 15 seconds raise NotImplemented() def vectorize_3(self): data = self.load_csv('vectorize.csv', 1) start_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 answer_unique = self.non_vectorized_flatten(data) end_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 print 'Non-vectorized code took %s seconds'% str(end_time-start_time) start_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 my_unique = self.vectorized_flatten(data) end_time = resource.getrusage(resource.RUSAGE_SELF).ru_utime * 1000 print 'Vectorized code took %s seconds'% str(end_time-start_time) assert np.array_equal(answer_unique, my_unique), "TEST FAILED" In [ ]: vectorize = Vectorization() vectorize.vectorize_1() vectorize.vectorize_2() vectorize.vectorize_3() Bonus Note: this part will be changing. Official annoucements for this bonus will be made through Piazza We will be having a competition using your challenge classifier and a dataset of our choice. We will provide you with a portion of the dataset as well as the testing data (but without the labels) and you will upload your solution as a csv to Kaggle. Kaggle will evaluate your scores and the classifier with the highest accuracy will win the competiton. Any ties will be broken by the submission time. We are still figuring out all the details for this bonus so hopefully it will be out by the time the midterm period is over. We will keep the competition available for at least a few weeks. First place: 3 bonus points on your final grade Second place: 2 bonus points on your final grade Third place: 1 bonus point on your final grade In [ ]:
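Returning to Part 5: the vectorized counterparts of the three loop-based functions above can be very short. The following sketches use invented _sketch names; they assume at least one of the first 100 row sums is positive (so argmax agrees with the loop, which starts from a running maximum of 0), and the (value, count) pairs come back sorted by value rather than in dictionary order.

import numpy as np

def vectorized_loops_sketch(data):
    # element-wise square plus the original value, no Python loops
    return data * data + data

def vectorized_slice_sketch(data):
    # sums of the first 100 rows, then the largest sum and its row index
    row_sums = data[:100].sum(axis=1)
    return row_sums.max(), row_sums.argmax()

def vectorized_flatten_sketch(data):
    # unique positive values and how many times each occurs
    positives = data.ravel()
    positives = positives[positives > 0]
    values, counts = np.unique(positives, return_counts=True)
    return list(zip(values, counts))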


[SOLVED] Cs 6601 assignment 3: probability

Abstract You will implement several Bayesian networks and sampling algorithms to gain a better understanding of probabilistic systems.Learning Objectives Students should be able to understand the importance of Bayesian networks to represent conditional dependencies. Also, be able learn the sampling methods, Gibbs and Metropolis-Hastings and develop an intuition for their convergence criteria (very “researchy”).Evaluation Evaluation is using the last submission on Bonnie. 1. The Challenge Many AI systems rely on probabilistic knowledge of the world, rather than absolute knowledge, to execute tasks efficiently: for example, motion planning in robots with unreliable sensors. One type of probabilistic system that is especially useful is the Bayesian network, which encodes a joint probability distribution among dependent variables as a network of conditional probabilities. Your challenge is to implement and test several of these networks, ultimately using a sampling method to approximate a probability distribution. Figure 1: Example Bayesian network (representing prediction for wet grass).Your task is to implement a few basic networks as well as several sampling algorithms. You will do this in probability notebook.ipynb, and there are tests along the way to help. Unlike previous assignments, we will not be grading on performance but rather on completion.We have provided the following additional classes and files: File/Folder Description probability_tests.py To test the models you’ve built. pbnt/combined Module to implement Bayesian networks (you’ll basically need BayesNode in Node.py and BayesNet in Graph.py). Also contains an example (ExampleModels.py) to help you get started. This is meant to be a shorter assignment, so there won’t be much testing required.3. Grading BASIC TASK (100 points) Warmup 1a: Build a basic Bayesian network representing a power plant. (10 points) Warmup 1b: Set the probabilities for the Bayes Net. (15 points) Warmup 1c: Use inference to calculate several marginal probabilities within the Net. (10points) Exercise 2a: Build a Bayesian network representing a sports competition. (10 points) Exercise 2b: Given the outcomes of 2 matches, calculate likelihoods for the 3rd match. (5 points) Exercise 2c: Implement single iteration of Gibbs sampling. (15 points) Exercise 2d: Implement single iteration of Metropolis-Hastings sampling.(15 points) Exercise 2e: Compare the performance of the 2 sampling methods. (20 points)4. Due date This assignment is due on Bonnie and T-Square by February 26th at 11:59PM UTC-12 (Anywhere on Earth). The deliverable for this assignment is a Python file : ● Probability_solution.py5. Resources IMPORTANT: If you want to know more about how pbnt works, check out exampleinference.py and water() in pbnt/combined/ExampleModels.py. Also here’s a clone of the library : https://github.com/achille/pbnt. Basics of Bayes nets and Conditional Probability: – https://www.mathsisfun.com/data/probability-events-conditional.html – https://ocw.mit.edu/courses/mathematics/18-05-introduction-to-probability-and-st atistics-spring-2014/class-slides/ Gibbs Sampling and convergence: – http://gandalf.psych.umn.edu/users/schrater/schrater_lab/courses/AI2/gibbs.pdf – https://en.wikipedia.org/wiki/Gibbs_sampling – https://www.youtube.com/watch?v=ol0l6aTfb_g – Section 14.5 in Russell and Norvig (pp. 535-538 for Gibbs sampling). 
Metropolis Hastings and convergence: – http://www.mit.edu/~ilkery/papers/MetropolisHastingsSampling.pdf – https://www.cs.cmu.edu/~scohen/psnlp-lecture6.pdf Although you don’t have to implement the inference algorithm (Junction tree) that you’ll use with your networks, you might be interested in knowing how it works. You can find details on pp. 529-530 of Russell and Norvig.


[SOLVED] Cs 6601 assignment 2 search

Find the assignment description [here](https://docs.google.com/document/d/1Fct-45PuiT500cYda_15SpWBA44iBObLXQAhuTsWcTs/pub). Please also read the [FAQ](https://github.gatech.edu/omscs6601/assignment_2/blob/master/FAQ.txt).Here are some notes you might find useful. Bi-directional search Using Landmarks # Setup Clone this repository recursively: `git clone –recursive https://github.gatech.edu/omscs6601/assignment_2.git`(If your version of git does not support recurse clone, then clone without the option and run `git submodule init` and `git submodule update`).If you run across certificate authentication issues during the clone, set the git SSL Verify option to false: `git config –global http.sslVerify false`.# Python DependenciesThe submission scripts depend on the presence of 2 python packages – `requests` and `future`. If you are missing either of these packages, install them from the online Python registries. The easiest way to do this is through pip:`pip install requests future`# Keeping your code upto date After the clone, we recommend creating a branch and developing your agents on that branch:`git checkout -b develop`(assuming develop is the name of your branch)Should the TAs need to push out an update to the assignment, commit (or stash if you are more comfortable with git) the changes that are unsaved in your repository:`git commit -am “”`Then update the master branch from remote:`git pull origin master`This updates your local copy of the master branch. Now try to merge the master branch into your development branch:`git merge master`(assuming that you are on your development branch)There are likely to be merge conflicts during this step. If so, first check what files are in conflict:`git status`The files in conflict are the ones that are “Not staged for commit”. Open these files using your favourite editor and look for lines containing ``. Resolve conflicts as seems best (ask a TA if you are confused!) and then save the file. Once you have resolved all conflicts, stage the files that were in conflict:`git add -A .`Finally, commit the new updates to your branch and continue developing:`git commit -am “”`# Submit your code To submit your code to have it evaluated for a grade, use `python submit.py assignment_2`. You may submit as many times as you like. The last submission before the deadline will be used to determine your grade.To add a data.pickle file to your submission (containing landmarks of the Atlanta map for improved tri-directional/custom_search), use `python submit.py assignment_2 –add-data`.A friendly reminder: please ensure that your submission is in `search_submission.py`. The submit script described automatically sends that file to the servers for processing.# VagrantYou have the option of using vagrant to make sure that your local code runs in the same environment as the servers on Bonnie (make sure you have [Vagrant](https://www.vagrantup.com/) and [Virtualbox](https://www.virtualbox.org/wiki/Downloads) installed). To use this option run the following commands in the root directory of your assignment:“` vagrant up –provider virtualbox vagrant ssh “`Your code lives in the `/vagrant` folder within this virtual machine. Changes made to files in your assignment folder will automatically be reflected within the machine.# Azure NotebooksAzure has a service for creating and hosting your iPython notebooks. Find it [here](https://notebooks.azure.com/). You can even use your Georgia Tech credentials to sign in.


[SOLVED] Cs6601 assignment 1

# Setup Clone this repository recursively: `git clone –recursive https://github.gatech.edu/omscs6601/assignment_1.git`(If your version of git does not support recurse clone, then clone without the option and run `git submodule init` and `git submodule update`).If you run across certificate authentication issues during the clone, set the git SSL Verify option to false: `git config –global http.sslVerify false`.## Python DependenciesThe submission scripts depend on the presence of 2 python packages – `requests` and `future`. If you are missing either of these packages, install them from the online Python registries. The easiest way to do this is through pip:`pip install requests future`# Keeping your code upto date After the clone, we recommend creating a branch and developing your agents on that branch:`git checkout -b develop`(assuming develop is the name of your branch)Should the TAs need to push out an update to the assignment, commit (or stash if you are more comfortable with git) the changes that are unsaved in your repository:`git commit -am “”`Then update the master branch from remote:`git pull origin master`This updates your local copy of the master branch. Now try to merge the master branch into your development branch:`git merge master`(assuming that you are on your development branch)There are likely to be merge conflicts during this step. If so, first check what files are in conflict:`git status`The files in conflict are the ones that are “Not staged for commit”. Open these files using your favourite editor and look for lines containing ``. Resolve conflicts as seems best (ask a TA if you are confused!) and then save the file. Once you have resolved all conflicts, stage the files that were in conflict:`git add -A .`Finally, commit the new updates to your branch and continue developing:`git commit -am “”`# Submit your code A friendly reminder: please ensure that your submission is in `player_submission.py`. The script described in the following section automatically sends that file to the servers for processing.To submit your code and have it evaluated for a grade, use `python submit.py assignment_1`. We are going to limit you to 1 submissions in one hour(Subjected to change depending on load on servers) and the last submission before the deadline will be used to determine your grade.To enter yourself into the playoffs against your classmates, run `python submit.py –enable-face-off assignment_1`. Ensure that you have created the required AI.txt to enter the tournament. 


[SOLVED] Cs5110 – assignment 2

1. Suppose A ≤_L B via the reduction function f. Given w, an instance of A, what is an upper bound on |f(w)| in terms of |w|?
2. Show that A_NFA is NL-complete, where A_NFA = {⟨N, w⟩ | N is an NFA that accepts the string w}.
3. Show that 2-SAT is NL-complete.
4. A ladder is a sequence of strings s_1, s_2, ..., s_k in which every string differs from the preceding one in exactly one character. For example, the following is a ladder of English words, starting with the word “head” and ending with the word “free”: head, hear, near, fear, bear, beer, deer, deed, feed, feet, fret, free. Let LADDER_DFA = {⟨M, s, t⟩ | M is a DFA and L(M) contains a ladder of strings starting with s and ending with t}. Show that LADDER_DFA is in PSPACE.
5. If A ∈ P, then show that P^A = P.
6. A directed graph G is strongly connected if for every pair of vertices (u, v) there is a directed path from u to v in G. Show that the problem of deciding whether a graph is strongly connected is NL-complete.
7. For a language L ⊆ {0,1}* and a function f(n) (assuming f(n) can be computed in time O(f(n))), let L_f ⊆ {0,1,#}* denote the language L_f := {x #^(f(|x|)) | x ∈ L}.
• Suppose that L ∈ DTIME(f(n)). Show that L_f ∈ DTIME(O(n)).
• Show that if f(n) is a polynomial function, then L ∈ P if and only if L_f ∈ P.
• Show that P ≠ DSPACE(O(n)). Hint: assume equality and arrive at a contradiction via suitable padding and the Space Hierarchy Theorem.
• Define the class NEXP as NEXP := ∪_k NTIME(2^(n^k)). Prove that if P = NP then EXP = NEXP.
8. Show that if Σ_k = Π_k for some k, then the polynomial hierarchy collapses to Σ_k.
9. How many functions f : {0,1}^n → {0,1} are both monotone and symmetric?


[SOLVED] Cs5110 – assignment 1

1. Let DOUBLE-SAT be the language consisting of all Boolean formulas that have at least two distinct satisfying assignments. Show that DOUBLE-SAT is NP-complete.
2. A Boolean formula is in DNF (disjunctive normal form) if it is an OR of clauses C_1 ∨ C_2 ∨ ... ∨ C_m, where each clause C_j is an AND of literals. Let DNF-SAT be the language consisting of Boolean formulas ⟨ϕ⟩ that are in DNF and are satisfiable; in other words, the goal is to decide whether a given formula in DNF is satisfiable. Is DNF-SAT in P? Is it in NP? Is it NP-complete?
3. A Boolean formula is in 2-CNF if it is an AND of clauses C_1 ∧ C_2 ∧ ... ∧ C_m, where each clause C_j is an OR of exactly two literals. Let 2-SAT (or 2-CNF-SAT) be the language consisting of satisfiable Boolean formulas in 2-CNF. Is 2-SAT in P? Is it in NP? Is it NP-complete?
4. If P = NP, which languages are NP-complete?
5. Show that if P = NP, there is a polynomial-time algorithm that finds a satisfying assignment to a 3-SAT formula whenever such an assignment exists.
6. Show that A ≤_P B and B ≤_P C ⇒ A ≤_P C.
7. Show that a language L is coNP-complete if and only if its complement is NP-complete.
8. Show that NP ≠ coNP ⇒ P ≠ NP.
9. The language EXACT-CLIQUE consists of all ⟨G, k⟩ where G is an undirected graph and k is a natural number such that the largest clique in G has size exactly k. Show that EXACT-CLIQUE ∈ Σ_2 ∩ Π_2.


[SOLVED] Quantum computing (cs5100) : problem set 3

1. The total variation distance between two probability distributions P and Q on the same set A, is defined as dT V (P, Q) = 1/2 P i∈A |P(i)−Q(i)|. An equivalent alternative way to define this: dT V (P, Q) is the maximum, over all events E ⊆ A, of |P(E) − Q(E)|. Hence dT V (P, Q) is small iff all events have roughly the same probability under P and under Q. The Euclidean distance between two states |ϕ⟩ = P i αi |i⟩ and |ψ⟩ = P i βi |i⟩ is defined as ∥|ϕ⟩ − |ψ⟩∥ = pP i |αi − βi | 2 . Assume the two states are unit vectors. Suppose the Euclidean distance is small: ∥|ϕ⟩ − |ψ⟩∥ = ε. If we measure |ϕ⟩ in the computational basis then the probability distribution over the outcomes is given by the |αi | 2 , and if we measure |ψ⟩ then the probabilities are |βi | 2 . Show that these distributions are close in total variation distance, i.e., 1/2 P i ||αi | 2 − |βi | 2 | is ≤ ε. (6 marks) 2. Suppose a ∈ R N is a vector (indexed by ℓ = 0, . . . , N − 1) which is r-periodic in the following sense: there exists an integer r such that aℓ = 1 whenever ℓ is an integer multiple of r, and aℓ = 0 otherwise. Compute the Fourier transform FN a of this vector, i.e., write down a formula for the entries of the vector FN a. Assuming r divides N, write down a simple closed form for the formula for the entries. Which are the nonzero entries in the vector FN a, and what is their magnitude? (5 marks) 1 3. (a) The squared Fourier transform, F 2 N , turns out to map computational basis states to computational basis states. Describe this map, i.e., determine to which basis state a basis state |k⟩ gets mapped for each k ∈ {0, 1, . . . , N − 1}. (5 marks) (b) Show that F 4 N = I. What can you conclude about the eigenvalues of FN ? (4 marks) 4. Consider the task of constructing a quantum circuit to compute |x⟩ 7→ |x + y mod N⟩, where y is a fixed constant, and 0 ≤ x < N. Show that one efficient way to do this, for values of y such as 1, is to first perform a quantum Fourier transform, then to apply single qubit phase shifts, then an inverse Fourier transform. What values of y can be added easily this way, and how many operations are required? (10 marks) 5. Construct a quantum circuit that computes the Hamming weight of a given string x ∈ {0, 1} n . That is, it performs the following transformation: |x⟩|0⟩ 7→ |x⟩|hw(x)⟩, where hw(x) is the Hamming weight of x. What is the size of your circuit? (10 marks) 6. In the lectures we claimed without proof that Grover’s algorithm can be tweaked to work with probability 1 if we know the number of solutions exactly. For N = 2n , this question asks you to provide such an exact algorithm for an x ∈ {0, 1} n with a unique solution (so we are promised that there is exactly one i ∈ {0, 1} n with xi = 1, and our goal is to find this i). (a) Define a new 2N-bit string y ∈ {0, 1} 2N , indexed by (n + 1)-bit strings j = j1 . . . jnjn+1, by setting yj = ( 1 if xj1···jn = 1 and jn+1 = 0, 0 otherwise. Show how you can implement the following (n + 1)-qubit unitary Sy : |j⟩ 7→ (−1)yj |j⟩, using one query to x (of the usual form Ox : |i, b⟩ 7→ |i, b⊕xi⟩) and a few elementary gates. (5 marks) (b) Let γ ∈ [0, 2π) and let Uγ =  cos γ − sin γ sin γ cos γ  be the corresponding rotation matrix. Let A = H⊗n ⊗ Uγ be an (n + 1)-qubit unitary. What is the probability (as a function of γ) that measuring the state A|0 n+1⟩ in the computational basis gives a solution j ∈ {0, 1} n+1 for y (i.e., such that yj = 1)? 
(5 marks) (c) Give a quantum algorithm that finds the unique solution in string x with probability 1 using O( √ N) queries to x. (10 marks) 2 7. Let N = 2n and x = x0 . . . xN−1 be a sequence of distinct integers such that each xi can be written exactly using b bits. We can query these in the usual way, i.e., we can apply (n + b)-qubit unitary Ox : |i, 0 b ⟩ 7→ |i, xi⟩, as well as its inverse. The minimum of x is defined as min{xi | 0 ≤ i ≤ N − 1}. Give a quantum algorithm that finds (with probability ≥ 2/3) an index acheiving the minimum, using at most O( √ N log N) queries to the input. (10 marks) 8. Let x = x0 . . . xN−1, where N = 2n and xi ∈ {0, 1} n , be an input that we can query in the usual way. We are promised that this input is 2-to-1: for each i there is exactly one other j such that xi = xj . Such an (i, j)-pair is called a collision. Give a quantum algorithm that finds a collision (with probability ≥ 2/3) using O(N1/3 ) queries. (10 marks) 9. Consider an undirected graph G = (V, E), where V = {1, . . . , n}. Let M be the adjacency matrix of G. Suppose we are given input graph G in the form of a unitary that allows us to query whether an edge (i, j) is present in G or not: OM : |i, j, b⟩ 7→ |i, j, b ⊕ Mij ⟩. (a) Assume G is connected. Suppose we have a set A of edges which we already know to be in the graph (so A ⊆ E; you can think of A as given classically, you don’t have to query it). Let GA = (V, A) be the subgraph induced by only these edges , and suppose GA is not connected, so it consists of c > 1 connected components. Call an edge (i, j) ∈ E “good” if it connects two of these components. Give a quantum algorithm that finds a good edge with an expected number of O(n/√ c − 1) queries to M. (10 marks) (b) Give a quantum algorithm that uses at most O(n 3/2 ) queries to M and decides (with success probability ≥ 2/3) whether G is connected or not. (10 marks) 10. Consider a 2-bit input x = x0x1 with phase-oracle Ox,± : |i⟩ 7→ (−1)xi |i⟩. Write out the final state of the following 1-query quantum algorithm: HOx,±H|0⟩. Give a degree 2-polynomial p(x0, x1) that equals the probability that this algorithm outputs 1 on this input x. (5 marks) 11. Let f : {0, 1} N → {0, 1} be the N-bit Parity function, which is 1 iff its input x ∈ {0, 1} N has odd Hamming weight. (a) Give a quantum algorithm that computes Parity with success probability 1 on every input x, using N/2 queries (assume N is an even number). (5 marks) (b) Show that this is optimal, even for quantum algorithms that have error probability ≤ 1/3 on every input. (10 marks) 12. Suppose we have a T-query quantum algorithm that computes the N-bit OR function with success probability 1 on all inputs x ∈ {0, 1} N . Show that T ≥ N. (10 marks) 3 13. For a partial function f : {0, 1} N → {0, 1, ∗}, let Y ⊆ f −1 (1) and Z ⊆ f −1 (0). Let R ⊆ Y × Z be a set of pairs and for each coordinate j ∈ [N], define Rj = {(y, z) ∈ R | yj ̸= zj}. Now suppose that • for each y ∈ Y , there are at least m1 strings z ∈ Z with (y, z) ∈ R; • for each z ∈ Z, there are at least m0 strings y ∈ Y with (y, z) ∈ R; • for each y ∈ Y and j ∈ [N], there are at most ℓ1 strings z ∈ Z with (y, z) ∈ Rj ; • for each z ∈ Z and j ∈ [N], there are at most ℓ0 strings y ∈ Y with (y, z) ∈ Rj . Then show that Q(f) ≥ Ω(p m0m1/ℓ0ℓ1), where Q(f) denotes the quantum query complexity of computing f with success probability at least 2/3. (10 marks) 14. 
Show a quantum query lower bound of Ω(p N/k) for computing the following partial function with error probability ≤ 1/3: output 1 if the input string x ∈ {0, 1} N has at least k 1’s; output 0 if x = 0N . Be explicit about what relations R, Rj you are using, and about the values of the parameters m0, m1, ℓ0, ℓ1. (4 marks) 15. Let k be an odd natural number and N = k 2 . Define f : {0, 1} N → {0, 1} such that f(x) = Majk (ORk(x (1)), . . . , ORk(x (k) )) where x = x (1) . . . x(k) with each x (i) ∈ {0, 1} k , Majk is the k-bit majority function and ORk is the k-bit OR function. Show that Q(f) = Ω(N3/4 ). Be explicit about what relations R, Rj you are using, and about the values of the parameters m0, m1, ℓ0, ℓ1. (6 marks)


[SOLVED] Quantum computing (cs5100) : problem set 2

1. Construct a CNOT gate from two Hadamard gates and one controlled-Z gate. Recall, the controlled-Z gate maps |11⟩ → −|11⟩ and acts like the identity on the other basis states. (10 marks) 2. A SWAP-gate interchanges two qubits, i.e., it maps basis state |a, b⟩ to |b, a⟩. Implement a SWAP-gate using only CNOT gates. When using a CNOT, you’re allowed to use either of the 2 bits as the control, but be explicit about this. (10 marks) 3. Let U be a 1-qubit unitary that we would like to implement in a controlled way, i.e., we want to implement the map |c⟩|b⟩ 7→ |c⟩U c |b⟩ for all c, b ∈ {0, 1} (here U 0 = I and U 1 = U). Suppose there exist 1-qubit unitaries A, B, and C, such that ABC = I and AXBXC = U, where X is the NOT-gate. Give a circuit that acts on two qubits and implements a controlled-U gate, using CNOTs and (uncontrolled) A, B, and C gates. (10 marks) 4. Let C be a given quantum circuit consisting of T many gates, which may be CNOTs and single-qubit gates. Show that we can implement C in a controlled way using O(T) Toffoli gates, CNOTs and single-qubit gates, and no auxiliary qubits other than the controlling qubit. (10 marks) 5. Recall we can apply a standard query Ox to bitstring x ∈ {0, 1} N in the usual form: Ox : |i, b⟩ 7→ |i, b ⊕ xi⟩. 1 Give a circuit, involving one application of Ox and some other gates, to implement the following controlled-phase-query: Cx : |c, i, 0⟩ 7→ (−1)cxi |c, i, 0⟩. The idea here is that we implement a phase-query to x, but only in case the controlqubit (c ∈ {0, 1}) is set to 1. (10 marks) 6. Show that a standard query Ox can be implemented using one controlled-phase-query to x (which maps |c, i⟩ 7→ (−1)cxi |c, i⟩, so the phase is added only if the control bit is c = 1), and possibly some auxiliary qubits and other gates. (10 marks) 7. Give a circuit that maps |0 n , b⟩ 7→ |0 n , 1⊕b⟩ for b ∈ {0, 1}, and that maps |i, b⟩ 7→ |i, b⟩ whenever i ∈ {0, 1} n {0 n}. You are allowed to use elementary gates, including Toffoli gates, as well as auxiliary qubits that are initially |0⟩ and that should be put back to |0⟩ at the end of the computation. (10 marks) 8. Suppose we can make queries of the type |i, b⟩ 7→ |i, b ⊕ xi⟩ to input x ∈ {0, 1} N , with N = 2n . Let x ′ be the input x with its first bit flipped (e.g., if x = 0110 then x ′ = 1110). Give a circuit that implements a query to x ′ . Your circuit may use one query to x. (10 marks) 9. Suppose our N-bit input x satisfies the following promise: either (1) the first N/2 bits of x are all 0 and the second N/2 bits are all 1; or (2) the number of 1s in the first half of x plus the number of 0s in the second half, equals N/2. Modify the Deutsch-Jozsa algorithm to efficiently distinguish these two cases (1) and (2). (10 marks) 10. Consider the following generalization of Simon’s problem: the input is x = (x0, . . . , xN−1), with N = 2n and xi ∈ {0, 1} n with the property that there is some unknown subspace V ⊆ {0, 1} n (where {0, 1} n is the vector space of n-bit strings with entrywise addition modulo 2) such that xi = xj iff there exists a v ∈ V such that i = j ⊕ v. The usual definition of Simon’s problem corresponds to the case of 1-dimensional subspace V = {0, s}. Show that one run of Simon’s algorithm now produces a j ∈ {0, 1} n that is orthogonal to the whole subspace (i.e., j · v = 0 mod 2 for every v ∈ V ). (10 marks) 2


[SOLVED] Quantum computing (cs5100) : problem set 1

1. (a) Prove that there doesn’t exist a 2-qubit unitary U that maps |ϕ⟩|0⟩ → |ϕ⟩|ϕ⟩ for every qubit |ϕ⟩. (8 marks) (b) Prove that there doesn’t exist a 2-qubit unitary U that maps |0⟩|0⟩ → |0⟩|0⟩ and |+⟩|0⟩ → |+⟩|+⟩. (15 marks) 2. Alice and Bob prepare an EPR pair, that is, two qubits in the state √ 1 2 (|00⟩ + |11⟩). They each take one qubit home. Suddenly, Alice decides she wishes to convey one of 4 messages to Bob; in other words, she wants to convey a classical string ab ∈ {0, 1} 2 to Bob. Alice does the following in the privacy of her own home: First, if a = 1 she applies a NOT gate to her qubit (else if a = 0 she does nothing here). Next, if b = 1, she applies a Z (phaseflip) gate,  1 0 0 −1  , to her qubit (else if b = 0 she does nothing here). (a) Write the resulting 2-qubit state for the four different cases that ab could take. (8 marks) (b) Suppose Alice sends her half of the state to Bob, who now has two qubits. Show that Bob can determine both a and b from his state. (14 marks) 3. Let θ ∈ [0, 2π), Uθ =  cos θ − sin θ sin θ cos θ  , |ϕ⟩ = Uθ|0⟩ and |ϕ ⊥⟩ = Uθ|1⟩. 1 (a) Show that ZX|ϕ ⊥⟩ = |ϕ⟩. Recall, X is the NOT gate. (5 marks) (b) Show that an EPR-pair, √ 1 2 (|00⟩ + |11⟩), can also be written as √ 1 2 (|ϕ⟩|ϕ⟩ + |ϕ ⊥⟩|ϕ ⊥⟩). (10 marks) (c) Suppose Alice and Bob start with an EPR-pair. Alice applies U −1 θ to her qubit and then measures it in the standard computational basis. What (pure) state does Bob have if her outcome was 0, and what (pure) state does he have if her outcome was 1? (8 marks) (d) Suppose Alice knows the number θ but Bob does not. Give a protocol that uses one EPR-pair and 1 classical bit of communication where Bob ends up with the qubit |ϕ⟩ (in contrast to general teleportation of an unknown qubit, which uses 1 EPR-pair and 2 bits of communication). (10 marks) 4. Recall the CHSH game we saw in class: • Alice gets x ∈ {0, 1} and Bob gets y ∈ {0, 1} • Alice outputs a ∈ {0, 1} and Bob outputs b ∈ {0, 1} (recall they can’t communicate) • the success condition for the game is a ⊕ b = x ∧ y. • their goal is to succeed with high probability when the inputs are given uniformly at random. Now suppose that Alice and Bob can build magic “non-local boxes” that would allow them to succeed at the CHSH game with 100% probability. That is, a non-local box is an imaginary device that has an input-output port at Alice’s and another one at Bob’s, even though they are spatially distant; furthermore, Alice can put a bit x ∈ {0, 1} into the box and get back a bit a ∈ {0, 1}, Bob can put a bit y ∈ {0, 1} into the box and get back a bit b ∈ {0, 1}, and these bits will always satisfy a⊕b = x∧y. Also, inspired by entanglement, we assume that a non-local box is a one-shot device, that is, one box can only be used once. (a) Assume that Alice and Bob are spatially distant, but they have access to n of these magical non-local boxes. Assume also that Alice knows n bits x1, . . . , xn ∈ {0, 1}, Bob knows n bits y1, . . . , yn ∈ {0, 1}, and they wish to compute the “inner product mod 2” function of their bits, IP2(x1, . . . xn, y1, . . . , yn) = x1 · y1 + · · · + xn · yn (mod 2). Show that by using the non-local boxes, and then allowing one classical bit of communication from Alice to Bob, Bob can learn the value IP2(x1, . . . , xn, y1, . . . , yn). (10 marks) (b) Show that every Boolean function f : {0, 1} n → {0, 1} can also be computed by a polynomial over F2. Recall F2 is the field with two elements {0, 1} with addition and multiplication being performed (mod 2). 
(8 marks) 2 (c) Let f(x1, . . . , xn, y1, . . . , yn): {0, 1} 2n → {0, 1} be a Boolean function over 2n variables. Now suppose Alice knows x1, . . . , xn, Bob knows y1, . . . , yn, and they wish to compute f applied to their two inputs: f(x1, . . . , xn, y1, . . . , yn). Show that by using as many non-local boxes as they want, and then using two classical bits of communication, both of them can learn the value f(x1, . . . , xn, y1, . . . , yn). (Quantify the number of non-local boxes used in your protocol.) (14 marks) 5. We had seen one-qubit teleportation in class. In fact, entangled states can also be teleported. Suppose Alice has prepared a two-qubit entangled state |ϕ⟩ := α|00⟩ + β|11⟩. She wishes to teleport one half of |ϕ⟩ to Bob and another half to Charlie, so that in the end Bob and Charlie will hold halves of the entangled state |ϕ⟩ despite never physically interacting. Give a protocol to achieve this. (10 marks) 6. Alice and Bob prepare the following 2-qubit state: |ψ⟩ = H ⊗ H  1 √ 3 |00⟩ + 1 √ 3 |01⟩ + 1 √ 3 |10⟩  . Alice now takes control of the first qubit and Bob takes control of the second qubit. Each of Alice and Bob now flips a coin and does the following: If they flip Tails, they directly measure their qubit; if they flip Heads, they first apply a Hadamard to their qubit and then they measure. (a) Prove the following statements: (10 marks) If Alice flips T and Bob flips T, it’s possible A & B will measure 1, 1 respectively. If Alice flips T and Bob flips H, it’s impossible A & B will measure 1, 0 respectively. If Alice flips H and Bob flips T, it’s impossible A & B will measure 0, 1 respectively. If Alice flips H and Bob flips H, it’s impossible A & B will measure 1, 1 respectively. The next two questions carry zero marks and needn’t be turned in. But it would be good if you spend some time thinking about them. (b) Lucien says the following: “Let’s consider the situation before any coin flips or measurement happens, and try to decide what outcomes the qubits are capable of producing when measured. • Consider the first statement in (a). Since it’s possible that Alice will flip Tails and Bob will flip Tails, we conclude that prior to any coin flips/measuring, it’s possible for Alice’s qubit to register 1 after being directly measured. • Now consider the second statement in (a). Since Alice’s qubit is capable of generating a 1 when she flips Tails, it must be impossible for Bob’s qubit to produce a 0 when he flips Heads, and consequently Hadamards-thenmeasures. • Let’s repeat the previous two bullet points, interchanging ‘Alice’ and ‘Bob’. By the first statement in (a), we conclude that prior to any coin flips/measuring, it’s possible for Bob’s qubit to register a 1 when directly measured. Hence 3 by the third statement in (a), since Bob’s qubit is capable of generating a 1 when directly measured, we conclude that it must be impossible for Alice’s qubit to produce a 0 when she Hadamards-then-measures. • We’ve concluded that in case of flipping Heads, for both Alice and Bob it’s impossible for them to register a 0 when they Hadamard-and-measure; i.e., they must both register a 1 in this case. But this contradicts the fourth statement in (a).” Critique the four bullet points above. Do you agree or disagree with Lucien? (c) Read Scott Aaronson’s blog post from Sept. 25, 2018, It’s hard to think when someone Hadamards your brain. Critique his argument. Do you agree or disagree with him? 4

$25.00 View

[SOLVED] Cmput275—assignment 2

Memory Management: In order to complete some of these questions you will be required to use dynamic memory allocation. Your programs must not leak any memory; if you leak memory on a test case then you are considered to have failed that test case. You can test your program for memory leaks by using the tool valgrind.
Memory Requirements: In addition to not leaking memory, your programs must not use, at any one time, more than double the maximum amount of memory they require. That is, if implementing a dynamic array you should use the doubling strategy. If you simply allocate a very large array hoping input sizes will never exceed it, then you will not receive marks for that question. For initializing dynamic arrays you may initialize them to have a capacity of 4.
Compilation Flags: each of your programs should be compiled with the following command: gcc -Wall -Wvla -Werror. These are the flags we’ll compile your program with, and should they result in a compilation error then your code will not be compiled and run for testing.
Allowed libraries: stdlib.h, stdio.h, and string.h. No other libraries are allowed.
1. Wordl
In this question you will develop a game called wordl. Wordl is a word game much like the game wordle. In the game of wordl there is a secret code word that the user is trying to guess, and whenever a user makes a guess the following happens:
• The user’s guess is printed out colourized with the following rules (see the short sketch below for one way to organise this bookkeeping):
– If the letter in the guess is not in the code at all, then it is printed out as white.
– If the letter in the guess is the same letter at that position in the code, then the letter is printed out as green.
– If the letter in the guess occurs in the code but is in the wrong spot, then you print it out as yellow, provided that the corresponding letter in the code is not matched as green by the previous rule and that you have not already printed that character in yellow a number of times equal to the number of times that character occurs in the code minus the number of times it was printed green.
For example, if the code was verge and you guessed bevel then the printed out response would be: bevel. However, if the code was bevel and you guessed eenie then the printed out response would be: eenie, with the important distinction that the last letter e was coloured white, despite the fact that the letter e appears in the code. This is because a yellow e and a green e had already been printed, so the coloured e’s already accounted for both letters e in the code.
Your program expects one command line argument, which is the code word that must be guessed. Your program then should repeat the following process:
(a) Print the message “Enter guess: ” and read a string from standard input; that string will be the user’s guess.
(b) Print out the colourized version of the user’s guess followed by a newline.
(c) If the user did not guess the code word correctly, and has not already made 6 guesses, then begin these steps again.
(d) If the user did not guess the code correctly, and has already made 6 guesses, then print out the message “Failed to guess the word: ” followed by the secret code word and a newline. Then your program terminates.
(e) If the user does guess the code correctly, then print a message of the form “Finished in N guesses” followed by a newline, where N is replaced with the number of guesses the user made.
You may assume the user always gives you a guess exactly equal to the length of the code, and you may assume the code is never more than 12 characters long.
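To make the colouring rules concrete, here is a minimal sketch of the bookkeeping. It is not part of the starter code: the function name colourise and the integer colour codes are made up for illustration, and in your submission you would call the provided setColour with WHITE, GREEN, or YELLOW rather than storing numbers.

#include <stdio.h>
#include <string.h>

/* 0 = white, 1 = green, 2 = yellow (illustrative codes only) */
static void colourise(const char *guess, const char *code, int colour[]) {
    int remaining[256] = {0};   /* occurrences of each code letter not matched green */
    size_t n = strlen(guess);   /* guess and code are assumed to have equal length */
    for (size_t i = 0; i < n; i++) {                 /* pass 1: exact matches are green */
        if (guess[i] == code[i]) colour[i] = 1;
        else { colour[i] = 0; remaining[(unsigned char)code[i]]++; }
    }
    for (size_t i = 0; i < n; i++) {                 /* pass 2: yellow while unmatched copies remain */
        unsigned char c = (unsigned char)guess[i];
        if (colour[i] != 1 && remaining[c] > 0) { colour[i] = 2; remaining[c]--; }
    }
}

int main(void) {
    int colour[12];                                  /* codes are at most 12 characters */
    colourise("eenie", "bevel", colour);
    for (size_t i = 0; i < 5; i++) printf("%d", colour[i]);   /* prints 21000 */
    printf("\n");
    return 0;
}

On the example from the spec, the first e comes out yellow, the second e green, and the final e white, matching the explanation above.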
In order to print out colours you will have to print special colour characters, and these are characters that show up in your output, so you have to be careful about when you print them. You should only print out a colour character when you need to switch colours for the immediate next thing you’d like to print. In order to help with this we have given you some starter code that uses some global variables as well as a function setColour. You should use the function setColour and provide as the argument one of the global variables YELLOW, GREEN, or WHITE for those colours. The function makes sure not to print out the colour character if that colour is already set, which will help in ensuring you do not print out erroneous colour characters, but it does not guarantee you are printing them exactly and only when you need to.
Deliverables: For this question include in your final submission zip your C source code file named wordl.c
2. Reverse Polish Notation
In this question you will be writing a simple interpreter for arithmetic expressions written in reverse Polish notation. Reverse Polish notation, or postfix notation, is a notation for expressions where operators follow their operands. This is opposed to the usual infix notation that we are used to, where operators are placed in between their operands. That is, the infix expression 5 + 3 - 10 would be written in reverse Polish notation as 5 3 + 10 -.
For this program you need to be able to interpret any valid reverse Polish notation expression that includes any valid combination of integers and the operators p, s, *, /, where p means the addition operator, s means the subtraction operator, * means the multiplication operator and / means the integer division operator. You may assume no operations will result in integer overflow. That is, if any calculation would result in a value outside the bounds of [INT_MIN, INT_MAX], then that input is invalid. You may also assume that any expression you are given is valid, meaning that all operators have the appropriate number of operands and all operands are consumed to result in one final value.
Note: There may be any amount of whitespace in between each operator/operand. You must be able to handle this arbitrary amount of whitespace.
Hint 1: It will help you to create a stack. A stack is a simple data structure in which you can only add and remove elements to/from the back of it. Due to this behaviour a stack is called a last in, first out (LIFO) data structure. What should you do to your stack when you see an integer in the input stream? What should you do to your stack when you see an operator in the input stream?
Deliverables: For this question include in your final submission zip your C source code file named rpn.c
3. Integer Sets
In this question you will be writing a program that allows you to create and manipulate two arbitrary sets of integers. A set is a collection of objects with no duplicates, so your data structure should have no duplicate integers in it. Your program will need to read input that specifies several commands the user would like to perform on the two sets your program manages (which the user will refer to as x and y). Your program should read commands from the user, executing them until receiving the command q, upon which your program terminates. In all of the following commands, the set argument must be replaced with either x or y to refer to that particular set, and the integer argument must be replaced with an integer. The commands your program must handle are:
• a — the command for adding an integer to a set.
When you receive this command you should add the specified integer to the specified set. If the integer already exists in the set then you should not add it, as sets should not contain duplicates.
• r — the command for removing an integer from a set. When you receive this command you should remove the specified integer from the specified set. If the integer does not already exist in the set then no change should occur.
• p — the command for printing out a set. This should print out the elements of the specified set in increasing order, with one space between each element and ending in a newline. If there are no elements in the set then print nothing.
• u — the command for set union. When receiving this command you should calculate and print out the union of your two sets. The union of two sets s1 and s2 is the set which contains every element that is in either s1 or s2 (one way to build the union from a duplicate-free add is sketched after this listing). When printing out the union you should print out the elements in increasing order, with one space between each element and ending in a newline. If there are no elements in the union then print nothing.
• i — the command for set intersection. When receiving this command you should calculate and print out the intersection of your two sets. The intersection of two sets s1 and s2 is the set which contains only the elements that occur in both s1 and s2. When printing out the intersection you should print out the elements in increasing order, with one space between each element and ending in a newline. If there are no elements in the intersection then print nothing.
• q — the command for quitting your program; when received, your program should terminate.
Note: For each command there may be any amount of whitespace in between the components of the command, or in between separate commands. You must be able to handle this arbitrary amount of whitespace.
Deliverables: For this question include in your final submission zip your C source code file named int_set.c
How to submit: Create a zip file a2.zip and make sure that zip file contains your C source code files wordl.c, rpn.c, and int_set.c. Assuming all three of these files are in your current working directory, you can create your zip file with the command
$ zip a2.zip wordl.c rpn.c int_set.c
Upload your file a2.zip to the a1 submission link on eClass.
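The union and intersection commands fall out of a duplicate-free add. Here is a rough sketch: the names Set, set_contains, set_add and set_union are hypothetical, and the fixed-size array is only to keep the sketch short, since your real program must use dynamic allocation with the doubling strategy described above and must sort before printing.

#include <stdio.h>

typedef struct { int items[1000]; int size; } Set;    /* fixed capacity purely for the sketch */

static int set_contains(const Set *s, int v) {
    for (int i = 0; i < s->size; i++)
        if (s->items[i] == v) return 1;
    return 0;
}

static void set_add(Set *s, int v) {                  /* ignores duplicates, like command 'a' */
    if (!set_contains(s, v)) s->items[s->size++] = v;
}

static Set set_union(const Set *x, const Set *y) {    /* intersection is analogous */
    Set u; u.size = 0;
    for (int i = 0; i < x->size; i++) set_add(&u, x->items[i]);
    for (int i = 0; i < y->size; i++) set_add(&u, y->items[i]);
    return u;
}

int main(void) {
    Set x = {{0}, 0}, y = {{0}, 0};
    set_add(&x, 3); set_add(&x, 1); set_add(&y, 3); set_add(&y, 7);
    Set u = set_union(&x, &y);
    for (int i = 0; i < u.size; i++) printf("%d ", u.items[i]);   /* 3 1 7 (not yet sorted) */
    printf("\n");
    return 0;
}

For the intersection command you would instead keep only the elements of x that set_contains reports as members of y.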

$25.00 View

[SOLVED] Cmput275—assignment 3

1. Conway’s Game of Life
In this question you will be implementing Conway’s Game of Life. Conway’s Game of Life is not really a game; it is a cellular automaton with very simple rules. Firstly, it takes place on a grid of squares. Each cell in the grid can be either “alive” or “dead” and begins with an initial value given as the input. The cellular automaton then runs, updating the grid at each time step. At every time step cells will either become alive, become dead, or remain unchanged based on four simple rules:
(a) The underpopulation rule: Any live cell with fewer than two live neighbours dies.
(b) The overpopulation rule: Any live cell with more than three live neighbours dies.
(c) The reproduction rule: Any dead cell with exactly three live neighbours becomes a live cell.
(d) Any live cell with two or three live neighbours stays alive in the next generation.
In Conway’s Game of Life a neighbour is any cell adjacent in the cardinal or ordinal directions, that is, any other cell sharing a side or adjacent on a diagonal.
The first input to your program will be the grid, one row per line, with each alive cell represented by the character ’O’ and each dead cell represented by the character ’.’. Once the grid is complete your program will receive the letter ’x’ alone on a line. After your program has read in the grid, the only non-whitespace characters your program will receive are ’s’ or ’p’. When your program receives the command ’p’ it should print out the grid in its current state, with a line of pipe characters ’|’ above and below it. When your program receives the command ’s’ it should progress the grid one time step following the rules described above.
Deliverables: For this question include in your final submission zip your C++ source code file named conways.cc
2. In this question you will be developing some very basic ADTs and operations to simulate very simplistic forces in a two-dimensional plane. You will have to develop the classes Point, Force, and accompanying operations in order to work with the provided test harness a1q1.cc. The Point class will model a point in a two-dimensional plane with a given x and y float value. The Force class will model a force in a two-dimensional plane, with a given angle (in degrees) and magnitude. Angles in degrees start from the line out in the positive x direction of the Cartesian plane, and wrap back around at 360 degrees. (The original handout includes a diagram here, with angle labels 0°, 30°, 60°, …, 330° around the origin and Quadrants 1–4 marked.)
You have been provided a header 2DMotion.h, which you may change as you like. However, you must implement all the functionality used by the provided test harness file a1q1.cc, as that file (or a similar one in how it uses the Point and Force classes) will be used to test your program. You may not change the test harness, as we will test you with our own copy of it, so any changes you make won’t be reflected in how your solution is tested!
You must implement the following functions for this question:
• Point default constructor. Should initialize both x and y to 0.
• Force default constructor. Should initialize both angle and magnitude to 0.
• An overloaded input operator for Point objects. Should read in first the x field, then the y field.
• An overloaded input operator for Force objects. Should read in first the angle, then the magnitude.
• An overloaded output operator for Point objects.
Should print them out in the format “(x, y)”, where x and y denote the point’s values; for example, if the point has x value 4 and y value 5, you should print out “(4, 5)”.
• An overloaded output operator for Force objects. Should print them out in the format “A degrees with magnitude of M”, where A is the force’s angle and M is its magnitude.
• An overloaded addition operator between a Point object and a Force object. This should effectively create a new Point that is the result of “moving” the original by that Force. This requires a bit of trigonometry! You will require the cmath header, and should use the PI constant defined in the provided 2DMotion.h file. In order to move a Point by a given Force you must determine the horizontal and vertical components of the given Force. Doing so is simple trigonometry. Consider that the magnitude of a Force is simply the hypotenuse of a right-angle triangle. Given the hypotenuse and angle of a right-angle triangle you can easily find the lengths of the other sides (the horizontal and vertical components) using the sin and cos functions provided in the cmath library, but be wary: those operations work on radians!
• An overloaded multiplication operator between a Force and an int scalar. This should simply produce a new Force which has a magnitude scaled by the given scalar.
• int Point::quadrant() – A member function that returns the quadrant (1, 2, 3, or 4) that the given point is in. See the quadrants in the diagram mentioned above.
Deliverables: For this question include in your final submission zip your C++ header file 2DMotion.h and the implementation file 2DMotion.cc
3. Integer Sets — a full ADT this time
For this question you may not use any STL container, nor may you use any STL smart pointers. You must implement the class in question by managing memory yourself. These headers are already banned from the assignment, but this is to remind you. You may create any helper classes you want to help yourself manage memory.
In this question, you will implement a class intSet that represents a mathematical set for integers (recall, a mathematical set means no duplicate values are included). The interface has already been given to you in the provided intSet.h file. You may add private helper methods or additional fields as you see fit; however, you do not need to add additional fields (helper methods may be a good idea though). The important part of memory management in this class is that your add method, which adds an integer to the set, must follow the following memory management scheme (a minimal sketch of this scheme is given below):
• A default constructed intSet should have an array large enough to store 4 ints, and the capacity field should reflect this (your size field represents how many integers are actually in the array, and thus should be 0 for default constructed objects).
• When add is called and there is no more space in the current array (that is, capacity == size), then you must double the size of the array. That is, you must allocate a new array twice the size of the old array, copy over all the old elements, and finally add your new int to the array. Of course, this must also update the size and capacity fields correctly.
• Of course, if add is called and there is still space in the array, you simply need to add the int to the array and update the size field.
Recall though that add only actually adds the integer to the set if it doesn’t already exist in the set. This can be achieved either by not adding the int when it already exists, or adding it but ignoring duplicates in future functions. The behaviour is up to you.
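As a rough illustration of the doubling scheme for add, here is a minimal stand-in for the class. It is not the provided intSet.h: the field names data, size, and capacity are assumptions, so adapt them to whatever the real header declares.

class intSet {
    int *data;        // assumed name for the backing array
    int  size;        // number of ints currently stored
    int  capacity;    // current length of the backing array
public:
    intSet() : data(new int[4]), size(0), capacity(4) {}
    ~intSet() { delete[] data; }
    bool contains(int v) const {
        for (int i = 0; i < size; ++i)
            if (data[i] == v) return true;
        return false;
    }
    void add(int v) {
        if (contains(v)) return;             // sets never hold duplicates
        if (size == capacity) {              // full: allocate double, copy, free the old array
            int *bigger = new int[capacity * 2];
            for (int i = 0; i < size; ++i)
                bigger[i] = data[i];
            delete[] data;                   // no leaks
            data = bigger;
            capacity *= 2;
        }
        data[size++] = v;                    // append and bump size
    }
};

Starting from the default capacity of 4, a fifth add grows the array to 8, a ninth to 16, and so on, which keeps the allocated array within a factor of two of what is actually stored.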
You must implement all the functions in the provided header, that is:
• A copy constructor, which must perform a deep copy so that each set maintains its data independently.
• A copy assignment operator, which must perform a deep copy so that each set maintains its data independently.
• A move constructor, which must efficiently steal data and not perform a deep copy.
• A move assignment operator, which must efficiently steal data and not perform a deep copy. (A brief sketch of the copy-versus-move distinction appears after this list.)
• A destructor, which must free all memory allocated within the intSet.
• operator|, which consumes two intSet objects and returns a new intSet object which represents the set union of those two sets. The set union of two mathematical sets is defined as a set which contains all elements which occur in either set.
• operator&, which consumes two intSet objects and returns a new intSet object which represents the set intersection of those two sets. The set intersection of two mathematical sets is defined as a set which contains all elements which appear in both sets.
• operator==, which consumes two intSet objects and returns true if they are equal sets, and false otherwise. Two sets X and Y are equal if there does not exist an element in X that doesn’t exist in Y, and there also does not exist an element in Y that does not exist in X. That is, they contain exactly the same elements, though ordering does not matter.
• isSubset, which is a method called with an intSet parameter. It returns true if the parameter is a subset of the intSet the method was called on, and false otherwise. A set X is a subset of another set Y if every element that occurs in X also occurs in Y.
• contains, which is a method called with an int parameter. It returns true if the int parameter is a member of the set.
• add, which is a method called with an int parameter. It adds the int to the set (if the set already contained that int then it doesn’t need to do anything); it must follow the memory management scheme specified above.
• remove, which is a method called with an int parameter. It removes the int parameter from the set (if the set didn’t contain that int then it doesn’t need to do anything). You never need to shrink your array, so no matter how many elements are removed from a set you only ever change the contents of your array and your size variable, never changing capacity.
• A friend operator<< (the stream output operator).
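Assuming the provided intSet.h declares these constructors and uses fields along the lines of data, size, and capacity (an assumption, as in the sketch above), the difference between deep copying and stealing might look roughly like this:

intSet::intSet(const intSet &other)              // copy constructor: deep copy
    : data(new int[other.capacity]),
      size(other.size),
      capacity(other.capacity) {
    for (int i = 0; i < size; ++i)
        data[i] = other.data[i];                 // each set owns its own array
}

intSet::intSet(intSet &&other)                   // move constructor: steal the array
    : data(other.data),
      size(other.size),
      capacity(other.capacity) {
    other.data = nullptr;                        // leave the source empty but safely destructible
    other.size = 0;
    other.capacity = 0;
}

The copy assignment and move assignment operators follow the same pattern, with the extra steps of guarding against self-assignment and releasing the array the destination already owns before taking the new one.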

$25.00 View