Browse assignments

Assignment catalog

33,401 assignments available

[SOLVED] Cs6601 assignment 3 fall 2025 –

5/5 - (2 votes) Today, Bayes Nets have become more important than ever in Artificial Intelligence. The ability to generate solutions with incomplete data has become critical to predicting the future as well as classification and Deep Learning. It is the Bayesian Networks ability to represent complex relationships and reliably predict outcomes that is being used throughout AI. In this assignment, you will work with probabilistic models aka Bayesian networks to efficiently calculate the answer to probability questions concerning discrete random variables. Bayesian view is a subjective view which allows self-consistent ascription of prior probabilities to propositions, and requires arriving evidence to update the model. Since the 1960’s the brilliant work by Rev. Thomas Bayes (1702-1761) has been outperforming human experts in diagnosis of medical conditions. We also hope to prove the value of Sampling to you. In particular, Markov Chain Monte Carlo (MCMC) helps to overcome the limitations of traditional sampling. In MCMC each sample is generated by making a random change to the preceding sample MCMC are algorithms with a state, the next state is generated from the current one One specific MCMC used is Gibbs sampling. In which the sampling process settles into a “dynamic equilibrium” in which the longrun fraction of time spent in each state is exactly proportional to its posterior probability The states are generated given the Markov blanket, and state transitions are defined by the conditional distribution Resources You will find the following resources helpful for this assignment. Canvas Videos: Lecture 5 on Probability Lecture 6 on Bayes Nets Textbook: 4th edition: Chapter 12: Quantifying Uncertainty Chapter 13: Probabilistic Reasoning 3rd edition: Chapter 13: Quantifying Uncertainty Chapter 14: Probabilistic Reasoning Setup Clone the project repository from your private repo on Github git clone [your URL] Substitute your actual username where the angle brackets are. Navigate to assignment_3/ directory Activate the environment you created during Assignment 0 conda activate ai_env In case you used a different environment name, to list of all environments you have on your machine you can run conda env list. Run the following command in the command line to install and update the required packages pip install torch torchvision -f https://download.pytorch.org/whl/torch_stable.html pip install --upgrade -r requirements.txt For Mac OSX users with apple silicon, if you have error related to xgboost, you may also need to install xgboost. conda install py-xgboost Submission Please include all of your own code for submission in submission.py. Important: There is a TOTAL submission limit of 7 on Gradescope for this assignment. This means you can submit a maximum of 7 times during the duration of the assignment. Please use your submissions carefully and do not submit until you have thoroughly tested your code locally. If you’re at 6 submissions, use your last submission wisely. The submission marked as ‘Active’ in Gradescope will be the submission counted towards your grade. Restrictions You are not allowed to use following set of modules from ‘pgmpy’ Library. pgmpy.sampling.* pgmpy.factor.* – pgmpy.estimators.* Part 1 Bayesian network tutorial: [35 points total] To start, design a basic probabilistic model for the following system: James Bond, Quartermaster (Q), and M (Head of MI6) are in charge of the security system at the Military Intelligence Unit (MI6) of the British Secret Service. MI6 runs a special program called “Double-0”, where secret spy agents are trained and deployed to gather information concerning national security. A terrorist organization named “Spectre” is planning an espionage mission and its aim is to gain access to the secret “Double-0” files stored in the MI6 database. Q has designed a special security system to protect the secret “Double-0” files. In order to gain access to these files, Spectre needs to steal from MI6 a cipher and the key to crack this cipher. Q stores this cipher in his personal database, which is guarded by heavy security protocols. The key to cracking the cipher is known only to M, who is protected by Bond. 1a: Casting the net [10 points] We anticipate that Spectre will try to carry out their mission by performing the following steps: Hire professional hackers who can write programs to launch a cyberattack on Q’s personal database. Buy a state-of-the-art computer called “Contra” to launch their cyberattack. Hire ruthless mercenaries to kidnap M and get the key. Find a way to ensure Bond is not with M when they attempt their kidnapping. Use the cipher and key to access the target “Double-0” files. As an up and coming field agent, MI6 has assigned you to design a Bayes Network for modeling their mission, so it can be used to build defenses. This is the list of variables in your Bayes Network: “H”: That Spectre can hire the professional hackers “C”: That Spectre is able to buy Contra “M”: That Spectre is able to hire mercenaries “B”: That Bond is guarding M at the time of the attempted kidnapping “Q”: That Q’s database is hacked and the cipher is compromised “K”: That M gets kidnapped and has to divulge the key “D”: That Spectre succeeds in obtaining the “Double-0” files Based on previous attacks by Spectre, MI6 has provided data to help you determine probabilities for your Bayes Network: Spectre’s ability to find and hire skilled professional hackers has a probability of h = 0.5. Spectre’s ability to get their hands on Contra has a probability of c = 0.3. Spectre’s will not be able to hire the mercenaries has a probability of ¬m = 0.2. Bond is also assigned to another mission, so the probability that he will be protecting M at the time of the kidnapping attempt is b = 0.5. The ability of the professional hackers to crack Q’s personal database without using Contra has a probability of 0.55. The ability of the professional hackers to crack Q’s personal database using Contra has a probability of 0.9. If Spectre cannot hire professional hackers, and launches a cyberattack using their less experienced employees using Contra on Q’s personal database it will remain secure with a probability of 0.75 and will remain secure with a probability of 0.95 if Spectre does not have Contra. When Bond is protecting M, the probability that M does not get kidnapped is 0.85 if mercenaries conduct the attack. And if mercenaries are not present, the probability that M is not kidnapped is 0.99. However, if M is not accompanied by Bond, M gets kidnapped with probability 0.95 when mercenaries are present and 0.75 when they are not. With both the cipher and the key, Spectre can gain access to the “Double-0” files with a probability of 0.99. If Spectre has neither of these, then this probability is only 0.02. If Spectre has just the cipher, the probability that the “Double-0” files will not be compromised is 0.4. If Spectre has just the key, then this probability changes to 0.65 (i.e., 35% chance of compromise). Use the description of the model above to design a Bayesian network for this model. The pgmpy package is used to represent nodes and conditional probability arcs connecting nodes. Don’t worry about the probabilities for now. Use the functions below to create the net. You will write your code in submission.py. Fill in the function make_security_system_net() The following commands will create a BayesNet instance add node with name “node_name”: BayesNet = BayesianNetwork() BayesNet.add_node("node_name") You will use BayesNet.add_edge() to connect nodes. For example, to connect the parent and child nodes that you’ve already made (i.e. assuming that parent affects the child probability): Use function BayesNet.add_edge(,). For example: BayesNet.add_edge("parent","child") (Hint: after adding all edges, run BayesNet.check_model()—it will raise a helpful exception if you missed a parent or your CPDs don’t sum to 1.) After you have implemented make_security_system_net(), you can run the following test in the command line to make sure your network is set up correctly. python probability_tests.py ProbabilityTests.test_network_setup 1b: Setting the probabilities [15 points] Now set the conditional probabilities for the necessary variables on the network you just built. Fill in the function set_probability() Using pgmpy‘s factors.discrete.TabularCPD class: if you wanted to set the distribution for node ‘A’ with two possible values, where P(A) to 70% true, 30% false, you would invoke the following commands: cpd_a = TabularCPD('A', 2, values=[[0.3], [0.7]]) NOTE: Use index 0 to represent FALSE and index 1 to represent TRUE, or you may run into testing issues. If you wanted to set the distribution for P(G|A) to be A P(G=true given A) T 0.75 F 0.85 you would invoke: cpd_ga = TabularCPD('G', 2, values=[[0.15, 0.25], [ 0.85, 0.75]], evidence=['A'], evidence_card=[2]) Reference for the function: https://pgmpy.org/_modules/pgmpy/factors/discrete/CPD.html Modeling a three-variable relationship is a bit trickier. If you wanted to set the following distribution for P(T|A,G) to be A G P(T=true given A and G) T T 0.15 T F 0.6 F T 0.2 F F 0.1 you would invoke cpd_tag = TabularCPD('T', 2, values=[[0.9, 0.8, 0.4, 0.85], [0.1, 0.2, 0.6, 0.15]], evidence=['A', 'G'], evidence_card=[2, 2]) The key is to remember that first row represents the probability for P(T==False), and second row represents P(T==true). Add Tabular conditional probability distributions to the bayesian model instance by using following command. bayes_net.add_cpds(cpd_a, cpd_ga, cpd_tag) You can check your probability setups in the command line with python probability_tests.py ProbabilityTests.test_probability_setup and probability values: python probability_tests.py ProbabilityTests.test_security_system_probabilities 1c: Probability calculations : Perform inference [10 points] To finish up, you’re going to perform inference on the network to calculate the following probabilities: What is the marginal probability that the “Double-0” files get compromised? You just received an update that the British Elite Forces have successfully secured and shut down Contra, making it unavailable for Spectre. Now, what is the conditional probability that the “Double-0” files get compromised? Despite shutting down Contra, MI6 still believes that an attack is imminent. Thus, Bond is reassigned full-time to protect M. Given this new update and Contra still shut down, what is the conditional probability that the “Double-0” files get compromised? You’ll fill out the “get_prob” functions to calculate the probabilities: get_marginal_double0() get_conditional_double0_given_no_contra() get_conditional_double0_given_no_contra_and_bond_guarding() Here’s an example of how to do inference for the marginal probability of the “A” node being True (assuming bayes_net is your network): solver = VariableElimination(bayes_net) marginal_prob = solver.query(variables=['A'], joint=False) prob = marginal_prob['A'].values To compute the conditional probability, set the evidence variables before computing the marginal as seen below (here we’re computing P(‘A’ = false | ‘B’ = true, ‘C’ = False)): solver = VariableElimination(bayes_net) conditional_prob = solver.query(variables=['A'],evidence={'B':1,'C':0}, joint=False) prob = conditional_prob['A'].values NOTE: marginal_prob and conditional_prob return two probabilities corresponding to [False, True] case. You must index into the correct position in prob to obtain the particular probability value you are looking for. For example, prob[1] is the probability the queried variable is true. If you need to sanity-check to make sure you’re doing inference correctly, you can run inference on one of the probabilities that we gave you in 1a. For instance, running inference on P(M=false) should return 0.20 (i.e. 20%). However, due to imprecision in some machines it could appear as 0.199xx. Calculating one or two of these by hand (or in a spreadsheet) is strongly recommended; it is the quickest way to spot indexing mistakes before submitting. Part 2: Sampling [65 points total] For the main exercise, consider the following scenario. There are three frisbee teams who play each other: the Airheads, the Buffoons, and the Clods (A, B and C for short). Each match is between two teams, and each team can either win, lose, or draw in a match. Each team has a fixed but unknown skill level, represented as an integer from 0 to 3. The outcome of each match is probabilistically proportional to the difference in skill level between the teams. Sampling is a method for ESTIMATING a probability distribution when it is prohibitively expensive (even for inference!) to completely compute the distribution. Here, we want to estimate the outcome of the matches, given prior knowledge of previous matches. Rather than using inference, we will do so by sampling the network using two Markov Chain Monte Carlo models: (2c) Gibbs sampling and (2d) Metropolis-Hastings . 2a: Build the network. [10 points] For the first sub-part, consider a network with 3 teams : the Airheads, the Buffoons, and the Clods (A, B and C for short). 3 total matches are played. Build a Bayes Net to represent the three teams and their influences on the match outcomes. Fill in the function get_game_network() Assume the following variable conventions: variable name description A A’s skill level B B’s skill level C C’s skill level AvB the outcome of A vs. B (0 = A wins, 1 = B wins, 2 = tie) BvC the outcome of B vs. C (0 = B wins, 1 = C wins, 2 = tie) CvA the outcome of C vs. A (0 = C wins, 1 = A wins, 2 = tie) (Throughout Part 2 the index 0 means the first item shown in each parenthesis, index 1 the second, and so on.) Use the following name attributes: “A” “B” “C” “AvB” “BvC” “CvA” Assume that each team has the following prior distribution of skill levels: skill level P(skill level) 0 0.15 1 0.45 2 0.30 3 0.10 In addition, assume that the differences in skill levels correspond to the following probabilities of winning: skill difference (T2 – T1) T1 wins T2 wins Tie 0 0.10 0.10 0.80 1 0.20 0.60 0.20 2 0.15 0.75 0.10 3 0.05 0.90 0.05 If the difference is negative, swap T1 and T2 (i.e., use the row for the positive difference and swap the win columns). You can check your network implementation in the command line with python probability_tests.py ProbabilityTests.test_games_network 2b: Calculate posterior distribution for the 3rd match. [5 points] Suppose that you watched two of the three games and observed the following outcome: A beats B A draws with C Calculate the posterior distribution for the outcome of the BvC match in calculate_posterior(). Let’s first use the inference engine, VariableElimination, provided by the pgmpy library to perform inference. This result will be the ground truth for you to test your estimation in 2c) Gibbs and 2d) MH, where you are NOT allowed to use the inference engine there. You can check your posteriors in the command line with python probability_tests.py ProbabilityTests.test_posterior NOTE: In the following sections, we’ll be arriving at the same values by using sampling-based methods. Hints Regarding sampling for Part 2c, 2d, and 2e Hint 1: In both Metropolis-Hastings and Gibbs sampling, you’ll need access to each node’s probability distribution and nodes. You can access these by calling: A_cpd = bayes_net.get_cpds('A') team_table = A_cpd.values AvB_cpd = bayes_net.get_cpds("AvB") match_table = AvB_cpd.values Hint 2: While performing sampling, you will have to generate your initial sample by sampling uniformly at random an outcome for each non-evidence variable and by keeping the outcome of your evidence variables (AvB and CvA) fixed. Hint 3: You’ll also want to use the random package, e.g. random.randint() or random.choice(), for the probabilistic choices that sampling makes. 2c: Gibbs sampling [15 points] Implement the Gibbs sampling algorithm, which is a special case of Metropolis-Hastings. You’ll do this in Gibbs_sampler(), which takes a Bayesian network and initial state value as a parameter and returns a sample state (a tuple of six integers in the order [A,B,C,AvB,BvC,CvA]) drawn from the network’s distribution. The function should just consist of a “single iteration” of the algorithm. “Single iteration” means this function generates exactly one new sample based on the previous sample you have. The new sample is chosen with some distribution you have to compute by written down the equation. If an initial value is not given (initial state is None or and empty list), default to a state chosen uniformly at random from the possible states. Your sampling process could look like something below: state_chosen := randomly choose one *non‑evidence* index (0,1,2 or 4) if state_chosen = A: (derive your formula yourself) (Don't use VariableElimination or other inference engine, or you will have 600 sec timeout issue on Grade Scope) prob = [your formula] sample new state with the prob else if state_chosen = B: ... else if ... Note: DO NOT USE the given inference engines or pgmpy samplers to run the sampling method, since the whole point of sampling is to calculate marginals without running inference. "YOU WILL SCORE 0 POINTS ON THIS ASSIGNMENT IF YOU USE THE GIVEN INFERENCE ENGINES FOR THIS PART" (Gradescope enforces this by timing out any submission that calls VariableElimination inside the sampler.) You may find this helpful in understanding the basics of Gibbs sampling over Bayesian networks. Please note that there is a typo in this material: on page 3, $P( eg c|r,s,w) = 0.5556$ instead of 0.6923. 2d: Metropolis-Hastings sampling [15 points] Now you will implement the independent Metropolis-Hastings sampling algorithm in MH_sampler(), which is another method for estimating a probability distribution. The general idea of MH is to build an approximation of a latent probability distribution by repeatedly generating a “candidate” value for each sample vector comprising of the random variables in the system, and then probabilistically accepting or rejecting the candidate value based on an underlying acceptance function. Unlike Gibbs, in case of MH, the returned state can differ in multiple non-evidence variables. This paper provides a nice intro. You may need to download and view the file locally, as there are some issues with GitHub’s rendering. This method should just perform a “single iteration” of the algorithm. If an initial value is not given, default to a state chosen uniformly at random from the possible states. Note: DO NOT USE the given inference engines to run the sampling method, since the whole point of sampling is to calculate marginals without running inference. "YOU WILL SCORE 0 POINTS IF YOU USE THE PROVIDED INFERENCE ENGINES, OR ANY OTHER BLACK-BOX SAMPLER" 2e: Comparing sampling methods [19 points] Now we are ready for the moment of truth. Given the same outcomes as in 2b, A beats B and A draws with C, you should now estimate the likelihood of different outcomes for the third match by running Gibbs sampling until it converges to a stationary distribution. We’ll say that the sampler has converged when, for “N” successive iterations, the difference in expected outcome for the 3rd match differs from the previous estimated outcome by less than “delta”. N is a positive integer, delta goes from (0,1). For the most stationary convergence, delta should be very small. N could take values like 10,20,…,100 or even more. Play it out yourself! Use the functions from 2c and 2d to measure how many iterations it takes for Gibbs and MH to converge to a stationary distribution over the posterior. See for yourself how close (or not) this stable distribution is to what the Inference Engine returned in 2b. And if not, try tuning those parameters(N and delta). (You might find the concept of “burn-in” period useful). Count only post-burn-in iterations when reporting Gibbscount and MHcount; rejected MH proposals still count as one iteration. You can choose any N and delta (with the bounds above), as long as the convergence criterion is eventually met. Repeat this experiment for Metropolis-Hastings sampling. Implement compare_sampling() to (i) run each sampler until convergence, (ii) return the final BvC distribution, and (iii) record the iteration counts. Which algorithm converges more quickly? By approximately what factor? For instance, if Metropolis-Hastings takes twice as many iterations to converge as Gibbs sampling, you’d say that Gibbs converged faster by a factor of 2. Fill in sampling_question() to answer both parts. You only need to comment out or delete “raise NotImplementedError” and modify choice and factor. 2f: Return your name