5/5 - (1 vote) “…and we aim to show stronger improvements starting next fiscal quarter.” This all-hands call is taking forever, you think to yourself while daydreaming about where you should go for your holiday in two weeks. “…will be taking the lead with our next client. We expect great things from them.” Wait. Did your boss just say your name? “The client is an IDS vendor whose product uses machine learning models to identify malware. However, they have noticed that their models are frequently evaded and hope that we can find out why.” You don’t remember being told about this, but at this point, you guess you’re used to it. You’re just thankful your coworker was taking notes during the meeting and he gave you some tutorials on MLSploit, a framework you are expected to use. And later, you found some information about the attack. Assignment The purpose of this assignment is to gain experience with training machine learning (ML) and deep learning (DL) models classifying Windows portable executable (PE) malware into families. Specifically, the models will be given two different datasets: benign PE files and malicious PE files from multiple families. After training a DL models, you will attack those models using an evasion attack called the Mimicry Attack. Then, you will be tasked with improving the models which were attacked. Finally, you will train a ML model using different features and see if the mimicry attack still work. You will write a report about your experiences and observations. There are 5 tasks and a bonus task you will need to complete for this assignment. They include: Training DL Models (10%): Train LSTM, CNN, and RNN models on API call sequences. Attacking DL Models (10%): Attack models via mimicry attack. Detecting Attack (20%): Train model based on static features to detect the attack sample. Training ML Models (10%): Train classical ML models on API call existence, frequency, and arguments. Transferring Attack (10%): Run the mimicry attacks in controlled environment and evaluate the ML models. Attack ML Model using RL (10% bonus): Train ML model using ember and train RL model to evade the ML model. You will also need to compile a report (40%) that should contain screenshots of your findings and explanations for why the certain screenshot happened. For example, if your screenshot is comparing the results of how well different models detected the attack in Task 2, then an explanation for why the results differed should be included. To complete the tasks, you will also need these files Download these files. Supplementary Material: Lab 7_Supplementary_Material.pdf , Task 3 TemplateDownload Task 3 Template Deliverables Compress the deliverables for each task into a .tar.gz file called [GT Username]_cs6264_lab07.tar.gz with the following directory layout: task1/ pe.model.zip prediction.zip *.log.txt files task2/ attack-exe.zip: the attack samples generated from MLSploit to evade your models attack-feature.zip attack-prediction.zip attack.cfg.zip: Configuration file from MLSploit *.log.txt files task3.a/ detection1.py: Source code (preferably in Python) that will train a new model that will detect the attack from the previous task detection2.py detection3.py (others here if you wish) task3.b/ model1.zip model2.zip model3.zip (others here if you wish) task4/ pe.model.zip prediction.zip *.log.txt files task5/ task1 model/ prediction.zip *.log.txt files task4 model/ prediction.zip *.log.txt files report.pdf: This report should contain screenshots of your findings and explanations for why the certain screenshot happened. For example, if your screenshot is comparing the results of how well different models detected the attack in part 2, then an explanation for why the results differed should be included. bonus/ model.zip *.log.txt files ember-attack.zip *.log.txt files Warning: Warning: The malware binary we provide you (and the malware produced by MLSploit) is real malware. Do not under any circumstances execute these malware EVER. It is a compiled form of the rbot malware family and antivirus companies are well-aware of their existence (https://github.com/ytisf/theZooLinks to an external site.). We have not applied any static obfuscation to them so they should be easily detectable by AV companies. You are to use these binaries responsibly by only reading their byte contents (e.g., using tools like https://github.com/erocarrera/pefileLinks to an external site.). Lab 7: Supplementary Information CS 6264-OCY Overview Introduction Training Deep Learning Models Attacking Deep Learning Models Defending DL Models with ML Cat and Mouse Further Reading Introduction A single malware family may have several variants, making it hard to detect all of them comprehensively However, these malware variants will generally all have the same behaviour across different variants One way to detect an entire family of malware is to train a machine learning model based on the behaviour of a malware family so that if the model sees a new variant, it will still be able to classify the malware as from a specific strain This makes it not only faster but can allow you to scale your malware detection up than classifying each malware by hand Just like the pattern-recognition Host-Based IDS you created in lab 3, these models classify and identify syscall sequences (among other parameters) MLSploit MLSploit was developed by Georgia Tech PhD students with oversight from Georgia Tech Professors and Intel research scientists Helps you train and test different machine learning solutions against machine learning attacks Made with a simple GUI for usability See tutorials in lab files for more information on how you should operate MLSploit Training a DL Model To explore the world of machine learning security, we will first construct a few models to test later We want to test LSTM, CNN, and RNN models You can change the window size (i.e. length of the sequence of syscalls that the model will compare at each time) on the right size of the UI for more accurate models A full-length tutorial can be found in Canvas called Lab07_PE_Module_Tutorial.pdf Attacking a DL Model Now that our models are created, let’s see how easily they can be attacked (AKA tricked into thinking a malicious file is benign) In MLSploit, we will create a new pipeline that will first perform a “mimicry attack” against a benign application to figure out what a benign application might do to evade the machine learning model Next in the pipeline, we transform the benign application by injecting 10 different shellcode chunks into it, creating 10 samples that might evade the machine learning model by making it think that it was benign These steps are also detailed in the same tutorial Identifying Malware Meant to Trick DL with ML As we have learned in this course, combining both static and dynamic analysis makes for much more robust malware identification To incorporate static analysis into our process, we will need to train an ML model on static features of normal programs to identify differences in one that was injected with a mimicry attack We have provided static features of benign programs for you, so first we will extract features of the malicious programs you created in the previous step We will be using EMBER To extract features like EMBER, check out how PEFeatureExtractor is used by scripts/ember_init.py You may also find a fork of EMBER here with some useful scripts After you have static features extracted for both, let us move on to training a model to identify malware based on static features First, we must prepare the data for the ML algorithm Note that we are using a classification algorithm as this is a classification problem ○ We have some labels (malicious vs. benign) and we want to put them on a set of THREE features ○ First, we make a table with features and a label for eat set of features Next, we will split this table into the table of features and the table of labels ● You can do this with hsplit() or numpy.split(axis=1) x, y = numpy.split(dataset, [-1], axis=1) Now, you can create a training set with scikit-learn’s train_test_split() method This will output a training and testing array for both features and labels for the ML algorithm to use Now, we can finally train a model Initialize a classifier like the DecisionTreeClassifier Then, use the class’ fit() method to train your model To figure out the accuracy of your model, first use the model to predict labels for the test array of features, and then get the accuracy score using sci-kitlearn’s accuracy_score() method y_pred = dt.predict(x_test) print(accuracy_score(y_test, y_pred)) Cat and Mouse You can do something similar with MLSploit Follow the tutorial included with the project and create traditional ML models You can also try to run the same attack from before against these models Further Reading Useful Example Code Mlsploit-pe (has useful fork of EMBER) GitHub – evandowning/mlsploit-pe: MLSploit PE module Scikit-learn documentation API Reference — scikit-learn 0.24.1 documentation ML Evasion Mimicry attacks on host-based intrusion detection systems | Proceedings of the 9th ACM conference on Computer and communications security [1804.04637] EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models (arxiv.org) [1801.08917] Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning (arxiv.org)
Let us return one last time to the events which occurred in 2014 at a small eCommerce start-up company in Monroe, CT.Your investigation of the insidious greencat malware is coming to an end. You have found that the malware is using an HTTP connection to transmit command and control messages to a suspicious IP address. This discovery leads you to identify several predicates in the greencat binary which seem to control the execution of each malicious payload.“How can I be sure which payloads are controlled by each predicate?” You wonder …Static control dependence will not suffice. You suspect that some payloads in greencat may be packed or self-modifying. Therefore, analyzing the malware’s execution via dynamic analysis is the only way to expose its true behaviors.“One problem remains,” You think. “I have to precisely recreate the attack steps, as they unfolded at the eCommerce start-up. How can I know which command and control inputs will exercise those exact greencat execution paths?”Luckily, you remember that the part-time IT support professionals employed by the eCommerce start-up had saved network packet traces for all of the accounting department computers!“This is perfect!” You exclaim.The plan that you devise is as follows.Step 1: You must design a dynamic control dependence pintool.Step 2: Using a fake greencat C&C server, you can execute greencat with the pintool, send it C&C commands, and recover the dynamic control dependence of each payload.Step 3: You can check each payload’s network communication that you observe against the packet trace to be sure you’ve covered the correct execution paths.Disclaimer: In this assignment, you will only accomplish Steps 1 and 2.Instructions:The story for this lab is quite accurate to how a malware analyst would instrument, manipulate, and compare against evidence from the original cyber-attack. A dynamic Control Dependence Graph (CDG) enables malware analysis tools to link code sections to the predicates that control their execution. In a malware investigation (as in the story above) this can reveal exactly what network or environment input triggered the attack evidence the investigator observed. A CDG is also an essential program analysis build block that is required for program slicing, and as you know from the research papers, program slicing is widely used to focus analysis algorithms on key malware capabilities. However, like in previous labs, tool designers (you) must make tradeoffs, specifically in how to obtain immediate post-dominators. In this lab, you will create a dynamic CDG using the “puppeteered” GreenCat traces you collected in Lab 5. After completing this lab, I encourage you to go a step further and write a simple analysis script to automatically identify which C&C command causes each GreenCat function to execute (e.g., use an GHIDRA plugin to automatically color each instruction based on which C&C command causes it to execute).To determine control dependence, you will need to identify the immediate post-dominators of many instructions. For this lab we expect you to calculate IPDs based on the dynamic trace (with the control flow trace that your pintool generated for Lab #5). As we discussed in class, it is fine if your solution requires executing the dynamic analysis multiple times.Note that: If you obtain immediate post-dominators from the dynamic execution trace, then your immediate post-dominators might cover the entire program (i.e., you will have control dependence between functions). This is acceptable for this lab!After the process that your pintool is tracing exits (i.e., in your pintool’s Fini function), generate a DOT directed graph file representing the control dependence of all the observed instructions. Each node in your DOT directed graph file should be the address of an instruction that was executed (only ONE node per instruction address). The edges in your DOT directed graph file should go from each executed instruction to the instructions that it is control dependent on.For example, assume the instruction at 0x638 is control dependent on the instruction at 0x634. The DOT directed graph file generated by your tool should be as follows:The order of the edges in the DOT directed graph file does not matter. Also see: https://stackove rflow.com/questions/1494492/graphviz-how-to-go-from-dot-to-a-graphYou can use sendsignal.exe on the Desktop (via the command prompt) to kill it.Use your pintool and explore all the different control flow paths that each of the greencat command and control (C&C) commands exercise. Refer back to your previous labs to recover the C&C commands that greencat accepts from its C&C server. Send each command to greencat (one time is enough; order does not matter) and generate one DOT file.Submit your pintool source code and the DOT file your pintool generated.Note 1: You may not need to change your pintool from Lab #5. That is fine.Note 2: You may include a “post-processing” step (e.g., a python script) after your pintool runs that can be used to formulate the control dependence graph used in your submission instead of doing the CDG calculation in your pintool code.Note 3: Feel free to use any third-party packages for graph algorithms/processing (hint: NetworkX is a handy python library for dealing with graphs).As always: Post any questions or ideas on Ed Discussion! Even code snippets are fine, as long as they do not give away a key answer to this assignment. Class collaboration is encouraged — It’s us versus malware! If you’re not sure if your post is safe, send it to the Prof/TA in a private post to verify.Additional Example:Consider this example:Consider what it would look like if we converted this to a digraph in the form of our lab 5 output (1A and 1B are G and H respectively):Now consider the control dependence output:Notice that you can add a dummy START node, which can be helpful for calculating control dependence, and in this case, we can say that node A and node H are control dependent on START, though whether or not this exists in your output is not a requirement. Also notice that the edges are reversed compared to the image, though it is describing the same node relationship. Describing the relationship as seen in the example output is what we expect.For example, B is control dependent on A because:Grade: 100 pointsGrading Criteria:The grade will be based on the correctness of the DOT file that your pintool and/or prost-processing script generates.Your grade is dependent on the count of correct control dependence relationships you define. For example, if you miss 10% of the CDG relationships you will get a 90% on the lab. If you miss 10% of the expected CDG relationships and add an extra 10% of incorrect relationships to your CDG you will receive an 80% on the lab.Teams:This assignment can be done individually or in a team of 2. Please join a group in Gradescope if you are collaborating.Do not create or join a group in Canvas. Canvas groups are different from Gradescope groups.New to Gradescope? This link provides instructions for how to create groups in Gradescope: https://help.gradescope.com/article/m5qz2xsnjy-student-add-group-membersZoom can also provide the ability to collaborate and video conference with your teammate.Submission Instructions:Upload the following to Gradescope:The DOT file that your pintool generated, named “submission.dot”.Your pintool code, named “plugin.cpp” and any other code files needed to run your solution. We reserve the right to run all submitted code, through automated means or otherwise, and if it is found that your code does not output equivalent to your original dotfile submission then you will also receive a zero.Be advised, please submit (1) and (2) separately, do NOT zip them together.Note: Gradescope will only check the formatting of your submission. Gradescope will not automatically check the correctness and provide a grade.Note: You can download the webc2-greencat-2.7z file directly into your lab environment. After you are done with this lab, you can submit your files directly from the lab environment (Highly recommended). Doing this will help you avoid transferring the file from the lab environment to your personal computer.Transferring Files:To transfer files from your personal device to the lab environment or to the Windows7 VM:Create a zip folder of all the files that you would like to transfer to the lab environment or the Windows7 VM.Every GT student has Box and OneDrive accounts given free by the institution. Login to either of those two and upload the desired files.Now go back to the lab environment or the Windows7 VM and login to either of those two services where you uploaded you zip folder. Download folder to the lab environment or the Windows7 VM and use the appropriate 7z command to unzip your folder.FAQWhat is the purpose of the additional, ungraded, GradeScope assignment with the same release period as Lab 6?This is an ungraded assignment that is exactly the same as Lab 5. It lets students submit their dynamic CFG to ensure that the coverage meets the minimum requirements. Since Lab 6 is a dynamic CDG, we test the dynamic CDG for all the points that we look for in the coverage we require for Lab 5 (these labs are largely interdependent). Making sure you start out with maximum coverage on Lab 5 before implementing Lab 6 will likely result in the best grade overall.What to do when you encounter technical difficulties?If you are experiencing technical difficulty such as being unable to access the lab environment, please submit a ticket to the “Digital Learning Tools and Platforms” team at https://gatech.servicenow.com/continuity. And on the ticket, please put “Route to the DLT Team” at the top of the ticket because it will help the Service Desk know where to send it.
Acknowledgments: This project is based on one used in Columbia University’s Artificial Intelligence Course (COMS W4701). Special thanks to Dr. Daniel Bauer, who developed the starter code that we’ve extended. Othello is a 2-player board game that is played with distinct pieces that are typically dark on one side and light on the other side, each side belonging to one player. Our version of the game is played on a chess board of any size, but the typical game is played on an 8×8 board. Players (dark and light) take turns placing their pieces on the board. Placement is dictated by the rules of the game and can result in the flipping of colored pieces from light to dark or dark to light. The rules of the game are explained in detail at https://en.wikipedia.org/ wiki/Reversi. Objective: The player’s goal is to ensure that, by the end of the game, the majority of the pieces displayed on the board are of their color. Game Ending: Our version of the game differs from the standard rules described on Wikipedia in one minor point: The game ends as soon as one of the players has no legal moves left. Rules: The game begins with four pieces placed in a square in the middle of the grid, two light pieces and two dark pieces (see Figure 1, at left). The dark player always makes the first move. • A move is defined as a player placing a piece on an unoccupied square on the board during their turn. • A move is considered legal ONLY if the move “brackets” one or more opponent pieces in a straight line along at least one axis (vertical, horizontal, or diagonal). Figure 1: On the left, the initial state. In the middle, possible moves of dark player are shown in grey. At right, the board state after the dark player has moved. For example, from the initial state the dark player can achieve this bracketing by placing a dark piece in any of the positions indicated by grey pieces in 1 (in the middle). Each of these potential placements would create a Dark-Light-Dark sequence, thus “bracketing” the Light piece. Once the piece is placed, all opponent pieces that got bracketed by the move are flipped to become the same color as the current player’s. Returning to our example, if the dark player places a piece in Position 6-d in the middle panel of Figure 1,the light piece in position 5-d will become bracketed and consequently flipped to dark, resulting in the board depicted in the right panel of Figure 1. Figure 2: On the left, possible moves of the light player are shown in grey. At right, the board state after the move of the light player is shown. Now it’s the light player’s turn to play. All of the light player’s possibilities at this time are shown as grey pieces in 2 (at left). If the light player places a piece on 4-c, it will cause the dark piece in 4-d to be bracketed, resulting in the 4-d piece being flipped to light, as shown in 2(at right). To summarize, a legal move for a player is one that results in at least one of its opponent’s pieces being flipped. Our version of the game ends when one player no longer has any legal moves available. 1 Starter Code The starter code contains 5 files: 1. othello gui.py, which contains a simple graphical user interface (GUI) for Othello. 2. othello game.py, which contains the game ”manager”. This stores the current game state and communicates with different player AIs. 3. othello shared.py, which contains functions for computing legal moves, captured disks, and successor game states. These are shared between the game manager, the GUI and the AI players. 4. randy ai.py, which specifies an ”AI” player (named Randy) that randomly selects a legal move. 5. agent.py – The file where you will implement your game agent. Game State Representation: Each game state contains two pieces of information: the current player and the current disks on the board. Throughout our implementation, Player 1 (dark) is represented by the integer 1, and Player 2 (light) by the integer 2. The board is represented as a tuple of tuples. The inner tuple represents each row of the board. Each entry in the rows is either an empty square (integer 0), a dark disk (integer 1), or a light disk (integer 2). For example, an 8×8 initial state looks like this:( (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 2, 1, 0, 0, 0), (0, 0, 0, 1, 2, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0) ) Running the code: You can run the Othello GUI by typing: python3 othello_gui.py -d board_size -a agent.py where the parameter board size is an integer that determines the dimension of the board upon which you will play, and agent.py is the game agent you would like to play against. If you type: python3 othello_gui.py -d board_size -a randy_ai.py you will play against an agent that selects moves randomly (named Randy). Playing a game should bring up a game window. If you play against a game agent, you and the agent will take turns. We recommend that you play against Randy to develop a better understanding of how the game works and what strategies can give you an advantage. Remember that othello gui.py has been written so that if you want to play against an agent, you should supply the agent file only after -a (so that the agent will play as light and you as dark). If you use -b to introduce an agent, the game will not detect it and will think that you want to play both sides yourself. The GUI can also take two AI programs as command line parameters. When two AIs are specified at the command line, you can watch them play against each other. Again, remember that the agent introduced with -a will play as the dark player, and the agent introduced with -b will play as the light player. To see Randy play against itself, type: python3 othello_gui.py -d board_size -a randy_ai.py -b randy_ai.py You may want to try playing the agents you create against those that are made by your friends. The GUI is rather minimalistic, so you need to close the window and then restart to play a new game. Communication between the Game Host and the AI: NOTE: This is a technical detail that you can skip if you are not interested. Functions for communicating with the game manager are provided as part of the scaffolding code. Note, however, that the game manager communicates with agents via stdout. If you want to print debuggingstatements, you should print to stderr instead. You can do this by using the eprint command found in agent.py. The AI and the Game Manager/GUI run in different Python interpreters. The Game Manager/GUI spawns a child process for each AI player. This makes it easier for the game manager to let the AI process time out and also ensures that if an AI crashes, the game manager can keep running. To communicate with the child process, the game manager uses pipes: it reads from the AI’s standard output and writes to its standard input. The two programs follow a precise protocol to communicate: • The AI sends a string to identify itself (e.g., Randy AI sends the string “Randy.” You can come up with a fun name for your AI). • The game manager sends back “1” or “2,” indicating whether the AI plays dark or light. • The AI then waits for input from the game manager. When it is the AI’s turn, the game manager will send two lines: the current score (e.g., “SCORE 2 2”) and the game board (a Python tuple converted to a string). The game manager then waits for the AI to respond with a move (e.g., “4 3”). • At the end of the game, the game master sends the final score (e.g., “FINAL 33 31”). Time Constraints: Your AI player is expected to make a move within 10 seconds. If no move is selected within that time, the AI loses the game. This time constraint does not apply to human players. You may change the time constraint by editing line 32 in othello game.py: TIMEOUT = 10. However, during grading and the Othello competition, your AI will be run with a timeout of 10 seconds. Mark Breakdown 1. Minimax [30 pts] You will want to test your Minimax implementations on boards that are only 4×4 in size. This restriction makes the game somewhat trivial: it is easy even for human players to think ahead to the end of the game. When both players play optimally, the player who goes second always wins. However, the default Minimax algorithm, without a depth limit, takes too long, even on a 6×6 board. Write a function compute utility(board, color) that computes the utility of a final game board state (in the format described above). The utility should be calculated as the number of disks of the player’s color minus the number of disks of the opponent. HINT: The function get score(board) returns a tuple (number of dark disks, number of light disks). Then, implement the method select move minimax(board, color, limit, caching). For now, you can ignore the limit and caching parameters; we will return to these later. Your function should select the action that leads to the state with the highest minimax value. The parameter board is the current board (in the format described above), and color is the color of the AI player (using 1 for dark and 2 for light). The return value should be a (column, row) tuple representing the move. Implement minimax recursively by writing two functions: minimax max node(board, color, limit, caching) and minimax min node(board, color, limit, caching). (Again, ignore limit and caching for now.) HINT: Use the function get possible moves(board, color) in othello shared.py to get the list of legal moves for the player, and use play move(board, color, move) to compute the successor board state. Optional: There is an alternative approach to implement the minimax algorithm using only one function instead of two. Consider what other modifications might be needed. Once you are done, you can run your MINIMAX algorithm via the command line using the flag -m. For example: python3 othello_gui.py -d 4 -a agent.py -m will let you play against your agent on a 4×4 board using the MINIMAX algorithm. The command python3 othello_gui.py -d 4 -a agent.py will have you play against the same agent using the ALPHA-BETA algorithm, which you will implement next. You can also play your agent against Randy with: python3 othello_gui.py -d 4 -a agent.py -b randy_ai.py 2. Alpha-Beta Pruning [30 pts] The simple minimax approach becomes infeasible for boards larger than 4×4. To address this, write the function select move alphabeta(board, color, limit, caching, ordering) to compute the best move using alpha-beta pruning. The parameters and return values are the same as for minimax. For now, ignore the limit, caching, and ordering parameters. Your alpha-beta implementation should recursively call two helper functions: alphabeta min node(board, color, alpha, beta) and alphabeta max node(color, alpha, beta). Play with pruning should speed up the AI decisions. Use: python3 othello_gui.py -d 4 -a agent.py to play against your agent using the ALPHA-BETA algorithm on a 4×4 board. 3. Depth Limit [10 pts] To further speed up your agents, implement a depth limit by using the -l flag at the command line. For example: python3 othello_gui.py -d 6 -a agent.py -m -l 5 calls your agent’s MINIMAX routine with a depth limit of 5, and python3 othello_gui.py -d 6 -a agent.py -l 5calls your agent’s ALPHA-BETA routine with a depth limit of 5. Alpha-beta and minimax will pass the limit parameter recursively. When the depth limit (i.e., when limit equals zero) is reached, use a heuristic (such as the compute utility function) to estimate the value of the non-terminal state. Experiment with different depth limits on boards larger than 4×4. What is the largest board you can play on without timing out after 10 seconds? 4. Caching States [10 pts] To speed up the AI further, modify your program to respond to the -c flag for caching. Create a dictionary (a global variable in your AI file) that maps board states to their minimax value. Modify your minimax and alpha-beta functions to store computed values in this dictionary and check it before re-computing a state. When running: python3 othello_gui.py -d 6 -a agent.py -m -c -l 5 the game manager will call your agent’s MINIMAX routines with caching enabled. Similarly, omitting -m will use the ALPHA-BETA routines with caching. Note that caching may not always improve runtime significantly, but it should when searching deeply. 5. Node Ordering Heuristic [10 pts] Alpha-beta pruning can be more efficient if nodes with higher utility are explored first. In your ALPHABETA functions, order the successor states according to the heuristic: nodes for which the number of the AI player’s disks minus the number of the opponent’s disks is highest should be explored first. (This is essentially the utility function.) Enable node ordering when the -o flag is placed on the command line. For example: python3 othello_gui.py -d 6 -a agent.py -o -l 5 will have the game manager call your ALPHA-BETA routines with an ordering parameter of 1. To play against your agent on an 8×8 board using state caching, alpha-beta pruning, and node ordering, type: python3 othello_gui.py -d 8 -a agent.py -c -o -l 5 6. Your Own Heuristic [10 pts] The previous steps should give you a good AI player, but there is room for improvement. To create a better AI, design your own game heuristic. You can use this heuristic in place of the compute utility function in your alpha-beta routines. Some Ideas for Heuristic Functions for the Othello Game: 1. Consider board locations where pieces are stable (i.e., cannot be flipped). 2. Consider the number of moves available to you versus your opponent.3. Use different strategies for the opening, mid-game, and end-game. To grade your heuristic, we put two minimax agents against each other. One uses the compute utility and one uses the compute heuristic function submitted by you as its heuristic function. A total of 10 games are played for depth limits 2 to 6 and both color assignments. Each game the agent with your heuristic wins, you get 1 point and with each tie, you get 0.5 points. Remember that the agents are developed by TAs and we only import the compute heuristic function from your file. Your grade only depends on this function and not anything else in your submission. In addition, please include a (short) description of your heuristic as a comment at the start of your solution file. GOOD LUCK!
There are two parts to this assignment. 1. Propagators. You will implement two constraint propagators—a Forward Checking constraint propagator, and a General Arc Consistency constraint propagator 2. CSP Encodings. You will implement three different CSP encodings: two grid-only puzzle encodings, and one full FunPuzz encoding (adding cage constraints to grid). What is supplied • cspbase.py. Class definitions for the python objects Constraint, Variable, and BT. • propagators.py. Starter code for the implementation of your two propagators. You will modify this file with the addition of two new procedures prop FC and prop GAC. • puzzle csp.py. Starter code for the CSP encodings. You will modify three procedures in this file: binary ne grid, nary ad grid, and caged csp. • csp sample run.py. This file contains a sample implementation of two CSP problems. • tests.py. Sample test cases. Run the tests with “python3 tests.py”. This file will be released some time after the assignment is posted.11+ 2/ 20x 6x 3- 3/ 240x 6x 6x 7+ 30x 6x 9+ 8+ 2/ 11+ 2/ 20x 6x 3- 3/ 240x 6x 6x 7+ 30x 6x 9+ 8+ 2/ 5 6 3 4 1 2 6 1 4 5 2 3 4 5 2 3 6 1 3 4 1 2 5 6 2 3 6 1 4 5 1 2 5 6 3 4 Figure 1: An example of a 6×6 FunPuzz grid with its start state (left) and solution (right). FunPuzz Formal Description The FunPuzz puzzle has the following formal description: • FunPuzz consists of an n×n grid where each cell of the grid can be assigned a number 1 to n. No digit appears more than once in any row or column. Grids range in size from 3×3 to 9×9. • FunPuzz grids are divided into heavily outlined groups of cells called cages. These cages come with a target and an operation. The numbers in the cells of each cage must produce the target value when combined using the operation. • For any given cage, the operation is one of addition, subtraction, multiplication or division. Values in a cage can be combined in any order: the first number in a cage may be used to divide the second, for example, or vice versa. Note that the four operators are “left associative” e.g., 16/4/4 is interpreted as (16/4)/4 = 1 rather than 16/(4/4) = 16. • A puzzle is solved if all empty cells are filled in with an integer from 1 to n and all above constraints are satisfied. • An example of a 6×6 grid is shown in Figure 1. Note that your solution will be tested on n×n grids where n can be from 3 to 9. Additional notes: • It is possible for a given cell to not participate in any cage constraints. That is still a valid FunPuzz board. • For division, we are only concerned with divisions producing integer results throughout the operation.Question 1: Propagators (worth 55/100 marks) You will implement Python functions to realize two constraint propagators—a Forward Checking (FC) constraint propagator and a Generalized Arc Consistence (GAC) constraint propagator. These propagators are briefly described below. The files cspbase.py and propagators.py provide the complete input/output specification of the two functions you are to implement. Brief implementation description: The propagator functions take as input a CSP object csp and (optionally) a Variable newVar representing a newly instantiated Variable, and return a tuple of (bool,list) where bool is False if and only if a dead-end is found, and list is a list of (Variable, value) tuples that have been pruned by the propagator. In all cases, the CSP object is used to access variables and constraints of the problem, via methods found in cspbase.py. You must implement: prop FC (worth 25/100 marks) A propagator function that propagates according to the FC algorithm that check constraints that have exactly one uninstantiated variable in their scope, and prune appropriately. If newVar is None, forward check all constraints. Otherwise only check constraints containing newVar. prop GAC (worth 30/100 marks) A propagator function that propagates according to the Generalized Arc Consistency (GAC) algorithm, as covered in lecture. If newVar is None, run GAC on all constraints. Else, if newVar=var only check constraints containing newVar. Question 2: CSP Encodings (worth 45/100 marks) You will implement three different CSP encodings using three different constraint types. The three different constraint types are (1) binary not-equal; (2) n-ary all-different; and (3) FunPuzz cage. The three encodings are (a) binary grid-only FunPuzz; (b) n-ary grid-only FunPuzz; and (c) complete FunPuzz. The CSP encodings you will build are described below. The file puzzle csp.py provides the complete input/output specification. Brief implementation description: The three encodings take as input a valid FunPuzz grid, which is a list of lists, where the first list has a single element, N, which is the size of each dimension of the board, and each following list represents a cage in the grid. Cell names are encoded as integers in the range 11,…,nn and each inner list contains the numbers of the cells that are included in the corresponding cage, followed by the target value for that cage and the operation (0=’+’, 1=’-’, 2=’/’, 3=’*’). If a list has two elements, the first element corresponds to a cell, and the second one—the target—is the value enforced on that cell. For example, the grid ((3),(11,12,13,6,0),(21,22,31,2,2),….) corresponds to a 3×3 board1 where 1. cells 11, 12 and 13 must sum to 6, and 2. the result of dividing some permutation of cells 21, 22, and 31 must be 2. That is, (C21/C22)/C23 = 2 or (C21/C23)/C22 = 2, or (C22/C21)/C23 = 2, etc… 1Note that cell indexing starts from 1, e.g. 11 is the cell in the upper left corner.All encodings need to return a CSP object, and a list of lists of Variable objects representing the board. The returned list of lists is used to access the solution. The grid-only encodings do not need to encode the cage constraints. You must implement: binary ne grid (worth 10/100 marks) An encoding of a FunPuzz grid (without cage constraints) built using only binary not-equal constraints for both the row and column constraints. nary ad grid (worth 10/100 marks) An encoding of a FunPuzz grid (without cage constraints) built using only n-ary all-different constraints for both the row and column constraints. caged csp (worth 25/100 marks) An encoding built using your choice of (1) binary binary not-equal, or (2) n-ary all-different constraints for the grid, together with (3) FunPuzz cage constraints. That is, you will choose one of the previous two grid encodings and expand it to include FunPuzz cage constraints. Notes: The CSP encodings you will construct can be space expensive, especially for constraints over many variables, (e.g., for cage constraints and those contained in the first binary ne grid grid CSP encoding). Also be mindful of the time complexity of your methods for identifying satisfying tuples, especially when coding the caged csp. HAVE FUN and GOOD LUCK!
1 Introduction The goal of this assignment will be to implement a working solver for the puzzle game Sokoban shown in Figure 1. Sokoban is a puzzle game in which a warehouse robot must push boxes into storage spaces. The rules hold that only one box can be moved at a time, that boxes can only be pushed by robots and not pulled, and that neither robots nor boxes can pass through obstacles (walls or other boxes). In addition, robots cannot push more than one box, i.e., if there are two boxes in a row, they cannot push them. The game is over when all the boxes are in their storage spots. In our version of Sokoban the rules are slightly more complicated, as there may be more than one warehouse robot available to push boxes. These robots cannot pass through one another nor can they move simultaneously, however. Sokoban can be played online at https://www.sokobanonline.com/play. We recommend that you familiarize yourself with the rules and objective of the game before proceeding. It is worth noting that the version that is presented online is only an example. We will give a formal description of the puzzle in the next section. 2 Description of Sokoban Sokoban has the following formal description. Note that our version differs from the standard one. Read the description carefully. • The puzzle is played on a board that is a grid board with N squares in the x-dimension and M squares in the y-dimension.• Each state contains the x and y coordinates for each robot, the boxes, the storage spots, and the obstacles. • From each state, each robot can move North, South, East, or West. No two robots can move simultaneously, however. If a robot moves to the location of a box, the box will move one square in the same direction. Boxes and robots cannot pass through walls or obstacles, however. Robots cannot push more than one box at a time; if two boxes are in succession the robot will not be able to move them. Movements that cause a box to move more than one unit of the grid are also illegal. Whether or not a robot is pushing an object does not change the cost. • Each movement is of equal cost. Whether or not the robot is pushing an object does not change the cost. • The goal is achieved when each box is located in a storage area on the grid. Ideally, we will want our robots to organize everything before the supervisor arrives. This means that with each problem instance, you will be given a computation time constraint. You must attempt to provide some legal solution to the problem (i.e., a plan) within this constraint. Better plans will be plans that are shorter, i.e. that require fewer operators to complete. Your goal is to implement an anytime algorithm for this problem, meaning that your algorithm should generate better solutions (i.e., shorter plans) the more computation time it is given. 3 Code You Have Been Provided The code for this assignment consists of several Python files, some of which you will need to read and understand in order to complete the assignment. You have been provided: 1. search.py 2. sokoban.py 3. solution.py 4. autograder.py The only file you will submit is solution.py. We consider the other files to be starter code, and we will test your code using the original versions of those files. In order for your solution.py to be compatible with our starter code, you should not modify the starter code. In addition, you should not modify the functions defined in the starter code files from within solution.py. The file search.py, which is available from the website, provides a generic search engine framework and code to perform several different search routines. This code will serve as a base for your Sokoban solver. A brief description of the functionality of search.py follows. The code itself is documented and worth reading. • An object of class StateSpace represents a node in the state space of a generic search problem. The base class defines a fixed interface that is used by the SearchEngine class to perform search in that state space.For the Sokoban problem, we will define a concrete sub-class that inherits from StateSpace. This concrete sub-class will inherit some of the “utility” methods that are implemented in the base class. Each StateSpace object s has the following key attributes: – s.gval: the g value of that node, i.e., the total cost of getting to that state (from the initial state). – s.parent: the parent StateSpace object of s, i.e., the StateSpace object that has s as a successor. This will be None if s is the initial state. – s.action: a string that contains that name of the action that was applied to s.parent to generate s. Will be “START” if s is the initial state. • An object of class SearchEngine se runs the search procedure. A SearchEngine object is initialized with a search strategy (‘depth first’, ‘breadth first’, ‘best first’, ‘a star’, or ‘custom’) and a cycle checking level (‘none’, ‘path’, or ‘full’). Note that SearchEngine depends on two auxiliary classes: – An object of class sNode sn which represents a node in the search space. Each object sn contains a StateSpace object and additional details: hval, i.e., the heuristic function value of that state and gval, i.e. the cost to arrive at that node from the initial state. An f val f n and weight are tied to search nodes during the execution of a search, where applicable. – An object of class Open is used to represent the search frontier. The search frontier will be organized in the way that is appropriate for a given search strategy. When a SearchEngine’s search strategy is set to ‘custom’, you will have to specify the way that f values of nodes are calculated; these values will structure the order of the nodes that are expanded during your search. Once a SearchEngine object has been instantiated, you can set up a specific search with: init search(initial state,goal f n,heuristic f n, f val f n) and execute that search with search(timebound, costbound) The arguments are as follows: – initial state will be an object of type StateSpace; it is your start state. – goal f n(s) is a function which returns True if a given state s is a goal state and False otherwise. – heuristic f n(s) is a function that returns a heuristic value for state s. This function will only be used if your search engine has been instantiated to be a heuristic search (e.g., best first). – f val f n(sNode,weight) defines f values for states. This function will only be used by your search engine if it has been instantiated to execute a ‘custom’ search. Note that this function takes in an sNode and that an sNode contains not only a state but additional measures of the state (e.g., a gval). The function also takes in a float weight. It will use the variables that are provided to arrive at an f value calculation for the state contained in the sNode. – timebound is a bound on the amount of time your code will execute the search. Once the run time exceeds the time bound, the search will stop; if no solution has been found, the search will return False.– costbound is an optional parameter that is used to set boundaries on the cost of nodes that are explored. This costbound is defined as a list of three values. costbound[0] is used to prune states based on their g-values; any state with a g-value higher than costbound[0] will not be expanded. costbound[1] is used to prune states based on their h-values; any state with an hvalue higher than costbound[1] will not be expanded. Finally, costbound[2] is used to prune states based on their f-values; any state with an f-value higher than costbound[2] will not be expanded. The output of the search function will include both a solution path as well as a SearchStats object (if a solution is found). A SearchStats object (ss) details some interesting statistics that are related to a given search. Its attributes are as follows: • ss.states expanded, which is a count of the number of states drawn from the Frontier during a search. • ss.states generated, which is a count of the number of states generated by the successor function during a search. • ss.states pruned cycles, which is a count of the number of states pruned as a result of cycle checking. • ss.states pruned cost, which is a count of the number of states pruned as a result of enforcing cost boundaries during a search. For this assignment we have also provided sokoban.py, which specializes StateSpace for the Sokoban problem. You will therefore not need to encode representations of Sokoban states or the successor function for Sokoban! These have been provided to you so that you can focus on implementing good search heuristics and anytime algorithms. The file sokoban.py contains: • An object of class SokobanState, which is a StateSpace with these additional key attributes: – s.width: the width of the Sokoban board – s.height: the height of the Sokoban board – s.robots: positions for each robot that is on the board. Each robot position is a tuple (x, y), that denotes the robot’s x and y position. – s.boxes: positions for each box that is on the board. Each box position is also an (x, y) tuple. – s.storage: positions for each storage bin that is on the board (also (x, y) tuples). – s.obstacles: locations of all of the obstacles (i.e. walls) on the board. Obstacles, like robots and boxes, are also tuples of (x, y) coordinates. • SokobanState also contains the following key functions: – successors(): This function generates a list of SokobanStates that are successors to a given SokobanState. Each state will be annotated by the action that was used to arrive at the SokobanState. These actions are (r,d) tuples wherein r denotes the index of the robot that moved d denotes the direction of movement of the robot. – hashable state(): This is a function that calculates a unique index to represents a particular SokobanState. It is used to facilitate path and cycle checking.– print state(): This function prints a SokobanState to stdout. Note that SokobanState depends on one auxiliary class: – An object of class Direction, which is used to define the directions that the robot can move and the effect of this movement. Also note that sokoban.py contains a set of 20 initial states for Sokoban problems, which are stored in the tuple PROBLEMS. You can use these states to test your implementations. The file solution.py contains the methods that need to be implemented. The file autograder.py runs some tests on your code to give you an indication of how well your methods perform. 4 Assignment Specifics – Your Tasks To complete this assignment you must modify solution.py to: 1. Implement a Manhattan distance heuristic (heur manhattan distance(state)). This heuristic will be used to estimate how many moves a current state is from a goal state. The Manhattan distance between coordinates(x0, y0) and (x1, y1) is |x0−x1|+|y0−y1|. Your implementation should calculate the sum of Manhattan distances between each box that has yet to be stored and the storage point nearest to it. Ignore the positions of obstacles in your calculations and assume that many boxes can be stored at one location. 2. Implement a non-trivial heuristic for Sokoban that improves on the Manhattan distance heuristic (heur alternate(state)). Place a description of your heuristic in the comments of your code. 3. Implement a version of weighted A* search. Use the function declaration: weighted astar(initial state,heur f n,weight,timebound). Weighted A* balances features of Greedy Search against those of A* search. It requires a specialized f-value function (described below). Note that to run weighted astar you will need to instantiate a SearchEngine object with a custom search strategy and initialize this object with a your f-value function. More details are provided in Section 6. 4. Implement an iterative version of weighted A* in order to create an A* search that will provide a solution in whatever time is given. This will use weighted A* to generate solutions quickly; it will then iteratively refine and improve solutions as time allows. Use the function declaration: iterative astar(initial state,heur f n,weight,timebound). More details are provided in Section 7. 5. Greedy search can also be modified to refine solutions iteratively! Do this next by implementing an iterative version of Greedy Search. More details are provided in Section 5. Note that when we are testing your code, we will limit each run of your algorithm on teach.cs to 2 seconds. Instances that are not solved within this limit will provide an interesting evaluation metric: failure rate.5 Iterative Greedy Best-First Search Greedy best-first search expands nodes with lowest h(node) first. The solution found by this algorithm may not be optimal. Iterative greedy-best first search (which is called iterative gb f s in the code) continues searching after a solution is found in order to improve solution quality. Since we have found a path to the goal after the first iteration, we can introduce a cost bound for pruning: if node has g(node) greater than the best path the goal found so far, we can prune it. The algorithm returns either when we have expanded all non-pruned nodes, in which case the best solution found by the algorithm is the optimal solution, or when it runs out of time. We prune based on the g-value of the node only because greedy best-first search is not necessarily run with an admissible heuristic. Record the time when iterative gb f s is called with os.times()[0]. Each time you call search, you should update the time bound with the remaining allowed time. The automarking script will confirm that your algorithm obeys the specified time bound. 6 Weighted A* Instead of A*’s regular node-valuation formula f(node) = g(node) +h(node), Weighted A* introduces a weighted formula: f(node) = g(node) +w∗ h(node) where g(node) is the cost of the path to node, h(node) the estimated cost of getting from node to the goal, and w ≥ 1 is a bias towards states that are closer to the goal. Theoretically, the smaller w is, the better the first solution found will be (i.e., the closer to the optimal solution it will be … why??). However, different values of w will require different computation times. Start by implementing Weighted A* in the function weighted astar(initial state,heur f n,weight,timebound) using the f-value function above. This will require you to instantiate a custom SearchEngine and an f-value of your own design. When you are passing in f val f unction to init search for this problem, you will need to have specified the weight for f val f unction. You can do this by wrapping the f val f unction(sN,weight) you have written in an anonymous function, i.e., wrapped fval function = (lambda sN: fval function(sN, weight)) Explore the performance of your weighted A* implementation on the test problems that have been provided using the Manhattan distance heuristic and the following weights: 10, 5, 2, 1. Which weights yield the fastest time to a solution? Which yields the least cost solution? 7 Iterative Weighted A* You have hopefully discovered that, even when using an admissible heuristic, the length of Weighted A* solutions may not be optimal when w is anything larger than 1. We can therefore keep searching after we have found a solution in order to try and find a better one. More specifically, we can continue to useWeighted A* with smaller and smaller weights, as time allows, in an effort to improve on our solution with our remaining time. This is the idea behind Iterative Weighted A*. Iterative Weighted A* continues to search for solutions until either there are no nodes left to expand (and our best solution is the optimal one) or it runs out of time. It will do this by running Weighted A* again and again, with increasingly small weights. Since Iterative Weighted A* will have found a path to the goal after its first search iteration, we can use this solution to guide our search in subsequent iterations. More specifically, we can introduce a cost bound that will help prune nodes in future iterations: if any node we generate has a g(node) +h(node) value greater than the cost of the best path to the goal found so far, we can prune it. Implement an iterative version of weighted A* search using the following function stub: iterative astar(initial state,heur f n,weight,timebound). This should be an iterative search that makes use of your weighted A*. When a solution is found, remember it and, if time allows, iterate upon it. Change your weight at each iteration and enforce a cost boundary so that you will move toward more optimal solutions at each iteration. GOOD LUCK!
1. Suppose that a page can contain at most four data values and that all data values are integers. Using only B+ trees of order 2, give examples of each of the following: a. A B+ tree whose height changes from 2 to 3 when the value 25 is inserted. Show your structure before and after the insertion. b. A B+ tree in which the deletion of the value 30 leads to a redistribution. Show your structure before and after the deletion. c. A B+ tree in which the deletion of the value 35 causes a merging of two nodes but without altering the height of the tree. 2. Answer the following questions about Linear Hashing: a. How does Linear Hashing provide an average-case search cost of only slightly more than one disk I/O, given that overflow buckets are part of its data structure? b. Does Linear Hashing guarantee at most one disk access to retrieve a record with a given key value?
1. Answer the following questions about files and indexes: a. What alternatives are available for the data entries in an index? b. What is the difference between a clustered index and an unclustered index? If an index contains data records as ‘data entries’, can it be unclustered? c. How many clustered indexes can you create on a file? Would you always create at least one clustered index for a file? 2. Explain the terms seek time, rotational delay, and transfer time. 3. What is sequential flooding of the buffer pool? 4. Describe two possible record formats. What are the trade-offs between them? 5. Why do frames in the buffer pool have a pin count instead of a pin flag
1. What is a foreign key constraint? Why are such constraints important? What is referential integrity? (10 pts) 2. Explain the difference between external, internal, and conceptual schemas. How are these different schema layers related to the concepts of logical and physical data independence? (10 pts) 3. Consider the following schema: Suppliers (sid: integer, sname: string, address: string) Parts (pid: integer, pname: string, color: string) Catalog (sid: integer, pid: integer, cost: real) The Catalog relation lists the prices charged for parts by suppliers. Write the following queries in SQL (40 pts): a.) Find the pnames of parts for which there is some supplier. b.) For each part, find the sname of the supplier who charges the most for that part. c.) Find the sids of suppliers who supply only red parts. d.) Find the snames of suppliers who supply every part. 4. Consider the following schema: Employee (person-name, street, city) Works (person-name, company-name, salary) Company (company-name, city) Manages (person-name, manager-name) Write the following queries in relational algebra (40 pts): a.) Find the names of all employees who work for Auburn Bank. Page 2 of 2 b.) Find the names and cities of residence of all employees who work for Auburn Bank. c.) Find the names, street address, and cities of residence of all employees who work for Auburn Bank and earn more than $50,000 per year. d.) Find the names of all employees in this database who live in the same city as the company for which they work.
The goal of this assignment is to design a conceptual schema using the ER Data model. ER data model Design a schema that incorporates the specification described below as efficiently as possible. You should submit a written diagram of your schema design using the notation given in the class and/or the textbook. In this diagram, indicate all the classes, subclasses, relationships (weak & strong), attributes, and primary keys. If you feel you need to make additional assumptions, state them clearly on a separate page. Application Domain Introduction Geotechnical information on soil deposits is critical for our civil infrastructure. Local, state and federal agencies, universities, and companies need this information for a variety of civil engineering and urban policies applications, including land usage and development, and mapping of natural hazards such as soil liquefaction and earthquake ground motions. Foremost examples of geotechnical information, geotechnical boreholes are vertical holes drilled in the ground for the purpose of obtaining samples of soil and rock materials and determining the stratigraphy, groundwater conditions and/or engineering soil properties. In spite of rather costly drilling operations, boreholes remain the most popular and economical means to obtain subsurface information. These type of data range from basic borehole logs containing a visual inspection report of soil cuttings to sophisticated composite boreholes combining visual inspection and in-situ, laboratory geotechnical and geophysical tests. Design Specification The following is a description of the information required for a database system that manages data for Alabama Department of Transportation (ALDOT, http://www.dot.state.al.us/). We will refer to this as the ALDOT Database. ALDOT is in charge of many geotechnical projects around Page 2 of 3 the State of Alabama. Geotechnical project data is the primary components of the ALDOT Database. A project has a project ID for identifying these projects, a brief title for describing the project, a project start date, a project end date, and a project creation date. Usually, a project contains several different experiments and the experiment data represents a particular experiment that is part of a project. It includes information that applies to the experiment as a whole, but may vary from one experiment to another. Zip code regions are widely used in civil engineering for identifying locations as well as recording information about the population. Each zip code boundary is defined by a 2-D polygon area. Each experiment is conducted in a specific area, which is usually represented by zip code region(s). Each experiment also has an experiment ID for identifying the experiment, a brief title for describing the experiment, an experiment start date, an experiment end date, and the organization that conducted the experiment, which includes its name and address (street, city, state, and zip code). ALDOT hires engineers in several different specialties: civil engineering, computer science, and geoscience. These scientists and engineers’ personnel data are also managed by the ALDOT Database and the personnel data include information on any individual who participates in an ALDOT project. Note that each person will have at least one role (principal-investigator, project manager, assistant, and field engineer) with respect to each project he/she participates in. The people who are involved in projects have three different professional titles: civil engineer, field operator, and computer specialist. The information about a civil engineer’s field of expertise, a field operator’s experience (in years), and a computer specialist’s computer skills are also included in the personnel data. A person has a social security number, a last name (surname of this person), a middle name, a first name, an address (street, city, state, and zip code), a phone number, date of joining the project, and an email. Employees of ALDOT ordinarily carry out several missions for an experiment. The mission data represents a particular configuration or steps used in conducting some portions of an experiment. There may be multiple missions for a given experiment. A mission has a mission ID for identifying this mission in a specific experiment, a brief title, a mission start date, a mission end date, and a number of ALDOT employees who work on the mission. ALDOT owns various engineering vehicles and construction equipment. The usages and status of all the vehicles and equipment are controlled by the ALDOT Database. The database tracks machines such as excavators, bulldozers, cranes, centrifuges, and drills, which are used for accomplishing the experiments. Excavators have different bucket sizes, bulldozers have different horsepower, cranes have different boom lengths, centrifuges have different motor power, and drills have different diameters. Vehicle(s) and equipment are associated with the particular mission(s) where they are used. Each vehicle has a vehicle identification number (VIN), a price, and an administrator. Equipment has an ID for identifying each equipment, a name, and a short description of the equipment functionality. For each mission a collection of boreholes are drilled. A borehole has a borehole ID that is unique among all the boreholes of one experiment, the depth of the borehole, the location of the borehole, a drill date, and a testing type (CPT, SPT, etc.). For each borehole the database also records the soil layer data (encountered while drilling the borehole); each layer information contains a unique layer ID, start depth, end depth, and soil type. Page 3 of 3 ALDOT finances its operations through commercial banks. The banking system provides two kinds of accounts for ALDOT: personal accounts and business accounts. Each bank has a unique routing number, name, address and several phone numbers. Each account has a unique account number, balance and owner name. Each person working in ALDOT has one or more personal accounts. Banks also provide funding for projects through ALDOT business accounts. Each project may receive several funding transactions (called funds), each fund has the amount information, start date, and end date. Submission Guidelines You may submit a hard copy of your ER diagram at the beginning of the class on the due day or submit an electronic copy through Canvas by 11:50 pm of the due day. Your submission should also include reasonable assumptions, your name, and your student ID number. Reminder: This homework must be solved individually (no group work). If you are unclear about what is allowed and what is not, check the class website, the university guidelines or ask the instructor/TA.
Problem 1 Concepts (20 points) (1) Current hard disks typically allow at most one disk head to read or write at any one time. True False (2) Updates of sorted files are much faster than updates of indexed files. True False (3) A search key of an index is the same as a candidate key (minimal set of fields that uniquely identify a record in a relation). True False (4) For data entry k* in index, alternative 1 (data record with key value k) implies clustered index and in practice, clustered index also implies alternative 1. True False (5) A RAID Level 5 system has the best performance of all RAID levels with redundancy for small and large read and large write requests. True False (6) We can create several clustered indexes on a file without replicating data records. True False (7) It is the responsibility of the buffer manager to pin and unpin a page. True False (8) For ISAM (Indexed Sequential Access Method) tree indexes, any sequence of inserts or deletes of a set of data entries will result in the same tree. True False (9) Based on the assumptions of our textbook, in order to insert a record into a heap file with an unclustered hash index, we have to finish the whole process at a cost of 2D (D – the average time to read or write a disk page). True False (10) In a bulk loading B+ tree, all the leaf nodes are stored sequentially and linked with pointers. True False Page 3 of 9 Problem 2 SQL (20 points) Consider the following relations: Student(snum: integer, sname: string, major: string, level: string, age: integer) Class(cname: string, meets_at: string, room: string, fid: integer) Enrolled(snum: integer, cname: string) Faculty(fid: integer, fname: string, deptid: integer) The meaning of these relations is straightforward; for example, Enrolled has one record per student-class pair such that the student is enrolled in the class. Write the following queries in SQL. No duplicatesshould be printed in any of the answers. 1. For each level, print the level and the average age of students for that level. 2. Find the names of all Seniors (level = SR) who are enrolled in a class taught by John Doe. Page 4 of 9 3. Find the names of all classes that either meet in room Shelby 1120 or have three or more students enrolled. 4. Find the names of faculty members for whom the combined enrollment of the courses that they teach is less than ten. Page 5 of 9 Problem 3 B+ Tree (25 points) Consider the following B+ tree indexes of order d = 2. 1. Show the tree that would result from deleting a data entry with key 25 from the following B+ tree (follow the algorithm in the textbook). (11 pt) Page 6 of 9 2. Show the tree that would result from inserting a data entry with key 25 into the following B+ tree (follow the algorithm in the textbook). (14 pt) Page 7 of 9 Problem 4 Storage and Indexing (35 points, 7 points each) 1. Briefly explain why we need the two techniques, data striping and redundancy, in RAID (Redundant Arrays of Independent Disks) systems. 2. What is the role of the buffer manager in a DBMS? Explain what the buffer manager must do to process a read request for a page. Page 8 of 9 3. Explain the index-only evaluation technique with an SQL query sentence as an example. 4. Explain the difference between an ISAM (Indexed Sequential Access Method) tree and a B+ tree. Page 9 of 9 5. What is the difference between a clustered index and an unclustered index? If an index contains data records as data entries, can it be unclustered?
Goals: • To design a use case diagram to capture the requirements of project 4. • To use the argoUML tool to create a use case diagram and specify use cases. 1. Overview Write a few important use cases. Remember, these use cases describe how the user interacts with the text-based game (what they do, what the system does in response, etc.). Your use cases should have enough basic details such that someone unfamiliar with the system can understand what is happening in the text-based game. They should not include internal technical details that users are not (and should not) aware of. Make sure that any special rules/features you plan to add are clearly described in your analysis section. 2.1. Create a Use Case Diagram using argoUML In this project, you must use ArgoUML to draw a use case diagram. When you create a new project, it has a use case diagram created by default, named use case diagram 1. 2.2. Create Use Case Specification in argoUML You must also use argoUML to document the behavior of each use case in your use case diagram. The specification of a use case should be described in the Documentation tab of the use case. The specification of each use case should contain the following items: • Name. The name of the use case to which this relates. • Goal. A one- or two-line summary of what this use case achieves for its actors. • Actors. The actors involved in this use case, and any context regarding their involvement. Note: This should not be a description of the actor. That should be associated with the actor on the use case diagram. • Pre-condition. These would be better named “pre-assumptions”, but the term used everywhere is pre-conditions. This is a statement of any simplifying assumptions we can make at the start of the use case. • Basic Flow. The linear sequence of steps that describe the behavior of the use case in the “normal” scenario. Where a use case has a number of scenarios that could be normal, one is arbitrarily selected. • Alternate Flows. A series of linear sequences describing each of the alternative behaviors to the basic flow. • Post-conditions. These would be better named “post-assumptions”. This is a statement of any assumptions that we can make at the end of the use case. • Requirements. In an ideal world, all of the vision document, use case diagrams, use case specifications and supplementary requirements Project 4 – page 2 specification would form the requirements for a project. 2. Grading Criteria 2.1 (40 points) Use case diagram 1. (10 points) Actors 2. (10 points) Use cases in the diagram. 3. (10 points) Relations among actors and use cases. 4. (10 points) Relations among use cases. 2.2 (50 points) Use case specification 1. (10 points) Name, goal, actors in each use case 2. (20 points) Pre-condition/post-condition in each use case 3. (20 points) Basic Flows/Alternate Flows in each use case 2.3 (10 points) Submission Please submit your project analysis through the Canvas system (e-mail submission will not be accepted). You just need to submit your analysis document as an ArgoUML compressed project file (*.zargo). The file name should be formatted as: Project#/Quiz#/ Exam#_AUID_Firstname_Lastname. Note: other format (e.g., pdf, doc, txt) will not be accepted. 3. No Late Submission • Late submissions will not be accepted and will result in ZERO without valid excuses, in which case you should talk to Dr. Li to explain your situation. • GTA/Instructor will NOT accept any late submission caused by Internet latency. 4. Rebuttal period • You will be given a period of two business days to read and respond to the comments and grades of your homework or project assignment. The TA may use this opportunity to address any concern and question you have. The TA also may ask for additional information from you regarding your homework or project. Project 4 – page 3 5. Sample Usage What’s your name? Bob =========================================================== | Welcome, Bob! | =========================================================== 1) Start a New Game of Dunstan and Dragons! 2) View top 10 High Scores 3) Quit Please choose an option: 2 The top 5 High Scores are: Win 1337 CaseyZZZ 625 JonnieKill 400 Bob 75 Daisy 33 -no more scores to show1) Start a New Game of Dunstan and Dragons! 2) View top 10 High Scores 3) Quit Please choose an option: 1 Entering the Dungeon… You have: intelligence: 20 time: 25 money: $11.00 You are 20 steps from the goal. Time left: 25. 1) Move forward(takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 4 Project 4 – page 4 You have: intelligence: 20 time: 25 money: $11.00 You are 20 steps from the goal. Time left: 25. 1) Move forward(takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 2 You read through some technical papers. You gain 3 intelligence, but lose 2 units of time. You are 20 steps from the goal. Time left: 23. 1) Move forward (takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 1 You move forward one step, and… NOTHING HAPPENS! You spent one unit of time. You are 19 steps from the goal. Time left: 22. 1) Move forward (takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 1 You move forward one step, and… Project 4 – page 5 YOU FIND SOME PAPERS TO GRADE. You spent two units of time, but gained $3.00! You are 18 steps from the goal. Time left: 20. You can move forward or backward. 1) Move forward(takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 1 You move forward one step, and… PUZZLE: It’s a riddling imp. I hate riddling imps. But fine, he asks: “Find the product of 8 and 8!” 1) 16 2) 64 3) 256 4) Uh…uh… no? Choose wisely: 4 The imp cackles “Oh yes. Yes indeed. Now you die.” TIME HAS FALLEN TO ZERO. YOU DIE.
Goals: • To understand how important the to-do list before designing a use case diagram. • To figure out the logical structure in a use case diagram.1. Background A customer wants a cool new video game, similar in style to popular video games like World of Warcraft or Bioshock. However, the customer recognizes the lack of significant funds and only has a resource poor computer to play this game on. Hence, the game will be a simple text-based adventure concerning a graduate student trying to navigate his way down Shelby Center. In this project, you will help the customer to design and implement a simple text-based game. 2. Requirement Details 2.1. Player The “player” is represented by at least three attributes: intelligence, time, and money. If the player runs out of intelligence, time or money, the player dies. The goal of the player is to survive to the end of the “hall” with the highest combined total of the attributes as possible. “Score” is determined as the three attributes multiplied together. 2.2. The Hall The player starts the game at the beginning of a hall, which is linear. The “Hall” is a path that is at least twenty (20) steps long. After a move, the user should be told how far away from the goal they are (in steps). If the player survives to the goal square without any of the attributes falling to 0, they win. Their score should be displayed with a simple ASCII victory message. If the player dies, a “You Lose” message should appear indicating the cause of death (for example, if money falls to zero, you can say that the player starved to death because of poverty). All the attributes should start in some random range (e.g. 8-25). 2.3. Turns Every turn, the player has (at least) 5 options to choose from: • Move: The player moves one step in the grid, but risks an Encounter or a Puzzle. Moving also takes time. • Read technical papers (boost intelligence): The player loses a fixed amount of time, but increases intelligence by a random amount • Search for loose change (boost money): The player loses a fixed amount of time, but increases money by a random amount. • View character: A simple display should show the character attributes and current position in the hall (ASCII is fine) • Quit the game (shows the “You Lose” screen – optional mockery- and exits the program) Project 3 – page 3 2.4. Encounter Encounters: Every time the character steps, there is a random change of various events happening. You are free to change the probabilities as you see fit for “game balance,” but here are some suggestions: • 25% chance: nothing happens, you just move forward. • 30% chance: You encounter a Puzzle (see Puzzle below) • 10% chance: Encounter a professor. This loses a random extra amount of time, but may slightly increase intelligence. • 10% chance: Encounter another graduate student. This loses a random amount of time. • 15% chance: Attacked by grunt work! Lose both time and intelligence. • 10% chance: Grade papers. Lose time, but gain money. • 0% chance: Get a huge raise, gain lots of money! (This never happens). 2.5. Puzzles Puzzles: Puzzles are different from normal encounters since they require interaction from the user. These don’t necessarily have to be brilliant, but riddles or even edutainment light puzzles are fine. Examples: • “What is 2 + 2:” For a correct response, Money + 1. For an incorrect response, Money -20 (you idiot). • “What can you put in a barrel to make it lighter?” For a correct response, int+2. For an incorrect response, int-2. 2.6. Other options You are free to add more details and rules to your game, but you must have at least the above specifications. Feel free to be creative – there are many opportunities to do so. 3. No Late Submission • Late submissions will not be accepted and will result in ZERO without valid excuses, in which case you should talk to Dr. Li to explain your situation. • GTA/Instructor will NOT accept any late submission caused by Internet latency. 4. Rebuttal period • You will be given a period of two business days to read and respond to the comments and grades of your homework or project assignment. The TA may use this opportunity to address any concern and question you have. The TA also may ask for additional information from you regarding your homework or project. 5. Rubrics Project 3 – page 4 5.1 Based on the game description above, you need the following information to design a use case diagram (5 points each): 1. Project broken down into multiple small functionalities. 2. Identify the goal and priority. 3. Functionality Scope. 4. Identify relationship and association. 5. Identify Extension and Inclusion of use cases. 6. Identify Multiplicity 7. Naming Use Case and actors 8. Important note points 5.2 Documents (20 points each list) Also, please submit three lists in a single PDF file: a) List of Actors of the Project; b) List of Use Cases/Activities; and c) List of System (Functionality list). 5.3 Submission Please submit your project preparation work as a .pdf file Project3_ firstname_lastname.pdf Project 3 – page 5 Sample Usage What’s your name? Bob =========================================================== | Welcome, Bob! | =========================================================== 1) Start a New Game of Dunstan and Dragons! 2) View top 10 High Scores 3) Quit Please choose an option: 2 The top 5 High Scores are: Win 1337 CaseyZZZ 625 JonnieKill 400 Bob 75 Daisy 33 -no more scores to show1) Start a New Game of Dunstan and Dragons! 2) View top 10 High Scores 3) Quit Please choose an option: 1 Entering the Dungeon… You have: intelligence: 20 time: 25 money: $11.00 You are 20 steps from the goal. Time left: 25. 1) Move forward(takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Project 3 – page 6 Please choose an action: 4 You have: intelligence: 20 time: 25 money: $11.00 You are 20 steps from the goal. Time left: 25. 1) Move forward(takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 2 You read through some technical papers. You gain 3 intelligence, but lose 2 units of time. You are 20 steps from the goal. Time left: 23. 1) Move forward (takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 1 You move forward one step, and… NOTHING HAPPENS! You spent one unit of time. You are 19 steps from the goal. Time left: 22. 1) Move forward (takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 1 You move forward one step, and… YOU FIND SOME PAPERS TO GRADE. Project 3 – page 7 You spent two units of time, but gained $3.00! You are 18 steps from the goal. Time left: 20. You can move forward or backward. 1) Move forward(takes time, could be risky…) 2) Read technical papers (boost intelligence, takes time) 3) Search for loose change (boost money, takes time) 4) View character 5) Quit the game Please choose an action: 1 You move forward one step, and… PUZZLE: It’s a riddling imp. I hate riddling imps. But fine, he asks: “Find the product of 8 and 8!” 1) 16 2) 64 3) 256 4) Uh…uh… no? Choose wisely: 4 The imp cackles “Oh yes. Yes indeed. Now you die.” TIME HAS FALLEN TO ZERO. YOU DIE.
1. Building your Linux disk monitoring tool – auDiskTool In this project, you have a Linux tool called – auDiskTool – to monitor disk performance (i.e., I/O transfer rates) in a Linux system. Your auDiskTool can output a file containing a list of reports (attached at the end) that help system administrators in configuring the Linux system to achieve good disk I/O performance. The monitoring reports created and maintained by your auDiskTool offers statistical information on I/O transfer rates. Our long-term goal is building a tool to monitor both processor and disk performance of Linux systems. However, the short-term goal to be achieved in this project is to generate a state diagram. After you complete this project, you may extend your auDiskTool to monitor processor performance in addition to disk performance. 2. Requirements 2.1. Statistical Information Each report item recorded in an output file contains statistics on a per disk or partition basis. auDiskTool allows users to choose a particular disk or partition to monitor. If no disk nor partition is chosen by the users, then auDiskTool monitors all disks used by the Linux system. Each report item may show the following information: Device: This column provides the disk or partition name (e.g., sda). If your Linux machine has new kernels, the device name listed in the /dev directory is displayed. If your Linux kernel is 2.4, the format of the names is devm-n, where m is the major number of the device, and n is a distinctive number. If your Linux kernel is 2.2 (this is an uncommon case), the name will be displayed as hdiskn. Blk_read: This column reports the total number of reads. Blk_read/s: This column indicates the number of reads from the disk per second. KB_read/s: This column indicates the amount of data blocks read from the disk per second (i.e., measured in Kilobytes per second). Note: we assume the size of sector is 1 KB. Blk_wrtn: This column reports the total number of blocks written. KB_wrtn/s: This column indicates the number of data blocks written to the disk per second. (i.e., measured in Kilobytes per second.) Blk_wrtn/s: This column indicates the number of data blocks written to the disk per second. 2.2. Configure Time Interval and Count Your auDiskTool can set an interval parameter, which specifies the amount of time measured in seconds between two consecutive disk reports. Each disk report contains statistics collected during the interval since the previous report is recorded. Your auDiskTool also can decide the number of reports to be collected in an output file. If the interval parameter and the count parameter are not specified, auDiskTool may use default values. For example, the default value of the interval parameters is 1 second; the default value of the count parameter is 10. 2.3. Specify the File Name of an Output Report Your auDiskTool should allow users to specify the file name of an output report. The file name may be fully specified with a path. If no path is provided, then the working directory will be the current directory where the new report file is created. If the output file exists, then new report items will be appended at the end of the existing file. Note 1: If users specify a file name that does exist, auDiskTool must inform users that an output report file with the same name exists and reported items will be added to the existing file. Note 2: If the report file name is not specified, then “report.adt” will be used as a default output file name. 2.4. Specify what statistical data to be reported In section 3.1, we list 7 statistical data items. Your tool should allow users to decide what data items to be included in a report. A configuration file (see Section 3.5 blow) stores default values for these decisions. 2.5. A Configuration File – audisktool.conf All the default parameters (e.g., time interval, count, output file name) are stored in a configuration file. This configuration file is loaded into main memory when auDiskTool starts its execution. The configuration file name is “audisktool.conf”. The format of the configuration file is: Interval, count, print_blk_read, print_blk_read/s, print_kb_read/s, print_blk_write, print_blk_write/s, print_kb_write/s The values of print_blk_read, print_blk_read/s, print_kb_read/s, print_blk_write, print_blk_write/s, print_kb_write/s can be either ‘1’ or ’0’. ‘1’ means that the value will be reported; ‘0’ means the value is ignored in the report. For example, suppose we have the following configuration file: 5 10 1 1 1 0 0 0 The above file indicates that Interval is 5 seconds, count is 10, report values of blk_read, blk_read/s, kb_read/s and do not include the values of blk_write, blk_write/s, kb_write/s in the report. You do not need to submit this configuration file via Canvas; the TA will use the configuration file with the following parameters to test your implementation: 1 10 1 1 1 1 1 1 2.6. Display the report Users are allowed to open the report and display monitoring records inside audisktool. If the report file does not exist or cannot be opened, audisktool must show a warning message. 2.7. System Quit This should safely terminate the audiskTool. If any parameter (e.g., time interval and count) is updated, the system parameters must be saved back to the configuration file called “audisktool.conf”. 3. Retrieve Disk Statistics The Linux (version 2.4.20 and above) operating system offers extensive disk statistics to measure disk activities. You can use the following command to check the version of your Linux: $uname -r Linux Version 2.6 and above: The disk statistical information can be found in /proc/diskstats You can use the following command to display this file: $cat /proc/diskstats Example 1: Below is an example to show the format of the above file: 3 0 sda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160 3 1 sda1 2 0 4 24 0 0 0 0 0 24 24 You also can use the following command to display the information related to disks in the /proc/diskstats file. $grep ‘sda’ /proc/diskstats Note that grep is a utility program for searching plain-text data sets for lines matching a regular expression (e.g., ‘sda’ in our case). 4. Format of “/proc/diskstats” In example 1 shown on page 4, you can find each row has 14 items. The first three items are the major and minor device numbers, and device name. For example, given the following row: 3 1 sda1 2 0 4 24 0 0 0 0 0 24 24 The major device number is 3, and minor device number is 1, and device name is sda1. The 11 fields listed after the device name are statistics data of the device whose major/minor device numbers as well as name are shown in the first three fields. All these 11 fields except field 9 are cumulative since boot. Note that field 9 goes to zero as I/Os complete; all others only increase. These fields are unsigned long numbers. The 11 fields are explained below: Field 1: # of reads completed. This is the total number of reads completed successfully. Field 2/6: # of reads/writes merged. Reads and writes which are adjacent to each other may be merged for efficiency. Thus two 4K reads may become one 8K read before it is ultimately handed to the disk, and so it will be counted (and queued) as only one I/O. This field lets you know how often this was done. Field 3: # of sectors read. This is the total number of sectors read successfully. Field 4: # of milliseconds spent reading. This is the total number of milliseconds spent by all reads (as measured from make_request() to end_that_request_last()). Field 5: # of writes completed. This is the total number of writes completed successfully. Field 7: # of sectors written. This is the total number of sectors written successfully. Field 8: # of milliseconds spent writing. This is the total number of milliseconds spent by all writes (as measured from make_request() to end_that_request_last()). Field 9: # of I/Os currently in progress. The only field that should go to zero. Incremented as requests are given to appropriate struct request_queue and decremented as they finish. Field 10: # of milliseconds spent doing I/Os. This field increases so long as field 9 is nonzero. Field 11: weighted # of milliseconds spent doing I/Os. This field is incremented at each I/O start, I/O completion, I/O merge, or read of these stats by the number of I/Os in progress (field 9) times the number of milliseconds spent doing I/O since the last update of this field. This can provide an easy measure of both I/O completion time and the backlog that may be accumulating. 5. Rubrics: 1. Please list class names. (18 points) 2. Use argoUML to create a state chart diagram. (10 points) 3. For each class, please add proper info – at least one attribute and operation. If no ops, please indicate N/A– for software engineers to define them. (16 points) 4. Please indicate the relationship among classes with proper types of lines. (20 points) 5. Please indicate the conditions for transitions. (16 points) 6. Please specify the multiplicity. (15 points) 7. Submit your solution on Canvas with the format: project2_firstname.zargo. (5 points) 6. No Late Submission • Late submissions will not be accepted and will result in a ZERO without valid excuses, in which case you should talk to Dr. Li to explain your situation. • GTA/Instructor will NOT accept any late submission caused by Internet latency. 7. Rebuttal period • You will be given a period of two business days to read and respond to the comments and grades of your homework or project assignment. The TA may use this opportunity to address any concern and question you have. The TA also may ask for additional information from you regarding your homework or project. Sample Usage 1. Help $./audisktool_aak0010 help run – run the monitoring tool. set interval [value] – set sampling period to [value] set count [value] – set the number of records to [value] set report [name] – set report file name to [name] set blk_read [0|1] – set print_blk_read to 0 or 1 set blk_read/s [0|1] – set print_blk_read/s to 0 or 1 set kb_read/s [0|1] – set print_kb_read/s to 0 or 1 set blk_write [0|1] – set print_blk_write to 0 or 1 set blk_write/s [0|1] – set print_blk_write/s to 0 or 1 set kb_write [0|1] – set print_kb_write to 0 or 1 print conf – display all the parameters print report – open and display the report file save – the configuration file audisktool.conf is updated display – exit – exit the tool. > 2. Run the tool $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >run Monitoring time = 5 Seconds, Number of records = 10, print_blk_read = 1, print_blk_read/s = 1, print_kb_read/s = 1, print_blk_write = 0, print_blk_write/s = 0, print_kb_write/s = 0, report file name = ‘report.adt’ Please wait … A file “report.adt” is updated. > 3. Change report file name $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >set report aak0010.adt The report file name is changed from ‘report.adt’ to ‘aak0010.adt’. You can now type ‘run’ to generate new records to be saved in ‘aak0010.adt’. Note: ‘report.adt’ will not be deleted by audisktool. > 4. Display records in the report file 4.1 no record is found $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >print report No record found in ‘report.adt’ > 4.2 Records are found $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >print report 2 records found in ‘report.adt’ blk_read blk_read/s kb_read/s blk_write blk_write/s kb_write/s 10 1.13 2.26 N/A N/A N/A 55 > 5.4 10.8 N/A N/A N/A 5. Change count and interval $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >set count 15 The number of records generated in each run has been changed to 15. >set interval 3 The sampling interval has been changed to 3 seconds. > 6. Change print_blk_read and other similar parameters $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >set blk_read 0 Print_blk_read has been changed to 0. >set blk_write 0 Print_blk_write was 0; the parameter remains unchanged to 0. > 7. Display all the parameters. $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >print conf Monitoring time = 5 Seconds, Number of records = 10, print_blk_read = 1, print_blk_read/s = 1, print_kb_read/s = 1, print_blk_write = 0, print_blk_write/s = 0, print_kb_write/s = 0, report file name = ‘report.adt’ > 8. Save the configuration file 8.1 no need to save $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >save audisktool.conf has not been updated. There is no need to save the file. 8.2 change can save the configuration file $./audisktool_aak0010 auDiskTool, version 1.0.0. Type ‘help’ to find more about commands. >set blk_read 0 Print_blk_read has been changed to 0. >save file audisktool.conf has been updated. >
There are several “categories” if students plan to borrow books from our library. For example, Li would like to improve teaching skills, therefore learning skills from the history is the best solution to guide Li in the future. Surprisingly, a textbook “Aviation Instructor’s Handbook” is saved in the library. Li needs several steps to “grab” this book. For example, type the textbook name on AU library website to see where it is i.e. which collection area, which floor, which shelf (alphabetically). Unfortunately, currently the library app and website are under maintenance. They are not open to users until a clear class diagram is presented by software developers. What can you do to accelerate it? Based on the description above, please draw a class diagram via argoUML and submit .zargo file in Canvas. Notes: As a designer, think about 1) what info users need; 2)how software provides info; 3)from perspective of software design, what additional info you need i.e. what if users don’t return books on time. In your class diagram, not only provide all necessary class boxes, but also indicate relationships. Rubrics: 1.Please list class names. (18 points) 2.For each class, please add proper info – at least one attribute and operation. If no ops, please indicate N/A– for software engineers to define Project 1 – page 2 them. (37 points) 3.Please indicate the relationship among classes with proper types of lines. (20 points) 4.Please specify the multiplicity. (20 points) 5. Submit your solution on Canvas with the format: project1_firstname.zargo. (5 points) NO Late Submission: • Late submissions will not be accepted and will result in a ZERO without valid excuses, in which case you should talk to Dr. Li to explain your situation. • GTA/Instructor will NOT accept any late submission caused by Internet latency. Rebuttal period: • You will be given a period of two business days to read and respond to the comments and grades of your homework or project assignment. The TA may use this opportunity to address any concern and question you have. The TA also may ask for additional information from you regarding your projects.
The objective of this assignment is to explore an application of some aspect of linear algebra, preferably something you have studied in class. The application area should be of personal interest to you. There is a numerical computation requirement. Something beyond utilitarian topics is required (e.g not simply Gaussian elimination or solving linear systems). Motivation behind this course component is that there is not sufficient time to present many detailed applications in class, and also it’s not practical to grade numerical computation skills on exams. The assignment is to be done in groups of TWO. The amount of effort put into the assignment should be commensurate with the 20% weighting in the course grade. A good assignment will: • select an interesting1 and well-defined application area (engineering, economics, biology, science, numerical methods etc.) • describe meaningful analytical use of linear algebra • demonstrate computational skill with Matlab in some (non-trivial) aspect(s) of SYDE 312 course material • present the work in a report with careful attention given to prescribed content and format The report is to be submitted on or before the last day of classes: Tuesday 5 April 2022. Submission is in the form of a printed copy: (a) given to me at the last lecture, or (b) given to Sandy (department secretary) at the front office, or (c) under my office door E3-3158. The topic The most suitable topics are related to the latter part of the course, which covers the more advanced ideas starting with unit II and leads to a vast array of applications. Some of the 1This applies to your level of interest, not necessarily mine. Try to come across as inspired rather than going through the motions. So choose the topic wisely. Linear algebra hides in many disguises in things you are already familiar with. best application topics (e.g. SVD) unfortunately aren’t in class work until the last lectures. If you’re stuck I recommend looking at the textbook lists of applied projects as a starting point to get ideas. Check it out. There are numerous applications described there. Also try google with ‘linear algebra applications + a subject area’! The report The deliverable is a report in TWO-COLUMN format with a mandatory maximum of four pages. The title, your name and ID, and a short abstract should appear on the first page, formatted across both columns. The paper will include the following numbered sections after the abstract: 1) Background and discussion of the application area. 2) Use of linear algebra in the analysis. 3) Presentation of a specific case study or problem (real or invented). 4) Mathematical solution and selection of appropriate numerical technique(s). 5) Analysis and interpretation of the results in the context of the application. 6) Conclusions etc Any references you feel are appropriate can be provided as footnotes, but these should be minimal. There is no cover page. No computer code is to be included in the report. No appendices are permitted. Any figures, such as plots, should be selected with discrimination, and must be clear and legible. There are LATEX templates floating around that will do the formatting appropriately. Given the page limit the report is clearly intended to be illustrative and not comprehensive. This does NOT mean the work should be casual or superficial. The expectation is for a concise presentation of a significant application of linear algebra, and an original computational example to demonstrate. Your grade will reflect the quality of the report and how well you have condensed your work without trivializing it. Think carefully about every sentence you include. There is no room for padding in a 4-page report.
JavaScript can be used to add functionality to web pages. It is a very powerful, versatile, and quirky language. For this assignment, you are to create three small JavaScript programs which interact with web pages. Your project files should have (a least) three different HTML files, one for each part. It should also have (at least) three different JavaScript files, one for each part. Finally, you should have only one CSS file to provide styling for it all. Each file should have header comment block. Project 2a – Saying Hello Create a webpage that, when visited, displays a prompt asking for the user’s name. If they don’t input a name, keep asking for one. In a second prompt, ask for the user’s age. If they don’t input a valid value (let’s say valid ages are 0 – 150 inclusive), keep asking for one. Finally, display an alert with the text, ‘Hello [NAME]. You are [AGE] years old!’ with the correct values filled in. Project 2b – Calculator On this page, you are to create a very basic calculator. This should consist of two textboxes for input values, and somewhere to display the result on the page. There should also be buttons for the following mathematical operations: +, -, *, /, ^ (exponent), and square root. The calculator should work like this: user enters two values in to the text boxes, user clicks an operator, and then the result displays somewhere on the page. The square root operator only needs the value from the first input. Finally, somewhere on the page display the ‘history’ of the calculator (previously done operations with values, operator, and result). Project 2c – Mouse Chase Create a page with a nice, shiny, pretty button on it. The button should say ‘Click to receive $10!’ However, any time the user’s mouse cursor gets over the button, the button should move to a random location on the page, making the button essentially impossible to click! However, if they do somehow manage to click the button, redirect the user to an image of a 10 trillion Zimbabwe Dollar. Extra credit: Finally, the background color of the page should change every 3 seconds to a random color. Styling All of your pages should use the same CSS file to provide styling. You should use at least one tag selector, one class selector, one ID selector, and one context selector. Other than that, you’re free to style your pages however you would like! While I don’t have any exact criteria you must meet for styling, try to make your pages visually appealing to earn full points.Deploying your website Similar to project one, you should deploy your website on Thor. Your project two should be in the /p2/ directory. Add a link to your already existing landing page to your project 2. You can either link directly to each page separately from your landing page, or you can make a simple home page for project 2 where the user can select which of the three pages to go to. While you’re at it, why not give your landing page some styling updates with your newly learned CSS skills? Useful Tools JavaScript is an actual programming language, unlike HTML and CSS. Using a more powerful and intelligent editor may help you out greatly. Refer back to project one for a list of recommendations. Additionally, learn how to use the JavaScript console in your browser. In most browsers you can press F12 and then click “Console” to bring up the JavaScript console. This will make debugging JavaScript errors much easier! Grading The assignment is worth 150 points. The grading is broken down in the following categories: Part a functionality and code quality 30 Part b functionality and code quality 40 Part c functionality and code quality 40 Site has visually appealing styling 20 Properly deployed to web server 10 Header comment block 10 Total 150 Submitting your work Your work must be submitted as an archive (.zip, .7z, .rar) on Moodle. This archive should include your HTML, JS, CSS, and any other necessary files for your website. Late submissions are allowed up to a week after the due date, but have an automatic 20% penalty. This project is due Friday, October 18 at 11:59pm.
PHP is a programming language commonly used to create dynamic websites. Your task is to create a basic website that requires the user to log in using a username and password. Once logged in, there are various pages with various features. Finally, it should have a way to logout. Pages Your project will be a website made of various pages as described below. All of these pages require that the user be logged in. If the user attempts to visit any of the pages and isn’t logged in, you should redirect them to the login page. Your project should contain at minimum the following pages. Every page (except for the login and menu page) should have a link back to the main menu page. index.php The main page. This page should act as a menu, displaying links to the other pages below as well as a link to the logout page. If the user isn’t logged in, this page should redirect them to the login page. repeater.php This page should display a sentence of your choosing a random number of times (lets limit to 500 repeats max). The number of times the sentence appears should be different every time they visit this page. Requires login. date.php This page should display the current date and time in MM/DD/YYYY HH:MM AM/PM format. So, for example, a properly formatted date would be 10/31/2017 5:37 PM or 12/25/2017 11:32 AM. Requires login. rps.php This page should recreate the game of “Rock Paper Scissors”. It should have three radio buttons on it: “Rock”, “Paper”, and “Scissors” and also a “Submit” button. When the user clicks submit, it should submit their choice to playrps.php. Requires login. playrps.php This file should randomly pick “Rock”, “Paper”, or “Scissors” as the CPUs selection. It should then display the results of the game. Example output would be “Player: Rock. CPU: Scissors. Result: Win”. If the user didn’t submit a choice, an error should be displayed. Requires login. logout.php This page logs the user out and then redirects them to login.php login.php This is where the user may sign in using a username and password. If they’re already logged in, redirect them to index.php. If they type in the incorrect username/password combo, display the login page again with an error message.Creating a custom class PHP is an object oriented language. This means that you may write a class once but use it on multiple pages. Your task is to create class named UserAuthenticator in a file named ‘UserAuthenticator.php’. This class should contain all of the logic for logging in, logging out, and other features. Each of your pages should then use this UserAuthenticator class for their login related logic. Below is a UML class diagram which describes the UserAuthenticator class. It MUST have all the methods as described in the UML class diagram. You can add more methods if you’d like. UserAuthenticator +isLoggedIn() : Boolean +authenticate(username : String, password : String) : Boolean +logUserIn(username: String) +logout() +redirectToLogin() isLoggedIn: Returns true or false depending upon if the user is logged in. authenticate: Accepts a username and password. If they’re correct, returns true. False otherwise. logUserIn: Creates the session record necessary to log the user in. logout: Logs the current user out. redirectToLogin: Redirects them to the login page. Hints • You can use the $_SESSION to keep track of the currently logged in user. • PHP has a really useful class, DateTime, for getting and formatting times. You should use this. • You may want to have constants in your UserAuthenticator for the correct username and password. • You can redirect the user to another page by sending special headers using the header() method in PHP. You must call this method BEFORE printing any output or else it won’t work. • Ask me for help! Seriously, people have crashed and burned on this project before and then were amazed at how simple it could be once they asked for help.Suggestions Below are some suggestions to help keep your project organized • Separate your logic from the pages. For example, all the logic could be in one PHP file, and then all the HTML and page layout could be in a separate PHP file that the first one includes. • If you have a common menu on all pages, you could put that in a separate file and include it on all pages. • Use a single CSS file to style the entire website. If you’re wanting a challenge, make your website responsive, meaning it will displays well on small phone screens. Deploying to Thor Your project three should be uploaded to the CS web server, Thor, in a ‘p3’ directory. You should update the landing page of your website with a link to project 3. It is important to make sure your site works on Thor! This is where I will be testing them, so if your site isn’t at the correct path and doesn’t work on Thor, you will lose points! Grading The assignment is worth 150 points. You will be graded on the following criteria: Has all required pages 10 UserAuthenticator class meets requirements 20 Log in, log out, and redirect work 25 Repeater page works as described 15 Date page works as described 15 Rock/paper/scissors page works as described 25 Site works on Thor 10 Website is styled with CSS and looks appealing 20 Header comment block on all files 10 Total 150 Submitting your work Your work should be submitted as an archive (.zip, .7z, .rar) on Moodle. This archive should include your all files needed for your project. Late submissions are allowed up to a week after the due date, but have an automatic 20% penalty. This project is due Thursday, November 14th at 11:59pm.
This project is to create a basic website using HTML and to deploy it on a web server. Creating the website will require HTML and deploying your website will require SSH, FTP, and various other tools. This is also a good opportunity to set up an editor and begin to learn how it works. Finally, you will also create basic landing page for your website and future projects. Requirements Your task is to create a website using separate HTML files. A web page is a single page, and a web site is a collection of pages which link to each other. The site should at least five HTML files (meaning five different pages) and one CSS file. Your website should meet the following criteria: HTML Criteria • Has valid HTML5 doctype. • Has a correct basic page layout, including,, andtags. • All pages should have a unique title. • All pages should have a link to them somewhere using an tag. Meaning, I should be able to navigate your website just by clicking on links. • At least one use of each of the following HTML tags: , ,