Points: 60

Submission: You need to submit electronically via Canvas by uploading separately a) a pdf file (named "hw1-Firstname-Lastname.pdf") for your answers to the questions, and b) the program(s) you have created (named "hw1-prog-Firstname-Lastname.???"); if there are multiple program files, please zip them into a single archive. Replace "Firstname" with your first name and "Lastname" with your last name in the file names. Note that you must upload the pdf file with your answers as a separate file; do not include the pdf in your archive file.

The main purpose of this assignment is to familiarize you with neural network architecture elements, including activation functions, and with how to define and train a simple neural network.

Problem 1 (24 points, 4 points each)

As neural networks are typically trained using (stochastic) gradient descent optimization algorithms, properties of the activation functions affect the learning. Here we divide the domain of an activation function into: 1) a fast learning region, where the magnitude of the gradient is larger than 0.99; 2) an active learning region, where the magnitude of the gradient is between 0.01 and 0.99 (inclusive); 3) a slow learning region, where the magnitude of the gradient is larger than 0 but smaller than 0.01; and 4) an inactive learning region, where the magnitude of the gradient is 0. For each of the following activation functions, plot its gradient over the input range from -5 to 5 and then list the four types of regions. If the gradient is not well defined for an input value, indicate so and then use any reasonable value.

(1) Rectified linear unit: f(x) = x if x ≥ 0, and 0 otherwise.

(2) Logistic sigmoid activation function: σ(x) = 1 / (1 + e^(−x)).

(3) Piece-wise linear unit: f(x) = 0.2x + 0.8 if x > 1; x if 1 ≥ x ≥ −1; and 0.2x − 0.8 otherwise.

(4) Swish: f(x) = x σ(3x), where σ is the logistic sigmoid in (2).

(5) Exponential Linear Unit (ELU): f(x) = x if x ≥ 0, and 0.1(e^x − 1) otherwise. (Note this is a special case of the general ELU function with a = 0.1.)
(6) Gaussian Error Linear Unit (GELU): f(x) = (x/2)(1 + erf(x/√2)), where erf is the error function (also known as the Gauss error function), given by erf(x) = (2/√π) ∫₀^x e^(−t²) dt. (Note that there is an approximation using xσ(1.702x), and there are approximations for the error function one could use; however, typical deep learning frameworks provide an efficient implementation of the error function.)

Problem 2 (16 points)

Here we use a simple neural network for solving the XOR problem given in the textbook (Section 6.1), but with two changes: we classify the input as class 1 by adding a sigmoid activation function, and we initialize b as -0.5 instead of 0. In other words, the neural network is given as

f(x; W, c, w, b) = σ(wᵀ max{0, Wᵀx + c} + b),

where σ is the sigmoid activation function. The parameters are initialized as follows:

W = [1 1; 1 1], c = [0, −1]ᵀ, w = [1, −2]ᵀ, b = −0.5.

The output of the neural network is the probability that the input belongs to class 1, which implies the probability of belonging to class 0 is 1 − the output. We will use the cross-entropy loss, and the training set consists of all four samples as given in the textbook.

(1) (8 points) Compute the output and its loss for the given network for each of the four training samples using Algorithm 6.3 in the textbook.

(2) (8 points) Compute the gradients for all the parameters for each of the four training samples using Algorithm 6.4 in the textbook.

Problem 3 (20 points)

Using a deep learning framework you have set up, implement the neural network in the previous problem.

(1) (10 points) Verify that the results from your implementation are the same as those you obtained for the previous problem.

(2) (6 points) Train your network for 100 epochs using stochastic gradient descent (with a batch size of 1 and a learning rate of 0.1). Plot the training loss with respect to the epoch at the end of each epoch, and then comment on the effectiveness of gradient descent.
(3) (4 points) For a neural network, we define an optimal adversarial example for a set of samples as the input with the smallest distance to any sample in the set that has a different classification compared to that sample. Here a sample is classified as class 1 if the output is higher than 0.5 and as class 0 if the output is lower than 0.5; when the output is exactly 0.5, it is ambiguous. Find an optimal adversarial example for the initial network and for your trained network. (To be more precise, the OAE (optimal adversarial example) for a model f and a set T is given as OAE(f, T) = argmin_x {∥x − t∥₂ : t ∈ T and f classifies x differently from t}.)

Extra Credit Problem

Problem 4 (6 points)

JumpReLU is a variant of ReLU, defined as JumpReLU(x) = x if x ≥ q, and 0 otherwise, where q is a parameter (see https://www.stat.berkeley.edu/~mmahoney/pubs/ICPRAM_2020_100.pdf for more details).

(1) (2 points) Explain why JumpReLU could improve the robustness of a ReLU neural network when all the ReLU activation functions are replaced with JumpReLU functions, assuming the parameter for each JumpReLU is chosen optimally.

(2) (4 points) For the neural network you trained for the previous problem, replace the ReLU with a JumpReLU with an optimized parameter and recompute the optimal adversarial example. Describe your findings and give explanations.
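The region taxonomy in Problem 1 can be checked numerically before plotting. Below is a minimal sketch (assuming NumPy; the function names and the central-difference approach are illustrative, not part of the assignment) that estimates an activation's gradient and labels input points with one of the four learning regions:

```python
import numpy as np

def relu(x):
    return np.where(x >= 0, x, 0.0)

def learning_region(grad):
    """Classify a gradient magnitude into the four regions from Problem 1."""
    g = abs(grad)
    if g > 0.99:
        return "fast"
    if g >= 0.01:        # 0.01 <= |grad| <= 0.99 (inclusive)
        return "active"
    if g > 0.0:
        return "slow"
    return "inactive"

def numeric_grad(f, x, h=1e-6):
    # central-difference estimate of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

for x in np.linspace(-5, 5, 11):
    g = numeric_grad(relu, x)
    print(f"x={x:5.1f}  grad={g:6.3f}  region={learning_region(g)}")
```

Note that at x = 0 the ReLU gradient is not well defined; the central difference returns 0.5 there, which is one "reasonable value" in the sense the problem statement allows.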
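For Problems 2 and 3(1), the forward pass with the given initialization can be sanity-checked in a few lines of NumPy (a sketch only, not the required framework implementation; variable names follow the problem statement):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Initialization from Problem 2 (W is symmetric, so W.T == W here)
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
c = np.array([0.0, -1.0])
w = np.array([1.0, -2.0])
b = -0.5

def forward(x):
    """f(x; W, c, w, b) = sigmoid(w^T max{0, W^T x + c} + b)."""
    h = np.maximum(0.0, W.T @ x + c)   # ReLU hidden layer
    return sigmoid(w @ h + b)

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # XOR inputs
y = np.array([0., 1., 1., 0.])                          # XOR labels

for x, t in zip(X, y):
    p = forward(x)
    loss = -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))  # cross-entropy
    print(f"x={x}, output={p:.4f}, loss={loss:.4f}")
```

With this initialization, the output is sigmoid(-0.5) ≈ 0.3775 for inputs [0,0] and [1,1] and sigmoid(0.5) ≈ 0.6225 for [0,1] and [1,0], so all four samples are already on the correct side of the 0.5 threshold.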
Programming Project 1

Using the MARIE computer assembly language, write a program that computes the following expression: z = a * b * c. The computer will read in the input values a, b, and c from the keyboard, and the final result (z) has to be displayed. In addition, every time an input value is read in, it must be displayed on the screen. Remember that the instruction set does not have an instruction to execute multiplication. The program must be tested on the MARIE simulator. All numbers used to test your program must be positive numbers (greater than zero).

On Webcourses you will find the MARIE folder with all the necessary documents on MARIE:
1) How to download the MARIE simulator
2) A guide to the MARIE simulator environment
3) A presentation to understand MARIE

Submit through Webcourses: your assembly program as a MARIE .mas file, for example myprogramname.mas. The program must have your name at the beginning of the program as a comment. Also include the text file.
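Because MARIE has no multiply instruction, z = a * b * c must be built from two repeated-addition loops. The control flow can be prototyped in a high-level language before translating it into MARIE instructions (Load/Add/Subt/Store/Skipcond/Jump); the Python sketch below is illustrative only, and the helper names are not part of the assignment:

```python
def multiply(a, b):
    """Multiply two positive integers by repeated addition,
    mirroring a MARIE loop with an accumulator and a counter."""
    product = 0          # accumulator (a memory word in MARIE)
    count = b            # loop counter, decremented to zero
    while count > 0:     # in MARIE: Skipcond on the counter, then Jump
        product += a
        count -= 1
    return product

def compute_z(a, b, c):
    # z = a * b * c, computed as two repeated-addition products
    return multiply(multiply(a, b), c)
```

For example, compute_z(2, 3, 4) returns 24. The loop assumes positive inputs, which matches the assignment's restriction that all test numbers be greater than zero.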
Learning Goals of this Project:
• Learning basic Pandas Dataframe manipulations
• Learning more about Machine Learning (ML) classification models and how they are used in a cybersecurity context
• Learning about basic data pipelines and transformations
• Learning how to write and use unit tests when developing Python code

Important Highlights
• You can do this project on your host; you do not need to use the VM.
• Please see the Setup page for videos and instructions about project setup.
• Keep the VM around for the final project (Summer 24), Web Security.
• Please watch the provided videos below to see how to set up your environment; we can't provide broad support here.
• There are only 25 submissions allowed! This is because Gradescope is a limited resource. It's improper to test your code against Gradescope.
• We have provided a local testing suite; be sure to pass it completely before you submit to Gradescope.

Important Reference Materials:
• NumPy Documentation
• Pandas Documentation
• Scikit-learn Documentation

Project Overview Video
This is a 16 minute video by the project creator; it covers project concepts.
https://youtu.be/kYoQiAamIpQ?si=kV9x32ubi_Hqw0iJ
There are other videos on the Setup page that cover installation and other subjects.

BACKGROUND
Many of the projects in CS6035 are focused on offensive security tasks. These are related to Red Team activities/tasks that many of us may associate with cybersecurity. This project will be focused on defensive security tasks, which are usually considered Blue Team activities and are done by many corporate teams.

Historically, many defensive security professionals have investigated malicious activity, files, and code. They investigate these to create patterns (often called signatures) that can be used to detect (and prevent) malicious activity, files, and code when that pattern is used again.
What this means is that these simple methods were only effective against known threats. This approach was relatively effective in preventing known malware from infecting systems, but it did nothing to protect against novel attacks. As attackers became more sophisticated, they learned to tweak or simply encode their malicious activity, files, or code to avoid detection by these simple pattern-matching detections.

With this background information, it would be nice if a more general solution could give a score to the activity, files, and code that pass through corporate systems every day. This solution would inform the security team that, while a certain pattern may not exactly fit a signature of known malicious activity, files, or code, it appears to be very similar to examples seen in the past that were malicious.

Luckily, machine learning models can do exactly that if provided with proper training data! Thus, it is no surprise that one of the most powerful tools in the hands of defensive cybersecurity professionals is machine learning. Modern detection systems usually use a combination of machine learning models and pattern matching (regular expressions) to detect and prevent malicious activity on networks and devices.

This project will focus on teaching the fundamentals of data analysis and building/testing your own machine learning models in Python. You'll be using the open source libraries Pandas and Scikit-Learn.

Cybersecurity Machine Learning Careers and Trends
• Machine learning in cybersecurity is a growing field. The area was considered among the top trends by McKinsey in 2022.
• The CompTIA State of Cybersecurity 2024 report says that last year there were 660,000 unfilled cybersecurity positions.
Also, in the section titled Product: AI Drives the Cybersecurity Product Set to New Heights, they note that 56% of respondents use AI and machine learning for cybersecurity.

Additional Information
• ML in Cybersecurity – Crowdstrike
• AI for Cybersecurity – IBM
• Future of Cybersecurity and AI – Deloitte

TABLE OF CONTENTS
• FAQ
• Setup
• Task 1
• Task 2
• Task 3
• Task 4
• Task 5
• Submissions
• Optional Notebooks
• Video Tasks

Frequently Asked Questions (FAQ)

Getting Started

Recommended Resources
• Q: What Python library documentation should I reference?
o A: Here are some helpful links:
 Scikit-learn API documentation: Learn how machine learning functions work.
 NumPy documentation: Understand common data structures and manipulation techniques.
 Pandas documentation: Learn to create and manipulate dataframes.
 Geeks4Geeks – Manipulating Dataframes with Pandas: Another resource for understanding dataframes.
• Q: Are there any helpful video tutorials?
o A: Yes, YouTube is a great resource. Check out Machine Learning for Everybody – Full Course by freeCodeCamp for a good overview of machine learning concepts.

Project Preparation
• Q: What skills are required for this project?
o A: See the Prerequisite page for Machine Learning.
• Q: I'm feeling lost. Where do I begin?
o A: The best way to learn is by doing! We provide Jupyter Notebooks to help you get started. These are a good way to familiarize yourself with the cell-based format before writing code in .py files.
o Important: You must submit .py files to Gradescope, not Jupyter Notebooks.
o We also have an introductory video and a sample task_video.py file. If these are too advanced, you may need to review prerequisite knowledge. It's possible to catch up, but it will be challenging.

General Project Questions

Library Versions and Testing
• Q: My library versions don't match the required versions. Is this a problem? / Why am I failing the TestPackageVersions / test_numpy_version / test__version tests?
o A: Yes, library versions matter.
Different versions can produce slightly different numerical results, causing you to fail the local tests. Ensure your versions match the requirements.

Communication and Support
• Q: When are office hours held?
o A: Check the pinned Ed Discussion post for office hour dates/times and recordings.
• Q: How should I ask project-related questions on Ed Discussion?
o A: Per Ed post #9, search for existing answers first. If you can't find the answer, ask your question in one of the pinned project posts. Do not post code or sensitive information. Public posts allow others to benefit from the Q&A.
• Q: Can I get code review through a private Ed Discussion post?
o A: No, we will not be reviewing individual code. Use the provided local tests and the Gradescope autograder. Debug your code using information from public Ed Discussion posts or online resources like Stack Overflow. Carefully read the project documentation; it contains many essential details.

Feedback
• Q: I have suggestions to improve the project. How can I share them?
o A: Create a private Ed Discussion post with your feedback. We'll review your suggestions and consider them for future semesters.

Submission and Gradescope

Submission Guidelines
• Q: How many submissions are allowed on Gradescope?
o A: See the Submissions page.
• Q: Can I submit each task individually and get the highest score for each, or do I need to submit all tasks at once?
o A: Submit all five task files together for your final submission. Your Gradescope score will be your Canvas score. See the Submissions page for details.

Troubleshooting Gradescope Issues
• Q: The autograder isn't showing any scores or output. Is it broken?
o A: We have a protection in the autograder to prevent printing sensitive information, so if your code has print statements you won't see your score or any outputs of the autograder. Please resubmit your code with the print statements removed, and you should see the normal outputs. You must remove the statements, not just comment them out.
We suggest you learn to use the debugger rather than relying on print statements. To re-emphasize: you are not allowed to use print, raise, exec, stdout, or similar statements in the code you submit to the autograder. This includes commented-out print/raise statements.
• Q: Are the local tests sufficient, or do I need to write additional ones?
o A: The local tests should generally be sufficient. You are welcome to create more tests for further validation. The use of raise/print statements would limit output from the autograder, so avoid using them.
• Q: I think there's a bug in the autograder.
o A: Create a private Ed Discussion post. While the autograder is thoroughly tested, there might be edge cases we missed. If a test passes locally but fails on Gradescope, let us know.
• Q: I'm getting an "Autograder failed to start" error.
o A: If you encounter this error, create a private Ed Discussion post and wait for TA assistance. Avoid resubmitting, as this will count against your submission limit.

Task Hints and Questions

All Tasks: General Rules
• Q: Can I change function names, parameters, or return types in the provided code?
o A: No. You should only modify the code within the function body. Do not change the function's signature (name, parameters, and return type).
For example, in the following code snippet:

    def do_something_data(dataset: pd.DataFrame, column_name: str) -> np.dtype:
        # Your code goes here
        return np.dtype(…)  # You may need to modify the argument within np.dtype()

o function name: do_something_data
o function parameters: dataset and column_name
o function return type: np.dtype
o function body: your code goes here …
return …
o The goal is for your code to compute the right value, but you should make sure you match the expected type for our autograder code.
• Q: Can I add or modify imports?
o A: You shouldn't need to, though you can add library imports related to the packages we include in our conda environment.
• Q: The data type isn't what I expected or what the test is expecting.
o A: Consider typecasting.
• Q: Are there differences between the Jupyter Notebook examples and the .py files I should submit?
o A: Yes, the notebooks are for learning and exploration. The .py files contain the definitive function signatures and requirements for submission. Always refer to the .py files for the correct function definitions, parameters, and return values. Do not rely solely on the notebooks.
• Q: The autograder says my code has print statements even though I commented them out.
o A: Completely remove all print statements, raise statements, and any other code that generates output. Even commented-out print/raise statements can cause the autograder to fail. Use a debugger for troubleshooting instead.
• Q: My code passes local tests for dataset 0 but fails dataset 1 on Gradescope for some tasks.
o A: Dataset 1 on Gradescope is different from dataset 0 and is there to ensure people don't hard-code their answers to dataset 0 (e.g., by returning hard-coded numbers instead of actually implementing the function).

Task-Specific Guidance

TASK 1
• Q: My code passes local tests for dataset 0 but fails dataset 1 on Gradescope.
o A: Dataset 0 likely matches the local tests. Dataset 1 has subtle differences to ensure your code is not hardcoded and generalizes correctly.

TASK 2
• Q: I'm getting "ValueError: Input contains NaN, infinity or a value too large for dtype('float64')"
o A: Double-check your preprocessing steps, especially scaling and one-hot encoding. Make sure you are handling missing values appropriately before applying these transformations.
Also, ensure your training data doesn't contain infinite values or values exceeding the limits of float64.
• Q: Why am I getting different results locally compared to Gradescope, even with the same random state?
o A: Ensure your local environment matches the Gradescope environment as closely as possible, including library versions. Minor variations can still occur due to platform differences. However, significant discrepancies usually indicate an implementation error. Double-check your code against the documentation. Do not scale your data unless explicitly instructed to.

TASK 3
• Q: I'm getting a straight line in the Yellowbrick elbow plot. What could be wrong?
o A: A straight line in the elbow plot often indicates an issue with the data you're providing to the visualizer. Verify that you are using the correct data and that you're not accidentally comparing data to itself. Double-check the input parameters of KElbowVisualizer and ensure they match the expected format (e.g., using tuples instead of lists for the k parameter). Try reproducing examples from the Yellowbrick documentation to confirm your environment is set up correctly. Also, confirm you're using the unscaled data, as scaling can negatively impact k-means clustering.
• Q: My kmeans_cluster_id column has a different data type (float64) than expected (int64).
o A: This often happens when NaN values are introduced during data manipulation. Ensure your new column has the same number of rows and index as the original dataset. Avoid creating NaN values unintentionally. Use .astype(int) to convert your column to the correct data type after ensuring there are no missing values.

TASK 4
• Q: Using RFE with random forest or gradient boosting models is slow and times out on Gradescope.
o A: Use RFE only for logistic regression. For random forest and gradient boosting, use their built-in feature importance attributes.

TASK 5
• Q: Should I use a specific model for Task 5.2?
o A: No specific model is required.
You are free to choose any model (or combination of models) as long as it meets the performance threshold. Ensemble methods are permitted. You are responsible for any necessary hyperparameter tuning. Simple models might suffice; consider avoiding overly complex models, which can make debugging more challenging.
• Q: How do I create train and test datasets when I only have training data?
o A: Split the provided training data into train and test sets for development and testing. Don't confuse the terms with the underlying concept. We have separate test data for the autograder, but you can conceptually split the provided data for your own testing.
• Q: What is hyperparameter tuning, and how is it done?
o A: Hyperparameters are settings that control the training of machine learning models. They are set manually before training. Examples include the number of nodes and layers in a neural network or the number of branches in a decision tree. These settings affect model architecture, learning rate, and complexity.
o Hyperparameter tuning is the process of finding the best hyperparameter values for optimal model performance. It's an iterative process crucial for successful machine learning.
o For example, if a model's learning rate is too high, it might converge too quickly, leading to suboptimal results. If it's too low, training might take too long or fail to converge. Finding the right balance is key to achieving accurate models.

Setup

The project can be done on your host machine, or you can do it on the VM if you don't want to install conda locally. Regardless of your choice, you will be working with the Student_Local_Testing directory that contains all the project files.

There is a src directory in Student_Local_Testing that contains the project files you will work on. Do not move these source files (task1.py through task5.py).
The tests in the tests dir require the source files to be in src.

Host machine users should start below with the first instructions link by installing Miniconda and the cs6035_ML environment. Then you can set up the project in your favorite IDE. We demonstrate setup with PyCharm and VS Code below. VM users, please start with the instructions for installing on the VM. There are also videos if you prefer those to the written instructions.

Written Setup Instructions:
• Project Installation on your Host Machine
• Project Installation on the VM
• PyCharm-Specific Instructions
• VS Code-Specific Instructions

Project Setup / Getting Started Videos
• Host Installation Video – Short Version
• Host Installation Video – Long Version
• VM Installation Video

Project Content Videos
• Demonstration – Task Video
• Optional Jupyter Notebooks

Project Installation on your Host Machine

Host Installation Instructions
• You may have heard of Anaconda, "The Operating System of AI." It's a full-featured data science environment that includes everything from a Python environment (conda) to apps including the Spyder IDE.
• For this project we only need the conda part, so we'll have you download and install Miniconda from their installer page.
• Note: There are graphical installers for the Windows and Mac platforms. In a video below we cover the graphical Windows installer. If you are on a Mac or running Linux, you can use a bash-based installation script.
• During the conda installation, generally accept the default options provided:
• In Windows, do not add conda to your path, but do register it as the primary Python.
• On Macs, make sure the installer shows you the "Destination Select" page; otherwise you have to set the installation location earlier in the installation.
For Mac issues, please see the conda Mac docs.
• Once you have conda installed, you need to install a new conda environment:
• In Windows, open an "Anaconda Powershell Prompt."
• On a Mac or Linux, just open a terminal window normally.
• Download the project Student_Local_Testing.zip file from Canvas and unzip it.
• In your terminal window, use the cd command to navigate into the Student_Local_Testing directory you unzipped.
• Run ls to confirm you have the env.yml file in the Student_Local_Testing directory.
• Run the following conda command:

    conda env create -f env.yml

• This will take a couple of minutes to complete. If you get timeouts, you can run the following command:

    conda config --set remote_read_timeout_secs 180.0

• (Set higher as needed; the 180 is in seconds.)
• Once the command finishes, confirm the cs6035_ML conda env was installed:

    conda activate cs6035_ML

• The prompt will now display (cs6035_ML) where it used to show (base).

Project Installation on the VM

VM Installation Instructions
• On the VM we provide a one-step script to set up the project. It will download and install Miniconda and the cs6035_ML environment, as well as downloading and unzipping the project's Student_Local_Testing directory.
• Open the VM and log in to the machine user account with the password provided in Canvas.
• Open a terminal window in the VM.
• On the Lubuntu VM, click the bird icon in the lower left corner, choose System, and then choose Terminal (or QTerminal – both work!).
• On the newer Ubuntu VM for Fall 24, click Activities in the upper left corner and enter Terminal into the search box that appears.
The Terminal app will appear.
• Enter the following command on one line:

    wget https://cs6035.s3.amazonaws.com/ML/setup_conda_and_project.sh

• This command will download the setup_conda_and_project.sh script.
• You need to make the script executable; enter the following command:

    chmod +x setup_conda_and_project.sh

• Now that you have made the script executable, run it like this:

    ./setup_conda_and_project.sh

• This will run for a while. If it times out, edit the script and increase the value on this line:

    /home/machine/miniconda3/bin/conda config --set remote_read_timeout_secs 180.0

• Once this script finishes, you will need to open a new terminal window to pick up the newly installed environment. The easiest way to do this is to close and re-open the terminal application.

Running VS Code or PyCharm Community on the VM:
• We have provided scripts in your home directory to install PyCharm or VS Code on the VM.
• To install these IDEs, run either ./InstallVSCode.sh or ./InstallPycharm.sh
• Follow the IDE Setup instructions below, as you would on the host.
• The IDE should now be installed; look in the menu in the bottom left corner of the Desktop (it is also accessible via the command line, if you're familiar with that).

PyCharm-Specific Instructions

For PyCharm, you will create a new project and tell PyCharm to use an existing environment, the conda cs6035_ML environment you installed in the above steps.
In PyCharm, choose New Project:
• Be sure the directory name where your project files live is in the Name field (use Student_Local_Testing).
• The Location field points to the parent directory of the dir in the Name field (wherever you unzipped Student_Local_Testing).
• Choose "Custom Environment."
• Choose "Select Existing."
• For Type, if it's not already chosen, choose "Conda."
• Be sure the "Path to conda" field is filled; if not, point it to the conda.bat in the condabin directory of your Miniconda installation. For example, in Windows the Miniconda directory is capitalized; it won't be in
Linux or Macs: C:\Users\jimlo\Miniconda3\condabin\conda.bat
• Once you find your conda executable, the Environment drop-down should auto-populate with your conda environments.
• Select cs6035_ML from the list.
• When you click "Create" you'll get a dialog confirming you want to create a project where files already exist.
• Choose "Create From Existing Files" in this dialog.

VS Code-Specific Instructions

NOTE: If you're using VS Code on the VM, you will need to install the Python and Python Debugger extensions. Use View->Extensions.
• VS Code is not a Python-only IDE like PyCharm, so we have a few things to set up there.
• First, be sure the official Microsoft Python and Python Debugger extensions are installed.
• Make sure to DENY/REFUSE all the Copilot pop-ups asking to install or use it. Copilot usage is not allowed in this class, even in the IDE, as mentioned in the syllabus.
• Next you need to select the conda Python interpreter you installed.
• Use Ctrl-Shift-P (Windows) to bring up a dialog at the top of the screen.
• Enter select interpreter into the text entry area to match the Python: Select Interpreter item.
• Choose the Python: Select Interpreter option.
• Choose the conda cs6035_ML environment (you may see a different Python version).
• Now, to open the project files in VS Code, choose File->Add Folder to Workspace and select the Student_Local_Testing directory.
• Make sure that the Student_Local_Testing directory is the top-level directory in VS Code for tests to work properly.
• Now you need to set up tests in VS Code:
• Click on the Beaker icon, then click on the Configure Tests button.
• Choose unittest, tests, and test_*.py in the choices presented to you.
• You should see the tests showing in the Tests/Beaker panel.

If you get errors debugging tests in VS Code, where VS Code reports you are on a pre-3.7 version of Python, read this section:
If VS Code reports Python version 3.1.x:
• There's a bug currently in the VS Code Python and/or Python Debugger extensions.
• When you
go to configure the Python version, you'll see 3.1.x reported as the version.
• This causes VS Code's extensions to think you're running a really old Python version.
• To fix this, go into the View->Extensions menu and choose the pre-release versions of both the Python and Python Debugger extensions.

Project Setup / Getting Started Videos

Host Installation Video – Short Version
• This video shows the conda, PyCharm, and VS Code setup in a few minutes.
• There is little commentary here; the next video has the same process but more details.
• IIS – Machine Learning Project: Quick and Dirty conda / Pycharm / VS Code video: https://youtu.be/21IQUiyozUU

Host Installation Video – Long Version
• This video shows the steps above for installing the project on your host machine.
• For VM installation, skip this video and see the next video.
• IIS Machine Learning Host Install Video: https://youtu.be/9eYymJrZ0YY

VM Installation Video
• This video shows how to download and run the install script on the VM.
• NOTE: You can do this project on your host machine; you don't need the VM.
• Integrated Development Environments (IDEs): There are installation scripts for PyCharm and VS Code on the VM, if you choose to use the VM.
• Look in the machine user's home directory and you'll find InstallPycharm.sh and InstallVSCode.sh.
• Run these with a ./ in front, like ./InstallPycharm.sh
• Machine Learning VM Install Video: https://youtu.be/NfZ9xs5f0T0

Demonstration – Task Video
• Demonstrates project concepts and approaches.
• Focuses on how to use the debugger.
• Follow along with the provided task_video.py.
• The PyCharm section starts at 3:45.
• At 5:06, ignore copying the task files from the extra directory.
• We provided all the files in the Student_Local_Testing/src dir for you.
• GA Tech OMS-CS ML Project Code Intro: https://youtu.be/oeaiyEIdl04

Task 1 (15 points)

For the first task, let's get familiar with some pandas basics.
pandas is a Python library that deals with DataFrames, which you can think of as a Python class that handles tabular data. In the real world, you would create graphics and other visuals to better understand the dataset you are working with. You would also use plotting tools like Power BI, Tableau, Data Studio, and Matplotlib. This step is generally known as Exploratory Data Analysis. Since we are using an autograder for this class, we will skip the plotting for this project.

For this task, we have released a local test suite. If you are struggling to understand the expected inputs and outputs for a function, please set up the test suite and use it to debug your function. Please note that the return lines for the provided skeleton functions are placeholders for the data types that the tests are expecting.

It's critical that you pass all tests locally before you submit to Gradescope for credit. Do not use Gradescope for debugging.

Theory
In this task, we're not yet getting into theory. It's more nuts and bolts: you will learn the basics of pandas. pandas DataFrames are something of a glorified list of lists, mixed in with a dictionary. You get a table of values with rows and columns, and you can modify the column names and index values for the rows. There are numerous functions built into pandas to let you manipulate the data in the DataFrame.

To be clear, pandas is not part of Python, so when you look up docs, you'll specifically want the official Pydata pandas docs. Note that we linked to the API docs here; this is the core of the docs you'll be looking at.

You can always get started trying to solve a problem by looking at Stack Overflow posts in Google search results. There you'll find ideas about how to use the pandas library.
In the end, however, you should find yourself in the habit of looking directly at the docs for whichever library you are using, pandas in this case.

For those who might need a concrete example to get started, here's how you would take a pandas DataFrame column and return the average of its values:

import pandas as pd

# create a dataframe from a Python dict
df = pd.DataFrame({"color": ["yellow", "green", "purple", "red"], "weight": [124, 4.56, 384, -2]})
df  # shows the dataframe

index color weight
0 yellow 124.00
1 green 4.56
2 purple 384.00
3 red -2.00

Note that the column names are ["color", "weight"] while the index is [0, 1, 2, 3], where the brackets denote a list.

Now that we have created a dataframe, we can find the average weight by summing the values under 'weight' and dividing the sum by the number of values:

average = df['weight'].sum() / len(df['weight'])
average  # if you put a variable as the last line, the variable is printed
127.63999999999999

Note: In the example above, we're not paying attention to rounding; you will need to round your answers to the precision asked for in each Task.

Also note, we are using slightly older versions of pandas, Python and other libraries, so be sure to look at the docs for the appropriate library version. Often there's a drop-down at the top of docs sites to select the older version.

Refer to the Submissions page for details about submitting your work.

Useful Links:
• pandas documentation — pandas documentation (pydata.org)
• What is Exploratory Data Analysis? – IBM
• Top Data Visualization Tools – KDnuggets

Deliverables:
• Complete the functions in task1.py
• For this task we have released a local test suite; please set that up and use it to debug your functions.
• Submit task1.py to Gradescope

Instructions:

The task1.py file has function skeletons that you will complete with Python code, mostly using the pandas library.
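As a quick sanity check of the averaging walkthrough above: pandas also provides a built-in Series.mean(), which computes the same value as the manual sum-divided-by-count approach. A small self-contained sketch (the example frame is the one shown earlier):

```python
import pandas as pd

df = pd.DataFrame({"color": ["yellow", "green", "purple", "red"],
                   "weight": [124, 4.56, 384, -2]})

# manual average, as in the walkthrough above
manual = df["weight"].sum() / len(df["weight"])

# pandas also ships a built-in mean for Series
built_in = df["weight"].mean()

print(manual, built_in)  # both about 127.64
```

Either form is fine for exploration; just remember the rounding note above when producing graded answers.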
The goal of each of these functions is to give you familiarity with the pandas library and some general Python concepts like classes, which you may not have seen before. See information about each function's inputs, outputs, and skeleton below.

Table of contents
1. find_data_type
2. set_index_col
3. reset_index_col
4. set_col_type
5. make_DF_from_2d_array
6. sort_DF_by_column
7. drop_NA_cols
8. drop_NA_rows
9. make_new_column
10. left_merge_DFs_by_column
11. simpleClass
12. find_dataset_statistics

find_data_type

In this function you will take a dataset and the name of a column in it. You will return the column's data type.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dtypes.html

INPUTS
• dataset – a pandas DataFrame that contains some data
• column_name – a Python string (str)

OUTPUTS
np.dtype – the data type of the column

Function Skeleton
def find_data_type(dataset: pd.DataFrame, column_name: str) -> np.dtype:
    return np.dtype()

set_index_col

In this function you will take a dataset and a series and set the index of the dataset to be the series.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Index.html

INPUTS
• dataset – a pandas DataFrame that contains some data
• index – a pandas Series that contains an index for the dataset

OUTPUTS
a pandas DataFrame indexed by the given index series

Function Skeleton
def set_index_col(dataset: pd.DataFrame, index: pd.Series) -> pd.DataFrame:
    return pd.DataFrame()

reset_index_col

In this function you will take a dataset with an index already set and reindex it from 0 to n-1, where n is the number of rows in the dataset, dropping the old index.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.html

INPUTS
• dataset – a pandas DataFrame that contains some data

OUTPUTS
a pandas DataFrame indexed from 0 to n-1

Function Skeleton
def reset_index_col(dataset: pd.DataFrame) -> pd.DataFrame:
    return pd.DataFrame()

set_col_type

In this function you
will be given a DataFrame, a column name, and a column type. You will edit the dataset so that the named column is cast to the type given in the input variable.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.astype.html

INPUTS
• dataset – a pandas DataFrame that contains some data
• column_name – a string containing the name of a column
• new_col_type – a Python type to change the column to

OUTPUTS
a pandas DataFrame with the column in column_name changed to the type in new_col_type

Function Skeleton
# Set astype (string, int, datetime)
def set_col_type(dataset: pd.DataFrame, column_name: str, new_col_type: type) -> pd.DataFrame:
    return pd.DataFrame()

make_DF_from_2d_array

In this function you will take data in an array, as well as column and row labels, and use that information to create a pandas DataFrame.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html

INPUTS
• array_2d – a 2-dimensional numpy array of values
• column_name_list – a list of strings holding column names
• index – a pandas Series holding the row indexes

OUTPUTS
a pandas DataFrame with columns set from column_name_list, row index set from index, and data set from array_2d

Function Skeleton
# Take a matrix of numbers and make it into a DataFrame with column names and index numbering
def make_DF_from_2d_array(array_2d: np.array, column_name_list: list[str], index: pd.Series) -> pd.DataFrame:
    return pd.DataFrame()

sort_DF_by_column

In this function, you are given a dataset and column name.
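The two pandas calls that the set_col_type and make_DF_from_2d_array descriptions above point at are DataFrame.astype and the DataFrame constructor. A hedged sketch with made-up sample data (not the graded solution):

```python
import numpy as np
import pandas as pd

# set_col_type boils down to DataFrame.astype on one column
df = pd.DataFrame({"count": ["1", "2", "3"]})
df = df.astype({"count": int})

# make_DF_from_2d_array boils down to the DataFrame constructor,
# which accepts data, columns, and index directly
arr = np.array([[1, 2], [3, 4]])
out = pd.DataFrame(arr, columns=["a", "b"], index=pd.Series([10, 11]))

print(df["count"].dtype)  # an integer dtype (platform-dependent width)
print(out.loc[10, "b"])   # 2
```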
You will return a sorted dataset (sorting rows by the value of the specified column), either in descending or ascending order, depending on the value of the descending variable.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html

INPUTS
• dataset – a pandas DataFrame that contains some data
• column_name – a string that contains the name of the column to sort the data on
• descending – a boolean value (True or False) indicating whether the column should be sorted in descending order

OUTPUTS
a pandas DataFrame sorted by the given column name, in descending or ascending order depending on the value of the descending variable

Function Skeleton
# Sort DataFrame by values
def sort_DF_by_column(dataset: pd.DataFrame, column_name: str, descending: bool) -> pd.DataFrame:
    return pd.DataFrame()

drop_NA_cols

In this function you are given a DataFrame. You will return a DataFrame with any columns containing NA values dropped.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html

INPUTS
• dataset – a pandas DataFrame that contains some data

OUTPUTS
a pandas DataFrame with any columns that contain an NA value dropped

Function Skeleton
# Drop NA values in DataFrame columns
def drop_NA_cols(dataset: pd.DataFrame) -> pd.DataFrame:
    return pd.DataFrame()

drop_NA_rows

In this function you are given a DataFrame. You will return a DataFrame with any rows containing NA values dropped.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html

INPUTS
• dataset – a pandas DataFrame that contains some data

OUTPUTS
a pandas DataFrame with any rows that contain an NA value dropped

Function Skeleton
def drop_NA_rows(dataset: pd.DataFrame) -> pd.DataFrame:
    return pd.DataFrame()

make_new_column

This function adds a new column to a DataFrame using a provided list of values, where each value corresponds to a row in the dataset.
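The sorting and NA-dropping behaviors described above map onto DataFrame.sort_values and DataFrame.dropna; note that sort_values takes an ascending flag, which is the inverse of this task's descending input. A small illustrative sketch (toy data, not the graded solution):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [3, 1, 2], "b": [1.0, np.nan, 3.0]})

# sort_DF_by_column: ascending is the inverse of the descending flag
sorted_df = df.sort_values(by="a", ascending=False)

# drop_NA_cols drops any column containing an NA; drop_NA_rows drops rows instead
no_na_cols = df.dropna(axis=1)
no_na_rows = df.dropna(axis=0)

print(list(sorted_df["a"]))      # [3, 2, 1]
print(list(no_na_cols.columns))  # ['a'] -- column 'b' had an NA
print(len(no_na_rows))           # 2 -- the row with the NA is gone
```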
The new column is named according to the specified new_column_name.

Useful Resources
• Adding Columns in Pandas

INPUTS
• dataset – a pandas DataFrame containing existing data
• new_column_name – a string specifying the name of the new column to create
• new_column_value – a list of values, where each element represents the value for the new column in the corresponding row of the DataFrame. The length of this list must match the number of rows in the dataset.

OUTPUTS
A pandas DataFrame with the new column added. The new column, named new_column_name, contains the values from the new_column_value list in the order they are provided; each row's value in the new column corresponds to the element at the same index in the list.

Function Skeleton
def make_new_column(dataset: pd.DataFrame, new_column_name: str, new_column_value: list) -> pd.DataFrame:
    return pd.DataFrame()

left_merge_DFs_by_column

In this function you are given two datasets and the name of a column on which you will left join them using the pandas merge method. For example purposes, the left dataset is left_dataset and the right dataset is right_dataset.

Useful Resources
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
https://stackoverflow.com/questions/53645882/pandas-merging-101

INPUTS
• left_dataset – a pandas DataFrame that contains some data
• right_dataset – a pandas DataFrame that contains some data
• join_col_name – a string containing the name of the column to join the two DataFrames on

OUTPUTS
a pandas DataFrame containing the two datasets left joined together on the given column name

Function Skeleton
def left_merge_DFs_by_column(left_dataset: pd.DataFrame, right_dataset: pd.DataFrame, join_col_name: str) -> pd.DataFrame:
    return pd.DataFrame()

simpleClass

This project will require you to work with Python classes.
If you are not familiar with them, we suggest learning a bit more about them.

You will take the inputs to the class initialization and set them as instance variables (of the same name) in the Python class.

Useful Resources
https://www.w3schools.com/python/python_classes.asp

INPUTS
• length – an integer
• width – an integer
• height – an integer

OUTPUTS
None, just set up the __init__ method in the class.

Function Skeleton
class simpleClass():
    def __init__(self, length: int, width: int, height: int):
        pass

find_dataset_statistics

Now that you have learned a bit about pandas DataFrames, we will use them to generate some simple summary statistics for a DataFrame. You will be given the dataset as an input variable, as well as the name of a column in the dataset that serves as a label column. This label column contains binary values (0 and 1) that you will summarize; it is also the variable we would want to predict.

In this context:
• 0 represents a "negative" sample (e.g. if the column is IsAVirus and we think it is false)
• 1 represents a "positive" sample (e.g. if the column is IsAVirus and we think it is true)

This type of binary classification is common in machine learning tasks where we want to be able to predict the labeled field. An example of where this could be useful would be if we were looking at network data, and the label column was IsVirus.
We could then analyze the network data of Georgia Tech services and predict if incoming files look like a virus (and if we should alert the security team).Useful Resources• https://www.learndatasci.com/glossary/binary-classification/• https://developers.google.com/machine-learning/crash-course/framing/ml-terminologyINPUTS• dataset – a pandas DataFrame that contains some data• label_col – a string containing the name of the label columnOUTPUTS• n_records (int) – the number of rows in the dataset• n_columns (int) – the number of columns in the dataset• n_negative (int) – the number of “negative” samples in the dataset (the argument label column equals 0)• n_positive (int) – the number of “positive” samples in the dataset (the argument label column equals 1)• perc_positive (int) – the percentage (out of 100%) of positive samples in the dataset; truncate anything after the decimalHint: Consider using the int function to type cast decimalsFunction Skeletondef find_dataset_statistics(dataset:pd.DataFrame,label_col:str) -> tuple[int,int,int,int,int]: n_records = #TODO n_columns = #TODO n_negative = #TODO n_positive = #TODO perc_positive = #TODO return n_records,n_columns,n_negative,n_positive,perc_positiveTask 2 (25 points)Now that you have a basic understanding of pandas and the dataset, it is time to dive into some more complex data processing tasks.TheoryIn machine learning a common goal is to train a model on one set of data. Then we validate the model on a similarly structured but different set of data. You could, for example, train the model on data you have collected historically. Then you would validate the model against real-time data as it comes in, seeing how well it predicts the new data coming in.If you’re looking at a past dataset as we are in these tasks, we need to treat different parts of the data differently to be able to develop and test models. We segregate the data into test and training portions. 
We train the model on the training data and test the developed model on the test data to see how well it predicts the results. You should never train your models on test data, only on training data.

Notes

At a high level, it is important to hold out a subset of your data when you train a model so that you can see what the expected performance is on unseen samples. This lets you determine whether the resulting model is overfit (i.e. performs much better on training data than on test data).

Preprocessing data is essential because most models only take in numerical values, so categorical features need to be "encoded" to numerical values for models to use them. A machine learning model may not be able to make sense of "green", "blue" and "red"; in preprocessing, we'll convert those to integer values 1, 2 and 3, for example. It's an interesting question as to what happens when your training data has "green", "red" and "blue", but your test data says "yellow".

Numerical scaling can be more or less useful depending on the type of model used, but it is especially important in linear models. Numerical scaling typically takes positive values and "compresses" them into a range between 0 and 1 (inclusive) that retains the relationships among the original data.

These preprocessing techniques will provide you with options to augment your dataset and improve model performance.

Useful Links:
• Training and Test Sets – Machine Learning – Google Developers
• Bias–variance tradeoff – Wikipedia
• Overfitting – Wikipedia
• Categorical and Numerical Types of Data – 365 Data Science
• scikit-learn: machine learning in Python — scikit-learn 1.2.1 documentation

Deliverables:
• Complete the functions and methods in task2.py
• For this task we have released a local test suite; please set that up and use it to debug your functions.
• Submit task2.py to Gradescope when you pass all local tests.
Refer to the Submissions page for details.

Instructions:

The task2.py file has function skeletons that you will complete with Python code, mostly using the pandas and scikit-learn libraries. The goal of each of these functions is to give you familiarity with the applied concepts of splitting and preprocessing data. See information about each function's inputs, outputs, and skeleton below.

Table of contents
1. tts
2. PreprocessDataset
   1. __init__
   2. One Hot Encoding
   3. Min/Max Scaling
   4. PCA
   5. Feature Engineering
   6. Preprocess

tts

In this function, you will take:
• a dataset
• the name of its label column
• a percentage of the data to put into the test set
• whether you should stratify on the label column
• a random state to set the scikit-learn function

You will return features and labels for the training and test sets.

At a high level, you can separate the task into two subtasks. The first is splitting your dataset into features and labels (by columns), and the second is splitting your dataset into training and test sets (by rows).
You should use the scikit-learn train_test_split function, but you will have to write wrapper code around it based on the input values we give you.

Useful Resources
• https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
• https://developers.google.com/machine-learning/crash-course/framing/ml-terminology
• https://stackoverflow.com/questions/40898019/what-is-the-difference-between-a-feature-and-a-label

INPUTS
• dataset – a pandas DataFrame that contains some data
• label_col – a string containing the name of the column that contains the label values (what our model wants to predict)
• test_size – a float containing the fraction of the dataset's rows that should go into the test set
• should_stratify – a boolean (True or False) value indicating whether the train/test split should be stratified on the label column
• random_state – an integer value to set the randomness of the function (useful for repeatability, especially when autograding)

OUTPUTS
• train_features – a pandas DataFrame that contains the train rows and the feature columns
• test_features – a pandas DataFrame that contains the test rows and the feature columns
• train_labels – a pandas Series that contains the train rows of the label column
• test_labels – a pandas Series that contains the test rows of the label column

Function Skeleton
def tts(
    dataset: pd.DataFrame,
    label_col: str,
    test_size: float,
    should_stratify: bool,
    random_state: int) -> tuple[pd.DataFrame, pd.DataFrame, pd.Series, pd.Series]:
    # TODO
    return train_features, test_features, train_labels, test_labels

PreprocessDataset

The PreprocessDataset class contains a code skeleton with nine methods for you to implement. Most methods come in pairs: one that will be run on the training dataset and one that will be run on the test dataset.
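As a rough sketch of the two subtasks described above (split by columns, then by rows) mapped onto scikit-learn's train_test_split — the function name tts_sketch and the toy data are illustrative assumptions, not the graded solution:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def tts_sketch(dataset, label_col, test_size, should_stratify, random_state):
    # subtask 1: split columns into features vs. label
    features = dataset.drop(columns=[label_col])
    labels = dataset[label_col]
    # train_test_split's stratify argument expects the label values, or None to disable
    strat = labels if should_stratify else None
    # subtask 2: split rows into train/test sets
    return train_test_split(features, labels, test_size=test_size,
                            stratify=strat, random_state=random_state)

df = pd.DataFrame({"x": range(10), "y": [0, 1] * 5})
X_tr, X_te, y_tr, y_te = tts_sketch(df, "y", 0.2, True, 0)
print(len(X_tr), len(X_te))  # 8 2
```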
In Data Science/Machine Learning, this is done to avoid something called Data Leakage.For this assignment, we don’t expect you to understand the nuances of the concept, but we will have you follow principles that will minimize the chances of it occurring. You will accomplish this by splitting data into training and test datasets and processing those datasets in slightly different ways.Generally, for everything you do in this project, and if you do any ML or Data Science work in the future, you should train/fit on the training data first, then predict/transform on the training and test data. That holds up for basic preprocessing steps like task 2 and for complex models like you will see in tasks 3 and 4.For the purposes of this project, you should never train or fit on the test data (and more generally in any ML project) because your test data is expected to give you an understanding of how your model/predictions will perform on unseen data. If you fit even a preprocessing step to your test data, then you are either giving the model information about the test set it wouldn’t have about unseen data (if you combine train and test and fit to both), or you are providing a different preprocessing than the model is expecting (if you fit a different preprocessor to the test data), and your model would not be expected to perform well.Note: You should train/fit using the train dataset; then, once you have a fit encoder/scaler/pca/model instance, you can transform/predict on the training and test data.You will also notice that we are only preprocessing the Features and not the Labels. There are a few cases where preprocessing steps on labels may be helpful in modeling, but they are definitely more advanced and out of the scope of this introduction. 
Generally, you will not need to do any preprocessing to your labels beyond potentially encoding a string value (i.e., "Malware" or "Benign") into an integer value (0 or 1), which is called Label Encoding.

PreprocessDataset: __init__

Similar to the Task 1 simpleClass subtask you previously completed, you will initialize the class by adding instance variables (add all the inputs to the class).

Useful Resources
• https://www.w3schools.com/python/python_classes.asp

INPUTS
• one_hot_encode_cols – a list of column names (strings) that should be one hot encoded by the one hot encode methods
• min_max_scale_cols – a list of column names (strings) that should be min/max scaled by the min/max scaling methods
• n_components – an int that contains the number of components that should be used in Principal Component Analysis
• feature_engineering_functions – a dictionary that contains feature names and the functions to create those features as key-value pairs (example shown below)

Example of feature_engineering_functions:

def double_height(dataframe: pd.DataFrame):
    return dataframe["height"] * 2

def half_height(dataframe: pd.DataFrame):
    return dataframe["height"] / 2

feature_engineering_functions = {"double_height": double_height, "half_height": half_height}

Don't worry about copying it; we also have examples in the local test cases. This is just provided as an illustration of what to expect in your function.

OUTPUTS
None, just assign all the input parameters to class variables.

Also, per the instructions below, you'll return here and create another instance variable: a scikit-learn OneHotEncoder with any parameters you may need later.

Function Skeleton
def __init__(self,
             one_hot_encode_cols: list[str],
             min_max_scale_cols: list[str],
             n_components: int,
             feature_engineering_functions: dict):
    # TODO: Add any instance variables you may need to make your functions work
    return

PreprocessDataset: one_hot_encode_columns_train and one_hot_encode_columns_test

One Hot Encoding is the process of taking a column and returning a
binary vector representing the various values within it. There is a separate function for the training and test datasets since they should be handled separately to avoid data leakage (see the 3rd link in Useful Resources for a little more info on how to handle them).

Pseudocode

one_hot_encode_columns_train()
1. In the PreprocessDataset __init__() method, initialize an instance variable containing a scikit-learn OneHotEncoder with any parameters you may need.
2. Split train_features into two DataFrames: one with only the columns you want to one hot encode (using one_hot_encode_cols) and another with all the other columns.
3. Fit the OneHotEncoder using the DataFrame you split from train_features with the columns you want to encode.
4. Transform the DataFrame you split from train_features with the columns you want to encode using the fitted OneHotEncoder.
5. Create a DataFrame from the 2D array of data that the output from step 4 gave you, with column names in the form of columnName_categoryName (there should be an attribute in OneHotEncoder that can help you with this) and the same index that train_features had.
6. Concatenate the DataFrame you made in step 5 with the DataFrame of other columns from step 2.

one_hot_encode_columns_test()
1. Split test_features into two DataFrames: one with only the columns you want to one hot encode (using one_hot_encode_cols) and another with all the other columns.
2. Transform the DataFrame you split from test_features with the columns you want to encode using the OneHotEncoder you fit in one_hot_encode_columns_train().
3. Create a DataFrame from the 2D array of data that the output from step 2 gave you, with column names in the form of columnName_categoryName (there should be an attribute in OneHotEncoder that can help you with this) and the same index that test_features had.
4.
Concatenate the DataFrame you made in step 3 with the DataFrame of other columns from step 1.Example Walkthrough (from Local Testing suite):INPUTS:one_hot_encode_cols[“src_ip”,”protocol”]Train FeaturesIndex src_ip protocol bytes_in bytes_out time3 104.128.239.2 TCP 1054 9108 2024-12-20 09:15:421 103.31.4.0 TCP 3412 7567 2024-12-19 23:33:217 10.112.171.199 TCP 553 2331 2024-12-20 01:26:519 108.162.192.0 ICMP 8423 3805 2024-12-20 11:55:525 216.189.157.2 UDP 9328 7089 2024-12-20 20:50:300 103.21.244.0 UDP 2782 108 2024-12-20 11:16:234 45.58.56.3 TCP 6959 298 2024-12-20 15:30:562 108.162.192.0 UDP 8856 3510 2024-12-19 22:42:38Test FeaturesIndex src_ip protocol bytes_in bytes_out time8 10.130.94.70 TCP 8172 5321 2024-12-20 17:00:196 103.21.244.0 UDP 9871 7476 2024-12-20 03:16:40TRAIN DATAFRAMES AT EACH STEP:2.DataFrame with columns to encode:Index src_ip protocol3 104.128.239.2 TCP1 103.31.4.0 TCP7 10.112.171.199 TCP9 108.162.192.0 ICMP5 216.189.157.2 UDP0 103.21.244.0 UDP4 45.58.56.3 TCP2 108.162.192.0 UDPDataFrame with other columns:Index bytes_in bytes_out time3 1054 9108 2024-12-20 09:15:421 3412 7567 2024-12-19 23:33:217 553 2331 2024-12-20 01:26:519 8423 3805 2024-12-20 11:55:525 9328 7089 2024-12-20 20:50:300 2782 108 2024-12-20 11:16:234 6959 298 2024-12-20 15:30:562 8856 3510 2024-12-19 22:42:384.One Hot Encoded 2d array:0 0 0 1 0 0 0 0 1 00 0 1 0 0 0 0 0 1 01 0 0 0 0 0 0 0 1 00 0 0 0 1 0 0 1 0 00 0 0 0 0 1 0 0 0 10 1 0 0 0 0 0 0 0 10 0 0 0 0 0 1 0 1 00 0 0 0 1 0 0 0 0 15.One Hot Encoded DataFrame with Index and Column NamesIndex src_ip_10.112.171.199 src_ip_103.21.244.0 src_ip_103.31.4.0 src_ip_104.128.239.2 src_ip_108.162.192.0 src_ip_216.189.157.2 src_ip_45.58.56.3 protocol_ICMP protocol_TCP protocol_UDP3 0 0 0 1 0 0 0 0 1 01 0 0 1 0 0 0 0 0 1 07 1 0 0 0 0 0 0 0 1 09 0 0 0 0 1 0 0 1 0 05 0 0 0 0 0 1 0 0 0 10 0 1 0 0 0 0 0 0 0 14 0 0 0 0 0 0 1 0 1 02 0 0 0 0 1 0 0 0 0 16.Final DataFrame with passthrough/other columns joined backIndex src_ip_10.112.171.199 
src_ip_103.21.244.0 src_ip_103.31.4.0 src_ip_104.128.239.2 src_ip_108.162.192.0 src_ip_216.189.157.2 src_ip_45.58.56.3 protocol_ICMP protocol_TCP protocol_UDP bytes_in bytes_out time3 0 0 0 1 0 0 0 0 1 0 1054 9108 2024-12-20 09:15:421 0 0 1 0 0 0 0 0 1 0 3412 7567 2024-12-19 23:33:217 1 0 0 0 0 0 0 0 1 0 553 2331 2024-12-20 01:26:519 0 0 0 0 1 0 0 1 0 0 8423 3805 2024-12-20 11:55:525 0 0 0 0 0 1 0 0 0 1 9328 7089 2024-12-20 20:50:300 0 1 0 0 0 0 0 0 0 1 2782 108 2024-12-20 11:16:234 0 0 0 0 0 0 1 0 1 0 6959 298 2024-12-20 15:30:562 0 0 0 0 1 0 0 0 0 1 8856 3510 2024-12-19 22:42:38TEST DATAFRAMES AT EACH STEP:1.DataFrame with columns to encode:Index src_ip protocol8 10.130.94.70 TCP6 103.21.244.0 UDPDataFrame with other columns:Index bytes_in bytes_out time8 8172 5321 2024-12-20 17:00:196 9871 7476 2024-12-20 03:16:402.One Hot Encoded 2d array:0 0 0 0 0 0 0 0 1 00 1 0 0 0 0 0 0 0 13.One Hot Encoded DataFrame with Index and Column NamesIndex src_ip_10.112.171.199 src_ip_103.21.244.0 src_ip_103.31.4.0 src_ip_104.128.239.2 src_ip_108.162.192.0 src_ip_216.189.157.2 src_ip_45.58.56.3 protocol_ICMP protocol_TCP protocol_UDP8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.06 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.04.Final DataFrame with passthrough columns joined backIndex src_ip_10.112.171.199 src_ip_103.21.244.0 src_ip_103.31.4.0 src_ip_104.128.239.2 src_ip_108.162.192.0 src_ip_216.189.157.2 src_ip_45.58.56.3 protocol_ICMP protocol_TCP protocol_UDP bytes_in bytes_out time8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 8172 5321 2024-12-20 17:00:196 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 9871 7476 2024-12-20 03:16:40Note: For the local tests and autograder use the column naming scheme of joining the previous column name and the column value with an underscore (similar to above where Type -> Type_Fruit and Type_Vegetable)Note 2: Since you should only be fitting your encoder on the training data, if there are values in your test set that are different than those in the training set, you 
will denote that with 0s. In the example above, let's say we have a row in the test set with pizza, which is neither a fruit nor a vegetable for the Type_Fruit and Type_Vegetable columns: it should result in a 0 for both columns. If you don't handle these properly, you may get errors like "Test Failed: Found unknown categories".

Note 3: You may be tempted to use the pandas function get_dummies to solve this task, but it's a trap. It seems easier, but you will have to do a lot more work to make it handle a train/test split. So, we suggest you use scikit-learn's OneHotEncoder.

Useful Resources
• https://www.educative.io/blog/one-hot-encoding
• https://developers.google.com/machine-learning/data-prep/transform/transform-categorical
• https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html#sklearn.preprocessing.OneHotEncoder
• https://datascience.stackexchange.com/questions/103211/do-we-need-to-pre-process-both-the-test-and-train-data-set

INPUTS
• Use the needed instance variables you set in the __init__ method
• train_features – a dataset split by a function similar to tts, which should be used in the training/fitting steps
• test_features – a dataset split by a function similar to tts, which should be used in the test steps

OUTPUTS
a pandas DataFrame with the columns listed in one_hot_encode_cols one hot encoded and all other columns in the DataFrame unchanged

Function Skeleton
def one_hot_encode_columns_train(self, train_features: pd.DataFrame) -> pd.DataFrame:
    one_hot_encoded_dataset = pd.DataFrame()
    return one_hot_encoded_dataset

def one_hot_encode_columns_test(self, test_features: pd.DataFrame) -> pd.DataFrame:
    one_hot_encoded_dataset = pd.DataFrame()
    return one_hot_encoded_dataset

PreprocessDataset: min_max_scaled_columns_train and min_max_scaled_columns_test

Min/Max Scaling is a process to transform numerical features to a specific range, typically [0, 1], to ensure that input values are comparable (similar to how you may have heard of "normalizing" data) and is
a crucial preprocessing step for many machine learning algorithms. In particular, this standardization is essential for algorithms like linear regression, logistic regression, k-means, and neural networks, which can be sensitive to the scale of input features, whereas some algorithms like decision trees are less impacted.

By applying Min/Max Scaling, we prevent feature dominance, which ideally improves the performance and accuracy of these algorithms and improves training convergence. It's a recommended step to ensure your models are trained on consistent and standardized data.

For this assignment you should use the scikit-learn MinMaxScaler function (linked in the resources below) rather than attempting to implement your own scaling function. A rough implementation of the scikit-learn function is provided below for educational purposes:

X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min

Note: There are separate functions for the training and test datasets to help avoid data leakage between the test/train datasets.
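The fit-on-train, transform-on-both pattern this note describes can be sketched with scikit-learn's MinMaxScaler — the toy prices are made up for illustration; this is not the graded method:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

train = pd.DataFrame({"price": [0.99, 1.99, 4.89]})
test = pd.DataFrame({"price": [2.94]})

scaler = MinMaxScaler()
scaler.fit(train)  # learn min/max from the TRAIN data only

train_scaled = scaler.transform(train)  # 0.99 -> 0.0, 4.89 -> 1.0
test_scaled = scaler.transform(test)    # scaled using the TRAIN min/max, never refit
```

Note that the test value is transformed with the minimum and maximum learned from the training data, exactly as the data-leakage note above requires.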
Please refer to the 3rd link in Useful Resources for more information on how to handle this – namely, that we should still scale the test data based on our "knowledge" of the train dataset.

Example DataFrame:
Item Price Count Type
Apples 1.99 7 Fruit
Broccoli 1.29 435 Vegetable
Bananas 0.99 123 Fruit
Oranges 2.79 25 Fruit
Pineapples 4.89 5234 Fruit

Example Min/Max Scaled DataFrame (rounded to 4 decimal places):
Item Price Count Type
Apples 0.2564 7 Fruit
Broccoli 0.0769 435 Vegetable
Bananas 0 123 Fruit
Oranges 0.4615 25 Fruit
Pineapples 1 5234 Fruit

Note: For the autograder, use the same column name as the original column (ex: Price -> Price).

Useful Resources
• https://developers.google.com/machine-learning/data-prep/transform/normalization
• https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html#sklearn.preprocessing.MinMaxScaler
• https://datascience.stackexchange.com/questions/103211/do-we-need-to-pre-process-both-the-test-and-train-data-set

INPUTS
• Use the needed instance variables you set in the __init__ method
• train_features – a dataset split by a function similar to tts, which should be used in the training/fitting steps
• test_features – a dataset split by a function similar to tts, which should be used in the test steps

OUTPUTS
a pandas DataFrame with the columns listed in min_max_scale_cols min/max scaled and all other columns in the DataFrame unchanged

Function Skeleton
def min_max_scaled_columns_train(self, train_features: pd.DataFrame) -> pd.DataFrame:
    min_max_scaled_dataset = pd.DataFrame()
    return min_max_scaled_dataset

def min_max_scaled_columns_test(self, test_features: pd.DataFrame) -> pd.DataFrame:
    min_max_scaled_dataset = pd.DataFrame()
    return min_max_scaled_dataset

PreprocessDataset: pca_train and pca_test

Principal Component Analysis is a dimensionality reduction technique (column reduction). It aims to take the variance in your input columns and map the columns into N columns that contain as much of the variance as it can.
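A minimal sketch of what this looks like with scikit-learn's PCA, using made-up data (the random_state=0 and the component_n column naming follow the notes this section gives; the rest is an illustrative assumption, not the graded method):

```python
import pandas as pd
from sklearn.decomposition import PCA

train = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                      "b": [2.0, 1.0, 4.0, 3.0],
                      "c": [0.5, 1.5, 2.5, 3.5]})

n_components = 2
pca = PCA(n_components=n_components, random_state=0)
pca.fit(train)  # fit on the training features only

# rename the output columns component_1 .. component_n, keeping the original index
cols = [f"component_{i + 1}" for i in range(n_components)]
train_pca = pd.DataFrame(pca.transform(train), columns=cols, index=train.index)

print(train_pca.shape)  # (4, 2): same rows, reduced columns
```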
This technique can be useful if you are trying to train a model faster; it also has some more advanced uses, especially when training models on data that has many columns but few rows. There are separate functions for the training and test datasets because they should be handled separately to avoid data leakage (see the 3rd link in Useful Resources for a little more info on how to handle them).

Note 1: For the local tests and autograder, use the column naming scheme component_1, component_2, ..., component_n for the n_components passed into the __init__ method.
Note 2: For your PCA outputs to match the local tests and autograder, make sure you set the seed using a random state of 0 when you initialize the PCA function.
Note 3: Since PCA does not work with NA values, make sure you drop any columns that have NA values before running PCA.

Useful Resources
• https://builtin.com/data-science/step-step-explanation-principal-component-analysis
• https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html#sklearn.decomposition.PCA
• https://datascience.stackexchange.com/questions/103211/do-we-need-to-pre-process-both-the-test-and-train-data-set

INPUTS
• Use the needed instance variables you set in the __init__ method
• train_features – a dataset split by a function similar to tts which should be used in the training/fitting steps
• test_features – a dataset split by a function similar to tts which should be used in the test steps

OUTPUTS
A pandas DataFrame with the generated PCA values, using column names component_1, component_2, ..., component_n

Function Skeleton

def pca_train(self, train_features: pd.DataFrame) -> pd.DataFrame:
    # TODO: Read the function description in https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/Machine_Learning/Task2.html and implement the function as described
    pca_dataset = pd.DataFrame()
    return pca_dataset

def pca_test(self, test_features: pd.DataFrame) -> pd.DataFrame:
    # TODO: Read the function description in https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/Machine_Learning/Task2.html and implement the function as described
    pca_dataset = pd.DataFrame()
    return pca_dataset

PreprocessDataset: feature_engineering_train, feature_engineering_test

Feature engineering is the process of using domain knowledge (physics, geometry, sports statistics, business metrics, etc.) to create new features (columns) out of the existing data. This could mean creating an area feature when given the length and width of a triangle, or extracting the major and minor version number from a software version, or more complex logic depending on the scenario.

In cybersecurity in particular, feature engineering is crucial for using a domain expert's (e.g., a security analyst's) experience to identify anomalous behavior that might signify a security breach. This could involve creating features that represent deviations from established baselines, such as unusual file access patterns, unexpected network connections, or sudden spikes in CPU usage. These anomaly-based features can help distinguish malicious activity from normal system operations, but the system does not know off-hand which data patterns are anomalous – that is where you, as the domain expert, can help by creating features.

These methods utilize a dictionary, feature_engineering_functions, passed to the class constructor (__init__). This dictionary defines how to generate new features:
1. Keys: strings representing new column names.
2.
Values: functions that:
o take a DataFrame as input, and
o return a pandas Series (the new column's values).

Example of what could be passed as the feature_engineering_functions dictionary to __init__:

import pandas as pd

def double_height(dataframe: pd.DataFrame) -> pd.Series:
    return dataframe["height"] * 2

def half_height(dataframe: pd.DataFrame) -> pd.Series:
    return dataframe["height"] / 2

# Note that functions in Python can be passed around and used just like data!
example_feature_engineering_functions = {
    "double_height": double_height,
    "half_height": half_height,
}

# ...and the class may have been created like this:
# preprocessor = PreprocessDataset(..., feature_engineering_functions=example_feature_engineering_functions, ...)

In particular, this method takes in a dictionary mapping a column name to a function that takes in a DataFrame and returns a column. You will use it to create a new column with the name given by the dictionary key. Therefore, if you were given the functions above, you would create two new columns named "double_height" and "half_height" in your DataFrame.

Useful Resources
• https://en.wikipedia.org/wiki/Feature_engineering
• https://www.geeksforgeeks.org/what-is-feature-engineering/
• Passing Function as an Argument in Python – GeeksforGeeks

INPUTS
• Use the needed instance variables you set in the __init__ method
• train_features – a dataset split by a function similar to tts which should be used in the training/fitting steps
• test_features – a dataset split by a function similar to tts which should be used in the test steps

OUTPUTS
A pandas DataFrame with the features described in feature_engineering_functions added as new columns and all other columns in the DataFrame unchanged

Function Skeleton

def feature_engineering_train(self, train_features: pd.DataFrame) -> pd.DataFrame:
    feature_engineered_dataset = pd.DataFrame()
    return feature_engineered_dataset

def feature_engineering_test(self, test_features: pd.DataFrame) -> pd.DataFrame:
    feature_engineered_dataset = pd.DataFrame()
    return feature_engineered_dataset

PreprocessDataset: preprocess_train, preprocess_test

Now we will put three of the above methods together into a preprocess function. This function will take in a dataset and perform encoding, scaling, and feature engineering using the above methods and their respective columns. You should not perform PCA in this function.

Useful Resources
See the resources for one-hot encoding, min/max scaling, and feature engineering above.

INPUTS
• Use the needed instance variables you set in the __init__ method
• train_features – a dataset split by a function similar to tts which should be used in the training/fitting steps
• test_features – a dataset split by a function similar to tts which should be used in the test steps

OUTPUTS
A pandas DataFrame, for both the train and test features, with the columns in one_hot_encode_cols encoded, the columns in min_max_scale_cols scaled, and the columns described in feature_engineering_functions engineered. You do not need to use PCA here.

Function Skeleton

def preprocess_train(self, train_features: pd.DataFrame) -> pd.DataFrame:
    train_features = pd.DataFrame()
    return train_features

def preprocess_test(self, test_features: pd.DataFrame) -> pd.DataFrame:
    test_features = pd.DataFrame()
    return test_features

Task 3 (15 points)

In Task 2 you learned how to split a dataset into training and testing components. Now it's time to learn about using a K-means model. We will run a basic model on the data to cluster files (rows) with similar attributes together. We will use an unsupervised model.

Theory

An unsupervised model has no label column. By contrast, in supervised learning (which you'll see in Task 4) the data has features and targets/labels. These labels are effectively an answer key to the data in the feature columns. You don't have this answer key in unsupervised learning; instead, you're working on data without labels.
You'll need to choose algorithms that can learn from the data exclusively, without the benefit of labels.

We start with K-means because its algorithm is simple to understand. For the mathematically inclined, you can look at the underlying data structure, a Voronoi diagram. Based on squared Euclidean distances, K-means creates clusters of similar datapoints. Each cluster has a centroid; the idea is that each sample is associated/clustered with the centroid that is "closest" to it.

"Closest" is an interesting concept in higher dimensions. You can think of each feature in a dataset as a dimension in the data. If it's 2D or 3D, we can visualize it easily. Concepts of distance are clear in 2D and 3D, and they work similarly in 4+ dimensions.

If you read the Wikipedia article on K-means, you'll see a discussion of the use of squared Euclidean distances in K-means. This is compared with simple Euclidean distances in the Weber problem, and better approaches resulting from k-medians and k-medoids are discussed.

Please use scikit-learn to create the model and Yellowbrick to determine the optimal value of k for the dataset.

So far, we have functions to split the data and preprocess it. Now, we will run a basic model on the data to cluster files (rows) with similar attributes together. We will use an unsupervised model (a model with no label column), K-means. Again, use scikit-learn to create the model and Yellowbrick to determine the optimal value of k for the dataset.

Refer to the Submissions page for details about submitting your work.

Useful Links:
• Clustering – Google Developers
• Clustering Algorithms – Google Developers
• Kmeans – Google Developers

Deliverables:
• Complete the KmeansClustering class in task3.py.
• For this task we have released a local test suite; please set it up and use it to debug your functions.
• Submit task3.py to Gradescope when you pass all local tests.
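The scikit-learn plus Yellowbrick workflow just described (create a K-means model, pick k with the elbow method, then fit and return cluster ids) might be sketched as follows. This is a sketch only: the helper name is illustrative, the Yellowbrick call is shown as a comment, and the elbow value is a hard-coded placeholder rather than a computed one.

```python
import pandas as pd
from sklearn.cluster import KMeans

def kmeans_train_sketch(train_features: pd.DataFrame, random_state: int) -> list:
    # In the project, the optimal k should come from Yellowbrick, roughly:
    #   from yellowbrick.cluster import KElbowVisualizer
    #   viz = KElbowVisualizer(KMeans(random_state=random_state, n_init=10), k=(1, 10))
    #   viz.fit(train_features)
    #   best_k = viz.elbow_value_
    best_k = 3  # illustrative placeholder for the elbow value

    # Refit a plain KMeans model with the chosen k and return its labels.
    model = KMeans(n_clusters=best_k, random_state=random_state, n_init=10)
    cluster_ids = model.fit_predict(train_features)
    return cluster_ids.tolist()
```

Because random_state is fixed, repeated runs on the same data return the same cluster assignments, which is what the local tests and autograder rely on.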
Refer to the Submissions page for details.

Local Test Dataset Information

For this task, the local test dataset is the NATICUSdroid dataset, which contains 86 columns of data related to Android permissions used by benign and malicious Android applications released between 2010 and 2019. For more information, such as the introductory paper and the citations/acknowledgements, you can view the dataset's page in the UCI ML repository. In this specific case, clustering can be a useful tool for grouping apps that request similar permissions together. The team that created this dataset hypothesized that malicious apps would exhibit distinct patterns in the types of permissions they request compared to benign apps. This difference in permission request patterns could potentially be used to distinguish between malicious and benign applications.

Instructions:

The task3.py file has function skeletons that you will complete with Python code. You will mostly be using the pandas, Yellowbrick, and scikit-learn libraries. The goal of each of these functions is to give you familiarity with the applied concepts of unsupervised learning.
See information about each function's inputs, outputs, and skeleton below.

KmeansClustering

The KmeansClustering class contains a code skeleton with 4 methods for you to implement.

Note: You should train/fit using the train dataset; once you have a Yellowbrick/K-means model instance, you can transform/predict on the training and test data.

KmeansClustering: __init__

Similar to Task 1, you will initialize the class by adding instance variables as needed.

Useful Resources
• https://www.w3schools.com/python/python_classes.asp

INPUTS
• random_state – an integer that should be used to set the scikit-learn randomness so the model results will be repeatable, which is required for the tests and autograder

OUTPUTS
None

Function Skeleton

def __init__(self, random_state: int):
    # TODO: Add any state variables you may need to make your functions work
    pass

KmeansClustering: kmeans_train

K-means clustering is a process of grouping similar rows together and assigning them to a cluster. For this method you will use the training data to fit an optimal K-means clustering of the data.

To help you get started, we have provided a list of subtasks to complete for this task:
1. Initialize a scikit-learn K-means model, using random_state from the __init__ method and setting n_init = 10.
2. Try to find the best "k" to use for the K-means clustering:
   o Initialize a Yellowbrick KElbowVisualizer with the K-means model.
   o Use that visualizer to search for the optimal value of k between 1 (inclusive) and 10 (exclusive); in mathematical notation, [1, 10).
   o Use the provided resources to understand how to interpret the visualization.
3. Train the KElbowVisualizer on the training data and determine the optimal k value.
4. Now, train a K-means model with the proper initialization for that optimal value of k.
5. Return the cluster ids for each row of the training set as a list.

Useful Resources
• https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans
• https://www.scikit-yb.org/en/latest/api/cluster/elbow.html

INPUTS
• Use the needed instance variables you set in the __init__ method
• train_features – a dataset split by a function similar to tts which should be used in the training/fitting steps

OUTPUTS
A list of cluster ids that the K-means model has assigned to each row in the train dataset

Function Skeleton

def kmeans_train(self, train_features: pd.DataFrame) -> list:
    cluster_ids = list()
    return cluster_ids

KmeansClustering: kmeans_test

K-means clustering is a process of grouping similar rows together and assigning them to a cluster. For this method you will use the K-means clustering fit on the training data to assign cluster ids to the test data.

To help you get started, we have provided a list of subtasks to complete for this task:
1. Use a model similar to the one you trained in the kmeans_train method to generate cluster ids for each row of the test dataset. You should either (1) reuse the same model from kmeans_train or (2) train a new model in the test method using the training data.
2. Return the cluster ids for each row of the test set as a list.

Useful Resources
• https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans
• https://www.scikit-yb.org/en/latest/api/cluster/elbow.html

INPUTS
• Use the needed instance variables you set in the __init__ method
• test_features – a dataset split by a function similar to tts which should be used in the test steps

OUTPUTS
A list of cluster ids that the K-means model has assigned to each row in the test dataset

Function Skeleton

def kmeans_test(self, test_features: pd.DataFrame) -> list:
    cluster_ids = list()
    return cluster_ids

KmeansClustering: train_add_kmeans_cluster_id_feature, test_add_kmeans_cluster_id_feature

Using the two methods you completed above (kmeans_train and kmeans_test), you will add a new feature (column) to the training and test dataframes. This is similar to the feature engineering method in Task 2, where you appended new columns onto an existing dataframe.

To do this, use the output of the corresponding train or test method (the list of cluster ids you return), add it as a new column named kmeans_cluster_id in the input dataframe, then return the full dataframe.

Useful Resources

INPUTS
Use the needed instance variables you set in the __init__ method and the kmeans_train and kmeans_test methods you wrote above to produce the needed output.
• train_features – a dataset split by a function similar to tts which should be used in the training/fitting steps
• test_features – a dataset split by a function similar to tts which should be used in the test steps

OUTPUTS
A pandas DataFrame with kmeans_cluster_id added as a feature and all other input columns unchanged, for each of the two methods train_add_kmeans_cluster_id_feature and test_add_kmeans_cluster_id_feature

Function Skeleton

def train_add_kmeans_cluster_id_feature(self, train_features: pd.DataFrame) -> pd.DataFrame:
    output_df = pd.DataFrame()
    return output_df

def test_add_kmeans_cluster_id_feature(self, test_features: pd.DataFrame) -> pd.DataFrame:
    output_df = pd.DataFrame()
    return output_df

Task 4 (25 points)

Now let's try a few supervised classification models. We have chosen a few commonly used models for you to use here, but there are many options. In the real world, specific algorithms may fit a specific dataset better than other algorithms.

You won't be doing any hyperparameter tuning yet, so you can better focus on writing the basic code. You will:
• Train a model using the training set.
• Predict on the training/test sets.
• Calculate performance metrics.
• Return a ModelMetrics object and trained scikit-learn model from each model function.

(Note on feature importance: You should use RFE for determining feature importance of your Logistic Regression model, but do NOT use RFE for your Random Forest or Gradient Boosting models to determine feature importance. Please use their built-in values for this.)

Useful Links:
• scikit-learn: machine learning in Python — scikit-learn 1.2.1 documentation
• Training and Test Sets – Machine Learning – Google Developers
• Bias–variance tradeoff – Wikipedia
• Overfitting – Wikipedia
• An Introduction to Classification in Machine Learning – builtin
• Classification in Machine Learning: An Introduction – DataCamp

Deliverables:
• Complete the functions and methods in task4.py.
• For this task we have released a local test suite; please set it up and use it to debug your functions.
• Submit task4.py to Gradescope when you pass all local tests. Refer to the Submissions page for details.

Local Test Dataset Information

For this task, the local test dataset is the NATICUSdroid dataset, which contains 86 columns of data related to Android permissions used by benign and malicious Android applications released between 2010 and 2019. For more information, such as the introductory paper and the citations/acknowledgements, you can view the dataset's page in the UCI ML repository.
If you look at the online poster for the paper that the dataset creators wrote from their research, you'll see they trained a variety of different models, including Random Forest, Logistic Regression, and XGBoost, and calculated a variety of metrics related to training and detection performance. In this task we will guide you through training ML models and calculating performance metrics to compare the predictive abilities of different models.

Instructions:

The task4.py file has function skeletons that you will complete with Python code (mostly using the pandas and scikit-learn libraries). The goal of each of these functions is to give you familiarity with the applied concepts of training a model, using it to score records, and calculating performance metrics for it. See information about the function inputs, outputs, and skeletons below.

Table of Contents
1. ModelMetrics
2. calculate_naive_metrics
3. calculate_logistic_regression_metrics
4. calculate_decision_tree_metrics
5. calculate_random_forest_metrics

ModelMetrics

• In order to simplify the autograding, we have created a class that will hold the metrics and feature importances for a model you trained.
• You should not modify this class, but you are expected to use it in your return statements.
• This means you put your training and test metrics dictionaries and feature importance DataFrames inside a ModelMetrics object for the autograder to handle. This applies to each of the Logistic Regression, Gradient Boosting, and Random Forest models you will create.
• You do not need to return a feature importance DataFrame in the ModelMetrics value for the naive model you will create; just return None in that position of the return statement, as the given code does.

calculate_naive_metrics

A naive model is a very simple model/prediction that can help to frame how well a more sophisticated model is doing. At best, such a model has random competence at predicting things.
At worst, it's wrong all the time.

Since a naive model is incredibly basic (often a constant or randomly selected result), we can expect that any more sophisticated model we train should outperform it. If the naive model beats our trained model, it can mean that additional data (rows or columns) is needed in the dataset to improve our model. It can also mean that the dataset doesn't have a strong enough signal for the target we want to predict.

In this function, you'll implement a simple model that always predicts a constant number, regardless of the input values. Specifically, you'll use the constant integer provided as the parameter naive_assumption as the model's prediction, so the model will always output this value without considering the actual data. Afterward, you will calculate four metrics (accuracy, recall, precision, and F1-score) for both the training and test datasets. Refer to the resources below.

Useful Resources
• https://machinelearningmastery.com/how-to-develop-and-evaluate-naive-classifier-strategies-using-probability/
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score

INPUTS
• train_features – a dataset split by a function similar to the tts function you created in Task 2
• test_features – a dataset split by a function similar to the tts function you created in Task 2
• train_targets – a dataset split by a function similar to the tts function you created in Task 2
• test_targets – a dataset split by a function similar to the tts function you created in Task 2
• naive_assumption – an integer that should be used as the result from the naive model you will create

OUTPUTS
A completed ModelMetrics object with a training and test metrics dictionary, with each one of the metrics rounded to 4 decimal places

Function Skeleton

def calculate_naive_metrics(train_features: pd.DataFrame, test_features: pd.DataFrame, train_targets: pd.Series, test_targets: pd.Series, naive_assumption: int) -> ModelMetrics:
    train_metrics = {
        "accuracy": 0,
        "recall": 0,
        "precision": 0,
        "fscore": 0
    }
    test_metrics = {
        "accuracy": 0,
        "recall": 0,
        "precision": 0,
        "fscore": 0
    }
    naive_metrics = ModelMetrics("Naive", train_metrics, test_metrics, None)
    return naive_metrics

calculate_logistic_regression_metrics

A logistic regression model is a simple and more explainable statistical model that can be used to estimate the probability of an event (via log-odds). At a high level, a logistic regression model uses data in the training set to estimate a weight for each column in a linear approximation function. Conceptually, this is similar to estimating m for each column in the line formula you probably know well from geometry: y = m*x + b. If you are interested in learning more, you can read up on the math behind how this works. For this project, we are more focused on showing you how to apply these models, so you can simply use a scikit-learn LogisticRegression model in your code.

For this task, use scikit-learn's LogisticRegression class and complete the following subtasks:
• Train a Logistic Regression model (initialized using the kwargs passed into the function).
• Predict scores for the training and test datasets and calculate the 7 metrics listed below for each, using predictions from the fit model
(all rounded to 4 decimal places):
o accuracy
o recall
o precision
o fscore
o false positive rate (fpr)
o false negative rate (fnr)
o Area Under the Receiver Operating Characteristic Curve (roc_auc)
• Use RFE to select the top 10 features.
• Train a Logistic Regression model using these selected features (initialized using the kwargs passed into the function).
• Create a feature importance DataFrame from the model trained on the top 10 features:
o Use the top 10 features, sorted by the absolute value of the coefficient from biggest to smallest.
o Make sure you use the same feature and importance column names as set in ModelMetrics in feat_name_col [Feature] and imp_col [Importance].
o Round the importances to 4 decimal places (do this step after you have sorted by importance).
o Reset the index to 0-9. You can do this the same way you did in Task 1.

NOTE: Make sure you use the predicted probabilities for roc_auc.

Useful Resources
• https://stats.libretexts.org/Bookshelves/Introductory_Statistics/OpenIntro_Statistics_(Diez_et_al)./08%3A_Multiple_and_Logistic_Regression/8.04%3A_Introduction_to_Logistic_Regression
• https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score
• https://en.wikipedia.org/wiki/Confusion_matrix
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html

INPUTS
The first 4 are similar to the tts function you created in Task 2:
• train_features – a pandas DataFrame with training features
• test_features – a pandas DataFrame with test features
• train_targets – a pandas DataFrame with training targets
• test_targets – a pandas DataFrame with test targets
• logreg_kwargs – a dictionary with keyword arguments that can be passed directly to the scikit-learn LogisticRegression class

OUTPUTS
• A completed ModelMetrics object with a training and test metrics dictionary, with each one of the metrics rounded to 4 decimal places
• A scikit-learn LogisticRegression model object fit on the training set

Function Skeleton

def calculate_logistic_regression_metrics(train_features: pd.DataFrame, test_features: pd.DataFrame, train_targets: pd.Series, test_targets: pd.Series, logreg_kwargs) -> tuple[ModelMetrics, LogisticRegression]:
    model = LogisticRegression()
    train_metrics = {
        "accuracy": 0,
        "recall": 0,
        "precision": 0,
        "fscore": 0,
        "fpr": 0,
        "fnr": 0,
        "roc_auc": 0
    }
    test_metrics = {
        "accuracy": 0,
        "recall": 0,
        "precision": 0,
        "fscore": 0,
        "fpr": 0,
        "fnr": 0,
        "roc_auc": 0
    }
    log_reg_importance = pd.DataFrame()
    log_reg_metrics = ModelMetrics("Logistic Regression", train_metrics, test_metrics, log_reg_importance)
    return log_reg_metrics, model

Example of Feature Importance DataFrame

   Feature                                                                         Importance
0  android.permission.REQUEST_INSTALL_PACKAGES                                     -5.5969
1  android.permission.READ_PHONE_STATE                                              5.1587
2  android.permission.android.permission.READ_PHONE_STATE                          -4.7923
3  com.anddoes.launcher.permission.UPDATE_COUNT                                    -4.7506
4  com.samsung.android.providers.context.permission.WRITE_USE_APP_FEATURE_SURVEY   -4.4933
5  com.google.android.finsky.permission.BIND_GET_INSTALL_REFERRER_SERVICE          -4.4831
6  com.google.android.c2dm.permission.RECEIVE                                      -4.2781
7  android.permission.FOREGROUND_SERVICE                                           -4.1966
8  android.permission.USE_FINGERPRINT                                              -3.9239
9  android.permission.INTERNET                                                     -2.7991

calculate_decision_tree_metrics

A Decision Tree (DT) is a supervised learning algorithm used for both classification and regression tasks.
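Before moving on: the logistic-regression steps above (fit, compute confusion-matrix-derived rates, RFE down to the top features, build the sorted importance frame) could look roughly like the sketch below. This is only a sketch with illustrative helper names, assuming scikit-learn and pandas; it computes just a subset of the 7 required metrics to keep it short.

```python
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

def logreg_metrics_sketch(train_X, test_X, train_y, test_y, logreg_kwargs):
    model = LogisticRegression(**logreg_kwargs)
    model.fit(train_X, train_y)

    preds = model.predict(test_X)
    probs = model.predict_proba(test_X)[:, 1]  # use probabilities for roc_auc
    tn, fp, fn, tp = confusion_matrix(test_y, preds).ravel()
    metrics = {
        "accuracy": round(accuracy_score(test_y, preds), 4),
        "fpr": round(fp / (fp + tn), 4),   # false positive rate
        "fnr": round(fn / (fn + tp), 4),   # false negative rate
        "roc_auc": round(roc_auc_score(test_y, probs), 4),
    }

    # RFE down to the top features (10 in the project; fewer here if the
    # toy dataset is small), then refit on just those columns.
    n_keep = min(10, train_X.shape[1])
    rfe = RFE(LogisticRegression(**logreg_kwargs), n_features_to_select=n_keep).fit(train_X, train_y)
    top_cols = train_X.columns[rfe.support_]
    top_model = LogisticRegression(**logreg_kwargs).fit(train_X[top_cols], train_y)

    importance = (
        pd.DataFrame({"Feature": top_cols, "Importance": top_model.coef_[0]})
        .sort_values("Importance", key=abs, ascending=False)  # sort by |coefficient|
        .round({"Importance": 4})                             # round after sorting
        .reset_index(drop=True)                               # index 0..n-1
    )
    return metrics, importance
```

Note the ordering of the importance steps matches the instructions: sort by absolute coefficient first, then round, then reset the index.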
It works by recursively splitting the data into subsets based on the feature that results in the best separation of classes, typically measured using Gini impurity or entropy. Decision trees are interpretable, as the learned model can be visualized as a flowchart-like structure.

If you are interested in learning more, you can read up on the math behind how decision trees work. For this project, we are more focused on showing you how to apply these models, so you can simply use a scikit-learn DecisionTreeClassifier in your code.

For this task, use scikit-learn's DecisionTreeClassifier class and complete the following subtasks:
• Train a DT model (initialized using the kwargs passed into the function).
• Predict scores for the training and test datasets and calculate the 7 metrics listed below for each, using predictions from the fit model (all rounded to 4 decimal places):
o accuracy
o recall
o precision
o fscore
o false positive rate (fpr)
o false negative rate (fnr)
o Area Under the Receiver Operating Characteristic Curve (roc_auc)
• Create a feature importance DataFrame from the trained model:
o Do NOT use RFE for feature selection.
o Use the top 10 features selected by the built-in method (sorted from biggest to smallest).
o Make sure you use the same feature and importance column names as ModelMetrics shows in feat_name_col [Feature] and imp_col [Importance].
o Round the importances to 4 decimal places (do this step after you have sorted by importance).
o Reset the index to 0-9; you can do this the same way you did in Task 1.

NOTE: Make sure you use the predicted probabilities for roc_auc.

Useful Resources
• https://scikit-learn.org/stable/modules/tree.html
• https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score
• https://en.wikipedia.org/wiki/Confusion_matrix
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html

INPUTS
The first 4 are similar to the tts function you created in Task 2:
• train_features – a pandas DataFrame with training features
• test_features – a pandas DataFrame with test features
• train_targets – a pandas DataFrame with training targets
• test_targets – a pandas DataFrame with test targets
• tree_kwargs – a dictionary with keyword arguments that can be passed directly to the scikit-learn DecisionTreeClassifier class

OUTPUTS
• A completed ModelMetrics object with a training and test metrics dictionary, with each one of the metrics rounded to 4 decimal places
• A scikit-learn DecisionTreeClassifier model object fit on the training set

Function Skeleton

def calculate_decision_tree_metrics(train_features: pd.DataFrame, test_features: pd.DataFrame, train_targets: pd.Series, test_targets: pd.Series, tree_kwargs) -> tuple[ModelMetrics, DecisionTreeClassifier]:
    model = DecisionTreeClassifier(**tree_kwargs)
    train_metrics = {
        "accuracy": 0,
        "recall": 0,
        "precision": 0,
        "fscore": 0,
        "fpr": 0,
        "fnr": 0,
        "roc_auc": 0
    }
    test_metrics = {
        "accuracy": 0,
        "recall": 0,
        "precision": 0,
        "fscore": 0,
        "fpr": 0,
        "fnr": 0,
        "roc_auc": 0
    }
    tree_importance = pd.DataFrame()
    tree_metrics = ModelMetrics("Decision Tree", train_metrics, test_metrics, tree_importance)
    return tree_metrics, model

calculate_random_forest_metrics

A Random Forest model is a more complex model than the naive and Logistic Regression models you have trained so far.
It can still be used to estimate the probability of an event, but achieves this using a different underlying structure: a tree-based model. Conceptually, this looks a lot like many if/else statements chained together into a "tree". A Random Forest expands on this and trains different trees with different subsets of the data and starting conditions, in order to get a better estimate than a single tree would give. For this project, we are more focused on showing you how to apply these models, so you can simply use the scikit-learn Random Forest model in your code.

For this task use scikit-learn's Random Forest Classifier class and complete the following subtasks:
• Train a Random Forest model (initialized using the kwargs passed into the function).
• Predict scores for the training and test datasets and calculate the 7 metrics listed below for the training and test datasets using predictions from the fit model (all rounded to 4 decimal places):
  o accuracy
  o recall
  o precision
  o fscore
  o false positive rate (fpr)
  o false negative rate (fnr)
  o Area Under the Curve of the Receiver Operating Characteristics Curve (roc_auc)
• Create a Feature Importance DataFrame from the trained model:
  o Do not use RFE for feature selection.
  o Use the top 10 features selected by the built-in method (sorted from biggest to smallest).
  o Make sure you use the same feature and importance column names as ModelMetrics shows in feat_name_col [Feature] and imp_col [Importance].
  o Round the importances to 4 decimal places (do this step after you have sorted by Importance).
  o Reset the index to 0-9; you can do this the same way you did in task1.
NOTE: Make sure you use the predicted probabilities for roc_auc.

Useful Resources
• https://blog.dataiku.com/tree-based-models-how-they-work-in-plain-english
• https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score
• https://en.wikipedia.org/wiki/Confusion_matrix
• https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html

INPUTS
• train_features – a dataset split by a function similar to the tts function you created in task2
• test_features – a dataset split by a function similar to the tts function you created in task2
• train_targets – a dataset split by a function similar to the tts function you created in task2
• test_targets – a dataset split by a function similar to the tts function you created in task2
• rf_kwargs – a dictionary with keyword arguments that can be passed directly to the scikit-learn RandomForestClassifier class

OUTPUTS
• A completed ModelMetrics object with a training and test metrics dictionary, with each one of the metrics rounded to 4 decimal places
• A scikit-learn Random Forest model object fit on the training set

Function Skeleton

def calculate_random_forest_metrics(train_features: pd.DataFrame, test_features: pd.DataFrame, train_targets: pd.Series, test_targets: pd.Series, rf_kwargs) -> tuple[ModelMetrics, RandomForestClassifier]:
    model = RandomForestClassifier()
    train_metrics = {"accuracy": 0, "recall": 0, "precision": 0, "fscore": 0, "fpr": 0, "fnr": 0, "roc_auc": 0}
    test_metrics = {"accuracy": 0, "recall": 0, "precision": 0, "fscore": 0, "fpr": 0, "fnr": 0, "roc_auc": 0}
    rf_importance = pd.DataFrame()
    rf_metrics = ModelMetrics("Random Forest", train_metrics, test_metrics, rf_importance)
    return rf_metrics, model

Example of Feature Importance DataFrame

  Feature                                            Importance
0 android.permission.READ_PHONE_STATE               0.1871
1 com.google.android.c2dm.permission.RECEIVE        0.1165
2 android.permission.RECEIVE_BOOT_COMPLETED         0.1036
3 com.android.launcher.permission.INSTALL_SHORTCUT  0.1004
4 android.permission.ACCESS_COARSE_LOCATION         0.0921
5 android.permission.ACCESS_FINE_LOCATION           0.0531
6 android.permission.GET_TASKS                      0.0462
7 android.permission.SYSTEM_ALERT_WINDOW            0.0433
8 com.android.vending.BILLING                       0.026
9 android.permission.WRITE_SETTINGS                 0.0236

Task 5: Model Training and Evaluation (20 points)

Now that you have written functions for different steps of the model-building process, you will put it all together. You will write code that trains a model with hyperparameters you determine (you should do any tuning locally or in a notebook, i.e., don't tune your model in Gradescope, since the autograder will likely time out).
• Refer to the Submissions page for details about submitting your work.
Important: Conduct hyperparameter tuning locally or in a separate notebook. Avoid tuning within Gradescope to prevent autograder timeouts.
Develop your own local tests to ensure your code functions correctly before submitting to Gradescope. Do not share these tests with other students.

train_model_return_scores (ClaMP Dataset)
Instructions (10 points):
This function focuses on training a model using the ClaMP dataset and evaluating its performance on a test set.
1. Input:
  o train_df: A Pandas DataFrame containing the ClaMP training data. This includes the "label" column, which serves as your target variable (0 for benign, 1 for malicious).
  o test_df: A Pandas DataFrame containing the ClaMP test data. The "label" column is intentionally omitted from this set.
2. Model Training:
  o Train a machine learning model using the train_df dataset.
  o You may use any techniques covered in this project.
  o Set a random seed for reproducibility.
  o Perform hyperparameter tuning to optimize your model's performance. Tip: putting comments on the ranges you select for hyperparameters will help the graders understand how you chose them.
3. Prediction:
  o Use your trained model to predict the probability of malware for each row in the test_df.
  o Output these probabilities as values between 0 and 1. A value closer to 0 indicates a lower likelihood of malware, while a value closer to 1 indicates a higher likelihood.
4. Output:
  o Return a Pandas DataFrame with two columns:
    index: The index from the input test_df.
    prob_label_1: The predicted malware probabilities.
5. Evaluation:
  o The autograder will evaluate your predictions using the ROC AUC score.
  o You must achieve a ROC AUC score of 0.9 or higher on the test set to receive full credit.

Sample Submission (ClaMP):
index  prob_label_1
0      0.65
1      0.1
…      …

Function Skeleton (ClaMP):

import pandas as pd

def document_hyperparameter_tuning_clamp(train_df_path, test_df_path):
    """
    Please document the hyperparameter tuning process you used to tune your
    machine learning model for Task 5. You should copy and paste the
    hyperparameter process you conducted here. Place all parameters tuned and
    values in the hyperparameters dictionary. If we run your code and your
    hyperparameter function, it must generate the same hyperparameters you
    used for your function. You do not need to return anything specific, just
    document how you did your tuning. We will not run it in the autograder,
    it is just an additional check on our end.
    """
    hyperparameters = {}
    return hyperparameters

def train_model_return_scores_clamp(train_df, test_df) -> pd.DataFrame:
    """
    Trains a model on the ClaMP training data and returns predicted
    probabilities for the test data.

    Args:
        train_df (pd.DataFrame): ClaMP training data with 'label' column.
        test_df (pd.DataFrame): ClaMP test data without 'label' column.

    Returns:
        pd.DataFrame: DataFrame with 'index' and 'prob_label_1' columns.
    """
    # TODO: Implement the model training and prediction logic as described above.
    test_scores = pd.DataFrame()  # Replace with your implementation
    return test_scores

ClaMP Dataset
• The ClaMP (Classification of Malware with PE Headers) dataset is used for malware classification.
• It is based on the header fields of Portable Executable (PE) files.
• Learn more about PE files:
  o Microsoft – PE Format
  o Wikipedia – Portable Executable
• ClaMP Dataset GitHub Repository: https://github.com/urwithajit9/ClaMP
• This project uses the ClaMP_Raw-5184.csv file (55 features).

train_model_unsw_return_scores (UNSW-NB15 Dataset)
Instructions (10 points):
This function focuses on training a model using the UNSW-NB15 dataset and evaluating its performance on a test set. It will likely require exploring and understanding the dataset, data preprocessing, model selection, and hyperparameter tuning to achieve full credit.
1. Input:
  o train_df: A Pandas DataFrame containing the UNSW-NB15 training data (including the "label" column).
  o test_df: A Pandas DataFrame containing the UNSW-NB15 test data (without the "label" column).
2. Model Training:
  o Train a machine learning model using the train_df.
  o You can use any techniques from this project.
  o Set a random seed for reproducibility.
3. Prediction:
  o Predict the probability of label=1 for each row in test_df.
  o Output probabilities between 0 and 1, where values closer to 1 indicate a higher likelihood of being label=1.
4. Output:
  o Return a Pandas DataFrame with two columns:
    index: The index from the input test_df.
    prob_label_1: The predicted probabilities of label=1.
5. Evaluation:
  o The autograder will evaluate your predictions using the ROC AUC score.
  o Full credit (10 points) will be given for 0.76 and above, 5 points for 0.75 and above, and 2.5 points for 0.55 and above.
  o Parameter tuning will likely be necessary to achieve higher scores.

Sample Submission (UNSW-NB15):
index  prob_label_1
0      0.65
1      0.1
…      …

Function Skeleton (UNSW-NB15):

import pandas as pd

def document_hyperparameter_tuning_unsw(train_df_path, test_df_path):
    """
    Please document the hyperparameter tuning process you used to tune your
    machine learning model for Task 5. You should copy and paste the
    hyperparameter process you conducted here. Place all parameters tuned and
    values in the hyperparameters dictionary. If we run your code and your
    hyperparameter function, it must generate the same hyperparameters you
    used for your function. You do not need to return anything specific, just
    document how you did your tuning. We will not run it in the autograder,
    it is just an additional check on our end.
    """
    hyperparameters = {}
    return hyperparameters

def train_model_return_scores_unsw(train_df, test_df) -> pd.DataFrame:
    """
    Trains a model on the UNSW-NB15 training data and returns predicted
    probabilities for the test data.

    Args:
        train_df (pd.DataFrame): UNSW-NB15 training data with 'label' column.
        test_df (pd.DataFrame): UNSW-NB15 test data without 'label' column.

    Returns:
        pd.DataFrame: DataFrame with 'index' and 'prob_label_1' columns.
    """
    # TODO: Implement the model training and prediction logic as described above.
    test_scores = pd.DataFrame()  # Replace with your implementation
    return test_scores

UNSW-NB15 Dataset
• The UNSW-NB15 dataset was created using the IXIA PerfectStorm tool to simulate real-world network traffic and attack scenarios.
• Dataset Website: https://research.unsw.edu.au/projects/unsw-nb15-dataset
• Dataset Description
• Feature Descriptions
• Note: This project does not use all features or classes from the original UNSW-NB15 dataset.
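The train/score flow described above can be sketched with scikit-learn. This is only a minimal illustration, not the required solution: the model choice (a random forest), the tiny synthetic data, and the helper name `train_model_return_scores_sketch` are all assumptions; only the "label" column name and the index/prob_label_1 output format come from the task.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def train_model_return_scores_sketch(train_df: pd.DataFrame, test_df: pd.DataFrame) -> pd.DataFrame:
    # Split features from the target; the task states the target column is "label".
    X_train = train_df.drop(columns=["label"])
    y_train = train_df["label"]
    # Any model covered in the project may be used; a random forest is one option.
    # A fixed random_state keeps the run reproducible, as the task requires.
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)
    # predict_proba returns one column per class; column 1 is P(label=1),
    # which is what the autograder's ROC AUC scoring expects.
    probs = model.predict_proba(test_df)[:, 1]
    return pd.DataFrame({"index": test_df.index, "prob_label_1": probs})

# Tiny synthetic demo (a stand-in for the real ClaMP/UNSW data; labels are arbitrary):
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(40, 3)), columns=["f1", "f2", "f3"])
train = X.iloc[:30].assign(label=np.arange(30) % 2)
scores = train_model_return_scores_sketch(train, X.iloc[30:])
```

Note that hyperparameter values (here only `n_estimators`) would come from the tuning you document separately.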
Deliverables
1. Local Testing: We strongly encourage you to thoroughly test your code locally using the provided datasets. Create your own test sets by splitting the training data.
2. Gradescope Submission: Once you are confident in your solution, submit your task5.py file (containing both functions) to Gradescope.

Submission Instructions

Submission Limits and Requirements
• Note that you are not allowed to use any print, raise, exec, stdout or any other statements that may return output from the autograder (even if they are commented out). You will just get a blank report in Gradescope, and it uses a submission, so don't do this.
• You have a maximum of 25 submissions allowed in Gradescope.
• Submission opportunities cannot be restored once used.

Code Requirements
• Do not include any output statements in your code:
  o No print statements
  o No raise statements
  o No debug output (even if commented out)
  o Such statements will result in a blank report and waste a submission

Testing Guidelines
1. Before Gradescope submission:
  o Pass all provided local tests first.
  o The autograder is for verification, not debugging.
  o For Task 5: Create and pass your own local tests (do not share these).

Submission Process
1. File Submission:
  o You can submit all five task#.py files together (like Task1.py, Task2.py …) or submit any combination of files individually. Do not submit a whole folder (e.g. src) containing the files or the autograder will not work.
  o Ensure all necessary files are included in your submission.
  o Do not submit the task_video.py file.
  o Your submission would look like:
2. Selecting Your Final Grade:
  o Choose your best submission in Gradescope.
  o If not selected, Gradescope defaults to your latest submission.
  o We cannot select submissions for you after the project ends.
  o Your submission history will look like this:

Optional Jupyter Notebooks
________________________________________
The Jupyter Notebooks we provide with the project are designed as a jumping-off point if you find the .py Python files too complex to get started with. They are not graded. The code you write in the Notebooks will not simply copy-paste over to the graded .py Python files.
You can use Google's Colab, Notebooks in VS Code, Notebooks in PyCharm Professional or browser-based Jupyter Notebooks to practice writing your code for this assignment.
• We will provide limited support for these methods.
• Ultimately you will still have to submit the task*.py files to Gradescope.
• We will not accept ipynb files (Notebooks) for submission.
• Your code will not directly copy-paste over to the .py files you need to submit.
In the project files in Student_Local_Testing, you will find a Notebooks directory. This has four Jupyter Notebooks for your use. Please feel free to practice here to get familiar with how to write the functions and test them. It might help you to divide each task into smaller functions and test them independently. You can use the Notebooks to test out the functions, concepts and packages.
However, you will need to modify your Notebook code to the same format as our skeleton code in the .py files to pass the local tests and then get points in the autograder.

Getting Jupyter Notebooks to Recognize the cs6035_ML Environment
• It can be tricky to get the Jupyter Notebooks to recognize the cs6035_ML conda environment.
• To see the conda env in browser-based Jupyter Notebooks, run this command with the (cs6035_ML) prompt showing in an Anaconda Powershell (Windows) or a terminal (Mac or Linux):

python -m ipykernel install --user --name=cs6035_ML

• In PyCharm, you additionally need to run this command in a terminal window with conda activated to get Jupyter Notebooks to work there when you have a conda env activated:

conda install jupyter

VS Code gets its own section for Jupyter Notebooks:
• First get the project set up in VS Code.
• Open a Jupyter Notebook from the Notebooks directory.
• Notice in the upper right corner there's a button to configure the Jupyter kernel/server. Click that.
• Alternatively use Ctrl-Shift-P (Windows) and enter "select kernel" to bring up the desired menu option.
• Next you get another set of choices; choose "Python Environments".
• Next select the cs6035_ML environment.
• Now you'll see the desired environment displayed in the upper right corner.

Look in your interface (PyCharm, VS Code, browser, etc.) for ways to control and run the cells; Ctrl-Enter will run the current cell.
Ultimately you will still have to submit the various task*.py files to Gradescope. Be sure your code will run in those Python files and not just in a notebook. To accomplish this, run the local tests. Do not debug using the autograder; you only have limited submissions.

No-Credit Practice Task from Video
________________________________________
You do not need to do this Task.
In this task, you are given a class ModelMetrics and another class, NaiveBayes.
Note that this is not the same as the naive classifier you will build in Task 4, so we are not giving anything away here.
The ModelMetrics class will be used throughout Task 4, so we'll use it here too. We'll only use it for demonstration purposes, however. In the real tasks, you'll be building a model like we will here, but you'll also be calculating statistics such as false positive rate and f-score. Here we'll just use dummy values for the stats, just to demonstrate their use in the model.
Part of the ModelMetrics class is to pass back a feature importance dataset that rates each feature for its relevance to the model output. While the three models in Task 4 (logistic regression, gradient boost and random forest) provide a more direct way to access feature importance, it's not directly used in the naive Bayes classifier.

Naive Bayes Classifier: naive_bayes_train

K-means clustering is a process of grouping similar rows together and assigning them to a cluster. For this method you will use the training data to fit an optimal K-means cluster on the data.
To help you get started, we have provided a list of subtasks to complete for this task:
1. Initialize an sklearn K-means model using random_state from the __init__ method and setting n_init = 10.
2. Initialize a yellowbrick KElbowVisualizer with the K-means model to search for the optimal value of k (between 1 and 10).
3. Train the KElbowVisualizer on the training data and determine the optimal k value.
4. Now, train a K-means model with the proper initialization for that optimal value of k.
5. Return the cluster ids for each row of the training set as a list.

Useful Resources
• https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans
• https://www.scikit-yb.org/en/latest/api/cluster/elbow.html

INPUTS
Use the needed instance variables you set in the __init__ method.

OUTPUTS
A list of cluster ids that the K-means model has assigned for each row in the train dataset.

Function Skeleton

def kmeans_train(self) -> list:
    cluster_ids = list()
    return cluster_ids
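The steps above can be sketched with plain scikit-learn. The real task selects k with yellowbrick's KElbowVisualizer; here that step is approximated by scanning inertia values and taking the largest drop, so the elbow heuristic, the standalone function shape, and the demo data are all assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_train_sketch(train_data: np.ndarray, random_state: int = 0) -> list:
    # The assignment uses yellowbrick's KElbowVisualizer to pick k in [1, 10];
    # here we just scan k and record inertia as a rough stand-in.
    inertias = []
    for k in range(1, 11):
        km = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        km.fit(train_data)
        inertias.append(km.inertia_)
    # Crude elbow pick: largest absolute drop in inertia
    # (KElbowVisualizer does this far more robustly).
    drops = [inertias[i] - inertias[i + 1] for i in range(len(inertias) - 1)]
    best_k = drops.index(max(drops)) + 2  # drop i is between k=i+1 and k=i+2
    # Refit with the chosen k and return one cluster id per training row.
    final = KMeans(n_clusters=best_k, n_init=10, random_state=random_state).fit(train_data)
    return final.labels_.tolist()

# Demo on synthetic 2-D blobs:
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(c, 0.1, size=(20, 2)) for c in (0.0, 5.0, 10.0)])
ids = kmeans_train_sketch(data)
```

In the graded version this logic lives in kmeans_train(self) and reads the data and random_state from the instance variables set in __init__.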
The purpose of this assignment is to become more familiar with bit-level representations of integers. You'll solve five problems in the presentation.

2 Logistics
This is an individual project. All handins are electronic. Clarifications and corrections will be posted on the course Web page.

3 Handout Instructions
Start by copying datalab.tar to a (protected) directory on a Linux machine in which you plan to do your work. Then give the command

unix> tar xvf datalab.tar

This will cause a number of files to be unpacked in the directory. The only file you will be modifying and turning in is bits.c. The bits.c file contains a skeleton for each of the 5 programming problems. Your assignment is to complete each function skeleton using only straight-line code for the integer problems (i.e., no loops or conditionals) and a limited number of C arithmetic and logical operators. Specifically, you are only allowed to use the following eight operators:

! ~ & ^ | + << >>

A few of the functions further restrict this list. Also, you are not allowed to use any constants longer than 8 bits. See the comments in bits.c for detailed rules and a discussion of the desired coding style.

4 The Problems
This section describes the problems that you will be solving in bits.c. We have 5 problems: bitNor, isZero, addOK, logicalShift, and absVal. Using the legal operations that we allow, you need to solve the problems so that the functions execute the desired behavior. Please refer to the presentation for detailed instructions.

Name               Description                                       Rating
bitNor(x,y)        ~(x | y) using only ~ and &                       1
isZero(x)          return 0 if x is non-zero, else 1                 1
addOK(x,y)         Determine if we can compute x+y without overflow  3
absVal(x)          absolute value of x                               4
logicalShift(x,n)  Shift right logical                               3

Table 1: Bit-Level Manipulation Functions.

5 Evaluation
Your score will be computed out of a maximum of 12 points.
Correctness points. We will evaluate your functions.
You will get full credit for a problem if it passes all of the tests, and no credit otherwise.

Autograding your work
We have included some autograding tools in the handout directory — btest, dlc, and driver.pl — to help you check the correctness of your work.
• btest: This program checks the functional correctness of the functions in bits.c. To build and use it, type the following two commands:

unix> make
unix> ./btest

Notice that you must rebuild btest each time you modify your bits.c file. You'll find it helpful to work through the functions one at a time, testing each one as you go. You can use the -f flag to instruct btest to test only a single function:

unix> ./btest -f bitNor

6 Handin Instructions
Upload your source file bits.c and your report in plms. You need to explain your answers in the report. The file name format is (student number)_(your name).c / .pdf.

7 Advice
• The dlc program enforces a stricter form of C declarations than is the case for C++ or than is enforced by gcc. In particular, any declaration must appear in a block (what you enclose in curly braces) before any statement that is not a declaration. For example, it will complain about the following code:

int foo(int x)
{
    int a = x;
    a *= 3;     /* Statement that is not a declaration */
    int b = a;  /* ERROR: Declaration not allowed here */
}
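The identities behind several of the problems in Table 1 can be checked outside of C. Below is a Python sketch of the underlying bit tricks; it is illustrative only (the assignment itself must be written in C in bits.c under the operator restrictions), and since Python integers are unbounded, results are masked to 32 bits to mimic C's int.

```python
M32 = 0xFFFFFFFF  # mask down to 32 bits, since Python ints are unbounded

def bit_nor(x, y):
    # De Morgan's law: ~(x | y) == ~x & ~y, which uses only the allowed ~ and &.
    return (~x & ~y) & M32

def is_zero(x):
    # In C this is simply !x: 1 when x == 0, else 0.
    return int((x & M32) == 0)

def logical_shift(x, n):
    # C's >> on a signed int copies the sign bit (arithmetic shift);
    # masking to 32 bits first emulates a logical, zero-filling right shift.
    return (x & M32) >> n

# Quick checks of the identities:
assert bit_nor(0, 0) == M32
assert bit_nor(M32, 0) == 0
assert is_zero(0) == 1 and is_zero(7) == 0
assert logical_shift(0x80000000, 4) == 0x08000000
```

In bits.c the same effects have to be achieved with only the eight permitted operators, which is exactly what makes problems like logicalShift nontrivial.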
At one time, millions of bison roamed the North American Great Plains in large herds. Now their numbers are much smaller, prompting preservation efforts. These efforts include tagging selected bison with tracking and status devices so that information regarding roaming range and health statistics can be collected.

Wild Bill 1 in shorter grass
A bison selected for tagging, designated Wild Bill 1, is in an area of the plains covered with extremely tall range grass. The Wildlife Department of the Interior has asked Ranger Smith to tag Wild Bill 1. Ranger Smith will use a humane tranquilizer dart and then tag the sedated bison.
Ranger Smith will land in a helicopter in an area near where Wild Bill 1 is grazing. He will stealthily stalk the bison by crawling on his hands and knees downwind through the tall grass so he can approach undetected.
This is very challenging, since Smith is having trouble spotting Wild Bill from this vantage point. The grass in front of Smith looks like a string of n (1
1. As a user (customer, deliverer, or client), I would like to register with the application using my name, age, email address, mobile number, and password so that my records are kept private and secure.
2. As a user, I want to authenticate myself using my email and password so I can access my private records and conduct transactions securely.
3. As a user, I would like to be able to see my registered details (my name, age, email address and mobile number).
4. As a user, I want to be able to log out from the app so that the next person cannot use my account.
5. As a customer, I would like to register with my X/Y location so that my food can be delivered to me.
6. As a deliverer, I would like to register with my licence plate number so that customers and clients know who I am. (All vehicles have licence plates in this scenario, including bicycles.)
7. As a client, I would like to register my restaurant with its name, its style (Italian, French, Chinese, Japanese, American, Australian), and X/Y location so that customers can order food from my restaurant.
8. As a client, I would like to be able to see the restaurant name, style and location that I registered so that I can ensure these details are correct.
9. As a client, I would like to add items to my menu by including their name and price so that customers will be able to order specific items.
10. As a customer, I would like to see each restaurant's name, X/Y location, distance, style, and average rating so that I can order food from them.
11. As a customer, I would like to be able to see the available restaurants sorted by my choice of name (alphabetically), distance, style or average rating so I can find the type of restaurant I'm looking for.
12. As a customer, I would like to see the menu from each restaurant so I can decide which one to order from.
13. As a customer, I would like to see details about each restaurant so that I can choose one.
14. As a customer, I would like to see the menu items from my selected restaurant so I can choose which item to order.
15. As a customer, I would like to order items from the menu and specify the quantity of those items so I can get some food.
17. As a customer, I would like to be able to see the location I registered with, as well as how many orders I have made, and keep track of how much of my money I've spent getting food delivered.
18. As a client, I would like to see all the orders from my restaurant so that I know what items to cook.
19. As a client, I would like to change the status of an order from "ordered" to "cooking" to indicate that I have started cooking the meal.
20. As a client, I would like to change the status of an order from "cooking" to "cooked" to indicate that I have finished cooking the meal.
21. As a deliverer, I would like to see a list of orders that are available to pick up, and to be able to select one of them to claim the order and prevent any other deliverer from taking it.
22. As a deliverer, I would like to see the total distance from myself to the restaurant and to the customer to determine how far I need to travel when choosing which order to take.
23. As a deliverer, I want to be able to see which order I'm currently delivering, the restaurant (and its location) I am picking it up from and the customer (and their location) I am delivering it to, in case I forget.
24. As a deliverer, I would like to set my status as "at the restaurant" once I have arrived so I can pick up the order.
26. As a client, I would like to change the status of an order from "cooked" to "being delivered" once the driver has left the restaurant so the customer is informed.
27. As a customer, I would like to see the current status of my order so I can be assured that my order is being handled professionally.
28. As a customer, I would like to see the licence plate number of the deliverer so I know they are the correct person.
29. As a deliverer, I would like to change the status of an order from "being delivered" to "delivered" once I have arrived at the customer's house.
30. As a customer, I would like to rate the restaurant with a short comment and a star rating (between 1 and 5 stars) to inform others about my customer experience.
31. As a client, I would like the order to be removed from my order list once it has been delivered so that I can focus on other orders.

User Interface
The Arriba Eats company has plans to eventually develop a graphical user interface for this application; however, the prototype implementation will have a simple text-based interface. To make it easier to switch to a graphical user interface in the future, we want to ensure the user interface code is kept separate from the application logic. The detailed specification for this application's user interface is available here: CAB201 OO Assignment – Interface Specification

Database
The Arriba Eats company plans to eventually use a relational database to store and save all customer data. However, for this prototype, all data will be kept in memory (and will therefore be lost when the application is shut down).

Recommended Approach
You are encouraged to design and develop the application using an iterative and incremental approach — implementing one user story at a time. You can implement the user stories in any order you choose; however, for each new user story, you should design and implement only the classes and methods required to implement that user story. The following points need to be emphasised.
1. In general, the user stories are designed to become more complex as you progress down the list. However, some user stories rely on others being implemented first.
2. When registering a user, only the customer's input for the shared data (name, age, email address, mobile number, and password) will be tested. It is recommended that you start by registering a customer rather than the other types of users.
The other user-type-specific data will be tested when the other types are registered.
3. You have to submit your code to Gradescope, so you have to make sure that your output matches Gradescope's output perfectly. There is also a video on Canvas that provides an example of the application and demonstrates how to achieve some of the user stories.
4. The interaction is menu based and ultimately uses the standard input and output statements. As you progress, you do not have to implement every menu option, only to display those options to the screen. This will allow you to 'fast track' throughout your testing.

Restrictions
The following restrictions will apply to your input.
• Name: Must consist of at least one letter and only letters, spaces, apostrophes and hyphens. Example of valid input: Tommy Galvin
• Age: An integer value between 18 and 100 inclusive. Example: 20
• Mobile: Must consist of only numbers, be exactly 10 characters long, and have a leading zero. Example: 0123456789
• Email: Must include exactly one @ character and have at least one other character on either side. Example: [email protected]
• Password: Must be at least 8 characters long and contain at least 1 number, 1 lowercase letter and 1 uppercase letter. Example: HelloWorld12
• Food Style: Italian, French, Chinese, Japanese, American or Australian. Note, this will be entered via a menu option, so you will need to enter a numeric value between 1 and 6 inclusive. Example: 3 (which is Chinese)
• Restaurant Name: Must consist of at least one non-whitespace character. Example: Alan's Bar & Grill
• Location: Must be of the format X,Y where X and Y are both integer values. Example: 3,4
• Item Price: Must be between $0.00 and $999.99. Example: $12.50
In addition to the above, the following restrictions will apply:
1. Every user must be registered before performing any other actions.
2. Each email address must be unique.

Test Cases
Assignment 2 – Inputs and Outputs.zip
We have provided a set of input and output text files that you can use to develop your assignment. These files are exact duplicates of the Gradescope test cases. The input files capture your typed input, such as menu responses and names. The output files record the results that would typically be displayed on the screen. This will enable you to test each case individually, rather than going through Gradescope every time. However, once you have implemented each test successfully, we recommend resubmitting to Gradescope to make sure that your submission does not introduce errors to cases that you have previously passed.
There are two ways you can use these files.
• You can open up the contents of one of the 'Inputs' files in a text editor, copy the contents of the file and then paste it into your program. You can then scroll up and read the output of the program, comparing it with the respective 'RefOutputs' file to see when errors crop up.
• You can also supply your inputs to your assignment's executable using input redirection. This will also allow you to redirect the output for easier comparisons (e.g. using a tool like WinMerge). You can use redirection with the less-than symbol (<). To execute the program using an input file, use executable < inputfile. For example, if your executable is named ArribaEats, you would run ArribaEats < "Inputs\1. Startup Menu.txt" to execute the first case study.

C:\cab201\ArribaEats\bin\Debug\net8.0>ArribaEats < "Inputs\1. Startup Menu.txt"
Welcome to Arriba Eats!
Please make a choice from the menu below:
1: Login as a registered user
2: Register as a new user
3: Exit
Please enter a choice between 1 and 3: Thank you for using Arriba Eats!

C:\cab201\ArribaEats\bin\Debug\net8.0>

Your screen will display the output.
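If you redirect the program's output into a file, the file comparison step can also be scripted on any platform. A minimal Python sketch using only the standard library (the function name and file paths are illustrative, not part of the assignment):

```python
import difflib

def compare_outputs(ref_path: str, mine_path: str) -> list:
    # Read both files and return a unified diff as a list of lines;
    # an empty list means the files match (like "FC: no differences encountered").
    with open(ref_path, encoding="utf-8") as f:
        ref = f.readlines()
    with open(mine_path, encoding="utf-8") as f:
        mine = f.readlines()
    return list(difflib.unified_diff(ref, mine, fromfile=ref_path, tofile=mine_path))

# Usage (paths follow the example file names used in this section):
# diff = compare_outputs("RefOutputs/1. Startup Menu.txt", "MyOutput.txt")
# an empty diff means your output matches the reference exactly
```

This catches the same issues as fc or WinMerge, including the trailing-space differences mentioned later, since the diff is line-by-line and byte-exact.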
You can compare this output with the corresponding Gradescope output either by manually inspecting the corresponding file in the RefOutputs folder, or by redirecting the output of your program into a file and then using the fc command:

C:\cab201\ArribaEats\bin\Debug\net8.0>ArribaEats < "Inputs\1. Startup Menu.txt" > MyOutput.txt
C:\cab201\ArribaEats\bin\Debug\net8.0>fc "RefOutputs\1. Startup Menu.txt" MyOutput.txt
Comparing files RefOutputs\1. Startup Menu.txt and MYOUTPUT.TXT
FC: no differences encountered

Remember, anytime you enter invalid input, the program will loop until valid data is provided. This could potentially lead to an infinite loop if your implementation handles errors incorrectly.

Test Cases
Number  Description
01  Startup Menu
02  Register Customer
03  Register Customer – Invalid Name
04  Register Customer – Invalid Age
05  Register Customer – Invalid Email
06  Register Customer – Invalid Mobile Number
07  Register Customer – Invalid Password
08  Register Customer – Invalid Location
09  Register Customer – Email Already In Use
10  Login as Customer
11  Login as Customer – Wrong Email or Password
12  Display User Information – Customer
13  Login as Customer, try each option
14  Display User Information – Multiple Customers
15  Register Deliverer and Login
16  Register Deliverer – Invalid Plate
17  Display User Information – Deliverer
18  Register Client and Login
19  Register Client – Invalid Restaurant Name
20  Display User Information – Client
21  Add item to menu
22  Add item to menu – Invalid Price
23  Place order
24  Display User Information – after placing order
25  Place order – Multiple Customers
26  Place order – multiple instances of same item in order
27  Multiple menu items
28  Accept delivery job
29  Display User Information – after accepting job
30  See orders and cook order
32  Deliver order
33  Write review
34  Display User Information – after placing multiple orders
35  Multiple orders, deliverers and reviews
36  Multiple customers, reviews
37  Multiple restaurants, ordered alphabetically
38  Multiple restaurants, ordered by distance
39  Multiple restaurants, ordered by style
40  Multiple restaurants, ordered by average rating

Code Quality

Code Smells
Code smells are issues that indicate poor code quality. While the code may be functional and pass Gradescope tests, you may be marked down when your code quality is assessed. Note that the list is diverse: some smells can be fixed after your first attempt at writing code, but others should be avoided from the start. Some are simple issues, while others are more serious. Refactoring is a good way of addressing them (https://refactoring.guru/refactoring, https://refactoring.com/).
• Inappropriate identifiers.
• Fields not segmented into classes correctly.
• Too many fields clumped together (an indication that they should form a separate class).
• Overusing basic types instead of objects.
• Not using types correctly (e.g. Enums, Booleans, etc.).
• Not using constants correctly.
• Using a constant or literal for the length of a collection rather than calling the length method.
• Methods not segmented into classes correctly.
• Duplicate code (which probably should be moved into a method).
• Inappropriate visibility for fields, methods, or properties.
• Inappropriate use of getters and setters.
• Methods directly interact with data in other classes.
• A small change requires changes in many places.
• Excessive use of switch/if statements instead of using inheritance or polymorphism.
• A "God" object that does too much and has responsibilities that should belong to other classes.
• Inappropriate global fields or classes (usually via the keyword static).
• Too many responsibilities in a class.
• Too many responsibilities in an interface.
• Structs used inappropriately instead of classes (i.e. classes without methods).
• Too few responsibilities in classes.
• Use of magic numbers or values.
• Methods not segmented correctly within a class (e.g. one large method instead of several smaller ones).
• Code repetition.
• Errors in input handling are not placed inside a loop but instead handled with a "backup" statement (i.e. if it fails the first time, it gets the correct input the second time but does not handle multiple incorrect inputs).
• Incorrect use of static fields in an interface that is then implemented by a class (these should be placed in a parent class instead).
• Inappropriate use of an abstract class.
• Interfaces are not used correctly.
• Subclasses are not used correctly.
• Data classes directly interact with the user interface (which they shouldn't).
• Managing a collection of objects is placed within the individual object.
• Methods are tightly coupled via control coupling, where special parameter values are used to control internal operations.
• Classes are coupled via stamp coupling, where a complex data object is passed as a parameter but only part of it is used in each class.

Gradescope Smells
• Using Write instead of WriteLine can result in messy or unclear output.
• Be careful of trailing spaces when printing to the screen, as text will look the same but will be interpreted differently by Gradescope.

Marking Criteria
When you submit your code to Gradescope, it will be automatically run against a battery of test cases, checking many different combinations of customers, restaurants, restaurant items, delivery drivers and orders, testing different paths through your program. These have been designed to vary in complexity, ensuring that you can pass some test cases even with a very early implementation. For this reason, it is recommended that you submit to Gradescope early and often, to ensure that your code is successfully compiling and running on our grading platform. Each test case has the same structure, where you will be presented with a menu and you have to enter input to navigate it.
One last note: While we do not vary the test cases from run to run, if we discover during manual inspection of your program that you have hard-coded our test cases and results (that is, instead of implementing the restaurant ordering and delivery logic, you have built your program to specifically recognise our test cases and produce the output we expect), you will receive a score of 0 for functionality, and likely a poor score for object-oriented design and implementation, since it will not be considered useful work.

Your use of abstraction is marked by analysing your source code. Under this criterion, the following will be penalised:
• Overly long or complex methods that should be broken up into simpler methods
• Repetitive coding constructs that should be a loop instead
• Sections of code or methods that are overly similar to other sections of code/methods and should be abstracted out into a common method
• Use of magic numbers in your code (these should be replaced by declaring a const field with a descriptive name containing the value)
In general, we are looking for anything other than code clarity and your OOP design (as those are marked in their own criteria) that makes your code more difficult to maintain.

Mark / Marking criterion
7: High Distinction
7 - The program contains no code that could be improved by abstracting it into an additional method, and all members are appropriately named.
6.5 - The program can be improved by abstracting out at least one additional method or correcting at least one instance of an inappropriately named member.
6: Distinction
6 - The program features no repetition, and almost all members have been appropriately named, with all methods being less than 50 lines long.
5: Credit
5 - The program features no repetition, most members are appropriately named, and the code is mostly broken up into appropriately sized methods, with nearly all methods being under 50 lines.
4: Pass
4 - The program features no or only minor instances of repetition, most members are appropriately named, and most methods are less than 50 lines long.
3: Marginal Fail
3 - The program features several instances of repetition that could be abstracted out into methods, and most members are appropriately named.
2: Fail
2 - The program features long methods with large amounts of repetition.
1: Low Fail
1 - The program features only a very small number of methods.
0.5 - The program features at least one method other than main().
0 - Everything is in main() or too little code was submitted to evaluate the use of abstraction.

Your object-oriented design and implementation will be checked by looking at your class inheritance hierarchy and examining source code to check your use of polymorphism and encapsulation, as well as the private/protected/internal/public state of your methods, fields and properties.

7: High Distinction
10 - Features high-quality class design and interaction that are implemented extremely well, with no changes needed. Advanced object-oriented design and SOLID principles are fully followed.
9.5 - Features high-quality class design and interaction that are implemented extremely well, but with at least one change needed. Advanced object-oriented design and SOLID principles are almost fully followed.
9 - Features high-quality class design and interaction that are implemented extremely well, but with a few minor changes needed. Advanced object-oriented design and SOLID principles are almost fully followed.
8.5 - Features high-quality class design and interaction that are implemented extremely well, but with some minor changes needed. Advanced object-oriented design and SOLID principles are almost always followed.
6: Distinction
8 - Features a quality class design, well-structured interactions, and solid implementations. Advanced object-oriented paradigms like inheritance and polymorphism are implemented well, with only slight improvements needed. SOLID principles are almost always followed, but there is at least one significant instance of divergence.
7.5 - Features a quality class design, well-structured interactions, and solid implementations. Class design and interactions are implemented well. Advanced object-oriented paradigms like inheritance and polymorphism are implemented but could be improved. SOLID principles are almost always followed, but there is at least one significant instance of divergence.
5: Credit
7 - Features a thoughtful object-oriented design with appropriate classes. More than one instance of an advanced object-oriented paradigm is implemented. Several cases of SOLID design are implemented, but there are some minor divergences.
6.5 - Features a thoughtful object-oriented design with appropriate classes. At least one instance of an advanced object-oriented paradigm is implemented. Several cases of SOLID design are implemented, but there are some minor divergences.
4: Pass
6 - The program is implemented using classes with appropriate division of data and methods, suitable interaction between classes, and good object-oriented design. There are multiple minor examples of mid-level object-oriented design, such as method overloading. However, it lacks advanced object-oriented design and implementation, such as inheritance and polymorphism, and it mostly diverges from SOLID principles.
5.5 - The program is implemented using classes with appropriate division of data and methods, suitable interaction between classes, and fair object-oriented design. There is at least one instance of mid-level object-oriented design, such as method overloading. However, it lacks advanced object-oriented design and implementation, such as inheritance and polymorphism, and it majorly diverges from SOLID principles.
5 - The program is implemented using classes with an appropriate division of data and methods, with suitable interaction between classes and some object-oriented design. However, significant improvements are needed. There are no instances of mid- to advanced object-oriented design, such as inheritance, polymorphism, and method overloading, and it majorly diverges from SOLID principles.
3: Marginal Fail
4 - The program is implemented using a suitable number of classes, mostly with appropriate data, methods, and interaction, but with little use of object-oriented design or programming, and it is poorly aligned with SOLID principles.
2: Fail
3 - A small number of classes exist, or there are many more classes than needed. All suitable classes contain both data and methods, but they are poorly structured.
1: Low Fail
2 - A very small number of classes exist, but there is too much separation between data and methods, with many classes just acting as data containers (like structs). Alternatively, there are many more classes than needed, leading to a very confusing structure.
1 - More than one class exists. However, there are poor interactions between the classes. Too many classes act as data containers (like structs), with methods implemented in only one class. Additionally, there are too many static classes.
0 - No object-oriented design, entirely procedural code in one class, or only uses global variables. Alternatively, too little code was submitted to evaluate the appropriateness of the design.

Your coupling and cohesion will be checked by examining class interdependencies and looking at the methods/fields/properties present in each class.
As a refresher, coupling is something you are trying to minimise (it means unhealthy dependencies between classes that make your code less reusable), while cohesion is something you are trying to maximise (it means each class is responsible for a single unified task and all methods/fields/properties support that task).

7: High Distinction
8 - Coupling is minimal, and cohesion is maximal, with no code changes required.
7 - The classes are highly cohesive, but there are some minor cases of negative coupling.
6: Distinction
6 - Classes are highly cohesive, and the worst types of coupling (content, common, temporal) are avoided.
5: Credit
5 - Classes are highly cohesive, but some forms of harmful coupling are present.
4: Pass
4.5 - Classes are highly cohesive, with responsibility solely contained within each appropriate class.
4 - Classes are mostly cohesive, but some responsibilities are creeping into other classes.
3: Marginal Fail
3 - Classes are fairly cohesive, but some responsibilities are mixed.
2.5 - Data is placed in appropriate classes, and some of the related responsibilities are implemented in the correct class along with the data.
2: Fail
2 - Classes have poor cohesion, and responsibilities are mixed.
1.5 - Data is placed in appropriate classes, but most functionality is implemented in a small number of central classes that manipulate the state of instances of other classes.
1: Low Fail
1 - Almost all functionality is placed in a central class that manipulates the state of instances of other classes.
0 - The program is not divided into multiple classes, or too little code was submitted to evaluate the quality of the design.

Code clarity is marked by looking at how your choice of identifiers (names of classes, methods, properties, fields, local variables) and your program’s flow (use of loops / branching) affects the comprehensibility and maintainability of your code.
For comments, we are looking for two different kinds: top-level XML tag comments (///) and body-level comments (usually single-line // comments).

7: High Distinction
5 - All public classes, methods, and properties feature C# XML documentation comments at the top level, explicitly defining the code’s external interface. Appropriate comments are used to explain complex code.
6: Distinction
4.25 - Almost all public classes, methods, and properties feature C# XML documentation comments at the top level, explicitly defining the code’s external interface, and appropriate comments are almost always used to explain complex code.
5: Credit
3.75 - Most public classes, methods, and properties feature C# XML documentation comments at the top level, explicitly defining the code’s external interface, and appropriate comments are used most of the time to explain complex code.
4: Pass
2.5 - Most public classes, methods, and properties feature C# XML documentation comments at the top level, explicitly defining the code’s external interface, and appropriate comments are frequently used to explain complex code.
3: Marginal Fail
2 - Useful comments are sparse, or comments are far too frequent to the point that they interfere with reading the code, or identifiers are poorly chosen, or the program flow is confusing.
2: Fail
1 - There are only a minimal number of useful comments throughout the code.
1: Low Fail
0 - No useful comments are present, or too little code was submitted to evaluate the quality of the comments.
Introduction
The purpose of this assignment is for you to explore the use of signals as an inter-process communication technique. You will learn how to register a signal handler as well as how to send, suspend, and view pending signals. Work on the assignment is to be completed individually. You are welcome to collaborate with class members, but the submitted assignment must be your work alone. The assignment is to be completed in GitHub Classroom. The assignment is here: https://classroom.github.com/a/H_7Ef-0a. Follow the link to accept the assignment, then clone your new repository. All work is to be done on the main branch.

Background and References
Signals are used as a notification mechanism between processes. Signals are unique in the sense that there isn’t necessarily any data associated with the signal; rather, the receiver of a signal is only notified that it happened. There are, however, different signal types (integer values) to indicate different notifications. Think of it as being similar to the case where someone taps you on the shoulder to get your attention. If you want to know why they are getting your attention, you have to ask. In programming, signals work in a similar way: a process registers a handler for the signal. This handler is invoked when the signal is received by the process. Once the signal handler completes (returns), execution continues exactly where it was prior to the receipt of the signal. If there is no handler registered, the operating system takes a default action.
• Signal Manual Page – https://man7.org/linux/man-pages/man7/signal.7.html
• Sending a Signal with the kill system call – https://man7.org/linux/man-pages/man2/kill.2.html
Note the similarity of the kill system call to the kill command-line command that was demonstrated in lecture, which is documented here: https://man7.org/linux/man-pages/man1/kill.1.html

Project Description
Part 1: Signal Research
Research POSIX signals and be able to answer the following questions:
• What is a signal disposition?
• What is a signal handler? What is it used for?
• Name and describe each of the five (5) default dispositions.
• Name and describe one way to programmatically send a signal to a process. Be able to give an example (the code) to send a signal.
• Name and describe one way to send a signal to a process from the command line. Be able to give an example (the command, key combination, etc.) to send a signal.

Each signal has a corresponding type. Research POSIX signal types. For EACH of the following signal types:
• SIGINT
• SIGTERM
• SIGUSR1
• SIGKILL
• SIGSTOP
Be able to:
• Name and describe the signal
• Define the default disposition taken by the operating system if a process does not define a signal handler
• State whether the disposition can be overridden by a signal handler, and why you think this is the case

Part 2: Working with a Signal Handler
The source file signal_handler.c (in the assignment repository) contains the source for a program that registers a signal handler. Compile and run the signal handler code.
• Determine two (2) ways to send the SIGINT signal to the process created for the running program.
  o TIP: Research the kill command.
  o TIP: Some signals can be sent using key combinations from the command line (CTRL + other characters).
• Be able to describe how you sent SIGINT to the process and the behavior of the process when SIGINT is handled.
Modify the code so that it does NOT exit inside the signal handler.
• Compile and run the program and send SIGINT to the process.
• Be able to describe the behavior.
• Determine how to make the process exit. TIP: Research SIGKILL.
• Be sure to update comments at the top of the source file and commit your changes to the file.

Part 3: Signals Sent From the Operating System
There are times when a running program performs an operation that is either not allowed or asks the operating system to send a signal to notify the process of the status of the system.

Using SIGALRM
One example of a notification sent by the operating system is SIGALRM, which is sent as the result of a user calling the alarm function. Research the alarm system call and the SIGALRM signal.
• Write a program (signal_alarm.c) that schedules an alarm to be sent to the process after 5 seconds. Then write a signal handler to print out that the signal was received.

Handling SIGSEGV
Consider the following program (also included in signal_segfault.c):

    #include <stdio.h>

    int main(int argc, char* argv[]) {
        // Declare a null pointer
        int* i = NULL;
        // Dereference the null pointer
        printf("The value of i is: %d\n", *i);
        // Return to exit the program
        return 0;
    }

The dereference of i causes a segmentation fault when the program attempts to load memory from a NULL pointer. The segmentation fault is actually the result of a signal sent by the operating system. Here, the signal is called SIGSEGV – SEGV stands for segmentation violation. Research the SIGSEGV signal.
• Modify signal_segfault.c to install a handler for SIGSEGV. In your handler, print a message that a segmentation fault was received, then return without performing any other action.
• Run the program and observe the results. What do you observe? Why is this happening? TIP: Note that when any signal is handled by a signal handler, execution of the program returns exactly where it left off before the signal was received. In this case, execution continues by re-running the statement that dereferences the NULL pointer.
• Be sure to update comments at the top of the source file and commit your changes to the file.
Part 3: Getting Details from a Received Signal
While the signal system call is one way to register a signal handler with the operating system, it does not allow the signal handler to receive much information about the received signal (just the signal number). What if the signal handler needs more information, for example which process sent the signal? For that, there is another system call, sigaction, which can be used to register a signal handler that receives a structure of values for each received signal. The sigaction system call allows the process to customize a lot about the signal handler. It can set a signal to be completely ignored (not allowed for SIGSTOP or SIGKILL), retrieve the memory address that caused a fault (e.g. for SIGSEGV), and get the process identifier of the sender, among other things. Research the sigaction system call.
• Write a program (signal_sigaction.c) that uses sigaction to register a handler for the SIGUSR1 signal.
• After registering, have the program wait in an infinite loop (see previous examples). The program doesn’t need to print anything inside the loop.
• In the signal handler registered using sigaction, print out the process identifier of the sender, then return.
• Send SIGUSR1 to the process and observe the output.
• Write and record in the comments of the program a command that can be used to send SIGUSR1 to the process.
HINT: Pay attention to the sa_flags field of the sigaction structure. Read about the SA_SIGINFO flag.

Part 4: Sending Data with a Signal
While using sigaction it is possible to retrieve information about a received signal, for example:
• the address of the violating memory access in the case of SIGSEGV
• the process identifier of the sending process
What if a process wants to send data along with a signal, for example when using one of the user signals (e.g. SIGUSR1)? The kill system call doesn’t allow for data to be sent with the signal.
To send data along with the signal, a process can use the sigqueue system call: https://man7.org/linux/man-pages/man3/sigqueue.3.html. Research the sigqueue system call.
• Write a program (recv_signal.c) that uses sigaction to register a handler for the SIGUSR1 signal.
• After registering, have the program wait in an infinite loop (see previous examples). The program doesn’t need to print anything inside the loop.
• In the signal handler registered using sigaction, retrieve the sival_int, print out this value, then return.
• Write a second program (send_signal.c) that sends SIGUSR1 along with a random integer (see https://man7.org/linux/man-pages/man3/srand.3.html) to the process using sigqueue. Print this number in the sending program before sending SIGUSR1.
• NOTE: To send a number that is sufficiently random, make sure to seed the random number generator with srand. See the manual page for more information. Using the time function is a sufficient way to seed the generator (https://www.tutorialspoint.com/c_standard_library/c_function_srand.htm).
• NOTE: The process identifier assigned to a program changes every time a program is executed. You will need to figure out how to send the message to the correct process.
• HINT: Utilize the command line and atoi to convert a command-line string to a number.
HINT: Pay attention to the sa_flags field of the sigaction structure. Read about the SA_SIGINFO flag.
NOTE: You’ll be creating two (2) programs for this exercise. Make sure you specify the -o parameter on gcc to get different executable names for each program. For example:

    gcc -o send_signal send_signal.c
    gcc -o recv_signal recv_signal.c

Part 5: Signal Tennis
This part is OPTIONAL and worth 20 extra credit points on the assignment. Template files are not in the repository; create and add files to the repository for your implementation. Given what you learned about signals, use this knowledge to write two (2) programs to play signal tennis.
These two programs will send signals back and forth to each other to simulate a tennis ball. The signal to use for the ball is up to you. Note that this activity is substantially similar to Part 4, so it may help to start there.

Development Requirements
• The first program (the receiving program):
  1. Installs a signal handler for the ‘ball’ signal; the signal handler should determine the process that sent the signal (the other player)
  2. Waits for a random amount of time between 1 and 2 seconds
  3. Sends the signal (the ball) back to the sender
  4. For each step (receiving the signal, sending the signal), the program should print out a status message to indicate what it is doing. Also play the system bell ('\a' or '\007') to simulate a ball being hit.
• The second program (the serving program):
  1. Installs a signal handler for the ‘ball’ signal; the signal handler should determine the process that sent the signal (the other player)
  2. Waits for a random amount of time between 1 and 2 seconds
  3. Sends the signal (the ball) back to the sender
  4. For each step, the program should print out a status message to indicate what it is doing and play the system bell when sending the signal (hitting the ball)
  5. After the server sets up the signal handler, it should serve the ball (send the signal) to the other player.
Run the receiving program first and then run the serving program to play the game. The game must exit cleanly (without crashing or requiring CTRL+C) after a successful volley (sending the signal ‘ball’ back and forth) of 10 exchanges.
Here are some things to consider:
• How does the serving program know the process identifier of the other player?
  o HINT:
    1. Run the receiving process first.
    2. Figure out the process identifier using ps, or have the receiving process print its pid to the console.
    3. Utilize the command line on the serving program to give it the process identifier of the receiving process, and atoi to convert a string to a number.
• How do you end the game?
  o HINT: Use sigqueue to send the signal ‘ball’ and include the current count of exchanges.

Challenge
Signal tennis does not have the features of regular tennis. If you accept the challenge, can you use the signaling functions (e.g. sigqueue, kill, sigaction, etc.) to:
• Keep score using traditional tennis scoring: https://www.usta.com/en/home/improve/tips-and-instruction/national/tennis-scoring-rules.html
  o This will require some sort of randomness or heuristic to determine if the player succeeds or fails to return the ball
  o An additional signal or value sent with the signal will be needed to tell the other player that the ball wasn’t successfully returned
  o This will also require the serving process to serve again after each point. You are free to use the same program to always perform the serve.
• End the game when one player loses, using traditional tennis scoring
• Only implement a single game

Code Structure
Code must follow the style guidelines as defined in the course material.

Hints and Tips
Testing and Debugging
See the course debugging tips for using gdb and valgrind to help with debugging. Your program must be free of run-time errors and memory leaks.

Deliverables
When you are ready to submit your assignment, prepare your repository:
• Make sure your name, the assignment name, and your section number are in all files in your submission – in the comment block of source file(s) and/or at the top of your report file(s).
• Make sure you have completed all activities and added all new program source files to the repository.
• Make sure your assignment code is commented thoroughly.
• Make sure all files are committed and pushed to the main branch of your repository.
• Tag your repo with the tag vFinal.
NOTE: Do not forget to ‘add’, ‘commit’, and ‘push’ all new files and changes to your repository before submitting. To submit, copy the URL for your repository and submit the link to the associated Canvas assignment.
Search is an integral part of AI. It helps in problem solving across a wide variety of domains where a solution isn’t immediately clear. Your task is to implement several search algorithms that will calculate a route between two points in Romania while seeking to minimize time and space cost. We will be using an undirected network representing a map of Romania (and an optional Atlanta graph used for the Race!).

Table of Contents
- Setup
  - Dependencies
  - Jupyter
  - Jupyter Tips
- Submission
- Custom Tests

Setup
Create a conda environment if you have not already. For example:

    conda create --name a1_env python=3.9 -y

Activate the environment:

    conda activate a1_env

In case you used a different environment name when creating the conda environment, you can list all environments on your machine by running conda env list. You can always refer back to the instructions provided in Assignment 0 for managing conda environments.

Dependencies
Install the necessary libraries for this assignment after activating your conda environment and navigating to the correct directory:

    pip install -r requirements.txt

Jupyter
To open the Jupyter Notebook, run:

    jupyter notebook

This should automatically open notebook.ipynb as a Jupyter Notebook. If it doesn’t open automatically, you can access the Jupyter Notebook at http://localhost:8888 (the default address) in your browser.

Jupyter Tips
1. My Jupyter notebook does not seem to be starting up or my kernel is not starting correctly.
Ans: This probably has to do with activating virtual environments. If you followed the setup instructions exactly, then you should activate your conda environment using conda activate from the Anaconda Prompt and start Jupyter Notebook from there.
2. I was running cell xxx when I opened up my notebook again and something or other seems to have broken.
Ans: This is one thing that is very different between IDEs like PyCharm and Jupyter Notebook.
In Jupyter, every time you open a notebook, you should run all the cells that a cell depends on before running that cell. This goes for cells that are out of order too (if cell 5 depends on values set in cells 4 and 6, you need to run 4 and 6 before 5). Using the “Run All” command and its variants (found in the “Cell” dropdown menu above) should help you when you’re in a situation like this.
3. The value of a variable in one of my cells is not what I expected it to be. What could have happened?
Ans: Most likely a cell that sets that variable was not run in this session, or cells were run out of order; re-run the cells it depends on (or use “Run All”) to restore the expected state.

Submission
You will submit multiple files from the directory submission to Gradescope after following the instructions in notebook.ipynb.

Custom Tests
There is a series of custom tests provided if you would like to run them through the terminal instead of the Jupyter Notebook. They are:
1. search_basic_tests.py: Sample basic tests; visualizes the Romania graph.
2. search_submission_tests_grid.py: Visualizes the search as a grid.
3. search_romania_tests.py: More comprehensive tests on the Romania graph than search_basic_tests.
4. search_atlanta_tests.py: Tests for the Atlanta graph.
5. search_case_visualizer.py: For visualizing specific failed cases of interest on the Romania graph.

To run a full test, you can do something like:

    python search_romania_tests.py

To run a specific test, you can do something like:

    python search_romania_tests.py SearchRomaniaTests.test_bfs_romania

See the code for each test file for more detailed instructions and descriptions.

# CS6601 Assignment 1 Search
Final Project Proposal

Final paper: what to do

What are we evaluating in this final paper? Your goal: formulating a clear research problem and a plausible plan to explore it. You have worked through hard proofs, and you have tried your best to understand real data and polish algorithms. Now you will demonstrate that you can create a good proposal for further research in that domain. This is often the first real step of a research career. (I understand many expect research to start by learning the ropes on a topic chosen for them. While this is not bad, until you get a chance to propose, you’re not really searching! There is no reason to delay.) In my experience, even if it’s not said explicitly, every advisor expects students to be in proposition mode from day 1. By that, one obviously does not mean being stubbornly motivated by a single idea, but being able to create and develop new paths from a given status quo. That’s what we want you to do (and as opposed to a real fellowship or summer project, you won’t necessarily have to do the work 🙂).

How will we evaluate it? We will judge the proposed idea along three dimensions:
• absolute merit (score 1-5): Is the problem clear? Is the research impactful for social network and big data applications?
• relative merit (score 1-3): Is it well connected to current research in a way that is articulated and offers significant novelty? Some candidness is acceptable.
• soundness of approach (score 1-3): How many, and how uncertain, are the connections needed for that idea to lead to an advance?
The final score is multiplicative, to account for the fact that a good research proposal cannot prioritize one of these dimensions at the expense of even a single one of the others. That multiplicative rule is meant to reproduce the reality of actual success in research.

How is your grade computed? Designing a good scoring method for your final paper amounts to finding an accurate estimate of the chance of success of a research proposal.
It’s not easy, and it’s not necessarily “fair” in the usual sense (i.e., I spent x minutes out of y available, so I receive x/y). Your grade is a multiplication of 3 scores:
• Maturity/Significance (graded on 4): Taking the contribution for what it is and assuming it works, does the significance of the progress achieved stand out?
– 0: proposed approach is contradicted by well-known facts
– 1: presents no knowledge of related work beyond the surface
– 2: related work is loosely covered (partially, at a high level)
– 3: some significance w.r.t. the state of the art appears
– 4: significance of the contribution is thoroughly developed
• Novelty (graded on 6): how far does the proposed approach depart from known concepts? (In the examples below, a variation of a known concept scores around 3 and a genuinely new idea scores 6.)
• Clarity/Soundness (graded on 10): What is the chance that the project leads to a scientific discovery?
– 0: expecting success fits Einstein’s definition of insanity
– 1: project mostly motivational, success path unidentified
– 2-3: the outputs of the project are likely uninterpretable
– 4-5: a potential success, but unlikely due to a major limitation
– 6: a discovery would be surprising but can’t be excluded
– 7: a good bet: at least some moderate progress to expect
– 8: a great bet: major progress not overall improbable
– 9-10: a sure bet, sound path to a positive/negative discovery
So the maximum theoretical score is 240pt (4 × 6 × 10). Reaching that maximum means contradicting the laws of intellectual physics (i.e., formulating a sure bet on a major novel approach, while having shown how it will revolutionize multiple fields). That is not a bug! It’s important that you are aware of the extreme significance of successes that seem partial but are truly decisive. You need to see in practice that a 100% grade in a research project makes no sense. In practice, any score above 100pt is impressive (or lucky :)), and above 150pt is truly outstanding.
Examples
• A good incremental step would combine moderate novelty (more like a variation of a known concept: 3) with good coverage of the literature and of why this new angle is significant (4), though it would perhaps miss a few things, and it would have a solid plan (8). That’s already 96pt.
• A “boring” project that combines a known concept (3), does only a partial literature search (2), and presents an experiment that is not likely to bring much advance (6) receives 36pt. Note across these examples that there is almost a 1:3 ratio, and that from the “boring” starting point, any increment is extremely important. Making a boring idea more mature, showing signs of possible progress, and addressing concerns past this point to rise to another level is the path to success.
• An “inspiring” project that proposes a creative approach (5) that is well motivated by related work (4), even if it is unlikely to work (5), still receives 100pt. That’s because research rewards risk (provided the approach is not obviously flawed or hopeless).
• A real new idea (6), even with poor maturity (2), on which a discovery can’t be completely excluded (6), already receives 72pt. Again, any progress from there (especially on maturity, which is attainable) would make for a spectacular score.

What were the scores in last year’s projects? The average of submitted papers is 60pt, which is also the median.
• 6 projects got more than 100pt (including 2 above 150pt). Congrats to you!
• Nobody scored 0 in any dimension, and nobody scored 9-10 in soundness. Except for those cases, all dimensions mattered, and for each remaining score there has been a project with that score! It seems that overall the distinguishing feature of the top projects is combining a big challenge/novelty with an impeccable state of the art.
• The score distribution (see attached picture) is almost uniform on [0; 150pt], which is itself an interesting finding.
[Figure: FinalPaperGrade.png — distribution of final paper grades]
If you are wondering how far you are in those dimensions with your project ideas, I recommend reading and answering Heilmeier’s 9 questions (see the end of this document). It’s a commonly used canvas for judging research proposals in multiple institutions like NSF, DARPA, ERC, Google, MSR… If you have satisfactory answers to all, you’re on the right track (or too optimistic 🙂 it happens); otherwise, it might help you choose how to exert effort to address what’s missing.
Where can I find some ideas for projects? For starters, as shared during the semester, we recommend high-visibility events: TheWebConf (https://www2024.thewebconf.org/), The ACM EC Conference (https://ec24.sigecom.org/index.html), ACM EAAMO Conference. Those are great events that typically include the “core” of social network theory with impacts, but some of those topics vastly expand into different venues, often in related computing conferences (KDD, ICWSM, ICML, NeurIPS) or multidisciplinary conferences (FAccT, AAAI AIES) and occasionally in many others. As long as you see a way to connect your project to some related efforts in those areas, you are probably entirely in scope. Please do not assume that an original idea you have is out of scope without requesting help, and do not hesitate to ask for help making the connection, as it’s often the most interesting and difficult part.
What format to use for the final paper? We are expecting a pdf, 4p in the ACM format: https://www.acm.org/publications/proceedings-template. Alternative formats that are roughly equivalent are fine (but why would you do that?). We expect the 4p to include the references (so 3p to 3.5p of actual text), but we won’t have any problem with references spilling onto another page. If you have more to say, you could have a longer paper as long as it has the same density/quality. We recommend including anything that takes more space as an appendix, so as not to disturb your problem formulation.
Any sections about “results” can be included in the body of the text as you want, no problem at all. You can also make it shorter, as long as the problem is clearly formulated and its merits are well justified. But obviously it’s even harder to show maturity and novelty in a shorter form (esp. without the “no space for that” card), so do that at your own risk. Looking forward to helping you with those proposals,
Augustin

9 questions to ask yourself (and answer) about your project
1-2-3: Core questions
Question 1:
– What are you trying to do? Articulate your objective using absolutely no jargon.
– What is the problem? Why is it hard?
Question 2: How is it done today, and what are the limits of current practice?
Question 3: What is new in your approach, and why do you think it will be successful?
4-5-6: Impact of the contribution
Question 4: Who cares?
Question 5: If you are successful, what difference will it make? What impact will success have? How will it be measured?
Question 6: What are the risks and payoffs?
7-8-9: Plans and evaluations
Question 7: How much will it cost?
Question 8: How long will it take?
Question 9: What are the midterm and final “exams” to check for success? How will progress be measured?
Brief
The purpose of this component is to make recommendations for introducing crime reduction measures in the shoplifting model developed in the module.
a) Describe the measures and their mechanisms in an Excel table. The data (i.e., verbatim quotes extracted from the paper) and synthesised information should be recorded in the table.
b) Indicate the situational crime prevention principle(s) involved in each security measure, e.g., increasing the effort… (Clarke, 1995, p. 19; Cornish & Clarke, 2003; Clarke & Petrossian, 2010).
Part 2 (2500 words max). In the next development phase of our shoplifting model, we would like to introduce a range of security measures. Draw on your review of the security measures in Part 1 to explain and show how they could be modelled. Your answer should comprise two sections:
• Section A: Give an overview of SCP-based crime reduction elements relevant to shoplifting. Introduce the main approaches that can be used to model them.
• Section B: Select and implement one or two SCP-based security measures in NetLogo that could be introduced in the shoplifting model. Explain different modelling options in detail, discuss their advantages and disadvantages, show how you implemented it (or them) in the code, and show and analyse pertinent simulation results. You should include your developed code in an appendix; the code should have explanatory comments throughout.
Note: Draw on the data recorded in the Excel table to support your answer. Prioritise depth over breadth. You must submit the table (Excel file), the report (Word doc) and the code (zipped file with all coding and supporting files – .csv, .nls, .nlogo).
References
• Cornish, D. B., & Clarke, R. V. (2003). Opportunities, precipitators and criminal decisions: A reply to Wortley’s critique of situational crime prevention. Crime Prevention Studies, 16, 41–96.
• Clarke, R. V., & Petrossian, G. (2010). Shoplifting. 2nd Edition.
Problem-Specific Guides Series. Problem-Oriented Guides for Police. No. 11.
Marking criteria
This coursework is worth 70% of the module mark, with 30% allocated to Part 1 and 70% to Part 2.
Part 1 [30% of Component 2 mark]: The mark for Part 1 will be based on the quality and quantity of information in the Excel table. Particular attention will be given to the:
• structure of the Excel table (clarity, scope, relevance),
• information and supporting data (clarity, scope, accuracy, amount of detail),
• presentation (style, typographic errors).
The Excel spreadsheet will not be considered in the word count.
Marking rubric
Structure of the Excel table:
• Table is poorly structured, unclear, and lacks relevance.
• Table structure is somewhat clear but lacks scope and relevance to Lasky et al.
• Table is well-structured and relevant, with minor issues in scope or clarity.
• Table is well-structured, highly relevant, and organised for easy understanding.
Information and supporting data:
• Lacks sufficient information and/or no supporting data from Lasky et al.
• Supporting data lacks sufficient clarity or detail; unclear synthesis of quotes.
• Adequate supporting data with sufficient clarity; accurate synthesis of the quotes from the paper.
• Comprehensive supporting data; highly relevant quotes and information synthesised effectively.
Presentation (style, typographic errors):
• Frequent typographical and formatting errors; poor style.
• Several typographic errors or formatting issues, but overall presentation is acceptable.
• Minimal typographic and formatting errors; clear presentation style.
• Free of typographic errors, professionally presented with consistent style.
Part 2 [70% of Component 2 mark]: The mark for Part 2 will consider whether:
• the selected security measures are situational crime prevention measures,
• there is sufficient information to understand what these elements are and how they contribute to crime reduction,
• the presentation clearly explains how to model situational crime prevention principles in the shoplifting model,
• the answer contains accurate and pertinent arguments, including references to the literature and the findings from question 1,
• the general presentation is appropriate, considering the word count, style and structure, format of the references, typographic errors, etc.
Marking rubric
Selection of security measures:
• Chosen security measures are not situational crime prevention (SCP) measures.
• SCP measures are identified, but lack clear justification or explanation of relevance to shoplifting.
• SCP measures are mostly appropriate and well-justified, with links to shoplifting crime reduction.
• Selected SCP measures are highly appropriate, well-explained, and clearly linked to shoplifting crime prevention.
Explanation of crime reduction elements:
• Basic explanation of SCP crime reduction elements, but lacks depth or clarity.
• Detailed explanation of relevant SCP crime reduction elements, mostly clear and accurate.
• Comprehensive, clear, and insightful explanation of SCP crime reduction elements.
Modelling SCP principles:
• No clear explanation of how SCP principles could be modelled.
• Some explanation of modelling SCP principles, but lacks sufficient detail or clarity.
• Clear explanation of how SCP principles could be modelled, with relevant detail.
• Detailed, clear, and comprehensive explanation of modelling SCP principles, with a strong understanding of how they work in practice.
Implementation of security measures in NetLogo:
• No or poorly implemented security measures in NetLogo, with little or no discussion of the code.
• Basic implementation in NetLogo, but lacks evaluation or detailed explanation of different options.
• Clear implementation of one or two SCP-based security measures in NetLogo, with adequate discussion of modelling options.
• Highly detailed and insightful implementation in NetLogo, with comprehensive discussion of different modelling options, advantages, and disadvantages.
Analysis of simulation results:
• No simulation results provided, or results are irrelevant to the modelled security measures.
• Basic simulation results provided, but lacks in-depth analysis or clear relevance to the security measures implemented.
• Relevant simulation results are presented and analysed, with adequate interpretation of their impact on shoplifting crime reduction.
• Pertinent simulation results are thoroughly presented and analysed, offering clear insights into the effectiveness and limitations of the implemented security measures.
Code presentation and documentation:
• No code, or code is incomplete; lacks comments.
• Code is present but lacks clarity and sufficient explanatory comments.
• Code is mostly complete with appropriate comments, but lacks some clarity or detail.
• Code is complete, well-organised, and fully annotated with clear explanatory comments throughout.
Presentation and referencing:
• Unclear presentation, lacking proper referencing or with numerous typographic errors.
• Presentation is adequate but contains some typographic errors; inconsistent referencing.
• Clear presentation with minimal typographic errors; accurate and consistent referencing.
• Professional presentation with no typographic errors, consistently formatted references.
1. Assignment Overview
Assignment Contribution: Contributes 15% to the final grade.
Submission Requirements
Files to Submit: Mset.c, MsetStructs.h, and analysis.txt
Submission Method: Command line ($ give cs2521 ass1 Mset.c MsetStructs.h analysis.txt) or give’s web interface
Notes: Multiple submissions are allowed, and only the last one will be marked. Check your submission after submitting.
Grading Criteria
Correctness (75%): Includes the correctness of basic operations (such as insertion, deletion, getting size, getting total count, getting element count, printing, etc.) and advanced operations (such as union, intersection, inclusion check, equality check, getting most common elements, etc.), as well as the correctness of related operations after updating to the balanced binary search tree, and cursor operations. Each operation has a corresponding score percentage, and there will be deductions for memory errors/leaks.
Complexity Analysis (15%): The correctness of the complexity analysis in analysis.txt and the quality of explanations.
Code Style (10%): Evaluation includes indentation, space usage, function usage, code decomposition, and comments.
2. Assignment Content
2.1 Multiset Abstract Data Type (ADT)
A collection that allows duplicate elements, where each element has a count indicating the number of times it appears in the collection. It is an abstract data type, so the focus is on the set of operations; the implementation details are not important as long as the desired behavior is presented to the user.
Operation Requirements
Basic Operations (Part 1)
MsetNew: Creates a new empty Multiset with a time complexity requirement of $O(1)$.
MsetFree: Frees all memory allocated to the Multiset with a time complexity of $O(n)$.
MsetInsert: Inserts one element into the Multiset. If the element is equal to UNDEFINED, nothing is done. The time complexity is $O(h)$.
MsetInsertMany: Inserts a given number of elements.
If the element is equal to UNDEFINED or the number is 0 or less, nothing is done. The time complexity is $O(h)$.
MsetDelete: Deletes one element from the Multiset with a time complexity of $O(h)$.
MsetDeleteMany: Deletes a given number of elements with a time complexity of $O(h)$.
MsetSize: Returns the number of distinct elements in the Multiset with a time complexity of $O(1)$.
MsetTotalCount: Returns the sum of the counts of all elements in the Multiset with a time complexity of $O(1)$.
MsetGetCount: Returns the count of an element in the Multiset. If the element is not in the Multiset, it returns 0. The time complexity is $O(h)$.
MsetPrint: Prints the Multiset to a file. The elements are sorted in ascending order, in the format {(element, count), …}. The time complexity is $O(n)$.
Advanced Operations (Part 2)
MsetUnion: Given two Multisets, returns their union. The count of each element in the new Multiset is the maximum of its counts in the two original Multisets. The time complexity needs to be analyzed and written in analysis.txt. Methods like converting the tree to an array or list and then processing it are not allowed.
MsetIntersection: Given two Multisets, returns their intersection. The count of each element in the new Multiset is the minimum of its counts in the two original Multisets. The time complexity needs to be analyzed and written in analysis.txt, and certain processing methods are prohibited.
MsetIncluded: Given two Multisets, determines if one is included in the other based on element counts. The time complexity needs to be analyzed and written in analysis.txt, and certain processing methods are prohibited.
MsetEquals: Given two Multisets, determines if they are equal, based on whether the elements and counts are exactly the same. The time complexity needs to be analyzed and written in analysis.txt, and certain processing methods are prohibited.
MsetMostCommon: Given a Multiset, an integer, and an array, stores the most common elements in the Multiset, in descending order of count, into the array, and returns the number of elements stored. The time complexity needs to be analyzed and written in analysis.txt.
Balanced Binary Search Tree (Part 3)
Update the implementation to use a height-balanced binary search tree so that MsetInsert, MsetInsertMany, MsetDelete, and MsetDeleteMany have a worst-case time complexity of $O(\log n)$, and ensure that the underlying binary search tree of any Multiset is always height-balanced.
Cursor Operations (Part 4)
MsetCursorNew: Creates a new cursor for a given Multiset, initially positioned at the start of the Multiset.
MsetCursorFree: Frees all memory allocated to a given cursor.
MsetCursorGet: Returns the element at the cursor’s position and its count. If the cursor is at the start or end of the Multiset, it returns {UNDEFINED, 0}.
MsetCursorNext: Moves the cursor to the next largest element. If there is no next largest element, it moves to the end of the Multiset. If the cursor is already at the end, it does not move. Returns false if the cursor is at the end after the operation, otherwise returns true.
MsetCursorPrev: Moves the cursor to the next smallest element. If there is no next smallest element, it moves to the start of the Multiset. If the cursor is already at the start, it does not move. Returns false if the cursor is at the start after the operation, otherwise returns true.
All cursor operations should have a worst-case time complexity of $O(1)$ or $O(\log n)$. The design and implementation, and how the time complexity requirement is met, need to be explained in analysis.txt.
2.2 Assignment File Description
Initial Files
Makefile: A set of dependencies used to control compilation.
Mset.h: The interface to the Multiset ADT, which cannot be modified.
Mset.c: The implementation of the Multiset ADT (initially incomplete).
MsetStructs.h: The definition of structs used in the Multiset ADT (initially incomplete).
testMset.c: A main program containing some basic tests for the Multiset ADT.
analysis.txt: A template for entering the time complexity analysis of selected functions.
Struct Usage Requirements
You must use struct node for binary search tree nodes. The elements of the multiset must be stored in the elem fields of struct node, and their counts must be stored in the count fields. The left and right pointers are used to connect a tree node to its left and right subtrees and cannot be used for other purposes. The tree field must point to the binary search tree that stores all the elements of the multiset.
2.3 Testing and Debugging
Testing
testMset.c is provided as a basic test program. It can be compiled with make and run with ./testMset. The tests are assertion-based, and a failed test will cause the program to exit. A test can be ignored by commenting out the corresponding test function. It is recommended to add your own test functions.
Debugging
Students are expected to know basic debugging methods, such as using print statements, basic GDB commands, and running Valgrind. The use of GDB and Valgrind can be learned in the relevant lab exercises.
3. Background Knowledge
Prerequisite Knowledge Requirements: Recursion, analysis of algorithms, abstract data types, binary search trees, balanced binary search trees (including AVL trees).
Multiset-Related Concepts
A multiset is similar to a set but allows duplicate elements, and each element has a count. It can be represented as elements and their counts enclosed in curly braces, such as {(1, 3), (4, 2)}, indicating that element 1 appears 3 times and element 4 appears 2 times. Related symbolic notations are defined, such as $c_A(x)$ representing the count of element x in multiset A; if x is not in A, then $c_A(x) = 0$. The empty multiset is denoted by ∅.
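The multiset semantics above (counts per element, union taking the maximum count, intersection the minimum, and $c_A(x) = 0$ for absent elements) can be sketched with an ordered map. This is a Java illustration of the *behaviour* only, not the required C struct node/AVL implementation; all names here are made up for the example:

```java
import java.util.TreeMap;

// Behavioural sketch of the multiset ADT (not the required C structure):
// each element maps to its count in an ordered map.
public class MsetSketch {
    final TreeMap<Integer, Integer> counts = new TreeMap<>();

    void insertMany(int elem, int n) {            // MsetInsertMany semantics
        if (n <= 0) return;                       // non-positive amounts are ignored
        counts.merge(elem, n, Integer::sum);
    }
    int getCount(int elem) {                      // c_A(x); 0 if x is not in A
        return counts.getOrDefault(elem, 0);
    }
    int size() { return counts.size(); }          // number of distinct elements
    int totalCount() {                            // sum of all counts
        return counts.values().stream().mapToInt(Integer::intValue).sum();
    }
    static MsetSketch union(MsetSketch a, MsetSketch b) {        // count = max
        MsetSketch u = new MsetSketch();
        u.counts.putAll(a.counts);
        b.counts.forEach((e, c) -> u.counts.merge(e, c, Math::max));
        return u;
    }
    static MsetSketch intersection(MsetSketch a, MsetSketch b) { // count = min
        MsetSketch i = new MsetSketch();
        a.counts.forEach((e, c) -> {
            int m = Math.min(c, b.getCount(e));
            if (m > 0) i.counts.put(e, m);
        });
        return i;
    }
    static boolean included(MsetSketch a, MsetSketch b) {  // cA(x) <= cB(x) for all x
        return a.counts.entrySet().stream()
                .allMatch(en -> en.getValue() <= b.getCount(en.getKey()));
    }
}
```

For example, with A = {(1, 3), (4, 2)} (the multiset from the text) and B = {(1, 1), (4, 5)}, the union is {(1, 3), (4, 5)} and the intersection is {(1, 1), (4, 2)}. Note that the assignment forbids this convert-to-another-structure style for the C advanced operations; the sketch is only a reference for the expected results.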
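The height-balance requirement in Part 3 amounts to rebalancing on the way back up from an insertion. The following is an illustrative AVL insert sketched in Java (the assignment itself is in C, and none of these names are the required ones); duplicates bump a count instead of creating a new node, matching the multiset semantics:

```java
// Minimal AVL-style insert sketch (illustrative only, not the required C code).
public class AvlSketch {
    static class Node {
        int elem, count = 1, height = 1;
        Node left, right;
        Node(int e) { elem = e; }
    }
    static int h(Node n) { return n == null ? 0 : n.height; }
    static void fix(Node n) { n.height = 1 + Math.max(h(n.left), h(n.right)); }
    static Node rotRight(Node y) {               // single right rotation
        Node x = y.left; y.left = x.right; x.right = y;
        fix(y); fix(x); return x;
    }
    static Node rotLeft(Node x) {                // single left rotation
        Node y = x.right; x.right = y.left; y.left = x;
        fix(x); fix(y); return y;
    }
    static Node insert(Node n, int e) {
        if (n == null) return new Node(e);
        if (e == n.elem) { n.count++; return n; } // duplicate: bump the count
        if (e < n.elem) n.left = insert(n.left, e);
        else n.right = insert(n.right, e);
        fix(n);
        int bal = h(n.left) - h(n.right);
        if (bal > 1) {                            // left-heavy
            if (h(n.left.left) < h(n.left.right)) n.left = rotLeft(n.left);
            return rotRight(n);
        }
        if (bal < -1) {                           // right-heavy
            if (h(n.right.right) < h(n.right.left)) n.right = rotRight(n.right);
            return rotLeft(n);
        }
        return n;
    }
}
```

Inserting 1..7 in ascending order into a plain BST would give a chain of height 7; with the rebalancing above the tree stays at height 3, which is what keeps MsetInsert at worst-case $O(\log n)$.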
Module Name: Contemporary Topics in Software Engineering
Module Code: ITP4507
Assignment Number: One
Weighting of This Assignment: 66.67% of the End of Module Assessment
Task Specification
Snow Storm Company develops an RPG game “Fantastic World (FW)” on the PC. The major characters in this game are known as HEROes, and they have various kinds of characteristics. For example, Warriors focus on defence and Warlocks target magic damage. Each player in this game can play more than one hero. Currently, this game only has Warriors and Warlocks. In the future, this game will be extended to support more kinds of hero, such as healers. The following is the simplified class diagram of the existing data maintained by FW.
As a system analyst of the Company, you are required to design and develop FW. You can get the source code of the above classes from Moodle. FW should provide the following functions:
1. Create a player.
2. Add a hero (Warrior or Warlock) to the current player.
3. Remove a hero from the current player.
4. Select a player by using a player ID.
5. Call a hero’s skill by a hero ID.
6. Show the detailed information of the current player.
7. Change the player’s name of the current player.
8. Show all players.
9. Set current player.
10. Undo last command.
11. Redo the last undone command.
12. Show undo/redo list.
13. Exit system.
Your system design should conform to the Open-Closed Principle so that your design can easily be extended to support new heroes (for example, healers, rangers, etc.).
You MUST apply the following design patterns for your new system:
⚫ Command pattern to provide the “create player”, “set current player”, “add hero”, “call hero skill”, “delete hero”, “show current player”, “display all players”, “change player’s name”,
“undo”, “redo” and “show undo/redo list” functions;
⚫ Factory pattern or Abstract Factory pattern to create the different kinds of Command objects and the different kinds of Player/Hero objects (e.g., Warrior object, Warlock object, Player object, etc.);
⚫ Memento pattern to provide the “Undo” and “Redo” functions for the “call hero skill” and “change player’s name” functions.
Assignment Report
In addition to the system development, you are required to write a short report covering the following sections:
1. Assumptions regarding the problem context
2. Application design with class diagram
3. Discussion and explanation of each of the design patterns applied to the application
4. Test plan and test cases
5. Well-documented source code
Mark Allocation
Your assignment work will be marked according to the following criteria.
System Coding and Implementation
a) Implementation of the system and coding style (hard-coded output will result in a zero mark): 30%
b) Correctness of system functions (hard-coded output will result in a zero mark): 15%
c) Test plan and test cases (will be used in testing your own application): 10%
System Analysis and Design, and Discussion
d) Design of your system and correct use of design patterns: 20%
e) Application design with class diagram: 10%
f) Discussion and explanation of each of the design patterns applied to the application: 15%
Total: 100%
Submission of Assignment Work
1. The front page of your submission should include the programme title (code), module title (code), student number and student name.
2. Submit your Java source code and your report to https://moodle.vtc.edu.hk.
• Well-documented source code of your program.
• Report for the analysis, design, discussion, user guide, test plan and test cases of your work:
A. The assumptions made during analysis and design of the application
B. System design of your application with class diagram
C. Discussion of the design patterns applied in your program
D.
Test plan with test cases (design your own test plan and corresponding test cases for each function; for each test case, you should provide a SCREEN CAPTURE of the test result).
Extra Reference: Sample Test
Testing Method
This sample run is for reference only. You are free to design your own user interface, but to keep the testing environment simple and to apply the “copy and paste” testing method described on page 16 easily, you are advised to accept user input at the command prompt as shown in the sample run below.
Sample Run
In the following examples, character(s) with underline are the user’s input.
1. Create player
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-c
Player ID:- P001
Player Name:- Thomas Yiu
Player Thomas Yiu is created. Current player is changed to P001.
2. Add heroes to the player
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-a
Please input hero information (id, name):- H001, peter pang
Hero Type (1 = Warrior | 2 = Warlock ):- 1
Hero is added.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-a
Please input hero information (id, name):- H002, john wick
Hero Type (1 = Warrior | 2 = Warlock ):- 2
Hero is added.
3. Show the current player
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-s
Player Thomas Yiu (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 0, Defence Point: 500
H002, john wick, Warlock, Hp: 100, Damage: 200, Mp: 500
4. Create another player and display all players
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-c
Player ID:- P002
Player Name:- Stan Lee
Player Stan Lee is created. Current player is changed to P002.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-p
Player Thomas Yiu (P001)
Player Stan Lee (P002)
5. Add heroes to the player P002
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-a
Please input hero information (id, name):- H003, scarlet witch
Hero Type (1 = Warrior | 2 = Warlock ):- 2
Hero is added.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-a
Please input hero information (id, name):- H004, tony stark
Hero Type (1 = Warrior | 2 = Warlock ):- 1
Hero is added.
6. Show the current player
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-s
Player Stan Lee (P002)
Heroes:
H003, scarlet witch, Warlock, Hp: 100, Damage: 200, Mp: 500
H004, tony stark, Warrior, Hp: 500, Damage: 0, Defence Point: 500
7. Set the current player by Player ID
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-g
Please input player ID:- P001
Changed current player to P001.
8. Call hero skill
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-m
Please input hero ID:- H001
H001 peter pang’s attributes are changed to: H001, peter pang, Hp: 500, Damage: 250, Defence Point: 400
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-s
Player Thomas Yiu (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400
H002, john wick, Warlock, Hp: 100, Damage: 200, Mp: 500
9. Delete hero
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-d
Please input hero ID:- H002
H002 john wick is deleted.
10. Change the name of the current player
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-t
Please input new name of the current player:- Russo Brothers
Player’s name is updated.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Russo Brothers.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-s
Player Russo Brothers (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400
11. Show undo/redo list
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Russo Brothers.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-l
Undo List
Change player’s name, P001, Russo Brothers
Delete hero, H002
Call hero skill, H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400
Add hero, H004, tony stark, Warrior
Add hero, H003, scarlet witch, Warlock
Create player, P002, Stan Lee
Add hero, H002, john wick, Warlock
Add hero, H001, peter pang, Warrior
Create player, P001, Thomas Yiu
— End of undo list —
Redo List
— End of redo list —
12. UNDO actions in UNDO list
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Russo Brothers.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-u
Command (Change player’s name, P001, Russo Brothers) is undone.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-u
Command (Delete hero, H002) is undone.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-u
Command (Call hero skill, H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400) is undone.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-s
Player Thomas Yiu (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 0, Defence Point: 500
H002, john wick, Warlock, Hp: 100, Damage: 200, Mp: 500
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-u
Command (Add hero, H004, tony stark, Warrior) is undone.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-u
Command (Add hero, H003, scarlet witch, Warlock) is undone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-g
Please input player ID:- P002
Changed current player to P002.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-s
Player Stan Lee (P002)
Heroes:

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-u
Command (Create player, P002, Stan Lee) is undone. Current player is changed to P001.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-g
Please input player ID:- P002
Player P002 is not found!!

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-l
Undo List
Add hero, H002, john wick, Warlock
Add hero, H001, peter pang, Warrior
Create player, P001, Thomas Yiu
— End of undo list —
Redo List
Create player, P002, Stan Lee
Add hero, H003, scarlet witch, Warlock
Add hero, H004, tony stark, Warrior
Call hero skill, H001, peter pang, Warrior, Hp: 1000, Damage: 500, Defence Point: 400
Delete hero, H002
Change player’s name, P001, Russo Brothers
— End of redo list —

13. REDO actions in REDO list

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-r
Command (Create player, P002, Stan Lee) is redone. The current player is changed to P002.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-r
Command (Add hero, H003, scarlet witch, Warlock) is redone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-r
Command (Add hero, H004, tony stark, Warrior) is redone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-s
Player Stan Lee (P002)
Heroes:
H003, scarlet witch, Warlock, Hp: 100, Damage: 200, Mp: 500
H004, tony stark, Warrior, Hp: 500, Damage: 0, Defence Point: 500

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-r
Command (Call hero skill, H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400) is redone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-g
Please input player ID:- P001
Changed current player to P001.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Russo Brothers.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-s
Player Russo Brothers (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400
H002, john wick, Warlock, Hp: 100, Damage: 200, Mp: 500

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-l
Undo List
Call hero skill, H001, peter pang, Warrior, Hp: 1000, Damage: 500, Defence Point: 400
Add hero, H004, tony stark, Warrior
Add hero, H003, scarlet witch, Warlock
Create player, P002, Stan Lee
Add hero, H002, john wick, Warlock
Add hero, H001, peter pang, Warrior
Create player, P001, Thomas Yiu
— End of undo list —
Redo List
Delete hero, H002
Change player’s name, P001, Russo Brothers
— End of redo list —

–– END OF SAMPLE TEST ––

You can ease the testing by using the ‘Copy and Paste’ method rather than inputting data manually. Prepare a text file which includes all user inputs in a test run. By using the ‘Copy and Paste’ method, you can input automatically in the command prompt window and then get the result automatically (without the input data echoed). The following is an example of the text file for user inputs.
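The undo/redo behaviour shown in the sample run above follows the classic two-stack Command pattern: executing a command pushes it onto the undo list and clears the redo list; undoing pops it onto the redo list, and redoing moves it back. As a language-agnostic illustration (the assignment itself is written in Java, and the class names below are illustrative, not taken from any assignment skeleton), a minimal Python sketch:

```python
# Two-stack undo/redo sketch. "Command", "Invoker" and "CreatePlayer"
# are illustrative names, not part of the assignment's required design.
class Command:
    def execute(self): raise NotImplementedError
    def undo(self): raise NotImplementedError

class CreatePlayer(Command):
    def __init__(self, players, pid, name):
        self.players, self.pid, self.name = players, pid, name
    def execute(self):
        self.players[self.pid] = self.name
    def undo(self):
        del self.players[self.pid]
    def __str__(self):
        return f"Create player, {self.pid}, {self.name}"

class Invoker:
    def __init__(self):
        self.undo_list, self.redo_list = [], []
    def run(self, cmd):
        cmd.execute()
        self.undo_list.append(cmd)
        self.redo_list.clear()   # a fresh command invalidates the redo history
    def undo(self):
        cmd = self.undo_list.pop()
        cmd.undo()
        self.redo_list.append(cmd)
        return cmd
    def redo(self):
        cmd = self.redo_list.pop()
        cmd.execute()
        self.undo_list.append(cmd)
        return cmd
```

Listing the undo list newest-first (as in the sample output) is then just iterating `reversed(invoker.undo_list)`.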
Sample User Inputs for a Test Run

c
P001
Thomas Yiu
a
H001, peter pang
1
a
H002, john wick
2
s
c
P002
Stan Lee
p
a
H003, scarlet witch
2
a
H004, tony stark
1
g
P001
m
H001
s
d
H002
t
Russo Brothers
s
l
u
u
u
s
u
u
g
P002
s
u
g
P002
l
r
r
r
s
r
g
P001
s
l
x

Expected Output of the Test Run

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player ID:-
Player Name:-
Player Thomas Yiu is created. Current player is changed to P001.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input hero information (id, name):-
Hero Type (1 = Warrior | 2 = Warlock ):-
Hero is added.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input hero information (id, name):-
Hero Type (1 = Warrior | 2 = Warlock ):-
Hero is added.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Thomas Yiu (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 0, Defence Point: 500
H002, john wick, Warlock, Hp: 100, Damage: 200, Mp: 500

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player ID:-
Player Name:-
Player Stan Lee is created. Current player is changed to P002.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Thomas Yiu (P001)
Player Stan Lee (P002)

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input hero information (id, name):- H003, scarlet witch
Hero Type (1 = Warrior | 2 = Warlock ):- 2
Hero is added.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :- a
Please input hero information (id, name):- H004, tony stark
Hero Type (1 = Warrior | 2 = Warlock ):- 1
Hero is added.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Stan Lee (P002)
Heroes:
H003, scarlet witch, Warlock, Hp: 100, Damage: 200, Mp: 500
H004, tony stark, Warrior, Hp: 500, Damage: 0, Defence Point: 500

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input player ID:- P001
Changed current player to P001.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input hero ID:- H001
H001 peter pang’s attributes are changed to: H001, peter pang, Hp: 500, Damage: 250, Defence Point: 400

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Thomas Yiu (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400
H002, john wick, Warlock, Hp: 100, Damage: 200, Mp: 500

11. Delete hero

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input hero ID:- H002
H002 john wick is deleted.

12. Change the name of the current player

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input new name of the current player:- Russo Brothers
Player’s name is updated.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Russo Brothers.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Russo Brothers (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Russo Brothers.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Undo List
Change player’s name, P001, Russo Brothers
Delete hero, H002
Call hero skill, H001, peter pang, Warrior, Hp: 1000, Damage: 500, Defence Point: 400
Add hero, H004, tony stark, Warrior
Add hero, H003, scarlet witch, Warlock
Create player, P002, Stan Lee
Add hero, H002, john wick, Warlock
Add hero, H001, peter pang, Warrior
Create player, P001, Thomas Yiu
— End of undo list —
Redo List
— End of redo list —

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Russo Brothers.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Change player’s name, P001, Russo Brothers) is undone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Delete hero, H002) is undone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Call hero skill, H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400) is undone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Thomas Yiu (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 0, Defence Point: 500
H002, john wick, Warlock, Hp: 100, Damage: 200, Mp: 500

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Add hero, H004, tony stark, Warrior) is undone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Add hero, H003, scarlet witch, Warlock) is undone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input player ID:- P002
Changed current player to P002.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Stan Lee (P002)
Heroes:

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Create player, P002, Stan Lee) is undone. Current player is changed to P001.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input player ID:- P002
Player P002 is not found!!

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Undo List
Add hero, H002, john wick, Warlock
Add hero, H001, peter pang, Warrior
Create player, P001, Thomas Yiu
— End of undo list —
Redo List
Create player, P002, Stan Lee
Add hero, H003, scarlet witch, Warlock
Add hero, H004, tony stark, Warrior
Call hero skill, H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400
Delete hero, H002
Change player’s name, P001, Russo Brothers
— End of redo list —

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Create player, P002, Stan Lee) is redone. The current player is changed to P002.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Add hero, H003, scarlet witch, Warlock) is redone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Add hero, H004, tony stark, Warrior) is redone.
Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Stan Lee (P002)
Heroes:
H003, scarlet witch, Warlock, Hp: 100, Damage: 200, Mp: 500
H004, tony stark, Warrior, Hp: 500, Damage: 0, Defence Point: 500

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Command (Call hero skill, H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400) is redone.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Thomas Yiu.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Please input player ID:- P001
Changed current player to P001.

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P001 Russo Brothers.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Player Thomas Yiu (P001)
Heroes:
H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400
H002, john wick, Warlock, Hp: 100, Damage: 200, Mp: 500

Fantastic World (FW)
c = create player, g = set current player, a = add hero, m = call hero skill, d = delete hero, s = show player, p = display all players, t = change player’s name, u = undo, r = redo, l = list undo/redo, x = exit system
The current player is P002 Stan Lee.
Please enter command [ c | g | a | m | d | s | p | t | u | r | l | x ] :-
Undo List
Call hero skill, H001, peter pang, Warrior, Hp: 500, Damage: 250, Defence Point: 400
Add hero, H004, tony stark, Warrior
Add hero, H003, scarlet witch, Warlock
Create player, P002, Stan Lee
Add hero, H002, john wick, Warlock
Add hero, H001, peter pang, Warrior
Create player, P001, Thomas Yiu
— End of undo list —
Redo List
Delete hero, H002
Change player’s name, P001, Russo Brothers
— End of redo list —

Requirement for Scanner usage

Wrong Scanner usage (more than one Scanner object is created for reading keyboard input):

// create new Scanner objects in loop
do {
    Scanner sc = new Scanner(System.in);
    choice = sc.nextInt();
} while (choice != 1);

Correct Scanner usage (only one Scanner object is created for reading keyboard input): the following example program uses a global Scanner object, or passes it as a parameter, to do the input.

import java.util.Scanner;

public class Test {
    // Global declaration for Scanner
    public static Scanner sc = new Scanner(System.in);

    public static void main(String args[]) {
        int x;
        System.out.print("Enter x:");
        x = sc.nextInt();
    }

    public static void method1() {
        int y;
        System.out.print("Enter y:");
        y = sc.nextInt();
    }

    public static void method2(Scanner sc) {
        int y;
        System.out.print("Enter y:");
        y = sc.nextInt();
    }
}

*** END ***
A. Scenario

B. Function Requirement

Fruits & Consumption Management (for shop staff and warehouse staff)
- CRUD for fruit types
- Show a list of all fruits and the source location
- Show the stock level for different locations (source country, shop, city, target country)
- Reserve fruits from the source city
- Check reserve records
- Borrow fruits from other shops in the same cities
- Check the fruits on delivery (borrow/reserve)
- Update fruit stock levels in the shop/warehouse
- Check-in, Check-out, Approve-Reserve, Approve-Borrow

Analytic / Report (for senior management)
- Show a list of reserve needs of the selected shop/city/country (hint: aggregation of the reserve records)
- Show a list of consumption records of the selected shop/city/country under different seasons

Account Management (for all suitable users in different positions)
- Show a list of existing users
- Create and delete users (Shop, Warehouse, Senior management)
- Edit user details and roles
- Manage the user role

Extra Feature
You are encouraged to work on extra features to score bonus marks, for example:
- Show reports in graphical format
- Forecast report to achieve 1 SKU delivery (1 SKU means 1 fruit delivered in 1 day to another country by average time consumed)

C. Project Requirement
According to the scenario above, you are required to design and develop a web application with Java EE to solve the above background needs.

D. Guideline

AI Policy

Submission of Assignment Work
1. The front page of your submission should include the course title, module title, student identity number, student name, and group number.
2. A written report should include the following:
- Assumptions and the user and system requirements
- Site map
- System structure showing how the MVC model is applied
- Database structure
- Conclusions
- A skill checklist that lists your used skills (or technologies) on a single page and highlights the skills and technologies applied in your project
4. You are required to demonstrate your assignment.
You will fail this module if you do not demonstrate the assignment in the lab session as required.
5. The End

E. Guideline

Submission of Assignment Work
1. The front page of your submission should include the course title, module title, student identity number, student name, and group number.
2. A written report should include the following:
- Assumptions and the user and system requirements
- Site map
- System structure showing how the MVC model is applied
- Database structure
- Brief description (1 or 2 pages only) of the major characteristics and design of your application
- Conclusions
- A skill checklist that lists your used skills (or technologies) on a single page and highlights the skills and technologies applied in your project
4. You are required to demonstrate your assignment. You will fail this module if you do not demonstrate the assignment in the lab session as required.
5. The End
Assignment Instructions

1. Code Requirement: You must write all the code yourself. Use only standard libraries. Standard data structures (such as list, dictionary, tuple, set) are allowed as long as they are space/time-efficient for your purpose. Input/output and other unavoidable routines are exempted.
2. Submission Procedure:
- File Naming and Content: All your scripts must contain your name and student ID.
- Zip File: Upload a .zip archive via Moodle with a filename of the form <student ID>.zip. The archive should extract to a directory named after your student ID. Include the a2.py script and a2report.pdf in this directory, and nothing else.
- Submission Method: Submit the archive electronically via Moodle.
- Specification Adherence: Strictly follow the specifications in each question. Do not hard-code input filenames; they should be passed from the command line.

Assignment Task

DL-distance ≤ 1 multiple pattern matching within a collection of texts

1. Task: Given a collection of text files and multiple pattern files, report all occurrences of each pattern that matches, under the DL-distance ≤ 1 threshold, regions of the texts in the collection.
2. Data Structure Requirement: Use the suffix tree data structure as the primary search data structure for DL-distance ≤ 1 searches.
3. Input Files Details: Character Set: Each text/pattern file contains characters from the 7-bit ASCII printable characters (32–126) and ASCII whitespace control codes (8–13, 32). Assume the ‘$’ symbol does not appear in text/pattern files.

Program Specification

1. Program Name: a2.py
2. Argument: The program accepts one argument, the filename of a run-configuration file. The run-configuration file lists the text filenames followed by the pattern filenames, one per line:
<text filename 1>
<text filename 2>
…
<text filename N>
<pattern filename 1>
<pattern filename 2>
…
<pattern filename M>
3. Command Line Usage: a2.py <run-configuration filename>
4. Output File Name: output a2.txt
5. Output Format: one line per occurrence, giving the pattern number, the text number, and the position of occurrence.
- Numbering: The pattern number and text number in each line should follow the numbering in the run-configuration file. They need not be in sorted order.
- Position of Occurrence: the 1-based position where the pattern was observed in the text under a DL-distance of 0 or 1.

Written PDF Report (a2report.pdf)
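The matching criterion above (DL-distance ≤ 1) admits, besides an exact match, exactly one substitution, one insertion, one deletion, or one transposition of adjacent characters. The required search structure is a suffix tree, but a brute-force pairwise checker is handy for validating a suffix-tree implementation on small inputs. A sketch (function name is illustrative, not part of the spec):

```python
def within_dl1(p, t):
    """True iff the Damerau-Levenshtein distance between p and t is <= 1.
    Brute-force: enumerate the four single-edit cases directly."""
    if p == t:
        return True
    lp, lt = len(p), len(t)
    if abs(lp - lt) > 1:
        return False
    if lp == lt:
        # equal lengths: one substitution, or one adjacent transposition
        diff = [i for i in range(lp) if p[i] != t[i]]
        if len(diff) == 1:
            return True
        if len(diff) == 2:
            i, j = diff
            return j == i + 1 and p[i] == t[j] and p[j] == t[i]
        return False
    # lengths differ by one: one insertion/deletion
    s, l = (p, t) if lp < lt else (t, p)
    i = 0
    while i < len(s) and s[i] == l[i]:
        i += 1
    return s[i:] == l[i + 1:]
```

Sliding this check over every text position at pattern length, length-1 and length+1 gives a quadratic reference answer against which the suffix-tree results can be compared.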
Assignment FAQ: There is an Assignment Frequently Asked Questions page set up for Assignment 2 on the EdStem Forum.

Problem Description

M-Stay maintains and stores all of the business transaction information (e.g. properties, hosts, listings, bookings, etc.) required for the management’s daily operation. As the business grows, M-Stay has decided to build a Data Warehouse to improve its analysis and work efficiency. However, since the staff at M-Stay have limited Business Intelligence and Data Warehouse knowledge, they have decided to hire you to design, develop and quickly generate BI reports from a Data Warehouse.

The operational database tables can be found in the MStay account. You can, for example, execute the following query:

select * from MStay.<table name>;

The data definition of each table in MStay is as follows:

REVIEW: stores review information of the related booking.
  Review_ID Number (PK)
  Review_Comment Varchar2
  Booking_ID Number (FK)

BOOKING: stores booking information.
  Booking_ID Number (PK)
  Booking_Duration Number
  Booking_Cost Number
  Booking_Num_Guests Number
  Listing_ID Number (FK)
  Guest_ID Number (FK)

GUEST: stores all guest information.
  Guest_ID Number (PK)
  Guest_Name Varchar2

LISTING: stores all listing information. Each listing has one property and one host.
  Listing_ID Number (PK)
  Listing_Title Varchar2
  Listing_Price Number
  Listing_Min_Nights Number
  Listing_Max_Nights Number
  Prop_ID Number (FK)
  Type_ID Number (FK)
  Host_ID Number (FK)

HOST: stores all host information.
  Host_ID Number (PK)
  Host_Name Varchar2
  Host_Location Varchar2
  Host_About Varchar2
  Host_Listing_Count Number

HOST_VERIFICATION: stores the verification information between host and channel.
  Host_ID Number (PF)
  Channel_ID Number (PF)
CHANNEL: Channel_ID Number (PK), Channel_Name Varchar2. The table stores the channel of verification for the hosts.
LISTING_TYPE: Type_ID Number (PK), Type_Description Varchar2. The table stores all listing types.
PROPERTY: Prop_ID Number (PK), Prop_Description Varchar2, Prop_Neighbourhood_Overview Varchar2, Prop_Num_Beds Number, Prop_Num_Bedrooms Number, Prop_Num_Bathrooms Number, Prop_Num_Reviews Number, Prop_Rating_Location Number, Prop_Rating_Cleanliness Number, Prop_Rating_Value Number, Prop_Average_Rating Number. The table stores all property information.
PROPERTY_AMENITY: Prop_ID Number (PF), Amm_ID Number (PF). The table links the property and amenity tables.
AMENITY: Amm_ID Number (PK), Amm_Description Varchar2. The table stores all amenity information.

A. Transformation Stage
The first stage of this assignment is divided into TWO main tasks:
1. Design a data warehouse for the above M-Stay database. You are required to create a data warehouse for the M-Stay database. The management is especially interested in the following indicators:
● Number of reviews
● Number of listings
● Average booking cost (find appropriate fact measures that can calculate the average booking cost)
The following is a list of dimension attributes that you should include in your data warehouse:
● Listing type
● Listing time [Month, Year]
● Listing season
o (Spring: 9 to 11, Summer: 12 to 2, Autumn: 3 to 5 and Winter: 6 to 8)
● Listing maximum stay duration [short-term: less than 14 nights, medium-term: 14 to 30 nights, long-term: more than 30 nights]
● Listing price range [low: less than $100, medium: $100 to $200, high: more than $200]
● Channels
● Booking duration [short-term: less than 30 nights, medium-term: 30 to 90 nights, long-term: more than 90 nights]
● Review time [Month, Year]
● Booking cost range [low: less than $5000, medium: $5000 to $10000, high: more than $10000]
For each attribute, ensure that it meets the requirements of the range or group specified in
your submission, if required in the specification.

– Preparation stage. Before you start designing the data warehouse, you have to ensure that you have explored the operational database and have done sufficient data cleaning. Once you have done the data cleaning process, you are required to explain what strategies you have taken to explore and clean the data. The output of this task for the Report is:
a) If you have done the data cleaning process, explain the strategies you used in this process (you need to show the SQL to explore the operational database and the SQL of the data cleaning, as well as screenshots of the data before and after cleaning).

– Designing the data warehouse by drawing a star/snowflake schema.
Design task A: Your star schema should be able to answer questions such as:
● How many long-term stay duration listings are listed on Facebook?
● How many listings are there in summer for an "Entire home/apt" in a medium price range?
● How many bookings were there for "Private rooms" with a short-term stay duration in 2015?
Design task B: In this assignment, consider the star schema you created in Design Task A as the highest level of aggregation. The M-Stay company manager wants to implement a drill-down function to explore more detailed information. Your task is to suggest several ways to increase the granularity of your fact tables from Design Task A. In other words, the manager wants to decrease the aggregation level of the fact tables you created in Design Task A.
The outputs of tasks A & B for the Report are:
b) A star/snowflake schema diagram for design task A (you can use Lucidchart to draw the star schema).
c) A list of suggestions for increasing the granularity of your fact tables for design task B.

2. Implement the design task A star/snowflake schema using SQL. You are required to implement the star/snowflake schema that you have drawn in design task A.
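The bucketed dimension attributes listed earlier (season, price range, stay duration, and so on) are simple range mappings. In the warehouse itself these would typically be CASE expressions in your dimension-loading SQL; purely as an illustration of the rules, here is a sketch with hypothetical helper names (the boundary handling at exactly $100/$200 follows the spec's "less than / to / more than" wording):

```cpp
#include <string>

// Hypothetical helpers mirroring the dimension buckets in the spec.
// Seasons follow the Southern-Hemisphere months given above.
std::string ListingSeason(int month) {
  if (month >= 9 && month <= 11) return "Spring";
  if (month == 12 || month <= 2) return "Summer";  // months 12, 1, 2
  if (month >= 3 && month <= 5) return "Autumn";
  return "Winter";  // months 6 to 8
}

std::string PriceRange(double price) {
  if (price < 100) return "low";
  if (price <= 200) return "medium";  // $100 to $200
  return "high";                      // more than $200
}
```

The same pattern applies to the stay-duration and booking-cost ranges; each bucket becomes one attribute value in the corresponding dimension table.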
This implies that you need to create the fact and dimension tables in SQL. The output is a series of SQL statements to perform this task. You will also need to show that this task has been carried out successfully.
Note:
● If your account is full, you will need to drop all of the tables that you have previously created during the tutorials.
● If you have dropped all tables in your account and you still encounter ORA-01536: space quota exceeded for tablespace 'TABLE_NAME', please check your SQL code to see whether you have properly joined all tables. This issue is mainly caused when the table join is not done properly, as the number of records multiplies during the process.
The outputs of this task for the Report are:
a) Screenshots of the table structure you created for Design Task A, including the dimension tables and fact tables. A sample of screenshots of the table structure

Conduct a data analysis using the star schema you created in Design Task A by writing SQL queries to explore the data further. Present your findings in a clear and concise manner, demonstrating your understanding of the dataset and highlighting any noteworthy observations or patterns. The outputs of this task for the Report are:
1. Findings report: A detailed explanation of your findings, including any significant observations or patterns identified during the analysis.

Submission Checklist
Step 1: Report (25% of the total score)
A combined pdf file saved as: YourstudentID_A2_report.pdf, containing all of the above tasks:
A. Cover page
B. If you have done the data cleaning process, explain the strategies you used in this process (you need to show the SQL to explore the operational database and the SQL of the data cleaning, as well as screenshots of the data before and after cleaning). Note that you are only required to find around 5 (five) data errors for this stage.
C. A star/snowflake schema diagram for design task A
D.
A list of suggestions for increasing the granularity of your fact tables for design task B
E. Screenshots of the table structure you created for Design Task A only, including the dimension tables and fact tables.
a. The SQL file for creating the star schema is NOT required in the submission
F. Findings report: A detailed explanation of your findings, including any significant observations or patterns identified during the analysis.

Step 2: Poster (35% of the total score)
A one-page standard A4 poster in PDF format, saved as: YourstudentID_A2_poster.pdf
Extract key information from the report you created and present it in a one-page poster. The poster must be in standard A4 size and in PDF format, and can be either landscape or portrait. The content should be clear and easy to understand. Avoid using technical jargon or complex language. Review the poster before submission to ensure it effectively communicates the key messages of your report.
Note: Ensure the poster content is consistent with the key structure and findings of your report, and choose an appropriate layout that organizes the information in a clear and logical manner. Maintain a good balance of text and visuals to enhance readability, and ensure all visuals are relevant and support the content of the poster. Label all visuals clearly and provide captions where necessary. Avoid overcrowding the poster with too much text or too many visuals, and ensure the poster is free of any grammatical or typographical errors.
Key guidance for designing a poster:
● What is the main theme/objective of the poster that you want to express?
● Who is your target audience for this poster?
● Do you really need all the details from your report on this poster?
Step 3: Video presentation (40% of the total score)
A five-minute video presentation in mp4 format, saved as: YourstudentID_A2_video.mp4
Based on the report and poster you have created, present your design and findings in a five-minute video presentation. Ensure you thoroughly understand both the report and the poster to effectively extract and communicate the key points.

Assignment Submission
The assignment must be submitted electronically through Moodle. Please ensure the following:
1. Step 1 output: A combined pdf file saved as: YourstudentID_A2_report.pdf
2. Step 2 output: A one-page standard A4 poster in PDF format saved as: YourstudentID_A2_poster.pdf
3. Step 3 output: A five-minute video presentation in mp4 format saved as: YourstudentID_A2_video.mp4
Zip all of the above files from steps 1 to 3, and name the ZIP folder A2_YourstudentID.zip.
● The submission of this assignment must be in the form of a single ZIP file. Only PDF and .mp4 files will be accepted within the zip file. No other formats will be accepted.
● You must ensure that you have all the files listed in this checklist before submitting your assignment to Moodle. Failure to submit a complete list of files will lead to mark penalties.
● It is important to note that our support hours are limited and we don't have the capacity to deal with submission issues outside of working hours.

Authorship
Late Penalty:
Special Consideration: All extensions / special considerations will now be handled by the central Spec Con team. Please do not email teaching staff to request an extension or special consideration. All special consideration requests should be made using the Special Consideration Application.
Please do not assume that submission of a Special Consideration application guarantees that it will be granted – you must receive an official confirmation that it has been granted.
Getting help and support: What can you get help for?
● Consultations with the Teaching Team
Talk to the Teaching Team: https://lms.monash.edu/course/view.php?id=162086&section=2
● English language skills
Talk to English Connect: https://www.monash.edu/english-connect
● Study skills
Talk to a learning skills advisor: https://www.monash.edu/library/skills/contacts
● Counselling
Talk to a counsellor: https://www.monash.edu/health/counselling/appointments
Test your knowledge: collusion (FIT No Collusion Module)
All the best for your Assignment!
You will be working alone for this project. This specification is subject to change at any time for additional clarification.
Desired Outcomes
● Exposure to using C++ std::string
● Exposure to GoogleTest
● Use of a git repository
● An understanding of how to develop Makefiles that build and execute unit tests
● An understanding of how to calculate edit distance
Project Description
You will be implementing a set of C++ string manipulation utilities like those available in python. To guide your development and to provide exposure to Test Driven Development, you will be developing GoogleTest tests to test your functions. You will also be developing a Makefile to compile and run your tests. You must use good coding practice by developing this project in a git repository. The string utility functions that you will have to develop are as follows:

// Returns a substring of the string str, allows for negative values as in
// python; end == 0 means to include the end of the string
std::string Slice(const std::string &str, ssize_t start, ssize_t end = 0);

// Returns the capitalized string as in python
std::string Capitalize(const std::string &str);

// Returns the upper- or lower-case strings as in python
std::string Upper(const std::string &str);
std::string Lower(const std::string &str);

// Returns the left/right/both stripped strings (white space characters are
// removed from left, right or both)
std::string LStrip(const std::string &str);
std::string RStrip(const std::string &str);
std::string Strip(const std::string &str);

// Returns the center/left/right justified strings
std::string Center(const std::string &str, int width, char fill = ' ');
std::string LJust(const std::string &str, int width, char fill = ' ');
std::string RJust(const std::string &str, int width, char fill = ' ');

// Returns the string str with all instances of old replaced with rep
std::string Replace(const std::string &str, const std::string &old, const std::string
&rep);

// Splits the string up into a vector of strings based on the splt parameter;
// if the splt parameter is an empty string, then split on white space
std::vector<std::string> Split(const std::string &str, const std::string &splt = "");

// Joins a vector of strings into a single string
std::string Join(const std::string &str, const std::vector<std::string> &vect);

// Replaces tabs with spaces, aligning at the tabstops
std::string ExpandTabs(const std::string &str, int tabsize = 4);

// Calculates the Levenshtein distance (edit distance) between the two
// strings. See https://en.wikipedia.org/wiki/Levenshtein_distance for
// more information.
int EditDistance(const std::string &left, const std::string &right, bool ignorecase = false);

The Makefile you develop needs to implement the following:
● Must create an obj directory for object files (if it doesn't exist)
● Must create a bin directory for binary files (if it doesn't exist)
● Must compile the string utils file and string utils tests using C++17
● Must link the string utils and string utils tests object files to make the teststrutils executable
● Must execute the teststrutils executable
● Must provide a clean target that will remove the obj and bin directories

You can unzip the given zip file with utilities on your local machine, or if you upload the file to the CSIF, you can unzip it with the command: unzip proj1.zip
You must submit the source file(s), your Makefile, README.md file, and .git directory in a zip archive. Do a make clean prior to zipping up your files so the size will be smaller. You can zip a directory with the command: zip -r archive-name.zip directory-name
You should avoid using existing source code as a primer that is currently available on the Internet. You MUST specify in your README.md file any sources of code that you have viewed to help you complete this project.
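For the EditDistance declaration above, one standard approach is the two-row dynamic-programming formulation of Levenshtein distance. This is a sketch of one possible implementation, not the required one; the case-insensitive path assumes ASCII-style tolower semantics:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Classic DP Levenshtein distance, optionally case-insensitive.
// Keeps only two rows of the DP table, so memory is O(right.size()).
int EditDistance(const std::string &left, const std::string &right,
                 bool ignorecase = false) {
  auto norm = [ignorecase](char c) {
    return ignorecase
        ? static_cast<char>(std::tolower(static_cast<unsigned char>(c)))
        : c;
  };
  const size_t n = left.size(), m = right.size();
  std::vector<int> prev(m + 1), cur(m + 1);
  for (size_t j = 0; j <= m; ++j) prev[j] = static_cast<int>(j);
  for (size_t i = 1; i <= n; ++i) {
    cur[0] = static_cast<int>(i);
    for (size_t j = 1; j <= m; ++j) {
      const int cost = (norm(left[i - 1]) == norm(right[j - 1])) ? 0 : 1;
      cur[j] = std::min({prev[j] + 1,          // deletion
                         cur[j - 1] + 1,       // insertion
                         prev[j - 1] + cost}); // substitution
    }
    std::swap(prev, cur);
  }
  return prev[m];
}
```

Writing your GoogleTest cases against small known pairs (e.g. "kitten"/"sitting" has distance 3) is an easy way to validate whichever implementation you choose.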
You MUST properly document ALL uses of Generative AI following the guidelines outlined in the Generative AI Restrictions. All class projects will be submitted to MOSS to determine if students have excessively collaborated. Excessive collaboration, or failure to list external code sources, will result in the matter being referred to Student Judicial Affairs.
Recommended Approach
The recommended approach is as follows:
1. Create a git repository and add the provided files.
2. Create a Makefile to meet the specified requirements. Since no tests have been written yet, all tests should pass.
3. Write tests for each of the functions. Each test you write should fail initially. Make sure to have sufficient coverage of the possible input parameters.
Grading
Your submission will be autograded. Make sure your code compiles on Gradescope and passes all the test cases.
Helpful Hints
● Read through the guides that are provided on Canvas
● See http://www.cplusplus.com/reference/, it is a good reference for C++ built-in functions and classes
● Use length(), substr(), etc. from the string class whenever possible.
● If the build fails, there will likely be errors; scroll back up to the first error and start from there. It will output the line string "FILE @ line: X" where FILE is the source filename and X is the line number the code is on. You can copy and paste it in multiple places and it will output that particular line number when it is on it.
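As a worked example of the python-style semantics in the Project Description, here is a minimal sketch of Slice. It assumes python-like clamping of out-of-range indices, which the spec does not state explicitly, so treat the boundary behavior as an assumption to pin down with your own tests:

```cpp
#include <algorithm>
#include <string>
#include <sys/types.h>  // ssize_t (POSIX)

// Python-style slice: negative indices count from the end of the string,
// and end == 0 means "through the end of the string".
std::string Slice(const std::string &str, ssize_t start, ssize_t end = 0) {
  const ssize_t len = static_cast<ssize_t>(str.size());
  if (start < 0) start += len;
  if (end <= 0) end += len;  // end == 0 -> whole tail; negative -> from end
  // Clamp to [0, len] the way python does, so out-of-range slices are empty
  // rather than errors.
  start = std::max<ssize_t>(0, std::min(start, len));
  end = std::max<ssize_t>(0, std::min(end, len));
  if (start >= end) return "";
  return str.substr(static_cast<size_t>(start),
                    static_cast<size_t>(end - start));
}
```

For example, Slice("hello", 1, -1) corresponds to python's "hello"[1:-1], and Slice("hello", -3) to "hello"[-3:].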
DNA Search
This project explores pattern-matching techniques to find a pattern within a DNA sequence composed of the alphabet {A, C, G, T}.
Example: Consider the following DNA sequence:
ATGACGATCTACGTATGGCAGCCACGCTTTTGATGTTAAGTCACACAGCCAAGTCAACAAGGGC
GACTTCATGATCTTTCCGCTCCGTTGGTGTAGGCCCGTGTTCAAATTCAATGGCTGATTGGAAT
TACCTTTGAAATACTCCAACCGACCGCCACGGCCAGGGTCCCGCTCGCTCTCTGTGGCCCTCCC
ACAAAACTCCGGTGAAAGTTGATTTGGACACGGACCCAAAGCAGCGTAGATTATTCGAGCGTAT
TCGGTAGTCATTGAGGCCCCAA
The pattern "GCTTTT" is found at index 27 (where the first character of the sequence is at index 0).
Note: Overlapping matches are treated as separate occurrences. For instance, in the sequence 'AAAAAA' with the pattern 'AAA', there are 4 occurrences at indices 0, 1, 2, and 3.
You will write a C program and a RISC-V program to identify the indices where a given pattern appears in a DNA sequence.
Strategy
1. Pre-coding Analysis: Before writing any code, analyze the task requirements and constraints. Mentally explore various approaches and algorithms, considering their potential performance, code length, and storage costs. There are often trade-offs between these metrics.
2. High-Level Language (HLL) Implementation: Choose a promising approach and first implement it in a high-level language (e.g., C) to deepen understanding. HLL implementations are more flexible for exploring solutions and should be developed before creating the assembly version, where design changes are more difficult. For P1-1, you will write a C implementation of the program.
3. Assembly Translation: Once a working C version is completed, "be the compiler" to translate it into RISC-V assembly. This step helps understand how HLL constructs map to machine-level instructions and offers opportunities to optimize performance and efficiency. You will write the assembly version for P1-2.
P1-1: High-Level Language Implementation
1.
Development Approach: Start with a simple implementation to understand the problem, then experiment with optimizations. Time spent here can save significant effort during assembly coding.
2. Shell Program: Use the provided P1-1-shell.c as a template. Rename it to P1-1.c and modify it by adding your code. The shell auto-generates a 10240-character DNA sequence and a random pattern (3-10 characters) using the ASCII characters A, C, G, T.
3. Match Function Requirements: Implement the Match function with four parameters:
● Pointer to the pattern string
● Pattern length
● Pointer to the DNA sequence
● Sequence length
The function must store the matching indices in ascending order in the global array MatchIndices, ending with -1. Example: For the sequence "AACAAC" and pattern "AAC", MatchIndices should be [0, 3, -1].
4. Grading Considerations:
● Do not modify the provided print statements, as they are used for grading.
● Use the DEBUG flag to wrap debug prints (set to 0 in submitted code).
● Submission requirements: File named P1-1.c. Compile and run with gcc on Linux (no warnings). Self-contained code (no header files).
P1-2: Assembly Level Implementation
1. Shell Program: Use the provided P1-2-shell.asm as a starting point; rename it to P1-2.asm. The assembly program uses ecall to generate a random DNA sequence (4800 characters, packed as 2-bit nucleotides) and a pattern (3-7 nucleotides).
2. ECALL Routines:
● 512 (Generate DNA Sequence): a0: Address for the pattern (stored as right-to-left 2-bit pairs). a1: Address for the sequence (4800 characters, 600 words with the lower 16 bits as 2-bit nucleotides). a2: Pattern length (3-7).
● 513 (Verify Solution): a3: Address of the MatchIndices array (sorted indices + -1). Outputs debug messages for correctness.
● 552 (Highlight Letter): a6: Offset in the sequence (0-4799) for debugging visualization.
3. Memory Constraints: The Matches array must be allocated 128 (do not modify the size).
4.
Performance Metrics:
● Baseline Scores:
Static code size: 44 instructions
Dynamic execution length: 45740 instructions
Storage: 743 words
● Scoring Formula:
\[ \text{PercentCredit} = 2 - \frac{\text{Metric}_{\text{Your Program}}}{\text{Metric}_{\text{Baseline Program}}} \]
● Accuracy (25 points): Reduced by 10% per failed trial (out of 100), with style deductions possible.
● Performance scores are adjusted by trial failures (10% deduction per error; no credit for ≥10 failures).
5. Submission Requirements:
● File named P1-2.asm.
● Call ecall 513 to report results and use jalr to exit.
● No infinite loops or simulator errors.
Project Grading
| Part Description | Percentage |
|--------------------------------|------------|
| P1-1 (C code: correctness/style) | 25% |
| P1-2 (Assembly: correctness/style) | 25% |
| Static code size | 15% |
| Dynamic execution length | 25% |
| Operand storage requirements | 10% |
| Total | 100% |
Honor Policy
All code must be independently designed, implemented, and tested by the student. Use of AI tools, code sharing, or collaboration constitutes academic misconduct.
ECE2035 Project One
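The Match function described in P1-1 can be sketched with a naive scan that also handles the overlapping-match rule from the example (written in C-compatible style; the MAX_MATCHES capacity is illustrative, not the project's limit):

```cpp
#include <string.h>

#define MAX_MATCHES 1024  /* illustrative capacity, not the project's limit */
int MatchIndices[MAX_MATCHES];

/* Naive sketch of Match: records every (possibly overlapping) occurrence
   of pattern in sequence, in ascending order, terminated by -1. */
void Match(const char *pattern, int plen, const char *sequence, int slen) {
  int count = 0;
  int i;
  for (i = 0; i + plen <= slen && count < MAX_MATCHES - 1; ++i) {
    if (memcmp(sequence + i, pattern, (size_t)plen) == 0) {
      MatchIndices[count++] = i;
    }
  }
  MatchIndices[count] = -1;
}
```

Because the loop advances one position at a time rather than skipping past each match, "AAA" in "AAAAAA" yields the four overlapping occurrences the spec requires. Smarter scanning (e.g. skipping based on mismatched characters) is where the performance metrics can be improved.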
Please follow the steps below to test your Project-3.
1. Download the zip of this git repository.
2. Unzip the repository.
3. Implement your kernel module code in proc_filesys.c (i.e. all //TODO parts; DO NOT change any other part).
4. Modify Your_Name and ASU_ID in test.sh, and then run the test.sh script with the test arguments.
Test Cases:
| | Command | Description | Points |
|---|---------|-------------|--------|
| | ./test.sh 1 | write to kernel (25 pts); read entire content (25 pts); read from head (25 pts); read from middle (25 pts) | 100 |
| 4 | sudo ./test.sh 2 | Bonus! write beyond the size limit, return: Invalid argument | 0.5 extra point of your CSE330 final total grades |
| 5 | sudo ./test.sh 3 | Bonus! read from tail | 0.5 extra point of your CSE330 final total grades |
Note: Sample output screenshots are provided for Test Case 1, Test Case 2, and Test Case 3.