Project

Implement the following Vehicle Control system with the specifications listed below.

1. Ask the user if he/she wants to:
   a. Turn on the vehicle engine
   b. Turn off the vehicle engine
   c. Quit the system
2. If the user chose “Quit the system”: quit the program.
3. If the user chose “Turn off the vehicle engine”: ask him/her again what he/she wants to do (Requirement 1).
4. Once a choice has been made, print the system state on screen.
5. If the user chose “Turn on the vehicle engine”, display the “Sensors set menu”, a menu that simulates the vehicle sensor readings:
   a. Turn off the engine
   b. Set the traffic light color
   c. Set the room temperature (Temperature Sensor)
   d. Set the engine temperature (Engine Temperature Sensor)
6. While the engine is ON, the menu in requirement 5 must always be displayed and wait for an answer.
7. Based on the answer to requirement 6:
   a. Based on traffic light data (take it as input from the console; we will assume that this is the sensor read value):
      i. If the traffic light is ‘G’, set vehicle speed to 100 km/hr
      ii. If the traffic light is ‘O’, set vehicle speed to 30 km/hr
      iii. If the traffic light is ‘R’, set vehicle speed to 0 km/hr
   b. Based on room temperature data (take it as input from the console; we will assume that this is the sensor read value):
      i. If the temperature is less than 10, turn the AC ON and set the temperature to 20
      ii. If the temperature is greater than 30, turn the AC ON and set the temperature to 20
      iii. Otherwise, turn the AC OFF
   c. Based on engine temperature data (take it as input from the console; we will assume that this is the sensor read value):
      i. If the temperature is less than 100, turn the “Engine Temperature Controller” ON and set the temperature to 125
      ii. If the temperature is greater than 150, turn the “Engine Temperature Controller” ON and set the temperature to 125
      iii. Otherwise, turn the “Engine Temperature Controller” OFF
   d. If the vehicle speed is 30 km/hr:
      i. Turn ON the AC if it was OFF and set the room temperature to: current temperature * (5/4) + 1
      ii. Turn ON the “Engine Temperature Controller” if it was OFF and set the engine temperature to: current temperature * (5/4) + 1
   e. Display the current vehicle state after applying 7.a to 7.d:
      i. Engine state: ON/OFF
      ii. AC: ON/OFF
      iii. Vehicle Speed
      iv. Room Temperature
      v. Engine Temperature Controller State
      vi. Engine Temperature
8. If the user chose “Turn off the engine” in the menu of requirement 5, the menu of requirement 1 must be displayed.
9. Bonus Requirement: Create #define WITH_ENGINE_TEMP_CONTROLLER. If this #define is 1, compile/run the code lines that are related to the “Engine Temperature Controller”; otherwise do not compile/run them. (This is the code that implements 5-d, 7-c, 7-d-ii, 7-e-v and 7-e-vi.)

Notes
• To get a character input use:
    printf("a. Turn on the vehicle engine\n");
    printf("b. Turn off the vehicle engine\n");
    printf("c. Quit the system\n");
    scanf(" %c", &input);
  Make sure you leave a space before %c to prevent scanf from taking the newline (Enter) left over from earlier input as the character read.
• For the bonus requirement, search for how to use preprocessor directives in the C language, like the one below:
    #if (CONDITION)
    …
    #endif
  This topic will be discussed later in the C For Embedded Systems (Embedded C) Course.

Thanks and Good Luck
Eng / Mohamed Tarek
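The decision rules in requirement 7 can be modeled separately from the menu loop. The sketch below is in Python for illustration only (the project itself is to be written in C). The function names are hypothetical, and it assumes one reading of 7.d: the controller is switched ON and the temperature is updated whenever the speed is 30 km/hr.

```python
def speed_for_light(color):
    """Map a traffic light reading ('G', 'O', 'R') to a speed in km/hr (7.a)."""
    speeds = {"G": 100, "O": 30, "R": 0}
    return speeds[color.upper()]


def regulate(temp, low, high, target):
    """Return (controller_on, new_temp) for a sensor reading (7.b and 7.c).

    The controller turns ON and drives the value to `target` when the
    reading is below `low` or above `high`; otherwise it turns OFF and
    the reading is left unchanged.
    """
    if temp < low or temp > high:
        return True, target
    return False, temp


def speed_30_adjust(temp):
    """Rule 7.d (assumed reading): controller ON, temp = temp * (5/4) + 1."""
    return True, temp * (5 / 4) + 1


# Room temperature uses the 10..30 band, engine temperature the 100..150 band.
print(speed_for_light("O"))          # speed for an orange light
print(regulate(5, 10, 30, 20))       # cold room: AC ON, set to 20
print(regulate(120, 100, 150, 125))  # engine in range: controller OFF
print(speed_30_adjust(20))           # 20 * 1.25 + 1
```

One detail matters when porting this to C: with integer literals, `(5/4)` evaluates to 1 in C because of integer division, so if the intended factor is 1.25 the expression needs a floating-point operand such as `5.0/4`.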
● Only some problems are given now, so you can get started. More problems will be added next week.
● Any changes/clarifications to the problems will be highlighted in blue.
● For these problems, as before, you will have to create your own test data.
● If any details are missing, you can make assumptions about them and state them as comments at the top of the program.

Question 1. The canteen in the Institute maintains a table of prices of items, like:
1. Samosa: 15
2. Idli: 30
3. Maggie: 50
4. Dosa: 70
…
For the program you have to write, set the menu in your program by this statement (feel free to add more items).
menu = [("Samosa", 15), ("Idli", 30), ("Maggie", 50), ("Dosa", 70), ("Tea", 10), ("Coffee", 20), ("Sandwich", 35), ("ColdDrink", 25)]
Maggie, 1, Rs 50
Samosa, 5, Rs 75
TOTAL, 6 items, Rs 125

Question 2. Write a program that will repeatedly take (for a student) input of this type: course_no, number_of_credits, and grade_received (e.g. CSE101 4 A). These inputs are for the courses the student has done in the semester. If no input is given (i.e. just a return), that means no courses are left. From this data, print the transcript for the semester for the student as:
Course_no: grade
Course_no: grade
….
SGPA: n.nn (two decimal places)
In the transcript, the records should be printed sorted by course_no (hint: when sorting a list of tuples/lists, the default sorting is by the first element of the tuple/list).
First just write code to read the input, split it into a list, and pass it to a function to check for errors (recall that a string has operations like isdigit() to check whether it is an integer). If the input is valid, add it to a list of tuples (initially, to check the course name structure, just check that it is an alphanumeric string; later write a function to check its valid structure, which requires a bit of logic). Independently write a function which, given a list of tuples, can compute the SGPA. Then combine the two.
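The approach suggested for Question 2 (validate each record, then compute the SGPA independently) could be sketched as follows. The grade-point table is hypothetical, since the assignment does not give one, and the SGPA is assumed to be the credit-weighted average of grade points. The function names are illustrative.

```python
# Hypothetical grade-point scale; the assignment does not specify one.
GRADE_POINTS = {"A": 10, "A-": 9, "B": 8, "B-": 7, "C": 6, "C-": 5, "D": 4, "F": 2}


def parse_record(line):
    """Return (course_no, credits, grade) for a valid line, else None."""
    parts = line.split()
    if len(parts) != 3:
        return None
    course, credits, grade = parts
    if not course.isalnum() or not credits.isdigit() or grade not in GRADE_POINTS:
        return None
    return (course, int(credits), grade)


def compute_sgpa(records):
    """Credit-weighted average of grade points (assumed SGPA definition)."""
    total_credits = sum(credits for _, credits, _ in records)
    if total_credits == 0:
        return 0.0
    weighted = sum(credits * GRADE_POINTS[grade] for _, credits, grade in records)
    return weighted / total_credits


def format_transcript(records):
    """Transcript lines sorted by course_no, followed by the SGPA."""
    lines = [f"{course}: {grade}" for course, _, grade in sorted(records)]
    lines.append(f"SGPA: {compute_sgpa(records):.2f}")
    return "\n".join(lines)


records = [parse_record("MTH100 2 B"), parse_record("CSE101 4 A")]
print(format_transcript(records))
```

In the full program, the list of records would be filled by a loop that calls `input()` until an empty line is entered and skips any line for which `parse_record` returns `None`.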
Question 3: Before you graduate, you get signatures from all your friends in your yearbook, which will be stored as a dictionary for the program. In this dictionary, the keys are the names of the other students in the batch; the value is either 0 (for not signed) or 1 (for signed). Like you, all other students are also doing the same, and there is a dictionary entry for each. For creating this dictionary, input is given in a file. If there are N students, then the file contains:
name1: name2, 1 name3, 0 name4, 1 …
name2: name1, 0 name3, 0 name4, 0 …
(Data like this is there for all the graduating students.) Write a program to determine who has the most signatures and who has the least (if there is more than one for max/min, print all).
Suggestion. Initially create the dictionary manually as yearbook = {name1: {…}, name2: {….}, …}, and then determine the students with the most/least signatures. Then write a function to read the file and create this dictionary.

Question 4. The goal of this game is for the user (player) to guess a 5-character word in a limited number of chances. Pick at least 50 5-character words (e.g. from https://7esl.com/5-letter-words/ ) and save them in a list or any other structure (you can assign them in a statement, i.e. hard code them). Select a word at random from this list. For this, generate a random index between 0 and one less than the length of the list of words using the random number generator randint() provided in Python (please look it up; it is quite easy). Note that randint(a, b) includes both endpoints. This is the index in the list of the word the user has to guess. Now prompt the user to input a five-character string as his/her guess.
After the guess, the program should show the user the characters from the guess string which are in the correct places, and the correct characters in the wrong places, by outputting:
--a-d (if a and d were in the input string and are in these two places in the word), other characters present: b (if b is present in the word)
Let the user repeatedly give guesses, till either the guess is correct or the number of tries reaches 6.
Bonus: Accept from the user only valid words as guesses. For this, use an available online dictionary API to check whether the given string is a word or not. If it is not, prompt the user again and do not count this attempt.

Question 5. You are given a list of coordinates (each as an (x, y) tuple) representing a 2D shape. If the shape has N coordinates, create an Nx3 matrix with the first column holding the x values, the second column holding the y values, and the third column being 1 for all rows. This represents the matrix for the shape. You have to scale this 2D shape. For scaling, take as input cx and cy (scaling parameters) from the user, and form a matrix of the type:
cx 0 0
0 cy 0
0 0 1
For scaling the shape, multiply the matrix of the shape with this matrix. Finally, return the new shape in the form of a list of coordinates, which will be the first two columns of the resultant matrix.
Suggestion. While you can easily find matrix multiplication on the net, try first to write your own function for this; use online help only if needed.

Question 6 (program for grading).
For the IP course, the performance of students in different elements of the class is given in a file (IPmarks.txt), one line per student as comma-separated values:
Roll_no, m1, m2, m3, …
Separately, the weight of each of the assessments is given as a list of tuples (you can hard code this in your program; assume that the correct number of items is provided):
wts = [(10, 5), (20, 5), (100, 15), (40, 10), …]
Roll_no, total_marks, grade
For grading, assume that A is above 80%, A- from 70, B from 60, B- from 50, C from 40, C- from 35, D from 30, and below 30 is F.

Question 7 (Your personal address book): Your program should provide the following operations: (i) insert a new entry, (ii) delete an entry, (iii) find all matching entries given a partial name, (iv) find the entry with a phone number or email, and (v) exit. When the program exits, it writes the current dictionary to a file (addrbook.txt), and when it is started again next time, it reads the existing address book.
Suggestion: To create an address book with some entries, write a small script which will call the add function to add a few entries to create a dictionary.
Bonus. Write a program to merge the address book of your friend with yours. For this, you can store the address book as JSON or as a list of dictionary items. If both you and your friend agree on the structure, merging will be easier.

Question 8. (A simplified page rank system.) Given a text file (pages.txt) which has lines of the form:
URLnn, init_importance: text containing some URLnn.
Each line represents a page, with the first URLnn being its URL and init_importance a number between 0 and 1.0 which gives the initial importance of this page. The URLnn in the rest of the line refers to the URL links to other pages that this page contains. (To simplify, instead of giving a full path, we are using URLnn, and instead of having separate files for each page, we are giving the text of a page as a line in the file.)
Example:
URL02, 0.6: This is another page (page is represented as a line in this). This has reference to URL05, URL04, and URL00 …
Pages are ranked according to their overall importance. Let the total number of unique links in a page i be links[i] (i.e. the number of unique pages this page refers to). The overall importance of a page i is the sum, over all the pages j which have a link to page i, of: init_importance[j] / links[j] (i.e. the importance of page j is distributed equally among all the pages to which page j has a link). Your program has to find the highest ranking N pages (N can be set in a variable or taken as input) for a given input file. It seems natural to create a dictionary with the page URL as the key, holding its init_importance, overall importance, the set of URLs it links to, etc.
Note. This simplified version has only one round of computation for the overall importance from the init_importance. The ranking algorithm actually takes the current importance (starting with init_importance, which is often set to 1) and then updates the importance repeatedly till the changes are very small. If you want, you can extend your program to implement this.

Bonus problem/mini-project. You can work in groups of 2 (or 3 max), i.e. all students in a group can submit the same code; at the top of the code have a comment stating the names of all students who worked together. Write an interesting application using some public APIs. The application should be interactive, i.e. it gives the user some query options, and gives the result of the query after fetching the data using APIs and processing it. There should be more than one API call in this application, preferably from different base APIs (i.e. APIs from different organizations).
Using the music sites’ APIs, develop an application that can get the list of songs that were sung by some singer and whose lyrics were written by some poet. Either of the fields, if not specified, should give songs for all singers/poets.
Give this list as a CSV specifying the song (generally by its name or the starting lines – whatever the site uses), the singer, the lyrics_writers/poet, the year it was sung, and whatever other info is there. (It will be good if songs from multiple sites are in this list – will make it more complete.) (It will be nice if you can also give a YouTube link for listening/viewing this song).
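For Question 8, the one-round importance computation can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, links are extracted with a simple `URL` plus digits pattern, and links to pages that do not appear in the file are ignored when distributing importance.

```python
import re


def parse_pages(lines):
    """Build {url: {"init": float, "links": set_of_urls}} from file lines."""
    pages = {}
    for line in lines:
        head, _, text = line.partition(":")
        url, init = (part.strip() for part in head.split(","))
        pages[url] = {"init": float(init), "links": set(re.findall(r"URL\d+", text))}
    return pages


def overall_importance(pages):
    """Each page j gives init[j] / links[j] to every page it links to."""
    overall = {url: 0.0 for url in pages}
    for page in pages.values():
        if not page["links"]:
            continue
        share = page["init"] / len(page["links"])
        for target in page["links"]:
            if target in overall:  # assumption: skip links to unknown pages
                overall[target] += share
    return overall


def top_pages(pages, n):
    """URLs of the n highest-ranked pages (ties broken by URL)."""
    overall = overall_importance(pages)
    return sorted(overall, key=lambda url: (-overall[url], url))[:n]


sample = [
    "URL00, 0.5: links to URL01 and URL02",
    "URL01, 1.0: see URL02",
    "URL02, 0.6: nothing here",
]
print(top_pages(parse_pages(sample), 2))
```

The iterative version mentioned in the note to Question 8 would call a function like `overall_importance` repeatedly, feeding each round's result back in as the current importance until the changes become small.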
Examine the following relation and its attributes and answer the following questions. Assume these are the values for “all time”. Assume girls with the same name are the same person.

GIRL | GROUP | AGE | GAME | CATEGORY | PRICE
Charlotte | 5 year olds | 5 | Mirror | Makeup | 4.88
Susan | 6 year olds | 6 | Lipstick | Makeup | 5.95
Jane | 5 year olds | 5 | Chess | Games | 7.55
Susan | 6 year olds | 6 | Checkers | Games | 5.95
Susan | 6 year olds | 6 | Mirror | Makeup | 4.88
Carrie | 6 year olds | 6 | Lipstick | Makeup | 5.95
Jacqueline | 5 year olds | 5 | Visual Basic | Prog. Languages | 199.99

For each question below, please be sure to review the course content for additional reference material.
1) Is this relation in at least 1NF? Why or why not?
2) What is the primary key of the initial relation (assume the values shown are the only possible tuples for all time)? Remember that a primary key must be unique and not null.
3) Describe the specific data anomalies that exist if we DELETE the tuple containing Jacqueline.
4) Draw a functional dependency diagram for the initial relation. This diagram should agree with the primary key you selected above. This can be drawn in any drawing tool.
5) Based on your diagram, what normal form is the initial relation in? Why?
6) If necessary, decompose the initial relation into a set of non-loss 3NF relations by showing the relations, attributes, and tuples. Show complete relations with attribute headings and all data values in the tuples of your relations. Determine the number of 3NF relations you end up with after normalization, write this number, and then circle the number.

Additional References:
• What is Normalization? https://www.guru99.com/database-normalization.html
• What is a Primary Key? https://www.techopedia.com/definition/5547/primary-key
• What are data anomalies? https://databasemanagement.fandom.com/wiki/Data_Anomalies
• What is a functional dependency diagram? https://www.dlsweb.rmit.edu.au/Toolbox/ecommerce/dad_respak/dad_tutorial/html/dad_db3f_tut.htm#:~:text=A%20set%20of%20Functional%20Dependencies,the%20direction%20of%20the%20dependency.
• The Third Normal Form. https://www.geeksforgeeks.org/third-normal-form-3nf/

Grading rubric

Attributes | Meets | Does Not Meet
Normal form | 20 points: Student correctly identifies normal form of initial relation | 0 points: Major error in identification of normal form or not specified
Primary key | 25 points: Student correctly identifies primary key of initial relation | 0 points: Major error with identification of primary key or not specified
Data anomalies | 15 points: Student correctly describes data anomalies | 0 points: Major errors with description of data anomalies or not specified
Functional dependency diagram | 15 points: Student correctly develops functional dependency diagram of initial relation | 0 points: Major errors developing functional dependency diagram or not specified
Normalized 3NF relations | 25 points: Student correctly develops the proper set of 3NF relations via normalization | 0 points: Major errors in development of proper set of 3NF relations or not specified
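Because the exercise states that the listed tuples are the values for “all time”, candidate keys and functional dependencies can be tested mechanically against the data. The sketch below is one way to do that; the helper names `holds` and `is_unique` are hypothetical, and a check like this only supports your reasoning, it does not replace the written justification the questions ask for.

```python
COLUMNS = ("GIRL", "GROUP", "AGE", "GAME", "CATEGORY", "PRICE")
ROWS = [
    ("Charlotte", "5 year olds", 5, "Mirror", "Makeup", 4.88),
    ("Susan", "6 year olds", 6, "Lipstick", "Makeup", 5.95),
    ("Jane", "5 year olds", 5, "Chess", "Games", 7.55),
    ("Susan", "6 year olds", 6, "Checkers", "Games", 5.95),
    ("Susan", "6 year olds", 6, "Mirror", "Makeup", 4.88),
    ("Carrie", "6 year olds", 6, "Lipstick", "Makeup", 5.95),
    ("Jacqueline", "5 year olds", 5, "Visual Basic", "Prog. Languages", 199.99),
]
INDEX = {name: i for i, name in enumerate(COLUMNS)}


def project(row, attrs):
    """Values of the given attributes in one row, as a tuple."""
    return tuple(row[INDEX[a]] for a in attrs)


def holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key, value = project(row, lhs), project(row, rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_unique(rows, attrs):
    """True if no two rows agree on all of attrs (candidate key test)."""
    keys = [project(row, attrs) for row in rows]
    return len(set(keys)) == len(keys)


print(is_unique(ROWS, ["GIRL"]))          # is GIRL alone a key?
print(is_unique(ROWS, ["GIRL", "GAME"]))  # is (GIRL, GAME) a key?
print(holds(ROWS, ["GAME"], ["PRICE"]))   # does GAME determine PRICE?
```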
Ensure that your Gradescope submission contains the following file:
○ Lab3.asm

This lab will introduce you to RISC-V assembly language programming using RARS (RISC-V Assembler and Runtime Simulator). You will write a program with nested loops in the provided file Lab3.asm to write a specific pattern to a file named lab3_output.txt (which will be generated by running your submitted Lab3.asm file; lab3_output.txt does not exist prior to this action).

Make sure you properly follow the steps mentioned on the RARS site to download and install RARS on your machine. Please approach a TA or tutor in your lab section if you have issues installing RARS on your machine!

Since this is your very first foray into assembly programming, please read this document thoroughly without skipping any sections!

Much like how a high-level program has a specific file extension (.c for C, .py for Python), RARS-based RISC-V programs have a .asm extension.

In the Lab3 folder in the course Google Drive, you will see 7 assembly files. They are meant for you to read (and understand) in sequence.

Please download these files and make sure to open them in the RARS text editor only. Doing otherwise will cause comments and other important code sections to not be properly highlighted and can be a hindrance to learning assembly language intuitively. Steps for opening, assembling and running a .asm file are provided later in this document.

These 7 files have enough comments in the source code to jump-start your understanding of RISC-V assembly programming if the lectures have not yet covered certain topics in assembly programming.

Beyond these files, you should have all the required resources in the Lecture Slides themselves, in the lecture pages following the topic “Von Neumann and RISC-V”. These lecture slides are very self-explanatory. You are encouraged to read ahead even if the instructor hasn’t started discussing them in lecture.
You are also encouraged to read the excellent RARS documentation, which can be found by clicking “Help” in the RARS program, or at these URLs: https://github.com/TheThirdOne/rars/wiki and https://github.com/TheThirdOne/rars

For the usage of macros (which are utilized heavily in this lab to generate system calls referred to as ecalls), please also refer to the RARS documentation on macros and ecalls. For lab3, you don’t even need to know what the inside of a macro block looks like so long as you know what it is supposed to do overall.

Helpful tip: For lab3 and lab4, it is recommended that you create two separate folders on your machine, lab3 and lab4. Make each folder the workspace for your respective lab. So, for the given lab, place all the provided .asm files in the Lab3 folder along with a copy of the .jar RARS application file, and run RARS from there. This is where you will create your Lab3.asm file as well.

Figure 1: Ideal workspace setup for lab3/lab4

Henceforth, run all .asm files pertinent to Lab3 on this local copy of the .jar RARS application.

Open the RARS application. You should get the window below.

Figure 2: Opening the RARS application

Let us open firstRARSprogram.asm by clicking File -> Open.

Make sure the comments (which appear in green) are properly indented and haven’t been misaligned when you downloaded the file from the Google Drive. They should appear as shown below:

Figure 3: Opening an asm file on RARS

Make sure to thoroughly read the entire contents of this file in the text editor. Verbose comments have been provided to guide you along in explaining each step in the source code. This will be the norm for the other .asm files in the Lab3 folder in Google Drive as well.

After you have read and understood the source code, it is time to assemble the program. Before you assemble, go to Settings and make sure you have exactly the following options checked (or unchecked). For this course, you are allowed to use pseudo instructions.
Pseudo instructions are those instructions which are not native to the RISC-V instruction set but which the RARS environment defines as a combination of actual RISC-V instructions. Permit pseudo instructions (this actually makes coding easier for you). This should be done for every RARS program in CSE12!

Figure 4: RARS setting

Now click on Assemble (the wrench and screwdriver icon). If correctly assembled, your Messages window should show the following information:

Figure 5: Successful assembly

Now click on the Run button to run the program. You will get the following output:

Figure 6: Successful runtime

Now try running the other .asm files.

One word of caution: when your text editor contains multiple opened files, make sure you are assembling the correct assembly file. For example, in the window below, multiple files are open. If I want to assemble and run only add.asm, then my tab for add.asm should be highlighted as shown below. Only then can I click Assemble, then Run.

Figure 7: Multiple tabs open

RARS has a helpful feature where, instead of running the entire program at once, you can Run One Step At A Time. The corresponding button is beside the Run button. This allows you to sequentially execute each line of code and see how it affects the values of the registers as they appear on the right of your screen.

The file multiply.asm makes extensive use of macros to help create a more readable main program section (instructions on how to use macros are provided in the file comments). So does the source code in the files fileWriteDemo.asm, fileReadDemo.asm and patternDisplayDemo.asm (we will discuss the file reads and writes that these .asm files perform shortly). Based on how we define a macro in the source code, it is tempting to confuse it with a function. However, macros are NOT functions!
Whenever you place multiple instances of the same macro in your code, you are copying the macro’s contents into those code areas the same number of times.

When you want to open a new file in RARS, go to File->New. The default file name riscv1.asm shows up on the tab. When you save this file, you MUST make sure that you are explicitly defining the correct extension (.asm) as shown below.

Figure 8: Saving a new file in RARS

File creation and manipulation is a very common part of the learning curve whenever you learn a new high-level programming language, be it C or Python. For lab3, we will be writing the display pattern to a file so that it is more convenient for the autograder. The autograder within Gradescope will do a file text equality check between the files generated by your lab3 source code and the expected correctly generated files, and accordingly provide you points (or not!).

To give you a demo, we have two reference assembly source code files: fileWriteDemo.asm and fileReadDemo.asm. The former creates a file with the name fileDemo.txt. The following text is written into fileDemo.txt: “These pretzels are making me thirsty!”. The latter file, fileReadDemo.asm, contains code to open fileDemo.txt and extract this text to display on the output console of RARS.

The following two images show the results of having run fileWriteDemo.asm and then fileReadDemo.asm.

Figure 9: A new file generated in my workspace after running fileWriteDemo.asm. Note the file size to be shown as 1KB despite us having written only 38 bytes of data into it. That is because a file also contains metadata and a header generated by your OS as well.

Figure 10: RARS output console after running fileReadDemo.asm

Both fileWriteDemo.asm and fileReadDemo.asm use many macros within the source code to make the main .text section of the code more programmer friendly in terms of writing. For the purposes of lab3, you DO NOT need to understand WHAT these macros are doing within their definition block.
It suffices to know what the result of executing a macro in your source code is. However, understanding the macros does help to build your foundation in RARS programming.

One thing to note is that since lab3 does not focus on proper function coding in RISC-V assembly, it can get very difficult to keep track of unintentional changes to your register values. For instance, in C or Python, you can define a variable temp, assign it a specific value, and rest assured that this variable does not change from the assigned value at runtime unless explicitly told to. However, in a large assembly source file, working with a limited number of registers means that it is very difficult to keep track of each individual register value unless you are very careful.

We will deal with register preservation in lab4, but in lab3 you will only be asked to ensure that you do not use specific registers in your Lab3.asm source code. The list of these taboo registers will be highlighted in the later section on Lab3.

Besides the aforementioned 2 files related to file write and read, we also have a 3rd .asm file, patternDisplayDemo.asm. The source code in this file, once run, asks for an integer n as input and then prints the pattern “* “ n times horizontally.

Figure 11: Output console after running patternDisplayDemo.asm for user input n=3 and 7. In both cases, make sure to check the contents of the created file patternDisplay.txt as well.

Similar to patternDisplayDemo.asm, Lab3.asm will also make use of loops (nested loops, to be precise) to generate a pattern based on the value of n input by the user.
Thus, you should thoroughly read and understand the working of the source code in patternDisplayDemo.asm.

This program will print out a pattern with stars (asterisks, ASCII hex code 0x2a), blank spaces (ASCII hex code 0x20) and the newline character ‘\n’ (ASCII hex code 0x0a).

The actual task of opening the file lab3_output.txt and writing the contents to it is borne by macros used in the starter code included in the Lab3.asm file. Consider the screenshot of the Lab3.asm file below:

Figure 12: Lab3.asm screenshot

As you can see, you should write your code ONLY within the area indicated by the comments.

The way this code works regarding file manipulation is as follows:

When a file is created in RARS, it is assigned a file descriptor ID, in the form of an integer. Future references to this file through macros are then made by referring to this file descriptor ID. Once we create a file, we first need to set aside memory space within our RISC-V memory where the data to be written to the file is kept. This space is referred to as a “memory buffer”. In Lab3.asm, we have defined the memory space starting from address 0x10040000 as our internal memory buffer. Specifically, we hold a doubleword (64 bits) at this address which keeps track of how many bytes we intend to finally write to the file. From 0x10040008 onwards, we start collecting the bytes that will be written into the file lab3_output.txt.

In your student code, you can update the file buffer with any character using the macro write_to_buffer. For example, say I want to write the character sequence “**** ” to my memory buffer within Lab3.asm’s student code section. Then I would need to write the student code shown next:

Figure 13: Modified Lab3.asm screenshot

Run this Lab3.asm file and open the generated lab3_output.txt in a text editor. Specifically, if you are using Notepad++ (which is strongly recommended), make sure to apply the setting: View->Show Symbol->Show All Characters.
This will make characters such as null and newline visible.

Figure 14: lab3_output.txt screenshot from running modified Lab3.asm

As you can see, the blank space appears as an orange dot, newline as LF (Line Feed) and null as NUL.

You can see these characters as they reside in the file buffer in memory in RARS too, as shown below. If you go to the Execute window after running this modified Lab3.asm, select 0x10040000 for view and enable ASCII view, you will get the following screenshot:

Figure 15: “* ** * ” data as it resides in file buffer

Note that within each individual cell in the Data Segment matrix above, we should read the bytes from right to left.

NOTE: For your student starter code, you MUST NOT use any of the registers t0 to t6 or sp. The registers a0 to a7 should only be used temporarily to pass parameters to or receive parameters from macros. Using the registers s0 through s11 should be enough for the Lab3 assignment.

The following is a screenshot showing the runtime of the actual solved Lab3.asm code:

Figure 16: Solved Lab3.asm runtime demo

When we open the generated lab3_output.txt file, we get the following text:

Figure 17: lab3_output.txt screenshot

Your student code MUST display the prompts and error messages in response to user input EXACTLY as shown in Figure 16. Please make use of the provided strings in the .data section of the starter code in Lab3.asm to make sure you do not use any other type of sentence!

NOTE: Although you are not required to print each row of the pattern on your output console, doing so (as shown in Figure 16) will greatly help in real-time code debugging. So, it is strongly advised to do so.

The Lab3 folder in the Google Drive contains some test cases in the testCases subfolder for user inputs n=1, 3, 6, 8, 30.
Make sure your code output has exactly the same alignment of characters as provided there for the corresponding n input in your student code.

Note that our grading script is automated, so it is imperative that your program’s output matches the specification exactly. Output that deviates from the spec will cause point deductions.

Files to be submitted to your Lab3 Gradescope portal
Lab3.asm
- This file contains your pseudocode and assembly code. Include a header comment as indicated in the documentation guidelines here.

This is the lab assignment where most students start to get flagged for cheating. Please review the pamphlet on Academic Dishonesty and look at the examples in the first lecture for acceptable and unacceptable collaboration.

You should be doing this assignment completely by yourself!

The following rubric applies provided you have fulfilled all criteria in Minimum Submission Requirements. Failing any criterion listed in that section will result in an automatic grade of zero, which is not eligible for a regrade request.

20 pt: Lab3.asm assembles without errors (thus even if you submit Lab3.asm having written absolutely no student code, you would still get 20 pts!)
80 pt: output in file lab3_output.txt matches the specification:
  20 pt: error check zero and negative heights using the convention shown in Figure 16
  20 pt: prompts user until a correct input is entered as shown in Figure 16
  20 pt: number of rows matches user input (i.e., if n=6, the pattern would have 6 rows)
  20 pt: correct sequence of stars and newline characters on each row

All course materials and relevant files located in the Lab3 folder in the course Google Drive must not be shared by students outside of the course curriculum on any type of public domain site or for financial gain.
Thus, if any of the Lab3 documents is found on any type of publicly available site (e.g., GitHub, Stack Exchange), or shared for monetary gain (e.g., Chegg), then the original poster will be cited for misusing CSE12 course-based content and will be reported to UCSC for academic dishonesty.

In the case of sites such as Chegg.com, we have been able to locate course material shared by a previous quarter student. Chegg cooperated with us by providing the student’s contact details, which was sufficient proof of the student’s misconduct, leading to an automatic failing grade in the course.
CSE 112 Computer Organization

Introduction and Instructions
● This will be a group assignment; each student in the group will be marked separately. Therefore, try to make sure that the work is divided roughly equally among all the members of the group.
● In this assignment, you will have to design and implement a custom assembler and a custom simulator for a given ISA.
● You are not restricted to any programming language. However, your program must read from stdin and write to stdout.
● You must use GitHub to collaborate. You must track your progress via git.
● The automated testing infrastructure assumes that you have a working Linux-based shell. For those who are using Windows, you can use either a VM or WSL.

QUESTION DESCRIPTION:
There are a total of four questions in this assignment:
1. Designing and implementing the assembler.
2. Designing and implementing the simulator.
3. Extending the functionality of the assembler-simulator set-up to handle simple floating-point computations.
4. A bonus question based on the assembler and simulator. The bonus will be worth 10%.

TEST CASES:
We will release some test cases with the assignment so that you can test your implementations. During the evaluations, a superset of these test cases will be provided to you, on which you will be graded.

DEADLINES:
You will have two deadlines for this assignment:
1. The mid-evaluation:
   b. You will be tested mostly on the test cases already provided to you with the assignment.
   c. However, we might add some other test cases as well.
   d. You will only be evaluated on the assembler. (20%)
2. The final evaluation:
   b. You should also have completed Q3 (10%).
   c. You will be evaluated on a much larger set of test cases this time.
   d. You will also be evaluated on the bonus question at this stage.
❖ The mid-evaluation will be worth 20% of your final assignment grade. The final evaluation will be worth the remaining 80% of your final assignment grade. The bonus will be worth 10%, making the total 110%.
GRADING: a. Q1 and Q2 are mandatory questions. b. In Q1, you will have to make an assembler. c. In Q2, you have to make a simulator for which detailed information is mentioned in the respective questions. d. Q3 is an extension of the functionality of the assembler-simulator set-up that you build. e. Q4 is a bonus question. ● For Q1 and Q2: Grading will be based on the number of test cases that your program passes. a. Assembler: The test cases are divided into 3 sets: i. ErrorGen: These tests are supposed to generate errors ii. simpleBin: These are simple test cases that are supposed to generate a binary. iii. hardGen: These are hard test cases that are supposed to generate a binary. b. Simulator: The test cases are divided into 2 sets: ● For Q3: a. Assembler: The test cases are divided into 2 sets: ■ ErrorGen: These tests are supposed to generate errors ■ SimpleBin: These are simple test cases which are supposed to generate a binary. b. Simulator: The test cases are divided into single set: ■ simpleBin: These are simple test cases which are supposed to generate a trace. For Q4: ● You need to design 5 instructions on your own. ● Upgrade your Assembler to handle those instructions. ● Upgrade your simulators to simulate the newly defined instructions. ● You need to create a readme file for the instructions that you have created. EVALUATION PROCEDURE: 2) On the day of your demo, a compressed archive of all tests will be shared with you on google classroom. This archive will include other test cases as well, which will not be provided to you beforehand. 3) On the day of evaluation, you must b) Prove the integrity of the tests archive by computing the sha256sum of the archive. To compute the checksum, you can run “sha256sum 4) Then you can extract the archive and replace the “automatedTesting/tests” directory. 5) Then you need to execute the automated testing infrastructure, which will run all the tests and finally print your score. 1. 
Any copying of code from your peers or from the internet will invoke the institute’s

Assignment Description

ISA description:
Consider a 16 bit ISA with the following instructions and opcodes, along with the syntax of an assembly language which supports this ISA. The ISA has 6 encoding types of instructions. The description of the types is given later.

Opcode | Instruction | Semantics | Syntax | Type
00000 | Addition | Performs reg1 = reg2 + reg3. If the computation overflows, then the overflow flag is set and 0 is written in reg1. | add reg1 reg2 reg3 | A
00001 | Subtraction | Performs reg1 = reg2 - reg3. In case reg3 > reg2, 0 is written to reg1 and overflow flag is set. | sub reg1 reg2 reg3 | A
00010 | Move Immediate | Performs reg1 = $Imm where Imm is a 7 bit value. | mov reg1 $Imm | B
00011 | Move Register | Move content of reg2 into reg1. | mov reg1 reg2 | C
00100 | Load | Loads data from mem_addr into reg1. | ld reg1 mem_addr | D
00101 | Store | Stores data from reg1 to mem_addr. | st reg1 mem_addr | D
00110 | Multiply | Performs reg1 = reg2 x reg3. If the computation overflows, then the overflow flag is set and 0 is written in reg1. | mul reg1 reg2 reg3 | A
00111 | Divide | Performs reg3/reg4. Stores the quotient in R0 and the remainder in R1. If reg4 is 0 then overflow flag is set and content of R0 and R1 are set to 0. | div reg3 reg4 | C
01000 | Right Shift | Right shifts reg1 by $Imm, where $Imm is a 7 bit value. | rs reg1 $Imm | B
01001 | Left Shift | Left shifts reg1 by $Imm, where $Imm is a 7 bit value. | ls reg1 $Imm | B
01010 | Exclusive OR | Performs bitwise XOR of reg2 and reg3. Stores the result in reg1. | xor reg1 reg2 reg3 | A
01011 | Or | Performs bitwise OR of reg2 and reg3. Stores the result in reg1. | or reg1 reg2 reg3 | A
01100 | And | Performs bitwise AND of reg2 and reg3. Stores the result in reg1. | and reg1 reg2 reg3 | A
01101 | Invert | Performs bitwise NOT of reg2. Stores the result in reg1. | not reg1 reg2 | C
01110 | Compare | Compares reg1 and reg2 and sets up the FLAGS register. | cmp reg1 reg2 | C
01111 | Unconditional Jump | Jumps to mem_addr, where mem_addr is a memory address. | jmp mem_addr | E
11100 | Jump If Less Than | Jump to mem_addr if the less than flag is set (less than flag = 1), where mem_addr is a memory address. | jlt mem_addr | E
11101 | Jump If Greater Than | Jump to mem_addr if the greater than flag is set (greater than flag = 1), where mem_addr is a memory address. | jgt mem_addr | E
11111 | Jump If Equal | Jump to mem_addr if the equal flag is set (equal flag = 1), where mem_addr is a memory address. | je mem_addr | E
11010 | Halt | Stops the machine from executing until reset. | hlt | F

where reg(x) denotes register, mem_addr is a memory address (must be a 7-bit binary number), and Imm denotes a constant value (must be a 7-bit binary number).

The ISA has 7 general purpose registers and 1 flag register. The ISA supports an address size of 7 bits, which is double byte addressable. Therefore, each address fetches two bytes of data. This results in a total address space of 256 bytes. This ISA only supports whole number arithmetic. If the subtraction results in a negative number, for example “3 – 4”, the reg value will be set to 0 and overflow bit will be set. All the representations of the number are hence unsigned. The registers in assembly are named as R0, R1, R2, … , R6 and FLAGS. Each register is 16 bits.

Note: “mov reg $Imm”: This instruction copies the Imm (7 bit) value into the register’s lower 7 bits. The upper 9 bits are zeroed out.
Example: Suppose R0 has 1110_1010_1000_1110 stored, and mov R0 $13 is executed. The final value of R0 will be 0000_0000_0000_1101.

FLAGS semantics
The semantics of the flags register are:
● Overflow (V): This flag is set by {add, sub, mul, div} when the result of the operation overflows. This shows the overflow status for the last executed instruction.
● Less than (L): This flag is set by the “cmp reg1 reg2” instruction if reg1 < reg2 ● Greater than (G): This flag is set by the “cmp reg1 reg2” instruction if the value of reg1 > reg2 ● Equal (E): This flag is set by the “cmp reg1 reg2” instruction if reg1 = reg2 The default state of the FLAGS register is all zeros. If an instruction does not set the FLAGS register after the execution, the FLAGS register is reset to zeros. The structure of the FLAGS register is as follows: Unused 12 bits V L G E 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 The only operation allowed in the FLAGS register is “mov reg1 FLAGS”, where reg1 can be any of the registers from R0 to R6. This instruction reads FLAGS register and writes the data into reg1. All other operations on the FLAGS register are prohibited. The cmp instruction can implicitly write to the FLAGS register. Similarly, conditional jump instructions can implicitly read the FLAGS register. Example: R0 has 5, R1 has 10 Implicit write: cmp R0 R1 will set the L (less than) flag in the FLAGS register. Implicit read: jlt 0001001 will read the FLAGS register and figure out that the L flag was set, and then jump to address 0001001. Binary Encoding The ISA has 6 types of instructions with distinct encoding styles. However, each instruction is of 16 bits, regardless of the type. 
● Type A: 3 register type
    Fields: Opcode (5 bits) | Unused (2 bits) | reg1 (3 bits) | reg2 (3 bits) | reg3 (3 bits)
    Bits:   15-11 | 10-9 | 8-6 | 5-3 | 2-0
● Type B: register and immediate type
    Fields: Opcode (5 bits) | Unused (1 bit) | reg1 (3 bits) | Immediate Value (7 bits)
    Bits:   15-11 | 10 | 9-7 | 6-0
● Type C: 2 registers type
    Fields: Opcode (5 bits) | Unused (5 bits) | reg1 (3 bits) | reg2 (3 bits)
    Bits:   15-11 | 10-6 | 5-3 | 2-0
● Type D: register and memory address type
    Fields: Opcode (5 bits) | Unused (1 bit) | reg1 (3 bits) | Memory Address (7 bits)
    Bits:   15-11 | 10 | 9-7 | 6-0
● Type E: memory address type
    Fields: Opcode (5 bits) | Unused (4 bits) | Memory Address (7 bits)
    Bits:   15-11 | 10-7 | 6-0
● Type F: halt
    Fields: Opcode (5 bits) | Unused (11 bits)
    Bits:   15-11 | 10-0

Binary representation for the registers is given as follows:
    Register | Address
    R0 | 000
    R1 | 001
    R2 | 010
    R3 | 011
    R4 | 100
    R5 | 101
    R6 | 110
    FLAGS | 111

Executable binary syntax
The machine exposed by the ISA starts executing the code provided to it in the following format, until it reaches the hlt instruction. There can only be one hlt instruction in the whole program, and it must be the last instruction. The execution starts from the 0th address. The ISA follows the von Neumann architecture with a unified code and data memory. The variables must be allocated in the binary in the program order.
    Memory layout: code | halt (last instruction) | variables

Questions:
Q1: Assembler:
● Empty line: Ignore these lines
● A label
● An instruction
● A variable definition
Each of these entities has the following grammar:
● The syntax of all the supported instructions is given above. The fields of an instruction are whitespace separated. The instruction itself might also have whitespace before it. An instruction can be one of the following:
○ The opcode must be one of the supported mnemonics.
○ A register can be one of R0, R1, … R6, and FLAGS.
○ A mem_addr in jump instructions must be a label.
○ An Imm must be a whole number from 0 to 127 (it must fit in 7 bits).
○ A mem_addr in load and store must be a variable.
● A variable definition is of the following format:
    var xyz
which declares a 16 bit variable called xyz. This variable name can be used in place of mem_addr fields in load and store instructions. All variables must be defined at the very beginning of the assembly program.

The assembler should be capable of:
1) Handling all supported instructions
2) Handling labels
3) Handling variables
4) Making sure that any illegal instruction (any instruction (or instruction usage) which is not supported) results in a syntax error. In particular you must handle:
    a) Typos in instruction name or register name
    b) Use of undefined variables
    c) Use of undefined labels
    d) Illegal use of FLAGS register
    e) Illegal Immediate values (more than 7 bits)
    f) Misuse of labels as variables or vice-versa
    g) Variables not declared at the beginning
    h) Missing hlt instruction
    i) hlt not being used as the last instruction
You need to generate distinct readable errors for all these conditions. If you find any other illegal usage, you are required to generate a “General Syntax Error”. The assembler must print out all these errors.
5) If the code is error free, then the corresponding binary is generated. The binary file is a text file in which each line is a 16 bit binary number written using 0s and 1s in ASCII. The assembler can write less than or equal to 128 lines.

Input/Output format:
● The assembler must read the assembly program as an input text file (stdin).
● The assembler must generate the binary (if there are no errors) as an output text file (stdout).

Example of an assembly program:
    var X
    mov R1 $10
    mov R2 $100
    mul R3 R2 R1
    st R3 X
    hlt
The above program will be converted into the following machine code:
    0001000100001010
    0001001001100100
    0011000011010001
    0010101100000101
    1101000000000000

Q2: Simulator:
You need to write a simulator for the given ISA.
The input to the simulator is a binary file (the format is the same as the format of the binary file generated by the assembler in Q1). The simulator should load the binary in the system memory at the beginning, and then start executing the code at address 0. The code is executed until hlt is reached. After execution of each instruction, the simulator should output one line containing a 7 bit number denoting the program counter. This should be followed by 8 space separated 16 bit binary numbers denoting the values of the registers (R0, R1, … R6 and FLAGS).
….
The output must be written to stdout. Similarly, the input must be read from stdin. After the program is halted, print the memory dump of the whole memory. This should be 128 lines, each having a 16 bit value.
…..
Your simulator must have the following distinct components:
1. Memory (MEM): MEM takes in a 7 bit address and returns a 16 bit value as the data. The MEM stores 256 bytes, initialized to 0s.
2. Program Counter (PC): The PC is a 7 bit register which points to the current instruction.
3. Register File (RF): The RF takes in the register name (R0, R1, … R6 or FLAGS) and returns the value stored at that register.
4. Execution Engine (EE): The EE takes the address of the instruction from the PC, uses it to get the stored instruction from MEM, and executes the instruction by updating the RF and PC.

The simulator should follow roughly the following pseudocode:
    initialize(MEM);                               // Load memory from stdin
    PC = 0;                                        // Start from the first instruction
    halted = false;
    while (not halted) {
        Instruction = MEM.fetchData(PC);           // Get current instruction
        halted, new_PC = EE.execute(Instruction);  // Update RF, compute new_PC
        PC.dump();                                 // Print PC
        RF.dump();                                 // Print RF state
        PC.update(new_PC);                         // Update PC
    }
    MEM.dump()                                     // Print the complete memory

Q3: Floating-Point Arithmetic:
CSE112_Floating_point_representation: NO sign bit, 3 exponent bits, and 5 mantissa bits.
● In the registers, only the last 8 bits will be used in computations and initialization for the floating-point numbers.
Modify the assembler and simulator to include arithmetic operations for floating-point numbers of the form (precision) given above. Specifically, include the following functions:

Opcode | Instruction | Semantics | Syntax | Type
10000 | F_Addition | Performs reg1 = reg2 + reg3. If the computation overflows, then the overflow flag is set and reg1 is set to 0. | addf reg1 reg2 reg3 | A
10001 | F_Subtraction | Performs reg1 = reg2 – reg3. In case reg3 > reg2, 0 is written to reg1 and overflow flag is set. | subf reg1 reg2 reg3 | A
10010 | Move F_Immediate | Performs reg1 = $Imm where Imm is an 8-bit floating-point value. | movf reg1 $Imm | B

Note:
● For moving 1.5 into reg1, the instruction (in assembly language) should be: movf reg1 $1.5
● Keep in mind that in floating point multiplication $Imm is 8 bit, so you need to make a new Type B syntax with 8 bit.
● The students must only apply the operations for the floating-point numbers that can be represented in the given system (8 bits), else they should report it as an error.

Q4: (Bonus) Designing New Instructions
In Q4, you need to design five instructions on your own or take help of online resources. After their design you need to upgrade your assembler to include those instructions. Similarly, you need to upgrade your simulator to simulate those instructions.
    Upgrade Assembler: 5%
    Upgrade Simulator: 5%
A proper readme file with description of the instructions (opcode, semantics etc.) should be included and proper explanation should be given to the respective TA during evaluation.
N.B. You are not allowed to change the existing opcode.
• You should do this assignment alone.
• Do not copy or turn in solutions from other sources. You will be penalized if caught.
• You can use any concurrency library to help with your implementation. The data structure implementation obviously has to be your own.

Submission
• Submission will be through Canvas.
• Submit a compressed file called “⟨roll⟩-assign5.tar.gz”. The compressed file should have the following structure. — roll — — roll-assign5.pdf — — — — — — — — — —
• We encourage you to use the LaTeX typesetting system for generating the PDF file. You can use tools like Tikz, Inkscape, or Draw.io for drawing figures if required. You can alternatively upload a scanned copy of a handwritten solution, but make sure the submission is legible.
• You will get up to two late days to submit your assignment, with a 25% penalty for each day.

Evaluation
• Write your programs such that the exact output format (if any) is respected.
• We will primarily use the GPU3 department server for evaluation.

Implement a concurrent open-addressing-based hash table using Pthreads. Implement support for the following operations.

Listing 1: Operations supported by the concurrent hash table data structure
    void batch_insert(HashTable *ht, KeyValuePairs *kv_pairs, bool* result);
    void batch_delete(HashTable *ht, KeyList *key_list, bool* result);
    void batch_lookup(HashTable *ht, KeyList *key_list, uint32_t* result);

The third parameter result is a boolean array to indicate the success of an insert (duplicate keys are not allowed) or delete query. For batch_lookup(), result will contain the values if present, and a negative value otherwise to indicate a failure. Adapt the driver to the exact prototype that you design.

Description
• Key and the value are four bytes (unsigned int) each.
• The APIs work on a batch of items.
• Use double hashing for resolving collisions. You should test different hash functions and compare their performance.
• Assume that the concurrency is limited to operations in a batch. Each batch consists of operations of the same type. For example, there can be concurrent insertions of 10000 key and value pairs in a batch.
• Use threading to issue concurrent calls to the hash table data structure (e.g., OpenMP).
• Assume the maximum load factor is 0.8.
• We will provide random keys and values (via files) that will be input to the hash table for testing.
    – random_keys_insert.bin
    – random_values_insert.bin
    – random_keys_delete.bin
    – random_keys_search.bin
• Empirically ensure the correctness of your code with unit tests. The unit test cases can contain a fixed sequence of operations. Include the test cases as separate function calls (e.g., test1 and test2).
• Use compile time flag USE_TBB to switch between the two versions.
• Use the best hash functions (determined empirically from bullet (iii)) for comparing with TBB.

Submission
(i) A header file that implements the hash table,
(ii) Extend the driver code to make calls to the hash table and compute the throughput of the kernels (e.g., 1000 inserts per second),
(iii) Compare the throughput of your implementation with different primary and secondary hash functions (include all hash functions in the header file),
(iv) Extend the driver to use a concurrent hash table using Intel TBB,
(v) Implement a print method which can be invoked after every operation (e.g., batch_insert()) to check the content of the hash table,
(vi) Compare the throughput of the TBB and Pthread implementations.

Implement an “unbounded, total, lock-free” concurrent stack using linked lists. The stack will contain only positive 4-byte integer values.

Listing 2: Operations supported by the unbounded total lock-free stack data structure
    int pop();
    void push(int v);

Since the stack is unbounded and total, we do not need APIs like isEmpty() and isFull().
Listing 3: Possible definition of a linked list node
    class Node {
    public:
        int value;
        Node *next;
    };

• Unlike Problem 1, concurrency means different threads can make individual calls to push() and pop(),
• Your implementation should be non-blocking, i.e., you should not use locks,
• You can maintain a pointer to the top of the stack as follows: std::atomic<Node*> top,
• A pop() operation on an empty stack should return a negative integer, i.e., it will not block,
• Ignore OOM issues while implementing the push() operation,
• Use random_values_insert.bin from Problem 1 to generate a sequence of random values to be pushed,
• Whether a thread issues a push() or a pop() can be decided based on probability,
• Lock-free implementations are vulnerable to the ABA problem. An easy fix that is mostly correct is to maintain the count of pop operations performed on the stack. A CAS operation should then compare both the top pointer and the count.

Submission
(i) A header file that implements the concurrent stack,
(ii) Extend the driver code from Problem 1 to work with stacks,
(iii) Report the time taken to complete n concurrent operations, where n ∈ {1e5, 1e6, 1e7},
(iv) Implement a print method which can be invoked from the driver to check the content of the stack,
(v) Evaluate the scalability (strong) of your implementation with threads (e.g., 1, 2, 4, 8, and 16).
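The ABA hint above (compare the top pointer together with a pop count) can be illustrated with a small model. This is a single-threaded Python sketch, not a lock-free implementation: the `cas` method merely stands in for an atomic compare-and-swap, the interleaving of two "threads" is replayed by hand, and the names `Stack`, `cas`, and `stale_pop_succeeds` are made up for this illustration. An actual submission would have to perform the pointer-plus-count update as one atomic operation in C++.

```python
# Sequential model of the ABA problem on a linked-list stack.
# Assumption: cas() models an atomic compare-and-swap; nothing here is
# actually concurrent or lock-free. All names are illustrative.
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class Stack:
    def __init__(self):
        self.top = (None, 0)          # (top node, number of pops so far)

    def cas(self, expected, new, use_count=True):
        """Install `new` only if the current top matches `expected`."""
        node, count = self.top
        same = node is expected[0] and (not use_count or count == expected[1])
        if same:
            self.top = new
        return same

    def push(self, node):
        while True:
            old = self.top
            node.next = old[0]
            if self.cas(old, (node, old[1])):
                return

    def pop(self):
        while True:
            old = self.top
            if old[0] is None:
                return -1             # empty stack: report failure, do not block
            if self.cas(old, (old[0].next, old[1] + 1)):
                return old[0].value


def stale_pop_succeeds(use_count):
    """Replay an ABA interleaving; return whether the stale CAS goes through."""
    s = Stack()
    a, b = Node(1), Node(2)
    s.push(b)
    s.push(a)                         # stack is now a -> b
    # "Thread 1" begins a pop: it reads top and top.next, then is preempted.
    old = s.top
    stale_next = old[0].next          # b
    # "Thread 2" pops a and b, then pushes the same node a back.
    s.pop()
    s.pop()
    s.push(a)                         # top is a again, but b is no longer in the stack
    # "Thread 1" resumes and tries to swing top to the stale next pointer.
    return s.cas(old, (stale_next, old[1] + 1), use_count=use_count)


if __name__ == "__main__":
    print("pointer-only CAS succeeds:", stale_pop_succeeds(use_count=False))
    print("pointer+count CAS succeeds:", stale_pop_succeeds(use_count=True))
```

With a pointer-only comparison the stale CAS succeeds and re-links a node that was already popped; with the pop count included it fails, so the pop would retry with a fresh snapshot.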
CS 382 Computer Architecture

1 Task 1: Calculating Dot Product
The .data segment must be declared as follows:
where vec1 and vec2 are two vectors, and dot is where we store the dot product result. You must store the dot product into variable dot. There’s no need to use loops; you can just hard code the offsets for now. You can always assume the vector length is 3.

Requirements
• Note your code is a complete assembly program (not just a sequence of instructions). It should be able to assemble, link, and execute without errors and warnings. When executed, the program should finish without problems (also without any outputs);
• If your code cannot assemble, you get no credit – this is the same as C programs that cannot be compiled;
• MUL instruction can be used for multiplications;
• Avoid using registers X29 and X30;
• You must store the dot product result into the variable dot;
• You have to put comments on each line of instruction;

2 Task 2: Debugging Assembly Using gdb
To check if our programs are correct, we would have to rely on gdb (sorry, still not printf() yet!). A very comprehensive tutorial of using gdb to debug assembly programs is in Appendix B.3 of the textbook. Read through the section before you start this task. In this task, you’d need to write a report on using gdb to debug task 1. You need to provide sufficient screenshots of gdb to show that your program is correct. Step into gdb and use commands to show that the result is correct.

Requirements
• Simply one screenshot showing the final result is not sufficient. For each step you took and command you typed in gdb, you need a screenshot, and explain what you’re trying to accomplish at that step. For example, setting a break point needs one; stepping into an instruction needs one, and so on;
• You must use the correct command to show directly that the dot-product calculation is correct and is stored back to memory;
• The screenshots must not be pictures taken from your phone or camera;

3 Grading
The lab will be graded based on a total of 10 points, 5 for task 1 and 5 for task 2. The following lists deductibles, and the lowest score is 0 – no negative scores:
Task 1:
• -3: the calculation result of dot-product is wrong;
• -3: the calculation result of dot-product is not in dot variable;
• -1: one or more instructions is missing comments;
• -1: the program has any type of output on terminal when executing;
• -1: no pledge and/or name.
Task 2:
• -5: the report is not in PDF format;
• -2: the screenshots are not taken directly from the laptop;
• -2: missing screenshot and/or explanation of one or more steps in debugging;
• -2: not showing the final value of dot-product calculation in gdb in memory;
• -1: no pledge and/or name in the report.
General deductions (only deduct once):
• -10: the code in task 1 does not assemble, or the program terminates abnormally/unsuccessfully; does not attempt the task; is generated by a compiler; cannot be explained clearly in person.
Attendance: check off at the end of the lab to get attendance credit.
Partner (if any): Ben Carpenter
Pledge: “I pledge my honor that I have abided by the Stevens Honor System.”
CS 382 Lab 4 Task 2
Start by using the b _start command:
first 19 in first
and add the product so far
next 2 in 2nd
of the second vector and add the product so far
next 2 in 3rd
of the third vector and add the product so far
until line 40:
Use x/1dg &dot: The result of the dot product should be stored here. We see it prints 140, which is the dot product.
to end of program: Ends program
A property management company manages individual properties they will build to rent, and charges them a management fee as the percentages of the monthly rental amount. The properties cannot overlap each other, and each property must be within the limits of the management company’s plot. Write an application that lets the user create a management company and add the properties managed by the company to its list. Assume the maximum number of properties handled by the company is 5.• Aggregation • Passing object to method • Array Structure • Objects as elements of the Array • Processing array elements • Copy Constructor • Junit testingData Element class – Property.java The class Property will contain: 1. Instance variables for property name, city, rental amount, owner, and plot. Refer to JavaDoc for the data types of each instance variable. 2. toString method to represent a Property object. 3. Constructors and getter and setter methods. Refer to Javadoc of the Property class.Data Element class – Plot.java The class Plot will contain: 1. Instance variables to represent the x and y coordinates of the upper left corner of the location, and depth and width to represent the vertical and horizontal extents of the plot. 2. A toString method to represent a Plot object 3. Constructors, Refer to Javadoc for Plot class. 4. A method named overlaps that takes a Plot instance and determines if it is overlapped by the current plot. 5. A method named encompasses that takes a Plot instance and determines if the current plot contains it. Note that the determination should be inclusive, in other words, if an edge lies on the edge of the current plot, this is acceptable. Data Structure – An Array of Property objects to hold the properties that the management company handles. 
This array will be declared as an attribute of the ManagementCompany class.Data Manager class – ManagementCompany.java It will contain instance variables of name, tax Id, management fee, MAX_PROPERTY (a constant set to 5) and an array named properties of Property objects of size MAX_PROPERTY, as well as two constants MGMT_WIDTH and MGMT_DEPTH, both set to 10; an attribute plot of type Plot that defines the plot of the ManagementCompany Class. Refer to Javadoc for more details. The class ManagementCompany will contain the following methods in addition to get and set methods: 1. Constructors (refer to Javadoc for more details) 2. Method addProperty (3 versions): 2.1.1. Pass in a parameter of type Property object (calls Property copy constructor). It will add the copy of the Property object to the properties array. 2.2. Method addProperty version 2: 2.2.1. Pass in four parameters of types: • String propertyName, • String city, • double rent, • String ownerName. 2.2.2. Calls Property 4-arg constructor. 2.3. Method addProperty version 3: 2.3.1. Pass in eight parameters of types: • String propertyName, • String city, • double rent, • String ownerName, • int x, • int y, • int width • int depth. 2.3.2. Calls Property 8-arg constructor. 2.4. addProperty methods will return the index of the array where the property is added. If there is a problem adding the property, this method will return -1 if the array is full, -2 if the property is null, -3 if the plot for the property is not encompassed by the management company plot, or -4 if the plot for the property overlaps any other property’s plot. 3. Method totalRent– Returns the total rent of the properties in the properties array.4. Method maxRentPropertyIndex- returns the index of the property within the properties array that has the highest rent amount. This method will be private. 5. Method maxRentProp- Returns the highest rent amount of the property within the properties array. 
For simplicity assume that each “Property” object’s rent amount is different. This method should call the maxRentPropertyIndex method. 6. Method toString- returns information of ALL the properties within this management company by accessing the “Properties” array. The format is as following example:List of the properties for Alliance, taxID: 1235 ______________________________________________________ Property Name : Belmar Located in Silver Spring Belonging to:John Smith Rent Amount: 1200.0 Property Name : Camden Lakeway Located in Rockville Belonging to:Ann Taylor Rent Amount: 2450.0 Property Name : Hamptons Located in Rockville Belonging to:Rick Steves Rent Amount: 1250.0 ______________________________________________________total management Fee: 294.0Driver class – (provided) The provided PropertyMgmDriverNoGui.java is a class that allows you to test the methods of ManagementCompany.javaGUI Driver class – (provided) A Graphical User Interface (GUI) is provided. Be sure that the GUI will compile and run with your methods. The GUI will not compile if your methods in ManagementCompany.java are not exactly in the format specified. Do not modify the GUI.JUnit Test Run the JUnit test file (provided). Ensure that the JUnit tests all succeed. Do not modify the JUnit tests. Implement your tests in ManagementCompanyTestSTUDENT. These tests should be similar to the Junit tests.Write a Data Element Class named Property that has fields to hold the property name, the city where the property is located, the rent amount, the owner’s name, and the Plot to be occupied by the property, along with getters and setters to access and set these fields. Write a parameterized constructor (i.e., takes values for the fields as parameters) and a copy constructor (takes a Property object as the parameter). Follow the Javadoc file provided.A driver class is provided that creates rental properties to test the property manager. 
A Graphical User Interface is provided using JavaFX which duplicates this driver’s functionality. You are not required to read in any data, but the GUI will allow you to enter the property management company and each property by hand. A directory of images is provided. Be sure to place the “images” directory (provided) inside the “src” directory in Eclipse. The images do not need to display in order for the GUI to continue running.

Upload the initial files from Blackboard and your final java files to GitHub in your repo from Lab 1, in a directory named CMSC203_Assignment4.

Operation
When the driver-driven application starts, a driver class (provided) creates a management company, creates rental properties, adds them to the property manager, and prints information about the properties using the property manager’s methods. When the GUI-driven application starts (provided), a window is created as in the following screen shots, which allows the user to enter applicable data and display the resulting property. The driver and the GUI both use the same classes and methods for their operation. The JUnit test class also tests the same classes as the driver and the GUI.

Expected output from running PropertyMgmDriverNoGui.java
Expected output from running with GUI:
• PropertyMgmGui.java at startup
• Add Management Co Info (Note Mgmt Co Plot)
• Add property information: the Plot outline
• Add property information: successful addition
• Add property information: unsuccessful, overlaps
• Add property information: unsuccessful, Mgmt Co Plot does not encompass Property Plot. Note: red rectangle’s width extends to right of window.
• Add property information: unsuccessful, too many properties
• Result of “Max Rent” button
• Result of “Total Rent” button
• Result of “List of Properties” button

• Turn in a UML class diagram in a Word document.
• Submit pseudo-code for the primary methods specified in ManagementCompany.java, Property.java, and Plot.java in a Word document. Do not just list what gets read in and printed out, but explain the algorithm being used.
• Learning Experience: highlight your lessons learned and learning experience from working on this project.
o What have you learned?
o What did you struggle with?
o What will you do differently on your next project?
o Include what parts of the project you were successful at, and what parts (if any) you were not successful at.
• GitHub: In your repository (see Lab 1), upload your Word file and java file. You will want to upload these files as contents of a directory so that future uploads can be kept separate. Take and submit a screen shot of the GitHub repository.

Notes:
• Proper naming conventions: All constants, except 0 and 1, should be named. Constant names should be all upper-case. Variable names should begin in lower case, but subsequent words should be in title case. Variable and method names should be descriptive of the role of the variable or method. Single letter names should be avoided.
• Indentation: It must be consistent throughout the program and must reflect the control structure.

Submission Detail
Submit the following files:
• Word document with a name FirstInitialLastName_Assignment3.docx should include:
o UML Class Diagram
o Pseudocode for each of the methods specified in ManagementCompany.java, Property.java, and Plot.java.
o Screen snapshots of the GUI with several properties
o Screen snapshot of GitHub submission
o Lessons Learned
o Check List
o doc (a directory) containing your javadoc files
o src (a directory) containing your (.java) files
▪ File1.java (example)
▪ File2.java (example)
▪ File_Test.java (example)
• A zip file that contains only the .java files, named FirstInitialLastName_Assignment4_Moss.zip. This .zip will not have any folders in it, only .java files.

Grading Rubric
See attachment: CMSC203 Assignment 4 Rubric_Summer20.xlsx

Assignment 4 Check List
(columns: #, item, Y/N, Comments)
1. Assignment files:
• FirstInitialLastName_Assignment#_Moss.zip
• FirstInitialLastName_Assignment#.docx/.pdf
• Source java files
2. Program compiles
3. Program runs with desired outputs related to a Test Plan
4. Documentation file:
• Comprehensive Test Plan
• Screenshots for each Test case listed in the Test Plan
• Screenshots of your GitHub account with submitted Assignment# (if required)
• UML Diagram
• Algorithms/Pseudocode
• Flowchart (if required)
• Lessons Learned
• Checklist is completed and included in the Documentation
Project 3 is on malware analysis. You’ll be learning about and manipulating malware on the Windows and Android platforms. Read the write-up below to get started. It’s very long but also very comprehensive, walking you through the beginning of the project and including helpful posts from previous semesters on Piazza.

Resources
Write-up: CS6262_P3_Writeup
Android Writeup
Questionnaire: assignment-questionnaire.txt
FAQ: FAQ.md
VM: The project VM can be downloaded from: https://www.dropbox.com/s/dnk6acztw9ewp83/Project%203.zip?dl=0
The password to the archive is cs6262.

Submission:
There is an autograder on Gradescope to help verify your answers as you work. To use it, submit your assignment-questionnaire.txt file to both the Windows and Android assignments. However, this will not count for your final grade.
To receive full credit, you must submit
1) assignment-questionnaire.txt to Gradescope,
2) report.zip to Canvas.
If you did not submit report.zip on time, we will contact you, and a 5-point deduction will be applied to your total score.
Have fun, and start early!

Project 3: Malware Analysis
CS 6262

Scenario:
You got a malware sample from the wild! Your task is to discover what the malware does by analyzing it. How do you discover the malware’s behaviors? There are multiple ways of analyzing it, but we’ll be focusing on two: Static Analysis and Dynamic Analysis.
Static Analysis:
Dynamic Analysis:
In our scenario, you are going to analyze the given malware with tools that we provide.
These tools help you to analyze the malware with static and dynamic analysis.

Objective:
Requirement:
Project Structure:
  ○ https://www.virtualbox.org/wiki/Downloads
● Download the Virtual Machine (VM)
  ○ https://www.dropbox.com/s/dnk6acztw9ewp83/Project%203.zip?dl=0
  ○ Unarchive the file with 7zip; the password is cs6262
● Network Configurations:
  ○ tap0:
    ■ Virtual network interface for Windows XP
      • IP Address: 192.168.133.101
  ○ br0
    ■ A network bridge between Windows XP and Ubuntu
  ○ enp0s3
    ■ A network that faces the Internet
  ○ Go to File → Import Appliance
  ○ Select the ova file and import it
  ○ For detailed information on how to import the VM, see:
  ○ Before starting, it might be useful to configure the settings (allocate more base memory, processors, etc.) of your VM according to your device configuration, for better performance.
● VM user credentials
  ○ Username: analysis
  ○ Password: analysis

NOTE: VM Setup
    ■ init.py
  ○ Type your Georgia Tech username (your Canvas LoginName) after running this
      • $ ./init.py
  ○ Malware:

Tutorials:
  ○ Update project 3 before you begin
    ■ Open the terminal (Ctrl-Alt-T, or choose terminal from the menu)
    ■ Run ./update.sh
  ○ Initializing the project
    ■ Open the terminal (Ctrl-Alt-T, or choose terminal from the menu)
    ■ Run ./init.py
  ○ Note:
    ■ These are malware samples hosted under the Georgia Tech Network
    ■ IMPORTANT $ file unzip
  ○ We need a secure experiment environment to execute the malware
  ○ Why?
    ■ An insecure analysis environment could damage your system
    ■ You may not want:
    ■ Contain malware in a virtual environment
  ○ Conservative rules (allow network traffic only if it is secure)
  ○ We provide a Win XP VM as a testbed!
● Run Win XP VM
  ○ Run Windows XP Virtual Machine with virt-manager
  ○ Open a terminal
  ○ Type “virt-manager” and double click “winxpsp3”
  ○ Click the icon with the two monitors and click on “basecamp”
  ○ Right click on basecamp, and click “Start snapshot.” Click Yes if prompted.
  ○ Once virt-manager successfully calls the snapshot, click Show the graphical console.
    ■ Click on the Windows Start Menu and Turn off Computer.
    ■ Then select Restart
  ○ DO NOT MODIFY OR DELETE THE GIVEN SNAPSHOTS!
    ■ The given snapshots are your backups for your analysis.
    ■ If something bad happens on your testbed, always revert back to the basecamp snapshot.
● Copy from Shared Directory
  ○ Go to the shared directory by clicking its icon (in Windows XP)
    ■ Copy stage1.exe to the Desktop
    ■ If you execute it in the shared directory, an error message will pop up. Please copy the file to the Desktop.
  ○ Now we will run the malware
    ■ Execute stage1.exe (double click the icon)
    ■ It will say “Executing Stage 1 Malware”. Then, click OK.
  ○ Otherwise, malware execution will be blocked
  ○ If you want to halt the malware that is running…
    ■ Execute stop_malware in the temp directory.
  ○ To analyze network behaviors, you need
    ■ Wireshark (https://www.wireshark.org/)
    ■ Capturing & recording inbound/outbound network packets
  ○ By capturing and recording network packets through the tools
    ■ Reveal C&C protocol
    ■ Attack Source & Destination
  ○ But, malware will not do anything. Why?
    ■ The C2 server is dead!
    ■ Therefore, the malware (C2 client) will never unfold its behaviors.
    ■ Question?
  ○ Let’s check it through network monitoring
    ■ Everything has already been installed.
    ■ Open Wireshark and capture the traffic for the network bridge (make sure to run with root privileges)
    ■ IP address = 192.168.133.1
    ■ Reference: https://www.wireshark.org/docs/
    ■ Familiarize yourself with Linux commands and how to use Wireshark.
    ■ Other references:
  ○ From Wireshark, we can see that the malware tries to connect to the host at 128.61.240.66, but it fails
  ○ Let’s make it redirect to our fake C2 server
    ■ Go to ~/tools/network
    ■ Edit iptables_rules to redirect the traffic destined for 128.61.240.66 to 192.168.133.1 (fake host)
  ○ Whenever you edit iptables_rules, always run reset.
    ■ (type “./reset” from the ~/tools/network directory)
  ○ IMPORTANT! If you shut down your project VM, be sure to run reset again the next time you start it up.
  ○ Observing C2 traffic
    ■ In Wireshark, we can see that the malware can now communicate with our fake C2 server
    ■ You can see the contents of the traffic by right-clicking on the line, then clicking Follow > TCP Stream
  ○ Let’s take a look at cuckoo. Cuckoo is NOT necessarily required to complete this project, but it is a useful tool to help you understand what your malware is doing, and therefore how you might want to modify your score.h file later in the project.
  ○ Note! You can’t run the testbed VM and cuckoo simultaneously.
  ○ Always turn off the testbed VM, and follow the steps below to execute Cuckoo
  ○ Open two terminals.
  ○ ‘$ workon cuckoo’ (set virtualenv as cuckoo for both terminal1 and terminal2)
  ○ Run cuckoo in debug mode in one terminal, with command: ‘$ cuckoo -d’
  ○ Run the cuckoo webserver in the other terminal, with command: ‘$ cuckoo web’
  ○ Reference: Malware Analysis using Cuckoo Sandbox
  ○ If you get an error when running cuckoo web because port 8000 is already in use, run “sudo fuser -k 8000/tcp” and try again.
  ○ Cuckoo uses a snapshot of the given testbed VM.
  ○ The snapshot is 1501466914
      • DO NOT TOUCH the snapshot!
  ○ To open the cuckoo web server, type the following URL into Chromium
    ■ http://localhost:8000
  ○ To upload a file, click the red box and choose a file.
  ○ Once you click the Analyze button, it will take some time to run the malware.
  ○ The malware does not exhibit its behavior because we did not send the correct command through our fake C2 server
  ○ We will use
    ■ File/Registry/Process tracing analysis to guess the malware behavior.
    ■ Control-flow graph (CFG) analysis and symbolic execution to figure out the list of the correct commands
  ○ The purpose of tracing analysis is to draw a big picture of the malware
    ■ What kinds of system calls/APIs does the malware use?
    ■ Does the malware create/read/write a file? How about a registry?
  ○ The purpose of CFG analysis is to find the exact logic that involves the interpretation of the command and the execution of malicious behavior
  ○ Then, symbolic execution finds the command that drives the malware into that execution path
  ○ On the side bar, there are useful menus for tracing analysis.
    ■ We are focusing on:
  ○ Trace behaviors in time sequence.
● Static Analysis on Cuckoo
  ○ Static Analysis
    ■ Information about the malware.
    ■ Win32 PE format information
  ○ .text
  ○ Strings, etc.
  ○ .data
  ○ .idata
  ○ .reloc
  ○ More information: Malware researcher’s handbook (demystifying PE file)
  ○ Interestingly, three DLL (Dynamic Link Library) files are imported.
  ○ In WININET.dll, we can see that the malware uses the HTTP protocol.
  ○ In ADVAPI32.dll, we can check if the malware touches registry files
  ○ In Kernel32.dll, we can check whether the malware waits on a signal, and also whether it sleeps.
  ○ Tracing a behavior (file/process/thread/registry/network) in time sequence.
  ○ Useful to figure out cause-and-effect in process/file/network.
  ○ Malware creates a new file and runs the process, then writes it to memory.
  ○ Based on our analysis with Cuckoo, we can determine if…
    ■ The malware uses the HTTP protocol to communicate
      ● Communicate with whom? C&C?
    ■ The malware touches (create/write/read) a file/registry/process
      ● Modifying the registry?
  ○ Based on the information that we collected in the previous step, we are going to perform CFG analysis & symbolic execution analysis
  ○ CFG:
    ■ Graph representation of computation and control flow in the program
    ■ Nodes are basic blocks
    ■ Edges represent possible flow of control from the end of one block to the beginning of the other.
  ○ But, in malware analysis, we are analyzing the CFG at the instruction level.
  ○ We provide a tool for you that helps to find command interpretation logic and malicious logic
    ■ We list the system call functions the malware uses internally
    ■ If you provide a score for each function (how malicious it is, or how likely the malicious logic is to use such a function), then the tool will find where the malicious logic is, based on its score
    ■ Your job is to write the score value for each function
  ○ More info: http://www.cs.cornell.edu/courses/cs412/2008sp/lectures/lec24.pdf
  ○ From our network analysis, we know that the malware uses an Internet connection to 128.61.240.66
  ○ From our cuckoo-based analysis, we know that the malware uses the HTTP protocol.
  ○ Moreover, it uses some particular functions to communicate and stay in touch with the command and control server.
  ○ Modify the score values for these particular functions in order to generate a better CFG for proper analysis.
  ○ Find the file to be edited: score.h.
  ○ Path: /tools/cfg-generation/score.h
  ○ Build control flow graph
    ■ By executing ./generate.py stage1, the tool gives you the CFG
      ● This finds the function with the higher score
  ○ Implies that this calls high-score functions during its execution
    ■ For stage2
  ○ Note: your graph and its memory addresses will vary from this example
  ○ The function entry is at the address of 405190
    ■ And, there is a function (marked as sub) of score 12
    ■ This implies that
  ○ Run from 405190 to 40525a
  ○ Finding Commands with Symbolic Execution
    ■ We want to find a command that drives malware from 405190 to 40525a
    ■ Rather than executing the program with some concrete input, symbolic execution treats the input data as a symbolic variable, then tries to calculate expressions for the input along the execution.
    ■ Path explosion
    ■ Modeling statements and environments
    ■ Constraint solving
  ○ Symbolic Execution Engine: Klee, Angr, Mayhem, etc.
      • Loading a binary into the analysis program
      • Translating a binary into an intermediate representation (IR).
      • Translating that IR into a semantic representation
      • Performing the actual analysis with symbolic execution.
  ○ In this example, ONLY the conditions i=2, j=9 will lead the program to print “Correct!”
  ○ Symbolic execution can solve the expression in order to reach a target, in this case “Correct!”.
  ○ Let’s apply it to Malware Command & Control logic. A C&C bot (malware) is expecting inputs (solve the expressions) to trigger behaviors (targets).
  ○ In this example, ONLY the ‘launch-attack’ and ‘remove’ commands (inputs) trigger attack() and destroy_itself().
  ○ Symbolic execution is able to find “launch-attack” as an input to trigger attack(), which is a malicious behavior.
  ○ Plus, “remove” will lead to destroy_itself(), which is another behavior.
  ○ Our job in this project with symbolic execution is to find inputs, and then feed the inputs to trigger behaviors.
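The goal described above can be sketched in a few lines of Python. This is only a toy: it is not angr and not how the provided tool works. It assumes a hypothetical dispatcher whose two targets are guarded by the branch conditions in PATHS (invented for illustration, using the command names from the example), and it "solves" those path constraints by trying a small candidate list, which stands in for a real constraint solver.

```python
from typing import Callable, Dict, Iterable, List, Optional

# A path constraint is a predicate over the unknown input command.
Constraint = Callable[[str], bool]

# Hypothetical model of the bot's dispatcher: for each target function,
# the branch conditions that must all hold on the path that reaches it.
PATHS: Dict[str, List[Constraint]] = {
    "attack": [
        lambda cmd: cmd.startswith("launch"),
        lambda cmd: cmd.endswith("-attack"),
    ],
    "destroy_itself": [
        lambda cmd: cmd == "remove",
    ],
}

# Toy search space; a real solver reasons about the constraints instead.
CANDIDATES = ["status", "remove", "launch", "launch-attack", "sleep"]


def solve(constraints: List[Constraint],
          candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate that satisfies every constraint."""
    for candidate in candidates:
        if all(check(candidate) for check in constraints):
            return candidate
    return None


def find_command(target: str) -> Optional[str]:
    """Find an input that drives execution to the given target."""
    return solve(PATHS[target], CANDIDATES)


if __name__ == "__main__":
    for target in PATHS:
        print(target, "<-", find_command(target))
```

A real engine collects the constraints automatically while walking the binary from the start address to the end address, which is why choosing those two addresses correctly matters so much.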
  ○ We prepared a symbolic executor and a solver for you
    ■ Your job is to find the starting point of the function which interprets the command, and find the end point where the malware actually executes some function that does malicious operations
    ■ The symbolic executor is called angr (http://angr.io/index.html)
  ○ How do you run it?
    ■ Go to ~/tools/sym-exec
    ■ Run it like: python ./sym_exec.py [program_path] [start_address] [end_address]
  ○ Replace the start and end addresses above with the ones from your CFG.
  ○ The command will be printed at the end (if found)
  ○ After CFG analysis + symbolic execution, reconstruct the C2 server
  ○ The tool for reconstructing the C2 server is already on the VM
  ○ It runs nginx and a php script
    ■ This will look like ~/tools/c2-command/stage*-command.txt
    ■ Your job is to add your commands to the relevant *.txt file, e.g. “$insert” (note: the name of the command you see may vary)
      ● Then, type “$insert” and save the file.
  ○ Note: This means that if you want to run only a particular command, you’ll need to remove or comment out the other commands in your file
  ○ SimState
    ■ angr: SimState
    ■ While angr performs symbolic execution, it stores the current state of the program in SimState objects.
    ■ SimState is a structure that contains the program’s memory, registers and other information.
    ■ SimState provides interaction with memory and registers. For example, state.regs offers read and write access by register name, such as state.regs.eip, state.regs.rbx, state.regs.ebx, state.regs.ebh
    ■ Creating an empty 64 bit SimState
  ○ Bitvectors
    ■ Since we are dealing with binary files, we don’t deal with regular integers.
    ■ In a binary program, everything becomes bits and sequences of bits.
    ■ A bitvector is a sequence of bits used to perform integer arithmetic for symbolic execution.
    ■ Creating some 32 bit bitvector values
    ■ state.solver.BVV(4,32) will create a 32 bit bitvector with value 4
    ■ We can perform arithmetic operations or comparisons using the bitvectors
  ○ Symbolic Bitvectors
    ■ state.solver.BVS(’x’, 32) will create a symbolic variable named x with a length of 32 bits
    ■ Angr allows us to perform arithmetic operations or comparisons using them.
  ○ Registers
  ○ Constraints
    ■ In a CFG, a line like if ( x > 10 ) creates a branch. Please look at the Symbolic Execution Concepts tutorial.
    ■ Assuming x is a symbolic variable, this will create a constraint for x > 10 when the True branch is taken for the successor state
    ■ For the False branch, the negation of that constraint will be created.
    ■ Adding a constraint to a SimState
  ○ Radare2
    ■ Launch radare2 with $ r2 ~/shared/payload.exe
    ■ Then type aaa, which will analyze all (functions + bbs)
    ■ afl lists all functions
    ■ For more information :
  ○ You don’t have to use Radare2.
  ○ Here are some of the tools you may want to use
  ○ Check its network access with Wireshark
  ○ Redirect network traffic if required (if the connection fails)
  ○ Try to identify malicious functions by editing score.h and using the cfg-generation tool
  ○ Discover the list of commands using the symbolic execution tool
  ○ Fill in the commands in ~/tools/c2-command/stage2-command.txt
  ○ Run it as mentioned before.
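Returning to the bitvector notes above: the reason symbolic execution does not use regular integers is that machine values have a fixed width and wrap around. The pure-Python sketch below illustrates this. The BitVec class is hypothetical and is not angr's or claripy's API; it only models a concrete value paired with a bit length, in the same spirit as state.solver.BVV(4,32).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitVec:
    """Hypothetical concrete bitvector: an unsigned value with a fixed width."""
    value: int
    bits: int

    def __post_init__(self) -> None:
        # Keep only the low `bits` bits, as a register of that width would.
        mask = (1 << self.bits) - 1
        object.__setattr__(self, "value", self.value & mask)

    def _check_width(self, other: "BitVec") -> None:
        if self.bits != other.bits:
            raise ValueError("bitvector widths differ")

    def __add__(self, other: "BitVec") -> "BitVec":
        self._check_width(other)
        return BitVec(self.value + other.value, self.bits)

    def __sub__(self, other: "BitVec") -> "BitVec":
        self._check_width(other)
        return BitVec(self.value - other.value, self.bits)

    def signed(self) -> int:
        """Interpret the bits as a two's-complement signed integer."""
        sign_bit = 1 << (self.bits - 1)
        if self.value & sign_bit:
            return self.value - (1 << self.bits)
        return self.value


four = BitVec(4, 32)
five = BitVec(5, 32)
wrapped = four - five          # wraps around instead of becoming negative
print(hex(wrapped.value))      # 0xffffffff
print(wrapped.signed())        # -1
```

The same bit pattern reads as 0xffffffff unsigned and -1 signed, so a comparison on a bitvector has to say which interpretation it means.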
  ○ For linux malware symbolic execution
  ○ python linux_sym_exec.py path_to_linux_mw start target
  ○ To make it work, you need to modify two linux_sym_exec.py functions
    ■ targs_len_before and opts_len_before
● ~/tools/dynamicanalysis/
  ○ instrace.linux.log : the dynamic instruction trace for the linux malware
  ○ detect_loop.py : you have to modify this file to find the loop in the given trace
  ○ Usage: python detect_loop.py
  ○ Search for C&C commands and trigger conditions
  ○ Vet the app for any anti-analysis techniques that need to be removed.
  ○ Background services
  ○ You have received a malware sample sms.apk.
  ○ You need to identify communication with the C&C server
  ○ Identify anti-analysis techniques being used by the app.
  ○ Identify commands that trigger any malicious behavior.
  ○ An emulator for Android 4.4 is pre-installed
    ■ Run ‘run-emulator’
  ○ Jadx
    ■ Disassembles apk files into Java source code.
  ○ Rebuilds apk files.
  ○ ~/Android/MaliciousMessenger/tutorialApps
    ■ Emu-check.apk
    ■ Another tutorial example
  ○ Target app to analyze to answer the questionnaire
  ○ On the questionnaire sheet, there are entries for writing domain names. Please follow these rules when getting answers for those questions.
  ○ You should write the FQDN, which means, if the full domain name is canof.gtisc.gatech.edu then write canof.gtisc.gatech.edu, not just gatech.edu or gtisc.gatech.edu
  ○ For the others (connections check, DDoS, sending info, etc.), you should get the exact domain name that the malware uses. For example, the IP address 130.207.188.35 belongs to both coe.gatech.edu and web-plesk5.gatech.edu.
  ○ Because there are multiple mappings, you cannot be sure which domain the malware used just by running nslookup. In this case, get the domain names from the DNS packets in Wireshark instead.
  ○ All domains should be based on Wireshark DNS packets
    ■ e.g., get it from a DNS query packet or redirect HTTP traffic into a local VM and examine the Host header.
  ○ If you look at the capture in Wireshark, you will find the DNS query (Standard query) and the DNS response (Standard query response)
  ○ In the Domain Name System section, there is a Queries section, like below
  ○ Queries:
    ■ x.y.z: type A, class IN.
  ○ Answers:
    ■ x.y.z: type CNAME, class IN, cname a.b.c
  ○ You should use x.y.z
  ○ For all URLs, you do not have to specify the protocol (http:// or https://, etc.).
  ○ However, if the HTTP traffic is like the following:
    ■ POST /a/b/c/d?asdf=1234 HTTP/1.1 Host: www.zzz.com
  ○ Then please write this as
    ■ www.zzz.com/a/b/c/d?asdf=1234
  ○ There are pre-installed PHP scripts in the VM that read the *.txt file for each stage,
    ■ These scripts send the commands to the malware after reading them from the TXT files.
    ■ One caveat of these scripts is that they are written to send the commands in random order (i.e., if there are commands a, b, c, then the script will randomly choose one command and send it to the malware).
    ■ So if you want to test ONE command at a time, then please write only that command in the TXT file.
  ○ You could use free IDA-Pro, objdump or radare2 for this task to find the called attack functions and the target addresses.
  ○ Look for angr examples on GitHub that add constraints to the state.
  ○ For the loop detection, focus on function sequences that are called repeatedly
● Correct command but malware is not working?
  ○ Note that some commands for stage 2 differ for each student: they have a 4 digit hexadecimal number at the end of the command.
    ■ Ex. a command for stage 2 is formatted like $COMMANDa1b4
    ■ (NOTE: three commands in stage 2 have the 4 digit hexadecimal tail.)
    ■ All commands in stage 3 have the 4 digit hexadecimal tail on the command.
  ○ However, symbolic execution may return only the front part of the command, like
    ■ $COMMAND
    ■ This happens if the end point address of symbolic execution is not set correctly. In such a case, please set an end point from which you can get the entire command.
  ○ In the VM, we provide cuckoo, which is a dynamic malware analysis framework.
    ■ It is very convenient and easy to use.
    ■ While you are running cuckoo, you might meet some warnings and errors such as “critical time blah blah~” and “YARA signature…. blah blah”. Please ignore them.
    ■ Because you are executing malware in the QEMU Windows VM, the framework needs to set a time.
    ■ In our case, the malware is never going to unfold, even if you give it infinite time to execute, unless you feed it the right inputs (the malware expects C2 commands).
  ○ IPtable Setting
    ■ If you check /home/analysis/.cuckoo/conf/kvm.conf, you will find how we set up the QEMU Windows host VM.
    ■ You will find the IP of the host VM is “192.168.133.101”.
    ■ If you want to see network behaviors in Cuckoo, you want to forward the IP in /home/analysis/tools/network/iptables-rules.
    ■ For example, open iptables-rules and add
      sudo iptables -t nat -A PREROUTING -p tcp -s 192.168.133.101 -d [DEST-IP] --dport 80 -j DNAT --to 192.168.133.1:80
  ○ Run the Windows VM only when:
    ■ Sending commands to malware
    ■ Analyzing network traffic via Wireshark
    ■ Once done with those tasks, turn off the Windows VM.
  ○ Avoid running the Windows VM when:
    ■ Running cuckoo analysis
    ■ Generating CFGs
    ■ Running symbolic execution. This is quite resource intensive; avoid doing other things so that it finishes quickly. (TIP: If this seems to be taking infinite memory/time, you’re most likely trying to reach an unreachable / invalid address! Check your addresses!)
  ○ Try running the VM at a lower resolution (at least 1280×800 is recommended, for legibility) if you have a very high resolution on your host machine. You can do this in 2 ways:
    ■ VirtualBox Menu: View > Virtual Screen 1 > Resize to a x b
    ■ Ubuntu Menu: Type “Displays” > Change it there
  ○ Restart after a task / stage. This is mostly a last resort, but restarting the VM after finishing a task/stage made everything feel really smooth, instead of trying to free memory etc. Just be sure to run ./reset in ~/tools/network after each VM restart!
  ○ Insufficient resource allocation could result in some issues. You could try to reinstall the VM image (deleting the previously stored state), and even VirtualBox as a last resort.
    ■ ~/report/assignment-questionnaire.txt
    ■ stage1.exe, stage2.exe, payload.exe (linux malware)
    ■ ~/tools/network/iptables_rules
    ■ ~/tools/cfg-generation/score.h
If you did not submit report.zip on time, a 5-point deduction will be applied to your total score.
  ○ Read assignment-questionnaire.txt
  ○ Carefully read the questions, and answer them in assignment-questionnaire.txt
  ○ For each stage, there are 4-6 questions regarding the behavior of the malware.
● Android Part
  ○ READ ~/Android/MaliciousMessenger/writeup.pdf
  ○ Carefully read the writeup, answer in assignment-questionnaire.txt
  ○ Make sure you overwrite ANSWER_HERE
  ○ As each section is worth an equal amount of your overall P2 grade, we normalized the Windows score by dividing by 1.1 (and rounded up), then averaged it with the Android score to get your final grade. So effectively, each point in the table above is worth half a point of your final project grade (slightly less for Windows).

Android Malware Analysis Lab
June 11, 2017

[1] Every application must have an AndroidManifest.xml file in its root directory. The manifest file provides essential information about your app to the Android system, which the system must have before it can run any of the app’s code.
Among other things, the manifest file does the following:

In Listing 1 an example of an app’s manifest file is shown. From it, we can see that this app declares that it needs the INTERNET and RECEIVE_SMS permissions. Additionally, the app uses three components: ActivityOne, SmsReceiver, and myAppsService. ActivityOne is declared in lines 80-85. The intent-filter tag specifies the types of intents that an activity, service, or broadcast receiver can respond to. An intent filter declares the capabilities of its parent component: what an activity or service can do and what types of broadcasts a receiver can handle. It opens the component to receiving the intents of the advertised type, while filtering out those that are not meaningful for the component. Lines 16-21 declare a broadcast receiver component named SmsReceiver. From the intent filters, we see that the Android OS will notify SmsReceiver when the device receives a new text message. The final component this app uses is a service component named ServiceOfApp declared on lines 23-25.

The Android Manifest file provides a high-level abstraction of an app’s behavior. When attempting to manually inspect the internal behaviors of an application statically, the manifest file is a good starting point. It provides key insights on the permissions an application is using, the components it is using, and how the application interacts with the Android OS and the outside world. Additional information about the contents and attributes of the manifest file can be found in the Android documentation [1].

Listing 1: An example of an app’s Android Manifest File

Android uses the Android application package (APK) format to distribute apps to Android devices. Apks are nothing more than a zip file containing resources and assembled Java code. However, if you were to simply unzip the apk you would only have two files: classes.dex and resources.arsc.
Since viewing or editing compiled files is next to impossible, the apk file needs to be decoded or disassembled. If one wishes to analyze an app at the bytecode level, reverse engineering tools such as Apktool [2] are available. Additionally, the app’s Java source code can be partially reconstructed using JADX [3]. You will probably find both tools useful for completing this lab.

Apktool is a reverse engineering tool for Android apps. It can decode resources to nearly original form and rebuild them after making some modifications. It also makes working with an app easier because of the project-like file structure and automation of some repetitive tasks like building the apk, etc. [2]. The functionality of Apktool is well-documented, and we will briefly describe how this tool can be used to decode and build apk files. More information about Apktool can be found in its documentation [2].

In this example, we will use Apktool to decompile a malicious apk that was found in the wild (a7f94d45c7e1de8033db7f064189f89e82ac12c1) [4]. The apk is a repackaged version of the CoinPirates game that includes a malicious payload.

Apktool provides a command line interface. Its most common use case is for decoding and disassembling apk files. If you need to decode an apk file, you use the d (decode) option and pass the apk file as an argument. An example is shown in Listing 2 on line 1.

Listing 2: Decoding an apk using Apktool.

If you look in the directory created you should see something similar to Listing 3. For this lab, we will focus mostly on the AndroidManifest.xml file, the res/ directory, and the smali/ directory. The app’s resources, such as its images and layouts, can be found in the res/ directory. In the smali/ directory, the original classes found in the classes.dex file can be found. Apktool converts the original classes.dex file into smali using baksmali [5], an assembler/disassembler for the dex format. We will discuss the contents of these files and smali syntax later on.

Listing 3: Contents of the directory created.

Apktool can also rebuild an apk file from the decoded resources after making some modifications, such as modifying the smali code. To build an app you need to provide the b (build) parameter to Apktool and also provide the decoded directory as an argument, like the example in Listing 4.

Listing 4: Rebuilding an apk file using Apktool.

If you received no errors, the new apk should be found in the dist subdirectory of the directory provided as input. For example, the apk created from running the command in Listing 4 is shown in Listing 5. In your working directory, you will still have a copy of the original apk file. It does not include any modifications you may have made.

Listing 5: The location of the modified apk.

The next step is to sign the apk you just created. If the apk has not been signed it will fail to install on an emulator or real device. The Android SDK provides a utility program called apksigner that is located in the Android/Sdk/build-tools/SDK version/ directory. We have provided this program on your VM (you can also use jarsigner if you prefer). For this lab, you should just sign the apk with the debug key, which is located in the debug.keystore file in your $HOME/.android/ directory. An example of signing an apk is shown in Listing 6. You need to provide the location of the keystore after the --ks option and pass the apk file as an argument. You will be prompted for a password. The default password is android.

Listing 6: Signing your apk file (password is android).

After you have signed your apk, install it onto the emulator to verify everything went correctly.

Apktool can also be useful for making small modifications to the underlying byte code. For example, let’s assume a malicious app is using the anti-analysis check shown in Listing 7 to prevent the execution of any malicious behavior if the Build type is eng.
Use apktool to disassemble this app, so that you can modify the code located in the smali directory. Use apktool to disassemble the app located in tutorialApps/emu-check.apk. After you have done so, open the file emu-check/smali/com/myapplication/MainActivity.smali in a text editor. You will see the code shown in Listing 8. The code shown is smali and is a representation of Dalvik bytecode. The Android Developer’s website provides a page that discusses the types of instructions and arguments [6].

For the checkEnvironment method, the app is checking the model’s build type to see if it is equal to the string “eng”. In the bytecode, we see that the value of Build.TYPE is stored in register v0 on line 7. The string constant “eng” is stored in register v1 on line 9. The comparison of the strings is completed on line 11 and the result is stored in register v0. On line 13 we see that if the value stored in register v0 is equal to zero, then a jump to the cond_0 branch will occur. Therefore, if the Build.TYPE is not “eng” then a jump to cond_0 occurs and the malicious behavior will be triggered. Since we are on an emulator, our Build.TYPE will be “eng” and the jump will not occur. To force the control flow to go to cond_0, change the statement on line 15 to “goto :cond_0”. This will force the branch to occur every time the app runs. Build and sign the app. Install it onto the emulator (if you installed the previous version you will need to uninstall it first) and open the app. If you check logcat, you will see that the Build type is “eng”. However, the app will now log “do something malicious” instead.

Listing 7: Prevents malicious behavior if the build type is eng.

Listing 8: checkEnvironment in smali.

JADX [3] is another tool that can be used to disassemble apk files. However, JADX disassembles the Dalvik byte code into Java source code. The translation is imperfect and will most likely be incomplete, but it is still useful for doing analysis. JADX provides two interfaces: a command line interface and a GUI. For this lab, we will only discuss the GUI. You can start the JADX GUI by running jadx-gui from the command line. When the program first opens, it will ask the user to choose a file to disassemble. It supports apk, dex, jar, class, zip, and aar files. This discussion will only cover apk files. After you choose the apk file, JADX will begin disassembling the apk. When it’s complete you should see the source code for each class in the Menu pane. If you review the source code, you can see it is not ideal, but it does provide insight into the app’s behavior.

Now that we have disassembled the apk file, we can begin analyzing the source code to identify suspicious behavior. Defining behavior within Android is challenging. Behavior that may be suspicious or malicious in one application may be expected behavior in another application. It is reasonable for a messaging app to access a user’s contacts, but if a utility app, such as a flashlight app, accesses a user’s contacts it should raise suspicion. Therefore, the behavior that makes an application potentially malicious is not a particular pattern, but behavior that is inconsistent with the end user’s expectation. The easiest starting point for identifying any questionable behavior is the app’s manifest file. The manifest file provides a high-level abstraction of an app’s behavior.

In JADX, the AndroidManifest.xml is located in the Resources/ directory. The highest level of security for Android is the permission system that protects the usage of sensitive behavior. The manifest file shows us that the CoinPirates app has access to 14 permissions.
Malware often abuses the text messaging permissions to communicate with its C&C server and to try to send premium text messages without the user being aware.

Listing 9: Permissions used by CoinPirates

After observing the permissions, the next goal is to vet the application by analyzing how it uses the sensitive APIs that are protected by the suspicious permissions. Since malware writers often repackage their payload within real apps that have hundreds of classes, it would be too time-consuming to search through all the source code. Instead, we will focus on the entry points of the application.[2]

Android applications are written using the Java programming language. Unlike conventional Java programs, Android applications do not have a main() function or a single entry point for execution. Instead, they are designed using components. App components are the essential building blocks of an Android app. Each component is a different point through which the system can enter a developer’s application. There are four different types of components: activities, services, content providers, and broadcast receivers. Each type of component serves a different role, and the set of components used in an Android application defines its overall behavior.

The activity component creates user interfaces. For example, a messaging application may have one activity that creates the user interface for entering a message and another activity for viewing contacts. The service component runs in the background to perform tasks. Unlike activity components, service components do not have a user interface. For example, a service component can be used to play music in the background. The content provider component handles application data. Using content providers, an application can store data in files, SQLite databases, or other persistent storage locations an application can access.
The broadcast receiver component responds to system-wide broadcast announcements. For example, the system may broadcast that a picture has been captured, and the broadcast receiver can alert the application of this action. In general, broadcast receivers do minimal work themselves and instead alert other components that an event occurred.

Since components must be declared in the manifest, we can quickly identify any interesting entry points without having to search through the source code. To avoid detection, malware usually does not trigger until it receives commands from its C&C server. The two most common and efficient channels for this communication are the network and SMS. Since SMS can provide communication when the user does not have a Wi-Fi connection, it is usually preferred. Since this app has declared the RECEIVE_SMS permission, we know that it can receive broadcasts about arriving text messages through a broadcast receiver. If a broadcast receiver wants to receive a text message, it must specify that it can handle this action by adding the action to its intent filter inside the manifest file. The action required is shown in Listing 10.

Listing 10: Action required to receive SMS broadcasts

In the CoinPirates manifest, we see that only one receiver has this ability, and the component’s declaration provides us with enough information to identify the package and class name that declares the receiver. Additionally, the component’s declaration raises more suspicion. First, it manipulates the naming convention and is located in the com.android package. Next, it has a priority of 10000. In Android, broadcasts can be ordered or sent to all apps at the same time. In general, applications with a higher priority receive an ordered broadcast first. Additionally, they have the choice of aborting the broadcast or allowing it to be sent to the app with the next highest priority.
Therefore, this behavior can be manipulated by malicious apps to hide the notification of received text messages.[3]

Listing 11: Action required to receive SMS broadcasts

If we use JADX to analyze the source code for the SMSReceiver class, we can identify any suspicious behavior that may occur when a text message is received. The Android OS notifies broadcast receivers by calling the receiver’s onReceive method. Therefore, we should start our analysis from this point in the app. Looking over the source code of the onReceive method, we see that the method immediately queries a database called “mydb.” The source code also shows us that the values received from the database are compared to the sender’s number and the contents of the SMS body. Based on the results of these comparisons, the app uses the needDel (delete text message) or needUpload variables to control the app’s control flow.

Identifying suspicious entry points that are defined in the manifest file allows us to quickly identify suspicious behavior. For example, after analyzing the SMSReceiver we see that it is being used by the C&C server to trigger malicious behavior. We also know that the app uses the “mydb” database to interpret the C&C server’s commands. While the SMSReceiver component provides the most insight, the malicious app also uses two other receivers, AlarmReceiver and BootReceiver, to start the MonitorService. We leave analyzing the MonitorService component to the reader.

Using static analysis, we can identify the events required to trigger malicious behavior in the app. Our next goal will be to leverage the details we extracted from the static analysis to dynamically generate the malicious behavior at run time.

When the events necessary to trigger the malicious behavior depend on external sources, such as a text message being received, we will need to simulate these events.
Android provides several tools for injecting events into the emulator, and you can read the full documentation on the Developer’s Website [7]. One tool is the emulator console. Each running emulator instance provides a console that lets you query and control the emulated device environment. For example, you can use the console to manage port redirection, network characteristics, and telephony events while your application is running on the emulator. The emulator console will be useful for injecting events, such as text messages from a specific number, or for changing the device’s location. The official documentation provides several examples.

A developer can provide an app with resources by placing them in a specific subdirectory of the res/ folder. Once you provide a resource in your application, you can use it by referencing its resource ID. Each resource is grouped into a “type” such as string, layout, or drawable.

When viewing an APK in JADX, you can find the resources an app uses in the Resources directory under the resources.arsc tab. After expanding the resources.arsc file, you can find many basic resources, such as hardcoded strings found in the values directory.

When JADX decompiles the APK back into source code, resources are referenced by their ID in the R class. You can use this to create a mapping from a resource ID to its original name in the res/resources.arsc/values subdirectory.

Using SMS as a protocol for a C&C server is an important design decision that differs from the traditional IP-based approaches known from infected PCs. The main advantages of an SMS-based approach over an IP-based one are that it does not require steady connections, that SMS is ubiquitous, and that SMS can accommodate offline bots easily [8]. Because sms.apk leverages SMS to receive commands from its C&C server, you need to identify those commands.

At this point we should have enough information to trigger the malicious behavior.
The C&C server can be started by running ./start server from the command line. Start the server and send the necessary text messages. Unfortunately, no malicious behavior will be exhibited. This is because the malicious app contains anti-analysis techniques that prevent analysis. Our next goal will be to find them and see if we can emulate these triggers or remove them.

The Android/BadAccents malware, discussed in [8], contains two specific checks on the incoming SMS number. It checks for ‘84’ and ‘82’ numbers, which indicates that the malware expects SMS from a C&C SMS server located in either China or South Korea. It seems the app we are inspecting does something similar.

From Stage 1, we know the required country code and the commands necessary to trigger the malicious behavior. However, even if we send the correct commands with the correct country code, sms.apk will still not exhibit any malicious behavior. To maximize the longevity of malware, malicious developers want to prevent analysis. Since the majority of dynamic analysis frameworks are based on emulation, malicious developers integrate anti-analysis techniques to change an app’s behavior. If an app senses that the underlying environment is an emulator and not a real phone, it will avoid exhibiting any suspicious behavior. In Stage 2 we will try to identify how sms.apk checks whether it is on an emulator. Then we will modify sms.apk to remove this check and trigger the malicious behavior.

The most basic form of emulation detection is when a malicious app leverages a static heuristic. Static heuristics are pre-initialized values that provide information about the underlying environment [9]. Apps running on a system can check these static heuristics by calling Android APIs. For many of the values, the emulator returns values that are inconsistent with what an app would see on a real device.
For example, if the TelephonyManager.getDeviceId() API returns all 0’s, the device in question is an emulator, because this value cannot exist on a physical device.

A list of the static heuristics that may appear in sms.apk is given in [10]. However, the one just mentioned would be a good starting point.

The final question is a two-step process. The first step is to modify sms.apk and remove the environment check so that we can run sms.apk on an emulator. The second step is to send the commands found in Stage 1 to the emulator and have it exhibit malicious behavior. Upon success, the C&C server will generate the final answers.

4.6.3 Step 1:

4.6.4 Step 2:

//developer.android.com/studio/run/emulator-commandline.html# events.

[1] Portions of this section are reproduced from work created and shared by the Android Open Source Project and used according to terms described in the Creative Commons 2.5 Attribution License.
[2] Portions of this section are reproduced from work created and shared by the Android Open Source Project and used according to terms described in the Creative Commons 2.5 Attribution License.
[3] As of Android 4.4 this has been slightly adjusted. The default SMS app will always receive the broadcast first, regardless of priority.
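The emulator console mentioned earlier in this section can also be driven from a script. The sketch below is a minimal illustration, assuming the first emulator instance is listening on localhost port 5554 and that the console auth token is stored in ~/.emulator_console_auth_token, as the Android emulator documentation describes. The sender number and message body in the usage note are hypothetical placeholders, not the values sms.apk expects; finding those is part of the lab.

```python
import socket
from pathlib import Path

# Default token location documented for the Android emulator console.
DEFAULT_TOKEN_PATH = Path.home() / ".emulator_console_auth_token"


def console_commands(sender, body, token):
    """Build the console lines that authenticate and inject one SMS."""
    return [f"auth {token}", f"sms send {sender} {body}", "quit"]


def send_sms(sender, body, port=5554, token_path=DEFAULT_TOKEN_PATH):
    """Open the emulator console and inject a text message.

    Assumes an emulator is already running and listening on `port`.
    """
    token = token_path.read_text().strip()
    with socket.create_connection(("localhost", port), timeout=5) as conn:
        conn.recv(4096)  # discard the console banner
        for command in console_commands(sender, body, token):
            conn.sendall((command + "\n").encode("ascii"))
            if command != "quit":
                conn.recv(4096)  # read the console's reply before continuing
```

For example, send_sms("5551234", "hello") would make a message from the placeholder number 5551234 appear on the emulated device, which in turn triggers any broadcast receiver registered for incoming SMS.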
Assignment 3

1 Instructions
• You can use inbuilt libraries for math, plotting, and handling the data (e.g., NumPy, Pandas, Matplotlib).
• Usage instructions for other libraries can be found in the question.
• Only (*.py) files should be submitted for code.
• Create a (*.pdf) report explaining your assumptions, approach, results, and any further detail asked in the question.
• You should be able to replicate your results during the demo.
• Note that you are not allowed to use libraries which can take data, fit the model, predict the labels, and give final evaluation metrics.

2 Question-1
Use the MNIST dataset (https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz) for this question and select two digits: 0 and 1. Label them as -1 and 1. In this exercise you will implement AdaBoost.M1. Perform the following tasks.
• Divide the train set into train and val sets. Keep 1000 samples from each class for val. Note that val should be used to evaluate the performance of the classifier. It must not be used in obtaining the PCA matrix.
• Apply PCA and reduce the dimension to p = 5. You can use the train set of the two classes to obtain the PCA matrix. For the remaining parts, use the reduced-dimension dataset.
• Now learn a decision tree using the train set. You need to grow a decision stump. For each dimension, find the unique values and sort them in ascending order. The splits to be evaluated are the midpoints of two consecutive unique values. Find the best split by minimizing the weighted misclassification error. Denote this stump as h1(x).
• Compute α1 and update the weights.
• Now build another tree h2(x) using the train set but with the updated weights. Compute α2 and update the weights. Similarly grow 300 such stumps.
• After every iteration, find the accuracy on the val set and report it. You should show a plot of accuracy on the val set vs. number of trees. Use the tree that gives the highest accuracy and evaluate that tree on the test set. Report the test accuracy. [2]

Q2. Consider the above as a regression problem.
Apply gradient boosting using absolute loss and report the MSE between predicted and actual values of the test set.
• Divide the train set into train and val sets. Keep 1000 samples from each class for val. Note that val should be used to evaluate the performance of the classifier. It must not be used in obtaining the PCA matrix.
• Apply PCA and reduce the dimension to p = 5. You can use the train set of the two classes to obtain the PCA matrix. For the remaining parts, use the reduced-dimension dataset.
• Now learn a decision tree using the train set. You need to grow a decision stump. For each dimension, find the unique values and sort them in ascending order. The splits to be evaluated are the midpoints of two consecutive unique values. Find the best split by minimizing SSR. Denote this as h1(x). [1]
• Compute the residue using y − 0.01 h1(x).
• Now build another tree h2(x) using the train set but with updated labels. Note that you now have to update the labels the way labels are updated for absolute loss; that is, the labels are obtained as negative gradients. Compute the residue using y − 0.01 h1(x) − 0.01 h2(x). [1]
• Similarly grow 300 such stumps. Note that the labels are updated every iteration based on negative gradients.
• After every iteration, find the MSE on the val set and report it. You should show a plot of MSE on the val set vs. number of trees. Use the tree that gives the lowest MSE and evaluate that tree on the test set. Report the test MSE. [1]
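The boosting loop described in Q2 can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: the function names (fit_stump_ssr, stump_predict, boost_absolute_loss) are hypothetical, the model starts from a prediction of zero so that the first set of targets equals sign(y), and each leaf predicts the mean of its targets. The same midpoint enumeration applies to the AdaBoost stump in Question-1, with weighted misclassification error in place of SSR.

```python
import numpy as np


def fit_stump_ssr(X, targets):
    """Return (dim, threshold, left_value, right_value) minimizing SSR."""
    best, best_ssr = None, np.inf
    for d in range(X.shape[1]):
        vals = np.unique(X[:, d])  # sorted unique values of this dimension
        for t in (vals[:-1] + vals[1:]) / 2.0:  # midpoints of neighbours
            left = X[:, d] <= t
            lv, rv = targets[left].mean(), targets[~left].mean()
            ssr = ((targets[left] - lv) ** 2).sum()
            ssr += ((targets[~left] - rv) ** 2).sum()
            if ssr < best_ssr:
                best_ssr, best = ssr, (d, t, lv, rv)
    return best


def stump_predict(X, stump):
    """Evaluate one stump on every row of X."""
    d, t, lv, rv = stump
    return np.where(X[:, d] <= t, lv, rv)


def boost_absolute_loss(X, y, n_rounds, lr=0.01):
    """Gradient boosting for |y - F(x)| with a shrinkage factor of lr."""
    pred = np.zeros(len(y))
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the absolute loss is the sign of the residual.
        targets = np.sign(y - pred)
        stump = fit_stump_ssr(X, targets)
        pred = pred + lr * stump_predict(X, stump)
        stumps.append(stump)
    return stumps, pred
```

To track validation MSE per iteration, keep a second running prediction for the val set and update it with the same stump inside the loop.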
Ensure that your Gradescope submission contains the following file:
○ Lab3.asm

This lab will introduce you to RISC-V assembly language programming using RARS (RISC-V Assembler and Runtime Simulator). You will write a program with nested loops in the provided file Lab3.asm to write a specific pattern to a file named lab3_output.txt. This file is generated by running your submitted Lab3.asm file; lab3_output.txt does not exist prior to this action.

Make sure you properly follow the steps mentioned on the RARS site to download and install RARS on your machine. Please approach a TA or tutor in your lab section if you have issues installing RARS on your machine!

Since this is your very first foray into assembly programming, please read this document thoroughly without skipping any sections!

Much like how a high-level program has a specific file extension (.c for C, .py for Python), RARS-based RISC-V programs have a .asm extension.

In the Lab3 folder in the course Google Drive, you will see 7 assembly files. They are meant for you to read (and understand) in sequence:

Please download these files and make sure to open them in the RARS text editor only. Doing otherwise will cause comments and other important code sections to not be properly highlighted and can be a hindrance to learning assembly language intuitively. Steps for opening, assembling, and running a .asm file are provided later in this document.

These 7 files have enough comments in the source code to jump-start your understanding of RISC-V assembly programming if the lectures have not yet covered certain topics in assembly programming.

Beyond these files, you should have all the required resources in the Lecture Slides themselves, in the lecture pages following the topic “Von Neumann and RISC-V”. These lecture slides are very self-explanatory. You are encouraged to read ahead even if the instructor hasn’t started discussing them in lecture.
You are also encouraged to read the excellent RARS documentation, which can be found by clicking “help” in the RARS program, or at these URLs: https://github.com/TheThirdOne/rars/wiki and https://github.com/TheThirdOne/rars

For the usage of macros (which are utilized heavily in this lab to generate system calls, referred to as ecalls), please also refer to the RARS documentation on macros and ecalls. For lab3, you don’t even need to know what the inside of a macro block looks like, so long as you know what it is supposed to do overall.

Helpful tip: For lab3 and lab4, it is recommended that you create two separate folders on your machine, lab3 and lab4. Make each folder the workspace for the respective lab. So, for this lab, place all the provided .asm files in the Lab3 folder along with a copy of the .jar RARS application file, and run RARS from there. This is where you will create your Lab3.asm file as well.

Figure 1 Ideal workspace setup for lab3/lab4

Henceforth, run all .asm files pertinent to Lab3 on this local copy of the .jar RARS application.

Open the RARS application. You should get the window below.

Figure 2 Opening the RARS application

Let us open firstRARSprogram.asm by clicking File -> Open.

Make sure the comments (which appear in green) are properly indented and haven’t been misaligned when you downloaded the file from the Google Drive. They should appear as shown below:

Figure 3 Opening an asm file on RARS

Make sure to thoroughly read the entire contents of this file in the text editor. Verbose comments have been provided to guide you through each step in the source code. This will be the norm for the other .asm files in the Lab3 folder in Google Drive as well.

After you have read and understood the source code, it is time to assemble the program. Before you assemble, go to Settings and make sure you have exactly the following options checked (or unchecked). For this course, you are allowed to use pseudo instructions.
Pseudo instructions are instructions that are not native to the RISC-V instruction set; the RARS environment defines each of them as a combination of actual RISC-V instructions. Permit pseudo instructions (this actually makes coding easier for you). This should be done for every RARS program in CSE12!

Figure 4 RARS setting

Now click on Assemble (the wrench and screwdriver icon). If correctly assembled, your Messages window should show the following information:

Figure 5 Successful assembly

Now click on the Run button to run the program. You will get the following output:

Figure 6 Successful Runtime

Now try running the other .asm files.

One word of caution: when your text editor contains multiple open files, make sure you assemble the correct one. For example, in the window below, multiple files are open. If I want to assemble and run only add.asm, then my tab for add.asm should be highlighted as shown below. Only then can I click Assemble, then Run.

Figure 7 Multiple tabs open

RARS has a helpful feature where, instead of running the entire program at once, you can Run One Step At A Time. The corresponding button is beside the Run button. This allows you to execute each line of code in sequence and see how it affects the values of the registers as they appear on the right of your screen.

The file multiply.asm makes extensive use of macros to help create a more readable main program section (instructions on how to use macros are provided in the file comments). So does the source code in the files fileWriteDemo.asm, fileReadDemo.asm, and patternDisplayDemo.asm (we will shortly discuss the file reads and writes that these .asm files perform). Based on how we define a macro in the source code, it is tempting to confuse it with a function. However, macros are NOT functions!
Whenever you place multiple instances of the same macro in your code, you are copying the macro’s contents into those code areas the same number of times.

When you want to open a new file in RARS, go to File->New. The default file name riscv1.asm shows up on the tab. When you save this file, you MUST make sure that you explicitly specify the correct extension (.asm) as shown below.

Figure 8 Saving a new file in RARS

File creation and manipulation is a very common part of the learning curve whenever you learn a new high-level programming language, be it C or Python. For lab3, we will write the display pattern to a file so that it is more convenient for the autograder. The autograder within Gradescope will do a text equality check between the files generated by your lab3 source code and the expected, correctly generated files, and award you points accordingly (or not!).

To give you a demo, we have two reference assembly source code files: fileWriteDemo.asm and fileReadDemo.asm. The former creates a file with the name fileDemo.txt. The following text is written into fileDemo.txt: “These pretzels are making me thirsty!”. The latter, fileReadDemo.asm, contains code to open fileDemo.txt and extract this text to display on the output console of RARS.

The following two images show the results of having run fileWriteDemo.asm and then fileReadDemo.asm.

Figure 9 A new file generated in my workspace after running fileWriteDemo.asm. Note the file size to be shown as 1KB despite us having written only 38 bytes of data into it. That is because a file also contains metadata and a header generated by your OS as well.

Figure 10 RARS output console after running fileReadDemo.asm

Both fileWriteDemo.asm and fileReadDemo.asm use many macros within the source code to make the main .text section of the code easier for the programmer to write. For the purposes of lab3, you DO NOT need to understand WHAT these macros are doing within their definition block.
It suffices to know what the result of executing a macro in your source code is. However, understanding the macros does help build your foundation in RARS programming.

One thing to note is that since lab3 does not focus on proper function coding in RISC-V assembly, it can get very difficult to keep track of unintentional changes to your register values. For instance, in C or Python, you can define a variable temp, assign it a specific value, and rest assured that this variable does not change from the assigned value during compilation or runtime unless explicitly told to. However, in a large assembly source file, working with a limited number of registers means that it is very difficult to keep track of each individual register value unless you are very careful.

We will deal with register preservation in lab4, but in lab3 you will only be asked to ensure that you do not use specific registers in your Lab3.asm source code. The list of these taboo registers is highlighted in the later section on Lab3.

Besides the aforementioned 2 files related to file write and read, we also have a 3rd .asm file, patternDisplayDemo.asm. The source code in this file, once run, asks for an integer n as input and then prints the pattern “* “ n times horizontally.

Figure 11 Output console after running patternDisplayDemo.asm for user input n=3 and 7. In both cases, make sure to check the contents of the created file patternDisplay.txt as well

Similar to patternDisplayDemo.asm, Lab3.asm will also make use of loops (nested loops, to be precise) to generate a pattern based on the value of n entered by the user.
Thus, you should thoroughly read and understand the working of the source code in patternDisplayDemo.asm.

This program will print out a pattern with stars (asterisks, ASCII hex code 0x2a), blank spaces (ASCII hex code 0x20), and the newline character ‘\n’ (ASCII hex code 0x0a).

The actual task of opening the file lab3_output.txt and writing the contents to it is handled by macros used in the starter code included in the Lab3.asm file. Consider the screenshot of the Lab3.asm file below:

Figure 12 Lab3.asm screenshot

As you can see, you should write your code ONLY within the area indicated by the comments.

The way this code works regarding file manipulation is as follows:

When a file is created in RARS, it is assigned a file descriptor ID in the form of an integer. Future references to this file through macros are then made by referring to this file descriptor ID. Once we create a file, we first need to set aside memory space within our RISC-V memory where data to be written to the file is kept. This space is referred to as a “memory buffer”. In Lab3.asm, we have defined the memory space starting from address 0x10040000 as our internal memory buffer. Specifically, we hold a doubleword (64 bits) at this address which keeps track of how many bytes we intend to finally write to the file. From 0x10040008 onwards, we start collecting the bytes that will be written into the file lab3_output.txt.

In your student code, you can update the file buffer with any character using the macro write_to_buffer. For example, suppose I want to write the character sequence “**** ” to my memory buffer within Lab3.asm’s student code section. Then I would need to write the student code shown next:

Figure 13 Modified Lab3.asm screenshot

Run this Lab3.asm file and open the generated lab3_output.txt in a text editor. Specifically, if you are using Notepad++ (which is strongly recommended), make sure to apply the setting: View->Show Symbol ->Show All Characters.
This will make characters such as null and newline visible.

Figure 14 lab3_output.txt screenshot from running modified Lab3.asm

As you can see, the blank space appears as an orange dot, newline as LF (Line Feed), and null as NUL.

You can also see these characters as they reside in the file buffer in memory in RARS, as shown below. If you go to the Execute window after running this modified Lab3.asm, select 0x10040000 for view, and enable ASCII view, you will get the following screenshot:

Figure 15 “* ** * ” data as it resides in file buffer

Note that within each individual cell in the Data Segment matrix above, we should read the bytes from right to left.

NOTE: For your student starter code, you MUST NOT use any of the registers t0 to t6 or sp. Registers a0 to a7 should only be used temporarily to pass parameters to or receive parameters from macros. Using the registers s0 through s11 should be enough for the Lab3 assignment.

The following is a screenshot showing the runtime of the actual solved Lab3.asm code:

Figure 16 Solved Lab3.asm runtime demo

When we open the generated lab3_output.txt file, we get the following text:

Figure 17 lab3_output.txt screenshot

Your student code MUST display the prompts and error messages in response to user input EXACTLY as shown in Figure 16. Please make use of the provided strings in the .data section of the starter code in Lab3.asm to make sure you do not use any other type of sentence!

NOTE: Although you are not required to print each row of the pattern on your output console, doing so (as shown in Figure 16) will greatly help with real-time code debugging. So, it is strongly advised to do so.

The Lab3 folder in the Google Drive contains some test cases in the testCases subfolder for user inputs n=1, 3, 6, 8, 30.
Make sure your code output has exactly the same alignment of characters as provided there for the corresponding n input.

Note that our grading script is automated, so it is imperative that your program’s output matches the specification exactly. Output that deviates from the spec will cause point deductions.

Files to be submitted to your Lab3 Gradescope portal
Lab3.asm - This file contains your pseudocode and assembly code. Include a header comment as indicated in the documentation guidelines here.

This is the lab assignment where most students start to get flagged for cheating. Please review the pamphlet on Academic Dishonesty and look at the examples in the first lecture for acceptable and unacceptable collaboration.

You should be doing this assignment completely by yourself!

The following rubric applies provided you have fulfilled all criteria in Minimum Submission Requirements. Failing any criterion listed in that section results in an automatic grade of zero, which is not eligible for a regrade request.

20 pt Lab3.asm assembles without errors (thus even if you submit Lab3.asm having written absolutely no student code, you would still get 20 pts!)
80 pt output in file lab3_output.txt matches the specification:
    20 pt error check zero and negative heights using the convention shown in Figure 16
    20 pt prompts user until a correct input is entered as shown in Figure 16
    20 pt number of rows matches user input (i.e., if n=6, the pattern would have 6 rows)
    20 pt correct sequence of stars and newline characters on each row

All course materials and relevant files located in the Lab3 folder in the course Google Drive must not be shared by students outside of the course curriculum on any type of public domain site or for financial gain.
Thus, if any of the Lab3 documents is found on any type of publicly available site (e.g., GitHub, Stack Exchange), or shared for monetary gain (e.g., Chegg), then the original poster will be cited for misusing CSE12 course-based content and will be reported to UCSC for academic dishonesty.

In the case of sites such as Chegg.com, we have been able to locate course material shared by a previous quarter’s student. Chegg cooperated with us by providing the student’s contact details, which was sufficient proof of the student’s misconduct, leading to an automatic failing grade in the course.
Assignment 3

1 Instructions
• You can use inbuilt libraries for math, plotting, and handling the data (e.g., NumPy, Pandas, Matplotlib).
• Usage instructions for other libraries can be found in the question.
• Only (*.py) files should be submitted for code.
• Create a (*.pdf) report explaining your assumptions, approach, results, and any further detail asked in the question.
• You should be able to replicate your results during the demo.

2 Question-1
Use the MNIST dataset (https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz) for this question and select classes 0, 1, and 2. Note that you are not allowed to use libraries which can take data, fit the model, predict the classes, and give accuracy. Perform the following tasks.
• Apply PCA and reduce the dimension to p = 10. You can use the entire train set of these 3 classes to obtain the PCA matrix. For the remaining parts, use the reduced-dimension dataset.
• Now learn a decision tree using the train set. You need to grow a decision tree with 3 terminal nodes. This is similar to what we did in the baseball salary example. For the first split, consider all p dimensions. For each dimension, consider one split which divides the space into two regions, and find the total Gini index. Similarly find the total Gini index for all p dimensions. Find the best split by searching for the minimum Gini index. Suppose you split across the 10th dimension. Choose one of the two resulting regions and repeat the steps to find its best split. Once you find it, the entire p-dimensional space is divided into three regions. [2]
• Find the class of all samples in the test set of these 3 classes. For a particular test sample, check where the sample lies in the segmented space. The class for a particular sample is the majority class among the samples in the region to which the test sample belongs. Report accuracy and class-wise accuracy for the testing dataset. [1]
• Now use bagging: develop 5 different datasets from the original dataset. Learn trees for all these datasets.
For test samples, use majority voting (at least 3 trees should predict the same class) to find the class of a given sample. In case there is a tie, that is, two trees predict one class and two other trees predict another class, you can choose either of the classes. Report the total accuracy and class-wise accuracy. [1]
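The split search and the voting step above can be sketched as follows. This is a minimal illustration, not a reference solution. It assumes that "total Gini index" means the size-weighted sum of the Gini impurity of the two regions, and that candidate thresholds are midpoints between consecutive unique values; neither is stated in the assignment. The function names are hypothetical.

```python
import numpy as np


def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a label array."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return 1.0 - np.sum(p ** 2)


def best_gini_split(X, y):
    """Return (total_gini, dim, threshold) of the best single split."""
    n = len(y)
    best = (np.inf, None, None)
    for d in range(X.shape[1]):
        vals = np.unique(X[:, d])  # sorted unique values
        for t in (vals[:-1] + vals[1:]) / 2.0:
            left = X[:, d] <= t
            total = (left.sum() * gini(y[left])
                     + (~left).sum() * gini(y[~left])) / n
            if total < best[0]:
                best = (total, d, t)
    return best


def bootstrap(X, y, rng):
    """Draw one bagging dataset by sampling rows with replacement."""
    idx = rng.integers(0, len(y), size=len(y))
    return X[idx], y[idx]


def majority_vote(votes):
    """votes has shape (n_trees, n_samples) and holds integer class labels."""
    # argmax breaks ties in favour of the smaller class label.
    return np.array([np.bincount(col).argmax() for col in votes.T])
```

To grow the third terminal node, call best_gini_split again on the rows of one region. For bagging, build five datasets with bootstrap, fit one tree per dataset, stack the five prediction vectors, and pass them to majority_vote.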
Assignment 2

1 Instructions
• You can use inbuilt libraries for math, plotting, and handling the data (e.g. NumPy, Pandas, Matplotlib).
• Usage instructions for other libraries can be found in the question.
• Only (*.py) files should be submitted for code.
• Create a (*.pdf) report explaining your assumptions, approach, results, and any further detail asked in the question.
• You should be able to replicate your results during the demo.

2 Question-1
Use the MNIST dataset (https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz) for this question and perform the following tasks.
• It has 60K train samples in total from 10 classes and 10K test samples. The 10 classes are the digits 0-9. Labels or classes for all samples in the train and test sets are available.
• Visualize 5 samples from each class in the train set in the form of images.
• Find the class of all samples in the test set. Report accuracy and class-wise accuracy for the testing dataset. Accuracy is the ratio of the total number of samples correctly classified to the total number of samples tested. The total number of samples tested is 10K. Similarly, report the accuracy for each class. Note that the labels or classes for each sample are given in the dataset.

3 Question-2
Use the same downloaded dataset from Question 1 and perform the following tasks.
• Choose 100 samples from each class and create a 784×1000 data matrix. Let this be $X$.
• Remove the mean from $X$.
• Apply PCA on the centralized $X$. You need to compute the covariance $S = XX^\top/999$. Then find its eigenvectors and eigenvalues. You can use any library for this. Sort them in descending order of eigenvalue and create the matrix $U$.
• Perform $Y = U^\top X$ and reconstruct $X_{\text{recon}} = UY$. Check the MSE between $X$ and $X_{\text{recon}}$; this should be close to 0. $\text{MSE} = \sum_{i,j} \left(X(i,j) - X_{\text{recon}}(i,j)\right)^2$.
• Now choose p = 5, 10, 20 eigenvectors from $U$. For each p, obtain $U_p Y$, add the mean that was removed from $X$, reshape each column to 28×28, and plot the image. You should see that as p increases, the reconstructed images look more like their original counterparts. Plot 5 images from each class.
• Let the test set be $X_{\text{test}}$. Find . For each value of p find $Y$, and apply QDA from Q1 on $Y$. Obtain the accuracy on the test set as well as the per-class accuracy. As p increases, the accuracy should increase.
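The claim that the full reconstruction error is close to 0, and the way it grows when only p eigenvectors are kept, can be checked with a short derivation. This is a sketch under stated assumptions: the columns of $U$ are orthonormal eigenvectors of $S$ with eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots$, and the truncated projection is taken to be $Y_p = U_p^\top X$ (the subscripted $Y_p$ and $\lambda_k$ are notation introduced here, not in the assignment).

```latex
% Full basis: U is orthogonal, so U U^\top = I
X_{\text{recon}} = U Y = U U^\top X = X
\quad\Longrightarrow\quad \text{MSE} = 0 \ \text{(up to floating-point error)}

% Truncated basis: keep only the first p columns U_p, with Y_p = U_p^\top X
\sum_{i,j} \bigl(X - U_p Y_p\bigr)_{ij}^{2}
  = \operatorname{tr}\!\bigl[(I - U_p U_p^\top)\, X X^\top\bigr]
  = 999 \sum_{k > p} \lambda_k
```

The last equality uses $XX^\top = 999\,S = 999\,U \Lambda U^\top$, so the summed squared error equals 999 times the sum of the discarded eigenvalues and shrinks as p grows.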
Assignment

Q1. Consider two Cauchy distributions in one dimension. Assume $P(\omega_1) = P(\omega_2)$. Find the total probability of error. Note that you need to first obtain the decision boundary using $p(\omega_1 \mid x) = p(\omega_2 \mid x)$. Then determine the regions where error occurs and use $p(\text{error}) = \int_x p(\text{error} \mid x)\, p(x)\, dx$. Plot the conditional likelihoods, $p(x \mid \omega_i)\, p(\omega_i)$, and mark the regions where error will occur. This should be a rough hand-drawn sketch. As $p(x)$ is the same on both sides when equating posteriors, we can simply use $p(x \mid \omega_i)\, p(\omega_i)$. [1]

Q2. Compute the unbiased covariance matrix: [0.5]
Here, $X \in \mathbb{R}^{d \times N}$ form.

Q3.
a. In the multi-category case, the probability of error $p(\text{error})$ is given as $1 - p(\text{correct})$, where $p(\text{correct})$ is the probability of being correct. Consider a case of 3 classes or categories. Draw a rough sketch of $p(x \mid \omega_i)\, p(\omega_i)$ for all $i = 1, 2, 3$. Give an expression for $p(\text{error})$. Assume equi-probable priors for simplicity. [1]
b. Mark the regions if the three conditional likelihoods are Gaussians, $p(x \mid \omega_i) \sim N(\mu_i, 1)$, with $\mu_1 = -1$, $\mu_2 = 0$, $\mu_3 = 1$. Find $p(\text{error})$ in terms of the CDF of the standard normal distribution. [1]

Q4. Find the likelihood ratio test for the following Cauchy pdf: Assume $P(\omega_1) = P(\omega_2)$ and 0-1 loss. [1]
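For Q3(b), the setup can be sketched as a worked equation. It assumes the minimum-error (Bayes) rule with equal priors, so that the decision boundaries fall at the midpoints between neighbouring means; $\Phi$ denotes the standard normal CDF and $R_i$ the decision region of class $i$ (both are notation introduced here).

```latex
% Equal priors and unit variances: boundaries at the midpoints of the means
R_1 = (-\infty, -\tfrac12), \qquad R_2 = (-\tfrac12, \tfrac12), \qquad R_3 = (\tfrac12, \infty)

p(\text{correct}) = \sum_{i=1}^{3} P(\omega_i) \int_{R_i} p(x \mid \omega_i)\, dx
  = \tfrac13 \Bigl[ \Phi(\tfrac12) + \bigl(\Phi(\tfrac12) - \Phi(-\tfrac12)\bigr) + \bigl(1 - \Phi(-\tfrac12)\bigr) \Bigr]
  = \tfrac13 \bigl[ 4\,\Phi(\tfrac12) - 1 \bigr]

p(\text{error}) = 1 - p(\text{correct}) = \tfrac43 \bigl( 1 - \Phi(\tfrac12) \bigr) \approx 0.411
```

The simplification uses $\Phi(-\tfrac12) = 1 - \Phi(\tfrac12)$ and $\Phi(\tfrac12) \approx 0.6915$.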
Systems Section B Assignment 4
Topic: Concurrency and

1. All questions have to be done using the C programming language only.
2. The general guidelines mentioned in each question will ensure more readable code and an easier time during the demo for both you and your evaluator, so please follow them.
3. Helper functions can be added as necessary. Ensure that you are only using the synchronization primitive mentioned in the question.

Question 1:
The dining philosophers problem (OSTEP Chapter 31, page 13) is a classic problem used to demonstrate the concept of deadlocks. The original setup contains five philosophers sitting at a round table, with a fork between each pair of philosophers. Each philosopher can perform only one of two actions: eating and thinking. For eating, each philosopher requires the 2 forks that are kept at their left and right sides. Allowing unrestricted access to each philosopher can lead to a deadlock.
Consider a modified version of the classic problem stated below: besides the 5 forks, there are now 2 bowls that are kept at the centre of the table. For eating, each philosopher now requires two forks and a bowl.
Model the above problem using threads as philosophers. Specifically, each philosopher carries out 2 tasks, eating and thinking. The time required for both tasks can be simulated using the sleep() function (or any other function that causes a delay), and each philosopher must indicate what action it is carrying out by printing to the console. You must also ensure that there are no deadlocks in the code.
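One way the modified problem could be approached is sketched below, using a single mutex and one condition variable. A philosopher takes both forks and a bowl in one step, or waits while holding nothing, which removes the hold-and-wait condition that causes deadlock. This is a minimal sketch, not a reference solution: the names `table_lock`, `table_changed`, `pick_up`, `put_down`, `run_table` and the fixed number of rounds are assumptions made for illustration, and `eating`/`thinking` take an `id` parameter here purely so they can print who is acting.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define N_PHIL 5  /* philosophers (and forks), as in the question */
#define N_BOWLS 2 /* bowls at the centre of the table */

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t table_changed = PTHREAD_COND_INITIALIZER;
static int fork_free[N_PHIL];
static int bowls_free;
static int meals[N_PHIL]; /* meal counts, useful for the fairness write-up */
static int rounds_each;   /* hypothetical knob so the sketch terminates */

static void short_delay(void) {
    struct timespec ts = {0, 1000000L}; /* 1 ms stands in for sleep() */
    nanosleep(&ts, NULL);
}

void thinking(int id) { printf("Philosopher %d is thinking\n", id); short_delay(); }
void eating(int id) { printf("Philosopher %d is eating\n", id); short_delay(); }

/* Take both forks and a bowl in one step, or wait while holding nothing. */
static void pick_up(int id) {
    int left = id, right = (id + 1) % N_PHIL;
    pthread_mutex_lock(&table_lock);
    while (!(fork_free[left] && fork_free[right] && bowls_free > 0))
        pthread_cond_wait(&table_changed, &table_lock);
    fork_free[left] = fork_free[right] = 0;
    bowls_free--;
    pthread_mutex_unlock(&table_lock);
}

/* Return the forks and the bowl, then wake every waiter to re-check. */
static void put_down(int id) {
    int left = id, right = (id + 1) % N_PHIL;
    pthread_mutex_lock(&table_lock);
    fork_free[left] = fork_free[right] = 1;
    bowls_free++;
    assert(bowls_free <= N_BOWLS);
    meals[id]++;
    pthread_cond_broadcast(&table_changed);
    pthread_mutex_unlock(&table_lock);
}

void *philosopher(void *args) {
    int id = *(int *)args;
    for (int i = 0; i < rounds_each; i++) {
        thinking(id);
        pick_up(id);
        eating(id);
        put_down(id);
    }
    return NULL;
}

/* Run the table for `rounds` meals per philosopher; returns total meals. */
int run_table(int rounds) {
    pthread_t tid[N_PHIL];
    int ids[N_PHIL], total = 0;
    rounds_each = rounds;
    bowls_free = N_BOWLS;
    for (int i = 0; i < N_PHIL; i++) { fork_free[i] = 1; meals[i] = 0; ids[i] = i; }
    for (int i = 0; i < N_PHIL; i++) pthread_create(&tid[i], NULL, philosopher, &ids[i]);
    for (int i = 0; i < N_PHIL; i++) { pthread_join(tid[i], NULL); total += meals[i]; }
    return total;
}
```

Calling `run_table(3)` from `main` lets every philosopher eat three times. The all-or-nothing acquisition rules out deadlock but does not by itself guarantee fairness, so the `meals` counters are a natural starting point for the fairness discussion.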
You are only allowed to use mutexes (locks) and condition variables for this question.

Guideline: Please structure your code to contain the following functions:
void* philosopher(void* args) // for running philosopher thread
void eating() // for entering eating state
void thinking() // for entering thinking state

Deliverables:
● C code
● Readme file explaining the following:
○ Why deadlocks can occur in the problem setup
○ How your proposed solution avoids deadlock
○ Fairness of the solution, i.e. for your implementation, which and how many of the 5 philosopher threads are able to eat, and a rough estimate of how often a philosopher is able to eat (if at all)

Question 2:
Imagine a situation where multiple passengers eagerly await their turn to take a ride in a car. This car has a limited capacity and can only set off when fully occupied, with a maximum of C passengers on board (where C is less than the total number of passengers). Passengers have the simple tasks of getting on and off the car, while the car itself must manage the loading, running, and unloading procedures. Passengers are allowed to board only when the car has completed the loading process, and the car can commence its journey once it has reached its maximum passenger capacity. Passengers can disembark from the car only after it has completed its unloading process.
Simulate the above by modeling the car and the passengers as threads. Take the total number of passengers and the capacity as input from the user.
Specifically, the car thread has to do the following tasks:
1. Load the specified number of passengers for the ride
2. Wait for all passengers to get on the ride
3. Run the duration of the ride
4. Unload all the passengers until the ride is empty
Each passenger thread has to do the following:
1. Board the ride when it is available
2. Get off the ride when the ride is over
The time taken for each step can be simulated using the sleep function with an appropriate duration, and every action carried out must be printed to the console (in the case of passengers, mention which passenger is carrying out the action using appropriate means). Synchronization can be carried out between the car and passenger threads as necessary. Ensure that your code is deadlock-free. Use semaphores for synchronization.

Guidelines: Please ensure that the following functions are present in the code:
void* car(void* args) // car thread
void* passenger(void* args) // passenger thread
void load() // loading car with passengers
void unload() // unloading passengers
void board() // passenger boards car
void offboard() // passenger gets off car

Deliverables:
● C code
● Writeup explaining code logic and how you avoid concurrency bugs in code

Question 3:
Modeling each car as a thread, write a program such that all cars from the left and the right side are able to cross without violating the above constraints (the number of cars on the left and right is to be taken as input from the user). The following assumptions are also satisfied:
1. Once a car gets on the bridge, it will definitely cross it.
2. Each car takes a fixed amount of time to cross the bridge.
Crossing the bridge can be simulated as the thread calling the sleep() function. For every thread that crosses the bridge, you must indicate which thread it is and which side it is originating from by printing appropriate information to the console. Use semaphores for synchronization.

Guidelines/Hint: It would be much easier to write different thread functions for left and right side cars.
Therefore, please try to include the following functions:
void* left(void* args) // cars on the left
void* right(void* args) // cars on the right
void passing(int dir) // car from some direction is traveling on the bridge

Deliverables:
● C code
● Writeup explaining code logic and how you avoid concurrency bugs in code
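For Question 2, the handshake between the car and its passengers can be sketched with POSIX semaphores as below. This is a hedged sketch, not a reference solution. It assumes a Linux target where unnamed semaphores (`sem_init`) are available, lets each passenger ride exactly once, and requires the passenger count to be a multiple of the capacity so that the last ride can fill. The semaphore names, `MAX_PASSENGERS`, `run_rides` and the `id` parameters on `board`/`offboard` are hypothetical, and the sleep calls are omitted for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define MAX_PASSENGERS 64 /* hypothetical upper bound for the sketch */

static sem_t board_queue;   /* car posts: a seat is open for boarding */
static sem_t all_aboard;    /* last boarder posts: car may depart */
static sem_t unboard_queue; /* car posts: a passenger may get off */
static sem_t all_ashore;    /* last passenger off posts: car may reload */
static sem_t count_lock;    /* binary semaphore guarding the counters */
static int capacity, rides, boarded, unboarded, completed;

void load(void) {
    printf("Car: loading, %d seats open\n", capacity);
    for (int i = 0; i < capacity; i++) sem_post(&board_queue);
}

void unload(void) {
    printf("Car: unloading\n");
    for (int i = 0; i < capacity; i++) sem_post(&unboard_queue);
}

void board(int id) {
    sem_wait(&board_queue); /* blocks until the car has opened a seat */
    printf("Passenger %d boards\n", id);
    sem_wait(&count_lock);
    if (++boarded == capacity) { boarded = 0; sem_post(&all_aboard); }
    sem_post(&count_lock);
}

void offboard(int id) {
    sem_wait(&unboard_queue); /* blocks until the car is unloading */
    printf("Passenger %d gets off\n", id);
    sem_wait(&count_lock);
    completed++;
    if (++unboarded == capacity) { unboarded = 0; sem_post(&all_ashore); }
    sem_post(&count_lock);
}

void *car(void *args) {
    (void)args;
    for (int r = 0; r < rides; r++) {
        load();
        sem_wait(&all_aboard); /* wait for a full car */
        printf("Car: ride %d running\n", r + 1);
        unload();
        sem_wait(&all_ashore); /* wait for an empty car */
    }
    return NULL;
}

void *passenger(void *args) {
    int id = *(int *)args;
    board(id);
    offboard(id);
    return NULL;
}

/* Runs n_passengers / c full rides; returns how many passengers finished. */
int run_rides(int n_passengers, int c) {
    pthread_t car_tid, tid[MAX_PASSENGERS];
    int ids[MAX_PASSENGERS];
    assert(c > 0 && n_passengers <= MAX_PASSENGERS && n_passengers % c == 0);
    capacity = c; rides = n_passengers / c;
    boarded = unboarded = completed = 0;
    sem_init(&board_queue, 0, 0); sem_init(&all_aboard, 0, 0);
    sem_init(&unboard_queue, 0, 0); sem_init(&all_ashore, 0, 0);
    sem_init(&count_lock, 0, 1);
    pthread_create(&car_tid, NULL, car, NULL);
    for (int i = 0; i < n_passengers; i++) {
        ids[i] = i + 1;
        pthread_create(&tid[i], NULL, passenger, &ids[i]);
    }
    for (int i = 0; i < n_passengers; i++) pthread_join(tid[i], NULL);
    pthread_join(car_tid, NULL);
    sem_destroy(&board_queue); sem_destroy(&all_aboard);
    sem_destroy(&unboard_queue); sem_destroy(&all_ashore);
    sem_destroy(&count_lock);
    return completed;
}
```

Because boarding tokens are only issued by `load()` and unboarding tokens only by `unload()`, a passenger from a later ride cannot consume a token meant for the current one. Handling a passenger count that is not a multiple of C is left open here and would need an explicit policy.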
There should be ONLY ONE submission per group. Submit a .zip file named RollNo1_RollNo2.zip containing code and write-up.

The first question requires you to set up your testbench VM. The VM should not require too many resources: approximately 4 GB RAM, 2 virtual CPU cores and about 20 GB of hard drive space. You need to install Artix-base (runit version) on it, following the instructions on the site. The installation must be done using the text mode instructions (and not by using the GUI). To make your life easier, do the following after the installation:
1. Enable ArchLinux repositories.
2. Install the following packages: binutils, elfutils, gcc, gdb, make, automake, autoconf, yasm and vim.
Thereafter, download the stock Linux kernel (from https://www.kernel.org/), then compile and install it on your testbench. Ensure that it boots up!

Some Important Points:
– Please use the attached kernel config while compiling your kernel. The attached kernel config supports Oracle VirtualBox, VMware, and QEMU.
– Once you have unpacked the tarball from kernel.org, run ‘make mrproper’ to clean up the build artifacts.
– Copy ‘config-rev-9-gold’ into the unpacked directory and rename it to .config.
– Make sure to run make nconfig. Now press Escape to update the config according to your kernel version.
– Now run make -j$(nproc) and continue the kernel compilation.

Info regarding VMware (Optional):
– Open your VM settings and remove Printer, SoundCard, and USB Controller. Our VM does not require these devices.

Info regarding VirtualBox (Optional):
– Open your VM settings and disable the sound card.

Grading Rubric: UEFI-enabled VM booting Artix. The students should be able to show the correct partitioning of the virtual hard drive.

This exercise is to show you how to use Linux’s scheduling policies for different processes.
You will be creating three processes, where each of the three processes will count from 1 to 2^32. The three processes should be created with fork(), and thereafter the child processes should use the execl() family of system calls to run another C program which will do the counting mentioned above. To reiterate, you need to launch three processes, each of which runs another process (the counting process).
The following is the detailed specification of each of the processes, to begin with:
1. Process 1: Uses the SCHED_OTHER scheduling discipline with standard priority (nice: 0).
2. Process 2: Uses the SCHED_RR scheduling discipline with default priority.
3. Process 3: Uses the SCHED_FIFO scheduling discipline with default priority.
Each of these processes must time the process of counting from 1 to 2^32. To time the execution, the parent process could get the clock timestamp (using clock_gettime()) before the fork and after each process terminates (an event which can be identified when the blocking system call waitpid() returns).
Hint: To correctly benchmark the scheduling policies, remember that all three processes should be in the READY state, and then it is the policy that decides which process gets the CPU first.

Grading Rubric:
– Successful compilation of the program.
– Appropriate use of system calls to create processes, and of the system calls to set the scheduling discipline and priorities.
– Error handling.
– Makefile to compile the above programs.
– README/write-up describing the program logic used for achieving the above (no more than one page) and an explanation of the outcomes of the tests/measurements.

This exercise requires you to write your own small kernel module. You are required to implement a kernel system call as a module. The task of the system call is to read the entries of the process task_struct and print the number of currently running processes. The system call should be implemented in the kernel.
It should be functional only when the module is loaded, not otherwise. The task_struct data structure requires the header file. Ensure that the required header files are on your system; otherwise, install them by trying these commands:
sudo apt-get install linux-headers
sudo apt-get install linux-headers-generic
NOTE: You can show the module loading and unloading in either Artix Linux or any Linux distribution of your choice. However, make sure you have installed all dependencies and given the permissions as required. If you are using a Windows system, you would need to use a VM with any Linux operating system of your choice. Module programming cannot be done in WSL.

What to submit:
1. A fully functional system call that runs only when the module is loaded, and not otherwise.
2. A proper Makefile to compile the module.
3. A readme file describing the program logic.
4. A screenshot of the message displayed while loading and unloading the module.
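For the scheduling exercise above, the fork/exec/waitpid timing loop could be sketched as follows. This is a hedged illustration, not the expected solution. `time_policies` and `N_CHILDREN` are hypothetical names, `path` is assumed to point at the separately compiled counting program, and "default priority" is interpreted here as the minimum valid priority for each policy via `sched_get_priority_min()`. Setting SCHED_RR or SCHED_FIFO normally needs root privileges, so the sketch only reports a failure and carries on with the inherited policy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define N_CHILDREN 3

/* Fork one child per policy, exec `path` in each, and record the seconds
 * from just before the forks until each child is reaped by waitpid().
 * Returns 0 on success, -1 if the clock, fork or waitpid call fails. */
int time_policies(const char *path, const int policy[N_CHILDREN],
                  double elapsed[N_CHILDREN]) {
    struct timespec start, end;
    pid_t pid[N_CHILDREN];

    assert(path != NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0) return -1;

    for (int i = 0; i < N_CHILDREN; i++) {
        pid[i] = fork();
        if (pid[i] < 0) { perror("fork"); return -1; }
        if (pid[i] == 0) {
            /* Child: pick the policy, then replace itself with the counter. */
            struct sched_param sp = {
                .sched_priority = sched_get_priority_min(policy[i])
            };
            if (sched_setscheduler(0, policy[i], &sp) != 0)
                perror("sched_setscheduler"); /* typically EPERM without root */
            execl(path, path, (char *)NULL);
            perror("execl"); /* only reached if exec failed */
            _exit(127);
        }
    }

    /* Parent: reap children in whatever order they finish. */
    for (int done = 0; done < N_CHILDREN; done++) {
        int status;
        pid_t p = waitpid(-1, &status, 0);
        if (p < 0) { perror("waitpid"); return -1; }
        clock_gettime(CLOCK_MONOTONIC, &end);
        for (int i = 0; i < N_CHILDREN; i++)
            if (pid[i] == p)
                elapsed[i] = (double)(end.tv_sec - start.tv_sec) +
                             (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    }
    return 0;
}
```

A caller would pass something like `{SCHED_OTHER, SCHED_RR, SCHED_FIFO}` together with the path of the counting binary. The children here are forked one after another, so they are not all READY at exactly the same instant; a start barrier (for example a pipe the children block on) would bring the sketch closer to the hint.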
There should be ONLY ONE submission per group. Submit a .zip file named RollNo1_RollNo2.zip containing code and write-up.
Updates after the initial posting of the assignment are highlighted in yellow.

1)
A) Parent (P) is having ID
B) ID of P’s Child is
The child should print two statements:
C) Child is having ID
D) My Parent ID is
Make use of wait() in such a manner that the order of the four statements A, B, C and D is: A, C, D, B. You are free to use any other relevant statement/printf as you desire, and their order of execution does not matter.

2) Write a program to create two processes using the vfork() system call, in which the child process will calculate the factorial of 4 and the parent process will calculate the Fibonacci series up to 16. The parent should wait for the child to complete its work.
Clarification: For the Fibonacci series, you have to calculate and print the first 16 elements of the series.

The following question is a bonus part: Write a program to create two processes using the vfork()/fork() system call, in which the child process will calculate the factorial of 4 and the parent process will calculate the Fibonacci series up to 16. The child should wait for the parent to complete its work.

Rubrics: Students should have implemented the following:
● Program source code(s) with Makefile to compile.
● Write-up giving a brief description of how the program works (less than 1 page)

Q2: You’re making a special unix system, and your project manager wants you to create three specific commands for c shell (a C program that acts as a shell for your system) to be used in it.(50
1) word: It is a built-in (internal) command. It reads the number of words (assume a word is a “space-separated” string) in a text file, and throws an error if the file does not exist.
Syntax: word [-option] [file_name]
It should additionally cater for 2 options:
● -n : ignores new line character
● -d : difference between the word sizes of two text files
Note: Only one of the options (-n or -d) can be used at a time with the word command.
2) dir: It creates a directory, and then changes the path to that directory. It is an external command; throw an error if that directory already exists.
Syntax: dir [-option] [dir_name]
It should additionally cater for 2 options:
● -r : removes the directory if it already exists and creates a new directory instead of throwing an error
It should additionally cater for 2 options:
● -d : display time described by STRING

Rubrics: Students should have implemented the following:
● Program source code(s) with Makefile to compile.
● Write-up giving a brief description of how the program works (less than 1 page)
● Use C libraries for implementing the shell commands.
● Use of exec(), fork(), wait()
● Error handling in terms of wrong command, wrong option or wrong argument

Create a bash script that acts like a math calculator. It should do these things:
1) Read a text file named “input.txt” that has two numbers and an operation in the format x y operation, where x and y are numbers and the operation is the name of the command.
2) Calculate the result of that operation.
3) Save the result in a new text file named “output.txt” in the directory named “Result” (if the directory doesn’t exist in the current directory, make it).
There are only three operations:
1) “xor”: Get the xor of the two given numbers.
2) “product”: Get the product of the two given numbers.
3) “compare”: Get the bigger number from the two given numbers.

Rubrics: Students should have implemented the following:
● Program source code(s) with Makefile to compile.
● Write-up giving a brief description of how the program works (less than 1 page)
● Read and write of text files through bash script only
● Creation of directory (if needed) through bash script only

Test Cases for the Assignment – 1
Q1) There is no test case for this question. The flow of expected output is mentioned in all three parts, and your solution should give the output in the same flow.
Q2) Q3) Input.txt Output.txt Input.txt Output.txt
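For part 1 of the first question (printing the statements in the order A, C, D, B), one possible arrangement of fork() and wait() is sketched below. It is a minimal sketch under assumptions: the "ID" in each statement is taken to be the process ID, and `print_in_order` is a hypothetical function name. The key points are that A is printed before the fork, the child prints C and D, and the parent prints B only after wait() has returned.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Prints A, C, D, B in that order; returns the child's exit status,
 * or -1 if fork() or wait() fails. */
int print_in_order(void) {
    printf("A) Parent (P) is having ID %d\n", (int)getpid());
    fflush(stdout); /* otherwise the child inherits the buffered line A */

    pid_t child = fork();
    if (child < 0) { perror("fork"); return -1; }

    if (child == 0) {
        /* Child: statements C and D. */
        printf("C) Child is having ID %d\n", (int)getpid());
        printf("D) My Parent ID is %d\n", (int)getppid());
        fflush(stdout);
        _exit(0);
    }

    /* Parent: block until the child has finished, then print B. */
    int status;
    pid_t reaped = wait(&status);
    if (reaped < 0) { perror("wait"); return -1; }
    assert(reaped == child); /* the only child is the one just forked */
    printf("B) ID of P's Child is %d\n", (int)child);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Flushing stdout before fork() matters when output is redirected to a file or pipe: without it, the unflushed line A would be duplicated in the child's copy of the buffer.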