In this assignment, you will create an adversarial search agent to play the 2048-puzzle game. A demo of the game is available here: https://play2048.co/

Please read all sections carefully:
I. 2048 As A Two-Player Game
II. Choosing a Search Algorithm: Expectiminimax
III. Using The Skeleton Code
IV. What You Need to Submit
V. Important Information
VI. Optional Heuristics
VII. Before You Submit

I. 2048 As A Two-Player Game

2048 is played on a 4×4 grid with numbered tiles that can slide up, down, left, or right. The game can be modeled as a two-player game, in which the computer AI places a 2- or 4-tile randomly on the board, and the player then selects a direction to move the tiles. Note that the tiles move until they either (1) collide with another tile, or (2) collide with the edge of the grid. If two tiles of the same number collide during a move, they merge into a single tile valued at the sum of the two originals. The resulting tile cannot merge with another tile again in the same move.

Usually, each role in a two-player game has a similar set of moves to choose from, and similar objectives (e.g. chess). In 2048, however, the player roles are inherently asymmetric: the Computer AI places tiles and the Player moves them. Adversarial search can still be applied! Using your previous experience with objects, states, nodes, functions, and implicit or explicit search trees, along with our skeleton code, focus on optimizing your player algorithm to solve 2048 as efficiently and consistently as possible.

II. Choosing a Search Algorithm: Expectiminimax

Review the lecture on adversarial search. Is 2048 a zero-sum game? What are the minimax and expectiminimax principles? The tile-generating Computer AI of 2048 is not particularly adversarial: it spawns tiles irrespective of whether a spawn is the most damaging to the user's progress, placing a 2 with 90% probability and a 4 with 10% probability (see GameManager.py). However, our Player AI will play as if the computer is adversarial, since this proves more effective in beating the game.
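To make the idea concrete before touching the skeleton code, the sketch below runs expectiminimax on a hand-built toy tree rather than on a real 2048 grid. Everything in it is illustrative: the `expectiminimax` helper, the tuple-based tree encoding, and the leaf values are assumptions for this sketch only, with the 0.9/0.1 spawn probabilities taken from the assignment.

```python
# Toy expectiminimax over a hand-built tree: leaves are numbers
# (heuristic values), inner nodes are ("max", children) or
# ("chance", [(prob, child), ...]).
def expectiminimax(node):
    if isinstance(node, (int, float)):      # leaf: heuristic value
        return node
    kind, children = node
    if kind == "max":                       # player picks the best move
        return max(expectiminimax(c) for c in children)
    if kind == "chance":                    # computer spawns a tile at random
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(kind)

# The player chooses between two moves; after each move, the spawned
# tile is a 2 with probability 0.9 or a 4 with probability 0.1.
tree = ("max", [
    ("chance", [(0.9, 10), (0.1, -20)]),    # expected value 7.0
    ("chance", [(0.9, 5),  (0.1, 5)]),      # expected value 5.0
])
print(expectiminimax(tree))                 # 7.0
```

Note how the first move wins despite its risky 10% branch; whether you prefer it in practice depends on how your heuristic scores bad outcomes.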
We will specifically use the expectiminimax algorithm. With expectiminimax, your game-playing strategy assumes the Computer AI chooses a tile to place in a way that minimizes the Player's outcome. Whether or not the Computer AI is optimally adversarial is a question to consider; as a general principle, how far the opponent's behavior deviates from the player's assumption certainly affects how well the AI performs. However, you will see that this strategy works well in this game. Expectiminimax is a natural extension of the minimax algorithm, so think about how to implement minimax first.

As we saw in the simple case of tic-tac-toe, it is useful to employ the minimax algorithm assuming the opponent is a perfect "minimizing" agent. In practice, an algorithm built on the perfect-opponent assumption deviates from reality when playing a sub-par opponent making silly moves, but it still leads to the desired outcome of never losing. If the deviation goes the other way, however (a "maximax" opponent that wants us to win), winning is obviously not guaranteed.

III. Using the Skeleton Code

The skeleton code includes the following files. Note that you will only be working in one of them; the rest are read-only:
• Read-only: GameManager.py. This is the driver program that loads your Computer AI and Player AI and begins a game where they compete with each other. See below on how to execute this program.
• Read-only: Grid.py. This module defines the Grid object, along with some useful operations: move(), getAvailableCells(), insertTile(), and clone(), which you may use in your code. These are by no means the most efficient methods available, so if you wish to strive for better performance, feel free to ignore these and write your own helper methods in a separate file.
• Read-only: BaseAI.py. This is the base class for any AI component.
All AIs inherit from this module and implement the getMove() function, which takes a Grid object as a parameter and returns a move (there are different "moves" for different AIs).
• Read-only: ComputerAI.py. This inherits from BaseAI. Its getMove() function returns a computer action: a tuple (x, y) indicating the cell where the computer wants to place a tile.
• Writable: IntelligentAgent.py. You will create this file. The IntelligentAgent class should inherit from BaseAI. The getMove() function you implement must return a number that indicates the player's action: 0 stands for "Up", 1 stands for "Down", 2 stands for "Left", and 3 stands for "Right". This is where your player-optimizing logic lives and is executed. Feel free to create submodules for this file to use, and include any submodules in your submission.
• Read-only: BaseDisplayer.py and Displayer.py. These print the grid.

To test your code, execute the game manager like so:

$ python3 GameManager.py

The progress of the game will be displayed on your terminal screen, with one snapshot printed after each move that the Computer AI or Player AI makes. Your Player AI is allowed 0.2 seconds to come up with each move. The process continues until the game is over; that is, until no further legal moves can be made. At the end of the game, the maximum tile value on the board is printed.

IMPORTANT: Do not modify the files specified as read-only. When your submission is graded, the grader will first automatically overwrite all read-only files in the directory before executing your code. This ensures that all students use the same game-play mechanism and computer opponent, and that you cannot "work around" the skeleton program and manually output a high score.

IV. What You Need To Submit

Your job in this assignment is to write IntelligentAgent.py, which intelligently plays the 2048-puzzle game. Here is a snippet of starter code to allow you to observe how the game looks when it is played out.
In the following "naive" Player AI, the getMove() function simply selects a random move out of the available moves:

import random
from BaseAI import BaseAI

class IntelligentAgent(BaseAI):
    def getMove(self, grid):
        # Selects a random move and returns it
        moveset = grid.getAvailableMoves()
        return random.choice(moveset)[0] if moveset else None

Of course, that is a very naive way to play the 2048-puzzle game. If you submit this as your finished product, you will likely receive a low grade. You should implement your Player AI with the following points in mind:
• Employ the expectiminimax algorithm. This is a requirement. There are many viable strategies to beat the 2048-puzzle game, but in this assignment we will be using the expectiminimax algorithm. Note that 90% of tiles placed by the computer are 2's, while the remaining 10% are 4's. It may be helpful to first implement regular minimax.
• Implement alpha-beta pruning. This is a requirement. This should speed up the search process by eliminating irrelevant branches. In this case, is there anything we can do about move ordering?
• Use heuristic functions. What is the maximum height of the game tree? Unlike elementary games like tic-tac-toe, in this game it is highly impractical to search the entire depth of the theoretical game tree. To be able to cut off your search at any point, you must employ heuristic functions that assign approximate values to nodes in the tree. Remember, the time limit allowed for each move is 0.2 seconds, so you must implement a systematic way to cut off your search before time runs out.
• Assign heuristic weights. You will likely want to include more than one heuristic function. In that case, you will need to assign a weight to each individual heuristic.
Deciding on an appropriate set of weights will take careful reasoning, along with careful experimentation. If you feel adventurous, you can also simply write an optimization meta-algorithm that iterates over the space of weight vectors until you arrive at results you are happy with.

V. Important Information

Please read the following information carefully. Before you post a clarifying question on the discussion board, make sure your question is not already answered in the following sections.

1. Note on Python 3
We will only accept homeworks in Python 3 (Python 2 has been discontinued). To test your algorithm in Python 3, execute the game manager like so:

$ python3 GameManager.py

2. Basic Requirements
Your submission must fulfill the following requirements:
• You must use adversarial search in your IntelligentAgent (expectiminimax with alpha-beta pruning).
• You must provide your move within the time limit of 0.2 seconds.
• You must name your file IntelligentAgent.py (Python 3).
• Your grade will depend on the maximum tile values your program usually reaches.

3. Grading Submissions
Grading is exceptionally straightforward for this project: the better your Player AI performs, the higher your grade. While the grading is straightforward, we admit that this Adversarial Search project is the most difficult project in this class because of its open-endedness. Your Player AI will be pitted against the standard Computer AI for a total of 10 games, and the maximum tile value of each game will be recorded. Among the 10 runs, we pick and average the top 5 maximum tile values. Based on the average of these 5 maximum tile values, your submission will be assessed out of a total of 100 points.
• Submissions that are no better than random will receive a score of zero.
• Submissions that contain two 1024 runs and three 2048 runs will receive full credit.
For example, [256, 512, 512, 512, 1024, 1024, 1024, 2048, 2048, 2048] will receive full credit.
• Submissions that fall somewhere in between will receive partial credit on a logarithmic scale. That is, every time you manage to double your average maximum tile value, your final grade moves up in equally-spaced notches (instead of doubling as well). For other credit examples, please see the FAQs.

VI. Optional Heuristics – In Addition to Required Heuristics

Your heuristics are key to your IntelligentAgent's consistent success. Watch some videos of pro 2048 players. Observe patterns you might try enforcing via heuristics, and compare those players' performance to that of your IntelligentAgent as a potential benchmark of whether your AI is performing well. Take into consideration both quantitative and qualitative measures, such as:
• the absolute value of tiles,
• the difference in value between adjacent tiles,
• the potential for merging of similar tiles,
• the ordering of tiles across rows, columns, and diagonals.

We also encourage you to research 2048-specific heuristics. This page (http://ovolve.github.io/2048-AI/) hosts a basic 2048 AI written in JavaScript; your IntelligentAgent should do much better. This StackOverflow thread (https://stackoverflow.com/questions/22342854/what-is-the-optimal-algorithm-for-the-game-2048) introduces the concepts of "Monotonicity" and "Smoothness", which you may choose to implement. Whatever ideas you use, remember to use expectiminimax and alpha-beta pruning as instructed to receive full credit.

VII. Before You Submit
• Make sure your code executes. In particular, make sure you name your file correctly according to the instructions specified above, especially regarding different Python versions.
• Make sure your IntelligentAgent.py does not print anything to the screen. Printing gameplay progress is handled by Grid.py, and there should ideally be nothing else printed.
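As an illustration of combining weighted heuristics like those listed above, the sketch below scores a bare 4×4 board (a plain list of lists, standing in for the skeleton's Grid object) with two toy measures: an empty-cell count and a row-monotonicity count. The function names, the weights, and the board representation are all assumptions for this sketch; your own heuristics and tuned weights will differ.

```python
def empty_cells(board):
    # More empty cells generally means more room to maneuver.
    return sum(cell == 0 for row in board for cell in row)

def row_monotonicity(board):
    # Count adjacent horizontal pairs that are non-increasing left-to-right.
    return sum(row[i] >= row[i + 1]
               for row in board for i in range(len(row) - 1))

# Hypothetical weights; in practice these are tuned experimentally.
WEIGHTS = {"empty": 2.0, "mono": 1.0}

def evaluate(board):
    # Weighted sum of the individual heuristics.
    return (WEIGHTS["empty"] * empty_cells(board)
            + WEIGHTS["mono"] * row_monotonicity(board))

board = [[4, 2, 0, 0],
         [8, 4, 2, 0],
         [16, 8, 4, 2],
         [32, 16, 8, 4]]
print(evaluate(board))   # 18.0
```

A weighted sum like this is what you would plug in as the leaf value of your depth-limited expectiminimax search.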
The objective of Sudoku is to fill a 9×9 grid with the numbers 1-9 so that each column, row, and 3×3 sub-grid (or box) contains one of each digit. You may try out the game here: sudoku.com.

Sudoku has 81 variables, i.e. 81 tiles. The variables are named by row and column, and are valued from 1 to 9, subject to the constraints that no two cells in the same row, column, or box may hold the same value. Frame your problem in terms of variables, domains, and constraints.

We suggest representing a Sudoku board with a Python dictionary, where each key is a variable name based on location, and each value is the value of the tile placed there. Using variable names A1, ..., A9, ..., I1, ..., I9, the board above has:
• sudoku_dict["B1"] = 2, and
• sudoku_dict["E2"] = 9.
We give value zero to a tile that has not yet been filled.

Executing your program

Your program will be executed as follows:

$ python3 sudoku.py

In the starter zip, sudokus_start.txt contains hundreds of sample unsolved Sudoku boards, and sudokus_finish.txt the corresponding solutions. Each board is represented as a single line of text, starting from the top-left corner of the board, and listed left-to-right, top-to-bottom. If you place the file sudokus_start.txt in the same folder as sudoku.py, the following command will apply your algorithm to all boards and produce an output. For details, read the main method in the skeleton code.

$ python3 sudoku.py

The first board in sudokus_start.txt is represented as the string:

003020600900305001001806400008102900700000008006708200002609500800203009005010300

which is equivalent to:

0 0 3 0 2 0 6 0 0
9 0 0 3 0 5 0 0 1
0 0 1 8 0 6 4 0 0
0 0 8 1 0 2 9 0 0
7 0 0 0 0 0 0 0 8
0 0 6 7 0 8 2 0 0
0 0 2 6 0 9 5 0 0
8 0 0 2 0 3 0 0 9
0 0 5 0 1 0 3 0 0

Your program will generate output.txt, containing a single line of text representing the finished Sudoku board.
For example:

483921657967345821251876493548132976729564138136798245372689514814253769695417382

Test your program using sudokus_finish.txt, which contains the solved versions of all of the same puzzles. All puzzles we provided have unique solutions.

III. Backtracking Algorithm

Implement backtracking search using the minimum remaining value heuristic. Pick your own order of values to try for each variable, and apply forward checking to reduce variables' domains.
• Test your program on sudokus_start.txt.
• Report the number of puzzles you can solve and the mean, standard deviation, min, and max of the runtime over all puzzles in sudokus_start.txt.

IV. Important Information

1. Test-Run Your Code
Test, test, test. Make sure you produce an output file with the exact format of the example given.

2. Grading Submissions
We test your final program on 20 boards. Each board is worth 5 points if solved, and zero otherwise. These boards are similar to those in your starter zip, so if you solve all of those, you'll get full credit.

3. Time Limit
No brute force! Your program should solve puzzles in well under a minute per board. Programs with much longer running times will be killed.

4. Just for fun
Try your code on the world's hardest Sudokus! There's nothing to submit here; this is just for fun. For example:
Sudoku: 800000000003600000070090200050007000000045700000100030001000068008500010090000400
Solution: 812753649943682175675491283154237896369845721287169534521974368438526917796318452

V. What You Need To Submit
1. Your sudoku.py file (and any other Python code dependency).
2. A README.txt with your results, including:
• the number of boards you could solve from sudokus_start.txt,
• running-time statistics: min, max, mean, and standard deviation.

VI. Before You Submit
• Ensure that your file is named sudoku.py.
• Ensure that your file compiles and runs.
• After your submission on Gradescope, you will receive feedback within 5 minutes on whether your code has the proper filename, output format, and execution time. Please address any issues and resubmit before the deadline.
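As a worked example of the dictionary representation suggested above, the sketch below builds the board dictionary directly from an 81-character board string. The helper names (`board_from_string`, `ROWS`, `COLS`) are assumptions for this sketch; only the "A1"..."I9" naming and the zero-for-blank convention come from the instructions.

```python
ROWS = "ABCDEFGHI"
COLS = "123456789"

def board_from_string(s):
    # Map variable names "A1".."I9" to int values; 0 marks an unfilled tile.
    assert len(s) == 81
    return {r + c: int(s[i * 9 + j])
            for i, r in enumerate(ROWS)
            for j, c in enumerate(COLS)}

board = board_from_string(
    "003020600900305001001806400008102900700000008"
    "006708200002609500800203009005010300")
print(board["A3"], board["E9"])   # 3 8
```

The inverse operation, serializing the solved dictionary back into one line for output.txt, is simply `"".join(str(board[r + c]) for r in ROWS for c in COLS)`.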
In this assignment you will create an agent to solve the N-puzzle game. You will implement and compare several search algorithms, and collect some statistics related to their performance. Visit mypuzzle.org/sliding for the game's rules.

Please read all sections carefully:
I. Introduction
II. Algorithm Review
III. What You Need To Submit
IV. What Your Program Outputs
V. Implementation and Testing
VI. Before You Finish

I. Introduction

The N-puzzle game consists of a board holding N = m² − 1 distinct movable tiles, plus one empty space. There is one tile for each number in the set {0, 1, ..., m² − 1}. In this assignment, we will represent the blank space with the number 0 and focus on the m = 3 case (the 8-puzzle). In this combinatorial search problem, the aim is to get from any initial board state to the configuration with all tiles arranged in ascending order {0, 1, ..., m² − 1}; this is your goal state. The search space is the set of all possible states reachable from the initial state. Each move consists of swapping the empty space with an adjacent tile in one of the four directions {'Up', 'Down', 'Left', 'Right'}. Give each move a cost of one. Thus, the total cost of a path will be equal to the number of moves made.

II. Algorithm Review

Recall from lecture that search begins by visiting the root node of the search tree, given by the initial state. Three main events occur when visiting a node:
• First, we remove the node from the frontier set.
• Second, we check whether the node matches the goal state.
• If not, we then expand the node.
To expand a node, we generate all of its immediate successors and add them to the frontier, if they (i) are not already in the frontier, and (ii) have not yet been visited. This describes the life cycle of a visit, and is the basic order of operations for search agents in this assignment: (1) remove, (2) check, and (3) expand. We will implement the assignment algorithms as described here.
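The remove-check-expand cycle above can be sketched generically before committing to the 8-puzzle state representation. The sketch below is a minimal BFS skeleton over an abstract successor function, not the full solver; the toy graph, the `neighbors` callable, and the returned path format are assumptions for illustration.

```python
from collections import deque

def bfs(start, goal, neighbors):
    # Frontier as a FIFO queue, plus a mirror set for O(1) membership tests.
    frontier = deque([start])
    frontier_set = {start}
    explored = set()
    parent = {start: None}
    while frontier:
        state = frontier.popleft()          # (1) remove
        frontier_set.discard(state)
        explored.add(state)
        if state == goal:                   # (2) check
            path = []                       # reconstruct path by walking parents
            while parent[state] is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for child in neighbors(state):      # (3) expand
            if child not in frontier_set and child not in explored:
                parent[child] = state
                frontier.append(child)
                frontier_set.add(child)
    return None

# Toy graph standing in for puzzle states and their successors.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs("A", "D", graph.get))             # ['B', 'D']
```

Note that the goal test happens only when a node is removed from the frontier, exactly as the instructions require; successors are never goal-checked at expansion time.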
Please refer to the lecture notes for further details, and review the lecture pseudocode before you begin.

IMPORTANT: You may encounter implementations that attempt to short-circuit this order by performing the goal-check on successor nodes immediately upon expansion of a parent node. For example, Russell & Norvig's implementation of BFS does precisely this. Doing so may lead to edge-case gains in efficiency, but does not alter the general characteristics of complexity and optimality for each method. For simplicity and grading purposes in this assignment, do not make such modifications to the algorithms learned in lecture.

III. What You Need To Submit

Your job in this assignment is to write puzzle.py, which solves any 8-puzzle board when given an arbitrary starting configuration. The program will be executed as follows:

$ python3 puzzle.py <method> <board>

The method argument will be one of the following. You must implement all three of them:
bfs (Breadth-First Search)
dfs (Depth-First Search)
ast (A-Star Search)
The board argument will be a comma-separated list of integers containing no spaces. For example, to use the breadth-first search strategy to solve the input board given by the starting configuration {0,8,7,6,5,4,3,2,1}, the program will be executed like so (with no spaces between commas):

$ python3 puzzle.py bfs 0,8,7,6,5,4,3,2,1

IV.
What Your Program Outputs

Your program will create and/or write to a file called output.txt, containing the following statistics:
path_to_goal: the sequence of moves taken to reach the goal
cost_of_path: the number of moves taken to reach the goal
nodes_expanded: the number of nodes that have been expanded
search_depth: the depth within the search tree when the goal node is found
max_search_depth: the maximum depth of the search tree in the lifetime of the algorithm
running_time: the total running time of the search instance, reported in seconds
max_ram_usage: the maximum RAM usage in the lifetime of the process as measured by the ru_maxrss attribute in the resource module, reported in megabytes

Example 1: Breadth-First Search
Suppose the program is executed for breadth-first search as follows:

$ python3 puzzle.py bfs 1,2,5,3,4,0,6,7,8

The output file will contain exactly the following lines:
path_to_goal: ['Up', 'Left', 'Left']
cost_of_path: 3
nodes_expanded: 10
search_depth: 3
max_search_depth: 4
running_time: 0.00188088
max_ram_usage: 0.07812500

Example 2: Depth-First Search
Suppose the program is executed for depth-first search as follows:

$ python3 puzzle.py dfs 1,2,5,3,4,0,6,7,8

The output file will contain exactly the following lines:
path_to_goal: ['Up', 'Left', 'Left']
cost_of_path: 3
nodes_expanded: 181437
search_depth: 3
max_search_depth: 66125
running_time: 5.01608433
max_ram_usage: 4.23940217

More test cases are provided in the FAQs.

Note on Correctness
All variables, except running_time and max_ram_usage, have one and only one correct answer when running BFS and DFS. The A* nodes_expanded value might vary depending on implementation details. You'll be fine as long as your algorithm follows all specifications listed in these instructions.
As running_time and max_ram_usage values vary greatly depending on your machine and implementation details, there is no "correct" value to look for. They are for you to monitor the time and space complexity of your code, which we highly recommend. A good way to check the correctness of your program is to walk through small examples by hand, like the ones above. Use the following piece of code to calculate max_ram_usage:

import resource

dfs_start_ram = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
# ... run the search ...
dfs_ram = (resource.getrusage(resource.RUSAGE_SELF).ru_maxrss - dfs_start_ram) / (2 ** 20)

Our grading script runs in a Linux environment. Windows users should make their max_ram_usage calculation Linux-compatible before submission; you can test your code on a Linux platform using services such as Google Colab.

V. Implementation and Testing

For your first programming project, we are providing hints and explicit instructions. Before posting a question on the discussion board, make sure your question is not already answered here or in the FAQs.

1. Implementation
You will implement the following three algorithms as demonstrated in lecture. In particular:
• Breadth-First Search. Use an explicit queue, as shown in lecture.
• Depth-First Search. Use an explicit stack, as shown in lecture.
• A-Star Search. Use a priority queue, as shown in lecture. For the choice of heuristic, use the Manhattan priority function; that is, the sum of the distances of the tiles from their goal positions. Note that the blank space is not considered an actual tile here.

2. Order of Visits
In this assignment, where an arbitrary choice must be made, we always visit child nodes in the "UDLR" order; that is, ['Up', 'Down', 'Left', 'Right'] in that exact order. Specifically:
• Breadth-First Search. Enqueue in UDLR order; de-queuing results in UDLR order.
• Depth-First Search.
Push onto the stack in reverse-UDLR order; popping off results in UDLR order.
• A-Star Search. Since you are using a priority queue, what happens with duplicate keys? How do you ensure nodes are retrieved from the priority queue in the desired order?

3. Submission Test Cases
Run all three of your algorithms on the following test cases:

Test Case 1
$ python3 puzzle.py bfs 3,1,2,0,4,5,6,7,8
$ python3 puzzle.py dfs 3,1,2,0,4,5,6,7,8
$ python3 puzzle.py ast 3,1,2,0,4,5,6,7,8

Test Case 2
$ python3 puzzle.py bfs 1,2,5,3,4,0,6,7,8
$ python3 puzzle.py dfs 1,2,5,3,4,0,6,7,8
$ python3 puzzle.py ast 1,2,5,3,4,0,6,7,8

Make sure your code passes at least these test cases and follows our formatting exactly. The results of each test are assessed on 8 items: 7 are listed in Section IV, What Your Program Outputs; the last point is for code that executes and produces any output at all. Each item is worth 0.75 points.

4. Grading and Stress Tests
We will grade your project by running additional test cases on your code. In particular, there will be five test cases in total, each tested on all three of your algorithms, for a total of 15 distinct tests. Similar to the submission test cases, each test will be graded on 8 items, for a total of 90 points. In addition, we give 10 points for code that completes all 15 test cases within 10 minutes. If you implement your code with reasonable choices of data structures, your code will solve all 15 test cases within a minute in total. We will use a wide variety of inputs to stress-test your algorithms to check for correctness of implementation, so we recommend that you test your own code extensively. Don't worry about checking for malformed input boards, boards of non-square dimensions or other sizes, or unsolvable boards. You will not be graded on the absolute values of your running time or RAM usage statistics; the values of these statistics can vary widely depending on the machine.
However, we recommend that you take advantage of them in testing your code. Try batch-running your algorithms on various inputs, and plotting your results on a graph to learn more about the space and time complexity characteristics of your code. Just because an algorithm produces the correct path to the goal does not mean it has been implemented correctly.

5. Tips on Getting Started
Begin by writing a class to represent the state of the game at a given turn, including parent and child nodes. We suggest writing a separate solver class to work with the state class. Feel free to experiment with your design; for example, you might include a board class to represent the low-level physical configuration of the tiles, delegating the high-level functionality to the state class. You will not be graded on your design, so you are at liberty to choose among your favorite programming paradigms. Students have successfully completed this project using an entirely object-oriented approach, and others have done so with a purely functional approach. Your submission will receive full credit as long as your puzzle program outputs the correct information.

VI. Before You Finish
• Make sure your code passes at least the submission test cases.
• Make sure your algorithms generate the correct solution for an arbitrary solvable instance of the 8-puzzle.
• Make sure your program always terminates without error, and in a reasonable amount of time. You will receive zero points from the grader if your program fails to terminate. Running times of more than a minute or two may indicate a problem with your implementation. If your implementation exceeds the allocated time limit (20 minutes for all test cases), your grade may be incomplete.
• Make sure your program output follows the specified format exactly. In particular, for path_to_goal, use square brackets to surround the list of items, use single quotes around each item, and capitalize the first letter of each item.
Round floating-point numbers to 8 places after the decimal point. You will not receive proper credit from the grader if your format differs from the provided examples above.

COMS 4701 Puzzle FAQs

Q. My search algorithm seems correct but is too slow. How can I reduce its running time?
A. A search algorithm is perhaps one of the best learning materials for computational complexity and Python's idiosyncrasies. There are four dos and don'ts:

1. Don't store possibly large data, such as the solution path, in the search-tree node class. Instead, rethink which operations should be fast.
Explanation: Storing a path from the root node in each node class achieves O(1) lookup time at the expense of O(n) creation time. For example, if the current state is visited after 60,000 intermediate states, the current state has to allocate a list of 60,000 elements, and each of its children has to allocate a list of 60,001 elements. This would soon use up physical memory, and typically your machine's operating system kills the search process. A key observation is that in a search algorithm, the path-lookup operation is executed just once, after the search finishes. Thus, the lookup operation is fine to be slow. You might consider another data structure that has O(n) lookup time for the solution path but requires only O(1) operations during the search.

2. Don't be satisfied with just using a list as the frontier. Instead, design your own Frontier class that works faster.
Explanation: One major bottleneck of the list, deque, or queue classes in Python is that their membership-testing operation is O(n). Membership-testing speed is critical for a search algorithm because that operation is executed for every child state. Using such list-like data structures is a good first step, but to achieve reasonable execution time, you may need one more trick. Note that the pseudocode in the lecture slides does not necessarily reflect implementation details (i.e., time and space).
Rather, it conceptually shows the algorithm's inputs, processing order, and outputs. One of your missions in this assignment is to make the "frontier" a reality using Python's "low-level" primitives.

3. Don't use an O(n) operation when there is a faster way to do the same thing.
Explanation: Roughly speaking, if you place one O(n) operation inside your for neighbor in neighbors loop, your code will very likely exceed the grading time limit. In other words, your code can execute drastically faster when you fix just one line. For example, merging two sets and then checking whether an element is in the merged set is an expensive operation, while checking whether an element is in either of the two sets is an O(1) operation.

4. Don't use copy.deepcopy() for lists. Instead, use list1 = [5, 6]; list2 = list(list1) or list2 = list1[:].
Explanation: copy.deepcopy() handles very rare recursive edge cases and is slow. When simply copying a list or other data structure, you can construct a new list with the list() constructor or the [:] operator. Some people avoid the second notation due to readability concerns, but it is frequently used in the real world.

Q. Do I need to optimize my search algorithm as much as possible?
A. You don't need to squeeze out your code's performance with fancy optimization techniques such as bit shifting or reducing the number of function calls (i.e., putting every operation in one function to reduce call overhead). Apart from avoiding copy.deepcopy(), most of your design choices are about choosing the best data structures in terms of time/space complexity.

Q. Are there any other test cases?
A. The following two test cases might help you validate your statistics.
Note that long path_to_goal values are truncated, and running_time/max_ram_usage are removed:

$ python3 puzzle.py dfs 6,1,8,4,0,2,7,3,5
path_to_goal: ['Up', 'Left', 'Down', ..., 'Up', 'Left', 'Up', 'Left']
cost_of_path: 46142
nodes_expanded: 51015
search_depth: 46142
max_search_depth: 46142

$ python3 puzzle.py bfs 6,1,8,4,0,2,7,3,5
path_to_goal: ['Down', 'Right', 'Up', 'Up', 'Left', 'Down', 'Right', 'Down', 'Left', 'Up', 'Left', 'Up', 'Right', 'Right', 'Down', 'Down', 'Left', 'Left', 'Up', 'Up']
cost_of_path: 20
nodes_expanded: 54094
search_depth: 20
max_search_depth: 21

$ python3 puzzle.py ast 6,1,8,4,0,2,7,3,5
path_to_goal: ['Down', 'Right', 'Up', 'Up', 'Left', 'Down', 'Right', 'Down', 'Left', 'Up', 'Left', 'Up', 'Right', 'Right', 'Down', 'Down', 'Left', 'Left', 'Up', 'Up']
cost_of_path: 20
nodes_expanded: 696
search_depth: 20
max_search_depth: 20

$ python3 puzzle.py dfs 8,6,4,2,1,3,5,7,0
path_to_goal: ['Up', 'Up', 'Left', ..., 'Up', 'Up', 'Left']
cost_of_path: 9612
nodes_expanded: 9869
search_depth: 9612
max_search_depth: 9612

$ python3 puzzle.py bfs 8,6,4,2,1,3,5,7,0
path_to_goal: ['Left', 'Up', 'Up', 'Left', 'Down', 'Right', 'Down', 'Left', 'Up', 'Right', 'Right', 'Up', 'Left', 'Left', 'Down', 'Right', 'Right', 'Up', 'Left', 'Down', 'Down', 'Right', 'Up', 'Left', 'Up', 'Left']
cost_of_path: 26
nodes_expanded: 166786
search_depth: 26
max_search_depth: 27

$ python3 puzzle.py ast 8,6,4,2,1,3,5,7,0
path_to_goal: ['Left', 'Up', 'Up', 'Left', 'Down', 'Right', 'Down', 'Left', 'Up', 'Right', 'Right', 'Up', 'Left', 'Left', 'Down', 'Right', 'Right', 'Up', 'Left', 'Down', 'Down', 'Right', 'Up', 'Left', 'Up', 'Left']
cost_of_path: 26
nodes_expanded: 1585
search_depth: 26
max_search_depth: 26

Q. Why does the example python puzzle.py dfs 1,2,5,3,4,0,6,7,8 return ['Up', 'Left', 'Left'] instead of ['Up', 'Left', 'Down', ...] (a solution path with 31 moves)? We are using UDLR (Up, Down, Left, Right) order, and the Down move should be executed before the Left move.
Isn't the ['Up', 'Left', 'Left'] solution the result of the optimization forbidden in the project instructions?
A. No, the ['Up', 'Left', 'Left'] solution does not use the forbidden optimization and is a correct answer. Compared to the simplicity of the 3-move solution path, you will notice that the nodes_expanded statistic is extremely large (181437 states). In fact, the total number of reachable states in the (solvable) 8-puzzle is 9!/2 = 181440, so this statistic suggests that depth-first search constantly overlooks the goal state and expands more than 99.9% of the possible states in the search space. Why is this happening? Please think about the reason for a minute before reading the following explanation.

There are two facts we can read from the DFS pseudocode in the class slides:
Fact 1. The goalTest() function is called only on states that have just been popped from the frontier. In other words, goalTest() is not applied to neighbor states even if a neighbor is actually the solution state (because we prohibited that optimization).
Fact 2. A child state (neighbor) is only pushed onto the frontier when it is not already in the frontier (or explored). More specifically, the goal state is only pushed onto the frontier when it is not already there.

Combining these two facts leads to the conclusion: since the goal state is pushed onto the frontier early (i.e., as the state corresponding to the ['Up', 'Left', 'Left'] moves), any subsequent encounter with the solution state cannot execute frontier.push() or goalTest() and is effectively meaningless. Thus, the DFS algorithm extensively searches through numerous states of the 8-puzzle (without putting the solution state into the frontier again), gradually backtracks to the previously pushed states, and finally finds the solution state that was pushed at the very beginning of the search.

This question typically arises when you remove or forget the neighbor-in-frontier check in your pseudocode. And if you add the check, you will face the slowness of membership checking.
In that case, please see the first question of this page ("Q. My search algorithm seems correct but is too slow. How can I reduce its running time?").

Q. Why is the max search depth of python puzzle.py bfs 1,2,5,3,4,0,6,7,8 equal to 4 even though the goal state is at depth 3?
The following figure may be useful (please ignore the g and h values).
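The bookkeeping discussed in the answers above (goal test only on pop, and a membership check before every push) can be sketched as a generic graph search. This is an illustrative sketch, not the skeleton code's API; `neighbors` is a placeholder for the UDLR successor function, and a companion set gives the O(1) frontier membership test:

```python
from collections import deque

def graph_search(start, goal_test, neighbors, mode="bfs"):
    """Generic search matching the pseudocode discussed above: goalTest()
    runs only on states popped from the frontier, and a child is pushed
    only if it is in neither the frontier nor the explored set."""
    frontier = deque([(start, [])])   # (state, path of moves so far)
    in_frontier = {start}             # companion set: O(1) membership checks
    explored = set()
    while frontier:
        state, path = frontier.popleft() if mode == "bfs" else frontier.pop()
        in_frontier.discard(state)
        explored.add(state)
        if goal_test(state):          # goal test on pop, not on generation
            return path
        for move, child in neighbors(state):
            if child not in in_frontier and child not in explored:
                frontier.append((child, path + [move]))
                in_frontier.add(child)
    return None
```

With mode="bfs" the frontier is a queue; with mode="dfs" the same frontier is used as a stack. On the start state 1,2,5,3,4,0,6,7,8, the BFS mode returns the 3-move path.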
In this assignment you will be examining search strategies for the 15-puzzle, admissible heuristics for a trajectory planning problem, and pruning in alpha-beta search trees. You should submit a report in .pdf format with answers to the questions below.

For this question you will construct a table showing the number of states expanded when the 15-puzzle is solved, from various start positions, using four different search strategies:
(i) Breadth First Search
(ii) Iterative Deepening Search
(iii) Greedy Search (using the Manhattan Distance heuristic)
(iv) A* Search (using the Manhattan Distance heuristic)

Download the file path_search.zip from this directory: https://www.cse.unsw.edu.au/~cs3411/24T1/code/
Unzip the file and change directory to path_search:
unzip path_search.zip
cd path_search
Run the code by typing:
python3 search.py --start 2634-5178-AB0C-9DEF --s bfs

The --start argument specifies the start position, which in this case is:

Start State    Goal State
 2 6 3 4        1 2 3 4
 5 1 7 8        5 6 7 8
 A B 0 C        9 A B C
 9 D E F        D E F

The Goal State is shown on the right. The --s argument specifies the search strategy (bfs for Breadth First Search). The code should print out the number of expanded nodes (in thousands) as it searches. It should then print a path from the Start State to the Goal State, followed by the number of nodes Generated and Expanded, and the Length and Cost of the path (which are both equal to 12 in this case).

(a) Draw up a table in this format:

Start State   BFS   IDS   Greedy   A*
start1
start2
start3

Run each of the four search strategies from three specified start positions, using the following combinations of command-line arguments:
Start Positions:
start1: --start 2634-5178-AB0C-9DEF
start2: --start 1034-728B-5D6A-E9FC
start3: --start 5247-61C0-9A83-DEBF
Search Strategies:
BFS: --s bfs
IDS: --s dfs --id
Greedy: --s greedy
A* Search: --s astar
In each case, record in your table the length of the path, and the number of nodes Expanded during the search.
Include the completed table in your report.
(b) Briefly discuss the efficiency of these four search strategies, with regard to the number of nodes Expanded and the length of the resulting path.

In this question you will be exploring a search strategy known as Heuristic Path Search, which is a best-first search using the objective function fw(n) = (2 − w)g(n) + w·h(n), where w is a number between 0 and 2. Heuristic Path Search is equivalent to Uniform Cost Search when w = 0, to A* Search when w = 1, and to Greedy Search when w = 2. It is Complete for all w between 0 and 2.

(a) Prove that Heuristic Path Search is optimal when 0 ≤ w ≤ 1, assuming h() is admissible. (You may also assume that A* Search is optimal.) Hint: show that minimizing fw(n) = (2 − w)g(n) + w·h(n) is the same as minimizing f′w(n) = g(n) + h′(n) for some function h′(n) with the property that h′(n) ≤ h(n) for all n.

(b) Draw up a table in this format (the top row has been filled in for you):

                start4         start5         start6
IDA* Search     45  545120     50  4178819    56  169367641
HPS, w = 1.1    58  13770561
HPS, w = 1.2
HPS, w = 1.3
HPS, w = 1.4

Run search.py on each of the three start states shown below, using the Iterative Deepening version of Heuristic Path Search, with w = 1.1, 1.2, 1.3 and 1.4.
Start Positions:
start4: --start A974-3256-FD8B-EC01
start5: --start 153E-A02C-9FBD-8476
start6: --start 418E-7AD0-9C52-3FB6
Search Strategies:
HPS, w = 1.1: --s heuristic --w 1.1 --id
HPS, w = 1.2: --s heuristic --w 1.2 --id
HPS, w = 1.3: --s heuristic --w 1.3 --id
HPS, w = 1.4: --s heuristic --w 1.4 --id
In each case, record in your table the length of the path, and the number of nodes Expanded during the search. Include the completed table in your report.

(c) Briefly discuss how the path Length and the number of Expanded nodes change as the value of w varies between 1.0 (equivalent to IDA*) and 1.4.

We saw in lectures how the Straight-Line-Distance heuristic can be used to find the shortest-distance path between two points.
However, in many robotic applications we wish to minimize not the distance but rather the time taken to traverse the path, by speeding up on the straight bits and avoiding sharp corners. In this question you will be exploring a simplified version of this problem in the form of a game known as Graph Paper Grand Prix (GPGP).

To play GPGP, you first need to draw the outline of a racing track on a sheet of graph paper, and choose a Start location S = (rS, cS) as well as a Goal location G = (rG, cG), where r and c are the row and column. The agent begins at location S, with velocity (0, 0). A "state" for the agent consists of a position (r, c) and a velocity (u, v), where r, c, u, v are (positive or negative) integers.

[Figure: an example track on graph paper, with Start S and Goal G.]

At each time step, the agent has the opportunity to increase or decrease each component of its velocity by one unit, or to keep it the same. In other words, the agent must choose an acceleration vector (a, b) with a, b ∈ {−1, 0, +1}. It then updates its velocity from (u, v) to (u′, v′) = (u + a, v + b), and updates its position, using the new velocity, from (r, c) to (r + u′, c + v′). The aim of the game is to travel as fast as possible, but without crashing into any obstacles or running off the edge of the track, and eventually stop at the Goal with velocity (0, 0).

[Figure: the 1-dimensional version, a number line with the agent at 0 moving with velocity k toward the Goal at n.]

We first consider a 1-dimensional version of GPGP where the vehicle moves through integer locations on a number line, with no obstacles. Assume the Goal is at location n, and that the agent starts at location 0, with velocity k. We will use M(n, k) to denote the minimum number of time steps required to arrive and stop at the Goal. Clearly M(−n, −k) = M(n, k), so we only need to compute M(n, k) for k ≥ 0.

(a) Starting with the special case k = 0, compute M(n, 0) for 1 ≤ n ≤ 21 by writing down the optimal sequence of actions for all n between 1 and 21. For example, if n = 7 then the optimal sequence is [+ + ◦ − ◦ −], so M(7, 0) = 6.
(When multiple solutions exist, you should pick the one which goes "fast early", i.e. with all the +'s at the beginning.)

(b) Assume n ≥ 0. By extrapolating patterns in the sequences from part (a), explain why the general formula for M(n, 0) is M(n, 0) = ⌈2√n⌉, where ⌈z⌉ denotes z rounded up to the nearest integer.
Hint: Do not try to use recurrence relations. You should instead use this identity:
⌈2√n⌉ = 2s + 1, if n = s² + j, 1 ≤ j ≤ s
⌈2√n⌉ = 2s + 2, if n = s² + s + j, 1 ≤ j ≤ s
⌈2√n⌉ = 2s + 2, if n = (s + 1)²

(c) Assuming the result from part (b), show that if k ≥ 0 and n ≥ k(k−1)/2, then
M(n, k) = ⌈2√(n + k(k+1)/2)⌉ − k
Hint: Consider the path of the agent as part of a larger path.

(d) Derive a formula for M(n, k) in the case where k ≥ 0 and n < k(k−1)/2.

(e) Write down an admissible heuristic (that approximates the true number of moves as closely as possible) for the original 2-dimensional GPGP game in terms of the function M() derived above.
Hint: Consider the horizontal and vertical motion separately, keeping in mind that, unlike the 15-puzzle, the agent can move in the horizontal and vertical direction simultaneously. Your heuristic should be of this form:
h(r, c, u, v, rG, cG) = max(M(.., ..), M(.., ..))

(a) Consider a game tree of depth 4, where each internal node has exactly two children (shown below). Fill in the leaves of this game tree with all of the values from 0 to 15, in such a way that the alpha-beta algorithm prunes as many nodes as possible. Hint: make sure that, at each branch of the tree, all the leaves in the left subtree are preferable to all the leaves in the right subtree (for the player whose turn it is to move).

[Figure: a depth-4 binary game tree with alternating MIN and MAX levels.]

(b) Trace through the alpha-beta search algorithm on your tree, clearly showing which of the original 16 leaves are evaluated.

(c) Now consider another game tree of depth 4, but where each internal node has exactly three children.
Assume that the leaves have been assigned in such a way that the alpha-beta algorithm prunes as many nodes as possible. Draw the shape of the pruned tree. How many of the original 81 leaves will be evaluated?
Hint: If you look closely at the pruned tree from part (b) you will see a pattern. Some nodes explore all of their children; other nodes explore only their leftmost child and prune the others. The path down the extreme left side of the tree is called the line of best play or Principal Variation (PV). Nodes along this path are called PV-nodes. PV-nodes explore all of their children. If we follow a path starting from a PV-node but proceeding through non-PV nodes, we see an alternation between nodes which explore all of their children and those which explore only one child. By reproducing this pattern for the tree in part (c), you should be able to draw the shape of the pruned tree (without actually assigning values to the leaves or tracing through the alpha-beta algorithm).

(d) What is the time complexity of alpha-beta search, if the best move is always examined first (at every branch of the tree)? Explain why.

Submission
This assignment must be submitted electronically, either through WebCMS or from the command line, using:
give cs3411 hw2 hw2.pdf
The submission deadline is Tuesday 2nd April, 2pm. Late submissions will incur a penalty of 5% per day, up to a maximum of 5 days. Group submissions will not be allowed. By all means, discuss the assignment with your fellow students. But you must write (or type) your answers individually. Do not copy anyone else's assignment, or send your assignment to any other student. Good luck!
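As a sanity check for the 1-dimensional version of GPGP in Question 3, M(n, k) can be computed exactly with a small breadth-first search over (position, velocity) states. This is an illustrative sketch, not part of the provided code; the position bound is an ad-hoc choice to keep the search finite:

```python
from collections import deque

def M(n, k, bound=200):
    """Minimum steps to go from position 0 with velocity k to position n
    with velocity 0, changing velocity by -1, 0 or +1 each step.
    `bound` caps |position| so the breadth-first search stays finite."""
    start, goal = (0, k), (n, 0)
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        (p, v), steps = frontier.popleft()
        if (p, v) == goal:
            return steps
        for a in (-1, 0, 1):          # acceleration choices
            v2 = v + a
            p2 = p + v2               # position updated with the new velocity
            if abs(p2) <= bound and (p2, v2) not in seen:
                seen.add((p2, v2))
                frontier.append(((p2, v2), steps + 1))
    return None
```

For example, M(7, 0) evaluates to 6, matching the [+ + ◦ − ◦ −] sequence, and M(21, 0) = 10 = ⌈2√21⌉.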
(a) Feedforward-designed Convolutional Neural Networks (FF-CNNs) (20%)
When two CNN layers are cascaded, a non-linear activation function is used in between. As an alternative to the non-linear activation, Kuo et al. proposed the Saak (subspace approximation via augmented kernels) transform [1] and the Saab (subspace approximation via adjusted bias) transform [2]. Specifically, Kuo et al. [2] proposed the first example of a feedforward-designed CNN (FF-CNN), where all model parameters are determined in a feedforward manner without backpropagation. It has two main cascaded modules:
1) Convolutional layers via multi-stage Saab transforms
2) Fully-connected (FC) layers via multi-stage linear least squared regression (LLSR)

Although the term "successive subspace learning" (SSL) was not used explicitly in [2], it does provide the first SSL design example. Read paper [2] carefully and answer the following questions:
(1) Summarize the Saab transform with a flow diagram. Explain it in your own words. The code for the Saab transform can be found at https://github.com/USC-MCL/Channelwise-Saab-Transform. Please read the code alongside the paper to understand the Saab transform better.
(2) Explain the similarities and differences between FF-CNNs and backpropagation-designed CNNs (BP-CNNs). Do not copy any sentence directly from a paper; that is plagiarism. Your score will depend on the degree of your understanding.

(b) Understanding PixelHop and PixelHop++ (15%)
Two interpretable models adopting the SSL principle for the image classification task were proposed by Chen et al. They are known as PixelHop [3] and PixelHop++ [4]. Read the two papers carefully and answer the questions below. You can use various tools in your explanation, such as flow charts, figures, formulas, etc. You should demonstrate your understanding through your answers.
(1) Explain the SSL methodology in your own words. Compare Deep Learning (DL) and SSL.
(2) What are the functions of Modules 1, 2 and 3, respectively, in the SSL framework?
(3) Explain the neighborhood construction and subspace approximation steps in the PixelHop unit and the PixelHop++ unit and make a comparison. Specifically, explain the differences between the basic Saab transform and the channel-wise (c/w) Saab transform.

Please apply PixelHop and PixelHop++ to the MNIST and the Fashion-MNIST datasets.
(a) Building the PixelHop++ Model (35%)
The block diagram of PixelHop++ is shown in Figure 1. It contains three PixelHop++ units. The code for the c/w Saab transform module is provided in the GitHub repository. You can import it in your program to build your model in Python based on the diagram. You should adopt the parameters and the classifier choice in Table 1.

Figure 1: Block diagram of the PixelHop++ model [4]

Table 1: Choice of hyper-parameters of the PixelHop++ model for the MNIST dataset
  Spatial neighborhood size in all PixelHop++ units: 5×5
  Stride: 1
  Max-pooling: (2×2)-to-(1×1)
  Energy threshold for intermediate nodes (TH1): 0.005
  Energy threshold for discarded nodes (TH2): 0.001
  Classifier: XGBoost
  Number of estimators in classifier: 100

(1) Train Module 1 using the whole set or a subset of 10,000 training images (depending on your memory). Remember to keep balance among the different classes (i.e. randomly select 1,000 images per class if you use 10,000 training images). Then, train Module 3 only on the Hop3 feature. Report the training time and the training accuracy. What is your model size in terms of the total number of parameters?
(2) Apply your model to the 10,000 testing images and report the test accuracy.
(3) With the same TH2, try different TH1 energy threshold values in Module 1 and report the test accuracy and the model size for the different choices. Plot the curve of TH1 vs. the test accuracy. Discuss your result.

(b) Comparison between PixelHop and PixelHop++ (15%)
The code for the Saab transform is provided in the GitHub repository.
Please use the Saab transform (instead of the c/w Saab transform) to build the PixelHop model with the same parameter settings as PixelHop++ in Table 1. Note that TH2 is treated as the energy threshold used in the PixelHop paper.
(1) Compare the performance of PixelHop and PixelHop++ in terms of the train accuracy and the test accuracy. Discuss your result.
(2) Compare the model size of PixelHop and PixelHop++ in terms of the number of model parameters. Discuss your result.

(c) Error analysis (15%)
A dataset often contains easy and hard classes. Conduct the following error analysis based on your trained PixelHop++ model using 60,000 training images:
(1) Compute the confusion matrix and show it as a heat map in your report. Which object class yields the lowest error rate? Which object class yields the highest one?
(2) Find the confusing class groups and discuss why they are easily confused with each other. You can use some exemplary images to support your statement.
(3) Propose ideas to improve the accuracy of the difficult classes for PixelHop++ and justify your ideas. There is no need to implement your ideas.

References
[1] C.-C. Jay Kuo and Yueru Chen, "On data-driven Saak transform," Journal of Visual Communication and Image Representation, vol. 50, pp. 237–246, 2018.
[2] C.-C. Jay Kuo, Min Zhang, Siyang Li, Jiali Duan, and Yueru Chen, "Interpretable convolutional neural networks via feedforward design," Journal of Visual Communication and Image Representation, vol. 60, pp. 346–359, 2019.
[3] Yueru Chen and C.-C. Jay Kuo, "PixelHop: A successive subspace learning (SSL) method for object recognition," Journal of Visual Communication and Image Representation, p. 102749, 2020.
[4] Yueru Chen, Mozhdeh Rouhsedaghat, Suya You, Raghuveer Rao, and C.-C. Jay Kuo, "PixelHop++: A Small Successive-Subspace-Learning-Based (SSL-based) Model for Image Classification," https://arxiv.org/abs/2002.03141, 2020.
[5] Yueru Chen, Yijing Yang, Wei Wang, and C.-C. Jay Kuo, "Ensembles of Feedforward-designed Convolutional Neural Networks," in International Conference on Image Processing, 2019.
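For the error analysis in part (c)(1), the confusion matrix and per-class error rates can be computed directly from the predicted labels. This is a minimal NumPy sketch; `y_true` and `y_pred` are placeholder names, and the heat map itself would be rendered separately (e.g. with matplotlib's imshow):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    # rows = true class, columns = predicted class
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_error(cm):
    # error rate of class i = 1 - correct_i / total_i
    totals = cm.sum(axis=1)
    correct = np.diag(cm)
    return 1.0 - correct / np.maximum(totals, 1)
```

Taking np.argmax / np.argmin over the per-class error rates then identifies the hardest and easiest classes.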
In this problem, you will implement texture analysis and segmentation algorithms based on the 5×5 Laws filters constructed by the tensor product of the five 1-D kernels in Table 1.

a) Texture Classification – Feature Extraction (15%)
48 images of four types of textures are given for the texture classification task. They are split into two sets: 36 training samples and 12 testing samples. The ground-truth labels of the 36 training samples are known, while the testing samples' categories are for you to determine. Samples of these images are shown in Fig. 1.
Figure 1: Examples of Grass, Blanket, Stones, Brick [2]

Please follow the steps below to extract features for all the texture images provided and do the analysis:
1. Filter bank response computation: Use the twenty-five 5×5 Laws filters in Table 1 to extract the response vectors from each pixel in the image (use appropriate boundary extensions).
2. Energy feature averaging: Compute the energy feature of each element of the response vector. Average the energy feature vectors over all image pixels, leading to a 25-D feature vector for each image. Which feature dimension has the strongest discriminant power? Which has the weakest? Please justify your answer.
3. Feature reduction: Reduce the feature dimension from 25 to 3 using principal component analysis (PCA). Plot the reduced 3-D feature vectors in the 3-D feature space. Then conduct texture classification using the nearest-neighbor rule based on the Mahalanobis distance.
Note: A built-in PCA function can be used.

b) Advanced Texture Classification — Classifier Exploration (20%)
Based on the 25-D and 3-D feature vectors obtained above, conduct both unsupervised and supervised learning. Please follow the steps below.
1. Unsupervised: K-means clustering is an unsupervised learning algorithm which separates the textures into different categories without the help of ground-truth labels.
It will not directly tell you the class of each image, but it will group similar images together.
a. Apply the K-means algorithm to the test images based on the 25-D features and the reduced 3-D features, respectively. Set the hyperparameter K (the number of clusters) equal to the number of possible classes in the dataset (i.e. K = 4).
b. Use the test labels to evaluate the purity of each cluster. Specifically, classify the images in each cluster as the majority class of that cluster. Report the error rate for both methods. Discuss the effectiveness of the feature dimension reduction on the K-means clustering.
2. Supervised: Use the 3-D features of the training images to train a Random Forest (RF) and a Support Vector Machine (SVM), respectively. Then predict the test-set labels and report the error rate. Compare the two classifiers.
Note: Built-in K-means, RF and SVM functions can be used.

a) Basic Texture Segmentation (20%)
Segment the texture mosaic Mosaic.raw in Fig. 2 by following the steps below:
1. Filter bank response computation: Use the twenty-five 5×5 Laws filters in Table 1 to extract the response vectors from each pixel in the image (use appropriate boundary extensions).
2. Energy feature computation: Use a window approach to compute the energy measure for each center pixel based on the results from step 1. You may try a couple of different window sizes. After this step, you will obtain a 25-D energy feature vector for each pixel.
3. Energy feature normalization: All kernels have zero mean except L5ᵀL5. In fact, the feature extracted by the L5ᵀL5 filter is not a useful feature for texture classification and segmentation. Use its energy to normalize all the other features at each pixel.
4. Segmentation: Discard the feature associated with L5ᵀL5. Use the K-means algorithm to perform segmentation on the composite texture image given in Fig. 2, based on the 24-D energy feature vectors. If there are K textures in the image, your output image will have K colors, with each color representing one type of texture. Use the following randomly generated color map to represent the K = 6 regions. The ordering does not matter.

Region:  0    1    2    3    4    5
R:      107  114  175  167  144  157
G:      143   99  128   57  147  189
B:      159  107   74   32  104  204

Figure 2: Texture mosaic image. [3]

b) Advanced Texture Segmentation (10%)
You may not get good segmentation results for the complicated texture mosaic image in Fig. 2. Please develop some techniques to improve your segmentation result. Several ideas are sketched below.
1. Use PCA for feature reduction. Use the dimension-reduced features to do the texture segmentation of Fig. 2.
2. Develop a post-processing technique to merge small holes.
3. Enhance the boundary of two adjacent regions by focusing on the texture properties in these two regions only.

Image feature extractors are useful for representing image information in a low-dimensional form.
(a) Salient Point Descriptor (Basic: 10%)
SIFT is an effective tool for extracting salient points in an image. Read the paper in [1] and answer the following questions.
1. From the paper abstract, which geometric modifications is SIFT robust to?
2. How does SIFT achieve its robustness to each of them?
3. How does SIFT enhance its robustness to illumination change?
4. What are the advantages of using the Difference of Gaussians (DoG) instead of the Laplacian of Gaussians (LoG) in SIFT?
5. What is SIFT's output vector size in the original paper?

(b) Image Matching (Basic: 15%)
You can apply SIFT to image matching. Extract and show SIFT features.
1. Find the key-points of the Cat_1 and Cat_Dog images in Fig. 3. Pick the key-point with the largest scale in Cat_1 and find its closest neighboring key-point in Cat_Dog. You can do a nearest-neighbor search in the search database for the query image Cat_1, which is represented as a SIFT-extracted feature vector.
Discuss your results, especially the orientation of each key-point. Show the corresponding SIFT pairs between Cat_1 and Cat_Dog.
2. Perform the same processing with the following three image pairs: a) Dog_1 and Cat_Dog, b) Cat_1 vs Cat_2, c) Cat_1 vs Dog_1. The matching may not work well between different objects, or against the same object but with a large viewing-angle difference. Show and comment on the matching results. Explain why it works or fails in some cases.
You are allowed to use an open-source library (OpenCV or VLFeat) to extract features.

Figure 3: Images for image matching: (a) Cat_1, (b) Cat_2, (c) Dog_1, (d) Cat_Dog. [4]

EE 569 Digital Image Processing: Homework #4, Professor C.-C. Jay Kuo

(c) Bag of Words (10%)
If we create a codebook with K codewords representing K types of key patterns, each image can be represented by the codewords. A histogram can be calculated to reflect the occurrence of each codeword in an image. This representation is called the Bag of Words (BoW). Apply K-means clustering to the SIFT features extracted from the four images in part (b) to form a codebook with K = 8 codewords. Each codeword is characterized by the centroid of the SIFT feature vectors. Generate the BoW representations for Cat_1, Dog_1 and Dog_2 provided in the materials. Match Dog_2's BoW representation with Cat_1 and Dog_1, respectively. Which one gives you the better matching? Show the histograms of these three images and discuss your observations.

Figure 4: Dog_2 image. [4]

Appendix:
Problem 1: Texture Analysis: 48 texture images (./train and ./test), 128×128, 8-bit grayscale
Problem 2: Texture Segmentation: Mosaic.raw, 512×512, 8-bit grayscale
Problem 3: Image Feature Extractors:
Cat_1.raw 600×400 24-bit color (RGB)
Cat_2.raw 600×400 24-bit color (RGB)
Cat_Dog.raw 600×400 24-bit color (RGB)
Dog_1.raw 600×400 24-bit color (RGB)
Dog_2.raw 600×400 24-bit color (RGB)

Reference Images
Images in this homework are taken from the SIPI Image Database [2], the Prague dataset [3] and Google Images [4].

References
[1] David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60(2), 91–110, 2004.
[2] https://sipi.usc.edu/database/
[3] https://mosaic.utia.cas.cz/
[4] [Online] http://images.google.com/
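The filter-bank construction for Problem 1 (steps 1–2) can be sketched with outer products of the five 1-D Laws kernels. The kernel values below are the commonly used ones and should be checked against Table 1 of the handout; the loop-based filtering is written for clarity, not speed:

```python
import numpy as np

# The five 1-D Laws kernels (Level, Edge, Spot, Wave, Ripple), as commonly
# defined; verify against Table 1 of the handout.
KERNELS = {
    "L5": np.array([1, 4, 6, 4, 1], dtype=float),
    "E5": np.array([-1, -2, 0, 2, 1], dtype=float),
    "S5": np.array([-1, 0, 2, 0, -1], dtype=float),
    "W5": np.array([-1, 2, 0, -2, 1], dtype=float),
    "R5": np.array([1, -4, 6, -4, 1], dtype=float),
}

def laws_filter_bank():
    # 25 filters: tensor (outer) products of every ordered pair of kernels
    return {a + b: np.outer(ka, kb)
            for a, ka in KERNELS.items()
            for b, kb in KERNELS.items()}

def energy_features(img):
    """25-D feature: mean squared response per filter, reflect-padded
    boundary. Uses correlation; the Laws kernels are (anti)symmetric, so
    the correlation/convolution distinction only affects signs, not
    energies."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 2, mode="reflect")
    feats = {}
    for name, f in laws_filter_bank().items():
        resp = np.zeros_like(img)
        for r in range(img.shape[0]):
            for c in range(img.shape[1]):
                resp[r, c] = np.sum(padded[r:r + 5, c:c + 5] * f)
        feats[name] = np.mean(resp ** 2)
    return feats
```

Stacking the 25 per-image energies then gives the feature vectors that feed the PCA and classification steps.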
(a) Bilinear Demosaicing (10%)
To capture color images, digital camera sensors are usually arranged in the form of a color filter array (CFA), called the Bayer array, as shown in Figure 1. Since each sensor at a pixel location captures only one of the three primary colors (R, G, B), the other two colors have to be reconstructed from neighboring pixel values to obtain the full color. Demosaicing is the process of translating this Bayer array of primary colors into a color image that contains the R, G, B values at each pixel.
Figure 1: (a) Single CCD sensor covered by a CFA and (b) the Bayer pattern [1].

Implement the simplest demosaicing method, based on bilinear interpolation. Exemplary demosaicing results are given in Figure 2. With this method, the missing color value at each pixel is approximated by bilinear interpolation, using the average of its two or four adjacent pixels of the same color. For example, at a red pixel the missing blue value is the average of its four diagonal blue neighbors, and the missing green value is the average of its four horizontally and vertically adjacent green neighbors.

Figure 2: (a) The original image and (b) the demosaiced image by bilinear interpolation.
Figure 3: The House image before demosaicing.

(1) Apply bilinear demosaicing to the House image in Figure 3 and show your results.
(2) Compare your demosaiced image with the color image House_ori, which was obtained from a more advanced demosaicing algorithm. Do you observe any artifacts? If yes, explain the cause of the artifacts and provide your ideas to improve the demosaicing performance.

(b) Histogram Manipulation (20%)
Implement two histogram equalization techniques:
• Method A: the transfer-function-based histogram equalization method,
• Method B: the cumulative-probability-based histogram equalization method
to enhance the contrast of the Hat image in Figure 4 below.
(1) Plot the histogram of the original image.
The figure should have the intensity value on the x-axis and the number of pixels on the y-axis.
(2) Apply Method A to the original image and show the enhanced image. Plot the transfer function.
(3) Apply Method B to the original image and show the enhanced image. Plot the cumulative histograms before and after enhancement.
(4) Discuss your observations on these two enhancement results. Which one do you think is better, and why?
Note that MATLAB users CANNOT use functions from the Image Processing Toolbox, except display functions like imshow().
Figure 4: Hat image

(c) Contrast Limited Adaptive Histogram Equalization (15%)
Instead of histogram equalization using global statistics from the entire image, adaptive histogram equalization breaks an image into several regions (tiles) and finds the histogram of each separately. Contrast Limited Adaptive Histogram Equalization (CLAHE) is one of the most popular such algorithms. Please read the paper [2] carefully to learn about CLAHE.

In this problem, you will apply CLAHE to image haze removal, where you remove the fog from the provided image taken in bad weather. Figure 5 shows an example before and after haze removal. The process can be done through the following three steps:
Step 1: Transform the image from the RGB color space to the YUV color space. The transformation can be found in the Appendix.
Step 2: Apply a histogram equalization algorithm to the Y channel to get Y'.
Step 3: Combine Y' with U and V, and transform them back to the RGB color space. The resulting image is your haze-removed image.

(1) Explain CLAHE in your own words.
(2) Perform the above three steps on Taj_Mahal.raw, applying the two histogram equalization methods from Problem 1(b) in Step 2. Show the resulting images for both methods.
(3) Repeat the three steps but apply CLAHE in Step 2. Here, you are allowed to use open-source code for CLAHE. For C++, you can use OpenCV. For MATLAB, you can check the function adapthisteq.
Tune the hyperparameters, including the number of tiles and the clip limit. Show the resulting image that you think is the most pleasant subjectively.
(4) Compare your results between (2) and (3). Discuss your observations.
Figure 5: An example of image haze removal: (a) original foggy image, (b) defogged image.

In this problem, you will implement a set of denoising algorithms to improve image quality. You can use the PSNR (peak signal-to-noise ratio) quality metric to assess the performance of your denoising algorithm. The PSNR value can be calculated for each of the R, G, B channels as
PSNR = 10 · log10( MAX² / MSE ),
where MAX = 255 for 8-bit data and MSE is the mean squared error between the denoised channel and the original channel.

Remove the noise in the image in Figure 5(b), compare it with the original image in Figure 5(a), and answer the following questions:
(a) Basic denoising methods (10%)
(1) What is the type of noise embedded in Figure 5(b)? Justify your answer.
(2) Apply a linear filter to the noisy image. Compare the performance of two choices of filter parameters, the uniform weight function and the Gaussian weight function, under different filter sizes.
Figure 5: The original and noisy Flower images: (a) original image, (b) noisy image.

(b) Bilateral Filtering (10%)
In most low-pass linear filters, we often see degradation of edges. However, using some nonlinear filters, we can preserve the edges. Bilateral filters are one such kind of filter. A discrete bilateral filter is given by
Y(i, j) = Σ(k,l) I(k, l) · w(i, j, k, l) / Σ(k,l) w(i, j, k, l),
w(i, j, k, l) = exp( −((i − k)² + (j − l)²) / (2σs²) − (I(i, j) − I(k, l))² / (2σc²) ),
where (k, l) is a neighboring pixel location within the window centered around (i, j), I is the image with noise, and Y is the filtered image. σs and σc are the two spread parameters (spatial and intensity range, respectively).
(1) Implement the bilateral denoising filter and apply it to the noisy image.
(2) Explain the roles of σs and σc. Discuss the change in the filter's performance with respect to the values of σs and σc.
(3) Does this filter perform better than the linear filters you implemented in Problem 2(a)? Justify your answer in words.
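The discrete bilateral filter of part (b) can be prototyped directly for grayscale images. In this sketch `sigma_s` and `sigma_c` stand in for the two spread parameters (spatial and intensity range), the window radius and default values are illustrative, and edge padding is one of several reasonable boundary choices:

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_s=2.0, sigma_c=25.0):
    """Bilateral filter sketch for a 2-D grayscale image.
    sigma_s: spatial spread; sigma_c: intensity-range spread."""
    img = np.asarray(img, dtype=float)
    out = np.zeros_like(img)
    H, W = img.shape
    # precompute the spatial Gaussian over the window offsets
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))
    padded = np.pad(img, radius, mode="edge")
    for i in range(H):
        for j in range(W):
            window = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # range weight: penalize intensity differences from the center
            rng = np.exp(-(window - img[i, j]) ** 2 / (2 * sigma_c ** 2))
            w = spatial * rng
            out[i, j] = np.sum(w * window) / np.sum(w)
    return out
```

On a step edge with a small intensity spread the output stays close to the input, illustrating the edge-preserving behavior; a large intensity spread makes the filter behave like a plain Gaussian blur.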
(c) Non-Local Means (NLM) Filtering (10%)
The non-local means filter utilizes pixel values from a larger region rather than the mean of a local window centered around the target pixel. A discrete non-local means filter with a Gaussian weighting function can be written as
Y(i, j) = Σ(k,l) I(k, l) · exp( −‖I(N(i,j)) − I(N(k,l))‖²₂,σ / h² ) / Σ(k,l) exp( −‖I(N(i,j)) − I(N(k,l))‖²₂,σ / h² ),
where I and Y are the noisy and filtered images respectively, N(i, j) is the neighborhood window centered around location (i, j), h is the filtering parameter, and the neighborhood window sizes are of your choice (no larger than the image).
The Gaussian-weighted Euclidean distance between the windows N(i, j) and N(k, l) is defined as
‖I(N(i,j)) − I(N(k,l))‖²₂,σ = Σ over (x, y) ∈ ℵ of G_σ(x, y) · ( I(i − x, j − y) − I(k − x, l − y) )²,
where ℵ denotes the local neighborhood centered at the origin, (x, y) ∈ ℵ denotes the relative position in the neighborhood window, and G_σ is a Gaussian kernel with standard deviation σ > 0.
(1) Apply the NLM filter (using any open-source code; e.g. C++ users can use OpenCV, MATLAB users can check the function imnlmfilt) to the noisy image. Try several filter parameters and discuss their effect on the filtering process. Clearly state your final choice of parameters in your report. (Note that there are four parameters to discuss: the big search window size ℵ, the small neighbor window size, the Gaussian smoothing parameter h for the search window, and the Gaussian smoothing parameter σ for the neighbor window. If the open-source code does not provide a way to adjust a certain parameter, just analyze that parameter theoretically.)
(2) Compare the performance of NLM with the filters used in Problem 2(a) and Problem 2(b).

(d) Mixed noise in a color image (10%)
Figure 6(b) is a noisy color image corrupted with mixed types of noise. Please identify the noise types in the image and answer the following questions:
(1) What types of noise are there? Justify your answer.
(2) What filters would you like to use to remove the mixed noise? Can you cascade these filters in any order? Justify your answer.
(3) Get the best results in removing the mixed noise. Include the following in your report: 1.
Describe your method and show its results. 2. Discuss its shortcomings. 3. Give some suggestions to improve its performance.
Figure 6: (a) the original Flower image, (b) the Flower image with mixed noise.

An exemplary frosted glass effect for the Lake image is shown in Figure 8.
Figure 8: An example of the frosted glass effect.
This effect can be created by simply replacing the color at each pixel with the color of a random pixel in its local N×N neighborhood. For example, for N = 3, the color of the center pixel at location 0 can be replaced by the color at any of the locations 0–8 (including location 0 itself, which means it is unchanged).
(1) Implement the frosted glass filtering with N = 5 or 7 and apply it to the Flower.raw image. Note that you can still formulate the process as a 2-D image filtering problem.
(2) Repeat the process for the noisy image Flower_noisy.raw. Describe your observations and discuss whether the noise leads to any difference.
(3) Consider the bilateral filtering denoising algorithm. Try the following two processes:
a. First, perform denoising on Flower_gray_noisy.raw. Then, apply the frosted-glass filtering.
b. First, apply the frosted-glass filtering on Flower_gray_noisy.raw. Then, perform denoising.
Compare your results and discuss your observations.

Appendix:
Problem 1: Image Demosaicing and Histogram Manipulation
House.raw 768×512 8-bit gray
House_ori.raw 768×512 24-bit color (RGB)
Hat.raw 256×256 8-bit gray
Taj_Mahal.raw 600×400 24-bit color (RGB)
Problem 2: Image Denoising & Problem 3
Flower.raw 768×512 24-bit color (RGB)
Flower_noisy.raw 768×512 24-bit color (RGB)
Flower_gray.raw 768×512 8-bit gray
Flower_gray_noisy.raw 768×512 8-bit gray
Note: "768×512" means "width=768, height=512".
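The frosted glass effect in Problem 3 amounts to drawing one random offset per pixel inside the N×N neighborhood and gathering the corresponding source pixels. A NumPy sketch for a grayscale image follows; the seed and the clipping at the image border are implementation choices (clipping slightly biases border pixels):

```python
import numpy as np

def frosted_glass(img, n=5, seed=0):
    """Replace each pixel with a randomly chosen pixel from its n x n
    neighborhood (the pixel itself is a valid choice)."""
    rng = np.random.default_rng(seed)
    img = np.asarray(img)
    r = n // 2
    H, W = img.shape[:2]
    # one random (dy, dx) offset per pixel, each in [-r, r]
    dy = rng.integers(-r, r + 1, size=(H, W))
    dx = rng.integers(-r, r + 1, size=(H, W))
    # clip source coordinates so border pixels stay inside the image
    yy = np.clip(np.arange(H)[:, None] + dy, 0, H - 1)
    xx = np.clip(np.arange(W)[None, :] + dx, 0, W - 1)
    return img[yy, xx]
```

Because every output pixel is a copy of some input pixel, the result contains only values already present in the input, which is why noise survives the effect unchanged.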
Convert from RGB to YUV color space:
Y =  (0.257 * R) + (0.504 * G) + (0.098 * B) + 16
U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128
V =  (0.439 * R) - (0.368 * G) - (0.071 * B) + 128

Convert from YUV to RGB color space:
R = 1.164(Y - 16) + 1.596(V - 128)
G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
B = 1.164(Y - 16) + 2.018(U - 128)

Reference Images: All images in this homework are from Google Images [3], the USC-SIPI image database [4], the Kodak image dataset [5], or others [6].

References
[1] M. E. Celebi et al. (eds.), Color Image and Video Enhancement.
[2] Zuiderveld, Karel. "Contrast Limited Adaptive Histogram Equalization." Graphics Gems IV. San Diego: Academic Press Professional, 1994. 474-485.
[3] [Online] http://images.google.com/
[4] [Online] http://sipi.usc.edu/database/
[5] [Online] http://r0k.us/graphics/kodak/
[6] [Online] https://comedytravelwriting.com/5-best-photos-of-the-taj-mahal-in-fog/
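The conversion formulas above translate directly into code. A quick sketch (plain Python, per-pixel scalar values in 0..255; no rounding or clipping policy is implied by the handout):

```python
def rgb_to_yuv(r, g, b):
    """RGB -> YUV using the handout's coefficients."""
    y = 0.257 * r + 0.504 * g + 0.098 * b + 16
    u = -0.148 * r - 0.291 * g + 0.439 * b + 128
    v = 0.439 * r - 0.368 * g - 0.071 * b + 128
    return y, u, v

def yuv_to_rgb(y, u, v):
    """YUV -> RGB, the (approximate) inverse transform from the handout."""
    r = 1.164 * (y - 16) + 1.596 * (v - 128)
    g = 1.164 * (y - 16) - 0.813 * (v - 128) - 0.391 * (u - 128)
    b = 1.164 * (y - 16) + 2.018 * (u - 128)
    return r, g, b
```

Note that the round trip is only approximate because the published coefficients are truncated; converting a pixel to YUV and back typically lands within about one intensity level of the original.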
Take what you’ve learned in this class to design and implement a fun/cool project that utilizes the DE1-SoC board and peripherals. Be creative! You are required to use:
1) A VGA display (building off of Lab 5).
2) Some form of significant memory storage (e.g., pixel data, audio clip, other game data). You can refer to Labs 2–4 for a refresher on initializing and using memory.
3) Some form of user input, which could be the switches and push buttons or some of the peripherals listed below.

You can, and are encouraged to, use additional peripherals. All drivers you will need for this lab can be found in the folder on Canvas. Unfortunately, we are limited to the currently available interfaces on LabsLand, but these include:
• Sound output from a speaker (building off of Lab 3).
• A virtual 360-degree joystick.
• An “N8” controller, shown in the form of an old controller from the Nintendo Entertainment System (4-directional pad, Select, Start, A, B).

Notes for using a VGA display: Please note that the provided VGA driver in the folder is different and more fully featured than the one provided in Lab 5.

Notes for using audio output: Recall that LabsLand does not allow you to hear your audio live. Unfortunately, this means you can’t do things like producing sounds/noises at certain events, as they would only be heard in the recording afterward. However, it would be acceptable to produce an audio output as a final artifact based on the user input (e.g., a beat or drum generator).

You are allowed to take inspiration and code from elsewhere (e.g., a software implementation, your 271/369 project), but make sure that you cite your sources. However, these portions will not count towards the overall difficulty of your project; we care about what you will be implementing this quarter.

Category 1: Games
• Side-scrolling: The player(s) move through a level, avoiding or destroying obstacles or enemies.
• Combat: Two or more players compete to collect points and/or defeat the others, maybe with projectiles or by growing their own body as an obstacle.
• Tile-matching: Tetris, Bust-a-Move, Candy Crush, or something of similar complexity.
• Card: Blackjack, Solitaire, Set, or something of similar complexity.

Category 2: Audiovisual
• Paint: Allow the user to draw on the VGA or otherwise change an image output.
• Audio Visualization: Use an audio input file to display some sort of reactive visualization.
• Music Generation: Take in user input to generate sound, e.g., music notation/composition.
Basic pixel-by-pixel painting is not complex enough to be used as the sole feature of a project; this will need to be extended with other, more complex features.

Category 3: Create Your Own
Come up with your own idea that satisfies the list of requirements above and submit a proposal to the course staff!

Past class video with other examples: https://youtu.be/3J6ZwsfqRKQ
Do note that that video is from when the course was using DE1-SoC lab kits, so there were different peripherals available.

Project Explanation Video
This quarter will not require a video demo. The links below are left as examples. Here are some links (UW login required) to past project videos that earned full scores on the video (i.e., not necessarily on the project itself):
• Flood Fill Algorithm: https://drive.google.com/file/d/1wQ0kaWS5sYTW_nqx97thSxf5XRjbGFSy/view?usp=sharing
• Red Light, Green Light Game: https://drive.google.com/file/d/17OUh_Y5rUIQt0QEx2tVa1pa4yn0tr5e/view?usp=sharing

Project Demonstration/Turn-In Requirements
In-Person Demo
• Demonstrate your completed project working on LabsLand.
• Be prepared to answer questions on both the theoretical and practical parts of the project.
Project Report (submit as PDF on Gradescope)
• Include the required Design Procedure, Results, and Experience Report sections.
o If you worked with a partner, include a partner work summary in your Experience Report.
• Don’t forget to also submit your SystemVerilog files ( ), including testbenches!

Lab 6 Rubric
Grading Criteria                                    Points
Name, student ID, lab number                        2 pts
Design Procedure                                    20 pts
Results                                             14 pts
Experience Report                                   14 pts
SystemVerilog code uploaded                         5 pts
Code Style                                          5 pts
PROJECT DEMO                                        70 pts
  ▪ Bonus points available for particularly impressive projects (10 pts)
Total                                               140 pts

Submit a PDF document to Gradescope detailing your proposed Lab 6 project, even if it is from the list of suggestions. This document should be only a few paragraphs long and will give us a chance to verify that you’ve put some thought into your project and that your project (especially if not from the list of suggestions) is of an appropriate difficulty level, i.e., it is not too simple nor too difficult.

Recall that, in addition to the standard DE1-SoC switches and push buttons, you have the following peripherals available for use, though in pre-defined subsets based on the LabsLand interfaces:
1) A VGA display (building off of Lab 5, but note that there is a new driver available for Lab 6).
2) Sound output from a speaker (building off of Lab 3)†.
3) A virtual 360-degree joystick.
4) An “N8” controller, shown in the form of an old controller from the Nintendo Entertainment System (4-directional pad, Select, Start, A, B).
† Recall that LabsLand does not allow you to hear your audio live.

In your proposal:
• Describe your major project behavior, features, components/modules, and user interaction.
• Include at least a top-level block diagram (other diagrams welcome). Be sure to label the blocks and ports in your diagram as best you can. Don’t worry if the final design differs from your proposal.

Lab 6 Proposal Rubric
Grading Criteria                          Points
Project description                       5 pts
System block diagram, with labels         5 pts
Total                                     10 pts
In computer systems it is necessary to provide a substantial amount of memory. If a system is implemented using only FPGA technology, it is possible to provide some amount of memory by using the memory resources that exist in the FPGA device. In this lab, we will examine the general issues involved in implementing such memory.

The FPGA included on the DE1-SoC board provides dedicated memory resources in the form of M10K blocks, each of which contains 10,240 memory bits. The M10K blocks can be configured to implement memories of various sizes. A common term used to specify the size of a memory is its aspect ratio, which gives the depth (in words) and the width (in bits) as depth × width. In this lab, we will use an aspect ratio of 32 × 3.

There are two important features of the M10K blocks:
1) They include registers to synchronize all the input and output signals to a clock input.
2) They have separate ports for writing data to the memory and reading data from the memory.

A conceptual diagram of the Random-Access Memory (RAM) module that we want to implement is shown in Figure 1a. It contains 32 three-bit words (i.e., rows) that are accessed using a five-bit address port, a three-bit bidirectional data port, and a control input. However, given the properties of the M10K blocks, we will instead implement the modified 32 × 3 RAM module shown in Figure 1b. It includes registers for the address, data input, and write ports, and uses a separate, unregistered data output port.

Figure 1: 32 × 3 RAM module for Lab 2.

Remember to start the tasks in this lab by either following the first steps from the Warmup Task from Lab 1 or by making a copy of an existing Quartus project folder.

Commonly used logic structures, such as adders, registers, counters, and memories, can be implemented in an FPGA chip by using prebuilt modules that are provided in libraries. In this exercise, we will use such a module to implement the memory shown in Figure 1b.

1) Create a new Quartus project.
2) Get a library RAM module from the IP Catalog:
a) Open the IP Catalog in the Quartus menu by clicking on → “ ”.
b) In the IP Catalog window, expand “ ”, then “ ”, then “ ”. Then double-click “ ”.
c) In the “ ” dialog box that opens, append the text “ ” to the end of the file name and select “ ” as the file type. Then click “ ” to open the configuration wizard.
d) In the configuration window, specify a 3-bit width for the output bus and 32 words of memory. Then select the “ ” radio button for memory block type and “ ” for the clocking method before clicking “ ”.
e) Now, deselect the check box for registering (i.e., placing a register on) the “ ”. This creates a RAM module that matches the structure in Figure 1b. Click “ ” to accept the defaults for the rest of the settings in the Wizard.
f) You will have to click “ ” one more time at the Summary page to exit the Wizard. If prompted, add the new Quartus Prime IP File to your project by clicking “ ”.
3) Go to “ ” in the Project Navigator window and open (nested under ) to examine it. It defines the following module:

    module ram32x3 (address, clock, data, wren, q);
        input  [4:0] address;  // Address
        input        clock;
        input  [2:0] data;     // DataIn
        input        wren;     // Write
        output [2:0] q;        // DataOut
        // ...

4) Create a new SystemVerilog file called that instantiates the module, using appropriate input and output signals for the memory ports as shown in Figure 1b.
a) Once you compile your circuit, various parts of the Compilation Report will indicate that the RAM circuit is implemented using 96 bits in one of the FPGA memory blocks.

Check above the definition of this module.
If you see a line that looks like the following: you will need to add the same line just above your testbench module in Step 4; otherwise you may see the following error message during simulation: “… does not have a timeunit/timeprecision specification in effect, but other modules do.”

5) Create a suitable testbench to verify that you can read and write to the memory in simulation.

Instead of creating a memory module by using the IP Catalog, we can implement the required memory by specifying its structure in SystemVerilog code as a multidimensional array. A 32 × 3 array, which has 32 words with 3 bits per word, can be declared by the statement:

    logic [2:0] memory_array [31:0];

On the FPGA, such an array can be implemented either by using flip-flops found in each logic cell or, more efficiently, by using the built-in memory blocks. In general, arrays with asynchronous reads (i.e., not registered) will be mapped to flip-flops, and arrays with synchronous reads (i.e., registered) will be mapped to memory blocks.

1) Write a SystemVerilog module in a new file that provides the same functionality as Task #1 but using a multidimensional array. You should be able to use the same testbench as Task #1.
2) Create a top-level SystemVerilog module in a new file that instantiates your new module and uses the inputs and outputs on the DE1-SoC as specified:
a. Use switches – to specify and switches – to specify .
b. Use as the signal and as the input.
c. Display the value (in hex) on – , the value on , and on . A basic 7-segment driver module has been provided for you, which you can freely change; describe any changes you make, but no testbench or waveforms are needed for this module.
3) Synthesize the circuit and download it to a DE1-SoC on LabsLand to test its functionality.

You will likely encounter the following simulation error at first: “Instantiation of ‘altsyncram’ failed. The design unit was not found.” Solution: In the ModelSim menu, select → “ ” and then go to the tab and add under “ ”. Then, go to the tab, select your testbench module, and click . If you are using files to run your simulations, you can also add the text “ ” to the end of the line where you specify your testbench module.

In LabsLand, make sure that you select the “Standard” user interface, as opposed to the “Breadboard” that we used in Lab 1.

The RAM blocks in Figure 1 have a single port for both read and write operations. For this task you will create a different type of memory module that has separate ports for the addresses of read and write operations. You will also learn how to create and use a memory initialization file (MIF).

1) Generate the desired memory module from the IP Catalog by using the “ ” module and call the file .
a) Under “ ”, select “ ” and then “ ”.
b) Configure the memory size, clocking method, and registered ports the same way as in Task #1.
c) Under “ ”, select “ ”.
i. This setting specifies that it does not matter whether the memory outputs the new data being written, or the old data previously stored, in the case that the write and read addresses are the same during a write operation.
d) Under “ ”, select “ ” and specify the filename .
i. This memory initialization file (MIF) allows us to initialize our RAM to specific values that are loaded when the circuit is programmed into the FPGA chip. You will create this MIF file in Step 2.
e) Finish the Wizard and then examine the generated memory module in .
2) Create your MIF file :
a) In the Quartus menu, go to → and then select “ ”.
b) Specify 32 words and a word size of 3.
c) Manually fill the grid with the values you want to place in each memory address.
i. If helpful, the menu will let you adjust the number of cells per row displayed and the address and memory radices.
d) Save the MIF file as .
You can also manually edit this file in a text editor.

3) Augment the top-level module (and its testbench) from Task #2 to use to toggle between the memories of Task #2 ( ) and Task #3 ( ). The following requirements are purposefully similar to Task #2, but pay attention to the differences. Any significant, new modules should have testbenches and simulations; you should mention the source for any provided or reused modules.
a) Use – and – to specify the write address and write data, respectively. Display (in hex) the write address on – and the write data on .
b) Use a counter to cycle through read addresses at about one per second (no verification needed). Display (in hex) the read address on – and the 3-bit word content on .
c) Use the 50 MHz clock, , to synchronize the system and use as a reset signal. Make sure that you properly handle metastability issues for asynchronous inputs.
d) Make sure that the memories from Task #2 and Task #3 are independent of each other (i.e., writing to one memory should not be reflected in the other memory)!
4) Test your circuit and verify that the initial contents of the memory match your file. Make sure that you can independently write data to any address by using the switches and move between the Task #2 and Task #3 memories using .

Lab Demonstration/Turn-In Requirements
In-Person Demo
• Briefly show and explain your SystemVerilog memory implementation from Task #2.
• Briefly show and explain your file from Task #3.
• Demonstrate your working Task #3 memory circuit (which includes the Task #2 circuit) on the DE1-SoC.
• Be prepared to answer 1-2 questions about your lab experience to the TA.
Lab Report (submit as PDF on Gradescope)
• Include the required Design Procedure, Results, and Experience Report sections.
o The designs for basic or provided modules like the 7-segment driver and counter should be briefly mentioned, but you do not need to include state diagrams or simulations for these.
• Don’t forget to also submit your SystemVerilog files ( ), including testbenches!

Lab 2 Rubric
Grading Criteria                                    Points
Name, student ID, lab number                        2 pts
Design Procedure                                    6 pts
  ▪ System block diagram for your finished Task #3 (this includes Task #2)
Results                                             12 pts
  ▪ Waveforms & explanations of Tasks #1 & 2 memories (recommended combined into a single simulation)
  ▪ Waveforms & explanations of the top-level module for Task #3
  ▪ Waveforms & explanations of any other significant and new modules for Task #3
Experience Report                                   6 pts
SystemVerilog code uploaded                         5 pts
Code Style                                          5 pts
LAB DEMO                                            26 pts
Total                                               62 pts
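For reference, the memory initialization file used in Task #3 above is plain text, so it can be written or edited by hand as well as through the Quartus grid editor. A minimal example for a 32 × 3 memory might look like this (the specific values are arbitrary placeholders, not required contents):

```text
-- Example .mif for a 32-word x 3-bit RAM (values are placeholders)
DEPTH = 32;             -- number of words
WIDTH = 3;              -- bits per word
ADDRESS_RADIX = HEX;
DATA_RADIX = HEX;

CONTENT BEGIN
    [00..1F] : 0;       -- default every word to 0
    00 : 5;
    01 : 2;
    0A : 7;
END;
```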
This lab is a refresher on finite state machines (FSMs) that you learned about and designed in EE271 or CSE369.

Consider a parking lot with a single gate for both entry and exit. To keep track of the occupancy of the lot, we decide to use two photosensors to track when cars enter or exit, as shown in Figure 1. When an object is between the photo transmitter and the photo receiver, the light is blocked and the corresponding output is asserted to . By monitoring the events of both sensors, we can determine whether a car is entering or exiting, or even if a pedestrian is passing through! You may assume that two cars won’t be entering/exiting at the same time.

For example, the following sequence indicates that a car entered:
1) Initially, both sensors are unblocked (i.e., )
2) Sensor becomes blocked (i.e., )
3) Both sensors are blocked (i.e., )
4) Sensor becomes unblocked (i.e., )
5) Both sensors are unblocked (i.e., )

Your task is to design a parking lot occupancy counter as follows:
1) Design and implement an FSM for the car detection with two input signals, and , and two output signals, and . The and signals assert true for one clock cycle when a car enters or exits the lot, respectively.
a) You may assume that cars will not change direction while entering or exiting the parking lot.
b) Make sure that your FSM does not detect pedestrians (how would the input pattern differ?).
2) Design and implement a car counter with two control signals, and , which increment and decrement the counter, respectively, when asserted.
a) Assume that the maximum capacity of the parking lot is 16 spots.
3) Design and implement a module for the parking lot occupancy, which combines the car detection and the counter. Your system should have the following properties:

Read the whole lab before starting on any work. If you have any questions at any point, first double-check the lab document. If you still cannot find the answer, you should ask a TA.
Figure 1: The parking lot photosensor setup.

a) Use off-board switches (i.e., not the s) to mimic the two photosensor outputs.
b) Display the current car count on the seven-segment displays and , with the following exceptions:
i. If the counter reaches , display “ ” on – .
ii. When the lot is empty, display “ ” on – and the number “ ” on .
c) Use 2 off-board LEDs (i.e., not your LEDRs) to indicate the values of the and signals.
i. A logical should turn the corresponding LED on and a logical should turn it off.

Figure 2: LabsLand GPIO headers (from GPIO_Guide.pdf) for your reference.

Connecting off-board components requires the use of the GPIO pins of the DE1-SoC board. Please refer to the document “ ” for information on their usage in LabsLand.
a) Wire the anodes of the 2 LEDs to FPGA output ports.
b) Wire the middle pins of the 3 switches to FPGA input ports of your choosing as your three inputs ( , , and ).

Lab Demonstration/Turn-In Requirements
In-Person Demo
• Demonstrate your working parking lot occupancy counter on the DE1-SoC.
• Demonstrate your reset functionality.
• Be prepared to answer 1-2 questions about your design to the TA.
Lab Report (submit as PDF on Gradescope)
• Include the required Design Procedure, Results, and Experience Report sections.
• Don’t forget to also submit your commented SystemVerilog files ( ), including testbenches!

Lab 1 Rubric
Grading Criteria                                    Points
Name, student ID, lab number                        2 pts
Design Procedure                                    12 pts
  ▪ System block diagram and diagrams for car detection and counting
Results                                             10 pts
  ▪ Simulations for top-level and car detection (none needed for car counting)
  ▪ Present state included in all FSM simulations
Experience Report                                   6 pts
SystemVerilog code uploaded                         5 pts
Code Style                                          5 pts
LAB DEMO                                            20 pts
Total                                               60 pts

Make sure that you read the document “ ” for details about what to expect during the lab demo and what we expect to be in your submitted lab report and SystemVerilog code. You can also read “ ” for more detailed examples.
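Before encoding the car-detection FSM in SystemVerilog, it can help to prototype the state logic in software and run the sensor sequences through it. The sketch below is one possible encoding (Python; the state names, the (a, b) sensor ordering, and the "a is blocked first on entry" orientation are all assumptions you would map onto your own design):

```python
def car_events(samples):
    """Run the detector over a sequence of (a, b) sensor samples and
    return the list of 'enter'/'exit' events, assuming a car driving in
    blocks sensor a first and sensor b last (hypothetical orientation)."""
    events, state = [], "idle"
    for a, b in samples:
        if state == "idle":
            if (a, b) == (1, 0): state = "enter1"   # outer sensor blocked
            elif (a, b) == (0, 1): state = "exit1"  # inner sensor blocked
        elif state == "enter1":
            if (a, b) == (1, 1): state = "enter2"   # car spans both sensors
            elif (a, b) == (0, 0): state = "idle"   # pedestrian: never blocks both
        elif state == "enter2":
            if (a, b) == (0, 1): state = "enter3"
        elif state == "enter3":
            if (a, b) == (0, 0):
                events.append("enter"); state = "idle"
        elif state == "exit1":
            if (a, b) == (1, 1): state = "exit2"
            elif (a, b) == (0, 0): state = "idle"   # pedestrian
        elif state == "exit2":
            if (a, b) == (1, 0): state = "exit3"
        elif state == "exit3":
            if (a, b) == (0, 0):
                events.append("exit"); state = "idle"
    return events
```

The pedestrian case falls out naturally: a pedestrian is too narrow to block both sensors at once, so the FSM returns to idle without ever reaching the "both blocked" state, and no event fires.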
Question 1. Add the stamps for the following elements in the functions named makeGmatrix.m, makeCmatrix.m, and makeBvect.m:
a. Resistor (already implemented)
b. Capacitor
c. Inductor
d. Voltage source (already implemented for DC voltage sources)
e. Current source
f. Voltage-controlled voltage source
g. Voltage-controlled current source
h. Current-controlled current source
i. Mutual inductor
j. Ideal op-amp

Question 2. Write a frequency analysis function named fsolve.m. Run all the provided testbench functions to test your simulator.

Deliverables:
1. Submit all your code in a zip file.
2. Submit a PDF file containing your results for all testbench functions.
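The assignment's stamp functions are MATLAB, but the stamping pattern itself is language-neutral. As an illustration only, here is the conductance stamp in Python/NumPy, with a matching capacitor stamp into the C matrix; treating node 0 as ground and dropping its row/column is an assumption about the matrix convention:

```python
import numpy as np

def stamp_resistor(G, n1, n2, R):
    """Stamp a resistor between nodes n1 and n2 into the conductance
    matrix G (node 0 = ground, not stored in the matrix)."""
    g = 1.0 / R
    if n1: G[n1 - 1, n1 - 1] += g
    if n2: G[n2 - 1, n2 - 1] += g
    if n1 and n2:
        G[n1 - 1, n2 - 1] -= g
        G[n2 - 1, n1 - 1] -= g

def stamp_capacitor(C, n1, n2, cap):
    """Capacitor stamp: the same +/+/-/- pattern, but into the C matrix;
    it contributes j*omega*C terms during frequency analysis."""
    if n1: C[n1 - 1, n1 - 1] += cap
    if n2: C[n2 - 1, n2 - 1] += cap
    if n1 and n2:
        C[n1 - 1, n2 - 1] -= cap
        C[n2 - 1, n1 - 1] -= cap
```

The frequency analysis function then solves (G + jωC) x = b at each frequency point of interest.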
1. (18 marks) Chapter 7, Memory access, Anderson Lock

The following benchmark is designed to measure the average time for reading array A containing L elements from memory, when the array elements are s words apart:

    for stride s from 4 Bytes (1 word) to L/2 by 2x (s = 1, 2, 4, ..., L/2)
        time the following loop (repeat many times and average)
            for i from 0 to L-1
                load A[i*(s+1)] from memory (4 Bytes)

So, the benchmark reads L words from memory into cache, where L is a constant; however, these L words are not consecutive memory locations (they are s words apart). The results obtained from the benchmark are shown in Fig. 1.

Figure 1. Average time for reading each array element

Assuming only one level of cache exists, the cache line size is 4 words, and a single processor is running the benchmark, answer the following questions:
1.1 In Fig. 1, when the total number of elements of the array, L, is smaller than L', the average time per access remains constant. What does the value of L' represent? What does time t0 show?
1.2 When L is larger than L', what does time t1 indicate in Fig. 1?
1.3 For each part of the graph (parts 1, 2, and 3), justify its behaviour.

We know a phenomenon called false sharing could cause unnecessary invalidations in ALock (Anderson Lock), and one way to avoid false sharing is to pad array elements so that distinct elements are mapped to distinct cache lines.
1.4 Considering the results from Fig. 1, how could the padding technique used in ALock degrade the overall performance of the lock?

2. (18 marks) Chapter 9, examining the fine-grained algorithm
2.1. Provide the code for the contains() method missing from the fine-grained algorithm described in Chapter 9.
2.2. Write a test to verify the contains() method and explain why your implementation is correct.

3. (12 marks) Chapter 10, designing a bounded lock-based queue
3.1. Design a bounded lock-based queue implementation using an array instead of a linked list.
Allow parallelism by using two separate locks for the head and tail.
3.2. Try to transform your algorithm to be lock-free. Where do you run into difficulty?

4. (32 marks) Chapter 16, matrix vector multiplication
4.1. Implement a sequential matrix-vector multiplication.
4.2. Give an efficient and highly parallel multithreaded algorithm for multiplying an n × n matrix A by a length-n vector x that achieves work Θ(n²) and critical path Θ(log n). [Hint: If you use a cached thread pool with the Future interface, to achieve the Θ(log n) critical path you can follow the algorithm for matrix multiplication in Chapter 16. Otherwise, depending on how you divide the problem, you could obtain a critical path different than Θ(log n). This is acceptable; just be sure to show your work and justify in your report the critical path of your implementation.]
4.3. Write a test program that measures the execution time for multiplying a 2,000 by 2,000 matrix with a corresponding 2,000-wide vector using the parallel method and sequential method, respectively. Discuss the execution time of the two methods and compute the speedup. Be sure to indicate how many threads are being used.
4.4. Analyze and discuss the work and critical-path length of your implementation, and give the parallelism.

Total: 80 marks
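For Question 3.1, the key idea is that enqueuers and dequeuers synchronize on different locks, so the two ends of the queue can make progress in parallel and only interact through the head/tail counters. A non-blocking sketch in Python (the textbook's Java version instead blocks on condition variables and tracks the size in an AtomicInteger; this sketch also leans on reads of a Python int being effectively atomic):

```python
import threading

class BoundedQueue:
    """Two-lock bounded queue sketch: enq and deq can run concurrently.
    Non-blocking variant: raises IndexError when full or empty."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = [None] * capacity
        self.head = 0                      # next index to dequeue
        self.tail = 0                      # next index to enqueue
        self.enq_lock = threading.Lock()   # serializes enqueuers only
        self.deq_lock = threading.Lock()   # serializes dequeuers only

    def enq(self, x):
        with self.enq_lock:
            # Reading head without deq_lock is the simplification here.
            if self.tail - self.head == self.capacity:
                raise IndexError("queue full")
            self.items[self.tail % self.capacity] = x
            self.tail += 1

    def deq(self):
        with self.deq_lock:
            if self.tail == self.head:
                raise IndexError("queue empty")
            x = self.items[self.head % self.capacity]
            self.head += 1
            return x
```

For 3.2, notice what gets hard without the locks: the check of head/tail and the subsequent slot update are no longer one atomic step, which is exactly where a lock-free attempt runs into trouble.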
1. (24 marks) This question will examine the concept of locks.
1.1. Implement the Filter lock described in Chapter 2 of the course text.
1.2. Does the Filter lock allow some threads to overtake others an arbitrary number of times? Explain.
1.3. Implement Lamport’s Bakery lock, also described in Chapter 2.
1.4. Does the Bakery lock allow some threads to overtake others an arbitrary number of times? Explain.
1.5. Propose a test that verifies if a lock works, i.e., if it provides mutual exclusion.
1.6. Provide an implementation for the proposed test and verify that the implemented locks do provide mutual exclusion.

2. (8 marks) Consider LockOne and LockTwo introduced in the textbook; do they still satisfy two-thread mutual exclusion if the shared atomic registers (“flag” in LockOne and “victim” in LockTwo) are replaced by regular registers?

3. (12 marks) Programmers at the Flaky Computer Corporation designed the protocol shown in Fig. 1 to achieve n-thread mutual exclusion. For each question, either sketch a proof or display an execution where it fails.
3.1. Does this protocol satisfy mutual exclusion? (Hint: start the proof by assuming that two threads A and B are in the critical section at the same time.)
3.2. Is this protocol deadlock-free? Explain.
3.3. Is this protocol starvation-free? Explain.

    class LockThree implements Lock {
        private int turn;
        private boolean busy = false;
        public void lock() {
            int me = ThreadID.get();
            do {
                do {
                    turn = me;
                } while (busy);
                busy = true;
            } while (turn != me);
        }
        public void unlock() {
            busy = false;
        }
    }

Fig. 1 The LockThree lock used in Question 3

4. (12 marks) For each of the histories shown in Figs. 2 and 3, are they sequentially consistent? Linearizable? Justify your answer.
Fig. 2 History (a)
Fig. 3 History (b)

5. (8 marks) Consider the class shown in Fig. 4. Suppose two threads A and B are concurrently calling the methods writer and reader.
5.1.
According to what you have been told about the Java memory model, will the reader method ever divide by zero? If yes, describe the order in which writer and reader should be invoked (by threads A and B), and take effect, for a division by zero to happen.
5.2. Is division by zero possible if both x and v are volatile? What happens if neither x nor v is volatile? Justify your answer.

Fig. 4 Volatile example

6. (8 marks) Consider the regular M-valued MRSW construction shown in Fig. 5. True or false:
6.1. If we change the loop at line 11 to “for (int i = x + 1; i < RANGE; i++)”, then the construction is still a regular M-valued MRSW register. Justify your answer.
6.2. If we change the loop at line 11 to “for (int i = x + 1; i < RANGE; i++)”, then the construction yields a safe M-valued MRSW register. Justify your answer.

Fig. 5 The regular M-valued MRSW class

7. (4 marks) Show that if binary consensus using atomic registers is impossible for two threads, then it is also impossible for n threads, where n > 2. (Hint: argue by reduction: if we had a protocol to solve binary consensus for n threads, then we could transform it into a two-thread protocol.)

8. (4 marks) Show that if binary consensus using atomic registers is impossible for n threads, then so is consensus over k values, where k > 2.

Total: 80 marks
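For Question 1.3 above, Lamport's Bakery lock can be sketched in Python. This is illustrative only: real correctness depends on the memory-model questions this assignment asks about, and CPython's GIL masks many of the races you are meant to reason about; the course's reference implementation is Java.

```python
class BakeryLock:
    """Lamport's Bakery lock for n threads (sketch).
    Threads identify themselves by an index i in [0, n)."""
    def __init__(self, n):
        self.n = n
        self.flag = [False] * n   # flag[i]: thread i wants the lock
        self.label = [0] * n      # label[i]: thread i's ticket number

    def lock(self, i):
        self.flag[i] = True                      # doorway: announce interest
        self.label[i] = max(self.label) + 1      # take a ticket
        # Spin while some other interested thread holds a smaller ticket
        # (ties broken by thread index, giving first-come-first-served order).
        while any(self.flag[j] and (self.label[j], j) < (self.label[i], i)
                  for j in range(self.n) if j != i):
            pass

    def unlock(self, i):
        self.flag[i] = False
```

The lexicographic (label, index) comparison is also the answer to the overtaking question 1.4: tickets are issued in arrival order, so the Bakery lock is first-come-first-served, unlike the Filter lock.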
1. (24 marks) Matrix multiplication is a common operation in linear algebra. Suppose you have multiple processors, so you can speed up a given matrix multiplication.
1.1. Modify the following method in the sample code to implement a matrix multiplication sequentially:
    public static double[][] sequentialMultiplyMatrix(double[][] a, double[][] b)
1.2. Modify the following method in the sample code to implement a matrix multiplication in parallel:
    public static double[][] parallelMultiplyMatrix(double[][] a, double[][] b)
Explain the parallel tasks defined in your code.
1.3. Add a method to the source code that measures the execution time for both sequential and parallel matrix multiplication.
1.4. Vary the number of threads being used by the parallel method for matrix sizes of 2000 by 2000, and plot the execution time as a function of the number of threads.
1.5. Vary the size of the matrices being multiplied (100 by 100, 200 by 200, 500 by 500, 1000 by 1000, 2000 by 2000, 3000 by 3000, 4000 by 4000) and plot the execution time as a function of matrix size for both the parallel and sequential methods in one figure. Use the number of threads for which the parallel execution time in 1.4 is minimum.
1.6. For the graphs generated in 1.4 and 1.5, comment on their shape and possible reasons for the observed behavior.

2. (8 marks) Write a program that demonstrates deadlock.
2.1. Explain under what conditions deadlock could occur and comment on its possible consequences.
2.2. Discuss possible design solutions to avoid deadlock.

3. (16 marks) The original dining philosophers problem was invented by E. W. Dijkstra, a concurrency pioneer, to clarify the notions of deadlock- and starvation-freedom. Imagine five philosophers who spend their days just thinking and feasting on sushi. They sit around a circular table with five chairs. The table has a big plate of sushi in the center; however, due to current dining restrictions, there are only five chopsticks available, as shown in Fig. 1.
Each philosopher thinks. When they get hungry, they pick up the two chopsticks closest to them. If a philosopher can pick up both chopsticks, they can eat for a while. After a philosopher finishes eating, they put down the chopsticks and start to think again.

Figure 1

3.1. Modify the class public class DiningPhilosophers in the sample code to simulate the behavior of the philosophers for any number n of them, where each philosopher is a thread and the chopsticks are shared objects. Notice that you must prevent a situation where two philosophers hold the same chopstick at the same time. Notice that in this program the philosophers should eventually deadlock.
3.2. Amend your program so that it never reaches a state where philosophers are deadlocked, that is, it is never the case that each philosopher holds one chopstick and is stuck waiting for another philosopher in order to get the second chopstick. Explain your solution to avoid deadlock.
3.3. Amend your program so that no philosopher ever starves. Explain your solution to avoid starvation.

4. (12 marks) Use Amdahl’s law to answer the following questions:
4.1. Suppose the sequential part of a program accounts for 40% of the program’s execution time on a single processor. Find a limit for the overall speedup that can be achieved by running the program on an n-processor machine.
4.2. Now suppose the sequential part accounts for 30% of the program’s computation time. Let sn be the program’s speedup on n processors, assuming the rest of the program is perfectly parallelizable. Your boss tells you to double this speedup: the revised program should have speedup s′n > 2·sn. You advertise for a programmer to replace the sequential part with an improved version that decreases the sequential time percentage by a factor of k. What value of k should you require?
4.3. Suppose the sequential time percentage could be decreased 3-fold, and when we do so, the modified program takes half the time of the original on n processors.
What fraction of the overall execution time did the sequential part account for? Express your answer as a function of n. You may assume that the program, when executed sequentially, takes unit time.

Total: 60 marks
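The algebra in Question 4 all flows from Amdahl's law, S(n) = 1 / (s + (1 − s)/n), where s is the sequential fraction. A tiny helper is handy for numerically sanity-checking your closed-form answers:

```python
def amdahl_speedup(s, n):
    """Speedup on n processors when fraction s of the work is sequential
    and the remaining (1 - s) is perfectly parallelizable."""
    return 1.0 / (s + (1.0 - s) / n)
```

For 4.1, for example, with s = 0.4 the speedup is capped at 1/0.4 = 2.5 no matter how large n grows, since the parallel term (1 − s)/n vanishes in the limit.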
1. Load and Convert Image to Grayscale:

    path = '/content/drive/Mydrive/Assignment_x/images/'
    image = cv2.imread(path + 'img1.png')

Read the image ./images/mcgill_arts_building.jpg and convert it to grayscale using an appropriate image processing library.
2. Calculate the image derivatives Ix and Iy in both the x and y directions using Sobel filters or another derivative method. Compute the products of the derivatives: Ix², Iy², and IxIy.
3. Smooth the derivative products Ix², Iy², and IxIy using Gaussian filtering to reduce noise.
4. For each pixel, calculate the corner response function using the formula R = det(M) − k·(trace(M))², where M is the second moment matrix formed from the smoothed derivative products, and k is a small empirical constant (typically 0.04-0.06).
5. Perform non-maximum suppression on the corner response map to identify potential corner points by keeping only the local maxima.
6. Apply a threshold to the corner response map, and mark the local maxima that exceed the threshold as detected corners. Overlay the detected corners on the original image for visualization, similar to the example shown in the demo.
7. Experiment with Thresholds: Experiment with different threshold values to control the sensitivity of corner detection. Summarize your findings on the effect of varying thresholds in a markdown cell.

2.1 SIFT Keypoint Matching Between Two Images
1. Compute SIFT Keypoints and Descriptors: Detect SIFT keypoints and compute descriptors for the images ./images/graf1.png and ./images/graf2.png.
2. Match Keypoints: Use a brute-force matching method to find correspondences between the keypoints of both images.
3. Sort Matches by Distance: Sort the matched keypoints in ascending order based on their matching distance.
4. Display the Top Ten Matches: Visualize the top ten matched keypoints, overlaying them on both images.
5. Plot the Matching Distances for the Top 100 Matches: Create a plot with keypoint indices on the x-axis and their corresponding matching distances on the y-axis to visualize the distribution of matching distances.

2.2 Scale Invariance
1.
Compute SIFT Keypoints for the Original Image: Detect the SIFT keypoints and compute their descriptors for ./images/graf1.png .2. Scale the Original Image: Create 4 scaled versions of ./images/graf1.png using scaling factors of 0.25, 0.6, 3, and 5. Display each of the scaled images.3. Compute SIFT Keypoints for the Scaled Images: For each scaled image, compute the SIFT keypoints and their descriptors. 4. Match Keypoints: Perform brute-force matching between the keypoints of the original image ( ./images/graf1.png ) and each of the scaled images.5. Sort Matches by Distance: For each pair of images (original vs. scaled), sort the matches according to the matching distance. 6. Display the Top Ten Matches: Visualize the top ten matches between the original and each scaled image.7. Plot the Matching Distances: For each pair, plot the matching distance for the top 100 keypoints, using the keypoint index on the x-axis and matching distance on the y-axis. 8. Discuss Trends: Analyze the results and discuss how increasing the scale affects the matching distances.2.3 Rotation Invariance 1. Rotate Image: Rotate ./images/graf1.png by the following angles: 30°, 75°, 90°, and 180°. Display the four rotated images.2. Compute SIFT Keypoints for Rotated Images: For each of the four rotated images, compute SIFT keypoints and descriptors. 3. Match Keypoints: Use a brute-force matching method to match the keypoints of the original image ( ./images/graf1.png ) to each of the rotated images.4. Sort Matches by Distance: Sort the matched keypoints for each image pair based on the matching distance in ascending order. 5. Display the Top Ten Matches: Visualize the top ten matched keypoints for each image pair (original vs. rotated).6. Plot the Matching Distances for the Top 100 Matches: Create a plot showing the matching distance for the top 100 keypoints, with keypoint indices on the x-axis and corresponding matching distances on the y-axis.7. 
Discuss Trends: Analyze how increasing the angle of rotation affects the matching distances. Explain the trends observed in the plot.

2.4 Repeat 2.1–2.3 using the image pair you obtained from Assignment 1.

You are given three different views of the same scene located in the folder ./images/Q3. Follow these steps to stitch the images together:

1. Keypoint Detection and Matching:
a. Compute SIFT keypoints and descriptors for images 1 and 2.
b. Match the keypoints between images 1 and 2. Display the 20 best matching pairs.
2. Homography Estimation and Transformation:
a. Use the RANSAC method to compute the homography matrix that aligns keypoints from image 1 to image 2.
b. Apply the computed homography to transform image 1. Image 2 should remain unchanged.
3. Image Stitching:
a. Stitch the transformed image 1 with the original image 2. For better visualization, average the pixel intensities in the overlap region so that you can see patterns from both images. Name this result image_12. Display image_12.
4. Repeat the Process for Image 3:
a. Compute SIFT keypoints and descriptors for image_12 and image 3.
b. Match the keypoints between image_12 and image 3. Display the 20 best matching pairs.
c. Compute the homography using RANSAC to align keypoints from image 3 to image_12. Apply this homography to transform image 3. Image image_12 should remain unchanged.
5. Final Stitching:
a. Stitch the transformed image 3 with image_12. Again, average the pixel intensities in the overlap region. Display the final stitched image.
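The homography estimation at the heart of steps 2 and 4 can be sketched as a direct linear transform (DLT) in plain NumPy; in practice, cv2.findHomography(src, dst, cv2.RANSAC) wraps an estimate like this in a RANSAC loop to reject bad matches. The point sets and H_true below are synthetic illustrations, not assignment data.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate H (3x3) with dst ~ H @ src from >= 4 point correspondences (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)   # null-space vector of A, reshaped to 3x3
    return H / H[2, 2]         # fix the free scale

def apply_homography(H, pts):
    """Apply H to an (N, 2) array of points in homogeneous coordinates."""
    pts_h = np.c_[pts, np.ones(len(pts))] @ H.T
    return pts_h[:, :2] / pts_h[:, 2:]

# Sanity check on a known homography: recover it from 4 exact correspondences.
H_true = np.array([[1.2, 0.1, 5.0], [0.0, 0.9, -3.0], [1e-3, 0.0, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 100], [0, 100]], dtype=float)
dst = apply_homography(H_true, src)
H_est = estimate_homography(src, dst)
print(np.allclose(H_est, H_true, atol=1e-6))  # True
```

With real, noisy SIFT matches you would feed many correspondences in and let RANSAC pick a consensus set before this least-squares step.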
Using a cellphone camera or a standalone digital camera, capture two images of a household object, where each image of the object is taken from a slightly different viewpoint. These images can be of any size. An example of such an image pair is shown below.

Please note that all assignments are to be done individually (not in groups), so you must acquire your own images. Do not share images or copy someone else's images! You will be using these images, as well as the processed images, in some future assignments, so it is important to do all the steps correctly.

1. Display the original images in the assignment's Jupyter notebook.

2. If your images are color (RGB), convert them to grayscale (monochrome) by using the python statement:

gray = np.dot(your_image[..., :3], [0.2989, 0.5870, 0.1140])

In your notebook, explain what this python statement does. Display the grayscale images in the assignment's Jupyter notebook.

3. Use the resize() function of scikit-image to resize your images so that the largest dimension has a size of 256 pixels. A tutorial on how to use the resize function can be found at: https://scikit-image.org/docs/stable/auto_examples/transform/plot_rescale.html

4. Write python code that can perform convolution on an image with a specific filter kernel. Using the convolution code, smooth the pair of resized grayscale images using a 5×5 pixel Gaussian kernel. Then repeat the smoothing on the original grayscale images, this time using a 15×15 Gaussian kernel. Display the smoothed images in the notebook. Also show in the notebook the kernel values (either as an array of numbers or as an image).

5. Using your convolution code, compute the x and y derivative images of each of the smoothed images using the horizontal and vertical Sobel filters. Display the derivative images in the notebook.

6. With python, compute the edge gradient magnitude and orientation of the smoothed images using the Sobel filter values computed in part 5. Display the magnitude and orientation images in the notebook. For the orientation image, display the angle value using an RGB colormap, such as 'jet' in the imshow() function (e.g. something like ax.imshow(data, cmap='jet')). Details about matplotlib colormaps, including other colormaps you can try, can be found at: https://matplotlib.org/stable/tutorials/colors/colormaps.html
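The convolution, smoothing, and gradient steps above can be sketched in plain NumPy; the 32×32 ramp test image and the sigma value are illustrative assumptions, not assignment data.

```python
import numpy as np

def conv2d(img, kernel):
    """'Same'-size 2D convolution with zero padding (kernel is flipped)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]                  # convolution flips the kernel
    padded = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def gaussian_kernel(size, sigma):
    """Normalized square Gaussian kernel."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
sobel_y = sobel_x.T

img = np.tile(np.linspace(0, 1, 32), (32, 1))     # horizontal ramp test image
smoothed = conv2d(img, gaussian_kernel(5, 1.0))   # 5x5 Gaussian smoothing
gx = conv2d(smoothed, sobel_x)                    # x derivative
gy = conv2d(smoothed, sobel_y)                    # y derivative
magnitude = np.hypot(gx, gy)                      # edge gradient magnitude
orientation = np.arctan2(gy, gx)                  # edge gradient orientation
print(magnitude.shape)                            # (32, 32)
```

On the horizontal ramp, the y derivative vanishes away from the borders while the x derivative is constant, which is a quick correctness check before running this on your own photos.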
In this section, you are going to train models on the publicly available CIFAR-10 dataset. The CIFAR-10 dataset consists of 60000 32×32 color images in 10 classes, with 6000 images per class. For more information, you are encouraged to look at their webpage. You are expected to implement a Convolutional Neural Network (CNN) to classify the images based on their content. The free T4 GPU in Colab is highly recommended for executing the code in this section.

1. Implement a shallow CNN with the layers mentioned below:
• A convolution layer with 32 kernels of size 3×3
• A ReLU activation
• A convolution layer with 64 kernels of size 3×3
• A ReLU activation
• A maxpool layer with a kernel size of 2×2
• A convolution layer with 64 kernels of size 3×3
• A ReLU activation
• A convolution layer with 64 kernels of size 3×3
• A ReLU activation
• A flattening layer (this layer resizes a 3D tensor to a feature vector)
• A fully connected layer with an output size of 10 (classes should be predicted as numerical values, like 0–9)

Note: If you use PyTorch, you can use print(next(model.parameters()).device) to check whether you are using the GPU for training.

2. Use the PyTorch class torchvision.datasets.CIFAR10 to load the dataset.

3. Training, validation, and test settings are shown below:
• 50,000 images for training (training set). Divide the 10,000 test-set images of CIFAR10 into two subsets by 1:1: 5,000 images for validation (validation set) and 5,000 images for final testing (test set).
• Batch size = 32.
• SGD optimizer with an initial learning rate of 0.002.
• Loss function: categorical cross entropy criterion.
• Training iterations can be 90,000 or more (if you count epochs, 58 epochs or more). Perform a validation every 5000 iterations (3 epochs). It may take about 30 minutes on a T4 GPU. If you don't have enough computational resources, you are allowed to reduce the number of images by taking a subset of this dataset or to reduce the training iterations (epochs). State the size of your subset and iterations (epochs) in the README.
• Use the default setting for the rest of the hyperparameters.

4. Plot the training loss, validation loss, and validation accuracy over the training iterations (or epochs). Fig. 1 shows an example. State whether the training appears to be overfitting and why.

Fig. 1. Training loss, validation loss, and validation accuracy over the training iterations.

5. Report the test accuracy on the test set from the iteration (or epoch) where the validation accuracy is maximum as your test accuracy result.

6. Let's discuss the effects of kernel size. Change all kernel sizes to 5×5 and train a new network with the other hyperparameters unchanged. Compare the run time and the test accuracy of the models under the different kernel sizes and briefly discuss the possible factors that affect the performance of a CNN.

7. Use the PyTorch class torchvision.models.resnet18 to implement the deep network ResNet18. Set the training iterations to 6000 or more (if you count epochs, 5 epochs or more) and perform a validation on the validation set every 500 iterations (1 epoch). Report the test accuracy on the test set from the iteration (or epoch) where the validation accuracy is maximum as the test accuracy result. The rest of the hyperparameters should be the same as for the shallow CNN above. Note that:
7.1 By setting the parameter pretrained, you can choose to either train a new ResNet18 model from scratch or fine-tune the ResNet18 model that has been fully trained on the ImageNet dataset.
7.2 Since the image size of CIFAR10 is 32×32 and the standard ResNet18 accepts 224×224 input by default, we may need to first resize the input images to 224×224 (you are free to use other available transformations, such as padding).
Besides, the output size of the final fully connected layer of ResNet18 needs to be modified to 10 to meet the classification requirements of CIFAR10.

Compare the impact of using a pre-trained ResNet18 versus training from scratch and discuss the reason. Compare the test accuracy of the deep ResNet18 versus the shallow CNN.

YOLOv8 is a state-of-the-art (SOTA) model for a wide range of object detection and tracking, instance segmentation, image classification, and pose estimation tasks. In this section, you will be asked to take a photo of a street in Montreal and summarize the information in this image by using a trained YOLOv8 model. This section does not ask you to implement and train the model from scratch. You can use a well-trained YOLOv8 model from its official implementation.

1. Use your cellphone or a digital camera to capture a street scene in Montréal.
2. Use the trained YOLOv8 object detection model to identify the types of objects included in the image (such as person, bicycle, vehicle, tree) and count the number of each object.
3. Display the original and predicted images in your notebook.
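The shallow CNN listed in step 1 of the CIFAR-10 section above can be sketched in PyTorch. Assumptions not stated in the layer list: stride 1 and no padding for every convolution, which makes the flattened feature size 64 × 10 × 10 for a 32×32 input (32→30→28→14→12→10).

```python
import torch
import torch.nn as nn

# Shallow CNN sketch for 32x32 RGB CIFAR-10 images (stride 1, no padding assumed).
model = nn.Sequential(
    nn.Conv2d(3, 32, 3), nn.ReLU(),    # 32x32 -> 30x30
    nn.Conv2d(32, 64, 3), nn.ReLU(),   # 30x30 -> 28x28
    nn.MaxPool2d(2),                   # 28x28 -> 14x14
    nn.Conv2d(64, 64, 3), nn.ReLU(),   # 14x14 -> 12x12
    nn.Conv2d(64, 64, 3), nn.ReLU(),   # 12x12 -> 10x10
    nn.Flatten(),                      # 64 * 10 * 10 = 6400 features
    nn.Linear(64 * 10 * 10, 10),       # one score per CIFAR-10 class
)
x = torch.randn(1, 3, 32, 32)          # one CIFAR-10-sized RGB image
print(model(x).shape)                  # torch.Size([1, 10])
```

Checking the output shape on a dummy batch like this before training catches flatten-size mistakes early; if you add padding instead, the Linear layer's input size must change accordingly.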
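The detect-and-count step for the street photo can be sketched as follows. The ultralytics package, the yolov8n.pt weights, and the street.jpg filename are assumptions for illustration (the actual YOLO call is left commented out since it needs the package and your photo); only the counting helper runs here.

```python
from collections import Counter

def summarize_detections(labels):
    """Count how many of each object type appear in a list of detected labels."""
    return dict(Counter(labels))

# Hypothetical YOLOv8 usage (assumes `pip install ultralytics` and a local photo):
# from ultralytics import YOLO
# model = YOLO("yolov8n.pt")                              # pretrained detector
# result = model("street.jpg")[0]
# labels = [model.names[int(c)] for c in result.boxes.cls]
# print(summarize_detections(labels))

# Counting logic demonstrated on a stand-in label list:
print(summarize_detections(["person", "car", "person", "bicycle"]))
# {'person': 2, 'car': 1, 'bicycle': 1}
```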
In this section, we will ask you to train classifiers on the CIFAR-10 dataset. The CIFAR-10 dataset contains 60000 32×32 images. Ten classes are included, with 6000 images per class. Further information can be found on the dataset webpage: https://www.cs.toronto.edu/~kriz/cifar.html. A GPU is highly recommended for running the code in this section.

1. Resize the train/test images to 64×64 and convert them to grayscale images. Compute HoG features with cells of 8×8 pixels, blocks of 4×4 cells, and 4 bins. This should generate a feature vector of size 1600 per image, which can be regarded as the features for training classifiers.
2. Fit a non-linear SVM classifier with default hyperparameters on the features and the class labels of the training images.
3. Predict labels of the test images by feeding the test features to the trained classifier and calculate the classification accuracy.
4. Tune the values of the hyperparameters 'gamma' and 'C' to observe the accuracy change and select the hyperparameters with the highest test accuracy. Display your fine-tuning process by listing all the test cases with their parameters and corresponding accuracy.
5. Fit a Random Forest (RF) classifier (set n_estimators=10, max_depth=5, and criterion='entropy') on the features and the class labels of the training images.
6. Predict labels of the test images by feeding the test features to the trained classifier and calculate the classification accuracy.
7. Compare the performance of the SVM and the RF. Experiment with training both classifiers over a range of random states (different values for random_state). Evaluate the stability across random states. List the strengths and weaknesses of each model.

In this section, you will work on face detection. To achieve this, you are required to compute EigenFaces and use the Viola-Jones detector to identify faces. We have provided a subset of the CelebA [1] face dataset under the folder Q2 part1. The subset contains 100 color images.

Figure 1: Sample images from CelebA dataset.
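As a dimension check for the HoG setup in step 1 of the classifier section above: a 64×64 image with 8×8-pixel cells gives an 8×8 grid of cells; a 4×4-cell block slides over (8 − 4 + 1)² = 25 positions, each contributing 4 × 4 cells × 4 bins = 64 values, for 25 × 64 = 1600 features. A sketch using scikit-image's hog (the random image is a stand-in for a resized grayscale CIFAR-10 image):

```python
import numpy as np
from skimage.feature import hog

image = np.random.rand(64, 64)          # stand-in for a 64x64 grayscale image
features = hog(image,
               orientations=4,          # 4 bins
               pixels_per_cell=(8, 8),  # cells of 8x8 pixels
               cells_per_block=(4, 4))  # blocks of 4x4 cells
print(features.shape)                   # (1600,)
```

If your feature length differs from 1600, revisit the cell, block, and bin settings before training the SVM and RF on top of them.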
1. Read this subset into the code environment and convert all images into grayscale.
2. Implement the snapshot method for PCA (covered in Lecture 8, Slide 55) from scratch using NumPy. Display the first five face images.
3. Use a sliding-window method to detect faces in the image Person.png under the folder Q2 part2. Use the result from the previous step to compute the distance in the eigenspace between the window contents and your training data.
4. Set a threshold to detect faces and select the best-performing value. Show your fine-tuning process.
5. Label the detected faces with bounding boxes and display the final result image with labels.
6. Use an existing implementation of the Viola-Jones face detector to detect faces in the same image. Compare its result with the method you implemented.
7. Evaluate your predicted results for both methods (True/False, Positive/Negative).
8. Explain under what conditions the Viola-Jones detector works when PCA does not.

Figure 2: Face Detection on Multi-person image.

9. Evaluate the performance of this model and explain the steps that this network took to achieve the final result.

References
[1] Liu, Z., Luo, P., Wang, X., and Tang, X. Deep learning face attributes in the wild. In Proceedings of the International Conference on Computer Vision (ICCV), December 2015.