First Year Integrative Seminar 2 PUFY 1013 Spring / 2025 Course Description This course aims to challenge old beliefs about what you can do with writing. It invites inquiry and helps you to think about research as an energizing practice. In Integrative Seminar 2 you will be encouraged to pursue topics you find perplexing, fascinating…maybe even a bit frightening. How can curiosity lead you to ask productive questions and get answers to them? How can you develop a writing process that is organic and unfolds over time? In the first half of the semester quick assignments will introduce you to a variety of research methods and help you to define an area of interest. In the second half of the semester you will pursue your own research based project, in connection with your Studio work. Throughout the semester, you will read texts which explore a wide array of forms that researched writing can take. Once again, Studio and Seminar will come together through a series of bridge projects that highlight the components of the research process: inquiry, context, investigation, interpretation, argument, connections and reflection. Bridge projects are the basis of the collaborative relationship of ideas between the two courses. They ask you to engage with writing as a form. of making, and making as a form of thinking, in order to explicitly and productively blur the boundaries between Seminar and Studio. Class Description - Visual Culture Meaning is embedded in the endless images, spaces, and artifacts that make up our visual culture–what we see around us. How can we learn to look beneath the surface and see what’s being communicated through a broad range of forms – art, advertisements, products, fashion, photography, illustration, architecture, performance, technology, etc. What new and unexpected forms demand our scrutiny? In this class we will explore how to translate the visual into writing, how to create visuals with words, and ultimately how to make a fial product that embodies the best of both written and visual elements. Shared Capacities This course satisfies the following share capacities: ● Authorship ● Critical Analysis ● Communication ● Work in Complex Systems ● Flexibility and Resiliency ● Research Literacy Bridge Project Summary Bridge projects are the heart of the Integrative Studio-Seminar pairing. These projects will “bridge” the ideas, questions and content between these two courses. The following units are explored in Studio and Seminar: Bridge 1: Inquiry Bridge 2: Context + Investigation Bridge 3: Interpretation + Argument Bridge 4: Connections Bridge 5: Reflection and Final Presentations BRIDGE 1: Inquiry → Neighborhoods Tour (weeks 1-3): Learn about NYC through its design elements, motifs, and styles using observation, recording, and experience In Studio, you will create a Neighborhood Portfolio and then Poster inspired by a collection of drawing, photography, notes, audio and video of design motifs after visiting an assigned neighborhood. Brighton Beach, Harlem, Jackson Heights, and Flushing. In Seminar, you will write two different versions of a neighborhood walking tour, guiding readers through their assigned neighborhood and incorporating sensory details and figurative language. With this assignment, we will begin to make conscious choices in the process of crafting structures out of language. We will also begin to get comfortable with drafting and revision. Version 1 due Week 2 - Sunday, 2/2 Version 2 due Week 3 - Sunday, 2/9 Visual References: Francis Alys and Sophie Calle (walks), Alan Lomax and Field Recordings, Wolfgang Tillmans, Martin Parr (photography), New York City Archeological Repository (https://archaeology.cityofnewyork.us/) Readings and Resources: “Blackberries” by Leslie Norris Harvest documentary, dir. by Kevin Barnes (2017) Here is New York (excerpt) by E.B. White Selected poems by various authors They Live (excerpt), dir. by John Carpenter (1988) BRIDGE 2: Context + Investigation → Pavilion Ship (weeks 4-7): Explore materials, space, location and scale through research and building a model, then imagine that model as a spaceship. In Studio, you will research and design a pavilion, in a particular place in NYC (can be a place you visited in Bridge 1 or someplace new). The purpose of this new structure is to see, to look, and to view. After identifying location, propose materials and design to support your goals. Finally, build a model. In Seminar, you will write a short science fiction story, imagining that your pavilion is a self-contained “ship” and describing what life is like on that ship. With this assignment, we will continue to use language as a creative tool while also engaging in a more deliberate drafting process. 1st Draft due Week 5 - Sunday, 2/23 2nd Draft due Week 7 - Sunday, 3/9 Visual References: New York Highline Park, Little Island, Reindeer Pavilion (Snohetta), MoMA’s collection of architectural models. Readings and Resources: “All Summer in a Day” by Ray Bradbury “The Machine Stops” by E.M. Forster Soylent Green, dir. by Richard Fleischer (1973) SPRING BREAK 3/10-3/16 BRIDGE 3: Interpretation + Argument → Annotated Bibliography (weeks 8-9): Develop a thesis question byfollowing your curiosity, then conduct initial research on a question/topic of your choosing. In Studio, you will craft a question to answer through a creative, research-based project. You will brainstorm ideas and then narrow to one topic and one question that can be tested through making. In Seminar, you will conduct exploratory research in line with your interests and curiosities and write an annotated bibliography–a written documentation of your research and exploration. With this assignment, we will continue to practice the revision process and familiarize ourselves with the standards and practices of artistic and academic research. 1st Draft due Week 8 - Sunday 3/23 2nd Draft due Week 9 - Sunday, 3/30 Visual References: Tide and Current Taxi (Marie Lorenz), Sugarbaby (Kara WAlker), The Bowery in Two Systems (Martha Rosler), Broadway Walk (William Pope.L) Readings and Resources: Frances Ha, dir. by Noah Baumbach (2013) Readings will vary from student to student, depending on their area of interest BRIDGE 4: Connections → Artist’s Statement + Zine (weeks 10-14): Transform and test your research through afinal project; write an artist’s statement; and collect, edit, and reimagine your writing in a zine. In Studio, you will create a Final Project to be shared with the class. The scope of each project will be individual and be determined in consultation with your Studio teacher. The final Seminar project is divided into two major outputs. First, students will write a 3-5 page Artist’s Statement that addresses the methods, materials, and process used in their final Studio project, as well as their broader motivations, inspirations, and intentions as an artist. This statement will also discuss and incorporate research from Bridge 3. Then, students will *remediate* their Artist’s Statement into a Zine, which will incorporate visual imagery and previous writings from this class. Both the Statement and the Zine will adhere to the Chicago Manual of Style. for citations and formatting. Ultimately, these Zines will be submitted to a public collection. 1st Draft Artist’s Statement due Week 11 - Sunday, 4/13 2nd Draft Artist’s Statement due Week 12 - Sunday, 4/20 Zine due Week 14 - Sunday, May 4th Readings and Resources: West Side Story, dir. by Robert Wise and Jerome Robbins (1961) West Side Story, dir. by Steven Spielberg (2021) Student led-research → zines, past and present Visit to Special Collections BRIDGE 5 - Reflection and Final Presentations (week 15): Write reflective LP postsfor both Studio and Seminar. Studio Assignment You will write a reflective LP post including a description and analysis of your final project: project statement, an overview of the process, reflections on the critique, and illustrations from your presentation Seminar Assignment You will write a reflective LP post about your final Zine project, your research process, and your progress in the course overall. Learning Outcomes By the successful completion of this course, students will be able to: 1. Bring writing and making together through critical thought. (Studio and Seminar) 2. Continue to build information literacy by finding and using a variety of sources: online and print research material; electronic catalogs and indexes; books, periodicals, databases, websites, archives, and exhibition materials. (Seminar) 3. Develop skills and vocabulary necessary for persuasive argumentation, by learning to craft coherent thesis statements and support arguments. Write clear and cogent image and text-based analysis of their own and others’ work. Exercise both formal and informal writing forms, including process-writing, as well as the critical thesis-driven essay and a culminating, final research paper. (Seminar) 4. Demonstrate critical reading skills by identifying central arguments and supporting evidence in texts including critical essays and other examples of thesis-driven writing. (Seminar) 5. Successfully attribute the use of others’ ideas and images by using a standard citation format and thus avoid plagiarism. (Seminar) 6. Utilize online tools individually and collaboratively in order to collect, organize and communicate research. (Studio and Seminar) 7. Demonstrate an introductory capacity to collect, analyze, interpret and synthesize information through multiple research methods, discussions, writings, and making processes. (Studio and Seminar) 8. Use the online learning portfolio to engage with the idea of making as a form of thinking. Demonstrate the ability to reflect on process, choices made, creative and critical skills learned, and connections fostered, through analysis, reflection, documentation and archiving on the learning portfolio. (Studio and Seminar) 9. Engage with art and design as a generator, embodiment and transmitter of cultural ideas. Demonstrate an understanding of value systems as social constructs. (Studio and Seminar) 10. Use writing and reading-based research to investigate, iterate, and hone a question / problem using a range of methods. Continue to develop on outcomes from Seminar 1. (Seminar) Assessable Tasks ● Write an essay integrating multiple senses and points of view (learning outcome 5, 6, 7) ● Engage in the workshop process (learning outcomes 1, 2, 5) ● Develop an annotated bibliography (learning outcomes 1, 2, 6, 7) ● Write creative science-fiction text with a solid thesis (learning outcomes 5, 6, 7) ● Contribute drafts and final papers to the Learning Portfolio, discuss/write about connections to studio and writing development over the semester (learning outcomes 1, 3) Expectations, Commitments and Community Agreements This course is not a performance given by me, your teacher, nor a process of transmitting information from my brain to yours; rather, this course is a learning community where knowledge is co-created. This means that you yourself are an integral part of cultivating an environment where learning is both possible and enjoyable. On the first day of class, we will work together to build our own community agreements and make commitments to ourselves and to each other. These will be added to the syllabus and referred to as needed throughout the semester. ON DIVERSITY, EQUITY & INCLUSION In order to build community together, it is essential that students from all backgrounds and perspectives feel welcomed and valued. Racism, sexism, transphobia, homophobia, xenophobia, ableism, ageism, classism, and any other forms of oppressive attitudes will not be tolerated. We each bring unique talents, inquiries, knowledge, skills, and personal experiences that contribute to creating a robust and welcoming learning community in which we all can belong and thrive to our fullest potential. As much as possible, we will … …create a supportive, welcoming, and respectful classroom environment …be fully present in class (not on phones, not doing unrelated work) …take advantage of a 10-15 minute break each class to refresh our minds and bodies …review vocabulary regularly and encourage vocabulary practice and retention …address grammar issues as they arise …utilize office hours, before/after class and by appointment ...accommodate the need for physical movement and other needs during sustained attention: Acceptable “Fidgets” ● Friendship bracelets ● Tiny models ● Embroidery ● Clay/slime ● Cross stitch ● Doodling ● Cat’s cradle (for one person) ● Coloring ● Knitting ● Jigsaw puzzle ● Crochet ● Bathroom breaks
Spring 2025 CIT 596 - HW3 This homework deals with the following topics * Mergesort * Quicksort • The homework has to be submitted in electronic form as a pdf file. You can use any editing software you want in order to produce the pdf. No pictures of handwritten solutions. • For any algorithm that we have done in class that you want to use, you are allowed to say ”using Quicksort” or something like that. You are also allowed to just use its runtime without having to reprove it. • For a question involving recursion, please express the time taken in the form of a recurrence and then you can use the master theorem (assuming the master theorem is applicable). • For a question that involves an algorithm that we cover in class, you can use the final big-O result. No need to show the derivation again. For example, if binary search shows up in your algorithm you can just say “we know binary search is O(log n).” • For all questions in this HW and subsequent HWs the goal is to find algorithms that are most efficient in terms of big-O analysis. You do receive partial credit if your algorithm is less efficient than the best. You do not receive credit though if your algorithm computes an incorrect result. So be sure to check for correctness before you worry about efficiency. • Unless otherwise specified, you should write your algorithm analysis as “In the worst case, this algorithm is ....” . • Reminder: Your algorithm should not rely on a fancy data structure in a particular language. Your algorithm needs to be in plain English or in pseudocode. • You do not have to worry about tiny edge cases like empty arrays, empty lists etc. Student Name: 〈 Your Name 〉 Collaborator(if any) : 〈 Your Collaborators 〉(at most 2 other collaborators.) 1. (5 points) You are given a huge array of numbers (could be integers or floats) and your task is to see whether the number 9798 is in this array or not. Two of your friends suggest methods to help you do this • Friend 1 says just loop through the array and try to find the element. • Friend 2 has heard of this cool thing called binary search so they ask you to first sort the array as quickly as you can and then apply this binary search idea. Which of your friends are you going to listen to and why? Compare their answers in terms of their big-O run time. You are allowed to directly use results from lecture regarding the runtime of binary search or any other algorithm. 2. (5 points) What is the running time of quicksort if all the elements are identical? To answer this question please use the quicksort partition pseudocode in the textbook that has been copied here. The screenshot of that will be posted to Canvas. This pseudocode shows you how to partition a subarray that starts from index p and goes all the way to index r. The pivot element being used in this partition is A[r]. Briefly explain your answer. 3. (6 points) Mergesort divides an array into 2 equally sized pieces. Consider modifying it to an algorithm called mergesort3. Instead of dividing into 2 pieces, divide into 3. Recursively sort the 3 pieces and then do a merge of the 3 pieces. (a) Write pseudocode for a function that takes in 3 sorted arrays and returns a merged completely sorted array. Do this WITHOUT just simply calling the regular 2 argument merge twice. (b) What is the time complexity of the function that merges 3 sorted arrays ? Give us a big-O expression. We are asking you for the time complexity of the 3 way merge and not of mergesort at this point. (c) Write a recurrence relation (T(n) = ...) to express how the run time of mergesort with this 3 way merge algorithm can be computed recursively. Now use the master theorem to compute the big O of this mergesort. (d) Is this new mergesort better than what we covered in class. in pure big-O terms. 4. (6 points) A percentile (or a centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations falls. For example, the 20th percentile is the value (or score) below which 20% of the observations may be found. We want to take an array and a percentile value x. Then compute the value that will correspond to that percentile. The way that a lot of statistics texts will do this is this 3 step process (https://goodcalculators.com/percentile-calculator/) 1. Arrange the data such that the entries span from the smallest to the largest values (ascending order). 2. Calculate an index i (the position of the pth percentile) as follows: i = (p/100) ∗ n 3. If i is not an integer, round it up. The next integer greater than i represents the position of the pth percentile. If i is an integer, the pth percentile is the average of the values in positions i and i + 1. For example, computePercentile([1, 8, 2.8, 2, 5], 50) should return 2.8. The steps from statistics make it seem like you have to do a complete sorting of the array. However, we can do a bit better. Using the idea of quicksort’s partition, write an efficient algorithm in plain English (just explain the process here) to find percentile(array, x) to find the xth percentile of an array. The analysis can be a bit tricky here, so you do not need to do that part in this question. You do have to explain why it would work. You can rely on what you have learned in lecture about partition and quicksort. 5. (10 points) For this question you have two types of objects - keys and locks. You can ‘compare’ a key to a lock by trying to fit one into the other. There are 3 possible results for the comparison - the key is too big, the key is too small, or they “match” . You cannot directly compare a key to a key, or a lock to a lock. Suppose you are given n keys and n locks. The keys are all different sizes, and there is a matching key to each lock. 1. Design an algorithm (the usual WEAPARTE stuff) that runs in expected O(nlgn) to correctly match each key to its corresponding lock. Your algorithm should have O(nlog n) as the expected number of comparisons in its worst case. 2. What is the worst-case runtime of your algorithm? Please justify in 2 lines.
CSEN 338 (4 Units) Image and Video Compression PROJECT INFORMATION AND LIST OF SUGGESTED PROJECTS Phases: • Group or individual project • Literature survey, study • Design or Discussion or Comparison • Implementation if any • Presentation & Report Suggested Topics: 1. Entropy Coding or Decoding • Conduct a literature survey on a fast entropy coding or decoding techniques (e.g. fast or parallel CABAC decoding or Golomb decoding). • Conduct a simple implementation (e.g. software in Python/C/Matlab or hardware design) of a simple entropy coder; test it out with images and evaluate your results (e.g. memory, complexity). • Do a presentation and write a report. 2. Predictor or Quantizer Design • Conduct literature survey on good predictor/interpolator/filter techniques (e.g. AIF, Wiener, directional) or good quantization techniques (e.g. Lloyd-Max or others, cascaded, adaptive, AQMS, RDOQ, subjective). It would be better (but not critical) if you take into account perceptual visual distortion, especially for quantization. • Produce an implementation (e.g. software in Python/C/Matlab or hardware design) of a simple predictor or quantizer; test out your work with images. • Evaluate your results (e.g. bit-rate vs. visual quality vs. computational complexity). • Do a presentation and write a report. 3. 2-D Transform • Conduct literature survey on new 2-D or directional transform for image/video coding. • Produce an implementation & its inverse (e.g. Python/C/Matlab) for the transform; you can use Matlab Image Processing Tool Box; test your transform. with several images with quantization. • Evaluate your results (e.g. bit-rate vs. visual quality vs. computational/memory complexity). • Do a presentation and write a report. 4. Design of a Motion Estimator / Compensator • Conduct literature survey and study on recent motion estimation techniques. • Investigate one issue for motion estimation (e.g. sub-pel, AIF filter, motion model, coding methods for MV or residue, reference picture selection/generation, search range, partial distortion search, MV competition,flexible search patterns, merge mode, affine methods, motion fields) or advanced motion methods in HEVC or VVC. • Design and produce a simple implementation to demonstrate motion estimation (e.g. in JM or HM or Matlab/C/Python); test out your design with two or more frames. • Evaluate your results (e.g. bit-rate vs. visual quality vs. computational/memory complexity). • Do a presentation and write a report. 5. Performance Analysis for Image Codec • Study an image codec structure of the BPG standard. • You can also find out new techniques adopted in BPG that were not in JPEG. • Suggest a list of performance issues to evaluate (e.g. coding efficiency, computational complexity). • Do a presentation and write a report. 6. Performance Analysis for Video Codec • Study a video codec or just decoding. JVET VTM (VVC/H.266) is preferred, but JCT-VC HM (HEVC/H.265) or H.264 JM arefine. For some cases you need to know C++. Alternatively, you could explore 3D or scalable extensions of the codec (e.g. JCT-3V HTM (3D-HEVC) software). • You can also find out new techniques adopted in VTM that were not in HM, or in HM that are not in JM; or comparing two different codecs. • Suggest a list of performance issues to evaluate (e.g. coding efficiency and computational complexity). Alternatively, you could explore multicore processor or GPUs for speeding up codecs. • Do a presentation and write a report. 7. Machine Learning Methods in Video Coding • Study a video coding technique and see which part(s) (e.g. mode decision, partitioning/coding unit depth decision, transform, intra prediction, motion estimation) can machine learning methods (e.g. SVM, classification, decision trees, PCA, sparse dictionary learning, K-SVD) or deep learning methods (e.g. CNN, GAN, RNN, transformer) be applied to achieve better performance. • Suggest a list of performance issues to evaluate (e.g. coding efficiency, computational complexity). • Do a presentation and write a report. 8. Deep Learning Methods in Video Coding • I am particularly interested in the study of one of these methods for end-to-end image/video coding: • (a) Use of convolutional neural network (CNN) in video coding. • (b) Use of generative adversarial network (GAN) in image/video coding. • (c) Use of autoencoder (AE), variational autoencoder (VAE), recurrent neural network (RNN),LSTM, and especially transformers, in image/video coding. • (d) Use of deep learning approaches in motion estimation. • (e) Use of reinforcement learning in video rate control. • (f) Visual quality metric for learned image/video coding. • (g) Complexity reduction methods. • Suggest a list of performance issues to evaluate (e.g. coding efficiency, computational complexity). • Do a presentation and write a report. 9. A Current Hot Topic (typically 1 person but can be more) - Survey / Study / Comparison Select a recently hot topic of interest from: • Latest and future JPEG image coding standards - JPEG-AI, JPEG-DNA, JPEG-NFT, JPEG-XE. • Latest and future video coding standards - VVC/H.266,NNVC, etc. • Video coding for HDR, WCG, 360 video, screen content, 3-D, point cloud, etc. • Visual volumetric video coding (V3C) standards - VPCC, MIV, V-DMC. • Video coding for surveillance video. • Video for visually impaired. • Image coding for plenoptic images - light field, point cloud, or holograph. • Visual quality metric for plenoptic images, 360 video, 3-D video, and point cloud. • Deep learning assisted tools for image/video coding (e.g. post-processing, optimization, rate control, reference frame. generation, intra prediction). • End-to-end deep learning-based image/video coding (e.g. autoencoder, transformer). • Generative AI approaches - GAN, diffusion probabilistic models, etc. • Advanced AI approaches like codebook-based methods, tokenization, etc. • Deep learning in visual quality assessment. • Image or video coding for machines (VCM) (e.g. semantic coding, feature coding). • Green video coding. • Advanced approaches in HVS (e.g. psycho-visual studies, DMOS, JND, SSIM, saliency map, reference-free method) and its effect on RDO and quantization. • Visual attention and saliency. • Methods in 3D video coding (e.g. depth coding, view synthesis, and 3D-HEVC). • Graphics compression (especially 3D graphics) or haptic compression or AR video. • Coding for immersive multimedia, VR/AR in metaverse. • Visual communication (e.g. transcoding, rate control or shaping, congestion control) over networks. • Advanced intra-prediction methods or inter-prediction methods for VVC or beyond. • RDO plus complexity, Lagrange multiplier, or advanced quantization methods. • Pre- or post-processing (e.g. artifact removal, denoising, or error concealment). • Computer vision approaches (e.g. face detection) in coding. • Parallel methods and/or use of GPUs or TPUs (systolic arrays) in video coding. • Other advanced topics (e.g. super-resolution). • Your suggestion (need approval, and must be related to compression). Conduct a literature survey on the topic. Study and compare different approaches, if needed. Do a presentation and write a report. 10. Your Suggestion • Needs approval. • Should involve study or design, simple implementation or comparison/evaluation, report, and presentation. Note: Above are just suggestions, feel free to modify them or suggest your own topic to suit your projects. Note: The use of AI tools such as ChatGPT is not allowed. Copying materials from websites is not allowed , but references are encouraged. References 1. The Internet and Web Sites. 2. Books recommended on the syllabus and other related books. 3. Industrial Magazines (for more industrial oriented projects). 4. Journals (IEEE Transactions on Circuits & Systems for Video Technology, IEEE Transactions on Consumer Electronics, IEEE Transactions on Multimedia, IEEE Transactions on Image Processing, IEEE Communications, IEEE Multimedia, IEEE Signal Processing, IEEE Networks, etc.) (For more research oriented projects). 5. Conference proceedings. Report Suggestion (but not limited to): • Abstract of about 50 words • Literature survey • Design or Analysis or Study • Implementation or Comparison (if any) • Results or Presentation (if any) • Evaluation or Comparison • Conclusion (about 20 words) • List of References (must) • Appendices (if any) It must be a technical report, not sale’stalk, user’s manual, or layman’stalk. Your report must not exceed 4 pages (for 1 person), 6 pages (group of 2), 8 pages (group of 3), or 10 pages (group of 4). For each group project, please indicate the individual responsible for each portion of the work. Note that students in the same group may or may not receive the same grade. Note also that project result is not the most important thing. I grade you based on your effort and work, how challenging the project is, and what you have learned. Please do not “copy and paste” materials from web or other literatures. Please do not use AI tools such as ChatGPT to generate materials for the report.
ECE 219: Data Mining - Models & Algorithms Winter 2025 Project 3: Recommender Systems Due Feb 21, 2025 by 11:59 pm 1 Note In this project, you are provided only with the training subset of the movie dataset. All the metrics that you will report have to do with validation performance as averaged across folds. For the web10k dataset, you are allowed to directly use the partitioned dataset. 2 Introduction The increasing importance of the web as a medium for electronic and business transactions and advertisement, and social media has served as a driving force behind the development of recommender systems technology. Among the benefits, recommender systems provide a means to prioritize data for each user from the infinite information available on the internet. Such systems are critical to ensuring (among others): (a) the detection of hate speech, (b) user retention on a web service, and (c) fast and high-quality access to relevant information. An important catalyst is the ease with which the web enables users to provide feedback about a small portion of the web that they traverse. Such user-driven sparse feedback poses the following challenge in the desing of recommender systems: Can we utilize these sparse user datapoints to infer generalized user interests? We define some terms: • The entity to which the recommendation is provided is referred to as the user; • The product being recommended is an item. The basic models for recommender systems works with two kinds of data: A User-Item interactions such as ratings (a user (you) provides ratings about a movie (item)); B Attribute information about the users and items such as textual profiles or relevant key- words (deep representations about a user or item). Models that use type A data are referred to as collaborative filtering methods, whereas models that use type B data are referred to as content-based methods. In this project, we will build a recommendation system using collaborative filtering methods. 3 Collaborative filtering models Collaborative filtering models use the collaborative power of established user-item interactions to make recommendations about new user-item interactions. In this project we use a ratings database where the user is an audience member who viewed a movie, and the item is the movie being rated. The main challenge in designing collaborative filtering methods is that the underlying rat- ings matrices are sparse. Consider this example of a movie application in which users specify ratings indicating their like or dislike of specific movies. Most users would have viewed only a small fraction of the large universe of available movies and as a result most of the ratings are unspecified. The basic idea of collaborative filtering methods is that these unspeci- fied ratings can be imputed because the observed ratings are often highly correlated across various users and items. For example, consider two users named John and Molly, who have very similar tastes. If their respective ratings exist within our database and are very similar, then the media recommended to them should likely be similar as well. For those few scenarios in which only John has rated a movie M, the similarity across other movies to Molly’s preferences should make clear that Molly might also prefer movie M. Thus, most collaborative filtering methods leverage either inter-item correlations or inter-user correlations for the prediction process. In this project, we will implement and analyze the performance of two types of collaborative filtering methods: 1. Neighborhood-based collaborative filtering: Directly leverages the choices of other users to determine potential items to recommend to the current user. 2. Model-based collaborative filtering: Estimates a joint model from all user data, en- abling the generation of new recommendations without accessing the entire user base, and allowing for queries on a more compact model. 4 Dataset In this project, we will build a recommendation system to predict the ratings of movies in the provided dataset. The dataset can be downloaded here. For the subsequent discussion, we assume that the ratings matrix is denoted by R (you will have to construct this), and it is an m × n matrix containing m users (rows) and n movies (columns). The (i,j) entry of the matrix is the rating by user i for movie j and is denoted by rij . Before moving on to the collaborative filter implementation, we will analyze and visualize some properties of this dataset. QUESTION 1: Explore the Dataset: In this question, we explore the structure of the data. A Compute the sparsity of the movie rating dataset: Sparsity = Total number of possible ratings/Total number of available ratings (1) B Plot a histogram showing the frequency of the rating values: Bin the raw rating values into intervals of width 0.5 and use the binned rating values as the horizontal axis. Count the number of entries in the ratings matrix R that fall within each bin and use this count as the height of the vertical axis for that particular bin. Comment on the shape of the histogram. C Plot the distribution of the number of ratings received among movies: The X-axis should be the movie index ordered by decreasing frequency and the Y-axis should be the number of ratings the movie has received; ties can broken in anyway. A monotonically decreasing trend is expected. D Plot the distribution of ratings among users: The X-axis should be the user index ordered by decreasing frequency and the Y-axis should be the number of movies the user has rated. The requirement of the plot is similar to that in Question C. E Discuss the salient features of the distributions from Questions C,D and their implications for the recommendation process. F Compute the variance of the rating values received by each movie: Bin the variance values into intervals of width 0.5 and use the binned variance values as the horizontal axis. Count the number of movies with variance values in the binned intervals and use this count as the vertical axis. Briefly comment on the shape of the resulting histogram. 5 Neighborhood-based collaborative filtering The basic idea in neighborhood-based methods is to use either user-user similarity or item-item similarity to make predictions from a ratings matrix. There are two basic principles used in neighborhood-based models: 1. User-based models: Similar users have similar ratings on the same item. Therefore, if John and Molly have rated movies in a similar way in the past, then one can use John’s observed ratings on the movie Terminator to predict Molly’s rating on this movie. Item is kept constant. 2. Item-based models: Similar items are rated in a similar way by the same user. Therefore, John’s ratings on similar science fiction movies like Alien and Predator can be used to predict his rating on Terminator. User is kept constant. In this project, we will only implement user-based collaborative filtering (implementation of item-based collaborative filtering is very similar). 5.1 User-based neighborhood models In this approach, we are trying to find a set of users similar in their rating strategy to a target user. This results in a user-based neighborhood and we will use the majority vote within the neighborhood to provide recommendations. In order to determine the neighborhood of the target user u, their similarity to all the other users is computed. Therefore, a similarity function needs to be created between each pair of the historical rating patterns - one by each user across the movies. In this project, we will use the Pearson-correlation coefficient to compute this similarity as a correlation. 5.2 Pearson-correlation coefficient The Pearson-correlation coefficient between users u and v denoted by Pearson(u,v) captures the similarity between the rating vectors of users u and v. First some notation: • Iu : Set of item indices for which ratings have been specified by user u; • Iv : Set of item indices for which ratings have been specified by user v; • µu : Mean rating for user u computed using her specified ratings; • ruk : Rating of user u for item k. QUESTION 2: Understanding the Pearson Correlation Coefficient: A Write down the formula for µu in terms of Iu and ruk ; B In plain words, explain the meaning of Iu ∩ Iv . Can Iu ∩ Iv = ∅? (Hint: Rating matrix R is sparse) Then, with the above notation, the Pearson-correlation coefficient between a pair of users u and v is defined by equation 2: 5.3 k-Nearest neighborhood (k-NN) Having introduced a similarity metric between users (as a correlation coefficient between their ratings across movies), we are now ready to define a neighborhood of users. The k-Nearest neighbors of user u, denoted by Pu , is the set of k users with the highest Pearson-correlation coefficient with user u (pairwise). 5.4 Prediction function The predicted rating that user u might award for item j, denoted by ˆ(r)uj , can simply be modeled by equation 3: QUESTION 3: Understanding the Prediction function: Can you explain the reason behind mean-centering the raw ratings (rvj − µv ) in the prediction function? (Hint: Consider users who either rate all items highly or rate all items poorly and the impact of these users on the prediction function.) 5.5 k-NN collaborative filter The previous sections have equipped you with the basics needed to implement a k-NN col- laborative filter for predicting ratings of the movies. Although we have provided you with the equations needed to write a function for predicting the ratings, we don’t require you to write it. Instead, you can use the built-in python functions for prediction. 5.5.1 Design and test via cross-validation In this part of the project, you will design a k-NN collaborative filter and test its performance via 10-fold cross validation. In a 10-fold cross-validation, the dataset is partitioned into 10 equal sized subsets. Of the 10 subsets, a single subset is retained as the validation data for testing the filter, and the remaining 9 subsets are used to train the filter. The cross-validation process is then repeated 10 times, with each of the 10-subsets used exactly once as the validation data. QUESTION 4: Design a k-NN collaborative filter to predict the ratings of the movies in the original dataset and evaluate its performance using 10-fold cross validation. Sweep k (number of neighbors) from 2 to 100 in step sizes of 2, and for each k compute the average RMSE and average MAE obtained by averaging the RMSE and MAE across all 10 folds. Plot average RMSE (Y-axis) against k (X-axis) and average MAE (Y-axis) against k (X-axis). The functions that might be useful for solving this question are described in these documen- tations . Use Pearson-correlation function as the similarity metric. You can read about how to specify the similarity metric in the documentation: http://surprise.readthedocs.io/en/stable/ similarities.html QUESTION 5: Use the plot from question 4, to find a ’minimum k ’. Note: The term ’minimum k’ in this context means that increasing k above the minimum value would not result in a significant decrease in average RMSE or average MAE. If you get the plot correct, then ’minimum k’ would correspond to the k value for which average RMSE and average MAE converges to a steady-state value. Please report the steady state values of average RMSE and average MAE. 5.6 Filter model performance based on subsets of the raw data In this part of the project, we will analyze the performance of the k-NN collaborative filter in predicting the ratings of the movies in trimmed data subsets. The subsets can be formed in many ways, but we will consider the following trimming options: • Popular movie trimming: In this trimming, we trim the dataset to contain movies that have received more than 2 ratings. If a movie in the set has received less than or equal to 2 ratings in the entire dataset then we delete that movie from the set and do not train/predict the rating of that movie using the model. • Unpopular movie trimming: In this trimming, we trim the dataset to contain movies that have only received less than or equal to 2 ratings. If a movie in the set has received more than 2 ratings in the entire dataset then we delete that movie from the set and do not train/predict the rating of that movie using the model. • High variance movie trimming: In this trimming, we trim the set to contain movies that have variance (of the rating values received) of at least 2 and have received at least 5 ratings in the entire dataset. If a movie has variance less than 2 or has received less than 5 ratings in the entire dataset then we delete that movie from the set and do not train/predict the rating of that movie using the model. Having defined the types of trimming operations above, now we can evaluate the performance of the k-NN filter architecture in predicting the ratings of the movies in the trimmed dataset. 5.6.1 Performance evaluation using ROC curve Receiver operating characteristic (ROC) curve is a commonly used graphical tool for visualizing the performance of a binary classifier. It plots the true positive rate (TPR) against the false positive rate (FPR). In the context of recommendation systems, it is a measure of the relevance of the items recommended to the user. Since the observed ratings are in a continuous scale (0-5), so we first need to convert the observed ratings to a binary scale. This can be done by thresholding the observed ratings. If the observed rating is greater than the threshold value, then we set it to 1 (implies that the user liked the item). If the observed rating is less than the threshold value, then we set it to 0 (implies that the user disliked the item). After having performed this conversion, we can plot the ROC curve for the recommendation system in a manner analogous to that of a binary classifier. QUESTION 6: Within EACH of the 3 trimmed subsets in the dataset, design (train and validate): A k-NN collaborative filter on the ratings of the movies (i.e Popular, Unpopular or High-Variance) and evaluate each of the three models’ performance using 10-fold cross validation: • Sweep k (number of neighbors) from 2 to 100 in step sizes of 2, and for each k compute the average RMSE obtained by averaging the RMSE across all 10 folds. Plot average RMSE (Y-axis) against k (X-axis). Also, report the minimum average RMSE. • Plot the ROC curves for the k-NN collaborative filters for threshold values [2.5, 3, 3.5, 4]. These thresholds are applied only on the ground truth labels in held-out validation set. For each of the plots, also report the area under the curve (AUC) value. You should have 4 × 4 plots in this section (4 trimming options – including no trimming times 4 thresholds) - all thresholds can be condensed into one plot per trimming option yielding only 4 plots. We provide you with the following hints: • Write trimming function that takes as input the set of data and outputs a trimmed set; • For each value of k, split the trimmed dataset into 10 pairs of training and validation sets. (trainset 1, testset 1), (trainset 2, testset 2), . . . , (trainset 10, testset 10) The following documentation might be useful for the splitting: http://surprise.readthedocs.io/en/stable/getting_started.html#use-cross-validation-iterators • For each pair of (trainset, testset): – Train the collaborative filter on the trimmed train set – Predict the ratings of the movies in the trimmed validation set using the trained collaborative filter – Compute the RMSE of the predictions in the trimmed validation set • Compute the average RMSE by averaging across all the 10 folds. • For the ROC plotting, split the dataset into 90% for training and 10% for validation. The functions described in the documentation below might be useful http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html 6 Model-based collaborative filtering In model-based collaborative filtering, models are developed using machine learning algorithms to predict users’ rating of unrated items. Some examples of model-based methods include decision trees, rule-based models, bayesian methods, and latent factor models. In this project, we will explore latent factor based models for collaborative filtering. 6.1 Latent factor based collaborative filtering Latent factor based models can be considered as a direct method for matrix completion. It estimates the missing entries of the rating matrix R, to predict what items a user will most probably like other than the ones they have rated. The basic idea is to exploit the fact that significant portions of the rows and columns of the rating matrix are correlated. As a result, the data has built-in redundancies and the rating matrix R can be approximated by a low-rank matrix. The low-rank matrix provides a robust estimation of the missing entries. The method of approximating a matrix by a low-rank matrix is called matrix factorization. The matrix factorization problem in latent factor based model can be formulated as an opti- mization problem given by 4 In the above optimization problem, U and V are matrices of dimension m × k and n × k respectively, where k is the number of latent factors. However, in the above setting it is assumed that all the entries of the rating matrix R is known, which is not the case with sparse rating matrices. Fortunately, latent factor model can still find the matrices U and V even when the rating matrix R is sparse. It does it by modifying the cost function to take only known rating values into account. This modification is achieved by defining a weight matrix W in the following manner: Then, we can reformulate the optimization problem as Since the rating matrix R is sparse, the observed set of ratings is very small. As a result, it might cause over-fitting. A common approach to address this problem is to use regulariza- tion. The optimization problem with regularization is given by equation 6. The regularization parameter λ is always non-negative and it controls the weight of the regularization term. There are many variations to the unconstrained matrix factorization formulation (equation 6) depending on the modification to the objective function and the constraint set. In this project, we will explore two such variations: • Non-negative matrix factorization (NMF) • Matrix factorization with bias (MF with bias) 6.2 Non-negative matrix factorization (NMF) Non-negative matrix factorization may be used for ratings matrices that are non-negative. As we have seen in the lecture, the major advantage of this method is the high level of inter- pretability it provides in understanding the user-item interactions. The main difference from other forms of matrix factorization is that the latent factors U and V must be non-negative. Therefore, optimization formulation in non-negative matrix factorization is in 7: There are many optimization algorithms like stochastic gradient descent (SGD), alternating least-squares (ALS),etc for solving the optimization problem in 7. Since you are familiar with the SGD method, we will not describe it here. Instead, we will provide the motivation and main idea behind the ALS algorithm. SGD is very sensitive to initialization and step size. ALS is less sensitive to initialization and step size, and therefore a more stable algorithm than SGD. ALS also has a faster convergence rate than SGD. The main idea in ALS, is to keep U fixed and then solve for V. In the next stage, keep V fixed and solve for U. In this algorithm, at each stage we are solving a least-squares problem. Although ALS has a faster convergence rate and is more stable, we will use SGD in this project. The main reason behind this is based on the fact that the python package that we will be using to design the NMF-based collaborative filter only has the SGD implementation. This choice would have no effect on the performance of the filter designed because both the SGD and ALS converges for the original dataset. The only downside of using SGD is that it will take a little bit longer to converge, but that will not be a big issue as you will see while designing the NMF filter. QUESTION 7: Understanding the NMF cost function: Is the optimization problem given by equation 5 convex? Consider the optimization problem given by equation 5. For U fixed, formulate it as a least-squares problem. 6.2.1 Prediction function After we have solved the optimization problem in equation 7 for U and V , then we can use them for predicting the ratings.The predicted rating of user i for item j, denoted by ˆ(r)ij , is given by equation 8 Having covered the basics of matrix factorization, now we are ready to implement a NMF based collaborative filter to predict the ratings of the movies. We have provided you with the necessary background to implement the filter on your own, but we don’t require you to do that. Instead, you can use provided functions in Python for the implementation. 6.2.2 Design and test via cross-validation In this part, you will design a NMF-based collaborative filter and test its performance via 10-fold cross validation. Details on 10-fold cross validation have been provided in one of the earlier sections. QUESTION 8: Designing the NMF Collaborative Filter: A Design a NMF-based collaborative filter to predict the ratings of the movies in the original dataset and evaluate its performance using 10-fold cross-validation. Sweep k (number of latent factors) from 2 to 50 in step sizes of 2, and for each k compute the average RMSE and average MAE obtained by averaging the RMSE and MAE across all 10 folds. If NMF takes too long, you can increase the step size. Increasing it too much will result in poorer granularity in your results. Plot the average RMSE (Y-axis) against k (X-axis) and the average MAE (Y- axis) against k (X-axis). For solving this question, use the default value for the regularization parameter. B Use the plot from the previous part to find the optimal number of latent factors. Optimal number of latent factors is the value of k that gives the minimum average RMSE or the minimum average MAE. Please report the minimum average RMSE and MAE. Is the optimal number of latent factors same as the number of movie genres? C Performance on trimmed dataset subsets: For each of Popular, Unpopular and High- Variance subsets - – Design a NMF collaborative filter for each trimmed subset and evaluate its performance using 10-fold cross validation. Sweep k (number of latent factors) from 2 to 50 in step sizes of 2, and for each k compute the average RMSE obtained by averaging the RMSE across all 10 folds. – Plot average RMSE (Y-axis) against k (X-axis); item Report the minimum average RMSE. • Plot the ROC curves for the NMF-based collaborative filter and also report the area under the curve (AUC) value as done in Question 6. For solving this question, the functions described in the documentation below might be useful: http://surprise.readthedocs.io/ en/stable/matrix_factorization.html 6.2.3 Interpretability of NMF The major advantage of NMF over other forms of matrix factorization is not necessarily one of accuracy, but that of the high level of interpretability it provides in understanding user-item interactions. In this part of the project, we will explore the interpretability of NMF. Specifically, we will explore the connection between latent factors and movie genres. QUESTION 9: Interpreting the NMF model: Perform Non-negative matrix factorization on the ratings matrix R to obtain the factor matrices U and V , where U represents the user-latent factors interaction and V represents the movie-latent factors interaction (use k = 20). For each column of V , sort the movies in descending order and report the genres of the top 10 movies. Do the top 10 movies belong to a particular or a small collection of genre? Is there a connection between the latent factors and the movie genres? In this question, there will be 20 columns of V and you don’t need to report the top 10 movies and genres for all the 20 columns. You will get full credit, as long as you report for a couple columns and provide a clear explanation on the connection between movie genres and latent factors. 6.3 Matrix factorization with bias (MF with bias) In MF with bias, we modify the cost function (equation 6) by adding bias term for each user and item. With this modification, the optimization formulation of MF with bias is given by equation 9 In the above formulation, bu is the bias of user u and bi is the bias of item i, and we jointly optimize over U,V, bu , bi to find the optimal values. 6.3.1 Prediction function After we have solved the optimization problem in equation 9 for U,V, bu , bi , then we can use them for predicting the ratings. The predicted rating of user i for item j, denoted by ˆ(r)ij is given by equation 10 where µ is the mean of all ratings, bi is the bias of user i, and bj is the bias of item j.
Commerce 2KA3 Information Systems in Business (Winter 2025) Hands-on Using Microsoft Access Purpose: The purpose of this assignment is to learn how to create a database by using Microsoft Access. In this assignment, you will create tables, and make queries for course registration from the perspective of a system administrator. This assignment has two parts: (1) completion of the hands-on and (2) development of an E-R model. Total Marks: This assignment is scored out of 100 marks, which is worth 10% of your final grade. Late Penalty: Late submissions will incur a 20% (i.e., 20 marks) penalty per day, meaning that after 5 days past the due date, late submissions receive zero marks. Due to unexpected difficulties that may arise, please start and finish the assignment as soon as possible. Submission Instructions: You are required to follow the steps described in this document as well as submitting the materials described on page 12. The submissions (i.e., Access file and a pdf file must be uploaded to the Dropbox for Access assignment (assignment 1) on the pertinent Avenue account under Assessments > Assignments > Assignment 1). Please note that any other form of submission will not be accepted. Submitting your assignment to the Dropbox on Avenue is a two-step process. One step is uploading your document to the Dropbox folder for Access Assignment on Avenue; next step is actually submitting your document to be marked. Make sure you actually submit your assignment (and not just upload it). Send an email to TA – Ms. Fatemeh Navazi ([email protected]), for any questions that you may have about this assignment. (mention 2KA3 in the title of your email) In this assignment MS Access Software will be used. Windows users can use MS Access installed on the laptop/PC, Microsoft 365 or Vlab to access MS Access. Mac users need to use Vlab to access MS Access. To use Microsoft Access on Vlab, Please Connect to McMaster VPN First. Instructions on how to connect to McMaster VPN from out of campus are available at: https://mcmasteru365.sharepoint.com/:u:/r/sites/UTS- NetSoft/SitePages/All%20Software/McMasterVPN.aspx?csf=1&web=1&e=efDKak To use Microsoft Access on Vlab, please choose the document that matches the operating system of your computer between the following two documents and follow the instructions. Both documents are shared on Avenue. Windows Instructions: Avenue-> COMMERCE 2KA3:Information Systems in Business-> Content-> Assignments-> Assignment #1 -> vlab-Virtual-Applications-Instructions-Windows Mac Instructions: Avenue-> COMMERCE 2KA3:Information Systems in Business-> Content-> Assignments-> Assignment #1 -> vlab-Virtual-Applications-Instructions-OSX1 You may access MS Access Software through the "Access" icon on VLAB Resources. For any issues related to Vlab, please Contact DSB IT department at [email protected] make sure to include your full name, your MacID, and the course number in your request. Part 1 1. Getting started Before starting the assignment, make yourself familiar with the Microsoft Access Interface. 2. Creating tables (20 Marks) From the Access file menu, select New, then: a. Select the Blank Desktop Database icon in the main panel. Name your database file Assign and click the folder icon next to the FileName b. From the opened window, select Desktop and click OK. c. Finally, click on Create button. Data are stored in tables in Access. A table consists of columns, called fields, and rows, called records. To store data for course registration processing you need to create four tables: Courses, Registrations, RegistrationLn, and Students. The Courses table is used to store course information; the Registration table is used to store student registration; the RegistrationLn table is used to store detailed registration lines for each registration; and the Students table is used to store detailed student information. Their structures are shown as follows: Table 1. The Structure of Four Tables Now we want to create a table for Courses, Registrations, RegistrationLn, and Students. To create a table, under the Create tab, select the Table Design icon. You need to specify the field Name, the Data Type, the key, and field properties. All the required data to create tables is shown in Table 1. Give the table a name when you save it (see the table names in the above list). Please reference the following figure for creating each of your new tables. The specification of field properties is straightforward. Learn the meanings of field properties from Help and specify them accordingly. Note: To specify a key field, click the cell to the left side of a field name and then right click and select Primary Key or select the Primary Key icon under the Design tab. Figure 2. Use Design View to Create a Table Set the Default value of Registration_Date field as Date() by clicking the button of the field Default Value and building expression in the following window. Figure 3. Build expression Date() Note: To specify a concatenated key in the RegistrationLn table, select both the Registration_Number field and Course_ID field (hold while you left click them), and then right click or push the icon to set both fields as the concatenated primary key. 3. Specifying relationships and integrity control (20 Marks) Referential Integrity refers to ensuring that the records in related tables in our database are consistent with one another. When enforcement of referential integrity is in effect, Microsoft Access does not let you add records to a related table when there is no associated record in the primary table. For example, in the RegistrationLn table, there is a record Registration_Number which should also exist in the Registration table. Referential integrity also allows updating and deleting data in related tables. For example, when a Registration_Number is deleted from the Registration table, then all associated Registration_Numbers are also deleted from the RegistrationLn table. In Access, the relationships (the links created through foreign keys) between tables can be explicitly specified. Integrity can then be defined based on these links. From the main toolbar, select the Relationships icon from the Database Tools tab. The Relationships window will emerge and the Show Table window will popup. Click the Registration table, and the Add button. Similarly, add the RegistrationLn, Courses and Students tables, so the fourstables appear in a row in the Relationships window. Close the Show Table window. Click and drag the Registration_Number field in the Registration field list to the Registration_Number field in the RegistrationLn field list. An Edit Relationships window opens. Then, click the Enforce Referential Integrity check box. Click the Cascade Delete Related Records check box. This means that, if you delete a registration, the system will delete all the registration lines related to this registration. Click the Create button to create this relationship. The Relationships window display now illustrates that you have created a One to Many relationship from Registration to RegistrationLn. Note - Ifyou receive a table locked error message, close the tables you are working on. Figure 4. Edit Relationships Click and drag the Course_ID field in the Courses field list to the Course_ID field in the RegistrationLn field list. Click the Enforce Referential Integrity check box, and then click the Cascade Update Related Fields check box. This means that, if you change the Course_ID in the Coursestable, the same Course_ID will be changed in all of related records in the RegistrationLn table. Click the Create button to create this relationship. Click and drag the Student_ID field in the Students field list to the Student_ID field in the Registration field list. Click the Enforce Referential Integrity check box.Then click the Cascade Update Related Fields check box. This means that, if you change the Student_ID in the Students table, the same Student_ID will be changed in all of related order records in the Registration table. The Relationship dialogue box should appear as shown in the following figure (Figure 5). Note: If you want to delete or edit an existing relationship, right click the line joining the two tables of interest and select Edit Relationship. After finishing with your relationships, click the X icon in the upper right corner of the Relationships window and Close this window. The relationships you defined will be saved automatically. Figure 5. Define the Relationships between Four Tables
BACKGROUND:You’ve been hired by the IPLRA (International Programming Language Review Association) to conduct a security audit for their newly released API. They are excited to finally release an API to the community for developers across the world to leverage. In fact, they see this API as a way to increase their amount of reviews by 800%. The only thing standing in their way is a final audit and approval, by you. Unfortunately, after only 5 minutes of looking at the API, you’ve found issues and need to report them. Your goal is to bring visibility to these vulnerabilities in their API by finding the flags for each scenario. Good luck on your flag hunt and we hope you enjoy learning all about modern web APIs.Note: The IPLRA is not real and we made it up.SETUP:To get set up for the flags, carefully follow the steps below.You will need switch users. Log into the VM with the following user.The username, password and VM location are located on Canvas.Run this at the terminal to start the API$ ./StartContainer.shproject_apisecurity.json is available in the /home/apisec/Desktop folder. Put all flags in this file and submit it as your final deliverable.To access the Web API open Chrome in the VM and navigate to this URL. This is the Swagger documentation page that describes the API and allows for testing: http://localhost:8080/swagger/index.html_*Note: You can also click the “Swagger UI” bookmark in _Chrome******GATECH_ID IS A REQUIRED HEADER******NOTE: This is not the Georgia Tech Username, it is the GTID that you can find usng the steps on the Required ReadingBe very careful! When you copy and paste be sure to strip off all leading spaces or special characters.Submission Details:File submission instructions:This project needs to be submitted via gradescope. Navigate to the course in Canvas, click ‘Gradescope’, click ‘Project API Security’ and submit there.The contents of the submission file should be the following. There is a project_apisecurity.json file in your vm with a template set up, or you can copy-paste this to your newly created project_apisecurity.json file elsewhere and replace the placeholders with the flags you retrieve from each relevant task.Note: You can use TextEdit or Vim to create and edit this file. Do not use LibreOffice or any Word Document editor. It must be in proper JSON format with no special characters in order to pass the autograder and these Word Document editors are likely to introduce special characters.If you can’t find the file in the VM just copy this format below:{“flag1”: “”,“flag2”: “”,“flag3”: “”,“flag4”: “”,“flag5”: “”,“flag6”: “”,“flag7”: “”, “flag8”: “”}An example of what the submitted file content should look like:{“flag1”: “4ec60c3e084d8387f0f33916e9b08b99d5264a486c29130dd4a5a530b958c5c0f1faeaca2ce30b478281ec546a4729f“flag2”: “f496d9514c01e8019cd2bc21edfeb8e33f4a29af14a8bf92f7b3c14b5e06c5c0f1faeaca2ce30b478281ec546a4729f“flag3”: “b621bba0bb535f2f7a222bd32994d3875bcfcad651160c543de0a01dbe2e0c5c0f1faeaca2ce30b478281ec546a4729“flag4”: “f38e2cafb43ab4a0a647a8b08fc97bca25aa7cfb517029d5dd02faf49bff5c5c0f1faeaca2ce30b478281ec546a4729“flag5”: “1711ee5eb85b9020d1f4193ee6d884abd12a2eadc4890d28c490ae0c36446c5c0f1faeaca2ce30b478281ec546a4729“flag6”: “1711ee5eb85b9020d1f4193ee6d884abd12a2eadc4890d28c490ae0c36446c5c0f1faeaca2ce30b478281ec546a4729“flag7”: “1711ee5eb85b9020d1f4193ee6d884abd12a2eadc4890d28c490ae0c36446c5c0f1faeaca2ce30b478281ec546a4729 “flag8”: “f38e2cafb43ab4a0a647a8b08fc97bca25aa7cfb517029d5dd02faf49bff5c5c0f1faeaca2ce30b478281ec546a4729}TABLE OF CONTENTSThis flag will introduce you to basic API functionality using a documentation and test harness tool called Swagger. Swagger is a very popular tool used to develop and test web APIs and has plugins/modules in most programming languages. You can learn more about Swagger here: https://swagger.io/You’ll need to leverage Swagger (or any other http tool you desire such as curl or Postman) to determine how the API is configured and what endpoints to invoke to earn this flag.Warning: The site doesn’t use file storage or a database, all data is stored in memory. If you crash the web API or restart the VM, any data you have created/modified will have been lost and you’ll need to begin at step 1.To earn your flag you must perform the following actions by making API calls.Hints:In order to get this flag you need to create a new reviewer in the system. Unfortunately, the developers locked down this functionality some time ago so you’ll need an auth token in order to perform it. You read in the newspaper last week that Programming Reviews LLC had a big data breach so there is a good chance you can come across some credentials.To earn your flag you must perform the following actions.Hints:https://learning.postman.com/docs/getting-started/introduction/Include your flag2 into the json file and now onto Flag 3! Now that you’ve used an Auth token we’re going to dig a bit deeper into JWT (JSON Web Tokens). This flag is simple and designed only to get you acquainted with how JWTs are constructed. There are numerous resources to help you work with JWTs, one we recommend is https://jwt.io/ but you are not required to use this site for the project. Choose any library, tool or site you wish to inspect and construct JWT tokens.To earn your flag you must perform the following actions.Hints:The next few flags will require some trial and error and a bit of research on your part to succeed. Your task is to craft JWT tokens such that you can use the token to successfully authenticate and earn your flag.You are a PHP ninja! You can’t get enough of this language. When you learned that others hate it and gave it bad reviews you felt the need to “correct the situation”. You’ve learned of an API that allows you to delete reviews. Muhahahah! The problem is that only the site moderator can do this and you don’t have his credentials. This has not stopped you in the past.To earn your flag you must perform the following actions.Hints:You’ve learned about a new experimental programming language that is TOP SECRET! This language only requires 1 single keyword to find a polynomial time algorithm to solve any NP-hard problem! You want the 1 million dollar reward for solving this problem and thus need access to this programming language. Find the language.To earn your flag you must perform the following actions.Hints:
1. Suppose we have n+ positive training examples and n− negative training examples. Let C+ be the center of the positive examples and C− be the center of the negative examples, i.e., C+ = 1 n+ P i: yi=+1 xi and C− = 1 n− P i: yi=−1 xi . Consider a simple classifier called CLOSE that classifies a test example x by assigning it to the class whose center is closest. • Show that the decision boundary of the CLOSE classifier is a linear hyperplane of the form sign(w · x + b). Compute the values of w and b in terms of C+ and C−.• Recall that the weight vector can be written as a linear combination of all the training examples: w = Pn++n− i=1 αi · yi · xi . Compute the dual weights (α’s). How many of the training examples are support vectors?2. Suppose we use the following radial basis function (RBF) kernel: K(xi , xj ) = exp(− 1 2 kxi − xjk 2 ), which has some implicit unknown mapping φ(x). • Prove that the mapping φ(x) corresponding to RBF kernel has infinite dimensions. • Prove that for any two input examples xi and xj , the squared Euclidean distance of their corresponding points in the higher-dimensional space defined by φ is less than 2, i.e., kφ(xi) − φ(xj )k 2 ≤ 2. 3. The decision boundary of a SVM with a kernel function (via implicit feature mapping φ(.)) is defined as follows: w · φ(x) + b = P i∈SV yiαiK(xi , x) + b = f(x; α, b) , where w and b are parameters of the decision boundary in the feature space phi defined by the kernel function K, SV is the set of support vectors, and αi is the dual weight of the i th support vector. Let us assume that we use the radial basis function (RBF) kernel K(xi , xj ) = exp(− 1 2 kxi − xjk 2 ); also assume that the training examples are linearly separable in the feature space φ and SVM finds a decision boundary that perfectly separates the training examples. If we choose a testing example xfar that is far away from any training instance xi (distance here is measured in the original feature space
1. (4.5 Percent) Implementation of Q-Learning algorithm and experimentation. You are given a Gridworld environment that is defined as follows: State space: GridWorld has 10×10 = 100 distinct states. The start state is the top left cell. The gray cells are walls and cannot be moved to. Actions: The agent can choose from up to 4 actions (left, right, up, down) to move around. Environment Dynamics: GridWorld is deterministic, leading to the same new state given each state and action Rewards: The agent receives +1 reward when it is in the center square (the one that shows R 1.0), and -1 reward in a few states (R -1.0 is shown for these). The state with +1.0 reward is the goal state and resets the agent back to start. In other words, this is a deterministic, finite Markov Decision Process (MDP). Assume the discount factor β=0.9. Implement the Q-learning algorithm (slide 46) to learn the Q values for each state-action pair. Assume a small fixed learning rate α=0.01. Experiment with different explore/exploit policies: 1) -greedy. Try values 0.1, 0.2, and 0.3. 2) Boltzman exploration. Start with a large temperature value T and follow a fixed scheduling rate. Give these details in your report. How many iterations did it take to reach convergence with different exploration policies? Please show the converged Q values for each state-action pair. 2. (1.5 Percent) Convolutional Neural Networks (CNNs) for solving image classification task. You will train a CNN on Fashion MNIST data. The network architecture contains 4 CNN layers followed by one pooling layer and a final fully connected layer. The basic architecture (in sequential order) will be as follows: First CNN layer: input channels – 1, output channels – 8, kernel size = 5, padding = 2, stride = 2 followed by ReLU operation Second CNN layer: input channels – 8, output channels – 16, kernel size = 3, padding = 1, stride = 2 followed by ReLU operation Third CNN layer: input channels – 16, output channels – 32, kernel size = 3, padding = 1, stride = 2 followed by ReLU operation Fourth CNN layer: input channels – 32, output channels – 32, kernel size = 3, padding = 1, stride = 2 followed by ReLU operation one “Average” pooling layer (nn.AdaptiveAvgPool2d(1) would work in PyTorch)Figure 1: Grid world domain with states and rewards.Fully connected layer (nn.Linear in PyTorch) – determine the number of input features from previous CNN layers. This can be done easily by hand. The number of output features will be equal to number of classes, i.e., 10. If you want help, you can use the direct formula given on this page: http://cs231n.github.io/convolutional-networks/. This will be a straightforward extension from the code discussed in the demo session. Plot the training and testing accuracy as a function of atleast 10 epochs. You could use a smaller sized dataset if compute power is a hurdle. A good choice would be 50 percent of the training set and 10 percent of the testing set. Please make sure you have equal ratio of all classes in the dataset. You can try all tips mentioned in the demo session for solving this task. Optionally, it will be a good idea to try adding other training techniques to see the maximum accuracy possible. Some of them include batch normalization, data augmentation, using other optimizers like ADAM etc. 3. (1.5 Percent) Please read the following paper and write a brief summary (at most one page) of the main points. D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Franois Crespo, Dan Dennison: Hidden Technical Debt in Machine Learning Systems. NIPS 2015: 2503-2511 4. (1.5 Percent) Please read the following paper and write a brief summary (at most one page) of the main points. Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, D. Sculley: The ML test score: A rubric for ML production readiness and technical debt reduction. BigData 2017: 1123-1132 Grading Rubric Each question in the students work will be assigned a letter grade of either A,B,C,D, or F by the Instructor and TAs. This five-point (discrete) scale is described as follows: • A) Exemplary (=100%). Solution presented solves the problem stated correctly and meets all requirements of the problem. Solution is clearly presented. Assumptions made are reasonable and are explicitly stated in the solution. Solution represents an elegant and effective way to solve the problem and is not overly complicated than is necessary. • B) Capable (=75%). Solution is mostly correct, satisfying most of the above criteria under the exemplary category, but contains some minor pitfalls, errors/flaws or limitations. • C) Needs Improvement (=50%). Solution demonstrates a viable approach toward solving the problem but contains some major pitfalls, errors/flaws or limitations.• D) Unsatisfactory (=25%) Critical elements of the solution are missing or significantly flawed. Solution does not demonstrate sufficient understanding of the problem and/or any reasonable directions to solve the problem. • F) Not attempted (=0%) No solution provided. The points on a given homework question will be equal to the percentage assigned (given by the letter grades shown above) multiplied by the maximum number of possible points worth for that question. For example, if a question is worth 6 points and the answer is awarded a B grade, then that implies 4.5 points out of 6.
Analytical Part (6 Percent) 1. Suppose x = (x1, x2, · · · , xd) and z = (z1, z2, · · · , zd) be any two points in a high-dimensional space (i.e., d is very large). a. Try to prove the following, where the right-hand side quantity represent the standard Euclidean distance. 1 √ d X d i=1 xi − 1 √ d X d i=1 zi !2 ≤ X d i=1 (xi − zi) 2 (1) Hint: Use Jensen’s inequality – If X is a random variable and f is a convex function, then f(E[X]) ≤ E[f(X)]. b. We know that the computation of nearest neighbors is very expensive in the highdimensional space. Discuss how we can make use of the above property to make the nearest neighbors computation efficient? 2. We briefly discussed in the class about Locality Sensitive Hashing (LSH) algorithm to make the nearest neighbor classifier efficient. Please read the following paper and briefly summarize the key ideas as you understood: Alexandr Andoni, Piotr Indyk: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Communications of ACM 51(1): 117-122 (2008) http:// people.csail.mit.edu/indyk/p117-andoni.pdf 3. We know that we can convert any decision tree into a set of if-then rules, where there is one rule per leaf node. Suppose you are given a set of rules R = {r1, r2, · · · , rk}, where ri corresponds to the i th rule. Is it possible to convert the rule set R into an equivalent decision tree? Explain your construction or give a counterexample. 4. Please read the following paper and briefly summarize the key ideas as you understood (You can skip the proofs, but it is important to understand the main results): Andrew Y. Ng, Michael I. Jordan: On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes. NIPS 2001: 841-848 http://ai.stanford.edu/~ang/ papers/nips01-discriminativegenerative.pdf 5. Naive Bayes vs. Logistic Regression a. Let us assume that the training data satisfies the Naive Bayes assumption (i.e., features are independent given the class label). As the training data approaches infinity, which classifier will produce better results, Naive Bayes or Logistic Regression? Please explain your reasoning. b. Let us assume that the training data does NOT satisfy the Naive Bayes assumption. As the training data approaches infinity, which classifier will produce better results, Naive Bayes or Logistic Regression? Please explain your reasoning. c. Can we compute P(X) from the learned parameters of a Naive Bayes classifier? Please explain your reasoning. d. Can we compute P(X) from the learned parameters of a Logistic Regression classifier? Please explain your reasoning. 6. Please read the following paper and briefly summarize the key ideas as you understood: Thomas G. Dietterich: Ensemble Methods in Machine Learning. Multiple Classifier Systems 2000: 1-15 https://web.engr.oregonstate.edu/~tgd/publications/mcs-ensembles.pdf 7. We need to perform statistical tests to compare the performance of two learning algorithms on a given learning task. Please read the following paper and briefly summarize the key ideas as you understood: Thomas G. Dietterich: Approximate Statistical Test For Comparing Supervised Classification Learning Algorithms. Neural Computation 10(7): 1895-1923 (1998) http://sci2s.ugr.es/ keel/pdf/algorithm/articulo/dietterich1998.pdf 8. (Finite-Horizon MDPs.) Our basic definition of an MDP in class defined the reward function R(s) to be a function of just the state, which we will call a state reward function. It is also common to define a reward function to be a function of the state and action, written as R(s, a), which we will call a state-action reward function. The meaning is that the agent gets a reward of R(s, a) when they take action a in state s. While this may seem to be a significant difference, it does not fundamentally extend our modeling power, nor does it fundamentally change the algorithms that we have developed. a) Describe a real world problem where the corresponding MDP is more naturally modeled using a state-action reward function compared to using a state reward function. b) Modify the Finite-horizon value iteration algorithm so that it works for state-action reward functions. Do this by writing out the new update equation that is used in each iteration and explaining the modification from the equation given in class for state rewards. c) Any MDP with a state-action reward function can be transformed into an “equivalent” MDP with just a state reward function. Show how any MDP with a state-action reward function R(s, a) can be transformed into a different MDP with state reward function R(s), such that the optimal policies in the new MDP correspond exactly to the optimal policies in the original MDP. That is an optimal policy in the new MDP can be mapped to an optimal policy in the original MDP. Hint: It will be necessary for the new MDP to introduce new “book keeping” states that are not in the original MDP. 9. (k-th Order MDPs.) A standard MDP is described by a set of states S, a set of actions A, a transition function T, and a reward function R. Where T(s, a, s0 ) gives the probability of transitioning to s 0 after taking action a in state s, and R(s) gives the immediate reward of being in state s. A k-order MDP is described in the same way with one exception. The transition function T depends on the current state s and also the previous k−1 states. That is, T(sk−1, · · · , s1, s, a, s0 ) = P r(s 0 |a, s, s1, · · · , sk−1) gives the probability of transitioning to state s 0 given that action a was taken in state s and the previous k − 1 states were (sk−1, · · · , s1). Given a k-order MDP M = (S, A, T, R) describe how to construct a standard (First-order) MDP M0 = (S 0 , A0 , T0 , R0 ) that is equivalent to M. Here equivalent means that a solution to M0 can be easily converted into a solution to M. Be sure to describe S 0 , A0 , T 0 , and R0 . Give a brief justification for your construction. 10. Some MDP formulations use a reward function R(s, a) that depends on the action taken in a state or a reward function R(s, a, s0 ) that also depends on the result state s 0 (we get reward R(s, a, s0 ) when we take action a in state s and then transition to s 0 ). Write the Bellman optimality equation with discount factor β for each of these two formulations. 11. Consider a trivially simple MDP with two states S = {s0, s1} and a single action A = {a}. The reward function is R(s0) = 0 and R(s1) = 1. The transition function is T(s0, a, s1) = 1 and T(s1, a, s1) = 1. Note that there is only a single policy π for this MDP that takes action a in both states. a) Using a discount factor β = 1 (i.e. no discounting), write out the linear equations for evaluating the policy and attempt to solve the linear system. What happens and why? b) Repeat the previous question using a discount factor of β = 0.9. Programming and Empirical Analysis (3 Percent) 1. (40 points) Fortune Cookie Classifier1 You will build a binary fortune cookie classifier. This classifier will be used to classify fortune cookie messages into two classes: messages that predict what will happen in the future (class 1) and messages that just contain a wise saying (class 0). For example, “Never go in against a Sicilian when death is on the line” would be a message in class 0. “You will get an A in Machine learning class” would be a message in class 1. Files Provided There are three sets of files. All words in these files are lower case and punctuation has been removed. 1) The training data: traindata.txt: This is the training data consisting of fortune cookie messages. trainlabels.txt: This file contains the class labels for the training data. 2) The testing data: testdata.txt: This is the testing data consisting of fortune cookie messages. testlabels.txt: This file contains the class labels for the testing data. 3) A list of stopwords: stoplist.txt There are two steps: the pre-processing step and the classification step. In the pre-processing step, you will convert fortune cookie messages into features to be used by your classifier. You will be using a bag of words representation. The following steps outline the process involved: Form the vocabulary. The vocabulary consists of the set of all the words that are in the training data with stop words removed (stop words are common, uninformative words such as 1Thanks to Weng-Keen Wong and his advisor Andrew Moore for sharing the data. “a” and “the” that are listed in the file stoplist.txt). The vocabulary will now be the features of your training data. Keep the vocabulary in alphabetical order to help you with debugging. Now, convert the training data into a set of features. Let M be the size of your vocabulary. For each fortune cookie message, you will convert it into a feature vector of size M. Each slot in that feature vector takes the value of 0 or 1. For these M slots, if the ith slot is 1, it means that the ith word in the vocabulary is present in the fortune cookie message; otherwise, if it is 0, then the ith word is not present in the message. Most of these feature vector slots will be 0. Since you are keeping the vocabulary in alphabetical order, the first feature will be the first word alphabetically in the vocabulary. Implement the Naive Bayes Classifier (with Laplace Smoothing) and run it on the training data. Compute the training and testing accuracy. To debug and test your implementation, you can employ Weka (weka.classifiers.bayes.NaiveBayes): http://www.cs.waikato.ac.nz/ml/weka/downloading.html or scikit-learn (http://scikit-learn. org/stable/modules/naive_bayes.html) Instructions for Code Submission and Output Format. Please follow the below instructions. It will help us in grading your programming part of the homework. Please submit the zip file on Canvas. • Supported programming languages: Python, Java, C++ • Store all the relevant files in a folder and submit the corresponding zipfile named after your student-id, e.g., 114513209.zip • This folder should have a script file named run_code.sh Executing this script should do all the necessary steps required for executing the code including compiling, linking, and execution • Assume relative file paths in your code. Some examples: ‘‘./filename.txt’’ or ‘‘../hw2/filename.txt’’ • The output of your program should be dumped in a file named “output.txt” • Make sure the output.txt file is dumped when you execute the script run_code.sh • Zip the entire folder and submit it as .zip Grading Rubric Each question in the students work will be assigned a letter grade of either A,B,C,D, or F by the Instructor and TAs. This five-point (discrete) scale is described as follows: • A) Exemplary (=100%). Solution presented solves the problem stated correctly and meets all requirements of the problem. Solution is clearly presented. Assumptions made are reasonable and are explicitly stated in the solution. Solution represents an elegant and effective way to solve the problem and is not overly complicated than is necessary. • B) Capable (=75%). Solution is mostly correct, satisfying most of the above criteria under the exemplary category, but contains some minor pitfalls, errors/flaws or limitations. • C) Needs Improvement (=50%). Solution demonstrates a viable approach toward solving the problem but contains some major pitfalls, errors/flaws or limitations. • D) Unsatisfactory (=25%) Critical elements of the solution are missing or significantly flawed. Solution does not demonstrate sufficient understanding of the problem and/or any reasonable directions to solve the problem. • F) Not attempted (=0%) No solution provided. The points on a given homework question will be equal to the percentage assigned (given by the letter grades shown above) multiplied by the maximum number of possible points worth for that question. For example, if a question is worth 6 points and the answer is awarded a B grade, then that implies 4.5 points out of 6.
1. Answer the following questions with a yes or no along with proper justification. a. Is the decision boundary of voted perceptron linear? b. Is the decision boundary of averaged perceptron linear? 2. In the class, we saw the Passive-Aggressive (PA) update that tries to achieve a margin equal to one after each update. Derive the PA weight update for achieving margin M. 3. Consider the following setting. You are provided with n training examples: (x1, y1, h1),(x2, y2, h2), · · · ,(xn, yn, hn), where xi is the input example, yi is the class label (+1 or -1), and hi > 0 is the importance weight of the example. The teacher gave you some additional information by specifying the importance of each training example.a. How will you modify the perceptron algorithm to be able to leverage this extra information? Please justify your answer. b. How can you solve this learning problem using the standard perceptron algorithm? Please justify your answer. I’m looking for a reduction based solution. 4. Consider the following setting. You are provided with n training examples: (x1, y1),(x2, y2), · · · ,(xn, yn), where xi is the input example, and yi is the class label (+1 or -1). However, the training data is highly imbalanced (say 90% of the examples are negative and 10% of the examples are positive) and we care more about the accuracy of positive examples. a. How will you modify the perceptron algorithm to solve this learning problem? Please justify your answer. b. How can you solve this learning problem using the standard perceptron algorithm? Please justify your answer. I’m looking for a reduction based solution.1. Implement a binary classifier with both perceptron and passive-aggressive (PA) weight update as shown below.Algorithm 1 Online Binary-Classifier Learning Algorithm Input: D = Training examples, T = maximum number of training iterations Output: w, the final weight vector 1: Initialize the weights w = 0 2: for each training iteration itr ∈ {1, 2, · · · , T} do 3: for each training example (xt , yt) ∈ D do 4: yˆt = sign(w · xt) // predict using the current weights 5: if mistake then 6: w = w + τ · yt · xt // update the weights 7: end if 8: end for 9: end for 10: return final weight vector w For standard perceptron, you will use τ = 1, and for Passive-Aggressive (PA) algorithm, you will compute the learning rate τ as follows. τ = 1 − yt · (w · xt) kxtk 2 (1) Implement a multi-class online learning algorithm with both perceptron and passive-aggressive (PA) weight update as shown below. Employ the single weight vector represenation (representationII as discussed in the class). This representation is defined as follows. Each training example is of the form (xt , yt), where xt ∈
Task 1: (a) What do the following two programs print? (b) What do functions orange and guave do when called with positive integer as input argument? (answer in plain language) (c) http://www.pythontutor.com/visualize.html#mode Open file t1.py and copy both programs into Python Vizualizer. Make sure you understand all the functions that are running (at the same time) as you step through the execution of the program.Task 2: (a) What does the following program print? (b) Write one sentence that describes abstractly (in plain English) what the function mulberry does when called with positive integer as input argument. (c) What would happen if we made mulberry(-3) call in the main, instead of mulberry (5)? (d) http://www.pythontutor.com/visualize.html#mode Open file t1.py and copy code below into Python Vizualizer. Make sure you understand how the program is running and what it is doing as you step through its execution of the program. (e) What is the maximum number of mulberry functions running at any one time, i.e. what is the maximum number mulberry functions on the stack at any one time? – What is the answer for general n? – Make the call mulberry(1000) in Python shell instead mulberry(4) of and observe what happened? Why did that happen? Task 3: (a) What does the following program print? (b) Write one sentence that describes abstractly (in plain English) what the function cantaloupe does when called with positive integer as input argument. (c) http://www.pythontutor.com/visualize.html#mode Open file t1.py and copy code below into Python Vizualizer. Make sure you understand how the program is running and what it is doing as you step through the execution of the program. (d) What is the maximum number of cantaloupe functions running at any one time, i.e. what is the maximum number cantaloupe functions on the stack at any one time? What is the answer for general n? Task 4: (a) What does the following program print? (b) Write one sentence that describes abstractly (in plain English) what the function almond does given a list of numbers lst as input argument? (c) http://www.pythontutor.com/visualize.html#mode Open file t1.py and copy code below into Python Vizualizer. Make sure you understand how the program is running and what it is doing as you step through the execution of the program. Task 5: (a) What does the following program print? (b) Write one sentence that describes abstractly (in plain English) what the function fig does given a list of numbers lst as input argument and high=len(lst)-1. (c) http://www.pythontutor.com/visualize.html#mode Open file t1.py and copy code below into Python Vizualizer. Make sure you understand how the program is running and what it is doing as you step through the execution of the program. Task 6: (a) What does the following program print? (b) Write one sentence that describes abstractly (in plain English) what the function almond does (given strings s and ch as input and assuming ch contains only one character) (c) http://www.pythontutor.com/visualize.html#mode Open file t1.py and copy code below into Python Vizualizer. Make sure you understand how the program is running and what it is doing as you step through the execution of the program. 11 Recursion: Paper Study • Open file sum_prod.py It contains contains 2 functions that both compute the sum: 1+2+3 …+n. One computes it in the ‘usual’ way using python’s loops (this is called, iterative solution), and the other in a recursiveway (i.e. using function calls that solve a problem on the smaller problem instance) Similarly sum_prod.py contains 2 function that both compute the product 1*2*3…*n (thus they compute n!, i.e. n factorial) in iterative and recursive way (as we have seen in class) Study with TAs and understand well all these 4 solutions. 12 Recursion: Programming Exercise 1 • Write a recursive function (do not use python’s loops), called m, that computes the following series, given positive integer i: • In the “main” write a loop that tests your function m by displaying values m(i) for i = 1, 2, . . . , 10, as follows m(1)= 0.3333333333333333 m(2)= 0.7333333333333334 m(3)= 1.161904761904762 m(4)= 1.6063492063492064 m(5)= 2.060894660894661 m(6)= 2.5224331224331227 m(7)= 2.9890997890997895 m(8)= 3.459688024393907 m(9)= 3.933372234920223 m(10)= 4.409562711110699 13 Recursion: Programming Exercise 2 • Write a recursive function, called count_digits, that counts the number of digits in a given positive integer n. • Test your function: >>> count_digits(0) 1 >>> count_digits(7) 1 >>> count_digits(73) 2 >>> count_digits(13079797) 8 >>> 14 Recursion: Programming Exercise 3 • A string is a palindrome if it reads the same from the left and from the right. For example, word “kayak” is a palindrome, so is a name “Anna”, so is a word “a”. Word “uncle” is not a palindrome. • Write a recursive function, called is_palindrome, that returns True if the input string is a palindrome and otherwise returns False. Test your function. • Notice: a word of length n is a palindrom if 1st and nth letter are the same, AND 2nd and (n-1)st are the same, and so on … until we get to the “middle” of the word. >>> is_palindrome(‘blurb’) False >>> is_palindrome(‘a’) True >>> is_palindrome(‘anna’) True >>> is_palindrome(‘Anna’) True >>> is_palindrome(“A man, a plan, a canal — Panama!”) False >>> is_palindrome(“Madam, I’m Adam”) False 15 Checking if a string is a palindrome can be divided into two subproblems: 1. Check if the 1st and the last character in the string are the same 2. Ignore the two end characters and check if the rest of the substring is a palindrome. Notice that the 2nd subproblemis the same as the original problem but smaller in size. Useful string methods: lower(), and string slicing Idea/Strategy 16 Recursion: Programming Exercise 4 • Refine your function is_palindrome, and call the modified version, is_palindrome_v2, such that it ignores all characters but the characters that correspond to the letters of English alphabet. You may find Python’s string method .isalpha() useful. • Test your function with the following at least: >>> is_palindrome_v2(“A man, a plan, a canal — Panama!”) True >>> is_palindrome_v2(“Go hang a salami, I’m a lasagna hog”) True >>> is_palindrome_v2(“Madam, I’m Adam”) True >>> is_palindrome_v2(“Madam, I’m”) False >>> is_palindrome_v2(‘blurb’) False >>> is_palindrome_v2(‘a’) True >>> is_palindrome_v2(‘Anna’) True 17 Recursion: Programming Exercise 5: GCD The greatest common divisor (GCD) of two integers (at least one of which is not zero), is the largest positive integer that divides the numbers without a remainder. For example, the GCD of 12 and 8 is 4. The following technique is known as Euclid’s Algorithm because it appears in Euclid’s Elements (Book 7, ca. 300 BC). It may be the oldest nontrivial algorithm. The process is based on the observation that, if r is the remainder when a is divided by b, then the common divisors of a and b are the same as the common divisors of b and r. Thus we can use the equation gcd(a,b) = gcd(b,r) to successively reduce the problem of computing a GCD to the problem of computing the GCD of smaller and smaller pairs of integers. For example, gcd(36, 20) = gcd(20, 16) = gcd(16, 4) = gcd(4, 0) = 4 implies that the GCD of 36 and 20 is 4. It can be shown that for any two starting numbers, this repeated reduction eventually produces a pair where the second number is 0. Then the GCD is the other number in the pair. Write a recursive function called gcd that takes two integer parameters a and b (you can assume a>=b) and that uses Euclid’s algorithm to compute and return the greatest common divisor of the two numbers. Test your function. 18 Difficult question: GCD What is the depth of the recursion (i.e. the maximum number of gcd methods on the stack/memory) of your gcd function if you called it with gcd(36, 20)? You can find this out by drawing the diagrams as we did in class and or running your function and the gcd(36, 20) in Python Visualizer. Here is a difficult question and food for thought: What is the maximum depth of the recursion of your gcd method for general a and b (where a>=b)?
Task 1: “Reference” Variables and Objects (a) What does the program below print? (Class Point is a simplified version of the class we designed in class). IMPORTANT: notice that the function riddle is outside of Point class (pay attention to what is lined up) (b) http://www.pythontutor.com/visualize.html#mode Open file t1.py and copy it in Python Vizualizer. Make sure you understand what is happening with variables as you step through the execution of the program.Task 2: “Reference” Variables and Objects (a) What does the program below print? (Class Point is a simplified version of the class we designed in class) (b) http://www.pythontutor.com/visualize.html#mode Open file t2.py and copy it in Python Vizualizer. Make sure you understand what is happening with variables as you step through the execution of the program.Task 3: “Reference” Variables and Objects (a) What does the program on the left prints? The objects you design are MUTABLE. (b) Open file t3.py and copy it in Python Vizualizer. Make sure you understand what is happening with variables as you step through the execution of the program.Task 4: “Reference” Variables and Objects (a) What does the program on the left prints? (b) Open file t3.py and copy it in Python Vizualizer. Make sure you understand what is happening with variables as you step through the execution of the program. A class method is really a function defined in the class namespace; when Python executes it first translates it to and actually executes this last statement Introduction to Computing Using Python Class methods (REVIEW) >>> lst = [9, 1, 8, 2, 7, 3] >>> lst [9, 1, 8, 2, 7, 3] >>> lst.sort() >>> lst [1, 2, 3, 7, 8, 9] >>> __add__ list __add__() sort() count x count() pop pop() . . . >>> lst = [9, 1, 8, 2, 7, 3] >>> lst [9, 1, 8, 2, 7, 3] >>> lst.sort() >>> lst [1, 2, 3, 7, 8, 9] >>> lst = [9, 1, 8, 2, 7, 3] >>> lst [9, 1, 8, 2, 7, 3] >>> list.sort(lst) >>> lst [1, 2, 3, 7, 8, 9] >>> >>> lst = [9, 1, 8, 2, 7, 3] >>> lst [9, 1, 8, 2, 7, 3] >>> lst.sort() >>> lst [1, 2, 3, 7, 8, 9] >>> lst = [9, 1, 8, 2, 7, 3] >>> lst [9, 1, 8, 2, 7, 3] >>> list.sort(lst) >>> lst [1, 2, 3, 7, 8, 9] >>> lst.append(6) >>> lst [1, 2, 3, 7, 8, 9, 6] >>> >>> lst = [9, 1, 8, 2, 7, 3] >>> lst [9, 1, 8, 2, 7, 3] >>> lst.sort() >>> lst [1, 2, 3, 7, 8, 9] >>> lst = [9, 1, 8, 2, 7, 3] >>> lst [9, 1, 8, 2, 7, 3] >>> list.sort(lst) >>> lst [1, 2, 3, 7, 8, 9] >>> lst.append(6) >>> lst [1, 2, 3, 7, 8, 9, 6] >>> list.append(lst, 5) >>> lst [1, 2, 3, 7, 8, 9, 6, 5] lst.sort() list.sort(lst) lst.append(6) instance.method(arg1, arg2, …) list.append(lst, 6) class.method(self, arg1, arg2, …) The function has an extra argument, which is the object invoking the methodTask 5: Methods class translation to Function calls – Open the file called t5.py and translate all the indicated method calls to equivalent function calls – Once done run those function calls in Python shell to observe that they indeed are equivalent to their method call couterpartsProgramming exercise 1: Open file lab10.py It contains the classes we developed during lectures last week. Do the following exercises: A) Add method distance to the class Point. It takes another Point object as input and returns the distance to that point (from the point invoking the method). (Recall that you need to import math to get sqrt function) >>> c = Point(0,1) >>> d = Point(1,0) >>> c.distance(d) 1.4142135623730951 B) Add to class Point methods up, down, left, and right that move the Point object by 1 unit in the appropriate direction. The implementation of each should not modify instance variables x and y directly but rather indirectly by calling existing method move(). >>> a = Point(3, 4) >>> a.left() >>> a.get() (2, 4) C) In to class Animal modify the constructor to set age of the Animal object. Also add method getAge to retrieve the age of the Animal object. >>> flipper = Animal(‘dolphin’, ‘?’, 3) >>> flipper.getAge() 3 Operator Method x + y x.__add__(y) x – y x.__sub__(y) x * y x.__mul__(y) x / y x.__truediv__(y) x // y x.__floordiv__(y) x % y x.__mod__(y) x == y x.__eq__(y) x != y x.__ne__(y) x > y x.__gt__(y) x >= y x.__ge__(y) x < y x.__lt__(y) x >> ‘!’*10 ‘!!!!!!!!!!’ >>> [1,2,3] == [2,3,4] False >>> 2 < 5 True >>> ‘a’ >> len([1,1,2,3,5,8]) 6 >>> repr([1,2,3]) ‘[1, 2, 3]’ >>> repr(193) ‘193’ >>> repr(set()) ‘set()’ Built-in function repr()returns the canonical string representation of an object >>> [1,2,3] [1, 2, 3] >>> 193 193 >>> set() set() >>> [1,2,3].__repr__() ‘[1, 2, 3]’ >>> int(193).__repr__() ‘193’ >>> set().__repr__() ‘set()’ • This is the representation printed by the shell when evaluating the object >> ‘!’.__mul__(10) ‘!!!!!!!!!!’ >>> [1,2,3].__eq__([2,3,4]) False >>> int(2).__lt__(5) True >>> ‘a’.__le__(‘a’) True >>> [1,1,2,3,5,8].__len__() 6 >>> a = Point(3, 4) >>> a Point(3, 4) Introduction to Computing Using Python Overloading repr() In Python, operators are translated into method calls To add an overloaded operator to a user-defined class, the corresponding method must be implemented >>> a = Point(3, 4) >>> a Point(3, 4) To get this behavior method __repr__()must be implemented and added to class Point class Point: # other Point methods here def __repr__(self): ‘canonical string representation Point(x, y)’ return ‘Point({}, {})’.format(self.x, self.y) __repr__() should return the (canonical) string representation of the point >>> a = Point(3, 4) >>> a.__repr__() Point(3, 4) Introduction to Computing Using Python Overloading operator + To get this behavior method __add__()must be implemented and added to class Point class Point: # other Point methods here def __add__(self, point): return Point(self.x+point.x, self.y+point.y) def __repr__(self): ‘canonical string representation Point(x, y)’ return ‘Point({}, {})’.format(self.x, self.y) >>> a = Point(3,4) >>> b = Point(1,2) >>> a+b Point(4, 6) __add__()should return a new Point object whose coordinates are the sum of the coordinates of a and b >>> a = Point(3,4) >>> b = Point(1,2) >>> a.__add__(b) Point(4, 6) Also, method __repr__() should be implemented to achieve the desired display of the result in the shell >>> a = Point(3,4) >>> b = Point(1,2) >>> a+b Point(4, 6) >>> a = Point(3,4) >>> b = Point(1,2) >>> a.__add__(b) Point(4, 6) >>> print([1,2,3]) [1, 2, 3] >>> print(193) 193 >>> print(set()) set() >>> [1,2,3].__str__() ‘[1, 2, 3]’ >>> int(193).__str__() ‘193’ >>> set().__str__() ‘set()’ Operator Method x + y x.__add__(y) x – y x.__sub__(y) x * y x.__mul__(y) x / y x.__truediv__(y) x // y x.__floordiv__(y) x % y x.__mod__(y) x == y x.__eq__(y) x != y x.__ne__(y) x > y x.__gt__(y) x >= y x.__ge__(y) x < y x.__lt__(y) x >> str([1,2,3]) ‘[1, 2, 3]’ >>> str(193) ‘193’ >>> str(set()) ‘set()’ Built-in function __str()__ returns the “pretty” string representation of an object • This is the representation printed by the print() statement and is meant to be readable by humans Built-in function __repr()__ returns the canonical string representation of an object • This is the representation printed by the shell when evaluating the object • The string returned by __repr__ method must look like the statement that creates that object Programming exercise 2: Open file lab10.py. It contains the classes we developed during lectures last week. Do the following exercises: Overload appropriate operators for class Card so that you can compare cards based on rank. In particular override __gt__ , __ge__ , __lt__ and __le__ >>> c1=Card(‘3′,’u2660’) >>> c2=Card(‘5′,’u2662’) >>> c1 Card(3,♠) >>> c2 Card(5,♢) >>> c1 < c2 True >>> c1 > c2 False >>> c1>> x = BankAccount(700) >>> x.balance() 700.00 >>> x.withdraw(70) >>> x.balance() 630.00 >>> x.deposit(7) >>> x.balance() 637.00 >>> x BankAccount(637.00) Programming exercise 4: Ping Pong Write a class named PingPong that has a method next that alternates between printing ‘PING’ and ‘PONG’ as shown below. >>> ball = PingPong() >>> ball.next() PING >>> ball.next() PONG >>> ball.next() PING >>> ball.next() PONG Programming exercise 5a: Class Queue Goal: develop a class Queue , an ordered collection of objects that restricts insertions to the rear of the queue and removal from the front of the queue •The class Queue should support methods: • Queue(): Constructor that initializes the queue to an empty queue • enqueue(item): Add item to the end of the queue • dequeue(): Remove and return the element at the front of the queue • isEmpty(): Returns True if the queue is empty, False otherwise >>> appts = Queue() >>> appts.enqueue(‘John’) >>> appts.enqueue(‘Annie’) >>> appts.enqueue(‘Sandy’) >>> appts.dequeue() ‘John’ >>> appts.dequeue() ‘Annie’ >>> appts.dequeue() ‘Sandy’ >>> appts.isEmpty() True rear rear ‘Annie’ ‘Sandy’ Introduction to Computing Using Python by Lj. Perkovic Class Queue: example >>> appts = Queue() >>> appts.enqueue(‘John’) >>> appts.enqueue(‘Annie’) >>> appts.enqueue(‘Sandy’) >>> appts.dequeue() ‘John’ >>> >>> appts = Queue() >>> appts.enqueue(‘John’) >>> appts.enqueue(‘Annie’) >>> appts.enqueue(‘Sandy’) >>> appts.dequeue() ‘John’ >>> appts.dequeue() ‘Annie’ >>> >>> appts = Queue() >>> appts.enqueue(‘John’) >>> appts.enqueue(‘Annie’) >>> appts.enqueue(‘Sandy’) >>> appts.dequeue() ‘John’ >>> appts.dequeue() ‘Annie’ >>> appts.dequeue() ‘Sandy’ >>> >>> appts = Queue() >>> appts.enqueue(‘John’) >>> appts.enqueue(‘Annie’) >>> appts.enqueue(‘Sandy’) >>> appts.dequeue() ‘John’ >>> appts.dequeue() ‘Annie’ >>> appts.dequeue() ‘Sandy’ >>> appts.isEmpty() True appts ‘John’ front rear ‘Annie’ ‘Sandy’ rear rear Programming exercise 5b: Class Queue Make your class Queue user friendly by adding to it __eq__, __repr__ and __len__ Example: >>> q1=Queue() >>> q1.enqueue(‘kiwi’) >>> q1.enqueue(‘apple’) >>> print(q1) >>> print(q1) Queue([‘kiwi’, ‘apple’]) >>> len(q1) 2 >>> q2=Queue() >>> q2.enqueue(‘apple’) >>> q1==q2 False >>> q1.dequeue() ‘kiwi’ >>> q1==q2 True Programming exercise (Inheritance) 6: Class Vector Implement a class Vector that supports the same methods as the class Point we developed in class. In other words it inherits all atributes (data and methods) from class Point. (Revisit class Animal and Bird to see a simple example) The class Vector should also support vector addition and product operations. The addition of two vectors >>> v1 = Vector(1, 3) >>> v2 = Vector(-2, 4) is a new vector whose coordinates are the sum of the corresponding coordinates of v1 and v2: >>> v1 + v2 Vector(-1, 7) The product of v1 and v2 is the sum of the products of the corresponding coordinates: >>> v1 * v2 10 In order for a Vector object to be displayed as Vector(., .) instead of Point(., .),you will need to override method __repr__(). Programming exercise 7a: Class Marsupial Write a class named Marsupialthat can be used as shown below: >>> m=Marsupial(“red”) >>> m.put_in_pouch(‘doll’) >>> m.put_in_pouch(‘firetruck’) >>> m.put_in_pouch(‘kitten’) >>> m.pouch_contents() [‘doll’, ‘firetruck’, ‘kitten’] >>> print(m) I am a red Marsupial. Programming exercise 7b (Inheritance): Class Kangaroo Write a class named Kangaroo as a subclass of Marsupialthat inherits all the attributes of Marsupial and also: •extends the Marsupial __init__ constructor to take, as input, the coordinates x and y of the Kangaroo object, •has methodjump that takes number values dx and dy as input and moves the kangaroo by dx units along the x-axis and by dy units along the y-axis, and •overloads the __str__ operator so it behaves as shown below. >>> k = Kangaroo(“blue”, 0,0) >>> print(k) I am a blue Kangaroo located at coordinates (0,0) >>> k.put_in_pouch(‘doll’) >>> k.put_in_pouch(‘firetruck’) >>> k.put_in_pouch(‘kitten’) >>> k.pouch_contents() [‘doll’, ‘firetruck’, ‘kitten’] >>> k.jump(1,0) >>> k.jump(1,0) >>> k.jump(1,0) >>> print(k) I am a blue Kangaroo located at coordinates (3,0) Programming exercise 8 : Class Points Write a class named Points(that represents points in the plane). The class has a list containing elements that are objects of type Point. __init__ constructor creates an empty list if no input list is given. Otherwise it sets the list to the given list. •has method add(x,y) that adds an object Point with coordinates x and y to points •has method left_most_point that returns the left most point in the set. If there is more than one left most it returns the bottom left most •Has method len and overrides __repr__ >>> a=[Point(1,1), Point(1,2), Point(2,20), Point(1.5, -20)] >>> mypoints=Points(a) >>> mypoints.add(1,-1) >>> mypoints.left_most_point() Point(1,-1) >>> len(mypoints) 5 >>> mypoints Points([Point(1,1), Point(1,2), Point(2,20), Point(1.5,-20),Point(1,-1)])
DICTIONARIES SEARCHING AND SORTING (and a bit of exceptions handling): Algorithms to solve them, their analysis/efficiency Applications of searching and sorting Part 0: Dictionaries Prog ex Dic 1: Write function letter2number(lgrade) that takes as input a letter grade and returns the corresponding number grade according to uottawa grading system. Make two versions of the functions, one by using if statement, and without if statements with dictionary only. >>> letter2number(‘A-’) 8 >>> letter2number(‘E’) 1 Prog ex Dic 2: Dictionaries Prog ex Dic 3: Dictionaries Prog ex Dic 3 (from the textbook): Prog ex Dic 4 (from the textbook): Catching and Handling Exceptions First, learn about catching and handling exceptions •by reading the provided exceptions-handling.pdf file which contains the material following from one of your recommended textbooks (the one by Ljubomir Perkovic) and/or •and/or by watching this short video: https://www.youtube.com/watch? v=evu5qFjpNJk&index=3&list=PLO9y7hOkmmSEqzlgHLJ0XpJTIBNceFQb •and or by optionally reading: http://interactivepython.org/runestone/static/pythonds/Introduction/ ExceptionHandling.html •Finally to see an example of how to used this in bigger context and learn more about reading files, populating 2D list and making a bigger program, such as making a trivia quiz, some may find the following lecture I taught last year useful: https://youtu.be/wI9dEAyeAto (For those interested, the files from that lecture are also included in this lab, quiz.csv and quizgame.py) The first file contains a collection of trivia quiz questions and the second a program we developed in class last year that make a trivia game) Programming exercise 0 (useful for your assignment 4): Write a function called get_year() that has no input parameter and it returns an integer. The function prompts the user to enter 4 digit integer for a year. If the user enters 4 digit integer for a year, then the function returns that number as integer. Otherwise the function keeps on asking the user to reenter. Note that your function cannot crash if the user enters something like “aha”. Instead it should print a meaningful message, like “Please give a four digit integer for the year”. Test your function in python shell by calling it: >>> get_year() 9 ANALASYS OF ALGORITHMS: For the rest of this semester running time of a program/algorithm/function will mean the same thing as the number of operations of a program/algorithm/function. These two terms mean the same thing!! I will use them interchangeably. You textbook uses “running time” Big O notation hides the constants and slower going terms of a function. For example: 4n2 is O(n2 ), (and so is n 2 /100 +n for example) 20n -70 is O(n) 10log2 n + 100 is O(log2 n) Task 1: For each of the following 7 functions answer the question about how many operations the functions does, roughly, as n grows. For example the answer is roughly linear in n for foo3, i.e O(n) Task 2: answer the 4 questions 1. What does the program on the left print? 2. Write one sentence that describes abstractly (in plain English) what the function annona does? (something a person with no programming knowledge can understand) 3. Suppose the function annona is called on an list that has 1000 integers. How many times would L[i]==L[j] test be executed? a) less than 20 b) Between 1000 and 9,000 times c) Between 10,000 and 100, 000 times d) Between 200, 000 and 2 million timee) between 10 million and 200 million times f) None of the above 4. How many operations (roughly) does function L do on a list of size n? PART 1: SEARCHING Python’s search Python’s “in” operator can be used to determine if a given element is in a given list. For example >>> 10 in [1,20,-1,10,-5, 10] True Python’s .index method can be used to determine if a given element is in a given list and if it is it returns its position, and otherwise -1 >>> L=[1,20,-1,10,-5,10] >>> L.index(10) 3 >>> A=[‘d’, ‘a’, ‘b’, ‘a’] >>> A.index(‘a’) 1 >>> A.index(‘c’) -1 Both in and .index methods perform linear search and thus take linear number of operations in the size of the list Study: overview of Linear Search Linear search starts at index 0 and looks at each item one by one. At each index, we ask this question: Is the value we are looking for at the current index? IMPORTANT: Linear search is used to find an item in an UNSORTED list! It takes, roughly, linear number of operations (in the worst case, i.e. O(n) ), where n is the size of the list. Python’s “in” and .index built-in functions perform linear search and thus they too take do O(n) operations. Task 3: Linear Search Open the file called linear_search-3-versions.py and study the 3 implementations of linear search. They are all correct and all do roughly n operations ( i.e. O(n) ) on lists of size n. “Which one you prefer is largely a matter of taste: some programmers dislike returning in the middle of a loop, so they won’t like the second version. Others dislike modifying parameters in any way, so they won’t like the third version. Still others will dislike that extra check that happens in the first version. “ Interlude: some useful list methods to know Here is some useful list methods that modify the given list lst: lst.append(item) adds item to the end of lst lst.pop() removes the last element from lst lst.pop(i) removes item in position i in the lst and returns that item lst.insert(i, item) inserts v before index i in lst For more run in python shell: help(list.append) help(list.pop) help(list.insert) Programming exercise 1 and Task 4: Prog ex 1: •All three versions of linear search in linear_search-3-versions.py start at index 0. Rewrite all three to search from the end of the list instead of from the beginning. Make sure you test them. Task 4: •For the new versions of linear search: if you are looking for value v and it appears in the list more than once, which position will your modified linear searches find? Programming ex 2: min or max and Task 5 Prog Ex 2: •Write a function named min_or_max_index that has two parameters: one of type list and another type bool. If the Boolean parameter refers to True, the function returns a tuple containing the minimum and its index; and if it refers to False, it returns a tuple containing the maximum and its index. (do not use python’s min and max functions – roll your own implementation) Task 5: •On a list of size n, what is roughly the number of operations your program does in the worst case? (constant, linear in n, quadratic in n, log n ….?) Study: overview of Binary Search and Task 6 IMPORTANT: Binary search is used to find an item in an SORTED list! It does, roughly, log2 n of operations (in the worst case, i.e. O(log2 n)), where n the size of the list. Task 6: Open the file called binary_search.py. It contains the binary search version we developed in class and study again how it works. Question: Binary search is significantly faster than the linear search but requires that the list is sorted. As you know, the number of operations for the best sorting algorithm is on the order of n log2 n, where n is the size of the list. If we search a lot of times on the same list of data, it makes sense to sort it once before doing the searching. Roughly how many times do we need to search in order to make sorting and then searching as fast or faster than linear search? Programming exercise 3 (a bit of a challenge): 1. Write a function called first_one(L) that takes as input a list where each element of L is either a number 0 or 1. In addition all 0s appear before all 1s. Function first_one(L) should return the position in L of the first 1. If there is no 1s in the list, return -1. Unless the list is very small (less than 3 elements) your solution should not access all the elements in the list. In particular if the list has 1000 elements you solution should access roughly 10 of them, if it has 100,000 elements it should access not more than 20. Roll your own implementation (only use loops and if statements). Basically it should run in O(log n) operations. >>> first_one( [0,0,0,0,0,0,1,1] ) 6 >>> first_one( [1,1] ) 0 >>> first_one( [0,0,0] ) -1 2. Repeat the exercise except this time write a function called last_zero(L) that returns the position of the last 0. If helpful, you can make a call to your own function first_one(L) Study Refined Binary Search The version of binary search we developed in class returns SOME POSITION where value v is found in the given list L (and -1 if v is not in the list). Open the file called binary_search_refined.py. It contains a refined binary search version that returns THE FIRST position the value v is found in L (and -1 if v is not in L) 1.Study and understand the solution 2.Think of how just one call to binary_search_refined.py would resolve the previous first_one problem (and different variants of such problem) PART 2: SORTING Task 7: Selection Sort See file selection_sort.py to recall the selection sort algorithm we studied and developed in class. •Given the unsorted list [6, 5, 4, 3, 7, 1, 2] write down what the contents of the list would be after each iteration (of the outer loop) as the list is sorted using the selection sort 6 Programming Exercises Recall from the analysis we did in class that selection sort does roughly n2 operations (more precisely, O(n^2)) on a list of size n. As contrast, merge sort does at most O( n log2 n) operations. You can think of python’s sort as also doing at most O( n log2 n) operations (although it is not exactly true). Open file applications.py and program (and test) the 6 functions described there. All your solutions should perform at most O(n log n) operations. The only Python function you can use is “.sort” or “sorted” (since you know something about the number of operations they take and how they work roughly). You can also use .append and assume one append call takes constant number of operations. DO NOT USE DICTIONARIES. DICTIONARIES are black boxes, for now. Their running time analysis you will only study in the 2nd year. BONUS TASK 8: CHALLENGE ANALYSIS: The function below sorts the given list L. Can you figure out how many operations it takes in the worst case? In other words is it better of worse or the same as as selection sort (selec. sort does O(n2) operations). What about merge sort? Think of L being a list where elements are ordered from biggest to smallest. That is the worst case for the below algorithm. You can assume that .append and .pop(0) take constant number of operations. min takes linear on a list of size n. You can find rough running times of Python’s operations here: https://wiki.python.org/moin/TimeComplexity AFTERWARDS, AT HOME: — Study the insertion sort from the 2nd recommended textbook (Practical Programming). It is yet another algorithm that takes quadratic number of operations, i.e. O(n2), to sort. Try to figure out why in the worst case it takes quadratic number of operations? — Implement bubble sort https://en.wikipedia.org/wiki/Bubble_sort Bubble sort is also quadratic algorithm in the worst case
Introduction to matrices • A matrix is a two dimensional rectangular grid of numbers: • The dimensions of the matrix are the numbers of rows and columns (in the above case: row dimension 3, column dimension 3). • A value within a matrix is referred to by its row and column indices, in that order. – Math: number rows and columns from 1, from upper left corner • In math notation, M1,2 = 2 – In Python, matrices are implemented via 2D lists and indices start from 0, as they do in (1D) lists. • Thus, M[0][1] is 2 ú ú ú û ù ê ê ê ë é = 7 8 9 4 5 6 1 2 3 M Matrix element processing • To visit every element of an array, we had to use a loop. • To visit every element of a matrix, we need to use a loop inside a loop: – Typically the outer loop goes through each row of a matrix – And the inner loop goes through each column within one row of a matrix. Intro to matrices in python and programming exercises 1 to 5 Open a file called basics-2Dlists.py and spend time studying all the matrix functions there. The open the file called basics-2Dlists-todo.py and implement (and test) the 5 functions labeled as programming exercise 1 to 5 in that file. Two notes about exercises 2 and 5: • Q2: For clarification of programming exercise 2 see the next page • Q5: In programming exercise 5, try to find a solution that does not create any extra list. Or even better, find two solutions, one that creates an extra list and one that does not Details about Prog Ex 2 Find the sum of the upper triangle of a square matrix (i.e. the diagonal and up). 1 4 5 3 2 6 3 6 4 6 M = 4 3 6 7 2 3 4 2 2 4 2 3 8 3 5 How do we know if an element of a square matrix is on or above the main diagonal? row_index >> x = [1, 0, 3, 0, 0, 5, 7] >>> y=move_zeros_v1(x) >>> print(x, y) [1, 0, 3, 0, 0, 5, 7] [1, 3, 5, 7, 0, 0, 0] >>> x = [1, 0, 3, 0, 0, 5, 7] >>> z=move_zeros_v2(x) >>> print(x, z) [1, 3, 5, 7, 0, 0, 0] None >>> x = [1, 0, 3, 0, 0, 5, 7] >>> t=move_zeros_v3(x) >>> print(x, t) [1, 7, 3, 5, 0, 0, 0] None
Task 1 Go to the page below in your interactive textbook -read through -click Forward until reaching the end on the Python Visualizing example (to understand it) -and answer the two questions and the end: https://runestone.academy/ns/books/published/py4e-int/iterations/while.html A solution with forloop def sum_odd_for(n): odd_sum = 0 for num in range(n): if num % 2 == 1: odd_sum = odd_sum + num return odd_sum A solution with a whileloop def sum_odd_while(n): odd_sum = 0 i=1 while i < n: if i % 2 == 1: odd_sum = odd_sum + i i=i+1 return odd_sum Study of while vs for … in some elementary functions: Problem: sum of positive odd integers smaller than n list_sum = 0 Problem: Sum of odd A solution with forloop numbers in a given list a A solution with a whileloop def sum_list_for(a): ”’ (list)->num ”’ list_sum = 0 for i in range(len(a)): if a[i] % 2 == 1: list_sum = list_sum return list_sum def sum_list_while(a): ”’ (list)->num i=0 while i < len(a): if a[i] % 2 == 1: list_sum = list_sum +a[i] i=i+1 return list_sum + a[i] 6 7 def minimum_for_v2(a): ”’ (list)->num ”’ minimum = a[0] for i in range(len(a)): if a[i] < minimum: minimum = a[i] return minimum Study of while vs for … in anotherimportant elementary functions: Two solutions with forloop (forloop over elements and overindices) def minimum_for_v1(a): ”’ (list)->num ”’ minimum = a[0] for element in a: if element < minimum: minimum = element return minimum A solution with a whileloop def minimum_while(a): ”’ (list)->num”’ minimum = a[0] i=0 while i < len(a): if a[i] < minimum: minimum = a[i] i=i+1 return minimum Problem: find a minimum element in a list (of length at least 1) 8 Task 2 and Programming Exercise1 After studying the previous four functions 1.Open the file called seven_functions.py Copy/paste, one by one, the three solutions labeled with “# with while loop” into Python visualizer. Run through each example and understand how the solutions work and how the variables change in the loops. Play by changing what variable i is initialized to and how it is incremented (do for example i=i+3) and see what happens. As always, you can find python visualizer here (make sure you choose Python 3) http://www.pythontutor.com/visualize.html#mode=edit 2.Programming exercise 1: Open file called prog-ex-1.py Complete the function sum_odd_while_v2 that takes an integer n as input parameter and computes the sum of all odd integers between 5 and n (inclusively). Use WHILE LOOP only and NO IF STATEMENTS. >>> sum_odd_while_v2(10) 21 >>> sum_odd_while_v2(-7) 0 Programming Exercise2 Write a program using while loop that •asks the user for two integers (one by one or at the same time if you know how to handle it). Then it adds them and displays the sum. Then the user is asked to enter ‘yes’ if she wishes to perform the operation again, and otherwise if she enters anything other than ‘yes’ the program terminates. The operation (described in the bullet above) is repeated, as long as the user enters ‘yes’. Task 3 Question 1. What does the program below print ifthe input user entered was: 1 5 -3 0 Question 2.Can you tell what the program does? Write one sentence explaining what the functiondoes, in plain English. Task 4 Question 1. How many times does the program below print aha? Task 5 Question 1. What does the the code on the right print? ps. sometimes I say code, sometimes I say program. I mean the same thing. Task 6 13 A programmer who wrote a function below thinks that the function tests if a given positive integer n>=2 is a prime. However there is a logical mistake. 1. Where is the mistake? 2. For what number does the function return a wrong answer? Can you think of the smallest such number? 3. Fix the mistake. Programming Exercise3 Write a function first_neg that takes a (possibly empty) list of numbers as input parameter, finds the first occurrence of a negative number, and returns the index (i.e. the position in the list) ofthat number.Ifthe list contains no negative numbers orit is empty, the program should return None. Use while loop (and not for loop) and your while loop should stop looping once the first negative number isfound. Test yourfunction with atleastthe examples below >>> first_neg([2, 3, -1, 4, -2]) 2 >>> first_neg([2, 3, 1, 4, 2]) >>> first_neg([]) Programming Exercise4 Write a function sum_5_consecutive thattakes a list of numbers as input and returns True ifthere are 5 consecutive numbers in the listthat sum to zero. Otherwise itreturns False. The function should also return False ifthe list has less than 5 elements Solve this in twoways: 1.for loop (over indices of the list) 2.while loop (over indices of thelist) In both cases you need to think about “stopping condition” in order to avoid “IndexError: list index out of range ” 3. Test your function with at least the examplesbelow >>> sum_5_consecutive([2, 3, True -3, 2, 4,-6]) >>> sum_5_consecutive ([-10, False 1, 1, 4, 2, 10, 13]) >>> sum_5_consecutive([2, 1, True -3, -3, -3, 2, 7, 4, -6]) >>> sum_5_consecutive ([]) False >>> sum_5_consecutive ([1, -1,0]) False Programming Exercise 5 : Different ways to create usefullists Recall, for example, that a=[2] creates a list (refer to by variable a) with one element, a number 2 in this case. b=[None] creates a list with one element. That element is an object of type None. That is sometimes useful when we are not ready yet to assign a value to an element of a list. c=[] creates an empty list, i.e. a list of length zero Recall further that multiplying a list by an integer n, creates a new list that repeats the given list n times. Or that applying ‘+’ operator to two lists creates a new lists that concatenates the given two lists. For example: [1,2]+[10,20,0] creates a list [1,2,10,20,0] [7,2]*3 creates a list [7,2,7,2,7,2] Finally, recall the slicing. For example, if a=[2,3,4,1], a[:] returns a new list that is a copy of list a. Open the file called creating_various_lists.py. The firstline is given to you.It asks the user to enter a positive even integer n. For each green programing exercise below, try to find at least two solutions (e.g. one by using a loop with accumulator pattern and another by using int*list). 1.Create a list a (i.e a listreferred to by variable a) of length n filled with zeros 2.Create a list b of length n filled with random integers between 1 and 100 3.Create a variable c that is an alias of b 4.Setfirst half ofthe elements of c to zero, and then print both b and c 5.Copy list b into a listd 6.Create a new list e that has every 2nd element of b Programming Exercise 6: FibonacciNumbers Write a function called fib(n)thattakes as input an integer n (greater than 1) and creates a list containing n values such that a[0] =1 a[1] =1 a[i] = a[i-1] + a[i-2]for all i in the range 1 < i < n and it prints thatlist once it creates it >>> fib(7) [1 1 2 3 5 8 13] Once you are done,trace your program on paper and then Python visualizerto see whatthe values in the list will be for n=5. Programming Exercise7: Write a function inner_productthatthattakes as input two lists of integers (of same length) and then computes and returns the inner product ofthe two lists ofintegers.The inner product of two lists x = [x1, x2,…,xn] and y = [y1, y2,…,yn]is the value x1 y1 + x2 y2 + …+ xn yn . >>> inner_product([2,3,4], [1,0,2]) 10 Moreprogramming exercises (outside of lab): Do everything in the first 13 exercises here https://runestone.academy/ns/books/published/py4e-int/lists/Exercises.html ******** and if you need more exercises…. go to the web site below, for some more practice programming questions but be aware that the web site is in Python 2,thus if you do these exercises do them on your own computerAND NOT in the little browser window ofthe web site. http://codingbat.com/python
Starting Lab 7 • Open a browser and log into Brightspace • On the left hand side under Labs tab, find lab6 material contained in lab7-students.pdf file • Download that file to the Desktop. 2 Before starting, always make sure you are running Python 3 This slide is applicable to all labs, exercises, assignments … etc ALWAYS MAKE SURE FIRST that you are running Python 3 That is, when you click on IDLE (or start python any other way) look at the first line that the Python shell displays. It should say Python 3 (and then some extra digits) If you do not know how to do this, read the material provided with Lab 1. It explains it step by step 3 Task (to be completed at home) I strongly encourage you to complete these two quizes at home. Leave the time in the lab for questions to TAs about quiz problems whose solution you do not understand. Go to coursera webpage and log in. 1.Go to the following link to and complete the Quiz of Week 4. You will find the quiz at the bottom of the page https://www.coursera.org/learn/learn-to-program/home/week/4 2.Go to the following link to and complete the Quiz of Week 5. You will find the quiz at the bottom of the page https://www.coursera.org/learn/learn-to-program/home/week/5 • You can do each quiz more than once. 4 Programming Exercises (the most important lab) The following exercises are easily the most important exercises in this whole semester. Solving these problems (by yourself preferably) should greatly increase your understanding of computational problem solving and programming. Do as many as possible (preferably all) of the following 13 programming exercises from your textbook by Perkovic. 11 out of 13 are mandatory for this lab (your choice which 11) — see the slides to come. Introduction to Computing Using Python: An Application Development Focus, 2nd Edition, Ljubomir Perkovic Sometimes the author uses a word “outputs”. By that he means “prints” First recall from the next 4 slides, list (and few string) functions and methods that you will need. Introduction to Computing Using Python by Lj. Perkovic List operators and functions Like strings, lists can be manipulated with operators and functions >>> lst = [1, 2, 3] >>> lstB = [0, 4] >>> 4 in lst False >>> 4 not in lst True >>> lst + lstB [1, 2, 3, 0, 4] >>> 2*lst [1, 2, 3, 1, 2, 3] >>> lst[0] 1 >>> lst[1] 2 >>> lst[-1] 3 >>> len(lst) 3 >>> min(lst) 1 >>> max(lst) 3 >>> sum(lst) 6 >>> help(list … Usage Explanation x in lst x is an item of lst x not in lst x is not an item of lst lst + lstB Concatenation of lst and lstB lst*n, n*lst Concatenation of n copies of lst lst[i] Item at index i of lst len(lst) Number of items in lst min(lst) Minimum item in lst max(lst) Maximum item in lst sum(lst) Sum of items in lst Introduction to Computing Using Python by Lj. Perkovic Lists methods >>> lst = [1, 2, 3] >>> lst.append(7) >>> lst.append(3) >>> lst [1, 2, 3, 7, 3] >>> lst.count(3) 2 >>> lst.remove(2) >>> lst [1, 3, 7, 3] >>> lst.reverse() >>> lst [3, 7, 3, 1] >>> lst.index(3) 0 >>> lst.sort() >>> lst [1, 3, 3, 7] >>> lst.remove(3) >>> lst [1, 3, 7] >>> lst.pop() 7 >>> lst [1, 3] Usage Explanation lst.append(item) adds item to the end of lst lst.count(item) returns the number of times item occurs in lst lst.index(item) Returns index of (first occurrence of) item in lst lst.pop() Removes and returns the last item in lst lst.remove(item) Removes (the first occurrence of) item from lst lst.reverse(item) Reverses the order of items in lst lst.sort(item) Sorts the items of lst in increasing order Methods append(), remove(), reverse(), and sort() do not return any value; they, along with method pop(), modify list lst Introduction to Computing Using Python by Lj Perkovic String operators >>> ‘Hello, World!’ ‘Hello, World!’ >>> s = ‘rock’ >>> t = ‘climbing’ >>> s == ‘rock’ True >>> s != t True >>> s < t False >>> s > t True >>> s + t ‘rockclimbing’ >>> s + ‘ ‘ + t ‘rock climbing’ >>> 5 * s ‘rockrockrockrockrock’ >>> 30 * ‘_’ ‘______________________________’ >>> ‘o’ in s True >>> ‘o’ in t False >>> ‘bi’ in t True >>> len(t) 8 Usage Explanation x in s x is a substring of s x not in s x is not a substring of s s + t Concatenation of s and t s * n, n * s Concatenation of n copies of s s[i] Character at index i of s len(s) (function) Length of string s >> help(str) Help on class str in module builtins: class str(object) | str(string[, encoding[, errors]]) -> str … To view all operators, use the help() tool Usage Explanation s.capitalize() returns a copy of s with first character capitalized s.count(target) returns the number of occurences of target in s s.find(target) returns the index of the first occurrence of target in s s.lower() returns lowercase copy of s s.replace(old, new) returns copy of s with every occurrence of old replaced with new s.split(sep) returns list of substrings of s, delimited by sep s.strip() returns copy of s without leading and trailing whitespace s.upper() returns lowercase copy of s Introduction to Computing Using Python by Lj. Perkovic String methods Strings are immutable; none of the string methods modify string link Programming Exercises rpmaximum ×
Task 1: What does thisprint? Task 2: What does thisprint? Task 3: What does thisprint? Returns sum of allpositive integers smaller thann def summation(n): ‘’'(number)->number”’ odd_sum = 0 for num in range(n): odd_sum = odd_sum + num return odd_sum Sum of numbers in a given list a. A solution with loop overelements def sum_list_v1(a): ”’ (list)->num ”’ list_sum = 0 for item in a: list_sum = list_sum + item return list_sum return list_sum 8 Returns a product of allpositive integers smaller thann def product(n): ‘’'(number)->number’’’ prod = 1 for num in range(n): prod = prod * num return prod Sum of numbers in a given list a. Asolution with a loop over indices def sum_list_v2(a): ”’ (list)->num ”’ list_sum = 0 i = 0 for i in range(len(a)): list_sum = list_sum + a[i] Study the following four fundamentalalgorithms: 1. After studying the previous four fundamental functions, how would you modify them to only sum odd numbers (or odd list elements)? Try yourself and then see next step for solutions. 2. Open the file called four_functions.py Copy/paste, one by one, Example 1 to 4 into Python visualizer. Run through each example and understand how the solutions work and how the variables change in the loops. As always, you can find python visualizer here (make sure you choose Python 3) http://www.pythontutor.com/visualize.html#mode=edit 3. Programming exercise: Write a function called ah(l,x,y) that given a list l, and integers x and y such that x >> t=[5, 1, -2.5, 10, 13, 8] >>> ah(t, 2,11) (3, 5) Recallthat you can return two numbers referred by variables a and b by justreturning a tuple (a,b) 9 Task 4 and Programming Exercise1 Task 5 How many stars does the following program print? 10 a) 0 b) 15 c)45 d) 48 e) 68 Intermission: print functionrevisited In other words, unless specified otherwise, the default end forthe printfunction is end=’ ’ This is***Lab 5 This isLab 6 This is Lab 7 Would Print => Built-in function print, when its completes printing, enters a new line. For example: print(“This is”) print(“Lab 5”) Prints: This is Lab 5 As mentioned in the last lab, this default behavior of the print function can be changed by specifying what we want print function to end with. For example: print(“This is”, end=’***’) print(“Lab 5.”) print(“This is”, end=”) print(“Lab 6”, end=’ ’) print(“This is”, end=’ ‘) print(“Lab 7”) >>> pets = [‘boa’, ‘cat’, ‘dog’] >>> for pet in pets: print(pet) boa cat dog >>> Introduction to Computing Using Python by Lj. Perkovic More examples of use ofprint function Function print prints, by default, a newline character after printing its arguments The endargument allows for customized end characters >>> pets = [‘boa’, ‘cat’, ‘dog’] >>> for pet in pets: print(pet) boa cat dog >>> >>> pets = [‘boa’, ‘cat’, ‘dog’] >>> for pet in pets: print(pet) boa cat dog >>> for pet in pets: print(pet, end=’, ‘) boa, cat, dog, >>> >>> pets = [‘boa’, ‘cat’, ‘dog’] >>> for pet in pets: print(pet) boa cat dog >>> for pet in pets: print(pet, end=’, ‘) boa, cat, dog, >>> for pet in pets: print(pet, end=’!!! ‘) boa!!! cat!!! dog!!! >>> Task 6 13 1. What does this print? 2. What does this print? 3. What does this print? Task 7 What does the following program print? 14 Programming Exercise 2: Experiment with PerfectNumbers • A positive integer is called a perfect number ifitis equalto the sum of all of its positive divisors, excluding itself. For example, 6 is perfect number since 6=1+2+3. The next is 28=1+2+4+7+12. There are four perfect numbers less than 10,000. Write a program that prints allthese four numbers. • Your program should have a function called is_perfect that takes as input a positive integers and returns True if itis perfect and False otherwise. • Once you are done. Modify your program so that it looks for all perfect numbers smaller than 35 million. What do you notice? Assuming that you computer can do a billion instructions in a sec, can you figure out how long,roughly, will ittake your computerto find 5th perfect number (it is 33,550,336). Is the answer roughly: couple of minutes, couple of hours, couple of days, … weeks, months, years ? What if you wanted to wait until it prints 6th perfect number, which is 8,589,869,056? • 15 Programming Exercise3a: Arithmetic progression Recall that a sequence of numbers forms an arithmetic progression if the difference between every pair of consecutive numbers is the same. For example: -5, -1, 3, 7,11 forms an arithmetic progression since the difference between every pair of consecutive numbers is 4. On the contrary 5, 10, 15, 24, 29 is not an arithmetic progression since the difference between some consecutive pairs is 5 and some 4. A sequence that has exactly one number is considered arithmetic,too. • Write a function called arithmetic that takes as input a list of numbers and returns True ifthe numbers ofthe listform arithmetic progression. And Falseotherwise Testing: >>> arithmetic( [-5, -1, 3, 7, 11] ) True >>> arithmetic([0, -1, 3, 7, 11]) False >>> a = [5, 10, 15, 24, 29] >>> arithmetic(a) False >>> arithmetic(a[:3]) True Programming Exercise3b: and now … is itsorted? Now modify your method arithmetic slightly so that instead ittests ifthe numbers in the give lists are ordered for smallest to largest. Call the new function is_sorted Testing: >>> is_sorted([1, 1, 1, 7, 7]) True >>> is_sorted([-10, -1, 3, 7, 100]) True >>> is_sorted([0, 3, 1, 7, 11]) False >>> a = [5, -10, 15, 24, 29] >>> is_sorted(a) False >>> is_sorted(a[1:4]) True Task 6: Flow of execution The the following two multiple choice questions: https://runestone.academy/ns/books/published/py4e-int/functions/flowofexecution.html
Task 1 In Python interpreter assign string ‘good’ to variable s1, ‘bad’ to variable ‘s2’ and ‘silly’to variable s3. Write Python expressions involving strings s1, s2, and s3 that correspondto: a) ‘ll’ appears ins3 b) the blank space does not appear in s1 c) the concatenation of s1, s2, ands3 d) the blank space appears in the concatenation of s1, s2, and s3 e) the concatenation of 10copies of s3 f) the total number of characters in the concatenation of s1, s2, and s3 Introduction to Computing Using Python by Lj. Perkovic Indexing operator, revisited ‘p ‘ ‘p ‘ ‘l ‘ ‘e’ =’A ‘ s = s[0:2] s[1:4] = s[2:5] = S[2:] = s[:2] 3 4 >>> s = ‘Apple’ >>> s[0] ‘A’ >>> s[1] ‘p’ >>> s[4] ‘e’ ‘ A p p l e ‘ 0 1 2 -2 The indexing operator returns the character at index i (as a single character string). -5 -4 -3 ‘A p’ ‘p p l’ ‘p l e’ ‘p l e’ s[-3:-1] = ‘p l’ = ‘A p’ s[i:j] :the slice of s starting atindex i and ending before indexj s[i:] : the slice of s starting at index i s[:j] : the slice of s ending before index j -1 >>> s = ‘Apple’ >>> s[0:2] ‘Ap’ >>> s[1:4] ‘ppl’ >>> s[2:5] ‘ple’ >>> s[2:] ‘ple’ >>> s[:2] ‘Ap’ >>> s[-3:-1] ‘pl’ çNote: There are no blank spaces here, just appears so for visibility purposes Introduction to Computing Using Python by Lj. Perkovic Task 2 Start python interpreter 1.In python interpreter, create aha variable and assign ‘abcdefgh’ to it. 2.Then write Python expressions (in the interpreter) using string aha and the indexing operator that evaluate to: a) ‘abcd’ b) ‘def’ c) ‘h’ d) ‘fg’ e) ‘defgh’ f) ‘fgh’ g) ‘adg’ h) ‘be’ Usage Explanation s.capitalize() returns a copy of swith first character capitalized s.count(target) returns the number of occurencesof target in s s.find(target) returns the index of thefirst occurrence of targetin s s.lower() returns lowercase copy ofs s.replace(old, new) returns copy of s with every occurrence of oldreplaced with new s.split(sep) returns list of substrings ofs, delimited by sep s.strip() returns copy of s withoutleading and trailing whitespace s.upper() returns uppercase copy ofs Introduction to Computing Usi String methods Strings are immutable; none of the string methods modify string s Introduction to Computing Using Python by Lj. Perkovic Task 3 Copy/paste the following expression (in black) to Python Interpreter: s = ”’It was the best of times, it was the worst of times; it it was was the age of wisdom, it was the age of foolishness; the epoch of belief, it was the epoch of incredulity; it was …”’ (The beginning of A Tale of Two Cities by Charles Dickens.) Then do the following, in order: (a)Write a sequence of statements that produce a copy of s, named newS, in which characters ., ,, ;, and have been replaced by blank spaces. (b) Remove leading and trailing blank spaces in newS (and name the new string newS). (c) Make all the characters in newS lowercase (and name the new string newS). (d) Compute the number of occurrences in newS of string ‘it was’. (e) Change every occurrence of was to is (and name the new string newS). Task 4 a) Follow the link in (b) and trace through the two Python Vizualizer examples thathave “Code Lence 1” and “Code Lence 2” in the caption. You do that by clicking Forward untilthe program ends, like we did it class (It does not matter here thatit says Python 2.7. For these two programs Python 2 and 3 behave thesame). b) ThenAnswer the two multiple choice questions atthe end https://runestone.academy/ns/books/published/py4e-int/functions/fruitfulvoid.html c)Then follow this link and do the two multiple choiceexercises: https://runestone.academy/ns/books/published/py4e-int/functions/flowofexecution.html 9 Task 5: Moretracing and print vs return 10 Open the program called print-vs-return-and-function-calls.py, provided in this lab. The lines the start with # are commented out. Thus Python will ignore them. In other words, if you press Run Module the lines that start with # will not be executed. Uncomment the lines as instructed in the file. Each time you do, save and press “Run Module”. But before you press “Run Module”, write down what you think the program will print. Then press “Run Module” and compare. Note: once you uncomment a line as instructed, leave it as is (do not put back the comments, but rather continue with the next set of lines you are instructed to uncomment). Programming exercise: SolvedProblem Suppose that you are given the following two problems to solve. Read the two problems.Think about how you would solve them, and then open and study the two provided solutions in prog_solved_v1.py and prog_solved_v2.py. Runboth. Version 1: Write a program that asks a userfor her name and age and prints a nice message stating ifthe useris eligible to vote. Version 2: Write a program that asks a user for name and age and prints a nice message stating ifthe useris eligible to vote.As a part of your solution the program should have a function called is_eligible that given the age as input parameter returns true or false depending on weatherthe age is less than 18 or not. 11 Programming exercise1 Repeatthe exercise in the previous question (version 2), where in addition you need to ask the user fortheir citizenship and ifthey are currently in prison convicted for a criminal offence. Your program should print a nice message telling the user if they are eligible to vote (i.e. if they are 18+, Canadian and do not live in prison convicted for a criminal offence,then they can vote. Otherwise not). You should modify function is_eligible so it takes to additional paramters as input. In particular the head of the function should be: is_eligible(age, citizenship,prison) Your program should work ifthe user enters any of the following versions of answers forthe two new questions: Canadian Canada Canada canadian Yes YES No no and so on Note thatin Canada, one can vote even ifin prison convicted for a criminal offence. This example if fictional. 12 Programming exercise2 Write a function called mess thattakes a phrase (i.e., a string) as input and then returns the copy of that phrase where each characterthatis one of the last 8 consonants of English alphabet iscapitalized (so, r, s, t, v, w, x,y , z) and where each blank space is replaced bydash. For this question, use a for loop over characters of a string, and “accumulator”. (We will see, or have seen, that in Lecture 8 on Monday). When called from the python shell, your function should behave as follows: >>> mess(‘Random access memory ‘) ‘Random-acceSS-memoRY–‘ >>> mess(‘central processing ‘cenTRal-pRoceSSing—uniT.’ unit.’) 13 Introduction to Computing Usi Built-in function range() Function range() is used to iterate over a sequence of numbers in a specified range • This iterates over the n numbers 0, 1, 2, …, n-1 for i in range(n): • This iterates over the n numbers k, k+1, k+2, …, n-1 for i in range(k, n): • This iterates over the n numbers k, k+c, k+2c, k+3c, …, n-1 for i in range(k, n, c): In particular the first time a program encounters a forloop it creates the variable whose name follows the keyword for. In the above examples, the variable name is i.Then that variable, i in this case, takes values in the given range one by one. Each time it takes the next value it enters the for-loop and executes its body. The for-loop terminates after i has taken on all the values in the range, as shown above) >>> for i in range(2, 3): print(i) 2 >>> >>> for i in range(2, 2): print(i) >>> Introduction to Computing Usi Examples of for loops withrange() >>> for i in range(4): print(i) 0 1 2 3 >>> >>> for i in range(0): print(i) >>> >>> for i in range(1): print(i) 0 >>> >>> for i in range(2, 6): print(i) 2 3 4 5 >>> >>> for i in range(0, 16, 4): print(i) 0 4 8 12 >>> >>> for i in range(2, 16, 10): print(i) 2 12 >>> Python Visualizer 16 Go to Python Visualizer here (make sure you choose Python 3) http://www.pythontutor.com/visualize.html#mode=edit and copy/paste, only by one,the following loops to it and click Forward to visualize the execution. Pay attention what is assigned to variable i and when does loop terminate. for i in range(3): print(i) for i in range(2,4): print(i) for i in range(2,2): print(i) for i in range(1,10, 3): print(i) Introduction to Computing Usi Task 6 Before attempting the following exercise study the examples from the previous slide. Then open a new file in IDLE and write a program with 6 separate for loops that will print the 6 sequences listed below in parts a) to f). (you do not have to print commas, but you can if you know how to). Note that if you put ,end=” ” at the end of a print function call, then the print function will print the blank space when it finishes rather than go to the new line. Eg: this prints numbers 0 to 9 in one line >>> for i in range(10): print(i,end=” “) 0 1 2 3 4 5 6 7 8 9 >>> a)0, 1, 2, 3, 4, 5, 6, 7, 8 , 9, 10 b)1, 2, 3, 4, 5, 6, 7, 8, 9 c)0, 2, 4, 6, 8 d)1, 3, 5, 7, 9 e)20, 30, 40, 50, 60 f)10, 9, 8, 7, 6, 5, 4, 3, 2, 1 Programming exercise3: Open the file ex23n8.py. Inside of thatfile: 1. write a function called print_all_23n8(num), that takes as input a non-negative integer num and prints allthe the non-negative numbers smaller than num that are divisible by 2 or 3 but not 8. You should use the function that is already present in the file (and that you developed forthe lastlab) 2. Outside of that function ask the user for a non-negative integer. Your program should then print all non-negative numbers that are divisible by 2 or 3 but not 8, by making a call to functionprint_all_23n8 3. Run your program and test it by entering, for example, 1000 when promptedfor a number 18 Programming exercise4: This program: for i in range(4): print(“*************” ) Prints : ************* ************* ************* ************* Write a program that asks a user for a positive integer and a character. And then draws half piramid with the given number of raws using that character. For example if the user enter 3 and $, your program should draw (that is,print): $ $$ $$$ Bonus excercise: For a callenge, draw a real pirmaid like the one to the right that should be displayed ifthe user entered 10 and # # ### ##### ####### ######### ########### ############# ############### ###############1 #9 # ################### Programming exercise5: 1. Write a program that asks a user for a positive integer and then prints allthe divisors of the given integer. For example if the user entered 6, the program should print: 1, 2, 3,6 2. Add a function, called prime, to this program. Function prime takes a positive integer as input parameter and tests if it is a prime (that is, it returns true if the given number is a prime and false otherwise). Recall that a number is prime if it atleast 2 and ifitis only divisible by 1 and itself. Then make a callto this function and print the message stating ifthe number the user inputted in 1) is a prime. 3. Copy/paste your whole solution into Python Visualizer, click throughwith Forward to understand see howit runs 4. Bonus exercise: Write a program that asks a user for a positive integer, n,and prints all the primes smaller thann. 20 Making a table With these tools, you now can make a program that prints nice tables. Seehere: https://runestone.academy/ns/books/published//thinkcspy/MoreAboutIteration /SimpleTables.html