COM00142M Department of Computer Science
Advanced Programming

I. Module Learning Outcomes
The module learning outcomes (MLOs) for this module are as follows:
MLO 1. Demonstrate a critical understanding of the theory and application of advanced programming techniques.
MLO 2. Design and implement programs for real-world problems.
MLO 3. Communicate design decisions for the selection, storage and manipulation of data.
MLO 4. Critically evaluate the legal and ethical impact of software developments in real-world contexts.
This assessment addresses all the module learning outcomes listed above.

II. Assessment Background/Scenario
Your task is to design and develop a prototype application that demonstrates how data from the given data set can be formatted, reshaped and used to generate specific outputs. Your application can be a single programme or a collection of programmes that provide the equivalent functionality as described below.

Data Set (CSV)
The dataset contains online activity logs for 152 university students enrolled in a blended Computer Science course. The dataset is further split into three CSV files: "USER_LOG", "ACTIVITY_LOG" and "COMPONENT_CODES".
The "USER_LOG" CSV consists of:
● Date
● Time
● User Full Name *Anonymized
The "ACTIVITY_LOG" CSV file contains:
● User Full Name *Anonymized
● Component
● Action
● Target
The "COMPONENT_CODES" CSV file contains:
● Component
● Code

Functional requirements
The application should provide the following basic functionality:
● A means to load the initial data set (the CSV file(s) provided) and translate it into a suitable format: either XML, JSON or an entity relationship structure (not CSV).
● A means to back up the translated data using either files or a database. This should preserve the current state of the data when the program is closed and make it available when the program is reopened.
● A process for cleaning and preparing the data set, managing inconsistencies, errors and missing values.
Cleaning can be done either at the CSV stage or after you have translated the data set into the new format, and must be completed before you apply any of the data manipulations and outputs detailed below.
● A graphical user interface (or interfaces) for interacting with the data set(s) that enables the user to:
o Load the initial data set (the CSV file(s)).
o Apply the cleaning, transformation, REMOVE and RESHAPE steps to produce a prepared data set.
o Load the prepared data set (from its translated format).
o Manipulate the range of values used to generate OUTPUT STATISTICS and GRAPHS and to perform CORRELATION analysis.
o Use the prepared data set to generate OUTPUT STATISTICS, GRAPHS and CORRELATION results.
It should be assumed that this prototype application will be able to handle other sets of data generated from the same source, i.e. data with the same column and row structure in CSV format, but containing different values and anomalies. However, the application is not required to be generic (to work with multiple unknown data sets). Given this, best practice regarding code reuse, encapsulation and a well-defined programming interface should be applied where applicable.

Data manipulation and outputs
Your prototype application needs to be able to perform the following actions on the data set, once it has been translated into your selected format. First, determine whether NumPy or Pandas is more appropriate for this dataset. Next, decide whether it is more appropriate to split the data into manageable chunks before performing the following actions. You should apply these actions in order; the later ones are the more challenging to achieve.
REMOVE: No outputs should include any data from the Components "System" and "Folder".
RENAME: The column "User Full Name *Anonymized" should be renamed User_ID in both the ACTIVITY_LOG and USER_LOG CSVs.
MERGE: Merge the suitable CSVs for analysing user interactions with each component.
RESHAPE: Reshape the data using a pivot operation.
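The four manipulation steps above can be sketched with pandas. This is a minimal illustration only: the tiny in-memory frames and their values are hypothetical stand-ins for the real CSVs, but the column names follow the brief.

```python
import pandas as pd

# Stand-ins for ACTIVITY_LOG and COMPONENT_CODES; values are illustrative.
activity = pd.DataFrame({
    "User Full Name *Anonymized": [1, 1, 2, 2],
    "Component": ["Quiz", "System", "Lecture", "Folder"],
    "Action": ["viewed"] * 4,
    "Target": ["t"] * 4,
})
components = pd.DataFrame({"Component": ["Quiz", "Lecture"], "Code": ["Q", "L"]})

# RENAME: give the anonymised name column a workable identifier.
activity = activity.rename(columns={"User Full Name *Anonymized": "User_ID"})

# REMOVE: drop rows for the excluded components before producing any output.
activity = activity[~activity["Component"].isin(["System", "Folder"])]

# MERGE: join the component codes onto the activity log.
merged = activity.merge(components, on="Component", how="left")

# RESHAPE: pivot to one row per user, one column per component (interaction counts).
pivoted = merged.pivot_table(index="User_ID", columns="Component",
                             values="Action", aggfunc="count", fill_value=0)
print(pivoted)
```

The same pipeline applies unchanged to any CSV with this column structure, which is the reuse property the brief asks for.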
COUNT: The interactions for each user with the Component for each month. Add this new field to the new structure.
OUTPUT STATISTICS: Produce the mean, mode and median for the components Quiz, Lecture, Assignment, Attendance and Survey:
a. for each month;
b. for the entire 13-week academic semester.
OUTPUT CORRELATION: Produce a suitable graph that displays user interactions with the following components: Assignment, Quiz, Lecture, Book, Project and Course. Determine whether there is any significant correlation between 'User_ID' and 'Component'. You will need to select an appropriate visualisation to demonstrate this.

Non-functional requirements
● The GUI must be able to provide appropriate feedback to confirm or deny a user's actions.
● The application must be able to handle internal and user-generated errors.

Technical requirements
A. The application is built using Core Python, version 3.7 to 3.12.
B. The application uses one or more of the advanced Application Programming Interfaces (APIs) introduced on this module, such as NumPy, Pandas, Seaborn and Matplotlib. It should NOT use alternative APIs for this functionality; however, appropriate Python core libraries can be used to access/query a database.
C. The application MUST run within the Anaconda environment using a Jupyter notebook.
D. The application and its parts must not run concurrently, and must NOT use Python threads.
The requirements specified here are the constraints within which you need to produce your prototype application. They are not negotiable.

III. Assessment Task(s)
This assessment has two tasks:
A. Design and implement a suitable prototype application that meets the specified requirements as either a single program or a series of clearly identifiable programs. The program(s) submitted MUST be able to run under the constraints of the technical requirements section.
B.
Produce a report that addresses the questions below and demonstrates your approach to the design and development of your prototype application, clearly justifying the decisions you have made. You should support your discussion with appropriate reference to relevant sources, using the citation and reference structure indicated in the guide to the IEEE referencing system.
Where requested, you should select code samples from your software development that demonstrate specific algorithms and interactions. All code samples should be captured as images (screenshots), appropriately labelled, and presented in the appendix. You should refer to and discuss these within the context of each question. Do NOT include screenshots in the body of your report. For further guidance on using appendices, please see the 'Submission Formatting' page in Canvas.
The report consists of three main sections, each containing a series of questions to satisfy the learning outcomes. Each question has an indicative word count indicating what would be considered a reasonable response given the whole report. You may choose to redistribute this across questions; however, you must not exceed a total of 3,000 words and a maximum of 12 pages in the appendices. There is no limit on the number of references you provide. For further guidance on word counts and the required formatting of your report, please see the 'Submission Formatting' page in Canvas.

Section 1: Theory supported by code samples (40%, 1,200 words plus up to 6 pages in the appendices)
Evidence for learning outcomes: demonstrate critical understanding of the theory and application of advanced programming techniques; design and implement programs for real-world problems. [MLO1, MLO2]

1a) [20 marks] Identify ONE part of your program design (such as processing the initial data set) that has the potential to be redesigned concurrently, using Python Threads.
Clearly identify the program part and justify its selection and potential. Then discuss any specific issues that would need to be considered to refactor this part, and the wider impact of this refactoring on your whole program design. You should consider how data and/or communications would be passed between concurrent aspects, such as threads, and justify which Python constructs would support this redevelopment effectively.
It is expected that this question can be reasonably addressed within 600 words, with no more than 2 pages in the appendix for pseudo code, diagrams or code samples that support your discussion. This section will require appropriate citations to achieve a pass.

1b) [20 marks] With specific reference to GUI interface constructs (such as text labels and buttons), and best practice regarding interface layouts, discuss how your GUI design and implementation supports THREE of the user interactions required by your prototype application. You should then justify your design decision for each, providing comparative examples to support your approach. You should aim to demonstrate as wide a range of interface constructs/layouts as your prototype application supports.
It is expected that this question can be reasonably addressed within 600 words, with no more than 4 pages in the appendix for GUI layout diagrams (wireframes OR screenshots) AND code samples that support your discussion. This section will require appropriate citations to achieve a pass.
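For 1a, the kind of refactoring being asked about could be sketched as a producer/consumer pair passing rows through a thread-safe queue. Note the hedge: the submitted application itself must NOT use threads, so this is purely a discussion aid, and the cleaning rule shown is a hypothetical example.

```python
import queue
import threading

# Hypothetical refactoring sketch for question 1a only (the brief forbids threads
# in the submitted application). A producer thread stands in for CSV loading and
# a consumer thread applies a cleaning rule; queue.Queue passes rows between them
# safely, and None acts as a sentinel signalling "no more data".
rows = [{"Component": "Quiz"}, {"Component": "System"}, {"Component": "Lecture"}]
q = queue.Queue()
cleaned = []

def producer():
    for row in rows:          # stands in for reading CSV lines from disk
        q.put(row)
    q.put(None)               # sentinel: producer is finished

def consumer():
    while True:
        row = q.get()
        if row is None:
            break
        if row["Component"] != "System":   # an example cleaning rule
            cleaned.append(row)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(cleaned)
```

The queue is the key construct to discuss: it serialises access to shared data, so neither thread needs explicit locks around the rows themselves.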
Section 2: Design decisions supported by code samples (40%, 1,200 words, up to 6 pages in the appendices)
Evidence for learning outcomes: communicate design decisions for the selection, storage and manipulation of data; design and implement programs for real-world problems. [MLO3, MLO2]

2a) [10 marks] With specific reference to the data manipulation requirements REMOVE and RESHAPE, discuss the reasoning for your selected data format (JSON, XML, or an entity relationship structure), and what advantages/disadvantages it has demonstrated in this context.
It is expected that this question can be reasonably addressed within 400 words, with no more than 1 page of appendices for code samples or data format samples. This section will require appropriate citations to achieve a pass.
NOTE: Failure to submit a functional program (or programs) in the Jupyter notebook format may result in a grade of zero for 2a only.

2b) [30 marks] For each of OUTPUT STATISTICS, GRAPHS and CORRELATION, discuss and demonstrate, via appropriate code samples and program output, the following:
● Any additional cleaning you have undertaken, justified in the context of the relevant output(s). State clearly if you have carried out no additional cleaning, and justify why you chose not to do so.
● Why the APIs you selected for data analysis were chosen over other available options, focusing on how they are suited to producing the desired outputs.
● A clear code example of how you have applied the selected APIs to achieve each output.
● What you observe from each output and what conclusion(s) you can draw from it, if any.
It is expected that this question can be reasonably addressed within 800 words, with no more than 5 pages in the appendix for code samples, and screenshots of output and visualisations that support your discussion.
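For 2b, one hedged sketch of how pandas and Matplotlib could produce the required statistics and a correlation-style visual. The interaction counts here are hypothetical, and a heatmap is only one candidate visualisation for the User_ID/Component question.

```python
import matplotlib
matplotlib.use("Agg")          # render off-screen; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-user, per-component monthly interaction counts.
counts = pd.DataFrame({
    "User_ID": [1, 1, 2, 2, 3, 3],
    "Component": ["Quiz", "Lecture", "Quiz", "Lecture", "Quiz", "Lecture"],
    "Month": ["Sep", "Oct", "Sep", "Oct", "Sep", "Oct"],
    "Interactions": [4, 2, 6, 2, 8, 5],
})

# OUTPUT STATISTICS: mean, median and mode of interactions per component.
stats = counts.groupby("Component")["Interactions"].agg(
    mean="mean", median="median", mode=lambda s: s.mode().iloc[0])
print(stats)

# One candidate CORRELATION visual: a User_ID x Component heatmap of counts.
table = counts.pivot_table(index="User_ID", columns="Component",
                           values="Interactions", aggfunc="sum", fill_value=0)
fig, ax = plt.subplots()
im = ax.imshow(table.to_numpy())
ax.set_xticks(range(len(table.columns)), table.columns)
ax.set_yticks(range(len(table.index)), table.index)
fig.colorbar(im, ax=ax)
fig.savefig("correlation_heatmap.png")
```

Filtering `counts` by month before the `groupby` gives the per-month variant of the statistics; omitting the filter gives the whole-semester figures.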
Section 3: Reflection on the ethical, moral and legal aspects (20%, 600 words)
Evidence for learning outcome: critically evaluate the legal and ethical impact of software developments within real-world contexts. [MLO4]

3) [20 marks] Reflect on the ethical, moral and legal aspects of computing, as discussed in the module, and demonstrate an awareness of how these need to be considered in the role of a software engineer. Critically evaluate the following statement by building an effective 'for' or 'against' argument. This should be supported by the literature, using comparative examples, and recognition of the opposing position where appropriate.
"The moderation of social media platforms by their owners/operators is robust, fair, and effective at removing problematic content. Consequently, software engineers should not be required to consider the ethical, moral or legal consequences of employing user-submitted social media content as training data for machine learning."
It is expected that this question can be reasonably addressed within 600 words. This section will require appropriate citations to achieve a pass.

IV. Deliverables
The appendices limit (12 pages) for this assessment supersedes that stated on the 'Submission Formatting' page in Canvas. Otherwise, your assignment should be laid out following all other formatting guidelines specified on the 'Submission Formatting' page in Canvas.
You should submit two files, as follows:
● A completed report answering the given questions, as a single file in either .docx or .pdf format. This should NOT be included in the zipped file and should not exceed the given word counts or page limits.
● A single zipped file containing your program or programs. If a database has been used, you should produce a file dump of the data/table structure to include here. This should NOT contain the original data set.
Using a database
Where you have opted to use an SQL or relational database (other than Mongo), include the following after your list of references:
A. Name of the database and a link to download it (install package)
B. Version number of the database used
C. The name of the Jupyter notebook that creates and populates the database
D. The point in your code where the local host and the port are set (make this clear)
You should make sure that your submitted code contains all the code required to set up and populate your database via a local host connection.

Referencing
You are required to use the IEEE referencing style for citing books, articles, and all other sources (such as websites) used in your assignment. Good referencing is essential in order to meet the standards of academic integrity set by the University. All your sources must be acknowledged, regardless of whether you have included direct quotes or not. Visit your Academic Integrity Tutorial module in Canvas for additional guidance on effective referencing.

V. Marking Criteria
Section 1. Theory supported by code samples
● Functional program(s): an implementation of your software design, using the specified platform(s), to demonstrate MLO2 as well as to allow verification of your report discussion.
● 1a. Adaptation to a concurrent model: appropriate concurrent mechanisms/constructs have been selected for the refactoring. These, and potential issues/impacts, have been discussed in the context of the given scenario and requirements. [20 marks]
● 1b. Implementing user interaction: appropriate GUI constructs and layouts have been selected to support the required interactions. There is a clear rationale for their selection given best practice in GUI constructs and layout. [20 marks]
Section 2. Design decisions supported by code samples
● Functional program(s): an implementation of your software design, using the specified platform(s), to demonstrate MLO2 as well as to allow verification of your report discussion (partially non-functioning code can still attract credit if it addresses the stated requirements and specification).
● 2a. Selected data format: an effective format has been selected and a rational argument is presented for how it supports the nature of the data and the type of analysis required to produce the prototype application's requirements. Failure to submit a functional implementation may result in a grade of zero for this question (2a) only. [10 marks]
● 2b. Generating outputs: appropriate code constructs, internal data structures and visual representations have been selected and applied to achieve the given requirements. Considerations have been made for any anomalies within the data set. There is a clear justification for design decisions, and accurate observations are made given the application's output. [30 marks]
Section 3. Reflection on ethics, morals and legal aspects
● Ethics, morals and legal aspects: clear and appropriate examples from the literature are used to build an effective argument to support a 'for' or 'against' position on the statement. [20 marks]
TOTAL: 100 marks
Innovation Management ELE00158M
Part 2: Individual Assignment (65%) - 2024/25

There are two parts to this module's assessment:
• Part 1: Group Assignment (35%)
• Part 2: Individual Assignment (65%)
The following is the brief for Part 2: Individual Assignment.

As part of the module, you will be working with an assigned team on the development of a new product-based business idea in an area of engineering and technology in the UK market. The main objective is to explore the real-world market and test the commercial viability of your business idea. You will be using an online concept testing tool called SimVenture Validate to build your business plan, including costing and revenue analysis. You will need to complete all the relevant components within this canvas using guidelines from your module coordinator.

Following the work on your canvas, the next step is Testing. As part of the testing in your market research, you will need to address the following:
• Identify a range of critical assumptions from your business canvas in SimVenture Validate. Each group should identify a minimum of 15 critical assumptions and correlate them to the blocks from your business canvas.
• In your critical assumptions, you must also identify some Key Performance Indicators (KPIs) wherever applicable and discuss why these might be important for you to analyze at this stage.
As a group, you will need to test your critical assumptions in the real market using relevant research techniques, and use the collected data to evaluate the commercial viability of your business idea. Your market research involves using a combination of primary and secondary data and different testing methods.
Please note: you will need to complete the relevant ethics application form and seek approval before collecting any primary data; please use the guidelines from your lectures.
Once your data collection and analysis are complete, as a group you will then finalize the 'tests' part of SimVenture Validate as well as your group's portfolio.
Following the group work, write an individual report using the information from the research undertaken for your business idea, addressing the following:
● Introduction: including customer segment, market analysis, revenue model, test results and commercial viability (suggested word count 350 words). [5 marks]
● Critical Assumptions: list all your critical assumptions from your Business Canvas in SimVenture Validate (5 marks) and justify choosing these critical assumptions for your business plan (5 marks). [10 marks]
● Market Research and Analysis: discuss your approach for testing your critical assumptions, including specific test strategies and data collection (5 marks); analyse the data you have collected and discuss your key findings (25 marks). [30 marks]
● Commercial Viability: discuss, following your market testing, what you know about your customer segment and their behaviour (5 marks); your revenue model (5 marks); and the brand value of your product and why (5 marks). [15 marks]
● Product Management: assume the role of a product manager and present a roadmap for the product; using the roadmap, explain how you would update the product and deliver new features (15 marks); propose the structure of a Product Backlog for effective Sprint Planning within your team as part of product management (10 marks). [25 marks]
● Conclusion: reflect on the prospects of your business idea, using evidence from your work, and the way forward. [5 marks]
● Referencing, appendix, overall report layout and readability: list all references cited in the report using the IEEE style; include the ethics approval for your data collection, your interview/survey guides and a copy of your group portfolio from SimVenture Validate. [10 marks]
Additional information:
● Suggested word count 3,500 (title page, Executive Summary, reference list and appendix not included); IEEE referencing should be used. Font size 12, single line spacing.
● Your report should be submitted as a pdf file.
UPD204 1st SEMESTER 2024/25 GIS REPORT
BEng/BA URBAN PLANNING AND DESIGN – Year 3
GEOGRAPHIC INFORMATION SYSTEMS

INSTRUCTIONS TO STUDENTS:
1. Assessment Weight: 60% of the final mark
2. Delivery Method:
- Main Report: a short essay of approx. 1,000 words (no more than 1,500 words)
- Appendix: attach the three assignments from the practical sessions (P3, P8 & P9)
* The word count of the report excludes the Appendix and the reference list.
3. Submission Date: 5:00 pm on 24 December 2024
- Submit an electronic copy on LMO (no hard copy submission is required).
- There is no need to submit the original dataset and intermediate files, but please make sure they are stored properly on your own PC or Box so that the module examiner can access your original files when necessary.
- The late submission penalty applies.

Aims and Learning Outcomes
The GIS Report assessment is related to the aims and learning outcomes of the UPD204 module. Students who complete this assessment successfully should be able to:
A. apply the practical uses of GIS to real-world problem solving;
B. demonstrate the basic methods of spatial analysis;
C. assess issues relating to data quality and GIS use;
D. illustrate trends in the development of GIS;
E. manipulate spatial data; and,
F. operate the basics of GIS software.

Task
You have been appointed to a research post at a local GIS research institute in your city. Your research team has just started a planning project to support local policy making and needs to conduct a case study on how to apply GIS to spatial planning at the city scale. In your final report, you are required to deliver a spatial planning practice by conducting a GIS analysis using city-level spatial data. You should explain, and your report should demonstrate:
● a spatial analysis practice of your own using available GIS data, for example, POI data, road network, census data, land cover data, etc.
(for an international student, you can choose your hometown or another city you are familiar with as a case study); and,
● the process and result of the spatial analysis that you have conducted.
You also have to attach assignments from three Practical Sessions as the appendix (1 page each). Details of each assignment will be provided in class.

Guidelines
In order to conduct the spatial analysis using the GIS data available:
(1) you can use the available Chinese dataset in the UPD204DataChina folder;
(2) you may download various datasets (*), which you can access through:
① General Statistic Database (China): https://libguides.lib.xjtlu.edu.cn/c.php?g=520529&p=3559901
② China Boundaries Data (gadm.org): on Learning Mall, shown as below
③ Suzhou GIS Data (from OpenStreetMap): on Learning Mall, shown as below
Note: GIS data for China from OpenStreetMap can be downloaded from: https://box.xjtlu.edu.cn/f/dbb36dd20da64ee980f9/?dl=1
④ Suzhou Land Cover Data (from ESRI): on Learning Mall, shown as below
Note: Global Land Cover Data 2017-2023 from ESRI can be downloaded from https://livingatlas.arcgis.com/landcover/
*Note: for datasets at the national or global scale, please try to process the original dataset and extract the data that is suitable for your study area. More datasets might be available and would be uploaded to LMO. Please double-check the data list on LMO.
(3) you may also explore more geographic data available on the Internet (you must cite data sources), for example:
http://libguides.lib.xjtlu.edu.cn/GIS/OpenGeo-Resources
http://libguides.lib.xjtlu.edu.cn/GIS/Geo-DataAccess
http://www.beijingcitylab.com/data-released-1/
http://www.gadm.org/download_world (*)
http://www.gadm.org/download_country_v3 (*)
http://www.openstreetmap.org
http://www.gscloud.cn/sources/index?pid=1&rootid=1 (in Chinese)
http://worldmap.harvard.edu/chinamap/ (*)
https://dataverse.harvard.edu/dataverse/hrs
http://download.geofabrik.de/asia/china.html
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/M7WEFY
http://www.stats.gov.cn/
http://www.fas.harvard.edu/~chgis/
http://www.davidrumsey.com/luna/servlet/view/all/where/China
http://freegisdata.rtwilson.com/
http://www.dbr.nu/data/geo/ (*)
http://www.fas.harvard.edu/~chgis/data/chgis/downloads/v4/
Note: (*) indicates that the site settings of the webpage probably need to be adjusted first (such as the settings for Flash, pop-ups and redirects, ads, or insecure content) when using Chrome.
In ArcCatalog, you should create a new working folder called "Coursework" under the "UPD204MyAssessments" folder. You must store all GIS data and ArcGIS files in the "Coursework" folder in order to allow the module examiner to access your original files when necessary.
You should design your own spatial analysis by considering: what the purpose of your spatial analysis is; why it is important; and how the analysis can be conducted. The outcome of your spatial analysis can include, but is not limited to:
(1) a choropleth map that shows the spatial variation of a socio-demographic or environmental attribute (e.g., the elderly, greenspace, etc.)
of your city at different spatial scales (e.g., districts, sub-districts, etc.);
(2) a map that illustrates suitable locations for a new public facility (e.g., fire stations, schools, talent apartments, etc.);
(3) a measurement of spatial patterns using points of interest and spatial autocorrelation analysis; and so on.
You must also pay attention to the map design in order to communicate the analysis results to your audience, the policy makers.

Assessment Criteria
There are three marking criteria for the assignment:
1. Research (45%): understanding of knowledge; depth, richness and accuracy of information; analysis methods; quality of analysis; skills of GIS software in handling spatial data; recommendations.
2. Presentation (40%): effective graphic design of map production (scale; orientation; title; border; legend; visual hierarchy); quality of the report presentation; written communication skills; outline; sequencing; linkage; ending; referencing.
3. Appendices (15%): production of the spatial analysis outcomes; description and discussion of the results of the spatial analysis.
Total: 100%
Rating scales conform to the XJTLU Marking Descriptor.
ASSIGNMENT STAT433/833
Assignment 5: Due December 19th at 10am

1 | Time Reversal [6]
Suppose that W is a standard Wiener process (SWP) and that M_t = sup_{0≤s≤t} W_s.
a) [3] Show that V_t = W_{1−t} − W_1 is a SWP on the time interval [0, 1].
b) [3] Show that M_t − W_t and M_t have the same distribution for each t > 0.

2 | Arcsin Law [6]
Show that the probability that a standard Wiener process has no zeros on the interval (t, 1) is (2/π) arcsin √t, and that the probability density function for the time of the last zero before time 1 is f(s) = 1/(π√(s(1 − s))) for 0 < s < 1.

3 | Inverse Gaussians [12]
Suppose that W is a standard Wiener process, that a > b > c > 0, and n ∈ N. Let T_c = inf{t ≥ 0 : W_t = c} and define T_a, T_b, T_1, T_{b−a} and T_n similarly.
a) [3] Find the Laplace transform and expected value of T_c: E exp(−λT_c) and E T_c.
b) [3] Is (T_a − T_b) ⊥⊥ F_{T_c}? How does the distribution of T_a − T_b compare with that of T_{b−a}?
c) [3] Let λ > 0. How does the joint distribution of (T_a, T_b, T_c) compare to the joint distribution of (λT_{a/√λ}, λT_{b/√λ}, λT_{c/√λ})?
d) [3] Suppose that the T^{(i)} are IID from the same distribution as T_1. What is the distribution of T^{(1)} + ... + T^{(n)}? What is the distribution of T_n? What is the distribution of n²T_1?

4 | Recurrence and Transience [10]
a) [1] Fully describe the set of harmonic functions on an open interval (a, b) of R = R¹ (d = 1!).
b) [2] For a SWP in d = 1, and for 0 < a < x < b, compute P_x(T_a < T_b).
c) [2] Does a SWP in d = 1 revisit neighbourhoods of the origin with probability 1 from any initial location?
d) [1] Does a SWP in d = 1 revisit the origin with probability 1 from any initial location?
e) [2] Is the set of times at which a SWP in d = 1 visits the origin unbounded?
f) [1] Would you describe the SWP in d = 1 as recurrent or transient?
g) [1] How does the recurrence or transience of a SWP, as a function of the dimension, compare to component-wise random walks on Z^d?
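Question 3a) is a standard hitting-time computation; one common route (a sketch, not the only valid argument) applies optional stopping to the exponential martingale:

```latex
% Sketch for 3a): stop the exponential martingale
% M_t = \exp(\theta W_t - \tfrac{1}{2}\theta^2 t), \theta > 0, at T_c.
% Since W_{T_c} = c, optional stopping gives
\[
  1 = \mathbb{E}\!\left[e^{\theta W_{T_c} - \frac{1}{2}\theta^2 T_c}\right]
    = e^{\theta c}\,\mathbb{E}\!\left[e^{-\frac{1}{2}\theta^2 T_c}\right],
\]
% so setting \lambda = \theta^2/2, i.e. \theta = \sqrt{2\lambda}:
\[
  \mathbb{E}\, e^{-\lambda T_c} = e^{-c\sqrt{2\lambda}} .
\]
% Differentiating in \lambda and letting \lambda \downarrow 0 shows
% \mathbb{E}\, T_c = +\infty: the derivative -\tfrac{c}{\sqrt{2\lambda}}
% e^{-c\sqrt{2\lambda}} diverges as \lambda \downarrow 0.
```

Justifying the optional stopping step (e.g. via bounded stopping times and monotone convergence) is part of what the question is after.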
5 | Total and Quadratic Variation [16]
A partition of [0, T] of cardinality n is a finite collection of points π = (t_0, t_1, ..., t_n) such that 0 = t_0 < t_1 < ... < t_n = T. The size of a partition π = (t_0, t_1, ..., t_n) is ∥π∥ = max_{k∈{1,...,n}} (t_k − t_{k−1}). The collection of all partitions of [0, T] is denoted Π.
For a function H : Π → R, define
  lim sup_{∥π∥→0} H(π) = lim_{δ↓0} sup{H(π) : ∥π∥ ≤ δ},  lim inf_{∥π∥→0} H(π) = lim_{δ↓0} inf{H(π) : ∥π∥ ≤ δ}.
If lim sup_{∥π∥→0} H(π) = lim inf_{∥π∥→0} H(π), we say that lim_{∥π∥→0} H(π) exists and is equal to that common value.
The total variation of a function f can be defined as TV_T(f) = lim_{∥π∥→0} sadf(π) for
  sadf(π) = Σ_{k=1}^{n} |f(t_k) − f(t_{k−1})|
(sadf stands for "sum-absolute-df", and "df" refers to a change in f). The quadratic variation of a function f can be defined as [f]_T = lim_{∥π∥→0} s(df)²(π) for
  s(df)²(π) = Σ_{k=1}^{n} (f(t_k) − f(t_{k−1}))²
(s(df)² stands for "sum-of-the-squared-dfs").
a) [2] Show that if f is differentiable on [0, T] and f′ is continuous on that interval, then for any partition 0 = t_0 < t_1 < ... < t_n = T it holds that sadf(π) ≤ ∫_0^T |f′(t)| dt.
b) [2] Under the assumptions above, argue that for any ϵ > 0 there is a partition for which this is within ϵ of an equality.
c) [3] Under the assumptions above, show that the quadratic variation of f on [0, T], [f]_T, is 0.
d) [3] For W_t a standard Wiener process, let Y_n = sadW(π_n), where π_n is the partition of [0, T] into n equal intervals. Show that E Y_n is unbounded in n and that Var(Y_n) is bounded in n.
e) [3] Show that lim sup_{∥π∥→0} sadW(π) = +∞ with probability 1.
f) [3] Define the quadratic co-variation of functions f and g by ⟨f, g⟩_T = lim_{∥π∥→0} s(df)(dg)(π), where
  s(df)(dg)(π) = Σ_{k=1}^{n} (f(t_k) − f(t_{k−1}))(g(t_k) − g(t_{k−1})).
Show that if the quadratic variations of f, g and f + g exist, then ⟨f, g⟩_T exists and that
  ⟨f, g⟩_T = ([f + g]_T − [f]_T − [g]_T)/2.
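The phenomenon behind parts c) through e) can be seen numerically: along refining partitions of [0, 1], the sum of |dW| blows up while the sum of (dW)² settles near T. This is only an illustration (dyadic partitions of a single simulated path, with an assumed seed), not a proof.

```python
import numpy as np

# Simulate one Wiener path on a fine grid, then compute the
# total-variation sum sadW and quadratic-variation sum s(dW)^2
# over successively finer sub-partitions.
rng = np.random.default_rng(0)
T = 1.0
n = 2 ** 18                                   # finest grid size
dW = rng.normal(0.0, np.sqrt(T / n), size=n)  # increments on the finest grid
W = np.concatenate([[0.0], np.cumsum(dW)])    # the path itself

for k in (2 ** 6, 2 ** 12, 2 ** 18):          # coarser -> finer partitions
    incr = np.diff(W[:: n // k])              # increments over k equal intervals
    sad = np.abs(incr).sum()                  # "sum-absolute-dW"
    sq = (incr ** 2).sum()                    # "sum-of-the-squared-dWs"
    print(f"k={k:6d}  sadW={sad:10.2f}  s(dW)^2={sq:.3f}")
```

The printed sadW column grows roughly like √k (matching E Y_n ~ √(2Tn/π) in part d), while s(dW)² hovers near T = 1.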
Foundations of Applied Mathematics
MATH 4700 – Fall 2024
Practice Final

1 Hungry, Hungry Lions (20 points)
Suppose some ecosystem contains some lions and a rather small number of zebras. Zebras eat grass, which is in abundant supply, while there is nothing for the lions to eat except for zebras. Here you will be asked to model the interactions between three populations:
• well-fed lions,
• hungry lions,
• zebras.
Well-fed lions are simply those that have recently had occasion to eat zebra, while hungry lions are those which have gone without zebra meat for a while. Assume that, in the absence of any external influences, each of the subpopulations would have a constant per capita birth rate and death rate. Of course, these birth rates and death rates will be different for the different populations.
a. (7 points) Define variables and parameters for a population model to describe this ecosystem, indicating dimensions. Preferably, define your constant parameters so that they are all positive, but if you have a negative parameter, state explicitly that it is negative.
b. (3 points) What kinds of inequalities are suggested by the information given above? (No assumption is made that this is a long-lived, stable ecosystem; the species could be either flourishing or extinct.)
c. (10 points) Write down a set of differential equations to model the dynamics of the populations of the ecosystem.

2 Perturbation Theory (30 points)
Consider the following nondimensional system of equations, where ε is a small nondimensional parameter (|ε| ≪ 1):
a. (10 points) Develop a regular perturbation theory to construct a good approximation to the solution of the initial value problem. You should explicitly solve for the leading order nontrivial main term, that is, the most important term in the perturbation expansion for x and y which is not zero.
You should also obtain the differential equations, including initial conditions, for the next most important nonzero term in the perturbation series expansion for the solutions x and y. You do not need to solve these differential equations; just convince yourself that the solutions are not zero. If a solution is zero, it means you should derive the differential equation for the next term in the perturbation series expansion of the solution, until you get a differential equation which has a nonzero solution. Your answer is complete if you can write an approximation for each of x(t) and y(t) as a sum of:
• a nonzero leading order term for which you provide an explicit formula, plus
• a nonzero correction term for which you provide a differential equation with initial conditions, plus
• an estimate of the error of your approximation.
b. (10 points) Sketch the solution for (x(t), y(t)) on the phase plane over a time long enough that x varies significantly. Note that you do not need to obtain a full analytical approximation to sketch the solution graphically.
c. (10 points) Use a suitable approximation method to obtain an analytical description of how x(t) varies with time, to leading order, as it moves significantly away from its initial value.

3 Statistical Analysis of SEIR Model (20 points plus bonus points)
In class, we discussed the SEIR model for susceptible (S), exposed (E), infective (I), and recovered (R) populations, where N = S + E + I + R is the total population. There are many kinetic coefficients (all positive) here whose meanings aren't important for the problem. The initial conditions are: S(0) = 500, E(0) = 0, I(0) = 27, and R(0) = 0.
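The excerpt does not reproduce the SEIR equations themselves, so the following sketch assumes the standard form dS/dt = −βSI/N, dE/dt = βSI/N − E/τ_E, dI/dt = E/τ_E − I/τ_I, dR/dt = I/τ_I; the latency time τ_E is a hypothetical value not given in the problem, while β, τ_I and the initial conditions are as stated later in the problem.

```python
# Forward-Euler integration of a standard SEIR system. The right-hand sides
# and tau_E below are ASSUMPTIONS (the original equations are not reproduced
# in this excerpt); beta, tau_I and the initial conditions come from the
# problem statement.
beta, tau_I, tau_E = 2.25, 14.0, 5.0          # tau_E: hypothetical latency time
S, E, I, R = 500.0, 0.0, 27.0, 0.0            # stated initial conditions
N = S + E + I + R
dt, t_end = 0.01, 30.0                        # the observation window 0 <= t <= 30

for _ in range(int(t_end / dt)):
    new_exposed   = beta * S * I / N          # S -> E flux
    new_infective = E / tau_E                 # E -> I flux
    new_recovered = I / tau_I                 # I -> R flux
    S += dt * (-new_exposed)
    E += dt * (new_exposed - new_infective)
    I += dt * (new_infective - new_recovered)
    R += dt * new_recovered

print(f"S={S:.1f} E={E:.1f} I={I:.1f} R={R:.1f}  (total {S + E + I + R:.1f})")
```

Because the four fluxes cancel pairwise, the scheme conserves N exactly, which is a useful sanity check on any SEIR implementation.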
Suppose we take the following two parameters as uncertain: • the average time of infectiousness τI, which is given a prior distribution with a symmetric triangular shape on the values 10 ≤ τI ≤ 18 with the peak value (mode) at τI = 14, • the infectivity parameter β, which is given a normal prior distribution with a peak at µ = 2.25 and a width σ = 1, truncated to reject negative values. The ground truth values are τI = 14 and β = 2.25. a. (5 points) Here are the predictions of the dynamical model based purely on prior beliefs, using the same convention as in class (solid line for ground truth, dashed line for the mean of the simulations sampled from the prior, with individual samples from the prior parameter distribution plotted in faint lines). The dashed lines and solid lines are essentially coincident, meaning the mean of the prior predictions is very close to the ground truth. What might explain this? Figure 1: Prior predictions of SEIR model. b. (5 points) The infective population I(t) and recovered population R(t) were taken as observables, with random counting errors of magnitudes σI = 75 and σR = 20, respectively. Furthermore, the recovered population was observed with a bias of −10 on average. The data scientist's observation model was the same as the true observation model. The posterior distribution obtained from observations at 100 time points equidistributed over the time interval 0 ≤ t ≤ 30 can be summarized (via the reported posterior mean and standard deviation): • τI = 14 ± 1, • β = 2.42 ± 0.22. How do the posterior and prior distributions of the parameters τI and β compare in terms of their accuracy and precision? Explain. c. 
(10 points plus bonus points) Here are the predictions of the dynamical model based purely on the posterior distribution of the parameters, using the same convention as in class (solid line for ground truth, dashed line for the mean of the simulations sampled from the posterior, with individual samples from the posterior parameter distribution plotted in faint lines). The observed data are plotted as discrete circles. As not much is happening after the end of the data collection period, let's focus on understanding the behavior of the various populations during the time interval 0 ≤ t ≤ 30. How did the data collected affect the inference of how the various populations behaved on the time interval 0 ≤ t ≤ 30, relative to the prior beliefs? The bonus points are for good thoughts on why the data affected this inference. Feel free to continue your answer on the next page. Figure 2: Posterior predictions of SEIR model. d. (5 bonus points) What would be the point of even using a model to predict what happened over a time interval where we already have collected observational data? 4 Influential Mathematics (25 points plus 5 bonus points) Imagine a nation (Lineland) which is represented spatially in a one-dimensional way as an interval [0, L], where L = 3000 km is the distance between the east and the west coast. Each point x represents the north-south cross-section of Lineland which is a distance x from the west coast. a. (7 points) Write a partial differential equation model with initial conditions to describe the evolution of the number density ρ(x, t) of individuals in Lineland as a function of east-west location x and time t subject to the following requirements: • The population density of the nation starts as (2 + cos(2πx/L)) × 10^5 people/km, i.e., there is a higher population density near the coasts than the midlands. 
• People are moving and traveling, with individual variations but with a bias to move away from the coasts (I would have said the opposite 5 years ago). But no one enters or leaves Lineland, and we neglect births and deaths on the time scale of interest. You should write down a specific partial differential equation, but it can involve positive parameters that you do not specify numerical values for. Explain as precisely as you can the meaning of any parameters you introduce. And do use the quantitative specifications from the problem in your model. b. (5 points) Sketch the initial population density ρ(x, 0) and an approximate representation of ρ(x, t) after t = 1 year. c. (8 points) Consider a social practice (say a certain way of wearing clothing or a way of verbal communication) which people find interesting to adopt when they see others perform it in real life, but which for some reason is not influential via social media. Write a partial differential equation model to describe the number density ρS(x, t) of individuals in Lineland adopting the new social practice as a function of east-west location x and time t which has the following features: • At the present time t = 0, the social practice has been adopted by 5% of the population living within 100 km of the west coast, and by nobody anywhere else. • The motion of people in Lineland from part a is not correlated with (related to) whether or not the individuals adopt the social practice. • A given individual who has not adopted the new social practice is more likely to adopt this practice when they see others in their (real-world) vicinity doing the new social practice. • People who are doing the new social practice find it lame and stop doing it after about 2 months on average. You should write down a specific partial differential equation, but it can involve positive parameters that you do not specify numerical values for. Explain as precisely as you can the meaning of any parameters you introduce. 
And do use the quantitative specifications from the problem in your model. d. (5 bonus points) How would your model in part c change if the social practice could be spread by social media? e. (5 points) How would you use the solution to your model from part c to count the number of people adopting the new social practice in the middle third of the country one year from now? 5 Standing, Walking, or Running (30 points plus 15 bonus points) Consider a random walker moving on a one-dimensional lattice with spacing ∆x and time step ∆t. At each time step, the walker independently makes one of three choices with the indicated probabilities: • stands still with probability ps, • moves one lattice site to the left with probability pℓ, or • moves two lattice sites to the left with probability p2. The random walker starts at position m = 0. Define w(m, N) to be the probability that the random walker is at position m after N time steps. a. (10 points) Derive a master equation to relate w(m, N + 1) to its value at the previous time step N, for N ≥ 0. Explain your reasoning. b. (10 points) Now suppose that the lattice sites are separated by a small distance ∆x and the time steps take a short time ∆t, and that Np random walkers are moving independently according to the above rules, all starting from lattice site m = 0 at time step N = 0. Take a continuum limit of the master equation to obtain a macroscale partial differential equation for the number density ρ(x, t) of the Np random walkers, each moving independently according to the model stated above. Relate all coefficients in the partial differential equation to the microscale parameters. Assume that ps, pℓ, and p2 are not close to the values of 0 or 1. c. (10 points) Sketch the number density ρ(x, t) as a function of x for t = 0, t = 1, and t = 10. d. (15 bonus points) What other macroscale equations can be achieved by taking ps, pℓ, and/or p2 to be O(∆x) close to 0 or 1 as the continuum limit is taken?
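For intuition on the walker in Problem 5, its rules can be simulated directly. The sketch below is illustrative only, not part of the exam: the function name, the parameter values (ps = 0.5, pℓ = 0.3, p2 = 0.2), and the step and walker counts are arbitrary choices, and p2 is taken to be 1 − ps − pℓ so the three probabilities sum to one.

```python
import random

def simulate_walker(ps, pl, steps, rng):
    """Simulate one walker on the integer lattice, starting at m = 0.

    Each step: stand still (prob ps), move one site left (prob pl),
    or move two sites left (prob 1 - ps - pl).
    Returns the final lattice position m.
    """
    m = 0
    for _ in range(steps):
        u = rng.random()
        if u < ps:
            pass        # stand still
        elif u < ps + pl:
            m -= 1      # one site left
        else:
            m -= 2      # two sites left
    return m

rng = random.Random(0)  # fixed seed for reproducibility
N = 1000                # time steps per walker
positions = [simulate_walker(0.5, 0.3, N, rng) for _ in range(2000)]
mean_m = sum(positions) / len(positions)
# With these choices the mean displacement per step is -(pl + 2*p2) = -0.7,
# so the sample mean over many walkers should sit near -700 after 1000 steps.
print(mean_m)
```

A histogram of `positions` also previews part c: the density drifts left at a constant speed while spreading diffusively, consistent with an advection-diffusion continuum limit.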
Biology 200 Sample Final Exam Question 1. (____/ 7 marks) Chloroplasts, mitochondria, and nuclei are three intracellular organelles that are surrounded by two biological membranes. Although they possess some structural similarities, they have many differences in function and biogenesis. For statements 1 to 7 in the table below, indicate whether each applies to: A. Chloroplasts only B. Mitochondria only C. Nuclei only D. Both chloroplasts and mitochondria, but not nuclei E. All three organelles STATEMENT Fill letter A-E 1) These organelles contain remnant genomes carried on circular chromosomes, which reflect their evolutionary origins. 2) mRNA is transported through multi-protein complexes in the membranes surrounding this organelle to the cytoplasm for translation. 3) This organelle contains euchromatin and heterochromatin. 4) The outer envelope membrane is continuous with the endoplasmic reticulum. 5) The outer envelope membrane, but not the inner envelope membrane, is porous and allows molecules to diffuse across freely. 6) The inner envelope membrane of this organelle is highly folded, increasing its surface area. 7) This organelle contains an additional internal membrane system, in addition to the two membranes surrounding the organelle. Question 2. (____/ 12 marks) For each of the observations below, write a statement that could explain these results, based on what the data show and on your knowledge of cellular structure/function (3 marks each). A. If taxol is added to cells at metaphase, the chromosomes do not move towards the spindle poles in anaphase. B. Electron micrographs show that mitochondria in heart muscle have a much higher density of cristae than mitochondria in skin cells. C. Microtubules formed in vitro from tubulin that is bound to a non-hydrolyzable form of GTP were found to be exceptionally stable. D. Some membrane proteins can be readily extracted with 1M NaCl, whereas others require the use of a detergent. Question 3. 
(____/ 15 marks) On the domain maps provided, draw and label all targeting sequences that you predict would be found in proteins A–E. If you predict no targeting sequence is required, clearly indicate this in your answer – a blank will NOT be considered for marks. (2 marks each): A. Soluble ER resident B. Rubisco Small Subunit 2B – a nuclear-encoded chloroplast stromal protein C. Transcription factor D. α-tubulin E. Glycophorin A (plasma membrane protein), 1 transmembrane domain, N-terminus outside the cell. Match each of the proteins mentioned above with its cellular location, as identified in each micrograph (arrow) below. (1 mark each): Micrograph Protein (A-E) 1 2 3 4 5 Question 4 (____/ 9 marks) The epidermal growth factor receptor (EGFR) is a transmembrane protein localized to the plasma membrane in skin epithelial cells. When EGFR is activated by binding to the extracellular peptide EGF, it is internalized via vesicles. Ultimately, this internalization will lead to changes in cell growth and division. Researchers studied the relationship between EGFR internalization and cell growth by comparing wild type cells to cells with a non-functional mutant form of clathrin. A. The first experiment examined the amount of EGFR found in early endosomes in the wild type and mutant cells, in the absence (-EGF) or presence (+EGF) of the extracellular peptide. Describe the results shown, and explain what you can conclude based on these data. (3 marks) B. In the next experiment, they examined the effect of EGFR internalization on cell cycle proteins, to see whether there were any changes. The SDS-PAGE data below show the results for Mitotic Cyclin (M-Cyclin), a protein known to promote entry into mitosis, in the absence or presence of EGF. Describe the results, and explain what you can conclude based on the gel shown. (2 marks). C. 
The researchers discovered that after the wild-type cells are exposed to EGF, a transcription factor called ERK1/2 moves into the nucleus. On the other hand, ERK1/2 remained localized in the cytoplasm when the clathrin mutants were exposed to EGF. Considering this and the rest of the data in this question, propose a model describing how EGF-binding to EGFR might affect cell cycle progression. In your explanation, include the role of EGF and its receptor, clathrin, the transcription factor ERK1/2 and Mitotic Cyclin. (4 marks) Question 5 (____/ 10 marks) Actin Capping Protein (ACP) binds to the plus end of actin filaments, preventing the actin filaments from gaining or losing monomers. Its activity is blocked by regulatory proteins such as V-1. In this experiment, Takeda et al. (2010) examine the role of V-1 and ACP in the regulation of actin polymerization. In Panel A, the ratio of F-actin (actin filaments) to G-actin (actin monomers) was measured for WT (normal) and V1 (over-expressing V-1) cells. In Panel B, actin filaments and nuclei were stained with fluorescent dyes and cells were examined through fluorescence and light microscopy (red = actin; blue = DNA). Assume that both cell lines express the same amount of ACP. A. Compare the bars in the chart in Panel A. What do they tell you about the amount of actin in each cell type? How might the change in expression levels of V-1 be affecting ACP's ability to function? (3 marks). B. Panel B shows paired brightfield and fluorescence microscopy images of each of the cell types measured in Panel A. Compare the shape of the WT cells to the V1 cells, and explain how the actin shown in the fluorescence images is contributing to that shape. (4 marks) C. Based on the data shown in Panels A and B, how might V-1 be influencing cell shape in the overexpressing cells? (3 marks) Question 6 (____/ 8 Marks) You generated a yeast strain with a temperature-sensitive mutated form 
of M-cyclin that is unable to fold stably at a non-permissive temperature (37°C). To assess its role in the cell cycle, you grow these mutant cells in a test tube at the permissive temperature (25°C) then shift them to 37°C. A. Explain what happens to M-cyclin concentration as the cell progresses through interphase at the permissive temperature. How does this compare at the non-permissive temperature? (2 marks) B. What effect would this mutation have on the regulation of the cell cycle at the non-permissive temperature (2 marks)? C. You fluorescently labelled the DNA of your mutant yeast cells and used fluorescence activated cell sorting (FACS) to analyze the effect of this mutation on the cell cycle. Note that these cells complete a cell cycle in 90 minutes, where interphase lasts for approximately 70 minutes. In your experiment, you have 2 tubes (labeled Tube 1 and Tube 2). In Tube 1, you incubate the yeast cells at the non-permissive temperature, 37°C, for 2 hours. In Tube 2, you incubate the cells for 2 hours at the non-permissive temperature, followed by 30 minutes at the permissive temperature. In the appropriate spaces below, draw the expected FACS readout from each tube and label the graphs with the relevant phase of the cell cycle expected. Question 7 (____/ 9 Marks) Nocodazole is used as an anti-cancer drug that binds to α/β tubulin subunits inside cells. Shown below are fluorescence images of cells in the presence or absence of nocodazole, by fluorescence-tagged immunolabeling for α-tubulin. A. Describe the fluorescence pattern seen in Panels A and B. What does this tell us about the effect of nocodazole on microtubules? (3 marks) Describe the Fluorescence The Effect of Nocodazole Panel A Panel B B. Based on your knowledge of microtubule polymerization at the plus end, propose a model for how nocodazole might be leading to the results shown. (2 marks) C. 
Predict what would happen to the cell in Panel B if the nocodazole was washed away and the cell was allowed to recover. (1 mark) D. Predict the impact of nocodazole treatment on constitutive secretion in these cells. (1 mark) Explain why. (1 mark) E. Propose a role for nocodazole in M-phase which makes it useful as an anti-cancer drug. (1 mark) Question 8 (____/ 9 Marks) During normal cellular function, protein activity must be strictly controlled such that cellular processes can be turned 'off' or 'on' quickly. One strategy for such control involves chemical modification of proteins, through addition or removal of covalently bonded chemical groups. Name THREE distinct examples/proteins by which protein activity can be turned "on" or "off" through three distinct types of chemical modification. For each example, describe: 1) the general mechanism, i.e. the type of chemical modification (1 mark), 2) an example of a protein whose function is regulated by this mechanism (1 mark), 3) how the protein function is turned "on" and "off" (3 marks total per example). Note: All examples and explanations must come from BIOL 200 course material.
Assessment: LM Food Safety Management Systems Assessment For this module assessment you will be working in groups of up to 5 students. The assessment is based on the production of an apple puree baby food. In the first assessment, each member of the team must write about a different prerequisite programme and also provide a description of the programme that would relate to the production of the baby food. In the second part, as a group you must produce a HACCP plan for the apple puree baby food. More details of the assessment can be found below. Food Safety Management Systems Assessment 1 (Individual assessment) 1. Explain what prerequisite programmes (PRPs) are and the role they play in a successful food safety management system. Also, discuss how the Codex Alimentarius 'General Principles of Food Hygiene' and Regulation 852/2004 provide the flexibility to be able to apply to a wide variety of food businesses. (25 marks) 2. Explain the principles behind the design of one of the following PRPs. NOTE THAT EACH MEMBER OF YOUR GROUP MUST SELECT A DIFFERENT PRP: i) an effective pest control system, or ii) an appropriate cleaning programme, or iii) establishing an appropriate shelf-life, or iv) supplier management, or v) equipment and premises design (50 marks) 3. Provide a description of the chosen PRP that could be used in the baby food factory making apple puree. (25 marks) Marks will be awarded for: 1. Relevance – does it answer the question posed? 2. Understanding – is it accurate and does it demonstrate an understanding of the subject? 3. Presentation and organisation – is it well-written, structured and easy to read? 4. Referencing – correctly cited and referenced, paraphrasing and absence of plagiarism. Note that generative AI must not be used to answer the assessment, although it can be used to scope out the topic and develop some initial ideas. You can use it as a supplement to enhance your understanding, rather than as a substitute for your own work. 
These tools should not be used to write your assignment and/or to clarify and develop your arguments. You should not use them to translate large amounts of text into English. Here is a link to the University's guidance on AI. This assessment is worth 50% of the final mark for this module. Word count: 2500. Penalties will be applied where students exceed the word count by more than 10%: up to 2750 words, no penalty; 2751-3000 words, a 10% penalty; over 3000 words, only the first 3000 words will be marked and a 10% penalty applied. The following is not counted towards the word count: 1. The question posed, for example if you use these as headings. 2. Reference lists and appendices you provide at the end of the work. 3. Tables and figures, provided those tables do not contain extensive text that aims to answer the question. Food Safety Management Systems Assessment 2 (Group work) The second assessment is a piece of group work. You are required to produce a HACCP plan for the production of apple puree as a baby food. The manufacturer also wants to fortify the food with vitamins C and D and folic acid. The group can choose the type of packaging and whether the product is, for example, shelf stable or needs to be refrigerated. Your work, however, should be consistent with whatever decision you make. It will require you to research how this product is made and any significant chemical, physical or microbiological hazards associated with it. You should also review any relevant legislation related to the product, such as the Processed Cereal-based Foods and Baby Foods for Infants and Young Children (England) Regulations 2003. The scope of the HACCP plan should include microbiological, physical and chemical hazards. The plan should include: 1. A description of the most important hazards associated with the production of this product and how they would be controlled. 2. A description of the prerequisite programmes that you would put in place; these should relate to the product being produced. 3. 
Details of the preliminary steps before the HACCP principles are applied. 4. Details of the application of the HACCP principles to identify the CCPs and control procedures. 5. An indication of what verification activities and documents are needed. 6. Evidence that the plan is valid and will control the hazards. Although the final mark will be applied to all group members, students will be expected to identify their individual contributions in the submitted plan. If unequal contributions are revealed by peer evaluation, the assessor will investigate the situation. If necessary, individuals will be excluded from the group mark and will have to submit an individual piece. All group members must complete the feedback form to confirm participation of team members. Submission date: Monday 20th January 2025. This assessment is worth 50% of the final mark for this module. Word count: 8000 maximum.
CS152 Project 6: Searching and Optimization Now that we have a measure of the impact of darting on the elephant population, we can proceed to the next step: determining the optimal darting probability. To do this we are going to use the binary search strategy and "bounce" around in the search space of possible darting probabilities. Of course, since there is no exact solution to this problem, we will also be specifying a tolerance to help us stop the search. Once we get this working, we will expand the program so that it can handle optimizing multiple parameters. Tasks T1. Write a function to optimize the percent darted Create a new file called optimize.py. Have the file import the sys, elephant, and random packages. Then create a function called optimize with the following algorithm. # Executes a search to bring the result of the function optfunc to zero. # min: minimum parameter value to search # max: maximum parameter value to search # optfunc: function to optimize # parameters: optional parameter list to pass to optfunc # tolerance: how close to zero to get before terminating the search # maxIterations: how many iterations to run before terminating the search # verbose: whether to print lots of information or not def optimize(min, max, optfunc, parameters=None, tolerance=0.001, maxIterations=20, verbose=False): The optimize function is very similar to the binary search function you wrote in the lab. 1. Start by assigning to a variable done the value False. 2. Start a loop that continues while done is equal to False. 3. Inside the loop, assign to testValue the average of max and min. This is not (should not be) an integer calculation. If verbose is True, print out testValue. 4. Assign to result the return value of calling optfunc with testValue and parameters as the arguments. If verbose is True, print out the result value. 5. If the result is positive, assign to max the value of testValue. 
Else if the result is negative, assign to min the value of testValue. Else, assign to done the value True. 6. If max - min is less than the tolerance value, then assign to done the value True. 7. Decrement maxIterations. If maxIterations is less than or equal to zero, then set done to True. 8. Outside the loop, return testValue. Note: Python allows for passing the names of functions as parameters. For example, if target is the name of the function, you can pass target as a parameter and then use that parameter to call the target function. See below. To test your optimize function, copy the following code into your optimize.py and run it. As noted in the comments, try making tolerance smaller and smaller. You should see that the result gets closer and closer to the target value. # A function that returns x - target def target(x, pars): return x - 0.73542618 # you could also use return x - 1.0 to get close to 1.0 # Tests the binary search using a simple target function. # Try changing the tolerance to see how that affects the search. def testTarget(): res = optimize(0.0, 1.0, target, tolerance=0.01, verbose=True) print(res) if __name__ == "__main__": testTarget() T2. Test the optimize function with your elephantSim The next step is to test the optimize function with your elephantSim function. Create a testEsim function (similar to testTarget above) that calls optimize with a min value of 0.0, a max value of 0.5, and passes it elephant.elephantSim as the target function. You probably want to set verbose=True as well. As with the testTarget function, assign the return value to a variable and then print the variable. At the bottom of your code, change testTarget() to testEsim() then run optimize.py. Does your optimize function find a value close to 0.43 for the percent darted? T3. Automate varying a simulation parameter The next step is to automate the process of evaluating the effects of changing a simulation parameter across a range of values. 
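For reference, the T1 steps assembled into working code look roughly like the sketch below. This is one possible implementation, not the official solution: it uses the toy target function rather than the elephant module, and it keeps the handout's parameter names min and max even though they shadow Python built-ins.

```python
def optimize(min, max, optfunc, parameters=None, tolerance=0.001,
             maxIterations=20, verbose=False):
    """Binary search for a value in [min, max] that drives optfunc to zero.

    Assumes optfunc is negative below its root and positive above it.
    """
    done = False
    testValue = (min + max) / 2.0
    while not done:
        testValue = (min + max) / 2.0   # float midpoint, not integer division
        if verbose:
            print(testValue)
        result = optfunc(testValue, parameters)
        if verbose:
            print(result)
        if result > 0:
            max = testValue             # overshot: search the lower half
        elif result < 0:
            min = testValue             # undershot: search the upper half
        else:
            done = True                 # hit the root exactly
        if max - min < tolerance:
            done = True                 # interval small enough to stop
        maxIterations -= 1
        if maxIterations <= 0:
            done = True                 # safety cap on iterations
    return testValue

def target(x, pars):
    # Toy function from T1: zero at x = 0.73542618.
    return x - 0.73542618

res = optimize(0.0, 1.0, target, tolerance=0.01)
print(res)  # within 0.01 of 0.73542618
```

For T2 the same call shape applies with your simulation, e.g. optimize(0.0, 0.5, elephant.elephantSim, simParameters), where simParameters is whatever parameter list your elephantSim expects.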
This function will let us discover, for example, the effect on the dart probability of changing the calfSurvival rate from 80% to 90% in steps of 1%. The function definition is given below. # Evaluates the effects of the selected parameter on the dart percentage # whichParameter: the index of the parameter to test # testmin: the minimum value to test # testmax: the maximum value to test # teststep: the step between parameter values to test # defaults: default parameters to use (default value of None) def evalParameterEffect(whichParameter, testmin, testmax, teststep, defaults=None, verbose=False): # if defaults is None, assign to simParameters the result of calling elephant.defaultParameters. # else, assign to simParameters a copy of defaults (e.g. simParameters = defaults[:]) # create an empty list (e.g. results) to hold the results if verbose: print("Evaluating parameter %d from %.3f to %.3f with step %.3f" % (whichParameter, testmin, testmax, teststep)) # assign to t the value testmin # while t is less than testmax # assign to the whichParameter element of simParameters (e.g. simParameters[whichParameter]) the value t # assign to percDart the result of calling optimize with the appropriate arguments, including simParameters # append to results the tuple (t, percDart) if verbose: print("%8.3f\t%8.3f" % (t, percDart)) # increment t by the value teststep if verbose: print("Terminating") # return the list of results Test your evalParameterEffect function by modifying the top-level code at the bottom of your file to be the following. if __name__ == "__main__": evalParameterEffect(elephant.IDXProbAdultSurvival, 0.98, 1.0, 0.001, verbose=True) Example output: Evaluating parameter 5 from 0.980 to 1.000 with step 0.001 0.980 0.317 ... # this will keep stepping through from .981 to .998 0.999 0.457 Terminating T4. 
Evaluate the effects of varying other parameters Your final task is to make the following evaluations, showing the effect on the dart percentage of the following parameter sweeps. Required Elements: Make a table or graph (or both) for each case. These five items should go in your report. 1. Vary the adult survival probability from 0.98 to 1.0 in steps of 0.001 (example above). 2. Vary the calf survival probability from 0.80 to 0.90 in steps of 0.01. 3. Vary the senior survival probability from 0.1 to 0.5 in steps of 0.05. 4. Vary the calving interval from 3.0 to 3.4 in steps of 0.05. 5. Vary the max age from 56 to 66 in steps of 2. While debugging the automation process, you might want to reduce the carrying capacity even further, to 200 or 500 elephants. Only run with 1000 elephants once everything works properly. This will take a while to run with 1000 elephants. Be patient. Example plot for first item: Required Analysis Interpretation: For full credit, the data outputs (items 1-5 above) must be interpreted for the reader in clear sentences within the reflection writeup (below). Required Follow-up Reflection Questions: 1. What does an import statement do? 2. What is binary search? How does it work differently than a linear search algorithm? 3. Why is binary search faster than a linear search (e.g. going page by page to find a word in a dictionary)? 4. For full credit, the data must be interpreted for the reader in clear sentences within the writeup. What do these numbers mean? Why does the darting percentage go up/down when you vary each parameter? How should Kruger National Park use this information? 5. How might you apply this type of optimized search algorithm approach to something you are really interested in? Extensions Extensions are your opportunity to customize your project, learn something else of interest to you, and improve your grade. The following are some suggested extensions, but you are free to choose your own. 
Be sure to describe any extensions you complete in your report. ● Automate the graphing process using matplotlib or another graphing package of your choice. ● Have your program write out proper CSV files, with a header line and appropriate commas, for the full process. ● How much variation is there in the average total population for a 200-year elephant simulation across different runs? How stable is the estimate generated by doing 5 simulation runs? Calculating the standard deviation of the population sizes is a reasonable indicator of spread. ● Check out the os package in the Python documentation (import os). What could you do with the os.system function to automate your simulations? ● Uber-extension: explore varying two parameters simultaneously. For example, if you are evaluating maximum age from 56 to 64, combine it with a set of values for senior survival rates. If you have five values for maximum age and five values for senior survival rates, there would be 25 unique parameter combinations. The plot of the probability of darting for a stable population would be a 3D graph with horizontal axes for the two parameter values and a vertical axis indicating the probability of darting. ● Does the carrying capacity have a significant effect on the probability of darting? Is there a carrying capacity below which the probability of darting goes to zero? Write your project report Reports are not included in the compressed file! Please don't make the graders hunt for your report. You can write your report in any word processor you like and submit a PDF document in the Google Classroom assignment folder. Or use a Google Document format. Review the Writeup Guidelines document in the Labs and Projects folder. Your intended audience for your report is your peers who are not taking CS classes. From week to week, you can assume your audience has read your prior reports. 
Your goal should be to explain to peers what you accomplished in the project and to give them a sense of how you did it. The following is a list and description of the mandatory sections you must include in your report. Do not include the descriptions in your report, but use them as a guide in writing your report. ● Abstract A summary of the project, in your own words. This should be no more than a few sentences. Give the reader context and identify the key purpose of the assignment. An abstract should define the project's key lecture concepts in your own words for a general, non-CS audience. It should also describe the program's context and output, highlighting a couple of important algorithmic and/or scientific details. Writing an effective abstract is an important skill. Consider the following questions while writing it. ○ Does it describe the CS concepts of the project (e.g. writing well-organized and efficient code)? ○ Does it describe the specific project application (e.g. generating data)? ○ Does it describe your solution and how it was developed (e.g. what code did you write)? ○ Does it describe the results or outputs (e.g. did your code work as expected and what did the results tell you)? ○ Is it concise? ○ Are all of the terms well-defined? ○ Does it read logically and in the proper order? ● Methods The method section should describe in clear sentences (without pasting any code) at least one example of your own computational thinking that helped you complete your project. This could involve illustrating how a key lecture concept was applied to creating an image, how you solved a challenging problem, or explaining an algorithmic feature that is essential to your program as well as why it is so essential. The explanation should be suitable for a general audience who does not know Python. 
● Results
Present your results in a clear manner using human-friendly images or graphs labeled with captions and interpreted for a general audience, such as your peers not in the course. Explain, for a general, non-CS audience, what your output means and whether it makes sense.
● Reflection and Follow-up questions
Draw connections between lecture concepts utilized in this project and real-world problems that interest you. How else could these concepts apply to our everyday lives? What are some specific things you had to learn or discover in order to complete the project? Look for a set of short answer questions in this section of the report template.
● Extensions (Required even if you did not do any)
A description of any extensions you undertook, including text output or images demonstrating those extensions. If you added any modules, functions, or other design components, note their structure and the algorithms you used.
● References/Acknowledgements (Required even if there are none)
Identify your collaborators, including TAs and professors. Include in that list anyone whose code you may have seen, such as those of friends who have taken the course in a previous semester. Cite any other sources, imported libraries, or tutorials you used to complete the project.
Submit your Code
Turn in your code by zipping the file and uploading it to Google Classroom. When submitting your code, double check the following.
1. Is your name at the top of each Python file?
2. Does every function have a docstring (''' ''') specifying what it does?
3. Is your Lab 06 folder in your Project 06 folder?
4. Have you checked to make sure you have included all required elements and outputs in your project report?
5. If you have done an Extension, have you included this information in your report under the Extension heading? Even if you have not done any extensions, include a section in your report where you state this.
6.
Have you acknowledged any help you may have received from classmates, your instructor, the TAs, or outside sources (internet, books, videos, etc.)? If you received no help at all, have you indicated that under the Sources heading of the report?
ECON3173 – Cross Section and Panel Data Analysis
Individual Project: Guidelines and Questions
This document provides guidelines and questions for the Individual Project of ECON3173, which accounts for 40% of the total marks. Honoring the precepts of academic integrity and applying its principles are fundamental responsibilities of all students and scholars at UIC. You are advised to read through the ‘UIC Guidelines for Handling Academic Dishonesty’ file on iSpace before you start your assignment. Any form of plagiarism or cheating can result in various disciplinary and corrective activities. Using generative AI tools is not allowed.
Deadline: by 20/12/2024.
Submission Method:
a) Please submit your typed assignment report in a single PDF file to the Turnitin ‘Submission Link: Report’ via iSpace. The filename of your PDF submission should have the following format: ECON3173_Project_Student ID_Name in Pinyin (e.g., ECON3173_Project_190000001_Mi Lin).
b) Save your data and .do file(s) in a zip file. Name your zip file ECON3173_Project_Student ID_Name in Pinyin. Then, upload your file to ‘Submission Link: Stata Data and Program’ via iSpace. You are expected to submit two .do files, namely PartA.do and PartB.do; each should be able to replicate the corresponding part of your submitted work.
c) Use the ‘ECON3173_Individual Project_Report Template’ file on iSpace to input your report. Ensure you provide a question number for each part of your work.
Format Requirements
Cover page: Please input your name and student ID at the top of the cover page of the report template, which is available on iSpace.
Word limit: The required minimum word count is 1,500 words, with a maximum of 2,000 words in total, excluding tables, graphs, and appendices.
Referencing: Your report should include appropriate references in APA format to a variety of necessary literature sources and a wide-ranging bibliography of academic aspects of economics.
Font / Size: Cambria 12 or Times New Roman 12.
Spacing / Sides: 1.0 / Single-sided / Single-line spacing between two paragraphs.
Pagination required: Yes
Margins: 2.50 on both left and right, ‘justified’.
Part A: Imitate existing research (30%)
Traffic crashes are the leading cause of death for Americans between the ages of 5 and 32. Through various spending policies, the federal government has encouraged states to institute mandatory seat belt laws to reduce the number of fatalities and serious injuries. In this exercise, following Einav and Cohen (2003), you will investigate how effectively these laws increase seat belt use and reduce fatalities using the “SeatBelts.dta” dataset posted on iSpace. The data file contains a panel of data from 50 U.S. states plus the District of Columbia from 1983 through 1997. The dataset is detailed in “Seatbelts_Description.pdf”. It was used in the Einav and Cohen (2003) paper, which serves as the background reading for this exercise. Both files are available on iSpace as well.
A1. Use OLS to estimate the effect of seat belt use on fatalities by regressing fatalityrate on sb_useage, speed65, speed70, ba08, drinkage21, ln(income), and age. Does the estimated result suggest that increased seat belt use reduces fatalities? Report, interpret, and comment on your results. (5%)
A2. Run a one-way fixed effects model with state fixed effects. Do the results change when you add state fixed effects? Run a two-way fixed effects model with both time and state fixed effects. Do the results change when you add time fixed effects on top of state fixed effects? Report, interpret, and comment on your results. (5%)
A3. Which of the regression specifications from A1 and A2 is most reliable? Explain why. (5%)
A4. Using the results from the two-way fixed effects model with both time and state fixed effects, discuss the magnitude of the coefficient on sb_useage. Is it large? Small? How many lives would be saved if seat belt use increased from 52% to 90%?
Illustrate your calculation and comment on your result. (5%)
A5. There are two ways that mandatory seat belt laws are enforced: “primary” enforcement means that a police officer can stop a car and ticket the driver if the officer observes an occupant not wearing a seat belt; “secondary” enforcement means that a police officer can write a ticket if an occupant is not wearing a seat belt, but must have another reason to stop the car. In the data set, primary is a binary variable for primary enforcement, and secondary is a binary variable for secondary enforcement. Run a regression of sb_useage on primary, secondary, speed65, speed70, ba08, drinkage21, ln(income), and age, including state and time fixed effects in the regression. Does primary enforcement lead to more seat belt use? What about secondary enforcement? Report, interpret, and comment on your results. (5%)
A6. In 2000, New Jersey changed from secondary enforcement to primary enforcement. Assuming that data availability is not an issue, design a differences-in-differences estimation strategy to estimate the number of lives potentially saved per year by making this change. Explain your approach. (5%)
Part B: A small-scale research project – innovation (70%)
Introduction: In this project, you are invited to empirically investigate potential causality between firms’ business environment and economic performance using a firm-level dataset from economies included in the World Bank Enterprise Survey Data (WBESD). The WBESD database collects information about an economy’s business environment, how individual firms experience it, how it changes over time, and the various constraints to firm performance and growth. The entire database is available to researchers and includes all questions from the surveys at the firm level.
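Returning to A6 above: the differences-in-differences logic can be illustrated numerically. A minimal sketch with made-up group means (not real New Jersey data), where the DiD estimate is (treated post − treated pre) − (control post − control pre):

```python
# Hypothetical mean fatality rates -- illustrative numbers only
nj_pre, nj_post = 2.10, 1.70        # New Jersey before/after switching to primary enforcement
ctrl_pre, ctrl_post = 2.00, 1.95    # comparison states that kept secondary enforcement

# DiD nets out the common time trend captured by the control group
did = (nj_post - nj_pre) - (ctrl_post - ctrl_pre)
print(f"DiD estimate of the enforcement effect: {did:.2f}")
```

With these invented numbers the estimate is negative, i.e. the switch is associated with a fall in fatality rates beyond the control-state trend; multiplying such an effect by New Jersey's population at risk would give lives saved per year.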
Guidelines to download and prepare data for this individual project:
a) Please visit https://login.enterprisesurveys.org/ to register your user account for the WBESD database (see the snapshot below). Registration is free.
b) There are a total of 97 economies represented in the World Bank Enterprise Surveys Database (WBESD). Among these, 61 economies have a time span of at least three years. For their individual projects, students are required to select data from a panel of a random combination of three different economies out of these 61 economies. Data allocation protocol: Students are required to pick a lottery ticket number first. An “Individual Project Lottery Ticket Sign-up Sheet” will be available on iSpace from 9 p.m. on Tuesday, 26/11/2024. Please sign up for a lottery ticket number by 12 noon on Wednesday, 27/11/2024. We will operate on a ‘first-come, first-served’ basis. A lucky draw will be conducted in class on Thursday, 28/11/2024, to assign specific economies to each lottery ticket number.
c) Once registration is completed, log in and download the data following the steps below:
i. Log in with your username and password. You will be directed to the ‘Full Survey Data’ page.
ii. Select ‘Panel data’ under ‘Survey Type’ on the left. Ensure you are on the ‘Data by Economy’ view instead of ‘Combined Data’. See the snapshot below.
iii. Download your economies’ corresponding data and documentation for all the available years. For example, Afghanistan has two panel data files, one for 2005 and 2009 and the other for 2008, 2010, and 2014. In that case, download both of them.
iv. Extract the data and survey documentation files into a working folder on your PC. The data files are then ready to open in Stata.
Answer ALL of the Following Questions
Note that this is not an essay-type assignment. Please answer the questions one by one. For each question, the performance of the Stata .do files accounts for 20% of the marks.
B1) Use the Stata command “append” to append the data of all years and economies into a single Stata data file in panel data format. Select and rename the variables in Table 1. ‘Old name’ refers to the variable name in the original dataset, while ‘New name’ is the new corresponding name to be defined. Generate a new variable exp_dum (export dummy) that equals 1 if sales_exp is positive and 0 otherwise. Generate a new variable foreign_dum (foreign ownership dummy) that equals 1 if foreign is positive and 0 otherwise. Generate a new variable soe_dum (state ownership dummy) that equals 1 if soe is positive and 0 otherwise. (5%)
Table 1: Variable List
Survey Question | Old name | New name
GENERAL INFORMATION
Year the survey was conducted | year | year
Panel ID (the same ID for each firm across different years) | panelid | panelid
What percentage of this firm is owned by the Government/State (%) | b2c | soe
What percentage of this firm is owned by private foreign individuals, companies, or organizations (%) | b2b | foreign
SALES
During the past fiscal year, what was this establishment’s total annual sales? | d2 | sales
During the past fiscal year, what percentage of this establishment’s sales were direct exports (%)? | d3c | sales_exp
LABOUR and CAPITAL
Total number of permanent, full-time workers at the end of last fiscal year | l1 | employees
During the past fiscal year, what was the net book value, i.e., the value of assets after depreciation, of machinery, vehicles, and equipment? | n6a | capital
Cost of raw materials and intermediate goods used in production in the last fiscal year | n2e | materials
BUSINESS-GOVERNMENT RELATIONS
In a typical week over the last 12 months, what percentage of total senior management’s time was spent dealing with requirements imposed by government regulations? (%) | j2 | reg_time
Over the last 12 months, has this establishment secured a government contract or attempted to secure a contract with the government? (Yes/No) | j6a | gov_contract
B2) Conduct exploratory data analysis for the variables listed in B1), i.e., use appropriate summary statistics (e.g., number of observations, mean, standard deviation, minimum and maximum values, etc.) to explain your data and make necessary comments. (10%)
Considering that total output is measured by sales, labour input is measured by the total number of employees, capital is measured by capital, and intermediate inputs are measured by materials, a production function can be written as:
sales = A · capital^α · employees^β · materials^γ (1)
where A is the total factor productivity (TFP). Based on panel data, taking logarithms on both sides, equation (1) is transformed to
ln(sales_it) = ln(A) + α ln(capital_it) + β ln(employees_it) + γ ln(materials_it) + e_it (2)
B3) Based on the variable ‘panelid’, set the dataset in a panel format, then obtain the estimated coefficients in equation (2) by running a panel data regression. Comment on your results, including a) a discussion of the capital, labour, and materials elasticities of sales, respectively, and b) a comparison between the panel two-way fixed effects and random effects models. (10%)
B4) Include exp_dum_it as a new variable of interest in equation (2) to test whether a firm’s performance improves after entering export markets. Based on relevant economic theories or literature, explore the data set to add appropriate control variables to the production equation (2). Run a panel regression and interpret the results. (10%)
B5) To what extent could we use the estimated coefficient on exp_dum_it obtained in B4) for causal inference? Explain. Illustrate an appropriate empirical strategy to make improvements if deemed necessary. Finally, re-estimate the model based on your proposed empirical strategy, compare the results to what you obtained in B4), and comment on the results.
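The log-linear regression in equation (2) can be checked on synthetic data; a minimal numpy sketch (the elasticities 0.3, 0.5, 0.2 below are arbitrary, chosen only so the recovery can be verified — the project itself requires Stata):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# Synthetic firm-level log inputs with known elasticities
ln_capital = rng.normal(10, 1, n)
ln_employees = rng.normal(4, 1, n)
ln_materials = rng.normal(8, 1, n)
ln_A, alpha, beta, gamma = 1.0, 0.3, 0.5, 0.2
ln_sales = (ln_A + alpha * ln_capital + beta * ln_employees
            + gamma * ln_materials + rng.normal(0, 0.1, n))

# OLS on logs: regress ln(sales) on a constant and the three log inputs
X = np.column_stack([np.ones(n), ln_capital, ln_employees, ln_materials])
coef, *_ = np.linalg.lstsq(X, ln_sales, rcond=None)
print("estimated [ln(A), alpha, beta, gamma]:", np.round(coef, 3))
```

The estimated coefficients are the elasticities of sales with respect to each input, which is what B3 asks you to discuss (a panel estimator would additionally absorb firm and year effects).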
(10%)
B6) Explore the complete data set (i.e., you do not have to be limited to the variables listed in B1) and design an empirical model to evaluate the impact of ownership structure on the export decision. Interpret and comment on the empirical results you obtain. (10%)
B7) Predict firm-level productivity (i.e., ln(A) + e_it) to test the hypothesis that good business-government relations boost productivity. Explore the complete data set (i.e., you do not have to be limited to the variables listed in B1) to propose an empirical model with an appropriate empirical strategy based on relevant economic theories or literature. Explain the variable(s) you select for this question. Interpret and comment on the empirical results you obtain. (15%)
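Returning to the data preparation in B1): the append, rename, and dummy-generation steps can be sketched in pandas (an assumption for illustration only — the project itself requires Stata; all values below are made up):

```python
import pandas as pd

# Toy frames standing in for two survey years (fabricated values)
df_a = pd.DataFrame({"year": [2009, 2009], "panelid": [1, 2],
                     "b2c": [0, 60], "b2b": [10, 0], "d3c": [25, 0]})
df_b = pd.DataFrame({"year": [2014, 2014], "panelid": [1, 2],
                     "b2c": [0, 55], "b2b": [15, 0], "d3c": [0, 0]})

# Append the yearly files, then rename to the Table 1 names
panel = pd.concat([df_a, df_b], ignore_index=True)
panel = panel.rename(columns={"b2c": "soe", "b2b": "foreign", "d3c": "sales_exp"})

# Dummies: 1 if the underlying percentage is positive, 0 otherwise
panel["exp_dum"] = (panel["sales_exp"] > 0).astype(int)
panel["foreign_dum"] = (panel["foreign"] > 0).astype(int)
panel["soe_dum"] = (panel["soe"] > 0).astype(int)
print(panel[["year", "panelid", "exp_dum", "foreign_dum", "soe_dum"]])
```

The same logic in Stata would use append, rename, and generate with a condition.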
COSC2500/COSC7500—Semester 2, 2024
Exercises—due 3:00 pm Friday 25th October
Required
The grade of 1–5 awarded for these exercises is based on the required exercises. If all of the required exercises are completed correctly, a grade of 5 will be obtained.
R4.1 Do either part (a), (b), or (c) below. (If you want to do more than one of these, see the additional exercises!) These are Q1 in Sauer computer problems 8.1, 8.2, and 8.3.
a) From Sauer 8.1: Solve ut = 2uxx for 0 ≤ x ≤ 1, 0 ≤ t ≤ 1, for the sets of boundary conditions
i. u(x, 0) = 2 cosh x for 0 ≤ x ≤ 1; u(0, t) = 2 exp(2t) for 0 ≤ t ≤ 1; u(1, t) = (exp(2) + 1) exp(2t − 1) for 0 ≤ t ≤ 1 (Solution is exp(2t + x) + exp(2t − x))
ii. u(x, 0) = exp x for 0 ≤ x ≤ 1; u(0, t) = exp(2t) for 0 ≤ t ≤ 1; u(1, t) = exp(2t + 1) for 0 ≤ t ≤ 1 (Solution is exp(2t + x))
using the forward difference method for step sizes h = 0.1 and k = 0.002. Plot the approximate solution (the mesh command might be useful). What happens if you use k > 0.003? Compare with the exact solutions. HINT: You can use Program 8.1 (heatfd.m) from Sauer.
b) From Sauer 8.2: Solve the following initial–boundary value problems using the finite difference method with h = 0.05 and k = h/c. Plot the solutions.
i. utt = 16uxx; u(x, 0) = sin(πx) for 0 ≤ x ≤ 1; ut(x, 0) = 0 for 0 ≤ x ≤ 1; u(0, t) = 0 for 0 ≤ t ≤ 1; u(1, t) = 0 for 0 ≤ t ≤ 1 (Solution is u(x, t) = sin(πx) cos(4πt))
ii. utt = 4uxx; u(x, 0) = exp(−x) for 0 ≤ x ≤ 1; ut(x, 0) = −2 exp(−x) for 0 ≤ x ≤ 1; u(0, t) = exp(−2t) for 0 ≤ t ≤ 1; u(1, t) = exp(−1 − 2t) for 0 ≤ t ≤ 1 (Solution is u(x, t) = exp(−x − 2t))
c) From Sauer 8.3: Solve the Laplace equation for the following boundary conditions using the finite difference method with h = k = 0.1. Plot the solutions.
i. u(x, 0) = sin(πx) for 0 ≤ x ≤ 1; u(x, 1) = exp(−π) sin(πx) for 0 ≤ x ≤ 1; u(0, y) = 0 for 0 ≤ y ≤ 1; u(1, y) = 0 for 0 ≤ y ≤ 1 (Solution is u(x, y) = exp(−πy) sin(πx))
ii.
u(x, 0) = 0 for 0 ≤ x ≤ 1; u(x, 1) = 0 for 0 ≤ x ≤ 1; u(0, y) = 0 for 0 ≤ y ≤ 1; u(1, y) = sinh π sin(πy) for 0 ≤ y ≤ 1 (Solution is u(x, y) = sinh(πx) sin(πy)) HINT: You can use Program 8.5 (poisson.m) from Sauer.
R4.2 Use a Monte Carlo method to find the area of a circle, the volume of a sphere, and so on, for higher-dimensional “circles” and “spheres”. Make a statistical estimate of the accuracy of your result, and compare with known results.
R4.3 From computer problem 9.3 in Sauer: In a biased random walk, the probability of going up one unit is 0 < p < 1, and the probability of going down one unit is q = 1 − p. Design a Monte Carlo simulation with n = 10000 to find the probability that the biased random walk with p = 0.7 on the intervals
a) [−2, 5]
b) [−5, 3]
c) [−8, 3]
reaches the top. Calculate the error by comparing with the correct answer ((q/p)^b − 1)/((q/p)^(a+b) − 1) for p ≠ q and interval [−b, a].
Additional
Attempts at these exercises can earn additional marks, but will not count towards the grade of 1–5 for the exercises. Completing all of these exercises does not mean that 6 marks will be obtained—the marks depend on the quality of the answers. It is possible to earn all 6 marks without completing all of these additional exercises.
A4.1 Do the other parts (i.e., the two of (a), (b), and (c) you did not do) for R4.1.
A4.2 Find a PDE solver. Discuss the code, if source code is available. Test, and discuss.
A4.3 From computer problem 9.4 in Sauer: Use the Euler–Maruyama method to solve dy = B_t dt + (9y²)^(1/3) dB_t over the interval t = [0, 1] with initial condition y(0) = 0. Note that B_n = Σ_{i≤n} ∆B_i. Use step sizes of h = 0.1, 0.01, 0.001, and use 5000 simulations for each step size to find the mean value of y(1) and the error. How does the error vary with step size?
A4.4 Implement and test a random number generator.
A4.5 Generate a distribution of random numbers other than uniform or normal. What kind of problem would this distribution be useful for?
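For R4.1(a)(ii), the forward-difference scheme can be sketched in Python rather than via Sauer's heatfd.m; a minimal version, assuming D = 2, h = 0.1, k = 0.002 (stable, since Dk/h² = 0.4 ≤ 1/2):

```python
import numpy as np

D, h, k = 2.0, 0.1, 0.002          # diffusion coefficient and step sizes
x = np.arange(0.0, 1.0 + h / 2, h) # spatial grid on [0, 1]
nsteps = int(round(1.0 / k))       # march from t = 0 to t = 1
sigma = D * k / h**2               # stability requires sigma <= 1/2

u = np.exp(x)                      # initial condition u(x, 0) = exp(x)
for n in range(1, nsteps + 1):
    t = n * k
    # explicit update of interior points from the previous time level
    u[1:-1] = u[1:-1] + sigma * (u[2:] - 2 * u[1:-1] + u[:-2])
    u[0] = np.exp(2 * t)           # boundary u(0, t) = exp(2t)
    u[-1] = np.exp(2 * t + 1)      # boundary u(1, t) = exp(2t + 1)

exact = np.exp(2 * 1.0 + x)        # exact solution exp(2t + x) at t = 1
maxerr = np.abs(u - exact).max()
print("max error at t = 1:", maxerr)
```

This also illustrates the answer to the k > 0.003 question: then Dk/h² > 1/2, the stability condition is violated, and the iteration blows up.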
A4.6 Find or implement code for high-dimensional optimisation. Test, and discuss.
A4.7 Implement Monte Carlo integration for circles, spheres, etc., using C or some other compiled language. Compare the performance of your code with your Matlab version. Investigate the effect of using single precision instead of double precision.
A4.8 Write a popular science article describing a problem and the use of Monte Carlo methods to solve it.
Programming hints
Note that rand() and randn() can return a vector or matrix of random numbers. Instead of getting one random number at a time in a loop, you can get them all at once!
• rand(N) and randn(N) will return N × N matrices of random numbers.
• rand(M,N) and randn(M,N) will return M × N matrices of random numbers.
R4.2 You need to demonstrate a sufficient understanding of the question. This means that you will need to do the integration for multiple dimensions. Going from a 1-sphere (a sphere in 1 dimension) to a 10-sphere is sufficient (that is, dimensions 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10).
If you are using rand() in MATLAB or similar, one way to do this is to sample points inside a box. This method finds the ratio In/N, where In is how many random points lie inside the sphere, and N is how many points are used in the simulation. It is important to note that this ratio is relative to the “box” the sphere sits in; this means you will need to multiply the ratio by the volume of the box.
Algorithm R4.2 (For a unit sphere).
Draw X from U(0,1)
Find the distance the point is from the centre
Count how many points are inside the sphere
Calculate ratio = inside/N
Calculate volume = ratio*boxVolume
R4.3 There are several ways to go about this question. When the walk reaches the top or the bottom of the interval (that is, if at any point the value of the walk equals a or −b), you stop that walk immediately and go to the next one. The starting point for the walk is 0. For some, it may be helpful to think of it as a discrete Markov chain.
Each state is finite, and once the walk reaches a boundary, it stays at that boundary with probability 1.
Algorithm R4.3 Easy algorithm, ignoring Murphy's Law.
Initialise ReachedTheTop
Do N times
  Initialise W
  * Random walk process *
  while not at the boundary
    Draw u from U
    if u > q go up, otherwise go down
    if W is at a
      Increment ReachedTheTop
      stop
    if W is at -b
      stop
  * End random walk process *
Stop
There is a more advanced algorithm. This is only useful if you are going to vectorise; if you are not going to vectorise, use the one above.
Initialise W
Do M times
  Draw dW from sign(U - q)
  W = W + dW
  if Wi = a, set Wi to be inf
  if Wi = -b, set Wi to be -inf
Stop
Sum up W equal to inf
Question: Why does this algorithm work? Is it mathematically correct? What does W(W==a) do?
A4.3 B_t is a partial sum, and is related to every random number drawn before (for that walk).
Algorithm A4.3 Non-vectorised.
Calculate the number of steps
Do N times
  *Euler–Maruyama step*
Stop
Find the mean
*Euler–Maruyama step*
Do number-of-steps times
  Draw Z from N(0,1)
  Calculate dB and Bt
  Apply Euler–Maruyama scheme
Stop
And of course, a more advanced vectorised version. (Thomas wrote these hints, and he vectorises a lot.)
Draw Z from N(0,1)
Calculate Bt and initialise Y
Calculate the number of steps
Do number-of-steps times
  Apply Euler–Maruyama scheme (using element-wise operations)
Stop
Find the mean
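The easy (non-vectorised) algorithm above can be sketched directly in Python for R4.3 part (a), interval [−2, 5], and compared against the exact gambler's-ruin answer:

```python
import random

def walk_reaches_top(p, a, b):
    """Simulate one biased walk from 0; return True if it hits a before -b."""
    w = 0
    while -b < w < a:
        w += 1 if random.random() < p else -1
    return w == a

random.seed(1)
p, a, b, n = 0.7, 5, 2, 10000
est = sum(walk_reaches_top(p, a, b) for _ in range(n)) / n

q = 1 - p
exact = ((q / p) ** b - 1) / ((q / p) ** (a + b) - 1)
print(f"estimate {est:.4f}, exact {exact:.4f}, error {abs(est - exact):.4f}")
```

With n = 10000 the statistical error is of order sqrt(exact(1 − exact)/n), roughly 0.004 here, so estimate and exact value should agree to about two decimal places.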
CS152 Lab 6: Searching and Optimization
The purpose of this lab/project is to get you thinking about automating a scientific experiment and optimizing the parameters of a process. This is the second part of our study of the elephant population in Kruger National Park in South Africa. This week we'll be focusing on how varying the parameters will change the management strategy. Rather than manually exploring the parameter space, we will automate the process and use a search method to find the optimal darting percentage given the simulation parameters.
Setting up the Lab/Project
Create a Lab06 folder and open VS Code in it.
L1. Implement binary search on a sorted list
In your Lab06 folder create a new file, search.py. Then create a function searchSortedList that takes in two parameters, a list and a target value. You can use the following template.

def searchSortedList(mylist, value):
    # assign to the variable done, the value False
    # assign to the variable found, the value False
    # assign to the variable count, the value 0
    # assign to the variable maxIdx, the value one less than the length of mylist
    # assign to the variable minIdx, the value 0
    # start a while loop that executes while done is not True
        # increment count (which keeps track of how many times the loop executes)
        # assign to testIndex the average of maxIdx and minIdx (use integer math)
        # if the mylist value at testIndex is less than value
            # assign to minIdx the value testIndex + 1
        # elif the mylist value at testIndex is greater than value
            # assign to maxIdx the value testIndex - 1
        # else
            # set done to True
            # set found to True
        # if maxIdx is less than minIdx
            # set done to True
            # set found to False
    return (found, count)

Once you have coded that up, import the random package, copy the following test function, and run it to see if your function finds the value 42 in the list (it should). Think about why it can take a different number of steps each time you run the program.
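For reference, one possible completion of the template (a sketch that follows the comments line by line; it assumes a non-empty list, and your variable names may differ):

```python
def searchSortedList(mylist, value):
    """Binary search; returns (found, count), where count is the number of probes."""
    done = False
    found = False
    count = 0
    maxIdx = len(mylist) - 1   # assumes mylist is non-empty
    minIdx = 0
    while not done:
        count += 1
        testIndex = (maxIdx + minIdx) // 2   # integer midpoint
        if mylist[testIndex] < value:
            minIdx = testIndex + 1
        elif mylist[testIndex] > value:
            maxIdx = testIndex - 1
        else:
            done = True
            found = True
        if maxIdx < minIdx:      # search range is empty: value not present
            done = True
            found = False
    return (found, count)
```

Because the range halves on every probe, count grows like log2 of the list length, which is what the experiment below asks you to observe.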
Experiment with different values of N and relate the number of steps that the function takes to complete to log2(N).

def test():
    a = []
    N = 10**6
    for i in range(N):
        a.append(random.randint(0, N))
    a.append(42)
    a.sort()
    print(searchSortedList(a, 42))

if __name__ == "__main__":
    test()

This function is practice for creating the optimization function for the elephant simulation. That function will have a similar structure, and it will also execute a binary search, but it will have more parameters and be more flexible in how it works.
L2. Write a function to generate a default parameter list
Create a Project06 folder and copy your elephant.py file from the last project. Next, add a new function to your elephant.py file. The function should be called defaultParameters(). It should take no arguments and return a list with all of the necessary parameters for the simulation with their default values.

calvingInt = 3.1
probDart = 0.0
juvAge = 12
maxAge = 60
probCalfSurvival = 0.85
probAdultSurvival = 0.996
probSeniorSurvival = 0.2
carryingCapacity = 1000
numYears = 200

NOTE: For this project, set the default carrying capacity to 1000.
L3. Write an elephant simulation function
Next, also in your elephant.py file, create a function elephantSim. The arguments to the function should be probDart and a parameter list, which should have a default value of None. In other words, set it up as the following.

def elephantSim(probDart, inputParameters=None):

First, set up the parameters. If inputParameters is equal to the value None, then assign to parameters the result of calling your defaultParameters function. Otherwise, assign to parameters the value of inputParameters. Next, store the parameter value for probDart in the correct location in the parameters list.
Then, declare an empty list named results where we will store the return from the call to runSimulation as follows:

results = results + runSimulation(parameters)

Place this in a loop which will call runSimulation five times and collect all of the results in a single list. The structure of the results list is that of a list of lists. Each sublist consists of the total population and other statistics for a given simulation year. Finally, loop over the results list and calculate the average total population. Remember, the total population for each year is the first element in each sublist. elephantSim will return the following crucial piece of information: (carrying capacity) - (average total population of the five simulations). Cast this value as an integer before returning. Just like the cost function in our search algorithms, this metric will guide the search for the optimal darting probability.
You can think of this return value as specifying whether too many or too few elephants are being darted. If the return value is negative, then the population is too big and we need to tweak the darting probability higher so that the elephant population shrinks; if the return value is positive, then the population is too small and we need to tweak the darting probability lower so that the elephant population grows.
L4. Test your elephantSim function
Once you have written elephantSim, use the test file test_elephantSim.py to evaluate its performance. It tests the percent darted for five different values. The difference should go from negative (darting probability too low) to positive (darting probability too high) for the default simulation parameters. Example output (your numbers may differ because of randomness and because your elephant parameter boundaries may differ):

probDarting 0.405 diff -385
probDarting 0.415 diff -331
probDarting 0.425 diff -172
probDarting 0.435 diff -8
probDarting 0.445 diff 152
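The overall shape of elephantSim can be sketched as below. The runSimulation stub here is NOT the real one from your previous project (it fakes a population response so the sketch runs standalone), and the position of probDart and carryingCapacity in the parameter list is an assumption — match them to your own defaultParameters ordering:

```python
import random

def runSimulation(parameters):
    """Fake stand-in for the previous project's runSimulation: returns a list of
    per-year sublists whose first element is the total population."""
    probDart, carryingCapacity, numYears = parameters[1], parameters[7], parameters[8]
    return [[int(carryingCapacity * (1.2 - probDart) * random.uniform(0.9, 1.1))]
            for _ in range(numYears)]

def defaultParameters():
    # Assumed order: calvingInt, probDart, juvAge, maxAge, probCalfSurvival,
    # probAdultSurvival, probSeniorSurvival, carryingCapacity, numYears
    return [3.1, 0.0, 12, 60, 0.85, 0.996, 0.2, 1000, 200]

def elephantSim(probDart, inputParameters=None):
    parameters = defaultParameters() if inputParameters is None else inputParameters
    parameters[1] = probDart                   # store probDart in its assumed slot
    results = []
    for _ in range(5):                         # five runs, pooled into one list
        results = results + runSimulation(parameters)
    total = sum(year[0] for year in results)   # total population is element 0
    avgPopulation = total / len(results)
    return int(parameters[7] - avgPopulation)  # carryingCapacity - average
```

With the fake stub, a low darting probability yields a negative difference (population too big) and a high one yields a positive difference, matching the sign convention described above.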
I. Financial Price and Return
Consider a 60-month (5-year) investment in two assets: the Vanguard S&P 500 index (VFINX) and Amazon stock (AMZN). Suppose you buy one share of the S&P 500 fund and one share of Amazon stock at the end of January, 2011 for P_vfinx,t−60 = 108 and P_amzn,t−60 = 170, and then sell these shares at the end of January, 2016 for P_vfinx,t = 179 and P_amzn,t = 587. (Note: these are actual adjusted closing prices taken from Yahoo!.) In this question, you will see how much money you could have made if you invested in these assets right after the financial crisis.
a. What are the simple 60-month (5-year) returns for the two investments?
b. What are the continuously compounded (cc) 60-month (5-year) returns for the two investments? Why are the cc returns smaller?
c. Suppose you invested $1,000 in each asset at the end of January, 2011. How much would each investment be worth at the end of January, 2016?
d. What are the compound annual returns on the two 5-year investments?
e. At the end of January, 2011, suppose you plan to invest in a portfolio of VFINX and AMZN over the next 60 months (5 years). Suppose you purchase 10 shares of the VFINX mutual fund (at $108/share) and 10 shares of AMZN stock (at $170/share). What is the value of your initial investment? What are the portfolio weights in the two assets as of the end of January, 2011?
f. Using the results from part a., compute the 5-year simple and cc portfolio returns. What is the value of your portfolio at the end of January, 2016?
g. Go to http://finance.yahoo.com and download monthly data on a stock of your choice (except Starbucks and Amazon) over the period January, 2015 to January, 2024. Read the data into Excel and make sure to reorder the data so that time runs forward. Delete all columns except those containing the dates and the adjusted closing prices. Save the file as a .csv (comma separated value) file and call it examprice.csv.
(i) Plot the monthly closing price data of your stock using the plot() function. Please add a legend.
(ii) Compute monthly simple and continuously compounded returns. Plot these returns separately first. Then also plot them on the same graph.
(iii) Calculate the growth of $1 invested in your stock, and report the plot of future values.
II. Constant Expected Return and Single Index Model
Consider the constant expected return (CER) model
r_it = μ_i + ε_it, ε_it ~ iid N(0, σ_i²)
cov(r_it, r_jt) = σ_ij, cor(r_it, r_jt) = ρ_ij
for the monthly simple returns on the Vanguard S&P 500 index (VFINX) and Amazon stock (AMZN) presented in part I above. Below are simulated returns and some graphical descriptive statistics for VFINX and AMZN from the CER model, calibrated using the sample estimates of the CER model parameters for the two assets (these are the sample statistics from the table of the previous question).
a. Which features of the actual returns shown in part I are captured by the simulated CER model returns and which features are not?
b. Does the CER model appear to be a good model for VFINX and AMZN returns? Why or why not?
c. For each asset, compute estimated standard errors for μ̂, σ̂, and ρ̂. Using the table below, show the estimates in one row and the standard errors in another row.
(i) Briefly comment on the “co-movement” between the two assets.
(ii) What do you expect for the sign of the beta of AMZN if you estimate the Single Index Model for AMZN?

| μ̂_vfinx | μ̂_amzn | σ̂_vfinx | σ̂_amzn | ρ̂_vfinx,amzn
Estimate |
Std. Error |

d. For Amazon (AMZN) only, compute a 95% confidence interval for μ. From this result, do we have any expected positive return for AMZN (statistically)?
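Parts a and b of section I can be checked by direct arithmetic with the prices given there; a minimal sketch:

```python
import math

# Prices given in part I (buy end of Jan 2011, sell end of Jan 2016)
p_vfinx_0, p_vfinx_t = 108.0, 179.0
p_amzn_0, p_amzn_t = 170.0, 587.0

simple_vfinx = p_vfinx_t / p_vfinx_0 - 1     # simple 60-month return
simple_amzn = p_amzn_t / p_amzn_0 - 1
cc_vfinx = math.log(p_vfinx_t / p_vfinx_0)   # continuously compounded return
cc_amzn = math.log(p_amzn_t / p_amzn_0)

print(f"VFINX: simple {simple_vfinx:.4f}, cc {cc_vfinx:.4f}")
print(f"AMZN:  simple {simple_amzn:.4f}, cc {cc_amzn:.4f}")
```

Since cc = ln(1 + simple) and ln(1 + x) ≤ x, the cc return is always the smaller of the two, which answers the "why smaller" part of question b.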
MTH205 Introduction to Statistical Methods
Tutorial 5
Based on Chapter 5
1. Show that in the one-way analysis of variance, the error mean square MSE is an unbiased estimator of the error variance σ². Hint: You may wish to start from the definition of SSE; first add and subtract the treatment mean µ_i from each term to obtain SSE = Σ_i Σ_j [(X_ij − µ_i) − (X̄_i· − µ_i)]², and then expand.
2. For the one-way ANOVA table, we know that SST = Σ_i Σ_j (X_ij − X̄··)², SSTr = Σ_i n_i (X̄_i· − X̄··)², and SSE = Σ_i Σ_j (X_ij − X̄_i·)². Show that SST = SSTr + SSE.
3. Show that in the one-way analysis of variance,
E(SSTr) = (I − 1)σ² if H0 is true,
E(SSTr) > (I − 1)σ² if H0 is false.
You should start from the definition of SSTr.
4. The removal of ammoniacal nitrogen is an important aspect of the treatment of leachate at landfill sites. The rate of removal (in percent per day) is recorded for several days for each of several treatment methods. The results are presented in the following table. Construct the one-way ANOVA table. Do the treatment methods differ in their rates of removal?
5. An experiment was performed to determine whether the annealing temperature of ductile iron affects its tensile strength. Five specimens were annealed at each of four temperatures. The tensile strength (in ksi) was measured for each. The results are presented in the following table.
(i) Construct the one-way ANOVA table. Can you conclude that there are differences among the mean strengths?
(ii) Use the Bonferroni method to determine which pairs of means, if any, are different at the 5% significance level.
(iii) Use the Tukey-Kramer method to determine which pairs of means, if any, are different at the 5% significance level.
(iv) Which is the more powerful method to find all the pairs of treatments whose means are different, the Bonferroni method or the Tukey-Kramer method?
A metallurgist wants to determine whether the mean tensile strength for specimens annealed at 900◦C differs from the mean strengths for specimens annealed at 750◦C, 800◦C and 850◦C.
(v) Use the Bonferroni method to determine which of the means for 750◦C, 800◦C and 850◦C, if any, differ from the mean for 900◦C at the 5% significance level. (vi) Use the Tukey-Kramer method to determine which of the means for 750◦C, 800◦C and 850◦C, if any, differ from the mean for 900◦C at the 5% significance level. (vii) Which is the more powerful method for finding all the treatments whose means differ from the 900◦C mean, the Bonferroni method or the Tukey-Kramer method?
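The sums of squares behind Questions 1, 2 and 4 can be computed from first principles; this Python sketch uses made-up measurements (not the leachate or tensile-strength data) and numerically verifies the identity SST = SSTr + SSE from Question 2.

```python
# One-way ANOVA from first principles (hypothetical data; in practice a
# routine such as scipy.stats.f_oneway gives the same F statistic)
groups = [
    [5.21, 4.65, 5.12, 4.98],   # treatment 1 (illustrative values)
    [4.10, 4.35, 4.62, 4.27],   # treatment 2
    [5.66, 6.02, 5.80, 5.91],   # treatment 3
]

N = sum(len(g) for g in groups)     # total number of observations
I = len(groups)                     # number of treatments
grand_mean = sum(x for g in groups for x in g) / N

# SSTr: between-treatment sum of squares
sstr = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# SSE: within-treatment (error) sum of squares
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
# SST: total sum of squares
sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

mstr = sstr / (I - 1)   # treatment mean square
mse = sse / (N - I)     # error mean square, unbiased for sigma^2 (Question 1)
F = mstr / mse          # test statistic for H0: all treatment means equal
```

Large F relative to the F(I−1, N−I) critical value leads to rejecting H0, which is exactly the decision in Questions 4 and 5(i).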
MATH GR5280, Capital Markets & Investments Final Project

The aim of this Final Project is to practically implement the ideas from the course, specifically from Chapters 7 and 8 of [BKM13]. Using Bloomberg, you will be given 20 years of recent historical daily total return data for ten stocks, which belong in groups to three to four different sectors (according to Yahoo!finance), one equity index (S&P 500) and a proxy for the risk-free rate (the 1-month Fed Funds rate). Additionally, you will be given contemporaneous ESG [ESG3] scores data, also from Bloomberg, for all of your companies, with detailed explanations of them. In order to reduce non-Gaussian effects, you will need to aggregate the daily data to monthly observations, and based on those monthly observations, you will need to calculate all proper optimization inputs for the full Markowitz Model ("MM") alongside the Index Model ("IM"). Using these optimization inputs for MM and IM, you will need to find the regions of permissible portfolios (efficient frontier, minimal risk portfolio, optimal portfolio, and minimal return portfolios frontier) for the following four cases of problems:

1. This optimization is designed to simulate the typical limitations existing in the U.S. mutual fund industry: a U.S. open-ended mutual fund is not allowed to have any short positions; for details see the Investment Company Act of 1940, Section 12(a)(3) (https://www.law.cornell.edu/uscode/text/15/80a-12): w_i ≥ 0 for all i;

2. Now, having the efficient risky portfolio {ŵ_i}, i = 1, …, 10, from the solution to problem 1 above, you will need to solve problem 1 with the following additional constraint on ESG:

3. This optimization constraint is designed to simulate Regulation T of FINRA (https://www.finra.org/rules-guidance/key-topics/margin-accounts), which allows broker-dealers to let their customers hold positions, 50% or more of which are funded by the customer's account equity:

4.
Lastly, having the efficient risky portfolio {ŵ_i}, i = 1, …, 10, from the solution to problem 3 above, you will need to solve problem 3 with the following additional constraint on ESG:

You will need to numerically solve the above problems using the template "FinalProject AlexeiChekhlov Group0.xlsx" and submit your numerical solutions as such a file, with the filename adjusted to "FinalProject FirstnameLastname Group(your group#).xlsx". Please do not insert or delete any cells, and keep the existing format – it is very nicely done, and the graphs will allow you to "see" your solutions. The areas of cells that you will need to fill in with your numerical solutions are as follows. The points for MM: P2:AC3, P5:AC6, P8:AC9, P11:AC12. The curves (frontiers) for MM: C33:F113, I33:L113, O33:R153. The points for IM: AI2:AV3, AI5:AV6, AI8:AV9, AI11:AV12. The curves (frontiers) for IM: AM33:AP113, AS33:AV113, AY33:BB153. The grading will be done by comparing your tabulated results to exact solutions. The calculations should be done on a Windows computer with licensed Microsoft Office installed.

Again, you will be given 20 years of daily data of total returns for the S&P 500 index (ticker symbol "SPX") and for ten stocks (for ticker symbols, see the table below), such that there are three to four sectors of stocks, with the stocks in each group belonging to one (Yahoo!finance) sector, and an instrument representing the risk-free rate, the 1-month Fed Funds rate (ticker symbol "FEDL01"). Note that the stocks in each group are completely different. Therefore, each group will have its own results and conclusions.
Below, please find the table of stock ticker symbols (aka tickers) for each group to work with:

            Group #1   Group #2   Group #3   Group #4
  Stock #1  ADBE       AMZN       NVDA       QCOM
  Stock #2  IBM        AAPL       CSCO       AKAM
  Stock #3  SAP        CTXS       INTC       ORCL
  Stock #4  BAC        JPM        GS         MSFT
  Stock #5  C          BRK/A      USB        CVX
  Stock #6  WFC        PGR        TD CN      XOM
  Stock #7  TRV        UPS        ALL        IMO
  Stock #8  LUK        FDX        PG         KO
  Stock #9  ALK        JBHT       JNJ        PEP
  Stock #10 HA         LSTR       CL         MCD

Below, please find the table which shows the details for each of the stocks and which stocks belong to the same sector in each group.

  #  Group #1  Full Name                                    Sector (Yahoo!finance)
  1  ADBE      Adobe Inc.                                   Technology
  2  IBM       International Business Machines Corporation  Technology
  3  SAP       SAP SE                                       Technology
  4  BAC       Bank of America Corporation                  Financial Services
  5  C         Citigroup Inc.                               Financial Services
  6  WFC       Wells Fargo & Company                        Financial Services
  7  TRV       The Travelers Companies, Inc.                Financial Services
  8  LUV       Southwest Airlines Co.                       Industrials
  9  ALK       Alaska Air Group, Inc.                       Industrials
  10 HA        Hawaiian Holdings, Inc.                      Industrials

  #  Group #2  Full Name                            Sector (Yahoo!finance)
  1  AMZN      Amazon.com, Inc.                     Consumer Cyclical
  2  AAPL      Apple Inc.                           Technology
  3  FFIV      F5 Networks, Inc.                    Technology
  4  JPM       JPMorgan Chase & Co.                 Financial Services
  5  BRK/A     Berkshire Hathaway Inc.              Financial Services
  6  PGR       The Progressive Corporation          Financial Services
  7  UPS       United Parcel Service, Inc.          Industrials
  8  FDX       FedEx Corporation                    Industrials
  9  JBHT      J.B. Hunt Transport Services, Inc.   Industrials
  10 LSTR      Landstar System, Inc.                Industrials

  #  Group #3  Full Name                        Sector (Yahoo!finance)
  1  NVDA      NVIDIA Corporation               Technology
  2  CSCO      Cisco Systems, Inc.              Technology
  3  INTC      Intel Corporation                Technology
  4  GS        The Goldman Sachs Group, Inc.    Financial Services
  5  USB       U.S. Bancorp                     Financial Services
  6  TD CN     The Toronto-Dominion Bank        Financial Services
  7  ALL       The Allstate Corporation         Financial Services
  8  PG        The Procter & Gamble Company     Consumer Defensive
  9  JNJ       Johnson & Johnson                Healthcare
  10 CL        Colgate-Palmolive Company        Consumer Defensive

  #  Group #4  Full Name                   Sector (Yahoo!finance)
  1  QCOM      QUALCOMM Incorporated       Technology
  2  AKAM      Akamai Technologies, Inc.   Technology
  3  ORCL      Oracle Corporation          Technology
  4  MSFT      Microsoft Corporation       Technology
  5  CVX       Chevron Corporation         Energy
  6  XOM       Exxon Mobil Corporation     Energy
  7  IMO       Imperial Oil Limited        Energy
  8  KO        The Coca-Cola Company       Consumer Defensive
  9  PEP       PepsiCo, Inc.               Consumer Defensive
  10 MCD       McDonald's Corporation      Consumer Cyclical

Using this data and the template Excel spreadsheet, you will need to make all the necessary calculations to produce the Permissible Portfolios Region, which combines the Efficient Frontier, the Minimal Risk (or Variance) Frontier, and the Minimal Return Frontier for a given set of constraints (1–4 above). The Minimal Return Frontier and the Efficient Frontier together form the Minimal Risk (or Variance) Frontier – it is just a matter of reformulating the optimization problem, as follows: Minimal Return Frontier: … ; Minimal Risk Portfolio: …
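The full ten-asset problems with the no-short and Regulation T constraints need a numerical solver (the Excel template's Solver is the intended tool). As a hand-checkable sanity test of your minimum-variance outputs, the unconstrained two-asset global minimum-variance portfolio has a closed form, sketched here in Python with hypothetical monthly variance/covariance inputs:

```python
import math

# Two-asset global minimum-variance weights (closed form):
# w1 = (var2 - cov12) / (var1 + var2 - 2*cov12), w2 = 1 - w1.
# The inputs below are hypothetical monthly values, not the Bloomberg data.
var1, var2, cov12 = 0.0400, 0.0225, 0.0060

w1 = (var2 - cov12) / (var1 + var2 - 2 * cov12)
w2 = 1.0 - w1                                     # fully invested: w1 + w2 = 1

# Portfolio variance and volatility at the minimum-variance point
port_var = w1**2 * var1 + w2**2 * var2 + 2 * w1 * w2 * cov12
port_vol = math.sqrt(port_var)
```

By construction the minimum-variance portfolio's variance cannot exceed either asset's own variance; checking that property in your spreadsheet output is a quick way to catch sign or cell-range errors.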
SPH4U - Physics for University Test 5 - Unit 4 Waves

Multiple Choice - ***Circle one choice for each question.***

1. (10 points) [K, T]
(a) Which of the following is not a category used to describe waves? A. Transverse B. Mechanical C. Longitudinal D. Energetic
(b) Electromagnetic waves are a special category of waves that do not require ______. A. energy to propagate B. a medium to propagate C. a vacuum to propagate D. a slit for diffraction
(c) The refractive index is a ratio that measures ______. A. the change in velocity of a wave between mediums B. the angle at which a wave will bend C. the wavelength of an incident wave D. the velocity of light in a vacuum
(d) Perfect ______ is achieved when two waves are out of phase with each other (peak to trough). A. constructive interference B. constructive diffraction C. destructive interference D. destructive refraction
(e) The photoelectric effect depends on which wave characteristic? A. Frequency. B. Amplitude. C. Velocity. D. Intensity.
(f) Which of the following processes is achieved by passing light through a filter? A. Scattering of white light. B. Polarization. C. Thin-film interference. D. Reflection.
(g) A π phase shift results from which of the following scenarios? A. Refraction in a medium that decreases wave speed. B. Refraction in a medium that increases wave speed. C. Reflection off a medium that would have increased wave speed. D. Reflection off a medium that would have decreased wave speed.
(h) Which of the following theories explains why light can produce an interference pattern in a single-slit experiment? A. Einstein's theory of wave/particle duality. B. Newton's theory of corpuscles. C. Huygens' theory of wavelets. D. Snell's law.
(i) Which of the following is NOT a method of polarization? A. Scattering B. Diffraction C. Double refraction D. Reflection
(j) ______ is a product of thin-film interference. A. Iridescence B. White-light scattering C. Single-slit diffraction D.
Polarization

Short Answers - ***Give complete answers, SHOW ALL OF YOUR WORK, and don't forget to use GRASP - Given, Required, Analysis, Solve, Paraphrase.***

2. (10 points) [A, T] In water where light cannot pass through, dolphins resort to echolocation rather than eyesight.
(a) In air, where n_air = 1.00, the speed of sound is measured as 380 m/s. If n_water is 1.33, calculate the new speed of sound. (3 pts)
(b) The dolphin uses its echolocation to judge the distance of a nearby fish. A wave is sent out and received again in 0.008 s; calculate the distance of the fish from the dolphin. (4 pts)
(c) The frequency of the wave emitted by the dolphin is 255 kHz; what is the wavelength? (3 pts)

3. (10 points) [C, A, T] A wave passes from air into ethyl alcohol (n = 1.36) at an angle of incidence of 37°.
(a) Using the template provided below, draw the expected refraction of the wave. (Include the following information: the normal line, and wave fronts to represent the wavelength.) (2 pts)
(b) If the initial wavelength of the wave is 940 nm, calculate both the angle of refraction and the new wavelength. (4 pts)
(c) In this instance, the refraction is not perfect and there is partial reflection. i) In the space below, calculate and justify the angle of reflection. (2 pts) ii) Using the information given within this question, what do we know about the phase of the reflected wave? (2 pts)

4. (15 points) [K, A] When light passes through a narrow opening or around an obstacle, we often see diffraction.
(a) A wave of light with λ = 0.82 cm passes through an opening with an aperture of 0.6 cm. Will there be noticeable diffraction? (2 pts)
(b) A second slit is introduced and a pattern emerges containing lines of constructive and destructive interference.
In the space below, explain how the following pattern is generated: (2 pts) Constructive interference / Destructive interference
(c) If the distance between the two slits is 0.045 m, determine the angle θ of a point where both waves meet at the 3rd anti-nodal line. (4 pts)
(d) Calculate the difference in path length between the two waves at the 3rd anti-nodal line. (2 pts)

5. (15 points) [K, C] In the late 1800s/early 1900s, scientists witnessed a new phenomenon known as the photoelectric effect. This would lead to the theory of wave-particle duality. Heinrich Hertz was the first to witness the photoelectric effect: while experimenting with UV rays, he found that when they were shone on a metal sheet, electrons would begin to escape.
(a) With the understanding of classical wave theory, explain what Hertz was expecting to see in his experiment with the amplitude of waves. (Hint: energy is directly proportional to amplitude.) (4 pts)
(b) To his surprise, Hertz found that amplitude would not trigger the photoelectric effect. He found, however, that a variation in wavelength would. This was built upon later by Einstein, who gave the relationship E = hf. i) Explain why varying the wavelength would trigger the photoelectric effect. (Answer using the relationship between wavelength and frequency.) (2 pts) ii) Explain how waves above a threshold frequency cause the electrons to escape. (4 pts)
(c) It was determined through the electron double-slit experiment that quantum matter (like electrons and photons) has properties of both waves and particles. A limitation in measuring quantum matter is the observer effect; in the space below, explain the observer effect. (5 pts)

6. (10 points) [K, T] Some sunglasses reduce the intensity of light by using a polarizing filter.
(a) If light arrives at the glasses with an intensity of 145 cd at an angle of 20°, calculate the new intensity through the polarized glasses.
(Note: the SI unit for light intensity is the candela (cd).) (2 pts)
(b) An analyser is used to test light for polarization; explain how an analyser is used. (3 pts)
(c) In the space below, explain how light reflecting off a new medium can be polarized. (3 pts)
(d) Light travels through glass (n_glass = 1.52) and reflects off water (n_water = 1.33); calculate the angle needed for perfect polarization (also known as Brewster's angle). (2 pts)

7. (10 points) [K, A] When light waves pass through a single slit, it was found that an interference pattern is still generated.
(a) Single-slit interference was explained by Huygens and his theory of wavelets. In the space below, explain how the wavelet theory can create a single-wave interference pattern. (4 pts)
(b) A light source has a wavelength of 550 nm and passes through a slit with an aperture of 5 μm. Calculate the angular width of the first-order bright fringe, in degrees. (3 pts)
(c) Calculate the distance (Δy) between the 1st- and 2nd-order bright fringes if the screen that the pattern is projected onto is 1.2 cm away. (3 pts)

8. (10 points) [C, A, T] Oil slicks are an example of a phenomenon known as thin-film interference.
(a) Explain how partial reflection and partial refraction lead to thin-film interference. (4 pts)
(b) The thickness of the film dictates whether light waves will interfere in a constructive or destructive way. Explain how this is the case. (3 pts)
(c) A wave of light with wavelength 484 nm enters from a gasoline medium (n_gasoline = 1.40) into a water medium (n_water = 1.33). Calculate the new wavelength. (3 pts)
(d) Will constructive interference happen at whole or partial integers of the wavelength in this example? Justify your answer. (2 pts)
(e) What are the three smallest thicknesses of water that will result in constructive interference? (3 pts)

9. (10 points) [A, T] A general rule states that the resolution of an interference pattern increases with the number of slits used.
This leads to the use of a diffraction grating in most laboratory settings.
(a) A wave of light is passed through a diffraction grating (300 lines/cm) with a 3rd-order maximum at an angle of 23°. Calculate the wavelength. (3 pts)
(b) Calculate the frequency of the wave in part (a). (Note: assume the light is travelling in air with a velocity of c.) (2 pts)
(c) When white light is used with a diffraction grating, the white light scatters into a rainbow at each maximum. Explain how a diffraction grating creates the rainbow spectrum pattern seen in the image below. (3 pts)
(d) Explain why there is no rainbow scattering at the zeroth-order maximum. (2 pts)
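Several of the numeric parts above are single-formula substitutions. As a quick self-check (not a substitute for showing GRASP work), this Python sketch evaluates Malus's law, Brewster's angle, and Snell's law using the values given in questions 3 and 6; the rounded outputs are approximations.

```python
import math

# Malus's law: I = I0 * cos^2(theta)  (question 6a: I0 = 145 cd, theta = 20 deg)
I0, theta = 145.0, math.radians(20.0)
I = I0 * math.cos(theta) ** 2                     # ≈ 128.0 cd

# Brewster's angle: tan(theta_B) = n2 / n1  (question 6d: glass -> water)
n_glass, n_water = 1.52, 1.33
theta_B = math.degrees(math.atan(n_water / n_glass))   # ≈ 41.2 degrees

# Snell's law: n1 sin(theta1) = n2 sin(theta2)  (question 3: air -> ethyl alcohol)
theta2 = math.degrees(math.asin(math.sin(math.radians(37.0)) / 1.36))  # ≈ 26.3 degrees
```

Remember that the graded answers require units and the GRASP structure, not just the final numbers.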
SAS Basics Bootcamp COVID-19 Data Analysis (Final Project) Instructions (35 points) As the culminating assignment for this course, students will complete a data analysis using real-world COVID-19 data. This assignment has been structured to closely resemble the format of case study questions and assignments typically encountered in data analyst job interviews. This assignment provides an excellent opportunity for you to showcase your skills in analytical thinking, research, data analysis, and effective communication. Case Study As a data analyst for the Department of Public Health in the state of [DESIGNATED STATE], you are tasked with assessing the impact of COVID-19 to guide resource allocation effectively. Using state-level data from the Johns Hopkins University's Center for Systems Science and Engineering (CSSE), your analysis will focus on one specific COVID-19 outcome (e.g., tests, confirmed cases, hospitalizations, or deaths). The Governor is particularly interested in understanding how [DESIGNATED STATE] compares to neighboring states in managing the COVID-19 pandemic. Your role involves analyzing the data to uncover trends, highlight disparities, and identify areas needing intervention or support. This analysis is vital for informing decisions made by policymakers, healthcare providers, and the public to allocate resources effectively. Project Goals 1. Provide Context and Data Overview o Offer a clear description of the Johns Hopkins CSSE data as the source. o Explain the purpose of the analysis and its significance for public health and policymaking. 2. Analyze Daily Trends o Focus on daily trends for the selected COVID-19 outcome in [DESIGNATED STATE] and the regional average during the study period (June 1–7, 2020). o Present data clearly using graphs or tables. o Summarize findings, emphasizing key observations and disparities between your state and the regional average. 3. 
Recommend Strategies for Resource Allocation o Based on findings, provide actionable recommendations to help the Governor optimize resources and policy decisions. o Align recommendations with observed trends and disparities. About the data: The “study period” is from June 1 to June 7, 2020. ● COVID-19 data comes from the Center for Systems Science and Engineering at Johns Hopkins (https://github.com/CSSEGISandData). ○ Eight (8) SAS datasets with daily COVID-19 information for the study period (“cd0601”- “cd0607”) and for the day prior to the start of the study period (“cd0531”) ● Region data comes from U.S. Census (https://www2.census.gov/geo/pdfs/maps-data/maps/reference/us_regdiv.pdf) ○ One (1) SAS dataset indicating region for each U.S. state/territory, based on U.S. Census classification (“region”) Information should be presented as short paragraphs (occasional bullet points are acceptable) and data visualizations to illustrate key information clearly and concisely to stakeholders potentially unfamiliar with epidemiologic and statistical language. Grading Rubrics Section 1: Data Source and Analysis Purpose (2 points) · Clearly describe the data source (Johns Hopkins CSSE) and its relevance to the analysis. · Explain how the analysis aligns with public health policy and resource allocation goals. Section 2: Data Preparation (9 points total) · Compute daily new cases (or the chosen outcome) for [DESIGNATED STATE] (3 points). · Compute the regional average for the same metric (3 points). · Integrate data by merging COVID-19 data with Census region data (3 points). Section 3: Daily Trends Analysis (10 points total) · Clearly analyze and describe trends for [DESIGNATED STATE] during June 1–7, 2020 (5 points). · Highlight notable disparities between your state and the regional average (5 points). Section 4: Recommendations for Resource Allocation (4 points) · Provide actionable, evidence-based recommendations. 
· Ensure recommendations address disparities and observed trends. Additional Grading Criteria · Clarity and Coherence (2 points): Write clearly, logically, and professionally. · Data Visualization (3 points): Use effective graphs and tables to present data. · SAS Code Quality (5 points): Write well-documented, readable SAS code with appropriate comments. Formatting requirements: - The document should be single-spaced with 0.5-inch margins and should not exceed two pages, including visualizations. Submit the document as a Word document (.doc or .docx) - Please do not include questions. Do not write it in a Q&A format. Write the analysis report for the stakeholders and the public (Remember that you are a government employee, which means that most of your communications are public record: source 1, 2, 3). - Submit SAS code in either a separate document or as an appendix page below the write-up.
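The submission itself must be in SAS. Purely to illustrate the Section 2 data-preparation logic (differencing cumulative counts into daily new cases, then averaging across the states in a region), here is a Python sketch; the state names, counts, and region labels are made up and merely stand in for the `cd0531`–`cd0607` and `region` datasets.

```python
# Cumulative counts for May 31 (baseline) through June 7 (hypothetical numbers
# standing in for the cd0531..cd0607 SAS datasets)
cumulative = {
    "StateA": [100, 110, 125, 130, 142, 150, 163, 170],
    "StateB": [200, 204, 214, 220, 231, 240, 248, 260],
}
region = {"StateA": "South", "StateB": "South"}   # stand-in for the region dataset

# Daily new cases: first-difference of each cumulative series
daily = {s: [c[i] - c[i - 1] for i in range(1, len(c))]
         for s, c in cumulative.items()}

# Regional average of daily new cases, per study day (the merge step pairs
# each state's series with its Census region before averaging)
states = [s for s in daily if region[s] == "South"]
regional_avg = [sum(daily[s][d] for s in states) / len(states)
                for d in range(len(daily[states[0]]))]
```

In SAS the same steps would typically use a MERGE on the state key, a lagged difference for daily counts, and PROC MEANS by region; the sketch just shows why the extra May 31 dataset is needed (the first difference consumes one day).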
Assessment Brief PROG2004 Object-Oriented Programming

Module 4 – Advanced exception handling

The following part of the assessment covers the content in Module 4.

Part 4 – Implementing exception handling
At the moment, you have not implemented any exception handling in your program. For this part of the assignment:
• Where applicable, make sure that all setters in your program confirm that the values they are writing to your instance variables are valid. If they are not, throw an IllegalArgumentException and print an appropriate error message.
• Add any other exception handling that you feel is appropriate to your program.

Demonstration
In the partFour method in the AssessmentTwo class:
• Using one of the setters that you added exception handling to:
o Pass a valid value to the method and show that the instance variable is set
o Pass an invalid value to the method and show that the exception is caught

Title: Assessment 2 (Continue from Project 2 from the previous Assessment)
Deadline: 11:30 AM, 16 December 2024
Submission: Code + Video using a USB drive

Module 5 – Input/output

The following part of the assessment covers the content in Module 5.

An important part of many programs is the ability to back up data to a file and then restore it as needed. In this section of the assignment, we will add this ability to our program.

Hint for exporting and importing data
A common way to store data in a file that needs to be imported later is to use comma-separated values (CSV). This means that we store a record on a single line, and we separate values using a comma (,). For example, imagine an object of a class called Animal has the following information:
• species: Dog
• breed: Poodle
• colour: Brown
• name: Fido
• age: 7
You could store the Animal object in the file on a single line like:
Dog, Poodle, Brown, Fido, 7
When you read the file, each line in the file will contain the details for a single Animal object.
You can then use the split() method from the String class to split the line into the individual values, and then use the values to create a new Animal object.

Part 5 – Writing to a file
The Classroom class is missing the ability to back up the Members who have signed up for the Classroom. For this part of the assignment:
• Add a method to the Classroom class that writes the details of all of the Members that have signed up for the Classroom (i.e. stored in the LinkedList) to a file. The details for each Member should be written on their own line.
• You must make sure to add all appropriate exception handling and error messages.

Demonstration
In the partFive method in the AssessmentTwo class:
• Create a new Classroom.
• Add a minimum of 5 Members to the Classroom (i.e., the LinkedList).
• Export the Members to a file.

Part 6 – Reading from a file
The Classroom class is also missing the ability to restore the Members who have signed up for the Classroom. For this part of the assignment:
• Add a method to the Classroom class that can read the file that was created in the previous section.
• When reading the file, you need to sign up all Members for the Classroom (i.e., add them to the LinkedList). You must make sure to add all appropriate exception handling and error messages.
Note: If you cannot enrol the Members in the Classroom (i.e., add them to the LinkedList), you will still get marks for reading the file.

Demonstration
In the partSix method in the AssessmentTwo class:
• Create a new Classroom.
• Import the file you created in the previous part of the assignment.
• Print the number of Members in the LinkedList to confirm that the correct number of Members were imported.
• Print all Members in the LinkedList to confirm that the details of each Member were imported correctly.

Module 6 – Concurrency

The following part of the assessment covers the content in Module 6.

Part 7 – lock() and unlock() methods
You are using a LinkedList to store the Members signed up for a Classroom.
However, a LinkedList is not thread-safe. This means that if multiple threads were performing operations on the Members signed up for a Classroom, you could encounter issues. For this part of the assignment:
• Use the lock() and unlock() methods to protect any critical sections of code in the Classroom class that perform operations on the LinkedList that stores the Members signed up for a Classroom.
• You must make sure to add all appropriate exception handling and error messages.

Resources
To complete the task, you are recommended to:
• Study the Modules 1–6 materials and complete all learning activities
• Take an active role in the weekly tutorials and workshops.

Task Submission
You are required to submit two items for this assessment:
• Your Java project 2 (after updating it)
• A 5-minute video explaining the new parts.

Assessment Criteria
Please refer to the rubric provided in the assessment folder for the assessment criteria. Marking criteria include:
• Java code compiles with Java 17 LTS
• Use of correct coding style, including the use of comments
• Accuracy of coding
• Use of suitable coding structures
• Correct submission and naming conventions of assessment items as required
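The assignment itself must be written in Java 17. The CSV round trip described in the Module 5 hint is language-agnostic, so here is a compact Python sketch of the same logic (write one record per line, then split each line on commas to rebuild the objects); the second record is hypothetical, added only to show multiple lines.

```python
import io

# Records mirroring the Animal example in the brief (second record is made up)
members = [("Dog", "Poodle", "Brown", "Fido", 7),
           ("Cat", "Siamese", "Cream", "Milo", 3)]

# Export: one record per line, fields separated by commas
buf = io.StringIO()                     # stands in for a file on disk
for species, breed, colour, name, age in members:
    buf.write(f"{species},{breed},{colour},{name},{age}\n")

# Import: split each line back into fields (Java: String.split(","))
buf.seek(0)
restored = []
for line in buf:
    species, breed, colour, name, age = line.strip().split(",")
    restored.append((species, breed, colour, name, int(age)))  # parse the int
```

In the Java version the same pattern applies: a BufferedWriter (or PrintWriter) in a try-with-resources block for export, a BufferedReader plus split(",") for import, and IOException handling around both, per the "appropriate exception handling" requirement.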
03CIT4057 Introduction to Computer Programming Project

Overview:
The project will be 30% of the overall grade for the course. The project is done in teams. Each team is formed by 3-5 students.

Background:
In cryptography, a Caesar cipher is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on. The Caesar cipher is named after Julius Caesar, who, according to Suetonius, used it with a shift of three (A becoming D when encrypting, and D becoming A when decrypting) to protect messages of military significance. While Caesar's was the first recorded use of this scheme, other substitution ciphers are known to have been used earlier.

Specification:
• Decrypt a Caesar-cipher-encrypted text.
• The encrypted text is stored in a file.
• The program reads the file and shows the text after decryption.

Bonus:
• Establish a web server which accepts the encrypted message and then presents the decrypted text.

Double bonus:
• In addition to manual input, the web server accepts encrypted text and returns the decrypted text in JSON format.

Deliverables:
Group work: For every item below, each group submits 1 copy only.
1. The group should submit the source code to the GitHub group repository.
2. PowerPoint for the presentation. The PPT SHOULD include the member list. The group should submit the PPT to Blackboard.

Individual work: For every item below, every student should submit 1 copy.
3. Each student must complete a peer review for every member of the group.
He/she should fill in the 360 review form and submit it to Blackboard.

Due date:
All items mentioned in the Deliverables section should be submitted on or before 2024-12-18 23:59:59.

Presentation:
Each team will present their project in the last class.

Appendix A: Breaking the Cipher
To find the "shift", you need to know the following technique for cracking Caesar ciphers, which has been around for over a thousand years. Any language such as English has a known frequency distribution for each letter. For example, the letter "E" is the most common letter in English, making up about 12% of the letters on average (ignoring case). The letter "T" is next (about 9%), followed by "A" (about 8%), and so on. The point is that only the order "E", "T", "A" matters, not the exact percentages.

The procedure begins by finding the most common letter in the cipher-text. You can guess that this letter maps to "E". You can then find the "shift" from the expected most common letter "E" to the most common letter in the cipher-text. For example, if the most common letter in the cipher-text is "H", you know that the shift from "E" to "H" is 3. You should check that the shift for the next most common letter "T", and the third most common letter "A", is also 3. Once you know the shift, you can apply it to all the letters in the cipher-text and recover the original plain-text message.

What about spaces between words and punctuation? In the real world, there is no space or punctuation in a cipher-text, because those are useful clues for deciphering. However, there are spaces in the cipher-text for this project, because they will help you recognize whether your deciphering is correct. But you will need to ignore spaces when counting letters (if you forget to ignore them, beware that the space will be the most common character).

The suggested algorithm is:
1. Read the cipher-text.
2. Get a count of each character in the entire cipher-text (ignoring spaces).
3. Find the most common character.
4.
Find the shift from "E" to that most common character.
5. Check that the shift also works for the next most common letters.
6. Using the shift, decode each character of the cipher-text and print it.
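The six steps above can be sketched directly in Python (assuming English text where "E" dominates; the sample sentence below is invented for the demonstration):

```python
from collections import Counter

def crack_caesar(ciphertext: str) -> str:
    """Guess the shift by mapping the most common letter to 'E', then decode."""
    # Steps 2-3: count letters only, ignoring spaces and punctuation
    counts = Counter(c for c in ciphertext.upper() if c.isalpha())
    most_common = counts.most_common(1)[0][0]
    # Step 4: shift from 'E' to the most common cipher letter
    shift = (ord(most_common) - ord("E")) % 26
    # Step 6: undo the shift for every letter, keeping spaces as-is
    out = []
    for c in ciphertext:
        if c.isalpha():
            base = ord("A") if c.isupper() else ord("a")
            out.append(chr((ord(c) - base - shift) % 26 + base))
        else:
            out.append(c)
    return "".join(out)

# Encrypt a deliberately E-heavy sample with shift 3, then crack it
plain = "MEET ME WHERE THE THREE TREES MEET"
cipher = "".join(chr((ord(c) - ord("A") + 3) % 26 + ord("A")) if c.isalpha() else c
                 for c in plain)
```

Step 5 (verifying the shift against "T" and "A") is left as an exercise: compare the shifts implied by the second and third most common cipher letters and fall back to the next candidate shift if they disagree, which matters for short or atypical texts.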