Assignment Remit

Programme Title: BSc Accounting and Finance
Module Title: Widening Accounting Horizons A
Module Code: 33975
Assignment Title: Business Report
Level: LC
Weighting: 50%
Hand Out Date: 11 November 2024
Deadline Date & Time: 20 December 2024, before 12pm (12 noon)
Feedback Post Date: 21st working day after the deadline date
Assignment Format: Other
Assignment Length: 1,000 words
Submission Format: Online, Individual

ASSESSMENT DETAILS

Widening Accounting Horizons A (08 33975)
Assignment 2: Individual 1,000 word Business Report

Attendance and participation in lectures and seminars are vital for your success in this module: if you do not attend, you will not have company transactions to include and discuss in your report.

The Business Report relates to the second half of the module (focusing on Saasu and business processes). There is a required format for the presentation of this report: a Word document with “cropped screengrabs” of the relevant sections of your Saasu ERP software. It should demonstrate well-organised presentation and coherent writing.

Detail: Saasu numbers plus explanatory narrative

During your lectures you will receive guidance and tuition on how a company accounts for transactions. You will also have practical experience of ‘posting’ those transactions in your own virtual company that you set up in Saasu.

NB DO NOT WORRY ABOUT PERFECTION: your transactions do not have to be 100% accurate, just close enough to prove that you understand what you are doing. This aligns with many businesses, where financial statements are not accurate to the penny but portray a ‘materially’ correct representation of the business's financial position. (Some of you will explore this in more detail next year if you select the AUDIT module.)

Your transactions should mirror those that I ask you to make, and not include any others. There is a risk that if you post random or wrong transactions you will lose some marks in the “numbers” section of the assignment below.
If you do accidentally create the wrong transactions, you can delete them, but this can be quite complicated depending on how much other work you have done. Follow the guidance in the interactive lectures and, if needed, spend time afterwards ensuring your Saasu transactions are as close as possible to what has been requested. You do not have to be perfect with your work; this is a new area and will be challenging for some of you (especially if you miss lectures or don’t listen!).

The detail (word count 1,000 words, not including screenshots):

This report should include the following:

1. Screenshots of the following screens:
   o Dashboard at year end*
   o Trial balance at year end
   o Inventory (also known as Stock) at year end
   o Accounts Payable (or Trade Creditors) at the year end
   o Accounts Receivable (or Trade Debtors) at the year end
2. Narrative relating to the following screenshots:
   1. Trial balance at year end: description and identification of any key / unusual balances not already discussed in 2, 3 or 4 below.
   2. Inventory at year end by quantity and value: describe the type of inventory that exists in the business and whether management need to action anything, plus any other points that you think are relevant.
   3. Accounts Payable at the year end: describe what AP relates to and why management do not want this to be too small, plus any other points that you think are relevant.
   4. Accounts Receivable at the year end: describe what AR relates to and why management want this figure to be small, plus any other points that you think are relevant.
3. Good professional appearance: think about how you want to structure the above to make it look like a professional piece of work (e.g. headings, titles, etc.).

* NB Year end for the purpose of this assignment occurs immediately after the final hands-on Saasu session that you do, i.e. your final Saasu position in your software. It does not relate to any of my add-on sessions; these will occur outside Saasu.
Other information

Rubric: this is guidance on how you will be marked out of 100 on the Saasu Business Report section.

Description / Long description / Weighting
- Saasu: Overall presentation. The Saasu assignment should be clearly structured to address the assignment given; screenshots* should be presented clearly along with supporting narrative. Weighting: 20
- Saasu: Numbers reported. Marks will be awarded for how close your uploaded Saasu trial balance is to the model answer (including accounts payable / accounts receivable / inventory). Weighting: 40
- Saasu: Narrative. Marks will be awarded for successful interpretation of the screenshots uploaded, and for identification of any potential errors or key points if applicable. Weighting: 40

* Screenshots should be captured using the Snip & Sketch function, the Print Screen function or similar. Your screenshots must be readable; otherwise I will not be able to award marks for this area. Students who struggle to get good screengrabs may use photos, but you will drop some marks given that this is a business report.
Epidemiology GPH-GU 2106, Section 013
FALL 2024

COURSE DESCRIPTION
Epidemiology is the study of the distribution and determinants of health and disease in different human populations and the application of methods to improve disease outcomes. As such, epidemiology is the basic science of public health. This course is designed to introduce students in all fields of public health to the background, basic principles and methods of public health epidemiology. Topics covered in this course include: basic principles of epidemiology; measures of disease frequency; epidemiologic study designs (experimental and observational); bias; confounding; outbreak investigations; screening; causality; and ethical issues in epidemiologic research. In addition, students will develop skills to read, interpret and evaluate health information from published epidemiologic studies.

COURSE FORMAT
This in-person, three (3) credit course consists of two main components: a lecture and a discussion section. In total, students will receive 2,250 minutes of live in-person instruction and discussion with a recitation instructor. Attending both the lecture and the discussion section is equally important to ensuring success in the course. All students are required to attend lecture and discussion sections.

COURSE OVERVIEW
Lectures will follow a didactic format that will provide students with information and explanation about the importance of the fundamental concepts of epidemiology as they apply to measuring and understanding population-level health. Lectures will focus on introducing the topics outlined in the course schedule and on providing examples of how these concepts are measured via the application of epidemiologic study designs. The overarching goal of the discussion sessions is to enhance familiarity and confidence in the concepts covered in the lectures.
In order to meet this goal, students will work in small groups on case studies and exercises developed to provide connections between concepts covered in lectures and real-world example scenarios. In addition, these weekly sessions provide another opportunity for students to clarify any concepts presented in the online lecture materials, as well as review prior and/or upcoming homework assignments.

PRE-REQUISITES
There are no formal pre-requisite courses for this class. However, success in this course requires an understanding of basic arithmetic and algebraic concepts and the ability to apply these concepts. Specifically, students should feel comfortable working with fractions, decimals and multi-step arithmetic problems, and extracting numerical information described in text and graphical formats. For further guidance or support, please refer to the information and resources provided in the Foundations for Epidemiology and Biostatistics primer (GPH-GU 5010).

LEARNING OBJECTIVES & FOUNDATIONAL COMPETENCIES
Each learning objective is listed with its foundational competencies, lecture(s) and assessment(s).

1. To explain the role of epidemiology in the field of public health.
   Foundational competencies:
   - Apply epidemiological methods to the breadth of settings and situations in PH practice
   Lecture: 1
   Assessment: Midterm Exam
2. To identify appropriate measures of morbidity and mortality used to examine the major causes and trends of morbidity and mortality in the US and other populations.
   Foundational competencies:
   - Apply epidemiological methods to the breadth of settings and situations in PH practice
   - Analyze quantitative and qualitative data using biostatistics, informatics, computer-based programming and software as appropriate
   Lectures: 2-4
   Assessment: Hwk 1, Midterm Exam
3. To distinguish between the role and application of quantitative versus qualitative methods in describing and assessing a population’s health.
   Foundational competencies:
   - Select quantitative and qualitative data collection methods appropriate for a given public health context
   - Apply epidemiological methods to the breadth of settings and situations in PH practice
   Lecture: 5
   Assessment: Hwk 2
4. To describe epidemiologic study designs used to examine the health status of a population and be able to evaluate the strengths and limitations of each.
   Foundational competencies:
   - Apply epidemiological methods to the breadth of settings and situations in PH practice
   Lectures: 5-7, 9
   Assessment: Hwks 3-4, Midterm Exam
5. To identify and describe the impact of bias, including confounding, in epidemiologic studies.
   Foundational competencies:
   - Analyze quantitative and qualitative data using biostatistics, informatics, computer-based programming and software as appropriate
   - Interpret results of data analysis for PH research, policy, or practice
   Lectures: 10, 11
   Assessment: Hwk 5, Final Exam
6. To identify the different roles of mediators and effect moderators and identify appropriate techniques to evaluate the presence of each.
   Foundational competencies:
   - Analyze quantitative and qualitative data using biostatistics, informatics, computer-based programming and software as appropriate
   - Interpret results of data analysis for PH research, policy, or practice
   Lecture: 12
   Assessment: Final Exam
7. To describe the key characteristics of an outbreak and the key steps to identifying the cause of the outbreak.
   Foundational competencies:
   - Analyze quantitative and qualitative data using biostatistics, informatics, computer-based programming and software as appropriate
   - Interpret results of data analysis for PH research, policy, or practice
   Lecture: 13
   Assessment: Hwk 6, Final Exam
8. To review the epidemiological criteria needed to establish causal relationships.
   Foundational competencies:
   - Interpret results of data analysis for PH research, policy, or practice
   Lecture: 12
   Assessment: Final Exam
9. To discuss the role of primary, secondary, and tertiary prevention in population health with a focus on screening.
   Foundational competencies:
   - Analyze quantitative and qualitative data using biostatistics, informatics, computer-based programming and software as appropriate
   - Interpret results of data analysis for PH research, policy, or practice
   Lecture: 14
   Assessment: Final Exam
10. To read and evaluate epidemiologic studies in the medical and public health literature to explain the critical importance of evidence in advancing PH knowledge.
   Foundational competencies:
   - Interpret results of data analysis for PH research, policy, or practice
   Lectures: 5-7, 9
   Assessment: Hwks 3-4

COURSE REQUIREMENTS AND EXPECTATIONS

A. READINGS
Required course texts: The following resources are available to provide more background information:
1. Aschengrau A & Seage GR. Essentials of Epidemiology in Public Health. 4th Edition (2018). The 3rd edition of this textbook is available online at the NYU Library: https://ebookcentral.proquest.com/lib/nyulibrary-ebooks/detail.action?docID=3319339.
2. All ERIC Notebook readings listed below can be accessed at: https://sph.unc.edu/epid/eric/
3. All Lancet Series readings can be accessed at: http://www.thelancet.com/series/epidemiology-2002
4. Additional required readings will be assigned to supplement the main textbook or as part of various homework assignments; a list of these is provided on the next page. Readings that are published journal articles can be accessed via the NYU Library’s journal access, which is located under the Research tab of NYUHome. I reserve the right to add readings during the course of the semester as appropriate.

Additional resources:
1. If you would like to purchase or borrow another textbook from the library, we recommend:
   a. Gordis L. Epidemiology. 6th Edition (2019).
Additional textbooks based on content area, level of expertise, etc., as well as several websites and articles, are also available. For specific areas of interest, please let me know, and I can provide additional resources.

B. REQUIREMENTS & EXPECTATIONS
1. Students are expected to attend all lecture sessions.
Students are expected to come to class on time to prevent disrupting the lecture and classroom activities. 2. Attend discussion sections: - Discussion sessions are held on a weekly basis and led by discussion section instructors noted above. Only attend the discussion section you are enrolled in and do not attend another discussion for which you are not registered. - Active participation in the discussion sessions is also expected and highly encouraged. - Attendance is mandatory. If you cannot attend a given session, it is your responsibility to notify your discussion section instructor beforehand, or in case of an emergency, immediately upon return. All other absences from the discussion section will be considered unexcused. - Any student who has more than 2 unexcused absences from the scheduled discussion section meetings will lose points from their discussion section grade. 3. Technology Policy for lecture sessions: - Mobile device ringers will be turned off or placed on vibrate before class. - Laptops and tablets can ONLY be used in the classroom to take notes, make calculations, and download/read course materials. There are studies that indicate that non-academic use of the Internet is associated with poorer learning outcomes. PLUS, it really does distract your fellow classmates seated near you! 4. Complete reading assignments prior to class. Readings are listed in the course schedule and additional readings may be assigned as needed. 5. Complete homework assignments (6): Homework assignments are due on the dates noted below. They will be posted to BrightSpace. Late homework will not be accepted. You can rely on your class notes or other supplemental materials to complete your assignment, but it is an individual effort so do not share answers with others!
Public Economics, Course 01:220:460
Sample Exam, Two Pages

1. Consumer choice. There are two divisible private goods: Good 1 and Good 2. Alice has endowment $(5, 2)$, and she views the goods as perfect complements. The market prices are $(p_1, p_2) = (4, 2)$.
(a). [5] Write down a utility function that represents Alice’s preferences.
(b). [5] Calculate Alice’s choice from her budget set.
(c). [5] On a graph, show Alice’s budget set, Alice’s choice, and Alice’s indifference curve through her choice.

2. Exchange economies. There are two divisible private goods, Good 1 and Good 2, and two consumers, Alice and Bob. Alice’s indifference curves are all lines with slope $-2$, and her endowment is $\omega_A = (0, 4)$. Bob’s indifference curves are all lines with slope $-\frac{1}{2}$, and his endowment is $\omega_B = (4, 0)$.
(a). [5] Draw the Edgeworth box and the allocation where both agents consume their endowments. Is this allocation efficient? Why or why not?
(b). [10] Calculate the competitive equilibrium, giving the allocation and some prices $(p_1, p_2)$ compatible with the allocation. In an Edgeworth box, illustrate Alice’s budget set, Bob’s budget set, Alice’s choice with its indifference curve, and Bob’s choice with its indifference curve.
(c). [5] Is the competitive equilibrium envy-free? Why or why not?

3. Public goods. Suppose Alice, Bob, and Carol have the following demand functions for rockets: $Q_A(p) = 16 - p$, $Q_B(p) = 7 - \frac{1}{2}p$, and $Q_C(p) = 6 - \frac{1}{2}p$. Rockets are a public good, and there is a constant marginal cost for rockets equal to 12.
(a). [5] What is the efficient quantity?
(b). [5] What are the Lindahl prices?
(c). [5] What is the total cost of public provision? How much does each agent pay in total?

4. Externalities. Briefly (but accurately!) answer the following questions on the topic of externalities.
(a). [5] Describe the Coase Theorem.
(b). [5] For reducing emissions, is cap-and-trade cost-effective? Explain.
(c). [5] For reducing emissions, is an emissions fee cost-effective?
Explain.

5. Healthcare. A risk-neutral insurance company sells policies to fully insure against healthcare costs for a market with 1000 low-risk customers and 1000 high-risk customers. All customers are risk-neutral, each low-risk customer has an expected healthcare cost of $5,000, each high-risk customer has an expected healthcare cost of $20,000, and the company is not able to observe who is high-risk and who is low-risk without experience rating. For the following questions, if there are several possible outcomes, then consider the possible outcome that has the most trade.
(a). [5] If there is no experience rating or government intervention, what is the outcome? Explain.
(b). [5] With experience rating, what is the outcome? Explain.
(c). [5] With community rating backed by mandate, what is the outcome? Explain.

6. Education. Briefly (but accurately!) answer the following questions on the topic of education.
(a). [4] Provide an argument for government intervention in education.
(b). [4] Provide a second argument for government intervention in education.
(c). [2] Provide a third argument for government intervention in education.

7. Retirement. There are three periods: year zero (the present), year one, and year two. The interest rate is r = 0.5. There are three equally likely scenarios:
1. Alice pays 6 in year zero, then does not continue.
2. Alice pays 6 in year zero, then receives 9 in year one, then does not continue.
3. Alice pays 6 in year zero, then receives 9 in year one, then receives 9 in year two.
(a). [5] Calculate Alice’s Social Security Wealth.
(b). [5] Suppose innovations in medicine make the first scenario less likely. Does Alice’s Social Security Wealth increase, decrease, or stay the same? Explain.
ECON 385 Intermediate Macroeconomic Theory II, Fall 2021. Final exam. 138 points.

1. (20 points) An employee has to choose between two contracts. Assume that the net real interest rate on saving and borrowing equals $r > 0$. Under contract A, she has gross incomes $y$ and $y'$ in the current and future periods, respectively, and has to pay taxes $t$ and $t'$ in the current and future periods, respectively. Under contract B, an employer offers the employee an option to increase income next year by $x \cdot (1 + r)$ units and reduce income this year by $x$ units. Taxes are the same under both contracts.
(a) (10 points) Write down current and future budget constraints and the lifetime budget constraint under the two contracts. Which contract would the employee choose and why? (Hint: you should compare lifetime wealth under the two contracts.)
(b) (10 points) Assume that preferences over current and future consumption are $U(c, c') = -\frac{1}{2}(c - \bar{c})^2 - \frac{1}{2}\beta(c' - \bar{c})^2$, where $\bar{c}$ is the bliss consumption level and $\beta = \frac{1}{1+r}$. Find consumption in the current and future periods and saving under the two contracts. Compare consumption levels and saving under the two contracts.

2. (10 points) A consumer receives income $y$ in the current period and income $y'$ in the future period, and pays taxes $t$ and $t'$ in the current and future periods, respectively. The consumer can lend/save at the real interest rate $r_1$. The consumer can borrow at the real interest rate $r_2 > r_1$ only an amount $x$ or less, where $x < \frac{y' - t'}{1 + r_2}$. Use a diagram to clearly show the consumer’s budget set. Distinguish the cases when a consumer is: i) a saver, ii) a borrower whose borrowed amount does not exceed $x$, and iii) a borrower who chooses to borrow the full allowable amount $x$.

3. (36 points) Assume a consumer has current-period income $y = 220$, future-period income $y' = 150$, current and future taxes $t = 60$ and $t' = 50$, respectively, and faces a market real interest rate of $r = 0$.
The consumer’s preferences over current and future consumption are $U(c, c') = \min(c, c')$. The consumer faces a credit-market imperfection in that she cannot borrow at all, that is, $s \geq 0$.
(a) (6 points) Calculate her optimal $c$, $c'$, $s$.
(b) (6 points) Suppose that everything remains unchanged, except that now $t = 40$ and $t' = 70$. Calculate the effects on current and future consumption and optimal saving.
(c) (6 points) Calculate the marginal propensity to consume for this consumer following the tax change, that is, the change in current consumption following the change in taxes and the change in disposable income that it entails. Define Ricardian equivalence and comment on whether it holds in this case.
(d) (18 points) Now suppose alternatively that $y = 120$. Repeat the above parts, and explain any differences.

4. (10 points) Consider a one-time change in government policy that immediately and permanently increases the level of the labor force in an economy from $L_0$ to $L_1 > L_0$ at some point in time $t_0$. Assuming the economy with technological progress at a rate $g$ starts in its initial steady state, use the Solow model to explain what happens to the economy over time and in the long run. In particular, draw two diagrams: 1) real wages with time on the horizontal axis, using a ratio scale; and 2) the Solow diagram that outlines the changes. Assume that the growth rate of population stays constant over time at a rate $n$.

5. (50 points) A consumer has quadratic preferences and cares about consumption over two periods: $U(c_0, c_1) = -\frac{1}{2}(c_0 - \bar{c})^2 - \beta\frac{1}{2}(c_1 - \bar{c})^2$. Assume that the real interest rate, $r$, is zero, and the time discount factor, $\beta$, equals 1. (Note (!) that consumption can be negative if preferences are quadratic.)
(a) (7 points) The consumer’s disposable income in period 0 equals 10, and in period 1 equals 20. There is no uncertainty. Write down the Euler equation and find the optimal consumption levels in periods 0 and 1, and the optimal savings.
(b) Assume now that period 0 income stays at 10, while period 1 income is uncertain. There are two possible states of nature that may be realized in period 1: with probability $\pi = \frac{1}{3}$, state 0 occurs and income in period 1 equals 0, whereas with probability $1 - \pi = \frac{2}{3}$, state 1 occurs and income in period 1 equals 30. The consumer has to decide on her consumption and saving for period 0 before the uncertainty is resolved. The consumer now maximizes expected utility $EU(c_0, c_1) = -\frac{1}{2}(c_0 - \bar{c})^2 - \pi\beta\frac{1}{2}(c_1(0) - \bar{c})^2 - (1 - \pi)\beta\frac{1}{2}(c_1(1) - \bar{c})^2$, where $c_1(k)$ is consumption in period 1, state $k = 0, 1$.
(i) (3 points) Write down the Euler equation and find the expected value and variance of income in period 1.
(ii) (6 points) Find the optimal consumption and saving in period 0, and consumption in period 1 in both states of nature.
(iii) (1 point) Does your answer for the optimal consumption in period 0 and savings differ from the answer to (5a)? Why or why not?
(c) Assume now that income in period 1, state 0 equals 0 with probability $\pi = 0.99$ and income in period 1, state 1 equals 2000 with probability $1 - \pi = 0.01$.
(i) (3 points) Write down the Euler equation and find the expected value and variance of income in period 1.
(ii) (6 points) Find the optimal consumption and saving in period 0, and consumption in period 1 in both states of nature.
(iii) (1 point) Does your answer for the optimal consumption in period 0 and savings differ from the answer to (5b)? Why or why not?

Assume now that each period’s utility function is $u(c) = \ln(c)$. Continue assuming that the real interest rate, $r$, is zero, and the time discount factor, $\beta$, equals 1.
(d) (7 points) Write down the Euler equation and find the optimal consumption in periods 0 and 1 and optimal saving in period 0 given the data in (5a).
(e) (8 points) Write down the Euler equation and find the optimal consumption in periods 0 and 1 and optimal saving in period 0 given the data in (5b). Compare the optimal saving to the value you found in (5b) and argue why they are different (if different at all).
(f) (8 points) Write down the Euler equation and find the optimal consumption in periods 0 and 1 and optimal saving in period 0 given the data in (5c). Compare the optimal saving to the value you found in (5c) and argue why they are different (if different at all).

6. (12 points) Consider an economy that begins with output at potential and an inflation rate of $\bar{\pi}$, so the economy begins in steady state. A new chair of the central bank decides to lower the long-run inflation target to $\bar{\pi}' < \bar{\pi}$. Show how the economy responds over time, using the AS/AD framework. (You should clearly label the axes and explain everything you want to show on your graph. You should also list the equations for the AD and AS curves.) Comment on your results.
COMP201: Software Engineering I
Object Oriented Design Coursework
Assignment 2 (2024/2025): Modelling with UML

Assessment Information
[...] o class: 20/11/2024
Deadline Day & Date & Time: 16 [...]
[...] assessed: be fully aware of the principles and practice of an OO approach to the [...] computer systems; [...] principles in practice.
[...]ments: No
Purpose of assessment: [...] ability to produce an OO design in UML.

REPORT
Title page: put your name, your student number and the course on the first page.

TASK 1. (25%)
Given the following informal specification, identify good candidates for classes and attributes, and identify things that are outside of the problem domain. You should use the noun identification technique and show your working. Also identify all potential inheritance relationships. You should ensure that data is NOT duplicated across classes even if a user places multiple bookings. Use the noun identification method of class elicitation for the first pass. For full marks, please avoid duplication of data within any class as much as possible. Present your design as a class diagram, including all relevant attributes and relationships.

Your customer is a travel agency that wants a reservation system that will run on the Internet. This reservation system will allow clients to keep track of all their travel reservations for airlines, hotels, rental cars and travel insurance. The client must enter the names of all his/her traveling companions, but all reservations will be referenced by the primary client. The system needs to make it easy for a client to have multiple reservations. All reservations will include a booking number as well as a reference to the names, passport numbers and dates of birth of all the travelers involved in the reservation. The system should also have an address for the primary client. Airline reservations will include the airline, flight number, class of seat and travel date and time.
Hotel reservations will include the type of room (twin, single, double), the start date and the end date, as well as the name and address of the hotel. Car rental reservations will include the type of car requested, start date, days of hire and the drivers’ license numbers. The insurance booking will have a start date, an end date and a level of cover (bronze, silver or gold).

TASK 2. (25%)
You are required to draw a UML activity diagram to represent the following scenario of a hairdresser’s salon. Customers enter the salon and wait until the next hairdresser is free. They then indicate whether they would like their hair washed first or a “dry-cut” without having their hair washed. The hairdresser washes the hair (if asked for) and then cuts it. After finishing the customer’s hair, the hairdresser moves on to the next waiting customer, or waits for another one to enter the salon. The customer goes to the till and waits for a cashier to be free to take their payment. They can pay by either cash or credit card (where they need to type their PIN into the machine) and they then leave the salon.

TASK 3. (25%)
Read the following passage carefully.

An employee has a name, address, phone number, date of birth and job title. Employees can be appointed and can leave, and are either monthly paid employees or weekly paid employees. Monthly paid employees have a bank sort code, bank account number and number of holidays, while weekly paid employees are paid in cash on a specified day of the week, their payday. Weekly paid employees may apply to be promoted to a monthly paid employee. Monthly paid employees can take a holiday if they have a sufficient number of holidays remaining. All employees are entitled to use the Sports Centre if they register to do so. The Sports Centre is made up of two gyms (with a maximum capacity), three tennis courts and a bar. The bar can be booked for special events, and has three rates of hire: a working hours' rate, an evening rate and a weekend rate.
The Sports Centre holds a list of employees who have registered. An employee's age can be calculated from their date of birth, in order to prevent under-age drinking at the bar.

You are required to draw a UML class diagram for the above system. All the key words you need to include are underlined; do not invent any details additional to those given above:
1. Illustrate the various classes that exist, with their attributes and operations (including any derived ones, represented in the usual way).
2. Mark on the relationships that exist between the classes, using the standard UML symbols to represent the type of each relationship.
3. Add multiplicities.
4. For any relationships of association:
   a. mark on the navigability
   b. appropriately name the two roles

TASK 4. (25%)
Draw a UML sequence diagram that specifies the following protocol of initiating a two-party phone call.

NOTE: ArgoUML does not fully support sequence diagrams; it may be better to use a different program (such as OpenOffice Draw / Microsoft PowerPoint) or to draw the diagram (neatly) by hand.

Let us assume that there are four objects involved:
• two Callers (s and r),
• an unnamed telephone Switch, and
• a Conversation (c) between the two parties.

The sequence begins with one Caller (s) sending a message (liftReceiver) to the Switch object. In turn, the Switch calls setDialTone on the Caller, and the Caller iterates (7 times) on the message dialDigit to itself. The dialled digits are then sent to the Switch. The Switch object then calls itself with the message routeCall. It then creates a Conversation object (c), to which it delegates the rest of the work. The Conversation object (c) rings the Caller (r), who asynchronously sends the message liftReceiver. The Conversation object then tells both Caller objects to connect, after which they talk. Once Caller (r) sends a disconnect message to Conversation, Conversation tells both Caller objects to disconnect and also tells the Switch to disconnect.
After that Switch deletes the object Conversation. All the keywords you need to include are underlined - do not invent any details additional to those given above.
CMPEN 331 – Computer Organization and Design, Final Project

You will convert the (almost complete) single-cycle processor that you built throughout the semester into a 5-stage pipelined processor. Open the Vivado project from HW5. We will start from there.

Important Notes on Grading
The final project has three steps, worth 15 pts, 5 pts, and 5 pts respectively. The steps must be completed in order to receive points. For example, you cannot do only the first and third parts and skip the second part (the third part will not receive any credit in this case). At the very least, try to make the first part run correctly for 15 pts. The second and third parts are harder but are worth only 5 pts each, and the points for these two parts will only be given if the code produces a correct result (no partial credit for trying and failing). Plan strategically before attempting the last two parts.

1. Implementing Pipelining (15 pts)
Below is the 5-stage pipeline processor diagram from our lecture slide. The main differences are (1) four pipeline registers are added (highlighted in yellow), and (2) signals go through the pipeline registers instead of directly connecting the components. You do not need to implement the signals and logic related to branch/jump (red X).

1.1. Adding Pipeline Registers
Write four modules, one for each pipeline register. Until now, module skeletons were always provided, but this time you must write your own modules from scratch. Fortunately, the four modules are simple and look very similar. You should be able to identify the inputs/outputs of the registers from what you learned in the lecture. However, I am listing them below to make your life easier.

· IF/ID Pipeline Register
  o Inputs: 1-bit clk, 32-bit inst
  o Outputs: 32-bit inst_d
  o Description: On a positive edge of clk, inst_d is set to inst. The _d indicates that the signal is an output from the IF/ID register.
· ID/EX Pipeline Register
o Inputs
§ 1-bit clk, regWrite, memToReg, memWrite, aluSrc, regDst, memRead
§ 32-bit regOut1, regOut2, imm32
§ 5-bit rt, rd (HINT: These are part of inst_d)
§ 4-bit aluControl
o Outputs: Duplicated versions of the above signals, with an _x subscript.
o Description: On a positive edge of clk, the output signals (with an _x subscript) are set to the corresponding input signals.

· EX/MEM Pipeline Register
o Inputs
§ 1-bit clk, regWrite_x, memToReg_x, memWrite_x, memRead_x
§ 32-bit aluOut, regOut2_x
§ 5-bit writeAddr
o Outputs: Duplicated versions of the above signals, with an _m subscript.
o Description: On a positive edge of clk, the output signals (with an _m subscript) are set to the corresponding input signals.

· MEM/WB Pipeline Register
o Inputs
§ 1-bit clk, regWrite_m, memToReg_m
§ 32-bit aluOut_m, memOut
§ 5-bit writeAddr_m
o Outputs: Duplicated versions of the above signals, with an _b subscript.
o Description: On a positive edge of clk, the output signals (with an _b subscript) are set to the corresponding input signals.

1.2. Re-routing Signals

Place each pipeline register between the stages and connect it properly with the existing components. You will need to replace many of the existing wires with the outputs of the pipeline registers. Carefully refer to the above figure and the lecture slides to connect each component correctly. It is extremely easy to make mistakes in this step.

1.3. Testing

When you are done, try running your code. The result, unfortunately, will look like this. The result is different from what we have seen in HW4! Why? (Write your answer in the report)

Replace the saved instructions in the instruction memory with the following and try running again. Do not delete the original instructions entirely (just comment them out), as we will use them again later.
· memory[25] = {6'b100011, 5'd0, 5'd1, 16'd0};
· memory[26] = {6'b100011, 5'd0, 5'd2, 16'd4};
· memory[27] = {6'b100011, 5'd0, 5'd3, 16'd8};
· memory[28] = {6'b100011, 5'd0, 5'd4, 16'd16};
· memory[29] = {6'b000000, 5'd1, 5'd2, 5'd5, 11'b00000100000};
· memory[30] = {6'b100011, 5'd3, 5'd6, 16'hFFFC};
· memory[31] = {6'b000000, 5'd4, 5'd3, 5'd7, 11'b00000100010};

If successful, you should see a result like the one below: What are the new instructions added? Is this result correct? (Explain in the report)

1.4. Debugging

It is very easy to make small mistakes in this project. In the waveform view, carefully follow each signal through the pipeline and check whether its behavior matches your expectation. See if there are any unexpected Xs or Zs. Debugging is an essential part of programming, and your ability to debug is part of the project evaluation. TAs will not debug your code for you, although they can guide you at a high level.

2. Data Forwarding (5 pts)

“One” reason the original code did not run correctly was data hazards. You will implement the EX forwarding and MEM forwarding that we learned during class to (almost) fix the problem. Below is the diagram from the lecture slide on how forwarding works.

2.1. Adding an Additional Signal in the Pipeline Register

You will see that one signal is missing: ID/EX RegisterRs (or, in our convention, rs_x) is currently not an output of our ID/EX pipeline register. Our ID/EX pipeline register must be changed to include this signal.

· ID/EX Pipeline Register (revisited)
o (Additional) Input: 5-bit rs (HINT: this is part of inst_d)
o (Additional) Output: 5-bit rs_x.
o Description: On a positive edge of clk, rs_x is set to rs.

2.2. Implementing Additional Modules

You need to additionally implement modules for a forwarding unit and a 32-bit 3x1 mux.
· Forwarding Unit
o Inputs
§ 5-bit writeAddr_m, writeAddr_b, rs_x, rt_x
§ 1-bit regWrite_m, regWrite_b
o Outputs: 2-bit forwardA, forwardB
o Description: Set forwardA and forwardB based on the other signals. The C-code equivalent can be found in the slides.

· 32-bit 3x1 Mux
o Inputs: 32-bit in0, 32-bit in1, 32-bit in2, 2-bit sel
o Output: 32-bit out
o Description: If sel==0, out=in0. If sel==1, out=in1. If sel==2, out=in2.

2.3. Connecting Everything

As shown in the above figure, add an instance of the forwarding unit and two instances of the 3x1 mux, and redirect the wires properly between them.

2.4. Testing Your Code

This time, initialize the instruction memory with the code below. This is new code that differs from both the original code (from the skeleton) and the code from above:

· memory[25] = {6'b100011, 5'd0, 5'd1, 16'd0};
· memory[26] = {6'b100011, 5'd0, 5'd2, 16'd4};
· memory[27] = {6'b100011, 5'd0, 5'd4, 16'd16};
· memory[28] = {6'b000000, 5'd1, 5'd2, 5'd3, 11'b00000100010};
· memory[29] = {6'b100011, 5'd3, 5'd4, 16'hFFFC};

If you correctly implemented data forwarding, you should see a result like this (without data forwarding, the result will look different, of course):

Now, try running the original 4-line code from the skeleton. Unfortunately, your result will look like this: This is still wrong. Why? (Explain in your report)

3. Detecting Load-use Hazards and Stalling (5 pts)

This is the final part of the proposal. You will implement a hazard unit that detects load-use hazards and stalls the pipeline. As we have learned, the hazard unit detects a load-use hazard and generates three signals: IF/ID.Bubble, PC.Write, and IF/ID.Write, as shown in the figure below. IF/ID.Bubble inserts zeros into the ID/EX pipeline register instead of the output signals of the control unit. IF/ID.Write disables the IF/ID pipeline register from being updated. PC.Write disables the PC from being updated.
We will simplify the design a little bit: we will generate only one signal, stall, that replaces all three aforementioned signals. Instead of adding another mux after the control unit, we will modify the control unit itself to take in stall as an input and generate zeros if stall==1.

3.1. Implementing the Hazard Unit

· Hazard Unit
o Inputs: 5-bit rt_x, rt_d, rs_d, 1-bit memRead_x
o Outputs: 1-bit stall
o Description: Depending on the input signals, generate the stall signal. The C-code equivalent can be found in the lecture slides. stall==1 means the pipeline is stalled.
o HINT: rs_d and rt_d are parts of inst_d.
o HINT: You might need to use an additional initial begin-end block to initialize the stall signal to zero.

3.2. Modifying control_unit, program_counter, and IF/ID Pipeline Register

The modules below must be updated accordingly.

· control_unit
o (Additional) Input: 1-bit stall
o Description: If stall==1, set all the output signals to zero.

· program_counter & IF/ID pipeline register
o (Additional) Input: 1-bit stall
o Description: If stall==1, do not update the output.
o HINT: This should be just one additional if statement.

3.3. Connecting Everything

As shown in the above figure, add the hazard unit and connect it properly with the other modules. If your code is correct, the original code from the skeleton (4-line version) must produce the same output as HW5:

4. Submission

You must submit (1) your new datapath.v and (2) a short report in a single zip file. The report must discuss the following:

· What are the 7 instructions from 1.3? Does the waveform result match the expected output? Explain why the 7 instructions from 1.3 run correctly, while the original 4 instructions fail to produce the same output as HW5.
· What are the 5 instructions from 2.4? Does the waveform result match the expected output? Explain why the 5 instructions from 2.4 run correctly, while the original 4 instructions fail to produce the same output as HW5.
The report must be in pdf or Microsoft Word. No handwritten report allowed. Always write the code clearly and add proper comments. Hard-to-read code will lose points.
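
As a behavioural illustration only (in Python rather than the Verilog that the project requires), the two simplest building blocks described above can be sketched as follows: the 32-bit 3x1 mux from 2.2 and the IF/ID pipeline register from 1.1, including the stall input added in 3.2. The names mux3x1, IfIdRegister, and posedge are hypothetical, the reset value of inst_d is an assumption, and the handling of sel==3 is also an assumption because the handout does not define it.

```python
# Behavioural sketch (not Verilog) of two blocks from the handout.
# All names here are hypothetical and chosen for illustration only.

MASK32 = 0xFFFFFFFF  # keep values within 32 bits


def mux3x1(in0: int, in1: int, in2: int, sel: int) -> int:
    """32-bit 3x1 mux: sel==0 -> in0, sel==1 -> in1, sel==2 -> in2."""
    if sel == 0:
        return in0 & MASK32
    if sel == 1:
        return in1 & MASK32
    if sel == 2:
        return in2 & MASK32
    # The handout does not say what happens for sel==3; raising is an assumption.
    raise ValueError("sel must be 0, 1, or 2")


class IfIdRegister:
    """Models the IF/ID register: inst_d takes the value of inst on a clock edge."""

    def __init__(self) -> None:
        # Assumed reset value; real hardware would show X until the first edge.
        self.inst_d = 0

    def posedge(self, inst: int, stall: int = 0) -> None:
        """Call once per positive clock edge."""
        # Section 3.2: if stall==1, do not update the output.
        if stall == 1:
            return
        self.inst_d = inst & MASK32


if __name__ == "__main__":
    reg = IfIdRegister()
    reg.posedge(0x8C010000)           # normal edge: inst is latched
    reg.posedge(0x8C020004, stall=1)  # stalled edge: inst_d keeps its old value
    print(hex(reg.inst_d))
    print(mux3x1(10, 20, 30, 2))
```

The other three pipeline registers follow the same pattern with more signals, which is why the handout describes the four modules as very similar.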
Exam 2 CS-GY 6033 INET Fall 2024 November 18th 2024

Question 1

(a) 6 points The drawing below represents a valid red-black tree.
● Show how to insert the new node 43. You must show both the initial insertion, and any changes made by RB-repair.
● The tree now has 14 nodes. What is the maximum number of nodes that can be added before the black height increases?

(b) 6 points An interval tree is drawn below. Answer each of the following:
1. Is it possible to assign black/red colors to the tree and add NIL nodes so that the tree is implemented as a Red-black tree? (Do not change its shape).
2. Add the attributes x.max to each node.
3. Suppose we carry out the INTERVAL-SEARCH(i) algorithm from class, where i = [23, 24]. Show which node is returned.
4. Give an example of an interval i for which INTERVAL-SEARCH(i) returns node [21, 27], or explain why it is impossible.

Question 2 4 points

(a) Give an example of a BST on 10 nodes such that the Inorder and Postorder traversals produce the same output, or explain why it is not possible.
(b) Give an example of a BST on 10 nodes such that the Preorder and Postorder traversals produce the same output, or explain why it is not possible.

Question 3 4 points

Let T be the root of a BST augmented with the attribute x.leaves, which is the number of leaves in the subtree rooted at x. Suppose T is not initially augmented with this value. Write the pseudo-code for a procedure called SetLeaves(T) which correctly sets the attribute x.leaves for all nodes in the tree. Justify the runtime of O(n).

Question 4 6 points

Consider the above BST augmented with x.leaves. Describe how to update the TreeInsert algorithm from class so that the attributes x.leaves are correctly updated after an insert of node z into the tree T. You do not need to explicitly provide the pseudo-code for the new version of TreeInsert. Instead, you must carefully describe how the procedure is updated, step by step, and justify the runtime of O(h).
Question 5 6 points

Let T be a Red-Black tree, which is also complete (recall the definition of complete is that every level is full, except perhaps the last, where the last level is filled in from left to right). A student claims that the RBT is colored in such a way that all levels are colored black, except the last level. Your job is to write the pseudo-code for an algorithm called CheckColoring(T) which takes in a reference to the red-black tree T, and returns TRUE if indeed the tree is colored all black except the last level (which is red). Otherwise the procedure returns FALSE. You may call another algorithm from within CheckColoring if you find that helpful. Justify the runtime of O(n).

Question 6

Let T be a binary search tree that stores information on exam grades for students in the Algorithms class. Each tree node x contains the following attributes:

x.grade: Grade of the student, used as the key of the BST
x.prereq: Grade the student x achieved in the prerequisite course
x.credits: Number of credits currently achieved by student x
x.maxcredits: Maximum number of credits achieved by all students in the subtree rooted at x
x.PMax: Maximum of the prerequisite course grades for all students in the subtree rooted at x
x.size: Number of nodes in the subtree rooted at x

(a) 2 points Describe how to implement the above BST such that inserts and deletes can be carried out in time O(log n).

(b) 4 points Write the pseudo-code for a recursive algorithm called MaxCredits(T), which returns the maximum number of credits achieved by any student in T with an Algo Exam grade of 80 or more. Justify the runtime of O(log n).

(c) 4 points Write the pseudo-code for a recursive algorithm called MaxPregrade(T), which returns the maximum grade achieved in the prerequisite course by any student in T with an Algo Exam grade under 60. Justify the runtime of O(log n).

(d) 4 points Consider the student in the class with the highest grade in the pre-requisite course.
Determine how many students achieved a grade higher than this student on the Algo Exam. Write the pseudo-code for your procedure, called AboveBest(T). Justify the runtime of O(log n).

Question 7 4 points

Consider T which is a reference to the root node of a binary search tree. Let PrintDepth(T, i) be a procedure that prints the nodes in T at depth i. The pseudo-code for this procedure is given below:

PrintDepth(T, i)
    if i = 0
        print T.key
    else
        PrintDepth(T.left, i − 1)
        PrintDepth(T.right, i − 1)

Using the procedure above, your job is to write a procedure that prints the nodes of a BST level by level. In the example below, the output should be 15, 8, 27, 5, 20, 35, 2, 7, 25, 42, 37.

Question 8: DP WARM UP 8 points

Suppose we are given a set of weights w[1, 2, ..., n] with their associated prices p[1, 2, ..., n] and a target total weight T. The goal is to select a set of weights whose total sum is exactly equal to T, such that we minimise the total price of the selected weights.

Your Job: Provide a DP solution for this problem. You MUST use the following DP table: M[0, 1, ..., n, 0, 1, ..., T] where M[i, j] is the minimum possible cost of achieving exactly weight j using weights selected from 0, 1, 2, ..., i. If achieving exactly weight j is impossible, you may set entry M[i, j] to any flag value that works in your solution. You must include:
- a justification of how you initialise the table
- the relationship you use to fill up the table
- the pseudo-code
- the runtime

You may use the MaxValueSet problem from class as reference, where the original pseudo-code is copied below.

for j = 0 to T
    set V[0, j] = 0
for i = 0 to n
    set V[i, 0] = 0
for i = 1 to n
    for j = 1 to T
        if w[i] ≤ j
            V[i, j] = max{v[i] + V[i − 1, j − w[i]], V[i − 1, j]}
        else
            V[i, j] = V[i − 1, j]
return V[n, T]

Question 9: DP WARM UP 6 points

Suppose we have a sequence of characters in the array C[1, 2, ..., n] where each character represents a color. The characters are selected from the set {R, B, G, Y, P}.
An example of such a sequence is RBBGYPRGBYPYPY. Using these characters, we would like to draw a rainbow, where an arc of a certain color can be made by connecting two characters of the same color. For a proper rainbow to be drawn, colored arcs cannot cross each other! An example of a rainbow is shown below:

Arcs of different colors have different values. Yellow arcs are worth $100. Blue arcs are worth $200. Green arcs are worth $300. Pink arcs are worth $400. Red arcs are worth $500.

Your Job: Update the relevant DP problem from class so that it returns the maximum value of a rainbow that is drawn using the characters in C[1, 2, ..., n]. You must properly define the DP table, explain the initialisation, justify how you fill in the entries, provide the pseudo-code, clearly show which value is returned, and justify the runtime.

Question 10 6 points

Suppose we have a hiking trail that goes from mile marker 1 to mile marker n. At each mile marker there is ONE rock with a certain weight. If we pick up the rock at mile marker i and carry it to mile marker j, we get paid an amount which depends on the weight and how far we carried it, where the payment is (j − i) × (weight). An example is given below, where our total profit is 49.

Suppose you are given as input the array W[1, 2, ..., n], where W[i] represents the weight of the rock at marker i. The goal is to determine the maximum amount of money that can be made. Note that you can only walk forwards on the trail, and that you can only carry one rock at a time! You are allowed to drop a rock and pick up a rock at the same mile marker, or you may drop a rock and continue walking with nothing.

Suppose a student provides a greedy approach to solving this problem which works as follows:
- Pick up the rock at mile 1.
- Walk until we find a larger rock, and then put the current rock down and trade up for the heavier rock.
- Continue until we get to the last mile marker.
Your Job: Determine if this greedy approach produces the optimal solution. You must either give an example showing that this approach does not produce the best result, OR you must explain why this greedy approach always produces the maximum amount of money.

Question 11 12 points

Consider another hiking trail that goes from mile marker 1 to mile marker n. At each marker, there is again ONE rock. Each rock must now be carried a certain number of miles. The input to the problem is D[1, 2, ..., n] where D[i] is the distance that we must carry rock i. For example, if D[1] = 4, the rock at marker 1 must be carried at least 4 miles. At each mile marker, we have the choice of picking up the rock, or continuing to walk with no rock. We cannot carry more than one rock at a time, but we are allowed to drop a rock and pick a new one up at the same mile. Any rock that we pick up MUST be carried the minimum number of miles. In the example below, the rock pick-ups and drop-offs show that we are able to transport four rocks.

Your job: Write a DP solution that solves the problem of finding the maximum number of rocks that can be successfully transported during the hike. Be sure to include a properly defined DP table, and describe the initialization, how the table is filled up, the pseudo-code, and the runtime.

BONUS: 5 points Update the above solution so that you also print out the optimal solution: this consists of the mile markers where you pick up rocks.
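
For reference, the MaxValueSet pseudo-code quoted in Question 8 can be transcribed into runnable Python roughly as below. This is a sketch of the class procedure as quoted (maximum value with total weight at most T), not of the modified table M that Question 8 asks for. The function name max_value_set and the use of a placeholder at index 0 to mimic 1-indexed arrays are assumptions of this sketch.

```python
def max_value_set(w, v, T):
    """Transcription of the MaxValueSet pseudo-code quoted in Question 8.

    w and v are treated as 1-indexed, as in the pseudo-code, so index 0 of
    each list is an unused placeholder (an assumption of this sketch).
    Returns V[n, T].
    """
    n = len(w) - 1
    # Creating the table filled with zeros covers both initialisation loops:
    # V[0, j] = 0 for all j, and V[i, 0] = 0 for all i.
    V = [[0] * (T + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, T + 1):
            if w[i] <= j:
                # Either take item i or skip it, whichever gives more value.
                V[i][j] = max(v[i] + V[i - 1][j - w[i]], V[i - 1][j])
            else:
                # Item i does not fit in capacity j.
                V[i][j] = V[i - 1][j]
    return V[n][T]


if __name__ == "__main__":
    # Made-up data: three items with weights 2, 3, 4 and values 3, 4, 5.
    weights = [0, 2, 3, 4]
    values = [0, 3, 4, 5]
    print(max_value_set(weights, values, 5))
```

The two nested loops visit each of the (n + 1)(T + 1) table entries at most once and do constant work per entry, which is where the O(nT) runtime of this table-filling pattern comes from.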
Bioc0015 Coursework: Website

Introduction

In our modern society, we rely on online tools and resources for information, research and educational purposes. Communicating scientific information online is an increasingly important means of reaching a diverse world-wide audience. This assignment is designed to provide an opportunity to specifically demonstrate key skills in creativity and innovation, information technology and presentation of material to key decision makers with varied scientific backgrounds and perspectives.

Aim and Objective

This assignment involves creating a webpage based upon presenting a molecular mechanism in a technology-based context (e.g., method, drug, product, service, company). The molecular mechanism must be chosen from (or inspired by) a theme presented during your lectures and also supported with additional reading. This assignment aims to exercise your skills in innovation and creative thinking as well as researching and presenting science. You must use the literature to research an applied technology. You should use peer-reviewed articles and may also use papers submitted to established repositories such as BioRxiv. You are encouraged to create or develop a novel or innovative applied technology. This can also include advancing or repurposing current technology. Well-researched science is key!

Website guidelines

· There is a limit of ca. 9 students who can be assigned to each lecturer’s theme(s) (Cabrita, Christodoulou, Thalassinos, Raleigh, King, Purton, Djordjevic, Marechal, Santini). Please fill out the Online Form with your choice BEFORE you start the coursework. You will not be able to submit otherwise. The availability of a lecturer’s theme is on a “first-come-first-served” basis.
· The website should be presented in a way that is relevant to the context that you have chosen to describe your mechanism. For example, if you are a company selling a drug that targets a particular mechanism, it should have the look/feel of a company.
· The website design is up to you and needs to relay the technology and underlying science effectively.
· Your website must be prepared for a general audience and consist of no more than 8 tabs:
o One tab must describe the underlying detailed science for a specialist audience (must be fully referenced)
o An area or section in your website must describe the science for a non-specialist (i.e. a scientist outside of your field)
o One tab must be for references
o Please include a disclaimer on your website stating that the content presented is for UCL’s BIOC0015 course. It is also recommended that the disclaimer mention that the contents of the website are fictional and for educational purposes only.
· Any web hosting platform may be used (e.g. Wix, Weebly or similar), as long as the published webpage will be available up to June 30th 2025, for marking and for viewing by the external examiners as part of the examinations process. There are free website hosts available, though you may also wish to use UCL servers to host your web-based presentation. Some additional guidance for these approaches can be found on pg 4.
· Assessors will be using Macs or PCs, so please check that your website can be viewed on e.g. Chrome and Safari.

Note: This assignment is not about web development (i.e., you will not need to use coding skills).

Submission

For this assignment you’ll submit your website’s URL within the coursework pulldown menu (details will be posted on Moodle). You will also need to provide 2 screenshots: (1) the tab showing the science for the specialist audience and (2) any other tab that shows the format of your website. The assignment must be uploaded by 11.00am on 15 January 2025. SoRA candidates are permitted extra time as stipulated in their conditions. Late submissions will not be permitted.

Assessment

This assignment counts towards 65% of the module.
The website will be assessed on the science, including how well it has been researched and presented, and on the organisation of the website itself. The marking guidelines include:

Scientific content
· Applied technology and its relevance
· Basis for the scientific phenomenon
· Quality of the scientific detail provided for the topic of choice
· Quality and detail of the science for the technology to be credible
· Sources used for content

Website layout, organisation and use of visual elements to relay the science
· Context and clarity of the basis for the website
· Overall navigation, flow of information to describe/present the science
· Use of figures and schematics to support the text
· Appropriate use of dynamic elements (videos, gifs, clickable links, links to external sources as necessary)

Overall impression & audience accessibility
· Spelling, grammar
· Effectiveness of how the science is presented, relative to the website’s context
· Content and overall use of language appropriate for both specialists and non-specialists

You will receive marks and feedback within 4 weeks of submission, unless otherwise stated.

How to start?

o What to choose?
- You may choose any molecular mechanism with a role in either biotechnology or health and disease (or both if you wish), or other aspects of applied research in academia or industry – as inspired from the themes presented by lecturers in the course - Alternatively, you may also instead focus on presenting a technique or technology that is used to characterize a molecular mechanism Examples in previous years have included: · New therapeutic treatments (small molecules, antibodies) acting at a chosen molecular target · Novel treatments for a human disease · Fictional start-up companies o Presenting “new” treatments / molecules / antibodies If you choose to create a new treatment/molecule/antibody, you must provide a detailed specification of your invention, together with a very clear description of the mechanism by which it acts to modulate its target pathway. o Including a fictional start-up/company If your web-page includes a fictional start-up/company, your website should also offer an insight into what your start-up/company is about (vision, aims, deliverables etc) o Some examples from previous years https://algalbioworks.weebly.com/ https://zcbtxwu.wixsite.com/ariella https://zcbtgjw.wixsite.com/mysite/toxic-metal-pollution https://zcbtntk.wixsite.com/therasis Where do I find website builders/hosts? Some free website hosts include: o weebly: https://education.weebly.com/ o wix: www.wix.com (education version: http://www.wix.com/wixed/wixed-beta/complete-your-website) o dreamweaver (available on UCL-based computers or remotely using Desktop@UCL) Instructions for accessing and using Dreamweaver on the UCL network: Note: these instructions have been derived from a web-building/hosting course in UCL 1. 
Accessing Dreamweaver 8(“Macromedia Dreamweaver 8”) On UCL computers: o Login to WTS using your UCL username and password Remote access via Desktop@UCL o Go to: http://www.ucl.ac.uk/isd/services/computers/remote-access/desktop o Click on Desktop@UCL link After logging in: · Map the T-drive to your account (Start | Programs | Utilities | Mount-Unix FileStore (T-drive)). · Important: You will have to do this every time you connect to WTS. · Navigate to your T drive: check that you have a folder called html.pub · Close Windows Explorer. 2. Setting up a site in Dreamweaver 8 · Open Dreamweaver 8 (Start | Programs | Applications M-N | Macromedia | Dreamweaver 8 | Macromedia Dreamweaver 8) · Go to Manage Sites on the Site menu and click New and then Site. · Click the Advanced tab (at the top of the dialogue box). · Site Name is My Homepage · Local Root Folder is T:html.pub · Remove the tick from the Enable Cache box. 3. Writing your first HTML page · Create an empty file, if there is not already one open, by going to File | New. Make sure that Basic Page and HTML are selected. Click “Create”. · Type the following: Homepage This is my first HTML page. I am [first name] [surname]. Resources Google! IS HTML Images etc. · In the Title row (at the top of Dreamweaver), replace Untitled Document with Homepage of [your name]. This is what appears in the browser’s Title Bar and, if the page is bookmarked, it will appear on the Bookmarks or Favorites menu in the browser. · Save the file as index.htm by going to Save on the File menu. Make sure that it is saving in the T:html.pub folder. Click Save. · Go to the View menu and select Code and Design. This view is useful for trouble-shooting. You can drag the line between the views up and down, and edit the page in either view. Go to Design on the View menu to return to the original view. Note: If a change is made in Code view you need to click in Design view before it takes effect. 4. 
Headings h1-h6 · Display the Properties Window, if it isn’t already open, by going to Properties on the Window menu. It appears at the bottom of the screen. · Highlight Homepage in the document and select Heading 1 in the Format box in the Properties Window. · Highlight Resources and select Heading 2 in the Format box. Note: h3-h6 also exist — the higher the number, the smaller the heading. 5. Formatting text · Highlight your surname and click the B icon in the Properties Window. Your surname should appear in bold. Tip: Do not use underline in HTML files, otherwise your users will think it's a link! · Highlight Google!, IS and HTML, and click the icon in the Properties Window with the three squares and the three lines. You should now have a bulleted list. · Click above Resources and go to Insert | HTML | Horizontal Rule on the Insert menu. You should now have a line across the page. · Save the file by going to Close on the File menu. Click Yes to save the changes. Note: It is good to use plenty of headings and lists to make your pages easy to understand, and accessible to disabled users. · Create a new file by going to the File menu and click New. Click Create. · In the Title row, type Images. Also create a Heading 1 with the text Images. · Save the file with the name more.htm Tip: Keep file and folder names simple — only use 0-9, lower case a-z and hyphen. Do not use spaces 6. Links Still in more.htm, type the following line: Return to homepage. · Highlight Return to homepage and in the Link box in the Properties Window, type index.htm · Close the file by going to Close on the File menu. Save any changes if prompted. · Back in index.htm, highlight Images etc. and in the Link box, type: more.htm Note: In the case of the following links to other sites, you should select _blank in the Target box (to the right of the Link box), so that the other site opens in its own window. Otherwise the visitor to your site would have to click the Back button to return to your site. 
· Highlight Google! and in the Link box, type http://www.google.co.uk/ · Highlight IS and in the Link box, type http://www.ucl.ac.uk/is/ · Highlight HTML and in the Link box, type http://www.homepages.ucl.ac.uk/~ccaacdi/knowhow/building.htm#html · Highlight your name and in the Link box, type mailto: followed immediately by your e-mail address. You can use similar techniques to link to any type of file — including PDFs, Word documents, images and sound files (.wav or .mp3). 7. Putting your files on www.homepages.ucl.ac.uk Close index.htm by going to Close on the File menu. If prompted to, save any changes. · In the Site pane (on the right, go to the Window menu and click Site if it is not there). · Make sure the drop-down box at the top of the Site pane says Local (NOT Remote) View. · Click the refresh icon (circular arrow, second icon from the left). · Highlight the file index.htm and click the blue up-arrow Put Files icon.
Do not include dependent files, if prompted to.
Highlight more.htm and click the same icon. The UCL homepages server requires an additional step not normally necessary on other servers: · Go to Start | Programs | Software I-P and click Publish Web Pages.
Again, use your UCL User ID and e-mail password. You should now be able to see index.htm and more.htm in Internet Explorer at: www.ucl.ac.uk/~userid e.g. www.ucl.ac.uk/~zcgbh01

Are your links working? If not:
· Open the file whose links are not working in Dreamweaver.
· Fix the link.
· Close the file, saving the changes.
· Put the file again.
· Reload the page in the browser by navigating to it and pressing F5.

Notes:
· The “HTML” link goes to a place (an “anchor”) in an HTML file, not to the top of the file. If you want to put an anchor in a file, go to the Insert menu, click Named Anchor and give it a name. There must be no spaces or uppercase letters in anchor names. A link to an anchor has a # before the anchor name. It is not necessary to put an anchor at the top of a document, as linking to the document’s own file name (e.g. index.htm) will reload that file at the top.

8. Images
· In Internet Explorer go to www.ucl.ac.uk/ls/specdig and find an image you like. Right-click on it and click Save Picture As. Give the image a name and save it in R:website.
· In Dreamweaver return to the bottom of more.htm.
· Go to the Insert menu, click Image, locate your image and click OK.
· Click the image to select it. Enter a description of the image in the Alt box in the Properties Window.

Note: You can resize the image by dragging its corners. However, this should really be done in an image processing program such as Photoshop Elements (Start | Programs | Graphics Packages | Photoshop | Photoshop Elements). If you resize an image by dragging its corners in Dreamweaver, it slightly slows down the loading of your page. Images should be JPEGs or GIFs and not larger than 150Kb (check the size in Windows NT Explorer).

9. Tables

The easiest way of controlling where images and text appear on an HTML page is to use a table.
· At the end of more.htm, go to the Insert menu and click Table.
· Enter the number of rows and columns (start with 3 and 2 respectively) you require.
· Border should be 0 (NOT 1), as you do not want to be able to see the table's borders. · Set the table width to 100 percent. · Click your image and use Edit | Cut and Edit | Paste to move your image into the right-hand cell of your table. If the image is large, you may need to drag the left edge of its cell a little to the right to open up the left hand cell. Type a little text in the left hand cell, and you will see how it all works. By default, text appears in the middle of a cell. Change Default in the Vert box in the Properties Window to get the text to appear at the top of the cell. You may need to click the little triangle, bottom right of the Properties Window to expand the Properties Window so that you can see the Vert box. Table formatting · If you want another row in your table, click in the last cell and press the Tab key. · To merge cells, select the row by clicking to the left of it. Wait until the mouse cursor becomes a small arrow pointing right. Right-click and select Table and then Merge Cells. 10. Colours · You can change the background page colour and the font colour on an individual webpage by first right-clicking on the page. Click Page Properties. Click the squares next to Background color and Text color and select the colours. · You may also change the colour of a small amount of text by selecting it, and clicking the square two to the right of Size in the Properties Window. · You may change the colour of the background of a particular cell by selecting all of the contents of the cell with your mouse and selecting a colour in the Bg square two to the right of Header in the Properties Window, which needs to be expanded by clicking the arrow in its lower right corner. 11. Viewing and deleting files on the server In the early days of learning HTML, you may wish to delete files on the server if you do not want the search engines to find them. · Select Remote view in the Local view drop-down box in the Site pane on the right. 
(Press F8 if the Site pane is hidden.) · If you are not already connected to the server, click the Connects to remote server icon (first from the left; looks like two plugs). · Highlight any file(s) you wish to delete and press the Delete button on your keyboard.
Chemistry 125-225: Machine Learning in Chemistry
Fall Quarter, 2024
Homework Assignment #2 - Due: November 11, 2024.
Turn in a writeup with your responses to all questions below, together with your code and outputs (e.g. graphs). Attach all your Python files as well, so we can run them.

Problem 1: Linear Regression and Least Squares Fitting
In this problem, you will derive and apply the least squares method to fit a straight line to data from a real-world chemistry dataset. This exercise will help you understand the derivation of least squares fitting, its implementation, and how it can be applied in a chemical context.

Background and Derivations
The least squares method is widely used to fit models to data. One of its simplest applications is fitting a straight line $y = mx + b$, where m is the slope and b is the intercept.

(a) Derivation of the Normal Equations
Suppose we have a dataset consisting of N observations. Each observation includes an independent variable $x_i$ and a dependent variable $y_i$, which are related (approximately linearly) by the equation $y_i \approx m x_i + b$. Our objective is to find the slope m and intercept b that best fit the data in a least-squares sense, minimizing the sum of squared residuals between the actual y-values and the predicted values $\hat{y} = mx + b$.
Define the following:
• Vector y: The N-dimensional vector of observed y-values.
• Matrix A: An N × 2 matrix containing a column of ones in the first column and the x-values of the observations in the second column. This setup allows us to solve for both m and b simultaneously.
• Vector x: The vector containing our unknown parameters. With the column order of A given above, $x = (b, m)^T$. (This parameter vector is distinct from the data values $x_i$.)
The goal is to find the vector x such that Ax is as close as possible to y in a least-squares sense. This leads to minimizing the squared error $\|Ax - y\|^2$.
i. Show that the optimal solution for x that minimizes the squared error satisfies the normal equation: $A^T A x = A^T y$.
Hint: Start by expanding $\|Ax - y\|^2$ and then set the gradient with respect to x to zero.
ii. Using the normal equation $A^T A x = A^T y$, derive explicit formulas for m and b that apply when fitting a line. Show that these are equivalent to:
    $m = \frac{N \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{N \sum_i x_i^2 - \left(\sum_i x_i\right)^2}, \qquad b = \frac{\sum_i y_i - m \sum_i x_i}{N}$
iii. Suppose in a chemical experiment, x represents the concentration of a reactant (e.g., molarity, M) and y represents the rate of reaction observed at each concentration. Explain how finding m and b could help you interpret the relationship between concentration and reaction rate, possibly leading to insights about reaction kinetics.

(b) Download and Explore the Data
We will use the ESOL (Delaney) dataset from MoleculeNet, which provides information on water solubility (log solubility) of organic molecules. Access this dataset using the DeepChem Python library as shown below.
i. Install DeepChem (if not already installed) by running:
    pip install deepchem
ii. Use the following Python code to load the dataset and extract features and labels:
    import deepchem as dc

    # Load the ESOL dataset
    tasks, datasets, transformers = dc.molnet.load_delaney()
    train_dataset, valid_dataset, test_dataset = datasets

    # Extract features (X) and labels (y) from the training dataset
    X_train = train_dataset.X
    y_train = train_dataset.y
iii. Display the structure of X_train and y_train to understand what these variables represent in this dataset.
iv. Answer the following questions:
• What are the features in this dataset?
• Which feature(s) would you expect to correlate with solubility?

(c) Implement Linear Regression on the Dataset
i. Using the LinearRegression model from scikit-learn, train a linear regression model on X_train and y_train.
ii. Use the following code to train the model and calculate relevant parameters (slope, intercept, and RMSE):
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    import numpy as np

    # Train a simple linear regression model on the dataset
    linear_regressor = LinearRegression()
    linear_regressor.fit(X_train, y_train)

    # Obtain slope (coefficients) and intercept
    slope = linear_regressor.coef_.flatten()  # Flatten if it's a multidimensional array
    # intercept_ may be a one-element array, so take its single entry before converting
    intercept = float(np.ravel(linear_regressor.intercept_)[0])

    # Predict target variable for training data
    y_train_pred = linear_regressor.predict(X_train)

    # Calculate RMSE
    rmse = np.sqrt(mean_squared_error(y_train, y_train_pred))

    # Print results in the terminal
    print(f"Slope(s): {slope}")
    print(f"Intercept: {intercept}")
    print(f"RMSE: {rmse}")
iii. Answer the following questions:
• Explain the meaning of each parameter (slope, intercept) in the context of solubility.
• What does the RMSE value indicate about the model’s accuracy?

(d) Plot the Fitted Line
i. Plot the actual versus predicted values of solubility to assess the fit visually.
    import matplotlib.pyplot as plt

    # Plot actual vs predicted solubility values
    plt.scatter(y_train, y_train_pred, label='Predicted vs Actual')
    # Diagonal reference line: a perfect model would place every point on it
    plt.plot([y_train.min(), y_train.max()], [y_train.min(), y_train.max()],
             color='red', linestyle='--', label='Ideal Fit')
    plt.xlabel("Actual Solubility")
    plt.ylabel("Predicted Solubility")
    plt.legend()
    plt.title("Linear Regression on ESOL Dataset")
    plt.show()
ii. Answer the following questions:
• How well does the fitted line match the data visually?
• Are there any potential outliers or deviations?

(e) Assumptions in Least Squares Fitting
i. Discuss the assumptions behind least squares fitting.
ii. Answer the following questions:
• What assumptions are made about the distribution of noise in least squares fitting?
• How might non-Gaussian noise affect the accuracy of your linear model?

Submission
Submit a report containing:
• Code for each part.
• Plots and answers to the questions.
• Interpretation of results.

Problem 2: Nonlinear Least Squares Fitting on Experimental Chemistry Data
In this problem, you will perform
a nonlinear least squares fit on experimental chemistry data from the FreeSolv dataset, a curated database of experimental and calculated hydration free energies for small molecules in water. This exercise will introduce you to concepts in nonlinear regression, data filtering, and model evaluation.
The FreeSolv dataset can be accessed at https://github.com/MobleyLab/FreeSolv. You will download the dataset and perform the following steps to fit a model that describes the relationship between calculated and experimental hydration free energies.
1. Data Download and Preparation
(a) Download the database .txt file from the FreeSolv GitHub repository, or use Python code to download it programmatically.
(b) Load the dataset using pandas in Python, skipping comment lines (lines starting with #) and specifying the delimiter as a semicolon. Name the columns as follows:
• Compound ID, SMILES, IUPAC Name, Expt Free Energy, Uncertainty, Calc Free Energy, DOI, Notes, Additional Column 1, Additional Column 2
(c) Extract the Calc Free Energy as the predictor variable X and the Expt Free Energy as the response variable y.
2. Outlier Detection and Filtering
(a) Remove any rows in X or y that contain NaN or infinite values.
(b) Filter out outliers in the dataset by removing data points where the Expt Free Energy value is more than 3 standard deviations away from the mean. This step ensures that extreme values do not unduly influence the fit.
3. Model Definition and Fitting
(a) Define a quadratic model of the form $f(x) = ax^2 + bx + c$, where a, b, and c are parameters to be determined.
(b) Use scipy.optimize.curve_fit to fit the model to the filtered data and extract the best-fit parameters a, b, and c.
4. Plotting the Results
(a) Plot the filtered data points (as a scatter plot) and the fitted quadratic model (as a smooth curve).
(b) Label the axes appropriately as “Calculated Free Energy” (x-axis) and “Experimental Free Energy” (y-axis).
(c) Display the fitted parameters a, b, and c in the plot title.
5. Model Evaluation
(a) Calculate the Root Mean Square Error (RMSE) to evaluate the model’s accuracy. The RMSE is defined as:
    $\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_i - f(x_i)\right)^2}$
where $y_i$ are the actual experimental values, $f(x_i)$ are the values predicted by your quadratic model, and N is the number of data points.
(b) Print the RMSE to assess the quality of your model fit.

Questions
(a) Why is it important to remove outliers when performing regression? How could outliers affect the quality of your model?
(b) Explain why a quadratic model was chosen here instead of a simple linear or exponential model. Under what circumstances would each model type be appropriate?
(c) The RMSE provides a measure of fit quality. Would a lower RMSE always indicate a better model in a physical or chemical context? Why or why not?
(d) What assumptions are implicit in least squares fitting regarding the distribution of errors in the data? Discuss how violations of these assumptions might influence your fit.

Hints:
• Use the pandas, numpy, and scipy libraries in Python to handle data processing, model fitting, and numerical calculations.
• For plotting, use matplotlib.pyplot. The plot function can be used to draw the quadratic fit, and the scatter function is suitable for data points.
• Ensure that any non-numeric values (e.g., missing data or text) in X or y are handled before performing the fit.

Problem 3: Nonlinear Least Squares Fit Using Gradient Descent in numpy
In this problem, you will perform a nonlinear least squares fit by implementing a gradient descent algorithm from scratch using only numpy. Unlike previous problems where you used scipy to handle optimization, here you will manually compute the gradient vector of the loss function and use it in a gradient descent loop. This exercise will deepen your understanding of the principles behind nonlinear least squares fitting and gradient-based optimization.
Objective: Fit a quadratic model to a dataset using gradient descent.
Specifically, given a set of data points $(x_i, y_i)$, find parameters a, b, and c that minimize the sum of squared errors:
    $L(a, b, c) = \sum_{i=1}^{N} \left(y_i - f(x_i)\right)^2$
where $f(x) = ax^2 + bx + c$ is the quadratic model function.
1. Data Normalization (Scaling)
(a) Normalize the predictor variable X and the response variable y to have zero mean and unit variance:
    $X_{\mathrm{norm}} = \frac{X - \mu_X}{\sigma_X}, \qquad y_{\mathrm{norm}} = \frac{y - \mu_y}{\sigma_y}$
where $\mu_X$ and $\sigma_X$ are the mean and standard deviation of X, and similarly for y. This step helps stabilize gradient descent by preventing large gradients.
2. Define the Loss Function and Gradient
(a) Derive the loss function L for the nonlinear least squares fit:
    $L = \sum_{i=1}^{N} \left(y_i - (a x_i^2 + b x_i + c)\right)^2$
where $x_i$ and $y_i$ are the data points.
(b) Compute the partial derivatives of L with respect to each parameter a, b, and c. This will form the gradient vector you will use in gradient descent. You are given the partial derivative with respect to a:
    $\frac{\partial L}{\partial a} = -2 \sum_{i=1}^{N} \left(y_i - (a x_i^2 + b x_i + c)\right) x_i^2$
Using a similar approach, compute the partial derivatives with respect to b and c, $\partial L / \partial b$ and $\partial L / \partial c$, respectively.
(c) Write a Python function compute_gradient(X, y, a, b, c) that takes in the data X and y along with the parameters a, b, and c, and returns the gradient vector as a numpy array $[\partial L/\partial a, \partial L/\partial b, \partial L/\partial c]$.
3. Implement Gradient Descent for Optimization
(a) Initialize the parameters a, b, and c with some starting values (e.g., all zeros).
(b) Set a small learning rate η (e.g., 0.00001) to prevent parameter values from growing too quickly during updates.
(c) Set a convergence threshold (e.g., $1 \times 10^{-6}$) to determine when to stop iterating.
(d) Write a loop that performs the following steps:
i. Compute the current loss L based on the current values of a, b, and c.
ii. Use compute_gradient to calculate the gradient vector for the current values of a, b, and c.
iii. Update the parameters using the gradient descent update rule:
    $a \leftarrow a - \eta \frac{\partial L}{\partial a}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b}, \qquad c \leftarrow c - \eta \frac{\partial L}{\partial c}$
iv. Break out of the loop if the change in loss between iterations is smaller than the convergence threshold.
(e) After convergence, print the optimized values of a, b, and c.
4.
Convert Parameters Back to Original Scale
(a) Since you optimized the parameters on the normalized data, convert the parameters back to the original scale for accurate interpretation. Writing $a_n$, $b_n$, $c_n$ for the parameters fitted on the normalized data, use the following transformations:
    $a = \frac{\sigma_y}{\sigma_X^2} a_n, \qquad b = \sigma_y \left(\frac{b_n}{\sigma_X} - \frac{2 a_n \mu_X}{\sigma_X^2}\right), \qquad c = \mu_y + \sigma_y \left(c_n - \frac{b_n \mu_X}{\sigma_X} + \frac{a_n \mu_X^2}{\sigma_X^2}\right)$
5. Plotting and Evaluation
(a) Plot the original data points and the fitted quadratic model using the de-normalized parameters.
(b) Plot the loss over iterations to observe the convergence of gradient descent.
(c) Calculate the final loss and print it to evaluate the fit quality.

Questions
(a) Why is it important to remove outliers when performing regression? How could outliers affect the quality of your model?
(b) Explain why a quadratic model was chosen here instead of a simple linear or exponential model. Under what circumstances would each model type be appropriate?
(c) The RMSE provides a measure of fit quality. Would a lower RMSE always indicate a better model in a physical or chemical context? Why or why not?
(d) What impact does the learning rate η have on the convergence of gradient descent? What happens if η is too large or too small?
(e) Explain the purpose of data normalization in gradient descent. How does normalization affect convergence and stability?

What to Turn In:
1. Your Python code implementing gradient descent and data normalization.
2. A plot showing the original data and the fitted quadratic curve.
3. A plot of the loss over iterations, illustrating the convergence behavior.
4. A brief explanation answering each of the questions posed above.
5. The final optimized parameters a, b, and c on the original scale, along with the final loss.

Hints:
• To track convergence, keep a record of the previous loss and compare it with the current loss in each iteration.
• If you observe very large or NaN values in your parameters, reduce the learning rate or check your gradient computation.
• Use numpy operations (e.g., np.sum, np.dot) to implement the gradient calculations efficiently.
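The workflow in Problem 3 (normalize, run gradient descent, convert back) can be sketched roughly as follows. This is only an illustration under stated assumptions: the loss is taken to be the plain sum of squared residuals, the data are synthetic and noise-free, and the learning rate, threshold, and the helper name fit_quadratic_gd are choices made for this sketch rather than values prescribed by the assignment. The back-conversion is obtained by substituting the normalization into the quadratic model.

```python
import numpy as np


def compute_gradient(X, y, a, b, c):
    """Gradient of L = sum((y - (a*X**2 + b*X + c))**2) with respect to (a, b, c)."""
    residual = y - (a * X**2 + b * X + c)
    return np.array([
        -2.0 * np.sum(residual * X**2),  # dL/da
        -2.0 * np.sum(residual * X),     # dL/db
        -2.0 * np.sum(residual),         # dL/dc
    ])


def fit_quadratic_gd(X, y, lr=1e-3, tol=1e-12, max_iter=200_000):
    """Plain gradient descent; returns the parameters and the loss history."""
    params = np.zeros(3)
    losses = []
    for _ in range(max_iter):
        a, b, c = params
        loss = np.sum((y - (a * X**2 + b * X + c)) ** 2)
        losses.append(loss)
        # Stop once the loss changes by less than the threshold.
        if len(losses) > 1 and abs(losses[-2] - loss) < tol:
            break
        params = params - lr * compute_gradient(X, y, a, b, c)
    return params, losses


# Synthetic, noise-free data (illustrative values only, not the course data).
x_raw = np.linspace(0.0, 6.0, 50)
y_raw = 0.5 * x_raw**2 - 1.2 * x_raw + 2.0

# Normalize both variables to zero mean and unit variance.
mu_x, sigma_x = x_raw.mean(), x_raw.std()
mu_y, sigma_y = y_raw.mean(), y_raw.std()
x_n = (x_raw - mu_x) / sigma_x
y_n = (y_raw - mu_y) / sigma_y

(a_n, b_n, c_n), losses = fit_quadratic_gd(x_n, y_n)

# Map the parameters fitted on normalized data back to the original scale.
a = sigma_y * a_n / sigma_x**2
b = sigma_y * (b_n / sigma_x - 2.0 * a_n * mu_x / sigma_x**2)
c = mu_y + sigma_y * (c_n - b_n * mu_x / sigma_x + a_n * mu_x**2 / sigma_x**2)
print(a, b, c, len(losses))
```

Because the synthetic data are exactly quadratic, the recovered a, b, and c should land close to the generating values; with real, noisy data the loss levels off at a nonzero value instead.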
INFS2044 Assignment 2 Case Study In this assignment, you will be developing a system for finding images based on the objects present in the images. The system will ingest images, detect objects in the images, and retrieve images based on labels associated with objects and by similarity with an example image. Use Cases The system supports the following use cases: • UC1 Ingest Image: User provides an image, and System stores the image, identifies objects in the image, and records the object types detected in the image in an index. • UC2 Retrieve Objects by Description: User specifies a list of object types, and the system returns the images in its index that match those listed. The system shall support two matching modes: o ALL: an image matches if and only if an object of each specified type is present in the image o SOME: an image matches if an object of at least one specified type is present in the image • UC3 Retrieve Similar Images: User provides an image, and the system retrieves the top K most similar images in order of descending similarity. The provided image may or may not already be in the system. The similarity between two images is determined based on the cosine similarity measure between the object types present in each image. The integer K (K>1) specifies the maximum number of images to retrieve. • UC4 List Images: System shows each image and the object types associated with each image in the index. 
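The two matching modes of UC2 map naturally onto set operations. The sketch below is a minimal illustration, not the required design: the function names, the mode strings, and the tiny in-memory index (with made-up file names) are all hypothetical and ignore the component decomposition the assignment asks for.

```python
def matches(image_labels, query_labels, mode):
    """Return True if an image's labels satisfy the query under the given mode."""
    image_set, query_set = set(image_labels), set(query_labels)
    if mode == "ALL":
        # Every requested object type must be present in the image.
        return query_set <= image_set
    if mode == "SOME":
        # At least one requested object type must be present in the image.
        return not query_set.isdisjoint(image_set)
    raise ValueError(f"unknown matching mode: {mode}")


def search(index, query_labels, mode):
    """Return the sorted paths of all indexed images that match the query."""
    return sorted(path for path, labels in index.items()
                  if matches(labels, query_labels, mode))


# Hypothetical index: image path -> detected object types.
index = {
    "img_a.jpg": ["car", "person", "truck"],
    "img_b.jpg": ["chair", "person"],
    "img_c.jpg": ["car"],
}

print(search(index, ["car", "person"], "ALL"))
print(search(index, ["car", "person"], "SOME"))
```

With this index, the ALL query keeps only the image containing both a car and a person, while the SOME query keeps all three images.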
Example Commands
The following are example commands that the command line frontend of the system shall implement:

UC1:
$ python image_search.py add example_images/image1.jpg
Detected objects chair,dining table,potted plant
$ python image_search.py add example_images/image2.jpg
Detected objects car,person,truck
$ python image_search.py add example_images/image3.jpg
Detected objects chair,person
$ python image_search.py add example_images/image4.jpg
Detected objects car
$ python image_search.py add example_images/image5.jpg
Detected objects car,person,traffic light
$ python image_search.py add example_images/image6.jpg
Detected objects chair,couch

UC2:
$ python image_search.py search --all car person
example_images/image2.jpg: car,person,truck
example_images/image5.jpg: car,person,traffic light
2 matches found.
$ python image_search.py search --some car person
example_images/image2.jpg: car,person,truck
example_images/image3.jpg: chair,person
example_images/image4.jpg: car
example_images/image5.jpg: car,person,traffic light
4 matches found.

UC3:
$ python image_search.py similar --k 999 example_images/image3.jpg
1.0000 example_images/image3.jpg
0.5000 example_images/image6.jpg
0.4082 example_images/image1.jpg
0.4082 example_images/image2.jpg
0.4082 example_images/image5.jpg
0.0000 example_images/image4.jpg
$ python image_search.py similar --k 3 example_images/image3.jpg
1.0000 example_images/image3.jpg
0.5000 example_images/image6.jpg
0.4082 example_images/image1.jpg
$ python image_search.py similar example_images/image7.jpg
0.5774 example_images/image1.jpg

UC4:
$ python image_search.py list
example_images/image1.jpg: chair,dining table,potted plant
example_images/image2.jpg: car,person,truck
example_images/image3.jpg: chair,person
example_images/image4.jpg: car
example_images/image5.jpg: car,person,traffic light
example_images/image6.jpg: chair,couch
6 images found.
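The similarity scores in the UC3 examples can be reproduced by hand. For Boolean label vectors, the dot product is the number of shared object types and each norm is the square root of the number of types in that image. The sketch below uses plain Python sets in place of the supplied encoder and of sklearn; the function names and the in-memory index are hypothetical and serve only to check the arithmetic.

```python
import math


def cosine_similarity(labels_a, labels_b):
    """Cosine similarity between two label sets viewed as Boolean vectors."""
    set_a, set_b = set(labels_a), set(labels_b)
    if not set_a or not set_b:
        return 0.0  # an image with no detected objects is similar to nothing
    return len(set_a & set_b) / math.sqrt(len(set_a) * len(set_b))


def most_similar(index, query_labels, k):
    """Return the k highest-scoring (similarity, path) pairs, best first."""
    scored = [(cosine_similarity(query_labels, labels), path)
              for path, labels in index.items()]
    # A stable sort keeps images with equal scores in index order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Labels taken from the UC1 example output.
index = {
    "example_images/image1.jpg": ["chair", "dining table", "potted plant"],
    "example_images/image2.jpg": ["car", "person", "truck"],
    "example_images/image3.jpg": ["chair", "person"],
    "example_images/image4.jpg": ["car"],
    "example_images/image5.jpg": ["car", "person", "traffic light"],
    "example_images/image6.jpg": ["chair", "couch"],
}

top3 = most_similar(index, index["example_images/image3.jpg"], 3)
for score, path in top3:
    print(f"{score:.4f} {path}")
```

For image3 (chair, person) against image6 (chair, couch), one shared type out of two each gives 1/√(2·2) = 0.5; against image1, one shared type gives 1/√(2·3) ≈ 0.4082, in line with the example output.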
Other requirements Input File Format The system shall be able to read and process images in JPEG format. For UC2, you can assume that all labels are entered in lowercase, and labels containing spaces are appropriately surrounded by quotes. Output Format The output of the system shall conform to the format of the example outputs given above. Unless indicated otherwise, the output of the system does not need to be sorted. For UC3, the output shall be sorted in descending order of similarity. That is, the most similar matching image and its similarity shall be listed first, followed by the next similar image, etc. For UC4, the output shall be sorted in ascending alphabetical order. Internal Storage You are free to choose either a file-based storage mechanism or an SQLite-based database for the implementation of the Index Access component. The index shall store the file path to the image, not the image data itself. Object detection The supplied code for object detection can detect ~90 object types. Future variations • Other object detection models (including external cloud-based systems) could be implemented. • Additional object types could be introduced. • Additional query types could be introduced. • Other similarity metrics could be implemented. • Other indexing technologies could be leveraged. • Other output formats (for the same information) could be introduced. These variations are not in scope for your implementation in this assignment, but your design must be able to accommodate these extensions largely without modifying the code that you have produced. 
Decomposition
You must use the following component decomposition as the basis for your implementation design:

The responsibilities of the elements are as follows:

Elements                   Responsibilities
Console App                Front-end, interact with the user
Image Search Manager       Orchestrates the use case processes
Object Detection Engine    Detect objects in an image
Matching Engine            Finds matching images given the object types
Index Access               Stores and accesses the indexed images
Image Access               Read images from the file system

You may introduce additional components in the architecture, provided that you justify why these additional components are required.

Scope & Constraints
Your implementation must respect the boundaries defined by the decomposition and include classes for each of the elements in this decomposition.
The implementation must:
• run using Python 3.10 or higher, and
• use only the Python 3.10 standard libraries and the packages listed in the requirements.txt files supplied with this case study, and
• not rely on any platform-specific features, and
• extend the supplied code, and
• correctly implement the functions described in this document, and
• function correctly with any given input files (you can assume that the entire content of the files fits into main memory), and
• include a comprehensive unit test suite using pytest, and
• adhere to the given decomposition and design principles taught in this course.
Focus your attention on the quality of the code. It is not sufficient to merely create a functionally correct program to pass this assignment. The emphasis is on creating a well-structured, modular, object-oriented design that satisfies the design principles and coding practices discussed in this course.

Implementation Notes
You can use the code supplied in module object_detector.py to detect objects in images and to encode the tags associated with an image as a Boolean vector (which you will need to compute the cosine similarity). Do not modify this file.
You can use the function matplotlib.image.imread to load the image data from a file, and sklearn.metrics.pairwise.cosine_similarity to compute the cosine similarity between two vectors representing lists of tags.
MAT E 640 Advanced Thermodynamics in Materials
Department of Chemical and Materials Engineering
Deferred Final Exam
January 20, 2021
180 minutes
Answer all questions. State any assumptions you make and explain your answers. The total number of marks is 90.

1. Choose whether the following statements are true or false. If a statement is false, explain why. [10]
T F The enthalpy of mixing in an ideal solution is independent of temperature, while the enthalpy of mixing in a regular solution depends on temperature.
T F In a typical phase diagram, as temperature decreases, the Gibbs free energy of both the solid and the liquid solution as a function of composition also decreases.
T F The partial molar value of an extensive property in a mixture of ideal gases is the same as the molar property of its pure component.
T F When an A-B solution shows positive deviation from Raoult’s law, A-A and B-B components are more likely to form clusters in solution.
T F For a non-ideal gas, the molar volumes of the vapor and the liquid remain constant during the transformation from vapor to liquid below the critical pressure.

2. Multiple Choice - Circle ONE correct answer for each. (10 marks)
i) Which of the following is true for the heat and work during a thermodynamic process?
a) q for reversible < q for irreversible and work for reversible > work for irreversible
b) q for reversible > q for irreversible and work for reversible > work for irreversible
c) q for reversible > q for irreversible and work for reversible < work for irreversible
d) q for reversible < q for irreversible and work for reversible < work for irreversible
ii) Which constraints must be imposed on a system to make the Gibbs function decrease?
a) constant T and P
b) constant U and T
c) constant U and V
d) constant T and V
iii) During phase transitions like vaporization, melting and sublimation
a) pressure and temperature remain constant
b) volume and entropy change
c) both of the mentioned
d) none of the mentioned
iv) What is the configurational entropy of 5 moles of CH3Cl at absolute zero?
a) 0 J·K⁻¹
b) 57.6 J·K⁻¹
c) 11.5 J·K⁻¹
d) 45.6 J·K⁻¹
e) 9.12 J·K⁻¹
v) The van der Waals equation is the equation of state for a non-ideal gas. The properties of a gas predicted by this equation are different from those of an ideal gas. Under what two conditions are the properties of a real gas expected to be the most non-ideal?
a) low temperature and high pressure
b) low temperature and low pressure
c) high temperature and high pressure
d) high temperature and low pressure

3. Shown below is the free energy of mixing diagram for liquid and solid solutions of a binary A-B system at temperature T1 (assuming Δs°m,A = Δs°m,B).
a) Is T1 greater than, less than or equal to the melting temperatures for pure A and pure B? (5 Marks)
b) Which element has the higher melting point? (5 Marks)
c) This system exhibits a positive deviation from ideal behaviour for both the liquid and solid solutions. Which solution (liquid or solid) has the larger enthalpy of mixing? (5 Marks)

4. Briefly explain the difference in phase separation due to composition fluctuation within the miscibility gap between the cases of i) an alloy composition within the binodal region but outside the spinodal region, and ii) an alloy composition within the spinodal region (hint: compare free energy change, morphology difference, etc.). (10 Marks)

5. For sulfur dioxide, Tcr = 430.7 K and Pcr = 77.8 atm. Calculate
a) The critical van der Waals constants for the gas.
b) The critical volume of van der Waals SO2.
c) The pressure exerted by 1 mole of SO2 occupying a volume of 500 cm³ at 500 K.
Compare this with the pressure which would be exerted by an ideal gas occupying the same molar volume at the same temperature. (10 Marks)

6. In the P-V diagram, sketch a typical isotherm for a van der Waals gas at a temperature below the critical temperature. Label the equilibrium pressure as accurately as possible. (5 Marks)

7. A binary phase diagram is given below. Predict the Ωl and estimate the heat (enthalpy) of melting for pure Li. (10 Marks)

8. For the Mg-Pb phase diagram given below, draw schematics of plausible molar free energy curves showing the common tangent construction as a function of composition, xPb, at T = 600 °C, T = 455 °C, T = 250 °C, and T = 100 °C. (20 Marks)
Course Information
Course Title: STATISTICAL DATA MINING
Course Number and Section: MATH 4720 W03

Course Description
An introductory course to statistical data mining. It covers some fundamental concepts, popular techniques, and algorithms in statistical data mining. Topics include: supervised learning; unsupervised learning; probabilistic reasoning; regression; nearest-neighbors; classification; model selection; component analysis; random forest; support vector machine; and clustering.
Prerequisite(s): MATH 3710 or Approved Petition Required

Course Level Student Learning Outcomes
Upon successful completion of this course, the student will be able to:
• Restate basic concepts and terminologies in statistical learning.
• Describe how and when learning works on practical problems.
• Implement some specific algorithms and methods in statistical learning.
• Apply some techniques in learning to real world data.
• Critically evaluate the results in the form of written reports and present them to classmates and others.

Instructional Technique(s)
This course is taught mainly through lectures. For the technical and implementation parts, I will ask students to use programming to run simulations or real data analyses. During the lectures, students are welcome to give me any feedback or suggestions.

Required Textbooks and Materials
Signal Processing and Machine Learning with Applications
Michael M. Richter, Sheuli Paul, Veton Këpuska, Marius Silaghi
Springer 2022
Bookstore Link: https://link.springer.com/book/10.1007/978-3-319-45372-9
Your Campus bookstore offers a Price Match guarantee. If you find our class texts or access codes cheaper at Booksmart, Barnes & Noble, or Amazon, the campus bookstore will match the price at the time of purchase, or for up to 7 days after purchase. Search your course materials by the ISBN provided in this syllabus to assure that your price match is acceptable.
Topics and Assignments

Week/Unit   Topics                                        Assignments Due
week 1      Digital Signal Representation                 to be announced in the lecture
week 2      Signal Processing Background                  to be announced in the lecture
week 3      Fundamentals of Signal Transformations        to be announced in the lecture
week 4      Digital Filters                               to be announced in the lecture
week 5      Estimation and Detection                      to be announced in the lecture
week 6      Adaptive Signal Processing                    to be announced in the lecture
week 7      Spectral Analysis                             to be announced in the lecture
week 8      midterm exam                                  none
week 9      General Learning                              to be announced in the lecture
week 10     Signal Processes, Learning, and Recognition   to be announced in the lecture
week 11     Stochastic Processes                          to be announced in the lecture
week 12     Feature Extraction                            to be announced in the lecture
week 13     Unsupervised Learning                         to be announced in the lecture
week 14     Markov Model and Hidden Stochastic Model      to be announced in the lecture
week 15     Audio Signals and Speech Recognition          to be announced in the lecture
week 16     final exam                                    none

Important Dates
For important dates, please consult the Academic Calendar via the following link: https://www.wku.edu.cn/en/academics/academic-calendar

Technical Requirements (if any)
1. In order for your Canvas course to function correctly, you need to use an appropriate internet browser, either Google Chrome or Firefox. It is best to use the most updated versions of these browsers.
2. Many students are eligible for a free MS Office Software Student Edition. To start the application process, go to the Office 365 Education website. Eligible students are required to create an account and provide a valid Kean University ID to obtain access to the software applications.
3. Remember to download the latest versions of software used in this class.

Assessment
I. (for those skipping my lectures less than or equal to 2 times)
10%: homework, class participation, attendance, presentations;
40%: midterm exam
50%: final exam
II.
(for those skipping my lectures 3 to 4 times) 10%: homework, class participation, attendance, presentations; 30%: midterm exam 35%: final exam 25%: oral exam III. (for those skipping my lectures more than 5 times) 10%: homework, class participation, attendance, presentations; 20%: midterm exam 30%: final exam 40%: oral exam
Faculty of Arts & Science
Fall 2024 Quiz 5
CSC 110 Y1F

Question 1. Tabular Data [6 marks]
Here is a sample list similar to the data we saw in the lecture. Each column represents an ID, a civic centre name, the number of marriage licenses issued, and the month and year when they were issued (YYYY, MM, DD).

    import datetime

    marriage_data = [
        [1657, 'ET', 80, datetime.date(2011, 1, 1)],
        [1658, 'NY', 136, datetime.date(2011, 1, 1)],
        [1659, 'SC', 159, datetime.date(2011, 1, 1)],
        [1660, 'TO', 367, datetime.date(2011, 1, 1)],
        [1662, 'NY', 150, datetime.date(2011, 2, 1)],
        [1664, 'TO', 383, datetime.date(2011, 2, 1)]
    ]

a) What would the following expressions evaluate to? Write your answers in the space provided below each statement.

    >>> len(marriage_data)

    >>> marriage_data[5][2]

    >>> len(marriage_data[-1])

    >>> min([row[2] for row in marriage_data])

b) Fill in a plain English docstring description for the function below, based on the provided function body. The preconditions are provided for you. You do not need to provide any doctests or further preconditions.

    def do_something(data: list[list], year: int) -> float:
        """

        Preconditions:
        - year is a positive integer
        - data satisfies all of the properties described in the beginning of this question
        """
        counts = [row[2] for row in data if row[3].year == year]
        num_months = len({row[3].month for row in data if row[3].year == year})
        return sum(counts) / num_months

Question 2. Dataclasses [3 marks]
Recall this dataclass we discussed in lecture:

    @dataclass
    class MarriageData:
        """A record of the number of marriage licenses issued in a civic centre in a given month.

        Instance Attributes:
        - id: a unique identifier for the record
        - civic_centre: the name of the civic centre
        - num_licenses: the number of licenses issued
        - date_issued: the month and year these licenses were issued
        """
        id: int
        civic_centre: str
        num_licenses: int
        date_issued: datetime.date

Rewrite the function body from the do_something function from the previous question so that it now deals with a list of MarriageData instances. Fill this in below:

    def do_something_v2(data: list[MarriageData], year: int) -> float:
        """
        Docstring omitted. The code should do the same thing that it does in Question 1.
        """
        counts =

        num_months =

        return sum(counts) / num_months

Question 3. Debugging / Index-Based For Loops [3 marks]
The function below is an incorrect attempt to return True if a string s is a palindrome. Your friend Bob believes the function is correct because he tried calling the function with some string arguments for which the code did work correctly. Answer the questions below.

    def is_palindrome(s: str) -> bool:
        """Return whether s is a palindrome.

        A palindrome is a string that reads the same backward as forward.

        >>> is_palindrome('davad')
        True
        >>> is_palindrome('david')
        False
        """
        n = len(s)
        for i in range(n // 2):
            if s[i] != s[n - 1 - i]:
                return False
            else:
                return True

Part (a) [1 mark] Give an example of a valid argument for s where this function will return the correct expected value (according to the docstring):

Part (b) [1 mark] Give an example of a valid argument for s where this function will NOT return the correct expected value:

Part (c) [1 mark] Briefly explain (in 1–2 sentences) why this function is incorrect (identify the issue in the code):
FBE 506 Quantitative Methods in Finance
Assignment # 2

1. Graph the following functions:
   a. Y = 1.5X + 4
   b. Y = Ln(X)
   c. Y = e^X
   d. Y = 5/(X - 1)
   e. Y = X^2 - 2X + 20
   f. Y = X^3 - X^2 + 12X + 20
2. Graph the following functions where T is time:
   a. Y = 0.5^T
   b. Y = 2^T
   c. Y = (-0.5)^T
3. Download the S&P500 index. Convert the index from daily to monthly. Convert the monthly index to the monthly return to the S&P500. Graph the monthly index. Graph the monthly return to the S&P500.
4. Download the quarterly real GDP from FRED and graph the GDP. Convert the real GDP to real GDP growth and graph the real GDP growth.
5. Download AAPL (Apple stock prices). Convert AAPL to monthly AAPL. Graph the monthly price data. Convert the monthly AAPL price to the return to monthly AAPL and graph the return to monthly AAPL.
6. Graph monthly AAPL and monthly S&P500 on one coordinate system and comment on the relation between the two variables.
7. Graph returns to monthly AAPL and monthly S&P500 on one coordinate system and comment on their relation.
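The daily-to-monthly conversion and the return calculation in items 3 and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "monthly" is taken to mean the last available observation in each calendar month, the return is the simple return P_t / P_(t-1) - 1, and the prices are made up (they are not real S&P500 or AAPL data). The function names `to_monthly` and `simple_returns` are hypothetical.

```python
from datetime import date


def to_monthly(daily):
    """Keep the last available observation of each calendar month.

    daily: list of (date, price) pairs sorted by date.
    Returns a list of ((year, month), price) pairs in time order.
    """
    last_in_month = {}
    for day, price in daily:
        # Later days in the same month overwrite earlier ones.
        last_in_month[(day.year, day.month)] = price
    return sorted(last_in_month.items())


def simple_returns(monthly):
    """Return [((year, month), P_t / P_(t-1) - 1)] for consecutive months."""
    out = []
    for (_, prev), (month, cur) in zip(monthly, monthly[1:]):
        out.append((month, cur / prev - 1.0))
    return out


# Made-up prices, for illustration only.
daily_prices = [
    (date(2024, 1, 30), 100.0),
    (date(2024, 1, 31), 102.0),
    (date(2024, 2, 28), 101.0),
    (date(2024, 2, 29), 107.1),
    (date(2024, 3, 28), 96.39),
]
monthly_prices = to_monthly(daily_prices)
monthly_returns = simple_returns(monthly_prices)
```

Other conventions (for example monthly averages, or log returns) are equally defensible; whichever is chosen should be stated next to the graphs.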
Faculty of Arts & Science
Fall 2024 Quiz 5 - V2
CSC 110 Y1F

Question 1. Tabular Data [6 marks]

Consider the following sample list representing student scores. Each row contains a student ID, their name, and their scores on three quizzes.

    student_scores = [
        [1099752, 'Alice', 85, 90, 88],
        [1087711, 'Bob', 78, 85, 92],
        [1000023, 'Priya', 90, 92, 87],
        [100048, 'Muchen', 82, 88, 85],
        [109943, 'Chirly', 88, 84, 90]
    ]

Part (a) [3 marks] What would the following pieces of code evaluate to? Write your answers in each blank space provided.

    >>> len(student_scores[0])

    >>> student_scores[3][1]

    >>> max([row[0] for row in student_scores])

Part (b) [3 marks] Complete the function below, based on the provided specification.

    def student_quiz1_meets_threshold(data: list[list], student_id: int, goal_score: int) -> bool:
        """Return whether the student with student_id scored at least goal_score on quiz 1.

        If the student_id does not exist in data, return False.

        Preconditions:
          - goal_score > 0
          - data is a valid list of student scores, structured the way we described above
          - quiz 1 scores appear at index 2 in each student data sublist

        >>> student_quiz1_meets_threshold(student_scores, 1099752, 80)
        True
        >>> student_quiz1_meets_threshold(student_scores, 1087711, 90)
        False
        >>> student_quiz1_meets_threshold(student_scores, 1, 90)
        False
        """

Question 2. Python Dataclasses [3 marks]

We want to represent each student’s data using data classes instead. Complete the StudentData class below. Each StudentData instance should have the following attributes for a student: id, name and a list of integer quiz_scores. When declaring the data type of each attribute, make it as specific as possible.
You should also include the following representation invariants:
• There are a total of three scores in the quiz_scores list
• All scores in the quiz_scores list are greater than or equal to 0

    from dataclasses import dataclass

    @dataclass
    class StudentData:
        """Data about a student's performance on three quizzes.

        Representation Invariants:
        # TODO: Write the two representation invariants below, as Python expressions

        Instance Attributes:
        # TODO: Write each instance attribute's variable name and description below
        """
        # TODO: Write each instance attribute's variable name and specific data type below

Question 3. Debugging / For Loops [3 marks]

The function below is an incorrect attempt to return True if a string s is made up of only uppercase vowels (that is, an A, E, I, O or U). Your friend Bob believes the function is correct because he tried calling the function with some string arguments for which the code did work correctly. Answer the questions below.

    def all_upper_vowel(s: str) -> bool:
        """Return True if and only if s consists of all uppercase vowels."""
        for char in s:
            if char in 'AEIOU':
                return True
        return False

Part (a) [1 mark] Give an example of a valid argument for all_upper_vowel where this function will return the correct expected value (according to the docstring):

Part (b) [1 mark] Give an example of a valid argument for all_upper_vowel where this function will NOT return the correct expected value:

Part (c) [1 mark] Briefly explain (in 1–2 sentences) why this function is incorrect (identify the issue in the code):
CAN309 Information Theory and Data Communications
Assessment 2
all Marks 10% Submission Deadline 15

Assessment Objective
This assessment aims at evaluating students’ understanding and problem-solving skills in Channel Coding, Cryptography, Transceiver Design and Data Communications Networking, which are accumulated during lectures, tutorials and after-class study.

Submission Procedure
Please submit the electronic copy on Learning Mall Online.

Marking Scheme
The specific marks assigned are shown in the right column of each question and sub-question.

The assessment of Exercise 3 and Exercise 6 includes MATLAB implementation. Please include your MATLAB code script in the context of the report (NOT as a separate .m file) for Exercise 3. The assessment of Exercise 6 is in the form of a short report. It should include:
a) A short analysis of the questions and the equations used in deriving your results/codes.
b) Results and plots (if needed).
c) Discussion and conclusion.

The assignment covering the 6 Exercises should be submitted as a single report in PDF format, named ‘Student ID_GivenName_Surname.pdf’. The designed MATLAB codes in .m format for Exercise 6 should be enclosed together with the report as separate file(s). Please compress your report and MATLAB codes into a single .zip file named ‘Student ID_GivenName_Surname.zip’ for the submission.

POINTS
A repetition code achieves error correction by repeating the transmitted information bits r additional times. If the transmitted information is 1 bit and the number of redundant binary bits is r = 6, answer the following questions:
i) Construct the repetition codeword set. (2 points)
ii) What is the minimum Hamming distance d_min of the code? (2 points)
iii) How many errors in a block can the code a) detect, b) correct, and c) detect and correct at the same time? (4 points)
iv) Given a noisy channel with symbol error probability p = 0.01, calculate the bit error rate (BER) with this repetition code.
(7 points)

EXERCISE 2 - (15 POINTS)
EXERCISE 3 - (15 POINTS)
EXERCISE 4 - (15 POINTS)
EXERCISE 5 - (20 POINTS)
EXERCISE 6 - Open Ended

The theorems and principles in Channel Coding and Cryptography can be quite mathematical and difficult to comprehend. Using MATLAB to visualize specific theorems, concepts and properties helps to strengthen our understanding of the challenging parts of the material. Please select ONE theorem/concept that was introduced in Channel Coding and Cryptography, and design MATLAB code to visualize the related formulation/properties.

Note: The MATLAB codes in .m format should be typeset properly and be included together with Assignment 2 (in PDF) as a single compressed document for submission.

Quick Guidance on the Open Ended Question: The solution should be in the form of a short report covering the following three sections; the section marks are given below.
Section 1: A description of the theorem or concept with formulation. (5 points)
Section 2: MATLAB codes and graphs to visualize the theorem/concept. (10 points)
Section 3: Detailed comments and discussion. (5 points)

Appendix A: Matrix Inverse Calculation

1. Gauss elimination method
Let A be an n × n matrix. The inverse of A, if it exists, can be computed by row reduction via the following steps:
Step 1: The n × n identity matrix is augmented to the right of A, forming an n × 2n block matrix [A | I].
Step 2: Through application of elementary row operations, find the reduced echelon form of this n × 2n matrix, [I | B].
Step 3: Then BA = I, and therefore B = A^(-1), which is the inverse matrix of A.
Note: The matrix A is invertible if and only if the left block can be reduced to the identity matrix I; in this case the right block of the final matrix is A^(-1).

2. The determinant method
Given a square matrix A, the inverse of A can be calculated via the following steps:
Step 1: Find the determinant of A, |A|. If |A| = 0, A^(-1) does not exist. If |A| ≠ 0, one can proceed to find the inverse of the matrix.
Step 2: Replace each element of A by its cofactor.
Step 3: Transpose the result to form the adjoint matrix, adj(A).
Step 4: The matrix inverse is then given by A^(-1) = (1/|A|) adj(A).
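The Gauss elimination steps in Appendix A can be checked numerically with a short script. The sketch below is in Python rather than MATLAB and is not part of the assessment; the function name `inverse` and the 2 × 2 example matrix are assumptions chosen for illustration. It uses exact fractions so that no rounding error obscures the row operations.

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a square matrix by row-reducing [A | I] to [I | B] and returning B.

    Raises ValueError if the left block cannot be reduced to I (A is singular).
    """
    n = len(matrix)
    # Step 1: augment A with the n x n identity matrix.
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    # Step 2: apply elementary row operations, one pivot column at a time.
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]  # swap rows if needed
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]  # scale the pivot to 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                # Subtract a multiple of the pivot row to clear this column.
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # Step 3: the right block is now the inverse.
    return [row[n:] for row in aug]


# Illustrative matrix with determinant 2*3 - 1*5 = 1.
A = [[2, 1], [5, 3]]
A_inv = inverse(A)
```

For this example the determinant method gives the same answer by hand: |A| = 1 and adj(A) has rows (3, -1) and (-5, 2), which is what `A_inv` holds.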
System and Networks COMPSCI 4043
Monday 26 April 2021

1. (a) Express the following in 32-bit two’s complement code, giving your answers in hexadecimal. Show your working.
   i. 1023
   ii. -1023
   [4]

(b) If the calculation 100 + 30 - 20 is performed by an 8-bit CPU, using an 8-bit two’s complement code, will an overflow be generated? Explain your answer and say what result you’d expect to be generated. [3]

(c) For the mathematical set of integers, subtraction is always the same as adding the inverse of a number. Thus 3 - 2 = 3 + (-2) and 3 - (-2) = 3 + 2. Is this also true for a two’s complement code? Justify your answer. [2]

(d) Write a Sigma16 program that accesses an array X of n 16-bit two’s complement numbers and, for each element X[i] in X, stores a 0 or 1 in the corresponding element of a second array Y according to the following rule:
    Y[i] = 0 if X[i] is odd; Y[i] = 1 if X[i] is even
   [7]

(e) Write a segment of Sigma16 code (not a complete program) which sets R2 to 0 if the second most significant bit in R1 is a 0, and sets R2 to 1 otherwise. Estimate how many Sigma16 cycles your segment will take to run as you’ve written it. [4]

For reference, here is part of the instruction set of the Sigma16 CPU.

    lea    Rd, x[Ra]    Rd := x + Ra
    load   Rd, x[Ra]    Rd := mem[x + Ra]
    store  Rd, x[Ra]    mem[x + Ra] := Rd
    add    Rd,Ra,Rb     Rd := Ra + Rb
    sub    Rd,Ra,Rb     Rd := Ra - Rb
    mul    Rd,Ra,Rb     Rd := Ra * Rb
    div    Rd,Ra,Rb     Rd := Ra / Rb, R15 := Ra mod Rb
    and    Rd,Ra,Rb     Rd := Ra AND Rb
    inv    Rd,Ra,Rb     Rd := NOT Ra
    or     Rd,Ra,Rb     Rd := Ra OR Rb
    xor    Rd,Ra,Rb     Rd := Ra XOR Rb
    cmplt  Rd,Ra,Rb     Rd := Ra < Rb
    shiftl Rd,Ra,Rb     Rd := Ra logic shifted left Rb places
    shiftr Rd,Ra,Rb     Rd := Ra logic shifted right Rb places
    jumpf  Rd, x[Ra]    If Rd = 0 then PC := x + Ra
    jumpt  Rd, x[Ra]    If Rd <> 0 then PC := x + Ra
    jal    Rd, x[Ra]    Rd := PC, PC := x + Ra
    trap   Rd,Ra,Rb     PC := interrupt handler
    jump   x[Ra]        PC := x + Ra

2. (a) How do cache memories speed up program execution? Many caches only cache read cycles. Why is this and why is it usually not seen as a major limitation?
Discuss whether there are circumstances where caching write cycles would provide some benefit. [6]

(b) The following Sigma16 code is intended to take a 10-element array of two’s complement numbers (only the first element is shown) and replace all the elements with their two’s complement inverse. The number $8000 is not permitted. However, although the code will assemble, it contains several errors.
   i. Draw up a register use table for the program (suitable for inclusion as a comment).
   ii. Identify the errors and explain how you would correct them.
   iii. Write out the corrected program.

            LOAD  R1,1[R0]    ;Set R1 to constant 1
            ADD   R2,R0,R0    ;i:=0
            LOAD  R3,n[R0]    ;Set R3 to n
    FORLOOP CMPEQ R5,R2,R3    ;Is i
7SSGN110 Environmental Data Analysis | Practical 1 | Introduction to Excel & data exploration

1. Introduction

1.1. About this practical

This practical will introduce the manipulation and exploration of data using Microsoft Excel and the R statistical environment. The result of your work in Excel can contribute to the formative coursework assignment.

We will investigate changes in water height (‘water level’) in the River Frome in Dorset, linked to rainfall in the upstream catchment. Field data are from a pressure transducer placed on the riverbed at a site near Frampton, which recorded water pressure from May 2003 to April 2005 (data used in the practical are from Moggridge and Goodson, 2005). Pressure Transducers (PTs) measure the pressure produced by a column of water above a sensing element and output a voltage. The higher the pressure recorded by the PT, the higher the water level sitting above it. The voltage output readings were taken every 15 minutes for the duration of the study and recorded in a data logger.

This practical will introduce you to (or refresh your knowledge of) the following Excel functions: simple formulae, scatter plots, simple linear regression, time and date formats and conversions, text to columns, pivot tables, time-series graphs and graph formatting. Some knowledge of Excel is assumed.

This practical will also introduce the R statistical environment and show how it can be used to perform the same functions. For the formative coursework you should present plots and graphs created in Excel only. The instructions for R are to get you started learning how to use that software environment; we will focus more on R in future weeks.

1.2. Download the data

Data from Moggridge and Goodson (2005) are provided in an Excel spreadsheet “Frome_Data.xlsx” on KEATS. Download this file to your computer and open it in the Desktop App (do not use the online web version because this lacks some of the functionality we’ll be using).

1.3. Where to save the data?
At King’s, your personal computing storage is in the ‘cloud’, referred to as (Microsoft) OneDrive. It is accessible through the same suite of services as your Outlook email, Calendar, and the online versions of MS Office Apps (Word, Excel, PowerPoint), from any laptop or desktop computer, at home or at King’s. Open your OneDrive to see what kind of directory structure it has and to create a new Folder for your EDA practical exercises.

It is also good practice to always have a USB memory stick with you for saving your work, so that you can access your files when offline. Furthermore, when doing GIS and Remote Sensing (other modules), many files are extremely large (many GB) and a USB memory stick can be more efficient than uploading and downloading from the ‘cloud’.

Whichever file management method you use, you will need to be able to identify full paths for files and folders when you start working in R. Paths may look like ‘E:/My Documents/EDA/etc’ (PC) or like ‘~/users/yourusername/desktop/etc’ (Mac).

• Q1: After doing some research about your machine, what is the path (address) you would use to access the folder you have just created? Type the path here:

Make sure to start saving your work in this folder.

2. Data familiarization

Open the Frome data in the Excel spreadsheet and familiarise yourself with the data. This workbook contains three worksheets:
1. Metadata: information on the data in the other worksheets
2. Logger: the data from the logger
3. Calibration Data: a small set of data points where the relative water height (also called water level) in the river was manually measured and matched against the recorded PT voltage output

Go to the worksheet Logger. This is typical of what a data logger output may look like. These data show the PT voltages over the duration of the study. However, for the purpose of this study, we need water level information, not PT voltage.
In order to derive water level from a water pressure measurement, at certain points in the study the water level was recorded manually and the corresponding PT voltage was noted. This is recorded in the Calibration Data worksheet. By doing this we can convert the voltages to relative water heights.

Check you understand what the values in each column of each worksheet refer to by reviewing the Metadata worksheet. You should always go through this process of checking what the fields and values refer to when encountering a new dataset.

3. Converting PT Output to Water Levels

Using the Calibration Data, we can establish the relationship between voltage and water height and apply this relationship to all the Logger data, to obtain a full time-series of water heights. We will do this using a technique called linear regression, which we will cover in more detail later on in the module.

One way to do a linear regression in Excel is to plot data as a scatterplot and then add a simple linear regression line. To do this:
a) Go to the Calibration Data worksheet.
b) Select the PT and WaterLevel data. Create a scatterplot of these data. If you do not know how to create the plot: with the data selected, go to Insert → Charts → X Y (Scatter).

Your plot should appear as a window within the worksheet. It is often convenient to move plots (known as charts in Excel) to their own worksheet. To do this:
c) Right-click on an empty area near the top of the plot → Move Chart → select New sheet.
d) Rename the new sheet created (from ‘Chart1’ to something sensible) by right-clicking on the tab at the bottom of the window, clicking Rename, and then typing the new name.

Putting your chart in a new sheet will make it easier to edit the chart. You should see a clear pattern in the data. You can now perform a simple linear regression on these data, which will create a line of best fit. We can then use the equation of this line to convert the PT voltages to water levels.
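For readers who want to see what the trendline computes, the least-squares fit can be sketched in Python. This is not part of the Excel workflow and does not use the Frome calibration data: the voltage and water-level values below are made up, the units are arbitrary, and the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept, r_squared) for the least-squares line
    y = slope * x + intercept through the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear regression, R-squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up calibration pairs: PT output and manually measured water level.
pt_output = [1.0, 2.0, 3.0, 4.0]
water_level = [0.24, 0.46, 0.64, 0.86]

slope, intercept, r2 = fit_line(pt_output, water_level)

# Apply the fitted equation to new PT readings, as is done for the Logger sheet.
predicted = [slope * v + intercept for v in [1.5, 2.5]]
```

The slope and intercept play the same role as the coefficients in the equation Excel displays on the chart, and `r2` corresponds to the R-squared value shown next to it.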
To perform a simple linear regression on data in a chart in Excel:
e) Right-click on the data series and select Add Trendline. A menu should appear.
   o In Trendline options, make sure Linear is selected
   o Check Display equation on chart and Display R-squared value on chart
   o Close the menu by clicking the cross in the top right

You should have something similar to the image below (note, this figure deliberately does not have a figure caption). This is not a well-formatted chart and you should now edit the chart to improve how it communicates the data. For example:
f) Add chart elements such as axis titles by clicking Add Chart Element at the top left of the Design tab
g) Label the X and Y axes with the variable name and its measurement units in ( )

Think about why the image below does not communicate well and what edits you could make to present the data more clearly.

Figure 3 Example of generated graph

The R² value shown on the plot gives an indication of the ‘Goodness of Fit’ and ranges from 0 to 1 (with 1 being a perfect linear relationship, 0 no relationship). Does your regression line have a good fit?

The equation of the simple linear regression (Trendline) you have created allows you to predict Water Level from a known value of PT. To do this:
h) Open the Logger worksheet
i) Label a new column as “WaterLevel”
j) Use the equation to calculate the water levels from the PT output:
   i. Select a cell in your new column, type ‘=’ in the formula bar at the top of the screen below the menu and then type the remainder of your equation (which columns contain the relevant variables?). Press Enter.
   ii. Copy the contents of your cell down to apply the equation to all rows of data

Save your Excel workbook!

4. Converting Date and Time Formats

The dates and times recorded in the Logger sheet are not in a format that Excel recognises. Examine how the time and date are displayed: what is unusual about this?
Currently, the date is given as the number of days elapsed since the start of the year; for example, the first data point was collected on the 126th day of the year 2003. To convert the dates and times from the data logger to Excel format you will first need to find out what the date was on the 126th day of the year in 2003. This can easily be done with a quick internet search, for example try this site: https://www.epochconverter.com/days/2003

Establishing the date and time of the first data point
a) Create a new column and call it “DateTime”
b) Type in the correct date and time for the first data row using this format: dd/mm/yyyy hh:mm
c) Correctly format the column in Excel as follows:
   a. Select the column
   b. Right-click on the selected cells and click Format Cells
   c. A new window should appear
      i. In the list to the left, select Custom
      ii. Select dd/mm/yyyy hh:mm (if this is not in the list you can type it in yourself).
      iii. Click OK

Next you will need to add 15-minute intervals to the first cell in order to fill the entire column accurately:
1. Select the cell below the one you just filled in
2. Type in a formula that adds 15 minutes to the cell above it, e.g.: “=E2+TIME(0,15,0)”. Note, if you are not working in column E you will need to adjust the formula accordingly.
3. Copy this formula to the rest of the column

Check Point: is this correct? Confirm with the GTA available that you have formatted the dates correctly. You should now have a column which contains data in the following format: 01/01/2004 00:00 (for example).

Compare the values in your new DateTime column with those in the respective Year, Day and Time columns to check the conversion makes sense, especially at the end of the dataset(!). This is a habit you should get into: always verify that your data processing and manipulations are working correctly throughout the whole dataset.

• Q2: What do you notice when you check whether the conversion has gone correctly all the way to the end of the dataset?
How are you going to rectify this? Do not proceed until you have fixed the problem, because otherwise your time-series will be incorrect.

Save your Excel workbook!

5. Daily Summary Data

As there are very many data points from the Logger (collected every 15 minutes) we can create daily summaries of the water level to reduce the number of values we plot in a time-series. To do this, we can use a Pivot Table, which is a means of summarising (aggregating) data in Excel. First, we need to convert our Date and Time formats so they just show the date (pivot tables require replicate measurements).

5.1. Splitting Date and Time Data

To convert formats, we will split the DateTime data, using the “Text to Columns” function (this is a useful function you should bear in mind in the future). To split date and time data using the “Text to Columns” function:
a) Select the DateTime column and copy it.
b) Select the next column to the right
c) Right-click, go to Paste Special and select the icon Values. (Make sure you format the column as Date Time in the same way as above)
d) Click OK. This has duplicated the DateTime column, but the data are simply text without formulas. See this by clicking on values in each of the columns and looking at the formula bar. How are they different between the original and duplicated columns?
e) Select this new column.
f) Click on Data → Text to Columns. A new window should appear
   i. Select Delimited and click Next
   ii. Check “space” as a delimiter (as there is a space between the date and the time) and click Next.
   iii. You can see in the Data preview that we have split the previous column into 2 new columns. With the first column selected (which has the dates), select “Date DMY” as the column data format and the destination as the first cell in the column to the right of DateTime (likely K1).
   iv. In the data preview, click on the second column (General). Check the box under column data format which says Do not import column (skip)
   v.
Click Finish
g) Label this new column “Date”

You should now have a new column which just has the date that the water level was taken (if times are still showing in this column, use Format Cells to set dd/mm/yyyy as in step 3c above).

Save your Excel workbook!

5.2. Using Pivot Tables/Charts

A pivot table is a data summarization tool used in spreadsheet software to automatically count, sum or average data based on some aggregating factor. Usually a new table is created that summarises an existing one. In this case we will create a daily summary of our data.
a) Highlight all the columns of data on your sheet
b) Click on Insert → Pivot Table → Pivot Table. All data should be automatically selected and a new window should appear.
c) Make sure that the destination of the Pivot Table is a New Worksheet and click “OK”
d) A new worksheet will appear, with a blank table
e) On the right, drag the “Date” field into the section labelled “Rows” at the bottom of the window
f) Drag the “WaterLevel” field into the section labelled Values at the bottom of the window
g) In the Values section, click on “Sum of WaterLevel”, click Value Field Settings, and change the function from “Sum” to “Average”, then click OK
   a. Right-click in your Pivot Table on one of the years in the Row Labels column and click on Ungroup
h) Right-click in your Pivot Table (i.e. on one of the values that has been calculated) and select Pivot Table Options → Totals & Filters
   o Uncheck “Show grand totals for columns” and “Show grand totals for rows”

Think about what has happened here. We have gone from data measured every 15 minutes to a summary of those data for each day in the dataset. What has this done to the amount of data in the new summary data?
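The date conversion from Section 4 and the daily averaging done by the pivot table can be sketched together in Python. This is an illustrative aside, not the Excel method: it assumes each logger row can be read as a year, a day-of-year number, an hour and a minute (the real Time column may be stored differently), the readings are made up, and the names `logger_timestamp` and `daily_means` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def logger_timestamp(year, day_of_year, hour, minute):
    """Convert a (year, day-of-year, hour, minute) reading to a datetime.

    Day 1 is 1 January, so day_of_year - 1 days are added to that date.
    """
    start_of_year = datetime(year, 1, 1)
    return start_of_year + timedelta(days=day_of_year - 1, hours=hour, minutes=minute)


def daily_means(readings):
    """Average (datetime, value) readings per calendar date.

    Returns a dict mapping each date to the mean of that day's values.
    """
    by_day = defaultdict(list)
    for stamp, value in readings:
        by_day[stamp.date()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in sorted(by_day.items())}


# Made-up 15-minute readings spanning midnight between days 126 and 127 of 2003.
readings = [
    (logger_timestamp(2003, 126, 23, 30), 0.40),
    (logger_timestamp(2003, 126, 23, 45), 0.42),
    (logger_timestamp(2003, 127, 0, 0), 0.50),
    (logger_timestamp(2003, 127, 0, 15), 0.54),
]
summary = daily_means(readings)
```

Building each timestamp from that row's own Year, Day and Time values, rather than by repeatedly adding 15 minutes to the previous row, means every timestamp depends only on its own row.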
Pivot tables are interactive, so it is best to copy and paste the data from the pivot tables into a new worksheet:
i) Create a new worksheet and give it an appropriate label
j) Copy the data in the Pivot Table
k) In your new worksheet, Paste Special with Values and Number Formatting

Through this section you have created several new variables (columns of data) and worksheets. You should consider adding to the metadata worksheet to ensure these are described properly. This will be very helpful when you come back to your data in future.

Save your Excel workbook!

6. Plotting a Time-Series

Using your new complete data set of predicted average daily water level, create a line plot (time series) of the data (e.g., select your data then go to Insert → Line → 2D Line → Line). Format your chart in a manner which you think is suitable, so that it is clear and easy to interpret. Most of this can be done through the Design menu, which appears at the top of the screen when you select a chart. Remember, you may want to move the chart to a new window before you start editing. Consider the following:
• Titles and labels of axes
• Suitable formats of the axis labels (right-click on each axis to change this or use Chart Tools). For example, the date should be clear
• Are the axes a suitable length? Do they allow easy interpretation of patterns? (this can also be changed by right-clicking on the axis or through Chart Tools)
• Suitable format of data series: is it clear? (Right-click on the data series to change this)
• The background of the chart (can be changed by right-clicking in the chart and selecting Format Plot Area)
• Gridlines (which can be changed via the Add Chart Element button):
   o Do you need gridlines on the x and y axes?
   o The format of gridlines: they should not dominate the graph!

Save your Excel workbook!

7.
Download NRFA Data

As rainfall is the key driver behind the river flow, we will download daily rainfall amounts for the catchment area upstream of Frampton from the UK National River Flow Archive (NRFA). Moggridge and Goodson (2005) collected their data at Frampton, a little distance upstream of gauging station 44004 ‘Frome at Dorchester’. Information about stations and the data collected at them are available online from the NRFA website, which is managed by the Centre for Ecology and Hydrology (CEH). The NRFA website URL is: http://nrfa.ceh.ac.uk

To download daily rainfall data:
a) Go to the link above and click on Data → Search for Data
b) In the Search Table box, enter the station ID number
c) Click on the station ID number in the filtered table
d) Click on the Daily Flow Data tab
e) Click Download catchment daily rainfall data
f) Select appropriate responses, agree to the terms and conditions and click Download

Depending on what browser you are using, the data will either be automatically downloaded or you will receive a dialogue box asking if you want to open or save the data. When downloading data from the Internet you almost always want to save the file to disk. If you click open you are at high risk of losing the data later. It is much easier to save the data to disk, then open the data in your desired software.

The data you have downloaded are in comma separated values format. You can tell this because the file suffix is ‘.csv’. A succinct description of csv files from file.org is as follows:

“Files that contain the .csv file extension are comma delimited files that contain separated database fields. These database fields have been exported into a format that contains a single line for each database record. The record is then divided and each field of the record that has been exported into a single line is separated by a comma.” [http://file.org/extension/csv]

Many environmental datasets you will download and use are in csv format.
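To make the "one line per record, fields separated by commas" description concrete, here is a minimal Python sketch that reads a small csv text with the standard `csv` module. The column names `date` and `rainfall_mm` and the three rows are invented for illustration; they are not the layout of the NRFA download, which may include additional header lines.

```python
import csv
import io

# Made-up file contents: a header line followed by one line per record.
sample = (
    "date,rainfall_mm\n"
    "2003-05-06,0.0\n"
    "2003-05-07,4.2\n"
    "2003-05-08,1.3\n"
)

# DictReader uses the first line as field names and splits each later line on commas.
reader = csv.DictReader(io.StringIO(sample))
records = [(row["date"], float(row["rainfall_mm"])) for row in reader]

# Every field arrives as text, so numeric columns must be converted explicitly.
total_rain = sum(value for _, value in records)
```

Because a csv file stores only plain text values, anything beyond the values themselves (formulas, cell formats, charts) has nowhere to live in it.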
Excel can open csv files for viewing, and save to this format. However, remember that csv format is a basic text file which will not contain, for example, Excel formulas and other formatting that you might later add to it in Excel. If you plan to work with data in Excel in the future you should save the data in .xlsx format. Let’s do that for the file you just downloaded:
a) Open the .csv file in Excel
b) ‘Save as’ an .xlsx file

The data you have downloaded are for the entire data collection period available. For the formative coursework assignment you will need to reduce this down to match the period of the field data from Frampton, and follow the instructions and requirements for the coursework.

8. Data Analysis in R

The remainder of this practical is intended to introduce you to the R statistical programming language and environment. R is a freely available, powerful language/platform for dealing with data both statistically and graphically. RStudio is an Integrated Development Environment (IDE) that enables more efficient and flexible use of R. We will use R and RStudio throughout the remainder of the module so it will be useful for you to get up to speed soon. Initially, the focus is on R, but in the remainder of the module you will likely find it useful to use RStudio. But remember, for the first coursework you need to use Excel (not R).

The introductory R activities are hosted online on two pages:
• StartR: http://bit.ly/KCL_StartR
• First R Analysis: http://bit.ly/KCL_FirstR

Work through each of the pages in turn. The StartR page provides some instructions for installing the R and RStudio software on your own computer (should you wish) and then an overview of how R and RStudio work in general. The First R Analysis page then repeats the analysis presented above (in Excel) in R. If you get a bit lost as to what the R code is doing, compare it to what you did in Excel.
The data (and script) files you will need for the activities are linked to from the pages, except for the Frome River data, which are slightly different from those used above in that they are in .csv format, not Excel format, and are found on the module KEATS page (week 1):
• LoggerData.csv
• Calibration.csv