Assignment Chef


Assignment catalog

33,401 assignments available

[SOLVED] FN3142 Quantitative Finance Summer 2020 Matlab

FN3142 Quantitative Finance, Summer 2020

Question 1

Consider the following ARMA(1,1) process:

z_t = μ + φ z_{t-1} + ε_t + θ ε_{t-1},    (1)

where ε_t is a zero-mean white noise process with variance σ², and assume |φ|, |θ| < 1 and φ + θ ≠ 0, which together make sure z_t is covariance stationary.

(a) [20 marks] Calculate the conditional and unconditional means of z_t, that is, E_{t-1}[z_t] and E[z_t].

(b) [20 marks] Set φ = 0. Derive the autocovariance and autocorrelation function of this process for all lags as functions of the parameters θ and σ.

(c) [30 marks] Assume now φ ≠ 0. Calculate the conditional and unconditional variances of z_t, that is, Var_{t-1}[z_t] and Var[z_t]. Hint: for the unconditional variance, you might want to start by deriving the unconditional covariance between the variable and the innovation term, i.e., Cov[z_t, ε_t].

(d) [30 marks] Derive the autocovariance and autocorrelation for lags of 1 and 2 as functions of the parameters of the model. Hint: use the hint of part (c).

Question 2

(a) [20 marks] Explain in your own words how one can conduct an unconditional coverage backtest for whether a Value-at-Risk measure is optimal, and relate this test to the so-called "violation ratio".

(b) [20 marks] Suppose that after we have built the hit variable Hit_t^{(i)} = 1{r_t ≤ -VaR_t^{(i)}}, i = 1, 2, for two particular Value-at-Risk measures VaR_t^{(1)} and VaR_t^{(2)}, the following simple regressions are run, with the standard errors in parentheses corresponding to the parameter estimates:

Hit_t^{(1)} = 0.06151 + u_t    (0.00432)
Hit_t^{(2)} = 0.04372 + u_t    (0.00589)

Describe how the above regression outputs can be used to test the accuracy of the VaR forecasts. Do these regression results help us decide which model is better? Explain.

(c) [20 marks] Using your own words, describe the conditional coverage backtest proposed by Christoffersen (1998), based on the fact that the hit variable is i.i.d. Bernoulli(α), where α is the critical level, under the null hypothesis that the forecast of the conditional Value-at-Risk measure VaR_t is optimal.

(d) [20 marks] Give an example of a sequence of hits for a 5% VaR model which has the correct unconditional coverage but incorrect conditional coverage.

(e) [20 marks] Discuss at least two approaches to VaR forecasting that deal with skewness and/or kurtosis of the conditional distribution of asset returns.

Question 3

Answer all five sub-questions.

(a) [20 marks] What is the definition of market efficiency for a fixed horizon? Is it possible to have deviations from efficiency in a market that is efficient? Explain.

(b) [20 marks] Describe collective data snooping and individual data snooping in your own words, and briefly discuss the differences between them.

(c) [20 marks] Forecast optimality is judged by comparing properties of a given forecast with those that we know are true. An optimal forecast generates forecast errors which, given a loss function, must obey some properties. Under a mean-square-error loss function, what three properties must the optimal forecast error e_{t+h|t} = Y_{t+h} - Ŷ_{t+h|t} for a horizon h possess?

For the remaining two sub-questions, consider a forecast Ŷ_{t+1|t} of a variable Y_{t+1}. You have 100 observations of Ŷ_{t+1|t} and Y_{t+1}, and decide to run the following regression:

Y_{t+1} = α + β Ŷ_{t+1|t} + ε_{t+1}

The results you obtain are given in Table I:

       Estimate    Std Error
α      -0.0081     0.0052
β       1.6135     0.2399

Table I. Regression results
(d) [20 marks] What null hypothesis should we set up in order to test for forecast optimality? Can this test be conducted with the information given?

(e) [20 marks] Explain what can be inferred from Table I.

Question 4

The probability density function of the normal distribution is given by

f(x) = (1 / √(2πσ²)) exp(-(x - μ)² / (2σ²)),

where μ is the mean and σ² is the variance of the distribution.

(a) [20 marks] Assuming that μ = 0, derive the maximum likelihood estimate of σ² given the sample of i.i.d. data (x_1, x_2, ..., x_T).

(b) [20 marks] Now assume that x_t is conditionally normally distributed as N(0, σ_t²), where

σ_t² = ω + β σ_{t-1}² + α x_{t-1}²

Write down the likelihood function for this model given a sample of data (x_1, x_2, ..., x_T).

(c) [15 marks] Describe how we can obtain estimates for {ω, α, β} for the GARCH(1,1) model and discuss estimation difficulties.

(d) [20 marks] Describe in your own words what graphical method and formal tests you can use to detect volatility clustering.

(e) [25 marks] Describe the RiskMetrics exponential smoother model for multivariate volatility, and discuss the pros and cons of the constant conditional correlation model of Bollerslev (1990) versus the RiskMetrics approach.
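For Question 4(b)-(c), the log-likelihood follows from the conditional normality assumption and is maximised numerically. Below is a minimal sketch in R (the catalogue tags this assignment as Matlab; the same recursion carries over directly), assuming σ_1² is initialised at the sample variance and using simulated placeholder data; both are illustrative assumptions rather than part of the original question.

```r
# Gaussian log-likelihood for a GARCH(1,1): sigma2_t = omega + beta*sigma2_{t-1} + alpha*x_{t-1}^2
garch11_loglik <- function(par, x) {
  omega <- par[1]; alpha <- par[2]; beta <- par[3]
  n <- length(x)
  sigma2 <- numeric(n)
  sigma2[1] <- var(x)                     # illustrative initialisation of sigma_1^2
  for (t in 2:n) {
    sigma2[t] <- omega + beta * sigma2[t - 1] + alpha * x[t - 1]^2
  }
  # log-likelihood of x_t ~ N(0, sigma2_t)
  sum(-0.5 * (log(2 * pi) + log(sigma2) + x^2 / sigma2))
}

set.seed(1)
x <- rnorm(1000) * 0.01                   # placeholder data; replace with real returns
fit <- optim(c(omega = 1e-6, alpha = 0.05, beta = 0.90),
             function(p) -garch11_loglik(p, x),
             method = "L-BFGS-B",
             lower = c(1e-8, 0, 0), upper = c(1, 1, 1))
fit$par
```

The constraints ω > 0, α ≥ 0, β ≥ 0 and α + β < 1 (for covariance stationarity) are part of what makes the numerical maximisation delicate, which is the kind of estimation difficulty part (c) asks you to discuss.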


[SOLVED] INMR95 Business Data Analytics 2024/25 Matlab

Module Code and Title: INMR95 Business Data Analytics
Academic Year: 2024/25
Type of Assessment: Individual Data Analytics Report
Weighting of Assessment: 100%
Individual or Group Assessment: ☒ Individual   ☐ Group
Module Convenor / Office Hours / Opportunities for advice and feedback: Workshops are dedicated to answering all questions related to the course, e.g. worksheet exercises, R workshops, understanding the material, and understanding the assignment. Contact the module convenor if there are any issues that cannot be solved in the workshops.

1. Submission details

Submission deadline: 31/01/2025
Submission point: ☒ Blackboard   ☒ Turnitin   ☐ Other
Item(s) to be submitted: Compulsory: one document with three chapters (as described in the assignment description document found on Blackboard). Optional: CSV files and R scripts. NOTE: if you are not submitting your CSV files and R scripts, then add those to the appendix of the document.
File type: ☒ PDF   ☒ Word   ☐ PPT   ☐ Excel   ☐ Video   ☒ Other: .R and .CSV files are optional
Formatting guidelines: Harvard-style formatting
Structure (e.g. required sub-sections): Three chapters are expected, as described in the assignment description document.
Size of assessment (word limit or length) and penalty applied: 4,000 words. Standard university penalties apply.
Referencing style: ☒ Harvard   ☐ Other

2. What is the purpose of this assessment?

The following table shows which of the module learning outcomes are being assessed in this assignment. Use this table to help you see the connection between this assessment and your learning on the module.

Module learning outcomes to be assessed: The general aim of this coursework is for students to apply descriptive, predictive, and prescriptive analytics (supported by visualisation techniques) in order to explore data, as well as develop models, which ultimately contribute to data-driven decision-making for two different problem domains and data sets (LO1, LO2, LO4). Students are expected to document and present their findings in a 4,000-word report that is worth 100% of their grade. All data, as well as the saved R scripts that highlight data manipulation and management, must be stored and submitted along with the documentation (LO3). In more detail, the documentation should be split into three sections:
Section one requires the use of inferential statistics and dimension reduction techniques in order to extract components from a survey and use the component scores in several analyses (LO1). Students are expected to critically analyse their results, manage and manipulate the data (LO3), as well as illustrate their findings using visualisation techniques (LO2).
Section two requires students to train at least two types of machine learning algorithms (or regression models) in order to support data-driven decision making (LO4).
Section three requires students to report on the results of sections one and two in a document that presents the findings for a layman audience, with explicit recommendations based on the analyses (LO4). This section needs to be particularly rich in visualisation (LO2).

3. What is the task for this assessment?

Task (attach an assignment brief if required): See assignment details document

4. What is required of me in this assessment?
Guidelines/details of how to prepare your submission: See assignment details document
Expectations for group work (if applicable):
Self-regulation: Make sure that you include all your R script; keep a log of what you're doing so you don't have to repeat yourself. Make sure to add comments in order to highlight what you are doing in each step.
Three key pieces of advice based on the feedback given to the previous cohort who completed this assignment:
1. Do not miss seminars and workshops.
2. Ask the instructors for help if you get stuck.
3. Do not be put off by the steep learning curve of R. It gets easier as you work through the workshops.
Formative assessment opportunities/activities: Online quizzes, seminars, and workshops. These are NOT marked and you can repeat them as many times as you like.

5. What resources might I use to prepare my work?

Data analytics is an extremely popular field of study, so you will have no problem finding information from various sources. The lecture slides, worksheets, R workshops, videos, and audio information that I provide should be your starting point. From there I recommend you look at books (e.g. Field et al., 2012, which is the other recommended textbook), online tutorials (particularly for examples in R), and journal and conference papers (the last two are particularly important in order to understand how we present results of the analysis).


[SOLVED] GEOLOGY 1104 Geology for Engineers I R

GEOLOGY 1104 Geology for Engineers I

Part A: 6 shorter answer questions (5 marks each)

1. On the blank plot below, draw P wave and S wave velocity vs depth for an Earth in which the outer core is solid rather than liquid. Assume the composition of the outer core and inner core are the same.

2. The graph to the right illustrates how the temperature changed with time for part of the rock cycle. In just one or two sentences maximum, explain the events illustrated by this temperature-time plot.

3. Briefly explain how the internal structure of the silicate minerals biotite and quartz determines the presence or absence of cleavage in those minerals.

4. Venn Diagram: Convergent Plate Boundaries. Use the Venn diagram below to compare and contrast the similarities and differences between the three styles of convergent plate boundaries. Write features unique to each group in the larger areas of the circles; note features that the different boundaries share in the overlapping areas of the circles. Place the numbers corresponding to the list of characteristics below in the most suitable locations on the diagram.
1. Present on Earth
2. Formed from magma
3. Form exclusively at or near the surface
4. May form from the mantle
5. Example: Gneiss
6. Classification of this/these rock type(s) considers texture
7. Form beds and bedding
8. Can be deposited in running water
9. Example: granite
10. May undergo weathering at Earth's surface to form sediment
11. Form as a result of crystallization
12. Solidify under conditions of decreasing heat
13. Classification of this/these rock type(s) considers composition
14. May be intrusive
15. Can display a fabric, or foliation

5. Column 1 provides you with the names of four different geologic features. In columns 2 and 3, circle all the answers that apply to that feature.
Column 1: Name of Feature. Column 2: Type of Plate Boundary or Other Feature (circle just one best answer). Column 3: Likely Cause(s) of Melting (circle all that may apply).
Column 3 options are the same for every feature: (a) decompression melting as the mantle rises, (b) melting by adding water along a subduction zone, (c) melting of continental crust caused by an influx of mantle-derived magma.
Rift valley - Column 2: (a) continental rift, (b) ocean-continent convergent, (c) continental collision, (d) hotspot in a continent
Mid-ocean ridge - Column 2: (a) oceanic divergent, (b) ocean-ocean convergent, (c) ocean-continent convergent, (d) hotspot in an ocean
Continental magmatic arc - Column 2: (a) continental rift, (b) ocean-continent convergent, (c) continental collision, (d) hotspot in a continent
Island arc - Column 2: (a) oceanic divergent, (b) ocean-ocean convergent, (c) ocean-continent convergent, (d) hotspot in an ocean

6. What happens during lithification, and how does lithification differ from metamorphism?

Part B. Answer TWO of the following THREE longer answer questions in the pages provided. Where possible, drawings are encouraged. 10 marks each.

1. The crust/mantle boundary is very different from the lithosphere/asthenosphere boundary.
Answer the following three parts for each of these two boundaries. Plan on about 1-1.5 pages for your final answer. Dot-point answers are fine. Sketches encouraged.
A) Does the composition or rock type vary across each of the two boundaries? Explain your answer.
B) Do physical properties vary across each of the two boundaries? Explain your answer.
C) Does either of the two boundaries have anything to do with plate movement? If yes, what?

2. Question about the generation of magma in the mantle. Fill in the short answers in column two for each question. Numbers are provided to indicate how many answers are required for each question part, (a) through (e). All answers are to go in the boxes provided. Note that this question continues on the next page.
a. Three reasons why mantle rock ever (partially) melts (just a few sentences for each): 1. 2. 3.
b. What type of rock melts in the mantle (felsic, mafic or ultramafic)?
c. What type of magma is formed from that event?
d. What igneous rocks (by specific name) are formed when that magma cools? 1. 2.
e. What are three typical minerals in these types of rocks from part d? 1. 2. 3.

Question 3. All answers for this question are done on this page or on the map on the following page.
Winter is coming. Stannis Baratheon recognises that a shortage of obsidian puts Westeros at great risk from the Others. In the quest for further resources, he has sent his men to gather earthquake data and bathymetric surveys of the oceans, and has also plotted a number of volcanic islands in their search for obsidian (dragonstone). The map on the next page is the result of their efforts. The grey labelled areas are continental masses. The white is ocean. The solid black lines are plate boundaries of some type. The small land bits in the southwest corner of the map, on the Dorne Plate, are volcanic islands with ages labelled. Your tasks are as follows:
a) On the map, some plate boundaries (the solid lines) have been labelled A-D. In the spaces below, name each plate boundary type. Be specific about the types of plates involved. For example, don't just write 'subduction zone'; write 'ocean plate-ocean plate subduction zone'.
b) Of the four boundaries labelled A, B, C and D, at which boundary(ies) would you expect to find volcanic activity?
c) If you thought any of the labelled boundaries A, B, C or D was a subduction zone boundary, indicate on the map on which side you would expect to find volcanoes. Mark the letter V on that side.
d) Based on the age of the volcanic islands in the southwestern corner, draw an arrow on the map that indicates the direction of movement of the Dorne Plate.


[SOLVED] IMAT 3712 Human Computer Interaction 2024/25 SQL

Faculty of Computing, Engineering & Media (CEM) Coursework Brief 2024/25

Module name: Human Computer Interaction
Module code: IMAT 3712
Title of the Assessment: Usability Evaluation of an Interactive System
This coursework item is: Summative
This summative coursework will be marked anonymously: No
The learning outcomes that are assessed by this coursework are:
1. Be able to apply key general principles of usability, and a comprehensive understanding of different aspects of user experience, both to guide effective design and to evaluate existing systems.
5. Be able to undertake a sophisticated analysis and appraisal of the suitability of a range of different techniques for evaluating the usability of interactive systems for particular systems, situations and purposes, and apply the evaluation techniques to produce usability evaluations.
This coursework is: Individual
This coursework constitutes 40% of the overall module mark.
Date set: Tuesday 10 December 2024
Date & time due (the deadline): Friday 31 January 2025 at 12.00 noon
In accordance with the University Assessment and Feedback Policy, your marked coursework and feedback will be available to you on: You should normally receive feedback on your coursework no later than 15 University working days after the formal hand-in date, provided that you have met the submission deadline. If for any reason this is not forthcoming by the due date, your module leader will let you know why and when it can be expected. The Associate Professor Student Experience ([email protected]) should be informed of any issues relating to the return of marked coursework and feedback.
When completed you are required to submit your coursework via: 1. Turnitin via LearningZone
Late submission of coursework policy: Late submissions will be processed in accordance with current University regulations. Please check the regulations carefully to determine what late submission period is allowed for your programme.
Academic Offences and Bad Academic Practices: Please ensure you read the section entitled "Academic Offences and Bad Academic Practice" in the module handbook or the relevant sections in this link: BaseCamp Link: Overview: Assessment and Good Academic Practices

Tasks to be undertaken:
1. Choose an interactive system to study, and (if appropriate) decide on the subset of functionality to consider
2. Select an appropriate systematic usability evaluation technique, and define a procedure for carrying out a usability evaluation
3. Apply the procedure to do a systematic usability evaluation
4. Assess the usability of the interactive system using the results of applying the systematic usability evaluation procedure
5. Write a report documenting all this
6. Prepare a presentation
7. Deliver the presentation to the marker and answer questions about the work

Deliverables to be submitted for assessment:
1. Report
2. Presentation visual aids (normally PowerPoint)


[SOLVED] GGR305 Biogeography Research Question Assignment Python

GGR305 Biogeography Research Question Assignment

Worth 5% of your final grade, this assignment is due Friday, Jan 31 by 11:59 pm. Late submissions (after Jan 31) will incur a 10% penalty per day (including weekends) for a maximum of five days. Assignments submitted five calendar days beyond the due date will be assigned a grade of zero.

Purpose:
1. Work through the formulation of a term paper question.
2. Familiarize yourself with a search strategy to effectively research your topic.
3. Identify and review relevant scholarly articles.

The scope of the term paper was discussed more thoroughly in class.

Topic options:
1. Focus on one taxon (species, genus, family, etc.) investigating how it has changed over time: what physical and biological limiting factors impact its distribution throughout its biological history; how has the distribution changed over time; and what are the hypotheses explaining these changes? For example: how has the distribution of the horse changed over time?
2. Focus on a specific location investigating what plants, mammals, birds, invertebrates, or other taxonomic group are currently present: how has this changed over a given time period (e.g. since the last glaciation); what are the physical conditions of this location; how have these conditions changed over time; and what is the impact of these changes on the taxa of interest? For example: how has the vegetation of Southern Ontario changed since the last glaciation and have these changes impacted taxonomic diversity?
3. Focus on humans' impact on a given taxon or on the diversity in a particular location. For example: how have humans impacted the biota of New Zealand since the first Polynesians arrived?
4. Another approach of your choosing (prior approval suggested).

Basic requirements:
1. Your assignment should be typed, in 12 pt Times New Roman, double-spaced.
2. Make sure your name and student number are on your assignment.
3. You do not need a cover page.
4. The assignment must be submitted as a PDF file.

Tasks:
1. Provide the research question you plan on addressing in your term paper. This research question is the foundation for your paper and must be sufficiently clear to ensure appropriate focus for your research. It should be stated in one sentence. (10 pts)
•  See the section above for suggestions about appropriate topics.
•  You can simply write: "My research question is: … ."
2. After using your key words/phrases to search scholarly article databases, provide the full reference for at least five (5) articles relevant to your research question. Please follow APA style. (10 pts)
•  See here for the appropriate style. (You only need to follow APA style for references, not for the rest of your assignment format.)
3. Choose one of the five articles you identified and answer the following questions. (80 pts)
1) What is the research question(s) that this paper is trying to answer? (10 pts)
2) How is the research question addressed? In three to five sentences, please identify the type of data used (field observations, samples of material, computer-simulated data, or something else) and how the data were analyzed. In other words, what methods were used in the paper? (15 pts)
3) In three to five sentences, please explain why you think this paper was published (addressing a basic gap in knowledge, testing a new hypothesis, applying an existing hypothesis or theory in a new situation, or something else).
This information may be stated in the introduction of the paper and/or be your interpretation of the paper's contribution. (15 pts)
4) In one paragraph, describe the major findings or conclusions of the paper. (20 pts)
5) In one paragraph, explain how this paper is relevant to the question you are researching for your term paper. You must justify your choice/explanation by providing a logical rationale. (20 pts)


[SOLVED] INMR95 Assessment Details Python

INMR95 Assessment Details

Section 1 – Descriptive Analytics

Introduction

Case study: Student satisfaction is a KPI for most, if not all, higher education institutes. There is a range of reasons why students may or may not be satisfied with their courses. The Turkiye Student Evaluation Data Set gives us a small insight into the complexities that drive student experience. You have been hired by a company called HigherEdCo Ltd. as a higher education consultant to perform several multivariate analyses that will indicate the factors that impact student experience (according to the data collected). Your analysis should be approached critically, and variable as well as method selections should be justified. You MUST reduce the dimensionality of this dataset; a minimal sketch of one way to do this is given after the variable list below.

The dataset you will be working with is the Turkiye Student Evaluation Data Set (Gunduz & Fokoue, 2013). The dataset is made up of the following variables:

instr: Instructor's identifier; values taken from {1,2,3}
class: Course code (descriptor); values taken from {1-13}
repeat: Number of times the student is taking this course; values taken from {0,1,2,3,...}
attendance: Code of the level of attendance; values from {0, 1, 2, 3, 4}
difficulty: Level of difficulty of the course as perceived by the student; values taken from {1,2,3,4,5}
Q1: The semester course content, teaching method and evaluation system were provided at the start.
Q2: The course aims and objectives were clearly stated at the beginning of the period.
Q3: The course was worth the amount of credit assigned to it.
Q4: The course was taught according to the syllabus announced on the first day of class.
Q5: The class discussions, homework assignments, applications and studies were satisfactory.
Q6: The textbook and other course resources were sufficient and up to date.
Q7: The course allowed field work, applications, laboratory, discussion and other studies.
Q8: The quizzes, assignments, projects and exams contributed to helping the learning.
Q9: I greatly enjoyed the class and was eager to actively participate during the lectures.
Q10: My initial expectations about the course were met at the end of the period or year.
Q11: The course was relevant and beneficial to my professional development.
Q12: The course helped me look at life and the world with a new perspective.
Q13: The Instructor's knowledge was relevant and up to date.
Q14: The Instructor came prepared for classes.
Q15: The Instructor taught in accordance with the announced lesson plan.
Q16: The Instructor was committed to the course and was understandable.
Q17: The Instructor arrived on time for classes.
Q18: The Instructor has a smooth and easy to follow delivery/speech.
Q19: The Instructor made effective use of class hours.
Q20: The Instructor explained the course and was eager to be helpful to students.
Q21: The Instructor demonstrated a positive approach to students.
Q22: The Instructor was open and respectful of the views of students about the course.
Q23: The Instructor encouraged participation in the course.
Q24: The Instructor gave relevant homework assignments/projects, and helped/guided students.
Q25: The Instructor responded to questions about the course inside and outside of the course.
Q26: The Instructor's evaluation system (midterm and final questions, projects, assignments, etc.) effectively measured the course objectives.
Q27: The Instructor provided solutions to exams and discussed them with students.
Q28: The Instructor treated all students in a right and objective manner.
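One way to meet the dimensionality-reduction requirement is principal component analysis on the 28 survey items. The following is a minimal R sketch, assuming the data sit in a CSV named turkiye_evaluation.csv with the column names listed above; the file name, the choice of PCA (rather than, say, exploratory factor analysis), and the follow-up ANOVA are illustrative assumptions, not requirements of the brief.

```r
# Minimal dimension-reduction sketch for the Q1-Q28 survey items (illustrative only).
# Assumes a CSV with the columns listed above; the file name is a placeholder.
survey <- read.csv("turkiye_evaluation.csv")

items <- survey[, paste0("Q", 1:28)]        # the 28 Likert-scale questions

# Principal component analysis on the correlation matrix (scale. = TRUE)
pca <- prcomp(items, scale. = TRUE)
summary(pca)                                # proportion of variance per component
screeplot(pca, type = "lines")              # scree plot to decide how many components to keep

# Keep the first two component scores and attach them for later analyses
survey$comp1 <- pca$x[, 1]
survey$comp2 <- pca$x[, 2]

# Example follow-up analysis: do component scores differ by perceived difficulty?
summary(aov(comp1 ~ factor(difficulty), data = survey))

write.csv(survey, "exploratory.csv", row.names = FALSE)   # deliverable named in the brief
```

Whatever technique is used, the brief expects the choice (and decisions such as treating the Likert items as continuous) to be explicitly justified.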
It's up to you to choose your independent and dependent variable(s), as well as the tests you will run. However, everything you do needs to be justified, i.e., you need to explain why you chose to use that particular test, why you treated a certain variable as, e.g., categorical, and why you transformed variables (if applicable). In short, a good project will critically analyse the results obtained and identify its limitations. The more detailed and exhaustive your analysis, the more likely you are to score a high grade (see marking scheme). Make sure to include figures and tables to support your findings.

Expected Project Output

In the end you need to submit a section with the following headings:
1. Introduction (briefly, what your aim was for the analysis and your research question)
2. Process (what types of statistical testing you used to answer the research question and the rationale for using said methods)
3. Results (the results of all the analyses, including figures, tables, and test outputs)
The output needs to be written for an academic audience. You must also submit:
· Your new CSV file with any new variables you extracted/modified from the data. You must name this exploratory.csv
· Your R script that shows, with comments, step by step the process you took to analyse the data. You must name this exploratory.R
· Anything else that you feel is relevant is welcome (but not required)

Section 2 – Predictive Analytics

Case study: You have been hired as a consultant to provide data-driven recommendations to the marketing department of the German-Hellenic bank. The bank has supplied you with anonymised data (the data we will be using were supplied by Moro et al. (2014) and can be found on the UCI website). Here is a list of the variables:

Input variables:
# bank client data:
1 - age (numeric)
2 - job: type of job (categorical: 'admin.','blue-collar','entrepreneur','housemaid','management','retired','self-employed','services','student','technician','unemployed','unknown')
3 - marital: marital status (categorical: 'divorced','married','single','unknown'; note: 'divorced' means divorced or widowed)
4 - education (categorical: 'basic.4y','basic.6y','basic.9y','high.school','illiterate','professional.course','university.degree','unknown')
5 - default: has credit in default? (categorical: 'no','yes','unknown')
6 - balance: account balance
7 - housing: has housing loan? (categorical: 'no','yes','unknown')
8 - loan: has personal loan? (categorical: 'no','yes','unknown')
# related with the last contact of the current campaign:
9 - contact: contact communication type (categorical: 'cellular','telephone')
10 - month: last contact month of year (categorical: 'jan', 'feb', 'mar', ..., 'nov', 'dec')
11 - day_of_week: last contact day of the week (categorical: 'mon','tue','wed','thu','fri')
12 - duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y='no'). Yet the duration is not known before a call is performed. Also, after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to have a realistic predictive model.
# other attributes:
13 - campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
14 - pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)
15 - previous: number of contacts performed before this campaign and for this client (numeric)
16 - poutcome: outcome of the previous marketing campaign (categorical: 'failure','nonexistent','success')
# social and economic context attributes
Output variable (desired target):
17 - y: has the client subscribed a term deposit? (binary: 'yes','no')

Note that variable 12 should be discarded. The original data set has outcome variable 17 as the desired output; however, you do not necessarily need to focus on this variable. You are expected to explore other relationships in the data and present interesting findings to your client (hint: look at balance, for example). You should build multiple models for comparison, but present two final models on two different outcome variables. All actions taken need to be critically analysed and justified. The more detailed and exhaustive your analysis, the more likely you are to score a high grade (see marking scheme). Your outcome variables can be categorical, continuous, or a mix of both (i.e., one model as a classification model, one as a regression model). A minimal sketch of one possible classification model is given after the references below.

Expected Project Output

In the end you need to submit a section with the following headings:
1. Introduction (briefly, what your aim was for the analysis, along with your research question)
2. Process (what types of models you used to answer the research question and the rationale for using said modelling techniques)
3. Results (the results of the analyses and the models, including model performance and model comparisons)
All the output needs to be written for an academic audience. You must also submit:
· Your new CSV file with any new variables you extracted/modified from the data. You must name this analysis.csv
· Your R script that shows, with comments, step by step the process you took to analyse the data. You must name this analysis.R
· An R Shiny app or anything else you feel is relevant (optional)

Section 3 – Prescriptive Analytics

In this section you will form data-driven recommendations for your two clients using your findings from Section 1 and Section 2. You can expect your audience to be a layman audience with little to no understanding of statistics and modelling. Therefore, unlike the results in Sections 1 and 2, your report needs to be written in such a way that a layman audience can understand it. Ultimately you need to make a convincing argument that states how your client should proceed based on the results of your findings. You are expected to use a critical approach by using the results obtained to both generate recommendations and identify limitations. You can include an interactive Shiny R app, which is optional but will increase your likelihood of delivering a more robust solution.

Expected Project Output

In the end you need to submit a document with the following headings:
1. Client: HigherEdCo Ltd. - Executive summary
2. Aims and Objectives
3. Analysis
4. Recommendations
5. Limitations
1. Client: German-Hellenic Bank - Executive summary
2. Aims and Objectives
3. Analysis
4. Recommendations
5. Limitations

References

Gunduz, G. & Fokoue, E. (2013). UCI Machine Learning Repository [https://archive.ics.uci.edu].
Irvine, CA: University of California, School of Information and Computer Science.
Moro, S., Cortez, P. & Rita, P. (2014). A data-driven approach to predict the success of bank telemarketing. Decision Support Systems, 62, 22-31.
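As an illustration of the kind of model Section 2 asks for, here is a minimal R sketch of one classification model on the term-deposit outcome, assuming the data have been saved as bank.csv with the variables listed above. The file name, the 70/30 split, and the choice of logistic regression are assumptions for illustration only; the brief asks for at least two model types and two different outcome variables.

```r
# Illustrative classification sketch for Section 2 (not a required approach).
# Assumes a CSV named bank.csv with the variables listed above; duration is dropped as instructed.
bank <- read.csv("bank.csv", stringsAsFactors = TRUE)
bank$duration <- NULL                       # variable 12 must be discarded

set.seed(42)
train_idx <- sample(seq_len(nrow(bank)), size = 0.7 * nrow(bank))
train <- bank[train_idx, ]
test  <- bank[-train_idx, ]

# Logistic regression for the term-deposit outcome y ('yes'/'no')
model <- glm(y ~ ., data = train, family = binomial)
summary(model)

# Out-of-sample accuracy as a simple performance check
pred <- ifelse(predict(model, newdata = test, type = "response") > 0.5, "yes", "no")
mean(pred == test$y)

write.csv(bank, "analysis.csv", row.names = FALSE)   # deliverable named in the brief
```

A second model on a different outcome variable (for example, a regression on balance) would follow the same pattern, with the final models then compared on held-out performance as the brief requires.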


[SOLVED] MGTS7523 Assessment 3 2024 Semester 2 SPSS

MGTS7523 Assessment #3, 2024, Semester 2

Part 1: Building a predator-prey (seabirds - cats) stock and flow model

The aim of this part of the assignment is to test your ability to construct a system dynamics model using the Stella Architect software. You will be required to use this model to help understand the dynamic consequences of introducing a cat population to an existing seabird population. Furthermore, you will use this same model to incorporate a simple management strategy that examines the effect of a cat removal program. Finally, you will update an initial causal loop diagram (Figure 1) to create a more accurate representation of the system.

Background

The stock and flow model (SFM) that you will build using Stella Architect is a predator-prey model that has similarities to the Macquarie Island case study that was first introduced in week 1 of MGTS7523. Figure 1 shows a simplified causal loop diagram (CLD) for the system that you will be modelling. The basic story is that there is a population of seabirds (the prey) on an island. Initially there are no cats; however, their introduction is the catalyst for a predator-prey system between the birds and the cats.

Figure 1. Simple CLD for the predator-prey model

Stock and flow model development

In this section you are tasked with building a stock and flow model (SFM) using Stella Architect.

Model settings

Open Stella Architect and create a new (model) file. Open the model settings panel and ensure the following model settings are used:
· Start Time = 1, Stop Time = 3650
· Time Units = Days
· DT = ½ (0.5)
· Integration method = Euler
Copy and paste the model settings panel showing these settings below (Mac: use shift+command+4; PC: use screen print) - make sure that the Start Time, Stop Time, DT, Time Units, and Integration Method are clearly visible. [2 marks]

The seabird population model

The first task is to build an SFM for a seabird population. Your model will have the following stocks:
· 'SEABIRD EGGS' [birds] - the number of eggs
· 'SEABIRDS' [birds] - the number of seabirds
For simplicity, assume that the units for SEABIRD EGGS and SEABIRDS are the same (birds). The basic structure of the SFM that you need to build is shown in Figure 2.

Figure 2. The basic structure for the seabird model

Initial conditions

The two stocks need initial conditions. Rather than insert these directly into the stocks, we will use converters to store them - this approach allows easy access to these variables if we develop a user interface. It also addresses the modelling principle that "there should be no hidden variables". Add two new converters to be used to store the initial conditions for the two stocks of the seabird model, naming them as follows:
· 'Initial seabird eggs' = 50000 [birds]
· 'Initial seabirds' = 4700 [birds]
Make sure that the units of the stocks are the same as these 'initial value' converters.
In this model, you are required to colour-code some of your converters. This provides a visual aid for rapidly identifying different types of converters in an SFM. For converters that store initial conditions, we will set the fill colour to green.
· Set the fill colour for these two new converters ('Initial seabird eggs', 'Initial seabirds') to green.

Controlling the flows

The following rate constants help determine how quickly (or slowly) seabirds are created (births) and die (eggs lost, bird deaths).
Introduce the following three converters to your model, naming them as follows:
· 'Birth rate eggs' = 0.027 [1/day] (i.e. for every 1000 birds, 27 eggs are laid each day)
· 'Loss rate of eggs' = 0.01 [1/day] (the proportional rate at which eggs fail to hatch)
· 'Seabird death rate' = 0.01 [1/day]
Now connect these three new converters into the appropriate flows in your model. To account for eggs hatching ('birds hatch'), use the following data to introduce one new converter ('Egg incubation period') and connect it to the basic model structure:
· 'Egg incubation period' = 35 [days] (the time it takes for an egg to hatch)
Set the units for each of these four new converters. Set the fill colour for each of these four converters to grey. Note that each of these four converters, which are used to control the flows in the current model, is used in combination with one of the stocks to determine the amount of flow. Based on this, add connections between the appropriate stock and flow so that the flows are correctly set up. Finally, to complete this part of the SFM, you need to set the equations for the four flows to accommodate these additions to the model (use Stella Architect's built-in unit checker to ensure that all variables have been properly specified).

Copy and paste your model structure into the box below. [2 marks]

If you run the model now you will get exponential growth - explain in the box below why this occurs. [2 marks]

Using a graph, show the behaviour over time for SEABIRDS that confirms that exponential growth is occurring. Copy and paste your graph into the space below (make sure that the x and y axes are visible). [3 marks]

Adding a carrying capacity effect

An issue with the seabird model is that it currently does not allow a shift in feedback dominance as the population increases (or declines). This effect can be introduced by bringing in a carrying capacity effect and adding an additional feedback loop.
· Introduce a new converter, name it 'Carrying capacity' and set its value = 50000. Set the fill colour for this converter to grey.
· Introduce an additional new converter and name it 'Seabirds:CC'.
'Seabirds:CC' is a variable that will compare the current seabird population with the carrying capacity (the maximum number of seabirds that the system can support). It does this by taking the ratio of seabirds to carrying capacity (SEABIRDS divided by the carrying capacity). Connect the SEABIRDS stock and the 'Carrying capacity' converter to 'Seabirds:CC'. Add the equation needed for this. Set the units for these two new converters (note that the units for 'Carrying capacity' are the same as for SEABIRDS).

A dimensionless multiplier

· Introduce another new converter and name it 'Effect of seabird population on seabird death rate'. Set the fill colour for this new converter to yellow.
Connect 'Seabirds:CC' to 'Effect of seabird population on seabird death rate'. The units for this converter will be the same as 'Seabirds:CC'. The 'Effect of seabird population on seabird death rate' will be a graphical function, so specify this converter as a graphical function by checking 'Graphical' in Stella Architect.
The input to the graphical function is the ratio of seabirds to the carrying capacity (Seabirds:CC); we are only concerned with the range over which this ratio is 0-1, so set the limits on the x-axis of the graphical function to 0-1. We are interested in converting this input into an effect on the death rate of the seabirds. At a low ratio value (i.e.
the input), there should be no effect on the seabird death rate, whilst at a high ratio value there should be an increasing effect on the seabird death rate (i.e. the death rate will start to increase as the input ratio approaches 1). To accommodate this, set the limits of the y-axis to 1-2. Select 'Points' and then set the number of data points = 11. Use the following information to populate the table for your graphical function:
When Seabirds:CC < 0.700, the output = 1.000
When Seabirds:CC = 0.700, the output = 1.069
When Seabirds:CC = 0.800, the output = 1.186
When Seabirds:CC = 0.900, the output = 1.452
When Seabirds:CC = 1.000, the output = 2.000

Copy and paste your graphical function (as a graph - click on 'Graph') below, ensuring that you include the x and y axes. [2 marks]

· Add another new converter and name it 'Seabird death rate adjusted for CC'. Delete the connection between 'Seabird death rate' and 'bird deaths' and add a new connection from 'Seabird death rate' to 'Seabird death rate adjusted for CC'. Now also connect 'Effect of seabird population on seabird death rate' to 'Seabird death rate adjusted for CC'. Set the equation for 'Seabird death rate adjusted for CC' so that its units are the same as for 'Seabird death rate'. Finally, add a new connector from 'Seabird death rate adjusted for CC' to 'bird deaths' (update your equation for 'bird deaths' to reflect this change).

Copy and paste your entire SFM below. [3 marks]

Run the model and paste the chart below showing SEABIRDS over the entire simulation run (make sure that the x and y axes are visible). [2 marks]

We now have a seabird population model that will grow until it reaches the carrying capacity of the system.

Adding a cat population model

To investigate the impact on the seabird population of introducing cats, we need to build a cat population model. The basic structure of the cat model that you need to use is shown in Figure 3.

Figure 3. The basic structure for the cat model

What type of flows are shown in Figure 3? Why are these types of flows used? [2 marks]

Build this basic structure in Stella Architect in the same file as your seabird population model (you will be connecting the two models).

Initial conditions

The stock needs an initial condition - add a new converter to set the following initial condition for CATS:
· 'Initial cats' = 0 [cats]. Set the fill colour for this new converter to green.
Add the following two converters - these will be the population rate constants for your cat model (set fill to grey):
· 'Cat birth rate' = 0.00857 [1/day]
· 'Cat death rate' = 0.006 [1/day]
Note that these two new rate constants are used in a linear first-order combination with CATS to control the inflow (cat births) and outflow (cat deaths). Update your SFM to reflect this. Set the equations for 'cat births' and 'cat deaths' to accommodate these additions to the model.

Run the model and create a new graph to display CATS. Copy and paste this graph below and include with your graph an explanation of why you observe the trend you see for CATS (make sure that the x and y axes of your graph are visible). [3 marks]

Introducing cats into the system

We are now going to set up the model so that cats are introduced to the system. The approach that we will use is to create a new flow that represents the introduction of cats.
· Add a new inflow (name it 'Introduction of cats') and connect it into the CATS stock.
We will use Stella Architect's built-in 'PULSE' function to inject some cats into the CATS stock.
To do this, add the following three new converters:
· 'cat introduced rate'
· 'cat introduction day' = 1000 [day]
· 'number of cats introduced' = 50 [cats]
Connect 'cat introduced rate' to the 'Introduction of cats' flow. Connect 'cat introduction day' and 'number of cats introduced' to 'cat introduced rate'. Set the fill colour for 'cat introduction day' and 'number of cats introduced' to grey.
Set the equation for 'cat introduced rate' = PULSE(number_of_cats_introduced, cat_introduction_day, 100000) [use Stella Architect's unit checker to recommend units]. Finally, set the equation for the 'Introduction of cats' flow.

Run the model and plot the 'Introduction of cats' flow on a graph. Copy and paste your graph below (make sure that the x and y axes are visible). [2 marks]
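As a cross-check on the Stella Architect model, the seabird part of the system can be written as difference equations and integrated with the same Euler method and DT. The R sketch below uses the parameter values given above; the exact flow formulations (eggs laid proportional to SEABIRDS, hatching as SEABIRD EGGS divided by the incubation period, and so on) are a reasonable reading of Figure 2 rather than equations stated in the brief.

```r
# Euler-integrated sketch of the seabird stock and flow model (illustrative only;
# the required deliverable is the Stella Architect model).
dt    <- 0.5
times <- seq(1, 3650, by = dt)

# Parameters from the brief
birth_rate_eggs    <- 0.027    # [1/day]
loss_rate_of_eggs  <- 0.01     # [1/day]
seabird_death_rate <- 0.01     # [1/day]
egg_incubation     <- 35       # [days]
carrying_capacity  <- 50000    # [birds]

# Graphical function: effect of Seabirds:CC on the seabird death rate
# (11 points from 0 to 1; output is 1 below 0.7, rising to 2 at 1.0)
effect_on_death_rate <- approxfun(x = seq(0, 1, by = 0.1),
                                  y = c(rep(1, 7), 1.069, 1.186, 1.452, 2.0),
                                  rule = 2)

eggs     <- numeric(length(times)); eggs[1]     <- 50000   # Initial seabird eggs
seabirds <- numeric(length(times)); seabirds[1] <- 4700    # Initial seabirds

for (i in seq_along(times)[-1]) {
  laid    <- birth_rate_eggs * seabirds[i - 1]              # eggs laid
  lost    <- loss_rate_of_eggs * eggs[i - 1]                # eggs lost
  hatched <- eggs[i - 1] / egg_incubation                   # birds hatch
  adj_death_rate <- seabird_death_rate *
    effect_on_death_rate(seabirds[i - 1] / carrying_capacity)
  deaths  <- adj_death_rate * seabirds[i - 1]               # bird deaths

  eggs[i]     <- eggs[i - 1]     + dt * (laid - lost - hatched)
  seabirds[i] <- seabirds[i - 1] + dt * (hatched - deaths)
}

plot(times, seabirds, type = "l", xlab = "Days", ylab = "SEABIRDS")
```

With these assumed formulations the population grows and then levels off near the carrying capacity, which matches the behaviour the brief describes after the carrying-capacity effect is added.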


[SOLVED] IMAT 3712 Human Computer Interaction Assignment Two

IMAT 3712 Human Computer Interaction
Assignment Two: Usability Evaluation of an Interactive System
Deadline: 12:00 Friday 31 January 2025 (Week 18)

Learning Outcomes

This assignment is designed to provide practical experience of carrying out an analysis of usability requirements and priorities, performing a systematic usability evaluation using a standard method, and producing a presentation reporting the findings. It assesses module learning outcomes 1 and 5.
1. Be able to apply key general principles of usability, and a comprehensive understanding of different aspects of user experience, both to guide effective design and to evaluate existing systems.
2. Be able to apply a user centred approach to the design of an interactive system, employing appropriate prototyping techniques.
3. Be able to specify requirements for and propose a suitable system design that aligns with the cognitive capabilities of its target human stakeholders and fits the needs of different users for different tasks and environments.
4. Be able to investigate and analyse ethical issues involved in the design or use of an interactive system, drawing on theoretical and practical knowledge of computer ethics.
5. Be able to undertake a sophisticated analysis and appraisal of the suitability of a range of different techniques for evaluating the usability of interactive systems for particular systems, situations and purposes, and apply the evaluation techniques to produce usability evaluations.

Submission and Marking Procedure

This is an individual assignment. The assignment is worth 40% of the total mark for the module. The mark for the assignment will be out of 100. The intention is to mark the assignment through a live presentation by each student, so the marking will not be anonymous. These presentations will be held as early as possible after the submission deadline for the assignment at the end of January. The presentation should be about 15 to 20 minutes presenting plus about 10 minutes of questions. We are currently planning to hold the presentations live in person but may decide to do them via MS Teams.

There will be a submission to hand in as a Word document via Turnitin. The submission should comprise the written documentation you should produce anyway as part of doing the work for the assignment. You need to produce clear and comprehensive documentation of your procedure, your data gathering and your findings. You will be expected to be able to discuss it.

This is an individual assignment, and we are expecting you to work alone (apart from test subjects if you do a user trial). However, some activities might be done better with more than one person doing them. You may recruit assistance, but if you have help, you need to describe this clearly in your documentation and presentation. If you do a user trial, you need to take your informed consent process seriously, including getting your subjects to sign a consent form.

Submission

The deadline for the submission of the documentation is 12:00 on Friday 31 January 2025. The submission will need to be submitted electronically to Turnitin via LearningZone. The target date for the completion of marking and the return of results is Monday 24 February 2025 (Week 22). You should also submit your PowerPoint presentation slides, or equivalent, via Turnitin. The deadline for submitting your PowerPoint presentation is Tuesday 4 February 2025 (Week 19). However, this is mainly for moderation purposes.
Writing

The assignment should be written entirely by you and should give a true reflection of your competence in English. It needs to be written in clear, comprehensible English; assignments in murky or unintelligible English with misused words will get fail marks. Getting any human or machine help with producing your assignments that you have not clearly and honestly acknowledged is serious academic misconduct; this will result in severe penalties that can include expulsion from the university. So is plagiarism: using text, ideas or information from other sources that you have not honestly, clearly and accurately cited. Copying other people's text by paraphrasing it sentence by sentence constitutes plagiarism; do not do this. Fabrication of results (claiming to have collected data or experimental results that you haven't collected, or that are different from what you collected) is also very serious academic misconduct.

If you need help with producing assignments written in good English, you can get help, but you need to (1) detail exactly what the help was and who or what provided it, and (2) provide copies of your original versions of texts, so we can evaluate exactly what is yours and what isn't, and arrive at fair marks. Using Grammarly in its standard mode (provided by the free version) to find and correct grammar mistakes is allowed, but using the Grammarly AI feature (in the paid version) to improve your work constitutes cheating. If you are in doubt about what to do, you should consult your tutor.

Task

Your firm of interaction design consultants is trying to build up a portfolio of impressive work, to enable it to pitch for business convincingly in the future. Your task is to produce a fully documented usability evaluation of an interactive system, plus a presentation of your results, by applying a systematic evaluation methodology. You have a completely free choice of what interactive system you evaluate.

The Usability Evaluation

Producing the usability evaluation will involve:
1. Choosing an interactive system to study. This needs to be a real, existing system. In your documentation you need to tell us where and how you got access to it.
2. Identifying the use cases or aspects of the functioning of the system to be considered, and briefly describing them in your documentation. (These don't need to be a complete set of use cases; for very complicated systems, focusing on one part of what they do is just fine. However, you should give a clear indication of what subset of the functionality of the system you are considering, and what you are not considering. If in doubt, cover less functionality in more detail.)
3. Choosing an evaluation methodology. You should apply a standard evaluation methodology such as user testing, cognitive walkthrough, or heuristic evaluation. (If you want to do something non-standard, ask advice from your tutor.)
4. Defining an evaluation procedure. This will include stating one or several user tasks to be tested or considered, with exact descriptions of the scenario and the goal the user is trying to achieve, as well as what the evaluator will do to collect results and produce an evaluation. The evaluation procedure needs to be described in full, separately from the description of the results.
5. Carrying out the evaluation. This will involve applying the procedure and documenting what happens, and what the procedure finds.
(If applying your procedure looks like an excessive amount of work, or like producing an excessively large volume of documentation, ask advice; we would prefer an evaluation giving detailed insight into part of the functionality to an evaluation with broad coverage but a thinner or more superficial analysis.)
6. Deriving findings about the usability of the interactive system from the results of the usability evaluation. This should include consideration of how strong and how general the conclusions are.

Guidance

You are expected to apply a systematic evaluation method. That is, you need to do a user trial, or a heuristic evaluation, or a cognitive walkthrough. (Make sure you know what a heuristic evaluation or a cognitive walkthrough actually is before claiming to be doing one.) If you want to use a different approach to doing a usability evaluation, ask advice. We are looking for thorough and detailed evaluations, and especially findings about exactly where there are actual or potential usability problems. If you think you are doing a disproportionate amount of work, or writing an enormous amount, then you should aim to be thorough and detailed, and compromise on how much of the system you cover.

Your evaluation procedure needs to be planned before you try to follow it. You need to (1) describe what your evaluation procedure is, separately from (2) describing what happened and what you found when you applied it, and (3) describing your evaluation of the usability of the system, drawing on what you found from your systematic evaluation procedure.

User Trials

If you do a user trial, you should aim to observe carefully what your subjects do and where they make mistakes or find things confusing and awkward, and report sources of problems as exactly as possible. Timings for tasks and subjective satisfaction ratings in debriefing are valuable but less interesting than actual usability problems. Remember that having carefully designed, realistic tasks is important, and that unless you want to look at exploration or browsing, the tasks should have clear end points and success criteria. Don't be over-directive: provide clear goals and enough information about the scenario, but don't tell people what to do. The exact wording of instructions matters, so the instructions need to be described exactly in your documentation and reported in your presentation.

If you do a user trial, you need to get your test subjects to consent to participate and sign a consent form. You will need to prepare a consent form for your subjects to sign; we suggest customizing the consent form template provided with the assignment. Please don't have subjects under 18 unless they are your family members, as this severely compromises our ethics approval. You need to say who your subjects were, how they were recruited, and where you carried out the user trial. This won't require reporting names unless we need to ask for them. What relevant education and experience people have had is likely to be more important than age or sex.

Heuristic Evaluations

If you do a heuristic evaluation, you should state what principles and/or guidelines the evaluation is considering. While using Jakob Nielsen's ten broad categories of usability problems (AKA his ten design principles) is okay, and popular, you shouldn't make the mistake of thinking that heuristic evaluation necessarily involves using them, or only them.
Heuristic evaluations are likely to be more successful when using more detailed and concrete sets of guidelines than just Nielsen's ten broad categories of usability problems. We recommend Nielsen's 113 design guidelines for web homepages for doing heuristic evaluations of websites. In a heuristic evaluation, aim to be as exact as possible about which heuristic is violated, and exactly where, and exactly how. Including severity ratings is good.

Cognitive Walkthroughs

For cognitive walkthroughs, you need to describe the procedure, including the cognitive walkthrough questions to be considered at each step, as there are different variants. It's a good idea to describe the happy path (or paths, if there's more than one way to do it) for successfully performing the task, as you need to specify this before you do the cognitive walkthrough. You need to show evidence that the questions have been systematically used in the evaluation. If the answer to a question is 'no problem', you don't need more than a tick or a 'yes' as documentation.

Choice of Interactive System

The assignment gives you a completely free choice of what interactive system you consider, but it needs to be a real, existing interactive system that you have access to and can study. You need to tell us how and where you got access to it. Possibilities include software applications such as programming language development environments or CASE tools or games or photo editing systems; e-commerce websites or museum websites or government websites; one of DMU's web-based systems for students or staff; electronic devices such as remote controls for televisions or DVD players, or digital cameras, or car radios; control panels for appliances such as microwave ovens or home heating systems; or a self-service system such as an automatic train ticket vending machine.

You may, if you wish, choose to evaluate two very similar and directly competing products, and assess ways in which one is superior to the other. This is often a very good way to produce a good assignment. It's perfectly okay to decide to evaluate a part of a big or complicated system, or to consider a limited set of use cases. When in doubt, go into more detail about less of the system.

The one piece of advice we can give is to choose something that is complicated or difficult to use, or is used to carry out complicated tasks, and preferably has obvious usability problems. Studying more complicated and less frequently used features of a system is likely to be more fruitful than focusing on the standard functions people use all the time. Think about use cases and scenarios, and pick ones that will make the tasks complicated. Standard features of highly optimised systems that large numbers of people use, like Amazon, don't make for interesting evaluations and won't give you much to say – please avoid them. You may choose to interpret 'interactive system' very broadly and present a usability evaluation of a static information display, but this would require a sophisticated and detailed analysis of how people use it for practical tasks, and these tasks would need to be complicated enough to give you something to analyse. Ask advice if you consider this.

Written Submission

Your report is primarily a collection of your documentation – what you should produce anyway in the course of carrying out your evaluation – with enough supporting information to understand it. This is not an essay: produce terse lists, not long paragraphs, and don't bother with non-essential introductions.
We want evidence that you have conducted a systematic application of a well-defined evaluation procedure and have documented the results thoroughly. Produce full documentation of your usability evaluation. This should include 1. A brief statement of what the interactive system is, including version number if applicable, and what it does. Do not write any unnecessary general introduction. Keep this to the minimum we need to understand what the assignment is about. You also need to state clearly how and where you got access to the system to evaluate it. 2. Brief accounts of the use cases considered, in just enough detail to make the rest of the documentation comprehensible, plus a statement of what you are not considering, if you are only looking at part of the system. 3. An exact description of the evaluation procedure to be followed, including what the methodology is, exact descriptions of user tasks being considered, exact wording of the instructions to be given to users in user testing, or the set of guidelines used in heuristic evaluation, or the exact questions being considered at each step in a cognitive walkthrough. 4. The results of the evaluation procedure, including notes made during observations of user trials, while conducting a heuristic evaluation, etc. 5. The findings of your evaluation about the usability of the interactive system. Include notes on how the findings relate to the results of the evaluation, and ideally about how strong the evidence is. 6. Your notes from your evaluation procedure should be included as an appendix. Handwritten notes should be scanned or photocopied or photographed. DO NOT write more than you need to. Do not bother with unnecessary introductions or generalities about usability or human computer interaction. This is just unwelcome extra work for both you and your tutor. Brief means brief. Terse is good. However you do need to be detailed and exact about your procedure and your results and findings. Include word counts for sections. 2-3000 words should be plenty, if you don’t waffle. Presentation The assignment will be assessed largely from your presentation, including how you answer questions and can support what you are saying from evidence in the documentation of your usability evaluation. However, we will look at how thorough, detailed, precise, and clear your methods and results are in your documentation. Your presentation should be 15 to 20 minutes long. It should · Briefly introduce the interactive system, and explain what aspects of the functionality are being covered by the evaluation. · Explain what methodology is being used and (if appropriate) how it has been customized for the needs of this particular evaluation. · Describe the evaluation procedure exactly. · Describe the application of the procedure and what results it produced. If there is too much to describe, it’s better to be thorough about some of the evaluation rather than more superficial about all of it. · Describe the findings about the usability of the system. We want to know exactly what you’ve done and what you’ve found, so don’t describe textbook knowledge or the structure of the assignment, and be precise, concrete and detailed when reporting your procedure and the findings of your evaluation. We expect a PowerPoint presentation – if you would prefer to present your work another way, please discuss this with your tutor. You may include a demonstration of the system in use to illustrate your points, but we don’t expect this. Pictures help. 
You should be prepared to point to and discuss the content of your hand-in, if and when asked about it during or after your presentation. We recommend planning your presentation, and if possible, getting a friend or two to give you feedback and advice on a live runthrough. At the least, having done a runthrough will give you confidence.      
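(No code or calculation examples are added for this brief; it is procedural guidance on conducting and presenting a usability evaluation rather than a computational task.)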


[SOLVED] Quadratic Functions Project Mathematics and the Arts SQL

Quadratic Functions Project: Mathematics and the Arts 1.) Find an example of the graph of a quadratic function in a work of art or architecture. Make a copy of the picture of the art/architecture. 2.) Draw a coordinate graph system over the picture of the work of art or architecture that you’ve chosen (you may need to enlarge the quadratic part of the artwork to draw a set of coordinate axes. If so, please include a copy of the original work of art or architecture as well). Mark the scale clearly. (You may do this with tracing paper, graph paper, or on the computer.) 3.) Find the coordinates of five points on your graph and use these five points to find the equation of your quadratic regression function. Show the work for finding your equation (a software-based sketch follows this brief). 4.) Find the coordinates of another point on your graph and check to make sure your model works for that point by substituting into your equation. Show this work too. 5.) Present your results in a well written report or neat, well organized poster. Your report/poster should include information about the actual size of your artwork as well as the scale that was used in your copy of the picture. Cite your sources. Other information: ● This project is to be completed independently. No two students can use the same piece of artwork/architecture. ● WARNING!!! Some pictures may appear to be parabolas but may not actually be real parabolas. If your artwork is not a true parabola, but is close, please make sure that you state that in your project and presentation. Discuss the amount of error from a true parabola. Use the following rubric as a “checklist” to help you as you complete your project. Please turn in this rubric on the day you present your project. It will be used to score your project. Rubric (partially illegible in the source; only the recoverable criteria are listed, 15 points total): a coordinate plane drawn over a copy of the artwork with an accurate scale showing the relationship between the picture size and the actual size of the artwork/architecture; a quadratic regression equation accurately found, with a clear explanation of the process used to obtain it; the equation checked against a further point, with the error from a true parabola discussed where relevant; a poster/report that includes the required information and cites all sources and other resources used. Total: 15 points.
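If software is permitted for step 3, the following minimal Python/numpy sketch shows one way to obtain the least-squares quadratic and check it against a sixth point. The coordinates below are invented purely for illustration; substitute the points you read off your own traced graph.

```python
# Minimal sketch (not part of the assignment brief): fit y = ax^2 + bx + c
# to five traced points with numpy, then check the model against a sixth point.
import numpy as np

# Hypothetical points traced from an arch-like curve (x, y) -- placeholders only
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([0.1, 2.9, 4.0, 3.1, 0.0])

a, b, c = np.polyfit(x, y, 2)          # least-squares quadratic regression
print(f"y = {a:.3f}x^2 + {b:.3f}x + {c:.3f}")

# Step 4: check the model against another traced point (also a placeholder)
x_check, y_check = 1.5, 1.8
y_model = a * x_check**2 + b * x_check + c
print(f"predicted {y_model:.2f} vs observed {y_check:.2f}")
```

If you are required to show the algebra by hand, the same fit can be set up as three (or five, with least squares) simultaneous equations in a, b and c; the code above is only a way to verify your manual working.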


[SOLVED] Computer Science CSC263H Homework Assignment 2 2024

Computer Science CSC263H Homework Assignment #2 September 11, 2024 Due: September 25, 2024, by 11:00 am • You must submit your assignment through the Crowdmark system. You will receive by email an invitation through which you can submit your work. If you haven’t used Crowdmark before, give yourself plenty of time to figure it out! • You must submit a separate PDF document for each question of the assignment. • To work with one or two partners, you and your partner(s) must form a group on Crowdmark (one submission only per group). We allow groups of up to three students. Submissions by groups of more than three students will not be graded. • The PDF file that you submit for each question must be typeset (not handwritten) and clearly legible. To this end, we encourage you to learn and use the LaTeX typesetting system, which is designed to produce high-quality documents that contain mathematical notation. You can use other typesetting systems if you prefer, but handwritten documents are not accepted. • If this assignment is submitted by a group of two or three students, for each assignment question the PDF file that you submit should contain: 1. The name(s) of the student(s) who wrote the solution to this question, and 2. The name(s) of the student(s) who read this solution to verify its clarity and correctness. • By virtue of submitting this assignment you (and your partners, if you have any) acknowledge that you are aware of the homework collaboration policy for this course, as stated here. • For any question, you may use data structures and algorithms previously described in class, or in prerequisites of this course, without describing them. You may also use any result that we covered in class (in lectures or tutorials) by referring to it. • Unless we explicitly state otherwise, you should justify your answers. Your paper will be marked based on the correctness and efficiency of your answers, and the clarity, precision, and conciseness of your presentation. • The total length of your PDF submission should be no more than 3 pages in an 11pt font. Question 1. (20 marks) Let H be a binomial heap that initially contains n keys (i.e., |H| = n). In this question, you will determine the “amortized” (i.e., average) cost of successively inserting k keys into H. a. Recall that α(n) is the number of 1’s in the binary representation of n. Prove that binomial heap H has exactly n − α(n) edges. Hint: See Appendix B.5 about trees in our CLRS textbook. b. We define the worst-case cost of inserting a new key into H to be the maximum number of pairwise comparisons between keys that is required to do this insertion. Consider the worst-case total cost of successively inserting k keys into H. It is clear that for k = 1 (i.e., inserting only one key) the worst-case cost is O(log2 n). Show that when k > log2 n, the average cost of an insertion, i.e., the worst-case total cost of the k successive insertions divided by k, is bounded above by a constant. Hint: Relate the cost of an insertion with the number of edges that it creates and use Part a. Question 2. (20 marks) Part I. In the following, H denotes a binomial max heap, n is the number of items in H, x is (a pointer to the node of) an item inside H, and k is a number (key). a. Describe a simple algorithm to increase the key of a given item x in a binomial max heap H to become k. Your algorithm should not change anything if k ≤ x.key. The worst-case running-time of your algorithm must be O(log n).
Give a high-level description of your algorithm in clear English. b. Using part (a), describe a simple algorithm to delete a given item x from a binomial max heap H. The worst-case running-time of your algorithm must be O(log n). Give a high-level description of your algorithm in clear English. Part II. Your task here is to design a data structure called Ultra-Heap that supports the following operations: • Insert(k): inserts the key k into the Ultra-Heap, • ExtractMax(): removes a max key from the Ultra-Heap, • ExtractMin(): removes a min key from the Ultra-Heap, • Merge(D,D′ ): merges Ultra-Heaps D and D′ into one Ultra-Heap. The worst-case running-time of each operation must be O(log n) where n is the total number of items. c. Describe your Ultra-Heap data structure in clear English. Your description should include any new information that you add to existing data structures that you use. d. Explain how you implement each operation of Ultra-Heap in clear English. Hint: Use binomial heaps and your solution to Part I. Question 3. (20 marks) Give a linear-time algorithm that determines if a Binary Search Tree (BST) is an AVL tree (i.e., whether it satisfies the balance property of an AVL tree). The algorithm’s input is a pointer u to the root of a BST T where each node v has the following fields: an integer key, and pointers parent, lchild and rchild to the parent, the left and right children of v in T (any unused pointer is set to Nil). The algorithm’s output should be True if T is an AVL tree, and False otherwise. The worst-case running time of your algorithm must be Θ(n) where n is the number of nodes in T. Describe your algorithm by giving its pseudo-code. Explain why its worst-case running time is Θ(n). Your algorithm will be graded by its correctness, running time, simplicity, and clarity. [The questions below will not be corrected/graded. They are given here as interesting problems that use material that you learned in class.] Question 4. (0 marks) In the following, B1 and B2 are two binary search trees such that every key in B1 is smaller than every key in B2. Describe an algorithm that, given pointers b1 and b2 to the roots of B1 and B2, merges B1 and B2 into a single binary search tree T. Your algorithm should satisfy the following two properties: 1. Its worst–case running time is O(min{h1, h2}), where h1 and h2 are the heights of B1 and B2. 2. The height of the merged tree T is at most max{h1, h2} + 1. Note that the heights h1 and h2 are not given to the algorithm (in other words, the algorithm does not “know” the heights of B1 and B2). Note also that B1, B2 and T are not required to be balanced. Describe your algorithm, and justify its correctness and worst-case running time, in clear and concise English. Hint: First derive an algorithm that runs in O(max{h1, h2}) time, and then optimize it. Question 5. (0 marks) A path between two nodes u, v in a Binary Search Tree (BST), is a sequence of distinct edges connecting a sequence of adjacent nodes in this tree, where the starting node in the sequence is u and the ending node is v; the length of a path is the number of edges in that path. Two distinct nodes u, v are said to adjacent if either u is the parent of v or v is the parent of u. For example, the figure below shows the path between 15 and 45 (length 3), the path between 7 and 20 (length 3), and the path between 47 and 50 (length 1) in a BST. 
In this question, you must derive an algorithm that, given any two keys in a BST, computes the length of the path between these two keys in the tree. To do so, solve the three subquestions outlined below. Henceforth assume that root is not nil and the BST rooted at root does not have duplicate keys. Moreover, each node u of the BST has the following fields: key(u), containing the key of the node, and lchild(u) and rchild(u), containing pointers to u’s left and right children respectively; note that node u does not have a pointer to its parent. For a key k in the BST, let node(k) be the BST node with key k. For each of the following subquestions, first describe your algorithm in clear and concise English, and then give the pseudocode. Then give a brief explanation of why your algorithm achieves the worst-case time complexity specified in that subquestion (where h is the height of the BST rooted at root). a. Give an efficient algorithm for the following procedure. PathLengthFromRoot(root, k): Given the root of a BST and a key k, return the length of the path between root and node(k). Assume that the key k is in the BST. For example, if root is the root of the BST in Figure 1, then PathLengthFromRoot(root, 15) should return 2, and PathLengthFromRoot(root, 47) should return 3. The worst-case time complexity of your algorithm should be O(h). b. Given the root of a BST and two distinct keys k, m present in the BST, define the FCP of k and m in the BST rooted at root to be the root of the subtree that is furthest away from root which contains both k and m. In other words, the FCP of k and m is a node parent such that: (a) the subtree rooted at parent has both the keys k and m in it, and (b) the length of the path between root and parent is the maximum among all such parents. Give an efficient algorithm for the following procedure. FCP(root, k, m): Given the root of a BST and two distinct keys k and m, return the FCP of k and m in the BST rooted at root. Assume that both k and m are present in the BST. For example, if root is the root of the BST in Figure 1, then FCP(root, 15, 45) should return the node with key 30, FCP(root, 7, 20) should return the node with key 10, and FCP(root, 50, 47) should return the node with key 50. The worst-case time complexity of your algorithm should be O(h). c. Give an efficient algorithm for the following procedure. PathLength(root, k, m): Given the root of a BST, and two distinct keys k and m, return the length of the path between node(k) and node(m). Assume that the keys k and m are present in the BST. For example, if root is the root of the BST in Figure 1, then PathLength(root, 15, 45) should return 3, and PathLength(root, 50, 47) should return 1. The worst-case time complexity of your algorithm should be O(h). Hint: Use the procedures from Parts a and b. (An illustrative sketch of the kind of O(h) walk meant in Part a follows below.)
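The sketch below is purely illustrative of the O(h) top-down walk that Part a describes; it is not a model answer, and it assumes node objects with key, lchild and rchild attributes matching the fields above. The FCP of Part b can be found by a similar walk that stops at the first node whose key lies between the two given keys.

```python
# Illustrative sketch only (not a graded solution): one way to realise
# PathLengthFromRoot(root, k) in O(h), assuming nodes expose .key, .lchild, .rchild.
def path_length_from_root(root, k):
    length = 0
    node = root
    while node is not None and node.key != k:
        # standard BST descent: go left for smaller keys, right for larger ones
        node = node.lchild if k < node.key else node.rchild
        length += 1
    return length   # the question guarantees k is present, so node.key == k here
```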


[SOLVED] ECE500/600 Vector Spaces Metric Spaces

ECE500/600 ENGINEERING ANALYTICAL TECHNIQUES Vector Spaces: Metric Spaces HW 25 0129 I. METRIC SPACES: QUESTION [Moon and Stirling, 2000] Let X be an arbitrary set. Show that the function defined by [formula not reproduced in this extract] is a metric. II. METRIC SPACES: QUESTION [Moon and Stirling, 2000] Let (X, d) be a metric space. Show that [formula not reproduced in this extract] is a metric on X. What significant feature does this metric possess? III. METRIC SPACES: QUESTION [Moon and Stirling, 2000] In defining the metric of the sequence space ℓ∞(0, ∞) as d∞(x, y) = sup_n |x(n) − y(n)|, “sup” is used instead of “max”. To see the necessity of this definition, define the sequences {x(n)} and {y(n)} by [definitions not reproduced in this extract]. Show that d∞(x, y) > |x(n) − y(n)| for all n ≥ 1. IV. METRIC SPACES: QUESTION [Moon and Stirling, 2000] Let B be the set defined in the original problem statement [not reproduced in this extract]. (a) Draw the set B. (b) Determine the boundary of B. (c) Determine the interior of B. V. METRIC SPACES: QUESTION [Moon and Stirling, 2000] The fact that a sequence is Cauchy depends upon the metric employed. Let fn(t) be the sequence of functions in the metric space (C[a, b], d∞), where [definition not reproduced in this extract]. Show that [limit not reproduced in this extract], and hence conclude that in this metric space, fn is not a Cauchy sequence.
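Because the sequence definitions for Question III were lost in extraction, here is one standard illustration (not necessarily the sequences intended in the problem) of why a supremum, rather than a maximum, is needed in d∞:

```latex
% Generic illustration only; these are NOT the sequences from Question III.
\[
  x(n) = 1 - \tfrac{1}{n}, \qquad y(n) = 0, \qquad n \ge 1 .
\]
% For every n we have |x(n) - y(n)| = 1 - 1/n < 1, yet
\[
  d_\infty(x, y) \;=\; \sup_{n \ge 1} \bigl| x(n) - y(n) \bigr| \;=\; 1 ,
\]
% so the supremum is never attained and a maximum over n does not exist.
```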


[SOLVED] PGD 2021 8200 Assignment 2 Individual Business Analysis C/C

PGD 2021 8200 Assignment 2: Individual Business Analysis Learning Outcomes Learning Outcome 3: Analyze business internal accounting information to evaluate the working capital management performance of a business Learning Outcome 4: From cost and revenue data, apply techniques in deciding upon alternative courses of action and implement budgets in decision making for an organization Assignment Instructions Make sure to read the marking rubric provided on Canvas to ensure that you have answered the questions according to the requirements of each task. Your mark will reflect how well you answered the questions. You are expected to research a range of sources and reference these correctly. You must use your own words to report your findings. Your report must be completed using professional business English report format and must answer all the questions listed below. The length of your report should be a MINIMUM of 2400 words LO 3 Analyze business internal accounting information to evaluate the working capital management performance of a business Task 1                   50 marks Jason Co is an online computer trader which made annual sales of $15,000,000 last year. The most recent financial statement indicates the company has $2,466,000 of trade receivables, $2,220,000 of trade payables and a $3,000,000 overdraft. The customers pay within 60 days on average. To encourage customers to pay earlier, the company decided that for a payment within 30 days, the customer will receive an early settlement discount of 1%. The finance department suggests that, under the new policy, only 20% of customers will carry on paying in 60 days, 30% of customers will pay after 45 days, and 50% of customers will take the early discount and pay in 30 days. The finance provider charges Jason Co 6% annually for the overdraft facility, and the new policy is also expected to reduce the cost of finance, assuming the interest rate remains constant. In terms of inventories, Jason Co places an order of 15,000 units with its supplier every month, which costs $150 per order. Last year, the annual cost of materials was $540,000 and the holding cost is $1.2 per unit per year. The supplier could now offer a 2% bulk discount for orders over 45,000 units and the finance department of Jason Co is required to investigate the proposal. Required: [i] Should Jason adopt the new credit period and early settlement policy? Calculate the net benefit and comment on your findings. Hint: provide your recommendation and evaluate its validity. [20 marks] [ii] Should Jason accept the bulk purchase discount offered by the supplier? Calculate the different costs of inventory (including cost of material, annual ordering cost and annual holding cost before and after taking the discount) and comment on your findings (see the illustrative cost sketch after this task). Hint: provide your recommendation and evaluate its validity. [15 marks] [iii] Provide Jason with any three policies aimed at efficiently managing amounts owed from credit sales. [15 marks] Please note: The lecturer may have an interview with you to ascertain your knowledge.
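As a rough sketch of the inventory-cost mechanics behind Task 1 [ii], the Python snippet below compares annual costs under the current ordering pattern, the classical EOQ, and the minimum bulk-discount quantity. It assumes annual demand of 180,000 units (12 orders of 15,000 units) and infers a unit price of $3 from last year's $540,000 material cost; both are assumptions you should state and justify yourself, and the recommendation is left to your own analysis.

```python
# Illustration of the mechanics only, not a model answer.
from math import sqrt

D = 15_000 * 12          # assumed annual demand in units (15,000 ordered per month)
S = 150                  # cost per order ($)
H = 1.2                  # holding cost per unit per year ($)
price = 540_000 / D      # unit price inferred from last year's material cost (assumption)

def annual_cost(order_qty, unit_price):
    material = D * unit_price
    ordering = (D / order_qty) * S
    holding = (order_qty / 2) * H        # average inventory taken as Q/2
    return material + ordering + holding

current  = annual_cost(15_000, price)          # current monthly ordering pattern
eoq      = sqrt(2 * D * S / H)                 # classical economic order quantity
discount = annual_cost(45_000, price * 0.98)   # minimum quantity qualifying for 2% discount

print(f"EOQ ~ {eoq:,.0f} units")
print(f"current policy : ${current:,.0f}")
print(f"bulk discount  : ${discount:,.0f}")
```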
Task 2 LO 4                 25 marks The summarised statement of financial position of Leila Ltd as at 31 May 2023 is as follows:

Current assets
  Bank                                     20,000
  Accounts receivable (AR)                200,000
  Inventory                                86,000
                                          306,000
Non-current assets (NCA) [net of depreciation]   154,000
Total assets                              460,000

Current liabilities
  Accounts payable (AP)                    72,000
  Accruals [wages]                          3,800
  Accruals [expenses]                       2,500
                                           78,300
Capital and reserves                      381,700
                                          460,000

Accounts payable represent purchases for May, and accounts receivable the sales for April and May at $100,000 per month. The directors are seeking finance from a bank and have produced the following profit forecast, but the bank, before deciding, has asked for a cash budget for the period showing the maximum anticipated finance needed from month to month. The profit forecast for the next six months is:

                                 Jun        Jul        Aug        Sep        Oct        Nov
Sales                        180,000    220,000    240,000    262,000    262,000    260,000
Gross profit                  45,000     55,000     60,000     65,500     65,500     65,000
Wages and salaries            20,000     18,000     24,000     27,000     32,000     24,000
Rent                           1,670      1,670      1,660      1,670      1,670      1,660
Other expenses                 8,000     10,000     12,000     12,000     10,000     15,000
Profit                        15,330     25,330     22,340     24,830     21,830     24,340
Stock requirement at month end  90,000   80,000    120,000    100,000    112,000    170,000

Further information is given below: 1. At each month-end, one-eighth of a month’s wages and salaries, and a quarter of other expenses, would be outstanding. 2. Rent at the rate of $20,000 per annum is payable quarterly in arrears on 31 August, 30 November, etc. 3. Assume that one month’s credit will be taken on purchases as previously, and that accounts receivable will continue to take two months’ credit. 4. New fixed assets (additional) will be delivered in June and must be paid for on 31 August; cost $200,000. 5. If the bank grants finance, it will continue an existing $50,000 overdraft facility, and give a five-year loan of a fixed amount as soon as necessary to maintain the overdraft within its limit for the whole period under review. You are required to: [a] prepare the cash budget for the period of June – November 2023. [15 marks] [b] prepare a summary statement of financial position as at 30 November 2023. [5 marks] [c] Calculate current ratios at the beginning and at the end of the period. Discuss if the change in these ratios could affect the firm’s ability to obtain short-term loans from the bank. [5 marks] Please note: The lecturer may have an interview with you to ascertain your knowledge. Task 3 LO 4 25 Marks Diogo Ltd makes a standard product, which is budgeted to sell at $17 a unit. It is made from a budgeted 0.5 kilograms of material, budgeted to cost $7 per kilogram, and worked on by an employee paid a budgeted $13 per hour, for a budgeted 15 minutes.
Monthly fixed overheads are budgeted at $18,000. The output for March was budgeted at 5,100 units. The actual results for March were as follows:

                                          $
Sales revenue (5,380 units)          79,500
Materials (2,840 kilograms)         (26,400)
Labour (1,300 hours)                (20,700)
Fixed overheads                     (19,100)
Actual operating profit              13,300

No inventories existed at the start or end of March. [i] Deduce the budgeted profit for March. [5 marks] [ii] Reconcile it with the actual profit (an illustrative calculation sketch follows this task). [12 marks] [iii] Analyse negative variances and explain the impact of your findings on Diogo’s decision making. [8 marks] Please note: The lecturer may have an interview with you to ascertain your knowledge.
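The snippet below sketches only the arithmetic behind parts [i] and [ii]: the budgeted profit from the standard cost card, plus two example variances. It is an illustration of the mechanics rather than a full reconciliation, and the interpretation required in part [iii] is left to you.

```python
# Sketch of the Task 3 mechanics only (not a full variance reconciliation).
units_budget, units_actual = 5_100, 5_380
price_budget = 17.0
mat_cost_budget = 0.5 * 7.0            # 0.5 kg at $7/kg
lab_cost_budget = (15 / 60) * 13.0     # 15 minutes at $13/hour
fixed_budget = 18_000

contribution = price_budget - mat_cost_budget - lab_cost_budget
budgeted_profit = units_budget * contribution - fixed_budget

# Two example variances drawn from the actual results (adverse values are negative)
sales_price_variance = 79_500 - units_actual * price_budget
fixed_overhead_variance = fixed_budget - 19_100

print(f"budgeted profit        : {budgeted_profit:,.0f}")
print(f"sales price variance   : {sales_price_variance:,.0f}")
print(f"fixed overhead variance: {fixed_overhead_variance:,.0f}")
```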


[SOLVED] 24LLP116 Digital Media Audiences and Markets

24LLP116 - Digital Media Audiences and Markets 100% Assignment; Word limit 3000 Feedback returned by: 28th Feb 2025, Friday Individual Report: Analysis of Audience Engagement in Digital Media Platforms Report Brief Organisations, groups, brands, and people (e.g., celebrities, influencers, politicians, etc.) utilise digital media to engage and interact with their audiences across multiple channels and using a variety of content types. In this assignment, you will write an analysis of audience engagement on a digital media platform. You will choose the platform that is the subject of the analysis in your report and ensure that your report describes and analyses the axes driving audience engagement on the selected platform, including: • The platform’s business or revenue model(s)—e.g. revenue streams, pricing strategy, etc. If you think it is relevant, particularly as it relates to other axes of analysis, you can include details of the historical evolution of the platform’s business model. • The ownership and governance model(s)—how does the ownership structure of the platform and its governance affect audience engagement and the user experience? The analysis of the platform’s governance can include any aspect of the platform’s governance that can impact user experience. Some examples include (but are not limited to) moderation policies, user data practices, decision-making structures related to the development of features, or any other governance choices that you feel are relevant to your chosen platform. • The affordances that the platform gives to the users, and how they affect and influence the audience’s behaviour on the platform. You should include 3 to 5 affordances or interface features and how they shape user behaviour. • The stakeholders participating in the platform and their incentives, as well as how these incentives shape the activities and affordances of the platform. • Interactions that are being measured and/or monitored on the platform. You should include at least 2 metrics / measures of interaction and how these measures influence audience behaviour on the platform. • The ethical and privacy considerations around the audience’s engagement on the chosen platform. You can include evaluations of benefits and/or harms/risks to users. Your analysis should maintain a critical synthesis and evaluation of the impact of the above considerations on audience behaviour and engagement on the chosen platform. Suggested Writing Structure Once you have gathered the necessary information, your analysis should include these general sections, but these may differ depending on your specific case: • Introduction and background o Identify and introduce the platform that is the subject of your analysis o Set the scene by providing background information, relevant facts, important issues, and an outline of the platform’s history and development. o Demonstrate that you have researched the relevant details of the platform. • Evaluation of the selected axes of analysis o Identify the various axes that you use in your analysis o Evaluate the platform in the context of each of these axes by discussing how they affect the platform, its stakeholders, and the behaviour and engagement of the audience. o Explain how these axes of analysis complement or conflict with each other and what positive or negative outcomes they have for the platform and its stakeholders, and in particular the users.
•    Recommendations o Provide at least 2 specific and realistic recommendations and ideas to improve any aspects of the platform and its audience’s engagement which you have identified as not working well. Explain why these recommendations were chosen, and describe any relevant limitations o Support these recommendations and ideas with solid evidence, such as: concepts from class (text readings, discussions, lectures); your own research from academic sources; personal experiences of platform users (anecdotes) •    Reference: A reference list in Harvard Referencing style. Please check the module page for resources related to the coursework and case studies—particularly the reading list—for further guidance. Marking Criteria (criterion and weighting) Report layout, style, structure, presentation, and clarity (10%) • Does the report follow the suggested structure? • Does the report include a cover/front page, title, table of contents, abstract, headings/subheadings, page numbers, body, references, and appendix (if necessary)? • Is the report logically organised? • Is the report legible and grammatically accurate? If empirical data is included, is it presented accurately and appropriately? Research of the analysis: relevance, consistency and evidence identified from the analysis, further reading and appropriate citations from relevant sources (20%) • Has the background of the platform been adequately described and researched? • Does the analysis report follow a consistent and coherent argument? • Do the references and citations cover the appropriate viewpoints in both breadth and depth? • Is evidence used accurately, critically, and effectively? • Are sources cited fully and correctly? Is a properly annotated reference list attached? Criticality of the evaluation: application and appropriate use of key theoretical frameworks and models to support the analysis (25%) • Is an appropriate range of reading and sources called upon? • Is a specific angle picked up and used to evaluate the case? What angle? • Does the report address various views in a comprehensive and critical fashion and organically merge them to form its own points? • Are the arguments strongly supported? Quality of the proposed recommendations (25%) • Have all the relevant components been identified? • Has the case report brought up a mechanism to accommodate the above components? • Does the case report cover potent, convincing, and logical points? Originality of insights and recommendations with justification (20%) • Does the report include original illustrations/examples? • Does the report include insights that are linked to broader industry trends and implications? • Is there a distinctive, innovative synthesis of material?


[SOLVED] IRDR0004 Coursework-2 GIS and RS Python

Module Assessment Guideline: IRDR0004 (Part-B) As part of the IRDR0004 module assessment (Part-B or Component 002), you are required to submit an individual technical report, which constitutes 50% of the overall module mark. This  coursework  will  reflect  the  skills  you  have  developed  through  the  lectures (teaching weeks 6-10, Term I), computer lab tutorials, and independent learning. You are strongly encouraged to dedicate sufficient time to practicing in the computer labs, attending office hours, and seeking guidance from the module team, which includes the module lead and postgraduate teaching assistants (PGTAs). The  lectures will  primarily  cover theoretical  concepts, while the  computer cluster sessions will provide practical experience in an interactive and engaging environment. To  enhance  your  understanding, you  should  complement  classroom  learning  by reading relevant literature, reviewing practice and module materials, and watching supplementary videos as part of your independent study. The assessment will test your core competencies in Geographic Information Systems (GIS) and remote sensing (RS) tools and techniques, requiring you to apply these skills effectively in your analysis and interpretation. For your project, adopt a clear, focused, and well-articulated approach. Define your project aim and objectives in a scientific and structured manner to ensure clarity and purpose in your work. Coursework-2 (GIS and RS) 1. Technical Report a)  Submission Format: i.     A 1,500-word individual technical report submitted as a single PDF file. ii.     The submission must include all raw files used for data analysis in a single  zipped  folder  (e.g.,  shapefiles,  masked  satellite  images, processed raster layers, scripts, and any other demographic or statistical data). b)  Weighting: 50% of the module mark c)  Submission Deadline: Wednesday, 05 February 2025 at 1:00 pm (UK time) You  are  encouraged  to  allocate  sufficient  time  for  downloading  data,  cleaning, preparing layers, and conducting analyses. Marks will be awarded based on the quality of work, strength of scientific arguments, demonstration of critical thinking, quantitative data analysis skills, technical writing proficiency, and clarity of presentation. 2. Topic The report will focus on a  comparative study, analysing the ‘before’ and ‘after’ scenarios  or  conducting  a  change  detection  analysis  over  a  reasonable  period. You are encouraged to select a  medium-sized area where significant land cover changes have occurred. Examples of suitable topics include , but are not limited to: a)  Deforestation  and  Land  Cover  Change  in  the  Amazon  Rainforest,  Brazil: Analyse the spatial and temporal  patterns of deforestation  in  the Amazon rainforest over the past decade and its impact on vegetation and biodiversity. b)  Wildfire-Induced  Land  Cover  Changes  in  California,  USA:  Investigate  the effects of recurring wildfires on vegetation, urban areas, and ecosystems in California, focusing on land cover transformations between two time periods. c)  Urban Expansion and Land Use Change in Nairobi, Kenya: Analyse the impact of rapid urbanisation on green spaces and agricultural land over a decade. d)  Glacier  Retreat in the Himalayas: Investigate changes in glacier extent and surrounding land cover in the Himalayas over a decade, focusing on the effects of climate change. 
e)  Desertification   in   the   Sahel   Region,   Africa:   Study   the   progression   of desertification and its impact on vegetation and agricultural productivity over a 10-year period. f)   Urban Expansion and Refugee Settlement in Juba, South Sudan: Examine how conflict-induced migration has affected urban expansion and natural resource depletion in Juba. 2.1. Example Used in Practical Labs: In our computer practical labs, we examined Land Cover Change Detection in the Kutupalong  Refugee  Camps  in  Cox’s  Bazar,  Bangladesh,  focusing  on  the transformation  of  land  cover  following  the   influx  of   nearly  a   million  displaced populations between 2017 and 2022. However, this specific example cannot be used for your assessment as it was covered during classroom instruction. 2.2. Steps to Follow i.     Select  a  Study  Area  and  Topic:  Choose  a  topic  related  to  disaster  risk reduction or humanitarian crises. Ensure the study area is manageable and relevant to the module’s themes. Also, ensure the chosen area aligns with a recognised administrative boundary, facilitating reproducibility and integration with official data sources. ii.     Download  Relevant  Data: Acquire  the  study area  boundary shapefile and appropriate Landsat 8 satellite images. iii.     Produce  Land Cover Maps: Create land cover maps for two distinct time periods, ensuring a gap of at least 5 to 10 years between them. iv.     Calculate  NDVI:  Perform a  Normalised Difference Vegetation  Index (NDVI) analysis for both time periods. v.     Calculate Land Surface Temperature (LST): Derive land surface temperature for the two time periods to observe changes. vi.     Incorporate Additional  Data:  Incorporate additional geospatial or statistical datasets from credible secondary sources for advanced analysis. vii.     Write the Technical Report: Prepare your report following this guideline. 3. Report Content The technical report should be creative, involve critical thinking, and reflect innovative ideas. A recommended structure is provided below: a)  Title Page:     Project title    Candidate number (but do not include your name anywhere, including in maps, diagrams, or illustrations)     Module details    Word count    Signed statement: “I declare the following work is my own and, where the work of others has been used, it has been clearly identified. ” b)  Abstract (150 words max):     Provide a concise summary of the report, including the key findings. c)  Introduction:    Clearly state the aim and objectives.     Include a brief, focused background and literature review if necessary. d)  Methods:     Detail the methodology, including data sources, datasets used, study area description and justification for selection, and analytical methods.     Ensure sufficient detail for reproducibility. e)  Results:     Present  key  findings  using  well-labelled  tables,  diagrams,  charts, illustrations and figures.     Use visuals to enhance understanding and highlight important patterns. f)   Discussion and Conclusion (Combined):     Interpret the results, linking them to your research aim and objectives.     Provide a critical analysis of the findings and discuss their implications. g)  References:     Ensure all sources cited in the report are included in the references section and formatted consistently according to the required style (e.g., Harvard or APA).     Keep the reference list concise (5-10 entries recommended) but ensure it is complete and includes all key sources used in the analysis. 
h)  Appendices:     Use a few appendices for additional material (if needed), ensuring the report can stand alone without them. 4. Report Format The report must be a maximum of 1,500 words (+10% allowance), excluding the following:     Title page     Declaration     Abstract     Captions, equations, tables, and figures     AI Statement     References     Appendices 4.1. Word Count Compliance:     Reports with fewer than  1,500 words may result in automatic failure of the coursework.     Reports exceeding 1,650 words will incur a penalty of up to 10% of the total marks.     Note: If the coursework is both over-/under-length and late, the greater of the penalties will be applied. 4.2. Formatting Requirements:     Page Size: A4     Margins: Normal     Orientation: Portrait     Font: Arial     Font Size: 12     Font Colour: Automatic     Line Spacing: 1.5     Paragraph Alignment: Align Left or Justify     Page Numbers: Bottom of the page, aligned to the right 4.3. Referencing and Citations:     Use APA or Harvard (latest edition) for referencing and citations.     Ensure consistency in citations and references throughout the report. 4.4. Plagiarism Policy:     Any submission with a Turnitin similarity score exceeding 10% will be flagged for investigation.     Cases of suspected plagiarism will follow the university’s official procedure. https://www.ucl.ac.uk/academic-manual/sites/academic-manual/files/student_academic_misconduct_adjudication_and_penalties.pdf 5. Report Requirements The report should demonstrate your ability to:     Construct a well-structured, organised, and clear report that adheres to academic standards.     Present valid arguments using scientific evidence, supporting your findings with credible sources.     Provide clear visual aids such as maps, figures, diagrams, and tables to enhance understanding and communicate effectively.     Understand the material you are presenting, showing mastery of the topic and the analytical techniques applied.     Demonstrate  technical  skills  in  handling  geospatial  and  statistical  data effectively, using appropriate tools and methods.     Apply innovative data analytical techniques to generate meaningful and well-articulated visuals that aid in interpreting the results.     Show originality in your ideas, crafting a scientifically valid and meaningful technical report that reflects critical thinking and analytical depth. 6. Generative AI (GenAI) Policy For this assessment,UCL Category 1applies, meaning that you are permitted to use Generative AI (GenAI) tools to assist with revising and preparing your work. However, the final submission must be entirely your own original work. It is your responsibility to ensure that any use of GenAI aligns with UCL’s academic integrity standards and that the content you submit reflects your understanding, effort, and critical thinking. You  are  not  allowed  to  use  GenAI  tools  to  create  figures,  diagrams,  maps,  or illustrations for this coursework. All visual content must be created using appropriate geospatial, statistical, or graphic software relevant to the module, such as QGIS, ArcGIS Pro, Python, or R. Visual outputs must be the product of your own analytical work and software skills. If you use GenAI tools during any stage of your work, you must include a statement under the heading “AI Usage Declaration” in your report. This statement should clearly describe how GenAI was used and demonstrate how your usage complies with UCL’s academic integrity guidelines. 
Failure to include this declaration may lead to your work being flagged for investigation. It is acceptable to use GenAI tools for tasks such as checking spelling, grammar, or adjusting the tone of your writing. However, it is essential that this usage does not alter the content or meaning of your work. You must fully understand and be able to explain all  aspects  of  your  submission,   including  your  analysis,   interpretations,  and conclusions. Misrepresentation of AI usage or over-reliance on AI-generated content that  compromises  originality   may   result   in   penalties   under   UCL’s   academic regulations. 7. Mark Scheme Report Structure and Writing Style. (5 Marks) •     Clear and coherent structure. •     Logical flow of paragraphs and sections. •     Diagrams and tables appropriately referenced in the text. •     Complete and well-organised reference section. •    Accurate and consistent citation of references throughout the report. •     Proper spelling, grammar, and punctuation. •     Fluidity and clarity of sentences. Figures and Tables (20 Marks) •     Use of original and innovative figures, tables or other types of illustrations. •     Relevance and effectiveness in supporting the report. •     High quality, clarity, and appropriateness of visual elements, including captions and legends. •     Cartographic elements meet professional standards. Content (25 Marks) •    Application of suitable geospatial and statistical techniques. •     Demonstration of scientific and technical competence. •    Ability to maintain a clear argument and effectively fulfil research objectives. •     Emphasis  on   methodology,  data  analysis,  generating   meaningful   results, unloading all sorts of raw data, and interpreting findings. •     Originality and selection of an attention-grabbing and suitable topic. •     Reliability of data sources, raw data and layers, and accuracy of results. Total = 50 Marks 8. Additional Instructions a)  Compliance with Instructions:     Marks will be deducted for failing to follow the instructions, including those regarding deadlines, report structure, font, word limits, format, and other specified requirements.     Missing  critical  instructions,  such  as  failing  to  meet  the  submission deadline, may result in automatic failure of the coursework. b)  Marking Process:    The coursework will be assessed by the module tutor and postgraduate teaching  assistants  (PGTAs)  through  a  first   and  second  marking process. c)  Updates to Instructions:     Instructions may be revised or updated as necessary. Always ensure you download the latest version of the document and read it thoroughly before submission. d)  Research Aim and Objectives:     Ensure you have access to all necessary datasets before formulating your research aim and objectives. e)  Data Usage:    You may use multiple datasets from single or multiple sources for your analysis.    Avoid relying on data from individuals or organisations that may not provide  it on time.  Use  publicly and freely available  secondary data sources to design your project efficiently. f)   Blind Marking:     Do not include your name in any part of your submission. Use only your candidate number to ensure blind marking. 8.1. Importance of Selecting an Appropriate Study Area for Landsat 8 Analysis Selecting an appropriate study area is critical for ensuring the effectiveness and accuracy of the analysis when using Landsat 8 images. 
Several factors must be considered to optimise the results and minimise potential challenges: a)  Resolution of Landsat 8 Images:     Landsat 8 provides moderate spatial resolution, with most bands at 30 metres per pixel. This resolution is well-suited for analysing medium- sized cities or urban areas, as it captures significant patterns without overwhelming computational resources.    Too  Small  Areas:   If  the  study  area   is  too  small   (e.g.,  a  single neighbourhood), the resolution maybe insufficient to capture meaningful variations or details, leading to poor results.    Too Large Areas: Very large areas may require multiple scenes to cover the extent, increasing the complexity of data handling and analysis. b)  Extent of Change:     Medium-sized cities or urban areas experiencing noticeable land cover changes over time (e.g., urban expansion, deforestation, or disaster impacts) are ideal. Areas with minimal changes may not provide enough variability to analyse effectively. c)  Availability of Imagery:     Landsat 8 provides consistent coverage since 11 February 2013, but availability of cloud-free scenes can vary, especially  in  regions with frequent cloud cover. Choosing a location with accessible, clear imagery reduces the risk of gaps in analysis. d)  Dealing with Multiple Scenes:     If the study area spans multiple Landsat scenes, you must ensure proper mosaicking (using the ‘Merge’ tool) and alignment. This can introduce complexity and potential errors in analysis, especially for beginners. e)  Seasonal Variations:     Land  cover  can  vary  significantly with  seasons,  especially  in  areas affected by agriculture, vegetation cycles, or snow. Selecting imagery from comparable seasons (e.g., summer-to-summer comparisons) is crucial to avoid seasonal bias in the results. f)   Data Processing Complexity:     Large or complex areas might require advanced techniques, such as atmospheric  correction  or  cloud  masking,  which  can  be  resource- intensive.  A  balanced  approach  ensures  that  you  can  focus  on meaningful analysis without being overwhelmed by data preparation. By considering these factors, you can ensure that your chosen study area aligns well with the capabilities of Landsat 8 and the objectives of their analysis, leading to more accurate and meaningful results.
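For step 2.2(iv) of this brief, the NDVI calculation itself is a short raster operation: NDVI = (NIR − Red) / (NIR + Red), which for Landsat 8 uses band 5 (NIR) and band 4 (red). The sketch below is one minimal way to do it in Python with rasterio, assuming the two band rasters have already been clipped to the study area and share the same grid; the file names are placeholders, and the QGIS or ArcGIS raster calculator achieves the same result if you prefer a GUI workflow.

```python
# Minimal NDVI sketch (illustration only); file names below are placeholders.
import numpy as np
import rasterio

with rasterio.open("LC08_B4_clipped.tif") as red_src, \
     rasterio.open("LC08_B5_clipped.tif") as nir_src:
    red = red_src.read(1).astype("float32")
    nir = nir_src.read(1).astype("float32")
    profile = red_src.profile            # reuse the georeferencing of the input

np.seterr(divide="ignore", invalid="ignore")   # silence warnings where red + nir == 0
ndvi = (nir - red) / (nir + red)

profile.update(dtype="float32", count=1)
with rasterio.open("ndvi.tif", "w", **profile) as dst:
    dst.write(ndvi, 1)
```

Repeating the same calculation for both time periods and differencing the two NDVI rasters is one simple way to visualise vegetation change; land surface temperature requires the thermal bands and a separate conversion, which is not shown here.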


[SOLVED] EARTH 2 Lab 2 Plate TectonicsR

EARTH 2 Lab  Lab 2: Plate Tectonics Purpose of the lab: • Walk in the intellectual footsteps of the scientists who first discovered plate tectonics • Learn about relative plate motions on the Earth • Appreciate various aspects of oceanic crustal formation at mid-ocean ridges • Consider the pattern of volcanoes and earthquakes at subduction zones • Use hot spot tracks to clock the speed of the plates PART 1: Apparent polar wander – using magnetic Inclination One of the first indications that the Earth’s surface was not static was the observation that the magnetic poles, as measured by the magnetic fields frozen into rocks, appeared to move. Of course, we now understand that the apparent motion is in fact due to the motions of the plates themselves. This question will illustrate how we can use these data to map the motion of the plates through time. A. What is the difference between magnetic inclination and magnetic declination? _________________________________________________________________________________________________________________________ _________________________________________________________________________________________________________________________ _____________________________________________________________________________________________________________________ [3] A simple equation relates the magnetic field inclination (I) we measure at the surface to the latitude (λ): tan I = 2 tan λ Which can be shown in graphical form. to the left. A. What latitude are we at here in Santa Barbara? _________________________________ [1] B. Using the equation or graph above, compute the inclination in SB today: _________________________˚ [2] C. If you were to make a new rock in Santa Barbara today, freeze in the inclination, and then transport that rock (keeping it perfectly horizontal so that the inclination doesn’t change!) down to Oaxaca, Mexico (Latitude = 17˚), by how much would the frozen-in magnetic inclination deviate from the local measured inclination? ___________________________________˚ [3] D. Consider the following section of layered rocks recorded on a small tectonic plate. Each layer has a measured magnetic inclination and age indicated. Assuming you know the path taken by the plate, use the inclination and age data to compute the paleolatitudes at each time – record your answers in the table provided and draw on the diagram when the plate was at various positions along its path. [6] Note “Ma” – literally “Mega annum” – just means “millions of years ago”. Also: while we have provided the plate path in the question above, in real paleomag. studies, researchers have to use a combination of magnetic declination and inclination to recreate the path. Age (Ma)             Latitude ( ˚ ) 0 2 5 10 20 PART 2: Magnetic stripes and seafloor spreading The observation and understanding of magnetic stripes on the seafloor were key to our understanding of seafloor spreading and the motion of the oceanic plates. These measurements provided irrefutable proof of plate tectonic theory. These magnetic isochrons (“equal-time” stripes) allow us to “unwind the tape” and reconstruct the history of plate motions. This phenomenon arises because of periodic reversals in polarity of the Earth’s magnetic field. A. The figure below shows several snapshots of the slow opening of an ocean as two continents separated over the last 4 million years. On the left you are given a key showing the (simplified) magnetic polarity history for the last 4 Ma. 
Using this key, shade in the seafloor at each age on the figure to indicate the polarity of seafloor in this ocean. You may assume constant spreading rate through time. [8] NOTE: new crust generated in one time step should show up in all subsequent time intervals… NOTE ALSO: that during several of these time periods there are polarity reversals!! B. If you are told that at the present day, the two continents are 240 km apart, what is the full spreading rate of this mid-ocean ridge, in km/million-years? ________________________________ km/Myr [2] C. What is the half-spreading rate (the rate at which either side moves away from the ridge)? ________________________________ km/Myr [1] D. What is the full spreading rate in units of mm/yr ? ________________________________ mm/yr [2] Different mid-ocean ridges spread at different rates. With increasing age, the oceanic plates progressively cool, get denser, and sit lower in the mantle (think about a fully loaded cargo ship versus an empty cargo ship). This results in a systematic, predictable relationship between plate age and ocean depth. The diagram below shows the relationship between ocean depth and distance from the mid-ocean spreading ridge for two different oceans; the Atlantic (blue) and the Pacific (red). E. What is the spreading rate for each ridge: Pacific seafloor spreading rate is ________________________________ km/Ma [3] Atlantic seafloor spreading rate is ________________________________ Km/Ma [3] F. Which ridge has a faster spreading rate? [1] ________________________________ G. By drawing smooth lines through the uneven profiles, estimate the depth of 20 million-year-old seafloor in each of these oceans: Pacific seafloor aged 20 Ma is ________________________________ m depth [3] Atlantic seafloor aged 20 Ma is ________________________________ m depth [3] H. How do these two estimated depths compare to each other? How do you explain this answer? _________________________________________________________________________________________________________________________ _________________________________________________________________________________________________________________________ _________________________________________________________________________________________________________________________ _____________________________________________________________________________________________________________________ [3] I. The inset shows a close up of the axial region for each ridge. The difference in spreading rate also affects the shape of the axial region. What is the difference between the axial region for slow versus fast spread crust? _______________________________________________________________________________________________________________________ _______________________________________________________________________________________________________________________ _______________________________________________________________________________________________________________________ ___________________________________________________________________________________________________________________ [3] PART 3: Sea Floor spreading – fracture zones and transform. faults The observation of large scars in the seafloor – fracture zones – was an important early indicator that the oceans were mobile. Later, these features were recognized as crucial to explaining how relatively linear plate boundaries snake over the surface of our curved planet. 
Tuzo Wilson showed how the pattern of ridges, transforms, and fracture zones all made sense through the theory of plate tectonics. Activity: Find the seafloor spreading model. Make the sea floor spread by pulling the wooden continents apart. Carefully observe the sea floor spreading centers, the active transform. faults and the fracture zones. (By definition, the fracture zones include both the active transform. faults and the inactive, fossil traces of transform. faults.) Questions: A. As sea floor spreading widens the ocean, do the spreading center segments change length? (circle one) Yes or No [1] B. As sea floor spreading widens the ocean, do the active transform. faults change length? (circle one) Yes or No [1] C. As sea floor spreading widens the ocean, do the fracture zones change length? (circle one) Yes or No [1] D. The figure to the right is a map of a spreading system between two hypothetical plates A and B, with three slightly offset spreading centers segments… - Label each active spreading center: SC - Label each active transform. fault: TF - Label each inactive fracture zone: FZ - Put arrows on the two sides of each active plate boundary segment showing the relative motion between the two sides (i.e., five pairs of arrows).  [8 total] E. The green star shows an earthquake. Would we expect this earthquake to involve plates: separating apart from each other or sliding horizontally past each other ? (select one) [1] F. Consider points 1 and 2 in the figure above, on opposite sides of the dotted line. Is the oceanic crust at these points: the same age or 1 older than 2 or 2 older than 1 ? (select one) [1] PART 4: Plate boundaries The most tectonically active parts of the planet are at the plate boundaries – the interfaces between one plate and another. The plates are – by definition – moving with respect to each other at these boundaries, which creates earthquakes, volcanoes, and dramatic topography. The following questions ask you to analyze carefully the figure of the South America-Pacific plate boundary on the next page. This figure contains a lot of information, so take some time to look over it and ensure you understand it. A. What is the name for this type of plate boundary? _________________________________________________ [1] B. What is the relative motion of the plates across this boundary? _________________________________________________ [1] Figure 1: Figure made using GeoMapApp software showing South America - Pacific plate boundary. Circles show locations of earthquakes since 1960, colored by depth and scaled by magnitude. The red stars are active volcanoes. A topographic profile across the plate boundary at ~30˚S is inset. C. What is the overall relationship between the depth of earthquakes and distance from the plate boundary here (i.e., from the trench just offshore from the continental margin)? How do you explain this trend? _________________________________________________________________________________________________________________________ _________________________________________________________________________________________________________________________ _________________________________________________________________________________________________________________________ _____________________________________________________________________________________________________________________ [3] D. What is the approximate distance between the volcanic arc (the chain of volcanoes, marked by red stars) and the plate boundary? 
_________________________________________________ [1]
E. Draw a cross section that spans the region of the topographic inset at 30˚S in the space below. Show the structure of the plates from the surface down to 200 km depth in the Earth. Make sure to illustrate and label the slab, trench, Andes mountains, volcanoes, and earthquake locations. [5]
F. Explain what leads to volcanism in the Andes (and other plate boundaries of this kind). _________________________________________________________________________________________________________________________ _________________________________________________________________________________________________________________________ _________________________________________________________________________________________________________________________ _________________________________________________________________________________________________________________________ _____________________________________________________________________________________________________________________ [3]
PART 5: Hotspot tracks
Hotspot tracks form as plates move over hot plumes from the Earth’s deep interior. These plumes are thought to initiate at the core-mantle boundary and undergo melting as they rise to the surface. The melted rock then punches through the plate above the hot spot, creating volcanic seamounts, islands, atolls, and guyots (depending on the amount of volcanism and age/maturity of the volcanic edifice). Hotspots remain relatively immobile while plates move over them and can therefore be used to work out the speed and direction of plate motion.
The following questions rely on interpretation of the Hawaiian ocean island chain shown in the following figure. The figure gives you compass directions and a horizontal scale, in km, measured along the chain from the youngest ocean island volcano – Kilauea, which is presently active. In red are localities of rocks dated using geochronological techniques. NOTE: one of the ages is actually wrong… so be careful!
A. Using this hotspot track, calculate the speed of the plate’s motion over the Earth’s surface, in mm/yr (a unit-conversion sketch follows at the end of this worksheet). _______________________________ mm/yr [5]
B. Roughly in which compass direction is the plate moving? (we’re looking for an answer like “north” or “southeast”) _______________________________ [1]
C. Which of the rock ages is wrong, and what should the age of rocks at that locality actually be (be sure that the plate speed you report for part A is not based on the incorrect point)? The age of ________________ Ma is wrong – rocks on that island should be _______________________ Ma [4]
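The rate questions in Parts 2 and 5 ask for answers in both km/Myr and mm/yr. The short Python sketch below uses made-up numbers (not values read from the worksheet figures) purely to illustrate how the unit conversion works.

# Illustrative only: hypothetical distance and age, NOT values from the worksheet figures.
distance_km = 1500.0        # distance of a dated locality from the active volcano, in km
age_myr = 15.0              # radiometric age of that locality, in millions of years

rate_km_per_myr = distance_km / age_myr     # plate speed in km per million years -> 100.0
# 1 km per million years = 1,000,000 mm per 1,000,000 years = 1 mm/yr,
# so the numeric value is unchanged when converting km/Myr to mm/yr.
rate_mm_per_yr = rate_km_per_myr * 1.0      # -> 100.0 mm/yr
print(rate_km_per_myr, "km/Myr =", rate_mm_per_yr, "mm/yr")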


[SOLVED] GSND 5345Q Fundamentals of Data Science Homework 2 R

Homework 2 GSND 5345Q, Fundamentals of Data Science Due Monday, January 27th, 2025
Advanced Unix Tools
Most Unix implementations include a large number of powerful tools and utilities. (Unix has been in development for more than 50 years!) We were only able to scratch the surface in our class time. It will take time to become comfortable with Unix, but as you struggle, you will find yourself learning just by looking at man pages and finding solutions on the internet. For this homework, you will explore several more advanced Unix functions. You can use any resource available to you – classmates, the internet (Google and ChatGPT!), and Dr. Johnson. Ask all the questions you want; just make sure you do the work and you learn!
1. Learn more about tools for downloading files from external servers (e.g., scp, ftp, sftp, rsync), and for downloading data from webpages (e.g., curl, wget, mget). Use an appropriate function to download homework2.tar.gz from the homework folder on the course GitHub page. Give the code you used to download these data. (Hint: To download homework2.tar.gz from GitHub, control/right click on the "View raw" link and copy the location (see image). If you use the URL in the address bar it downloads the .html for the website.)
2. Learn about the tar function. What is a tarball? How is it different from a .zip file? Download the homework2.tar.gz file from GitHub, unzip the contents, and report the code you used. How effective is the compression for this tarball? After you complete this homework, add your homework files directory and generate a gzipped tarball for all the Homework 1 data plus your answers. Make sure to provide the code you used to generate the tarball for your homework.
3. Research the chmod function. Give a short explanation of what this function does, its syntax, and examples of when you would use it. Practice chmod by changing the permissions on the ’TB_microbiome_data.txt’ file in the Homework 1 directory from the previous questions. Give examples of the code you used and show that the code works (e.g., use ls -l).
4. The grep function is an extremely powerful tool for searching (potentially large) files for patterns and strings. One advantage is that you don’t have to open the file to conduct a search! Using the internet, find a short tutorial on the basics of grep, and give the code and results for the following tasks:
(a) How many FC receptor genes are present in the ’TB_nanostring.txt’ file? (hint: search for ’FC’ in the file)
(b) How many samples (rows) in the ’nanostring_annotation.txt’ do not have a co-morbid condition or other risk factor? (i.e., inverse search – how many rows do not have a "Yes")
(c) How many coronavirus genomes are present in the ’viral.fasta’ file? How many of these are SARS-COV-2?
(d) How many times does the letter ’A’ (capital or lowercase) appear in all the files from the homework1 tar file? (i.e., ignore case)
(e) What Staphylococcus species are present in the ’TB_microbiome_data.txt’ file? (hint: each separate microbe has its own row in the file) Print out the counts for Mycobacterium tuberculosis. How many Streptococcus species are present?
5. Learn how to use less to display large text files in the terminal using the man help page. Using the "OPTIONS" section of the man page, open the ’viral.fasta’ file so that it does not wrap long lines (default), displays line numbers, and opens at the first occurrence of ’coronavirus’. Provide the command you used to open the file in this way.
Within less, learn and practice how to scroll forward/backward, scroll forward/backward n lines, jump to the middle or end of the file, and search for text in the document. When would it be advantageous to use less over a tool like Microsoft Word? Ask Dr. Johnson why in Unix more is less and less is more :-).
6. Open a text file in vim and change the file. How do you move to the beginning/end of a line, insert text, copy and paste, delete text and lines? How do you save your file or exit vim with/without saving your result? What are the advantages and disadvantages of vim versus less? In which scenarios would you use each of these?
7. Learn about pipes and redirects in Unix. In which scenarios would you use them, and why are they helpful? Describe what the following commands do (an illustrative sketch of what a pipe and a redirect do appears at the end of this homework):
(a) ls -l | less
(b) ls -l > directory_contents.txt
(c) ls -l >> directory_contents.txt
(d) cat directory_contents.txt | head -3 | tail -2
(e) ls | grep -c html
(f) ls | wc -l
(g) cat file1.txt file2.txt > file3.txt
You can also use pipes in R! Investigate how to do this and give the code for a great example.
8. Learn about another Unix command that we have not discussed. Give a short description of this function, when you would use it, its syntax, and give some examples of its use.
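Not part of the graded homework: the Python sketch below mirrors roughly what a shell pipeline plus an output redirect, such as ls -l | grep -c html > count.txt, does under the hood. The output filename count.txt is made up for illustration.

import subprocess

# Producer: run `ls -l` and expose its stdout as a pipe.
ls = subprocess.Popen(["ls", "-l"], stdout=subprocess.PIPE)
# Consumer: `grep -c html` reads the producer's pipe as its stdin.
grep = subprocess.Popen(["grep", "-c", "html"], stdin=ls.stdout, stdout=subprocess.PIPE)
ls.stdout.close()                       # let ls receive SIGPIPE if grep exits early
count = grep.communicate()[0].decode().strip()

# The `>` redirect: send the pipeline's output to a file instead of the terminal.
with open("count.txt", "w") as fh:
    fh.write(count + "\n")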


[SOLVED] CSCI251 Advanced Programming Assignment 2 C/C++

CSCI251 Advanced Programming Assignment 2
Aim
The objectives of this assignment include:
- Learning about encapsulation, inheritance, polymorphism and function overloading
- Applying the concepts learnt by developing a survey and path planning program
Background
In a theoretical flat-land universe, everything is in 2 dimensions. People, animals and plants to planets, moons, galaxies and even space itself, all are in 2D. In our flat-land space (i.e. 'flat-space'), there is a powerful organization called 2D-StarFleet (2DSF), whose goals include seeking out new life and civilization via exploration.
While on a routine mission of exploration, the flagship of 2DSF, the Enterprise-2D, is trapped in an expanse of space encircled by a massive ring of violent, electrical plasma storm. Data coming in from the sensor array reveals that the only opening in this storm is located at the far end of the enclosed area, from Enterprise-2D's current location.
In addition, the sensor data also revealed that this area is populated by strange 2D geometrical shapes, with sizes ranging from a small moon or asteroid to large planets, or even a star! This implies that to travel to the 'exit' at the far end of the storm, you need to understand more about the properties of these shapes and attempt to chart a course to navigate to the exit!
As a Science Officer aboard Enterprise-2D, you need to develop a program that has the following capabilities:
a) read in sensor data on the strange 2D shapes (via manual input)
b) compute the area ('mass') of these shapes (an area formula note appears at the end of this brief)
c) print shapes report (e.g. list of points on its perimeter, or totally within the shape's area)
d) sort shapes data (sorted by special type and area)
The next section provides information about the requirements for developing this program.
Task Requirements
A) In terms of relative positioning, you may assume a coordinate system with Enterprise-2D at the origin, trying to navigate in a general 'upper-right' direction to get to the exit in the storm. Please refer to Appendix A, which elaborates on this coordinate system and the unit representation of 2D shapes. IMPORTANT: For this assignment, you should not assume that the 2D shapes in Appendix A are positioned exactly as shown in Appendix A, nor that there are not more shapes. There will, however, only be shapes of the types listed in Appendix B.
B) The sensor data coming in from Enterprise-2D's sensor array provides crucial information about the 2D shapes, such as name, special type and the location of all vertices (that outline the perimeter of the shape). Please refer to Appendix B, which provides a more detailed description of the sensor data.
C) To assist you in the initial class design of your program, please refer to Appendix C, which illustrates one possible way of designing your program. It also describes a list of requirements which you need to implement, especially those marked under "compulsory". The classes highlighted in Appendix C are purely meant to store data about the 2D shapes entered into your program by the user.
D) You are required to implement a main driver class file called 'Assn2.cpp', whose methods are called to start the program. When started, it should print a menu providing the following functionalities:
- read in sensor data on the strange 2D shapes (via manual input)
- compute the area ('mass') of these shapes
- print shapes report (e.g. list of points on its perimeter, or totally within the shape's area)
- sort shapes data (sorted by special type and area)
Appendix D provides more information about implementing this class.
E) Once the program is completed and tested to be working successfully, you are highly encouraged to add "new features" to the program that you feel are relevant to the problem. Additional marks may be awarded subject to the relevancy and correctness of the new functionalities. (Note: the additional features will only be considered IF the program has correctly fulfilled all the basic requirements elaborated in the earlier sections!)
F) You are to use only the C++ language to develop your program. There is no restriction on the IDE as long as your source files can be compiled by the g++ compiler (that comes packaged in Ubuntu Linux) and executed in the Ubuntu terminal shell environment.
Deliverables
1) The deliverables include the following:
a) The actual working C++ program (soft copy), with comments on each file, function or block of code to help the tutor understand its purpose.
b) A softcopy Word document that elaborates on: (interpreted) requirements of the program; diagrams/illustrations of program design; a summary of the implementation of each module in your program; reflections on program development (e.g. assumptions made, difficulties faced, what could have been done better, possible enhancements in future, what have you learnt, etc.)
c) A program demo/software testing during the lab session. You must be prepared to perform certain tasks/answer any questions posed by the tutor.
2) IMPT: Please follow closely the submission instructions in Appendix E, which contains details about what to submit, file naming conventions, when to submit, where to submit, etc.
3) The software demo/testing will be held during the lab session where you are supposed to submit your assignment. Some time will be allocated for you to present/demonstrate your program's capabilities during the session.
Grading
Students' deliverables will be graded according to the following criteria:
(i) Program fulfills all the basic requirements stipulated by the assignment
(ii) Successful demonstration of a working program, clarity of explanation/presentation and satisfactory answers provided during the Q&A session
(iii) Additional effort (e.g. enhancing the program with relevant features over and above task requirements; an impressive, 'killer' presentation)
(iv) After the submission of deliverables, students will be required to undergo a software testing process (to determine the correctness and fulfillment of software requirements)
Further instructions will be given by the Tutor during the subsequent respective labs. Please pay attention, as failure to adhere to instructions will result in deduction of marks.
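A note on capability b) above (computing the area, or 'mass', of the shapes): assuming each shape's vertices describe a simple, non-self-intersecting polygon listed in order around its perimeter, one standard way to compute its area is the shoelace formula,

A = \frac{1}{2}\left|\sum_{i=1}^{n}\left(x_i\, y_{i+1} - x_{i+1}\, y_i\right)\right|, \qquad (x_{n+1}, y_{n+1}) \equiv (x_1, y_1).

This is only a pointer; whether it applies to every shape type depends on Appendix B, which is not reproduced in this listing.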


[SOLVED] CS6035 Machine Learning 2025

Learning Goals of this Project:Important HighlightsImportant Reference Materials:Project Overview VideoThis is a 16 minute video by the project creator, it covers project concepts.There are other videos on the Setup page that cover installation and other subjects.BACKGROUNDMany of the Projects in CS6035 are focused on offensive security tasks. These are related to Red Team activities/tasks that many of us may associate with cybersecurity. This project will be focused on defensive security tasks, which are usually considered Blue Team activities that are done by many corporate teams.Historically, many defensive security professionals have investigated malicious activity, files, and code. They investigate these to create patterns (often called signatures) that can be used to detect (and prevent) malicious activity, files, and code when that pattern is used again. What this means is that these simple methods only were effective on known threats.This approach was relatively effective in preventing known malware from infecting systems, but it did nothing to protect against novel attacks. As attackers became more sophisticated, they learned to tweak or simply encode their malicious activity, files, or code to avoid detection from these simple pattern matching detections.With this background information, it would be nice if a more general solution could give a score to activity, files, and code that pass through corporate systems every day. This solution would inform the security team that while a certain pattern may not exactly fit a signature of known malicious activity, files, or code it appears to be very similar to examples that were seen in the past that were malicious.Luckily machine learning models can do exactly that if provided with proper training data! Thus, it is no surprise that one of the most powerful tools in the hands of defensive cybersecurity professionals is Machine Learning. Modern detection systems usually use a combination of machine learning models and pattern matching (regular expressions) to detect and prevent malicious activity on networks and devices.This project will focus on teaching the fundamentals of data analysis and building/testing your own machine learning models in python. You’ll be using the open source libraries Pandas and Scikit-Learn.Cybersecurity Machine Learning Careers and TrendsAdditional InformationTable of contents Task 1:Task 1For the first task, let’s get familiar with some pandas basics. pandas is a Python library that deals with Dataframes, which you can think of as a Python class that handles tabular data. In the real world, you would create graphics and other visuals to better understand the dataset you are working with. You would also use plotting tools like PowerBi, Tableau, Data Studio, and Matplotlib. This step is generally known as Exploratory Data Analysis. Since we are using an autograder for this class, we will skip the plotting for this project.For this task, we have released a local test suite. If you are struggling to understand the expected input and outputs for a function, please set up the test suite and use it to debug your function. Please note that the return lines for the provided skeleton functions are placeholders for the data types that the tests are expecting.It’s critical you pass all tests locally before you submit to Gradescope for credit. Do not use Gradescope for debugging.TheoryIn this Task, we’re not yet getting into theory. It’s more nuts and bolts – you will learn the basics of pandas. 
pandas dataframes are something of a glorified list of lists, mixed in with a dictionary. You get a table of values with rows and columns, and you can modify the column names and index values for the rows. There are numerous functions built into pandas to let you manipulate the data in the dataframe.
To be clear, pandas is not part of Python, so when you look up docs, you’ll specifically want the official Pydata pandas docs. Note that we linked to the API docs here; this is the core of the docs you’ll be looking at.
You can always get started trying to solve a problem by looking at Stack Overflow posts in Google search results. There you’ll find ideas about how to use the pandas library. In the end, however, you should find yourself in the habit of looking directly at the docs for whichever library you are using, pandas in this case.
For those who might need a concrete example to get started, here’s how you would take a pandas dataframe column and return the average of its values:
import pandas as pd
# create a dataframe from a Python dict
df = pd.DataFrame({"color": ["yellow", "green", "purple", "red"], "weight": [124, 4.56, 384, -2]})
df  # shows the dataframe
Note that the column names are ["color", "weight"] while the index is [0, 1, 2, 3...], where the brackets [...] denote a list.
Now that we have created a dataframe, we can find the average weight by summing the values under "weight" and dividing by the number of rows:
average = df["weight"].sum() / len(df["weight"])
average  # if you put a variable as the last line, the variable is printed
127.63999999999999
Note: In the example above, we’re not paying attention to rounding; you will need to round your answers to the precision asked for in each Task.
Also note, we are using slightly older versions of pandas, Python and other libraries, so be sure to look at the docs for the appropriate library version. Often there’s a drop-down at the top of docs sites to select the older version.
Refer to the Submissions page for details about submitting your work.
Useful Links:
Deliverables:
Instructions:
The Task1.py file has function skeletons that you will complete with Python code, mostly using the pandas library. The goal of each of these functions is to give you familiarity with the pandas library and some general Python concepts like classes, which you may not have seen before. See information about the function’s inputs, outputs, and skeletons below.
Table of contents
find_data_type
In this function you will take a dataset and the name of a column in it.
You will return the column’s data type.Useful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dtypes.htmlINPUTSOUTPUTSnp.dtype – data type of the columnFunction Skeletondef find_data_type(dataset:pd.DataFrame,column_name:str) -> np.dtype:return np.dtype()set_index_colIn this function you will take a dataset and a series and set the index of the dataset to be the seriesUseful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Index.htmlINPUTSOUTPUTSa pandas DataFrame indexed by the given index seriesFunction Skeletondef set_index_col(dataset:pd.DataFrame,index:pd.Series) -> pd.DataFrame:return pd.DataFrame()reset_index_colIn this function you will take a dataset with an index already set and reindex the dataset from 0 to n-1, dropping the old indexUseful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.htmlINPUTSOUTPUTSa pandas DataFrame indexed from 0 to n-1Function Skeletondef reset_index_col(dataset:pd.DataFrame) -> pd.DataFrame:return pd.DataFrame()set_col_typeIn this function you will be given a DataFrame, column name and column type. You will edit the dataset to take the column name you are given and set it to be the type given in the input variableUseful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.astype.htmlINPUTSOUTPUTSa pandas DataFrame with the column in column_name changed to the type in new_col_typeFunction Skeleton# Set astype (string, int, datetime)def set_col_type(dataset:pd.DataFrame,column_name:str,new_col_type:type) -> pd.DataFrame:return pd.DataFrame()make_DF_from_2d_arrayIn this function you will take data in an array as well as column and row labels and use that information to create a pandas DataFrameUseful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.htmlINPUTSOUTPUTSa pandas DataFrame with columns set from column_name_list, row index set from index and data set from array_2dFunction Skeleton# Take Matrix of numbers and make it into a DataFrame with column name and index numberingdef make_DF_from_2d_array(array_2d:np.array,column_name_list:list[str],index:pd.Series) -> pd.DataFrame:return pd.DataFrame()sort_DF_by_columnIn this function, you are given a dataset and column name. You will return a sorted dataset (sorting rows by the value of the specified column) either in descending or ascending order, depending on the value in the descending variable.Useful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.htmlINPUTSOUTPUTSa pandas DataFrame sorted by the given column name and in descending or ascending order depending on the value of the descending variableFunction Skeleton# Sort DataFrame by valuesdef sort_DF_by_column(dataset:pd.DataFrame,column_name:str,descending:bool) -> pd.DataFrame:return pd.DataFrame()drop_NA_colsIn this function you are given a DataFrame. 
You will return a DataFrame with any columns containing NA values droppedUseful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.htmlINPUTSOUTPUTSa pandas DataFrame with any columns that contain an NA value droppedFunction Skeleton# Drop NA values in DataFrame Columns def drop_NA_cols(dataset:pd.DataFrame) -> pd.DataFrame:return pd.DataFrame()drop_NA_rowsIn this function you are given a DataFrame you will return a DataFrame with any rows containing NA values droppedUseful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.htmlINPUTSOUTPUTSa pandas DataFrame with any rows that contain an NA value droppedFunction Skeletondef drop_NA_rows(dataset:pd.DataFrame) -> pd.DataFrame:return pd.DataFrame()make_new_columnIn this function you are given a dataset, a new column name and a string value to fill in the new column. Add the new column to the dataset and return the dataset.Useful Resourceshttps://pandas.pydata.org/pandas-docs/stable/getting_started/intro_tutorials/05_add_columns.htmlINPUTSOUTPUTSa pandas DataFrame with the new column created named new_column_name and filled with the value in new_column_valueFunction Skeletondef make_new_column(dataset:pd.DataFrame,new_column_name:str,new_column_value:list) -> pd.DataFrame:return pd.DataFrame()left_merge_DFs_by_columnIn this function you are given 2 datasets and the name of a column with which you will left join them on using the pandas merge method. The left dataset is dataset1 right dataset is dataset2, for example purposes.Useful Resourceshttps://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html https://stackoverflow.com/questions/53645882/pandas-merging-101INPUTSOUTPUTSa pandas DataFrame containing the two datasets left joined together on the given column nameFunction Skeletondef left_merge_DFs_by_column(left_dataset:pd.DataFrame,right_dataset:pd.DataFrame,join_col_name:str) -> pd.DataFrame:return pd.DataFrame()simpleClassThis project will require you to work with Python Classes. If you are not familiar with them we suggest learning a bit more about them.You will take the inputs into the class initialization and set them as instance variables (of the same name) in the Python class.Useful Resourceshttps://www.w3schools.com/python/python_classes.aspINPUTSOUTPUTSNone, just setup the init method in the class.Function Skeletonclass simpleClass():def __init__(self, length:int, width:int, height:int):passfind_dataset_statisticsNow that you have learned a bit about pandas DataFrames, we will use them to generate some simple summary statistics for a DataFrame. You will be given the dataset as an input variable, as well as a column name for a column in the dataset that serves as a label column. This label column contains binary values (0 and 1) that you also summarize, and also the variable to predict.In this context:This type of binary classification is common in machine learning tasks where we want to be able to predict the field. An example of where this could be useful would be if we were looking at network data, and the label column was IsVirus. 
We could then analyze the network data of Georgia Tech services and predict if incoming files look like a virus (and if we should alert the security team).Useful ResourcesINPUTSOUTPUTSHint: Consider using the int function to type cast decimalsFunction Skeletondef find_dataset_statistics(dataset:pd.DataFrame,label_col:str) -> tuple[int,int,int,int,int]:n_records = #TODOn_columns = #TODOn_negative = #TODOn_positive = #TODOperc_positive = #TODOreturn n_records,n_columns,n_negative,n_positive,perc_positive    Task 2:Now that you have a basic understanding of pandas and the dataset, it is time to dive into some more complex data processing tasks.TheoryIn machine learning a common goal is to train a model on one set of data. Then we validate the model on a similarly structured but different set of data. You could, for example, train the model on data you have collected historically. Then you would validate the model against real-time data as it comes in, seeing how well it predicts the new data coming in.If you’re looking at a past dataset as we are in these tasks, we need to treat different parts of the data differently to be able to develop and test models. We segregate the data into test and training portions. We train the model on the training data and test the developed model on the test data to see how well it predicts the results.You should never train your models on test data, only on training data.NotesAt a high level it is important to hold out a subset of your data when you train a model. You can see what the expected performance is on unseen sample. Thus, you can determine if the resulting model is overfit (performs much better on training data vs test data).Preprocessing data is essential because most models only take in numerical values. Therefore, categorical features need to be “encoded” to numerical values so that models can use them. A machine learning model may not be able to make sense of “green”, “blue” and “red.” In preprocessing, we’ll convert those to integer values 1, 2 and 3, for example. It’s an interesting question as to what happens when you have training data that has “green,” “red” and blue,” but your testing data says “yellow.”Numerical scaling can be more or less useful depending on the type of model used, but it is especially important in linear models. Numerical scaling is typically taking positive value and “compressing” them into a range between 0 and 1 (inclusive) that retains the relationships among the original data.These preprocessing techniques will provide you with options to augment your dataset and improve model performance.Useful Links:Deliverables:Instructions:The Task2.py File has function skeletons that you will complete with python code (mostly using the pandas and scikit-learn libraries). The Goal of each of these functions is to give you familiarity with the applied concepts of Splitting and Preprocessing Data. See information about the Function’s Inputs, Outputs and Skeletons belowTable of contentsttsIn this function, you will take:You will return features and labels for the training and test sets.At a high level, you can separate the task into two subtasks. The first is splitting your dataset into both features and labels (by columns), and the second is splitting your dataset into training and test sets (by rows). 
You should use the scikit-learn train_test_split function but will have to write wrapper code around it based on the input values we give you.Useful ResourcesINPUTSOUTPUTSFunction Skeletondef tts(  dataset: pd.DataFrame,label_col: str,test_size: float,should_stratify: bool,random_state: int) -> tuple[pd.DataFrame,pd.DataFrame,pd.Series,pd.Series]:# TODOreturn train_features,test_features,train_labels,test_labelsPreprocessDatasetThe PreprocessDataset Class contains a code skeleton with nine methods for you to implement. Most methods will be split into two parts: one that will be run on the training dataset and one that will be run on the test dataset. In Data Science/Machine Learning, this is done to avoid something called Data Leakage.For this assignment, we don’t expect you to understand the nuances of the concept, but we will have you follow principles that will minimize the chances of it occurring. You will accomplish this by splitting data into training and test datasets and processing those datasets in slightly different ways.Generally, for everything you do in this project, and if you do any ML or Data Science work in the future, you should train/fit on the training data first, then predict/transform on the training and test data. That holds up for basic preprocessing steps like task 2 and for complex models like you will see in tasks 3 and 4.For the purposes of this project, you should never train or fit on the test data (and more generally in any ML project) because your test data is expected to give you an understanding of how your model/predictions will perform on unseen data. If you fit even a preprocessing step to your test data, then you are either giving the model information about the test set it wouldn’t have about unseen data (if you combine train and test and fit to both), or you are providing a different preprocessing than the model is expecting (if you fit a different preprocessor to the test data), and your model would not be expected to perform well.Note: You should train/fit using the train dataset; then, once you have a fit encoder/scaler/pca/model instance, you can transform/predict on the training and test data.You will also notice that we are only preprocessing the Features and not the Labels. There are a few cases where preprocessing steps on labels may be helpful in modeling, but they are definitely more advanced and out of the scope of this introduction. 
Generally, you will not need to do any preprocessing to your labels beyond potentially encoding a string value (i.e., “Malware” or “Benign”) into an integer value (0 or 1), which is called Label Encoding.PreprocessDataset:__init__Similar to the Task1 simpleClass subtask you previously completed you will initialize the class by adding instance variables (add all the inputs to the class).Useful ResourcesINPUTSExample of feature_engineering_functions:def double_height(dataframe:pd.DataFrame):return dataframe[“height”] * 2 def half_height(dataframe:pd.DataFrame):return dataframe[“height”] / 2 feature_engineering_functions = {“double_height”:double_height,”half_height”:half_height}Don’t worry about copying it we also have examples in the local test cases this is just provided as an illustration of what to expect in your function.OUTPUTSNone, just assign all the input parameters to class variables.Also per the instructions below, you’ll return here and create another instance variable: a scikit-learn OneHotEncoder with any Parameters you may need later.Function Skeletondef __init__(self,one_hot_encode_cols:list[str],min_max_scale_cols:list[str],n_components:int,feature_engineering_functions:dict):# TODO: Add any instance variables you may need to make your functions workreturnPreprocessDataset:one_hot_encode_columns_train and one_hot_encode_columns_testOne Hot Encoding is the process of taking a column and returning a binary vector representing the various values within it. There is a separate function for the training and test datasets since they should be handled separately to avoid data leakage (see the 3rd link in Useful Resources for a little more info on how to handle them).Pseudocodeone_hot_encode_columns_train()one_hot_encode_columns_test()Example Walkthrough (from Local Testing suite):INPUTS:one_hot_encode_cols[“src_ip”,”protocol”]Train FeaturesTest FeaturesTrain DataFrames at each step:2.DataFrame with columns to encode:DataFrame with other columns:4.One Hot Encoded 2d array:5.One Hot Encoded DataFrame with Index and Column Names6.Final DataFrame with passthrough/other columns joined backTest DataFrames at each step:1.DataFrame with columns to encode:DataFrame with other columns:2.One Hot Encoded 2d array:3.One Hot Encoded DataFrame with Index and Column Names4.Final DataFrame with passthrough columns joined backNote: For the local tests and autograder use the column naming scheme of joining the previous column name and the column value with an underscore (similar to above where Type -> Type_Fruit and Type_Vegetable)Note 2: Since you should only be fitting your encoder on the training data, if there are values in your test set that are different than those in the training set, you will denote that with 0s. In the example above, let’s say we have a row in the test set with pizza, which is neither a fruit nor vegetable for the Type_Fruit and Type_Vegetable. It should result in a 0 for both columns. If you don’t handle these properly, you may get errors like Test Failed: Found unknown categories.Note 3: You may be tempted to use the pandas function get_dummies to solve this task, but its a trap. It seems easier, but you will have to do a lot more work to make it handle a train/test split. 
So, we suggest you use scikit-learn’s OneHotEncoder.Useful ResourcesINPUTSOUTPUTSa pandas DataFrame with the columns listed in one_hot_encode_cols one hot encoded and all other columns in the DataFrame unchangedFunction Skeletondef one_hot_encode_columns_train(self,train_features:pd.DataFrame) -> pd.DataFrame:one_hot_encoded_dataset = pd.DataFrame()return one_hot_encoded_datasetdef one_hot_encode_columns_test(self,test_features:pd.DataFrame) -> pd.DataFrame:one_hot_encoded_dataset = pd.DataFrame()return one_hot_encoded_datasetPreprocessDataset:min_max_scaled_columns_train and min_max_scaled_columns_testMin/Max Scaling is a process to transform numerical features to a specific range, typically [0, 1], to ensure that input values are comparable (similar to how you may have heard of “normalizing” data) and is a crucial preprocessing step for many machine learning algos. In particular this standardization is essential for algorithms like linear regression, logistic regression, k-means, and neural networks, which can be sensitive to the scale of input features, whereas some algos like decision trees are less impacted.By applying Min/Max Scaling, we prevent feature dominance, to ideally improve performance and accuracy of these algorithms and improve training convergence. It’s a recommended step to ensure your models are trained on consistent and standardized data.For the provided assignment you should use the scikit-learn MinMaxScaler function (linked in the resources below) rather than attempting to implement your own scaling function.The rough implementation of the scikit-learn function is provided below for educational purposes.X_std = (X – X.min(axis=0)) / (X.max(axis=0) – X.min(axis=0))X_scaled = X_std * (max – min) + minNote: There are separate functions for the training and test datasets to help avoid data leakage between the test/train datasets. Please refer to the 3rd link in Useful Resources for more information on how to handle this – namely that we should still scale the test data based on our “knowledge” of the train dataset.Example Dataframe:Example Min Max Scaled Dataframe (rounded to 4 decimal places):Note: For the Autograder use the same column name as the original column (ex: Price -> Price)Useful ResourcesINPUTSOUTPUTSa pandas DataFrame with the columns listed in min_max_scale_cols min/max scaled and all other columns in the DataFrame unchangedFunction Skeletondef min_max_scaled_columns_train(self,train_features:pd.DataFrame) -> pd.DataFrame:min_max_scaled_dataset = pd.DataFrame()return min_max_scaled_datasetdef min_max_scaled_columns_test(self,test_features:pd.DataFrame) -> pd.DataFrame:min_max_scaled_dataset = pd.DataFrame()return min_max_scaled_datasetPreprocessDataset:pca_train and pca_testPrincipal Component Analysis is a dimensionality reduction technique (column reduction). It aims to take the variance in your input columns and map the columns into N columns that contain as much of the variance as it can. This technique can be useful if you are trying to train a model faster and has some more advanced uses, especially when training models on data which has many columns but few rows. There is a separate function for the training and test datasets because they should be handled separately to avoid data leakage (see the 3rd link in Useful Resources for a little more info on how to handle them).Note 1: For the local tests and autograder, use the column naming scheme of column names: component_1, component_2 .. 
component_n for the n_components passed into the __init__ method.Note 2: For your PCA outputs to match the local tests and autograder, make sure you set the seed using a random state of 0 when you initialize the PCA function.Note 3: Since PCA does not work with NA values, make sure you drop any columns that have NA values before running PCA.Useful ResourcesINPUTSOUTPUTSa pandas DataFrame with the generated pca values and using column names: component_1, component_2 .. component_nFunction Skeletondef pca_train(self,train_features:pd.DataFrame) -> pd.DataFrame:# TODO: Read the function description in https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/Machine_Learning/Task2.html and implement the function as describedpca_dataset = pd.DataFrame()return pca_dataset def pca_test(self,test_features:pd.DataFrame) -> pd.DataFrame:# TODO: Read the function description in https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/Machine_Learning/Task2.html and implement the function as describedpca_dataset = pd.DataFrame()return pca_datasetPreprocessDataset:feature_engineering_train, feature_engineering_testFeature Engineering is a process of using domain knowledge (physics, geometry, sports statistics, business metrics, etc.) to create new features (columns) out of the existing data. This could mean creating an area feature when given the length and width of a triangle or extracting the major and minor version number from a software version or more complex logic depending on the scenario.In cybersecurity in particular, feature engineering is crucial for using domain expert’s (e.g. a security analyst) experience to identify anomalous behavior that might signify a security breach. This could involve creating features that represent deviations from established baselines, such as unusual file access patterns, unexpected network connections, or sudden spikes in CPU usage. These anomaly-based features can help distinguish malicious activity from normal system operations, but the system does not know what data patterns mean anomalous off-hand – that is where you as the domain expert can help by creating features.These methods utilize a dictionary, feature_engineering_functions, passed to the class constructor (__init__). This dictionary defines how to generate new features:Example of what could be passed as the feature_engineering_functions dictionary to __init__:import pandas as pddef double_height(dataframe: pd.DataFrame) -> pd.Series:return dataframe[“height”] * 2 def half_height(dataframe: pd.DataFrame) -> pd.Series:return dataframe[“height”] / 2 example_feature_engineering_functions = {“double_height”: double_height, # Note that functions in python can be passed around and used just like data!“half_height”: half_height} # and the class may be been created like this…# preprocessor = PreprocessDataset(…, feature_engineering_functions=example_feature_engineering_functions, …)In particular for this method, you will be taking in a dictionary with a column name and a function that takes in a DataFrame and returns a column. You’ll be using that to create a new column with the name in the dictionary key. 
Therefore if you were given the above functions, you would create two new columns named "double_height" and "half_height" in your Dataframe.
Useful Resources
INPUTS
OUTPUTS
a pandas dataframe with the features described in feature_engineering_train and feature_engineering_test added as new columns and all other columns in the dataframe unchanged
Function Skeleton
def feature_engineering_train(self,train_features:pd.DataFrame) -> pd.DataFrame:
    feature_engineered_dataset = pd.DataFrame()
    return feature_engineered_dataset
def feature_engineering_test(self,test_features:pd.DataFrame) -> pd.DataFrame:
    feature_engineered_dataset = pd.DataFrame()
    return feature_engineered_dataset
PreprocessDataset: preprocess_train, preprocess_test
Now, we will put three of the above methods together into a preprocess function. This function will take in a dataset and perform encoding, scaling, and feature engineering using the above methods and their respective columns. You should not perform PCA for this function.
Useful Resources
See resources for one hot encoding, min/max scaling and feature engineering above
INPUTS
OUTPUTS
a pandas dataframe for both test and train features with the columns in one_hot_encode_cols encoded, the columns in min_max_scale_cols scaled and the columns described in feature_engineering_functions engineered. You do not need to use PCA here.
Function Skeleton
def preprocess_train(self,train_features:pd.DataFrame) -> pd.DataFrame:
    train_features = pd.DataFrame()
    return train_features
def preprocess_test(self,test_features:pd.DataFrame) -> pd.DataFrame:
    test_features = pd.DataFrame()
    return test_features
Task 3
In Task 2 you learned how to split a dataset into training and testing components. Now it’s time to learn about using a K-means model. We will run a basic model on the data to cluster files (rows) with similar attributes together. We will use an unsupervised model.
Theory
An unsupervised model has no label column. By contrast, in supervised learning (which you’ll see in Task 4) the data has features and targets/labels. These labels are effectively an answer key to the data in the feature columns. You don’t have this answer key in unsupervised learning; instead you’re working on data without labels. You’ll need to choose algorithms that can learn from the data, exclusively, without the benefit of labels.
We start with K-means because the algorithm is simple to understand. For the Mathematics people, you can look at the underlying data structure, a Voronoi diagram. Based on squared Euclidean distances, K-means creates clusters of similar datapoints. Each cluster has a centroid. The idea is that each sample is associated/clustered with the centroid that is the “closest.”
Closest is an interesting concept in higher dimensions. You can think of each feature in a dataset as a dimension in the data. If it’s 2d or 3d, we can visualize it easily. Concepts of distance are clear in 2d and 3d, and they work similarly in 4+d.
If you read the Wikipedia articles for K-means you’ll see a discussion of the use of “squared Euclidean distances” in K-means. This is compared with simple Euclidean distances in the Weber problem, and better approaches resulting from k-medians and k-medoids are discussed.
Please use scikit-learn to create the model and Yellowbrick to determine the optimal value of k for the dataset (an illustrative sketch of this appears at the end of this project description). So far, we have functions to split the data and preprocess it. Now, we will run a basic model on the data to cluster files (rows) with similar attributes together.
We will use an unsupervised model (model with no label column), K-means. Again, use scikit-learn to create the model and Yellowbrick to determine the optimal value of k for the dataset.Refer to the Submissions page for details about submitting your work.Useful Links:Deliverables:Local Test Dataset InformationFor this task the local test dataset we are using is the NATICUSdroid dataset, which contains 86 columns of data related to android permissions used by benign and malicious Android applications released between 2010 and 2019. For more information such as the introductory paper and the Citations/Acknowledgements you can view the dataset site in the UCI ML repository. In this specific case clustering can be a useful tool to group apps that request similar permissions together. The team that created this dataset hypothesized that malicious apps would exhibit distinct patterns in the types of permissions they request compared to benign apps. This difference in permission request patterns could potentially be used to distinguish between malicious and benign applications.Instructions:The Task3.py File has function skeletons that you will complete with Python code. You will mostly be using the pandas, Yellowbrick and scikit-learn libraries. The goal of each of these functions is to give you familiarity with the applied concepts of Unsupervised Learning. See information about the function’s Inputs, Outputs and Skeletons below.KmeansClusteringThe KmeansClustering Class contains a code skeleton with 4 methods for you to implement.Note: You should train/fit using the train dataset then once you have a Yellowbrick/K-means model instance you can transform/predict on the training and test data.KmeansClustering:__init__Similar to Task 1, you will initialize the class by adding instance variables as needed.Useful ResourcesINPUTSOUTPUTSNoneFunction Skeletondef __init__(self,random_state: int):# TODO: Add any state variables you may need to make your functions workpassKmeansClustering:kmeans_trainKmeans Clustering is a process of grouping together similar rows together and assigning them to a cluster. For this method you will use the training data to fit an optimal K-means cluster on the data.To help you get started we have provided a list of subtasks to complete for this task:Useful ResourcesINPUTSOUTPUTSa list of cluster ids that the K-means model has assigned for each row in the train datasetFunction Skeletondef kmeans_train(self,train_features:pd.DataFrame) -> list:cluster_ids = list()return cluster_idsKmeansClustering:kmeans_testK-means clustering is a process of grouping together similar rows together and assigning them to a cluster. For this method you will use the training data to fit an optimal K-means cluster on the test data.To help you get started, we have provided a list of subtasks to complete for this task:Useful ResourcesINPUTSOUTPUTSa list of cluster ids that the K-means model has assigned for each row in the test datasetFunction Skeletondef kmeans_test(self,test_features:pd.DataFrame) -> list:cluster_ids = list()return cluster_idsKmeansClustering:train_add_kmeans_cluster_id_feature, test_add_kmeans_cluster_id_featureUsing the two methods you completed above (kmeans_train and kmeans_test) you will add a new feature(column) to the training and test dataframes. 
This is similar to the feature engineering method in Task 2 where you appended new columns onto an existing dataframe.To do this, use the output of the methods (the list of cluster ids you return) from the corresponding train or test method and add it as a new column named kmeans_cluster_id in the input dataframe, then return the full dataframe.Useful ResourcesINPUTSUse the needed instance variables you set in the __init__ method and the kmeans_train and kmeans_test methods you wrote above to produce the needed output.OUTPUTSA pandas dataframe with the kmeans_cluster_id added as a feature and all other input columns unchanged, for each of the two methods train_add_kmeans_cluster_id_feature and test_add_kmeans_cluster_id_feature.Function Skeletondef train_add_kmeans_cluster_id_feature(self,train_features:pd.DataFrame) -> pd.DataFrame:output_df = pd.DataFrame()return output_df def test_add_kmeans_cluster_id_feature(self,test_features:pd.DataFrame) -> pd.DataFrame:output_df = pd.DataFrame()return output_df        Task 4 Now let’s try a few supervised classification models:We have chosen a few commonly used models for you to use here, but there are many options. In the real world, specific algorithms may fit a specific dataset better than other algorithms.You won’t be doing any hyperparameter tuning yet, so you can better focus on writing the basic code. You will:(Note on feature importance: You should use RFE for determining feature importance of your Logistic Regression model, but do NOT use RFE for your Random Forest or Gradient Boosting models to determine feature importance. Please use their built-in values for this.)Useful Links:Deliverables:Local Test Dataset InformationFor this task the local test dataset we are using is the NATICUSdroid dataset, which contains 86 columns of data related to android permissions used by benign and malicious Android applications released between 2010 and 2019. For more information such as the introductory paper and the Citations/Acknowledgements you can view the dataset site in the UCI ML repository. If you look at the online poster for the paper that the dataset creators wrote from their research, they trained a variety of different models including Random Forest, Logistic Regression and XGBoost and calculated a variety of metrics related to training and detection performance. In this task we will guide you through training ML models and calculating performance metrics to compare the predictive abilities of different models.Instructions:The Task4.py File has function skeletons that you will complete with Python code (mostly using the pandas and scikit-learn libraries).The goal of each of these functions is to give you familiarity with the applied concepts of training a model, using it to score records and calculating performance metrics for it. See information about the function inputs, outputs and skeletons below.Table of contentsModelMetricscalculate_naive_metricsA Naive model is a very simple model/prediction that can help to frame how well a more sophisticated model is doing. At best, such a model has random competence at predicting things. At worst, it’s wrong all the time.Since a naive model is incredibly basic (often a constant or randomly selected result), we can expect that any more sophisticated model that we train should outperform it. If the naive Model beats our trained model, it can mean that additional data (rows or columns) is needed in the dataset to improve our model. 
It can also mean that the dataset doesn’t have a strong enough signal for the target we want to predict.In this function, you’ll implement a simple model that always predicts a constant (function-provided) number, regardless of the input values. Specifically, you’ll use a given constant integer, provided as the parameter naive_assumption, as the model’s prediction. This means the model will always output this constant value, without considering the actual data. Afterward, you will calculate four metrics—accuracy, recall, precision, and F1-score—for both the training and test datasets.[1] Refer to the resources below.Useful ResourcesINPUTSOUTPUTSA completed ModelMetrics object with a training and test metrics dictionary with each one of the metrics rounded to 4 decimal placesFunction Skeletondef calculate_naive_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, naive_assumption:int) -> ModelMetrics:train_metrics = {“accuracy” : 0,“recall” : 0,“precision” : 0,“fscore” : 0}test_metrics = {“accuracy” : 0,“recall” : 0,“precision” : 0,“fscore” : 0}naive_metrics = ModelMetrics(“Naive”,train_metrics,test_metrics,None)return naive_metricscalculate_logistic_regression_metricsA logistic regression model is a simple and more explainable statistical model that can be used to estimate the probability of an event (log-odds). At a high level, a logistic regression model uses data in the training set to estimate a column’s weight in a linear approximation function. Conceptually this is similar to estimating m for each column in the line formula you probably know well from geometry: y = m*x + b. If you are interested in learning more, you can read up on the math behind how this works. For this project, we are more focused on showing you how to apply these models, so you can simply use a scikit-learn Logistic Regression model in your code.For this task use scikit-learn’s LogisticRegression class and complete the following subtasks:NOTE: Make sure you use the predicted probabilities for roc aucUseful ResourcesINPUTSThe first 4 are similar to the tts function you created in Task 2:OUTPUTSFunction Skeletondef calculate_logistic_regression_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, logreg_kwargs) -> tuple[ModelMetrics,LogisticRegression]:model = LogisticRegression()train_metrics = {“accuracy” : 0,“recall” : 0,“precision” : 0,“fscore” : 0,“fpr” : 0,“fnr” : 0,“roc_auc” : 0}test_metrics = {“accuracy” : 0,“recall” : 0,“precision” : 0,“fscore” : 0,“fpr” : 0,“fnr” : 0,“roc_auc” : 0} log_reg_importance = pd.DataFrame()log_reg_metrics = ModelMetrics(“Logistic Regression”,train_metrics,test_metrics,log_reg_importance) return log_reg_metrics,modelExample of Feature Importance DataFramecalculate_random_forest_metricsA Random Forest model is a more complex model than the naive and Logistic Regression Models you have trained so far. It can still be used to estimate the probability of an event, but achieves this using a different underlying structure: a tree-based model. Conceptually, this looks a lot like many if/else statements chained together into a “tree”. A Random Forest expands on this and trains different trees with different subsets of the data and starting conditions. It does this to get a better estimate than a single tree would give. 
For this project, we are more focused on showing you how to apply these models, so you can simply use the scikit-learn Random Forest model in your code.For this task use scikit-learn’s Random Forest Classifier class and complete the following subtasks:NOTE: Make sure you use the predicted probabilities for roc aucUseful ResourcesINPUTSOUTPUTSFunction Skeletondef calculate_random_forest_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, rf_kwargs) -> tuple[ModelMetrics,RandomForestClassifier]: model = RandomForestClassifier() train_metrics = {“accuracy” : 0,“recall” : 0,“precision” : 0,“fscore” : 0,“fpr” : 0,“fnr” : 0,“roc_auc” : 0} test_metrics = {“accuracy” : 0,“recall” : 0,“precision” : 0,“fscore” : 0,“fpr” : 0,“fnr” : 0,“roc_auc” : 0} rf_importance = pd.DataFrame()rf_metrics = ModelMetrics(“Random Forest”,train_metrics,test_metrics,rf_importance) return rf_metrics,modelExample of Feature Importance DataFramecalculate_gradient_boosting_metricsA Gradient Boosted model is more complex than the Naive and Logistic Regression models and similar in structure to the Random Forest model you just trained. A Gradient Boosted model expands on the tree-based model by using its additional trees to predict the errors from the previous tree. For this project, we are more focused on showing you how to apply these models, so you can simply use the scikit-learn Gradient Boosted Model in your code.For this task use scikit-learn’s Gradient Boosting Classifier class and complete the following subtasks:NOTE: Make sure you use the predicted probabilities for roc aucRefer to the Submissions page for details about submitting your work.Useful ResourcesINPUTSOUTPUTSFunction Skeletondef calculate_gradient_boosting_metrics(train_features:pd.DataFrame, test_features:pd.DataFrame, train_targets:pd.Series, test_targets:pd.Series, gb_kwargs) -> tuple[ModelMetrics,GradientBoostingClassifier]:model = GradientBoostingClassifier()train_metrics = {“accuracy” : 0,“recall” : 0,“precision” : 0,“fscore” : 0,“fpr” : 0,“fnr” : 0,“roc_auc” : 0}test_metrics = {“accuracy” : 0,“recall” : 0,“precision” : 0,“fscore” : 0,“fpr” : 0,“fnr” : 0,“roc_auc” : 0} gb_importance = pd.DataFrame()gb_metrics = ModelMetrics(“Gradient Boosting”,train_metrics,test_metrics,gb_importance) return gb_metrics,modelExample of Feature Importance DataFrame     Task 5: Model Training and Evaluation:Now that you have written functions for different steps of the model-building process, you will put it all together. You will write code that trains a model with hyperparameters you determine (you should do any tuning locally or in a notebook, i.e., don’t tune your model in gradescope since the autograder will likely timeout).Important: Conduct hyperparameter tuning locally or in a separate notebook. Avoid tuning within Gradescope to prevent autograder timeouts.Develop your own local tests to ensure your code functions correctly before submitting to Gradescope. Do not share these tests with other students.train_model_return_scores (ClaMP Dataset)Instructions (10 points):This function focuses on training a model using the ClaMP dataset and evaluating its performance on a test set.Sample Submission (ClaMP):Function Skeleton (ClaMP):import pandas as pd def train_model_return_scores(train_df, test_df) -> pd.DataFrame:“””Trains a model on the ClaMP training data and returns predicted probabilitiesfor the test data. 
Args:train_df (pd.DataFrame): ClaMP training data with ‘class’ column.test_df (pd.DataFrame): ClaMP test data without ‘class’ column. Returns:pd.DataFrame: DataFrame with ‘index’ and ‘malware_score’ columns.“””# TODO: Implement the model training and prediction logic as described above.test_scores = pd.DataFrame()  # Replace with your implementationreturn test_scorestrain_model_unsw_return_scores (UNSW-NB15 Dataset)Instructions (10 points):This function is similar to the previous one but uses the UNSW-NB15 dataset.Sample Submission (UNSW-NB15):Function Skeleton (UNSW-NB15):import pandas as pd def train_model_unsw_return_scores(train_df, test_df) -> pd.DataFrame:“””Trains a model on the UNSW-NB15 training data and returns predictedprobabilities for the test data. Args:train_df (pd.DataFrame): UNSW-NB15 training data with ‘class’ column.test_df (pd.DataFrame): UNSW-NB15 test data without ‘class’ column. Returns:pd.DataFrame: DataFrame with ‘index’ and ‘prob_class_1’ columns.“””# TODO: Implement the model training and prediction logic as described above.test_scores = pd.DataFrame()  # Replace with your implementationreturn test_scoresDeliverablesDataset InformationClaMP DatasetUNSW-NB15 Dataset 
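Two illustrative sketches follow. They are not the graded solutions, and they make assumptions that may not match the autograder exactly; treat them as starting points only.

First, the K-means sketch referenced in Task 3: it assumes Yellowbrick's KElbowVisualizer is used to pick k on the training features (the random_state of 0 is an assumption mirroring the PCA note), and then a scikit-learn KMeans model is fit with that k.

import pandas as pd
from sklearn.cluster import KMeans
from yellowbrick.cluster import KElbowVisualizer

def fit_kmeans(train_features: pd.DataFrame, random_state: int = 0):
    # Let Yellowbrick search a range of k values and report the elbow.
    visualizer = KElbowVisualizer(KMeans(random_state=random_state), k=(2, 10))
    visualizer.fit(train_features)
    best_k = visualizer.elbow_value_
    # Refit a plain KMeans with the chosen k; fit_predict returns the train cluster ids.
    model = KMeans(n_clusters=best_k, random_state=random_state)
    train_cluster_ids = model.fit_predict(train_features)
    return model, list(train_cluster_ids)   # model.predict(test_features) gives test ids

Second, a minimal end-to-end sketch for Task 5's train_model_return_scores. The 'class', 'index' and 'malware_score' column names come from the assignment text above; the choice of RandomForestClassifier and its settings are assumptions, not a prescribed model, and any tuning should be done locally.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def train_model_return_scores(train_df: pd.DataFrame, test_df: pd.DataFrame) -> pd.DataFrame:
    # Split the training frame into features and the 'class' label column.
    X_train = train_df.drop(columns=["class"])
    y_train = train_df["class"]
    # An assumed baseline model; tune hyperparameters locally, not in Gradescope.
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    # Probability of the positive (malware) class for each test row.
    proba = model.predict_proba(test_df)[:, 1]
    return pd.DataFrame({"index": test_df.index, "malware_score": proba})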
