FIN2010 Problem Set 1
Due 2025-2-24 at 5PM

1. Suppose you took out a $100,000 15-year fixed-rate mortgage at 4.5% (APR) 3 years ago. Now the market interest rate has dropped to 4%, and you are considering refinancing your mortgage. Hint: refinancing means that you take out a new loan and use it to pay off the old loan.
(1) What was the original monthly payment?
(2) Suppose you just made the 36th monthly payment on the original mortgage today. What is the remaining mortgage balance?
(3) If you refinance the mortgage with another bank that offers 4% (APR) and keep the remaining term (that is, 12 years until the mortgage is paid off), what would the new monthly payment be?

2. A foundation announces that it will offer a CUHK(SZ) scholarship to one student. The first scholarship payment is made exactly one year from now. Once the scholarship is awarded, the student receives ¥100,000 annually for a period of four years, beginning on the date the scholarship is awarded. The student is then expected to repay the principal amount received (¥400,000) in 10 equal annual installments, interest-free, starting two years after the last scholarship payment. This means the foundation is really giving an interest-free loan under the guise of a scholarship. The discount rate is 6% and is expected to remain unchanged.
(1) What is the PV of the CUHK(SZ) scholarship from the perspective of the student (including both the money provided by the foundation and the repayments made to the foundation)?
(2) The foundation plans to award this scholarship to one student every single year forever. (The first recipient receives the first payment of ¥100,000 one year from today, the second recipient receives their first payment of ¥100,000 two years from today, etc.) How much money does the foundation need today in order to fund all of these scholarships?

3.
The annual membership fee at your health club is $750 a year and is expected to increase at 5% per year. A life membership costs $7,500 and the discount rate is 12%. You either pay the annual membership every year (starting right now) or pay for the life membership right now. What is the minimum life expectancy that would justify taking out the life membership?

4. You are considering buying a car worth $30,000. The dealer, who is anxious to sell the car, offers you an attractive financing package. You have to make a down payment of $3,500 and pay the rest over 3 years with monthly payments. The dealer will charge you interest at a constant APR of 2%, which is lower than the market interest rate other dealers offer.
(1) What is the monthly payment to the dealer?
(2) The dealer offers you a second option: you pay cash, but get a $2,500 discount. Should you take the loan or pay cash? Assume that the annual market interest rate other dealers offer is 5% (APR).

5. Your cousin is entering medical school next fall and asks you for financial help. He needs $65,000 at the beginning of each school year for the first two years. After that, he is in residency for two years and will be able to pay you back $10,000 each year (paid after working for 1 year and 2 years, respectively). Then he graduates, becomes a fully qualified doctor, and will be able to pay you $40,000 each year (paid at the end of each year). He promises to pay you $40,000 for 5 years after he graduates. Are you taking a financial loss or gain by helping him out? Assume that the interest rate is 5% and that there is no risk of him breaking his promise.

6. Currently a 20-year Treasury bond with a 4% semiannual coupon trades at a yield of 5% (APR).
(1) Is the price above or below 100% of par value?
(2) Calculate the current price of the bond as a percentage of par value.
(3) If the yield suddenly increased by 0.1%, how much would the price change?
Would the price increase or decrease?
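Mechanically, every part of Problem 1 reduces to the same two annuity formulas: the level payment of a fully amortizing loan, and the outstanding balance as the present value of the remaining payments. The following sketch illustrates those formulas in Python with deliberately generic numbers (a hypothetical $10,000 loan), not the inputs of any problem above:

```python
def monthly_payment(principal, apr, years):
    """Level payment of a fully amortizing fixed-rate loan.

    The APR is a nominal annual rate compounded monthly,
    so the periodic rate is apr / 12 over years * 12 periods.
    """
    r = apr / 12
    n = years * 12
    return principal * r / (1 - (1 + r) ** -n)

def remaining_balance(principal, apr, years, payments_made):
    """Outstanding balance = PV of the remaining payments."""
    r = apr / 12
    pmt = monthly_payment(principal, apr, years)
    k = years * 12 - payments_made  # payments still to come
    return pmt * (1 - (1 + r) ** -k) / r

# Illustrative inputs only: a $10,000 loan at 6% APR over 5 years.
pmt = monthly_payment(10_000, 0.06, 5)
print(round(pmt, 2))  # about 193.33
```

For these illustrative inputs the payment works out to about $193.33, and the balance after the final payment is zero, which is a quick sanity check on both functions.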
INF20016 Big Data Management

Aim
This unit covers the management perspective of contemporary data management issues (particularly big data) in an organisational/business context. You will be introduced to issues that arise when data is gathered from multiple sources in various formats for many diverse purposes, as well as the relevant managerial, organisational, governance and Information Technology (IT) strategy issues. You will explore why key aspects of contemporary data management, such as master data management, cloud storage, social media data, non-relational databases, data warehouses, Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS), should not be regarded as exclusively techno-centric concepts, but also as business/management considerations.

Unit learning outcomes (ULOs)
Students who successfully complete this unit should be able to:
1. demonstrate an understanding of the complexity pertaining to data, big data and the data lifecycle, which may include data quality, data governance, acquisition and procurement, and legal, ethical, risk and security issues
2. analyse and evaluate appropriate data management solutions for specific business needs and requirements
3. demonstrate an understanding of big data management as an enabler of business agility and innovation
4. demonstrate critical thinking, problem solving, and the ability to communicate effectively as a professional and function as an effective leader or member of a diverse team.

Graduate attributes
This unit may contribute to the development of the following Swinburne Graduate Attributes:
· Verbal communication.
· Communicating using different media.
· Collaboration and negotiation.
· Teamwork roles and processes.
· Information literacy.
· Technical literacy.

Set text
There is no prescribed text for this unit. All essential and/or additional readings will be provided in each relevant module.
Unit improvements
Swinburne Online strives to continuously improve our units in order to provide a high-quality student experience. Please provide feedback through the Student Feedback Survey and our online teaching staff to help us make improvements to this unit.
· More support and tutorials have been provided to assist with Tableau throughout the unit.
· Editing and minor revisions have been made to improve clarity.
· Assignments have been updated to meet accreditation requirements and align with on-campus delivery.
· More interactivity has been added to the learning materials to facilitate a constructivist approach to learning.

Active learning
You will be engaged in an active learning environment, undertaking regular online discussions and guided through the learning process by expert teaching staff who will provide regular feedback. On average, you will need to dedicate 12.5 hours each week to your learning, including readings, discussions with peers, and assignments.
DTS311TC Final Year Project
School of AI and Advanced Computing
Stage 4 | Level 3

SECTION A: Basic Information

Brief Introduction to the Module
The Final Year Project (FYP) is a two-semester-long project delivered through the DTS311TC Final Year Project module. It accounts for ten credits, making it the largest single module currently, and is worth 25% of the total credits available for Year 4. It is therefore a major contributing component of the BEng final degree classification. The FYP acts as a summative assessment of a student's attainment of the Programme Learning Outcomes, i.e. how well-rounded you are as an XJTLU AIAC graduate. It is an opportunity for a student to integrate all the knowledge accumulated through the four years of study. At the same time, the student must demonstrate their competencies before joining the computing professionals around the world, or demonstrate satisfactory research potential to further their research career. There are TWO assessment items which make up the FYP: 1) Proposal Report (10%) and 2) Dissertation (90%). To achieve the most rewarding FYP experience, students are advised to carefully follow the assessment/submission schedule detailed in the Module Handbook and observe important messages posted on LEARNING MALL, along with email notices and announcements. Regular meetings and effective communication with the supervisor are particularly essential to a successful FYP project.
Key Module Information
Module name: Final Year Project
Module code: DTS311TC
Credit value: 10
Semester in which the module is taught: ACYR
Pre-requisites needed for the module: N/A
Programmes on which the module is shared: BEng Data Science and Big Data Technology with Contemporary Entrepreneurialism

SECTION B: What You Can Expect from the Module

Educational Aims of the Module
The module aims to give students the opportunity to work in a guided but independent fashion to explore a substantial problem in depth, making practical use of principles, techniques and methodologies acquired elsewhere in the course. It also gives experience of carrying out a large piece of individual work and of producing a dissertation. Finally, it enhances communication skills, both oral and written.

Learning Outcomes
After completing the module, students should be able to:
A. specify a substantial problem, and produce a plan to address the problem
B. manage their time effectively so as to carry out their plan
C. locate and make use of information relevant to their project
D. design a solution to a substantial problem
E. implement and test their solution
F. evaluate in a critical fashion the work they have done, and place it in the context of related work
G. prepare and deliver a formal presentation
H. structure and write a dissertation describing their project.

Methods of Learning and Teaching
Students and their supervisors can decide on the delivery pattern. Some academic supervisors may have a group meeting at the start of the project if it is of benefit to the students, while others may only have one-on-one meetings with students. Students are required to have weekly or biweekly meetings with their supervisors. The schedule can differ slightly from one academic staff member to another and depends on the needs of individual students.
Students who never have discussions with their supervisor must submit all of their project materials (including all code and data) to the AIAC review committee for further investigation.
DTS311TC FINAL YEAR PROJECT

Abstract
This paper focuses on how applying blockchain technology, through decentralization and immutability, has made significant changes in different fields and offered protection for data exchanges. However, as real-world applications grow in complexity and the size of blockchain networks increases, it becomes very difficult to extract actionable behavior-related information from blockchain data. To address this problem, this paper outlines and develops an information retrieval system for behavioral analysis of blockchain data. The system is designed to efficiently search blockchain transaction data and apply data mining techniques to identify behavior patterns and, consequently, anomalies in these decentralized networks. The proposed solution combines data retrieval components based on Bloom filters and Merkle trees to enable efficient querying of the data stored in the blockchain. Moreover, the K-means clustering and Isolation Forest algorithms are applied to study user activities and detect abnormal activity. The system offers insight into transaction dynamics, wallet engagement and contract mechanics by analyzing massive data from public blockchains, including Ethereum. The system is benchmarked by running experiments that measure query throughput, clustering accuracy, and anomaly detection. Initial performance evaluations show that the system is capable of quickly processing large data streams and giving reasonably accurate behavior predictions in the areas of fraud prevention, market evaluation, and blockchain administration. This research enriches the field of blockchain analytics and adds an efficient and scalable tool for the analysis of blockchain activity.
In addition to improving the capabilities for collecting and processing blockchain data, the conclusions of this research open the door to future work on real-time processing of blockchain data, cross-chain behavior analysis, and privacy-preserving analysis of blockchain data.

1 Introduction

1.1 Introduction and Background
Various industries have been transformed through the deployment of blockchain as a means to record transactions in a secure, transparent, and decentralized way. It was first used in cryptocurrencies, but has found its way into finance, health care, supply chain management and governance, among others. As more complex blockchain networks emerge, they produce an enormous quantity of on-chain data representing transaction information, smart contract, wallet, and network activity. This increasing volume of data provides unique insights into how distributed systems behave, but at the same time raises questions concerning how to gather such data, how to process it, and which methods are adequate given the size of such networks. A key requirement is the capability of the underlying algorithms to analyze huge amounts of blockchain data effectively in order to make sense of high-level processes, identify issues such as failures, fraud or cyber attacks, and support sound decisions. Conventional information retrieval approaches are inadequate for blockchain data, which may be large in volume, immutable, and distributed across a decentralized system. Consequently, the need for fresh ideas and methods to address the retrieval of extensive big data on blockchains increases. To this end, this research seeks to fill the gap by proposing and developing an information retrieval system for behavior analysis in blockchain settings.
1.2 Scope and Objectives
The research is based on the construction of an information retrieval system charged with retrieving transactional information, wallet activity, contract interactions and other blockchain events that reflect user behaviors and trends. It thus allows for convenient extraction and behavioral-pattern analysis of on-chain blockchain data. First, the goal is to establish the means for querying and retrieving blockchain data for the purpose of behavior analysis, which requires building adequate data models and indices, as well as query algorithms, for data stored in blockchains. Subsequently, through the retrieval system, the research will concentrate on patterns of user behavior, transaction processing and contract engagement. By employing statistical and machine learning methods, the research will identify vital patterns and tendencies and predict future behaviors. Last but not least, the effectiveness of the proposed system will be evaluated using actual blockchain datasets. This includes performing accuracy checks, throughput tests, and parallel tests in different blockchain settings, including public cryptocurrencies and privately owned business blockchains. Finally, an important focus that accompanies this research is the evaluation of ethical considerations and security concerns regarding the analysis of blockchain data. The project will need to take into consideration factors such as privacy and transparency, and guarantee compliance with existing standards and regulations. Thus, the project will help to develop the emerging field of blockchain data analysis, with the resulting behavior analysis tool applicable to a wide range of fields, from market prediction to fraud detection and governance monitoring.
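Among the data models and indices the objectives refer to are Merkle trees, which summarize a block's transactions under a single root hash so that inclusion can later be verified. As a self-contained illustration of the general technique (not the project's actual code), a minimal Merkle-root computation in Python, using the common convention of duplicating the last node on odd-length levels:

```python
import hashlib

def sha256(data):
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Hash each leaf, then repeatedly hash adjacent pairs until
    one root remains; the last node is duplicated whenever a
    level has an odd number of nodes."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# Three hypothetical serialized transactions.
root = merkle_root([b"tx1", b"tx2", b"tx3"])
print(root.hex())
```

Because any change to any transaction changes the root, a stored root is enough to detect tampering, which is why the structure suits the verification role described above.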
2 Literature Review
As more kinds and sources of blockchain data become accessible, interest in techniques and tools for searching and utilizing information from distributed networks has risen. Blockchain analytics can be understood as a field that combines data analysis, IT, cryptography, and behavioral finance. In this section, the authors present the state of the art in blockchain data retrieval and behavior analysis, along with techniques put into practice in related work, underlining approaches, difficulties, and future directions.

2.1 Blockchain Data Retrieval Systems
Data retrieval has been a major issue in distributed blockchain analysis. Prior works have been directed towards developing efficient schemes for creating query systems and indexes to deal with the large amount of data associated with the blockchain.
1. Transaction and Block Indexing: The primary challenge in blockchain data retrieval is how to index transaction records and blocks in a manner that supports efficient querying. Traditional relational databases are not well suited to the unique structure of blockchain data, which is often stored as a linked list of blocks. Several approaches have been proposed, such as Bloom filters and Merkle trees, to efficiently represent and query large-scale blockchain data.
2. Blockchain-Specific Databases: Researchers have also explored the development of blockchain-specific databases that optimize retrieval tasks. For example, BigchainDB is a decentralized database that allows fast querying of blockchain data while maintaining decentralized features. Blockchain oracles are also frequently used to retrieve off-chain data and link it to on-chain information for broader analysis.
3. Decentralized Search Engines: As blockchain data grows, decentralized search engines have gained traction. These systems allow users to search across multiple blockchain networks without relying on a centralized entity.
Examples include Blockchair, a search engine that indexes Bitcoin, Ethereum, and other cryptocurrencies.

2.2 Blockchain Behavior Analysis
Behavioral analysis within blockchain networks has gained traction, especially because more and more blockchain-based platforms are in use for financial services, voting systems and other smart-contract-based decentralized applications (dApps). Great insight can be gained by analysing data on the blockchain to determine patterns in usage, transactions and contract engagement by users, as well as malicious activity.
1. Transaction Patterns and Anomaly Detection: A great body of work has been dedicated to the identification of undesirable activity in blockchain systems. Works such as Chen et al. (2020) present how to identify fraudulent transactions on cryptocurrency networks through the design of anomaly detection techniques. Graph-based methods are also used, where blockchain data is represented as a graph and community detection algorithms are applied to identify atypical transaction behavior or suspicious entities.
2. Behavioral Clustering: Another common approach in blockchain behavior analysis is the categorization of users according to their transaction activity. Machine learning algorithms, especially K-means clustering and DBSCAN, have been used to group users and study typical behaviour within blockchain networks. These methods are essential for defining common user profiles, such as typical transaction sizes, frequencies, or use of specific smart contracts.
3. Smart Contract Interactions: Smart contracts, as self-executing code on the blockchain, provide valuable data for behavioral analysis. Analyzing the execution of smart contracts can reveal patterns such as frequent interactions with specific contracts, user preferences, or abnormal contract activity.
Techniques such as static analysis and dynamic analysis of smart contract interactions are employed to identify vulnerabilities or unusual behaviors.
4. Market Behavior and Forecasting: The behavior of participants in blockchain-based markets, such as those involving cryptocurrencies and decentralized finance (DeFi), has been another area of intense research. Studies have focused on using blockchain data to predict market trends and behavior. Machine learning models, including time series analysis and reinforcement learning, are commonly applied to predict price movements and market dynamics in response to user actions.

2.3 Challenges and Gaps in Blockchain Behavior Analysis
Despite significant advances in blockchain data retrieval and behavior analysis, several challenges and gaps remain in the existing literature.
1. Data Privacy and Security: The transparent nature of blockchain transactions can be a double-edged sword. While it offers the benefit of auditability, it also raises concerns about user privacy and data security. Current research has yet to fully address how to perform behavior analysis while maintaining privacy and compliance with regulations such as GDPR.
2. Scalability Issues: As blockchain networks continue to scale, the sheer volume of transactions presents scalability challenges for behavior analysis systems. Current systems struggle to maintain performance and efficiency when dealing with large-scale blockchain networks like Bitcoin or Ethereum, which generate millions of transactions daily.
3. Interoperability Across Blockchains: Many existing studies focus on individual blockchain networks, such as Bitcoin or Ethereum, but there is a lack of systems capable of analyzing behavior across multiple blockchains. This limits the applicability of current approaches to the broader multi-chain ecosystem emerging in the blockchain space.
4. Real-Time Analysis: Real-time analysis of blockchain data, particularly for fraud detection and behavior prediction, remains a significant challenge. Many current methods rely on batch processing or offline analysis, which makes them unsuitable for time-sensitive applications like financial trading or risk monitoring.

2.4 Opportunities for Future Research
The intersection of blockchain data retrieval and behavior analysis presents ample opportunities for future research. Some promising directions include:
1. Advanced Machine Learning Techniques: The integration of deep learning, reinforcement learning, and graph neural networks in blockchain behavior analysis could significantly improve the accuracy and scalability of predictive models and anomaly detection systems.
2. Cross-Chain Data Integration: Future systems could focus on integrating data from multiple blockchains, allowing for cross-chain behavior analysis. This could be particularly useful for applications in multi-chain decentralized finance (DeFi) ecosystems, where assets and transactions move across different blockchains.
3. Privacy-Preserving Techniques: Research into zero-knowledge proofs and homomorphic encryption could lead to privacy-preserving methods for behavior analysis, enabling the extraction of meaningful insights from blockchain data without compromising user privacy.
4. Real-Time Blockchain Analytics: Developing systems that can handle real-time data streams from blockchain networks will be essential for applications in trading, risk management, and regulatory compliance.

3 Project Plan

3.1 Proposed Solution / Methodology
The major objective of the proposed project is to design an information retrieval system capable of processing and analyzing blockchain data for behavioral purposes. The following methodology will be used to achieve this objective:

Data Collection and Preprocessing:
Data Sources: The work will involve public blockchains, including Bitcoin and Ethereum databases.
These networks provide rich transaction details encompassing transaction information, wallets, and contracts. Data samples will be collected from public blockchain explorers or, where necessary, through interaction with the actual networks (e.g., via the Ethereum JSON-RPC API).
Preprocessing: The raw data will then be preprocessed to extract the required attributes: transaction timestamp, transaction amount, sender and receiver addresses, the addresses associated with smart contracts, and gas used. Preprocessing will also include data screening to exclude records with no analytical value, such as non-transaction data, and to standardize the data format for analysis.

Designing the Information Retrieval System:
Data Indexing: A custom indexing system will be developed to enable efficient querying of information in the blockchain domain. Bloom filters will be used to test for membership in the dataset, and Merkle trees to verify transaction inclusion. Indexing also takes into account important attributes such as wallet addresses, transaction types, and timestamps.
Query Mechanism: The system will incorporate a query function for obtaining blockchain information by parameters such as volume, contract engagement, and wallet transactions. The system will support basic queries, such as retrieving all transactions from a certain wallet, and complex queries, such as all interactions with a certain smart contract within a certain period of time.

Behavior Analysis using Data Mining Techniques:
Clustering: The project will use clustering algorithms such as K-means or DBSCAN to categorise blockchain users by their transaction records. This will help establish normal user behaviour so that abnormal behaviour, which could be the result of fraudulent actions, can be detected.
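The clustering step just described can be sketched with a minimal K-means (Lloyd's algorithm) implementation. The project itself would use Scikit-learn; this pure-Python version operates on tiny synthetic "wallet feature" vectors (e.g., transaction count and average value), which are illustrative assumptions rather than real data:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on 2-D feature vectors:
    alternate assigning points to the nearest center and
    moving each center to the mean of its cluster."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:  # keep a center in place if its cluster emptied
                centers[j] = (sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl))
    return centers, clusters

# Two synthetic groups of (tx count, avg value) wallet features.
low = [(1.0, 2.0), (1.2, 1.8), (0.9, 2.1)]
high = [(10.0, 12.0), (10.5, 11.5), (9.8, 12.2)]
centers, clusters = kmeans(low + high, k=2)
```

With such clearly separated groups, the two recovered centers land near the means of the low-activity and high-activity wallets, which is exactly the "normal profile" baseline the methodology uses to flag deviations.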
Anomaly Detection: Statistical analysis and machine learning will be used to search for anomalous patterns in the blockchain data. Machine learning algorithms for anomaly detection, such as Isolation Forest or autoencoders, will make it possible to detect transactions or behaviours that appear suspicious and may indicate fraudulent activity.

Implementation Tools and Frameworks:
Programming Languages: Python will be used due to its rich set of libraries for data analysis, machine learning, and interacting with blockchains (e.g., Web3.py for Ethereum).
Machine Learning Libraries: For the clustering and anomaly detection algorithms, Scikit-learn and TensorFlow will be employed.
Blockchain Interaction: Ethereum data will be queried using Web3.py; for Bitcoin, we will use scripts that scrape blockchain explorers.

3.2 Experimental Design
Dataset Selection and Preparation: A test dataset retrieved from the Ethereum network will be chosen so as to contain both simple transfer operations and smart contract interactions. The dataset will cover a 1-2 month period to capture different behaviors under different network conditions. The following attributes will be extracted during preprocessing: transaction amount, sender and receiver wallet addresses, date and time, and gas consumption.

Information Retrieval System Testing:
System Efficiency: The first activity is to examine the performance of the retrieval system in handling various types of queries. These will include requests to get transactions from specific addresses, the number of transactions made over a certain timeframe, and contract interaction data. The system response time and the precision of the results will be evaluated.
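The anomaly detection step described in the methodology (Isolation Forest or autoencoders) can be illustrated with a deliberately simple, self-contained stand-in: a median-absolute-deviation (MAD) outlier check, which shares the same fit-on-normal, flag-the-deviant idea. The function name and the synthetic transaction amounts are assumptions for illustration, not part of the proposed system:

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Flag values far from the median, measured in robust
    MAD units; 0.6745 rescales the MAD so the score is
    comparable to a z-score on normal data."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [v for v in values
            if mad and 0.6745 * abs(v - med) / mad > threshold]

# Synthetic transaction amounts with one obvious outlier.
amounts = [1.0, 1.2, 0.9, 1.1, 1.05, 0.95, 1.15, 500.0]
print(mad_outliers(amounts))  # -> [500.0]
```

Using the median rather than the mean keeps a single extreme value from masking itself by inflating the scale estimate, which is why robust scores are a common baseline before heavier models such as Isolation Forest.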
Query Performance: Retrieval performance will be measured using parameters such as response time for a given query and precision and recall rates when working with large databases. This will include operating on progressively larger datasets (for example, scaling from ten thousand up to one hundred thousand transactions).

Behavior Analysis Experimentation:
Clustering Analysis: To evaluate user behaviour, the K-means clustering method will initially be adopted to classify blockchain addresses and provide a better explanation of transaction patterns. The Silhouette Score will be calculated as the quantitative measure of how well the data points have been clustered and whether clear behavioral patterns can be observed for each cluster.
Anomaly Detection: An Isolation Forest anomaly detection model will be trained on a sample of normal transaction data. The model will then be applied to a separate test dataset to detect outliers. The True Positive Rate (TPR) and False Positive Rate (FPR) of the anomaly detection system will be computed.

Ethical and Security Considerations: The project will ensure that all data used complies with applicable privacy and security laws. Even though the data on a blockchain is open to the public, the study will refrain from collecting unnecessary personally identifiable information (PII) or data that could be used to infer PII. Personal information will be removed where necessary, and any conclusions or recommendations concerning privacy issues will be reported in the final report.

Comparison with Baseline Systems: Existing blockchain data retrieval systems will be used as baselines to benchmark the performance of the proposed system in terms of speed, accuracy and scalability.
This comparison will establish a point of reference for the evaluation of the proposed system.

3.3 Expected Results
The expected outcomes of this project are as follows:
Efficient Information Retrieval: The retrieval system is expected to answer queries over blockchain data within reasonable time limits (up to 5 seconds in this case). The indexing mechanisms should enable the system to operate at large volumes with little degradation of effectiveness.
Behavioral Insights and Clustering Results: The clustering analysis should reveal distinct behaviors within blockchain networks that have not been previously identified. Users should be categorized into groups such as active traders, contract-focused users, and occasional users. The patterns revealed in this research will help to characterise the typical behavior of blockchain users.
Anomaly Detection Accuracy: The anomaly detection system should confidently identify outliers, such as fraudulent transactions or abnormally behaving users. The models are expected to yield at least an 80% True Positive Rate [1], while the False Positive Rate will be maintained below 5%.
System Performance: An efficient information retrieval system should scale, i.e., efficiently query large blockchain datasets without query performance degrading considerably as the dataset size increases. The system will also demonstrate scalability in that it will be able to respond to the simplest as well as the most complex queries.
Insights into Blockchain Behavior: Expected outputs of this project include insights into user behaviors, particularly concerning usage of blockchain networks, the frequency of transactions, and engagement with contracts and wallets, among others.
The findings of this analysis could be helpful across use cases such as market prediction, fraud detection and decentralized governance. In conclusion, the proposed solution is expected to deliver a reliable, effective, and easily scalable system for the mining and analysis of blockchain data, thereby advancing the field of blockchain analytics.

3.4 Gantt Chart
Figure 3.1 Design of Experiments flowchart

4 Conclusion
This research presents the basic idea and approach for the development of an information retrieval system for behaviour analysis in blockchain networks, motivated by the decentralized and immutable characteristics of blockchain technologies. Through this study, this work seeks to propose an effective framework for querying and processing blockchain data, suitable for understanding user characteristics, transaction dynamics and relations in blockchain systems, which can be employed in a variety of contexts such as fraud and anomaly detection, market analysis and network management. The project aims to build on existing data search techniques such as Bloom filters and Merkle trees, alongside machine learning techniques including clustering and anomaly detection, in order to improve the existing methods of extracting informative patterns from large-scale blockchain datasets. The proposed strategy for the analysis of blockchain data is intended to be scalable and efficient, able to process considerable amounts of data and provide the user with adequate and valuable information. The experimental design will assess the proposed system with respect to retrieval efficiency, clustering, and anomaly detection.
This step also evaluates the overall performance of the proposed system, investigating its effectiveness in identifying user behaviors of interest and potential fraudulent or malicious activity in practical blockchain datasets, including Ethereum transactions. Overall, this research will contribute to the developing field of blockchain analytics with a unified and effective approach for information retrieval and processing. The findings of this research will increase knowledge of blockchain ecosystems and support decision making in areas such as financial markets, compliance, and decentralized management. The results will also open new directions for future studies of blockchain data analysis covering privacy-preserving, real-time, and cross-chain behaviors.

References

[1] Liu H, Han D, Li D. Behavior analysis and blockchain based trust management in VANETs[J]. Journal of Parallel and Distributed Computing, 2021, 151: 61-69.
[2] Li M, Zhu J, Zhang T, et al. Bringing decentralized search to decentralized services[C]//15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). 2021: 331-347.
[3] Kamal Z A, Fareed R. Data retrieval based on the smart contract within the blockchain[J]. Periodicals of Engineering and Natural Sciences (PEN), 2021, 9(4): 491-507.
[4] Ali A, Pasha M F, Fang O H, et al. Big data based smart blockchain for information retrieval in privacy-preserving healthcare system[M]//Big Data Intelligence for Smart Applications. Cham: Springer International Publishing, 2022: 279-296.
PSYCH 402 Exam 1 Long Answer (5 points each) 1. You and your research mentor just finished collecting data for a project. Before conducting analyses, your mentor wants to know what indices of central tendency would be most useful for describing the variables of interest. The variables of interest include Externalizing Behavior (e.g., aggressive behavior, fighting), Teacher–Child Conflict, Parental Warmth, and the Perceived Stress Scale. Histogram plots are shown below. For each variable, describe which indices of central tendency would best describe a commonly occurring score and why. 2. Let’s pretend that a researcher, Jackie Daytona, has SAT scores from the entire population! Jackie notes that scores from the population are not normally distributed (shown in a histogram below). Let’s also pretend that Jackie Daytona repeatedly drew samples of 40 students from the population data shown in the histogram. Jackie believes that the distribution of sample means from his repeated sampling will not be normally distributed because the population data is not normally distributed. Is Jackie correct or not, and why? According to what theorem is Jackie correct or not? 3. You are working in a research lab that examines the development of adolescent risk-taking behavior. In the lab, adolescents complete a simulated driving game to measure risk-taking behavior. A postdoc in the lab wants to know if adolescents engage in more risk-taking behavior than the general population. He collected data from 28 adolescents and found that the average risk-taking score was 82.3 (alternative mean). The average risk-taking score from the general population is 80.4 (the null mean). From the example above, how would you decide if 82.3 is statistically different from 80.4 from a null-hypothesis significance testing perspective?
Your response should elaborate on the following: the research hypothesis, the statistical hypothesis, the null distribution, the alpha level and its interpretation, the p-value, and Type I error rates. Ultimately, what information do you need to determine statistical significance, and what are the possible outcomes of that decision (e.g., types of errors)? Short Answer (2 points each) 4. What is a Type II error? How is it different from a Type I error? 5. Define and describe dispersion. Please also define and describe the average deviation, variance, and standard deviation. Are there any problems with the interpretation of these metrics for dispersion that the others resolve? 6. You have a bag of chips with various colors. The probability of drawing a green chip is very low (let’s say there is a 2% chance of drawing a green chip). You bet your friend that he will pay for lunch if you draw the green chip. You end up drawing the green chip, and your friend accuses you of cheating because “the green chip has the lowest probability of being drawn.” Is your friend correct in accusing you of cheating? Explain your reasoning. 7. You are helping a graduate student review a paper. In the paper, the researchers report the mean for a categorical variable coded as 0 = “Did not attend a charter school” and 1 = “Attended a charter school.” The mean is reported as 0.72. The graduate student says to you, “I do not know why they reported a mean for that variable. It is categorical!” Is the graduate student correct? Does the mean for categorical variables have an interpretation? If so, what is the interpretation? 8. Identify and define each component of the boxplot shown in red letters below. 9. What are some key similarities and differences between the standard deviation and the standard error of the mean? How might they apply to the distribution of the observed scores (i.e., data from a sample of participants that you collected) or the distribution of sample means?
Math 54 Small Project Steps for How to Input Your Data into Excel and Run the Regression Analyses Note: If you are using data from the first website on the data lists handout, you can skip step 1, since collecting that data is straightforward. Only follow step 1 if you are using data from the second website on the data lists handout. 1. How to paste data obtained from an online source into Excel in an organized fashion: a) First, to input your data into an Excel spreadsheet, open a new Excel file. Go to the web link for the project topic you desire, then select and copy the data in the file. b) After copying your selected data from the website, go into the Excel spreadsheet and click on the first cell, cell A1. Then use the paste command to put your data into the spreadsheet. c) Now you will notice that although the data looks like it is organized column by column as you would like it to appear, this is actually not the case: everything is spaced out within the first column rather than spread across multiple columns. To fix this, click on the Excel DATA tab at the top of the screen, and then click the Text to Columns button (located near the middle of the ribbon). d) Now your data is organized in columns as it should be, so you can select the data columns for the variables you are analyzing. Select and copy the data in each of the two columns of interest (each done separately), and paste them into a new spreadsheet so that the new sheet contains only the two columns of data you wish to analyze. You can open a new spreadsheet by clicking the small plus-sign button at the bottom left of the screen. e) Your data is now organized as it needs to be to perform your regression analysis. Next are the steps for the regression analysis itself. 2.
You will need to install the Data Analysis ToolPak in your version of Excel, unless it is already installed. This can be done in a few simple steps: a) Click the File tab, and then click Options. b) Click Add-Ins, and then in the Manage box, select Excel Add-ins. Now click Go. c) In the Add-Ins available box, select the Analysis ToolPak check box, and then click OK. d) If you are prompted that the Analysis ToolPak is not currently installed on your computer, click Yes to install it. e) After you load the Analysis ToolPak, the Data Analysis command is available in the Analysis group on the Data tab. 3. Carrying out the necessary least-squares regression analysis, creating the scatter and residual plots, and more: a) First, select the DATA tab, and then click the Data Analysis button. b) A popup will appear with a list of options. Click Regression and then click OK. c) A dialog box will ask for your X variable and Y variable inputs. Select the range of cells in your spreadsheet corresponding to your X variable (explanatory variable) data and the range corresponding to your Y variable (response variable) data. For example, if the X variable data are in cells A1 to A15, type "A1:A15" for the range, and if the Y variable data are in cells B1 to B15, type "B1:B15". Always use a colon ":" to tell Excel to include all cells between the starting cell and the ending cell. Important note: Before finishing step c), under the "Residuals" section of the dialog box, be sure to check the "Residual Plots" and "Line Fit Plots" boxes; this is needed for Excel to generate your Scatter Plot and Residual Plots in the output.
d) Now click the OK button, and you will have your Excel Regression Summary Output to use for the written part of the project and to find your regression line. Note that the values under the "Coefficients" column of the bottom table give you the values of b0 and b1 in the least-squares regression equation y = b1x + b0. Lastly, use the p-value column (for the X Variable row) to determine whether or not there is a significant linear relationship between X and Y. Note: Once all of your summary output tables and graphs are produced, be sure to re-label the title and axes of your Scatter Plot and Residual Plot.
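As an optional sanity check on Excel's Summary Output, the slope and intercept can also be computed directly from the least-squares formulas. The short Python sketch below is illustrative only and is not part of the project requirements; the x and y lists are made-up placeholder data, so substitute your own two columns.

```python
# Sketch: reproduce the slope (b1) and intercept (b0) that Excel's
# regression Summary Output reports, using the least-squares formulas.

def least_squares(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # sum of cross-deviations and sum of squared x-deviations
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx             # slope (Excel: "X Variable 1" coefficient)
    b0 = mean_y - b1 * mean_x  # intercept (Excel: "Intercept" coefficient)
    return b0, b1

# Hypothetical placeholder data (perfectly linear, so the fit is exact)
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
b0, b1 = least_squares(x, y)
print(b0, b1)  # 0.0 2.0 for this data
```

The p-value that Excel reports for the X Variable row comes from a t-test on b1; Excel computes that for you, so the formulas above are only needed to verify the coefficients.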
Game Design Optimisation GAV4056-N Endless Runner Game Deadline Date: 6th January 2025 ICA ASSESSMENT ASSIGNMENT TITLE Endless-Runner Game DELIVERABLES You are to create an endless runner prototype game for PC within Unreal Engine. The endless runner must contain certain requirements that will allow you to show optimal game design techniques (see later pages); however, it may be of any theme or design you decide. The game prototype should also be targeted to play on a mobile device in addition to PC, making changes to your design where necessary to accommodate the change in platform. This should be submitted as an Unreal Project but must be accompanied by a video that details what you have implemented, shows your game in action, and describes blueprint design, class design, and algorithm implementation. DEADLINE The deadline for ICA submissions is: 6th January 2025 | 4pm | Blackboard Endless-Runner Guidance Your endless runner should have multiple game design choices that are optimal for creating a game to industry expectations. Each of your prototypes must include all of the Must Have implementations and at least two of the Should Have implementations from the list below. You may include more from the Should Have list and/or include more advanced implementations from the Could Have list, or research and implement another optimal game design inclusion for further marks. Must Have (Your Game MUST have all of these) ● Be an endless runner (with the path the player will take being determined through some sort of algorithm, not a level you have completely designed) ● Have at least three types of pickup/collectable, all with distinctive differences (for example, two collectables that both give points, one more than the other, would not count) ● Have at least two interactive NPCs/enemies, all with distinctive differences (for example, the same enemy with a different texture would not count) ● Be playable on a PC and Mobile platform.
(through emulation or physical device) ● Have a control scheme and UI that change per platform (PC and Mobile). ● Include a main menu and a highscore board, with optimised layouts per platform. ● Have clear, well-designed Blueprint classes that use concepts such as parent and child classes, interfaces, and override functions. Should Have (Your Game MUST have at least two of these) ● During gameplay, the Spawn function or the creation of new objects is not used / does not take place ● Highscores are saved and stored, with considerations taken for cheating and storage space. ● The layout of a given endless runner session can be saved/remembered in some way and later reloaded, so the player can play the exact same endless runner level they have played again ● A software design pattern has been utilised, is explained in blueprint comments and in an accompanying document (500 words) that shows figures and describes how the pattern is used. You may not use the Blackboard pattern for this. Could Have (Your Game can include these for further marks) ● Networking (Multiplayer? Scoreboard sharing? Login system?) with additional marks given for cross-platform networking ● Additional game modes that significantly change gameplay and control scheme for both PC and Mobile platforms (Gyroscope? Controller?) ● An additional game design optimisation you have researched and justified To reiterate, at MINIMUM your project must contain ALL of the concepts within the “Must Have” list and at least TWO of the concepts from the “Should Have” list. ICA SUBMISSION ENDLESS RUNNER HAND-IN Project submissions will be made digitally through a link made available on Blackboard. Please ensure you have checked your work, the assessment deliverables, the marking criteria and the guidance checklist prior to submission. The details for submission have been outlined below and should be zipped into a .zip format: ● A video that shows off your game on PC and Mobile.
The video file should be no longer than 10 minutes but should show all of the concepts you have implemented from the lists, differences between your Mobile/PC versions, and show in-editor blueprint images or a walkthrough. ● A read-me.txt file that indicates which version of Unreal was used and any references to assets used to help create the game ● Any additional supporting files should be included within the .zip too ● A link (or included with the submission) to your project that can be downloaded if needed ○ The file should be named: STUDENTNUMBER-GDO-Project.zip NOTE ABOUT DIGITAL SUBMISSION It is understandable that due to Blackboard file limitations, some of you may not be able to submit your project via Blackboard. In this instance, you must submit a link to download the project via an online service (such as the University-provided OneDrive). If this is the case, please test this link. If the module team cannot download your project because of permission errors you have made, your work will be unmarked. This download link should remain active for one year, and not be edited after the hand-in date. Any edits made to the link after the hand-in may result in your work being unmarked. ICA EXTENSIONS DETAIL There is the possibility of an extension being granted which extends the initial deadline date, with no penalty. Following this revised deadline, a one week grace period will commence. In order to request an extension, contact your tutor in the first instance, who may refer you on to your module or course leader. Please note that extension requests require evidence and all of the following are NOT VALID reasons for requesting an extension: ● Study related circumstances (personal equipment failure, printer problems, failure to take a back-up copy of work, misreading the examination timetable, oversleeping, taking the wrong examination). ● Normal exam stress or anxiety experienced during revision or the assessment period.
● Personal disruptions within a student’s control (moving house, change of job, normal job pressure, holidays, weddings, failed travel arrangements, financial issues, poor time-management, routine medical appointments, disruption to routine caring responsibilities). ● Grounds of religion, unless notification was given at the start of the academic year. ● Foreseeable and preventable circumstances. ● Statement of a medical condition without reasonable evidence (medical or otherwise) to support the case. ● Complaints against staff or in relation to delivery of the module/programme. (These are managed through the University’s Student Complaints Policy and Procedure). ● Medical circumstances outside the relevant assessment or learning period for which appropriate adjustments for extenuating circumstances have already been made. ● Long term health condition, for which a student is already receiving reasonable and appropriate adjustments. ● Medical condition supported only by retrospective evidence (such as a doctor’s note stating that a student was seen after the illness, and that a student declared they had been ill previously). ● Late disclosure of circumstances, where a student could reasonably be expected to have contacted a member of staff about the problem, but did not do so.
Game Design Optimisation GAV4056-N Report Deadline Date: 7th January 2025 ICA ASSESSMENT ASSIGNMENT TITLE Report DELIVERABLES Your report element will analyse each of the major design choices you have made throughout the creation of your game and justify why they are optimal. The report should also discuss the differences between your PC-targeted version and Mobile-targeted version, and include references to support your justification. DEADLINE The deadline for ICA submissions is: 7th January 2025 | 4pm | Blackboard Report Guidance Your report should be 1500 words in length (excluding any title pages or references list). The report should detail your main game design choices and, importantly, justify why you believe they are optimal through the use of references, figures, and comparisons with other techniques. Additionally, your report should detail the differences between the PC version and Mobile version of your prototype, explaining why certain elements differ between versions and justifying why these decisions were made. References and figures should be professionally presented in an academic style (for example, Harvard) and should be easy to read and understand, with some consideration taken to how the document is laid out and organised. ICA SUBMISSION REPORT HAND-IN Report submissions will be made digitally through a link made available on Blackboard. Please ensure you have checked your work, the assessment deliverables, the marking criteria and the guidance checklist prior to submission. The details for submission have been outlined below: ● Your report in either a .doc, .docx, or .pdf format (.pdf preferred). ● The report should include a title page with your name on it to identify the report as your own. ICA EXTENSIONS DETAIL There is the possibility of an extension being granted which extends the initial deadline date, with no penalty. Following this revised deadline, a one week grace period will commence.
In order to request an extension, contact your tutor in the first instance, who may refer you on to your module or course leader. Please note that extension requests require evidence and all of the following are NOT VALID reasons for requesting an extension: ● Study related circumstances (personal equipment failure, printer problems, failure to take a back-up copy of work, misreading the examination timetable, oversleeping, taking the wrong examination). ● Normal exam stress or anxiety experienced during revision or the assessment period. ● Personal disruptions within a student’s control (moving house, change of job, normal job pressure, holidays, weddings, failed travel arrangements, financial issues, poor time-management, routine medical appointments, disruption to routine caring responsibilities). ● Grounds of religion, unless notification was given at the start of the academic year. ● Foreseeable and preventable circumstances. ● Statement of a medical condition without reasonable evidence (medical or otherwise) to support the case. ● Complaints against staff or in relation to delivery of the module/programme. (These are managed through the University’s Student Complaints Policy and Procedure). ● Medical circumstances outside the relevant assessment or learning period for which appropriate adjustments for extenuating circumstances have already been made. ● Long term health condition, for which a student is already receiving reasonable and appropriate adjustments. ● Medical condition supported only by retrospective evidence (such as a doctor’s note stating that a student was seen after the illness, and that a student declared they had been ill previously). ● Late disclosure of circumstances, where a student could reasonably be expected to have contacted a member of staff about the problem, but did not do so.
Introduction. In this homework, we will use DataFrames to represent collections of feature vectors. Each row will be a feature vector. Our row labels will be either datetime objects or traditional integer indices, and our column labels will be feature names. We will be examining weather data from Tucson International Airport (TIA) for the thirty-year period 1987-2016. We will investigate this data using clustering, which is a form of unsupervised learning. All the models we have looked at before are supervised, meaning that we use labeled data (with a known target variable) to train the model. In unsupervised learning, we don’t have known values of the target variable (and the target variable may not even be well defined). Unsupervised learning, instead of looking for a “correct” prediction, looks for natural groups or patterns in data. Clustering is the unsupervised analogue of classification. In clustering, we don’t know what the class labels are, and we don’t necessarily know in advance what the class variable is supposed to represent. In this homework, we’ll implement a common clustering algorithm called k-means, which sorts points into clusters by their centroids – which is just a fancy word for the means of vectors. Each point gets assigned to the centroid that it’s closest to. We give the algorithm the data and a value of k – the number of clusters we want to find – and it does its best to find centroids that group the data into reasonable clusters. k-means is an iterative approximation algorithm, which means it starts with an initial guess and then takes steps to improve this guess bit-by-bit, until the steps no longer lead to an improvement. Once the update steps no longer do anything, we’re done and we return the centroids we found. We’ve seen an example of one of these algorithms before, although we didn’t implement it ourselves: the scipy function curve_fit uses the same idea. Instructions. Create a module named hw4.py.
Below is the spec for ten new functions that you must implement, and a few others that you don’t have to if you don’t want to. Code the new ones up, add the others, and upload your module to the D2L HW4 assignments folder. Unless the spec explicitly instructs you to, don’t do any error-checking; assume that valid arguments will be passed. Testing. Download hw4_test.py and the auxiliary testing files and put them in the same folder as your hw4.py module. Each of the ten functions is worth 10% of your correctness score. You can examine the test module in a text editor to understand better what your code should do if necessary; consider the test module to be part of the spec. Documentation. Your module must contain a header docstring containing your name, your section leader’s name, the date, ISTA 331 HW4, and a brief summary of the module. Each function must contain a docstring. Each function docstring should include a description of the function’s purpose, the name, type, and purpose of each parameter, and the type and meaning of the function’s return value.

ISTA 331 HW: K-MEANS CLUSTERING

Grading. Your module will be graded on correctness, documentation, and coding style. Code should be clear and concise. You will only lose style points if your code is a real mess. Include inline comments to explain tricky lines and summarize sections of code. Collaboration. Collaboration is allowed. You are responsible for your learning. Depending too much on others will hurt you on the tests. “Helping” others too much harms them in reality. Cite any sources or collaborators in your header docstring. Leaving this out is dishonest. Resources.
Here are some references that might be helpful: Pandas and scikit-learn docs: • https://pandas.pydata.org/pandas-docs/stable/api.html • https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html • https://scikit-learn.org/stable/modules/clustering.html#k-means • https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html#sklearn.preprocessing.MinMaxScaler • https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_assumptions.html • https://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-scaler • https://scikit-learn.org/stable/ About the data we’re working with: • https://www.ncdc.noaa.gov/ • https://www.ncdc.noaa.gov/data-access/land-based-station-data/land-based-datasets/global-historical-climatology-network-ghcn • https://www.ncdc.noaa.gov/ghcn-daily-description • https://www.ncdc.noaa.gov/cdo-web/search?datasetid=GHCND • https://catalog.data.gov/dataset/global-surface-summary-of-the-day-gsod Function Specifications. Functions for loading, cleaning, and processing data. • is_leap_year: takes an integer year and returns True if that year is a leap year, False otherwise. (Look up when leap years occur if you don’t know.) • euclidean_distance: takes two feature vectors and returns the Euclidean distance between them. • make_frame: This function doesn’t need to take any arguments. Make a DataFrame by calling read_csv and passing it the TIA_1987_2016.csv filename. Replace its index with datetime objects by assigning a pandas date_range object to the index, and then return the frame. Your returned frame should look something like the following: • clean_dewpoint: This function takes the frame created by make_frame. Looking at our datafile, we see that the dewpoints for March 10th and 11th, 2010, are -999. Those dates are missing in the GSOD data, so we have added them manually.
Replace those values in the DataFrame with the average of the dewpoints on those days for the other 29 years. You can ignore any other missing data (pandas has replaced them with NaNs, and that’s ok here). • day_of_year: This function takes a datetime object and returns the day of the year it represents as an int between 1 and 365. The nontrivial part: if the year is a leap year, return the day of the year as though it were not, unless the date is February 29; in that case, return 366. (Hint: look up the timetuple method for datetime objects. It returns an object with a useful instance variable, tm_yday. Of course, you need to recall how to access an instance variable.) • climatology: This function takes the data frame we have created and uses it to calculate 30-year averages of our feature variables for each day of the year. Instead of calculating these averages manually, we will use a piece of pandas functionality: the groupby method. This is a DataFrame method that takes a function as an argument (see footnote 1). Call the groupby method, passing it your day_of_year function. This will return a GroupBy object. All you need to know about this GroupBy object is that you can call its mean method and it will return a DataFrame indexed by the days of the year (1-365) whose values are the averages of the values in the original frame; then you can return this new frame. The frame should look something like this: • scale: This function takes a DataFrame and scales each of the features so that they run from 0 to 1. This is similar in spirit to calculating z-scores for each value, and often improves the performance of k-means; without this step, variables with higher spread would dominate the distance calculation. To save a bit of effort we will use some canned code from scikit-learn, a machine learning module.
Make sure to include this import: from sklearn.preprocessing import MinMaxScaler Then, in the scale function, instantiate a MinMaxScaler object with a line like scaler = MinMaxScaler(copy=False) and call its fit_transform method, passing it the frame you want to scale: scaler.fit_transform(df) (substituting the name of your argument for df if necessary). Your frame should now look like this: (Footnote 1: We first met the concept of passing a function as an argument in HW 3, in the r_squared function. The property of Python that allows us to do this is called “first-class functions”, and is shared by many, but not all, modern programming languages.) Functions implementing the k-means algorithm. As we saw in class, k-means uses a predict-update loop (a common technique in learning algorithms). Starting from initial guesses, the algorithm makes predictions, then uses those predictions to update its guesses, then uses the new guesses to make predictions, etc., etc. We alternate between predict and update steps until the update no longer changes the predictions, at which point we’re done. • get_initial_centroids: The k-means algorithm needs an initial guess for the centroids to start running. In common practice these guesses are chosen randomly, and the algorithm is run a few times with different starting points to try to find the best result. But we’ll just do a non-random process and pick k evenly spaced days. This function should take a climatology frame and a value of k and return a DataFrame indexed with standard integer indexing (0, 1, …, k - 1). Row i of this frame should contain the values from row i * (len(df) // k) + 1 of the climatology frame (the +1 is important because our climate frame is indexed from 1, not 0). • classify: this function takes a centroids frame and a feature vector (i.e. a row from the climatology frame) and returns the label of the cluster whose centroid is closest.
• get_labels: This function takes a DataFrame (in our case, intended to be a scaled climatology frame) and a centroids frame and returns a Series that maps the indices (days of year) from the first argument to the labels of the clusters those days belong to. In other words, for each day in the frame, get the closest centroid to that day (use the classify function you just defined) and map that day to that centroid’s label. When this is done, return the Series. (This is the ‘predict’ step.) • update_centroids: Here is the ‘update’ step, probably the trickiest part. This function takes the DataFrame, a centroids frame, and a labels series. It replaces the existing values in the centroids frame with the averages of the clusters according to the labels. You’ll have to think carefully about the relationships between the three inputs to do this calculation. • k_means: Finally we have all the components. Most of the hard work is done, so we just have to build the predict-update loop. k_means should take a DataFrame and a value of k. Get k initial centroids, get the initial labels, and then run a loop until the algorithm stabilizes (i.e. the update step doesn’t change anything). The following pseudocode can be a guide:

centroids = get the initial centroids
labels = get labels from initial centroids
loop until done:
    update centroids using labels
    get new labels from the updated centroids
    check if the old labels == new labels  # if they’re the same, we’re done!

Once you have completed this, you’ve completed the graded portion of the assignment. The next few functions let us evaluate the creature we’ve created. I’ve included implementations of these below but it would be a good exercise to write your own if you have some extra time (ha). • distortion: This function measures how well the clustering detected by k-means fits the data. It takes a (scaled) DataFrame, a labels Series, and a centroids frame.
It returns the sum, over all clusters, of the sum of distances from points in the cluster to its centroid. This is a double sum (a sum of a sum). Mathematically, it would be written like this:

$$\sum_{i=0}^{k-1} \sum_{x \in C_i} d(x, \bar{x}_i)$$

where $\bar{x}_i$ is the centroid of the $i$th cluster.

• list_of_kmeans: This function takes a frame and a maximum k and returns a list of k-means dictionaries for k running from 1 to the max k. Each k-means dictionary maps 'centroids' to the final centroids frame, 'labels' to the final labels Series, 'k' to the value of k, and 'distortion' to the distortion for that k.

• extract_distortion_dict: This function takes a k-means list and returns a dictionary mapping values of k to their distortion values.

• plot_clusters: This function takes a frame and a labels Series and creates a grid of scatterplots, plotting each combination of two variables against one another, and plotting points from different clusters in different colors (and/or with different shaped points). This allows us to visualize the clusters that our model has found.

To evaluate the k-means clustering, open an IPython shell with

    ipython --matplotlib

import your code using

    from hw4 import *

and use the following sequence of commands to create a plot of the distortion:

    raw = make_frame()
    clean_dewpoint(raw)
    climo = climatology(raw)
    scale(climo)
    kmeans_list = list_of_kmeans(climo, 10)  # This will probably take a while
    distortion_dict = extract_distortion_dict(kmeans_list)
    distortion_series = pd.Series(distortion_dict)
    ax = distortion_series.plot()
    ax.set_ylabel('Distortion', size=24)
    ax.set_xlabel('$k$', size=24)

Distortion isn't always a great measure of clustering quality, because it doesn't penalize higher values of k in any way, so it always decreases when k goes up. Sometimes it is useful to apply the knowledge we have about the quantities we're working with. What values of k does your intuition tell you might be making particularly informative clusterings?
Why? (Remember the meaning of the data we are working with: how many natural "groups" do you think there should be?) As another exercise, compute the k = 4 clustering and plot the cluster labels against the day of the year. What does this tell you? I have uploaded a file called cluster_eval.py that contains my implementations of these four functions, so you can use those if you don't have time to write your own.
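To see the predict-update loop as real code, here is a self-contained sketch that runs the same loop on a plain NumPy array rather than the assignment's DataFrames (illustrative only; it picks evenly spaced initial centroids like the handout's rule, minus the 1-based index shift, and uses Euclidean distance):

```python
import numpy as np

def k_means(points, k):
    # points: (n, d) array. Returns (centroids, labels).
    n = len(points)
    # Non-random initial guess: k evenly spaced points.
    centroids = points[[i * (n // k) for i in range(k)]].astype(float)
    # Predict: label each point with its nearest centroid.
    labels = np.argmin(((points[:, None] - centroids) ** 2).sum(axis=2), axis=1)
    while True:
        # Update: each centroid becomes the mean of its cluster.
        for i in range(k):
            if np.any(labels == i):
                centroids[i] = points[labels == i].mean(axis=0)
        # Predict again with the updated centroids.
        new_labels = np.argmin(((points[:, None] - centroids) ** 2).sum(axis=2), axis=1)
        if np.array_equal(new_labels, labels):
            return centroids, labels  # stabilized: we're done
        labels = new_labels
```

Your graded version should follow the same loop but call the helper functions described above on DataFrames.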
Distance Vector

Table of Contents
PROJECT GOAL
Part 0: Getting Started
Part 1: Files Layout
Part 2: TODOs
Part 3: Testing and Debugging
Part 4: Assumptions and Clarifications
Part 5: Correct Logs for Provided Topologies
Part 6: Spirit of the Project
Part 7: FAQs
What to Turn In
What you can and cannot share
Rubric

In the lectures, you learned about Distance Vector (DV) routing protocols, one of the two classes of routing protocols. DV protocols, such as RIP, use a fully distributed algorithm to find shortest paths by solving the Bellman-Ford equation at each node. In this project, you will develop a distributed Bellman-Ford algorithm and use it to calculate routing paths in a network. This project is similar to the Spanning Tree project, except that we are solving a routing problem, not a switching problem.

In "pure" distance vector routing protocols, the hop count (the number of links to be traversed) determines the distance between nodes. Some distance vector routing protocols that operate at higher levels (like BGP) must make routing decisions based on business relationships. These protocols are sometimes referred to as Path Vector protocols. We will explore this by using weighted links (including negatively weighted links) in our network topologies.

We can think of Nodes in this simulation as individual Autonomous Systems (ASes), and the weights on the links as a reflection of the business relationships between ASes.
Links are directed, originating at one Node and terminating at another.

You should review some materials on Bellman-Ford. Some resources include:

Download and unzip the Project Files for Distance Vector from Canvas in the Assignments section. This project can be completed in the class VM or on your local machine using Python 3.10.x. You must be sure that your submission runs properly in Gradescope.

The DistanceVector directory contains the following files:

There are a few TODOs in DistanceVector.py:

To run your algorithm on a specific topology, execute the run.sh bash script:

    ./run.sh *Topo

Substitute the correct, desired filename for *Topo; don't use the .txt suffix on the command line. This will execute your implementation of the algorithm in DistanceVector.py on the topology defined in *Topo.txt and log the results (per your logging function) to *Topo.log.

NOTE: You should not include the full filename of the topology when executing the run.sh script. For example, to run the algorithm on topo1.txt you should specify only topo1 as the argument to run.sh.

For this project, you may create as many topologies as you wish and share them on Ed Discussion. We encourage sharing new topologies with log outputs. Topologies with format errors will produce an error when you try to run them.

We've included four good topologies for you to use in testing and one bad topology to demonstrate an invalid topology. The provided topologies do not cover all the edge cases; your code will be graded against more complex topologies, including:
o topologies where a node must "advertise" other nodes it can reach (Nodes C and D)
o partitioned networks
o topologies that do not require intermediate steps (such as a topology with a single node)

Below are the correct final logs for the provided topologies. We are providing them to help you identify correct behavior with respect to negative cycles and the assumptions in the instructions. We are only providing the final round; each topology should produce at least 2 rounds of output.
SimpleTopo:
A:(A,0) (B,1) (C,3) (D,3)
B:(B,0) (A,1) (C,2) (D,2)
C:(C,0) (B,2) (A,3) (D,0)
D:(D,0) (C,0) (B,2) (A,3)
E:(E,0) (D,-1) (C,-1) (B,1) (A,2)

SingleLoopTopo:
A:(A,0) (D,5) (E,6) (B,6) (C,16)
B:(B,0) (A,2) (D,7) (C,10) (E,0)
C:(C,0)
D:(D,0) (E,1) (B,1) (A,3) (C,11)
E:(E,0) (B,0) (A,2) (D,7) (C,10)

SimpleNegativeCycle:

ComplexTopo:
ATT:(ATT,0) (CMCT,-99) (TWC,-99) (GSAT,-8) (UGA,-99) (VONA,-11) (VZ,-3)
CMCT:(CMCT,0) (TWC,-99) (ATT,1) (VONA,-10) (GSAT,-7) (UGA,-99) (VZ,-2)
DRPA:(DRPA,0) (EGLN,1) (GT,-1) (UC,-1) (CMCT,-99) (TWC,-99) (ATT,13) (OSU,-1) (VONA,2) (GSAT,5) (UGA,-99) (PTGN,1) (VZ,10)
EGLN:(EGLN,0) (GT,-2) (UC,-2) (DRPA,1) (CMCT,-99) (OSU,-2) (TWC,-99) (ATT,13) (PTGN,0) (VONA,3) (GSAT,5) (UGA,-99) (VZ,11)
GSAT:(GSAT,0) (VONA,-3) (VZ,5) (UGA,-99) (ATT,7) (CMCT,-99) (TWC,-99)
GT:(GT,0) (UC,0) (EGLN,2) (OSU,0) (DRPA,3) (PTGN,2) (CMCT,-99) (VONA,5) (TWC,-99) (ATT,15) (VZ,13) (GSAT,7) (UGA,-99)
OSU:(OSU,0) (UC,0) (GT,0) (EGLN,2) (PTGN,2) (VONA,5) (DRPA,3) (VZ,13) (GSAT,7) (CMCT,-99) (ATT,15) (UGA,-99) (TWC,-99)
PTGN:(PTGN,0) (OSU,-1) (UC,-1) (GT,-1) (EGLN,1) (VONA,3) (VZ,11) (GSAT,5) (DRPA,2) (ATT,13) (UGA,-99) (CMCT,-99) (TWC,-99)
TWC:(TWC,0) (CMCT,-99) (ATT,1) (VONA,-10) (VZ,-2) (GSAT,-7) (UGA,-99)
UC:(UC,0) (GT,0) (EGLN,2) (OSU,0) (PTGN,2) (DRPA,3) (VONA,5) (CMCT,-99) (VZ,13) (GSAT,7) (TWC,-99) (ATT,15) (UGA,-99)
UGA:(UGA,0) (ATT,50) (CMCT,-99) (TWC,-99) (GSAT,42) (VONA,39) (VZ,47)
VONA:(VONA,0) (VZ,8) (GSAT,2) (ATT,10) (UGA,-99) (CMCT,-99) (TWC,-99)
VZ:(VZ,0) (ATT,2) (CMCT,-99) (TWC,-99) (GSAT,-6) (UGA,-99) (VONA,-9)

The goal of this project is to implement a simplified version of a network protocol using a distributed algorithm. This means that your algorithm should be implemented at the network node level. Each network node only knows its internal state and the information passed to it by its direct neighbors.
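Since each node only sees its own state and its neighbors' vectors, the core computation is a Bellman-Ford relaxation performed when a neighbor's vector arrives. A minimal sketch (the helper name merge_vector and the dict-based message format are hypothetical, not part of the skeleton code; -99 is the stand-in for negative infinity used in the logs above):

```python
NEG_INF = -99  # the project's stand-in for "negative infinity"

def merge_vector(my_vector, link_weight, neighbor_vector):
    # One Bellman-Ford relaxation: for every destination the neighbor
    # can reach, check whether going through that neighbor is cheaper.
    # Both vectors map destination -> cost; returns True if anything changed.
    changed = False
    for dest, cost in neighbor_vector.items():
        candidate = max(link_weight + cost, NEG_INF)  # clamp at -99
        if dest not in my_vector or candidate < my_vector[dest]:
            my_vector[dest] = candidate
            changed = True
    return changed
```

A node would run this for each neighbor's message every round; the round in which no node's vector changes is the round in which the algorithm has converged.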
Declaring global variables is a violation of the spirit of the project. The skeleton code we provide you runs a simulation of the larger network topology. For simplicity, the Node class defines a link to the overall topology. This means it is possible, using the provided code, for one Node to access another Node's internal state. This goes against the spirit of the project and is not permitted. If you have questions about whether your code is accessing data it should not, please ask on Ed Discussion or during office hours!

You should not use any global variables for managing any data relating to the Nodes. However, you may use a global variable as a setting, e.g.:

    NEGATIVE_INFINITY = -99

Q: May I import a Python module into DistanceVector.py? For example, may I use import collections, typing, etc.?
A: Your solution should not require any outside Python modules. Please do not import any other modules.

Q: What is the best way to format and process node messages?
A: There is no right or wrong way to format messages. For best results, keep things simple.

Q: Is it required that the distance vectors displayed in my log files be alphabetized?
A: Look at the finish_round function in Topology.py. Note how the DVs are alphabetized each round, and this is reflected in the provided correct output logs. The nodes within individual vectors are not required to be sorted.

Q: Should my solution include an implementation of split horizon?
A: That is not a requirement for this project.

Q: What if there really is a valid path between two indirectly linked nodes with no cycle and the total cost is -99 or less?

To complete this project, submit ONLY your DistanceVector.py file to Gradescope as a single file. Do not modify the name of DistanceVector.py. You can make an unlimited number of submissions to Gradescope. Your last submission will be your grade unless you activate a different submission.
There are some very important guidelines for this file you must follow: Do not share the content of your DistanceVector.py file with your fellow students, on Ed Discussion, or elsewhere publicly. You may share any log files for any topology, and you may also share new topologies. Additionally, code that you write that is not required for turn-in, like testing suites, may be shared. It may be a good idea to share a "correct" log for a particular topology, if you have one, when you share that topology.

When sharing log files, leave alphabetization on so that your classmates can use the diff tool to see if you are getting the same log outputs as they are.

All work must be your own, and consulting Distance Vector Routing solutions, even in another programming language or just for reference, is considered a violation of the honor code. Do not reference solutions on GitHub! Do not use IDE extensions (like GitHub Copilot) that write or recommend blocks of code to you (autocomplete for function names is fine). For more information, see the Syllabus Definition of Plagiarism. We have worked hard to provide you with all the material you need to complete this project without help from Google/Stack Overflow (searching basic Python syntax is fine). Don't risk an honor code violation for the project.

GRADING NOTE: There is no partial credit for individual topologies; each topology is either "passed" or "failed". As with previous projects in this course, due to the size of the class, we will not accept resubmissions, modifications to old submissions past the deadline, etc.
553.420/620 Intro. to Probability Assignment #10

10.1. In class we showed that when X1 ∼ Gamma(α1, 1) and X2 ∼ Gamma(α2, 1) are independent, then U1 = X1/(X1 + X2) ∼ Beta(α1, α2), by using the method of Jacobians (taking U2 = X1 + X2). In this problem I want you to extend this result: Let X1 ∼ Gamma(α1, 1), X2 ∼ Gamma(α2, 1), X3 ∼ Gamma(α3, 1) be independent. Set U1 = X1/(X1 + X2 + X3), U2 = X2/(X1 + X2 + X3), U3 = X1 + X2 + X3.
(a) Derive the joint PDF fU1,U2,U3(u1, u2, u3) of U1, U2, U3.
(b) Integrate out u3 to find the joint (marginal) PDF fU1,U2(u1, u2) of U1, U2.
Remark. Regarding part (b): This marginal is an example of the Dirichlet distribution, and is to the multinomial distribution as the Beta distribution is to the binomial distribution. In general, if X1, X2, . . . , Xk+1 are independent and Xi ∼ Gamma(αi, 1) for i = 1, 2, . . . , k + 1 and, for 1 ≤ j ≤ k, Uj = Xj/(X1 + · · · + Xk+1), and Uk+1 = X1 + · · · + Xk+1, then it can be shown that, for 0 < uj < 1, j = 1, 2, . . . , k and 0 < u1 + · · · + uk < 1:

10.2. (independence and uncorrelated are not quite the same thing) In order to define the covariance between two random variables it is necessary for these random variables to possess a finite mean value: Cov(X, Y) = E[(X − E(X))(Y − E(Y))]. But random variables can be independent without possessing a finite mean value; for example, just consider two independent Cauchy rvs or Zeta(2) rvs. Therefore, independence does not imply uncorrelatedness: without finite means, the covariance is not even defined. However, if independent rvs each have a finite mean, then their covariance is 0 (and they are therefore uncorrelated).
(a) Show this, i.e., you have to show that if X and Y are independent and have finite means, then E(XY) − E(X)E(Y) = 0.
(b) Consider the discrete rv X with PMF P(X = −1) = 1/4 = P(X = 1), P(X = 0) = 1/2. Let Y = X². Show that X, Y are uncorrelated and not independent. For part (b), in order to show they are not independent you have to show that the joint PMF is not the product of the marginal PMFs.

10.3. (a) Let X1, X2, . . . , Xn be rvs that have finite means and variances.
Show that, for any constants a1, a2, . . . , an,

$$\operatorname{Var}\Big(\sum_{i=1}^{n} a_i X_i\Big) = \sum_{i=1}^{n} a_i^2 \operatorname{Var}(X_i) + 2\sum_{1 \le i < j \le n} a_i a_j \operatorname{Cov}(X_i, X_j).$$

(b) Write down what part (a) says in the special case n = 2.
(c) What does part (a) tell us about the variance of a sum when the rvs are uncorrelated? Write it down.
(d) Now suppose X and Y are independent. Find a value of a with 0 ≤ a ≤ 1 such that Var(aX + (1 − a)Y) is minimized.

10.4. X1, X2, X3, . . . are a sequence of i.i.d. uniform(0, 1) random variables.
(a) For each fixed n, find the CDF of X(1), the smallest among X1, X2, . . . , Xn.
(b) For each fixed n, find the CDF of nX(1) = n min{X1, X2, . . . , Xn}.
(c) Compute $\lim_{n\to\infty} F_{nX_{(1)}}(x)$. Does this CDF look familiar? If so, what is it?

10.5. Suppose N ∼ Poisson(λ) with λ > 0. Conditioned on N = n, X ∼ binomial(n, p), i.e., X|N ∼ binomial(N, p). Let Y = N − X.
(a) Show that X and Y each have Poisson distributions with respective parameters λp and λ(1 − p).
(b) Show that X and Y are, in fact, independent random variables! This shows that conditionally dependent random variables can be made independent through randomization.

10.6. Four people offer bids on an item; say the bid from person i is a continuous rv Xi. Assume the bids are independent and all share the same pdf f(x). The winning bid is the largest bid. The following are separate questions. If there is not enough information to answer, say so. Compute the probability that . . .
(a) person 4 has the winning bid.
(b) X1 > X2 > X3.
(c) X1 > X2 > X3 and person 4 has the winning bid.
(d) X1 > X2 > X3 and person 4 does not have the winning bid.
(e) person 4 has the winning bid and person 1's bid is not the lowest bid.
(f) the largest bid among the four people is at least $1 more than the next largest bid.

10.7. We have 5 iid uniform(0, 1) rvs X1, X2, X3, X4, X5. Find easy ways to compute each of the following if possible.
(a) the probability the smallest of these is X5.
(b) the probability that X1 < X2.
(c) the probability that X1, X3 and X5 are monotone, i.e., either X1 < X3 < X5 or X1 > X3 > X5.
(d) the probability that X5 is the largest and X1 is not the smallest.

10.8. X, Y, Z have the joint PDF f(x, y, z) = 4xyz(x + y + z) for 0 < x < 1, 0 < y < 1, 0 < z < 1. Compute P(X < Y | Z > max{X, Y}). Remark. It would be great if you could do this without doing any calculus.

10.9. We have 4 A's, 4 B's, and 4 C's. We randomly distribute these 12 into 4 groups of 3 each. Compute the expected number of groups containing one each of A, B and C.

10.10. A dish has 1 yellow, 1 red, 2 green and 6 blue m&m's. A person randomly selects 3.
(a) Let X count the number of blue candies in the selection. Compute E(X).
(b*) Let Y count the number of different colors in the selection. Compute E(Y).
*Please find a way of computing this expected value without deriving the distribution of Y.

10.11. A spinner has 4 equally likely regions. Compute the expected number of spins it will take to see every region hit (at least once).
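Expectation problems like 10.9-10.11 are easy to sanity-check with a quick Monte Carlo simulation. For instance, for 10.11 (this is only a numerical check on whatever exact answer you derive, not a substitute for it; the function name is ours):

```python
import random

def expected_spins_by_simulation(regions=4, trials=20000, seed=1):
    # Estimate the expected number of spins until every one of the
    # equally likely regions has been hit at least once.
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = set()
        spins = 0
        while len(seen) < regions:
            seen.add(rng.randrange(regions))  # one spin
            spins += 1
        total += spins
    return total / trials
```

With 20,000 trials the estimate should land within a few hundredths of the exact expectation, close enough to catch an algebra slip.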
GENC3004 Personal Finance
Learning Community: Overview

Dates*
1. Post Due Date: 1pm Friday Week 3 (24 Jan 2025)
2. Details Due Date: 1pm Friday Week 3 (24 Jan 2025)
3. Report Due Date: 1pm Friday Week 3 (24 Jan 2025)
Release of Results: 5pm Friday Week 5 (7 Feb 2025)
*All times are based on Sydney time

The purpose of this assessment is to build a sense of community as we learn about Personal Finance together. It involves posting a summary and some practical applications from one article of your choice that relates to one of the Units of the course under the 'Learning Community' section of the course website (note that you do not need to post one article on each Unit of the course). An 'article' can be a website article, news article, blog article, academic research article, book chapter, video or audio podcast of your choice (please do not use any articles or videos published by Andrew Hingston).

However, there is a catch. You must post on an article that no one else has posted on this term. In other words, there is a 'first-mover advantage' with this assessment. You should select your article and post on it early, before someone else beats you to it.

The assessment involves three components:
1. Learning Community Post: Write a short post of up to 1,000 words (with no leeway) under the 'Learning Community' section of the course website, identifying your article, explaining the main points of the article and identifying how you and/or other participants of the course could practically apply those main points (more details on page 7). You only need to make one post on one article for one Unit of the course (you do not need to post one article for each Unit of the course).
2. Learning Community Details: Provide the details of both your selected article and your post using the fields provided under the 'Learning Community Details' link in the 'Learning Community' section of the course website.
This allows me to create a database of articles and easily check that the article that you have chosen is unique.
3. Learning Community Report: Provide both a screenshot of your post and the text of the post in PDF format using the Microsoft Word template provided under the 'Learning Community' section of the course website. You should then submit it using the 'Learning Community Report' link under the 'Learning Community' section of the course website by the Learning Community Report due date (more details on the format and content of this submission start on page 11).

If you notice a typo in your post, you are welcome to correct it in your final submission, but any changes should be minor. Your first post needs to be your 'final post'. You cannot just post a short message "I claim this article" and edit it at the end of the term to include some content. Any posts found like this will be deleted.

For your Learning Community Report, you should use the Microsoft Word template provided in the 'Learning Community' section of the course website. This includes a table on page 1 that must be completed. The details in this table should be the same as the ones provided under the 'Learning Community Details' link. Your final file should be submitted in PDF format. Note that as a student at UNSW, you have access to the full suite of Microsoft Office 365 applications. More information about accessing this software is on page 17. Your Learning Community Report should be submitted using Arial or Helvetica fonts only.

To avoid the situation in which students just copy the articles used by other students, you must choose a 'unique' article for which no posts have yet been made under the relevant Unit of the 'Learning Community' section of the course website for this term. In other words, once another student 'claims' an article for a particular Unit on the course website, you can no longer use it.
There is a 'first-mover advantage', so you should claim your source by posting on it as soon as possible during the term. Note that if the article is a book, you should base your post on only one chapter of that book. Other students can still post on other chapters of the same book. There is more information on this on page 7.

This is an individual assignment. Copying or paraphrasing the work of another student in the current or a previous term is academic misconduct and will result in a fail grade being awarded for both this assessment and the course.

Note that some students may mistakenly believe that they need to post on one article for each of the ten Units of the course. This is incorrect. You should only post on one article for one Unit of the course.

If you have questions about the Learning Community Assignment, please post your question under the 'Learning Community Assignment' thread in the General Forums. I will only reply to emailed questions if the subject matter of the question is obviously inappropriate for posting on the general forums because it relates to a personal or confidential issue.

Weight
The Learning Community Assignment is worth 10% of your assessment for this course. You must submit via Turnitin.

As indicated earlier, there are three components to this assessment:
1. Learning Community Post: Write a short post of up to 1,000 words (with no leeway) under the 'Learning Community' section of the course website.
2. Learning Community Details: Provide the details of both your selected article and your post using the fields provided under the 'Learning Community Details' link in the 'Learning Community' section of the course website.
3.
Learning Community Report: Provide both a screenshot of your post and the text of the post, and submit it via Turnitin using the 'Learning Community Report Submission Link' under the 'Learning Community' section of the course website (more details on the format and content of this submission start on page 11).

The key thing is that you must both submit your post on the course website for other students to read AND submit it in PDF format using the Microsoft Word template provided. The version that you submit via Microsoft Word is assessed by the Turnitin plagiarism detection software and is the version that will be graded. The post on the forums is for the benefit of other students. Failing to submit your post using the Microsoft Word template provided under the 'Learning Community Report Submission Link' under the 'Assessment 3: Learning Community' section of the course website will result in you being awarded a score of zero for this assessment.

If no special consideration is granted, the latest that the file can be submitted is 5 days after the original due date (and time).

Late Penalties
Late submission will incur a penalty of 5% per day or part thereof (including weekends) from the due date and time. An assessment will not be accepted more than 5 days (120 hours) after the original deadline unless special consideration has been approved. An assignment is considered late if the requested format, such as hard copy or electronic copy, has not been submitted on time or an incorrect document has been submitted. Note that the late penalty will be deducted from the score that you receive for the submission (not from the maximum possible mark). For example, if you receive 70/100 for the submission and it is two days late, you will receive a score of 60/100.

Note that there is a short 'grace period' of a few minutes after the submission time to allow for slow internet connections. No penalty will apply if the submission is within this 'grace period'.
Please do not email me asking if your submission falls within this grace period. Since this assignment can be submitted at any time during the term, the maximum period of special consideration that will be granted for this assessment (for any reason) is 7 days from the original due date. The maximum extension that will be granted for Equitable Learning Plans will also be 7 days from the original due date. Any submissions received after this will automatically be awarded a score of zero.

Special Consideration
Special consideration will only be granted for the Learning Community Assessment in exceptional cases. You are responsible for completing your Learning Community post well before the due date/time to allow for unexpected circumstances or illness. Being ill in the last few days before the date of submission will not normally constitute grounds for special consideration. If special consideration is granted, the maximum extension that will be granted is 7 days from the original due date. Any assignments received more than 7 days after the original due date will automatically be awarded a score of zero. An application for Special Consideration, together with supporting documentation, must be submitted online within 3 working days of the due date. The process for applying for special consideration is here: https://student.unsw.edu.au/special-consideration

No Short Extensions
The university has recently introduced the option for students to apply for a short extension without providing documentation (such as a medical certificate). This assessment task is not eligible for a short extension without documentation.
Practice Exercise 2
LINC12 Fall 2024
Sept 13, 2024
Practice exercise due: Wednesday September 18, 23:59 on Quercus

Entailments vs. Presuppositions
For the following pairs of sentences, decide whether the second sentence is an entailment of the first, or is a presupposition, or whether there is no relation. Support your answer by performing test(s), evaluating the results, and describing the conclusions. Some strategies you will want to use:
• p → q iff p & ¬q is a contradiction. Try to construct a sentence of this form and decide whether it is contradictory.
• To show that p does not entail q, find a situation where p is true and q is false.
• To show that p presupposes q, construct a sentence with a non-veridical environment, put the sentence p in the environment, and then test whether or not q is still entailed.

(1) a. It's late at night. b. My sister is tired.
(2) a. My sister forgot that she had to finish an assignment tonight. b. My sister had an assignment to finish tonight.
(3) a. Maria is a paediatrician. b. Maria is a doctor.
(4) a. Gregor re-installed that addicting game on his computer. b. Gregor used to have an addicting game on his computer.
(5) a. Many people are interested in learning Korean these days. b. At least some people are interested in learning Korean.
(6) a. My best friend from the Philippines moved to Toronto in February, 2023. b. He was in Toronto in March, 2023.
MICROCONTROLLER PROGRAMMING
TEAM ASSIGNMENT: MODULAR SYNTHESISER

AIM: To specify, design and implement a synthesiser module using a microcontroller.

MODULAR SYNTHESISERS
A modular synthesiser comprises a series of individual audio or control modules that are not hardwired together but which exchange information through a reconfigurable arrangement of patch cables. A keyboard typically provides the primary playing front-end. An output provides a voltage proportional to the key pressed, corresponding to 1 V/octave, and a 'gate' signal that indicates a key is pressed. Typical modules (and their requirements) include:

Voltage Controlled Oscillator (VCO)
A signal generator with frequency controlled by an external voltage, varying at 1 V/octave. The signal waveform should be selectable. A modulation input (that effectively sums with the control voltage) should also be provided.

Voltage Controlled Amplifier (VCA)
An amplifier whose gain is set by an external voltage. Signal input and output are required, along with a gain control voltage input.

Voltage Controlled Filter (VCF)
A resonating low-pass filter with frequency and resonance front-panel controls and a control voltage input that sweeps the filter cutoff.

Envelope Generator (ADSR)
Generates a gain waveform, typically for the VCA, triggered by a gate input. Controls are required for Attack, Decay, Sustain and Release.

Low Frequency Oscillator (LFO)
A signal generator with manual frequency control. The signal waveform should be selectable.

Other Modules
Noise Generator, Signal Mixer, Signal Splitter, Panner, Ring Modulator

To be submitted: Project files (zipped) with fully commented source code should be uploaded to MyUni. Use the student ID of one of the team members as the filename and include the full names and student IDs of the team in the header of the main program (as a comment). Maintain a modular file structure, i.e.
keep all Timer-related functions in source file timer.c with prototypes exported through timer.h, etc. Include a Word document or PDF showing your specification, system diagram and flowcharts, state diagram or pseudocode to document the algorithmic details of your program.
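For documenting algorithmic details, a piecewise sketch can be clearer than prose. For example, if your team builds the ADSR module, the envelope could be documented like this (a Python sketch for illustration only; your implementation will be C code driven by a timer interrupt, and the linear segments are an assumption, since hardware ADSRs are often exponential):

```python
def adsr_gain(t, gate_length, attack, decay, sustain, release):
    # Piecewise-linear ADSR envelope: returns a gain in [0, 1] at time t.
    # attack/decay/release are durations; sustain is a level in [0, 1].
    # Assumes the gate stays high for gate_length >= attack + decay.
    if t < attack:
        return t / attack                       # Attack: ramp 0 -> 1
    if t < attack + decay:
        frac = (t - attack) / decay
        return 1.0 - frac * (1.0 - sustain)     # Decay: ramp 1 -> sustain
    if t < gate_length:
        return sustain                          # Sustain: hold while gate high
    if t < gate_length + release:
        frac = (t - gate_length) / release
        return sustain * (1.0 - frac)           # Release: ramp sustain -> 0
    return 0.0
```

In the C version, the same logic becomes a small state machine advanced once per timer tick, with the current state (ATTACK/DECAY/SUSTAIN/RELEASE) stored between interrupts.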
DATA305: Machine Learning Techniques for Data Science (academic year 2025)

Data Science uses machine learning methods to fit data and make predictions. In this course you will learn how to explore data in order to identify the appropriate ethical and cultural considerations and select the appropriate tools to analyse the data, develop the theory that underlies those tools, and see a variety of modern machine learning algorithms (such as Large Language Models) that make modern machine learning such a fascinating topic.

Course learning objectives
Students who pass this course should be able to:
1. Perform exploratory data analysis and use it to select appropriate machine learning tools.
2. Identify the ethical, cultural, and privacy principles that are relevant for particular data science projects.
3. Explain the role of optimisation in fitting various machine learning models to data.
4. Demonstrate the use of a variety of machine learning algorithms to analyse data and interpret the results.

Assessment
Assignments (x3): CLOs 1, 2, 3, 4; Mark: 50%
Project: CLOs 1, 2, 3, 4; Mark: 30%
Test: CLOs 1, 2, 3, 4; Mark: 20%
CIS 5530: Project 1
Link State and Distance Vector Routing
Spring 2025

Overview
In this assignment, you will implement two routing protocols: link state and distance vector routing. Your ns-3 implementation should be able to read network topology files and calculate the routing table based on the LS and the DV algorithms. There will be commands to bring up/down a node or a link; your code should handle the update and reflect it in the output in a timely manner. We also provide an auto-grader and test files to validate your implementation.

1 Specification
We will be using the ns-3 discrete network simulator to teach core principles of network routing protocol design and implementation. Your assignment is to extend ns-3 to support efficient routing using link-state and distance-vector protocols. For more information on the existing code base and some tips on how to get started, please read the separate CIS 5530: Project 1 Code Documentation.

2 Project Specifications
2.1 Milestone 1 (12%)
In Milestone 1, you will work in teams to add basic neighbor discovery capabilities to each node. The goals of this first milestone are to become familiar with the ns-3 development environment and understand the TA's skeleton code. Before you write any code, make sure you read the code documentation in detail and understand the API and structure of the relevant parts of the ns-3 code. All your code should go inside the upenn_cis553 directory, and you should not modify other files in the ns-3 directory. You are free to add new packet types to ls-message.cc and dv-message.cc. Feel free to structure your own code; for instance, introduce your own helper files, or have one neighbor discovery module shared by both LS and DV. You will finish the following tasks:
1. Neighbor discovery.
2. Output neighbor table.

Expected Output. Once you have finished the above tasks, you should be able to generate the Neighbor List by calling the 'DUMP NEIGHBORS' command.
The first row of the Neighbor List is the total number of neighbors for that node, followed by a series of neighbor entries. Each neighbor entry should include 〈neighbor node number, neighbor IP address, interface IP address〉. This needs to work for both LS and DV. For example, the output of the command '1 LS DUMP NEIGHBORS' for 10-ls.sce and 10.topo should be

**************** Neighbor List ********************
2
NeighborNumber NeighborAddr InterfaceAddr
0 10.0.0.1 10.0.0.2
8 10.0.6.2 10.0.6.1

Submission for Milestone 1. In Gradescope, select the relevant GitHub repository and branch, and Gradescope will automatically pull and test the most recent version. Only one person in the team needs to submit. Be sure to include all other team members in the Gradescope submission (look for the add-teammates feature in the top right corner).

2.2 Milestone 2 (88%)
Your assignment is to extend your node to support efficient routing by implementing two protocols: link-state and distance-vector routing. If your implementation works, you will be able to route packets hop-by-hop through the network, with packets propagating along a path that involves only the nodes on the route to the destination.

2.2.1 Link-state routing
Your node must implement link-state routing to construct a routing table, and use this routing table to forward packets towards their destination. You should read about link-state routing in the Peterson textbook and in our lecture slides. The link-state protocol generally involves four steps (see more details in the code guidelines):
1. Neighbor discovery. (Built in MS1)
2. Link-state flooding.
3. Shortest-path calculation.
4. Forwarding.

2.2.2 Distance-vector routing
The second routing protocol you have to implement is the distance-vector routing protocol, described in our lecture notes and in Peterson's textbook. Your solution should address the count-to-infinity problem by bounding the distance to a maximum of 16 hops.
Note that we are not implementing the entire RIP protocol, but a simple distance-vector routing protocol that consists of the following four steps (more details in the code guidelines):
1. Neighbor discovery. (Built in MS1)
2. Distance-vector exchange.
3. Route calculation.
4. Forwarding.

Expected Output: Your output should be able to pass the auto-grader test. For example, you can run the protocol over the 10-node topology with the command:

$ ./build/scratch/simulator-main --routing=LS --scenario=scratch/scenarios/10-ls.sce --inet-topo=scratch/topologies/10.topo --result-check=scratch/results/10-ls.output

If you passed, you will get this message for each of the tests:

XXX is correct

Submission for Milestone 2: As with Milestone 1, select the relevant GitHub repository and branch, and Gradescope will automatically pull and test the most recent version. You will need to submit separately for LS and DV, for small and large topologies. We strongly recommend that you ensure your implementation passes the local autograder before submitting to Gradescope. As before, make sure you include all your teammates.

3 Extra Credit (submit together with Milestone 2)
Doing extra credit is entirely optional. We offer extra-credit problems as a way of providing challenges for those students with both the time and interest to pursue certain issues in more depth. You should only attempt the extra credit after you have completed the regular portions of the project. Do note that if your regular-credit LS and DV do not work correctly, we reserve the right not to award extra-credit points. Hence, we recommend that you only start working on extra credit after you have finished the original assignment. Submit your extra credit together with Milestone 2 (same deadline). We will provide a separate submission folder on Gradescope for this submission. You can add your own custom command-line arguments to showcase a given extra-credit feature.
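The distance-vector exchange and route-calculation steps above amount to repeated Bellman-Ford relaxations against each neighbor's advertised vector, with the 16-hop bound standing in for infinity. The sketch below is one way such a merge could look (written in Python for brevity; the actual assignment is C++ inside ns-3, and the table layout and function name here are illustrative, not the skeleton's API):

```python
INFINITY = 16  # per the spec, distances are capped at 16 hops

def dv_update(my_table, neighbor, neighbor_table, link_cost=1):
    """Merge a neighbor's advertised distance vector into our table.

    my_table / neighbor_table: dict dest -> (cost, next_hop).
    Returns True if our table changed (and should be re-advertised).
    """
    changed = False
    for dest, (cost, _) in neighbor_table.items():
        new_cost = min(cost + link_cost, INFINITY)
        cur_cost, cur_next = my_table.get(dest, (INFINITY, None))
        # Adopt the route if it is strictly better, or update the cost if
        # our current route already goes through this neighbor.
        if new_cost < cur_cost or (cur_next == neighbor and new_cost != cur_cost):
            my_table[dest] = (new_cost, neighbor)
            changed = True
    return changed

table = {2: (1, 2)}                 # we reach node 2 directly at cost 1
dv_update(table, 2, {3: (1, 3)})    # node 2 advertises a route to node 3
# table now also contains 3: (2, 2), i.e. node 3 via node 2 at cost 2
```

A destination advertised at cost 15 or more never enters the table with a usable cost, which is exactly how the 16-hop bound tames count-to-infinity: a looping route's cost climbs to 16 in finitely many exchanges and is then treated as unreachable.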
You are free to make any changes to any part of the ns-3 code base for extra credit, but make sure not to break your original submission in case you need to go back and change anything (e.g., keep things in a separate branch of the repo). There is no autograder for extra credit. You will need to schedule a meeting with a TA to demonstrate your extra credit. After the deadline, TAs will reach out to schedule a meeting with your group. Here are some examples of extra credits that you can implement. For extra credit, you are responsible for providing your own test cases:
• Delay-tolerant networking (10%) Implement a summary-based epidemic routing protocol. Demonstrate an actual implementation in a highly disconnected wireless network where nodes are connected to each other infrequently. You will need to read up on this protocol outside of the course material.
• Incremental Dijkstra computation (5%) Given a route update, instead of recomputing Dijkstra from scratch for all entries, perform an incremental recomputation that only updates routes relevant to the shortest path. To demonstrate a working implementation, you need to show that your simulation can run significantly faster for the same network size (while computing the correct routes, of course!).
• Your proposal (X%) If you have a cool extension in mind, feel free to contact the teaching staff to discuss feasibility and points awarded.
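For reference, the baseline that the incremental variant would improve upon is ordinary Dijkstra over the flooded topology, which can also record the first hop needed for the forwarding step. A minimal sketch (Python for brevity; the graph layout and function name are illustrative, not part of the ns-3 skeleton, where you would work over the link-state database instead):

```python
import heapq

def dijkstra(graph, src):
    """Compute shortest-path distances and first hops from src.

    graph: dict node -> {neighbor: cost}.
    Returns (dist, first_hop): dist maps node -> cost, first_hop maps
    node -> the neighbor of src to forward through.
    """
    dist = {src: 0}
    first_hop = {}
    pq = [(0, src, None)]  # (distance, node, first hop on the path)
    while pq:
        d, node, hop = heapq.heappop(pq)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was found already
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # First hop is the neighbor itself when leaving src,
                # otherwise inherited from the path so far.
                first_hop[nbr] = nbr if node == src else hop
                heapq.heappush(pq, (nd, nbr, first_hop[nbr]))
    return dist, first_hop

g = {1: {2: 1, 3: 4}, 2: {3: 1}, 3: {}}
dist, hops = dijkstra(g, 1)
# dist[3] == 2 (via node 2), hops[3] == 2
```

The incremental extra credit asks you to avoid rerunning this whole computation on each update, re-relaxing only the parts of the shortest-path tree a changed link can actually affect.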
Capstone Project in Mathematical Sciences III Semester 2 2024 Applied Mathematics

Project Title: Heat and mass transport problems in industry

Transport of matter/heat energy due to molecular motion (diffusion/conduction) and with a flow (advection) arises in many industrial processes. This project will focus on such processes. You will first consider the derivation of (1) the diffusion/conduction equation and (2) the advection-diffusion/conduction equation to obtain an understanding of the physics and assumptions behind them, and their application to both mass and energy transport processes. For this you will make use of the textbook (see above). You will then concentrate on a particular class of energy or mass transport problems in the context of different industrial processes. These problems can be written in terms of one or two partial differential equations in one spatial dimension (x) and time (t). After a sufficiently long time the solution approaches a steady state, which can be found by seeking a time-independent solution. This reduces the problem to one or two ordinary differential equations in the spatial variable (x). You should have considered the solution of such equations in Differential Equations II or an equivalent course. Applications involving transport of heat energy include (1) ground-source heating, (2) solar heating, (3) lake-source cooling, (4) continuous pasteurisation (e.g. of milk), and (5) heat exchangers more generally. Some mass transport problems include (6) blood oxygenation in a heart-lung machine, and (7) haemodialysis. You will undertake a small project enabling you to learn in the context of one of these applications, or something similar. You will derive, non-dimensionalise (scale), simplify (where possible), and solve the relevant equations to obtain solutions which you interpret for the application you are considering. You will present your work in a written project report.
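For orientation, the two model equations referenced above take the following standard forms in one spatial dimension (the symbols are a conventional choice, not prescribed by the project brief: c is the concentration or temperature, u the flow speed, and D the diffusivity or thermal diffusivity):

```latex
% (1) The diffusion/conduction equation:
\frac{\partial c}{\partial t} = D \,\frac{\partial^2 c}{\partial x^2}

% (2) The advection-diffusion/conduction equation, which adds
% transport with a flow of speed u:
\frac{\partial c}{\partial t} + u \,\frac{\partial c}{\partial x}
  = D \,\frac{\partial^2 c}{\partial x^2}

% Setting \partial c / \partial t = 0 gives the steady state,
% reducing (2) to an ordinary differential equation in x:
u \,\frac{\mathrm{d} c}{\mathrm{d} x} = D \,\frac{\mathrm{d}^2 c}{\mathrm{d} x^2}
```

The steady-state ODE is the kind of equation treated in Differential Equations II; non-dimensionalising it produces a single parameter (a Péclet number, u L / D, for a length scale L) that measures the relative importance of advection and diffusion.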
Goals
1) To understand the derivation of mathematical models from physical conservation laws.
2) To understand the difference between diffusive, conductive and advective transport.
3) To appreciate the need to make assumptions in order to render an industrial problem tractable.
4) To apply mathematical techniques you have learned in the context of real-world applications.
5) To interpret results, with regard for the assumptions made, in the context of an application.
6) To gain experience in oral and written presentation of mathematics.

Assumed knowledge
You should have passed one of MATHS 2102 Differential Equations II, MATHS 2106 Differential Equations for Engineers, MATHS 2201 Engineering Mathematics IIA, or equivalent. The ability to write basic MATLAB programs would be beneficial.

Outline
You will, in general, meet each week with your supervisor both for tuition and to discuss and obtain help with your work. In some weeks, these meetings will be replaced by a meeting of all groups to work on the development of the writing skills needed for the project work. In addition to the project report and poster, you will complete a number of written assignments, covering both aspects of writing skills and material relevant to the project.
Understanding Entrepreneurialism BUS61704

Synopsis: This module will equip students with an understanding of the values underpinning entrepreneurialism and engender an entrepreneurial mindset, inspiring them to adopt entrepreneurial behaviours, including creativity and innovation, problem-solving, risk management, overcoming challenges, and coping with failure. This module will explore the characteristics and traits of entrepreneurs and demonstrate that exploiting a new opportunity is a process involving planning, resourcing, managing activities including risks (the journey), and teamwork. A fundamental outcome of entrepreneurship is creating value through developing new products and services to meet identified market needs, which may involve establishing a new business entity. To start a successful business, an entrepreneur must be highly motivated and have entrepreneurial characteristics, a high risk appetite and key management skills. Entrepreneurship involves mobilising human capital and social capital as well as financial capital. These entrepreneurial competencies are as important to the success of new ventures as the nature of the market opportunities they address.

Experiential learning: Learning is achieved through undertaking a hands-on group entrepreneurial project involving identifying market needs, proposing innovative solution(s) and assessing various risks, where students will take individual responsibility, rely on teamwork to solve problems/challenges, and learn from one another.

Action learning: Participation in class and group discussions throughout the entrepreneurial journey helps students develop an entrepreneurial mindset and adopt entrepreneurial behaviours.

This is a coursework-based module, involving both individual work and teamwork. Students will be assessed on participation throughout the module, a group project, and individual assignments.
LEARNING OUTCOMES: Upon completion of the module you should be able to:
1. Identify entrepreneurial values, theories and practice (BBus: 3; TGC: 2a.1, 2a.2, 2a.3, 2a.4; 2)
2. Adopt an entrepreneurial mindset through reflecting on the journey and evaluate actions needed for future entrepreneurial endeavours (BBus: 4; TGC: 7.1, 7.3, 7.5; 4)
3. Exhibit entrepreneurial behaviour through active participation and communication in a team setting (BBus: 5; TGC: 6.3, 6.4, 6.5; 1)
4. Discuss the elements contributing to entrepreneurial problems and opportunities (BBus: 3; TGC: 2b.1, 2b.4, 2b.5; 3)

PLO3: Apply critical and creative thinking skills for problem solving and decision making in a business context.
PLO4: Display innovative entrepreneurialism.
PLO5: Demonstrate leadership, teamwork, communication and social skills in business.

Transferable Skills: Skills learned in this module of study which can be utilised in other settings.

Details of each assessment task:

Assessment Task 1 - Participation throughout the semester
Individual Assignment: Participation + Guest Lecture Reflection Write-Up (20%)
Students are expected to participate actively in lectures and tutorials and group discussions, and to submit online assignments.
• 5% from your in-class engagement
• 15% from your 3 guest lecture reflections (Please note that the number of guest lectures and reflection reports is subject to change based on guest speaker availability. Any changes will be communicated in advance.)
1. Lectures and tutorials: students are to engage in class discussions by providing their ideas and opinions, as well as ask questions during guest speaker session(s).
2. Online channels: students are to submit online assignments through TIMeS to demonstrate learning progress throughout the module.

Assessment Task 2 - Group Project
Group Assignment: Entrepreneurial Venture Project (40%). This is a group assignment. Students will be assigned by the tutor into teams of 3 to 6 students.
The group is to ideate, design and plan an entrepreneurial venture through steps of problem identification, opportunity assessment, and market analysis, and then present the business plan in the context of pitching the business idea. The group project consists of 3 parts:
1. Proposal – using the template provided (not graded), by week 5 or 6 (TBC)
2. Written work – business plan (Week 11)
3. Final Presentation – pitch the business plan and present your entrepreneurial journey (Week 10 till Week 13)

Group Project Part 1 - Group Project Proposal + Business Plan (30%)
Leaning on teamwork, team members are to brainstorm and propose a business idea, starting with assessing the needs and opportunities (through identifying the problems) in the market. Team members are expected to actively voice their opinions and at the same time listen to other members' ideas. Each team is to submit a proposal by session 4.

The Proposal
Follow the template given. The proposal is not graded but is a chance for you to get feedback and think through whether the idea is feasible. It is not a fixed formula that you must follow through to the end of the semester; it works as guidance, and you may improvise wherever needed.

The Business Plan
• Format – around 15 pages for the content of the business plan (Microsoft Word, A4 size), excluding appendix, references, and other elements such as the cover page and group project declaration form. (Use your own judgement in terms of the number of pages; 15 pages is usually the standard for a business plan intended to test the waters. If your group would like to take the idea forward to the real world and do a more extensive business plan, feel free to do so.)
• Style – either the contemporary style that is more visual, or the traditional style that is more wordy
• Times New Roman font, size 12, at 1.5 spacing
• To include, but not limited to:
1. The team (team member names and roles/tasks)
2. Vision & Mission
3. The product and/or service
4. Business model & pricing (depending on your business model, pricing would be either cost plus mark-up or subscription based; refer to the lecture on Business Model)
5. Target market (who will be your customers, and where are they)
6. Target market analysis (e.g. How big is your market? Who are the existing players/your competitors? What is the growth rate/potential of this market? How willing are your target customers to purchase your product?). This may be done through either primary data (you survey or interview potential customers) or secondary data (look online)
7. Market positioning (e.g. are you going for affordability, niche, or high end)
8. Promotion & marketing (how and where do you promote/market your product/service)
9. Financial projection – (1) estimated expenses at startup and 6 months ahead; (2) estimated revenue at startup and 6 months ahead; optional: estimated profit projection at startup and 6 months ahead
10. The entrepreneurial journey: challenge(s)/problem(s) faced during this group project as an entrepreneurial team & solution(s), and a conclusion of the journey (key learnings & what can be improved)

To do well: (1) do supplement your business plan with visuals, such as charts, tables, graphs, and sample designs of your marketing materials or product; (2) pay attention in classes. In plain words, show your effort. And, no, you don't have to follow exactly the sequence above; you may also do your own research online to find the most fitting format according to your line of business/industry.

Submission: Softcopy – Microsoft Word (myTIMeS)
Step 1: Turnitin tab on TIMeS: only the content (including accompanying graphs, tables, pictures, figures, etc., without the cover page and declaration form) & references (if any). If you include the cover page and declaration form, Turnitin will show a HIGH percentage of similarity.
Step 2: Compilation of everything into 1 Microsoft Word file on myTIMeS (there will be submission folders created for your respective tutorial groups). In ONE file:
1. Assignment cover page
2. Marking grid
3. Turnitin report (the whole Turnitin report)
4. Content of the report
5. Reference list (if any)
6. Appendices (if any)
7. Group members' declaration form and time sheet

Group Project Part 2 - Group Project Presentation (10%)
Each group has a maximum of 15 minutes to present the group project. (Time may vary depending on the semester and number of students; tutors will inform and confirm the time allowed.) To include: (1) pitching your business idea to a potential investor/lecturer/tutor; (2) your entrepreneurial journey as an entrepreneurial team.
Submission: Softcopy – Microsoft PowerPoint file (myTIMeS)

Timeline of this Group Assignment:
Session 2: Team formation
Session 3/4: Teams are to further refine their ideas
Session 5/6: Submit the proposal
Session 4 to Session 9: Team members work on the group project
Session 10 to 14: Presentation

Assessment Task 3 - Individual Assignment
Individual Assignment: Individual Report (20%)
This is a piece of individual work. Choose an entrepreneur of your preference and analyse the entrepreneur. You may choose an entrepreneur: (1) whom you know first-hand (either through your observation of someone you know, or by interviewing the entrepreneur), or (2) whom you research through secondary sources such as a biography or the internet. For those who choose someone you know: yes, it is OK if the person is your parent or sibling; please do attach some "proof", such as archival photos, the business' social media page, etc. There is no restriction on the nationality of the chosen entrepreneur; entrepreneurship is not bound by nationality. The focus is on the entrepreneurialism of the chosen entrepreneur. It is common for an entrepreneur to start/own/operate several businesses at the same time.
One business may have started with many challenges but gone well afterwards, and vice versa; the businesses started by the same entrepreneur may each progress differently. So, use your own judgement in deciding the context with which to illustrate your analysis and learning.

Questions to answer:
1. Provide a brief background of the chosen entrepreneur; points to consider include, but are not limited to: childhood, education, skills, and others. Analyse his/her entrepreneurial characteristics and entrepreneurial leadership qualities.
2. Entrepreneurial motivation and context for starting up the business(es).
3. What was the problem that the entrepreneur identified in the market? How did the entrepreneur address the problem? Who was the entrepreneur's target market?
4. The successes and/or failures of the business. (What went wrong, and what did the entrepreneur do to salvage the business? Or what did the entrepreneur do exceptionally well to make the business a success?)
5. How does the entrepreneur inspire you to be a future entrepreneur?

Tips to do well in answering these questions (these represent the content of your essay/report):
• Write the assignment following the 5 points above in 5 parts. Start with the question number and the question, followed by your answer.
• Higher scores will be given for analysis rather than purely descriptive content. You are to analyse the entrepreneur, not copy and paste from the internet. For example:
From the internet: Ms X started XYZ at the age of 17, and the business led to a huge success. Following the success, she expanded quickly, getting USD10 million in investments from angel investors. However, due to the pandemic, her business went downhill, and she was in huge debt. Not giving up, she found new ways of selling her products online and the business is once again successful.
Your own analysis in your own words: The entrepreneur I have chosen is Ms X.
It is admirable to me that she was bold enough to start her business at the age of 17 despite not having a degree or work experience. Although she was successful in getting USD10 million in investment, she lost it all due to the pandemic. The fact that she didn't give up despite the huge debt shows her perseverance. Therefore, we can see that her entrepreneurial traits include being bold, risk-taking, and persevering.
• You must apply the concepts or principles learned in this module.
• You must support your discussions with relevant theories.

Format
Write your answer in a report with the following format:
• Work must be typewritten using Times New Roman font, size 12, at 2.0 spacing.
• Length: 2,000 words (plus or minus 10%), not including the cover page, reference list, tables, charts, illustrations, and Turnitin report. Marks will be deducted for excessive length. It is advised to spread your word count between the questions; for example, do not have a descriptive story of the entrepreneur for 1,000 words and only 200 words for each of the other points. Do include visuals.