Function for Calculating Z-scores for Data in a NumPy Array
Concept:
As a preliminary step in data analysis, certain types of data are transformed to ensure “good behavior” and “compatibility with other data.” One such transformation is the Z-score. Two series that have been transformed using the Z-score are more easily compared: [login to view URL]
[login to view URL]
In this project, you will calculate Z-scores using your knowledge of NumPy. You will also learn about the nuances of constructing a function. And if you check the links above, you will learn a useful concept in statistics.
Requirement:
The basic requirement is a function that takes a NumPy array as input and outputs an array of the same shape in which the data has been transformed into Z-scores, column by column. For example, if the array has shape (5,2), then for each of the two columns the 5 values are used to calculate the mean and sigma (the standard deviation), and each value in the column is replaced by (value - mean)/sigma.
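A minimal sketch of the basic requirement, under two assumptions the posting does not settle: the input is a 2-D array, and sigma means the population standard deviation (NumPy's default, ddof=0). The name zscore_columns and the demo data are illustrative only.

```python
import numpy as np

def zscore_columns(a):
    """Return a same-shaped array with every column standardized."""
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=0)    # one mean per column
    sigma = a.std(axis=0)    # population std (ddof=0); an assumption
    return (a - mean) / sigma  # broadcasting applies the formula per column

# Hypothetical (5,2) input, matching the shape used in the example above.
demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]])
z = zscore_columns(demo)
print(z)
```

After the transformation each column has mean 0 and standard deviation 1. A column whose values are all identical has sigma = 0 and would produce a division by zero; this sketch does not handle that case.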
A Google search for “z-score numpy array” will give you plenty of ideas about implementation, and you could achieve an efficient implementation with just one line of code! So let me add an enhanced requirement: the user can also input a scalar value (the second input) to indicate the column to transform. Only the specified column is transformed, and the others are left alone. For example, if the user inputs a (10,4) array and a column-indicating scalar with a value of 3, only the third column is transformed (the three remaining columns are left untouched). If the user does not input this scalar, the default is that all columns are transformed. (Hint: a basic familiarity with linear algebra can quickly guide you to an elegant answer here: you can use 1D arrays of ones for sigma and zeros for the means, and modify them appropriately to complete your calculation. To elaborate, with mean = 0 and sigma = 1, the transformation leaves a value unchanged. Alternatively, use loops and brute force! Whatever works!)
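The hint could be sketched roughly as follows. The function name zscore_one_column is hypothetical, the column number is treated as 1-based (so 3 means the third column, as in the example), and sigma is again assumed to be the population standard deviation. Input checks are deliberately left out here.

```python
import numpy as np

def zscore_one_column(a, col=None):
    """Standardize only column `col` (1-based); all columns if col is None."""
    a = np.asarray(a, dtype=float)
    n_cols = a.shape[1]
    means = np.zeros(n_cols)   # mean 0: subtracting it changes nothing
    sigmas = np.ones(n_cols)   # sigma 1: dividing by it changes nothing
    if col is None:
        means = a.mean(axis=0)
        sigmas = a.std(axis=0)
    else:
        j = col - 1            # convert the 1-based column number to an index
        means[j] = a[:, j].mean()
        sigmas[j] = a[:, j].std()
    return (a - means) / sigmas

# Hypothetical (10,4) input with the third column selected.
demo = np.arange(40, dtype=float).reshape(10, 4)
out = zscore_one_column(demo, 3)
print(out[:3])
```

Because the untouched columns are computed as (value - 0)/1, they come back exactly as they went in, and a single broadcasted expression covers both the one-column and all-column cases.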
Finally, I want you to implement checks on whether the inputs make sense. For example, is the first input a NumPy array (print an error message if it is something else), and is the second input appropriate (print an error message if the scalar does not correspond to a column number)?
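One possible way to combine the checks with the calculation is sketched below. Several choices here are assumptions rather than stated requirements: the array must be 2-D, the column number is a 1-based integer, the function returns None after printing an error, and sigma is the population standard deviation. The error wording and the demo array are also made up for illustration.

```python
import numbers
import numpy as np

def myz(x, col=None):
    """Z-score all columns of x, or only column `col` (1-based), with input checks."""
    if not isinstance(x, np.ndarray):
        print("Error: the first input must be a numpy array.")
        return None
    if x.ndim != 2:
        print("Error: the first input must be a 2-D array.")
        return None
    n_cols = x.shape[1]
    if col is not None:
        # Accept Python and NumPy integers, but reject bools and floats.
        is_int = isinstance(col, numbers.Integral) and not isinstance(col, bool)
        if not is_int or not 1 <= col <= n_cols:
            print(f"Error: the second input must be an integer from 1 to {n_cols}.")
            return None
    x = x.astype(float)
    means, sigmas = np.zeros(n_cols), np.ones(n_cols)
    cols = range(n_cols) if col is None else [col - 1]
    for j in cols:
        means[j], sigmas[j] = x[:, j].mean(), x[:, j].std()
    return (x - means) / sigmas

demo = np.array([[1, 2], [3, 4], [5, 6]])  # hypothetical input
all_cols = myz(demo)       # both columns transformed
second = myz(demo, 2)      # only the second column transformed
bad_col = myz(demo, 6)     # prints an error: there is no column 6
bad_array = myz(3, 6)      # prints an error: 3 is not a numpy array
```

Returning None after printing keeps the notebook running when an input is invalid; raising an exception such as TypeError or ValueError would be a stricter alternative.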
Submission:
You are required to code in an IPython notebook and create a PDF. Show your function in one cell, and call the function myz. In another cell, create inputs using the following code:
x1 = [login to view URL]([[4,3,12],[1,5,20],[1,2,3],[10,20,40],[7,2,44]])
x2 = 3
x3 = 6
Run the function four times, as myz(x1), myz(x1,x2), myz(x1,x3), and myz(x2,x3), and generate results. In the next cell, please list as bullets the things you learned by doing this project. Print the notebook to PDF and submit the PDF.
I am a data scientist, skilled in machine learning and data science. I am a statistician with a strong background in theoretical statistics and mathematics. I am good at web scraping using Python, requests, Selenium, etc. I am also good at implementing machine learning and deep learning models. I have experience implementing multivariate regression, factor analysis, principal component analysis, ANOVA, LASSO and elastic net, classification and regression trees, Monte Carlo methods, etc. in R. I have attached a few samples of my code to my profile.
Dear client,
I highly value professionalism and hold myself strictly accountable to represent my client’s work. I aim to form a long-term working relationship.
For 3 years I have worked with mathematics, the R programming language, SPSS Statistics, and statistical analysis, so I am accustomed to a variety of statistics-related tasks.
I have a deep passion for research, which guarantees that all of my work is 100% original. I also ensure that I complete the work within the agreed time and at a standard rate.
Please consider my bid, and feel free to share more details about your project.
Thank you.
Hi,
I am happy to connect with you today.
I went through your project description, and I would like to bid on this project, as I think I can add value to your work :) I have been working in data analysis for more than 6 years. For the past year, I have been working on machine learning algorithms in both R and Python. I hope to deliver the work as per your expectations.
Please connect with me if you are interested in working with me. I would be glad to hear from you.
Warm regards,
Kiran