diff --git a/charts.html b/charts.html
index 2ecc06c..9701c31 100644
--- a/charts.html
+++ b/charts.html
@@ -34,14 +34,14 @@
-
+
2021-02-26T00:28:10.683678 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
2021-02-26T10:10:02.054013 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables (up to a sign flip when a scale factor is negative), so for points lying exactly on a non-horizontal line, the angle of that line to the x-axis does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
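The calculation described above can be sketched in plain Python. This is a minimal illustration, not the report generator's own implementation; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of deviation products; the 1/n (or 1/(n-1)) factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # A constant variable has zero variance, so r is undefined (ZeroDivisionError here).
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: two lines with different slopes and offsets.
xs = [1, 2, 3, 4]
r_shallow = pearson_r(xs, [2 * x + 1 for x in xs])
r_steep = pearson_r(xs, [10 * x - 7 for x in xs])
r_falling = pearson_r(xs, [-3 * x for x in xs])
```

Both rising lines give r = 1 regardless of slope or offset, and the falling line gives r = -1, which illustrates the invariance under changes of location and scale.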
2021-02-26T00:28:10.856331 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

2021-02-26T10:10:02.239329 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
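As a rough sketch of that recipe, the snippet below ranks each variable and then applies the covariance-over-standard-deviations formula to the ranks. The helper names `average_ranks` and `spearman_rho` are hypothetical, and giving tied values the mean of their ranks is an assumption (a common convention), not something the report states.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson's formula applied to the rank variables of X and Y."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: a nonlinear but strictly increasing relationship.
xs = [1, 2, 3, 4, 5]
rho_cubic = spearman_rho(xs, [x ** 3 for x in xs])
```

Because the cubic relationship is strictly increasing, the ranks of both variables coincide and ρ equals 1, even though the relationship is not linear.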
2021-02-26T00:28:11.029310 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

2021-02-26T10:10:02.420515 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
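The pair-counting definition above can be sketched directly. The function name `kendall_tau_a` is hypothetical; the sketch implements the simple form described in the text (often called tau-a), in which tied pairs count as neither concordant nor discordant. Whether the report uses this form or a tie-adjusted variant is not stated, so treat this as illustrative only.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    pairs = list(combinations(zip(xs, ys), 2))
    if not pairs:
        raise ValueError("need at least 2 observations")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in X or Y; it counts as neither
    return (concordant - discordant) / len(pairs)


# Hypothetical data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6 = 2/3.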
2021-02-26T00:28:11.214303 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

2021-02-26T10:10:02.611738 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
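For orientation, the bias correction can be sketched from a contingency table of observed counts. The function name `cramers_v_corrected` and the example tables are assumptions for illustration; the formula follows the commonly cited form of Bergsma's correction and is not necessarily identical to the code that produced this report.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of observed counts.

    Assumes every row and column total is positive and n > 1.
    """
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]
    n = sum(row_totals)
    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    phi2 = chi2 / n
    # Correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Hypothetical tables: no association versus perfect association.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[20, 0], [0, 20]])
```

The independent table yields 0 and the perfectly associated table yields 1, matching the range described above.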

Missing values

2021-02-26T00:28:04.876701 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

2021-02-26T10:09:55.809530 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
A simple visualization of nullity by column.
2021-02-26T00:28:05.198514 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
2021-02-26T10:09:56.168557 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
2021-02-26T00:28:05.517995 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
2021-02-26T10:09:56.550469 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
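One way to read "nullity correlation" is as the Pearson correlation between the presence indicators of two columns. The sketch below works under that assumption; the function names and the use of `None` or NaN to mark missing values are illustrative choices, not taken from the report.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'value is present' indicators of two columns."""
    a = [0.0 if is_missing(v) else 1.0 for v in col_a]
    b = [0.0 if is_missing(v) else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # a fully present or fully missing column has no defined correlation
    return cov / math.sqrt(var_a * var_b)


# Hypothetical columns: one is filled in exactly where the other is missing.
salary = [6500.0, None, 2475.0, None]
employer = [None, "Questrade", None, "Questrade"]
corr_opposite = nullity_correlation(salary, employer)
corr_same = nullity_correlation(salary, [1, None, 3, None])
```

A value of -1 means one column is present exactly when the other is missing, +1 means the two columns are present or missing together, and values near 0 mean the missingness patterns are unrelated.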
2021-02-26T00:28:05.814492 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
2021-02-26T10:09:56.877377 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

index | Timestamp | Kaupunki | Ikä | Sukupuoli | Työkokemus | Työsuhteen luonne | Työaika | Rooli | Etä | Kuukausipalkka | Vuositulot | Kilpailukykyinen | Työpaikka | Vapaa sana | Kk-tulot
0 | 2021-02-15 11:57:08.316 | PK-Seutu | 33 | NaN | 10.0 | Työntekijä / palkollinen | 1.0 | Arkkitehti | 50/50 | 6500.0 | 83000.0 | True | NaN | NaN | 6916.666667
1 | 2021-02-15 11:57:19.676 | Turku | 33 | mies | 14.0 | Työntekijä / palkollinen | 1.0 | full-stack | Etä | 5000.0 | 62500.0 | True | NaN | NaN | 5208.333333
2 | 2021-02-15 11:58:03.592 | PK-Seutu | 28 | mies | 2.0 | Työntekijä / palkollinen | 1.0 | Full-stack ohjelmistokehittäjä | Etä | 2475.0 | 30000.0 | False | NaN | NaN | 2500.000000
3 | 2021-02-15 11:58:15.261 | Tampere | 33 | mies | 22.0 | Yrittäjä | 1.0 | web-arkkitehti | Etä | 4300.0 | 100000.0 | True | NaN | NaN | 8333.333333
4 | 2021-02-15 11:58:16.983 | PK-Seutu | 28 | mies | 2.0 | Työntekijä / palkollinen | 1.0 | Ohjelmistokehittäjä | Etä | 3000.0 | 37500.0 | False | NaN | NaN | 3125.000000
5 | 2021-02-15 11:58:49.454 | PK-Seutu | 43 | mies | 23.0 | Työntekijä / palkollinen | 1.0 | Ohjelmistokehittäjä | Toimisto | 8000.0 | 100000.0 | True | NaN | NaN | 8333.333333
6 | 2021-02-15 12:00:03.771 | PK-Seutu | 33 | mies | 10.0 | Freelancer | 1.0 | Ohjelmistokehittäjä | Etä | 6000.0 | 140000.0 | True | NaN | NaN | 11666.666667
7 | 2021-02-15 12:00:04.655 | Tampere | 33 | NaN | 10.0 | Työntekijä / palkollinen | 1.0 | Ohjelmistokehittäjä | Toimisto | 4250.0 | 54000.0 | True | NaN | NaN | 4500.000000
8 | 2021-02-15 12:01:00.769 | Tampere | 33 | mies | 6.0 | Työntekijä / palkollinen | 1.0 | Lead developer | Toimisto | 4000.0 | 50000.0 | False | NaN | NaN | 4166.666667
9 | 2021-02-15 12:02:03.577 | Tallinna | 33 | mies | 12.0 | Freelancer | 1.0 | NaN | Etä | NaN | 200000.0 | True | Questrade | NaN | 16666.666667

Last rows

index | Timestamp | Kaupunki | Ikä | Sukupuoli | Työkokemus | Työsuhteen luonne | Työaika | Rooli | Etä | Kuukausipalkka | Vuositulot | Kilpailukykyinen | Työpaikka | Vapaa sana | Kk-tulot
481 | 2021-02-24 16:09:32.939 | PK-Seutu | 33 | mies | 2.0 | Työntekijä / palkollinen | 1.0 | Ohjelmistokehittäjä | Etä | 3500.0 | 43750.0 | True | NaN | NaN | 3645.833333
482 | 2021-02-24 17:28:57.097 | Kuopio | 23 | mies | 2.0 | Työntekijä / palkollinen | 1.0 | frontend | 50/50 | 2900.0 | 36000.0 | False | NaN | NaN | 3000.000000
483 | 2021-02-24 23:49:22.242 | Tampere | 28 | mies | 2.0 | Työntekijä / palkollinen | 1.0 | Ohjelmistokehittäjä (frontend) | Etä | 2860.0 | 35850.0 | False | NaN | NaN | 2987.500000
484 | 2021-02-25 09:34:48.368 | Kuopio | 33 | mies | 6.0 | Työntekijä / palkollinen | 1.0 | Ohjelmistokehittäjä, Tech Lead | Toimisto | 4500.0 | 56250.0 | True | NaN | NaN | 4687.500000
485 | 2021-02-25 10:53:41.881 | PK-Seutu | 28 | mies | 6.0 | Työntekijä / palkollinen | 1.0 | full-stack | 50/50 | 5100.0 | 64000.0 | False | NaN | NaN | 5333.333333
486 | 2021-02-25 11:09:16.999 | PK-Seutu | 28 | mies | 3.0 | Työntekijä / palkollinen | 1.0 | Fullstack ja pientä devops tunkkia | Toimisto | 3500.0 | 44000.0 | False | NaN | NaN | 3666.666667
487 | 2021-02-25 11:10:42.322 | NaN | 33 | NaN | 15.0 | Työntekijä / palkollinen | 1.0 | NaN | Toimisto | 5200.0 | 68000.0 | False | NaN | NaN | 5666.666667
488 | 2021-02-25 12:33:58.490 | PK-Seutu | 28 | mies | 5.0 | Työntekijä / palkollinen | 1.0 | Full-stack developer | Toimisto | 5500.0 | 68000.0 | True | NaN | NaN | 5666.666667
489 | 2021-02-25 14:10:32.597 | Tampere | 23 | muu | 1.0 | Työntekijä / palkollinen | 0.5 | Systems Administrator ja firmän sisäinen 1st line -tukihessu | Toimisto | 1081.0 | 14000.0 | True | NaN | Kk-palkkani on varsinkin vaihteleva, koska riippuu vuorolisistä (mahdollisista pyhä- ja yövuoroista ja tuurauksista). Jonkinlaisen oletuksen nyt yritin lyödä vuositulolle, mutta taitaa jäädä todellisuudessa hivenen sen alle. | 1166.666667
490 | 2021-02-25 21:17:36.323 | PK-Seutu | 33 | mies | 10.0 | Työntekijä / palkollinen | 1.0 | Full-stack ohjemistokehittäjä | Toimisto | 4600.0 | 58000.0 | True | NaN | NaN | 4833.333333