Overview

Dataset statistics

Number of variables: 24
Number of observations: 684
Missing cells: 6815
Missing cells (%): 41.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 111.6 KiB
Average record size in memory: 167.1 B
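
The two derived figures in this table can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and that the average record size is the total memory divided by the number of observations; both assumptions reproduce the reported values, but the report itself does not state the formulas.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 24
n_observations = 684
missing_cells = 6815
total_size_kib = 111.6

# Assumption: percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total bytes / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```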

Variable types

DateTime: 1
Categorical: 13
Numeric: 8
Unsupported: 1
Boolean: 1

Alerts

Palvelut has a high cardinality: 52 distinct values High cardinality
Työpaikka has a high cardinality: 89 distinct values High cardinality
Rooli has a high cardinality: 270 distinct values High cardinality
Työkokemus is highly correlated with Kuukausipalkka and 2 other fields High correlation
Tuntilaskutus (ALV 0%, euroina) is highly correlated with Vuosilaskutus (ALV 0%, euroina) High correlation
Vuosilaskutus (ALV 0%, euroina) is highly correlated with Tuntilaskutus (ALV 0%, euroina) High correlation
Kuukausipalkka is highly correlated with Vuositulot and 1 other fields High correlation
Vuositulot is highly correlated with Kuukausipalkka and 1 other fields High correlation
Kk-tulot is highly correlated with Kuukausipalkka and 1 other fields High correlation
Sukupuoli has 53 (7.7%) missing values Missing
Montako vuotta olet tehnyt laskuttavaa työtä alalla? has 617 (90.2%) missing values Missing
Palvelut has 618 (90.4%) missing values Missing
Tuntilaskutus (ALV 0%, euroina) has 625 (91.4%) missing values Missing
Vuosilaskutus (ALV 0%, euroina) has 622 (90.9%) missing values Missing
Hankitko asiakkaasi itse suoraan vai käytätkö välitysfirmojen palveluita? has 616 (90.1%) missing values Missing
Mistä asiakkaat ovat? has 616 (90.1%) missing values Missing
Työpaikka has 535 (78.2%) missing values Missing
Kaupunki has 80 (11.7%) missing values Missing
Millaisessa yrityksessä työskentelet has 75 (11.0%) missing values Missing
Työaika has 72 (10.5%) missing values Missing
Rooli has 90 (13.2%) missing values Missing
Etä has 80 (11.7%) missing values Missing
Kuukausipalkka has 77 (11.3%) missing values Missing
Vuositulot has 79 (11.5%) missing values Missing
Vapaa kuvaus kokonaiskompensaatiomallista has 498 (72.8%) missing values Missing
Kilpailukykyinen has 80 (11.7%) missing values Missing
Vapaa sana has 643 (94.0%) missing values Missing
Ideoita ensi vuoden kyselyyn has 653 (95.5%) missing values Missing
Kk-tulot has 79 (11.5%) missing values Missing
Vapaa sana is uniformly distributed Uniform
Ideoita ensi vuoden kyselyyn is uniformly distributed Uniform
Timestamp has unique values Unique
Vapaa kuvaus kokonaiskompensaatiomallista is an unsupported type, check if it needs cleaning or further analysis Unsupported
Työkokemus has 13 (1.9%) zeros Zeros

Reproduction

Analysis started: 2022-10-10 09:08:57.716318
Analysis finished: 2022-10-10 09:09:07.984874
Duration: 10.27 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
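
As a quick consistency check, the reported duration follows from the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2022-10-10 09:08:57.716318")
finished = datetime.fromisoformat("2022-10-10 09:09:07.984874")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```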

Variables

Timestamp
Date

UNIQUE

Distinct: 684
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB
Minimum: 2022-09-26 16:35:50.002000
Maximum: 2022-10-10 07:49:49.204000
Histogram with fixed size bins (bins=50)
Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB
Palkansaaja  616
Laskuttaja   68

Length

Max length11
Median length11
Mean length10.9005848
Min length10

Characters and Unicode

Total characters7456
Distinct characters10
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPalkansaaja
2nd rowPalkansaaja
3rd rowPalkansaaja
4th rowPalkansaaja
5th rowLaskuttaja

Common Values

Value        Count  Frequency (%)
Palkansaaja  616    90.1%
Laskuttaja   68     9.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
palkansaaja616
90.1%
laskuttaja68
 
9.9%

Most occurring characters

ValueCountFrequency (%)
a3284
44.0%
k684
 
9.2%
s684
 
9.2%
j684
 
9.2%
P616
 
8.3%
l616
 
8.3%
n616
 
8.3%
t136
 
1.8%
L68
 
0.9%
u68
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter6772
90.8%
Uppercase Letter684
 
9.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a3284
48.5%
k684
 
10.1%
s684
 
10.1%
j684
 
10.1%
l616
 
9.1%
n616
 
9.1%
t136
 
2.0%
u68
 
1.0%
Uppercase Letter
ValueCountFrequency (%)
P616
90.1%
L68
 
9.9%

Most occurring scripts

ValueCountFrequency (%)
Latin7456
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a3284
44.0%
k684
 
9.2%
s684
 
9.2%
j684
 
9.2%
P616
 
8.3%
l616
 
8.3%
n616
 
8.3%
t136
 
1.8%
L68
 
0.9%
u68
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII7456
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a3284
44.0%
k684
 
9.2%
s684
 
9.2%
j684
 
9.2%
P616
 
8.3%
l616
 
8.3%
n616
 
8.3%
t136
 
1.8%
L68
 
0.9%
u68
 
0.9%

Ikä
Categorical

Distinct: 8
Distinct (%): 1.2%
Missing: 3
Missing (%): 0.4%
Memory size: 1.1 KiB
33                202
38                196
28                135
43                93
48                25
Other values (3)  30

Length

Max length6
Median length2
Mean length2.005873715
Min length2

Characters and Unicode

Total characters1366
Distinct characters8
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)0.1%

Sample

1st row33
2nd row33
3rd row33
4th row38
5th row28

Common Values

Value      Count  Frequency (%)
33         202    29.5%
38         196    28.7%
28         135    19.7%
43         93     13.6%
48         25     3.7%
23         24     3.5%
53         5      0.7%
> 55 v     1      0.1%
(Missing)  3      0.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
33202
29.6%
38196
28.7%
28135
19.8%
4393
13.6%
4825
 
3.7%
2324
 
3.5%
535
 
0.7%
1
 
0.1%
551
 
0.1%
v1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
3722
52.9%
8356
26.1%
2159
 
11.6%
4118
 
8.6%
57
 
0.5%
2
 
0.1%
>1
 
0.1%
v1
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1362
99.7%
Space Separator2
 
0.1%
Math Symbol1
 
0.1%
Lowercase Letter1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3722
53.0%
8356
26.1%
2159
 
11.7%
4118
 
8.7%
57
 
0.5%
Space Separator
ValueCountFrequency (%)
2
100.0%
Math Symbol
ValueCountFrequency (%)
>1
100.0%
Lowercase Letter
ValueCountFrequency (%)
v1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1365
99.9%
Latin1
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3722
52.9%
8356
26.1%
2159
 
11.6%
4118
 
8.6%
57
 
0.5%
2
 
0.1%
>1
 
0.1%
Latin
ValueCountFrequency (%)
v1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1366
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3722
52.9%
8356
26.1%
2159
 
11.6%
4118
 
8.6%
57
 
0.5%
2
 
0.1%
>1
 
0.1%
v1
 
0.1%

Sukupuoli
Categorical

MISSING

Distinct: 3
Distinct (%): 0.5%
Missing: 53
Missing (%): 7.7%
Memory size: 944.0 B
mies    548
nainen  72
muu     11

Length

Max length6
Median length4
Mean length4.210776545
Min length3

Characters and Unicode

Total characters2657
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmies
2nd rowmies
3rd rowmies
4th rowmies
5th rowmies

Common Values

Value      Count  Frequency (%)
mies       548    80.1%
nainen     72     10.5%
muu        11     1.6%
(Missing)  53     7.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count  Frequency (%)
mies    548    86.8%
nainen  72     11.4%
muu     11     1.7%
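
The Sukupuoli percentages differ between the Common Values table (80.1% for mies) and the Category Frequency Plot table (86.8%). The sketch below shows one reading that reproduces both sets of figures: the first divides by all 684 observations, the second only by the 631 non-missing answers. The choice of denominators is inferred from the numbers, not stated in the report.

```python
# Counts copied from the Sukupuoli tables.
counts = {"mies": 548, "nainen": 72, "muu": 11}
n_observations = 684
n_missing = 53
n_present = n_observations - n_missing  # answers actually given


def pct(count, total):
    """Percentage of total, rounded to one decimal as in the report."""
    return round(100 * count / total, 1)


# Denominator = all rows (assumed for the Common Values table).
share_of_all = {k: pct(v, n_observations) for k, v in counts.items()}
# Denominator = non-missing rows (assumed for the frequency plot table).
share_of_present = {k: pct(v, n_present) for k, v in counts.items()}

print(share_of_all)
print(share_of_present)
```

When comparing groups, the non-missing denominator is usually the more meaningful one, since the 53 unanswered rows say nothing about the distribution itself.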

Most occurring characters

ValueCountFrequency (%)
i620
23.3%
e620
23.3%
m559
21.0%
s548
20.6%
n216
 
8.1%
a72
 
2.7%
u22
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2657
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i620
23.3%
e620
23.3%
m559
21.0%
s548
20.6%
n216
 
8.1%
a72
 
2.7%
u22
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Latin2657
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i620
23.3%
e620
23.3%
m559
21.0%
s548
20.6%
n216
 
8.1%
a72
 
2.7%
u22
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII2657
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i620
23.3%
e620
23.3%
m559
21.0%
s548
20.6%
n216
 
8.1%
a72
 
2.7%
u22
 
0.8%

Työkokemus
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 30
Distinct (%): 4.4%
Missing: 4
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.2
Minimum: 0
Maximum: 31
Zeros: 13
Zeros (%): 1.9%
Negative: 0
Negative (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 5
median: 10
Q3: 15
95-th percentile: 22
Maximum: 31
Range: 31
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.165942852
Coefficient of variation (CV): 0.6045042012
Kurtosis: -0.2905675607
Mean: 10.2
Median Absolute Deviation (MAD): 5
Skewness: 0.5327745524
Sum: 6936
Variance: 38.01885125
Monotonicity: Not monotonic
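
Several of the Työkokemus statistics are derived from one another, which gives a simple way to sanity-check the table. The sketch below recomputes them from the reported figures using the standard definitions (mean = sum / count, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). The non-missing count of 680 is taken as 684 observations minus 4 missing.

```python
# Figures copied from the Työkokemus summary and statistics tables.
n_observations = 684
n_missing = 4
total = 6936
std = 6.165942852
q1, q3 = 5, 15
minimum, maximum = 0, 31

n_present = n_observations - n_missing

mean = total / n_present   # reported: 10.2
cv = std / mean            # reported: 0.6045042012
variance = std ** 2        # reported: 38.01885125
iqr = q3 - q1              # reported: 10
value_range = maximum - minimum  # reported: 31

print(f"mean={mean}, cv={cv:.10f}, variance={variance:.8f}")
print(f"iqr={iqr}, range={value_range}")
```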
Histogram with fixed size bins (bins=30)
Value              Count  Frequency (%)
10                 58     8.5%
5                  54     7.9%
12                 49     7.2%
15                 49     7.2%
8                  47     6.9%
7                  43     6.3%
4                  41     6.0%
6                  35     5.1%
2                  29     4.2%
14                 26     3.8%
Other values (20)  249    36.4%

Value  Count  Frequency (%)
0      13     1.9%
1      23     3.4%
2      29     4.2%
3      24     3.5%
4      41     6.0%
5      54     7.9%
6      35     5.1%
7      43     6.3%
8      47     6.9%
9      22     3.2%

Value  Count  Frequency (%)
31     1      0.1%
28     1      0.1%
27     1      0.1%
26     2      0.3%
25     8      1.2%
24     5      0.7%
23     9      1.3%
22     17     2.5%
21     4      0.6%
20     19     2.8%

Montako vuotta olet tehnyt laskuttavaa työtä alalla?
Real number (ℝ≥0)

MISSING

Distinct18
Distinct (%)26.9%
Missing617
Missing (%)90.2%
Infinite0
Infinite (%)0.0%
Mean3.582089552
Minimum0
Maximum16
Zeros3
Zeros (%)0.4%
Negative0
Negative (%)0.0%
Memory size5.5 KiB

Quantile statistics

Minimum0
5-th percentile0.5
Q11
median2
Q34
95-th percentile11.7
Maximum16
Range16
Interquartile range (IQR)3

Descriptive statistics

Standard deviation3.704621413
Coefficient of variation (CV)1.034206811
Kurtosis2.474208291
Mean3.582089552
Median Absolute Deviation (MAD)1
Skewness1.731398046
Sum240
Variance13.72421981
MonotonicityNot monotonic
Histogram with fixed size bins (bins=18)
Value             Count  Frequency (%)
1                 18     2.6%
2                 11     1.6%
4                 7      1.0%
3                 6      0.9%
5                 4      0.6%
0                 3      0.4%
1.5               3      0.4%
8                 3      0.4%
0.5               2      0.3%
10                2      0.3%
Other values (8)  8      1.2%
(Missing)         617    90.2%

Value  Count  Frequency (%)
0      3      0.4%
0.5    2      0.3%
1      18     2.6%
1.5    3      0.4%
2      11     1.6%
2.5    1      0.1%
3      6      0.9%
4      7      1.0%
5      4      0.6%
6      1      0.1%

Value  Count  Frequency (%)
16     1      0.1%
15     1      0.1%
13     1      0.1%
12     1      0.1%
11     1      0.1%
10     2      0.3%
9      1      0.1%
8      3      0.4%
6      1      0.1%
5      4      0.6%

Palvelut
Categorical

HIGH CARDINALITY
MISSING

Distinct52
Distinct (%)78.8%
Missing618
Missing (%)90.4%
Memory size5.5 KiB
Full stack
14 
Softadevausta
 
2
Arkkitehti/projektipäällikkö
 
1
Softadevaus, data engineering
 
1
Backend devops
 
1
Other values (47)
47 

Length

Max length130
Median length50
Mean length27.60606061
Min length3

Characters and Unicode

Total characters1822
Distinct characters53
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique50 ?
Unique (%)75.8%

Sample

1st rowData-analytiikka, Arkkitehtuuri, Data Engineering,
2nd rowFullstack
3rd rowFull-stack developer ja arkkitehti
4th rowFull stack
5th rowDevausta ja projarointia

Common Values

Value                                      Count  Frequency (%)
Full stack                                 14     2.0%
Softadevausta                              2      0.3%
Arkkitehti/projektipäällikkö               1      0.1%
Softadevaus, data engineering              1      0.1%
Backend devops                             1      0.1%
Full stack web-ohjelmointia                1      0.1%
Full-stack, lead developer                 1      0.1%
Frontend, full stack                       1      0.1%
Team Lead, projekti- ja tuotekonsultointi  1      0.1%
Full stack web ja mobiili                  1      0.1%
Other values (42)                          42     6.1%
(Missing)                                  618    90.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
full37
 
16.3%
stack37
 
16.3%
ja13
 
5.7%
devops9
 
4.0%
backend7
 
3.1%
arkkitehtuuria6
 
2.6%
6
 
2.6%
frontend5
 
2.2%
softadevausta5
 
2.2%
arkkitehtuuri5
 
2.2%
Other values (73)97
42.7%

Most occurring characters

ValueCountFrequency (%)
a175
 
9.6%
t167
 
9.2%
162
 
8.9%
l127
 
7.0%
e116
 
6.4%
k109
 
6.0%
u106
 
5.8%
s100
 
5.5%
i97
 
5.3%
o67
 
3.7%
Other values (43)596
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1502
82.4%
Space Separator162
 
8.9%
Uppercase Letter88
 
4.8%
Other Punctuation58
 
3.2%
Dash Punctuation8
 
0.4%
Decimal Number1
 
0.1%
Open Punctuation1
 
0.1%
Close Punctuation1
 
0.1%
Math Symbol1
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a175
11.7%
t167
11.1%
l127
 
8.5%
e116
 
7.7%
k109
 
7.3%
u106
 
7.1%
s100
 
6.7%
i97
 
6.5%
o67
 
4.5%
n65
 
4.3%
Other values (17)373
24.8%
Uppercase Letter
ValueCountFrequency (%)
F37
42.0%
S10
 
11.4%
D8
 
9.1%
B6
 
6.8%
A4
 
4.5%
C4
 
4.5%
E3
 
3.4%
O3
 
3.4%
W3
 
3.4%
T3
 
3.4%
Other values (5)7
 
8.0%
Other Punctuation
ValueCountFrequency (%)
,47
81.0%
.4
 
6.9%
/4
 
6.9%
&2
 
3.4%
:1
 
1.7%
Space Separator
ValueCountFrequency (%)
162
100.0%
Dash Punctuation
ValueCountFrequency (%)
-8
100.0%
Decimal Number
ValueCountFrequency (%)
31
100.0%
Open Punctuation
ValueCountFrequency (%)
(1
100.0%
Close Punctuation
ValueCountFrequency (%)
)1
100.0%
Math Symbol
ValueCountFrequency (%)
+1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1590
87.3%
Common232
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a175
 
11.0%
t167
 
10.5%
l127
 
8.0%
e116
 
7.3%
k109
 
6.9%
u106
 
6.7%
s100
 
6.3%
i97
 
6.1%
o67
 
4.2%
n65
 
4.1%
Other values (32)461
29.0%
Common
ValueCountFrequency (%)
162
69.8%
,47
 
20.3%
-8
 
3.4%
.4
 
1.7%
/4
 
1.7%
&2
 
0.9%
31
 
0.4%
(1
 
0.4%
)1
 
0.4%
+1
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII1799
98.7%
None23
 
1.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a175
 
9.7%
t167
 
9.3%
162
 
9.0%
l127
 
7.1%
e116
 
6.4%
k109
 
6.1%
u106
 
5.9%
s100
 
5.6%
i97
 
5.4%
o67
 
3.7%
Other values (41)573
31.9%
None
ValueCountFrequency (%)
ä20
87.0%
ö3
 
13.0%

Tuntilaskutus (ALV 0%, euroina)
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct25
Distinct (%)42.4%
Missing625
Missing (%)91.4%
Infinite0
Infinite (%)0.0%
Mean93.5
Minimum50
Maximum170
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size5.5 KiB

Quantile statistics

Minimum50
5-th percentile65
Q180
median90
Q399.5
95-th percentile132
Maximum170
Range120
Interquartile range (IQR)19.5

Descriptive statistics

Standard deviation21.63908501
Coefficient of variation (CV)0.2314340643
Kurtosis2.634267697
Mean93.5
Median Absolute Deviation (MAD)10
Skewness1.312304188
Sum5516.5
Variance468.25
MonotonicityNot monotonic
Histogram with fixed size bins (bins=25)
Value              Count  Frequency (%)
80                 11     1.6%
90                 10     1.5%
85                 6      0.9%
120                5      0.7%
95                 3      0.4%
105                2      0.3%
88                 2      0.3%
150                2      0.3%
65                 2      0.3%
98                 1      0.1%
Other values (15)  15     2.2%
(Missing)          625    91.4%

Value  Count  Frequency (%)
50     1      0.1%
60     1      0.1%
65     2      0.3%
70     1      0.1%
72     1      0.1%
76     1      0.1%
80     11     1.6%
84     1      0.1%
85     6      0.9%
86     1      0.1%

Value  Count  Frequency (%)
170    1      0.1%
150    2      0.3%
130    1      0.1%
120    5      0.7%
116    1      0.1%
110    1      0.1%
107.5  1      0.1%
105    2      0.3%
100    1      0.1%
99     1      0.1%

Vuosilaskutus (ALV 0%, euroina)
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct34
Distinct (%)54.8%
Missing622
Missing (%)90.9%
Infinite0
Infinite (%)0.0%
Mean134460.1613
Minimum0
Maximum300000
Zeros1
Zeros (%)0.1%
Negative0
Negative (%)0.0%
Memory size5.5 KiB

Quantile statistics

Minimum0
5-th percentile32000
Q1112500
median138000
Q3160000
95-th percentile200000
Maximum300000
Range300000
Interquartile range (IQR)47500

Descriptive statistics

Standard deviation51161.92272
Coefficient of variation (CV)0.3804987457
Kurtosis2.057827297
Mean134460.1613
Median Absolute Deviation (MAD)22000
Skewness-0.066051067
Sum8336530
Variance2617542336
MonotonicityNot monotonic
Histogram with fixed size bins (bins=34)
Value              Count  Frequency (%)
150000             5      0.7%
140000             5      0.7%
125000             4      0.6%
120000             4      0.6%
160000             4      0.6%
135000             4      0.6%
180000             3      0.4%
100000             2      0.3%
200000             2      0.3%
145000             2      0.3%
Other values (24)  27     3.9%
(Missing)          622    90.9%

Value  Count  Frequency (%)
0      1      0.1%
30     1      0.1%
29500  1      0.1%
30000  1      0.1%
70000  1      0.1%
75000  1      0.1%
80000  2      0.3%
84000  1      0.1%
93000  1      0.1%
95000  1      0.1%

Value   Count  Frequency (%)
300000  1      0.1%
236000  1      0.1%
230000  1      0.1%
200000  2      0.3%
190000  2      0.3%
180000  3      0.4%
170000  2      0.3%
166000  1      0.1%
160000  4      0.6%
155000  1      0.1%

Hankitko asiakkaasi itse suoraan vai käytätkö välitysfirmojen palveluita?
Categorical

MISSING

Distinct5
Distinct (%)7.4%
Missing616
Missing (%)90.1%
Memory size5.5 KiB
Käytän välitysfirmoja
27 
Itse
26 
Itse, Käytän välitysfirmoja
13 
Itse, Verkosto
 
1
Käytän välitysfirmoja, LinkedIn:istä tullut suoraan monta kyselyä, nykyinenkin projekti
 
1

Length

Max length87
Median length27
Mean length16.51470588
Min length4

Characters and Unicode

Total characters1123
Distinct characters26
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)2.9%

Sample

1st rowKäytän välitysfirmoja
2nd rowItse
3rd rowKäytän välitysfirmoja
4th rowKäytän välitysfirmoja
5th rowItse

Common Values

Value                        Count  Frequency (%)
Käytän välitysfirmoja        27     3.9%
Itse                         26     3.8%
Itse, Käytän välitysfirmoja  13     1.9%
Itse, Verkosto               1      0.1%
Käytän välitysfirmoja, LinkedIn:istä tullut suoraan monta kyselyä, nykyinenkin projekti  1  0.1%
(Missing)                    616    90.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
käytän41
31.5%
välitysfirmoja41
31.5%
itse40
30.8%
verkosto1
 
0.8%
linkedin:istä1
 
0.8%
tullut1
 
0.8%
suoraan1
 
0.8%
monta1
 
0.8%
kyselyä1
 
0.8%
nykyinenkin1
 
0.8%

Most occurring characters

ValueCountFrequency (%)
t128
 
11.4%
ä125
 
11.1%
i87
 
7.7%
y86
 
7.7%
s85
 
7.6%
62
 
5.5%
n49
 
4.4%
o46
 
4.1%
e45
 
4.0%
a44
 
3.9%
Other values (16)366
32.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter960
85.5%
Uppercase Letter84
 
7.5%
Space Separator62
 
5.5%
Other Punctuation17
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t128
13.3%
ä125
13.0%
i87
 
9.1%
y86
 
9.0%
s85
 
8.9%
n49
 
5.1%
o46
 
4.8%
e45
 
4.7%
a44
 
4.6%
l44
 
4.6%
Other values (9)221
23.0%
Uppercase Letter
ValueCountFrequency (%)
I41
48.8%
K41
48.8%
V1
 
1.2%
L1
 
1.2%
Other Punctuation
ValueCountFrequency (%)
,16
94.1%
:1
 
5.9%
Space Separator
ValueCountFrequency (%)
62
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1044
93.0%
Common79
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t128
12.3%
ä125
12.0%
i87
 
8.3%
y86
 
8.2%
s85
 
8.1%
n49
 
4.7%
o46
 
4.4%
e45
 
4.3%
a44
 
4.2%
l44
 
4.2%
Other values (13)305
29.2%
Common
ValueCountFrequency (%)
62
78.5%
,16
 
20.3%
:1
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII998
88.9%
None125
 
11.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t128
 
12.8%
i87
 
8.7%
y86
 
8.6%
s85
 
8.5%
62
 
6.2%
n49
 
4.9%
o46
 
4.6%
e45
 
4.5%
a44
 
4.4%
l44
 
4.4%
Other values (15)322
32.3%
None
ValueCountFrequency (%)
ä125
100.0%

Mistä asiakkaat ovat?
Categorical

MISSING

Distinct3
Distinct (%)4.4%
Missing616
Missing (%)90.1%
Memory size5.5 KiB
Suomesta
45 
Suomesta, Ulkomailta
12 
Ulkomailta
11 

Length

Max length20
Median length8
Mean length10.44117647
Min length8

Characters and Unicode

Total characters710
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSuomesta
2nd rowSuomesta, Ulkomailta
3rd rowSuomesta
4th rowSuomesta
5th rowSuomesta

Common Values

Value                 Count  Frequency (%)
Suomesta              45     6.6%
Suomesta, Ulkomailta  12     1.8%
Ulkomailta            11     1.6%
(Missing)             616    90.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
suomesta57
71.2%
ulkomailta23
28.7%

Most occurring characters

ValueCountFrequency (%)
a103
14.5%
o80
11.3%
m80
11.3%
t80
11.3%
S57
8.0%
u57
8.0%
e57
8.0%
s57
8.0%
l46
6.5%
U23
 
3.2%
Other values (4)70
9.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter606
85.4%
Uppercase Letter80
 
11.3%
Other Punctuation12
 
1.7%
Space Separator12
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a103
17.0%
o80
13.2%
m80
13.2%
t80
13.2%
u57
9.4%
e57
9.4%
s57
9.4%
l46
7.6%
k23
 
3.8%
i23
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
S57
71.2%
U23
28.7%
Other Punctuation
ValueCountFrequency (%)
,12
100.0%
Space Separator
ValueCountFrequency (%)
12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin686
96.6%
Common24
 
3.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a103
15.0%
o80
11.7%
m80
11.7%
t80
11.7%
S57
8.3%
u57
8.3%
e57
8.3%
s57
8.3%
l46
6.7%
U23
 
3.4%
Other values (2)46
6.7%
Common
ValueCountFrequency (%)
,12
50.0%
12
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII710
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a103
14.5%
o80
11.3%
m80
11.3%
t80
11.3%
S57
8.0%
u57
8.0%
e57
8.0%
s57
8.0%
l46
6.5%
U23
 
3.2%
Other values (4)70
9.9%

Työpaikka
Categorical

HIGH CARDINALITY
MISSING

Distinct89
Distinct (%)59.7%
Missing535
Missing (%)78.2%
Memory size5.5 KiB
Reaktor
13 
Vincit
 
10
Gofore
 
6
Mavericks
 
6
Fraktio
 
4
Other values (84)
110 

Length

Max length43
Median length29
Mean length9.181208054
Min length1

Characters and Unicode

Total characters1368
Distinct characters57
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique69 ?
Unique (%)46.3%

Sample

1st row-
2nd rowVisma
3rd rowTalenom
4th rowGofore
5th rowVincit

Common Values

Value              Count  Frequency (%)
Reaktor            13     1.9%
Vincit             10     1.5%
Gofore             6      0.9%
Mavericks          6      0.9%
Fraktio            4      0.6%
Futurice           4      0.6%
Mehiläinen         4      0.6%
Wolt               4      0.6%
Exove              3      0.4%
Visma              3      0.4%
Other values (79)  92     13.5%
(Missing)          535    78.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
reaktor13
 
6.8%
vincit10
 
5.3%
mavericks9
 
4.7%
siili7
 
3.7%
gofore6
 
3.2%
futurice5
 
2.6%
fraktio4
 
2.1%
mehiläinen4
 
2.1%
wolt4
 
2.1%
compile3
 
1.6%
Other values (109)125
65.8%

Most occurring characters

ValueCountFrequency (%)
i150
 
11.0%
e115
 
8.4%
o109
 
8.0%
t98
 
7.2%
a90
 
6.6%
r75
 
5.5%
n72
 
5.3%
l65
 
4.8%
u50
 
3.7%
k47
 
3.4%
Other values (47)497
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1130
82.6%
Uppercase Letter184
 
13.5%
Space Separator43
 
3.1%
Dash Punctuation6
 
0.4%
Other Punctuation3
 
0.2%
Decimal Number2
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i150
13.3%
e115
10.2%
o109
9.6%
t98
 
8.7%
a90
 
8.0%
r75
 
6.6%
n72
 
6.4%
l65
 
5.8%
u50
 
4.4%
k47
 
4.2%
Other values (18)259
22.9%
Uppercase Letter
ValueCountFrequency (%)
S28
15.2%
M20
10.9%
V17
9.2%
F16
 
8.7%
R15
 
8.2%
G11
 
6.0%
A11
 
6.0%
C10
 
5.4%
W8
 
4.3%
H6
 
3.3%
Other values (13)42
22.8%
Other Punctuation
ValueCountFrequency (%)
.2
66.7%
,1
33.3%
Decimal Number
ValueCountFrequency (%)
11
50.0%
21
50.0%
Space Separator
ValueCountFrequency (%)
43
100.0%
Dash Punctuation
ValueCountFrequency (%)
-6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1314
96.1%
Common54
 
3.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i150
 
11.4%
e115
 
8.8%
o109
 
8.3%
t98
 
7.5%
a90
 
6.8%
r75
 
5.7%
n72
 
5.5%
l65
 
4.9%
u50
 
3.8%
k47
 
3.6%
Other values (41)443
33.7%
Common
ValueCountFrequency (%)
43
79.6%
-6
 
11.1%
.2
 
3.7%
11
 
1.9%
,1
 
1.9%
21
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII1360
99.4%
None8
 
0.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i150
 
11.0%
e115
 
8.5%
o109
 
8.0%
t98
 
7.2%
a90
 
6.6%
r75
 
5.5%
n72
 
5.3%
l65
 
4.8%
u50
 
3.7%
k47
 
3.5%
Other values (45)489
36.0%
None
ValueCountFrequency (%)
ä5
62.5%
ö3
37.5%

Kaupunki
Categorical

MISSING

Distinct34
Distinct (%)5.6%
Missing80
Missing (%)11.7%
Memory size2.1 KiB
PK-Seutu
321 
Tampere
122 
Turku
67 
Oulu
33 
Jyväskylä
 
14
Other values (29)
47 

Length

Max length42
Median length8
Mean length7.364238411
Min length2

Characters and Unicode

Total characters4448
Distinct characters44
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique22 ?
Unique (%)3.6%

Sample

1st rowPK-Seutu
2nd rowPK-Seutu
3rd rowTurku
4th rowPK-Seutu
5th rowPK-Seutu

Common Values

Value              Count  Frequency (%)
PK-Seutu           321    46.9%
Tampere            122    17.8%
Turku              67     9.8%
Oulu               33     4.8%
Jyväskylä          14     2.0%
Vaasa              7      1.0%
Kuopio             4      0.6%
Pori               4      0.6%
Joensuu            3      0.4%
Lappeenranta       3      0.4%
Other values (24)  26     3.8%
(Missing)          80     11.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
pk-seutu321
52.1%
tampere122
 
19.8%
turku67
 
10.9%
oulu33
 
5.4%
jyväskylä14
 
2.3%
vaasa7
 
1.1%
kuopio4
 
0.6%
pori4
 
0.6%
lappeenranta3
 
0.5%
joensuu3
 
0.5%
Other values (35)38
 
6.2%

Most occurring characters

ValueCountFrequency (%)
u860
19.3%
e592
13.3%
t341
 
7.7%
K331
 
7.4%
P325
 
7.3%
S325
 
7.3%
-323
 
7.3%
r205
 
4.6%
T190
 
4.3%
a184
 
4.1%
Other values (34)772
17.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2850
64.1%
Uppercase Letter1258
28.3%
Dash Punctuation323
 
7.3%
Space Separator12
 
0.3%
Other Punctuation5
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u860
30.2%
e592
20.8%
t341
 
12.0%
r205
 
7.2%
a184
 
6.5%
p139
 
4.9%
m131
 
4.6%
k89
 
3.1%
l58
 
2.0%
ä43
 
1.5%
Other values (14)208
 
7.3%
Uppercase Letter
ValueCountFrequency (%)
K331
26.3%
P325
25.8%
S325
25.8%
T190
15.1%
O33
 
2.6%
J20
 
1.6%
V8
 
0.6%
L7
 
0.6%
E4
 
0.3%
H3
 
0.2%
Other values (7)12
 
1.0%
Dash Punctuation
ValueCountFrequency (%)
-323
100.0%
Space Separator
ValueCountFrequency (%)
12
100.0%
Other Punctuation
ValueCountFrequency (%)
,5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4108
92.4%
Common340
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
u860
20.9%
e592
14.4%
t341
 
8.3%
K331
 
8.1%
P325
 
7.9%
S325
 
7.9%
r205
 
5.0%
T190
 
4.6%
a184
 
4.5%
p139
 
3.4%
Other values (31)616
15.0%
Common
ValueCountFrequency (%)
-323
95.0%
12
 
3.5%
,5
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII4401
98.9%
None47
 
1.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u860
19.5%
e592
13.5%
t341
 
7.7%
K331
 
7.5%
P325
 
7.4%
S325
 
7.4%
-323
 
7.3%
r205
 
4.7%
T190
 
4.3%
a184
 
4.2%
Other values (31)725
16.5%
None
ValueCountFrequency (%)
ä43
91.5%
ü2
 
4.3%
ö2
 
4.3%

Millaisessa yrityksessä työskentelet
Categorical

MISSING

Distinct13
Distinct (%)2.1%
Missing75
Missing (%)11.0%
Memory size5.5 KiB
Konsulttitalossa
290 
Tuotetalossa, jonka core-bisnes on softa
178 
Yrityksessä, jossa softa on tukeva toiminto (esim pankit, terveysala, yms)
109 
Julkinen tai kolmas sektori
 
23
Konsultointia ja omaa softaa
 
1
Other values (8)
 
8

Length

Max length74
Median length44
Mean length34
Min length10

Characters and Unicode

Total characters20706
Distinct characters40
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9 ?
Unique (%)1.5%

Sample

1st rowKonsulttitalossa
2nd rowTuotetalossa, jonka core-bisnes on softa
3rd rowTuotetalossa, jonka core-bisnes on softa
4th rowKonsulttitalossa
5th rowKonsulttitalossa

Common Values

Value                                         Count  Frequency (%)
Konsulttitalossa                              290    42.4%
Tuotetalossa, jonka core-bisnes on softa      178    26.0%
Yrityksessä, jossa softa on tukeva toiminto (esim pankit, terveysala, yms)  109  15.9%
Julkinen tai kolmas sektori                   23     3.4%
Konsultointia ja omaa softaa                  1      0.1%
Mainos/digitoimisto                           1      0.1%
Konsulttitalossa ops-puolella                 1      0.1%
Infrastruktuuri-/kapasiteettipalvelut         1      0.1%
Konsultointi + tuote hybridifirmassa          1      0.1%
Tuotetalo, core-bisnes fyysisissä tuotteissa  1      0.1%
Other values (3)                              3      0.4%
(Missing)                                     75     11.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
konsulttitalossa291
12.2%
on287
12.0%
softa287
12.0%
core-bisnes179
 
7.5%
jonka178
 
7.5%
tuotetalossa178
 
7.5%
esim109
 
4.6%
terveysala109
 
4.6%
pankit109
 
4.6%
yms109
 
4.6%
Other values (27)547
23.0%

Most occurring characters

ValueCountFrequency (%)
s2815
13.6%
o2261
10.9%
t2240
10.8%
a2016
9.7%
1775
 
8.6%
n1206
 
5.8%
e1143
 
5.5%
i1107
 
5.3%
l925
 
4.5%
u613
 
3.0%
Other values (30)4605
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter17412
84.1%
Space Separator1775
 
8.6%
Uppercase Letter609
 
2.9%
Other Punctuation509
 
2.5%
Dash Punctuation182
 
0.9%
Close Punctuation109
 
0.5%
Open Punctuation109
 
0.5%
Math Symbol1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s2815
16.2%
o2261
13.0%
t2240
12.9%
a2016
11.6%
n1206
6.9%
e1143
6.6%
i1107
 
6.4%
l925
 
5.3%
u613
 
3.5%
k579
 
3.3%
Other values (13)2507
14.4%
Uppercase Letter
ValueCountFrequency (%)
K293
48.1%
T179
29.4%
Y109
 
17.9%
J23
 
3.8%
M1
 
0.2%
I1
 
0.2%
U1
 
0.2%
W1
 
0.2%
P1
 
0.2%
Other Punctuation
ValueCountFrequency (%)
,506
99.4%
/2
 
0.4%
.1
 
0.2%
Space Separator
ValueCountFrequency (%)
1775
100.0%
Dash Punctuation
ValueCountFrequency (%)
-182
100.0%
Close Punctuation
ValueCountFrequency (%)
)109
100.0%
Open Punctuation
ValueCountFrequency (%)
(109
100.0%
Math Symbol
ValueCountFrequency (%)
+1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin18021
87.0%
Common2685
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s2815
15.6%
o2261
12.5%
t2240
12.4%
a2016
11.2%
n1206
 
6.7%
e1143
 
6.3%
i1107
 
6.1%
l925
 
5.1%
u613
 
3.4%
k579
 
3.2%
Other values (22)3116
17.3%
Common
ValueCountFrequency (%)
1775
66.1%
,506
 
18.8%
-182
 
6.8%
)109
 
4.1%
(109
 
4.1%
/2
 
0.1%
+1
 
< 0.1%
.1
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 20595 | 99.5%
None | 111 | 0.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
s | 2815 | 13.7%
o | 2261 | 11.0%
t | 2240 | 10.9%
a | 2016 | 9.8%
(space) | 1775 | 8.6%
n | 1206 | 5.9%
e | 1143 | 5.5%
i | 1107 | 5.4%
l | 925 | 4.5%
u | 613 | 3.0%
Other values (29) | 4494 | 21.8%

None
Value | Count | Frequency (%)
ä | 111 | 100.0%

Työaika
Real number (ℝ≥0)

MISSING

Distinct | 6
Distinct (%) | 1.0%
Missing | 72
Missing (%) | 10.5%
Infinite | 0
Infinite (%) | 0.0%
Mean | 0.985130719
Minimum | 0.4
Maximum | 1
Zeros | 0
Zeros (%) | 0.0%
Negative | 0
Negative (%) | 0.0%
Memory size | 5.5 KiB

Quantile statistics

Minimum | 0.4
5-th percentile | 0.8
Q1 | 1
median | 1
Q3 | 1
95-th percentile | 1
Maximum | 1
Range | 0.6
Interquartile range (IQR) | 0

Descriptive statistics

Standard deviation | 0.0654168247
Coefficient of variation (CV) | 0.06640420752
Kurtosis | 34.06705282
Mean | 0.985130719
Median Absolute Deviation (MAD) | 0
Skewness | -5.398413318
Sum | 602.9
Variance | 0.004279360953
Monotonicity | Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
1 | 575 | 84.1%
0.8 | 28 | 4.1%
0.6 | 4 | 0.6%
0.4 | 2 | 0.3%
0.9 | 2 | 0.3%
0.5 | 1 | 0.1%
(Missing) | 72 | 10.5%

Value | Count | Frequency (%)
0.4 | 2 | 0.3%
0.5 | 1 | 0.1%
0.6 | 4 | 0.6%
0.8 | 28 | 4.1%
0.9 | 2 | 0.3%
1 | 575 | 84.1%

Value | Count | Frequency (%)
1 | 575 | 84.1%
0.9 | 2 | 0.3%
0.8 | 28 | 4.1%
0.6 | 4 | 0.6%
0.5 | 1 | 0.1%
0.4 | 2 | 0.3%
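
The Työaika summary statistics can be re-derived from the six value counts alone. The sketch below is a minimal cross-check, assuming the variance is the sample variance (divisor n - 1), which is what reproduces the reported figures; the variable names are illustrative and not part of the report.

```python
import statistics

# Value counts for Työaika as listed in the frequency table (missing rows excluded).
value_counts = {1.0: 575, 0.8: 28, 0.6: 4, 0.4: 2, 0.9: 2, 0.5: 1}

# Expand the counts back into one observation per respondent.
observations = [value for value, count in value_counts.items() for _ in range(count)]

n = len(observations)                         # non-missing answers (684 - 72)
total = sum(observations)                     # compare with the reported Sum
mean = statistics.mean(observations)          # compare with the reported Mean
variance = statistics.variance(observations)  # sample variance, divides by n - 1
std_dev = statistics.stdev(observations)      # square root of the sample variance
cv = std_dev / mean                           # coefficient of variation = std / mean

print(n, round(total, 1), round(mean, 9), round(variance, 12), round(cv, 11))
```

With 575 of 612 answers equal to 1, the quartiles and the median all coincide at 1, which is why the IQR and the MAD are both 0 even though the standard deviation is not.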

Rooli
Categorical

HIGH CARDINALITY
MISSING

Distinct | 270
Distinct (%) | 45.5%
Missing | 90
Missing (%) | 13.2%
Memory size | 5.5 KiB

Full-stack | 88
Ohjelmistokehittäjä | 61
Arkkitehti | 28
Lead developer | 12
Backend | 12
Other values (265) | 393

Length

Max length | 98
Median length | 62
Mean length | 18.14141414
Min length | 2

Characters and Unicode

Total characters | 10776
Distinct characters | 60
Distinct categories | 8
Distinct scripts | 2
Distinct blocks | 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique | 210
Unique (%) | 35.4%

Sample

1st row | Teknologiajohtaja
2nd row | Ohjelmistokehittäjä
3rd row | Full-stack-ohjelmistokehittäjä
4th row | Devaaja
5th row | Full-stack

Common Values

Value | Count | Frequency (%)
Full-stack | 88 | 12.9%
Ohjelmistokehittäjä | 61 | 8.9%
Arkkitehti | 28 | 4.1%
Lead developer | 12 | 1.8%
Backend | 12 | 1.8%
Full-stack ohjelmistokehittäjä | 10 | 1.5%
Frontend | 9 | 1.3%
Ohjelmistokehittäjä (full-stack) | 9 | 1.3%
Full-stack developer | 9 | 1.3%
CTO | 6 | 0.9%
Other values (260) | 350 | 51.2%
(Missing) | 90 | 13.2%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
full-stack | 159 | 15.3%
ohjelmistokehittäjä | 121 | 11.7%
developer | 73 | 7.0%
engineer | 47 | 4.5%
lead | 39 | 3.8%
arkkitehti | 37 | 3.6%
backend | 36 | 3.5%
senior | 34 | 3.3%
software | 28 | 2.7%
frontend | 25 | 2.4%
Other values (200) | 438 | 42.2%

Most occurring characters

Value | Count | Frequency (%)
e | 1113 | 10.3%
t | 1042 | 9.7%
l | 752 | 7.0%
i | 738 | 6.8%
k | 589 | 5.5%
a | 588 | 5.5%
s | 482 | 4.5%
(space) | 451 | 4.2%
o | 445 | 4.1%
ä | 382 | 3.5%
Other values (50) | 4194 | 38.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9176 | 85.2%
Uppercase Letter | 790 | 7.3%
Space Separator | 451 | 4.2%
Dash Punctuation | 210 | 1.9%
Other Punctuation | 90 | 0.8%
Close Punctuation | 26 | 0.2%
Open Punctuation | 25 | 0.2%
Math Symbol | 8 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 1113 | 12.1%
t | 1042 | 11.4%
l | 752 | 8.2%
i | 738 | 8.0%
k | 589 | 6.4%
a | 588 | 6.4%
s | 482 | 5.3%
o | 445 | 4.8%
ä | 382 | 4.2%
n | 376 | 4.1%
Other values (17) | 2669 | 29.1%

Uppercase Letter
Value | Count | Frequency (%)
F | 194 | 24.6%
O | 142 | 18.0%
S | 100 | 12.7%
D | 71 | 9.0%
A | 44 | 5.6%
E | 44 | 5.6%
L | 38 | 4.8%
T | 32 | 4.1%
B | 27 | 3.4%
P | 23 | 2.9%
Other values (13) | 75 | 9.5%

Other Punctuation
Value | Count | Frequency (%)
, | 58 | 64.4%
/ | 27 | 30.0%
. | 2 | 2.2%
& | 2 | 2.2%
: | 1 | 1.1%

Space Separator
Value | Count | Frequency (%)
(space) | 451 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 210 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 26 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 25 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 9966 | 92.5%
Common | 810 | 7.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 1113 | 11.2%
t | 1042 | 10.5%
l | 752 | 7.5%
i | 738 | 7.4%
k | 589 | 5.9%
a | 588 | 5.9%
s | 482 | 4.8%
o | 445 | 4.5%
ä | 382 | 3.8%
n | 376 | 3.8%
Other values (40) | 3459 | 34.7%

Common
Value | Count | Frequency (%)
(space) | 451 | 55.7%
- | 210 | 25.9%
, | 58 | 7.2%
/ | 27 | 3.3%
) | 26 | 3.2%
( | 25 | 3.1%
+ | 8 | 1.0%
. | 2 | 0.2%
& | 2 | 0.2%
: | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10371 | 96.2%
None | 405 | 3.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 1113 | 10.7%
t | 1042 | 10.0%
l | 752 | 7.3%
i | 738 | 7.1%
k | 589 | 5.7%
a | 588 | 5.7%
s | 482 | 4.6%
(space) | 451 | 4.3%
o | 445 | 4.3%
n | 376 | 3.6%
Other values (48) | 3795 | 36.6%

None
Value | Count | Frequency (%)
ä | 382 | 94.3%
ö | 23 | 5.7%

Etä
Categorical

MISSING

Distinct | 3
Distinct (%) | 0.5%
Missing | 80
Missing (%) | 11.7%
Memory size | 944.0 B

Etä | 343
50/50 | 185
Toimisto | 76

Length

Max length | 8
Median length | 3
Mean length | 4.241721854
Min length | 3

Characters and Unicode

Total characters | 2562
Distinct characters | 11
Distinct categories | 4
Distinct scripts | 2
Distinct blocks | 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique | 0
Unique (%) | 0.0%

Sample

1st row | 50/50
2nd row | Etä
3rd row | 50/50
4th row | 50/50
5th row | Etä

Common Values

Value | Count | Frequency (%)
Etä | 343 | 50.1%
50/50 | 185 | 27.0%
Toimisto | 76 | 11.1%
(Missing) | 80 | 11.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
etä | 343 | 56.8%
50/50 | 185 | 30.6%
toimisto | 76 | 12.6%

Most occurring characters

Value | Count | Frequency (%)
t | 419 | 16.4%
5 | 370 | 14.4%
0 | 370 | 14.4%
E | 343 | 13.4%
ä | 343 | 13.4%
/ | 185 | 7.2%
o | 152 | 5.9%
i | 152 | 5.9%
T | 76 | 3.0%
m | 76 | 3.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1218 | 47.5%
Decimal Number | 740 | 28.9%
Uppercase Letter | 419 | 16.4%
Other Punctuation | 185 | 7.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 419 | 34.4%
ä | 343 | 28.2%
o | 152 | 12.5%
i | 152 | 12.5%
m | 76 | 6.2%
s | 76 | 6.2%

Decimal Number
Value | Count | Frequency (%)
5 | 370 | 50.0%
0 | 370 | 50.0%

Uppercase Letter
Value | Count | Frequency (%)
E | 343 | 81.9%
T | 76 | 18.1%

Other Punctuation
Value | Count | Frequency (%)
/ | 185 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1637 | 63.9%
Common | 925 | 36.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
t | 419 | 25.6%
E | 343 | 21.0%
ä | 343 | 21.0%
o | 152 | 9.3%
i | 152 | 9.3%
T | 76 | 4.6%
m | 76 | 4.6%
s | 76 | 4.6%

Common
Value | Count | Frequency (%)
5 | 370 | 40.0%
0 | 370 | 40.0%
/ | 185 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2219 | 86.6%
None | 343 | 13.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
t | 419 | 18.9%
5 | 370 | 16.7%
0 | 370 | 16.7%
E | 343 | 15.5%
/ | 185 | 8.3%
o | 152 | 6.8%
i | 152 | 6.8%
T | 76 | 3.4%
m | 76 | 3.4%
s | 76 | 3.4%

None
Value | Count | Frequency (%)
ä | 343 | 100.0%

Kuukausipalkka
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct | 198
Distinct (%) | 32.6%
Missing | 77
Missing (%) | 11.3%
Infinite | 0
Infinite (%) | 0.0%
Mean | 5325.574959
Minimum | 1080
Maximum | 22500
Zeros | 0
Zeros (%) | 0.0%
Negative | 0
Negative (%) | 0.0%
Memory size | 5.5 KiB

Quantile statistics

Minimum | 1080
5-th percentile | 3000
Q1 | 4300
median | 5100
Q3 | 6100
95-th percentile | 8000
Maximum | 22500
Range | 21420
Interquartile range (IQR) | 1800

Descriptive statistics

Standard deviation | 1825.757921
Coefficient of variation (CV) | 0.342828321
Kurtosis | 14.9942356
Mean | 5325.574959
Median Absolute Deviation (MAD) | 900
Skewness | 2.311007423
Sum | 3232624
Variance | 3333391.987
Monotonicity | Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5000 | 29 | 4.2%
5500 | 25 | 3.7%
4500 | 23 | 3.4%
6000 | 20 | 2.9%
6500 | 15 | 2.2%
4800 | 15 | 2.2%
4600 | 15 | 2.2%
4000 | 15 | 2.2%
5200 | 14 | 2.0%
5400 | 13 | 1.9%
Other values (188) | 423 | 61.8%
(Missing) | 77 | 11.3%

Value | Count | Frequency (%)
1080 | 1 | 0.1%
1200 | 1 | 0.1%
1660 | 1 | 0.1%
1760 | 1 | 0.1%
1880 | 1 | 0.1%
1903 | 1 | 0.1%
2000 | 3 | 0.4%
2200 | 1 | 0.1%
2300 | 2 | 0.3%
2341 | 1 | 0.1%

Value | Count | Frequency (%)
22500 | 1 | 0.1%
14000 | 1 | 0.1%
13333 | 1 | 0.1%
13000 | 1 | 0.1%
12900 | 1 | 0.1%
12500 | 1 | 0.1%
11250 | 1 | 0.1%
11200 | 1 | 0.1%
10500 | 2 | 0.3%
10000 | 5 | 0.7%

Vuositulot
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct | 233
Distinct (%) | 38.5%
Missing | 79
Missing (%) | 11.5%
Infinite | 0
Infinite (%) | 0.0%
Mean | 68104.9314
Minimum | 0
Maximum | 290000
Zeros | 2
Zeros (%) | 0.3%
Negative | 0
Negative (%) | 0.0%
Memory size | 5.5 KiB

Quantile statistics

Minimum | 0
5-th percentile | 32500
Q1 | 53000
median | 65000
Q3 | 79000
95-th percentile | 108000
Maximum | 290000
Range | 290000
Interquartile range (IQR) | 26000

Descriptive statistics

Standard deviation | 28301.01439
Coefficient of variation (CV) | 0.4155501489
Kurtosis | 10.46651083
Mean | 68104.9314
Median Absolute Deviation (MAD) | 13000
Skewness | 1.968967269
Sum | 41203483.5
Variance | 800947415.2
Monotonicity | Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
60000 | 23 | 3.4%
70000 | 18 | 2.6%
50000 | 16 | 2.3%
62500 | 14 | 2.0%
80000 | 13 | 1.9%
75000 | 13 | 1.9%
65000 | 13 | 1.9%
100000 | 12 | 1.8%
55000 | 10 | 1.5%
57500 | 10 | 1.5%
Other values (223) | 463 | 67.7%
(Missing) | 79 | 11.5%

Value | Count | Frequency (%)
0 | 2 | 0.3%
62.5 | 1 | 0.1%
500 | 1 | 0.1%
1000 | 1 | 0.1%
2000 | 1 | 0.1%
2500 | 1 | 0.1%
4000 | 1 | 0.1%
4800 | 1 | 0.1%
5000 | 1 | 0.1%
5062 | 1 | 0.1%

Value | Count | Frequency (%)
290000 | 1 | 0.1%
220000 | 1 | 0.1%
210000 | 1 | 0.1%
200000 | 2 | 0.3%
188000 | 1 | 0.1%
170000 | 1 | 0.1%
160000 | 2 | 0.3%
156000 | 1 | 0.1%
150000 | 2 | 0.3%
148000 | 1 | 0.1%

Vapaa kuvaus kokonaiskompensaatiomallista
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing | 498
Missing (%) | 72.8%
Memory size | 5.5 KiB

Kilpailukykyinen
Boolean

MISSING

Distinct | 2
Distinct (%) | 0.3%
Missing | 80
Missing (%) | 11.7%
Memory size | 5.5 KiB

Value | Count | Frequency (%)
True | 443 | 64.8%
False | 161 | 23.5%
(Missing) | 80 | 11.7%

Vapaa sana
Categorical

MISSING
UNIFORM

Distinct | 41
Distinct (%) | 100.0%
Missing | 643
Missing (%) | 94.0%
Memory size | 5.5 KiB

Työsuhteessa ulkomaalaiseen firmaan, ainoastaan palkanmaksu menee oman firman kautta | 1
kokemusta jo kolme viikkoa it alalta | 1
Laskuttavan työn lisäksi teen tuotekehitystä omaan b2c SaaS-palveluun, josta maksan itselleni myös palkkaa. | 1
tietyn palkkatason jälkeen työntekijät alkavat ottamaan enemmän lomaa ja vapaa-aika. Se on eri asia kuin 100% allokaatio työtä tehdessä ja vaikuttaa vuosituloihin. Pääomatulot työnantajan osakkeilla taas voivat usein olla kerta luontoisia, mutta tuoda paljonkin lisätuloja | 1
Tällä hetkellä sopparit tehdään 1v kerrallaan. Firman muoto OY. | 1
Other values (36) | 36

Length

Max length | 323
Median length | 98
Mean length | 111.5609756
Min length | 2

Characters and Unicode

Total characters | 4574
Distinct characters | 68
Distinct categories | 10
Distinct scripts | 2
Distinct blocks | 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique | 41
Unique (%) | 100.0%

Sample

1st row | Lykkyä tykö vapaakenttien normalisointiin!
2nd row | Laskutan palkan firmani kautta
3rd row | Vastasin kyselyyn ns. päätyönantajani mukaan. Omaani vastaavissa tehtävissä suhteellisen usein saatetaan kuitenkin keikkailla startupeissa tms. osa-aikaisesti päätyön lisäksi. Kahden työpaikan ujuttaminen vastauksiin ei onnistunut.
4th row | Kunta TES == 38 kokonaista päivää vuodessa, työaika 7h15min/päivä
5th row | Olen tosiaan poikkeus siinä, että kirjoitan koodia töissä erittäin harvoin. Välillä jotain devops-tyylistä tulee harrastettu, mutta toistaiseksi harvemmin.

Common Values

Value | Count | Frequency (%)
Työsuhteessa ulkomaalaiseen firmaan, ainoastaan palkanmaksu menee oman firman kautta | 1 | 0.1%
kokemusta jo kolme viikkoa it alalta | 1 | 0.1%
Laskuttavan työn lisäksi teen tuotekehitystä omaan b2c SaaS-palveluun, josta maksan itselleni myös palkkaa. | 1 | 0.1%
tietyn palkkatason jälkeen työntekijät alkavat ottamaan enemmän lomaa ja vapaa-aika. Se on eri asia kuin 100% allokaatio työtä tehdessä ja vaikuttaa vuosituloihin. Pääomatulot työnantajan osakkeilla taas voivat usein olla kerta luontoisia, mutta tuoda paljonkin lisätuloja | 1 | 0.1%
Tällä hetkellä sopparit tehdään 1v kerrallaan. Firman muoto OY. | 1 | 0.1%
Yrittäjähenkisessä työskentelyssä ei usein lasketa työtunteja. Siten esim. 90% työaika on aika näennäinen. Välillä tehdään kovempaa ja välillä rauhallisemmin. | 1 | 0.1%
En oo ihan varma miten muut toisissa vastaavan kokoluokan yrityksissä saa palkkaa vastaavasta roolista. | 1 | 0.1%
Koulu vielä kesken ja se vaikuttaa palkkaan | 1 | 0.1%
Palkka ei ole kilpailukykyinen, jos mietin paljonko pyytäisin vastaavasta positiosta palkkaa muissa yrityksissä, mutta varsin tarpeeksi nykyisessä yrityksessä. | 1 | 0.1%
Yritys Yhdysvalloista, kuten suurin osa tiimiäkin. Presenssi myös Suomessa | 1 | 0.1%
Other values (31) | 31 | 4.5%
(Missing) | 643 | 94.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
on | 15 | 2.6%
ja | 12 | 2.1%
mutta | 9 | 1.6%
ei | 7 | 1.2%
myös | 6 | 1.1%
enemmän | 6 | 1.1%
palkkaa | 6 | 1.1%
palkka | 5 | 0.9%
että | 5 | 0.9%
(blank) | 5 | 0.9%
Other values (405) | 495 | 86.7%

Most occurring characters

Value | Count | Frequency (%)
a | 551 | 12.0%
(space) | 534 | 11.7%
i | 369 | 8.1%
t | 365 | 8.0%
n | 314 | 6.9%
s | 294 | 6.4%
e | 263 | 5.7%
k | 242 | 5.3%
l | 220 | 4.8%
o | 213 | 4.7%
Other values (58) | 1209 | 26.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3824 | 83.6%
Space Separator | 534 | 11.7%
Other Punctuation | 95 | 2.1%
Uppercase Letter | 68 | 1.5%
Decimal Number | 29 | 0.6%
Dash Punctuation | 11 | 0.2%
Math Symbol | 5 | 0.1%
Close Punctuation | 4 | 0.1%
Open Punctuation | 3 | 0.1%
Control | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 551 | 14.4%
i | 369 | 9.6%
t | 365 | 9.5%
n | 314 | 8.2%
s | 294 | 7.7%
e | 263 | 6.9%
k | 242 | 6.3%
l | 220 | 5.8%
o | 213 | 5.6%
u | 174 | 4.6%
Other values (15) | 819 | 21.4%

Uppercase Letter
Value | Count | Frequency (%)
K | 8 | 11.8%
S | 7 | 10.3%
V | 7 | 10.3%
P | 7 | 10.3%
O | 6 | 8.8%
E | 6 | 8.8%
T | 6 | 8.8%
Y | 4 | 5.9%
L | 3 | 4.4%
J | 2 | 2.9%
Other values (8) | 12 | 17.6%

Other Punctuation
Value | Count | Frequency (%)
. | 41 | 43.2%
, | 40 | 42.1%
% | 4 | 4.2%
/ | 3 | 3.2%
" | 2 | 2.1%
: | 2 | 2.1%
! | 2 | 2.1%
* | 1 | 1.1%

Decimal Number
Value | Count | Frequency (%)
0 | 8 | 27.6%
1 | 5 | 17.2%
3 | 4 | 13.8%
2 | 4 | 13.8%
8 | 3 | 10.3%
5 | 3 | 10.3%
7 | 1 | 3.4%
9 | 1 | 3.4%

Math Symbol
Value | Count | Frequency (%)
= | 2 | 40.0%
~ | 1 | 20.0%
< | 1 | 20.0%
+ | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 534 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3892 | 85.1%
Common | 682 | 14.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 551 | 14.2%
i | 369 | 9.5%
t | 365 | 9.4%
n | 314 | 8.1%
s | 294 | 7.6%
e | 263 | 6.8%
k | 242 | 6.2%
l | 220 | 5.7%
o | 213 | 5.5%
u | 174 | 4.5%
Other values (33) | 887 | 22.8%

Common
Value | Count | Frequency (%)
(space) | 534 | 78.3%
. | 41 | 6.0%
, | 40 | 5.9%
- | 11 | 1.6%
0 | 8 | 1.2%
1 | 5 | 0.7%
% | 4 | 0.6%
3 | 4 | 0.6%
2 | 4 | 0.6%
) | 4 | 0.6%
Other values (15) | 27 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4392 | 96.0%
None | 182 | 4.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 551 | 12.5%
(space) | 534 | 12.2%
i | 369 | 8.4%
t | 365 | 8.3%
n | 314 | 7.1%
s | 294 | 6.7%
e | 263 | 6.0%
k | 242 | 5.5%
l | 220 | 5.0%
o | 213 | 4.8%
Other values (56) | 1027 | 23.4%

None
Value | Count | Frequency (%)
ä | 146 | 80.2%
ö | 36 | 19.8%

Ideoita ensi vuoden kyselyyn
Categorical

MISSING
UNIFORM

Distinct | 31
Distinct (%) | 100.0%
Missing | 653
Missing (%) | 95.5%
Memory size | 5.5 KiB

Tämä oli mukavan lyhyt ja ytimekäs | 1
mietityttää, olisiko kuitenkin koulutustausta relevantti tieto? | 1
Kyselyn voisi toteuttaa myös englanniksi, sillä suomessa työskentelee paljon suomea taitamattomia :) | 1
Aikanaan, kun toteutin laajan palkkatutkimuksen ohjelmistoalalle, sisällytin mukaan monia kysymyksiä työntekijöiden rooleista, arvoista, koulutustasosta ja tehtävänimikkeistä. Näistä sai sitten todella mielenkiintoista tietoa yhdistettynä palkkatietoihin. Vuosilaskutus on sellainen tapa kysyä palkkaa, että se johtaa helposti harhaan. Mitä jos on korkea laskutus, mutta päättää tehdä kuusituntista viikkoa? Mitä jos on vanhempainvapaalla puolet vuodesta? Tällaisella yhdellä vuosilaskutusta koskevalla kysymyksellä saa helposti harhaanjohtavia vastauksia. | 1
Millä kielillä koodaa? Millä alustoilla? | 1
Other values (26) | 26

Length

Max length | 558
Median length | 89
Mean length | 96.80645161
Min length | 13

Characters and Unicode

Total characters | 3001
Distinct characters | 54
Distinct categories | 10
Distinct scripts | 2
Distinct blocks | 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique | 31
Unique (%) | 100.0%

Sample

1st row | Tämä oli mukavan lyhyt ja ytimekäs
2nd row | 5/5 hyvää duunia
3rd row | Kiinnostaisi tietää mitä kautta ihmiset löysivät työnsä.
4th row | Ehkä voisi huomioida olennaiset semi-satunnaiset sivuduunit palkkaduunarin ja yrittäjän välimaastossa olevilla.
5th row | Luontaisedut, autoedun arvo yms. voisi olla mukana kyselyssä

Common Values

Value | Count | Frequency (%)
Tämä oli mukavan lyhyt ja ytimekäs | 1 | 0.1%
mietityttää, olisiko kuitenkin koulutustausta relevantti tieto? | 1 | 0.1%
Kyselyn voisi toteuttaa myös englanniksi, sillä suomessa työskentelee paljon suomea taitamattomia :) | 1 | 0.1%
Aikanaan, kun toteutin laajan palkkatutkimuksen ohjelmistoalalle, sisällytin mukaan monia kysymyksiä työntekijöiden rooleista, arvoista, koulutustasosta ja tehtävänimikkeistä. Näistä sai sitten todella mielenkiintoista tietoa yhdistettynä palkkatietoihin. Vuosilaskutus on sellainen tapa kysyä palkkaa, että se johtaa helposti harhaan. Mitä jos on korkea laskutus, mutta päättää tehdä kuusituntista viikkoa? Mitä jos on vanhempainvapaalla puolet vuodesta? Tällaisella yhdellä vuosilaskutusta koskevalla kysymyksellä saa helposti harhaanjohtavia vastauksia. | 1 | 0.1%
Millä kielillä koodaa? Millä alustoilla? | 1 | 0.1%
Palkan kilpailukykyisyyttä voisi arvioida skaalalla 1-5 eikä vain kyllä/ei.... | 1 | 0.1%
Voisi kysellä kuinka paljon työajasta voi käyttää itseopiskeluun tai muuhun ei suoraan tuottavaan työhön. | 1 | 0.1%
Työsuhde-edut ois kiva olla erillisenä kysymyksenä. | 1 | 0.1%
Työsuhde-edut | 1 | 0.1%
Ehkä joku työsuhde-edut rahallisesti osuus myös? | 1 | 0.1%
Other values (21) | 21 | 3.1%
(Missing) | 653 | 95.5%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
voisi | 11 | 3.0%
on | 8 | 2.2%
ja | 7 | 1.9%
mitä | 7 | 1.9%
jos | 5 | 1.4%
onko | 5 | 1.4%
kysyä | 5 | 1.4%
myös | 5 | 1.4%
se | 4 | 1.1%
ehkä | 4 | 1.1%
Other values (267) | 304 | 83.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 338 | 11.3%
a | 323 | 10.8%
t | 262 | 8.7%
i | 254 | 8.5%
s | 220 | 7.3%
o | 178 | 5.9%
e | 164 | 5.5%
l | 163 | 5.4%
k | 153 | 5.1%
n | 147 | 4.9%
Other values (44) | 799 | 26.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2517 | 83.9%
Space Separator | 338 | 11.3%
Other Punctuation | 81 | 2.7%
Uppercase Letter | 44 | 1.5%
Dash Punctuation | 8 | 0.3%
Decimal Number | 4 | 0.1%
Close Punctuation | 3 | 0.1%
Control | 2 | 0.1%
Open Punctuation | 2 | 0.1%
Other Symbol | 2 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 323 | 12.8%
t | 262 | 10.4%
i | 254 | 10.1%
s | 220 | 8.7%
o | 178 | 7.1%
e | 164 | 6.5%
l | 163 | 6.5%
k | 153 | 6.1%
n | 147 | 5.8%
u | 126 | 5.0%
Other values (14) | 527 | 20.9%

Uppercase Letter
Value | Count | Frequency (%)
T | 9 | 20.5%
M | 7 | 15.9%
K | 6 | 13.6%
O | 5 | 11.4%
E | 4 | 9.1%
V | 3 | 6.8%
P | 2 | 4.5%
J | 2 | 4.5%
S | 1 | 2.3%
F | 1 | 2.3%
Other values (4) | 4 | 9.1%

Other Punctuation
Value | Count | Frequency (%)
. | 32 | 39.5%
, | 23 | 28.4%
? | 13 | 16.0%
" | 6 | 7.4%
/ | 3 | 3.7%
! | 3 | 3.7%
: | 1 | 1.2%

Decimal Number
Value | Count | Frequency (%)
5 | 3 | 75.0%
1 | 1 | 25.0%

Other Symbol
Value | Count | Frequency (%)
😍 | 1 | 50.0%
😄 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 338 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 2 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2561 | 85.3%
Common | 440 | 14.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 323 | 12.6%
t | 262 | 10.2%
i | 254 | 9.9%
s | 220 | 8.6%
o | 178 | 7.0%
e | 164 | 6.4%
l | 163 | 6.4%
k | 153 | 6.0%
n | 147 | 5.7%
u | 126 | 4.9%
Other values (28) | 571 | 22.3%

Common
Value | Count | Frequency (%)
(space) | 338 | 76.8%
. | 32 | 7.3%
, | 23 | 5.2%
? | 13 | 3.0%
- | 8 | 1.8%
" | 6 | 1.4%
) | 3 | 0.7%
5 | 3 | 0.7%
/ | 3 | 0.7%
! | 3 | 0.7%
Other values (6) | 8 | 1.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2882 | 96.0%
None | 117 | 3.9%
Emoticons | 2 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 338 | 11.7%
a | 323 | 11.2%
t | 262 | 9.1%
i | 254 | 8.8%
s | 220 | 7.6%
o | 178 | 6.2%
e | 164 | 5.7%
l | 163 | 5.7%
k | 153 | 5.3%
n | 147 | 5.1%
Other values (40) | 680 | 23.6%

None
Value | Count | Frequency (%)
ä | 94 | 80.3%
ö | 23 | 19.7%

Emoticons
Value | Count | Frequency (%)
😍 | 1 | 50.0%
😄 | 1 | 50.0%

Kk-tulot
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct | 233
Distinct (%) | 38.5%
Missing | 79
Missing (%) | 11.5%
Infinite | 0
Infinite (%) | 0.0%
Mean | 5675.41095
Minimum | 0
Maximum | 24166.66667
Zeros | 2
Zeros (%) | 0.3%
Negative | 0
Negative (%) | 0.0%
Memory size | 5.5 KiB

Quantile statistics

Minimum | 0
5-th percentile | 2708.333333
Q1 | 4416.666667
median | 5416.666667
Q3 | 6583.333333
95-th percentile | 9000
Maximum | 24166.66667
Range | 24166.66667
Interquartile range (IQR) | 2166.666667

Descriptive statistics

Standard deviation | 2358.417865
Coefficient of variation (CV) | 0.4155501489
Kurtosis | 10.46651083
Mean | 5675.41095
Median Absolute Deviation (MAD) | 1083.333333
Skewness | 1.968967269
Sum | 3433623.625
Variance | 5562134.828
Monotonicity | Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5000 | 23 | 3.4%
5833.333333 | 18 | 2.6%
4166.666667 | 16 | 2.3%
5208.333333 | 14 | 2.0%
6666.666667 | 13 | 1.9%
6250 | 13 | 1.9%
5416.666667 | 13 | 1.9%
8333.333333 | 12 | 1.8%
4583.333333 | 10 | 1.5%
4791.666667 | 10 | 1.5%
Other values (223) | 463 | 67.7%
(Missing) | 79 | 11.5%

Value | Count | Frequency (%)
0 | 2 | 0.3%
5.208333333 | 1 | 0.1%
41.66666667 | 1 | 0.1%
83.33333333 | 1 | 0.1%
166.6666667 | 1 | 0.1%
208.3333333 | 1 | 0.1%
333.3333333 | 1 | 0.1%
400 | 1 | 0.1%
416.6666667 | 1 | 0.1%
421.8333333 | 1 | 0.1%

Value | Count | Frequency (%)
24166.66667 | 1 | 0.1%
18333.33333 | 1 | 0.1%
17500 | 1 | 0.1%
16666.66667 | 2 | 0.3%
15666.66667 | 1 | 0.1%
14166.66667 | 1 | 0.1%
13333.33333 | 2 | 0.3%
13000 | 1 | 0.1%
12500 | 2 | 0.3%
12333.33333 | 1 | 0.1%
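
The sample rows, together with the identical coefficient of variation, skewness, and kurtosis reported for Vuositulot and Kk-tulot, suggest that Kk-tulot is simply Vuositulot divided by 12. This is an inference from the reported numbers, not something the report states. The sketch below checks the assumed rule against (Vuositulot, Kk-tulot) pairs copied from the sample rows; the function name is hypothetical.

```python
import math

# (Vuositulot, Kk-tulot) pairs copied from the report's sample rows.
sample_rows = [
    (81250.0, 6770.833333),
    (117000.0, 9750.0),
    (62500.0, 5208.333333),
    (63750.0, 5312.5),
    (90000.0, 7500.0),
    (48000.0, 4000.0),
    (57500.0, 4791.666667),
    (55000.0, 4583.333333),
]


def monthly_from_annual(annual_income: float) -> float:
    """Assumed derivation: annual income spread evenly over 12 months."""
    return annual_income / 12


# Compare the assumed rule with every sample row, to the printed precision.
all_match = all(
    math.isclose(monthly_from_annual(annual), monthly, abs_tol=1e-6)
    for annual, monthly in sample_rows
)
print(all_match)
```

If the rule holds for the whole column, the division is a pure change of scale, which would explain why the scale-free statistics (CV, skewness, kurtosis) agree between the two variables and why the two are flagged as highly correlated.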

Interactions

[Figure: 64 interaction plots; images omitted]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
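
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. This is a minimal illustration, not the report generator's own code; the helper names are hypothetical, and ties are assumed to share the average of their rank positions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (the 1/n factors cancel)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho reaches 1 while r stays below 1.
x = [1, 2, 3, 4, 5]
y = [value ** 3 for value in x]
print(spearman_rho(x, y), pearson_r(x, y))
```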

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not change the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
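
As a minimal sketch of that formula, the code below computes r for Kuukausipalkka and Vuositulot using only the eight rows among the first ten sample rows that have both values, so the result illustrates the method and is not the report's correlation for the full dataset. The function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return covariance / (std_x * std_y)


# Kuukausipalkka and Vuositulot from sample rows 0-3 and 6-9 (rows 4-5 are NaN).
monthly_salary = [6500.0, 9000.0, 5000.0, 5100.0, 7500.0, 3700.0, 4600.0, 4300.0]
annual_income = [81250.0, 117000.0, 62500.0, 63750.0, 90000.0, 48000.0, 57500.0, 55000.0]

r = pearson_r(monthly_salary, annual_income)

# Shifting and rescaling one variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([value / 1000 + 7 for value in monthly_salary], annual_income)

# Rescaling by a negative factor flips the sign but keeps the magnitude.
r_negated = pearson_r([-value for value in monthly_salary], annual_income)
print(r, r_rescaled, r_negated)
```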

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
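
The pair-counting definition above corresponds to the plain form usually called tau-a, sketched below. It applies no correction for ties, so it may differ from a tie-adjusted variant on columns with many repeated values; which variant produced the report's figure is not stated, and the function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign of the product tells whether x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)


# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```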

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
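
Nullity correlation can be sketched as the Pearson correlation between per-column presence indicators (1 if a value is present, 0 if it is missing). The rows below are a toy example loosely modelled on the palkansaaja/laskuttaja split visible in the sample rows, not real records, and the helper names are hypothetical.

```python
import math


def presence(column):
    """Map each cell to 1 if a value is present and 0 if it is missing (None)."""
    return [0 if value is None else 1 for value in column]


def nullity_correlation(column_a, column_b):
    """Pearson correlation of the two presence indicators.

    Undefined (division by zero) when a column is fully present or fully missing.
    """
    a, b = presence(column_a), presence(column_b)
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    spread = math.sqrt(
        sum((p - mean_a) ** 2 for p in a) * sum((q - mean_b) ** 2 for q in b)
    )
    return cov / spread


# Toy columns: salary fields are filled for employees, billing fields for freelancers.
kuukausipalkka = [6500.0, 9000.0, None, None, 7500.0]
vuositulot = [81250.0, 117000.0, None, None, 90000.0]
tuntilaskutus = [None, None, 90.0, 80.0, None]

same_pattern = nullity_correlation(kuukausipalkka, vuositulot)
opposite_pattern = nullity_correlation(kuukausipalkka, tuntilaskutus)
print(same_pattern, opposite_pattern)
```

A value near +1 means the two columns tend to be filled in or left empty together; a value near -1 means one is filled exactly when the other is empty.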

Sample

First rows

 | Timestamp | Oletko palkansaaja vai laskuttaja? | Ikä | Sukupuoli | Työkokemus | Montako vuotta olet tehnyt laskuttavaa työtä alalla? | Palvelut | Tuntilaskutus (ALV 0%, euroina) | Vuosilaskutus (ALV 0%, euroina) | Hankitko asiakkaasi itse suoraan vai käytätkö välitysfirmojen palveluita? | Mistä asiakkaat ovat? | Työpaikka | Kaupunki | Millaisessa yrityksessä työskentelet | Työaika | Rooli | Etä | Kuukausipalkka | Vuositulot | Vapaa kuvaus kokonaiskompensaatiomallista | Kilpailukykyinen | Vapaa sana | Ideoita ensi vuoden kyselyyn | Kk-tulot
0 | 2022-09-26 16:35:50.002 | Palkansaaja | 33 | mies | 12.0 | NaN | NaN | NaN | NaN | NaN | NaN | - | PK-Seutu | Konsulttitalossa | 1.0 | Teknologiajohtaja | 50/50 | 6500.0 | 81250.0 | NaN | True | NaN | NaN | 6770.833333
1 | 2022-09-26 16:37:21.049 | Palkansaaja | 33 | mies | 16.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | PK-Seutu | Tuotetalossa, jonka core-bisnes on softa | 1.0 | Ohjelmistokehittäjä | Etä | 9000.0 | 117000.0 | NaN | True | NaN | NaN | 9750.000000
2 | 2022-09-26 16:38:47.396 | Palkansaaja | 33 | mies | 16.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Turku | Tuotetalossa, jonka core-bisnes on softa | 1.0 | Full-stack-ohjelmistokehittäjä | 50/50 | 5000.0 | 62500.0 | NaN | False | NaN | NaN | 5208.333333
3 | 2022-09-26 16:39:47.534 | Palkansaaja | 38 | mies | 13.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | PK-Seutu | Konsulttitalossa | 1.0 | Devaaja | 50/50 | 5100.0 | 63750.0 | NaN | False | NaN | NaN | 5312.500000
4 | 2022-09-26 16:41:09.685 | Laskuttaja | 28 | mies | 6.0 | 1.0 | Data-analytiikka, Arkkitehtuuri, Data Engineering, | 90.0 | 160000.0 | Käytän välitysfirmoja | Suomesta | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
5 | 2022-09-26 16:43:39.266 | Laskuttaja | 28 | mies | 6.0 | 10.0 | Fullstack | 80.0 | 100000.0 | Itse | Suomesta, Ulkomailta | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Lykkyä tykö vapaakenttien normalisointiin! | Tämä oli mukavan lyhyt ja ytimekäs | NaN
6 | 2022-09-26 16:44:27.744 | Palkansaaja | 38 | mies | 12.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | PK-Seutu | Konsulttitalossa | 1.0 | Full-stack | Etä | 7500.0 | 90000.0 | NaN | True | NaN | NaN | 7500.000000
7 | 2022-09-26 16:44:49.112 | Palkansaaja | 33 | mies | 12.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Vaasa | Tuotetalossa, jonka core-bisnes on softa | 1.0 | Ohjelmistokehittäjä full-stack, laitteistokehitys, tekoäly/koneoppiminen | Etä | 3700.0 | 48000.0 | Kuukausipalkka + vaihtelevan kokoinen joulubonus | True | NaN | NaN | 4000.000000
8 | 2022-09-26 16:45:12.422 | Palkansaaja | 33 | mies | 4.0 | NaN | NaN | NaN | NaN | NaN | NaN | Visma | Tampere | Konsulttitalossa | 1.0 | Full-stack | Etä | 4600.0 | 57500.0 | NaN | True | NaN | NaN | 4791.666667
9 | 2022-09-26 16:45:44.793 | Palkansaaja | 38 | mies | 14.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | PK-Seutu | Yrityksessä, jossa softa on tukeva toiminto (esim pankit, terveysala, yms) | 1.0 | NaN | 50/50 | 4300.0 | 55000.0 | NaN | False | NaN | NaN | 4583.333333
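In the rows above, Kk-tulot (monthly income) appears to equal Vuositulot (annual income) divided by 12. The report does not state this; it is an inference from the displayed values. The sketch below checks it against four (Vuositulot, Kk-tulot) pairs copied from rows 0, 1, 2 and 7. The names `monthly_from_annual` and `is_consistent` are hypothetical helpers introduced for illustration.

```python
# (Vuositulot, Kk-tulot) pairs copied from rows 0, 1, 2 and 7 of the sample.
SAMPLE = [
    (81250.0, 6770.833333),
    (117000.0, 9750.0),
    (62500.0, 5208.333333),
    (48000.0, 4000.0),
]


def monthly_from_annual(annual):
    """Assumed relationship: monthly income is annual income spread over 12 months."""
    return annual / 12


def is_consistent(pairs, tolerance=1e-5):
    """Return True if every displayed Kk-tulot matches Vuositulot / 12.

    The tolerance absorbs the six-decimal rounding used in the report's display.
    """
    return all(
        abs(monthly_from_annual(annual) - monthly) <= tolerance
        for annual, monthly in pairs
    )


if __name__ == "__main__":
    print(is_consistent(SAMPLE))
```

If the check fails on the full dataset, the column is presumably derived differently for some respondents, so the relationship should be treated as a working assumption only.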

Last rows

 | Timestamp | Oletko palkansaaja vai laskuttaja? | Ikä | Sukupuoli | Työkokemus | Montako vuotta olet tehnyt laskuttavaa työtä alalla? | Palvelut | Tuntilaskutus (ALV 0%, euroina) | Vuosilaskutus (ALV 0%, euroina) | Hankitko asiakkaasi itse suoraan vai käytätkö välitysfirmojen palveluita? | Mistä asiakkaat ovat? | Työpaikka | Kaupunki | Millaisessa yrityksessä työskentelet | Työaika | Rooli | Etä | Kuukausipalkka | Vuositulot | Vapaa kuvaus kokonaiskompensaatiomallista | Kilpailukykyinen | Vapaa sana | Ideoita ensi vuoden kyselyyn | Kk-tulot
674 | 2022-10-09 18:56:30.713 | Palkansaaja | 38 | mies | 20.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | PK-Seutu | Konsulttitalossa | 1.0 | Web-analyytikko | Etä | 7300.0 | 90000.0 | NaN | False | NaN | NaN | 7500.000000
675 | 2022-10-09 19:31:27.704 | Laskuttaja | 28 | mies | 4.0 | 1.0 | Full stack | 86.0 | 125000.0 | Käytän välitysfirmoja | Suomesta | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
676 | 2022-10-09 20:54:49.686 | Palkansaaja | 33 | nainen | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | PK-Seutu | Tuotetalossa, jonka core-bisnes on softa | 1.0 | Junior frontend dev | Etä | 3750.0 | 47000.0 | Palkkamalliin kuului osakkeita n. 17t € arvosta, vestautumisaika 4 vuotta, kertyvät asteittain. | True | NaN | Työkokemusvuosien vaihtoehdoissa voisi olla kokonaislukujen sijaan mahdollista valita myös esim. "alle vuosi". Mun relevantti kokemus alalta on puoli vuotta, joten en haluais millään vastata "nolla vuotta" 😄 | 3916.666667
677 | 2022-10-09 21:34:52.664 | Palkansaaja | 33 | mies | 6.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Lappeenranta | Konsulttitalossa | 0.8 | NaN | 50/50 | 4200.0 | 52500.0 | NaN | NaN | NaN | NaN | 4375.000000
678 | 2022-10-09 22:07:02.512 | Palkansaaja | 33 | NaN | 5.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Tampere | Tuotetalossa, jonka core-bisnes on softa | 1.0 | Team leader | 50/50 | 5100.0 | 63750.0 | NaN | False | NaN | NaN | 5312.500000
679 | 2022-10-09 22:29:23.021 | Palkansaaja | 33 | mies | 6.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | PK-Seutu | Konsulttitalossa | 1.0 | Ohjelmistokehittäjä | Toimisto | 4730.0 | 61000.0 | Kiinteä kuukausipalkka + vuosibonus yrityksen tuloksen mukaan | False | NaN | NaN | 5083.333333
680 | 2022-10-10 06:26:34.080 | Laskuttaja | 33 | mies | 12.0 | NaN | NaN | 170.0 | NaN | Itse | Suomesta, Ulkomailta | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
681 | 2022-10-10 06:52:45.143 | Palkansaaja | 28 | mies | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | Helsingin Kaupunki | PK-Seutu | Julkinen tai kolmas sektori | 1.0 | Backend, devops, projektipäällikkö | Toimisto | 2300.0 | 28750.0 | NaN | False | NaN | NaN | 2395.833333
682 | 2022-10-10 07:46:57.646 | Palkansaaja | 33 | NaN | 7.0 | NaN | NaN | NaN | NaN | NaN | NaN | Fraktio | PK-Seutu | Konsulttitalossa | 1.0 | Suunnittelija | 50/50 | 4900.0 | 61250.0 | NaN | True | NaN | NaN | 5104.166667
683 | 2022-10-10 07:49:49.204 | Laskuttaja | 23 | mies | 7.0 | 4.0 | Backend, systems | 120.0 | 135000.0 | Itse | Ulkomailta | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN