Commit Graph

17 Commits

Author SHA1 Message Date
Aarni Koskela
663cd3d349 Add 2025 survey data support
The 2025 survey uses a single English-only xlsx (instead of separate
fi/en files) with a restructured schema: compensation is split into
base salary, commission, lomaraha (holiday bonus), bonus, and equity
components; working time is reported in hours per week instead of as a
percentage; and competitive salary is categorical instead of boolean.
Vuositulot (annual income) is now synthesized from the component fields.

Drop COLUMN_MAP_2024, COLUMN_MAP_2024_EN_TO_FI, VALUE_MAP_2024_EN_TO_FI,
read_initial_dfs_2024, read_data_2024, map_sukupuoli, map_vuositulot,
split_boolean_column_to_other, apply_fixups, and the associated gender
value lists and boolean text maps. All of this exists in version history.

- KKPALKKA now includes base salary + commission (median 5500 → 5800)
- Apply map_numberlike to tuntilaskutus and vuosilaskutus columns to
  handle string values like "60 000" and "100 000"
- Filter out zeros when computing tunnusluvut (key figures) on the
  index page so stats reflect actual reported values

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 15:40:13 +02:00
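
A minimal sketch of the two data-handling points in the 663cd3d349 message: parsing number-like strings such as "60 000", and leaving zeros out of a summary statistic. The names parse_numberlike and median_ignoring_zeros are hypothetical and chosen for illustration; the repository's actual map_numberlike is not shown in this log and may behave differently.

```python
import re
import statistics
from typing import Iterable, Optional


def parse_numberlike(value: object) -> Optional[float]:
    """Hypothetical helper: turn values like "60 000" into 60000.0.

    Returns None when the value cannot be read as a number.
    """
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return float(value)
    # Strip all whitespace (including non-breaking spaces) that is
    # used as a thousands separator, e.g. "100 000" -> "100000".
    cleaned = re.sub(r"\s+", "", str(value))
    # Assumption: a comma, if present, is a decimal separator.
    cleaned = cleaned.replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


def median_ignoring_zeros(values: Iterable[Optional[float]]) -> Optional[float]:
    """Hypothetical helper: median over reported (non-zero, non-missing) values."""
    reported = [v for v in values if v is not None and v != 0]
    return statistics.median(reported) if reported else None


if __name__ == "__main__":
    raw = ["60 000", "100 000", 0, "n/a", 80000]
    parsed = [parse_numberlike(v) for v in raw]
    print(parsed)
    print(median_ignoring_zeros(parsed))
```

Treating zero as "not reported" is the design choice the commit message describes for the index page; whether that is appropriate for a given column depends on whether zero is a legitimate answer there.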
Aarni Koskela
1d14bfe765 Renovate for 2025, uvify CI (#20) 2026-03-11 10:50:21 +02:00
Aarni Koskela
e4fd9ae1a7 Add verification for expected data length (#19) 2024-10-29 12:23:00 +02:00
Aarni Koskela
1a0ae2502b Update for 2024 2024-10-28 11:48:39 +02:00
Aarni Koskela
de20fd9283 Sort imports 2024-10-28 11:48:29 +02:00
Aarni Koskela
773aad8749 Drop and fixup data based on ID, not timestamp 2023-09-28 16:45:48 +03:00
Aarni Koskela
471a1ee9da Add hashes as "Vastaustunniste" 2023-09-28 16:45:48 +03:00
Aarni Koskela
fe06cc38bc Ruffify 2023-09-28 15:30:30 +03:00
Aarni Koskela
2049638e13 Adjust obvious data errors in TYOAIKA (h/t tvainika) 2023-09-25 14:16:59 +03:00
Aarni Koskela
d71d0a188c Improve column maps, drop duplicate row 2023-09-25 09:23:37 +03:00
Aarni Koskela
e730ee89fe Tweak everything for 2023 2023-09-24 22:00:19 +03:00
Aarni Koskela
9e1ab195c6 Run black 2022-10-19 12:11:30 +03:00
Aarni Koskela
4e97f5fc3e Improve data normalization 2022-10-17 16:36:10 +03:00
Aarni Koskela
335cf15064 Apply some data fixes 2022-10-10 12:07:48 +03:00
Aarni Koskela
aff1533f7f Update ingestion script for 2022 2022-10-04 10:11:03 +03:00
Aarni Koskela
538bc6083a Allow parametrizing paths 2022-08-31 15:12:46 +03:00
Aarni Koskela
cdc6d9cc89 Packagify pulkka 2022-08-31 15:09:27 +03:00