chore: treat empty-string env vars as absent in Configuration #965

Draft
vdusek wants to merge 3 commits into master from fix/empty-string-json-env-vars

@vdusek vdusek commented Jun 11, 2026

Description

The Apify platform sometimes sets some env vars to an empty string instead of leaving them unset (see #303, or this Discord thread). For fields whose type can't parse '', validation raised and crashed Actor.init().

A few fields were already guarded against this with ad-hoc BeforeValidator lambdas. This PR consolidates them into a single shared _default_if_empty helper and defensively applies the same fallback to the remaining typed / JSON fields, so any field that starts arriving as '' falls back to its default instead of crashing.
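As a rough illustration of the consolidation described above, a shared helper could look something like the sketch below. The PR only names `_default_if_empty(default=...)`, so the body here is an assumption, not the actual implementation; in the real code the returned callable would presumably be wrapped in pydantic's `BeforeValidator` inside an `Annotated[...]` field type, which this stdlib-only sketch leaves out.

```python
from collections.abc import Callable
from typing import Any


def default_if_empty(default: Any = None) -> Callable[[Any], Any]:
    """Build a pre-validation hook that swaps '' for `default`.

    Hypothetical stand-in for the PR's `_default_if_empty`; the real body may differ.
    """

    def _validator(value: Any) -> Any:
        # Only the exact empty string counts as "absent". Every other value,
        # including '0' or whitespace, is passed through for normal parsing.
        return default if value == '' else value

    return _validator


# Direct calls, standing in for what a before-validator does ahead of type parsing.
to_none = default_if_empty()
to_false = default_if_empty(default=False)
```

With this shape, a field such as `timeout_at` would see `None` instead of `''`, while a real timestamp string still reaches the normal datetime parsing untouched.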

Why chore and not fix

I verified against the current apify-worker code (getEnvsForActor in act2_run_job.ts). Only four env vars are actually emitted as '' today, and those were already handled:

  • max_paid_dataset_items (maxItems ? String(...) : '')
  • max_total_charge_usd (maxTotalChargeUsd ? String(...) : '')
  • timeout_at (timeoutAt ? toISOString() : '')
  • user_is_paying (isRunOfPayingUser ? '1' : '')

The remaining fields are covered defensively. The worker always gives them a real value (started_at, dedicated_cpus, proxy_port, web_server_port, is_at_home), never sets them at all (metamorph_after_sleep, test_pay_per_event), or returns at least {} for the JSON ones (charged_event_counts and pricing). The worker also wraps every value in String(...), so an unset value reaches the Actor as the literal "undefined", not ''.

So this doesn't fix an observed crash. It's cleanup (consolidating the lambdas) plus hardening for the future, hence chore.

Changes

  • Added a shared _default_if_empty(default=...) validator and replaced the existing ad-hoc lambdas (timeout_at, max_paid_dataset_items, max_total_charge_usd, user_is_paying) with it.
  • Applied it defensively to started_at, dedicated_cpus, proxy_port, web_server_port, metamorph_after_sleep, is_at_home, test_pay_per_event.
  • JSON fields (ACTOR_STORAGES_JSON, APIFY_CHARGED_ACTOR_EVENT_COUNTS): treat '' as absent instead of letting json.loads('') raise, mirroring the existing _parse_actor_pricing_info behavior.
  • Regression tests for all affected env vars.
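The JSON-field change listed above can be sketched as follows. `parse_json_env` is a hypothetical name used only for illustration, not a function from the SDK; the point is that `json.loads('')` raises `json.JSONDecodeError`, so the empty string has to be intercepted before decoding.

```python
import json
from typing import Any, Optional


def parse_json_env(value: Optional[str]) -> Optional[Any]:
    """Return None for an unset or empty variable, otherwise the decoded JSON.

    Hypothetical helper; the SDK's own parsing functions may be structured differently.
    """
    if value is None or value == '':
        # Treat '' the same as "not set" instead of letting json.loads('') raise.
        return None
    # Anything non-empty is still decoded strictly, so malformed JSON keeps failing loudly.
    return json.loads(value)
```

Only the exact empty string is special-cased here. A malformed but non-empty value still raises, so the fallback does not hide genuinely broken input.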

@vdusek vdusek added adhoc Ad-hoc unplanned task added during the sprint. t-tooling Issues with this label are in the ownership of the tooling team. labels Jun 11, 2026
@vdusek vdusek self-assigned this Jun 11, 2026
codecov Bot commented Jun 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.91%. Comparing base (0daca28) to head (5cc2a2a).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #965      +/-   ##
==========================================
+ Coverage   89.90%   89.91%   +0.01%     
==========================================
  Files          49       49              
  Lines        3091     3095       +4     
==========================================
+ Hits         2779     2783       +4     
  Misses        312      312              
Flag          Coverage Δ
e2e           35.96% <66.66%> (+0.05%) ⬆️
integration   56.93% <100.00%> (+0.05%) ⬆️
unit          78.77% <100.00%> (+0.02%) ⬆️

@vdusek vdusek requested a review from Pijukatel June 11, 2026 13:15
@vdusek vdusek marked this pull request as ready for review June 11, 2026 13:15

def test_actor_storages_env_var_empty_string_becomes_none(monkeypatch: pytest.MonkeyPatch) -> None:
    """Test that an empty env var for actor_storages is converted to None instead of crashing."""
    monkeypatch.setenv('ACTOR_STORAGES_JSON', '')

Isn't this actually less defensive?

Is there any actual use case for having 'ACTOR_STORAGES_JSON' be set as an empty string? The env variable clearly states it should be a json. So either have json there, or don't set the variable.

Code should not try to guess the intention from an ambiguous action. And here we are guessing that an empty env variable probably means the user did not want to set it up in the first place...

@github-actions github-actions Bot added this to the 142nd sprint - Tooling team milestone Jun 12, 2026
@github-actions github-actions Bot added the tested Temporary label used only programatically for some analytics. label Jun 12, 2026
@vdusek vdusek marked this pull request as draft June 14, 2026 16:43
@vdusek vdusek changed the title fix: treat empty-string JSON env vars as absent in Configuration fix: treat empty-string env vars as absent in Configuration Jun 14, 2026
@vdusek vdusek changed the title fix: treat empty-string env vars as absent in Configuration chore: treat empty-string env vars as absent in Configuration Jun 15, 2026