Skip to content

Refactor download_url test into TestDownloadUrl class#8829

Open
e-mny wants to merge 6 commits intoProject-MONAI:devfrom
e-mny:dev
Open

Refactor download_url test into TestDownloadUrl class#8829
e-mny wants to merge 6 commits intoProject-MONAI:devfrom
e-mny:dev

Conversation

@e-mny
Copy link
Copy Markdown

@e-mny e-mny commented Apr 17, 2026

Fixes #8510

Description

  • The data_config.json needs to be updated with HF links in view of migrating to HF.
  • Implement download_url tests

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 17, 2026

📝 Walkthrough

Walkthrough

Updates test-data URLs in tests/testing_data/data_config.json to point to the HuggingFace testing_data dataset. Modifies tests/test_utils.py: expands skip logic to treat ValueError messages containing "hash check" as skip-worthy, updates DOWNLOAD_FAIL_MSGS to use "hash check", adds SAMPLE_TIFF constants and a TestDownloadUrl unit test that verifies download success with a correct hash and that an incorrect hash raises RuntimeError. Changes monai/apps/utils.py: download_url now raises ValueError (with an expanded message) when a post-download hash check fails.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title mentions refactoring download_url test, which is partially related but understates broader scope (data migration, hash validation changes).
Description check ✅ Passed Description covers main objectives (data_config.json update, download_url tests) and follows template with proper checkboxes.
Linked Issues check ✅ Passed PR addresses #8510 objectives: migrates test data URLs to HuggingFace dataset, implements download_url tests with hash validation.
Out of Scope Changes check ✅ Passed All changes align with #8510 scope: data_config.json URL migration, download_url test implementation, hash validation improvements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/testing_data/data_config.json (1)

88-92: ⚠️ Potential issue | 🟡 Minor

Two URLs were missed in the HF migration.

images.nrrd_example (line 89) and configs.test_meta_file (line 165) still point at github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/.... Per the PR objective (issue #8510), all test data URLs should move to the HF dataset. Please migrate these too (or note why they're exempt).

#!/bin/bash
# Confirm remaining GitHub Releases URLs in the config.
rg -n 'MONAI-extra-test-data' tests/testing_data/data_config.json

Also applies to: 163-168

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/testing_data/data_config.json` around lines 88 - 92, The two remaining
test-data entries images.nrrd_example and configs.test_meta_file still point to
the old GitHub release; update their "url" fields to the corresponding
HuggingFace dataset URLs used by the rest of the config (replace the
github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/... links
with the HF dataset path), and if the file content or location changed on HF,
update the "hash_val" (keeping "hash_type": "sha256") to the new checksum; after
changing images.nrrd_example and configs.test_meta_file in data_config.json, run
the repo search used in the PR verification (e.g., rg -n 'MONAI-extra-test-data'
tests/testing_data/data_config.json) to ensure no remaining GitHub-release links
remain.
🧹 Nitpick comments (1)
tests/test_utils.py (1)

195-215: Consider sourcing URL/hash from testing_data_config instead of hard-coding.

wsi_generic_tiff in data_config.json already holds the same URL/hash_type/hash_val. Using testing_data_config("images", "wsi_generic_tiff", ...) would avoid duplication and keep the test in sync if the config is ever updated.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_utils.py` around lines 195 - 215, The test_download_url test
currently hard-codes SAMPLE_TIFF, SAMPLE_TIFF_HASH and SAMPLE_TIFF_HASH_TYPE;
replace those with values loaded from the shared test config by calling
testing_data_config("images", "wsi_generic_tiff", "url"),
testing_data_config("images", "wsi_generic_tiff", "hash_val") and
testing_data_config("images", "wsi_generic_tiff", "hash_type") (or destructure a
single call if helper returns all fields) and pass those into download_url
instead of the constants; update imports in tests/test_utils.py to include
testing_data_config and keep the existing download_url usage and assertions
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_utils.py`:
- Around line 209-215: Replace the second transient download with a
deterministic hash-mismatch check: first download SAMPLE_TIFF once (guarded by
skip_if_downloading_fails()) into a temp filepath (e.g., model.tiff) using
download_url, then call download_url again with the same filepath but with the
incorrect hash_val to trigger the hash-validation branch and
assertRaises(RuntimeError); this avoids re-downloading to model_bad.tiff and
eliminates network flakiness—use the symbols download_url,
skip_if_downloading_fails, SAMPLE_TIFF, and SAMPLE_TIFF_HASH_TYPE to locate and
modify the test.

---

Outside diff comments:
In `@tests/testing_data/data_config.json`:
- Around line 88-92: The two remaining test-data entries images.nrrd_example and
configs.test_meta_file still point to the old GitHub release; update their "url"
fields to the corresponding HuggingFace dataset URLs used by the rest of the
config (replace the
github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/... links
with the HF dataset path), and if the file content or location changed on HF,
update the "hash_val" (keeping "hash_type": "sha256") to the new checksum; after
changing images.nrrd_example and configs.test_meta_file in data_config.json, run
the repo search used in the PR verification (e.g., rg -n 'MONAI-extra-test-data'
tests/testing_data/data_config.json) to ensure no remaining GitHub-release links
remain.

---

Nitpick comments:
In `@tests/test_utils.py`:
- Around line 195-215: The test_download_url test currently hard-codes
SAMPLE_TIFF, SAMPLE_TIFF_HASH and SAMPLE_TIFF_HASH_TYPE; replace those with
values loaded from the shared test config by calling
testing_data_config("images", "wsi_generic_tiff", "url"),
testing_data_config("images", "wsi_generic_tiff", "hash_val") and
testing_data_config("images", "wsi_generic_tiff", "hash_type") (or destructure a
single call if helper returns all fields) and pass those into download_url
instead of the constants; update imports in tests/test_utils.py to include
testing_data_config and keep the existing download_url usage and assertions
unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d4c21ce8-3d39-4944-869e-df48b5adeeb6

📥 Commits

Reviewing files that changed from the base of the PR and between 1dc47a5 and 6296784.

📒 Files selected for processing (2)
  • tests/test_utils.py
  • tests/testing_data/data_config.json

Comment thread tests/test_utils.py
Copy link
Copy Markdown
Member

@ericspod ericspod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @e-mny sorry for losing track of the previous PR! This looks good now and will merge.

@ericspod
Copy link
Copy Markdown
Member

Please have a look at the DCO issue, we should be able to do a remedial commit to sort it.

e-mny and others added 2 commits April 18, 2026 10:25
DCO Remediation Commit for Enoch Mok <enochmokny@gmail.com>

I, Enoch Mok <enochmokny@gmail.com>, hereby add my Signed-off-by to this commit: d02da49

Signed-off-by: Enoch Mok <enochmokny@gmail.com>
@e-mny
Copy link
Copy Markdown
Author

e-mny commented Apr 18, 2026

Thanks @ericspod for your guidance throughout this, I am excited to have made my first contribution (albeit small)!

I do wish to be doing this consistently - may I kindly ask if you guys have preference on which issues I should focus on, whether those in the 'Issues' tab or under MONAI v1.6 are more important to you guys?

To future collaborations and contributions 🍻

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/test_utils.py (1)

179-187: ⚠️ Potential issue | 🔴 Critical

Broken except chain — rt_e is undefined and non-matching RuntimeErrors are now swallowed.

The new except ValueError block was inserted between the (RuntimeError, OSError) body and its raise rt_e, so:

  1. Line 187 raise rt_e now lives inside the except ValueError handler, where rt_e is not in scope (Ruff F821). If a ValueError without "hash check" is raised, this path explodes with NameError.
  2. The original intent — re-raising a RuntimeError/OSError whose message doesn't match DOWNLOAD_FAIL_MSGS — is gone. Such errors are now silently swallowed, masking real failures.
  3. Missing from on the re-raise (B904).
Proposed fix
     except (RuntimeError, OSError) as rt_e:
         err_str = str(rt_e)
         if any(k in err_str for k in DOWNLOAD_FAIL_MSGS):
             raise unittest.SkipTest(f"Error while downloading: {rt_e}") from rt_e  # incomplete download
+        raise
     except ValueError as v_e:
         if "hash check" in str(v_e):
             raise unittest.SkipTest(f"Hash value error while downloading: {v_e}") from v_e
-
-        raise rt_e
+        raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_utils.py` around lines 179 - 187, The except chain currently
mis-scopes rt_e and swallows RuntimeError/OSError; move the logic so that the
(RuntimeError, OSError) except (the block that defines rt_e) either re-raises
the original exception when its message doesn't match DOWNLOAD_FAIL_MSGS (use
"raise rt_e from rt_e" or simply "raise" inside that except to preserve
traceback), and keep the ValueError except separate (handle "hash check" by
raising unittest.SkipTest with v_e, otherwise re-raise the ValueError with
"raise" or "raise v_e from v_e"); ensure rt_e is not referenced inside the
except ValueError block and that all re-raises include proper "from" chaining
where applicable.
monai/apps/utils.py (1)

215-276: ⚠️ Potential issue | 🔴 Critical

Docstring and breaking change: RuntimeError vs ValueError

Line 223 documents RuntimeError for downloaded-file hash failure, but line 271 raises ValueError. This breaks both the docstring and downstream code:

  • tests/apps/test_download_and_extract.py line 45 catches (ContentTooShortError, HTTPError, RuntimeError)ValueError escapes uncaught.
  • Line 48 checks str(e).startswith("md5 check"), but the new message is "md5 hash check ...", so the assertion fails even if caught.
  • External code catching RuntimeError around download_url is similarly broken.

Fix: Change line 271 back to RuntimeError (simplest, non-breaking) or update the docstring at line 223 and all call sites together.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@monai/apps/utils.py` around lines 215 - 276, The post-download hash-failure
currently raises ValueError (the raise after logger.info and check_hash) which
conflicts with the docstring/tests expecting a RuntimeError and a message
starting with "{hash_type} check"; change that raise to raise RuntimeError and
restore the message prefix to "{hash_type} check of downloaded file failed:
URL={url}, filepath={filepath}, expected {hash_type}={hash_val}..." so
callers/tests that catch RuntimeError and check str(e).startswith("{hash_type}
check") will continue to work; keep the check_hash call and other surrounding
logic (functions: check_hash, _download_with_progress, _basename) unchanged.
🧹 Nitpick comments (1)
tests/test_utils.py (1)

195-218: Docstring nit: document the actual class contract.

Per the repo's Python guidelines (Google-style docstrings documenting raised exceptions), the class/method docstrings here say Raises: RuntimeError but post-fix the hash-mismatch path should be ValueError. Update once the exception type is settled (see comment on Line 212).

As per coding guidelines: "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_utils.py` around lines 195 - 218, Update the docstring and test to
reflect the correct exception contract for download_url: change the "Raises:
RuntimeError" entry in the test_download_url docstring to "Raises: ValueError"
and update the assertion that expects an exception
(self.assertRaises(RuntimeError)) to expect ValueError instead; refer to the
TestDownloadUrl class and its test_download_url method and the download_url call
to ensure the docstring and assertion match the download_url behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_utils.py`:
- Line 82: The test expectation for the hash-failure message doesn't match the
RuntimeError raised in download_url's existing-file branch
(RuntimeError(f"{hash_type} check of existing file failed: ...")) — update the
test in tests/test_utils.py to assert a common substring present in both
messages (e.g., " check" or include both variants like f"{hash_type} check" and
"hash check") or change the assertion to check for f"{hash_type} check" so it
matches the existing-file error and the downloaded-file error consistently;
reference the download_url function and the RuntimeError message formatting when
making the change.
- Around line 212-218: The test currently expects a RuntimeError but
download_url now raises ValueError on the fresh-download/hash-mismatch path:
update the assertion to use self.assertRaises(ValueError) for the download_url
call (identify the call by download_url(..., filepath=os.path.join(tempdir,
"model_bad.tiff"), hash_val="0"*64, hash_type=SAMPLE_TIFF_HASH_TYPE)).
Additionally, wrap that download_url invocation in a narrow network-exception
guard so CI flakes don’t fail the assertion: catch network-related exceptions
(e.g., requests.exceptions.RequestException / urllib.error.URLError / OSError)
and call or re-use skip_if_downloading_fails() (or otherwise skip the test) for
those cases, but do not catch or mask the ValueError path so the assertRaises
still verifies the hash-error behavior.

---

Outside diff comments:
In `@monai/apps/utils.py`:
- Around line 215-276: The post-download hash-failure currently raises
ValueError (the raise after logger.info and check_hash) which conflicts with the
docstring/tests expecting a RuntimeError and a message starting with
"{hash_type} check"; change that raise to raise RuntimeError and restore the
message prefix to "{hash_type} check of downloaded file failed: URL={url},
filepath={filepath}, expected {hash_type}={hash_val}..." so callers/tests that
catch RuntimeError and check str(e).startswith("{hash_type} check") will
continue to work; keep the check_hash call and other surrounding logic
(functions: check_hash, _download_with_progress, _basename) unchanged.

In `@tests/test_utils.py`:
- Around line 179-187: The except chain currently mis-scopes rt_e and swallows
RuntimeError/OSError; move the logic so that the (RuntimeError, OSError) except
(the block that defines rt_e) either re-raises the original exception when its
message doesn't match DOWNLOAD_FAIL_MSGS (use "raise rt_e from rt_e" or simply
"raise" inside that except to preserve traceback), and keep the ValueError
except separate (handle "hash check" by raising unittest.SkipTest with v_e,
otherwise re-raise the ValueError with "raise" or "raise v_e from v_e"); ensure
rt_e is not referenced inside the except ValueError block and that all re-raises
include proper "from" chaining where applicable.

---

Nitpick comments:
In `@tests/test_utils.py`:
- Around line 195-218: Update the docstring and test to reflect the correct
exception contract for download_url: change the "Raises: RuntimeError" entry in
the test_download_url docstring to "Raises: ValueError" and update the assertion
that expects an exception (self.assertRaises(RuntimeError)) to expect ValueError
instead; refer to the TestDownloadUrl class and its test_download_url method and
the download_url call to ensure the docstring and assertion match the
download_url behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e6a9e321-e574-4979-b315-d9e2ce06b551

📥 Commits

Reviewing files that changed from the base of the PR and between 6296784 and 840861f.

📒 Files selected for processing (2)
  • monai/apps/utils.py
  • tests/test_utils.py

Comment thread tests/test_utils.py Outdated
"network issue",
"gdown dependency", # gdown not installed
"md5 check",
"hash check", # check hash value of downloaded file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

"hash check" no longer matches the existing-file failure message.

The existing-file branch in download_url raises RuntimeError(f"{hash_type} check of existing file failed: ...") — which does not contain the substring "hash check". Only the new downloaded-file message ("{hash_type} hash check of downloaded file failed") matches. Consider using " check" or keeping both "md5 check" and "hash check", or aligning both error messages to share a common phrase.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_utils.py` at line 82, The test expectation for the hash-failure
message doesn't match the RuntimeError raised in download_url's existing-file
branch (RuntimeError(f"{hash_type} check of existing file failed: ...")) —
update the test in tests/test_utils.py to assert a common substring present in
both messages (e.g., " check" or include both variants like f"{hash_type} check"
and "hash check") or change the assertion to check for f"{hash_type} check" so
it matches the existing-file error and the downloaded-file error consistently;
reference the download_url function and the RuntimeError message formatting when
making the change.

Comment thread tests/test_utils.py
Comment on lines +212 to +218
with self.assertRaises(RuntimeError):
download_url(
url=SAMPLE_TIFF,
filepath=os.path.join(tempdir, "model_bad.tiff"),
hash_val="0" * 64,
hash_type=SAMPLE_TIFF_HASH_TYPE,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

assertRaises(RuntimeError) will not match — download_url now raises ValueError on this path.

model_bad.tiff does not pre-exist, so this goes through the fresh-download branch, which per the companion change in monai/apps/utils.py now raises ValueError (not RuntimeError). The test will fail deterministically whenever the network is healthy enough to complete the download.

-            with self.assertRaises(RuntimeError):
+            with self.assertRaises(ValueError):
                 download_url(
                     url=SAMPLE_TIFF,
                     filepath=os.path.join(tempdir, "model_bad.tiff"),
                     hash_val="0" * 64,
                     hash_type=SAMPLE_TIFF_HASH_TYPE,
                 )

Also worth wrapping network exceptions here so CI flakes don't masquerade as assertion failures — but per @ericspod's earlier comment you want the second download to be a real attempt, so only catch non-hash errors (e.g., let skip_if_downloading_fails() handle transients while still asserting the hash path raises).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_utils.py` around lines 212 - 218, The test currently expects a
RuntimeError but download_url now raises ValueError on the
fresh-download/hash-mismatch path: update the assertion to use
self.assertRaises(ValueError) for the download_url call (identify the call by
download_url(..., filepath=os.path.join(tempdir, "model_bad.tiff"),
hash_val="0"*64, hash_type=SAMPLE_TIFF_HASH_TYPE)). Additionally, wrap that
download_url invocation in a narrow network-exception guard so CI flakes don’t
fail the assertion: catch network-related exceptions (e.g.,
requests.exceptions.RequestException / urllib.error.URLError / OSError) and call
or re-use skip_if_downloading_fails() (or otherwise skip the test) for those
cases, but do not catch or mask the ValueError path so the assertRaises still
verifies the hash-error behavior.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
tests/test_utils.py (2)

198-218: ⚠️ Potential issue | 🔴 Critical

Expect ValueError for the bad-hash download.

download_url now raises ValueError on the fresh-download hash-mismatch path, so this assertion and docstring are stale.

Proposed fix
-        Raises:
-            RuntimeError: When the downloaded file's hash does not match.
+        Raises:
+            AssertionError: If the bad hash does not fail validation.
@@
-            with self.assertRaises(RuntimeError):
+            with self.assertRaises(ValueError):
                 download_url(
                     url=SAMPLE_TIFF,
                     filepath=os.path.join(tempdir, "model_bad.tiff"),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_utils.py` around lines 198 - 218, Update the test and docstring to
expect ValueError instead of RuntimeError for a hash-mismatch on a fresh
download: change the test_download_url docstring "Raises" section and the
assertion using self.assertRaises(RuntimeError) to
self.assertRaises(ValueError), referencing the download_url function and the
test_download_url test that triggers the bad-hash path.

78-87: ⚠️ Potential issue | 🟡 Minor

Keep a sentinel for existing-file hash failures too.

"hash check" matches downloaded-file failures, but not the existing-file message pattern like sha256 check of existing file failed, so cached hash failures may no longer be skipped.

Proposed fix
-    "hash check",    # check hash value of downloaded file
+    "hash check",  # check hash value of downloaded file
+    "check of existing file failed",  # check hash value of existing file
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_utils.py` around lines 78 - 87, The DOWNLOAD_FAIL_MSGS tuple used
to detect download/cache failures misses the cached-file hash-failure pattern,
so add an additional sentinel string that matches messages like "sha256 check of
existing file failed" (or a broader phrase such as "check of existing file
failed" or "sha256 check") to DOWNLOAD_FAIL_MSGS so cached hash failures are
recognized and skipped; update the DOWNLOAD_FAIL_MSGS definition (the
DOWNLOAD_FAIL_MSGS symbol) to include that sentinel alongside the existing "hash
check" entry.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_utils.py`:
- Around line 185-187: The except block in tests/test_utils.py catches
ValueError as v_e and only raises SkipTest when the message contains "hash
check", but swallows all other ValueError instances; update the except
ValueError handler (the branch handling v_e) so that if "hash check" is not in
str(v_e) you re-raise the original ValueError (raise v_e) instead of letting it
pass silently—keep the existing SkipTest behavior when the message matches.

---

Duplicate comments:
In `@tests/test_utils.py`:
- Around line 198-218: Update the test and docstring to expect ValueError
instead of RuntimeError for a hash-mismatch on a fresh download: change the
test_download_url docstring "Raises" section and the assertion using
self.assertRaises(RuntimeError) to self.assertRaises(ValueError), referencing
the download_url function and the test_download_url test that triggers the
bad-hash path.
- Around line 78-87: The DOWNLOAD_FAIL_MSGS tuple used to detect download/cache
failures misses the cached-file hash-failure pattern, so add an additional
sentinel string that matches messages like "sha256 check of existing file
failed" (or a broader phrase such as "check of existing file failed" or "sha256
check") to DOWNLOAD_FAIL_MSGS so cached hash failures are recognized and
skipped; update the DOWNLOAD_FAIL_MSGS definition (the DOWNLOAD_FAIL_MSGS
symbol) to include that sentinel alongside the existing "hash check" entry.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e8f5b280-b860-48e2-8ba3-4aef3a1f0fe9

📥 Commits

Reviewing files that changed from the base of the PR and between 840861f and 16e552e.

📒 Files selected for processing (1)
  • tests/test_utils.py

Comment thread tests/test_utils.py Outdated
Comment on lines +185 to +187
except ValueError as v_e:
if "hash check" in str(v_e):
raise unittest.SkipTest(f"Hash value error while downloading: {v_e}") from v_e
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Re-raise unrelated ValueErrors.

A non-hash ValueError currently falls through and gets swallowed, which can make failing tests pass.

Proposed fix
     except ValueError as v_e:
         if "hash check" in str(v_e):
             raise unittest.SkipTest(f"Hash value error while downloading: {v_e}") from v_e
+        raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_utils.py` around lines 185 - 187, The except block in
tests/test_utils.py catches ValueError as v_e and only raises SkipTest when the
message contains "hash check", but swallows all other ValueError instances;
update the except ValueError handler (the branch handling v_e) so that if "hash
check" is not in str(v_e) you re-raise the original ValueError (raise v_e)
instead of letting it pass silently—keep the existing SkipTest behavior when the
message matches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Moving Testing Data to HuggingFace

2 participants