Fixes pypa#1998. Currently, linkcheck blocks PRs if any link in the project isn't accessible. Due to the high number of links, this regularly blocks unrelated PRs (see pypa#2018 for a recent example). This PR moves linkcheck to a separate, non-blocking check. It will still run on PRs and report a failure if a newly added link 404s (pypa#1998 (comment)), but an unrelated failure won't block PRs anymore. We also run the check on a daily cron, so that links going away are surfaced separately from PRs.
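For illustration, the non-blocking setup described above can be sketched as a GitHub Actions workflow; the workflow, job, and step names here are hypothetical, and the job stays non-blocking simply by not being listed among the branch protection's required status checks:

```yaml
# Hypothetical sketch: run linkcheck on PRs and on a daily schedule.
# A failure is still reported on the PR, but because this job is not a
# required check, it does not block merging.
name: linkcheck
on:
  pull_request:
  schedule:
    - cron: "0 0 * * *"  # once a day, at midnight UTC

jobs:
  linkcheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: python -m pip install -r requirements.txt
      - run: make linkcheck  # i.e. sphinx-build -b linkcheck ...
```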
I'd rather block PRs on newly added broken links. How can we detect this? Maybe store the Sphinx builder's structured output as an artifact and exclude links that existed prior to the PR?
I'm not aware of any tool or function that supports blocking only on new URLs, but we can add that separately. This PR is about not blocking working PRs on a broken linkcheck, including cases such as changes from accepted PEPs being blocked by linkcheck.
My $0.02 is that we can rely on maintainer discretion here -- maintainers should still review the linkcheck job when it fails (especially when a PR adds new links), but can override it when it fails for transient reasons (like the recent cert issues).
Yes. But my concern is the additional maintenance burden. When a job isn't marked as required, contributors may be tempted to take the easy way out and rely on the maintainer to tell them that it actually should pass, as opposed to the UI showing them that they need to look into it, which may happen without the maintainer's involvement.
So linkcheck produces JSONL like this: https://github.com/ansible/awx-plugins/actions/runs/24041509767#summary-70114297278. If we have a previous run's output pre-downloaded into the job from artifacts/cache, we could diff the current run's output against it and fail only on newly broken links.
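A minimal sketch of that comparison, assuming the Sphinx linkcheck builder's JSONL output (one JSON object per line, with `uri` and `status` keys); the file names and function names here are hypothetical:

```python
import json


def broken_uris(path):
    """Collect the URIs that a linkcheck run flagged as broken."""
    uris = set()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            if entry.get("status") == "broken":
                uris.add(entry["uri"])
    return uris


def newly_broken(baseline_path, current_path):
    """URIs broken in the current run but not in the baseline run."""
    return broken_uris(current_path) - broken_uris(baseline_path)
```

The job would then fail only when `newly_broken()` returns a non-empty set, so pre-existing link rot is reported without blocking the PR.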
As one of the other reviewers, I don't think pushing the burden of mitigating the risk of invalid links back onto human reviewers is a good idea. Another potential source of inspiration for keeping link rot from blocking unrelated PRs would be diff-cover.
The linkcheck still runs, and it is still reported. What this PR changes is that it stops blocking unrelated PRs in a way that can't be mitigated, because the check becomes optional instead of mandatory. The current situation is that all PEP authors and spec maintainers are randomly affected by this. If we specifically want to ensure that linked content stays available, we should archive it in the Internet Archive and provide an archived link next to the live link, as e.g. Wikipedia does: https://en.wikipedia.org/wiki/Python_(programming_language)#References
In my previous comment, I offered an idea of exactly how to mitigate this case.
I can offer this PR as an improvement, because I'm trying to improve spec maintenance and this is a frequent problem, but implementing new linkcheck logic is out of scope for me here.
📚 Documentation preview 📚: https://python-packaging-user-guide--2035.org.readthedocs.build/en/2035/