Reduce false positives in privacy policy check #305
Closed
rbowen wants to merge 1 commit into apache:master from
Many projects host their own privacy policy page on their *.apache.org subdomain (e.g., beam.apache.org/privacy_policy, karaf.apache.org/privacy.html). These pages typically mirror or link to the canonical ASF privacy policy, but are currently flagged as non-compliant because the validation regex only accepts two exact canonical URLs. This change adds a third alternative that accepts any *.apache.org URL containing 'privac' (covering privacy, privacy-policy, privacypolicy, etc.). This eliminates 17 of 19 privacy warnings as false positives while still correctly rejecting links to non-ASF domains (e.g., policies.google.com). Also adds rspec tests for the privacy check.
Contributor
-1 According to https://www.apache.org/foundation/marks/pmcs.html#navigation, projects must link to the privacy website. There is no option for alternatives. Whimsy needs to follow the policy.
Contributor
Or the policy needs to be changed.
Contributor, Author
Noted.
Problem: The privacy policy site check flags 17 projects as non-compliant even though they link to valid privacy policy pages hosted on their own *.apache.org subdomain. That is an 89% false positive rate for this check (17 of 19 warnings).
Fix: Expand the CHECK_VALIDATE regex to also accept any URL on *.apache.org that contains "privac" in the path.
Projects that will move from WARN → PASS: beam, bookkeeper, bval, helix, hudi, johnzon, karaf, knox, openjpa, opennlp, pig, shiro, systemds, tomee, uima, unomi, zookeeper
Still correctly rejected: policies.google.com/privacy (parquet), github.com/apache/privacy-website (dataprivacy)
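The proposed third alternative could look roughly like the sketch below. This is an illustration only, not the regex from the commit: the constant names, the helper `privacy_link_ok?`, and the two canonical URL patterns are assumptions, since the PR text does not quote the existing CHECK_VALIDATE value.

```ruby
# Hypothetical sketch. The two canonical patterns are assumed for
# illustration and are not copied from the Whimsy source.
CANONICAL_PRIVACY = %r{\Ahttps?://privacy\.apache\.org/policies/privacy-policy-public\.html\z}i
LEGACY_PRIVACY    = %r{\Ahttps?://(?:www\.)?apache\.org/foundation/policies/privacy\.html\z}i

# Third alternative as described in the PR: any host under *.apache.org
# whose path contains "privac" (privacy, privacy-policy, privacypolicy, ...).
# Anchoring at the start keeps the host check from matching inside a path.
SUBDOMAIN_PRIVACY = %r{\Ahttps?://[a-z0-9-]+(?:\.[a-z0-9-]+)*\.apache\.org/[^?#]*privac}i

# Regexp.union keeps each pattern's own flags.
PRIVACY_VALIDATE = Regexp.union(CANONICAL_PRIVACY, LEGACY_PRIVACY, SUBDOMAIN_PRIVACY)

# Returns true when the link would pass the relaxed check.
def privacy_link_ok?(url)
  PRIVACY_VALIDATE.match?(url)
end
```

Under these assumptions, `privacy_link_ok?("https://beam.apache.org/privacy_policy")` passes, while `privacy_link_ok?("https://policies.google.com/privacy")` and `privacy_link_ok?("https://github.com/apache/privacy-website")` are still rejected because neither host is under apache.org.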