Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A network request made from a worker is still subject to blocking rule(s) even when allowAllRequests is in place #737

Open
Yuki2718 opened this issue Dec 9, 2024 · 11 comments
Labels
needs-triage: chrome Chrome needs to assess this issue for the first time needs-triage: safari Safari needs to assess this issue for the first time topic: dnr Related to declarativeNetRequest

Comments

@Yuki2718
Copy link

Yuki2718 commented Dec 9, 2024

uBlockOrigin/uBOL-home#247

@github-actions github-actions bot added needs-triage: chrome Chrome needs to assess this issue for the first time needs-triage: firefox Firefox needs to assess this issue for the first time needs-triage: safari Safari needs to assess this issue for the first time labels Dec 9, 2024
@gorhill
Copy link

gorhill commented Dec 9, 2024

The issue is that using allowAllRequests on a specific origin still does cause network requests from 3rd-party service workers to be subjected to DNR block rules.

The steps to reproduce:

When selecting "no filtering", uBOL adds a dynamic DNR allowAllRequests rule for the site, as follow:

{
  "action": {
    "type": "allowAllRequests"
  },
  "condition": {
    "requestDomains": [
      "zaber.com"
    ],
    "resourceTypes": [
      "main_frame"
    ]
  },
  "id": "_dynamic/8000000",
  "priority": 100
}

The page will be reloaded. See that there are still network requests blocked as per Network pane despite the allowAllRequests:

image

My understanding is that these network requests originate from service worker loaded from https://www.googletagmanager.com/gtm.js?id=GTM-PFVSHQ9&ngsw-bypass=true, and they are not being seen as coming from https://www.zaber.com directly or indirectly.

@oliverdunk
Copy link
Member

Hi both, thanks for raising this. We discussed it in yesterday's meeting (#742) and were all in agreement that we want to expose as much as possible to the API - and that service worker requests should be part of that.

At the same time, the current behavior is likely working as intended since the request is being made from a service worker. It's hard to attribute a request from a service worker to a specific tab and so it's unclear how allowAllRequests would apply.

Do you have any thoughts on what the requirements are for the most important use cases, and what changes we could make to the API to support that? For example, we could apply allowAllRequests to requests from regular workers (those are more clearly attributed to a specific tab) and then have another mechanism to handle requests from service workers and shared workers.

@ghostwords
Copy link

ghostwords commented Dec 20, 2024

It's hard to attribute a request from a service worker to a specific tab and so it's unclear how allowAllRequests would apply.

The simple fact is that extension authors could apply workarounds to get "disable on a site" functionality to work in Manifest V2. Now we are stuck with Manifest V3, and just as feared, we have to convince browsers to fix what already worked well enough in MV2.

@ghostwords
Copy link

When there is an allowAllRequests rule for a site domain, all requests should be allowed for that site. End of story. If that's not what's happening, browsers need to fix that. Oh, are service worker-initiated requests hard to attribute? We know!! Welcome to the club! Now fix the bugs please.

@oliverdunk
Copy link
Member

When there is an allowAllRequests rule for a site domain, all requests should be allowed for that site.

The allowAllRequests action applies to a particular document and its subframes - not any request associated with a given site. That isn't to say we don't see the value in having a way to disable DNR rules on a given site - it's very much the opposite. It's just that we might want to use a different mechanism for that.

On service workers specifically, it's possible to write code like this:

const response = fetch("https://example.com/data.json");

self.addEventListener("fetch", (event) => {
  event.respondWith(response);
});

In that case, the request for data.json could have started before the tab was even open and isn't associated with any site - in fact it wasn't made for any particular request.

There are certainly some more feasible things to look at - for example maybe we could block the service worker from ever seeing a fetch callback if a request was made from a page with allowAllRequests. That's worth discussing though to make sure what we come up with addresses the use cases you have.

@ghostwords
Copy link

If allowAllRequests() doesn't actually mean "disable for site", then Manifest V3 is missing "disable for site" functionality. Then, fine, it's not that allowAllRequests() is buggy, it's that MV3 is missing an important use case. The end result is the same, extensions are worse off than before.

@gorhill
Copy link

gorhill commented Dec 20, 2024

Our filter ||googletagmanager.com^ can be changed on our side to be better targeted => ||googletagmanager.com^$third-party.

This fixes a couple of undue blocking. However there are still network requests to googletagmanager.com which are still blocked/redirected locally, here are the details of these requests:

Request 1:
{
  "documentLifecycle": "active",
  "frameId": -1,
  "frameType": "outermost_frame",
  "initiator": "https://www.zaber.com",
  "method": "GET",
  "parentFrameId": -1,
  "requestId": "11208",
  "tabId": -1,
  "type": "xmlhttprequest",
  "url": "https://www.googletagmanager.com/gtag/js?id=G-LPVSPC812X&l=dataLayer&cx=c&gtm=45He4cc1v841624434za200"
}

Triggered local redirect rule:

{
  "action": {
    "redirect": {
      "extensionPath": "/web_accessible_resources/google-analytics_analytics.js"
    },
    "type": "redirect"
  },
  "condition": {
    "resourceTypes": [
      "script",
      "xmlhttprequest"
    ],
    "urlFilter": "||googletagmanager.com/gtag/js"
  },
  "id": "_dynamic/428",
  "priority": 20
}
Request 2:
{
  "documentLifecycle": "active",
  "frameId": -1,
  "frameType": "outermost_frame",
  "initiator": "https://www.zaber.com",
  "method": "GET",
  "parentFrameId": -1,
  "requestId": "11210",
  "tabId": -1,
  "type": "xmlhttprequest",
  "url": "https://www.googletagmanager.com/gtag/destination?id=AW-1071949151&l=dataLayer&cx=c&gtm=45He4cc1v841624434za200"
}

Triggered block rule:

{
  "action": {
    "type": "block"
  },
  "condition": {
    "domainType": "thirdParty",
    "requestDomains": [
      "0emm.com",
      "0stats.com",
      "100widgets.com",
      "103bees.com",
      "105app.com",
      "11nux.com",
      "123-counter.de",
      "123compteur.com",
      "123count.com",
      "123date.me",
      "...",
      "zontera.com",
      "zoomanalytics.co",
      "zoomino.com",
      "zoosnet.net",
      "zoossoft.net",
      "zprk.io",
      "ztcadx.com",
      "zuzab.com",
      "zwaar.org",
      "zx-adnet.com"
    ]
  },
  "id": "default/12396",
  "priority": 10
}

Two other network requests to googletagmanager.com where properly not blocked because of allow/allAllRequests rules for zaber.com:

High priority allow rules which were added as a result of the site being set to no-filtering:
{
  "action": {
    "type": "allow"
  },
  "condition": {
    "initiatorDomains": [
      "zaber.com"
    ],
    "resourceTypes": [
      "script"
    ]
  },
  "id": "_dynamic/8000001",
  "priority": 100
}

{
  "action": {
    "type": "allowAllRequests"
  },
  "condition": {
    "requestDomains": [
      "zaber.com"
    ],
    "resourceTypes": [
      "main_frame"
    ]
  },
  "id": "_dynamic/8000000",
  "priority": 100
}

I gather that those two last rules worked because the network request initiator was https://www.zaber.com and originating from the actual tab.

The initiator for the two unduly blocked googletagmanager.com requests is also https://www.zaber.com, except that they do not originate from an actual tab, probably from a worker. It seems to me the allowAllRequest rule could have been applied to these tabless network requests.

@gorhill
Copy link

gorhill commented Dec 20, 2024

Sorry, shortly after I commented above, I realized that the blocked network requests were xmlhttprequest. If I add that resource type to the allow rule for zaber.com, then this fixes the issue:

{
  "action": {
    "type": "allow"
  },
  "condition": {
    "initiatorDomains": [
      "zaber.com"
    ],
    "resourceTypes": [
      "script",
      "xmlhttprequest"
    ]
  },
  "id": "_dynamic/8000001",
  "priority": 100
}

So in the end no issue, we just need to use better targeted filters, and better targeted allow rules.

@gorhill
Copy link

gorhill commented Dec 20, 2024

Actually there is still just a small hurdle left to properly enforce disabling a content blocker on a specific site: the issue is that the tabIds property in RuleCondition can be used only for session rules, not dynamic rules:

List of tabs.Tab.id which the rule should match. An ID of tabs.TAB_ID_NONE matches requests which don't originate from a tab. An empty list is not allowed. Only supported for session-scoped rules.

I understand that it does not make sense to use tab ids for actual tabs, which is going to be different from a session to another, but this is not true for tabs.TAB_ID_NONE. Being able to use tabs.TAB_ID_NONE for dynamic rules would solve the issue of properly creating allow/allowAllRequests rules to disable content blocking for specific sites.

The idea is that when someone disable content blocking for a specific site, we want to allow all network requests for when the site is loaded as top site, not necessarily when the site is loaded embedded in a sub_frame. So being able to target only main_frame and worker (tabs.TAB_ID_NONE) would ensure no undue allowing in case of a site which is embedded, which would cause discrepancy between the state of the blocker (as per popup panel) and the undue allowing in the embedded frame. Not sure I am being clear here.

I suppose for now I could re-create those session rules at extension launch, but I am not sure they could be enforced in time in case the user navigate to a site by the browser launching from an external application.

@gorhill
Copy link

gorhill commented Dec 21, 2024

As far as uBO Lite is concerned, I consider this issue solved with no changes required to DNR API. It can be closed unless other content blockers suffering the same issue think differently.

There could still be edge cases of creating the session rules after a page started loading at browser launch, but I think this should be its own issue if ever this becomes an actual issue to end users.

@Rob--W Rob--W added topic: dnr Related to declarativeNetRequest and removed needs-triage: firefox Firefox needs to assess this issue for the first time labels Dec 24, 2024
@Rob--W
Copy link
Member

Rob--W commented Dec 24, 2024

There is some discussion above on the usefulness of "allowAllRequests" and the infeasibility of supporting that in general with service workers. If there are concrete proposals to address that use case, I'd welcome separate issues so that they can be evaluated separately by the members of this group, including the browser vendors.

Since this report was originally motivated by a use case from uBlock Origin, and @gorhill stated that this is a non-issue with the current DNR API, I'm inclined to close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-triage: chrome Chrome needs to assess this issue for the first time needs-triage: safari Safari needs to assess this issue for the first time topic: dnr Related to declarativeNetRequest
Projects
None yet
Development

No branches or pull requests

5 participants