8000 Weather forecast spam for a Dutch herring fast food shop · Issue #48730 · home-assistant/core · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Weather forecast spam for a Dutch herring fast food shop #48730

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
geira opened this issue Apr 6, 2021 · 27 comments
Closed

Weather forecast spam for a Dutch herring fast food shop #48730

geira opened this issue Apr 6, 2021 · 27 comments

Comments

@geira
Copy link
geira commented Apr 6, 2021

The problem

While generating a heatmap of locations used on api.met.no weather forecast, we found a couple of hotspots with significantly more traffic than the rest of the world. One of these are coming from 5250 different ip addresses around the world, all running HomeAssistant. Together these generate 6800+ requests per hour for lat=52.3731&lon=4.8903, which is a herring snack bar on Raadhuisstraat in Amsterdam called Haring & Zo.

Now, I can think of no reason why users in Singapore should be checking the weather for Amsterdam 3 times per second unless there is a default location hardcoded somewhere in the HA code. I have tried searching the core source code for these coordinates but have found nothing. Since all these requests are a) unnecessary and b) screw up our statistics it would be nice if you could find the bug or alternatively filter out the spam.

I have checked the logs, and the problem seem to be present in 100+ different versions. The following are the ones generating the most spam:
8867 "HomeAssistant/2021.3.4
2498 "HomeAssistant/2021.2.3
1630 "HomeAssistant/2021.1.5
650 "HomeAssistant/2020.12.1
640 "HomeAssistant/2021.1.4
616 "HomeAssistant/2021.3.3
531 "HomeAssistant/2021.3.2
385 "HomeAssistant/2020.12.2
374 "HomeAssistant/2021.2.1
371 "HomeAssistant/0.116.4
309 "HomeAssistant/2021.1.1
302 "HomeAssistant/0.114.4

Feel free to contact us on weatherapi-adm@met.no for more information.

What is version of Home Assistant Core has the issue?

2021.3.4

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Core

Integration causing the issue

met

Link to integration documentation on our website

https://www.home-assistant.io/integrations/met

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

@probot-home-assistant
Copy link

met documentation
met source
(message by IssueLinks)

@probot-home-assistant
Copy link

Hey there @Danielhiversen, @thimic, mind taking a look at this issue as its been labeled with an integration (met) you are listed as a codeowner for? Thanks!
(message by CodeOwnersMention)

@Danielhiversen
Copy link
Member

Thanks for the report.
52.3731, 4.8903 is the default location if no location is specified by the user.

So it is approximately 1 request per Home Assistant instance per hour? That would be expected.

@geira
Copy link
Author
geira commented Apr 6, 2021

Yes, it's about once per hour, but why should a user in Mexico ask for the weather in Netherlands? It seems all these users haven't set their location properly and so defaults to Amsterdam. Here is a list of the top IP address origin countries for the spam requests:

   3708 US
   3301 NL
   2798 DE
   2088 CN
   1029 GB
    925 RU
    837 IT
    784 FR
    695 PL
    563 CA
    546 SE
    531 BE
    472 ES
    427 CH
    391 AU
    385 AT
    365 DK
    364 TW
    361 UA
    359 NO
    333 CZ
    319 VN
    276 KR
    275 RO
    271 PT
    253 HU
    ...

Why should 87 different users in Vietnam be checking the weather for Amsterdam every hour? This is too regular to be a coincidence, so must be a bug.

@geira
Copy link
Author
geira commented Apr 6, 2021

latlon_heatmap

Here is a heatmap of today's traffic to api.met.no. The left hotspot is all coming from HA. The right one is a border crossing between Germany and the Czech Republic, all coming from a user in Italy which has been blocked.

@Danielhiversen
Copy link
Member

Why should 87 different users in Vietnam be checking the weather for Amsterdam every hour?

52.3731, 4.8903 is the default location if no location is specified by the user.

From the onboarding:
image

@geira
Copy link
Author
geira commented Apr 6, 2021

Well, I can't see much use for defaulting to this, and there certainly is not point in asking for weather forecasts unless this field has been populated. I would recommend setting this field to undefined so as not to confuse your users with bogus data.

As per our Terms of Service you must not cause unnecessary traffic. This is clearly unnecessary since it serves no purpose for the users. While the traffic amount is not harmful it impedes our systems for monitoring and statistics. Please filter out such traffic or we will be forced to implement throttling of all HA users.

@tdejneka
Copy link
tdejneka commented Apr 6, 2021

I haven't installed a fresh copy of Home Assistant recently so I can't recall for certain if the Met Office integration is installed by default. I believe it is, so perhaps the simplest solution is to remove the Met Office integration from the onboarding process. It will not resolve the many instances already generating needless traffic to met.no but at least it will help prevent additional traffic.

@Danielhiversen
Copy link
Member
Danielhiversen commented Apr 6, 2021

Well, I can't see much use for defaulting to this,

It is the default location of the home, so not only used by the met.no component.

As per our Terms of Service you must not cause unnecessary traffic. This is clearly unnecessary since it serves no purpose for the users. While the traffic amount is not harmful it impedes our systems for monitoring and statistics. Please filter out such traffic or we will be forced to implement throttling of all HA users.

Then I think we have to remove met.no as the default weat 8000 her component
The purpose is to let the user easily get an impression of how HA works by showing a weather component after the onboarding. It will always be someone that is not changing the home location, and will then get the weather from the default location.

@Danielhiversen
Copy link
Member

My comment was meant as a proposal for the other HA devs, and not as a threat. Sorry if that was unclear.
@user897943 You are welcome to propose another solution.

@frenck
Copy link
Member
frenck commented Apr 6, 2021

I think your comments are off-topic and more a general perspective than the problem discussed here. Please use the community forums for general discussions @user897943. Thanks 👍

@frenck
Copy link
Member
frenck commented Apr 6, 2021

So, a couple of things going on here that we could interact on and help with.

It is indeed the default location. Although as stated by @geira:

This is clearly unnecessary since it serves no purpose for the users.

I don't agree with that statement. From Home Assistant's perspective, it sets up a starting environment to work with. Thus has a purpose. I can see, however, how that is different from your perspective.

Just to be clear on any path forward. For any current systems, we cannot change the current behavior, as it is local systems we don't control or can change the behavior of. With all our goodwill and co-operation.

Then I think we have to remove met.no as the default weather component

That doesn't solve it for any existing systems.

I suggest we start looking at solving this issue in 2 ways, to get working on the current problem first, before looking into new solutions.

  1. If the default coordinates are used, we don't spawn the integration (Not loading the config entry).
  2. For the onboarding, we do similar, however, we do not create the config entry to begin with.

That would address and solve the current concerns raised, without changing actual goals.

Especially point 1, would help with any existing systems (with the big "if"; if they upgrade to a release that handles this).

@frenck
Copy link
Member
frenck commented Apr 6, 2021

@user897943 You are generalizing again instead of focussing on the problem presented. Please use the community forums. Consider this the last warning.

@Danielhiversen
Copy link
Member

@frenck
Where are the default coordinates sat? In leaflet?

@frenck
Copy link
Member
frenck commented Apr 6, 2021

@Danielhiversen Its in the frontend:

const amsterdam = [52.3731339, 4.8903147];

https://github.com/home-assistant/frontend/blob/109910d18f379e9a23cfc54f3f209e1d06042d72/src/onboarding/onboarding-core-config.ts#L29

So, it gets into the backend by the frontend onboarding at this point.

@frenck
8000
Copy link
Member
frenck commented Apr 6, 2021

This is also exactly how it ends up in the core config storage
image

@frenck
Copy link
Member
frenck commented Apr 6, 2021

@Danielhiversen Not sure if you are working on this or not. But do you agree on the given path?
I've whipped something up just now, that might be reasonable to tag into the release tomorrow (so could do a PR)

@Danielhiversen
Copy link
Member

I agree with you 👍
Please, open a PR

@frenck
Copy link
Member
frenck commented Apr 6, 2021

Opened up a PR here: #48741

It is not closing this issue on merge, so we can keep this channel open for a moment to discuss effects.

@geira
Copy link
Author
geira commented Apr 7, 2021

API usage

Perhaps some actual data might illustrate the problem. HomeAssistant is by far the largest user of the locationforecast API, with a quarter of the total traffic. While the forecasting service is primarily made for Yr, they don't use the API which is a public, free service funded by the Norwegian taxpayers. As such it is subject to the tyrrany of the commons and require users to cooperate in order to function efficiently.

The Terms of Service allows you to make up to 20 req/sec without special permission. HA uses about 5 times as much, averaging 100 req/sec. We've never made an issue of this as we consider HA a useful service and wish to contribute to FOSS since all our systems are built on open source. We do however expect your full cooperation when your implementation creates problems at our end, potentially disrupting service and generating problems for other users. Stifling criticism is not considered very cooperative.

You are of course welcome to choose other, commercial providers instead of us by default. As previously mentioned though this does not solve the issue with the existing installations, of which we estimate there is about 250,000. These would naturally have to update their installations, but judging from experience this should not pose a problem since 75% of them have already upgraded after we terminated locationforecast/1.9 in favour of 2.0.

Filtering out requests where the location is not set is one option, but that would show a blank card. Another could be to randomize the location for every request to show example forecasts for different parts of the world. There is nothing to say that the weather forecast location must match where the user's location (owners of caravans and holiday houses often use weather forecasts to plan leisure activities), which also solves a lot of privacy concerns. Even setting it to 0,0 would be preferable, as we already filter out this due to a large number of buggy robots and the Gulf of Guinea is not likely to be confused with legitimate requests. Assuming everyone lives in Amsterdam unless otherwise specified does however seem like an unfortunate assumption with potentially unforeseen consequences.

@Danielhiversen
Copy link
Member
Danielhiversen commented Apr 7, 2021

Thanks for your comments again.
Some good news, is that we have fixed the issue so we will not fetch weather data if the user does not specify their location: #48741
We also fixed an issue creating some unnecessary request when Home Assistant starts up: #48756

Both will be part of the release today.

Just remember:

For any current systems, we cannot change the current behavior, as it is local systems we don't control or can change the behavior of.

So it will not change before the user update his system.

Please let us know if it is anything else we should improve

@frenck
Copy link
Member
frenck commented Apr 7, 2021

We've never made an issue of this as we consider HA a useful service and wish to contribute to FOSS since all our systems are built on open source.

Which we appreciated and been thankful for, now and in the past.

We do however expect your full cooperation when your implementation creates problems at our end, potentially disrupting service and generating problems for other users.

I think we have been very cooperative? We implemented 2 changes yesterday, based on the above. All other raised concerns have been addressed in the past as well.

We have even added it to the beta for the release today, which actually was cut last week. If there is an issue like this, we absolutely want to help mitigate it immediately.

Stifling criticism is not considered very cooperative.

? I'm confused, could you clarify this?

Filtering out requests where the location is not set is one option

Yeah, implemented something similar, we don't load the integration at all if that is the case as of today's release.

Even setting it to 0,0 would be preferable, as we already filter out this due to a large number of buggy robots and the Gulf of Guinea is not likely to be confused with legitimate requests.

Setting it to 0,0 by default during onboarding is definitely a better option, yet, those requests should not end up with you guys in those cases. The current implementation that will be shipped today already incorporates that scenario. Right now, the default Amsterdam location coordinates and 0,0 will be handled the same way: Not loading at all.

And as mentioned above by @Danielhiversen, if there are other problems like described as in this issue, please let us know. Your service is appreciated by us and the community.

@samnewman86
Copy link

API usage

Perhaps some actual data might illustrate the problem. HomeAssistant is by far the largest user of the locationforecast API, with a quarter of the total traffic. While the forecasting service is primarily made for Yr, they don't use the API which is a public, free service funded by the Norwegian taxpayers. As such it is subject to the tyrrany of the commons and require users to cooperate in order to function efficiently.

The Terms of Service allows you to make up to 20 req/sec without special permission. HA uses about 5 times as much, averaging 100 req/sec. We've never made an issue of this as we consider HA a useful service and wish to contribute to FOSS since all our systems are built on open source. We do however expect your full cooperation when your implementation creates problems at our end, potentially disrupting service and generating problems for other users. Stifling criticism is not considered very cooperative.

You are of course welcome to choose other, commercial providers instead of us by default. As previously mentioned though this does not solve the issue with the existing installations, of which we estimate there is about 250,000. These would naturally have to update their installations, but judging from experience this should not pose a problem since 75% of them have already upgraded after we terminated locationforecast/1.9 in favour of 2.0.

Filtering out requests where the location is not set is one option, but that would show a blank card. Another could be to randomize the location for every request to show example forecasts for different parts of the world. There is nothing to say that the weather forecast location must match where the user's location (owners of caravans and holiday houses often use weather forecasts to plan leisure activities), which also solves a lot of privacy concerns. Even setting it to 0,0 would be preferable, as we already filter out this due to a large number of buggy robots and the Gulf of Guinea is not likely to be confused with legitimate requests. Assuming everyone lives in Amsterdam unless otherwise specified does however seem like an unfortunate assumption with potentially unforeseen consequences.

Thanks for coming here's to notify us about this, so a fix could be made.
After all it would have been easy for you guys just to block requests from home assistant users.

@github-actions
Copy link
github-actions bot commented Jul 7, 2021

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jul 7, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Aug 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants
@geira @frenck @Danielhiversen @tdejneka @samnewman86 and others
0