[00:59:01] 10DBA, 06Labs, 13Patch-For-Review, 07Regression: Tool Labs: Add skin, language, and variant to user_properties_anon - https://phabricator.wikimedia.org/T152043#2836353 (10Andrew) I merged the puppet change, but maybe this needs to be run by hand -- I've never done it. [01:41:01] 10DBA, 10ArchCom-RfC, 10MediaWiki-Database, 07RfC: Should we bump minimum supported MySQL Version? - https://phabricator.wikimedia.org/T161232#3125897 (10Krinkle) +1 for bumping the MySQL requirement to 5.5. | MediaWiki | Note | Released | EOL |--|--|--|-- |... [05:50:08] 10DBA, 13Patch-For-Review: Unify revision table on s7 - https://phabricator.wikimedia.org/T160390#3153003 (10Marostegui) dbstore2001 is done. [06:05:35] 10DBA, 13Patch-For-Review: Unify revision table on s7 - https://phabricator.wikimedia.org/T160390#3153013 (10Marostegui) db1034 was already done, so no need to do that one. [06:37:49] 10DBA, 13Patch-For-Review: Remove partitions from metawiki.pagelinks in s7 - https://phabricator.wikimedia.org/T153300#3153062 (10Marostegui) db2029 is done: ``` root@db2029.codfw.wmnet[metawiki]> select @@hostname; +------------+ | @@hostname | +------------+ | db2029 | +------------+ 1 row in set (0.03 s... [06:45:49] 10DBA: Remove partitioning from db2019 (codfw master) commonswiki.templatelinks - https://phabricator.wikimedia.org/T161683#3153070 (10Marostegui) a:03Marostegui [06:46:00] 10DBA: Remove partitioning from db2019 (codfw master) commonswiki.templatelinks - https://phabricator.wikimedia.org/T161683#3139513 (10Marostegui) This alter table is now running [06:49:36] we have double the usual traffic on s1 [06:49:41] since 20:00 UTC [06:50:58] oh right [06:51:18] https://grafana.wikimedia.org/dashboard/db/mysql-aggregated?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-group=core&var-shard=s1&var-role=All&from=1491202270252&to=1491288670253 [06:51:43] • 20:44 bsitzmann@tin: Started deploy [mobileapps/deploy@20ab197]: Update mobileapps to https://gerrit.wikimedia.org/r/#/q/fdd4e31 [06:51:46] • 19:21 hashar: Finished deployment of project-logos optimization for https://phabricator.wikimedia.org/T161999 / https://gerrit.wikimedia.org/r/#/c/346057/ . And purged the related logos [06:51:50] • 19:18 hashar@tin: Synchronized static/images/project-logos: Optimize a few project logos - https://phabricator.wikimedia.org/T161999 (duration: 00m 44s) [06:52:19] none seem to fit [06:52:59] https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1080&from=now-24h&to=now -> looks pretty big indeed [06:56:29] are dumps running or something? [06:57:29] well, they shouldn't affect db1080 (if they are running) [06:57:45] and db1080 has a big increase in connections, traffic, sorts, etc [06:57:49] true, this is main traffic stuff [06:58:12] I am checking operations logs from yesterday around that time too [07:03:06] it happens around 20:01:40 [07:11:33] I have gone through all the Phabricator tickets around that time looking for relevant updates, but I haven't found anything [07:13:06] I am checking performance_schema to understand which query or queries are sent so frequently [07:29:21] SELECT `page_id` , `page_len` , `page_is_redirect` , `page_latest` , `page_content_model` FROM `page` WHERE `page_namespace` = ? AND `page_title` = ? LIMIT ? [07:29:29] This is the most frequent query by far [07:29:32] A tcpdump reveals that LinkCache::fetchPageRow is the most common one [07:29:40] oh, let me see if it matches my tcpdump [07:29:49] linkcache?
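A minimal sketch of the kind of performance_schema check described above for ranking the most frequent statements by digest, assuming the statement digest instrumentation is enabled on that host; the exact query used is not in the log:
```
-- rank normalized statements by how often they have executed since the last
-- performance_schema reset; SUM_TIMER_WAIT is measured in picoseconds
SELECT DIGEST_TEXT,
       COUNT_STAR AS executions,
       SUM_TIMER_WAIT / 1e12 AS total_latency_seconds
FROM performance_schema.events_statements_summary_by_digest
ORDER BY COUNT_STAR DESC
LIMIT 10;
```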
[07:29:57] SELECT /* LinkCache::fetchPageRow */ page_id,page_len,page_is_redirect,page_latest,page_content_model FROM `page` WHERE page_namespace = '0' AND page_title = 'xx' LIMIT 1 [07:30:02] it is the same [07:30:03] haha [07:31:27] the other frequent one is the heartbeat checks [07:32:05] well, it is interesting that we have reached the same query through two different ways of checking [07:35:51] https://phabricator.wikimedia.org/P5193 [07:37:02] that is such a big difference [07:37:16] it must be that one - I was looking in gerrit and phabricator for traces of that query [07:45:21] the query has always been there [07:45:33] the question is why it is more frequent now [07:45:48] maybe it is just someone querying heavily [07:46:09] but _that_ heavily? [07:46:20] and for that long? [07:48:45] the rc slaves are also affected (not in such a big way, but they also increased their traffic) [07:49:03] (they have less traffic weight ofc) [08:21:21] why is db1052 (s1) master having long running selects? [08:22:08] or at least tendril is showing it [08:26:25] https://grafana-admin.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1052&from=now-3h&to=now [08:27:52] what is the weight? [08:28:22] 0 [08:29:07] no 52 references on the file :-/ [08:29:21] except as a master [08:29:28] yep [08:29:30] problem with sync? [08:29:48] https://tendril.wikimedia.org/report/slow_queries?host=%5Edb1052&user=wikiuser&schema=wik&qmode=eq&query=&hours=1 [08:29:48] I have synced the file many times today already [08:29:53] happening since 8 [08:30:05] yes [08:30:09] and nothing was done at 8 today [08:30:13] as per SAL [08:31:23] the graphs show it is not happening anymore, but why did it happen and why isn't it happening now [08:33:29] and it is only happening on the master [08:33:41] yes [08:33:46] Category::refreshCounts for example is only executing on the master, not on the slaves [08:34:00] let me check other masters [08:34:10] the updates are normal [08:34:18] but they should not take so much time [08:34:58] looks like it is only enwiki [08:36:47] https://grafana-admin.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&from=now-24h&to=now&var-server=db1052&var-network=eth0 nice spikes too [08:37:41] no lag: https://grafana-admin.wikimedia.org/dashboard/db/mysql-replication-lag?orgId=1&var-dc=eqiad%20prometheus%2Fops [08:38:15] so, is there any way apart from changing the weight in the php file that a master can get SELECTs? [08:38:21] (and if there is no lag as you showed?) [08:38:32] how can selects arrive at the master? [08:38:34] someone manually running it? [08:38:42] but then it would be run with another user [08:38:52] yeah [08:38:55] let me check the IPs [08:39:14] bad/corrupt file config? [08:39:24] but we haven't changed the weight in months [08:39:27] *weeks I guess [08:39:37] loadbalancer bug? [08:39:55] mw1197.eqiad.wmnet. [08:40:01] that is a random ip running the select [08:40:44] and: mw1290.eqiad.wmnet. [08:40:46] those two [08:41:24] try to see the group of those servers (text, api, etc.) [08:41:55] yep, will do, checking logs to see if there was any apache restart, rsync, etc [08:46:25] oh, "LOCK IN SHARE MODE" [08:47:01] so probably that should run on the master, but there was contention there [08:47:20] where did you see that?
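For the "let me check the IPs" step above, a minimal illustrative sketch of grouping the master's current connections by client host; this is an assumption about how it could be done, not the exact command that was run:
```
-- group active (non-sleeping) connections by client host and user
SELECT SUBSTRING_INDEX(HOST, ':', 1) AS client_host,
       USER,
       COUNT(*) AS connections
FROM information_schema.PROCESSLIST
WHERE COMMAND <> 'Sleep'
GROUP BY client_host, USER
ORDER BY connections DESC;
```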
[08:47:40] (both servers from api, btw) [08:47:40] https://tendril.wikimedia.org/report/slow_queries_checksum?checksum=8f7978fc16e0c2381f2405ce821ebfe7&host=%5Edb1052&user=wikiuser&schema=wik&hours=1 [08:48:10] this is a one time thing- but we should report it [08:48:31] I think it is a bug [08:48:52] things should not be locked for 1/2 hour [08:49:05] so those selects are meant to be on the master? [08:49:14] yes, I would assume [08:49:24] but there was contention on categorylinks [08:49:37] and that created slow stuff everywhere [08:51:07] that whole explanation makes total sense [08:51:22] the question now is, why did we have contention there... [08:51:55] do you have the api calls handy? [08:52:00] I am writing a ticket [08:52:15] the queries? [08:53:18] https://phabricator.wikimedia.org/P5195 [08:53:33] didn't run show full processlist, only show processlist [08:53:50] ok, don't worry [08:54:13] but looks like this: SELECT /* Category::refreshCounts */ COUNT(*) AS `pages`, COUNT( (CASE WHEN page_namespace = '14' THEN 1 ELSE NULL END) ) AS `subcats`, COUNT( (CASE WHEN page_namespace = '6' THEN 1 ELSE NULL END) ) AS `files` FROM `categorylinks`, `page` WHERE cl_to = 'All_stub_articles' AND (page_id = cl_from) LIMIT 1 LOCK IN SHARE MODE [08:59:09] Category::refreshCounts [08:59:57] we report, and check that it is followed [09:00:03] back to the QPS issue [09:07:22] do you want me to report this issue or were you already doing it? [09:11:06] https://phabricator.wikimedia.org/T162121 [09:11:14] sorry, my paste failed earlier [09:12:29] ah [09:12:30] thanks [09:33:18] 10DBA: convert dbstore1001 to InnoDB compressed by importing db shards to it - https://phabricator.wikimedia.org/T159430#3067264 (10Marostegui) Looks like we are hitting this: https://jira.mariadb.org/browse/MDEV-9027 on dbstore1002, so I would like to convert a couple of tables from tokudb to innodb there and s... [09:41:52] it is not Bing, the number of requests does not increase dramatically at 20h [09:42:20] yeah, I was checking Luca's paste and I was like: maybe I am missing something obvious [09:43:16] and if it is a search engine, I would expect increases in every single shard [09:43:23] at least, a small bump [10:50:11] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153389 (10jcrespo) [10:55:36] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153405 (10jcrespo) Origin ips (under NDA): {P5199} The queries done are: ``` ?format=json&action=parse&page=[*title*]&prop=tex... [10:58:42] #askdba: tendril crons on terbium will continue to work as is after the switchover, no changes required, right? [10:59:20] I don't think so, no? [11:02:59] tendril is not failed over [11:03:10] if terbium is failed over [11:03:16] it should just work [11:07:07] yep, just checking [11:07:13] thanks for the clarification [11:15:41] 10DBA, 06Labs: Prepare and check storage layer for khw.wikipedia - https://phabricator.wikimedia.org/T160870#3153421 (10Rachitrali) 05stalled>03Open Dear brothers and Sisters, I am affiliated with Khowar Wikipedia incubator project as test admin since 2008 and regularly contributing with wikimedia foundati...
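The Category::refreshCounts statement pasted above ends in LOCK IN SHARE MODE, which is why it has to run on the master and why it can queue up behind writers. A minimal, hypothetical two-session sketch of that kind of contention; the UPDATE below only exists to hold conflicting row locks and is not what MediaWiki actually runs:
```
-- session 1: an open transaction holding exclusive (X) row locks on categorylinks
BEGIN;
UPDATE categorylinks SET cl_timestamp = NOW() WHERE cl_to = 'All_stub_articles';
-- ... transaction left open ...

-- session 2: a refreshCounts-style locking read requests shared (S) locks on the
-- same rows and blocks until session 1 commits or rolls back
SELECT COUNT(*) FROM categorylinks
WHERE cl_to = 'All_stub_articles'
LOCK IN SHARE MODE;
```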
[11:36:22] 10DBA, 13Patch-For-Review: Unify revision table on s7 - https://phabricator.wikimedia.org/T160390#3153436 (10Marostegui) db2068 is done: ``` root@neodymium:/home/marostegui# for i in `cat s7_T160390`; do echo $i; mysql --skip-ssl -hdb2068.codfw.wmnet $i -e "show create table revision\G" | egrep "KEY";done arwi... [11:43:34] 10DBA, 06Labs: Prepare and check storage layer for khw.wikipedia - https://phabricator.wikimedia.org/T160870#3153454 (10Urbanecm) 05Open>03declined Hence the parent task is closed, this doesn't make sense anymore. [11:51:28] 10DBA, 05codfw-rollout: Analyze if we want to replace some masters in eqiad while it is not active - https://phabricator.wikimedia.org/T162133#3153482 (10Marostegui) [12:13:31] 10DBA, 06Operations, 10ops-eqiad: Decommission db1057 - https://phabricator.wikimedia.org/T162135#3153532 (10Marostegui) [12:13:41] 10DBA, 06Operations, 10ops-eqiad: Decommission db1057 - https://phabricator.wikimedia.org/T162135#3153550 (10Marostegui) p:05Triage>03Normal [12:27:47] 10DBA, 10Wikidata, 13Patch-For-Review, 03Wikidata-Sprint: Wikibase\Repo\Store\Sql\SqlEntitiesWithoutTermFinder::getEntitiesWithoutTerm can take 19 hours to execute and it is run by the web requests user - https://phabricator.wikimedia.org/T160887#3153575 (10aude) think there already is a limit of 5000 [12:28:42] 10DBA, 10Wikidata, 13Patch-For-Review, 03Wikidata-Sprint: Wikibase\Repo\Store\Sql\SqlEntitiesWithoutTermFinder::getEntitiesWithoutTerm can take 19 hours to execute and it is run by the web requests user - https://phabricator.wikimedia.org/T160887#3153576 (10aude) think we can mark this as resolved? If we... [12:36:59] 10DBA, 10Wikidata, 13Patch-For-Review, 03Wikidata-Sprint: Wikibase\Repo\Store\Sql\SqlEntitiesWithoutTermFinder::getEntitiesWithoutTerm can take 19 hours to execute and it is run by the web requests user - https://phabricator.wikimedia.org/T160887#3153598 (10daniel) @hoo The all-option has been removed, as... [12:49:30] 10DBA, 10Wikidata, 03Wikidata-Sprint: Wikibase\Repo\Store\Sql\SqlEntitiesWithoutTermFinder::getEntitiesWithoutTerm can take 19 hours to execute and it is run by the web requests user - https://phabricator.wikimedia.org/T160887#3153617 (10aude) 05Open>03Resolved [13:00:15] 10DBA, 10Wikidata, 03Wikidata-Sprint: Wikibase\Repo\Store\Sql\SqlEntitiesWithoutTermFinder::getEntitiesWithoutTerm can take 19 hours to execute and it is run by the web requests user - https://phabricator.wikimedia.org/T160887#3153637 (10jcrespo) Thank you very much for working on this- do you have an estima... [13:02:59] 10DBA, 10Wikidata, 03Wikidata-Sprint: Wikibase\Repo\Store\Sql\SqlEntitiesWithoutTermFinder::getEntitiesWithoutTerm can take 19 hours to execute and it is run by the web requests user - https://phabricator.wikimedia.org/T160887#3153640 (10aude) @jcrespo we backported/deployed this last Thursday [13:04:28] 10DBA, 10Wikidata, 03Wikidata-Sprint: Wikibase\Repo\Store\Sql\SqlEntitiesWithoutTermFinder::getEntitiesWithoutTerm can take 19 hours to execute and it is run by the web requests user - https://phabricator.wikimedia.org/T160887#3153641 (10jcrespo) Thank you again! [13:10:32] 10DBA: How many revision comments are exactly the same? Get some stats. 
- https://phabricator.wikimedia.org/T162138#3153644 (10jcrespo) [13:28:17] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153673 (10jcrespo) Seems to have stopped for now since 12:34 UTC: https://grafana.wikimedia.org/dashboard/db/api-summary?panelId... [13:48:20] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153759 (10jcrespo) 05Open>03stalled [13:51:44] can I upgrade postgres (update from the jessie point release) on labsdb1004? [13:52:29] labsdb1004? [13:53:38] role::postgres::master is that maintained by DBAs? [13:53:38] to be fair, I do not know who is using that, ask cloud team- operation-wise I have no problem- I am unsure about the impact [13:53:46] ok, will do [13:53:54] it is not osm nor tools nor the replicas [13:54:06] so I am not 100% sure of the usage [13:54:14] in fact, I have a meeting with labs in half an hour [13:54:16] I can ask [13:54:26] no, I'll do that. when I've found out, I'll report back :-) [13:54:27] about labsdb* server future [13:59:56] moritzm: those this upgrade affect also puppetdb postgres? [14:00:15] s/those/// [14:00:17] in any case, it is likely a small impact, give me 30 minutes and I will tell you [14:01:36] I've pinged labs admins on IRC, we'll see when they're around [14:02:06] volans: yes, postgres on nihal and nitrogen also needs the update [14:02:58] ok, then we might want to do it together with https://gerrit.wikimedia.org/r/#/c/346110/ I guess both will generate a spam of failing puppet around [14:03:36] sure, let's bundle it tomorrow? [14:03:45] works for me! [14:03:50] ok, nice [14:04:36] 10DBA, 06Labs, 06Operations: eqiad: (2) hardware access request for labsdb1004 & 5 refresh - https://phabricator.wikimedia.org/T161754#3153870 (10chasemp) [14:04:57] akosiaris: any caveat for the postgres upgrade on puppetdb hosts? [14:05:07] volans: postgres ? [14:05:09] (see above for context) [14:05:36] ah [14:05:47] moritzm: the usual minor stuff of alerts in the ops channel [14:06:05] we have a master/slave there, I guess upgrade the slave first [14:06:08] yes [14:07:54] ok, thanks. when labsdb1006 was reimaged, it already received the new version, so it doesn't need to be updated [14:08:16] will upgrade 1005 shortly, then [14:08:41] thanks alex [14:08:48] oh, so 4 and 5 are an independent master and slave? [14:09:20] for postgres, I mean, I knew for mysql [14:09:58] hmm. no I confused this with osm. [14:10:20] what is the slave running against labsdb1004 (using role::postgres::master)? [14:10:37] ok, let me talk with alex and chase and they will clarify probably lots of things for mw [14:10:40] *me [14:10:45] sounds good! [14:11:09] and I will report to you the plan [14:12:02] likely we will be able to restart those easily (the daemons) [15:13:51] moritzm, so those are used mostly by wikitags on labs, maybe others [15:14:28] I think we need to schedule maintenance with halfak, akosiaris, is that right ? [15:14:54] labsdb1004 ?
yeah it's aaron mostly [15:16:20] 10DBA, 06Labs, 06Operations: eqiad: (2) hardware access request for labsdb1004 & 5 refresh - https://phabricator.wikimedia.org/T161754#3154025 (10chasemp) 05Open>03stalled [15:16:32] 10DBA, 06Labs, 06Operations: eqiad: (2) hardware access request for labsdb1006 & 7 refresh - https://phabricator.wikimedia.org/T161755#3154026 (10chasemp) 05Open>03stalled [15:18:19] I am looking at https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&var-server=labsdb1007&var-network=eth0 and https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&var-server=labsdb1005&var-network=eth0 [15:18:27] and they are not *that* loaded [15:18:59] compared to some others like: https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&orgId=1&var-server=labsdb1001&var-network=eth0 [15:20:09] but we need iops and memory [15:21:12] while some of the labsdb1005 spikes are probably single projects taking too many resources [15:29:26] 10DBA, 06Operations, 10ops-codfw: codfw racking first 10 DB servers - https://phabricator.wikimedia.org/T162159#3154083 (10Marostegui) [16:00:16] 10DBA, 06Operations, 10ops-codfw: codfw rack/setup first 10 DB servers - https://phabricator.wikimedia.org/T162159#3154178 (10Papaul) [16:00:39] 10DBA, 06Operations, 10ops-codfw: codfw rack/setup first 10 DB servers - https://phabricator.wikimedia.org/T162159#3154083 (10Papaul) p:05Triage>03Normal a:03Papaul [17:47:10] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153389 (10Legoktm) > Requests do not have a user agent There's no user-agent header at all or is it some generic UA? [17:57:09] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic, 05Security: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3154452 (10MaxSem) [17:58:14] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153389 (10MaxSem) [18:31:10] 10DBA: How many revision comments are exactly the same? Get some stats. - https://phabricator.wikimedia.org/T162138#3154557 (10daniel) Ideally, not just count unique; group them and get the number of re-uses in each group, to get a distribution. [18:52:21] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3154681 (10jcrespo) User agent was "-" (without quotes). [18:57:21] 10DBA: How many revision comments are exactly the same? Get some stats. - https://phabricator.wikimedia.org/T162138#3154713 (10jcrespo) That was the plan :-). [18:57:27] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153389 (10MaxSem) We used to block API requests that provided no UA - anybody remembers why did we stop doing that? [19:02:22] 10DBA, 10Wikidata, 13Patch-For-Review, 15User-Daniel, and 2 others: Use redis-based lock manager for dispatchChanges on test sites. - https://phabricator.wikimedia.org/T159828#3154746 (10daniel) @hoo @aude Are you ok with merging/deploying the config patch? I'd like to test this as follows: * stop the dis... 
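A minimal sketch of one way to get the re-use distribution asked for in T162138 above: group identical comments, then count how many distinct comments are reused N times. On enwiki's revision table this is a very expensive full scan and would have to run on a spare or analytics host; it is an illustration, not the query that was actually run:
```
-- for each re-use count, how many distinct comments have exactly that many uses
SELECT uses, COUNT(*) AS distinct_comments
FROM (
    SELECT rev_comment, COUNT(*) AS uses
    FROM revision
    GROUP BY rev_comment
) AS per_comment
GROUP BY uses
ORDER BY uses DESC;
```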
[19:04:16] 10DBA: How many revision comments are exactly the same? Get some stats. - https://phabricator.wikimedia.org/T162138#3154749 (10daniel) Which wikis will you run this on? I guess the more bots and gadgets are used on a wiki, the more re-usable messages we'll see. [19:29:14] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153389 (10Tgr) >>! In T162129#3154681, @jcrespo wrote: > User agent was "-" (without quotes). More likely, nothing at all. The... [19:33:53] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3154911 (10Tgr) Did the IPs change periodically or did they actually use 50 boxes to query the API in parallel? The second case s... [19:54:06] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3154951 (10Tgr) Seems to have restarted (at least based on raw GET volume, haven't looked at what type it is). See P5199#27747 f... [20:16:11] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3153389 (10Anomie) >>! In T162129#3154715, @MaxSem wrote: > We used to block API requests that provided no UA - anybody remembers... [20:17:49] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3155015 (10jcrespo) He is back, and now trying to parse Special pages, too :-) > Did the IPs change periodically or did they act... [20:25:29] 10DBA: How many revision comments are exactly the same? Get some stats. - https://phabricator.wikimedia.org/T162138#3155043 (10jcrespo) I am running something on enwiki- we can test others depending on the first results. For example, maybe commons and wikidata have more bot-like edits? [20:39:43] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3155110 (10Anomie) The simple solution may be to just block the IPs in varnish or the like, perhaps delivering a message like "If... [20:47:29] 10DBA: How many revision comments are exactly the same? Get some stats. - https://phabricator.wikimedia.org/T162138#3155132 (10jcrespo) 17 minutes for a full tables scan, less than I expected: ``` mysql> SELECT rev_comment FROM revision PROCEDURE ANALYSE(1); +-----------------------------+-----------------------... [20:56:41] 10DBA, 10MediaWiki-API, 06Operations, 10Traffic: Someone is parsing all enwiki pages using the action api at a rate of ~2M pages/hour - https://phabricator.wikimedia.org/T162129#3155143 (10Tgr) > I don't think it is malign, just parallelizing queries to load balancing source IPs (always the same ones). Ye... [21:02:24] 10DBA: How many revision comments are exactly the same? Get some stats. - https://phabricator.wikimedia.org/T162138#3155159 (10Niharika) For better clarity: | Field_name | Min_value | Max_value | Min_length... [21:06:00] 10DBA: How many revision comments are exactly the same? Get some stats. 
- https://phabricator.wikimedia.org/T162138#3155167 (10jcrespo) I am getting better and more stats soon, hold your breath! //Note: the above Min_value may have had some space-like characters at the start.// [21:36:04] 10DBA: How many revision comments are exactly the same? Get some stats. - https://phabricator.wikimedia.org/T162138#3155337 (10jcrespo) BTW, the avg_value_or_avg_length = 43.3397 means there are approximately 43.3397*745508534 = 30GB only on comment text (probably more due to blob storage inefficiencies), which i... [21:43:08] 10DBA, 10Monitoring, 06Operations: tendril cert expiry alerts on dbmonitor hosts - https://phabricator.wikimedia.org/T162183#3155357 (10jcrespo) [23:05:43] 10DBA: Remove partitioning from db2019 (codfw master) commonswiki.templatelinks - https://phabricator.wikimedia.org/T161683#3139513 (10jcrespo) Self reminder to increase the downtime if by tomorrow it has not yet finished/caught up. [23:27:58] 10DBA, 06Operations: dbstore1002 in bad shape - https://phabricator.wikimedia.org/T162212#3155755 (10jcrespo) [23:45:25] 10DBA, 06Operations: dbstore1002 in bad shape - https://phabricator.wikimedia.org/T162212#3155813 (10jcrespo) Probably excessive memory pressure due to heavy mysql usage... blah blah blah... restarted cleanly ... updated kernel... check new import script... check long running queries,... mysql error log is cle... [23:50:21] 10DBA, 10Analytics, 06Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3155832 (10jcrespo) > @leila, we can dump and copy to analytics-store, as long as there aren't any database.table name collisions. I hope you are aware that if for any reason... [23:56:21] 10DBA, 06Operations: dbstore1002 in bad shape - https://phabricator.wikimedia.org/T162212#3155839 (10jcrespo) There is also more load than usual since the 29, that could have contributed to it: https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=dbstore1002&from=...
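As a rough check on the 30GB estimate in T162138 above: an average of 43.3397 bytes of comment per revision times 745,508,534 revisions is about 3.2 × 10^10 bytes, i.e. roughly 30 GiB of raw comment text before any blob or storage overhead.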