[00:33:32] DBA, Icinga, Monitoring, Operations, Patch-For-Review: tendril cert expiry alerts on dbmonitor hosts - https://phabricator.wikimedia.org/T162183#3182160 (Dzahn) fixed. false positives are gone, the real check stays and is OK https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_stri...
[00:33:38] DBA, Icinga, Monitoring, Operations, Patch-For-Review: tendril cert expiry alerts on dbmonitor hosts - https://phabricator.wikimedia.org/T162183#3182163 (Dzahn) Open>Resolved
[00:34:08] DBA, Icinga, Monitoring, Operations: tendril cert expiry alerts on dbmonitor hosts - https://phabricator.wikimedia.org/T162183#3154666 (Dzahn)
[01:32:38] DBA, Operations: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976#3182225 (Dzahn)
[01:39:46] DBA, Operations: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976#3182213 (Paladox) it works for me.
[02:12:23] DBA, Icinga, Monitoring, Operations: tendril cert expiry alerts on dbmonitor hosts - https://phabricator.wikimedia.org/T162183#3182269 (Dzahn)
[02:12:25] DBA, Operations, Traffic, Patch-For-Review: convert tendril to use Letsencrypt for SSL cert (deadline 2017-03-17) - https://phabricator.wikimedia.org/T154938#3182270 (Dzahn)
[02:14:12] DBA, Icinga, Monitoring, Operations: tendril cert expiry alerts on dbmonitor hosts - https://phabricator.wikimedia.org/T162183#3154666 (Dzahn)
[04:49:18] DBA, Operations: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976#3182380 (Peachey88)
[14:47:38] DBA, Operations: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976#3182213 (Marostegui) Works for me
[14:53:34] DBA, Community-Tech, Labs, Tool-Labs: Fix Plagiabot DB corruption - https://phabricator.wikimedia.org/T162932#3179794 (Marostegui) In addition to what Jaime said looks like the server has been crashing lately: https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fop...
[15:03:25] DBA, Community-Tech, Labs, Tool-Labs: Fix Plagiabot DB corruption - https://phabricator.wikimedia.org/T162932#3179794 (eranroz) I may have fixed it: ``` repair table copyright_diffs; +-------------------------------------+--------+----------+-------------------------------------------------------...
[15:12:00] DBA, Community-Tech, Labs, Tool-Labs: Fix Plagiabot DB corruption - https://phabricator.wikimedia.org/T162932#3183067 (eranroz) >>! In T162932#3181900, @jcrespo wrote: > Cannot reproduce: > > > ``` > root@labsdb1001[s51306__copyright_p]> SELECT * FROM copyright_diffs ORDER BY diff_timestamp DES...
[15:32:56] DBA, Community-Tech, Labs, Tool-Labs: Fix Plagiabot DB corruption - https://phabricator.wikimedia.org/T162932#3183091 (Marostegui) Yes, you can use `alter table s51306__copyright_p.copyright_diffs ENGINE=InnoDB;` to migrate it to InnoDB if you like.
[17:56:04] DBA, Revision-Scoring-As-A-Service, rsaas-articlequality: [Discuss] Hosting the monthly article quality dataset on labsDB - https://phabricator.wikimedia.org/T146718#3183524 (Halfak)
[18:28:48] DBA, Revision-Scoring-As-A-Service, rsaas-articlequality: [Discuss] Hosting the monthly article quality dataset on labsDB - https://phabricator.wikimedia.org/T146718#3183623 (Halfak) I just came to check on this task because it's been sitting for almost a month and I've been seeing overwhelming dema...
[19:00:38] DBA, Labs: S1 replag at 3 hours - https://phabricator.wikimedia.org/T163023#3183654 (Matthewrbowker)
[19:01:42] DBA, Operations: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976#3183666 (jcrespo) Open>stalled Most likely a one-time error that got cached for some time? Tendril db tends to fail quite regularly due to large queries asking for large reports (but that is mostly ok). W...
[19:02:24] DBA, Labs: S1 replag at 3 hours - https://phabricator.wikimedia.org/T163023#3183654 (jcrespo) It is getting better now.
[19:03:11] DBA, Labs: S1 replag at 3 hours - https://phabricator.wikimedia.org/T163023#3183670 (Matthewrbowker) Shard 1 is, it has improved by about 7 hours recently. Shard 3 has doubled though, it was at 21 minutes a half hour ago.
[19:10:30] DBA, Labs: S1 replag at 3 hours - https://phabricator.wikimedia.org/T163023#3183678 (jcrespo) Yep, 2 problems in one. s1 is labsdb1001 crashing regularly in the last 2 days. s3 was the filtering server getting stuck on s3 only due to a table corruption on a TokuDB index. Both solved now.
[19:11:02] DBA, Labs: S1 replag at 3 hours - https://phabricator.wikimedia.org/T163023#3183679 (jcrespo) I will merge this into a general ticket for the labsdb1001 issues, but lag should be going down now everywhere.
[19:12:30] DBA, Labs: S1 replag at 3 hours - https://phabricator.wikimedia.org/T163023#3183685 (jcrespo)
[19:22:28] DBA, Labs: S1 replag at 3 hours - https://phabricator.wikimedia.org/T163023#3183690 (Matthewrbowker) @jcrespo Thank you so much for the quick look.
[20:33:58] DBA, Operations: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976#3183833 (bd808) ``` Accept-Ranges: bytes Age: 10 Content-Encoding: gzip Content-Length: 76 Content-Type: text/html; charset=UTF-8 Date: Fri, 14 Apr 2017 20:31:37 GMT Server: Apache Strict-Transport-Security: max-...
[21:25:42] DBA, Wikidata: Repeated reports of wikidatawiki (s5) API going read only - https://phabricator.wikimedia.org/T123867#3183910 (Multichill) In the last 10 minutes: pywikibot.data.api.APIError: readonly: The database has been automatically locked while the slave database servers catch up to the master [rea...
[21:26:25] DBA, MediaWiki-Platform-Team, MediaWiki-extensions-Linter, Wikimedia-Extension-setup, and 2 others: Review and deploy Linter extension to Wikimedia wikis - https://phabricator.wikimedia.org/T148609#3183913 (Volans)
[21:38:06] DBA, MediaWiki-Platform-Team, MediaWiki-extensions-Linter, Wikimedia-Extension-setup, and 2 others: Review and deploy Linter extension to Wikimedia wikis - https://phabricator.wikimedia.org/T148609#2727699 (Reedy) I do notice there's no caching infront of the DB calls to getTotals Possibly worth...
[22:27:46] DBA, MediaWiki-Platform-Team, MediaWiki-extensions-Linter, Wikimedia-Extension-setup, and 2 others: Review and deploy Linter extension to Wikimedia wikis - https://phabricator.wikimedia.org/T148609#3184035 (Reedy) >>! In T148609#3183893, @Stashbot wrote: > {nav icon=file, name=Mentioned in SAL (#...
[22:31:52] DBA, MediaWiki-Platform-Team, MediaWiki-extensions-Linter, Wikimedia-Extension-setup, and 2 others: Review and deploy Linter extension to Wikimedia wikis - https://phabricator.wikimedia.org/T148609#3184036 (Reedy) Guess when Linter was deployed? :) {F7550011 size=full}
[23:32:12] DBA, MediaWiki-extensions-WikibaseClient, Wikidata, User-Daniel: Usage tracking: record which statement group is used - https://phabricator.wikimedia.org/T151717#3184079 (jcrespo) > We want to collect additional information on one of these wikis for a while If that doesn't involve a schema chang...
[23:45:30] DBA, Operations, Traffic: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976#3184091 (jcrespo) stalled>Open I assume that is a hit of an error message? Traffic: What is tendril.wikimedia.org's caching policy so that this can happen? I would expect a smaller TTL than...