[01:02:22] RECOVERY - Host tools-secgroup-test-103 is UP: PING OK - Packet loss = 0%, RTA = 1.31 ms
[01:06:09] RECOVERY - Host secgroup-lag-102 is UP: PING OK - Packet loss = 0%, RTA = 0.48 ms
[01:18:18] PROBLEM - Host tools-secgroup-test-103 is DOWN: CRITICAL - Host Unreachable (10.68.21.22)
[01:18:51] Labs, Tool-Labs, Discovery, Maps, and 3 others: PostgreSQL query planner bug on labsdb1006 - https://phabricator.wikimedia.org/T145599#2635546 (MaxSem)
[01:22:49] Labs, Tool-Labs, Discovery, Maps, and 3 others: PostgreSQL query planner bug on labsdb1006 - https://phabricator.wikimedia.org/T145599#2635546 (Yurik) I think we should simply update the postgres/postgis on labs instance
[01:23:32] PROBLEM - Host secgroup-lag-102 is DOWN: CRITICAL - Host Unreachable (10.68.17.218)
[01:38:39] RECOVERY - Host tools-secgroup-test-102 is UP: PING OK - Packet loss = 0%, RTA = 0.64 ms
[01:43:36] PROBLEM - Host tools-secgroup-test-102 is DOWN: CRITICAL - Host Unreachable (10.68.21.170)
[06:48:33] PROBLEM - Puppet run on tools-webgrid-lighttpd-1412 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[07:15:39] Labs: cronspam from labscontrol1001, labstore1001, labnet1002.eqiad.wmnet, labsdb1003.eqiad.wmnet - https://phabricator.wikimedia.org/T132422#2635836 (elukey) Seeing again dns-floating-ip-updater.py exceptions: ``` Cron Daemon 10:22 PM (10 hours ago) to root No han...
[07:17:54] Labs, User-Nikerabbit: Request creation of wmwcourse labs project - https://phabricator.wikimedia.org/T144388#2635838 (Nikerabbit)
[07:20:21] Labs: cronspam from labscontrol1001, labstore1001, labnet1002.eqiad.wmnet, labsdb1003.eqiad.wmnet - https://phabricator.wikimedia.org/T132422#2635841 (elukey) >>! In T132422#2622675, @MoritzMuehlenhoff wrote: > Maybe these are coming from unpuppetised base services installed by Debian/Ubuntu? Will investiga...
[07:23:34] RECOVERY - Puppet run on tools-webgrid-lighttpd-1412 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:16:16] Labs, Labs-Infrastructure, Continuous-Integration-Scaling, Patch-For-Review: Bump quota of Nodepool instances (contintcloud tenant) - https://phabricator.wikimedia.org/T133911#2635935 (hashar)
[08:16:18] Labs, Tracking: Existing Labs project quota increase requests (Tracking) - https://phabricator.wikimedia.org/T140904#2635934 (hashar)
[08:18:12] Labs, Labs-Infrastructure, Continuous-Integration-Scaling, Patch-For-Review: Bump quota of Nodepool instances (contintcloud tenant) - https://phabricator.wikimedia.org/T133911#2248624 (hashar)
[08:18:54] Labs-project-other, Quarry, Discovery, Wikidata, and 2 others: Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry) - https://phabricator.wikimedia.org/T104762#2635939 (Multichill) With the current SPARQL setup it's easy to share queries either by full url or by short url. I think...
[08:19:38] Labs: Please raise quota for deployment-prep - https://phabricator.wikimedia.org/T145611#2635940 (MoritzMuehlenhoff)
[08:31:53] Labs: Please raise quota for deployment-prep - https://phabricator.wikimedia.org/T145611#2635987 (hashar)
[08:31:55] Labs, Tool-Labs: jsub should respect .sge_request - https://phabricator.wikimedia.org/T145269#2635992 (whym) .jsubrc was what I was looking for - I really should have [[https://wikitech.wikimedia.org/w/index.php?title=Help:Tool_Labs/Grid&diff=834685&oldid=819264|read the manual]].
[10:30:13] (PS1) Giuseppe Lavagetto: Added fake secrets for puppetmasters [labs/private] - https://gerrit.wikimedia.org/r/310516
[10:30:49] (CR) Giuseppe Lavagetto: [C: 2 V: 2] Added fake secrets for puppetmasters [labs/private] - https://gerrit.wikimedia.org/r/310516 (owner: Giuseppe Lavagetto)
[10:49:25] Labs-project-other, Quarry, Discovery, Wikidata, and 2 others: Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry) - https://phabricator.wikimedia.org/T104762#2636433 (Base) Do I get it right that now a query cannot be longer than URL length limit? How much exactly is that number...
[10:55:57] Labs-project-other, Quarry, Discovery, Wikidata, and 2 others: Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry) - https://phabricator.wikimedia.org/T104762#1426314 (jcrespo) @Base, your questions are very interesting, and you seem to have really nice suggestions, but I would s...
[12:08:20] (CR) Aklapper: [C: -1] "Hi Xavier, thank you and congratulations to your contribution in Gerrit! We would love to see more contributions from you! :)" [labs/tools/WikiConvFR-training-2016] - https://gerrit.wikimedia.org/r/305865 (owner: Xavier Combelle)
[12:08:46] (CR) Aklapper: [C: -1] "Hi GAllegre, thank you and congratulations to your contribution in Gerrit! We would love to see more contributions from you! :)" [labs/tools/WikiConvFR-training-2016] - https://gerrit.wikimedia.org/r/305864 (owner: GAllegre)
[12:22:18] my Wikipedia account Block ?HELP
[12:22:48] How to unblock my account
[12:26:19] Labs: Request increased quota for labs project - https://phabricator.wikimedia.org/T145636#2636577 (fgiunchedi)
[12:30:41] Labs: Request increased quota for deployment-prep labs project - https://phabricator.wikimedia.org/T145636#2636598 (fgiunchedi)
[12:32:03] Labs: Request increased quota for deployment-prep labs project - https://phabricator.wikimedia.org/T145636#2636577 (fgiunchedi) see also {T145636} and {T53497} for context
[12:47:39] please help delete a tool account http://tools.wmflabs.org/antigng/ , it is created by mistake and may cause some misleading problem
[12:59:05] hi all. yuvipanda, madhuvishy, can one of you respond to https://lists.wikimedia.org/pipermail/wikilovesmonuments/2016-September/008317.html ? (I /think/ the lag is since resolved, but it would be good to have a response from someone from the Labs team about what could be the cause of it, for future references.)
[13:58:14] PROBLEM - Puppet run on tools-exec-1410 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:36:39] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Bibek bro was created, changed by Bibek bro link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Bibek_bro edit summary: Created page with "{{Tools Access Request |Justification=Heool everyone I want to use Tools project for running a bot in [http//ne.wikipedia.org Nepali Wikipedia]. Please help/teach me create a..."
[14:37:51] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Bibek bro was modified, changed by Bibek bro link https://wikitech.wikimedia.org/w/index.php?diff=835496 edit summary:
[15:02:14] Labs: cronspam from labscontrol1001, labstore1001, labnet1002.eqiad.wmnet, labsdb1003.eqiad.wmnet - https://phabricator.wikimedia.org/T132422#2636946 (AlexMonk-WMF) >>! In T132422#2635836, @elukey wrote: > Seeing again dns-floating-ip-updater.py exceptions: > ``` > keystoneclient.exceptions.ConnectionRefused...
[15:21:47] !log deployment-prep cherry-pick https://gerrit.wikimedia.org/r/#/c/310557/ on puppet master
[15:21:47] Please !log in #wikimedia-releng for beta cluster SAL
[15:21:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL, Master
[15:22:01] thanks stashbot
[17:38:58] Labs, Beta-Cluster-Infrastructure: Please raise quota for deployment-prep - https://phabricator.wikimedia.org/T145611#2637678 (AlexMonk-WMF)
[17:39:06] Labs, Beta-Cluster-Infrastructure: Request increased quota for deployment-prep labs project - https://phabricator.wikimedia.org/T145636#2637679 (AlexMonk-WMF)
[17:40:23] Labs, Beta-Cluster-Infrastructure: Please raise quota for deployment-prep - https://phabricator.wikimedia.org/T145611#2635940 (AlexMonk-WMF) From a deployment-prep admin PoV, I'd prefer the quota bump include the full VCPU count and RAM of the instances you'd like to create, rather than leaving us with 0...
[17:56:01] hey leila
[17:56:17] yes, I looked at http://tools.wmflabs.org/replag/ and the lag seems to have resolved
[17:56:21] I'm no longer subscribed to that list tho :( can you convey the message?
[17:59:30] yuvipanda: I'll do it
[18:00:10] aah i am not on the list either
[18:00:25] thought it was labs-l
[18:00:33] yuvipanda: If I wanted to take https://phabricator.wikimedia.org/T53434 on as a volunteer (because I desperately need the functionality) is that OK? Or if it's declined it's done done done end of story...
[18:03:30] Matthew_: I'm pretty sure if you want to use *icinga* it's declined
[18:03:53] I do agree some monitoring is required, and if you want to take a look at prometheus blackbox_exporter + alertmanager I'd welcome that
[18:04:21] May I ask why...??
[18:05:22] we're generally trying to move away from icinga in all places :)
[18:05:41] Ah. Icinga is what I'm familiar with that's why I ask.
[18:05:53] yeah, I understand.
[18:06:19] But anyway, I agree that we need some sort of monitoring. I've resorted to a third-party tool to check on xtools for example. If you can tell me what you want in a ticket, I'll look into it. I'm that desperate :)
[18:07:13] I'll try, but atm I'm too swamped to even think about it :( can you start a ticket with what *you* want, and then I can add / append to it?
[18:08:04] yuvipanda: How about this... I'll reopen T53434 with a comment and assign it to myself. I'll make a note of what you wanted to do and then we can go from there. Deal?
[18:08:04] T53434: Setup an icinga instance to monitor tools on tool-labs - https://phabricator.wikimedia.org/T53434
[18:08:50] sure, as long as you take out icinga from the title, and redo it to be a 'this is the problem I want to solve' ticket rather than a 'this is the solution that we should implement' ticket :)
[18:09:09] Can do.
[18:09:10] I'd still prefer a new ticket, since otherwise the first bunch of comments on this ticket would just be confusing
[18:09:22] but don't particularly care :)
[18:10:16] Labs, Tool-Labs: Implement a system to montior tools on tool-labs - https://phabricator.wikimedia.org/T53434#2637753 (Matthewrbowker) declined>Open p:Triage>Normal a:yuvipanda>Matthewrbowker
[18:10:34] Matthew_, don't use anything like Icinga, please. It's terrible :/
[18:11:36] Really? I love icinga actually. Nagios is the problem one.
[18:11:59] I've had problems with both.
[18:12:07] Labs, Tool-Labs: Implement a system to montior tools on tool-labs - https://phabricator.wikimedia.org/T53434#572676 (Matthewrbowker) I am re-opening this ticket and taking it on in my capacity as a volunteer. Icinga is not the given solution for this, so I've also generalized the title. @yuvipanda wa...
[18:12:22] I set up an instance from scratch of icinga2 and I haven't had problems yet.
[18:15:23] I hear Zabbix is good. If you wanted any suggestions.
[18:16:28] Or you could tie it in with the future logging system that is T127367
[18:16:29] T127367: Overhaul logging setup for Tools (Tracking) - https://phabricator.wikimedia.org/T127367
[18:16:30] I'm trying to standardize on prometheus as the thing to use for tools
[18:16:46] (and wikimedia production is slowly moving towards that as well)
[18:16:53] you can see the current tools setup on tools-prometheus.wmflabs.org/tools
[18:17:50] I never got the hang of that.
[18:17:58] It asks me to enter an expression.
[18:19:17] (that's where a lot of the graphs in grafana-labs.wikimedia.org come from)
[18:19:30] indeed, the prometheus interface itself is rarely used directly.
[18:19:37] just for exploration / quick expression checks
[18:19:55] usually you use grafana for visualization and dashboards, and alertmanager for alerts
[18:20:01] (we don't have alertmanager setup)
[18:23:53] Ah.
[19:30:26] Anyone in here any experience with Grafana? Could it be used in combination with an SQL query that returns a number and graph that? The graph would be of the number of non-disambiguation pages linking to disambiguation pages on a wiki
[19:30:44] Or do we already have this graph somewhere? :-)
[19:30:58] multichill: addshore has been working on something that would allow doing this
[19:31:07] addshore: we should schedule some time to get your patch merged
[19:31:48] yuvipanda: I remembered seeing graphs of this at some point, but I'm pretty sure that was back in the Toolserver days with rrdtool
[19:47:39] Tool-Labs-tools-Xtools: Administats is showing normal users too - https://phabricator.wikimedia.org/T145677#2638102 (Luke081515)
[19:48:00] Tool-Labs-tools-Xtools: Administats is showing normal users too - https://phabricator.wikimedia.org/T145677#2638115 (Luke081515)
[19:49:20] i wish horizon would remember my login
[19:49:26] digging out the phone for 2fa is a pain in the ass
[19:55:43] Tool-Labs-tools-Xtools: Administats is showing normal users too - https://phabricator.wikimedia.org/T145677#2638128 (Matthewrbowker) p:Triage>Normal
[20:05:17] addshore: Going offline, maybe you can answer later or leave a message somewhere?
[20:07:33] Tool-Labs-tools-Xtools: Adminstats is showing non-admin users too - https://phabricator.wikimedia.org/T145677#2638151 (FriedhelmW)
[20:08:27] brion file a bug for that, that would be a very good feature
[20:22:52] Labs, Labs-Infrastructure: Puppetize mysql on labtest - https://phabricator.wikimedia.org/T145679#2638179 (yuvipanda)
[20:24:11] Labs, Labs-Infrastructure: Puppetize mysql on labtest - https://phabricator.wikimedia.org/T145679#2638195 (yuvipanda) A lotta things stop working when this limit is hit, including puppet runs on clients.
[20:30:54] Tool-Labs-tools-Xtools: Wikiviewstats does not support Wikidata - https://phabricator.wikimedia.org/T63833#2638261 (Matthewrbowker)
[20:30:56] Tool-Labs-tools-Xtools: resurrect wikiviewstats tool - https://phabricator.wikimedia.org/T91320#2638259 (Matthewrbowker) Open>declined Closing this task... We have decided not to revive WikiViewStats. Instead, if you need this functionality look at http://tools.wmflabs.org/pageviews
[20:30:58] yuvipanda: we should!
[20:31:00] Tool-Labs-tools-Xtools: Wikiviewstats does not support Wikidata - https://phabricator.wikimedia.org/T63833#677111 (Matthewrbowker) Open>declined Closing this task... We have decided not to revive WikiViewStats. Instead, if you need this functionality look at http://tools.wmflabs.org/pageviews
[20:31:19] multichill is gone :(
[20:31:29] addshore: wanna do it now? :P
[20:31:31] Labs, Tool-Labs, Discovery, Maps, and 3 others: PostgreSQL query planner bug on labsdb1006 - https://phabricator.wikimedia.org/T145599#2638264 (Yurik) p:Triage>Low
[20:31:51] im in a hostel and can ssh anywhere and very tired, so probably not now :/
[20:32:01] but, well, whats the worst that could happen?
[20:32:51] addshore: :D when do you wanna schedule it?
[20:32:54] next week? this week?
[20:33:25] brion: I think the horizon session timeout is 12h. We could probably raise that if someone opened a bug about it and explained why this is a horrible user experience (hint, hint)
[20:33:51] yuvipanda: next week! I'm in belgium until sunday!
[20:33:56] ah nice
[20:34:02] back in the uk on sunday then!
[20:38:30] addshore im in the uk :)
[20:40:39] ok :D
[20:40:50] * brion ... i like sessions that last until 2038 personally
[20:41:42] LOL
[20:42:03] brion But then why not set it unlimited if you want it to 2038?
[20:42:20] unix time ends in 2038
[20:42:23] there is no future
[20:42:27] lol
[20:42:28] the 64-bit machines take over then
[20:42:43] yeh, luckily all my products are 64bit
[20:42:53] iphone, ipad. So my time will be running past then
[20:43:12] I'm going to retire that year too because I don't want to deal with y38k crap
[20:44:13] bd808 what is y38k?
[20:44:37] https://en.wikipedia.org/wiki/Year_2038_problem
[20:45:41] a bit after my 65th bday unix melts
[20:46:12] LOL
[20:46:27] I will be young still by there
[20:46:44] Probably in my 30-40 lol
[20:47:10] paladox: You have 7797 Days until then. Or so.
[20:47:19] LOL
[20:47:31] I put a countdown in my phabricator tracking it :)
[20:49:01] LOL
[22:07:33] Labs, Horizon: Horizon loses credentials every day - https://phabricator.wikimedia.org/T145703#2638694 (brion)
[22:24:43] Tool-Labs-tools-Xtools: Adminstats is showing non-admin users too - https://phabricator.wikimedia.org/T145677#2638769 (doctaxon) But it's right: the user deleted the redirect page anyway, if sysop or not, it has to be logged in the deletion log. Where is the bug? I suppose, it works fine.
[22:56:26] Tool-Labs-tools-Xtools: Adminstats is showing non-admin users too - https://phabricator.wikimedia.org/T145677#2638828 (Matthewrbowker) >>! In T145677#2638769, @doctaxon wrote: > Where is the bug? I suppose, it works fine. My understanding of the specific request was not to remove it, just have the page disp...
[23:27:37] Labs: known_host key updating on virt* (and possibly elsewhere) - https://phabricator.wikimedia.org/T93748#1144980 (AlexMonk-WMF) Maybe this was fixed by https://gerrit.wikimedia.org/r/#/c/210926/ ?
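Editor's note on the "7797 Days" figure quoted at 20:47:10: the arithmetic behind the Year 2038 problem can be checked with a short sketch. It assumes a signed 32-bit `time_t` counting seconds from the Unix epoch, and it assumes the conversation took place on 14 September 2016; the log does not state its own date, so that value is only a guess based on the mailing-list link from earlier in the day. The function names are illustrative and do not come from the channel.

```python
from datetime import date, datetime, timedelta, timezone

# Largest number of seconds a signed 32-bit time_t can hold.
MAX_INT32 = 2**31 - 1


def rollover_moment() -> datetime:
    """Return the last UTC second representable in a signed 32-bit time_t."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=MAX_INT32)


def days_until_rollover(today: date) -> int:
    """Count whole calendar days from `today` to the rollover date."""
    return (rollover_moment().date() - today).days


if __name__ == "__main__":
    # Assumed log date (not stated in the log itself).
    assumed_log_date = date(2016, 9, 14)
    print(rollover_moment().isoformat())          # 2038-01-19T03:14:07+00:00
    print(days_until_rollover(assumed_log_date))  # 7797
```

Under that assumed date the countdown matches the number given in the channel; a different starting date shifts the result by the same number of days.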