[00:20:51] Labs, Labs-Sprint-109, Patch-For-Review: Make a fact for project_id on labs instances - https://phabricator.wikimedia.org/T93684#1526270 (Andrew) scfc: yes, see blocking bug list
[01:18:42] The pagelinks table should contain all visible links, right? For example https://en.wikipedia.org/wiki/Arif_Asadov has category links but these don't show up in the pagelinks table. I seem to be missing something basic.
[01:21:57] ashwinpp: categories are in the categorylinks table :)
[01:22:18] ashwinpp: https://www.mediawiki.org/wiki/Manual:Database_layout has some pointers
[01:23:26] Ah, but the pagelinks table also has links to category pages, right? With namespace 14?
[01:37:33] ashwinpp: those are inline links like [[:Category:Foo]]. The categories shown at the bottom of the page are in categorylinks.
[01:51:10] I see, thanks legoktm
[02:06:27] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Lt910001 was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=173660 edit summary:
[05:41:14] Labs, Tool-Labs: Install composer on tools-login - https://phabricator.wikimedia.org/T104789#1526647 (Krinkle)
[06:26:46] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Sulhan was created, changed by Sulhan link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Sulhan edit summary: Created page with "{{Tools Access Request |Justification=Research for my master thesis: detecting vandalism in Wikipedia Indonesia. My thesis focus on specific language (Bahasa Indonesia, id.wik..."
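The pagelinks / categorylinks answer at 01:21-01:37 can be sketched in a few lines of Python. This is only an illustration against an in-memory SQLite database, not the real replicas: the two tables are cut down to the columns named in Manual:Database_layout for that era (pl_from, pl_namespace, pl_title, cl_from, cl_to), and the page id and titles are made up.

```python
import sqlite3

# Toy stand-ins for the two MediaWiki tables; only the relevant columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pagelinks (pl_from INTEGER, pl_namespace INTEGER, pl_title TEXT);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
""")

PAGE_ID = 1  # hypothetical page id, not a real article

conn.executemany("INSERT INTO pagelinks VALUES (?, ?, ?)", [
    (PAGE_ID, 0, "Some_article"),  # ordinary [[Some article]] link
    (PAGE_ID, 14, "Foo"),          # inline [[:Category:Foo]] link
])
conn.executemany("INSERT INTO categorylinks VALUES (?, ?)", [
    (PAGE_ID, "Living_people"),    # category shown at the bottom of the page
    (PAGE_ID, "Bar"),
])


def inline_category_links(conn, page_id):
    """Links *to* category pages written inline: pagelinks rows in namespace 14."""
    rows = conn.execute(
        "SELECT pl_title FROM pagelinks "
        "WHERE pl_from = ? AND pl_namespace = 14 ORDER BY pl_title",
        (page_id,),
    )
    return [title for (title,) in rows]


def category_memberships(conn, page_id):
    """Categories the page belongs to: these live in categorylinks only."""
    rows = conn.execute(
        "SELECT cl_to FROM categorylinks WHERE cl_from = ? ORDER BY cl_to",
        (page_id,),
    )
    return [title for (title,) in rows]


print(inline_category_links(conn, PAGE_ID))
print(category_memberships(conn, PAGE_ID))
```

The point of keeping the two functions separate is that a page's category memberships never show up in the first query, which is what ashwinpp ran into.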
[09:09:25] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Sulhan was modified, changed by Merlijn van Deen link https://wikitech.wikimedia.org/w/index.php?diff=173670 edit summary:
[09:47:10] Labs, Tool-Labs: Tool Labs virt reboot checklist - https://phabricator.wikimedia.org/T108669#1526831 (valhallasw) NEW
[09:47:18] Labs, Tool-Labs: Tool Labs virt reboot checklist - https://phabricator.wikimedia.org/T108669#1526839 (valhallasw)
[09:56:22] andrewbogott: ok, I have no clue how the cyberbot exec node is supposed to be handled...
[09:56:22] Coren: ^
[10:01:33] andrewbogott: https://etherpad.wikimedia.org/p/T108669 is a writeup, but I'm not sure if it's correct, and I have no clue how to handle cyberbot. Please check with Coren.
[10:06:57] Labs, Beta-Cluster, Labs-Infrastructure, operations: beta: Get SSL certificates for *.{projects}.beta.wmflabs.org - https://phabricator.wikimedia.org/T50501#1526856 (ArielGlenn) @greg so what's the decision; there's also https://phabricator.wikimedia.org/T75919 and https://phabricator.wikimedia.or...
[11:49:49] Labs: Have checkpoint check for external network access from labs - https://phabricator.wikimedia.org/T107455#1527093 (mark) Lack of external network connectivity will be immediately obvious from practically any of the other checks, won't it? :) Seems a bit redundant.
[11:50:40] Labs: Setup checkpoint check for private DNS - https://phabricator.wikimedia.org/T107453#1527094 (mark) What makes internal/public DNS different? Can't that be rolled into one check?
[12:04:42] Labs: Setup checkpoint check for private DNS - https://phabricator.wikimedia.org/T107453#1527112 (yuvipanda) No - public DNS is serving *.wmflabs.org to the external world and private is serving *.eqiad.wmflabs. They are done via different mechanisms too afaict
[12:04:56] Labs: Have checkpoint check for external network access from labs - https://phabricator.wikimedia.org/T107455#1527113 (yuvipanda) I agree :)
[12:06:08] Labs: Have checkpoint check for external network access from labs - https://phabricator.wikimedia.org/T107455#1527122 (yuvipanda) actually - it won't be if it is labs to the internet that is fucked, rather than the other way around. But if that can hardly happen....
[12:59:28] YuviPanda: If you're not on holidays :-p can you take a look at https://etherpad.wikimedia.org/p/T108669
[13:23:30] valhallasw`cloud, andrewbogott: dedicated queues/nodes were given with all the right caveats. I always try to warn the maintainer in advance so they can do anything they feel is helpful but otherwise just kill all tasks and reboot them.
[13:57:13] Coren: can you communicate that to cyberpower? see the thread on labs-l
[14:03:27] valhallasw`cloud: afaick, he's already discussing it and asked us to wait until a specific job is over? (Which seems quite okay to me as he estimates it should be around the right time anyways). Accommodating this seems a good idea.
[14:04:09] Coren: yes, but I'm not sure what the current result would be. Would all cyberpower jobs be offline for an hour?
[14:05:32] valhallasw`cloud: Ah, yes - by necessity. I don't think you need to worry about this, he's aware that his jobs can run nowhere else during a maintenance window as this is part of the explicit downside of a dedicated node. :-)
[14:09:16] given that he explicitly asked "How will this affect Cyberbot's continuous scripts?" -- I'm not sure he is
[14:09:46] Hm.
[14:09:56] Well, I'll reply to clarify just in case.
[14:10:17] Thanks :-)
[14:10:20] But I think his question was because Andrew forgot to mention that specific host.
[14:12:34] {{done}}
[14:12:50] * andrewbogott back, catching up
[14:12:59] After a node is rebooted, I can prioritize certain VMs.
[14:13:14] So, if cyberbot needs to be at the top, we can drop the downtime to 10 mins or so.
[14:13:22] As long as it’s not competing with others for that top spot
[14:13:49] My mental priority is: tools, CI, beta, everything else
[14:14:48] andrewbogott: I don't think it's a catastrophe either way. I'm sure he'd appreciate the shorter downtime though. I think that within tools: global services first, dedicated nodes, then general nodes.
[14:15:18] andrewbogott: Because (a) affects everyone, (b) have no redundancy and (c) have plenty redundancy
[14:15:31] respectively.
[14:15:43] yep, sounds right
[14:16:18] valhallasw`cloud: I’m looking at your etherpad and realizing that I actually don’t know how to do the ‘mail results to labs-l’ steps. I’ve maybe never emailed via CLI before. Can you add sample commands to the pad?
[14:16:31] andrewbogott: I just copy-pasted it from the console yesterday :D
[14:18:05] (I'm not completely sure how to mail from the console, probably | mail )
[14:19:21] ok. Also — is it useful/important to explicitly kill noncontinuous jobs? Presumably shutting down the node will accomplish that :)
[14:22:04] I'm not sure how gridengine handles that -- if they are killed, presumably a signal will be sent to the master
[14:25:56] yeah, I guess it’s friendlier to kill them explicitly
[15:13:54] Coren: did you get someone else to look at the timer patches?
[15:14:10] also this TZ seems weird since I get overlap with both europe and andrewbogott / Coren
[15:14:14] No, that's next on my todo list.
[15:14:24] YuviPanda: That means you gotta move there. :-)
[15:15:02] Stockholm?
[15:15:10] YES COME TO EUROPE
[15:15:12] I went to WMSE Office. super nice people!
[15:15:20] valhallasw`cloud: if only countries would let me! :P
[15:15:37] WMF should open an official office
[15:15:40] then it should be easy
[15:15:52] (I think)
[15:16:14] * Coren heard shrieks coming from the SF lawyers.
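The restart ordering agreed at 14:13-14:15 (tools, CI, beta, everything else; within tools: global services, dedicated nodes, general nodes) amounts to a two-level sort key. A minimal Python sketch, assuming each VM is described by a hypothetical dict with "name", "project" and "role" fields; none of these names or labels come from the actual tooling.

```python
# Project order from the 14:13:49 message; anything not listed sorts last.
PROJECT_ORDER = ["tools", "CI", "beta"]
# Order within tools from the 14:14:48 message (role labels are made up here).
TOOLS_ROLE_ORDER = ["global-service", "dedicated", "general"]


def restart_key(vm):
    """Sort key: lower tuples are restarted first."""
    project = vm["project"]
    role = vm.get("role")
    if project in PROJECT_ORDER:
        project_rank = PROJECT_ORDER.index(project)
    else:
        project_rank = len(PROJECT_ORDER)
    if project == "tools" and role in TOOLS_ROLE_ORDER:
        role_rank = TOOLS_ROLE_ORDER.index(role)
    else:
        role_rank = len(TOOLS_ROLE_ORDER)
    return (project_rank, role_rank)


# Hypothetical inventory, deliberately out of order.
vms = [
    {"name": "misc-vm", "project": "other", "role": None},
    {"name": "tools-general", "project": "tools", "role": "general"},
    {"name": "beta-vm", "project": "beta", "role": None},
    {"name": "tools-dedicated", "project": "tools", "role": "dedicated"},
    {"name": "ci-vm", "project": "CI", "role": None},
    {"name": "tools-global", "project": "tools", "role": "global-service"},
]

restart_order = [vm["name"] for vm in sorted(vms, key=restart_key)]
print(restart_order)
```

A dedicated node such as the cyberbot exec node would sit in the "dedicated" tier here, ahead of the general exec nodes, which matches the reasoning that dedicated nodes have no redundancy.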
[15:16:16] yeah, but a European office was discussed and discarded
[15:16:37] yeah so if I move to europe I've to move as a freelancer contracting for wikimedia
[15:16:40] which isn't really easy
[15:16:46] but Stockholm is pretty!
[15:17:56] Labs, Beta-Cluster, Labs-Infrastructure, operations: beta: Get SSL certificates for *.{projects}.beta.wmflabs.org - https://phabricator.wikimedia.org/T50501#1527574 (greg) @ArielGlenn: there's an NDA's task at T97593 which Brandon is driving.
[15:19:47] Labs: Have checkpoint check for Wikitech availability - https://phabricator.wikimedia.org/T107457#1527581 (yuvipanda) @mark has approved doing a simple http 200 check for wikitech.
[15:20:08] Labs: Setup an availability checker for all labsdb hosts - https://phabricator.wikimedia.org/T107449#1527582 (yuvipanda) All 5 hosts approved.
[15:20:35] Labs, Labs-Sprint-108, Labs-Sprint-109: Have checkpoint checks for all labs services (Tracking) - https://phabricator.wikimedia.org/T107058#1527586 (yuvipanda)
[15:20:36] Labs: Have checkpoint check for external network access from labs - https://phabricator.wikimedia.org/T107455#1527583 (yuvipanda) Open>declined a:yuvipanda This was denied as superfluous by @mark.
[15:56:00] valhallasw`cloud: I have a meeting in a few minutes, after that I’ll start shutting off those queues
[16:01:43] andrewbogott: ok!
[16:14:54] !log commonsarchive Upgraded MediaWiki to 1.25.2
[16:16:15] Hmm where is morebots
[16:35:29] !log tools.morebots forced rescheduling of all bots as they all seem to be offline
[16:35:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.morebots/SAL, Master
[17:04:34] valhallasw`cloud: …and now I’m immobilized with hunger so it’s going to be another hour or so before I’m ready to do anything useful :) Will you still be around?
[17:18:34] legoktm: does your legobot logging work?
[17:20:25] Negative24: uh, which logging?
[17:21:21] legoktm: in the main scheduler TimedRotating and logging
[17:22:08] I was using it as an example and as it turns out it doesn't work :P but I fixed it
[17:22:33] Negative24: link to which code? I've written a lot of things called "legobot" :P
[17:22:56] legoktm: https://github.com/legoktm/legobot/blob/master/main.py#L18
[17:23:26] heh, yeah that never worked
[17:23:53] is that copy still being used? I could write a quick PR to fix it if you'd like
[17:25:45] legoktm: it's a quick 2 line fix
[17:26:03] nope, I don't think anything uses it. But feel free to send the PR :)
[17:26:44] legoktm: btw, I'm using parts of the logging and threading for bot24 (if you don't mind)
[17:31:25] legoktm: there you go
[17:31:53] yay
[17:32:03] and I don't mind at all, that's awesome
[17:32:49] * Negative24 should probably check with the author in the future to see if their code is working before he starts trying to learn from it
[17:51:14] andrewbogott: yes, probably, although not all the time so it could be 30 mins before I respond. I'll probably go to sleep around 23h ~ 21h UTC
[18:17:23] !log tools depooling tools-exec-1201 tools-exec-1202 tools-exec-1204 tools-exec-1206 tools-exec-1209 tools-exec-1213 tools-exec-1217 tools-exec-1218 tools-exec-1408 tools-webgrid-generic-1404 tools-webgrid-lighttpd-1409 tools-webgrid-lighttpd-1410 in anticipation of labvirt1001 reboot tomorrow
[18:17:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, dummy
[18:17:30] valhallasw`cloud: ^
[18:18:03] sounds good
[18:18:43] so now I guess we wait an hour for the next step :)
[18:18:59] yep
[18:19:56] let me fix a bug in my script in the meanwhile... the webgrid queues are of course also continuous
[19:18:34] valhallasw`cloud: the output from your script is much shorter than I feared.
[19:18:52] yeah, it's comparable to what I sent yesterday
[19:19:07] 20 jobs
[19:26:03] is it a known bug that Page history in xtools doesn't work? https://tools.wmflabs.org/xtools-articleinfo/?article=Example&project=en.wikipedia.org
[19:27:01] valhallasw`cloud: so I just emailed myself from tools-master using my personal email address, and that worked.
[19:27:16] But when I tried morebots@tools.wmflabs.org (which I’m an owner of) I got nothing
[19:27:41] andrewbogott: that's using python blah.py | mail morebots ?
[19:27:57] cat contents.txt | mail -s "This is a test" morebots@tools.wmflabs.org
[19:28:52] 2015-08-11 19:24:04 1ZPF9Y-0008Nd-1N ** morebots@tools.wmflabs.org: Unrouteable address
[19:28:55] oh! it's tools.morebots
[19:29:14] tools.morebots@tools.wmflabs.org
[19:29:33] ok, trying again...
[19:30:06] yep, that’s it! Thanks.
[19:32:22] valhallasw`cloud: suppose you could modify your tool to output yaml? Then I can make these notice emails a bit friendlier.
[19:32:53] andrewbogott: yes, of course. Same format? You can also just hack the python file
[19:33:26] same format is fine, I think… just so that I can safely correlate tools with jobs.
[19:39:40] test received.
[19:39:40] andrewbogott: ok, the formatting is bad, but it's valid yaml
[19:39:52] andrewbogott: (same script)
[19:40:14] should work, thanks.
[19:55:13] valhallasw`cloud: mind proofreading? https://etherpad.wikimedia.org/p/kAh7UZSoyI
[19:55:54] andrewbogott: mmm, maybe we should also add the continuous jobs to that list?
[19:56:04] this list is the list of jobs that will be killed (not rescheduled)
[19:56:53] ok… I maybe don’t understand the distinction.
[19:57:15] your tool outputs all jobs that are still running, right?
[19:57:22] no, just the non-continuous ones
[19:57:24] So, do I care whether or not they’re continuous?
[19:57:26] oh
[19:57:46] well, the continuous jobs we’re going to explicitly reschedule, right?
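The distinction being worked out at 19:55-19:57 (the list covers only non-continuous jobs, grouped so tools can be correlated with jobs, and printed as YAML) can be sketched as below. This is not valhallasw's script: the job records, the "continuous" flag and the tool names are hypothetical, and the YAML is written by hand with the standard library, which only holds up for plain tool names and integer job ids.

```python
from collections import defaultdict


def jobs_to_kill(jobs):
    """Group non-continuous jobs by tool; continuous ones get rescheduled instead."""
    grouped = defaultdict(list)
    for job in jobs:
        if not job["continuous"]:
            grouped[job["tool"]].append(job["job_id"])
    return dict(grouped)


def to_yaml(grouped):
    """Emit a tool -> list-of-job-ids mapping as simple block-style YAML."""
    lines = []
    for tool in sorted(grouped):
        lines.append(f"{tool}:")
        for job_id in sorted(grouped[tool]):
            lines.append(f"  - {job_id}")
    return "\n".join(lines) + "\n"


# Hypothetical job list; in practice this would come from the grid engine.
jobs = [
    {"tool": "tools.alpha", "job_id": 103, "continuous": False},
    {"tool": "tools.alpha", "job_id": 101, "continuous": False},
    {"tool": "tools.beta", "job_id": 202, "continuous": False},
    {"tool": "tools.beta", "job_id": 201, "continuous": True},   # left out
    {"tool": "tools.gamma", "job_id": 301, "continuous": True},  # left out
]

notice_yaml = to_yaml(jobs_to_kill(jobs))
print(notice_yaml, end="")
```

Per the 18:19:56 message, jobs on the webgrid queues would also count as continuous, so a real version would need to set that flag from the queue name rather than trust a field like the one assumed here.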
[19:57:50] because I consider the continuous ones irrelevant (they should be OK to reschedule, otherwise they shouldn't have been submitted to continuous)
[19:57:50] yes
[19:58:13] so, I think best to leave them out of this email. The general “things will restart” emails that I’ve already sent should cover that.
[19:58:34] ok, sounds good
[20:01:38] andrewbogott: it's covered, but maybe we should be explicit in saying 'sorry, you can't schedule a job that will run for longer than a day or so for now'
[20:01:51] ok...
[20:02:44] meh, something like this, I guess?
[20:04:35] updated
[20:05:00] looks good to me!
[20:11:54] hm, these emails arrive from: andrew@tools.wmflabs.org
[20:11:59] which, I’m not sure I can receive replies there
[20:12:04] maybe I can fix that
[20:12:50] if you got the mail to morebots, that should work
[20:13:51] sending works, just trying to make sure people can reply with alarm :)
[20:16:10] anyway, here we go
[20:16:47] mails sent (I think)
[20:27:58] cool!
[20:49:57] Labs, pywikibot-core: pywikipedia.org is not responding; pywikibot.org is not registered - https://phabricator.wikimedia.org/T106311#1529143 (siebrand) DNS for pywikipedia.org has been updated to wikimedia.org.. Registration for pywikibot.org has been completed. I'm waiting for the DNS zone to come online...
[20:55:54] Labs, pywikibot-core: pywikipedia.org is not responding; pywikibot.org is not registered - https://phabricator.wikimedia.org/T106311#1529159 (siebrand) pywikibot.org's DNS has been updated to point to wikimedia.org..
[21:18:24] Labs, pywikibot-core: pywikipedia.org is not responding; pywikibot.org is not registered - https://phabricator.wikimedia.org/T106311#1529244 (BBlack) Note that in the long run, there's still an issue here with TLS termination. You're in ample company, as we have a boat-load of these sorts of domainnames...
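On the reply-address worry at 20:11-20:13: one common way to make sure replies reach a mailbox that is actually read is to set an explicit Reply-To header when building the notice. A Python sketch using only the standard library; the From and To addresses are the ones mentioned in the log, while the Reply-To address, subject and body are placeholders, and nothing here is claimed to be how the notices were actually sent.

```python
from email.message import EmailMessage


def build_notice(sender, recipient, reply_to, subject, body):
    """Build a plain-text notice whose replies go to reply_to, not sender."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Reply-To"] = reply_to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


notice = build_notice(
    sender="andrew@tools.wmflabs.org",            # address seen in the log
    recipient="tools.morebots@tools.wmflabs.org",  # address seen in the log
    reply_to="replies@example.org",                # placeholder, not a real mailbox
    subject="This is a test",
    body="Jobs listed below will be killed, not rescheduled.",
)

print(notice["Reply-To"])
```

Actually sending it would need a mail transport, for example `smtplib.SMTP("localhost").send_message(notice)` if the host runs a local MTA; that is an assumption about the environment and is left out of the sketch.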
[23:08:31] if i needed more disk space than is available in the normal xl labs instances, is there a place to request that?
[23:09:55] currently looking at setting up a cluster of 3xl machines in labs as an elasticsearch cluster to import production indexes to and run tests with different analyzers/indexing/etc. Ideally we would need a TB of disk space between the 3 machines to have enough room for reindexes against copies of the larger prod indexes
[23:21:53] ebernhardson: I *think* it is technically possible. Opening a phab ticket describing the need might be the best way to get it worked out.
[23:31:53] bd808: any clue which project that would go with?
[23:32:49] * ebernhardson guesses labs-infrastructure ... probably wrong but someone will repoint it :)
[23:33:14] seems like a good enough place to start
[23:33:47] the phab #labs-* projects don't make it easy to choose today :/
[23:34:08] ebernhardson: There is also https://phabricator.wikimedia.org/T76375
[23:34:15] the project creation tracking bug
[23:34:37] Labs, CirrusSearch, Discovery, Labs-Infrastructure: Make available an XL labs instance with ~350GB available disk space. - https://phabricator.wikimedia.org/T108767#1529929 (EBernhardson) NEW
[23:35:49] hmm, not really a new project (we had a 'search' project already and just took it over).
[23:35:52] thanks!
[23:36:20] Labs, CirrusSearch, Discovery, Labs-Infrastructure: Make available an XL labs instance with ~350GB available disk space. - https://phabricator.wikimedia.org/T108767#1529942 (EBernhardson)
[23:39:43] (PS1) Jforrester: Don't duplicate things from #wikimedia-collaboration into -dev [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/230945