[00:00:30] /var is full on tools-exec-08.
[00:00:48] And on tools-master.
[00:01:05] The latter is probably the cause for the hiccup.
[00:02:54] !log tools tools-master: rm -f /var/log/diamond/* && restart diamond
[00:02:56] Logged the message, Master
[00:04:32] Seems to work again.
[01:10:41] DB is locked on betalabs?
[01:28:10] The Wikipedia database is temporarily in read-only mode...
[01:35:22] what happened to meetbot? Links to meeting logs on https://meta.wikimedia.org/wiki/IRC_office_hours are all 404s.
[01:36:07] I'll link to the wm-bot log of the meeting, but it's not HTML so harder to read
[02:28:23] !log deployment-prep Unbroke replication on deployment-db2, it's catching up now
[02:28:27] Logged the message, Mr. Obvious
[04:32:45] Somebody removed the user-agent, didn't they?
[04:32:55] https://meta.wikimedia.org/wiki/User-Agent_policy
[06:02:48] the only flaw of dispenser's reflinks is that after it makes its change, it goes to https instead of going back to http (for those not using the secured connection)
[07:44:00] Wikimedia Labs / deployment-prep (beta): beta labs db stays read-only - https://bugzilla.wikimedia.org/67509#c1 (Antoine "hashar" Musso) NEW>RESO/WOR Not sure what happened, but it works for me now.
[08:51:54] !log tools tools-exec-08 (some hours ago): rm -f /var/log/diamond/* && restart diamond
[08:51:57] Logged the message, Master
[09:32:46] hello
[09:32:47] any admin around already?
[09:37:30] Wikimedia Labs / tools: Some issues: tools-webgrid-03/04, tools-login - https://bugzilla.wikimedia.org/67329#c2 (metatron) puppet run fail again tools-login / tools-dev. ssh tools-webgrid-03/04 still impossible -login: The last Puppet run was at Wed Jul 2 19:13:29 UTC 2014 (2139 minutes ago) -dev: Th...
[09:51:51] nosy1: hey!
[09:52:06] nosy1: I'm around for another 10min, anything I can do?
[09:56:36] 'SELECT command denied to user 'u2815'@'10.68.16.7' for table 'namespacename' < on dispenser's dabsolver tool..
[10:08:42] YuviPanda: there is still an ugly commons-federated-table-issue that sucks
[10:08:48] https://bugzilla.wikimedia.org/show_bug.cgi?id=59683
[10:09:02] i hope to catch mark later
[10:09:18] comets: which database does he use?
[10:09:35] one is without a _p at the end and will not work
[10:10:50] "ProgrammingError(1146, "Table 'p_dpl_p.ch_results' doesn't exist", None)" ?
[10:11:01] comets: which db?
[10:11:03] comets: url?
[10:11:17] https://tools.wmflabs.org/dispenser/cgi-bin/dab_solver.py?
[10:11:23] ah ok p_dpl_p
[10:11:34] is a project db on toolserver
[10:11:39] i think DispenserAFK has simply not rewritten his script for labs yet
[10:11:43] i guess db names would have to be changed
[10:11:56] he will probably not do so as i heard
[11:13:50] Merlissimo: nosy1 DispenserAFK's tools don't run on labs, I think. He's just using a labs URL to proxy things to some other private server
[11:14:43] probably. if the tools work without db it's not even hard; with db you'd have to use ssh tunnels :D
[11:44:45] Wikimedia Labs: New instances are stuck in "The certificate retrieved from the master does not match the agent's private key." - https://bugzilla.wikimedia.org/61413#c9 (Andre Klapper) abogott: Do you know if this happened, and does this make this ticket obsolete?
[13:31:27] Why is User-Agent being stripped? From HTTP requests?
[14:11:47] Dispenser: mistake?
[14:12:06] Dispenser: if you check your access.log it should be there
[14:13:32] Dispenser: it's not being stripped
[14:15:23] Well it was last night, my tools broke because there was no user-agent
[14:15:50] Dispenser: must have been a glitch
[14:15:59] http://tools.wmflabs.org/betacommand-dev/cgi-bin/test
[14:26:23] I'm pretty sure I don't know of a way by which UA might only be intermittently stripped.
[14:26:31] (With our current setup, that is)
[14:27:24] That your front end proxy would be stripping it. But I figured it out, I was calling the wrong PHP variable
[14:28:41] Betacommand: I took a peek at what you sent me; it's a bit light on detail, but should suffice for at least a first round of talk. Ima post it as an RFC shortly and point the wikimedia-l thread at it.
[14:30:08] Coren: https://gerrit.wikimedia.org/r/#/c/143861/ should finally give us puppet freshness metrics on toollabs, will do for the proxy and then for gridengine
[14:31:00] and then alerts.
[14:38:50] Oh, eeew. Sudo to cat?
[14:53:23] Hi. Since some hours, I am not seeing any instances on projects on
[14:53:26] https://wikitech.wikimedia.org/wiki/Special:NovaInstance
[14:53:49] Do others still see their instances?
[15:03:02] Coren: yeah, the previous patchsets didn't have that, but people -2'd that for other reasons...
[15:04:02] Coren: and I think changing permissions / ownership of /var/lib/puppet is out of the question, since that is what my initial patch did
[15:04:33] What group owns /var/lib/puppet anyways?
[15:04:52] Coren: puppet
[15:04:55] And wouldn't it be easier to make sure diamond is in that group?
[15:05:59] Coren: that gives it permissions to read a lot more things
[15:06:06] no minimalism
[15:06:45] * Coren ponders.
[15:07:07] Like I said, I'm not in love with using sudo in that context; but not enough to block the patch.
[15:07:35] Coren: according to _joe_, best solution in this context is to fix the puppet package to have o+x for /var/lib/puppet
[15:08:12] I had an exec there that'll do that, but that was -2'd. Had to use exec since puppetmaster already defines that as file (and with the same permissions I want, no less) BUT STUPID PUPPET DUPLICATE DETECTION IS STUPID :)
[15:08:50] Very much so; but you can override things in puppet.
[15:09:11] Coren: right, but not with a file{} without doing if defines.
[15:09:32] Yes you can.
[15:09:37] Coren: how so?
[15:09:43] Coren: see PS7 for my older approach
[15:09:45] ensure_resource?
[15:09:48] Check modules/toollabs/manifests/hba.pp where I do exactly that.
[15:09:58] (for /etc/security/access.conf)
[15:10:20] oh dear
[15:10:21] qchris: I see the same thing. Basically the problem is that I think the login session does NOT expire, but something else expires and that has the effect of making the sessions invisible
[15:10:30] so what you do is - log out, log in, everything's ok
[15:10:31] I don't override mode in that one, but the principle is the same.
[15:10:36] milimetric: Ok. Thanks.
[15:10:38] is that doing resource collection?
[15:10:49] but Coren, you should know about this too ^ (though I see you're busy and I don't think it's urgent)
[15:11:19] sorry, "effect of making the labs instances invisible"
[15:11:29] milimetric: Logout/Login worked. Thanks.
[15:11:33] np
[15:11:38] YuviPanda: Nyes. That is, it uses some amusing property of resource collection in order to 'redefine' a resource defined elsewhere.
[15:13:06] That wouldn't work between manifests because we don't do puppetdb, but it's perfectly allowable to collect a single resource to instantiate it.
[15:14:18] It's kinda dangerous though because it doesn't do what you expect if the original resource doesn't in fact exist (it becomes a noop rather than fail as you'd expect) so you have to be careful that the net effect isn't detrimental if it is.
[15:15:24] But it's not dangerous as in "uses edge behaviour"; that is actually specified in the docs: "When given a block of attributes and values, a collector will set and override those attributes for every resource (virtual or not) that matches its search expression."
[15:15:33] Coren: ah, in this case it would be detrimental, I think, since I'd want the perms to be set even if the resource is not defined elsewhere
[15:16:04] YuviPanda: If it's not defined anywhere, then you have nothing that sets it in the first place so you've got nothing to override.
[15:16:27] Coren: it's not *explicitly* defined everywhere, is my point. It's set up by the puppet package
[15:16:39] Coren: and puppetmaster explicitly defines it to set its permissions
[15:17:11] Ah, and you want this to work as expected even when there /is/ no puppetmaster included.
[15:17:20] indeed
[15:17:43] You're right; that wouldn't work unless the resource existed always and not just in puppetmaster.
[15:17:45] as I said, ideal solution is to do this in the puppet deb.
[15:17:47] indeed.
[15:18:27] Well, either in the puppet deb or unconditionally in, say, ::base. But that'd be a Bad Thing since /var/lib/puppet can contain things that really shouldn't be readable.
[15:19:04] So yeah; looks like sudo is the least worst approach.
[15:19:11] yeah
[15:19:23] Coren: the things that shouldn't be readable have their own perms set. My patch was just doing o+x
[15:19:38] so I can read /var/lib/puppet/state/last_run_summary.yaml, which has appropriate perms for world readableness
[15:19:48] Well, short of a hardlink and fancy ACLs which would be even worse from a management point of view.
[15:20:47] yup,
[15:20:50] Coren: so we sudo to cat
[15:21:02] YuviPanda: BTW, diamond is spamming root with bad sudos.
[15:21:05] I never thought there'd be a time when I'd have to shell out for cat
[15:21:20] Coren: graphite-test you mean? that was an earlier version of the patch, should've been fixed now
[15:21:56] YuviPanda: I honestly still think shelling out to do a sudo cat is ugly.
[15:22:17] Coren: oh I completely 100% agree
[15:22:43] IMO least bad solution is to exec and set 0751 on /var/lib/puppet, there's no security issues with that since the actually sensitive stuff has its own perms
[15:22:55] but since that's not an option...
[15:25:02] anyway, I've to go now
[15:25:14] Coren: hopefully we won't have to do that for other collectors (grid and proxy)
[15:26:37] now to go afk
[15:26:39] cya
[16:09:36] hi. when should redis be used instead of sql databases?
[16:13:33] rohit-dua: That depends mostly on what you need it for; redis is really good at a couple of well-defined things, not so much for the rest. Also, redis offers no guarantee of permanence.
[16:14:16] rohit-dua: redis is, fundamentally, a key-value store.
[16:17:18] Coren: i want to store book(s) title, description, and other metadata for a period of about 2-3 days
[16:18:00] redis would work if you have an obvious key and you don't mind the possibility of losing some of the data and having to fetch it back again.
[16:18:14] At the gain of some speed and simpler handling.
[16:18:31] Wikimedia Labs: New instances are stuck in "The certificate retrieved from the master does not match the agent's private key." - https://bugzilla.wikimedia.org/61413#c10 (Andrew Bogott) NEW>RESO/WON Yes, I just now deleted the 'nagios' project as it had no instances.
[16:19:42] Coren: yes there is a unique key. So what do you suggest: store JSON-dumped data in one redis key, or store it with diff keys like uniqueKey:description, uniqueKey:title etc. for each book?
[16:20:23] That'd depend on your use pattern; I'd expect you'd normally always fetch all the metadata together so the json value is probably the simplest.
[16:24:31] Coren: thank you. I'll go with json dump for max data, and diff. keys for those which are called separately.
[17:39:46] Coren: you around?
[18:49:30] Wikimedia Labs / deployment-prep (beta): db error on beta labs "centralauth.renameuser_status' doesn't exist" - https://bugzilla.wikimedia.org/67485#c2 (Chris Steipp) Was this resolved? It looks like the rename patch hasn't been reverted, but I'm authenticating to beta just fine (and the rename patch c...
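Editorial note on the redis exchange above (16:09 to 16:24): the plan settled on there, one JSON value per book plus separate keys for fields read on their own, could look roughly like the sketch below. This is only an illustration under assumptions: the `book:<id>` key scheme and the helper names `book_key`, `pack_book` and `unpack_book` are hypothetical and not from the chat, and a plain dict stands in for the key-value store so the sketch runs without a redis server.

```python
import json


def book_key(book_id, field=None):
    """Build 'book:<id>' or 'book:<id>:<field>' (hypothetical naming scheme)."""
    base = "book:{}".format(book_id)
    return base if field is None else "{}:{}".format(base, field)


def pack_book(book_id, metadata, separate_fields=()):
    """Return {key: string value} pairs ready to hand to a key-value store.

    All metadata goes into one JSON value; fields that are read on their
    own are additionally stored under their own key.
    """
    entries = {book_key(book_id): json.dumps(metadata, sort_keys=True)}
    for field in separate_fields:
        entries[book_key(book_id, field)] = str(metadata[field])
    return entries


def unpack_book(store, book_id):
    """Read the JSON value back; None means the entry is gone and must be refetched."""
    raw = store.get(book_key(book_id))
    return None if raw is None else json.loads(raw)


if __name__ == "__main__":
    store = {}  # a plain dict stands in for the key-value store (assumption)
    meta = {"title": "Example Title", "description": "An example.", "pages": 120}
    store.update(pack_book("b1", meta, separate_fields=("title",)))
    print(unpack_book(store, "b1"))
    print(store[book_key("b1", "title")])
    print(unpack_book(store, "missing"))
```

Since the data only needs to live for 2-3 days and redis gives no guarantee of permanence, each value would presumably be written with an expiry (the Redis `SETEX` command takes a lifetime in seconds), and the reading code has to treat a missing key as "fetch it again", which is what the `None` return models.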
[19:04:01] Wikimedia Labs / deployment-prep (beta): db error on beta labs "centralauth.renameuser_status' doesn't exist" - https://bugzilla.wikimedia.org/67485#c3 (Antoine "hashar" Musso) NEW>RESO/FIX The database update only run once per hour, so I guess the patch introducing the new table has been introdu...
[19:09:55] What's the toolserver.namespaces substitute?
[19:14:15] Dispenser: api calls
[19:14:44] How do I make API calls from bash!?
[19:15:03] curl?
[19:15:41] How do I use API calls from bash!?*
[19:16:16] View 'enwiki_p.namespaces' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them
[19:22:59] Dispenser: basically, there is no table because it's bad for the database (replag etc) to do joins with such tables on the database (that's what I understand, anyway)
[19:23:15] Dispenser: so you should get the local namespaces in your client code, from the API
[19:23:24] which is pretty fast, as it's in the same data center anyway
[19:26:28] It's another point of failure, decreased availability, and requires more error handling code
[20:21:07] Dispenser: can you give a hand at https://www.mediawiki.org/wiki/Requests_for_comment/Caching_References answering questions
[20:42:16] Betacommand: I think your reply is good. Sj has a point, but then Wikisource is doing a poor job compared to archive.org too. Regarding pete's point on licensing, there are legal issues; Archive.org uses congressional Fair Use for Libraries when redistributing content.
[20:43:11] Likely some legal contraption is needed
[21:08:07] Dispenser: regarding toolserver.namespace:
[21:08:24] the db is now called s51892_toolserverdb_p
[21:19:24] nosy1: It's not gonna be renamed?
[22:10:52] Has anyone taken up the task of producing Video Tutorials from the late Jackson Peebles? I assume there were some he produced, but they seem to have been removed
[22:17:21] so, I can't reboot my labs instance
[22:17:24] "The requested host does not exist. "
[22:17:32] it's rejecting my ssh key too
[22:17:39] the instance is social-tools1
[22:17:49] I'm guessing no one is around, so I'll file a bug
[22:20:33] https://bugzilla.wikimedia.org/show_bug.cgi?id=67545
[22:20:48] Wikimedia Labs: Cannot log into nor reboot social-tools1 instance "The requested host does not exist. " - https://bugzilla.wikimedia.org/67545 (Kunal Mehta (Legoktm)) NEW p:Unprio s:normal a:None I'm not able to ssh into social-tools1: $ ssh legoktm@social-tools1.eqiad.wmflabs If you are hav...
[22:21:37] have you turned it off and on again?
[22:23:57] :P
[22:32:19] social-tools1 :o
[22:34:05] RoanKattouw_away, ping me when back, i would love to learn how you got repl unstuck yesterday
[22:37:27] mwalker: help!
[22:37:53] I seem to have lost permissions on labs - can't manage instances or web proxies anymore :(
[22:38:12] which is weird because I appear in the project admin group for services still
[22:38:20] mvolz, I saw your email; I'm looking
[22:38:23] also... go to bed!?
[22:38:33] isn't it crazy late over there?
[22:38:48] I'm eating apple crumble before bed in a true American fashion
[22:39:14] but you're right, night :)
[22:39:24] ooooh; tasty :)
[22:39:28] I'll see what I can do
[22:39:32] ty
[22:39:46] before you disappear though -- *high five* for zotero bugfixes
[22:39:51] mvolz: I ran into something similar, and just file a bug for it: https://bugzilla.wikimedia.org/67545
[22:39:56] filed*
[22:41:08] mvolz, well; it appears I can still manage things... can you tell me what proxy you want created?
[22:41:15] * mvolz receives high five
[22:41:35] 1970 please.
[22:42:04] legoktm: thanks, good to know I'm not having sleep dep hallucinations
[22:42:12] on citoid.
[22:42:28] ok; do you want me to leave the port 80 proxy?
[22:42:33] nah
[22:42:40] but it doesn't really matter
[22:43:33] ok; proxy created
[22:43:43] also i removed and re-added you to the admins group
[22:43:46] that might have helped
[22:43:50] if you want to try again
[22:45:44] sadly not
[22:45:54] can you tell me the url? I forget the format
[22:47:57] mvolz, https://wikitech.wikimedia.org/wiki/Special:NovaProxy
[22:48:40] yeah, unfortunately I can neither see existing proxies nor create new ones
[22:48:42] same with instances
[22:49:29] hmm... well I don't really have any special privileges to help you further -- can you send an email to the labs list
[22:49:50] though; to check if it's related to legokt_'s problem; can you ssh into the box?
[22:50:01] yeah I just tried that and there was no issue
[22:50:07] so I'm not sure if they're related
[22:50:07] :/
[22:50:21] well, that's good for you :P
[22:54:07] oh, it's working now!
[22:54:12] yay.
[22:54:17] sorry legoktm
[22:54:41] maybe the re-add took effect finally.
[22:54:59] * mvolz can now go to bed.
[22:55:21] heh; mvolz have a good weekend!
[22:55:26] you too :)
[23:39:47] Coren: only one webgrid host for queue webgrid-lighttpd is available which is nearly full. 2 and 3 are in error state which must be cleared
[23:43:00] * Coren looks into it.
[23:50:18] thx Coren
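
Editorial note on the 19:09 to 19:26 exchange about replacing toolserver.namespaces with API calls: the advice there was to fetch the local namespaces from the MediaWiki API in client code. A minimal sketch of that idea follows, assuming the standard `action=query&meta=siteinfo&siprop=namespaces` request. The helper names `siteinfo_url` and `parse_namespaces` are hypothetical, and the sample response is a hand-written, trimmed illustration rather than real API output. From bash, the same URL could be fetched with curl, as suggested in the chat, and the body piped into a parser like this one.

```python
import json
from urllib.parse import urlencode


def siteinfo_url(api_base):
    """Build the siteinfo request URL for a wiki's api.php endpoint."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "namespaces",
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def parse_namespaces(response_text):
    """Map namespace id -> local name from a siteinfo JSON response.

    The legacy JSON format keeps the name under "*"; formatversion=2
    uses "name". Both are accepted here.
    """
    data = json.loads(response_text)
    namespaces = data["query"]["namespaces"]
    return {
        int(ns_id): info.get("name", info.get("*", ""))
        for ns_id, info in namespaces.items()
    }


if __name__ == "__main__":
    print(siteinfo_url("https://en.wikipedia.org/w/api.php"))
    # Hand-written sample in the legacy JSON shape (not real output).
    sample = '{"query": {"namespaces": {"0": {"id": 0, "*": ""}, "1": {"id": 1, "*": "Talk"}}}}'
    print(parse_namespaces(sample))
```

As the 19:26 reply points out, this adds a network dependency that a database join did not have, so a real script would also need error handling around the fetch, and could cache the result since namespaces rarely change.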