[00:01:40] anyway, labs is pretty important overall and it's staffed less than most of wikimedia's projects
[00:02:33] Still amazes me how many jobs WMF has open all the time... even if they are mostly for developers heh
[00:04:47] Damianz: Compare the open jobs count for WMF with the open jobs at any other company running a web property as large as the wiki[mp]edia farm and I think you'll see it's pretty low.
[00:10:28] Depends in what area though... commercial stuff comes into play a lot. For example we have an A/B team like 4 times larger than our ops team
[00:23:00] Why doesn't curl work on Labs?
[00:23:09] on an own instance
[00:23:51] what are you trying to curl?
[00:24:04] things with dns pointing to external ips don't work because of nat and firewalls and things
[00:24:56] oh
[00:25:20] for example tools.wmflabs.org is like tools-webproxy internally
[00:25:44] * Damianz really wishes we had split horizon dns or some network sticky tape
[00:26:48] Damianz: it works with tools-webproxy
[00:26:57] sorry
[00:27:13] np - it's a silly thing
[02:53:32] Hi
[02:53:55] Is the labsdb thing fixed?
[03:24:31] PiRSquared: If you mean access problems from the web servers, then yes -- several hours ago.
[03:26:24] equiad status -- Grid: Working -- Replicated DBs: Working -- Local DBs: Working -- Web: Working
[04:03:15] Ah, this is immensely cool. I managed to properly have tools.wmflabs.org proxy to the right spot in eqiad if the pmtpa side is disabled automatically. No URL will be harmed during the migration. :-)
[04:05:34] https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad#During_the_migration_period
[04:43:27] Change on mediawiki a page Wikimedia Labs was modified, changed by Sandaru link https://www.mediawiki.org/w/index.php?diff=919343 edit summary: [+34] adding the correct link to tool labs instead of redirect page
[04:46:43] Change on mediawiki a page Wikimedia Labs was modified, changed by Sandaru link https://www.mediawiki.org/w/index.php?diff=919345 edit summary: [-4]
[04:48:24] Change on mediawiki a page Wikimedia Labs was modified, changed by GeorgeBarnick link https://www.mediawiki.org/w/index.php?diff=919346 edit summary: [-4] unlink
[05:56:19] so, when exactly will we be able to create to create eqiad instances?
[08:14:34] Coren|Busy: Could you provide an example how to copy directly from pmtpa to eqiad?
[08:17:47] Coren|Busy: trying to access labstore.svc.eqiad.wmnet:/project/tools/project from pmtpa gives 'no route to host'
[13:26:15] giftpflanze: Sometime today; waiting of Andrew for that.
[13:27:23] hedonil: I haven't written the pack-my-stuff scripts yet (coming today). It's possible to do "by hand" but a little complicated.
[14:24:29] wee
[14:31:17] andrewbogott: hi
[14:31:26] andrewbogott: when is it possible to create instances in new cluste
[14:31:28] r
[14:31:30] :o
[14:31:47] petan: Soon! Let me flip a couple of switches :)
[14:31:52] ok
[14:31:55] after that I will read my epic backscroll
[14:32:17] half of that is me :P
[14:33:22] ok, I'll ignore those parts :p
[14:34:27] petan, reload the 'manage instances' page. What do you see?
[14:36:07] eqiad [Creation is disabled]
[14:36:17] oops, I did it backwards
[14:36:29] can machine in eqiad have same hostname as these in pmtpa
[14:36:35] ok, how about that?
[14:36:40] Yep, same hostname is fine.
[14:36:45] ok
[14:36:51] I need to use fqdn when scp?
[14:36:54] I suppose
[14:36:55] When referring to instances in the other dc just use fully qualified...
[14:36:56] yes
[14:37:00] :)
[14:37:38] for your proxycommand use 'bastion-eqiad.wmflabs.org' to access eqiad machines.
[14:41:51] jorm: Apologies about Unicorn… it should be better now. Let me know if not.
[14:45:41] petan: Working now?
[14:46:36] andrewbogott: yes I already started migrating my things :>
[14:47:08] great! petan, I take it you already updated the wiki page about your projects?
[14:47:20] I need to remove them from the rsync queue.
[14:47:21] no
[14:47:25] aha
[14:47:27] where is it
[14:47:48] https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Progress
[14:47:55] !migration is https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Progress
[14:47:55] Key was added
[14:48:09] Also… I'm going to kill memcached on wikitech because I see some bad behavior. You may have to log back in again, sorry...
[14:48:17] I will them
[14:49:06] yeah, that fixed it.
[14:50:18] andrewbogott: 2 striked, I will also do icinga by hand I guess and maybe some more
[14:50:37] great, thanks!
[14:54:20] andrewbogott: Re memcached = logged out after a while? Previously, my login stayed for seven days or so. Or does that have to do with me being contentadmin?
[14:54:51] scfc_de: When I restarted memcached that probably logged out everyone :)
[14:55:09] I'll try not to make a habit of it.
[14:55:49] It happened quite more than once :-). Are you aware of any different login characteristics for contentadmins?
[14:56:42] scfc_de: contentadmin shouldn't matter...
[14:57:11] I'm going to close my browser and make sure I'm still logged in...
[14:58:03] Hmmm. I didn't actually close my browser; I suspected I changed my IP address and maybe contentadmin's rules didn't allow that.
[14:58:32] scfc_de: nah, something is a bit broken with sessions. It's not just you.
[14:58:36] I'll check it out.
[14:59:59] (Just a nuisance, not a blocker.)
[15:00:19] yep
[15:40:59] I have 79 scanned historic books in pdf format to upload to Commons from the World Digital Library, I have already uploaded over 350 that were under the 100MB limit. Any suggestions of an easy way of doing it, perhaps granting temporary permission to Faebot to upload the files? The list is at https://commons.wikimedia.org/wiki/Commons:Batch_uploading/World_Digital_Library#100MB
[15:41:53] Just to make that a bit clearer, they are all over the 100MB limit, so I cannot upload them.
[15:42:29] Splitting them would seem an unnecessary drag too - seeing as most users/reusers would want to browse the complete book.
[15:49:10] jzerebecki: Can you update me on the status of the puppet-cleanup project?
[15:55:27] Fae: You would need to talk to ... someone else :-) about that. Have you raised your issue at the Commons Village Pump?
[15:57:43] ok, next migration step involves breaking my irc bouncer. So… back in a few.
[16:00:26] Coren|Busy, isn't today the migration?
[16:00:53] Isn't Cyberpower678 the least patient user, ever? :-)
[16:01:03] Yes.
[16:01:05] :p
[16:01:21] Coren|Busy, it's called being excited and curious.
[16:03:08] Cyberpower678: today /begins/ the migration which will take many days
[16:03:45] andrewbogott, err. I know. I want to have my stuff migrated ASAP
[16:04:09] Coren|Busy, what do I need to do about DBs?
[16:04:28] Cyberpower678: This will be made known.
[16:04:36] Right now, you need to wait.
[16:04:40] Looking forward to it.
[16:06:23] scfc_de: Not yet, I was told by someone else that this was a developer thing, but I'll raise it at the VP. :-)
[16:18:06] Coren|Busy, I can't seem to write to my new project cybertools. Am I forgetting something?
[16:18:32] ... new *project* cybertools?
[16:18:56] Coren|Busy, service group
[16:19:02] Tools project
[16:19:14] [11:04:36] Right now, you need to wait.
[16:20:02] Coren|Busy, wait for what? I fail to see how that connects to me getting Permission Denied
[16:20:25] * andrewbogott suspects that Coren|Busy is Busy
[16:20:42] andrewbogott, what makes you say that?
[16:20:46] Oh wait.
[16:21:00] FAIL
[16:23:04] * Cyberpower678 is surprised he did not notice the Busy in Coren|Busy's nick. :p
[16:42:47] I am getting a 504 Gateway Time-out (nginx/1.1.19) for a large page on my test wiki. I did some google searches and I am supposed to make some changes to an nginx.conf file, but I can't find it. Any suggestions?
[16:52:01] Howie: " /usr/local/nginx/conf, /etc/nginx, or /usr/local/etc/nginx. "
[16:52:12] Howie: likely /etc/nginx/nginx.conf
[16:56:15] @notify andrewbogott
[16:56:15] I'll let you know when I see andrewbogott around here
[16:56:51] mutante: there is no second storage on new eqiad boxes
[16:56:54] /dev/vdb is missing
[16:57:21] andrewbogott
[16:57:31] andrewbogott there is no second storage on new eqiad boxes
[16:57:48] /dev/vdb
[16:57:52] Ah, talking about the disk config?
[16:57:57] yes
[16:58:05] there are 3 partitions in /dev/vda
[16:58:14] but no vdb with 20gb size
[16:58:20] as mentioned in instance template
[16:58:37] Hm, does that mean that all instance flavors are the same size?
[16:58:39] * andrewbogott tries it
[16:59:23] I don't know but I expected /dev/vdb to be present there
[16:59:27] mutante: I looked at that those places and still found no nginx directory
[16:59:28] where are these 20gb now?
[16:59:48] petan: I don't know, looking.
[16:59:55] It wouldn't surprise me if the partitions were rearranged somewhat
[17:00:25] Howie: how did you set that wiki up? you can try this, first run "updatedb" and after that is finished indexing "locate nginx.conf"
[17:01:26] andrewbogott: I don't really mind, but this is quite a step back, because having all data on same physical partition is never a good idea
[17:01:34] when I fill it up, system will become unstable
[17:01:47] and I likely will fill it up accidentaly with some logs :P
[17:01:56] mutante: when I run updatedb I get the following: root@drmf:~# updatedb The program 'updatedb' can be found in the following packages: * mlocate * locate Try: apt-get install
[17:05:00] petan: except it is different partitions, right? Just on the same device?
[17:05:14] andrewbogott: no, there are no free partitions
[17:05:29] andrewbogott: by default it's all / and 1 separate for /var and 1 for swap
[17:05:50] so by default there is no option to create any extra filesystem for own purposes
[17:06:56] Howie: apt-get install locate
[17:06:58] btw I made instance with 20gb storage and it only has 8gb / :o
[17:07:12] I don't really mind, but that template is lying
[17:09:12] mutante: I did that, which worked properly, and then I did updatebd which printed "/usr/bin/find: `/data/project': No such file or directory" and then "locate nginx.conf" which printed out nothing.
[17:09:36] this is the url which produces the 504 --> http://drmf.instance-proxy.wmflabs.org/wiki/Zeta_and_Related_Functions
[17:10:38] Howie: sounds like an issue with storage
[17:10:54] since /data/project is gone. maybe related to the other talk above
[17:12:06] mutante: what other talk? How can I determine if this is an issue due to storage
[17:12:47] I did a 'df' and I am not near 100% utilization of disk resources
[17:12:51] Howie: andrewbogott and petan
[17:12:57] hello
[17:12:59] Howie: unrelated
[17:13:13] mutante: what's up
[17:13:28] petan: Howie reports /data/project': No such file or directory
[17:13:28] mutante: no, not related
[17:13:41] should he expect /data/project or nah
[17:13:41] mutante: on which priject
[17:13:44] * project
[17:13:48] Howie: ^
[17:14:00] drmf
[17:14:05] mutante: I was offline, don't really know what you're talking about, sorry
[17:14:05] or maybe even better, which instance
[17:14:13] @labs-instances drmf
[17:14:18] @labs-instance drmf
[17:14:21] @labs-instance-list drmf
[17:14:23] omg
[17:14:25] @help
[17:14:25] I am running http://meta.wikimedia.org/wiki/WM-Bot version wikimedia bot v. 2.0.0.4 my source code is licensed under GPL and located at https://github.com/benapetr/wikimedia-bot I will be very happy if you fix my bugs or implement new features
[17:14:57] http://drmf.instance-proxy.wmflabs.org/
[17:16:17] I think the LVM stuff was done by Coren|Busy (module labs_lvm), so maybe ask him when he's not busy.
[17:16:57] scfc_de: which lvm
[17:17:06] I don't know anyone else than me who was using lvm on labs
[17:17:54] operations/puppet:modules/labs_lvm/; seems to be included in toollabs; don't know about other instances.
[17:19:01] so the nginx error is a symptom of LVS issues?
[17:19:08] lvm
[17:19:22] i think by now everybody involved is confused:) success
[17:19:50] andrewbogott: /dev/vdb mysteriously appeared lol
[17:19:56] no idea who did it
[17:20:06] really? On what instance?
[17:20:09] I hope it's not gonna disappear just that fast :P
[17:20:17] i-000000e7.eqiad.wmflabs
[17:20:27] maybe it appeared on the first or second puppet run?
[17:20:29] andrewbogott: summary, i tried to help out Howie with finding his nginx config and then it turned into /data/project being gone
[17:20:36] mutante: I think it would useful for Howie to provide some more information :-): What project, what instance, what Puppet classes, etc.
[17:20:41] that may be possible despite it's weird
[17:21:08] mutante: if you know what project it is, I can see what's up with /data/project
[17:21:31] andrewbogott: it's called drmf
[17:21:38] andrewbogott: lol nevermind, I was looking at wrong window
[17:21:41] too many terminals open
[17:21:48] it's not there
[17:21:56] mutante: there is no project by that name
[17:22:36] Howie: ^ for further support the project name is needed
[17:22:50] one sec
[17:23:20] Isn't it enough to get this from the url?
[17:24:20] Maybe the project name is "Math"
[17:24:32] we may be under that
[17:26:27] mutante: it does not look to me like project 'math' was ever configured to have a /data/project
[17:26:31] do you have reason to think otherwise?
[17:27:03] there's a home directory "andrew" in our installation /home
[17:27:06] andrewbogott: i don't , i'm support Howie, i am in a position where all i do is relay messgaes
[17:27:30] so what is the actual problem then? Howie, or mutante?
[17:27:31] The system was set up by "physikerwelt"
[17:27:36] andrewbogott: well, yes, because:
[17:27:45] which printed "/usr/bin/find: `/data/project': No such file or directory"
[17:27:51] he told me that
[17:28:02] and that happens on "updatedb"
[17:28:49] I am getting a 504 Gateway Time-out (nginx/1.1.19) for a large page on my test wiki. I did some google searches and I am supposed to make some changes to an nginx.conf file, but I can't find it. I am trying to fix this.
[17:28:52] Howie: ok, can you please repeat what you said above, with the URL and stuff
[17:28:55] thanks
[17:29:31] I get the error when I try to access the following url -> http://drmf.instance-proxy.wmflabs.org/wiki/Zeta_and_Related_Functions
[17:30:18] (the page exists on my MediaWiki installation)
[17:30:39] petan: Check the labs_lvm class and the labs_lvm::volume instance. Documented in puppet. That's what you want.
[17:31:17] andrewbogott: https://bugzilla.wikimedia.org/show_bug.cgi?id=62219
[17:32:09] Howie: do you have reason to think this worked in the past?
[17:32:37] petan: What does vgdisplay say?
[17:32:49] scfc_de: where
[17:33:09] On that instance?
[17:34:36] andrewbogott: I did some google searches and basically there is a timeout and I need to increase some system variables in an nginx.conf file, but I can't find it.
[17:34:57] scfc_de: vgdisplay isn't going to help unless the puppet configuration for the instance includes the 'labs_lvm' class. Then the extra space will show as a volume group
[17:35:35] Coren|Busy: Ah, okay.
[17:35:44] Howie: ok, nginx does not seem to be installed on that box.
[17:36:04] then how come I am getting that error which specfically refers to nginx?
[17:36:57] andrewbogott: nginx/1.1.19
[17:38:01] Coren|Busy: cool thing
[17:38:35] Howie: andrewbogott: Doesn't the nginx error come from instance-proxy?
[17:39:04] Howie: sorry, I'm being needlessly obscure :) you are accesing the instance through a web proxy, which is running nginx.
[17:39:13] It's the same proxy that all of labs uses.
[17:39:28] So, not something that labs users can individually configure.
[17:39:39] If there's a bug in the config then I can have a look.
[17:40:09] poh
[17:40:12] (oh)
[17:40:23] thank you
[17:40:50] Howie, indeed, I can probably set you up to use a different proxy right now that may work better.
[17:41:11] What would you like the hostname to me? Just drmf.wmflabs.org?
[17:41:16] andrewbogott, thanks!
[17:41:22] that is perfect!
[17:42:27] andrewbogott: That timeout is killing me, is there someway to make this longer so that I can view the MediaWiki page?
[17:42:32] Oh, dammit, the apache config on that site insists on using instance-proxy
[17:45:41] Howie: that box has a convoluted apache setup which I can't find my way through. You can access it at drmf.wmflabs.org but it immediately redirects via instance-proxy which is (probably) causing your problem.
[17:45:59] So if you want to sort through the apache setup and figure out why it's redirecting or rewriting the url...
[17:46:30] (btw, I was wrong before, didn't read the url closely enough. 'instance-proxy' is way out of date, no longer supported.)
[17:46:52] i see
[17:47:03] So what do you reccomend?
[17:47:13] "So if you want to sort through the apache setup and figure out why it's redirecting or rewriting the url..."
[17:47:24] I'm just assuming here that your apache skills are stronger than mine, which are negligable
[17:47:37] (bad assumption) :)
[17:48:13] It's using hhvm, among other things
[17:50:40] andrewbogott: thanks for your help, I will try and figure out what is going on and come back here if I can resolve this issue of redirection.
[17:50:59] Howie: You're welcome, sorry I don't have an immediate fix.
[17:51:11] np
[18:05:28] matanya: I am migrating etherpad instances now; they will shut down in pmtpa shortly
[18:05:37] thanks
[18:22:59] petan: when bots is obsolete why do you migrate it?
[18:24:36] andrewbogott: Ahm...why did the epl instance get deleted?
[18:25:04] rdwrer: what project/instance?
[18:25:17] Andrew Bogott deleted instance 'etherpad-lite' in project Nova Resource:Etherpad
[18:25:31] rdwrer: I'm migrating those instances at matanya's request.
[18:25:43] matanya: OK, so, same question, why are you doing that?
[18:25:52] I'm running into some tangles so have needed to delete the new instance a few times.
[18:26:31] * Coren|Busy chuckles.
[18:26:38] means rdwrer ?
[18:26:55] matanya: Why are you migrating the EPL instances?
[18:26:57] I note that the echo notification might be clearer; if you delete an instance it doesn't actually tell you that it's a new copy and not the pmtpa one. :-)
[18:27:13] not etherpad-lite, just etherpad
[18:27:41] matanya: there is an instance called 'etherpad-lite' in the 'etherpad' project
[18:28:03] not mine
[18:28:29] And yet that's getting deleted
[18:28:34] Or...copied. Whatever.
[18:28:48] rdwrer: it has to get migrated sooner or later.
[18:28:55] rdwrer: are you subscribed to labs-l?
[18:29:06] Yeah
[18:29:22] Oh.
[18:29:36] matanya: so… your bug about migration, it just says to migrate instances in etherpad. Did you not mean both of them?
[18:30:03] i think we should migrate both, but i don't own the other one
[18:30:08] Yeah, both
[18:30:19] I didn't understand what you meant by migrating
[18:32:26] rdwrer: https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Howto
[18:32:46] I get it now
[18:32:53] I guess I'll need to do this sometime this month, sigh
[18:33:41] except i'm doing it right now, for etherpad.
[18:33:46] If you have other projects, then yes.
[18:33:59] no xsmall anymore?
[18:35:02] rdwrer: sorry for all the notifications, I'm arguing with openstack
[18:35:16] giftpflanze: you mean the 0-size disk flavor? That hasn't existed in a year.
[18:35:18] What was the bastion for eqiad again?
[18:35:36] w/e
[18:35:51] i'm sure i saw it this very morning in pmtpa
[18:35:54] bastion-eqiad.wmflabs.org
[18:36:01] andrewbogott: Thanks.
[18:36:01] giftpflanze: lemme look
[18:37:01] scfc_de: please put it in topic
[18:38:38] matanya: a) You could do it yourself :-). b) I don't think that all information needs to be in the topic. Could you add it to https://wikitech.wikimedia.org/wiki/Help:Access, please?
[18:39:39] all the color codes in console output make it look very broken …
[18:41:33] scfc_de: a) good to know. b) ok. c) done :)
[18:41:57] matanya: Thanks.
[18:42:22] rdwrer: is there anything in the etherpad projects /data/project that should be preserved?
[18:42:58] andrewbogott: Ugh, I forget, but I want to say "yes"
[18:43:03] ok
[18:43:15] andrewbogott: I'm pretty sure some part of it is important
[18:43:22] No. Wait. That's the orgchart project.
[18:43:50] Migration instructions just sent to labs-l and wikitech-l. Also at https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Tools_Migration
[18:47:44] Coren, andrewbogott; when one of you has a chance; I'd really appreciate you looking at https://rt.wikimedia.org/Ticket/Display.html?id=6969
[18:48:57] mwalker: I'll handle it later today if andrew doesn't get to it first; I want to keep an eye on migration for a little bit.
[18:49:02] *nods*
[18:49:11] thanks
[19:03:43] Coren: there should be a way to migrate and include log files
[19:04:14] Betacommand: There is; just compress them first. But also, most people have multi-GB logfiles so I definitely do not want to do it by default.
[19:04:45] s/most/many/
[19:05:13] Coren: Ive got quite a few small .log files scattered around.... :(
[19:05:40] Betacommand: find . -name '*.log' -exec gzip {} \;
[19:05:52] what would i use in my ssh config (ProxyCommand) to access eqiad? so far it seems to oom my laptop :\
[19:06:10] bastion-eqiad.wmflabs.org
[19:06:15] giftpflanze: ^
[19:06:25] Betacommand: The reverse being find . -name '*.log.gz' -exec gunzip {} \;
[19:07:39] hello
[19:07:51] Coren: how long should it take to move?
[19:08:03] what is the hostname for wikidata?
[19:08:26] trying to tunnel it using mysql workbench
[19:10:08] Superyetkin: You mean the replica DB server? That should be wikidatawiki.labsdb
[19:10:37] what about default schema name?
[19:11:16] *** unknown mysql server host wikidatawiki.labsdb
[19:11:59] Oops, you're right. Moment, please.
[19:12:09] sure
[19:12:52] Coren: will crons get migrated?
[19:13:11] Superyetkin: No, wikidatawiki.labsdb does resolve correctly. Does your Workbench setup work for other DBs?
[19:14:19] yes, I can use trwiki with no problem at all
[19:14:23] mwalker: try now?
[19:14:37] Betacommand: Yes, but commented out by default.
[19:15:03] port 3306 for wikidata?
[19:15:30] Superyetkin: And if you replace trwiki.labsdb with wikidatawiki.labsdb you get an error "unknown mysql server host"? That shouldn't happen.
[19:16:03] Superyetkin: Yes, standard port 3306 for all replica DB servers accessed with the aliases *.labsdb.
[19:16:16] web proxies don't work for new instances on eqiad
[19:16:25] YuviPanda: ^
[19:16:39] 502 Bad Gateway
[19:16:54] unknown mysql server host wikidatawiki.labsdb
[19:17:01] andrewbogott, whoo; it works!
[19:17:03] thanks
[19:17:07] I am confused... spelling issue?
[19:17:34] andrewbogott, I definitely would not have known that this was part of oauth
[19:18:22] Superyetkin: No, if you copied the error message, that's the correct host. Strange. What Tools host do you use in Workbench? tools-login.wmflabs.org? tools-dev.wmflabs.org?
[19:19:59] Superyetkin: wikidatawiki is the same setup as trwiki (well, actually nlwiki as it's on shard s5, while trwiki is on s2).
[19:24:00] jorm: A couple of questions about unicorn: a) Is it working now? b) do you mind if I move and reboot it overnight tonight?
[19:24:05] no way
[19:24:12] hm, is bastion-eqiad fully functioning? i can't connect to my instance from there
[19:24:30] my mysql workbench connection settings screwed...
[19:24:39] it appears to be working now, yes.
[19:24:51] and i'd prefer if you did that, like, *now*.
[19:24:53] giftpflanze: Neither can I.
[19:25:05] jorm: Sure, I can do it now.
[19:25:06] ah, ok
[19:25:09] because i'm going to spin up some user tests, and i can't guarantee when they'll fire off.
[19:25:17] awesome.
[19:25:17] jorm: Does it depend on stuff in /data/project?
[19:25:21] giftpflanze: It should be. Else, there is a bug.
[19:25:32] nope.
[19:25:32] i haven't uploaded anything yet.
[19:25:49] i opt for bug
[19:25:55] Coren: Ends for me with "ssh_exchange_identification: Connection closed by remote host".
[19:26:08] they'll be sitting in teh apache web directory.
[19:26:12] giftpflanze: I'll help you debug in just a minute...
[19:26:16] scfc_de: It may take some time before the first puppet run completes, by the way.
[19:26:28] giftpflanze: What happens when you try?
[19:27:09] using agent forwarding: gifti@bastion1:~$ ssh dwl
[19:27:09] ssh: connect to host dwl port 22: Connection timed out
[19:27:25] * Damianz appears
[19:27:30] using ProxyCommand: getsockname failed: Bad file descriptor
[19:27:33] channel 0: open failed: connect failed: Connection timed out
[19:27:34] giftpflanze: is dwl an eqiad or pmtpa instance? Or both?
[19:27:36] ssh_exchange_identification: Connection closed by remote host
[19:27:39] both
[19:28:05] http://tools.wmflabs.org/paste/view/c5c05cc3
[19:28:31] andrewbogott: But I specifically do "ssh toolsbeta-pbuilder.eqiad.wmflabs", so no ambiguity.
[19:28:35] scfc_de: How long ago was that instance made?
[19:29:03] Coren: 18:33:03Z
[19:29:29] Should be long enough. Lemme see...
[19:29:44] scfc_de: I think something is wrong with dns, try using the ip for a bit until I have a chance to investigate?
[19:29:52] (No Puppet classes or other fancy stuff selected at creation.)
[19:29:59] andrewbogott: Let me try.
[19:30:27] scfc_de: Works for me, with the name even.
[19:30:55] scfc_de: Lemme try to look at the logs.
[19:31:19] scfc_de: Try to log in now?
[19:31:32] I think there was a brief (unexplained) dns failure. Seems better now :/
[19:32:29] all the cool kids use ip addresses
[19:32:54] Both toolsbeta-pbuilder.eqiad.wmflabs and 10.68.16.44 still the same error. Coren, andrewbogott: Are you using bastion-eqiad.wmflabs.org or something WMF-ops only?
[19:33:07] thanks scfc, it worked now
[19:33:21] scfc_de: bastion-eqiad. And I don't even see you /attempting/ to connect to the instance.
[19:33:21] scfc_de: the same
[19:33:40] do you know where the language links are stored in wikidatawiki DB?
[19:33:59] scfc_de: Oh, wait. I lied.
[19:34:49] Superyetkin: I don't; I assume there are some docs for wikidata (and I think there's a mailing list or so).
[19:35:17] scfc_de: Wait; what project is this? toolsbeta?
[19:35:17] ok, thanks anyway
[19:35:22] Coren: Yes.
[19:36:04] When I "ssh toolsbeta-pbuilder.eqiad.wmflabs" on bastion-eqiad this times out, so it should fail due to public key.
[19:36:54] scfc_de: Your security groups for toolsbeta only allow ssh from 10.4.0.0/21
[19:37:11] scfc_de: eqiad is in 10.68.
[19:37:12] :-)
[19:37:23] Sure that's not a default?
[19:37:51] scfc_de: It might well be. I can only speak to its current value.
[19:38:12] scfc_de: Either way, the solution is trivial: fix it. :-) 10.0.0.0/8 is quite reasonable.
[19:38:34] Security groups span both datacenters.
[19:38:43] Coren: Will do. Does that solve giftpflanze's problem as well?
[19:38:45] So you get whatever your current security setup is.
[19:39:00] scfc_de: I don't know. I haven't looked at his.
[19:39:07] giftpflanze: what instance and project is it?
[19:39:13] dwl/dwl
[19:39:46] andrewbogott: So the "pmtpa"/"eqiad" headings at https://wikitech.wikimedia.org/wiki/Special:NovaSecurityGroup are meaningless?
[19:40:05] <^d> andrewbogott: I just moved "lucene-test" project to useless.
[19:40:15] scfc_de: sorry, what I just said is wrong :)
[19:40:15] giftpflanze: check your security groups?
[19:40:24] I duplicated pmtpa security groups in eqiad.
[19:40:28] So they are /currently/ the same.
[19:40:31] But can diverge in the future.
[19:40:36] andrewbogott: k
[19:40:45] ^d: thank you!
[19:41:13] mm, ok
[19:41:48] <^d> andrewbogott: Also "solr"
[19:41:51] <^d> Obsolete testing.
[19:41:57] keep 'em coming!
[19:43:05] why is it that i cannot change security groups but only delete/add them?
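The security-group mismatch described above (ssh allowed only from 10.4.0.0/21, while eqiad hosts sit in 10.68.x.x) can be checked with the Python standard-library `ipaddress` module. This is only a minimal sketch: the helper name `ssh_allowed` is hypothetical and not part of any Labs or OpenStack tooling, and 10.68.16.44 is simply the eqiad address quoted in the log, standing in for any eqiad source host (the bastion's own address is not given).

```python
import ipaddress


def ssh_allowed(source_ip: str, allowed_cidr: str) -> bool:
    """Return True if source_ip falls inside the allowed source range."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(allowed_cidr)


# Assumption: 10.68.16.44 stands in for an eqiad source host.
eqiad_source = "10.68.16.44"

print(ssh_allowed(eqiad_source, "10.4.0.0/21"))  # the existing rule
print(ssh_allowed(eqiad_source, "10.0.0.0/8"))   # the suggested wider rule
```

A /21 starting at 10.4.0.0 only covers 10.4.0.0 through 10.4.7.255, so it cannot match a 10.68.x.x source, whereas 10.0.0.0/8 covers both ranges.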
[19:43:13] And after changing security groups, "ssh toolsbeta-pbuilder.eqiad.wmflabs" works. Thanks, Coren.
[19:43:13] <^d> gerrit I can probably migrate myself.
[19:43:18] <^d> It's just fabricator :p
[19:44:15] ^d: Could you comment on https://bugzilla.wikimedia.org/show_bug.cgi?id=61967#c4 when you have some time, please?
[19:45:05] <^d> User exists in ldap because user was in svn.
[19:45:16] <^d> Doesn't exist in gerrit at all afaik.
[19:46:13] So if he were removed from LDAP, what would be the consequences?
[19:46:43] <^d> Nothing for gerrit.
[19:46:50] <^d> Gerrit doesn't even know the user exists.
[19:47:09] giftpflanze: Because openstack sucks. Just delete the wrong one and create a new one with the right values.
[19:47:30] For that matter, you can also safely create the new permissive one before removing the old one.
[19:47:33] <^d> andrewbogott: From skimming the list that's the only ones I can think of that can be deleted.
[19:51:46] andrewbogott: Fun. Special:NovaSudoer is broken.
[19:52:02] how so?
[19:52:13] andrewbogott: Ah. Apparently, only for tools.
[19:52:25] Gives a no-content page if tools is in my filter.
[19:52:30] There was a recent patch...
[19:53:22] https://gerrit.wikimedia.org/r/#/c/111755/ <- the only recent change in that area
[19:53:35] The migration shouldn't much matter since it's all in ldap
[19:53:55] [yaaay, it works]
[19:54:37] andrewbogott: non-issue for me atm. I'll revisit later when there is less time pressure.
[19:54:43] ok
[19:54:45] but i still get "getsockname failed: Bad file descriptor"
[19:54:58] and "Killed by signal 1" when logging out
[19:56:51] giftpflanze: For "Killed by signal 1", add "-q" to ProxyCommand's ssh.
[19:57:08] i normally do, scfc_de
[19:57:37] Coren: how will I know when the migration is done?
[19:58:02] Betacommand: Just running the migrate-tool command again will give you the current status.
[19:58:29] I use 'watch migrate-tool ' when I want to keep an eye on things. :-)
[19:58:36] jorm: Want to check my work? Verify that http://unicorn.wmflabs.org/whatever does what you expect, and also that you can log in to unicorn.eqiad.wmflabs?
[19:59:16] Betacommand: It's not very fast; I'm being careful to not saturate the network or filesystems. Excpect about 1G/min.
[19:59:32] i have to ssh to unicorn.equiad.wmflabs.org now?
[19:59:42] Coren: I should have less than 1Gb to move :P
[20:00:09] Betacommand: Then it shouldn't be very long before it's done. :-)
[20:00:11] the webserver is working correctly.
[20:00:16] but i'm not able to ssh.
[20:00:54] jorm: unicorn.eqiad.wmflabs
[20:01:02] Coren: whats the new tools-logn name?
[20:01:14] that's not a FQDN.
[20:01:24] jorm: how do you connect to labs instances normally? Do you have a proxycommand set up?
[20:01:37] i type "ssh unicorn.wmflabs.org"
[20:01:48] Right now it's tools-login-eqiad.wmflabs.org; it'll become tools-login.wmflabs.org after the migration. You can also ssh from tools-login to it with the name 'eqiad'
[20:02:06] Betacommand: ^
[20:02:15] Coren: thanks
[20:02:25] jorm: ssh unicorn.wmflabs.org should give you some kind of 'danger! man in the middle attack' warning
[20:02:31] since I pointed it at a new instance...
[20:02:50] i am getting network unreachable.
[20:02:57] ok, let me poke around a bit
[20:03:15] Coren: how dare you call it eqiad the law is qu :P
[20:03:38] Heh. It's 'eq' 'iad'. :-)
[20:04:40] Coren: it doest matter you should have added a u when combining them
[20:04:45] And for those curious: 'eq' = 'Equinix' (IIRC), 'iad' = airport code for Washington; pmtpa = tpa = Tampa.
[20:05:20] jorm, what do you get for $host unicorn.wmflabs.org ?
[20:05:57] (I can't log in either at the moment, but trying to gather info)
[20:06:00] zombieland:Prototypes bharris$ host unicorn.wmflabs.org
[20:06:00] unicorn.wmflabs.org has address 208.80.155.134
[20:06:17] Coren: ... Doesnt work https://dpaste.de/SXR0
[20:06:39] Hmmm.
[20:07:33] Hm.
[20:07:48] That should have kicked in HBA. Lemme try to figure out why not.
[20:08:32] Coren: why does nothing in labs ever work for me?
[20:08:33] jorm, can you ssh now?
[20:08:58] ssh: connect to host unicorn.wmflabs.org port 22: Network is unreachable
[20:09:01] nope.
[20:09:44] huh.
[20:09:49] you're in the office?
[20:10:13] Betacommand: Maybe you should turn off your personal entropy generation field? :-)
[20:11:08] coren: this sentence weird: "This is prepare your tool for migration, and schedule the data copy." ;)
[20:11:20] Coren: I dont have one
[20:11:31] giftpflanze: Oh, forgot to fix it in that script.
[20:11:54] Oh, in the email?
[20:11:59] * Coren headdesks.
[20:12:01] i am not. i'm at home right now.
[20:12:16] in the wiki page
[20:12:47] jorm: Oh! Stranger yet
[20:12:53] It works for me :/
[20:13:00] what ip are you seeing?
[20:13:24] 208.80.155.134 -- same as you, right?
[20:13:30] coren, can you ssh to ^ ?
[20:13:37] aka unicorn.wmflabs.org?
[20:13:51] wow. traceroute is weird.
[20:13:54] yeah, same ip.
[20:14:13] andrewbogott: Yep.
[20:14:14] it goes to here:
[20:14:15] 10 208.80.155.134 (208.80.155.134) 90.277 ms 84.917 ms 85.104 ms
[20:14:20] and then it stars up.
[20:14:33] it's almost like it's trying to go further than the ip i want it to go.
[20:14:38] On "ssh -v 208.80.155.134", I get "Permission denied (publickey).", so works from this side of the pond.
[20:15:01] whoah! what the fuck:
[20:15:02] zombieland:Prototypes bharris$ ping unicorn.wmflabs.org
[20:15:03] PING unicorn.wmflabs.org (208.80.153.170): 56 data bytes
[20:15:08] wrong ip.
[20:15:17] ok, so there's a stale dns entry in there somewhere
[20:15:25] unless you have it set in /etc/hosts
[20:15:27] this is a dns issue.
[20:15:34] i can ssh to the IP correctly.
[20:15:44] that doesn't surprise me much, could take a bit for the change to get to you.
[20:15:53] oh, if the IP works, then...
[20:15:59] well, dig this:
[20:16:02] weird that 'host' gives you a different IP than ssh!
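The symptom above is two lookup paths returning different addresses for the same name: `host` queries DNS servers directly, while `ping` goes through the system resolver, which also consults /etc/hosts and any local cache. A minimal sketch of comparing such answers follows; the helper names `compare_lookups` and `disagree` are hypothetical, and the two lambdas merely replay the addresses seen in the log rather than touching the network.

```python
from typing import Callable, Dict


def compare_lookups(name: str, lookups: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Run each lookup function for name and return {label: address}."""
    return {label: lookup(name) for label, lookup in lookups.items()}


def disagree(results: Dict[str, str]) -> bool:
    """Return True when the lookups produced more than one distinct address."""
    return len(set(results.values())) > 1


# Replay the two answers reported in the log instead of resolving live.
observed = compare_lookups(
    "unicorn.wmflabs.org",
    {
        "host": lambda _name: "208.80.155.134",
        "ping": lambda _name: "208.80.153.170",
    },
)
print(disagree(observed))
```

For a live check, `socket.gethostbyname` from the standard library could be passed as one of the lookups; it uses the system resolver, so it would be expected to follow the same path as `ping` rather than `host`.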
[20:16:08] the browser sees it fine. [20:16:16] than ping, actualy. [20:16:46] Coren: any progress? [20:16:58] host gives me the correct ip; ping goes to the wrong one; chrome goes to the correct server. [20:17:28] Betacommand: Not yet, but in the meantime you can ssh directly to tools-login-eqiad.wmflabs.org; it's the host based auth between the two login boxes that is broken, not access from outside. [20:19:11] Coren: how long is the tools copy waiting list? [20:20:11] giftpflanze: Not very at this moment but there seems to be a fairly large tool currently in progress. [20:20:15] jorm: So… I'm sorta hoping this will just fix itself in half an hour or so... [20:20:17] videoconvert [20:20:23] ok, thx [20:20:27] sounds like things are otherwise working, except for ping and ssh hating you? [20:21:40] Betacommand: Try now. [20:22:11] Coren: give me a sec I closed that session and am restarting a bunch of stuff [20:25:02] kk [20:25:14] i'm heading to the train now [20:25:32] Coren: worked [20:26:24] Betacommand: Yeah, I had forgotton to actually /authorize/ tools-login for HBA. :-) [20:31:01] coren: in migrate-user there is finish-migration mentioned [20:31:14] giftpflanze: Yep, that one you do at the other end. :-) [20:31:24] eh? [20:31:29] In eqiad. [20:31:44] quoting from wiki: " There is no second half of the script to run once it is complete." [20:32:00] Oh! In migrate-*user* [20:32:08] Documentation error. Lemme go fix this. [20:32:46] you can also fix eq*u*iad [20:32:47] Before I go, here's that link one more time, for individual labs projects: https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Howto [20:33:23] giftpflanze: fix't [20:33:56] :) [20:36:09] Coren, you still very busy? [20:36:24] Cyberpower678: For anything not directly migration-related, at least. [20:36:26] Cyberpower678: what u need [20:36:40] btw what is data/scratch :o [20:37:06] petan: It's globally available scratch space for things like large temporary files etc. 
[20:37:25] globally project wide? or labs-wide? [20:37:54] I mean if I put file there where would I see it [20:38:09] petan: labs-wide. It's not backed by redundant hardware, so not good for long-term storage, and because it's labs-wide one should never put confidential stuff there; but it's available and handy. [20:38:19] aha cool [20:38:24] well, labs-wide-in-eqiad. [20:38:29] yes I will use it a lot [20:39:08] Coren: joe is not available in eqiad's tools, should i file a bug? [20:39:17] Coren: that migrate-user should probably tar.gz the old home after copy and remove all stuff so user isn't confused if they are in old or new cluster [20:39:31] they will accidentaly work in old cluster and then they figure they need to resync all their files [20:39:34] giftpflanze: That means it was never put in puppet, so yeah; open a bug please. [20:39:58] petan: Perhaps just changing the prompt in pmtpa to make it clear? [20:40:00] k [20:40:03] I rm -rfed my home just to make sure [20:40:19] petan: I prefer to not touch the pmtpa data if it can be avoided. [20:40:25] yes prompt would do [20:40:38] well, these were my data :P [20:40:43] petan, I get permission denied on cybertools [20:40:53] Cyberpower678: ok... when? [20:40:56] Now [20:40:58] doing become? [20:41:05] I mean what you did before you got it [20:41:09] what command [20:41:13] No. Trying to SFTP stuff [20:41:23] sftp to where, new or old cluster [20:41:28] IDK [20:41:34] I can't tell [20:41:36] which public address you use [20:41:41] tools-login.wmflabs.org [20:41:47] or tools-login-eqiad... [20:42:31] /data/project/cybertools [20:42:39] tools-login [20:42:47] !tools-equiad is new cluster's bastion is 208.80.155.130 tools-login-eqiad.wmflabs.org [20:42:47] Key was added [20:43:08] Cyberpower678: sec [20:43:46] Cyberpower678: I myself would recommend you to migrate your bot first and start using new cluster before you start uploading files to old cluster... 
[20:44:29] Cyberpower678: what client you use [20:44:29] petan, an instruction manual would be delightful. [20:44:39] Coren: what is the public ip address of tools in the grid of eqiad's tool labs? [20:44:49] petan, I use SmartFTP [20:45:06] it seems blocked from freenode, i think [20:45:28] !migration is https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad [20:45:29] This key already exist - remove it, if you want to change it [20:45:33] !migration [20:45:33] https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Progress [20:45:46] !toolsmigration is https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad [20:45:47] Key was added [20:45:47] giftpflanze: Ah, you mean outgoing? Ah; I have not yet set the trick with identd up. It should be up shortly. [20:45:52] !toolsmig | Cyberpower678 [20:45:52] Cyberpower678: https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad [20:45:58] aha … [20:46:23] giftpflanze: I have a bot on new cluster connected to freenode now [20:46:57] petan: tools or not tools? [20:47:05] non-tools [20:47:19] what do i need to do? [20:47:23] but I can migrate replag bot now [20:47:25] to test [20:48:09] petan, thanks. [20:49:30] All DOne. I hope :DD [20:49:31] lol [20:49:38] that script is fun [20:50:52] indeed freenode refuse it [20:50:57] meh [20:51:08] it's here! :o [20:51:11] @replag [20:51:12] Replication lag is approximately 00:00:00.9233100 [20:51:14] @replag [20:51:15] Replication lag is approximately 00:00:00.9235180 [20:51:20] :> [20:51:36] I like how it's approximately in nano seconds :D [20:52:24] giftpflanze: this is on tools ^ [20:53:10] meh …?! [20:53:57] Once the copy is complete, you can run 'finish-migration' [20:53:57] in equiad to finalize the process. [20:54:00] * legoktm spots a typo :) [20:54:16] petan: Too many users. I need to allocate IPs to the exec nodes. [20:54:29] * Coren goes do that now. 
[20:54:37] Coren: or poke freenode staff for exception [20:55:26] mpelletier: So what's the magic incantation then? [20:56:21] :( i could scp from pmtpa-tools-login to eqiad, but now i cannot anymore [20:57:16] magnusmanske: you want to: mv ...DATA.public_html public_html && ( cat ...DATA.crontab | crontab ) && rm ...MIGRATE.STATUS [20:57:36] magnusmanske: After that, you should be able to migrate-tool again [20:57:46] (The earlier command from the tool account) [20:57:51] Coren: tnaks [20:57:55] thanks [20:59:16] Coren: I migrated two of my tools, and it looks like everything went well (yay!), but I don't need to delete anything on the pmtpa side right? [20:59:44] legoktm: Nope, your homes are marked as migrated; if you do nothing they will just go away when we turn pmtpa off. [21:00:18] sweet [21:01:40] Coren: i have a few tools I created at one point, but don't need anymore, is there a way I can just delete them? [21:02:14] legoktm: Hm, no, but I'll provide a script for that in the coming days. [21:02:44] ok, sounds good [21:02:49] thanks for making it super smooth! [21:03:00] I try. :-) [21:03:23] whyyy? i always get permission denied when i try to access my tools's data :( [21:03:30] -s [21:03:51] giftpflanze: You'll need to be more precise than this. What data, what are you doing to access it? [21:04:23] gifti@tools-login:~$ scp -q ~local-giftbot/check.out eqiad:~tools.giftbot [21:06:30] giftpflanze: You may want to use a trailing / there to avoid scp being confused about what the target is. scp -q ~local-giftbot/check.out eqiad:~tools.giftbot/ [21:06:36] giftpflanze: But also, why not use migrate-tool? [21:07:20] don't you remember? *.{out,err,log} aren't copied [21:07:56] Ah. Like the email said, you could also have just gzip'ed them first. :-) [21:08:14] didn't think about it beforehand [21:08:21] But yeah. scp -q ~local-giftbot/check.out eqiad:~tools.giftbot/ should work. [21:09:13] Oh! No it won't. 
:-) [21:09:32] giftpflanze: For some reason, you removed your access to the .out file. :-) [21:09:39] -rw------- 1 local-giftbot local-giftbot 362654 Mar 4 00:03 /data/project/giftbot//check.out [21:10:18] o.o [21:10:24] did i? ^^ [21:10:39] In fact, your permissions are /really/ odd. I think you did a recursive chmod in the past by accident that removed group access to *everything* :-) [21:10:56] interesting [21:11:09] giftpflanze: I can fix 'em if you want. [21:11:20] what would you do to fix it? [21:12:15] find ~local-giftbot -type f -print0 | xargs -0 chmod g+rw && find ~local-giftbot -type d -print0 | xargs -0 chmod g+rwx [21:12:30] omfg … [21:12:37] Then the same in eqiad since those permissions would have been copied. [21:13:02] chmod g=u would do the same? [21:13:29] giftpflanze: I think that might trample over the g+s. [21:13:43] hm [21:14:08] giftpflanze: Ah! But g+u would work. [21:14:22] chmod -R g+u ~local-giftbot :-) [21:14:46] I never tried g+u before; it does exactly what I would have expected. [21:14:57] oh :) [21:15:52] magnusmanske: glamtools is not small :-) [21:16:48] yup, that's what happens if I have to store EVERYTHING myself :-( [21:19:14] Coren: I need access to my tool database mixnmatch_p on wikidatawiki.labsdb [21:19:15] How? [21:19:21] Was not migrated? [21:20:08] Databases on replicas are left in place; your new credentials have full access to it (under its previous name) [21:20:56] If the name is an issue, you can dump, then restore with the new name. There was no really "clean" way of handling this; mysql doesn't allow renaming databases. [21:22:11] Coren: So, it should work as "local-" but not as "tools."? [21:23:48] magnusmanske: ... what? No, your database is named 'p50380g50851__mixnmatch_p'; that's still there and your new replica.my.cnf credentials have full access to it. [21:24:03] Coren: Ah! [21:24:44] * Damianz wonders if Coren has time for that account today [21:24:48] If it were possible, I'd have renamed it to sXXXXX__mixnmatch_p.
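On the `chmod -R g+u` fix discussed above: `g+u` adds the owner's read/write/execute bits to the group and leaves every other bit, including setgid, alone. A rough Python sketch of the same operation, assuming POSIX permissions; the function names are hypothetical:

```python
import os
import stat


def group_plus_user(mode):
    """Return mode with the owner's rwx bits OR-ed into the group bits.

    Mirrors `chmod g+u`: nothing is cleared, so setgid survives.
    """
    user_bits = mode & stat.S_IRWXU   # owner rwx (0o700 mask)
    return mode | (user_bits >> 3)    # shift into the group position


def apply_recursively(root):
    """Rough equivalent of `chmod -R g+u root`; symlinks are skipped."""
    for dirpath, _dirnames, filenames in os.walk(root):
        paths = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in paths:
            if os.path.islink(path):
                continue
            current = stat.S_IMODE(os.lstat(path).st_mode)
            os.chmod(path, group_plus_user(current))
```

For the file shown in the log (mode 0600), `group_plus_user` yields 0660; a setgid directory at 2700 becomes 2770.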
[21:24:58] But it's not. :-( So I did the least bad thing. [21:25:20] Damianz: Sorry, give context again? I probably do, but my head is full of migration. [21:25:50] Remove 2fa from my wikitech account so I can login to the wiki again. [21:26:06] Damianz: Ah. Yes, I can handle that now. What's your wikitech account name? [21:26:27] DamianZaremba wiki, damian shell [21:26:37] Coren: Thanks, works now! [21:29:43] Damianz: Try? [21:29:55] Works, thanks! [21:33:24] does the mapping of ~/public_html to http://tools-eqiad.wmflabs.org/TOOL work yet? http://tools-eqiad.wmflabs.org/asurabot/log/bali.log fails for me with the same .htaccess and file/folder permissions as in tampa [21:34:46] sitic: Did you read the "Difference between eqiad tools and pmtpa tools" section of the web page? The only supported web system in eqiad is the new one: [21:34:48] !newweb [21:34:48] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb [21:35:03] The apaches have been deprecated for months. :-) [21:35:08] ah ok, must have missed that [21:35:27] No worries. Unless you have really fancy stuff in your .htaccess, you can probably just 'webservice start' [21:37:55] Coren: how do you check which host you are connected to? [21:38:33] Betacommand: hostname -f But also, I changed the prompt in pmtpa to be more visibly the old -login [21:38:49] (For new sessions) [21:42:38] Coren: Is there just one tools db? [21:43:20] Damianz: I'm not sure I understand the question. You mean 'tools-db' specifically? There is one per data center; the old one was a VM, the new one runs on actual hardware. [21:43:46] Damianz: The migration script helpfully dumps and restores the databases. [21:44:08] Ah - I didn't know if it was like replicas where it always lived on that side... 
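For anyone who does choose to dump and restore under the new naming scheme, the target name can be derived mechanically. This sketch assumes, based only on the placeholder notation used in the conversation (pYYYYYgXXXXX__name becoming sXXXXX__name), that the number after `g` carries over to the new `s` prefix; verify against your own replica.my.cnf before relying on it. The function name is hypothetical:

```python
import re

# Old-style names look like p<digits>g<digits>__<name>.
_OLD_NAME = re.compile(r"^p\d+g(\d+)__(.+)$")


def new_style_name(old_name):
    """Map an old-style database name to the assumed new-style name."""
    match = _OLD_NAME.match(old_name)
    if match is None:
        raise ValueError("not an old-style database name: " + old_name)
    # ASSUMPTION: the g-number is reused as the s-number.
    return "s{}__{}".format(match.group(1), match.group(2))
```

The dump and restore itself still has to be done with the usual MySQL tools; this only computes the name to restore into.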
[21:44:23] Coren: https://wikitech.wikimedia.org/w/index.php?title=Tool_Labs%2FMigration_to_eqiad&diff=102695&oldid=102693 ;-) [21:44:24] Since I need to test compiled things run properly ideally if pmtpa can connect to new toolsdb it makes migration easier [21:45:09] Damianz: I /think/ it should work (use labsdb1005.eqiad.wmnet) but I make no promises. [21:45:36] Should only need it tmp and can stop the bot if really needed... [21:45:39] * Damianz give sit a gooo [21:45:44] magnusmanske: That went well! Yeay! [21:47:10] Coren: will the webservice auto start from now on? [21:47:56] Betacommand: Not until the end of migration when the eqiad proxy becomes 'the' proxy. At that time, it will try to start the webservice if someone hits the URL but it is down. [21:48:12] (But I can't turn that on if there is an extra proxy in the way; that breaks badly) [21:50:28] I had the choice between disabling that for now or not having tools.wmflabs.org automagically proxy to tools-eqiad for moved tools. Guess which won? :-) [21:52:39] Coren: One item remaining. I am running stuff on a VM assigned to the wikidata people. Ideally, this should be migrated as-is (no puppet though); otherwise, I can set up a clean VM quickly. Do you know who is supposed to migrate this? https://wikitech.wikimedia.org/wiki/Nova_Resource:I-0000073a.pmtpa.wmflabs [21:57:50] magnusmanske: It's possible to migrate instances (Andrew can help) but if you can create a new one instead that's generally better (you'll get a fresh image, and no legacy config) [21:58:29] bah [21:58:44] Can I get my replica user/pass for a tool if I kinda overwrote the file by accident [22:10:16] Hi, I've just migrated itsource into eqiad and my local installation ot internetarchive py module fails giving an error "from requests.auth import AuthBase...ImportError:cannot import AuthBase" [22:11:49] previos pip install of the internetarchive package run happily. I can't imagine what is AuthBase, it comes from pip install. 
[22:12:26] requests is a python http client rest thing ia probably uses [22:13:36] I tried to re-run pip but I didn't find it into eqiad.... [22:14:17] Damianz: The replica.my.cnf will mysteriously reappear if you delete it by accident. :-) [22:14:31] It will also reappear if you delete in on purpose. :-) [22:14:48] Ah [22:14:55] * Damianz makes it dissapear and waits [22:15:21] finish-migration: sudo: sorry, a password is required to run sudo [22:18:04] coren: ↑ [22:18:41] gifti: Huh. Odd; that normally only occurs on very new tools. What tool was this? [22:18:47] ggu [22:19:22] gifti: For some reason, eqiad doesn't think you're a maintainer. Huh. [22:19:36] ehm, aha [22:19:58] I wonder if that's because the maintainers changed while the two infrastructures were separate. [22:20:10] Were you only recently (like, past two weeks) added? [22:20:38] no, i was added from the start iirc [22:20:53] scfc_de: do you remember? [22:22:08] Well, I know just touching the list of group maintainers updated it, so it should work now (you'll have to log on and back on) [22:22:20] Krinkle: ping [22:22:47] Coren: "the credentials for tools are /not/ the same" - so, if I now have a database "p50380g50828__data" and I'll create a new one in two month it will have a fully new preamble? And another tool may have databases starting with the old preamble? [22:23:07] yay [22:23:16] excellent :) [22:23:42] apper: Specifically, tools now have a 'sXXXXX' preamble; but access to the pYYYYYgXXXXX databases is grandfathered in by the migration tool. [22:24:37] jarry1250__: Yes? [22:24:41] Coren: okay, thanks. The same scheme would have really led to confusions ;) [22:24:53] Krinkle: Will you be migrating tsintuition soon? [22:25:00] jarry1250__: That was done months ago [22:25:13] Krinkle: to eqiad, I mean? 
[22:25:13] apper: You may dump older databases and create them with the new preamble if you want - and it may be simpler to do so in the long run - but since some of them can be *very* large that's not a requirement. [22:25:22] jarry1250__: Do I have to do that manually? [22:25:49] Krinkle: Coren sent round a nice email today. Basically yes :) [22:25:55] Krinkle: It's basically two commands; but any listed maintainer can do it. Ref.: the email on labs-l [22:25:58] I don't intend to do that unless I have to. Got enough work to do as it is migrating remaining tools to labs at all. I'm assuming for the moment that for simple web tools labs will not require any manual actions to migrate to eqiad [22:26:41] Krinkle: There will be an automatic migration at the end of the period, yes. [22:27:06] Coren: When will the main domain name switch? [22:27:20] assuming tools-eqiad is not an entry point we intend to publicise [22:27:54] Krinkle: It's planned for 17th; but tools.wmflabs.org automagically proxies to tools-eqiad for migrated tools to hide the switch. [22:28:08] k [22:28:41] Krinkle: It's just that all my tools are erroring now I've switched and tsintuition hasn't. If you add me to maintainers, I'll do it myself (who else uses TsIntuition?) [22:29:06] Coren, does the script copy files across, or copy then delete? [22:29:10] Coren: okay, that's always a possibility, but I will start with the old names ;). Thanks for the information. Migration process sounds good, I'll try it in a few days, hope it works good. are there experiences with how long it lasts? Whats the most time-consuming task? I think dumping and re-importing large databases could last some time?! [22:29:19] jarry1250__: There's deployment scripts I'll have to test, if you broke your tools by migrating before testing, I can't be responsible for that. 
[22:29:41] Since it appears there is no way to test it first, I won't migrate blindly at the risk of breaking my other tools [22:30:14] Krinkle: but you're just going to let them break whenever the automatic switchover is made? [22:30:24] Coren: What does it mean by 'stop cron'? Will it unlist the crontab or disable the user in some way? [22:30:37] Comment it, I think. [22:30:54] jarry1250__: Well, I didn't come up with how this migration is orchestrated. I'll try the best I can within what is given to me. [22:31:16] apper: I don't know about general rules; but Magnus is done migrating a few dozen tools in ~1h, kinks and all. [22:31:42] jarry1250__: It copies and leaves the originals untouched. [22:32:09] Coren: Can I keep downtime of the web service under an hour? Is there a way I can try the copy made to eqiad before having tools.wmflabs.org/intuition switch over? [22:32:16] !log tools petrb: uninstalling apache2 from tools-dev it has nothing to do there [22:32:18] Logged the message, Master [22:32:20] Coren: Does that mean it'll be running on both? [22:32:35] (the cron entries) [22:33:00] Since this tools is both server side and client side (url API, but also php include path for other tools), it's importnat that it stays working on both clusters [22:33:04] tool* [22:33:57] Krinkle: I should have said "mostly" intact. It disables cron by storing the crontab into a file, stops running jobs, and moves the public_html aside. You can then reenable them in pmtpa if you want, keeping in mind that they will then diverge. [22:34:40] Krinkle: The plan wasn't intended for running in both places, but there is nothing that prevents it. [22:35:00] Coren: It's fine for the version in pmtpa to stay fixed at a version. It just needs the file system to stay intact. [22:35:02] And nothing will break by doing so. [22:35:17] !log tools petrb: uninstalling it from -login too [22:35:19] Logged the message, Master [22:35:20] No cron entries needed actually. 
just curious whether they'd stay running in pmtpa as well, I wasn't saying I need it to :) [22:35:39] Krinkle: Then the only thing you have to do after running migrate-tool is rename public_html back and restart your webservice. [22:36:26] Coren: After the migration on eqiad's end, the tools. web server will start proxying, right? So I won't need to restart the web service [22:36:36] Right. [22:36:46] the only thing that would need to remain in pmtpa is the php include path, which I assume won't be touched (/data/mytool/somedir) [22:36:52] It won't. [22:37:20] * Coren goes to dinner, will BRB [22:37:51] Coren: Since it is queued, how is down time handled? Ideally it won't set public_html aside until right before it plucks it off the queue [22:38:02] Depends on how long the queue is of course, are we talking minutes or hours? [22:38:38] it contains a public API used by various gadgets on wikis that people depend on within other tools etc. uptime should be as good as we can. I'm willing to invest a little extra manual effort to ensure this [22:38:45] Screw queuing, migrate all the things [22:38:51] Krinkle: queue was ~2mins for me 10 minutes ago [22:44:10] Different version of jsub in eqiad? [22:44:38] jarry1250__: migrating intuitoin atm [22:44:39] Damianz: Bits are so cheap these days; what's up? [22:45:11] krinkle: Yay :) Thanks [22:45:16] Seems to treat full paths differently.. but it might just be my script. $HOME/cluebotng/run_core.sh ends up as /data/project/cluebot//data/project/cluebot/cluebotng/run_bot.sh: No such file or directory [22:49:01] andrewbogott_afk: unicorn is up and running fine. i had to blow away the entry from the known_hosts file but no biggie. [22:49:07] so thanks! [22:49:14] Damianz: Works for me with a trivial test. [22:49:36] hmm [22:50:14] Ok - jsub is just retarded [22:50:15] Do I run 'finish-migration ' whenever or after the MIGRATE.STATUS file on old tools-login says it is finished? [22:50:24] Krinkle: minutes. 
[22:50:27] If the file isn't executable it throws a screwy message about paths [22:50:41] Krinkle: It'll only allow you to finish-migration once completed. [22:50:45] OK [22:52:09] Krinkle: But no, there public_html is set aside before the copy starts (to avoid some web interface running while the copy takes place) [22:52:32] as long as it doesn't take more than a few minutes.. [22:52:42] Krinkle: Expect ~2min/G [22:53:26] A bit more if it's lots of small files, a bit less if its a few large files. [22:53:28] Coren: just took a look at my cleaned up file system. 337MB :P [22:53:45] Is there a way to "broadcast" a message on the labs from a program, to other programs that are listening for the broadcast? [22:54:11] a930913: what are you trying to do? [22:54:25] a930913: There is no standard way of doing it but lots of possible methods. Depends mostly on what you want to do. [22:54:59] Betacommand: RC from IRC to various programs of mine that consume it. [22:55:15] Rather than each having its own IRC. [22:55:28] Coren: ran migrate 22:39, should something have happened by now? [22:56:07] a930913: how much interaction will there be? [22:56:10] Did you do migrate-tool or migrate-user? Because I see you did the latter some time ago and it's long completed. :-) [22:56:22] Coren: Yeah, I did user first [22:56:31] you could use a file based queue system [22:56:34] I did `migrate-tool intuition` a minute after it [22:56:43] Betacommand: Interaction? [22:56:47] I don't see an queued copy request. Odd. Lemme look into it. [22:57:19] a930913: how much and how often will data be passed from the IRC client to a program [22:57:52] Betacommand: Every edit/log/ w/e, etc. [22:58:17] Krinkle: Ah, interesting. Your migrate-tool failed entirely and is stuck thinking you're still trying to run it. No net effect. I wonder what happened. [22:58:20] a930913: why not just use DB queries then? [22:58:31] Krinkle: You didn't save the output from it by any chance? 
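Since the confusing path error above turned out to come from a script that was not executable, a small pre-flight check before submitting a job can save some head-scratching. A minimal Python sketch; the function is hypothetical and does not call jsub itself:

```python
import os


def check_submittable(path):
    """Return a list of problems that would stop a script from running."""
    problems = []
    if not os.path.isfile(path):
        problems.append("not a regular file: " + path)
    elif not os.access(path, os.X_OK):
        # The execute bit is missing for the current user.
        problems.append("not executable (try chmod +x): " + path)
    return problems
```

An empty list means the file exists and is executable by the current user; anything else is worth fixing before blaming the grid.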
[22:58:51] Betacommand: ClueBot's IRC feed. [22:58:58] Coren: I did [22:59:12] Coren: https://gist.github.com/Krinkle/5fed87140371d6c386c3 [22:59:26] a930913: is that a filtered IRC feed or just the raw feed? [22:59:29] Coren: hold on [22:59:30] wrong one [22:59:36] Betacommand: Raw+ [22:59:39] Yeah, I was about to say...! :-) [22:59:57] "It didn't work because you ^C it, dude!" :-) [22:59:59] a930913: if its raw just use the recentchanges table in the database [23:00:07] Betacommand: + [23:00:21] Betacommand: Extra data there. [23:00:43] a930913: you could throw it into a local database [23:00:47] use that [23:00:56] or log it to a file, and have the bots read the file [23:01:12] Krinkle: I see your request just came in. [23:01:16] Yeah [23:01:30] Proceed with migration? (Yes/No): Yes [23:01:30] . saving and disabling crontab [23:01:30] no crontab for local-intuition [23:01:32] no crontab for local-intuition [23:01:34] . stopping all jobs [23:01:36] local-intuition has registered the job 1453384 for deletion [23:01:37] or try to use sockets to communicate, but that doesnt work across different hosts [23:01:38] no crontab twice? [23:01:44] Betacommand: Anything that doesn't use the fs? [23:01:45] I wonder what that job was, there were no jobs on intution afaik [23:02:23] Betacommand: I'm tending towards sockets, but is there an easy way to broadcast? [23:02:24] a930913: how many child programs are there? [23:02:51] Betacommand: Variable. I.e. I don't want one breaking stopping the others. [23:02:55] no crontab twice because I don't check that there was a crontab saved before flusing it. [23:03:03] I.e.: harmless noise. [23:03:16] Krinkle: The job was your webservice. [23:03:24] Coren: So after finilizing migration (which I assume should also be done under the *user* account on tools-login-eqiad) I shoudl become and run 'webservice start' [23:03:27] Coren: Ah, ok [23:03:50] Krinkle: You assume correctly. 
In fact, intuition has been done copying for a while, you can do it now. [23:03:55] a930913: how variable are you talking? 1-3 20-500 or 500+? [23:04:13] shortcut: 'ssh eqiad' from the pmtpa -login will get you there. [23:04:33] Coren: thats what you say, but it didnt work for me :) [23:04:43] Betacommand: It does /now/. :-) [23:04:43] a930913: We have a Redis server at tools-redis, and you could use that (I think, never tried). [23:05:00] Betacommand: You just find all the bugs before anyone else. You're Tool Lab's crash test dummy. :-P [23:05:12] Betacommand: ~1-5 [23:05:16] a930913: I seriously think a db would be your best option [23:05:23] a930913: That's one way indeed; redis supports message queues. [23:05:39] scfc_de: What is redis? Do I google? [23:05:55] Coren: Ive lost count of the number of "bugs" that Ive ran into [23:06:17] Most labs bugs are features *hehem.. so when are we getting email* [23:06:21] Betacommand: And everyone else who then didn't run into them because you did it first and they got fixed should be thankful. :-) [23:06:29] a930913: a) A key-value store. b) A message queue: You can fill a "channel", and several processes can read that. MediaWiki uses it for recent changes. [23:06:36] Damianz: Right after migration ends. [23:06:41] Coren: I doubt they are :P [23:06:51] a930913: Caveat: As said, not tested by me. [23:07:08] well Im afk [23:07:23] I think cbng is happy in eqiad now... so petan can not complain about huggle being inefficient at scoring :D [23:09:11] The Redis looks good, how do I use it with the labs? [23:09:14] !redis [23:09:14] There is no memcache on tools project, only redis. If you want explanation, talk to YuviPanda [23:09:46] memcache on tools is just a bad idea... yay to no security [23:10:18] Damianz: Are there saboteurs on the labs? :o [23:10:49] There's probably moron...users who will mess with your stuff [23:11:31] Damianz: We have, thankfully, not gotten antisocial users on tools yet. 
[23:13:10] a930913: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Redis [23:13:18] apart from the time production stuff was calling tools web stuff :D but no... I can imagine how there would be though, reading the mailing list etc [23:13:29] !redis del [23:13:29] You are not authorized to perform this, sorry [23:14:06] Damianz: That wasn't malice, that was just a case of poor planning. :-P [23:14:21] it was accidently rather anti social :D [23:16:03] I really should move web stuff to use oauth rather than passwords hmm [23:16:08] * Damianz adds to very long todo list [23:18:27] Coren: I can't seem to find anomiebot's tools-db database on eqiad after doing the migration. [23:18:48] Does redis accept modifications by anybody? [23:19:04] anomie|away: It will have been renamed with your new prefix (sXXXXX__ rather than pYYYYYgXXXXX__) [23:19:30] Well, more precisely, it will have been restored with the new name on the new database. eqiad tools-db is a real server now. [23:20:04] Coren: anomiebot's prefix is s51055, but I don't see any with that prefix when I do "mysql --defaults-file=~/replica.my.cnf -h tools-db" in eqiad and "show databases;" [23:20:16] If I've moved a tool and db over while ignoring your scripts, do I need to do anything to stop it auto copying? Everything is moved into a backup folder so aside from the db a copy wouldn't really be that bad anyway [23:20:55] Damianz: Make sure that it's marked as migrated on https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad [23:21:18] anomie|away: Give me a minute, and I'll look into it [23:23:49] Coren: I wonder if the migration script got confused because anomiebot was old and still using .my.cnf and user "anomiebot" on tools-db in pmtpa? [23:23:59] a930913: Yes; that's why you need to prefix your channels with a random string only known to your tools. [23:24:12] anomie|away: Ah! Yes, that musta been it! 
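The advice above, to prefix channels with a random string known only to your own tools, can be sketched as follows. This assumes a client object with a `publish(channel, message)` method, which redis-py's `Redis` client provides; the channel layout and function names are hypothetical, and JSON is used for the payload because that encoding came up in the discussion:

```python
import json
import secrets


def make_channel(tool, topic, secret=None):
    """Build a channel name with a hard-to-guess prefix.

    Generate the secret once and share it only with your own consumers.
    """
    secret = secret or secrets.token_hex(16)
    return "{}:{}:{}".format(secret, tool, topic)


def publish_change(client, channel, change):
    """Publish one change as a JSON string and return the client's result.

    `client` is anything with publish(channel, message).
    """
    return client.publish(channel, json.dumps(change, sort_keys=True))
```

With redis-py installed (an assumption), `client` could be `redis.Redis(host="tools-redis")`, using the host name mentioned earlier in the log; consumers would subscribe to the same channel name.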
[23:24:12] Since we have the whole tools redis changes thing and huggle/stiki/others using cluebot feeds I should make the bot push there as well hmm maybe [23:24:28] anomie|away: It won't have been able to dump the DB at all then. [23:24:43] * anomie|away will just do it manually then [23:24:44] anomie|away: Sorry, you'll have to do that manually. :-( [23:26:05] scfc_de: So you can't browse? [23:26:31] a930913: I *think* all listing keys/channels commands have been disabled. [23:26:34] Damianz: I was planning on doing that on the layer up, but if you want to do it... [23:26:53] scfc_de: Ah, ok. Thanks btw. [23:28:06] a930913: Well the relay bot can do it easily, since the actual bot sends udp to the relay with the channel and message then the relay splits it. Should be simple to mirror it somewhere else, maybe the multiplexing to 'feeds' could be done externally. No need to read irc though hopefully [23:31:17] Damianz: YuviPanda uses Redis for ... a Gerrit changes thread, IIRC. So he may give some advice if needed. [23:32:44] Damianz: How would it be stored? JSON string in a list? [23:33:37] If you want json I can make the bot spam udp to somewhere with a json encoding of the array... at the point it hits the relay it's the final color encoded string for irc [23:34:58] The problem is that UDP is monodirectional. [23:35:05] Hence the reason for redis. [23:36:40] UDP from the bot is easiest, because it's like OMGLOAD of processes sending the notifications so it's like flood -> feed -> flood [23:38:22] Damianz: But it gets messy why you consider that you'd have to UDP the right labserver. [23:38:51] Considering I already have to know where 5 processes live that's clunnkily handled [23:39:53] Normally, I'd suggest multicast but I don't actually know if our virtual network speaks it. [23:40:20] YuviPanda: Can you tell me about the purposes of testing-cache, pearl-hammer, and multimedia-dragons on the multimedia project in labs? 
I suspect I may be able to drop some or all of them in the eqiad migration. [23:41:05] Hmm, I recall something of a bell packet that is essentially omnicast... [23:41:28] pearl-hammer is obviously for hammering pearls, duh. :-) [23:42:20] Coren: crontabs aren't being copied from tools-dev in pmtpa to tools-dev in eqiad, either. Or disabled in pmtpa. [23:43:00] Oh! No; no crontab from tools-dev is migrated. I didn't even know anyone ran stuff there! [23:43:19] Weren't we supposed to use tools-dev for stuff like that? [23:44:01] anomie|away: Not especially; tools-dev was mostly intended to run tools interactively or build stuff. There's nothing /wrong/ about it, I just never though anyone used it for that. [23:44:22] * Coren adds this to the script, just in case. [23:44:39] * a930913 raises his hand and mutters something about using tools-dev for everything. [23:45:12] * anomie|away runs cronjobs to email log summaries and to rotate logs monthly from tools-dev [23:45:59] * Coren added that to the migration script. [23:47:05] Coren: Mutlicast is lovely... but sometimes you only want something done/recieved once even if it's running twice because tools can get its pants into a twist [23:47:39] Coren: If tools-dev is for building stuff, why is automake installed on tools-login and not on tools-dev? [23:48:16] anomie|away: Because nobody noticed and complained before you just did? :-) Please open a bugzilla, it should really be. [23:48:38] Coren: I don't use it, I just ran my script that makes http://tools.wmflabs.org/anomiebot/packages.html and was looking for weirdness [23:49:18] Coren: what's the command to check the migration status? migrate-tool or just migrate-tool? [23:49:59] Alchimista: 'migrate-tool ' since you can have more than one pending. [23:51:37] anomie|away: 'Packages on some but not all exec servers' would be more useful if you excluded webgrid-* which are special purpose. 
[23:53:57] anomie|away: 'Packages on some but not all exec servers' would be more useful if you excluded webgrid-* which are special purpose. [23:59:18] Damianz: ? [23:59:27] ?