[00:09:05] bd808: it appears I need to reset an OAuth consumer completely. could you delete a64df0838beeb553bb56a33ec35d8e8a and i'll have it's replacement in a min? [00:10:29] AmandaNP: "UTRS OAuth Authenticator" ? [00:10:36] yep [00:11:10] {{done}} [00:13:48] 06Labs, 10Tool-Labs, 10DBA: enwiki_p replica on s1 is corrupted - https://phabricator.wikimedia.org/T134203#2257889 (10Krenair) That's now resolved, but I still have trouble logging in: ```krenair@tools-bastion-03:~$ mysql -h labsdb1001.eqiad.wmnet Welcome to the MariaDB monitor. Commands end with ; or \g.... [00:18:07] 06Labs, 10Tool-Labs, 10DBA: enwiki_p replica on s1 is corrupted - https://phabricator.wikimedia.org/T134203#2970947 (10Krenair) Woah, wait, what? I can get in as `s52299` but not as `u2170`? ```tools.alex@tools-bastion-03:~$ mysql -h labsdb-web.eqiad.wmnet Welcome to the MariaDB monitor. Commands end with ;... [00:20:15] bd808: a3fafb336bbfb21d00f07ce1c52017a5 [00:22:08] AmandaNP: {{done}} [00:42:50] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review, 07Puppet: Puppet failure on instance creation - https://phabricator.wikimedia.org/T156297#2971086 (10Andrew) The interesting part in that log is this: hostname: Name or service not known That means that the instance didn't get a DNS record. That see... [01:00:28] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review, 07Puppet: Puppet failure on instance creation - https://phabricator.wikimedia.org/T156297#2971190 (10Andrew) https://gerrit.wikimedia.org/r/#/c/334224/ (add more threads) doesn't seem to help much... something must be leaking connections. I'll have to dig... [01:21:33] 06Labs, 10Tool-Labs: Unable to connect to new database servers - https://phabricator.wikimedia.org/T156307#2971279 (10russblau) 05Resolved>03Open Still not working; same error message as previously reported (ERROR 1045). 
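ERROR 1045 is MySQL's "access denied" error, so a first debugging step is often to check which credentials the client is actually sending. A minimal sketch, assuming the credentials live in a my.cnf-style INI file with a `[client]` section containing `user` and `password`; the function name and the sample values are made up for illustration and are not taken from the log.

```python
import configparser


def read_client_credentials(cnf_text):
    """Return (user, password) from the [client] section of my.cnf-style text."""
    # interpolation=None so a literal '%' in a password is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cnf_text)
    client = parser["client"]
    # Option files may quote values, so strip surrounding quote characters.
    user = client["user"].strip("'\"")
    password = client["password"].strip("'\"")
    return user, password


# Hypothetical file contents, for illustration only.
SAMPLE = """
[client]
user = 's12345'
password = 'not-a-real-password'
"""
```

Comparing the user returned here with the user named in the error message shows quickly whether a stale file is being picked up.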
[01:43:04] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review, 07Puppet: Puppet failure on instance creation - https://phabricator.wikimedia.org/T156297#2971317 (10Andrew) 05Open>03Resolved a:03Andrew 334228 should resolve this issue (although not the general keystone performance problem.) Delete and recreate y... [01:44:55] 06Labs, 10Labs-Infrastructure: keystone admin api easily overwhelmed - https://phabricator.wikimedia.org/T156337#2971320 (10Andrew) [01:45:57] 06Labs, 10Labs-Infrastructure: keystone admin api easily overwhelmed - https://phabricator.wikimedia.org/T156337#2971334 (10Andrew) My test case is running the 'allprecise2.py' novastats script on labvirt1001. If I set up me env with novaobserver (and the public port, 5000) all is well. But with novaadmin cr... [01:49:29] 06Labs, 10Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2971338 (10JustBerry) [02:03:04] 06Labs, 10Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2971351 (10JustBerry) a:03Cyberpower678 [02:07:05] 06Labs, 10Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2971353 (10Cyberpower678) 05Open>03Invalid This isn't in my control. [02:08:52] 06Labs, 10MediaWiki-Vagrant, 15User-Ladsgroup, 15User-bd808: Vagrant 1.9.1 provision failure on Trusty using role::labs:mediawiki_vagrant - https://phabricator.wikimedia.org/T155196#2971357 (10bd808) With verbose logging I see this output: ``` $ VAGRANT_LOG=DEBUG vagrant up ... lots of stuff that is unrela... [02:16:51] 06Labs, 10MediaWiki-Vagrant, 15User-Ladsgroup, 15User-bd808: Vagrant 1.9.1 provision failure on Trusty using role::labs:mediawiki_vagrant - https://phabricator.wikimedia.org/T155196#2971359 (10bd808) Whatever the problem is, it seems to be related to the Vagrant 1.9.1 package. @Ladsgroup @WMDE-leszek to w... 
[02:38:19] 06Labs, 10MediaWiki-Vagrant, 15User-Ladsgroup, 15User-bd808: Vagrant 1.9.1 provision failure on Trusty using role::labs:mediawiki_vagrant - https://phabricator.wikimedia.org/T155196#2937071 (10scfc) >>! In T155196#2971357, @bd808 wrote: > […] > The `host_ip = read_host_ip` result would be the equivalent of... [02:44:00] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:40:03] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [03:47:06] how does one list the users in a tools.x group? [03:56:10] samwilson: how about "getent group" followed by group name [03:56:42] mutante: yep, perfect! thanks :) [03:56:46] yw [04:06:38] 06Labs, 10Tool-Labs, 06Community-Tech-Tool-Labs, 07Epic: Tools web interface for tool authors (Brainstorming ticket) - https://phabricator.wikimedia.org/T128158#2971434 (10bd808) [04:07:26] 06Labs, 10Tool-Labs, 06Community-Tech-Tool-Labs, 07Epic: Tools web interface for tool authors (Brainstorming ticket) - https://phabricator.wikimedia.org/T128158#2065676 (10bd808) >>! In T128158#2211741, @MZMcBride wrote: >>>! In T128158#2128397, @bd808 wrote: >> Here are a some of the things I find confusi... [04:13:20] PROBLEM - Free space - all mounts on tools-bastion-02 is CRITICAL: CRITICAL: tools.tools-bastion-02.diskspace._public_dumps.byte_percentfree (No valid datapoints found)tools.tools-bastion-02.diskspace.root.byte_percentfree (<10.00%) [04:19:51] samwilson: you can also see them at https://tools.wmflabs.org/?tool=TOOL_NAME [04:20:00] e.g. https://tools.wmflabs.org/?tool=sal [04:22:32] bd808: oh cool, i forgot about that, thanks :) [04:23:33] could i add a link to that in template:tool on wikitech? 
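The `getent group` suggestion above has a direct equivalent in Python's standard `grp` module, which goes through the same system group lookup. A minimal sketch; the function name is illustrative, and like `getent` it only lists members recorded on the group entry itself.

```python
import grp


def group_members(group_name):
    """Return the sorted member list of a Unix group, like `getent group NAME`."""
    # grp.getgrnam raises KeyError when the group does not exist.
    entry = grp.getgrnam(group_name)
    return sorted(entry.gr_mem)
```

On a bastion this would be called with the tool's group name, for example `group_members("tools.sal")`, assuming that group exists on the host.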
[04:24:39] 06Labs, 10Tool-Labs: Request to be added as maintainer to abandoned bibleversefinder/ tool - https://phabricator.wikimedia.org/T91585#2971443 (10bd808) [04:24:46] 06Labs, 10Tool-Labs, 06Community-Tech-Tool-Labs, 06Developer-Relations, and 4 others: Set up process / criteria for taking over abandoned tools - https://phabricator.wikimedia.org/T87730#2971439 (10bd808) 05Open>03Resolved I'm going to declare victory on this task. The committee will have more work to... [04:25:10] samwilson: I don't know who would stop you :) [04:26:31] bd808: i'll try it and see... :) [04:37:50] 10Striker, 15User-bd808: Deploy Striker account creation and management workflow - https://phabricator.wikimedia.org/T156195#2971448 (10bd808) 05Open>03Resolved Announced on wikitech-l: https://lists.wikimedia.org/pipermail/labs-l/2017-January/004881.html [05:03:06] 06Labs, 10Striker, 06Community-Tech-Tool-Labs, 10wikitech.wikimedia.org: Update Tool Labs account creation docs on wikitech to mention Striker - https://phabricator.wikimedia.org/T156340#2971462 (10bd808) [05:04:55] 10Striker: Add link to ssh key generation instructions on Wikitech - https://phabricator.wikimedia.org/T156341#2971475 (10bd808) [05:05:21] 06Labs, 10Striker, 06Community-Tech-Tool-Labs, 10wikitech.wikimedia.org, 07Documentation: Update Tool Labs account creation docs on wikitech to mention Striker - https://phabricator.wikimedia.org/T156340#2971487 (10bd808) [05:09:36] 06Labs, 10Labs-Infrastructure: keystone admin api easily overwhelmed - https://phabricator.wikimedia.org/T156337#2971320 (10bd808) With the uwsgi deploy, what are the various choke points for parallel connections? Apache worker pool -> uwsgi worker pool -> uwsgi worker thread pool? Do we have any instrumentati... 
[05:19:07] !log utrs delete utrs-live and utrs-database per T156297 [05:19:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Utrs/SAL [05:19:12] T156297: Puppet failure on instance creation - https://phabricator.wikimedia.org/T156297 [05:24:23] !log utrs create utrs-production and utrs-database [05:24:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Utrs/SAL [05:25:50] 06Tool-Labs-standards-committee: Figure out how communications and meetings will work for the Tool Labs standards committee - https://phabricator.wikimedia.org/T156075#2971507 (10zhuyifei1999) >>! In T156075#2967774, @Quiddity wrote: > (I have no preference, auto-complete knows all!) +1. Just one thought: there... [05:27:28] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review, 07Puppet: Puppet failure on instance creation - https://phabricator.wikimedia.org/T156297#2971508 (10DeltaQuad) @Andrew It appears recreation was successful, although the number of notices on the console output is crazy. Is that to be expected? [05:46:33] Hello all. I'm having trouble with the python3 Kubernetes instructions [05:46:46] tools.pmidtool@tools-bastion-03:~$ webservice --backend=kubernetes python2 shell [05:46:47] Traceback (most recent call last): [05:46:47] File "/usr/local/bin/webservice", line 163, in [05:46:47] job.shell() [05:46:48] File "/usr/lib/python2.7/dist-packages/toollabs/webservice/backends/kubernetesbackend.py", line 390, in shell [05:46:50] 'interactive' [05:46:52] File "/usr/lib/python2.7/subprocess.py", line 710, in __init__ [05:46:54] errread, errwrite) [05:46:56] File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child [05:49:18] boo. he got flood banned [05:50:11] Sigyn: the stuff from gstupp wasn't really spam. floody, yes; but not spam [05:50:51] Need a 'use a gist' in the topic? 
[05:51:09] :) nobody reads the topic [05:51:21] but yeah we could stick one in there [05:51:41] or phabricator paste [05:51:54] if only the whole world had clients that automatically pastebinned [05:52:07] or the irc protocol wasn't line oriented [05:57:27] Hmm, I'm trying to do a git fetch origin from gerrit, and it's just sitting there [05:57:59] friendly12345: on one of the tools bastions? [05:58:23] nfs and git are a notoriously slow combination unfortunately [05:59:58] someday™ we will have a better storage layer [06:00:18] someday lol [06:00:24] bd808: This is ssh://@gerrit.wikimedia.org, me remotely [06:00:37] I wish we get rid of nfs first [06:01:11] zhuyifei1999_: well we need shared block storage of some form. we can't really get rid if NFS without a replacement [06:01:33] ceph ? [06:01:37] there are plans but they are vague at the moment [06:01:39] yeah [06:01:56] ceph is the most likely answer [06:02:28] * bd808 is yawning a lot so headed to bed [06:02:34] how about scap? :P [06:02:56] learning to do it the production way [06:03:14] heh. scap is the worlds best rsync client but not a solution for OpenGridEngine [06:04:09] https://deis.com/workflow/ or https://www.openshift.org/ are the future :) [06:06:09] Devs throw code over the fence for Ops to clean up. lol [06:30:55] 06Labs, 10Tool-Labs: xtools on Tool Labs: Rep Lag High - https://phabricator.wikimedia.org/T156345#2971545 (10JustBerry) [06:40:43] PROBLEM - Puppet run on tools-exec-1411 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [06:44:07] RECOVERY - Free space - all mounts on tools-exec-1221 is OK: OK: tools.tools-exec-1221.diskspace._public_dumps.byte_percentfree (No valid datapoints found) [06:48:15] 06Labs, 10Tool-Labs, 10Tool-Labs-tools-Xtools: xtools on Tool Labs: Rep Lag High - https://phabricator.wikimedia.org/T156345#2971568 (10zhuyifei1999) https://tools.wmflabs.org/replag/ shows no replag. 
Does xtools determind replag with [[https://en.wikiversity.org/wiki/Special:RecentChanges|recent changes]]?... [06:50:19] 06Labs, 10Tool-Labs, 10Tool-Labs-tools-Xtools: xtools on Tool Labs: Rep Lag High - https://phabricator.wikimedia.org/T156345#2971545 (10jcrespo) I can confirm no lag is real: ``` root@localhost[(none)]> SELECT * FROM heartbeat_p.heartbeat; +-------+----------------------------+------+ | shard | last_updated... [06:55:45] 06Labs, 10Tool-Labs, 10Tool-Labs-tools-Xtools: xtools on Tool Labs: Rep Lag High - https://phabricator.wikimedia.org/T156345#2971580 (10JustBerry) //Update:// Caution: Replication lag is high, changes newer than 20 minutes may not be shown. @zhuyifei1999 Seems like that might not be the case, as the most re... [06:56:14] 06Labs, 10Tool-Labs, 10Tool-Labs-tools-Xtools: xtools on Tool Labs: Rep Lag High - https://phabricator.wikimedia.org/T156345#2971584 (10JustBerry) Adding back subscribers. Unintentional. [06:57:09] PROBLEM - Puppet run on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [06:57:47] 06Labs, 10Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2971586 (10JustBerry) a:05Cyberpower678>03None Per previous comment. [06:58:53] 06Labs, 10Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2971589 (10JustBerry) //Update:// "High Replication Lag Oh dear. The database appears to be lagging behind. I won't be able to present you with information newer than 63 seconds." @jcrespo may be able to veri... [06:59:34] 06Labs, 10Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2971591 (10JustBerry) 05Invalid>03Open Problem remains, unresolved, remove assignee accordingly. 
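The heartbeat query quoted above is one way to tell real replication lag from a tool's own estimate. A minimal sketch of turning a `last_updated` value into a lag in seconds; the table output is truncated in the log, so the timestamp format used here (an ISO-8601 string in UTC without an offset) is an assumption, as is the function name.

```python
from datetime import datetime, timezone


def heartbeat_lag_seconds(last_updated, now=None):
    """Seconds elapsed between a heartbeat timestamp and `now` (both UTC)."""
    # Assumption: last_updated is ISO-8601 in UTC with no offset,
    # e.g. "2017-01-26T06:50:00".
    beat = datetime.fromisoformat(last_updated).replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - beat).total_seconds()
```

Passing `now` explicitly keeps the calculation reproducible, which helps when comparing a tool's reported lag with a value read at a known time.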
[07:15:43] RECOVERY - Puppet run on tools-exec-1411 is OK: OK: Less than 1.00% above the threshold [0.0] [07:32:09] RECOVERY - Puppet run on tools-webgrid-lighttpd-1414 is OK: OK: Less than 1.00% above the threshold [0.0] [08:36:16] 06Tool-Labs-standards-committee: Figure out how communications and meetings will work for the Tool Labs standards committee - https://phabricator.wikimedia.org/T156075#2971652 (10Matanya) I volunteer to be the list admin/spam handler. I agree quterly meetings is a good idea. I think we should also create a lis... [09:28:16] 06Labs, 10Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2971785 (10hoo) [09:40:41] (03PS1) 10Giuseppe Lavagetto: Adding private data for configcluster in codfw [labs/private] - 10https://gerrit.wikimedia.org/r/334255 [09:43:19] (03CR) 10Giuseppe Lavagetto: [V: 032 C: 032] Adding private data for configcluster in codfw [labs/private] - 10https://gerrit.wikimedia.org/r/334255 (owner: 10Giuseppe Lavagetto) [09:46:51] 10Tool-Labs-tools-LTA-Knowledgebase: Read directly from replica.my.cnf - https://phabricator.wikimedia.org/T156352#2971862 (10Samtar) [09:48:49] 10Tool-Labs-tools-LTA-Knowledgebase: Write user guide - https://phabricator.wikimedia.org/T156353#2971877 (10Samtar) [10:05:05] 06Labs, 10MediaWiki-Vagrant, 15User-Ladsgroup, 15User-bd808: Vagrant 1.9.1 provision failure on Trusty using role::labs:mediawiki_vagrant - https://phabricator.wikimedia.org/T155196#2971936 (10WMDE-leszek) @bd808 thanks for looking into this! Downgrading Vagrant helped so far. I'd be more than happy to hel... [10:07:13] (03CR) 10jenkins-bot: Localisation updates from https://translatewiki.net. 
[labs/tools/heritage] - 10https://gerrit.wikimedia.org/r/334262 (owner: 10L10n-bot) [10:59:13] 06Labs, 06Operations, 10netops: asw-c2-eqiad reboots & fdb_mac_entry_mc_set() issues - https://phabricator.wikimedia.org/T155875#2972065 (10Marostegui) Hi, The pending work of: T156008 shouldn't be a blocker to replace the switch. The switchover was done, and only pending to move dbstore1001 to replicate f... [11:41:55] I'm having a bit of trouble accessing the social tools instances on Labs. I've followed what I think is the right documentation, but I get closed by remote host messages. [14:05:45] thanks andrewbogott for the review! there isn't quite enough space atm on graphite production for 90d, cc chasemp re: https://gerrit.wikimedia.org/r/#/c/334342/ [14:39:39] 06Labs, 10Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2971338 (10valhallasw) The message doesn't show replag, but the //time since the last edit in the replica database//. On high-volume wikis, these are roughly equivalent (because the time between edits is < 1s)... [14:42:59] thanks for handling it godog [15:01:47] 06Labs, 10Labs-Infrastructure: keystone admin api easily overwhelmed - https://phabricator.wikimedia.org/T156337#2972768 (10chasemp) p:05Triage>03High [15:05:52] chasemp: no worries! [15:08:41] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [15:45:44] madhuvishy: can I steal https://phabricator.wikimedia.org/P4805 ? ;) [15:46:56] 06Labs, 10Tool-Labs, 06Community-Tech-Tool-Labs: Make a nag system to email maintainers of tools still running on precise gird hosts - https://phabricator.wikimedia.org/T149214#2972886 (10bd808) a:03madhuvishy [15:51:05] 06Tool-Labs-standards-committee: Figure out how communications and meetings will work for the Tool Labs standards committee - https://phabricator.wikimedia.org/T156075#2972895 (10bd808) >>! 
In T156075#2971652, @Matanya wrote: > I think we should also create a list of tools still on precise and work on a roadmap... [16:00:08] PROBLEM - Free space - all mounts on tools-exec-1221 is CRITICAL: CRITICAL: tools.tools-exec-1221.diskspace._public_dumps.byte_percentfree (No valid datapoints found)tools.tools-exec-1221.diskspace.root.byte_percentfree (<55.56%) [16:03:41] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0] [16:08:55] !log tools major cleanup for stale var items on tools-exec-1221 [16:08:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [16:17:33] Is there someone around able to approve an OAuth consumer? [16:18:12] AmandaNP: o/ [16:18:22] \o [16:18:29] bd808 lives! [16:18:30] looking now [16:18:43] AmandaNP: did that callback work? :) [16:18:44] it's litterally an extension of yesterday [16:18:57] samtar: we'll see [16:19:11] and looks like someone beat you to it bd808 [16:19:26] that darn halfak :) [16:19:36] thanks halfak :) [16:20:09] RECOVERY - Free space - all mounts on tools-exec-1221 is OK: OK: tools.tools-exec-1221.diskspace._public_dumps.byte_percentfree (No valid datapoints found) [16:22:01] samtar: looks like it may have worked, but I got fatals on my end that I suspect is my tool despising me [16:23:09] Good stuff :) Anomie updated the comment in the example code [16:23:55] yep I saw that [16:24:37] hmm /me might have caused a session conflict [16:25:06] zhuyifei1999_: sure! [16:25:26] what license is it under? [16:25:28] Lcawte: hi, what instances are you looking at? [16:26:04] AmandaNP: have you used the example code? 
I had some fun with session_names/IDs when I was trying to get stuff working yesterday [16:26:44] ya I'm using an offshoot of the code that LFaraone worked on like 2 years ago [16:26:46] zhuyifei1999_: yeah - i think i should add one [16:27:50] madhuvishy: social-tools(1 & 2) [16:31:40] Samtar: I get Notice: A session had already been started - ignoring session_start() in /usr/ALPHAgit/public_html/src/oauth.php on line 200 [16:31:41] Fatal error: Class 'User' not found in /usr/ALPHAgit/public_html/login.php on line 132 [16:32:06] but then I remove the oauth keys from the URL and it works [16:32:22] like i'm logged in [16:32:27] oh really? o.O [16:32:35] and I can't find this User class on line 132 [16:33:19] Is the source available? [16:33:32] oh cause i'm in the wrong file [16:33:34] duh [16:33:59] :D [16:34:01] samtar: https://github.com/UTRS/utrs/tree/pr/57/public_html [16:34:19] UTRS is getting OAuth?? [16:34:22] * bd808 is super happy to see people helping each other out in this channel :D [16:37:19] yep [16:37:24] finally like 2 years late [16:38:14] zhuyifei1999_: updated https://phabricator.wikimedia.org/P4805 [16:38:32] thx [16:38:41] Lcawte: ah yes I can't login as root either [16:38:51] ok I fixed that fatal. Someone forgot to update a class name which worries me what else fatals that I don't know about -_- [16:39:54] samtar: so if I just want to verify the user has previous given access instead of the allow button every time they login, what do I need to do now? [16:44:08] you should place the token from the callback (on allowing access) into a session? But if you're asking for every time someone logs in, I *think* they do need to be prompted each time (I do for when I log into Phab with OAuth?) 
- currently my tool asks every time, gets the identity of the editor, checks against a table and then proceeds [16:45:41] bd808 might be able to correct me though ^ [16:45:51] https://www.mediawiki.org/wiki/OAuth/For_Developers#Avoid_repetitive_login_prompts [16:46:07] * AmandaNP looks [16:46:14] If it's auth-only that will work [16:46:32] if the grant has more rights then a prompt will still happen [16:47:03] bd808: will that work for mine, identity + email/full name? [16:47:09] yes [16:48:06] I'm sure I've been able to log in to this labs instance before :/ [16:49:01] hmm... /me eyes existing code [16:49:22] oh I can't read [16:49:26] * AmandaNP checks again [16:50:26] there is no authenticate in the code... [16:52:15] my non-root key works just fine on social-tools1 [16:52:32] unless luke took it out [16:52:48] and on social-tools2 [16:53:00] let me get puppet updated on those boxes and then we'll see what the access log looks like... [16:53:30] nope. samtar if your using that code, it doesn't have that function [16:55:57] I think i'll just save it for an enhancement later [16:57:19] 06Labs, 10Tool-Labs, 06Tool-Labs-standards-committee: Create a wall for tools migration to trusty - https://phabricator.wikimedia.org/T156386#2973168 (10zhuyifei1999) [16:58:09] 06Labs, 10Tool-Labs: Unable to connect to new database servers - https://phabricator.wikimedia.org/T156307#2973200 (10chasemp) 05Open>03Resolved @russblau you seem to have old credentials which are only valid through happenstance (and only on the old cluster) in `.my.cnf` as someone copied a now obsolete r... [17:01:01] Lcawte: Can you try to log in to social-tools1 while I watch the auth log? [17:06:23] bd808: uuuuh, how does striker handle 2fa? [17:06:34] i guess that's what breaks login for me [17:06:59] annika: it uses action api calls to talk to wikitech [17:07:50] annika: are you getting an error message or just not getting past the 2fa token entry screen? 
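The "authenticate" step discussed in the OAuth exchange above comes down to which special page the user is redirected to after the consumer obtains a request token. A rough sketch, based on one reading of the linked "Avoid repetitive login prompts" section: identity-only consumers redirect to `Special:OAuth/authenticate` rather than `Special:OAuth/authorize`. The function name, the example base URL and the token values are hypothetical, and the exact URL layout should be checked against the OAuth documentation before use.

```python
from urllib.parse import urlencode


def oauth_redirect_url(wiki_base, request_token, consumer_key, identify_only=True):
    """Build the URL a user is sent to after a request token is obtained."""
    # Assumption: identity-only consumers may use the "authenticate" page,
    # which can skip the approval prompt for users who already granted access.
    page = "Special:OAuth/authenticate" if identify_only else "Special:OAuth/authorize"
    query = urlencode({
        "oauth_token": request_token,
        "oauth_consumer_key": consumer_key,
    })
    return f"{wiki_base}/wiki/{page}?{query}"
```

As bd808 notes, a consumer whose grant covers more than identity will still trigger a prompt, so the `identify_only` switch only makes sense for authentication-only consumers.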
[17:08:16] i get a mask for user name and password, and after submission: Please enter a correct username and password. Note that both fields may be case-sensitive. [17:08:42] i tried lower and upper case to no avail [17:08:51] 06Labs, 10Tool-Labs, 06Tool-Labs-standards-committee: Create a wall for tools migration to trusty - https://phabricator.wikimedia.org/T156386#2973268 (10zhuyifei1999) I think P4805 can bootstrap the wall. **Problems:** # Migration complete (which could be a webservice on k8s) vs abandoned/inactive tool tha... [17:08:59] ok. that's before the 2fa check. The username should be your Wikitech account name [17:09:09] So for me that is BryanDavis [17:09:12] i used it [17:13:20] 06Tool-Labs-standards-committee: Figure out how communications and meetings will work for the Tool Labs standards committee - https://phabricator.wikimedia.org/T156075#2973294 (10zhuyifei1999) >>! In T156075#2972895, @bd808 wrote: > @madhuvishy has been making some progress towards creating a list based on a com... [17:14:34] annika: what's your shell name? I'll check the ldap record to see if it is funky [17:17:21] Lcawte: tools-social2 should be fine now [17:17:46] bd808: Gifti [17:18:03] well, gifti [17:18:15] the other is the wikitech user name [17:19:04] *nod* the LDAP record looks ok. We have some users with different "cn" and "sn" values that have had problems, but both of yours are the same "Gifti" value [17:19:27] So on toolsadmin your username should be Gifti [17:19:40] ok [17:19:50] and the password should be the same one you use to log into wikitech and gerrit's web ui [17:20:00] mhm [17:24:54] 06Labs, 10Tool-Labs, 06Tool-Labs-standards-committee: Create a wall for tools migration to trusty - https://phabricator.wikimedia.org/T156386#2973319 (10bd808) >>! In T156386#2973268, @zhuyifei1999 wrote: > **Problems:** > # Migration complete (which could be a webservice on k8s) vs abandoned/inactive tool t... 
[17:29:40] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:38:56] Lcawte: tools-social1 too [17:42:38] bd808: 'Please do not start making a "big bag of tools" tool account.' <= I wish deleting a tool account is cheaper [17:43:22] it would become yet another useless tool after the migration is complete... [17:43:46] zhuyifei1999_: we will fix that at some point, but having the LDAP entry laying around really doesn't hurt anything [17:44:18] zhuyifei1999_: agreed, that's a thorn in our paw [17:44:59] AFAIK the general issue is that we don't know exactly all the bits that need to be cleaned up [17:59:30] 10Tool-Labs-tools-Xtools, 03Community-Tech-Sprint: Investigation: Plan for rewriting XTools - https://phabricator.wikimedia.org/T154551#2973438 (10DannyH) p:05Normal>03High [18:04:43] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:18:16] 10Tool-Labs-tools-stewardbots, 07WorkType-Maintenance: Evaluate cleanup on StewardBot's code - https://phabricator.wikimedia.org/T149404#2973476 (10MarcoAurelio) [18:33:40] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review, 07Puppet: Puppet failure on instance creation - https://phabricator.wikimedia.org/T156297#2973537 (10Andrew) I'm not sure I know what you mean by the 'console output'. If you're talking about the system log then, yes, it always tells you a lot. [18:38:41] 06Labs, 10Tool-Labs, 10DBA: Labs users reporting timouts when connecting to labsdb-web.eqiad.wmnet - https://phabricator.wikimedia.org/T156285#2973566 (10Superyetkin) Database connection works but queries end up with the following error. ``` SELECT command denied to user 's51698'@'10.64.37.15' for table 'wb... 
[18:40:22] 06Labs, 10Tool-Labs, 10DBA: Labs users reporting timouts when connecting to labsdb-web.eqiad.wmnet - https://phabricator.wikimedia.org/T156285#2973589 (10jcrespo) @Superyetkin Can you provide the server and the full query used? [18:55:08] 06Labs, 10Tool-Labs, 10DBA: Labs users reporting timouts when connecting to labsdb-web.eqiad.wmnet - https://phabricator.wikimedia.org/T156285#2973675 (10Superyetkin) mysql_select_db fails with "Access denied for user 's51698'@'%' to database 'trwiki_p'" on [[ http://tools.wmflabs.org/superyetkin/test.php |... [18:56:56] andrewbogott: madhuvishy - Sorry, went AFK. No luck still. [18:58:01] Lcawte: try social-tools1 again please? [18:58:25] Done. [18:58:33] what os are you using? [18:59:03] Ubuntu 16.04 (Desktop) [18:59:36] can you do ssh -vvv and then paste the output someplace? [18:59:43] Also, can you contact any other systems in labs? [19:00:06] It looks to me like your config is incorrect, since you aren't even getting as far as attempting a login on the actual social-tools1 box [19:01:25] andrewbogott: https://pastebin.com/gTb7Vwac [19:03:02] Lcawte: ok, you can see from that output that you're failing to connect to the bastion in the first place. So let's focus on that for now... [19:03:47] can you just try to ssh -vvv primary.bastion.wmflabs.org and see what that says? [19:05:20] 06Labs, 10Tool-Labs, 10DBA: Labs users reporting timouts when connecting to labsdb-web.eqiad.wmnet - https://phabricator.wikimedia.org/T156285#2973711 (10jcrespo) @Superyetkin trwiki is s2, which is not yet part of the available wikis. right now only enwiki and the 800 s3 wikis are available. s2, s4, s5, s6... [19:06:23] andrewbogott: https://pastebin.com/QFmYvTfZ [19:07:01] Lcawte: it looks to me like you are trying to connect with username 'lewiscawte' but that's not your labs username is it? [19:07:28] andrewbogott: Well I tried it by specifying my LDAP username (it's further down the paste...) that seems to go a bit better. 
[19:07:46] The ProxyCommand SSH config is meant to specify LDAP username to SSH though, right? [19:07:57] bd808: ping [19:08:15] https://wikitech.wikimedia.org/wiki/Special:Contributions/Dosken_300 [19:08:20] A bit better, in a sense that I authenticate to bastion/primary.bastion. [19:08:41] I however blocked https://wikitech.wikimedia.org/wiki/Special:Contributions/Doctor_Satori [19:10:07] Lcawte: so… of course I don't know what your ssh config looks like. But you definitely need to auth as lcawte. And that works for the bastion... [19:10:12] which means that your key is set up correctly [19:10:14] TabbyCat: I'm tired of that person whoever they are :/ [19:10:18] So all the pieces are there [19:10:42] bd808: can Dosken_300 be blocked too? [19:11:01] bd808: it'd help to CU those accounts and set IP/range blocks as appropriate [19:11:02] andrewbogott: My SSH config looks a lot like https://wikitech.wikimedia.org/wiki/Help:Access#Accessing_instances_with_ProxyCommand_ssh_option_.28recommended.29 ... if you were to replace with lcawte [19:11:15] TabbyCat: all contribs are abuse spam, so yes [19:12:21] bd808: so done both :) [19:12:40] maybe an abusefilter should be setup there [19:12:51] Lcawte: you are going to have to debug your proxy command locally. Most likely it's failing just because you're not doing lcawte@ but it could be something else. The important difference is that when you were trying to connect to social-tools1 the initial connection to bastion was failing, even though we've established that it's possible for that to work [19:13:20] * samtar heard the words abusefilter.. [19:16:50] 06Labs, 10MediaWiki-extensions-Page_Forms, 10wikitech.wikimedia.org: Accesing Special:FormEdit gives a blank empty page - https://phabricator.wikimedia.org/T156406#2973734 (10MarcoAurelio) [19:17:03] TabbyCat: that would be useful. I've never learned how to set that up. [19:17:17] bd808: which thing? AF? 
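The diagnosis above is that the hop through the bastion has to use the labs shell username, not the local desktop username. A minimal sketch that builds an equivalent one-off `ssh` command line, assuming the standard OpenSSH `-o ProxyCommand=` option and `-W` forwarding; the function name and any target hostname passed in are placeholders, and only the bastion name comes from the log.

```python
def ssh_via_bastion(user, target_host, bastion="primary.bastion.wmflabs.org"):
    """Build an ssh argv that reaches target_host through the bastion."""
    # -W %h:%p makes the bastion forward stdio to the final host and port;
    # -a disables agent forwarding on the proxy hop.
    proxy = f"ssh -a -W %h:%p {user}@{bastion}"
    # The same username is given explicitly on both hops.
    return ["ssh", "-o", f"ProxyCommand={proxy}", f"{user}@{target_host}"]
```

The returned list can be handed to `subprocess.run`, or printed and compared with the `ssh -vvv` output to confirm which username each hop is using.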
[19:17:28] yeah [19:17:57] Well, I've created some, I guess I could help. But there are really good people for doing that [19:17:59] Its on wikitech but rules aren't really maintained by anyone AFAIK [19:18:15] musikanimal for example is very good with filters [19:18:45] well thank ya :) [19:19:14] spam on wikitech? [19:19:14] andrewbogott: Should I be able to log in to bastion and then social-tools from there? [19:19:34] Lcawte: not without a proxycommand [19:19:44] unless you do key-forwarding stuff with is a whole other can of worms [19:19:54] (I'm in a meeting now, sorry) [19:34:15] musikanimal: vandalism, etc, yep [19:35:05] there's a bunch of general vandalism/spam/etc filters you could import from meta/enwiki that would probably help [19:37:32] I'm happy to help if you need :) [20:09:36] 10Tool-Labs-tools-LTA-Knowledgebase: Change from MD5 to bcrypt - https://phabricator.wikimedia.org/T155948#2959798 (10DatGuy) Is this still needed after the conversion to OAuth? [20:16:27] 10Tool-Labs-tools-LTA-Knowledgebase: Change from MD5 to bcrypt - https://phabricator.wikimedia.org/T155948#2974023 (10DatGuy) 05Open>03declined Declined - Unneeded per discussion on IRC [20:18:00] 10Tool-Labs-tools-LTA-Knowledgebase: Migrate to OAuth - https://phabricator.wikimedia.org/T155841#2974031 (10DatGuy) 05Open>03Resolved Aside from some tidying up, this task is pretty much done. Feel free to reopen if there is anything major. [20:21:33] Hey folks. [20:21:46] I'm looking to load a big dataset into labsdb. [20:22:01] I've been talking to jynus about it here: https://phabricator.wikimedia.org/T146718 [20:22:21] He told me at the dev summit that there are now some new db machines which means that I can start loading the data. [20:22:38] Any special considerations or do I just work in my user_db like normal? 
[20:22:39] well, the first part is true [20:22:45] we do not have, however [20:22:50] a method [20:23:01] 10Tool-Labs-tools-LTA-Knowledgebase: Add continuous integration - https://phabricator.wikimedia.org/T155944#2974057 (10DatGuy) Check your email for proposals [20:23:04] to easily import [20:23:17] Was thinking of "mysqlimport --local ..." [20:23:35] that is the point, that is not yet supported [20:23:43] chase wants to revisit that [20:23:58] on the new servers, I mean [20:24:12] if you need the now [20:24:31] give me the files and I will import them for you [20:24:38] *them [20:24:53] until we have a proper procedure [20:25:17] OK great! I'll have a link to them shortly. [20:25:19] the servers are very much in beta, undocumented [20:25:34] we should decide with you a structure, too [20:25:43] like datesets_p or something db? [20:25:52] would that work? [20:25:59] Arg. I need to drop a column to save space. I'll need to do that and get back to you. [20:26:05] datasets_p would be great! [20:26:08] brb [20:27:13] 10Tool-Labs-tools-LTA-Knowledgebase: Fix password hashing - https://phabricator.wikimedia.org/T155934#2974069 (10Samtar) [20:27:15] 10Tool-Labs-tools-LTA-Knowledgebase: Automatically fail authentication for any password exceeding 4096 bytes - https://phabricator.wikimedia.org/T155946#2974067 (10Samtar) 05Open>03declined No longer required as we have migrated to OAuth [20:28:01] jynus: also what's the status of getting s1 on the new labsdbs? for enwiki primarily [20:28:15] s1 is available [20:28:19] and s3 [20:28:25] 36 wikis are left [20:28:39] but we were super-busy lately [20:28:52] with high-priority tasks [20:29:10] jynus: w00t, didn't realize s1 is available! 
[20:29:19] the most important part is documentation [20:29:22] * yuvipanda nods [20:29:31] we will not make them official until they are documented [20:29:52] * yuvipanda nods [20:30:05] but halfak can test it out I guess ;) [20:30:15] (and I need to move Quarry to tools and point that instance at new labsdbs) [20:30:25] aaa so many things to do [20:30:28] sure [20:30:40] OK back but I have to hop into a meeting. [20:30:44] the biggest issue is the writes [20:31:05] eventually we want for example users like halfak to share things without needing help [20:31:15] but that makes HA more complex [20:31:38] * yuvipanda nods [20:31:58] maybe we can setup custom replication channels [20:32:03] for responsible users [20:32:29] where responsible means for me using InnoDB mostly :-) [20:32:31] yeah, it's gotta be a whitelist [20:32:58] and go through a process [20:32:59] jynus: I still think we should just force everyone to use InnoDB :) [20:33:08] well, yes, no favoritisms [20:33:25] the thing is, we do not have yet a method [20:33:33] so first, finishing the imports [20:33:40] so all wikis are available [20:33:59] then tuning so it is super-stable [20:34:00] brb landing! [20:34:08] cya in however long it takes us immigration to process me [20:34:09] then more features [20:34:17] bye [21:24:48] jynus, just got a chance to come back and look at this. I' [21:24:53] ll need some indexes. [21:25:13] Would it be OK if I just proposed the SQL statement to create them in the phab task? [21:44:07] Labs, MediaWiki-extensions-Page_Forms, wikitech.wikimedia.org: Accesing Special:FormEdit gives a blank empty page - https://phabricator.wikimedia.org/T156406#2973734 (Yaron_Koren) I'd call this a rather minor problem, but I believe that upgrading to a more recent version of this extension, i.e. Page...
[21:54:00] Labs, MediaWiki-extensions-Page_Forms, wikitech.wikimedia.org: Accesing Special:FormEdit gives a blank empty page - https://phabricator.wikimedia.org/T156406#2973734 (bd808) @MarcoAurelio was there something particular you were trying to work on, or did you just hit that page and notice the unexpecte... [22:12:05] Lcawte: sorry I ducked out… I have a bit of time to help now if you're still stuck, otherwise we can give this another go tomorrow. [22:14:15] Labs, MediaWiki-extensions-Page_Forms, wikitech.wikimedia.org: Accesing Special:FormEdit gives a blank empty page - https://phabricator.wikimedia.org/T156406#2973734 (scfc) I witnessed something like that recently when I went to (for example) https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Ac... [22:19:37] Labs, Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2974365 (JustBerry) @valhallasw Thanks for clarifying. [22:20:09] Labs, Tool-Labs: supercount @ Tool Labs: High Rep Lag - https://phabricator.wikimedia.org/T156338#2974369 (JustBerry) Open>Resolved a:JustBerry [22:30:51] andrewbogott, been chasing the leader of the Wikibrain IEG again recently. Has there been any progress made on labs tools that need access to large chunks of disk space yet? (500GB) [22:30:59] I think he wanted to make a big memory-mapped file. [22:33:31] halfak: not to speak for andrewbogott, but I think that would just be a matter of finding a libvert server that had that much disk available and making a custom VM image profile for the instance. [22:34:00] if it needs to be shared across multiple VMs that's a bit trickier. NFS is our only solution for that today [22:35:31] halfak: bd808 is right, except 500GB is a bigger use case than really anything we currently support. It's possible but outside the scope of what labs currently does.
[22:36:02] And of course I wouldn't recommend that you store anything there that you wouldn't be able to regenerate at a moment's notice. [22:36:40] yeah. the big problem with a single libvert is that we have no backup plan if that hardware dies [22:36:58] we need ceph :) [22:37:03] bd808: yeah, also one of resources :) [22:38:02] true enough. finding $$ for hardware is doable, but finding people to install and maintain software is a growing problem [22:38:33] I'll write to the Ministry of Magic and see if we can get some Time Turners allocated [22:39:17] andrewbogott: Could https://wikitech.wikimedia.org/wiki/Help:Access#Connection_closed_by_remote_host be part of my issues? [22:41:27] Lcawte: I don't think any of those are you. [22:41:34] We know your key is right, because it works on the bastion. [22:42:04] I think you should do another test with -vvv but make sure you specify your username@ [22:42:48] As in change the proxy to specify my username in that? I've already done that, and the same result. [22:45:06] no, I mean... [22:45:09] actually type [22:45:37] $ ssh -vvv lcawte@social-tools1. [22:45:39] and see what it says [22:47:05] It's not even taken into consideration by the proxy. [22:47:18] hi Lcawte [22:47:29] I pushed some other work on andrewbogott, so now I get to help you :) [22:47:36] am going to read backscroll for a minute to see what's up [22:47:57] *end user reply* Labs is broken. [22:48:27] helpful :) [22:49:22] Lcawte: is social-tools1 the name of the instance you're trying to ssh into? [22:49:32] yuvipanda: Yep [22:49:43] hmm it isn't letting me in either [22:49:59] so I wonder if the instance is totally broken. [22:50:00] yuvipanda: root keys are screwed up there, maybe use tools2 as your test case [22:50:04] Lcawte: what project is it on? I'll check [22:50:10] although my userkey and madhu's works on tools1 [22:50:27] andrewbogott: what's 'tools1' and 'tools2'? where are these instances? 
[22:50:45] I'm talking about social-tools1 and social-tools2 [22:50:47] andrewbogott: does this mean the social-tools1 instance is pretty much unrecoverable? [22:50:51] ah I see [22:50:52] ok [22:50:57] yuvipanda: as I said, I can access it just fine with my user key [22:51:00] I don't know what's up with root keys [22:51:13] andrewbogott: yeah, sorry, makes sense now [22:51:31] well, the fact that root keys don't work doesn't make sense :( But we didn't investigate much [22:51:33] Hi, I have a tool running with the kubernetes backend. I tried to restart it, but when I do so, I get the following error: OSError: [Errno 2] No such file or directory: '/var/crash/_usr_bin_webservice.53053.crash'. Has anyone seen this issue before? [22:52:33] ewulczyn: I'm on the way to the office, I'll see you there in maybe 20-25min? [22:52:52] ok! [22:53:21] Lcawte: what OS are you attempting to login from? and can you log into other projects? [22:53:25] (sorry if you already covered these with andrewbogott - I can't fetch backscroll far enough) [22:53:52] I also have to get out of a train now to go to the WMF office, I'll be back online in about 15min. sorry [22:54:07] yuvipanda: I can log into other SSH servers, but I don't have access to other Labs stuff. [22:54:26] Lcawte: are you a member of the tools project? [22:54:51] if not I'll just add you there and see if that works [22:54:54] but for now, brb sorry [23:02:53] !log tools Disabling puppet on tools-checker instances to test https://gerrit.wikimedia.org/r/#/c/334433/ [23:02:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [23:04:16] yuvipanda: I'm about to head out (it's late, and I'm going to the gym. Any chance we can pick this up tomorrow?) 
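For reference, the ProxyCommand route mentioned at 19:19:34 could look roughly like the sketch below. It is only an illustration, not the documented Labs setup: the hostnames `bastion.example.org` and `social-tools1.example.org` and the helper name `build_ssh_command` are hypothetical placeholders. The helper just assembles an `ssh` command line that tunnels through the bastion with `ssh -W`, and it does not run anything.

```python
import shlex


def build_ssh_command(user, bastion, target, verbose=False):
    """Return the argv for ssh-ing to `target` through `bastion`.

    All host and user names passed in are placeholders chosen by the caller;
    nothing here is specific to any real Labs project.
    """
    # ssh -W %h:%p forwards stdin/stdout to the final host and port,
    # which ssh substitutes for %h and %p when it runs the ProxyCommand.
    proxy = f"ssh -W %h:%p {user}@{bastion}"
    argv = ["ssh"]
    if verbose:
        # -vvv prints detailed debug output, useful when a login fails.
        argv.append("-vvv")
    argv += ["-o", f"ProxyCommand={proxy}", f"{user}@{target}"]
    return argv


if __name__ == "__main__":
    cmd = build_ssh_command(
        "lcawte",
        "bastion.example.org",        # hypothetical bastion host
        "social-tools1.example.org",  # hypothetical instance host
        verbose=True,
    )
    # shlex.join quotes the ProxyCommand option so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Passing the username explicitly on both hops matches the advice at 22:42:04 to specify `username@`, since a different local account name would otherwise be used by default.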
[23:21:56] Tool-Labs-tools-Pageviews: Implement a monthly granularity view for Pageviews - https://phabricator.wikimedia.org/T151373#2815449 (MusikAnimal) Open>Resolved a:MusikAnimal Still needs to be implemented in Siteviews, once T156312 is resolved [23:22:21] Tool-Labs-tools-Pageviews: Add hourly/monthly stats to Siteviews - https://phabricator.wikimedia.org/T148610#2974566 (MusikAnimal) [23:29:57] Lcawte sure [23:37:31] !log tools reenabled puppet on tools-checker [23:37:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL