[00:59:48] <Dragonfly6-7>	 hey, humans
[01:00:05] <Dragonfly6-7>	 there's a problem with the GLAMOROUS tool
[01:00:45] <Dragonfly6-7>	 specifically, when it's supposed to link to wikidata, it links to "wikidata.wikipedia.org"
[02:10:26] <wikibugs>	 Labs: Process for user backups - https://phabricator.wikimedia.org/T85608#1143928 (coren) Jessie saves.  Snapshots are back, and working, but not yet user-accessible (design work will be needed, perhaps automount?)  At the very least, once we turn the feature on, admins can recover user files.
[02:53:30] <Betacommand>	 Coren: security question about user backups, is access limited to the projects that a user has access to?
[02:54:20] <Coren>	 Once it's automated, you'll only be able to see the part of the snapshots that matches the project, yes.
[03:27:52] <Betacommand>	 !install
[04:16:03] <YuviPanda|zz>	 Coren: I also usually add legoktm to help me catch python style issues. I can help too :)
[04:16:14] <legoktm>	 hello
[06:35:34] <shinken-wm>	 PROBLEM - Puppet failure on tools-trusty is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]  
[06:41:55] <shinken-wm>	 RECOVERY - Free space - all mounts on tools-dev is OK: OK: All targets OK  
[06:42:41] <shinken-wm>	 PROBLEM - Puppet failure on tools-exec-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]  
[07:17:37] <shinken-wm>	 RECOVERY - Puppet failure on tools-exec-01 is OK: OK: Less than 1.00% above the threshold [0.0]  
[10:42:42] <grrrit-wm>	 (PS1) Steinsplitter: Adding new tags for #wikimedia-commons-tech [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/199242
[12:18:44] <wikibugs>	 Labs: Storage capacity & redundancy expansion (tracking) - https://phabricator.wikimedia.org/T85604#1144764 (coren) So here is the current picture:  * The new filesystem on thin volumes is in place and contains a copy of the live filesystem, but rsync is unable to keep up with the rate of change so actual dow...
[12:20:59] <Coren>	 YuviPanda|zz: Sorry about the alert.  I ended up finishing my day well past 0h to make sure all my ducks are in order for a meeting with Mark today.  :-)
[12:21:20] <Coren>	 I have a few hours now, I'll use them for that.
[12:26:02] <wikibugs>	 Labs: Upgrade labstore2001 to Jessie - https://phabricator.wikimedia.org/T93740#1144787 (coren) NEW a:coren
[12:30:21] <wikibugs>	 Labs: Replicate data between codfw and eqiad - https://phabricator.wikimedia.org/T85606#1144798 (coren) This is ready to start; the replicated copy will not be the live one until the filesystem switch needed for T85608 is done but it does not depend on it.  What //is// a dependency is to finish tracking down...
[12:31:21] <wikibugs>	 Labs: Process for user backups - https://phabricator.wikimedia.org/T85608#950693 (coren) This is now working on the new (not live) filesystem.  We are pending only the switch.
[12:41:04] <shinken-wm>	 PROBLEM - Puppet failure on tools-bastion-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]  
[13:01:08] <shinken-wm>	 RECOVERY - Puppet failure on tools-bastion-01 is OK: OK: Less than 1.00% above the threshold [0.0]  
[13:44:17] <a930913>	 Coren: Is there a way I can see the db load graphs?
[13:46:50] <Coren>	 a930913: Our graphite is restricted to staff (or maybe NDA signers also, I'd need to check).  Would you like me to pull some numbers out for you?
[13:48:20] <a930913>	 Coren: Are there any obvious loads? Like a rising edge and a falling edge?
[13:48:30] * Coren checks
[13:48:36] <a930913>	 In, say, the last half day.
[13:48:49] <^d>	 Coren: nda also, yes
[13:49:00] <^d>	 (also: ganglia might be useful here?)
[13:49:29] <a930913>	 ^d: I couldn't see anything that looked like it would be the db.
[13:50:33] <Coren>	 a930913: In Miscellaneous quiad, you want labsdb*
[13:50:39] <Coren>	 eqiad*
[13:50:47] <Coren>	 Depending on which you connect to.
[13:51:22] <Coren>	 Wait, that has all but 1001-1003.  They must be elsewhere.
[13:51:26] * Coren hunts them down.
[13:52:51] <Coren>	 Ah!  In mysql eqiad.
[13:53:31] <Coren>	 http://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&m=cpu_report&c=MySQL+eqiad&h=labsdb1001.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=small&metric_group=ALLGROUPS
[13:53:38] <Coren>	 a930913: ^^ Might be all you need
[13:56:28] <a930913>	 Coren: \o/
[13:57:04] <a930913>	 Coren: 100{1,3} are balanced between?
[13:59:22] <Coren>	 a930913: No, which you hit depends on what DB you are working with
[13:59:40] <a930913>	 Coren: wikidatawiki
[14:00:38] <Coren>	 a930913: 1003 then
[14:01:47] <a930913>	 Coren: ^d: Danke.
[14:01:56] <^d>	 yw
[14:02:07] <Coren>	 np
[14:03:13] <a930913>	 Now to see if I can see myself :p
[14:04:01] <a930913>	 Coren: The theory is, if I can't see myself affecting it, I don't need to worry about overloading, right? :p
[14:05:01] <Coren>	 a930913: It's not optimal, but it's a good starting point.
[14:05:31] <a930913>	 Coren: I'm running other diagnostic too.
[14:05:45] <a930913>	 Merged a whole load of query too.
[14:06:18] <a930913>	 Instead of each Q, it grabs everything for the item in one hit.
[14:06:34] <Coren>	 Sounds like a good optimization.
[14:06:35] <a930913>	 Not sure where the balance is though.
[14:07:24] <a930913>	 I.e. do I grab more data that I don't need in one go, or do I get many datas of what I want?
[14:07:34] <a930913>	 And Springle hasn't been on here :(
[14:08:26] <a930913>	 In theory, I could make a super mega large query to grab everything I needed in one go.
[14:09:00] <a930913>	 But would that be optimal.
[14:10:16] <Coren>	 I... don't know.  In the absence of our DBA you can always just measure.  :-)
[14:11:32] <a930913>	 Coren: I have a queue system that buffers the requests which means I can limit them. How do I work out how many jobs I can submit at once? (Assuming much CPU/memory/disk bound.)
[14:12:20] <Coren>	 As a sysadmin, my answer must be "as little as will still get you your results in a reasonable amount of time".  :-)
[14:12:32] <Coren>	 Be conservative at first.
[14:12:48] <a930913>	 Coren: These are web requests.
[14:13:21] <a930913>	 "Conservative at first" I.e. increase each day until complaint? :D
[14:13:55] <Coren>	 No, it means start with something small (like 2-3 at most) and increase when you see requests piling up regularly.  :-)
[14:15:00] <a930913>	 :)
[14:17:11] <a930913>	 The good news is they take a fraction of the memory I thought they would :)
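[editor's note] The "start with 2-3 at most" suggestion amounts to capping how many jobs run at once. A minimal sketch of such a cap follows, assuming the jobs are plain Python callables handled by a thread pool; `run_limited` and its parameters are hypothetical names, and a930913's actual queue system is not shown in the log. The function also reports the peak number of jobs that ran simultaneously, which makes it easy to confirm the cap holds before raising it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_limited(jobs, handler, max_in_flight=3):
    """Run handler(job) for every job, never more than max_in_flight at once.

    Returns (results in input order, peak number of handlers running together).
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def tracked(job):
        nonlocal running, peak
        # Count this job as running and remember the highest count seen.
        with lock:
            running += 1
            peak = max(peak, running)
        try:
            return handler(job)
        finally:
            with lock:
                running -= 1

    # The pool size is the concurrency cap; extra jobs wait in its queue.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        results = list(pool.map(tracked, jobs))
    return results, peak


if __name__ == "__main__":
    results, peak = run_limited(range(10), lambda n: n * 2, max_in_flight=3)
    print(results, "peak concurrency:", peak)
```

Raising `max_in_flight` only after requests are seen piling up regularly matches the conservative approach described above.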
[14:24:00] <wikibugs>	 Labs: known_host key updating on virt* (and possibly elsewhere) - https://phabricator.wikimedia.org/T93748#1144980 (Andrew) NEW
[15:37:37] <YuviPanda>	 Coren: hey! cool :) did you manage to get it done in the meantime?
[15:37:54] <Coren>	 YuviPanda: In meeting with Mark, will talk to you shortly.
[15:38:00] <YuviPanda>	 Coren: ah cool
[15:46:40] <a930913>	 Coren: When you finish your meeting, what server do I look at for the /data/project/ nfs load?
[15:46:54] <a930913>	 Unless YuviPanda, do you know?
[15:46:55] <YuviPanda>	 a930913: labstore1001
[15:46:59] <a930913>	 :)
[15:48:35] <Coren>	 a930913: That one is in "Labs NFS cluster eqiad"
[15:48:48] <a930913>	 Yeah, found it :)
[15:49:54] <a930913>	 Good news then. I can't see me.
[15:50:32] <a930913>	 Though someone is really doing some IO every five minutes :o
[15:54:06] <thcipriani>	 a930913: fwiw, I noticed IO on labs was really bad yesterday around this time. wa of like 95 doing a bunch of apt-get installs...
[15:56:29] <a930913>	 thcipriani: Not me, I was SQL bound then :p
[15:56:56] <a930913>	 Now I'm CPU bound. parsing JSON.
[16:05:55] <YuviPanda>	 Coren: should I start reviewing https://gerrit.wikimedia.org/r/199267 now or wait for you to take out the WIP tag?
[16:13:31] <Vivek>	 YuviPanda: Hi
[16:13:59] <Vivek>	 Did not find andrewbogott online for some days now.
[16:14:43] <andrewbogott>	 Vivek: It’s because I’m in North America :)
[16:14:58] <andrewbogott>	 But I’m in a meeting now, and for a long time into the future :(
[16:15:17] <JohnFLewis>	 andrewbogott: always a reason :p
[16:16:13] <andrewbogott>	 Vivek: I can multitask as long as you don’t mind my having a very short attention span
[16:16:32] <Vivek>	 andrewbogott: So I found you, good.
[16:16:39] <Vivek>	 :)
[16:17:16] <andrewbogott>	 Vivek: also, my bouncer is always here so you can leave questions or pm me when I’m AFK
[16:18:48] <Vivek>	 sure.
[16:35:28] <grrrit-wm>	 (PS1) Southparkfan: Replace absolute paths with relative paths [labs/tools/WMT] - https://gerrit.wikimedia.org/r/199281
[16:44:06] <wm-bot>	 Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Glaisher was created, changed by Glaisher link https://wikitech.wikimedia.org/wiki/Nova+Resource%3aTools%2fAccess+Request%2fGlaisher edit summary: Created page with "{{Tools Access Request |Justification=Host some tools |Completed=false |User Name=Glaisher }}"
[16:45:56] <Coren>	 YuviPanda: Go ahead and start reviewing now - I need to do a notification before I take some bandwidth to make the new filesystem the live one for several days anyways.
[16:48:07] * Coren braces for the scathing review. "Your python reads like perl!"
[16:48:08] <Coren>	 :-)
[17:00:00] <Negative24>	 Why is the Phabricator Diffusion puppet repo behind Gitblit and Gerrit?
[17:00:59] <YuviPanda>	 Negative24: I am not sure, I think mostly because nobody uses Diffusion...
[17:01:42] <Negative24>	 Nobody should, because Diffusion is referencing Gerrit; they should be exactly the same
[17:01:51] <YuviPanda>	 heh
[17:02:06] <Negative24>	 They aren't even mirrors they have the same storage
[17:02:16] <YuviPanda>	 github?
[17:02:18] <YuviPanda>	 oh
[17:02:25] <Negative24>	 no github is a mirror
[17:02:32] <YuviPanda>	 Negative24: ^d is the man you are looking for, maaabe
[17:02:55] <Negative24>	 just a little confusing referencing files that are different all around
[17:03:00] <^d>	 Diffusion polls on its own, it's not a push system
[17:03:22] <Negative24>	 exactly. It's set up so that Phabricator only looks for new changes
[17:03:32] <Negative24>	 it's read-only to the gerrit repos
[17:04:05] <Negative24>	 at least that's how it's supposed to be set up until #gerrit-migration
[17:07:27] <Negative24>	 its also not helpful that Diffusion uses phab usernames and Gitblit uses real names and then Gerrit uses LDAP usernames :)
[17:11:58] <wikibugs>	 Labs, Patch-For-Review: Clarify public/private role for holmium (aka labs-ns2) - https://phabricator.wikimedia.org/T93639#1145566 (Andrew)
[17:16:42] <^d>	 Negative24: My suggestion is to forget gitblit :p
[17:18:57] <Negative24>	 ^d: i only use it in these type of situations. its been acting real slow anyways
[17:19:14] <^d>	 it's always slow
[17:20:22] <Negative24>	 github also reports that Diffusion is behind :(
[17:21:07] <YuviPanda>	 just use github :)
[17:21:11] <YuviPanda>	 that’s what I do
[17:21:13] <YuviPanda>	 gitblit is terrible
[17:27:49] <Negative24>	 YuviPanda: i'm not worrying about gitblit because i know how bad it is but I'm trying to figure out phabricator
[17:28:03] <YuviPanda>	 aaaaah, I see. that makes sense.
[17:28:06] <Negative24>	 thats what i have been working with
[17:28:10] <YuviPanda>	 Negative24: chasemp might also know
[17:30:38] <SMalyshev>	 anybody knows what could cause "DB connection error: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111) ()" on labs machine in the middle of a script run? mysql is fine now, but looks like something killed it in the middle of a run...
[17:30:53] <Negative24>	 that's a permission issue
[17:30:53] <chasemp>	 diffusion: in general terms it's behind because it hasn't pulled in new things :)
[17:31:05] <chasemp>	 in practical terms it does a rolling pull scheduled based on how active it sees a repo
[17:31:14] <chasemp>	 and we haven't worried about a bit of delay
[17:31:31] <Negative24>	 chasemp: Is that related to this: https://secure.phabricator.com/book/phabricator/article/diffusion_updates/
[17:31:42] <chasemp>	 yes basically
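[editor's note] chasemp's description (a rolling pull whose schedule depends on how active a repository looks) can be illustrated with a toy delay function. This is a sketch of the general idea only: the constants and the linear formula below are assumptions chosen for illustration and are not Phabricator's actual scheduling rule, which is described in the Diffusion updates article linked above.

```python
def next_poll_delay(seconds_since_last_commit,
                    min_delay=15.0,
                    max_delay=6 * 3600.0,
                    backoff_divisor=100.0):
    """Return how long to wait before polling a repository again.

    Recently active repositories are polled often; quiet ones drift toward
    max_delay. All constants here are hypothetical.
    """
    idle = max(0.0, float(seconds_since_last_commit))
    # The longer a repository has been idle, the longer the wait.
    delay = min_delay + idle / backoff_divisor
    return min(max_delay, delay)


if __name__ == "__main__":
    for idle in (0, 600, 3600, 7 * 24 * 3600):
        print(f"idle {idle:>7}s -> poll again in {next_poll_delay(idle):.0f}s")
```

Under a scheme like this, a mirror of a rarely changed repository can lag the upstream by the current delay, which is consistent with Diffusion appearing "behind" Gerrit for a while after a push.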
[17:41:45] <wikibugs>	 Labs, Patch-For-Review, Puppet: puppet-run is confused by stale lock files - https://phabricator.wikimedia.org/T92766#1145775 (BBlack) Open>Resolved a:BBlack I'm assuming the fixup from a week ago worked for labs as well, closing.  Re-open if not! :)
[17:44:39] <chasemp>	 Negative24: also I noticed a bug so thanks :)
[17:46:42] <Negative24>	 chasemp: Which?
[17:47:34] <chasemp>	 a permission issue not an upstream issue
[17:48:02] <Negative24>	 CC me on it. I also found a permission issue this morning so I wonder if it is the same thing
[17:50:28] <chasemp>	 Negative24: https://phabricator.wikimedia.org/rOPUP5b535a7a0915a9687d99057010c843c279f6e597
[17:51:20] <Negative24>	 chasemp: Yes that is what I found but not just scripts/repository the whole scripts dir needs to be owned by phd
[17:51:42] <chasemp>	 can you point me to why say setup tools needs to be owned by phd?
[17:51:45] <Negative24>	 you only cleared up the repo management files but there are many more that still error out
[17:52:14] <Negative24>	 yeah I was looking for doc but the phab setup docs aren't very clear
[17:52:50] <Negative24>	 it goes along the lines of "git clone and be happy. oh and here are some docs about how to use it :)"
[17:52:58] <chasemp>	 some things like util and user in scripts I haven't seen a reason for what you are suggesting
[17:53:08] <chasemp>	 opposed at this point to doing that without knowing why
[17:53:18] <chasemp>	 but yes it's all loosely documented
[17:55:08] <Negative24>	 I guess its fine for the moment but as we go forward exploring more phab apps we may find more but yes the repo management was all that I was impacted by for the moment
[17:55:31] <chasemp>	 the only other thing I am aware of is the ssh / repo hosting
[17:55:36] <chasemp>	 but that will be a big deal either way
[17:59:32] <Negative24>	 chasemp: https://gerrit.wikimedia.org/r/#/c/198769/
[18:03:28] <chasemp>	 gtg
[18:08:26] <Negative24>	 chasemp: Looks like it resolved itself (or did you do something?)
[18:08:43] <chasemp>	 to what are you referring?
[18:14:18] <afeder>	 I've added a section to Help:Tool Labs: https://wikitech.wikimedia.org/wiki/Help:Tool_Labs#Setting_up_code_review_and_version_control  Complain if anything looks wrong
[18:14:53] <afeder>	 (new section being: https://wikitech.wikimedia.org/wiki/Help:Tool_Labs#Enabling_simple_public_HTTP_access_to_local_Git_repository)
[18:17:55] <wikibugs>	 Labs: Make a labs_storage module - https://phabricator.wikimedia.org/T93781#1146005 (coren) NEW
[18:18:26] <wikibugs>	 Labs: Make a labs_storage module - https://phabricator.wikimedia.org/T93781#1146015 (yuvipanda) This counts for labstore1003 too, right?
[18:22:17] <wm-bot>	 Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Glaisher was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=149856 edit summary: 
[18:27:16] * Negative24 hates Comcast
[18:28:14] <Nemo_bis>	 Who doesn't
[18:28:36] <Negative24>	 If I get disconnected for no apparent reason blame xfinity
[18:30:26] <wikibugs>	 Labs, Patch-For-Review: Replicate data between codfw and eqiad - https://phabricator.wikimedia.org/T85606#1146053 (coren)
[18:30:27] <wikibugs>	 Labs, Patch-For-Review: Process for user backups - https://phabricator.wikimedia.org/T85608#1146052 (coren)
[18:30:43] <wikibugs>	 Labs, Patch-For-Review: Process for user backups - https://phabricator.wikimedia.org/T85608#950693 (coren)
[18:30:44] <wikibugs>	 Labs, Patch-For-Review: Replicate data between codfw and eqiad - https://phabricator.wikimedia.org/T85606#950673 (coren)
[18:39:25] <wikibugs>	 Labs, MediaWiki-extensions-OpenStackManager, Patch-For-Review: Wikitech 'manage instances' displays "PHP Fatal error:  Call to a member function getImageName() on a non-object" - https://phabricator.wikimedia.org/T89856#1146101 (Andrew) Open>Resolved Patch is merged -- now instances that refer to...
[19:13:20] <wikibugs>	 Labs, Patch-For-Review: Make a labs_storage module - https://phabricator.wikimedia.org/T93781#1146209 (coren) There will be a class for labstore1003 too, yes, though my first pass will be [12]00[12]
[19:17:27] <wikibugs>	 Labs: Sync up the new labs NFS project filesystem with the live one - https://phabricator.wikimedia.org/T93792#1146217 (coren) NEW a:coren
[19:18:04] <wikibugs>	 Labs: Sync up the new labs NFS project filesystem with the live one - https://phabricator.wikimedia.org/T93792#1146217 (coren) (As a note, this will be running in a screen session so that it can be supervised)
[19:18:57] <wikibugs>	 Labs: Sync up the new labs NFS project filesystem with the live one - https://phabricator.wikimedia.org/T93792#1146240 (yuvipanda) oh, so is the new backedup file system going to be on /mnt?
[19:20:47] <wikibugs>	 Labs: Sync up the new labs NFS project filesystem with the live one - https://phabricator.wikimedia.org/T93792#1146245 (coren) No, it will take the old volume's place at /srv/project.  /mnt is only used during the copy process because they (obviously) need to both be mounted.
[19:21:17] <wikibugs>	 Labs: Sync up the new labs NFS project filesystem with the live one - https://phabricator.wikimedia.org/T93792#1146247 (yuvipanda) Ah fair enough :)
[19:33:45] <wikibugs>	 Labs, hardware-requests, operations: Replace virt1000 with a newer warrantied server - https://phabricator.wikimedia.org/T90626#1146302 (RobH) virt1000 is a dual X5647  @ 2.93GHz w/ 32GB.  Also if the replacement has to be under warranty, it'll be slightly more challenging.  Is there a particular entry...
[19:38:45] <grrrit-wm>	 (CR) John F. Lewis: [C: 2 V: 2] Replace absolute paths with relative paths [labs/tools/WMT] - https://gerrit.wikimedia.org/r/199281 (owner: Southparkfan)
[19:45:43] <wikibugs>	 Labs, Patch-For-Review: Move to a new dns scheme for labs:  hostname.projectname.eqiad.wmflabs - https://phabricator.wikimedia.org/T93087#1146331 (Andrew) Every labs instance now has puppetVar: use_dnsmasq=true in ldap.
[19:46:59] <wikibugs>	 Labs: Make a fact for project_id on labs instances - https://phabricator.wikimedia.org/T93684#1146340 (Andrew)
[19:47:01] <wikibugs>	 Labs, Patch-For-Review: Move to a new dns scheme for labs:  hostname.projectname.eqiad.wmflabs - https://phabricator.wikimedia.org/T93087#1146339 (Andrew)
[20:04:33] <wikibugs>	 Labs: Make a fact for project_id on labs instances - https://phabricator.wikimedia.org/T93684#1146406 (coren) Isn't that what ${instanceproject} is?
[20:45:45] <grrrit-wm>	 (PS1) Southparkfan: Use new double-hashed channel namespace [labs/tools/WMT] - https://gerrit.wikimedia.org/r/199325
[20:46:48] <grrrit-wm>	 (CR) Alpha: [C: 2 V: 2] Use new double-hashed channel namespace [labs/tools/WMT] - https://gerrit.wikimedia.org/r/199325 (owner: Southparkfan)
[21:00:38] <wikibugs>	 Labs: Sync up the new labs NFS project filesystem with the live one - https://phabricator.wikimedia.org/T93792#1146529 (coren) This has now started.
[21:02:53] <andrewbogott>	 Coren: that probably is what instanceproject is — I can find references to it but can’t tell where it comes from… can you?
[21:08:32] <grrrit-wm>	 (PS1) John F. Lewis: feed #wmt to new channel [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/199329
[21:09:12] <andrewbogott>	 oh, from ldap of course
[21:09:22] <andrewbogott>	 so it’s there but not a fact
[21:09:58] <grrrit-wm>	 (PS1) John F. Lewis: feed #wmt to new channel [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/199330
[21:48:30] <multichill>	 YuviPanda: You could have just imported the private ssh key for the server and nobody would have noticed your change ;-)
[22:32:06] <wikibugs>	 Tool-Labs: Memory Exhausted Near / Tool labs error while querying with Python - https://phabricator.wikimedia.org/T93074#1146898 (Springle) For the error 1064: https://bugs.mysql.com/bug.php?id=69383 .  Check the size of your largest prepare statements.  The second error, "Commands out of sync; you can't run...
[23:13:51] <Negative24>	 chasemp: Did you see my comment on https://phabricator.wikimedia.org/rOPUP5b535a7a0915a9687d99057010c843c279f6e597
[23:14:37] <chasemp>	 ah I don't think anyone is tracking diffusion comments but in essence, can you give me a reason to make the change?
[23:16:27] <Negative24>	 all the other permission configs set the group to phd
[23:16:48] <Negative24>	 i guess i should be commenting on gerrit
[23:16:58] <Negative24>	 i'm so wrapped up in phab at the moment
[23:18:09] <chasemp>	 well repository tools can be managed via sudo by some users as root
[23:18:21] <chasemp>	 but if I user and group them as phd I have to allow those users to sudo as phd
[23:18:28] <chasemp>	 not sure I want to mix it up that way
[23:18:49] <chasemp>	 I'm ok with how it is for now at least for as long as we are shaking things out and we have a reason to make another change
[23:19:00] <Negative24>	 ok that's fine
[23:20:35] <chasemp>	 cli perms for phab are an emerging field :)