[01:32:36] <wm-bot>	 Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Edoderoo was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=209603 edit summary: 
[01:32:46] <wm-bot>	 Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/PetrohsW was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=209604 edit summary: 
[01:39:05] <wikibugs>	 6Labs, 10wikitech.wikimedia.org: "Edit with form" missing on a Tools access request page - https://phabricator.wikimedia.org/T118136#1833167 (10scfc) And https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Edoderoo.  @Krenair: It would be nice if someone with SMW/SMF knowledge could share ho...
[10:42:03] <wikibugs>	 6Labs, 10Tool-Labs, 7Database: s51078 is executing the same >1h query every 5 minutes - https://phabricator.wikimedia.org/T119695#1833462 (10jcrespo) 3NEW
[10:47:07] <wikibugs>	 6Labs, 10Tool-Labs, 7Database: s51078 is executing the same >1h query every 5 minutes - https://phabricator.wikimedia.org/T119695#1833470 (10jcrespo) 5Open>3Resolved a:3jcrespo Throttled to one connection per user.
[10:54:37] <wikibugs>	 6Labs, 10Tool-Labs, 7Database: s51078 is executing the same >1h query every 5 minutes - https://phabricator.wikimedia.org/T119695#1833485 (10jcrespo)
[10:54:38] <wikibugs>	 6Labs, 10Tool-Labs, 7Database: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#1833484 (10jcrespo)
[10:54:56] <wikibugs>	 6Labs, 10Tool-Labs, 7Database: Certain tools users create multiple long running queries that take all memory from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601#1833486 (10jcrespo) p:5Triage>3High
[10:55:29] <wikibugs>	 6Labs, 10Tool-Labs, 7Database: tools.joanjoc is executing the same >1h query every 5 minutes - https://phabricator.wikimedia.org/T119695#1833490 (10valhallasw)
[11:14:29] <wikibugs>	 6Labs, 6operations: Untangle labs/production roles from labs/instance roles - https://phabricator.wikimedia.org/T119401#1833513 (10yuvipanda) I've done this for most things, just a couple left (openldap::labs is a new and sad exception :()
[12:52:13] <wikibugs>	 6Labs, 10Tool-Labs: Setup an icinga instance to monitor tools on tool-labs - https://phabricator.wikimedia.org/T53434#1833695 (10zhuyifei1999)
[12:55:46] <wikibugs>	 10Tool-Labs-tools-Other, 6Wikisource: OCR scripts need updating at tools labs by updating the "tesseract-ben" package - https://phabricator.wikimedia.org/T117711#1833698 (10Aklapper)
[12:57:27] <wikibugs>	 6Labs, 6operations, 5Patch-For-Review, 7Puppet: Self hosted puppetmaster is broken - https://phabricator.wikimedia.org/T119541#1833712 (10akosiaris) I setup a new self hosted puppetmaster environment today and I did not meet this problem.
[13:11:59] <tinajohnson>	 Hi, I get this error https://dpaste.de/OQWF while trying to 'announce issue' which sends out Echo notifications to subscribers of a newsletters in vagrant. Tried restarting the redis-server but it did not... just kept saying "Stopping redis-server: " Could someone help ?
[13:14:01] <tonythomas>	 tinajohnson: well. I got https://dpaste.de/vqUR/raw while trying to login to wikitech. :\ 
[14:04:09] <hashar>	 !log upgrading zuul on labs to 2.1.0-60-g1cc37f7-wmf3 ( https://review.openstack.org/#/c/249207/2  https://phabricator.wikimedia.org/T97106 )
[14:04:10] <labs-morebots>	 upgrading is not a valid project.
[14:04:13] <hashar>	 grr
[17:02:55] <wikibugs>	 6Labs, 7Database: Database replicas: replicate user.user_touched - https://phabricator.wikimedia.org/T92841#1834218 (10jcrespo) The importing is taking place now. It will take a while, as we have 5GB of user data per server.
[18:03:01] <Luke081515>	 jynus: Enwiki has a replag of 1:10:58 at the moment
[18:03:11] <jynus>	 Luke081515, yes, it is expected
[18:03:13] <Luke081515>	 and it's still growing
[18:03:16] <Luke081515>	 ok
[18:03:17] <jynus>	 see server admin log
[18:03:56] <Luke081515>	 ok, thanks
[18:04:10] <jynus>	 now that you have exact lag measuring you are not going to let me pass one, don't you? :-)
[18:05:25] <Luke081515>	 is azwikitionary affected too?
[18:05:34] <jynus>	 no, only enwiki
[18:05:46] <jynus>	 all of them will be eventually
[18:05:51] <Luke081515>	 because azwiktionary has replag > 5 hours
[18:06:00] <jynus>	 I doubt it
[18:06:55] <jynus>	 https://phabricator.wikimedia.org/P2361
[18:07:16] <jynus>	 http://tools.wmflabs.org/betacommand-dev/cgi-bin/replag
[18:07:25] <Luke081515>	 ok, was not the databse problem. Since 14:02 no one edited azwiktionary
[18:07:30] <Luke081515>	 *database
[18:07:35] <Luke081515>	 *not a
[18:07:35] <jynus>	 exactly
[18:07:43] <jynus>	 that is why my method is more accurate
[18:08:41] <Betacommand>	 Luke081515: like I said you need  to take the numbers with a grain of salt, and know how its calculated 
[18:09:13] <jynus>	 Betacommand, can I challenge you to create a similar page with the new table? :-P
[18:09:43] <Betacommand>	 jynus: eventually
[18:09:47] <jynus>	 :-)
[18:10:06] <Betacommand>	 jynus: Not in the mood today, and busy all weekend
[18:10:12] <jynus>	 of course
[18:10:48] <Betacommand>	 jynus: your method is still not 100%
[18:10:52] <jynus>	 why?
[18:11:24] <Betacommand>	 jynus: cases where a query locks a database/table from incoming writes while it works
[18:11:44] <Betacommand>	 Ive seen that cause 1-2 hour lags
[18:11:45] <jynus>	 when a query locks a table, the whole replication stops
[18:11:56] <jynus>	 and so does heartbeat
[18:12:05] <Betacommand>	 jynus: I think thats per database not shard
[18:12:10] <jynus>	 no
[18:12:16] <jynus>	 there is no replication per database
[18:12:25] <jynus>	 only shards are replicated
[18:12:36] <Betacommand>	 ah, must have been with the old system
[18:12:36] <jynus>	 there is no parallel replication, (yet)
[18:12:57] <jynus>	 if there was, then I would implement a counter per database
[18:13:16] <Betacommand>	 jynus: Ive been using replicas for 10 years now :P
[18:13:30] <jynus>	 I can assure you, we are just reuing the setup created on production
[18:13:36] <jynus>	 it is very accurate
[18:13:43] <jynus>	 only 10 years?
[18:13:54] <jynus>	 :-)
[18:14:07] <Betacommand>	 jynus: WMF replicas
[18:14:11] <jynus>	 ah!
[18:14:35] <Betacommand>	 so Ive seen quite a bit
[18:14:46] <jynus>	 we need help on the DBA team
[18:14:52] <jynus>	 we are now a group of
[18:14:55] <jynus>	 1 people
[18:15:33] <jynus>	 and we are a bit busy, I know now where I can get help, join #wikimedia-databases
[18:15:46] <jynus>	 :.-)
[18:15:49] <jynus>	 :-)
[18:16:09] <Betacommand>	 jynus: If your willing to teach, Ill volunteer my time. Ive been doing IT for years.
[18:16:28] <jynus>	 sure!
[18:17:03] <jynus>	 I hope you can understand why things sometimes go slow at labs, production takes most of my time
[18:17:27] <Betacommand>	 jynus: I remember the days when Brion was the only paid IT person
[18:17:32] <jynus>	 true!
[18:17:41] <jynus>	 but we have grown a bit also!
[18:17:58] <Betacommand>	 jynus: Hell Ive actually caused a few hiccups in my time
[18:18:05] <jynus>	 :-)
[18:18:33] <jynus>	 I still can tell you that without volunteer time, this would not work
[18:18:34] <Betacommand>	 trying to clear a 1.2 million item watchlist caused a headache 
[18:19:25] <jynus>	 that is why the tools are so important
[18:24:37] <jynus>	 Luke081515, it should be shrinking now
[18:25:05] <jynus>	 but now you have more data to play around!
[18:25:11] <Luke081515>	 ok, thanks
[18:29:27] <wikibugs>	 6Labs, 7Database: Database replicas: replicate user.user_touched - https://phabricator.wikimedia.org/T92841#1834333 (10jcrespo) enwiki has been backfilled, it took 5.81GB of transference and 1:28:42 (time).  Will backfill the rest of the wikis later.
[18:30:45] <jynus>	 to go back the conversation, the "query is executing so the lag is not taken into account" is a problem of SHOW SLAVE STATUS "Seconds_behind_master"
[18:31:19] <jynus>	 not of pt-heartbeat, that is why I say it is very accurate
[18:32:18] <jynus>	 if a 5-second query was executed on the master, it will show 5 (probably more, 10, 5 for each slave in between) on the lag
[18:40:07] <bd808>	 jynus: The heartbeat_p.heartbeat is the same from any https://tools.wmflabs.org/bd808-test/
[18:40:19] <bd808>	 err.. the same from any db right?
[18:40:49] <bd808>	 that demo I threw up reads it from mysql:dbname=meta_p;host=s7.labsdb
[18:44:27] <jynus>	 no
[18:44:40] <jynus>	 there are 7 * 3 server different replication channels
[18:45:40] <jynus>	 the division is somewhere, let me search it
[19:13:06] <jynus_>	 technically that hasn't changed
[19:13:06] <bd808>	 I'll play with it tonight. It seems like it shouldn't be too hard to make a nice report
[19:13:06] <jynus_>	 I could even create a federated table
[19:13:07] * bd808 needs to drive to the turkey eating location now :)
[19:13:07] <jynus_>	 so that you have the 3 tables on the 3 servers
[19:13:07] <jynus_>	 :-)
[19:13:07] <jynus_>	 have fun!
[23:54:04] <Le0n>	 hi, someone here who can restart xtools?