[07:15:18] <wikibugs>	 DBA, Patch-For-Review: codfw: Fix S4 commonswiki.templatelinks partitions - https://phabricator.wikimedia.org/T149079#2784887 (Marostegui) db1059 is now finished  ``` root@neodymium:~# mysql -hdb1059 -A commonswiki -e "nopager;show create table templatelinks\G" PAGER set to stdout ***********************...
[07:25:41] <wikibugs>	 DBA, Patch-For-Review: Deploy gtid_domain_id flag in our mysql hosts - https://phabricator.wikimedia.org/T149418#2784890 (Marostegui) The above change has been working fine for the last 3 days,   The master has rotated the logs a few times  ``` root@neodymium:~# mysql -hdb1020 -e "show master status\G" **...
[09:11:12] <jynus>	 Stage: 1 of 2 'copy to tmp table'   37.8% of stage done
[09:11:31] <jynus>	 14h
[09:11:35] <marostegui>	 db2042?
[09:11:41] <jynus>	 yes
[09:12:28] <jynus>	 that will take 2 days at least
[09:12:36] <marostegui>	 Yeah saw it there in the morning, didn't you do a partitioning some weeks ago too?
[09:12:49] <marostegui>	 I thought it took like 20 hours or so that last one
[09:12:51] <marostegui>	 just curious
[09:13:06] <jynus>	 yes, probably on a better server
[09:13:40] <jynus>	 and one without a predictive disk failure
[09:14:15] <marostegui>	 the predictive failure shouldn't affect the performance right? I mean the RAID is still Optimal right?
[09:14:24] <marostegui>	 lots of "right" in the same sentence :p
[09:14:37] <jynus>	 ha
[09:14:58] <marostegui>	 the obscure world of raid controllers...
[09:15:11] <jynus>	 should =/= what actually happens
[09:15:22] <marostegui>	 exactly XD
[09:16:13] <jynus>	 I am extending the downtime until monday
[09:16:23] <marostegui>	 makes sense yes
[09:16:30] <marostegui>	 you are off tomorrow if I remember well? 
[09:16:51] <jynus>	 yes, although I am not too well in health right now
[09:17:07] <marostegui>	 why not rest a bit today too?
[09:17:22] <jynus>	 I am actually feeling better than yesterday
[09:17:53] <jynus>	 yesterday I only had failure with the db2042 reimage
[09:18:30] <marostegui>	 yes, I saw that
[09:18:37] <marostegui>	 And your favorite alter table is finished too \o/
[09:19:28] <jynus>	 I think the only way to make those sustainable is to reduce servers and make those faster
[09:19:51] <jynus>	 the main issue is those slow <db1050 hosts
[09:20:55] <marostegui>	 yes, the problem is also the amount of wikis in s3 (which we can do nothing about)
[09:21:05] <marostegui>	 I remember you said in the ticket something like 30k alter tables
[09:21:16] <jynus>	 we have 2 issues with wikis
[09:21:24] <jynus>	 the smaller ones and the larger ones
[09:21:53] <jynus>	 ideally that would be solved on software
[09:22:08] <marostegui>	 although with online ddl things are a lot easier now
[09:22:24] <jynus>	 it was worse on 5.5
[09:22:42] <jynus>	 pt-osc on the master
[09:23:05] <marostegui>	 imagine before 5.5...pfff
[09:23:22] <jynus>	 before, tables were too small to care
[09:23:48] <marostegui>	 I remember the pain back in tuenti, doing lots of master switches just for a single alter :(
[09:23:54] <jynus>	 and stability was not a goal
[09:24:07] <marostegui>	 what was the goal?
[09:24:34] <jynus>	 apparently there were lots of outages and read-only times
[09:25:01] <marostegui>	 oh wow
[09:28:37] <jynus>	 before you deploy
[09:28:49] <jynus>	 let me do another change 
[09:28:53] <marostegui>	 sure
[09:29:00] <marostegui>	 it got merged
[09:29:12] <marostegui>	 but i will wait
[09:31:02] <jynus>	 https://gerrit.wikimedia.org/r/320633
[09:31:13] <doctaxon>	 jynus: SQL databases are very fast, but I ran a query that took 1.5 min. Is this expected, or have I made a syntax error?
[09:31:19] <doctaxon>	 MariaDB [dewiki_p]> select il_to from imagelinks, page where page_title = 'Sozialdemokratische_Partei_Österreichs' and il_from = page_id;
[09:31:52] <doctaxon>	 +-----------------------------------------------------------+
[09:31:52] <doctaxon>	 | il_to                                                     |
[09:31:52] <doctaxon>	 +-----------------------------------------------------------+
[09:31:52] <doctaxon>	 | Alfred_Gusenbauer_Linz_2008.jpg                           |
[09:31:52] <doctaxon>	 | Austria_Bundesadler.svg                                   |
[09:31:54] <doctaxon>	 | Austria_Parlament_Portikus.JPG                            |
[09:31:57] <doctaxon>	 | Bruno_Kreisky.jpg                                         |
[09:32:04] <doctaxon>	 33 rows in set (1 min 29.36 sec)
[09:32:23] <jynus>	 please don't paste multiple lines on IRC, doctaxon
[09:32:40] <doctaxon>	 okay
[09:32:44] <jynus>	 use https://phabricator.wikimedia.org/paste/edit/form/14/ instead
[09:33:11] <doctaxon>	 okay, what do you say about the syntax?
[09:34:48] <marostegui>	 jynus: I did +1 to your change, I can do +2 if you like and deploy everything
[09:34:54] <jynus>	 ok
[09:35:17] <marostegui>	 sure!
[09:35:40] <jynus>	 doctaxon, see your explain: your query reads 100 million rows, so it is slow: https://phabricator.wikimedia.org/P4403
[09:36:56] <jynus>	 on a matter of style, "imagelinks, page" is an allowed syntax, but discouraged because it is not clear
[09:37:33] <doctaxon>	 jynus: but I am not sure what I have to change in the syntax. I want to get all picture files of the page 'Sozialdemokratische_Partei_Österreichs' without knowing its page_id
[09:40:42] <jynus>	 "imagelinks JOIN page ON il_from = page_id WHERE page_title = 'Sozialdemokratische_Partei_Österreichs'" is the preferred syntax
[09:42:30] <doctaxon>	 that only gives ->    nothing else
[09:42:53] <doctaxon>	 what does this  ->  mean?
[09:42:57] <jynus>	 that was not the complete query
[09:43:12] <doctaxon>	 select imagelinks JOIN page ON il_from = page_id WHERE page_title = 'Sozialdemokratische_Partei_Österreichs'
[09:43:17] <jynus>	 only the from clause, and you need to write a ; at the end of every query
[09:43:36] <doctaxon>	 ah okay
[09:46:06] <doctaxon>	 jynus: gives almost the same: 33 rows in set (1 min 13.20 sec)
[09:46:46] <jynus>	 you can get it a bit better by using the other index with " and il_from_namespace = 0"
[09:46:57] <jynus>	 all pages need a namespace
[09:47:05] <jynus>	 the main one is namespace 0
[09:50:14] <doctaxon>	 jynus: in the ON clause?
[09:50:30] <jynus>	 in the where
[09:50:40] <jynus>	 ON == conditions for join
[09:50:47] <jynus>	 WHERE == later conditions
[09:51:06] <jynus>	 same effect on inner JOINs, not on outer joins
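The ON versus WHERE distinction jynus describes can be tried on a toy dataset. The following is a minimal sketch using Python's built-in `sqlite3`, not the real dewiki replica: the two tables are cut down to the columns named in the conversation (plus `page_namespace`, which is an assumption added here for illustration), and the rows are invented. It only illustrates the join semantics, not the index usage or timings discussed above.

```python
import sqlite3

# Toy stand-ins for the MediaWiki tables; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
CREATE TABLE imagelinks (il_from INTEGER, il_from_namespace INTEGER, il_to TEXT);
-- Same title in namespace 0 (article) and namespace 1 (talk page).
INSERT INTO page VALUES (1, 0, 'Example'), (2, 1, 'Example'), (3, 0, 'No_images');
INSERT INTO imagelinks VALUES (1, 0, 'A.jpg'), (1, 0, 'B.svg'), (2, 1, 'C.png');
""")

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Explicit JOIN syntax, filtering by title only: talk-page images are included.
by_title = run("""
    SELECT il_to FROM imagelinks JOIN page ON il_from = page_id
    WHERE page_title = 'Example' ORDER BY il_to
""")

# Inner join: the namespace condition gives the same rows in WHERE or in ON.
inner_where = run("""
    SELECT il_to FROM imagelinks JOIN page ON il_from = page_id
    WHERE page_title = 'Example' AND il_from_namespace = 0 ORDER BY il_to
""")
inner_on = run("""
    SELECT il_to FROM imagelinks JOIN page
      ON il_from = page_id AND il_from_namespace = 0
    WHERE page_title = 'Example' ORDER BY il_to
""")

# Outer join: a condition in ON keeps unmatched pages (with NULL),
# while the same condition in WHERE filters those pages out afterwards.
outer_on = run("""
    SELECT page_id, il_to FROM page LEFT JOIN imagelinks
      ON il_from = page_id AND il_from_namespace = 0
    ORDER BY page_id, il_to
""")
outer_where = run("""
    SELECT page_id, il_to FROM page LEFT JOIN imagelinks ON il_from = page_id
    WHERE il_from_namespace = 0
    ORDER BY page_id, il_to
""")

print(by_title)
print(inner_where, inner_on)
print(outer_on)
print(outer_where)
```

In this toy data the title-only query also returns the image linked from the namespace 1 page, which is the same kind of difference as rows disappearing once `il_from_namespace = 0` is added.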
[09:53:31] <doctaxon>	 jynus: 29 rows in set (45.61 sec)   <- 4 rows are missing now
[09:54:36] <jynus>	 either they were removed or they were not on the main page (maybe a talk page, for example)
[09:55:41] <jynus>	 I see at least 2 images here: https://de.wikipedia.org/wiki/Diskussion:Sozialdemokratische_Partei_%C3%96sterreichs
[09:56:15] <doctaxon>	 oh, I understand
[09:56:30] <doctaxon>	 this is somewhat difficult
[09:56:37] <doctaxon>	 *facepalm*
[09:56:53] <jynus>	 SQL does what you tell it to do, not what you want :-)
[09:57:03] <doctaxon>	 ^^
[09:57:50] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2785156 (Marostegui) The snapshots are taken so `dbstore2001` is ready to get the new disks. Later today.  I have placed them at: `dbstore2002:/srv/tmp` there is one from 7th Nov and another one...
[09:58:26] <jynus>	 marostegui, it may be delayed, did you see the order problems?
[09:58:42] <marostegui>	 jynus: Yes, but only in eqiad
[09:58:54] <marostegui>	 jynus: I asked papaul to double check if codfw delivery was fine
[09:59:06] <jynus>	 we got the vlan change
[09:59:13] <jynus>	 for labsdb1008
[09:59:15] <marostegui>	 for labsdb1008
[09:59:16] <marostegui>	 right
[09:59:18] <marostegui>	 nice :)
[09:59:22] <jynus>	 we need to reimage
[09:59:32] <jynus>	 but need chris' help
[09:59:44] <marostegui>	 why?
[10:00:00] <jynus>	 we need to assign an ip
[10:00:12] <jynus>	 we do not have access to the server now
[10:00:25] <jynus>	 it is like a new server install
[10:02:27] <marostegui>	 aaah
[10:02:30] <marostegui>	 I see
[10:02:37] <marostegui>	 I can take care of that if you like
[10:02:46] <jynus>	 it is ok
[10:02:51] <marostegui>	 ok
[10:02:57] <jynus>	 it has to be relabeled on the dc
[10:03:03] <marostegui>	 I will probably import s5 today/tomorrow to dbstore2002
[10:03:05] <jynus>	 so we need onsite assistance
[10:03:19] <marostegui>	 gotcha
[10:03:58] <volans>	 FYI wmf-auto-reimage doesn't support the feature of renaming that is in wmf-reimage but if the dns will be changed and you cleanup puppet/salt cert for the old name manually it should work as a normal reimage
[10:04:18] <volans>	 you can also use the new "--new" option that skips some steps in case of "new" hosts
[10:04:19] <marostegui>	 ah great
[10:04:27] <marostegui>	 is the --new already deployed?
[10:05:05] <volans>	 yes,look for "args.new" in https://github.com/wikimedia/operations-puppet/blob/bf2870f7ffd98373c64031a225c69a1aa6f79683/modules/salt/files/wmf_auto_reimage.py
[10:05:24] <marostegui>	 volans: so the --new doesn't really work with servers which are not completely new but renamed?
[10:05:44] <volans>	 --new skips the validation that it is an already installed host, and the icinga downtime
[10:06:11] <volans>	 and the depool from conftool, if set with the -c
[10:06:34] <volans>	 then it passes the option '--no-clean' to wmf-reimage
[10:07:16] <volans>	 that skips the cert clean from puppet and salt
[10:08:00] <marostegui>	 ah I see I see
[10:08:02] <volans>	 so that the first step is basically set pxe and reboot
[10:08:44] <volans>	 as long as the DNS is set and IPMI works, it should work AFAICT
[10:09:36] <volans>	 until T150160 is fixed, check if the hosts you're reimaging are not in the list on P4379
[10:09:37] <stashbot>	 T150160: Remote IPMI doens't work for ~17% of the fleet - https://phabricator.wikimedia.org/T150160
[10:12:55] <marostegui>	 No, it is not there, pheew :)
[10:35:26] <jynus>	 I've started with https://gerrit.wikimedia.org/r/320752 it may need some iterations still
[11:50:58] <wikibugs>	 DBA, Patch-For-Review: Audit MySQL configurations - https://phabricator.wikimedia.org/T133333#2785348 (Marostegui) p:High>Normal
[11:54:24] <wikibugs>	 DBA, Operations: mysql boxes not in ganglia - https://phabricator.wikimedia.org/T87209#2785350 (jcrespo) Open>declined
[12:08:31] <wikibugs>	 DBA, Operations, Patch-For-Review: Gerrit 285208 broke eventlogging_sync.sh - https://phabricator.wikimedia.org/T133588#2785367 (jcrespo) Open>Resolved a:jcrespo closing because the ongoing issues were fixed, and long-term fixes will be done on T124307
[12:10:51] <jynus>	 meta has 10 tickets already
[12:13:10] <marostegui>	 well, that is good :)
[12:15:27] <jynus>	 check the current status, not sure if it was everything you mentioned
[12:15:27] <marostegui>	 I am going for lunch - will review it later :)
[12:15:27] <jynus>	 ah, yes sorry
[12:15:27] <marostegui>	 no worries
[12:15:27] <marostegui>	 see you in a bit
[12:19:12] <wikibugs>	 DBA, Packaging: MariaDB package improvements - https://phabricator.wikimedia.org/T127811#2785387 (jcrespo) Open>Resolved a:jcrespo This is too ambitious- packaging was improved. Systemd was analyzed as not working (except on compatibility mode) for 5.6/10.0.
[13:45:49] <wikibugs>	 DBA, Patch-For-Review: Deploy gtid_domain_id flag in our mysql hosts - https://phabricator.wikimedia.org/T149418#2785564 (Marostegui) I have restarted the mXX hosts from codfw so they get the new gtid_domain_id on Monday I will apply it on the rest of servers (and masters)  ``` root@neodymium:~# for i in...
[14:21:47] <wikibugs>	 DBA, CirrusSearch, Discovery, Discovery-Search (Current work), and 2 others: MySQL chooses poor query plan for link counting query - https://phabricator.wikimedia.org/T143932#2785687 (jcrespo) I've analyzed the problem, and the issue is not the query plans themselves, they use the pl_namespace co...
[14:30:00] <wikibugs>	 DBA, CirrusSearch, Discovery, Discovery-Search (Current work), and 2 others: MySQL chooses poor query plan for link counting query - https://phabricator.wikimedia.org/T143932#2785698 (jcrespo) At a minimum, the following changes should be done:  * All queries should use the vslow databases so the...
[14:37:56] <wikibugs>	 DBA, CirrusSearch, Discovery, Discovery-Search (Current work), Patch-For-Review: CirrusSearch SQL query for locating pages for reindex performs poorly - https://phabricator.wikimedia.org/T147957#2785706 (jcrespo) > There isn't a big rush here, this query is incredibly rare. It's part of a rei...
[14:38:47] <wikibugs>	 DBA, CirrusSearch, Discovery, Discovery-Search (Current work), Patch-For-Review: CirrusSearch SQL query for locating pages for reindex performs poorly - https://phabricator.wikimedia.org/T147957#2785721 (jcrespo) a:jcrespo>None
[14:39:43] <wikibugs>	 DBA, CirrusSearch, Discovery, Discovery-Search (Current work), and 2 others: MySQL chooses poor query plan for link counting query - https://phabricator.wikimedia.org/T143932#2785723 (jcrespo) As a note, the latest index changes on pagelinks should have improved the query planning in the first pl...
[14:46:28] <wikibugs>	 Blocked-on-schema-change, DBA, Wikimedia-Site-requests, Wikisource, and 2 others: Schema change for page content language - https://phabricator.wikimedia.org/T69223#2150795 (jcrespo) p:Triage>Normal a:jcrespo
[15:16:17] <marostegui>	 jynus: I did an update&upgrade on db2010 but I am not sure about the state of the 10.0.28 package (or maybe this server was wrong before): https://phabricator.wikimedia.org/P4405
[15:16:59] <jynus>	 that is wrong
[15:17:07] <jynus>	 run puppet
[15:17:15] <wikibugs>	 DBA, Patch-For-Review: codfw: Fix S4 commonswiki.templatelinks partitions - https://phabricator.wikimedia.org/T149079#2785839 (Marostegui) This is now running on db1068
[15:17:15] <marostegui>	 ok
[15:17:23] <jynus>	 that should fix it, I think
[15:17:30] <marostegui>	 running
[15:17:50] <marostegui>	 Notice: /Stage[main]/Mariadb::Service/File[/opt/wmf-mariadb10/service]/ensure: created
[15:17:53] <marostegui>	 \o/
[15:18:09] <jynus>	 it is the issue of the package not having that anymore
[15:18:16] <jynus>	 and handling it on puppet
[15:18:36] <jynus>	 for a brief period it disappears
[15:19:05] <marostegui>	 sure sure, it is fine now :)
[15:19:07] <jynus>	 not seen on new installs
[15:19:29] <jynus>	 mysql_upgrade and restart
[15:19:47] <jynus>	 not sure if we enabled p_s there
[15:22:04] <jynus>	 ./software/dbtools/osc_host.sh --host=db1068.eqiad.wmnet --port=3306 --db=commonswiki --table=oldimage --method=ddl --no-replicate "MODIFY COLUMN oi_minor_mime varbinary(100) NOT NULL default 'unknown'"
[15:22:14] <jynus>	 ^this is what I intend to run for starters
[15:22:23] <marostegui>	 how big is that table?
[15:22:32] <marostegui>	 I am running an alter on templatelinks (removing partitions)
[15:22:33] <jynus>	 oldimage is medium
[15:22:42] <jynus>	 image is the large one
[15:22:55] <jynus>	 filearchive should be small
[15:23:12] <jynus>	 I think oldimage should be blocking
[15:23:17] <jynus>	 because no PK
[15:23:23] <jynus>	 the others, I am not sure
[15:23:33] <marostegui>	 I think it will be fine I just want to  make sure I can pool the server tomorrow before the weekend
[15:23:43] <marostegui>	 you think it will be done by tomorrow evening?
[15:23:49] <jynus>	 oldimage yes
[15:23:53] <jynus>	 image not
[15:24:07] <jynus>	 that is why I asked about your opinion
[15:25:04] <marostegui>	 then I would rather leave image for next week
[15:25:10] <jynus>	 ok
[15:25:10] <marostegui>	 I don't like to leave that server out for the weekend
[15:25:13] <marostegui>	 Unless you are in a rush
[15:25:41] <jynus>	 no, just had the idea of combining both difficult changes in the same depooling cycle
[15:26:14] <marostegui>	 I wouldn't mind if the weekend wasn't too close :(
[15:26:21] <jynus>	 gotcha
[15:26:27] <marostegui>	 sorry for pushing back
[15:26:43] <jynus>	 no, I like you taking decisions
[15:26:47] <jynus>	 :-)
[15:27:01] <marostegui>	 :)
[15:28:36] <jynus>	 I am waiting/blocked for several tickets now, any suggestions?
[15:29:32] <jynus>	 I am thinking of finishing the auth_socket deployment
[15:32:36] <marostegui>	 Is there anything that can be picked up in Next? :)
[15:33:30] <jynus>	 not really, 3 are blocked
[15:33:36] <jynus>	 1 is not a priority
[15:34:38] <jynus>	 and the whole labs thing is blocked on labsdb1008
[15:35:01] <mark>	 we should be able to get that unblocked soon :)
[15:36:38] <marostegui>	  jynus is there anything in dbstore2001 you need to save?
[15:36:53] <marostegui>	 In 30 minutes papaul will probably change the disks (if the shipment was correct)
[15:37:10] <jynus>	 nope
[15:37:22] <marostegui>	 roger
[15:37:32] <jynus>	 i think you reimaged it already
[15:37:44] <jynus>	 so it should be all new things
[15:38:15] <marostegui>	 yes, but just in case :)
[15:39:41] <marostegui>	 jynus: https://phabricator.wikimedia.org/T148507 i saw that, which is low priority but maybe it will only take you one hour to do and we can get rid of it?
[15:39:45] <marostegui>	 (just suggesting stuff)
[15:40:16] <jynus>	 nope
[15:40:20] <jynus>	 that is an epic one
[15:40:55] <marostegui>	 should we give it that tag and move it to meta/epic?
[15:41:06] <jynus>	 it is not meta, though
[15:41:10] <jynus>	 only very difficult
[15:47:49] <wikibugs>	 DBA: Meta ticket: Deploy InnoDB compression where possible - https://phabricator.wikimedia.org/T150438#2785948 (Marostegui)
[15:48:22] <wikibugs>	 DBA: Test InnoDB compression - https://phabricator.wikimedia.org/T139055#2418098 (Marostegui)
[15:48:24] <wikibugs>	 DBA: Meta ticket: Deploy InnoDB compression where possible - https://phabricator.wikimedia.org/T150438#2785948 (Marostegui)
[15:49:31] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2726255 (Halfak) @jcrespo, I see that this is placed in the "Blocked external/Not db team" on the D...
[15:50:41] <wikibugs>	 DBA: Test InnoDB compression - https://phabricator.wikimedia.org/T139055#2785989 (Marostegui) I am closing this ticket now as the tests were successful.  We have created a parent meta task (T150438) so we can create subtasks for further compression tasks and track it goes across production.
[15:52:28] <wikibugs>	 DBA: Meta ticket: Deploy InnoDB compression where possible - https://phabricator.wikimedia.org/T150438#2785948 (Marostegui)
[15:52:30] <wikibugs>	 DBA: Test InnoDB compression - https://phabricator.wikimedia.org/T139055#2785993 (Marostegui) Open>Resolved
[15:52:32] <wikibugs>	 DBA, Operations, Upstream: TokuDB crashes frequently -consider upgrade it or search for alternative engines with similar features - https://phabricator.wikimedia.org/T109069#2785995 (Marostegui)
[15:52:44] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2785996 (Halfak) Maybe we should be talking to @chasemp and @madhuvishy about getting this done... ?
[16:27:35] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786145 (jcrespo) I believe now that it is unblocked, the tables are actually replicated, so labs t...
[16:29:41] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786150 (jcrespo) I can confirm they are in labs:   ``` $ mysql -h labsdb1001 wikidatawiki -e "SELE...
[16:36:53] <jynus>	 I can take care of the reimage, if you want, of dbstore2001
[16:38:19] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2753248 (ops-monitoring-bot) Script wmf_auto_reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts: ``` ['dbstore2001.codfw.wmnet'] ``` The log can be found in `/var/log/wmf-auto...
[16:52:38] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2786211 (ops-monitoring-bot) Script wmf_auto_reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts: ``` ['dbstore2001.codfw.wmnet'] ``` The log can be found in `/var/log/wmf-auto...
[16:54:50] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786222 (Halfak) Great!  Thanks.  Resolving
[16:55:07] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2726255 (Halfak) Open>Resolved
[16:56:11] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786227 (jcrespo) Sorry I was not clear enough- the tables are replicated, but they are probably no...
[16:56:18] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786228 (jcrespo) Resolved>Open
[16:56:39] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2786229 (ops-monitoring-bot) Script wmf_auto_reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts: ``` ['dbstore2001.codfw.wmnet'] ``` The log can be found in `/var/log/wmf-auto...
[16:57:25] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786230 (Ladsgroup) @jcrespo is right: ``` ladsgroup@tools-bastion-03:~$ sql fawiki "select oresm_i...
[17:02:14] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2786255 (Marostegui) The reason for so many script runs is that the server doesn't get reimaged if it is not on.  ``` Set Chassis Power Control to Cycle failed: Command not supported in present sta...
[17:05:24] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2786294 (Volans) The issue with `wmf-reimage` is tracked in T150448
[17:33:34] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786420 (Halfak) @jcrespo I see.  Who should figure this out?
[17:39:59] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786440 (jcrespo) I think labs ops will know, but they are mostly unavailable/busy this week, so I...
[17:44:20] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786444 (Ladsgroup) I do the amending.
[18:06:23] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2786548 (ops-monitoring-bot) Script wmf_auto_reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts: ``` ['dbstore2001.codfw.wmnet'] ``` The log can be found in `/var/log/wmf-auto...
[18:13:23] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, and 3 others: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2786574 (Halfak) a:Ladsgroup
[18:15:35] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, and 3 others: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2726255 (AlexMonk-WMF) >>! In T148561#2786440, @jcrespo wrote: > @Krenair may know, but it is not something he has to do...
[18:33:51] <wikibugs>	 DBA, Operations, hardware-requests, ops-eqiad, Patch-For-Review: Decommission db1042 - https://phabricator.wikimedia.org/T149793#2786684 (Cmjohnson) p:Triage>Normal
[18:35:03] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2786704 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['dbstore2001.codfw.wmnet'] ```  and were **ALL** successful.
[18:35:24] <marostegui>	 \o/
[18:39:23] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2786716 (Marostegui) The server got reinstalled and looks good:  ``` root@dbstore2001:~# lsb_release -a No LSB modules are available. Distributor ID: Debian Description:    Debian GNU/Linux 8.6...
[18:39:34] <wikibugs>	 DBA, Operations, ops-codfw: install new disks into dbstore2001 - https://phabricator.wikimedia.org/T149457#2786719 (Marostegui) a:Marostegui
[18:41:02] <jynus>	 Available: 11T
[18:41:14] <marostegui>	 :)
[18:42:29] <marostegui>	 Going to logoff now! Have a good day off tomorrow!
[18:42:38] <jynus>	 same, bye!
[19:10:09] <wikibugs>	 DBA, Operations, ops-eqiad: labsdb1009 boot issues (power supply and controller?) - https://phabricator.wikimedia.org/T150211#2778014 (Cmjohnson) -Confirmed power supply is not working, reseated and still not working. HP support request needs to be submitted.
[19:27:34] <wikibugs>	 DBA: Under high load, there is replication check pile-ups on coredbs, specially enwiki API servers - https://phabricator.wikimedia.org/T150474#2786984 (jcrespo)
[19:29:44] <wikibugs>	 DBA, Wikimedia-log-errors: Under high load, there is replication check pile-ups on coredbs, specially enwiki API servers - https://phabricator.wikimedia.org/T150474#2787011 (jcrespo)
[19:33:11] <wikibugs>	 DBA, Labs, Operations, Tracking: Database replication services (tracking) - https://phabricator.wikimedia.org/T50930#2787049 (AlexMonk-WMF)
[19:33:13] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, and 3 others: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2787048 (AlexMonk-WMF)
[19:35:29] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, and 3 others: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2787065 (jcrespo) @AlexMonk-WMF, this is not a replication or Database issue, as I just demonstrated.
[19:35:43] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, and 3 others: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2787066 (jcrespo)
[19:35:48] <wikibugs>	 DBA, Labs, Operations, Tracking: Database replication services (tracking) - https://phabricator.wikimedia.org/T50930#2787067 (jcrespo)
[20:01:15] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, and 3 others: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2787171 (AlexMonk-WMF) >>! In T148561#2787065, @jcrespo wrote: > @AlexMonk-WMF, this is not a replication or Database is...
[20:01:28] <wikibugs>	 DBA, Labs, Labs-Infrastructure, MediaWiki-extensions-ORES, and 3 others: Replicate ores_classification and ores_model tables in labs - https://phabricator.wikimedia.org/T148561#2787186 (AlexMonk-WMF)
[20:01:34] <wikibugs>	 DBA, Labs, Operations, Tracking: Database replication services (tracking) - https://phabricator.wikimedia.org/T50930#2787187 (AlexMonk-WMF)